How to group month columns in a data frame in R

I have a data frame structured as follows:

Year <- 1948:2017
Jan <- rnorm(70)
Feb <- rnorm(70)
Mar <- rnorm(70)
Apr <- rnorm(70)
May <- rnorm(70)
Jun <- rnorm(70)
Jul <- rnorm(70)
Aug <- rnorm(70)
Sep <- rnorm(70)
Oct <- rnorm(70)
Nov <- rnorm(70)
Dec <- rnorm(70)
test_df <- cbind.data.frame(Year, Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec)
head(test_df)
######## Console result
    Year        Jan        Feb        Mar         Apr
1 1948 -0.5918300  0.0497792 -0.9302350  0.73162688
2 1949 -1.2731259  0.8933090  0.2340527  1.03077077
3 1950 -0.3727786 -0.5680272  1.4439980  0.53150414
4 1951  0.6520741 -1.4229818 -0.9700416 -0.07151535
5 1952  0.4296101 -0.2294352  1.0863566  1.58652232
6 1953  0.3334147 -0.5386016  1.3432490  1.91005906
          May        Jun         Jul         Aug
1  0.28268233  0.7870373 -0.06178119 -0.14469371
2 -0.02048683 -1.4834607 -0.17926819 -0.38662117
3  0.24659095  0.4929837  0.79430914  0.03486687
4 -0.60123934  1.1304690 -0.13452649 -1.07814801
5  1.39161546  0.6827090  0.54729206  0.50188908
6 -0.53882956 -0.3246258  0.09602686 -2.35509441
         Sep        Oct        Nov         Dec
1  2.0492817  0.6185466  2.0427045 -0.06097253
2  0.7804505 -0.3416864 -1.5192509  2.01911948
3  1.9193976 -0.3120360  1.5646020 -0.04911313
4 -0.1147404 -0.3593639  0.5186583  1.39936930
5  2.4481574 -1.2349037 -0.3519640  0.58429371
6  0.6639531 -0.4471403  0.7071486 -1.02036467

I need to group arbitrary months, for example JanFeb, JanMar, AprFeb, or MarMayNov. Any combination of months is possible, so there are many potential groupings. The value of each group should be the average of its months: JanFeb should be the mean of Jan and Feb, and MarMayNov should be the mean of Mar, May, and Nov. How should I approach this problem? Any help is appreciated. Thanks.

Edit

Let's say, for simplicity, that I only want to group 2 months, or 3 months at most.

There is 1 best solution below

We can create all possible combinations of the month names using lapply and combn. For each combination, compute the row-wise mean of the selected columns as a single new column, then bind all of these columns together into one data frame.

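# Month columns (everything except Year)
cols <- names(test_df)[-1]

# For each group size x, average every combination of x month columns,
# name each result by pasting the month names together (e.g. "JanFeb"),
# then bind all the one-column data frames side by side.
result <- do.call(cbind, lapply(2:length(cols), function(x)
  do.call(cbind, combn(cols, x, function(y)
    setNames(data.frame(rowMeans(test_df[y])),
             paste0(y, collapse = "")), simplify = FALSE))))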

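If you want to combine only 3 months at most, change 2:length(cols) to 2:3 in lapply. That yields 286 columns (66 pairs plus 220 triples), whereas 2:length(cols) yields 4,083 columns, one for every combination of 2 to 12 months.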
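
If only a handful of specific groupings are needed, it may be simpler to average just those rather than generate every combination. The following is a minimal sketch under that assumption; group_means and toy_df are hypothetical names introduced here for illustration and are not part of the original answer.

```r
# Hypothetical helper: average only the month groups that are asked for.
# `groups` is a list of character vectors of column names,
# e.g. list(c("Jan", "Feb"), c("Mar", "May", "Nov")).
group_means <- function(df, groups) {
  # Row-wise mean of each requested group of columns
  out <- lapply(groups, function(g) unname(rowMeans(df[g])))
  # Name each result by pasting the month names together, e.g. "JanFeb"
  names(out) <- vapply(groups, paste0, character(1), collapse = "")
  # Keep Year alongside the averaged columns
  cbind(df["Year"], as.data.frame(out))
}

set.seed(1)  # only so this toy data is reproducible
toy_df <- data.frame(Year = 1948:1950,
                     Jan = rnorm(3), Feb = rnorm(3), Mar = rnorm(3),
                     May = rnorm(3), Nov = rnorm(3))

grouped <- group_means(toy_df, list(c("Jan", "Feb"), c("Mar", "May", "Nov")))
names(grouped)  # "Year" "JanFeb" "MarMayNov"
```

The same call should work on test_df from the question, for example group_means(test_df, list(c("Apr", "Feb"))), since it only relies on a Year column and the named month columns being present.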