Rolling mean across day of year

Asked by tassones on 30 November 2023 at 03:06

In my example below, I can calculate a centered 7-day rolling mean; however, the first three days and the last three days are NA values. The rolling mean should take into account that day 365 is followed by day 1 and use that in the calculation. How can I calculate a rolling 7-day mean so that there are no NA values?

library(tidyverse)
library(zoo)

set.seed(321)

aa <- data.frame(
  doy = seq(1,365,1),
  value = round(rnorm(365,30,5))
)

bb <- aa %>%
  mutate(movingAVG = round(rollmean(value, k = 7, align = 'center', fill = NA)))

head(bb)
#>   doy value movingAVG
#> 1   1    39        NA
#> 2   2    26        NA
#> 3   3    29        NA
#> 4   4    29        31
#> 5   5    29        30
#> 6   6    31        31

tail(bb)
#>     doy value movingAVG
#> 360 360    24        30
#> 361 361    38        29
#> 362 362    30        29
#> 363 363    20        NA
#> 364 364    26        NA
#> 365 365    29        NA

Created on 2023-11-29 with reprex v2.0.2

jay.sf On 30 November 2023 at 09:38

Simply using the same data for the preceding and subsequent year doesn't make much sense, does it? Alternatively, you could pad value with as many NAs as the rolling average would generate and fill them by extrapolation (note that with rule=2, approx fills the padding with the nearest observed value) using this function,

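f <- \(x, n) {
  # two NA pads of floor(n/2) each, one for each end of x
  na <- replicate(2, rep_len(NA, floor(n/2)), simplify=FALSE)
  # for even n, shorten the leading pad by one
  if (n %% 2 == 0) {
    na[[1]] <- `length<-`(na[[1]], n/2 - 1L)
  }
  # reorder to: leading pad, x, trailing pad
  u <- unlist(c(na, list(x))[c(1, 3, 2)])
  # fill the pads; rule=2 uses the value at the nearest observed end
  approx(u, xout=seq_along(u), rule=2)$y
}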

then calculate the rolling average and delete the NAs afterwards.

Here using data.table.

n <- 7
library(data.table)
setDT(aa)[, mavg := approx(
  round(na.omit(frollmean(f(value, n), n, align='c'))),
  xout=seq_len(nrow(aa)), rule=2)$y]
aa
#>      doy value mavg
#>   1:   1    39   34
#>   2:   2    26   33
#>   3:   3    29   32
#>   4:   4    29   31
#>   5:   5    29   30
#>  ---
#> 361: 361    38   29
#> 362: 362    30   29
#> 363: 363    20   28
#> 364: 364    26   29
#> 365: 365    29   27

Of course, you could also fit a model and predict the tails instead of using this simple extrapolation.
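
To see what the helper does to the ends of a series, here is a minimal check on a toy vector. The input values are made up for illustration, and `f` is copied from above, only written with `function` so that it also runs on R versions before 4.1. Because `approx` is called with `rule=2`, the padded positions take the value at the nearest observed end:

```r
# copy of the helper `f` from this answer
f <- function(x, n) {
  na <- replicate(2, rep_len(NA, floor(n/2)), simplify=FALSE)
  if (n %% 2 == 0) {
    na[[1]] <- `length<-`(na[[1]], n/2 - 1L)
  }
  u <- unlist(c(na, list(x))[c(1, 3, 2)])
  approx(u, xout=seq_along(u), rule=2)$y
}

# odd window: two padded values on each side
odd <- f(c(10, 20, 30), 5)
odd
#> [1] 10 10 10 20 30 30 30

# even window: one padded value in front, two at the back
even <- f(c(10, 20, 30), 4)
even
#> [1] 10 10 20 30 30 30
```

So for n = 7 the first and last observations are each repeated three times before the rolling mean is taken.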

**jared_mamrot** · Accepted Answer · 2023-11-30T03:24:28.103000

One potential option is to stack three copies of your "aa" dataframe (e.g. 1-365 + 1-365 + 1-365), calculate the rolling mean over all values, then keep only the middle copy (i.e. ~~1-365 +~~ 1-365 ~~+ 1-365~~), e.g.

library(tidyverse)
library(zoo)

set.seed(321)

aa <- data.frame(
  doy = seq(1,365,1),
  value = round(rnorm(365,30,5))
)

bb <- aa %>%
  bind_rows(aa, .id = "index") %>%   # copies 1 and 2, labelled "1" and "2"
  bind_rows(aa) %>%                  # copy 3 (its index is NA)
  mutate(movingAVG = round(rollmean(value, k = 7, align = 'center', fill = NA))) %>%
  filter(index == 2) %>%             # keep only the middle copy
  select(-index)

head(bb)
#>   doy value movingAVG
#> 1   1    39        28
#> 2   2    26        30
#> 3   3    29        30
#> 4   4    29        31
#> 5   5    29        30
#> 6   6    31        31
tail(bb)
#>     doy value movingAVG
#> 360 360    24        30
#> 361 361    38        29
#> 362 362    30        29
#> 363 363    20        29
#> 364 364    26        30
#> 365 365    29        28

Created on 2023-11-30 with reprex v2.0.2

Does that make sense?
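
If the wrap-around is what you want, three full copies are more than is strictly needed: a centered window of width k only reaches k %/% 2 days past each end. As a rough sketch in base R (an alternative, not the code above), `stats::filter()` with `circular = TRUE` wraps the series for you. The names `circ`, `padded` and `manual` are just illustrative, and the manual padding is only there as a cross-check:

```r
set.seed(321)
value <- round(rnorm(365, 30, 5))
k <- 7

# Centered moving average that wraps around the ends of the series.
# The stats:: prefix avoids the clash with dplyr::filter().
circ <- stats::filter(value, rep(1 / k, k), sides = 2, circular = TRUE)
movingAVG <- round(as.numeric(circ))

# Cross-check: pad by hand with k %/% 2 days taken from the opposite end
h <- k %/% 2
padded <- c(tail(value, h), value, head(value, h))
manual <- sapply(seq_along(value), function(i) mean(padded[i:(i + k - 1)]))
stopifnot(isTRUE(all.equal(as.numeric(circ), manual)))

length(movingAVG)  # one value per day of year
anyNA(movingAVG)   # no NAs at either end
```

This assumes an odd window width, so that the window is symmetric around each day.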
