R function to sum/count consecutive values - Rainfall indices


I would like to calculate some rainfall indices using a data frame with year, day and the rainfall values.

How can I calculate the maximum 5-day consecutive precipitation (R5d)? The maximum number of consecutive days with rainfall (rainfall higher than 0 mm)? The maximum number of consecutive days without rainfall (rainfall of 0 mm)? The annual total rainfall above the 95th percentile?

I tried the following code for R5d, but it does not work because it sums the days in fixed blocks: 1 to 5, 6 to 10 and so on. I want the sums over a moving window: days 1 to 5, 2 to 6, 3 to 7 and so on.

    year=c(rep(2000,360),rep(2001,360))
    day=rep(1:360,2)
    rainfall<-rpois(360*2,0:100)
    df5=data.frame(year,day,rainfall)
    head(df5)
    
    #Sum values 5 by 5
    library(dplyr)
    sum5=df5 %>%
    mutate(Intervals_by5 = rep(row_number(), each=5, length.out = n())) %>%
    group_by(Intervals_by5) %>%
    summarise(Freq_sum = sum(rainfall))
    max(sum5)

Many thanks!


There are 2 best solutions below

Carl

You could use a sliding window, as shown below.

Set .complete = TRUE if you only want to start summing once 4 prior rows are available; the first 4 rows are then NA.

library(dplyr)
library(slider)

year <- c(rep(2000, 360), rep(2001, 360))
day <- rep(1:360, 2)
rainfall <- rpois(360 * 2, 0:100)
df5 <- data.frame(year, day, rainfall)

new_df <- df5 %>%
  mutate(sliding_sum = slide_dbl(rainfall, sum, .before = 4, .complete = TRUE))

head(new_df, 30)
#>    year day rainfall sliding_sum
#> 1  2000   1        0          NA
#> 2  2000   2        0          NA
#> 3  2000   3        6          NA
#> 4  2000   4        2          NA
#> 5  2000   5        4          12
#> 6  2000   6        8          20
#> 7  2000   7        2          22
#> 8  2000   8        8          24
#> 9  2000   9        8          30
#> 10 2000  10        9          35
#> 11 2000  11        6          33
#> 12 2000  12        9          40
#> 13 2000  13        6          38
#> 14 2000  14        7          37
#> 15 2000  15        9          37
#> 16 2000  16       20          51
#> 17 2000  17       18          60
#> 18 2000  18       22          76
#> 19 2000  19       14          83
#> 20 2000  20       23          97
#> 21 2000  21       21          98
#> 22 2000  22       20         100
#> 23 2000  23       18          96
#> 24 2000  24       19         101
#> 25 2000  25       19          97
#> 26 2000  26       29         105
#> 27 2000  27       27         112
#> 28 2000  28       26         120
#> 29 2000  29       32         133
#> 30 2000  30       28         142

Created on 2024-03-23 with reprex v2.1.0
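
Note that the window above runs over the whole rainfall column, so the first windows of 2001 still include the last days of 2000. If R5d is wanted per year, one option is to group by year before sliding and then take the maximum. This is only a minimal sketch: the seed, the Poisson mean of 3 and the names r5d and sum5 are assumptions made for illustration.

```r
library(dplyr)
library(slider)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
df5 <- data.frame(
  year = rep(c(2000, 2001), each = 360),
  day = rep(1:360, 2),
  rainfall = rpois(720, 3)  # assumption: illustrative rainfall values
)

r5d <- df5 %>%
  group_by(year) %>%
  # 5-day window = current day plus the 4 days before it, within each year
  mutate(sum5 = slide_dbl(rainfall, sum, .before = 4, .complete = TRUE)) %>%
  # the first 4 days of each year are NA, so drop them when taking the max
  summarise(R5d = max(sum5, na.rm = TRUE))

r5d
```

The result has one row per year, with the largest 5-day total that lies entirely inside that year.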

TarJae

Please check whether this works for you:


library(dplyr)
library(zoo)
library(data.table)

# df is the data frame with the columns year, day and rainfall
df %>% 
  mutate(
    R5d = rollapply(rainfall, 5, sum, fill = NA, align = "right"), # rolling sum of days 1 to 5, 2 to 6, 3 to 7 and so on
    Rain_Days = rleid(rainfall > 0),     # run id, restarts at 1 in each year
    No_Rain_Days = rleid(rainfall == 0), .by = year
  ) %>% 
  # run ids repeat across years, so group by year as well as by run id
  mutate(Consec_Rain_Days = if_else(first(rainfall) > 0, n(), 0), .by = c(year, Rain_Days)) %>% # length of each wet run, 0 for dry runs
  mutate(Consec_No_Rain_Days = if_else(first(rainfall) == 0, n(), 0), .by = c(year, No_Rain_Days)) %>% # length of each dry run, 0 for wet runs
  group_by(year) %>% 
  summarise(
    Max_R5d = max(R5d, na.rm = TRUE),
    Max_Consec_Rain_Days = max(Consec_Rain_Days),
    Max_Consec_No_Rain_Days = max(Consec_No_Rain_Days),
    Rain_95_Percentile = quantile(rainfall, 0.95), 
    Total_Above_95 = sum(rainfall[rainfall > quantile(rainfall, 0.95)]) # Yearly total rainfall > 95th percentile
  ) %>% 
  ungroup()
   year Max_R5d Max_Consec_Rain_Days Max_Consec_No_Rain_Days Rain_95_Percentile Total_Above_95
  <dbl>   <int>                <dbl>                   <dbl>              <dbl>          <int>
1  2000      20                   65                       4                  5             38
2  2001      19                   65                       4                  5             43
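
The consecutive-day counts can also be sketched in base R with rle(), which returns the length of each run of identical values. This is a hedged illustration rather than part of the answer above: max_run and rain_df are hypothetical names, the toy data is made up, and the helper assumes rainfall has no missing values.

```r
# Longest run of days for which cond(x) is TRUE (assumes no NA in x)
max_run <- function(x, cond) {
  runs <- rle(cond(x))                   # lengths and values of each run
  hits <- runs$lengths[runs$values]      # keep only runs where cond is TRUE
  if (length(hits) == 0) 0L else max(hits)
}

rain <- c(0, 0, 3, 1, 0, 2, 5, 4, 0, 0, 0)
max_run(rain, function(r) r > 0)   # 3 (days with 2, 5, 4)
max_run(rain, function(r) r == 0)  # 3 (the last three days)

# Per year: split first, so a run cannot continue across a year boundary
rain_df <- data.frame(
  year = rep(c(2000, 2001), each = 6),
  rainfall = c(0, 2, 3, 0, 0, 1,   4, 4, 4, 4, 0, 0)
)
wet <- sapply(split(rain_df$rainfall, rain_df$year), max_run, cond = function(r) r > 0)
dry <- sapply(split(rain_df$rainfall, rain_df$year), max_run, cond = function(r) r == 0)
wet  # 2000: 2, 2001: 4
dry  # 2000: 2, 2001: 2
```

Splitting by year before calling rle() keeps the wet day at the end of 2000 separate from the wet spell at the start of 2001.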