How to execute a rolling conditional calculation on tipping bucket rain gauge data in RStudio?

I'm currently working on an R script to analyze tipping bucket rain gauge data collected in a national forest. Each bucket tip corresponds to 0.01 inches of water, and the gauge logs a cumulative total that increases by 0.01 with every tip. For example, if the count reads 5.00 when a gauge is retrieved, the bucket has tipped 500 times and collected 5.00 inches of rain.

Inside this R script, I would like to analyze the data for instantaneous, high-intensity rainfall events. These rainfall events are characterized by the collection of >= 0.25 inches of water in a 15-minute period. This dataset is quite large, so identifying these events by hand is impractical. I'm wondering if it is possible to create a rolling calculation with a window of size 15 minutes. This rolling sum would utilize two columns, one column of DateTime (POSIXct) and one column of event tips (double). The window would take the tip at 0:00 and subtract it from the tip at 15:00, giving the total rainfall in that 15-minute window. This would roll forward until the dataset is finished. The ideal output would be a column of type double that gave the total rainfall in each 15-minute segment. Alternatively, the new column could be of type logical that outputs TRUE if the window is >= 0.25 inches.

[Image: segment of the data in RStudio]

This is a small segment of the data in R. One of the main concerns is that the time intervals between rows/tips are inconsistent, and some days may be skipped entirely during the deployment period. If you have any ideas, suggestions, and/or solutions, please let me know.

Have a good day!

Answer by Allan Cameron

You can create a new data frame with two columns. The first, called start, holds the start of every 15-minute interval from the beginning to the end of your measurement period. The second, called stop, is the same as the first column but 15 minutes later.

Then, grouping this data frame row-wise, just count how many tips were between the start and stop times:

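library(tidyverse)

# `df` is the data frame listed under "Data used" below.
# One row per 15-minute window: 96 windows per day for 6 days.
tibble(start = seq(as.POSIXct('2023-06-08'), by = '15 min', length.out = 96 * 6),
       stop = start + lubridate::minutes(15)) %>%
  filter(complete.cases(.)) %>%
  rowwise() %>%
  # Count the tips whose timestamp falls in [start, stop).
  mutate(tips = length(which(df$DateTime >= start & df$DateTime < stop)))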
#> # A tibble: 576 x 3
#> # Rowwise: 
#>    start               stop                 tips
#>    <dttm>              <dttm>              <int>
#>  1 2023-06-08 00:00:00 2023-06-08 00:15:00     0
#>  2 2023-06-08 00:15:00 2023-06-08 00:30:00     0
#>  3 2023-06-08 00:30:00 2023-06-08 00:45:00     0
#>  4 2023-06-08 00:45:00 2023-06-08 01:00:00     1
#>  5 2023-06-08 01:00:00 2023-06-08 01:15:00     0
#>  6 2023-06-08 01:15:00 2023-06-08 01:30:00     0
#>  7 2023-06-08 01:30:00 2023-06-08 01:45:00     0
#>  8 2023-06-08 01:45:00 2023-06-08 02:00:00     0
#>  9 2023-06-08 02:00:00 2023-06-08 02:15:00     0
#> 10 2023-06-08 02:15:00 2023-06-08 02:30:00     0
#> # ... with 566 more rows
#> # i Use `print(n = ...)` to see more rows
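
The windows above are fixed and non-overlapping, so a burst of rain that straddles a window boundary is split across two rows. For the rolling window described in the question, one option is to anchor a trailing 15-minute window at every tip and count the tips inside it. The following is only a minimal base R sketch under stated assumptions: every row is one tip worth 0.01 inches (as in the question), and the data frame `toy` and the columns `tips_15min`, `rain_15min` and `intense` are hypothetical names with made-up values.

```r
# Hypothetical toy data: 30 tips spaced 20 seconds apart,
# followed by one isolated tip more than an hour later.
toy <- data.frame(
  DateTime = as.POSIXct("2023-06-08 00:00:00", tz = "UTC") + c(20 * 0:29, 4200)
)

# The calculation assumes the rows are in time order.
toy <- toy[order(toy$DateTime), , drop = FALSE]

window_secs <- 15 * 60          # 15-minute window, in seconds
t <- as.numeric(toy$DateTime)   # timestamps as seconds since the epoch

# For each tip at time x, count the tips in the trailing window (x - 900 s, x].
toy$tips_15min <- vapply(
  t,
  function(x) sum(t > x - window_secs & t <= x),
  integer(1)
)

# Each tip is 0.01 inches (per the question).
toy$rain_15min <- toy$tips_15min * 0.01

# Compare integer tip counts (25 tips = 0.25 inches) rather than the
# floating-point rainfall, to avoid rounding surprises at the threshold.
toy$intense <- toy$tips_15min >= 25L

tail(toy)
```

Because the window is tied to the actual tip times, irregular spacing and skipped days need no special handling: a tip with no neighbours in the previous 15 minutes simply gets a count of 1. The comparison inside `vapply()` touches every row for every tip, which may be slow on a very large dataset; on sorted timestamps, `findInterval(t, t) - findInterval(t - window_secs, t)` should give the same counts in a single vectorised step.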

Data used - obtained using OCR from picture of data in question

df <- structure(list(DateTime = structure(c(1686182399, 1686231919, 
1686515980, 1686516586, 1686580500, 1686581191, 1686582039, 1686585522, 
1686586899, 1686587520, 1686588807, 1686589302, 1686589641, 1686639178, 
1686640430, 1686641549, 1686643287, 1686645039, 1686646602, 1686657085, 
1686658572, 1686660764, 1686660826, 1686660934), class = c("POSIXct", 
"POSIXt"), tzone = ""), Event = c(0, 0.01, 0.02, 0.03, 0.04, 
0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 0.11, 0.12, 0.13, 0.14, 0.15, 
0.16, 0.17, 0.18, 0.19, 0.2, 0.21, 0.22, 0.23)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -24L))