Rollapply or similar for multi-condition thresholds in time series data


I am trying to find, for each year, the dates on which multi-condition thresholds are exceeded in time series data. I used the rollapply function to identify the first date each year (let's call this date A) on which the daily mean temperature (Mean_Temp) has been at or above B (B = 5.0) for C consecutive days (C = 5; the width). Because the window is right-aligned, A is the last day of that run.

Sample Data:

    DOY  Year  Month  Day  Mean_Temp
     96  1960      4    5        1.5
     97  1960      4    6       -1
     98  1960      4    7       -1.9
     99  1960      4    8       -2.3
    100  1960      4    9        1.3
    101  1960      4   10       -0.5
    102  1960      4   11        5.9
    103  1960      4   12        5.7
    104  1960      4   13        5.3
    105  1960      4   14        6.1
    106  1960      4   15        9.9

Sample Code:

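library(dplyr)
library(zoo)

# First day each year on which the trailing 5-day minimum of Mean_Temp is >= 5.0
Table <- data %>% group_by(Year) %>%
  mutate(Mean_Temp = rollapply(Mean_Temp, width = 5, min, align = "right", fill = NA, na.rm = TRUE)) %>%
  filter(Mean_Temp >= 5.0) %>%
  filter(row_number() == 1)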

Sample Output:

        X       Year    Month   Day Mean_Temp
    1   106     1960    4       15  5.3
    2   466     1961    4       10  5.6
    3   830     1962    4       9   5.6
    4   1205    1963    4       19  5.6
    5   1561    1964    4       9   5.6
    6   1948    1965    5       1   7.8

    

However, I would now like to find any instances (if they occur) of temperatures below a new threshold, X, for Y days, within Z days after A (e.g. 1960-04-15). For example, when does the temperature drop below -15 within 28 days after the dates above?

The kind of output I am looking for would be something like:

    Year    Month   Day Mean_Temp
    1960    4       16  -16.1
    1960    4       19  -17.2
    1961    4       14  -15.2
    1961    4       15  -15.1
    1963    4       30  -16.7
    1963    5       1   -17.1
    1964    4       16  -15.3
    1964    4       17  -16.3

I am wondering about using the output from my rollapply function to supply the starting date each year (A), and then monitoring the temperature for the next Z days to see if it drops below X. However, I am a little lost as to how to code that type of function: essentially looping through the daily temperature data each year after a given date (presumably referenced from a separate table) and watching for a certain temperature threshold.

Here is a sample of the data.

structure(list(X = 1:20, Year = c(1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L), Month = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), Day = 1:20, Mean_Temp = c(-12.2, -10, -2.3, -4.2, -7.2, -12.3, -6.1, -5, -12.5, -9.2, -9.2, -6.7, -4.2, -6.1, -4.7, -6.7, -6.1, -6.7, -8.1, -7.8)), row.names = c(NA, 20L), class = "data.frame")

There is 1 best solution below


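I am not sure exactly what you are looking for, but the following may be a step in the right direction. It flags each day whose mean temperature is below a threshold and keeps a running count of consecutive days below it.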

MEAN.TEMPERATURE.THRESHOLD <- -5.0

df <- structure(list(X = 1:20, Year = c(1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L), Month = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), Day = 1:20, Mean_Temp = c(-12.2, -10, -2.3, -4.2, -7.2, -12.3, -6.1, -5, -12.5, -9.2, -9.2, -6.7, -4.2, -6.1, -4.7, -6.7, -6.1, -6.7, -8.1, -7.8)), row.names = c(NA, 20L), class = "data.frame")

df <- subset(df, select = -X)


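# Flag days whose mean temperature is below the threshold
df$belowThreshold <- df$Mean_Temp < MEAN.TEMPERATURE.THRESHOLD

# Count consecutive days below the threshold; the count resets to 0 on each day that is not below it
df$cumSumBelowThreshold <- with(df,
                                ave(belowThreshold,
                                    cumsum(belowThreshold == 0),
                                    FUN = cumsum))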

df
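
The same run-length counter could be extended to the full question: find date A per year, then look for cold days in the Z days after it. The following is only a rough sketch in base R, not a tested solution for your full data. The names run_length and find_cold_after_warm and their parameters are made up for illustration, and it assumes one row per calendar day with no gaps and no NA values in Mean_Temp. The five temperatures for April 16 to 20 in the example are invented so that there is something to find.

```r
# Length of the current run of TRUE values; resets to 0 whenever flag is FALSE.
# (Hypothetical helper; assumes flag contains no NA.)
run_length <- function(flag) {
  ave(as.integer(flag), cumsum(!flag), FUN = cumsum)
}

# Hypothetical function: for each year, find A (the first day that completes
# `warm_days` consecutive days at or above `warm_thr`), then return the days
# within `window` days after A that complete `cold_days` consecutive days
# below `cold_thr`.
find_cold_after_warm <- function(df, warm_thr = 5, warm_days = 5,
                                 cold_thr = -15, cold_days = 1, window = 28) {
  df$Date <- as.Date(paste(df$Year, df$Month, df$Day, sep = "-"),
                     format = "%Y-%m-%d")
  df <- df[order(df$Date), ]
  per_year <- lapply(split(df, df$Year), function(d) {
    warm_run <- run_length(d$Mean_Temp >= warm_thr)
    hit <- which(warm_run >= warm_days)
    if (length(hit) == 0) return(NULL)   # no date A in this year
    A <- d$Date[hit[1]]
    cold_run <- run_length(d$Mean_Temp < cold_thr)
    keep <- d$Date > A & d$Date <= A + window & cold_run >= cold_days
    d[keep, c("Year", "Month", "Day", "Mean_Temp")]
  })
  do.call(rbind, per_year)
}

# Example: the April 1960 rows from the question, followed by five invented days.
dates <- seq(as.Date("1960-04-05"), by = "day", length.out = 16)
example <- data.frame(
  Year      = 1960L,
  Month     = as.integer(format(dates, "%m")),
  Day       = as.integer(format(dates, "%d")),
  Mean_Temp = c(1.5, -1, -1.9, -2.3, 1.3, -0.5, 5.9, 5.7, 5.3, 6.1, 9.9,
                -16.1, 2, 3, -17.2, 4)   # last five values are made up
)

res <- find_cold_after_warm(example)
res
```

With this input, A is 1960-04-15 and the rows returned are April 16 and April 19. Note that with cold_days greater than 1 a row is returned on the day the run reaches that length (and on later days of the same run), and a cold run that started before A still counts toward the length.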