I have a dataset that looks like this:
data <- structure(list(Date = structure(c(-2208988800, -2208902400, -2208816000,
-2208729600, -2208643200, -2208556800, -2208470400, -2208384000,
-2208297600, -2208211200, -2208124800, -2208038400, -2207952000
), class = c("POSIXct", "POSIXt"), tzone = "UTC"), count = c(4668.8,
4476.9, 4945, 5275.7, 15013.1, 14418, 14059.1, 14043.5, 14142.2,
14904.2, 13849.9, 14712.1, 8793.9)), class = c("tbl_df", "tbl",
"data.frame"), row.names = c(NA, -13L))
| Date | count |
|---|---|
| 01-01-1900 | 4,668.80 |
| 02-01-1900 | 4,476.90 |
| 03-01-1900 | 4,945.00 |
| 04-01-1900 | 5,275.70 |
| 05-01-1900 | 15,013.10 |
| 06-01-1900 | 14,418.00 |
| 07-01-1900 | 14,059.10 |
| 08-01-1900 | 14,043.50 |
| 09-01-1900 | 14,142.20 |
| 10-01-1900 | 14,904.20 |
| 11-01-1900 | 13,849.90 |
| 12-01-1900 | 14,712.10 |
| 13-01-1900 | 8,793.90 |
I am trying to write a function that adds columns based on whether the previous cell is an outlier. I am hoping for a dataset that looks like this:
| Date | count | Outlier_T1 | Outlier_T2 | Outlier_T3 | Outlier_T4 | Outlier_T5 | Outlier_T6 | Outlier_T7 | Outlier_T8 | Outlier_T9 | Outlier_T10 | Outlier_T11 | Outlier_T12 | Outlier_T13 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 01-01-1900 | 4,668.80 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 02-01-1900 | 4,476.90 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 03-01-1900 | 4,945.00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 04-01-1900 | 5,275.70 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 05-01-1900 | 15,013.10 | | | | | 1 | | | | | | | | |
| 06-01-1900 | 14,418.00 | | | | | | 1 | | | | | | | |
| 07-01-1900 | 14,059.10 | | | | | | | 1 | | | | | | |
| 08-01-1900 | 14,043.50 | | | | | | | | 1 | | | | | |
| 09-01-1900 | 14,142.20 | | | | | | | | | 1 | | | | |
| 10-01-1900 | 14,904.20 | | | | | | | | | | 1 | | | |
| 11-01-1900 | 13,849.90 | | | | | | | | | | | 1 | | |
| 12-01-1900 | 14,712.10 | | | | | | | | | | | | 1 | |
| 13-01-1900 | 8,793.90 | | | | | | | | | | | | | 1 |
There are no outliers in the first four rows. The fifth row is an outlier, so Outlier_T5 = 1 for that row. Once a row has been flagged as an outlier, it is exempt from the rest of the analysis, so its value is NA from then on. The check for the sixth row therefore uses only the first four rows plus the sixth row, which gives Outlier_T6 = 1, and so on for the remaining rows.
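To make the logic concrete, here is a rough sketch of the sequential procedure I have in mind. It rests on several assumptions that are not fixed above: "outlier" is taken to mean outside the 1.5 * IQR (Tukey) fences, the function names `is_outlier_last` and `add_outlier_cols` are made up for illustration, and cells for rows that have not been reached yet are simply left as NA.

```r
# Sketch only: the 1.5 * IQR rule is an assumption, not a requirement.
is_outlier_last <- function(x, k = 1.5) {
  # TRUE if the last element of x lies outside the Tukey fences of x
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  fence <- k * (q[2] - q[1])
  last <- x[length(x)]
  last < q[1] - fence || last > q[2] + fence
}

add_outlier_cols <- function(df, value = "count") {
  x <- df[[value]]
  n <- length(x)
  flagged <- logical(n)  # rows already marked as outliers
  out <- matrix(NA_integer_, n, n,
                dimnames = list(NULL, paste0("Outlier_T", seq_len(n))))
  for (t in seq_len(n)) {
    # earlier rows that are still part of the analysis
    clean <- which(!flagged[seq_len(t - 1)])
    # test row t against the clean rows plus row t itself
    flagged[t] <- is_outlier_last(x[c(clean, t)])
    out[clean, t] <- 0L                  # clean rows get 0 at step t
    out[t, t] <- as.integer(flagged[t])  # row t gets its own flag
    # previously flagged rows and later rows stay NA in column t
  }
  cbind(df, as.data.frame(out))
}

# Same count values as in the dataset above
example <- data.frame(count = c(4668.8, 4476.9, 4945, 5275.7, 15013.1, 14418,
                                14059.1, 14043.5, 14142.2, 14904.2, 13849.9,
                                14712.1, 8793.9))
result <- add_outlier_cols(example)
diag(as.matrix(result[, -1]))
# 0 0 0 0 1 1 1 1 1 1 1 1 1
```

With this rule the diagonal matches the pattern I am after (rows 5 to 13 flagged), but the loop re-computes the quantiles at every step, so it may be slow on long series, and a different outlier definition could give different flags.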
I would really appreciate some help here.