I have a dataset which is the output of multiple data loggers measuring temperature and lux (strength of light) at 1-hour intervals.
There are approx. 250,000 data points. I'm having trouble with temperature readings from 'sun flecks', where a shaft of light hits the logger, heating it up quickly and then giving 'warm' readings for the rest of the day. I can use dplyr to subset these data (i.e. LUX > 32,000), but I would like to remove all readings from that day if the logger had a LUX > 32,000 reading. For reference, each data logger has name, date and time variables.
Is there a way to do this with dplyr?
If I remember right, `filter` doesn't work well with grouped data, so I'm first sorting the data frame by time (this may not be necessary if your data are already sorted appropriately). Then, for each logger and date, I'm identifying all points after a `LUX > 32000` event and marking them. With that done, the filter should work.
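A minimal sketch of the steps described above, not necessarily the exact code originally posted. The column names `logger`, `date`, `time`, `temp` and `LUX`, and the toy data, are assumptions for illustration; swap in your own names. Here the marked readings include the one that crossed the threshold as well as everything after it on the same day.

```r
library(dplyr)

# Toy data: 2 loggers x 2 days x 3 readings. Column names are assumed.
df <- tibble(
  logger = rep(c("A", "B"), each = 6),
  date   = rep(rep(as.Date(c("2015-06-01", "2015-06-02")), each = 3), times = 2),
  time   = rep(c(9, 12, 15), times = 4),
  temp   = c(15, 30, 28, 14, 16, 15, 15, 17, 16, 14, 29, 27),
  LUX    = c(500, 40000, 900, 400, 800, 600, 450, 700, 650, 300, 35000, 800)
)

after_fleck <- df %>%
  arrange(logger, date, time) %>%      # sort so "after" means later in the day
  group_by(logger, date) %>%
  # cumsum() counts threshold crossings so far within the day, so fleck is
  # TRUE from the first LUX > 32000 reading onward.
  mutate(fleck = cumsum(LUX > 32000) > 0) %>%
  ungroup() %>%
  filter(!fleck)                       # keep only unmarked readings
```

On this toy data, logger A on 2015-06-01 and logger B on 2015-06-02 each keep only their first reading of the day, while the other two logger-days are untouched.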
If you want to remove the entire day, you can change how the `fleck` variable is defined, so that it is TRUE for every reading on a logger-day that has at least one `LUX > 32000` reading.
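One way this could look, again as a sketch under assumed column names (`logger`, `date`, `time`, `temp`, `LUX`) and made-up data: `any()` evaluated within each logger and date group flags the whole day, so no sorting is needed.

```r
library(dplyr)

# Toy data: 2 loggers x 2 days x 3 readings. Column names are assumed.
df <- tibble(
  logger = rep(c("A", "B"), each = 6),
  date   = rep(rep(as.Date(c("2015-06-01", "2015-06-02")), each = 3), times = 2),
  time   = rep(c(9, 12, 15), times = 4),
  temp   = c(15, 30, 28, 14, 16, 15, 15, 17, 16, 14, 29, 27),
  LUX    = c(500, 40000, 900, 400, 800, 600, 450, 700, 650, 300, 35000, 800)
)

whole_day <- df %>%
  group_by(logger, date) %>%
  # any() returns one value per group; mutate() recycles it to every row,
  # so fleck marks all readings of a day containing a LUX > 32000 reading.
  mutate(fleck = any(LUX > 32000)) %>%
  ungroup() %>%
  filter(!fleck) %>%
  select(-fleck)                       # drop the helper column
```

With the toy data, only logger A on 2015-06-02 and logger B on 2015-06-01 remain, because the other two logger-days each contain a reading above the threshold.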