I have a very large data frame that I need to subset by last values. I know that the data.table library includes the last() function, which returns the last value of an array, but what I need is to subset foo by the last value in id for every separate value in track. Values in id are consecutive integers, but the last values will be different for every track.
> head(foo)
track id coords.x coords.y
1 0 0 -79.90732 43.26133
2 0 1 -79.90733 43.26124
3 0 2 -79.90733 43.26124
4 0 3 -79.90733 43.26124
5 0 4 -79.90725 43.26121
6 0 5 -79.90725 43.26121
The output would look something like this.
track id coords.x coords.y
1 0 57 -79.90756 43.26123
2 1 98 -79.90777 43.26231
3 2 61 -79.90716 43.26200
... and so on
How would one apply the last() function (or another function like tail()) to produce this output?
We can use data.table. Convert the 'data.frame' to a 'data.table' (setDT(df1)) and, grouped by 'track', get the last row with tail.
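A minimal sketch of that approach, assuming the data frame is named df1 (standing in for the question's foo) and using made-up sample values; only the track and id columns matter for the grouping:

```r
library(data.table)

# Hypothetical sample data standing in for the question's `foo`
# (named `df1` here); all values are made up for illustration.
df1 <- data.frame(
  track    = c(0, 0, 0, 1, 1, 2, 2, 2, 2),
  id       = c(0, 1, 2, 0, 1, 0, 1, 2, 3),
  coords.x = -79.907 - (1:9) * 1e-5,
  coords.y = 43.261 + (1:9) * 1e-5
)

# Convert to a data.table by reference (no copy is made)
setDT(df1)

# For each track, keep the last row of that group's subset (.SD)
last_rows <- df1[, tail(.SD, 1), by = track]
print(last_rows)
```

This relies on the rows already being ordered by id within each track, as in the question; if they are not, sorting first (for example with setorder(df1, track, id)) would probably be needed.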
As the question also mentions that the 'id' values are consecutive integers, we can alternatively create a logical index using diff, get the row index (.I), and subset the rows.
Or we can do this using base R itself.
Or another option is dplyr.
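The alternatives above could be sketched roughly as follows. This is an illustration rather than a definitive implementation: the data frame name df1 and the sample values are made up, and the diff-based index assumes that id restarts for each track and that rows are sorted by id within each track.

```r
library(data.table)

# Hypothetical sample data (values made up); `id` restarts at 0 in each track
df1 <- data.frame(
  track    = c(0, 0, 0, 1, 1, 2, 2, 2, 2),
  id       = c(0, 1, 2, 0, 1, 0, 1, 2, 3),
  coords.x = -79.907 - (1:9) * 1e-5,
  coords.y = 43.261 + (1:9) * 1e-5
)

# data.table, logical index from diff(): a row is the last of its run when
# the next id does not continue the sequence; the final row always counts.
dt <- as.data.table(df1)
by_diff <- dt[c(diff(id) != 1, TRUE)]

# data.table, row index: collect the index (.I) of the last row (.N)
# in each track, then subset the table by those indices.
idx <- dt[, .I[.N], by = track]$V1
by_index <- dt[idx]

# base R: keep rows whose track value does not occur again further down
by_base <- df1[!duplicated(df1$track, fromLast = TRUE), ]

# dplyr: take the last row of each group
library(dplyr)
by_dplyr <- df1 %>%
  group_by(track) %>%
  slice(n()) %>%
  ungroup()
```

All four variants should select the same rows on data shaped like the question's. The .I, base R, and dplyr versions pick the last row per track whether or not id restarts, so they are probably the safer choice if that assumption is in doubt.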