Data table, removing leading missing values by group

171 Views Asked by Zachary At 19 October 2024 at 16:33

Below is an example data table, which I would like to remove the rows where value is NA and no earlier row has a value, i.e. also NA, and by group. As not all have the same number leading missing values I'm getting stuck and not having any luck searching.

Example data table

    group    date     value
      a 2015-01-01    NA
      a 2015-01-02     2
      a 2015-01-03     3
      a 2015-01-04    NA
      a 2015-01-05     2
      b 2015-01-01    NA
      b 2015-01-02    NA
      b 2015-01-03     2
      b 2015-01-04    NA
      b 2015-01-05     2

Finished data table

    group    date     value
      a 2015-01-02     2
      a 2015-01-03     3
      a 2015-01-04    NA
      a 2015-01-05     2
      b 2015-01-03     2
      b 2015-01-04    NA
      b 2015-01-05     2

Later I plan to impute the missing values by those that come before and after.

EDIT: Found previously asked question here, which is similar.

Original Q&A

There are 1 best solutions below

A5C1D2H2I1M1N2O1R2T1 On 11 June 2015 at 03:48 BEST ANSWER

A basic approach would be to use which and .N, like this:

DT[, .SD[(which(!is.na(value))[1]):.N], by = group]
##    group       date value
## 1:     a 2015-01-02     2
## 2:     a 2015-01-03     3
## 3:     a 2015-01-04    NA
## 4:     a 2015-01-05     2
## 5:     b 2015-01-03     2
## 6:     b 2015-01-04    NA
## 7:     b 2015-01-05     2

Data table, removing leading missing values by group

There are 1 best solutions below

Related Questions in R

Related Questions in DATA.TABLE

Trending Questions

Popular # Hahtags

Popular Questions