I have an R data frame that looks like this:
1 A 1
2 A 0.9
5 A 0.7
6 A 0.6
8 A 0.5
3 B 0.6
4 B 0.5
5 B 0.4
6 B 0.3
I need to fill all the gaps up to the maximum per category (second column), i.e. the result I wish to obtain is the following:
1 A 1
2 A 0.9
3 A 0.9
4 A 0.9
5 A 0.7
6 A 0.6
7 A 0.6
8 A 0.5
1 B 0.6
2 B 0.6
3 B 0.6
4 B 0.5
5 B 0.4
6 B 0.3
Basically, this means padding backwards when data are missing before the first observation, and padding forwards when data are missing in between. What I did so far is group by category:
groupby = ddply(df, ~fit$group,summarise, max=max(time))
A 8
B 6
But now I'm stuck on the next steps.
We can try with data.table/zoo. Convert the 'data.frame' to a 'data.table' (setDT(df1)), expand the 'v1' column to the full sequence up to its max value grouped by 'v2', join on 'v1' and 'v2', and then, grouped by 'v2', pad the NA elements with adjacent elements using na.locf from zoo.
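The steps described above can be sketched roughly as follows. This is a minimal sketch rather than the original answer's code: the column names 'v1' and 'v2' follow the text, while 'v3' (the value column) and the helper names grid and res are assumptions made for illustration.

```r
library(data.table)
library(zoo)

# Example data from the question; 'v3' is an assumed name for the value column.
df1 <- data.frame(
  v1 = c(1L, 2L, 5L, 6L, 8L, 3L, 4L, 5L, 6L),
  v2 = rep(c("A", "B"), times = c(5, 4)),
  v3 = c(1, 0.9, 0.7, 0.6, 0.5, 0.6, 0.5, 0.4, 0.3),
  stringsAsFactors = FALSE
)

setDT(df1)  # convert to data.table by reference

# For each group in 'v2', build every 'v1' from 1 up to that group's max.
grid <- df1[, .(v1 = seq_len(max(v1))), by = v2]

# Join on 'v2' and 'v1': one row per grid entry, 'v3' is NA where no match exists.
res <- df1[grid, on = .(v2, v1)]

# Per group: carry the last observation forward, then fill leading NAs backwards.
res[, v3 := na.locf(na.locf(v3, na.rm = FALSE), fromLast = TRUE, na.rm = FALSE),
    by = v2]
```

The forward fill runs first so that gaps in the middle take the previous value; only the NAs left before the first observation are then filled from the next one.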
Or using dplyr/zoo:
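One possible dplyr/zoo version is sketched below. It is not the original answer's code: the value column name 'v3' and the helper names mx, grid and res are assumptions, and the grid is built with base R's rep() and sequence() so the sketch does not depend on a particular dplyr version.

```r
library(dplyr)
library(zoo)

# Example data from the question; 'v3' is an assumed name for the value column.
df1 <- data.frame(
  v1 = c(1L, 2L, 5L, 6L, 8L, 3L, 4L, 5L, 6L),
  v2 = rep(c("A", "B"), times = c(5, 4)),
  v3 = c(1, 0.9, 0.7, 0.6, 0.5, 0.6, 0.5, 0.4, 0.3),
  stringsAsFactors = FALSE
)

# Maximum 'v1' per group.
mx <- df1 %>%
  group_by(v2) %>%
  summarise(mx = max(v1))

# Full grid: every 'v1' from 1 to the group's maximum.
grid <- data.frame(
  v2 = rep(mx$v2, mx$mx),
  v1 = sequence(mx$mx),
  stringsAsFactors = FALSE
)

# Join the observed values onto the grid, then pad the NAs within each group:
# forward fill first, then fill any leading NAs backwards.
res <- grid %>%
  left_join(df1, by = c("v2", "v1")) %>%
  group_by(v2) %>%
  mutate(v3 = na.locf(na.locf(v3, na.rm = FALSE), fromLast = TRUE, na.rm = FALSE)) %>%
  ungroup()
```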
data
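The example data from the question could be reconstructed along these lines. The column names are assumptions: 'v1' and 'v2' follow the description above, and 'v3' is a hypothetical name for the value column.

```r
# Example data from the question; column names are assumed, not from the original.
df1 <- data.frame(
  v1 = c(1L, 2L, 5L, 6L, 8L, 3L, 4L, 5L, 6L),
  v2 = rep(c("A", "B"), times = c(5, 4)),
  v3 = c(1, 0.9, 0.7, 0.6, 0.5, 0.6, 0.5, 0.4, 0.3),
  stringsAsFactors = FALSE
)
```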