I have a short question about computing the number of participants by timepoints. Consider the sample long format data:
data<-data.frame(id=c(1,1,1,1,2,2,2,3,3,3,3),survey_date=c("01/12/2020","02/12/2020","03/12/2020","04/12/2020","01/31/2020","03/12/2020","04/05/2020","02/12/2020","04/12/2020","05/12/2020","06/12/2020"),last_seen=c('05/25/2020','05/25/2020','05/25/2020','05/25/2020','04/06/2020','04/06/2020','04/06/2020','','','',''))
Survey date indicates when the survey took place. \Some subjects in the data became lost to follow-up, we do have that last-seen date recorded in the data for those subjects (it appears in all rows for that subject).
I would like to add a column 'num_N' to the existing data indicating the number of participants who were in the study at the timepoint survey_date.
For example, on 01/12/2020, only subject id=1 was in the study, so num_N for that row would be 1.
On 06/12/2020, only subject id=3 was in the study, so num_N for that row would be 1.
Any help would be appreciated. Thanks!
This is a good case for
iv_count_between
fromivs
: