I have a question regarding subsetting in a database regarding patient specific data and repeating occurences. I have (essentially, the table shown is simplified) a database, which contains the record id of any given patient, as well as information about their age and the length of stay (los) in the hospital. Additionally we also collect data on adverse events (ie. harm to patients that happened because of medical intervention). Those ae are repeating events and are listed as such, meaning that any ae corresponds to a certain patient (record id) and contains information about whether it was an ae and if yes, what type of ae it was (ae_type, if NA it is not an ae, if it is one, it corresponds to a certain type). Because the database is built like it is, we have to fill in a form that corresponds to different types of ae's and if the form is completed, it is displayed as the repeating event = 1.
recor_id | ae_repeat | ae_type | age | los |
---|---|---|---|---|
1 | NA | NA | 25 | 7 |
1 | 1 | 1 | NA | NA |
1 | 1 | NA | NA | NA |
1 | 1 | NA | NA | NA |
2 | NA | NA | 30 | 12 |
2 | 1 | NA | NA | NA |
2 | 1 | NA | NA | NA |
2 | 1 | 1 | NA | NA |
3 | NA | NA | 55 | 14 |
3 | 1 | 1 | NA | NA |
3 | 1 | 1 | NA | NA |
3 | 1 | NA | NA | NA |
Now to my actual question. For subsetting purposes I need to count how many ae's (ie ae_type != NA) any given patient suffered and for that I would like to create a new variable, which counts the amounts of ae the patient has suffered. Is there any way to do that?
I have tried various methods with dplyr and mutate or summarise functions, but nothing seems to work like it should.
Just group by the recor_id column. Also have a look here on how to make a good question on stackoverflow