Say I have a dataset of laboratory records for a set of patients (ordered by ID and date from earliest to latest) with this format, I want to remove all the bolded labs and keep all the non-bolded ones.
Basically, I want to only keep labs for a specific patient (labs from different patients should not be compared against each other) that are done >30 days from the previous lab, while not counting the removed labs when evaluating the next lab (i.e. if a lab is marked in red I want to ignore it when evaluating if the next lab is within 30 days and should thus be removed).
You can see for patient 1111, their 3rd lab is within 30 days of their 2nd lab, but because the 2nd lab is already marked for removal the 2nd lab does not count and thus the 3rd lab should be kept.
Does any one have any advice or suggestions for how this could be accomplished in SAS?
ID Lab_Date
1111 Jan 1 2023
1111 Jan 15 2023
1111 Feb 3 2023
1111 Feb 16 2023
2222 Jan 2 2023
2222 Jan 20 2023
2222 Feb 8 2023
2222 Feb 10 2023
2222 Feb 12 2023
3333 Jan 15 2023
3333 Feb 5 2023
3333 Feb 18 2023
I've tried retaining/lagging the lab date from the previous record and then comparing it to the current record, but this ends up removing records that shouldn't be removed.
You can use
retainto do this:When first record, let
lab_datebe the benchmark date, namedlab_date_pre.In the following records: If
lab_dateis within 30 days from the benchmark date, delete it. Else iflab_dateis more than 30 days from the benchmark date, letlab_datebe the new benchmark date.