How to reformat data so ID corresponds to two rows, one row contains sample source, the second row contains the result for the source

Question

How to reformat data so ID corresponds to two rows, one row contains sample source, the second row contains the result for the source

30 Views Asked by T.McMillen At 30 July 2025 at 19:09

I'm working with a clinical dataset of ~2500 unique ID's. Some of the ID's correspond to 20+ occurrences. I would like to see the sample type (NP, Throat, etc.) as well as the result of the test "Not Detected" or "Detected", but I would like to see them spread out over multiple columns and for the ID to basically be two rows. The first row all of the sample types for each occurrence and then the second row is the result for each occurrence. I can get the first row no problem, but I haven't been able to figure out how to add the second row on the same ID with the results below the corresponding sample type. Any help would be greatly appreciated!

ID <- c(1,1,2,2,3,3,3,4)
Type<-c("EM","EM","PA","PA","PA","PA","PA","EM")
Specimen_Type <- c("NP", "NP", "Throat", "Throat", "NP", "Throat", "Throat", "NP")
RESULT_VAL <- c("Not Detected", "Detected", "Not Detected", "Detected", "Not Detected", "Not Detected", "Detected", "Not Detected")
RESULT_DATE <- c("6-1-2020", "6-10-2020","6-1-2020", "6-10-2020","6-1-2020", "6-10-2020", "6-20-2020", "6-1-2020")
Data_sum<- data.frame(ID, Type, Specimen_Type, RESULT_VAL, RESULT_DATE)

I would like it to look like

ID    Type     Occurrence_1          Occurrence_2         Occurrence_3
1      EM        NP                    NP
1      EM        Not Detected          Detected
2      PA        Throat                Throat
2      PA        Not Detected          Detected
3      PA        NP                    Throat               Throat
3      PA        Not Detected          Not Detected         Detected
4      EM        NP
4      EM        Not Detected

Original Q&A

There are 1 best solutions below

**akrun** · Accepted Answer

We can reshape to 'long' and then to 'wide'

library(dplyr)
library(stringr)
library(tidyr)
library(data.table)
Data_sum %>% 
    pivot_longer(cols = c(Specimen_Type, RESULT_VAL)) %>%
    arrange(ID, Type, 
        factor(name, levels = c('Specimen_Type', 'RESULT_VAL'))) %>% 
    mutate(rn = str_c('Occurence_', rowid(ID, Type, name))) %>% 
   select(-RESULT_DATE) %>% 
   pivot_wider(names_from = rn, values_from = value) %>%    
   select(-name)
# A tibble: 8 x 5
#     ID Type  Occurence_1  Occurence_2  Occurence_3
#   <dbl> <chr> <chr>        <chr>        <chr>      
#1     1 EM    NP           NP           <NA>       
#2     1 EM    Not Detected Detected     <NA>       
#3     2 PA    Throat       Throat       <NA>       
#4     2 PA    Not Detected Detected     <NA>       
#5     3 PA    NP           Throat       Throat     
#6     3 PA    Not Detected Not Detected Detected   
#7     4 EM    NP           <NA>         <NA>       
#8     4 EM    Not Detected <NA>         <NA>

How to reformat data so ID corresponds to two rows, one row contains sample source, the second row contains the result for the source

There are 1 best solutions below

Related Questions in R

Related Questions in RESHAPE

Related Questions in SPREAD

Trending Questions

Popular # Hahtags

Popular Questions