recode_shadow for multiple variables using naniar

I am just starting out in R and ran into this problem when trying to recode 88 as a special missing value using the naniar library. Given this data frame:

library(tidyverse)
df <- tibble::tribble(
  ~wind, ~temp, ~sun, ~test,
  88,    45,   88,     NA,
  68,    NA,   23,     63,
  NA,    88,   88,     15,
)

I want to change all the 88 values to NA_broken_machine in the shadow matrix. I first used bind_shadow to attach the shadow matrix to my data and then added the new missing level using recode_shadow. However, although the new level was added to every column of the shadow matrix, only the wind variable was actually recoded:

library(naniar)
dfs_recode <- df %>%
  bind_shadow() %>% 
  recode_shadow(wind = .where(wind == 88 ~ "broken_machine")) 

dfs_recode
# A tibble: 3 x 8
   wind  temp   sun  test wind_NA           temp_NA sun_NA test_NA
  <dbl> <dbl> <dbl> <dbl> <fct>             <fct>   <fct>  <fct>  
1    88    45    88    NA NA_broken_machine !NA     !NA    NA     
2    68    NA    23    63 !NA               NA      !NA    !NA    
3    NA    88    88    15 NA                !NA     !NA    !NA    

str(dfs_recode)
tibble [3 x 8] (S3: nabular/tbl_df/tbl/data.frame)
 $ wind   : num [1:3] 88 68 NA
 $ temp   : num [1:3] 45 NA 88
 $ sun    : num [1:3] 88 23 88
 $ test   : num [1:3] NA 63 15
 $ wind_NA: Factor w/ 3 levels "!NA","NA","NA_broken_machine": 3 1 2
 $ temp_NA: Factor w/ 3 levels "!NA","NA","NA_broken_machine": 1 2 1
 $ sun_NA : Factor w/ 3 levels "!NA","NA","NA_broken_machine": 1 1 1
 $ test_NA: Factor w/ 3 levels "!NA","NA","NA_broken_machine": 2 1 1

The only workaround I found was to go variable by variable, using mutate and then converting the result back to a shade. I think there is a better way to do this, but I cannot figure it out. I would really appreciate your help with this.

x <- dfs_recode %>% 
   mutate(temp_NA = factor(case_when(temp == 88 ~ "NA_broken_machine",
                                    TRUE ~ as.character(temp_NA))))
x
# A tibble: 3 x 8
   wind  temp   sun  test wind_NA           temp_NA           sun_NA test_NA
  <dbl> <dbl> <dbl> <dbl> <fct>             <fct>             <fct>  <fct>  
1    88    45    88    NA NA_broken_machine !NA               !NA    NA     
2    68    NA    23    63 !NA               NA                !NA    !NA    
3    NA    88    88    15 NA                NA_broken_machine !NA    !NA    

are_shade(x)
   wind    temp     sun    test wind_NA temp_NA  sun_NA test_NA 
  FALSE   FALSE   FALSE   FALSE    TRUE   FALSE    TRUE    TRUE 

x$temp_NA <- shade(x$temp, broken_machine = 88) 

are_shade(x)
   wind    temp     sun    test wind_NA temp_NA  sun_NA test_NA 
  FALSE   FALSE   FALSE   FALSE    TRUE    TRUE    TRUE    TRUE 

There is 1 solution below

Is this maybe what you are looking for?

dfs_recode <- df %>%
    bind_shadow() %>% 
    mutate_all(~ ifelse(.==88, "NA_broken_machine", .))

which produces:

> dfs_recode
# A tibble: 3 x 8
  wind              temp              sun                test wind_NA temp_NA sun_NA test_NA
  <chr>             <chr>             <chr>             <dbl>   <int>   <int>  <int>   <int>
1 NA_broken_machine 45                NA_broken_machine    NA       1       1      1       2
2 68                NA                23                   63       1       2      1       1
3 NA                NA_broken_machine NA_broken_machine    15       2       1      1       1

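You can remove the extra columns by adding %>% select(wind, temp, sun, test) at the end. Note that, as the output shows, this replaces 88 in the data columns themselves (which become character) and turns the shadow columns into integer codes rather than shade factors.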
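
If the goal is to keep the data untouched and recode only the shadow columns, one possible sketch is to repeat the shade(x$temp, broken_machine = 88) call from the question for every column that contains 88. In the question, recode_shadow named only wind, which is why only wind_NA changed. The helper name vars and the loop are illustrative assumptions, not a documented naniar recipe, and no output is shown because this has not been checked against a particular naniar version.

```r
library(naniar)
library(tibble)

df <- tribble(
  ~wind, ~temp, ~sun, ~test,
  88,    45,    88,   NA,
  68,    NA,    23,   63,
  NA,    88,    88,   15
)

# Attach the shadow matrix; the data columns stay numeric.
dfs <- bind_shadow(df)

# Hypothetical helper: names of the columns that contain 88 at least once.
vars <- names(df)[vapply(df, function(col) any(col == 88, na.rm = TRUE), logical(1))]

# Rebuild each matching shadow column with shade(), the same call the
# question uses for temp, so that 88 is labelled NA_broken_machine.
for (v in vars) {
  dfs[[paste0(v, "_NA")]] <- shade(dfs[[v]], broken_machine = 88)
}

dfs
are_shade(dfs)
```

Columns without any 88 (here test) keep the shadow column that bind_shadow created, so their factor levels may differ from the recoded ones.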