ifelse removes labels of a labelled variable


I'm fairly new to R and trying to recode missing values (stored as -99) to NA. For some reason, this removes all the variable labels from my data frame.

My code is the following:

df <- df %>%
  mutate(across(everything(), ~ ifelse(. == -99, NA, .)))

Is there any way to work around this, or possibly another command I could use? Thank you very much in advance!

Here is some of the data I'm using:

structure(list(yrbrn = structure(c(1965, 1952, 1952, 1969, 1980, 
1975, 1989, 2000, 2005, 1963, 2001, 1985, 2002, 1956, 1999, 1997, 
1953, 1991, 1993, 1966, 2004, 1977, 1964, 1991, 1970, 1990, 1946, 
1944, 1957, 2005, 1997, 1960, 1944, 1982, 1956, 1980, 1964, 1956, 
1957, 1957, 1949, 1997, 1948, -99, 2004, 1961, 1973, 1935, 1983, 
1964), label = "Year of birth", format.stata = "%10.0g", labels = c(`no answer` = -99, 
Refusal = NA, `Don't know` = NA, `No answer` = NA), class = c("haven_labelled", 
"vctrs_vctr", "double")), gndr = structure(c(1, -99, 1, 1, 1, 
1, 1, 2, 1, 2, 1, 2, 2, 2, 1, 1, 1, 2, 1, 1, 2, 2, 1, 2, 1, 1, 
1, 2, -99, 1, 1, 2, 1, 2, 2, 2, 1, 2, 2, 1, 1, 1, 2, 1, 1, 1, 
2, 2, 2, 1), label = "Gender", format.stata = "%9.0g", labels = c(`no answer` = -99, 
Male = 1, Female = 2, `No answer` = NA), class = c("haven_labelled", 
"vctrs_vctr", "double"))), row.names = c(NA, -50L), class = c("tbl_df", 
"tbl", "data.frame"))

Darren Tsai On BEST ANSWER

Base ifelse() does not keep the labels of a labelled column: its result takes its attributes from the logical test rather than from the values, so the class and labels are dropped. You can use if_else() from dplyr instead:

df %>%
  mutate(across(everything(), ~ if_else(. == -99, NA, .)))

# # A tibble: 50 × 2
#        yrbrn        gndr
#    <dbl+lbl>   <dbl+lbl>
#  1      1965  1 [Male]  
#  2      1952 NA         
#  3      1952  1 [Male]  
#  4      1969  1 [Male]  
#  5      1980  1 [Male]  
#  6      1975  1 [Male]  
#  7      1989  1 [Male]  
#  8      2000  2 [Female]
#  9      2005  1 [Male]  
# 10      1963  2 [Female]
# # … with 40 more rows

You can also use replace:

df %>%
  mutate(across(everything(), ~ replace(.x, .x == -99, NA)))
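
To see why replace behaves differently from ifelse, here is a small base-R sketch. It uses a plain numeric vector with a "label" attribute as a stand-in for a labelled column; that stand-in is an assumption for illustration, since real haven_labelled vectors also carry a class and value labels.

```r
# Stand-in for a labelled column: a plain numeric vector with a "label"
# attribute (hypothetical data, not the haven_labelled class itself).
x <- structure(c(1965, -99, 1980), label = "Year of birth")

# ifelse() builds its result from the logical test, which does not carry
# the "label" attribute, so the label is missing from the result.
via_ifelse <- ifelse(x == -99, NA, x)
attr(via_ifelse, "label")

# replace() assigns into a copy of x, so the attributes of x are kept.
via_replace <- replace(x, x == -99, NA)
attr(via_replace, "label")
```

Calling attributes() on a column of your own data frame before and after recoding is a quick way to confirm that nothing was lost.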
L-- On

You can do the following using the data.table grammar. The loop is not very elegant but it works:

library(data.table)
setDT(df)
for (x in names(df)) {df[get(x) == -99, (x) := NA]}
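
The same loop can probably be written with data.table's set(), which updates one column at a time by reference. This is only a sketch on a small made-up table (dt and its values are hypothetical), and it is worth checking attributes() on your own columns afterwards to make sure the labels are still there.

```r
library(data.table)

# Hypothetical toy data using the same -99 missing-value code.
dt <- data.table(
  yrbrn = c(1965, -99, 1980),
  gndr  = c(1, 2, -99)
)

# For each column, find the rows equal to -99 and overwrite them with NA
# in place. which() drops rows where the comparison itself is NA.
for (col in names(dt)) {
  set(dt, i = which(dt[[col]] == -99), j = col, value = NA)
}

dt
```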