Label missing values as any other strings


I create a labelled column with haven.

library(dplyr)
library(haven)

df <- tibble(x = labelled(c(1:3, NA), c('a' = 1, 'b' = 2, 'missing' = NA)))

# # A tibble: 4 × 1
#           x
#   <int+lbl>
# 1     1 [a]
# 2     2 [b]
# 3     3
# 4    NA

It seems that the missing value fails to be labelled. I expected to see something like

# # A tibble: 4 × 1
#           x
#   <int+lbl>
# 1     1 [a]
# 2     2 [b]
# 3     3
# 4    NA [missing]

In addition, as_factor gives

as_factor(df$x)

# [1] a    b    3    <NA>
# Levels: a b 3 missing

but I want

# [1] a    b    3    missing
# Levels: a b 3 missing

Maël (best answer)

With haven::tagged_na:

x <- labelled(c(1:3, tagged_na("x")), c('a' = 1, 'b' = 2, "missing" = tagged_na("x")))
x

# <labelled<double>[4]>
# [1]     1     2     3 NA(x)
#
# Labels:
#  value   label
#      1       a
#      2       b
#  NA(x) missing

Works as intended with as_factor:

as_factor(x)
#[1] a       b       3       missing
#Levels: a b 3 missing
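
If the data already contain ordinary NAs, as in the question, they would first have to be swapped for a tagged NA. The following is only a sketch of one way to do that: the tag "x" is arbitrary, and the conversion to double is needed because tagged NAs exist only for double vectors. It relies on element assignment preserving the tag, which I believe holds on common platforms.

```r
library(haven)

# Plain values with an ordinary NA, as in the question.
# Tagged NAs only exist for doubles, so convert from integer first.
vals <- as.double(c(1:3, NA))

# Swap every ordinary NA for a tagged NA (the tag "x" is an arbitrary choice).
vals[is.na(vals)] <- tagged_na("x")

# Attach the labels; the "missing" label points at the same tagged NA.
x <- labelled(vals, c(a = 1, b = 2, missing = tagged_na("x")))

is.na(x)        # the tagged value is still treated as missing
na_tag(vals)    # tag of each element; NA where there is no tag
as_factor(x)    # the tagged NA is mapped to the "missing" level
```

Because is.na() is still TRUE for the tagged value, functions such as mean(..., na.rm = TRUE) should keep treating it as missing.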

TarJae

We could use fct_explicit_na() from the forcats package to turn the NA values of the factor into an explicit level:

library(dplyr)
library(haven)
library(forcats)

df1 <- df %>% 
  mutate(x = fct_explicit_na(as_factor(x), na_level = "missing"))

levels(df1$x)

# [1] "a"       "b"       "3"       "missing"
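
Note that fct_explicit_na() was deprecated in forcats 1.0.0 in favour of fct_na_value_to_level(). A minimal sketch of the same idea with the newer function follows. It assumes forcats >= 1.0.0 and, to keep the example simple, builds the vector without a 'missing' label, so that level is created by forcats rather than by haven.

```r
library(haven)
library(forcats)

# Labelled vector with an ordinary, unlabelled NA
# (an illustrative variant of the question's data).
x <- labelled(c(1:3, NA), c(a = 1, b = 2))

# Convert to a factor, then turn NA values into an explicit "missing" level.
f <- fct_na_value_to_level(as_factor(x), level = "missing")

levels(f)
```

Unlike the tagged_na approach, this changes only the resulting factor; the labelled column itself still holds a plain NA.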