I have a survey that I need to filter to include only two specific categories for comparison. A simulation of it can be reproduced with this code:
library(survey)
library(gtsummary)
data(api)
svy <- survey::svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)
svy <- subset(svy, stype %in% c("E", "H"))
And I am trying to use this code to create a table with the differences:
svy %>%
tbl_svysummary(by = stype, include = c(api00, stype),
statistic = list(all_categorical() ~ "{p}%",
all_continuous() ~ "{mean}"),
digits = ~ 2,
missing = "no") %>%
add_difference()
And getting this error: Error: 'tbl_summary'/'tbl_svysummary' object must have a by= value with exactly two levels.
How can I drop the unused level so that only two remain and I can use add_difference()?
I could apply the filter to the data frame instead, but I'm afraid that would make me lose the survey weights.
Edit, following the answer from @astra below: I made a small example with my own data, but it does not work. The fix works only with the example above.
install.packages("PNSIBGE")
library(PNSIBGE)
library(tidyverse)
pns <- get_pns(year = 2019, labels = TRUE)
pns.2 <- subset(pns, C009 %in% c("Branca", "Preta"))
pns.2$variables$C009 <- droplevels(pns.2$variables$C009)
pns.2 %>%
tbl_svysummary(by = C009, include = c(C006)) %>%
add_difference()
Use the droplevels() function:
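A minimal sketch of that step, using the api example from the question. It assumes the design's data frame can be modified in place through svy$variables, which is the same pattern the question's edit applies to the PNS data. subset() filters the rows but leaves the factor with all three of its original levels, so the unused one has to be dropped explicitly:

```r
library(survey)

data(api)

# Build the design and keep only the two school types to compare
svy <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)
svy <- subset(svy, stype %in% c("E", "H"))

# subset() removes the rows, but "M" is still a level of the factor
levels_before <- levels(svy$variables$stype)

# Drop the unused level inside the design object, so the weights are kept
svy$variables$stype <- droplevels(svy$variables$stype)
levels_after <- levels(svy$variables$stype)

print(levels_before)
print(levels_after)
```

Once stype has exactly two levels, the tbl_svysummary() call from the question can be re-run on svy with add_difference() at the end.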