I want to run a regression analysis in R, using a difference contrast for a nominal independent variable. However, the contrast produces factor level names that are not suitable for publication, so I want to change them, but I do not know how.
I first tried the labelled package, but that did not fix the problem: I build the table with the function tbl_regression from the gtsummary package, and adding labels did not change anything in the output. Here is some sample code:
# create data
set.seed(345)
depvar <- rnorm(300,300,60) #baseline area
indepvar <- rep(c("A","B", "C"), times=100)
data <- data.frame(depvar, indepvar)
# set indep to factor
data$indepvar <- as.factor(data$indepvar)
# model without contrast
## model 1
m1 <- lm(depvar ~ indepvar, data = data)
## create table
library(gtsummary)
tbl_regression(m1)
# model with contrast
## create contrast
library(MASS)
contrasts(data$indepvar) <- contr.sdif
## model 2
m2 <- lm(depvar ~ indepvar, data = data)
## create table
tbl_regression(m2)
I want to change indepvarB-A into something like B minus A. Below is some code to inspect the data.
## inspect data structure and attributes
head(data$indepvar)
str(data)
attributes(data$indepvar)
I see a few options: add or change value labels, or change the attributes. Or maybe there is a different or better way to create the contrast. Any advice on how to do this is much appreciated.
Essentially, what R is doing behind the scenes is a form of one-hot encoding. You can do this yourself:
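A minimal sketch of what such manual coding could look like, using the data from the question. The indicator columns A, B and C and the intercept-free formula are assumptions made for illustration; with no intercept, each coefficient is simply the mean of depvar in that group.

```r
# Recreate the example data from the question
set.seed(345)
data <- data.frame(
  depvar   = rnorm(300, 300, 60),
  indepvar = factor(rep(c("A", "B", "C"), times = 100))
)

# One 0/1 indicator column per factor level (one-hot encoding)
data$A <- as.integer(data$indepvar == "A")
data$B <- as.integer(data$indepvar == "B")
data$C <- as.integer(data$indepvar == "C")

# Assumption: drop the intercept so that all three indicators can be estimated;
# each coefficient is then the group mean of depvar
m_onehot <- lm(depvar ~ 0 + A + B + C, data = data)
coef(m_onehot)
```

Because the columns are ordinary variables, their names (and therefore the coefficient labels) are whatever you choose to call them.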
In fact you can leave off the C. The regression coefficients are slightly different but functionally equivalent. That may look more like what you want?
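If you would rather keep the successive-difference contrast, another possibility is to rename the columns of the contrast matrix before fitting, since lm builds each coefficient name from the variable name followed by the contrast column name. The sketch below assumes the data from the question; the "B minus A" style labels and the leading ": " separator are illustrative choices. Whether tbl_regression displays these names unchanged depends on how gtsummary handles custom contrasts, so check the resulting table.

```r
library(MASS)  # provides contr.sdif

# Recreate the example data from the question
set.seed(345)
data <- data.frame(
  depvar   = rnorm(300, 300, 60),
  indepvar = factor(rep(c("A", "B", "C"), times = 100))
)

lv <- levels(data$indepvar)
cm <- contr.sdif(lv)  # successive-difference contrast matrix, one row per level

# Illustrative labels; the leading ": " separates them from the variable name
colnames(cm) <- paste0(": ", lv[-1], " minus ", lv[-length(lv)])
contrasts(data$indepvar) <- cm

m2 <- lm(depvar ~ indepvar, data = data)
names(coef(m2))
```

The estimates themselves are unaffected by the renaming: each coefficient is still the difference between the means of two adjacent levels.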