Note: I used one-hot encoding because several columns in my data are categorical, each with multiple labels. I have built a linear regression model in R, and from the model output I can extract the linear equation. Because each categorical column contains multiple labels, the equation has a separate coefficient for every label. Instead, I want a single coefficient per column, which should be the average of that column's label coefficients or something similar.
My Code:
**Linear Regression:**
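library(caret)  # dummyVars()
library(dplyr)  # case_when(), if_else()
# define the one-hot encoding for every column
dummy <- dummyVars(" ~ .", data = mydata)
# perform one-hot encoding on the data frame
final_df_supp <- data.frame(predict(dummy, newdata = mydata))
# fit the model on the encoded data frame, which produces names such as x1.a
mymodel <- lm(y ~ ., data = final_df_supp)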
**Extract linear Equation from Linear Model**
model_equation <- function(model, ...) {
  # extra arguments (e.g. digits, trim) are passed on to format()
  format_args <- list(...)
  model_coeff <- model$coefficients
  format_args$x <- abs(model$coefficients)
  # pick " - " or " + " from the sign of each coefficient
  model_coeff_sign <- sign(model_coeff)
  model_coeff_prefix <- case_when(model_coeff_sign == -1 ~ " - ",
                                  model_coeff_sign == 1 ~ " + ",
                                  model_coeff_sign == 0 ~ " + ")
  # response name, then intercept, then one "sign value * name" piece per term
  model_eqn <- paste(strsplit(as.character(model$call$formula), "~")[[2]], # 'y'
                     "=",
                     paste(if_else(model_coeff[1] < 0, "- ", ""),
                           do.call(format, format_args)[1],
                           paste(model_coeff_prefix[-1],
                                 do.call(format, format_args)[-1],
                                 " * ",
                                 names(model_coeff[-1]),
                                 sep = "", collapse = ""),
                           sep = ""))
  return(model_eqn)
}
library(MASS)
y = data.frame(model_equation(mymodel , digits = 3, trim = TRUE))
Output: y = 2+0.05x1.a+0.08x1.b+0.3x1.c+3y1.ab+5y1.bc+4y1.cd
data:
x1  y1
a   ab
b   bc
c   cd
The output I want is: y = 2 + 0.34x1 + 6y1
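
To make the request concrete, here is a minimal sketch of the kind of per-column averaging described above. It is only an illustration under stated assumptions: the data frame `toy` and its values are made up, the model is fit on factor columns directly with `lm()` (so the first label of each column is the baseline and has no coefficient of its own), and the names `coef_term`, `avg_coef`, and `eqn` are hypothetical. The grouping uses the `assign` attribute of the model matrix, which records which term each coefficient belongs to.

```r
# Toy data standing in for `mydata` (made-up values, not the data from the question)
set.seed(42)
toy <- data.frame(
  x1 = factor(rep(c("a", "b", "c"), times = 20)),
  y1 = factor(rep(c("ab", "bc", "cd"), each = 20))
)
toy$y <- 2 + as.integer(toy$x1) + 3 * as.integer(toy$y1) + rnorm(nrow(toy))

# lm() dummy-codes factors itself; the first level of each factor is the baseline
fit <- lm(y ~ x1 + y1, data = toy)

# Map every coefficient to its source term: 0 = intercept, 1 = x1, 2 = y1
term_index <- attr(model.matrix(fit), "assign")
term_names <- c("(Intercept)", attr(terms(fit), "term.labels"))
coef_term  <- term_names[term_index + 1]

# Average the label coefficients within each original column
avg_coef <- vapply(split(coef(fit), coef_term), mean, numeric(1), na.rm = TRUE)
avg_coef <- avg_coef[term_names]  # restore the model's term order

# Assemble an equation string with one coefficient per original column
rhs <- paste0(ifelse(avg_coef[-1] < 0, " - ", " + "),
              format(abs(avg_coef[-1]), digits = 3, trim = TRUE),
              " * ", names(avg_coef)[-1], collapse = "")
eqn <- paste0("y = ", format(avg_coef[[1]], digits = 3), rhs)
print(eqn)
```

An averaged coefficient like this is a summary of the label effects rather than something that reproduces the model's predictions, and with baseline coding it averages only the non-baseline labels. If the model is fit on an already one-hot-encoded data frame instead, the grouping could be derived from the column-name prefix (the part before the `.`) rather than from `assign`.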