I have a data frame with post and follow-up measurements for approximately 200 people. In the study, we try to find out if there is a correlation between sports participation and distress symptoms. We have two measurement periods (post and follow-up) that are conducted after a workshop about health and sports. Post was conducted 6 months after the Workshop and followup one year after the workshop. We formed the following hypothesis: „Participation in sport for obese people within one year after a workshop correlates significantly positively with psychological distress symptoms at follow up.“ I assume, the dependent variable is psychological distress and the independent is the participation in sports activities. The data structure looks like:
Df
$ measurement_period : Factor w/ 2 levels "0","1": 1 1 1 1
$ psychological_distress ; int 12 45 32 85
$ participation : Factor w/ 2 levels "0","1": 1 1 1 1
$ id : num 1 2 3 4
After reading some posts here, we believe that there are 2 levels in the model: 1 ) measurement period (post and follow up) 2) id
At first we conductet a unconditiional Model (intercept only Model for confirming if a multilevel Model fits, hope that this is right) with following code:
test <-lmer(psychological_distress ~1+(1|id),data=Df
But we are not sure if the model is appropriate given the data structure and, whether the level 1 and level 2 classification is correct.
Thank you very much in advance!
Your model:
is a variance components model. It will tell you how much of the variation in
psychological_distress
is attributable to theid
level, and how much is attributable to the unit/residual level. That isn't going to answer your research question:To look into this, you need to include the participation variable as a fixed effect, and also the time variable, and their interaction. So in the first instance I would consider this: