I have the following decision rules:
RELIABILITY LEVEL    DESCRIPTION
LEVEL I              Multiple regression
LEVEL II             Multiple regression + mechanisms specified (all interest variables)
LEVEL III            Multiple regression + mechanisms specified (all interest + control vars)
The first three columns are the data from which the 4th column should be reproduced using dplyr.
The reliability level should be the same for the whole table (model).
Here is my attempt so far. As you can see, I can't get the level to be the same for the whole model:
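library(tidyverse)
library(readxl)
library(effectsize)
# read_excel() only reads local files, so download the workbook first
url <- "https://github.com/timverlaan/relia/blob/59d2cbc5d7830c41542c5f65449d5f324d6013ad/relia.xlsx?raw=true"
tmp <- tempfile(fileext = ".xlsx")
download.file(url, tmp, mode = "wb")
df <- read_excel(tmp)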
df1 <- df %>%
  # number of variables per role (interest/control) within each model
  group_by(study, table, function_var) %>%
  mutate(count_vars = n()) %>%
  ungroup() %>%
  # number of variables per role that describe a mechanism
  group_by(study, table, function_var, mechanism_described) %>%
  mutate(count_int = case_when(
    function_var == 'interest' & mechanism_described == 'yes' ~ n()
  )) %>%
  mutate(count_con = case_when(
    function_var == 'control' & mechanism_described == 'yes' ~ n()
  )) %>%
  # flag rows where every variable of that role describes a mechanism
  mutate(reliable_int = case_when(
    function_var == 'interest' & count_vars/count_int == 1 ~ 1)) %>%
  mutate(reliable_con = case_when(
    function_var == 'control' & count_vars/count_con == 1 ~ 1)) %>%
  # group_by(study, source) %>%
  # reliable_int and reliable_con are set on different rows,
  # so the level below is computed per row rather than per model
  mutate(reliable = case_when(
    reliable_int != 1 ~ 1,
    reliable_int == 1 ~ 2,
    reliable_int + reliable_con == 2 ~ 3)) %>%
  ungroup()
The code settled on is:
However, I would prefer to do this all within one dplyr pipe. If anyone has a solution I would love to hear it...
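For reference, here is a minimal sketch of how the decision rules could be written in a single pipe. It is not the code settled on above. It assumes that a model is identified by study and table, that function_var takes the values 'interest' and 'control', and that mechanism_described is 'yes' or 'no'. The toy data and the helper columns all_int and all_con are made up for illustration.

```r
library(dplyr)

# Toy data standing in for relia.xlsx; column names follow the attempt above,
# the values are invented for illustration only.
df <- tibble::tribble(
  ~study, ~table, ~function_var, ~mechanism_described,
  "A",    1,      "interest",    "yes",
  "A",    1,      "control",     "yes",
  "A",    2,      "interest",    "yes",
  "A",    2,      "control",     "no",
  "B",    1,      "interest",    "no",
  "B",    1,      "control",     "yes"
)

df1 <- df %>%
  group_by(study, table) %>%   # one group per model, so every row gets the same level
  mutate(
    # TRUE when every variable of that role describes a mechanism
    all_int = all(mechanism_described[function_var == "interest"] == "yes"),
    all_con = all(mechanism_described[function_var == "control"] == "yes"),
    reliable = case_when(
      all_int & all_con ~ 3,   # LEVEL III
      all_int           ~ 2,   # LEVEL II
      TRUE              ~ 1    # LEVEL I
    )
  ) %>%
  ungroup() %>%
  select(-all_int, -all_con)   # drop the hypothetical helper columns
```

Grouping only by study and table is what keeps the level constant within a model. Note that all() returns TRUE for an empty vector, so a model without any control variables would count as having all control mechanisms specified; whether that is the intended behaviour depends on the rules.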