Suppose I have 8 variables named MED1 to MED8 in a dataset called 2007d. I want to create a new variable called Totalmeds that holds, for each row, the number of these 8 variables in which an entry is present. The variables MED1 to MED8 also contain -9 as an entry, which is to be treated as null and not counted towards the total for the 8 variables.
For example, if MED1 = 2368, MED2 = 2344, MED3 = -9, MED4 = 2145, MED5 = -9, MED6 = -9, MED7 = 8765, and MED8 = 2345 for a certain row, then I want the total column to show 5 as the output, i.e., not counting -9 as an entry for the total.
I tried individually replacing each column with NA:
library(dplyr)
library(naniar)  # replace_with_na() comes from the naniar package
`2007d` <-
`2007d` %>%
replace_with_na(replace = list(MED1 = c(-9.00)))
but since I have multiple columns, I was wondering if there is a more efficient way.
I also tried using the mutate_at function:
`2007d` <-
`2007d` %>%
mutate_at(vars(starts_with("MED")), na_if, y = -9)
but it is not working.
The initial rows of the 8 columns in the dataset look like:
I wanted to know if there is an efficient method to change just those variables in the whole dataset, so that I do not have to modify each variable individually before calculating the total.
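Probably, you are looking for rowSums with the condition > 0, e.g. rowSums(df > 0), which counts the positive entries in each row. If it's about the sum of positive values per row, rowSums can be applied to the values themselves with the non-positive entries zeroed out.

If you want to replace all negative values with 0, simply do df[df < 0] = 0. If it's just about -9, do df[df == -9] = 0 instead. This operation is not necessary for the computations mentioned above, as those ignore negative values.

If there are more columns and we only want to select those whose names contain MED followed by one or two digits, we can first select the matching columns by name and then apply the same computations to that subset.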
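
The following is a minimal base R sketch of these steps, not necessarily the exact code from the original answer. The data frame df, the extra ID column, the second data row, and the names med_names, meds, and SumMeds are all assumptions made for illustration. Only the first row's MED values and the Totalmeds name come from the question.

```r
# Hypothetical data: row 1 is the example row from the question,
# row 2 and the ID column are made up for illustration.
df <- data.frame(
  ID   = c(1, 2),
  MED1 = c(2368, -9),
  MED2 = c(2344, 1200),
  MED3 = c(-9, -9),
  MED4 = c(2145, -9),
  MED5 = c(-9, -9),
  MED6 = c(-9, -9),
  MED7 = c(8765, -9),
  MED8 = c(2345, 3400)
)

# Names of the columns that are exactly "MED" followed by one or two digits.
med_names <- grep("^MED[0-9]{1,2}$", names(df), value = TRUE)

# Work on a numeric matrix of just those columns.
meds <- as.matrix(df[, med_names, drop = FALSE])

# Number of positive entries per row; -9 is ignored because it is not > 0.
df$Totalmeds <- rowSums(meds > 0)

# Sum of the positive values per row: the logical mask zeroes out negatives.
df$SumMeds <- rowSums(meds * (meds > 0))

# Optional: recode -9 as NA in the MED columns only, leaving other columns alone.
df[med_names][df[med_names] == -9] <- NA
```

Selecting the columns by name rather than by position keeps ID and the newly created columns out of the count. Note that the > 0 condition treats every non-positive value as missing, so it is only appropriate if valid entries are always positive.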