I have a string that specifies a subpopulation of interest using two or more variables, and I want every variable in that string to reference the dataset explicitly.
For example, assuming the data frame is named "df", I want to change the following:
"age == 1 & sex == 1 & race == 1" to
"df$age == 1 & df$sex == 1 & df$race == 1".
I tried using attach(df) together with eval(parse(text = "age == 1 & sex == 1 & race == 1")), but ran into memory issues on my computer. I would also rather avoid 'attach' altogether if I can.
My ultimate goal is not to filter the dataset, but to set a particular variable to one value in the rows that meet the criterion and to another value in the rows that do not. With the current setup generating the strings, I cannot know in advance which variables (or how many) will be specified in the string. They could be any number of variables from colnames(df).
Your help is greatly appreciated.
I was finally able to figure this out, in four steps:
#Step 1: split text into units based on the delimiters "&" or "|". Keep the delimiters
#Step 2: paste the prefix 'df$' in front of each variable name
#Step 3: splice together the split (and reformulated) elements
#Step 4: remove any spaces between '$' character and the variable name
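A minimal sketch of these four steps, since the code itself is not shown above. The example data, the helper names (tokens, is_op, new_cond, flag) and the output column "group" with values 1/0 are assumptions for illustration only. It also assumes the condition contains no parentheses: a piece such as "(age == 1" would become "df$(age == 1", which does not parse.

```r
# Hypothetical example data; column names follow the question
df <- data.frame(age = c(1, 1, 2), sex = c(1, 2, 1), race = c(1, 1, 1))
cond <- "age == 1 & sex == 1 & race == 1"

# Step 1: split into pieces on "&" or "|", keeping the delimiters as pieces
tokens <- regmatches(cond, gregexpr("[&|]+|[^&|]+", cond))[[1]]

# Step 2: paste "df$" in front of every piece that is not a delimiter
is_op <- grepl("^[&|]+$", tokens)
tokens[!is_op] <- paste0("df$", tokens[!is_op])

# Step 3: splice the pieces back together
new_cond <- paste(tokens, collapse = "")

# Step 4: remove any spaces between "$" and the variable name
new_cond <- gsub("\\$ +", "$", new_cond)

# Use the rewritten string to fill a (hypothetical) column
flag <- eval(parse(text = new_cond))
df$group <- ifelse(flag, 1, 0)
```

Here new_cond ends up as "df$age == 1 &df$sex == 1 &df$race == 1", which R parses the same way as the spaced-out version in the question.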
If there is a shorter/faster way of doing this, I'd be keen to learn it. Thanks!
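One possibly shorter route, sketched here under assumptions rather than offered as the definitive answer: eval() accepts a data frame as its envir argument, so the original string can be evaluated against df directly, without attach() and without rewriting the string. The example data, the column name "status" and the values "yes"/"no" are hypothetical.

```r
# Hypothetical example data; column names follow the question
df <- data.frame(age = c(1, 1, 2), sex = c(1, 2, 1), race = c(1, 1, 1))
cond <- "age == 1 & sex == 1 & race == 1"

# Parse once; all.vars() lists the variable names used in the string
expr <- parse(text = cond)
stopifnot(all(all.vars(expr) %in% colnames(df)))

# Evaluate the condition with the columns of df as the lookup scope
flag <- eval(expr, envir = df)

# One value where the criterion is met, another where it is not
df$status <- ifelse(flag, "yes", "no")
```

Because the variables are looked up in df itself, this should also handle parentheses and any number of variables, as long as each name is a column of df.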