R: Parsing language object to get Formula


I am trying to parse a selection object (returned by the selection function in the sampleSelection package), so that it becomes fit for the construction of a Formula object (from the Formula package).

A concrete example of what I want is given below. I have a strategy in mind, but for that strategy to work, I need to understand the R language data type a little better.

I am basically looking for an explanation of the R language data type/object in the context below.

Here is an example:

library(Formula)
library(sampleSelection)
data(Mroz87)
# define a new variable
Mroz87$kids  = (Mroz87$kids5 + Mroz87$kids618 > 0)
# create the estimation sample
Mroz87Est = Mroz87[1:600, ]
# create the hold out sample
Mroz87Holdout = Mroz87[601:nrow(Mroz87), ]
# estimate the model using MLE
heckML = selection(selection = lfp ~ age + I(age^2) + faminc + kids + educ,
                   outcome = wage ~ exper + I(exper^2) + educ + city,
                   data = Mroz87Est)
summary(heckML)

This code estimates a Heckman sample selection model and the model object of class selection is available in heckML. It has a complicated structure which can be seen by a call to str(heckML).

I need to be able to populate a Formula object like this programmatically from the selection object heckML:

FormHeck = Formula(lfp |  wage ~ age + I(age^2) + faminc + kids + educ | 
                     exper + I(exper^2) + educ + city)

for downstream processing.

I know that all the components I need to populate this are available in heckML$call$selection and heckML$call$outcome, and I can use them like so:

tempS = evalq(heckML$call$selection)
tempO = evalq(heckML$call$outcome)

as.Formula(paste0(tempO[2], '|', tempS[2], '~', tempO[3], '|', tempS[3]))

but I have no idea why this works. Note that tempS and tempO are objects of type language.
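
To make the question concrete, here is a minimal base R sketch of how such an object can be inspected. It uses a quoted formula as a stand-in for heckML$call$selection; that the stored call has exactly this shape is an assumption, not something confirmed from the sampleSelection source. Under that assumption, element 1 of the call is the function `~`, element 2 is the left-hand side, and element 3 is the right-hand side, which is presumably why indexing with 2 and 3 picks out the pieces pasted together above.

```r
# Stand-in for heckML$call$selection (assumed shape): an unevaluated formula call.
tempS <- quote(lfp ~ age + I(age^2) + faminc + kids + educ)

typeof(tempS)   # "language"
class(tempS)    # "call", not "formula", because the call has not been evaluated

# A call behaves like a list: function first, then its arguments.
parts <- as.list(tempS)
length(parts)   # 3: `~`, the left-hand side, the right-hand side

lhs <- tempS[[2]]   # the symbol lfp
rhs <- tempS[[3]]   # the call age + I(age^2) + faminc + kids + educ

# deparse() turns each piece back into text suitable for paste0().
lhs_txt <- deparse(lhs)
rhs_txt <- deparse(rhs)
```

Note that single brackets (tempS[2]) return a one-element call rather than the element itself, whereas double brackets (tempS[[2]]) return the symbol or sub-call directly.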

a. What does evalq do with the language object? What is it supposed to do?
b. How is a language object different from an expression object? When should each be used? Pointers to readings are welcome.

Lastly, I was wondering if there is a nicer way to populate the Formula object FormHeck from the return object heckML. The above is just one strategy that works, and until I understand why it works, it is basically a hack.

Thanks.
