I have a survey dataset that contains household IDs and individual IDs within each household; individual 1 is the interviewee him/herself. One variable records each individual's relationship to the interviewee (for example, 2 for spouse, 3 for parents, and so on). The data structure looks like the following:
???
What I want to do is detect whether certain values occur in var1 and, if they do, whether the values of var1 and var2 satisfy a certain condition.
For example, if var1 and var2 satisfy
(var1 == 3 & var2 == 1) | (var1 == 4 & var2 == 1)
then I want to assign the value 1 to a newly generated variable, say var3, for every individual in the same group (the household in this case, to represent family structure), and 0 otherwise.
This doesn't seem like a big problem, and I suppose I should use some
by group: egen
or
by group: gen
command, but I'm not sure. I tried commands like
gen l_w_p = 0
by hhid: replace l_w_p = 1 if (var1 == 3 & a2004 == 1) | (var2 == 4 & a2004 == 1)
by hhid: replace l_w_p = 2 if (var1 == 3 & a2004 == 2) & (var2 == 4 & a2004 == 2)
but that doesn't seem to work. Does this need some kind of loop?
I have a hard time figuring out what you are asking. A good strategy is to give an example of your data and desired output, simplified as far as possible to the essence of your problem. This is much easier than describing the data in words.
Let's start simple. Suppose you have data that looks like this:
and you want to tag households where x is ever 2. One way is
bys hhid: egen tag=max(cond(x==2,1,0))
This will produce:
Working from the inside out: for each member, cond() checks whether x is 2. If it is, the member gets a 1; if not, a 0. Then max() takes the maximum of this binary indicator over the entire household, so every member of a household gets tag = 1 if any member has x equal to 2.
The conditions can get more complicated, and cond() calls can be nested like Russian dolls.
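To make the inside-out logic concrete, here is a minimal self-contained sketch. The toy data (two households, with a hypothetical member id pid) is made up for illustration and is not the data from the question.

```stata
* Hypothetical toy data: two households, x is a member-level variable
clear
input hhid pid x
1 1 1
1 2 2
1 3 3
2 1 1
2 2 4
end

* Inner step: cond() gives each member 1 if x == 2, else 0
* Outer step: max() spreads the household maximum to every member
bysort hhid: egen tag = max(cond(x == 2, 1, 0))

list, sepby(hhid)
```

Household 1 contains a member with x equal to 2, so all three of its rows get tag = 1; household 2 has no such member, so both of its rows get tag = 0.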
Here's a more complicated example. Suppose you want to tag households where someone has
x = 2 (tag with a 1) or y >= 5 (tag with a 2) in this dataset:
We check x first, and then check y if the x condition is false:
bys hhid : egen tag=max(cond(x==2,1,cond(y>=5,2,0)))
This yields:
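The nested version can be tried on a small made-up dataset as well. The values below are assumptions chosen to cover each case, including a household where one member meets the x condition and a different member meets the y condition.

```stata
* Hypothetical toy data covering each case of the nested condition
clear
input hhid x y
1 2 1
1 1 1
2 1 6
2 3 2
3 1 1
3 3 4
4 2 1
4 1 7
end

* Per member: 1 if x == 2, otherwise 2 if y >= 5, otherwise 0;
* max() then takes the largest code within each household
bysort hhid: egen tag = max(cond(x == 2, 1, cond(y >= 5, 2, 0)))

list, sepby(hhid)
```

Note that max() picks the largest code in the household, not the first condition met: household 4 ends up with tag = 2 even though one of its members has x equal to 2. If 1 should take priority, the codes would need to be reordered. Applied to the original question, the same pattern would presumably look like bysort hhid: egen var3 = max((var1 == 3 & var2 == 1) | (var1 == 4 & var2 == 1)), assuming the variable names used there.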