I am trying to recode a variable that indicates total number of responses to a multiple response survey question. Question 4 has options 1, 2, 3, 4, 5, 6, and participants may choose one or more options when submitting a response. The data is currently coded as binary outputs for each option: var Q4___1
= yes or no (1/0), var Q4___2 = yes or no (1/0), and so forth.
This is the tabstat
of all yes (1) responses to the 6 Q4___* variables
Variable | Sum
-------------+----------
q4___1 | 63
q4___2 | 33
q4___3 | 7
q4___4 | 2
q4___5 | 3
q4___6 | 7
------------------------
total = 115
I would like to create a new variable that encapsulates these values.
Can someone help me figure out how to create this variable, and if coding a variable in this manner for a multiple option survey question is valid?
When I used the replace
command the total number of responses were not adding up, as shown below
gen q4=.
replace q4 =1 if q4___1 == 1
replace q4 =2 if q4___2 == 1
replace q4 =3 if q4___3 == 1
replace q4 =4 if q4___4 == 1
replace q4 =5 if q4___5 == 1
replace q4 =6 if q4___6 == 1
label values q4 primarysource`
q4 | Freq. Percent Cum.
------------+-----------------------------------
1 | 46 48.94 48.94
2 | 31 32.98 81.91
3 | 6 6.38 88.30
4 | 1 1.06 89.36
5 | 3 3.19 92.55
6 | 7 7.45 100.00
------------+-----------------------------------
Total | 94 100.00
UPDATE to specify I am trying to create a new variable that captures the column sum of each question, not the rowtotal across all questions. I know that 63 participants responded yes to question 4 a) and 33 to question 4 b) so I want my new variable to reflect that.
This is what I want my new variable's values to look like.
q4
-------------+----------
q4___1 | 63
q4___2 | 33
q4___3 | 7
q4___4 | 2
q4___5 | 3
q4___6 | 7
------------------------
total = 115
The fallacy here is ignoring the possibility of multiple 1s as answers to the various
Q4????
variables. For example if someone answers1 1 1 1 1 1
to all questions, they appear in your final variable only in respect of their answer to the 6th question. Otherwise put, your code overwrites and so ignores all positive answers before the last positive answer.What is likely to be more useful are
(1) the total across all 6 questions which is just
where the 4 instances of
?
mean that by eye I count 3 underscores and 1 numeral.(2) a concatenation of responses that is just
(3) a variable that is a concatenation of questions with positive responses, so 246 if those questions were answered 1 and the others were answered 0.
EDIT
Here is a test script giving concrete examples.
Results: