Recode multiple variables into one and assign new values based on their former names - spss

As per the title, I have a dataset with one row per household (identified by ID), where each household has at most 7 kids. The variables are Child1.Age, Child1.Sex, Child1.Immunisation and so forth, up to Child7.
I would like to recode the variables such that I have all the children in variables like Children.Age, Household.ChildCount, Children.BirthOrder, Children.Immunisation, Children.Sex, and so forth. As this can't be done through the "Recode variables into different variables" option, how would I do this using either SPSS syntax or Python, while preserving the identities of multiple children from a household?

Complete this command with the rest of the needed variables:
Varstocases
/make Children.Age from Child1.Age Child2.Age Child3.Age Child4.Age ...
/make Children.Sex from Child1.Sex Child2.Sex Child3.Sex Child4.Sex...
/index=childID(Children.Age).
compute childID=substr(childID,1,6).
Then use AGGREGATE with MODE=ADDVARIABLES, breaking on the household ID, to count the children in each family.
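Since the question also allows Python, here is a minimal plain-Python sketch of the same wide-to-long restructure. The sample households, the `MAX_CHILDREN` and `FIELDS` constants and the function name `wide_to_long` are assumptions made for illustration; it does not use the SPSS Python plug-in and only mirrors the logic of VARSTOCASES followed by a per-household count.

```python
# Hypothetical wide-format data: one dict per household (values are made up).
households = [
    {"ID": 1, "Child1.Age": 9, "Child1.Sex": "F", "Child2.Age": 4, "Child2.Sex": "M"},
    {"ID": 2, "Child1.Age": 12, "Child1.Sex": "M"},
]

MAX_CHILDREN = 7
FIELDS = ("Age", "Sex")  # extend with "Immunisation" etc. as needed


def wide_to_long(rows):
    """Return one record per child, keeping the household ID and the child slot."""
    long_rows = []
    for row in rows:
        children = []
        for n in range(1, MAX_CHILDREN + 1):
            values = {f: row.get(f"Child{n}.{f}") for f in FIELDS}
            if all(v is None for v in values.values()):
                continue  # empty child slot: skip it
            child = {"ID": row["ID"], "childID": f"Child{n}"}
            child.update({f"Children.{f}": v for f, v in values.items()})
            children.append(child)
        # Add the household-level count to every child record of that household.
        for child in children:
            child["Household.ChildCount"] = len(children)
        long_rows.extend(children)
    return long_rows


long_rows = wide_to_long(households)
for record in long_rows:
    print(record)
```

Each child keeps its household ID plus a childID taken from the original variable prefix, so siblings stay distinguishable after the restructure.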

Related

Pair multiple response sets in SPSS for their direct comparison

In my database, I have 5 multiple dichotomy sets, MRST1 to MRST5 (already defined with the MRSETS command). Each set consists of the same list of items (item 1 to 10), although built from different variables (v1 to v50).
I want to create a table that directly compares the column percentages, with the sets (MRST1 to MRST5) in columns and their items (item 1 to 10) in rows.
I already tried MULT RESPONSE and MRSETS, but these do not allow for "item pairing" as far as the documentation explains; I've also tried CTABLES and CROSSTABS with no success...
Any help on this would be appreciated!
Disregarding the multiple response set definitions, you can get the table you want by restructuring the data:
varstocases
/make grp1 from v1 to v10
/make grp2 from v11 to v20
/make grp3 from v21 to v30
/make grp4 from v31 to v40
/make grp5 from v41 to v50
/index=vr(grp1)
/null=keep.
means grp1 to grp5 by vr /cells=mean.
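To make the logic of the restructure explicit, here is a rough plain-Python sketch of what the table contains: for each item (rows) and each set (columns), the mean of the corresponding variable. It assumes the dichotomies are coded 0/1 so that the mean is a proportion; the two sample cases and the function name `item_by_set_means` are made up for illustration.

```python
from statistics import mean

N_SETS, N_ITEMS = 5, 10


def item_by_set_means(cases):
    """cases: list of dicts with keys 'v1'..'v50' holding 0/1 (None if missing)."""
    table = {}
    for item in range(1, N_ITEMS + 1):
        row = {}
        for s in range(1, N_SETS + 1):
            # Item k of set s lives in variable v[(s-1)*10 + k].
            var = f"v{(s - 1) * N_ITEMS + item}"
            values = [c[var] for c in cases if c.get(var) is not None]
            row[f"grp{s}"] = mean(values) if values else None
        table[item] = row
    return table


# Two hypothetical respondents (assumed 0/1 coding).
case_a = {f"v{i}": 0 for i in range(1, 51)}
case_a.update({"v1": 1, "v11": 1})
case_b = {f"v{i}": 0 for i in range(1, 51)}
case_b.update({"v11": 1})

table = item_by_set_means([case_a, case_b])
print(table[1])  # item 1 across the five sets
```

Multiplying the means by 100 gives percentages that can be compared side by side across the five sets.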

Visualize sum of column percentage for multiple response set variables

I'm trying to understand how to display the sum of column percentages in tabulations of multiple response variables.
Suppose that I have defined the variable $q12 as a multiple response set of the categorical variables sq12m1 sq12m2 sq12m3 sq12m4 sq12m5.
I could have cases with values only in sq12m1, or cases with values in all of them.
To see how many times any brand appears in any of sq12m1 to sq12m5, I am using this:
CTABLES
/VLABELS VARIABLES=$q12 DISPLAY=DEFAULT
/TABLE $q12 [C][COUNT F40.0, COLPCT.COUNT PCT40.1]
/CATEGORIES VARIABLES=$q12 ORDER=A KEY=VALUE EMPTY=INCLUDE TOTAL=YES POSITION=AFTER
MISSING=EXCLUDE.
This generates the table, but how can I sum the column percentages? With this syntax the total is always 100%; I would like to display the sum (which in this case is 215.10%), which represents the average number of mentions...
Do you know how to do it?
Thanks!
Only one thing needs to change in your syntax: in the /TABLE subcommand, use COLPCT.RESPONSES.COUNT instead of COLPCT.COUNT:
CTABLES
/VLABELS VARIABLES=$q12 DISPLAY=DEFAULT
/TABLE $q12 [C][COUNT F40.0, COLPCT.RESPONSES.COUNT PCT40.1]
/CATEGORIES VARIABLES=$q12 ORDER=A KEY=VALUE EMPTY=INCLUDE TOTAL=YES POSITION=AFTER
MISSING=EXCLUDE.
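The arithmetic behind the two totals can be sketched in plain Python. As I understand it, both statistics divide by the number of cases, but the total row counts cases in one and responses in the other. The respondent data and the function name `summarize` below are hypothetical.

```python
from collections import Counter

# Hypothetical respondents: each list holds the brands mentioned in sq12m1..sq12m5.
respondents = [
    ["A", "B", "C"],
    ["A"],
    ["A", "B"],
    ["C", "B"],
]


def summarize(cases):
    answered = [set(c) for c in cases if c]   # cases with at least one mention
    n_cases = len(answered)
    mentions = Counter(b for c in answered for b in c)
    n_responses = sum(mentions.values())
    return {
        # Per-brand percentage, with cases as the base.
        "pct_of_cases": {b: 100 * k / n_cases for b, k in sorted(mentions.items())},
        # Total row when the numerator counts cases: always 100.
        "total_cases_over_cases": 100 * n_cases / n_cases,
        # Total row when the numerator counts responses: sum of the brand percentages.
        "total_responses_over_cases": 100 * n_responses / n_cases,
        "mean_mentions": n_responses / n_cases,
    }


summary = summarize(respondents)
print(summary)
```

In this made-up data the brand percentages are 75, 75 and 50, so the responses-based total is 200%, i.e. two mentions per respondent on average.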

Using SPSS IF syntax to create a new variable from two categorical variables

I want to create a new variable from two other variables.
The first is SEX (0=male, 1=female; there were no other genders selected by respondents though we had planned for that possibility) whereas the second is RACE9 (0=white, 1=racialized). The new variable is named SEXRACE9.
While the following code produces counts for white males, racialized males, white females and racialized females, the code fails to produce a count for total male or total female.
* Create combined sex and race categorical variable.
IF (sex=0 AND (race9=0 OR race9=1)) sexrace9=1. /*Total males - glitchy.
IF sex=0 AND race9=1 sexrace9=2. /*White males.
IF sex=0 AND race9=0 sexrace9=3. /*Racialized males.
IF (sex=1 AND (race9=0 OR race9=1)) sexrace9=4. /*Total females - glitchy.
IF sex=1 AND race9=1 sexrace9=5. /*White females.
IF sex=1 AND race9=0 sexrace9=6. /*Racialized females.
EXECUTE.
Am I missing something? Alternately, does anyone have a solution for how to insert a count for total males and total females using COMPUTE? Any help is greatly appreciated.
You are missing two key aspects:
First, your sexrace9 variable is intended to define mutually exclusive groups (i.e. each case belongs to one group, and no case can qualify for more than one group).
Second, SPSS syntax runs sequentially, line by line, so a syntax line can overwrite the results of previous lines.
More to the point:
IF (sex=0 AND (race9=0 OR race9=1)) sexrace9=1.
is partially overwritten by
IF sex=0 AND race9=1 sexrace9=2. /*White males.
because white males qualify for both sexrace9=1 and sexrace9=2, and then by the line
IF sex=0 AND race9=0 sexrace9=3. /*Racialized males.
because racialized males qualify for both sexrace9=1 and sexrace9=3.
So I am guessing that no cases have sexrace9=1 after running your syntax :)
Exactly the same logic goes for females.
I am not sure what you want to achieve with your Total Males and Total Females syntax lines. You already have the sex variable to differentiate between males and females.
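The overwriting can be reproduced outside SPSS. The following plain-Python sketch (the function name `sexrace9_sequential` is made up) applies the six conditions in the same order as the IF lines and shows that codes 1 and 4 never survive:

```python
from collections import Counter


def sexrace9_sequential(sex, race9):
    """Apply the six IF lines in order; a later match overwrites an earlier one."""
    value = None
    if sex == 0 and race9 in (0, 1):
        value = 1
    if sex == 0 and race9 == 1:
        value = 2
    if sex == 0 and race9 == 0:
        value = 3
    if sex == 1 and race9 in (0, 1):
        value = 4
    if sex == 1 and race9 == 1:
        value = 5
    if sex == 1 and race9 == 0:
        value = 6
    return value


# Every possible combination of the two source variables.
codes = {(s, r): sexrace9_sequential(s, r) for s in (0, 1) for r in (0, 1)}
print(codes)

# Totals by sex come straight from the sex variable itself.
sample = [(0, 0), (0, 1), (0, 1), (1, 0)]  # hypothetical (sex, race9) cases
totals_by_sex = Counter(sex for sex, _ in sample)
print(totals_by_sex)
```

Only the four specific codes remain, which is why the totals have to come from a separate tabulation of sex rather than from extra categories in the combined variable.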

SPSS: aggregate and count different values

In SPSS I have a variable with a lot of different values (8-digit numbers, e.g. 00000000). Every row is a person. I want to aggregate this data by postal area and count the number of different values within each postal area. Is there a way?
The result within a postal area should be between 1 and N: 1 = every person has the same value, N = every person has a different value.
Aggregate in two steps. Assuming your dataset name is data1, with variables var1 (the variable of interest) and postalcode, I would do this:
Create a dataset step1, with one row for each combination of values of postalcode and var1. Also possible by using the command casestovars.
dataset declare step1.
dataset activate data1.
aggregate outf=step1 /break=postalcode var1 /n=n(var1).
Create a dataset result with one row for each postalcode, and a variable n for the number of rows from the previous dataset step1.
dataset declare result.
dataset activate step1.
aggregate outf=result /break=postalcode /n=n(var1).
So, in conclusion: first break by both of the variables, then break only by the variable of postal code. This should do the trick!
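The same two-step idea, sketched in plain Python for anyone who wants to check the logic on a small sample first. The sample records are made up; `step1` and `result` simply mirror the dataset names used above.

```python
from collections import Counter

# Hypothetical person-level records: (postalcode, var1).
people = [
    ("1011", "00000001"),
    ("1011", "00000001"),
    ("1011", "00000002"),
    ("2022", "00000005"),
]

# Step 1: one entry per (postalcode, var1) combination, with its frequency.
step1 = Counter(people)

# Step 2: count how many distinct combinations each postalcode has.
result = Counter(postalcode for postalcode, _ in step1)

print(dict(result))
```

Postal code 1011 has three people but only two different values, so its count is 2.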

Sort all the cases of specific variable in descending order but other will remain same using SPSS Syntax

I have two variables (id and Var1) in SPSS as below. I want to sort Var1 in descending order, but the other variables should not be reordered along with Var1, i.e. they should stay exactly as they were before the sort.
My data is...
id Var1
-- ----
M-1 3
M-2 4
M-3 2
M-4 7
But I want like this..
id Var1
-- ----
M-1 7
M-2 4
M-3 3
M-4 2
My Syntax/code is...
data list list
/id(A3) Var1(F2.0).
begin data.
M-1 3
M-2 4
M-3 2
M-4 7
end data.
sort cases by Var1(D).
execute.
When I run this code it also sorts id according to Var1. But I do not want the sort to apply to all variables; I only want to sort the currently selected variable in SPSS.
Can anyone help using SPSS Syntax?
You could split the dataset, sort the Var1 variable, and then merge the two parts back together. One way to do so would be this:
* create data.
data list list
/id(A3) Var1(F2.0).
begin data.
M-1 3
M-2 4
M-3 2
M-4 7
end data.
DATASET NAME ids.
DATASET COPY sortvar.
* Delete sort variable (Var1) from dataset "ids".
DELETE VARIABLES Var1.
* Keep only sort variable in dataset "sortvar".
DATASET ACTIVATE sortvar.
DELETE VARIABLES id.
* sort Var1.
SORT CASES BY Var1(D).
* Merge datasets.
MATCH FILES
/FILE=ids
/FILE=sortvar.
EXECUTE.
If you have lots of variables to delete in the sortvar dataset you could also use the MATCH FILES command with the KEEP subcommand:
* Delete all variables but Var1.
DATASET ACTIVATE sortvar.
MATCH FILES
/FILE=*
/KEEP=Var1.
Alternatively, you can use the SAVE command in combination with the KEEP or DROP subcommands in order to split the dataset.
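The split, sort and merge idea is easy to see in a few lines of plain Python, using the sample data from the question. The variable names `sorted_var1` and `merged` are just illustrative.

```python
# Sample data from the question, held as two separate columns.
ids = ["M-1", "M-2", "M-3", "M-4"]
var1 = [3, 4, 2, 7]

# Sort only the Var1 column in descending order; ids keep their original order.
sorted_var1 = sorted(var1, reverse=True)

# Merge the columns back together by position (no key), like the merge above.
merged = list(zip(ids, sorted_var1))

for row_id, value in merged:
    print(row_id, value)
```

Note that after this operation a value no longer belongs to the id it originally came with; the pairing is purely positional.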

Resources