How to load CSV files into SPSS Variable and Value Labels

Summary
Let me preface this by saying I'm new to SPSS, so I apologize if my terminology is incorrect. I have two CSV files about the same survey (one with the 'Variable Labels' and one with the 'Value Labels'). I want to combine these without having to manually write the syntax for each label (if possible).
1 - CSV with Value Labels
respondent_id, I_am_between, I_am_happy
3470220950, 26-33 years old, Sometimes
3470226804, 34-41 years old, Very Often
3470226906, 34-41 years old, Sometimes
2 - CSV with Values
respondent_id, I_am_between, I_am_happy
3470220950, 2, 3
3470226804, 3, 4
3470226906, 3, 3
What I'm looking to do is match the label '26-33 years old' of the "I_am_between" question to the value '2'. Is this possible in SPSS (and if so, how)? Thanks.
Update to Jay's solution and comment: As mentioned in Jay's post, the first method might not load the answers in the order you want if you need to keep rank/order. For example, a question 'I_have_been_with_the_company' might load the following: (1='<2 years', 2='>10 years', 3='3-5 years') when instead you would want (1='<2 years', 2='3-5 years', etc.). I fixed this by loading the second file (the one that shows values) and manually editing the labels.
VALUE LABELS
I_have_been_with_the_company
1 '<2 years'
2 '3-5 years'
3 '5-7 years'
4 '8- 10 years'
5 '>10 years'.
EXECUTE.

The easiest way to do this is to import the first file only and use automatic recode. This has the advantage of being straightforward but the disadvantage that the recoded values may not necessarily match up with the values in file 2.
GET DATA /TYPE=TXT
/FILE="file1.csv"
/ENCODING='UTF8'
/DELCASE=LINE
/DELIMITERS=","
/ARRANGEMENT=DELIMITED
/FIRSTCASE=2
/IMPORTCASE=ALL
/VARIABLES=
respondent_id F10.0
V2 A15
V3 A10.
CACHE.
AUTORECODE VARIABLES=V2 V3
/INTO I_am_between I_am_happy.
DELETE VARIABLES V2 V3.
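To see why the recoded values may not match file 2, here is a rough Python sketch of the idea behind automatic recoding: distinct strings are sorted and numbered consecutively from 1. The function name `autorecode_like` is hypothetical, and Python's default string ordering is only an approximation of whatever collation SPSS applies.

```python
def autorecode_like(values):
    """Map each distinct string to 1, 2, 3, ... in sorted order.

    Rough stand-in for automatic recoding; Python's sort order is an
    assumption and may differ from the ordering SPSS uses.
    """
    return {label: code for code, label in enumerate(sorted(set(values)), start=1)}


# I_am_happy column from file 1 (labels) and file 2 (values), same row order.
happy_labels = ["Sometimes", "Very Often", "Sometimes"]
happy_values = [3, 4, 3]

codes = autorecode_like(happy_labels)
recoded = [codes[label] for label in happy_labels]

print(codes)    # codes assigned purely from the sorted labels
print(recoded)  # compare with happy_values, which uses 3 and 4
```

Because the codes come only from the labels that actually occur, they start at 1 regardless of the numbering used in the values file.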
Alternatively, a second approach would be to import both files into separate data files, merge them using add variables, then use the STATS VALLBLS FROMDATA extension command (which you'll need to install) to apply the values of one variable as labels to another variable.
GET DATA /TYPE=TXT
/FILE="file2.csv"
/ENCODING='Locale'
/DELCASE=LINE
/DELIMITERS=","
/ARRANGEMENT=DELIMITED
/FIRSTCASE=2
/IMPORTCASE=ALL
/VARIABLES=
respondent_id F10.0
I_am_between F2
I_am_happy F2.
CACHE.
DATASET NAME DataSet1 WINDOW=FRONT.
GET DATA /TYPE=TXT
/FILE="file1.csv"
/ENCODING='UTF8'
/DELCASE=LINE
/DELIMITERS=","
/ARRANGEMENT=DELIMITED
/FIRSTCASE=2
/IMPORTCASE=ALL
/VARIABLES=
respondent_id F10.0
V2 A15
V3 A10.
CACHE.
DATASET NAME DataSet2 WINDOW=FRONT.
STAR JOIN
/SELECT t0.V2, t0.V3, t1.I_am_between, t1.I_am_happy
/FROM * AS t0
/JOIN 'DataSet1' AS t1
ON t0.respondent_id=t1.respondent_id
/OUTFILE FILE=*.
STATS VALLBLS FROMDATA VARIABLES=I_am_between I_am_happy LBLVARS=V2 V3
/OPTIONS VARSPERPASS=20
/OUTPUT EXECUTE=YES.
DELETE VARIABLES V2 V3.
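The matching logic behind this second approach can also be sketched outside SPSS. The Python below is only an illustration under assumptions: it pairs the two sample files by respondent_id, collects each value-to-label pair, and prints VALUE LABELS syntax that could be pasted into SPSS. The helper names (`read_by_id`, `build_value_labels`, `to_syntax`) are hypothetical, and labels containing apostrophes are not handled.

```python
import csv
import io

# Sample contents of the two files from the question.
LABELS_CSV = """respondent_id, I_am_between, I_am_happy
3470220950, 26-33 years old, Sometimes
3470226804, 34-41 years old, Very Often
3470226906, 34-41 years old, Sometimes
"""
VALUES_CSV = """respondent_id, I_am_between, I_am_happy
3470220950, 2, 3
3470226804, 3, 4
3470226906, 3, 3
"""


def read_by_id(text):
    """Read CSV text into {respondent_id: row}, dropping the space after each comma."""
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    return {row["respondent_id"]: row for row in reader}


def build_value_labels(labels_text, values_text):
    """Return {variable: {numeric value: label}} by matching rows on respondent_id."""
    labels, values = read_by_id(labels_text), read_by_id(values_text)
    mapping = {}
    for rid, value_row in values.items():
        label_row = labels[rid]
        for var, raw in value_row.items():
            if var != "respondent_id":
                mapping.setdefault(var, {})[int(raw)] = label_row[var]
    return mapping


def to_syntax(mapping):
    """Render one VALUE LABELS command per variable, values in ascending order."""
    commands = []
    for var, pairs in mapping.items():
        body = " ".join(f"{value} '{label}'" for value, label in sorted(pairs.items()))
        commands.append(f"VALUE LABELS {var} {body}.")
    return "\n".join(commands)


mapping = build_value_labels(LABELS_CSV, VALUES_CSV)
syntax = to_syntax(mapping)
print(syntax)
```

Only values that occur in the data receive a label this way, so any answer option that no respondent chose would still have to be labeled by hand.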

Related

Pair multiple response sets in SPSS for their direct comparison

In my database, I have 5 multiple dichotomy sets, MRST1 to MRST5 (already defined with the MRSETS command), where each set consists of the same list of items (item 1 to 10), although built from different variables (v1 to v50).
I want to create a table that directly compares the column percentages, with the sets in columns (MRST1 to MRST5) and their items (item 1 to 10) in rows.
I already tried MULT RESPONSE and MRSETS, but these do not allow for "item pairing" as far as the documentation explains; I've also used CTABLES and CROSSTABS with no success...
Any help on this would be appreciated!
Disregarding the multiple sets definitions, you can get the table you want through some restructure:
varstocases
/make grp1 from v1 to v10
/make grp2 from v11 to v20
/make grp3 from v21 to v30
/make grp4 from v31 to v40
/make grp5 from v41 to v50/index=vr(grp1)/null=keep.
means grp1 to grp5 by vr/cells=mean.
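As a rough illustration of what the restructured table contains, the Python sketch below computes the same item-by-set layout on made-up data. It assumes the dichotomies are coded 0/1 with no missing values, so that the mean of a variable is the share of cases that selected the item; the data and the function name `item_by_set_percentages` are hypothetical.

```python
# Hypothetical 0/1 data: 4 respondents, 2 sets of 3 items (v1-v3 and v4-v6).
rows = [
    [1, 0, 1, 0, 0, 1],
    [1, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 1, 1, 1],
]
ITEMS_PER_SET = 3


def item_by_set_percentages(rows, items_per_set):
    """Return a table with one row per item and one column per set.

    Each cell is the percentage of cases coded 1 for that item in that set,
    i.e. 100 times the mean of the corresponding 0/1 variable.
    """
    n_sets = len(rows[0]) // items_per_set
    table = []
    for item in range(items_per_set):
        row_pct = []
        for s in range(n_sets):
            column = [r[s * items_per_set + item] for r in rows]
            row_pct.append(100.0 * sum(column) / len(column))
        table.append(row_pct)
    return table


table = item_by_set_percentages(rows, ITEMS_PER_SET)
for item_number, row_pct in enumerate(table, start=1):
    print(f"item {item_number}", row_pct)
```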

Syntax to add a new case to the data

If I have a variable in SPSS with a name (My_Variable), a label (My Variable), values (1: Yes, 2: No), etc., but without data (the column in Data View is empty), I want to add data using syntax. For example, I want to add a participant in the 1st row who answered "Yes", so I want 1 to be added. How can I do it?
I found similar questions, but the solutions refer to creating a new SPSS window and adding the values there. I don't want this! I want to add data to an existing variable without creating a new SPSS file.
Apparently there is no way to directly add cases to an SPSS dataset through syntax.
But the following seems pretty close to me: you don't create new files, but you do create a new dataset and add it to your original.
Let's first create a small dataset to demonstrate on:
Data list list/ID (a5) var1 var2 var3 (3f2).
begin data
"first" 1 17 7
"secnd" 5 5 12
"third" 34 11 91
end data.
dataset name originalDataset.
So this is your original data. Now imagine that you want to add a new case to the data, with the ID value "hello" and the number 42 in all the columns. This is what you do:
* creating the new case in a separate dataset.
Data list list/ID (a5) var1 var2 var3 (3f2).
begin data
"hello" 42 42 42
end data.
dataset name addition.
* going back to original dataset and adding the new case.
dataset activate originalDataset.
add files /file=* /file=addition.
exe.
dataset close addition.
You don't have to create data in the first data set. Just create the variables and define them however you want.
DATASET CLOSE ALL.
INPUT PROGRAM.
NUMERIC My_Variable (F1).
VARIABLE LABELS My_Variable "I want this!".
VALUE LABELS My_Variable 1 "Yes" 2 "No".
END FILE.
END INPUT PROGRAM.
DATASET NAME Empty.
DATA LIST FREE /My_Variable.
BEGIN DATA.
1 2
END DATA.
APPLY DICTIONARY /FROM Empty
/SOURCE VARIABLES=My_Variable
/TARGET VARIABLES=My_Variable
/VARINFO VALLABELS=REPLACE VARLABEL.
DATASET CLOSE Empty.
FREQUENCIES VARIABLES ALL.
I used DATASET, but you could have saved the empty file to disk instead.
See the APPLY DICTIONARY command for more details about how it works.
Using Python, you can add data with the cases.append() method:
begin program.
import spss
spss.StartDataStep()
dataset = spss.Dataset()
dataset.cases.append([1])
spss.EndDataStep()
end program.
Say you have 3 variables; you can assign a value to each by passing a longer list to the method:
begin program.
import spss
spss.StartDataStep()
dataset = spss.Dataset()
dataset.cases.append([1,2,3])
spss.EndDataStep()
end program.
This would add a case with value 1 in the first variable, value 2 in the second variable, and value 3 in the third variable.
Note: the method only works within an open data step.
Check out the ADD FILES command. You can also add cases with Python code.

SPSS - How to create a 'Totals' row (not a column)

I have a dataset like this:
Program Timely_Count Total_Count
PROG1 51,761 53,356
PROG2 232,371 235,769
PROG3 100,756 110,859
PROG4 25,713 36,309
PROG5 17,985 18,995
PROG6 24,673 24,732
I want to create a "Total" row (not a column) so when I save this into Excel I will have a table that looks like this:
Program Timely_Count Total_Count
PROG1 51,761 53,356
PROG2 232,371 235,769
PROG3 100,756 110,859
PROG4 25,713 36,309
PROG5 17,985 18,995
PROG6 24,673 24,732
TOTAL 453,259 480,020
I know I can use the AGGREGATE command to add a TOTALS column, but that does not format the dataset the way I need for this report.
I also need this in syntax since it is run multiple times per day on multiple datasets. I have SPSS version 22 (if any of that helps).
First you aggregate, then you add the aggregated results back to your original table.
First let's recreate your sample data:
* Thousands separators are left out because DATA LIST LIST treats commas as delimiters.
data list list/Program (a20) Timely_Count Total_Count (2f8).
begin data
PROG1 51761 53356
PROG2 232371 235769
PROG3 100756 110859
PROG4 25713 36309
PROG5 17985 18995
PROG6 24673 24732
end data.
Now run this:
dataset name OrigData.
dataset declare tot.
aggregate /out='tot'/break = /Timely_Count Total_Count=sum(Timely_Count Total_Count).
add files /file=*/file=tot.
recode program (""="TOTAL").
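The aggregate-then-append idea is the same in any language. As a minimal sketch in Python (the function name `with_totals` is hypothetical), sum each count column and add the result as one extra row labeled TOTAL:

```python
# Sample data from the question, without thousands separators.
rows = [
    ("PROG1", 51761, 53356),
    ("PROG2", 232371, 235769),
    ("PROG3", 100756, 110859),
    ("PROG4", 25713, 36309),
    ("PROG5", 17985, 18995),
    ("PROG6", 24673, 24732),
]


def with_totals(rows, label="TOTAL"):
    """Return the rows plus one final row holding the column sums."""
    timely_sum = sum(timely for _, timely, _ in rows)
    total_sum = sum(total for _, _, total in rows)
    return rows + [(label, timely_sum, total_sum)]


table = with_totals(rows)
for program, timely, total in table:
    print(f"{program} {timely:,} {total:,}")
```

The last printed row carries the same totals as the target table in the question (453,259 and 480,020).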

Sort all the cases of specific variable in descending order but other will remain same using SPSS Syntax

I have two variables (id and Var1) in SPSS as shown below. I want to sort Var1 in descending order, but the other variables should not be reordered along with Var1, i.e. they should stay exactly as they were before the sort.
My data is...
id Var1
-- ----
M-1 3
M-2 4
M-3 2
M-4 7
But I want like this..
id Var1
-- ----
M-1 7
M-2 4
M-3 3
M-4 2
My Syntax/code is...
data list list
/id(A3) Var1(F2.0).
begin data.
M-1 3
M-2 4
M-3 2
M-4 7
end data.
sort cases by Var1(D).
execute.
When I run this code, it also sorts id according to Var1. But I do not want the sort to apply to all variables; I only want to sort the currently selected variable in SPSS.
Can anyone help using SPSS Syntax?
You could split the dataset, sort the Var1 variable, and then merge the parts back together. One way to do so would be this:
* create data.
data list list
/id(A3) Var1(F2.0).
begin data.
M-1 3
M-2 4
M-3 2
M-4 7
end data.
DATASET NAME ids.
DATASET COPY sortvar.
* Delete sort variable (Var1) from dataset "ids".
DELETE VARIABLES Var1.
* Keep only sort variable in dataset "sortvar".
DATASET ACTIVATE sortvar.
DELETE VARIABLES id.
* sort Var1.
SORT CASES BY Var1(D).
* Merge datasets.
MATCH FILES
/FILE ids
/FILE sortvar.
EXECUTE.
If you have lots of variables to delete in the sortvar dataset, you could also use the MATCH FILES command with the KEEP subcommand:
* Delete all variables but Var1.
DATASET ACTIVATE sortvar.
MATCH FILES
/FILE *
/KEEP Var1.
Alternatively, you can use the SAVE command in combination with the KEEP or DROP options in order to split the dataset.
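The split, sort, and merge steps above amount to sorting one column on its own and then pairing it with the other columns by row position. A minimal Python sketch of that idea, using the sample data (the function name `sort_column_only` is hypothetical):

```python
ids = ["M-1", "M-2", "M-3", "M-4"]
var1 = [3, 4, 2, 7]


def sort_column_only(ids, values):
    """Sort values in descending order and pair them with ids by position.

    The ids keep their original order, so each value is no longer tied
    to the id it started with.
    """
    return list(zip(ids, sorted(values, reverse=True)))


result = sort_column_only(ids, var1)
for row_id, value in result:
    print(row_id, value)
```

Note that after this operation a value no longer belongs to the case it came from, which is worth keeping in mind before using the result in any case-level analysis.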

Looping in SPSS to work through the cases

I have a data set in SPSS containing a sequence of six variables from which I have to create a new variable which should contain the last value present in the sequence. Let's say the data look like this: (the second row contains all missing values but represents a case to which I'll merge some other variables later, so I need this too.)
DATA LIST /V1 TO V6 1-6.
BEGIN DATA
423451

73453
929
0257
END DATA.
Now I wish to generate a variable named lastscr, which should have the values 1, ., 3, 9, 7. Can anyone help me with how to do this in SPSS? I could not find any clue about it. Thank you in advance for any help.
This can easily be done with the DO REPEAT command:
DO REPEAT Var = V1 TO V6.
IF NOT(SYSMIS(Var)) lastscr = Var.
END REPEAT.
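The loop works because each non-missing variable overwrites lastscr in turn, so the final assignment comes from the last non-missing variable in the sequence. A small Python sketch of the same logic on the sample records follows; the function name `last_score` is hypothetical, and the empty string stands in for the all-missing second case described in the question.

```python
def last_score(record, width=6):
    """Return the last non-blank single-digit field of a fixed-width record.

    Each of the `width` columns holds one variable (V1 to V6); a blank
    column is treated as missing. Returns None when every column is blank.
    """
    fields = record.ljust(width)[:width]
    last = None
    for field in fields:
        if field != " ":
            last = int(field)  # later non-missing values overwrite earlier ones
    return last


records = ["423451", "", "73453", "929", "0257"]
lastscr = [last_score(record) for record in records]
print(lastscr)
```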

Resources