SPSS: How to create sequential variables - spss

Im very new to SPSS and I am to create variables with names that are similar.
Specifically, i have to create variables:
Visit1_microbe1_test1
Visit1_microbe1_result1
Visit1_microbe1_test2
Visit1_microbe1_result2
...
Visit1_microbe2_test1
Visit1_microbe2_result1
Visit1_microbe2_test2
Visit1_microbe2_result2
...
Visit3_microbe1_test1
Visit3_microbe1_result1
...
Visit3_microbe10_test5
Visit3_microbe10_result5
I can do it manually but it will take a lot of time, please help...

There are various potential commands in SPSS to deal with repetive task such as this.
See for example:
DO REPEAT
VECTOR / LOOP
In this instance SPSS's Macro language is perhaps most apt.
So you may do something like this (This isn't an attempt to answer your exact specific requirement but enough to give you soemthing to work with to adapt to your needs):
DEFINE !CreateNewVars ().
!DO !i = 1 !TO 5
!DO !j = 2 !TO 10
COMPUTE !CONCAT("Q", !i,"_X", !j)=1.
!DOEND
!DOEND
!ENDDEFINE.
!CreateNewVars.

Related

Temporary variable aliases in SPSS syntax?

Imagine I want to run a set of the same commands over multiple variables. The variables have distinct names, so I can't loop over them.
For example, these are the commands (variable action_time):
sort cases by technique.
split file by technique.
desc action_time (Z_VAR).
compute VAR_O3SD = 0.
execute.
if (abs(Z_VAR) > 3) VAR_O3SD = 1.
execute.
GRAPH
/HISTOGRAM = action_time.
DATASET ACTIVATE dataset1.
DATASET COPY No_Outliers.
DATASET ACTIVATE No_Outliers.
FILTER OFF.
USE ALL.
SELECT IF (VAR_O3SD = 0).
EXECUTE.
DATASET ACTIVATE No_Outliers.
* Histogram (now with no outliers)
GRAPH
/HISTOGRAM = action_time.
Is there an option for using a temporary variable and setting it once instead of replacing all the occurrences? Something like this:
var = action_time
sort cases by technique.
split file by technique.
desc var (Z_VAR).
... (rest of the commands)
I know about Scratch variables (e.g. COMPUTE #var = action_time). But the problem is that commands like GRAPH only work with standard variables.
You can do this with SPSS macros. After defining a macro, running the macro creates new syntax and runs it. In your example it could look like this:
define !runthisvar (!pos=!cmdend)
sort cases by technique.
split file by technique.
desc !1 (Z_VAR).
compute VAR_O3SD = 0.
execute.
if (abs(Z_VAR) > 3) VAR_O3SD = 1.
execute.
GRAPH /HISTOGRAM = !1 .
DATASET ACTIVATE dataset1.
DATASET COPY No_Outliers.
DATASET ACTIVATE No_Outliers.
FILTER OFF.
USE ALL.
SELECT IF (VAR_O3SD = 0).
EXECUTE.
DATASET ACTIVATE No_Outliers.
* Histogram (now with no outliers)
GRAPH /HISTOGRAM = !1 .
!enddefine.
Once you run this macro definition, you can call it using
!runthisvar somevarname .
This will create a copy of your original syntax, except instead of !1 the macro will write in the variable name you gave it in the macro call.
You can also define the macro to run on a list of variables, like this:
define !runthesevars (!pos=!cmdend)
!do !i !in(!1)
.
.
desc !i (Z_VAR).
.
.
!doend
!enddefine.
and the macro call will be
!runthesevars thisvar action_time thatvar.

Extract a list of variables satisfying certain conditions and storing it in a new variable using SPSS Syntax

I have around 300 variables and I am calculating their Skewness and Kurtosis. Now, I want to create a new varaible which will consist of the list of all those variables whose Skewness and Kurtosis are within a certain range. The idea is to select only those variables which are satisfying a condition and perform normalization on all the other variables.
To calcualte Skewness i am using;
Descriptives A TO Z
/Statistics Skewness.
Execute.
I know this is not a valid Syntax but i Need something like this:
Compute x= if(Skewness(A TO Z)>1)
Please help me out with an SPSS Syntax for this.
There are multiple ways to approach this, so there might be an easier way.
you just need to change the 'var1 TO varN' to your list of variables and whatever criteria you want for Skewness & Kurtosis on the two COMPUTE lines that create the flags, and this will do it for you.
If I were doing this I would go a step further and build the normalization into the syntax using WRITE OUT = ".sps" /CMD. INSERT FILE = ".sps", but that isn't what you asked for.
DATASET DECLARE DistributionSyntax.
OMS
/SELECT TABLES
/IF SUBTYPES=["Descriptives"] INSTANCES=[1]
/DESTINATION FORMAT=SAV OUTFILE = 'DistributionSyntax'.
EXAMINE VARIABLES=var1 TO varN
/PLOT NONE
/STATISTICS DESCRIPTIVES
/CINTERVAL 95
/MISSING PAIRWISE
/NOTOTAL.
OMSEND.
DATASET ACTIVATE DistributionSyntax.
USE ALL.
FILTER OFF.
SELECT IF ANY(Var2,'Skewness','Kurtosis').
EXECUTE.
STRING VarName (A64).
COMPUTE SkewnessFlag = (Var2 = 'Skewness' AND ABS(Statistic) > 2).
COMPUTE KurtosisFlag = (Var2 = 'Kurtosis' AND ABS(Statistic) > 2).
COMPUTE VarName = CHAR.SUBSTR(Var1,1,CHAR.INDEX(Var1,' ')-1).
EXECUTE.
USE ALL.
COMPUTE filter_$=(SkewnessFlag = 1).
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMATS filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.
FRE VarName.
USE ALL.
COMPUTE filter_$=(KurtosisFlag= 1).
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMATS filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.
FRE VarName.
USE ALL.
FILTER OFF.
EXECUTE.
If you omit the select data blocks after you compute the flags and replace it with this, it will calculate normalized versions of the variables that meet your criteria. This calculates new variables, and you will want to add a file location for the syntax file (replace the "~/" in the WRITE and INSERT commands), and change the name of the dataset referenced as 'RAWDATA' to whatever your dataset name is:
USE ALL.
FILTER OFF.
SELECT IF ANY(1,SkewnessFlag,KurtosisFlag).
EXECUTE.
STRING CMD (A250).
COMPUTE CMD = CONCAT("COMPUTE ",RTRIM(VarName),".Norm = ln(",RTRIM(VarName),").").
EXECUTE.
DATA LIST /CMD 1-250 (A).
BEGIN DATA
EXECUTE.
END DATA.
DATASET NAME EXE WINDOW = FRONT.
DATASET ACTIVATE DistributionSyntax.
ADD FILES /FILE = *
/FILE = 'EXE'.
EXECUTE.
DATASET CLOSE EXE.
DATASET ACTIVATE DistributionSyntax.
WRITE OUT="~\Normalize Variables.sps" /CMD.
DATASET CLOSE DistributionSyntax.
DATASET ACTIVATE RAWDATA.
INSERT FILE="~\Normalize Variables.sps".

How to create a dummy variable

I'm working in a project that uses the IBM SPSS but I had some problems to set a dummy variable(binary variable).The process to get the variable is following : Consider an any variable(width for example), to get the dummy variable, we need
to sort this variable in the decreasing way; The next step is make a somatory of the cases until a limit, the cases before the limit receive the value 1 in the dummy variable the other values receive 0.
Your explanation is rather vague. And the critical value you give in the printscreen should be 2.009 in stead of 20.09?
But I think you mean the following.
When using syntax, use:
compute newdummyvariable eq (ABr gt 2.009477106).
To check if it's okay:
fre newdummyvariable.
UPDATE:
In order to compute a dummy based on the cumulative sum, the answer is as follows:
If your critical value is predetermined, the fastest way is to sort in decending order, and to use the command create with csum() to compute an extra variable which I called ABr_cumul. This one, you use to compute the newdummyvariable. As follows:
sort cases by ABr (d).
create ABr_cumul = csum(VAR00001).
compute newdummyvariable = (ABr_cumul le 20.094771061766488).
fre newdummyvariable.
the dummy comes from the sum of all cases, after decreasing order raqueados when cases of a variable representing 50% of the variable t0tal, these cases receive 1 and the other 0 ...

Syntax for counting cases

I work with SPSS and have difficulty finding/generating a syntax for counting cases.
I have about 120 cases and five variables. I need to know the count /proportion of cases where just one, more than one, or all of the cases have a value of 1 (dichotomous variable). Then I need to compute a new variable that shows the number / proportion of cases which include all of the aforementioned cases (also dichotomous).
For example case number one: var1=1, var2=1, var3=1, var4=0, var5=0 --> newvariable=1.
Case number two: var1=0, var2=0, var3=0, var4=0, var5=0 --> newvariable=1.
And so on...
Can anybody help me with a syntax?
Help would much appreciated!
Here we can use the sum of the variables to determine your conditions. So using a scratch variable that is the sum, we can see if it is equal to 1, more than 1 or 5 in your example.
compute #sum = SUM(var1 to var5).
compute just_one = (#sum = 1).
compute more_one = (#sum > 1).
compute all_one = (#sum = 5).
Similarly, all_one could be computed using the ANY command to evaluate if any zeroes exist, i.e. compute all_one = ANY(0,var1 to var5).. These code snippets assume that var1 to var5 are contiguous in the data frame, if not they just need to be replaced with var1,var2,var3,var4,var5 in all given instances.
You could read up on the logical function ANY in the Command Syntax Reference manual, if you negated a test for ANY with "0", then that is effectively a test for all "1"s. Use of the COUNT command would be another approach.

SPSS: Looping over Several Variables

I am working in SPSS and have a large number of variables, call them v1 to v7000.
I want to perform a series of "complex operations" on each variable to create a new set of variables: t1 to t7000.
For the sake of illustration, let's just say the "complex operation" is to have t1 be the square of v1, t2 be the square of v2, etc.
My thought is to write some code like this.
do repeat t=t1 to t7000
compute t = v*v;
end repeat.
But, I don't think this will work.
What is the right way to do this? Thanks so much in advance.
Multiple stand-in variables can be speciļ¬ed on a DO REPEAT command.
do repeat t = t1 to t7000
/v = v1 to v7000.
compute t = v**2.
end repeat.

Resources