Temporary variable aliases in SPSS syntax? - spss

Imagine I want to run a set of the same commands over multiple variables. The variables have distinct names, so I can't loop over them.
For example, these are the commands (variable action_time):
sort cases by technique.
split file by technique.
desc action_time (Z_VAR).
compute VAR_O3SD = 0.
execute.
if (abs(Z_VAR) > 3) VAR_O3SD = 1.
execute.
GRAPH
/HISTOGRAM = action_time.
DATASET ACTIVATE dataset1.
DATASET COPY No_Outliers.
DATASET ACTIVATE No_Outliers.
FILTER OFF.
USE ALL.
SELECT IF (VAR_O3SD = 0).
EXECUTE.
DATASET ACTIVATE No_Outliers.
* Histogram (now with no outliers).
GRAPH
/HISTOGRAM = action_time.
Is there an option for using a temporary variable and setting it once instead of replacing all the occurrences? Something like this:
var = action_time
sort cases by technique.
split file by technique.
desc var (Z_VAR).
... (rest of the commands)
I know about Scratch variables (e.g. COMPUTE #var = action_time). But the problem is that commands like GRAPH only work with standard variables.

You can do this with SPSS macros. After defining a macro, running the macro creates new syntax and runs it. In your example it could look like this:
define !runthisvar (!pos=!cmdend)
sort cases by technique.
split file by technique.
desc !1 (Z_VAR).
compute VAR_O3SD = 0.
execute.
if (abs(Z_VAR) > 3) VAR_O3SD = 1.
execute.
GRAPH /HISTOGRAM = !1 .
DATASET ACTIVATE dataset1.
DATASET COPY No_Outliers.
DATASET ACTIVATE No_Outliers.
FILTER OFF.
USE ALL.
SELECT IF (VAR_O3SD = 0).
EXECUTE.
DATASET ACTIVATE No_Outliers.
* Histogram (now with no outliers).
GRAPH /HISTOGRAM = !1 .
!enddefine.
Once you run this macro definition, you can call it using
!runthisvar somevarname .
This generates a copy of your original syntax in which every !1 is replaced by the variable name you gave in the macro call.
You can also define the macro to run on a list of variables, like this:
define !runthesevars (!pos=!cmdend)
!do !i !in(!1)
.
.
desc !i (Z_VAR).
.
.
!doend
!enddefine.
and the macro call will be
!runthesevars thisvar action_time thatvar.
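To make the substitution idea concrete, here is a minimal Python sketch (not SPSS) of what the macro expansion amounts to: plain text replacement, once per variable in the list. The template is a shortened, hypothetical stand-in for the macro body, and `expand` is an illustrative helper, not part of any SPSS API.

```python
# Shortened, hypothetical stand-in for the macro body; "!1" marks the variable name.
TEMPLATE = """desc !1 (Z_VAR).
GRAPH /HISTOGRAM = !1 ."""


def expand(template, variables):
    """Return the syntax text produced by substituting each variable name in turn."""
    return "\n".join(template.replace("!1", name) for name in variables)


# Same variable list as in the macro call above.
generated = expand(TEMPLATE, ["thisvar", "action_time", "thatvar"])
print(generated)
```

The macro facility only rewrites text before the commands run, which is why it also works for commands such as GRAPH that need real variable names.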

Related

Extract a list of variables satisfying certain conditions and storing it in a new variable using SPSS Syntax

I have around 300 variables and I am calculating their skewness and kurtosis. Now I want to create a new variable that holds the list of all variables whose skewness and kurtosis are within a certain range. The idea is to select only the variables that satisfy a condition and to perform normalization on all the other variables.
To calculate skewness I am using:
Descriptives A TO Z
/Statistics Skewness.
Execute.
I know this is not valid syntax, but I need something like this:
Compute x= if(Skewness(A TO Z)>1)
Please help me out with SPSS syntax for this.
There are multiple ways to approach this, so there might be an easier way.
You just need to change 'var1 TO varN' to your list of variables and set whatever criteria you want for skewness and kurtosis on the two COMPUTE lines that create the flags, and this will do it for you.
If I were doing this I would go a step further and build the normalization into the syntax using WRITE OUT = ".sps" /CMD. INSERT FILE = ".sps", but that isn't what you asked for.
DATASET DECLARE DistributionSyntax.
OMS
/SELECT TABLES
/IF SUBTYPES=["Descriptives"] INSTANCES=[1]
/DESTINATION FORMAT=SAV OUTFILE = 'DistributionSyntax'.
EXAMINE VARIABLES=var1 TO varN
/PLOT NONE
/STATISTICS DESCRIPTIVES
/CINTERVAL 95
/MISSING PAIRWISE
/NOTOTAL.
OMSEND.
DATASET ACTIVATE DistributionSyntax.
USE ALL.
FILTER OFF.
SELECT IF ANY(Var2,'Skewness','Kurtosis').
EXECUTE.
STRING VarName (A64).
COMPUTE SkewnessFlag = (Var2 = 'Skewness' AND ABS(Statistic) > 2).
COMPUTE KurtosisFlag = (Var2 = 'Kurtosis' AND ABS(Statistic) > 2).
COMPUTE VarName = CHAR.SUBSTR(Var1,1,CHAR.INDEX(Var1,' ')-1).
EXECUTE.
USE ALL.
COMPUTE filter_$=(SkewnessFlag = 1).
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMATS filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.
FRE VarName.
USE ALL.
COMPUTE filter_$=(KurtosisFlag= 1).
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMATS filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.
FRE VarName.
USE ALL.
FILTER OFF.
EXECUTE.
If you omit the select data blocks after you compute the flags and replace them with this, it will calculate normalized versions of the variables that meet your criteria. This calculates new variables, and you will want to add a file location for the syntax file (replace the "~\" in the WRITE and INSERT commands), and change the name of the dataset referenced as 'RAWDATA' to whatever your dataset name is:
USE ALL.
FILTER OFF.
SELECT IF ANY(1,SkewnessFlag,KurtosisFlag).
EXECUTE.
STRING CMD (A250).
COMPUTE CMD = CONCAT("COMPUTE ",RTRIM(VarName),".Norm = ln(",RTRIM(VarName),").").
EXECUTE.
DATA LIST /CMD 1-250 (A).
BEGIN DATA
EXECUTE.
END DATA.
DATASET NAME EXE WINDOW = FRONT.
DATASET ACTIVATE DistributionSyntax.
ADD FILES /FILE = *
/FILE = 'EXE'.
EXECUTE.
DATASET CLOSE EXE.
DATASET ACTIVATE DistributionSyntax.
WRITE OUT="~\Normalize Variables.sps" /CMD.
DATASET CLOSE DistributionSyntax.
DATASET ACTIVATE RAWDATA.
INSERT FILE="~\Normalize Variables.sps".
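The generate-then-run idea above can be sketched outside SPSS as well. The following Python sketch is only an illustration of the logic: the rows are made-up data in the assumed shape of the OMS table (a label whose first word is the variable name, the statistic name, and its value), and the threshold of 2 mirrors the COMPUTE lines above. Unlike the SPSS version, it emits one command per variable even when both flags are set.

```python
# Made-up rows in the assumed shape of the OMS table: (Var1, Var2, Statistic).
rows = [
    ("income Income in USD", "Skewness", 3.4),
    ("income Income in USD", "Kurtosis", 11.2),
    ("age Age in years", "Skewness", 0.3),
    ("age Age in years", "Kurtosis", -0.8),
    ("visits Number of visits", "Skewness", 1.1),
    ("visits Number of visits", "Kurtosis", 2.6),
]
THRESHOLD = 2.0  # same cut-off as in the flag computations above


def flagged_names(table, threshold=THRESHOLD):
    """Return variable names whose skewness or kurtosis exceeds the threshold."""
    names = []
    for label, stat, value in table:
        name = label.split(" ", 1)[0]  # text before the first space
        if stat in ("Skewness", "Kurtosis") and abs(value) > threshold:
            if name not in names:
                names.append(name)
    return names


# Build the syntax lines that would be written to the .sps file.
commands = [f"COMPUTE {name}.Norm = ln({name})." for name in flagged_names(rows)]
commands.append("EXECUTE.")
print("\n".join(commands))
```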

intersect multiple sets with lua script using redis.call("sinter", ...) command

I want to intersect multiple sets (2 or more). The sets to be intersected are passed from the command line, so the number of arguments to the redis.call() function is not known in advance.
How can I do this using the redis.call() function in a Lua script?
I have written a script that uses the following algorithm:
Accept the number of sets to be intersected in KEYS[1].
Intersect the first two sets using setIntersected = redis.call("sinter", ARGV[1], ARGV[2]).
Run a loop using setIntersected = redis.call("sinter", tostring(setIntersected), set[i]).
Finally, I should get the intersected set.
The code for the above algorithm is :
local noOfArgs = KEYS[1] -- storing the number of arguments that will get passed from cli
--[[
run a loop noOfArgs time and initialize table elements, since we don't know the number of sets to be intersected so we will use Table (arrays)
--]]
local setsTable = {}
for i = 1, noOfArgs, 1 do
setsTable[i] = tostring(ARGV[i])
end
-- now find intersection
local intersectedVal = redis.call("sinter", setsTable[1], setsTable[2]) -- finding first intersection because atleast we will have two sets
local new_updated_set = ""
for i = 3, noOfArgs, 1 do
new_updated_set = tostring(intersectedVal)
intersectedVal = redis.call("sinter", new_updated_set, setsTable[i])
end
return intersectedVal
This script works fine when I pass two sets using command-line.
EG:
redis-cli --eval scriptfile.lua 2 , points:Above20 points:Above30
output:-
1) "playerid:1"
2) "playerid:2"
3) "playerid:7"
Where points:Above20 and points:Above30 are sets. This time it doesn't go through the for loop which starts from i = 3.
But when I pass 3 sets then I always get the output as:
(empty list or set)
So there is some problem with the loop I have written to find intersection of sets.
Where am I going wrong? Is there any optimized way using which I can find the intersection of multiple sets directly?
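What you're probably looking for is the elusive unpack() Lua function, which is equivalent to what is known as the "splat" operator in other languages. Your loop fails because SINTER returns a Lua table, and tostring() of a table yields a string such as "table: 0x...", which is not the name of any set, so the next intersection is empty.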
In your code, use the following:
local intersectedVal = redis.call("sinter", unpack(setsTable))
That said, SINTER is variadic and can accept multiple keys as arguments. Unless your script does something in addition to just intersecting, you'd be better off using it directly.
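The difference between the variadic call and the original loop can be reproduced without Redis. This is a minimal Python sketch: `store` is a hypothetical in-memory stand-in for the keyspace with made-up members, and `sinter` only mimics the relevant behaviour of SINTER (a missing key counts as an empty set).

```python
# Hypothetical in-memory stand-in for the Redis keyspace (made-up members).
store = {
    "points:Above20": {"playerid:1", "playerid:2", "playerid:7", "playerid:9"},
    "points:Above30": {"playerid:1", "playerid:2", "playerid:7"},
    "points:Above40": {"playerid:2", "playerid:7"},
}


def sinter(*keys):
    """Mimic SINTER: intersect the sets stored at keys; a missing key is empty."""
    sets = [store.get(key, set()) for key in keys]
    return set.intersection(*sets) if sets else set()


keys = ["points:Above20", "points:Above30", "points:Above40"]

# Variadic call, the analogue of redis.call("sinter", unpack(setsTable)).
all_at_once = sinter(*keys)

# The original loop: the previous result is stringified and then used as a key.
buggy = sinter(keys[0], keys[1])
for key in keys[2:]:
    buggy = sinter(str(buggy), key)  # str(buggy) names no stored set

print(sorted(all_at_once))
print(buggy)
```

With two sets the loop body never runs, which is why the original script appeared to work in that case.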

How to create a dummy variable

I'm working on a project that uses IBM SPSS, but I had some problems creating a dummy (binary) variable. The process to get the variable is as follows: take any variable (width, for example) and
sort it in decreasing order. The next step is to compute a cumulative sum of the cases up to a limit; the cases before the limit receive the value 1 in the dummy variable, and the other cases receive 0.
Your explanation is rather vague. And should the critical value you give in the screenshot be 2.009 instead of 20.09?
But I think you mean the following.
When using syntax, use:
compute newdummyvariable = (ABr gt 2.009477106).
To check if it's okay:
fre newdummyvariable.
UPDATE:
In order to compute a dummy based on the cumulative sum, the answer is as follows:
If your critical value is predetermined, the fastest way is to sort in descending order and to use the CREATE command with CSUM() to compute an extra variable, which I called ABr_cumul. You then use this variable to compute newdummyvariable, as follows:
sort cases by ABr (d).
create ABr_cumul = csum(ABr).
compute newdummyvariable = (ABr_cumul le 20.094771061766488).
fre newdummyvariable.
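The same three steps (sort descending, running total, compare with the limit) can be sketched in Python to check the logic on a few numbers. The values and the limit below are made up for illustration and are not taken from the question.

```python
from itertools import accumulate

values = [3.0, 9.5, 1.2, 6.0, 4.1]  # made-up data standing in for ABr
limit = 16.0                        # made-up critical value

# Sort in descending order, as SORT CASES BY ABr (D) does.
ordered = sorted(values, reverse=True)

# Running total, the analogue of CREATE with CSUM().
running = list(accumulate(ordered))

# 1 while the running total stays at or below the limit, otherwise 0.
dummy = [int(total <= limit) for total in running]

for value, total, flag in zip(ordered, running, dummy):
    print(value, round(total, 1), flag)
```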
The dummy comes from the cumulative sum over all cases after ranking them in decreasing order: the cases that together represent 50% of the variable's total receive 1, and the others receive 0 ...

SPSS: How to create sequential variables

I'm very new to SPSS and I need to create variables with names that are similar.
Specifically, I have to create these variables:
Visit1_microbe1_test1
Visit1_microbe1_result1
Visit1_microbe1_test2
Visit1_microbe1_result2
...
Visit1_microbe2_test1
Visit1_microbe2_result1
Visit1_microbe2_test2
Visit1_microbe2_result2
...
Visit3_microbe1_test1
Visit3_microbe1_result1
...
Visit3_microbe10_test5
Visit3_microbe10_result5
I can do it manually but it will take a lot of time, please help...
There are various commands in SPSS for dealing with repetitive tasks such as this.
See for example:
DO REPEAT
VECTOR / LOOP
In this instance SPSS's Macro language is perhaps most apt.
So you may do something like this (this isn't an attempt to answer your exact requirement, but it is enough to give you something to work with and adapt to your needs):
DEFINE !CreateNewVars ().
!DO !i = 1 !TO 5
!DO !j = 2 !TO 10
COMPUTE !CONCAT("Q", !i,"_X", !j)=1.
!DOEND
!DOEND
!ENDDEFINE.
!CreateNewVars.
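To see which names a nested loop of this kind would have to produce for the naming scheme in the question, here is a small Python sketch. The counts (3 visits, 10 microbes, 5 tests) are read off the last names listed in the question, and `variable_names` is a hypothetical helper for illustration only.

```python
from itertools import product


def variable_names(visits=3, microbes=10, tests=5):
    """List the test/result variable names in the order shown in the question."""
    names = []
    for v, m, t in product(range(1, visits + 1),
                           range(1, microbes + 1),
                           range(1, tests + 1)):
        names.append(f"Visit{v}_microbe{m}_test{t}")
        names.append(f"Visit{v}_microbe{m}_result{t}")
    return names


names = variable_names()
print(len(names))
print(names[:4])
print(names[-1])
```

The macro above would need a third !DO level and both a test and a result COMPUTE inside it to cover the same set of names.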

Syntax for counting cases

I work with SPSS and have difficulty finding/generating a syntax for counting cases.
I have about 120 cases and five variables. I need to know the count / proportion of cases where just one, more than one, or all of the variables have a value of 1 (dichotomous variables). Then I need to compute a new variable that shows the number / proportion of cases which include all of the aforementioned cases (also dichotomous).
For example case number one: var1=1, var2=1, var3=1, var4=0, var5=0 --> newvariable=1.
Case number two: var1=0, var2=0, var3=0, var4=0, var5=0 --> newvariable=1.
And so on...
Can anybody help me with a syntax?
Help would much appreciated!
Here we can use the sum of the variables to determine your conditions. So using a scratch variable that is the sum, we can see if it is equal to 1, more than 1 or 5 in your example.
compute #sum = SUM(var1 to var5).
compute just_one = (#sum = 1).
compute more_one = (#sum > 1).
compute all_one = (#sum = 5).
Similarly, all_one could be computed using the ANY function to test whether any zeroes exist and negating the result, i.e. compute all_one = NOT(ANY(0,var1 to var5)). These code snippets assume that var1 to var5 are contiguous in the data file; if not, they just need to be replaced with var1,var2,var3,var4,var5 in all given instances.
You could read up on the logical function ANY in the Command Syntax Reference manual: if you negate a test for ANY with "0", that is effectively a test for all "1"s. Using the COUNT command would be another approach.
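The sum-based flags can be checked on a few rows with a short Python sketch. The first two rows are the example cases from the question; the other two are made up, and `flags` is an illustrative helper rather than anything SPSS provides.

```python
cases = [
    [1, 1, 1, 0, 0],  # case one from the question
    [0, 0, 0, 0, 0],  # case two from the question
    [0, 1, 0, 0, 0],  # made-up case
    [1, 1, 1, 1, 1],  # made-up case
]


def flags(row):
    """Compute the three flags from the row sum, as the COMPUTE lines do."""
    total = sum(row)
    return {
        "just_one": int(total == 1),
        "more_one": int(total > 1),
        "all_one": int(total == len(row)),
    }


results = [flags(row) for row in cases]

# Proportion of cases in which all variables equal 1.
prop_all_one = sum(r["all_one"] for r in results) / len(results)

# For 0/1 data without missing values, "all ones" is the same as "no zero present".
no_zero = [int(0 not in row) for row in cases]

print(results)
print(prop_all_one)
```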
