How to select cases based on criteria of several variables in SPSS? - spss

In a dataset, there are 10 variables V1, V2,..., V10.
How can I select cases in which the value of any of those variables is greater or equal, say, 10?
I tried this but it didn't work:
temporary.
select if any(v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, ge 10).
list id.
This and a couple of others didn't work either:
select if ((v1, v2, v3, v4, v5, v6, v7, v8, v9, v10) ge 10).

You could use VECTOR/LOOP approach here and specifying the loop to be exited as soon as the first variable meets the given criteria, in your case variable to be greater than value of 10, so to not unnecessarily continue looping over remaining variables:
*****************************************.
* set up dummy data.
set seed = 10.
input program.
loop #i = 1 to 500.
compute case = #i.
end case.
end loop.
end file.
end input program.
dataset name sim.
execute.
vector v(10, F1.0).
do repeat v = v1 to v10.
compute v = TRUNC(RV.UNIFORM(1,12)).
end repeat.
execute.
*****************************************.
vector v=v1 to v10.
loop i=1 to 10.
if (v(i) > 10) Keep=1.
end loop if v(i) > 10.
select if Keep.

You'll have to loop for it:
do repeat vr=v1 to v10.
if vr ge 10 KeepMe=1.
end repeat.
select if KeepMe=1.

This will also work:
count cnt_ = v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 (10 thru highest).
exe.
select if cnt_>0.
exe.
The cnt_variable is used for counting how many variables have a value of 10 or greater. Then the selection command selects what you need.
Also, don't forget about execute, to apply all pending transformations. Otherwise nothing will happen.

Related

Selecting a cut-off score in SPSS

I have 5 variables for one questionnaire about social support. I want to define the group with low vs. high support. According to the authors low support is defined as a sum score <= 18 AND two items scoring <= 3.
It would be great to get a dummy variable which shows which people are low vs high in support.
How can I do this in the syntax?
Thanks ;)
Assuming your variables are named Var1, Var2 .... Var5, and that they are consecutive in the dataset, this should work:
recode Var1 to Var5 (1 2 3=1)(4 thr hi=0) into L1 to L5.
compute LowSupport = sum(Var1 to Var5) <= 18 and sum(L1 to L5)>=2.
execute.
New variable LowSupport will have value 1 for rows that have the parameters you defined and 0 for other rows.
Note: If your variables are not consecutive you'll have to list all of them instead of using Var1 to var5.

How to randomly get a value from a table [duplicate]

I am working on programming a Markov chain in Lua, and one element of this requires me to uniformly generate random numbers. Here is a simplified example to illustrate my question:
example = function(x)
local r = math.random(1,10)
print(r)
return x[r]
end
exampleArray = {"a","b","c","d","e","f","g","h","i","j"}
print(example(exampleArray))
My issue is that when I re-run this program multiple times (mash F5) the exact same random number is generated resulting in the example function selecting the exact same array element. However, if I include many calls to the example function within the single program by repeating the print line at the end many times I get suitable random results.
This is not my intention as a proper Markov pseudo-random text generator should be able to run the same program with the same inputs multiple times and output different pseudo-random text every time. I have tried resetting the seed using math.randomseed(os.time()) and this makes it so the random number distribution is no longer uniform. My goal is to be able to re-run the above program and receive a randomly selected number every time.
You need to run math.randomseed() once before using math.random(), like this:
math.randomseed(os.time())
From your comment that you saw the first number is still the same. This is caused by the implementation of random generator in some platforms.
The solution is to pop some random numbers before using them for real:
math.randomseed(os.time())
math.random(); math.random(); math.random()
Note that the standard C library random() is usually not so uniformly random, a better solution is to use a better random generator if your platform provides one.
Reference: Lua Math Library
Standard C random numbers generator used in Lua isn't guananteed to be good for simulation. The words "Markov chain" suggest that you may need a better one. Here's a generator widely used for Monte-Carlo calculations:
local A1, A2 = 727595, 798405 -- 5^17=D20*A1+A2
local D20, D40 = 1048576, 1099511627776 -- 2^20, 2^40
local X1, X2 = 0, 1
function rand()
local U = X2*A2
local V = (X1*A2 + X2*A1) % D20
V = (V*D20 + U) % D40
X1 = math.floor(V/D20)
X2 = V - X1*D20
return V/D40
end
It generates a number between 0 and 1, so r = math.floor(rand()*10) + 1 would go into your example.
(That's multiplicative random number generator with period 2^38, multiplier 5^17 and modulo 2^40, original Pascal code by http://osmf.sscc.ru/~smp/)
math.randomseed(os.clock()*100000000000)
for i=1,3 do
math.random(10000, 65000)
end
Always results in new random numbers. Changing the seed value will ensure randomness. Don't follow os.time() because it is the epoch time and changes after one second but os.clock() won't have the same value at any close instance.
There's the Luaossl library solution: (https://github.com/wahern/luaossl)
local rand = require "openssl.rand"
local randominteger
if rand.ready() then -- rand has been properly seeded
-- Returns a cryptographically strong uniform random integer in the interval [0, n−1].
randominteger = rand.uniform(99) + 1 -- randomizes an integer from range 1 to 100
end
http://25thandclement.com/~william/projects/luaossl.pdf

Possible to use less/greater than operators with IF ANY?

Is it possible to use <,> operators with the if any function? Something like this:
select if (any(>10,Q1) AND any(<2,Q2 to Q10))
You definitely need to create an auxiliary variable to do this.
#Jignesh Sutar's solution is one that works fine. However there are often multiple ways in SPSS to accomplish a certain task.
Here is another solution where the COUNT command comes in handy.
It is important to note that the following solution assumes that the values of the variables are integers. If you have float values (1.5 for instance) you'll get a wrong result.
* count occurrences where Q2 to Q10 is less then 2.
COUNT #QLT2 = Q2 TO Q10 (LOWEST THRU 1).
* select if Q1>10 and
* there is at least one occurrence where Q2 to Q10 is less then 2.
SELECT (Q1>10 AND #QLT2>0).
There is also a variant for this sort of solution that deals with float variables correctly. But I think it is less intuitive though.
* count occurrences where Q2 to Q10 is 2 or higher.
COUNT #QGE2 = Q2 TO Q10 (2 THRU HIGHEST).
* select if Q1>10 and
* not every occurences of (the 9 variables) Q2 to Q10 is two or higher.
SELECT IF (Q1>10 AND #QGE2<9).
Note: Variables beginning with # are temporary variables. They are not stored in the data set.
I don't think you can (would be nice if you could - you can do something similar in Excel with COUNTIF & SUMIF IIRC).
You've have to construct a new variable which tests the multiple ANY less than condition, as per below example:
input program.
loop #j = 1 to 1000.
compute ID=#j.
vector Q(10).
loop #i = 1 to 10.
compute Q(#i) = trunc(rv.uniform(-20,20)).
end loop.
end case.
end loop.
end file.
end input program.
execute.
vector Q=Q2 to Q10.
loop #i=1 to 9 if Q(#i)<2.
compute #QLT2=1.
end loop if Q(#i)<2.
select if (Q1>10 and #QLT2=1).
exe.

Writing an If condition within a Loop in SPSS

I want to have a if condition within a loop. That is As long as id < 10,
check if Modc_initial is equal to MODC, if true then set d = 12
This is the code I tried bit not working, can anyone please help.
LOOP if (id LT 10)
IF(Modc_initial EQ MODC))
COMPUTE d = 12.
END LOOP.
EXECUTE.
You can either use a one line conditional of the form IF (condition) d = 12. or a multiple line DO IF. Below I provide an example of DO IF adapted to your syntax.
data list free / id MODC Modc_initial.
begin data
1 3 3
2 3 5
12 1 1
end data.
LOOP if (id LT 10).
DO IF (Modc_initial EQ MODC).
COMPUTE d = 12.
END IF.
END LOOP IF (d = 12).
EXECUTE.
Note you had a period missing in your original syntax on the initial LOOP. I also added an end loop condition, otherwise the code as written would just go until the maximum set number of loops per your system.

SPSS: Looping over Several Variables

I am working in SPSS and have a large number of variables, call them v1 to v7000.
I want to perform a series of "complex operations" on each variable to create a new set of variables: t1 to t7000.
For the sake of illustration, let's just say the "complex operation" is to have t1 be the square of v1, t2 be the square of v2, etc.
My thought is to write some code like this.
do repeat t=t1 to t7000
compute t = v*v;
end repeat.
But, I don't think this will work.
What is the right way to do this? Thanks so much in advance.
Multiple stand-in variables can be specified on a DO REPEAT command.
do repeat t = t1 to t7000
/v = v1 to v7000.
compute t = v**2.
end repeat.

Resources