Syntax to add a new case to the data - spss

I have a variable in SPSS with a name (My_Variable), a label (My Variable), and values (1: Yes, 2: No), but without data (the column in Data View is empty). I want to add data using syntax. For example, I want to add a participant in the 1st row who answered "Yes", so I want 1 to be added. How can I do it?
I found similar questions, but the solutions refer to creating a new SPSS window and adding the values there. I don't want this; I want to add data to an existing variable without creating a new SPSS file.

Apparently there is no way to directly add cases to an SPSS dataset through syntax.
But the following seems to me pretty close: you don't create new files, but you create a new dataset and add it to your original.
Let's first create a small dataset to demonstrate on:
Data list list/ID (a5) var1 var2 var3 (3f2).
begin data
"first" 1 17 7
"secnd" 5 5 12
"third" 34 11 91
end data.
dataset name originalDataset.
So this is your original data. Now imagine that you want to add a new case to the data, with the ID value of "hello" and the number 42 in all the columns. This is what you do:
* creating the new case in a separate dataset.
Data list list/ID (a5) var1 var2 var3 (3f2).
begin data
"hello" 42 42 42
end data.
dataset name addition.
* going back to original dataset and adding the new case.
dataset activate originalDataset.
add files /file=* /file=addition.
exe.
dataset close addition.
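To make the idea concrete outside SPSS, here is a minimal Python sketch of what stacking two datasets amounts to. It is only a model: the lists of dicts are hypothetical stand-ins for the two datasets, and the helper name add_files is made up for this illustration, not an SPSS API.

```python
# Hypothetical stand-ins for the two SPSS datasets: one dict per case.
original_dataset = [
    {"ID": "first", "var1": 1, "var2": 17, "var3": 7},
    {"ID": "secnd", "var1": 5, "var2": 5, "var3": 12},
    {"ID": "third", "var1": 34, "var2": 11, "var3": 91},
]
addition = [{"ID": "hello", "var1": 42, "var2": 42, "var3": 42}]


def add_files(*datasets):
    """Stack the cases of several datasets, matching variables by name.

    A variable that a case does not have is filled with None
    (assumption: this plays the role of a missing value).
    """
    variables = []
    for dataset in datasets:
        for case in dataset:
            for name in case:
                if name not in variables:
                    variables.append(name)
    return [
        {name: case.get(name) for name in variables}
        for dataset in datasets
        for case in dataset
    ]


combined = add_files(original_dataset, addition)
for case in combined:
    print(case)
```

The new case ends up as the last row, after the three original cases.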

You don't have to create data in the first data set. Just create the variables and define them however you want.
DATASET CLOSE ALL.
INPUT PROGRAM.
NUMERIC My_Variable (F1).
VARIABLE LABELS My_Variable "I want this!".
VALUE LABELS My_Variable 1 "Yes" 2 "No".
END FILE.
END INPUT PROGRAM.
DATASET NAME Empty.
DATA LIST FREE /My_Variable.
BEGIN DATA.
1 2
END DATA.
APPLY DICTIONARY /FROM Empty
/SOURCE VARIABLES=My_Variable
/TARGET VARIABLES=My_Variable
/VARINFO VALLABELS=REPLACE VARLABEL.
DATASET CLOSE Empty.
FREQUENCIES VARIABLES ALL.
I used DATASET, but you could have saved the empty file to disk.
See the APPLY DICTIONARY command for more details about how it works.

Using Python, you can add data with the cases.append() method:
begin program.
import spss
spss.StartDataStep()
dataset = spss.Dataset()
dataset.cases.append([1])
spss.EndDataStep()
end program.
Say you have 3 variables; you can assign a value to each by passing a list with one entry per variable to the method:
begin program.
import spss
spss.StartDataStep()
dataset = spss.Dataset()
dataset.cases.append([1,2,3])
spss.EndDataStep()
end program.
This would add a case with value 1 in the first variable, value 2 in the second variable, and 3 in the third variable.
Note: the method will only work within an open data step.

Check out the ADD FILES command. You can also add cases with Python code.

Related

We were given a task on Lua tables, but it is not working as expected

Our task is to create a table and read values into the table using a loop, then print the values after the process is complete.
- Create a table.
- Read the number of values to be read to the table.
- Read the values to the table using a loop.
- Print the values in the table using another loop.
For this we wrote the following code:
local table = {}
for value in ipairs(table) do
io.read()
end
for value in ipairs(table) do
print(value)
end
We are not sure where we went wrong; please help us. The result we get is:
Input (stdin)
3
11
22
abc
Your Output (stdout)
~ no output ~
Expected Output
11
22
abc
The correct code is:
local table1 = {}
local x = io.read()
for line in io.lines() do
table.insert(table1, line)
end
for K, value in ipairs(table1) do
print(value)
end
Let's walk through this step-by-step.
Create a table.
Though the syntax is correct, table is a pre-defined global name in Lua, and thus should not be used as a variable name, to avoid future issues. Instead, you'll want to use a different name. If you're insistent on using the word table, you'll have to distinguish it from the global table. The easiest way to do this is to change it to Table, as Lua is a case-sensitive language. Therefore, your table creation should look something like:
local Table = {}
Read values to the table using a loop.
Though Table is now established as a table, your for loop is only iterating through an empty table. It seems your goal is to iterate through the io.read() instead. But io.read() is probably not what you want here, though you can utilize a repeat loop if you wish to use io.read() via table.insert. However, repeat requires a condition that must be met for it to terminate, such as the length of the table reaching a certain amount (in your example, it would be until (#Table == 4)). Since this is a task you are given, I will not provide an example, but allow you to research this method and use it to your advantage.
Print the values after the process is complete.
You are on the right track with your printing loop. However, it must be noted that iterating through a table with ipairs returns two results on each step, an index and a value. In your code, you would only capture the index number, so your output would simply be:
1
2
3
4
If you are wanting the actual values, you'll need a placeholder for the index. Oftentimes, the placeholder for an unneeded variable in Lua is the underscore (_). Modify your for loop to account for the index, and you should be set.
Try modifying your code with the suggestions I've given and see if you can figure out how to achieve your end result.
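For comparison only, here is the same sequence of steps sketched in Python rather than Lua, so it is not a drop-in answer to the task. The function name read_values and the use of io.StringIO to stand in for stdin are assumptions made for this illustration. Python's enumerate plays roughly the role of ipairs here: it yields an index and a value, and the underscore marks the index as unused.

```python
import io


def read_values(stream):
    """Read a count from the first line, then that many values, one per line."""
    count = int(stream.readline())
    values = []
    for _ in range(count):
        values.append(stream.readline().rstrip("\n"))
    return values


# io.StringIO stands in for stdin with the sample input from the question.
sample_input = io.StringIO("3\n11\n22\nabc\n")
values = read_values(sample_input)

# enumerate yields (index, value) pairs; "_" is a placeholder for the unused index.
for _, value in enumerate(values, start=1):
    print(value)
```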
Edited:
Thanks, Piglet, for corrections on the insight! I'd forgotten table itself wasn't a function, and wasn't reserved, but still bad form to use it as a variable name whether local or global. At least, it's how I was taught, but your comment is correct!

How to read out a list of cases in one variable in SPSS and use that to add data?

To explain my problem I use this example data set:
SampleID  Date         Project  Problem
03D00173  03-Dec-2010           1,00
03D00173  03-Dec-2010           1,00
03D00173  28-Sep-2009  YNTRAD
03D00173  28-Sep-2009  YNTRAD
Now, the problem is that I need to replace the text "YNTRAD" with "YNTRAD_PILOT" but only for the cases with Date = 28-Sep-2009.
This example is part of a much larger database, with many more cases having Project=YNTRAD and Date=28-Sep-2009, so I cannot simply select all cases with 28-Sep-2009, then check which of these cases have Project=YNTRAD, and then replace. Instead, what I need to do is:
- Look at each case that has a 1,00 in Problem (these are problem cases).
- Then find the SampleID that corresponds with that sample.
- Then find all other cases with the same SampleID BUT WITH Date=28-Sep-2009 (this is needed because only those samples are part of a pilot study) and then replace YNTRAD in Project with YNTRAD_PILOT.
I read a lot about:
- LOOP
- DO REPEAT
- DO IF
but I don't know how to use these in solving this problem.
I first tried making a list containing only the sample IDs that eventually need to be changed (again, this is part of a much larger database).
STRING SampleID2 (A20).
IF (Problem=1) SampleID2=SampleID.
EXECUTE.
AGGREGATE
/OUTFILE=*
/BREAK=SampleID2
/n_SampleID2=N.
This gives a dataset with only the SampleIDs for which a change should be made. However, I don't know how to read out this dataset case by case, look up each SampleID in the overall file with all the data, and then change only those cases where Date = 28-Sep-2009.
It sounds like once we can identify the IDs that need to be changed we've done the tricky part here. We can use AGGREGATE with MODE=ADDVARIABLES to add a problem Id counter variable to our dataset. From there, it's as you'd expect.
* Add var IdProblemCnt to your database . Stores # of times a given Id had a record with Problem = 1.
AGGREGATE
/OUTFILE=* MODE=ADDVARIABLES
/BREAK=SampleId
/IdProblemCnt=CIN(Problem, 1, 1) .
EXE .
* once we've identified the "problem" Ids we can use `RECODE` Project var.
DO IF (IdProblemCnt>0 AND Date = DATE.MDY(9,28,2009)) .
RECODE Project ('YNTRAD' = 'YNTRAD_PILOT') .
END IF .
EXE .
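The same two-step logic (count the problem records per ID, then recode conditionally) can be sketched in plain Python. This is only an illustration of the reasoning, not SPSS code: the list of dicts is a hypothetical stand-in for the dataset, and the second SampleID, 99X00001, is invented here to show a case that must stay untouched.

```python
from collections import Counter
from datetime import date

# Hypothetical stand-in for the dataset; Problem is None where the cell is empty.
cases = [
    {"SampleID": "03D00173", "Date": date(2010, 12, 3), "Project": "", "Problem": 1},
    {"SampleID": "03D00173", "Date": date(2010, 12, 3), "Project": "", "Problem": 1},
    {"SampleID": "03D00173", "Date": date(2009, 9, 28), "Project": "YNTRAD", "Problem": None},
    {"SampleID": "03D00173", "Date": date(2009, 9, 28), "Project": "YNTRAD", "Problem": None},
    # Invented ID without any problem record: must keep Project = YNTRAD.
    {"SampleID": "99X00001", "Date": date(2009, 9, 28), "Project": "YNTRAD", "Problem": None},
]

# Step 1: per SampleID, count the records with Problem = 1.
problem_counts = Counter(c["SampleID"] for c in cases if c["Problem"] == 1)

# Step 2: recode only cases whose ID has a problem record and whose date is the pilot date.
pilot_date = date(2009, 9, 28)
for c in cases:
    if (problem_counts[c["SampleID"]] > 0
            and c["Date"] == pilot_date
            and c["Project"] == "YNTRAD"):
        c["Project"] = "YNTRAD_PILOT"

for c in cases:
    print(c["SampleID"], c["Date"].isoformat(), c["Project"])
```

Storing the count on every row of an ID is what lets the condition be evaluated case by case, with no explicit lookup loop.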

Sort all the cases of a specific variable in descending order while the others remain the same, using SPSS syntax

I have two variables (id and Var1) in SPSS as below. I want to sort Var1 in descending order, but the other variables should not be reordered along with Var1; i.e., the other variables should remain as they were before the sort.
My data is...
id Var1
-- ----
M-1 3
M-2 4
M-3 2
M-4 7
But I want like this..
id Var1
-- ----
M-1 7
M-2 4
M-3 3
M-4 2
My Syntax/code is...
data list list
/id(A3) Var1(F2.0).
begin data.
M-1 3
M-2 4
M-3 2
M-4 7
end data.
sort cases by Var1(D).
execute.
When I run this code, it also sorts id according to Var1. But I do not want the sort command to apply to all variables. I only want to sort the currently selected variable in SPSS.
Can anyone help using SPSS syntax?
You could split the dataset, sort the Var1 variable, and then merge the two parts back together. One way to do so would be this:
* create data.
data list list
/id(A3) Var1(F2.0).
begin data.
M-1 3
M-2 4
M-3 2
M-4 7
end data.
DATASET NAME ids.
DATASET COPY sortvar.
* Delete sort variable (Var1) from dataset "ids".
DELETE VARIABLES Var1.
* Keep only sort variable in dataset "sortvar".
DATASET ACTIVATE sortvar.
DELETE VARIABLES id.
* sort Var1.
SORT CASES BY Var1(D).
* Merge datasets.
MATCH FILES
/FILE=ids
/FILE=sortvar.
EXECUTE.
If you have lots of variables to delete in the sortvar dataset, you could also use the MATCH FILES command with the KEEP subcommand:
* Delete all variables but Var1.
DATASET ACTIVATE sortvar.
MATCH FILES
/FILE=*
/KEEP=Var1.
Alternatively, you can use the SAVE command in combination with the KEEP or DROP options in order to split the dataset.
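The split, sort, and merge idea can be illustrated with a short Python sketch. The two lists are hypothetical stand-ins for the ids and sortvar datasets; pairing them by position mirrors a merge without key variables.

```python
# Hypothetical stand-ins for the two single-variable datasets.
ids = ["M-1", "M-2", "M-3", "M-4"]
var1 = [3, 4, 2, 7]

# Sort only the Var1 column in descending order; ids keeps its original order.
sorted_var1 = sorted(var1, reverse=True)

# Merge the two columns back together row by row (by position, no key).
rows = list(zip(ids, sorted_var1))
for row_id, value in rows:
    print(row_id, value)
```

Note that after this operation a value no longer belongs to the id it was originally recorded with, which is exactly what the question asks for.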

Looping in SPSS to work through the cases

I have a data set in SPSS containing a sequence of six variables from which I have to create a new variable which should contain the last value present in the sequence. Let's say the data look like this: (the second row contains all missing values but represents a case to which I'll merge some other variables later, so I need this too.)
DATA LIST /V1 TO V6 1-6.
BEGIN DATA
423451

73453
929
0257
END DATA.
Now I wish to generate a variable named lastscr, which should have the values 1, ., 3, 9, 7. Can anyone help me with how I should do it in SPSS? I could not find any clue about it. Thank you in advance for any help.
This can easily be done with the DO REPEAT command:
DO REPEAT Var = V1 TO V6.
IF NOT(SYSMIS(Var)) lastscr = Var.
END REPEAT.
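The loop works because each non-missing variable overwrites lastscr in turn, so the last non-missing one wins. A small Python sketch of the same logic follows; the helper names parse_fixed and last_valid are made up for this illustration, and the empty second line is assumed from the question's description of an all-missing second row.

```python
def parse_fixed(line, width=6):
    """Split a fixed-width line into one-digit fields; blanks become None (missing)."""
    padded = line.ljust(width)
    return [int(ch) if ch.isdigit() else None for ch in padded[:width]]


def last_valid(values):
    """Return the last non-missing value, or None if every value is missing."""
    last = None
    for value in values:
        if value is not None:
            last = value  # each later non-missing value overwrites the earlier one
    return last


# Second line is empty: the all-missing case described in the question (assumed).
lines = ["423451", "", "73453", "929", "0257"]
lastscr = [last_valid(parse_fixed(line)) for line in lines]
print(lastscr)  # [1, None, 3, 9, 7]
```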

How does foreach work in Pig?

I have sample data that looks like:
1950,0,1
1950,22,1
1950,-11,1
1949,111,1
1949,78,1
and I used the following commands:
A = load 'path/to/the/sample';
B = foreach A generate $0,$1;
which should only generate first 2 columns of the A.
then I used
describe B
to check how it works. It returns: B: {a: bytearray,b: bytearray}, which is correct.
However, when I run the command
dump B
why does it return:
(1950,0,1,)
(1950,22,1,)
(1950,-11,1,)
(1949,111,1,)
(1949,78,1,)
as the result? It's so weird. I've tried it several times, but still get the same result.
This happens because Pig by default tries to separate your data by tabs. So when you pass it a line like
1950,0,1
it thinks it has found just a single field, 1950,0,1. Since you indicated that each line has two fields, the second field is just set to NULL.
So when you GENERATE the two fields you loaded, it prints out the tuple
(1950,0,1,)
If you were to STORE this instead of DUMPing it you would see it more clearly. Pig would store the data separated by tabs (again, the default), and your output file would look like
1950,0,1
1950,22,1
1950,-11,1
1949,111,1
1949,78,1
That's not very enlightening, so look instead at what happens if you were to do this:
B = foreach A generate $0, 'test';
store B into 'output';
Now the data in output would be
1950,0,1 test
1950,22,1 test
1950,-11,1 test
1949,111,1 test
1949,78,1 test
You can control what Pig uses as the field separator for both LOAD and STORE by using the clause USING PigStorage(','). The argument to PigStorage can be whatever character you like. One other common one is USING PigStorage('\n'), which will load in each line as a whole.
Use the PigStorage clause in your LOAD statement.
A = load 'path/to/the/sample' using PigStorage(',');
B = foreach A generate $0,$1;
dump B;
Now you will get the result you expect:
(1950,0)
(1950,22)
(1950,-11)
(1949,111)
(1949,78)
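The effect of the delimiter can be reproduced with a few lines of Python. This is only a model of the behaviour described above, not Pig code: the functions load and generate_first_two are hypothetical helpers, and None stands in for Pig's null.

```python
lines = ["1950,0,1", "1950,22,1", "1950,-11,1"]


def load(lines, delimiter="\t"):
    """Split each line on the delimiter; the default is a tab."""
    return [line.split(delimiter) for line in lines]


def generate_first_two(records):
    """Project fields $0 and $1; a field that does not exist becomes None."""
    return [
        tuple(fields[i] if i < len(fields) else None for i in range(2))
        for fields in records
    ]


# Tab delimiter: the whole line is a single field, so $1 is missing.
tab_result = generate_first_two(load(lines))
# Comma delimiter: the line splits into three fields, so $0 and $1 are what you expect.
comma_result = generate_first_two(load(lines, ","))

print(tab_result)
print(comma_result)
```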
