Splitting a string variable delimited list into individual binary variables in SPSS - spss

I have a string variable created from a checkbox questions (Which of the following assets do you own?)
I am trying to create individual binary variables for each type of asset based on whether that number is present in the string list.
The syntax I am using cannot differentiate between 1 and 11.
do repeat wrd="1," ",2," ",3," ",4," ",5," ",6," ",7,"/NewVar= W3_CG_asset_TV_1 W3_CG_asset_radio_2 W3_CG_asset_payTV_3 W3_CG_asset_tel_4 W3_CG_asset_cellphone_5
W3_CG_asset_fridge_6 W3_CG_asset_freezer_7.
compute NewVar=char.index(W3_CG_HouseExpen1, wrd)>0.
end repeat.
do repeat wrd= ",8," ",9," ",10," ",11," ",12," ",13," ",14," ",15," ",16," ",17," ",18," ",19," /NewVar= W3_CG_asset_electricstove_8 W3_CG_asset_primusstove_9
W3_CG_asset_gasstove_10 W3_CG_asset_electrickettle_11 W3_CG_asset_microwave_12 W3_CG_asset_computer_13 W3_CG_asset_electricity_14 W3_CG_asset_geyser_15
W3_CG_asset_washingmachine_16 W3_CG_asset_workingvehicle_17 W3_CG_asset_bicycle_18 W3_CG_asset_donkeyhorse_19.
compute NewVar=char.index(W3_CG_HouseExpen1, wrd)>0.
end repeat.

I have tested this on SPSS 28.
make sure the column W3_CG_HouseExpen1 is string and length of it long enough to hold the data.
Then I added execute
data list list/W3_CG_HouseExpen1 (a50).
begin data
"1,2,11,12,"
"2,12,"
"1,2,"
"1,11,12,"
end data.
do repeat
wrd="1," ",2," ",3," ",4," ",5," ",6," ",7," ",8," ",9," ",10," ",11," ",12," ",13," ",14," ",15," ",16," ",17," ",18," ",19,"
/NewVar = W3_CG_asset_TV_1 W3_CG_asset_radio_2 W3_CG_asset_payTV_3 W3_CG_asset_tel_4 W3_CG_asset_cellphone_5 W3_CG_asset_fridge_6 W3_CG_asset_freezer_7
W3_CG_asset_electricstove_8 W3_CG_asset_primusstove_9 W3_CG_asset_gasstove_10 W3_CG_asset_electrickettle_11 W3_CG_asset_microwave_12 W3_CG_asset_computer_13 W3_CG_asset_electricity_14 W3_CG_asset_geyser_15
W3_CG_asset_washingmachine_16 W3_CG_asset_workingvehicle_17 W3_CG_asset_bicycle_18 W3_CG_asset_donkeyhorse_19.
compute NewVar=char.index(W3_CG_HouseExpen1, wrd)>0.
end repeat.
EXECUTE.

My suggestion is to run through this in reverse, erasing the values you've already recognized. So if you've got "11" and erased it, when you later search for "1" you won't find it in an "11".
I recreated a tiny exaple dataset to demonstrate on (EDIT-improved example):
data list list/W3_CG_HouseExpen1 (a50) .
begin data
"1,2,11,12,"
"11,12,"
"2,11,"
end data.
Now I do the whole process on a copy of the original W3_CG_HouseExpen1 variable so I can eat it away without damage to the original data:
string #temp(a50).
compute #temp=W3_CG_HouseExpen1.
do repeat wrd="12," "11," "2," "1," /NewVar= W3_12 W3_11 W3_2 W3_1.
compute NewVar=char.index(#temp, wrd)>0.
compute #temp=replace(#temp, wrd, ""). /*deleting the search string from the full string.
end repeat.
exe.

Related

How to iterate/while a mapping variables from environment to message assembly in IBM Integration Bus (toolkit)?

I have a SOAP node, that retrieve information from a URL in a tree structure.
Then i have a compute node to define each environment variable to each namespace variable of the SOAP retrieve.
And finally, i have a mapping node, to move the content to my message assembly structure in XML.
The error its giving me it's this (IN THE COMPUTE NODE):
I have a structure like this:
ListDocs
Description
DocType
ListTypes
Attribute
Lenght
Description
Nature
Required
ListDocs
Description
DocType
ListTypes
Attribute
Lenght
Description
Nature
Required
ListDocs
Description
DocType
ListTypes
Attribute
Lenght
Description
Nature
Required
The problem is that, when i do the definition of the variables, I do it like the code below, in the COMPUTE NODE:
WHILE I < InputRoot.SOAP.Body.ns:obterTiposDocProcessosResponse.ns:return.ns75:processo.ns75:listaTiposDocumentos
DO
SET Environment.Variables.XMLMessage.return.process.listDocs.description = InputRoot.SOAP.Body.ns75:processo.ns75:listDocs.ns75:description;
SET Environment.Variables.XMLMessage.return.process.listDocs.tipoDocumento = InputRoot.SOAP.Body.ns75:processo.ns75:listDocs.ns75:DocType;
SET Environment.Variables.XMLMessage.return.process.listDocs.listTypes.attribute = InputRoot.SOAP.Body.ns75:processo.ns75:listDocs.ns75:listTypes.ns75:atribbute;
SET Environment.Variables.XMLMessage.return.process.listDocs.listTypes.lenght = InputRoot.SOAP.Body.ns75:processo.ns75:listDocs.ns75:listTypes.ns75:lenght;
SET Environment.Variables.XMLMessage.return.process.listDocs.listTypes.description = InputRoot.SOAP.Body.ns75:processo.ns75:listDocs.ns75:listTypes.ns75:description;
SET Environment.Variables.XMLMessage.return.process.listDocs.listTypes.nature = InputRoot.SOAP.Body.ns75:processo.ns75:listDocs.ns75:listTypes.ns75:nature;
SET Environment.Variables.XMLMessage.return.process.listDocs.listTypes.required = InputRoot.SOAP.Body.ns75:processo.ns75:listDocs.ns75:listTypes.ns75:required;
SET I = I+1;
END WHILE;
BUT, in my XML final structure, it only prints the values of my first listDocs, and i want to print all of my listDocs structures.
NOTE: WITH THE WHILE LIKE THIS, IT DOESN'T EVEN WORK. I HAVE TO REMOVE THE WHILE TO PRINT THE FIRST listDocs like i said Above.
Any help?
I NEED HELP TO LOOP THE STRUCTURES, WITH A WHILE OR SOMETHING.
You should try to use the following synthax :
DECLARE I INTEGER 1;
DECLARE J INTEGER;
J = CARDINALITY(InputRoot.SOAP.Body.ns:obterTiposDocProcessosResponse.ns:return.ns75:processo.ns75:listaTiposDocumentos[])
WHILE I <= J DO
SET Environment.Variables.XMLMessage.return.process.listDocs.description = InputRoot.SOAP.Body.ns75:processo.ns75:listDocs[I].ns75:description;
....
END WHILE;
You only missed the CARDINALITY function to get the number of elements, and also the [] to define the table, and then using this [I] while accessing the elements
Note : in my sample above, the environment will be overridden at each iteration of the loop, so only the last record will be printed. You can use the [I] in the output as well if you want to construct a table in output, or you can use the following code to push each message to the output terminal (this means you have one message in input, and 3 message coming out of the output terminal)
PROPAGATE TO TERMINAL 'Out';
So for example, based on your code, if you want to generate 3 messages based on your input containing multiple element :
DECLARE I INTEGER 1;
DECLARE J INTEGER;
J = CARDINALITY(InputRoot.SOAP.Body.ns:obterTiposDocProcessosResponse.ns:return.ns75:processo.ns75:listaTiposDocumentos[])
WHILE I <= J DO
SET Environment.Variables.XMLMessage.return.process.listDocs.description = InputRoot.SOAP.Body.ns75:processo.ns75:listDocs[I].ns75:description;
SET Environment.Variables.XMLMessage.return.process.listDocs.tipoDocumento = InputRoot.SOAP.Body.ns75:processo.ns75:listDocs[I].ns75:DocType;
SET Environment.Variables.XMLMessage.return.process.listDocs.listTypes.attribute = InputRoot.SOAP.Body.ns75:processo.ns75:listDocs[I].ns75:listTypes.ns75:atribbute;
SET Environment.Variables.XMLMessage.return.process.listDocs.listTypes.lenght = InputRoot.SOAP.Body.ns75:processo.ns75:listDocs[I].ns75:listTypes.ns75:lenght;
SET Environment.Variables.XMLMessage.return.process.listDocs.listTypes.description = InputRoot.SOAP.Body.ns75:processo.ns75:listDocs[I].ns75:listTypes.ns75:description;
SET Environment.Variables.XMLMessage.return.process.listDocs.listTypes.nature = InputRoot.SOAP.Body.ns75:processo.ns75:listDocs[I].ns75:listTypes.ns75:nature;
SET Environment.Variables.XMLMessage.return.process.listDocs.listTypes.required = InputRoot.SOAP.Body.ns75:processo.ns75:listDocs[I].ns75:listTypes.ns75:required;
PROPAGATE TO TERMINAL 'Out';
END WHILE;
RETURN FALSE;
For your global information, the RETURN TRUE is the instruction "pushing" the message built in the ESQL code to the output terminal. If you use PROPAGATE instruction (same effect), you should RETURN FALSE to avoid sending an empty message after looping on your records. Another way to do it is to propagate on another terminal (i.e : 'out1'), and keep the return true. In this case, you would have all you records coming out from the out1 terminal, and a message going out of the output temrinal (due to the return true) once all the messages have been propagated (this might be useful in many situations)
So the key to understanding IIB and ESQL is that you are looking at in memory Trees built from nodes.
Each Node has pointers/REFERENCEs to PARENT, NEXTSIBLING, PREVSIBLING, FIRSTCHILD and LASTCHILD Nodes.
Nodes also have FIELDNAME, FIELDNAMESPACE, FIELDTYPE and FIELDVALUE attributes.
And last but not least that you are building Output Trees by navigating Input Trees. The Environment Tree, which you are using, is a special long lasting Tree that you can both read from and write to.
So in your code InputRoot.SOAP.Body.ns75:processo.ns75:listDocs can be thought of as shorthand for instructions to navigate to the ns75:listDocs Node. The dots '.' tell ESQL interpreter the name of the child Node of the current Node. If you were telling someone how to navigate the Nodes it would go something like this.
Start at InputRoot. InputRoot is a special Node that is automatically available to you in your ESQL Modules code.
Navigate to the first child Node of InputRoot that has the name SOAP
Navigate to the first child Node of SOAP that has the name Body
Navigate to the first child Node of Body that has the name listDocs and is in the ns75 namespace.
In the absence of a subscript ESQL assumes you want the first Node that matches the specified name ns75:listDocs and ns75:listDocs[1] both refer to the same Node.
This explains what was happening in your code. You were always navigating to the same listDocs[1] node in the InputRoot and Environment Trees.
#Jerem's code improves on what you were doing by at least navigating across the listDocs nodes in the Input tree.
For each iteration of the loop the subscript [I] gets incremented and thus it chooses a different listDocs Node. The listDocs Nodes are siblings and thus the code will access the first, second and third instance of the listDocs Nodes.
InputRoot.SOAP.Body.ns75:processo.ns75:listDocs[1] <-- Iteration I=1
InputRoot.SOAP.Body.ns75:processo.ns75:listDocs[2] <-- Iteration I=2
InputRoot.SOAP.Body.ns75:processo.ns75:listDocs[3] <-- Iteration I=3
To correct #Jerem's answer you'd need to use subscripts on the lefthand side of the statement as well. Picking the description field as an example you'd need to change your code as follows.
SET Environment.Variables.XMLMessage.return.process.listDocs[I].listTypes.description = InputRoot.SOAP.Body.ns75:processo.ns75:listDocs[I].ns75:listTypes.ns75:description;
Using subscripts is regarded as a performance no no. Imagine you had 10,000 listDocs this would result in each and every iteration of the loop walking down the tree over the InputRoot, SOAP, Body, ns75:processo Nodes and then across the listDocs sibling nodes until it found the ns75:listDocs[I] Node.
This means by the time we get round to processing ns75:listDocs[10000] it will have had to repetetively walked over all the other listDocs Nodes time and time again, In fact we can calculate it would have walked over (4 x 10,000) + ((10,000 x (10,000 + 1)) / 2) = 50,045,000 Nodes
So it's REFERENCE's to the rescue and also the answer to your question. Try a loop like this.
DECLARE ns75 NAMESPACE 'http://something.or.other.from.your.wsdl';
DECLARE InListDocsRef REFERENCE TO
InputRoot.SOAP.Body.ns75:processo.ns75:listDocs;
WHILE LASTMOVE(InListDocsRef) DO
DECLARE EnvListDocsRef REFERENCE TO Environment;
CREATE LASTCHILD OF Environment.Variables.XMLMessage.return.process AS EnvListDocsRef NAME 'listDocs';
SET EnvListDocsRef.description = InListDocsRef.ns75:description;
SET EnvListDocsRef.tipoDocumento = InListDocsRef.ns75:DocType;
SET EnvListDocsRef.listTypes.attribute = InListDocsRef.ns75:listTypes.ns75:atribbute;
SET EnvListDocsRef.listTypes.lenght = InListDocsRef.ns75:listTypes.ns75:lenght;
SET EnvListDocsRef.listTypes.description = InListDocsRef.ns75:listTypes.ns75:description;
SET EnvListDocsRef.listTypes.nature = InListDocsRef.ns75:listTypes.ns75:nature;
SET EnvListDocsRef.listTypes.required = InListDocsRef.ns75:listTypes.ns75:required;
MOVE InListDocsRef NEXTSIBLING REPEAT NAME;
END WHILE;
The code above only walks over 4 + 10,000 Nodes i.e. 10 thousand Nodes vs 50 million Nodes.
A couple of other useful things to know about setting references are:
To point to the last element you can use a subscript of [<]. So to point to the last ListItem in the aggregate MyList you would code Environment.MyList.ListItem[<]
You can use an asterisk * to set a reference to an element in the tree that you don't know the name of e.g. Environment.MyAggregate.* points to the first child of MyAggregate regardless of it's name.
You can also use asterisks * to choose an element irregardless of it's namespace InListDocsRef.*:listTypes.*:description
For anonymous namespaced elements use *:* but be very careful * and *:* are not the same thing the first means no namespace any element and the second means any namespace any element.
To process lists in reverse combine the [<] subscript with the PREVIOUSSIBLING option of MOVE.
So a chunk of code for reversing a list might go something like:
DECLARE MyReverseListItemWalkingRef REFERENCE TO Environment.MyList.ListItem[<];
WHILE LASTMOVE(MyReverseListItemWalkingRef) DO
CREATE LASTCHILD OF OuputRoot.ReversedList.Item NAME 'Description' VALUE MyReverseListItemWalkingRef.Desc;
MOVE MyReverseListItemWalkingRef PREVIOUSSIBLING REPEAT NAME;
END WHILE;
Learn how to use REFERENCES they are extremely powerful and one of your simplest options when it comes to performance.

Modify values programmatically SPSS

I have a file with more than 250 variables and more than 100 cases. Some of these variables have an error in decimal dot (20445.12 should be 2.044512).
I want to modify programatically these data, I found a possible way in a Visual Basic editor provided by SPSS (I show you a screen shot below), but I have an absolute lack of knowledge.
How can I select a range of cells in this language?
How can I store the cell once modified its data?
--- EDITED NEW DATA ----
Thank you for your fast reply.
The problem now its the number of digits that number has. For example, error data could have the following format:
Case A) 43998 (five digits) ---> 4.3998 as correct value.
Case B) 4399 (four digits) ---> 4.3990 as correct value, but parsed as 0.4399 because 0 has been removed when file was created.
Is there any way, like:
IF (NUM < 10000) THEN NUM = NUM / 1000 ELSE NUM = NUM / 10000
Or something like IF (Number_of_digits(NUM)) THEN ...
Thank you.
there's no need for VB script, go this way:
open a syntax window, paste the following code:
do repeat vr=var1 var2 var3 var4.
compute vr=vr/10000.
end repeat.
save outfile="filepath\My corrected data.sav".
exe.
Replace var1 var2 var3 var4 with the names of the actual variables you need to change. For variables that are contiguous in the file you may use var1 to var4.
Replace vr=vr/10000 with whatever mathematical calculation you would like to use to correct the data.
Replace "filepath\My corrected data.sav" with your path and file name.
WARNING: this syntax will change the data in your file. You should make sure to create a backup of your original in addition to saving the corrected data to a new file.

Lua random image

I have a Lua script I am using in a tabletop game and basically you have a "token" that represents a creature. When it dies, it overlays an image (which I have indicated in an .xml script) with an image of like a blood splat, or tombstone etc.
How do I make it so it would randomize which image gets overlayed?
The Script is here.
The lines below (178-184) are the main section that tells it "put image X over the token". I want it to randomize between say, 5 different images..
if not widgetDeathIndicator then
widgetDeathIndicator = tokenCT.addBitmapWidget("token_dead");
widgetDeathIndicator.setBitmap("token_dead");
widgetDeathIndicator.setName("deathindicator");
widgetDeathIndicator.setTooltipText(sName .. " has fallen, as if dead.");
widgetDeathIndicator.setSize(nWidth-20, nHeight-20);
end
token_dead is the name of the current image being used, which in the .xml directs to a .png
Yes, you can use math.random for this.
local images = {
'token_dead',
'another_image_name',
'yet_another_image_name',
}
local image = images[math.random(#images)]
math.random(n) will return a pseudo-random integer between 1 and n, so if you pass in #images (the length of the images table) you will get a valid pseudo-random table index for images.
To get better randomness you should set math.randomseed before you call math.random. (If you don't set it, then math.random will return the same sequence of "random" numbers each time.)

Logic behind COBOL code

I am not able to understand what is the logic behind these lines:
COMPUTE temp = RESULT - 1.843E19.
IF temp IS LESS THAN 1.0E16 THEN
Data definition:
000330 01 VAR1 COMP-1 VALUE 3.4E38. // 3.4 x 10 ^ 38
Here are those lines in context (the sub-program returns a square root):
MOVE VAR1 TO PARM1.
CALL "SQUAREROOT_ROUTINE" USING
BY REFERENCE PARM1,
BY REFERENCE RESULT.
COMPUTE temp = RESULT - 1.843E19.
IF temp IS LESS THAN 1.0E16 THEN
DISPLAY "OK"
ELSE
DISPLAY "False"
END-IF.
These lines are just trying to test if the result returned by the SQUAREROOT_ROUTINE is correct. Since the program is using float-values and rather large numbers this might look a bit complicated. Let's just do the math:
You start with 3.4E38, the squareroot is 1.84390889...E19.
By subtracting 1.843E19 (i.e. the approximate result) and comparing the difference against 1.0E16 the program is testing whether the result is between 1.843E19 and 1.843E19+1.0E16 = 1.844E19.
Not that this test would not catch an error if the result from SQUAREROOT_ROUTINE was too low instead of too high. To catch both types of wrong results you should compare the absolute value of the difference against the tolerance.
You might ask "Why make things so complicated"? The thing is that float-values usually are not exact and depending on the used precision you will get sightly different results due to rounding-errors.
well the logic itself is very straight forward, you are subtracting 1.843*(10^19) from the result you get from the SQUAREROOT_ROUTINE and putting that value in the variable called temp and then If the value of temp is less than 1.0*(10^16) you are going to print a line out to the SYSOUT that says "OK", otherwise you are going to print out "False" (if the value was equal to or greater than).
If you mean the logic as to why this code exists, you will need to talk to the author of the code, but it looks like a debugging display that was left in the program.

How to create a dummy variable

I'm working in a project that uses the IBM SPSS but I had some problems to set a dummy variable(binary variable).The process to get the variable is following : Consider an any variable(width for example), to get the dummy variable, we need
to sort this variable in the decreasing way; The next step is make a somatory of the cases until a limit, the cases before the limit receive the value 1 in the dummy variable the other values receive 0.
Your explanation is rather vague. And the critical value you give in the printscreen should be 2.009 in stead of 20.09?
But I think you mean the following.
When using syntax, use:
compute newdummyvariable eq (ABr gt 2.009477106).
To check if it's okay:
fre newdummyvariable.
UPDATE:
In order to compute a dummy based on the cumulative sum, the answer is as follows:
If your critical value is predetermined, the fastest way is to sort in decending order, and to use the command create with csum() to compute an extra variable which I called ABr_cumul. This one, you use to compute the newdummyvariable. As follows:
sort cases by ABr (d).
create ABr_cumul = csum(VAR00001).
compute newdummyvariable = (ABr_cumul le 20.094771061766488).
fre newdummyvariable.
the dummy comes from the sum of all cases, after decreasing order raqueados when cases of a variable representing 50% of the variable t0tal, these cases receive 1 and the other 0 ...

Resources