I have a data set with two parent/carer respondents (main and partner) for each participant (child). For one of the variables, only one respondent has given an answer - usually the main respondent, but in some cases it was the partner respondent. I therefore need to fill in some missing main respondent data with data from the partner respondent.
My data looks roughly like this:
MAIN PARTNER I would like the final var as below:
2 -1 2
1 -1 1
-1 2 2
1 . 1
-9 2 2
-8 1 1
2 . 2
1 . 1
etc.
(-1, -8 and -9 are missing values)
All variables are numeric. Where a response is missing from the main respondent, I would like to fill it in from the partner. I cannot seem to get the DO IF/RECODE commands to work.
Any advice on how to do this in SPSS would be hugely appreciated!
More than one way to skin a cat. Depending on your taste, you might create your final variable responder like so:
MISSING VALUES main (-1,-8,-9) .
IF (MISSING(main)) responder=partner .
IF (NOT(MISSING(main))) responder=main .
EXE .
First assign your missing values. Then assign a value to responder based on whether main is missing. Note that MISSING(main) will evaluate true when main has a specified missing value (in this case: -1, -8, or -9) or a system missing value.
Related
I've got the following list in SPSS:
Subjekt Reactiontime correct/incorrect
1 x 1
1 x 0
1 x 1
1 x 0
I now want to select all rows/cases that follow AFTER "0" (in the column correct/incorrect) because I want to compute the mean of all reactiontimes that come after "0".
How can I do that in SPSS?
One way to do this would be to add a column that keeps track of whether the prior row was equal to 0 in your correct field and then calculate the mean Reactiontime of those cases.
First let's make a variable to flag cases we want included in the average.
* set prev_correct to 0 if the prior case was 0 .
IF (LAG(correct)=0) prev_correct=0 .
* else set to -1 .
RECODE prev_correct (SYSMIS=-1) .
EXE .
Now we can calculate the mean reaction time, splitting by our new variable.
MEANS Reactiontime BY prev_correct /CELLS MEAN .
Or, if we only want to output the mean when prev_correct=0 .
TEMP .
SELECT IF prev_correct=0 .
MEANS Reactiontime /CELLS MEAN .
Here's a shorter approach (though less generic than #user45392's full process):
if lag(correct)=0 ReactiontimeAfter0=Reactiontime.
now you can just run means ReactiontimeAfter0.
I have a problem that I think it's easiest to solve with awk but I wrapped my head around it.
Inside a file I have repeating output like this:
....
Name="BgpIpv4RouteConfig_XXX">
<Ipv4NetworkBlock id="13726"
StartIpList="x.y.z.t"
PrefixLength="30"
NetworkCount="10000"
... other output
then this block will repeat.
a)I want to match on BGPIpv4Route.*, then skip 2 lines (the "n" keyword of awk), then when reaching Prefix Length:
- either replace it with random (25,30)
or
- better but I guess harder (no idea came to mind for keeping track of what was used and looping among /25../30) -> first occurrence /25, second one /26...till /30, then rollback to /25
b) then next line with NetworkCount depending on the new value of PrefixCount calculate it as 65536 / 2^(32-Prefix Count)
eg: if PrefixCount on this occurrence was replaced with /25, then NetworkCount on the line following it = 65536 / 2 ^ 7 = 65536 / 128 = 512
I found some examples with inserting/changing a line after one that matched (or with a counter variable X lines below the match) but I got a bit confused with the value generation part and also with the changing of two lines where one is depending on the other.
Not sure I made any sense...my head is a bit overwhelmed with what I'm finding everywhere right now.
Thanks in advance!
this should do
$ awk 'BEGIN {q="\""; FS=OFS="="; n=split("25=26=27=28=29=30",ps)}
/BgpIpv4Route/ {c=c%n+1}
/PrefixLength/ {$2=q ps[c] q}
/NetworkCount/ {$2=q 65536/2^(32-ps[c]) q}1' file
perhaps minimize computation by changing to 2^(ps[c]-16)
If there are free standing PrefixLength and NetworkCount attributes perhaps you need to qualify them for each BgpIpv4Route context.
So i have an instance where even after converting my sets to lists, they aren't recognized as lists.
So the idea is to delete extra columns from a data frame comparing with columns in another. I have two data frames say df_test and df_train . I need to remove columns in df_test which are not in train .
extracols = set(df_test.columns) - set(df_train.columns) #Gives cols 2b
deltd
l = [extracols] # or list(extracols)
Xdp.dropna( subset = l, how ='any' , axis = 0)
I get an error : Unhashable type set
Even on printing l it prints like a set with {} curlies.
[{set}] doesn't cast to list, it just creates a list of length 1 with your set inside it.
Are you sure that list({set}) isn't working for you? Maybe you should post more of your code as it is hard to see where this is going wrong for you.
I have this Cobol paragraph that will search one table which at this point in my example would have a table counter of 2 which is what the first INDEX loop does. The variable A represents an Occurs that is defined in a file (include) which has 5 occurrences. I can get to the if statement but it returns false. I read the information out of a ParmCard and store that in the table which is Table-B and the ParmCard is correct.
I did get it to find one value when was changing values around (conditional statements) but I know that both of the values that it is looking for in the ParmCard are in the file and should be found and it should find two results. I would have tried Expeditor but the system was down at work.
Is there something wrong with the index or may be I think that the perform's are working one way but they are really working a different way? This Search paragraph gets executed with every read of the ID file thus it will look in the table as many times as the ID file has an ID and ID symbols are unique.
Question: Why would the IF-STATEMENT not be working?
Code:
SEARCH-PARAGRAPH.
PERFORM VARYING SUB FROM 1 BY 1 UNTIL SUB > 2 <--DUPLICATE INDEXER
IF A(TAB) = TABLE-B(SUB) THEN
MOVE 6 TO TAB
MOVE 'TRUE' TO FOUND-IS
PERFORM WRITE-FILE THRU X-WF
PERFORM LOG-RESULT THRU X-LR
END-IF
END-PERFORM
X-SP. EXIT.
SEARCH-INDEX.
PERFORM VARYING I FROM 1 BY 1 UNTIL I > 2
DISPLAY 'INDEX --> ' I
PERFORM VARYING TAB FROM 1 BY 1 UNTIL TAB > 5
DISPLAY 'TAB --> ' TAB
PERFORM SEARCH-PARAGRPAH THRU X-SP
END-PERFORM
END-PERFORM.
X-SEARCH-INDEX. EXIT.
Here is the way that it works now and I do get the results I want. It is difficult to past the company code up because you never know who might have a problem.
New Code:
READ-PROV.
READ P-FILE
AT END
MOVE 'Y' TO EOF2
GO TO X-READ-PROV
NOT AT END
ADD 1 TO T-REC-READ
MOVE P-RECORD TO TEST-RECORD
PERFORM CHECK-MATCH THRU X-CHECK-MATCH
END-READ.
X-READ-PROV. EXIT.
CHECK-MATCH.
PERFORM VARYING SUB FROM 1 BY 1 UNTIL SUB > TABLECOUNTER
IF PID >= FROM(SUB) AND
PID <= THRU(SUB) THEN
IF TODAY < P-END-DTE THEN
IF TOTAL-PD = 0 AND
TOTAL-PD = 0 AND
TOTAL-PD = 0 AND
TOTAL-PD = 0 AND
TOTAL-PD = 0 THEN
IF PBILLIND NOT EQUAL 'Y'
PERFORM VARYING TAB FROM 1 BY 1 UNTIL TAB > 5
IF P-CD(TAB) = TY(SUB) THEN
MOVE 6 TO TAB
DISPLAY('***Found***')
ADD 1 TO T-REC-FOUND
END-IF
END-PERFORM
END-IF
END-IF
END-IF
END-IF
END-PERFORM.
X-CM. EXIT.
We can't tell.
There is nothing "wrong" with your nested PERFORM. The IF test is failing simply because it is never true.
We can't get you further with that without seeing your data-definitions, sample input and expected output.
However... my guess would be that the problem is with your data from the PARM in the JCL. That is the most likely area.
It is of course possible that the problem is with the other definition.
A couple of things whilst waiting.
Please always post the actual code, always. We don't want to look for errors in what you have typed here, we want to see the actual code. You have not shown the actual code, because it will not compile, as INDEX is a Reserved Word in COBOL, so you can't use it for a data-name.
Please always bear in mind that what you think may be wrong may not be the problem, so post everything we are likely to need (data-definitions, data you used, actual results you got with the code (including anything you've added for problem-determination), results which were expected).
Some tips.
A paragraph requires a full-stop/period after the paragraph-name and before the next paragraph. If you put that second full-stop/period on a line of its own, and have no full-stops/periods attached to your PROCEDURE code itself, you'll make things look neater and avoid problems when you want to copy some lines which happen to have a full-stop/period to a place where they cause you a mess.
You are using literal values. This is bad. When the number of entries in one of your tables changes, you have to change those literal values. Say the 2 needs to be changed to 5. You have to look at every occurrence of the literal 2 and decide if it needs to be changed. Then change it to 5. Then you get another request, to change the table which originally had five entries so that it will have six. See how difficult/error-prone life can be?
If instead you have unique and well-named data-names for your maximum number of entries, you only have one place to make a change, and you know it can be changed without reference to the rest of the code (assuming someone clever hasn't seen it has a value they want for something, and use it despite its name, of course...).
The content of those fields you can set automatically:
01 TABLE-1.
05 FILLER OCCURS 2 TIMES.
10 A PIC X(10).
01 TABLE-2.
05 FILLER OCCURS 5 TIMES.
10 TABLE-B PIC X(10).
01 TABLE-1-NO-OF-ENTRIES COMP PIC 9(4).
01 TABLE-2-NO-OF-ENTRIES COMP PIC 9(4).
...
PROCEDURE DIVISION.
...
COMPUTE TABLE-1-NO-OF-ENTRIES = LENGTH OF TABLE-1
/ LENGTH OF A
COMPUTE TABLE-2-NO-OF-ENTRIES = LENGTH OF TABLE-2
/ LENGTH OF TABLE-B
DISPLAY TABLE-1-NO-OF-ENTRIES
DISPLAY TABLE-2-NO-OF-ENTRIES
That gives you the output 2 and 5.
The names I've used are a mixture of yours and some for demonstration purposes only. Make everything meaningful, and by that I don't mean trite, as my example names would be in real life.
If you insist on escaping from within your PERFORM like that (and take note of Bruce Martin's comment), you can calculate your escape value by using new, aptly-named, fields and giving them the value of the above plus one.
To do a nested loop when the outer loop only has two entries is overkill. You don't need to escape out of the loops like you do, if you have a termination condition on the loop.
That'll do for now until we see your definitions, data and results.
I currently have a data file that is structured like this:
1
-
-
2
-
3
-
I would like it to look like this:
1
1
1
2
2
3
3
Unfortunately I do not how to achieve this in SPSS. Is there are a simple command that could recode the data this way?
I have found the answer, by using the LAG function. (I defined 9999 as a missing value).
IF (variable = 9999) variable=LAG(variable).
EXECUTE.