SPSS selection of same ID rows based on difference between rows - spss

I have a dataset where I have rows of data for each ID. Each row reflects a different time each ID has accessed the website. I have also created a variable which tells me how many months there were between each visit. I want to select all the cases from time 1 to last time value for each ID if they have returned after at least 1 month. What do I do?
ID Time MonthSince
1 1 .
1 2 0
2 1 .
2 2 1
3 1 .
3 2 0
I would like the dataset to look as follows:
ID Time MonthSince Filter
1 1 . Not Selected
1 2 0 Not Selected
2 1 . Selected
2 2 1 Selected
3 1 . Not Selected
3 2 0 Not Selected

What I suggest is to calculate the total number of months in MonthSince for each ID. If this total is zero, we know there was never a gap of at least a month between visits, so we can filter these cases out:
aggregate outfile=* mode=addvariables/break=ID/TotMonths=sum(MonthSince).
select if TotMonths>0.
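For readers who want to trace the logic outside SPSS, here is a minimal Python sketch of the same idea, using only the sample rows from the question. It is an illustration, not a translation of the syntax above: `None` stands in for the system-missing value and is left out of the sum, and the rows are flagged with a Filter value (as in the desired output) rather than deleted. The names `rows`, `tot_months` and `flagged` are made up for this example.

```python
from collections import defaultdict

# Sample rows from the question: (ID, Time, MonthSince).
# None stands in for the system-missing value shown as "." in the data.
rows = [
    (1, 1, None), (1, 2, 0),
    (2, 1, None), (2, 2, 1),
    (3, 1, None), (3, 2, 0),
]

# Sum MonthSince per ID, leaving missing values out of the sum.
tot_months = defaultdict(int)
for id_, _time, month_since in rows:
    if month_since is not None:
        tot_months[id_] += month_since

# Flag every row of an ID whose total is above zero.
flagged = [
    (id_, time, month_since,
     "Selected" if tot_months[id_] > 0 else "Not Selected")
    for id_, time, month_since in rows
]

for row in flagged:
    print(row)
```

Flagging instead of selecting keeps all rows in the dataset, which is closer to the Filter column the question asks for.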

Related

Product co-purchasing or bundles given a product ID

I have a dataset (Amazon product co-purchasing) in two columns whose values are product IDs. I would like to select values in the ranges 100-299, 300-399, 400-999 and all other values, and group them. I want to create a bundle or co-purchasing link between products in one group and products in another, e.g. 100-299 with 300-399, or 400-999 with 100-299. The original data has two columns, FromNode and ToNode. Below are a few lines of the original data. Some values (product IDs) appear under both columns.
FromNode ToNode
0 1
0 2
0 3
0 4
0 5
1 0
1 2
1 4
1 5
1 15
2 0
2 11
2 13
2 14
3 65
3 66
3 67
I am using
df[df[['FromNode', 'ToNode']].isin([100,101,102...299]).any(1)]
to pick the values in the range, but it seems I have to list all the values in the isin argument. Is there an efficient way to just give the range 100-299 to isin, something like isin(100-299), to fetch the values? Should I just combine both columns into one and use iloc to select the values? Any tips will help.

rails order using relationship

I want to order ActiveRecord results based on relationships the model has with itself. That may not make sense on its own, so here is an example.
Below is the raw data. previous and next refer to the id of the previous/next record in the sequence. If there is no previous/next record, the value is the same as the id of the current record.
So the first record is the one where previous equals its own id, since this has to be the first item. Each following row is the record whose id equals the next value of the row before it.
id  name     previous  next
1   grade 5  3         1
2   grade 3  5         3
3   grade 4  2         1
4   grade 1  4         5
5   grade 2  4         2
This needs sorting to display as
id  name     previous  next
4   grade 1  4         5
5   grade 2  4         2
2   grade 3  5         3
3   grade 4  2         1
1   grade 5  3         1
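
The ordering rule described in the question can be sketched on in-memory rows. The following is a hedged Python illustration of that rule only, not Rails or ActiveRecord code: it assumes all records are already loaded, and the function name `order_by_links` is made up for this example. It starts from the record whose previous equals its own id and follows the next values until it reaches a record whose next points to itself.

```python
# Rows from the question, as plain dictionaries.
records = [
    {"id": 1, "name": "grade 5", "previous": 3, "next": 1},
    {"id": 2, "name": "grade 3", "previous": 5, "next": 3},
    {"id": 3, "name": "grade 4", "previous": 2, "next": 1},
    {"id": 4, "name": "grade 1", "previous": 4, "next": 5},
    {"id": 5, "name": "grade 2", "previous": 4, "next": 2},
]


def order_by_links(records):
    """Return the records in sequence order by following the next links."""
    by_id = {r["id"]: r for r in records}
    # The head of the sequence is the record that is its own "previous".
    current = next(r for r in records if r["previous"] == r["id"])
    ordered = [current]
    seen = {current["id"]}
    # The tail of the sequence is the record that is its own "next".
    while current["next"] != current["id"]:
        current = by_id[current["next"]]
        if current["id"] in seen:
            raise ValueError("the next links form a cycle")
        seen.add(current["id"])
        ordered.append(current)
    return ordered


ordered = order_by_links(records)
for r in ordered:
    print(r["id"], r["name"], r["previous"], r["next"])
```

The lookup dictionary keeps each step constant-time, and the `seen` set guards against malformed data where the links loop back on themselves.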

Merging different parts of one file - based on a variable in the file

I have a data file that looks like the first picture, I am reading it in to SPSS using FILE TYPE MIXED so that it looks like the second picture. How can I merge the cases based on the ID variable so that cases with the same ID variable are merged? The variable Age is repeated, so it does not matter which is selected, but it would be good if it were possible to select the first value.
Here is an example of the code I am using to read the data:
FILE TYPE MIXED RECORD=RecordID 1
/ WILD =WARN.
RECORD TYPE 1.
DATA LIST
/ ID 8-9 JobType 3-4 Age 5-7.
RECORD TYPE 2.
DATA LIST
/ ID 3-4 Sex 11 Salary 5-8.
RECORD TYPE 3.
DATA LIST
/ ID 6-7 Age 8-10 Hiring 3-5.
END FILE TYPE.
BEGIN DATA
1 1 39 1
1 3 27 2
1 2 27 3
1 3 25 4
2 1 9000 0
2 2 7500 0
2 3 4750 1
2 4 7250 1
3 76 1 39
3 98 2 27
3 8 3 27
3 44 4 25
END DATA.
LIST.
This should work:
sort cases by ID RecordID.
casestovars id=ID/index=RecordID.
If the ages are identical they collapse into one column. If they aren't, you'll get three age columns, and you'll be able to choose the one you prefer.
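
As a rough illustration of what merging by ID amounts to, here is a small Python sketch. It is not what CASESTOVARS does internally: it keeps the first value seen for a repeated variable (as the question prefers for Age) instead of creating indexed columns. The records are the first two IDs from the sample data, read under the assumption that the column layout matches the DATA LIST definitions; the function name `merge_by_id` is made up for this example.

```python
# One dictionary per input record, as produced by reading the mixed file.
records = [
    {"RecordID": 1, "ID": 1, "JobType": 1, "Age": 39},
    {"RecordID": 1, "ID": 2, "JobType": 3, "Age": 27},
    {"RecordID": 2, "ID": 1, "Salary": 9000, "Sex": 0},
    {"RecordID": 2, "ID": 2, "Salary": 7500, "Sex": 0},
    {"RecordID": 3, "ID": 1, "Hiring": 76, "Age": 39},
    {"RecordID": 3, "ID": 2, "Hiring": 98, "Age": 27},
]


def merge_by_id(records):
    """Combine all records sharing an ID into one row per ID."""
    merged = {}
    # Sorting by ID and RecordID makes "first value" mean lowest record type.
    for rec in sorted(records, key=lambda r: (r["ID"], r["RecordID"])):
        row = merged.setdefault(rec["ID"], {"ID": rec["ID"]})
        for key, value in rec.items():
            if key in ("ID", "RecordID"):
                continue
            # Keep the first value seen for a repeated variable such as Age.
            row.setdefault(key, value)
    return list(merged.values())


wide = merge_by_id(records)
for row in wide:
    print(row)
```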

How to group by two columns

I tried GROUP BY week, day, but it didn't give me the correct result.
Here is the table: weekDay
week day
-----------
1 1
1 1
1 2
2 1
2 1
2 1
2 2
2 2
and expected result:
week day count
---------------------
1 1 2
1 2 1
2 1 3
2 2 2
How to get the result above by using group by or other ways?
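Use the following SQL query to retrieve the result. It groups on both columns and counts the rows in each group; COUNT(*) is used because the table shown has no id column:
SELECT week, day, COUNT(*) AS count FROM weekDay GROUP BY week, day ORDER BY week, day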
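
To check a two-column GROUP BY against the sample rows, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. Some details are assumptions made for the illustration: the table is named weekDay as in the question, it has only the two columns shown, rows are counted with COUNT(*), and the count column is aliased as cnt.

```python
import sqlite3

# Build the sample table from the question in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weekDay (week INTEGER, day INTEGER)")
conn.executemany(
    "INSERT INTO weekDay (week, day) VALUES (?, ?)",
    [(1, 1), (1, 1), (1, 2), (2, 1), (2, 1), (2, 1), (2, 2), (2, 2)],
)

# Group on both columns and count the rows in each (week, day) pair.
result = conn.execute(
    "SELECT week, day, COUNT(*) AS cnt "
    "FROM weekDay GROUP BY week, day ORDER BY week, day"
).fetchall()
conn.close()

for week, day, cnt in result:
    print(week, day, cnt)
```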

Using sum when grouping one more columns in LINQ

I have 2 tables like below.
For comments to vote
VoteId VoteValue UserId CommentId DateAdded
1 1 1 1 10/11/2013
2 1 5 1 10/14/2013
3 1 9 2 09/08/2013
4 1 11 3 01/03/2014
For users that take point values
PointId Date PointValue UserId
1 10/11/2013 1 1
2 10/14/2013 1 5
3 09/08/2013 1 9
4 01/03/2014 1 11
I need to find the 10 users who received the most votes each month across all comments. First I tried to write LINQ like this:
var object = (db.Comments.
Where(c => c.ApplicationUser.Id == comment.ApplicationUser.Id).
FirstOrDefault()).ToList();
I can't work out how to use Sum and add the points to my table. Any help?
I hope it's clear.
First you should extract the month from the datetime value, then group by month in descending order, take the sum over all comments, and use Take(10) at the end.
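
The steps in this answer (extract the month, group, sum, take the top 10) can be sketched in Python on the sample points table. This is only an illustration of the grouping logic, not LINQ and not the asker's data model. It assumes the dates are in MM/DD/YYYY format, groups by year and month so that the same month in different years stays separate, and breaks ties by user id. The function name `top_users_per_month` is made up for this example.

```python
from collections import defaultdict
from datetime import datetime

# (PointId, Date, PointValue, UserId) rows from the question's points table.
points = [
    (1, "10/11/2013", 1, 1),
    (2, "10/14/2013", 1, 5),
    (3, "09/08/2013", 1, 9),
    (4, "01/03/2014", 1, 11),
]


def top_users_per_month(points, limit=10):
    """Map (year, month) to its top users as (user_id, total) pairs."""
    totals = defaultdict(lambda: defaultdict(int))
    for _point_id, date_text, value, user_id in points:
        date = datetime.strptime(date_text, "%m/%d/%Y")
        totals[(date.year, date.month)][user_id] += value

    result = {}
    for month, per_user in totals.items():
        # Highest total first; ties are broken by the lower user id.
        ranked = sorted(per_user.items(), key=lambda kv: (-kv[1], kv[0]))
        result[month] = ranked[:limit]
    return result


top = top_users_per_month(points)
for month in sorted(top, reverse=True):
    print(month, top[month])
```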
