I want to order ActiveRecord results based on a relationship the records have with each other. That may not make sense on its own, so here is an example.
Below is the raw data. previous and next refer to the id of the previous/next record in the sequence. If there is no previous/next record, the value is the same as the current record's id.
So the first record is the one where previous = current id, since this has to be the first item. Each following row is the one whose id equals the next id of the row before it.
id  name     previous  next
1   grade 5  3         1
2   grade 3  5         3
3   grade 4  2         1
4   grade 1  4         5
5   grade 2  4         2
This needs sorting to display as
id  name     previous  next
4   grade 1  4         5
5   grade 2  4         2
2   grade 3  5         3
3   grade 4  2         1
1   grade 5  3         1
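The ordering rule described above (start at the row whose previous equals its own id, then keep following next) can be sketched in plain Python. This is only an illustration of the traversal, not ActiveRecord code; the function name order_by_links and the dictionary layout are assumptions made for the example.

```python
# Rows from the question, as plain dictionaries (layout is an assumption).
records = [
    {"id": 1, "name": "grade 5", "previous": 3, "next": 1},
    {"id": 2, "name": "grade 3", "previous": 5, "next": 3},
    {"id": 3, "name": "grade 4", "previous": 2, "next": 1},
    {"id": 4, "name": "grade 1", "previous": 4, "next": 5},
    {"id": 5, "name": "grade 2", "previous": 4, "next": 2},
]


def order_by_links(rows):
    """Return rows in sequence order by following the next pointers."""
    by_id = {row["id"]: row for row in rows}
    # The first row is the one that points to itself as its previous row.
    current = next(row for row in rows if row["previous"] == row["id"])
    ordered = [current]
    seen = {current["id"]}
    # The last row points to itself as its next row.
    while current["next"] != current["id"]:
        current = by_id[current["next"]]
        if current["id"] in seen:
            raise ValueError("cycle detected in next pointers")
        seen.add(current["id"])
        ordered.append(current)
    return ordered


ordered = order_by_links(records)
print([row["id"] for row in ordered])
```

With the sample rows this walks 4, 5, 2, 3, 1, which matches the desired display order. In a Rails app the same walk could be done in Ruby after loading the records, since the order depends on pointer chasing rather than on a sortable column.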
Related
I have data (Amazon co-purchasing products) in two columns whose values are product IDs. I would like to select values in the ranges 100-299, 300-399, 400-999, and all other values, and group them. I want to create a bundle or co-purchasing link between products in one group and another, e.g. 100-299 with 300-399, or 400-999 with 100-299. The original data has two columns, FromNode and ToNode. Below are a few lines of the original data. Some values (product IDs) appear under both columns.
FromNode ToNode
0 1
0 2
0 3
0 4
0 5
1 0
1 2
1 4
1 5
1 15
2 0
2 11
2 13
2 14
3 65
3 66
3 67
I am using
df[df[['FromNode', 'ToNode']].isin([100,101,102...299]).any(1)]
to pick the values in the range, but it seems I have to list all the values in the isin argument. Is there an efficient way to give isin just the range 100-299 to fetch the values? Should I just combine both columns into one and use iloc to select the values? Any tips will help.
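One possible way to avoid listing every value is to pass a Python range to isin, or to compare the columns against the bounds directly. The sketch below assumes pandas is installed; the sample frame and the helper name rows_in_range are made up for illustration and are not part of the original data.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df = pd.DataFrame(
    {"FromNode": [0, 100, 150, 300, 450], "ToNode": [1, 2, 320, 299, 500]}
)


def rows_in_range(frame, low, high):
    """Keep rows where FromNode or ToNode lies in [low, high]."""
    cols = frame[["FromNode", "ToNode"]]
    mask = ((cols >= low) & (cols <= high)).any(axis=1)
    return frame[mask]


# Comparison-based selection for the 100-299 group.
selected = rows_in_range(df, 100, 299)

# Same selection with isin; range(100, 300) covers 100 through 299.
selected_isin = df[df[["FromNode", "ToNode"]].isin(range(100, 300)).any(axis=1)]

print(selected)
```

The same helper can be called once per group (300-399, 400-999, and so on), so the columns do not need to be combined first.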
I have a dataset where I have rows of data for each ID. Each row reflects a different time each ID has accessed the website. I have also created a variable which tells me how many months there were between each visit. I want to select all the cases from time 1 to last time value for each ID if they have returned after at least 1 month. What do I do?
ID Time MonthSince
1 1 .
1 2 0
2 1 .
2 2 1
3 1 .
3 2 0
I would like the dataset to look as follows:
ID Time MonthSince Filter
1 1 . Not Selected
1 2 0 Not Selected
2 1 . Selected
2 2 1 Selected
3 1 . Not Selected
3 2 0 Not Selected
What I suggest is to calculate the total number of months in MonthSince for each ID. If this total is zero, we know there was never a gap of a month or more between visits, and we can filter these cases out:
aggregate outfile=* mode=addvariables/break=ID/TotMonths=sum(MonthSince).
select if TotMonths>0.
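The logic of the syntax above (sum MonthSince per ID, then keep IDs whose total is above zero) can be sketched in plain Python to show how the Filter column from the question would come out. This is an illustration only, not SPSS syntax; the function name flag_returning_ids is an assumption, and missing values (the dots in the data) are represented as None and treated as zero.

```python
# (ID, Time, MonthSince) rows from the question; None stands for a missing value.
rows = [
    (1, 1, None),
    (1, 2, 0),
    (2, 1, None),
    (2, 2, 1),
    (3, 1, None),
    (3, 2, 0),
]


def flag_returning_ids(data):
    """Label every row of an ID as Selected when its MonthSince total is > 0."""
    totals = {}
    for case_id, _time, months in data:
        totals[case_id] = totals.get(case_id, 0) + (months or 0)
    return [
        (case_id, time, months, "Selected" if totals[case_id] > 0 else "Not Selected")
        for case_id, time, months in data
    ]


flagged = flag_returning_ids(rows)
for row in flagged:
    print(row)
```

Only the two rows for ID 2 come out as Selected, which matches the desired dataset shown in the question.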
Let's say, for simplicity's sake, I have the following table:
id amount p_id date
------------------------------------------------
1 5 1 2020-01-01T01:00:00
2 10 1 2020-01-01T01:10:00
3 15 2 2020-01-01T01:20:00
4 10 3 2020-01-01T03:30:00
5 10 4 2020-01-01T03:50:00
6 20 1 2020-01-01T03:40:00
Here's a sample response I want:
{
"2020-01-01T01:00:00": 25, -- this is from adding records with ids: 2 and 3
"2020-01-01T03:00:00": 55 -- this is from adding records with ids: 3,4,5 and 6
}
I want to get the total (sum(amount)) of all unique p_id's grouped by the hour.
The row chosen per p_id is the one with the latest date. So for example, the first value in the response above doesn't include id 1 because the record with id 2 has the same p_id and the date on that row is later.
The one tricky thing is I want to include the summation of all the amount per p_id if their date is before the hour presented. So for example, in the second value of the response (with key "2020-01-01T03:00:00"), even though id 3 has a timestamp in a different hour, it's the latest for that p_id 2 and therefore gets included in the sum for "2020-01-01T03:00:00". But the row with id 6 overrides id 2 with the same p_id 1.
In other words: always take the latest amount for each p_id so far, and compute the sum for every distinct hour found in the table.
Create a CTE that includes row_number() over (partition by p_id, date_trunc('hour',"date") order by "date" desc) as pid_hr_seq
Then write your query against that CTE with where pid_hr_seq = 1.
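To make the rule from the question concrete ("always take the latest amount for each p_id so far, and compute the sum for every distinct hour"), here is a small plain-Python sketch. It is not the SQL suggested above and does not use a database; it only walks the sample rows in date order. The function name hourly_totals is an assumption made for the example.

```python
from datetime import datetime

# (id, amount, p_id, date) rows from the question.
rows = [
    (1, 5, 1, "2020-01-01T01:00:00"),
    (2, 10, 1, "2020-01-01T01:10:00"),
    (3, 15, 2, "2020-01-01T01:20:00"),
    (4, 10, 3, "2020-01-01T03:30:00"),
    (5, 10, 4, "2020-01-01T03:50:00"),
    (6, 20, 1, "2020-01-01T03:40:00"),
]


def hourly_totals(data):
    """Sum the latest amount per p_id seen so far, for each hour in the data."""
    parsed = sorted(
        (datetime.fromisoformat(date), p_id, amount)
        for _row_id, amount, p_id, date in data
    )
    latest = {}  # p_id -> most recent amount seen so far
    totals = {}  # hour (ISO string) -> running sum at the end of that hour
    for stamp, p_id, amount in parsed:
        latest[p_id] = amount
        hour = stamp.replace(minute=0, second=0, microsecond=0).isoformat()
        # Overwritten on every row, so the last row of each hour wins.
        totals[hour] = sum(latest.values())
    return totals


totals = hourly_totals(rows)
print(totals)
```

For the sample table this gives 25 for the 01:00 hour and 55 for the 03:00 hour, the same values as the sample response. Note that rows from earlier hours carry over into later hours, which a per-hour partition alone would not capture.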
I have a data file that looks like the first picture. I am reading it into SPSS using FILE TYPE MIXED so that it looks like the second picture. How can I merge the cases based on the ID variable, so that cases with the same ID are combined into one? The variable Age is repeated, so it does not matter which value is selected, but it would be good if it were possible to select the first value.
Here is an example of the code I am using to read the data:
FILE TYPE MIXED RECORD=RecordID 1
/ WILD =WARN.
RECORD TYPE 1.
DATA LIST
/ ID 8-9 JobType 3-4 Age 5-7.
RECORD TYPE 2.
DATA LIST
/ ID 3-4 Sex 11 Salary 5-8.
RECORD TYPE 3.
DATA LIST
/ ID 6-7 Age 8-10 Hiring 3-5.
END FILE TYPE.
BEGIN DATA
1 1 39 1
1 3 27 2
1 2 27 3
1 3 25 4
2 1 9000 0
2 2 7500 0
2 3 4750 1
2 4 7250 1
3 76 1 39
3 98 2 27
3 8 3 27
3 44 4 25
END DATA.
LIST.
This should work:
sort cases by ID RecordID.
casestovars id=ID/index=RecordID.
If the ages are identical they collapse into one column. If they aren't, you'll get three age columns, and you'll be able to choose the one you prefer.
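As a rough illustration of what merging cases by ID means, the plain-Python sketch below combines records that share an ID and keeps the first value seen for a repeated variable such as Age. It is not SPSS syntax and it is not how casestovars works internally (casestovars creates indexed variables when values differ). The function name merge_by_id and the sample records are hypothetical.

```python
# Hypothetical records, one per input line, after reading the three record types.
records = [
    {"ID": 1, "JobType": 1, "Age": 39},
    {"ID": 1, "Sex": 0, "Salary": 9000},
    {"ID": 1, "Age": 39, "Hiring": 76},
    {"ID": 2, "JobType": 3, "Age": 27},
    {"ID": 2, "Sex": 0, "Salary": 7500},
]


def merge_by_id(rows):
    """Combine rows sharing an ID into one dict, keeping the first value per key."""
    merged = {}
    for row in rows:
        target = merged.setdefault(row["ID"], {})
        for key, value in row.items():
            # setdefault leaves an existing value alone, so the first one wins.
            target.setdefault(key, value)
    return [merged[case_id] for case_id in sorted(merged)]


merged = merge_by_id(records)
for case in merged:
    print(case)
```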
I have 2 tables like below.
For comments to vote
VoteId VoteValue UserId CommentId DateAdded
1 1 1 1 10/11/2013
2 1 5 1 10/14/2013
3 1 9 2 09/08/2013
4 1 11 3 01/03/2014
For users that take point values
PointId Date PointValue UserId
1 10/11/2013 1 1
2 10/14/2013 1 5
3 09/08/2013 1 9
4 01/03/2014 1 11
I need to find the 10 users who received the most votes each month across all comments. First I tried to write LINQ like this:
var object = (db.Comments.
Where(c => c.ApplicationUser.Id == comment.ApplicationUser.Id).
FirstOrDefault()).ToList();
I can't work out how to use Sum and add the points to my table. Any help?
I hope it's clear.
First extract the month from the datetime value, then group by month, take the sum for each group, order descending, and use Take(10) at the end.
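The steps above (extract the month, group, sum, order descending, take 10) can be sketched in plain Python using the points table from the question. This is not LINQ and not the asker's data model; it assumes the dates are in month/day/year format and that points are summed per user within each month. The function name top_users_per_month is made up for the example.

```python
from collections import defaultdict
from datetime import datetime

# (PointId, Date, PointValue, UserId) rows from the question.
points = [
    (1, "10/11/2013", 1, 1),
    (2, "10/14/2013", 1, 5),
    (3, "09/08/2013", 1, 9),
    (4, "01/03/2014", 1, 11),
]


def top_users_per_month(rows, limit=10):
    """Return, per (year, month), the users with the highest point totals."""
    totals = defaultdict(lambda: defaultdict(int))
    for _point_id, date_text, value, user_id in rows:
        day = datetime.strptime(date_text, "%m/%d/%Y")  # assumed date format
        totals[(day.year, day.month)][user_id] += value
    result = {}
    for month, per_user in totals.items():
        # Highest total first; ties are broken by the lower user id.
        ranked = sorted(per_user.items(), key=lambda item: (-item[1], item[0]))
        result[month] = ranked[:limit]
    return result


top = top_users_per_month(points)
for month in sorted(top):
    print(month, top[month])
```

In LINQ the same shape would typically be a GroupBy on year and month, a Sum per user, an OrderByDescending on that sum, and Take(10).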