PostgreSQL sum all unique values from previous dates - ruby-on-rails

Let's say, for simplicity's sake, I have the following table:
id  amount  p_id  date
------------------------------------------------
1   5       1     2020-01-01T01:00:00
2   10      1     2020-01-01T01:10:00
3   15      2     2020-01-01T01:20:00
4   10      3     2020-01-01T03:30:00
5   10      4     2020-01-01T03:50:00
6   20      1     2020-01-01T03:40:00
Here's a sample response I want:
{
"2020-01-01T01:00:00": 25, -- this is from adding records with ids: 2 and 3
"2020-01-01T03:00:00": 55 -- this is from adding records with ids: 3,4,5 and 6
}
I want to get the total (sum(amount)) of all unique p_id's grouped by the hour.
The row chosen per p_id is the one with the latest date. So for example, the first value in the response above doesn't include id 1 because the record with id 2 has the same p_id and the date on that row is later.
The one tricky thing is that each hour's sum should also include the latest amount of every p_id whose date falls before that hour. For example, in the second value of the response (with key "2020-01-01T03:00:00"), id 3 has a timestamp in a different hour, but it is still the latest row for p_id 2 and therefore gets included in the sum for "2020-01-01T03:00:00". The row with id 6, on the other hand, overrides id 2, which has the same p_id 1.
In other words: always take the latest amount for each p_id so far, and compute the sum for every distinct hour found in the table.

Create a CTE that includes the following column:
row_number() over (partition by p_id, date_trunc('hour', "date") order by "date" desc) as pid_hr_seq
Then write your query against that CTE with where pid_hr_seq = 1.
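Taken on its own, the pid_hr_seq = 1 filter keeps the latest row per p_id within each hour; the question additionally asks for each p_id's latest amount to carry forward into later hours. As a rough sketch of that carry-forward rule (not a replacement for the SQL), the logic can be written in plain Python. The function name running_latest_sum_by_hour is hypothetical, the rows are copied from the question, and ties between identical timestamps for the same p_id are not handled specially.

```python
from datetime import datetime

# Rows copied from the question's table: (id, amount, p_id, date).
ROWS = [
    (1, 5, 1, "2020-01-01T01:00:00"),
    (2, 10, 1, "2020-01-01T01:10:00"),
    (3, 15, 2, "2020-01-01T01:20:00"),
    (4, 10, 3, "2020-01-01T03:30:00"),
    (5, 10, 4, "2020-01-01T03:50:00"),
    (6, 20, 1, "2020-01-01T03:40:00"),
]


def running_latest_sum_by_hour(rows):
    """Hypothetical helper: per-hour sum of the latest amount seen so far for each p_id."""
    # Process rows in chronological order so later rows override earlier ones.
    events = sorted(
        (datetime.fromisoformat(ts), p_id, amount) for _id, amount, p_id, ts in rows
    )
    latest = {}  # p_id -> most recent amount seen so far
    totals = {}  # hour (ISO string) -> sum of latest amounts at the end of that hour
    for ts, p_id, amount in events:
        latest[p_id] = amount
        hour = ts.replace(minute=0, second=0, microsecond=0)
        # Overwritten on every event, so the last event in the hour wins.
        totals[hour.isoformat()] = sum(latest.values())
    return totals


print(running_latest_sum_by_hour(ROWS))
```

Only hours that actually occur in the data get a key, which matches the "every distinct hour found in the table" wording of the question.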

Related

How can I query sum across columns, but avoid duplicated submissions / rows from unique users?

I have a Google form accepting data from different users, which goes to a sheet and is used to SUM the values across columns at the end of the day. The problem is that users can re-submit forms to update the data totals if there is a change at the end of the day:
NAME    K  L  M
ALF     4  0  1
BILL    1  0  0
SALLY   1  0  1
DENNIS  1  1  1
RICK    0  0  1
SALLY   2  1  1  <--- SALLY RESUBMITTED HER FORM AGAIN WITH UPDATED VALUES
In my current Query, I SUM() the columns after filtering by the date like this:
SELECT SUM(K), SUM(M), SUM(N) WHERE C = date '"&TEXT($B$1,"yyyy-mm-dd")&
$B$1 is a cell with a datepicker and col C is the user-submitted form date. Col A has the unique, form-generated submission timestamps.
As you can see, the SUM for each column will be off by the extra submission from Sally. I need to include only the most recent submissions from any of the users, and ignore any prior ones for this date. I'm not sure how to filter in this manner and sum just the most recent instance from each unique user.
** EDIT **
I should note the original form data is on another sheet and the cells are referenced via a query to this range. The form is also submitted daily, so the query must be able to specify the date in question for summation of entries.
Try the following formula:
=QUERY(INDEX(REDUCE({0,0,0,0},UNIQUE(J2:J7),LAMBDA(a,b,{a;SORTN(FILTER(J2:M7,J2:J7=b,C2:C7=date(2023,2,17)),1)})))," select sum(Col2), sum(Col3), sum(Col4)")
If you actually want to sum only the most recent response from each user, then use:
=QUERY(INDEX(REDUCE(SEQUENCE(1,COLUMNS(A2:M7),0,0),UNIQUE(J2:J7),LAMBDA(a,b,{a;QUERY(SORT(FILTER(A2:M7,J2:J7=b),1,0),"limit 1")}))),"select sum(Col11), sum(Col12), sum(Col13)")
Here is another option: create an auxiliary column that returns 1 if the row matches the date and has the last timestamp for that user:
=QUERY({K:M,
MAP(A:A,C:C,J:J,LAMBDA(ts,date,n,IF(date<>B1,0,IF(ts=MAX(FILTER(A:A,J:J=n,C:C=date)),1,0))))},
"SELECT SUM(Col1),SUM(Col2),SUM(Col3) where Col4=1")

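The formulas above share one idea: for each name keep only the row with the greatest timestamp, then sum the value columns. A minimal Python sketch of that idea follows. The data comes from the question's table, but the timestamps are hypothetical sequence numbers (the question does not show them), the date filter is left out, and the function name sum_latest_per_user is made up for illustration.

```python
# (timestamp, name, K, L, M); timestamps are hypothetical and only encode submission order.
SUBMISSIONS = [
    (1, "ALF", 4, 0, 1),
    (2, "BILL", 1, 0, 0),
    (3, "SALLY", 1, 0, 1),
    (4, "DENNIS", 1, 1, 1),
    (5, "RICK", 0, 0, 1),
    (6, "SALLY", 2, 1, 1),  # resubmission that should replace SALLY's earlier row
]


def sum_latest_per_user(rows):
    """Sum each value column using only the most recent row per name."""
    latest = {}  # name -> (timestamp, [values])
    for ts, name, *values in rows:
        if name not in latest or ts > latest[name][0]:
            latest[name] = (ts, values)
    # Transpose the kept rows into columns and sum each column.
    return [sum(column) for column in zip(*(values for _ts, values in latest.values()))]


print(sum_latest_per_user(SUBMISSIONS))
```
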
Google Sheets - Is it possible to use an array sum function in the following condition?

Would it be possible to use ARRAYFORMULA for this condition?
Sum all the rows whose PID is the same; the result should be as in the image.
I tried this formula, but I think it's too long, and if a PID spans more than 20 rows it will not work.
=IF(A3<>A2,BJ3+IF(A3=A4,BJ4,0)+IF(A3=A5,BJ5,0)+IF(A3=A6,BJ6,0)+IF(A3=A7,BJ7,0)+IF(A3=A8,BJ8,0)+IF(A3=A9,BJ9,0)+IF(A3=A10,BJ10,0)+IF(A3=A11,BJ11,0)+IF(A3=A12,BJ12,0)+IF(A3=A13,BJ13,0)+IF(A3=A14,BJ14,0)+IF(A3=A15,BJ15,0)+IF(A3=A16,BJ16,0)+IF(A3=A17,BJ17,0)+IF(A3=A18,BJ18,0)+IF(A3=A19,BJ19,0)+IF(A3=A20,BJ20,0)+IF(A3=A21,BJ21,0)+IF(A3=A22,BJ22,0),0)
With a table like this:
ID  Value
1   5
1   10
2   5
2   10
2   15
You have an expected output of:
ID  Value  Sum
1   5      15
1   10     blank
2   5      30
2   10     blank
2   15     blank
It is achievable with this formula (just drag it down your sum column):
=IF(A2=A1,"",SUMIFS(B$2:B$12,A$2:A$12,A2))
It checks whether the id is the same as in the row above; if it is not, it sums all values with that id, so the total only shows on the row where the id first appears.
Found it on Google by searching "google sheets sum group by".
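To make the behaviour of the SUMIFS formula explicit, here is a small Python sketch of the same logic: total every value per id, then show the total only on rows whose id differs from the row above. The function name group_sum_on_first_row is hypothetical and the data is the example table from this answer.

```python
from collections import defaultdict

# (ID, Value) pairs from the example table.
ROWS = [(1, 5), (1, 10), (2, 5), (2, 10), (2, 15)]


def group_sum_on_first_row(rows):
    """Return one cell per row: the group total where the id changes, "" otherwise."""
    totals = defaultdict(int)
    for key, value in rows:
        totals[key] += value  # equivalent of SUMIFS over the whole column

    cells = []
    previous = object()  # sentinel that never equals a real id
    for key, _value in rows:
        cells.append(totals[key] if key != previous else "")
        previous = key
    return cells


print(group_sum_on_first_row(ROWS))
```

Like the formula, this assumes rows with the same id sit next to each other; if an id reappears later, its total is shown again at that point.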
The following in C2 will generate the required answer without any copying-down required:
=arrayformula(if(len(A2:A),ifna(vlookup(row(A2:A),query({row(A2:B),A2:B},"select min(Col1),sum(Col3) where Col2 is not null group by Col2"),2,false)),))
We build a lookup table of grouped sums keyed by the first row of each 'P#' group using QUERY, then use VLOOKUP to distribute each group sum to the first row of its group. This is probably also doable with a SCAN/OFFSET combination, I think.

Update a column value on multiple records based on selected ids

In the products table I have fields like:
id  product_name  product_value  quantity  status
1   abc           10000          50        received
2   efg           5000           15        shipment
3   hij           850            100       received
4   klm           7000           20        shipment
5   nop           350            50        received
I can select multiple rows at a time. Here I selected id = 2 and 4 and need to change their status to 'received'. How can I update multiple records at once in Rails?
Try update_all, which issues a single SQL UPDATE and skips validations and callbacks:
Product.where(id: [2, 4]).update_all(status: 'received')
If you are looking for all products that have a status of 'shipment', you can use:
Product.where(status: 'shipment')
From here, you can set all to 'received', or iterate through them and select only the ones you want to make changes to.
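For reference, calling update_all on a where(id: [...]) relation boils down to one UPDATE ... WHERE id IN (...) statement. The sketch below runs that kind of statement from Python against an in-memory SQLite table; the table and rows are copied from the question, and SQLite is only a stand-in for whatever database the Rails app actually uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products ("
    "id INTEGER PRIMARY KEY, product_name TEXT, "
    "product_value INTEGER, quantity INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?, ?)",
    [
        (1, "abc", 10000, 50, "received"),
        (2, "efg", 5000, 15, "shipment"),
        (3, "hij", 850, 100, "received"),
        (4, "klm", 7000, 20, "shipment"),
        (5, "nop", 350, 50, "received"),
    ],
)

selected_ids = [2, 4]
# One placeholder per selected id keeps the statement parameterized.
placeholders = ", ".join("?" for _ in selected_ids)
cursor = conn.execute(
    f"UPDATE products SET status = ? WHERE id IN ({placeholders})",
    ["received", *selected_ids],
)
conn.commit()

updated_count = cursor.rowcount  # number of rows the single UPDATE touched
statuses = dict(conn.execute("SELECT id, status FROM products ORDER BY id"))
print(updated_count, statuses)
```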

Calculate difference between two dates and group and count the results

I have a table with a date_completed field. I would like to calculate the difference between this date and now, in months:
(Date.today.year * 12 + Date.today.month) - (date_completed.year * 12 + date_completed.month)
and then group the results based on the number of months, e.g.
No Months => Count
1 => 10
2 => 5
3 => 6
etc.
Is it possible to calculate the difference between two dates in a database and then group and count the results?
A PostgreSQL-specific query would look something like the one below; it uses PostgreSQL's extract and age functions (refer to the documentation).
select extract(year from age(date_completed)) * 12 +
       extract(month from age(date_completed)) as months,
       count(*)
from learn
group by months;
Sample output:
"months"|"count"
2| 2
3| 1
4| 1
I used a table named learn above; change it to the name of your own table.
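To illustrate what the query counts, here is a rough Python sketch of the same "whole months elapsed, then group and count" idea. It only approximates age() (PostgreSQL has its own rules for day borrowing), the function names are hypothetical, and the dates are invented for the example rather than taken from the post.

```python
from collections import Counter
from datetime import date


def full_months_between(start, end):
    """Whole months elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the last month is not complete yet
    return months


def count_by_months(completed_dates, today):
    """Group dates by whole months elapsed and count each group."""
    return Counter(full_months_between(d, today) for d in completed_dates)


# Hypothetical data, fixed "today" so the result is reproducible.
today = date(2024, 6, 15)
completed = [date(2024, 4, 10), date(2024, 4, 20), date(2024, 3, 15), date(2024, 3, 1)]

for months, count in sorted(count_by_months(completed, today).items()):
    print(months, "=>", count)
```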

Using sum when grouping by one or more columns in LINQ

I have two tables, shown below.
Votes on comments:
VoteId  VoteValue  UserId  CommentId  DateAdded
1       1          1       1          10/11/2013
2       1          5       1          10/14/2013
3       1          9       2          09/08/2013
4       1          11      3          01/03/2014
Points that users receive:
PointId  Date        PointValue  UserId
1        10/11/2013  1           1
2        10/14/2013  1           5
3        09/08/2013  1           9
4        01/03/2014  1           11
I need to find the 10 users who received the most votes each month across all comments. First I tried to write a LINQ query like this:
var object = (db.Comments.
Where(c => c.ApplicationUser.Id == comment.ApplicationUser.Id).
FirstOrDefault()).ToList();
I can't work out how to use sum and add the points to my table. Any help?
I hope it's clear.
First, extract the month from the datetime value and group by it; then take the sum of the votes for each user, order by that sum descending, and use Take(10) at the end.
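The steps in this answer (extract the month, group, sum, order, take 10) can be sketched in Python as follows. This is an illustration of the grouping logic, not LINQ: the function name top_users_per_month is hypothetical, the rows are copied from the votes table in the question, DateAdded is assumed to be MM/DD/YYYY, and UserId is assumed to identify the user credited with the vote.

```python
from collections import defaultdict
from datetime import datetime

# (VoteId, VoteValue, UserId, CommentId, DateAdded) copied from the question.
VOTES = [
    (1, 1, 1, 1, "10/11/2013"),
    (2, 1, 5, 1, "10/14/2013"),
    (3, 1, 9, 2, "09/08/2013"),
    (4, 1, 11, 3, "01/03/2014"),
]


def top_users_per_month(votes, limit=10):
    """Map (year, month) to the top `limit` (user_id, vote_sum) pairs."""
    totals = defaultdict(lambda: defaultdict(int))
    for _vote_id, value, user_id, _comment_id, date_added in votes:
        added = datetime.strptime(date_added, "%m/%d/%Y")
        totals[(added.year, added.month)][user_id] += value

    # Highest sum first; ties are broken by the smaller user id.
    return {
        month: sorted(users.items(), key=lambda item: (-item[1], item[0]))[:limit]
        for month, users in totals.items()
    }


for month, ranking in sorted(top_users_per_month(VOTES).items()):
    print(month, ranking)
```

Grouping on (year, month) rather than month alone keeps October 2013 separate from October of any other year.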
