Using psql, how do you get the total sum for the last 3 days, on each day?

I have a table that contains all purchases made at each school. I’m able to get the total spent per school, per item, per day with the following query:
SELECT
    date,
    school_id,
    item_id,
    sum(price) as total_price
FROM purchases
GROUP BY school_id, item_id, date
ORDER BY school_id, date
It will return something like:
date | school_id | item_id | total_price
2016-11-18 | 1 | 1 | 0.50
2016-11-17 | 1 | 2 | 1.00
2016-11-16 | 1 | 1 | 0.50
2016-11-18 | 2 | 2 | 1.00
2016-11-17 | 2 | 2 | 1.00
2016-11-16 | 2 | 2 | 1.00
I need a table that returns, for each day, the total price for the last 3 days (including that day), something like:
date | school_id | item_id | total_price
2016-11-18 | 1 | 1 | 1.00
2016-11-17 | 1 | 2 | 1.00
2016-11-16 | 1 | 1 | 0.50
2016-11-18 | 2 | 2 | 3.00
2016-11-17 | 2 | 2 | 2.00
2016-11-16 | 2 | 2 | 1.00
I know I can use lag() OVER (PARTITION BY ...), but I may need to do this for months at a time instead of 3 days, and lag would take forever to set up.
I’m not really sure what other method I can use. Any guidance?

A simple INNER JOIN would do.
You join the table to itself where the school and item match and the date falls within the 3-day range.
Notice that this gives a moving sum over the last 3 calendar days, which seems to be what your question asks for, since you want consecutive days without gaps.
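SELECT
    p1.date,
    p1.school_id,
    p1.item_id,
    SUM(p2.price) AS total_price_3_days
-- p1 lists each (school, item, day) once so the self-join does not double-count
FROM (SELECT DISTINCT school_id, item_id, date FROM purchases) p1
INNER JOIN purchases p2
    ON p1.school_id = p2.school_id
    AND p1.item_id = p2.item_id
    -- 3-day window including the current day: that day plus the 2 days before it
    AND p2.date BETWEEN p1.date - INTERVAL '2 days' AND p1.date
GROUP BY p1.school_id, p1.item_id, p1.date
ORDER BY p1.school_id, p1.date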
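To try the self-join idea without a Postgres server, here is a minimal sketch that runs the same shape of query in SQLite through Python's built-in sqlite3 module. The sample rows, the `rolling_totals` helper, and the SQLite date arithmetic (`date(p1.date, '-2 days')` in place of the Postgres interval) are my own assumptions for illustration, not part of the original answer.

```python
import sqlite3

# Hypothetical individual purchases: (date, school_id, item_id, price).
# School 2 has two purchases on 2016-11-18 to show that duplicates are handled.
SAMPLE_PURCHASES = [
    ("2016-11-18", 1, 1, 0.50),
    ("2016-11-17", 1, 2, 1.00),
    ("2016-11-16", 1, 1, 0.50),
    ("2016-11-18", 2, 2, 0.25),
    ("2016-11-18", 2, 2, 0.75),
    ("2016-11-17", 2, 2, 1.00),
    ("2016-11-16", 2, 2, 1.00),
]

# p1 holds each (date, school, item) once; p2 supplies every purchase that
# falls in the 3-day window ending on p1.date (that day plus the 2 before it).
ROLLING_SQL = """
SELECT p1.date, p1.school_id, p1.item_id, SUM(p2.price) AS total_price_3_days
FROM (SELECT DISTINCT date, school_id, item_id FROM purchases) AS p1
JOIN purchases AS p2
  ON p2.school_id = p1.school_id
 AND p2.item_id = p1.item_id
 AND p2.date BETWEEN date(p1.date, '-2 days') AND p1.date
GROUP BY p1.school_id, p1.item_id, p1.date
ORDER BY p1.school_id, p1.date DESC
"""


def rolling_totals(purchases):
    """Load the purchases into an in-memory table and run the rolling query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE purchases "
            "(date TEXT, school_id INTEGER, item_id INTEGER, price REAL)"
        )
        conn.executemany("INSERT INTO purchases VALUES (?, ?, ?, ?)", purchases)
        return conn.execute(ROLLING_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in rolling_totals(SAMPLE_PURCHASES):
        print(row)
```

With these sample rows the per-item totals line up with the expected table in the question. To cover months instead of days, only the `'-2 days'` modifier would need to change.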

One approach would be to just use a correlated subquery in the select clause:
SELECT
    date,
    school_id,
    item_id,
    (SELECT SUM(p2.price) FROM purchases p2
     WHERE p1.school_id = p2.school_id AND
           -- that day plus the 2 days before it = 3 days in total
           p2.date BETWEEN p1.date - INTERVAL '2 days' AND p1.date) AS total_price
FROM purchases p1
GROUP BY school_id, item_id, date
ORDER BY school_id, date DESC;
Another approach would be to take advantage of Postgres' window functions:
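SELECT
    date,
    school_id,
    item_id,
    -- inner SUM is the GROUP BY aggregate, outer SUM is the window function
    SUM(SUM(price)) OVER (PARTITION BY school_id
        -- current row plus the 2 rows before it; ROWS counts rows, not calendar days
        ORDER BY date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS total_price
FROM purchases p1
GROUP BY school_id, item_id, date
ORDER BY school_id, date DESC;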
Both queries generate the same output on your sample data.
Note that my output for school_id = 1 does not agree with your expected output, but I think your expected data has a typo.
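One thing worth checking before picking between the two approaches is how they behave when a day has no purchases. A date-range condition looks at calendar days, while a ROWS frame looks at the previous rows, whatever their dates are. The sketch below illustrates the difference in plain Python; the function names and the sample data with a gap are assumptions made up for this example.

```python
from datetime import date, timedelta


def rolling_sum_by_days(daily_totals, days=3):
    """Sum each day with the (days - 1) calendar days before it.

    Days that are missing from the input count as 0.
    """
    return {
        day: sum(
            daily_totals.get(day - timedelta(days=offset), 0.0)
            for offset in range(days)
        )
        for day in daily_totals
    }


def rolling_sum_by_rows(daily_totals, rows=3):
    """Sum each day with the (rows - 1) previous rows, ignoring date gaps."""
    ordered = sorted(daily_totals)
    result = {}
    for index, day in enumerate(ordered):
        window = ordered[max(0, index - rows + 1): index + 1]
        result[day] = sum(daily_totals[d] for d in window)
    return result


# Hypothetical daily totals for one school, with no purchases on Nov 15 and 16.
GAPPY_TOTALS = {
    date(2016, 11, 14): 1.0,
    date(2016, 11, 17): 1.0,
    date(2016, 11, 18): 1.0,
}

if __name__ == "__main__":
    print(rolling_sum_by_days(GAPPY_TOTALS))
    print(rolling_sum_by_rows(GAPPY_TOTALS))
```

For 2016-11-18 the calendar-day version gives 2.0 and the row-based version gives 3.0, because the row-based window reaches back to 2016-11-14. If every day has at least one row, the two agree.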

Related

Count values without having to specify each

I need to count how many times each id repeats, without having to specify each id. In my case I need to know how many customers came 3 times or more in a month. Here is an example of the data I'm working from:
customers| id
------------------
person 1 | 2433340
person 2 | 3457548
person 3 | 3457584
person 4 | 4343218
person 4 | 4343218
person 4 | 4343218
person 3 | 3457584
And this one is the one that I need to fill:
Times that customers come
--------------------------
1 time | 2
2 times | 1
3 times | 1
I have used:
Formula in D2:
=QUERY(QUERY(B2:B,"Select Count(B) where B is not null group by B label Count(B) 'Times'"),"Select Col1, count(Col1) group by Col1 label count(Col1) 'Count'")
I would work with a helper column (D) to count visits per person. Then it is pretty easy to count the "x times".
Values in column F are numbers formatted as "0 "time""
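The nested QUERY above is a "count of counts": first count visits per id, then count how many ids share each visit count. As a rough cross-check outside Sheets, the same two steps can be sketched in Python with collections.Counter. The `visit_frequency` name is made up here; the ids are the ones from the question.

```python
from collections import Counter

# The ids from the question's example column.
IDS = [2433340, 3457548, 3457584, 4343218, 4343218, 4343218, 3457584]


def visit_frequency(ids):
    """Map 'number of visits' to 'how many customers visited that many times'."""
    visits_per_customer = Counter(ids)               # inner QUERY: count per id
    frequency = Counter(visits_per_customer.values())  # outer QUERY: count per count
    return dict(sorted(frequency.items()))


if __name__ == "__main__":
    for times, customers in visit_frequency(IDS).items():
        print(f"{times} time(s): {customers}")
```

For the sample ids this yields 2 customers with one visit, 1 with two visits, and 1 with three visits, which is the table the question asks for.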

Join two tables and count values from second table in Google Sheets

I have two Sheets with data like so-
Sheet 1:
A
--------
1 | Name |
2 | sue |
3 | bob |
4 | mary |
5 | john |
Sheet 2:
A B C D
---------------------------------------------
1 | ID | Asignee | Due | Days Left |
2 | ID001 | sue, bob | 1 | 5 |
3 | ID002 | sue, mary | 2 | 8 |
4 | ID003 | bob | 3 | 2 |
5 | ID004 | bob, john | 1 | 9 |
6 | ID005 | bob, mary, john | 4 | 1 |
7 | ID006 | sue, bob | 1 | 8 |
8 | ID007 | john, sue, mary | 2 | 6 |
On a 3rd sheet, I want to join and combine the data to get some totals/counts.
Sheet 3:
A B C D
---------------------------------------------------------
1 | Name | Number Rows | Total Due | Minimum of Days Left |
2 | sue | 4 | 6 | 5 |
3 | bob | 5 | 10 | 1 |
4 | mary | 3 | 8 | 1 |
5 | john | 3 | 7 | 1 |
For the 3rd sheet:
It has the same # of rows and values as Sheet 1
Column Sheet 3!B is the # of rows in Sheet 2 where Sheet 2!B contains Sheet 1!A (or Sheet 3!A)
There are 4 rows in Sheet 2 where Sheet 2!B contains sue
There are 5 rows in Sheet 2 where Sheet 2!B contains bob
There are 3 rows in Sheet 2 where Sheet 2!B contains mary
There are 3 rows in Sheet 2 where Sheet 2!B contains john
Column Sheet 3!C is the total of Sheet 2!C where Sheet 2!B contains Sheet 1!A (or Sheet 3!A)
Column Sheet 3!D is the smallest value of Sheet 2!D where Sheet 2!B contains Sheet 1!A (or Sheet 3!A)
I've been staring at a blank sheet and am not sure where to start. I think I have to use filter, and arrayformula but I'm not sure how or where to start.
=ARRAYFORMULA(QUERY(SPLIT(TRIM(TRANSPOSE(SPLIT(TEXTJOIN("♀", 1,
IF(IFERROR(SPLIT(B2:B, ","))<>"",
SPLIT(B2:B, ",")&"♦"&C2:C&"♦"&D2:D, )), "♀"))), "♦"),
"select Col1,count(Col1),sum(Col2),min(Col3)
group by Col1
label count(Col1)'',sum(Col2)'',min(Col3)''"))
Edit by @IMtheNachoMan to add details on why/how I think the above formula works:
split the values in column B and concatenate the values in column C and column D with an arbitrary value that is assured not to be used in any of the columns
because everything is wrapped in an arrayformula, each value from the column B split will get concatenated
splitting column B will create an error for rows that don't have a value in column B
so the if and iferror will check if the split will create an error and, if it does, return null instead of the concatenated string from the first bullet
at this point we have one row for each row in the source table with column B split and concatenated with columns C and D
join all the rows using a second arbitrary value that is assured not to be in any of the columns
be sure to ignore empty values
empty values will be there from the rows that didn't have any values in the split from the first bullet
split the joined data (that doesn't have empty rows because of the previous bullet) on the second arbitrary value that was used
transpose it back into rows
trim each row to remove spaces (not sure how/where the spaces got added though)
split the column in each row with the first arbitrary value
use this as the input for a query call and use aggregate functions to get the data we want
If you really need to preserve the order of the names in Sheet 1, use:
=ARRAYFORMULA(IFERROR(VLOOKUP(Sheet1!A2:A,
QUERY(SPLIT(TRIM(TRANSPOSE(SPLIT(TEXTJOIN("♀", 1,
IF(IFERROR(SPLIT(B2:B, ","))<>"",
SPLIT(B2:B, ",")&"♦"&C2:C&"♦"&D2:D, )), "♀"))), "♦"),
"select Col1,count(Col1),sum(Col2),min(Col3)
group by Col1
label count(Col1)'',sum(Col2)'',min(Col3)''"), {1, 2, 3, 4}, 0)))
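For anyone who finds the formula hard to follow, the underlying idea (split the assignee list, emit one row per name, then aggregate per name) can be sketched in Python. This is only an illustration of the logic, not a replacement for the formula; the `summarize` function and the tuple layout are assumptions, and the data is copied from Sheet 1 and Sheet 2 in the question.

```python
from collections import defaultdict

# Sheet 2 rows as (assignees, due, days_left); IDs are not needed here.
SHEET2 = [
    ("sue, bob", 1, 5),
    ("sue, mary", 2, 8),
    ("bob", 3, 2),
    ("bob, john", 1, 9),
    ("bob, mary, john", 4, 1),
    ("sue, bob", 1, 8),
    ("john, sue, mary", 2, 6),
]

# Sheet 1 names, in the order the result should keep.
SHEET1_NAMES = ["sue", "bob", "mary", "john"]


def summarize(rows, names):
    """Return (name, row count, total due, minimum days left) per name."""
    stats = defaultdict(lambda: {"rows": 0, "due": 0, "min_left": None})
    for assignees, due, days_left in rows:
        # Split on commas and trim, like SPLIT + TRIM in the formula.
        for name in (part.strip() for part in assignees.split(",")):
            if not name:
                continue  # skip empty pieces, like the IFERROR guard
            entry = stats[name]
            entry["rows"] += 1
            entry["due"] += due
            if entry["min_left"] is None or days_left < entry["min_left"]:
                entry["min_left"] = days_left
    # Walk the Sheet 1 names so the output keeps their order (the VLOOKUP step).
    return [
        (name, stats[name]["rows"], stats[name]["due"], stats[name]["min_left"])
        for name in names
        if name in stats
    ]


if __name__ == "__main__":
    for row in summarize(SHEET2, SHEET1_NAMES):
        print(row)
```

The result matches the Sheet 3 table from the question, which is a handy way to sanity-check the formula's output.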

Google Spreadsheet, for each distinct item in column A, Sum columns B through E

I'm trying to make a spreadsheet for Zelda: Breath of the Wild where I count how many of each item I need to build a bunch of different armor sets.
Here is a link to the spreadsheet I'm working with, read only: https://docs.google.com/spreadsheets/d/161OmMq46BJuXN5KopDFvs7RwRFinKIM8AoKYhiB0ew0/edit?usp=sharing
Column A has a list of material names, and some of them repeat.
Columns B through E have a number indicating how many of the item in Column A is needed.
Each column, B through E, represents a level 1 through 4.
Column F has a =SUM(B{ROWNUM}:E{ROWNUM})
So I can have:
| Level 1 | Level 2 | Level 3 | Level 4 | Total
--------------------------------------------------------
Item one | 5 | 10 | 0 | 0 | 15
--------------------------------------------------------
Item two | 0 | 5 | 8 | 0 | 13
--------------------------------------------------------
Item one | 0 | 0 | 10 | 15 | 25
And what I want is:
| Total
-------------------
Item one | 40
-------------------
Item two | 13
I'm trying now to do it with Google Script, but I've never used it before, so I don't have anything to show for that as of now.
Query should do it. I assume your data is in Sheet1 and query is in another sheet.
=query(Sheet1!A1:F4,"select A, sum(F) group by A label sum(F) 'Total'")
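What the query does is a plain group-by-and-sum on the item name. If it helps to see the logic spelled out, here is a small Python sketch using the three rows from the question; the `totals_by_item` name is just for this example.

```python
from collections import defaultdict

# (item, level 1, level 2, level 3, level 4) as in the question's example.
ROWS = [
    ("Item one", 5, 10, 0, 0),
    ("Item two", 0, 5, 8, 0),
    ("Item one", 0, 0, 10, 15),
]


def totals_by_item(rows):
    """Sum all level columns for each distinct item name."""
    totals = defaultdict(int)
    for name, *levels in rows:
        totals[name] += sum(levels)
    return dict(totals)


if __name__ == "__main__":
    print(totals_by_item(ROWS))
```

This gives 40 for "Item one" and 13 for "Item two", the same totals the question expects.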

case / if / filter / ? in google sheets query

I have a register that has all transactions with their account id. That register has a "Type" Field, on which I want to pull Summary Data.
Transactions:
| A | B | C | D |
|---------|---------|-------------|------------|
| Account | Amount | Type | Date |
|---------|---------|-------------|------------|
| 1b6f | 44.21 | Charge | 2016-09-01 |
| 5g0p | 101.57 | Charge | 2016-09-01 |
| 5g0p | 21.53 | Upgrade Fee | 2016-09-01 |
| 5g0p | -123.10 | Payment | 2016-09-07 |
| 1b6f | 4.43 | Late Fee | 2016-10-01 |
| 1b6f | 4.87 | Late Fee | 2016-11-01 |
I would like a single query that will allow me to pull all of the following summary info:
Account
Current Balance
Total Charges
Last Charge Date
First Charge Date
Total Fees
Last Fee Date
Total Payments
Last Payment Date
Something like this:
=QUERY(
A:D,
"SELECT A,
SUM(B),
SUM(FILTER(B:B,C:C='Charge')),
MAX(FILTER(D:D,C:C='Charge')),
MIN(FILTER(D:D,C:C='Charge')),
SUM(FILTER(B:B,ISNUMBER(FIND('Fee',C)))),
MAX(FILTER(D:D,ISNUMBER(FIND('Fee',C)))),
SUM(FILTER(B:B,C:C='Payment')),
MAX(FILTER(D:D,C:C='Payment'))
GROUP BY A
label
A 'Account',
SUM(B) 'Current Balance',
SUM(FILTER(B:B,C:C='Charge')) 'Total Charges',
MAX(FILTER(D:D,C:C='Charge')) 'Last Charge Date',
MIN(FILTER(D:D,C:C='Charge')) 'First Charge Date',
SUM(FILTER(B:B,ISNUMBER(FIND('Fee',C)))) 'Total Fees',
MAX(FILTER(D:D,ISNUMBER(FIND('Fee',C)))) 'Last Fee Date',
SUM(FILTER(B:B,C:C='Payment')) 'Total Payments',
MAX(FILTER(D:D,C:C='Payment')) 'Last Payment Date'
"
)
with these results:
| Account | Current Balance | Total Charges | Last Charge Date | First Charge Date | Total Fees | Last Fee Date | Total Payments | Last Payment Date |
|---------|-----------------|---------------|------------------|-------------------|------------|---------------|----------------|-------------------|
| 1b6f | 53.51 | 44.21 | 2016-09-01 | 2016-09-01 | 9.30 | 2016-11-01 | 0 | |
| 5g0p | 0.00 | 101.57 | 2016-09-01 | 2016-09-01 | 21.53 | 2016-09-01 | -123.10 | 2016-09-07 |
Unfortunately, MAX requires a column identifier as input, and it doesn't seem you can use any non-scalar functions at all (such as FILTER) or even any unlisted aggregate functions (such as JOIN).
I am currently using a bunch of separate queries with different WHERE clauses, but it is very slow.
Short answer
In Google Sheets it's not possible to achieve the desired result with a single QUERY. One solution is to optimize your set of separate queries.
Explanation
As you already mentioned, Google Query Language doesn't support non scalar values as arguments of aggregate functions.
One way to optimize the queries is to use FILTER to remove blank rows from the source array for the queries without a WHERE clause.
Example
=QUERY(
FILTER(A:D,LEN(A:A)),
"SELECT Col1, SUM(Col2) GROUP BY Col1 LABEL Col1 'Account', SUM(Col2) 'Current Balance'"
)
Then you could join the results of all the queries by using array syntax:
={query1,query2}
Reference
Google spreadsheet "=QUERY" join() equivalent function?
@trex005, I created a table for you.
Answer is here: https://docs.google.com/spreadsheets/d/1oC9RYfZD4ITnVfksIVFbC0KkadtFma-JUVk2PdiXqIA/edit?usp=sharing
The sample of the main formula is:
=SUMPRODUCT(FILTER(Sheet1!$B$2:$B,Sheet1!$A$2:$A=$A2,Sheet1!$C$2:$C="Charge"))
It will automatically add new IDs from Sheet1 to Sheet2 and count them.
I use the same approach with Google Forms to keep track of my payments, and it works very well.
The formula for last date of "Charges":
=MAX(FILTER(Sheet1!$D$2:$D,Sheet1!$A$2:$A=$A2,Sheet1!$C$2:$C="Charge"))
where A2:A is
=UNIQUE(Sheet1!A2:A)
Pivot is an option:
=QUERY(A:D,"select A, max(D), min(D), sum(B) where B is not null group by A pivot C",0)
It won't give totals though.
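What the question is after is usually called conditional aggregation: one pass over the register, with each row feeding only the totals its Type qualifies for. QUERY cannot express that, but the logic itself is simple, as the Python sketch below tries to show. The field names and the `summarize_accounts` helper are hypothetical, the transactions are the six rows from the question, and Decimal is used only to avoid floating-point noise in the sums.

```python
from datetime import date
from decimal import Decimal

# (account, amount, type, date), copied from the question's register.
TRANSACTIONS = [
    ("1b6f", Decimal("44.21"), "Charge", date(2016, 9, 1)),
    ("5g0p", Decimal("101.57"), "Charge", date(2016, 9, 1)),
    ("5g0p", Decimal("21.53"), "Upgrade Fee", date(2016, 9, 1)),
    ("5g0p", Decimal("-123.10"), "Payment", date(2016, 9, 7)),
    ("1b6f", Decimal("4.43"), "Late Fee", date(2016, 10, 1)),
    ("1b6f", Decimal("4.87"), "Late Fee", date(2016, 11, 1)),
]


def _earliest(current, candidate):
    return candidate if current is None or candidate < current else current


def _latest(current, candidate):
    return candidate if current is None or candidate > current else current


def summarize_accounts(transactions):
    """Build one summary dict per account in a single pass."""
    summary = {}
    for account, amount, kind, when in transactions:
        row = summary.setdefault(account, {
            "current_balance": Decimal("0"),
            "total_charges": Decimal("0"),
            "first_charge_date": None,
            "last_charge_date": None,
            "total_fees": Decimal("0"),
            "last_fee_date": None,
            "total_payments": Decimal("0"),
            "last_payment_date": None,
        })
        row["current_balance"] += amount  # every row counts toward the balance
        if kind == "Charge":
            row["total_charges"] += amount
            row["first_charge_date"] = _earliest(row["first_charge_date"], when)
            row["last_charge_date"] = _latest(row["last_charge_date"], when)
        elif "Fee" in kind:  # same idea as FIND('Fee', C) in the question
            row["total_fees"] += amount
            row["last_fee_date"] = _latest(row["last_fee_date"], when)
        elif kind == "Payment":
            row["total_payments"] += amount
            row["last_payment_date"] = _latest(row["last_payment_date"], when)
    return summary


if __name__ == "__main__":
    for account, row in summarize_accounts(TRANSACTIONS).items():
        print(account, row)
```

In Sheets, the closest equivalents are the per-column FILTER formulas or the pivot query shown in the answers; the sketch is mainly useful for checking that those produce the numbers you expect.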

How to use query in Google spreadsheet to select column instead of row?

I need some help with QUERY.
I tried:
=query(A1:E6, "select * where D = 'Yes'", 0)
It selects the rows that have the value 'Yes' in column D.
My question is: how do I select the columns that have the value 'Yes' in row 2? I tried:
=query(A1:E6, "select * where 2 = 'Yes'", 0)
but it does not work.
To do this with QUERY, you would need to do a double TRANSPOSE, or use an alternative like FILTER.
=TRANSPOSE(QUERY(TRANSPOSE(A1:E6),"select * where Col2 = 'Yes'",0))
or
=FILTER(A1:E6,A2:E2="Yes")
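The double TRANSPOSE works because QUERY can only filter rows: flipping the range turns columns into rows, the filter runs, and the second flip restores the layout. Here is a minimal Python sketch of the same trick; the table contents and the `select_columns` name are made up for illustration.

```python
def select_columns(table, row_index, wanted):
    """Keep only the columns whose cell in the given row equals `wanted`."""
    columns = list(zip(*table))  # first transpose: columns become rows
    kept = [col for col in columns if col[row_index] == wanted]
    return [list(row) for row in zip(*kept)]  # second transpose: back to rows


# Hypothetical 3-row table; row index 1 plays the role of "row 2" in the sheet.
TABLE = [
    ["Ann", "Bob", "Cy"],
    ["Yes", "No", "Yes"],
    [1, 2, 3],
]

if __name__ == "__main__":
    for row in select_columns(TABLE, 1, "Yes"):
        print(row)
```

If no column matches, the function returns an empty list, whereas FILTER in Sheets would show an error, so the two are not identical in edge cases.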
Well it can be so simple, and we can use google query SQL features too:
Excel Table:
A B
1 Qty | 200
2 Stock | QUESS
3 Start | 8/24/2019
4 End | 8/23/2020
5 Today | 8/23/2021
Formula:
=query(Transpose(Sheet6!A1:B5),"select *" )
Output :
Qty | Stock | Start | End | Today
200 | QUESS | 8/24/2019 | 8/23/2020 | 8/23/2021
Note: a similar discussion is here: https://stackoverflow.com/questions/63044551/google-sheets-transpose-query/68901704#68901704
