Rolling sum and count in Analytical Functions - google-sheets

Look at the two queries below. The SUM version works, but the COUNT version does not: the sum returns rolling totals, but the count does not return rolling counts.
-- Query 1: rolling SUM (works as expected)
select date_key, product_key
     , sum(product_key) as SUM1
     , sum(sum(product_key)) over (order by date_key rows between 5 preceding and current row) as ROLLINGSUM1
FROM [DYLAN 1].[dbo].[SALESFACT]
group by date_key, product_key
order by date_key;
-- Query 2: rolling COUNT (does not return the expected rolling numbers)
select date_key, product_key
     , count(product_key) as countProductkey
     , count(count(product_key)) over (order by date_key rows between 5 preceding and current row) as RollingCOUNTPRODKEY
FROM [DYLAN 1].[dbo].[SALESFACT]
group by date_key, product_key
order by date_key;
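One likely explanation, offered as a sketch rather than a confirmed diagnosis: after GROUP BY, each row carries one per-group count, so COUNT(COUNT(...)) OVER (...) counts how many grouped rows fall inside the frame (at most 6 here), whereas SUM(COUNT(...)) OVER (...) would add the per-group counts together. The Python below mimics the "rows between 5 preceding and current row" frame; the sample counts and the helper name `rolling` are hypothetical, not taken from the SALESFACT table.

```python
# Hypothetical per-group counts, one value per (date_key, product_key) group,
# already ordered by date_key.
group_counts = [3, 1, 4, 1, 5, 9, 2, 6]


def rolling(values, preceding, reducer):
    """Apply reducer to each frame: up to `preceding` earlier rows plus the current row."""
    return [reducer(values[max(0, i - preceding): i + 1]) for i in range(len(values))]


# Like COUNT(COUNT(product_key)) OVER (...): counts the grouped rows in the frame.
count_of_counts = rolling(group_counts, 5, len)

# Like SUM(COUNT(product_key)) OVER (...): adds up the per-group counts in the frame.
sum_of_counts = rolling(group_counts, 5, sum)

print(count_of_counts)  # [1, 2, 3, 4, 5, 6, 6, 6]
print(sum_of_counts)    # [3, 4, 8, 9, 14, 23, 22, 27]
```

If the goal is a rolling total of the counts, the second form is probably the one wanted.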

Related

SUMIFS With Text and Date Comparison

In theory, I want to sum the total income each employee made across all the businesses.
Like this:
Employee 1 = Biz 1 income + Biz 2 income + Biz 3 income, etc...
Employee 2 = Biz 1 income + Biz 2 income + Biz 3 income, etc...
Technically and based on the table below, I want to sum a range in column R starting from cell R14 where the text in column W starting from W14 is the same in column P starting from cell P14 AND the name of the month in column V starting from cell V14 is equal to a month in date in column N starting from cell N14.
(I included the date because this is part of a budget planner, so I need to categorize the data based on months.)
I used this formula:
=SUMIFS(R14:R1013, P14:P1013, U14:U1013, TEXT(N14:N1013,"MMMM"),"="&T14:T1013)
But it prompts me with the error: Array arguments to sumifs are of different size
What could be wrong here? Does someone have any idea?
Thanks for your help in advance!
Try wrapping the TEXT formula in ARRAYFORMULA so that it evaluates over the full column:
=SUMIFS(R14:R1013, P14:P1013, U14:U1013, ARRAYFORMULA(TEXT(N14:N1013,"MMMM")),"="&T14:T1013)
You can get the totals for all months and all employees with query(), like this:
=arrayformula(
query(
{ text(N13:N, "yyyy-MM"), O13:R },
"select Col1, Col3, sum(Col5)
where Col3 is not null
group by Col1, Col3",
1
)
)
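To make the grouping in that query concrete, here is a minimal Python sketch of the same idea: key each row on (year-month, employee) and add up the income. The rows are made-up stand-ins for columns N, P and R; none of the values come from the asker's sheet.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: (date, employee, income), standing in for columns N, P and R.
rows = [
    (date(2023, 1, 5), "Employee 1", 100),
    (date(2023, 1, 20), "Employee 1", 50),
    (date(2023, 1, 9), "Employee 2", 70),
    (date(2023, 2, 3), "Employee 1", 30),
]

totals = defaultdict(int)
for day, employee, income in rows:
    # Key on (year-month, employee), like "group by Col1, Col3" in the query.
    totals[(day.strftime("%Y-%m"), employee)] += income

for key in sorted(totals):
    print(key, totals[key])
```

Formatting the date as year-month first is what lets a single pass produce one total per month and employee.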

How to count the number of times a specific column is the maximum value in a row?

I have a spreadsheet that looks like this, which tracks attendance and order. On day 1, the order was [Alice, Bob, Catherine, Dave]. On day 2, the order was [Bob, Dave, Catherine], and Alice was absent:
Date | Alice | Bob | Catherine | Dave
-----+-------+-----+-----------+-----
10/1 | 0     | 1   | 2         | 3
10/2 | x     | 0   | 2         | 1
10/3 | 3     | 1   | 2         | 0
10/4 | 1     | 0   | x         | x
10/5 | 0     | x   | 1         | 2
I am trying to write a formula to get the total number of times each attendee went last. In other words, I want to count the number of times a name in a column is the MAX value for each date row, ignoring any x's. Ideally, I would like a single formula that I could place in a single cell. If successful the resulting table would look like this:
Attendee  | # of times they went last
----------+--------------------------
Alice     | 2
Bob       | 0
Catherine | 1
Dave      | 2
What's the best way to accomplish this?
Find the MAX of each row with BYROW, then compare that max to each attendee's column using REDUCE and OFFSET, and sum the matches:
=LAMBDA(
max,
REDUCE(
{"Attendee","#times"},
B1:E1,
LAMBDA(
a,c,
{a;c,SUMPRODUCT(OFFSET(c,1,0,5)=max)}
)
)
)(BYROW(B2:E6,LAMBDA(r,MAX(r))))
try:
=INDEX(QUERY(BYROW(B2:INDEX(E:E, MAX((A:A<>"")*ROW(A:A))),
LAMBDA(x, TEXTJOIN(, 1, IF(x=MAX(x), B1:E1, )))),
"select Col1,count(Col1) group by Col1 label count(Col1)''"))
To include Bob (attendees who never went last, with a count of 0):
=SORTN({QUERY(BYROW(B2:INDEX(E:E, MAX((A:A<>"")*ROW(A:A))),
LAMBDA(x, TEXTJOIN(, 1, IF(x=MAX(x), B1:E1, )))),
"select Col1,count(Col1) group by Col1 label count(Col1)''");
TRANSPOSE({B1:E1;(B1:E1="")*1})}, 9^9, 2, 1, 1)
One easy way is to add a column with the name of the last attendee. If the reason you want a single formula is to avoid showing extra cells, you can hide the column. Then you count how many times each name appears. This should be the result:
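Date | Alice | Bob | Catherine | Dave | last      | Attendee  | # of times they went last
-----+-------+-----+-----------+------+-----------+-----------+--------------------------
10/1 | 0     | 1   | 2         | 3    | Dave      | Alice     | 2
10/2 | x     | 0   | 2         | 1    | Catherine | Bob       | 0
10/3 | 3     | 1   | 2         | 0    | Alice     | Catherine | 1
10/4 | 1     | 0   | x         | x    | Alice     | Dave      | 2
10/5 | 0     | x   | 1         | 2    | Dave      |           |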
Formula in cell G2 (under "last"), copied to the other cells of the column:
=INDEX(B$1:E$1,1,MATCH(MAX(B2:E2), B2:E2, 0))
It finds the maximum value in that row only, gets its column position, and returns the corresponding name from the first row.
Formula in I2 (under "Attendee"):
=TRANSPOSE(B1:E1)
Transposes all the names from the first row into a column.
Formula in J2, copied to the other cells of the column:
=COUNTIF(G$2:G,I2)
Counts how many times the name appears in column G.
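The counting logic shared by the answers above can be sketched in Python, using the attendance table from the question. This is only an illustration of the steps (ignore the x's, find the row maximum, tally the name), not a replacement for the Sheets formulas.

```python
# Attendance table from the question; "x" marks an absence.
names = ["Alice", "Bob", "Catherine", "Dave"]
rows = [
    [0, 1, 2, 3],
    ["x", 0, 2, 1],
    [3, 1, 2, 0],
    [1, 0, "x", "x"],
    [0, "x", 1, 2],
]

# Start every attendee at 0 so that people who never went last still appear.
last_counts = {name: 0 for name in names}
for row in rows:
    # Keep only the numeric positions, paired with the attendee's name.
    present = [(value, name) for value, name in zip(row, names) if value != "x"]
    if present:
        # The attendee with the highest position went last that day.
        _, last = max(present)
        last_counts[last] += 1

print(last_counts)
# {'Alice': 2, 'Bob': 0, 'Catherine': 1, 'Dave': 2}
```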

How to Divide the Sum of a Range by each Row from the Range in Google Sheets with Query?

I am using Query to pull Columns A and B from another sheet like this:
Query(Tank_List!A1:M716, "select A,E, SUM (E) Where B=1 Group by A,E",1)
Column A | Column B | Column C
---------+----------+---------
Item 1   | 9240     | 9240
Item 2   | 11843    | 11843
Item 3   | 6372     | 6372
Item 4   | 8320     | 8320
Item 5   | 16365    | 16365
Item 6   | 1234     | 1234
Instead of returning the actual sum of ColB (the SUM of the whole range of numbers in ColB), it returns just a copy of ColB on each row.
I've tried several ways but the issue is SUM returns a single total for ColB or as above, the SUM of just the Row.
I am hoping for something like this:
Column A | Column B | Column C | Column D
---------+----------+----------+----------
Item 1   | 9240     | 53374    | ColC/ColB
Item 2   | 11843    | 53374    | ColC/ColB
Item 3   | 6372     | 53374    | ColC/ColB
Item 4   | 8320     | 53374    | ColC/ColB
Item 5   | 16365    | 53374    | ColC/ColB
Item 6   | 1234     | 53374    | ColC/ColB
Where I can do equations based on the original range numbers and the total SUM of that range. I imagine the answer will have to do with ArrayFormula, but I could not make it work myself.
It's hard to tell exactly what you want, as you have columns not listed, but it would probably be easiest to just compute the total with a regular SUM and then pass it into your query. I didn't duplicate your sheet, but used the sample data below and this formula:
=Query(A:B, "Select A, Sum(B), "&sum(B:B)&",SUM (B)/"&sum(B:B)&" where A is not null Group by A",1)
You can see the output pasted in cell E1 which has the division done in column G.
Your Sample Data
Column A | Column B
---------+---------
Item 1   | 9240
Item 2   | 11843
Item 3   | 6372
Item 4   | 8320
Item 5   | 16365
Item 6   | 1234
Use FILTER instead of QUERY:
=ARRAYFORMULA({FILTER({Tank_List!A1:A,Tank_List!E1:E},Tank_List!B1:B=1),FILTER(Tank_List!E1:E,Tank_List!B1:B=1)/SUMIF(Tank_List!B1:B,1,Tank_List!E1:E)})
The first part of the array retrieves columns A and E; the second part divides each value in E by the total computed with SUMIF.
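As a rough illustration of what both answers compute, the Python sketch below takes the sample values from the question, computes the grand total once, and then divides each row by it. Note the direction of the division: the answers return each value divided by the total (its share), which is an interpretation of what the asker wants.

```python
# Sample values from the question (Item 1 .. Item 6).
items = {
    "Item 1": 9240,
    "Item 2": 11843,
    "Item 3": 6372,
    "Item 4": 8320,
    "Item 5": 16365,
    "Item 6": 1234,
}

# Compute the grand total once, outside the per-row work (the role of SUM/SUMIF).
total = sum(items.values())

# Each row keeps its own value and gets its share of the total.
shares = {name: value / total for name, value in items.items()}

print(total)  # 53374
for name, share in shares.items():
    print(name, items[name], round(share, 4))
```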

Count values without having to specify each

I need to count how many times each id repeats, without having to specify each id. In my case, I need to know how many customers came 3 or more times in a month. Here is an example of the data I'm working from:
customers| id
------------------
person 1 | 2433340
person 2 | 3457548
person 3 | 3457584
person 4 | 4343218
person 4 | 4343218
person 4 | 4343218
person 3 | 3457584
And this is the table that I need to fill:
Times that customers come
--------------------------
1 time | 2
2 times | 1
3 times | 1
I have used:
Formula in D2:
=QUERY(QUERY(B2:B,"Select Count(B) where B is not null group by B label Count(B) 'Times'"),"Select Col1, count(Col1) group by Col1 label count(Col1) 'Count'")
I would work with a helper column (D) to count visits per person. Then it is pretty easy to count the "x times".
Values in column F are numbers formatted as "0 "time""
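The nested QUERY above is a "count of counts": first count visits per id, then count how many ids share each visit count. A minimal Python sketch of the same two steps, using the ids from the question, might look like this:

```python
from collections import Counter

# The id column from the question.
ids = [2433340, 3457548, 3457584, 4343218, 4343218, 4343218, 3457584]

# Step 1: visits per id (the inner QUERY, or the helper column).
visits_per_id = Counter(ids)

# Step 2: how many ids share each visit count (the outer QUERY).
customers_per_visit_count = Counter(visits_per_id.values())

print(sorted(customers_per_visit_count.items()))
# [(1, 2), (2, 1), (3, 1)]
```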

Clean cumulative sum alongside grouped sum

I am working in PostgreSQL 9.6.6
For the sake of reproducibility, I'll use create temporary table to create a "constant" table to play with:
create temporary table test_table as
select * from
(values
('2018-01-01', 2),
('2018-01-01', 3),
('2018-02-01', 1),
('2018-02-01', 2))
as t (month, count)
A select * from test_table returns the following:
month | count
------------+-------
2018-01-01 | 2
2018-01-01 | 3
2018-02-01 | 1
2018-02-01 | 2
The desired output is the following:
month | sum | cumulative_sum
------------+-----+----------------
2018-01-01 | 5 | 5
2018-02-01 | 3 | 8
In other words, the values have been summed, grouping by month, and then the cumulative sum is displayed in another column.
The issue is that the only way I know to achieve this is somewhat convoluted. The grouped sum must be computed first (with a subselect or a WITH statement), and then the running tally is computed with a select statement against that result, like so:
with sums as
(select month,
sum(count) as sum
from test_table
group by 1)
select month,
sum,
sum(sum) over (order by month) as cumulative_sum
from sums
What I wish could work would be something more like...
select month,
sum(count) as sum,
sum(count) over (order by month) as cumulative_sum
from test_table
group by 1
But this returns
ERROR: column "test_table.count" must appear in the GROUP BY clause or be used in an aggregate function
LINE 3: sum(count) over (order by month) as cumulative_sum
No amount of fussing with the group by clause seems to satisfy PSQL.
TL;DR: is there a way in PSQL to compute both a sum over groups and the cumulative sum over groups using just a single select statement? More generally, is there a "preferred" way to accomplish this, beyond the method I use in this question?
Your hunch to use SUM as an analytic function was on the right track, but you need to take the analytic sum of the aggregate sum:
SELECT month,
SUM(count) as sum,
SUM(SUM(count)) OVER (ORDER BY month) AS cumulative_sum
FROM test_table
GROUP BY 1;
As to why this works: analytic (window) functions are applied after the GROUP BY clause has been evaluated, so the aggregate sum is already available when the rolling sum is taken.
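That evaluation order can be mimicked in Python as two separate steps over the question's test_table rows. This is only a sketch of the order of operations, not how PostgreSQL executes the query internally.

```python
from itertools import accumulate

# Rows of test_table from the question: (month, count).
rows = [
    ("2018-01-01", 2),
    ("2018-01-01", 3),
    ("2018-02-01", 1),
    ("2018-02-01", 2),
]

# Step 1 (GROUP BY month): collapse to one summed row per month.
monthly = {}
for month, count in rows:
    monthly[month] = monthly.get(month, 0) + count

# Step 2 (window function): running total over the grouped rows, ordered by month.
months = sorted(monthly)
sums = [monthly[m] for m in months]
result = list(zip(months, sums, accumulate(sums)))

print(result)
# [('2018-01-01', 5, 5), ('2018-02-01', 3, 8)]
```

The running total in step 2 only ever sees the grouped sums, which is why SUM(SUM(count)) OVER (...) is accepted while SUM(count) OVER (...) is rejected.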
