SUMPRODUCT of last nth values with criteria - google-sheets

I have two columns (see screenshot)
How can i create a formula that sum the second LATEST column values with a criteria from column A?
For example i need the sum of the last 6 values (from column B) of the cells (in column A) that start with HH, so values starting from the bottom.
I know how to make a sum of all values (from column B) containing HH (from column A)
=SUMIF(A1:A;"HH"&"*";B1:B)
P.S. HH and * are separate because i'll substitute the HH with a cell
but now i need to delimit this to the last N values (let say last 3 values)
P.P.S.
=SUMPRODUCT((COUNTIFS(A1:A;"exact text";ROW(A1:A)*{1;1};">="&ROW(A1:A)*{1;1})<=3)*(A1:A="exact text");B1:B)
This works so far ONLY if i write the exact text, not with values like HH*

Maybe try
=sum(index(query({row(A1:A), A1:B}, "Select Col3 where Col2 contains 'HH' order by Col1 desc limit 6")))
and see if that works?
Note:
*the string HH can be also be in a cell (ex. D1)
=sum(index(query({row(A1:A), A1:B}, "Select Col3 where Col2 contains '"&D1&"' order by Col1 desc limit 6")))
*6 indicates the number of values you want to sum
EDIT: For your locale you'll need to use in G1
=sum(index(query({row($B$1:$B) \ $B$1:$C}; "Select Col3 where Col2 contains '"&E2&"' order by Col1 desc limit 3")))
and fill down. See if that works?

This should also work, not sure if it's any easier to understand/ less complicated than any other approach:
=SUM(SORTN(REGEXMATCH(B:B;E2)*C:C;3;0;ROW(B:B)*REGEXMATCH(B:B;E2);0))
Note the number 3 for the number of values you want from the bottom. and the reference to E2, which is "HH" as on your sample sheet.

use:
=QUERY(FILTER({IFNA(REGEXEXTRACT(SORT(B2:B; ROW(B2:B); 0);
"^([A-Za-z]{1,3})\d"))\SORT(C2:C; ROW(B2:B); 0)}; COUNTIFS(
REGEXEXTRACT(SORT(B2:B; ROW(B2:B); 0); "^([A-Za-z]{1,3})\d");
REGEXEXTRACT(SORT(B2:B; ROW(B2:B); 0); "^([A-Za-z]{1,3})\d");
ROW(H2:H43); "<="&ROW(H2:H43))<=3);
"select Col1,sum(Col2) group by Col1 label sum(Col2)''")
full explanation here

Related

Google Sheets - Trying to use IMPORTRANGE to get student enrollments by school by date

I made columns that simply lists all the schools for each student under a date column header. I only need to count the schools - I don't want to see any student data. I have a form that needs to be submitted to the state and there is a cell with the date we are submitting. I want to use an IMPORTRANGE to go to the other spreadsheet and find the column where the date matches the form date and pull back the count of schools in that column. I tried using INDEX/MATCH, etc., and while I have those working on the form for other data already on my spreadsheet - I can't seem to get them to work on the 'outside' spreadsheet. Any help is appreciated! I'm attaching an example spreadsheet with more explanation. I hope it's clear. Thank you.
Example Document
ztiaa has provided a good solution. I will suggest another which pivots the data differently. I feel that having the dates go across columns is going to quickly become unwieldy. So this approach leaves the dates running down the right side with the schools running as column headers. My solution also includes exceptions such as snow days, as those will likely be important to see at a glance as well. Just be careful to use the same notation for such exceptions. For instance, always use "snowday" (not, eg., "snow day") or "prep day" (not, e.g., "prep").
In Sheet1, use IMPORTRANGE to bring in the data exactly as you have it listed in your sample sheet (i.e., dates at top, school names or exceptions under each date header).
Then, in a new sheet, place the following formula in cell A1:
=ArrayFormula({QUERY(SPLIT(FLATTEN(FILTER(Sheet1!1:1&"~"&INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!1:1<>"")),"~",1,0),"Select Col1, COUNT(Col1) WHERE Col2 Is Not Null GROUP BY Col1 PIVOT Col2 FORMAT Col1 'mmm dd'"),TRANSPOSE({"District Totals",MMULT(SEQUENCE(1,ROWS(Sheet1!A2:A),1,0),IF(FILTER(INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!1:1<>"")="",0,1))})})
This assumes that the first sheet is, in fact, named Sheet1. If it is not, you'll need to change every instance of that sheet name in the formula.
It will probably further aid visual ease if you freeze Row 1 and Column 1 (View > Freeze > 1 row / 1 column).
ADDENDUM (after seeing realistic copy of OP's sheet)
=ArrayFormula({TRANSPOSE(QUERY(SPLIT(FLATTEN(FILTER(Sheet1!1:1&"~"&INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")),"~",1,0),"Select Col1, COUNT(Col1) WHERE Col2 Is Not Null GROUP BY Col1 PIVOT Col2 FORMAT Col1 'mmm dd'"));{"District Totals",MMULT(SEQUENCE(1,ROWS(Sheet1!A2:A),1,0),IF(FILTER(INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")<>"",1,0))}})
If you prefer to see the name of the exception (e.g., "snowday") rather than the count of 1 for each exception in the "District totals" line:
=ArrayFormula({TRANSPOSE(QUERY(SPLIT(FLATTEN(FILTER(Sheet1!1:1&"~"&INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")),"~",1,0),"Select Col1, COUNT(Col1) WHERE Col2 Is Not Null GROUP BY Col1 PIVOT Col2 FORMAT Col1 'mmm dd'"));{"District Totals",IF(MMULT(SEQUENCE(1,ROWS(Sheet1!A2:A),1,0),IF(FILTER(INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")<>"",1,0))=1,FILTER(Sheet1!2:2,Sheet1!2:2<>""),MMULT(SEQUENCE(1,ROWS(Sheet1!A2:A),1,0),IF(FILTER(INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")<>"",1,0)))}})
If you want exceptions (e.g., "snowday") to always appear at the bottom instead of in alphabetical order within the school-name list:
=ArrayFormula({TRANSPOSE(QUERY(SPLIT(FLATTEN(FILTER(Sheet1!1:1&"~"&IF(REGEXMATCH(UPPER(INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A))),INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A))),,"‡")&INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")),"~",1,0),"Select Col1, COUNT(Col1) WHERE Col2 Is Not Null AND Col2 <> '‡' GROUP BY Col1 PIVOT Col2 FORMAT Col1 'mmm dd'"));{"District Totals",IF(MMULT(SEQUENCE(1,ROWS(Sheet1!A2:A),1,0),IF(FILTER(INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")<>"",1,0))=1,FILTER(Sheet1!2:2,Sheet1!2:2<>""),MMULT(SEQUENCE(1,ROWS(Sheet1!A2:A),1,0),IF(FILTER(INDIRECT("Sheet1!2:"&ROWS(Sheet1!A:A)),Sheet1!2:2<>"")<>"",1,0)))}})
Try this out
=ArrayFormula({query(split(flatten(text(A1:E1,"mmm/d")&"❄️"&A2:E),"❄️"),"select Col2, count(Col2) where not Col2 matches '|Snowday' group by Col2 pivot Col1");{"District Total",transpose(MMult(transpose(N(filter(A2:E,not(RegexMatch(A2:E2,"Snowday")))<>"")),sequence(rows(A2:E))^0))}})

How do I sum the last 10 rows of column "D". with the result in a cell in column "F"?

I have been using this and it works, but everyday I add a new row and have to insert a row above this formula, that's why I want it to display in column "F" somewhere:
=SUM(INDIRECT(ADDRESS(ROW()-10,COLUMN(),4) &":"& ADDRESS(ROW()-1,COLUMN(),4)))
You can try SUMPRODUCT:
=SUMPRODUCT((ROW(A:A)<=COUNTA(A:A))*(ROW(A:A)>COUNTA(A:A)-10)*A:A)
There can't be blanks in A:A
If your range of values in D:D is contiguous (i.e., no blanks) and begins at the top of the column (with or without a header), you can use this:
=ArrayFormula(SUMIF(ROW(D:D),">"&COUNTA(D:D)-10,D:D))
If there are interspersed blanks in the range, you can use either of these:
=SUM(SORTN(FILTER(D:D,ISNUMBER(D:D)),10,0,FILTER(ROW(D:D),ISNUMBER(D:D)),0))
=SUM(QUERY(FILTER({D:D,ROW(D:D)},ISNUMBER(D:D)),"Select Col1 ORDER BY Col2 DESC LIMIT 10"))
use:
=SUM(QUERY(SORT(D2:D; ROW(D2:D); );
"where Col1 is not null limit 10"; ))

How can I avoid having to put 0s into the NULL fields to get a correct query calculation in Google Sheets

I have a Google Sheets question, which I have not been able to figure out yet with Google-Fu and RTFM:
Take the following spreadsheet as an example:
https://docs.google.com/spreadsheets/d/1IvMVaUdUDfYOoKyG0Uwd2n0M1mLjOTE5yZQ9K2R3q2M/edit?usp=sharing
In case the sheet gets lost in time, I am going to post its contents here:
Sheet1:
foo
withdrawal
deposit
C
4
10
D
10
E
10
4
As you see here, the withdrawal field for the D value being foo is empty, i.e. null
Sheet2:
foo
balance
C
=INDEX(QUERY({Sheet1!$A$2:C}, "SELECT SUM(Col3) - SUM(Col2) WHERE Col1 = '"&A2&"'"), 2)
D
=INDEX(QUERY({Sheet1!$A$2:C}, "SELECT SUM(Col3) - SUM(Col2) WHERE Col1 = '"&A3&"'"), 2)
E
=INDEX(QUERY({Sheet1!$A$2:C}, "SELECT SUM(Col3) - SUM(Col2) WHERE Col1 = '"&A4&"'"), 2)
The result is
foo
balance
C
6
D
E
-6
As you see, the balance field for the category D is null, although it should be -10.
The fix for that is to put a 0 into the deposit field in Sheet1 explicitly.
In my example, I get that data using a csv-export, and fields are generally empty and not 0, and it is cumbersome to add the 0 there. Is there a way to have something like COALESCE in that sum there (like in SQL)?
Please let me know.
it seems like something quite a bit simpler would avoid the problem:
=SUMPRODUCT(Sheet1!C:C-Sheet1!B:B,Sheet1!A:A=A2)
for cell B2.
Why don't you just add this in cell A1 of Sheet2 instead of all the Query:
=arrayformula({Sheet1!A1,"balance";if(Sheet1!A2:A<>"",{Sheet1!A2:A,Sheet1!C2:C-Sheet1!B2:B},)})
Obviously ensure cells Sheet2!A2:A and Sheet2!B1:B are empty.
If you have duplicate values of foo, try:
=arrayformula(query({Sheet1!A1,"balance";if(Sheet1!A2:A<>"",{Sheet1!A2:A,Sheet1!C2:C-Sheet1!B2:B},)},"select Col1,sum(Col2) where Col1 is not null group by Col1 label sum(Col2) 'balance'",1))
A better option for a single-cell formula, referencing multiple sheets would be:
=arrayformula(query(
{Sheet1!A:A,n(Sheet1!B:C);Sheet2!A2:A,n(Sheet2!B2:C);Sheet3!A2:A,n(Sheet3!B2:C)},
"select Col1,sum(Col3)-sum(Col2) where Col1 is not null group by Col1 label sum(Col3)-sum(Col2) 'balance' ",1))

Google sheets Query function with Arrayformula

For each of the email id, I want to get latest 10 records by timestamp. How do I get the results with arrayformula? Query function is not important as long as I can still achieve this with arrayformula. Here is the sample data:
https://docs.google.com/spreadsheets/d/1YAHA02VM-5MXzVKhkxu_eODPKObpoz441mGX8lOFu5M/edit?usp=sharing
Try this on another sheet, row 1:
=arrayformula(query({query({Sheet1!$A:$C},"order by Col1 desc,Col2",1),{"Dupe position";countifs(query({Sheet1!$A2:$C},"select Col2 order by Col1 desc,Col2",0),query({Sheet1!$A2:$C},"select Col2 order by Col1 desc,Col2",0),row(Sheet1!$A2:$C),"<="&row(Sheet1!$A2:$C))}},"select Col1,Col2,Col3 where Col1 is not null and Col4 <= 10 order by Col1",1))
You can adjust the number of records found by adjusting Col4 <= 10, and also the final sort by altering order by Col1 at the end of the formula.
Explanation
This gets the data from Sheet1, sorts it by date desc then email asc:
query({Sheet1!$A:$C},"order by Col1 desc,Col2",1)
Then to the side of this data, a COUNTIFS() is used to get the number each time an email appears in the list above (since it's sorted desc, 1 represents the most recent instance).
countifs(<EmailColumnData>,<EmailColumnData>,row(<EmailColumn>),"<="&row(<EmailColumn>))
In place of <EmailColumnData> in the COUNTIF() is:
query({Sheet1!$A2:$C},"select Col2 order by Col1 desc,Col2",0)
In place of <EmailColumn> above, we only want the row number so we don't need the actual data. We can use:
Sheet1!$A2:$C
Various {} work as arrays to bring the data together.
Eg., {a,b,c;d,e,f} would result in three columns, with a, b, c in row 1 and d, e, f in row 2. , is a new column, ; is a return for a new row.
A final query around everything gets the 3 columns we need, where the count number in col 4 is <=10, then sorts the output by Col1 (date asc).
On second thoughts, maybe this is bit cheeky, but this might do it ( taken from conditional rank idea )
=ArrayFormula(filter(A2:C,countifs(A2:A,">="&A2:A,B2:B,B2:B)<=10,A2:A<>""))
EDIT
The above assumes (because the data is time-stamped) dups shouldn't occur. If they do and the data is pre-sorted, you can use row number as a proxy for time stamp as suggested by #Aresvik.
Alternatively, you could count separately
(a) only rows with a later timestamp
plus
(b) rows with the same time stamp but with earlier (or identical) row number
=ArrayFormula(filter(A2:C,countifs(A2:A,">"&A2:A,B2:B,B2:B)+countifs(A2:A,"="&A2:A,B2:B,B2:B,row(A2:A),"<="&row(A2:A))<=10,A2:A<>""))
I have added a new sheet ("Erik Help") with the following formula in A1:
=ArrayFormula({"Submitted Time","Email","Score";SORT(SPLIT(FLATTEN(QUERY(SORT(TRANSPOSE(SPLIT(TRANSPOSE(QUERY(IF(Sheet1!B2:B=TRANSPOSE(UNIQUE(FILTER(Sheet1!B2:B,Sheet1!B2:B<>""))),Sheet1!A2:A&"|"&Sheet1!B2:B&"|"&Sheet1!C2:C,),,COUNTA(Sheet1!A2:A)))," ",0,1)),SEQUENCE(MAX(COUNTIF(Sheet1!B2:B,Sheet1!B2:B))),0),"LIMIT 10")),"|",1,0),1,0)})
The number of records is set after LIMIT.
The order is set by the final two numbers: 1,0 (meaning "sort by column 1 in reverse order," which, as currently set, is sorting in reverse order by date/time).

How to find the most frequent text value in a whole spreadsheet on Google sheets?

I know the code to find the most frequent value in a single column, but I'm trying to figure out the code that lets me find the most frequent value in multiple specific columns.
For example:
=ARRAYFORMULA(INDEX(A2:A17,MATCH(MAX(COUNTIF(A2:A17,A2:A17)),COUNTIF(A2:A17,A2:A17),0)))
^^this formula only allows me to have two arguments for COUNTIF
=INDEX(A2:A6,MODE(MATCH(A2:A6,A2:A6,0)))
^^this one only allows 4 arguments max in the index
I want an equation that'll allow me to find the most frequent text value for 16 specific columns.
Is that even possible?
BETTER FORMULA
You asked for formula when you have 2 top values with the same frequency
=SORTN(QUERY(FLATTEN(H2:K),
"select Col1, count(Col1)
where Col1 is not null group by Col1
order by count(Col1) desc label count(Col1) '' ",0),1,1,2,0)
First answer
Use only 1 QUERY formula
For top value
=QUERY(FLATTEN(A2:D),
"select Col1, count(Col1)
where Col1 is not null group by Col1
order by count(Col1) desc limit 1",0)
For all values
=QUERY(FLATTEN(A2:D),
"select Col1, count(Col1)
where Col1 is not null group by Col1
order by count(Col1) desc limit 1",0)
I show it in 2 steps then combine steps together in one formula:
First you flatten table and remove duplicates to get a column of values you check frequency for:
unique(flatten(B2:I25))
Then for each one you check number of occurences:
ArrayFormula(countif($B$2:$I$25,unique(flatten(B2:I25))))
Then you have to filter new table and get only rows with highest values:
=filter({first , second column};second column = max(second column)
Using earlier values:
=filter({unique(flatten(B2:I25)),ArrayFormula(countif($B$2:$I$25,unique(flatten(B2:I25))))},ArrayFormula(countif($B$2:$I$25,unique(flatten(B2:I25))))=max(ArrayFormula(countif($B$2:$I$25,unique(flatten(B2:I25))))))
My solution is available here:
https://docs.google.com/spreadsheets/d/1gMZDFWOY8qbA1Bhp8rZ_por3Skb0i_qQtlaSlCTKXgM/copy
If your columns are scattered around your sheet, you should use { } and make a table of them.

Resources