For each email ID, I want to get the latest 10 records by timestamp. How do I get the results with ARRAYFORMULA? Using QUERY is not essential, as long as I can still achieve this with ARRAYFORMULA. Here is the sample data:
https://docs.google.com/spreadsheets/d/1YAHA02VM-5MXzVKhkxu_eODPKObpoz441mGX8lOFu5M/edit?usp=sharing
Try this on another sheet, row 1:
=arrayformula(query({query({Sheet1!$A:$C},"order by Col1 desc,Col2",1),{"Dupe position";countifs(query({Sheet1!$A2:$C},"select Col2 order by Col1 desc,Col2",0),query({Sheet1!$A2:$C},"select Col2 order by Col1 desc,Col2",0),row(Sheet1!$A2:$C),"<="&row(Sheet1!$A2:$C))}},"select Col1,Col2,Col3 where Col1 is not null and Col4 <= 10 order by Col1",1))
You can adjust the number of records found by adjusting Col4 <= 10, and also the final sort by altering order by Col1 at the end of the formula.
Explanation
This gets the data from Sheet1, sorts it by date desc then email asc:
query({Sheet1!$A:$C},"order by Col1 desc,Col2",1)
Then, to the side of this data, a COUNTIFS() numbers each occurrence of an email in the sorted list above (since the list is sorted by date descending, 1 represents the most recent instance).
countifs(<EmailColumnData>,<EmailColumnData>,row(<EmailColumn>),"<="&row(<EmailColumn>))
In place of <EmailColumnData> in the COUNTIFS() is:
query({Sheet1!$A2:$C},"select Col2 order by Col1 desc,Col2",0)
In place of <EmailColumn> above, we only want the row number so we don't need the actual data. We can use:
Sheet1!$A2:$C
Various {} work as arrays to bring the data together.
E.g., {a,b,c;d,e,f} would result in three columns, with a, b, c in row 1 and d, e, f in row 2. A comma (,) starts a new column; a semicolon (;) starts a new row.
A final query around everything gets the 3 columns we need, where the count number in col 4 is <=10, then sorts the output by Col1 (date asc).
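The steps explained above (sort newest first, number each email's occurrences, keep the low numbers, re-sort) can also be sketched outside Sheets. The Python below is only an illustration of that logic, not a translation of the formula; the function name latest_n_per_key and the tuple layout (timestamp, email, score) are assumptions made for the example.

```python
from collections import defaultdict

def latest_n_per_key(rows, n=10):
    """rows: list of (timestamp, email, score) tuples (hypothetical layout).
    Returns at most n most recent rows per email, sorted by timestamp ascending."""
    # Step 1: sort by timestamp descending, then email ascending,
    # like "order by Col1 desc, Col2". Python's sort is stable, so
    # sorting by the secondary key first gives the combined order.
    ordered = sorted(rows, key=lambda r: r[1])
    ordered = sorted(ordered, key=lambda r: r[0], reverse=True)

    # Step 2: running count per email, the role played by COUNTIFS.
    seen = defaultdict(int)
    kept = []
    for row in ordered:
        seen[row[1]] += 1
        # Step 3: keep only the first n occurrences (Col4 <= 10 in the formula).
        if seen[row[1]] <= n:
            kept.append(row)

    # Step 4: final sort by timestamp ascending, like the outer "order by Col1".
    return sorted(kept, key=lambda r: r[0])
```

Because the list is sorted newest first before counting, an occurrence number of 1 always marks the most recent row for that email.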
On second thoughts, maybe this is a bit cheeky, but this might do it (taken from the conditional rank idea):
=ArrayFormula(filter(A2:C,countifs(A2:A,">="&A2:A,B2:B,B2:B)<=10,A2:A<>""))
EDIT
The above assumes that, because the data is time-stamped, duplicate timestamps shouldn't occur. If they do and the data is pre-sorted, you can use the row number as a proxy for the timestamp, as suggested by @Aresvik.
Alternatively, you could count separately
(a) only rows with a later timestamp
plus
(b) rows with the same time stamp but with earlier (or identical) row number
=ArrayFormula(filter(A2:C,countifs(A2:A,">"&A2:A,B2:B,B2:B)+countifs(A2:A,"="&A2:A,B2:B,B2:B,row(A2:A),"<="&row(A2:A))<=10,A2:A<>""))
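The two-part count, (a) plus (b), can be sketched in Python to show how ties are broken by row position. This is a hedged illustration only; the function name keep_latest and the (timestamp, email, score) layout are assumptions, and the blank-row check (A2:A<>"") from the formula is left out.

```python
def keep_latest(rows, n=10):
    """rows: list of (timestamp, email, score) tuples in sheet order (hypothetical layout).
    Keeps a row when its rank within its email group is <= n."""
    result = []
    for i, (ts, email, _score) in enumerate(rows):
        # (a) rows for the same email with a strictly later timestamp
        later = sum(1 for t, e, _s in rows if e == email and t > ts)
        # (b) rows for the same email with the same timestamp and an
        #     earlier (or identical) row position
        same_up_to_here = sum(
            1
            for j, (t, e, _s) in enumerate(rows)
            if e == email and t == ts and j <= i
        )
        if later + same_up_to_here <= n:
            result.append(rows[i])
    return result
```

With identical timestamps, the row that appears first in the sheet gets the lower rank, so the result never exceeds n rows per email.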
I have added a new sheet ("Erik Help") with the following formula in A1:
=ArrayFormula({"Submitted Time","Email","Score";SORT(SPLIT(FLATTEN(QUERY(SORT(TRANSPOSE(SPLIT(TRANSPOSE(QUERY(IF(Sheet1!B2:B=TRANSPOSE(UNIQUE(FILTER(Sheet1!B2:B,Sheet1!B2:B<>""))),Sheet1!A2:A&"|"&Sheet1!B2:B&"|"&Sheet1!C2:C,),,COUNTA(Sheet1!A2:A)))," ",0,1)),SEQUENCE(MAX(COUNTIF(Sheet1!B2:B,Sheet1!B2:B))),0),"LIMIT 10")),"|",1,0),1,0)})
The number of records is set after LIMIT.
The order is set by the final two numbers: 1,0 (meaning "sort by column 1 in reverse order," which, as currently set, is sorting in reverse order by date/time).
Related
Right now I am using this query to search for a row based on its Column 1 value. Then it takes the value from the last column. I need a way for it to automatically find the last column in the row since some of the rows have more columns than others.
This is what I had before, which I had manually specified the last column with a value:
=QUERY(IMPORTRANGE("link_redacted","PriceList!A1:AZ100000"), "Select Col10 where Col1 = '5531001'",1)
I have tried using LOOKUP with ARRAYFORMULA, but I couldn't get it to work:
=QUERY(IMPORTRANGE("link_redacted","PriceList!A1:AZ100000"), "Select (LOOKUP(1, ARRAYFORMULA(1/[Select Col1 where Col1 = '5531006']:[Select Col100 where Col1 = '5531006']<>"")[Select Col1 where Col1 = '5531006']:[Select Col100 where Col1 = '5531006']))",1)
Any ideas for a simpler way to do this?
Since no example was provided, I tested the formulas given, but no source data could be fetched, so I created a minimal, reproducible example.
Example is the data on the left
Use this formula to get the last non-empty value in each row:
=ArrayFormula(IFERROR( REGEXEXTRACT( TRIM(TRANSPOSE(QUERY(TRANSPOSE(C3:E),,ROW(C3:E)))), "[^\s]+$")))
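For readers who want the idea without the QUERY/TRANSPOSE trick, the same "last non-empty cell per row" logic can be sketched in Python. This is an illustration under assumptions: the function name last_non_empty is hypothetical, and empty cells are taken to be "" or None.

```python
def last_non_empty(row):
    """Return the last cell in row that is not empty, or "" if all are empty."""
    # Walk the row from right to left and stop at the first filled cell.
    for cell in reversed(row):
        if cell not in ("", None):
            return cell
    return ""

def last_values(table):
    """Apply last_non_empty to every row of a list-of-lists table."""
    return [last_non_empty(row) for row in table]
```

One difference worth noting: the sheet formula joins each row with spaces and extracts the last run of non-whitespace characters, so a cell value that itself contains spaces would likely be cut down to its last word there, whereas this sketch returns the whole cell.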
=QUERY(IMPORTRANGE(some_range);"
SELECT Col2,dateDiff(Col20,now())
WHERE Col20 IS NOT NULL AND dateDiff(Col20,now()) <= 30
ORDER BY Col2 ASC
LABEL Col2 'SANITARNA'
";0)
So, I have this QUERY formula, which works perfectly for a column that contains only dates. However, I need to apply it to a column that contains both dates and some text values. When I change the dateDiff column, I get the error "Unable to parse query string for Function QUERY parameter 2: Can't perform the function 'dateDiff' on values that are not a Date or a DateTime values", which makes sense. However, I can't figure out how to incorporate a filter within the dateDiff function to skip the text values and output only the rows that have dates. My best guess so far is that the filter has to be applied within the dateDiff function in SELECT and not in WHERE. I've tried a FILTER/ISNUMBER formula but get a parse error, and my brain is fried, so I can't see the problem.
Test sheet: https://docs.google.com/spreadsheets/d/1thpXBSp-Vt1E5MGaM89Xko6GvekjmzyidO94Sil2AjQ/edit?usp=sharing
See my newly added sheet ("Erik Help"), which contains the following version of your original formula:
=QUERY(FILTER(A:C,ISNUMBER(C:C))," SELECT Col1,dateDiff(Col3,now()) WHERE Col2 IS NOT NULL AND dateDiff(Col3,now()) <= 30 ORDER BY Col1 ASC LABEL Col1 'PRIJAVA DO' ",0)
FILTER will first filter in only those rows where the value in Col C is a number. And since all dates are numbers as far as Google Sheets is concerned, that will be only your rows with dates in Col C.
NOTE: You currently have no date values in Col C that are less than or equal to 30 days from TODAY() — that is, all of your Col-C dates are in the future — so your table is returning empty (yet without error, because it is working). Because QUERY is acting on a FILTER of the original data and not on the original data itself, all QUERY column references must be in Colx notation, not A-B-C notation.
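The filter-first pattern (drop non-date rows, then compute the day difference) can be sketched in Python. This is a hedged illustration, not the Sheets behaviour itself: the function name rows_due_within and the three-column row layout are assumptions, and the difference is taken as "date minus today", on the assumption that dateDiff(Col3,now()) subtracts its second argument from its first.

```python
from datetime import date

def rows_due_within(rows, today, max_days=30):
    """rows: list of (col1, col2, col3) where col3 may be a date or text.
    Returns (col1, days_until_col3) for rows within max_days, sorted by col1."""
    out = []
    for col1, col2, col3 in rows:
        # Equivalent of FILTER(..., ISNUMBER(C:C)): skip rows whose
        # third column is not a real date (for example, text).
        if not isinstance(col3, date):
            continue
        # Equivalent of "WHERE Col2 IS NOT NULL".
        if col2 in ("", None):
            continue
        diff = (col3 - today).days
        if diff <= max_days:
            out.append((col1, diff))
    # Equivalent of "ORDER BY Col1 ASC".
    return sorted(out, key=lambda r: r[0])
```

Filtering before the date arithmetic is the point of the pattern: the subtraction never sees a text value, so it cannot fail on one.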
The solution Erik Tyler provided works. I'll just add the syntax for IMPORTRANGE as well, for anyone with a similar problem.
=QUERY(FILTER(
IMPORTRANGE("some_range";"some_sheets!A:QQ");
ISNUMBER(
IMPORTRANGE("some_range";"some_sheets!C:C")));"
SELECT Col2,dateDiff(Col20,now())
WHERE Col20 IS NOT NULL AND dateDiff(Col20,now()) <= 30
ORDER BY Col2 ASC
LABEL Col2 'SANITARNA'
";0)
I am trying to calculate the sum of the last 5 and last 10 values for specific players in a set of data. The date is in column B, the players' names are in column K, and the data I need to add up is in column Z. I just want a formula that finds the last 5 values associated with a player's name and adds them up. Is that possible? I attached the Google Sheet for your review. Thank you for the help!
https://docs.google.com/spreadsheets/d/1iR44nAFSUxZ54LOH61zPOGS_6ugYIMYHEfhJM-SkV6k/edit?usp=sharing
Since the dates are in descending order, all you need is:
=QUERY(FILTER({K2:K, Z2:Z},
COUNTIFS(K2:K, K2:K, ROW(K2:K), "<="&ROW(K2:K))<6),
"select Col1,sum(Col2) where Col1 !='' group by Col1 label sum(Col2)''")
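The logic of that formula (count each player's rows from the top, keep the first five, then sum per player) can be sketched in Python. This is a minimal illustration that assumes, as the answer does, that rows are already ordered newest first; the function name sum_first_n_per_player and the (player, value) layout are hypothetical.

```python
from collections import defaultdict

def sum_first_n_per_player(rows, n=5):
    """rows: list of (player, value) tuples, newest first (assumed ordering).
    Returns {player: sum of that player's first n values}."""
    counts = defaultdict(int)
    totals = defaultdict(int)
    for player, value in rows:
        if player == "":
            continue  # skip blank rows, like "where Col1 != ''"
        counts[player] += 1
        # The running count mirrors COUNTIFS(..., ROW(...), "<="&ROW(...)) < 6.
        if counts[player] <= n:
            totals[player] += value
    return dict(totals)
```

For the last 10 values, the same sketch works with n=10, just as the formula works with <11 in place of <6.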
I have searched a lot of pages but cannot find a solution to my problem, except in reverse order. I have simplified what I do, but I have a query that pulls information from my data sheet. Here there are 3 columns: the date, the amount and the source.
I would like, with a QUERY function, to create separate columns that count the entries in column C by value, per month, like this
I'm okay with the start of the formula
=QUERY(A2:C,"select month(A)+1, sum(B), count(C) where A is not null group by month(A)+1")
But as soon as I try something a little different by putting 2 queries together in an ARRAYFORMULA, the row counts don't match, because some months have a count of 0 for some sources.
Do you have a solution for what I'm trying to do? Thank you in advance :)
Solution:
It's not possible in Google Query Language to have a single query statement that has one result grouped by one column and another result grouped by another.
The first two columns can be like this:
=QUERY(A2:C,"select month(A)+1, sum(B) where A is not null group by month(A)+1 label month(A)+1 'Month', sum(B) 'Amount'")
To create the column labels for the remaining columns, enter this in the first row (I1 in my example):
=TRANSPOSE(UNIQUE(C2:C))
Then from cell I2, enter this:
=COUNTIFS(arrayformula(month($A$2:$A)),$G2,$C$2:$C,I$1)
Then drag horizontally and vertically to apply to the entire table.
Results:
try:
=INDEX({
QUERY({MONTH(A2:A), B2:C},
"select Col1,sum(Col2) where Col2 is not null group by Col1 label Col1'month',sum(Col2)'amount'"),
QUERY({MONTH(A2:A), B2:C, C2:C},
"select count(Col3) where Col2 is not null group by Col1 pivot Col4")})
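The target table (one row per month, with the total amount plus a count for each source) can be sketched in Python to make the shape of the result explicit. This is an illustration only; the function name monthly_summary and the (date, amount, source) row layout are assumptions.

```python
from datetime import date

def monthly_summary(rows):
    """rows: list of (date, amount, source) tuples (hypothetical layout).
    Returns {month: {"amount": total, "counts": {source: count}}}."""
    # Collect every source first so each month lists all of them,
    # including sources with a count of 0 in that month.
    sources = sorted({src for _d, _amt, src in rows})
    table = {}
    for d, amount, src in rows:
        entry = table.setdefault(
            d.month, {"amount": 0, "counts": dict.fromkeys(sources, 0)}
        )
        entry["amount"] += amount       # like sum(Col2) grouped by month
        entry["counts"][src] += 1       # like count(...) pivoted by source
    return table
```

Pre-filling every source with 0 for each month is what keeps the rows aligned, which is the mismatch the question ran into when combining two separate queries.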
I have a simple table that looks like this:
All I need is to SUM the points for a specific player (John) in his last 3 matches.
I was able to come with this formula:
SUMPRODUCT(LARGE((A2:B="John")*(C2:D);{1;2;3}))
The problem is that instead of what I was looking for, it sums the highest 3 values, that can be anywhere in that range.
Is there some similar formula, that can do only the last 3 matches?
I think a SUMPRODUCT can get you there with some constructed arrays using a COUNTIFS() and ROW() to get the most recent 3.
This formula:
=SUMPRODUCT((COUNTIFS(A:B,G2,ROW(A:B)*{1,1},">="&ROW(A:B)*{1,1})<=3)*(A:B=G2),C:D)
on this sheet I made seems to work.
I think I have a formula that gives what you want. It's not pretty, and I'm sure it can be made simpler, but this works:
=query( query(
{ arrayformula( {ROW(A1:A) } ),
query(A1:D,"select A, B, C, D",1)
} , "select * order by Col1 desc",1),
"select Col2, Col3, Col4, Col5
where (Col2 ='John' or Col3 = 'John')
order by Col1 desc limit 3",1)
Basically, it adds the row number as an extra column to the data, so that we can sort the data in reverse order by row number. Then we query the result to find the first three occurrences of 'John', in either Col A or Col B.
Here is a sample sheet:
https://docs.google.com/spreadsheets/d/1-mhTb5Cpp3D-1OltlmCfwlmM-vc2OknHxfJAyHD7BjI/edit?usp=sharing
Credit to Erik Tyler for a previous answer on a different question, on how to add the row number to a query.
Edit: Updated the sheet to provide the SUM of John's (or any player's) scores from the last three matches. This can be combined with the previous formula, if you want a single formula to place somewhere. Or will you have a list of all the players, and you'll want their last three scores beside each of their names?
If I can simplify the formula, I'll update it here.
Let me know if you need something more than this, or if this has answered your question.
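The idea of walking the table from the bottom and summing the player's points from the first three matches found can be sketched in Python. This is a hedged illustration, not the sheet's formula; the function name last_n_points and the (name 1, name 2, points 1, points 2) row layout are assumptions based on the table described in the question.

```python
def last_n_points(matches, player, n=3):
    """matches: list of (name1, name2, points1, points2), oldest first (assumed).
    Returns the sum of player's points over their last n matches."""
    total = 0
    found = 0
    # Reading the rows in reverse is the same as sorting by row number descending.
    for name1, name2, pts1, pts2 in reversed(matches):
        if player == name1:
            total += pts1
        elif player == name2:
            total += pts2
        else:
            continue  # the player did not take part in this match
        found += 1
        if found == n:
            break  # equivalent of "limit 3"
    return total
```

Unlike the LARGE-based attempt in the question, this picks matches by position rather than by score, so low-scoring recent matches are still counted.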
Approach
I would use the query formula to get the cells that you need so that you can leverage the limit statement.
You should put a column with the indexes so that you can order the cells in descending order and take the first 3.
Given that your table headers are:
+-----------------------------------------------+
| INDEX | NAME 1 | NAME 2 | POINTS 1 | POINTS 2 |
+-----------------------------------------------+
I would use this query to get your desired result:
=SUMPRODUCT(QUERY(A2:E, "Select D * E where B = 'John' or C = 'John' order by A desc limit 3"))