Sum 5 largest numbers in each row, dynamically - google-sheets

I have a league table in which Column A lists the championship entrants.
Each row holds that entrant's race results (points scores), i.e. Col C shows Race 1, Col D Race 2, etc.
I want to sum, per row (entrant), the 5 largest scores, with the total shown in Col B.
The following formula works fine when entered line by line:
=ArrayFormula(SUM(IFERROR(LARGE($H5:$AE5,{1,2,3,4,5,6}),0)))
However, I want it to be a dynamic array formula that self populates, should new entrants be added. Something like (though this doesn't work):
=arrayformula(If(A2:A<>"",ArrayFormula(SUM(IFERROR(LARGE($H5:$AE5,{1,2,3,4,5,6}),0))),""))
I've been trying to use MMULT, and a few other haphazard ideas, unsuccessfully.
A test sheet is available here:
https://docs.google.com/spreadsheets/d/18tmKdwAcXoDQrQxSDSnzgK6A5Erj22oSXcxwUt_lq4o/edit?usp=sharing

This should work even with hundreds or thousands of rows. You can find it on the new tab called mk.help
=Arrayformula({"TEST";if(A3:A="",,VLOOKUP(A3:A,query({query(vlookup(SEQUENCE(COUNTA(A3:A)*10,1,0)/10+3,{row(A3:A),A3:A,D3:M},mod(SEQUENCE(COUNTA(A3:A)*10,1,0),10)*{0,1}+{2,3}),"order by Col1,Col2 desc"),Mod(SEQUENCE(COUNTA(A3:A)*10,1,0),10)},"select Col1,Sum(Col2) where Col3<5 group by Col1"),2,0))})

In B3 put this formula:
=arrayformula(query({transpose(split(textjoin(",",false,{left("",row(A3:A5))} & join(",",column(D3:M3)-column(D3))),",",true,false)),sort(split(transpose(split(textjoin("*",false,{row(B3:B5) & "^" & D3:M5}),"*",true,false)),"^",true,false),1,true,2,false)},"Select sum(Col3) where Col1<=4 group by Col2 label sum(Col3) ''"))
Note that you must adjust the ranges in this formula if your data extends beyond row 5.
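To check what any of these formulas should return, the per-row logic can be sketched outside Sheets. This is only a minimal Python illustration, not part of either answer; the entrant names and scores are made up, and blanks are modelled as None.

```python
import heapq

def top_n_sum(scores, n=5):
    """Sum the n largest numeric scores in one row, ignoring blanks."""
    numeric = [s for s in scores if isinstance(s, (int, float))]
    return sum(heapq.nlargest(n, numeric))

# Hypothetical league table: entrant -> race results (None = no result).
table = {
    "Alice": [10, 8, None, 6, 4, 2, 1],
    "Bob": [5, None, 3],
}

totals = {name: top_n_sum(row) for name, row in table.items()}
print(totals)  # {'Alice': 30, 'Bob': 8}
```

Rows with fewer than n results simply sum what is there, which is what the IFERROR(...,0) wrapper achieves in the question's formula. Pass n=6 to mirror the {1,2,3,4,5,6} constant used there.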

Related

How sum a range of values when the criterion depends on multiple values spread along different columns

Suppose we have the following table in Google Sheets:
A
B
C
D
E
1
a
green apple
apple
=SUMIFS(A:A, B:B,"*" & D1 & "*", C:C,"*"&D1&"*")
1
Orange
banana
=SUMIFS(A:A, B:B,"*" & D2 & "*", C:C, "*" & D2 & "*")
20
a
red apple
1
banana
1
kiwi
1
Banana
Then E1 == 0 and E2 == 2. This is because SUMIFS sums the values in column A only for rows where ALL the criteria are TRUE; in other words, SUMIFS joins its criteria with a logical AND operator.
What I need is the same SUMIFS operation but with an OR operator so that E1 == 21.
One solution is to concatenate B and C values in F column and then simply use this formula
=SUMIF(F:F, "*" & D1 & "*", A:A)
Is there another way to do this without having to create another column?
Since someone edited the tags, the answer can be written for Google Sheets, Excel, LibreOffice and similar apps. Thanks for your help!
If your real application only needs to find any items from D:D in either of only two columns and then return one final total, you can use this:
=ArrayFormula(SUM(FILTER(A:A,NOT(ISERROR(REGEXEXTRACT(LOWER(B:B&"~"&C:C),JOIN("|",FILTER(LOWER(D:D),D:D<>""))))))))
If your real application needs to find any items from D:D in more than two columns and then return one final total, you can either continue to join columns like this (B:B&"~"&C:C) or use a formula like this:
=ArrayFormula(SUM(FILTER(A:A,NOT(ISERROR(REGEXEXTRACT(LOWER(TRANSPOSE(QUERY(TRANSPOSE(B:C),,COLUMNS(B:C)))),JOIN("|",FILTER(LOWER(D:D),D:D<>""))))))))
If you need a per-item count of the strings in D:D if they appear any number of times in other columns, try this:
=ArrayFormula(QUERY(TRIM(QUERY(SPLIT(FLATTEN(IF(NOT(ISNUMBER(SEARCH(TRANSPOSE(FILTER(D:D,D:D<>"")),LOWER(TRANSPOSE(QUERY(TRANSPOSE(B:C),,COLUMNS(B:C))))))),,TRANSPOSE(FILTER(LOWER(D:D),D:D<>""))&"~"&A:A)),"~"),"Select Col1, SUM(Col2) WHERE Col1 Is Not Null GROUP BY Col1 LABEL SUM(Col2) ''")),"WHERE Col2 Is Not Null"))
If there will only ever be 2 columns to consider, then it makes sense to go with JohnSUN's suggestion from the comments.
Otherwise:
=SUMPRODUCT(N(MMULT(N(ISNUMBER(SEARCH(D1,B$1:C$6))),TRANSPOSE(COLUMN(B$1:C$6)^0))>0),A$1:A$6)
The range referenced (B$1:C$6) can be extended to one comprising as many columns as desired, though bear in mind that, having switched from a SUMIF set-up to a SUMPRODUCT one, you would be strongly advised to not use entire column references (at least in Excel this is the case; not sure about Sheets).
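For anyone who wants to verify the OR behaviour independently, here is a rough Python sketch of the same idea: a row's value is added once if the term appears in any of its text columns, matched case-insensitively like SEARCH. The rows below are illustrative only; the first and third follow the question's apple example, the rest are assumptions.

```python
def sum_if_any_contains(rows, term):
    """Sum the value of each row where ANY text cell contains term."""
    needle = term.lower()
    return sum(
        value
        for value, *texts in rows
        # any() is the OR across columns; a row is counted at most once.
        if any(needle in text.lower() for text in texts)
    )

# (value, text column 1, text column 2); partly hypothetical data.
rows = [
    (1, "a", "green apple"),
    (1, "Orange", ""),
    (20, "a", "red apple"),
    (1, "banana", ""),
]

print(sum_if_any_contains(rows, "apple"))   # 21
print(sum_if_any_contains(rows, "banana"))  # 1
```

Counting each row at most once corresponds to the >0 test applied to the MMULT result in the formula above.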

Google sheets Query function with Arrayformula

For each email ID, I want to get the latest 10 records by timestamp. How do I get the results with ARRAYFORMULA? The QUERY function is not essential as long as I can achieve this with ARRAYFORMULA. Here is the sample data:
https://docs.google.com/spreadsheets/d/1YAHA02VM-5MXzVKhkxu_eODPKObpoz441mGX8lOFu5M/edit?usp=sharing
Try this on another sheet, row 1:
=arrayformula(query({query({Sheet1!$A:$C},"order by Col1 desc,Col2",1),{"Dupe position";countifs(query({Sheet1!$A2:$C},"select Col2 order by Col1 desc,Col2",0),query({Sheet1!$A2:$C},"select Col2 order by Col1 desc,Col2",0),row(Sheet1!$A2:$C),"<="&row(Sheet1!$A2:$C))}},"select Col1,Col2,Col3 where Col1 is not null and Col4 <= 10 order by Col1",1))
You can adjust the number of records found by adjusting Col4 <= 10, and also the final sort by altering order by Col1 at the end of the formula.
Explanation
This gets the data from Sheet1, sorts it by date desc then email asc:
query({Sheet1!$A:$C},"order by Col1 desc,Col2",1)
Then, to the side of this data, a COUNTIFS() numbers each occurrence of an email in the list above (since it's sorted desc, 1 represents the most recent instance).
countifs(<EmailColumnData>,<EmailColumnData>,row(<EmailColumn>),"<="&row(<EmailColumn>))
In place of <EmailColumnData> in the COUNTIFS() is:
query({Sheet1!$A2:$C},"select Col2 order by Col1 desc,Col2",0)
In place of <EmailColumn> above, we only want the row number so we don't need the actual data. We can use:
Sheet1!$A2:$C
Various {} work as arrays to bring the data together.
E.g., {a,b,c;d,e,f} would result in three columns, with a, b, c in row 1 and d, e, f in row 2. A comma starts a new column; a semicolon starts a new row.
A final query around everything gets the 3 columns we need, where the count number in col 4 is <=10, then sorts the output by Col1 (date asc).
On second thoughts, maybe this is a bit cheeky, but this might do it (taken from the conditional rank idea):
=ArrayFormula(filter(A2:C,countifs(A2:A,">="&A2:A,B2:B,B2:B)<=10,A2:A<>""))
EDIT
The above assumes that, because the data is time-stamped, duplicate timestamps won't occur. If they do and the data is pre-sorted, you can use the row number as a proxy for the timestamp, as suggested by @Aresvik.
Alternatively, you could count separately
(a) only rows with a later timestamp
plus
(b) rows with the same time stamp but with earlier (or identical) row number
=ArrayFormula(filter(A2:C,countifs(A2:A,">"&A2:A,B2:B,B2:B)+countifs(A2:A,"="&A2:A,B2:B,B2:B,row(A2:A),"<="&row(A2:A))<=10,A2:A<>""))
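The two-part count in that formula can be sketched in Python to see how the tie-break behaves. This is an illustrative model only: the tuple layout (timestamp, email, score), the integer timestamps and the example addresses are assumptions, and n is set to 2 to keep the sample short.

```python
def latest_n_per_key(records, n=10):
    """Keep records whose newest-first rank within their key is <= n.

    records: list of (timestamp, key, payload) tuples in sheet order.
    Rows with equal timestamps are ranked by position, earlier row first.
    """
    kept = []
    for i, (ts, key, _) in enumerate(records):
        # (a) rows for the same key with a strictly later timestamp
        later = sum(1 for t, k, _ in records if k == key and t > ts)
        # (b) rows with the same timestamp at this row or above it
        tied = sum(
            1
            for j, (t, k, _) in enumerate(records)
            if k == key and t == ts and j <= i
        )
        if later + tied <= n:
            kept.append(records[i])
    return kept

# Hypothetical data: (timestamp, email, score).
records = [
    (3, "a@example.com", 90),
    (1, "a@example.com", 70),
    (2, "a@example.com", 80),
    (2, "b@example.com", 60),
    (2, "a@example.com", 85),
]

print(latest_n_per_key(records, n=2))
# [(3, 'a@example.com', 90), (2, 'a@example.com', 80), (2, 'b@example.com', 60)]
```

Like the COUNTIFS version, this compares every row with every other row, so the work grows quadratically with the number of records.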
I have added a new sheet ("Erik Help") with the following formula in A1:
=ArrayFormula({"Submitted Time","Email","Score";SORT(SPLIT(FLATTEN(QUERY(SORT(TRANSPOSE(SPLIT(TRANSPOSE(QUERY(IF(Sheet1!B2:B=TRANSPOSE(UNIQUE(FILTER(Sheet1!B2:B,Sheet1!B2:B<>""))),Sheet1!A2:A&"|"&Sheet1!B2:B&"|"&Sheet1!C2:C,),,COUNTA(Sheet1!A2:A)))," ",0,1)),SEQUENCE(MAX(COUNTIF(Sheet1!B2:B,Sheet1!B2:B))),0),"LIMIT 10")),"|",1,0),1,0)})
The number of records is set after LIMIT.
The order is set by the final two numbers: 1,0 (meaning "sort by column 1 in reverse order," which, as currently set, is sorting in reverse order by date/time).

Find frequency of words in a column in Google Sheets and lookup another value from a different column using formulae

I have 2 columns of data in a Google Sheet. Column1 is unique words or sentences (words are repeated in sentences) and the Column2 is a numeric value next to each (say votes). I am trying to get a list of unique words from Column1 and then the sum of values (votes) from Column2 when the word was present either on its own or in a sentence.
Here is a sample of the data I am working with in Google Sheets:
Term Votes
apple 20
apple eat 100
orange 30
orange rules 40
rule why 50
This is what the end result looks like:
Word Votes
apple 120
eat 100
orange 70
rules 40
rule 50
why 50
The way I am doing it now is quite long and I am not sure if this is the best solution.
Here's my solution:
JOIN values in Column1 using a delimiter " " and then SPLIT them using the same delimiter and then TRANSPOSE them into a column all in one step. This way I have a list of all the words used in Column1 in say Column3.
In Column4 pull out all the UNIQUE values and then do a COUNTIF for the unique values from Column3. This way I am able to get the frequency of each unique word by referencing the list of all words.
In order to get the sum of Votes I have to TRANSPOSE Column4 and then QUERY Column1 and Column2 by using dynamic text in the formula. The formula looks like =QUERY(Column1:Column2, "SELECT SUM(Column2) WHERE Column1 CONTAINS '" & referenceToUniqueWord & "'", 1). The reason I have to transpose first is that the query formula outputs 2 cells of data, i.e. Text: sumColumn2 and Number: 'sum of votes'. Since I get two cells of data for each unique word, I am not able to drag the formula down and hence I have to do it horizontally.
I finally get three rows of data after the last step:
One row is just transposed Column4 (all the unique words). Second row is just the text sumColumn2 from using the QUERY formula. And third row is the actual sum of votes, resulting from individual QUERY formulae. I then transpose these rows to columns and to get my final table I VLOOKUP the frequency values arrived at earlier.
This approach is lengthy and prone to errors. It also fails on large lists: the initial JOIN returns an error once the 50,000-character limit is reached. Any ideas to make it better are welcome. I know this can be done much more easily using Scripts, but I'd prefer to have it done using only formulae.
try:
=ARRAYFORMULA(QUERY(SPLIT(TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY(
IF(IFERROR(SPLIT(A:A, " "))="",,"♠"&SPLIT(A:A, " ")&"♦"&B:B)
,,999^99)),,999^99), "♠")), "♦"),
"select Col1,sum(Col2)
group by Col1
order by sum(Col2) desc
label sum(Col2)''"))
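As a cross-check on the expected totals, the split-then-group logic can be written as a short Python sketch using the sample data from the question. The function name is made up for illustration.

```python
from collections import defaultdict

def votes_per_word(rows):
    """Credit each row's votes to every space-separated word in its term."""
    totals = defaultdict(int)
    for term, votes in rows:
        for word in term.split():
            totals[word] += votes
    return dict(totals)

# Sample data from the question: (Term, Votes).
rows = [
    ("apple", 20),
    ("apple eat", 100),
    ("orange", 30),
    ("orange rules", 40),
    ("rule why", 50),
]

result = votes_per_word(rows)
# Print highest totals first, similar to "order by sum(Col2) desc".
for word, votes in sorted(result.items(), key=lambda kv: -kv[1]):
    print(word, votes)
```

Words are matched exactly, so "rule" and "rules" stay separate, as in the expected output in the question.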

How to count the unique pairs in 2 columns and sort using the count using an ArrayFormula in Google Sheets?

Let's say I have the following spreadsheet:
https://docs.google.com/spreadsheets/d/1FY7GnhZoT2_Tzm8FLOkDuc5XR8TkFhBJKgW_qZ1r4Cc/edit?usp=sharing
In the top-left column, I have a formula that counts the events and sorts them by frequency. What I want to do now, instead of just counting the frequencies of the actions, is to count the number of unique users per action. For example, in my spreadsheet, the action call came up 5 times: 2 times by Joe, 2 times by Mary, 1 time without a user (empty). Therefore, next to the call action in my left-hand table, I would want 2, because the number of unique pairs (event and user) is exactly 2.
So using the above logic, what I want my left side table to be is the following:
Call 2
SMS 1
Review 1
Hopefully, I have made myself clear.
How can I do this using my example spreadsheet? Thanks.
try:
=ARRAYFORMULA(QUERY(UNIQUE({D:D, D:D&E:E, E:E}),
"select Col1,count(Col1)
where Col3 != ''
group by Col1
order by count(Col1) desc
label count(Col1)'Count'", 1))
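The same steps (deduplicate the pairs, drop rows without a user, then count per action) look roughly like this in Python. The event list is invented to match the description in the question; ties are ordered alphabetically here, which is an arbitrary choice of this sketch.

```python
from collections import Counter

def unique_pair_counts(events):
    """Count distinct users per action, skipping rows with no user."""
    pairs = {(action, user) for action, user in events if user}
    counts = Counter(action for action, _ in pairs)
    # Highest count first; equal counts sorted by action name.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical events: (action, user); "" means no user recorded.
events = [
    ("Call", "Joe"),
    ("Call", "Mary"),
    ("SMS", "Joe"),
    ("Call", "Joe"),
    ("Call", ""),
    ("Call", "Mary"),
    ("Review", "Mary"),
]

print(unique_pair_counts(events))
# [('Call', 2), ('Review', 1), ('SMS', 1)]
```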

Google Sheet formula to find the minimal sum of pairs in array

I'm looking for a solution to my problem. I have a sheet that summarizes lap times for a competition. We run 3 laps in each qualification. We qualify for the finals by our 2 best consecutive laps, so we sum the first and second laps, or the second and third laps, and then take the smaller sum. I've managed to get an array of pairs and filter out empty cells (run not finished). The number of pairs may vary from 1 to 20.
Now my question: how do I find the smallest sum of pairs from my array in one elegant formula?
Here is my sample sheet: example sheet
=QUERY(QUERY({A17:B17;B17:C17;D17:E17;E17:F17;G17:H17;H17:I17};
"select Col1+Col2
where Col1 is not NULL
and Col2 is not NULL");
"select min(Col1)
label min(Col1)''")
I know this isn't exactly your question and fair play if it gets marked down, but in your quest for an 'elegant formula', I was wondering if there was a more general way to get the pairs in the first place.
You can do it by using two ranges offset by one cell, together with the mod of the column number:
=ArrayFormula(query(
query({transpose({A17:H17;B17:I17;mod(column(A17:H17),3)})},"select Col1+Col2 where Col1 is not null and Col2 is not null and Col3>0")
,"select min(Col1) label min(Col1) ''"))
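To make the pairing rule explicit, here is a small Python sketch of the same idea: laps are taken in runs of three, only adjacent laps within a run are paired, and any pair with a missing lap is skipped. The lap times are made up, and None stands for an unfinished lap.

```python
def best_consecutive_pair(laps, group_size=3):
    """Smallest sum of two adjacent laps within the same qualification run."""
    sums = []
    for start in range(0, len(laps), group_size):
        run = laps[start:start + group_size]
        # Pair lap 1 with lap 2 and lap 2 with lap 3, never across runs.
        for first, second in zip(run, run[1:]):
            if first is not None and second is not None:
                sums.append(first + second)
    return min(sums) if sums else None

# Hypothetical lap times for three runs; None = lap not finished.
laps = [31.5, 30.0, 29.5, 30.5, None, 28.0, 29.0, 29.5, 33.0]

print(best_consecutive_pair(laps))  # 58.5
```

Stepping through the list in blocks of group_size plays the role of the mod(column(...),3) filter, which drops the pair that would straddle two runs.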
