Display Top 5 Entries from Matching Keys in Sheets - google-sheets

This is a little more complex than the title suggests, and I'm having difficulty figuring out how to approach it.
I have a set of tables which record the 20 highest values and their corresponding keys from a larger list of keys and values.
Right now my formulas are:
=LARGE(Values!$K$3:Values!$K$999,ROW(J4)-ROW(J$3)) for the values, and
=INDEX(Values!$L$3:Values!$L$999,MATCH(J4,Values!$K$3:Values!$K$5000,0)) to select the matching keys.
This approach returns all N desired values, but it picks whatever the top values are without regard for keys, so the result can be, for example, a column made up entirely of values for key A.
What I want is:
Display top N (20) values with the caveat that:
Only the top M (5) values for each key are displayed.
For example, with N of 5 and M of 2, this is what I would want.
Given data:
Key  Value
A    100
B    200
C    300
A    400
A    600
B    140
C    100
A    350
I would want the resulting table to look like this:
Key  Value
A    600
A    400
C    300
B    200
B    140
In this system, keys are unpredictable, and there can be fewer than M entries per key, or many more than M entries per key. Values might not be unique, but I haven't found a good way to address this either, and it happens very infrequently.

max(B) for each unique A:
=SORTN(SORT(A:B, 2, ), 9^9, 2, 1, 1)
max(B) for each unique A but only the highest N(2) returned:
=SORTN(SORTN(SORT(A:B, 2, ), 9^9, 2, 1, 1), N(2),, 2, )
top 2 values of B for each A (the question's M), with the final output limited to the top 5 rows (the question's N):
=QUERY(FILTER(SORT(A:B, 2, ),
COUNTIFS(SORT(A:A, B:B, 0), SORT(A:A, B:B, 0), ROW(A:A), "<="&ROW(A:A))
<=N(2)), "limit 5", )
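To make the logic of that last formula easier to check, here is a minimal Python sketch of the same idea: sort by value descending, keep at most M rows per key, and stop once N rows have been collected. It is only an illustration of the logic, not a translation of the Sheets formula; the function name `top_n_with_per_key_cap` is made up for this example, and the data is the sample from the question.

```python
from collections import defaultdict

def top_n_with_per_key_cap(rows, n, m):
    """Return the n largest (key, value) rows, keeping at most m rows per key."""
    kept = []
    seen = defaultdict(int)  # how many rows have been kept for each key
    # sorted() is stable, so tied values keep their original relative order.
    for key, value in sorted(rows, key=lambda r: r[1], reverse=True):
        if len(kept) >= n:
            break
        if seen[key] < m:
            seen[key] += 1
            kept.append((key, value))
    return kept

# Sample data from the question, with N = 5 and M = 2.
data = [("A", 100), ("B", 200), ("C", 300), ("A", 400),
        ("A", 600), ("B", 140), ("C", 100), ("A", 350)]
result = top_n_with_per_key_cap(data, n=5, m=2)
print(result)
```

With the question's data this yields A 600, A 400, C 300, B 200, B 140, which matches the expected table; A 350 is dropped because A already has two rows.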

Related

ArrayFormula which returns the second last matching

I have a table which collects daily readings of a total score from many different players. Since the readings are collected manually via a form, some players may add a reading more than once a day, and a player may also go a day or more without any reading at all.
The structure is very basic: 3 columns (Date, Player, Total).
I'm looking for an ArrayFormula that will automatically fill in a 4th column with the daily score of the specific player. This can be achieved by a formula that finds the second-last reading of the specific player and subtracts it from that player's last/current reading.
Date        Player      Total  Daily
17/10/2021  Player 001  1500   1500
17/10/2021  Player 007  700    700
19/10/2021  Player 003  700    700
19/10/2021  Player 005  100    100
19/10/2021  Player 004  1100   1100
19/10/2021  Player 006  300    300
19/10/2021  Player 002  900    900
20/10/2021  Player 006  900    600
20/10/2021  Player 006  1600   700
20/10/2021  Player 002  1100   200
20/10/2021  Player 005  600    500
20/10/2021  Player 009  200    200
21/10/2021  Player 001  1600   100
21/10/2021  Player 003  1000   300
I found a very interesting solution, but since it's based on INDIRECT it can't work with ArrayFormula:
https://infoinspired.com/google-docs/spreadsheet/find-the-last-matching-value-in-google-sheets/
I thought about a different approach: using VLOOKUP and limiting the search range to the rows above the current row, then finding the last matching value in that range (which is actually the second-last in the whole table). However, I can't find a syntax that works in ArrayFormula.
Any thoughts?
Try this:
=ARRAYFORMULA(
  IF(
    A2:A = "",,
    C2:C
      - IFNA(VLOOKUP(
          MATCH(
            B2:B,
            UNIQUE(FILTER(B2:B, B2:B <> "")),
          )
            * 10^INT(LOG10(ROWS(A2:A)) + 1)
            + ROW(A2:A) - 1,
          SORT(
            {
              SEQUENCE(COUNTUNIQUE(B2:B)) * {10^INT(LOG10(ROWS(A2:A)) + 1), 0};
              FILTER(
                {
                  MATCH(
                    B2:B,
                    UNIQUE(FILTER(B2:B, B2:B <> "")),
                  )
                    * 10^INT(LOG10(ROWS(A2:A)) + 1)
                    + ROW(A2:A),
                  C2:C
                },
                A2:A <> ""
              )
            },
            1, 1
          ),
          2
        ))
  )
)
I'll offer a tentative solution, with the understanding that it's always difficult to write such a formula without the ability to see some actual data and the expected result.
Let's say your data is in A2:C (with headers in A1:C1). Try the following formula in D2 of an otherwise empty Col D:
=ArrayFormula(IF(A2:A="",,C2:C - (VLOOKUP(B2:B&(A2:A-1), SORT({ {"", 0}; {B2:B&A2:A, C2:C} }), 2, TRUE) * (VLOOKUP(B2:B&(A2:A-1), SORT({ {"", 0}; {B2:B&A2:A, B2:B} }), 2, TRUE) = B2:B))))
To find the second-to-last score per player, VLOOKUP looks up a concatenation of each row's player-and-"yesterday" within a SORTed virtual range containing A.) {null, 0} on top of B.) {a concatenation of each row's player-and-date, score}.
Because of the SORT, a final parameter of TRUE can be used, which means that if an exact match for player-and-"yesterday" is not found, the closest previous match will be returned. The * VLOOKUP(...) is there to make sure the previous match is for the same person (because the alphabetical entry prior to each person's earliest date will be someone else's last date, except for the first person alphabetically, who will bounce back to the {null, 0}).
However, if your sheet will always have at least one blank row below your data, you can simplify a bit:
=ArrayFormula(IF(A2:A="",,C2:C - (VLOOKUP(B2:B&(A2:A-1), SORT({B2:B&A2:A, C2:C}), 2, TRUE) * (VLOOKUP(B2:B&(A2:A-1), SORT({B2:B&A2:A, B2:B}), 2, TRUE) = B2:B))))
This is because the bounce-back for the first alphabetical person's first date will find {null, null} for all blank rows, which is equivalent to {null, 0}, all of which will be SORTed earlier than all of your data. So we don't need to include it in the virtual array setup.
If the result is not as expected, please share a minimal set of realistic data with the expected results.
ADDENDUM (per additional comment from OP):
If a player may enter more than one score per day, you can use the formula versions below.
If you're not sure you'll always have at least one blank row below your data:
=ArrayFormula(IF(A2:A="",,C2:C - (VLOOKUP(B2:B&TEXT(ROW(B2:B)-1,"0000"), SORT({ {"", 0}; {B2:B&TEXT(ROW(B2:B),"0000"), C2:C} }), 2, TRUE) * (VLOOKUP(B2:B&TEXT(ROW(B2:B)-1,"0000"), SORT({ {"", 0}; {B2:B&TEXT(ROW(B2:B),"0000"), B2:B} }), 2, TRUE) = B2:B))))
If you are sure you will always have at least one blank row below your data:
=ArrayFormula(IF(A2:A="",,C2:C - (VLOOKUP(B2:B&TEXT(ROW(B2:B)-1,"0000"), SORT( {B2:B&TEXT(ROW(B2:B),"0000"), C2:C} ), 2, TRUE) * (VLOOKUP(B2:B&TEXT(ROW(B2:B)-1,"0000"), SORT( {B2:B&TEXT(ROW(B2:B),"0000"), B2:B} ), 2, TRUE) = B2:B))))
Both of the above substitute row number for date. They assume, then, that your data will always be entered in the order they occurred in real time, not randomly (i.e., that you will not enter an earlier date's score after a later date's score). If you will potentially enter things out of order, this can also be controlled for; but I haven't done so here.
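For checking either formula against the sample table, the intended result can be sketched in a few lines of Python. This is only an illustration of the target logic (current total minus the same player's previous total, or minus 0 for a first reading), not a translation of the formulas. It assumes, as the row-number versions above do, that readings are listed in the order they occurred; the function name `daily_scores` is made up for this example.

```python
def daily_scores(readings):
    """readings: list of (date, player, total) tuples in entry order.

    Returns one daily score per reading: the total minus that player's
    previous total, or the total itself for the player's first reading.
    """
    last_total = {}  # most recent total seen so far for each player
    daily = []
    for _date, player, total in readings:
        daily.append(total - last_total.get(player, 0))
        last_total[player] = total
    return daily

# Sample data from the question.
readings = [
    ("17/10/2021", "Player 001", 1500), ("17/10/2021", "Player 007", 700),
    ("19/10/2021", "Player 003", 700), ("19/10/2021", "Player 005", 100),
    ("19/10/2021", "Player 004", 1100), ("19/10/2021", "Player 006", 300),
    ("19/10/2021", "Player 002", 900), ("20/10/2021", "Player 006", 900),
    ("20/10/2021", "Player 006", 1600), ("20/10/2021", "Player 002", 1100),
    ("20/10/2021", "Player 005", 600), ("20/10/2021", "Player 009", 200),
    ("21/10/2021", "Player 001", 1600), ("21/10/2021", "Player 003", 1000),
]
print(daily_scores(readings))
```

The output reproduces the Daily column of the question's table, including the two same-day readings for Player 006 (600, then 700).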

Countifs with array formula to collect data

I have raw data that contains four types of defect (EC, VC, BC, NC).
The first type of defect calculation applies to EC, VC, and BC.
For example, to collect the EC defects, I count every row that contains EC = 0. If EC is scored down to 0 twice in the same row, it still counts as only 1 EC for that row.
The same goes for VC and BC.
The second type of defect calculation applies to NC.
It works the same way as above, except that if the NC defect is repeated in the same row, the occurrences are summed.
The headers VC, EC, BC, and NC sit above each column containing a score.
What I need is to calculate the defects for each row.
I tried the formula below for VC, EC, and BC.
=ARRAYFORMULA(IF(ROW(E2:E)=2,"EC",IF(LEN(E2:E)=0,IFERROR(1/0),IF(COUNTIFS($E$1:$BH$1,"EC",$E:$BH,0)>=1,1,0))))
The formula works without ARRAYFORMULA, but dragging it down every time the data is updated is impractical.
I also tried the formula below for NC.
=ARRAYFORMULA(IF(ROW(H2:H)=2,"NC",IF(LEN(H2:H)=0,IFERROR(1/0),COUNTIFS($E$1:$BH$1,"NC",$E:$BH,0))))
Sample Sheet: Test Sheet
={"EC"; ARRAYFORMULA(IF(LEN(TRIM(FLATTEN(QUERY(TRANSPOSE(IF(FILTER(
INDIRECT("E3:"&ADDRESS(MAX((E:Z<>"")*(ROW(A:A))), COLUMNS(1:1))),
E1:1="EC")=0, 1, )),,9^9))))>0, 1, 0))}
={"VC"; ARRAYFORMULA(IF(LEN(TRIM(FLATTEN(QUERY(TRANSPOSE(IF(FILTER(
INDIRECT("E3:"&ADDRESS(MAX((E:Z<>"")*(ROW(A:A))), COLUMNS(1:1))),
E1:1="VC")=0, 1, )),,9^9))))>0, 1, 0))}
={"BC"; ARRAYFORMULA(IF(LEN(TRIM(FLATTEN(QUERY(TRANSPOSE(IF(FILTER(
INDIRECT("E3:"&ADDRESS(MAX((E:Z<>"")*(ROW(A:A))), COLUMNS(1:1))),
E1:1="BC")=0, 1, )),,9^9))))>0, 1, 0))}
={"NC"; ARRAYFORMULA(MMULT(IF(FILTER(
INDIRECT("E3:"&ADDRESS(MAX((E:Z<>"")*(ROW(A:A))), COLUMNS(1:1))),
E1:1="NC")=0, 1, 0), SEQUENCE(COUNTIF(E1:1, "NC"))^0))}
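The counting rule those four formulas implement can be sketched in Python for a single row. This is an illustrative sketch only: the header layout and scores below are hypothetical, and the function name `defect_counts` is made up. EC, VC, and BC are capped at 1 per row, while NC counts every zero.

```python
def defect_counts(headers, row):
    """Count defects in one data row.

    headers: the defect label above each score column ("EC", "VC", "BC", "NC").
    row: the scores in that row; a score of 0 marks a defect.
    """
    zero_hits = {"EC": 0, "VC": 0, "BC": 0, "NC": 0}
    for label, score in zip(headers, row):
        if label in zero_hits and score == 0:
            zero_hits[label] += 1
    # EC/VC/BC: a repeated defect still counts once; NC: repeats are summed.
    return {label: hits if label == "NC" else min(hits, 1)
            for label, hits in zero_hits.items()}

# Hypothetical header row and one data row.
headers = ["EC", "VC", "EC", "NC", "BC", "NC", "NC"]
row = [0, 5, 0, 0, 3, 0, 1]
counts = defect_counts(headers, row)
print(counts)
```

Here EC is scored down to 0 twice but counts once, and NC is scored down to 0 twice and counts twice.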

SUMIFs multiple criteria is not working consistently

I am using Google Sheets with the spreadsheet shown below.
I want to Sum the 'Amount' column
IF the Key in column J == the key in column B
AND The Assigned person == the actual person.
So, where the key is 2, we'd have a subset of 7 items. From that the assigned person is Sally and four entries match, our total would therefore be the sum of those matching values which are 20, 10, 2, 4 giving a sum of 36.
In K3, we can correctly see the sum of 36.
The formula I used in that cell is:
=SUMIFS(H:H,B:B,J3,G:G,D:D)
The cell below has the formula:
=SUMIFS(H:H,B:B,J4,G:G,D:D)
So, that should, I believe sum the values 3,8 and 4 since the key (3) in column J matches three items in column B. In each case Mike is the assigned and actual person, which means we should be summing 3, 8 and 4. However, the value as you can see is 0.
Any ideas what I'm doing wrong, please?
You can also do this with a single formula in Google Sheets:
=query(B2:H," select B,sum(H) where D=G and B is not null group by B label sum(H) ''")
Use SUMPRODUCT:
=SUMPRODUCT((B$2:B$13=J2)*(D$2:D$13=G$2:G$13)*H$2:H$13)
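Both answers compare columns D and G row by row, which a range passed as a SUMIFS criterion does not do. The intended calculation can be sketched in Python as follows. The rows are hypothetical, built only so that the matching amounts are the ones quoted in the question (20, 10, 2, 4 for key 2 and 3, 8, 4 for key 3); which of columns D and G holds the assigned person is also an assumption, and the function name `sum_matching` is made up.

```python
def sum_matching(rows, key):
    """Sum the amounts of rows whose key matches and whose two person columns agree.

    rows: list of (key, assigned, actual, amount) tuples.
    """
    return sum(amount
               for row_key, assigned, actual, amount in rows
               if row_key == key and assigned == actual)

# Hypothetical rows: 7 items for key 2 (4 of them Sally/Sally), 3 items for key 3.
rows = [
    (2, "Sally", "Sally", 20), (2, "Sally", "Mike", 7),
    (2, "Sally", "Sally", 10), (2, "Sally", "Bob", 5),
    (2, "Sally", "Sally", 2), (2, "Sally", "Mike", 9),
    (2, "Sally", "Sally", 4),
    (3, "Mike", "Mike", 3), (3, "Mike", "Mike", 8), (3, "Mike", "Mike", 4),
]
print(sum_matching(rows, 2), sum_matching(rows, 3))
```

The per-row test `assigned == actual` plays the role of `D=G` in the QUERY answer and `(D$2:D$13=G$2:G$13)` in the SUMPRODUCT answer.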

Google sheet - compare items based on another dataset and get the one with max value

I have a series of words in a column A, each associated with a certain number/score.
Below this table of words and scores, I have rows containing some of these words; for each row I run a contest and must return the item with the highest score.
Let's make this simple with this example:
My question is about getting the blue value in E8. That is: how do I create a formula which takes the contenders on line 8 ("word4 word5", "word1 word2", and "word2 word6"), looks each of them up in column A to find its associated score, and then puts the name with the highest score in E8?
Note that D7, which is "word2 word6", needs special attention because there won't be a match for it in column A.
Below you'll see the structure of my data and table. Note that I need to keep the comparison between strings/words on line 8 (and below) inside columns B, C, D and E.
=VLOOKUP(MAX(ARRAYFORMULA(IFERROR(VLOOKUP(TRANSPOSE(B8:D8),
A2:F7, 6, 0), ))), {ARRAYFORMULA(IFERROR(VLOOKUP(TRANSPOSE(B8:D8),
A2:F7, 6, 0), )), TRANSPOSE(B8:D8)}, 2, 0)
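The formula above looks up each contender's score, ignores contenders with no match, and returns the name with the highest score. A minimal Python sketch of that logic, with hypothetical scores (the question's actual values are not shown in the text) and a made-up function name `best_contender`:

```python
def best_contender(scores, contenders):
    """Return the contender with the highest score, ignoring unknown names.

    scores: dict mapping a name to its score (the lookup table).
    contenders: names to compare; names missing from scores are skipped.
    Returns None when no contender has a score.
    """
    known = [name for name in contenders if name in scores]
    if not known:
        return None
    return max(known, key=lambda name: scores[name])

# Hypothetical score table; "word2 word6" deliberately has no entry.
scores = {"word1 word2": 30, "word4 word5": 12, "word3": 50}
contenders = ["word4 word5", "word1 word2", "word2 word6"]
print(best_contender(scores, contenders))
```

Skipping missing names corresponds to the IFERROR wrapper around the inner VLOOKUP in the formula.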

Take the average of the smallest 10 numbers of the last 20 entries

I am setting up a golf index calculator and I need help taking the last 20 entries for an average. The formula is supposed to take the average of the smallest 10 numbers of the last 20 games played. So far all I have is:
average(small(i2:i21, 10))
I would rather not change the row numbers every time I add a new entry.
The small function returns one element from a range - in your case, the 10th smallest element, not the 10 smallest elements. This doesn't help much here. For your purpose, the combination of sort (sort in increasing or decreasing order) with array_constrain (keep only a given number of elements) works well.
=average(array_constrain(sort(array_constrain(sort(filter({I2:I, row(I2:I)}, len(I2:I)), 2, false), 20, 1)), 10, 1))
or with linebreaks
=average(
array_constrain(
sort(
array_constrain(
sort(
filter({I2:I, row(I2:I)}, len(I2:I)),
2, false),
20, 1)
),
10, 1)
)
The array {I2:I, row(I2:I)} contains row numbers in the second column. Keeping only nonempty entries in I column, we sort by the row numbers in descending order. Then keep only the first 20 entries from I column. Sort again (by default: increasing), and keep 10 entries. Finally, average is taken.
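The same steps (take the last 20 entries, sort ascending, keep 10, average) can be sketched in Python to sanity-check the formula's result. This is an illustration only; the function name `average_of_lowest` and the sample scores are hypothetical.

```python
def average_of_lowest(scores, window=20, keep=10):
    """Average the `keep` smallest values among the last `window` scores.

    scores: non-empty list of numbers in entry order (oldest first).
    With fewer than `window` or `keep` entries, whatever is available is used.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    recent = scores[-window:]        # last 20 entries
    lowest = sorted(recent)[:keep]   # 10 smallest of those
    return sum(lowest) / len(lowest)

# Hypothetical scores 1..30: the last 20 are 11..30, the lowest 10 are 11..20.
scores = list(range(1, 31))
print(average_of_lowest(scores))
```

Because the slice is taken from the end of the list, adding a new score shifts the window automatically, which is the behaviour the question asks for.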

Resources