I have a list of values and I need to sum the largest 10 values (in a row). I found this but I can't figure it out/get it to work:
https://productforums.google.com/forum/#!topic/docs/A5jiMqkRLYE
let's say you want to sum the 10 highest values of the range E2:EP
then try:
=sumif(E2:P2, ">="&large(E2:P2,10))
and see if that works ?
EDIT: Maybe this is a better option ? This will only sum the 10 outputted by the array_constrain. Will only work in the new google sheets, though..
=sum(array_constrain(sort(transpose($A3:$O3), 1, 0), 10 ,1))
Can you see if this works ?
This works in old google sheets too:
sum(query(sort(transpose($A3:$O3), 1, false), "select * limit 10"))
Transpose puts the data in a column, sort sorts the data in a descending order and then query selects first 10 numbers.
Unfortunately, replacing sort with "order by" in a query statement does not work, because you can not reference a column in a range returned by transpose.
The sortn function seems to be just what you need.
From the documentation linked above, it "[r]eturns the first n items in a data set after performing a sort." The data set does not have to be sorted. It takes a bunch of optional parameters as it can sort on multiple columns.
SORTN(range, [n], [display_ties_mode], [sort_column1, is_ascending1], ...)
The interesting ones for your case are n, sort_column1, and is_ascending1. Specifically, your required formula would be
sum(sortn(transpose(A3:O3), 10, 0, 1, false)))
Some notes:
This assumes your data in A3:O3. You can replace it with your range.
transpose converts the data row to a data column as required by sortn.
10 is n, indicating the number of values that you require.
0 is the value for display_ties_mode. We are ignoring this value.
1 is the value of sort_column1, telling that we want to sort the first column (after transpose).
false tells sortn to sort descending and thus pick the largest values. The default is to pick the smallest.
Related
This is a similar problem to Using ARRAYFORMULA with SUMIF for multiple conditions combined with a wildcard to select all values for a given condition, but trying to get the sum of the first column by all months of a given year. I was trying to extrapolate the same solution provided by #player0, but group by doesn't work because I would like to get a result for each month of the year, regardless of the month's data on the source. Here is the sample:
In column G we have different filter conditions and it can include a special token: ALL to include all values for each criterion. For the year we can select the given year for which we want to sum the result by each month of the year.
Column I has the expected result using the similar idea of the referred question, but it cannot be expressed in an ArrayFormula (query result doesn't expand):
=IFNA(QUERY(QUERY(FILTER($A$2:$E,
IF($G$2="All", $B$2:$B<>"×", $B$2:$B=$G$2),
IF($G$4="All", $C$2:$C<>"×", $C$2:$C=$G$4),
IF($G$6="All", $D$2:$D<>"×", $D$2:$D=$G$6),
YEAR($E$2:$E) = $G$8
),
"select sum(Col1) where month(Col5) =" & MONTH($H2) - 1),
"offset 1", 0),"")
On column J the array try doesn't work because we cannot use the virtual range with SUMIF it can be resolved by creating auxiliary columns with the FILTER, but I am looking for a solution that doesn't require that.
=Arrayformula(if(not(ISBLANK(H2:H)), sumif(
Filter(FILTER($A$2:$E, IF($G$3="All", $B$2:$B<>"×",
$B$2:$B=$G$3), IF($G$5="All", $C$2:$C<>"×", $C$2:$C=$G$5),
IF($G$7="All", $D$2:$D<>"×", $D$2:$D=$G$7),
YEAR($E$2:$E) = $G$9),{1,0,0,0,0}), H2:H,
Filter(FILTER($A$2:$E, IF($G$3="All", $B$2:$B<>"×",
$B$2:$B=$G$3), IF($G$5="All", $C$2:$C<>"×", $C$2:$C=$G$5),
IF($G$7="All", $D$2:$D<>"×", $D$2:$D=$G$7),
YEAR($E$2:$E) = $G$9), {0,0,0,0,1})
),))
Here is the sample spreadsheet:
https://docs.google.com/spreadsheets/d/1Lqjim3c_j8KNr_7LVlKjlR4BXi9_1HFoGDuwsuX1XS0/edit?usp=sharing
try:
=INDEX(IFNA(VLOOKUP(MONTH(H2:H13), QUERY(QUERY(FILTER(A2:E,
IF(G3="All", B2:B<>"×", B2:B=G3),
IF(G5="All", C2:C<>"×", C2:C=G5),
IF(G7="All", D2:D<>"×", D2:D=G7)),
"select month(Col5)+1,sum(Col1)
where Col1 is not null "&IF(G9="",,"
and year(Col5)="&G9)&
"group by month(Col5)+1"),
"offset 1", 0), 2, 0)))
This formula is in cell L2 of your sample Sheet1. It is a sumif() that uses a regex and MMULT to create an array of 3's where the conditions are met and &'s it with the month for the 'sumif' criterium.
=ARRAYFORMULA(SUMIF(MMULT(N(REGEXMATCH(B2:D,SUBSTITUTE({G3,G5,G7},"ALL",))),
{1;1;1})&EOMONTH(E2:E,-1)+1,3&H2:H13,A2:A))
Detail Explanation
(The formula, then in the following lines the explanation)
SUBSTITUTE({G3,G5,G7},"ALL",)
Generate a 1x3 array with G3,G5,G7 values, if "ALL" then will be replaced by empty string. In case of ("ALL", ALL", "ALL") returns: (,,). The result of SUBSTITUTE will be the regular expression to be used in REGEXMATCH. In case of ("ALL", ALL", "ALL") REGEXMATCH will return (TRUE, TRUE, TRUE) on each row.
REGEXMATCH(B2:D,SUBSTITUTE({G3,G5,G7},"ALL",))
Return an array of the same size as B2:D (let's say Mx3 array) where the condition are met (TRUE|FALSE) values and the function N()converts it into (0|1) values.
MMULT(N(REGEXMATCH(B2:D,SUBSTITUTE({G3,G5,G7},"ALL",))),
{1;1;1})
Multiplies two matrixes: Mx3 x 3x1 and returns an array Mx1. If all filters conditions are satisfied will return on a given cell the number 3 (we have three conditions), otherwise a number lower than 3.
EOMONTH(E2:E,-1)+1
Calculate the last day of the previous month of the cells E2:E and add one day resulting the first day of the month of E2:E. We need to compare such dates with the dates in H2:Hthat represent the first day of the month.
SUMIF(range, criterion, [sum_range])
range: will contain an array of Mx1 with one of the values: {0,1,2,3} depending on how many conditions are met. It appends &EOMONTH(E2:E,-1)+1. Remember dates are stored as an integer number.
criterium: 3&H2:H13the reference dates prefixed by number 3.
sum_range: The range A2:A we want to sum based on the range and criterium match, so only rows with 3 values of MMULT will be considered.
Here the result in case of filters have the value ALL:
Here the solution when ALL token is not used:
Note: The only difference with expected result and #player0 solution is that it returns 0when there is no match.
I wanted a ArrayFormula at C1 which gives the required result as shown.
Entry sheet:
(Column C is my required column)
Date Entered is the date when the Name is Assigned a group i.e. a, b, c, d, e, f
Criteria:
The value of count is purely on basis of Date Entered (if john is assigned a on lowest date(10-Jun) then count value is 1, if rose is assigned a on 2nd lowest date(17-Jun) then count value is 2).
The value of count does not change even when the data is sorted in any manner because Date Entered column values is always permanent & does not change.
New entry date could be any date not necessarily highest date (If a new entry with name Rydu is assigned a on 9-Jun then the it's count value will become 1, then john's (10-Jun) will become 2 and so on)
Example:
After I sort the data in any random order say like this:
Random ordered sheet:
(Count value remains permanent)
And when I do New entries in between (Row 4th & 14th) and after last row (Row 17th):
Random Ordered sheet:
(Doesn't matter where I do)
I already got a ArrayFormula which gives the required result:
={"AF Formula1"; ArrayFormula(IF(B2:B="", "", COUNTIFS(B$2:B, "="&B2:B, D$2:D, <"&D2:D)+1))}
I'm not looking for another Arrayformula as solutions. What I want is to know what is wrong in my ArrayFormula? and how do I correct it?
I tried to figure my own ArrayFormula but it's not working:
I got Formula for each cell:
=RANK($D2,FILTER($D$2:$D, $B$2:$B=$B2),1)
I figured out Filter doesn't work with ArrayFormula so I had to take a different approach.
I took help from my previous question answer (Arrayformula at H3) which was similar since in both cases each cell FILTER formula returns more than 1 value. (It was actually answered by player0)
Using the same technique I came up with this Formula which works absolutely fine :
=RANK($D2, ARRAYFORMULA(TRANSPOSE(SPLIT(VLOOKUP($B2, SUBSTITUTE(TRIM(SPLIT(FLATTEN(QUERY(QUERY({$B:$B&"×", $D:$D}, "SELECT MAX(Col2) WHERE Col2 IS NOT NULL GROUP BY Col2 PIVOT Col1", 1),, 9^9)), "×")), " ", ","), 2, 0), ","))), 1)
Now when I tried converting it to ArrayFormula:
($D2 to $D2:$D & $B2 to $B2:$B)
=ARRAYFORMULA(RANK($D2:$D,TRANSPOSE(SPLIT(VLOOKUP($B2:$B, SUBSTITUTE(TRIM(SPLIT(FLATTEN(QUERY(QUERY({$B:$B&"×", $D:$D}, "SELECT MAX(Col2) WHERE Col2 IS NOT NULL GROUP BY Col2 PIVOT Col1", 1),, 9^9)), "×")), " ", ","), 2, 0), ",")), 1))
It gives me an error "Did not find value '' in VLOOKUP evaluation", I figured out that the problem is only in VLOOKUP when I change $B2 to $B2:$B.
I'm sure VLOOKUP works with ArrayFormula, I fail to understand where my formula is going wrong! Please help me correct my ArrayFormula.
Here is the editable sheet link
if I understand correctly, you are trying to "rank" B column based on D column dates in such way that dates are in theoretical ascending order so if you randomize your dataset, the "rank" of each entry would stay same and not change based on the randomness you introduce.
therefore the correct formula would be:
={"fx"; INDEX(IFNA(VLOOKUP(B2:B&D2:D,
{INDEX(SORT({B2:B&D2:D, D2:D}, 2, 1),,1),
IFERROR(1/(1/COUNTIFS(
INDEX(SORT(B2:D, 3, 1),,1),
INDEX(SORT(B2:D, 3, 1),,1), ROW(B2:B), "<="&ROW(B2:B))))}, 2, 0)))}
{"fx"; ...} array of 2 tables (header & actual table) under each other eg. ;
outer shorter INDEX or longer ARRAYFORMULA (doesnt matter which one) is needed coz we are processing an array
IFNA for removing possible #N/A errors from VLOOKUP function when VLOOKUP fails to find a match
we VLOOKUP joint B and D column B2:B&D2:D in our virtual table {} and returning second 2 column if there is an exact match 0
our virtual table {INDEX(SORT({B2:B&D2:D, D2:D}, 2, 1),,1), ...} we VLOOKUP from is constructed with 2 columns next to each other eg. ,
we are getting the first column by creating an array of 2 columns {B2:B&D2:D, D2:D} next to each other where we SORT this array by date/2nd column 2, in ascending order 1 but all we need after sorting is the 1st column so we use INDEX where we bring all rows ,, and the first column 1
now lets take a look on how we getting the 2nd column of our virtual table by using COUNTIFS which will mimic the "rank"
IFERROR(1/(1/ is used to remove all zero values from the output (all empty rows would have 0 in it as the "rank")
under COUNTIFS we put 2 pairs of arguments: "if column is qual to column" and "if row is larger or equal to next row increment it by 1" ROW(B2:B), "<="&ROW(B2:B))
for "if column is qual to column" we do this twice and use range B2:D and sort it by date/3rd column 3 in ascending order 1 and of this we again need only the 1st column so we INDEX it and return all rows ,, and first column 1
with this formula you can add, remove or randomize your dataset and you will always get the right value for each of your rows
as for why your formula doesnt work... to not get #N/A error for vlookup you would need to define the end row of the range but still, the result wont be as you would expect coz formula is not the right one for this job.
as mentioned there are functions that are not supported under AF like SUM,AND,OR and then there are also functions which work but in a different way like IFS or with some limitations like SPLIT,GOOGLEFINANCE,etc.
I have answered you on the tab in your shared sheet called My Practice thusly:
You cannot split a two column array as you have attempted to do in cell CI2. That is why your formula does not work. You can only split a ONE column array.
I understand you are trying to learn, but attempting to use complicated formulas like that is going to make it harder I'm afraid.
I have below data in date-value pair format in Google Sheet:
Date
Value
1/8/2021
1301.85
1/11/2021
1303.9
1/12/2021
1320.05
1/13/2021
1291.55
1/14/2021
1287.45
1/15/2021
1270
I'm looking for a google sheet formula that will return output: highest value along with its associated date.
That means I'm trying to get the output as below:
1/12/2021 1320.05
You can also use the SORTN function.
The difference being, by using an INDEX formula you need to re-format the column A result to a date while using SORTN you don't.
=SORTN(A2:B,1,0,2,0)
How the SORTN function in our formula works
1 is "the number of items to return" (since we need just the max we set it to 1)
0 "Shows at most the first n rows" (this is the so called ties_mode, in our case 0)
2 is "the index of the column containing the values to sort by" which here is the 2nd column
0 "indicates how to sort the sort_column. TRUE (or 1) sorts in ascending order. FALSE (or 0) sorts in descending order".
try this:
=INDEX(SORT(A2:B), 2, 0), 1)
Another approach:
=index(A2:B,match(max(B2:B),B2:B,0))
Like this answer, you don't need to reformat the date.
Below is an example of the values I'm trying to add. My goal is not to use the filter formula for every line again. This is because it takes a lot of resources from the processing power that google sheets uses.
The words in the left list below occur several times in combination with a certain number. I want to have a unique list as shown on the right where the values from the left row add up when the words are equal (without using a filter formula for each line).
This formula
=arrayformula(vlookup(A:A, A:B, 2, FALSE)))
comes very close, but returns only one value even though there are multiple matches. I want all values that matches returned.
try:
=QUERY(A:B, "select A,sum(B) where A is not null group by A label sum(B)'summed'", 1)
-- EDIT #2 -- Updated the Google Sheet again with a solution which is painfully close. Best formula I've had so far is below. --
=ARRAYFORMULA(SPLIT(UNIQUE({ARRAYFORMULA(QUERY(IF(COUNTIF(G4:G&"|||"&H4:H,ARRAYFORMULA(A4:A&"|||"&B4:B))>0,REGEXREPLACE(A4:A&"|||"&B4:B,".*",""),A4:A&"|||"&B4:B),"SELECT * WHERE Col1 IS NOT NULL"));ARRAYFORMULA(QUERY(IF(COUNTIF(G4:G&"|||"&H4:H,ARRAYFORMULA(D4:D&"|||"&E4:E))>0,REGEXREPLACE(D4:D&"|||"&E4:E,".*",""),D4:D&"|||"&E4:E),"SELECT * WHERE Col1 IS NOT NULL"))}),"|||"))
-- EDIT -- Updated the Google Sheet to more closely reflect my use case --
Pretty confident someone's asked this before but I've been Googling for a few hours now and I'm starting to lose hair. I think I've got to use a QUERY function but not 100% on that.
Demo sheet here: https://docs.google.com/spreadsheets/d/1p_hqk9WydcyXQZT4bIm4DSZZnaPKbngtnZ0-laYHwk8/edit?usp=sharing
What I want to do:
I want to combine the ranges under DATA 1 and DATA 2, but I want to exclude and rows which start with the values in DATA 3.
RESULT 1 doesn't add value but shows how I was adding DATA 1 and DATA 2 together.
RESULT 2 shows the result I'm trying to get.
RESULT 3 hidden but where I got to (and doesn't add value again, sorry). I can get it mostly working, but I'd have to manually specify in the QUERY which combinations I'm looking for... and frankly my dataset is HUGE. That formula currently looks like this:
=QUERY(UNIQUE({FILTER(A4:B,NOT(ISBLANK(A4:A)));FILTER(D4:E,NOT(ISBLANK(D4:D)))}),"SELECT * WHERE NOT Col1 STARTS WITH 'a' OR NOT Col2 STARTS WITH 'v'",-1)
Hope someone can help me out! You're my only hope.
This works for your example data. It uses ARRAYFORMULA and MATCH to concatenate the two columns and LEFT to only use the first character in the match function. You might need to find a slightly smarter way to do the starts with element depending on your actual data.
=UNIQUE({FILTER(A4:B,not(isblank(A4:A)),iserror(MATCH(ARRAYFORMULA(A4:A&left(B4:B,1)),ARRAYFORMULA(G4:G&H4:H),0)));FILTER(D4:E,not(isblank(D4:D)),iserror(MATCH(ARRAYFORMULA(D4:D&left(E4:E,1)),ARRAYFORMULA(G4:G&H4:H),0)))})
Documentation:
MATCH: here
ARRAYFORMULA: here
Was also looking for how to find the set-difference between two ranges.
All my values were in one range but I reworked my solution to fit your case of needing to join ranges too.
Finding the set-difference between two ranges:
// given A1:A9 and C1:C9
// return all values in A1:A10 excluding those in C1:C10
=filter(A1:A9, not(iferror( match(A1:A9,C1:C9,0), false )), A1:A9<>"")
Finding the set-difference between two joined ranges and a third range:
// given A1:A9, B1:B9, and C1:C9
// return all values in A1:A10 and B1:B10 excluding those in C1:C10
=filter({A1:A9;B1:B9}, not(iferror( match({A1:A9;B1:B9},C1:C9,0), false )), {A1:A9;B1:B9}<>"")
Breakdown:
Goal is to get an output range that is the input range excluding all values in an "exclusion" range
MATCH: match all values that are in both your input range and your exclusion range (remaining values will produce an error)
NOT + IFERROR : convert matches to false and errors to true
FILTER: filter the input down to only the true values (i.e. whats not matched, a.k.a. remove the excluded values), and also add a 2nd condition to remove blanks
tips:
{X;Y}: unions two ranges, ; adds rows , adds columns
X<>"": true for all non-blank values in range X