How to sum the maximum value of each column in Google Spreadsheets? - google-sheets

I have a Google Spreadsheet of numbers. How do I take the maximum value from each column, and summarize them using only one formula? (No temp cells, no scripts.)
1 2 1
0 1 3
0 2 0
For the table above the result should be 6 (1+2+3, the maximum value of each column). But I'd like a solution that works for much larger tables, too.
As a more general question, I'd like to find out how I could fold 2D ranges into 1D arrays using an arbitrary operator (like MAX and SUM in this case).

Assuming your data in range A2:D, to get the maximum of every row (array output) try
=query(transpose(query(if(row(A2:D)>=transpose(row(A2:D)),transpose( A2:D)),"select max(Col1),max(Col2),max(Col3),max(Col4) ",0)),"Select Col2", 0)
If you need to process a lot of columns, this may be better
=ArrayFormula(QUERY(TRANSPOSE(QUERY(TRANSPOSE( A2:D) , "Select "&"MAX(Col"&JOIN( ", MAX(Col",ROW(INDIRECT( "YY1:YY"&ROWS(A2:A)))&")"))), "Select Col2", 0))
To sum, just wrap SUM() around the above formulas.

MAX by columns in A1:C3
=INDEX(QUERY({A1:C3},"Select "&"MAX(Col"&JOIN(", MAX(Col",SEQUENCE(COLUMNS(A1:C3))&")"),0),2)
MAX by rows in A1:C3
=TRANSPOSE(INDEX(QUERY(TRANSPOSE(A1:C3),"Select "&"MAX(Col"&JOIN(", MAX(Col",SEQUENCE(ROWS(A1:C3))&")"),0),2))
Substitute MAX with MIN to get the minimums.

Related

Google Sheets: Find minimum sum in an ordered array

My goal is to find the smallest sum of 1 entry per column in this array.
i.e. Though column 9 has the lowest value for all rows, only one value can be used, and same for all the other columns, and I want to know what combination will produce the smallest sum.
Is there any way of doing this in Google Sheets?
Thank you!
Use this formula
=INDEX(QUERY({A3:I},"Select "&
TEXTJOIN(", ",1,ARRAYFORMULA("min(Col"&COLUMN(A3:I)&")"))),2)

Using ARRAYFORMULA to Calculate Running Total of Payables (Alternative to INDIRECT)

I use a Google Spreadsheet to keep track of the accounts payable per vendor. There is a sheet per vendor in the Spreadsheet. A simplified sheet looks like this:
When I receive a new invoice, an entry for the amount is made in the Credit column and when I release a payment, an entry for the amount is made in the Debit column. I keep track of the running total in the AC Payable column. I achieve this by using a formula in each cell of the AC Payable column (the example below is from cell E4):
=IF(
ISNUMBER(INDIRECT(ADDRESS(ROW()-1,COLUMN()))),
INDIRECT(ADDRESS(ROW()-1,COLUMN()))+C4-D4,
C4-D4
)
The logic is simple. The running total for row n is calculated by:
AC Payable(n - 1) + Credit(n) - Debit(n)
This setup works fine, except I have to drag the formula into newly added rows. Is there a way to achieve this by using ARRAYFORMULA?
PS: I have found a solution using:
= ARRAYFORMULA(
SUMIF(
ROW(C3:C),
"<="&ROW(C3:C),
C3:C)
-
SUMIF(
ROW(D3:D),
"<="&ROW(D3:D),
D3:D
)
)
I feel this is a suboptimal (The original sheet dates back to 2018. It has a lot of rows) solution since, in every row, it calculates the total of the Debit and Credit columns up to the current row and then subtracts the total of the Debit column from the total of the Credit column.
I am expecting a solution that would take advantage of the running total available in the previous row and not redo the whole calculation per row.
solution for up to 1581 rows:
=ARRAYFORMULA(QUERY(QUERY(MMULT(TRANSPOSE((SEQUENCE(COUNTA(A3:A)*2)<=
SEQUENCE(1, COUNTA(A3:A)*2))*FLATTEN(INDIRECT("C3:D"&COUNTA(A3:A)+ROW(A3)-1)*{1, -1})),
SEQUENCE(COUNTA(A3:A)*2, 1, 1, 0)), "offset 1", ), "skipping 2", ))
skills:
it's fast
it's smart
gets slower more rows you add
dies after 1581 rows
it's based on standard MMULT Running/Cumulative Total/Sum formula:
=ARRAYFORMULA(MMULT(TRANSPOSE((ROW(B1:B6)
<=TRANSPOSE(ROW(B1:B6)))*B1:B6), SIGN(B1:B6)))
but with a modification twist, because you got 2 columns to total
instead of ROW(B1:B6) we use a sequence of count of real data multiplied by two (because you got 2 columns):
SEQUENCE(COUNTA(A3:A)*2)
instead of TRANSPOSE(ROW(B1:B6)) we use again:
SEQUENCE(1, COUNTA(A3:A)*2)
combination of these pieces:
=ARRAYFORMULA(TRANSPOSE((SEQUENCE(COUNTA(A3:A)*2)<=SEQUENCE(1, COUNTA(A3:A)*2))))
will produce a matrix like:
and that's the reason why it dies with lots of rows because while you may think that if you have only 1500 rows in two columns, then formula will work only on 1500*2=3000 virtual cells, but in fact the MMULT formula processes (1500*2)*(1500*2)=9000000 virtual cells. still, it's worth to note, that this MMULT fx is great if deployed on a small scale.
next, instead of *B1:B6 we use:
*FLATTEN(INDIRECT("C3:D"&COUNTA(A3:A)+ROW(A3)-1)*{1, -1}))
eg. with INDIRECT we take only "valid" range of C3:D which is in your example sheet just C3:D5 and we multiply C column by 1 and D column by -1 to simulate subtraction and then we FLATTEN both columns into one single column. the part +ROW(A3)-1 is just an offset because you start from row 3
and the last part of standard RT fx - SIGN(B1:B6) is replaced with one column full of ones:
SEQUENCE(COUNTA(A3:A)*2, 1, 1, 0)
then we offset the output with inner QUERY by 1 because we are interested in a totals after subtraction and finally we use skipping 2 which means that we filter out every second value - again, we are interested in totals after subtraction of D column.
solution for more than 1581 rows:
=ARRAYFORMULA(
SUMIF(SEQUENCE(COUNTA(A3:A)), "<="&SEQUENCE(COUNTA(A3:A)), INDIRECT("C3:C"&COUNTA(A3:A)))-
SUMIF(SEQUENCE(COUNTA(A3:A)), "<="&SEQUENCE(COUNTA(A3:A)), INDIRECT("D3:D"&COUNTA(A3:A))))
skills:
supports more rows
looks less smart
sadly the third argument of SUMIF always needs to be a range
gets slower with more rows
it will get sick if you feed it with 10000 rows
it may kill off your sheet with 11000+ rows
Here'a modification of Ben Collins' running total formula
=ARRAYFORMULA(
IF(ISBLANK(A2:A),,
MMULT(TRANSPOSE((ROW(C2:C)<=TRANSPOSE(ROW(C2:C)))*C2:C),SIGN(C2:C))-
MMULT(TRANSPOSE((ROW(D2:D)<=TRANSPOSE(ROW(D2:D)))*D2:D),SIGN(D2:D))))
yet another alternative to MMULT:
=INDEX(QUERY(FLATTEN(QUERY(QUERY(TRANSPOSE(QUERY(QUERY(TRANSPOSE(
(SEQUENCE(COUNTA(A3:A)*2)<=SEQUENCE(1, COUNTA(A3:A)*2))*
FLATTEN(INDIRECT("C3:D"&COUNTA(A3:A)+ROW(A3)-1)*{1, -1})),
"offset 1", ), "skipping 2", )), "select "&QUERY(
"sum(Col"&SEQUENCE(COUNTA(A3:A))&"),",, 9^9)&"' '"),
"offset 1", )), "where Col1 is not null", ))
but again, LTE (<=) limitation of 10M cells won't let you use more than 1581 rows in your case or 3162 rows in the standard cumulative sum case
(1581 rows * 2 columns) raised on 2nd power < 10 million cells
(1581*2)^2 = 9998244

Array of different sizes

I am using this formula and I am getting some troubles in there:
=ARRAYFORMULA(IF(A2:A="","",IF($B$2:$B>=Helper!$B$2:$B,1,0)))
B2:B Column Main sheet contains almost 30k rows with values and Helper2 B column sheet has only 60 values but the sheet has 1000 rows.
I am checking whether the values in B (Main sheet) are greater or equals to what I have in the Helper B column. But I am only getting the results for the first 1000 rows in the main sheet.
can anyone please tell me what is the workaround for this?
ERROR is: Array arguments to GTE are of different sizes.
Thanks
Here is the link to the Spreadsheet
https://docs.google.com/spreadsheets/d/1hfXCFTp_vLqPFjazM-Vo6-3o8J1td1zpnCJOeK1LBg0/edit#gid=511305173
Main sheet
Helper2 Sheet
The issue is that both sheets have unequal number of rows. (GTE in the error means greater than equal which is the conditional you used)
Easiest fix to eliminate the error is to have the same number of rows from both sheets. If your Main sheet have 30000 and Helper2 have 1000, then add 29000 to your Helper2 sheet.
But if you want to limit your ARRAYFORMULA and only compare until row 1000, then limit the range. Use the formula below:
=ARRAYFORMULA(IF(A2:A1000="","",IF($B$2:$B>=Helper2!$B$2:$B,1,0)))
Or use ARRAY_CONSTRAIN to limit the result of your ARRAYFORMULA
=ARRAY_CONSTRAIN(ARRAYFORMULA(IF(A2:A="","",IF($B$2:$B>=Helper2!$B$2:$B,1,0))), 999, 1)
Limits to 999 rows (since you started at row 2, 1000th row in sheets should be the 999th row of ARRAYFORMULA result)
EDIT:
Try this one:
=arrayformula(if(A2:A="","",if((B2:B <= vlookup(A2:A, Helper2!$A$2:$B, 2)) * (C2:C >= vlookup(A2:A, Helper2!$A$2:$B, 2)) > 0 = TRUE, 1, 0)))
Note:
Your data has issue with your Artem name, not sure why. Thus if Artem is found, blank is returned.
Tried changing all "Artem" to "John" and the formula works seamlessly.

Compute subranks in spreadsheet column in combination with ArrayFormula (Google Sheets)

I'm trying to find the inverse rank within categories using an ArrayFormula. Let's suppose a sheet containing
A B C
---------- -----
1 0.14 2
1 0.26 3
1 0.12 1
2 0.62 2
2 0.43 1
2 0.99 3
Columns A:B are input data, with an unknown number of useful rows filled-in manually. A is the classifier categories, B is the actual measurements.
Column C is the inverse ranking of B values, grouped by A. This can be computed for a single cell, and copied to the rest, with e.g.:
=1+COUNTIFS($B$2:$B,"<" & $B2, $A$2:$A, "=" & $A2)
However, if I try to use ArrayFormula:
=ARRAYFORMULA(1+COUNTIFS($B$2:$B,"<" & $B2:$B, $A$2:$A, "=" & $A2:$A))
It only computes one row, instead of filling all the data range.
A solution using COUNT(FILTER(...)) instead of COUNTIFS fails likewise.
I want to avoid copy/pasting the formula since the rows may grow in the future and forgetting to copy again could cause obscure miscalculations. Hence I would be glad for help with a solution using ArrayFormula.
Thanks.
I don't see a solution with array formulas available in Sheets. Here is an array solution with a custom function, =inverserank(A:B). The function, given below, should be entered in Script Editor (Tools > Script Editor). See Custom Functions in Google Sheets.
function inverserank(arr) {
arr = arr.filter(function(r) {
return r[0] != "";
});
return arr.map(function(r1) {
return arr.reduce(function(rank, r2) {
return rank += (r2[0] == r1[0] && r2[1] < r1[1]);
}, 1);
});
}
Explanation: the double array of values in A:B is
filtered, to get rid of empty rows (where A entry is blank)
mapped, by the function that takes every row r1 and then
reduces the array, counting each row (r2) only if it has the same category and smaller value than r1. It returns the count plus 1, so the smallest element gets rank 1.
No tie-breaking is implemented: for example, if there are two smallest elements, they both get rank 1, and there is no rank 2; the next smallest element gets rank 3.
Well this does give an answer, but I had to go through a fairly complicated manoeuvre to find it:
=ArrayFormula(iferror(VLOOKUP(row(A2:A),{sort({row(A2:A),A2:B},2,1,3,1),row(A2:A)},4,false)-rank(A2:A,A2:A,true),""))
So
Sort cols A and B with their row numbers.
Use a lookup to find where those sorted row numbers now are: their position gives the rank of that row in the original data plus 1 (3,4,2,6,5,7).
Return the new row number.
Subtract the rank obtained just by ranking on column A (1,1,1,4,4,4) to get the rank within each group.
In the particular case where the classifiers (col A) are whole numbers and the measurements (col B) are fractions, you could just add the two columns and use rank:
=ArrayFormula(iferror(rank(A2:A+B2:B,if(A2:A<>"",A2:A+B2:B),true)-rank(A2:A,A2:A,true)+1,""))
My version of an array formula, it works when column A contains text:
=ARRAYFORMULA(RANK(ARRAY_CONSTRAIN(VLOOKUP(A1:A,{UNIQUE(FILTER(A1:A,A1:A<>"")),ROW(INDIRECT("a1:a"&COUNTUNIQUE(A1:A)))},2,)*1000+B1:B,COUNTA(A1:A),1),ARRAY_CONSTRAIN(VLOOKUP(A1:A,{UNIQUE(FILTER(A1:A,A1:A<>"")),ROW(INDIRECT("a1:a"&COUNTUNIQUE(A1:A)))},2,)*1000+B1:B,COUNTA(A1:A),1),1) - COUNTIF(A1:A,"<"&OFFSET(A1,,,COUNTA(A1:A))))

How to sum largest $n$ values in a range in Google Spreadsheet?

I have a list of values and I need to sum the largest 10 values (in a row). I found this but I can't figure it out/get it to work:
https://productforums.google.com/forum/#!topic/docs/A5jiMqkRLYE
let's say you want to sum the 10 highest values of the range E2:EP
then try:
=sumif(E2:P2, ">="&large(E2:P2,10))
and see if that works ?
EDIT: Maybe this is a better option ? This will only sum the 10 outputted by the array_constrain. Will only work in the new google sheets, though..
=sum(array_constrain(sort(transpose($A3:$O3), 1, 0), 10 ,1))
Can you see if this works ?
This works in old google sheets too:
sum(query(sort(transpose($A3:$O3), 1, false), "select * limit 10"))
Transpose puts the data in a column, sort sorts the data in a descending order and then query selects first 10 numbers.
Unfortunately, replacing sort with "order by" in a query statement does not work, because you can not reference a column in a range returned by transpose.
The sortn function seems to be just what you need.
From the documentation linked above, it "[r]eturns the first n items in a data set after performing a sort." The data set does not have to be sorted. It takes a bunch of optional parameters as it can sort on multiple columns.
SORTN(range, [n], [display_ties_mode], [sort_column1, is_ascending1], ...)
The interesting ones for your case are n, sort_column1, and is_ascending1. Specifically, your required formula would be
sum(sortn(transpose(A3:O3), 10, 0, 1, false)))
Some notes:
This assumes your data in A3:O3. You can replace it with your range.
transpose converts the data row to a data column as required by sortn.
10 is n, indicating the number of values that you require.
0 is the value for display_ties_mode. We are ignoring this value.
1 is the value of sort_column1, telling that we want to sort the first column (after transpose).
false tells sortn to sort descending and thus pick the largest values. The default is to pick the smallest.

Resources