ArrayFormula of Average on Infinite Truly Dynamic Range in Google Sheets - google-sheets

as per example:
A B C D E F G ∞
|======|=======|=====|=====|=====|=====|=====|=====
1 | |AVERAGE| | | | | |
|======|=======|=====|=====|=====|=====|=====|=====
2 | xx 1 | | 1 | 2 | 0.5 | 10 | |
|======|=======|=====|=====|=====|=====|=====|=====
3 | xx 2 | | 7 | 1 | | | |
|======|=======|=====|=====|=====|=====|=====|=====
4 | | | 0 | | | | |
|======|=======|=====|=====|=====|=====|=====|=====
5 | xx 3 | | 9 | 8 | 7 | 6 | |
|======|=======|=====|=====|=====|=====|=====|=====
6 | xx 4 | | 0 | 1 | 2 | 1 | |
|======|=======|=====|=====|=====|=====|=====|=====
7 | | | 1 | | 4 | | |
|======|=======|=====|=====|=====|=====|=====|=====
8 | xx 5 | | | | | | |
|======|=======|=====|=====|=====|=====|=====|=====
9 | | | | | | | 5 |
|======|=======|=====|=====|=====|=====|=====|=====
∞ | | | | | | | |
what's the most optimal way of getting AVERAGE for every valid row in the dynamic sense of terms (unknown quantity of rows & unknown quantity of columns) ?
if you are here by accident for running / cumulative / rolling average see:
https://stackoverflow.com/a/59120993/5632629

QUERY
level 1:
if all 5 cells in range C2:G have values:
=QUERY(QUERY(C2:G, "select (C+D+E+F+G)/5"), "offset 1", )
if not, then rows are skipped:
if empty cells are considered as zeros:
=INDEX(QUERY(QUERY({C2:G*1}, "select (Col1+Col2+Col3+Col4+Col5)/5"), "offset 1", ))
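For readers who think better in script than in QUERY strings, here is a rough JavaScript sketch of what "empty cells are considered as zeros" means. The function name is made up for illustration, and blanks are assumed to arrive as empty strings (the way Apps Script `getValues()` returns them).

```javascript
// Hypothetical helper for illustration; not a Google Sheets or Apps Script API.
// Blank cells are represented as "" (as Range.getValues() returns them).
function averageBlanksAsZeros(rows) {
  return rows.map(function (row) {
    // Like C2:G*1: Number("") is 0, so every blank counts as a zero.
    const sum = row.reduce(function (acc, cell) { return acc + Number(cell); }, 0);
    // The divisor is the full column count, not the count of filled cells.
    return sum / row.length;
  });
}

const sampleRows = [
  [1, 2, 0.5, 10, ""],
  [7, 1, "", "", ""],
  ["", "", "", "", ""],
];
console.log(averageBlanksAsZeros(sampleRows)); // [ 2.7, 1.6, 0 ]
```

Note that a fully blank row comes out as 0 here, which is why the zero values need a separate clean-up step.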
to remove zero values we use IFERROR(1/(1/...)) wrapping:
=INDEX(IFERROR(1/(1/QUERY(QUERY({C2:G*1},
"select (Col1+Col2+Col3+Col4+Col5)/5"), "offset 1", ))))
to make Col references dynamic we can do:
=INDEX(IFERROR(1/(1/QUERY(QUERY({C2:G*1},
"select "&
"("&JOIN("+", "Col"&ROW(INDIRECT("1:"&COLUMNS(C:G))))&")/"&COLUMNS(C:G)),
"offset 1", ))))
level 2:
if empty cells are not considered as zeros and shouldn't be skipped:
=INDEX(TRANSPOSE(QUERY(TRANSPOSE(E2:I),
"select "&TEXTJOIN(",", 1, IF(A2:A="",,
"avg(Col"&ROW(A2:A)-ROW(A2)+1&")")))),, 2)
note that this is column A dependent, so missing values in column A will offset the results
fun fact !! we can swap avg to max or min:
to free it from confinement of column A and make it work for any valid row:
=INDEX(IFERROR(1/(1/TRANSPOSE(QUERY(TRANSPOSE(
IF(TRIM(TRANSPOSE(QUERY(TRANSPOSE(C2:G),,9^9)))="", C2:G*0, C2:G)),
"select "&TEXTJOIN(",", 1,
"avg(Col"&ROW(A2:A)-ROW(A2)+1&")"))))),, 2)
if zeros present in the range shouldn't be averaged, we can add a small IF statement:
=INDEX(IFERROR(1/(1/TRANSPOSE(QUERY(TRANSPOSE(
IF(TRIM(TRANSPOSE(QUERY(TRANSPOSE(
IF(C2:G>0, C2:G, )),,9^9)))="", C2:G*0,
IF(C2:G>0, C2:G, ))),
"select "&TEXTJOIN(",", 1,
"avg(Col"&ROW(A2:A)-ROW(A2)+1&")"))))),, 2)
here we used the so-called "vertical query smash", which takes all values in a given range and concentrates them into one single column, where all cells of each row are joined with a space as a byproduct:
=FLATTEN(QUERY(TRANSPOSE(C2:G),,9^9))
apart from this, there is also "horizontal query smash":
=QUERY(C2:G,,9^9)
and also "ultimate 360° double query smash" which puts all cells from range into one single cell:
=QUERY(FLATTEN(QUERY(TRANSPOSE(C2:G),,9^9)),,9^9)
and finally "the infamous negative 360° reverse double query smash" which prioritizes columns over rows:
=QUERY(FLATTEN(QUERY(C2:G,,9^9)),,9^9)
all query smash names are copyrighted of course
back to the topic... as mentioned above, all cells per row in the range are joined with a space, even the empty ones, so we end up with double or multiple spaces between values. to fix this we use TRIM, and we introduce a simple IF statement that assigns 0 values to empty rows in the given range, to counter the offset.
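The smash-then-TRIM idea can be sketched in plain JavaScript. This is only a model of the behaviour described above (cells of a row joined with single spaces, blanks included), and the helper names are hypothetical.

```javascript
// Hypothetical helpers modelling the "vertical query smash"; blanks are "".
function verticalSmash(rows) {
  // Each row collapses to one string; empty cells still contribute a separator.
  return rows.map(function (row) { return row.join(" "); });
}

function emptyRowFlags(rows) {
  // After TRIM, a row made only of blanks is an empty string.
  return verticalSmash(rows).map(function (text) { return text.trim() === ""; });
}

const smashSample = [
  [1, "", 4, "", ""],
  ["", "", "", "", ""],
];
console.log(verticalSmash(smashSample)[0]); // "1  4  " (double spaces where blanks were)
console.log(emptyRowFlags(smashSample));    // [ false, true ]
```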
MMULT
level 3:
MMULT is a kind of heavy-class formula that is able to perform addition, subtraction, multiplication, division and even running totals on arrays/matrices... however, the bigger the dataset, the slower the formula calculation (because in MMULT even empty rows take time to perform the + - × ÷ operations) ...unless we use a truly dynamic range, infinite in both directions...
to get the last row with values of a given range:
=INDEX(MAX(IF(TRIM(FLATTEN(QUERY(TRANSPOSE(
INDIRECT("C2:"&ROWS(A:A))),,9^9)))="",,ROW(A2:A))))
to get the last column with values of a given range:
=INDEX(MAX(IF(TRIM(QUERY(INDIRECT("C2:"&ROWS(A:A)),,9^9))="",,COLUMN(C2:2))))
now we can construct it in a simple way:
=INDIRECT("C2:"&ADDRESS(9, 7))
which is the same as:
=INDEX(INDIRECT("C2:"&ADDRESS(MAX(IF(TRIM(FLATTEN(QUERY(TRANSPOSE(
INDIRECT("C2:"&ROWS(A:A))),,9^9)))="",,ROW(A2:A))),
MAX(IF(TRIM(QUERY(INDIRECT("C2:"&ROWS(A:A)),,9^9))="",,COLUMN(C2:2))))))
or shorter alternative:
=INDEX(INDIRECT("C2:"&ADDRESS(
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*ROW(A2:A)),
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2)))))
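The "last row / last column with values" step is just a bounding-box search. A small JavaScript sketch, assuming the grid starts at C2 (row 2, column 3) and blanks are empty strings; `lastUsedCell` is a made-up name.

```javascript
// Hypothetical helper: finds the last row and column holding any value,
// expressed in sheet coordinates (firstRow/firstCol locate the top-left cell).
function lastUsedCell(rows, firstRow, firstCol) {
  let lastRow = 0;
  let lastCol = 0;
  rows.forEach(function (row, r) {
    row.forEach(function (cell, c) {
      if (cell !== "") {
        // Same idea as MAX((range<>"")*ROW(...)) and MAX((range<>"")*COLUMN(...)).
        lastRow = Math.max(lastRow, firstRow + r);
        lastCol = Math.max(lastCol, firstCol + c);
      }
    });
  });
  return { lastRow: lastRow, lastCol: lastCol };
}

// Rows 2..10 and columns C..H of the example sheet, with trailing blanks.
const grid = [
  [1, 2, 0.5, 10, "", ""],
  [7, 1, "", "", "", ""],
  [0, "", "", "", "", ""],
  [9, 8, 7, 6, "", ""],
  [0, 1, 2, 1, "", ""],
  [1, "", 4, "", "", ""],
  ["", "", "", "", "", ""],
  ["", "", "", "", 5, ""],
  ["", "", "", "", "", ""],
];
console.log(lastUsedCell(grid, 2, 3)); // { lastRow: 9, lastCol: 7 }, i.e. G9
```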
therefore simplified MMULT formula would be:
=ARRAYFORMULA(IFERROR(
MMULT(N( C2:G9), ROW(INDIRECT("C1:C"&COLUMNS(C:G)))^0)/
MMULT(N(IF(C2:G9<>"", 1, )), ROW(INDIRECT("C1:C"&COLUMNS(C:G)))^0)))
in case we want to exclude zero values from range, the formula would be:
=ARRAYFORMULA(IFERROR(
MMULT(N( C2:G9), ROW(INDIRECT("C1:C"&COLUMNS(C:G)))^0)/
MMULT(N(IF(C2:G9>0, 1, )), ROW(INDIRECT("C1:C"&COLUMNS(C:G)))^0)))
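To see why two MMULTs give an average, here is a hedged JavaScript sketch: multiplying by a column of ones produces row sums, and doing the same on a 0/1 flag matrix produces row counts. The function names are illustrative only.

```javascript
// Plain matrix product: a is m x n, b is n x p.
function mmult(a, b) {
  return a.map(function (row) {
    return b[0].map(function (_, j) {
      return row.reduce(function (acc, v, k) { return acc + v * b[k][j]; }, 0);
    });
  });
}

// Hypothetical helper: row averages as sum / count, blanks are "".
function mmultAverage(rows, excludeZeros) {
  const ones = rows[0].map(function () { return [1]; }); // column vector of 1s
  const values = rows.map(function (row) {
    return row.map(function (c) { return typeof c === "number" ? c : 0; }); // like N()
  });
  const flags = rows.map(function (row) {
    return row.map(function (c) {
      const counted = excludeZeros ? c > 0 : c !== "";
      return counted ? 1 : 0;
    });
  });
  const sums = mmult(values, ones);
  const counts = mmult(flags, ones);
  // IFERROR(...) turns the division by zero of an empty row into a blank.
  return sums.map(function (s, i) {
    return counts[i][0] === 0 ? "" : s[0] / counts[i][0];
  });
}

const mmultSample = [
  [1, 2, 0.5, 10, ""],
  [0, "", "", "", ""],
  ["", "", "", "", ""],
];
console.log(mmultAverage(mmultSample, false)); // [ 3.375, 0, '' ]
console.log(mmultAverage(mmultSample, true));  // [ 3.375, '', '' ]
```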
level 4:
putting together all above to make it infinitely dynamic and still restricted to valid dataset:
=INDEX(IFERROR(
MMULT(N( INDIRECT("C2:"&ADDRESS(
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*ROW(A2:A)),
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))))), ROW(INDIRECT("C1:C"&
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))-(COLUMN(C2)-1)))^0)/
MMULT(N(IF(INDIRECT("C2:"&ADDRESS(
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*ROW(A2:A)),
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))))<>"", 1, )), ROW(INDIRECT("C1:C"&
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))-(COLUMN(C2)-1)))^0)))
again, not including cells with zeros in range:
LAMBDA
level 5:
since 20 September 2022, we can use new functions that make stuff easier:
MAKEARRAY
REDUCE
BYROW
BYCOL
SCAN
MAP
LAMBDA
so to jump right in for a closed range we can take an average like:
=IFERROR(BYROW(C2:G9, LAMBDA(x, AVERAGE(x))))
and to get an average column-wise we just replace BYROW with BYCOL. now to make the range open and truly dynamic we can modify the above formula like this:
=IFERROR(BYROW(INDEX(INDIRECT("C2:"&ADDRESS(
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*ROW(A2:A)),
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))))), LAMBDA(x, AVERAGE(x))))
we can do it shorter by 12 characters like:
=IFERROR(BYROW(INDEX(OFFSET(C2,,,
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*ROW(C2:C)),
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*COLUMN(C2:2)))), LAMBDA(x, AVERAGE(x))))
to exclude zeros from the output:
=INDEX(IFERROR(1/(1/BYROW(OFFSET(C2,,,
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*ROW(C2:C)),
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*COLUMN(C2:2))), LAMBDA(x, AVERAGE(x))))))
to exclude zeros from input:
=INDEX(IFERROR(1/(1/BYROW(OFFSET(C2,,,
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*ROW(C2:C)),
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*COLUMN(C2:2))), LAMBDA(x, AVERAGEIF(x, ">0"))))))
or if blank cells should be treated as zeros:
=INDEX(IFERROR(1/(1/BYROW(1*OFFSET(C2,,,
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*ROW(C2:C)),
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*COLUMN(C2:2))), LAMBDA(x, AVERAGE(x))))))
also, it's worth mentioning the BYROW limitation of ~ 99990 rows
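As a rough mental model of BYROW with a LAMBDA, the sketch below applies a function to each row in turn. It assumes blanks are empty strings and returns "" where AVERAGE would error out; all names are hypothetical.

```javascript
// Hypothetical stand-ins for BYROW, AVERAGE and AVERAGEIF(x, ">0").
function byRow(rows, fn) {
  return rows.map(function (row) { return fn(row); });
}

function average(row) {
  const nums = row.filter(function (c) { return typeof c === "number"; });
  if (nums.length === 0) return ""; // plays the role of IFERROR around AVERAGE
  return nums.reduce(function (a, b) { return a + b; }, 0) / nums.length;
}

function averageIfPositive(row) {
  return average(row.filter(function (c) { return typeof c === "number" && c > 0; }));
}

const byRowSample = [
  [9, 8, 7, 6, ""],
  [0, 1, 2, 1, ""],
  ["", "", "", "", ""],
];
console.log(byRow(byRowSample, average));           // [ 7.5, 1, '' ]
console.log(byRow(byRowSample, averageIfPositive)); // zeros ignored in the second row
```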
honorable mentions:
#Erik Tyler level:
the polar opposite of the previous formula would be to run the MMULT on
total area of C2:? (all rows, all columns) instead of
valid area C2:? (excluding empty rows and columns) which avoids mass-calculations of 0 × 0 = 0
including zeros:
=INDEX(IFERROR(
MMULT( INDIRECT("C2:"&ROWS(C:C))*1, SEQUENCE(COLUMNS(C2:2))^0)/
MMULT(IF(INDIRECT("C2:"&ROWS(C:C))<>"", 1)*1, SEQUENCE(COLUMNS(C2:2))^0)))
excluding zeros:
=INDEX(IFERROR(
MMULT( INDIRECT("C2:"&ROWS(C:C))*1, SEQUENCE(COLUMNS(C2:2))^0)/
MMULT(IF(INDIRECT("C2:"&ROWS(C:C))>0, 1)*1, SEQUENCE(COLUMNS(C2:2))^0)))
#kishkin level:
for a fixed range C2:G9 the MMULT average would be:
=INDEX(IFERROR(
MMULT( C2:G9*1, FLATTEN(COLUMN(C:G))^0)/
MMULT((C2:G9>0)*1, FLATTEN(COLUMN(C:G))^0)))
=INDEX(IFNA(VLOOKUP(ROW(C2:C),
QUERY(SPLIT(FLATTEN(ROW(C2:C)&"×"&C2:J), "×"),
"select Col1,avg(Col2)
where Col2 is not null
group by Col1"), 2, )))
#MattKing level:
=INDEX(QUERY(SPLIT(FLATTEN(ROW(C2:C)&"×"&OFFSET(C2,,,9^9, 9^9)), "×"),
"select avg(Col2)
group by Col1
label avg(Col2)''"))
excluding zeros:
=INDEX(QUERY(SPLIT(FLATTEN(ROW(C2:C)&"×"&OFFSET(C2,,,9^9, 9^9)), "×"),
"select avg(Col2)
where Col2 <> 0
group by Col1
label avg(Col2)''"))
including empty cells:
=INDEX(IFERROR(1/(1/QUERY(SPLIT(FLATTEN(ROW(C2:C)&"×"&OFFSET(C2,,,9^9, 9^9)*1), "×"),
"select avg(Col2)
group by Col1
label avg(Col2)''"))))

You put a ton of time into this. I hope people appreciate it, more so that you did it for everyone else and not for yourself.
Looking at your final formulas, these should produce the same results (given data in C2:? as in your examples):
In B2 (include zeros):
=ArrayFormula(IFERROR(MMULT(INDIRECT("C2:"&ROWS(C:C))*1,SEQUENCE(COLUMNS(C1:1),1,1,0))/ MMULT(IF(INDIRECT("C2:"&ROWS(C:C))<>"",1,0),SEQUENCE(COLUMNS(C1:1),1,1,0))))
In B2 (exclude zeros):
=ArrayFormula(IFERROR(MMULT(INDIRECT("C2:"&ROWS(C:C))*1,SEQUENCE(COLUMNS(C1:1),1,1,0))/ MMULT(IF(INDIRECT("C2:"&ROWS(C:C))<>0,1,0),SEQUENCE(COLUMNS(C1:1),1,1,0))))

UPDATE: I've updated the formula from my original post. The ROW() should always come first so that missing values in the data don't throw off the split.
=ARRAYFORMULA(QUERY(SPLIT(FLATTEN(ROW(C2:C)&"|"&OFFSET(C2,,,9^9,9^9)),"|"),"select AVG(Col2) group by Col1 label AVG(Col2)''"))
Should work unless I'm misunderstanding the question.
No need for vlookups or mmults or filters or anything.
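A hedged JavaScript sketch of the same flatten-and-group idea: pair every cell with its row number, group by that key, and average the numeric values. Blanks are assumed to be empty strings and the function name is made up.

```javascript
// Hypothetical helper mirroring FLATTEN(ROW & value) + "group by Col1".
function groupedRowAverages(rows, firstRow) {
  // Flatten to [rowNumber, value] pairs; the row number always comes first.
  const pairs = [];
  rows.forEach(function (row, r) {
    row.forEach(function (cell) { pairs.push([firstRow + r, cell]); });
  });
  // Group by row number, keeping a running sum and count per key.
  const groups = new Map();
  pairs.forEach(function (pair) {
    const key = pair[0];
    const value = pair[1];
    if (!groups.has(key)) groups.set(key, { sum: 0, count: 0 });
    if (typeof value === "number") {
      const g = groups.get(key);
      g.sum += value;
      g.count += 1;
    }
  });
  // Every row keeps its key, so empty rows yield a blank instead of shifting results.
  return Array.from(groups.keys())
    .sort(function (a, b) { return a - b; })
    .map(function (key) {
      const g = groups.get(key);
      return g.count === 0 ? "" : g.sum / g.count;
    });
}

console.log(groupedRowAverages([[7, 1, ""], ["", "", ""], [1, "", 4]], 2)); // [ 4, '', 2.5 ]
```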

I will try to make a little addition to #player0's answer. And I will really appreciate any comments on optimizing this.
In case there are a lot of empty rows and columns inside the data range, those might as well be excluded from MMULT.
Step 1 - Filter out empty rows
We've got a data range: from C2 down to the last row and right to the last column (which is J:J). I will use C2:K, see details below for explanation.
This formula will give us an array of row numbers where there is at least one non empty cell. Also it will have a 0 if there are empty rows, but it won't matter for searching in this array, or we will filter it out when it does matter:
=ARRAYFORMULA(
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K)))
)
So, to filter out empty rows from the data range we use FILTER, which checks whether a row is in our array from above and leaves it be in that case:
=ARRAYFORMULA(
FILTER(
C2:K*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
)
)
Step 2 - Filter out empty columns
To get an array of only non-empty column numbers we can use almost the same formula:
=ARRAYFORMULA(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2))))
)
For why SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)) is used instead of COLUMN(C2:K), see the details at the end.
To filter out empty columns we also use FILTER with MATCH condition to search for column numbers in our array:
=ARRAYFORMULA(
FILTER(
C2:K*1,
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
)
)
And to filter out empty rows and empty columns we just use two FILTERs:
=ARRAYFORMULA(
FILTER(
FILTER(
C2:K*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
),
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
)
)
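The double FILTER can be pictured with a short JavaScript sketch. It is only an approximation of the formula: blanks are empty strings, remaining blanks become 0 (like `C2:K*1`), and the helper name is hypothetical.

```javascript
// Hypothetical helper: drops rows and columns that contain no values at all.
function dropEmptyRowsAndColumns(rows) {
  // A row or column is kept if at least one of its cells is non-blank.
  const keepRow = rows.map(function (row) {
    return row.some(function (c) { return c !== ""; });
  });
  const keepCol = rows[0].map(function (_, j) {
    return rows.some(function (row) { return row[j] !== ""; });
  });
  return rows
    .filter(function (_, i) { return keepRow[i]; })
    .map(function (row) {
      return row
        .filter(function (_, j) { return keepCol[j]; })
        .map(function (c) { return Number(c); }); // blanks inside kept rows become 0
    });
}

const sparse = [
  [1, "", 2],
  ["", "", ""],
  ["", "", 3],
];
console.log(dropEmptyRowsAndColumns(sparse)); // [ [ 1, 2 ], [ 0, 3 ] ]
```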
Original data range will internally become:
Step 3 - Do the MMULT
Now we can use MMULT with that data set to calculate average:
=ARRAYFORMULA(
MMULT(
FILTER(
FILTER(
C2:K*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
),
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
),
SEQUENCE(
ROWS(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
),
1,
1,
0
)
) /
MMULT(
FILTER(
FILTER(
(C2:K <> "")*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
),
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
),
SEQUENCE(
ROWS(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
),
1,
1,
0
)
)
)
The result is a bit off relative to the original data rows, because the empty rows have been filtered out.
Step 4 - Fill the AVERAGE column
To make averages consistent with the original data rows we can use VLOOKUP like this:
=ARRAYFORMULA(
IFNA(VLOOKUP(
SEQUENCE(MAX((C2:K <> "") * ROW(C2:K)) - 1, 1, ROW(C2)),
{
QUERY(UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))), "WHERE Col1 <> 0"),
MMULT(
...
) /
MMULT(
...
)
},
2,
0
))
)
Where
SEQUENCE(MAX((C2:K <> "") * ROW(C2:K)) - 1, 1, ROW(C2)) is an array of row numbers from the 2nd one to the last non-empty one. We won't be filling all the rows down with empty strings.
QUERY(UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))), "WHERE Col1 <> 0") is an array of non-empty row numbers with that 0 filtered out used as keys for search.
IFNA will return an empty string to put alongside an empty data row.
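The re-alignment step can be sketched in JavaScript as a keyed lookup: averages are stored against their original row numbers and then read back for every row in sequence. Names and sample numbers below are illustrative assumptions.

```javascript
// Hypothetical helper playing the role of IFNA(VLOOKUP(rowNumber, {keys, averages}, 2, 0)).
function realign(keys, averages, firstRow, lastRow) {
  const lookup = new Map();
  keys.forEach(function (key, i) { lookup.set(key, averages[i]); });
  const out = [];
  for (let r = firstRow; r <= lastRow; r++) {
    // Rows that were filtered out have no key, so they get a blank.
    out.push(lookup.has(r) ? lookup.get(r) : "");
  }
  return out;
}

// Averages exist for rows 2, 3 and 5 only; row 4 was empty and filtered out.
console.log(realign([2, 3, 5], [3.375, 4, 7.5], 2, 5)); // [ 3.375, 4, '', 7.5 ]
```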
FINAL FORMULA
Putting it all together:
=ARRAYFORMULA(
IFNA(VLOOKUP(
SEQUENCE(MAX((C2:K <> "") * ROW(C2:K)) - 1, 1, ROW(C2)),
{
QUERY(UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))), "WHERE Col1 <> 0"),
MMULT(
FILTER(
FILTER(
C2:K*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
),
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
),
SEQUENCE(
ROWS(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
),
1,
1,
0
)
) /
MMULT(
FILTER(
FILTER(
(C2:K <> "")*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
),
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
),
SEQUENCE(
ROWS(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
),
1,
1,
0
)
)
},
2,
0
))
)
A few details
INDEX could be used instead of ARRAYFORMULA for brevity (thanks #player0, taught me that a few months ago), but I like unambiguity of ARRAYFORMULA.
I use SEQUENCE to construct a column or a row of 1s to be explicit, for clarity. For example, this one
SEQUENCE(
ROWS(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
),
1,
1,
0
)
could be replaced with
SIGN(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
)
which is a bit shorter. There is also a way demonstrated here by #player0 of raising to the power of 0:
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)^0
but (it is just my speculation) I think SEQUENCE's internal implementation should be simpler than the operation of raising to a power.
I use the range C2:K, which has one more column than actually exists on the sheet. Not only does it give a range of all the columns to the right of C2 and all the rows down from it, but it also updates when another column is added to the right of the sheet: a demo. It does not get highlighted, though. This C2:K can almost perfectly (there will be a problem if a ZZZ column is actually present on the sheet) replace these approaches:
INDIRECT("C2:" & ROWS(C:C))
OFFSET(C2,,, ROWS(C2:C), COLUMNS(C2:2))
There is a small drawback in using C2:K: =ARRAYFORMULA(COLUMN(C2:K)) will return an array of column numbers even for non-existing ones, so we need to use =SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)) instead.

I think there is a simple answer for row-wise average using VLOOKUP and QUERY.
This one is in B2:
=ARRAYFORMULA(
IFNA(
VLOOKUP(
ROW(B2:B),
QUERY(
{
FLATTEN(ROW(C2:J) + SEQUENCE(1, COLUMNS(C2:J),,)),
FLATTEN(C2:J)
},
"SELECT Col1, AVG(Col2)
WHERE Col2 IS NOT NULL
GROUP BY Col1"
),
2,
0
)
)
)
This could be easily changed for max, min, sum, count - just change aggregation function inside QUERY statement.
Same approach could be used for column-wise aggregation.
FLATTEN(C2:J) could be changed to:
FLATTEN(--C2:J) to treat empty cells as 0s;
FLATTEN(IFERROR(1/(1/C2:J))) to exclude 0s from average.
If there are no intermediate empty rows, VLOOKUP could be removed from the formula, as well as Col1 from SELECT statement.
There's a shorter version (thanks #MattKing!) without VLOOKUP and WHERE Col...:
=ARRAYFORMULA(
QUERY(
{
FLATTEN(ROW(C2:J) + SEQUENCE(1, COLUMNS(C2:J),,)),
FLATTEN(IFERROR(1/(1/C2:J)))
},
"SELECT AVG(Col2)
GROUP BY Col1
LABEL AVG(Col2) ''"
)
)
I use C2:J range having columns up to I:I, some details on that:
The range C2:J has one more column than actually exists on the sheet. Not only does it give a range of all the columns to the right of C2 and all the rows down from it, but it also updates when another column is added to the right of the sheet: a demo. It does not get highlighted, though. This C2:J can almost perfectly (there will be a problem if a ZZZ column is actually present on the sheet) replace these approaches:
INDIRECT("C2:" & ROWS(C:C))
OFFSET(C2,,, ROWS(C2:C), COLUMNS(C2:2))
There is a small drawback in using C2:J: =ARRAYFORMULA(0 * COLUMN(C2:J)) will return an array of column numbers even for non-existing ones (multiplied by 0), so we need to use =SEQUENCE(1, COLUMNS(C2:J),,) instead.
#player0, any thoughts on this?

This is now easier with BYROW:
=BYROW(C2:G,LAMBDA(r, AVERAGE(r)))
Piece of cake. Easy peasy

Related

Looking for the most profitable range (mathematical names: "maximum subarray problem" or "maximum consecutive subsequence sum")

To find the most profitable range, I pair every lowest value I want with every highest value I want, and from that I create a table like this example:
https://docs.google.com/spreadsheets/d/17zpapBeC5wYxyU6SjbqcbnV4_QP4gooxj0PxdCywDk0/edit?usp=sharing
Cell's formulas examples:
Between 0 and 0:
=IFERROR(SUM(FILTER($B$1:$B,($A$1:$A<=D2)*($A$1:$A>=$E$1))))
Between 5 and 10:
=IFERROR(SUM(FILTER($B$1:$B,($A$1:$A<=D12)*($A$1:$A>=$J$1))))
=MAX(E2:O12)
Max Profit = £185.00
=INDEX(A1:O1,ARRAYFORMULA(MIN(IF(E2:O12=MAX(E2:O12),COLUMN(E2:O12)))))
Value Min for Max Profit = 4
=INDEX(D1:D12,ARRAYFORMULA(MAX(IF(E2:O12=MAX(E2:O12),ROW(E2:O12)))))
Value Max for Max Profit = 10
When there are hundreds of values¹ in A and B, this table gets very big and heavy, even causing crashes, as happens with my original data spreadsheet.
Is there any way, using only one formula or script, to find Max Profit | Value Min for Max Profit | Value Max for Max Profit without doing each range one by one, which requires thousands of cells each with a specific formula?
Notes:
hundreds of values¹ → my original spreadsheet currently contains 1471 rows of data in A with the results in B. So to be able to do this analysis, I need to put 2,163,841 formulas like =IFERROR(SUM(FILTER($B$1:$B,($A$1:$A<=D2)*($A$1:$A>=$E$1)))) in the cells to create the table and find the most profitable range.
max:
=INDEX(MAX(IF(SEQUENCE(MAX(A:A)+1)>=SEQUENCE(1, MAX(A:A)+1),
SUMIF(SEQUENCE(MAX(A:A)+1), "<="&SEQUENCE(MAX(A:A)+1), B:B)*
SEQUENCE(1, MAX(A:A)+1, 1, )-QUERY(QUERY(
(SEQUENCE(MAX(A:A)+1)<SEQUENCE(1, MAX(A:A)+1))*B1:B,
"select "&TEXTJOIN(",", 1, "sum(Col"&SEQUENCE(MAX(A:A)+1)&")")),
"offset 1", ), )))
value min:
=INDEX(REGEXEXTRACT(MAX(IF(SEQUENCE(MAX(A:A)+1)>=SEQUENCE(1, MAX(A:A)+1),
SUMIF(SEQUENCE(MAX(A:A)+1), "<="&SEQUENCE(MAX(A:A)+1), B:B)*
SEQUENCE(1, MAX(A:A)+1, 1, )-QUERY(QUERY(
(SEQUENCE(MAX(A:A)+1)<SEQUENCE(1, MAX(A:A)+1))*B1:B,
"select "&TEXTJOIN(",", 1, "sum(Col"&SEQUENCE(MAX(A:A)+1)&")")),
"offset 1", )+(SEQUENCE(1, MAX(A:A)+1)*10^-10)&9, )*1)&"", "0(\d+)9$")-1)
value max:
=INDEX(REGEXEXTRACT(MAX(IF(SEQUENCE(MAX(A:A)+1)>=SEQUENCE(1, MAX(A:A)+1),
SUMIF(SEQUENCE(MAX(A:A)+1), "<="&SEQUENCE(MAX(A:A)+1), B:B)*
SEQUENCE(1, MAX(A:A)+1, 1, )-QUERY(QUERY(
(SEQUENCE(MAX(A:A)+1)<SEQUENCE(1, MAX(A:A)+1))*B1:B,
"select "&TEXTJOIN(",", 1, "sum(Col"&SEQUENCE(MAX(A:A)+1)&")")),
"offset 1", )+(SEQUENCE(MAX(A:A)+1)*10^-10)&9, )*1)&"", "0(\d+)9$")-1)
update:
=INDEX(TEXTJOIN(", ", 1, UNIQUE(FLATTEN(
IF(IF(SEQUENCE(MAX(A:A)+1)>=SEQUENCE(1, MAX(A:A)+1),
SUMIF(SEQUENCE(MAX(A:A)+1), "<="&SEQUENCE(MAX(A:A)+1), B:B)*
SEQUENCE(1, MAX(A:A)+1, 1, )-QUERY(QUERY(
(SEQUENCE(MAX(A:A)+1)<SEQUENCE(1, MAX(A:A)+1))*B1:B,
"select "&TEXTJOIN(",", 1, "sum(Col"&SEQUENCE(MAX(A:A)+1)&")")),
"offset 1", ), )=MAX(IF(SEQUENCE(MAX(A:A)+1)>=SEQUENCE(1, MAX(A:A)+1),
SUMIF(SEQUENCE(MAX(A:A)+1), "<="&SEQUENCE(MAX(A:A)+1), B:B)*
SEQUENCE(1, MAX(A:A)+1, 1, )-QUERY(QUERY(
(SEQUENCE(MAX(A:A)+1)<SEQUENCE(1, MAX(A:A)+1))*B1:B,
"select "&TEXTJOIN(",", 1, "sum(Col"&SEQUENCE(MAX(A:A)+1)&")")),
"offset 1", ), )), SEQUENCE(1, MAX(A:A)+1, 0), )))))

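Since the question names the maximum subarray problem, here is a hedged script-side sketch using Kadane's algorithm. This is a swapped-in technique, not what the formulas above do: it totals column B per distinct value of column A, sorts the values, and scans once. The function name and sample numbers are made up for illustration.

```javascript
// Hypothetical helper: pairs is an array of [valueFromA, resultFromB].
function bestRange(pairs) {
  // Total result per distinct value in A.
  const totals = new Map();
  pairs.forEach(function (pair) {
    totals.set(pair[0], (totals.get(pair[0]) || 0) + pair[1]);
  });
  const values = Array.from(totals.keys()).sort(function (a, b) { return a - b; });

  // Kadane's scan: restart the running range whenever it stops being profitable.
  let best = null;
  let runSum = 0;
  let runStart = 0;
  values.forEach(function (value, i) {
    if (runSum <= 0) {
      runSum = 0;
      runStart = i;
    }
    runSum += totals.get(value);
    if (best === null || runSum > best.profit) {
      best = { profit: runSum, min: values[runStart], max: value };
    }
  });
  return best; // null when there is no data
}

const trades = [[0, -5], [1, 10], [2, -3], [3, 8], [4, -20], [5, 4], [1, 2]];
console.log(bestRange(trades)); // { profit: 17, min: 1, max: 3 }
```

The scan visits each distinct value once, so it avoids building the full min-by-max table.
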
Use Google Sheets ARRAYFORMULA to create QUERY input

I have a QUERY that looks up values from various separate sheets, this is its input:
=QUERY( {
IFNA( QUERY({'Week 1'!A2:Z133;'Week 2'!A2:Z133;'Week 3'!A2:Z133;'Week 4'!A2:Z133;'Week 5'!A2:Z133},"select Col26, Col1, Col2, Col4, Col5 where Col1 IS NOT NULL and Col4 IS NOT NULL", 0), { "","","","","" } );
IFNA( QUERY({'Week 1'!A2:Z133;'Week 2'!A2:Z133;'Week 3'!A2:Z133;'Week 4'!A2:Z133;'Week 5'!A2:Z133},"select Col26, Col1, Col6, Col8, Col9 where Col1 IS NOT NULL and Col8 IS NOT NULL", 0), { "","","","","" } );
IFNA( QUERY({'Week 1'!A2:Z133;'Week 2'!A2:Z133;'Week 3'!A2:Z133;'Week 4'!A2:Z133;'Week 5'!A2:Z133},"select Col26, Col1, Col10, Col12, Col13 where Col1 IS NOT NULL and Col12 IS NOT NULL", 0), { "","","","","" } );
IFNA( QUERY({'Week 1'!A2:Z133;'Week 2'!A2:Z133;'Week 3'!A2:Z133;'Week 4'!A2:Z133;'Week 5'!A2:Z133},"select Col26, Col1, Col14, Col16, Col17 where Col1 IS NOT NULL and Col16 IS NOT NULL", 0), { "","","","","" } );
IFNA( QUERY({'Week 1'!A2:Z133;'Week 2'!A2:Z133;'Week 3'!A2:Z133;'Week 4'!A2:Z133;'Week 5'!A2:Z133},"select Col26, Col1, Col18, Col20, Col21 where Col1 IS NOT NULL and Col20 IS NOT NULL", 0), { "","","","","" } )
}, "SELECT * WHERE Col1 IS NOT NULL ORDER BY Col1")
The {} gets repetitive and longwinded, and requires manually updating every time I add another week.
Recently I discovered I can generate a list of Week sheet names using this formula:
=ARRAYFORMULA(
"Week " &
{1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19;20;21;22;23;24;25;26;27;28;29;30}
& "!A2:Z133"
)
and then in another column list only those sheets that actually exist:
=IF(
ISERROR(CELL("address",INDIRECT($U2))),
"",
$U2
)
Now I have a column with values such as Week1!A2:Z133, Week2!A2:Z133, etc. How can I use this column to create the QUERY formula source automatically?
Using this formula gets me the first range referenced but none of the subsequent ones in the column:
={ARRAYFORMULA(INDIRECT(AA:AA) )}
Here within this sheet is a basic script for combining tabs that start with the word "Week".
function tabCombo() {
  var ss = SpreadsheetApp.getActive();
  // Filters sheets to just the ones that start with "Week"
  var sheets = ss.getSheets().filter(function (e) { return e.getName().slice(0, 4) == 'Week'; });
  // Combines all tab values into one array and filters out the rows with a certain value in the first column
  var combo = sheets.map(e => e.getDataRange().getValues()).flat().filter(e => e[0] != 'Header1');
  // Writes that new value to a 'Master' tab.
  ss.getRange('Master!A2').offset(0, 0, combo.length, combo[0].length).setValues(combo);
}
Take note of the word "Week" which is how it decides which tabs to grab.
Take note of the number 4 (which is how many letters "Week" has)
Take note of the range "Master!A2" which is the top left corner of where the combined data should go.
Take note of the term "Header1" which is how the combined array filters out the header rows from all the tabs.

Excel/Google Sheets - How to streamline an Array formula so it doesn't crash sheet? [duplicate]

as per example:
A B C D E F G ∞
|======|=======|=====|=====|=====|=====|=====|=====
1 | |AVERAGE| | | | | |
|======|=======|=====|=====|=====|=====|=====|=====
2 | xx 1 | | 1 | 2 | 0.5 | 10 | |
|======|=======|=====|=====|=====|=====|=====|=====
3 | xx 2 | | 7 | 1 | | | |
|======|=======|=====|=====|=====|=====|=====|=====
4 | | | 0 | | | | |
|======|=======|=====|=====|=====|=====|=====|=====
5 | xx 3 | | 9 | 8 | 7 | 6 | |
|======|=======|=====|=====|=====|=====|=====|=====
6 | xx 4 | | 0 | 1 | 2 | 1 | |
|======|=======|=====|=====|=====|=====|=====|=====
7 | | | 1 | | 4 | | |
|======|=======|=====|=====|=====|=====|=====|=====
8 | xx 5 | | | | | | |
|======|=======|=====|=====|=====|=====|=====|=====
9 | | | | | | | 5 |
|======|=======|=====|=====|=====|=====|=====|=====
∞ | | | | | | | |
what's the most optimal way of getting AVERAGE for every valid row in the dynamic sense of terms (unknown quantity of rows & unknown quantity of columns) ?
if you are here by accident for running / cumulative / rolling average see:
https://stackoverflow.com/a/59120993/5632629
QUERY
level 1:
if all 5 cells in range C2:G have values:
=QUERY(QUERY(C2:G, "select (C+D+E+F+G)/5"), "offset 1", )
if not, then rows are skipped:
if empty cells are considered as zeros:
=INDEX(QUERY(QUERY({C2:G*1}, "select (Col1+Col2+Col3+Col4+Col5)/5"), "offset 1", ))
to remove zero values we use IFERROR(1/(1/...)) wrapping:
=INDEX(IFERROR(1/(1/QUERY(QUERY({C2:G*1},
"select (Col1+Col2+Col3+Col4+Col5)/5"), "offset 1", ))))
to make Col references dynamic we can do:
=INDEX(IFERROR(1/(1/QUERY(QUERY({C2:G*1},
"select "&
"("&JOIN("+", "Col"&ROW(INDIRECT("1:"&COLUMNS(C:G))))&")/"&COLUMNS(C:G)),
"offset 1", ))))
level 2:
if empty cells are not considered as zeros and shouldn't be skipped:
=INDEX(TRANSPOSE(QUERY(TRANSPOSE(E2:I),
"select "&TEXTJOIN(",", 1, IF(A2:A="",,
"avg(Col"&ROW(A2:A)-ROW(A2)+1&")")))),, 2)
note that this is column A dependant, so missing values in column A will offset the results
fun fact !! we can swap avg to max or min:
to free it from confinement of column A and make it work for any valid row:
=INDEX(IFERROR(1/(1/TRANSPOSE(QUERY(TRANSPOSE(
IF(TRIM(TRANSPOSE(QUERY(TRANSPOSE(C2:G),,9^9)))="", C2:G*0, C2:G)),
"select "&TEXTJOIN(",", 1,
"avg(Col"&ROW(A2:A)-ROW(A2)+1&")"))))),, 2)
if present 0's in range shouldn't be averaged we can add a small IF statement:
=INDEX(IFERROR(1/(1/TRANSPOSE(QUERY(TRANSPOSE(
IF(TRIM(TRANSPOSE(QUERY(TRANSPOSE(
IF(C2:G>0, C2:G, )),,9^9)))="", C2:G*0,
IF(C2:G>0, C2:G, ))),
"select "&TEXTJOIN(",", 1,
"avg(Col"&ROW(A2:A)-ROW(A2)+1&")"))))),, 2)
here we used so-called "vertical query smash" which takes all values in a given range and concentrates it to one single column, where all cells per each row are joined with empty space as a byproduct:
=FLATTEN(QUERY(TRANSPOSE(C2:G),,9^9))
apart from this, there is also "horizontal query smash":
=QUERY(C2:G,,9^9)
and also "ultimate 360° double query smash" which puts all cells from range into one single cell:
=QUERY(FLATTEN(QUERY(TRANSPOSE(C2:G),,9^9)),,9^9)
and finally "the infamous negative 360° reverse double query smash" which prioritizes columns over rows:
=QUERY(FLATTEN(QUERY(C2:G,,9^9)),,9^9)
all query smash names are copyrighted of course
back to the topic... as mentioned above all cells per row in range are joined with empty space even those empty ones, so we got a situation where we getting double or multiple spaces between values. to fix this we use TRIM and introduce a simple IF statement to assign 0 values for empty rows in a given range eg. to counter the offset:
MMULT
level 3:
MMULT is a kind of heavy class formula that is able to perform addition, subtraction, multiplication, division even running total on arrays/matrixes... however, bigger the dataset = slower the formula calculation (because in MMULT even empty rows take time to perform + - × ÷ operation) ...unless we use truly dynamic range infinite in both directions...
to get the last row with values of a given range:
=INDEX(MAX(IF(TRIM(FLATTEN(QUERY(TRANSPOSE(
INDIRECT("C2:"&ROWS(A:A))),,9^9)))="",,ROW(A2:A))))
to get the last column with values of a given range:
=INDEX(MAX(IF(TRIM(QUERY(INDIRECT("C2:"&ROWS(A:A)),,9^9))="",,COLUMN(C2:2))))
now we can construct it in a simple way:
=INDIRECT("C2:"&ADDRESS(9, 7))
which is the same as:
=INDEX(INDIRECT("C2:"&ADDRESS(MAX(IF(TRIM(FLATTEN(QUERY(TRANSPOSE(
INDIRECT("C2:"&ROWS(A:A))),,9^9)))="",,ROW(A2:A))),
MAX(IF(TRIM(QUERY(INDIRECT("C2:"&ROWS(A:A)),,9^9))="",,COLUMN(C2:2))))))
or shorter alternative:
=INDEX(INDIRECT("C2:"&ADDRESS(
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*ROW(A2:A)),
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2)))))
therefore simplified MMULT formula would be:
=ARRAYFORMULA(IFERROR(
MMULT(N( C2:G9), ROW(INDIRECT("C1:C"&COLUMNS(C:G)))^0)/
MMULT(N(IF(C2:G9<>"", 1, )), ROW(INDIRECT("C1:C"&COLUMNS(C:G)))^0)))
in case we want to exclude zero values from range, the formula would be:
=ARRAYFORMULA(IFERROR(
MMULT(N( C2:G9), ROW(INDIRECT("C1:C"&COLUMNS(C:G)))^0)/
MMULT(N(IF(C2:G9>0, 1, )), ROW(INDIRECT("C1:C"&COLUMNS(C:G)))^0)))
level 4:
putting together all above to make it infinitely dynamic and still restricted to valid dataset:
=INDEX(IFERROR(
MMULT(N( INDIRECT("C2:"&ADDRESS(
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*ROW(A2:A)),
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))))), ROW(INDIRECT("C1:C"&
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))-(COLUMN(C2)-1)))^0)/
MMULT(N(IF(INDIRECT("C2:"&ADDRESS(
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*ROW(A2:A)),
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))))<>"", 1, )), ROW(INDIRECT("C1:C"&
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))-(COLUMN(C2)-1)))^0)))
again, not including cells with zeros in range:
LAMBDA
level 5:
since 20 September 2022, we can use new functions that make stuff easier:
MAKEARRAY
REDUCE
BYROW
BYCOL
SCAN
MAP
LAMBDA
so to jump right in for a closed range we can take an average like:
=IFERROR(BYROW(C2:G9, LAMBDA(x, AVERAGE(x))))
and to get an average column-wise we just replace BYROW with BYCOL. now to make the range open and truly dynamic we can modify the above formula like this:
=IFERROR(BYROW(INDEX(INDIRECT("C2:"&ADDRESS(
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*ROW(A2:A)),
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))))), LAMBDA(x, AVERAGE(x))))
we can do it shorter by 12 characters like:
=IFERROR(BYROW(INDEX(OFFSET(C2,,,
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*ROW(C2:C)),
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*COLUMN(C2:2)))), LAMBDA(x, AVERAGE(x))))
to exclude zeros from the output:
=INDEX(IFERROR(1/(1/BYROW(OFFSET(C2,,,
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*ROW(C2:C)),
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*COLUMN(C2:2))), LAMBDA(x, AVERAGE(x))))))
to exclude zeros from input:
=INDEX(IFERROR(1/(1/BYROW(OFFSET(C2,,,
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*ROW(C2:C)),
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*COLUMN(C2:2))), LAMBDA(x, AVERAGEIF(x, ">0"))))))
or if blank cells should be treated as zeros:
=INDEX(IFERROR(1/(1/BYROW(1*OFFSET(C2,,,
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*ROW(C2:C)),
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*COLUMN(C2:2))), LAMBDA(x, AVERAGE(x))))))
also, it's worth mentioning the BYROW limitation of ~ 99990 rows
honorable mentions:
#Erik Tyler level:
the polar opposite of the previous formula would be to run the MMULT on
total area of C2:? (all rows, all columns) instead of
valid area C2:? (excluding empty rows and columns) which avoids mass-calculations of 0 × 0 = 0
including zeros:
=INDEX(IFERROR(
MMULT( INDIRECT("C2:"&ROWS(C:C))*1, SEQUENCE(COLUMNS(C2:2))^0)/
MMULT(IF(INDIRECT("C2:"&ROWS(C:C))<>"", 1)*1, SEQUENCE(COLUMNS(C2:2))^0)))
excluding zeros:
=INDEX(IFERROR(
MMULT( INDIRECT("C2:"&ROWS(C:C))*1, SEQUENCE(COLUMNS(C2:2))^0)/
MMULT(IF(INDIRECT("C2:"&ROWS(C:C))>0, 1)*1, SEQUENCE(COLUMNS(C2:2))^0)))
#kishkin level:
for a fixed range C2:G9 the MMULT average would be:
=INDEX(IFERROR(
MMULT( C2:G9*1, FLATTEN(COLUMN(C:G))^0)/
MMULT((C2:G9>0)*1, FLATTEN(COLUMN(C:G))^0)))
=INDEX(IFNA(VLOOKUP(ROW(C2:C),
QUERY(SPLIT(FLATTEN(ROW(C2:C)&"×"&C2:J), "×"),
"select Col1,avg(Col2)
where Col2 is not null
group by Col1"), 2, )))
#MattKing level:
=INDEX(QUERY(SPLIT(FLATTEN(ROW(C2:C)&"×"&OFFSET(C2,,,9^9, 9^9)), "×"),
"select avg(Col2)
group by Col1
label avg(Col2)''"))
excluding zeros:
=INDEX(QUERY(SPLIT(FLATTEN(ROW(C2:C)&"×"&OFFSET(C2,,,9^9, 9^9)), "×"),
"select avg(Col2)
where Col2 <> 0
group by Col1
label avg(Col2)''"))
including empty cells:
=INDEX(IFERROR(1/(1/QUERY(SPLIT(FLATTEN(ROW(C2:C)&"×"&OFFSET(C2,,,9^9, 9^9)*1), "×"),
"select avg(Col2)
group by Col1
label avg(Col2)''"))))
You put a ton of time into this. I hope people appreciate it, more so that you did it for everyone else and not for yourself.
Looking at your final formulas, these should produce the same results (given data in C2:? as in your examples):
In B2 (include zeros):
=ArrayFormula(IFERROR(MMULT(INDIRECT("C2:"&ROWS(C:C))*1,SEQUENCE(COLUMNS(C1:1),1,1,0))/ MMULT(IF(INDIRECT("C2:"&ROWS(C:C))<>"",1,0),SEQUENCE(COLUMNS(C1:1),1,1,0))))
In B2 (exclude zeros):
=ArrayFormula(IFERROR(MMULT(INDIRECT("C2:"&ROWS(C:C))*1,SEQUENCE(COLUMNS(C1:1),1,1,0))/ MMULT(IF(INDIRECT("C2:"&ROWS(C:C))<>0,1,0),SEQUENCE(COLUMNS(C1:1),1,1,0))))
UPDATE: I've updated the formula from my original post. The ROW() should always come first so that missing values in the data don't throw off the split.
=ARRAYFORMULA(QUERY(SPLIT(FLATTEN(ROW(C2:C)&"|"&OFFSET(C2,,,9^9,9^9)),"|"),"select AVG(Col2) group by Col1 label AVG(Col2)''"))
Should work unless I'm misunderstanding the question.
No need for vlookups or mmults or filters or anything.
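To see why putting ROW() first makes the VLOOKUP unnecessary, here is a small Python sketch of the flatten, split and group-by idea, assuming `None` for an empty cell. Every cell becomes a (row number, value) pair, so each row number always forms a group, even when all of its values are empty. The function name and the `first_row` parameter are hypothetical:

```python
from collections import defaultdict


def row_averages_grouped(grid, first_row=2):
    """Group (row number, value) pairs by row and average each group."""
    # FLATTEN(ROW(...) & "|" & range): one pair per cell, row number first.
    pairs = [(first_row + i, v) for i, row in enumerate(grid) for v in row]
    groups = defaultdict(list)
    for row_number, value in pairs:
        groups[row_number]  # the key exists even if every value is empty
        if value is not None:  # avg() skips nulls
            groups[row_number].append(value)
    return {r: (sum(vs) / len(vs) if vs else None)
            for r, vs in sorted(groups.items())}


grid = [
    [7, 1, None],
    [None, None, None],
    [9, 8, 7],
]
print(row_averages_grouped(grid))
```

Because the empty row 3 still appears as a group with a blank average, the output stays aligned with the sheet rows.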
I will try to make a little addition to #player0's answer. And I will really appreciate any comments on optimizing this.
If there are a lot of empty rows and columns inside the data range, those might as well be excluded from the MMULT.
Step 1 - Filter out empty rows
We've got a data range: from C2 down to the last row and right to the last column (which is J:J). I will use C2:K, see details below for explanation.
This formula gives us an array of the row numbers that contain at least one non-empty cell. It will also contain a 0 if there are empty rows, but that does not matter when searching in this array, and we will filter it out where it does matter:
=ARRAYFORMULA(
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K)))
)
So, to filter out empty rows from the data range we use FILTER, which checks whether a row number is in our array from above and keeps the row if it is:
=ARRAYFORMULA(
FILTER(
C2:K*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
)
)
Step 2 - Filter out empty columns
To get an array of only non-empty column numbers we can use almost the same formula:
=ARRAYFORMULA(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2))))
)
For why SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)) is used instead of COLUMN(C2:K), see the details at the end.
To filter out empty columns we also use FILTER with MATCH condition to search for column numbers in our array:
=ARRAYFORMULA(
FILTER(
C2:K*1,
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
)
)
And to filter out empty rows and empty columns we just use two FILTERs:
=ARRAYFORMULA(
FILTER(
FILTER(
C2:K*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
),
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
)
)
Internally, the original data range is reduced to just its non-empty rows and columns.
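The effect of the two nested FILTERs can be illustrated with a short Python sketch, assuming `None` for an empty cell (the Sheets formula additionally turns the remaining blanks into 0 via `*1`, which this sketch does not do). `drop_empty` is a hypothetical helper name:

```python
def drop_empty(grid):
    """Remove rows and columns that contain no values at all."""
    # Indices of rows with at least one non-empty cell.
    keep_rows = [i for i, row in enumerate(grid)
                 if any(v is not None for v in row)]
    # Indices of columns with at least one non-empty cell.
    keep_cols = [j for j in range(len(grid[0]))
                 if any(row[j] is not None for row in grid)]
    compact = [[grid[i][j] for j in keep_cols] for i in keep_rows]
    return keep_rows, keep_cols, compact


grid = [
    [1, None, 2],
    [None, None, None],
    [None, None, 5],
]
rows, cols, compact = drop_empty(grid)
print(rows, cols, compact)
```

The kept row indices are what later allow the averages to be mapped back to their original rows.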
Step 3 - Do the MMULT
Now we can use MMULT with that data set to calculate average:
=ARRAYFORMULA(
MMULT(
FILTER(
FILTER(
C2:K*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
),
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
),
SEQUENCE(
ROWS(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
),
1,
1,
0
)
) /
MMULT(
FILTER(
FILTER(
(C2:K <> "")*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
),
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
),
SEQUENCE(
ROWS(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
),
1,
1,
0
)
)
)
The result no longer lines up with the original data rows, because the empty rows have been filtered out.
Step 4 - Fill the AVERAGE column
To make averages consistent with the original data rows we can use VLOOKUP like this:
=ARRAYFORMULA(
IFNA(VLOOKUP(
SEQUENCE(MAX((C2:K <> "") * ROW(C2:K)) - 1, 1, ROW(C2)),
{
QUERY(UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))), "WHERE Col1 <> 0"),
MMULT(
...
) /
MMULT(
...
)
},
2,
0
))
)
Where
SEQUENCE(MAX((C2:K <> "") * ROW(C2:K)) - 1, 1, ROW(C2)) is an array of row numbers from the 2nd one to the last non-empty one. We won't be filling all the rows down with empty strings.
QUERY(UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))), "WHERE Col1 <> 0") is an array of non-empty row numbers with that 0 filtered out used as keys for search.
IFNA will return an empty string to put alongside an empty data row.
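The re-alignment step amounts to a keyed lookup with a blank default. A minimal Python sketch under that reading, where `keys` are the non-empty row numbers and `averages` the MMULT results in the same order; `realign` and its parameters are names made up for this illustration:

```python
def realign(keys, averages, first_row, last_row):
    """Place each average back on its original row; blanks elsewhere."""
    lookup = dict(zip(keys, averages))  # the two-column VLOOKUP table
    # One output cell per sheet row from first_row to last_row inclusive;
    # a missing key plays the role of #N/A, which IFNA turns into a blank.
    return [lookup.get(r, "") for r in range(first_row, last_row + 1)]


print(realign([2, 3, 5], [3.375, 4.0, 7.5], first_row=2, last_row=5))
```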
FINAL FORMULA
Putting it all together:
=ARRAYFORMULA(
IFNA(VLOOKUP(
SEQUENCE(MAX((C2:K <> "") * ROW(C2:K)) - 1, 1, ROW(C2)),
{
QUERY(UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))), "WHERE Col1 <> 0"),
MMULT(
FILTER(
FILTER(
C2:K*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
),
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
),
SEQUENCE(
ROWS(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
),
1,
1,
0
)
) /
MMULT(
FILTER(
FILTER(
(C2:K <> "")*1,
MATCH(
ROW(C2:K),
UNIQUE(FLATTEN((C2:K <> "") * ROW(C2:K))),
0
)
),
MATCH(
SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)),
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
0
)
),
SEQUENCE(
ROWS(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
),
1,
1,
0
)
)
},
2,
0
))
)
A few details
INDEX could be used instead of ARRAYFORMULA for brevity (thanks #player0, who taught me that a few months ago), but I like the unambiguity of ARRAYFORMULA.
I use SEQUENCE to construct a column or a row of 1s to be explicit, for clarity. For example, this one
SEQUENCE(
ROWS(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
),
1,
1,
0
)
could be replaced with
SIGN(
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)
)
which is a bit shorter. There is also a way demonstrated here by #player0 of raising to the power of 0:
QUERY(
UNIQUE(FLATTEN((C2:K <> "") * SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)))),
"WHERE Col1 <> 0"
)^0
but (this is just my speculation) I think SEQUENCE's internal implementation should be simpler than the operation of raising to a power.
I use the range C2:K, which reaches one column further than actually exists on the sheet. Not only does it cover all the columns to the right of C2 and all the rows below it, but it also updates when another column is added to the right of the sheet, though the range does not get highlighted. This C2:K can almost perfectly replace these approaches (there will be a problem if a ZZZ column is actually present on the sheet):
INDIRECT("C2:" & ROWS(C:C))
OFFSET(C2,,, ROWS(C2:C), COLUMNS(C2:2))
There is a small drawback in using C2:K: =ARRAYFORMULA(COLUMN(C2:K)) will return an array of column numbers even for non-existing ones, so we need to use =SEQUENCE(1, COLUMNS(C2:K), COLUMN(C2)) instead.
I think there is a simple answer for row-wise average using VLOOKUP and QUERY.
This one is in B2:
=ARRAYFORMULA(
IFNA(
VLOOKUP(
ROW(B2:B),
QUERY(
{
FLATTEN(ROW(C2:J) + SEQUENCE(1, COLUMNS(C2:J),,)),
FLATTEN(C2:J)
},
"SELECT Col1, AVG(Col2)
WHERE Col2 IS NOT NULL
GROUP BY Col1"
),
2,
0
)
)
)
This can easily be changed to max, min, sum or count: just change the aggregation function inside the QUERY statement.
Same approach could be used for column-wise aggregation.
FLATTEN(C2:J) could be changed to:
FLATTEN(--C2:J) to treat empty cells as 0s;
FLATTEN(IFERROR(1/(1/C2:J))) to exclude 0s from average.
If there are no intermediate empty rows, VLOOKUP could be removed from the formula, as well as Col1 from SELECT statement.
There's a shorter version (thanks #MattKing!) without VLOOKUP and WHERE Col...:
=ARRAYFORMULA(
QUERY(
{
FLATTEN(ROW(C2:J) + SEQUENCE(1, COLUMNS(C2:J),,)),
FLATTEN(IFERROR(1/(1/C2:J)))
},
"SELECT AVG(Col2)
GROUP BY Col1
LABEL AVG(Col2) ''"
)
)
I use the C2:J range on a sheet whose columns only go up to I:I. Some details on that:
The range C2:J reaches one column further than actually exists on the sheet. Not only does it cover all the columns to the right of C2 and all the rows below it, but it also updates when another column is added to the right of the sheet, though the range does not get highlighted. This C2:J can almost perfectly replace these approaches (there will be a problem if a ZZZ column is actually present on the sheet):
INDIRECT("C2:" & ROWS(C:C))
OFFSET(C2,,, ROWS(C2:C), COLUMNS(C2:2))
There is a small drawback in using C2:J: =ARRAYFORMULA(0 * COLUMN(C2:J)) will return an array of column numbers even for non-existing ones (multiplied by 0), so we need to use =SEQUENCE(1, COLUMNS(C2:J),,) instead.
#player0, any thoughts on this?
This is now easier with BYROW:
=BYROW(C2:G,LAMBDA(r, AVERAGE(r)))
Piece of cake. Easy peasy

Use Sheet's Array formula to count values in each row

When I apply an array formula for:
=count(D3:AA3)
It looks like this:
=ArrayFormula(if(row(A:A)=1,"Count",Count(D1:D:AA1:AA)))
Too many ":" (colons)?
I could (manually) paste the =count(D3:AA3) ...down every row, but I'd like it to be automated.
Here is a formula to count all the number values (COUNT does exactly that) row-wise:
={
"Count";
MMULT(
ARRAYFORMULA(--(ISNUMBER(F2:O))),
SEQUENCE(COLUMNS(F2:O), 1, 1, 0)
)
}
You can replace F2:O with the range you have the data in.
Update.
Count is in column A:A, sum - column B:B, avg - column C:C, avg in a single cell (w/o using count and sum columns) - column D:D. F2:N cells have random data, some numeric, some text (will be ignored).
Here is a formula for the row wise sum of numeric values:
={
"Sum";
MMULT(
ARRAYFORMULA(IF(ISNUMBER(F2:O), F2:O, 0)),
SEQUENCE(COLUMNS(F2:O), 1, 1, 0)
)
}
Here is the formula for the row wise average if you have count and sum columns:
={
"AVG";
ARRAYFORMULA(IF(A2:A = 0, 0, B2:B / A2:A))
}
And the row wise average in a single cell w/o using count and sum columns:
={
"AVG one single formula";
ARRAYFORMULA(
IF(
MMULT(
--(ISNUMBER(F2:O)),
SEQUENCE(COLUMNS(F2:O), 1, 1, 0)
) = 0,
0,
MMULT(
IF(ISNUMBER(F2:O), F2:O, 0),
SEQUENCE(COLUMNS(F2:O), 1, 1, 0)
) / MMULT(
--(ISNUMBER(F2:O)),
SEQUENCE(COLUMNS(F2:O), 1, 1, 0)
)
)
)
}
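For readers who want to check the logic outside Sheets, the count, sum and average columns above boil down to the following Python sketch. It assumes text and empty cells are ignored, as ISNUMBER does, and returns 0 for a row without numbers, like the IF(... = 0, 0, ...) guard; `row_stats` is a hypothetical name:

```python
def row_stats(row):
    """Return (count, sum, average) of the numeric cells in one row."""
    # Keep only real numbers; booleans are excluded on purpose, since
    # bool is a subclass of int in Python.
    nums = [v for v in row
            if isinstance(v, (int, float)) and not isinstance(v, bool)]
    count = len(nums)
    total = sum(nums)
    average = total / count if count else 0
    return count, total, average


print(row_stats([3, "abc", 5, None]))
print(row_stats(["x", None]))
```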

How do I count the number of times a column consecutively decreases in value

I have a spreadsheet where a column contains a person's weight. I want to count how many times in a row their weight has decreased in consecutive entries for both the current weight and all time, so:
108
107
106
105
104
106
104
103
Should return 3 as it's decreased three times in a row at the end, and also 5 as it decreased 5 times in a row at the start. Those values are in a column not a row in the sheet. How do I do this?
paste in D2 cell:
=ARRAYFORMULA(IF(LEN({A2; A2:A}), IF(B2:B901 < {1000; B2:B900}, 1, ), ))
paste in E2 cell:
=ARRAYFORMULA(IF(D2:D901=1,
MMULT(N(ROW(D2:D901)>=TRANSPOSE(ROW(D2:D901))), N(D2:D901=1))-HLOOKUP(0,
MMULT(N(ROW(D2:D901)> TRANSPOSE(ROW(D2:D901))), N(D2:D901=1)), MATCH(
VLOOKUP(ROW(D2:D901), IF(N(D2:D901<>D1:D900), ROW(D2:D901), ), 1, 1),
VLOOKUP(ROW(D2:D901), IF(N(D2:D901<>D1:D900), ROW(D2:D901), ), 1, 1), 0), 0), ))
and make sure your sheet has a minimum of 901 rows
I managed to do the current streak without a helper column, it also calculates the days the streak has continued for (which wasn't what I asked but is better for me). Not sure if answering my own question is the right response here, I'm not experienced with this website.
=iferror(sum(query(query({Log!A2:A, ArrayFormula(if(len(Log!A2:A), if(row(Log!A2:A) > 2, Log!A2:A - offset(Log!A2:A, -1,,counta(Log!A2:A)+1 ),0 ), )), ArrayFormula(if(len(Log!A2:A), if(row(Log!A2:A) > 2, if(len(Log!B2:B), Log!B2:B < offset(Log!B2:B, -1,,counta(Log!B2:B)+1 ), true),TRUE ), )), Log!B2:B}, "select * where Col1 is not null order by Col1 desc limit " & match(False, query(query({Log!A2:A, ArrayFormula(if(len(Log!A2:A), if(row(Log!A2:A) > 2, Log!A2:A - offset(Log!A2:A, -1,,counta(Log!A2:A)+1 ),0 ), )), ArrayFormula(if(len(Log!A2:A), if(row(Log!A2:A) > 2, if(len(Log!B2:B), Log!B2:B < offset(Log!B2:B, -1,,counta(Log!B2:B)+1 ), true),TRUE ), )), Log!B2:B}, "select * where Col1 is not null order by Col1 desc"), "select Col3"), 0)-1), "select Col2")), 0)
And to count the max streak:
=max(transpose(index(query({0}, "select " & join(",", unique(transpose(split(substitute(join(" + ", query({Log!A2:A, ArrayFormula(if(len(Log!A2:A), if(row(Log!A2:A) > 2, if(Log!B2:B < offset(Log!B2:B, -1,,counta(Log!B2:B)+1,), Log!A2:A - offset(Log!A2:A, -1,,counta(Log!A2:A)+1 ), 0),0 ), )), ArrayFormula(if(len(Log!A2:A), if(row(Log!A2:A) > 2, if(len(Log!B2:B), Log!B2:B < offset(Log!B2:B, -1,,counta(Log!B2:B)+1 ), false),false ), )), Log!B2:B}, "select Col2 where Col1 is not null order by Col1 desc")), " + 0 + ", ","), ","))))), 2)))
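As a cross-check for the formulas, the expected numbers from the question can be reproduced with a short Python sketch. It assumes a streak is measured as the number of entries in a strictly decreasing run, which is the reading that gives 3 and 5 for the sample data (the number of decreases in each run is one less). The function name is made up for this example:

```python
def decreasing_runs(weights):
    """Lengths, in entries, of the maximal strictly decreasing runs."""
    runs = []
    length = 0
    previous = None
    for w in weights:
        if previous is not None and w < previous:
            length += 1  # the run continues
        else:
            if length:
                runs.append(length)  # close the previous run
            length = 1  # this entry starts a new run
        previous = w
    if length:
        runs.append(length)
    return runs


weights = [108, 107, 106, 105, 104, 106, 104, 103]
runs = decreasing_runs(weights)
current_streak = runs[-1] if runs else 0
longest_streak = max(runs, default=0)
print(current_streak, longest_streak)  # prints "3 5"
```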

Resources