Sheets ArrayFormula. Find nearest number by group - google-sheets

Master Data
Group-Value pairs
1 | 1
1 | 2
1 | 3
2 | 5
2 | 8
3 | 10
3 | 12
Work Data
Group-Value pairs + desired result
1 | 4 | 3 (3≤4, max in group 1)
1 | 2 | 2 (2≤2, max in group 1)
2 | 6 | 5 (5≤6, max in group 2)
3 | 7 | no result (both 10 and 12 are greater than 7)
The task is to find, within a group, the largest number that is less than or equal to the given number.
For Group 1, value 4:
=> filter Master Data (1,2,3) => find 3
Doing this for a single row is no problem, but I need to do it with ARRAYFORMULA.
My attempts so far used modifications of the VLOOKUP formula and produced wrong outputs.
Samples and my working "arena":
https://docs.google.com/spreadsheets/d/11Cd2BGpGN-0h2bL0LQ_EpIDBKKT2hvTVHoxGC6i8uTE/edit?usp=sharing
Note: there is no need to solve it in a single formula, because that may slow down the calculation.
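For reference, the expected results can be restated outside Sheets. This is a minimal Python sketch, not a Sheets formula; the function name nearest_in_group and the list-of-pairs layout are assumptions made for illustration. It filters the master rows to the requested group, keeps the values that do not exceed the given number, and takes the maximum.

```python
def nearest_in_group(master, group, value):
    """Return the largest master value in `group` that is <= `value`, or None."""
    candidates = [v for g, v in master if g == group and v <= value]
    return max(candidates) if candidates else None


# Master Data and Work Data from the question, as (group, value) pairs.
master = [(1, 1), (1, 2), (1, 3), (2, 5), (2, 8), (3, 10), (3, 12)]
work = [(1, 4), (1, 2), (2, 6), (3, 7)]

results = [nearest_in_group(master, g, v) for g, v in work]
print(results)  # [3, 2, 5, None]
```

None stands in for the "no result" row of the desired output.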

I used
=ArrayFormula(VLOOKUP(D4:D8&text(E4:E8,"0000"),A4:A10&text(B4:B10,"0000"),1,true))
starting in J4
then
=ArrayFormula(if(--left(J4:J8)=D4:D8,--right(J4:J8,4),""))
starting in K4.
Needs further refinement but doesn't make any assumptions about max of previous group.
EDIT
So after further work it would look like this
=ArrayFormula(if(D4:D="",,
if(D4:D=
vlookup(D4:D&text(E4:E,"0000"),filter({A4:A&text(B4:B,"0000"),A4:A},A4:A<>""),2,true),
vlookup(D4:D&text(E4:E,"0000"),filter({A4:A&text(B4:B,"0000"),B4:B},A4:A<>""),2,true),"")))
A lot like @player0's solution, in fact.
I guess you could make it a bit more general by doing something like
=text(B4,rept("0",floor(log10(max(B4:B)))+1))
assuming these are positive integers.
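The padded-key idea can be checked outside Sheets. Below is a rough Python sketch under stated assumptions: values are non-negative integers, group ids all have the same number of digits, and search values fit in the padding width. The names padded_key and lookup_padded are hypothetical. bisect_right stands in for an approximate-match VLOOKUP on sorted keys, which lands on the last key that is not greater than the search key; the group check afterwards plays the role of the outer IF.

```python
from bisect import bisect_right


def padded_key(group, value, width):
    """Build a sortable text key, e.g. (2, 5) -> '20005' for width 4."""
    return f"{group}{value:0{width}d}"


def lookup_padded(master, group, value):
    # Pad to the digit count of the largest master value (an assumption,
    # valid for non-negative integers).
    width = len(str(max(v for _, v in master)))
    rows = sorted((padded_key(g, v, width), g, v) for g, v in master)
    keys = [k for k, _, _ in rows]
    # Index of the last key <= the search key.
    i = bisect_right(keys, padded_key(group, value, width)) - 1
    if i < 0 or rows[i][1] != group:
        return None  # nothing precedes the key, or the hit is in another group
    return rows[i][2]


master = [(1, 1), (1, 2), (1, 3), (2, 5), (2, 8), (3, 10), (3, 12)]
work = [(1, 4), (1, 2), (2, 6), (3, 7)]
padded_results = [lookup_padded(master, g, v) for g, v in work]
print(padded_results)  # [3, 2, 5, None]
```

For group 3, value 7 the nearest key belongs to group 2, which is why the group comparison is needed before returning a value.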
Alternative method
I think this is a better way. Find the start row of each group and how many rows r less than or equal to the required group/value pair are in that group. Then just go forward r-1 rows from the first line of the group to find the matching value:
=ArrayFormula(if(countifs(A4:A,D4:D,B4:B,"<="&E4:E)>0,
vlookup(
vlookup(D4:D,{A4:A,row(A4:A)},2,false)+countifs(A4:A,D4:D,B4:B,"<="&E4:E)-1,{row(A4:A),B4:B},2,false),))
Assuming of course that the Master data is sorted by group and value - otherwise you would have to use sort():
=ArrayFormula(if(countifs(A4:A,D4:D,B4:B,"<="&E4:E)>0,
vlookup(
vlookup(D4:D,{sort(A4:A,A4:A,1,B4:B,1),row(A4:A)},2,false)+countifs(A4:A,D4:D,B4:B,"<="&E4:E)-1,{row(A4:A),SORT(B4:B,A4:A,1,B4:B,1)},2,false),))
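The start-row-plus-count logic can be sketched in Python as follows. This is only an illustration of the arithmetic, with a hypothetical function name lookup_by_offset; sorted() plays the role of SORT, the generator sum plays the role of COUNTIFS, and next() finds the first row of the group like the exact-match VLOOKUP on the group column.

```python
def lookup_by_offset(master, group, value):
    rows = sorted(master)  # sort by group, then by value
    # How many rows of this group have a value <= the requested value.
    count = sum(1 for g, v in rows if g == group and v <= value)
    if count == 0:
        return None  # no value in the group is small enough
    # Index of the first row belonging to the group.
    start = next(i for i, (g, _) in enumerate(rows) if g == group)
    # Move forward count - 1 rows from the start of the group.
    return rows[start + count - 1][1]


# Deliberately unsorted master data to show that sorting is handled.
master = [(3, 12), (1, 2), (2, 8), (1, 1), (3, 10), (2, 5), (1, 3)]
work = [(1, 4), (1, 2), (2, 6), (3, 7)]
offset_results = [lookup_by_offset(master, g, v) for g, v in work]
print(offset_results)  # [3, 2, 5, None]
```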

My solution was based on the technique of finding the maximum number by a row. The sample formula is here:
https://docs.google.com/spreadsheets/d/1VY157ykKsCVDqEKDBp3oAVaG0LTXAz8wUCggCrFXMDM/edit#gid=628408999
My whole solution is here:
https://docs.google.com/spreadsheets/d/11Cd2BGpGN-0h2bL0LQ_EpIDBKKT2hvTVHoxGC6i8uTE/edit#gid=0
Step 1
Get joined numbers by groups from a Master Table.
1 | 3,2,1
2 | 8,5
3 | 12,10
Used offset to achieve this ↑. And used vlookup to match this semi-result with work table.
Step 2
Used IF + SPLIT to check whether each resulting value was ≤ my work value and, in the same formula, used QUERY to find the maximum in each row.
compose a query: used join + sequence
=IF(M3=0,,"select "&JOIN(", ",INDEX("max(Col"&SEQUENCE(M3)&")")))
result:
select max(Col1), max(Col2), max(Col3), max(Col4), max(Col5)
Found the maximum by each group:
=index(TRANSPOSE(QUERY(TRANSPOSE(data), "select ...")))
This final formula was the 🔑 to solving the problem.
Note: a result of 0 from my formula means "no matches". This is fine for me.

try:
=INDEX(IFNA(IF(E4:E>=
VLOOKUP(D4:D&TEXT(E4:E, "00000"), {A4:A&TEXT(FILTER(B4:B, B4:B<>""), "00000"), B4:B}, 2),
VLOOKUP(D4:D&TEXT(E4:E, "00000"), {A4:A&TEXT(FILTER(B4:B, B4:B<>""), "00000"), B4:B}, 2), 0)))

Related

Google Sheet - It's possible to array sum function in the following condition?

Would it be possible to use ARRAYFORMULA for this condition?
Sum all the rows that share the same PID; the result should be as in the image.
I tried this formula, but I think it's too long, and it would not work if a PID spans more than 20 rows.
=IF(A3<>A2,BJ3+IF(A3=A4,BJ4,0)+IF(A3=A5,BJ5,0)+IF(A3=A6,BJ6,0)+IF(A3=A7,BJ7,0)+IF(A3=A8,BJ8,0)+IF(A3=A9,BJ9,0)+IF(A3=A10,BJ10,0)+IF(A3=A11,BJ11,0)+IF(A3=A12,BJ12,0)+IF(A3=A13,BJ13,0)+IF(A3=A14,BJ14,0)+IF(A3=A15,BJ15,0)+IF(A3=A16,BJ16,0)+IF(A3=A17,BJ17,0)+IF(A3=A18,BJ18,0)+IF(A3=A19,BJ19,0)+IF(A3=A20,BJ20,0)+IF(A3=A21,BJ21,0)+IF(A3=A22,BJ22,0),0)
With a table like this:
ID | Value
1 | 5
1 | 10
2 | 5
2 | 10
2 | 15
You have an expected output of:
ID | Value | Sum
1 | 5 | 15
1 | 10 | blank
2 | 5 | 30
2 | 10 | blank
2 | 15 | blank
It is achievable with this formula (just drag it down your Sum column):
=IF(A2=A1,"",SUMIFS(B$2:B$12,A$2:A$12,A2))
It checks whether the ID matches the row above; if it does not, it sums all values for that ID, so the sum is shown only on the row where the ID first appears.
Found it on google by searching google sheets sum group by
The following in C2 will generate the required answer without any copying-down required:
=arrayformula(if(len(A2:A),ifna(vlookup(row(A2:A),query({row(A2:B),A2:B},"select min(Col1),sum(Col3) where Col2 is not null group by Col2"),2,false)),))
We are making a lookup table of grouped sums against the first row of each 'P#' group using QUERY, then using VLOOKUP to distribute the group sums to the first row in each group. Probably also doable using a SCAN/OFFSET combination as well, I think.
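To make the "group sum on the first row only" behaviour concrete, here is a small Python sketch. It is an illustration rather than a translation of the QUERY/VLOOKUP formula, and the function name sums_on_first_row is an assumption. It totals the values per ID, then emits the total on the first row where each ID appears and an empty string elsewhere.

```python
def sums_on_first_row(ids, values):
    # Total per ID, like "sum(Col3) ... group by Col2".
    totals = {}
    for key, value in zip(ids, values):
        totals[key] = totals.get(key, 0) + value

    seen = set()
    column = []
    for key in ids:
        if key in seen:
            column.append("")  # later rows of the group stay blank
        else:
            seen.add(key)
            column.append(totals[key])  # first row of the group gets the sum
    return column


ids = [1, 1, 2, 2, 2]
values = [5, 10, 5, 10, 15]
sum_column = sums_on_first_row(ids, values)
print(sum_column)  # [15, '', 30, '', '']
```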

ArrayFormula of Resetting Running Total in Google Sheets

I'm looking for a (non-dragging) ArrayFormula of running total that resets every time the value in alt column changes. example:
desired result
a 2 2
a 3 5
a 5 10
b 2 2
c 3 3
c 4 7
so every time value in the 1st column changes the sum resets. the table is always sorted if it matters.
non-resetting regular running total formulae:
=ARRAYFORMULA(SUMIF(ROW(B1:B6), "<="&ROW(B1:B6), B1:B6))
=ARRAYFORMULA(MMULT(TRANSPOSE((ROW(B1:B6)<=TRANSPOSE(ROW(B1:B6)))*B1:B6), SIGN(B1:B6)))
I was trying somehow to combine it with this counter formula but no luck so far:
=ARRAYFORMULA(COUNTIFS(A1:A6, A1:A6, ROW(A1:A6), "<="&ROW(A1:A6)))
also I did some research, but only found either script which I am not interested in or dragging/MS Excel formulae solutions like:
=SUM(INDIRECT("L"&SUMPRODUCT(MAX(($H$2:H4=0)*ROW($H$2:H4)))+1):L5)
-----------------------------------------------------------------------------------------------------------
=SUM(L$3:L5)-SUM(M$4:M4)
-----------------------------------------------------------------------------------------------------------
=SUM($C$2:$C2)-IFERROR(SUM($C$2:OFFSET($C$1,LOOKUP(2,1/($B$2:$B2="reset"),ROW($B$2:$B2)-ROW($B$2)+1),0)),0)
-----------------------------------------------------------------------------------------------------------
=MOD((ROW()-ROW(E$1))*1,(1+5))
modification of @JPV's solution focused on speed:
=INDEX(MMULT(1*TRANSPOSE(IF((TRANSPOSE(ROW(
INDIRECT("A2:A"&MAX(ROW(A2:A)*(A2:A<>"")))))>=ROW(
INDIRECT("A2:A"&MAX(ROW(A2:A)*(A2:A<>"")))))*(
INDIRECT("A2:A"&MAX(ROW(A2:A)*(A2:A<>"")))=TRANSPOSE(
INDIRECT("A2:A"&MAX(ROW(A2:A)*(A2:A<>""))))),
INDIRECT("B2:B"&MAX(ROW(A2:A)*(A2:A<>""))), 0)), ROW(
INDIRECT("A2:A"&MAX(ROW(A2:A)*(A2:A<>""))))^0))
shortened:
=INDEX(LAMBDA(x, MMULT(1*TRANSPOSE(IF((TRANSPOSE(ROW(x))>=
ROW(x))*(x=TRANSPOSE(x)), OFFSET(x,,1), 0)), ROW(x)^0))
(A2:INDEX(A:A, MAX(ROW(A:A)*(A:A<>"")))))
=INDEX(IF(B2:B="",, ROW(B2:B) - VLOOKUP(ROW(B2:B), FILTER(ROW(B2:B), B2:B<>"", B2:B<>B2:Boffset1), 1, 1) + 1)) - B=T/F.txt
In addition, you could also try
=ArrayFormula(if(len(A:A),mmult(--transpose(if( (transpose(row(A:A))>=row(A:A))*(A:A=transpose(A:A)),B:B, 0)),row(A:A)^0),))
This should also work if the data is unsorted.
If the table is always sorted on column A, you can just do:
=ARRAYFORMULA(SUMIF(ROW(B1:B6), "<="&ROW(B1:B6), B1:B6)-SUMIF(A1:A6, "<"&A1:A6, B1:B6))
If the table is not sorted, you can still do it with a vlookup:
=ARRAYFORMULA(SUMIF(ROW(B1:B6), "<="&ROW(B1:B6), B1:B6)-SUMIF(row(A1:A6), "<"&vlookup(A1:A6,{A1:A6,row(A1:A6)},2,false), B1:B6))
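The pairwise condition used by the MMULT formulas (earlier-or-same row AND same key) can be written out directly. The Python sketch below is illustrative only, with a hypothetical function name resetting_running_total; like the MMULT approach it compares every row with every other row, so it is quadratic in the number of rows, and it does not require the data to be sorted.

```python
def resetting_running_total(keys, values):
    totals = []
    for i, key in enumerate(keys):
        # Sum values from rows at or above row i that share row i's key.
        totals.append(
            sum(v for j, v in enumerate(values) if j <= i and keys[j] == key)
        )
    return totals


keys = ["a", "a", "a", "b", "c", "c"]
values = [2, 3, 5, 2, 3, 4]
running = resetting_running_total(keys, values)
print(running)  # [2, 5, 10, 2, 3, 7]
```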

Negative References or reversing order of column for DATEDIF

I have an ascending sorted list of irregular dates in column A:A:
A B C D (A:A,A2:A) E (A:A,A3:A)
2017-11-09 10 10 NA NA
2017-11-10 11 21 1 NA
2017-11-14 15 36 4 5
2017-11-15 22 58 1 5
Column C:C is a rolling sum of B:B. I'm trying to get an ARRAYFORMULA in D:D/E:E to find the DATEDIF between the current row (starting date) and X rows above (end date):
=ArrayFormula(DATEDIF(B:B-(X Rows),B:B,"D"))
The goal is to find range of change in D:D over X amount of days:
D:D - D:D-rowX / datedif (A:A-rowX, A:A)
i.e for 2 days on row C4:
(C4-C2) / datedif(C4-2,C4,"D")
(58-21) / datedif(C2,C4,"D")
37 / 5 = 7.4
for 5 days on row C10:
(C10-C5) / datedif(C10-5,C10,"D")
for 15 days on row C20:
(C20-C5) / datedif(C20-15,C20,"D")
I'm trying to calculate X for 1,2,3,4,7,28 rows up which means the array has to start that 1,2,3,4,7,28 rows down.
Right now, the array fails with a bad reference because the first starting date is DATEDIF(B-X,B1,"D"), where B-X is an invalid negative reference. Array formulas with bad values, as opposed to bad references, seem to just skip past the errors and start working once the inputs are valid, but I can't figure out how to skip bad references. I've tried forcing the start date with INDIRECT but can't get it to recognize the value as a date. I also tried DATEDIF(B:B, B:B+X,"D"), which returns the correct numbers, but the results are offset by X rows. I've also tried reverse-sorting A:A with =ArrayFormula(if(len(A:A),DATEDIF(SORT(A2:A,1,0),SORT(A:A,1,0),"D"),"")); it produces a reverse-ordered list of correct answers that I can't figure out how to flip back.
Seems like I'm missing something obvious?
EDIT: tried to clarify original post
Is there an easy way to displace an entire column?
Alternative Solution?
The formula roughly works but is not aligned to the correct row:
C D E
1 2 3
1 2 3
1 2 3
1 2
1
I just need it to display
C D E
1
1 2
1 2 3
1 2 3
1 2 3
To get things aligned, I can put this in a cell on row 2 of column F:
=array_constrain(ARRAYFORMULA(D:D),COUNT(A:A)-2,1)
Or this in a cell on row 3 of column G:
=array_constrain(ARRAYFORMULA(E:E),COUNT(A:A)-3,1)
But if I try to trigger the formula from row 1 via:
=arrayformula(if(row(A:A)>=2,array_constrain(D:D,COUNT(A:A)-2,1)))
It labels everything where row >= 2 as false and still renders D:D without shifting the cells down by the proper number of rows:
C D
1 false
1 2
1 2
1 2
1
EDIT: I'm closing the request, ended up just using vlookup(B:B-X) which provided an approximate enough result to work for my needs.
Short answer
Add the following formula to D1
=ArrayFormula({"N/A";ARRAY_CONSTRAIN(DATEDIF(A:A,A2:A,"D"),COUNT(A:A)-1,1)})
And the following formula to E1
=ArrayFormula({"N/A";"N/A";ARRAY_CONSTRAIN(DATEDIF(A:A,A3:A,"D"),COUNT(A:A)-2,1)})
Explanation
The solution uses ARRAY_CONSTRAIN to return just the required result values and uses array notation to add the required N/A values for the rows that don't have a pair to calculate the date difference against.
REMARK:
Please note that the DATEDIF functions use column A for the references, as this is the column that holds the date values.
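The shape of the result (leading N/A rows followed by the differences) can be sketched in Python. This is only a model of what the two formulas are meant to produce, with a hypothetical function name lagged_day_diff; the first `lag` rows have no earlier date to pair with, so they get "N/A", and every later row gets the day difference to the date `lag` rows above it.

```python
from datetime import date


def lagged_day_diff(dates, lag):
    result = []
    for i, current in enumerate(dates):
        if i < lag:
            result.append("N/A")  # no row `lag` positions above
        else:
            result.append((current - dates[i - lag]).days)
    return result


# The dates from column A of the question.
dates = [date(2017, 11, 9), date(2017, 11, 10),
         date(2017, 11, 14), date(2017, 11, 15)]

column_d = lagged_day_diff(dates, 1)
column_e = lagged_day_diff(dates, 2)
print(column_d)  # ['N/A', 1, 4, 1]
print(column_e)  # ['N/A', 'N/A', 5, 5]
```

These match the D and E columns shown in the question's sample table.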

How do I sum horizontally across a row based on 1st column value?

I've used the search but haven't found much on this. Essentially I would like to do a SUMIF style action on a dataset but it only grabs the first adjacent value. My table would be something like:
KT 4 5 9
AM 3 7 8
IA 2 5 12
On rows below I would have
KT | =Sumif(A1:E3,A8,B1:E3) Which returns 4
AM | =Sumif(A1:E3,A9,B1:E3) Which returns 3
IA | =Sumif(A1:E3,A10,B1:E3) Which returns 2
Now I know I could just add a column with a total and use vlookup(array, value, index), but that is not what I want to do (although I may just do so if this is too big a pain).
Any thoughts/ideas?
Try using INDEX and MATCH to get VLOOKUP-like behavior:
=SUM(INDEX($B$1:$E$3, MATCH(A8, $A$1:$A$3, 0), 0))
INDEX($B$1:$E$3, MATCH(A8, $A$1:$A$3, 0), 0) returns the row within $B$1:$E$3 where the range $A$1:$A$3 corresponds to A8.
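As a rough illustration of what the INDEX/MATCH combination does, here is a Python sketch using the sample table from the question. The names labels, rows and sum_row_for are hypothetical. list.index plays the role of MATCH with exact matching, indexing into rows plays the role of INDEX returning a whole row, and sum adds up that row.

```python
labels = ["KT", "AM", "IA"]             # column A
rows = [[4, 5, 9], [3, 7, 8], [2, 5, 12]]  # the numeric columns


def sum_row_for(label):
    position = labels.index(label)  # like MATCH(label, A1:A3, 0)
    return sum(rows[position])      # like SUM(INDEX(range, position, 0))


row_sums = {label: sum_row_for(label) for label in labels}
print(row_sums)  # {'KT': 18, 'AM': 18, 'IA': 19}
```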

Is there a multiple-and-add formula in Google's spreadsheet?

What I want is to easily multiply a number by another number for each column and add them up at the end in Google Sheets. For example:
User | Points 1 | Points 2 | Points 3 | Total
| 5 | 1 | 4 |
-----+----------+----------+----------+------
Jane | 2 | 3 | 0 | 13 (2*5 + 3*1 + 0*4)
John | 1 | 11 | 4 | 32 (1*5 + 11*1 + 4*4)
So it's easy enough to make this formula for the total:
= B3*$B$2 + C3*$C$2 + D3*$D$2
The problem is I frequently need to insert additional columns or even remove some columns. So then I have to mess with all the formulas. It's a pain... we have many spreadsheets with these formulas. I wish there was a formula like SUM(B3:D3) where I could just specify a range. Is there anything like MULTIPLY_AND_SUM(B2:D2, B3:D3) that would do this? Then I could insert columns in the middle and the range would still work.
There is a built in function in Google Sheets that does exactly what you are looking for: SUMPRODUCT.
In your example the formula would be:
=sumproduct(B$2:D$2,B3:D3)
You can accomplish that without requiring a special-purpose function.
In E3, try this (and copy it to the rest of your rows):
=sum(arrayformula(B3:D3*B$2:D$2))
As long as you introduce new columns between B and D, this formula will automatically adjust. If you add new columns outside of that range, you'll need to edit (and cut & paste).
On its own, arrayformula(B3:D3*B$2:D$2) operates over each value in B3:D3 in turn, multiplying it by the corresponding value in B$2:D$2. (Note the use of absolute references to 'lock down' to row 2.) The result in this case is three values, [10,3,0], arranged horizontally in three columns because that matches the dimensions of the ranges.
The enveloping sum() function adds up the values of the array produced by arrayformula, which is 13 in this case.
As you copy that formula to other rows, the relative range references get updated for the new row.
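The arithmetic behind both answers (multiply element by element, then add) can be checked with a short Python sketch. The function name multiply_and_sum is hypothetical, echoing the name the question wished for; the weights and user rows are the sample data from the question.

```python
def multiply_and_sum(weights, points):
    # Multiply each pair of corresponding entries, then add the products.
    return sum(w * p for w, p in zip(weights, points))


weights = [5, 1, 4]  # row 2 of the sample table
jane = [2, 3, 0]
john = [1, 11, 4]

print(multiply_and_sum(weights, jane))  # 13
print(multiply_and_sum(weights, john))  # 32
```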
