Stacking Multiple Arrays In Query/Lambda Function - google-sheets

My question was inspired by this post in that I'm wondering if it's possible to create a formula to stack a dynamic amount of arrays based on a list (see below for clarification).
Sample Starting Data From Three Sources
ID        Amount
India     9
Delta     4
Hotel     8

ID        Amount
Alpha     1
Echo      5
Foxtrot   6

ID        Amount
Bravo     2
Gulf      7
Charlie   3
Desired final result:
ID        Amount
Alpha     1
Bravo     2
Charlie   3
Delta     4
Echo      5
Foxtrot   6
Gulf      7
Hotel     8
India     9
I can get the final result by using a query function as shown in this spreadsheet with a formula referencing the appropriate cells with fileID and range:
=Query({IMPORTRANGE(E2,F2);
IMPORTRANGE(E3,F3);
IMPORTRANGE(E4,F4)},"Select * where Col1 is not null order by Col1",1)
If you want to play with it in your own sheet, you could use this hard-coded formula, which is equivalent to the one above:
=Query({IMPORTRANGE("1WtI56_9mhyArMn_j_H4pZg8E0QdIBaKoJfAr-fDAoE0","'Sheet1'!A:B");
IMPORTRANGE("1HamomAuLtwKJiFEtRKTuEkt--YDTtWChUavetBcAcBA","'Sheet1'!A2:B");
IMPORTRANGE("1WtI56_9mhyArMn_j_H4pZg8E0QdIBaKoJfAr-fDAoE0","'Sheet2'!A2:B")},"Select * where Col1 is not null order by Col1",1)
My Question:
Is there a way to build a formula that generates this result from however many file IDs and ranges are listed in columns E and F? That is, if a fourth ID and range were added, the result in columns A and B would update to include them. I suspect LAMBDA would work, but I am not as strong with it as I should be.
Unsuccessful attempt:
=lambda(someIDs,SomeRanges,IMPORTRANGE(someIds,SomeRanges))(filter(E2:E,E2:E<>""),filter(F2:F,F2:F<>""))
REALLY Bad Attempts:
=contact(Player()*1800-CoffeeBribe*Not(Home))
=company(theMaster(emailed)*(false))<>🐇
All helpful answers will be upvoted if not accepted. Thanks.

If the ranges are all the same:
=LAMBDA(x, QUERY(REDUCE({"ID", "Amount"}, x,
LAMBDA(a, c, {a; IMPORTRANGE(c, "Sheet1!A2:B")})),
"where Col1 is not null", 1))
(E2:INDEX(E:E, MAX((E:E<>"")*ROW(E:E))))
If the ranges are not the same:
=INDEX(LAMBDA(x, y, QUERY(SPLIT(TRANSPOSE(SPLIT(QUERY(MAP(x, y,
LAMBDA(e, f, QUERY("♣"&FLATTEN(QUERY("♥"&TRANSPOSE(
IMPORTRANGE(e, f)),,9^9)),,9^9))),,9^9),
"♣")), "♥"), "where Col1 <> ' ' order by Col2", 1))(
E2:INDEX(E:E, MAX((E:E<>"")*ROW(E:E))),
F2:INDEX(F:F, MAX((F:F<>"")*ROW(F:F)))))
or:
=LAMBDA(x, QUERY(REDUCE({"ID", "Amount"}, x,
LAMBDA(a, b, {a; IMPORTRANGE(b, OFFSET(b,,1))})),
"where Col2 is not null", 1))
(E2:INDEX(E:E, MAX((E:E<>"")*ROW(E:E))))
In the old days, this would be solved by generating the formula as text:
={""; INDEX("={"&TEXTJOIN("; ", 1, "IMPORTRANGE("""&
FILTER(E2:E, E2:E<>"")&""", """&FILTER(F2:F, F2:F<>"")&""")")&"}")}

REDUCE accepts and returns arrays. We can use it to stack ranges. INDEX/COUNTA can be used to get the range needed without blanks. OFFSET can be used to get the next column's value.
=QUERY(
REDUCE(
{"Id","Amount"},
E2:INDEX(E2:E,COUNTA(E2:E)),
LAMBDA(
a,e,
{a;IMPORTRANGE(e,OFFSET(e,0,1))}
)
),
"Select * where Col1 is not null order by Col1",
1
)
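For readers less familiar with REDUCE, the same fold can be modelled outside Sheets. This is a minimal Python sketch under stated assumptions: `fetch_range` is a hypothetical stand-in for IMPORTRANGE, and the file IDs "file1" and "file2" are invented. It illustrates the pattern only and says nothing about how Sheets evaluates the formula internally.

```python
from functools import reduce

# Hypothetical data keyed by (file id, range), standing in for the E and F columns.
SOURCES = {
    ("file1", "Sheet1!A2:B"): [["India", 9], ["Delta", 4], ["Hotel", 8]],
    ("file2", "Sheet1!A2:B"): [["Alpha", 1], ["Echo", 5], ["Foxtrot", 6]],
    ("file1", "Sheet2!A2:B"): [["Bravo", 2], ["Gulf", 7], ["Charlie", 3]],
}


def fetch_range(file_id, cell_range):
    """Hypothetical stand-in for IMPORTRANGE(file_id, cell_range)."""
    return SOURCES[(file_id, cell_range)]


def stack_sources(pairs):
    """Fold over the (id, range) pairs, appending each import to the accumulator."""
    header = [["ID", "Amount"]]  # initial accumulator, like {"Id","Amount"}
    stacked = reduce(lambda acc, pair: acc + fetch_range(*pair), pairs, header)
    # Equivalent of "where Col1 is not null order by Col1" with 1 header row.
    body = [row for row in stacked[1:] if row[0] not in ("", None)]
    return stacked[:1] + sorted(body, key=lambda row: row[0])


result = stack_sources(list(SOURCES))
```

Adding a fourth (id, range) pair to the list changes nothing in `stack_sources`, which is the property the question asks for.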

Related

How can I sum across multiple sheets using a named range with multiple conditions?

I believe what I am trying to do should be simple in Google Sheets formulae, but any solution based on an Excel formula should be easily transferable.
Because additional characters will be added periodically, I have a named range: "Heroes".
Heroes
Bilbo
Gandalf
Saruman
Wormtongue
Tom Bombadil
For each hero, I have a worksheet in one overall workbook. On these worksheets, there are columns for Date, Time, Quest, and Count. Several times per day, a hero will venture out on a quest of a certain type, returning with a certain count as a prize. Each venture has its own row, distinguishable by date and time. For example:
Date    Time     Quest     Count
12/4    3:00P    Ring      9
12/5    8:00A    Mordor    6
12/5    4:15P    Sting     3
Meanwhile, I have a summary worksheet on which I am manually entering the date and time at which one or a batch of heroes are sent to quest (manual for now; bonus points for help creating an =arrayformula() or equivalent that grabs all unique date/time combinations from each character's worksheet). I am trying to figure out the formula template that will sum the counts for each quest type across the heroes at the date and time given by its row. For example, starting at 12/4, 3:00P, Ring, the count should be 9, which is Bilbo's prize for questing at that time. Other heroes are also sent out at 3:00P, resulting in prizes for the other quests, and multiple heroes may venture on the same type of quest at any given time:
Date    Time      Ring    Sting    Mordor    Moria
12/4    3:00P     9       3        4         1
12/4    9:30P     1       0        8         0
12/5    8:00A     5       3        6         9
12/5    12:10A    3       1        3         8
12/5    4:15P     4       5        2         5
Since not every date and time in the summary sheet will exist on each hero's worksheet, I seem unable to use "SUMIFS", which functions in such a way that each sum_range and criteria_range are added on only across the same row when conditions are met. I think there is a SUMPRODUCT(), or INDEX(MATCH()) way to do this, but when including the named range to read across multiple worksheets, only the first hero's numbers were added in my tinkering with this.
I'm dancing around the solution here. Anyone care to tango ? Many thanks !
Sample Workbook for support: https://docs.google.com/spreadsheets/d/142IE9r2ip6YHsGdMr-zt_IHd6W7glqUId_UiGQnCUZs/edit?usp=sharing
it would be done like this:
=QUERY({Bilbo!A:D; Gandalf!A:D; Saruman!A:D; Wormtongue!A:D; 'Tom Bombadil'!A:D},
"select Col1,Col2,sum(Col4) where Col1 is not null group by Col1,Col2 pivot Col3", 1)
If you want the quest columns in a specific order, you can do:
=TRANSPOSE(SORT(TRANSPOSE(QUERY(
{Bilbo!A:D; Gandalf!A:D; Saruman!A:D; Wormtongue!A:D; 'Tom Bombadil'!A:D},
"select Col1,Col2,sum(Col4) where Col1 is not null
group by Col1,Col2 pivot Col3", 1)),
MATCH(FLATTEN(QUERY(QUERY(
{Bilbo!A:D; Gandalf!A:D; Saruman!A:D; Wormtongue!A:D; 'Tom Bombadil'!A:D},
"select Col1,Col2,sum(Col4) where Col1 is not null
group by Col1,Col2 pivot Col3", 1), "limit 0", 1)),
{"Date"; "Time"; "Ring"; "Sting"; "Mordor"; "Moria"}, ), 1))
or manually like this:
=QUERY(QUERY({Bilbo!A:D; Gandalf!A:D; Saruman!A:D; Wormtongue!A:D; 'Tom Bombadil'!A:D},
"select Col1,Col2,sum(Col4) where Col1 is not null group by Col1,Col2 pivot Col3", 1),
"select Col1,Col2,Col5,Col6,Col3,Col4")
If you're thinking of outsmarting it with the list of Heroes... don't. Referring to a range on another sheet by name requires INDIRECT, and, surprise surprise, INDIRECT is not supported under ARRAYFORMULA, so you can't build an array that way. At this point, you either re-think your life choices or use a script, where such indirected arrays are supported. The best you can do without a script is to hardcode it like:
=QUERY({
INDIRECT(Main!A2&"!A:D");
INDIRECT(Main!A3&"!A:D");
INDIRECT(Main!A4&"!A:D");
INDIRECT(Main!A5&"!A:D");
INDIRECT(Main!A7&"!A:D")},
"select Col1,Col2,sum(Col4) where Col1 is not null
group by Col1,Col2 pivot Col3", 1)
And of course this will only work if every sheet on the list exists and the list contains no empty cells. Otherwise you will get an ARRAY error, for example when Main!A6 names a sheet that does not exist.
To counter it, we can do some sleight-of-hand tricks with IFERROR, which suppress the error while still allowing non-existent sheets and even empty cells, so we can pre-program it for future additions like this:
=QUERY({
IFERROR(INDIRECT(IF(Main!A2="", 0, Main!A2)&"!A:D"), {"","","",""});
IFERROR(INDIRECT(IF(Main!A3="", 0, Main!A3)&"!A:D"), {"","","",""});
IFERROR(INDIRECT(IF(Main!A4="", 0, Main!A4)&"!A:D"), {"","","",""});
IFERROR(INDIRECT(IF(Main!A5="", 0, Main!A5)&"!A:D"), {"","","",""});
IFERROR(INDIRECT(IF(Main!A6="", 0, Main!A6)&"!A:D"), {"","","",""});
IFERROR(INDIRECT(IF(Main!A7="", 0, Main!A7)&"!A:D"), {"","","",""});
IFERROR(INDIRECT(IF(Main!A8="", 0, Main!A8)&"!A:D"), {"","","",""});
IFERROR(INDIRECT(IF(Main!A9="", 0, Main!A9)&"!A:D"), {"","","",""});
IFERROR(INDIRECT(IF(Main!A10="", 0, Main!A10)&"!A:D"),{"","","",""});
IFERROR(INDIRECT(IF(Main!A11="", 0, Main!A11)&"!A:D"),{"","","",""})},
"select Col1,Col2,sum(Col4) where Col1 is not null
group by Col1,Col2 pivot Col3", 1)
note: 4 columns in range A:D = 4 empty cells {"","","",""}
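Why the placeholder must be exactly as wide as the range can be seen in a small model. The following is a rough Python sketch, not Sheets code: the `SHEETS` dictionary and the hero rows in it are made-up stand-ins for the worksheets, and `rows_or_blank` is a hypothetical helper that plays the role of IFERROR(INDIRECT(...), {"","","",""}).

```python
# Hypothetical stand-in for the workbook: sheet name -> rows of A:D.
SHEETS = {
    "Bilbo": [["12/4", "3:00P", "Ring", 9]],
    "Gandalf": [["12/4", "3:00P", "Sting", 3]],
}

# One blank row, 4 cells wide, matching the 4 columns of A:D.
BLANK_ROW = [["", "", "", ""]]


def rows_or_blank(name):
    """Return the sheet's rows, or one blank row if the name is empty or unknown."""
    if not name:
        return BLANK_ROW
    return SHEETS.get(name, BLANK_ROW)


def stack(names):
    """Stack all sheets vertically, then drop the blank placeholder rows."""
    stacked = [row for name in names for row in rows_or_blank(name)]
    # Every row has the same width, so the stacked array stays rectangular.
    assert all(len(row) == 4 for row in stacked)
    # Equivalent of "where Col1 is not null" in the QUERY.
    return [row for row in stacked if row[0] != ""]


# The list may contain blanks and names of sheets that do not exist yet.
names = ["Bilbo", "", "Gandalf", "Saruman"]
result = stack(names)
```

Here the empty entry and the not-yet-created "Saruman" sheet each contribute one blank row, which the final filter removes, so only the two real rows remain.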

SUMPRODUCT of last nth values with criteria

I have two columns (see screenshot)
How can I create a formula that sums the LATEST values in the second column, subject to a criterion on column A?
For example, I need the sum of the last 6 values (from column B) for the cells (in column A) that start with HH, counting from the bottom.
I know how to sum all values (from column B) for rows whose column A value starts with HH:
=SUMIF(A1:A;"HH"&"*";B1:B)
P.S. HH and * are separate because I'll substitute the HH with a cell reference.
But now I need to limit this to the last N values (let's say the last 3 values).
P.P.S.
=SUMPRODUCT((COUNTIFS(A1:A;"exact text";ROW(A1:A)*{1;1};">="&ROW(A1:A)*{1;1})<=3)*(A1:A="exact text");B1:B)
So far this works ONLY if I write the exact text, not with patterns like HH*.
Maybe try
=sum(index(query({row(A1:A), A1:B}, "Select Col3 where Col2 contains 'HH' order by Col1 desc limit 6")))
and see if that works?
Note:
*the string HH can also be in a cell (e.g. D1)
=sum(index(query({row(A1:A), A1:B}, "Select Col3 where Col2 contains '"&D1&"' order by Col1 desc limit 6")))
*6 indicates the number of values you want to sum
EDIT: For your locale you'll need to use in G1
=sum(index(query({row($B$1:$B) \ $B$1:$C}; "Select Col3 where Col2 contains '"&E2&"' order by Col1 desc limit 3")))
and fill down. See if that works?
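The logic of that query (filter, order by row descending, limit N, sum) can be sketched in Python to make the steps explicit. This is only an illustrative model with made-up sample data. Like the query above, it uses a "contains" test; swap in `label.startswith(needle)` if only prefixes should match.

```python
def sum_last_n(labels, values, needle, n):
    """Sum the last n values whose label contains needle, counting from the bottom."""
    # Keep matching values in their original top-to-bottom order.
    matching = [value for label, value in zip(labels, values) if needle in label]
    # Slicing from the end mirrors "order by Col1 desc limit n".
    return sum(matching[-n:]) if n > 0 else 0


# Made-up sample data standing in for columns A and B.
labels = ["HH1", "AB2", "HH3", "HH4", "CD5", "HH6"]
values = [1, 2, 3, 4, 5, 6]

last_three = sum_last_n(labels, values, "HH", 3)  # 3 + 4 + 6
last_six = sum_last_n(labels, values, "HH", 6)    # only four rows match
```

If fewer than n rows match, all matching rows are summed, which is also what the `limit` clause does.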
This should also work, not sure if it's any easier to understand/ less complicated than any other approach:
=SUM(SORTN(REGEXMATCH(B:B;E2)*C:C;3;0;ROW(B:B)*REGEXMATCH(B:B;E2);0))
Note the number 3 for the number of values you want from the bottom, and the reference to E2, which is "HH" as on your sample sheet.
use:
=QUERY(FILTER({IFNA(REGEXEXTRACT(SORT(B2:B; ROW(B2:B); 0);
"^([A-Za-z]{1,3})\d"))\SORT(C2:C; ROW(B2:B); 0)}; COUNTIFS(
REGEXEXTRACT(SORT(B2:B; ROW(B2:B); 0); "^([A-Za-z]{1,3})\d");
REGEXEXTRACT(SORT(B2:B; ROW(B2:B); 0); "^([A-Za-z]{1,3})\d");
ROW(H2:H43); "<="&ROW(H2:H43))<=3);
"select Col1,sum(Col2) group by Col1 label sum(Col2)''")
full explanation here

How can I avoid having to put 0s into the NULL fields to get a correct query calculation in Google Sheets

I have a Google Sheets question, which I have not been able to figure out yet with Google-Fu and RTFM:
Take the following spreadsheet as an example:
https://docs.google.com/spreadsheets/d/1IvMVaUdUDfYOoKyG0Uwd2n0M1mLjOTE5yZQ9K2R3q2M/edit?usp=sharing
In case the sheet gets lost in time, I am going to post its contents here:
Sheet1:
foo    withdrawal    deposit
C      4             10
D      10
E      10            4
As you see here, the deposit field for the row where foo is D is empty, i.e. null.
Sheet2:
foo    balance
C      =INDEX(QUERY({Sheet1!$A$2:C}, "SELECT SUM(Col3) - SUM(Col2) WHERE Col1 = '"&A2&"'"), 2)
D      =INDEX(QUERY({Sheet1!$A$2:C}, "SELECT SUM(Col3) - SUM(Col2) WHERE Col1 = '"&A3&"'"), 2)
E      =INDEX(QUERY({Sheet1!$A$2:C}, "SELECT SUM(Col3) - SUM(Col2) WHERE Col1 = '"&A4&"'"), 2)
The result is
foo    balance
C      6
D
E      -6
As you see, the balance field for the category D is null, although it should be -10.
The fix for that is to put a 0 into the deposit field in Sheet1 explicitly.
In my example, I get that data using a csv-export, and fields are generally empty and not 0, and it is cumbersome to add the 0 there. Is there a way to have something like COALESCE in that sum there (like in SQL)?
Please let me know.
it seems like something quite a bit simpler would avoid the problem:
=SUMPRODUCT(Sheet1!C:C-Sheet1!B:B,Sheet1!A:A=A2)
for cell B2.
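Row-wise arithmetic sidesteps the problem because a blank cell counts as 0 in ordinary subtraction, whereas the QUERY aggregate over an all-empty column came back empty in the question. The effect is the same as SQL's COALESCE, which this small Python sketch models. It uses the sample rows from the question, with `None` as an assumed representation of an empty cell.

```python
def coalesce(cell):
    """Treat an empty cell (None) as 0, like COALESCE(cell, 0) in SQL."""
    return 0 if cell is None else cell


def balance(rows, key):
    """Sum deposit minus withdrawal over all rows whose foo equals key."""
    return sum(
        coalesce(deposit) - coalesce(withdrawal)
        for foo, withdrawal, deposit in rows
        if foo == key
    )


# (foo, withdrawal, deposit) rows from Sheet1; D has no deposit.
rows = [("C", 4, 10), ("D", 10, None), ("E", 10, 4)]

balances = {key: balance(rows, key) for key in ("C", "D", "E")}
```

With the empty deposit coalesced to 0, D comes out as -10 rather than blank.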
Why don't you just add this in cell A1 of Sheet2 instead of all the Query:
=arrayformula({Sheet1!A1,"balance";if(Sheet1!A2:A<>"",{Sheet1!A2:A,Sheet1!C2:C-Sheet1!B2:B},)})
Obviously ensure cells Sheet2!A2:A and Sheet2!B1:B are empty.
If you have duplicate values of foo, try:
=arrayformula(query({Sheet1!A1,"balance";if(Sheet1!A2:A<>"",{Sheet1!A2:A,Sheet1!C2:C-Sheet1!B2:B},)},"select Col1,sum(Col2) where Col1 is not null group by Col1 label sum(Col2) 'balance'",1))
A better option for a single-cell formula, referencing multiple sheets would be:
=arrayformula(query(
{Sheet1!A:A,n(Sheet1!B:C);Sheet2!A2:A,n(Sheet2!B2:C);Sheet3!A2:A,n(Sheet3!B2:C)},
"select Col1,sum(Col3)-sum(Col2) where Col1 is not null group by Col1 label sum(Col3)-sum(Col2) 'balance' ",1))

Get column value when another column is max in google query when grouping

Suppose I have a table like so:
one    ID    three
a      2     one
b      7     two
c      6     three
a      9     four
b      3     five
c      1     six
a      5     seven
b      10    eight
c      8     nine
a      4     ten
I want to GROUP BY one, get MAX of ID and then get the associated value from three.
I can do the first part like so:
=QUERY(A1:C11, "SELECT A, MAX(B) GROUP BY A")
To get:
one    max ID
a      9
b      10
c      8
But I want to get:
one    max ID    three
a      9         four
b      10        eight
c      8         nine
I am trying to do this all with one QUERY. I know I could use a VLOOKUP for the 3rd column but I'm hoping there is way to do with one QUERY.
The Query Language Reference documentation explicitly states, in the rules of the GROUP BY clause, that every column in the SELECT must be either a grouped column or wrapped by an aggregation function. This is why it is not possible to include an ungrouped, unaggregated column in your specific query.
You can do the workaround as per player0's answer, but if you want to use QUERY() and VLOOKUP() in a single formula you can use this as well:
=ARRAYFORMULA({{QUERY(A1:C,"SELECT A, max(B) where A is not null group by A")},{VLOOKUP(FILTER(F:F,LEN(F:F)),SORT(B1:C,1,TRUE),2)}})
This should also work. You can & the columns together pre-query, then split them out afterwards.
=ARRAYFORMULA(QUERY(SPLIT(QUERY({A:A,TEXT(B:B,"000000000")&"|"&C:C&"|"&A:A},"select MAX(Col2) where Col1<>'' group by Col1",1),"|"),"select Col3,Col1,Col2"))
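The zero padding in TEXT(B:B,"000000000") is what makes this work: MAX then compares text, and text comparison only agrees with numeric order when all numbers have the same width. A rough Python model of the pack, max, split steps follows; the function name is hypothetical and the rows are the question's sample table.

```python
def max_with_payload(rows, width=9):
    """Per group, find the max number and carry its associated text along."""
    best = {}
    for group, number, payload in rows:
        # Pack as zero-padded text, like TEXT(B, "000000000") & "|" & C.
        packed = f"{number:0{width}d}|{payload}"
        # String comparison stands in for the query's MAX on a text column.
        if group not in best or packed > best[group]:
            best[group] = packed
    result = {}
    for group, packed in best.items():
        # Split once on the separator, like SPLIT(..., "|").
        number_text, payload = packed.split("|", 1)
        result[group] = (int(number_text), payload)
    return result


rows = [
    ("a", 2, "one"), ("b", 7, "two"), ("c", 6, "three"), ("a", 9, "four"),
    ("b", 3, "five"), ("c", 1, "six"), ("a", 5, "seven"), ("b", 10, "eight"),
    ("c", 8, "nine"), ("a", 4, "ten"),
]
result = max_with_payload(rows)

# Without padding the comparison goes wrong: as text, "9" sorts after "10".
unpadded_is_wrong = "9" > "10"
```

The sketch assumes non-negative whole numbers that fit in the padded width and payload text that does not itself start the separator ambiguity; the formula has the same limits.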
use:
=SORTN(SORT(A2:C, 2, 0), 9^9, 2, 1, 1)
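The idea here is to sort by ID descending, then keep only the first row seen for each value of the first column, so the surviving row is the one with the largest ID. A minimal Python sketch of that two-step idea, using the question's sample table (the function name is made up for illustration):

```python
def max_row_per_group(rows):
    """Keep, for each group, the whole row that has the largest ID."""
    # Step 1: sort by the ID column descending, like SORT(A2:C, 2, 0).
    by_id_desc = sorted(rows, key=lambda row: row[1], reverse=True)
    # Step 2: keep the first row per group, like SORTN removing duplicates
    # on column 1; setdefault ignores later (smaller-ID) rows.
    first_seen = {}
    for row in by_id_desc:
        first_seen.setdefault(row[0], row)
    # Return the groups in ascending order of the group column.
    return [first_seen[group] for group in sorted(first_seen)]


rows = [
    ("a", 2, "one"), ("b", 7, "two"), ("c", 6, "three"), ("a", 9, "four"),
    ("b", 3, "five"), ("c", 1, "six"), ("a", 5, "seven"), ("b", 10, "eight"),
    ("c", 8, "nine"), ("a", 4, "ten"),
]
result = max_row_per_group(rows)
```

Because whole rows are kept, the associated value from column three comes along without any lookup.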
update:
={QUERY(source!A:E,
"select B,C,max(A) where D is not null group by B,C", 1),
{"value"; ARRAYFORMULA(IF(INDEX(QUERY(QUERY(source!A:E,
"select B,C,max(A) where D is not null group by B,C", 1),
"offset 1", 0),,1)<>"",
VLOOKUP(INDEX(QUERY(QUERY(source!A:E,
"select B,C,max(A) where D is not null group by B,C", 1),
"offset 1", 0),,3), source!A:E, 5, 0), ))}}

Perform a lookup to find out values from another column and summarises them using Google Sheets

Example data below.
I want to sum the values in Col2 once for each distinct value in Col1, including a Col1 value only when the values in 'Other Cols' on its rows meet the selection criteria.
Col1------Col2-----Other Cols
A---------40-------other data
A-------------------other data
A-------------------other data
B---------30-------other data
B-------------------other data
C-------------------other data
C-------------------other data
C---------90-------other data
For example, the values in 'other data' might mean that the value where Col1 = B is not to be included, so the correct outcome is 130 (40+90).
If possible I want to be able to achieve the above in a single Query.
In the real-life data there are over 2,000 rows of data and roughly 200 different values for Col1 (growing in size on a daily basis!!)
What I've been able to do myself!
1) I've created a Query that outputs a row for each valid occurrence of Col1 according to the selection criteria applied to 'Other Data', i.e.
A
C
2) Logically, what I want to do next (but don't know how) is look back into the original data to find the Col2 values for A and C (i.e. 40 and 90)
3) Then after that, I want to be able to sum the values identified (i.e. 40 + 90), so that in one single Query/cell the answer 130 is returned!!!
Being able to achieve step (2) would be very useful???
Doing (2) + (3) would be perfect!!!
(Note, the value for Col2 is unique to each set of values for Col1 )
=SUM(IFERROR(QUERY(A2:C,
"select sum(B)
where C = 'yes'
group by A
label sum(B)''", 0)))
=ARRAYFORMULA(SUM(QUERY(UNIQUE({A2:A,
IF(C2:C="",, VLOOKUP(ROW(B2:B), IF(
QUERY(A2:C, "select B order by A desc,B desc", 0)<>"", {ROW(B2:B),
QUERY(A2:C, "select B order by A desc,B desc", 0)}), 2)), C2:C}),
"select sum(Col2) where Col3='yes' group by Col1 label sum(Col2)''", 0)))

Resources