merge columns over multiple rows with a common column - google-sheets

I'm trying to "flatten" a Google Sheet by merging multiple rows into one, using one column as the "primary key".
VBA answer in Excel: Merging Rows with common column
I tried combining FILTER with FIND, but I get mismatched-row errors. I'm not sure how to use VLOOKUP across multiple rows with the criterion that the cell value is not blank.
Before
| animal | legs | cute |
|--------|------|------|
| dog | | |
| dog | 4 | |
| dog | | yes |
| cat | 4 | |
After
| animal | legs | cute |
|--------|------|------|
| dog | 4 | yes |
| cat | 4 | |

try it like this:
={A1:C1; ARRAYFORMULA({QUERY(TO_TEXT(A2:B), "where Col2 !=''", 0),
IFERROR(VLOOKUP(QUERY(TO_TEXT(A2:B), "select Col1 where Col2 !=''", 0),
SORT(A2:C, 3, 1), 3, 0))})}
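For readers who want to check the intended result outside of Sheets, here is a minimal Python sketch of the same merge. It is not a translation of the formula above: the function name `flatten_rows` and the list-of-rows layout are assumptions made for illustration. For each key in the first column it keeps the first non-blank value found in every other column.

```python
def flatten_rows(rows):
    """Merge rows that share a key (first cell), keeping the first
    non-blank value seen in each of the remaining columns."""
    merged = {}  # dicts keep insertion order, so keys stay in first-seen order
    for key, *values in rows:
        current = merged.setdefault(key, [""] * len(values))
        for i, value in enumerate(values):
            # Only fill a slot that is still blank.
            if current[i] == "" and value != "":
                current[i] = value
    return [[key, *values] for key, values in merged.items()]


# The "Before" table from the question (header row left out).
before = [
    ["dog", "", ""],
    ["dog", "4", ""],
    ["dog", "", "yes"],
    ["cat", "4", ""],
]
after = flatten_rows(before)
print(after)  # [['dog', '4', 'yes'], ['cat', '4', '']]
```

Keys come out in first-seen order (dog, then cat), which matches the After table in the question.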

Related

Google sheets: append data from 2 tabs, with different column order

I have 2 tabs named S1 and S2. They both contain 3 columns of data (A, B and C). I just want to merge their content in a 3rd tab, using functions. Issue is that the order of columns is different in S1 and S2.
S1
| Column A | Column B | Column C |
| -------- | -------- | -------- |
| 1 | A | DeptA |
| 2 | B | DeptB |
| 3 | C | DeptC |
S2
| Column A | Column B | Column C |
| -------- | -------- | -------- |
| 4 | DeptD | D |
| 5 | DeptE | E |
| 6 | DeptF | F |
What I want to get in a 3rd tab is:
| Column A | Column B | Column C |
| -------- | -------- | -------- |
| 1 | A | DeptA |
| 2 | B | DeptB |
| 3 | C | DeptC |
| 4 | D | DeptD |
| 5 | E | DeptE |
| 6 | F | DeptF |
I'm using the following formula: "query({'S1'!A1:C;'S2'!A2:A,'S2'!C2:C,'S2'!B2:B};"Select Col1, Col2,Col3 where Col1 is not null";1)". But I get a formula analysis error.
I have also tried "
={
query({S1!A1:C},"Select Col1,Col2,Col3");
query({S2!A2:C},"Select Col1,Col3,Col2")
}
"
But I also get a formula analysis error.
Spreadsheet access: https://docs.google.com/spreadsheets/d/1FdaRSANfcqMkSBE-is8ek8bFWp80vVW8P3a-v0q4MwU/edit?usp=share_link
Thanks for your help
Depending on your locale setting, it's either:
=query({{'S1'!A1:C};{'S2'!A2:A\'S2'!C2:C\'S2'!B2:B}};"Select Col1, Col2, Col3 where Col1 is not null")
OR
=query({{'S1'!A1:C};{'S2'!A2:A,'S2'!C2:C,'S2'!B2:B}},"Select Col1, Col2, Col3 where Col1 is not null")
The issue is in your locale settings. You should use semicolons instead of commas as the argument separator:
={
query({'S1'!A1:C};"Select Col1,Col2,Col3");
query({'S2'!A2:C};"Select Col1,Col3,Col2")
}

Find multiple matches in a dataset, and return all matches in a single row

What array formula would work for this?
Test Sheet: Open
Current Data Structure
Contains a running list of names and when they started and ended training.
| A | B | C |
|---|---|---|
| John | StartDate1 | EndDate1 |
| Adam | StartDate3 | EndDate3 |
| John | StartDate2 | EndDate2 |
| Ted | StartDate5 | EndDate5 |
| Adam | StartDate4 | EndDate4 |
Expected Results
Unique column of names in column E =UNIQUE(A2:A)
Next to the unique name, display every StartDate & EndDate that matches the unique name.
| E | F | G | H | I |
|---|---|---|---|---|
| John | StartDate1 | EndDate1 | StartDate2 | EndDate2 |
| Adam | StartDate3 | EndDate3 | StartDate4 | EndDate4 |
| Ted | StartDate5 | EndDate5 | | |
What I have tried
=FILTER(B2:C,A2:A = E2)
Does not return on a single row. ❌
Does not work with ARRAYFORMULA. ❌
=TRANSPOSE(FILTER(B2:C,A2:A = E2:E))
Returns all StartDates on a single row, and all End Dates on the next row. ❌
It should return on a single row (StartDate,EndDate,StartDate,EndDate, etc)
Does not work with ARRAYFORMULA. ❌
=ARRAYFORMULA(VLOOKUP(E2:E,A2:C,{2,3}))
Returns the first match only ❌
Works with array formula. ✔️
What am I doing wrong? Is there a better arrayformula that can display every start and end date that matches a unique name in a row?
Thanks for your help!
use:
=INDEX(SPLIT(FLATTEN(QUERY(QUERY(IF(A3:A="",,{A3:A, "×"&B3:B&"×"&C3:C}),
"select max(Col2) where Col2 is not null group by Col2 pivot Col1"),,9^9)), "×"))

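As a cross-check on the expected layout in the "find multiple matches" question above, here is a minimal Python sketch of the same grouping done outside of Sheets. The function name `matches_in_one_row` and the list-of-rows input are assumptions for illustration only, and this is not a translation of the QUERY formula. It appends every start/end pair to the row of the matching name and pads shorter rows with blanks.

```python
def matches_in_one_row(rows):
    """Collect every (start, end) pair for a name onto that name's row."""
    grouped = {}  # name -> flat list [start1, end1, start2, end2, ...]
    for name, start, end in rows:
        grouped.setdefault(name, []).extend([start, end])
    # Pad shorter rows with blanks so the result is rectangular.
    width = max((len(values) for values in grouped.values()), default=0)
    table = []
    for name, values in grouped.items():
        padding = [""] * (width - len(values))
        table.append([name, *values, *padding])
    return table


# Sample data from the question (columns A to C).
data = [
    ["John", "StartDate1", "EndDate1"],
    ["Adam", "StartDate3", "EndDate3"],
    ["John", "StartDate2", "EndDate2"],
    ["Ted", "StartDate5", "EndDate5"],
    ["Adam", "StartDate4", "EndDate4"],
]
result = matches_in_one_row(data)
for row in result:
    print(row)
```

Names appear in first-seen order here, like UNIQUE(A2:A) in the question.
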
Create a pivot table where the resulting columns are the product of the original columns and rows?

This might be impossible to do without a ton of expensive scripting, but I would like to run it by the experts in case I'm missing something. It's hard to explain (because it's nonsensical.. i.e. not my choice), so I'll just give a very simplified example.
My source data sheet is like this...
+----------+-------+------+--------+
| Date | Time | Cars | Trucks |
+----------+-------+------+--------+
| 01/01/19 | 08:00 | 2 | 12 |
| 01/01/19 | 12:00 | 4 | 10 |
| 01/01/19 | 20:00 | 6 | 8 |
| 01/02/19 | 08:00 | 8 | 6 |
| 01/02/19 | 12:00 | 10 | 4 |
| 01/02/19 | 20:00 | 12 | 2 |
+----------+-------+------+--------+
.. and I want to have another sheet dynamically display it like ...
+----------+---------------+---------------+---------------+
| | 08:00 | 12:00 | 20:00 |
+----------+------+--------+------+--------+------+--------+
| | Cars | Trucks | Cars | Trucks | Cars | Trucks |
+----------+------+--------+------+--------+------+--------+
| 01/01/19 | 2 | 12 | 4 | 10 | 6 | 8 |
| 01/02/19 | 8 | 6 | 10 | 4 | 12 | 2 |
+----------+------+--------+------+--------+------+--------+
In other words, a column for each time and category combined.
Keep in mind that, in reality, this is a large data set. Also, I have a little flexibility in the headers, in the sense that the two header rows in the output could be one. Something like "Cars 8:00", "Trucks 8:00", "Cars 12:00"... etc
Does anybody know how this could be done with a pivot table? Or some other simple-ish method?
Here's a live version of the same example...
https://docs.google.com/spreadsheets/d/1npQikx3Zwa2QZwDAk8IxyawYw2hkeYpPe9Nh4ImkZAE/edit?usp=sharing
try:
=ARRAYFORMULA({{TEXT(SUBSTITUTE(SPLIT(TRANSPOSE(QUERY(TRANSPOSE(QUERY(
QUERY({Source!A2:B, TRANSPOSE(QUERY(TRANSPOSE(Source!C2:D),,999^99))},
"select Col1,max(Col3) where Col1 is not null group by Col1 pivot Col2"),
"limit 0", 1)),,999^99)), " "), "1899-12-30", ), "hh:mm"), ""}; {"",
SPLIT(REPT(Source!C1&" "&Source!D1&" ", COUNTUNIQUE(Source!B2:B)), " ")};
TRANSPOSE(QUERY(TRANSPOSE(SPLIT(TRANSPOSE(QUERY(TRANSPOSE(QUERY(QUERY({Source!A2:B,
TRANSPOSE(QUERY(TRANSPOSE(Source!C2:D),,999^99))},
"select Col1,max(Col3) where Col1 is not null group by Col1 pivot Col2"),
"offset 1", 0))&" ",,999^99)), " ", 1, 0)), "where "&JOIN(" or ", "Col"&
ROW(INDIRECT("A1:A"&COUNTUNIQUE(Source!A2:A)))&" is not null"), 0))})
spreadsheet demo
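To make the target shape easier to verify, here is a minimal Python sketch of the same reshaping outside of Sheets, using the single header row variant the question allows ("Cars 08:00", "Trucks 08:00", ...). The function name `pivot_by_time` and the row layout are assumptions for illustration, and it does not reproduce the formula above step by step.

```python
def pivot_by_time(rows, categories):
    """One output row per date, one output column per (time, category)."""
    # Zero-padded HH:MM strings sort correctly as plain text.
    times = sorted({row[1] for row in rows})
    header = ["Date"] + [f"{cat} {time}" for time in times for cat in categories]

    by_date = {}  # date -> {time: [count per category]}
    for date, time, *counts in rows:
        by_date.setdefault(date, {})[time] = counts

    table = [header]
    for date, per_time in by_date.items():
        out = [date]
        for time in times:
            # Leave blanks when a date has no entry for this time.
            out.extend(per_time.get(time, [""] * len(categories)))
        table.append(out)
    return table


source = [
    ["01/01/19", "08:00", 2, 12],
    ["01/01/19", "12:00", 4, 10],
    ["01/01/19", "20:00", 6, 8],
    ["01/02/19", "08:00", 8, 6],
    ["01/02/19", "12:00", 10, 4],
    ["01/02/19", "20:00", 12, 2],
]
pivoted = pivot_by_time(source, ["Cars", "Trucks"])
for row in pivoted:
    print(row)
```

The data rows come out as 2, 12, 4, 10, 6, 8 for 01/01/19 and 8, 6, 10, 4, 12, 2 for 01/02/19, matching the desired table in the question.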

Query a count of unique attributes in a table

I have a table like this:
+--------+-------+--------+-------+
| attr1 | attr2 | attr3 | attr4 |
+--------+-------+--------+-------+
| purple | wine | clear | 10.0 |
| red | wine | solid | 20.0 |
| red | beer | cloudy | 10.0 |
| purple | ale | clear | 34.0 |
| blue | ale | solid | 16.0 |
+--------+-------+--------+-------+
that I want to transform like this:
+--------+-------+-------+-------+-------+
| | attr1 | attr2 | attr3 | attr4 |
+--------+-------+-------+-------+-------+
| purple | 2 | | | |
| red | 2 | | | |
| blue | 1 | | | |
| wine | | 2 | | |
| beer | | 1 | | |
| ale | | 2 | | |
| clear | | | 2 | |
| solid | | | 2 | |
| cloudy | | | 1 | |
| 10.0 | | | | 2 |
| 20.0 | | | | 1 |
| 34.0 | | | | 1 |
| 16.0 | | | | 1 |
+--------+-------+-------+-------+-------+
This pivoted or cross-table will show me the count of each attribute value in their respective columns.
How do I use the Google Query language to display such a cross-table?
Well if the data were laid out in two columns it would be straightforward e.g. for something like this
Attrib Column
Red 1
Red 1
Green 1
Blue 1
Beer 2
Ale 2
Ale 2
you could use a query like
=query(A:B,"select A,count(A) where A<>'' group by A pivot B")
So the problem is to organise OP's data into two columns.
This can be done by what is by now a fairly standard split/join/transpose technique
=ArrayFormula(split(transpose(split(textjoin("|",true,if(A2:D="","",A2:D&" "&column(A2:D))),"|"))," "))
Giving a two-column list: each attribute value alongside the number of the column it came from.
You could either run the query on the result of this or combine the two like this
=ArrayFormula(query({"Attrib","Number";split(transpose(split(textjoin("|",true,if(A2:D="","",column(A2:D)&"-"&A2:D&" "&column(A2:D))),"|"))," ")},"Select Col1,count(Col1) group by Col1 pivot Col2"))
I have joined the column number to the attribute e.g. 1-blue so that it sorts into the right order. If you don't like it, you could get rid of it using regexreplace.
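The two-step idea above (unpivot the table into value/column pairs, then count each pair) can be sketched in Python to check the expected cross-table. The function name `count_values_per_column` is an assumption for illustration; ordering by column position plays the role that the column-number prefix plays in the formula.

```python
from collections import Counter


def count_values_per_column(header, rows):
    """Count how often each value occurs within its own column and lay
    the counts out as a cross-table (one row per distinct value)."""
    counts = Counter()
    for row in rows:
        for column, value in zip(header, row):
            if value != "":
                counts[(column, value)] += 1

    # Group by column position; the sort is stable, so values keep
    # first-seen order within each column.
    ordered = sorted(counts.items(), key=lambda item: header.index(item[0][0]))

    table = [["", *header]]
    for (column, value), n in ordered:
        table.append([value, *[n if h == column else "" for h in header]])
    return table


header = ["attr1", "attr2", "attr3", "attr4"]
rows = [
    ["purple", "wine", "clear", "10.0"],
    ["red", "wine", "solid", "20.0"],
    ["red", "beer", "cloudy", "10.0"],
    ["purple", "ale", "clear", "34.0"],
    ["blue", "ale", "solid", "16.0"],
]
cross = count_values_per_column(header, rows)
for line in cross:
    print(line)
```

With the sample data this yields 13 value rows in the same order as the table in the question, starting with purple (2 under attr1) and ending with 16.0 (1 under attr4).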
Edit
Slightly shorter formula - I didn't need to put the headers in separately:
=ArrayFormula(query(split(transpose(split(textjoin("|",true,if(A2:D="","",column(A2:D)&"-"&A2:D&" Attr"&column(A2:D))),"|"))," "),
"Select Col1,count(Col1) group by Col1 pivot Col2",0))
Edit 2
I was being a bit thick there, should have used first row of OP's data as attribute labels instead of column numbers
=ArrayFormula(query(split(transpose(split(textjoin("|",true,if(A2:D="","",column(A2:D)&"-"&A2:D&" "&A1:D1)),"|"))," "),
"Select Col1,count(Col1) group by Col1 pivot Col2",0))
Edit 3
Should have chosen a better pair of delimiters
=ArrayFormula(query(split(transpose(split(textjoin("😊",true,if(A2:D="","",column(A2:D)&"-"&A2:D&"🍺"&A1:D1)),"😊")),"🍺"),
"Select Col1,count(Col1) group by Col1 pivot Col2",0))

Return MAX values (Top 1 and 2) from list

I have a Google sheet with data of different players attacks and their corresponding damage.
Sheet1
| Player | Attack | Damage |
|:---------|:-------:|-------:|
| Iron Man | Melee | 50 |
| Iron Man | Missile | 2500 |
| Iron Man | Unibeam | 100 |
| Iron Man | Dash | 125 |
| Superman | Melee | 9000 |
| Superman | Breath | 200 |
| Superman | X-ray | 0 |
| Superman | Laser | 1500 |
| Hulk | Smash | 500 |
| Hulk | Throw | 500 |
| Hulk | Stomp | 500 |
| Hulk | Jump | 325 |
In my second sheet, I want to list each player and display their two best attacks and the corresponding damage. Like this:
Sheet2
| Player | # | Attack | Damage | Comment |
|:---------|:---:|:-------:|-------:|----------:|
| Iron Man | 1 | Missile | 2500 | |
| Iron Man | 2 | Dash | 125 | |
| Superman | 1 | Melee | 9000 | Very nice |
| Superman | 2 | Laser | 1500 | |
| Hulk | 1 | Smash | 500 | |
| Hulk | 2 | Stomp | 500 | |
Update:
Some attacks may produce exactly the same damage. If this happens, I just want to return the first one in alphabetical order.
I am now using the following formulas:
Damage-column: =MAX(FILTER(Sheet1!C:C,Sheet1!A:A=A2))
Attack-column: =JOIN(",",FILTER(Sheet1!B:B,Sheet1!A:A=A2,Sheet1!C:C=C2))
This returns the best attack/damage. For example, on rows 1 and 2:
| Player | # | Attack | Damage | Comment |
|:---------|:---:|:-------:|-------:|----------:|
| Iron Man | 1 | Missile | 2500 | |
| Iron Man | 2 | Missile | 2500 | |
But not the second best. How do I modify the formula on the second row to return the second-best attack/damage?
Update:
Using =LARGE(Sheet1!C:C;B3) in the second row I can get the second-highest damage from Sheet1, but it doesn't segment by player.
Update 2:
=ArrayFormula(LARGE(IF(Player="Iron Man",Damage),B2)) (using named ranges) returns both first and second best damage. Still trying to figure out how to return the attack-name.
With the data you provided I was able to produce the expected outcome by using this formula:
=query({ArrayFormula(iferror(SORT(ROW(Sheet1!A2:A),SORT(ROW(Sheet1!A2:A),Sheet1!A2:A,1),1)-MATCH(Sheet1!A2:A,SORT(Sheet1!A2:A),0))) , sort(A2:C, 1, 0, 3, 0)}, "Select Col2, Col1, Col3, Col4 where Col1 < 3 ")
See if you can get this to work on your data.
Sample link
EDIT: Based on the comments below, here's an updated version.
=query({ArrayFormula(iferror(SORT(ROW(Sheet1!A2:A),SORT(ROW(Sheet1!A2:A),Sheet1!A2:A,1),1)-MATCH(Sheet1!A2:A,SORT(Sheet1!A2:A),0))) , Sheet1!A2:C}, "Select Col2, Col1, Col3, Col4 where Col1 < 3 order by Col2")
I used two different formulas:
To get the maximum damage per Player =FILTER(FILTER($D$3:$D$10,$B$3:$B$10 = $F3), FILTER($D$3:$D$10,$B$3:$B$10 = $F3) = LARGE(FILTER(D3:D10,B3:B10 = $F3),$G3))
To get the attack =INDEX($C$3:$C$10, MATCH($F3&$I3,$B$3:$B$10&$D$3:$D$10, 0))
Credit goes to JPV, but use this to get the answer on another sheet:
=query({ArrayFormula(iferror(SORT(ROW(Sheet1!A2:B),SORT(ROW(Sheet1!A2:A),Sheet1!A2:A,1),1)-MATCH(Sheet1!A2:A,SORT(Sheet1!A2:A),0)-ROW()+1)) , sort(Sheet1!A2:C, 1, 1, 3, 0)}, "Select Col2, Col1, Col3, Col4 where Col1 < 3")
Please, try this formula:
=QUERY({Sheet1!$A$2:$C},
"select Col2, Col3 where Col1 = '"&A2&"' order by Col3 desc limit 1 offset "&B2-1)
It sorts the data by Col3 (the metric) in descending order, limits the result to 1 row, and offsets the result by the rank minus one, so you get the top 1, 2, and so on. In my sample, Superman has two attacks with the same value, 9000; the formula returns "Melee" first because it comes first in the table, but you can add a second sort key in the query text to get alphabetical order.
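To check the expected Sheet2 output, including the alphabetical tie-break, here is a minimal Python sketch of the same "top N per player" logic outside of Sheets. The function name `top_attacks` and the row layout are assumptions for illustration; it is not a translation of any of the formulas above.

```python
def top_attacks(rows, n=2):
    """Return the n highest-damage attacks per player as
    [player, rank, attack, damage]; ties go to the attack name that
    comes first alphabetically."""
    by_player = {}  # player -> list of (attack, damage), first-seen player order
    for player, attack, damage in rows:
        by_player.setdefault(player, []).append((attack, damage))

    table = []
    for player, attacks in by_player.items():
        # Damage descending, then attack name ascending for ties.
        ranked = sorted(attacks, key=lambda a: (-a[1], a[0]))
        for rank, (attack, damage) in enumerate(ranked[:n], start=1):
            table.append([player, rank, attack, damage])
    return table


sheet1 = [
    ["Iron Man", "Melee", 50],
    ["Iron Man", "Missile", 2500],
    ["Iron Man", "Unibeam", 100],
    ["Iron Man", "Dash", 125],
    ["Superman", "Melee", 9000],
    ["Superman", "Breath", 200],
    ["Superman", "X-ray", 0],
    ["Superman", "Laser", 1500],
    ["Hulk", "Smash", 500],
    ["Hulk", "Throw", 500],
    ["Hulk", "Stomp", 500],
    ["Hulk", "Jump", 325],
]
sheet2 = top_attacks(sheet1)
for row in sheet2:
    print(row)
```

For Hulk, the three attacks tied at 500 are ordered Smash, Stomp, Throw, so Smash and Stomp are returned, as in the expected Sheet2 table.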
