Query a count of unique attributes in a table - google-sheets

I have a table like this:
+--------+-------+--------+-------+
| attr1  | attr2 | attr3  | attr4 |
+--------+-------+--------+-------+
| purple | wine  | clear  | 10.0  |
| red    | wine  | solid  | 20.0  |
| red    | beer  | cloudy | 10.0  |
| purple | ale   | clear  | 34.0  |
| blue   | ale   | solid  | 16.0  |
+--------+-------+--------+-------+
that I want to transform like this:
+--------+-------+-------+-------+-------+
|        | attr1 | attr2 | attr3 | attr4 |
+--------+-------+-------+-------+-------+
| purple | 2     |       |       |       |
| red    | 2     |       |       |       |
| blue   | 1     |       |       |       |
| wine   |       | 2     |       |       |
| beer   |       | 1     |       |       |
| ale    |       | 2     |       |       |
| clear  |       |       | 2     |       |
| solid  |       |       | 2     |       |
| cloudy |       |       | 1     |       |
| 10.0   |       |       |       | 2     |
| 20.0   |       |       |       | 1     |
| 34.0   |       |       |       | 1     |
| 16.0   |       |       |       | 1     |
+--------+-------+-------+-------+-------+
This pivoted table (cross-table) should show the count of each attribute value in its respective column.
How do I use the Google Query language to display such a cross-table?

Well, if the data were laid out in two columns it would be straightforward, e.g. for something like this:
Attrib Column
Red 1
Red 1
Green 1
Blue 1
Beer 2
Ale 2
Ale 2
you could use a query like
=query(A:B,"select A,count(A) where A<>'' group by A pivot B")
So the problem is to organise OP's data into two columns.
This can be done with what is by now a fairly standard split/join/transpose technique:
=ArrayFormula(split(transpose(split(textjoin("|",true,if(A2:D="","",A2:D&" "&column(A2:D))),"|"))," "))
You could either run the query on the result of this or combine the two like this
=ArrayFormula(query({"Attrib","Number";split(transpose(split(textjoin("|",true,if(A2:D="","",column(A2:D)&"-"&A2:D&" "&column(A2:D))),"|"))," ")},"Select Col1,count(Col1) group by Col1 pivot Col2"))
I have joined the column number to the attribute e.g. 1-blue so that it sorts into the right order. If you don't like it, you could get rid of it using regexreplace.
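For readers who want to check the logic outside of Sheets, here is a minimal Python sketch of the same idea: unpivot the wide table into (value, column label) pairs, then count each pair. It is not a translation of the formula itself; the names `unpivot` and `cross_count` are made up for illustration, and the data is the sample table from the question.

```python
from collections import Counter

# Sample data copied from the question; the header row supplies the pivot labels.
header = ["attr1", "attr2", "attr3", "attr4"]
rows = [
    ["purple", "wine", "clear", "10.0"],
    ["red", "wine", "solid", "20.0"],
    ["red", "beer", "cloudy", "10.0"],
    ["purple", "ale", "clear", "34.0"],
    ["blue", "ale", "solid", "16.0"],
]


def unpivot(header, rows):
    """Turn the wide table into (value, column label) pairs, skipping blank cells."""
    return [
        (cell, label)
        for row in rows
        for cell, label in zip(row, header)
        if cell != ""
    ]


def cross_count(header, rows):
    """Count each (value, column label) pair, similar to 'group by Col1 pivot Col2'."""
    return Counter(unpivot(header, rows))


counts = cross_count(header, rows)

# Print one line per pair, ordered by source column and then by value.
for (value, label), n in sorted(
    counts.items(), key=lambda kv: (header.index(kv[0][1]), kv[0][0])
):
    print(f"{value:<8}{label:<8}{n}")
```

Each key of `counts` corresponds to one non-empty cell of the cross-table; the remaining cells are simply absent from the counter.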
Edit
Slightly shorter formula - I didn't need to put the headers in separately:
=ArrayFormula(query(split(transpose(split(textjoin("|",true,if(A2:D="","",column(A2:D)&"-"&A2:D&" Attr"&column(A2:D))),"|"))," "),
"Select Col1,count(Col1) group by Col1 pivot Col2",0))
Edit 2
I was being a bit thick there, should have used first row of OP's data as attribute labels instead of column numbers
=ArrayFormula(query(split(transpose(split(textjoin("|",true,if(A2:D="","",column(A2:D)&"-"&A2:D&" "&A1:D1)),"|"))," "),
"Select Col1,count(Col1) group by Col1 pivot Col2",0))
Edit 3
Should have chosen a better pair of delimiters
=ArrayFormula(query(split(transpose(split(textjoin("😊",true,if(A2:D="","",column(A2:D)&"-"&A2:D&"🍺"&A1:D1)),"😊")),"🍺"),
"Select Col1,count(Col1) group by Col1 pivot Col2",0))

Related

Google sheets: append data from 2 tabs, with different column order

I have 2 tabs named S1 and S2. They both contain 3 columns of data (A, B and C). I just want to merge their content in a 3rd tab, using functions. Issue is that the order of columns is different in S1 and S2.
S1
| Column A | Column B | Column C |
| -------- | -------- | -------- |
| 1 | A | DeptA |
| 2 | B | DeptB |
| 3 | C | DeptC |
S2
| Column A | Column B | Column C |
| -------- | -------- | -------- |
| 4 | DeptD | D |
| 5 | DeptE | E |
| 6 | DeptF | F |
What I want to get in a 3rd tab is:
| Column A | Column B | Column C |
| -------- | -------- | -------- |
| 1 | A | DeptA |
| 2 | B | DeptB |
| 3 | C | DeptC |
| 4 | D | DeptD |
| 5 | E | DeptE |
| 6 | F | DeptF |
I'm using the following formula: "query({'S1'!A1:C;'S2'!A2:A,'S2'!C2:C,'S2'!B2:B};"Select Col1, Col2,Col3 where Col1 is not null";1)". But I get a formula analysis error.
I have also tried "
={
query({S1!A1:C},"Select Col1,Col2,Col3");
query({S2!A2:C},"Select Col1,Col3,Col2")
}
"
But I also get a formula analysis error
Spreadsheet access: https://docs.google.com/spreadsheets/d/1FdaRSANfcqMkSBE-is8ek8bFWp80vVW8P3a-v0q4MwU/edit?usp=share_link
Thanks for your help
Depending on your locale setting, it's either:
=query({{'S1'!A1:C};{'S2'!A2:A\'S2'!C2:C\'S2'!B2:B}};"Select Col1, Col2, Col3 where Col1 is not null")
OR
=query({{'S1'!A1:C};{'S2'!A2:A,'S2'!C2:C,'S2'!B2:B}},"Select Col1, Col2, Col3 where Col1 is not null")
The issue is in your locale settings. You should use semicolons instead of commas as the argument separator:
={
query({'S1'!A1:C};"Select Col1,Col2,Col3");
query({'S2'!A2:C};"Select Col1,Col3,Col2")
}
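Independently of the separator issue, the reshaping that these formulas perform is small: keep S1 as it is and append S2 with its second and third columns swapped. A rough Python sketch of that step, assuming the sample rows from the question (the helper name `append_reordered` is hypothetical):

```python
# Sample rows from the question, without the header row.
s1 = [[1, "A", "DeptA"], [2, "B", "DeptB"], [3, "C", "DeptC"]]
s2 = [[4, "DeptD", "D"], [5, "DeptE", "E"], [6, "DeptF", "F"]]


def append_reordered(first, second, order):
    """Stack `second` under `first`, taking `second`'s columns in `order` (0-based)."""
    return first + [[row[i] for i in order] for row in second]


# Columns A, C, B of S2 line up with columns A, B, C of S1.
merged = append_reordered(s1, s2, order=(0, 2, 1))

for row in merged:
    print(row)
```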

merge columns over multiple rows with a common column

Trying to "flatten" a Google sheet across multiple rows by using one row as the "primary key".
VBA answer in Excel: Merging Rows with common column
Tried doing Filter with Find but I am getting mismatched row errors. Not sure how to leverage VLOOKUP across multiple rows with criteria of the cell value being not blank.
Before
| animal | legs | cute |
|--------|------|------|
| dog | | |
| dog | 4 | |
| dog | | yes |
| cat | 4 | |
After
| animal | legs | cute |
|--------|------|------|
| dog | 4 | yes |
| cat | 4 | |
Try it like this:
={A1:C1; ARRAYFORMULA({QUERY(TO_TEXT(A2:B), "where Col2 !=''", 0),
IFERROR(VLOOKUP(QUERY(TO_TEXT(A2:B), "select Col1 where Col2 !=''", 0),
SORT(A2:C, 3, 1), 3, 0))})}
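To make the intended Before/After transformation concrete, here is a small Python sketch that merges rows sharing the same key and keeps the first non-blank value per column. It illustrates the target result rather than the mechanics of the formula above, and `flatten_by_key` is a made-up name.

```python
# "Before" table from the question; blank cells are empty strings.
rows = [
    ("dog", "", ""),
    ("dog", "4", ""),
    ("dog", "", "yes"),
    ("cat", "4", ""),
]


def flatten_by_key(rows):
    """Merge rows that share the first column, keeping the first non-blank value per column."""
    merged = {}  # dicts preserve insertion order, so keys stay in first-seen order
    for key, *values in rows:
        current = merged.setdefault(key, [""] * len(values))
        for i, value in enumerate(values):
            if current[i] == "" and value != "":
                current[i] = value
    return [(key, *values) for key, values in merged.items()]


flat = flatten_by_key(rows)

for row in flat:
    print(row)
```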

Try to match string in column and print matching column name

I am trying to build an expense dashboard in google sheets for my personal use.
I have data that I will pull from my receipts like so:
First sheet: "Expenses Feb 18"
+------------+--------+--------+
| Item | Amount | Type |
+------------+--------+--------+
| Tomatoes | 2.39 | veggie |
| Joghurt | 1.45 | dairy |
| mozzarella | 1.99 | dairy |
| macadamia | 4.59 | nuts |
+------------+--------+--------+
Second table: "Categories"
+------------+----------+-----------+---------------+
| dairy | veggie | nuts | uncategorised |
+------------+----------+-----------+---------------+
| joghurt | tomatoes | macadamia | a |
| mozzarella | cucumber | pecan | b |
| feta | | | c |
| | | | d-z |
| | | | 0-9 |
| | | | - |
| | | | _ |
+------------+----------+-----------+---------------+
I want to automatically fill out the type column based on the item name.
So far I have a regex that is able to match an item. It will print the matched string. But what I need is the column name (header). And it has to be able to loop through the columns. This only works for a single column.
=REGEXEXTRACT(C11, JOIN("|", INDIRECT("Categories!A1:A"&COUNTA(Categories!A:A))))
The second table is not a desirable way to enter data. Data should preferably be entered with more rows than columns (not in a pivoted manner).
=ARRAYFORMULA(CONCATENATE(IF(A16=$C$24:$E$25,C$23:E$23,)))
A16 : 🍅
C24:E25: Category table
C23:E23: Category header.
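The lookup amounts to "find the column that contains the item and return its header". A minimal Python sketch under a few assumptions: the category lists are copied from the question, matching ignores case, and unmatched items fall back to "uncategorised" (the formula above would return an empty string instead). The function name `category_of` is hypothetical.

```python
# Category columns from the question, keyed by header.
categories = {
    "dairy": ["joghurt", "mozzarella", "feta"],
    "veggie": ["tomatoes", "cucumber"],
    "nuts": ["macadamia", "pecan"],
}


def category_of(item, categories, default="uncategorised"):
    """Return the header of the first column that lists `item` (case-insensitive)."""
    needle = item.strip().lower()
    for header, items in categories.items():
        if needle in (entry.lower() for entry in items):
            return header
    return default  # assumption: anything unmatched counts as uncategorised


for name in ["Tomatoes", "Joghurt", "mozzarella", "macadamia"]:
    print(name, "->", category_of(name, categories))
```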

Return MAX values (Top 1 and 2) from list

I have a Google sheet with data of different players attacks and their corresponding damage.
Sheet1
| Player | Attack | Damage |
|:---------|:-------:|-------:|
| Iron Man | Melee | 50 |
| Iron Man | Missile | 2500 |
| Iron Man | Unibeam | 100 |
| Iron Man | Dash | 125 |
| Superman | Melee | 9000 |
| Superman | Breath | 200 |
| Superman | X-ray | 0 |
| Superman | Laser | 1500 |
| Hulk | Smash | 500 |
| Hulk | Throw | 500 |
| Hulk | Stomp | 500 |
| Hulk | Jump | 325 |
In my second sheet, I want to list each player and display their two best attacks and the corresponding damage. Like this:
Sheet2
| Player | # | Attack | Damage | Comment |
|:---------|:---:|:-------:|-------:|----------:|
| Iron Man | 1 | Missile | 2500 | |
| Iron Man | 2 | Dash | 125 | |
| Superman | 1 | Melee | 9000 | Very nice |
| Superman | 2 | Laser | 1500 | |
| Hulk | 1 | Smash | 500 | |
| Hulk | 2 | Stomp | 500 | |
Update:
Some attacks may produce exactly the same damage. If this happens, I just want to return the first one in alphabetical order.
I am now using the following formulas:
Damage-column: =MAX(FILTER(Sheet1!C:C,Sheet1!A:A=A2))
Attack-column: =JOIN(",",FILTER(Sheet1!B:B,Sheet1!A:A=A2,Sheet1!C:C=C2))
This returns the best attack/damage. For example on row 1/2:
| Player | # | Attack | Damage | Comment |
|:---------|:---:|:-------:|-------:|----------:|
| Iron Man | 1 | Missile | 2500 | |
| Iron Man | 2 | Missile | 2500 | |
But not the second best. How do I modify the formula on the second row to return the second best attack/damage?
Update:
Using =LARGE(Sheet1!C:C;B3) in the second row I can get the second best attack from Sheet1, but it doesn't segment by player.
Update 2:
=ArrayFormula(LARGE(IF(Player="Iron Man",Damage),B2)) (using named ranges) returns both first and second best damage. Still trying to figure out how to return the attack-name.
With the data you provided I was able to produce the expected outcome by using this formula:
=query({ArrayFormula(iferror(SORT(ROW(Sheet1!A2:A),SORT(ROW(Sheet1!A2:A),Sheet1!A2:A,1),1)-MATCH(Sheet1!A2:A,SORT(Sheet1!A2:A),0))) , sort(A2:C, 1, 0, 3, 0)}, "Select Col2, Col1, Col3, Col4 where Col1 < 3 ")
See if you can get this to work on your data.
EDIT: Based on the comments below, here's an updated version.
=query({ArrayFormula(iferror(SORT(ROW(Sheet1!A2:A),SORT(ROW(Sheet1!A2:A),Sheet1!A2:A,1),1)-MATCH(Sheet1!A2:A,SORT(Sheet1!A2:A),0))) , Sheet1!A2:C}, "Select Col2, Col1, Col3, Col4 where Col1 < 3 order by Col2")
I used two different formulas:
To get the maximum damage per Player =FILTER(FILTER($D$3:$D$10,$B$3:$B$10 = $F3), FILTER($D$3:$D$10,$B$3:$B$10 = $F3) = LARGE(FILTER(D3:D10,B3:B10 = $F3),$G3))
To get the attack =INDEX($C$3:$C$10, MATCH($F3&$I3,$B$3:$B$10&$D$3:$D$10, 0))
Give JPV credit but use this for getting the answer on another sheet:
=query({ArrayFormula(iferror(SORT(ROW(Sheet1!A2:B),SORT(ROW(Sheet1!A2:A),Sheet1!A2:A,1),1)-MATCH(Sheet1!A2:A,SORT(Sheet1!A2:A),0)-ROW()+1)) , sort(Sheet1!A2:C, 1, 1, 3, 0)}, "Select Col2, Col1, Col3, Col4 where Col1 < 3")
Please, try this formula:
=QUERY({Sheet1!$A$2:$C},
"select Col2, Col3 where Col1 = '"&A2&"' order by Col3 desc limit 1 offset "&B2-1)
It sorts the data by Col3 (the metric) in descending order, limits the result to 1 row, and offsets the result by the rank minus one, so you get the top 1, top 2, and so on. In my sample, Superman has two equal values of 9000; the formula returns "Melee" first because it comes first in the table, but you can add a second sort key in the query text (order by Col3 desc, Col2) to get alphabetical order for ties.
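The ranking rule from the question (highest damage first, ties broken alphabetically by attack name, two rows per player) can be sketched in Python as below. This only illustrates the rule, not any of the formulas above; `top_attacks` is a made-up name and the data is the Sheet1 sample.

```python
from collections import defaultdict

# Sheet1 sample data from the question: (player, attack, damage).
attacks = [
    ("Iron Man", "Melee", 50),
    ("Iron Man", "Missile", 2500),
    ("Iron Man", "Unibeam", 100),
    ("Iron Man", "Dash", 125),
    ("Superman", "Melee", 9000),
    ("Superman", "Breath", 200),
    ("Superman", "X-ray", 0),
    ("Superman", "Laser", 1500),
    ("Hulk", "Smash", 500),
    ("Hulk", "Throw", 500),
    ("Hulk", "Stomp", 500),
    ("Hulk", "Jump", 325),
]


def top_attacks(rows, n=2):
    """Return the n best attacks per player as (player, position, attack, damage)."""
    by_player = defaultdict(list)  # keeps players in first-seen order
    for player, attack, damage in rows:
        by_player[player].append((attack, damage))

    result = []
    for player, items in by_player.items():
        # Highest damage first; equal damage falls back to alphabetical attack name.
        ranked = sorted(items, key=lambda item: (-item[1], item[0]))
        for position, (attack, damage) in enumerate(ranked[:n], start=1):
            result.append((player, position, attack, damage))
    return result


for row in top_attacks(attacks):
    print(row)
```

For Hulk, three attacks tie at 500, so the alphabetical tie-break picks Smash and Stomp, matching the expected Sheet2 table.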

Count occurrences of words from multiple columns

I have a spreadsheet like this, where the values A-E are the same options coming from a form:
+------+------+------+
| Opt1 | Opt2 | Opt3 |
+------+------+------+
| A | A | B |
| B | C | A |
| C | C | B |
| A | E | C |
| D | B | E |
| B | E | D |
+------+------+------+
I want to make a ranking, showing the most chosen options for each option. I already have this, where Rank is the ranking of the option and number is the count of the option:
+------+------+------+
| Rank | Opt1 | Numb |
+------+------+------+
| 1 | A | 2 |
| 1 | B | 2 |
| 3 | C | 1 |
| 3 | D | 1 |
+------+------+------+ (I have 3 of these, one for each option)
I now want to make a summary of the 3 options, with the same kind of ranking but with the options combined. It would be something like:
+------+------+------+
| Rank |Opt123| Numb |
+------+------+------+
| 1 | B | 5 |
| 2 | A | 4 |
| 2 | C | 4 |
| 4 | E | 3 |
| 5 | D | 2 |
+------+------+------+
Would the easiest way to do this be to get the data from the three ranking tables, or from the original three data columns?
And how would I do this?
I already have the formula to get the names of the options, the count and ranking, but I don't know how to make them work with multiple columns.
What I have (the F column is one of the data columns):
Column B on another sheet:
=SORT(UNIQUE(FILTER('Form Responses'!F2:F;NOT(ISBLANK('Form Responses'!F2:F)))); RANK(COUNTIF('Form Responses'!F2:F; UNIQUE(FILTER('Form Responses'!F2:F;NOT(ISBLANK('Form Responses'!F2:F))))); COUNTIF('Form Responses'!F2:F; UNIQUE(FILTER('Form Responses'!F2:F;NOT(ISBLANK('Form Responses'!F2:F))))); TRUE); FALSE)
Column C:
=ArrayFormula(COUNTIF('Form Responses'!F2:F; FILTER(B2:B;NOT(ISBLANK(B2:B)))))
Column A:
=ARRAYFORMULA(SORT(RANK(FILTER(C2:C;NOT(ISBLANK(C2:C))); FILTER(C2:C;NOT(ISBLANK(C2:C))))))
Edited:
Merge cols:
=TRANSPOSE(split(join(",",D2:D,E2:E),","))
merges 2 cols, not very clean, but works. (Same as here Stacking multiple columns on to one?)
Full formula:
=SORT(UNIQUE(FILTER(TRANSPOSE(split(join(",",D2:D,E2:E),","));NOT(ISBLANK(TRANSPOSE(split(join(",",D2:D,E2:E),",")))))); RANK(COUNTIF(TRANSPOSE(split(join(",",D2:D,E2:E),",")); UNIQUE(FILTER(TRANSPOSE(split(join(",",D2:D,E2:E),","));NOT(ISBLANK(TRANSPOSE(split(join(",",D2:D,E2:E),","))))))); COUNTIF(TRANSPOSE(split(join(",",D2:D,E2:E),",")); UNIQUE(FILTER(TRANSPOSE(split(join(",",D2:D,E2:E),","));NOT(ISBLANK(TRANSPOSE(split(join(",",D2:D,E2:E),","))))))); TRUE); FALSE)
The transpose could be done after the sort.
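As a cross-check of the expected summary table, here is a short Python sketch that stacks the three option columns, counts every value, and assigns a rank where ties share the same rank. It assumes the sample responses from the question and that tied ranks leave a gap afterwards, as in the expected table; `rank_options` is a hypothetical name.

```python
from collections import Counter

# Form responses from the question: one tuple per row (Opt1, Opt2, Opt3).
responses = [
    ("A", "A", "B"),
    ("B", "C", "A"),
    ("C", "C", "B"),
    ("A", "E", "C"),
    ("D", "B", "E"),
    ("B", "E", "D"),
]


def rank_options(rows):
    """Count every non-blank cell across all columns and rank by count."""
    counts = Counter(cell for row in rows for cell in row if cell != "")
    # Highest count first; equal counts are listed alphabetically.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

    ranked = []
    for index, (option, count) in enumerate(ordered, start=1):
        # Ties share the rank of the first tied entry; the next rank skips ahead.
        rank = ranked[-1][0] if ranked and ranked[-1][2] == count else index
        ranked.append((rank, option, count))
    return ranked


for rank, option, count in rank_options(responses):
    print(rank, option, count)
```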
