Select GROUP BY column without selecting the Sum() Column


Say I have queried multiple sheets and gotten the following table in Google Sheets (it's nested inside another formula, so it is never actually displayed):
| Item | Amount | ID |
|------|--------|----|
| cat  | 3      | 1  |
| dog  | 2      | 2  |
| dog  | 4      | 2  |
| bird | 1      | 3  |
| bird | 2      | 3  |
| dog  | 1      | 2  |
If I want the sum for dogs and birds from this table, I could do something like this
(not exact syntax, just an example):
"SELECT Col1, SUM(Col2) WHERE Col1 = 'dog' OR Col1 = 'bird' GROUP BY Col1 ORDER BY SUM(Col2) DESC"
And I should get something like the following:
| Item | Amount |
|------|--------|
| dog  | 7      |
| bird | 3      |
But is there a way to return ONLY Col1 (still doing the grouping and ordering), so that if I put it side by side with a result that shows all columns, the rows would still line up correctly?

OK, solved it! Quite simple, really.
All I have to do is add another layer of nesting.
So:
=QUERY(table, "SELECT Col1, SUM(Col2) WHERE Col1 = 'dog' OR Col1 = 'bird' GROUP BY Col1 ORDER BY SUM(Col2) DESC")
becomes:
=QUERY(
QUERY(table, "SELECT Col1, SUM(Col2) WHERE Col1 = 'dog' OR Col1 = 'bird' GROUP BY Col1 ORDER BY SUM(Col2) DESC"),
"SELECT Col1"
)
Again, the syntax may not be exact, but it's the logic I was after :)
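The same two-layer idea can be sketched outside of Sheets. The Python below is only an illustration of the logic, not of QUERY itself: the data mirrors the question's table, and the function names `grouped_totals` and `first_column` are hypothetical. The inner step filters, groups, sums and sorts; the outer step keeps only the first column, so the row order is the same as in the full result.

```python
from collections import defaultdict

# Rows mirroring the question's table: (item, amount, id).
rows = [("cat", 3, 1), ("dog", 2, 2), ("dog", 4, 2),
        ("bird", 1, 3), ("bird", 2, 3), ("dog", 1, 2)]

def grouped_totals(rows, wanted):
    """Inner query: filter, group by item, sum amounts, sort by total descending."""
    totals = defaultdict(int)
    for item, amount, _id in rows:
        if item in wanted:
            totals[item] += amount
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)

def first_column(table):
    """Outer query: keep only the first column, preserving row order."""
    return [row[0] for row in table]

full = grouped_totals(rows, {"dog", "bird"})
names_only = first_column(full)
print(full)        # [('dog', 7), ('bird', 3)]
print(names_only)  # ['dog', 'bird']
```

Because the projection happens after the sort, `names_only` lines up row for row with `full`.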

Related

Google Sheets function to group and concat rows

Sample sheet: https://docs.google.com/spreadsheets/d/1AeP0sxDi0-3aaesUdCNTKfricIimjTMFaKO-FX9_g50/edit?usp=sharing
I am trying to find a formula that will group a table on a column and concat the values from all the rows in another column.
For example, if this is my table:
| name | value |
|-------|---------|
| one | alpha |
| two | bravo |
| three | charlie |
| one | delta |
| two | echo |
| four | foxtrot |
| two | golf |
| three | hotel |
| four | india |
This is what I want the formula to output:
| one | alpha, delta |
| two | bravo, echo, golf |
| three | charlie, hotel |
| four | foxtrot, india |
I wish I could share some formula that gets me close but I can't find anything. I thought maybe this formula but, as you can see from the sample sheet, it does not work.
=ARRAYFORMULA(JOIN(", ", TRANSPOSE(FILTER(B2:B, A2:A = {UNIQUE(A2:A)}))))
My thought was, get a unique list of values in the name column, and then use arrayformula to get a list of values in the value column where the name column equals each value in the unique list. :/

try:
=ARRAYFORMULA(REGEXREPLACE(TRIM(SPLIT(TRANSPOSE(
QUERY(QUERY({A2:A&"♦", B2:B&","},
"select max(Col2)
where Col1 !=''
group by Col2
pivot Col1"),,999^99)), "♦")), ",$", ))
or:
=ARRAYFORMULA(IFNA(VLOOKUP(UNIQUE(A2:A),
REGEXREPLACE(TRIM(SPLIT(TRANSPOSE(
QUERY(QUERY({A2:A&"♦", B2:B&","},
"select max(Col2)
where Col1 !=''
group by Col2
pivot Col1"),,999^99)), "♦")), ",$", ), {1, 2}, 0)))
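For readers who want to check the expected result independently of the formulas, here is a minimal Python sketch of "group by name, join the values". It does not reproduce the QUERY/pivot trick above; the function name `group_concat` is hypothetical, and the rows are the question's sample data. Groups are kept in order of first appearance, as in the desired output.

```python
# Sample data from the question: (name, value) pairs.
rows = [("one", "alpha"), ("two", "bravo"), ("three", "charlie"),
        ("one", "delta"), ("two", "echo"), ("four", "foxtrot"),
        ("two", "golf"), ("three", "hotel"), ("four", "india")]

def group_concat(rows, sep=", "):
    """Group values by name (first-seen order) and join each group with sep."""
    groups = {}  # regular dicts preserve insertion order in Python 3.7+
    for name, value in rows:
        groups.setdefault(name, []).append(value)
    return [(name, sep.join(values)) for name, values in groups.items()]

for name, joined in group_concat(rows):
    print(f"{name} | {joined}")
```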

Return a list of all pairs row/column if condition of cell value is met

Given a Google Sheet table similar to the following:
+---+---A--+---B--+---C--+---D--+
| 1 | | col1 | col2 | col3 |
|---+------+------+------+------+
| 2 | row1 | 1 | X | 45 |
| 3 | row2 | 5 | | |
| 4 | row3 | 4 | | 34 |
+---+------+------+------+------+
where row1, col1 are header labels and "X" is also a valid value for the combination "row/column";
I need to retrieve a list of all the possible combinations row/column headers that are not null, meaning in this example:
row1 | col1
row2 | col1
row3 | col1
row1 | col2
row1 | col3
row3 | col3
I tried different approaches, such as the ISBLANK function or QUERY, for example:
=QUERY(A1:D4, "SELECT A,B,C,D WHERE B IS NOT NULL OR C IS NOT NULL OR D IS NOT NULL",1)
but this simply returns a subset of the original table, and I cannot GROUP BY because there is no aggregate function.

Consider a spreadsheet with 3 sheets, "Main", "LabelTable", and "List".
"Main" contains the data:
+---+---A--+---B--+---C--+---D--+
| 1 | | col1 | col2 | col3 |
|---+------+------+------+------+
| 2 | row1 | 1 | X | 45 |
| 3 | row2 | 5 | | |
| 4 | row3 | 4 | | 34 |
+---+------+------+------+------+
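A named range covering the values, B2:D4, is created and named "Data" (it has to be the same size as the "Labels" range defined below, so it excludes the row headers in column A).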
My solution was to first make an accompanying table on "LabelTable" of the form:
+-------+---A--+------B-----+------C-----+------D-----+
| 1 | | col1 | col2 | col3 |
|-------+------+------------+------------+------------+
| 2 | row1 | row1 col1; | row1 col2; | row1 col3; |
| 3 | row2 | row2 col1; | row2 col2; | row2 col3; |
| 4 | row3 | row3 col1; | row3 col2; | row3 col3; |
+-------+------+------------+------------+------------+
where the number of rows and columns matches the size of the table on "Main".
If you have a large table, you can generate these labels by entering =$A2&","&B$1&";" in B2 and copying it across and down to the bottom right of the required range (replace the "," with whatever separator you want between the row and column labels).
I then created a named range "Labels" from B2:D4 (or, for larger tables, the full extent of this table's contents).
On "List", I then entered the formula:
=transpose(split(concatenate(arrayformula(if(not(ISBLANK(Data)),Labels,""))),";"))
This produces a single column of row/column label pairs, one for every cell in "Data" that is not blank.
Description
This uses ISBLANK() to determine if a cell is null or not. In combination with ARRAYFORMULA() on the "outside" this creates a table of TRUE or FALSE values. It then wraps the TRUE/FALSE table with IF() to create a table of labels corresponding to where the TRUE/FALSE table has TRUE. This is where the named range "Labels" comes in - it provides the cell reference of where there was not a null value.
All of this is then wrapped in CONCATENATE() to give a single cell with the references joined. Then SPLIT() is used to turn this into an array - and this is also why ";" was added into the cell references in "Labels". Finally, TRANSPOSE() is used to turn the 1xN array into an Nx1 array, i.e., a single column.
Limitations
If you have a dynamic table then you'll need to also make "Labels" dynamic to match. This method does not do that.
If for some reason there is a requirement that means you can't have the accompanying table on "LabelTable" then this method will not work. Some more will need to be done to "build that in" to the formula.
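The logic of the formula (test each cell for blankness, keep the matching label) can be sketched in Python as follows. This is an illustration under stated assumptions, not a translation of the Sheets functions: the function name `non_blank_pairs` and the `by_column` flag are hypothetical, and blank cells are represented as `None` or an empty string. The default order is row by row, as the concatenation in the formula would give; `by_column=True` gives the column-by-column order listed in the question.

```python
row_labels = ["row1", "row2", "row3"]
col_labels = ["col1", "col2", "col3"]
# Values from the "Main" table; None stands for an empty cell.
data = [[1, "X", 45],
        [5, None, None],
        [4, None, 34]]

def non_blank_pairs(row_labels, col_labels, data, by_column=False):
    """Return (row label, column label) for every non-blank cell."""
    hits = [(ri, ci)
            for ri, row in enumerate(data)
            for ci, cell in enumerate(row)
            if cell not in (None, "")]
    if by_column:
        # Reorder so that all hits in the first column come first, and so on.
        hits.sort(key=lambda hit: (hit[1], hit[0]))
    return [(row_labels[ri], col_labels[ci]) for ri, ci in hits]

for row, col in non_blank_pairs(row_labels, col_labels, data, by_column=True):
    print(f"{row} | {col}")
```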

QUERY to merge the names of two regions and sum the number of their occurrences

I have an array of states/regions in the UK. Some regions occur more than once in this list, so I have used COUNTIF to determine how many times each one occurs.
Now I need to run a QUERY to list the top 5 regions.
Generally, most occurrences are for the London area.
The issue is that two of the regions refer to the Greater London area: London and Greater London.
I need to merge these two and sum their values.
There should be only one region, Greater London, and its value should be the sum of London and Greater London.
This is the dataset I have:
+----------------+-------+
| State/Province | count |
+----------------+-------+
| Hampshire | 1 |
+----------------+-------+
| Kent | 2 |
+----------------+-------+
| West Lothian | 3 |
+----------------+-------+
| London | 4 |
+----------------+-------+
| Greater London | 5 |
+----------------+-------+
| Cheshire | 6 |
+----------------+-------+
I have managed to put together this QUERY so far:
=QUERY(A1:B,"select A, max(B) group by A order by max(B) desc limit 5 label max(B) 'Number of occurrences'",1)
That gives me this output:
+----------------+-----------------------+
| State/Province | Number of occurrences |
+----------------+-----------------------+
| Cheshire | 6 |
+----------------+-----------------------+
| Greater London | 5 |
+----------------+-----------------------+
| London | 4 |
+----------------+-----------------------+
| West Lothian | 3 |
+----------------+-----------------------+
| Kent | 2 |
+----------------+-----------------------+
What I need is the Greater London and London entries to be merged under the name Greater London and their numbers of occurrences to be summed, providing this result:
+----------------+-----------------------+
| State/Province | Number of occurrences |
+----------------+-----------------------+
| Greater London | 9 |
+----------------+-----------------------+
| Cheshire | 6 |
+----------------+-----------------------+
| West Lothian | 3 |
+----------------+-----------------------+
| Kent | 2 |
+----------------+-----------------------+
| Hampshire | 1 |
+----------------+-----------------------+
Apologies for not sharing a sheet, but I have security restrictions that are not allowing me to share any link to sheet outside the firm.

=QUERY(QUERY(ARRAYFORMULA(
{SUBSTITUTE(IF(A1:A="London","♥",A1:A),"♥","Greater London"),B1:B}),
"select Col1, sum(Col2)
where Col1 is not null
group by Col1"),
"select Col1, max(Col2)
group by Col1
order by max(Col2) desc
limit 5
label max(Col2)'Number of occurrences'",1)

=QUERY(ARRAYFORMULA(SUBSTITUTE(
IF((A1:A="London")+(A1:A="London2")+(A1:A="London3"),
"♥",A1:A),"♥","Greater London")),
"select Col1, count(Col1)
where Col1 is not null and not Col1 = '#N/A'
group by Col1
order by count(Col1) desc
limit 5
label count(Col1) 'Number of occurrences'", 1)

=QUERY(ARRAYFORMULA(SUBSTITUTE(
IF((QUERY(A1:B,"where B=1")="London")+
(QUERY(A1:B,"where B=1")="London2")+
(QUERY(A1:B,"where B=1")="London3"),
"♥",QUERY(A1:B,"where B=1")),"♥","Greater London")),
"select Col1, count(Col1)
where Col1 is not NULL and not Col1 = '#N/A'
group by Col1
order by count(Col1) desc
limit 5
label count(Col1) 'Number of occurrences'", 1)
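As a cross-check of the expected result for the first formula's approach (rename, then sum, then take the top 5), here is a small Python sketch. It is only an illustration: the `aliases` mapping and the function name `top_regions` are hypothetical, and the counts are the question's sample data.

```python
# Sample data from the question: (region, count).
counts = [("Hampshire", 1), ("Kent", 2), ("West Lothian", 3),
          ("London", 4), ("Greater London", 5), ("Cheshire", 6)]

# Names that should be folded into another region before summing.
aliases = {"London": "Greater London"}

def top_regions(counts, aliases, limit=5):
    """Rename aliased regions, sum counts per region, return the top `limit`."""
    totals = {}
    for region, n in counts:
        canonical = aliases.get(region, region)
        totals[canonical] = totals.get(canonical, 0) + n
    ranked = sorted(totals.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:limit]

for region, n in top_regions(counts, aliases):
    print(f"{region} | {n}")
```

Adding further spellings (for example the "London2" and "London3" used in the later formulas) would only require more entries in `aliases`.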

Counting number of columns a value appears in (Google Sheets)

I have a Google Sheet where I want to know the number of unique columns that a value appears in. For example, given the following sheet:
| A | B | C | D |
+-------+-------+-------+-------+
| Joe | Lisa | Lisa | Lisa |
| Joe | Lisa | Jenny | Lisa |
| Joe | Jenny | Jenny | John |
| Joe | Jenny | Katie | John |
| Joe | Jenny | Katie | John |
I would want something that counts Joe appearing in 1 column, Lisa appearing in 3, Jenny appearing in 2, Katie appearing in 1, and John appearing in 1, i.e.
| Name | Count |
+-------+-------+
| Joe | 1 |
| Lisa | 3 |
| Jenny | 2 |
| Katie | 1 |
| John | 1 |
What's the best way to do this?

Assuming the data has no spaces in it, try:
=ArrayFormula(QUERY(SPLIT(UNIQUE(TRANSPOSE(SPLIT(JOIN(" ", QUERY(A1:D&"_"&COLUMN(A1:D1),,ROWS(A1:A)))," "))), "_"), "Select Col1, count(Col2) where Col2 > 0 group by Col1 label count(Col2)''"))
If the source data has spaces, try:
=ArrayFormula(SUBSTITUTE(QUERY(SPLIT(UNIQUE(TRANSPOSE(SPLIT(JOIN(" ", QUERY(SUBSTITUTE(A1:D, " ", "~")&"_"&COLUMN(A1:D1),,ROWS(A1:A)))," "))), "_"), "Select Col1, count(Col2) where Col2 > 0 group by Col1 label count(Col2)''"), "~", " "))

EDIT
As Tom noticed, I misread the task. The correct formula is:
=QUERY(
QUERY(
{TRANSPOSE(SPLIT(TEXTJOIN("#",1,FILTER(COLUMN(A:D)*row(A:D)^0,A:A<>"")),"#")),
TRANSPOSE(SPLIT(TEXTJOIN("#",1,A:D),"#"))},
"select Col1, Col2, count(Col2) group by Col1, Col2"),
"select Col2, count(Col3) group by Col2 label Col2 'Name',count(Col3) 'Count'")
Credit: @tom-sharpe
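The counting logic behind the corrected formula (pair each value with its column, deduplicate, then count columns per value) can be sketched in Python. This is an illustration only; the function name `columns_per_value` is hypothetical and the grid is the question's sample sheet.

```python
# Sample sheet from the question, one inner list per row (columns A to D).
grid = [["Joe", "Lisa",  "Lisa",  "Lisa"],
        ["Joe", "Lisa",  "Jenny", "Lisa"],
        ["Joe", "Jenny", "Jenny", "John"],
        ["Joe", "Jenny", "Katie", "John"],
        ["Joe", "Jenny", "Katie", "John"]]

def columns_per_value(grid):
    """Map each value to the number of distinct columns it appears in."""
    columns_seen = {}
    for row in grid:
        for col_index, value in enumerate(row):
            if value != "":  # skip blank cells
                columns_seen.setdefault(value, set()).add(col_index)
    return {value: len(cols) for value, cols in columns_seen.items()}

for name, count in columns_per_value(grid).items():
    print(f"{name} | {count}")
```

Using a set per value is what makes repeated appearances in the same column count only once.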
My original formula counted the maximum number of times a name appears in a single row:
=QUERY(
QUERY(
{TRANSPOSE(SPLIT(TEXTJOIN("#",1,FILTER(COLUMN(A:D)^0*row(A:D),A:A<>"")),"#")),
TRANSPOSE(SPLIT(TEXTJOIN("#",1,A:D),"#"))},
"select Col1, Col2, count(Col2) group by Col1, Col2"),
"select Col2, max(Col3) group by Col2 label Col2 'Name'")

Counting number of occurrences in column?

What would be a good approach to calculate the number of occurrences in a spreadsheet column? Can this be done with a single array formula?
Example (column A is input, columns B and C are to be auto-generated):
| A | B | C |
+-------+-------+-------+
| Name | Name | Count |
+-------+-------+-------+
| Joe | Joe | 2 |
| Lisa | Lisa | 3 |
| Jenny | Jenny | 2 |
| Lisa | | |
| Lisa | | |
| Joe | | |
| Jenny | | |

A simpler approach:
In the first cell of column B, type
=UNIQUE(A:A)
Then in column C, use
=COUNTIF(A:A, B1)
and copy it down column C for every row that has a name in column B.
Edit: If that doesn't work for you, your spreadsheet locale may expect a semicolon instead of a comma as the argument separator:
=COUNTIF(A:A; B1)

Try:
=ArrayFormula(QUERY(A:A&{"",""};"select Col1, count(Col2) where Col1 != '' group by Col1 label count(Col2) 'Count'";1))
Update (22/07/2014): Some time in the last month, Sheets started supporting more flexible concatenation of arrays using embedded arrays, so the solution can be shortened slightly to:
=QUERY({A:A,A:A},"select Col1, count(Col2) where Col1 != '' group by Col1 label count(Col2) 'Count'",1)

=COUNTIF(A:A;"lisa")
You can replace the criterion with a cell reference from column B.

Just adding some extra sorting if needed
=QUERY(A2:A,"select A, count(A) where A is not null group by A order by count(A) DESC label A 'Name', count(A) 'Count'",-1)

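=arrayformula(if(isblank(B2:B),iferror(1/0),mmult(sign(B2:B=TRANSPOSE(A2:A)),sign(row(A2:A)))))
I got this from a good tutorial (I can't remember the title), probably one about using MMULT. The second MMULT argument has to be numeric, so a column of ones, sign(row(A2:A)), is used to sum the matches in each row.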

Put the following in B3 (credit to @Alexander-Ivanov for the COUNTIF condition):
={UNIQUE(A3:A),ARRAYFORMULA(COUNTIF(A3:A,"=" & UNIQUE(A3:A)))}
Benefits: it only requires editing one cell, it includes the names filtered by uniqueness, and it is concise.
Downside: it runs the UNIQUE function twice.
To run UNIQUE only once, split it into two cells:
B3: =UNIQUE(A3:A)
C3: =ARRAYFORMULA(COUNTIF(A3:A,"=" & B3:B))
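For comparison, the UNIQUE plus COUNTIF pattern used in these answers corresponds to a plain frequency count. A minimal Python sketch, using the question's sample names and the standard library's `collections.Counter` (the variable names are illustrative only):

```python
from collections import Counter

# Column A from the question, without the header row.
names = ["Joe", "Lisa", "Jenny", "Lisa", "Lisa", "Joe", "Jenny"]

# Counter keeps keys in order of first appearance (Python 3.7+),
# which matches what UNIQUE returns for the name column.
counts = Counter(names)

for name, count in counts.items():
    print(f"{name} | {count}")
# Joe | 2
# Lisa | 3
# Jenny | 2
```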
