Counting number of columns a value appears in (Google Sheets)

I have a Google Sheet where I want to know the number of unique columns that a value appears in. For example, given the following sheet:
| A | B | C | D |
+-------+-------+-------+-------+
| Joe | Lisa | Lisa | Lisa |
| Joe | Lisa | Jenny | Lisa |
| Joe | Jenny | Jenny | John |
| Joe | Jenny | Katie | John |
| Joe | Jenny | Katie | John |
I would want something that counts Joe appearing in 1 column, Lisa appearing in 3, Jenny appearing in 2, Katie appearing in 1, and John appearing in 1, i.e.
| Name | Count |
+-------+-------+
| Joe | 1 |
| Lisa | 3 |
| Jenny | 2 |
| Katie | 1 |
| John | 1 |
What's the best way to do this?

Assuming the data has no spaces in it, try:
=ArrayFormula(QUERY(SPLIT(UNIQUE(TRANSPOSE(SPLIT(JOIN(" ", QUERY(A1:D&"_"&COLUMN(A1:D1),,ROWS(A1:A)))," "))), "_"), "Select Col1, count(Col2) where Col2 > 0 group by Col1 label count(Col2)''"))
If the source data has spaces, try:
=ArrayFormula(SUBSTITUTE(QUERY(SPLIT(UNIQUE(TRANSPOSE(SPLIT(JOIN(" ", QUERY(SUBSTITUTE(A1:D, " ", "~")&"_"&COLUMN(A1:D1),,ROWS(A1:A)))," "))), "_"), "Select Col1, count(Col2) where Col2 > 0 group by Col1 label count(Col2)''"), "~", " "))

EDIT
As Tom pointed out, I misread the task. The correct formula is:
=QUERY(
QUERY(
{TRANSPOSE(SPLIT(TEXTJOIN("#",1,FILTER(COLUMN(A:D)*row(A:D)^0,A:A<>"")),"#")),
TRANSPOSE(SPLIT(TEXTJOIN("#",1,A:D),"#"))},
"select Col1, Col2, count(Col2) group by Col1, Col2"),
"select Col2, count(Col3) group by Col2 label Col2 'Name',count(Col3) 'Count'")
Credit: #tom-sharpe
My original formula counted the maximum number of times a name appears in a single row:
=QUERY(
QUERY(
{TRANSPOSE(SPLIT(TEXTJOIN("#",1,FILTER(COLUMN(A:D)^0*row(A:D),A:A<>"")),"#")),
TRANSPOSE(SPLIT(TEXTJOIN("#",1,A:D),"#"))},
"select Col1, Col2, count(Col2) group by Col1, Col2"),
"select Col2, max(Col3) group by Col2 label Col2 'Name'")
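To sanity-check what any of these formulas return, the intended logic can be sketched outside Sheets. This is only a rough Python sketch of "count the distinct columns each value appears in", not a translation of the formulas; the function name `count_columns_per_value` and the `sheet` list are made up for illustration, and blank cells are assumed to be empty strings.

```python
from collections import defaultdict

def count_columns_per_value(rows):
    """Map each non-empty value to the number of distinct columns it appears in."""
    columns_seen = defaultdict(set)  # value -> set of column indexes
    for row in rows:
        for col_index, value in enumerate(row):
            if value != "":  # assumption: blank cells are empty strings
                columns_seen[value].add(col_index)
    return {value: len(cols) for value, cols in columns_seen.items()}

# The sample sheet from the question (columns A to D).
sheet = [
    ["Joe", "Lisa", "Lisa", "Lisa"],
    ["Joe", "Lisa", "Jenny", "Lisa"],
    ["Joe", "Jenny", "Jenny", "John"],
    ["Joe", "Jenny", "Katie", "John"],
    ["Joe", "Jenny", "Katie", "John"],
]
column_counts = count_columns_per_value(sheet)
print(column_counts)
```

Using a set per value is what keeps repeats within one column from being counted more than once.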

Related

Google Sheets function to group and concat rows

Sample sheet: https://docs.google.com/spreadsheets/d/1AeP0sxDi0-3aaesUdCNTKfricIimjTMFaKO-FX9_g50/edit?usp=sharing
I am trying to find a formula that will group a table on a column and concat the values from all the rows in another column.
For example, if this is my table:
| name | value |
|-------|---------|
| one | alpha |
| two | bravo |
| three | charlie |
| one | delta |
| two | echo |
| four | foxtrot |
| two | golf |
| three | hotel |
| four | india |
This is what I want the formula to output:
| one | alpha, delta |
| two | bravo, echo, golf |
| three | charlie, hotel |
| four | foxtrot, india |
I wish I could share some formula that gets me close but I can't find anything. I thought maybe this formula but, as you can see from the sample sheet, it does not work.
=ARRAYFORMULA(JOIN(", ", TRANSPOSE(FILTER(B2:B, A2:A = {UNIQUE(A2:A)}))))
My thought was, get a unique list of values in the name column, and then use arrayformula to get a list of values in the value column where the name column equals each value in the unique list. :/
try:
=ARRAYFORMULA(REGEXREPLACE(TRIM(SPLIT(TRANSPOSE(
QUERY(QUERY({A2:A&"♦", B2:B&","},
"select max(Col2)
where Col1 !=''
group by Col2
pivot Col1"),,999^99)), "♦")), ",$", ))
or:
=ARRAYFORMULA(IFNA(VLOOKUP(UNIQUE(A2:A),
REGEXREPLACE(TRIM(SPLIT(TRANSPOSE(
QUERY(QUERY({A2:A&"♦", B2:B&","},
"select max(Col2)
where Col1 !=''
group by Col2
pivot Col1"),,999^99)), "♦")), ",$", ), {1, 2}, 0)))
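For comparison, here is the same group-and-concat idea as a small Python sketch. It is not what the formulas do internally, just the target behaviour; `group_and_concat` and `table` are hypothetical names, and rows with an empty name are assumed to be skipped (like the `where Col1 !=''` clause).

```python
def group_and_concat(pairs, separator=", "):
    """Group values by name, keeping first-appearance order, and join each group."""
    grouped = {}
    for name, value in pairs:
        if name == "":  # assumption: rows without a name are ignored
            continue
        grouped.setdefault(name, []).append(value)
    return {name: separator.join(values) for name, values in grouped.items()}

# The name/value table from the question.
table = [
    ("one", "alpha"), ("two", "bravo"), ("three", "charlie"),
    ("one", "delta"), ("two", "echo"), ("four", "foxtrot"),
    ("two", "golf"), ("three", "hotel"), ("four", "india"),
]
concatenated = group_and_concat(table)
for name, values in concatenated.items():
    print(name, "|", values)
```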

How can I get all non-empty columns and their contents?

I would like to know how to filter out empty columns when using data from one part of the sheet in another, without having to specify each column name, since more columns can be added.
I found this site and tried the formula there, but it sometimes includes a column (meaning the column has a non-empty value) without including that value, so the column looks blank when it shouldn't.
=ArrayFormula(Query(transpose(Query(TRANSPOSE({Query({'Test Data'!A1:Z1;Query({if('Test Data'!A2:Z<>"",1,0)},"Select "&JOIN(",","Sum(Col"&column('Test Data'!A1:Z1)&")"))},"Offset 1",1);'Test Data'!A2:Z}),"Select * Where Col2>0")),"Select * Offset 1",1))
I currently have this:
| | english | math | science |
|:-----------|------------:|:------------:|:-----------:|
| 8:30 | bob,jill | | |
| 9:40 | | | |
| 10:15 | | | mike |
I would like it to look like this (it's okay for a row to be empty):
| | english | science |
|:-----------|------------:|:-----------:|
| 8:30 | bob,jill | |
| 9:40 | | |
| 10:15 | | mike |
Any help would be appreciated.
The best way of doing this would be to re-pivot the data, like:
=ARRAYFORMULA(QUERY(SPLIT(TRANSPOSE(SPLIT(CONCATENATE(
IF(A2:A<>"", "♠"&A2:A&"♦"&IF(B2:Z<>"", B2:Z, "♥")&"♦"&B1:Z1, )), "♠")), "♦"),
"select Col1,max(Col2) where Col2 <> '♥' group by Col1 pivot Col3"))
If you want to keep all times, you will need:
=ARRAYFORMULA({QUERY(SPLIT(TRANSPOSE(SPLIT(CONCATENATE(
IF(A2:A<>"", "♠"&A2:A&"♦"&IF(B2:E<>"", B2:E, "♥")&"♦"&B1:E1, )), "♠")), "♦"),
"select Col1,max(Col2) where Col2 <> '♥' group by Col1 pivot Col3 limit 0");
{A2:A, IFERROR(VLOOKUP(A2:A, QUERY(SPLIT(TRANSPOSE(SPLIT(CONCATENATE(
IF(A2:A<>"", "♠"&A2:A&"♦"&IF(B2:E<>"", B2:E, "♥")&"♦"&B1:E1, )), "♠")), "♦"),
"select Col1,max(Col2) where Col2 <> '♥' group by Col1 pivot Col3"),
TRANSPOSE(ROW(INDIRECT("A2:A"&COLUMNS(QUERY(SPLIT(TRANSPOSE(SPLIT(CONCATENATE(
IF(A2:A<>"", "♠"&A2:A&"♦"&IF(B2:E<>"", B2:E, "♥")&"♦"&B1:E1, )), "♠")), "♦"),
"select Col1,max(Col2) where Col2 <> '♥' group by Col1 pivot Col3 limit 0"))))), 0))}})
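As a reference for the expected result, the "drop columns that are empty in every row" step can be sketched in Python. This is a minimal sketch under a few assumptions: the first column (the times) is always kept, blank cells are empty strings, and the names `drop_empty_columns`, `header` and `rows` are invented for the example.

```python
def drop_empty_columns(header, rows):
    """Keep column 0 plus every column that has at least one non-empty cell."""
    keep = [0] + [
        col for col in range(1, len(header))
        if any(row[col] != "" for row in rows)
    ]
    new_header = [header[col] for col in keep]
    new_rows = [[row[col] for col in keep] for row in rows]
    return new_header, new_rows

# The timetable from the question; "math" has no entries at all.
header = ["", "english", "math", "science"]
rows = [
    ["8:30", "bob,jill", "", ""],
    ["9:40", "", "", ""],
    ["10:15", "", "", "mike"],
]
filtered_header, filtered_rows = drop_empty_columns(header, rows)
print(filtered_header)
for row in filtered_rows:
    print(row)
```

Note that the empty 9:40 row survives here, which corresponds to the "keep all times" variant.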

QUERY to merge the names of two regions and sum the number of their occurrences

I have an array with states/regions in the UK. Some regions occur more than once in this list, so I have performed a COUNTIF to determine the number of times that each one occurs.
Now I need to run a QUERY to list top 5 regions.
Generally, most occurrences are for the London area.
The issue is that in the regions there are 2 states that refer to the Greater London area - London and Greater London.
These two I need to merge and sum their values.
There needs to be only one region - Greater London, and its value needs to hold the sum of London and Greater London.
This is the dataset I have:
+----------------+-------+
| State/Province | count |
+----------------+-------+
| Hampshire | 1 |
+----------------+-------+
| Kent | 2 |
+----------------+-------+
| West Lothian | 3 |
+----------------+-------+
| London | 4 |
+----------------+-------+
| Greater London | 5 |
+----------------+-------+
| Cheshire | 6 |
+----------------+-------+
I have managed to put together this QUERY so far:
=QUERY(A1:B,"select A, max(B) group by A order by max(B) desc limit 5 label max(B) 'Number of occurrences'",1)
That gives me this output:
+----------------+-----------------------+
| State/Province | Number of occurrences |
+----------------+-----------------------+
| Cheshire | 6 |
+----------------+-----------------------+
| Greater London | 5 |
+----------------+-----------------------+
| London | 4 |
+----------------+-----------------------+
| West Lothian | 3 |
+----------------+-----------------------+
| Kent | 2 |
+----------------+-----------------------+
What I need is the Greater London and London entries to be merged under the name Greater London and their numbers of occurrences to be summed, providing this result:
+----------------+-----------------------+
| State/Province | Number of occurrences |
+----------------+-----------------------+
| Greater London | 9 |
+----------------+-----------------------+
| Cheshire | 6 |
+----------------+-----------------------+
| West Lothian | 3 |
+----------------+-----------------------+
| Kent | 2 |
+----------------+-----------------------+
| Hampshire | 1 |
+----------------+-----------------------+
Apologies for not sharing a sheet, but I have security restrictions that prevent me from sharing a link to a sheet outside the firm.
=QUERY(QUERY(ARRAYFORMULA(
{SUBSTITUTE(IF(A1:A="London","♥",A1:A),"♥","Greater London"),B1:B}),
"select Col1, sum(Col2)
where Col1 is not null
group by Col1"),
"select Col1, max(Col2)
group by Col1
order by max(Col2) desc
limit 5
label max(Col2)'Number of occurrences'",1)
=QUERY(ARRAYFORMULA(SUBSTITUTE(
IF((A1:A="London")+(A1:A="London2")+(A1:A="London3"),
"♥",A1:A),"♥","Greater London")),
"select Col1, count(Col1)
where Col1 is not null and not Col1 = '#N/A'
group by Col1
order by count(Col1) desc
limit 5
label count(Col1) 'Number of occurrences'", 1)
=QUERY(ARRAYFORMULA(SUBSTITUTE(
IF((QUERY(A1:B,"where B=1")="London")+
(QUERY(A1:B,"where B=1")="London2")+
(QUERY(A1:B,"where B=1")="London3"),
"♥",QUERY(A1:B,"where B=1")),"♥","Greater London")),
"select Col1, count(Col1)
where Col1 is not NULL and not Col1 = '#N/A'
group by Col1
order by count(Col1) desc
limit 5
label count(Col1) 'Number of occurrences'", 1)
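To check the numbers, the merge-then-rank logic can be sketched in Python. This is only an illustration of the idea (rename aliases, sum per region, take the top 5); `top_regions`, `region_counts` and the `aliases` mapping are hypothetical names, not anything from the sheet.

```python
from collections import Counter

def top_regions(counts, aliases, limit=5):
    """Sum counts per region after renaming aliases, then return the largest ones."""
    merged = Counter()
    for region, count in counts:
        merged[aliases.get(region, region)] += count
    return merged.most_common(limit)

# The State/Province dataset from the question.
region_counts = [
    ("Hampshire", 1), ("Kent", 2), ("West Lothian", 3),
    ("London", 4), ("Greater London", 5), ("Cheshire", 6),
]
# Assumption: "London" is the only alias that needs merging.
aliases = {"London": "Greater London"}
top_five = top_regions(region_counts, aliases)
for region, total in top_five:
    print(region, total)
```

More aliases (the "London2", "London3" cases in the formulas above) would just be extra entries in the mapping.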

Select GROUP BY column without selecting the Sum() Column

Say I have searched multiple sheets with a query and gotten the following table in Google Sheets (it's nested, so it will never actually be displayed):
| Item | Amount | ID |
------------------------------
| cat | 3 | 1 |
------------------------------
| dog | 2 | 2 |
------------------------------
| dog | 4 | 2 |
------------------------------
| bird | 1 | 3 |
------------------------------
| bird | 2 | 3 |
------------------------------
| dog | 1 | 2 |
------------------------------
Obviously, if I want to get the sum of dogs and birds from this table, I could do something like this:
(not exact syntax, just an example)
"SELECT Col1, sum(Col2) WHERE Col1 = 'dog' OR Col1 = 'bird' GROUP BY Col1 ORDER BY sum(Col2) DESC"
And I should get something like the following:
| Item | Amount |
---------------------
| dog | 7 |
---------------------
| bird | 3 |
---------------------
But is there a way to return ONLY Col1 (still doing the grouping and ordering), so that if I put it side by side with a result that showed all columns, it would still line up correctly?
OK, solved it! Quite simple really.
All I have to do is add another layer of nesting.
So:
=QUERY(table, "SELECT Col1, sum(Col2) WHERE Col1 = 'dog' OR Col1 = 'bird' GROUP BY Col1 ORDER BY sum(Col2) DESC")
Becomes:
=QUERY(
QUERY(table, "SELECT Col1, sum(Col2) WHERE Col1 = 'dog' OR Col1 = 'bird' GROUP BY Col1 ORDER BY sum(Col2) DESC"),
"SELECT Col1"
)
Again, the syntax may not be exact, but it's the logic I was after :)
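The two-layer idea (aggregate and order first, then keep only the name column) looks like this as a rough Python sketch. The names `grouped_names_by_total` and `table` are made up, and the ordering is assumed to be by descending total, which matches the dog-then-bird order in the expected table from the question.

```python
def grouped_names_by_total(rows, wanted):
    """Sum amounts per item, order by total (largest first), return only the names."""
    totals = {}
    for item, amount, _item_id in rows:
        if item in wanted:
            totals[item] = totals.get(item, 0) + amount
    # Inner step: group and order. Outer step: project just the first column.
    ordered = sorted(totals.items(), key=lambda pair: pair[1], reverse=True)
    return [item for item, _total in ordered]

# The Item / Amount / ID table from the question.
table = [
    ("cat", 3, 1), ("dog", 2, 2), ("dog", 4, 2),
    ("bird", 1, 3), ("bird", 2, 3), ("dog", 1, 2),
]
names_only = grouped_names_by_total(table, {"dog", "bird"})
print(names_only)
```

Because the names come out in the same order as the full grouped result, the two outputs line up when placed side by side.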

Counting number of occurrences in column?

What would be a good approach to calculate the number of occurrences in a spreadsheet column? Can this be done with a single array formula?
Example (column A is input, columns B and C are to be auto-generated):
| A | B | C |
+-------+-------+-------+
| Name | Name | Count |
+-------+-------+-------+
| Joe | Joe | 2 |
| Lisa | Lisa | 3 |
| Jenny | Jenny | 2 |
| Lisa | | |
| Lisa | | |
| Joe | | |
| Jenny | | |
A simpler approach:
At the beginning of column B, type
=UNIQUE(A:A)
Then in column C, use
=COUNTIF(A:A, B1)
and copy it down the rows of column C.
Edit: If that doesn't work for you, try using semicolon instead of comma:
=COUNTIF(A:A; B1)
Try:
=ArrayFormula(QUERY(A:A&{"",""};"select Col1, count(Col2) where Col1 != '' group by Col1 label count(Col2) 'Count'";1))
22/07/2014: Some time in the last month, Sheets started supporting more flexible concatenation of arrays using an embedded array. So the solution may be shortened slightly to:
=QUERY({A:A,A:A},"select Col1, count(Col2) where Col1 != '' group by Col1 label count(Col2) 'Count'",1)
=COUNTIF(A:A;"lisa")
You can replace the criteria with cell references from Column B
Just adding some extra sorting if needed
=QUERY(A2:A,"select A, count(A) where A is not null group by A order by count(A) DESC label A 'Name', count(A) 'Count'",-1)
=arrayformula(if(isblank(B2:B),iferror(1/0),mmult(sign(B2:B=TRANSPOSE(A2:A)),A2:A)))
I got this from a good tutorial - can't remember the title - probably about using MMult
Put the following in B3 (credit to #Alexander-Ivanov for the countif condition):
={UNIQUE(A3:A),ARRAYFORMULA(COUNTIF(A3:A,"=" & UNIQUE(A3:A)))}
Benefits: it only requires editing 1 cell, it includes the names filtered by uniqueness, and it is concise.
Downside: it runs the unique function 2x. (The count range has to be the source column A3:A; counting inside the unique list itself would return 1 for every name.)
To use the unique function only once, split it into 2 cells:
B3: =UNIQUE(A3:A)
C3: =ARRAYFORMULA(COUNTIF(A3:A,"=" & B3:B))
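If you want to double-check the counts outside the sheet, the same tally is a one-liner in Python. A minimal sketch, with the `names` list typed in by hand from the example column A (header row excluded):

```python
from collections import Counter

# Column A from the example, without the "Name" header.
names = ["Joe", "Lisa", "Jenny", "Lisa", "Lisa", "Joe", "Jenny"]

# Counter tallies each distinct name, like UNIQUE plus COUNTIF combined.
occurrences = Counter(names)
for name, count in occurrences.items():
    print(name, count)
```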
