Combine duplicate rows in column as comma separated values - Google Query - google-sheets

If I have two columns, ID and Name, where the ID column contains duplicates, and I want to group by ID to get unique IDs with the Name column as a comma-separated list, is this possible in Google Query?
| ID | Name |
===============
| 1001 | abc |
---------------
| 1001 | def |
---------------
| 1002 | kjg |
---------------
| 1003 | aof |
---------------
| 1003 | lmi |
---------------
| 1004 | xyz |
---------------
into
| ID | Name |
====================
| 1001 | abc, def |
--------------------
| 1002 | kjg |
--------------------
| 1003 | aof, lmi |
--------------------
| 1004 | xyz |
--------------------
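
To pin down the intended transformation independently of any particular Sheets formula, here is a minimal Python sketch. It is only an illustration of the grouping logic, not a Sheets solution; the function name `combine_names` is made up for this example, and it assumes rows should keep the order in which each ID first appears.

```python
def combine_names(rows):
    """Group (id, name) pairs by id and join each group's names with ', '."""
    grouped = {}  # dicts preserve insertion order, so IDs stay in first-seen order
    for row_id, name in rows:
        grouped.setdefault(row_id, []).append(name)
    return [(row_id, ", ".join(names)) for row_id, names in grouped.items()]


# Sample data taken from the table in the question.
rows = [
    (1001, "abc"), (1001, "def"),
    (1002, "kjg"),
    (1003, "aof"), (1003, "lmi"),
    (1004, "xyz"),
]
result = combine_names(rows)
for row_id, names in result:
    print(row_id, "|", names)
```

Any formula answer should produce the same pairs for this sample.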

try:
=ARRAYFORMULA({QUERY(QUERY({A2:B, B2:B},
"select Col1,max(Col2)
where Col1 is not null
group by Col1
pivot Col3"),
"select Col1
offset 1", 0), REGEXREPLACE(TRIM(
TRANSPOSE(QUERY(TRANSPOSE(QUERY(QUERY({A2:B&",", B2:B},
"select max(Col2)
where Col1 is not null
and Col2 <> ','
group by Col1
pivot Col3"),
"offset 1", 0)),,999^9))), ",$", )})
However, this may not work for massive datasets because of the limitations of TRIM (needed to remove the extra spaces) and REGEXREPLACE (needed to remove the trailing comma). Without those two functions, the formula can handle anything:
=ARRAYFORMULA({QUERY(QUERY({A2:B, B2:B},
"select Col1,max(Col2)
where Col1 is not null
group by Col1
pivot Col3"),
"select Col1
offset 1", 0),
TRANSPOSE(QUERY(TRANSPOSE(QUERY(QUERY({A2:B&",", B2:B},
"select max(Col2)
where Col1 is not null
and Col2 <> ','
group by Col1
pivot Col3"),
"offset 1", 0)),,999^9))})

I looked through the Query specification and could not find a solution, so I made some formulas that do the job (because I found this task interesting).
D2 contains =unique(a2:a)
E2 contains =join(", ",transpose(filter($B$2:$B,$A$2:$A=D2))) and it's copied down.
I had to copy the formula down, so it's far from a beautiful solution.
Hope you find it helpful.
Reference
UNIQUE
JOIN
TRANSPOSE
FILTER

Here is an answer using QUERY.
=ARRAYFORMULA(REGEXREPLACE(TRIM(SPLIT(TRANSPOSE(SPLIT(
CONCATENATE(TRANSPOSE(QUERY({"♦"&A2:A&"♠", B2:B&", "},
"select max(Col2) where Col2 is not null group by Col2 pivot Col1", 0))),
"♦")), "♠")), ",$", ))
This comes directly from this question.
Player0 has answers with just amazing formulas that are able to reorganise data in a huge variety of ways.

if you could live with the end-comma present in the output you can try:
=ARRAYFORMULA({QUERY(QUERY({A2:B, B2:B},
"select Col1,max(Col3)
where Col1 is not null
and Col3 <> ','
group by Col1
pivot Col2"),
"select Col1 offset 1", 0),
TRANSPOSE(QUERY(TRANSPOSE(IFERROR(VLOOKUP(QUERY(QUERY({A2:B, B2:B},
"select Col1,max(Col3)
where Col1 is not null
and Col3 <> ','
group by Col1
pivot Col2"),
"select Col1 offset 1", 0),
QUERY(QUERY({A2:B, B2:B&","},
"select Col1,max(Col3)
where Col1 is not null
and Col3 <> ','
group by Col1
pivot Col2"),
"offset 1", 0),
SPLIT(TRANSPOSE(QUERY(TRANSPOSE(IF(QUERY(QUERY({A2:B, B2:B&","},
"select max(Col3)
where Col1 is not null
and Col3 <> ','
group by Col1
pivot Col2"),
"offset 1", 0)="",,COLUMN(B2:XXX)&",")),,999^99)), ","), 0))),,999^99))})
(this was never tested on an ultra-massive dataset, but in theory it should handle anything too)

Related

Count unique comma-separated values in column B based on criteria in column A

How do I list and count unique comma-separated values (column B in the example below) if the number in column A is larger (or smaller) than X? In other words, how do I turn the below table...
| Day | Fruits               |
|-----|----------------------|
| 20  | Apple, Banana, Pearl |
| 24  | Apple, Pearl         |
| 32  | Banana, Pearl        |
...into this 👇, with criteria: Day < 28.
| Fruit  | Frequency |
|--------|-----------|
| Apple  | 2         |
| Pearl  | 2         |
| Banana | 1         |
A solution proposed by @AdamL in this question is really close to what I want to achieve, but I can't figure out how to list values based on criteria from another column. Here's what Adam came up with:
=ArrayFormula(QUERY(TRANSPOSE(SPLIT(JOIN(",",A:A),",")&{"";""}),"select Col1, count(Col2) group by Col1 label count(Col2) ''",0))
use:
=ARRAYFORMULA(QUERY(SPLIT(FLATTEN(
IF(IFERROR(SPLIT(B2:B, ","))="",,A2:A&"♦"&TRIM(SPLIT(B2:B, ",")))), "♦"),
"select Col2,count(Col2)
where Col1 < 28
and Col1 is not null
group by Col2
label count(Col2)''"))
I think it's a bit easier to filter on column A first then split:
=ArrayFormula(query(flatten(split(filter(B2:B,A2:A<28,A2:A<>""),",")),"select Col1,count(Col1) where Col1 is not null group by Col1 label Col1 'Fruit',Count(Col1) 'Count'"))
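
The "filter on column A first, then split" idea can be sketched in Python as follows. This is only an illustration of the logic, not a Sheets formula; `count_fruits` and `max_day` are names invented for this example, and trimming whitespace around each fruit is an assumption based on the ", " separators in the sample data.

```python
from collections import Counter


def count_fruits(rows, max_day):
    """Count comma-separated fruits, using only rows whose day is below max_day."""
    counts = Counter()
    for day, fruits in rows:
        if day < max_day:  # filter on the Day column first
            # then split the cell and trim the spaces left by ", "
            counts.update(f.strip() for f in fruits.split(",") if f.strip())
    return counts


# Sample data taken from the table in the question.
rows = [
    (20, "Apple, Banana, Pearl"),
    (24, "Apple, Pearl"),
    (32, "Banana, Pearl"),
]
counts = count_fruits(rows, 28)
for fruit, frequency in counts.most_common():
    print(fruit, "|", frequency)
```

Note that without the trimming step, " Pearl" and "Pearl" would be counted as different values.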
Considering Column 'A' as days and Column 'B' as fruits with row 1 as headers, AdamL's formula works for you with a little tweak as below:
=ArrayFormula(QUERY(TRANSPOSE(SPLIT(JOIN(",",B2:B),", ",True)&{"";""}),"select Col1, count(Col2) group by Col1 label count(Col2) ''",0))
Hope this solves your problem.

How to transpose this data in Google Sheets

I have data in a Google Sheet and I want to fetch it into another sheet, but transposed.
Here is an example:
Column A | Column B | Column C
=================================
site1.com | Name 1 | Name 2
site2.com | Name 3 | Name 4
site3.com | Name 5 | Name 6
I want the data like this:
Column A | Column B | Column C
=================================
site1.com | site2.com | site3.com
Name 1 | Name 3 | Name 5
Name 2 | Name 4 | Name 6
I don't want to enter a formula manually in every row, so can ARRAYFORMULA do this automatically?
I'm trying this, but I'm not able to get what I want.
=ARRAYFORMULA(TRANSPOSE(Sheet1!$B2:B300 & Sheet1!$L2:L300))
try:
=ARRAYFORMULA({TRANSPOSE(SORT(FILTER(A:A, A:A<>"")));
SUBSTITUTE(TRANSPOSE(SPLIT(TRANSPOSE(QUERY(TRANSPOSE(
IF(ISNUMBER(QUERY(QUERY({A:B; A:A, C:C},
"select count(Col1) where Col1 is not null group by Col1 pivot Col2"),
"offset 1", 0)), SUBSTITUTE(QUERY(QUERY({A:B; A:A, C:C},
"select count(Col1) where Col1 is not null group by Col1 pivot Col2"),
"limit 0"), " ", "♦"), )),,999^99)), " ")), "♦" , " ")})
or you can do just:
=TRANSPOSE(A1:C3)
or:
=TRANSPOSE(INDIRECT("A1:C"&COUNTA(A:A)))
UPDATE:
=QUERY(TRANSPOSE(INDIRECT("Sheet1!A2:L"&COUNTA(Sheet1!A2:A)+1)), "offset 1")

Concatenate list of IDs based on matching IDs within a comma separated list in a different column

Using google sheets:
I've got two columns.
Column A is a list of ID numbers:
N1
N2
N3
N4
N5
Column B is a comma separated list of other ID numbers within column A related to the ID number on that same row:
N2,N3
N3,N4
(null)
(null)
N1
I'm trying to make a formula in a third column, column C, that will display a comma-separated list of the ID numbers from column A whose column B entry contains the ID number on that row.
Intended result:
A | B | C
N1 | N2,N3 | N4,N5
N2 | N3,N4 | N1
N3 | (null) | N1,N2
N4 | N1 | N2
N5 | N1 | (null)
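
The intended result amounts to a reverse lookup: for each ID, collect the IDs whose column B cell mentions it. A minimal Python sketch of that logic follows, purely as an illustration and not as a Sheets formula. The function name `reverse_references` is hypothetical, and the sample uses the column B values shown in the intended-result table above, with `None` standing in for (null).

```python
def reverse_references(ids, refs):
    """For each id, list the ids whose comma-separated reference cell contains it."""
    referenced_by = {i: [] for i in ids}
    for source, ref_cell in zip(ids, refs):
        for target in (ref_cell or "").split(","):
            target = target.strip()
            if target in referenced_by:  # ignore blanks and unknown IDs
                referenced_by[target].append(source)
    return {i: ",".join(sources) for i, sources in referenced_by.items()}


ids = ["N1", "N2", "N3", "N4", "N5"]
refs = ["N2,N3", "N3,N4", None, "N1", "N1"]  # column B from the table above
column_c = reverse_references(ids, refs)
for i in ids:
    print(i, "|", column_c[i] or "(null)")
```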
The closest I could get was this formula here:
=arrayFormula({concatenate(rept(A:A&",",B:B=A2))})
But this will only work if multiple items haven't been entered into column B, so using this only "N4,N5" would be returned in column C rather than the rest shown in the intended result.
Edit(updated Image): I'm now seeing the following, seems there's an error somewhere:
try like this:
=ARRAYFORMULA(IFERROR(VLOOKUP(A1:A,
{QUERY(QUERY(SPLIT(TRIM(TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY(TRANSPOSE(
IF(IFERROR(SPLIT(B1:B, ","))<>"", "♦"&SPLIT(B1:B, ",")&"♠"&A1:A, ))
,,999^99)),,999^99), "♦"))), "♠"),
"select Col1,count(Col1) group by Col1 pivot Col2"), "select Col1 offset 1", 0),
SUBSTITUTE(REGEXREPLACE(TRIM(TRANSPOSE(QUERY(TRANSPOSE(IF(ISNUMBER(
QUERY(QUERY(SPLIT(TRIM(TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY(TRANSPOSE(
IF(IFERROR(SPLIT(B1:B, ","))<>"", "♦"&SPLIT(B1:B, ",")&"♠"&A1:A, ))
,,999^99)),,999^99), "♦"))), "♠"),
"select count(Col1) group by Col1 pivot Col2"), "offset 1", 0)),
QUERY(QUERY(SPLIT(TRIM(TRANSPOSE(SPLIT(QUERY(TRANSPOSE(QUERY(TRANSPOSE(
IF(IFERROR(SPLIT(B1:B, ","))<>"", "♦"&SPLIT(B1:B, ",")&"♠"&A1:A, ))
,,999^99)),,999^99), "♦"))), "♠"),
"select count(Col1) group by Col1 pivot Col2"), "limit 0", 1)&",", ))
,,999^99))), ",$", ), ", ", ",")}, 2, 0)))

How can I get all non-empty columns and their contents?

I would like to know how I can filter out empty columns when using data from one part of the sheet in another without having to specify each column name since more columns can be added.
I found this site and tried the formula there, but it sometimes includes a column (meaning it has a non-empty value) without including that value, so the column looks blank when it shouldn't be.
=ArrayFormula(Query(transpose(Query(TRANSPOSE({Query({'Test Data'!A1:Z1;Query({if('Test Data'!A2:Z<>"",1,0)},"Select "&JOIN(",","Sum(Col"&column('Test Data'!A1:Z1)&")"))},"Offset 1",1);'Test Data'!A2:Z}),"Select * Where Col2>0")),"Select * Offset 1",1))
I currently have this:
| | english | math | science |
|:-----------|------------:|:------------:|:-----------:|
| 8:30 | bob,jill | | |
| 9:40 | | | |
| 10:15 | | | mike |
I would like it to look like this (it's okay for a row to be empty):
| | english | science |
|:-----------|------------:|:-----------:|
| 8:30 | bob,jill | |
| 9:40 | | |
| 10:15 | | mike |
any help would be appreciated.
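
The requested behaviour (keep every row, drop only the columns that are empty in all rows) can be sketched in Python like this. It is an illustration of the logic only, not a Sheets formula; `drop_empty_columns` and its `keep` parameter are names invented for this example, and treating `""` as the only "empty" value is an assumption.

```python
def drop_empty_columns(header, rows, keep=1):
    """Keep the first `keep` columns plus every column with a non-empty cell."""
    kept = [
        col for col in range(len(header))
        if col < keep or any(row[col] != "" for row in rows)
    ]
    new_header = [header[col] for col in kept]
    new_rows = [[row[col] for col in kept] for row in rows]
    return new_header, new_rows


# Sample data taken from the table in the question.
header = ["", "english", "math", "science"]
rows = [
    ["8:30", "bob,jill", "", ""],
    ["9:40", "", "", ""],
    ["10:15", "", "", "mike"],
]
new_header, new_rows = drop_empty_columns(header, rows)
print(new_header)
for row in new_rows:
    print(row)
```

The 9:40 row stays in the output even though all of its subject cells are empty, which matches the desired table.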
the best way of doing this would be to re-pivot it again like:
=ARRAYFORMULA(QUERY(SPLIT(TRANSPOSE(SPLIT(CONCATENATE(
IF(A2:A<>"", "♠"&A2:A&"♦"&IF(B2:Z<>"", B2:Z, "♥")&"♦"&B1:Z1, )), "♠")), "♦"),
"select Col1,max(Col2) where Col2 <> '♥' group by Col1 pivot Col3"))
if you want to keep all times you will need:
=ARRAYFORMULA({QUERY(SPLIT(TRANSPOSE(SPLIT(CONCATENATE(
IF(A2:A<>"", "♠"&A2:A&"♦"&IF(B2:E<>"", B2:E, "♥")&"♦"&B1:E1, )), "♠")), "♦"),
"select Col1,max(Col2) where Col2 <> '♥' group by Col1 pivot Col3 limit 0");
{A2:A, IFERROR(VLOOKUP(A2:A, QUERY(SPLIT(TRANSPOSE(SPLIT(CONCATENATE(
IF(A2:A<>"", "♠"&A2:A&"♦"&IF(B2:E<>"", B2:E, "♥")&"♦"&B1:E1, )), "♠")), "♦"),
"select Col1,max(Col2) where Col2 <> '♥' group by Col1 pivot Col3"),
TRANSPOSE(ROW(INDIRECT("A2:A"&COLUMNS(QUERY(SPLIT(TRANSPOSE(SPLIT(CONCATENATE(
IF(A2:A<>"", "♠"&A2:A&"♦"&IF(B2:E<>"", B2:E, "♥")&"♦"&B1:E1, )), "♠")), "♦"),
"select Col1,max(Col2) where Col2 <> '♥' group by Col1 pivot Col3 limit 0"))))), 0))}})

How to concat query results?

I've got data like this:
A | B
-----------------------
Date | Data
30/06/2015 | 1.2
01/07/2015 | 2
01/07/2015 | 3
02/07/2015 | 2
02/07/2015 | 3
And I write a simple query like this:
=query(A:B; "select YEAR(A) || ' ' || month(A), SUM(B)
group by YEAR(A), month(A)")
But I got a "Value Error", so I wrote this:
=query(A:B; "select YEAR(A), MONTH(A), SUM(B)
group by YEAR(A), month(A)")
And it works well, but how can I concatenate the two values in one cell?
I thought of this:
=QUERY(Journal!A:B;"select A, SUM(B)
group by A
format A 'yyyy-MM'")
But the format operation is applied AFTER the group by, so it's as if the group by never happened.
Any ideas?
[Solution]
=query({Arrayformula(text(Foo!A:A; "yyyy-MM")) \Foo!B:B}; "select Col1, sum(Col2) group by Col1"; -1)
Because my locale is French, I have to use '\' in the ArrayFormula and ';' instead of ','...
Use the ampersand (&) to concatenate in arrayformula:
=ArrayFormula(if(len(A2:A), year(A2:A)&" "&month(A2:A),))
or if you don't want the space between year and month:
=ArrayFormula(if(len(A2:A), year(A2:A)&month(A2:A),))
EDIT:
If you want to query and sum, try:
=query({Arrayformula(text(Journal!A:A, "yyyy-MM")),Journal!B:B}, "select Col1, sum(Col2) where Col2 is not null group by Col1", 0)
or in locales that use semicolons as argument separators:
=query({Arrayformula(text(Journal!A:A; "yyyy-MM"))\Journal!B:B}; "select Col1, sum(Col2) where Col2 is not null group by Col1"; 0)
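
The reason these formulas work is that the date is turned into a "yyyy-MM" text key before grouping, so all days of a month share one key. A minimal Python sketch of the same idea is shown below, as an illustration only and not as a Sheets formula. The name `sum_by_month` is hypothetical, and the dates are assumed to be dd/mm/yyyy text as in the sample data.

```python
from collections import defaultdict
from datetime import datetime


def sum_by_month(rows):
    """Sum values per 'YYYY-MM' key derived from dd/mm/yyyy date strings."""
    totals = defaultdict(float)
    for date_text, value in rows:
        # Build the grouping key first, then aggregate on it.
        key = datetime.strptime(date_text, "%d/%m/%Y").strftime("%Y-%m")
        totals[key] += value
    return dict(totals)


# Sample data taken from the table in the question.
rows = [
    ("30/06/2015", 1.2),
    ("01/07/2015", 2),
    ("01/07/2015", 3),
    ("02/07/2015", 2),
    ("02/07/2015", 3),
]
totals = sum_by_month(rows)
for month, total in totals.items():
    print(month, "|", total)
```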
