Group the data in one column per the values in another column - google-sheets

I have data like the following:
email id            subject of interest
ramesh#axito.com    Java,C++
mnp#axito.com       VB
ramesh#axito.com    Python
mohan#axito.com     Java,C++
mnp#axito.com       JS
rohan#axito.com     C#
But I need it in the format below:
email id            subject of interest
ramesh#axito.com    Java,C++,Python
mnp#axito.com       VB,JS
mohan#axito.com     Java,C++
rohan#axito.com     C#
Can someone please tell me how I can do this?

First, create the list of unique email addresses with =unique(A2:A). Suppose this is done in column C.
Then in cell D2, enter =join(",", filter(B$2:B, A$2:A=C2)) and drag this formula down column D.
Explanation: filter keeps only the entries from column B with matching email; join joins them into a comma-separated list.
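For readers who want to see the same unique / filter / join logic outside of Sheets, here is a minimal Python sketch. The function name group_subjects and the in-memory rows list are assumptions made for illustration; the data is the sample from the question.

```python
def group_subjects(rows):
    """Collect the subject strings per email, keeping first-seen order (like UNIQUE)."""
    grouped = {}
    for email, subjects in rows:
        # Equivalent of FILTER(B, A = email): gather every matching entry.
        grouped.setdefault(email, []).append(subjects)
    # Equivalent of JOIN(",", ...): one comma-separated string per email.
    return [(email, ",".join(parts)) for email, parts in grouped.items()]


rows = [
    ("ramesh#axito.com", "Java,C++"),
    ("mnp#axito.com", "VB"),
    ("ramesh#axito.com", "Python"),
    ("mohan#axito.com", "Java,C++"),
    ("mnp#axito.com", "JS"),
    ("rohan#axito.com", "C#"),
]

for email, subjects in group_subjects(rows):
    print(email, subjects)
```

The result keeps one row per email in order of first appearance, which matches the layout the question asks for.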

Try using the QUERY function:
=QUERY({A:B,A:B},"select Col1, Count(Col2) where Col1 <> '' group by Col1 pivot Col4")
Also try this formula, which is a single-formula solution:
={UNIQUE(FILTER(A2:A,A2:A>0)),TRANSPOSE(
SPLIT(
", "&join(", ",
ARRAYFORMULA(
if(query(A:B,"select A where not A is null order by A",0)=
query(A:B,"select A where not A is null order by A limit "&COUNT(query(A:B,"select A where not A is null",0))-1,1),"","|")
& query(A:B,"select B where not A is null order by A",0)
& " "
)
)
,", |",0)
)}

Related

Aggregating rows with query in google sheets

I have a data set that looks something like this:
Column A        Column B
category 1      Team 1
1.category 1    Team 1
2.category 2    Team 1
category 2      Team 1
category 3      Team 1
3.category 3    Team 1
I am trying to use the query function with a pivot clause to count the occurrences of each category for Team 1 (I have several other teams in the data set, but for simplicity I wrote out my example with Team 1 only). Unfortunately, the naming of the categories is not consistent in the original data, and I cannot change it.
So I need a way to combine the counts for category 1 and 1.category 1, and so on.
How could I rewrite this to get the type of result listed below?
Category      Team 1
category 1    2
category 2    2
category 3    2
The formula I have now is as follows:
query('sheet1'!A:B,"Select A, count(B) where B='Team 1' group by A pivot B label B 'Team 1'",1)
If the category names all have a similar format to those in your example (with extraneous data only at the beginning, followed by 'category N'), and you don't care if zero counts per category are left blank, then a more compact approach than the previous answer is (for any number of teams/categories):
=arrayformula(query({regexextract(A2:A,"category.+"),B2:B},"select Col1,count(Col1) where Col2 is not null group by Col1 pivot Col2 label Col1 'Category'",0))
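To make the idea concrete, here is a rough Python sketch of what that formula does: strip everything before 'category' with a regular expression, then count per (category, team) pair. The function name count_by_category and the in-memory rows list are assumptions for illustration. Unlike the formula, this sketch skips rows that do not contain 'category' at all.

```python
import re
from collections import Counter


def count_by_category(rows):
    """Count rows per (normalized category, team) pair."""
    counts = Counter()
    for raw_category, team in rows:
        if not team:
            continue  # mirrors "where Col2 is not null"
        match = re.search(r"category.+", raw_category)
        if match:
            # "1.category 1" and "category 1" both normalize to "category 1".
            counts[(match.group(0), team)] += 1
    return counts


rows = [
    ("category 1", "Team 1"),
    ("1.category 1", "Team 1"),
    ("2.category 2", "Team 1"),
    ("category 2", "Team 1"),
    ("category 3", "Team 1"),
    ("3.category 3", "Team 1"),
]

for (category, team), n in sorted(count_by_category(rows).items()):
    print(category, team, n)
```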
formula:
=ArrayFormula(
LAMBDA(DATA,CATEGORY,
LAMBDA(RESULT,
LAMBDA(RESULT,
IF(RESULT="",0,RESULT)
)(QUERY(SPLIT(TRANSPOSE(SPLIT(RESULT,"&")),"|"),"SELECT Col1,SUM(Col3) GROUP BY Col1 PIVOT Col2 LABEL Col1'Category'",0))
)(
JOIN("&",
BYROW(CATEGORY,LAMBDA(CAT,
JOIN("&",CAT&"|"&BYROW(TRANSPOSE(QUERY(DATA,"SELECT COUNT(Col1) WHERE lower(Col1) CONTAINS'"&CAT&"' PIVOT Col2",0)),LAMBDA(ROW,JOIN("|",ROW))))
))
)
)
)({ASC($A$2:$B$7)},{"category 1";"category 2";"category 3"})
)
use ASC() to format all number-like values into numbers,
use {} to create the match conditions,
iterate over the conditions with BYROW() and...
use QUERY() with CONTAINS to COUNT the matches for each condition,
use TRANSPOSE() to turn the match results of each row sideways,
turn the results into strings with JOIN(), which makes it possible to rearrange the rows and columns,
SPLIT() the data to create the array format we need,
use QUERY() to PIVOT the SUM of the COUNT results as our final output.
Another approach works on a slightly different concept:
=ArrayFormula(
LAMBDA(DATA,CAT,
LAMBDA(DATA,
LAMBDA(COLA,COLB,
LAMBDA(COLA,
LAMBDA(RESULT,
IF(RESULT="",0,RESULT)
)(TRANSPOSE(QUERY({COLA,COLB},"SELECT Col2,COUNT(Col2) GROUP BY Col2 PIVOT Col1 LABEL Col2'Category'",0)))
)(REGEXEXTRACT(COLA,JOIN("|",CAT)))
)(INDEX(DATA,,1),INDEX(DATA,,2))
)(ASC(DATA))
)($A$2:$B$7,{"category 1","category 2","category 3"})
)
We can modify the Category column of the input data with REGEXEXTRACT() before sending it into QUERY(), which in this case does make the formula look a bit cleaner.
Inspired by @The God of Biscuits's answer, we can now get rid of the CAT variable, which makes the formula more flexible and easier to adapt to your situation.
This REGEXEXTRACT() extracts the category value starting at the first 'category' match and ending at the end of the first number after it, allowing any amount of spacing between the two values.
=ArrayFormula(
LAMBDA(DATA,
LAMBDA(COLA,COLB,
LAMBDA(RESULT,
IF(RESULT="",0,RESULT)
)(TRANSPOSE(QUERY({COLA,COLB},"SELECT Col2,COUNT(Col2) WHERE Col2 IS NOT NULL GROUP BY Col2 PIVOT Col1 LABEL Col2'Category'",0)))
)(REGEXEXTRACT(LOWER(INDEX(DATA,,1)),"((?:category)(?: +?)(?:[0-9]|[0-9])+)"),INDEX(DATA,,2))
)($A$2:$B)
)
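As a quick way to check what that extraction pattern keeps and drops, here is a small Python sketch of the same idea. The names CATEGORY_RE and extract_category are hypothetical, the pattern is a simplified equivalent of the one in the formula, and the sample strings in the usage lines are made up.

```python
import re

# Simplified equivalent of the formula's pattern: the word "category",
# one or more spaces, then the first run of digits.
CATEGORY_RE = re.compile(r"category +?[0-9]+")


def extract_category(text):
    """Return the normalized category, or None when nothing matches."""
    match = CATEGORY_RE.search(text.lower())  # LOWER() before REGEXEXTRACT()
    return match.group(0) if match else None


print(extract_category("1.Category 1"))
print(extract_category("3.category 3 (old)"))
print(extract_category("misc"))
```

Note that the spacing between the word and the number is preserved in the result, so inputs that differ only in the number of spaces would end up in different groups.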
You can also use FILTER with COUNTA, like this:
=counta(filter(Sheet1!A:A,(Sheet1!A:A="category 1")+(Sheet1!A:A="1.category 1"),Sheet1!B:B="Team 1"))
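In that formula, adding the two conditions on column A acts as a logical OR, while the separate condition on column B is combined with AND. A minimal Python sketch of the same counting logic, with count_matches as a hypothetical helper name and the question's sample data as input:

```python
def count_matches(rows, category_names, team):
    """Count rows whose category is any of category_names AND whose team matches."""
    return sum(
        1
        for category, row_team in rows
        if category in category_names and row_team == team
    )


rows = [
    ("category 1", "Team 1"),
    ("1.category 1", "Team 1"),
    ("2.category 2", "Team 1"),
    ("category 2", "Team 1"),
    ("category 3", "Team 1"),
    ("3.category 3", "Team 1"),
]

print(count_matches(rows, {"category 1", "1.category 1"}, "Team 1"))
```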

Google Sheets: Consolidate Rows By Unique ID

I'm trying to consolidate crypto data from BscScan, by ID (a Hash in this case). As the downloaded data has many irrelevant columns, and actual hash data is very long, I've created a simple abbreviated data example of what I'm trying to do:
For each Hash, I need to
Sum the "Number" column
Count the Number of Records
Some pointers on how to do this would be appreciated.
Please try this QUERY:
=QUERY(A2:B14, "select A, sum(B), count(B)
where A <>'' group by A
label sum(B) 'Total', count(B) 'Record Count' ",1)
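The grouping that this QUERY performs can be sketched in Python as follows. The function name consolidate and the sample hashes and numbers are made up for illustration (the original data example is not reproduced here), and the sketch assumes every row with a hash also has a number.

```python
def consolidate(rows):
    """Return {hash: (total, record_count)}, skipping rows with an empty hash."""
    totals = {}
    for hash_id, number in rows:
        if hash_id == "":
            continue  # mirrors "where A <> ''"
        total, count = totals.get(hash_id, (0, 0))
        totals[hash_id] = (total + number, count + 1)
    return totals


# Hypothetical abbreviated data: (hash, number)
rows = [("0xabc", 5), ("0xdef", 2), ("0xabc", 3), ("", 9)]

for hash_id, (total, count) in sorted(consolidate(rows).items()):
    print(hash_id, total, count)
```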

Need to know with formula in Google sheet

I have dates in column A, names in column B, and product sale data in column C. I want a formula that returns, in another table, the sale data entry for each person who sold something on a particular date.
Do it with a pivot table so that you will have the answer without any formula.
Try query() with a pivot clause, like this:
=query(A1:C, "select A, B, count(B) where A is not null group by A, B pivot C", 1)
Maybe this can help too:
=Query('Sales '!A1:C, "Select Col1, Col2, Col3 Where Col3 is not null",0)
'Sales ' is the data source worksheet.
The last argument is the number of header rows: with 0, the first row of the source is treated as ordinary data; use 1 if you want it treated as a header row instead.

GS Query to split a column (Last, First) into 2 columns

I have a 'Raw Invoice' tab which is an excel file copy/pasted directly over. I'm then trying to format the data in the 'Invoices' tab in that column order using a query. I need to be able to break out the Student Name into two separate columns, hopefully within the query itself. Preferably it would then change it so column C is Last Name and column D is First name and the rest of columns shift over one.
I don't know if there's a way to perform a SPLIT function within the query. Right now I'm using a clunky method by doing a VLOOKUP on the student ID to get the names from another tab (not included in the Sample GS cuz it's an importrange from a work file), but it then creates two separate queries. Ideally I can somehow split column C within one query, but am getting lost by nesting queries and arrays together. I might be able to use REGEXEXTRACT, but again get lost in where to put it in the query or whether that's overkill.
QUERY('Raw Invoice'!$A:$I,"Select F,B,A, 'Bobs Diner',D,G,I where C is not null label 'Bobs Diner' 'Company' Format F 'M/DD/YYYY' ",1)
Link to sheet.
Using SPLIT inside the query string is not possible, but you can do:
=ARRAYFORMULA(QUERY({IFERROR(SPLIT('Raw Invoice'!A:A, ", ")), 'Raw Invoice'!B:I},
"select Col7,Col3,Col1,Col2,'Bobs Diner',Col5,Col8,Col10
where Col3 is not null
label 'Bobs Diner' 'Company'
format Col7 'M/DD/YYYY'", 1))
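The core step in that formula is turning one "Last, First" value into two columns before the query runs. A minimal Python sketch of that split, assuming names contain a single comma; split_name is a hypothetical helper and the sample names are made up:

```python
def split_name(full_name):
    """Split 'Last, First' into (last, first); first is '' when there is no comma."""
    last, _, first = full_name.partition(",")
    return last.strip(), first.strip()


print(split_name("Smith, John"))
print(split_name("Smith"))
```

The second case corresponds to what IFERROR guards against in the formula: a row that cannot be split still yields a value instead of an error.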
What I suggest is to add this to the 'Raw Invoice' tab:
=arrayformula(split(A3:A,","))
and then
=QUERY('Raw Invoice'!$A:$K,"Select F,B,J,K, 'Bobs Diner',D,G,I where C is not null label 'Bobs Diner' 'Company' Format F 'M/DD/YYYY' ",1)

Google Query SELECT statement concatenated with a NESTED IF result

Is it possible to return a Nested IF result from a CELL that will be concatenated to the SELECT statement in the QUERY function?
For example, I am trying to return the result for the following Nested IF function into the Query Function:
https://docs.google.com/spreadsheets/d/15i1E8AZHORRmPlu1VQqFRN1_7-aUyAz-hlYMOUtIlY4/edit?usp=sharing
Appreciate it, if anyone could take a look.
Regards
JVA
It's done like this:
=QUERY(TESTDATA!A1:D16, "SELECT A, D, SUM(C) WHERE 1=1 "&
IF(AND(M3="NAME",N3="Customer"), " GROUP BY A, D PIVOT B",
IF(N3 = "Customer",
" AND A = '"&M3&"' GROUP BY A, D PIVOT B",
" AND A = '"&M3&"'
AND D = '"&N3&"' GROUP BY A, D PIVOT B")), 1)
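The pattern here is to start from a WHERE clause that is always true and append extra conditions depending on the two selector cells. A rough Python sketch of that string building follows, under the assumption that "NAME" and "Customer" are the placeholder values meaning "no filter"; build_query and the sample values "Ann" and "Acme" are hypothetical.

```python
def build_query(m3, n3):
    """Build the query string from the two selector values (cells M3 and N3)."""
    base = "SELECT A, D, SUM(C) WHERE 1=1 "
    tail = " GROUP BY A, D PIVOT B"
    if m3 == "NAME" and n3 == "Customer":
        # Neither selector is set: no extra filter.
        return base + tail
    if n3 == "Customer":
        # Only the name is set: filter on column A.
        return base + f" AND A = '{m3}'" + tail
    # Both are set: filter on columns A and D.
    return base + f" AND A = '{m3}' AND D = '{n3}'" + tail


print(build_query("NAME", "Customer"))
print(build_query("Ann", "Customer"))
print(build_query("Ann", "Acme"))
```

Keeping the GROUP BY and PIVOT part in one place, as the tail variable does here, avoids repeating it in every branch.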
Sometimes, it's easier to FILTER the results before applying QUERY:
=ArrayFormula(QUERY(FILTER(A1:D16, A1:A16=M3, D1:D16=N3), "SELECT Col1, Col4, SUM(Col3) GROUP BY Col1, Col4 PIVOT Col2 LABEL Col1 'Name', Col4 'Customer'",0))
As you can see, this requires using Colx notation instead of letters to indicate columns in the SELECT clause; but this is actually (in my opinion) more versatile, since you don't have to rewrite the QUERY if you ever insert columns before the existing source data.
You'll also notice that I needed to LABEL the first two columns, since FILTER will have FILTERed out the headers. (In fact, for this reason, the ranges in the formula could just as easily have begun with row 2, e.g., A2:A16, etc.)
Finally, at least in your sample spreadsheet, you didn't need the sheet name to reference the source ranges, since the result is in the same sheet.

Resources