Google Sheets: Combine duplicate rows into single row keeping separate columns - google-sheets

Left side is data, right side is expected
How do I combine duplicate rows into one while keeping separate columns?

One way to do that is with query(), like this:
=query( A3:H, "select A, max(B), max(C), max(D), max(E), max(F), max(G), max(H) where A is not null group by A", 0 )

use flatten function with query like this
=UNIQUE({QUERY( FLATTEN(a1:H),"Select * Where Col1 is not null ")})
if you don't want unique just remove unique from this

Related

ImportRange Function to Ignore Empty Gaps between Sheet A and Sheet B

How can I combine 2 spreadsheets into one without any gaps? When I import range 2 sheets, there is a gap of ~1000 rows. To make sure there are no gaps between the 2 sheets, I usually create a query "Where Col1 is not null " but I am missing some info. :(
My spreadsheet: https://docs.google.com/spreadsheets/d/1aSnbySwNPEvkkXqw6ItuBpZ_6-o58HPlVicIHWD0Y4I/edit#gid=381064131
Thanks in advance.
You can use just being three columns:
=query({
importrange("https://docs.google.com/spreadsheets/d/1aSnbySwNPEvkkXqw6ItuBpZ_6-o58HPlVicIHWD0Y4I/edit#gid=0", "Sheet1!A2:C");
IMPORTRANGE("https://docs.google.com/spreadsheets/d/1aSnbySwNPEvkkXqw6ItuBpZ_6-o58HPlVicIHWD0Y4I/edit#gid=0", "Sheet2!A2:C")}, "Select * WHERE Col1 IS NOT NULL OR Col2 is not null OR Col3 is not null")
A second possible scenario for avoiding empty rows, no matter how many columns you have is wrapping your range or query in LAMBDA, and use FILTER associated with BYROW and COUNTA. If there are no elements in any row, then the count will be 0 and it will be filtered out:
=LAMBDA(quer,FILTER(quer, BYROW (quer, LAMBDA (each,COUNTA(each)))))({
importrange("https://docs.google.com/spreadsheets/d/1aSnbySwNPEvkkXqw6ItuBpZ_6-o58HPlVicIHWD0Y4I/edit#gid=0", "Sheet1!A2:C");
IMPORTRANGE("https://docs.google.com/spreadsheets/d/1aSnbySwNPEvkkXqw6ItuBpZ_6-o58HPlVicIHWD0Y4I/edit#gid=0", "Sheet2!A2:C")})
Both solutions return:

Aggregating rows with query in google sheets

I have a data set that looks something like this:
Column A
Column B
category 1
Team 1
1.category 1
Team 1
2.category 2
Team 1
category 2
Team 1
category 3
Team 1
3.category 3
Team 1
I am trying to use query function with a pivot statement to calculate the occurrence of each category for team 1 (I have several other teams in the data set, but for simplicity I just wrote out my example with team 1). Unfortunately the naming of the categories are not consistent in the original data, and I cannot change them.
So I need a way to combine the results of the sum of category 1 and 1.category1, and so on.
How could I handle rewrite this to get the type of result as listed below?
Category
Team 1
category 1
2
category 2
2
category 3
2
The formula I have now is as following:
query('sheet1!A:B,"Select A, count(B) where B='Team 1' group by A pivot B label B 'Team 1'",1)
If the category names all have a similar format to those in your example (with extraneous data only at the beginning, followed by 'category N', and you don't care if zero counts per category are left blank then a more compact approach then the previous answer is (for any number of teams/categories):
=arrayformula(query({regexextract(A2:A,"category.+"),B2:B},"select Col1,count(Col1) where Col2 is not null group by Col1 pivot Col2 label Col1 'Category'",0))
formula:
=ArrayFormula(
LAMBDA(DATA,CATEGORY,
LAMBDA(RESULT,
LAMBDA(RESULT,
IF(RESULT="",0,RESULT)
)(QUERY(SPLIT(TRANSPOSE(SPLIT(RESULT,"&")),"|"),"SELECT Col1,SUM(Col3) GROUP BY Col1 PIVOT Col2 LABEL Col1'Category'",0))
)(
JOIN("&",
BYROW(CATEGORY,LAMBDA(CAT,
JOIN("&",CAT&"|"&BYROW(TRANSPOSE(QUERY(DATA,"SELECT COUNT(Col1) WHERE lower(Col1) CONTAINS'"&CAT&"' PIVOT Col2",0)),LAMBDA(ROW,JOIN("|",ROW))))
))
)
)
)({ASC($A$2:$B$7)},{"category 1";"category 2";"category 3"})
)
use ASC() to format all numbers-like values into number,
use {} to create the match conditions,
iterate the conditions with BYROW() and...
use QUERY() with CONTAINS to COUNT matches of the given conditions,
use TRANSPOSE() to turn the match results of each row sideway,
change the results into string with JOIN(), this helps to modify the row and column arrangment,
SPLIT() the data to create the correct array format we can use,
use QUERY() to PIVOT the SUM of the COUNT result as our final output.
Another approch works in a slightly different concept:
=ArrayFormula(
LAMBDA(DATA,CAT,
LAMBDA(DATA,
LAMBDA(COLA,COLB,
LAMBDA(COLA,
LAMBDA(RESULT,
IF(RESULT="",0,RESULT)
)(TRANSPOSE(QUERY({COLA,COLB},"SELECT Col2,COUNT(Col2) GROUP BY Col2 PIVOT Col1 LABEL Col2'Category'",0)))
)(REGEXEXTRACT(COLA,JOIN("|",CAT)))
)(INDEX(DATA,,1),INDEX(DATA,,2))
)(ASC(DATA))
)($A$2:$B$7,{"category 1","category 2","category 3"})
)
We can modify the Category column of the input data with REGEXEXTRACT() before sending it into query, which in this case, do make the formula looks a bit cleaner.
Inspired by #The God of Biscuits 's answer, we can now get rid of the CAT variable, which makes the formula more elastic to fit into your condition.
This REGEXEXTRACT() will extract Category value from the 1st 'category' match found to the end of the 1st 'number' after it, with any spacing in between the two value.
=ArrayFormula(
LAMBDA(DATA,
LAMBDA(COLA,COLB,
LAMBDA(RESULT,
IF(RESULT="",0,RESULT)
)(TRANSPOSE(QUERY({COLA,COLB},"SELECT Col2,COUNT(Col2) WHERE Col2 IS NOT NULL GROUP BY Col2 PIVOT Col1 LABEL Col2'Category'",0)))
)(REGEXEXTRACT(LOWER(INDEX(DATA,,1)),"((?:category)(?: +?)(?:[0-9]|[0-9])+)"),INDEX(DATA,,2))
)($A$2:$B)
)
You can also use filter with a count a like this:
=counta(filter(Sheet1!A:A,(Sheet1!A:A="category 1")+(Sheet1!A:A="1.category 1"),Sheet1!B:B="Team 1"))

How can I separate a column into multiple columns based on values?

I have searched on a lot of pages but I cannot find a solution to my problem except in reverse order. I have simplified what I do, but I have a query that comes looking for information in my data sheet. Here there are 3 columns, the date, the amount and the source.
I would like, with a query function, to be able to make different columns which counts the information of column C based on the values of its cells per month, like this
I'm okay with the start of the formula
=QUERY(A2:C,"select month(A)+1, sum(B), count(C) where A is not null group by month(A)+1")
But as soon as I try a little different things by putting 2 query together in an arrayformula, obviously the row count doesn't match as some minus are 0 for some sources.
Do you have a solution for what I'm trying to do? Thank you in advance :)
Solution:
It's not possible in Google Query Language to have a single query statement that has one result grouped by one column and another result grouped by another.
The first two columns can be like this:
=QUERY(A2:C,"select month(A)+1, sum(B) where A is not null group by month(A)+1 label month(A)+1 'Month', sum(B) 'Amount'")
To create the column labels for the succeeding columns, use in the first row, in my example, I1:
=TRANSPOSE(UNIQUE(C2:C))
Then from cell I2, enter this:
=COUNTIFS(arrayformula(month($A$2:$A)),$G2,$C$2:$C,I$1)
Then drag horizontally and vertically to apply to the entire table.
Results:
try:
=INDEX({
QUERY({MONTH(A2:A), B2:C},
"select Col1,sum(Col2) where Col2 is not null group by Col1 label Col1'month',sum(Col2)'amount'"),
QUERY({MONTH(A2:A), B2:C, C2:C},
"select count(Col3) where Col2 is not null group by Col1 pivot Col4")})

Use Query to display two columns and group rows based on one column's count?

I'm using an array to bring 2 columns of data together from 3 sheets.
There are duplicates in the second column, and I would like to group those duplicates together and display both Col1 and Col2, ordered alphabetically by Col1.
This is the formula I have right now:
=QUERY({'Sheet1!'A:B;'Sheet2!'A:B;'Sheet3!'A:B}, "Select Col1, count(Col2) where Col1 is not null group by Col1",1)
Which only displays Col1.
I've tried nesting QUERY, but I can't get it to work and can't find any direction anywhere online.
Here's an example sheet I made to show what I'm trying to do:
https://docs.google.com/spreadsheets/d/1_x0mXZC0ZjsHDCd6I0dDf9OI19lrzEcPYqfcMxuK74Y/edit?usp=sharing
In the example if an employee is listed twice the name may change but the email is consistent. I'm hoping to group by the email addresses and return only one name (it doesn't really matter which name).
I'm not sure if this is possible without formulas in more than one cell. Thank you either way!
#confuseddesk, try this array formula:
=ArrayFormula(QUERY({VLOOKUP(UNIQUE({Sheet2!B2:B;Sheet3!B2:B;Sheet4!B2:B}),{Sheet2!B2:B,Sheet2!A2:A;Sheet3!B2:B,Sheet3!A2:A;Sheet4!B2:B,Sheet4!A2:A},2,FALSE),UNIQUE({Sheet2!B2:B;Sheet3!B2:B;Sheet4!B2:B})},"Select * Where Col2 Is Not Null"))

Calculate total of values in columns?

I have a Google Sheets document with four sheets. In each sheet, there are two columns: first is a name, and second a value.
I need to get the total of the values from the four sheets in a fifth sheet where the names from the four name-columns correspond.
The column with the values is called differently in most sheets.
How can I accomplish this task?
You can use Query + Arrays, like on image below:
Code:
=QUERY({Sheet1!A2:B;Sheet2!A2:B;Sheet3!A2:B;Sheet4!A2:B};
"select Col1,sum(Col2) where Col1 is not null group by Col1 label Col1 'Name',sum(Col2) 'Total'";0)
Explanation:
First: you combine 4 sheets using an Array. Ranges start from row 2 to omit different columns headers:
{Sheet1!A2:B;Sheet2!A2:B;Sheet3!A2:B;Sheet4!A2:B}
Second: Using Query to perform a total. In Query there are some extra operations:
Col1 is not null - for not showing empty rows in totals
label Col1 'Name',sum(Col2) 'Total' - for naming result columns
Syntax may be a little different depending on your local settings, so look at the working copy:
Link to working copy
Is that serves your needs?

Resources