How to Query function for multiple counts - google-sheets

I have a Google sheet being used as a property index which has a list of property with its summary details such as location, type, number of bedrooms etc.
I'm trying to count the number of how many 1 bedroom, 2 bedroom, 3 bedroom properties etc there are in each location.
However, I'm not sure how to do multiple counts, for example where column W = 1 and (next column) where W = 2 etc
=QUERY(Property_Location_Type,"select T, count(T) where W='1' Group by T",1)

you could try something alike:
formula in cell Y1 here:
=ARRAYFORMULA(QUERY({T2:T,TO_TEXT(IF(W2:W>2,">2",W2:W))&{"",""}},"Select Col1, Count(Col2) Where Col1!='' group by Col1 PIVOT Col3 label Col1 'Town/City'"))
-

Related

Aggregating rows with query in google sheets

I have a data set that looks something like this:
Column A
Column B
category 1
Team 1
1.category 1
Team 1
2.category 2
Team 1
category 2
Team 1
category 3
Team 1
3.category 3
Team 1
I am trying to use query function with a pivot statement to calculate the occurrence of each category for team 1 (I have several other teams in the data set, but for simplicity I just wrote out my example with team 1). Unfortunately the naming of the categories are not consistent in the original data, and I cannot change them.
So I need a way to combine the results of the sum of category 1 and 1.category1, and so on.
How could I handle rewrite this to get the type of result as listed below?
Category
Team 1
category 1
2
category 2
2
category 3
2
The formula I have now is as following:
query('sheet1!A:B,"Select A, count(B) where B='Team 1' group by A pivot B label B 'Team 1'",1)
If the category names all have a similar format to those in your example (with extraneous data only at the beginning, followed by 'category N', and you don't care if zero counts per category are left blank then a more compact approach then the previous answer is (for any number of teams/categories):
=arrayformula(query({regexextract(A2:A,"category.+"),B2:B},"select Col1,count(Col1) where Col2 is not null group by Col1 pivot Col2 label Col1 'Category'",0))
formula:
=ArrayFormula(
LAMBDA(DATA,CATEGORY,
LAMBDA(RESULT,
LAMBDA(RESULT,
IF(RESULT="",0,RESULT)
)(QUERY(SPLIT(TRANSPOSE(SPLIT(RESULT,"&")),"|"),"SELECT Col1,SUM(Col3) GROUP BY Col1 PIVOT Col2 LABEL Col1'Category'",0))
)(
JOIN("&",
BYROW(CATEGORY,LAMBDA(CAT,
JOIN("&",CAT&"|"&BYROW(TRANSPOSE(QUERY(DATA,"SELECT COUNT(Col1) WHERE lower(Col1) CONTAINS'"&CAT&"' PIVOT Col2",0)),LAMBDA(ROW,JOIN("|",ROW))))
))
)
)
)({ASC($A$2:$B$7)},{"category 1";"category 2";"category 3"})
)
use ASC() to format all numbers-like values into number,
use {} to create the match conditions,
iterate the conditions with BYROW() and...
use QUERY() with CONTAINS to COUNT matches of the given conditions,
use TRANSPOSE() to turn the match results of each row sideway,
change the results into string with JOIN(), this helps to modify the row and column arrangment,
SPLIT() the data to create the correct array format we can use,
use QUERY() to PIVOT the SUM of the COUNT result as our final output.
Another approch works in a slightly different concept:
=ArrayFormula(
LAMBDA(DATA,CAT,
LAMBDA(DATA,
LAMBDA(COLA,COLB,
LAMBDA(COLA,
LAMBDA(RESULT,
IF(RESULT="",0,RESULT)
)(TRANSPOSE(QUERY({COLA,COLB},"SELECT Col2,COUNT(Col2) GROUP BY Col2 PIVOT Col1 LABEL Col2'Category'",0)))
)(REGEXEXTRACT(COLA,JOIN("|",CAT)))
)(INDEX(DATA,,1),INDEX(DATA,,2))
)(ASC(DATA))
)($A$2:$B$7,{"category 1","category 2","category 3"})
)
We can modify the Category column of the input data with REGEXEXTRACT() before sending it into query, which in this case, do make the formula looks a bit cleaner.
Inspired by #The God of Biscuits 's answer, we can now get rid of the CAT variable, which makes the formula more elastic to fit into your condition.
This REGEXEXTRACT() will extract Category value from the 1st 'category' match found to the end of the 1st 'number' after it, with any spacing in between the two value.
=ArrayFormula(
LAMBDA(DATA,
LAMBDA(COLA,COLB,
LAMBDA(RESULT,
IF(RESULT="",0,RESULT)
)(TRANSPOSE(QUERY({COLA,COLB},"SELECT Col2,COUNT(Col2) WHERE Col2 IS NOT NULL GROUP BY Col2 PIVOT Col1 LABEL Col2'Category'",0)))
)(REGEXEXTRACT(LOWER(INDEX(DATA,,1)),"((?:category)(?: +?)(?:[0-9]|[0-9])+)"),INDEX(DATA,,2))
)($A$2:$B)
)
You can also use filter with a count a like this:
=counta(filter(Sheet1!A:A,(Sheet1!A:A="category 1")+(Sheet1!A:A="1.category 1"),Sheet1!B:B="Team 1"))

How to get unique values in a column, including cells with multiple values seperated by commas, in Google Sheet?

I have a column in a Google Sheet, which in some cases, includes multiple values separated by commas — like this:
Value
A example
B example
C example
D example
A example, E example
A example, F example
G example, D example, C example
I would like to count all occurrences of the unique values in this column, so the count should look like:
Unique value
Occurrences
A example
3
B example
1
C example
2
D example
2
E example
1
F example
1
G example
1
Currently, however, when I use =UNIQUE(A2:A), the result gives this:
Unique value
Occurrences
A example
1
B example
1
C example
1
D example
1
A example, E example
1
A example, F example
1
G example, D example, C example
1
Is there a way I can count all of the instances of letters, whether they appear in individually in a cell or appear alongside other letters in a cell (comma-seperated)?
(This looks like a useful answer in Python, but I'm trying to do this in Google Sheets)
try:
Formula in C1:
=INDEX(QUERY(IFERROR(FLATTEN(SPLIT(A1:A,", ")),""),"Select Col1, count(Col1) where Col1 is not null group by Col1 label count(Col1) ''"))
Or, as per the comments, split on the combination instead:
=INDEX(QUERY(IFERROR(FLATTEN(SPLIT(A1:A,", ",0)),""),"Select Col1, count(Col1) where Col1 is not null group by Col1 label count(Col1) ''"))
2nd EDIT: To order descending by count use:
=INDEX(QUERY(IFERROR(FLATTEN(SPLIT(A1:A,", ",0)),""),"Select Col1, count(Col1) where Col1 is not null group by Col1 Order By count(Col1) desc label count(Col1) ''"))
Assuming data in A1:A7:
In C1:
=SORT(UNIQUE(FLATTEN(ARRAYFORMULA(SPLIT(A1:A7,", ")))))
In D1:
=ARRAYFORMULA(MMULT(0+ISNUMBER(SEARCH(", "&ColumnCSpilledRange&", ",", "&TRANSPOSE(A1:A7)&", ")),ROW(A1:A7)^0))
Replace ColumnCSpilledRange appropriately.

How can I avoid having to put 0s into the NULL fields to get a correct query calculation in Google Sheets

I have a Google Sheets question, which I have not been able to figure out yet with Google-Fu and RTFM:
Take the following spreadsheet as an example:
https://docs.google.com/spreadsheets/d/1IvMVaUdUDfYOoKyG0Uwd2n0M1mLjOTE5yZQ9K2R3q2M/edit?usp=sharing
In case the sheet gets lost in time, I am going to post its contents here:
Sheet1:
foo
withdrawal
deposit
C
4
10
D
10
E
10
4
As you see here, the withdrawal field for the D value being foo is empty, i.e. null
Sheet2:
foo
balance
C
=INDEX(QUERY({Sheet1!$A$2:C}, "SELECT SUM(Col3) - SUM(Col2) WHERE Col1 = '"&A2&"'"), 2)
D
=INDEX(QUERY({Sheet1!$A$2:C}, "SELECT SUM(Col3) - SUM(Col2) WHERE Col1 = '"&A3&"'"), 2)
E
=INDEX(QUERY({Sheet1!$A$2:C}, "SELECT SUM(Col3) - SUM(Col2) WHERE Col1 = '"&A4&"'"), 2)
The result is
foo
balance
C
6
D
E
-6
As you see, the balance field for the category D is null, although it should be -10.
The fix for that is to put a 0 into the deposit field in Sheet1 explicitly.
In my example, I get that data using a csv-export, and fields are generally empty and not 0, and it is cumbersome to add the 0 there. Is there a way to have something like COALESCE in that sum there (like in SQL)?
Please let me know.
it seems like something quite a bit simpler would avoid the problem:
=SUMPRODUCT(Sheet1!C:C-Sheet1!B:B,Sheet1!A:A=A2)
for cell B2.
Why don't you just add this in cell A1 of Sheet2 instead of all the Query:
=arrayformula({Sheet1!A1,"balance";if(Sheet1!A2:A<>"",{Sheet1!A2:A,Sheet1!C2:C-Sheet1!B2:B},)})
Obviously ensure cells Sheet2!A2:A and Sheet2!B1:B are empty.
If you have duplicate values of foo, try:
=arrayformula(query({Sheet1!A1,"balance";if(Sheet1!A2:A<>"",{Sheet1!A2:A,Sheet1!C2:C-Sheet1!B2:B},)},"select Col1,sum(Col2) where Col1 is not null group by Col1 label sum(Col2) 'balance'",1))
A better option for a single-cell formula, referencing multiple sheets would be:
=arrayformula(query(
{Sheet1!A:A,n(Sheet1!B:C);Sheet2!A2:A,n(Sheet2!B2:C);Sheet3!A2:A,n(Sheet3!B2:C)},
"select Col1,sum(Col3)-sum(Col2) where Col1 is not null group by Col1 label sum(Col3)-sum(Col2) 'balance' ",1))

How do I find the most common value in more than one column?

When given multiple columns or rows, is there a formula that would return the most common value in the range? Example:
The formula would return Bird in the example above given columns A and B because Bird shows up 4 times, more than any other value.
try:
=INDEX(QUERY({A:A; B:B},
"select Col1,count(Col1)
group by Col1
order by count(Col1) desc"), 2, 1)

Perform a lookup to find out values from another column and summarises them using Google Sheets

Example data below.
I want to be able to sum the values in Col2 for each occurrence of Col 1, depending on the values in 'Other Cols' that are applied in combination with the value in Col1
Col1------Col2-----Other Cols
A---------40-------other data
A-------------------other data
A-------------------other data
B---------30-------other data
B-------------------other data
C-------------------other data
C-------------------other data
C---------90-------other data
For example, the values in 'other data' might mean the value where Col1 = B is not to included, so the correct outcome is 130 (40+90)
If possible I want to be able to achieve the above in a single Query.
In the real-life data there are over 2,000 rows of data and roughly 200 different values for Col1 (growing in size on a daily basis!!)
What I've been able to do myself!
1) I've created a Query that outputs a row for each valid occurrence of Col1 according to the selection criteria applied to 'Other Data', i.e.
A
C
2) Logically what I want to do next, but I can't do it because I don't know how to, is look back into the original data to find out the Col B values for the values for A and C (i.e. 40 and 90)
3) Then after that, I want to be able to sum the values identified (i.e. 40 + 90), so that in one single Query/cell the answer 130 is returned!!!
Being able to achieve step (2) would be very useful???
Doing (2) + (3) would be perfect!!!
(Note, the value for Col2 is unique to each set of values for Col1 )
=SUM(IFERROR(QUERY(A2:C,
"select sum(B)
where C = 'yes'
group by A
label sum(B)''", 0)))
=ARRAYFORMULA(SUM(QUERY(UNIQUE({A2:A,
IF(C2:C="",, VLOOKUP(ROW(B2:B), IF(
QUERY(A2:C, "select B order by A desc,B desc", 0)<>"", {ROW(B2:B),
QUERY(A2:C, "select B order by A desc,B desc", 0)}), 2)), C2:C}),
"select sum(Col2) where Col3='yes' group by Col1 label sum(Col2)''", 0)))

Resources