Google spreadsheet query multiple columns with a count column - google-sheets

I have a table like this:
Animal ID eye color
-------------------------
Rabbit 90 blue
Rabbit 90 brown
Cat 91 blue
Cat 91 green
Squirrel 92 brown
What I have is
=QUERY(A2:C6;"select A,count(A) group by A")
which returns two columns. Now I want to add the ID column to it.
The desired outcome basically is:
Animal ID Count
----------------------
Cat 91 2
Rabbit 90 2
Squirrel 92 1
I realize I can do
=QUERY(A2:C6;"select A,B")
but I can't combine that with a count on A or a count on B.
Is there a reasonably simple way to do that?

If each animal always has the same ID, you can group by both columns:
=QUERY(A2:C6,"select A,B,count(A) group by A,B")
Otherwise, pick one ID value per group with an aggregate such as max, min or avg:
=QUERY(A2:C6,"select A,min(B),count(A) group by A")
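For readers who want to see what grouping by both columns does, here is a minimal Python sketch of the same idea. It is only an illustration of the grouping logic, not what QUERY runs internally; the variable names are made up for this example.

```python
from collections import Counter

# Sample rows from the question: (animal, id, eye_color)
rows = [
    ("Rabbit", 90, "blue"),
    ("Rabbit", 90, "brown"),
    ("Cat", 91, "blue"),
    ("Cat", 91, "green"),
    ("Squirrel", 92, "brown"),
]

# Group by the (animal, id) pair and count the rows in each group,
# mirroring "select A,B,count(A) group by A,B".
counts = Counter((animal, animal_id) for animal, animal_id, _ in rows)

# Sort by the grouping columns to match the order of the desired outcome.
result = [(animal, animal_id, n) for (animal, animal_id), n in sorted(counts.items())]

for line in result:
    print(*line)
```

Because every animal has exactly one ID in this data, grouping by the pair yields the same groups as grouping by the animal alone.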


How can I perform a SUM calculation for unique values in a particular column in Tableau?

I am trying to sum the autonomy of all unique car models in Germany.
Key | Car model  | Country | Color  | Autonomy (miles)
-----------------------------------------------------
1   | ID3        | Germany | Green  | 340
2   | Polestar 2 | Sweden  | Yellow | 335
3   | EQS        | Germany | Blue   | 450
3   | EQS        | Germany | Red    | 450
The answer should be: 340+450=790
450 should only be considered once because Key=3 is a unique identifier (Car model) even though the colour is different.
I tried doing that using INCLUDE/FIXED LOD expressions but I am doing something wrong.
Try the following LOD expression:
{FIXED [Autonomy] : MIN([Autonomy])}
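To make the intended arithmetic concrete, here is a small Python sketch of "count each car once, then sum". Note that it deduplicates on Key (the unique identifier named in the question), which is an assumption of this sketch and differs from the expression above, which fixes on [Autonomy]. It is not Tableau code.

```python
# Rows from the question: (key, car_model, country, color, autonomy_miles)
rows = [
    (1, "ID3", "Germany", "Green", 340),
    (2, "Polestar 2", "Sweden", "Yellow", 335),
    (3, "EQS", "Germany", "Blue", 450),
    (3, "EQS", "Germany", "Red", 450),
]

# Keep a single autonomy value per Key for German cars.
# min() plays the role of the MIN aggregate: it collapses duplicate rows.
autonomy_by_key = {}
for key, _model, country, _color, autonomy in rows:
    if country == "Germany":
        autonomy_by_key[key] = min(autonomy, autonomy_by_key.get(key, autonomy))

# Sum the one-per-key values: 340 + 450
total = sum(autonomy_by_key.values())
print(total)
```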

How to do a full outer join?

I am trying to do the full join for the data below in two different sheets.
Sheet 9:
Product ID | Name | Quantity
----------------------------
1          | addi | 55
2          | fadi | 66
3          | sadi | 33
Sheet10:
Product ID | Variants | Model
-----------------------------
1          | xyz      | 2000
2          | differ   | 2001
3          | saddd    | 336
4          | fsdfe    | 2005
Desired output sheet:
Product ID | Name | Quantity | Variants | Model
-----------------------------------------------
1          | addi | 55       | xyz      | 2000
2          | fadi | 66       | differ   | 2001
3          | sadi | 33       | saddd    | 336
4          |      |          | fsdfe    | 2005
Please also explain what I should change in your proposed solution if there are more columns to join, for example if the two sheets each have two more columns such as Year, Product label, etc.
I am using this formula, but it's not returning the desired result:
=ARRAYFORMULA({QUERY(SORT(UNIQUE({Sheet9!A1:D; Sheet10!A1:D})), "where Col1 is not null"),IFERROR(VLOOKUP(TRANSPOSE(QUERY(TRANSPOSE(QUERY(SORT(UNIQUE({Sheet9!A1:D; Sheet10!A1:D})), "where Col1 is not null")),,999^99)), TRANSPOSE(QUERY(TRANSPOSE(Sheet9!A1:D),,999^99)), Sheet9!C1:C}, 2, 0),""),IFERROR(VLOOKUP(TRANSPOSE(QUERY(TRANSPOSE(QUERY(SORT(UNIQUE({Sheet9!A1:D; Sheet10!A1:D})), "where Col1 is not null")),,999^99)), {TRANSPOSE(QUERY(TRANSPOSE(Sheet10!A1:D),,999^99)), Sheet10!C1:C}, 2, 0),"")}})
EDITED to consider dynamic row matching.
See this spreadsheet for an illustration. There is a broader question about your setup, but I would break the problem into two steps.
Get a distinct list of IDs
You can get that with this formula:
=unique(transpose(split(textjoin(",",true,
iferror(INdex(Sheet2!$A$2:$Z,0,MATCH(A1,Sheet2!1:1,0)),""),
iferror(INdex(Sheet1!$A$2:$Z,0,MATCH(A1,Sheet1!1:1,0)),"")),",")))
Rest of Headers
Then, for each header: will it always appear in only one of the two sheets (never both)? Assuming so, this should work for each additional column. If a value exists in both sheets for the same ID, the two values are concatenated in the same cell.
=filter(
iferror(VLOOKUP($A$2:$A,Sheet1!$A:$Z,match(E$1,Sheet1!1:1,0),false),"")
&iferror(VLOOKUP($A$2:$A,Sheet2!$A:$Z,match(E$1,Sheet2!1:1,0),false),"")
,$A$2:$A<>"")
There's probably a way to use the join function to do this more elegantly (if someone posts an answer showing me I'll upvote).
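The two steps above (collect the distinct IDs, then look each ID up in both sheets) amount to a full outer join. A minimal Python sketch of that logic, using the question's sample data and hypothetical variable names, could look like this:

```python
# Sheet 9 and Sheet10 from the question, keyed by Product ID.
sheet9 = {1: ("addi", 55), 2: ("fadi", 66), 3: ("sadi", 33)}  # Name, Quantity
sheet10 = {
    1: ("xyz", 2000),
    2: ("differ", 2001),
    3: ("saddd", 336),
    4: ("fsdfe", 2005),
}  # Variants, Model

# Step 1: distinct list of IDs found in either sheet.
all_ids = sorted(set(sheet9) | set(sheet10))

# Step 2: look each ID up in both sheets, using blanks where it is missing
# (the role IFERROR(..., "") plays in the formulas).
blank = ("", "")
joined = [
    (pid, *sheet9.get(pid, blank), *sheet10.get(pid, blank))
    for pid in all_ids
]

for row in joined:
    print(row)
```

Adding more columns only changes the tuples stored per ID and the width of `blank`; the join logic itself stays the same.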

Filter to the latest month and then filter to the best score per person

I've got a Google Sheet which holds the results of a monthly competition. The format is
Name | Date | Score
--------------------------------
Alan Smith | 14/01/2016 | 500
Bob Dow | 14/01/2016 | 450
Bob Dow | 16/01/2016 | 470
Clare Allie| 16/01/2016 | 550
Declan Ham | 16/01/2016 | 350
Alan Smith | 10/02/2016 | 490
Bob Dow | 10/02/2016 | 425
Declan Ham | 12/02/2016 | 400
Declan Ham | 12/02/2016 | 390
Clare Allie| 12/02/2016 | 560
I want to do 2 things with this data:
1) I want to create a new sheet which holds the latest 'best' results. For the data presented here that would be
Alan Smith | 10/02/2016 | 490
Bob Dow | 10/02/2016 | 425
Declan Ham | 12/02/2016 | 400
Clare Allie| 12/02/2016 | 560
i.e. the results from February with the 'best' score per person. Here Declan Ham's lower score of '390' was removed.
2) I want another sheet to hold the tournament ranking. People are ranked by their top 3 monthly scores, i.e. the best score for each person for each month is obtained and the top 3 scores are combined to give their place in the tournament.
So far I've attempted to use Google queries, vlookups, filters to get these new sheets. But, just focusing on 1), the best I've been able to achieve is
=FILTER(Results!$A:$B, MONTH(Results!$B:$B) = MONTH(MAX(Results!$B:$B)))
Which will get me the results from the latest month. But it does not remove duplicate entries per person.
Does anyone have a suggestion for how I can achieve these requirements? Feel like I'm treading water at the moment.
Rather than trying to remove duplicates, you need to identify the maximum score by each person; you can do that by grouping values by person, then aggregating using max(). Here's how that would look, for the month of February 2016:
=query(Results!A1:C,"select A,max(C) where todate(B) > date '2016-2-1' group by A")
Instead of using a fixed value for the start of the latest month, we can get the year and month using spreadsheet formulas, and concatenate our query with them:
=query(Results!A1:C,"select A,max(C) where todate(B) > date '"&year(max(Results!B2:B))&"-"&month(max(Results!B2:B))&"-1' group by A")
That addresses your first question.
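As a cross-check of the "filter to the latest month, then take max() per person" logic, here is a minimal Python sketch on the question's data. It assumes an on-or-after comparison against the first day of the latest month; the names are illustrative only.

```python
from datetime import date

# (name, date, score) rows from the question
results = [
    ("Alan Smith", date(2016, 1, 14), 500),
    ("Bob Dow", date(2016, 1, 14), 450),
    ("Bob Dow", date(2016, 1, 16), 470),
    ("Clare Allie", date(2016, 1, 16), 550),
    ("Declan Ham", date(2016, 1, 16), 350),
    ("Alan Smith", date(2016, 2, 10), 490),
    ("Bob Dow", date(2016, 2, 10), 425),
    ("Declan Ham", date(2016, 2, 12), 400),
    ("Declan Ham", date(2016, 2, 12), 390),
    ("Clare Allie", date(2016, 2, 12), 560),
]

# First day of the month that contains the most recent result.
month_start = max(d for _, d, _ in results).replace(day=1)

# Group by person and keep the maximum score within the latest month.
best = {}
for name, d, score in results:
    if d >= month_start:
        best[name] = max(score, best.get(name, score))

for name, score in best.items():
    print(name, score)
```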
Tournament ranking
Your second goal is too complex for a single spreadsheet formula, in my opinion. Here's a way to accomplish it with multiple formulas, though!
The X & Y axes are filled out by spreadsheet formulas. On the X axis (orange), we populate participants' names using this in cell A3:
=unique(Results!A2:A)
The Y axis consists of dates (green). These are the start dates of each unique month that there are scores for, calculated using the following formula in cell D2. This results in strings, e.g. 2016-01-1, and that format is specifically required for the later formulas to work.
=TRANSPOSE(SORT(UNIQUE(ARRAYFORMULA(TEXT(Results!B2:B13,"YYYY-MM-1")))))
Here's the formula for cell D3, which calculates the sum of the 3 highest scores recorded for the participant whose name appears in A3, for the month appearing in D2. (Copy and paste the formula across the full range of participants and months, and the references will adjust.)
=sum(query(Results!$A$1:$C,"select C where A='"&$A3&"' and todate(B) >= date '"&D$2&"' and todate(B) < date '"&IF(ISBLANK(E$2),TEXT(TODAY()+1,"yyyy-mm-dd"),E$2)&"' order by C desc limit 3 label C ''"))
Key points about that formula:
The query range needs to use absolute references so that it doesn't shift when copied to additional cells. However, it's still open-ended, to absorb additional rows of scores on the "Results" sheet.
Results!$A$1:$C
A WHERE clause is used to select rows from the Results sheet that are for the given participant (A='"&$A3&"') and fall within the month that heads the column (on or after D$2 and before the next month's start date in E$2). If there is no next month column, tomorrow's date is used as the upper bound.
...and todate(B) < date '"&IF(ISBLANK(E$2),TEXT(TODAY()+1,"yyyy-mm-dd"),E$2)&"'
The best 3 scores for the month are found by first sorting the above result descending, then limiting the result to 3 rows.
...order by C desc limit 3
Finally, the QUERY headers are suppressed by this little trick, so that we get a single number as the result:
...label C ''
Individual tournament totals appear in column C, with a range SUM across the row, e.g. for cell C3:
SUM(D3:3)
The corresponding ranking in column B is then:
RANK(C3,C$3:C)
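The grid, totals and ranks described above can be sketched in Python as follows. This mirrors the formulas as written (each month cell sums up to 3 of a participant's highest scores in that month, totals sum across months, and rank is 1 plus the number of strictly higher totals); it is an illustration with made-up names, not a replacement for the sheet.

```python
from collections import defaultdict
from datetime import date

# (name, date, score) rows from the question
results = [
    ("Alan Smith", date(2016, 1, 14), 500),
    ("Bob Dow", date(2016, 1, 14), 450),
    ("Bob Dow", date(2016, 1, 16), 470),
    ("Clare Allie", date(2016, 1, 16), 550),
    ("Declan Ham", date(2016, 1, 16), 350),
    ("Alan Smith", date(2016, 2, 10), 490),
    ("Bob Dow", date(2016, 2, 10), 425),
    ("Declan Ham", date(2016, 2, 12), 400),
    ("Declan Ham", date(2016, 2, 12), 390),
    ("Clare Allie", date(2016, 2, 12), 560),
]

# Collect scores per participant and per (year, month).
scores = defaultdict(lambda: defaultdict(list))
for name, d, score in results:
    scores[name][(d.year, d.month)].append(score)

# One grid cell per participant and month:
# "order by C desc limit 3", then SUM.
grid = {
    name: {month: sum(sorted(vals, reverse=True)[:3]) for month, vals in months.items()}
    for name, months in scores.items()
}

# Column C: total across the row.
totals = {name: sum(cells.values()) for name, cells in grid.items()}

# Column B: rank by total, highest first; ties share a rank.
ranks = {
    name: 1 + sum(other > total for other in totals.values())
    for name, total in totals.items()
}

for name in sorted(ranks, key=ranks.get):
    print(ranks[name], name, totals[name])
```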
Tidy
For simpler copy/paste, you can do some error checking in these formulas, so that they can be placed in the sheet before the corresponding data is - for example, at the start of your season. Using IF(ISBLANK(... or IFERROR(... can be very effective for this.
B3 & down:
=IFERROR(RANK(C3,C$3:C))
C3 & down:
=IF(ISBLANK(A3),"",sum(D3:3))
D3 & rest of field:
=IFERROR(sum(query(Results!$A$1:$C,"select C where A='"&$A3&"' and todate(B) >= date '"&D$2&"' and todate(B) < date '"&IF(ISBLANK(E$2),TEXT(TODAY()+1,"yyyy-mm-dd"),E$2)&"' order by C desc limit 3 label C ''")))
Alternatively, for the first part of your question (the latest 'best' results), in addition to the solution provided by Mogsdad, this should also work:
=ArrayFormula(iferror(vlookup(unique(A2:A), sort(A2:C, 2, 0, 3, 0), {1,3}, 0)))
EDIT: This formula sorts the table with dates (col B) descending and col C descending and then (ab)uses the fact that vlookup only returns the first match to return the first and last column.
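The sort-then-first-match trick can be illustrated in Python. This sketch sorts by date descending, then score descending, and keeps the first row seen for each name, which is what the VLOOKUP does on the sorted table. Note that it returns each person's most recent entry, whichever month that falls in.

```python
from datetime import date

# (name, date, score) rows from the question
results = [
    ("Alan Smith", date(2016, 1, 14), 500),
    ("Bob Dow", date(2016, 1, 14), 450),
    ("Bob Dow", date(2016, 1, 16), 470),
    ("Clare Allie", date(2016, 1, 16), 550),
    ("Declan Ham", date(2016, 1, 16), 350),
    ("Alan Smith", date(2016, 2, 10), 490),
    ("Bob Dow", date(2016, 2, 10), 425),
    ("Declan Ham", date(2016, 2, 12), 400),
    ("Declan Ham", date(2016, 2, 12), 390),
    ("Clare Allie", date(2016, 2, 12), 560),
]

# Sort by date descending, then score descending
# (the equivalent of sort(A2:C, 2, 0, 3, 0)).
ordered = sorted(results, key=lambda r: (r[1], r[2]), reverse=True)

# Keep only the first match per name, as VLOOKUP does.
first_match = {}
for name, _d, score in ordered:
    first_match.setdefault(name, score)

for name, score in first_match.items():
    print(name, score)
```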

Google Sheet Query - Group / concatenate multiple rows

I'm running a QUERY with a SUM and GROUP BY, but I'd like to aggregate multiple distinct values from the rows into a single row and column. I'm looking to concatenate all those values together.
Current Table:
Person | Widget | Count
-----------------------
Bill   | Red    | 12
Bill   | Blue   | 9
Sarah  | Yellow | 4
Bill   | Yellow | 1
Sarah  | Orange | 10
Expected Table:
Person | Widget            | Count
----------------------------------
Bill   | Red, Blue, Yellow | 22
Sarah  | Yellow, Orange    | 14
You can use the FILTER and JOIN functions to help.
To get a unique list of names (the formulas below assume this list starts in cell E3):
=UNIQUE(A3:A)
To join the widgets:
=join(",",filter(B:B,A:A=E3))
To sum the values:
=sum(filter(C:C,A:A=E3))
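The same three steps (unique names, joined widgets, summed counts) can be sketched in Python to show the expected result. This is only an illustration of the logic; the names used are hypothetical.

```python
# (person, widget, count) rows from the question
rows = [
    ("Bill", "Red", 12),
    ("Bill", "Blue", 9),
    ("Sarah", "Yellow", 4),
    ("Bill", "Yellow", 1),
    ("Sarah", "Orange", 10),
]

widgets = {}  # person -> list of widgets, in order of appearance
totals = {}   # person -> summed count
for person, widget, count in rows:
    widgets.setdefault(person, []).append(widget)
    totals[person] = totals.get(person, 0) + count

# One output row per unique person: joined widgets and total count.
table = [(person, ", ".join(widgets[person]), totals[person]) for person in widgets]

for row in table:
    print(row)
```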

Google Spreadsheet SUMIFs equivalent

I need a SUMIFS equivalent in Google Spreadsheet. It only has SUMIF, no SUMIFS.
here is my data:
# Salesman Term (Month) Amount
1 Bob 1 1,717.09
2 John 1 634.67
3 Bob 1 50.00
4 Bob 1 1,336.66
5 Bob 1 0.00
6 Bob 1 55.00
7 Bob 300 23,803.97
8 Bob 300 24,483.91
9 Bob 300 20,010.03
10 Bob 300 41,191.62
11 Bob 300 40,493.14
12 Bob 300 10,014.01
13 John 1 100.00
13 John 100 100.00
I want to add up everything that Bob sold where the term is less than or equal to 100. I also want to sum everything that Bob sold where the term is greater than 100. Same for John.
You need to use the FILTER function combined with the SUM function. In your example with Bob, the formula would look like this (assuming your data is in columns A to C):
=SUM(FILTER(C:C;A:A="Bob";B:B<=100))
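To check the numbers the FILTER plus SUM approach should produce, here is a minimal Python sketch on the question's data. The helper name `sum_if` is made up for this example, and amounts are parsed as decimals to avoid floating-point rounding.

```python
from decimal import Decimal

# (salesman, term_months, amount) rows from the question
sales = [
    ("Bob", 1, "1717.09"),
    ("John", 1, "634.67"),
    ("Bob", 1, "50.00"),
    ("Bob", 1, "1336.66"),
    ("Bob", 1, "0.00"),
    ("Bob", 1, "55.00"),
    ("Bob", 300, "23803.97"),
    ("Bob", 300, "24483.91"),
    ("Bob", 300, "20010.03"),
    ("Bob", 300, "41191.62"),
    ("Bob", 300, "40493.14"),
    ("Bob", 300, "10014.01"),
    ("John", 1, "100.00"),
    ("John", 100, "100.00"),
]


def sum_if(salesman, term_matches):
    """Sum amounts for one salesman where term_matches(term) is true."""
    return sum(
        (Decimal(amount) for name, term, amount in sales
         if name == salesman and term_matches(term)),
        Decimal("0"),
    )


for name in ("Bob", "John"):
    short = sum_if(name, lambda term: term <= 100)
    long_ = sum_if(name, lambda term: term > 100)
    print(name, short, long_)
```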
The new Google Sheets now has the SUMIFS function available.
You can use the following formula:
=SUMIFS(F1:F14,D1:D14, "=Bob",F1:F14, ">=100",E1:E14, "<100")
If you want to perform the calculation for John as well, I can recommend the following:
=QUERY(D1:F14, "SELECT D, SUM(F) WHERE E<100 AND F>=100 GROUP BY D LABEL D 'Who', SUM(F) 'Total'")
See example file I created: SUMIFS on DATA
