I have a table like the one in the picture below. Using a query, I would like
to get the average of every column of the table,
then sort those averages and keep only the 3 largest.
Below is what I tried.
The following query only gives me the average of the first column:
=QUERY({17:22};"SELECT AVG(Col1) WHERE Col1<>0";1)
For sorting, I guess I will have to use ORDER BY ... DESC, and for limiting the results, LIMIT 3. Is that right?
Thanks for your help
To my knowledge, a single QUERY cannot average multiple columns while excluding zeros from each one, because the WHERE clause filters whole rows rather than individual cells. That means you cannot order those results and limit them to 3 within the same query either.
However, if you are willing to accept zeros, this query would work:
=TRANSPOSE(QUERY(TRANSPOSE(QUERY({A17:I},"SELECT "&ARRAYFORMULA("AVG(Col"&JOIN(",AVG(Col",SEQUENCE(COLUMNS(A:I),1)&")"))&"")),"SELECT * ORDER BY Col2 DESC LIMIT 3"))
UPDATE:
A super messy solution is to build the arrays by hand. It is not very dynamic and needs updating whenever more days are added:
=TRANSPOSE(QUERY({TRANSPOSE(A17:I17),{AVERAGEIF(A18:A,">0");AVERAGEIF(B18:B,">0");AVERAGEIF(C18:C,">0");AVERAGEIF(D18:D,">0");AVERAGEIF(E18:E,">0");AVERAGEIF(F18:F,">0");AVERAGEIF(G18:G,">0");AVERAGEIF(H18:H,">0");AVERAGEIF(I18:I,">0")}},"SELECT * ORDER BY Col2 DESC LIMIT 3"))
use:
=TRANSPOSE(SORTN(TRANSPOSE({17:17; BYCOL(18:22;
LAMBDA(x; IFERROR(AVERAGE(x))))}); 3; 1; 2; 0))
update:
=ARRAYFORMULA(TRANSPOSE(QUERY(TRANSPOSE(QUERY({17:22},
"select "&TEXTJOIN(",", 1, IF(17:17<>"",
"avg(Col"&SEQUENCE(1, COLUMNS(17:17))&")", ))&
" where "&TEXTJOIN(" or ", 1, IF(17:17<>"",
"Col"&SEQUENCE(1, COLUMNS(17:17))&"<>0", )), 1)),
"order by Col2 desc limit 3", 0)))
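For readers who want to check the expected result outside of Sheets, here is a minimal Python sketch of the logic the formulas above are aiming for: average each column over its non-zero cells, sort the averages in descending order, and keep the top 3. The header names and numbers are made up for illustration and are not taken from the asker's sheet.

```python
# Hypothetical sample standing in for rows 17:22: a header row plus data rows.
headers = ["Mon", "Tue", "Wed", "Thu"]
rows = [
    [4, 0, 9, 1],
    [6, 2, 0, 3],
    [0, 4, 3, 2],
]

def top_nonzero_averages(headers, rows, n=3):
    """Average each column over its non-zero cells, then keep the n largest."""
    averages = []
    for name, column in zip(headers, zip(*rows)):
        nonzero = [v for v in column if v != 0]
        if nonzero:  # skip columns that have no usable values
            averages.append((name, sum(nonzero) / len(nonzero)))
    # Sort by average, largest first, and cut to n entries.
    return sorted(averages, key=lambda pair: pair[1], reverse=True)[:n]

result = top_nonzero_averages(headers, rows)
print(result)
```

Note that the zeros are dropped per column, not per row, which is the part a row-based WHERE clause cannot express directly.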
Related
I took a Udemy class on Google Sheets and I'm trying a formula on a dataset similar to the one I learned on, but I'm getting a #VALUE! error. Please help.
The query is:
=QUERY(IMPORTRANGE("https://docs.google.com/spreadsheets/d/1uJAwHzcg_MYBS08jqnWGcqe7oBIlPyVWCQN48G6tFfE/edit#gid=790495475","sephora_website_dataset2!$A:$G"),"SELECT Col2,Average(Col7) WHERE Col2 IS NOT NULL GROUP BY Col2 ORDER BY Average(Col7) desc LIMIT 10",-1)
Here is what the dataset looks like: https://i.stack.imgur.com/Qm6qx.png
QUERY doesn't recognize Average; use avg instead.
=QUERY(IMPORTRANGE("https://docs.google.com/spreadsheets/d/1uJAwHzcg_MYBS08jqnWGcqe7oBIlPyVWCQN48G6tFfE/edit#gid=790495475","sephora_website_dataset2!$A:$G"),"SELECT Col2,avg(Col7) WHERE Col2 IS NOT NULL GROUP BY Col2 ORDER BY avg(Col7) desc LIMIT 10",-1)
The name of the averaging function in the query language is avg, not "Average".
So the correct formula would be:
=QUERY(IMPORTRANGE("https://docs.google.com/spreadsheets/d/1uJAwHzcg_MYBS08jqnWGcqe7oBIlPyVWCQN48G6tFfE/edit#gid=790495475","sephora_website_dataset2!$A:$G"),"SELECT Col2, avg(Col7) WHERE Col2 IS NOT NULL GROUP BY Col2 ORDER BY avg(Col7) desc LIMIT 10", -1)
use:
=QUERY(IMPORTRANGE("1uJAwHzcg_MYBS08jqnWGcqe7oBIlPyVWCQN48G6tFfE", "sephora_website_dataset2!A:G"),
"select Col2,avg(Col7)
where Col2 is not null
group by Col2
order by avg(Col7) desc
limit 10
label avg(Col7)''", 0)
I need to return a two-column table from a query, where the first column shows the position number followed by the full name. In MySQL terms it would be an auto-increment, but I cannot get it to work. I'm using
=arrayformula(QUERY({G4:G18, arrayformula(row(G4:G18)) & ". " & H4:H18&" "&I4:I18, J4:J18}, "SELECT Col2, Col3 WHERE Col1 = 'Yes' ORDER BY Col3 ASC LABEL Col2 '', Col3 ''"))
I realize that row(G4:G18) just returns the sheet row number, but I've tried everything else I can think of and can't get it to work. Any help is greatly appreciated. Note: I want to keep this in QUERY form rather than FILTER, for various reasons. Thanks.
Sample sheet to see in action
I have come up with a solution, but it needs 2 queries: one that returns the full name with incrementing numbers, and another that returns the Person ID. Please see the screenshots below:
1st query(for Full names with incrementing numbers):
=INDEX(arrayformula(ifna(arrayformula(row(G4:G18)-3) & ". " & QUERY({G4:G18, H4:H18&" "&I4:I18, J4:J18}, "SELECT Col2, Col3 WHERE Col1 = 'Yes' ORDER BY Col3 ASC LABEL Col2 '', Col3 ''"), "")), 0, 1)
2nd query(for the Person ID)
=INDEX(arrayformula(QUERY({G4:G18, H4:H18&" "&I4:I18, J4:J18}, "SELECT Col2, Col3 WHERE Col1 = 'Yes' ORDER BY Col3 ASC LABEL Col2 '', Col3 ''")), 0, 2)
They are basically the same query, but I split them into two in order to concatenate an incrementing value to the full names. I tried doing it with only 1 query, but then the incrementing value also shows up in the Person ID column (e.g. 1. 2, 2. 3, 3. 5). Please let me know if this solution solves your problem.
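The underlying idea is that the numbering has to be applied after filtering and sorting, and only to the name column. A minimal Python sketch of that order of operations, using made-up rows laid out like columns G:J (flag, first name, last name, Person ID):

```python
# Hypothetical rows: (flag, first name, last name, person id).
people = [
    ("Yes", "Ann", "Lee", 5),
    ("No", "Bob", "Ray", 1),
    ("Yes", "Cy", "Day", 2),
    ("Yes", "Di", "Fox", 3),
]

def numbered_names(rows):
    """Filter on the flag, sort by id, then number the surviving rows from 1."""
    kept = sorted((r for r in rows if r[0] == "Yes"), key=lambda r: r[3])
    # The counter is attached to the name only, so the id column stays clean.
    return [
        (f"{i}. {first} {last}", pid)
        for i, (_, first, last, pid) in enumerate(kept, start=1)
    ]

table = numbered_names(people)
for row in table:
    print(row)
```

Numbering before the filter (as row(G4:G18) does) would leave gaps wherever a "No" row was dropped.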
I would like to ask if there is any way to limit the number of values returned by the UNIQUE() function in Google Sheets.
For example, based on the below screenshot, I would do
UNIQUE(A2:A)
That will return me all year_quarter values in ascending order.
Assuming that column A will contain many quarter values from 2016 to 2020, is there a function, or a way to use a query, to return the earliest 4 quarters of column A?
I also tried combining it with QUERY() by putting a LIMIT in the SELECT statement, but I couldn't get that logic to work.
If there is a way to flexibly make another selection of all values EXCEPT the first 4 quarters, that would be great.
Appreciate any tips on how to go about doing this!
It looks like your logic actually should work.
To show only (up to) the first 4 results:
=QUERY(SORT(UNIQUE(A2:A)),"limit 4")
And to exclude the first 4 results:
=QUERY(SORT(UNIQUE(A2:A)),"offset 4")
Reference:
QUERY
SORT
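As a cross-check of what "limit 4" and "offset 4" do to a sorted list of unique values, here is the same split in plain Python. The quarter labels are invented sample data, and the sketch assumes the labels sort correctly as text.

```python
# Hypothetical year_quarter values, unsorted and with a duplicate.
quarters = ["2016_Q3", "2016_Q1", "2016_Q2", "2016_Q1",
            "2017_Q1", "2016_Q4", "2017_Q2"]

# Equivalent of SORT(UNIQUE(A2:A)): drop duplicates, sort ascending.
ordered = sorted(set(quarters))

first_four = ordered[:4]  # like "limit 4": the earliest 4 values
the_rest = ordered[4:]    # like "offset 4": everything after the first 4

print(first_four)
print(the_rest)
```

The two slices never overlap and together cover every unique value, which is the behavior the question asks for.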
Answer 1:
Assuming that column A will contain many quarter values from 2016 to
2020, is there a function, or to use a query, to return the earliest
first 4 quarters of column A?
=query(unique(A2:A),"Select Col1 where Col1 is not Null Order By Col1 desc limit 4")
or if you want to sort them back:
=sort(query(unique(A2:A),"Select Col1 where Col1 is not Null Order By Col1 desc limit 4"))
Answer 2:
If there is a way to flexibly do another selection of all values
EXCEPT the first 4 quarters, that would be great.
=query(unique(A2:A),"Select Col1 where Col1 is not Null Order By Col1 limit " & counta(unique(A2:A))-4)
or if you don't want unique values remove the unique part:
=query(A2:A,"Select A where A is not Null Order By A limit " & counta(A2:A)-4)
You can also use SORTN, which is simpler and less resource-intensive:
=SORTN(B2:B,4,2,1,1)
Below is the structure of my Google Sheets document. I want to limit the number of results returned by the query, but on each match and not for the entire query.
Here is what my current formula looks like:
=query({Sheet5!A2:B}, " select * where Col1 matches 'School A|School B|School C'
and Col2 is not null order by Col2 desc limit 5 ")
The formula finds all the matches and then limits the total number of entries to 5. What I want is to limit the results to 5 per match, so that School A returns 5 results, School B returns 5 results, and so on.
Thanks
Proposed solution:
Arrays
You can use an array literal to combine three separate queries into one formula.
For example:
={1,1,1;2,2,2;3,3,3}
Gives you this:
Your example
You can do something similar for 3 separate queries, putting all queries between {} and separating them by a ;:
={
query(Sheet1!A2:B, "SELECT * WHERE A MATCHES 'School A' AND B IS NOT NULL ORDER BY B DESC LIMIT 5", 0);
query(Sheet1!A2:B, "SELECT * WHERE A MATCHES 'School B' AND B IS NOT NULL ORDER BY B DESC LIMIT 5", 0);
query(Sheet1!A2:B, "SELECT * WHERE A MATCHES 'School C' AND B IS NOT NULL ORDER BY B DESC LIMIT 5", 0)
}
Which will give you something like this:
It is possible in SQL to do a GROUP_CONCAT and make it into one single query, but AFAIK Google Sheets Query does not support that. This way seemed simpler anyway, unless you want to order globally by column B!
References
Google Query Language
Arrays
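To make the "limit per match" behavior concrete, here is a small Python sketch of the same idea: bucket the rows by school, sort each bucket in descending order, take the first n from each, and stack the results. The function name and sample rows are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def top_n_per_group(rows, groups, n):
    """Return up to n highest-scoring rows for each requested group, stacked."""
    buckets = defaultdict(list)
    for name, score in rows:
        # Mirrors "WHERE A MATCHES ... AND B IS NOT NULL".
        if name in groups and score is not None:
            buckets[name].append(score)
    stacked = []
    for name in groups:  # one "query" per group, in the order given
        for score in sorted(buckets[name], reverse=True)[:n]:
            stacked.append((name, score))
    return stacked

# Hypothetical sample data in the shape of columns A:B.
rows = [
    ("School A", 3), ("School B", 9), ("School A", 7), ("School A", 5),
    ("School B", None), ("School C", 1), ("School D", 8),
]
result = top_n_per_group(rows, ["School A", "School B", "School C"], n=2)
print(result)
```

As with the stacked queries, the output is ordered within each school, not globally across schools.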
I am trying to calculate the change between two columns, in basis points, in a query statement and then order by the result.
I have tried to label the result and ORDER BY that label, but this doesn't work.
=QUERY('Sample Sheet'!A:I,"SELECT A,B, ((I-H)/H)*10000 LIMIT 5")
I would like to order by ((I-H)/H)*10000. Currently, the LIMIT clause just returns the first 5 rows. I don't want to order by H or I, because it is the change between the two numbers that I want to display.
Any help would be much appreciated!
=QUERY(A:I,"SELECT A,B, ((I-H)/H)*10000 WHERE A IS NOT NULL ORDER BY ((I-H)/H)*10000 LIMIT 5",1)
ORDER BY can be used with the same expression that appears in SELECT.
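The same pattern, sorting by a computed value rather than by a stored column, looks like this in plain Python. The rows below are invented (A, B, H, I) tuples, and the sketch assumes H is never zero, since the expression divides by it.

```python
def bps_change(old, new):
    """Relative change from old to new, scaled by 10000 as in the query."""
    return (new - old) / old * 10000

# Hypothetical rows shaped like columns A, B, H, I.
rows = [
    ("A1", "x", 100.0, 101.0),
    ("A2", "y", 200.0, 210.0),
    ("A3", "z", 50.0, 49.0),
]

# Compute the expression once per row, sort on it descending, keep the top 2.
computed = [(a, b, bps_change(h, i)) for a, b, h, i in rows]
ranked = sorted(computed, key=lambda r: r[2], reverse=True)[:2]
print(ranked)
```

Sorting on the computed third field plays the role of repeating the expression in ORDER BY.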
You could use a double query and keep it nice and clean, like this:
=QUERY(QUERY('Sample Sheet'!A:I,
"select A,B,((I-H)/H)*10000", 1),
"order by Col3 desc
limit 5
label Col3'equation'", 1)
But if you prefer a single query, then:
=QUERY('Sample Sheet'!A:I,
"select A,B,((I-H)/H)*10000
order by ((I-H)/H)*10000 desc
limit 5
label ((I-H)/H)*10000'equation'", 1)