"Group by" in Google Sheets - finding distinct values for each column? - google-sheets

I have a table in Google Sheets with 4 descriptive columns and 1 numerical column (e.g., State, Region, Ethnic Group, Education Level, Population).
I'd like to create a new table in a separate tab with the distinct values for some descriptive fields and the median of the numerical field - what is the best way to do this dynamically?
If it helps, I know this is what it would look like in SQL (below), I just don't know how to do it in sheets ;)
select state, region, ethnic_group, median(population) as med_population
from TABLE
group by state, region, ethnic_group

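Try the following. Note that QUERY has no median aggregate, so this returns the average of the numeric column (E) for each group rather than the median:
=QUERY(A:E;
"select A,B,C,avg(E)
where A is not null
group by A,B,C
label avg(E) 'average'"; 1)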
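The QUERY language provides avg, count, max, min and sum, but no median. For reference, here is a minimal Python sketch of what the SQL in the question computes (a median per group). The function name and the sample rows are made up for illustration and are not taken from the asker's sheet.

```python
from collections import defaultdict
from statistics import median


def median_by_group(rows, key_cols, value_col):
    """Group dict rows by key_cols and return the median of value_col per group."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        groups[key].append(row[value_col])
    # Sort by key so the output order is deterministic
    return {key: median(values) for key, values in sorted(groups.items())}


# Hypothetical sample data
rows = [
    {"state": "TX", "region": "South", "ethnic_group": "A", "population": 10},
    {"state": "TX", "region": "South", "ethnic_group": "A", "population": 30},
    {"state": "TX", "region": "South", "ethnic_group": "A", "population": 100},
    {"state": "NY", "region": "East", "ethnic_group": "B", "population": 5},
    {"state": "NY", "region": "East", "ethnic_group": "B", "population": 7},
]
result = median_by_group(rows, ["state", "region", "ethnic_group"], "population")
```

For the TX group the median is 30 while the mean is about 46.7, which is why an avg-based query is only an approximation of what was asked.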

Related

How to SUMIF identical data spread out on multiple columns

I am using Excel to organize my cut list for a panel board. I have a table created like so, and I need help creating a formula that adds up the number of pieces of panels with identical dimensions and material, regardless of which panel they belong to.
I have written formulas before where the data were in a single column and there was only one criterion to check, using UNIQUE and SUMIF. I can already extract all the unique dimensions using
=UNIQUE(FLATTEN(C17:C19,E17:E19,G17:G19))
but that is all I have for now.
try:
={"Material", "Dimension", "Pcs";
QUERY({A2:C9; A2:A9, D2:E9; A2:A9, F2:G9},
"select Col1,Col2,sum(Col3) where Col1 is not null
group by Col1,Col2 label sum(Col3)''", 0)}
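The idea behind the formula is to stack the (dimension, pieces) column pairs under each other, each time next to the material column, and then sum per material and dimension. A rough Python sketch of that logic follows. The row layout (material first, then alternating dimension and piece count) and the sample values are assumptions for illustration, and skipping rows with an empty dimension is a choice made here, not something the formula does.

```python
from collections import defaultdict


def total_pieces(rows):
    """rows: tuples of (material, dim1, pcs1, dim2, pcs2, ...)."""
    totals = defaultdict(int)
    for material, *pairs in rows:
        # Walk the remaining cells two at a time: (dimension, pieces)
        for dimension, pieces in zip(pairs[0::2], pairs[1::2]):
            if material and dimension:  # skip blank cells (assumption)
                totals[(material, dimension)] += pieces
    return dict(sorted(totals.items()))


# Hypothetical cut list
rows = [
    ("MDF", "600x400", 2, "300x200", 4, "600x400", 1),
    ("Plywood", "600x400", 3, "300x200", 1, "100x100", 2),
    ("MDF", "300x200", 1, "600x400", 2, "", 0),
]
result = total_pieces(rows)
```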

Google sheets question - filter sum equation

I have a list of email addresses in one sheet (first column).
I have a list of transactions in another sheet with emails and sale amounts.
I am looking for a formula that adds up all the transaction $ sales for any transactions made by the people (emails) in the first sheet.
Any help would be much appreciated!
Your sample data is very limited (only one row matching one person, in fact). But the following formula should work for you. Place it in a new sheet, in cell A1:
=ArrayFormula(IFERROR({"Name","Total"; QUERY(FILTER(Transactions!A2:D,NOT(ISERROR(VLOOKUP(Transactions!A2:A,Tags!A:A,1,FALSE)))),"Select Col2, SUM(Col4) GROUP BY Col2 ORDER BY Col2 LABEL SUM(Col4) '' ")},"No Data"))
This one formula will produce the headers (which you can change within the formula itself as desired) and all results.
FILTER filters in only 'Transactions' data (names through amounts) where the emails are found in the 'Tags' sheet.
Then QUERY returns the name and totals for each match in alphabetical order by name.
If there are no matches, "No Data" will show instead.
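To make the two steps concrete, here is a small Python sketch of the same logic: keep only transactions whose email is in the list, then total the amounts per name. The four-field row layout (email, name, item, amount) and the sample data are assumptions, and the comparison here is an exact string match.

```python
from collections import defaultdict


def totals_for_known_emails(transactions, known_emails):
    """Sum amounts per name for transactions whose email is in known_emails."""
    known = set(known_emails)
    totals = defaultdict(float)
    for email, name, _item, amount in transactions:
        if email in known:  # the FILTER step
            totals[name] += amount  # the GROUP BY / SUM step
    if not totals:
        return "No Data"
    return sorted(totals.items())  # ORDER BY name


# Hypothetical data
transactions = [
    ("ann@example.com", "Ann", "book", 10),
    ("bob@example.com", "Bob", "pen", 5),
    ("ann@example.com", "Ann", "lamp", 20),
    ("zed@example.com", "Zed", "cup", 99),
]
result = totals_for_known_emails(transactions, ["bob@example.com", "ann@example.com"])
empty = totals_for_known_emails(transactions, ["nobody@example.com"])
```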
If I understood your question correctly, try this on the sheet that contains only the emails, to get the sum of the sale amounts for each email:
=SUMIFS(range_with_transaction_amounts, range_of_emails_in_transactions_table, cell_with_the_email_to_sum_for)
It will look something like this:
=SUMIFS(TransactionSheet!B:B,TransactionSheet!A:A,Emails!A2)
Reference
SUMIFS

How can I combine multiple columns into one column in Google Sheets?

(Note: Please simply look at the Google sheet for the quickest understanding of what I'm describing in the bullet points below)
- My data has rows which each represent an order
- Each order (row) can consist of multiple products
- For each product in an order (row) there is another set of columns in the same row
- I need to convert this data into only one set of columns per row (i.e. one product per row)
- The products (new rows) need to remain next to each other, so the columns can't just be added to the bottom of the array (which would be simpler)
Can you please take a look at the example below and help me achieve this?
Example Sheet
Try this in another sheet
=SORT({query({Reference!$A5:$A,Reference!B5:F},"select * where Col2 is not null ");query({Reference!$A5:$A,Reference!G5:K},"select * where Col2 is not null ");query({Reference!$A5:$A,Reference!L5:P},"select * where Col2 is not null ")})
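The formula stacks each product's block of columns next to the order column, drops blocks whose first cell is empty, and sorts so that rows of the same order end up together. Here is a minimal Python sketch of that reshaping. The function name, the group width of 2 and the sample orders are assumptions for illustration (the formula above uses blocks of 5 columns), and the sort here is a stable sort on the order id only.

```python
def one_product_per_row(orders, group_width):
    """Turn (order_id, product block 1, product block 2, ...) into one row per product."""
    rows = []
    for order_id, *cells in orders:
        for start in range(0, len(cells), group_width):
            group = cells[start:start + group_width]
            # Keep the block only if its first cell is filled in
            if group and group[0] not in ("", None):
                rows.append((order_id, *group))
    # Stable sort: products of one order stay adjacent, in their original order
    rows.sort(key=lambda row: row[0])
    return rows


# Hypothetical orders: (order id, product, quantity, product, quantity)
orders = [
    (2, "pen", 1, "ink", 3),
    (1, "cup", 2, "", ""),
]
result = one_product_per_row(orders, group_width=2)
```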

Is there a way to average from filtering specific data from an importrange?

I'm working on a different Google Sheets spreadsheet to input data (film details such as English Title, Original Title, Release Date, Rating, Country of Origin and a Link), while on the one I'm analyzing the data I managed to use importrange successfully.
Here is the code I used successfully in order to get the average rating of a country's list of movies:
=AVERAGE(IMPORTRANGE("LINK_TO_INPUT_DATA_GOOGLE_SHEETS", CONCAT(A1:A, "!D1:D")))
This average is outputted to column C, while the name of the Country (which is also the name of the sheet for importrange) is in Column A.
I want to create a similar query but for movies whose Country of Origin matches the Country from Column A (any movie that has multiple countries of origin is entered under the first one in the spreadsheet and copied over to the respective sheets of all its other countries of origin).
I tried using the QUERY from Google Sheets to make my resultset, but in the best case scenario, it gives the same result as the previous average, while in the worst case scenario it just gives out errors. Here is my latest attempt at the query:
=AVERAGE(QUERY (IMPORTRANGE("LINK_TO_INPUT_DATA_GOOGLE_SHEETS", A1:A), "SELECT Col4 WHERE Col5="&A1&""))
As far as I can tell, this should work, but at the moment it says it cannot find the range or sheet for the imported range.
Any help is deeply appreciated!
EDIT:
Here's a link of the input sheet: https://docs.google.com/spreadsheets/d/1bopmJu7Av71sCh8iUoG20WubGL9ssx09dOnBZnys4Ko/edit?usp=sharing
Here's a link of the analysis spreadsheet (the query should be in the MOVIES sheet):
https://docs.google.com/spreadsheets/d/1-hfQdqvDWXXtGR2fmTy-lZEOtp9sdxkvoget4toi1W4/edit?usp=sharing
I am not sure if I understood you correctly:
- you want to import the average of the ratings (column 4) by movie title (column 1) where the country matches your current column A?
If so, it can simply be done with a query, especially if you include the average in the query as well. But the IMPORTRANGE must include all the columns you use:
=QUERY(IMPORTRANGE("https://...", "Syria!A1:E"),"SELECT AVG(Col4) WHERE Col5='"&A2&"' LABEL AVG(Col4) ''")
Explanation: without a GROUP BY clause, AVG(Col4) aggregates all rows that match the WHERE condition into a single average.
This is the correct syntax (the imported range has to span columns A to E, because the query refers to Col4 and Col5):
=AVERAGE(QUERY(IMPORTRANGE("ID_OR_URL"; "Sheet1!A1:E"); "SELECT Col4 WHERE Col5='"&A1&"'"; 0))
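Both answers do the same thing: keep the rows whose fifth column equals the country, then average the fourth column. A minimal Python sketch of that, assuming rows of (title, original title, release date, rating, country) with made-up sample values:

```python
from statistics import mean


def average_rating(rows, country):
    """Average of column 4 (rating) over rows whose column 5 (country) matches."""
    ratings = [row[3] for row in rows if row[4] == country]
    # No matching rows: return None instead of raising
    return mean(ratings) if ratings else None


# Hypothetical rows
rows = [
    ("Film A", "Orig A", "2001-01-01", 8, "Syria"),
    ("Film B", "Orig B", "2005-06-01", 6, "Syria"),
    ("Film C", "Orig C", "2010-03-15", 9, "France"),
]
syria = average_rating(rows, "Syria")
missing = average_rating(rows, "Peru")
```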

Google Sheets query to extract the top three instances according to criteria

I have a Google Spreadsheet with two sheets.
In sheet "Source" I have a series of countries, cities and landmarks; these are, respectively, in columns A, B and C.
In sheet "Sheet for Query", there are two columns: (A) Country, which has a list of unique country names; and (B) Top 3 cities by Landmark. In column B, I would like to have a Query which gives me, for each country, the top three cities by number of landmarks, i.e., the query just has to count the number of times each city in each country appears and return, for each country, the names of the three cities that come up the most times.
This is a sample sheet that I've created in order to demonstrate what I mean: https://docs.google.com/spreadsheets/d/1IPwtAHjwjV1A03o9URws-AtDKw3h9QS9UTT0P1PeVN0/edit?usp=sharing.
Thank you!
I've given this some thought and to 'just' count the number of instances and return the top 3 in each country is surprisingly difficult.
The grouping is straightforward with a query like this
=query(A:C," select A,B,count(C) where A<>'' group by A,B order by A,count(C) desc label A 'Country',B 'City', Count(C) 'Landmarks'",1)
But I don't know of a way of getting the top 3 for each group without going through 2 further steps
(1) Number the results in each group (various ways of doing it but here is one)
=(E1=E2)*D1+1
where the country names after grouping are in column E.
(2) Filter the result for the number in column D being less than 4
=filter(E:G,D:D<4)
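The three steps (count per country and city, rank within each country, keep ranks below 4) can be sketched in Python as follows. The function name and sample rows are made up for illustration, and breaking ties alphabetically by city is an assumption, since the question does not say how ties should be handled.

```python
from collections import Counter, defaultdict


def top_cities(rows, n=3):
    """rows: (country, city, landmark) tuples. Returns the top n cities per country."""
    # Step 0: count landmarks per (country, city), like the grouping query
    counts = Counter((country, city) for country, city, _landmark in rows)
    # Order by country, then descending count, then city name (tie-break assumption)
    ordered = sorted(counts.items(), key=lambda kv: (kv[0][0], -kv[1], kv[0][1]))
    by_country = defaultdict(list)
    for (country, city), _count in ordered:
        # Steps 1 and 2: keep a city only while the country has fewer than n entries
        if len(by_country[country]) < n:
            by_country[country].append(city)
    return dict(by_country)


# Hypothetical source data
rows = [
    ("France", "Paris", "a"), ("France", "Paris", "b"), ("France", "Paris", "c"),
    ("France", "Lyon", "d"), ("France", "Lyon", "e"),
    ("France", "Nice", "f"), ("France", "Lille", "g"),
    ("Italy", "Rome", "h"), ("Italy", "Rome", "i"),
]
result = top_cities(rows)
```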
You don't specify what qualifies as top (so I am assuming those are the first listed, i.e. higher up the sheet), and you don't clarify what "number of landmark" means when there are no numbers in your sheet, but perhaps:
=textjoin(", ",,query(Source!A:C,"select B where A='"&A2&"' limit 3"))
in B2 of sheet for Query, copied down to suit.
