Google Sheets Query with Variables in the data section - google-sheets

I am trying to create a health related spreadsheet that has a lot of data - a lot of which isn't relevant to this question so I've simplified it. There is a column for each type of pain where you write on a scale of 0-10 how intense your pain was, and another column for any relevant notes. The data is broken up into named ranges to make it easier to display on different tabs (HeadData = Head Pain, ChestData = Chest Pain, etc. - 15 named ranges in total.)
One of the tabs I'm working on has a table where you are viewing only the specific named range, in this case HeadData.
=query({HeadData}, " Select * where Col1 is not null ",1)
This works perfectly, but I want to replace {HeadData} with a reference cell to a drop menu so you can select the specific pain area column you want to be displayed.
If I put the reference cell in G1 with a drop down list of the named ranges and select ChestData and try to do
=query({&G1&}, " Select * where Col1 is not null ",1)
It is only picking up G1 (ChestData) as a string and not the actual named range.
So my question is, is there a way to make a drop menu containing named ranges that turn into actual sets of data and not strings when placed in the data section of the query?
Here is my spreadsheet, any help is appreciated. Thanks!
https://docs.google.com/spreadsheets/d/1CcuSV2bbfxsUPPkmj-fru2yYYmmtpXk9LKEF85sxVUw/edit?usp=sharing

You can use INDIRECT for this.
INDIRECT
Returns a cell reference specified by a string.
Change your formula to
=query({INDIRECT(G1)}, " Select * where Col1 is not null ",1)
The INDIRECT function will convert the string in G1 to a cell reference and then the rest of your formula will query the relevant named range.

Related

How to handle data manipulation when using importrange() in Google Sheets?

I am working on speeding up a workbook in Google sheets that is using importrange(). The purpose of the entire workbook is to import data from a mastersheet and then allow us to manipulate it the way we want to outside of the mastersheet.
The problem: because importrange() doesn't allow you to directly manipulate cells we have Sheet1 acting as the import sheet; it doesn't get touched. Sheet2 is where we do the manipulating but, it was literally just taken as a copy of Sheet1, so it is also using importrange(). This bogs down the entire workbook and makes manipulations very slow.
I am thinking of using !Sheet1A1... and copying that to all the cells in the manipulation sheet, but my concern is that this will still bog down the workbook. There is potential that the import data could grow as large as 10k+ rows, and I'm only at about half that currently and running into this problem. Outside of that, I'm not sure what else there is to try.
The QUERY function can help here and there are some great resources online.
=importrange(spreadsheet_url, range_string)
a typical example is:
=importrange("https://docs.google.com/spreadsheets/d/xxxxxxxxxxxxxxxxxxxx","Sheet1!A:Z")
You can wrap a QUERY function around this to manipulate your data.
QUERY is like a version of SQL and very powerful. It's in the format:
=QUERY({},"",1)
Your data range importrange("https://docs.google.com/spreadsheets/d/xxxxxxxxxxxxxxxxxxxx","Sheet1!A:Z") would go within {}.
Then within the "" part of the query, you could write your parameters for manipulating the data.
Example:
select Col1,Col4,Col5 where Col1 is not null and Col6 contains 'hello' order by Col1,Col7 desc label Col1 'new name 1',Col4 'new name 4'
The select bit allows you to specify specific columns from your importrange. If you want the all, then you could use select *.
The where item is where you build up your criteria using various or or and parameters.
is not null is another way of saying you want rows that have data.
contains is useful. You can also have matches, starts with, ends with and like. like can use wildcards %, so where Col1 like '%the%' would find 'hello there'.
order by is ascending unless you add desc, ie. order by Col1,Col2,Col4,Col5 desc,Col3.
label allows you to rename the columns, so let's say input column 1 is called 'Name1' and input column 2 is 'Name2' and you want them to be 'First name' and 'Surname, you would use label Col1 'First name', Col2 'Surname'.
If you like QUERY there are other powerful clauses, and they run in this order within the QUERY(range,"clauses",0):
select
where
group by
pivot
order by
limit
offset
label
format
options
One small point which you may come across, when you use importrange to get your data you need to reference the columns as Col1,Col2,Col3 within the QUERY.
If, however, your range is already in the same sheet (same or different tab), then you would reference column letters instead, eg. select A,B,C where A is not null order by A desc.
To make it more consistent and use the Col1,Col2,Col3 notation, you would put your internal range in an array {}.
QUERY(Sheet1!B:F,"select B,C,D where F is not null order by B,C",0)
would become:
QUERY({Sheet1!B:F},"select Col1,Col2,Col3 where Col5 is not null order by Col1,Col2",0)
{Sheet1!B:F} is smart because you can add columns in front of this range without needing to change your clause. So adding one column in front of Sheet1, would result in:
QUERY({Sheet1!C:G},"select Col1,Col2,Col3 where Col5 is not null order by Col1,Col2",0)
The other method would need you to alter your clause from:
QUERY(Sheet1!B:F,"select B,C,D where F is not null order by B,C",0)
to:
QUERY(Sheet1!C:G,"select C,D,E where G is not null order by C,D",0)
It's a lot to take in, but definitely worth persuing!

Display the multiple values from a drop down

here i have a sheet, in that we can find the sum of diff categories using query function by selecting from drop down list. but here I can select one month only at a time can i find the amount of January and February at the same time by adding another column for another month or in any other way. here I can find the sales of one month at a time. I want to find sales of two or three months at time.
Please help
https://docs.google.com/spreadsheets/d/1jdtrtdNQBsxiZt8FjvbaE9omCBs8x8vRgp0r2bc1_7c/edit#gid=0
There's no way you can make a drop down list with multiple choice in Google Sheets.
But there are some alternatives.
List of tick boxes (here as list of months)
Manual input of multiple values separated by comma or something else.
I give both:
Months are selected as list separated by | so it can be used as regex inside 'matches' clause in query
This generates list of months:
=join("|",query({A2:B7;C2:D7},"select Col2 where Col1 = true "))
Window with manual input works similar way
=substitute(substitute(F3,", ",","),",","|")
It takes its contents, removes spaces that are adjacent to comma, adds separator | instead of comma. It's case sensitive and I don't know how to get rid of this (?i) does not work within query.
All together it looks like on the picture and combined formula is:
=query(ORDERS!A1:R14,"select A, B, C , D where
A matches '"&join("|",query({A2:B7;C2:D7},"select Col2 where Col1 = true "))&"' and
B matches '"&substitute(substitute(F3,", ",","),",","|")&"'",0)
Here is my solution:
https://docs.google.com/spreadsheets/d/1fQ5_VdxZ-t4MqPbLqzb8q-saqp5Jqz3hVXeWHX_Lls4/copy
I copied your spreadsheet to do my testing. Here's what you can do.
Add another row of the same exact selection found on your "A" row.
Change your formula to this: ={query(ORDERS!A1:R,"Select * where A contains '"&$A2&"' and B contains '"&$B2&"'",1);query(ORDERS!A1:R,"Select * where A contains '"&$A3&"' and B contains '"&$B3&"'",0)}
What this does it run an array of two sets of formulas (In this case 2 queries) for the same set of data.
Here's the screenshot of the output if you're interested.
Sample Screenshot

IF cell contains, THEN return certain value with more values and return possibilities

I have a sheet with the following columns:
Column 1: contains text of the form "TS001", "TS002", "DR001", "MS002" etc.
The 2 letter in the beginning are a code for the manufacturer name, so for example "MS=Microsoft".
For the second column, I would like to have a formula that goes through the first column and searches for those letters, in order to then return the complete name of the manufacturer.
For example, it should look something like this:
Column 1
Column 2
MS001
Microsoft
TS002
Tesco
DR001
DR. Pepper
TS003
Tesco
Is something like that possible?
Thank you very much!
When you say "MS=Microsoft" it implies somewhere you have a table with that reference. For the purposes of the following example I created a sheet named ReferenceTable where column A contains the two letter code, and column B contains the name of the company. So it looks like this:
A
B
MS
Microsoft
TS
Tesco
And now in the main sheet in column B you would write the following formula:
=ARRAYFORMULA(VLOOKUP(MID(A1:A,1,2),ReferenceTable!A1:B,2,FALSE))
This will give you the name of the company, looked up from the reference table.
The array formula is there so that you only have to put this formula in cell B1, and assumes you will use the ReferenceTable sheet as a list; that way as you add records to Column A Column B is populated by the arrayformula in B1.
I'd simply use a Reference Table and a VLOOKUP formula
If cell B7 contains "MS0001"
the following formula will attempt to match just the first two letters again a reference table located in cells O7:P9
=VLOOKUP(MID(B7,1,2),O7:P9,2,FALSE)
and will return "Microsoft" when it finds "MS"
In order to achieve what you want, somewhere you need to have a list of the two letter codes and the corresponding company name.
As with all vba, there’s any number of ways to do this, but I would probably put the two letter code and company data into an array, then iterate through col1 to create the desired output for col2.
E.g below assumes the two letter code and company names are in col3 and col4 respectively, but you can change it to wherever they’re located.
Sub CompName()
Dim Cmpname () as string
Dim col1 as range, rng as range
Cmpname = range(range(“C1”), range(“D1048576”).end(xlup))
Set col1 = range(range(“A1”), range(“A1048576”).end(xlup))
For each rng in col1
For i = lbound(Cmpname, 1) to ubound(Cmpname, 1)
If left(rng, 2) = Cmpname(i, lbound(Cmpname, 2)) then
rng.offset(0,1) = Cmpname(i, ubound(Cmpname, 2))
Exit For
End if
Next
Next
End Sub
I’ve admittedly just written this on my phone and have not tested it, but hopefully there’s minimal mistakes.
I just reread your question and realized that you may actually want a formula rather than vba code.
If this is correct, using an INDEX MATCH is probably your best bet.
In this example I’ll assume the same setup as described above - col3 has company codes and col4 has company name - and this formula can be inserted into cell B1:
=index(D:D,match(left(A1,2),C:C,0))
You can then just filldown for the rest of the entries in col2.
Again, done from memory without testing so hopefully got it right.

Pivoting programatically and carrying color information in summary of Google sheets

We have a base table that looks like this, in our main sheet called "MainData".
We would like to summarize it in a new worksheet. The summary needs to be by time, in a program management mode, where the "When" becomes the main view, in the following way. We could technically get a version of the top part of the table via Pivot, but that forces a new worksheet. We would like this entire view to be in our own second sheet of choice which we can call "Summary".
Not sure where to begin with this. The GETPIVOTDATA command seems a more convenient way to control how the pivot shows without forcing a worksheet, but it's the itemization of colours etc that is confusing. In each week's listing below that week column, we'd like to show the items but their cell needs to be coloured by the Status that item is in.
Not looking for ready made solutions (although I won't revolt if that's shared), just looking for pointers for which functions to look for. Thanks muchly!
Solution:
It's quite a task... :)
I have build working solution for you.
Go to this link to grab this (2 sheets - data and report)
Explanation:
Data sheet:
I added an extra column to source data - we will need this column in further query (you can hide this column)
={"Rep Desc";ArrayFormula(if(A2:A<>"";"Count of "&A2:A;))}
Report sheet:
I added 2 extra columns (A:B) (you can hide them later) to explain better what is going on. There are 4 main parts to this solution - you are able to pack all of them into one formula, but for sake of clarification I left them separate.
Part 1 Numbers of "Open / Closes / Attn"
This is simple query - we use extra column in data source to have desire description (Count of... instead just Attn, Closed, etc)
=QUERY({INDIRECT($A$1)};$B$1;1)
string to query
select Col5, count(Col4) where Col1 is not null group by Col5 pivot Col3 label Col5 ''
Part 2 - "Sum of Point"
Its Query again put into next query to remove headers + "Sum of Points" as an extra column (using inline array - {}):
={"Sum of Points"\QUERY(QUERY({INDIRECT($A$1)};B5;1);"select * offset 1";0)}
string to query
select sum(Col4) where Col1 is not null pivot Col3
Part 3 - "Features"
It is quite complicated... If I find more time I will describe what is going on here... but for now just code:
=QUERY(
transpose(ArrayFormula(SPLIT(
transpose(SPLIT(
TEXTJOIN("^";1;transpose(
{SPLIT(join(" ## ";transpose(query(transpose(QUERY({INDIRECT($A$1)};$B$9;1));"select Col1 offset 1";0)));" #";0;1);
QUERY(ArrayFormula(IF(TRANSPOSE(query(transpose(QUERY({INDIRECT($A$1)};$B$9;1));"select * offset 1";0))<>"";
query(QUERY({INDIRECT($A$1)};$B$9;1);"select Col1";0);""));"select * offset 1";0)}
))
;"# ";0;1))
;"^")))
;
"select * offset 1";0)
Part 4 - Conditional formatting
For range D9:H apply 3 rules with corresponding color :
=INDEX(INDIRECT("data!$A:$A");MATCH(D9;INDIRECT("data!$B:$B");0);1)="Open"
=INDEX(INDIRECT("data!$A:$A");MATCH(D9;INDIRECT("data!$B:$B");0);1)="Closed"
=INDEX(INDIRECT("data!$A:$A");MATCH(D9;INDIRECT("data!$B:$B");0);1)="Attn"
OK?
Is that what you were going to achieve?
Again - this is working copy for you:
Go to this link to grab this (2 sheets - data and report)

Google Sheets: selecting cell values based on another cell MAX values

I'm trying to make a database of students using Google Sheets. It contains info about students, groups and orders; orders can change students membership in groups (taken in a group, moved up to a new group, graduated, on leave, sent down). Here are sample database sheets and here is a detailed description of my DB structure (the sheet report_Groups is slightly changed, its previous variant, described on the link, is now named old_report_Groups).
I need a query that would select a list of present members of given group on the given date. That means that for each student I have to select
the name, the latter status before given date and corresponding group. And from this result select student names, where statuses are "Taken in" or "Moved Up" and group is the same as given one.
The problem is to select the latter status. It should be MAX(status), whose "since" date ≤ given date, but there's a well-known problem of selecting more than one field together with aggregate function. Here is a question which is very close to, but query from its "best" answer gives me error "QUERY:NO_COLUMN". I've even copied the sheet Raw from there and tried to perform proposed query (with the onliest modification — replacing commas with semicolons according to my locale restrictions) on the data it was reported to work on — same error (check Raw and report_Raw sheets in my DB). Other variant (via MMULT and TRANSPOSE) works, but it's perfomance is very poor.
What can you suggest me? Thanks in advance.
Update: I've found the solution with an issue (described in my answer).
To solve the issue I need to know an answer for a different question.
Here's the solution (with an issue described below).
A. Orders_Students is filtered for selecting rows, having "since" cell value ≤ given date (report_Groups!A2):
=QUERY(Orders_Students!B:E;"select E, B, C, D where E <= date '" & TEXT(report_Groups!A2;"yyyy-MM-dd") & "'";1)
This interim result is stored at the inner_report_Groups tab (it will be referenced few times in the next query).
B. inner_report_Groups is filtered for selecting MAX("since") values and corresponding row cell values for each student:
ARRAYFORMULA(VLOOKUP(QUERY({ROW(inner_report_Groups!A$2:A)\SORT(inner_report_Groups!A$2:D)};"select max(Col1) group by Col3 label max(Col1)''";0);{ROW(inner_report_Groups!A$2:A)\SORT(inner_report_Groups!A$2:D)};{3\4\5};0)
The formula above is used as inner query in report_Groups!D2 (also in D3, D4—with appropriate indeces).
C. The second query result is filtered to get students whose status is either "Taken in" or "Moved Up" and corresponding group is equal to the given group (report_Groups!B2 (also in B3, B4—with appropriate indeces)):
=TRANSPOSE(IFERROR(QUERY(<here is the formula from step B>);"select Col1 where Col3 = '" & B2 & "' and (Col2='Taken in' or Col2='Moved Up')";0)))
The formula above is used as outer query in report_Groups!D2 (also in D3, D4—with appropriate indeces). IFERROR is intended to display nothing if query result is #N/A.
That query displays the needed results as you can see in report_Groups tab. But as the query on step B searches the whole columns of inner_report_Groups, there's only a single given date can be analysed (or the query interim results for other given dates should be placed in different columns of inner_report_Groups or at the different tab. Is there any way to give an alias for an interim result to refer it in a single cell formula instead of keeping it on different tab?

Resources