Google Sheets Query returning odd formatting - google-sheets

I have a simple sheet to try to track and format race results from a league that I've joined. For the most part I know how I want to do this but when I use a query it's dropping data in some situations and formatting it strangely in others.
It seems as if where there are more numbers in a column than text it drops all text entries.
In addition for some reason when I add a check row, if it's included in the query it pushes almost all the data into a single cell except for the check row.
Would someone mind having a look and trying to figure out why it's doing this? The link is below.
On sheet RRL1 I have my compiled data on the left, my 'missing' data on the right and my weirdly formatted data below.
https://docs.google.com/spreadsheets/d/1c9xlQG06dQCrpMk3UMAX29oTlpRuhTfx6btbYTGmC8g/edit?usp=sharing

The query() formula will only support one data type per column — number, text, boolean or date. The type is determined by the majority of the values in the first few hundred rows. Values that are of another type will be returned as null, i.e., blank values.
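To make the majority-type rule concrete, here is a minimal Python sketch (Python rather than a Sheets formula, only to spell out the logic). It is a simplified model of the behaviour described above, not Google's implementation; the helper names kind and query_column and the sample values are made up for illustration.

```python
from collections import Counter

def kind(value):
    """Classify a cell value for this sketch (a simplification, not Google's code)."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    return "text"

def query_column(values):
    """Keep values of the majority type; replace the others with None (blank)."""
    non_blank = [v for v in values if v is not None]
    if not non_blank:
        return list(values)
    majority = Counter(kind(v) for v in non_blank).most_common(1)[0][0]
    return [v if v is not None and kind(v) == majority else None for v in values]

# Hypothetical race results: mostly numbers, with two text entries.
results = query_column([12, 9, "DNF", 15, "DNS", 7])
print(results)  # [12, 9, None, 15, None, 7]
```

In this model the column is mostly numeric, so the text entries come back blank, which matches the "dropped" text values in the question.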
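Where you only copy a range without a query string, as in
=QUERY('Tournament Details'!D2:E22)
use an { array expression } instead, which keeps mixed types intact:
={ 'Tournament Details'!D2:E22 }
Likewise, instead of
=TRANSPOSE(query('Tournament Details'!I3:I26))
use this:
=transpose('Tournament Details'!I3:I26)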
Use this pattern to replace "DNS" and "DNF" with nulls:
=arrayformula(
query(
{ 'RRL1'!A1:C, iferror(value('RRL1'!D1:D)) },
"select Col3, sum(Col4)
where Col3 is not null
group by Col3
label sum(Col4) 'Total AUS RRL1' ",
1
)
)
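The "squished" values you mention come about because you are not specifying the headers parameter. When it is omitted, query() guesses how many header rows the data has, and it can treat several rows of text as headers and join them into a single cell. The best practice is to always include it, like this: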
=query('Tournament Details'!A2:E22,"select A where C != 'N/A'", 1)

Related

How to 'count' only when header matches value?

I have a Google Form that collects a bunch of data from dropdown questions on a Sheet with each question going to one column (as normal). On separate sheets, I want to be able to count how many times each option is selected.
Here is an example of what the response sheet might look like. A, B, and C are all questions.
I would then have separate sheets for 'Person?', 'Place?', and 'Thing?'. The 'Person?' sheet would look something like this:
I want to be able to add in the count of each time the option appears for that question. In the example, notice that 'Napoleon' is in both Col A and Col C. If I just count the number of times 'Napoleon' appears, I will get '2' even though he only appears once in the 'Person?' responses.
I originally used a QUERY function like =QUERY('Input Data'!1:1000, "select count(A) where A contains '"&$A2&"'",0). BUT, I need it to be dynamic. So the "Person?" question may not always be Col A. I want the Query (or whatever formula) to search the headers and only return the count of that option for that question even if the column location changes.
Okay, I figured it out! In case someone else is curious, I used this formula:
=QUERY({'Input Data'!A1:L}, "SELECT COUNT(Col"&MATCH("Person?", 'Input Data'!1:1,0)&") WHERE Col"&MATCH("Person?", 'Input Data'!1:1,0)&" CONTAINS '"&$A2&"' label COUNT(Col"&MATCH("Person?", 'Input Data'!1:1,0)&") ''",0)
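For readers who want to see what text that formula assembles, here is a small Python sketch of the same string building. The function name count_query and the sample header order are hypothetical; only the query text mirrors the formula above.

```python
def count_query(headers, question, option):
    """Build the query text that the MATCH-based formula assembles."""
    # MATCH(..., 0) returns a 1-based position, so add 1 to Python's 0-based index.
    col = "Col{}".format(headers.index(question) + 1)
    return "SELECT COUNT({c}) WHERE {c} CONTAINS '{o}' label COUNT({c}) ''".format(
        c=col, o=option
    )

# Hypothetical header row in which 'Person?' has moved to the third column.
headers = ["Place?", "Thing?", "Person?"]
print(count_query(headers, "Person?", "Napoleon"))
# SELECT COUNT(Col3) WHERE Col3 CONTAINS 'Napoleon' label COUNT(Col3) ''
```

Because the column number is looked up from the header each time, the query keeps working when the question moves to a different column.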
Lee, I sent you a PM about your most recent post, but in the process, I came across this one. There is no need for multiple formulas or manual entry references. One formula can produce the entire report with headers, listing and counts:
=IFERROR(QUERY(FILTER(FILTER(A:L,A:A<>""),A1:L1="Person?"),"Select Col1, COUNT(Col1) GROUP BY Col1 ORDER BY Col1 LABEL COUNT(Col1) 'Count'",1),"No Matches")
Just fill in the header you're looking for between the quotes where Person? is now.
The double FILTERs mean "Keep only the rows where Col A is not blank, then keep only the column whose Row 1 header reads 'Person?'"
Then QUERY simply returns the unique names in the left column and their counts in the right column. Because the QUERY had a final parameter of 1, any existing header will be kept (in this case, the one you were searching for); and the created column will receive a header (i.e., LABEL) of Count.
IFERROR will give a friendly error message if no matches are found (in which case check that what you entered for the search in the formula exactly matches a column header in the range).
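As a rough illustration of what that FILTER plus QUERY combination computes, here is a Python sketch. The name count_options and the sample table are assumptions for illustration, and skipping blank answers is a choice made in this sketch rather than something stated in the answer.

```python
from collections import Counter

def count_options(table, header):
    """Return a header row followed by (option, count) pairs, sorted by option."""
    head, *rows = table
    col = head.index(header)  # position of the wanted header in row 1
    counts = Counter(
        row[col] for row in rows
        if row[0] != "" and row[col] != ""  # skip empty rows and blank answers
    )
    return [(header, "Count")] + sorted(counts.items())

# Hypothetical form responses.
table = [
    ["Person?", "Place?", "Thing?"],
    ["Napoleon", "Paris", "Hat"],
    ["Cleopatra", "Cairo", "Napoleon"],
    ["Napoleon", "Rome", "Coin"],
]
print(count_options(table, "Person?"))
# [('Person?', 'Count'), ('Cleopatra', 1), ('Napoleon', 2)]
```

The 'Napoleon' entry under 'Thing?' is not counted, because only the column whose header matches is looked at.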

Only apply complex arrayformula() to rows with certain value in dataset

I have a quite complex formula (complex to me, that is) that Tom Sharpe helped me build to aggregate values and order them by month in a row (you can find the details in the original post, but I think you'll only need the final formula), which is:
=ArrayFormula(mmult(sequence(1,counta(A2:A),1,0), if((C2:index(C:C,counta(C:C))<=eomonth(G2,sequence(1,datedif(G2,H2,"M")+1,0)))* (D2:index(D:D,counta(D:D))>=eomonth(G2,sequence(1,datedif(G2,H2,"M")+1,0))),E2:index(E:E,counta(E:E)),0)))
and here is the result -> [J1:U1]
Now, what I need to do as the final step is to group the data by a certain label (John or Jane in the example) on separate rows, while maintaining the order/aggregation by month along the row. In the example, this would mean having one row with only 'John' data and, below it, one with 'Jane' values.
I am struggling to understand how to adapt the formula to do so.
I have tried:
Using another array to first return a list of these labels with query(unique()) or something like that, but then I struggle to loop over it with the other formula.
A bit more simplistic, but it could work after all: writing 'John' on the first row (in the cell next to where the data will be returned) and 'Jane' on row 2, then using filter() to pull only the data that matches. 'John' and 'Jane' are just for the example, but the real labels won't be many, and the list of labels doesn't need to be dynamic.
The thing with these solutions is that they work when used separately, but I can't figure out how to nest them in the first arrayformula() that Tom helped me with, as I am just beginning with Google Sheets queries.
I don't necessarily need the complete formula/code, just directions or tips to help me see how I could solve this.
Thanks to all who might contribute
With hindsight, I might have done better to use a query to calculate the sums in my previous answer rather than Mmult.
This uses the same method as before to create a 2d array of amounts vs dates (going across) and individuals (going down). Then it uses Textjoin to generate a query to group by name with the required number of columns.
=ArrayFormula(query({A2:A,if((C2:C<=eomonth(G2,sequence(1,datedif(G2,H2,"M")+1,0)))* (D2:D>=eomonth(G2,sequence(1,datedif(G2,H2,"M")+1,0))),E2:E,0)},
"select Col1,sum(Col"&textjoin("),sum(Col",,sequence(1,datedif(G2,H2,"M")+1,2))&") where Col1 is not null group by Col1"))
This is the generated query
select Col1,sum(Col2),sum(Col3),sum(Col4),sum(Col5),sum(Col6),sum(Col7),sum(Col8),sum(Col9),sum(Col10),sum(Col11),sum(Col12),sum(Col13) where Col1 is not null group by Col1
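The way Textjoin and Sequence build that query text can be sketched in Python as follows. The function name grouped_sum_query and the with_labels flag are made up for this illustration; the flag appends a label section of the kind that blanks out the 'Sum' headers.

```python
def grouped_sum_query(months, with_labels=False):
    """Build the group-by query text for a name column plus `months` amount columns."""
    # Col1 holds the name; the monthly amounts occupy Col2 .. Col(months + 1).
    cols = ["sum(Col{})".format(i) for i in range(2, months + 2)]
    query = "select Col1," + ",".join(cols) + " where Col1 is not null group by Col1"
    if with_labels:
        # An empty label per sum column suppresses the generated header text.
        query += " label " + ", ".join(c + " ''" for c in cols)
    return query

print(grouped_sum_query(3))
# select Col1,sum(Col2),sum(Col3),sum(Col4) where Col1 is not null group by Col1
```

With months=12 and no labels, this produces the same text as the generated query shown above.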
Ideally there should be an extra section saying label sum(Col2) '' etc. to suppress the 'Sum' headers.
=ArrayFormula(query({A2:A,if((C2:C<=eomonth(G2,sequence(1,datedif(G2,H2,"M")+1,0)))* (D2:D>=eomonth(G2,sequence(1,datedif(G2,H2,"M")+1,0))),E2:E,0)},
"select Col1,sum(Col"&textjoin("),sum(Col",,sequence(1,datedif(G2,H2,"M")+1,2))&") where Col1 is not null group by Col1 label sum(Col" & textjoin(") '', sum(Col",,sequence(1,datedif(G2,H2,"M")+1,2)) & ") ''"))

How to handle data manipulation when using importrange() in Google Sheets?

I am working on speeding up a workbook in Google sheets that is using importrange(). The purpose of the entire workbook is to import data from a mastersheet and then allow us to manipulate it the way we want to outside of the mastersheet.
The problem: because importrange() doesn't allow you to directly manipulate cells, we have Sheet1 acting as the import sheet; it doesn't get touched. Sheet2 is where we do the manipulating, but it was literally just taken as a copy of Sheet1, so it is also using importrange(). This bogs down the entire workbook and makes manipulations very slow.
I am thinking of using =Sheet1!A1 and copying that to all the cells in the manipulation sheet, but my concern is that this will still bog down the workbook. The import data could potentially grow to 10k+ rows, and I'm only at about half that currently and already running into this problem. Outside of that, I'm not sure what else there is to try.
The QUERY function can help here and there are some great resources online.
=importrange(spreadsheet_url, range_string)
a typical example is:
=importrange("https://docs.google.com/spreadsheets/d/xxxxxxxxxxxxxxxxxxxx","Sheet1!A:Z")
You can wrap a QUERY function around this to manipulate your data.
QUERY is like a version of SQL and very powerful. It's in the format:
=QUERY({},"",1)
Your data range importrange("https://docs.google.com/spreadsheets/d/xxxxxxxxxxxxxxxxxxxx","Sheet1!A:Z") would go within {}.
Then within the "" part of the query, you could write your parameters for manipulating the data.
Example:
select Col1,Col4,Col5 where Col1 is not null and Col6 contains 'hello' order by Col1,Col7 desc label Col1 'new name 1',Col4 'new name 4'
The select bit allows you to specify specific columns from your importrange. If you want them all, then you could use select *.
The where clause is where you build up your criteria, combining conditions with or and and.
is not null is another way of saying you want rows that have data.
contains is useful. You can also have matches, starts with, ends with and like. like can use wildcards %, so where Col1 like '%the%' would find 'hello there'.
order by is ascending unless you add desc, ie. order by Col1,Col2,Col4,Col5 desc,Col3.
label allows you to rename the columns, so let's say input column 1 is called 'Name1' and input column 2 is 'Name2' and you want them to be 'First name' and 'Surname', you would use label Col1 'First name', Col2 'Surname'.
If you like QUERY there are other powerful clauses, and they must appear in this order within the QUERY(range,"clauses",0):
select
where
group by
pivot
order by
limit
offset
label
format
options
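One way to keep the clauses in that order when assembling a query string programmatically is sketched below in Python. The helper build_query and its keyword-argument convention are hypothetical; only the clause order comes from the list above.

```python
CLAUSE_ORDER = ["select", "where", "group by", "pivot", "order by",
                "limit", "offset", "label", "format", "options"]

def build_query(**clauses):
    """Join the given clauses in the listed order.

    Keyword names use underscores, e.g. order_by="Col1 desc".
    """
    given = {name.replace("_", " "): text for name, text in clauses.items()}
    unknown = set(given) - set(CLAUSE_ORDER)
    if unknown:
        raise ValueError("unknown clause(s): " + ", ".join(sorted(unknown)))
    return " ".join("{} {}".format(name, given[name])
                    for name in CLAUSE_ORDER if name in given)

# The arguments are deliberately passed out of order.
print(build_query(label="Col1 'Name'", order_by="Col1 desc",
                  where="Col1 is not null", select="Col1,Col4"))
# select Col1,Col4 where Col1 is not null order by Col1 desc label Col1 'Name'
```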
One small point which you may come across, when you use importrange to get your data you need to reference the columns as Col1,Col2,Col3 within the QUERY.
If, however, your range is already in the same spreadsheet (same or different tab), then you would reference column letters instead, eg. select A,B,C where A is not null order by A desc.
To make it more consistent and use the Col1,Col2,Col3 notation, you would put your internal range in an array {}.
QUERY(Sheet1!B:F,"select B,C,D where F is not null order by B,C",0)
would become:
QUERY({Sheet1!B:F},"select Col1,Col2,Col3 where Col5 is not null order by Col1,Col2",0)
{Sheet1!B:F} is smart because you can add columns in front of this range without needing to change your clause. So inserting one column before column B on Sheet1 would result in:
QUERY({Sheet1!C:G},"select Col1,Col2,Col3 where Col5 is not null order by Col1,Col2",0)
The other method would need you to alter your clause from:
QUERY(Sheet1!B:F,"select B,C,D where F is not null order by B,C",0)
to:
QUERY(Sheet1!C:G,"select C,D,E where G is not null order by C,D",0)
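The reason the ColN clause survives the shift is that ColN is counted from the start of the range, not from column A. A short Python sketch of that arithmetic, with hypothetical helper names letter_to_index and to_col_n:

```python
def letter_to_index(letters):
    """Convert a column letter such as 'B' or 'AA' to its 1-based index."""
    n = 0
    for ch in letters.upper():
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n

def to_col_n(letter, range_start):
    """Return the ColN name of `letter` within a range starting at `range_start`."""
    return "Col{}".format(letter_to_index(letter) - letter_to_index(range_start) + 1)

print(to_col_n("F", "B"))  # Col5 (column F within B:F)
print(to_col_n("G", "C"))  # Col5 (same data after the range shifts to C:G)
```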
It's a lot to take in, but definitely worth pursuing!

Google Sheets Query Coalesce?

Is there any query syntax that works like coalesce in Google Sheets?
Suppose I have a source like the picture below.
The result I want is to get only the id and time where status is true, but the time exists in only one column, either the check column or the report column.
So the result would be like this...
I tried this but it doesn't work:
=QUERY(A1:D4, "SELECT A, COALESCE(B, C) WHERE D = TRUE")
any ideas or workarounds?
Thanks
try:
=ARRAYFORMULA(IFERROR(SPLIT(FLATTEN(QUERY(TRANSPOSE(
ARRAY_CONSTRAIN(IF(D2:D=TRUE, {A2:A, IF(B2:C="",,"×"&B2:C), D2:D}, ), 9^9,
COLUMNS(A:C))),, 9^9)), "×")))
A very short one just for the special case of 2 columns where you know that only one of them is populated and they are dates:
=ArrayFormula(to_date(if(D2:D,B2:B+C2:C,)))
Maybe the simplest formula which behaves like coalesce would be
=iferror(if(D2,hlookup(9^9,B2:C2,1,true),))
It's just a pull-down formula, but it will pick up the last numeric value (number or date) from a range of columns, which is the same as the first non-blank value when only one column is populated. If the columns are all blank, it returns blank.
You can take advantage of the either/or situation and concatenate the two columns.
=filter({A2:A,concat(B2:B,C2:C)},D2:D)
Also see local array and filter
Add a column after Status and call it Time (column E), where each cell's formula follows this format (assuming your table occupies A3:E, with headers in row 3):
=if(A4="","",if(B4<>"",B4,C4))
Now query A3:E like so,
=query(A3:E,"Select A,E where D=TRUE")
you can use something like this:
=QUERY(transpose(B1:H1),"Select Col1 where Col1 is not null limit 1",0)
This transposes the row into a column, queries all non-null values from that column, and then sets limit 1 to return the first value. So essentially you are selecting the leftmost non-empty value from your row.
I can't take full credit for this, I must have gotten it somewhere else... but it's in one of my sheets.
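For comparison, the coalesce behaviour these workarounds are imitating looks like this in a Python sketch. The coalesce helper and the sample rows (id, check time, report time, status) are hypothetical.

```python
def coalesce(*values):
    """Return the first value that is neither None nor an empty string."""
    for v in values:
        if v is not None and v != "":
            return v
    return None

# Hypothetical rows: (id, check time, report time, status).
rows = [
    (1, "08:00", "", True),
    (2, "", "09:30", True),
    (3, "10:15", "", False),
]
result = [(rid, coalesce(check, report))
          for rid, check, report, status in rows if status]
print(result)  # [(1, '08:00'), (2, '09:30')]
```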

How to definitely use column names in Google Sheet Query

query function doesn't let you use column names; you have instead to use letters if you refer to a cell range or ColN if you refer to an array.
This is very annoying, most of all when you alter the queried table adding, deleting or exchanging columns.
I would like to use column names, like in a standard SQL query.
You can actually get around this by splitting the Query formula and using other formulas to automatically get the desired column names from a list.
For example if you have a table in range A1:E15 with headers "H1, H2, H3, H4, H5", and you'd like to only get columns H3 & H5:
Store the desired headers (H3 & H5) in another table/range as a list - let's say this range is G1:G2
Use the MATCH formula along with the TextJoin formula to generate a concatenated string like Col3, Col5
=TextJoin(", ",TRUE,ArrayFormula(IFERROR("Col"&MATCH(G1:G6,$A$1:$E$1,0),"")))
Let's say this was in cell H1
You can refer to this cell in your Query formula like below
=QUERY({A1:E20},"SELECT "&H1&" WHERE Col2='w'")
You can see it in action in below screenshot:
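The MATCH and TextJoin step can be mimicked in Python to show what string ends up in the helper cell. The function name select_clause is hypothetical. Names that are not found are skipped, which mirrors how IFERROR returns "" and TextJoin ignores empty values.

```python
def select_clause(headers, wanted):
    """Turn a list of wanted header names into 'ColN, ColM' text."""
    cols = ["Col{}".format(headers.index(h) + 1)  # 1-based, like MATCH
            for h in wanted if h in headers]
    return ", ".join(cols)

headers = ["H1", "H2", "H3", "H4", "H5"]
print(select_clause(headers, ["H3", "H5"]))             # Col3, Col5
print(select_clause(headers, ["H3", "missing", "H5"]))  # Col3, Col5
```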
One solution could be to resort to a custom function created by a script, but with a table that is not so small you are likely to run into errors from exceeding the allowed computation time.
The most efficient solution (using only native functions) I found is as follows.
Suppose you are working on a sheet range, your column names are in row 1 and you want to refer to the column "salary"; you can obtain the column letter by
substitute(address(1,match("salary",A1:1,0),4),"1","")
Instead, if you are querying arrays, it is simpler; the string you need is
"Col"&match("salary",A1:1,0)
The final query may not be elegant, but it is efficient:
query(
employeessheet!A:E,
"select "&substitute(address(1,match("salary",employeessheet!A1:1,0),4),"1","")&" where ...",
1)
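The address/substitute trick amounts to converting a 1-based column position into its letter. A Python sketch of that conversion and of the resulting select text follows; the names column_letter and select_by_name and the sample headers are made up for illustration.

```python
def column_letter(index):
    """Convert a 1-based column index to its letter, e.g. 1 -> 'A', 27 -> 'AA'."""
    letters = ""
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters

def select_by_name(headers, name):
    """Build the start of a query that selects a column by its header name."""
    return "select " + column_letter(headers.index(name) + 1)

# Hypothetical header row of the employees sheet.
headers = ["id", "name", "dept", "salary", "start"]
print(select_by_name(headers, "salary"))  # select D
```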

Resources