Google Sheets QUERY...WHERE stopped working - google-sheets

I use the Google Sheets SQL-ish Query function to summarize data in a number of my spreadsheets. This has been working well for years, albeit slowly.
Today, I'm having problems with some of my queries - specifically with some that compare dates in the source data to TODAY().
To demonstrate, here's a link to a shared spreadsheet that I've used to reproduce the problem on fake data.
Edit: Example has been updated with AdamL's suggestion.
The source data is in range A1:D6, with columns "Serial No.", "Type", "Location", and "Warranty Expiration". The last column is a date.
This function in A9 summarizes all data:
=query(A1:D6,"select B, count(A) group by B pivot C")
...like so:
Here's the thing. If I try to filter using WHERE and DATE(), the Query seems to break down completely. What I want is a table that looks like the one above, but including only data rows that have a date in column D that is in the past.
=query(A1:D6,"select B, count(A) where D < now() group by B pivot C")
If I change the filter to something not involving dates, I get the expected output:
How do I get this to give me the summary I want?

The now() scalar function returns a datetime value, and you have date values in the source data. Comparisons between the two will unfortunately fail. The workaround is to convert now() to a date value:
=QUERY(A1:D6;"select B, count(A) where D < toDate(now()) group by B pivot C")
As an aside, there is a limitation (bug?) with the QUERY function whereby the now() scalar function does not (necessarily) operate in the time zone of your spreadsheet, and there doesn't appear to be any way of modifying this behaviour. I believe that the now() scalar function will always return the current time in Pacific Daylight Time (e.g. the US west coast). So for me, right now in Brisbane, Australia, toDate(now()) used in a QUERY select clause returns yesterday's date.
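The "yesterday's date" effect can be reproduced outside Sheets. The Python sketch below is only an illustration: the fixed offsets (UTC-7 for Pacific Daylight Time, UTC+10 for Brisbane) and the example instant are assumptions picked for the demonstration, not values read from any spreadsheet.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets chosen for illustration (assumptions, not read from Sheets).
PDT = timezone(timedelta(hours=-7), "PDT")
BRISBANE = timezone(timedelta(hours=10), "AEST")

def local_date(instant, tz):
    """Return the calendar date that `instant` falls on in time zone `tz`."""
    return instant.astimezone(tz).date()

# An arbitrary example instant: 2013-06-10 01:30 UTC.
instant = datetime(2013, 6, 10, 1, 30, tzinfo=timezone.utc)

pacific_day = local_date(instant, PDT)        # date of a Pacific-time "now"
brisbane_day = local_date(instant, BRISBANE)  # date the Brisbane user expects

print(pacific_day, brisbane_day)
```

At that instant the Pacific-time date is one day behind the Brisbane date, so a comparison against it would treat rows expiring "today" in Brisbane differently.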
The safer bet is to use a spreadsheet function to generate today's date, and concatenate that into the QUERY clause:
=QUERY(A1:D6;"select B, count(A) where D < date '"&TEXT(GoogleClock();"yyyy-MM-dd")&"' group by B pivot C")
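To make the concatenation easier to read, here is a rough Python equivalent of how the query string is assembled. The function name build_query is hypothetical, and the date passed in is an arbitrary example; the point is only that the query ends up containing a literal of the form date 'yyyy-MM-dd'.

```python
from datetime import date

def build_query(today):
    """Build the QUERY string with today's date embedded as a date literal."""
    literal = today.strftime("%Y-%m-%d")  # same shape as TEXT(...;"yyyy-MM-dd")
    return ("select B, count(A) where D < date '" + literal + "' "
            "group by B pivot C")

print(build_query(date(2013, 6, 10)))
```

Because the date is computed by the spreadsheet (or here, by the caller) before the query runs, the comparison no longer depends on which time zone now() uses inside the query.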

Related

Google Sheets date query won't work on specific columns

I have data that I'm importing from Salesforce, and I'm using query functions to find all rows where any of the columns has a date in a given range. Here's an example of the data:
The query that's not working is:
=query('Salesforce Data'!A2:C,"SELECT A,C WHERE C >= date '"&TEXT(DATEVALUE($A$1),"yyyy-mm-dd")&"' AND C < date '"&TEXT(DATEVALUE($B$1),"yyyy-mm-dd")&"'")
I'm using the same query except in one case, it's looking at dates in column B, and in the other, it's looking at the dates in column C. The column B version works, the column C version does not. I have verified that there is at least one date in column C that falls in the range, so it should not be an issue of no data, as the error suggests:
I've looked over data formatting, and there is no difference between columns B and C in that regard. These are the same types of field in Salesforce as well, so I would not expect a difference in formatting. I tried manually changing the first value in column C to a date (that was an obvious difference between the columns), but that also didn't work.
After a lot of trial and error, I found the issue: it seems that Google Sheets classifies the column of data based on what the majority of the cells are. So, even though both columns B & C have some cells with valid dates and some with a - signifying null, column B has more dates than strings, but C has more strings than dates, so date compare queries won't work on column C at all.
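The majority rule described above can be modelled in a few lines of Python. This is a simplified sketch of the observed behaviour, not Google's implementation; the function name effective_column and the sample values are made up for illustration.

```python
from collections import Counter
from datetime import date

def effective_column(values):
    """Model a column as QUERY sees it: keep the majority type, null the rest."""
    non_empty = [v for v in values if v is not None]
    majority_type = Counter(type(v) for v in non_empty).most_common(1)[0][0]
    return [v if isinstance(v, majority_type) else None for v in values]

col_b = [date(2020, 1, 5), date(2020, 2, 9), "-"]   # mostly dates
col_c = ["-", "-", date(2020, 3, 1)]                # mostly strings

print(effective_column(col_b))  # the "-" is treated as null
print(effective_column(col_c))  # the date is treated as null
```

In the second column the only real date is discarded, which is why a date comparison on that column finds nothing to match.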
My solution for now is to add a formula sheet to transform all of the null values, -, into a date that won't mess with my query, 1/1/1970:
Example formula:
=IF( OR('Salesforce Data'!C2="-",'Salesforce Data'!C2=""), date(1970,1,1), 'Salesforce Data'!C2)
Another solution would be to edit the data source, but this solution will work entirely within sheets.
Also note: I dragged this formula down far below where I needed it, just in case. If you have a text column (like my column A), make sure you replace empty values there with junk text of some sort. At first I replaced them with 0, and then my text column wasn't picked up by the query.
try:
=ARRAYFORMULA(QUERY(TO_TEXT('Salesforce Data'!A2:C),
"select Col1,Col3
where Col3 >= date '"&TEXT(A1, "yyyy-mm-dd")&"'
and Col3 < date '"&TEXT(B1, "yyyy-mm-dd")&"'", 0))
Thank you thank you so so much. This thread helped me a lot.
I have used these from this thread. Someone may need them in the future:
"select A, B, C, G, H, J where I='"&TEXT($A$2, "dd-mmm-yyyy")&"'"
"select B, C WHERE F= date '"&TEXT(DATEVALUE($A$2),"yyyy-mm-dd")&"'"
"select A, B, C, G, H, J where I='"&TEXT($A$2, "dd-mmm-yyyy")&"' or I='"&TEXT($A$2, "d-mmm-yyyy")&"'"

Need to find the duration using query in google sheet

I have RFID attendance data in a Google Sheet with in and out times. I need to find how many hours an employee worked on a given day. I used a query statement and was able to pull out the min and max times for a given day, but I'm stuck on finding the duration worked in HH:MM.
https://docs.google.com/spreadsheets/d/1AbcTQ8CdmPO7KjtMFPWc1QK7rVnf1q0K51U_WAMKlqw/edit?usp=sharing
I've attached the Google Sheet. Have a look at the query in G1; I need the duration in column K.
QUERY(A:D,"Select A,B, Max(C), Min (C),Max(C)- Min (C) Group by A,B Label A 'Key',B 'Date',Max(C) 'Maximum', Min (C) 'Minimum', Max(C)- Min (C) `Duration`",0)
So how do I find the difference between two times in a Google Sheets query statement?
In your query formula, Sheets provides an error message that seems to be about the labels, so delete the label clause
Label A 'Key',B 'Date',Max(C) 'Maximum', Min (C) 'Minimum', Max(C)- Min (C) `Duration`
Now, after deleting the label clause, Sheets provides another error, stating
Can't perform the function difference on values that are not numbers.
So the problem is that you can't subtract Min(C) from Max(C) because the value in C is a time, not a number.
I don't think there's a great way to fix this within your QUERY() formula, but we can fix the data in the spreadsheet so that it works. In Column E, use =VALUE() to convert the times from Column C into numbers. For example, E2 will be =VALUE(C2).
Now we just need to use those new values in your query (instead of Max(C)-Min(C), use Max(E)-Min(E)) and include the format clause.
=QUERY(A:E,"Select A,B, Max(C), Min(C), Max(E)-Min(E) Group by A,B Label A 'Key',B 'Date',Max(C) 'Maximum', Min(C) 'Minimum', Max(E)-Min(E) 'Duration' Format Max(E)-Min(E) 'HH:mm'",0)
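The reason the numeric helper column works is that a time converted with VALUE() is a fraction of a day, so subtracting two of them gives a duration that can be formatted back as HH:mm. A small Python sketch of that arithmetic, using made-up punch times and hypothetical helper names:

```python
def time_to_fraction(hhmm):
    """Convert 'HH:MM' to a fraction of a day, like VALUE() applied to a time."""
    hours, minutes = map(int, hhmm.split(":"))
    return (hours * 60 + minutes) / (24 * 60)

def fraction_to_hhmm(fraction):
    """Format a fraction of a day as HH:MM."""
    total_minutes = round(fraction * 24 * 60)
    return f"{total_minutes // 60:02d}:{total_minutes % 60:02d}"

punches = ["09:05", "12:30", "13:15", "17:45"]  # made-up in/out times for one day
values = [time_to_fraction(p) for p in punches]
duration = max(values) - min(values)            # same idea as Max(E)-Min(E)

print(fraction_to_hhmm(duration))
```

Note that this measures the span from first punch to last punch; breaks in between are not subtracted, just as in the query.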

How to return a value from Column B, based on Column A data

So I want to try to return a value from Column B, when Column A has certain values.
So for example here, when Column A is June, I'd like to return the value in Column B. (Or the average, or sum of them, etc). I've read through the GS documentation on functions and don't really see much as far as IF/THEN functions like this. Any Ideas?
C1:
=QUERY(A:B,"Select A,avg(B) where A = 'June' group by A")
Change avg() to sum() to get the sum instead.
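For readers who think more easily in code than in query clauses, the where/group by combination amounts to "keep the rows whose A matches, then aggregate B". A minimal Python sketch with made-up rows and a hypothetical summarize helper:

```python
rows = [("May", 10), ("June", 20), ("June", 40), ("July", 5)]  # made-up (A, B) rows

def summarize(rows, month, how="avg"):
    """Return the average or sum of B over rows whose A equals `month`."""
    matches = [b for a, b in rows if a == month]
    if not matches:
        return None  # no matching rows, nothing to aggregate
    return sum(matches) / len(matches) if how == "avg" else sum(matches)

print(summarize(rows, "June"))         # average of the June rows
print(summarize(rows, "June", "sum"))  # sum of the June rows
```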

Automatically transform a log of check-in/check-out events into a time-sheet

UPDATE: Some context: A log that is fed automatically by an IFTTT script contains all check-ins and check-outs for employees that work in a factory. I need to build a report with the first check-in for each day, and the last check-out for each day (employees might check out for lunch, but come back, and only the first check-in and last check-out should count).
My current solution is to calculate an "is first check-in or last check-out?" Boolean, and then feed this log into a pivot table for reporting purposes, filtering out the repeat entries.
My spreadsheet will have data inserted in columns D & E by a third-party application (IFTTT or Google Forms), and I would like to use an arrayformula to automatically calculate one column as data comes in from those applications.
(D) Date             (E) Time   Calc
January 6, Friday    15:06      TRUE
January 6, Friday    15:15      TRUE
January 9, Monday    8:36       TRUE
January 9, Monday    10:04      FALSE
January 9, Monday    10:37      FALSE
January 9, Monday    15:51      TRUE
The formula for Calc is
=or(MIN(filter(E:E,D:D=D2,B:B=B2))=E2,MAX(filter(E:E,D:D=D2,B:B=B2))=E2)
How can I transform this formula into an arrayformula? From my experimentations it seems that ArrayFormula doesn't mix well with Filter. Help is appreciated!
So, the goal is to determine, for each date, whether the value in column E is the highest or lowest for that date. I think this is too much logic to pack into a single formula, but can be expressed by two array formulas. The first one creates two helper columns:
=arrayformula(vlookup(filter(D:D, len(D:D)), query(D:E, "select D, min(E), max(E) group by D", 1), {2, 3}))
This is itself a combination of two formulas: the inner query gets the minimum and maximum of E for each date in D; then vlookup aligns these min-max values with the rows of the original table. The filtering by len(D:D) is for performance reasons, to avoid looking up a huge number of empty cells.
Suppose the first formula was in G1; then it formed the columns G and H, which leads to E1 being
=arrayformula(not((E:E > G:G) * (E:E < H:H)))
Note that and and or are not arrayformula-friendly, but can be replaced by * and + which result in booleans getting implicitly converted to 0-1. The not function is array-friendly, and is used here partly to get a boolean back from an integer.
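The two-step logic (per-date min and max, then a boolean test per row) can be checked against the sample table from the question with a short Python sketch. It ignores the employee column B for simplicity, and the helper name minutes is made up for the example.

```python
rows = [
    ("January 6, Friday", "15:06"),
    ("January 6, Friday", "15:15"),
    ("January 9, Monday", "8:36"),
    ("January 9, Monday", "10:04"),
    ("January 9, Monday", "10:37"),
    ("January 9, Monday", "15:51"),
]

def minutes(hhmm):
    """Convert 'H:MM' to minutes since midnight so times compare numerically."""
    h, m = hhmm.split(":")
    return int(h) * 60 + int(m)

# Step 1: min and max time per date (the role of the inner query).
bounds = {}
for day, t in rows:
    lo, hi = bounds.get(day, (minutes(t), minutes(t)))
    bounds[day] = (min(lo, minutes(t)), max(hi, minutes(t)))

# Step 2: a row is flagged unless its time lies strictly between min and max.
# Multiplying the two booleans plays the role of AND, as in the array formula.
calc = [not ((minutes(t) > bounds[d][0]) * (minutes(t) < bounds[d][1]))
        for d, t in rows]

print(calc)
```

The resulting flags agree with the Calc column shown in the question.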
Inspired by @zaq, I solved this by re-engineering the spreadsheet and using the following formula:
=query(query(Sheet1!B:E, "select D, min(E), max(E) group by D pivot C,B ", 1),"select Col1, Col3, Col10,Col4,Col11")
This formula transforms a log of employee check-ins and check-outs into a summarized "hours worked" table that contains, for each day and for every employee, the first check-in and the last check-out.

using both QUERY and FILTER together in a single statement?

I hope someone can help me; I am building some spreadsheets to help with time-tracking. I have a list of tasks, with columns for criteria including date, hours spent, category of work, and client.
I want to filter this data by month, so for example I would like to know how long I spent in a single month on correspondence. This means I need to select all the rows where category = 'correspondence' and where the dates are all from one specified month. At the moment, I am having to use a query which outputs to an intermediary table, and then run a filter function on that table in order to output to my final table. Here are my two functions:
=QUERY( 'Task List'!A4:F , "select A, B, E, F where C = 'Correspondence'" )
that gives me the first table, with just the rows where the category is "Correspondence". Then, on that table, I have to run the next function:
=filter(J4:M,J4:J>=date(2015,4,1),J4:J<=date(2015,4,30))
To get only the rows from this month of April. If possible I would like to remove the intermediary table (which serves no other purpose and just clutters my sheet).
Is it possible to combine these statements and do the process in one step?
Thanks.
That is indeed possible.
Since you didn't specify which column holds the dates (in the 'raw' data), I assumed for this example that the dates are in column F. The easiest way would be to use the MONTH() function. However, when used in query(), this function considers January as month 0. That's why I added the +1. See if this works:
=QUERY( 'Task List'!A4:F , "select A, B, E, F where C = 'Correspondence' and month(F)+1 =4 ")
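The off-by-one is the only subtle part of this formula, so here is a small Python sketch of it. The function query_month is a hypothetical stand-in for the query language's month(), and the dates are made up.

```python
from datetime import date

def query_month(d):
    """Mimic month() in the query language: January is 0, December is 11."""
    return d.month - 1

dates = [date(2015, 3, 31), date(2015, 4, 1), date(2015, 4, 30), date(2015, 5, 1)]

# Equivalent of "month(F)+1 = 4": keep only the April rows.
april_rows = [d for d in dates if query_month(d) + 1 == 4]

print(april_rows)
```

Without the +1, comparing against 4 would select May instead of April.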
I came to this question needing to filter by weeknum() and year() as well as query by contains(). It can be helpful to combine the query and filter functions for similar but more dynamic date and text matching needs. If for example the OP had needed to show this data by week, that is not available in the Google Query Language.
The FILTER function does not have a contains operator, so you are limited to exact text matches or regular expressions. The Query Language does not have a WEEKNUM function.
Combining FILTER and QUERY can be useful in scenarios similar to this question but with a dynamic timeline (a rolling timeline rather than a hard-set month or date) and where the text you're matching is not exact (when you need the contains operator from the query language).
Here is an example for combining filter and query in Google sheets.
=(sum(Filter(QUERY(FB!$A:$Z, "select Q where B contains 'Apple'"), Weeknum (QUERY(FB!$A:$Z, "select E where B contains 'Apple'")) = Weeknum($A8))))
In this example I queried Facebook ads data export for any posts which contained the word 'Apple' in their title, and where Weeknum() matched the ongoing weeks on my sheet, in order to pull weekly data from multiple sources into one table to build reports, with minimal updating required as the timeline runs on.
It selects Q(spend) Where B(title) contains Apple, and Weeknum(E) matches week number on current row of sheet(A8). I have found this useful many times. Query + Filter Example Sheet Here.
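As a rough cross-check of what the combined formula computes, here is a Python sketch with made-up ad rows. The weeknum helper is my own approximation of WEEKNUM's default numbering (weeks start on Sunday, January 1 is in week 1), and weekly_spend is a hypothetical name. Like the formula, it compares week numbers only and does not check the year.

```python
from datetime import date

def weeknum(d):
    """Week number with weeks starting on Sunday and January 1 in week 1."""
    jan1 = date(d.year, 1, 1)
    jan1_offset = (jan1.weekday() + 1) % 7  # days since the preceding Sunday
    return (d.timetuple().tm_yday - 1 + jan1_offset) // 7 + 1

# Made-up (title, date, spend) rows standing in for columns B, E and Q.
ads = [
    ("Apple pie promo", date(2023, 1, 3), 12.5),
    ("Green Apple sale", date(2023, 1, 5), 7.5),
    ("Apple pie promo", date(2023, 1, 10), 30.0),
    ("Banana bread", date(2023, 1, 4), 99.0),
]

def weekly_spend(ads, keyword, week_of):
    """Sum spend for titles containing `keyword` in the same week as `week_of`."""
    return sum(spend for title, day, spend in ads
               if keyword in title and weeknum(day) == weeknum(week_of))

print(weekly_spend(ads, "Apple", date(2023, 1, 2)))
```

Only the two "Apple" rows from the first week of January are summed; the later "Apple" row and the non-matching title are excluded.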
If the OP wanted to pull this info dynamically as the months went on, and column A contained the months in order, the formula could be filled down and would automatically pull the query data filtered by matching month:
=(sum(Filter(QUERY( 'Task List'!A:Z , "select A, B, E, F, J where C contains 'Correspondence'" ), Month(QUERY( 'Task List'!A:Z , "select J where C contains 'Correspondence'" )) = Month($A2))))

Resources