I need to use QUERY in Google Sheets to return text in column B of a given length, along with various other conditions. To keep it simple, below is a simplified version that concentrates solely on the LEN() function. It seems simple enough, but unfortunately it does not work.
=QUERY(Valuations!B1:B,"select B where LEN(B)>3 ")
I'm aware that SQL uses LEN(), whereas MySQL uses LENGTH().
Is my syntax incorrect, or is it not possible to check string length in a Google Sheets query?
You can do it using a filter
=filter(B:B,len(B:B)>=3)
And then if you want to combine that with other conditions, you can put it in a query e.g.
=query(filter(A:B,len(B:B)>=3),"select Col1,Col2 where Col1>1")
See this question
A regular expression can be used:
=QUERY(Valuations!B1:B, "select B where B matches '.{3,}'")
The regular expression explained:
. match any character
{3,} match the preceding symbol (the .) 3 or more times
You could also search for a specific length by modifying the expression to ^.{3}$
OR a range ^.{3,10}$
OR a maximum ^.{0,10}$
^ the start of the string
$ the end of the string
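For anyone who wants to sanity-check these patterns outside of Sheets, here is a minimal Python sketch. It assumes Python's re module behaves the same way as the QUERY matches operator for patterns this simple, and it uses re.fullmatch because matches is compared against the whole cell value. The sample strings and the helper name are made up for illustration.

```python
import re

# Hypothetical sample values; not taken from the Valuations sheet.
samples = ["ab", "abc", "abcdefghij", "abcdefghijk"]  # lengths 2, 3, 10, 11


def matching(pattern, values):
    """Return the values that the pattern matches from start to end."""
    return [v for v in values if re.fullmatch(pattern, v)]


at_least_3 = matching(r".{3,}", samples)       # 3 or more characters
exactly_3 = matching(r"^.{3}$", samples)       # exactly 3 characters
from_3_to_10 = matching(r"^.{3,10}$", samples)  # between 3 and 10 characters
at_most_10 = matching(r"^.{0,10}$", samples)    # explicit lower bound of 0

print(at_least_3)
print(exactly_3)
print(from_3_to_10)
print(at_most_10)
```

Because the whole value has to match, the ^ and $ anchors are harmless but not strictly needed in this sketch.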
regex101.com is a valuable resource for regular expressions.
I am not associated with the site in any way but I use it all the time.
Related
I have a google sheet (link) where I'm running a trivial query to find values based on a cell with piped values.
=QUERY(A1:A8;"select A where A matches '"&C2&"'";-1)
The piped values are
word 1|word (1)
The range is
Entries
word (1)
word (2)
word (2)
word (1)
word 1
word 2
word 1
For some reason, I don't get any results which include parentheses ().
It works well with the following query
=QUERY(A1:A8;"select A where A = 'word (1)'";-1)
Are there any limitations with using parentheses () with the matches function?
Thanks in advance.
You need to escape the parentheses for the query to work, because unescaped parentheses are read as a regex group rather than as literal characters.
This means that the correct syntax for your C2 cell is
word 1|word \(1\)
You can even use
="word 1|word \(1\)"
Your formula will still be
=QUERY(A1:A8;"select A where A matches '"&C2&"'";-1)
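To see why the unescaped version finds nothing with parentheses, here is a small Python sketch. It assumes Python's re module treats these two patterns the same way the matches operator does, and it uses re.fullmatch to mimic the whole-cell comparison. The select helper is hypothetical; the cell values are the ones from the question.

```python
import re

cells = ["word (1)", "word (2)", "word 1", "word 2"]

# Unescaped: ( ) form a capture group around "1", so the second
# alternative is really just "word 1" again.
unescaped = r"word 1|word (1)"
# Escaped: \( and \) stand for literal parentheses.
escaped = r"word 1|word \(1\)"


def select(pattern, values):
    """Return the values that the pattern matches in full."""
    return [v for v in values if re.fullmatch(pattern, v)]


without_escape = select(unescaped, cells)
with_escape = select(escaped, cells)

print(without_escape)
print(with_escape)
```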
Google Sheets QUERY function does not seem to use regular expressions.
I want to match strings with an arbitrary number of spaces before the string. My QUERY function is:
=QUERY($A$1:$B$2, "select B where A=' *abc'")
It returns #N/A
It works when my data does not have the leading blanks and the match string is just 'abc'. It's acting as if Sheets has regular expressions disabled.
Perhaps try this:
=QUERY(A1:B2, "select B where A contains 'abc'")
if you need something more strict try:
=ARRAYFORMULA(IF(ISNA(REGEXEXTRACT(A1:A, "abc$")), , B1:B))
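The difference between these options can be sketched in Python. This is only an illustration under assumptions: the rows are made up, == stands in for the query's = comparison (which treats ' *abc' as literal text, not as a pattern), and re.fullmatch stands in for a whole-cell regex comparison such as the matches operator.

```python
import re

# Hypothetical rows of (column A, column B).
rows = [("abc", 1), ("   abc", 2), ("xabc", 3), ("abcd", 4)]

# Literal comparison: only a cell containing the text " *abc" would qualify.
exact = [b for a, b in rows if a == " *abc"]
# Substring test, like: where A contains 'abc'
contains = [b for a, b in rows if "abc" in a]
# Ends with abc, like the "abc$" pattern in the REGEXEXTRACT formula.
ends_with = [b for a, b in rows if re.search(r"abc$", a)]
# Any number of spaces followed by abc and nothing else.
spaces_then_abc = [b for a, b in rows if re.fullmatch(r" *abc", a)]

print(exact, contains, ends_with, spaces_then_abc)
```

Only the last variant is as strict as the question asks: it rejects both "xabc" and "abcd".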
I'm trying to SUM column C based on the contents of columns A and B. Like this:
=sum(filter(C:C, (A:A="Safari")*(B:B="10.0.1")))
The above formula works. The FILTER function works as an exact match for "Safari" and "10.0.1" for columns A and B respectively.
The problem is... this only captures an exact match: "10.0.1". I need to capture multiple strings e.g. "10.0.1", "10.0.2", "10.0.3", etc.
If helpful, here's an example sheet.
I'm not sure if regex can be used in combination with a filter function. In any case, I've tried hard and failed spectacularly. So... how best to filter for multiple strings instead of exact match only?
=SUMIFS(C:C,A:A,"Safari",B:B,"10.0.*")
Please try:
=filter(C:C, (A:A="Safari")*(REGEXMATCH(B:B, "10\.0\..*")))
Notes:
filter is an array formula with a useful property: it evaluates all the formulas inside it as array formulas
"10\.0\..*" is the regex for your match. "\." matches a literal dot, and ".*" matches any sequence of characters. Please see more syntax here.
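The effect of escaping the dots can be checked with a short Python sketch. The rows are invented for illustration, the total helper is hypothetical, and re.search is used on the assumption that REGEXMATCH accepts a match anywhere in the cell.

```python
import re

# Hypothetical rows of (browser, version, count), standing in for columns A, B, C.
rows = [
    ("Safari", "10.0.1", 5),
    ("Safari", "10.0.2", 7),
    ("Safari", "10.1.0", 11),
    ("Chrome", "10.0.1", 13),
    ("Safari", "1000.1", 17),
]


def total(pattern):
    """Sum the counts of Safari rows whose version contains a match."""
    return sum(
        count
        for browser, version, count in rows
        if browser == "Safari" and re.search(pattern, version)
    )


escaped_total = total(r"10\.0\..*")  # dots are literal
unescaped_total = total(r"10.0..*")  # each dot matches any character

print(escaped_total, unescaped_total)
```

With unescaped dots the made-up version "1000.1" is counted as well, which is why the backslashes matter.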
I did some searching, and in OpenOffice and Excel it looks like you can simply add an * at the beginning or end of a character to delete everything before or after it, but in Google Sheets this isn't working. Does it support this feature? So if I have:
keyword USD 0078945jg .12 N N 5748 8
And I want to remove USD and everything after it what do I use? I have tried:
USD* and (USD*) with regular expressions checked
But it doesn't work. Any ideas?
The * quantifier just needs to be applied to a dot (.) which will match any character.
To clarify: the * wildcard used in certain spreadsheet functions (e.g. COUNTIF) means something different from the * quantifier used in regular expressions.
In addition to options that would be available in Excel (LEFT + FIND) pointed out by pnuts, you can use a variety of regex tools available in Google Sheets for text searching / manipulation
For example, RegexReplace:
=REGEXREPLACE(A1,"(.*)USD.*","$1")
(.*) <- capture group () with zero or more * of any character .
USD.* <- exact match on USD followed by zero or more * of any character .
$1 <- replace with match in first capture group
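The same replacement can be tried in Python as a rough cross-check. This is a sketch, assuming the pattern behaves identically in both engines; note that Python writes the backreference as \1 where REGEXREPLACE uses $1.

```python
import re

text = "keyword USD 0078945jg .12 N N 5748 8"

# (.*) captures everything before USD; the trailing .* swallows the rest,
# so the whole string is replaced by the first capture group.
kept = re.sub(r"(.*)USD.*", r"\1", text)

# Equivalent of the LEFT + FIND approach: slice up to where USD starts.
left_part = text[: text.find("USD")]

print(repr(kept))
print(repr(left_part))
```

Both results keep the space that precedes USD, so a trim step may be wanted afterwards.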
For spaces within keyword I suggest a helper column with a formula such as:
=left(A1,find("USD",A1)-1)
copied down to suit. The formula could be converted to values and the raw data (assumed to be in ColumnA) then deleted, if desired.
To add to the answers here, you can get into trouble when there are special characters in the text (I have been struggling with this for years).
You can put a backslash \ in front of special characters such as ?, + or . to escape them. But I still got stuck when there were further special characters in the text. I finally figured it out after reading find and replace in google sheets with regex.
Example: I want to remove the number, period and space from the beginning of a question like this: 1. What is your name?
Go to Edit → Find and replace
In the Find field, enter the following: .+\. (note: this includes a space at the end).
Note: In the Find and replace dialogue box, be sure to check "Search using regular expressions" and "match case". Leave the Replace field blank.
The result will be this text only: What is your name?
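Here is a quick Python sketch of what that pattern does, assuming the Find and replace regex engine behaves like Python's re module for this case. The second sample sentence and the anchored pattern ^\d+\. are my own illustrations, not part of the steps above.

```python
import re

question = "1. What is your name?"
# .+ takes one or more characters, \. is a literal period, then a space.
cleaned = re.sub(r".+\. ", "", question)

# Caveat: .+ is greedy, so with a second ". " in the text it removes
# everything up to the last one.
tricky = "2. Is this it. Really?"
greedy = re.sub(r".+\. ", "", tricky)
# A stricter, hypothetical alternative: only digits at the very start.
anchored = re.sub(r"^\d+\. ", "", tricky)

print(cleaned)
print(greedy)
print(anchored)
```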
I have a column XXX like this :
XXX
A
Aruin
Avolyn
B
Batracia
Buna
...
I would like to count a cell only if the string in the cell has a length > 1.
How to do that?
I'm trying :
COUNTIF(XXX1:XXX30, LEN(...) > 1)
But what should I write instead of ... ?
Thank you in advance.
For ranges that contain strings, I have used a formula like below, which counts any value that starts with one character (the ?) followed by 0 or more characters (the *). I haven't tested on ranges that contain numbers.
=COUNTIF(range,"=?*")
To do this in one cell, without needing to create a separate column or use arrayformula{}, you can use sumproduct.
=SUMPRODUCT(LEN(XXX1:XXX30)>1)
If you have an array of True/False values then you can use -- to force them to be converted to numeric values like this:
=SUMPRODUCT(--(LEN(XXX1:XXX30)>1))
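The idea behind the -- trick, turning True/False into 1/0 and adding them up, can be sketched in Python. This is an analogy only; the values are the sample column from the question.

```python
values = ["A", "Aruin", "Avolyn", "B", "Batracia", "Buna"]

# One True/False flag per cell, like LEN(range)>1.
flags = [len(v) > 1 for v in values]

# Summing booleans counts the True entries, since True counts as 1.
count = sum(flags)

print(flags)
print(count)
```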
Credit to #greg who posted this in the comments - I think it is arguably the best answer and should be displayed as such. Sumproduct is a powerful function that can often be used to get around shortcomings in countif type formulae.
Create another list using an =ARRAYFORMULA(len(XXX1:XXX30)>1) and then do a COUNTIF based on that new list: =countif(XXY1:XXY30,true()).
A simple formula that works for my needs is =ROWS(FILTER(range,LEN(range)>X))
The Google Sheets criteria syntax seems inconsistent, because the expression that works fine with FILTER() gives an erroneous zero result with COUNTIF().
Here's a demo worksheet
Another approach is to use the QUERY function.
This way you can write a simple SQL like statement to achieve this.
For example:
=QUERY(XXX1:XXX30,"SELECT COUNT(X) WHERE X MATCHES '.{1,}'")
To explain the MATCHES criteria:
It is a regex that matches every cell that contains 1 or more characters.
The . operator matches any character.
The {1,} specifies that you only want to match cells that have 1 or more characters in them.
Here is a link to another SO question that describes this method.