Postgresql and ActiveRecord where: Regex matching

I wrote this regex in plain regex syntax:
/(first|last)\s(last|first)/i
It matches the first three of
first last
Last first
First Last
First name
I am trying to get all the records where full_name matches the regex I wrote. I'm using PostgreSQL.
Person.where("full_name ILIKE ?", "%(first|last)%(last|first)%")
This is my attempt. I also tried SIMILAR TO and ~, with no luck.

Your LIKE query:
full_name ilike '%(first|last)%(last|first)%'
won't work because LIKE doesn't understand regex grouping ((...)) or alternation (|). LIKE only understands _ for a single character (like . in a regex) and % for any sequence of zero or more characters (like .* in a regex).
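To make the LIKE rules concrete, here is a minimal Ruby sketch that translates a LIKE pattern into an equivalent regex. The helper name like_to_regexp is hypothetical (it is not part of Rails or PostgreSQL), and it ignores LIKE's backslash escapes for brevity.

```ruby
# Hypothetical helper: translate a LIKE pattern into an equivalent Ruby Regexp.
# Only % and _ are special in LIKE; every other character is matched literally.
def like_to_regexp(pattern, ignore_case: false)
  body = pattern.each_char.map do |ch|
    case ch
    when '%' then '.*'              # any sequence of zero or more characters
    when '_' then '.'               # exactly one character
    else Regexp.escape(ch)          # parentheses, |, etc. are plain text to LIKE
    end
  end.join
  flags = Regexp::MULTILINE         # let . match newlines too
  flags |= Regexp::IGNORECASE if ignore_case
  Regexp.new("\\A#{body}\\z", flags) # LIKE must match the whole string
end

ilike = like_to_regexp('%(first|last)%(last|first)%', ignore_case: true)

puts ilike.match?('first last')                    # false: the name has no literal "(first|last)"
puts ilike.match?('x (first|last) y (last|first)') # true: the punctuation is matched literally
```

Under these assumptions, the ILIKE pattern from the question only finds names that contain the parentheses and pipes as literal characters.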
If you hand that pattern to SIMILAR TO then you'll find 'first last' but none of the others due to case problems; however, this:
lower(full_name) similar to '%(first|last)%(last|first)%'
will take care of the case problems and find the same ones as your regex.
If you want to use a regex (which you probably do because LIKE is very limited and cumbersome and SIMILAR TO is, well, a strange product of the fevered minds of some SQL standards subcommittee) then you'll want to use the case-insensitive matching operator and your original regex:
full_name ~* '(first|last)\s+(last|first)'
That translates to this bit of AR:
Person.where('full_name ~* :pat', :pat => '(first|last)\s+(last|first)')
# or this
Person.where('full_name ~* ?', '(first|last)\s+(last|first)')
There's a subtle change in my code that you need to take note of: I'm using single quotes for my Ruby strings whereas you're using double quotes. Backslashes mean more in double-quoted strings than they do in single-quoted strings, so '\s' and "\s" are different things. Toss in a couple of to_sql calls and you might see something interesting:
> puts Person.where('full_name ~* :pat', :pat => 'a\s+b').to_sql
SELECT "people".* FROM "people" WHERE (full_name ~* 'a\s+b')
> puts Person.where('full_name ~* :pat', :pat => "a\s+b").to_sql
SELECT "people".* FROM "people" WHERE (full_name ~* 'a +b')
That difference probably isn't causing you any problems but you need to be very careful with your strings when everyone wants to use the same escape character. Personally, I use single quoted strings unless I specifically need the extra escapes and string interpolation functionality of double quoted strings.
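The same quoting difference can be checked in plain Ruby without a database. This is only an illustrative sketch; the variable names are arbitrary.

```ruby
single  = 'a\s+b'    # backslash kept: a, \, s, +, b
double  = "a\s+b"    # "\s" is Ruby's escape for a space: a, space, +, b
escaped = "a\\s+b"   # doubling the backslash in double quotes keeps it

puts single.length       # 5
puts double.length       # 4
puts double == 'a +b'    # true
puts escaped == single   # true
```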
Some demos: http://sqlfiddle.com/#!15/99a2c/6

Related

How do I prevent interpolation of this Ruby string?

I am using Ruby on Rails and I have a location in my database with the name:
1A J#ck$on & S0n's #{10}
I am receiving this name via a webhook and then searching my database with it. However, the search does not find the location; it seems to be searching for the interpolated name instead:
1A J#ck$on & S0n's 10
How can I receive this string via a webhook like this:
@location = inbound_webhook_request['location']
And then put it in a PostgreSQL "like" query as shown below:
Location.where("name ~* ?", @location['name'])
Without it being interpolated along the way?

The string is not being interpolated. I'm not sure what led you to that assumption. However:
Location.where("name ~* ?", @location['name'])
This is not a LIKE operation, it's a POSIX regexp (case insensitive) operation.
Assuming you actually did want to perform a LIKE operation, not a regular expression search, you can do this:
Location.where("name LIKE ?", "%#{@location['name']}%")
or, using the shorthand syntax from the above linked documentation:
Location.where("name ~~ ?", "%#{@location['name']}%")
For a case-insensitive LIKE, you can use ILIKE or ~~*.
If the user input needs to be further sanitised, see this answer.
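A likely reason the regex search finds nothing is that the name is full of regex metacharacters. The sketch below uses Ruby's regex engine as a stand-in for PostgreSQL's, on the assumption that both treat $ as an anchor and {10} as a repeat count. The escape_like helper is hypothetical; Rails ships a similar method, sanitize_sql_like, which is probably the better choice in a real app.

```ruby
# The name from the question; %q(...) behaves like single quotes, so nothing is interpolated.
name = %q(1A J#ck$on & S0n's #{10})

# Used as a pattern, "$" is an end-of-line anchor and "{10}" is a repeat count,
# so the name does not even match itself.
as_regex = Regexp.new(name, Regexp::IGNORECASE)
puts as_regex.match?(name)   # false

# Escaping the metacharacters makes the regex match the literal text.
literal = Regexp.new(Regexp.escape(name), Regexp::IGNORECASE)
puts literal.match?(name)    # true

# Hypothetical helper: escape LIKE's own wildcards (% and _) and the backslash.
def escape_like(text)
  text.gsub(/[\\%_]/) { |ch| "\\#{ch}" }
end

puts "%#{escape_like('50%_off')}%"   # %50\%\_off%
```

With LIKE, only % and _ (and the escape character) need this treatment, which is one reason LIKE is the simpler tool for literal substring searches.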

How to capture regex first occurrence and interpolate into string with PostgreSQL

I'm trying to concatenate the digits from a string that starts with 'CityName' into a separate string. I have the concatenation part. My issue is being able to access the matches from the regex
I have a regex in Rails that looks like /CityName\s*(\d+)/i. I'm super new to regex and it's hard for me to wrap my head around the docs, but I'm assuming that this regex will find any digits after CityName, case-insensitively. The digits are then interpolated if the regex matches an attribute on my model.
regex = /CityName\s*(\d+)/i
if line_1 =~ regex
"C#{$1}"
...
end
But further along in the execution, it's slowing down because I have to iterate over a lot of records. I have a query in psql that will do the calculations that I need; however, I'm having a hard time implementing this regex replacement. My attempts so far look like:
CASE
when addr.line_1 ~* 'CityName\s*(\d+)' then 'C' || regex_matches('CityName\s*(\d+)')[0]
...
I'm having a hard time finding a solution to grab the first occurrence of the regex match. Thanks for any tips :D
EDIT: I am trying to grab the digits after 'CityName' from a string if that string contains 'CityName'
Ultimately I need assistance with the regex and how to concatenate the digits with 'C'.

Your question is a bit unclear. Are you trying to add the digits to your selection or to filter records based on them?
If you just want to select them:
Address.select(%q{(regexp_matches(line_1, 'CityName\s*(\d+)'))[1] as digits})
.map(&:digits)
If you want to filter based on them:
Address.where(%q{line_1 ~ 'CityName\s*(\d+)'})
.map(&:line_1)
Also a few notes:
Selecting digits case-insensitively does not really make sense: digits do not have case.
PostgreSQL arrays start from 1 instead of 0.

It seems you need a subquery or a WITH query:
SELECT tbl1.col1, sum(...), min(...) FROM (SELECT ..., CASE ...yourregex stuff... END col1 FROM ...) tbl1 GROUP BY 1;
WITH tbl1 AS (SELECT ..., CASE ...yourregex stuff... END col1 FROM ...) SELECT t.col1, sum(...) FROM tbl1 t GROUP BY 1;
If you need them regularly, you can also create a view from the query or create a temp table, and then use it in later queries.

Got it! Was able to finally start to figure out the regex.
WHEN addr.line_1 ~* '(?i)CityName\s*(\d+)' THEN 'C' || (SELECT (regexp_matches(addr.line_1, '(?i)CityName\s*(\d+)'))[1])
The (?i) allowed for case-insensitive matching of CityName, and then the concatenation worked. Thank you @ti6on for pointing out the index difference with Postgres :D
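For spot-checking the SQL against a few known rows, the same branch can be written in Ruby. This is a sketch only: city_code is a hypothetical helper name, and the sample addresses are made up.

```ruby
# Hypothetical helper mirroring the SQL branch: returns "C<digits>" or nil.
def city_code(line_1)
  digits = line_1[/CityName\s*(\d+)/i, 1]   # capture group 1, like (regexp_matches(...))[1]
  digits && "C#{digits}"
end

puts city_code('cityname 42 Main St').inspect   # "C42"
puts city_code('CityName7').inspect             # "C7"
puts city_code('Elsewhere 42').inspect          # nil
```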

How to use regex in active record queries?

Some values in a specific column of my database end with a number, and others do not.
I'm trying to select only the rows whose value does not end with a number.
I tried these queries but they do not work:
User.where.not("spec like ?", "%\d")
User.where.not("spec ~ ?", "%\d")
How could I find this data?

Use SIMILAR TO with %[0-9] pattern:
User.where.not("spec SIMILAR TO ?", "%[0-9]")
The SIMILAR TO operator is similar to regex, but allows the use of wildcards as with LIKE and some "light" regex constructs, e.g. bracket expressions like [0-9] or [A-Z]. The pattern should match the whole input as with LIKE.
So, the %[0-9] pattern will match any strings that start with any text (% wildcard does that) and end with an ASCII digit (due to the [0-9] at the end).
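The matching rule can be tried out in plain Ruby. The sketch below uses made-up sample values and an anchored regex that should correspond to the SIMILAR TO pattern. It also shows one reason the attempts in the question fail: in a double-quoted Ruby string, "\d" is just the letter d.

```ruby
specs = ['abc1', 'abc', 'v2 beta', 'x9', '']

# "%[0-9]" has to cover the whole string, so it corresponds to this anchored regex.
ends_with_digit = /\A.*[0-9]\z/m

without_trailing_digit = specs.reject { |s| ends_with_digit.match?(s) }
p without_trailing_digit   # ["abc", "v2 beta", ""]

# "\d" is not a recognised escape in double quotes, so the backslash is dropped.
puts "%\d" == '%d'         # true
```

If a real regex is preferred, something like User.where.not("spec ~ ?", '[0-9]$') should behave the same way; note the single quotes and the absence of %.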

How to store regex or search terms in Postgres database and evaluate in Rails Query?

I am having trouble with a DB query in a Rails app. I want to store various search terms (say 100 of them) and then evaluate against a value dynamically. All the examples of SIMILAR TO or ~ (regex) in Postgres I can find use a fixed string within the query, while I want to look the query up from a row.
Example:
Table: Post
column term varchar(256)
(plus regular id, Rails stuff etc)
input = "Foo bar"
Post.where("term ~* ?", input)
Here term is a VARCHAR column, and at least one row holds the value:
^foo*$
Unless I put an exact match (e.g. "Foo bar" in term) this never returns a result.
I would also like to ideally use expressions like
(^foo.*$|^second.*$)
i.e. multiple search terms as well, so it would match with 'Foo Bar' or 'Search Example'.
I think this is to do with Ruby or ActiveRecord stripping down something? Or I'm on the wrong track and can't use regex or SIMILAR TO with row data values like this?
Alternative suggestions on how to do this also appreciated.

The Postgres regular expression match operators have the regex on the right and the string on the left. See the examples: https://www.postgresql.org/docs/9.3/static/functions-matching.html#FUNCTIONS-POSIX-TABLE
But in your query you're treating term as the string and the 'Foo bar' as the regex (you've swapped them). That's why the only term that matches is the exact match. Try:
Post.where("? ~* term", input)
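The effect of the operand order can be sketched in Ruby, with an array standing in for the rows of the term column and Ruby's regex engine standing in for PostgreSQL's. The third term is added here purely for illustration.

```ruby
# Stored patterns, standing in for rows of the term column.
terms = ['^foo*$', '(^foo.*$|^second.*$)', 'Foo bar']
input = 'Foo bar'

# Equivalent of: input ~* term  (text on the left, pattern on the right)
matching = terms.select { |term| Regexp.new(term, Regexp::IGNORECASE).match?(input) }
p matching   # ["(^foo.*$|^second.*$)", "Foo bar"]

# The swapped form from the question: term ~* input  (the input used as the pattern)
swapped = terms.select { |term| Regexp.new(input, Regexp::IGNORECASE).match?(term) }
p swapped    # ["Foo bar"]
```

Note that ^foo*$ does not match 'Foo bar' in either direction: foo* means "fo" followed by any number of o characters, so the pattern would need to be ^foo.*$ to accept trailing text.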

Rails query by number of digits in field

I have a Rails app with a table, "clients". The clients table has a field, phone, whose data type is string. I'm using PostgreSQL. I would like to write a query which selects all clients which have a phone value containing more than 10 digits. phone does not have a specific format:
+1 781-658-2687
+1 (207) 846-3332
2067891111
(345)222-777
123.234.3443
etc.
I've been trying variations of the following:
Client.where("LENGTH(REGEXP_REPLACE(phone,'[^\d]', '')) > 10")
Any help would be great.

You almost have it but you're missing the 'g' option to regexp_replace, from the fine manual:
The regexp_replace function provides substitution of new text for substrings that match POSIX regular expression patterns. [...] The flags parameter is an optional text string containing zero or more single-letter flags that change the function's behavior. Flag i specifies case-insensitive matching, while flag g specifies replacement of each matching substring rather than only the first one.
So regexp_replace(string, pattern, replacement) behaves like Ruby's String#sub whereas regexp_replace(string, pattern, replacement, 'g') behaves like Ruby's String#gsub.
You'll also need to get a \d through your double-quoted Ruby string all the way down to PostgreSQL so you'll need to say \\d in your Ruby. Things tend to get messy when everyone wants to use the same escape character.
This should do what you want:
Client.where("LENGTH(REGEXP_REPLACE(phone, '[^\\d]', '', 'g')) > 10")
# --------------------------------------------^^---------^^^
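The sub/gsub analogy and the double-backslash point can both be checked in plain Ruby, using the sample numbers from the question. This is an illustrative sketch of the Ruby side only; it does not run the SQL.

```ruby
phones = ['+1 781-658-2687', '+1 (207) 846-3332', '2067891111', '(345)222-777', '123.234.3443']

puts phones.first.sub(/[^\d]/, '')    # 1 781-658-2687  (only the first non-digit removed)
puts phones.first.gsub(/[^\d]/, '')   # 17816582687     (every non-digit removed)

long_numbers = phones.select { |phone| phone.gsub(/[^\d]/, '').length > 10 }
p long_numbers   # ["+1 781-658-2687", "+1 (207) 846-3332"]

# "\\d" in a double-quoted string is the two characters \ and d, which is what PostgreSQL needs.
puts "[^\\d]" == '[^\d]'   # true
```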

Try this:
phone_number.gsub(/[^\d]/, '').length
