How to use regex in active record queries? - ruby-on-rails

Some values in a specific column of my database end with a number, while others do not.
I'm trying to select only the rows whose value does not end with a number.
I tried these queries, but they do not work:
User.where.not("spec like ?", "%\d")
User.where.not("spec ~ ?", "%\d")
How can I find this data?

Use SIMILAR TO with the %[0-9] pattern:
User.where.not("spec SIMILAR TO ?", "%[0-9]")
The SIMILAR TO operator sits between LIKE and a regex: it supports the LIKE wildcards plus some "light" regex constructs, e.g. bracket expressions like [0-9] or [A-Z]. As with LIKE, the pattern must match the whole input.
So the %[0-9] pattern matches any string that starts with any text (the % wildcard does that) and ends with an ASCII digit (due to the [0-9] at the end).
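As a rough cross-check outside the database, the same filter can be sketched in plain Ruby. This is only an illustration: the helper name and the sample values are made up, and the Ruby regex /[0-9]\z/ stands in for the SIMILAR TO pattern %[0-9] rather than being what PostgreSQL actually evaluates.

```ruby
# Hypothetical helper: mimics where.not("spec SIMILAR TO ?", "%[0-9]")
# on an in-memory array instead of a database column.
def without_trailing_digit(values)
  # "%[0-9]" reads as "anything, then one ASCII digit at the very end",
  # which corresponds to the Ruby regex /[0-9]\z/.
  values.reject { |value| value.match?(/[0-9]\z/) }
end

# Made-up sample data for the spec column.
puts without_trailing_digit(['model 7', 'model x', 'v2', 'plain']).inspect
# prints ["model x", "plain"]
```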

Related

Ignore first 2 characters of string in where query using active record

How can I find all records that match my string, ignoring exactly two characters at the start of the field's string?
Something like:
Things.where("reference like ?", "**#{MyReference}")
Where ** can be any two characters, but not none, one, or three or more characters.
In a LIKE pattern an underscore (_) matches exactly one character, so two underscores match exactly two - https://www.postgresql.org/docs/8.1/functions-matching.html
In PostgreSQL you can also write ~~ instead of LIKE, so the query becomes:
Things.where("reference ~~ ?", "__#{MyReference}")
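One thing the query above does not handle is a reference that itself contains % or _, since those would be read as wildcards. Rails ships sanitize_sql_like for that purpose; the sketch below uses a hypothetical stand-in (escape_like) so that it runs without Rails, and it assumes the default backslash escape character.

```ruby
# Hypothetical stand-in for ActiveRecord's sanitize_sql_like:
# prefixes \, % and _ with a backslash so LIKE treats them literally.
def escape_like(value)
  value.gsub(/[\\%_]/) { |ch| "\\#{ch}" }
end

# Builds a LIKE pattern meaning "exactly two arbitrary characters,
# then the reference verbatim".
def two_char_prefix_pattern(reference)
  "__#{escape_like(reference)}"
end

puts two_char_prefix_pattern('AB_1')
```

With Rails loaded, the resulting string would be passed as the bind value, e.g. Things.where("reference LIKE ?", two_char_prefix_pattern(my_reference)), where my_reference is whatever variable holds the search string.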

How to store regex or search terms in Postgres database and evaluate in Rails Query?

I am having trouble with a DB query in a Rails app. I want to store various search terms (say 100 of them) and then evaluate against a value dynamically. All the examples of SIMILAR TO or ~ (regex) in Postgres I can find use a fixed string within the query, while I want to look the query up from a row.
Example:
Table: Post
column term varchar(256)
(plus regular id, Rails stuff etc)
input = "Foo bar"
Post.where("term ~* ?", input)
So term is the name of a VARCHAR column, and at least one row holds the value:
^foo*$
Unless I put an exact match (e.g. "Foo bar" in term) this never returns a result.
Ideally, I would also like to use expressions like
(^foo.*$|^second.*$)
i.e. multiple search terms as well, so it would match with 'Foo Bar' or 'Search Example'.
I think this has to do with Ruby or ActiveRecord stripping something out? Or am I on the wrong track, and regex or SIMILAR TO can't be used with row data values like this?
Alternative suggestions on how to do this are also appreciated.
The Postgres regular expression match operators take the string on the left and the regex on the right. See the examples: https://www.postgresql.org/docs/9.3/static/functions-matching.html#FUNCTIONS-POSIX-TABLE
But in your query you're treating term as the string and 'Foo bar' as the regex (you've swapped them). That's why the only term that matches is the exact match. Note also that ^foo*$ only matches "fo" followed by any number of "o" characters, so it would not match 'Foo bar' even with the operands in the right order; ^foo.*$ would. Try:
Post.where("? ~* term", input)
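To see the operand order in isolation, here is a small plain-Ruby sketch. It assumes the stored terms are simple enough to behave the same in Ruby's regex engine as in PostgreSQL's (the two engines differ in places), and matching_terms is a hypothetical helper, not part of Rails.

```ruby
# Each stored term is treated as the regex; the input is the string.
# Regexp::IGNORECASE plays the role of the * in PostgreSQL's ~* operator.
def matching_terms(terms, input)
  terms.select { |term| Regexp.new(term, Regexp::IGNORECASE).match?(input) }
end

# Sample terms taken from the question, plus a corrected variant.
stored_terms = ['^foo*$', '^foo.*$', '(^foo.*$|^second.*$)']

puts matching_terms(stored_terms, 'Foo bar').inspect
# prints ["^foo.*$", "(^foo.*$|^second.*$)"]
```

The first term is left out because foo* means "fo, then zero or more o", so nothing after "Foo" is allowed before the end of the string.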

Rails query by number of digits in field

I have a Rails app with a clients table that has a phone field of type string. I'm using PostgreSQL. I would like to write a query that selects all clients whose phone value contains more than 10 digits. phone does not have a specific format:
+1 781-658-2687
+1 (207) 846-3332
2067891111
(345)222-777
123.234.3443
etc.
I've been trying variations of the following:
Client.where("LENGTH(REGEXP_REPLACE(phone,'[^\d]', '')) > 10")
Any help would be great.
You almost have it but you're missing the 'g' option to regexp_replace, from the fine manual:
The regexp_replace function provides substitution of new text for substrings that match POSIX regular expression patterns. [...] The flags parameter is an optional text string containing zero or more single-letter flags that change the function's behavior. Flag i specifies case-insensitive matching, while flag g specifies replacement of each matching substring rather than only the first one.
So regexp_replace(string, pattern, replacement) behaves like Ruby's String#sub whereas regexp_replace(string, pattern, replacement, 'g') behaves like Ruby's String#gsub.
You'll also need to get a \d through your double-quoted Ruby string all the way down to PostgreSQL so you'll need to say \\d in your Ruby. Things tend to get messy when everyone wants to use the same escape character.
This should do what you want:
Client.where("LENGTH(REGEXP_REPLACE(phone, '[^\\d]', '', 'g')) > 10")
# --------------------------------------------^^---------^^^
Alternatively, to count the digits of a single value on the Ruby side, try this:
phone_number.gsub(/[^\d]/, '').length
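The sub versus gsub distinction, and the backslash handling in double-quoted strings, can be checked in plain Ruby without a database. The sketch below uses the sample numbers from the question; digit_count and long_numbers are hypothetical helper names.

```ruby
# Counts digits by removing every non-digit character (gsub = "global").
def digit_count(phone)
  phone.gsub(/[^\d]/, '').length
end

# Keeps only the numbers with more than `limit` digits.
def long_numbers(phones, limit = 10)
  phones.select { |phone| digit_count(phone) > limit }
end

phones = ['+1 781-658-2687', '+1 (207) 846-3332', '2067891111',
          '(345)222-777', '123.234.3443']

puts long_numbers(phones).inspect
# prints ["+1 781-658-2687", "+1 (207) 846-3332"]

# sub only replaces the first match, like regexp_replace without 'g':
puts '+1 781-658-2687'.sub(/[^\d]/, '')
# prints 1 781-658-2687

# In a double-quoted Ruby string "\d" loses its backslash; "\\d" keeps it.
puts "\d".length   # prints 1
puts "\\d".length  # prints 2
```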

Configure Sphinx to index dash and search it with and without it

I have a record
Item id: 1, name: "wd-40"
How do I configure Sphinx to match this record on the following queries:
Item.search("wd40")
Item.search("wd-40")
To answer your title question, charset_table is what you want.
http://sphinxsearch.com/docs/current.html#charsets
But that doesn't actually solve the problem of matching both of those queries: indexing the dash wouldn't work on its own, it would just reverse the problem.
Instead, you probably want ignore_chars
http://sphinxsearch.com/docs/current.html#conf-ignore-chars
First, indexing:
By default, only ASCII characters are indexed by Sphinx; the others are considered word separators. To fix that, you need to use the charset_table parameter to map the dash to the dash character.
Second, searching:
AFAIK, it is not possible to make Sphinx treat both searches the way you are asking. However, you can use something like:
# in Python, but I believe it is understandable
query = word
if '-' in word:
    query += " | " + word.replace('-', '')
Item.search(query)  # if word = 'wd-40', query = 'wd-40 | wd40'
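Since Item.search is called from Ruby, the same query expansion might look like the sketch below. expand_dashed_query is a hypothetical helper name, and the | OR operator is taken from the answer above rather than checked against a particular Sphinx match mode.

```ruby
# Returns the word unchanged when it has no dash; otherwise returns
# "<word> | <word without dashes>" so both spellings are searched.
def expand_dashed_query(word)
  return word unless word.include?('-')

  "#{word} | #{word.delete('-')}"
end

puts expand_dashed_query('wd-40')
# prints wd-40 | wd40
```

The result would then be passed along as Item.search(expand_dashed_query(word)).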

Postgresql and ActiveRecord where: Regex matching

I wrote this regex in ordinary regex syntax:
/(first|last)\s(last|first)/i
It matches the first three of:
first last
Last first
First Last
First name
I am trying to get all the records where full_name matches the regex I wrote. I'm using PostgreSQL.
Person.where("full_name ILIKE ?", "%(first|last)%(last|first)%")
This is my attempt. I also tried SIMILAR TO and ~ with no luck.
Your LIKE query:
full_name ilike '%(first|last)%(last|first)%'
won't work because LIKE doesn't understand regex grouping ((...)) or alternation (|). LIKE only understands _ for a single character (like . in a regex) and % for any sequence of zero or more characters (like .* in a regex).
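To make that concrete, here is a rough plain-Ruby model of LIKE/ILIKE matching. like_to_regexp is a hypothetical helper written for this illustration; it ignores LIKE escape characters and is not how PostgreSQL implements the operator.

```ruby
# Translates a LIKE pattern into a Ruby Regexp: % becomes .*, _ becomes .,
# and every other character (including ( | )) is matched literally.
def like_to_regexp(pattern, ignore_case: false)
  body = pattern.each_char.map do |ch|
    case ch
    when '%' then '.*'
    when '_' then '.'
    else Regexp.escape(ch)
    end
  end.join

  flags = Regexp::MULTILINE
  flags |= Regexp::IGNORECASE if ignore_case
  # \A and \z anchor the match, since LIKE must match the whole string.
  Regexp.new("\\A#{body}\\z", flags)
end

ilike = like_to_regexp('%(first|last)%(last|first)%', ignore_case: true)

puts ilike.match?('first last')                  # prints false
puts ilike.match?('x (first|last) (LAST|first)') # prints true
```

In this model the ILIKE pattern only matches strings that literally contain the parentheses and pipe characters, which is why the original attempt returns nothing useful.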
If you hand that pattern to SIMILAR TO then you'll find 'first last' but none of the others due to case problems; however, this:
lower(full_name) similar to '%(first|last)%(last|first)%'
will take care of the case problems and find the same ones as your regex.
If you want to use a regex (which you probably do because LIKE is very limited and cumbersome and SIMILAR TO is, well, a strange product of the fevered minds of some SQL standards subcommittee) then you'll want to use the case-insensitive matching operator and your original regex:
full_name ~* '(first|last)\s+(last|first)'
That translates to this bit of AR:
Person.where('full_name ~* :pat', :pat => '(first|last)\s+(last|first)')
# or this
Person.where('full_name ~* ?', '(first|last)\s+(last|first)')
There's a subtle change in my code that you need to take note of: I'm using single quotes for my Ruby strings; you're using double quotes. Backslashes mean more in double-quoted strings than they do in single-quoted strings, so '\s' and "\s" are different things. Toss in a couple of to_sql calls and you might see something interesting:
> puts Person.where('full_name ~* :pat', :pat => 'a\s+b').to_sql
SELECT "people".* FROM "people" WHERE (full_name ~* 'a\s+b')
> puts Person.where('full_name ~* :pat', :pat => "a\s+b").to_sql
SELECT "people".* FROM "people" WHERE (full_name ~* 'a +b')
That difference probably isn't causing you any problems but you need to be very careful with your strings when everyone wants to use the same escape character. Personally, I use single quoted strings unless I specifically need the extra escapes and string interpolation functionality of double quoted strings.
Some demos: http://sqlfiddle.com/#!15/99a2c/6
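The quoting difference and the regex itself can also be checked in plain Ruby, without PostgreSQL. This sketch assumes \s behaves the same in both engines for these inputs; matching_names and the constant names are hypothetical, and the sample names come from the question.

```ruby
# Single quotes keep the backslash; double quotes turn \s into a space.
SINGLE_QUOTED = 'a\s+b'
DOUBLE_QUOTED = "a\s+b"

puts SINGLE_QUOTED.length  # prints 5
puts DOUBLE_QUOTED.length  # prints 4

# Case-insensitive match, standing in for PostgreSQL's ~* operator.
def matching_names(names, pattern)
  names.grep(Regexp.new(pattern, Regexp::IGNORECASE))
end

names = ['first last', 'Last first', 'First Last', 'First name']

puts matching_names(names, '(first|last)\s+(last|first)').inspect
# prints ["first last", "Last first", "First Last"]
```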

Resources