Normalize seeking value in SQL search query - ruby-on-rails

[PostgreSQL(9.4), Rails(4.1)]
The problem:
I have a table with the names of tools. The name column is of hstore type and looks like this: name -> ('en': value, 'de': value). Note that the 'de' key is irrelevant to this problem, because all names are stored only under the 'en' key.
Next, I have to construct a search query that finds the right record, but the format of the text in the query is unknown, e.g.:
In DB:
WQXZ 123GT, should match query: WQXZ_123-GT
In DB:
Three Words Name 123-D45, should match query: Three_WORDS_NAME 123D45
and so on...
Solution:
To make this work, I want to normalize both the stored value and the query so that they become identical. To do this I need to downcase both values and remove all whitespace and all non-alphanumeric characters, so the values above become:
wqxz123gt == wqxz123gt
and
threewordsname123d45 == threewordsname123d45
I have no problem formatting the search value in Ruby:
"sTR-in.g24 3".downcase.gsub(/\s/, "").gsub(/\W/, "") # => "string243"
But I can't figure out how to do the same inside the SQL search query, which should look something like:
Tool.where("CODE_I_AM_LOOKING_FOR(name -> 'en') = (?)", value.downcase.gsub(/\s/, "").gsub(/\W/, ""))
Thank you for your time.
UPD: I can downcase inside the query:
Tool.where("lower(name -> 'en') = (?)", value.downcase)
But it solves only a part of the problem (downcase). The whitespaces and non-word characters (dots, dashes, underscores, etc.) are still an issue.

You can use the Postgres replace function to remove spaces, then use the lower function to match on that value, like this:
Tool.where("lower(replace(name -> 'en', ' ', '')) = (?)", value.downcase.gsub(/\s/, "").gsub(/\W/, "") )
I hope this would be helpful.

Nitin Srivastava's answer pointed me in the right direction. All I needed was the regexp_replace function.
So the proper query is:
Tool.where(
  "lower(regexp_replace((name -> 'en'), '[^a-zA-Z0-9]+', '', 'g')) = ?",
  # \W keeps underscores, so strip everything outside a-z0-9 to mirror the SQL side
  value.downcase.gsub(/[^a-z0-9]/, "")
)
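To see why both sides have to be normalized by the same rule, here is a minimal plain-Ruby sketch. The helper name normalize_tool_name is hypothetical (it is not part of Rails or of the original answer); it mirrors the '[^a-zA-Z0-9]+' character class used in the SQL above, which, unlike Ruby's \W, also strips underscores.

```ruby
# Hypothetical helper: lower-case the value and drop everything except
# ASCII letters and digits, mirroring
# lower(regexp_replace(..., '[^a-zA-Z0-9]+', '', 'g')) on the SQL side.
def normalize_tool_name(value)
  value.to_s.downcase.gsub(/[^a-z0-9]+/, "")
end

stored = "Three Words Name 123-D45"
query  = "Three_WORDS_NAME 123D45"

# Both sides collapse to the same string, so an equality check succeeds.
puts normalize_tool_name(stored) == normalize_tool_name(query)
```

With a helper like this, the second argument to Tool.where would simply be normalize_tool_name(value).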

Related

Rails, Postgres 12, Query where pattern matches regex, and contains substring

I have a field in the database which contains strings that look like 58XBF2022L1001390. I need to be able to query results which match the last letter (in this case 'L') and match or resemble the last four digits.
The regular expression I've been using to find records which match the structure is \d{2}[A-Z]{3}\d{4}[A-Z]\d{7}. So far I've tried using a scope to refine the results, but I'm not getting any results. Here's my scope:
def self.filter_by_shortcode(input)
q = input
starting = q.slice!(0)
ending = q
where("field ~* ?", "\d{2}[A-Z]{3}\d{4}/[#{starting}]/\d{3}[#{ending}]\g")
end
Here are some more example strings, and the substring that we would be looking for. Not every string stored in this database field matches this format, so we would need to be able to first match the string using the regex provided, then search by substring.
36GOD8837G6154231
G4231
13WLF8997V2119371
V9371
78FCY5027V4561374
V1374
06RNW7194P2075353
P5353
57RQN0368Y9090704
Y0704
edit: added some more examples as well as substrings that we would need to search by.
I do not know Rails, but the SQL for what you want is relatively simple. Since your string is in a fixed format, once that format is validated, simple concatenation of substrings gives your desired result.
with base(target, goal) as
( values ('36GOD8837G6154231', 'G4231')
, ('13WLF8997V2119371', 'V9371')
, ('78FCY5027V4561374', 'V1374')
, ('06RNW7194P2075353', 'P5353')
, ('57RQN0368Y9090704', 'Y0704')
)
select substr(target,10,1) || substr(target,14,4) target, goal
from base
where target ~ '^\d{2}[A-Z]{3}\d{4}[A-Z]\d{7}$';
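The same validate-then-slice idea can be sketched in plain Ruby, for example to check a short code on the application side. The constant and method names below are hypothetical; the pattern is the one from the question, with capture groups added for the letter and the last four digits.

```ruby
# Same fixed format as the SQL check: 2 digits, 3 letters, 4 digits,
# 1 letter, 7 digits. The letter and the final 4 digits are captured.
SHORTCODE_FORMAT = /\A\d{2}[A-Z]{3}\d{4}([A-Z])\d{3}(\d{4})\z/

# Hypothetical helper: returns the short code for a well-formed value,
# or nil when the value does not match the fixed format.
def shortcode_for(target)
  match = SHORTCODE_FORMAT.match(target)
  match && "#{match[1]}#{match[2]}"
end

puts shortcode_for("36GOD8837G6154231")
```

Values that do not follow the format return nil, which corresponds to the rows filtered out by the WHERE clause in the SQL version.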

How to capture regex first occurence and interpolate into string with postgreSQL

I'm trying to concatenate the digits from a string that starts with 'CityName' into a separate string. I have the concatenation part; my issue is accessing the matches from the regex.
I have a regex in Rails that looks like /CityName\s*(\d+)/i. I'm super new to regex and it's hard for me to wrap my head around the docs, but I'm assuming that this regex matches 'CityName' case-insensitively and captures any digits after it. The captured digits are then interpolated if the regex matches an attribute on my model.
regex = /CityName\s*(\d+)/i
if line_1 =~ regex
"C#{$1}"
...
end
But further along in the execution, it's slowing down because I have to iterate over a lot of records. I have a query in psql that will do the calculations I need, but I'm having a hard time implementing this regex replacement there. My attempts so far look like:
CASE
when addr.line_1 ~* 'CityName\s*(\d+)' then 'C' || regex_matches('CityName\s*(\d+)')[0]
...
I'm having a hard time finding a way to grab the first occurrence of the regex match. Thanks for any tips :D
EDIT: I am trying to grab the digits after 'CityName' from a string if that string contains 'CityName'.
Ultimately I need help with the regex and with how to concatenate the digits with 'C'.
Your question is a bit unclear. Are you trying to add the digits to your selection or to filter records based on them?
If you just want to select them:
Address.select(%q{(regexp_matches(addr.line_1, 'CityName\s*(\d+)'))[1] as digits})
.map(&:digits)
If you want to filter based on them:
Address.where(%q{addr.line_1 ~ 'CityName\s*(\d+)'})
  .map(&:line_1)
Also a few notes:
Selecting digits case-insensitively does not really make sense. Digits do not have case.
PostgreSQL arrays start from 1 instead of 0.
It seems you need a subquery or a WITH query:
SELECT tbl1.col1, sum(...), min(...) FROM (SELECT ..., CASE ...yourregex stuff... END col1 FROM ...) tbl1 GROUP BY 1;
WITH tbl1 AS (SELECT ..., CASE ...yourregex stuff... END col1 FROM ...) SELECT t.col1, sum(...) FROM tbl1 t GROUP BY 1;
If you need them regularly, you can also create a view from the query or create a temp table, and then use it in later queries.
Got it! Was able to finally start to figure out the regex.
WHEN addr.line_1 ~* '(?i)CityName\s*(\d+)' THEN 'C' || (SELECT (regexp_matches(addr.line_1, '(?i)CityName\s*(\d+)'))[1])
The (?i) allowed case-insensitive matching for CityName, and then the concatenation worked. Thank you @ti6on for pointing out the index difference with Postgres :D

How to store regex or search terms in Postgres database and evaluate in Rails Query?

I am having trouble with a DB query in a Rails app. I want to store various search terms (say 100 of them) and then evaluate against a value dynamically. All the examples of SIMILAR TO or ~ (regex) in Postgres I can find use a fixed string within the query, while I want to look the query up from a row.
Example:
Table: Post
column term varchar(256)
(plus regular id, Rails stuff etc)
input = "Foo bar"
Post.where("term ~* ?", input)
So term is VARCHAR column name containing the data of at least one row with the value:
^foo*$
Unless I put an exact match (e.g. "Foo bar" in term) this never returns a result.
I would also like to ideally use expressions like
(^foo.*$|^second.*$)
i.e. multiple search terms as well, so it would match with 'Foo Bar' or 'Search Example'.
I think this is to do with Ruby or ActiveRecord stripping down something? Or I'm on the wrong track and can't use regex or SIMILAR TO with row data values like this?
Alternative suggestions on how to do this also appreciated.
The Postgres regular expression match operators have the regex on the right and the string on the left. See the examples: https://www.postgresql.org/docs/9.3/static/functions-matching.html#FUNCTIONS-POSIX-TABLE
But in your query you're treating term as the string and the 'Foo bar' as the regex (you've swapped them). That's why the only term that matches is the exact match. Try:
Post.where("? ~* term", input)
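The effect of the operand order can be sketched in plain Ruby, with an array standing in for the term column. This is only an illustration: Ruby's regex engine is not identical to the PostgreSQL one, and the method names are hypothetical.

```ruby
# Equivalent in spirit to `input ~* term`:
# each stored term is the pattern, the input is the string being tested.
def terms_matching(terms, input)
  terms.select { |term| Regexp.new(term, Regexp::IGNORECASE).match?(input) }
end

# Equivalent in spirit to the swapped `term ~* input`:
# the input is treated as the pattern and each stored term as the string.
def terms_matched_by(terms, input)
  terms.select { |term| Regexp.new(input, Regexp::IGNORECASE).match?(term) }
end

terms = ["^foo.*$", "(^foo.*$|^second.*$)", "^bar$"]

p terms_matching(terms, "Foo bar")    # stored patterns that accept the input
p terms_matched_by(terms, "Foo bar")  # swapped operands
```

With the swapped order, a stored term is only returned when it literally contains the input text, which is why only an exact "Foo bar" row appeared to work.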

Case Insensitive Search with Neo4jClient

Just a quick and easy one: I need to be able to search our database ignoring case. I know how to do it in general, just not with the Neo4jClient. Here's the code:
client.Cypher
.Match("(person:Person)")
.Where((Person person) => person.Email == search)
where 'search' is a parameter of type string that is passed to the method. I have read that using =~ '(?i)text' works, but that doesn't allow me to pass in the parameter, and I have tried this:
client.Cypher
.Match("(person:Person)")
.Where((Person person) => person.Email =~ "(?i){terms}")
.WithParam("terms",search)
But it doesn't like this.
I would like to be able to search without case, and if possible at the same time, using LIKE (or ILIKE as it seems to be for pattern matching).
Thanks
EDIT & ANSWER
The final code ended up as this:
return client.Cypher
.Match("(person:Person)")
.Where("person.Email =~ {terms}")
.OrWhere("person.Name =~ {terms}")
.WithParam("terms", "(?ui).*" + search + ".*")
.Return<Person>("person").Results.ToList();
Which does exactly what I want it to.
I also took the advice of storing the value in a lowercase field. We already have one on the account so that logon names are not case sensitive, and I am going to do the same for the email and name fields. That seems better than using toLower() (either in Cypher or in C#).
So thanks to @Stefan Armbruster for his help.
You cannot have partial parameters. Instead add (?i) to the parameter value:
query: person.Email =~ term
parameter: term = "(?i)<myvalue>"
Note 1: You need to use (?ui) to deal gracefully with non-ASCII case sensitivity (e.g. German umlauts).
Note 2: The =~ operator is not backed by an index, so the query above will touch every Person node and apply the regex to the property value. In Neo4j 2.3 there will be an index-backed LIKE which supports string prefix matches.
If you want an index-based case-insensitive search, the recommended approach is to store the property value converted to lower case (Cypher has a toLower function) and then do an exact match on the lower-cased search value.

PostgreSql + Searching Issue for "'" present in string

I have a problem searching for a record in a PostgreSQL DB in an RoR application. Table name: address_books. Attributes: organization_name, federal_tax_id, city, zip, business_name. One record has the organization name Claire's Inc. When we select Claire's Inc in the search box, no data is shown, because the "'" breaks the string and the query returns no result. In MySQL I replaced "'" with "?" at search time and it worked, but I am not getting an appropriate conversion to make this search work here.
Query :: SELECT * FROM "address_books"
WHERE ( address_books.organization_name = 'Claire?s Inc'
and address_books.federal_tax_id = '59-0940416'
and address_books.city = 'Hoffman Estates'
and address_books.zip = '60192' and address_books.business_name ='' )
ORDER BY address_books.organization_name , city LIMIT 100
Please suggest any other way to make successful search.
Thanks in Advance
You're messing up your data to deal with a matter of query syntax. Put a correctly escaped apostrophe in the place where the apostrophe should be.
One way is to escape it to 'Claire''s Inc'. Another is to use a library that lets you pass parameters and handles the escaping for you. Another is to enter the string as $$Claire's Inc$$ though that syntax allows for other things that may not be appropriate here.
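The doubling rule from the first option can be sketched in a few lines of Ruby. The helper name quote_sql_literal is hypothetical and exists only to illustrate the escaping; it handles nothing beyond apostrophes, and in application code a library that binds parameters is the safer choice.

```ruby
# Hypothetical helper, for illustration only: wraps a value in single
# quotes and doubles any apostrophe inside it, as in 'Claire''s Inc'.
def quote_sql_literal(value)
  "'" + value.gsub("'", "''") + "'"
end

puts quote_sql_literal("Claire's Inc")
```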
I think you can use RoR parameter substitution; then RoR will escape your dangerous strings for you. Something like:
AddressBook.find(:all, :conditions => ["organization_name = ?", "Claire's Inc"])
or
AddressBook.find(:all, :conditions => { :organization_name => "Claire's Inc" })
