Heroku/postgresql search - ruby-on-rails

I have a database with names of cities that uses all different kinds of characters, from åäö to vietnamese signs. The place names are all in a capitalized format e.g. "New York" and "Aix En Provence".
My problem now is to try to search for these city names in Heroku which uses postgresql.
p = Place.where("upper(name) LIKE '%" + place_name_searched.to_s.upcase.tr('åäöüñï','ÅÄÖÜÑÏ') + "%'").first
This is the best I have come up with but it is only but a fix for åäö etc and doesn't catch all the 100+ different characters there is out there, e.g. "é". The solution is not to fill out that string any further.
A general problem is that upper(name) seems to translate everything fine but upcase can only handle english letters. So the correspoding search will be .where("TÄBY like 'TäBY') if place_name_searched = 'täby'.
So, how can I match postgresql entries such as "Täby" and "Jidd Ḩafş" to corresponding strings in downcase, entered by the user ("täby", "jidd hafs")?
It all works fine in my Mysql-version of the application but once I upload to Heroku it all fails.

Postgres has an extension called "ILIKE" for case insensitive pattern matching.
p = Place.where("upper(name) ILIKE '%" + place_name_searched.to_s.upcase.tr('åäöüñï','ÅÄÖÜÑÏ') + "%'").first
But please note this only works with Postgres, so you need to check which environment you are running on to decide if you would use this one instead of the regular one.

Related

How to store regex or search terms in Postgres database and evaluate in Rails Query?

I am having trouble with a DB query in a Rails app. I want to store various search terms (say 100 of them) and then evaluate against a value dynamically. All the examples of SIMILAR TO or ~ (regex) in Postgres I can find use a fixed string within the query, while I want to look the query up from a row.
Example:
Table: Post
column term varchar(256)
(plus regular id, Rails stuff etc)
input = "Foo bar"
Post.where("term ~* ?", input)
So term is VARCHAR column name containing the data of at least one row with the value:
^foo*$
Unless I put an exact match (e.g. "Foo bar" in term) this never returns a result.
I would also like to ideally use expressions like
(^foo.*$|^second.*$)
i.e. multiple search terms as well, so it would match with 'Foo Bar' or 'Search Example'.
I think this is to do with Ruby or ActiveRecord stripping down something? Or I'm on the wrong track and can't use regex or SIMILAR TO with row data values like this?
Alternative suggestions on how to do this also appreciated.
The Postgres regular expression match operators have the regex on the right and the string on the left. See the examples: https://www.postgresql.org/docs/9.3/static/functions-matching.html#FUNCTIONS-POSIX-TABLE
But in your query you're treating term as the string and the 'Foo bar' as the regex (you've swapped them). That's why the only term that matches is the exact match. Try:
Post.where("? ~* term", input)

Handling special chars in a case insensitive search

I am using:
c.customerName =~ '(?i).*$q.*'
in order to find insensitive case any kind of customername and this is working absolutely fine for all standard character. In German unfortunately there are special chars e.g. like Ä,Ö,Ü. In this cases the cypher statement is case sensitive, e.g. if we have two customer names like Ötest and ötest it will find only one of them depending if you type a lower or an upper Ö.
Anyone has a hint what I can do to expand the insensitive case search also on such special chars?
EDIT: The problem exists also when you have a name including e.g. a '&' - you'll find e.g. the company D&A Construction when you type 'D&' - the moment you add a thrid character 'D&A' the search fails and no result is shown. Any idea?
You need to add a 'u' in your regex to transform it in a case-insensitive unicode regex. Like this:
c.customerName =~ '(?ui).*$q.*'
Works here:
From this StackOverflow question.

Postgresql text searching, matching multiple words

I don't know the name for this kind of search, but I see that it's getting pretty common.
Let's say I have records with the following file names:
'order_spec.rb', 'order.sass', 'orders_controller_spec.rb'
If I search with the following string 'oc' I would like the result to return 'orders_controller_spec.rb' due to match the o in orders and the c in controller.
If the string is 'os' then I'd like all 3 to match, 'order_spec.rb', 'order.sass', 'orders_controller_spec.rb'.
If the string is 'oco' then I'd like 'orders_controller_spec.rb'
What is the name for this kind of search and how would I go about getting this done in Postgresql?
This is a called a subsequence search. One simple way to do it in Postgres is to use the LIKE operator (or several of the other options in those docs) and fill the spaces between your letters with a wildcard, which for LIKE is %. To match anything with an o followed by an s in the words column, that would look like this:
SELECT * FROM table WHERE words LIKE '%o%s%';
This is a relatively expensive search, but you can improve performance with a varchar_pattern_ops or text_pattern_ops index to support faster pattern matching.
CREATE INDEX pattern_index ON table (words varchar_pattern_ops);

Rails query by number of digits in field

I have a Rails app with a table: "clients". the clients table has a field: phone. phone data type is string. I'm using postgresql. I would like to write a query which selects all clients which have a phone value containing more than 10 digits. phone does not have a specific format:
+1 781-658-2687
+1 (207) 846-3332
2067891111
(345)222-777
123.234.3443
etc.
I've been trying variations of the following:
Client.where("LENGTH(REGEXP_REPLACE(phone,'[^\d]', '')) > 10")
Any help would be great.
You almost have it but you're missing the 'g' option to regexp_replace, from the fine manual:
The regexp_replace function provides substitution of new text for substrings that match POSIX regular expression patterns. [...] The flags parameter is an optional text string containing zero or more single-letter flags that change the function's behavior. Flag i specifies case-insensitive matching, while flag g specifies replacement of each matching substring rather than only the first one.
So regexp_replace(string, pattern, replacement) behaves like Ruby's String#sub whereas regexp_replace(string, pattern, replacement, 'g') behaves like Ruby's String#gsub.
You'll also need to get a \d through your double-quoted Ruby string all the way down to PostgreSQL so you'll need to say \\d in your Ruby. Things tend to get messy when everyone wants to use the same escape character.
This should do what you want:
Client.where("LENGTH(REGEXP_REPLACE(phone, '[^\\d]', '', 'g')) > 10")
# --------------------------------------------^^---------^^^
Try this:
phone_number.gsub(/[^\d]/, '').length

How to handle special characters like " ’ " when fetched from DB

Actually, i have an entry in my db of title "You’re Accepted to College, Now What: Deciding Upon...".
So when user types "you'are accepted...." it doesn't match with the above because of the special character ’.
What should be done for such type of characters.
Thanks in advance. :)
So in the case that you do have the apostrophe within your database in all places, so that there is no record that has a quote instead, I'd just correct the queries via gsub so that they contain the correct quote.
The true, nice and elegant solution would be to add a collation to your database server. The collation is what makes the database return 1 for SELECT 'ä' LIKE 'a'. Unfortunately, neither utf8_general_ci nor utf8_unicode_ci (which are built-in collations to MySQL) fold the apostrophe together with the quote. This is why you would have to add a new one. How this is accomplished in MySQL, you may find here: http://dev.mysql.com/doc/refman/5.0/en/adding-collation.html

Resources