idiomatic way to do regular expression searches in rails models? - ruby-on-rails

in my rails controller, i would like to do a regular expression search of my model. my googling seemed to indicate that i would have to write something like:
Model.find(:all, :conditions => ["field REGEXP ?", regex_str])
which is rather nasty as it implies MySQL syntax (i'm using Postgres).
is there a cleaner way of forcing rails (4 in my case) to do a regexp search on a field?
i also much prefer using where() as it allows me to map my strong parameters (hash) directly to a query. so what i would like is something like:
Model.where( params, :match_by => { 'field': '~' } )
which would loosely translate to something like (if params['field'] = 'regex_str')
select * from models where field ~ regex_str

Unfortunately, there is no idiomatic way to do this. There's no built-in support for regular expressions in ActiveRecord. It'd be impossible to do efficiently unless each database adapter had a database-specific implementation, and not all databases support regular expression matches. Those that do don't all support the same syntax (for example, Postgres doesn't have the same regexp syntax as Ruby's Regexp class).
You'll have to roll your own using SQL, as you've noted in your question. There are alternatives, however.
For a Postgres-specific solution, check out pg_search, which uses Postgres's full text search capabilities. This is very fast and supports fuzzy searching and some pattern matching.
Elasticsearch requires more setup, but is incredibly fast, with some nice gems to make your life easier. Here's a RailsCasts episode introducing it. It requires running a separate server, but it's not too hard to get started, and it's powerful. Still no regular expressions, but it's worth looking at.
If you're just doing a one-off regexp search against a single field, SQL is probably the way to go.
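To make the "roll your own using SQL" route a little less painful, here is a minimal sketch of a helper that turns a params-style hash into an argument list for where(). The names regex_conditions and POSTGRES_REGEX_OPERATORS are hypothetical (they are not part of Rails), and the operators are the Postgres-specific POSIX regex operators, so this assumes a Postgres database. Column names are checked against a whitelist because they are interpolated into the SQL string; the patterns themselves go through bind placeholders.

```ruby
# Hypothetical helper, not part of Rails: builds arguments for where().
# Postgres POSIX regex operators: ~ (match), ~* (case-insensitive match),
# !~ and !~* (the negated forms).
POSTGRES_REGEX_OPERATORS = ['~', '~*', '!~', '!~*'].freeze

def regex_conditions(params, allowed_columns, operator = '~')
  unless POSTGRES_REGEX_OPERATORS.include?(operator)
    raise ArgumentError, "unsupported operator: #{operator}"
  end

  # Normalise keys and values to strings so symbol and string keys both work.
  string_params = params.each_with_object({}) { |(k, v), h| h[k.to_s] = v.to_s }

  # Only whitelisted column names ever reach the SQL fragment.
  columns = allowed_columns.map(&:to_s) & string_params.keys
  raise ArgumentError, 'no searchable columns given' if columns.empty?

  clause = columns.map { |column| "#{column} #{operator} ?" }.join(' AND ')
  [clause, *columns.map { |column| string_params[column] }]
end

conditions = regex_conditions({ 'field' => '^foo.*bar$' }, %w[field])
# conditions is ["field ~ ?", "^foo.*bar$"]
# In a Rails app backed by Postgres: Model.where(*conditions)
```

The pattern is still interpreted by Postgres, not by Ruby, so it has to follow Postgres's regex syntax.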

Related

Querying for tag values in a given list

Is there any shortform syntax in influxdb to query for membership in a list? I'm thinking of something along the lines of
SELECT * FROM some_measurement WHERE some_tag IN ('a', 'b', 'c')
For now I can string this together using ORed =s, but that seems very inefficient. Any better approaches? I looked through the language spec and I don't see this as a possibility in the expression productions.
Another option I was considering is the regex approach, but that seems like a worse approach to me.
InfluxDB 0.9 supports regex for tag matching. It's the correct approach although of course regex can be problematic. It's not a performance issue for InfluxDB, and in fact would likely be faster than multiple chained OR statements. There is no support yet for clauses like IN or HAVING.
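For example, to match exactly a, b, or c (the anchors keep the pattern from also matching longer values that merely contain one of those letters): SELECT * FROM some_measurement WHERE some_tag =~ /^(a|b|c)$/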
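If the list of values is built at runtime, the regex can be generated rather than typed by hand. This is only a sketch: influx_tag_regex is a hypothetical helper name, and it assumes that Ruby's Regexp.escape produces escapes that InfluxDB's regex engine also accepts, which should hold for simple tag values but is worth checking for values containing spaces or unusual characters.

```ruby
# Hypothetical helper: turns a list of tag values into an anchored
# InfluxQL regex literal that matches any one of the values exactly.
def influx_tag_regex(values)
  alternatives = values.map do |value|
    # Escape regex metacharacters, then the / that delimits the literal.
    Regexp.escape(value).gsub('/') { '\/' }
  end
  "/^(#{alternatives.join('|')})$/"
end

query = "SELECT * FROM some_measurement WHERE some_tag =~ #{influx_tag_regex(%w[a b c])}"
# query ends with: some_tag =~ /^(a|b|c)$/
```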

MongoID Query with Regex and escaping

I want to know if it is necessary to escape regex in query calls with rails/mongoID ?
This is my current query:
@model.where(nice_id_string: /#{params[:nice_id_string]}/i)
I am now unsure whether this is secure enough, because of the regex.
Should i use the code below, or does MongoID automatically escape query calls?
@model.where(nice_id_string: /#{Regexp.escape(params[:nice_id_string])}/i)
Of course you should escape the input. If params[:nice_id_string] were .*, your current query would be:
@model.where(nice_id_string: /.*/i)
whereas your second would be:
@model.where(nice_id_string: /\.\*/i)
Those do very different things, one of which you probably don't want. Someone with a sufficiently bad attitude could probably slip some catastrophic backtracking through your current version and I'm not sure what MongoDB/V8's regex engine will do with that.
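The difference is easy to see in plain Ruby, without a database. In this sketch user_input stands in for params[:nice_id_string], and the sample strings are made up for illustration.

```ruby
# user_input stands in for params[:nice_id_string].
user_input = '.*'

unescaped = /#{user_input}/i                 # behaves like /.*/i
escaped   = /#{Regexp.escape(user_input)}/i  # behaves like /\.\*/i

unescaped_hit = unescaped.match?('any id at all')   # true: .* matches everything
escaped_hit   = escaped.match?('any id at all')     # false: no literal ".*" in it
literal_hit   = escaped.match?('id with .* inside') # true: literal ".*" is present
```

With escaping, the input can only ever be matched as literal text, so a user cannot smuggle in wildcards or an expensive pattern.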

Ruby on Rails - what is the best practice for accessing Active record Columns

I am really new at rails but i was wondering.
What is the best practice for accessing model column names in rails when doing queries?
for example, i want to order by a column called "title" in DESCENDING order. how would i do it (best practice)?
MyModel.order(:title.to_s.concat " DESC").all
MyModel.order("title DESC").all
or something else?
From my experience, using hardcoded strings always proves the wrong approach in matters such as this, mainly because the code becomes impossible to refactor.
in My IDE (i am using RubyMine) it is showing nice code completion for the column symbols, so i am guessing it will be easier to track their use this way?
Thanks.
In my opinion MyModel.order("title DESC").all is the better choice here. The other option is harder to read and needlessly complex. Performance is unlikely to matter here, but the other option also does extra work (a symbol-to-string conversion plus a concat) for no benefit.
Apart from that, you should never write code around your IDE's IntelliSense abilities: your code should be navigable and readable in every editor. I use Vim and it completes strings as well as it completes symbols, so there is no difference here.
EDIT:
If your order was ASC then you could use MyModel.order(:title).all which is definitely better than MyModel.order("title").all
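Since the question mentions refactoring, it may be worth adding that from Rails 4 onwards order also accepts a hash, so MyModel.order(title: :desc) avoids the string entirely. If the column or direction comes from user input, a small guard can build that hash. The sketch below is plain Ruby; order_clause and SORT_DIRECTIONS are hypothetical names, not Rails API.

```ruby
# Hypothetical helper: builds the hash form accepted by order() in Rails 4+.
SORT_DIRECTIONS = %i[asc desc].freeze

def order_clause(column, direction, sortable_columns)
  column = column.to_sym
  direction = direction.to_s.downcase.to_sym

  # Refuse anything that is not a known column or direction.
  raise ArgumentError, "cannot sort by #{column}" unless sortable_columns.include?(column)
  raise ArgumentError, "unknown direction #{direction}" unless SORT_DIRECTIONS.include?(direction)

  { column => direction }
end

clause = order_clause(:title, 'DESC', %i[title created_at])
# clause is { title: :desc }
# In a Rails 4+ app: MyModel.order(clause)
```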

Syntactic sugar for knowing if all array elements return true for specific method?

In my Ruby program, I have an array of five strings, and I want to check whether every element of that array meets a given requirement, for example:
a = ['', '', '', '', '']
a.inject(:blank?) # What I'd like: true if (and only if) all elements of a are blank
I'm asking this question because Ruby has a pretty large standard API with a lot of pre-written syntactical sugar, which I want to know and don't want to reinvent.
There is a very concise way:
array.all?(&:blank?)
Study Enumerable and learn how to use Enumerators and you'll be speaking the most pleasant dialect of Ruby in no time.
Just an alternative: if you have String#to_proc available (search for it; I won't link my own repository in case it is taken as advertising), you can write something close to what you had:
a.inject(&'&& $1.blank?')
which is roughly equivalent to the following (seed the accumulator with true so that the first element is checked as well):
a.inject(true) { |sum, i|
sum && i.blank?
}
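Here is a short plain-Ruby comparison of the two approaches. It uses String#empty? in place of blank?, because blank? comes from ActiveSupport rather than core Ruby; that substitution and the sample arrays are assumptions made so the snippet runs on its own.

```ruby
all_empty   = ['', '', '', '', '']
first_taken = ['x', '', '', '', '']

# all? stops at the first element that fails the check.
all_empty.all?(&:empty?)     # true
first_taken.all?(&:empty?)   # false

# inject without a seed uses the first element as the accumulator,
# so that element is never tested and the result here is wrong.
unseeded = first_taken.inject { |acc, s| acc && s.empty? }      # true

# Seeding with true makes every element go through the check.
seeded = first_taken.inject(true) { |acc, s| acc && s.empty? }  # false
```

In practice all? is both shorter and harder to get wrong.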

Fuzzy Comparison in Ruby/Rails

I was looking for some good options for fuzzy comparison in Rails.
Essentially, I have a set of strings that I'd like to compare against some strings in my database and I'd like to get the closest one if applicable. In this particular case, I'm not so interested in detecting letters out of order/mis-spellings, but rather the ability to ignore extraneous words (extra information, punctuation, words like: the, and, it etc) and pick out the best match. These strings will usually be somewhere between 2-7 words long.
What would you suggest is the best gem/method of doing that? I've looked at amatch (http://flori.github.com/amatch/doc/index.html) but I was wondering what else was out there.
Thanks!
Have a look and a play with Thinking Sphinx http://freelancing-god.github.com/ts/en/
I can heartily recommend it
There is also a superb Railscast on how to use it here
http://railscasts.com/episodes/120-thinking-sphinx
Otherwise use ARel - but you are going to have to implement your own fuzzy logic (Not something I'd recommend)
Have a look at the FuzzyMatch gem.
It may help you.
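For the specific case in the question (ignoring extraneous words and punctuation rather than catching misspellings), a rough baseline can be written with the standard library alone. To be clear, this is not how Thinking Sphinx, amatch or FuzzyMatch work; it is a simple token-overlap (Jaccard) score, and the stop-word list, the 0.5 threshold and the method names are all assumptions chosen for illustration.

```ruby
require 'set'

# Illustrative stop-word list; a real one would be longer.
STOP_WORDS = Set.new(%w[the and it a an of]).freeze

# Lowercase, drop punctuation, drop stop words.
def significant_words(text)
  text.downcase.scan(/[a-z0-9]+/).reject { |word| STOP_WORDS.include?(word) }.to_set
end

# Jaccard similarity: shared words divided by all distinct words.
def similarity(left, right)
  a = significant_words(left)
  b = significant_words(right)
  union = a | b
  return 0.0 if union.empty?

  (a & b).size.to_f / union.size
end

# Returns the closest candidate, or nil when nothing is close enough.
def best_match(query, candidates, threshold = 0.5)
  best = candidates.max_by { |candidate| similarity(query, candidate) }
  best if best && similarity(query, best) >= threshold
end

titles = ['Lord of the Rings', 'Game of Thrones', 'Harry Potter']
best_match('The Lord, of the Rings!', titles)  # "Lord of the Rings"
best_match('Star Wars', titles)                # nil
```

This loads every candidate into memory, so it only suits small sets; for anything larger one of the search engines mentioned above is the better fit.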
