Ruby: Case-Insensitive Array Comparison - ruby-on-rails

I just found out that this comparison is actually case-sensitive. Does anyone know a case-insensitive way of accomplishing the same comparison?
CardReferral.all.map(&:email) - CardSignup.all.map(&:email)

I don't think there is any "direct" way like the minus operator, but if you don't mind getting all your results in lowercase, you can do this:
CardReferral.all.map(&:email).map(&:downcase) - CardSignup.all.map(&:email).map(&:downcase)
Otherwise you'll have to manually do the comparison using find_all or reject:
signups = CardSignup.all.map(&:email).map(&:downcase)
referrals = CardReferral.all.map(&:email).reject { |e| signups.include?(e.downcase) }
I'd suggest that reading a reference on Ruby's standard types might help you come up with code like this. For example, "Programming Ruby 1.9" explains all the methods of the Enumerable module starting on page 487 (find_all is on page 489).
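The reject approach above can be tried without a database. The following is a minimal sketch on plain arrays; the helper name difference_ignoring_case and the sample addresses are made up for illustration, and a Set is used only so that each lookup does not rescan the whole signup list.

```ruby
require "set"

# Hypothetical helper: keep the elements of `left` whose downcased form
# does not appear (downcased) in `right`. Original casing is preserved.
def difference_ignoring_case(left, right)
  seen = right.map(&:downcase).to_set
  left.reject { |item| seen.include?(item.downcase) }
end

referrals = ["Alice@Example.com", "bob@example.com", "Carol@example.com"]
signups   = ["alice@example.com", "BOB@EXAMPLE.COM"]

remaining = difference_ignoring_case(referrals, signups)
puts remaining.inspect
```

Unlike the version that downcases both sides before subtracting, this keeps the remaining emails exactly as they were stored.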

Related

Ruby On Rails searching database for word

I am new to rails. I am trying to search a database in MySQL where the term I am searching may be one word in the column string. For example if the cell was "this is a very lovely day" then I would like to be able to call that object by searching for the word 'lovely'
Thank you.
You need to do a LIKE query (e.g. foo LIKE '%bar%'). The % is a wildcard: 'bar%' means "starts with bar" and '%bar%' means "contains bar". Note that "contains" searches cannot use column indexes and will be slow.
Suppose you had a Day class with the attribute description. In that case, you would do
Day.where("description LIKE '%lovely%'")
Or, using Arel:
days = Day.arel_table
Day.where(days[:description].matches("%lovely%"))
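If the search word comes from user input, any % or _ inside it would also act as a wildcard. The sketch below shows one way to build the "contains" pattern safely; contains_pattern is a hypothetical helper, and it assumes backslash is the LIKE escape character (the MySQL default). Rails has its own sanitize_sql_like for the same purpose.

```ruby
# Hypothetical helper: escape LIKE metacharacters in a user-supplied term,
# then wrap the result in % so it matches anywhere in the column.
# Assumes backslash is the LIKE escape character.
def contains_pattern(term)
  escaped = term.gsub(/[\\%_]/) { |char| "\\#{char}" }
  "%#{escaped}%"
end

puts contains_pattern("lovely")
puts contains_pattern("100%_sure")
```

The resulting string would then be passed as a bind parameter, for example Day.where("description LIKE ?", contains_pattern(word)), rather than interpolated into the SQL.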

Store regular expression in database

I need to store a regular expression related to other fields in a database table with ActiveRecord.
I found the to_s method in Regexp class which states
Returns a string containing the regular expression and its options
(using the (?opts:source) notation. This string can be fed back in to
Regexp::new to a regular expression with the same semantics as the
original. (However, Regexp#== may not return true when comparing the
two, as the source of the regular expression itself may differ, as the
example shows). Regexp#inspect produces a generally more readable
version of rxp.
So it seems to be a working solution, but it stores the expression in an unusual syntax, and to get the string to store I need to build it manually with /my-exp/.to_s. I also may not be able to edit the regexp directly. For instance, a simple regexp produces:
/foo/i.to_s # => "(?i-mx:foo)"
The other option is to eval the field content: I could store the plain expression in the db column and then call eval(record.pattern) to get the actual regexp. This works, and since I'm the only one responsible for managing the regexp records there should be no issues in doing that, except application bugs ;-)
Do I have other options? I'd prefer not to eval db fields, but on the other hand I don't want to work with a syntax I don't know.
Use serialize to store your regex 'as-is':
class Foo < ActiveRecord::Base
  serialize :my_regex, Regexp
end
See the API doc to learn more about this.
Not sure I understand your constraints exactly.
If you store a string in db, you could make a Regexp from it:
a = 'foo'
=> "foo"
/#{a}/
=> /foo/
Regexp.new('dog', Regexp::EXTENDED + Regexp::MULTILINE + Regexp::IGNORECASE)
=> /dog/mix
There are other constructors, see doc.
The best way to avoid eval'd code is to store the regexp source in a string column and its flags in a separate integer column. The regexp can then be built with:
record = Record.new pattern: 'foo', flags: Regexp::IGNORECASE
Regexp.new record.pattern, record.flags # => /foo/i
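To fill those two columns from an existing regexp, Regexp#source and Regexp#options give the string and the integer. The following is a small round-trip sketch using a plain hash in place of a database record; the helper names are made up for illustration.

```ruby
# Hypothetical helpers: split a Regexp into the two values to store,
# and rebuild it from them. A hash stands in for the database row.
def regexp_to_columns(regexp)
  { pattern: regexp.source, flags: regexp.options }
end

def regexp_from_columns(columns)
  Regexp.new(columns[:pattern], columns[:flags])
end

original = /fo+\d/im
columns  = regexp_to_columns(original)
rebuilt  = regexp_from_columns(columns)

puts columns.inspect
puts rebuilt == original
```

The stored pattern stays in the familiar syntax (fo+\d here), so it can be edited directly, and the flags column is just the sum of constants such as Regexp::IGNORECASE and Regexp::MULTILINE.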
You can use #{} within regular expressions to insert variables, so you could insert a carefully cleaned regexp by storing "foo" in the db under record.pattern as a string, and then evaluating it with:
/#{record.pattern}/
So, in the db, you would store:
"pattern"
in your code, you could do:
if record.other_field =~ /#{record.pattern}/
  # do something
end
This compiles the regexp from a dynamic string in the db that you can change, and allows you to use it in code. I wouldn't recommend it for security reasons though, see below:
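Be careful, though. Interpolation inside a string or regexp literal in your source code runs arbitrary Ruby, as the example below shows. A string that was only read from the db is inserted as text and is not evaluated again, so this is less dangerous than eval, but an untrusted pattern can still be invalid or extremely slow to match: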
a = "foo"
puts a
=> foo
b = "#{a = 'bar'}"
a =~ /#{b}/
puts a
=> bar
For security, it may be worth decomposing your regex tests into something you can map to methods written in your code, so that the db only stores keys for the constraints, something like:
'alpha,numeric' etc.
And then have hard-coded tests which you run depending on the keys stored. Perhaps look at rails validations for hints here, although those are stored in code, it's probably the best approach (generalise your requirements, and keep the code out of the db). Even if you don't think you need security now, you might want it later, or forget about this and grant access to someone malicious.
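A minimal sketch of that key-based idea follows. The check names, the CHECKS table, and the choice that a value passes when it satisfies any one of the listed keys are all assumptions made for illustration; the answer does not say how several keys should combine.

```ruby
# Hard-coded tests live in the code; the db column only holds key names.
CHECKS = {
  "alpha"   => /\A[a-zA-Z]+\z/,
  "numeric" => /\A\d+\z/
}.freeze

# Hypothetical helper: `keys_csv` is the stored column value, e.g. "alpha,numeric".
# Assumption: the value passes if ANY listed check matches.
def passes_any?(value, keys_csv)
  keys_csv.split(",").any? do |key|
    check = CHECKS.fetch(key.strip) { raise ArgumentError, "unknown check: #{key}" }
    check.match?(value)
  end
end

puts passes_any?("abc", "alpha,numeric")
puts passes_any?("a1", "alpha,numeric")
```

An unknown key raises instead of being ignored, so a typo in the db shows up immediately rather than silently disabling a check.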

Implement autocomplete on MongoDB

Say I have a collection of users and want to implement autocomplete on the usernames of those users. I looked at the mongodb docs and $regex seems to be one way to do this. Is there a better way? By better I mean more performant/better practice.
As suggested by @Thilo, you can use several ideas, including prefixing.
The most important thing is to have a very quick request (because you want autocomplete to feel instantaneous), so you have to use a query that makes proper use of indexes.
With a regexp: use /^prefix/ (the important part is the ^ anchoring the match to the beginning of the string, which is mandatory for the query to use the index).
A range query is good too: { $gt : 'jhc', $lt: 'jhd' }
More complicated but faster : you can store prefix-trees in mongo (aka tries) with entries like :
{usrPrefix : "anna", compl : ["annaconda", "annabelle", "annather"]}
{usrPrefix : "ann", compl : ["anne", "annaconda", "annabelle", "annather"]}
This last solution is very fast (if there are indexes on compl, of course) but not space efficient at all. You know the trade-off you have to choose.
We do it using regex and it's fast as long as you have an index and you use /^value/
Be aware you can't use the case insensitive option with an index, so you may want to store a lower case version of your string as another field in your document and use that for the autocomplete.
I've done tests with 3 million+ documents and it still appears instantaneous.
If you are looking for prefixes, you could use a range query (not sure about the exact syntax):
db.users.find({'username': { $gt : 'jhc', $lt: 'jhd' } } )
And you want an index on the username field.
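The upper bound of such a range ('jhd' for the prefix 'jhc') can be derived by incrementing the last character of the prefix. The sketch below builds the condition as a plain Ruby hash and checks it against an in-memory list instead of a real collection; prefix_range is a hypothetical helper, and it uses $gte rather than $gt on the assumption that a username equal to the prefix should also match.

```ruby
# Hypothetical helper: build a range condition that matches every string
# starting with `prefix`. The upper bound is the prefix with its last
# character bumped by one code point (not handled: a prefix ending in the
# highest possible code point).
def prefix_range(prefix)
  raise ArgumentError, "prefix must not be empty" if prefix.empty?
  upper = prefix[0...-1] + (prefix[-1].ord + 1).chr(Encoding::UTF_8)
  { "$gte" => prefix, "$lt" => upper }
end

range = prefix_range("jhc")
puts range.inspect

# Simulate the range query on an in-memory list of usernames.
usernames = %w[jhb jhc jhcarter jhczz jhd jim]
matches = usernames.select { |name| name >= range["$gte"] && name < range["$lt"] }
puts matches.inspect
```

For case-insensitive autocomplete, the same range would be applied to the lower-cased copy of the field mentioned above, with the prefix downcased first.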

How to sum all properties of a nested collection?

Given that I have User.attachments, and Attachment.visits is an integer holding the visit count,
how can I easily count all the visits of all of that user's images?
Use ActiveRecord::Base#sum:
user.attachments.sum(:visits)
This should generate an efficient SQL query like this:
SELECT SUM(attachments.visits) FROM attachments WHERE attachments.user_id = ID
user.attachments.map{|a| a.visits}.sum
There's also inject:
user.attachments.inject(0) { |sum, a| sum + a.visits }
People generally (and quite rightly) hate inject, but since the two other main ways of achieving this have been mentioned, I thought I may as well throw it out there. :)
The following works with Plain Old Ruby Objects, and I suspect the following is marginally faster than using count += a.visits, plus it has an emoticon in it:
user.attachments.map(&:visits).inject(:+)
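The in-memory variants above can be compared on plain Ruby objects. This is a small sketch with a Struct standing in for the Attachment model and made-up visit counts; it also shows one difference worth knowing: inject(:+) without a starting value returns nil for an empty collection.

```ruby
# A Struct stands in for the Attachment model (illustrative data only).
Attachment = Struct.new(:visits)
attachments = [Attachment.new(3), Attachment.new(0), Attachment.new(12)]

with_sum    = attachments.sum(&:visits)                           # Array#sum, Ruby 2.4+
with_inject = attachments.inject(0) { |total, a| total + a.visits }
with_symbol = attachments.map(&:visits).inject(:+)

puts [with_sum, with_inject, with_symbol].inspect

# With no elements and no initial value, inject(:+) gives nil, not 0.
empty_without_seed = [].inject(:+)
empty_with_seed    = [].inject(0, :+)
puts empty_without_seed.inspect
puts empty_with_seed.inspect
```

All of these load every attachment into memory first, which is why the SQL SUM version is usually preferable when the records come from ActiveRecord.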

Rails: A good search algorithm

I'm trying to return results that more closely resemble the search term.
My current algorithm is this:
def search_conditions(column, q)
  vars = []
  vars2 = []
  vars << q
  if q.size > 3
    (q.size-2).times do |i|
      vars2 << q[i..(i+2)]
      next if i == 0
      vars << q[i..-1]
      vars << q[0..(q.size-1-i)]
      vars << q[i % 2 == 0 ? (i/2)..(q.size-(i/2)) : (i/2)..(q.size-1-(i/2))] if i > 1
    end
  end
  query = "#{column} ILIKE ?"
  vars = (vars+vars2).uniq
  return [vars.map { query }.join(' OR ')] + vars.map { |x| "%#{x}%" }
end
If I search for "Ruby on Rails", it generates search terms in 4 ways:
1) Removing the left letters "uby on Rails".."ils"
2) Removing the right letters "Ruby on Rail".."Rub"
3) Removing left and right letters "uby on Rails", "uby on Rail" ... "on "
4) Using only 3 letters "Rub", "uby", "by ", "y o", " on" ... "ils"
Is it good to use these 4 ways? Are there any more?
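The fourth way (every run of 3 consecutive characters) can be written more directly with each_cons. This is only a sketch of that one step, not a replacement for search_conditions; the helper name trigrams is made up for illustration.

```ruby
# Hypothetical helper: every substring of 3 consecutive characters.
def trigrams(text)
  text.chars.each_cons(3).map(&:join)
end

grams = trigrams("Ruby on Rails")
puts grams.inspect
```

A 13-character query yields 11 such terms, so the OR clause grows with the length of the query even before the other three ways are added.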
Why are you removing these letters? Are you trying to make sure that if someone searches for 'widgets', you will also match 'widget'?
If so, what you are trying to do is called 'stemming', and it is really much more complicated than removing leading and trailing letters. You may also be interested in removing 'stop words' from your query. These are those extremely common words that are necessary to form grammatically-correct sentences, but are not very useful for search, such as 'a', 'the', etc.
Getting search right is an immensely complex and difficult problem. I would suggest that you don't try to solve it yourself, and instead focus on the core purpose of your site. Perhaps you can leverage the search functionality from the Lucene project in your code. This link may also be helpful for using Lucene in Ruby on Rails.
I hope that helps; I realize that I sort of side-stepped your original question, but I really would not recommend trying to tackle this yourself.
As pkaeding says, stemming is far too complicated to try to implement yourself. However, if you want to search for similar (not exact) strings in MySQL, and your user search terms are very close to the full value of a database field (ie, you're not searching a large body of text for a word or phrase), you might want to try using the Levenshtein distance. Here is a MySQL implementation.
The Levenshtein algorithm will allow you to do "fuzzy" matching, give you a similarity score, and help you avoid installation and configuration of a search daemon, which is complicated. However, this is really only for a very specific case, not a general site search.
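To get a feel for the scores, here is a minimal Ruby sketch of the Levenshtein distance (the standard two-row dynamic-programming version). It is shown only to illustrate the idea; it is not the MySQL implementation referred to above, and for real queries the comparison would need to run inside the database.

```ruby
# Levenshtein distance: the minimum number of single-character insertions,
# deletions, and substitutions needed to turn `a` into `b`.
def levenshtein(a, b)
  previous = (0..b.length).to_a            # distances from "" to each prefix of b
  a.each_char.with_index(1) do |char_a, i|
    current = [i]                          # distance from a[0, i] to ""
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [
        previous[j] + 1,                   # delete char_a
        current[j - 1] + 1,                # insert char_b
        previous[j - 1] + cost             # substitute (or keep when equal)
      ].min
    end
    previous = current
  end
  previous.last
end

puts levenshtein("widget", "widgets")
puts levenshtein("kitten", "sitting")
```

A lower number means a closer match, so results can be ordered by distance and cut off at a small threshold.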
While we're all suggesting other possible solutions, check out:
Sphinx - How do you implement full-text search for that 10+ million row table, keep up with the load, and stay relevant? Sphinx is good at those kinds of riddles.
Thinking Sphinx - A Ruby connector between Sphinx and ActiveRecord.
