How to make my DB query method in Rails more efficient - ruby-on-rails

I am querying my PostgreSQL DB. My app has Articles, and an Article can have any number of Hashtags. Those relations are stored in a join table between Hashtags and Articles.
I have a working method that returns either the Articles that have certain hashtags, or all Articles that do not have those hashtags:
def test(hashtags, include = true)
  articles = []
  hashtags.split(' ').each do |h|
    articles += Article.joins(:hashtags).where('LOWER(hashtags.value) LIKE LOWER(?)', "#{h}")
  end
  if include
    articles.uniq
  else
    (Article.all.to_set - articles.uniq.to_set).to_a
  end
end
I could call it like this:
test("politics people china", true)
And it would give me all Articles that have at least one of those hashtags.
Or I could call it like this:
test("politics people china", false)
And it would give me all Articles EXCEPT those that have one of these hashtags.
It works well, but I don't think this is very efficient as I do so much in Ruby and not on DB level.
I tried this:
def test2(hashtags, include = true)
  articles = []
  pattern = ''
  hashtags.split(' ').each do |h|
    pattern += "#{h}|"
  end
  pattern = '(' + pattern[0...-1] + ')'
  if include
    articles = Article.joins(:hashtags).where('hashtags.value ~* ?', "#{pattern}")
  else
    articles = Article.joins(:hashtags).where('hashtags.value !~* ?', "#{pattern}")
  end
  articles.uniq
end
But it does not behave the way I expected. First of all, if I call it like this:
test2("politics china", true)
it gives me not only the Articles that have the hashtag politics or china, but also all Articles that have a hashtag containing any one of the letters in politics or china, as if the pattern were:
(p|o|l|i|t|c|s|h|n|a)
The pattern that is actually built, as I can see in the console, is the one I intended:
(politics|china)
so I find this behaviour strange.
And with
test2("politics", false)
it only gives me Articles that have one or more hashtags associated with them, but leaves out the ones that have no hashtags at all.
Can someone help me make my working method more efficient?
EDIT:
Here is my updated code, as suggested in an answer:
def test2(hashtags, include = false)
  hashtags =
    if include
      Hashtag.where("LOWER(value) iLIKE ANY ( array[?] )", hashtags)
    else
      Hashtag.where("LOWER(value) NOT iLIKE ANY ( array[?] )", hashtags)
    end
  Slot.joins(:hashtags).merge(hashtags).distinct
end
Unfortunately, it still does not give me the Articles that have NO hashtags at all when include is false.

You are right about this:
I don't think this is very efficient as I do so much in Ruby and not on DB level.
ActiveRecord works nicely for simple queries, but when things get complex it's reasonable to use plain SQL. So let's try to build a query that matches your test cases:
1) For this call test("politics people china", true) the query may look like:
SELECT DISTINCT ON (AR.id) AR.*
FROM articles AR
JOIN articles_hashtags AHSH ON AHSH.article_id = AR.id
JOIN hashtags HSH ON HSH.id = AHSH.hashtag_id
WHERE LOWER(HSH.value) IN ('politics', 'people', 'china')
ORDER BY AR.id;
(I'm not sure how your join table is named, so I'm assuming it is articles_hashtags.)
Plain and simple: we take data from the articles table using two inner joins with articles_hashtags and hashtags, plus a WHERE condition that filters on the hashtags we want to see; the result is all articles with those hashtags. It doesn't matter how many hashtags we filter on: the IN clause works even if there is only one hashtag in the list.
Please note DISTINCT ON: it's necessary for removing duplicate articles from the result set, in case the same article has more than one hashtag from the given list.
2) For the call test("politics people china", false) the query is a bit more complex. It needs to exclude articles which have the given hashtags. Hence it should return articles that only have other hashtags, as well as articles without any hashtags at all. To keep things simple, we can reuse the previous query as a subquery:
SELECT A.*
FROM articles A
WHERE A.id NOT IN (
SELECT DISTINCT ON (AR.id) AR.id
FROM articles AR
JOIN articles_hashtags AHSH ON AHSH.article_id = AR.id
JOIN hashtags HSH ON HSH.id = AHSH.hashtag_id
WHERE LOWER(HSH.value) IN ('politics', 'people', 'china')
ORDER BY AR.id
);
Here we're fetching all articles except those that have any of the given hashtags.
3) Converting these queries to a Ruby method gives us the following:
def test3(hashtags, include = true)
  # code guard to prevent SQL-error when there are no hashtags given
  if hashtags.nil? || hashtags.strip.blank?
    return include ? [] : Article.all.to_a
  end
  basic_query = "
    SELECT DISTINCT ON (AR.id) AR.*
    FROM #{Article.table_name} AR
    JOIN articles_hashtags AHSH ON AHSH.article_id = AR.id
    JOIN #{Hashtag.table_name} HSH ON HSH.id = AHSH.hashtag_id
    WHERE LOWER(HSH.value) IN (:hashtags)
    ORDER BY AR.id"
  query = if include
    basic_query
  else
    "SELECT A.*
    FROM #{Article.table_name} A
    WHERE A.id NOT IN (#{basic_query.sub('AR.*', 'AR.id')})"
  end
  hashtag_arr = hashtags.split(' ').map(&:downcase) # to convert hashtags string into a list
  Article.find_by_sql [query, { hashtags: hashtag_arr }]
end
The method above will return an array of articles matching your conditions, empty or not.
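As a possible variation on the method above, the same include/exclude pair can be written with EXISTS / NOT EXISTS, which needs neither DISTINCT ON nor the string substitution. This is only a sketch: the helper name hashtag_filter_query is hypothetical, and the table names (articles, articles_hashtags, hashtags) are the same assumptions as in the queries above. It only builds the SQL and the bind hash, so it can be inspected without a database:

```ruby
# Hypothetical helper: returns the [sql, binds] pair that find_by_sql accepts.
# Table names are assumptions carried over from the answer above.
def hashtag_filter_query(hashtags, include: true)
  names = hashtags.to_s.split.map(&:downcase).uniq
  return nil if names.empty? # the caller decides what "no hashtags given" means

  operator = include ? "EXISTS" : "NOT EXISTS"
  sql = <<~SQL
    SELECT A.*
    FROM articles A
    WHERE #{operator} (
      SELECT 1
      FROM articles_hashtags AHSH
      JOIN hashtags HSH ON HSH.id = AHSH.hashtag_id
      WHERE AHSH.article_id = A.id
        AND LOWER(HSH.value) IN (:hashtags)
    )
    ORDER BY A.id
  SQL
  [sql, { hashtags: names }]
end

sql, binds = hashtag_filter_query("Politics people china", include: false)
puts sql
p binds
```

The returned pair could then be passed along as Article.find_by_sql([sql, binds]). Because the subquery is correlated on A.id, each article appears at most once, and articles without any hashtags fall into the NOT EXISTS branch.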

Try this:
def test(hashtags, include = true)
  hashtags =
    if include
      Hashtag.where("LOWER(value) iLIKE ANY ( array[?] )", hashtags)
    else
      Hashtag.where("LOWER(value) NOT iLIKE ANY ( array[?] )", hashtags)
    end
  Article.joins(:hashtags).merge(hashtags).distinct
end
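A note on the exclusion branch: a condition applied to the joined hashtag rows filters rows, not articles. An article that has an excluded hashtag and some other hashtag still has a surviving row, and an article with no hashtags has no rows in an inner join at all. The plain-Ruby sketch below imitates both behaviours on made-up in-memory data (the ids and tag values are purely illustrative), so the difference can be seen without a database:

```ruby
# In-memory stand-in for the join: article id => hashtag values (made-up data).
ARTICLE_TAGS = {
  1 => ["politics", "sports"],
  2 => ["sports"],
  3 => [] # an article with no hashtags at all
}.freeze

EXCLUDED = ["politics"].freeze

# Row-level filter: what an inner join plus "value NOT IN (...)" keeps.
# An article survives as soon as one of its hashtag rows is not excluded.
def row_level_exclude(article_tags, excluded)
  article_tags.select { |_id, tags| tags.any? { |t| !excluded.include?(t) } }.keys
end

# Set-level filter: all articles minus those having any excluded hashtag.
def set_level_exclude(article_tags, excluded)
  tagged = article_tags.select { |_id, tags| tags.any? { |t| excluded.include?(t) } }.keys
  article_tags.keys - tagged
end

p row_level_exclude(ARTICLE_TAGS, EXCLUDED)
p set_level_exclude(ARTICLE_TAGS, EXCLUDED)
```

The set-level form is what the question asks for. In ActiveRecord (Rails 4 and later) it is commonly expressed as Article.where.not(id: matching.select(:id)), where matching is the include-style relation; that line is untested against this schema.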

Related

CONTAIN or LIKE sql statement for ActiveRecord has_and_belongs_to_many relationship

I have 2 ActiveRecords: Article and Tag, in a many to many relationship. Basically I want to know how to select with a CONTAINS or LIKE condition, ie. to define a condition on a many to many relationship to contain a specified subset within an array.
The code structure I am trying to work out is as follows:
tag_names = ["super", "awesome", "dope"]
tags = Tag.where("name IN (?)", tag_names)
# The following is my non-working code to illustrate
# what I'm trying to do:
articles = Article.where("tags CONTAINS (?)", tags)
articles = Article.joins(:tags).where("articles.tags CONTAINS (?)", tags)
If the data lives in different tables, you have to use joins and then specify a condition on the joined table:
Article.joins(:tags).where('tags.id IN (?)', tag_ids)
If you want more flexibility in your queries, you could also use Arel and write something like the following:
tags = Tag.arel_table
tag_ids = Tag.where(tags[:name].matches("%#{some_tag}%")).pluck(:id)
Article.joins(:tags).where(tags[:id].in(tag_ids))
You can read more about matches in this answer.
I prefer Arel conditions over pure String or even Hash conditions.
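One caveat when interpolating user input into a LIKE pattern such as "%#{some_tag}%": the characters % and _ inside the input act as wildcards. The sketch below shows, in plain Ruby, what escaping them looks like, assuming the database uses backslash as the LIKE escape character (the default in PostgreSQL). The helper names escape_like and contains_pattern are hypothetical; ActiveRecord ships its own sanitize_sql_like for this purpose.

```ruby
# Hypothetical helper: escape LIKE wildcards (% and _) and the backslash itself.
def escape_like(term)
  term.gsub(/[\\%_]/) { |ch| "\\#{ch}" }
end

# Wraps the escaped term so it matches anywhere inside the column value.
def contains_pattern(term)
  "%#{escape_like(term)}%"
end

puts contains_pattern("50%_off")
```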

Ruby on Rails search 2 models

Right... I've spent 3 days trying to do this myself, to no avail.
I have 2 models called Film and Screenings. Screenings belongs_to Film, Film has_many Screenings.
The Film has certain attributes(:title, :date_of_release, :description, :genre).
The Screening has the attributes(:start_time, :date_being_screened, :film_id(foreign key of Film)).
What I am trying to do is create a Search against both of these models.
I want to do something like this...
@films = Film.advanced_search(params[:genre], params[:title], params[:start_time], params[:date_showing])
And then in the Film model...
def self.advanced_search(genre, title, start_time, date)
search_string = "%" + title + "%"
self.find(:all, :conditions => ["title LIKE ? OR genre = ? OR start_time LIKE ? OR date_showing = ?", title, genre, start_time, date], order: 'title')
end
end
I don't think this could ever work quite like this, but I'm hoping my explanation is detailed enough for anyone to understand what I'm TRYING to do?? :-/
Thanks for any help guys
I would extract the search capability into a separate (non-ActiveRecord) class, such as AdvancedSearch as it doesn't neatly fit into either the Film or Screening class.
Rather than writing a complex SQL query, you could just search the films, then the screenings, and combine the results, for example:
class AdvancedSearch
  def self.search
    film_matches = Film.advanced_search(...) # return an Array of Film objects
    screening_matches = Screening.advanced_search(...) # return an Array of Screening objects
    # combine the results
    results = film_matches + screening_matches.map(&:film)
    results.uniq # may be necessary to remove duplicates
  end
end
Update
Let's say your advanced search form has two fields - Genre and Location. So when you submit the form, the params sent are:
{ :genre => 'Comedy', :location => 'London' }
Your controller would then look something like:
def advanced_search(params)
  film_matches = Film.advanced_search(:genre => params[:genre])
  screening_matches = Screening.advanced_search(:location => params[:location])
  # remaining code as above
end
i.e. you're splitting the params, sending each to a different model to run a search, and then combining the results.
This is essentially an OR match: it would return films that match the genre or are being screened at the specified venue. (If you wanted an AND match, you would need to work out the array intersection.)
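The difference between the OR match and the AND match comes down to array union versus array intersection. A minimal plain-Ruby sketch, using a hypothetical FilmRow struct and made-up titles in place of real Film records:

```ruby
# Hypothetical stand-in for Film records; Struct gives value-based equality.
FilmRow = Struct.new(:id, :title)

airplane = FilmRow.new(1, "Airplane!")
paddington = FilmRow.new(2, "Paddington")
skyfall = FilmRow.new(3, "Skyfall")

film_matches = [airplane, paddington]      # e.g. films matching the genre
screening_films = [paddington, skyfall]    # e.g. films screened at the location

either = (film_matches + screening_films).uniq # OR match: union, duplicates removed
both = film_matches & screening_films          # AND match: intersection

p either.map(&:title)
p both.map(&:title)
```

With real ActiveRecord objects the same operators apply, since records of the same class compare equal when their ids match.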
I wanted to write something, but this Railscast says it all: http://railscasts.com/episodes/111-advanced-search-form
Almost the same case as yours.

How do I combine ActiveRecord results from multiple has_many :through queries?

Basically, I have an app with a tagging system and when someone searches for tag 'badger', I want it to return records tagged "badger", "Badger" and "Badgers".
With a single tag I can do this to get the records:
@notes = Tag.find_by_name(params[:tag_name]).notes.order("created_at DESC")
and it works fine. However if I get multiple tags (this is just for upper and lower case - I haven't figured out the 's' bit either yet):
Tag.find(:all, :conditions => [ "lower(name) = ?", 'badger'])
I can't use .notes.order("created_at DESC") because there are multiple results.
So, the question is.... 1) Am I going about this the right way? 2) If so, how do I get all my records back in order?
Any help much appreciated!
One implementation would be:
@notes = []
Tag.find(:all, :conditions => [ "lower(name) = ?", 'badger']).each do |tag|
  @notes += tag.notes # += keeps @notes a flat array of notes
end
@notes = @notes.sort_by { |note| note.created_at }.reverse # newest first, as in the question
However, you should be aware that this is what is known as an N + 1 query: it makes one query in the outer section, and then one query per result. This can be optimized by changing the first query to:
Tag.find(:all, :conditions => [ "lower(name) = ?", 'badger'], :include => :notes).each do |tag|
If you are using Rails 3 or above, it can be rewritten slightly:
Tag.where("lower(name) = ?", "badger").includes(:notes).each do |tag|
Edited
First, get an array of all possible tag names, plural, singular, lower, and upper
tag_name = params[:tag_name].to_s.downcase
possible_tag_names = [tag_name, tag_name.pluralize, tag_name.singularize].uniq
# It's probably faster to search for both lower and capitalized tags than to use the db's `lower` function
possible_tag_names += possible_tag_names.map(&:capitalize)
Are you using a tagging library? I know that some provide a method for querying multiple tags. If you aren't using one of those, you'll need to do some manual SQL joins in your query (assuming you're using a relational db like MySQL, Postgres or SQLite). I'd be happy to assist with that, but I don't know your schema.

optimizing select query on has_many :through attributes association

I want to find all posts that are tagged with tags that are passed in a params array.
post has many tags through association.
currently my code looks like this:
if params.has_key?(:tags)
  params[:tags].each do |tag|
    @tags = Array.new if @tags.nil?
    @tag = Tag.find_by_content(tag)
    @tags << @tag if @tag
  end
  @allposts = Post.followed_by(@user).select { |p| p.tags.size != 0 && (p.tags & @tags).size == p.tags.size }
else
  @allposts = Post.followed_by(@user)
end
What I'm basically doing is finding the actual tag models according to the params array and putting them into an array; then I run a select over all posts, searching for those with the same tags array.
Is there a better and cleaner way to do this?
You can roll your Tag.find query into a single request to the DB, and add an appropriate where clause to limit the posts returned:
finder = Post.followed_by(@user)
if params.has_key?(:tags)
  @tags = Tag.where(:content => params[:tags])
  finder = finder.with_tags(@tags)
end
@allposts = finder.all
In app/models/post.rb:
scope :with_tags, lambda { |tags| joins(:tags).group('posts.id').where(:tags => { :id => tags.map { |t| t.id } } ).having("COUNT(*) = ?", tags.length) }
UPDATE
Here's what the with_tags scope does:
1) joins(:tags): First we join the tags table to the posts table. Rails will do this with an inner join when you use the symbol syntax.
2) where(:tags => { :id => tags.map { |t| t.id } } ): We want to filter the tags to find only those tags provided. Since we are providing a list of tag objects, we use map to generate an array of IDs. This array is then used in the where clause to create a WHERE field IN (list) query. The hash-within-a-hash syntax denotes the table, then the column within the table.
3) group('posts.id'): We now have a list of posts with the requisite tags. However, if there are multiple tags, posts will be listed multiple times (once for each matched tag), so we group by posts.id so that only one row is returned for each post. Grouping is also required so that we can do the count in step 4.
4) having("COUNT(*) = ?", tags.length): This is the final piece of the puzzle. Now that we've grouped by post, we can count the number of matched tags associated with each post. As long as duplicate tags are not allowed, if the number of matched tags (COUNT(*)) is the same as the number of tags we were searching with (tags.length), then we can be sure that the post has all the tags we were searching with.
You can find a lot more information about the different query methods available for models by reading the Active Record Query Interface Guide
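To make the where / group / having steps concrete, here is a plain-Ruby imitation of the same logic on made-up join rows. The method name post_ids_with_all_tags and the ids are hypothetical; each pair stands for one row of the posts/tags join:

```ruby
# Each pair is one row of the posts/tags join: [post_id, tag_id] (made-up ids).
JOIN_ROWS = [[1, 10], [1, 11], [2, 10], [3, 12]].freeze

# Returns the ids of posts that carry every one of the wanted tags.
def post_ids_with_all_tags(join_rows, wanted_tag_ids)
  wanted = wanted_tag_ids.uniq
  join_rows
    .select { |_post_id, tag_id| wanted.include?(tag_id) }  # WHERE tags.id IN (...)
    .group_by { |post_id, _tag_id| post_id }                # GROUP BY posts.id
    .select { |_post_id, rows| rows.size == wanted.size }   # HAVING COUNT(*) = n
    .keys
end

p post_ids_with_all_tags(JOIN_ROWS, [10, 11])
p post_ids_with_all_tags(JOIN_ROWS, [10])
```

As in the SQL version, the count comparison is only reliable when a post cannot be linked to the same tag twice.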

Rails searching with multiple conditions (if values are not empty)

Let's say I have a model Book with a field word_count, amongst potentially many other similar fields.
What is a good way for me to string together conditions in an "advanced search" of the database? In the above example, I'd have a search form with boxes for "word count between ___ and ___". If a user fills in the first box, then I want to return all books with word count greater than that value; likewise, if the user fills in the second box, then I want to return all books with word count less than that value. If both values are filled in, then I want to return word counts within that range.
Obviously if I do
Book.where(:word_count => <first value>..<second value>)
then this will break if only one of the fields was filled in. Is there any way to handle this problem elegantly? Keep in mind that there may be many similar search conditions, so I don't want to build separate queries for every possible combination.
Sorry if this question has been asked before, but searching the site hasn't yielded any useful results yet.
How about something like:
@books = Book
@books = @books.where("word_count >= ?", values[0]) if values[0].present?
@books = @books.where("word_count <= ?", values[1]) if values[1].present?
ActiveRecord will chain the where clauses.
The only problem is that if both values are given and values[0] is greater than values[1], the query will not return anything.
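One way to handle both the blank fields and the reversed-bounds case is to normalise the two inputs before building any where clauses. The sketch below is plain Ruby and only produces [sql, value] pairs; the helper names parse_bound and word_count_conditions are hypothetical, and it assumes the form sends non-negative whole numbers as strings:

```ruby
# Hypothetical helper: turn a form value into an Integer, or nil if blank/invalid.
def parse_bound(value)
  text = value.to_s.strip
  text.match?(/\A\d+\z/) ? text.to_i : nil
end

# Builds the conditions for whichever bounds were filled in.
def word_count_conditions(first, last)
  min = parse_bound(first)
  max = parse_bound(last)
  min, max = max, min if min && max && min > max # tolerate reversed input

  conditions = []
  conditions << ["word_count >= ?", min] if min
  conditions << ["word_count <= ?", max] if max
  conditions
end

p word_count_conditions("500", "100")
p word_count_conditions("", "300")
```

The pairs could then be folded into a relation, for example with conditions.inject(Book) { |scope, (sql, value)| scope.where(sql, value) }, which leaves the relation unfiltered when both fields are empty.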
For our advanced searching we create a filter object which encapsulates the ActiveRecord queries in simple methods. It was originally based on this Thoughtbot post.
A book filter could look something like this:
class BookFilter
  def initialize
    @relation = Book.scoped
  end
  def restrict(r)
    minimum_word_count!(r[:first]) if r[:first].present?
    maximum_word_count!(r[:second]) if r[:second].present?
    recent! if r.try(:[], :recent) == '1'
    @relation
  end
  protected
  def recent!
    where('created_at > ? ', 1.week.ago)
  end
  def minimum_word_count!(count)
    where('word_count >= ? ', count)
  end
  def maximum_word_count!(count)
    where('word_count <= ?', count)
  end
  def where(*a)
    @relation = @relation.where(*a)
  end
end
# to use
books = BookFilter.new.restrict(params)
Take a look at the ransack gem, which is the successor to the meta_search gem, which still seems to have the better documentation.
If you do want to roll your own, there's nothing preventing you from chaining clauses using the same attribute:
scope = Book
scope = scope.where("word_count >= ?", params[:first]) if params[:first]
scope = scope.where("word_count <= ?", params[:last]) if params[:last]
But it's really not necessary to roll your own search; there are plenty of ready-made solutions available, such as the gems above.
