In Rails - How to have one query that has multiple queries? - ruby-on-rails

I have 3 models here:
projects
threads (project_id)
thread_participations (thread_id, read boolean)
Right now I have a list of the user's projects, and the list shows how many threads are unread per project. The huge problem here is that if the user has several projects (which all users do) it causes the DB to get hit with several queries, one per project.
I would like to use Rails to build a query that, with one DB hit, returns an unread count for each of the user's projects.
Here's what I use today in the view:
<% @projects.each_with_index do |project, i| %>
<%=project %>: <%= Thread.unread(current_user,project).count %>
<% end %>
And in the thread Model:
scope :unread, lambda { |user,project|
includes(:project,:thread_participations).where(:project_id => project.id, :thread_participations => {:read => false, :user_id => user.id})
}
Any suggestions on how to do this? Also which model should this live in? Maybe the user's model since it is not project or thread specific?
Thanks

There are a couple of ways to structure this query, but here is one.
You can perform this in a single query and then loop over the results. I would first create a scope on thread participations for unread for a certain user. Then use the scope and include all threads and projects, group by the project id (so that you are getting unread threads for that project) and then count the number of unread threads by counting threads.id:
class ThreadParticipation < ActiveRecord::Base
  scope :unread, lambda { |user| where(user_id: user.id, read: false) }
end

ThreadParticipation
  .unread(current_user)
  .includes(:thread => :project)
  .group('projects.id')
  .count('threads.id')
# => { 10 => 15, 11 => 10 }
# project(10) has 15 unread threads and project(11) has 10 unread threads
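Once the grouped count is in hand, the view only needs a hash lookup per project. The following is a minimal plain-Ruby sketch of that lookup, assuming the hash has the shape shown above; the `Project` struct and the project names are hypothetical stand-ins for real records.

```ruby
# Hypothetical stand-in for a Project record; only id and name are needed here.
Project = Struct.new(:id, :name)

# Shape of the hash returned by the grouped count (project id => unread count).
unread_counts = { 10 => 15, 11 => 10 }

projects = [
  Project.new(10, "Alpha"),
  Project.new(11, "Beta"),
  Project.new(12, "Gamma") # no unread threads, so absent from the hash
]

# fetch with a default of 0 covers projects that never appear in the
# grouped result because they have no unread threads.
lines = projects.map do |project|
  "#{project.name}: #{unread_counts.fetch(project.id, 0)}"
end

puts lines
```

The default matters because a grouped count only returns rows for projects that have at least one matching thread.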

Related

Optimize query in Rails

I have the following code:
results = Report.where(:car => 'xxxx').group(:date, :name, :car).select('date, name ,car, info, MAX(price) AS max_price')
for customer in customers
result = results.where(:date => customer.date, :name => customer.name, :car => customer.car).first
.... rest of the code ....
end
I have a database with many records (~20,000), so I want to optimize the code and cache the results in memory.
Once again: my overall intention is to make this code more efficient in terms of time. I want it to run faster than it does now, and I want to reduce the number of database calls.
I am thinking of making my initial results object an array. I have a remote database, so each .where query takes some time. When I make results an array by adding .to_a, I load it into memory. So I think it should be better (but I'm not really sure).
Something like:
results = Report.where(:car => 'xxxx').group(:date, :name, :car)
.select('date, name ,car, info, MAX(price) AS max_price')
.to_a
for customer in customers
  # find returns the first match directly and avoids shadowing the outer variable
  result = results.find { |r| r.date == customer.date && r.name == customer.name && r.car == customer.car }
end
Well, the best thing would be to have an association that fetches all reports for customers. If you cannot do that, I would recommend making only one query instead of n+1 (as in the question), like this:
results = Report.where(:car => 'xxxx').group(:date, :name, :car)
.select('date, name ,car, info, MAX(price) AS max_price')
.where(:date => customers.map(&:date), :name => customers.map(&:name), :car => customers.map(&:car))
Assuming customers is an array of objects which respond to :name, :car, and :date methods.
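One thing that should be noted is that this does not guarantee each fetched report belongs to an exact customer: the three conditions are matched independently, so a report with one customer's date and another customer's name would also be returned. To pair reports with customers, you'd have to verify the match by iterating through the results object yourself.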
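One way to do that verification without scanning the whole array per customer is to index the results by the composite key once. This is a plain-Ruby sketch under stated assumptions: the `Report` and `Customer` structs and the sample rows are hypothetical stand-ins for the records in the question.

```ruby
require "date"

# Hypothetical stand-ins for the ActiveRecord objects in the question.
Report   = Struct.new(:date, :name, :car, :max_price)
Customer = Struct.new(:date, :name, :car)

results = [
  Report.new(Date.new(2013, 1, 1), "Ann", "xxxx", 900),
  # Matches Bob's date and Ann's name, so the combined query returns it,
  # but it belongs to neither customer.
  Report.new(Date.new(2013, 1, 2), "Ann", "xxxx", 700),
  Report.new(Date.new(2013, 1, 2), "Bob", "xxxx", 500)
]

customers = [
  Customer.new(Date.new(2013, 1, 1), "Ann", "xxxx"),
  Customer.new(Date.new(2013, 1, 2), "Bob", "xxxx")
]

# Build the index once: [date, name, car] => report.
by_key = results.each_with_object({}) do |report, index|
  index[[report.date, report.name, report.car]] = report
end

# Each customer lookup is now a single hash access instead of a scan.
matched = customers.map do |customer|
  by_key[[customer.date, customer.name, customer.car]]
end

matched.each { |report| puts "#{report.name}: #{report.max_price}" }
```

Customers with no matching report get `nil` from the lookup, so real code would need to handle that case.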

Rails App Movies API - Heroku

I have a Rails Movie App which will potentially use APIs from other websites to fill my database.
I am using the TMDB.org API (I already have an API Key) to extract, for now lets say the Title and Description of a movie.
How would I go about extracting information from the tmdb site into my site and displaying it with <%= @movie.title %> and <%= @movie.description %>?
The information taken will need to be placed in my PostgreSQL Database using Heroku
So that's one question.
The other question is how would I do this without running a method for every movie in the TMDB database
For Example, using the Ruby-TMDB Gem, instead of running
TmdbMovie.find(:title => "The Social Network", :limit => 10, :expand_results => true, :language => "en")
and
TmdbMovie.find(:title => "The Dark Knight Rises ", :limit => 10, :expand_results => true, :language => "en")
for every movie I want in my database (which is every movie in the TMDB Database), what would I run to GET ALL Movies.
And then Display them in my movies :show page
To sum it all up, how do I get database information from TMDB.org into my Rails app and display information from a TMDB Movie :show page on my movie :show page, using the Ruby-TMDB Gem on Heroku? Would there be a rake task, and if so, what would it be?
Many Thanks!
There are 2 problems you wish to tackle.
How and where in my rails app should I pull the data?
Yeah this can be accomplished with rake. Add a new rake file lib/tasks/tmdb.rake
namespace :db do
  task :pull_tmdb_data => :environment do
    Tmdb.api_key = "t478f8de5776c799de5a"
    # set up your default language
    Tmdb.default_language = "en"
    # find a movie by id
    @movie = TmdbMovie.find(id: 123)
    Movie.create title: @movie.title, description: @movie.description
    # find movies in groups
    @movies = TmdbMovie.find(:title => 'Iron Man')
    @movies.each do |movie|
      Movie.create title: movie.title, description: movie.description
    end
  end
end
Now, every time you want to populate your db from tmdb, you can simply run rake db:pull_tmdb_data
What are the necessary queries to pull all the movies from tmdb?
From a glance at their API, there is no shortcut for duplicating the database. If you want to duplicate it, your best bet may be to contact them directly. You can brute-force it by trying every possible movie id, but beware that they do throttle you. Below is a quote from their website.
We do enforce a small amount of rate limiting. Please be aware that should you exceed these limits, you will receive a 503 error.
30 requests every 10 seconds per IP
Maximum 20 simultaneous connections
It may be worth considering whether you really need to duplicate tmdb. As tmdb adds new movies and fixes errors in its data, your copy will become outdated, and you will face a slew of data integrity issues that will be hard to resolve.
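If you do go the brute-force route, the requests need to be paced to stay under the quoted limit of 30 requests every 10 seconds. Below is a rough sliding-window sketch in plain Ruby. The `Throttle` class is hypothetical and is not part of the Ruby-TMDB gem; the clock and sleeper are injectable only so the behaviour can be checked without real waiting.

```ruby
# Hypothetical helper, not part of any gem: allows at most `limit` calls
# in any window of `period` seconds, sleeping when the window is full.
class Throttle
  def initialize(limit:, period:,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 sleeper: ->(seconds) { sleep(seconds) })
    @limit   = limit
    @period  = period
    @clock   = clock
    @sleeper = sleeper
    @stamps  = [] # start times of the calls inside the current window
  end

  def call
    loop do
      now = @clock.call
      # Forget calls that have left the window.
      @stamps.reject! { |stamp| now - stamp >= @period }
      break if @stamps.size < @limit
      # Window is full: wait until the oldest call expires, then re-check.
      @sleeper.call(@period - (now - @stamps.first))
    end
    @stamps << @clock.call
    yield
  end
end

# Usage: wrap each API request in the throttle. The block here just returns
# the id; in the rake task it would be the TmdbMovie.find call.
throttle = Throttle.new(limit: 30, period: 10)
fetched = (1..5).map { |id| throttle.call { id } }
puts fetched.inspect
```

This only addresses the per-window request count for a single process; the simultaneous-connection limit is not a concern as long as the requests are made sequentially.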

Cleaning up controllers to speed up application

So in my app I have notifications and different record counts that are used in the overall layout, and are therefore needed on every page.
Currently in my application_controller I have a lot of things like such:
@status_al = Status.find_by_name("Alive")
@status_de = Status.find_by_name("Dead")
@status_sus = Status.find_by_name("Suspended")
@status_hid = Status.find_by_name("Hidden")
@status_arc = Status.find_by_name("Archived")
@balloon_active = Post.where(:user_id => current_user.id, :status_id => @status_al.id )
@balloon_dependent = Post.where(:user_id => current_user.id, :status_id => @status_de.id )
@balloon_upcoming = Post.where(:user_id => current_user.id, :status_id => @status_sus.id )
@balloon_deferred = Post.where(:user_id => current_user.id, :status_id => @status_hid.id )
@balloon_complete = Post.where(:user_id => current_user.id, :status_id => @status_arc.id )
..
That's really just a small piece; I have at least double this with similar calls. The issue is that I need these numbers on pretty much every page, but I feel like I'm hitting the DB way too many times here.
Any ideas for a better implementation?
Scopes
First off, you should move many of these into scopes, which will allow you to use them in far more flexible ways, such as chaining queries using ActiveRecord. See http://edgerails.info/articles/what-s-new-in-edge-rails/2010/02/23/the-skinny-on-scopes-formerly-named-scope/index.html.
Indexes
Second, if you're doing all these queries anyway, make sure you index your database to, for example, find Status quickly by name. A sample migration to accomplish the first index:
add_index :statuses (or the name of your Status table), :name
Session
If the data you need here is not critical, i.e. you don't need to rely on it for further calculations or database updates, you could consider storing some of this data in the user's session. If you do so, you can simply read whatever you need from the session in the future instead of hitting your db on every page load.
If this data is critical and/or it must be updated to the second, then avoid this option.
Counter Caching
If you need certain record counts on a regular basis, consider setting up a counter_cache. Basically, in your models, you do the following:
Parent.rb
has_many :children
Child.rb
belongs_to :parent, :counter_cache => true
Ensure your parent table has a field called children_count and Rails will update this field for you on every child's creation/deletion. If you use counter caching, you will avoid hitting the database to get the counts.
Note: Using counter_caching will result in a slightly longer create and destroy action, but if you are using these counts often, it's usually worth going with counter_cache.
You should only need 1 database query for this, something like:
@posts = Post.where(:user_id => current_user.id).includes(:status)
Then use Enumerable#group_by to collect the posts into the different categories:
posts_by_status = @posts.group_by { |post| post.status.name }
which will give you a hash:
{'Alive' => [...], 'Dead' => [...]}
etc.
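To get from that hash to the per-status counts the layout needs, the grouped arrays can be reduced to their sizes. Here is a small plain-Ruby sketch; the `Status` and `Post` structs and the sample posts are hypothetical stand-ins for the models in the question.

```ruby
# Hypothetical stand-ins for the Status and Post models.
Status = Struct.new(:name)
Post   = Struct.new(:title, :status)

alive = Status.new("Alive")
dead  = Status.new("Dead")

posts = [
  Post.new("first", alive),
  Post.new("second", dead),
  Post.new("third", alive)
]

# Same grouping as above: status name => array of posts.
posts_by_status = posts.group_by { |post| post.status.name }

# Collapse each group to its size: status name => count.
counts = posts_by_status.transform_values(&:size)

puts counts.fetch("Alive", 0)
# Statuses with no posts are missing from the hash, so supply a default.
puts counts.fetch("Suspended", 0)
```

Note that this loads every post for the user into memory; if only the numbers are needed, a grouped count in the database would avoid that.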

Collect many 'counts' in one query?

I need to do the following:
<% for customer in @customers do %>
<%= customer.orders.count %>
<% end %>
This strains the server, creating n queries, where n = number of customers.
How can I load these counts along with my customers in one query? Thanks.
You could use an eager join:
@customers = Customer.paginate :page => 1, :per_page => 20, :include => [:orders]
By specifying the :include parameter to the finder, the orders will be preloaded, preventing the n+1 problem. You can then use customer.orders.length.
If loading all those orders is too memory-intensive, then you should explore counter_cache. This is designed to keep a count of a model on an associated model:
class Order
belongs_to :customer, :counter_cache => true
end
This will increment and decrement an orders_count field on the owning customer record when orders are added to or removed from the association.
If you don't want to use the counter_cache, you'll need custom finder SQL which joins the orders table and groups on orders.customer_id, and then selects the count as an extra field. This will not perform nearly as well as the counter cache, though.

Creating "feeds" from multiple, different Rails models

I'm working on an application that has a few different models (tickets, posts, reports, etc..). The data is different in each model and I want to create a "feed" from all those models that displays the 10 most recent entries across the board (a mix of all the data).
What is the best way to go about this? Should I create a new Feed model and write to that table when a user is assigned a ticket or a new report is posted? We've also been looking at STI to build a table of model references or just creating a class method that aggregates the data. Not sure which method is the most efficient...
You can do it one of two ways depending on efficiency requirements.
The less efficient method is to retrieve 10 * N items and sort and reduce as required:
# Fetch 10 most recent items from each type of object, sort by
# created_at, then pick top 10 of those.
@items = [ Ticket, Post, Report ].flat_map do |with_class|
  with_class.order('created_at DESC').limit(10).to_a
end.sort_by(&:created_at).reverse.first(10)
Another method is to create an index table that's got a polymorphic association with the various records. If you're only concerned with showing 10 at a time you can aggressively prune this using some kind of rake task to limit it to 10 per user, or whatever scope is required.
Create an Item model that includes the attributes "table_name" and "item_id". Then create a partial for each data type. After you save, let's say, a ticket, create an Item instance:
i = Item.create(:table_name => 'tickets', :item_id => @ticket.id)
In your items_controller:
def index
  @items = Item.order('created_on DESC')
end
In views/items/index.erb:
<% @items.each do |item| %>
<%= render :partial => item.table_name, :locals => {:item => item} %><br />
<% end %>
