My app seems to be getting bogged down. Can someone help me optimize this controller code to run faster, or point me in the right direction?
I'm trying to display a list of current customers (where active is true) and a list of potential customers (where active is false). Archived customers have archived set to true.
Thank you.
if current_user.manager?
  get_customers = Customer.where(:archived => false)
  @cc = get_customers.where(:active => true)
  @current_customers = @cc.where(:user_id => current_user.id)
  @count_current = @current_customers.count
  @pc = get_customers.where(:active => false)
  @potential_customers = @pc.where(:user_id => current_user.id)
  @count_potential = @potential_customers.count
end
How does this look for improving speed?
Model
scope :not_archived, -> { where(:archived => false) }
scope :current_customers, -> { where(:active => true).not_archived }
scope :potential_customers, -> { where(:active => false).not_archived }
scope :archived_customers, -> { where(:archived => true) }
Controller
@current_customers = Customer.current_customers.includes(:contacts, :contracts)
View
link_to "Current Clients #{#count_current.size}"
As @Gabbar pointed out, and I will add to it, your app right now is eager-loading (the opposite of lazy-loading), which means you are loading more from the database than you need. What we need to do is optimize, but that depends entirely on your use case.
Whatever the use case, a few common things will help:
You can implement pagination (there are gems for it, or you can roll your own) or infinite scrolling. Either way, you load a fixed number of records from the database at first; when the user wants more, they scroll down or click a 'next' button, and your action is called again with an incremented page number, fetching the next set of records. (See the kaminari sketch after the gem list below.)
Implementing scroll-based loading involves JS and the view height etc., but pagination is much simpler.
Gems:
kaminari gem
infinite-pages
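For example, with kaminari the controller change is small. A minimal sketch, assuming the Customer model from the question and an arbitrary 25 records per page:
# Gemfile: gem 'kaminari'
@customers = Customer.where(:archived => false).page(params[:page]).per(25)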
Using includes
One more thing you should do: use includes in your query when your records are related. Using includes is tricky but very helpful in saving time. It fetches the related records together in one go from the database, instead of your code going back and forth to the database multiple times. Fetching from the database is slow compared to fetching from RAM.
@users = User.all.includes(:comments) # comments for all users brought along with the users, kept in RAM for future access
@comments = @users.map(&:comments) # no need to go to the db again, just RAM
Using scopes in models:
Creating scopes in models helps too. In your case, you should create scopes like this:
scope :archived_customers, -> { where('archived IS true') }
scope :potential_customers, -> { where('active IS false') }
OR
scope :archived_customers, -> { where(:archived => true) }
scope :potential_customers, -> { where(:active => false) }
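With scopes like these in place, the controller code from the question could shrink to something like this (a sketch; the user_id filter is carried over from the original code):
@current_customers = Customer.current_customers.where(:user_id => current_user.id)
@potential_customers = Customer.potential_customers.where(:user_id => current_user.id)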
Loading all the available records in a single query can be very costly. Moreover, a user may be interested only in a couple of the most recent records (e.g., the latest posts in a blog) and does not want to wait for all records to load and render.
There are a couple of ways to solve this problem:
Example 1: implementation of Load More
Example 2: implementation of infinite scrolling
Example 3: implementation of pagination
Related
I know that find_each is designed to consume less memory than each.
I found some code that someone wrote long ago, and I think it's wrong.
Consider this code:
users = User.where(:active => false) # What does this line actually do? Nothing?
users.find_each do |user|
  # update or do something...
  user.update(:do_something => "yes")
end
In this case, it will store all user objects in the users variable, so we have already populated the full amount of memory. There is no point in using find_each after that.
Am I correct?
So, in other words, if you want to use find_each, you should always call it directly on an ActiveRecord::Relation object, like this:
User.where(:active => false).find_each do |user|
  # do something...
end
What do you think, guys?
Update
Regarding the users = User.where(:active => false) line:
Some developers insist that Rails never executes the query unless we actually do something with that variable.
What if we have a class whose initialize method contains a query?
class Test
  def initialize
    @users = User.where(:active => true)
  end

  def do_something
    @users.find_each do |user|
      # do something really...
    end
  end
end
If we call Test.new, what would happen? Nothing will happen?
users = User.where(:active => false) doesn't run a query against the database and it doesn't return an array with all inactive users. Instead, where returns an ActiveRecord::Relation. Such a relation basically describes a database query that hasn't run yet. The defined query is only run against the database when the actual records are needed. This happens for example when you run one of the following methods on that relation: find, to_a, count, each, and many others.
That means the change you made isn't a huge improvement, because it doesn't change when and how the database is queried.
But IMHO your code is still slightly better: if you don't plan to reuse the relation, why assign it to a variable in the first place?
users = User.where(:active => false)
users.find_each do |user|
User.where(:active => false).find_each do |user|
Those do the same thing.
The only difference is the first one stores the ActiveRecord::Relation object in users before calling #find_each on it.
This isn't a Rails thing, it applies to all of Ruby. It's method chaining common to most object-oriented languages.
array = Call.some_method
array.each{ |item| do_something(item) }
Call.some_method.each{ |item| do_something(item) }
Again, same thing. The only difference is in the first the intermediate array will persist, whereas in the second the array will be built and then eventually deallocated.
If we call Test.new, what would happen? Nothing will happen?
Exactly. Rails will make an ActiveRecord::Relation and it will defer actually contacting the database until you actually do a query.
This lets you chain queries together.
@inactive_users = User.where(active: false).order(name: :asc)
Later you can do the query:
# Inactive users whose favorite color is green, ordered by name.
@inactive_users.where(favorite_color: :green).find_each do |user|
  ...
end
No query is made until find_each is called.
In general, pass around relations rather than arrays of records. Relations are more flexible and if it's never used there's no cost.
find_each is special in that it works in batches to avoid consuming too much memory on large tables.
A common mistake is to write this:
User.where(:active => false).each do |user|
Or worse:
User.all.each do |user|
Calling each on an ActiveRecord::Relation will pull all the results into memory before iterating. This is bad for large tables.
find_each will load the results in batches of 1000 to avoid using too much memory. It hides this batching from you.
There are other methods which work in batches, see ActiveRecord::Batches.
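For instance, both of these are sketches of the batch API (batch_size defaults to 1000; the 500 here is arbitrary):
User.where(:active => false).find_each(:batch_size => 500) do |user|
  user.update(:do_something => "yes")
end

User.where(:active => false).find_in_batches(:batch_size => 500) do |batch|
  # batch is an Array of up to 500 User records
end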
For more see the Rails Style Guide and use rubocop-rails to scan your code for issues and make suggestions and corrections.
For instance:
@examples = @user.examples.mostrecent.paginate(page: params[:page])
Where "mostrecent" is defined as:
def self.mostrecent
  self.order('created_at DESC')
end
So basically the first call to the database pulls every user's examples, and then on top of that orders them by most recent first. It seems like this should be doable, but for some reason I can't get it to work.
There is no default order scope defined in the model I'm working with, and other calls to order work just fine. Checking development.log, I can see that only the first query (pulling the examples by user) is respected; the mostrecent ordering is never applied.
Is there a Rails way of doing this all in one line?
You could use a scope, as in:
scope :by_recent, -> { order('created_at DESC') }
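With that scope in place, the original call would presumably become:
@examples = @user.examples.by_recent.paginate(page: params[:page])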
I'm trying to get all the clients that have doctors associated BUT where none of the doctors has started their first session (a client has_many doctors and can have a first session with each of them).
So far I have:
@clients = Client.joins(:doctors).where('doctors.first_session IS NULL').order('clients.id DESC')
But this doesn't work when a client has, for example, two doctors: the first doctor's first_session is null but the second one's is not. That case still returns the client, and I don't want it to.
Any ideas?
This is one of those cases where in order to find records that don't meet a certain condition, you do it by finding all records except those that meet the condition. In SQL this is done with a subquery in the WHERE clause.
For cases like this, the squeel gem is extremely helpful, because it encapsulates the SQL complexity. This is how I would do it (with squeel):
scope :visited_doctor, joins(:doctors).where { doctors.first_session != nil }
scope :not_visited_doctor, where { id.not_in(Client.visited_doctor.select(:id)) }
Note that you can do this without squeel, but you'll have to get your hands (and your code) dirty with SQL.
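For reference, a plain ActiveRecord version of the same idea might look like this (a sketch using a raw SQL subquery; the table and column names are taken from the question):
scope :not_visited_doctor, where('clients.id NOT IN (SELECT doctors.client_id FROM doctors WHERE doctors.first_session IS NOT NULL)')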
This will work, but may be a little less efficient since it does some of the work in ruby instead of all in the db.
clients = Client.order('clients.id DESC').includes(:doctors).select do |client|
  client.doctors.all? { |doctor| doctor.first_session.nil? }
end
Logically, that should fetch all the clients, and then, in Ruby, select will evaluate the condition in the block, and the clients for which it returns true will be assigned to clients.
The condition block will return true only if all of that client's doctors have a nil first_session.
Hope that helps. There's probably a more efficient way to do this using subselects, but the syntax for that is likely to depend on which database you're using.
Well, I found a solution that involved two queries.
avoid_ids_results = Doctor.select('client_id')
  .where("first_session IS NOT NULL")
  .map(&:client_id).join(', ')

@clients = Client.
  joins(:doctors).
  where('clients.id NOT IN (' + avoid_ids_results + ')').
  order('clients.id DESC')
Thank you all!
You could create a method in your Client model which returns true if any first_session on a client's doctors is true, something like...
def has_first?
  doctors.each do |doctor|
    return true if !doctor.first_session.nil?
  end
  false
end
This is pseudocode and may need to be tweaked first.
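If it helps, Enumerable#any? expresses the same check more compactly:
def has_first?
  doctors.any? { |doctor| !doctor.first_session.nil? }
end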
I am building several reports in an application, have come across a few ways of building them, and wanted to get your take on the best/common ways to build reports that are both scalable and as real-time as possible.
First, some conditions/limits/goals:
The report should be able to handle being real time (using node.js or ajax polling)
The report should update in an optimized way
If the report is about page views, and you're getting thousands a second, it might not be best to update the report every page view, but maybe every 10 or 100.
But it should still be close to real-time (so daily/hourly cron is not an acceptable alternative).
The report shouldn't be recalculating things that it's already calculated.
If it has counts, it increments a counter.
If it has averages, maybe it can somehow update the average without grabbing all the records it's averaging every second and recalculating (not sure how to do this yet; see the running-average sketch after this list).
If it has counts/averages for a date range (today, last_week, last_month, etc.) and it's real-time, it shouldn't have to recalculate those averages every second/request; it should somehow do only the most minimal operation.
If the report is about a record and the record's "lifecycle" is complete (say a Project, and the project lasted 6 months, had a bunch of activity, but now it's over), the report should be permanently saved so subsequent retrievals just pull a pre-computed document.
The reports don't need to be searchable, so once the data is in a document, we're just displaying the document. The client gets basically a JSON tree representing all the stats, charts, etc. so it can be rendered however in Javascript.
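On the averages point above: keeping a running count alongside the stored average is enough to fold in new values without re-scanning the records. A minimal sketch of the arithmetic (the names are illustrative):
# avg and n are the previously stored average and count.
def updated_average(avg, n, new_value)
  (avg * n + new_value) / (n + 1.0)
end
# e.g. average registrations per day can be maintained this way,
# updating the stored avg and n whenever a new user signs up.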
My question arises because I am trying to figure out a way to do real-time reporting on huge datasets.
Say I am reporting about overall user signup and activity on a site. The site has 1 million users, and there are on average 1000 page views per second. There is a User model and a PageView model let's say, where User has_many :page_views. Say I have these stats:
report = {
  :users => {
    :counts => {
      :all => user_count,
      :active => active_user_count,
      :inactive => inactive_user_count
    },
    :averages => {
      :daily => average_user_registrations_per_day,
      :weekly => average_user_registrations_per_week,
      :monthly => average_user_registrations_per_month
    }
  },
  :page_views => {
    :counts => {
      :all => user_page_view_count,
      :active => active_user_page_view_count,
      :inactive => inactive_user_page_view_count
    },
    :averages => {
      :daily => average_user_page_view_registrations_per_day,
      :weekly => average_user_page_view_registrations_per_week,
      :monthly => average_user_page_view_registrations_per_month
    }
  }
}
Things I have tried:
1. User and PageView are both ActiveRecord objects, so everything is via SQL.
I grab all of the users in chunks, something like this:
class User < ActiveRecord::Base
  class << self
    def report
      result = {}
      User.find_in_batches(:include => :page_views) do |users|
        # some calculations
        # result[:users]...
        users.each do |user|
          # result[:users][:counts][:active]...
          # some more calculations
        end
      end
      result
    end
  end
end
2. Both records are MongoMapper::Document objects
Map-reduce is really slow to calculate on the spot, and I haven't yet spent the time to figure out how to make this work real-time-esque (checking out hummingbird). Basically I do the same thing: chunk the records, add the result to a hash, and that's it.
3. Each calculation is its own SQL/NoSQL query
This is kind of the approach the Rails statistics gem takes. The only thing I don't like about it is the number of queries it could make (I haven't benchmarked whether making 30 queries per request per report is better than chunking all the objects into memory and sorting in plain Ruby).
Question
The question, I guess, is: from your experience, what's the best way to do real-time reporting on large datasets? With chunking/sorting the records in memory on every request (what I'm doing now, which I can somewhat optimize with an hourly cron, but that's not real-time), the reports take about a second to generate (complex date formulas and such), sometimes longer.
Besides traditional optimizations (better date implementation, SQL/NoSQL best practices), where can I find practical, tried-and-true articles on building reports? I can build reports no problem; the issue is how to make them fast, real-time, optimized, and correct. I haven't found anything, really.
The easiest way to build near real-time reports for your use case is to use caching.
So in the report method, you can use the Rails cache:
class User < ActiveRecord::Base
  class << self
    def report
      Rails.cache.fetch('users_report', expires_in: 10.seconds) do
        result = {}
        User.find_in_batches(:include => :page_views) do |users|
          # some calculations
          # result[:users]...
          users.each do |user|
            # result[:users][:counts][:active]...
            # some more calculations
          end
        end
        result
      end
    end
  end
end
And on the client side you just request this report with AJAX polling. That way, generating the reports won't be a bottleneck: generating one takes ~1 second, and many clients can easily get the latest result.
For a better user experience you can store the delta between two reports and increment the report on the client side using this delta prediction, like this:
let nextPredictedReport = null;
let currentReport = null;
let predictionTimer = null;

const startDrawingPredicted = () => {
  const step = 500;
  let timePassed = 0;
  clearInterval(predictionTimer); // avoid stacking a new timer on every poll
  predictionTimer = setInterval(() => {
    timePassed += step;
    const predictedReport = calcDeltaReport(currentReport, nextPredictedReport, timePassed);
    drawReport(predictedReport);
  }, step);
};

setInterval(() => {
  doReportAjaxRequest().then((response) => {
    drawReport(response.report);
    currentReport = response.report;
    nextPredictedReport = response.next_report;
    startDrawingPredicted();
  });
}, 10000);
That's just an example of the approach; calcDeltaReport and drawReport need to be implemented on your own, and this solution might have issues, as it's just an idea :)
I have a controller which has a lot of options being sent to it via a form, and I'm wondering how best to separate them out, as they are not all used simultaneously, i.e., sometimes no tags, sometimes no price specified. For prices I have a default set, so I can work around it always being there, but the tags either need to be there or not, etc.
@locations = Location.find(params[:id])
@location = @locations.places.active.where("cache_price BETWEEN ? AND ?", price_low, price_high).tagged_with(params[:tags]).order(params[:sort]).paginate :page => params[:page]
I haven't seen any good examples of this, but I'm sure it must happen often... any suggestions? Also, even will_paginate, which gets tacked on last, should be optional, as the results go either to a list or to a Google map, and the map needs no pagination.
The first thing to do when refactoring a complex search action is to use an anonymous scope. For example:
fruits = Fruit.scoped
fruits = fruits.where(:colour => 'red') if options[:red_only]
fruits = fruits.where(:size => 'big') if options[:big_only]
fruits = fruits.limit(10) if options[:only_first]
...
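Applied to the query from the question, the same pattern might look like this (a sketch; each param is treated as optional, and the params[:map] flag for skipping pagination on the map view is just an assumed convention):
places = @locations.places.active
places = places.where("cache_price BETWEEN ? AND ?", price_low, price_high) if price_low && price_high
places = places.tagged_with(params[:tags]) if params[:tags].present?
places = places.order(params[:sort]) if params[:sort].present?
places = places.paginate(:page => params[:page]) unless params[:map]
@location = places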
If the controller action still remains too big, you may use a class to handle the search. Moreover, by using a class with Rails 3 and ActiveModel, you'll also be able to use validations if you want...
Take a look at one of my plugins: http://github.com/novagile/basic_active_model, which allows you to easily create classes that may be used in forms.
Also take a look at http://github.com/novagile/scoped-search, another plugin more specialized in creating search objects by using the scopes of a model.