How to cache queries across page loads? - ruby-on-rails

In a Rails 3.2 app, I have a number of queries defined in application_controller.rb. The data returned by the query will change very infrequently.
Looking at the logs, these queries appear to be run on every page load.
How can I cache these arrays so that they are refreshed less frequently, helping to reduce page load time?
Thanks

By using something like Memcached or Redis?

There are a few ways. Most simply, you can memoize the queries in instance variables, with something like this:
@variable ||= Model.query
This saves them from being run more than once within a request. Note that a controller instance variable does not survive into the next request, so for data that should persist across page loads, or for something very complex, you might consider throwing it in Redis (which is great for things like this).
Normally you wouldn't want to do something like this. You might want to reconsider the architecture of the app if you're doing something like this a lot.
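For data that should outlive a single request, one option is a small process-local cache with an expiry. The following is only a sketch in plain Ruby: TimedCache, QUERY_CACHE, slow_lookup_list and the 600-second TTL are hypothetical names and values, and the block stands in for Model.query. In a real Rails app a cache store such as Rails.cache would normally do this job.

```ruby
# A small process-local cache with expiry; a stand-in for a real cache store.
class TimedCache
  Entry = Struct.new(:value, :expires_at)

  def initialize(clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @clock = clock # injectable so expiry can be exercised without sleeping
    @entries = {}
    @lock = Mutex.new
  end

  # Returns the cached value for key, or runs the block and keeps its
  # result for ttl seconds.
  def fetch(key, ttl:)
    @lock.synchronize do
      entry = @entries[key]
      return entry.value if entry && entry.expires_at > @clock.call

      value = yield
      @entries[key] = Entry.new(value, @clock.call + ttl)
      value
    end
  end
end

# Hypothetical usage: the block stands in for Model.query.
QUERY_CACHE = TimedCache.new

def slow_lookup_list
  QUERY_CACHE.fetch(:lookup_list, ttl: 600) { [1, 2, 3] }
end
```

Each server process keeps its own copy, so this only helps when slightly stale data is acceptable until the entry expires.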

Related

Several PhantomJS calls in a RoR application

I have a RoR application that given a set of N URLs to parse, will perform N shell calls for a given PhantomJS (actually is a CasperJS) script.
So,
Right now I have something like this:
urls_to_parse = ['first.html', 'second.html',...]
urls_to_parse.each do |url|
  parse_results = `casperjs parse_urls.js '#{url}'`
end
I have never launched shell scripts from a RoR/Ruby application before, so I am wondering whether this is a good approach and what alternatives I have. So, why do I use PhantomJS in combination with RoR?
I basically have an API (RoR app) that keeps receiving urls that need to be parsed. They need to be parsed in a headless browser manner. The page actually needs to be rendered (that's why I don't use Nokogiri or any other HTML parser).
I am concerned about putting this into production, performance-wise, and before going forward I would like to know whether I am doing this correctly or whether I can do it in a better way.
It's possible. I thought about doing the same thing, but even with a headless browser I would be really concerned about the speed and bandwidth your server is going to need. I use casper in conjunction with Python and it works very well for me. I read the stdout spit back from firing the casper scripts, but I don't parse and scrape on the fly like you're talking about doing. I would imagine it's okay, but ideally you already have a cached database of results when people search. Maybe if it is a very, very basic search you'll be okay, but I don't know.
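If you do keep shelling out, a slightly more defensive version of the loop from the question might look like the sketch below. It uses Open3.capture3 from Ruby's standard library; the method name parse_urls and the command: parameter are made up for illustration, and the default command is simply the one from the question.

```ruby
require "open3"

# Runs one command per URL and collects stdout, keyed by URL.
# The command is passed as separate arguments, so no shell is involved and
# a URL containing quotes or semicolons is handed over as a single argument.
def parse_urls(urls, command: ["casperjs", "parse_urls.js"])
  urls.each_with_object({}) do |url, results|
    stdout, stderr, status = Open3.capture3(*command, url)
    if status.success?
      results[url] = stdout
    else
      # Failed runs are reported and left out of the results.
      warn "parse failed for #{url}: #{stderr.strip}"
    end
  end
end
```

This still runs the scripts one after another inside the request, so the performance concern stands; moving the work to a background job is a common alternative.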

Caching large numbers of ActiveRecord objects

There's an oft-called method in my Rails application that retrieves ~200 items from the database. Rather than do this again and again, I store the results using Rails.cache.write. However, when I retrieve the results using Rails.cache.read, it's still very slow: about 400ms. Is there any way to speed this up?
This is happening in a controller action, and I'd prefer users not have to wait so long to load the page.
FYI regarding Rails caching, from the Rails Guides, "...It’s important to note that query caches are created at the start of an action and destroyed at the end of that action and thus persist only for the duration of the action."
If you can share the method, I may be able to help more quickly. Otherwise, a couple of performance best practices:
Use .includes to avoid N+1 queries. Define this in the model and call it in the controller.
How are your indexes set up (if any)?

Is this an OK design decision? Is there a better way?

So, for the sake of performance, I'm using database sessions. I figure that while the sessions are server side, I might as well store commonly accessed objects in the session. So, I'm storing serialized versions of the current_user, current_account, and the current_user's permissions.
The User model handles a lot of the permissions methods (things like user.can_do_whatever), but since I'm trying to be more efficient and store commonly accessed things in the session (this allows for far fewer DB accesses), does it make sense, or break any design standards, to store the session in an instance variable in the current_user on each request?
As of right now, I can't think of any alternatives.
RoR applications have a RESTful design by default. One rule of REST is statelessness: each request from client to server must contain all of the information necessary to understand the request, and cannot take advantage of any stored context on the server.
If you have trouble with database performance, use a cache system like memcached, which is already integrated in Rails (Caching with Rails).
I found a couple of references warning against storing non-primitive data types in the session, but they were all just warnings, and boiled down to: Storing complex objects is "Expecially discouraged" [sic], but if you decide you need to... well, just be careful.
Anyway, I'm kinda taken by the idea of having the users table double as the sessions table, but serialization still seems a bit sketchy. If you're just trying to cut down the number of DB requests, what about storing IDs and using :joins when looking up your user (might require a bit of hackery to get that worked into the default session loading). That avoids synchronization problems and serialization sketchiness, and still only generates a single DB query. Just make sure to use :joins and not :include, as the latter generates a query for each table.
Hope that helps!
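As a rough illustration of the "store IDs, not objects" idea, here is a plain-Ruby sketch. RequestContext and user_finder are hypothetical: the session is modelled as a hash holding only a primitive ID, and user_finder stands in for whatever single joined query loads the user, account and permissions.

```ruby
# Hypothetical per-request context: the session holds only primitive IDs,
# and the full record is looked up at most once per request.
class RequestContext
  def initialize(session, user_finder)
    @session = session         # e.g. { user_id: 42 }
    @user_finder = user_finder # callable standing in for the joined lookup
  end

  def current_user
    # defined? is used so that a nil result (no one logged in) is memoized too.
    return @current_user if defined?(@current_user)

    id = @session[:user_id]
    @current_user = id && @user_finder.call(id)
  end
end
```

Because only the ID lives in the session, there is nothing to keep in sync when the user record changes.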

Keep value in memory across requests and across users in Rails controller? Use class variable?

We're on Rails 3.0.6.
We maintain a list of numbers that changes only once a month, but nearly every page request requires access to this list.
We store the list in the database.
Instead of hitting the database on every request and grabbing the list, we would like to grab the data once and stash it in memory for efficient access.
If we store the list in each user session, we still need to hit the database for each session.
Is there a way to only hit the database once and let the values persist in memory across all users and all sessions? We need access to the list from the controller. Should we define a class variable in the controller?
Thanks!
I think Rails.cache is the answer to your problem here. It's a simple interface with multiple backends. The default stores the cache in memory, but if you're already using Memcached, Redis or similar in your app, you can plug it into those instead.
Try throwing something similar to this in your ApplicationController:
def list_of_numbers
  @list_of_numbers ||= Rails.cache.fetch(:list_of_numbers, :expires_in => 24.hours) do
    # Read from database
  end
end
It will try to read from the cache, but if it doesn't find the value, it will do the intensive stuff and store it for next time.
The pattern you're looking for is usually called memoization (it is sometimes loosely described as a singleton), which is a simple way to cache stuff that doesn't change over time. For example, you'll often see something like this in application_controller.rb, where your code always calls the method:
def current_user(user_id)
  @current_user ||= User.find user_id
end
When it does, it checks the instance variable @current_user and returns it if not nil; otherwise it does the database lookup and assigns the result to the instance variable, which it returns.
Your problem is similar, but broader, since it applies to all instances.
One solution is with a class variable, which is documented here http://www.ruby-doc.org/docs/ProgrammingRuby/html/tut_classes.html#S3 -- a similar solution to the one above applies here.
This might be a good solution in your case, but it has some issues. Specifically (assuming this is a web app), depending on your configuration, you may have multiple instances of Rails loaded in different processes, and class variables only apply to their specific instance. The popular Passenger module (for Apache and Nginx) can be configured to allow class variables to be accessible to all of its instances ... which works great if you have only one server.
But when you have multiple servers, things get a little tricky. Sure, you could use a class variable and accept that you'll have to make one hit to the database for each server. This works great except when the variable ... varies! You'll need some way of invalidating the variable across all servers. Depending on how critical it is, this could create various very gnarly and difficult-to-track-down errors (I learned the hard way :-).
Enter memcached. This is a wonderful general-purpose caching tool. It's very lightweight, and very, very smart. In particular, it can create distributed caches across a cluster of servers -- the value is only ever stored once (thus avoiding the synchronization problem noted above) and each server knows which server to look on to find any given cache key. It even handles when servers go down and all sorts of other unpleasantries.
Setup is remarkably easy, and Rails almost assumes you'll use it for your various caching needs, and the Rails gem just makes it as simple as pie.
On the assumption that there will be other opportunities to cache stuff that might not be as simple as a value you can store in a class variable, that's probably the first place to start.
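For comparison, the class-variable approach described above could be sketched roughly like this. NumberList, values and reset! are made-up names, and the block stands in for the database read. The copy lives in one Ruby process only, which is exactly the limitation discussed above.

```ruby
# Process-wide holder for a rarely changing list (one copy per Ruby process).
class NumberList
  @@values = nil
  @@lock = Mutex.new

  # Loads the list on first use via the given block; later calls reuse the
  # frozen in-memory copy and do not run the block.
  def self.values
    @@lock.synchronize { @@values ||= yield.freeze }
  end

  # Drops the cached copy so the next call reloads it.
  # This only affects the current process, not other processes or servers.
  def self.reset!
    @@lock.synchronize { @@values = nil }
  end
end
```

A controller could then call something like NumberList.values { load_numbers_from_db }, where load_numbers_from_db is a placeholder for the real query. With memcached behind Rails.cache, invalidation happens in one place instead of once per process.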

Best Practices for Optimizing Dynamic Page Load Times (JSON-generated HTML)

I have a Rails app where I load up a base HTML layout and I fill in the main content with rows of divs from JSON. This works in 2 steps:
Render the HTML
Ajax call to get the JSON
This has the benefit of being able to cache the HTML layout which doesn't change much, but it seems to have more drawbacks:
2 HTTP requests
HTML isn't that complex, the generated html is where all the work is done, so I'm not saving that much on time probably.
Each request in my specific case requires that we check the current user, their roles, and some things related to that user, so those 2 calls are somewhat involved.
Granted, memcached will probably solve a lot of this, but I am wondering whether there are some best practices here. I'm thinking I could do this:
Render the first page of JSON inline, in a script block, along with the HTML. This would cut out those 2 server calls requiring user authentication. And, assuming 80% of the time you don't need to make the second ajax call (pagination/sorting in this case), that seems like a fairly good solution.
What are your thoughts on how to approach this?
There are advantages and disadvantages to doing stuff like this. In general I'd say it's only a good idea if whatever you're delaying via an ajax call would delay the page load enough to annoy the end user for most of the use cases on your page.
A good example of this is browsing a repository on GitHub. 90% of the time all you want is to navigate the files, so they use an ajax load to fill in the commit messages per file after the page load.
It sounds like you're trying to do this to speed things up or do something fancy for your users, but I think you should consider instead which part is slow, and what speed of page load (and maybe for what information on that page) your users are expecting. As you say, using memcached or fragment caching might well give you the improvements you're looking for.
Are you using some kind of monitoring tool? I'm using the free version of New Relic RPM on Heroku. It gives a lot of data on request times for individual controller actions. Data like that could help you focus your optimization process.
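If you do try the inline-JSON idea from the question, the main thing to watch is escaping, since the data ends up inside a script block. A minimal plain-Ruby sketch follows; inline_json, render_page and the firstPage variable are hypothetical names, and in a real Rails view you would more likely use the framework's own JSON escaping helper.

```ruby
require "json"

# Serializes data for embedding inside a <script> block. Escaping "<" keeps a
# string such as "</script>" in the data from ending the block early.
def inline_json(data)
  JSON.generate(data).gsub("<") { "\\u003c" }
end

# Renders the layout with the first page of rows inlined, so the initial
# view needs no second HTTP request.
def render_page(rows)
  <<~HTML
    <div id="rows"></div>
    <script>
      var firstPage = #{inline_json(rows)};
    </script>
  HTML
end
```

Later pages (pagination, sorting) would still come from the existing ajax endpoint.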
