How to improve performance of single-page application? - ruby-on-rails

Introduction
I have a (mostly) single-page application built with BackboneJS and a Rails backend.
Because most of the interaction happens on one page of the webapp, when the user first visits the page I basically have to pull a ton of information out of the database in one large deeply joined query.
This is causing me some rather extreme load times on this one page.
NewRelic appears to be telling me that most of my problems are because of 457 individual fast method calls.
Now I've done all the eager loading I can do (I checked with the Bullet gem) and I still have a problem.
These method calls are most likely occurring in my Rabl serializer, which I use to serialize a bunch of JSON to embed into the page for initializing Backbone. You don't need to understand all of this, but suffice to say it could add up to 457 method calls.
object @search
attributes :id, :name, :subscription_limit
# NOTE: Include a list of the members of this search.
child :searchers => :searchers do
  attributes :id, :name, :gravatar_icon
end
# Each search has many concepts (there could be over 100 of them).
child :concepts do |search|
  attributes :id, :title, :search_id, :created_at
  # The person who suggested each concept.
  child :suggester => :suggester do
    attributes :id, :name, :gravatar_icon
  end
  # Each concept has many suggestions (approx. 4 each).
  node :suggestions do |concept|
    # Here I'm scoping suggestions to only ones which meet certain conditions.
    partial "suggestions/show", object: concept.active_suggestions
  end
  # Add a boolean flag to tell if the concept is a favourite or not.
  node :favourite_id do |concept|
    # Another method call which occurs for each concept.
    concept.favourite_id_for(current_user)
  end
end
# Each search has subscriptions to certain services (approx. 4).
child :service_subscriptions do
  # This contains a few attributes and 2 fairly innocuous method calls.
  extends "service_subscriptions/show"
end
So it seems that I need to do something about this but I'm not sure what approach to take. Here is a list of potential ideas I have:
Performance Improvement Ideas
Dumb-Down the Interface
Maybe I can come up with ways to present information to the user which don't require the actual data to be present. I don't see why I should absolutely need to do this though, other single-page apps such as Trello have incredibly complicated interfaces.
Concept Pagination
If I paginate concepts it will reduce the amount of data being extracted from the database each time. It would produce an inferior user interface though.
Caching
At the moment, refreshing the page just extracts the entire search out of the DB again. Perhaps I can cache parts of the app to reduce on DB hits. This seems messy though because not much of the data I'm dealing with is static.
Multiple Requests
It is technically bad to serve the page without embedding the JSON into the page but perhaps the user will feel like things are happening faster if I load the page unpopulated and then fetch the data.
Indexes
I should make sure that I have indexes on all my foreign keys. I should also try to think about places where it would help to have indexes (such as favourites?) and add them.
Move Method Calls into DB
Perhaps I can cache some of the results of the iteration I do in my view layer into the DB and just pull them out instead of computing them. Or I could sync things on write rather than on read.
Question
Does anyone have any suggestions as to what I should be spending my time on?

This is a hard question to answer without being able to see the actual user interface, but I would focus on loading exactly only as much data as is required to display the initial interface. For example, if the user has to drill down to see some of the data you're presenting, then you can load that data on demand, rather than loading it as part of the initial payload. You mention that a search can have as many as 100 "concepts," maybe you don't need to fetch all of those concepts initially?
Bottom line, it doesn't sound like your issue is really on the client side -- it sounds like your server-side code is slowing things down, so I'd explore what you can do to fetch less data, or to defer the complex queries until they are definitely required.
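One server-side saving can be sketched under assumptions: the question's serializer calls favourite_id_for(current_user) once per concept, so with 100 concepts that is 100 calls. If the current user's favourites can be fetched in a single query, the per-concept call can become a hash lookup. The Struct definitions and the favourite_ids_by_concept helper below are hypothetical stand-ins in plain Ruby, not the asker's models.

```ruby
# Hypothetical stand-ins for the question's records; not the real models.
Concept   = Struct.new(:id, :title)
Favourite = Struct.new(:id, :concept_id, :user_id)

# Build a concept_id => favourite_id hash once for the given user,
# instead of asking each concept separately.
def favourite_ids_by_concept(favourites, user_id)
  favourites
    .select { |f| f.user_id == user_id }
    .each_with_object({}) { |f, lookup| lookup[f.concept_id] = f.id }
end

# Serialize every concept using the prebuilt lookup (nil when not a favourite).
def concept_payload(concepts, lookup)
  concepts.map do |c|
    { id: c.id, title: c.title, favourite_id: lookup[c.id] }
  end
end

concepts   = [Concept.new(1, "a"), Concept.new(2, "b"), Concept.new(3, "c")]
favourites = [Favourite.new(10, 1, 7), Favourite.new(11, 3, 7), Favourite.new(12, 2, 8)]

lookup  = favourite_ids_by_concept(favourites, 7)
payload = concept_payload(concepts, lookup)
```

Whether this helps depends on what favourite_id_for actually does; it is worth measuring before and after.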

I'd recommend separating your JS code-base into modules that are dynamically loaded using an asset loader like RequireJS. This way you won't have so many XHRs firing at load time.
When a specific module is needed it can load and initialize at an appropriate time instead of every page load.
If you complicate your code a little, each module should be able to start and stop. So, if you have any polling occurring or complex code executing you can stop the module to increase performance and decrease the network load.

Related

Optimising export of DB using Rails

I have a RoR application which contains an API to manage applications, each of which contain recipes (and groups, ingredients, measurements).
Once the user has finished managing the recipes, they download a JSON file of the entire application. Because each application could have hundreds of recipes, the files can be large. It also means there is a lot of DB calls to get all the required data to export.
Now because of this, the request to download the application can take upwards of 30 seconds, sometimes more.
My current code looks something like this:
application.categories.each do |c|
  c.recipes.each do |r|
    r.groups.each do |g|
      g.ingredients.each do |i|
      end
    end
  end
end
Within each loop I'm storing the data in a HASH and then giving it to the user.
My question is: where do I go from here?
Is there a way to grab all the data I require from the DB in one query? From looking at the log, I can see it is running hundreds of queries.
If the above solution is still slow, is this something I should put into a background process, and then email the user a link (or similar)?
There are of course ways to grab more data at once. This is done with Rails includes or joins, depending on your needs. See this article for some detailed information.
The basic idea is that you can join between your tables so that new queries aren't generated on every iteration. When you do application.categories, that's one query. For each of those categories, you'll do another query: c.recipes. This creates N+1 queries, where N is the number of categories you have. Instead, you can include them from the get-go to create 1 or 2 queries (depending on what Rails does).
The basic syntax is easy:
Application.includes(:categories => :recipes).each do |application| ...
This generates 1 (or 2 - again, see article) query that grabs all applications, their categories, and each category's recipes all at once. You can tack on the groups and ingredients too.
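To make the N+1 arithmetic concrete, here is a small plain-Ruby simulation. FakeDb and its methods are hypothetical and only mimic the query pattern with a counter; no real database or ActiveRecord call is involved.

```ruby
# A toy in-memory "database" that counts how many queries it receives.
class FakeDb
  attr_reader :query_count

  def initialize(recipes_by_category)
    @recipes_by_category = recipes_by_category
    @query_count = 0
  end

  # One query to list the categories.
  def category_ids
    @query_count += 1
    @recipes_by_category.keys
  end

  # One query per category: the N in N+1.
  def recipes_for(category_id)
    @query_count += 1
    @recipes_by_category.fetch(category_id)
  end

  # One query for all categories at once, like a WHERE ... IN (...) lookup.
  def recipes_for_all(category_ids)
    @query_count += 1
    @recipes_by_category.slice(*category_ids)
  end
end

DATA = { 1 => ["soup"], 2 => ["stew", "pie"], 3 => ["cake"] }

# Loads recipes category by category and returns the number of queries used.
def n_plus_one(db)
  db.category_ids.each { |id| db.recipes_for(id) }
  db.query_count
end

# Loads all recipes in one extra query and returns the number of queries used.
def eager(db)
  db.recipes_for_all(db.category_ids)
  db.query_count
end

n_plus_one(FakeDb.new(DATA)) # 4 queries: 1 for categories + 3 for recipes
eager(FakeDb.new(DATA))      # 2 queries, regardless of the category count
```

With hundreds of recipes, groups and ingredients nested below each category, the first pattern multiplies at every level, which matches the hundreds of queries seen in the log.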
As for putting the work in the background, my suggestion would be to just have a loading image, or get fancy by using a progress bar.
First of all I have to assume that the required has_many and belongs_to associations exist.
Generally you can do something like
c.recipes.includes(:groups)
or even
c.recipes.includes(:groups => :ingredients)
which will fetch recipes and groups (and ingredients) at once.
But since you have a quite big data set IMO it would be better if you limited that technique to the deepest levels.
The most useful approach would be to use find_each and includes together.
(find_each fetches the items in batches in order to keep the memory usage low)
perhaps something like
application.categories.each do |c|
  c.recipes.find_each do |r|
    r.groups.includes(:ingredients).each do |g|
      g.ingredients.each do |i|
        ...
      end
    end
  end
end
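As a rough illustration of what batching buys you, the plain-Ruby sketch below walks a list in fixed-size slices and yields one record at a time, so only one slice is held at once. each_in_batches is a hypothetical helper written for this example; it is not the Rails method, which by default fetches 1000 records per batch from the database.

```ruby
# Yield records one at a time while loading them batch_size at a time.
# Each slice stands in for one database query. Returns the number of batches.
def each_in_batches(records, batch_size)
  batches_loaded = 0
  records.each_slice(batch_size) do |batch|
    batches_loaded += 1
    batch.each { |record| yield record }
  end
  batches_loaded
end

seen = []
batches = each_in_batches((1..10).to_a, 4) { |id| seen << id }
# batches is 3 (slices of 4, 4 and 2) and seen holds all ten ids in order
```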
Now even that can take quite a long time (for an http request) so you can consider using some async processing where the client will generate a request that is going to be processed by the server as a background job, and when that is ready, you can provide a download link (or send an email) to the client.
Resque is one possible solution for handling the async part.
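The enqueue-then-poll flow described above can be sketched with nothing but Ruby's standard Thread and Queue. ExportJobs is a hypothetical in-process stand-in used only to show the shape of the interaction; it is not Resque's API, and a real deployment would use a separate worker process with a persistent queue.

```ruby
require "thread"

# A minimal in-process job runner: one worker thread drains a queue.
class ExportJobs
  def initialize
    @queue = Queue.new
    @results = {}
    @lock = Mutex.new
    @worker = Thread.new { work }
  end

  # Called from the request: enqueue the export and return immediately.
  def enqueue(job_id, &export)
    @queue << [job_id, export]
    job_id
  end

  # Called when the client polls: nil until the export has finished.
  def result(job_id)
    @lock.synchronize { @results[job_id] }
  end

  # Let the worker finish everything queued so far, then stop it.
  def shutdown
    @queue << :stop
    @worker.join
  end

  private

  # Worker loop: run each export and record its output under the job id.
  def work
    while (job = @queue.pop) != :stop
      job_id, export = job
      output = export.call
      @lock.synchronize { @results[job_id] = output }
    end
  end
end

jobs = ExportJobs.new
jobs.enqueue("app-1") { { "recipes" => 3 } } # the block stands in for the slow export
jobs.shutdown
jobs.result("app-1") # the finished export, ready to be offered as a download
```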

Rails short term caching of complex query results

The special_item_id_list method is responsible for returning an array of ids. The query and logic is complicated enough that I only want to have to run it once per any page request, but I'll be utilizing that resulting array of ids in many different places. The idea is to be able to use the is_special? method or the special_items scope freely without worrying about incurring overhead each time they are used, so they rely on the special_item_id_list method to do the heavy lifting and caching.
I don't want the results of this query to persist between page loads, but I'd like the query ran only once per page load. I don't want to use a global variable and thought a class variable on the model might work, however it appears that the class variable does persist between page loads. I'm guessing the Item class is part of the Rails stack and stays in memory.
So where would be the preferred place for storing my id list so that it's rebuilt on each page load?
class Item < ActiveRecord::Base
  scope :special_items, lambda { where(:id => special_item_id_list) }
  def self.special_item_id_list
    @special_item_id_list ||= ... # some complicated queries
  end
  def is_special?
    self.class.special_item_id_list.include?(id)
  end
end
UPDATE: What about using Thread? I've done this before for tracking the current user and I think it could be applied here, but I wonder if there's another way? Here's a StackOverflow conversation discussing threads, which also mentions the request_store gem as possibly a cleaner way of doing so.
This railscast covers what you're looking for. In short, you're going to want to do something like this:
after_commit :flush_cache
def self.cached_special_item_list
  Rails.cache.fetch("special_items") do
    special_item_id_list
  end
end
private
def flush_cache
  Rails.cache.delete("special_items")
end
At first I went with a form of Jonathan Bender's suggestion of utilizing Rails.cache (thanks John), but wasn't quite happy with how I was having to expire it. For lack of a better idea I thought it might be better to use Thread after all. I ultimately installed the request_store gem to store the query results. This keeps the data around for the duration I wanted (the lifetime of the request/response) and no longer, without any need for expiration.
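For readers curious what a per-request store amounts to, here is a minimal plain-Ruby sketch of the idea: a hash kept on the current thread and emptied at the start of each request. RequestScope, its methods and the CALLS counter are hypothetical names made up for this example; this is not the request_store gem's code, and it assumes one request per thread.

```ruby
# A tiny thread-local store that is cleared at the start of every request.
module RequestScope
  KEY = :request_scope_store

  def self.store
    Thread.current[KEY] ||= {}
  end

  # Return the stored value, computing and storing it on first use.
  def self.fetch(name)
    store.fetch(name) { store[name] = yield }
  end

  # Must run once per request, otherwise values leak into the next request.
  def self.clear!
    Thread.current[KEY] = {}
  end
end

CALLS = [] # records how often the expensive work actually runs

def special_item_id_list
  RequestScope.fetch(:special_item_ids) do
    CALLS << :queried # stands in for the complicated queries
    [1, 2, 3]
  end
end

RequestScope.clear!   # request 1 begins
special_item_id_list  # runs the expensive block
special_item_id_list  # served from the store
RequestScope.clear!   # request 2 begins
special_item_id_list  # runs the expensive block again
```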
Are you really sure this optimisation is necessary? Are you having performance issues because of it? Unless it's actually a problem I would not worry about it.
That said, you could create a new class, make special_item_id_list an instance method on that class, and then pass an instance of it around to anything that needs to use that expensive-to-calculate data.
Or it might suffice to cache the data on instances of Item (possibly by making special_item_id_list an instance method), and not worry about different instances not being able to share the cache.
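The first of these two suggestions might look like the sketch below: an object that computes the id list on first use and is created once per request (for example in a controller) and handed to whatever needs it. SpecialItems and its loader block are hypothetical names for this illustration; the block stands in for the complicated queries.

```ruby
# Holds the expensive id list for as long as this object lives.
class SpecialItems
  def initialize(&loader)
    @loader = loader
  end

  # Runs the loader on first use only (assuming it returns a non-nil list).
  def ids
    @ids ||= @loader.call
  end

  def special?(item_id)
    ids.include?(item_id)
  end
end

loads = 0
special_items = SpecialItems.new do
  loads += 1 # counts how often the expensive work runs
  [4, 8, 15]
end

special_items.special?(8)  # true, triggers the single load
special_items.special?(99) # false, reuses the loaded list
```

Because the cache lives on the instance, it disappears with the object at the end of the request and needs no explicit expiry.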

Rails 2 call from one model to another is slow

In Rails 2, I'm trying to optimize the performance of a web page that loads slowly.
I'm timing the execution time of statements in a model and finding that a surprising amount of the time is in a call from inside one model to another model, even though it appears there is no database access at all.
To be specific, let's say the model that is slow is Department, and I'm calculating Department.expenditures. The expenditures method needs to know whether the quarter has been closed, and that information is in a different model, Quarter.
The first time that Department.expenditures calls Quarter.closed? there is a database access, and I can accept that. But I've done something to keep that value in memory inside the model method, so that future calls to Quarter.closed? have no database access. The code inside Quarter.closed? now runs in around 4 microseconds, but simply invoking Quarter.closed? from inside Department.expenditures takes 400 microseconds, and with hundreds of departments, that adds up.
I could cache the Quarter.closed value inside a global variable, but that seems hairy. Does anyone know what is going on or have a suggestion about a better practice?
Not 100% sure if this applies to your problem. But with similar loading time problems in many cases eager loading solves the problem. You would do it like this:
Department.all(:include => :expenditures)
I'm a bit out of Rails 2 syntax. In Rails 3 you can specify includes quite detailed like this:
Category.includes(:posts => [{:comments => :guest}, :tags]).find(1)
I think (but not sure) the :include option in Rails 2 allowed for similar syntax
So maybe this would work:
Department.all(:include => [:expenditures => [:quarters]])
(You may need to experiment with the combination of array/hash syntax here.)

Caching bunch of simple queries in rails

In my app there are objects, and they belong to countries, regions, cities, types, groups, companies and other sets. Every set is rather simple: it has an id, a name and sometimes some pointers to other sets, and it never changes. Some sets are small and I load them in a before_filter like this:
@countries = Country.all
@regions = Region.all
But then I call, for example,
offer.country.name
or
region.country.name
and my app performs a separate query by id, even though I've already loaded them all. If I then run a query with :include, the queries generated by eager loading are issued regardless of whether I've already loaded that data with another query.
So I want some kind of cache. For example, I could build hashes keyed by record id in my before_filter and then call @countries[offer.country_id].name. In this case it seems I don't need eager loading, and it's easy to turn on Rails.cache here. But maybe there's some smart built-in Rails solution that doesn't require rewriting everything?
Caching lists of models like that won't cache the individual instances that exist in other models' associations.
The Rails team has worked on implementing Identity Maps in Rails 3.1 to solve this exact problem, but it is disabled by default for now. You can enable it and see if it works for your problem.
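If the identity map does not work out, the hash-by-id lookup described in the question is easy to sketch. The version below is plain Ruby with hypothetical Struct stand-ins for the models; in a Rails app, ActiveSupport's index_by(&:id) builds the same kind of hash.

```ruby
# Hypothetical stand-ins for the question's models.
Country = Struct.new(:id, :name)
Offer   = Struct.new(:id, :country_id)

# Turn a list of records into an id => record hash for constant-time lookup.
def index_by_id(records)
  records.map { |record| [record.id, record] }.to_h
end

countries_by_id = index_by_id([Country.new(1, "France"), Country.new(2, "Japan")])
offer = Offer.new(100, 2)

countries_by_id[offer.country_id].name # looks up Japan without another query
```

Since the sets never change, the hash could also be built once and kept in Rails.cache rather than rebuilt in every before_filter.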

Simple model of blog post

I want to have a site that is a simple blog so I created a model:
class Post < ActiveRecord::Base
attr_accessible :title, :body
end
I want to use Markdown but without HTML tags. I also want always to keep database clean and my idea is to use before_save()/before_update() callbacks to sanitise my input and escape HTML.
I don't care about caching and performance, so I always want to render the post when needed. My idea is to add the following to the model:
def body_as_html
html_from_markdown(body)
end
What do you think of such design? MVC and ActiveRecord are new for me and I am not sure of used callback.
I see nothing obviously wrong with that method. Caching is a very simple thing to enable if performance becomes an issue. The important thing to make caching useful is to reduce or eliminate the dynamic content on the page, so that the cache doesn't constantly become obsolete. If you're just showing the blog post, then the cache only needs to be regenerated if the post changes, or perhaps if someone adds a comment (if you have comments).
My general rule of thumb is to keep the data in your database as "pure" as possible, and do any sanitization, rendering, escaping or general munging as close to the user as possible - typically in a helper method or the view, in a Rails app.
This has served me well for several reasons:
Different representations of your data may have different display requirements - if you implement a console interface at some point, you won't want to have all that HTML sanitization.
Keeping all munging as far out from the database as possible makes it clear whose responsibility it is to sanitize. Many tools or new developers maintaining your code may not realize that strings are already sanitized, leading to double-escaping and other formatting ugliness. This also applies to the "different representations" problem, as things can end up escaped in multiple different ways.
When you look in your database by hand, which will end up happening from time to time, it's nice to see things in their un-munged form.
So, to address your specific project, I would suggest having your users enter their text as Markdown and storing it straight into the database, without the before_save hook (which, as an aside, would be called on creation or update, so you wouldn't also need a before_update hook unless there was something specific that you wanted on update but not creation). I would then create a helper method, maybe sanitize_markdown, to do your sanitization. You could then call your helper method on the raw Markdown, and generate your body HTML from the sanitized Markdown. This could go in the view or in another helper method according to your taste and how many different places you were doing it, but I probably wouldn't put it in the model since it's so display-specific.
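A minimal sketch of that arrangement, assuming the helper is spelled sanitize_markdown and that sanitizing simply means escaping raw HTML with the standard library's ERB::Util.html_escape. The renderer argument is a hypothetical placeholder for whatever Markdown library ends up behind the question's html_from_markdown; no particular library is assumed.

```ruby
require "erb"

# Escape any raw HTML in the Markdown source at display time,
# leaving the stored text untouched in the database.
def sanitize_markdown(raw_markdown)
  ERB::Util.html_escape(raw_markdown)
end

# Sanitize first, then hand the result to the Markdown renderer.
# renderer is any object that responds to call (hypothetical placeholder).
def body_as_html(raw_markdown, renderer)
  renderer.call(sanitize_markdown(raw_markdown))
end

# A stand-in renderer that only wraps the text in a paragraph.
fake_renderer = lambda { |markdown| "<p>#{markdown}</p>" }

body_as_html("<script>x</script> *hi*", fake_renderer)
```

One caveat with blanket escaping: it also escapes the > character that Markdown uses for blockquotes, so a real helper may need something more selective.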
