Difference between Rack::Cache and page caching - ruby-on-rails

We are currently updating our sites at work, and I am responsible for choosing/designing our caching strategy.
Our sites are all article based magazine sites, however some of them have a user system for restricted articles that needs subscriptions.
We have used page caching so far (and stored the pages in memcached) with a little bit of javascript. However Im thinking that Rack::Cache or maybe Varnish is a better solution now. As far as i can see, does it work almost the same way performance wise:
Page caching, caches the full page in memcached and this cache will be served directly from memcached by nginx on future request.
Rack::Cache, also caches the full page in memcached, owever the cached version is served by the webserver instead of nginx. Rack::Cache uses HTTP-caching headers which means that visitors also will store a local cache in the browsers. Furthermore would it be easy to replace with Varnish, that also uses HTTP-caching headers.
Im i right so far, and does anyone else have some comments on the differences or the performance of the two strategies? It's also possible too use both, but i can see any advantages with this approach as they will cache the same types of pages.

Related

PDF caching on heroku with cloudflare

I'm having a problem getting the caching I need to work using CloudFlare.
We use CloudFlare for caching all our assets on S3 which works 100% using a separate subdomain cdn
We also use CloudFlare for our main site (hosted on Heroku) as well, e.g. www
My problem is I can't get CloudFlare to cache PDFs that are generated from our Rails app. I'm using the WickedPDF gem to dynamically generate certain PDFs for invoices, etc. I don't want to upload these as files to say S3 but we would like to have CloudFlare cache these so they don't get generated each and every time, as the time spent generating these PDFs is a little intensive.
CloudFlare is turned on and is "accelerating" for the subdomain in question and we're using SSL, but PDFs never seem to cache properly.
Is there something else we need to do to ensure these get cached? Or maybe there's another solution that would work for Heroku? (eg we can't use Page caching since it relies on the filesystem) I also checked the WickedPDF documentation so see if we could do anything else, but found nothing about expire controls.
Thanks,
We should actually cache it as long as the resources are on-domain & not being delivered through a third-party resource in some way.
Keep in mind:
1. Our caching depends on the number of requests for the resources (at least three).
2. Caching is very much data center dependent (in other words, if your site receives a lot of traffic at a data center it is going to be cached; if your site doesn't get a lot of traffic in another data center it may not cache).
I would open a support ticket if you're still having issues.

best method for serving static pages hosted on s3 in rails app on heroku

I have a rails app which is dynamic and works well on it's own, but we also have an s3 bucket which has a bunch of html pages which are constantly updated and revised.
I'm looking for an overall solution which allows me to route requests to the static files for a large number of potential pages, but also use the app for dynamic pages. None of the static pages require a user login, but all of the dynamic pages require a user login.
We are also currently using heroku to serve the application which is something else to take into consideration.
What are some methods/gems/ideas for how to serve these static pages quickly on the same domain without interfering with the rest of the app?
According to Heroku you have 3 options:
You can use page_caching. page_caching has been removed from rails4 but is still available as an external gem actionpack-page_caching but from what I see, does not work as expected in heroku (files are not persisted).
You can use cache through HTTP headers (which actually caches data on the client)
You can use Rack::Cache as a reverse proxy.
I think the best solution for your matter is the third one. Heroku suggests that you use Rack::Cache. Of course you can combine that with the HTTP headers.
You can find more details here and here

What is the point of Varnish and Rack-Cache for a Rails app?

I am a bit confused about the purpose of Varnish and Rack-Cache for a Rails app. In config/environments/production.rb caching can be set with something like
config.static_cache_control = "public, max-age=3600"
Given that, what exactly is the purpose of Varnish and Rack-Cache if you can set caching in the Rails app itself?
And what causes the default Rails app to use rack-cache?
Static Cache Control affects the http headers for Cache-Control. As in, the server suggests to intermediate caches that the max-age=3600.
Varnish, Rack-Cache, Squid and others actively cache generated content on the server. Database calls are expensive and, even when a request doesn't make a call to the db, the less infrastructure the request has to go through, generally the faster it will be.
Rack::Cache is rack middleware that supports HTTP standards compliant caching. Their FAQ page has some good information about it's pros and cons over other caching solutions. Here's a question comparing rack::cache to varnish on heroku. Rails also has ActiveSupport::Cache which handles fragment and page caching. I'm not sure what the differences are, but both are included in Rails by default. I had said earlier that rack::cache wasn't default, but I was wrong.
Varnish, Squid, and others exist outside the Rails stack in front of the webserver(eg Apache/Nginx/etc) as a separate process. They are highly configurable, application independent, and have some advanced features (such as Squid's ACL's). Varnish and others have the benefit of minimizing the infrastructure a request has to go through to get served. If it's fresh, the request hits Varnish and immediately returns to the client. This probably has the most benefit for high-traffic sites and might be overkill for smaller apps.
Here's an article on heroku detailing the use of rack::cache in Rails3. There are also some good railscasts on doing page/fragment caching in-app and on using memcached as a backend(which is very important). For varnish and others, you can start with this tutorial on varnish's site.

Rails' page caching vs. HTTP reverse proxy caches

I've been catching up with the Scaling Rails screencasts. In episode 11 which covers advanced HTTP caching (using reverse proxy caches such as Varnish and Squid etc.), they recommend only considering using a reverse proxy cache once you've already exhausted the possibilities of page, action and fragment caching within your Rails application (as well as memcached etc. but that's not relevant to this question).
What I can't quite understand is how using an HTTP reverse proxy cache can provide a performance boost for an application that already uses page caching. To simplify matters, let's assume that I'm talking about a single host here.
This is my understanding of how both techniques work (maybe I'm wrong):
With page caching the Rails process is hit initially and then generates a static HTML file that is served directly by the Web server for subsequent requests, for as long as the cache for that request is valid. If the cache has expired then Rails is hit again and the static file is regenerated with the updated content ready for the next request
With an HTTP reverse proxy cache the Rails process is hit when the proxy needs to determine whether the content is stale or not. This is done using various HTTP headers such as ETag, Last-Modified etc. If the content is fresh then Rails responds to the proxy with an HTTP 304 Not Modified and the proxy serves its cached content to the browser, or even better, responds with its own HTTP 304. If the content is stale then Rails serves the updated content to the proxy which caches it and then serves it to the browser
If my understanding is correct, then doesn't page caching result in less hits to the Rails process? There isn't all that back and forth to determine if the content is stale, meaning better performance than reverse proxy caching. Why might you use both techniques in conjunction?
You are right.
The only reason to consider it is if your apache sets expires headers. In this configuration, the proxy can take some of the load off apache.
Having said this, apache static vs proxy cache is pretty much an irrelevancy in the rails world. They are both astronomically fast.
The benefits you would get would be for your none page cacheable stuff.
I prefer using proxy caching over page caching (ala heroku), but thats just me, and a digression.
A good proxy cache implementation (e.g., Squid, Traffic Server) is massively more scalable than Apache when using the prefork MPM. If you're using the worker MPM, Apache is OK, but a proxy will still be much more scalable at high loads (tens of thousands of requests / second).
Varnish for example has a feature when the simultaneous requests to the same URL (which is not in cache) are queued and only single/first request actually hits the back-end. That could prevent some nasty dog-pile cases which are nearly impossible to workaround in traditional page caching scenario.
Using a reverse proxy in a setup with only one app server seems a bit overkill IMO.
In a configuration with more than one app server, a reverse proxy (e.g. varnish, etc.) is the most effective way for page caching.
Think of a setup with 2 app servers:
User 'Bob'(redirected to node 'A') posts a new message, the page gets expired and recreated on node 'A'.
User 'Cindy' (redirected to node 'B') requests the page where the new message from 'Bob' should appear, but she can't see the new message, because the page on node 'B' wasn't expired and recreated.
This concurrency problem could be solved with a reverse proxy.

How to use multiple caches in rails?

I have a rails application where I would like to use both memcached and the file store cache, for different purposes.
I want to use the file store cache to keep a large number of pages that don't change often (some not at all) - i.e. page caching - and use memcached for everything else (action and DB caching etc). The reason is that the pages stored on the file store cache are likely to require a large amount of storage, but individually most will be accessed infrequently.
Is this possible to do or will configuring memcached as the cache mean that it is also used for page caching?
As a secondary question, what is a safe way to remove pages from the file store cache in some form of cron job, as there does not seem to be an option to specify ttl for this cache. For example a UNIX find command would quickly find and remove all old pages or pages that haven't been accessed in a long time - is this safe to do given the app server might potentially try to serve one of those pages at the time (tho this is very unlikely)? If not then what is the best way to do this.
If you want to use the filesystem only for page caching and memcached for action and fragment caching, you're fine. Page caching always uses the filesystem. Just remember that page caching bypasses your Rails application, so you can't use it for pages that include content that changes from user to user or for pages that are access controlled with filters.
Regarding the removal of pages, on Unix, a file can be deleted, but it is not actually removed from disk until all open file handles are closed. If the app server has opened the file to serve a request, and the find command deletes it a split-second later, the app server doesn't suddenly get an error when it tries to read.
You could also consider having find delete files based on their last access time, instead of creation or modification, and using a sweeper in your Rails app to delete the cached page when its content is out of date.
A simpler approach may be to use a http cache upstream of your application as your page cache rather than two stores within rails. This way you can use http headers to control the cache behavior, including TTL's. These same limits will also apply to browser's local caches as a nice bonus.
Varnish is about as high performance as it gets, but would require setting up another moving piece in your hosting environment as a proxy. This may still be worthwhile depending on what you're doing.
A simpler approach might be Rack::Cache, which will be easy to set up provided you're using a rack enabled version of rails.

Resources