Rails cache vs Redis: performance, ease of use, persistence?

I know Redis is powerful, and I use it for caching in my Rails application. Could anyone give me a comparison between Rails' default caching and Redis? What are the trade-offs when using each as a cache?

The main point is distribution.
With Redis, the cache can be shared across all back-ends, even when they run on multiple hosts. This is the most scalable solution, because you can multiply the number of back-end hosts. The downside is that you pay an extra network round trip for each cache access, and you have an extra component (Redis) to deploy and manage.
With ActiveSupport::Cache::FileStore, the cache can be shared across back-end instances, provided they run on the same host. Easy to use.
With ActiveSupport::Cache::MemoryStore, the cache cannot be shared across back-ends, even if they run on the same host. However, this is the fastest solution. Easy to use.
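For concreteness, here is a minimal configuration sketch of the three options (the Redis URL is a placeholder; the built-in :redis_cache_store assumes Rails 5.2+, older apps typically used the redis-rails gem instead):

    # config/environments/production.rb -- pick exactly one cache store.

    # Per-process, in-memory: fastest, but private to each Rails process.
    config.cache_store = :memory_store, { size: 64.megabytes }

    # On-disk: shareable by all processes running on the same host.
    config.cache_store = :file_store, Rails.root.join("tmp", "cache")

    # Redis: shareable across processes and hosts, at the cost of a
    # network round trip per cache access.
    config.cache_store = :redis_cache_store, { url: "redis://localhost:6379/0" }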

Related

Use Ehcache for an application deployed via Docker despite the stateless rule

I have a Spring Boot application that I would like to deploy as multiple Docker instances behind a load balancer.
However, the application uses Ehcache to cache some data from a database, which makes the application stateful.
So without sticky sessions, the same customer might hit different Docker instances and see different results.
My question is: if I can't apply sticky sessions in the load balancer, what is the best practice for deploying an app with a caching feature Docker-style while still complying with the should-be-stateless rule?
I explain in this Devoxx video how clustered caching can help each of your Docker instances share the same cache.
First of all, if you really have a pure caching use case, there should be no correctness impact, only a performance one; which, of course, can in itself be a bad thing for your application.
But effectively, if you want to use caching to provide performance and at the same time have a multi-node ability without sticky sessions, you have to move into the realm of distributed caching. This will give you the ability to share the cache content amongst the different nodes and thus make it (more) transparent for a given request in a conversation to hit any node of your application.
In the Ehcache world, this means backing the cache with a Terracotta server, see the documentation for details.
It is common to combine Ehcache with Terracotta to allow distributed caching among nodes.

Memcached vs Redis as Rails.cache when requiring an LRU cache

I currently use Redis as a work queue for Sidekiq. I'm interested in also using it as a caching mechanism for Rails.cache.
The recommended Rails caching mechanism never expires items and instead relies on evicting the least recently used (LRU) item. Unfortunately, Redis by default isn't configured to evict the least recently used item, while the recommended cache store, memcached, is.
Furthermore, evicting items isn't a behavior I would want for my work queue, and configuring the same Redis instance to do this could lead to undesirable results. Nor would I want my queue to share cycles with my cache anyway.
What would you all recommend in this situation? A second redis store to act as a cache and have LRU configured? Or just use the rails recommended memcached cache store and only use redis alone to be a queue?
I'm leaning towards using both Redis and Memcached, despite plenty of Stack Overflow answers recommending otherwise. Memcached supporting LRU eviction by default is what wins me over.
Some articles:
redis and memcache or just redis
why is memcached still used alongside rails
Hidden deeper in the comments, posters cite memcached's LRU eviction as a great reason to use it as a cache.
Ended up using both redis and memcached. Pretty happy with the results.
The main difference is that Memcached is multithreaded, so it can spread across cores (and machines, via client-side sharding), while Redis is single-threaded; but Redis is so lightweight and fast that it takes a good amount of load to reach its limit on a decent machine, where it only uses a couple of cores. Since using both works for you, that's great, but it sounds like a bit of unnecessary complexity; e.g. if you need contractors to work on it, you'll need someone with experience in both technologies rather than just one.
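As a minimal sketch of the setup described above, the queue and the cache can be pointed at separate stores; the URLs and addresses here are placeholders:

    # config/initializers/sidekiq.rb -- Redis stays a dedicated work queue
    # (no eviction policy, so jobs are never silently dropped).
    Sidekiq.configure_server do |config|
      config.redis = { url: "redis://localhost:6379/0" }
    end
    Sidekiq.configure_client do |config|
      config.redis = { url: "redis://localhost:6379/0" }
    end

    # config/environments/production.rb -- memcached, which evicts LRU
    # items by default, serves as Rails.cache (requires the dalli gem).
    config.cache_store = :mem_cache_store, "localhost:11211"

The alternative from the question, a second Redis instance dedicated to caching, would instead be started with maxmemory and maxmemory-policy allkeys-lru set in its redis.conf.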

How to scale with Rails

I would like to prepare a Rails application to be scalable. The app connects to some APIs and sends emails; I'm using PostgreSQL, and it's hosted on Heroku.
Now that the code is clean, I would like to add caching and any other technique that will help the app scale.
Should I use Redis or Memcached? It's a little obscure to me, and I've seen similar questions on StackOverflow, but here I would like to know which one to use purely for scaling purposes.
Also, I was thinking of using Sidekiq to process some jobs. Is it going to conflict with Memcached/Redis? And in which cases should I use it?
Any other things I should think of in terms of scalability ?
Many thanks
Redis is a very good choice for caching; its performance is similar to memcached's (Redis is slightly faster), and it takes only a few minutes to configure it that way.
If possible, I would suggest against using the same Redis instance for both the cache and the message store.
If you really need to do that, make sure you configure Redis with the volatile-lru maxmemory policy and that you always set a TTL on your cache entries; this way, when Redis runs out of memory, the cache keys (the only ones with a TTL) will be evicted.
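For example, a cache write that always carries a TTL might look like this (the key and the helper method are made up):

    # Every cache entry gets an expiry, so volatile-lru can evict it
    # once Redis hits its maxmemory limit, while TTL-less Sidekiq keys
    # are left alone.
    Rails.cache.fetch("user/#{user.id}/profile", expires_in: 10.minutes) do
      expensive_profile_lookup(user) # hypothetical expensive query
    end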
Sidekiq requires Redis as its message store. So you will need to have a Redis instance or use a Redis service if you want to use Sidekiq. Sidekiq is great, btw.
You can use either Memcached or Redis as your cache store. For caching, I'd probably use Memcached, as its cache-cleaning behavior is better. In Rails in general, and in Rails 4 apps in particular, one rarely explicitly expires an item from the cache or sets an explicit expiration time. Instead, one depends on updates to the cache_key, which means the stale item isn't actually deleted from the store. Memcached handles this pretty well by evicting the least recently used items when it hits memory limits.
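To illustrate the cache_key pattern (a standard Rails fragment-caching idiom, sketched here with made-up view and model names):

    <%# app/views/products/show.html.erb %>
    <%# The fragment key includes @product.cache_key (id + updated_at), %>
    <%# so touching the record writes a fresh entry under a new key and %>
    <%# simply abandons the old one for LRU eviction to reclaim.        %>
    <% cache @product do %>
      <%= render @product %>
    <% end %>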
That all said, scaling is a lot more than picking a few components. You'll have to do a fair amount of planning, identify bottlenecks as the app develops, and scale CPU/memory/disk by increasing servers as needed.
You should also look at two cool Redis features that are coming in the near future:
Redis Cluster:
http://redis.io/topics/cluster-spec
http://redis.io/presentation/Redis_Cluster.pdf
Redis Sentinel (high availability):
http://redis.io/topics/sentinel
http://redis.io/topics/sentinel-spec
These features will be a great help in scaling Redis.
Memcached is missing these features, and as far as active development is concerned, the Redis community is far more vibrant.

Is there a big performance hit with using file_store for storing the cache as opposed to mem_cache_store?

I don't think I'm at the point where I need to go through and get memcached setup for my Rails app, but I would like to do some simple caching on a few things.
Is using file_store as the config.cache_store setting sufficient? Or will having to access files for data over and over kill the benefit of caching in the first place, from a server-load standpoint?
Or maybe I'm not really understanding the difference between file_store and mem_cache_store...
I don't think I'm at the point where I need to go through and get memcached setup for my Rails app, but I would like to do some simple caching on a few things
Then use your existing database to store your cached items. (You are using a database, right?)
memcached is only a fast-but-dumb database. If you don't need the ‘fast’ part(*) then don't introduce the extra complexity, inconsistency and overhead of having a separate cache layer.
The Rails cache with file_store is a dumb-but-not-even-fast database, and thus of little use to anyone except for compatibility/testing.
(*: and really, most sites don't. Memcache should be a last resort when you can't optimise your schema, denormalise it for common queries or pre-calculate complex operations any further. It's not something the average web application should be considering as essential for scalability.)
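Rails has no built-in database cache store, so a hand-rolled version of this suggestion might look like the following sketch; the table and model are made up, there is no locking, and values are assumed to be strings, so treat it as illustrative only:

    # Hypothetical model over a cached_values table
    # (columns: key:string, value:text, expires_at:datetime).
    class CachedValue < ApplicationRecord
      # Fetch-style helper: return the cached value if still fresh,
      # otherwise compute it with the block, store it, and return it.
      def self.fetch(key, ttl: 1.hour)
        row = find_by(key: key)
        return row.value if row && row.expires_at.future?

        yield.to_s.tap do |fresh|
          where(key: key).delete_all # drop any expired row
          create!(key: key, value: fresh, expires_at: ttl.from_now)
        end
      end
    end

    # Usage (hypothetical helper):
    #   html = CachedValue.fetch("sidebar", ttl: 10.minutes) { render_sidebar }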
The file_store will cache stuff in files in a filesystem.
If that filesystem is LOCAL to your web server, then clearly it will only be relevant to that particular web server; you'll therefore lose cache hits whenever a cached entity exists on one server but not another.
How many web servers do you have? 2? 10? 100?
The file_store for caching does not scale properly and will reduce your hit rate compared to a shared store (for example, the memcached one).
The purpose of using memcached is so that a pool of web servers can access a single cache (even though it can be split on multiple physical servers). Using the file_store will not (unless it's a network filesystem, and that will almost certainly be fraught with its own problems).
Always measure the cache hit rate on any cache; if you're not getting a high hit % then it's usually not worth it. Be sure to trend it over time in your monitoring.
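With memcached, one way to measure the hit rate is to read the server's own counters, e.g. via the dalli gem (the server address is a placeholder):

    require "dalli"

    client = Dalli::Client.new("localhost:11211")
    client.stats.each do |server, stats|
      hits   = stats["get_hits"].to_f
      misses = stats["get_misses"].to_f
      next if (hits + misses).zero? # no traffic yet
      puts format("%s: %.1f%% hit rate", server, 100 * hits / (hits + misses))
    end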

When would you NOT want to use memcached in a Ruby on Rails app?

Assuming a MySQL datastore, when would you NOT want to use memcached in a Ruby on Rails app?
Don't use memcached if your application is able to handle all requests quickly. Adding memcached is extra mental overhead when it comes to coding your app, so don't do it unless you need it.
Scaling's "one swell problem to have".
Memcached is a strong distributed cache, but it isn't any faster than local caching for some content. Caching should let you avoid bottlenecks, which are usually database requests and network requests. If you can cache your full page locally as HTML because it doesn't change very often (isn't very dynamic), then your web server can serve it up much faster than querying memcached. This is especially true if your memcached server, like most memcached servers, is on a separate machine.
The flip side is that I will sometimes use memcached locally instead of other caching options anyway, because I know that someday I will need to move it off to its own server.
The main benefit of memcached is that it is a distributed cache: you can generate content once and serve it from the cache across many servers (this is why memcached was created). All the previous answers seem to ignore this; it makes me wonder whether their authors have ever had to build a highly scalable app (which is exactly what memcached is for):
Danga Interactive developed memcached to enhance the speed of LiveJournal.com, a site which was already doing 20 million+ dynamic page views per day for 1 million users with a bunch of webservers and a bunch of database servers. memcached dropped the database load to almost nothing, yielding faster page load times for users, better resource utilization, and faster access to the databases on a memcache miss.
So the answer is: if your application is only ever likely to be deployed on a single server.
If you are ever likely to use more than one server (for scalability and redundancy), memcached is (nearly) always a good idea.
When you want fine-grained control over when things expire. From my tests, memcached only seems to have a timing resolution of about a second.
E.g., if you tell something to expire in 1 second, it could stay around for between 1 and just over 2 seconds.
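A quick way to observe this yourself, sketched with the dalli gem (the server address is a placeholder):

    require "dalli"

    cache = Dalli::Client.new("localhost:11211")
    cache.set("probe", "x", 1) # TTL of 1 second

    start = Time.now
    sleep 0.1 while cache.get("probe") # poll until the key expires
    puts "expired after #{(Time.now - start).round(2)}s" # often well over 1s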
