How to scale with Rails - ruby-on-rails

I would like to prepare a Rails application to be scalable. Among other things, the app connects to some APIs and sends emails; it uses PostgreSQL and runs on Heroku.
Now that the code is clean, I would like to use caching and any other technique that will help the app scale.
Should I use Redis or Memcached? It's a little obscure to me. I've seen similar questions on StackOverflow, but here I would like to know which one I should use purely for scaling purposes.
I was also thinking of using Sidekiq to process some jobs. Is it going to conflict with Memcached/Redis? And in which case should I use it?
Is there anything else I should think of in terms of scalability?
Many thanks

Redis is a very good choice for caching. Its performance is similar to Memcached's (Redis is slightly faster), and it takes only a few minutes to configure it that way.
If possible, I would suggest against using the same Redis instance as both cache and message store.
If you really need to do that, make sure you configure Redis with the volatile-lru maxmemory policy and always set a TTL on your cache entries. Under volatile-lru only keys that have a TTL are eligible for eviction, so when Redis runs out of memory the cache keys will be evicted and the message store keys will not.
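To make the volatile-lru behaviour concrete, here is a toy in-memory model in plain Ruby. It is not a Redis client and the class name is made up for illustration; it only mimics the rule that keys written without a TTL are never eviction candidates. Real Redis uses an approximated LRU and limits memory in bytes rather than in number of keys, and the sketch does not simulate TTL expiry itself.

```ruby
# Toy model of the volatile-lru policy (hypothetical class, not a Redis API).
# Only keys written with a TTL can be evicted; keys without one
# (for example queued jobs) always survive.
class VolatileLruStore
  def initialize(max_keys)
    @max_keys = max_keys
    @data = {}      # key => value; Ruby hashes keep insertion order
    @volatile = {}  # key => true for keys that carry a TTL
  end

  def write(key, value, ttl: nil)
    @data.delete(key)
    @volatile.delete(key)
    evict_one while @data.size >= @max_keys
    @data[key] = value
    @volatile[key] = true if ttl
    value
  end

  def read(key)
    return nil unless @data.key?(key)
    value = @data.delete(key)
    @data[key] = value # re-insert to mark the key as most recently used
    value
  end

  def key?(key)
    @data.key?(key)
  end

  private

  # Evict the least recently used key among those that have a TTL.
  def evict_one
    victim = @data.keys.find { |k| @volatile[k] }
    raise "out of memory: no key with a TTL left to evict" unless victim
    @data.delete(victim)
    @volatile.delete(victim)
  end
end

store = VolatileLruStore.new(3)
store.write("queue:job:1", "send_email")  # no TTL, must never be evicted
store.write("cache:a", "A", ttl: 60)
store.write("cache:b", "B", ttl: 60)
store.read("cache:a")                     # "cache:a" is now more recent than "cache:b"
store.write("cache:c", "C", ttl: 60)      # store is full: "cache:b" is evicted
```

After the last write the job key and "cache:a" are still present, while "cache:b" is gone. If a cache entry had been written without a TTL it would have been as immune to eviction as the job, which is why the advice above insists on always setting one.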

Sidekiq requires Redis as its message store. So you will need to have a Redis instance or use a Redis service if you want to use Sidekiq. Sidekiq is great, btw.
You can use either Memcached or Redis as your cache store. For caching I'd probably use Memcached, as its cache cleaning behavior is better. In Rails in general, and in Rails 4 apps in particular, one rarely explicitly expires an item from the cache or sets an explicit expiration time. Instead one depends on updates to the cache_key, which means the stale cached item isn't actually deleted from the store. Memcached handles this pretty well, by evicting the least recently used items when it hits its memory limit.
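A minimal sketch of why key-based expiration leaves stale entries behind. The Post struct is a hypothetical stand-in for an ActiveRecord model, the key format only resembles what Rails generates, and a plain hash plays the role of the cache store.

```ruby
# Hypothetical stand-in for a model; only id and updated_at matter here.
Post = Struct.new(:id, :title, :updated_at) do
  # Similar in spirit to Rails' cache_key: the timestamp is part of the key.
  def cache_key
    "posts/#{id}-#{updated_at.utc.strftime('%Y%m%d%H%M%S')}"
  end
end

store = {} # stands in for the cache store

post = Post.new(1, "Hello", Time.utc(2014, 1, 1, 12, 0, 0))
store[post.cache_key] ||= "<h1>#{post.title}</h1>"

# Updating the record changes the key, so the next lookup is a miss...
post.title = "Hello again"
post.updated_at = Time.utc(2014, 1, 2, 9, 30, 0)
store[post.cache_key] ||= "<h1>#{post.title}</h1>"

# ...and the old entry is still sitting in the store: nothing deleted it.
puts store.keys
```

Both keys are printed. The first entry will never be read again, so something has to reclaim its memory eventually, and that is the job LRU eviction does.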
That all said, scaling is a lot more than picking a few components. You'll have to do a fair amount of planning, identify bottlenecks as the app develops, and scale CPU/memory/disk by increasing servers as needed.

You should look at two nice Redis features that are coming in the near future:
Redis Cluster:
http://redis.io/topics/cluster-spec
http://redis.io/presentation/Redis_Cluster.pdf
Redis Sentinel: (Highly Available)
http://redis.io/topics/sentinel
http://redis.io/topics/sentinel-spec
These features will be a great help in scaling Redis.
Memcached lacks them, and as far as active development is concerned, the Redis community is far more vibrant.

Related

Memcached vs Redis as Rails.cache when requiring an LRU cache

I currently use Redis as a work queue for Sidekiq. I'm interested in also using it as a caching mechanism for Rails.cache.
The recommended Rails caching mechanism never expires items and relies on evicting the least recently used (LRU) item. Unfortunately, Redis by default isn't configured to evict the least recently used item while the recommended cache store, memcached, is.
Furthermore, evicting items isn't a behavior I would want for my work queue, and configuring the same Redis instance to do this could lead to undesirable results. Nor would I want my queue to share cycles with my cache anyways.
What would you all recommend in this situation? A second redis store to act as a cache and have LRU configured? Or just use the rails recommended memcached cache store and only use redis alone to be a queue?
I'm leaning towards using both Redis and Memcached, despite plenty of stack overflow articles recommending otherwise. memcached supporting LRU eviction by default is what wins me over.
Some articles:
redis and memcache or just redis
why is memcached still used alongside rails
Hidden deeper in the comments, posters mention memcached's LRU eviction as a great reason to use it as a cache.
Ended up using both redis and memcached. Pretty happy with the results.
The main difference is that Memcached can run in parallel across cores/machines, whereas Redis only uses a couple of cores. Still, Redis is so lightweight and fast that it takes a good amount of load to reach its limit on a decent machine. Since using both works for you, that's great, but it sounds like a bit of unnecessary complexity, that's all (i.e. if you need contractors to work on it, you'll need someone with experience in both technologies rather than just one).

Dalli vs Redis-Store for Rails App

I have been using Dalli for caching until now, and today I came across redis-store.
I am wondering whether I should switch to redis-store. My app already uses Redis for certain things, so I have a Redis server which is quite big (in terms of resources), and I also have a separate Memcached server. So if I were to switch to redis-store, I could remove the Memcached server (one less server to maintain, plus less cost).
Has anyone done a comparison of these two solutions?
Performance
Is it a drop-in replacement (can I switch between the two at any time without code changes)?
Any other stuff I should know about.
Redis can be used as a cache or as a permanent store, but if you try to mix both, you can end up having "interesting issues".
When you have memcached, you have a maximum amount of memory for the process, so when memcached gets full it will automatically remove the least recently used entries to make room for new ones.
You can configure Redis to behave that way too, but you don't want to do that if you are using Redis for persistent storage, because in that case you could lose keys that are meant to be persistent.
So if you are using Redis for persistent storage, you need two different Redis processes: one for your persistent keys and one for caching. Of course you could have only one process and set an expiry time on every cache item, but nothing guarantees that you won't hit the memory limit before they expire and lose data, so in practice you need two processes. Besides, if you set up a master/slave configuration for your persistent data and store the cache on the same server, you are basically wasting RAM, so separate processes are the way to go.
As for performance, both Redis and memcached are VERY performant, and in different tests they are in the same range when it comes to storing and retrieving data, but memcached is better when you only need a cache.
Why is this so? First of all, since memcached only has one mission, which is storing key/values, it doesn't have any overhead when it comes to storing metadata. Redis, on the other hand, offers different data structures, so it stores more metadata with each key. One example of this: it's much "cheaper" to store data in a hash in Redis than to use individual keys. You don't get any of this with memcached, since there is only one type of data. This means that with the same amount of memory in your servers you can store more data in memcached than in Redis. If you have a relatively small installation you don't really care, but the moment you start seeing growth, believe me, you will want to keep that data under control.
So, as much as I like Redis, I prefer to have memcached for my caching needs and redis for my persistent storage/temporary storage/queue needs. I still use redis as a "cache" but not a temporary one with expiration, but as a lookup cache to save reading from a more expensive storage. For example, I keep a mapping between user IDs and nicknames on Redis. I never expire these mappings, so Redis is a perfect place for it.
If you are dealing with a small amount of data, your idea of having a single technology for everything might make sense, but the moment you start growing beyond a few hundred MB, I would say go with both of them.

Rails and caching, is it easy to switch between memcache and redis?

Is there a common api such that if I switch between Redis or Memcached I don't have to change my code, just a config setting?
As long as you don't initialize the Memcached client yourself but you rely on Rails.cache common API, switching from Memcached to Redis is just a matter of installing redis-store and changing the configuration from
config.cache_store = :memcached_store
to
config.cache_store = :redis_store
More info about Rails.cache.
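To illustrate why the switch can be a one-line change, here is a plain-Ruby sketch of the idea. The two store classes and the Cache module are hypothetical stand-ins, not the real Rails or gem classes; the point is only that application code which talks to a common fetch-style interface does not care which backend sits behind it.

```ruby
# Two hypothetical backends exposing the same tiny interface (read/write).
# They stand in for a Memcached-backed and a Redis-backed cache store.
class FakeMemcachedStore
  def initialize
    @data = {}
  end

  def read(key)
    @data[key]
  end

  def write(key, value)
    @data[key] = value
  end
end

class FakeRedisStore
  def initialize
    @data = {}
  end

  # This backend serializes values, as a real networked store would.
  def read(key)
    raw = @data[key]
    raw && Marshal.load(raw)
  end

  def write(key, value)
    @data[key] = Marshal.dump(value)
    value
  end
end

# Application code only ever calls Cache.fetch, whatever store is configured.
module Cache
  class << self
    attr_accessor :store

    def fetch(key)
      cached = store.read(key)
      return cached unless cached.nil?
      store.write(key, yield) # miss: compute, store, return
    end
  end
end

def nickname_for(user_id)
  Cache.fetch("nickname/#{user_id}") { "user-#{user_id}" } # block = the expensive lookup
end

Cache.store = FakeMemcachedStore.new
from_memcached = nickname_for(42)

Cache.store = FakeRedisStore.new # the only line that changes
from_redis = nickname_for(42)
```

Code that bypasses the common interface, for example by calling a client's own commands directly, is what would need rewriting on a switch.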
I hate to mess with your goals, but I would advise against using redis over memcached for generic rails caching.
I use redis and resque extensively in a large rails application and I thought it would be nice to consolidate caching, raw redis and resque into one. I ran into a few big issues:
First off, it was slower. It could have totally been my specific usage, the redis-store library or redis itself. I'm not going to badmouth anything and your mileage may vary, but it would suck to dump a lot of time switching to redis when memcached "just works"
Memcached is nice because it's extremely easy to add servers and use consistent hashing to accomplish your goals. Redis has this also, but in my experience it was difficult to simultaneously treat Redis as a monolithic datastore in some parts of my app and as distributed, consistently hashed blobs of caching storage in others.
Good luck with your project. I love Redis AND memcached and use them in all my projects, but I let one do its job as a kick-ass data structure server and let the other one kick ass at caching.
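For readers who have not met consistent hashing, here is a minimal sketch in plain Ruby. The HashRing class, the server names and the number of virtual nodes are all made up for illustration; real memcached clients ship their own implementations. The property to notice is that adding a server only moves keys onto the new server, so most of the cache stays warm.

```ruby
require "digest"

# Minimal consistent-hash ring (illustrative, not a real client library).
# Each server is placed on the ring at several points ("virtual nodes");
# a key belongs to the first point at or after the key's own hash.
class HashRing
  def initialize(servers, replicas: 100)
    @replicas = replicas
    @ring = {} # ring position => server name
    servers.each { |server| add(server) }
  end

  def add(server)
    @replicas.times { |i| @ring[position("#{server}:#{i}")] = server }
    @sorted = @ring.keys.sort
  end

  def server_for(key)
    pos = position(key)
    point = @sorted.bsearch { |p| p >= pos } || @sorted.first # wrap around
    @ring[point]
  end

  private

  # Map a string to a 32-bit position on the ring.
  def position(string)
    Digest::MD5.hexdigest(string)[0, 8].to_i(16)
  end
end

keys = (1..1000).map { |i| "views/posts/#{i}" }

ring = HashRing.new(%w[cache1 cache2 cache3])
before = keys.to_h { |k| [k, ring.server_for(k)] }

ring.add("cache4")
after = keys.to_h { |k| [k, ring.server_for(k)] }

moved = keys.count { |k| before[k] != after[k] }
puts "#{moved} of #{keys.size} keys moved, all of them to cache4"
```

With naive hashing (hash modulo number of servers), going from three to four servers would remap most keys and effectively flush the cache.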
The neat parts of Redis include caching "list-based" things: you push/pop items on a list as things happen in your app, rather than de-serializing a large value from memcached, editing it, then re-serializing it.
This would be done in Ruby code in a custom filter, rather than through the basic Rails cache.

Redis and Memcache or just Redis?

I'm using memcached for some caching in my Rails 3 app through the simple Rails.cache interface and now I'd like to do some background job processing with redis and resque.
I think they're different enough to warrant using both. On heroku though, there are separate fees to use both memcached and redis. Does it make sense to use both or should I migrate to just using redis?
I like using memcached for caching because least recently used keys automatically get pushed out of the cache and I don't need the cache data to persist. Redis is mostly new to me, but I understand that it's persistent by default and that keys do not expire out of the cache automatically.
EDIT: Just wanted to be more clear with my question. I know it's feasible to use only Redis instead of both. I guess I just want to know if there are any specific disadvantages in doing so? Considering both implementation and infrastructure, are there any reasons why I shouldn't just use Redis? (I.e., is memcached faster for simple caching?) I haven't found anything definitive either way.
Assuming that migrating from memcached to redis for the caching you already do is easy enough, I'd go with redis only to keep things simple.
In redis persistence is optional, so you can use it much like memcached if that is what you want. You may even find that making your cache persistent is useful to avoid lots of cache misses after a restart. Expiry is available also - the algorithm is a bit different from memcached, but not enough to matter for most purposes - see http://redis.io/commands/expire for details.
I'm the author of redis-store. There is no need to use Redis commands directly; just use the :expires_in option like this:
ActionController::Base.cache_store = :redis_store, :expires_in => 5.minutes
The advantage of using Redis is speed, and with my gem you already have stores for Rack::Cache, Rails.cache and I18n.
I've seen a few large rails sites that use both Memcached and Redis. Memcached is used for ephemeral things that are nice to keep hot in memory but can be lost/regenerated if needed, and Redis for persistent storage. Both are used to take a load off the main DB for reading/write heavy operations.
More details:
Memcached: used for page/fragment/response caching. It's OK to hit the memory limit on Memcached, because it will use LRU (least recently used) eviction to expire the old stuff and keep frequently accessed keys hot in memory. It's important that anything in Memcached can be recreated from the DB if needed (it's not your only copy). But you can keep dumping things into it, and Memcached will figure out which are used most frequently and keep those hot in memory. You don't have to worry about removing things from Memcached.
Redis: you use this for data that you would not want to lose and that is small enough to fit in memory. This usually includes resque/sidekiq jobs, counters for rate limiting, split test results, or anything else that you wouldn't want to lose/recreate. You don't want to exceed the memory limit here, so you have to be a little more careful about what you store, and clean up later.
Redis starts to suffer performance problems once it exceeds its memory limit (correct me if I'm wrong). It's possible to solve this by configuring Redis to act like Memcached and LRU-expire stuff, so it never reaches its memory limit. But you would not want to do this with everything you are keeping in Redis, like resque jobs. So instead, people often keep the default Rails.cache set to use Memcached (using the dalli gem), and keep a separate $redis = ... global variable for Redis operations.
# in config/application.rb
config.cache_store = :dalli_store # memcached
# in config/initializers/redis.rb
$redis = Redis.new(url: ENV['REDIS_URL']) # Redis.connect was removed in later redis-rb versions
There might be an easy way to do this all in Redis - perhaps by having two separate Redis instances, one with an LRU hard memory limit, similar to Memcache, and another for persistent storage? I haven't seen this used, but I'm guessing it would be doable.
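A sketch of what the two-instance setup could look like today, under some assumptions that are not in the original answer: Rails 5.2 or later (which ships a built-in :redis_cache_store), Sidekiq as the job processor, and two separately provisioned Redis servers. The environment variable names and the 256mb limit are placeholders.

```ruby
# config/environments/production.rb
# Cache instance. Its Redis server is configured with a memory cap and LRU
# eviction, e.g. in redis.conf:
#   maxmemory 256mb
#   maxmemory-policy allkeys-lru
config.cache_store = :redis_cache_store, { url: ENV["REDIS_CACHE_URL"] }

# config/initializers/sidekiq.rb
# Queue instance. A separate Redis server left on
#   maxmemory-policy noeviction
# so that jobs are never evicted to make room for cache entries.
Sidekiq.configure_server do |config|
  config.redis = { url: ENV["REDIS_QUEUE_URL"] }
end

Sidekiq.configure_client do |config|
  config.redis = { url: ENV["REDIS_QUEUE_URL"] }
end
```

The two files are shown together only for brevity. Whether this is preferable to Memcached plus one Redis is the trade-off the answers in this thread disagree on.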
I would consider checking out my answer on this subject:
Rails and caching, is it easy to switch between memcache and redis?
Essentially, based on my experience, I would advocate keeping them separate: memcached for caching, and Redis for data structures and more persistent storage.
I asked the team at Redis Labs (who provide the Memcached Cloud and Redis Cloud add-ons) which product they would recommend for Rails caching. They said that in general they would recommend Redis Cloud, that Memcached Cloud is mainly offered for legacy purposes, and pointed out that their Memcached Cloud service is in fact built on top of Redis Cloud.
I don't know what you're using them for, but actually using both may give you a performance advantage: Memcached has far better performance running across multiple cores than Redis, so caching the most important data with Memcached and keeping the rest in Redis, taking advantage of its capabilities as database, could increase performance.

When would you NOT want to use memcached in a Ruby on Rails app?

Assuming a MySQL datastore, when would you NOT want to use memcached in a Ruby on Rails app?
Don't use memcached if your application is able to handle all requests quickly. Adding memcached is extra mental overhead when it comes to coding your app, so don't do it unless you need it.
Scaling's "one swell problem to have".
Memcache is a strong distributed cache, but it isn't any faster than local caching for some content. Caching should let you avoid bottlenecks, which are usually database requests and network requests. If you can cache your full page locally as HTML because it doesn't change very often (isn't very dynamic), then your web server can serve it much faster than it could by querying memcache. This is especially true if your memcache server, like most memcached servers, is on a separate machine.
Flip side of this is that I will sometimes use memcache locally instead of other caching options because I know someday I will need to move it off to its own server.
The main benefit of memcached is that it is a distributed cache. That means you can generate once, and serve from cache across many servers (this is why memcached was created). All the previous answers seem to ignore this - it makes me wonder if they have ever had to build a highly scalable app (which is exactly what memcached is for)
"Danga Interactive developed memcached to enhance the speed of LiveJournal.com, a site which was already doing 20 million+ dynamic page views per day for 1 million users with a bunch of webservers and a bunch of database servers. memcached dropped the database load to almost nothing, yielding faster page load times for users, better resource utilization, and faster access to the databases on a memcache miss."
(My bolding)
So the answer is: if your application is only ever likely to be deployed on a single server.
If you are ever likely to use more than one server (for scalability and redundancy) memcached is (nearly) always a good idea.
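The "generate once, serve from many servers" point can be shown with a small plain-Ruby simulation. AppServer is a made-up class and plain hashes stand in for the caches; with one cache per server the expensive render happens once on every server, while with a shared cache (the role memcached plays) it happens once in total.

```ruby
# Counts how often an expensive page is rendered with per-server caches
# versus one shared cache. Plain hashes stand in for the cache stores.
class AppServer
  attr_reader :renders

  def initialize(cache)
    @cache = cache
    @renders = 0
  end

  def page(path)
    @cache[path] ||= begin
      @renders += 1 # the expensive part: DB queries, template rendering
      "<html>#{path}</html>"
    end
  end
end

# Each server has its own local cache: every server renders the page once.
local_servers = Array.new(4) { AppServer.new({}) }
local_servers.each { |server| 3.times { server.page("/posts/1") } }
local_renders = local_servers.sum(&:renders)

# All servers share one cache: a single render serves everyone.
shared_cache = {}
shared_servers = Array.new(4) { AppServer.new(shared_cache) }
shared_servers.each { |server| 3.times { server.page("/posts/1") } }
shared_renders = shared_servers.sum(&:renders)

puts "local caches: #{local_renders} renders, shared cache: #{shared_renders} render"
```

On a single server the two setups do the same amount of work, which is the case where the previous answers' advice to skip memcached applies.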
When you want to have fine-grained control over when things expire. From my tests, memcached only seems to have a timing resolution of about a second.
E.g., if you tell something to expire in 1 second, it could stay around for anywhere between 1 and just over 2 seconds.
