The documentation for the ActiveSupport::Cache::RedisCacheStore states:
Take care to use a dedicated Redis cache rather than pointing this at your existing Redis server. It won't cope well with mixed usage patterns and it won't expire cache entries by default.
Is this advice still true in general, especially when talking about custom data caches, not page (fragment) caches?
Or, more specifically: if I'm building a custom cache for specific "costly" backend calls to a slow third-party API and I set an explicit expires_in value on my cache (or on all my cached values), does this advice apply to me at all?
TL;DR: yes, as long as you set an eviction policy. Which one? Read on.
On the same page, the docs for #new state that:
No expiry is set on cache entries by default. Redis is expected to be configured with an eviction policy that automatically deletes least-recently or -frequently used keys when it reaches max memory. See redis.io/topics/lru-cache for cache server setup.
This is more about memory management and access patterns than what's being cached. The Redis eviction policy documentation has a detailed section for policy choice and mixed usage (whether to use a single instance or not):
Picking the right eviction policy is important depending on the access pattern of your application; however, you can reconfigure the policy at runtime while the application is running, and monitor the number of cache misses and hits using the Redis INFO output to tune your setup.
In general as a rule of thumb:
Use the allkeys-lru policy when you expect a power-law distribution in the popularity of your requests. That is, you expect a subset of elements will be accessed far more often than the rest. This is a good pick if you are unsure.
Use the allkeys-random if you have a cyclic access where all the keys are scanned continuously, or when you expect the distribution to be uniform.
Use the volatile-ttl if you want to be able to provide hints to Redis about which keys are good candidates for expiration by using different TTL values when you create your cache objects.
The volatile-lru and volatile-random policies are mainly useful when you want to use a single instance for both caching and to have a set of persistent keys. However it is usually a better idea to run two Redis instances to solve such a problem.
It is also worth noting that setting an expire value to a key costs memory, so using a policy like allkeys-lru is more memory efficient since there is no need for an expire configuration for the key to be evicted under memory pressure.
In your case there is no mixed usage: you are not, for example, persisting Sidekiq jobs (which have no TTL/expiry by default) in the same Redis instance. So you can treat your Redis instance as cache-only.
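As a rough sketch of that cache-only setup (the env var name, memory limit, default TTL, and the SlowApiClient class are illustrative assumptions, not prescriptions):

```ruby
# redis-cache.conf on the dedicated cache instance (values are assumptions):
#   maxmemory 256mb
#   maxmemory-policy allkeys-lru

# config/environments/production.rb
config.cache_store = :redis_cache_store, {
  url: ENV["REDIS_CACHE_URL"],  # hypothetical env var for the cache-only instance
  expires_in: 1.hour            # explicit default TTL for every entry
}

# A costly third-party call wrapped in the cache (key and TTL are examples):
rates = Rails.cache.fetch("slow_api/rates", expires_in: 15.minutes) do
  SlowApiClient.fetch_rates  # hypothetical client for the slow 3rd-party API
end
```

With allkeys-lru, entries can be evicted under memory pressure even before their TTL expires, which is exactly the behaviour you want from a cache-only instance.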
I have a Spring Boot application that I would like to deploy as multiple Docker instances behind a load balancer.
However, the application uses Ehcache to cache some data from a database, which makes the application stateful.
So without session stickiness, the same customer might hit different Docker instances and see different results.
My question is: if I can't apply session stickiness at the load balancer, what is the best practice for deploying an app with caching in Docker style while still complying with the should-be-stateless rule?
I explain in this Devoxx video how clustered caching can let each of your Docker instances share the same cache.
First of all, if you really have a pure caching use case, there should be no correctness impact, only a performance one, which of course can in itself be a bad thing for your application.
But effectively, if you want to use caching to provide performance and at the same time have a multi-node ability without sticky sessions, you have to move into the realm of distributed caching. This will give you the ability to share the cache content amongst the different nodes and thus make it (more) transparent for a given request in a conversation to hit any node of your application.
In the Ehcache world, this means backing the cache with a Terracotta server, see the documentation for details.
It is common to combine Ehcache with Terracotta to allow distributed caching among nodes.
I currently use Redis as a work queue for Sidekiq. I'm interested in also using it as a caching mechanism for Rails.cache.
The recommended Rails caching mechanism never expires items and instead relies on evicting the least recently used (LRU) items. Unfortunately, Redis by default isn't configured to evict the least recently used item, while the recommended cache store, memcached, is.
Furthermore, evicting items isn't a behavior I would want for my work queue, and configuring the same Redis instance to do this could lead to undesirable results. Nor would I want my queue to share cycles with my cache anyway.
What would you all recommend in this situation? A second Redis instance configured with LRU eviction to act as a cache? Or just use the Rails-recommended memcached cache store and use Redis alone as a queue?
I'm leaning towards using both Redis and memcached, despite plenty of Stack Overflow articles recommending otherwise. Memcached supporting LRU eviction by default is what wins me over.
Some articles:
redis and memcache or just redis
why is memcached still used alongside rails
Hidden deeper in the comments, posters mention memcached's LRU eviction as a great reason to use it as a cache.
Ended up using both redis and memcached. Pretty happy with the results.
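For anyone curious, a minimal sketch of that split, assuming MEMCACHED_HOST and REDIS_URL env vars (both names illustrative):

```ruby
# Gemfile: gem "dalli"

# config/environments/production.rb: Rails.cache goes to memcached,
# which evicts least-recently-used entries by default:
config.cache_store = :mem_cache_store, ENV.fetch("MEMCACHED_HOST", "localhost")

# config/initializers/sidekiq.rb: the work queue stays in Redis,
# which must NOT evict keys (jobs would silently disappear):
Sidekiq.configure_server { |cfg| cfg.redis = { url: ENV["REDIS_URL"] } }
Sidekiq.configure_client { |cfg| cfg.redis = { url: ENV["REDIS_URL"] } }
```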
The main difference is that memcached can run across parallel cores/machines, while Redis is so lightweight and fast that it takes a good amount of load to reach its limit on a decent machine, even though it only uses a couple of cores. Since using both works for you, that's great, but it sounds like a bit of unnecessary complexity, that's all (i.e. if you need contractors to work on it, you'll need someone with experience in both technologies rather than just one).
I would like to prepare a rails application to be scalable. Some features of this app is to connect to some APIs & send emails, plus I'm using PostgreSQL & it's on Heroku.
Now that code is clean, I would like to use caches and any technique that will help the app to scale.
Should I use Redis or Memcached? It's a little obscure to me, and I've seen similar questions on Stack Overflow, but here I would like to know which one I should use purely for scaling purposes.
Also, I was thinking of using Sidekiq to process some jobs. Is it going to conflict with Memcached/Redis? And in which cases should I use it?
Any other things I should think of in terms of scalability ?
Redis is a very good choice for caching; it has performance similar to memcached (Redis is slightly faster), and it takes only a few minutes to configure it that way.
If possible, I would suggest against using the same Redis instance to store both the cache and the message store.
If you really need to do that, make sure you configure Redis with the volatile-lru maxmemory policy and that you always set a TTL on your cache entries; this way, when Redis runs out of memory, only cache keys will be evicted.
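A minimal sketch of that shared-instance setup, with assumed memory values:

```ruby
# redis.conf on the shared instance (values are assumptions):
#   maxmemory 512mb
#   maxmemory-policy volatile-lru

# Every cache write carries a TTL, so volatile-lru always has candidates
# to evict, while Sidekiq's queue keys (no TTL) are never touched:
Rails.cache.write("slow_api/response", "cached payload", expires_in: 15.minutes)
```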
Sidekiq requires Redis as its message store. So you will need to have a Redis instance or use a Redis service if you want to use Sidekiq. Sidekiq is great, btw.
You can use either Memcached or Redis as your cache store. For caching I'd probably use Memcached, as the cache cleaning behavior is better. In Rails in general, and in Rails 4 apps in particular, one rarely explicitly expires an item from the cache or sets an explicit expiration time. Instead one depends on updates to the cache_key, which means the cached item isn't actually deleted from the store. Memcached handles this pretty well by evicting the least recently used items when it hits memory limits.
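A hedged sketch of that key-based approach (build_summary_for is a hypothetical helper):

```ruby
# product.cache_key changes whenever the record's updated_at changes, so
# a stale entry is simply never read again; memcached's LRU eviction
# reclaims it later without any explicit delete.
summary = Rails.cache.fetch([product.cache_key, "summary"]) do
  build_summary_for(product)  # hypothetical expensive computation
end
```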
That all said, scaling is a lot more than picking a few components. You'll have to do a fair amount of planning, identify bottlenecks as the app develops, and scale CPU/memory/disk by increasing servers as needed.
You must look at two cool features of Redis that are coming in the near future:
Redis Cluster:
http://redis.io/topics/cluster-spec
http://redis.io/presentation/Redis_Cluster.pdf
Redis Sentinel: (Highly Available)
http://redis.io/topics/sentinel
http://redis.io/topics/sentinel-spec
These features will also be of great help in scaling Redis.
Memcached is missing these features, and as far as active development is concerned, the Redis community is far more vibrant.
I have been using Dalli until now for caching, and today I came across Redis-Store.
I am wondering whether I should switch to redis-store. My app already uses Redis for certain stuff, so I have a Redis server which is quite big (in terms of resources), and I also have another memcached server. So if I were to switch to redis-store, it would mean I could remove the memcached server (one less server to maintain + less cost).
Has anyone done a comparison of these two solutions?
Performance
Is it a drop-in replacement (can I switch between these two at any time without code changes)?
Any other stuff I should know about?
Redis can be used as a cache or as a permanent store, but if you try to mix both, you can end up having "interesting issues".
When you have memcached, you have a maximum amount of memory for the process, so when memcached gets full, it will automatically remove the least recently used entries to make room for new ones.
You can configure Redis to have that behaviour too, but you don't want to do that if you are using Redis for persistent storage, because in that case you would potentially lose keys that are meant to be persistent.
So if you are using Redis for persistent storage, you would need to have two different Redis processes: one for your persistent keys, one for caching. Of course you could always have only one process and set expiration times on every cache item, but nothing assures you that you won't hit the memory limit before they expire and lose data, so in practice you need two processes. Besides, if you are setting up a master/slave configuration for your persistent data and you store the cache on the same server, you are basically wasting RAM, so separate processes are the way to go.
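A sketch of that two-process split; the ports and config values are assumptions:

```ruby
# redis-persistent.conf: port 6379, appendonly yes, maxmemory-policy noeviction
# redis-cache.conf:      port 6380, maxmemory 256mb, maxmemory-policy allkeys-lru
require "redis"

persistent = Redis.new(port: 6379) # keys you must never lose
cache      = Redis.new(port: 6380) # keys that may vanish at any moment
```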
Regarding performance, both Redis and memcached are VERY performant, and in different tests they are in the same range when it comes to getting/extracting data, but memcached is better when you only need a cache.
Why is this so? First of all, since memcached has only one mission, storing key/values, it doesn't have any overhead when it comes to storing metadata. Redis, on the other hand, offers different data structures, so it stores more metadata with each key. One example of this: it's much "cheaper" to store data in a hash in Redis than to use individual keys. You don't get any of this in memcached, since there is only one type of data. This means that with the same amount of memory in your servers, you can store more data in memcached than in Redis. If you have a relatively small installation you don't really care, but the moment you start seeing growth, believe me, you will want to keep that data under control.
So, as much as I like Redis, I prefer to have memcached for my caching needs and Redis for my persistent storage/temporary storage/queue needs. I still use Redis as a "cache", though not a temporary one with expiration, but as a lookup cache to save reading from a more expensive storage. For example, I keep a mapping between user IDs and nicknames in Redis. I never expire these mappings, so Redis is a perfect place for them.
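That lookup-cache pattern might look like this (the key and field names are illustrative):

```ruby
require "redis"
redis = Redis.new(url: ENV["REDIS_URL"])

# One hash holding all id -> nickname pairs is cheaper than one top-level
# key per user, because Redis stores small hashes in a compact encoding.
redis.hset("nicknames", "42", "grace")
redis.hget("nicknames", "42")  # => "grace"
```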
If you are dealing with a small amount of data, your idea of having a single technology for everything might make sense, but the moment you grow past a few hundred MB, I would say go with both of them.
I don't think I'm at the point where I need to go through and get memcached setup for my Rails app, but I would like to do some simple caching on a few things.
Is using file_store as the config.cache_store setting sufficient? Or will having to access files for data over and over kill the benefit of caching in the first place, from a server-load standpoint?
Or maybe I'm not really understanding the difference between file_store and mem_cache_store...
I don't think I'm at the point where I need to go through and get memcached setup for my Rails app, but I would like to do some simple caching on a few things
Then use your existing database to store your cached items. (You are using a database, right?)
memcached is only a fast-but-dumb database. If you don't need the ‘fast’ part(*) then don't introduce the extra complexity, inconsistency and overhead of having a separate cache layer.
The Rails cache with file_store is a dumb-but-not-even-fast database, and thus of little use to anyone except for compatibility/testing.
(*: and really, most sites don't. Memcache should be a last resort when you can't optimise your schema, denormalise it for common queries or pre-calculate complex operations any further. It's not something the average web application should be considering as essential for scalability.)
The file_store will cache stuff in files in a filesystem.
If that filesystem is LOCAL to your web server, then clearly, it will only be relevant to that particular web server, therefore you'll lose cache hit rate when a cached entity exists on one server but not another.
How many web servers do you have? 2? 10? 100?
The file_store for caching does not scale properly and will reduce your hit rate compared to a shared store (for example, the memcached one).
The purpose of using memcached is so that a pool of web servers can access a single cache (even though it can be split on multiple physical servers). Using the file_store will not (unless it's a network filesystem, and that will almost certainly be fraught with its own problems).
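The difference in one sketch (the hostnames are illustrative):

```ruby
# Per-server cache: each web server keeps its own files, so the same
# fragment may be computed once per server:
config.cache_store = :file_store, "tmp/cache"

# Shared cache: every server in the pool reads and writes the same
# memcached pool:
config.cache_store = :mem_cache_store, "cache-1.example.com", "cache-2.example.com"
```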
Always measure the cache hit rate on any cache; if you're not getting a high hit % then it's usually not worth it. Be sure to trend it over time in your monitoring.
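If you use mem_cache_store via Dalli, a rough hit rate can be derived from the raw server counters; this is a sketch assuming at least one reachable memcached server:

```ruby
stats  = Rails.cache.stats.values.first  # counters from the first server
hits   = stats["get_hits"].to_f
misses = stats["get_misses"].to_f
puts format("hit rate: %.1f%%", 100 * hits / (hits + misses))
```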