Redis memory limit with read-only behavior

I know the main ways Redis handles keys when the memory limit is reached. However...
What if I want Redis to "lock down" itself by becoming read-only until some keys receive a delete signal? The reason is that all the data in our Redis Cluster is quite important, so we want it available at all times. Of course, if the memory limit is reached we need to free some space, but the data to drop should be decided by the user, not by Redis.
Example:
A user watches his statistics window on our application server. Behind that, we store all of the data in Redis and display it for him. When the user closes the web app, I currently free up all the keys related to his session. I want to drop these keys myself when the memory limit is reached, so that no one gets "random" keys deleted, nor the least frequently used ones.
Is this possible, or am I just dreaming?

maxmemory-policy determines how your cache behaves when it reaches maxmemory. The default option is noeviction, which means Redis won't try to automatically evict any items based on TTL or LRU, etc. This essentially makes the cache read-only once it is full: you can't add new items, but you can still read existing ones.
Obviously some external process or user will then need to delete some of the items before you can add new data.
https://raw.githubusercontent.com/antirez/redis/5.0/redis.conf (search for MEMORY MANAGEMENT in that config file)
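The relevant part of that config boils down to two directives; a minimal sketch (the 100mb value is just an example). With noeviction, write commands that would need more memory are rejected with an OOM error, while reads keep working:
# Cap the memory Redis may use for data.
maxmemory 100mb
# Never evict keys automatically; writes fail once the limit is hit.
maxmemory-policy noeviction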

ETS entry limit, to use as a cache server

My idea is to use ETS as a temporary cache for my GenServer state.
For example, when I restart my application, the GenServer state should be saved to ETS, and when the application starts again, the GenServer should be able to get the state back from there.
I would like to keep it simple, so all the state (a Map) from the GenServer should fit in a single entry. Is there a limit on entry size?
Another approach would be to simply create a file and load it again when needed. Maybe this is even better/simpler, but I am not sure :)
In the case of an ETS table, the app could start on a completely different host and connect to the cache node (ETS).
This can most certainly be done in a wide variety of ways. You could have a separate store, like Mnesia (which is ETS under the hood), Redis, or a plain database. In the latter two cases, you would need to serialize your GenServer state to a binary and back using :erlang.term_to_binary/1 and :erlang.binary_to_term/1 respectively.
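For example, a minimal sketch of that serialization round trip, assuming the Redix client and an illustrative "cart:<id>" key format:
# Persist a GenServer's state map to Redis and load it back.
defmodule CartStore do
  def save(conn, cart_id, state) do
    Redix.command(conn, ["SET", "cart:#{cart_id}", :erlang.term_to_binary(state)])
  end

  def load(conn, cart_id) do
    case Redix.command(conn, ["GET", "cart:#{cart_id}"]) do
      {:ok, nil} -> {:error, :not_found}
      {:ok, bin} -> {:ok, :erlang.binary_to_term(bin)}
      error -> error
    end
  end
end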
In the case that you are dealing with multiple GenServer processes that need to be cached in this way, e.g. where every GenServer represents a unique customer cart, that unique identifier can be used as the key under which to store the state, so it can be retrieved later on. This is particularly useful when you are running your shopping application on multiple nodes behind a load balancer, where every new request from a customer can get 'round-robin'-ed to any random node.
When the request comes in:
fetch the unique identifier belonging to that customer in one way or the other,
fetch the stored contents from wherever that may be(Mnesia/Redis/...),
spawn a new GenServer process initialized with those stored contents,
do the various operations required for that request,
store the latest modified GenServer shopping cart into Redis/Mnesia/wherever,
tear down the GenServer and
respond to the request with whatever data is required.
Based on the benchmarks I have done of ETS vs Redis on my local machine, it is no surprise that ETS is the more performant way to go, but ElastiCache is an awesome alternative if you are not in the mood to bother spinning up a dedicated Mnesia store.
In the case it pertains to a specific GenServer that needs to run, then you are most likely looking at failover as opposed to managing individual user requests.
In such a case, you could consider using something like https://hexdocs.pm/elixir/GenServer.html#c:terminate/2 to have the state persisted to some store first, and in your init make the GenServer look in that store and reuse the cached state accordingly.
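A rough sketch of that idea, using a hypothetical Cache module with get/1 and put/2 (which could be backed by ETS, Mnesia, Redis, ...); the :my_server key is only illustrative:
defmodule MyServer do
  use GenServer

  def init(_args) do
    # Trap exits so terminate/2 runs when the supervisor shuts us down.
    Process.flag(:trap_exit, true)
    # Reuse the persisted state if there is any, otherwise start empty.
    state = Cache.get(:my_server) || %{}
    {:ok, state}
  end

  def terminate(_reason, state) do
    # Persist the latest state before the process goes down.
    Cache.put(:my_server, state)
    :ok
  end
end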
The complicated matter here is the scenario where you have multiple applications running: which key will you use so that the crashed application reinitializes the GenServer with the correct state?
There are several open-ended questions here that revolve around your exact use case, but what has been presented so far should give you a fair idea of when it makes sense to use this caching solution and how to start implementing it.

Let user choose to update Service Worker

Is it possible to let the user choose when to update a Service Worker?
Why? I want to add an economy mode, which means the user could choose to save a lot of bandwidth. This could be useful when the user's data limit is almost used up or he/she is on expensive internet abroad.
That's because if the Service Worker updates and there are new versions of the assets, it will download all of them, which could be several MB. If you're 3 days and 50 MB away from a new month, every MB counts.
Let's say that I can retrieve the setting from localStorage:
const economy = localStorage.getItem('economy') || false
How do I let the Service Worker know that it should only update itself if economy is true?
I realize that this could be a problem in the long run (outdated versions), but I'm planning to nag the user often if he/she doesn't want to update. I just want to give the user the option to choose.
If you're willing to handle updating/deleting (perhaps just a subset of) cache entries outside of the install and activate events, you have more flexibility as to when they should be triggered. You actually don't have to perform the updates in the service worker at all, if it ends up being easier for you not to. Individual webpages have access to the exact same Cache Storage API instances that the service worker for a given origin uses, so you can modify the caches directly from the page in response to whatever action makes the most sense for you, e.g.:
// Ensure we have access to the Cache Storage API.
if ('caches' in window) {
  // Swap this out for whatever UI element will trigger the update.
  const el = document.querySelector('#update-caches-button');
  el.addEventListener('click', () => {
    window.caches.open('my-cache').then(cache => {
      // Add or delete entries from the cache.
    });
  });
}
You could do something similar, but keep all the logic in the service worker, by using postMessage() from a client page to trigger a service worker's message event, and then update the caches in the message event handler.
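A minimal sketch of that message-passing variant (the 'UPDATE_CACHES' message type, the cache name, and the asset URLs are placeholders):
// In the page: ask the controlling service worker to refresh its caches.
if (navigator.serviceWorker.controller) {
  navigator.serviceWorker.controller.postMessage({type: 'UPDATE_CACHES'});
}

// In the service worker: handle the message and update the caches.
self.addEventListener('message', event => {
  if (event.data && event.data.type === 'UPDATE_CACHES') {
    event.waitUntil(
      caches.open('my-cache').then(cache => cache.addAll(['/app.js', '/app.css']))
    );
  }
});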
There are some advantages to relying on the install/activate events for performing cache management. In particular, you can rely on the service worker staying in a "waiting" state while there are other active clients that rely on the previous cache state, and don't have to worry about throwing away entries that will be needed by those other clients or swapping out the expected version of a resource for a new version while it's still being used. But as a developer, ultimately you're responsible for understanding how your cached resources are being used, and if the assets you're managing in these caches aren't likely to cause problems if there's a version mismatch or if something that was deleted by one tab needs to be retrieved from the network later on by another tab, then you're free to implement that sort of approach.
For what it's worth, we've thought about similar issues when implementing precaching/updates in sw-precache. There's some background at https://github.com/GoogleChrome/sw-precache/issues/145 about trying to make use of standard signals exposed by browsers indicating that the user prefers to conserve data, rather than everyone coming up with their own heuristics.

How do memcache and Rails work at max limits?

I am trying to understand how memcache works when (if) you fill up the allocated memory buffer. In particular, I want to understand the lifecycle of a key-value pair in the cache. I am talking about low-level cache operations in Rails where I am directly creating the key/value pairs, e.g. with commands like
Rails.cache.write key, cached_data
Rails.cache.fetch key
Assume, for the sake of argument, that I have an infinite loop that just generates random UUIDs as keys and stores random data. What happens when the cache fills up? Do older items just get bumped off, or is there some specific algorithm behind the scenes that handles this eventuality?
I have read elsewhere "Cache Invalidation is a Hard Problem".
Just trying to understand how it actually works.
Maybe some simple code examples that illustrate the best way to create and destroy cached data? Do you have to explicitly define when entries should expire?
Memcached handles this behind the scenes: when the cache is full, it evicts the least recently used items to make room for new ones. Check out this question -
Memcache and expired items
You can define expiration parameters, check out this wiki page -
http://code.google.com/p/memcached/wiki/NewProgramming#Cache_Invalidation
For cache invalidation specific to your application logic (and not just exhaustion of memory behind the scenes), the delete function will simply remove the data. As for when to delete cached data in your app, that's harder to say - hence the quote you referenced about cache invalidation being hard. I might suggest you start by looking at ActiveRecord callbacks like after_commit (http://api.rubyonrails.org/classes/ActiveRecord/Callbacks.html) to let you easily invalidate cached data whenever your database changes.
But this is only a suggestion, there are many different cache invalidation schemes out there.
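As a rough sketch of how explicit expiry and after_commit invalidation could fit together (the Product model and the cache key format are only illustrative):
class Product < ActiveRecord::Base
  after_commit :flush_cache

  def self.cached_find(id)
    # Explicit expiry: the entry disappears after 12 hours even if never invalidated.
    Rails.cache.fetch("products/#{id}", expires_in: 12.hours) { find(id) }
  end

  private

  def flush_cache
    # Invalidate whenever the record is created, updated, or destroyed.
    Rails.cache.delete("products/#{id}")
  end
end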

Safe and performant way to store a site-wide variable

I am building a Rails app that has a site-wide counter variable that is used (read and written) by different pages; basically, many pages can cause the counter to increment. What is a thread-safe way to store this variable so that
1) it is thread-safe - many concurrent users might read and write this variable
2) it performs well - I originally thought about persisting this variable in the DB, but I wonder whether there is a better way, given that there can be a high volume of requests and I don't want this DB query to become the bottleneck of my app...
Suggestions?
It depends on whether it has to be perfectly accurate. Assuming not, you can store it in memcached and sync it to the database occasionally. If memcached crashes, the entry expires (shouldn't happen if configured properly), you have to shut down, etc., just reload it from the database on startup.
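A rough sketch of that approach, assuming Rails.cache is backed by memcached (the "site_counter" key and the SiteStat model are illustrative):
# Seed the raw value once so memcached's atomic incr can operate on it.
Rails.cache.write("site_counter", 0, raw: true) unless Rails.cache.exist?("site_counter")

# Wherever the counter is bumped (incr is atomic, so concurrent requests won't race):
Rails.cache.increment("site_counter")

# Occasionally (e.g. from a background job) sync the value back to the DB:
count = Rails.cache.read("site_counter", raw: true).to_i
SiteStat.first.update(counter: count)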
You could also look at membase. I haven't tried it, but to simplify, it's a distributed memcached server that automatically persists to disk.
For better performance and accuracy, you could look at a sharded approach.
Well you need persistence, so you have to store it in the Database/Session/File, AFAIK.

Sharing a large array with all users of a Rails app

I have inherited an app that generates a large array for every user who visits the app. I recently discovered that it is identical for nearly all users!
Now I want to somehow make one copy of it so it is not built over and over again. I have thought of a few options and wanted input to see which one is the best:
1) Create a model and shove the data into the database
2) Create a YAML file and have the app load it when it initializes.
I personally like the model idea, but a few engineers at work feel it does not deserve to be a full model. 97% of the time users will see exactly the same thing, but 3% of the time users will get a slightly different array (a few elements will have changed).
Any other approaches I should consider? Thanks in advance.
Remember that if you store the data in the DB, each request which requires the data will have to execute a DB query to pull it out. If you are running multiple server threads, each thread could have its own copy in memory (if they are all handling requests which require the use of the array). In that case, you wouldn't be saving any memory (though you might save time from not having to regenerate the array).
If you are running multiple server processes (not threads), and if the array contents change as the application is running, and the changes have to be visible to all the processes, caching in memory won't work. You will have to use the DB in that case.
From the information in your comment, I suggest you try something like this:
Store the array in your DB, and make sure that the record(s) used have created/updated timestamps. Cache the contents in memory using a constant/global variable/class variable. Also store the last time the cache was updated.
Every time you need to use the array, retrieve the relevant "updated" timestamp from the DB. (You may need to use hand-coded SQL and ModelName.connection.execute to avoid pulling back all the data in the record, which ActiveRecord would otherwise do.) If the timestamp is later than the last time your cache was updated, pull the array from the DB and update your cache.
Use a Mutex (require 'thread') when retrieving/updating the cached data, in case your server setup uses multiple threads. (I don't think that Passenger does, but I have had problems similar to threading problems when using Passenger+RMagick, so I would still use a Mutex to be safe.)
Wrap all the code which deals with the cached array in a library class (or a class method on the model used to store the data), so the details of cache management don't spill over into the rest of the application.
Do a little bit of performance testing on the cache setup using Benchmark.measure {}. If a bug in the setup actually made performance worse rather than better, that would be sad...
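A rough sketch of that setup, assuming the array lives in a hypothetical SharedArray model with Rails' standard updated_at column and a payload column holding the serialized array:
require 'thread'

class SharedArrayCache
  @mutex = Mutex.new
  @data = nil
  @cached_at = nil

  def self.fetch
    @mutex.synchronize do
      # Only pull the timestamp, not the whole (large) record.
      latest = SharedArray.maximum(:updated_at)
      if @data.nil? || @cached_at.nil? || latest > @cached_at
        # Reload the array from the DB only when it has actually changed.
        @data = SharedArray.first.payload
        @cached_at = latest
      end
      @data
    end
  end
end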
I'd go with option 2. You can add two constants (for the 97% and 3%) that load from a YAML file when the app initializes. That ought to shrink your memory footprint considerably.
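A sketch of that as a Rails initializer (the file paths and constant names are only illustrative):
# config/initializers/shared_array.rb
SHARED_ARRAY_COMMON = YAML.load_file(Rails.root.join("config", "shared_array.yml"))
SHARED_ARRAY_EDGE_CASES = YAML.load_file(Rails.root.join("config", "shared_array_edge_cases.yml"))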
Having said that, yikes, this is just a band-aid on a hack, but you knew that already. I'd consider putting some time into a redesign, if you have that luxury.
