Updating 'notifications' for a user with memcached - ruby-on-rails

I have an application with user 'Notifications' think SO or facebook or twitter. However, as notifications won't necessarily change on every page view I decided to save them in memcached.
def get_notification
if current_user
mc = Dalli::Client.new('localhost:11211')
require_dependency 'notification.rb'
#new_notification = mc.get(current_user.id.to_s+'new_notification')
if #new_notification == nil
#new_notification = Notification.getNew(current_user.id)
mc.set(current_user.id.to_s+'notification',#new_notification)
end
end
end
I overlooked the obvious flaw in this implementation. Once the notifications are loaded they would never be refreshed until the user logs out or the cache entry expires. One way to do this is to negate the user's cache entry when a event for a new notification occurs. This would force a new request to the db. Is there any other way to implement this?

Currently you are manually connecting to Memchaced, check if key exists, store content, expire it. But as you may notice this gets tedious & repetitive very soon.
However, Rails Provides you with few patterns that you can use to accomplish this but more easily.
First using Cache Stores option you can instruct rails to use Memchached
config.cache_store = :mem_cache_store, "example.com"
This cache store uses memcached server to provide a
centralized cache for your application. Rails uses the bundled dalli
gem by default. This is currently the most popular cache store for
production websites. It can be used to provide a single, shared cache
cluster with very a high performance and redundancy.
When initializing the cache, you need to specify the addresses for all
memcached servers in your cluster. If none is specified, it will
assume memcached is running on the local host on the default port, but
this is not an ideal set up for larger sites.
The write and fetch methods on this cache accept two additional
options that take advantage of features specific to memcached. You can
specify :raw to send a value directly to the server with no
serialization. The value must be a string or number. You can use
memcached direct operation like increment and decrement only on raw
values. You can also specify :unless_exist if you don't want memcached
to overwrite an existing entry.
Using rails Cache store instead of directly using Dalli allows you to use the following Nicer API
Rails.cache.read('key')
Rails.cache.write('key', value)
Rails.cache.fetch('key') { value }
Now, rails for actually caching. you can use Declarative Etags or Fragment Caching to cache the notifications. here is an example using Declarative Etags
def get_notification
if current_user
#new_notification = Notification.getNew(current_user.id)
end
refresh_when #new_notification
end
Now the way declarative E-tags works is Template is not rendered when request
sends a matching ETag & cache copy is sent. However, when #new_notification changes the E-tag value will change too. Thus causing the cache to expire. Now, Caching is a vast topic to cover & there are variously techniques to do it. so probally I won't give you a full answers but I would point to the following resources so you can learn more:
Caching with Rail
Rails 4: Zombie Outlaws Course
Rails Cache for dummies
Caching Strategies for Rails
Happy Caching ;-)

Related

Caching an HTTP request made from a Rails API (google-id-token)?

ok, first time making an API!
My assumption is that if data needs to be stored on the back end such that it persists across multiple API calls, it needs to be 1) in cache or 2) in a Database. is that right?
I was looking at the code for the gem "google-id-token". it seems to do just what i need for my google login application. My front end app will send the google tokens to the API with requests.
the gem appears to cache the public (PEM) certificates from Google (for an hour by default) and then uses them to validate the Google JWT you provide.
but when i look at the code (https://github.com/google/google-id-token/blob/master/lib/google-id-token.rb) it just seems to fetch the google certificates and put them into an instance variable.
am i right in thinking that the next time someone calls the API, it will have no memory of that stored data and just fetch it again?
i guess its a 2 part question:
if i put something in an #instance_variable in my API, will that data exist when the next API call comes in?
if not, is there any way that "google-id-token" is caching its data correctly? maybe HTTP requests are somehow cached on the backend and therefore the network request doesnt actually happen over and over? can i test this?
my impulse is to write "google-id-token" functionality in a way that caches the google certs using MemCachier. but since i dont know what I'm doing i thought i would ask.? Maybe the gem works fine as is, i dont know how to test it.
Not sure about google-id-token, but Rails instance variables are not available beyond single requests and views (and definitely not from one user's session to another).
You can low-level cache anything you want with Rails.cache.fetch this is put in a block, takes a key name, and an expiration. So it looks like this:
Rails.cache.fetch("google-id-token", expires_in: 24.hours) do
#instance_variable = something
end
If the cache exists and it is not past its expiration date/time, Rails grabs it from the cache; otherwise, it would make your API request.
It's important to note that low-level caching doesn't work with mem_store (the default for development) and so you need to implement a solution with redis or memcached or something like that for development, too. Also, make sure the file tmp/cache.txt exists. You can run rails dev:cache or just touch it to create it.
More on Rails caching

Refresh data with API every X minutes

Ruby on Rails 4.1.4
I made an interface to a Twitch gem, to fetch information of the current stream, mainly whether it is online or not, but also stuff like the current title and game being played.
Since the website has a lot of traffic, I can't make a request every time a user walks in, so instead I need to cache this information.
Cached information is stored as a class variable ##stream_data inside class: Twitcher.
I've made a rake task to update this using cronjobs, calling Twitcher.refresh_stream, but naturally that is not running within my active process (to which every visitor is connecting to) but instead a separate process. So the ##stream_data on the actual app is always empty.
Is there a way to run code, within my currently running rails app, every X minutes? Or a better approach, for that matter.
Thank you for your time!
This sounds like a good call for caching
Rails.cache.fetch("stream_data", expires_in: 5.minutes) do
fetch_new_data
end
If the data is in the cache and is not old then it will be returned without executing the block, if not the block is used to populate the cache.
The default cache store just keeps things in memory so doesn't fix your problem: you'll need to pick a cache store that is shared across your processes. Both redis and memcached (via the dalli gem) are popular choices.
Check out Whenever (basically a ruby interface to cron) to invoke something on a regular schedule.
I actually had a similar problem with using google analytics. Google analytics requires that you have an API key for each request. However the api key would expire every hour. If you requested a new api key for every google analytics request, it'd be very slow per request.
So what I did was make another class variable ##expires_at. Now in every method that made a request to google analytics, I would check ##expires_at.past?. If it was true, then I would refresh the api key and set ##expires_at = 45.minutes.from_now.
You can do something like this.
def method_that_needs_stream_data
renew_data if ##expires_at.past?
# use ##stream_data
end
def renew_data
# renew ##stream_data here
##expires_at = 5.minutes.from_now
end
Tell me how it goes.

How can I encrypt a cached value before storing in Rails cache (on Heroku)?

I run a live RoR (Rails 3.21.11) application on Heroku that contains some sensitive (personally-identifiable) information that we'd like to cache out (~80kb of JSON on a per-user basis).
Since we run on Heroku, we obviously trust Heroku with this data.
However, to use memcached, we need to use a Heroku addon, such as Memcachier.
The business problem: we are not willing to put this sensitive information on a third-party provider's infrastructure unless it is symmetrically encrypted on the way out.
Of course, I can do this:
value = encrypt_this(sensitive_value)
Rails.cache.write('key', value)
But we envisiage a future in which ActiveRecord objects, as well as good ol' JSON, will be stored -- so we need every bit of data going out to be automatically encrypted, and we don't want to have to write an encryption line into every bit of code that might want to use the cache.
Are there any gems/projects/tools to do this?
Although I haven't had a chance to use this yet the attr_encrypted library might get you some or all of the way there.

What exactly does the Memcached size limit represent with file system entitystore?

Good afternoon,
I have Memcached hooked up in my app on Heroku. The limit for the free managed plan is 5MB for Memcached and 25MB for Memcachier. Being new to pretty much everything, I was just hoping for clarification of exactly what this represents.
I have the DalliStore set up in my config file, and the typical options set up for Rack::Cache. My metastore is in Memcache and the entitiy store is set up on the filesystem.
Questions:
Does this mean that my 5/25MB limit is only being used by the meta information that I am storing about each cache fragment? This would mean that I'd be able to store a ton of information on just the free plans?
What exactly is the breakdown / story between Rack::Cache and Memcache (via Dalli store?) Do they serve different purposes? Are they doing the same thing? i.e. is the following code redundant
config.cache_store = :dalli_store
and
config.action_dispatch.rack_cache = {
:verbose => true,
:metastore => Dalli::Client.new,
:entitystore => 'file:tmp/cache/rack/body',
:allow_reload => false
}
I've been wrestling with similar issues.
First of all, let's get our terminology straight.
Memcached: a process running on a server somewhere (d is for daemon) that actually stores key-value pairs in memory.
cache_store: Rails cache storage (Rails.cache) with a single API which can be configured to use different Ruby software implementations. One setting for config.cache_store is :memory_store to use an in-memory implementation. Another setting is :dalli_store that specifies using the dalli gem, which under the hood makes a possibly-remote connection to your memcached server.
Dalli Store: I assume you mean the Rails cache backed by the dalli gem, which physically stores values in memcached.
Rack::Cache: Rack middleware that inserts headers (Expires, Cache-Control, ETag, etc.) into your HTTP responses, and also acts as a reverse proxy cache, handling requests before they hit the Rails stack when possible. See website.
In order for Rack::Cache to handle requests before they hit the Rails stack and the rest of your app, it must store responses + metadata somewhere. Where it stores things are configured by the config.action_dispatch.rack_cache = { ... } setting.
Notice, this is a different setting from config.cache_store = :dalli_store. They don't have to be related. I think this is where a lot of confusion comes from. In practice, though, we may want them both to use memcached, which means using the dalli implementation. However, they each have their own Dalli::Client instance. (Also, your session_store could be related, but doesn't have to be.)
The Heroku cedar stack has an ephemeral file system that cannot be shared among dynos. However, Heroku themselves recommend using tmp file storage with Rack::Cache for the entitystore only, with memcached used for the metastore.
As for what actually gets stored in the Rack::Cache metastore, these are the docs from rack-cache v1.2 Rack::Cache::MetaStore class:
The MetaStore is responsible for storing meta information about a
request/response pair keyed by the request's URL.
The meta store keeps a list of request/response pairs for each canonical
request URL. A request/response pair is a two element Array of the form:
[request, response]
The +request+ element is a Hash of Rack environment keys. Only protocol
keys (i.e., those that start with "HTTP_") are stored. The +response+
element is a Hash of cached HTTP response headers for the paired request.
So to answer your question, HTTP request headers and HTTP response headers are stored in the Rack::Cache metastore. On the other hand, the Rack::Cache entitystore stores entire response bodies (i.e. HTML).
Since you can't use page caching reliably on Heroku, this means you might be using action caching and fragment caching. Action and fragment caching use your Rails cache store (not rack cache). But if you've set them both to physically use the same memcached server, they'll both contribute to memory use. Action and partial caching store the actual HTML.
To get more insight of your actual usage, if you're using memcachier run the following command to open up an analytics dashboard in your browser.
heroku addons:open memcachier
See this question for more info about getting memcached stats.
Well, this is not so easy to answer.
First of all, you will likely not be storing anything on the Heroku filesystem, as it is not writeable. Therefore you should store everything in Memcache. So on the free plan you will store 5/25mb of data including both entities and metadata.
As the docs say (for the Cedar stack):
Each dyno gets its own ephemeral filesystem, with a fresh copy of the most recently deployed code. During its lifetime the dyno can use the filesystem as a temporary scratchpad, but no files it writes are visible to any other dyno (including other dynos in the application) and any files written will be discarded the moment the dyno is stopped or restarted.
Therefore, yes using the filesystem seems fairly viable, especially if you are using a single dyno.
Regarding the distinction between Rack::Cache and Memcache: Memcache is a server that stores key/value pairs with some additional nice properties in memory. The config.cache_store = :dalli_store configures Rails.cache which is an abstraction over the different caching mechanisms that one could use. It is general and you can use it for arbitrary key/value storage, action and fragment caching.
Rack::Cache on the other hand is a replacement for Varnish and allows you to cache whole requests.

How to use multiple caches in rails?

I have a rails application where I would like to use both memcached and the file store cache, for different purposes.
I want to use the file store cache to keep a large number of pages that don't change often (some not at all) - i.e. page caching - and use memcached for everything else (action and DB caching etc). The reason is that the pages stored on the file store cache are likely to require a large amount of storage, but individually most will be accessed infrequently.
Is this possible to do or will configuring memcached as the cache mean that it is also used for page caching?
As a secondary question, what is a safe way to remove pages from the file store cache in some form of cron job, as there does not seem to be an option to specify ttl for this cache. For example a UNIX find command would quickly find and remove all old pages or pages that haven't been accessed in a long time - is this safe to do given the app server might potentially try to serve one of those pages at the time (tho this is very unlikely)? If not then what is the best way to do this.
If you want to use the filesystem only for page caching and memcached for action and fragment caching, you're fine. Page caching always uses the filesystem. Just remember that page caching bypasses your Rails application, so you can't use it for pages that include content that changes from user to user or for pages that are access controlled with filters.
Regarding the removal of pages, on Unix, a file can be deleted, but it is not actually removed from disk until all open file handles are closed. If the app server has opened the file to serve a request, and the find command deletes it a split-second later, the app server doesn't suddenly get an error when it tries to read.
You could also consider having find delete files based on their last access time, instead of creation or modification, and using a sweeper in your Rails app to delete the cached page when its content is out of date.
A simpler approach may be to use a http cache upstream of your application as your page cache rather than two stores within rails. This way you can use http headers to control the cache behavior, including TTL's. These same limits will also apply to browser's local caches as a nice bonus.
Varnish is about as high performance as it gets, but would require setting up another moving piece in your hosting environment as a proxy. This may still be worthwhile depending on what you're doing.
A simpler approach might be Rack::Cache, which will be easy to set up provided you're using a rack enabled version of rails.

Resources