What is the redis equivalent of storing an array? - ruby-on-rails

I'd like to use Redis and not my session for this for obvious reasons.
Old code:
session[:some_stuff] = @my_objects.map(&:id)
Then later:
session[:some_stuff].each{|obj| ...
Alternatively,
I would like to store this array of ids in Redis and then retrieve them later. I can't find anything relevant on other web resources. Any ideas?

You haven't written about how you have your Redis connection/adapter set up but it's basically SADD for adding elements to a Redis set and SMEMBERS to retrieve all the elements.
http://redis.io/commands#set
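For example, with a plain redis-rb connection (the connection setup and the key name below are just assumptions for illustration), that pattern looks roughly like this:
require 'redis'

redis = Redis.new # adjust url/host for your environment

# store the ids in a Redis set
@my_objects.map(&:id).each { |id| redis.sadd('some_stuff', id) }

# later: read them back (note that Redis returns the members as strings)
redis.smembers('some_stuff').each do |id|
  # e.g. MyObject.find(id)
end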

I tried to use the redis-store gem thinking that would solve a few problems, but it turns out it doesn't work. Even the supposedly stable 1.0.0 version.
So this is what I did, and it worked out extraordinarily well:
def first_method
  $redis = Redis.new
  @customers.map(&:id).each { |c| $redis.sadd('export', c) }
end

def other_method
  @customers = []
  $redis.smembers('export').each { |c| @customers << Customer.find(c) }
end
Notes:
You only need to assign $redis once, in one method. As a global variable it then lives outside of your MVC classes and is available everywhere else in the app.
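For instance, a common place to make that single assignment is a Rails initializer (a sketch; the file path is just a convention, not part of the answer above):
# config/initializers/redis.rb
$redis = Redis.new # or Redis.new(url: ENV['REDIS_URL'])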

Related

rails: how to get all key-values from Rails.cache

I want to maintain a user online/offline list with Rails.cache (memory_store).
Basically, if a request /user/heartbeat?name=John reaches the Rails server, it will simply:
def update_status
  name = params.require(:name)
  Rails.cache.write(name, Time.now.utc.iso8601, expires_in: 6.seconds)
end
But how could I get all the data stored in Rails.cache similarly as following?
def get_status
  # wrong code, as Rails.cache.read doesn't have an :all option
  ary = Rails.cache.read(:all)
  # deal with array ...
end
I googled for a while, and it seems Rails.cache doesn't provide a method to get all the data directly. Or is there a better way to store the data?
I'm using Rails 5.0.2.
Thanks for your time!
If you are using Redis you can use something like Rails.cache.redis.keys
You can get all the keys with:
keys = Rails.cache.instance_variable_get(:@data).keys
In addition, you can iterate over the keys to get their values and display them all:
keys = Rails.cache.instance_variable_get(:@data).keys
keys.each{|key| puts "key: #{key}, value: #{Rails.cache.fetch(key)}"}
or map them all directly as such:
key_values = Rails.cache.instance_variable_get(:@data).keys.map{|key| {key: key, value: Rails.cache.fetch(key)}}
In addition, I would check the number of keys beforehand to make sure I don't end up with a gigantic object (and if so, restrict the generated key_value array to, for instance, the first 1000 items).
You could use:
Rails.cache.instance_variable_get(:@data).keys.count
or just look at the last line of the stats command:
Rails.cache.stats
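Putting the above together for the heartbeat example, a rough sketch of get_status against the memory store might look like this (instance_variable_get pokes at a private internal of MemoryStore, so treat it as a debugging aid rather than a stable API; the JSON rendering is just an illustration):
def get_status
  keys = Rails.cache.instance_variable_get(:@data).keys
  # read each key back; expired heartbeats come back as nil and are dropped
  online = keys.map { |name| [name, Rails.cache.read(name)] }
               .reject { |_, stamp| stamp.nil? }
  render json: online.to_h
end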
None of the answers seem to work if you're using Redis. Instead, you will have to use redis-rb:
redis = Redis.new(url: ENV["REDIS_URL"])
redis.keys("*")
The above answers helped me. The existing cache values can be seen by looping:
Rails.cache.redis.keys.each{ |k| puts k if Rails.cache.exist?(k)}
If you have connection pooling, try:
Rails.cache.redis.with do |conn|
  conn.keys
end
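One caveat worth adding: KEYS scans the entire keyspace in one blocking call, so on a large cache redis-rb's scan_each is the safer option (a sketch, assuming the same Rails.cache.redis setup as the answers above):
Rails.cache.redis.with do |conn|
  conn.scan_each(match: "*") { |key| puts key }
end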

Rails: How to handle Thread.current data under a single-threaded server like Thin/Unicorn?

As Thin/Unicorn are single threaded, how do you handle Thread.current/per-request storage?
Just ran a simple test - set a key in one session, read it from another -- looks like it writes/reads from the same place all the time. Doesn't happen on WEBrick though.
class TestController < ApplicationController
  def get
    render text: Thread.current[:xxx].inspect
  end

  def set
    Thread.current[:xxx] = 1
    render text: "SET to #{Thread.current[:xxx]}"
  end
end
Tried adding config.threadsafe! to application.rb, no change.
What's the right way to store per-request data?
How come there are gems (including Rails itself, and tilt) that use Thread.current for storage? How do they overcome this problem?
Could it be that Thread.current is safe per request, but just doesn't clear after request and I need to do that myself?
Tested with Rails 3.2.9
Update
To sum up the discussion below with @skalee and @JesseWolgamott, and my findings:
The behavior of Thread.current depends on the server the app is running on. Even if the server makes sure no two requests run at the same time on the same Thread.current, the values in this hash might not get cleared between requests, so if you use it, you must set an initial value to override whatever the previous request left behind.
There are some well known gems that use Thread.current, like Rails, tilt and draper. I guess that if it was forbidden or unsafe they wouldn't use it. It also seems like they all set a value before using any key on the hash (and even set it back to the original value after the request has ended).
But overall, Thread.current is not the best practice for per-request storage. For most cases, better design will do, but for some cases, use of env can help. It is available in controllers, but also in middleware, and can be injected into any place in the app.
Update 2 - it seems that, as of now, draper uses Thread.current incorrectly. See https://github.com/drapergem/draper/issues/390
Update 3 - that draper bug was fixed.
You generally want to store stuff in session. And if you want something really short-lived, see Rails' flash. It's cleared on each request. Any method which relies on threads will not work consistently across different webservers.
Another option would be to modify env hash:
env['some_number'] = 5
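For example, a value stashed in env by a piece of middleware can be read back in the controller through request.env (a minimal sketch; the key name is just an illustration):
# in some Rack middleware's call(env)
env['some_number'] = 5

# later, in any controller action handling the same request
def show
  number = request.env['some_number'] # => 5
end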
BTW, Unicorn is not simply single-threaded, it's forking. A new process is spawned for each request (despite sounding scary, this is pretty efficient on Linux). So if you set anything in Unicorn, even in a global variable, it won't persist to another request.
While people still caution against using Thread.current to store "thread global" data, the possibly correct approach to do it in Rails is by clearing up the Thread.current object using Rack middleware. Steve Klabnik has written the request_store gem to do this easily. The source code of the gem is really, really small and I'd recommend reading it.
The interesting parts are reproduced below.
module RequestStore
  def self.store
    Thread.current[:request_store] ||= {}
  end

  def self.clear!
    Thread.current[:request_store] = {}
  end
end

module RequestStore
  class Middleware
    def initialize(app)
      @app = app
    end

    def call(env)
      RequestStore.clear!
      @app.call(env)
    end
  end
end
Please note that clearing up the entire Thread.current is not a good practice. What request_store basically does is keep track of the keys your app stashes into Thread.current and clear them once the request is completed.
One of the caveats of using Thread.current is that, for servers that reuse threads or have thread pools, it becomes very important to clean up after each request.
That's exactly what the request_store gem provides: a simple API akin to Thread.current which takes care of cleaning up the stored data after each request.
RequestStore[:items] = []
Be aware though, the gem uses Thread.current to save the Store, so it won't work properly in a multi-threaded environment where you have more than one thread per request.
To circumvent this problem, I have implemented a store that can be shared between threads for the same request. It's called request_store_rails, and the usage is very similar:
RequestLocals[:items] = []

Using and Editing Class Variables in Ruby?

So I've done a couple of days worth of research on the matter, and the general consensus is that there isn't one. So I was hoping for an answer more specific to my situation...
I'm using Rails to import a file into a database. Everything is working regarding the import, but I'm wanting to give the database itself an attribute, not just every entry. I'm creating a hash of the file, and I figured it'd be easiest to just assign it to the database (or the class).
I've created a class called Issue (and thus an 'issues' database) with each entry having a couple of attributes. I was wanting to figure out a way to add a class variable (at least, that's what I think is the best option) to Issue to simply store the hash. I've written a rake task to import the file only if the new file is different from the previously imported file (read: if the hashes are different).
desc "Parses a CSV file into the 'issues' database"
task :issues, [:file] => :environment do |t, args|
md5 = Digest::MD5.hexdigest(args[:file])
puts "1: Issue.md5 = #{Issue.md5}"
if md5 != Issue.md5
Issue.destroy_all()
#import new csv file
CSV.foreach(args[:file]) do |row|
issue = {
#various attributes to be columns...
}
Issue.create(issue)
end #end foreach loop
Issue.md5 = md5
puts "2: Issue.md5 = #{Issue.md5}"
end #end if statement
end #end task
And my model is as follows:
class Issue < ActiveRecord::Base
  attr_accessible :md5

  @@md5 = 5

  def self.md5
    @@md5
  end

  def self.md5=(newmd5)
    @@md5 = newmd5
  end

  attr_accessible # various database-entry attributes
end
I've tried various different ways to write my model, but it all comes down to this. Whatever I set @@md5 to in my model becomes a permanent change, almost like a constant. If I change this value here and refresh my database, the change is noted immediately. If I go into rails console and do:
Issue.md5 # => 5
Issue.md5 = 123 # => 123
Issue.md5 # => 123
But this change isn't committed to anything. As soon as I exit the console, things return to "5" again. It's almost like I need a .save method for my class.
Also, in the rake file, you see I have two print statements, printing out Issue.md5 before and after the parse. The first prints out "5" and the second prints out the new, correct hash. So Ruby is recognizing the fact that I'm changing this variable, it's just never saved anywhere.
Ruby 1.9.3, Rails 3.2.6, SQLite3 3.6.20.
tl;dr I need a way to create a class variable, and be able to access it, modify it, and re-store it.
Fixes please? Thanks!
There are a couple solutions here. Essentially, you need to persist that one variable: Postgres provides a key/value store in the database, which would be most ideal, but you're using SQLite so that isn't an option for you. Instead, you'll probably need to use either redis or memcached to persist this information outside your database.
Either one allows you to persist values into a schema-less datastore and query them again later. Redis has the advantage of being saved to disk, so if the server craps out on you, you can get the value of md5 again when it restarts. Data saved into memcached is never persisted, so if the memcached instance goes away, when it comes back md5 will be 5 once again.
Both redis and memcached enjoy a lot of support in the Ruby community. Installing one will complicate your stack slightly, but I think it's the best solution available to you. That said, if you just can't use either one, you could also write the value of md5 to a temporary file on your server and access it again later. The issue there is that the value then won't be shared among all your server processes.
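For the md5 example specifically, a Redis-backed version of the class methods might look roughly like this (the $redis global and the 'issues:md5' key name are assumptions for illustration, not part of the question's code):
class Issue < ActiveRecord::Base
  # persists across processes and restarts, unlike @@md5
  def self.md5
    $redis.get('issues:md5') # returns a string, or nil if never set
  end

  def self.md5=(new_md5)
    $redis.set('issues:md5', new_md5)
  end
end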

Working with a large data object between ruby processes

I have a Ruby hash that reaches approximately 10 megabytes if written to a file using Marshal.dump. After gzip compression it is approximately 500 kilobytes.
Iterating through and altering this hash is very fast in ruby (fractions of a millisecond). Even copying it is extremely fast.
The problem is that I need to share the data in this hash between Ruby on Rails processes. In order to do this using the Rails cache (file_store or memcached) I need to Marshal.dump the hash first; however, this incurs a 1000 millisecond delay when serializing it and a 400 millisecond delay when deserializing it.
Ideally I would want to be able to save and load this hash from each process in under 100 milliseconds.
One idea is to spawn a new Ruby process to hold this hash that provides an API to the other processes to modify or process the data within it, but I want to avoid doing this unless I'm certain that there are no other ways to share this object quickly.
Is there a way I can more directly share this hash between processes without needing to serialize or deserialize it?
Here is the code I'm using to generate a hash similar to the one I'm working with:
@a = []
0.upto(500) do |r|
  @a[r] = []
  0.upto(10_000) do |c|
    if rand(10) == 0
      @a[r][c] = 1 # 10% chance of being 1
    else
      @a[r][c] = 0
    end
  end
end
@c = Marshal.dump(@a) # 1000 milliseconds
Marshal.load(@c) # 400 milliseconds
Update:
Since my original question did not receive many responses, I'm assuming there's no solution as easy as I would have hoped.
Presently I'm considering two options:
Create a Sinatra application to store this hash with an API to modify/access it.
Create a C application to do the same as #1, but a lot faster.
The scope of my problem has increased such that the hash may be larger than my original example. So #2 may be necessary. But I have no idea where to start in terms of writing a C application that exposes an appropriate API.
A good walkthrough of how best to implement #1 or #2 may receive best answer credit.
Update 2
I ended up implementing this as a separate application written in Ruby 1.9 that has a DRb interface to communicate with application instances. I use the Daemons gem to spawn DRb instances when the web server starts up. On start up the DRb application loads in the necessary data from the database, and then it communicates with the client to return results and to stay up to date. It's running quite well in production now. Thanks for the help!
A Sinatra app will work, but the (un)serializing and the HTTP parsing could impact performance compared to a DRb service.
Here's an example, based on your example in the related question. I'm using a hash instead of an array so you can use user ids as indexes. This way there is no need to keep both a table of interests and a table of user ids on the server. Note that the interest table is "transposed" compared to your example, which is the way you want it anyway, so it can be updated in one call.
# server.rb
require 'drb'

class InterestServer < Hash
  include DRbUndumped # don't send the data over!

  def closest(cur_user_id)
    cur_interests = fetch(cur_user_id)
    selected_interests = cur_interests.each_index.select{|i| cur_interests[i]}
    scores = map do |user_id, interests|
      nb_match = selected_interests.count{|i| interests[i] }
      [nb_match, user_id]
    end
    scores.sort!
  end
end

DRb.start_service nil, InterestServer.new
puts DRb.uri

DRb.thread.join
# client.rb
uri = ARGV.shift
require 'drb'
DRb.start_service
interest_server = DRbObject.new nil, uri
USERS_COUNT = 10_000
INTERESTS_COUNT = 500
# Mock users
users = Array.new(USERS_COUNT) { {:id => rand(100000)+100000} }
# Initial send over user interests
users.each do |user|
  interest_server[user[:id]] = Array.new(INTERESTS_COUNT) { rand(10) == 0 }
end
# query at will
puts interest_server.closest(users.first[:id]).inspect
# update, say there's a new user:
new_user = {:id => 42}
users << new_user
# This guy is interested in everything!
interest_server[new_user[:id]] = Array.new(INTERESTS_COUNT) { true }
puts interest_server.closest(users.first[:id])[-2,2].inspect
# Will output our first user and this new user which both match perfectly
To run in terminal, start the server and give the output as the argument to the client:
$ ruby server.rb
druby://mal.lan:51630
$ ruby client.rb druby://mal.lan:51630
[[0, 100035], ...]
[[45, 42], [45, 178902]]
Maybe it's too obvious, but if you sacrifice a little access speed to the members of your hash, a traditional database will give you much more consistent access times to values. You could start there and then add caching to see if you could get enough speed from it. This will be a little simpler than using Sinatra or some other tool.
Be careful with memcache: it has an object size limit (1 MB by default).
One thing to try is to use MongoDB as your storage. It is pretty fast and you can map pretty much any data structure into it.
If it's sensible to wrap your monster hash in a method call, you might simply present it using DRb - start a small daemon that starts a DRb server with the hash as the front object - other processes can make queries of it using what amounts to RPC.
More to the point, is there another approach to your problem? Without knowing what you're trying to do, it's hard to say for sure - but maybe a trie, or a Bloom filter would work? Or even a nicely interfaced bitfield would probably save you a fair amount of space.
Have you considered upping the memcache max object size?
Versions greater than 1.4.2
memcached -I 11m #giving yourself an extra MB in space
or on previous versions changing the value of POWER_BLOCK in the slabs.c and recompiling.
What about storing the data in Memcache instead of storing the Hash in Memcache? Using your code above:
@a = []
0.upto(500) do |r|
  @a[r] = []
  0.upto(10_000) do |c|
    key = "#{r}:#{c}"
    if rand(10) == 0
      Cache.set(key, 1) # 10% chance of being 1
    else
      Cache.set(key, 0)
    end
  end
end
This will be speedy and you won't have to worry about serialization, and all of your systems will have access to it. I asked in a comment on the main post about accessing the data; you will have to get creative, but it should be easy to do.
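Reading a cell back is then a lookup by the same composite key (a sketch, assuming the Cache client used above exposes a matching get, as memcached clients typically do):
# fetch a single cell
value = Cache.get("42:137")

# or rebuild one row on demand
row = (0..10_000).map { |c| Cache.get("42:#{c}") }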

Rails per-request hash?

Is there a way to cache per-request data in Rails? For a given Rails/mongrel request I have the result of a semi-expensive operation that I'd like to access several times later in that request. Is there a hash where I can store and access such data?
It needs to be fairly global and accessible from views, controllers, and libs, like Rails.cache and I18n are.
I'm ok doing some monkey-patching if that's what it takes.
Memcached doesn't work because it'll be shared across requests, which I don't want.
A global variable similarly doesn't work because different requests would share the same data, which isn't what I want.
Instance variables don't work because I want to access the data from inside different classes.
There is also the request_store gem. From the documentation:
Add this line to your application's Gemfile:
gem 'request_store'
and use this code to store and retrieve data (confined to the request):
# Set
RequestStore.store[:foo] = 0
# Get
RequestStore.store[:foo]
Try PerRequestCache. I stole the design from the SQL Query Cache.
Configure it up in config/environment.rb with:
config.middleware.use PerRequestCache
then use it with:
PerRequestCache.fetch(:foo_cache){ some_expensive_foo }
One of the most popular options is to use the request_store gem, which allows you to access a global store from any part of your code. It uses Thread.current to store your data, and takes care of cleaning up the data after each request.
RequestStore[:items] = []
Be aware though, since it uses Thread.current, it won't work properly in a multi-threaded environment where you have more than one thread per request.
To circumvent this problem, I have implemented a store that can be shared between threads for the same request. It's called request_store_rails, it's thread-safe, and the usage is very similar:
RequestLocals[:items] = []
Have you considered flash? It uses Session but is automatically cleared.
Memoisation?
According to this railscast it's stored per request.
Global variables are evil. Work out how to cleanly pass the data you want to where you want to use it.
app/models/my_cacher.rb
class MyCacher
  def self.result
    @@result ||= begin
      # do expensive stuff
      # and cache in @@result
    end
  end
end
The ||= syntax basically means "do the following if @@result is nil" (i.e. not set to anything yet). Just make sure the last line in the begin/end block is returning the result.
Then in your views/models/whatever you would just reference the function when you need it:
MyCacher.result
This will cache the expensive action for the duration of a request.
