So, recently I moved all the session-related information in my app to Redis. Everything is running fine, and I am no longer facing the cookie-related issues (especially with IE).
While doing that, I read some blogs, and all of them defined a Redis connector as a global variable in the config, like:
$redis = Redis.new(:host => 'localhost', :port => 6379)
Now there are a few things that are bugging me:
Defining a global resource means that I have just a single connection to Redis. Will it create a bottleneck in my system when I have to serve multiple requests?
Also, when multiple requests arrive, will Rails queue up the Redis requests, since the connection is a global resource and may already be in use?
Redis supports multiple instances. Wouldn't creating multiple instances boost performance?
There is no standard connection pool included in the Redis gem. If we consider Rails as a single-threaded execution model, that doesn't sound too problematic.
It can become a problem in a multi-threaded environment (think of background jobs, for example), so connection pooling is a good idea in general.
You can implement it for Redis using the connection_pool gem.
Sidekiq also uses this gem for connecting to Redis; it can be seen here and here. The Sidekiq author is also the connection_pool author, https://github.com/mperham.
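A minimal sketch of that setup, assuming the redis and connection_pool gems (the constant name, pool size, and timeout here are arbitrary):

# config/initializers/redis.rb
require "connection_pool"
require "redis"

REDIS_POOL = ConnectionPool.new(size: 5, timeout: 5) do
  Redis.new(host: "localhost", port: 6379)
end

# Anywhere in the app, check a connection out just for the block:
REDIS_POOL.with do |conn|
  conn.set("greeting", "hello")
  conn.get("greeting") # => "hello"
end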
As to your questions:
Multiple requests still don't mean multi-threading, so this approach might work well until you start using threads;
Rails is not going to play the role of a connection pool for your database;
It will boost performance (and avoid certain errors) if used in a multi-threaded environment.
1) No, it's not a bottleneck; opening a new TCP connection to Redis for every query/request would actually hurt performance.
3) Yes, if you have more than one core/thread.
Simply measure the number of Redis connections to see that no new connection is instantiated for each Rails request. The connection is established on the Rails app server (Unicorn, Puma, Passenger, etc.) side during application load.
echo info | redis-cli | grep connected_clients
Try running this bash command before and while your application is running locally.
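If you prefer to check from a Rails console rather than the shell, roughly the same information is available through the redis-rb gem (this opens one extra connection just for the check):

require "redis"
redis = Redis.new(url: ENV["REDIS_URL"] || "redis://localhost:6379")
redis.info("clients")["connected_clients"] # => e.g. "5"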
Related
I'm working on adding pgBouncer to my existing Rails Heroku application. My Rails app uses Sidekiq, has one Heroku postgres DB, and one TimescaleDB. I'm a bit confused about all the layers of connection pooling between these different services. I have the following questions I'd love some help with:
Rails already provides connection pooling out of the box, right? If so, what's the benefit of adding pgBouncer? Will they conflict?
Sidekiq already provides connection pooling out of the box, right? If so, what's the benefit of adding pgBouncer? Will they conflict?
I'm trying to add pgBouncer via the Heroku Buildpack for pgBouncer. It seems that I can only add pgBouncer to the web dyno, but not to Sidekiq. Why is that? Shouldn't the worker and web dynos both be using pgBouncer?
The docs say that I can use pgBouncer on multiple databases. However, pgBouncer fails when trying to add my TimescaleDB database URL to pgBouncer. I've checked the script and everything looks accurate but I get the following error: ActiveRecord::ConnectionNotEstablished: connection to server at "127.0.0.1", port 6000 failed: ERROR: pgbouncer cannot connect to server. What's the best way to console into the pgBouncer instance and see in more detail what's breaking?
Thanks so much for your help.
They serve different purposes. The ActiveRecord connection pool manages and limits connections at the thread level within a single process, while pgBouncer allows you to manage connection pooling for several applications against the same DB or several DBs.
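To make that first point concrete, the ActiveRecord-level pool can be inspected from a Rails console; `connection_pool.stat` is available in Rails 5.1+, and the numbers below are only illustrative:

# Per-process, thread-level pool managed by ActiveRecord
# (sized by the `pool:` setting in config/database.yml):
ActiveRecord::Base.connection_pool.stat
# => {:size=>5, :connections=>1, :busy=>1, :dead=>0, :idle=>0, :waiting=>0, :checkout_timeout=>5}

# Checking a connection out explicitly for a block of work:
ActiveRecord::Base.connection_pool.with_connection do |conn|
  conn.execute("SELECT 1")
end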
Sidekiq manages its own internal pool, while pgBouncer works across multiple processes in parallel and supports three pooling modes: session, transaction, and statement.
This doc can probably help you understand this part.
I think you can try the admin console to check what's going on, but I'm not sure it will work for troubleshooting the way you expect.
Background: I have a Ruby/Rails + Nginx/Unicorn web app with connections to multiple Redis DBs (i.e. I am not using Redis.current and am instead using global variables for my different connections). I understand that I need to create a new connection in the after_fork block when a new Unicorn worker is created, as explained here and here.
My question is about the need for connection pooling. According to this SO thread, "In Unicorn each process establishes its own connection pool, so you if your db pool setting is 5 and you have 5 Unicorn workers then you can have up to 25 connections. However, since each unicorn worker can handle only one connection at a time, then unless your app uses threading internally each worker will only actually use one db connection... Having a pool size greater than 1 means each Unicorn worker has access to connections it can't use, but it won't actually open the connections, so that doesn't matter."
Since I am NOT using Sidekiq, do I even need to use connection pools for my Redis connections? Is there any benefit of a connection pool with a pool size of 1? Or should I simply use variables with single connections -- e.g. Redis.new(url: ENV["MY_CACHE"])?
The connection pool is only used when ActiveRecord talks to the SQL databases defined in your database.yml config file. It is not related to Redis at all, and the SO answer you cite is actually not relevant for Redis.
So, unless you want to use some custom connection pool solution for Redis, you don't have to deal with it at all, as there is no pool for Redis in Rails by default. I guess a custom pool might be suitable if you had multiple threads in your application, which is not your case.
Update: Does building a connection pool make sense in your scenario? I doubt it. A connection pool is a way to reuse open connections (typically among multiple threads / requests). But you say that you:
use Unicorn, whose workers are separate, independent processes, not threads,
open a stable connection (or two) during after_fork, which then stays open for as long as the Unicorn worker lives,
do not use threads anywhere in your application (I'd double-check this - it's not only Sidekiq; it might be any gem that tends to do things in the background).
In such a scenario, pooling connections to Redis makes no sense to me, as there seems to be no code that would benefit from reusing the connection - it is open all the time anyway.
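For reference, the after_fork setup described above needs no pool at all. A rough sketch of config/unicorn.rb (not the poster's exact code; the second URL variable is made up for illustration):

# config/unicorn.rb -- a sketch
after_fork do |server, worker|
  # Each forked worker opens its own plain connections; they stay open for
  # the lifetime of the worker, so there is nothing for a pool to reuse.
  $redis_cache    = Redis.new(url: ENV["MY_CACHE"])
  $redis_sessions = Redis.new(url: ENV["MY_SESSIONS"]) # hypothetical second DB
end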
I've looked around at other similar questions on SO but can't quite piece things together well enough. I have a Rails app (on Heroku) that uses Puma with both multiple processes and multiple threads. My app also uses Redis as a secondary data store (in addition to a SQL database), querying Redis directly (well, through the connection_pool gem). Here's my Puma config file:
workers Integer(ENV["WEB_CONCURRENCY"] || 4)
threads_count = Integer(ENV["MAX_THREADS"] || 5)
threads threads_count, threads_count
preload_app!
rackup DefaultRackup
port ENV["PORT"] || 3000
environment ENV["RACK_ENV"] || "development"
on_worker_boot do
  # Worker specific setup for Rails 4.1+
  ActiveRecord::Base.establish_connection

  redis_connections_per_process = Integer(ENV["REDIS_CONNS_PER_PROCESS"] || 5)
  $redis = ConnectionPool.new(size: redis_connections_per_process) do
    Redis.new(url: ENV["REDIS_URL"] || "redis://localhost:6379/0")
  end
end
My Redis instance has a connection limit of 20, and I find myself regularly going over this limit, despite having what should be (as far as I can tell) only 5 connections per process spread across 4 worker processes.
In fact, I even get max number of clients reached Redis errors when I set REDIS_CONNS_PER_PROCESS to 1. Is on_worker_boot called for each thread rather than each process?
I've also tried having a separate redis.rb initializer, which still gives me errors even when REDIS_CONNS_PER_PROCESS is 1. This seems odd, since if I'm doing my math correctly I should be able to go up to 4: (4 worker processes + 1 master process) * 4 connections per process = 20. (Note that for the purposes of this question I'm ignoring errors that occur around deploying, since I'm assuming Heroku might be connecting both old and new processes during that process, even though I'm not using Preboot.)
Where am I misunderstanding how this all fits together?
I had a similar problem. At first I was using Redis To Go, and there was no problem, but after I switched from Redis To Go to Heroku Redis, I got "ERR max number of clients reached" errors.
My app's code had not changed; switching Redis providers was the only change.
I opened a ticket with Heroku support, and they advised me to change the default timeout setting.
https://devcenter.heroku.com/articles/heroku-redis#configuring-your-instance
After I changed the default timeout value of Heroku Redis, everything was solved.
I guess the default Redis timeout value differs between Redis providers, and Heroku Redis's default setting is 0:
"A value of zero means that connections will not be closed."
I hope my experience is helpful.
After further reading and testing, I ended up moving my Redis connection pool code into a separate initializer. Unfortunately, this didn't solve my problem at all—despite lots of tinkering with process and connection numbers, I was still getting max number of clients reached errors way before I should have been.
The answer, it turns out, was to switch Redis providers from Heroku Redis to Redis Cloud. I'm not sure why Heroku Redis wasn't allowing the number of connections it advertises, but upon some investigation Redis Cloud actually appears to allow more connections than advertised (or at least limit connections transparently and without errors) without any issues whatsoever. Wow. They've certainly earned my business.
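For anyone hitting the same wall, a standalone initializer along these lines (not the poster's exact code; names and numbers are illustrative) keeps the sizing arithmetic in one place:

# config/initializers/redis.rb -- a sketch
# Rough budget: WEB_CONCURRENCY workers * pool size (plus the master process
# and any one-off dynos) must stay under the provider's connection limit.
pool_size = Integer(ENV["REDIS_CONNS_PER_PROCESS"] || 4)

$redis = ConnectionPool.new(size: pool_size, timeout: 5) do
  Redis.new(url: ENV["REDIS_URL"] || "redis://localhost:6379/0")
end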
I ran into this problem also, and while the Heroku Redis dashboard showed only a few connections, I was running out of connections.
Then I contacted Heroku support, and they told me that the dashboard only shows active clients/connections, and not the idle ones.
So because the Redis timeout is 0 (never time out), the old connections sit idle after a reboot while new ones are opened, and the situation gets worse with every reboot.
A solution, as mentioned by others on this page, is to set the timeout to something other than 0:
heroku redis:timeout -s 10 -a APPLICATION_NAME
This makes idle connections die after 10 seconds, which shouldn't be a problem, because a connection that is being used stays open (no unnecessary closes).
If you have only a little traffic, you might consider setting this to something a bit higher.
I have a Rails application that I want to connect to a Redis data structure server, and I'm wondering how I should proceed. I'm using a global variable $redis, located at config/initializers/redis.rb, to make queries across the entire application.
I believe this approach is not suitable for an application with 80+ simultaneous connections, because it uses one single global variable to handle the Redis connection.
What should I do to overcome this problem? Am I missing something about Rails internals?
Tutorial I'm following
http://jimneath.org/2011/03/24/using-redis-with-ruby-on-rails.html
This depends on the application server you will use. If you're using Unicorn, which is a popular choice, you should be fine.
Unicorn forks its workers, and each one establishes its own database connection. Since each worker can only handle one request at a time, it only needs one connection at a time. Adding more connections won't increase performance; it will just open more (useless) connections.
ActiveRecord (which is the DB part of Rails) and DataMapper support connection pooling, which is a common solution to the problem you've mentioned. Connection pooling, however, only makes sense in a threaded environment.
On top of that, Redis is mainly single-threaded (search for "Single threaded nature of Redis"), so there might be no advantage anyway. There was a request to add connection pooling, but it got closed; you might get more information from there.
We are currently planning a Rails 3.2.2 application that uses RabbitMQ. We would like to run several kinds of workers (and several instances of a worker) to process messages from different queues. The workers are written in Ruby and live in the lib directory of the Rails app.
Some of the workers need the Rails framework (Active Record, Active Model...) and some of them don't. The first worker should be called every minute to check whether updates are available. The other workers should process the messages from their queues when messages (which are sent by the first worker) are present, and do some (time-consuming) work with them.
So far, so good. My problem is that I have only a little experience with messaging systems like RabbitMQ and no experience with how Rails interacts with them. So I'm wondering what the best practices are for getting the two to play with each other. Here are my requirements again:
Rails 3.2.2 app
RabbitMQ
Several kinds of workers
Several instances of one worker
Control the amount of workers out of rails
Workers are doing time-consuming tasks, so they have to be async
Only a few workers need the Rails framework; the others are just Ruby files with some dependencies like Net or File
I was looking for a solution and came up with two possibilities:
Using amqp with EventMachine in a new thread
Of course, I don't want my Rails app to be blocked when a new worker is created. The worker should run in another thread and do its work asynchronously. Furthermore, it should not start a new instance of my Rails application; it should only require the things the worker needs.
But some articles say that there are issues with Passenger. Another thing I don't like is that we are using WEBrick for development and would have to include workarounds for that too. It would be possible to switch to another web server like Thin, but I don't have any experience with that either.
Using some kind of daemonization
Maybe it's possible to run the workers as daemons, but I don't know how much overhead this would involve, or how I could control the number of workers.
I hope someone can suggest a good solution for this (and I hope I made myself clear ;)
It seems to me that AMQP is overkill for your problem. Have you tried using Resque? The backing Redis database has some neat features (like publish/subscribe and blocking list pop) that make it very interesting as a message queue, and Resque is very easy to use in any Rails app.
The workers are daemonized, and you decide which worker of your pool listens to which queue, so you can scale each type of job as needed.
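A rough sketch of what such a worker might look like (the class name, queue name, and model call are made up for illustration):

# app/jobs/update_checker.rb -- a Resque-style job sketch
class UpdateChecker
  @queue = :updates

  def self.perform(resource_id)
    # time-consuming work happens here, outside the request cycle
    Resource.find(resource_id).check_for_updates # hypothetical model call
  end
end

# Enqueue from anywhere in the Rails app:
Resque.enqueue(UpdateChecker, 42)

# Run a daemonized worker bound to that queue:
#   QUEUE=updates rake resque:work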
Using the EM reactor inside a request/response cycle is not recommended, because it may conflict with an existing event loop (for instance if your app is served by Thin); in any case, you have to configure it specifically for your web server. On the other hand, it may be interesting to have an evented queue consumer if your jobs have blocking IO and are not processor-bound.
If you still want to do it with AMQP, see "Starting the event loop and connecting in Web applications" and configure it for your web server accordingly. Or use Bunny to push synchronously to the queue (with whichever job consumer you deem useful, Workling for instance).
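A minimal Bunny publishing sketch, assuming a local RabbitMQ with default credentials (the queue name and message are examples):

require "bunny"

connection = Bunny.new # amqp://guest:guest@localhost:5672 by default
connection.start

channel = connection.create_channel
queue   = channel.queue("updates", durable: true)

queue.publish("check for updates", persistent: true)

connection.close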
We are running a slightly different, but similar, technology stack.
Daemon-kit is used for the EventMachine side of the system... no Rails, but shared models (MongoMapper & MongoDB). EM pulls messages off the queues and does whatever logic is required (we have Ruleby in the mix, but if-then-else works too).
MuleSoft ESB is our outward-facing message receiver and sender that helps us deal with the HL7/MLLP world. But in v1 of the app, we used some Java code in ActiveMQ to manage HL7 messages.
The Rails app then just serves up stuff for the user to see, again using the shared models.