Why am I getting RuntimeError: Session collision on '...'

I've been getting quite a lot of session collision exceptions. Usually at least one per day, but sometimes I deploy and get 2-3 in a row and then nothing.
The app runs on Rails 3.2.2 and unicorn, and sessions are stored in memcached.
The exceptions happen in different places in different controllers and I'm not really able to find anything they have in common. What could be causing this?

I don't know how Ruby/Rails handles session data using memcached, but normally it works as follows:
new session -> using the command ADD
update session -> using GET with token and then the command CAS (check and set)
If there is a hash collision the command ADD fails because the session already exists.
Another possible issue is if another process updated the same session between GET and CAS.
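To make the two failure modes above concrete, here is a minimal sketch in plain Ruby. FakeMemcached is a hypothetical in-memory stand-in written only for this illustration; it is not the memcached client Rails uses, and it only mimics the ADD, GETS (get with token) and CAS semantics.

```ruby
# Tiny in-memory stand-in for memcached, only to illustrate ADD and CAS semantics.
class FakeMemcached
  Entry = Struct.new(:value, :cas_token)

  def initialize
    @data = {}
    @next_token = 0
  end

  # ADD stores the value only if the key does not exist yet.
  def add(key, value)
    return false if @data.key?(key)
    @data[key] = Entry.new(value, next_token)
    true
  end

  # GETS returns the value together with its current CAS token.
  def gets(key)
    entry = @data[key]
    entry && [entry.value, entry.cas_token]
  end

  # CAS stores the value only if the token still matches, i.e. nobody
  # has written the key since it was read.
  def cas(key, value, token)
    entry = @data[key]
    return false if entry.nil? || entry.cas_token != token
    @data[key] = Entry.new(value, next_token)
    true
  end

  private

  def next_token
    @next_token += 1
  end
end

store = FakeMemcached.new
first_add  = store.add("session:abc", { user_id: 1 })  # new session: succeeds
second_add = store.add("session:abc", { user_id: 2 })  # same id again: ADD fails

_, token_a = store.gets("session:abc")                 # process A reads
_, token_b = store.gets("session:abc")                 # process B reads
b_write = store.cas("session:abc", { user_id: 1, cart: [1] }, token_b) # succeeds
a_write = store.cas("session:abc", { user_id: 1, cart: [2] }, token_a) # stale token: fails
```

In this sketch the second ADD corresponds to the "session already exists" case, and the rejected CAS corresponds to two processes updating the same session between GET and CAS.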

Related

Rails ActiveRecord/Postgres single query timeout?

I have a logging query (a simple INSERT) that happens on every single request.
For this query only (the one that runs on every page load), I want to set a timeout of 500 ms so that if the database is locked, slow, or down, the site doesn't hang while it waits to connect or write.
Is there a way to specify a timeout on a per-query basis so that I can abort the LoggedRequest.create! if it's taking too long?
I don't want to set it in my config because I have many other queries that shouldn't have timeouts that low.
I'm using Postgres 11.7
I also don't know how I feel about setting a timeout for the entire session because I don't want that connection to be shared from the pool with other queries that can't have that timeout.
Rails 6 introduces event-based triggers for notifications, logging, etc., which come in very handy, provided you are using (or can afford to migrate to) Rails 6. Here's a useful post that demonstrates creating event-based triggers for notifications/logging: https://pramodbshinde.wordpress.com/2020/03/20/custom-events-tracking-with-activesupportnotifications-and-audited/
If, for some reason, you cannot use Rails 6, perhaps this article might help you find some answers: https://evilmartians.com/chronicles/the-silence-of-the-ruby-exceptions-a-rails-postgresql-database-transaction-thriller
If I were you, I would also consider using AJAX with a fire-and-forget API request to the server for logging or anything else that is not critical to the normal functioning of the application.
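Another option that may fit the per-query requirement is PostgreSQL's statement_timeout combined with SET LOCAL, which only lasts until the end of the current transaction. The sketch below is an assumption-laden illustration, not something from the answers above: with_statement_timeout is a hypothetical helper name, and it only relies on the connection object responding to transaction and execute, as ActiveRecord connections do.

```ruby
# Runs the block inside a transaction with a PostgreSQL statement_timeout.
# SET LOCAL lasts only until the end of the current transaction, so the
# pooled connection goes back to its previous setting afterwards.
def with_statement_timeout(connection, milliseconds)
  connection.transaction do
    connection.execute("SET LOCAL statement_timeout = #{Integer(milliseconds)}")
    yield
  end
end

# Hypothetical usage in a Rails app (LoggedRequest is the model from the
# question; the :path attribute is made up for this example):
#
#   begin
#     with_statement_timeout(LoggedRequest.connection, 500) do
#       LoggedRequest.create!(path: request.path)
#     end
#   rescue ActiveRecord::StatementInvalid
#     # the INSERT was cancelled after 500 ms; skip logging for this request
#   end
```

Note that statement_timeout limits how long a statement may run once it has been sent; it does not bound the time spent establishing a connection. If the helper is called inside an already open transaction, the setting stays in effect until that outer transaction ends.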

Rails 5 server hangs when receives multiple requests at once

My development Rails 5 server with Puma keeps freezing and hanging when sending multiple requests at one time from my separate frontend app to the Rails API. There is no error, it just hangs on the POST requests. When I try to kill the server with CTRL + C, nothing happens. I have to manually kill the port.
I've tried setting config.eager_load=true in development.rb. I've tried adding config.allow_concurrency in application.rb. I've Googled relentlessly to no avail. I am sending around 5 requests concurrently from frontend, so I believe this amount of requests is causing it, but I don't know for sure.
Has anyone else experienced this or have an idea of what needs to be done here? I can usually get all the requests coming back to the frontend successfully around 3-4 times, then the server just freezes.
It especially occurs after I change any one line of code in any file in the project while the server is running.
It's been nearly 2 years but I finally happened to stumble upon what had been causing my issue.
Basically it boiled down to a method in my code not being thread-safe. Since my current_user variable was only accessible from my controller, I had a before_action on my base controller to assign the current user to User.current so that I could access the current user globally via User.current, not just in my controllers.
So PLEASE make sure you're not dynamically updating classes like I did in your controllers. It is not thread-safe. I ended up following this thread-safe solution instead for my particular case: https://stackoverflow.com/a/2513456/7629239
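As an illustration of the thread-safe approach, here is a minimal sketch that keeps the current user in thread-local storage rather than in shared class-level state. It is an assumption about the general shape of such a solution, not a copy of the linked answer; the :current_user key and the string "users" are made up for the example.

```ruby
# Thread-local state: each request thread sees only its own value,
# unlike a class-level variable, which every thread shares.
class User
  def self.current
    Thread.current[:current_user]
  end

  def self.current=(user)
    Thread.current[:current_user] = user
  end
end

seen = {}
threads = %w[alice bob].map do |name|
  Thread.new do
    User.current = name
    sleep 0.05                 # give the other thread a chance to run
    seen[name] = User.current  # still this thread's own value
  end
end
threads.each(&:join)
```

Because servers such as Puma reuse threads across requests, the value should be reset at the end of each request (for example in an after_action or an ensure block), otherwise a later request on the same thread could see the previous user. Rails 5.2 and later also ship ActiveSupport::CurrentAttributes, which handles that reset automatically.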
What is your Puma configuration? How many threads and workers (Puma workers, not Rails workers)?
Ensure that your puma has enough threads, and that your db pool is large enough. Changing a line of code should not cause your server to get exhausted in resources. Are you using a watcher like watchman?

MVC app getting stuck on an error after server restart

The scenario is as follows. I start an instance of an MVC app to debug it. The app uses simple membership, and I log in during this run. Then I go back to VS, change something, and start the instance again. It doesn't happen very often, but sometimes at this moment membership starts acting oddly. As the app starts, an action that is behind the [Authorize] attribute (to be exact, the attribute is on the controller) is called. However, the action fails because WebSecurity.CurrentUserId equals -1 (the action in question just loads some user information based on WebSecurity.CurrentUserId).
If I clear cookies in browser, everything is fine, but I can't expect users to do the same when they encounter the problem.
My colleague explained to me that it's (probably) happening because my local IIS decided to restart and some of the session cookies became invalid. But if this can happen on a local instance of IIS, couldn't it also happen on the remote server?
Other important fact, the action that fails is called (more like redirected to) by a custom filter that we wrote. This filter is applied to all actions (but doesn't affect the one mentioned). Can this filter somehow make MVC ignore [Authorize] attribute?
I have a dirty workaround for this problem that should work (with this specific app), but I would prefer to prevent the problem from appearing in the first place.
I think this is related to this. Basically when the server gets reset authentication cookies die. They get recreated right away, except my app doesn't really have access to them till the page is reloaded (just like with logging in).
I partially solved the problem described above (a redirect is performed somewhere along the way), so the application no longer gets stuck. However, if someone was logged in when the server restarted and they try to perform a post after that, the post will not work and they will be redirected to a get action with the same name as the post action (our custom filter is to blame for that). Unfortunately, I cannot fix the filter, because I would need the user id for that, and at the point at which the filter is called, it's still -1.
I guess my question is not too well written and is rather localized (I should probably rewrite it or re-ask it), but the underlying problem is more general than it seems, so let me salvage all the useful information into this answer.
Question 1: There is nothing preventing IIS from having a hiccup on a remote server and restarting the app, so yes, this can (and does) happen on the remote server (the frequency will depend on the app itself and the IIS configuration). The problem of disappearing session data seems to be related to restarts of the app pool rather than of the app itself.
Question 2: The custom filter has little to do with the situation. As pointed out by Larry, in simple membership, authorization is largely unrelated to session data. If your session data is lost, the user does not stop being authorized; however, user data is stored in the session. Without the session you don't know who the user is. This information becomes available one action after the session data was lost. So losing session data can lead to a crash of the application or, as in my case (where a custom filter depends on user data), to even weirder results.
So if you encounter unexpected disappearance of user data in your app (such as WebSecurity.CurrentUserId becoming -1), it might be worth investigating if your app pool is getting restarted (and why). Setting memory limits for an app pool seems to increase the likelihood of those restarts.

Session timeout with multi-application session

(APEX 4.1.1.00.23)
I have two applications, A and B, that share the same session (because they use the same session cookie), and each has Maximum Session Idle Time set to the same value N. After establishing a session and visiting both applications, if I spend more than N seconds working in application A (doing lots of page loads, so not timing out) and then navigate to application B, it immediately times out and sends me to its login page.
I also tried calling APEX_UTIL.SET_SESSION_MAX_IDLE_SECONDS(N) in both applications, with p_scope defaulting to 'SESSION', noting that the API docs say:
This would be the most common use case when multiple Application
Express applications use a common authentication scheme and are
designed to operate as a suite in a common session.
However the same thing happens.
I want the timeout to apply to the session as a whole, not to each application independently. Is this not what the above is supposed to achieve, or am I doing something wrong?
I got the answer to this from Christian Neumueller on the Oracle APEX forum:
... it's no issue anymore in 4.2. Looking at the 4.1.1
code, it seems that the problem is how we stored the last access time.
While the APEX_UTIL call with SESSION scope would set the idle timeout
for both apps, we maintained a timer (FSP_LAST_REQUEST_TIME) for each
app. Working in TIMTEST1 only updated the timer for TIMTEST1, not for
TIMTEST2. After working with one app and switching back to the other
app, Apex sees the stale timer and decides that the session expired.
This is clearly a bug. The bad news is that a backport is not
feasible, because so much has changed in session state management.
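The behaviour described in the quote can be illustrated with a small simulation. This is a plain Ruby sketch of the idle check only, not APEX code; the times, MAX_IDLE and the expired? helper are made up for the example, and only the application names TIMTEST1 and TIMTEST2 come from the quote.

```ruby
# Simulates the idle check with one last-request timer per application
# (the 4.1.1 behaviour described above) versus one timer per session.
MAX_IDLE = 60 # seconds; stands in for N

def expired?(last_request_at, now)
  now - last_request_at > MAX_IDLE
end

per_app_timer = { "TIMTEST1" => 0, "TIMTEST2" => 0 } # both apps visited at t = 0
session_timer = 0

# Work only in TIMTEST1, one request every 30 s, for 120 s in total.
[30, 60, 90, 120].each do |t|
  per_app_timer["TIMTEST1"] = t
  session_timer = t
end

now = 130 # switch to TIMTEST2
per_app_result = expired?(per_app_timer["TIMTEST2"], now) # stale per-app timer
shared_result  = expired?(session_timer, now)             # session-wide timer
```

With per-application timers, TIMTEST2 looks idle for 130 seconds even though the session was active the whole time, which matches the immediate timeout in the question; a single session-wide timer would not expire.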

Rails - Invalid Authenticity Token After Deploy

We're using EngineYard Cloud to deploy our Ruby on Rails application. We are running Rails v2.3.3.
EngineYard Cloud deploys to AWS instances in a manner similar to Capistrano. After each deploy, we're running into Invalid Authenticity Token errors. Specifically, any user that has previously visited our application and then visits after the deploy and then tries to submit a form gets an invalid authenticity token error. This error persists until they reset their cookies for the site. After they reset their cookies, the site works as expected with no errors.
We are using ActiveRecord's session store and sessions are being saved to the database.
This is the error we are seeing:
ActionController::InvalidAuthenticityToken
/usr/lib/ruby/gems/1.8/gems/actionpack-2.3.3/lib/action_controller/request_forgery_protection.rb:79:in `verify_authenticity_token'
The session object is nil after the deploy, however, the session data still persists in the database and the session ID cookie still exists:
Session:
session id: nil
data: nil
We haven't been able to explain this one. Any thoughts on what could be the root cause?
Thanks for any suggestions!
EDIT: Just to update on this, we've been able to isolate an example of the error.
1) User loads form
2) Code is updated on server
3) User submits form
** Invalid Authenticity Token error occurs
It seems that when the environment changes, Rails is unable to handle this with the authenticity token.
We've tried several steps to resolve:
Resetting the session
Deleting the session cookie (both in JavaScript and Rails)
Wiping the session table in the database after deploying code
Nothing works. The only thing that works is having the user clear their cookies client-side.
(We've been Googling (even tried Binging!) for answers, but no dice. This seems to be a similar related issue: http://railsforum.com/viewtopic.php?id=21479)
Also: initially we thought this was isolated to our deployment to EngineYard, but we've also been able to reproduce it on our development server that we deploy to via Capistrano.
Any thoughts would be gratefully accepted.
Thanks!
ANSWER: After extensive work by EngineYard (they're awesome!) they were able to diagnose the issue. The root cause of this issue is a bug with mongrel clusters. Mongrel doesn't seem to see the first post request after being started. EngineYard did extensive work to diagnose this:
There doesn't appear to be anything in your code causing the issue and I have found people outside of our environment that have experienced the bug as well (http://www.thought-scope.com/2009/07/mongrelcluster-rails-23x-bad-post.html). I suppose a lot of people don't see it because the first request to a site generally isn't a post or they chalk it up to flukes.
[There is a potential workaround using CURL.] The curl work around would do a simple GET request to each of your mongrels on the server to prime them so to speak. You could do this with capistrano, but that won't work if you deploy via the dashboard. You can find a short section on deploy hooks we have built into the infrastructure here:
https://cloud-support.engineyard.com/faqs/overview/getting-started-with-engine-yard-cloud
Adding a simple run curl http://localhost:500x > /dev/null should work (where x is the port you have 5000-50005 on your current setup).
We have addressed the issue by switching our stack from Mongrel to Passenger, but apparently, a fix for Mongrel is in the works. Hopefully, this helps someone who sees this same strange issue.
The authenticity token is a hidden field on the form that rails checks when the form is submitted to ensure that the post data is coming from a live session.
It is there as a security measure to prevent malicious people from using a form submit on their own site to trigger, say, a delete action on someone's account.
You can turn it off on your whole app by adding this to config/environment.rb
config.action_controller.allow_forgery_protection = false
You can turn it off for a single controller using
skip_before_filter :verify_authenticity_token
or turn it on
protect_from_forgery :except => :index
Check out the ActionController::RequestForgeryProtection::ClassMethods docs for more details.
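To show why the check depends on the session, here is a simplified sketch of the idea in plain Ruby. It is not Rails' actual implementation: the two helper methods only borrow familiar names, the session and params are ordinary hashes, and the :_csrf_token key is used here as an assumption for illustration.

```ruby
require "securerandom"

# Simplified sketch: the token lives in the session and is echoed back
# by the form as a hidden field.
def form_authenticity_token(session)
  session[:_csrf_token] ||= SecureRandom.base64(32)
end

def verified_request?(session, params)
  expected = session[:_csrf_token]
  !expected.nil? && expected == params[:authenticity_token]
end

session = {}
token = form_authenticity_token(session)            # rendered into the form
params = { authenticity_token: token }

ok_before = verified_request?(session, params)      # same session: accepted

lost_session = {}                                   # session could not be loaded
ok_after = verified_request?(lost_session, params)  # nothing to compare against: rejected
```

If the server cannot load the session that issued the token (as with the nil session in the question), the submitted token has nothing to match, and the request is rejected even though the form itself is unchanged.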
It sounds like the secret key used for authentication is changing when you redeploy, invalidating all existing sessions.
Do you have the configuration parameter config.action_controller.session set anywhere, and if you do, is there anything which would cause it to change when you redeploy?
One of my apps has it configured in config/environment.rb, and a more recent one (generated with Rails 2.3) has it set in config/initializers/session_store.rb. The setting looks like:
config.action_controller.session = {
:secret => 'long-string-of-hex-digits'
}
If you don't have this configured for some reason, rake secret will generate a key for you, which can then be inserted into your configuration.
(If it is — and it's not being changed by your deployment processes — then I have no idea what's going on.)
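The effect of a changing secret can be sketched as follows. This is an illustration of HMAC-signed values in general, assuming a "value--digest" layout; the sign and verify helpers and the secrets are hypothetical and are not Rails' own code.

```ruby
require "openssl"

# Sign a value with a secret: the digest is appended after "--".
def sign(value, secret)
  digest = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), secret, value)
  "#{value}--#{digest}"
end

# Recompute the digest with the given secret; return the value only if
# it matches. (Production code should use a constant-time comparison.)
def verify(signed, secret)
  value, separator, digest = signed.rpartition("--")
  return nil if separator.empty?
  expected = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), secret, value)
  expected == digest ? value : nil
end

cookie = sign("session_id=abc123", "secret-before-deploy")

same_secret    = verify(cookie, "secret-before-deploy") # value is accepted
changed_secret = verify(cookie, "secret-after-deploy")  # nil: signature no longer matches
```

If a deploy regenerated the secret, every cookie signed before the deploy would fail verification in this way until the client discards it, which is consistent with the "works after clearing cookies" symptom.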
If only it were limited to Mongrel! I'm getting the exact same error on Passenger as well (user loads form, deploy, submit -> invalid authenticity token). It would be interesting to know how switching to Passenger solved the issue for you. Any further hints are highly welcome. I'll have a closer look as well...
Cheers!
I have encountered this same problem with Rails 2.3 and a Mongrel cluster where the session secret is definitely set in the session initializer. The problem occurred even after clearing the cookies on the client.
However the suggestion of doing a curl get request across all the mongrels after they restart appears to work - thank goodness someone figured this out because it appears to be pretty darned obscure.
The only added info I can supply is that we are using Apache mod_proxy_balancer along with HTTPS in front of our Mongrels; however, this problem was occurring before we turned on SSL. Is anyone seeing this with haproxy as the balancer instead of Apache?
This solved this issue for me :-) :-) :-)
https://rails.lighthouseapp.com/projects/8994-ruby-on-rails/tickets/4690-mongrel-doesnt-work-with-rails-238#ticket-4690-37 Posted by Mike Bethany
August 30th, 2010 # 06:43 PM.
I've never gone to any length to figure out the details, but for me, this is a client-side data rot issue. If I've been messing around with the way I store my sessions (and therefore my authorization details), I get this error from time to time. Clearing out the private browser data (cookies, authenticated sessions, the works) has always solved it for me.
Hope this helps.

Resources