App session cookie not being created in Rails, sporadically - ruby-on-rails

This issue occurs sporadically for a very small number of users, and until now we haven't been able to replicate it. I now have a Chrome instance (Mac) that is reproducing the error (for some unknown reason), and I hope not to restart it until I have this nailed!
It's a Rails application using memcached as the session store. The bug manifests as the _app_session_id cookie not being created, even though our JavaScript-generated test cookie and app-generated language cookies are created successfully. As a result, InvalidAuthenticityToken errors are thrown for every form submitted by those affected, so people can't log into the app.
The error occurs across all browsers - had reports for IE7 and Firefox (which most users use). Switching to another browser often fixes the issue (though not always), and standard cache-cookie-clear tactics do not.
So now I have a Chrome instance open that has the same issue in the development, staging and live environments (meaning both http and https). All other browsers are fine.
I've restarted the servers and restarted memcached. I don't really want to restart Chrome, at the risk that the issue goes away with that (having said that, restarting hasn't worked for users).
I've been tcpdumping the requests - and although I'll keep digging, I'd love it if anyone had any suggestions, places to start looking, anything. This is really painful ;)
Thanks!
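One way to narrow this down without touching the affected browser is to log, on the server side, every request where the session cookie is neither sent by the browser nor set in the response. The following is only a minimal sketch, not a known fix: SessionCookieProbe is a hypothetical Rack middleware name, and it assumes the cookie is called _app_session_id as described above.

```ruby
require 'logger'

# Hypothetical diagnostic middleware (not part of Rails or Rack).
# Logs a warning when a request arrives without the session cookie AND
# the response does not set it either.
class SessionCookieProbe
  def initialize(app, cookie_name: '_app_session_id', logger: Logger.new($stderr))
    @app = app
    @cookie_name = cookie_name
    @logger = logger
  end

  def call(env)
    status, headers, body = @app.call(env)

    # Did the browser send the session cookie with this request?
    sent = env['HTTP_COOKIE'].to_s.include?("#{@cookie_name}=")
    # Is the application setting it in this response?
    set = set_cookie_values(headers).any? { |c| c.start_with?("#{@cookie_name}=") }

    unless sent || set
      @logger.warn(
        "no #{@cookie_name} cookie: #{env['REQUEST_METHOD']} #{env['PATH_INFO']} " \
        "ua=#{env['HTTP_USER_AGENT']}"
      )
    end

    [status, headers, body]
  end

  private

  # Set-Cookie may be a newline-joined String or an Array depending on
  # the Rack version, so both shapes are handled here.
  def set_cookie_values(headers)
    key = headers.keys.find { |k| k.to_s.casecmp?('set-cookie') }
    return [] unless key

    value = headers[key]
    value.is_a?(Array) ? value : value.to_s.split("\n")
  end
end
```

In a Rails app this could be added with config.middleware.use SessionCookieProbe. If the warning never fires for the afflicted browser, the cookie is being sent and the problem is more likely in how the browser stores or returns it.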

Related

Identity Server 4 with Chrome 76 gets stuck on authorize callback

At my work, we are finally upgrading our old Identity Server 3 to 4. We just got a very weird problem doing so. Everything works fine in all major browsers, but we also need to support some Electron clients. Here is where the weird part begins. All very old clients using Electron version 3 still work. All newer clients starting at Electron 9 also work. The only clients that don't work are the ones using Electron 6 (Chrome 76).
I already found this very helpful article written by Sebastian Gingter which helped to get the login working. But it only got me one step further. Now the client gets stuck at the connect/authorize/callback endpoint using the response_mode = form_post.
I already found some articles and Stack Overflow questions suggesting checking the redirect URIs and downgrading the CSP to version 1. The redirect URIs are configured correctly, since the other clients work. Changing the CSP does not help, since I don't even get that far. It seems that the response body is never even loaded by Electron/Chrome.
(Screenshot: DevTools timing)
The request never finishes. On the server-side, it does though. I debugged through the IS 4 code and the dynamic HTML is written to the response like with all the other clients. I even called CompleteAsync() on the response manually and it still did not finish.
I have researched and debugged for quite some time now and am out of ideas. Does anyone out there know this issue and, more importantly, know how to fix it?

How could I find out why Rails app throws error for a single, specific URL on Heroku while it works fine locally?

I have a Rails app running on Heroku that serves as the API for a front-end application.
I noticed that for a specific, dynamic URL, /bands/:band_id/members it consistently throws net::ERR_CONNECTION_CLOSED errors which breaks the app.
That specific URL doesn't throw an error when I run the Rails app locally and other URLs work fine on Heroku so I suspect this is a Heroku error but I'm not sure.
I couldn't get deeper in analyzing the problem as the request doesn't even appear in the Heroku logs.
Set up error monitoring on Heroku. There are many add-ons listed under the "Errors and Exceptions" category here - https://elements.heroku.com/addons
E.g. you can try Airbrake or Bugsnag. Most likely the error is coming from your application. It's best practice to set up error monitoring, but even before that you can check your server logs to debug the issue - https://devcenter.heroku.com/articles/heroku-cli-commands#heroku-logs
Without more details I'm afraid I can only try to help you troubleshoot. Post as much code as you can. The route, the controller action, the view it's rendering, and any relevant logs from localhost and heroku would be a great start.
I've had Heroku requests time out on my Rails apps many times - in development there is often no time limit, but if your request is taking too long that definitely could be the issue. How long does the request take in development? It could be as simple as shaving off a few seconds.
Otherwise I would say to check this out:
Heroku websocket connection
Also be sure to clear everything you can on your browser, try other browsers, incognito mode, all of that. Try to isolate the problem to one area - even though Heroku is throwing the error it is almost certainly not causing the error.
Check your routes. Look at everything that is happening with that request in your dev and prod logs and try and find something different about this request. Compare it to others.
It is also a good idea to understand your logs and increase their verbosity -
https://devcenter.heroku.com/articles/logging
What levels of logging are available for Heroku?
Good luck!
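To check the "how long does the request take" question above, a small timing middleware can report slow requests. This is only a sketch under assumptions: RequestTimer and its threshold_seconds parameter are hypothetical names, and the 5 second default is arbitrary (Heroku's router gives up on a request after 30 seconds).

```ruby
# Hypothetical Rack middleware that reports requests slower than a threshold.
# It measures the time until the app returns its response triple, so time
# spent streaming the body afterwards is not included.
class RequestTimer
  def initialize(app, threshold_seconds: 5.0, out: $stderr)
    @app = app
    @threshold = threshold_seconds
    @out = out
  end

  def call(env)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    response = @app.call(env)
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

    if elapsed >= @threshold
      @out.puts(format('slow request: %s %s took %.3fs',
                       env['REQUEST_METHOD'], env['PATH_INFO'], elapsed))
    end

    response
  end
end
```

Running this locally against /bands/:band_id/members would show whether the action is anywhere near the limit. If it is fast locally, the cause is more likely something else.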

Are there any situations (e.g. failures) when browser clears cookies on its own?

We have two sites with different subdomains. Sometimes our employees lose their cookies (they are just gone) on both domains at the same time so they get logged out.
I don't really see how our app can be responsible, because we have different server configurations (and for each site there are multiple servers, by the way). I guess only the nginx versions (1.10.3) are the same. Plus this does not explain why they get logged out on both sites at the same time.
If it helps, we use Rails (3/5) and unicorn (4.8.3/5.3.0); on the older app sessions are stored in Redis, and on the new one in cookies.
So I wonder whether there are browser (security) policies under which the browser clears cookies, perhaps on an SSL connection error, an IP change, or something similar.
I understand that this is not a definitive problem description, but it seems like magic to us at the moment, so I hope that someone has encountered something like this.
P.S. We asked one of our employees to use Firefox instead of Chrome (which all of them use), but it does not seem to make any difference (he wasn't logged out for a week, but then he was logged out about every 20 minutes).

Rails 4.2 InvalidAuthenticityToken error, but only in production

I'm making this little app that uses (more or less) the sessions management code from https://www.railstutorial.org/book.
It works fine in development mode, and when running in production mode on my development machine. When I deploy to a machine running nginx and Phusion Passenger, I start getting InvalidAuthenticityToken on every request where a token is used (forms and links with method: delete, for example).
I have verified that the token is generated and is sent along with the request.
I have noticed one thing. The main area of the app at / is open to all and does not require any kind of login. The area needing login is at /admin. When running in development, one session cookie is generated with a path of /. When deployed, there are two session cookies, one for / and one for /admin. I suspect that the CSRF token is generated using one session and then validated using the other session.
Does this sound plausible? How would I go about investigating this further and fixing it?
Thank you in advance.
It seems this is linked to bugs regarding cookies in Phusion Passenger 5 beta:
Session being emptied on POST requests in 5.0.0-beta2
Downgrading to 4.0.57 fixed the problems.
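To confirm the "two session cookies with different paths" observation from the question, the Set-Cookie values seen in the browser or in a packet capture can be grouped by cookie name. The helper below is an illustrative sketch only: cookies_with_multiple_paths is a hypothetical name, and the cookie name and values in the sample data are made up.

```ruby
# Groups Set-Cookie header values by cookie name and returns only the
# names that were issued with more than one distinct Path attribute.
def cookies_with_multiple_paths(set_cookie_values)
  paths_by_name = Hash.new { |hash, name| hash[name] = [] }

  set_cookie_values.each do |value|
    pair, *attributes = value.split(';').map(&:strip)
    name = pair.split('=', 2).first
    path_attr = attributes.find { |a| a.downcase.start_with?('path=') }
    # A cookie without an explicit Path falls back to a browser default.
    path = path_attr ? path_attr.split('=', 2).last : '(default)'
    paths_by_name[name] << path
  end

  paths_by_name.transform_values(&:uniq).select { |_name, paths| paths.size > 1 }
end

# Made-up sample data mirroring the situation described in the question.
observed = [
  '_myapp_session=aaa; path=/; HttpOnly',
  '_myapp_session=bbb; path=/admin; HttpOnly',
  'locale=en; path=/'
]
p cookies_with_multiple_paths(observed)
```

A non-empty result means the same cookie name exists under several paths, which would be consistent with the token being generated in one session and validated against another.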

Ways to troubleshoot a connection (works for some, doesn't work for others)

I've got a site that's currently in beta and thus password-protected (sorry, can't show it yet). Most of my users access the site with no problem and are able to interact with it, upload files, etc. There's one guy, however, who seems to have a persistent issue with access. Whenever he accesses the site, the connection times out and Heroku sends back an app-not-available response. Worse yet, that breaks access for everyone else at that time, and I have to wait for the service to restart. Heroku logs show no sign of any issues. New Relic logs are also fine.
Do you have any suggestions on how I could troubleshoot this, or what tools I could use to monitor it?
I have also had issues like this with Heroku from time to time, and they blamed it on EC2 when I contacted them. However, this has only happened to me twice and hasn't happened in months.
I tweeted #heroku and #salesforce with the problem when I got a snarky remark, and it got me in touch with someone who was actually able to help me. Sometimes they can be quite standoffish :)
