Devise + Heroku login sometimes fails, no error - ruby-on-rails

About half the time when I push to Heroku, Devise stops letting me log in for a seemingly random length of time. I can fix this either by waiting more than 10 minutes (the time varies) or sometimes by pushing again.
While this login issue is happening there is nothing in the logs to indicate anything is wrong and nothing in the flash when I get redirected back to the login form. I'm not sure what else to look for or what could be causing this. Because of the strange time limits I thought it might have something to do with the tmp folder being pushed but it's listed correctly in .gitignore.
What else should I check?

At the time you posted this, Heroku was having problems. I wonder if that was the cause?
See https://status.heroku.com/ and scroll down to Oct 22.
FWIW, you can add the free basic version of New Relic and set up a monitoring process that emails you when your app is having trouble.

Related

How could I find out why Rails app throws error for a single, specific URL on Heroku while it works fine locally?

I have a Rails app running on Heroku that serves as the API for a front-end application.
I noticed that for a specific, dynamic URL, /bands/:band_id/members it consistently throws net::ERR_CONNECTION_CLOSED errors which breaks the app.
That specific URL doesn't throw an error when I run the Rails app locally and other URLs work fine on Heroku so I suspect this is a Heroku error but I'm not sure.
I couldn't get deeper in analyzing the problem as the request doesn't even appear in the Heroku logs.
Set up error monitoring on Heroku. There are many add-ons listed under the "Errors and Exceptions" category here - https://elements.heroku.com/addons
E.g. you can try Airbrake or Bugsnag. Most likely the error is coming from your application. It's best practice to set up error monitoring, but even before that you can check your server logs to debug the issue - https://devcenter.heroku.com/articles/heroku-cli-commands#heroku-logs
Without more details I'm afraid I can only try to help you troubleshoot. Post as much code as you can. The route, the controller action, the view it's rendering, and any relevant logs from localhost and heroku would be a great start.
I've had Heroku requests time out on my Rails apps many times. In development there is often no time limit, but if your request is taking too long, that could definitely be the issue. How long does the request take in development? It could be as simple as shaving off a few seconds.
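To answer the "how long does it take" question with a number, here is a minimal sketch that times a block of work and warns when it gets close to a limit. It assumes the 30 second limit of Heroku's router; the `timed` helper, the `warn_ratio` parameter and the fake "members" workload are made up for illustration and are not from the original app.

```ruby
require "benchmark"

# Assumption: Heroku's router ends a request that has not responded
# within 30 seconds.
ROUTER_TIMEOUT_SECONDS = 30

# Hypothetical helper: runs the block, returns [result, elapsed_seconds],
# and warns on stderr once the elapsed time passes warn_ratio of the limit.
def timed(label, limit: ROUTER_TIMEOUT_SECONDS, warn_ratio: 0.5)
  result = nil
  elapsed = Benchmark.realtime { result = yield }
  if elapsed > limit * warn_ratio
    warn format("%s took %.2fs (limit %ds)", label, elapsed, limit)
  end
  [result, elapsed]
end

# Stand-in workload; in a real app this would be the slow part of the
# controller action behind /bands/:band_id/members.
members, seconds = timed("load members") { (1..3).map { |i| "member #{i}" } }
puts format("loaded %d members in %.4fs", members.size, seconds)
```

Running the same wrapper locally and on Heroku should show whether the request is anywhere near the limit.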
Otherwise I would say to check this out:
Heroku websocket connection
Also be sure to clear everything you can on your browser, try other browsers, incognito mode, all of that. Try to isolate the problem to one area - even though Heroku is throwing the error it is almost certainly not causing the error.
Check your routes. Look at everything that is happening with that request in your dev and prod logs and try and find something different about this request. Compare it to others.
It is also a good idea to understand your logs and increase their verbosity -
https://devcenter.heroku.com/articles/logging
What levels of logging are available for Heroku?
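As a rough illustration of raising log verbosity, the sketch below uses only Ruby's standard `Logger` and picks the level from an environment variable. The variable name `LOG_LEVEL` and the `build_logger` helper are assumptions for this example; in a Rails app the equivalent knob is `config.log_level`.

```ruby
require "logger"

# Hypothetical helper: builds a logger whose level comes from LOG_LEVEL
# (an assumed variable name), defaulting to "info".
def build_logger(io = $stdout, level_name = ENV.fetch("LOG_LEVEL", "info"))
  logger = Logger.new(io)
  # Logger::DEBUG, Logger::INFO, Logger::WARN, ... are stdlib constants.
  logger.level = Logger.const_get(level_name.upcase)
  logger
end

logger = build_logger($stdout, "debug")
logger.debug("only shown when the level is debug")
logger.info("shown at info and below")
```

With the level set to "warn" instead, both messages above would be dropped.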
Good luck!

Heroku app does not respond at all anymore

Eventually a request to access the app will timeout within the browser.
Nothing was changed in the app, yet it suddenly went down several hours ago.
The heroku logs show no attempt being made to request it.
I can restart it, and deploy to it, yet nothing changes.
I assumed maybe a service is causing this, but I stripped it of any 3rd party vendor calls and it still doesn't respond to me trying to access the site.
Does anyone have any insightful ideas on where to go from here? I'm sorry for how vague this question is..
Is your DNS down? Use the host command to check it, or visit the app at its *.herokuapp.com address.
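If you would rather script the check than run `host` by hand, a minimal sketch with Ruby's standard `Resolv` library could look like this. The helper names are hypothetical, and the resolver is passed in as a parameter only so the check can be exercised without network access.

```ruby
require "resolv"

# Returns the addresses a hostname resolves to, or an empty array when
# resolution fails. `resolver` defaults to the stdlib Resolv module.
def resolved_addresses(hostname, resolver: Resolv)
  resolver.getaddresses(hostname)
rescue Resolv::ResolvError
  []
end

# Builds a one-line, human-readable summary for a hostname.
def dns_report(hostname, resolver: Resolv)
  addresses = resolved_addresses(hostname, resolver: resolver)
  if addresses.empty?
    "#{hostname}: no DNS records found"
  else
    "#{hostname}: #{addresses.join(', ')}"
  end
end
```

Calling `puts dns_report(...)` once with your custom domain and once with the app's *.herokuapp.com hostname shows whether only the custom domain fails to resolve.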

MVC 4 app users sometimes get logged off when creating new item in production

I have an MVC 4 app and am using the default authentication provider. I'm not using persistent cookies.
I don't have any problems in development but when hosted at HostGator, I SOMETIMES get logged off when I try to create a new item (HTTP POST). When this happens, I end up at the log on page like I wasn't authenticated.
HostGator does NOT have the app on multiple web servers so I'm thinking I shouldn't have to worry about machinekey stuff. Am I wrong?
When this happens, I just log in again and create the item again and it will succeed. Once this happens, I can't recreate the issue. I try reopening the browser and even different browsers but creating items will always work. It only seems to happen again if I try much later.
Some additional info, the timeout is set to 2880 (the default for an MVC project), which I know is long but I can't see how it would be related. Still, thought I'd mention it.
So I can't look at IIS logs or event viewer to get any idea what could be happening but I can add more logging to the app. Can anyone provide ideas for what to check or what logging to add to diagnose?
Thanks
EDIT
I realized that I could get to the IIS logs so I compared the POST that succeeded and the one that failed and immediately noticed something.
When I first did the GET to load the Item/Create page/view, the cs-username was populated but when I did the POST to create the item, it was gone. I can see that when I logged in again and was able to successfully create the item, that POST did have the cs-username populated.
Why would it disappear between the GET and the POST? There was a 7 minute delay from the GET to the POST but I can see I logged on 1 minute before the GET so the session was only 8 minutes old when the post happened. I've double checked that I don't have sessionstate explicitly configured so the default should be 20 minutes. I feel like I'm onto something but not sure exactly what.
Might be worth adding Glimpse, although running that on deployed code is kinda risky. It would have the benefit, though, of letting you see what's actually happening on the server. I've never used HostGator, so I can't say for certain, but if they recycle app pools aggressively, that would invalidate your login and explain why the logoff seems to happen randomly.

Ways to troubleshoot a connection (works for some, doesn't work for others)

I've got a site that's currently in beta and thus password-protected (sorry, can't show it yet). Most of my users access the site with no problem and are able to interact with it, upload files, etc. There's one guy, however, who seems to have a persistent issue with access. Whenever he accesses the site, the connection times out and Heroku sends back an app-not-available response. Worse, that breaks access for everyone else at that time and I have to wait for the service to restart. Heroku logs show no sign of any issues. New Relic logs are also fine.
Do you have any suggestions on how I could troubleshoot this, or what tools I could use to monitor it?
I have also had issues like this with Heroku from time to time, and they have blamed it on EC2 when I contacted them. However, this has only happened to me twice and hasn't happened in months.
I tweeted #heroku and #salesforce with the problem when I got a snarky remark, and it got me in touch with someone who was actually able to help me. Sometimes they can be quite standoffish :)

App session cookie not being created in Rails, sporadically

This issue occurs sporadically for very few users, and we haven't been able to replicate it. However, I now have a Chrome instance (Mac) that is reproducing the error (for some unknown reason), and I hope not to restart it until I have this nailed!
Rails application, using memcached for session store. While the bug manifests in the _app_session_id cookie not being created, our javascript-generated cookie test and app-generated language cookies are being created successfully. This means that InvalidAuthenticityToken errors are thrown for every form that is submitted by those afflicted - people can't log into the app.
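One way to narrow this down from captured responses is to check which cookies each response actually sets. The sketch below is only an illustration: the helper names are hypothetical and the header values are made-up samples, with `_app_session_id` taken from the description above.

```ruby
# Returns the cookie names found in a list of Set-Cookie header values.
def cookie_names(set_cookie_headers)
  set_cookie_headers.map do |header|
    # The name is everything before the first "=" of the first attribute.
    header.split(";").first.to_s.split("=", 2).first.to_s.strip
  end
end

# True when the session cookie is among the cookies being set.
def session_cookie_set?(set_cookie_headers, name = "_app_session_id")
  cookie_names(set_cookie_headers).include?(name)
end

# Made-up sample headers, standing in for values copied from a capture.
healthy = ["lang=en; path=/", "_app_session_id=abc123; path=/; HttpOnly"]
broken  = ["lang=en; path=/"]

puts session_cookie_set?(healthy)
puts session_cookie_set?(broken)
```

Comparing the result for an affected browser against a working one should show whether the server never sends the session cookie or the browser drops it after receiving it.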
The error occurs across all browsers - had reports for IE7 and Firefox (which most users use). Switching to another browser often fixes the issue (though not always), and standard cache-cookie-clear tactics do not.
So now that I have got Chrome open which is having the same issue - in development, staging and live environments (meaning http and https). All other browsers are fine.
I've restarted the servers and restarted memcached. I don't really want to restart Chrome, at the risk that the issue goes away with that (having said that, restarting hasn't worked for users).
I've been tcpdumping the requests - and although I'll keep digging, I'd love it if anyone had any suggestions, places to start looking, anything. This is really painful ;)
Thanks!
