Is it possible to receive a webhook to my app before Heroku Postgres goes read-only?

I have an application that handles some data in memory.
I'd like to close out the operations and persist the data into the DB so that a reboot wouldn't destroy it.
My app also opens some resources in various third parties, and I'd like to close them. After that the app can happily go down and wait until it reboots.
What I found is that Heroku has various webhooks for application deployment state changes and so on. But I couldn't find a way to trigger a webhook before the DB becomes read only.
I would like to have a webhook that tells me "in 5 minutes PostgreSQL will become read-only". After that the app can reboot whenever; for now that doesn't matter.
Also, I couldn't find any info on whether this is even possible, nor an email address for support.
Is there a way to do it? Is it even possible?
(I have an event-sourced app that saves event data into the DB but keeps its state in memory as it runs, so I don't want to continuously write all of my state to the DB.)

It sounds like there is some confusion about how dyno and database uptime work on Heroku.
Firstly, a database going into read-only mode is a very rare event, usually associated with a critical failure. Based on the behavior you're seeking and some of your comments, it seems you may be confusing database state changes with dyno state changes. Dynos (the servers for your application runtime) are restarted roughly once per 24 hours, and these servers are ephemeral, so their memory is blown away. The 'roughly' accounts for fuzzing, so that all of your dynos aren't restarted at the same time, which would cause availability issues.
I don't think you actually need a webhook here. Conveniently, shortly before a dyno is due to be cycled (taking your in-memory state with it), it will receive a SIGTERM and be given 30 seconds to clean up after itself. You can trap that SIGTERM and save your data to the database before the process exits.
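As a minimal sketch of what trapping that signal could look like, assuming a process that runs its own loop (as an event-sourced app holding state in memory might); do_work and persist_state! are hypothetical stand-ins for your own code:

# Keep the trap handler tiny: trap context can't take locks (e.g. ActiveRecord's),
# so just flip a flag and do the real work from normal code before exiting.
shutdown_requested = false
Signal.trap("TERM") { shutdown_requested = true }

until shutdown_requested
  do_work                 # hypothetical: one unit of the app's in-memory processing
end

persist_state!            # hypothetical: flush in-memory state to Postgres

Heroku sends SIGKILL if the process is still alive roughly 30 seconds after the SIGTERM, so the persistence step needs to fit comfortably inside that window.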

Related

My server gets overloaded even though I keep a limit on the requests I send it

I have a server on Heroku - 3 dynos, 2 processes each.
The server does 2 things:
It responds to requests from the browser (AJAX and some web pages), based on data stored in a PostgreSQL database
It exposes a REST API to update the data in the database. This API is called by another server. The rate of calls is limited: The other server only calls my server through a queue with a single worker, which makes sure the other server doesn't issue more than one request in parallel to my server (I verified that indeed it doesn't).
When I look at New Relic, I see the following graph, which suggests that even though I keep the other server at one parallel request at most, it still loads my server and creates peaks.
I'd expect that since the rate of calls from the other server is limited, my server would not get overloaded, since a request will only start after the previous request has ended (I'm guessing that maybe the database gets overloaded if it receives an update request, returns, but continues processing after that).
What can explain this behaviour?
Where else can I look to understand what's going on?
Is there a way to avoid this behaviour?
There are a whole lot of directions this investigation could go, but from your screenshot and some inferences, I have two guesses.
A long query—You'd see this graph if your other server or a browser occasionally hits a slow query. If it's just a long read query and your DB isn't hitting its limits, it should only affect the process running the query, but if the query is taking an exclusive lock, all dynos will have to wait on it. Since the spikes are so regular, first think of anything you have running on a schedule - if the cadence matches, you probably have your culprit. The next simple thing to do is run heroku pg:long-running-queries and heroku pg:seq-scans. The former shows queries that might need optimization, and the latter shows full table scans you can probably fix with a different query or a better index. You can find similar information in NewRelic's Database tab, which has time and throughput graphs you can try to match against your queueing spikes. Finally, look at NewRelic's Transactions tab.
There are various ways to sort - slowest average response time is probably going to help, but check out all the options and see if any transactions stand out.
Click on a suspicious transaction and look at the graph on the right. If you see spikes matching your queueing buildups, that could be it, but since it looks to be affecting your whole site, watch out for several transactions seeing correlated slowdowns.
Check out the transaction traces at the bottom. Something in there taking a long time to run is as close to a smoking gun as you'll get. This should correlate with pg:long-running-queries.
Look at the breakdown table between the graph and the transaction traces. Check for things that are taking a long time (e.g. a 2-second external request) or happening often (e.g. a partial that gets rendered 2,500 times per request). Those are places for caching or optimization.
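As a hedged illustration of the "better index" fix mentioned above (assuming the app uses ActiveRecord; the table and column names here are made up), a full table scan flagged by pg:seq-scans is usually solved with a migration like:

class AddIndexToEvents < ActiveRecord::Migration
  def change
    # index the column the slow query filters or joins on, so Postgres can
    # use an index scan instead of scanning the whole table
    add_index :events, :account_id
  end
end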
Garbage collection—This is less likely because Ruby GCs all the time and there's no reason it would show spikes on that regular cadence, but if there's a regular request that allocates a ton of objects, both building the objects and cleaning them up will take time. It would only affect one dyno at once, and it would be correlated with a long or highly repetitive query in your NewRelic investigation. You can see some stats about this in NewRelic's Ruby VM tab.
Take a look at your dyno and DB memory usage too. Both are printed to the Heroku logs, and if you add Librato, they'll build some automatic graphs that are quite helpful. If your dyno is swapping, performance will suffer and you should either upgrade to a bigger dyno or run fewer processes per dyno. Processes will typically accumulate memory as they run and never quite release as much as you'd like, so tune it so that right before a restart, your dyno is just under its available RAM. Similarly for the DB, if you're hitting swap there, query performance will suffer and you should upgrade.
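If it comes to running fewer processes per dyno, that's usually a one-line change in the web server config. A minimal sketch assuming Puma, where WEB_CONCURRENCY and RAILS_MAX_THREADS are just conventional names you'd set with heroku config:set, not anything Heroku mandates:

# config/puma.rb
workers Integer(ENV["WEB_CONCURRENCY"] || 2)          # processes per dyno; lower this to cut RAM usage
threads_count = Integer(ENV["RAILS_MAX_THREADS"] || 5)
threads threads_count, threads_count
preload_app!                                          # workers share copy-on-write memory with the master process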
Other things it could be, but probably isn't in this case:
Sleeping dynos—Heroku puts a dyno to sleep if it hasn't served a request in a while, but only if you have just 1 dyno running. You have 3, so this isn't it.
Web Server Concurrency—If at any given moment, there are more requests than available processes, requests will be queued. The obvious fix is to increase the available dynos/processes, which will put more load on your DB and potentially move the issue there. Since some regular request is visible every time, I'm guessing request volume is low and this also isn't your problem.
Heroku Instability—Sometimes, for no obvious reason, Heroku starts queueing requests more than it should and doesn't report any issues at status.heroku.com. Restarting the dynos typically fixes that temporarily while Heroku gets their head back on straight.

How does adding a 3rd party logging service affect how much I have to pay Heroku

This will sound very newbie, but I've just added a centralized logging service (Splunkstorm, free version) to my Rails app on Heroku and it completely changed my life. I don't know why I never thought of this before.
I can just read all the logs from the web interface without running heroku logs --tail, which spawns a new dyno every time I do it.
Which makes me curious: does adding this type of logging service affect how much I have to pay Heroku? I mean, it's sending out packets every time something happens.
Nope!
Bandwidth is included in the dyno pricing (including the one you get for free).
There is a soft limit at 2TB of bandwidth, but you're unlikely to come anywhere near that from logging.

Is there a way to run a command on all Heroku dynos?

I have N dynos for a Rails application, and I'd like to run a command on all of them. Is there a way to do it? Would running rails r "SomeRubyCode" execute it on all dynos?
I'm using a plugin which syncs with a 3rd party every M minutes. The problem is, sometimes the 3rd party service times out, and I'd like to run it again without having to wait for another M minutes to pass.
No. One-off commands (like heroku run bash) are run on a separate, one-off dyno. To accomplish this you would need to set up some kind of pub/sub or message queue that all dynos listen to. https://devcenter.heroku.com/articles/one-off-dynos
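A minimal sketch of that idea using Redis pub/sub (assuming the redis gem and a REDIS_URL config var; the "broadcast" channel name and Sync.run! are hypothetical). Each dyno process starts a listener thread, e.g. from a Rails initializer, and any process can then publish a message that all of them react to:

require "redis"

# listener, started once per process (e.g. in an initializer)
Thread.new do
  Redis.new(url: ENV["REDIS_URL"]).subscribe("broadcast") do |on|
    on.message do |_channel, payload|
      Sync.run! if payload == "resync"   # hypothetical: re-run the 3rd-party sync
    end
  end
end

# publisher, run from anywhere (a one-off dyno, console, etc.) to hit every dyno at once
Redis.new(url: ENV["REDIS_URL"]).publish("broadcast", "resync")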
(Asked to turn my comment into an answer... will take this opportunity to expound.)
I don't know the details of what your plugin needs to do to 'sync' with a 3rd-party service, but I'm going to proceed on the assumption that the plugin basically fetches some transient data which your web application then uses somehow.
Because the syncing or fetching process occasionally fails, and your web application relies on up-to-date data, you want the option of running the 'sync' process manually. Currently, the only way to do this is from the plugin itself, which means you need to run some code on all dynos, which, as others have pointed out, isn't currently possible.
What I've done in a previous, similar scenario (fetching analytics from an external service) is simple:
Provision and configure your Heroku app with Redis
Write a rake task that simply executes the code (that would otherwise be run by the plugin) to fetch the data, then writes that data into the cache
Where you would normally fetch the data in the app, first try to fetch it from the cache (on a cache miss, just run the same code again; it only means the data expired from the cache before it was refreshed)
I then went further and used the Heroku Scheduler add-on to execute said rake task every n minutes to keep the data fresh and always in cache (cache expiry was set to a little less than n minutes), reducing instances of perceivable lag while the data fetch occurs. I could've set the cache expiry to never, or to greater than n, but this wasn't mission-critical. (A minimal sketch of this setup appears at the end of this answer.)
This way, if I did want to ensure that the latest analytics were displayed, all I had to do was either a) connect to Redis and remove the item from cache, or (easier), b) just heroku run rake task.
Again—this mainly works if you're just pulling data that needs to be shared among all dynos.
This obviously doesn't work the other way around, for instance if you had a centralized service that you wanted to periodically send metrics to (say, time spent per request) on a per-dyno basis. I can't think of an easy, elegant way to do that on Heroku (other than in real time, with all the overhead that entails).
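Here's the promised sketch of the rake task plus cache read, assuming Rails with Rails.cache backed by the Redis instance provisioned above, a 10-minute schedule for illustration, and ThirdParty.fetch_data standing in for whatever the plugin normally runs (all of those names are hypothetical):

# lib/tasks/sync.rake
namespace :sync do
  desc "Fetch third-party data and cache it where every dyno can read it"
  task refresh: :environment do
    data = ThirdParty.fetch_data
    # expiry set a little under the 10-minute scheduler interval
    Rails.cache.write("third_party_data", data, expires_in: 9.minutes)
  end
end

# in the app, read through the cache and fall back to a live fetch on a miss
Rails.cache.fetch("third_party_data", expires_in: 9.minutes) { ThirdParty.fetch_data }

Scheduling rake sync:refresh from Heroku Scheduler keeps the cache warm, and heroku run rake sync:refresh forces an immediate refresh when the 3rd party has timed out.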

Heroku “psql: FATAL: remaining connection slots are reserved for non-replication superuser connections”

I got the above error message running Heroku Postgres Basic (as per this question) and have been trying to diagnose the problem.
One of the suggestions is to use connection pooling but it seems Rails has this built in. Another suggestion is that the app is configured improperly and opens too many connections.
My app manages all its connections through ActiveRecord, and I had one direct connection to the database from Navicat (or at least I thought I had).
How would I debug this?
RESOLUTION
Turns out it was an Heroku issue. From Heroku support:
We've detected an issue on the server running your Basic database. While we pinpoint this and address it, we would recommend you provision a new Basic database and migrate over with PGBackups as detailed here: https://devcenter.heroku.com/articles/upgrade-heroku-postgres-with-pgbackups. That should put your database on a new server. I apologize for this disruption – we're working to fix this issue and prevent it from occurring in the future.
This has happened a few times on my app -- somehow there is a connection leak, and then all of a sudden the database is getting 10 times as many connections as it should. If you are being swamped by an error like this rather than by traffic, try running this:
heroku pg:killall
That will terminate all connections to the database, so be careful if cutting off queries mid-flight would be dangerous in your situation. I just have a Rails app, and if it goes down, losing a couple of queries is not a big deal because the browser requests will have long since timed out anyway.
You might be able to find out why you have so many connections by inspecting the pg_stat_activity view:
SELECT * FROM pg_stat_activity
Most likely, you have some stray loop that opens new connections without closing them.
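If the app spins up threads that talk to the database, that's a common place for this kind of leak: each thread checks out an ActiveRecord connection and never returns it, and the pool (or the server's connection limit) eventually runs dry. A hedged sketch of the corrected pattern, where items and Item are made-up names:

items.each do |item|
  Thread.new do
    # with_connection checks a connection out of the pool and, crucially,
    # checks it back in when the block finishes
    ActiveRecord::Base.connection_pool.with_connection do
      Item.find(item.id).touch    # hypothetical work that needs the DB
    end
  end
end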
To save you the support call, here's the response I got from Heroku Support for a similar issue:
Hello,
One of the limitations of the hobby tier databases is unannounced maintenance. Many hobby databases run on a single shared server, and we will occasionally need to restart that server for hardware maintenance purposes, or migrate databases to another server for load balancing. When that happens, you'll see an error in your logs or have problems connecting. If the server is restarting, it might take 15 minutes or more for the database to come back online.
Most apps that maintain a connection pool (like ActiveRecord in Rails) can just open a new connection to the database. However, in some cases an app won't be able to reconnect. If that happens, you can heroku restart your app to bring it back online.
This is one of the reasons we recommend against running hobby databases for critical production applications. Standard and Premium databases include notifications for downtime events, and are much more performant and stable in general. You can use pg:copy to migrate to a standard or premium plan.
If this continues, you can try provisioning a new database (on a different server) with heroku addons:add, then use pg:copy to move the data. Keep in mind that hobby tier rules apply to the $9 basic plan as well as the free database.
Thanks,
Bradley

Response time increasing (worsening) over time with consistent load

Ok. I know I don't have a lot of information. That is, essentially, the reason for my question. I am building a game using Flash/Flex and Rails on the back-end. Communication between the two is via WebORB.
Here is what is happening. When I start the client, an operation calls the server every 60 seconds (not much, right?), which results in two database SELECTs, an UPDATE, and a response to the client.
This repeats every 60 seconds. I deployed a test version on Heroku, and New Relic's RPM told me that response time degraded over time: with one client making one call every 60 seconds, the response time drifted from 150 ms to over 900 ms over several hours.
I have been able to reproduce this in my development environment (my Macbook Pro) so it isn't a problem on Heroku's side.
I am not doing anything sophisticated (by design) in the server app. An action gets called, gets some data from the database, performs an AR update and then returns a response. No caching, etc.
Any thoughts? Anyone? I'd really appreciate it.
What does the development log say is slow for those requests? The view or the DB? If it's the DB, check how many records there are in the database and see how to optimize the queries. Maybe you need to index some fields.
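If you're on a reasonably recent Rails (3.2 or later), you can also ask ActiveRecord for the query plan directly from the console; Game and player_id below are made-up names for this app:

Game.where(player_id: 42).explain
# a "Seq Scan" over a growing table in the output usually points at a missing index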
Are you running locally in development or production mode? I've seen Rails apps' performance degrade faster over time (memory usage) in development mode. I'm not sure whether one can run an app on Heroku in development mode, but if I were you I would check into that.

Resources