Prevent request timeout with long requests - ruby-on-rails

I have a Rails Controller on Heroku where I send emails in a loop, and respond to the user with some information on which email address the emails were sent to.
While this works when only a few (~40) emails need to be sent out, the request times out when there are more than just a few emails to send (e.g. > 40).
Heroku states in their guides that requests must respond with at least one byte within 30 seconds: https://devcenter.heroku.com/articles/request-timeout
While I know this is not the best way to achieve this, I'm currently trying to figure out how to do this in Ruby.
If this were a PHP app, I could do an echo before entering the loop, and then keep echoing something in every iteration. How do I achieve something similar in Rails?

Your best bet is not to do the mailing before sending the response back. You will have better luck first adding the job to one of the many worker queues available on Heroku, then redirecting to a monitoring page that displays the job's progress and updates itself periodically. If you are trying to avoid those queue services for budget reasons, you may be able to accomplish the same thing with a new thread instead of a queue. Either way, this technique scales better and recovers from failure more easily.
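A minimal sketch of that shape, assuming a Sidekiq-style worker; BulkMailerJob, NotificationMailer, the Redis progress key, and mail_progress_path are all illustrative names:

```ruby
# app/workers/bulk_mailer_job.rb
class BulkMailerJob
  include Sidekiq::Worker

  def perform(recipient_ids)
    User.where(id: recipient_ids).find_each do |user|
      NotificationMailer.announcement(user).deliver
      # record progress somewhere the monitoring page can read it
      Sidekiq.redis { |redis| redis.incr("bulk_mailer:#{jid}:sent") }
    end
  end
end

# in the controller: enqueue the job, then redirect to the monitoring page
def create
  job_id = BulkMailerJob.perform_async(params[:recipient_ids])
  redirect_to mail_progress_path(job_id)
end
```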
As you appear to already know that your proposed approach is not ideal, I will also try to answer your exact question: you may be able to make HTTP streaming work for this. I would recommend checking out http://railscasts.com/episodes/266-http-streaming.
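If you do want to stream, here is a rough sketch assuming Rails 4's ActionController::Live (the Railscast covers the earlier Rails 3.1 streaming mechanism); recipients and UserMailer are illustrative:

```ruby
class MailingsController < ApplicationController
  include ActionController::Live

  def create
    response.headers["Content-Type"] = "text/plain"
    recipients.each do |address|
      UserMailer.notice(address).deliver
      # write a few bytes each iteration so Heroku's router sees activity
      response.stream.write("sent #{address}\n")
    end
  ensure
    # the stream must always be closed, even if a delivery raises
    response.stream.close
  end
end
```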

Related

Proper way to save/update a one-off timestamp in Rails app

New to Rails, and looking for the 'right' way to do something that seems straightforward, but nothing I've read about sounds quite right.
I have a Rails app on Heroku, and I've added a call to an endpoint that depends on an external system. If that call is unsuccessful there'll be some follow-up needed, so I save details to the error log. I've added a notification email (to a Slack room for this sort of thing) to prompt me to check the logs and follow up if it happens.
In case the endpoint gets bogged down and fails repeatedly, I want to throttle the Slack alert so I don't spam everyone (for example, only email the Slack room if 30 minutes have gone by since the last time it alerted).
To do this, I imagine I need:
somewhere to save a timestamp for the last email notification for the error
whenever the error occurs, compare with that timestamp and only email the Slack room if the 30-minute window has passed, then update the timestamp with the new value.
What's an appropriate place to save this kind of timestamp value? I've read that global variables are the devil (and wouldn't actually work in this case), but the other options (adding database field, trying the simpleconfig gem) seem excessive/incorrect for something internal that I don't even know will happen once, let alone frequently.
Is there a lightweight way to get this done?
A popular choice would be to store it in a Redis store -- especially if you already have one set up for something else, like caching. As this is itself ephemeral data, you could even use the Rails.cache API to abstract away the detail and have this code just trust that it gets stored somewhere.
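A minimal sketch of that Rails.cache approach; the key name and the SlackNotifier call are illustrative stand-ins:

```ruby
THROTTLE_KEY = "error_alert_recently_sent".freeze

def notify_slack_throttled(message)
  # if the key is still live, an alert already went out within the window
  return if Rails.cache.exist?(THROTTLE_KEY)

  Rails.cache.write(THROTTLE_KEY, Time.current, expires_in: 30.minutes)
  SlackNotifier.ping(message) # stand-in for however you email the Slack room
end
```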
Failing that, the most straightforward solution is probably to create a tiny single-row table and store it in there: it's overkill, but doesn't involve doing anything unusual, or that would look out of place in the middle of a Rails application.
As a quick and simple solution, though, a global variable isn't out of the question: it has strong limitations, like it won't be shared across multiple server processes, and it'll go away any time the process restarts... but if those add up to a risk that you'll get, say, 4-6 notifications in an error-heavy 30 minute period -- maybe that's good enough? (It'd also give you a "reset on deploy" feature for free, so you know immediately if the problem's still occurring after you think you've fixed it.)

How can I retrieve data onscreen from an external API without tying up a request?

I have a Rails 3 application that lets a user perform a search against a 3rd party database via an API. It could potentially bring back quite a bit of XML data. Also, the API could be busy serving requests for other users and have a nontrivial delay in its response time.
I only run 2 web servers, so I obviously can't afford to tie one up with a delayed request. I use Sidekiq to process long-running jobs, but in all the cases where I've needed it, I haven't had to return a value to the screen.
I also use Pusher to communicate back to the user when a background job is finished. I am checking it out, but I don't know if it can be used for the kind of data I want to push to the screen. Right now it just pops up dialog boxes with messages that I send it.
I have thought of some pretty kooky stuff, like running the request via Sidekiq, sending the results to a session object or file, then using Pusher to kick off some kind of event to grab the data and populate the screen with it. Seems kind of Rube Goldberg-ish.
I appreciate any help or insight anyone can offer into the problem!
I had a similar situation not long ago, and the way I fixed it was with memcached and threads.
I also thought about using Sidekiq, but Sidekiq is ideal when you don't expect to use the data right away, so memcached and threads worked pretty well and gave us a good amount of control.
Instead of calling the API directly, I would assign the API request to a thread, and this thread, once done, would write to memcached. In my case this can happen incrementally, with the same API able to return more data from the same endpoint until it is complete.
From the UI I would have a basic Ajax polling mechanism that would hit a controller and check memcached for the data and for the status, to see whether it was complete or not; this would signal the UI that it needs to keep polling for more data.
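A rough sketch of that pattern, using Rails.cache (which can be backed by memcached) in place of direct memcached calls; ExternalApi and the key scheme are illustrative:

```ruby
# kick off the slow third-party call in a thread and return immediately
def start_search
  key = "search:#{current_user.id}"
  Rails.cache.write("#{key}:status", "running")

  Thread.new do
    results = ExternalApi.search(params[:q]) # the slow third-party request
    Rails.cache.write("#{key}:results", results)
    Rails.cache.write("#{key}:status", "complete")
  end

  head :accepted
end

# the UI's Ajax polling loop hits this until status reads "complete"
def poll_search
  key = "search:#{current_user.id}"
  render json: {
    status:  Rails.cache.read("#{key}:status"),
    results: Rails.cache.read("#{key}:results")
  }
end
```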

opening and closing streaming clients for specific durations

I'd like to infrequently open a Twitter streaming connection with TweetStream and listen for new statuses for about an hour.
How should I go about opening the connection, keeping it open for an hour, and then closing it gracefully?
Normally for background processes I would use Resque or Sidekiq, but from my understanding those are for completing tasks as quickly as possible, not chilling and keeping a connection open.
I thought about using a global variable like $twitter_client but that wouldn't horizontally scale.
I also thought about building a second application that runs on one box to handle this functionality, but that seems excessive if it can be integrated into the main app somehow.
To clarify, I have no trouble starting a process, capturing tweets, and using them appropriately. I'm just not sure what I should be starting. A new app? A daemon of some sort?
I've never encountered a problem like this, and am completely lost. Any direction would be much appreciated!
Although not a direct fix, this is what I would look at:
Time
You're working with time, so I'd look at what time-centric processes could be used to keep the connection open for an hour.
Specifically, I'd look at running some sort of job on the server, which you could fire at specific times (programmatically if required), to open & close the connection. I only have experience with Resque, but as you say, it's probably not up to the job. If I find any better solutions, I'll certainly update the answer.
Storage
Once you've connected to TweetStream, you'll want to look at how you can capture the tweets for that time period. It seems a waste to create a data table just for the job, so I'd be inclined to use something like Redis to store the tweets that you need
This can then be used to output the tweets you need, allowing you to simulate storing / capturing them, but then delete them after the hour-window has passed
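For example, a rough sketch of the capture window, assuming TweetStream 2.x's documented configure/track API and the redis-rb gem; the tracked keyword and Redis key are illustrative:

```ruby
require "tweetstream"
require "redis"

TweetStream.configure do |config|
  config.consumer_key       = ENV["TWITTER_CONSUMER_KEY"]
  config.consumer_secret    = ENV["TWITTER_CONSUMER_SECRET"]
  config.oauth_token        = ENV["TWITTER_OAUTH_TOKEN"]
  config.oauth_token_secret = ENV["TWITTER_OAUTH_TOKEN_SECRET"]
end

redis   = Redis.new
client  = TweetStream::Client.new
started = Time.now

client.track("ruby") do |status|
  redis.lpush("tweets", status.text)
  redis.expire("tweets", 3600) # let the captured tweets lapse with the window
  # checked on each incoming tweet, so an idle stream won't close on the dot
  client.stop if Time.now - started > 3600
end
```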
Delivery
I don't know what context you're using this feature in, so I'll just give you as generic a process idea as possible.
To display the tweets, I'd personally create some sort of record in the DB to show the time you're pinging TweetStream that day (if it changes; if it's constant, just set a constant in an initializer), and then include some logic to try to get the tweets from Redis. If you're able to collect them, show them as you wish; otherwise, don't print anything.
Hope that gives you a broader spectrum of ideas?

Ruby/Rails synchronous job manager

Hi,
I'm going to set up a Rails website where, after some initial user input, some heavy calculations are done (via a C extension to Ruby, which will use multithreading). As these calculations will consume almost all CPU time (and memory too), there should never be more than one calculation running at a time. Also, I can't use (asynchronous) background jobs (like with delayed_job), as Rails has to show the results of the calculation, and the site should work without JavaScript.
So I suppose I need a separate process where all Rails instances have to queue their calculation requests and wait for the answer (maybe an error message if the queue is full), a kind of synchronous job manager.
Does anyone know if there is a gem/plugin with such functionality?
(Nanite seemed pretty cool to me, but it appears to be asynchronous only, so the Rails instances would not know when the calculation is finished. Is that correct?)
Another idea is to write my own using distributed Ruby (DRb), but why reinvent the wheel if it already exists?
Any help would be appreciated!
EDIT:
Thanks to zaius's tips, I think I will be able to do this asynchronously, so I'm going to try Resque.
Ruby has mutexes / semaphores.
http://www.ruby-doc.org/core/classes/Mutex.html
You can use a semaphore to make sure only one resource intensive process is happening at the same time.
http://en.wikipedia.org/wiki/Mutex
http://en.wikipedia.org/wiki/Semaphore_(programming)
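For instance, a minimal sketch of guarding the heavy work with a Mutex; note it only coordinates threads within a single Ruby process, and CalculationEngine is an illustrative name:

```ruby
HEAVY_WORK_LOCK = Mutex.new

def run_calculation(input)
  # refuse a second calculation instead of piling more load onto the CPU
  raise "calculation already in progress" unless HEAVY_WORK_LOCK.try_lock

  begin
    CalculationEngine.run(input) # stand-in for the C-extension call
  ensure
    HEAVY_WORK_LOCK.unlock
  end
end
```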
However, the idea of blocking a front end process while other tasks finish doesn't seem right to me. If I was doing this, I would use a background worker, and then use a page (or an iframe) with the refresh meta tag to continuously check on the progress.
http://en.wikipedia.org/wiki/Meta_refresh
That way, you can use the same code for both javascript enabled and disabled clients. And your web app threads aren't blocking.
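A sketch of such a self-refreshing progress page as an ERB view; @job_finished and @result are illustrative assigns set by the controller:

```erb
<%# re-request the page every 5 seconds until the background worker is done %>
<% if @job_finished %>
  <p>Done: <%= @result %></p>
<% else %>
  <meta http-equiv="refresh" content="5">
  <p>Still working...</p>
<% end %>
```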
If you have a separate process, then you have a background job... so either you can have it or you can't...
What I have done is have the website write the request params to a database. Then a separate process looks for pending requests in the database - using the daemons gem. It does the work and writes the results back to the database.
The website then polls the database until the results are ready and then displays them.
Although I use javascript to make it do the polling.
If you really can't use JavaScript, then it seems you need to either do the work in the web request thread or make that thread wait for the background job to finish.
To make the web request thread wait, just run a loop in it, checking the database until the reply has been saved back into it. Once it's there, you can complete the request.
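A rough sketch of that waiting loop; the model and column names are illustrative, and the iteration cap keeps the request from hanging forever:

```ruby
def show
  calc = CalculationRequest.find(params[:id])

  # poll the database until the daemon writes the result, for up to ~30s
  30.times do
    break if calc.reload.result.present?
    sleep 1
  end

  if calc.result.present?
    render :show
  else
    render plain: "Still working, try again shortly", status: 202 # :text on old Rails
  end
end
```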
HTH, chris

In a rails app, should e-mail be sent as a background job or synchronously?

We are getting close to releasing our new Rails app, and so far interest seems very strong, so we are a little worried about where the bottlenecks will be. One seems to be system e-mails on signup and in other situations. Is this correct?
Should individual e-mails to users be sent asynchronously in the background? If so, what would be the best solution?
I have looked at a few solutions and can't seem to find anything definitive.
In the background, using http://github.com/tobi/delayed_job
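A minimal sketch of the classic delayed_job pattern from that README: wrap the work in an object with a perform method and enqueue it (the job and mailer names are illustrative):

```ruby
# a tiny job object; delayed_job serializes it and a worker calls #perform
SignupEmailJob = Struct.new(:user_id) do
  def perform
    user = User.find(user_id)
    UserMailer.deliver_signup_confirmation(user) # Rails 2.x-style mailer call
  end
end

# in the controller, instead of sending the mail inside the request:
Delayed::Job.enqueue SignupEmailJob.new(@user.id)
```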
I would say it depends on your requirements. If you need to be able to inform the user that sending mail failed, do it in the same thread.
If not, sending mail should support things like retries, so I would put the message into a queue / the filesystem / a database table and have another thread/process deal with the details of sending.
Same thread, if you ask me... by generating a file in a drop folder, which an email server picks up. There is not too much overhead that way, so a separate thread makes little sense.
At least this is how I always handle this.
