Does Rails provide a way to execute code on the server after the view is rendered and after the response is sent to the browser?
I have an action in my application that performs a lot of database transactions, which results in a slow response time for the user. What I'd like is to (1) perform some computations, (2) send the results of those computations to the browser, and then (3) save the results to the database.
It sounds like you want a background job processor. That lets you put the work into a queue to be processed asynchronously, so your users never notice a long page load.
There are many options available. I have used delayed_job and had no issues with it. Another popular one lately, which I have not used, is Resque.
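For instance, a minimal sketch of the asker's (1)-(2)-(3) flow using delayed_job's .delay proxy; ResultsController, perform_computations, ResultsSaver, and persist are invented names for illustration:

class ResultsController < ApplicationController
  def create
    results = perform_computations(params)    # (1) do the computation in-request
    ResultsSaver.new.delay.persist(results)   # (3) queued; a worker saves it after the response
    render json: results                      # (2) respond to the browser right away
  end
end

The enqueue call returns almost instantly, so the browser never waits on the database writes.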
Related
I am building a website, and I have an administrator page. The admin will have to run a reporting task, meaning the task will iterate over all the records, fetch information, and generate a PDF file. This will be heavy on both the app and the database.
What is the usual approach here? Should I have a button that calls a method of a class, or should I have a rake task? I've heard that HTTP GET requests have a timeout, and if the report generation takes longer than that, the request is killed.
I would like to use send_data(....) so the user is given a nice download pop-up box when the report is done. Or would it be better to use a mailer and email it?
Thanks
We have similar functionality in our Rails apps at my job.
We have one URL/action that initiates the request to generate the PDF file, and returns right away saying the request was started successfully.
Then we have another action that we can poll with AJAX that returns whether or not the report is complete, and when it is complete, it gives the user the PDF.
The actual generation is done by a Sidekiq worker which is not subject to the webserver timeout.
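A hedged sketch of that two-action pattern; the Report model, routes, and helpers below are assumptions, though perform_async is Sidekiq's real enqueue call:

class ReportsController < ApplicationController
  # Kicks off generation and returns right away
  def create
    report = Report.create!(status: "pending")
    ReportWorker.perform_async(report.id)
    render json: { id: report.id, status: report.status }
  end

  # Polled via Ajax until the PDF is ready
  def status
    report = Report.find(params[:id])
    render json: { status: report.status, url: report.pdf_url }
  end
end

class ReportWorker
  include Sidekiq::Worker

  def perform(report_id)
    report = Report.find(report_id)
    pdf = generate_pdf(report)   # the long-running work, free of the webserver timeout
    report.update!(status: "done", pdf_url: store_pdf(pdf))
  end
end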
In a Rails 3.2 app I have a view that pulls in information from an external API. On slow connections, this severely increases the page load time and hurts the user experience.
How can I move this into an asynchronous process so that the rest of the page loads first, and the external information is rendered later once it has been fetched?
The external data is large and complex, and I don't think it is suitable to cache in the database or in a variable.
I'm aware of delayed_job and similar gems, but these seem more suited to queuing model-level work than to fetching data for a view.
What other options are available to me?
It seems like a large data set is perfectly suitable for caching on your local server.
Keep in mind that a long request will lock up your Rails process/thread, which can't serve any other requests while it waits for your API call to finish.
That said, you can always trigger an Ajax request to occur once the rest of the page loads.
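Combining the caching and Ajax suggestions, a rough sketch of an endpoint the page could hit after its initial load; ExternalApi.fetch_all and the partial path are assumptions:

class WidgetsController < ApplicationController
  def external_data
    # Cache the large payload so repeat visits skip the slow API call entirely
    data = Rails.cache.fetch("external_api_data", expires_in: 10.minutes) do
      ExternalApi.fetch_all   # the slow call now happens outside the initial render
    end
    render partial: "widgets/external_data", locals: { data: data }
  end
end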
I am designing a Rails app that takes in requests, uses data within the request to call a 3rd party web service, processes the reply, and then sends a response to the original requestor while also issuing a PUT request to yet another service.
I am trying to wrap my head around how to design this Rails app as it's different from the canonical Rails structure.
The objects are Lists and Tasks. Each List has many Tasks, and each Task belongs to a List.
The request I would get is something like:
http://myrailsapp.heroku.com/v1/lists?id=1&from=2012-02-12&to=2012-02-14&priority=high
In this example I am requesting tasks from 2/12/2012 to 2/14/2012 with high priority in List #1.
I would then issue a 3rd party web service call like this:
http://thirdpartywebservice.com/v1/lists?id=4128&from=2012-02-12&to=2012-02-14&priority=high
As you can see, some processing was done on the data (the id was changed in this case).
The results are then sent back to the requestor and to another web service via PUT.
My question is: how do I set up the Rails app to handle these kinds of behaviors? How does the controller structure change? This looks like a good use case for queues, so how do I distribute multiple concurrent requests among queues?
For one thing, I don't need data persistence (the data can be discarded after the response is sent out), and the data structure design is simplified. (I don't think I need full Ruby objects; plain hashes representing these would be lighter weight and quicker to implement.)
Edit
So I broke down the workflow of the app into these components:
Parse incoming request
Construct the 3rd party web service request
Send 3rd party request
Enqueue a worker to process the expected response
Process the response once it arrives
Send the parsed result back as a response
Which of the standard Rails controller actions handle each of these steps? And what models are needed besides Lists and Tasks?
You should still use a database, because passing data to Resque is messy. Instead, store it in the database, pass the record's id to the workers, have them fetch the data, and then commit any new data or delete the record. It's really up to you, but this method is cleaner. You can also use a push service like Faye to let the user know when the processing is complete.
If you expect many concurrent requests, I would recommend Sidekiq, as it's less of a memory hog: having 4-5 Resque workers can already suck up about 512 MB. The controller structure should not change. Please comment on anything you need clarified and I'll be happy to update my answer.
EDIT
You would want to use a separate database store, such as Postgres. I'm not sure it matters much which models you use, but essentially this is what should be happening.
In your controller, create a Request object containing the query params you want to send to the 3rd party service. Then enqueue a job to be handled by Sidekiq/Resque (let's call it ThirdPartyRequest), passing in the id of the Request object you just created as an argument. Then render a view showing the Request object. Request#response will still be empty because it hasn't been processed yet, so let the user know it's still processing.
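A rough sketch of that controller action, assuming Resque and a Request model with a serialized query_params column and a response column:

class RequestsController < ApplicationController
  def create
    # Persist the incoming query params so the worker can look them up by id
    req = Request.create!(query_params: params.slice(:id, :from, :to, :priority))
    Resque.enqueue(ThirdPartyRequest, req.id)
    redirect_to request_path(req)   # the show page tells the user it's still processing
  end
end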
A worker then handles your ThirdPartyRequest job. ThirdPartyRequest should fetch the Request object and obtain the query params needed to contact the third party service. It does that, gets a response back, updates the Request object with that response, and saves it.
class ThirdPartyRequest
  @queue = :third_party_requests   # queue name, assuming Resque's conventions

  def self.perform(request_id)
    request = Request.find(request_id)
    # contact the third party service here
    request.response = ...   # store whatever the service returned
    request.save
  end
end
The user can continually refresh the page to check on their Request object. Once it gets updated with the response, they will know it's completed. If you want the page to refresh automatically, look into faye/juggernaut/private_pub or a SaaS solution like Pusher.
I'm aware of the model that involves a scheduled task running in the background, which runs jobs registered by a web request, but how about this for an idea that keeps everything within ASP.NET...
User uploads a CSV file with, perhaps, several thousand rows. The rows are persisted to the database. I think this would take maybe a minute or so, which would be an acceptable wait.
The request returns to the browser, and then an automatic Ajax request goes back to the server to fetch, say, ten rows at a time and process them. (Each row requires a number of web service requests.)
The Ajax call returns, the display is updated, and then another automatic Ajax request goes back for more rows. This repeats until all rows are completed.
If the user leaves the web page, they could return later and restart the job.
Any thoughts?
Cheers, Ian.
If I understand you correctly, you don't actually need any "interaction" between background jobs and the long-running request; you just want to launch background jobs from incoming requests? Not such a good idea. Take a look at the Quartz.NET project: it is a scheduler embeddable into an ASP.NET application, and it will handle this stuff for you without requests needing to drive it. Of course, if the app pool shuts down, your scheduler goes down with it, but you can't guarantee that won't happen with your long-running-request solution either, since it depends on a browser waiting on the other side.
Also take a look at this interesting article from Phil Haack on this topic, with his own little scheduler library specific to ASP.NET:
http://haacked.com/archive/2011/10/16/the-dangers-of-implementing-recurring-background-tasks-in-asp-net.aspx
A server-side program (or, ideally, a service) could still be quick and dirty and would be more reliable. You could still do step 1 as you have proposed: upload the file and insert the data (don't forget to increase the maxRequestLength value in web.config). Then have a program running on the server that checks for new records and processes them.
If the user needs status you could store an entry in the database for each file and update the database record when the import is complete.
Maybe I'm reading the question and interpreting it in a weird way, but why couldn't you read the file into a database and store in a table the line of the file you've processed up to? You could then track your progress via the db and just send small JSON objects telling the user how far along you are. That way if their connection drops you can keep processing their request, and if they return later you can notify them of how far along the job is. Also, if multiple clients are connecting, you can use the db to queue and throttle (by serializing) the workload. And if a user connects mid-job with another file, their new request will be queued up after their current job.
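For what it's worth, that progress-tracking endpoint is easy to sketch in a platform-neutral way (shown in Ruby only to match the rest of this page; UploadRow, call_web_services, and the batch size of ten are all assumptions):

# Called repeatedly by the browser's Ajax loop: processes one small batch
# per request and reports progress so the page knows whether to keep going.
def process_batch
  batch = UploadRow.where(processed: false).limit(10)
  batch.each do |row|
    call_web_services(row)         # assumed helper wrapping the per-row calls
    row.update!(processed: true)   # record progress so a dropped client can resume
  end
  remaining = UploadRow.where(processed: false).count
  render json: { remaining: remaining, done: remaining.zero? }
end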
Following a specific action the user takes on my website, a number of messages must be sent to different email addresses. Is it possible to have a separate thread or worker take care of sending the emails, so the server's response doesn't take a long time to return when there are many emails to send?
I would like to avoid using system processes, scheduled tasks, or email queues.
You can definitely spawn off a background thread in your controller to handle the emails asynchronously.
I know you want to avoid queues, but another thing I have done in the past is write a Windows service that pulls email from a DB queue and processes it at certain intervals. This way you can separate the two applications if there is a lot of email to be sent.
This can be done in many different ways, depending on how large your application is and what kind of reliability you want. Any of these ways should help you achieve what you want (in ascending order based on complexity):
If you're using IIS SMTP Server or another mail server that supports a pickup directory option, you can go with that. With this option, instead of sending the emails directly, they are saved first in the pickup directory. Your call will immediately return after the email is saved in the pickup directory, so the user won't have to wait until the email is sent. On the other hand, the server will try to send the email as soon as it's saved in the pickup directory so it's almost immediate (just without blocking the call).
You can use a background thread like described in other answers. You'll need to be careful with this option as the thread can end unexpectedly before it finishes its job. You'll need to add some code to make sure this works reliably (personally, I'd prefer not to use this option).
Using a message queue server like MSMQ. This is more work, and you should probably only look into it if you have a large-scale application or have good reasons not to use the first option with the pickup directory.
There are a few ways you could do this.
You could store enough details about the message in the database and write a Windows service to loop through them and send the emails. When the user submits the form, it just inserts the required data about the message and trusts the service to pick it up. That's almost an email queue, which you said you didn't want, but you're going to end up in a queue situation with almost any solution.
Another option would be to drop in NServiceBus. Use that for these kinds of tasks.
I typically compile the message body and store that in a table in the db along with the from and to addresses, a subject, and a timestamp indicating when the email was sent. Then I have a background task check the table periodically and pull any that haven't been sent. This task attempts to send each email and updates the timestamp accordingly. One advantage of storing the compiled message body up front is that the background task doesn't have to do any processing of context-specific data, and therefore can be pretty darn simple.
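The pattern is platform-neutral; a minimal sketch (in Ruby only to match the rest of this page, since this answer targets ASP.NET), where QueuedEmail is an assumed model with from/to/subject columns, a precompiled body, and a sent_at timestamp:

class EmailOutbox
  # Run periodically by the background task the answer describes
  def self.flush
    QueuedEmail.where(sent_at: nil).find_each do |email|
      send_via_smtp(email)              # assumed helper; the body was compiled up front
      email.update!(sent_at: Time.now)  # stamp only after a successful send
    end
  end
end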
Whenever an operation like this is contingent upon an event, there is always the possibility something will go wrong.
In ASP.NET you can spawn multiple threads and have those threads do the work. Make sure you tell the thread it's a background thread, otherwise ASP.NET might wait for the thread to finish before rendering your page:
myThread.IsBackground = true; // background threads won't hold the request open
I know you said you didn't want to use system processes or scheduled tasks, but a Windows service would be a viable approach to this as well. The approach would be to use MSMQ, or to save the actions needing to be done in a database table, and then have a Windows service check every minute or so and perform those actions.
This way, if something fails (say, the email server is down), those emails/actions can still be done later.
They will also be recorded for audits (which is very nice to have).
This method allows your web site to function as a website while offloading these tasks to another service. The last thing you need is for multiple ASP.NET processes to be tied up waiting for emails to send. Let something else handle that.