Rails application design: Queueing, Resque, Background Services, and Redis - ruby-on-rails

I am designing a Rails app that takes in requests, uses data within the request to call a 3rd party web service, processes the reply, and then sends a response to the original requestor and also issues a PUT request to yet another service.
I am trying to wrap my head around how to design this Rails app as it's different from the canonical Rails structure.
The objects are Lists and Tasks. Each List has many Tasks, and each Task belongs to a List.
The request I would get is something like:
http://myrailsapp.heroku.com/v1/lists?id=1&from=2012-02-12&to=2012-02-14&priority=high
In this example I am requesting tasks from 2/12/2012 to 2/14/2012 with a high priority in List #1
I would then issue a 3rd party web service call like this:
http://thirdpartywebservice.com/v1/lists?id=4128&from=2012-02-12&to=2012-02-14&priority=high
As you can see some processing was done on the data (id was changed in this case)
The results are then sent back to the requestor and to another web service via PUT.
My question is, how do I set up the Rails app to handle these types of behaviors? How does the controller structure change? This looks like a good use case for queues, how do I distribute multiple concurrent requests among queues?
For one thing I don't need data persistence (data can be discarded after the response is sent out) and also data structure design is simplified. (I don't think I need ruby objects, simply dictionaries or hashes representing these would be lighter weight and quicker to implement)
Edit
So I broke down the work flow of the app into these components
Parse incoming request
Construct 3rd party web service request
Send 3rd party request
Enqueue a worker to process the expected response
Process the response once it arrives
Send the parsed result back as a response
Which of the standard ruby controllers handle each of these steps? What are the models needed besides Lists and Tasks?
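The first two steps of that workflow (parse the incoming request, construct the 3rd party request) can be sketched with plain hashes, as the question suggests. This is only an illustration: `LIST_ID_MAP` and `third_party_url` are hypothetical names, and the 1 to 4128 id mapping is taken from the example URLs rather than from any real lookup rule.

```ruby
require "uri"

# Hypothetical mapping from our list ids to the third party's ids.
LIST_ID_MAP = { "1" => "4128" }.freeze
THIRD_PARTY_BASE = "http://thirdpartywebservice.com/v1/lists"

# Parse the incoming query string into a plain hash, translate the id,
# and build the outgoing URL from the result.
def third_party_url(incoming_url)
  params = URI.decode_www_form(URI.parse(incoming_url).query.to_s).to_h
  params["id"] = LIST_ID_MAP.fetch(params["id"])
  "#{THIRD_PARTY_BASE}?#{URI.encode_www_form(params)}"
end

puts third_party_url(
  "http://myrailsapp.heroku.com/v1/lists?id=1&from=2012-02-12&to=2012-02-14&priority=high"
)
```

In a Rails controller the same hash would come from `params`, so no model class is strictly required for this part.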

You should still use a database because passing data to Resque is messy. Rather, you should store it in the database and then pass the id to the workers, fetch the data, commit any new data or delete the record. It's really up to you but this method is cleaner. You can also use a push service like faye to let the user know when the processing is complete.
If you expect to have many concurrent requests, I would recommend Sidekiq as it's less of a memory hog. Having 4-5 resque workers can already suck up about 512 MB. The controller structure should not change. Please comment on anything you need clarified and I'll be happy to update my answer.
EDIT
You would want to use a separate database store, such as Postgres. Not sure if it's important what models you need, but essentially this is what should be happening.
In your controller, create a Request object which contains the query params you want to query this 3rd party service with. Then enqueue a job to be handled by Sidekiq/Resque, let's call this ThirdPartyRequest, and pass in the id of the Request object you just created as an argument. Then render a view here showing the Request object. Let's say that Request#response is still empty because it hasn't been processed yet, so let the user know it's still processing.
A worker then handles your job ThirdPartyRequest. ThirdPartyRequest should then fetch the Request object and obtain the query params needed to contact the third party service. It does that and then gets a response. Update the Request object with this response, then save it.
class ThirdPartyRequest
  def self.perform(request_id)
    request = Request.find(request_id)
    # contact third party service
    request.response = ...
    request.save
  end
end
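To make the "store the record, pass only the id" flow concrete without a database or Redis, here is a minimal sketch using in-memory stand-ins. `REQUESTS`, `JOBS`, `create_request`, and `work_off` are hypothetical names invented for this example; they are not Resque or Sidekiq APIs.

```ruby
# In-memory stand-ins, for illustration only: REQUESTS plays the role of the
# requests table and JOBS the role of the Resque/Sidekiq queue.
RequestRecord = Struct.new(:id, :params, :response)
REQUESTS = {}
JOBS = Queue.new

# Controller side: persist the params, then enqueue only the record's id.
def create_request(params)
  id = REQUESTS.size + 1
  REQUESTS[id] = RequestRecord.new(id, params, nil)
  JOBS << id
  id
end

# Worker side: look the record up by id, call the service, store the reply.
# `service` is any callable; a real worker would make the HTTP call here.
def work_off(service)
  until JOBS.empty?
    record = REQUESTS.fetch(JOBS.pop)
    record.response = service.call(record.params)
  end
end

id = create_request({ "id" => "4128", "priority" => "high" })
puts REQUESTS[id].response.inspect # nil until a worker has processed the job
work_off(->(params) { "tasks for list #{params['id']}" })
puts REQUESTS[id].response
```

Passing only the id keeps the job payload small and lets the worker always read the latest state of the record.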
The user can keep refreshing the page to check on their Request object. Once it gets updated with the response, they will know it's completed. If you want the page to refresh automatically, look into faye/juggernaut/private_pub or a SaaS solution like Pusher.

Related

Share data between ActiveJob and Controller

Every n seconds the application requests a remote JSON file that provides live prices for securities in the Trading system. The JSON has a block with the data I need (marketdata) and a block with the current dataversion (version and seqnum).
Right now I use ActionController::Live (with EventSource on the client side) to push updated data to the browser. All actions are done within one method:
opening SSE connection;
forming dynamic URL;
pulling new data from remote server;
comparing/reassigning seqnum value;
updating database if needed.
So my goal now is to separate pulling & updating the database (ActiveJob) from pushing updated values to the browser (ActionController::Live). To accomplish this I need:
either to store somewhere on the server side seqnum & version to share between controller and background job;
or monitor databases for the latest changes in the updated_at fields.
So basically I have two questions:
What is more efficient between the two options above? Are there any other good approaches?
(in case the first one has a right to exist) How to implement this approach?
Considering that you might have, for example, multiple Rails processes running, I believe it becomes quite hard to let ActiveJob talk directly to a Rails controller in some way.
Definitely store seqnum and version. I wouldn't rely on updated_at in any case: it's too easy for it to get updated for unrelated reasons, and you would end up sending stuff to the client without any real reason. Also, in this case seqnum and version seem like very solid fields for indicating whether the file has been updated.
With polling
That being said, you want to "signal" ActionController::Live in some way and I'm afraid polling here is your only option, unless on your client side there is a specific moment when it needs to know if the file has been updated, in which case you might want to use websockets or something similar.
So, something like
cached_request = YourCachedRequest.latest # Assuming it returns a single record
updated = true
loop do
  if updated
    updated = false
    response.stream.write cached_request.serialize_in_some_way
  end
  current_version = cached_request.version # use seqnum too if you need
  cached_request = cached_request.reload
  updated = true if cached_request.version > current_version
  sleep 20.0
end
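If seqnum is needed as well, the comparison in the loop can cover both fields at once. The sketch below assumes version and seqnum are integers and that version takes precedence over seqnum; `DataVersion` and `newer_than?` are hypothetical names for this example.

```ruby
# Hypothetical snapshot of the "dataversion" block from the JSON feed.
DataVersion = Struct.new(:version, :seqnum) do
  # True when this snapshot is strictly newer than `other`.
  # Arrays compare element by element, so version is checked before seqnum.
  def newer_than?(other)
    ([version, seqnum] <=> [other.version, other.seqnum]) == 1
  end
end

last_sent = DataVersion.new(3, 120)
puts DataVersion.new(3, 121).newer_than?(last_sent)
puts DataVersion.new(3, 120).newer_than?(last_sent)
```

The streaming loop would then write to the client only when the reloaded record is `newer_than?` the one last sent.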
Without polling
If you want an option that doesn't involve polling, you can only go for websockets I believe. However you have a more efficient option:
Create a mini application (eventmachine/sinatra/something light) where the clients will poll (you can pass through your main application to distribute this to different nodes of this mini application). The point of this app is only to reroute messages from your main application to polling clients.
Now, you can create an internal API endpoint for your main application that is used only by delayed job. Delayed job will hit this endpoint only when it notices that the fetched JSON is actually updated relative to the one currently stored. If that's the case, it will hit your main app API endpoint, which in turn will send a message (again, probably through an HTTP API endpoint, this time on your mini app) to all your mini app instances, which in turn will send it to your clients.
In this way, you don't overload your main server but only these mini-nodes which can have localized outages (which is a big advantage, instead of having a big system outage).
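The relay idea can be sketched in-process before committing to any HTTP design. In this illustration `Relay` stands in for the mini app and `publish_if_changed` for the background job's check; both names and the message shape are assumptions, and a real setup would talk over HTTP as described above.

```ruby
# In-process stand-in for the relay "mini app": each connected client gets
# its own queue, and publish fans a message out to all of them.
class Relay
  def initialize
    @clients = []
    @lock = Mutex.new
  end

  # Called once per connecting client; the client reads from the returned queue.
  def subscribe
    queue = Queue.new
    @lock.synchronize { @clients << queue }
    queue
  end

  def publish(message)
    @lock.synchronize { @clients.each { |queue| queue << message } }
  end
end

# Job side: publish only when the fetched seqnum differs from the stored one.
# Returns the seqnum that should be stored afterwards.
def publish_if_changed(relay, stored_seqnum, fetched)
  return stored_seqnum if fetched[:seqnum] == stored_seqnum

  relay.publish(fetched)
  fetched[:seqnum]
end

relay = Relay.new
client = relay.subscribe
seqnum = publish_if_changed(relay, 10, { seqnum: 11, marketdata: { "ABC" => 42.5 } })
puts seqnum
puts client.pop.inspect
```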

Rails app with no database and continually updated models

I'm wondering what the best way to go about developing a rails application with the following features:
All of the data comes from a SOAP request to a 3rd party
A background task will make this soap request every ~10s
The background task will parse the response and then update an ActiveRecord model accordingly
The data isn't written to a database at all, if the app fails, when we start it back up the data will come from the soap request again
Users will make a request to the app which will simply show data in the model (i.e. from the soap request).
The idea is to avoid making the SOAP request for every single user as the data won't change that frequently. Not using a database avoids reading and writing of data that only ever comes from the request anyway.
I imagine that all of this can be accomplished quite simply with a few gems, but I've had a bit of trouble sorting through what meets my requirements and what doesn't.
Thanks
I'm not sure what benefit you're getting from using ActiveRecord in this case.
Perhaps consider some other type of persistence for the SOAP calls?
If the results from the web service are really not changing, I would recommend the Rails caching mechanism. Anywhere in your Rails app, you can do:
Rails.cache.fetch "a_unique_cache_key" do
  # ... do your SOAP request and return the result
end
This will do the work within the block just once and fetch its result from the rails cache store in the future.
The cache store can be of various types (one of which is the memcache store). I usually go with the file store for medium traffic sites, but you may choose another:
http://guides.rubyonrails.org/caching_with_rails.html
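Since the question wants the data refreshed roughly every 10 seconds, the cached value needs an expiry; `Rails.cache.fetch` accepts an `expires_in:` option for this. The sketch below shows the same idea in plain Ruby so it can be run outside Rails. `TtlCache` is a hypothetical class written for this example, not part of Rails, and the injectable clock exists only to make the expiry easy to demonstrate.

```ruby
# Minimal in-memory cache with expiry, for illustration only.
class TtlCache
  Entry = Struct.new(:value, :expires_at)

  def initialize(clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @clock = clock
    @entries = {}
  end

  # Returns the cached value, or runs the block and stores its result
  # for `expires_in` seconds.
  def fetch(key, expires_in:)
    now = @clock.call
    entry = @entries[key]
    return entry.value if entry && entry.expires_at > now

    value = yield
    @entries[key] = Entry.new(value, now + expires_in)
    value
  end
end

now = 0
calls = 0
cache = TtlCache.new(clock: -> { now })
soap_request = -> { calls += 1; "prices v#{calls}" } # stand-in for the SOAP call

puts cache.fetch("soap", expires_in: 10) { soap_request.call }
now = 5
puts cache.fetch("soap", expires_in: 10) { soap_request.call } # served from cache
now = 11
puts cache.fetch("soap", expires_in: 10) { soap_request.call } # expired, refetched
```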

Suggestions for how to write a service in Rails 3

I am building an application which will send status requests to users (via email & sms) on a regular basis. I want to execute the service each hour which will:
Query the database for all requests that need to be sent (based on some logic)
Send the requests through Amazon's Simple Email Service (this is already working)
Write a record of the status request notification back to the data store
I am considering wrapping up this series of operations into a single controller with an end point that can be called remotely to kick off the process within the rails app.
Longer term, I will break this process out into an app that can be run independently of my rails app, but for now I'm just trying to keep it simple.
My first inclination is to build the following:
Controller with the following elements:
A method which will orchestrate the steps outlined above (and can be called externally)
A call to the status_request model which will bring back a collection of requests needing to be sent
A loop to iterate through the pending requests, which will:
Make a call to my AWS Simple Email Service module to actually send the email, and
Make a call to the status_request model to log the request back to the database
Model:
A method on my status_request model which will bring back a collection of requests that need to be sent
A method in my status_request model which will log that a notification was sent
Since this will behave as a service that gets called periodically from an outside scheduler I don't think I'll need a view for this operation. (Will, of course, need views to show users and admins what requests have been sent, but that's later...).
As someone new to Rails, I'm asking for review of this approach and any suggestions you may have.
Thanks!
Instead of a controller, which as Jeff pointed out exposes a security risk, you may just want to expose a rake task and use cron to invoke it on an hourly basis.
If you are still interested in building a controller, look at devise gem and its single access token, token_authenticatable, for securing the methods you are exposing.
You may also want to look at delayed_job or resque to offload the call to status_request and the loop to AWS simple service to a background worker process.
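Whichever entry point is chosen (rake task, background worker, or controller), the orchestration itself can live in a plain Ruby object so that it is easy to move later. The following is a minimal sketch under assumptions: `StatusRequestRunner`, `deliver`, and `record` are hypothetical names, and the mailer and log are passed in so that the SES module and the status_request model can be plugged in by the application.

```ruby
# Plain Ruby object holding the hourly workflow, so a rake task, a worker,
# or (later) a standalone app can all call the same code.
class StatusRequestRunner
  # pending: enumerable of requests that are due to be sent
  # mailer:  responds to deliver(request) and raises on failure
  # log:     responds to record(request, status)
  def initialize(pending:, mailer:, log:)
    @pending = pending
    @mailer = mailer
    @log = log
  end

  # Sends every pending request and returns how many were sent successfully.
  def run
    sent = 0
    @pending.each do |request|
      begin
        @mailer.deliver(request)
        @log.record(request, :sent)
        sent += 1
      rescue StandardError
        # One failed delivery should not stop the rest of the batch.
        @log.record(request, :failed)
      end
    end
    sent
  end
end

# Tiny stand-ins to show the call shape.
class PrintingMailer
  def deliver(request)
    puts "sending to #{request}"
  end
end

class MemoryLog
  attr_reader :entries

  def initialize
    @entries = []
  end

  def record(request, status)
    @entries << [request, status]
  end
end

log = MemoryLog.new
runner = StatusRequestRunner.new(pending: ["a@example.com"], mailer: PrintingMailer.new, log: log)
puts runner.run
```

A rake task invoked by cron would then need only one line in its body to build the runner and call `run`.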
You may want a separate controller and view for the log file so you can review progress on demand.
And if you want to get real fancy use Amazon SNS to send you alerts when the service reaches some unacceptable level of failures, backlog, etc.
Since you are trying to invoke this from an outside process, your approach should work. You could also have a worker process that processes tasks when they are there.
You will need routes to expose your service, and you may want to also make security decisions. How will the service that invokes your application authenticate so all others can't hit it at will?
Another consideration should be how many emails you are sending. If there are enough of them, you may want to look into the fact that running this sort of loop inside the web application is going to be extremely heavy, and may affect users on the current system.
In the end, there are many ways to do this. I would focus on the performance/usage you expect as well as security. There's never one perfect way to solve a problem like this, and your way should just be aware of the variables it will need to be operating within.
Resque and Redis might be helpful to you in scheduling and performing operations. They are simple and very fast; here is a simple tutorial on the same: http://railscasts.com/episodes/271-resque

Syncing multiple requests (user actions) with Backbone and Rails

The problem is building an architecture with Backbone and Rails that handles syncing multiple actions to the server.
Assume the model is defined in both Rails and Backbone.
I have update and destroy operations on a model, and I need them to be synced with the server on a user action (button click). In another part of the webapp, these actions on the same model are synced the moment they are made (easy, just send a RESTful Ajax HTTP request).
But in the first case, I can't figure out an easy, stateless, and atomic/transactional way to save the several actions (requests) the user took.
Sending multiple requests to the server makes the save non-atomic and not quite stateless.
Sending one big request with the actions formatted makes parsing on the server necessary.
So, is there another better solution?
If you want multiple updates, on different resources, as one atomic transaction, that is not REST.
So, of course, you will have to orchestrate the parameters and the requests in Rails. (but it's not about parsing, since you'll send JSON, more about creating a format for the aggregated parameters and figuring out what to do on the Rails side).
A nice way to handle multiple requests at once is at https://github.com/railscasts/414-batch-api-requests
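As an illustration of "creating a format for the aggregated parameters", here is one possible shape: a single JSON document listing the operations, applied all-or-nothing. The payload format and `apply_batch` are assumptions made for this sketch, and the in-memory hash stands in for the database; in Rails the all-or-nothing part would normally come from wrapping the work in a database transaction.

```ruby
require "json"

# Applies every operation in the payload to a copy of `store` and returns the
# copy. Raises, leaving `store` untouched, if any operation is invalid.
def apply_batch(store, payload)
  working = store.transform_values(&:dup)
  JSON.parse(payload).fetch("operations").each do |op|
    id = op.fetch("id")
    raise KeyError, "unknown record #{id}" unless working.key?(id)

    case op.fetch("action")
    when "update"  then working[id].merge!(op.fetch("attributes"))
    when "destroy" then working.delete(id)
    else raise ArgumentError, "unsupported action #{op['action']}"
    end
  end
  working
end

tasks = { 1 => { "title" => "Write spec" }, 2 => { "title" => "Old task" } }
payload = JSON.generate(
  "operations" => [
    { "action" => "update", "id" => 1, "attributes" => { "title" => "Write specs" } },
    { "action" => "destroy", "id" => 2 }
  ]
)
puts apply_batch(tasks, payload).inspect
```

Because the work happens on a copy, a failure halfway through leaves the original data as it was, which is the behavior a button-click "save all" needs.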

In Rails, can an internal request be generated that behaves identically to an HTTP request?

Within my Rails application, I'd like to generate requests that behave identically to "genuine" HTTP requests.
For a somewhat contrived example, suppose I were creating a system that could batch incoming HTTP requests for later processing. The interface for it would be something like:
Create a new batch resource via the usual CRUD methodology (POST, receive a location to the newly created resource).
Update the batch resource by sending it URLs, HTTP methods, and data to be added to the collection of requests it's supposed to later perform in bulk.
"Process" the batch resource, wherein it would iterate over its collection of requests (each of which might be represented by a URL, HTTP method, and a set of data), and somehow tell Rails to process those requests in the same way as it would were they coming in as normal, "non-batched" requests.
It seems to me that there are two important pieces of work that need to happen to make this functional:
First, the incoming requests need to be somehow saved for later. This could be simply a case of saving various aspects of the incoming request, such as the path, method, data, headers, etc. that are already exposed as part of the incoming request object within a controller. It would be nice if there was a more "automatic" way of handling this--perhaps something more like object marshaling or serialization--but the brute force approach of recording individual parameters should work as well.
Second, the saved requests need to be able to be re-injected into the rails application at a later time, and go through the same process that a normal HTTP request goes through: routing, controllers, views, etc. I'd like to be able to capture the response in a string, much as the HTTP client would have seen it, and I'd also like to do this using Rails' internal machinery rather than simply using an HTTP library to have the application literally make a new request to itself.
Thoughts?
A straightforward way of storing the arguments should be serializing the request object in your controller. This should contain all the important data.
To call the requests later on, I would consider using the Dispatcher.dispatch class method, which takes 3 arguments: the CGI request, the session options (CgiRequest::DEFAULT_SESSION_OPTIONS should be OK), and the stream the output is written to.
Rack Middleware
After doing a lot of investigation after I'd initially asked this question, I eventually experimented with and successfully implemented a solution using Rack Middleware.
A Basic Methodology
In the `call' method of the middleware:
Check to see if we're making a request as a nested resource of a
transaction object, or if it's an otherwise ordinary request. If it's
ordinary, proceed as normal through the middleware by making a call to
app.call(env), and return the status, headers, and response.
Unless this is a transaction commit, record the "interesting" parts of the
request's env hash, and save them to the database as an "operation" associated
with this transaction object.
If this is a transaction commit, retrieve all of the relevant operations
for this transaction. Either create a new request environment, or clone the
existing one and populate it with the values saved for the operation. Also
make a copy of the original request environment for later restoration, if
control is meant to pass through the application normally post-commit.
Feed the constructed environment into a call to app.call(env). Repeat for
each operation.
If the original request environment was preserved, restore it and make one
final call to app.call(env), returning from the invocation of `call' in the
middleware the status, headers, and response from this final call to
app.call(env).
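The steps above can be sketched as a small Rack middleware. This is a simplified illustration, not the code from the sample application: the URL scheme (`/transactions/:id/...` and `/transactions/:id/commit`), the in-memory operation store, and the decision to record only the method, query string, and path are all assumptions made for this example, and the final pass-through call after a commit is replaced by a plain summary response.

```ruby
# Record-and-replay middleware sketch. A Rack middleware is any object that
# responds to call(env) and returns [status, headers, body].
class TransactMiddleware
  TRANSACTION_PATH = %r{\A/transactions/(\d+)(/.*)\z}
  RECORDED_KEYS = %w[REQUEST_METHOD QUERY_STRING].freeze

  def initialize(app)
    @app = app
    # Stand-in for the database table of saved operations.
    @operations = Hash.new { |hash, id| hash[id] = [] }
  end

  def call(env)
    match = TRANSACTION_PATH.match(env["PATH_INFO"].to_s)
    return @app.call(env) unless match # ordinary request: pass straight through

    id, nested_path = match[1], match[2]
    return commit(id, env) if nested_path == "/commit"

    # Not a commit: record the interesting parts of env for later.
    @operations[id] << env.slice(*RECORDED_KEYS).merge("PATH_INFO" => nested_path)
    [202, { "content-type" => "text/plain" }, ["queued"]]
  end

  private

  # Replays each saved operation through the rest of the stack.
  def commit(id, env)
    saved = @operations.delete(id) || []
    saved.each { |operation| @app.call(env.merge(operation)) }
    [200, { "content-type" => "text/plain" }, ["replayed #{saved.size} operations"]]
  end
end

app = ->(env) { [200, {}, ["#{env['REQUEST_METHOD']} #{env['PATH_INFO']}"]] }
stack = TransactMiddleware.new(app)
stack.call({ "REQUEST_METHOD" => "DELETE", "PATH_INFO" => "/transactions/7/tasks/3", "QUERY_STRING" => "" })
status, _headers, body = stack.call({ "REQUEST_METHOD" => "POST", "PATH_INFO" => "/transactions/7/commit", "QUERY_STRING" => "" })
puts status, body
```

Request bodies (`rack.input`) are not recorded here; a fuller version would need to save and restore them as well.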
A Sample Application
I've implemented an example implementation of the methodology I describe here, which I've made available on GitHub. It also contains an in-depth example describing how the implementation might look from an API perspective. Be warned: it's quite rough, totally undocumented (with the exception of the README), and quite possibly in violation of Rails good coding practices. It can be obtained here:
http://github.com/mcwehner/transact-example
A Plugin/Gem
I'm also beginning work on a plugin or gem that will provide this sort of interface to any Rails application. It's in its formative stages (in fact it's completely devoid of code at the moment), and work on it will likely proceed slowly. Explore it as it develops here:
http://github.com/mcwehner/transact
See also
Railscasts - Rack Middleware
Rails Guides - Rails on Rack

Resources