Rails - asynchronous tasks, forked processes, best practices

I'm using an Observer on my classes. When one of the records is created/updated I need to notify another service (via a URL call). What is the best way to do this without slowing down my class? Would using a gem like delayed_job be overkill?
In my Observer's after_update() / after_create() I just want to launch a thread that calls the URL...

If the notification doesn't need to block the request, you could simply spawn a thread and perform the notification there. That way, your program will continue to run while that thread waits on the response.
Of course, you'll want some way to handle failure. You could have it try three times, and if it still fails, write a notification to the log or something.
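A minimal sketch of that idea, assuming a notify_service helper that makes the URL call (the helper name and the three-attempt limit are placeholders):
def after_update(record)
  Thread.new do
    attempts = 0
    begin
      attempts += 1
      notify_service(record) # e.g. a Net::HTTP call to the other service
    rescue StandardError => e
      retry if attempts < 3
      Rails.logger.error("Notification failed for record #{record.id}: #{e.message}")
    end
  end
end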
The best (most reliable) solution would be to use a job queue. That way, if a job fails outright, you can inspect and resubmit the job again.

Definitely use a community-supported/accepted gem like delayed_job or Resque. It's really not as hard as you think, and your app will scale better down the line.

Related

Scheduling stuff in the future

I've been doing some research but I can't figure out how to do this in Rails.
I need to execute some code after a certain amount of time. I found some gems that can handle this, but I don't want to use any.
I basically have a create action in a Rails controller, and some stuff needs to happen there after 24 hours.
EDIT: I tried sleep, but it needs to be async: sleep stops everything from running until it's done, even inside an if statement.
You need to create an Active Job with rails generate job your_job_name.
Then you can delay its execution:
YourJob.set(wait_until: 1.day.from_now).perform_later(record)
For this to work you do need a job queue backend configured for your project. I personally recommend Sidekiq.
Sadly, you will have a hard time avoiding setting up a gem for this.
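A minimal sketch of what that looks like, assuming a job generated as follow_up (the job name and body are placeholders):
# app/jobs/follow_up_job.rb (created by: rails generate job follow_up)
class FollowUpJob < ApplicationJob
  queue_as :default

  def perform(record)
    # the stuff that needs to happen 24 hours after create
  end
end

# In the controller's create action:
FollowUpJob.set(wait_until: 24.hours.from_now).perform_later(@record)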

Rails Controller - Return and then perform complex action (without worker)

I want to create an API endpoint in my app that takes data and performs a complex, time-consuming operation with it, but only after returning that the data has been received.
As in, I'd love to be able to do something like this:
def action
  render json: { data_received: params[:whatever].present? }
  perform_calculations(params[:whatever])
end
Unfortunately, as I understand it, Ruby/Rails is synchronous, and requires that a controller action end in a render/redirect/head statement of some sort.
Ordinarily, I'd think of accomplishing this with a background worker, like so:
def action
  DelayedJobActionPerformer.perform_later(params[:whatever])
  render json: { data_received: params[:whatever].present? }
end
But a worker costs (on Heroku) a fair amount of monthly money for a beginning app like this, and I'm looking for alternatives. Is there any alternative to background workers you can think of to return from the action and then perform the behavior?
I'm thinking of maybe creating a separate Node app or something that can start an action and then respond, but that's feeling ridiculous. I guess the architecture in my mind would involve a main Rails app which performs most of the behavior, and a lightweight Node app that acts as the API endpoint, which can receive a request, respond that it's been received, and then send the data on to be processed by that first Rails app, or another. But it feels excessive, and also like just kicking the problem down the road.
At any rate, whether or not I end up having to buy a worker or few, I'd love to know if this sort of thing is feasible, and whether using an external API as a quasi-worker makes sense (particularly given the general movement towards breaking up application concerns).
Not really...
Well you can spawn a new thread:
thread = Thread.new { perform_calculations(params[:whatever]) }
And not call thread.join, but that is highly unreliable, because that thread will be killed if the main thread terminates.
I don't know how cron jobs work on Heroku, but another option is to have a table of pending jobs where you save params[:whatever], and a rake task, triggered periodically by cron, that checks for and performs any pending tasks. This solution is a (really) basic worker implementation.
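A rough sketch of that cron-driven approach, assuming a hypothetical PendingJob model with payload and done columns, and treating perform_calculations as a placeholder:
# lib/tasks/pending_jobs.rake
namespace :jobs do
  desc "Run any pending jobs"
  task run_pending: :environment do
    PendingJob.where(done: false).find_each do |job|
      perform_calculations(job.payload)
      job.update!(done: true)
    end
  end
end
# crontab entry, e.g.: */5 * * * * cd /app && bin/rails jobs:run_pending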
You could also give sucker_punch a try. It runs jobs inside the single web process, but the downside is that if the web process is restarted while there are jobs that haven't yet been processed, they will be lost. So it's not recommended for critical background tasks.
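For reference, a sucker_punch job looks roughly like this (CalculationJob is a made-up name; the job runs on a thread pool inside the web process):
class CalculationJob
  include SuckerPunch::Job

  def perform(data)
    perform_calculations(data)
  end
end

# In the controller, after rendering:
CalculationJob.perform_async(params[:whatever])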

Using GCD for offline persistent queue

Right now I have some older code I wrote years ago that allows an iOS app to queue up jobs (sending messages or submitting data to a back-end server, etc...) when the user is offline. When the user comes back online the tasks are run. If the app goes into the background or is terminated the queue is serialized and then loaded back when the app is launched again. I've subclassed NSOperationQueue and my jobs are subclasses of NSOperation. This gives me the flexibility of having a data structure provided for me that I can subclass directly (the operation queue) and by subclassing NSOperation I can easily requeue if my task fails (server is down, etc...).
I will very likely leave this as it is, because if it's not broken, don't fix it, right? Also, these are very lightweight operations, and I don't expect the current app I'm working on to have very many tasks queued at any given time. However, I know there is some extra overhead with using NSOperation rather than using GCD directly.
I don't believe I could subclass a dispatch queue the way I can an NSOperationQueue, so there would be extra code overhead for me to maintain my own data structure and load it into and out of a dispatch queue each time the app is sent to the background, right? Also, I'm not sure how or if I'd handle requeueing the job if it fails. Right now, if I get an HTTP 500 response from the server, for example, in my operation code I send a notification with a deep copy of the failed NSOperation object. My custom operation queue picks this notification up and adds the task to itself. I'm not sure whether I'd be able to do something similar with GCD. I would also need an easy way to cancel all operations or suspend the queue when network connectivity is lost, then reactivate it when network access is regained.
Just hoping to get some thoughts, opinions and ideas from others who might have done something similar or are more familiar with GCD than I am.
Also worth noting I know there's some new background task support coming in iOS 7 but it will likely be a while before that will be my deployment target. I am also not sure yet if it would exactly do what I need, so at the moment just looking at the possibility of GCD.
Thanks.
If NSOperation vs submitting blocks to GCD ever shows up as measurable overhead, the problem isn't that you're using NSOperation, it's that your operations are far too granular. I would expect this overhead to be effectively unmeasurable in any real-world situation. (Sure, you could contrive a test harness to measure the overhead, but only by making operations that did effectively nothing.)
Use the highest level of abstraction that gets the job done. Move down only when hard data tells you that you should.

How can I check from my code if there's something enqueued in Sidekiq?

When certain conditions are met, I'd like to schedule a worker to run a particular job in 5 minutes. The thing is, if the same conditions are met again, I want to check if there's something scheduled to run. If there is such a worker scheduled to run, then, I don't want to enqueue again, but if there isn't, it should be queued. I hope you guys understood what I'm trying to do. Can it be achieved? If yes, how?
Sounds like you want to use or implement a simple persisted lock. The code that enqueues the job can first check for the availability of the lock, acquire and enqueue if available, skip if not. The enqueued job can be responsible for releasing the lock. You'll want to account for failure, like adding a lock timeout. The redis-mutex gem may be a useful implementation of this idea.
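A bare-bones sketch of that lock-then-enqueue pattern, here using Sidekiq's own Redis connection instead of the gem (the key name, the 10-minute timeout, and MyWorker are all placeholders):
LOCK_KEY = "my_worker:scheduled".freeze

def schedule_unless_pending
  # SET with nx: only acquires the lock if it's free; ex: is the safety timeout
  if Sidekiq.redis { |conn| conn.set(LOCK_KEY, "1", nx: true, ex: 10 * 60) }
    MyWorker.perform_in(5.minutes)
  end
end

class MyWorker
  include Sidekiq::Worker

  def perform
    # ... the actual work ...
  ensure
    # release the lock so the next trigger can schedule again
    Sidekiq.redis { |conn| conn.del(LOCK_KEY) }
  end
end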
Best practices promote jobs that are idempotent. This means that you should be writing them in such a way that it should be safe to run them more than once. Any subsequent call doesn't change the result of the first call. You achieve this by writing logic that does the proper checks, and acts accordingly. Since you don't provide a description of what your worker does, I can't be more specific.
For an example, see Sidekiq's FAQ: Make your workers idempotent and transactional
The benefit of this approach is that you're playing along the convenient abstraction of scheduled workers, instead of fighting against it.
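For illustration, an idempotent perform might look like this (the Order model, its notified flag, and NotificationService are invented for the example):
def perform(order_id)
  order = Order.find(order_id)
  return if order.notified? # a repeat run becomes a no-op
  NotificationService.call(order)
  order.update!(notified: true)
end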

Ruby/Rails synchronous job manager

Hi,
I'm going to set up a Rails site where, after some initial user input, some heavy calculations are done (via a C extension to Ruby, which will use multithreading). As these calculations are going to consume almost all CPU time (and memory too), there should never be more than one calculation running at a time. Also, I can't use (asynchronous) background jobs (like with delayed_job), as Rails has to show the results of that calculation, and the site should work without JavaScript.
So I suppose I need a separate process where all Rails instances have to queue their calculation requests and wait for the answer (maybe an error message if the queue is full), a kind of synchronous job manager.
Does anyone know if there is a gem/plugin with such functionality?
(Nanite seemed pretty cool to me, but it seems to be asynchronous only, so the Rails instances would not know when the calculation is finished. Is that correct?)
Another idea is to write my own using Distributed Ruby (DRb), but why reinvent the wheel if it already exists?
Any help would be appreciated!
EDIT:
Thanks to zaius's tips I think I will be able to do this asynchronously, so I'm going to try Resque.
Ruby has mutexes / semaphores.
http://www.ruby-doc.org/core/classes/Mutex.html
You can use a semaphore to make sure only one resource intensive process is happening at the same time.
http://en.wikipedia.org/wiki/Mutex
http://en.wikipedia.org/wiki/Semaphore_(programming)
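In Ruby that can be as simple as a shared Mutex. A sketch, with the caveat that it only coordinates threads within a single process, not across multiple Rails instances (heavy_calculation stands in for the C-extension call):
CALC_LOCK = Mutex.new

def calculate(input)
  # try_lock returns false instead of waiting if a calculation is already running
  raise "A calculation is already running" unless CALC_LOCK.try_lock
  begin
    heavy_calculation(input)
  ensure
    CALC_LOCK.unlock
  end
end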
However, the idea of blocking a front-end process while other tasks finish doesn't seem right to me. If I were doing this, I would use a background worker, and then use a page (or an iframe) with the refresh meta tag to continuously check on the progress.
http://en.wikipedia.org/wiki/Meta_refresh
That way, you can use the same code for both javascript enabled and disabled clients. And your web app threads aren't blocking.
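A sketch of such a status view in ERB (@result and the 5-second interval are placeholders):
<% if @result %>
  <p>Result: <%= @result %></p>
<% else %>
  <meta http-equiv="refresh" content="5">
  <p>Still calculating; this page will refresh automatically...</p>
<% end %>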
If you have a separate process, then you have a background job... so either you can have it or you can't...
What I have done is have the website write the request params to a database. Then a separate process looks for pending requests in the database - using the daemons gem. It does the work and writes the results back to the database.
The website then polls the database until the results are ready and then displays them.
Although I use JavaScript to make it do the polling.
If you really can't use JavaScript, then it seems you need to either do the work in the web request thread or make that thread wait for the background process to finish.
To make the web request thread wait, just loop in it, checking the database until the reply is saved back into it. Once it's there, you can complete the request.
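A crude sketch of that blocking-wait variant (the CalculationRequest model, its columns, and the 60-second timeout are made up for illustration):
# Controller: record the request, then wait for the separate process to answer.
request = CalculationRequest.create!(input: params[:whatever])
60.times do
  break if request.reload.result.present?
  sleep 1 # this web thread blocks while the daemon does the work
end
render plain: (request.result || "Timed out, please try again")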
HTH, chris
