Is it possible to string / queue Ruby actions? - ruby-on-rails

I've written a number of actions in a RoR app that each perform a different step of a process.
E.g.
- One action communicates with a third party service using their API and collects data.
- Another processes this data and places it into a relevant database.
- Another takes this new data and formats it in a specific way.
etc..
I would like to fire off the process at timed intervals, e.g. each hour. But I don't want to do the whole thing each time.
Sometimes I may just want to do the first two actions. At other times, I might want to do each part of the process.
So one action should run and, when it has finished, call the next action, and so on.
The actions could take up to an hour to complete, if not longer, so I need a solution that won't timeout.
What would be the best way to achieve this?

You have quite a few options for processing jobs in the background:
Sidekiq: http://mperham.github.io/sidekiq/
Queue Classic: https://github.com/ryandotsmith/queue_classic
Delayed Job: https://github.com/collectiveidea/delayed_job
Resque: https://github.com/resque/resque
Just read through and pick the one that seems to fit your criteria the best.
EDIT
As you clarified, you want regularly scheduled tasks. Clockwork is a great gem for that (and generally a better option than cron):
https://github.com/tomykaira/clockwork
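The "run one action, then the next, but not always all of them" part can be sketched in plain Ruby as below. This is only an illustration under assumptions: the step names and the data are made up, and the steps run inline. With one of the libraries above, each step would more likely be its own background job that enqueues the next one when it finishes.

```ruby
# Hypothetical steps: each lambda stands in for one of the actions in the
# question and receives the accumulated context from the previous step.
STEPS = {
  fetch:  ->(ctx) { ctx.merge(raw: [3, 1, 2]) },                  # call the third-party API
  store:  ->(ctx) { ctx.merge(stored: ctx[:raw].sort) },          # save into the database
  format: ->(ctx) { ctx.merge(report: ctx[:stored].join(", ")) }  # format the new data
}.freeze

# Runs the named steps in order; each step starts only after the previous
# one has returned.
def run_pipeline(step_names, ctx = {})
  step_names.reduce(ctx) do |acc, name|
    step = STEPS.fetch(name) { raise ArgumentError, "unknown step: #{name}" }
    step.call(acc)
  end
end

partial = run_pipeline(%i[fetch store])          # only the first two actions
full    = run_pipeline(%i[fetch store format])   # the whole process
```

The scheduler (cron, Clockwork, or similar) then only has to decide which list of step names to pass on each run.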

Related

Delayed Job executes with wrong data when I have a big amount of jobs

I have read a lot and found that Delayed Job doesn't actually work on the serialized data; it reloads the record using the id it deserializes when the job runs.
This isn't the behavior I expected when I chose that gem, but I can deal with it.
The real problem is that I use DJ to fire some alerts based on some data using an after_save callback, and sometimes the alert fires too far in the future. So basically, if I save the medical result three times for different reasons and the third save finalizes it, I fire three alerts, because DJ runs three times against the finalized result.
Is there a way to enqueue a job in the same queue, for the same method, just once? I saw that handler isn't exposed and handle_asynchronously doesn't accept a parameter to identify the process.
The best solution would be to work directly on the serialized data, but executing the job only once is also acceptable.
Thank you in advance!
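The "enqueue only once" idea from the question can be sketched as follows. This is an in-memory stand-in, not the Delayed Job API: the OnceQueue class, its method names and the key format are all hypothetical. With Delayed Job the key would presumably have to live somewhere persistent (for instance an extra column on the jobs table that is checked before enqueuing), which is an assumption and not something the gem is known to provide out of the box.

```ruby
require "set"

# In-memory stand-in for a job queue that refuses duplicate pending jobs.
class OnceQueue
  def initialize
    @pending_keys = Set.new
    @jobs = []
  end

  # Enqueues the block unless a job with the same key is already pending.
  # Returns true when a job was added, false when it was skipped.
  def enqueue_once(key, &work)
    return false unless @pending_keys.add?(key)

    @jobs << [key, work]
    true
  end

  # Runs all pending jobs in order and frees their keys as they start.
  def work_off
    until @jobs.empty?
      key, work = @jobs.shift
      @pending_keys.delete(key)
      work.call
    end
  end
end
```

Saving the same record three times would then call `enqueue_once("alert:result:42") { ... }` three times but leave a single pending job.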

C# 5 .NET MVC long async task, progress report and cancel globally

I use ASP.NET MVC 5 and I have a long-running action which has to poll web services, process data and store it in a database.
For that I want to use the TPL to start the task asynchronously.
But I wonder how to do three things:
I want to report the progress of this task. For this I am thinking about SignalR.
I want to be able to leave the page where I started this task and still report its progress across the website (from a panel on the left, but this part is fine).
And I want to be able to cancel this task globally (from my panel on the left).
I know quite a bit about all of the technologies involved, but I'm not sure about the best way to achieve this.
Can someone help me find the best solution?
The fact that you want to run long running work while the user can navigate away from the page that initiates the work means that you need to run this work "in the background". It cannot be performed as part of a regular HTTP request because the user might cancel his request at any time by navigating away or closing the browser. In fact this seems to be a key scenario for you.
Background work in ASP.NET is dangerous. You can certainly pull it off but it is not easy to get right. Also, worker processes can exit for many reasons (app pool recycle, deployment, machine reboot, machine failure, stack overflow or OOM exception on an unrelated thread). So make sure your long-running work tolerates being aborted mid-way. You can reduce the likelihood that this happens but never exclude the possibility.
You can make your code safe in the face of arbitrary termination by wrapping all work in a transaction. This of course only works if you don't cause non-transacted side-effects like web-service calls that change state. It is not possible to give a general answer here because achieving safety in the presence of arbitrary termination depends highly on the concrete work to be done.
Here's a possible architecture that I have used in the past:
When a job comes in you write all necessary input data to a database table and report success to the client.
You need a way to start a worker to work on that job. You could start a task immediately for that. You also need a periodic check that looks for unstarted work in case the app exits after having added the work item but before starting a task for it. Have the Windows task scheduler call a secret URL in your app once per minute that does this.
When you start working on a job you mark that job as running so that it is not accidentally picked up a second time. Work on that job, write the results and mark it as done. All in a single transaction. When your process happens to exit mid-way the database will reset all data involved.
Write job progress to a separate table row on a separate connection and separate transaction. The browser can poll the server for progress information. You could also use SignalR but I don't have experience with that and I expect it would be hard to get it to resume progress reporting in the presence of arbitrary termination.
Cancellation would be done by setting a cancel flag in the progress information row. The app needs to poll that flag.
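The job lifecycle described in these steps (store the input, claim the job, report progress, poll a cancel flag) can be sketched as follows. The sketch is in Ruby to match the other threads on this page and keeps everything in memory; the JobStore class, the field names and the summing "work" are hypothetical, and a real version would keep these rows in database tables and wrap the work in a transaction as described above.

```ruby
# One row per job; progress and cancel_requested play the role of the
# separate progress row.
Job = Struct.new(:id, :input, :status, :progress, :cancel_requested, :result)

class JobStore
  def initialize
    @jobs = {}
    @next_id = 0
  end

  # Records the input data and returns the new job id.
  def add(input)
    id = (@next_id += 1)
    @jobs[id] = Job.new(id, input, :pending, 0, false, nil)
    id
  end

  def find(id)
    @jobs.fetch(id)
  end

  # Marks the next pending job as running so it is not picked up twice.
  def claim_next
    job = @jobs.values.find { |j| j.status == :pending }
    job.status = :running if job
    job
  end
end

# Processes the job item by item, checking the cancel flag before each item
# and updating the progress percentage after it.
def work_on(job)
  total = job.input.size
  sum = 0
  job.input.each_with_index do |item, index|
    if job.cancel_requested
      job.status = :cancelled
      return
    end
    sum += item
    job.progress = ((index + 1) * 100) / total
  end
  job.result = sum
  job.status = :done
end
```

The periodic check mentioned above would simply call `claim_next` and `work_on` until no pending job is left.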
Maybe you can make use of message queueing for job processing but I'm always wary to use it. To process a message in a transacted way you need MSDTC which is unsupported with many high-availability solutions for SQL Server.
You might think that this architecture is not very sophisticated. It makes use of polling for lots of things. Polling is a primitive technique but it works quite well. It is reliable and well-understood. It has a simple concurrency model.
If you can assume that your application never exits at inopportune times the architecture would be much simpler. But this cannot be assumed. You cannot assume that there will be no deployments during work hours and that there will be no bugs leading to crashes.
Even though running a long task on an HTTP worker is a bad idea, I have made a small example of how to manage it with SignalR:
In this example you can:
Start a task
See task progression
Cancel task
It's based on :
twitter bootstrap
knockoutjs
signalR
C# 5.0 async/await with CancellationToken and IProgress
You can find the source of this example here :
https://github.com/dragouf/SignalR.Progress

Use Sidekiq to keep a cache of results full

Say I have a particularly expensive calculation to perform during a specific user request. The plus side is that this calculation can be performed ahead of time, and pushed in a general queue for people to pull from.
Is there a way to use Sidekiq in a Ruby/Rails backend to keep this cache of results full to a certain level? Where would I store the results of this calculation?
e.g.
On server load, calculate 20 sets of results, and cache somewhere.
On user request, pop off a result to allow for immediate server response.
Regenerate one set of results in the background to fill back up to 20 in the queue.
Obviously may need to use a different number than 20 depending on how long the computation takes, and rate of user requests, but I think you get the idea.
I'm curious to know what kind of calculation actually fits this profile but that's not really important.
Since you are using Sidekiq (or would like to use Sidekiq) it means you have a Redis database. A Redis database is a great place to put this kind of info.
So you can just create a LIST in Redis of your results. During application startup, fire off 20 Sidekiq jobs to create your calculations. The worker doing the calculation can push the result onto the list in Redis.
As you handle requests, just pop a result off the list and queue another sidekiq job to make yourself a new calculation.
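A minimal sketch of that top-up logic, under assumptions: the ResultPool class and its parameter names are hypothetical, and an Array stands in for the Redis list so the example runs on its own. In a real setup the store would be a Redis list and the `enqueue` callable would schedule a Sidekiq job (for example via a worker's `perform_async`) that pushes its result onto the list when it finishes.

```ruby
# Keeps a list of precomputed results topped up to a target size.
# `store` only needs push, shift and size, so an Array works here.
class ResultPool
  def initialize(store:, target:, enqueue:)
    @store = store
    @target = target
    @enqueue = enqueue # callable that schedules one background calculation
  end

  # Called at startup: schedule enough calculations to reach the target.
  def warm_up
    (@target - @store.size).times { @enqueue.call }
  end

  # Called per request: take one result and schedule a replacement.
  # Returns nil when the pool is empty, so the caller can fall back to
  # computing the result inline.
  def take
    result = @store.shift
    @enqueue.call
    result
  end
end
```

Because real jobs finish later rather than inline, `take` can return nil under heavy load; the target size has to be chosen with the computation time and request rate in mind, as the question notes.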

Ruby/Rails synchronous job manager

hi
i'm going to set up a rails-website where, after some initial user input, some heavy calculations are done (via c-extension to ruby, will use multithreading). as these calculations are going to consume almost all cpu-time (memory too), there should never be more than one calculation running at a time. also i can't use (asynchronous) background jobs (like with delayed job) as rails has to show the results of that calculation and the site should work without javascript.
so i suppose i need a separate process where all rails instances have to queue their calculation requests and wait for the answer (maybe an error message if the queue is full), kind of a synchronous job manager.
does anyone know if there is a gem/plugin with such functionality?
(nanite seemed pretty cool to me, but seems to be only asynchronous, so the rails instances would not know when the calculation is finished. is that correct?)
another idea is to write my own using distributed ruby (drb), but why invent the wheel again if it already exists?
any help would be appreciated!
EDIT:
because of the tips of zaius i think i will be able to do this asynchronously, so i'm going to try resque.
Ruby has mutexes / semaphores.
http://www.ruby-doc.org/core/classes/Mutex.html
You can use a semaphore to make sure only one resource intensive process is happening at the same time.
http://en.wikipedia.org/wiki/Mutex
http://en.wikipedia.org/wiki/Semaphore_(programming)
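A small sketch of the mutex idea, assuming all callers are threads inside one Ruby process. The helper name and the sleep that stands in for the heavy calculation are made up for illustration. Note that a Mutex does not coordinate separate Rails processes; for that a cross-process lock (a file lock or a database row, for example) would be needed.

```ruby
# Starts `workers` threads that all try to run the guarded section and
# returns the highest number of threads that were inside it at once.
def max_concurrency(lock, workers)
  running = 0
  max_running = 0
  threads = workers.times.map do
    Thread.new do
      lock.synchronize do
        running += 1
        max_running = [max_running, running].max
        sleep 0.01 # stand-in for the heavy calculation
        running -= 1
      end
    end
  end
  threads.each(&:join)
  max_running
end

calculation_lock = Mutex.new
peak = max_concurrency(calculation_lock, 4)
```

With the lock in place `peak` never exceeds one, whatever the number of threads.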
However, the idea of blocking a front end process while other tasks finish doesn't seem right to me. If I was doing this, I would use a background worker, and then use a page (or an iframe) with the refresh meta tag to continuously check on the progress.
http://en.wikipedia.org/wiki/Meta_refresh
That way, you can use the same code for both javascript enabled and disabled clients. And your web app threads aren't blocking.
If you have a separate process, then you have a background job... so either you can have it or you can't...
What I have done is have the website write the request params to a database. Then a separate process looks for pending requests in the database - using the daemons gem. It does the work and writes the results back to the database.
The website then polls the database until the results are ready and then displays them.
Although I use javascript to make it do the polling.
If you really can't use javascript, then it seems you need to either do the work in the web request thread or make that thread wait for the background thread to finish.
To make the web request thread wait, just do a loop in it, checking the database until the reply is saved back into it. Once it's there, you can then complete the thread.
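Such a wait loop could look roughly like the sketch below. The helper name and its parameters are hypothetical; the block stands in for the database lookup and should return nil until the result row exists. A timeout is added as an assumption, since a request that waits forever would tie up the web worker.

```ruby
# Calls the block until it returns a non-nil value or `timeout` seconds
# have passed. Returns the value, or nil on timeout.
def wait_for_result(timeout:, interval: 0.5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result unless result.nil?
    return nil if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleep interval
  end
end
```

In a controller this might be used as `wait_for_result(timeout: 60) { lookup_result(request_id) }`, where `lookup_result` is whatever query reads the stored result.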
HTH, chris

Letting something happen at a certain time with Rails

Like with browser games. User constructs building, and a timer is set for a specific date/time to finish the construction and spawn the building.
I imagined having something like a daemon, but how would that work? To me it seems that spinning + polling is not the way to go. I looked at async_observer, but is that a good fit for something like this?
If you only need the event to be visible to the owning player, then the model can report its updated status on demand and we're done, move along, there's nothing to see here.
If, on the other hand, it needs to be visible to anyone from the time of its scheduled creation, then the problem is a little more interesting.
I'd say you need two things. A queue into which you can put timed events (a database table would do nicely) and a background process, either running continuously or restarted frequently, that pulls events scheduled to occur since the last execution (or those that are imminent, I suppose) and actions them.
Looking at the list of options on the Rails wiki, it appears that there is no One True Solution yet. Let's hope that one of them fits the bill.
I just did exactly this thing for a PBBG I'm working on (Big Villain, you can see the work in progress at MadGamesLab.com). Anyway, I went with a commands table where user commands each generated exactly one entry and an events table with one or more entries per command (linking back to the command). A secondary daemon run using script/runner to get it started polls the event table periodically and runs events whose time has passed.
So far it seems to work quite well. Unless I see some problem when I throw a large number of users at it, I'm not planning to change it.
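The polling step of such an events table can be sketched like this. The Event struct and the helper are hypothetical and keep the events in memory; in the setup described above they would be rows in the events table, and the daemon would call the helper on each polling pass.

```ruby
# One scheduled event; `done` marks events that have already been run.
Event = Struct.new(:name, :run_at, :done)

# Runs every event whose time has passed (oldest first), marks it done and
# returns the names of the events that ran.
def run_due_events(events, now: Time.now)
  due = events.select { |event| !event.done && event.run_at <= now }
  due = due.sort_by(&:run_at)
  due.each do |event|
    yield event
    event.done = true
  end
  due.map(&:name)
end
```

Because finished events are marked, calling the helper again with the same time does nothing, so a restarted daemon does not repeat work it has already completed.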
To a certain extent it depends on how much logic is on your front end, and how much is in your model. If you know how much time will elapse before something happens you can keep most of the logic on the front end.
I would use your model to determine the state of things, and on a particular request you can check to see if it is built or not. I don't see why you would need a background worker for this.
I would use AJAX to start a timer (see Periodical Executor) for updating your UI. On the model side, just keep track of the created_at column for your building and only allow it to be used if its construction time has elapsed. That way you don't have to take a trip to your db every few seconds to see if your building is done.
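On the model side this could be as small as the sketch below. A Struct stands in for the ActiveRecord model, and the `construction_seconds` field and method names are assumptions made for the example; only `created_at` comes from the answer.

```ruby
# Stand-in for the building model: it is "built" once its construction
# time has elapsed since creation, with no background worker involved.
Building = Struct.new(:created_at, :construction_seconds) do
  def finished_at
    created_at + construction_seconds
  end

  def built?(now = Time.now)
    now >= finished_at
  end

  # Seconds the front-end timer still has to count down (never negative).
  def seconds_remaining(now = Time.now)
    [finished_at - now, 0].max
  end
end
```

The page can render `seconds_remaining` once and let the client-side timer count down from there, checking `built?` again only when the building is actually used.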
