hi
i'm going to set up a rails-website where, after some initial user input, some heavy calculations are done (via c-extension to ruby, will use multithreading). as these calculations are going to consume almost all cpu-time (memory too), there should never be more than one calculation running at a time. also i can't use (asynchronous) background jobs (like with delayed job) as rails has to show the results of that calculation and the site should work without javascript.
so i suppose i need a separate process where all rails instances have to queue their calculation requests and wait for the answer (maybe an error message if the queue is full), kind of a synchronous job manager.
does anyone know if there is a gem/plugin with such functionality?
(nanite seemed pretty cool to me, but seems to be only asynchronous, so the rails instances would not know when the calculation is finished. is that correct?)
another idea is to write my own using distributed ruby (drb), but why reinvent the wheel if it already exists?
any help would be appreciated!
EDIT:
because of the tips of zaius i think i will be able to do this asynchronously, so i'm going to try resque.
Ruby has mutexes / semaphores.
http://www.ruby-doc.org/core/classes/Mutex.html
You can use a semaphore to make sure only one resource intensive process is happening at the same time.
http://en.wikipedia.org/wiki/Mutex
http://en.wikipedia.org/wiki/Semaphore_(programming)
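A minimal sketch of the mutex idea, assuming the calculation runs inside a single Ruby process. `heavy_calculation` is a hypothetical stand-in for the C-extension call. Keep in mind that a `Mutex` only serializes threads within one process; with several Rails processes you would need a cross-process lock (for example `File#flock` on a shared lock file) or a single worker process.

```ruby
# One process-wide lock guarding the expensive calculation.
CALCULATION_LOCK = Mutex.new

# Hypothetical stand-in for the real C-extension call.
def heavy_calculation(input)
  sleep 0.05
  input * 2
end

# Blocking variant: callers queue up on the lock and run one at a time.
def run_exclusively(input)
  CALCULATION_LOCK.synchronize { heavy_calculation(input) }
end

# Non-blocking variant: returns nil instead of waiting when the lock is busy,
# which lets the caller show a "try again later" message.
def run_or_reject(input)
  return nil unless CALCULATION_LOCK.try_lock
  begin
    heavy_calculation(input)
  ensure
    CALCULATION_LOCK.unlock
  end
end
```

`run_exclusively` makes callers wait their turn, while `run_or_reject` corresponds to the "error message if the queue is full" behaviour from the question.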
However, the idea of blocking a front end process while other tasks finish doesn't seem right to me. If I was doing this, I would use a background worker, and then use a page (or an iframe) with the refresh meta tag to continuously check on the progress.
http://en.wikipedia.org/wiki/Meta_refresh
That way, you can use the same code for both javascript enabled and disabled clients. And your web app threads aren't blocking.
If you have a separate process, then you have a background job... so either you can have it or you can't...
What I have done is have the website write the request params to a database. Then a separate process looks for pending requests in the database - using the daemons gem. It does the work and writes the results back to the database.
The website then polls the database until the results are ready and then displays them.
Although I use javascript to make it do the polling.
If you really can't use javascript, then it seems you need to either do the work in the web request thread or make that thread wait for the background thread to finish.
To make the web request thread wait, just do a loop in it, checking the database until the reply is saved back into it. Once it's there, you can then complete the request.
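The waiting loop could look roughly like this. It is only a sketch: the block passed in stands for whatever database lookup you use (something like `Job.find(id).result`, which is a hypothetical model), and the timeout is my own addition so that a request cannot hang forever.

```ruby
# Polls the given block until it returns a non-nil value or the timeout passes.
# Returns the value, or nil when the deadline is reached first.
def wait_for_result(timeout: 30, interval: 0.5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result unless result.nil?
    return nil if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end
```

In a controller you would call it as `wait_for_result { lookup_result(job_id) }` (with `lookup_result` being your own query) and render an error page when it returns nil.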
HTH, chris
Related
I have a Rails 3 application that lets a user perform a search against a 3rd party database via an API. It could potentially bring back quite a bit of XML data. Also, the API could be busy serving requests for other users and have a nontrivial delay in its response time.
I only run 2 webservers so I can't afford to have a delayed request obviously. I use Sidekiq to process long-running jobs, but in all the cases I've needed that I haven't had to return a value to the screen.
I also use Pusher to communicate back to the user when a background job is finished. I am checking it out, but I don't know if it can be used for the kind of data I want to push to the screen. Right now it just pops up dialog boxes with messages that I send it.
I have thought of some pretty kooky stuff, like running the request via Sidekiq, sending the results to a session object or file, then using Pusher to kick off some kind of event to grab the data and populate the screen with it. Seems kind of Rube Goldberg-ish.
I appreciate any help or insight anyone can offer into the problem!
I had a similar situation not long ago, and the way I fixed it was by using memcache and threads.
I've also thought about using Sidekiq, but Sidekiq is ideal if you don't expect to use the data right away, so memcache and threads worked pretty well and gave us a good amount of control.
Instead of calling the API directly, I would assign the API request to a thread, and once done this thread would write to memcache. In my case this can happen incrementally, with the same API being able to return more data from the same endpoint until it is complete.
From the UI I would have a basic ajax polling mechanism that would hit a controller and check memcache for data and for the status to see if it was complete or not. This would signal to the UI that it needs to keep polling for more data.
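Roughly, the moving parts look like this. It is a sketch only: `StatusCache` is an in-memory stand-in for memcache (a real app would use a memcache client instead), and `start_fetch` with its `pages` argument is a hypothetical placeholder for the paged API calls.

```ruby
# In-memory stand-in for memcache, guarded by a mutex for thread safety.
class StatusCache
  def initialize
    @lock = Mutex.new
    @data = {}
  end

  # Appends one chunk of results and records whether the fetch has finished.
  def append(key, chunk, complete: false)
    @lock.synchronize do
      entry = (@data[key] ||= { chunks: [], complete: false })
      entry[:chunks] << chunk
      entry[:complete] = complete
    end
  end

  # Returns a snapshot; this is what the polling controller action would render.
  def read(key)
    @lock.synchronize do
      entry = @data[key]
      return { chunks: [], complete: false } unless entry
      { chunks: entry[:chunks].dup, complete: entry[:complete] }
    end
  end
end

# Starts the (hypothetical) paged API fetch on a thread and returns immediately.
def start_fetch(cache, key, pages)
  Thread.new do
    pages.each_with_index do |page, i|
      cache.append(key, page, complete: i == pages.size - 1)
    end
  end
end
```

The UI keeps polling `read` until `complete` is true and renders whatever chunks have arrived so far.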
I use ASP.NET MVC 5 and I have a long-running action which has to poll web services, process data and store it in a database.
For that I want to use the TPL to start the task asynchronously.
But I wonder how to do 3 things:
I want to report progress of this task. For this I think about SignalR
I want to be able to leave the page where I started this task and still report its progress across the website (from a panel on the left, but this part is ok)
And I want to be able to cancel this task globally (from my panel on the left)
I know quite a bit about all of the technologies involved. But I'm not sure about the best way to achieve this.
Can someone help me find the best solution?
The fact that you want to run long running work while the user can navigate away from the page that initiates the work means that you need to run this work "in the background". It cannot be performed as part of a regular HTTP request because the user might cancel his request at any time by navigating away or closing the browser. In fact this seems to be a key scenario for you.
Background work in ASP.NET is dangerous. You can certainly pull it off but it is not easy to get right. Also, worker processes can exit for many reasons (app pool recycle, deployment, machine reboot, machine failure, stack overflow or out-of-memory exception on an unrelated thread). So make sure your long-running work tolerates being aborted mid-way. You can reduce the likelihood that this happens but never exclude the possibility.
You can make your code safe in the face of arbitrary termination by wrapping all work in a transaction. This of course only works if you don't cause non-transacted side-effects like web-service calls that change state. It is not possible to give a general answer here because achieving safety in the presence of arbitrary termination depends highly on the concrete work to be done.
Here's a possible architecture that I have used in the past:
When a job comes in you write all necessary input data to a database table and report success to the client.
You need a way to start a worker to work on that job. You could start a task immediately for that. You also need a periodic check that looks for unstarted work in case the app exits after having added the work item but before starting a task for it. Have the Windows task scheduler call a secret URL in your app once per minute that does this.
When you start working on a job you mark that job as running so that it is not accidentally picked up a second time. Work on that job, write the results and mark it as done. All in a single transaction. When your process happens to exit mid-way the database will reset all data involved.
Write job progress to a separate table row on a separate connection and separate transaction. The browser can poll the server for progress information. You could also use SignalR but I don't have experience with that and I expect it would be hard to get it to resume progress reporting in the presence of arbitrary termination.
Cancellation would be done by setting a cancel flag in the progress information row. The app needs to poll that flag.
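The flow above is not tied to ASP.NET, so here is a rough sketch of it in Ruby. Everything in it is hypothetical: `JobTable` is an in-memory stand-in for the jobs and progress tables, and the "work" is just summing numbers. It does not model the transaction rollback on a crash; in a real system the database provides that.

```ruby
# In-memory stand-in for the jobs and progress tables (all names hypothetical).
class JobTable
  Job = Struct.new(:id, :input, :state, :result, :progress, :cancel_requested)

  def initialize
    @lock = Mutex.new
    @jobs = {}
    @next_id = 0
  end

  # Persist the input and hand back an id so the client gets an answer at once.
  def enqueue(input)
    @lock.synchronize do
      id = (@next_id += 1)
      @jobs[id] = Job.new(id, input, :pending, nil, 0, false)
      id
    end
  end

  # Atomically claim one pending job so it is never picked up a second time.
  def claim_next
    @lock.synchronize do
      job = @jobs.values.find { |j| j.state == :pending }
      job.state = :running if job
      job
    end
  end

  def update(id)
    @lock.synchronize { yield @jobs.fetch(id) }
  end

  def request_cancel(id)
    update(id) { |j| j.cancel_requested = true }
  end

  # Snapshot for progress polling.
  def fetch(id)
    @lock.synchronize { @jobs.fetch(id).dup }
  end
end

# One worker pass; the periodic check would call this to pick up unstarted work.
def work_one(table)
  job = table.claim_next
  return nil unless job
  total = 0
  job.input.each_with_index do |n, i|
    # Poll the cancel flag between units of work.
    if table.fetch(job.id).cancel_requested
      table.update(job.id) { |j| j.state = :cancelled }
      return job.id
    end
    total += n
    table.update(job.id) { |j| j.progress = ((i + 1) * 100) / job.input.size }
  end
  table.update(job.id) { |j| j.state = :done; j.result = total }
  job.id
end
```

The browser would poll `fetch(id)` for `progress` and `state`, and the cancel button would call `request_cancel(id)`.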
Maybe you can make use of message queueing for job processing, but I'm always wary of using it. To process a message in a transacted way you need MSDTC, which is unsupported with many high-availability solutions for SQL Server.
You might think that this architecture is not very sophisticated. It makes use of polling for lots of things. Polling is a primitive technique but it works quite well. It is reliable and well-understood. It has a simple concurrency model.
If you can assume that your application never exits at inopportune times the architecture would be much simpler. But this cannot be assumed. You cannot assume that there will be no deployments during work hours and that there will be no bugs leading to crashes.
Even though running a long task in an HTTP worker is a bad idea, I have made a small example of how to manage it with SignalR:
In this example you can:
Start a task
See task progression
Cancel task
It's based on :
twitter bootstrap
knockoutjs
signalR
C# 5.0 async/await with CancellationToken and IProgress
You can find the source of this example here:
https://github.com/dragouf/SignalR.Progress
I'm working on an application, but at the moment I'm stuck on multithreading with rails.
I have the following situation: when some action occurs (it could be after a user clicks a button or when a scheduled task fires off), I start a separate thread which parses some websites until the moment when I have to receive the SMS-code to continue parsing. At this moment I call Thread.stop.
The SMS-code comes as a POST request to some of my controllers. So I want to pass it to my stopped thread and continue its job.
But how can I access that thread?
Where is the best place to keep a link to that thread?
So how can I handle multithreading? There may be a situation when there'll be a lot of threads and a lot of SMS requests, and I need to somehow correlate them.
For all real purposes you can't, but you can have that other thread 'report' its status.
You can use redis-objects to create a lock object that uses redis as its flag, some type of counter, or just a true/false value store. You can then query redis to see the corresponding state of the other thread, and exit if needed.
https://github.com/nateware/redis-objects
The cool part about this is it not only works between threads, but between applications.
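For the correlation part of the question, here is a sketch of a different technique from the redis flag: an in-process registry of queues, using only Ruby's standard `Queue`. It assumes the controller that receives the SMS POST runs in the same process as the parser thread; with several processes you would need a shared store such as Redis instead. `CodeInbox` and the correlation id (for example a phone number) are hypothetical names.

```ruby
# Maps a correlation id to the queue a parked parser thread is blocked on.
class CodeInbox
  def initialize
    @lock = Mutex.new
    @queues = {}
  end

  # Called by the parser thread instead of Thread.stop; blocks until a code arrives.
  def wait_for(id)
    queue_for(id).pop
  ensure
    @lock.synchronize { @queues.delete(id) }
  end

  # Called from the controller action that receives the SMS POST.
  def deliver(id, code)
    queue_for(id).push(code)
  end

  private

  # Creates the queue on first use, so delivery may happen before or after wait_for.
  def queue_for(id)
    @lock.synchronize { @queues[id] ||= Queue.new }
  end
end
```

Each parser thread calls `wait_for` with its own id, so any number of threads and SMS requests can be matched up without keeping references to the threads themselves.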
I have a particularly long-running method that I need to execute from my controller. The method is in its own Model. I am using an async controller, and I have the method set up using the asyncFunc library to make it asynchronous. I have also tried invoking it in its own process. The problem is I want the controller to go ahead and return a view so the user can continue doing other things, as the method will notify the user by e-mail when it has completed or hit any errors.
The problem is that even though it is an asynchronous method, the controller will not move forward to return the view until the process is done, which takes 15+ minutes. And if you navigate to a different page, the method stops executing.
So how can I get the method to execute as a worker and free up the controller?
Any Help would be greatly appreciated.
all the best,
Chase Q, Aucoin
Use ThreadPool.QueueUserWorkItem() as a fire-and-forget approach in the ASPX page.
Do the long-running work in the WaitCallback you pass to QUWI.
When the work is complete, that WaitCallback can send an email, or whatever it wants.
You need to take care to handle the case that the w3wp.exe is stopped during the 15 minute run. What will you do if the work is 2/3 complete? Some options are, making the work restartable, or just allowing the interrupted work to be forgotten.
Making it restartable might mean that, when w3wp.exe restarts, your ASP.NET logic makes sure to begin again any work that was interrupted. It might mean that your ASP.NET logic sets "syncpoints" so that it knows where to restart.
If you want the restartable option, you might think about Workflow, which is specifically designed for this purpose - maintaining state of long-running workflows, restarting automatically, and so on. If you use Workflow, you can set it to run asynchronously, and you may decide you do not need QueueUserWorkItem.
see also:
Moving a time taking process away from my asp.net application
the Workflow Foundation tag
This will help > http://msdn.microsoft.com/en-us/library/ms227433.aspx
It is the standard way of running a background process on the server in the .NET stack.
I don't know why, but I am still convinced that this should not be done. Executing background threads in ASP.NET smells. You will also steal threads from the ASP.NET thread pool, which is controlled by IIS. IIS can decide that something is wrong with your worker process and restart it at any time just to keep memory consumption, processing time or thread consumption low. If you need background logic, create a custom NT service and call the process on that service via either old .NET Remoting or WCF.
Btw., the approach I described is used frequently in commercial applications, and those that don't use it often self-host the whole web server.
I'm using an Observer on my classes. When one of the records is created/updated I need to notify another service (via a URL call). What is the best way to do this to avoid slowing down my class? Would using a gem like delayed_job be overkill?
In my Observer's after_update() / after_create() I just want to launch a thread that calls the URL...
If your notifier is non-thread blocking, you could simply spawn a thread and perform the notification there. That way, your program will continue to run while that thread is waiting on the response.
Of course, you'll want some way to handle failure. You could have it try three times, and if it still fails, write a notification to the log or something.
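Something along these lines, as a minimal sketch. `notify_async` is a hypothetical helper, and the block you pass stands in for the real URL call (for example via `Net::HTTP`); the three-attempt limit and the log message follow the suggestion above.

```ruby
# Runs the notification on its own thread, retrying up to max_attempts times.
# The thread's value is :ok on success or :failed once all attempts are used up.
def notify_async(max_attempts: 3, logger: $stderr, &notification)
  Thread.new do
    attempts = 0
    begin
      attempts += 1
      notification.call
      :ok
    rescue StandardError => e
      retry if attempts < max_attempts
      logger.puts("notification failed after #{attempts} attempts: #{e.message}")
      :failed
    end
  end
end
```

From the observer you would call `notify_async { call_the_url(record) }` (with `call_the_url` being your own method) and carry on without joining the thread.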
The best (most reliable) solution would be to use a job queue. That way, if a job fails outright, you can inspect and resubmit the job again.
Definitely use a community supported/accepted gem like delayed_job or resque. It's really not as hard as you think and your app will scale better down the line.