I've been hunting around for an answer to my issue for a while; probably the best I've come up with is another Stack Overflow question: How should I perform a long-running task in ASP.NET 4?
I'm in a similar place, in that I want to understand what my options are, but I don't feel I know enough specifically about MVC to come to a view. I'm using MVC 5, but with the 4.8 framework, and I note that technologies such as SignalR have become available since that question was asked. I was wondering if any experienced MVC'ers could give me their view?
I too have a long running process. More specifically, the user is importing a file. The file is delimited so the import happens line by line. The file might be thousands of lines long. Each line will be parsed and imported in a fraction of a second but the whole operation might take several minutes.
I don't particularly need behaviour to be asynchronous, but because of the length of the entire process I want to regularly update the user on progress. I'm wondering what options I have?
I've got a vague recollection that I might have looked at this problem 20-odd years ago (Classic ASP) and solved it with regular flushes, sending a bit more of the page to the client every few seconds. But I'm also using a _Layout page now, so the page has already been sent back; I don't think I have that option, even assuming such a mechanism still exists. A bit more recently, but still a while ago, I might have used JavaScript to poll, but everything I'm reading now seems to point me to newer technologies which I'm not sure I fully understand yet.
I'm just wondering how would you solve this problem?
I would not perform any of the file parsing on the web server, especially if the file is thousands of rows long. I would delegate this to a background service of some sort, whether that's a Lambda function in the cloud, a Windows service, or a scheduled task. You could then call your SignalR hub from the background task (whatever that might be) to update the progress of the import.
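As a rough sketch of the SignalR side (SignalR 2.x, which works on the 4.8 framework), something like the following could push progress to the browser. The hub name, method names and the jobId/lines parameters are my own illustrative choices, not a prescribed API:

    using System.Threading.Tasks;
    using Microsoft.AspNet.SignalR;

    // Hub the browser connects to; each import job gets its own group so
    // progress messages only reach the user who started that import.
    public class ImportProgressHub : Hub
    {
        public Task JoinJob(string jobId)
        {
            return Groups.Add(Context.ConnectionId, jobId);
        }
    }

    // Helper the import code calls as it works through the file. This form only
    // works inside the web app's process; a separate worker process would use
    // the SignalR .NET client (or an HTTP callback into the web app) instead.
    public static class ImportProgressNotifier
    {
        public static void Report(string jobId, int linesDone, int totalLines)
        {
            var hub = GlobalHost.ConnectionManager.GetHubContext<ImportProgressHub>();
            hub.Clients.Group(jobId).updateProgress(linesDone, totalLines);
        }
    }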
I have developed a Delphi program which logs into a website, collects "today's number" and displays it on a screen. Having got the data, the only place I could access it was in the final 'download completed' event. This doesn't seem right, as there seems to be no way to get away from the string of 'completed' events. The program never returns from the WB.Navigate call. WB.Stop and WB.Quit seem to refer to browser activity. I want a WB.TERMINATE or something! It's OK for now, but I want to put the web access in a loop, and with the current technique it just burrows deeper into the stack. I assume it's running in a separate thread, but I can find no info about this. Should I put the browser component on a separate form created at run time? Would freeing the form tidy things up? Any advice would be appreciated.
EDIT: Thanks for early responses. I must have had a senior moment. In fact the WB.Navigate call does return. The IDE tells me that it launched a thread with the web navigation request. So it's down to me to do some internal synchronizing to check the data arrival. Yes, the data is on the web page not a download. There is no API. I could have done basic searching of the web pages but it's 3 levels deep. The component made it easier to select the wanted data. All good now. Sorry for wasting your time.
I use ASP.NET MVC 5 and I have a long-running action which has to poll web services, process data and store it in a database.
For that I want to use the TPL to start the task asynchronously.
But I wonder how to do three things:
I want to report the progress of this task. For this I'm thinking about SignalR.
I want to be able to leave the page where I start this task and still report its progress across the website (from a panel on the left, but that part is OK).
And I want to be able to cancel this task globally (from my panel on the left).
I know a fair amount about all of the technologies involved, but I'm not sure about the best way to achieve this.
Can someone help me with the best solution?
The fact that you want to run long running work while the user can navigate away from the page that initiates the work means that you need to run this work "in the background". It cannot be performed as part of a regular HTTP request because the user might cancel his request at any time by navigating away or closing the browser. In fact this seems to be a key scenario for you.
Background work in ASP.NET is dangerous. You can certainly pull it off but it is not easy to get right. Also, worker processes can exit for many reasons (app pool recycle, deployment, machine reboot, machine failure, a stack overflow or out-of-memory exception on an unrelated thread). So make sure your long-running work tolerates being aborted mid-way. You can reduce the likelihood that this happens but never exclude the possibility.
You can make your code safe in the face of arbitrary termination by wrapping all work in a transaction. This of course only works if you don't cause non-transacted side-effects like web-service calls that change state. It is not possible to give a general answer here because achieving safety in the presence of arbitrary termination depends highly on the concrete work to be done.
Here's a possible architecture that I have used in the past:
When a job comes in you write all necessary input data to a database table and report success to the client.
You need a way to start a worker to work on that job. You could start a task immediately for that. You also need a periodic check that looks for unstarted work in case the app exits after having added the work item but before starting a task for it. Have the Windows task scheduler call a secret URL in your app once per minute that does this.
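As a sketch, that endpoint could be as simple as the following; the route, the shared key and the JobRepository/JobRunner helpers are hypothetical names used purely for illustration:

    using System.Configuration;
    using System.Threading.Tasks;
    using System.Web.Mvc;

    public class JobsController : Controller
    {
        // Hit by the Windows task scheduler once per minute.
        [HttpPost]
        public ActionResult Sweep(string key)
        {
            // The "secret" is just a shared key in config so outsiders can't trigger it.
            if (key != ConfigurationManager.AppSettings["JobSweepKey"])
                return new HttpUnauthorizedResult();

            // Start a worker task for anything that was queued but never picked up.
            foreach (var jobId in JobRepository.FindUnstartedJobIds())
                Task.Factory.StartNew(() => JobRunner.Run(jobId));

            return new EmptyResult();
        }
    }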
When you start working on a job you mark that job as running so that it is not accidentally picked up a second time. Work on the job, write the results and mark it as done, all in a single transaction. If your process happens to exit mid-way, the database rolls the transaction back and the job becomes available to be picked up again.
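A rough sketch of that single-transaction pattern (the Jobs table, its columns and the locking hints are assumptions for illustration; the real shape depends on your schema):

    using System.Data.SqlClient;

    public static class JobWorker
    {
        public static void ProcessOneJob(string connectionString)
        {
            using (var conn = new SqlConnection(connectionString))
            {
                conn.Open();
                using (var tx = conn.BeginTransaction())
                {
                    // Claim one pending job. The row lock held by the open transaction
                    // stops a second worker from grabbing the same job; READPAST lets
                    // other workers skip it and take the next pending row instead.
                    object id = new SqlCommand(
                        @"SELECT TOP (1) Id FROM Jobs WITH (UPDLOCK, READPAST)
                          WHERE Status = 'Pending'", conn, tx).ExecuteScalar();
                    if (id == null) return;   // nothing to do

                    var running = new SqlCommand(
                        "UPDATE Jobs SET Status = 'Running' WHERE Id = @id", conn, tx);
                    running.Parameters.AddWithValue("@id", id);
                    running.ExecuteNonQuery();

                    // ... do the actual work and write the results, all on this
                    // same connection and transaction ...

                    var done = new SqlCommand(
                        "UPDATE Jobs SET Status = 'Done' WHERE Id = @id", conn, tx);
                    done.Parameters.AddWithValue("@id", id);
                    done.ExecuteNonQuery();

                    // If the process dies anywhere before this line, everything above
                    // rolls back and the job is 'Pending' again for the next sweep.
                    tx.Commit();
                }
            }
        }
    }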
Write job progress to a separate table row on a separate connection and separate transaction. The browser can poll the server for progress information. You could also use SignalR but I don't have experience with that and I expect it would be hard to get it to resume progress reporting in the presence of arbitrary termination.
Cancellation would be done by setting a cancel flag in the progress information row. The app needs to poll that flag.
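The progress and cancel flags might be handled along these lines, on their own connection so they are visible outside the job's transaction (the JobProgress table and its columns are assumed names):

    using System.Data.SqlClient;

    public static class JobProgressStore
    {
        // Written by the worker every so often; read by whatever the browser polls.
        public static void Report(string connectionString, int jobId, int percent)
        {
            using (var conn = new SqlConnection(connectionString))   // separate connection,
            {                                                        // separate transaction
                conn.Open();
                var cmd = new SqlCommand(
                    "UPDATE JobProgress SET Percent = @p WHERE JobId = @id", conn);
                cmd.Parameters.AddWithValue("@p", percent);
                cmd.Parameters.AddWithValue("@id", jobId);
                cmd.ExecuteNonQuery();
            }
        }

        // Polled by the worker between units of work; set by the cancel action.
        public static bool IsCancelRequested(string connectionString, int jobId)
        {
            using (var conn = new SqlConnection(connectionString))
            {
                conn.Open();
                var cmd = new SqlCommand(
                    "SELECT CancelRequested FROM JobProgress WHERE JobId = @id", conn);
                cmd.Parameters.AddWithValue("@id", jobId);
                return (bool)cmd.ExecuteScalar();
            }
        }
    }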
Maybe you can make use of message queueing for job processing but I'm always wary to use it. To process a message in a transacted way you need MSDTC which is unsupported with many high-availability solutions for SQL Server.
You might think that this architecture is not very sophisticated. It makes use of polling for lots of things. Polling is a primitive technique but it works quite well. It is reliable and well-understood. It has a simple concurrency model.
If you can assume that your application never exits at inopportune times the architecture would be much simpler. But this cannot be assumed. You cannot assume that there will be no deployments during work hours and that there will be no bugs leading to crashes.
Even though using an HTTP worker thread to run a long task is a bad idea, I have made a small example of how to manage it with SignalR.
Inside this example you can:
Start a task
See task progression
Cancel task
It's based on:
Twitter Bootstrap
Knockout.js
SignalR
C# 5.0 async/await with CancellationToken and IProgress
You can find the source of this example here :
https://github.com/dragouf/SignalR.Progress
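Not code from that repository, but an illustrative sketch of the core pattern it demonstrates: a hub method that runs a cancellable task and pushes progress back to the caller (names and client-side method names are my own assumptions):

    using System;
    using System.Collections.Concurrent;
    using System.Threading;
    using System.Threading.Tasks;
    using Microsoft.AspNet.SignalR;

    public class TaskHub : Hub
    {
        private static readonly ConcurrentDictionary<string, CancellationTokenSource> Jobs =
            new ConcurrentDictionary<string, CancellationTokenSource>();

        public async Task Start(string jobId)
        {
            var cts = new CancellationTokenSource();
            Jobs[jobId] = cts;

            // Each progress report is pushed back to the client that started the task.
            var progress = new Progress<int>(p => Clients.Caller.reportProgress(jobId, p));

            try
            {
                await RunLongJobAsync(progress, cts.Token);
                Clients.Caller.completed(jobId);
            }
            catch (OperationCanceledException)
            {
                Clients.Caller.cancelled(jobId);
            }
        }

        // Called from the cancel button in the panel on the left.
        public void Cancel(string jobId)
        {
            CancellationTokenSource cts;
            if (Jobs.TryRemove(jobId, out cts)) cts.Cancel();
        }

        private static async Task RunLongJobAsync(IProgress<int> progress, CancellationToken token)
        {
            for (var percent = 0; percent <= 100; percent += 10)
            {
                token.ThrowIfCancellationRequested();
                await Task.Delay(500, token);   // stand-in for the real work
                progress.Report(percent);
            }
        }
    }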
I'm running a Rails 3.0 application on Heroku and using the New Relic add-on/service.
I have been looking at the transaction traces feature (available in the pro version) to understand a little more about the performance characteristics of the application. However, a significant portion of time (30-50%) is "uninstrumented time". After a few stabs at putting method_tracers in some places and going through the reasonably slow cycle of testing whether I get more info, I feel this is going nowhere fast.
It seems the PHP New Relic agent has a great feature for getting very detailed traces without needing to guess where to put method tracers: http://newrelic.com/docs/php/php-agent-faq#top100
Is there anything similar to this for ruby?
Note: I'm already using rpm_contrib to get some more info and have garbage collection stats enabled. Also, this is not about fixing a performance problem, just understanding how to better use the performance tools available and scratch a niggling itch about that uninstrumented time.
There isn't currently anything similar for Ruby. I'll mention it to the Ruby engineer when I get a chance. My guess is unless a lot of requests come in for it, it won't be at the top of the list for a while, though. In the meantime, you can use the method tracers to figure out the uninstrumented time.
Hope that helps.
Method tracers can work well, but if you have a lot of code in your controller, try a binary search using trace_execution_scoped, which records the time spent in a block of code:
http://newrelic.github.com/rpm/NewRelic/Agent/MethodTracer/InstanceMethods/TraceExecutionScoped.html#method-i-trace_execution_scoped
Add a couple of calls to this, give each metric a sensible name like "Custom/MySlowControllerAction/block0" (the first argument to trace_execution_scoped), and repeat.
The metrics you name will show up not just in Transaction Traces, but also in the Performance Breakdown for the controller action under the Web Transactions tab, so you'll see average time in that block of code across all requests, not just the slow ones.
PS: I was doing some random searching and came across detrusion.com.
What is this web application firewall?
How does it work?
Is there any performance hit? If yes, how much?
Should I use detrusion.com, or is something better available?
Anybody??
I quickly glanced at the code and it doesn't appear to be doing all that much. Basically it maintains a whitelist and a blacklist of IPs. While it can't be that much of a crazy performance hit, you'd probably be better off doing this kind of request analysis in a Rack middleware, that is, before it even gets to the Rails request handling.
That being said, I don't like the fact that it will re-sync every 5 minutes DURING the processing of a given request. That is, it will block the current request while it re-syncs its ruleset and lists. Which means you're at the mercy of the Detrusion.com team to keep their site/API up: when they go down, you go down.
While it's not as real-time, I'd feel more comfortable having the updating process be out of band. Maybe you store the rules/lists in a flat file or a local DB (Redis would be perfect) which you load on app start. Then you have a frequent cron job which reloads the ruleset from Detrusion and writes it locally.
Something like that. Just anything to de-couple your request handling from a Detrusion API check.
I have an ASP.NET MVC 2 Beta application where I need to block incoming requests for a specific action until I have some data available to return or just release the request after 30 seconds with no new data available.
In order to accomplish this, I'm using AutoResetEvent.WaitOne(30000);
The big issue is that IIS does not seem to accept any new requests while the thread is blocked at the WaitOne instruction. New requests hang until the thread is released.
I need to be able to parallelize the requests while still keeping the WaitOne behavior.
Async handlers are what you're looking for. If you're building a Comet solution, you may want to check out our .NET implementation of a Comet server here; it'll save you some time. If you want to roll your own, you'll definitely need to use the async handlers to avoid hitting upper concurrency limits by the time you get past 60 or 70 users, but even with the async handlers you'll still have to do some fancy footwork. Basically, you're still going to hit some upper limits in the thread pool unless you hand off the requests to a bounded thread pool that can manage all the incoming requests for you.
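For reference, the MVC 2 flavour of this is the AsyncController pattern. Here's a minimal sketch, where UpdateSource.WaitForData is a hypothetical stand-in for however your data actually arrives (callback fired when data shows up, or with null after 30 seconds):

    using System;
    using System.Web.Mvc;

    public class UpdatesController : AsyncController
    {
        public void IndexAsync()
        {
            AsyncManager.OutstandingOperations.Increment();

            // Hypothetical subscription; the worker thread is released while we wait.
            UpdateSource.WaitForData(TimeSpan.FromSeconds(30), data =>
            {
                AsyncManager.Parameters["data"] = data;          // null on timeout
                AsyncManager.OutstandingOperations.Decrement();  // completes the request
            });
        }

        public ActionResult IndexCompleted(string data)
        {
            // Empty body on timeout; the client simply polls again.
            return Content(data ?? string.Empty);
        }
    }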
Good luck!
You should not be blocking incoming requests at all. If the data you need are not ready, then return an empty response, or perhaps return an error code.
For a web application, it is more advisable (not a hard rule) to return a message to tell the users to retry again later due to whatever reason you want to call it.
Stalling/blocking the requests by 'waiting' doesn't really help much, as the wait is non-deterministic, unless of course you have a mechanism to make it so.
I do not know the nature/context/traffic pattern of your website. 30 seconds can be a number that works for you. Perhaps my points above are not really relevant, just my 2 cents.
Actually, it turns out that this behavior only happens with ASP.NET MVC 2 Beta. I had this working fine with MVC 2 Preview 2 and rolled back to this version to re-test and confirmed that the application worked fine with that version.
Now, the question is: Why am I seeing this different behavior between these two MVC release versions, and what is the correct behavior I should expect to get in this scenario?