I am curious as to how to proceed with this issue; I currently have a DataSnap server setup with a TDSAuthenticationManager class managing the authentication.
If an authentication fails, is it safe for me to write directly onto a form TMemo or something similar for logging purposes? What's the best way to observe this?
Do I need threading?
Cheers for reading,
Adrian
Yes, you need synchronization, since DataSnap events run in the context of different threads and, as you may know, UI programming is limited to the main thread.
So, if you want to display something in the UI, you have to take care of how to do it.
On the other hand, if you want to log to a file, you don't need to synchronize with the main thread, but you still have to be careful, since two different threads may try to log at the same time.
The options I would evaluate are:
Protect the access to the log file using a Critical Section, thus avoiding the multi-thread access with a lock. Only one thread can access the file at a time and all other interested threads have to wait.
Create a new logging class with a global instance that accepts log requests by simply adding each message to a thread-safe in-memory queue, and that runs its own thread to write the messages to a file whenever the queue is not empty.
Since servers tend to run as services in production environments, I would choose the latter.
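To illustrate the second option, here is a minimal sketch of a queue-backed logger. It is written in Python rather than Delphi purely for illustration, and the class name QueuedFileLogger and its methods are my own invention, not part of DataSnap. Request threads only enqueue; a single writer thread owns the file.

```python
import queue
import threading


class QueuedFileLogger:
    """Collects messages from any thread; one writer thread appends them to a file."""

    _STOP = object()  # sentinel that tells the writer thread to finish

    def __init__(self, path):
        self._path = path
        self._messages = queue.Queue()  # thread-safe FIFO
        self._writer = threading.Thread(target=self._drain, daemon=True)
        self._writer.start()

    def log(self, message):
        # Called from request threads: only enqueues, never touches the file.
        self._messages.put(message)

    def close(self):
        # Ask the writer to stop once everything queued so far is written.
        self._messages.put(self._STOP)
        self._writer.join()

    def _drain(self):
        with open(self._path, "a", encoding="utf-8") as log_file:
            while True:
                item = self._messages.get()  # blocks until a message arrives
                if item is self._STOP:
                    break
                log_file.write(item + "\n")
                log_file.flush()
```

Because only the writer thread touches the file, no lock around the file is needed; the queue is the only shared structure.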
I asked this question about 5 years ago about how to "offload" expensive operations that the user doesn't need to wait for (such as auditing, etc.) so they get a response on the front end quicker.
I now have a related but different question. In my asp.net-mvc app, I have built some reporting pages where you can generate Excel reports (I am using EPPlus) and PowerPoint reports (I am using Aspose.Slides). Here is an example controller action:
public ActionResult GenerateExcelReport(FilterParams args)
{
    byte[] results = GenerateLargeExcelReportThatTake30Seconds(args);
    return File(results, @"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml", "MyReport.xlsx");
}
The functionality is working great, but I am trying to figure out whether these expensive operations (some reports can take up to 30 seconds to return) are impacting other users. In the previous question, I had an expensive operation that the user DIDN'T have to wait for, but in this case he does have to wait, as it's a synchronous activity (click Generate Report, and the expectation is that the user gets a report when it's finished).
In this case, I don't care that the main user has to wait 30 seconds, but I just want to make sure I am not negatively impacting other users because of this expensive operation, generating files, etc.
Is there any best practice here in asp.net-mvc for this use case?
You can try a combination of Hangfire and SignalR. Use Hangfire to kick off a background job and release the HTTP request. Once report generation is complete, use SignalR to send a push notification.
SignalR notification from server to client
An alternative is to implement a polling mechanism on the client side.
Send an AJAX call to enqueue a Hangfire job that generates the report.
Then start polling an API with another AJAX call that returns the job status and, as soon as the report is ready, retrieve it. I prefer to use SignalR rather than polling.
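The enqueue-then-poll flow can be sketched in a few lines. This is an in-process Python stand-in, not Hangfire: the names enqueue_report and job_status are hypothetical, and a real setup would keep job state in persistent storage rather than a dictionary.

```python
import threading
import uuid

_jobs = {}                      # job id -> {"status": ..., "result": ...}
_jobs_lock = threading.Lock()   # guards _jobs across threads


def enqueue_report(build_report, *args):
    """Start build_report in the background and return a job id immediately."""
    job_id = uuid.uuid4().hex
    with _jobs_lock:
        _jobs[job_id] = {"status": "running", "result": None}

    def run():
        try:
            update = {"status": "done", "result": build_report(*args)}
        except Exception as exc:  # surface the failure to the poller
            update = {"status": "failed", "result": str(exc)}
        with _jobs_lock:
            _jobs[job_id] = update

    threading.Thread(target=run, daemon=True).start()
    return job_id


def job_status(job_id):
    """What the polling endpoint would return for one job."""
    with _jobs_lock:
        job = _jobs.get(job_id)
        return dict(job) if job else {"status": "unknown", "result": None}
```

The first AJAX call would map to enqueue_report and return the id; the polling call would map to job_status until the status is no longer "running".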
If the report processing is hurting performance on the web server, offload that processing to another server. You can use messaging (ActiveMQ, RabbitMQ or some other framework of your choice) or a REST API call to kick off report generation on another server, then use messaging or a REST API call again to notify the web server that report generation is complete, and finally use SignalR to notify the client. This will let the web server be more responsive.
UPDATE
Regarding your question
Is there any best practice here in asp.net-mvc for this use case
You have to monitor your application over time, on both the client side and the server side. There are a few tools you can rely on, such as New Relic and AppDynamics. I have used New Relic, and it has features to track issues both in the client browser and on the server side. The products are called "NewRelic Browser" and "NewRelic Server". I am sure there are other tools that capture similar info.
Analyze the metrics over time and take appropriate action if you see any anomalies. If you observe server-side CPU and memory spikes, try capturing client-side metrics for the same timeframe. If you notice timeouts or connection errors on the client side, your users are unable to connect to your app while the server is doing some heavy lifting. Next, try to identify server-side bottlenecks. If there is not enough room to performance-tune the code, go through a server capacity planning exercise and figure out how to scale your hardware further, or move the background jobs off the web servers to reduce load. Capturing metrics with these tools may not be enough on its own; you may have to instrument your application (with logging) to capture the additional metrics needed to monitor application health properly.
Here you can find some information about capacity planning for .net application from Microsoft.
-Vinod.
These are all great ideas on how to move work out of the request/response cycle. But I think @leora simply wants to know whether a long-running request will adversely impact other users of an ASP.NET application.
The answer is no. ASP.NET is multi-threaded. Each request is handled by a separate worker thread.
In general it is considered good practice to run long-running tasks in the background and notify the user when the job is done. Web request execution time is also limited, by default to 110 seconds (the httpRuntime executionTimeout setting; 90 seconds in .NET 1.x), so if your long-running task could exceed that, you have to either raise the limit or run the task in some other thread/process. If you are using .NET 4.5.2, you can use HostingEnvironment.QueueBackgroundWorkItem to run long-running tasks in the background and use SignalR to notify the user when the task has finished. If you are generating a file, you can store it on the server under some unique ID and send the user a link for downloading it. You can delete this file later (with a Windows service, for example).
As mentioned by others, there are more advanced background task runners such as Hangfire, Quartz.Net and others, but the general concept is the same: run the task in the background and notify the user when it is done. Here is a nice article about different options for running background tasks.
You need to use C#'s async and await.
From your question I gather that you are concerned about the request taking more resources than it should, rather than about scalability. If that's the case, make your controller actions async, along with all the operations they call, wherever those involve calls that block threads. For example, if your requests go over the network or perform I/O, then without async they block the thread while waiting for the response. With async, the thread is released while the response is awaited, so it can serve other users' requests.
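The effect is easier to see in a small sketch. This one uses Python's asyncio rather than C#, and asyncio.sleep stands in for a network or disk wait; the function names are hypothetical. It only helps where the time is spent waiting, since CPU-bound work keeps a thread busy either way.

```python
import asyncio
import time


async def slow_io_request(name, seconds):
    # asyncio.sleep stands in for a network or disk wait; while it is
    # pending, the event loop is free to run other coroutines.
    await asyncio.sleep(seconds)
    return name


async def serve_two_requests():
    """Run two waiting 'requests' concurrently and time the pair."""
    started = time.perf_counter()
    results = await asyncio.gather(
        slow_io_request("report", 0.3),
        slow_io_request("other user", 0.3),
    )
    return results, time.perf_counter() - started
```

Running asyncio.run(serve_two_requests()) should take roughly as long as one wait, not the sum of both, because neither request holds the thread while it waits.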
I assumed you are not wondering how to scale the requests. If you are, let me know, and I can provide details on that as well (too much to write unless it's needed).
I believe a tool/library such as Hangfire is what you're looking for. First, it allows you to specify a task to run on a background thread (in the same application/process). Combining it with techniques such as SignalR allows for real-time front-end notification.
However, something I set up after using Hangfire for nearly a year was splitting our job processing (and implementation) out to another server using this documentation. I use an internal ASP.NET MVC application to process jobs on a different server. The only performance bottleneck, then, is if both servers use the same data store (e.g. a database). If you're locking the database, the only way around it is to minimize the locking of said resource, regardless of the methodology you use.
I use interfaces to trigger jobs, stored in a common library:
public interface IMyJob
{
    MyJobResult Execute( MyJobSettings settings );
}
And, the trigger, found in the front-end application:
//tell the job to run
var settings = new MyJobSettings();
_backgroundJobClient.Enqueue<IMyJob>( c => c.Execute( settings ) );
Then, on my background server, I write the implementation (and hook it into the Autofac IoC container I'm using):
public class MyJob : IMyJob
{
    // Implements IMyJob.Execute, the method the front end enqueues.
    public MyJobResult Execute( MyJobSettings settings )
    {
        //do stuff here
        return new MyJobResult(); // placeholder result
    }
}
I haven't messed too much with trying to get SignalR to work across the two servers, as I haven't run into that specific use case yet, but it's theoretically possible, I imagine.
You need to monitor your application to know whether other users are being affected, e.g. by recording response times.
If you find that this is affecting other users, you need to run the task in another process, potentially on another machine. You can use the library Hangfire to achieve this.
Using that answer, you can declare a Task with low priority
lowering priority of Task.Factory.StartNew thread
// Requires: using System.Threading; using System.Threading.Tasks;
public ActionResult GenerateExcelReport(FilterParams args)
{
    byte[] result = null;
    // PriorityScheduler is the custom TaskScheduler from the linked answer, not a framework type.
    Task.Factory.StartNew(() =>
    {
        result = GenerateLargeExcelReportThatTake30Seconds(args);
    }, CancellationToken.None, TaskCreationOptions.None, PriorityScheduler.BelowNormal)
        .Wait();
    return File(result, @"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml", "MyReport.xlsx");
}
Queue the jobs in a table, and have a background process poll that table to decide which Very Large Job needs to run next. Your web client would then need to poll the server to determine when the job is complete (potentially by checking a flag in the database, but there are other methods.) This guarantees that you won't have more than one (or however many you decide is appropriate) of these expensive processes running at a time.
Hangfire and SignalR can help you here, but a queueing mechanism is really necessary to avoid major disruption when, say, five users request this same process at the same time. The approaches mentioned that fire off new threads or background processes don't appear to provide any mechanism for minimizing processor / memory consumption to avoid disrupting other users due to consuming too many resources.
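As a rough illustration of capping how many of these jobs run at once, here is a minimal in-process sketch in Python. It uses a fixed-size worker pool instead of a database table, and MAX_CONCURRENT_REPORTS and submit_report are hypothetical names; the table-based version adds persistence but follows the same idea.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical limit: at most this many expensive reports run at once.
MAX_CONCURRENT_REPORTS = 2

# Extra submissions wait in the pool's internal queue until a slot frees up.
_report_pool = ThreadPoolExecutor(max_workers=MAX_CONCURRENT_REPORTS)


def submit_report(build_report, *args):
    """Queue a report; it starts once one of the worker slots is free."""
    return _report_pool.submit(build_report, *args)
```

If five users request the report at the same moment, two run and three wait their turn, so the load on the server stays bounded.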
In my app I am using a background job system (Sidekiq) to manage heavy jobs that should not block the UI.
I would like to pass data from the background job to the main thread when the job is finished, e.g. the status of the job or the data it produced.
At the moment I use Redis as middleware between the main thread and the background jobs. It stores the data, status, etc. of the background jobs so the main thread can read what is happening behind the scenes.
My question is: is this a good practice for managing data between the scheduled job and the main thread (using Redis or a key-value cache)? Are there other approaches? Which is best and why?
Redis pub/sub is the thing you are looking for.
You just subscribe the main thread to a channel using the SUBSCRIBE command, and the worker announces the job status on that channel using the PUBLISH command.
As you already have Redis inside your environment, you don't need anything else to start.
Here are two other options that I have used in the past:
Unix sockets. This was extremely fiddly, creating and closing connections was a nuisance, but it does work. Also dealing with cleaning up sockets and interacting with the file system is a bit involved. Would not recommend.
Standard RDBMS. This is very easy to implement, and made sense for my use case, since the heavy job was associated with a specific model, so the status of the process could be stored in columns on that table. It also means that you only have one store to worry about in terms of consistency.
I have used memcached as well, which does the same thing as Redis; here's a discussion comparing their features if you're interested. I found this to work well.
If Redis is working for you then I would stick with it. As far as I can see it is a reasonable solution to this problem. The only things that might cause issues are generating unique keys (probably not that hard), and also making sure that unused cache entries are cleaned up.
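For the RDBMS option mentioned above, a minimal sketch of storing job status in columns on the model's own table might look like this. It uses Python with an in-memory SQLite database for illustration; the table name reports, the columns job_status and job_result, and the three functions are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reports ("
    " id INTEGER PRIMARY KEY,"
    " job_status TEXT NOT NULL DEFAULT 'queued',"
    " job_result TEXT)"
)


def create_report(conn):
    """Called by the web process when it schedules the job."""
    cur = conn.execute("INSERT INTO reports DEFAULT VALUES")
    conn.commit()
    return cur.lastrowid


def finish_report(conn, report_id, result):
    """Called by the background job when the work is done."""
    conn.execute(
        "UPDATE reports SET job_status = 'done', job_result = ? WHERE id = ?",
        (result, report_id),
    )
    conn.commit()


def report_state(conn, report_id):
    """Called by the web process to check on the job."""
    return conn.execute(
        "SELECT job_status, job_result FROM reports WHERE id = ?", (report_id,)
    ).fetchone()
```

Since the status lives on the same row as the model, there are no separate cache keys to generate or clean up.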
I have an ASP.NET MVC 4 app hosted as an Azure web role. I want to do something that seems like it should be pretty standard: I want to create a function that I can call that initiates a VIP swap and raises an event (or calls a callback) when the VIP swap operation is done.
Just to add some context to the situation: My website implements a workflow that takes about an hour (or less) to complete. If I want to release a new version of the website code, it's convenient (i.e. much less "backward compatibility" code to write) to first let all of the current users complete the workflow so that the new code doesn't need to deal with data created by the previous version of the code. So a management function in my website would first poke a value into the database that disables new workflows; it would then wait until all current workflows are done; it would then call the "VIP Swap" routine; finally, when the VIP Swap routine signals its completion, it would poke the database value to re-enable new workflows.
I found the Microsoft documentation for how to programmatically initiate a VIP swap here:
http://msdn.microsoft.com/en-us/library/ee460814.aspx
The procedure involves POSTing to a magic URL and including some headers in the POST, then periodically performing a GET to a magic URL and checking the response code.
The more I think about this, the more non-trivial it seems. In addition to the basic complexities of wiring up a background timer and completion notification, I don't know what complexities, if any, I might run into trying to do this stuff in the IIS environment. Can I even perform HTTP operations on a background thread? For that matter, will I run into complications just trying to use any of the half dozen or so different "do things in the background" mechanisms baked into .NET?
Any help or guidance will be greatly appreciated. In particular, I'd be ecstatic if someone could point me at a ready-to-go implementation of this function!
I don't think you will find an easy solution to this, as the fabric controller is set up to do some very fancy things without your involvement. Running hour-long workflows in a cloud computing environment, where an instance can be pulled out from underneath you (with a maximum of 5 minutes to clean up after the OnStopping event is called), requires that you do other work anyway to make sure that all of your tasks complete.
The simple question is "What do you do if an instance goes down when workflows are still running?" Do you restart them or are they lost? If they get lost, then you don't care anyway, so killing workflows for an upgrade is equally unimportant. If you restart them, then use that same mechanism to decide whether or not a node is due to be shut down, and distribute the jobs accordingly. This pattern is eerily similar to the Hadoop JobTracker. Don't just run the workflows on any ol' instance. Submit them to a (job tracker) service that decides what to do. The (job tracker) service can then use the service management API to scale up as many instances as you need running the version that you want, run workflows on the appropriate node, and shut them down when they are no longer needed or are outdated.
Unfortunately this may not be the simple solution that you are looking for, but something in your architecture needs to change, rather than trying to force PaaS to fit with your current approach. Decompose your workloads, create loosely coupled services, design for failure, and a few other cloud/distributed computing practices need to be considered. There is a reason why Hadoop is built the way that it is — and it has a reputation for being able to get work done on a bunch of somewhat unreliable commodity hardware.
I'm working with a PHP frontend which connects to a distributed back end, using Amazon SQS and a variety of message types and message consumers. I'm trying to come up with a way to safely debug those consumers, as we don't want message handlers with new, untested code consuming end-user messages, risking the messages being lost or incorrectly processed.
The actual message queue names are hardcoded as PHP constants in a class, so my first tactic was to create two different sets of queues, one for production and another for debugging, and to externalise the queue name constants into two different files. Depending on whether our debug condition is true or not, I wanted to include one or the other of those constant definitions and assign the constants in the included file to the class constants which currently have the names hardcoded.
This doesn't seem to work, though, because constants seem to act like class variables in PHP, whereas I am trying to assign the values like instance variables. The next tactic was to see if there was anything on Amazon's side that would allow us to debug our message consumers transparently without adding lots of hacks to our code, but I couldn't see anything there that facilitated this. I'd love to know if anyone else has experienced (and, ideally, solved) this problem.
SQS doesn't provide a way to inspect the contents of messages in the queue, or for the sender to see if any consumers are failing to process messages.
A common approach to this problem would be to set up two sets of queues as you suggest and have the producer post the same message onto both queues. That way you can debug your code against a stream of production messages without affecting the actual production queue.
I'd recommend moving the decision of which queue to use out of your code and into config, and then deploying different config files to your development boxes vs your production boxes. The risk is always that a development box ends up talking to production systems, so having a single consistent approach to configuring those endpoints across all your code is much less risky than doing it on an ad-hoc basis each time you call out to a service.
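One way to keep queue names out of the code is to resolve them from per-machine configuration. Below is a minimal sketch in Python rather than PHP; the queue names, the QUEUE_ prefix and load_queue_names are all made up for illustration. Defaults point at development queues, so a box with no configuration cannot reach production by accident.

```python
import os

# Hypothetical queue names; the real ones would live in per-environment config.
DEFAULT_QUEUES = {"orders": "dev-orders", "emails": "dev-emails"}


def load_queue_names(environ=os.environ):
    """Resolve each logical queue to the name configured for this machine.

    A variable such as QUEUE_ORDERS overrides the development default, so
    production boxes are pointed at production queues by deployment, not code.
    """
    return {
        logical: environ.get("QUEUE_" + logical.upper(), fallback)
        for logical, fallback in DEFAULT_QUEUES.items()
    }
```

The consumers then look up queues by logical name only, and the choice between debug and production queues is made once, at deployment.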
I'd also recommend putting your production and development queues in different AWS accounts with different access credentials. That way you can give your production account permission to publish to the development account's queue, but you can guarantee that your development systems can't read from the production queue.
I'm working on a Rails application that periodically needs to perform large numbers of IO-bound operations. These operations can be performed asynchronously. For example, once per day, for each user, the system needs to query Salesforce.com to fetch the user's current list of accounts (companies) that he's tracking. This results in huge numbers (potentially > 100k) of small queries.
Our current approach is to use ActiveMQ with ActiveMessaging. Each of our users is pushed onto a queue as a different message. Then, the consumer pulls the user off the queue, queries Salesforce.com, and processes the results. But this approach gives us horrible performance. Within a single poller process, we can only process a single user at a time. So, the Salesforce.com queries become serialized. Unless we run literally hundreds of poller processes, we can't come anywhere close to saturating the server running poller.
We're looking at EventMachine as an alternative. It has the advantage of allowing us to kickoff large numbers of Salesforce.com queries concurrently within a single EventMachine process. So, we get great parallelism and utilization of our server.
But there are two problems with EventMachine. 1) We lose the reliable message delivery we had with ActiveMQ/ActiveMessaging. 2) We can't easily restart our EventMachine processes periodically to lessen the impact of memory growth. For example, with ActiveMessaging, we have a cron job that restarts the poller once per day, and this can be done without worrying about losing any messages. But with EventMachine, if we restart the process, we could literally lose hundreds of messages that were in progress. The only way I can see around this is to build a persistence/reliable delivery layer on top of EventMachine.
Does anyone have a better approach? What's the best way to reliably execute large numbers of asynchronous IO-bound operations?
I maintain ActiveMessaging, and have also been thinking about the issues of a multi-threaded poller, though perhaps not at the same scale you guys are. I'll give you my thoughts here, but am also happy to discuss further on the ActiveMessaging list, or via email if you like.
One trick is that the poller is not the only serialized part of this. STOMP subscriptions, if you do client -> ack in order to prevent losing messages on interrupt, will only get sent a new message on a given connection when the prior message has been ack'd. Basically, you can only have one message being worked on at a time per connection.
So to keep using a broker, the trick will be to have many broker connections/subscriptions open at once. The current poller is pretty heavy for this, as it loads up a whole rails env per poller, and one poller is one connection. But there is nothing magical about the current poller, I could imagine writing a poller as an event machine client that is implemented to create new connections to the broker and get many messages at once.
In my own experiments lately, I have been thinking about using Ruby Enterprise Edition and having a master thread that forks many poller worker threads so as to get the benefit of the reduced memory footprint (much like passenger does), but I think the EM trick could work as well.
I am also an admirer of the Resque project, though I do not know that it would be any better at scaling to many workers - I think the workers might be lighter weight.
http://github.com/defunkt/resque
I've used AMQP with RabbitMQ in a way that would work for you. Since ActiveMQ implements AMQP, I imagine you can use it in a similar way. I have not used ActiveMessaging, which although it seems like an awesome package, I suspect may not be appropriate for this use case.
Here's how you could do it, using AMQP:
Have Rails process send a message saying "get info for user i".
The consumer pulls this off the message queue, making sure to specify that the message requires an 'ack' to be permanently removed from the queue. This means that if the message is not acknowledged as processed, it is returned to the queue for another worker eventually.
The worker then spins off the message into the thousands of small requests to SalesForce.
When all of these requests have successfully returned, another callback should be fired to ack the original message and return a "summary message" that has all the info germane to the original request. The key is using a message queue that lets you acknowledge successful processing of a given message, and making sure to do so only when relevant processing is complete.
Another worker pulls that message off the queue and performs whatever synchronous work is appropriate. Since all the latency-inducing bits have already been performed, I imagine this should be fine.
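The steps above hinge on acknowledging the original message only after every small request has succeeded. Here is a minimal sketch of that rule in Python, using a toy in-memory queue in place of a real AMQP broker; AckQueue, process_one and the message shape are all hypothetical.

```python
from collections import deque


class AckQueue:
    """Toy stand-in for a broker queue with explicit acknowledgements."""

    def __init__(self, messages):
        self._ready = deque(messages)
        self._unacked = {}
        self._next_tag = 0

    def get(self):
        # Deliver a message but keep it until it is acked or requeued.
        message = self._ready.popleft()
        self._next_tag += 1
        self._unacked[self._next_tag] = message
        return self._next_tag, message

    def ack(self, tag):
        del self._unacked[tag]  # permanently removed

    def requeue(self, tag):
        self._ready.append(self._unacked.pop(tag))  # another worker retries

    def pending(self):
        return len(self._ready) + len(self._unacked)


def process_one(work_queue, fetch_account, summaries):
    """Handle one 'get info for user' message; ack only if every fetch succeeds."""
    tag, user = work_queue.get()
    try:
        results = [fetch_account(account_id) for account_id in user["accounts"]]
    except Exception:
        work_queue.requeue(tag)  # nothing is lost if a request fails
        return False
    summaries.append({"user": user["id"], "accounts": results})
    work_queue.ack(tag)
    return True
```

A worker that dies or fails midway never acks, so the message goes back on the queue; that is what makes restarting workers safe.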
If you're using (C)Ruby, try to never combine synchronous and asynchronous stuff in a single process. A process should either do everything via Eventmachine, with no code blocking, or only talk to an Eventmachine process via a message queue.
Also, writing asynchronous code is incredibly useful, but also difficult to write, difficult to test, and bug-prone. Be careful. Investigate using another language or tool if appropriate.
Also check out "cramp" and "beanstalk".
Someone sent me the following link: http://github.com/mperham/evented/tree/master/qanat/. This is a system that's somewhat similar to ActiveMessaging except that it is built on top of EventMachine. It's almost exactly what we need. The only problem is that it seems to only work with Amazon's queue, not ActiveMQ.