Is it possible to prevent concurrent modifications of a session when running multiple load-balanced instances of an application?
Context: Multiple Tomcats, all running the same application. The application uses Spring Session to store the sessions in a Redis cluster. A load balancer distributes incoming requests to one of the Tomcats (non-sticky). The user hits a button, and Tomcat 1 processes the request very slowly (performance problem or whatever). The user hits the button again; Tomcat 2 is much faster and replies with success. The user proceeds to the following pages. Tomcat 1 then finishes the very first request and overwrites the session, so the data of all subsequent pages is lost.
A solution would be to lock the session. Thereby tomcat 2 can detect the concurrent modification and reply with an error (much better than getting an inconsistent state).
Thx a lot
AB
Spring Session does not use any session locking mechanism, as this would have a very negative impact on performance. Note that your example focuses on a single conversation, while a lock would affect all requests that belong to a given session, many of which are perfectly safe to execute concurrently.
For the scenario in your example, another mechanism should be employed to provide protection. This could be something simple like disabling the button in the UI until the action is completed, thereby preventing the subsequent request, or using CSRF protection, which would ensure that every request that modifies server-side state carries a valid token.
Also note that most of the session repository implementations provided by Spring Session optimize write operations with the goal of reducing race conditions. This includes checking the session for modifications prior to saving and, in some cases, optimized save operations that write only the attributes that have changed. This is handled differently in each session repository due to the different nature of the underlying data stores, so check the SessionRepository#save implementation in the repository of your choice.
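To make the "write only the attributes that have changed" idea concrete, here is a minimal Python sketch. It is not Spring Session's implementation; DeltaSession and InMemorySessionStore are hypothetical names, and a plain dictionary stands in for Redis. It only illustrates why a delta save lets the slow request from the question finish last without wiping the attributes written in the meantime.

```python
class DeltaSession:
    """Tracks which attributes were changed during one request."""

    def __init__(self, session_id, attributes):
        self.session_id = session_id
        self._attributes = dict(attributes)
        self._delta = {}

    def get(self, name):
        return self._attributes.get(name)

    def set(self, name, value):
        self._attributes[name] = value
        self._delta[name] = value  # remember only what this request touched

    def changes(self):
        return dict(self._delta)


class InMemorySessionStore:
    """Stand-in for a shared store such as Redis (hypothetical, for illustration)."""

    def __init__(self):
        self._data = {}

    def load(self, session_id):
        return DeltaSession(session_id, self._data.get(session_id, {}))

    def save(self, session):
        # Merge only the changed attributes instead of replacing the whole session.
        stored = self._data.setdefault(session.session_id, {})
        stored.update(session.changes())

    def snapshot(self, session_id):
        return dict(self._data.get(session_id, {}))


store = InMemorySessionStore()

slow = store.load("s1")          # "tomcat 1" loads the session first
slow.set("step1", "submitted")

fast = store.load("s1")          # "tomcat 2" handles the later requests
fast.set("step1", "submitted")
fast.set("step2", "address entered")
store.save(fast)

store.save(slow)                 # the slow request finishes last
result = store.snapshot("s1")    # step2 is still present
```

If both requests write the same attribute, the last save still wins, so this reduces the race rather than eliminating it.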
Perhaps somewhat related, Spring Session provides integration with Spring Security's concurrent session control starting from release 1.3.0 (which is, at the time of writing this post, in release candidate phase). You can check the reference manual for details.
Related
I'm using Ruby on Rails 4.2. In a controller I have a method which takes a lot of time to complete because it makes some heavy calculations. I want to inform the user of the calculation's progress. My idea was to have a @progress variable which is updated during the calculations and is read by a different action that processes AJAX requests from the frontend. But this idea fails: I always get the default value of 0 in the AJAX action while the variable is being updated in the long method. I've tried @@progress, $progress and session[:progress], with exactly the same results.
Now I'm considering making a model for storing the progress in the database and reading it from there, but I can't believe it couldn't be done by some simpler means.
Please share your thoughts!
Theoretical:
The usual approach for these cases is to perform the job asynchronously from the HTTP handler process (so the end-user is not waiting too long for a response from the webserver).
This means:
delegate the heavy work to a background job,
somehow make the client-side aware of when the job is done (2 options here).
Practical (application of the theoretical above in a context of a Rails app):
Background job: The Rails community provides a wide variety of gems (plus the built-in ActiveJob interface) for running async jobs (= background tasks). They can be divided into 2 main categories:
persisted state: the queue is written to durable storage (typically the application database), so it can be resumed if the server reboots (DelayedJob, Que)
in-memory state: usually faster, but the queue can be lost on a reboot unless the backing store is configured to persist (Resque, Sidekiq, both of which keep their queues in Redis)
surface to client-side:
There are two main options here:
polling: client-side AJAX call to the back-end every X seconds to check if the background job is done
subscribing via web socket: client-side connecting via web socket to the server and listening to an event triggered when the job is done (ex: ActionCable as pointed out by @Vasilisa)
Opinion-based:
If you want to keep it simple, I would go with a very simple implementation: Resque for the back-end and a polling system for the front-end.
If you want something complete, capable of resisting server reboots and restoring the queue where it was before the crash, I would use a persisted version (DelayedJob for example) or wrap the in-memory solution with your own persisting logic.
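The "background job plus polling" combination can be sketched as follows. This is an illustration in Python rather than Rails code: start_job and get_progress are hypothetical names for the two controller actions, a thread stands in for the job runner, and a module-level dictionary stands in for whatever shared store (database row, cache key) holds the progress. In a multi-process deployment the store must live outside the web process, which is one reason per-request or per-process variables do not work for this.

```python
import threading
import time
import uuid

_progress = {}            # job_id -> percent complete; stands in for a DB row or cache key
_lock = threading.Lock()


def _heavy_calculation(job_id, steps):
    for done in range(1, steps + 1):
        time.sleep(0.01)  # placeholder for one slice of real work
        with _lock:
            _progress[job_id] = int(done * 100 / steps)


def start_job(steps=10):
    """What the 'start' action would do: kick off the work and return at once."""
    job_id = str(uuid.uuid4())
    with _lock:
        _progress[job_id] = 0
    worker = threading.Thread(target=_heavy_calculation, args=(job_id, steps))
    worker.start()
    return job_id, worker


def get_progress(job_id):
    """What the AJAX 'progress' action would return."""
    with _lock:
        return _progress.get(job_id)


job_id, worker = start_job()
first_reading = get_progress(job_id)   # some value between 0 and 100
worker.join()                          # a real client would keep polling instead
final_reading = get_progress(job_id)
```

The client only needs the job ID returned by the first call; every later poll is a cheap lookup.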
I just started working for a company in my first production job as a developer, where I am writing an MVC web application. I was told that for certain data we don't want any persistent storage, so we just keep it in session. I just finished setting up the production environment so that we can do automatic deployment to the servers.
I do so in a rolling deployment manner. Drain connections from a machine, take it down, deploy new code, bring it back up then do the next machine.
From my first test this seems to kill the session data, which was what I was worried about. Is there a way within IIS to transfer session data when a user switches machines, or do I need some sort of shared file storage to recover in the case where the user has to be removed from a machine to load new code onto it?
I am using Big-Ip for my load balancer that does the draining and I don't think it necessarily knows anything about IIS. First experience with the complexities of production level deployment with no downtime requirements. I imagine, I'll need a file storage backup to 'recover' if necessary. Just want to make sure I'm not missing something.
Best practice for a server farm (even if it's just two) is to NOT use session for anything. Apart from the fact that Session can cause performance problems (Session access is serialized and can slow performance as your load increases), Session is also considered transient storage, and can literally disappear at any time. When the IIS app pool is restarted, session is lost (either intentionally or unintentionally). Also, IIS can dump sessions even before they expire if it starts running low on resources.
Session is basically unpredictable, and unreliable. Anything you put in session should be rebuildable from your app if it finds the session data is gone.
Of course, this is for in-proc session. You can use a state server tied to a database, but this too will affect performance. Especially if you make heavy use of session.
In general, design your apps to NOT use session, unless it's for trivial things that can easily be recreated if they are no longer found.
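One way to read "anything you put in session should be rebuildable" is to treat the session as a cache in front of the real source of the data. The Python sketch below assumes a dictionary-like session and a hypothetical load_preferences function standing in for a database read; it is an illustration of the pattern, not ASP.NET code.

```python
def get_or_rebuild(session, key, rebuild):
    """Return session[key], recomputing it if the session lost it."""
    if key not in session:
        session[key] = rebuild()
    return session[key]


calls = []


def load_preferences():
    # Hypothetical loader; in a real app this would read from the database.
    calls.append("load")
    return {"theme": "dark"}


session = {}
first = get_or_rebuild(session, "prefs", load_preferences)
second = get_or_rebuild(session, "prefs", load_preferences)   # served from session

session.clear()           # simulates an app pool recycle wiping in-proc session
third = get_or_rebuild(session, "prefs", load_preferences)    # rebuilt transparently
```

With this shape, a rolling deployment that drops the session costs one extra load instead of losing user state.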
I asked this question about 5 years ago about how to "offload" expensive operations that the user doesn't need to wait for (such as auditing, etc.) so they get a response on the front end quicker.
I now have a related but different question. In my asp.net-mvc application, I have built some reporting pages where you can generate Excel reports (I am using EPPlus) and PowerPoint reports (I am using Aspose.Slides). Here is an example controller action:
public ActionResult GenerateExcelReport(FilterParams args)
{
    byte[] results = GenerateLargeExcelReportThatTake30Seconds(args);
    return File(results, @"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml", "MyReport.xlsx");
}
The functionality is working great, but I am trying to figure out whether these expensive operations (some reports can take up to 30 seconds to return) are impacting other users. In the previous question, I had an expensive operation that the user DIDN'T have to wait for, but in this case he does have to wait, as it's a synchronous activity (click Generate Report, and the expectation is that the user gets a report when it's finished).
In this case, I don't care that the main user has to wait 30 seconds, but I just want to make sure I am not negatively impacting other users because of this expensive operation, generating files, etc.
Is there any best practice here in asp.net-mvc for this use case ?
You can try combination of Hangfire and SignalR. Use Hangfire to kickoff a background job and relinquish the http request. And once report generation is complete, use SignalR to generate a push notification.
SignalR notification from server to client
Alternate option is to implement a polling mechanism on client side.
Send an ajax call to enqueue a Hangfire job to generate the report.
Then start polling an API with another ajax call that reports the job status, and retrieve the report as soon as it is ready. I prefer to use SignalR rather than polling.
If the report processing is impacting the performance on the web server, offload that processing to another server. You can use messaging (ActiveMQ or RabbitMQ or some other framework of your choice) or rest api call to kick off report generation on another server and then again use messaging or rest api call to notify report generation completion back to the web server, finally SignalR to notify the client. This will let the web server be more responsive.
UPDATE
Regarding your question
Is there any best practice here in asp.net-mvc for this use case
You have to monitor your application over time. Monitor both the client side and the server side. There are a few tools you can rely upon, such as New Relic and AppDynamics. I have used New Relic, and it has features to track issues both in the client browser and on the server side. The names of the products are "NewRelic Browser" and "NewRelic Server". I am sure there are other tools that capture similar info.
Analyze the metrics over time, and if you see any anomalies, take appropriate action. If you observe server-side CPU and memory spikes, try capturing metrics on the client side around the same timeframe. If you notice timeouts or connection errors on the client side, that means your users are unable to connect to your app while the server is doing some heavy lifting. Next, try to identify server-side bottlenecks. If there is not enough room to performance-tune the code, go through a server capacity planning exercise and figure out how to further scale your hardware, or move the background jobs off the web servers to reduce load. Capturing metrics with these tools alone may not be enough; you may have to instrument your application (log capturing) to collect additional metrics and properly monitor application health.
Here you can find some information about capacity planning for .net application from Microsoft.
-Vinod.
These are all great ideas on how to move work out of the request/response cycle. But I think @leora simply wants to know whether a long-running request will adversely impact other users of an asp.net application.
The answer is no. asp.net is multi-threaded. Each request is handled by a separate worker thread.
In general, it is considered good practice to run long-running tasks in the background and give the user some kind of notification when the job is done. As you probably know, web request execution time is limited (the httpRuntime executionTimeout setting, which defaults to 110 seconds on .NET 2.0 and later and was 90 seconds on 1.x), so if your long-running task could exceed this, you have no choice but to run it in some other thread/process. If you are using .NET 4.5.2 you can use HostingEnvironment.QueueBackgroundWorkItem to run long-running tasks in the background and use SignalR to notify the user when the task has finished executing. If you are generating a file, you can store it on the server under some unique ID and send the user a link for downloading it. You can delete this file later (with a Windows service, for example).
As mentioned by others, there are more advanced background task runners such as Hangfire, Quartz.NET and others, but the general concept is the same: run the task in the background and notify the user when it is done. Here is a nice article about different options for running background tasks.
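The "store the file under a unique ID, hand out a link, delete it later" part can be sketched like this. It is a Python illustration with hypothetical function names (store_report, read_report, purge_old_reports), not ASP.NET code; the cleanup function plays the role of the Windows service mentioned above, and a real download handler should also validate the ID it receives.

```python
import os
import tempfile
import time
import uuid


def store_report(directory, content):
    """Save a finished report under an unguessable ID and return that ID."""
    report_id = uuid.uuid4().hex
    with open(os.path.join(directory, report_id + ".xlsx"), "wb") as handle:
        handle.write(content)
    return report_id


def read_report(directory, report_id):
    """Serve the download link; returns None if the file was already purged."""
    path = os.path.join(directory, report_id + ".xlsx")
    if not os.path.exists(path):
        return None
    with open(path, "rb") as handle:
        return handle.read()


def purge_old_reports(directory, max_age_seconds, now=None):
    """Cleanup task: delete reports older than max_age_seconds."""
    now = time.time() if now is None else now
    removed = 0
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if now - os.path.getmtime(path) > max_age_seconds:
            os.remove(path)
            removed += 1
    return removed


reports_dir = tempfile.mkdtemp()
report_id = store_report(reports_dir, b"fake xlsx bytes")
downloaded = read_report(reports_dir, report_id)
removed_now = purge_old_reports(reports_dir, max_age_seconds=3600)
# Pretend two hours have passed so the report counts as stale.
removed_later = purge_old_reports(reports_dir, 3600, now=time.time() + 7200)
gone = read_report(reports_dir, report_id)
```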
You need to use async and await of C#.
From your question I figured that you are just concerned with the fact that the request can be taking more resources than it should, instead of with scalability. If that's the case, make your controller actions async, as well as all the operations you call, as long as they involve calls that block threads. e.g. if your requests go through wires or I/O operations, they will be blocking the thread without async (technically, you will, since you will wait for the response before continuing). With async, those threads become available (while awaiting for the response), and so they can potentially serve other requests of other users.
I assumed you are not wondering how to scale the requests. If you are, let me know, and I can provide details on that as well (too much to write unless it's needed).
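The effect described above, where a waiting request releases its worker so other requests can proceed, can be shown with a small Python asyncio sketch (an analogy, not C#). fetch_report_data is a hypothetical coroutine, and asyncio.sleep stands in for a database or network call.

```python
import asyncio
import time


async def fetch_report_data(delay):
    # asyncio.sleep stands in for awaiting a database or HTTP call
    await asyncio.sleep(delay)
    return delay


async def handle_two_requests():
    started = time.perf_counter()
    # Both "requests" wait at the same time instead of one after the other.
    results = await asyncio.gather(fetch_report_data(0.2), fetch_report_data(0.2))
    return results, time.perf_counter() - started


results, elapsed = asyncio.run(handle_two_requests())
```

The two waits overlap, so the total is close to one delay rather than two. This only helps while the request is waiting on I/O; CPU-bound work such as building a large spreadsheet still occupies a thread for its full duration.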
I believe a tool/library such as Hangfire is what you're looking for. First, it allows you to specify a task to run on a background thread (in the same application/process). Combining it with a technique such as SignalR allows for real-time front-end notification.
However, something I set up after using Hangfire for nearly a year was splitting our job processing (and implementation) onto another server using this documentation. I use an internal ASP.NET MVC application to process jobs on a different server. The only performance bottleneck, then, is if both servers use the same data store (e.g. a database). If you're locking the database, the only way around it is to minimize the locking of said resource, regardless of the methodology you use.
I use interfaces to trigger jobs, stored in a common library:
public interface IMyJob
{
    MyJobResult Execute( MyJobSettings settings );
}
And, the trigger, found in the front-end application:
//tell the job to run
var settings = new MyJobSettings();
_backgroundJobClient.Enqueue<IMyJob>( c => c.Execute( settings ) );
Then, on my background server, I write the implementation (and hook in it into the Autofac IOC container I'm using):
public class MyJob : IMyJob
{
    // Implements IMyJob.Execute so the enqueued call above resolves to this method
    public MyJobResult Execute( MyJobSettings settings )
    {
        // do stuff here, then return a MyJobResult
    }
}
I haven't messed too much with trying to get SignalR to work across the two servers, as I haven't run into that specific use case yet, but it's theoretically possible, I imagine.
You need to monitor your application users to know if other users are being affected e.g. by recording response times
If you find that this is affecting other users, you need to run the task in another process, potentially on another machine. You can use the library Hangfire to achieve this.
Using that answer, you can declare a Task with low priority
lowering priority of Task.Factory.StartNew thread
public ActionResult GenerateExcelReport(FilterParams args)
{
    byte[] result = null;
    // PriorityScheduler is the custom scheduler from the linked answer, not a framework type
    Task.Factory.StartNew(() =>
    {
        result = GenerateLargeExcelReportThatTake30Seconds(args);
    }, CancellationToken.None, TaskCreationOptions.None, PriorityScheduler.BelowNormal)
    .Wait();
    return File(result, @"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml", "MyReport.xlsx");
}
Queue the jobs in a table, and have a background process poll that table to decide which Very Large Job needs to run next. Your web client would then need to poll the server to determine when the job is complete (potentially by checking a flag in the database, but there are other methods.) This guarantees that you won't have more than one (or however many you decide is appropriate) of these expensive processes running at a time.
Hangfire and SignalR can help you here, but a queueing mechanism is really necessary to avoid major disruption when, say, five users request this same process at the same time. The approaches mentioned that fire off new threads or background processes don't appear to provide any mechanism for minimizing processor / memory consumption to avoid disrupting other users due to consuming too many resources.
I have an ASP.NET MVC application which gathers data from multiple Databases.
The databases hold information for various sites and for every new site we have a new Database. The database for each site is connected at two points, from the site and then from HQ.
A web application updates data every minute from the site, and the data is served to HQ (via another web application) every minute. Sometimes the application response is very slow, and from what I have investigated, it may be because the connection pool fills up quickly.
I want to ask what is the best approach to such application, where I can get the best performance out of it. Any guidance is welcome.
How to improve your web application's database performance really depends on your architecture. But there are some general rules which you should always follow:
Check for thread starvation: On the Web server, the .NET Framework
maintains a pool of threads that are used to service ASP.NET
requests. When a request arrives, a thread from the pool is
dispatched to process that request. If the request is processed
synchronously, the thread that processes the request is blocked
while the request is being processed, and that thread cannot service
another request.
This might not be a problem, because the thread
pool can be made large enough to accommodate many blocked threads.
However, the number of threads in the thread pool is limited. In
large applications that process multiple simultaneous long-running
requests, all available threads might be blocked. This condition is
known as thread starvation. When this condition is reached, the Web
server queues requests. If the request queue becomes full, the Web
server rejects requests with an HTTP 503 status (Server Too Busy).
For thread starvation, the best approach is to use asynchronous
methods (refer here for more information).
Try to wrap your data contexts in a using block, so they are disposed immediately after you finish with them.
Huge amounts of data per transaction: you should check your code.
Maybe you are loading far more data than you need. For
example, you transfer whole objects when you need just one
property of each object. In this case use "projection" (refer here for
an example).
Also, you may use "lazy loading" or "eager loading" based on your
scenario. But please note that neither of these is a magic tool for
every scenario. In some cases "lazy loading" improves performance, and
in others "eager loading" makes things faster. It depends on your
understanding of these two terms, and also on the issue at hand,
your code and your design.
Filter your data on the server side rather than the client side. Filtering data on the server side helps to keep your server load and network traffic as low as possible. It also makes your application more responsive. Use the IQueryable interface for server-side filtering (check here for more information).
One side effect of server-side filtering is better security, since data the client does not need never leaves the server.
Check your architecture to see whether you have any bottlenecks. A
controller which gets called too often, a method which handles lots
of objects with lots of data, a table in the database which receives
requests continuously: all are candidates for a bottleneck.
Use caching when applicable for the most requested data. But again,
use caching wisely and based on your situation. Badly applied caching
can make your server very slow.
If you think your speed issue lies entirely in your database, the best approach is to use SQL profiling tools to find out where the critical points are. Maybe redesigning your tables could be an answer. Try to separate reading and writing tables as much as possible. Separation could be done by creating appropriate views. Also check this checklist for monitoring your database.
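Three of the rules above (release connections promptly, project only the columns you need, filter on the server) can be combined in one small sketch. It uses Python's sqlite3 module instead of .NET and a made-up sites table, so treat the schema and the site_names_with_status helper as assumptions for illustration.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def site_names_with_status(db_path, status):
    # closing() releases the connection as soon as the block exits,
    # which is the role a using block plays for a data context.
    with closing(sqlite3.connect(db_path)) as conn:
        cursor = conn.execute(
            # Projection: select one column, not the whole row.
            # Filtering: the WHERE clause runs in the database, not in the app.
            "SELECT name FROM sites WHERE status = ? ORDER BY name",
            (status,),
        )
        return [row[0] for row in cursor]


# Build a throwaway database so the sketch is runnable on its own.
db_path = os.path.join(tempfile.mkdtemp(), "sites.db")
with closing(sqlite3.connect(db_path)) as conn:
    conn.execute("CREATE TABLE sites (name TEXT, status TEXT, payload BLOB)")
    conn.executemany(
        "INSERT INTO sites VALUES (?, ?, ?)",
        [
            ("south", "active", b"x" * 1000),
            ("east", "offline", b"x" * 1000),
            ("north", "active", b"x" * 1000),
        ],
    )
    conn.commit()

active = site_names_with_status(db_path, "active")
```

The large payload column is never read or transferred, and the connection is held only for the duration of the query.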
I found a question that explains how Play Framework's await() mechanism works in 1.2. Essentially if you need to do something that will block for a measurable amount of time (e.g. make a slow external http request), you can suspend your request and free up that worker to work on a different request while it blocks. I am guessing once your blocking operation is finished, your request gets rescheduled for continued processing. This is different than scheduling the work on a background processor and then having the browser poll for completion, I want to block the browser but not the worker process.
Regardless of whether or not my assumptions about Play are true to the letter, is there a technique for doing this in a Rails application? I guess one could consider this a form of long polling, but I didn't find much advice on that subject other than "use node".
I had a similar question about long requests that block workers from taking other requests. It's a problem with all web applications. Even Node.js may not be able to solve the problem of spending too much time on a worker, or it could simply run out of memory.
A web application I worked on has a web interface that sends requests to a Rails REST API; the Rails controller then has to call a Node REST API that runs a heavy, time-consuming task to get some data back. A request from Rails to Node.js could take 2-3 minutes.
We are still trying different approaches, but maybe the following could work for you, or you can adapt some of the ideas. I would love to get some feedback too:
The frontend makes a request to the Rails API with a generated identifier [A] within the same session (this identifier helps to identify previous requests from the same user session).
The Rails API proxies the frontend request and the identifier [A] to the Node.js service.
The Node.js service adds this job to a queue system (e.g. RabbitMQ or Redis); the message contains the identifier [A]. (Here you should think about your own scenario; this also assumes that some system will consume the queued job and save the results.)
If the same request is sent again, then depending on the requirement you can either kill the current job with the same identifier [A] and queue the latest request, or ignore the latest request and wait for the first one to complete, or make whatever other decision fits your business requirement.
The frontend can send REST requests at intervals to check whether the data processing with identifier [A] has completed; these requests are lightweight and fast.
Once Node.js completes the job, you can either use the message subscription system or wait for the next status check request and return the result to the frontend.
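The identifier-based steps above can be sketched as a small job registry. This Python example is an in-memory stand-in for the queue and result store (JobRegistry and its methods are hypothetical names), and it implements only the "ignore the repeated request" policy from the list.

```python
class JobRegistry:
    """In-memory stand-in for the queue plus result store (hypothetical)."""

    def __init__(self):
        self._jobs = {}   # identifier -> {"status": ..., "result": ...}

    def submit(self, identifier):
        """Queue a job unless one with the same identifier already exists."""
        if identifier in self._jobs:
            return False              # duplicate click or retry: ignore it
        self._jobs[identifier] = {"status": "queued", "result": None}
        return True

    def complete(self, identifier, result):
        """Called by the worker that consumed the job."""
        self._jobs[identifier] = {"status": "done", "result": result}

    def status(self, identifier):
        """Cheap lookup for the frontend's periodic status request."""
        job = self._jobs.get(identifier)
        return job["status"] if job else "unknown"

    def result(self, identifier):
        job = self._jobs.get(identifier)
        return job["result"] if job else None


registry = JobRegistry()
accepted_first = registry.submit("A")
accepted_again = registry.submit("A")      # same identifier sent twice
status_before = registry.status("A")
registry.complete("A", {"rows": 42})
status_after = registry.status("A")
final_result = registry.result("A")
```

Because the second submission with identifier A is rejected, the heavy job runs once no matter how many times the user repeats the request.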
You can also use a load balancer, e.g. Amazon's load balancer or HAProxy. 37signals has a blog post and video about using HAProxy to offload long-running requests so that they do not block shorter ones.
GitHub uses a similar strategy to handle long requests for generating commit/contribution visualisations. They also set a time limit on it. If it takes too long, GitHub displays a message saying so and that the request has been cancelled.
YouTube has a nice message for longer queued tasks: "This is taking longer than expected. Your video has been queued and will be processed as soon as possible."
I think this is just one solution. You can also take a look at the EventMachine gem, which can help to improve performance and handle parallel or async requests.
This kind of problem may involve one or more services, so think about the possibility of improving performance between those services (e.g. database, network, message protocol, etc.). If caching may help, try caching frequent requests or pre-calculating results.