Strange lag in HTTP pipeline with ASP.NET MVC on IIS6

I have an ASP.NET MVC application running on IIS6 with wildcard mapping enabled. After performing some load tests, I dug into the log files, focusing on the slow requests. I have a log file from the load testing application, the IIS log file, and a log file from an IHttpModule I developed for this purpose, which records the times of Application.BeginRequest and Application.EndRequest.
When I compared the data for slow requests in the IIS log file with the log file of my module, I discovered some odd behavior.
var request_processing_start = iis_log_file.time - iis_log_file.taken
var aspnet_processing_start = my_module_log_file.begin_request_called
var aspnet_processing_end = my_module_log_file.end_request_called
var request_processing_end = iis_log_file.time
For all requests, the values of aspnet_processing_end and request_processing_end match closely (the difference is no more than a few milliseconds). However, for some requests the time span between request_processing_start and aspnet_processing_start is up to 30 seconds. What is the cause of this lag and how can I prevent it?
Some more details of the load test I ran:
Number of requests: 2 per second on long-term average, no more than 20 per second at peak. (This is really low, so we cannot blame wildcard mapping.)
Processor usage: flat line below 5%. (We cannot blame a lack of CPU power.)
Free memory: more than 1 GB. (Memory isn't the issue either.)
The lag I described is occurring only when the server is under load. When the load testing application isn't running it cannot be reproduced.
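The comparison above boils down to subtracting the IIS time-taken value from the logged completion time and comparing the result with the module's BeginRequest timestamp. A minimal Python sketch of that arithmetic, assuming the log lines have already been parsed into datetime values and that time-taken is in milliseconds; the function name and the sample values are hypothetical and not taken from the real logs:

```python
from datetime import datetime, timedelta


def pre_aspnet_lag(iis_end, time_taken_ms, begin_request):
    """Delay between the request start derived from the IIS log and BeginRequest."""
    # request_processing_start = iis_log_file.time - iis_log_file.taken
    request_processing_start = iis_end - timedelta(milliseconds=time_taken_ms)
    # aspnet_processing_start = my_module_log_file.begin_request_called
    return begin_request - request_processing_start


# Hypothetical sample values for one slow request.
iis_end = datetime(2010, 1, 1, 12, 0, 31)
begin_request = datetime(2010, 1, 1, 12, 0, 30, 500000)

lag = pre_aspnet_lag(iis_end, 31000, begin_request)
print(lag.total_seconds())
```

Keep in mind that the time field in the IIS log has one-second resolution, so differences below a second are within the noise; a lag of many seconds is well outside it.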

Related

ASP.NET MVC5 app memory leak

I have an ASP.NET MVC5 web application that is consuming excessive amounts of memory on our production server. It is excessive in that it increases over time and doesn't seem to stop until we recycle the application pool at ~8 GB; the most we've let it reach is 30 GB. The application does not behave like this on my dev machine or on our test server.
The production server is Windows Server 2012 R2 running IIS 8.5.9600.
I have written a small test tool which creates 50 concurrent threads and sends 1000 sequential requests on each thread. In development the web application's memory stays around 400 MB, as it does in our test environment. On the production server, the memory just keeps increasing. It doesn't matter what the endpoint does; I've just got it returning a vanilla .cshtml Razor view.
I've been trying to figure out what could be causing the memory leak for some time and have tried a few things:
Taking memory dumps and using a profiler (I've tried several different profilers). They all indicate that managed memory is within a reasonable amount (~100 - 200 MB), even on memory dumps of 8 GB!
Deploying a copy of the default "empty" ASP.NET MVC app generated by Visual Studio and running my test tool pointing to that. Same symptoms; memory is stable in dev and test environments but increases on the production server. I'm going to let it run for a while and see how high it goes, but so far it's 3 GB and climbing with each request.
The production server does have 96 GB of RAM, and from my understanding of how IIS uses this, it can get very greedy. But my dev machine has 32 GB and the max application pool size I've seen is around 600 MB and then it gets GC'd and reduces back to around ~400 MB.
What is taking up all the memory? Is this normal behaviour for IIS?
Update:
I've created a new VM server on Azure with similar specifications (112 GB RAM) and the memory stops at around 400 MB also. There must be something specific to our production server causing the problem.
In our case, it was our hosting provider who had installed some monitoring tool which used the .NET profiling API poorly.
I suggest that anyone observing similar symptoms in their app try configuring a new server instance.

Application Pool Occasionally Spiking Memory Consumption

We have just launched a new MVC5 web site. The site uses Entity Framework for its data and also implements a couple of WebApi services for some simple AngularJS pages used on the web site.
The site has gone through development and testing without a problem, but now it is installed on an IIS 8.5 production server we are seeing the following entries in the IIS (WAS) event logs:
Here is the first error:
A worker process serving application pool 'xxx' has requested a recycle
because it reached its private bytes memory limit.
Around 90 seconds later we see this error:
A worker process '4880' serving application pool 'xxx' failed to stop
a listener channel for protocol 'http' in the allotted time. The data
field contains the error number.
Which is immediately (the same time to the second) followed by a third error:
A process serving application pool 'xxx' exceeded time limits during
shut down. The process id was '4880'.
Finally, we see another Application Pool recycle event:
A worker process serving application pool 'xxx' has requested a recycle
because it reached its private bytes memory limit.
We are currently seeing this problem approximately once per day and it does not seem to be related to site traffic/loading.
We set the Application Pool to recycle when its Private Bytes consumption exceeds 4,194,304 KB (4 GB) because we had noticed that the pool's private memory consumption would occasionally increase linearly; normally (for perhaps 36 hours) it sits at less than 1 GB. Again, we did not see this during development or local testing.
We have tried running load tests of several hundred concurrent users across the application, but have been unable to replicate this error sequence.
We have also run the application locally for extended periods of time with the JetBrains dotMemory profiler, and the memory snapshots do not reveal any problems.
Are there any tools/techniques available that we can run on the production server that would give us more information on what is happening?

Why does my website time out while running a JMeter load test?

I'm new to JMeter and followed this tutorial to learn it.
I ran a load test under the following conditions.
Number of Threads (Users) - 1000
Ramp-Up Period (in seconds) - 10
Loop Count - 5
While the test was running, I tried to load my website (after clearing the cache), but the page took longer than usual to load. This issue doesn't occur when the browser has cached data.
Can someone please tell me why this is happening? Is it because the site may crash or something when 1000 users load it?
Any kind of explanation will be appreciated.
If you load your website after clearing the cache, whether or not a JMeter test is running, it will always take longer than usual. Because the cache is empty, the browser has to download the page resources again before it can show the page. If you then load the page again without clearing the cache, it takes less time: the browser does not fetch the page resources every time but saves them in its cache, and on the next visit it uses the cached copies to open the page as quickly as possible. So the first load of a page takes longer than later loads of the same page (without clearing the cache).
Another point is that your JMeter test was running while you tried to load your website. Your application was already handling the requests sent by JMeter, and handling that extra load affects the page response time.
A ramp-up time of 10 seconds for 1000 users!
That is not best practice. You have to allow enough time to warm up those 1000 users, and 10 seconds is far too short a ramp-up for that many. During the JMeter test it is therefore to be expected that your browser will take unusually long to load the page, or will end up reporting "Connection Timeout". This doesn't necessarily mean that your application has crashed. It is simply the result of an unrealistic test script design in JMeter.
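To put numbers on the ramp-up point: JMeter spreads thread starts evenly over the ramp-up period, so the settings from the question can be turned into a start rate with simple arithmetic. A small Python sketch; the function name is hypothetical, and the 300-second alternative is only an illustrative assumption, not a recommended value:

```python
def ramp_up_rate(threads, ramp_up_seconds):
    """Return (new threads started per second, seconds between thread starts)."""
    return threads / ramp_up_seconds, ramp_up_seconds / threads


# Settings from the question: 1000 threads, 10-second ramp-up, loop count 5.
rate, gap = ramp_up_rate(1000, 10)
total_iterations = 1000 * 5

# Hypothetical gentler ramp-up for comparison.
gentle_rate, gentle_gap = ramp_up_rate(1000, 300)

print(rate, gap, total_iterations)
print(round(gentle_rate, 2), gentle_gap)
```

With the original settings a new simulated user starts every 10 milliseconds, which is 100 new users per second, so the server sees the full 1000 users almost immediately rather than a gradual build-up.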
Could you elaborate on the type of web server software you are using, e.g.:
- Apache HTTPD 2.4 / Nginx / Apache Tomcat / IIS
And the underlying operating system?
- Windows (Server?) / Mac OS X / Linux
If your webserver machine is not limited by the maximum performance of your CPU, Disk etc. (check the Task Manager) your performance might be limited by the configuration of Apache.
Could you please check the Apache HTTPD log files for relevant warnings?
Depending on your configuration (httpd.conf + any files "Include"d from there) you may be using the mpm_winnt MPM, which has a configurable number of worker threads; the default is 64 according to:
https://httpd.apache.org/docs/2.4/mod/mpm_common.html#threadsperchild
Once these are all busy new requests from any client (your browser, your loadtest, etc.) will have to wait for their turn.
Try and see what happens if you increase the number of threads!
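To illustrate why a fixed number of worker threads makes extra requests wait, here is a deliberately simplified Python model. It assumes all requests arrive at the same instant and each takes the same service time; the 0.5-second service time and the function name are hypothetical, and a real server behaves less regularly than this:

```python
def queue_waits(concurrent_requests, worker_threads, service_seconds):
    """Seconds each request waits for a free worker when all arrive at once."""
    # Requests are served in batches of `worker_threads`; batch k waits k service times.
    return [(i // worker_threads) * service_seconds
            for i in range(concurrent_requests)]


# 1000 simultaneous requests against the default of 64 worker threads.
waits = queue_waits(1000, 64, 0.5)
queued = sum(1 for w in waits if w > 0)

# Same load with a hypothetical larger pool.
waits_bigger_pool = queue_waits(1000, 256, 0.5)

print(queued, max(waits), max(waits_bigger_pool))
```

In this model only the first 64 requests are served immediately and the rest queue behind them, which is why a browser request issued during the test can appear to hang even though the server has not crashed.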

How to improve Website Waiting time?

While testing my website's loading speed, I found that it sometimes loads very quickly and sometimes takes a long time to start loading. When I looked at it in detail, I found that for some requests the wait time was just a few hundred milliseconds, while for other, slow requests the wait time was 5 to 30 seconds.
What could cause this kind of variation, from a few hundred milliseconds to 30 seconds or more? And how can I improve it?
The site is built on ASP.NET MVC3 and a Microsoft SQL Server database.
What patterns are there i.e. are the same URLs always slow, and other URLs always fast, or does it just appear to be random?
Look at what else is running on the server, is it a dedicated server or a VPS?
Look at the DB performance i.e. is it consistent, which are the queries that are taking the longest time, most CPU, most IO etc.
How busy is the site, do the slowdowns match when the app-pool is being recycled or started up?

Load-testing web-app

When load testing a basic web application, what sanity checks do you do other than expected response time?
Is it fair to ask for peak memory usage?
What other checks do you make?
On the server
Requests per second the application can withstand
Requests per second that hit the database (if any, related to the number above, but it's useful to have them as separate figures)
Transferred bandwidth (separated by media type, if possible)
CPU utilization
Memory utilization
On the client
Response time
Weight of the average page
Is the CPU usage high at any time
Run something like YSlow to see what can you optimize on the output to make it quick for users
Stress testing tools usually come with most of these measures (except for memory, CPU and database usage), as do YSlow or Firebug on the client.
We look at a pretty wide variety of metrics when analyzing the results of a load test.
On the server, we start with these main 4 categories:
CPU (% utilization, context switches/sec, process queue length)
Memory (% use, page reads/sec, page writes/sec)
Bandwidth (incoming, outgoing, send & receive errors, # connections, connection failures, segment retransmits/sec)
Disk (Disk I/O Time %, avg service time, queue length, reads and writes/sec)
We also like to look at metrics specific to the web server and application server in use. For example, in IIS we look at IIS connection counts, cache hit rates and turnover frequency, etc. In .NET, we would be looking at ASP.NET Requests/sec, ASP.NET Last Request Execution Time, ASP.NET Current Requests, ASP.NET Queued Requests, ASP.NET Request Wait Time, ASP.NET Errors/sec and many others.
On the client side, we are primarily looking at total load time for the pages, duration and TTFB (time to first byte) for critical transactions, bandwidth usage, average page size and failure rate. We also find two metrics very useful; we call them Waiting Users and Average Wait Time. Not many tools have these. They tell you at each sample period exactly how many simulated users are in the process of retrieving a resource from the server and how long, on average, they have been waiting for the resource to arrive. We find these very useful for:
determining when the server has reached its capacity
discovering that the server has stopped responding to certain types of requests (typically for certain resources, such as those requiring a database query)
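The two metrics described above, Waiting Users and Average Wait Time, can be sketched from a list of request start and end times. This is only one interpretation of the description, not the implementation of any particular tool; the function name and the sample data are hypothetical:

```python
def waiting_stats(requests, sample_time):
    """Count in-flight requests at sample_time and their average wait so far.

    requests: list of (start, end) times in seconds; end is None if the
    response has not arrived yet.
    """
    waits = [sample_time - start
             for start, end in requests
             if start <= sample_time and (end is None or end > sample_time)]
    count = len(waits)
    average = sum(waits) / count if count else 0.0
    return count, average


# Hypothetical sample: one finished request, two in flight, one not yet started.
requests = [(0.0, 1.0), (0.5, 4.0), (2.0, None), (3.5, 3.8)]
waiting_users, average_wait = waiting_stats(requests, 3.0)
print(waiting_users, average_wait)
```

Sampled at regular intervals, a steadily growing count or average suggests that the server is no longer keeping up, and a request that never leaves the in-flight set points to a resource the server has stopped answering.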
Another good sanity check is to run the tests for at least 24 hours. We do that because one app ran nicely for a few hours and then degraded. We discovered some issues with scheduled tasks as well as DB connection pooling.
There are a number of services online that can do this type of testing for you as well. Of course, one of the downsides to this approach is that it's harder to correlate the data from the service (which is what can be observed externally) with your own internal data about disk I/O, DB ops, etc. If you end up going this route I would suggest finding a vendor that will give you programmatic access to the raw test result data.

Resources