WCAT #Requests more than #Virtual Clients - post

This problem is now solved.
I removed the content because I had earlier added too many confidential details.
Thanks stackoverflow.

In your case, WCAT will simulate 1000 virtual clients hammering the site over a duration of 120 seconds. The 2046 requests are the number of requests completed in that time frame; some of your virtual clients will probably have hit the site multiple times during that time window.
WCAT will give you the number of requests served over a period of time.
jMeter on the other hand will perform a pre-defined number of requests and give you a total time to complete them.
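To make the difference concrete, here is a minimal Python sketch of the arithmetic for a fixed-duration run, using the figures quoted above (1000 virtual clients, 120 seconds, 2046 completed requests). The function name is made up for illustration and is not part of WCAT or jMeter.

```python
def throughput_summary(completed_requests, duration_s, virtual_clients):
    """Summarize a fixed-duration run: the duration is given, the request count is measured."""
    return {
        "requests_per_second": completed_requests / duration_s,
        "requests_per_client": completed_requests / virtual_clients,
    }

summary = throughput_summary(completed_requests=2046, duration_s=120, virtual_clients=1000)
print(summary)
```

On average each virtual client completed about two requests, which is why the request count is higher than the client count.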

Where are the 2625 POSTs reflected? I can't see them.

Because transactions are being executed in the warm up phase of the load test.
See the highlighted portion below. The text is taken from GALIN ILIEV'S post on the topic
WCAT uses a “warm-up” period in order to allow the Web Server to
achieve steady state before taking measurements of throughput,
response time and performance counters. For instance there is a slight
delay on first request on ASP.NET sites when Just-In-Time (JIT)
compilation is performed.
WCAT will divide the warm-up phase into two parts. For the first
half of the warm-up period WCAT will slowly add virtual clients until
all virtual clients have been activated. The second half is pure
load generation.
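The two-part warm-up described in the quote can be sketched as follows. This is only an illustration: the quote says clients are added "slowly", and the linear ramp used here is an assumption, as are the function and parameter names.

```python
def active_clients(elapsed_s, warmup_s, total_clients):
    """Virtual clients active at elapsed_s seconds into the warm-up.

    Assumes a linear ramp over the first half of the warm-up period
    (the linear shape is an assumption); the second half runs all clients.
    """
    ramp_s = warmup_s / 2
    if elapsed_s >= ramp_s:
        return total_clients
    return int(total_clients * elapsed_s / ramp_s)

# A 60-second warm-up with 1000 clients, sampled every 15 seconds.
for t in (0, 15, 30, 45):
    print(t, active_clients(t, warmup_s=60, total_clients=1000))
```

Requests sent during this period still reach the server, which is why the server can see more requests than the measured part of the run reports.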

Related

How to calculate application availability (SLA)

I have a standard ASP.NET MVC project and I need to calculate application availability to find out our SLA level. So, I need to get something like this for our web application.
Information from my hosting provider
System Availability: 99.9860%
Total Uptime: 30d 10h:22m:44s
Total Downtime: 0d 0h:6m:9s
Total Reboots: 3
Mean Time Between Reboots: 10.15 days
But I need to calculate availability for the application. So, the question is:
How do I calculate ASP.NET MVC application availability properly?
Maybe someone has already implemented that, or any suggestion how to do that, any help will be appreciated.
Where to start?
The first thing I thought of is Application Insights and its availability test. The problem is that the minimum test frequency is 5 minutes. I need more precise measurements.
Next, I could create a tool that calls my app every second and collects information. Result: a very large number of requests.
Also, I could get some perf counters from IIS or something like that. I need to investigate whether that is possible.
I know the question is possibly too broad, but I didn't find any info about implementing application availability measurement. What do you think about that?
It would take too long if I were to explain all the parts that can be done, so I'll keep it short.
Usually you define all these details in a Service Level Agreement, where you also define the availability target (e.g. 99 %), which also includes planned downtime. A 99 % availability target means having the app running, with its functionality as described in the document, with at most approx. 87.6 h of downtime per year. Here is an SLA uptime calculator.
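The two calculations involved can be sketched in a few lines of Python: availability from measured uptime and downtime (using the hosting provider's figures from the question), and the downtime budget implied by a target. The function names are illustrative, and a 365-day year is assumed.

```python
from datetime import timedelta

HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (assumption)

def availability_percent(uptime, downtime):
    """Availability as the share of total time the system was up."""
    return 100 * (uptime / (uptime + downtime))

def allowed_downtime_hours(target_percent, period_hours=HOURS_PER_YEAR):
    """Downtime budget implied by an availability target over a period."""
    return period_hours * (1 - target_percent / 100)

# Figures reported by the hosting provider in the question.
uptime = timedelta(days=30, hours=10, minutes=22, seconds=44)
downtime = timedelta(minutes=6, seconds=9)

print(f"{availability_percent(uptime, downtime):.4f} %")   # matches the reported 99.9860 %
print(f"{allowed_downtime_hours(99):.1f} h per year")       # budget for a 99 % target
```

The same formula applies to the application itself; the hard part is obtaining trustworthy uptime and downtime figures, which is what the rest of this answer is about.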
The normal interval is 5 minutes as you say, but if you can prove by using an external site / service that the suppliers are not meeting the requirements, you calculate your loss (revenue loss, labor costs etc.) and claim the money from them. I guess you already have a Business Impact Analysis (BIA); otherwise you should do one.
OK, now to the programming / DevOps part. I usually develop applications / services with this in mind and report their status to a third-party service like NewRelic, Uptrends or similar. As an example, I also use a self-made service for this because of strict requirements for delivering data at least once a second with a hard deadline. In my solution I use WebSockets to send data in both directions following a schedule, on an event, or when needed. A benefit of that is that you can send status (good or bad), let's say every 500 ms, and you will know within one second if the app has failed (≈ 499 ms + 500 ms).
Using a service like this you can measure the uptime, custom events of interest and possible errors within a second and a ton of other metrics. Usually within 5-100 ms but WCET/WCRT is hard to estimate.
To answer your question: you cannot calculate application availability with so few measure points. One check every 5 min gives only 12 measure points per hour (covering approx. 12 seconds of that hour), and you cannot base any reliable calculation on that. You can assume everything was OK between the measure points, but that is called guessing. I have made implementations that have 14 400 measure points per hour in order to provide 500 ms accuracy (banks).
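A short Python sketch of the sampling arithmetic, with illustrative function names: 14 400 points per hour works out to one check every 250 ms, while a 5-minute interval yields 12 points.

```python
def measure_points_per_hour(interval_s):
    """Number of checks made in one hour at a fixed interval."""
    return 3600 / interval_s

def estimated_availability_percent(results):
    """Share of successful checks; it says nothing about the time between checks."""
    return 100 * sum(results) / len(results)

print(measure_points_per_hour(300))    # one check every 5 minutes
print(measure_points_per_hour(0.25))   # one check every 250 ms

# With only 12 checks, a single failed check moves the estimate by more than 8 points.
one_hour_of_checks = [True] * 11 + [False]
print(estimated_availability_percent(one_hour_of_checks))
```

With so few samples, one failed check dominates the estimate, and any outage that starts and ends between two checks is never seen at all.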
I hope you got an answer that helps you with your problem.

High traffic volume in short amount of time

I'm looking for some advice here. My school's student section registration process is online and involves around 6,000 students.
They base seating on a first come, first served basis. Every year they open the site at noon and floods of people get on and try to register as fast as possible to get good seats. Every year without fail the server crashes and everyone is mad.
After several years of being frustrated myself, I've offered to redo their registration system.
My plan is to rewrite it in ruby on rails, and use heroku for hosting.
Does a heroku dyno only handle one request at a time?
Heroku scales up to 50 dynos. Will that be enough to handle around 6,000 users with about 5 pageviews per transaction in a short amount of time, say a half hour?
Any helpful strategies or tips you can give me before I dive into this project?
Does a heroku dyno only handle one request at a time?
Yes. Heroku dynos are single threaded.
Heroku scales up to 50 dynos. Will that be enough to handle around
6,000 users with about 5 pageviews per transaction in a short amount
of time, say a half hour?
This depends on how fast your page loads. For argument's sake let's pretend it takes 2s per page request (as per the Google Analytics recommendation) and you need to load 6,000 users x 5 page views / 30 minutes = 1,000 page views per minute.
At 2s per page load, one single dyno would load 30 page views per minute. At 50 dynos, this would be 1500 page views per minute. This would obviously allow you to exceed your overall goal and leave you some room for error, but if all 6000 users are hitting the page at once then a single Heroku app may not be able to keep up depending on your timeout. You would need to implement a user queue system - explained below.
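The back-of-the-envelope numbers above can be reproduced with a small Python sketch. The function names are illustrative, and `concurrency_per_dyno=1` encodes the assumption that a dyno serves one request at a time.

```python
import math

def required_views_per_minute(users, views_per_user, window_minutes):
    """Average page views per minute if the load is spread evenly over the window."""
    return users * views_per_user / window_minutes

def capacity_views_per_minute(dynos, seconds_per_view, concurrency_per_dyno=1):
    """Page views per minute the dynos can serve at a given time per page view."""
    return dynos * concurrency_per_dyno * 60 / seconds_per_view

def dynos_needed(views_per_minute, seconds_per_view, concurrency_per_dyno=1):
    """Smallest dyno count whose capacity covers the required rate."""
    return math.ceil(views_per_minute * seconds_per_view / (60 * concurrency_per_dyno))

required = required_views_per_minute(users=6000, views_per_user=5, window_minutes=30)
print(required)                                            # 1000.0
print(capacity_views_per_minute(dynos=50, seconds_per_view=2))  # 1500.0
print(dynos_needed(required, seconds_per_view=2))          # 34
```

These are averages over the whole half hour. A burst at noon is far above the average, which is why the even-spread assumption is the weakest part of the estimate.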
Any helpful strategies or tips you can give me before I dive into this
project?
All that said, a 2s load time may vary depending on the assets your page needs to load, how much it needs to interact with a database, its queries, caching, etc. Your page could also potentially be served much faster.
You also need to worry about the initial hit of all the users. This could be taken care of via a first come first serve queue system - similar to that used by Ticketmaster if you've ever used their site. This could be accomplished via AWS SQS or your preference of queue system.
With a user queue and caching of your assets and common database queries, you should be able to accomplish this with 50 or fewer dynos.
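The first come, first served idea can be illustrated with a minimal in-memory Python sketch. This is not how SQS or Ticketmaster work internally; the class and method names are hypothetical, and a real deployment would keep the queue in a shared service rather than in one process.

```python
from collections import deque

class AdmissionQueue:
    """Admit at most max_active users at once; everyone else waits in arrival order."""

    def __init__(self, max_active):
        self.max_active = max_active
        self.active = set()
        self.waiting = deque()

    def arrive(self, user_id):
        """Return 0 if the user is admitted now, else their 1-based position in the queue."""
        if len(self.active) < self.max_active and not self.waiting:
            self.active.add(user_id)
            return 0
        self.waiting.append(user_id)
        return len(self.waiting)

    def finish(self, user_id):
        """Release a slot and admit the longest-waiting user, if any. Returns that user or None."""
        self.active.discard(user_id)
        if self.waiting and len(self.active) < self.max_active:
            next_user = self.waiting.popleft()
            self.active.add(next_user)
            return next_user
        return None

queue = AdmissionQueue(max_active=2)
positions = [queue.arrive(u) for u in ("a", "b", "c", "d")]
print(positions)          # [0, 0, 1, 2]
print(queue.finish("a"))  # c
```

Users who are waiting only need a cheap "what is my position" response, so the expensive registration pages are hit by a bounded number of users at a time.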
EDIT: I'm taking your word for it that Heroku will run 50 web dynos. They show 24 as max on their pricing page, but I cannot find any info one way or another.
Does a heroku dyno only handle one request at a time?
It depends on the web server you use (https://devcenter.heroku.com/articles/dynos#dynos-and-requests). If you want more concurrency within a dyno, I'd suggest taking a look at something like Puma.
Heroku scales up to 50 dynos. Will that be enough to handle around 6,000 users with about 5 pageviews per transaction in a short amount of time, say a half hour?
Any helpful strategies or tips you can give me before I dive into this project?
You can have more than 50 dynos. A specific answer for your app is going to be way better than a guess or generalization. Run a load test against your site (e.g., using Blitz) and find out the real numbers. Costs for add-ons are pro-rated per second, so you only pay for the period you have it installed. So make sure you uninstall or downgrade Blitz once you've finished your test.

How to improve Website Waiting time?

While testing website loading speed, I found that the website sometimes loads very quickly and sometimes takes a lot of time to start loading. When I checked in detail, I found that on some requests the wait time was just a few hundred milliseconds, while on other, slow requests the wait time was actually 5 to 30 seconds.
What may be the cause of this kind of deviation, from a few milliseconds to 30 or more seconds? And how can I improve it?
The site is built on ASP.NET MVC3 and a Microsoft SQL Server database.
What patterns are there i.e. are the same URLs always slow, and other URLs always fast, or does it just appear to be random?
Look at what else is running on the server, is it a dedicated server or a VPS?
Look at the DB performance i.e. is it consistent, which are the queries that are taking the longest time, most CPU, most IO etc.
How busy is the site, do the slowdowns match when the app-pool is being recycled or started up?

Load-testing web-app

When load testing a basic web application, what sanity checks do you do other than expected response time?
Is it fair to ask for peak memory usage?
What other checks do you make?
On the server
Requests per second the application can withstand
Requests per second that hit the database (if any, related to the number above, but it's useful to have them as separate figures)
Transferred bandwidth (separated by media type, if possible)
CPU utilization
Memory utilization
On the client
Response time
Weight of the average page
Is the CPU usage high at any time
Run something like YSlow to see what can you optimize on the output to make it quick for users
Stress testing tools usually come with most of these measures (except for memory, CPU and database usage), as do YSlow or Firebug on the client.
We look at a pretty wide variety of metrics when analyzing the results of a load test.
On the server, we start with these main 4 categories:
CPU (% utilization, context switches/sec, process queue length)
Memory (% use, page reads/sec, page writes/sec)
Bandwidth (incoming, outgoing, send & receive errors, # connections, connection failures, segment retransmits/sec)
Disk (Disk I/O Time %, avg service time, queue length, reads and writes/sec)
We also like to look at metrics specific to the webserver and application server in use. For example, in IIS we look at IIS connection counts, cache hit rates and turnover frequency, etc. In .NET, we would be looking at ASP.NET Requests/sec, ASP.NET Last Request Execution Time, ASP.NET Current Requests, ASP.NET Queued Requests, ASP.NET Request Wait Time, ASP.NET Errors/sec and many others.
On the client side, we are primarily looking at total load time for the pages, duration and TTFB (time to first byte) for critical transactions, bandwidth usage, average page size and failure rate. We also find two metrics very useful - we call them Waiting Users and Average Wait Time. Not many tools have these - they tell you at each sample period exactly how many simulated users are in the process of retrieving a resource from the server and how long, on average, they have been waiting for the resource to arrive. We find these very useful for
determining when the server has reached its capacity
discovering that the server has stopped responding to certain types of requests (typically for certain resources, such as those requiring a database query)
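As a rough illustration of the Waiting Users and Average Wait Time metrics, the Python sketch below computes both for one sample point from a list of request start and end times. The data layout and function name are assumptions for this example, not any particular tool's format.

```python
def waiting_snapshot(requests, sample_time):
    """Return (waiting_users, average_wait_s) at sample_time.

    requests is a list of (start_s, end_s) tuples; end_s is None when the
    response has not arrived yet. A request counts as waiting if it started
    at or before sample_time and had not finished by then.
    """
    waits = [
        sample_time - start
        for start, end in requests
        if start <= sample_time and (end is None or end > sample_time)
    ]
    if not waits:
        return 0, 0.0
    return len(waits), sum(waits) / len(waits)

requests = [(0.0, 1.0), (0.5, None), (2.0, 6.0), (3.0, 3.2)]
print(waiting_snapshot(requests, sample_time=4.0))   # (2, 2.75)
print(waiting_snapshot(requests, sample_time=10.0))  # (1, 9.5)
```

A waiting count that keeps climbing from sample to sample while the average wait grows with it is the pattern that suggests the server has stopped answering some class of request.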
Another good sanity check is to run the tests for at least 24 hours. We do that because one app ran nicely for a few hours and then degraded. That run uncovered some issues with scheduled tasks as well as DB connection pooling.
There are a number of services online that can do this type of testing for you as well. Of course, one of the downsides to this approach is that it's harder to correlate the data from the service (which is what can be observed externally) with your own internal data about disk I/O, DB ops, etc. If you end up going this route I would suggest finding a vendor that will give you programmatic access to the raw test result data.

TFS Load Testing Web Tests

I am configuring a load test and am curious/confused about the settings. I am testing an intranet website that is expected to have 6000 concurrent users. My employer had a previous consultant tell them that the number of load test users does not matter and that we need to worry about requests/second. They had previously determined that those 6000 users would generate 30 rps; while I feel that is not correct, we need to show that we can exceed that number. The previous load test was set for only 200 users and the results showed that it did exceed 200 rps. They were happy with the results, but that is not how I understand this.
My question is: if we need to support 6000 concurrent users, should I just set my users to 6000 and run, or is the rps an adequate piece of data to rely on?
It is really hard to measure the apples of a "Virtual User" with the orange that is a real person. A real person may take seconds to minutes to read a webpage and then take some action. A virtual user will be able to process a webpage every few seconds.
To test adequately you need to figure out a common unit of "work" between real users and the load we can generate with Visual Studio. The consultant probably recommended that RPS be used as it is easy to measure from any loadtest with whatever webtests inside it. It is a good measure.
The accuracy of the RPS measure rests on the assumptions made about your users.
The math works a little like:
I have 6000 users who need to use the site every day. Mostly they log in in the morning, work a bit before morning tea, and hit the site more heavily from 2pm-3:30pm.
Looking at previous logs for the site, or just guessing, you can say:
Maybe at peak a user hits the site every minute or so.
Figure that at peak site usage 30% of the users are working.
So
Users:6000
Peak percentage: 30%
RPS/users: 1/60
6000 * 30% * 1/60 = 30 RPS.
So if the site can process 200 RPS we can roughly say it is equivalent to all 6000 users hitting the site for a page every 30 seconds.
6000 * 100% * 1/30 = 200 RPS.
When you change the assumptions about your real users, the number of RPS changes, often dramatically.
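The sensitivity to those assumptions is easy to see with a short Python sketch of the same formula. The function and parameter names are illustrative.

```python
def expected_rps(users, active_fraction, seconds_between_requests):
    """Requests per second generated by the active share of the user base."""
    return users * active_fraction / seconds_between_requests

print(expected_rps(6000, 0.30, 60))  # 30% active, one request per minute
print(expected_rps(6000, 1.00, 60))  # everyone active, one request per minute
print(expected_rps(6000, 1.00, 30))  # everyone active, one request every 30 seconds
```

With 6000 users, the three scenarios give roughly 30, 100 and 200 RPS, so whether a measured 200 RPS is comfortable headroom depends entirely on which scenario describes the real users.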
