How to calculate application availability (SLA) - asp.net-mvc

I have a standard ASP.NET MVC project and I need to calculate application availability to find out our SLA level. I need to get something like this for our web application.
Information from my hosting provider
System Availability: 99.9860%
Total Uptime: 30d 10h:22m:44s
Total Downtime: 0d 0h:6m:9s
Total Reboots: 3
Mean Time Between Reboots: 10.15 days
But I need to calculate availability for the application itself. So the question is:
How do I calculate ASP.NET MVC application availability properly?
Maybe someone has already implemented this. Any suggestion on how to do it would be appreciated.
Where to start?
The first thing I thought of is Application Insights and its availability tests. The problem is that the minimum test frequency is 5 minutes, and I need more precise measurements.
Next, I could create a tool that calls my app every second and collects the results. The drawback is a very large number of requests.
I could also read some performance counters from IIS or something similar. I need to investigate whether that is possible.
I know the question is possibly too broad, but I didn't find any information about implementing application availability measurement. What do you think?

It would take too long to explain everything that can be done, so I'll keep it short.
Usually you define all these details in a Service Level Agreement, where you also define the availability target (e.g. 99 %), which also includes planned downtime. A 99 % availability target means having the app running, with its functionality as described in the document, with at most approx. 87.6 h of downtime per year. Here is a SLA uptime calculator.
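As a rough illustration of the arithmetic behind such a calculator, here is a minimal Python sketch. The function names are my own (hypothetical), and it assumes a 365-day year and that availability is simply uptime divided by total observed time. It reproduces the 87.6 h figure for a 99 % target, and the hosting provider's 99.9860 % from the uptime and downtime quoted in the question.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 h; assumption: leap years are ignored


def allowed_downtime_hours(target_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Downtime budget implied by an availability target over a period."""
    return period_hours * (100.0 - target_percent) / 100.0


def to_seconds(days: int = 0, hours: int = 0, minutes: int = 0, seconds: int = 0) -> int:
    """Convert a 'Xd Yh:Zm:Ws' style duration to seconds."""
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def availability_percent(uptime_seconds: float, downtime_seconds: float) -> float:
    """Availability as uptime divided by total observed time, in percent."""
    total = uptime_seconds + downtime_seconds
    if total <= 0:
        raise ValueError("observation window must be positive")
    return 100.0 * uptime_seconds / total


if __name__ == "__main__":
    # Budget for a 99 % target over one year.
    print(allowed_downtime_hours(99.0))
    # Figures from the hosting provider report in the question.
    up = to_seconds(days=30, hours=10, minutes=22, seconds=44)
    down = to_seconds(minutes=6, seconds=9)
    print(round(availability_percent(up, down), 4))
```

The same two functions work for any window (month, quarter), as long as uptime and downtime are measured over the same period.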
The normal interval is 5 minutes, as you say, but if you can prove by using an external site / service that the suppliers are not meeting the requirements, you calculate your loss (revenue loss, labor costs etc.) and claim the money from them. I guess you already have a Business Impact Analysis (BIA); otherwise you should do one.
OK, now to the programming / DevOps part. I usually develop applications / services with this in mind and have them report their status to a third-party service like NewRelic, Uptrends or similar. In one case I also use a self-made service for this, because of exact requirements for delivering data at least once a second with a hard deadline. In my solution I use WebSockets to send data in both directions following a schedule, on an event, or when needed. A benefit of that is that you can send status (good or bad), let's say every 500 ms, and you will know within one second if the app has failed (≈ 499 ms + 500 ms).
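To make the heartbeat idea concrete, here is a minimal Python sketch of the receiving side. It is not my production code: the class name, the 1 second timeout and the injectable clock are assumptions for illustration, and the transport (WebSockets or anything else) is left out entirely. The app is considered down if the last status was bad, or if no good status has arrived within the timeout.

```python
import time


class HeartbeatMonitor:
    """Hypothetical receiver: tracks status messages and flags missed heartbeats."""

    def __init__(self, timeout_s: float = 1.0, clock=time.monotonic):
        self.timeout_s = timeout_s      # assumption: 2 x the 500 ms send interval
        self._clock = clock             # injectable so the logic can be tested
        self._last_ok = clock()
        self._last_status_ok = True

    def report(self, ok: bool) -> None:
        """Record one status message (good or bad) from the app."""
        self._last_status_ok = ok
        if ok:
            self._last_ok = self._clock()

    def is_up(self) -> bool:
        """Up only if the last status was good and it arrived within the timeout."""
        age = self._clock() - self._last_ok
        return self._last_status_ok and age <= self.timeout_s


if __name__ == "__main__":
    monitor = HeartbeatMonitor()
    monitor.report(True)
    print(monitor.is_up())
```

Whatever calls `report` would typically run in the message handler of the connection, and a separate periodic check of `is_up` would feed the uptime record.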
Using a service like this you can measure the uptime, custom events of interest, possible errors and a ton of other metrics within a second. Usually it takes 5-100 ms, but WCET/WCRT is hard to estimate.
To answer your question: you cannot calculate application availability with so few measure points. Once every 5 min gives 12 measure points per hour, covering approx. 12 seconds per hour, and you cannot make any reliable calculation from that. You can assume everything was OK between the measure points, but that is called guessing. I have made implementations that have 14 400 measure points per hour in order to provide 500 ms accuracy (banks).
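A small Python simulation shows why sparse probing misleads. Everything here is illustrative: the function names are hypothetical, and the 4-minute outage is a made-up example. Probing every 300 s gives 12 points per hour and can miss that outage completely, while probing every 0.25 s (14 400 points per hour) recovers the real figure.

```python
def measure_points_per_hour(interval_seconds: float) -> float:
    """Number of probes per hour for a given probe interval."""
    return 3600.0 / interval_seconds


def sampled_availability(outages, interval_s: float, window_s: float = 3600.0) -> float:
    """Probe at t = 0, interval, 2*interval, ... and return the percentage
    of probes that saw the app up. `outages` is a list of (start, end) seconds."""
    def is_up(t: float) -> bool:
        return not any(start <= t < end for start, end in outages)

    probes = int(window_s / interval_s)
    results = [is_up(i * interval_s) for i in range(probes)]
    return 100.0 * sum(results) / len(results)


if __name__ == "__main__":
    outage = [(30.0, 270.0)]  # hypothetical 4-minute outage between two probes
    print(measure_points_per_hour(300), sampled_availability(outage, 300))
    print(measure_points_per_hour(0.25), round(sampled_availability(outage, 0.25), 2))
```

With the 5-minute interval every probe lands outside the outage, so the estimate says the app never went down, although it was unavailable for 240 of the 3600 seconds.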
I hope you got an answer that helps you with your problem.

Related

CloudKit: free public storage and data transfer

I would like to understand the CloudKit free usage calculation, but I can't.
Could anyone describe what 40 requests per second (10 per 100,000 users) means? I couldn't find any definition of what a request is. If I had 2 apps and each app pinged my CloudKit server at the same time, would that result in two requests per second (for that moment)? How do I know how to limit the requests in my apps, and how do I queue requests so they can be done later, once the limit is no longer being hit on the CloudKit server?
What about the 2 GB data transfer (50 MB per user)? How should I understand these 50 MB: per second, per day, for eternity? What will happen if one user of one of my apps uses 50 MB of traffic?
How do I limit my app and still have good client-server communication? Will I get an error when the limit is reached, and not be automatically charged by Apple?
I really like the programming ease of CloudKit, but I'm kind of scared that it could all go wrong and I will get charged because of a misunderstanding.
It is really hard for me to imagine how it is calculated.
I think your biggest concern will be quelled by knowing that you can set usage limits on these services. If you've hit this limit then the service will return an error and you can handle that in your app.
40 requests per second is across all users and devices. If you have 3600 users and they all pinged the server once per hour, that would average out to about 1/second. While that won't be enough to build a service like facebook, instagram, or twitter, it would probably be sufficient for getting weather data, a daily schedule, or food truck locations. For up to 4,000,000 users, the free tier will cover each user checking at most once every three hours with an even distribution.
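The arithmetic behind those numbers can be sketched in a few lines of Python. The function names are hypothetical, and `free_tier_limit_rps` encodes an assumption rather than Apple's documented formula: it takes the "40 requests per second (10 per 100,000 users)" wording from the question to mean a base of 40 that grows by 10 per 100,000 users.

```python
def average_requests_per_second(users: int, seconds_between_requests: float) -> float:
    """Average load if every user sends one request per interval, evenly spread."""
    return users / seconds_between_requests


def free_tier_limit_rps(users: int, base_rps: float = 40.0,
                        rps_per_100k_users: float = 10.0) -> float:
    """ASSUMPTION: limit is the larger of the base and the per-user scaling."""
    return max(base_rps, users / 100_000 * rps_per_100k_users)


def min_seconds_between_requests(users: int, limit_rps: float) -> float:
    """Shortest per-user interval that keeps the average under the limit."""
    return users / limit_rps


if __name__ == "__main__":
    print(average_requests_per_second(3600, 3600))        # 3600 users, once per hour
    limit = free_tier_limit_rps(4_000_000)
    hours = min_seconds_between_requests(4_000_000, limit) / 3600
    print(limit, round(hours, 2))                          # roughly three hours
```

Note these are averages. Real traffic is bursty, so peak load will sit above whatever this estimates.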
2GB data transfer is for all of your users. Since the scaling doesn't take effect until you have 100,000 users, 2GB of data transfer is a pretty good amount to get you off the ground. Since it scales at 50MB per user, it's easy to figure out how much you can trust your app to communicate with the server. If just one user goes over but you're still under the total usage then you won't get charged. If you do go over, it's $0.10/GB of data transfer.
You could limit your app to only communicate so much until the user needs to pay for a premium service. If you allowed 50MB/user/month of Data Transfer and let the user know when they approached this limit that they'd have to pay then you'd never go over. You could also have ads on the device that essentially pay for the service to scale thus allowing users who use the app more to have more privileges than passive users but still allowing everyone to have a base usage.
The prices are at the bottom of this page and are fairly reasonable. You can definitely get a cheaper rate if you build things yourself and use AWS, but you'd need to be in the millions of users and/or have high demands for that to be a better option.
"40 requests per second is across all users and devices. If you have 3600 users and they all pinged the server once per hour, that would average out to about 1/second. While that won't be enough to build a service like facebook, instagram, or twitter, it would probably be sufficient for getting weather data, a daily schedule, or food truck locations. For up to 4,000,000 users, the free tier will cover each user checking at most once every three hours with an even distribution."
Just about the 40 requests / second limit:
If this is correct, I sincerely don't understand why so many people in the forums say this is more than enough. For certain apps it might be enough to sync once per hour, but if you want to keep save-game files synchronized between devices then 40 requests/second is ridiculous. A weather app? Don't make me laugh. In 90% of the apps out there you are going to need to insert, update, and delete. I wonder how many requests a simple update is... I hope just 1, but I seriously doubt it.
On Firebase there is not a request limit like this one and the upload is FREE. They just charge you for the downloads.
I might be missing something about this CloudKit thing because I don't get that ridiculous limit.

Heroku: How expensive could running my application be?

I'm about to release an iOS app and deploy its backend (a Rails backend that serves the iOS app) to Heroku.
I have very little knowledge of the practical price you pay based on traffic, etc. This link (http://notes.ericjiang.com/posts/881) states: "Nowadays, especially with faster code and faster computers, a standard 512MB dyno can power websites with tens of thousands of hits per hour."
I'm trying to get a rough estimate of how much running my backend on Heroku could cost me. What's the best way to figure this out? The pricing is all very straightforward. It basically just comes down to how many dynos I'm going to need.
If I get 5 beta testers to run my iOS app for a 10 minute window, can I extrapolate some statistics as to how much my backend is being used? Is it the 'hits' that matter, or the 'data' transferred, or the 'time' the backend is actively doing something, like queueing up some resulting data?
Is there a formula to figure it out? Let's say a user averages 10 hits per minute, and I constantly have an average of 5000 users. That would be 3 million hits per hour. What exactly should I be looking for in trying to determine accurate pricing for my first backend?
While Heroku do have some limits surrounding bandwidth (not requests) for the most part your cost is close to fixed.
Monthly pricing is typically made up from a combination of:
Dynos
Databases
Addons
Premium support
Heroku provide a price calculator on their website. Further, standard (non-hobby) dynos and up include metrics around CPU usage and memory usage.
My suggestion if you're just starting out? Start with one web dyno and a Postgres database. Beta test your app and check your metrics. A Rails app on a single Standard 1X dyno can handle a reasonable amount of traffic (depending on what else it might be doing) and if you need to add more dynos it's only a command line interface away.
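Once you have metrics from a beta test, a back-of-the-envelope estimate could look like the Python sketch below. The function names are hypothetical, and `measured_rps_per_dyno` is a number you would have to measure for your own app; the value of 50 used here is purely a placeholder, not a Heroku figure.

```python
import math


def hits_per_hour(concurrent_users: int, hits_per_user_per_minute: float) -> float:
    """Total hits per hour for a steady number of active users."""
    return concurrent_users * hits_per_user_per_minute * 60


def requests_per_second(hits_hour: float) -> float:
    """Convert an hourly hit count to an average request rate."""
    return hits_hour / 3600


def dynos_needed(peak_rps: float, measured_rps_per_dyno: float) -> int:
    """Dynos required if each one sustains the measured rate (at least one)."""
    if measured_rps_per_dyno <= 0:
        raise ValueError("measured_rps_per_dyno must be positive")
    return max(1, math.ceil(peak_rps / measured_rps_per_dyno))


if __name__ == "__main__":
    hourly = hits_per_hour(5000, 10)           # the scenario from the question
    rps = requests_per_second(hourly)
    print(hourly, round(rps, 2))
    print(dynos_needed(rps, measured_rps_per_dyno=50))  # 50 is a placeholder
```

The per-dyno rate depends heavily on what each request does, so treat the result as a starting point for a load test rather than a quote.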
Hope that helps.

Tool for monitoring QOS

In my project:
We crawl x servers.
The number of users on each server varies from 1 to n.
We crawl 1 to z items for each user.
Currently we monitor QOS using graphite. We store the time taken to crawl each item:
x.time_taken
The problem with this approach is that if only a single user is affected, we get a false alert about QOS.
What would be the correct tool/technique to answer/monitor the following points:
Alert only if at least k users are affected. [Not number of events]
List the users that were affected.
I think graphite and statsd are not the correct tools for this. What would be a better tool for answering those two questions?
What you are asking for is often called Service Monitoring. For very good reasons you want to know the service impact of an event, rather than just that an event has happened.
The advantage of this approach is exactly as you state in your requirements - you can focus on events which impact a large part of your user base and you have a list of the users affected right away.
The main drawback, IMHO, is that Service Monitoring is usually much more complex than simple performance or event/alert monitoring. It also often relies on a service model, which in my experience is something that is hard to build and even harder to keep up to date.
For example if a server in your system shows a significant slow down or failure, depending on your architecture this may impact all users who use a service that relies on that server, or it may impact a very small subset, or even none at all initially, if there is a load balancing mechanism or redundancy mechanism in place.
You would need to reflect this architecture in your service monitoring model, and also change it every time you update your system architecture or deployment.
If your system is static enough or critical enough to warrant the investment then this may be worth your while. If not then a simple compromise may be just to update the graphing and alerting you are doing to alert when the average response time over a set number of users, or over all users on a server increases by a significant amount.
This may give you most of the benefits you are after without having to invest in the extra complexity of a service monitoring solution.
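If you do end up rolling the per-user check yourself, the core of it is small. Here is a minimal Python sketch, assuming you can get at the raw crawl events as (user, time taken) pairs before they are aggregated; the function names and the threshold values are hypothetical. It counts distinct slow users rather than slow events, and returns the list of affected users along with the alert decision.

```python
def affected_users(events, threshold_s: float) -> set:
    """Distinct users with at least one crawl slower than the threshold.
    `events` is an iterable of (user_id, time_taken_seconds) pairs."""
    return {user for user, taken in events if taken > threshold_s}


def should_alert(events, threshold_s: float, min_users: int):
    """Alert only when at least `min_users` distinct users are affected.
    Returns (alert, sorted list of affected users)."""
    users = affected_users(events, threshold_s)
    return len(users) >= min_users, sorted(users)


if __name__ == "__main__":
    # Hypothetical sample: one user with many slow crawls, one with a single slow crawl.
    sample = [("alice", 9.0), ("alice", 8.5), ("alice", 7.9),
              ("bob", 0.4), ("carol", 6.1), ("dave", 0.3)]
    print(should_alert(sample, threshold_s=5.0, min_users=3))
    print(should_alert(sample, threshold_s=5.0, min_users=2))
```

The point is that aggregated timers like `x.time_taken` have already thrown away the user dimension, so this has to run on events that still carry a user identifier.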
If you definitely want to expand your monitoring approach and want to stick with open source tools, I would start by looking at Nagios if your focus is on infrastructure. There are also quite a few web service monitoring solutions with free tiers, such as Pingdom:
http://www.nagios.org
https://www.pingdom.com

WCAT #Requests more than #Virtual Clients

This problem is now solved.
I am removing the content because I had added too many confidential details earlier.
Thanks stackoverflow.
In your case, WCAT will simulate 1000 virtual clients hammering the site over a duration of 120 seconds. The 2046 requests are the number of requests completed in that time frame; some of your virtual clients will probably have hit the site multiple times during that window.
WCAT will give you the number of requests served over a period of time.
JMeter, on the other hand, will perform a pre-defined number of requests and give you the total time to complete them.
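The relation between the two numbers is plain arithmetic, sketched here in Python with hypothetical function names. A duration-based test fixes the time and counts requests, so the request count can be any multiple of the client count.

```python
def throughput_rps(completed_requests: int, duration_seconds: float) -> float:
    """Average requests per second over a fixed-duration run."""
    return completed_requests / duration_seconds


def requests_per_client(completed_requests: int, virtual_clients: int) -> float:
    """Average number of requests each virtual client completed."""
    return completed_requests / virtual_clients


if __name__ == "__main__":
    # Figures from this run: 2046 requests, 1000 virtual clients, 120 seconds.
    print(round(throughput_rps(2046, 120), 2))
    print(round(requests_per_client(2046, 1000), 3))
```

An average just above 2 requests per client is consistent with most clients getting through the site about twice in the window.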
Where are the 2625 POSTs reflected? I can't see them.
Because transactions are being executed in the warm-up phase of the load test.
See the highlighted portion below. The text is taken from Galin Iliev's post on the topic:
WCAT uses a “warm-up” period in order to allow the Web Server to
achieve steady state before taking measurements of throughput,
response time and performance counters. For instance there is a slight
delay on first request on ASP.NET sites when Just-In-Time (JIT)
compilation is performed.
WCAT will divide the warm-up phase into two parts. For the first
half of the warm-up period WCAT will slowly add virtual clients until
all virtual clients have been activated. The second half is pure
load generation.

TFS Load Testing Web Tests

I am configuring a load test and am confused about the settings. I am testing an intranet website that is expected to have 6000 concurrent users. A previous consultant told my employer that the number of load test users does not matter and that we need to worry about requests/second. They previously determined that those 6000 users would generate 30 rps. While I feel that is not correct, we need to show that we can exceed that number. The previous load test was set for only 200 users and the results showed that it did exceed the 200 rps. They were happy with the results, but that is not how I understand this.
My question is, if we need to support 6000 concurrent users should I just set my users to 6000 and run, or is the rps an adequate piece of data to rely on?
It is really hard to measure the apples of a "Virtual User" with the orange that is a real person. A real person may take seconds to minutes to read a webpage and then take some action. A virtual user will be able to process a webpage every few seconds.
To test adequately you need to figure out a common unit of "work" between real users and the load we can generate with Visual Studio. The consultant probably recommended that RPS be used as it is easy to measure from any loadtest with whatever webtests inside it. It is a good measure.
The accuracy of the RPS measure rests on the assumptions made about your users.
The math works a little like:
I have 6000 users who need to use the site every day. Mostly they log in in the morning, work a bit before morning tea and hit the site more heavily from 2pm-3:30pm.
Looking at previous logs for the site, or just guessing, you can say:
Maybe at peak a user hits the site every minute or so.
Figure that at peak site usage 30% of the users are working.
So
Users:6000
Peak percentage: 30%
RPS/users: 1/60
6000 * 30% * 1/60 = 30 RPS.
So if the site can process 200 RPS, we can roughly say that is twice the load of all 6000 users hitting the site for a page every minute.
6000 * 100% * 1/60 = 100 RPS.
When you change the assumptions about your real users, the number of RPS changes, often dramatically.
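To see how sensitive the target is, the formula fits in one small Python function. The name `estimated_rps` is mine, and the second scenario is a made-up variation, not a measurement.

```python
def estimated_rps(total_users: int, peak_fraction: float,
                  seconds_between_requests: float) -> float:
    """Requests per second = active users at peak / seconds between their requests."""
    if seconds_between_requests <= 0:
        raise ValueError("seconds_between_requests must be positive")
    return total_users * peak_fraction / seconds_between_requests


if __name__ == "__main__":
    # The assumptions above: 30% of 6000 users, one request per minute each.
    print(round(estimated_rps(6000, 0.30, 60), 1))
    # Hypothetical busier case: 50% active, one request every 30 seconds.
    print(round(estimated_rps(6000, 0.50, 30), 1))
```

Raising the active share from 30% to 50% and halving the think time more than triples the target, which is why the assumptions deserve more scrutiny than the virtual user count.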

Resources