Azure app service availability loss. The memory counter Page Reads/sec was at a dangerous level - asp.net-mvc

Environment:
ASP.NET MVC app (.NET Framework 4.5.1) hosted on Azure App Service with two instances.
The app uses an Azure SQL Server database.
The app also uses MemoryCache (System.Runtime.Caching) for caching.
Recently, I noticed availability loss in the app. It happens almost every day.
Observations:
The memory counter Page Reads/sec was at a dangerous level (242) on instance RD0003FF1F6B1B. Any value over 200 can cause delays or failures for any app on that instance.
What does 'The memory counter Page Reads/sec' mean?
How to fix this issue?

What does 'The memory counter Page Reads/sec' mean?
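We can get the answer from this blog. The recommended Page reads/sec value is under 90; higher values indicate insufficient memory and indexing issues. Note that the quote below describes the SQL Server counter of that name, which counts physical database page reads.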
“Page reads/sec indicates the number of physical database page reads that are issued per second. This statistic displays the total number of physical page reads across all databases. Because physical I/O is expensive, you may be able to minimize the cost, either by using a larger data cache, intelligent indexes, and more efficient queries, or by changing the database design.”
How to fix this issue?
Based on my experience, you could try enabling Local Cache in App Service.
You enable Local Cache on a per-web-app basis by using this app setting: WEBSITE_LOCAL_CACHE_OPTION = Always
By default, the local cache size is 300 MB. This includes the /site and /siteextensions folders that are copied from the content store, as well as any locally created logs and data folders. To increase this limit, use the app setting WEBSITE_LOCAL_CACHE_SIZEINMB. You can increase the size up to 2 GB (2000 MB) per web app.
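The two app settings above can be sketched as a small helper that builds the name/value pairs to paste into the portal. This is only an illustration: the function name local_cache_settings is hypothetical, it does not call any Azure API, and the 2000 MB ceiling is taken from the text above.

```python
MAX_LOCAL_CACHE_MB = 2000  # upper limit quoted in the answer above


def local_cache_settings(size_mb=None):
    """Build the App Service app settings that turn on Local Cache.

    size_mb is optional; when omitted, the platform default size applies.
    """
    settings = {"WEBSITE_LOCAL_CACHE_OPTION": "Always"}
    if size_mb is not None:
        if not 0 < size_mb <= MAX_LOCAL_CACHE_MB:
            raise ValueError(
                f"size_mb must be between 1 and {MAX_LOCAL_CACHE_MB}, got {size_mb}"
            )
        # App settings are stored as strings
        settings["WEBSITE_LOCAL_CACHE_SIZEINMB"] = str(size_mb)
    return settings
```

Calling local_cache_settings(1000) gives both settings; calling it with no argument gives only WEBSITE_LOCAL_CACHE_OPTION.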

Several memory performance problems can be listed:
excessive paging,
memory shortages,
memory leaks.
Memory counter values can be used to detect the presence of various performance problems. Tracking counter values both system-wide and per process helps you pinpoint the cause, in Azure just as in other systems.
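As a rough illustration of tracking a counter, the sketch below scans a list of Page Reads/sec samples and reports runs that stay above a threshold. The threshold of 200 comes from the diagnostic message in the question; the function name, the min_run parameter, and the sample values are assumptions made for the example, and how you collect the samples is up to you.

```python
PAGE_READS_WARN = 200.0  # level the App Service diagnostic calls dangerous


def sustained_breaches(samples, threshold=PAGE_READS_WARN, min_run=3):
    """Return (start_index, length) for every run of at least min_run
    consecutive samples that are above threshold."""
    runs = []
    start = None
    for i, value in enumerate(samples):
        if value > threshold:
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - start))
            start = None
    # A run may still be open when the samples end
    if start is not None and len(samples) - start >= min_run:
        runs.append((start, len(samples) - start))
    return runs
```

A single spike is ignored; only a sustained breach is reported, which is usually the more interesting signal when deciding whether the instance is short of memory.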
Even if there is no change in the process, a change in the system can cause memory problems. For system-wide research in Azure:
Shared resources plans (Free and Basic) have memory limits as seen here: https://learn.microsoft.com/en-us/azure/azure-subscription-service-limits#app-service-limits.
Quotas: https://learn.microsoft.com/en-us/azure/app-service-web/web-sites-monitor
Also, you can check in the portal under your web app settings, search for “quotas”, and also check out “Diagnose and solve problems” and hit “metrics per instance (app service plan)” which will show you memory used for the plan.
A MemoryCache bug in .NET 4 can also cause this type of behavior:
https://stackoverflow.com/a/15715990/914284

Related

App engine for high memory rails application

It looks like the most powerful instance type you can have in Google App Engine is one with 2 GB of memory. One of our Rails applications reaches the memory limit quickly under higher load. Autoscaling helps, but I am wondering if there is a way to add more powerful instances in GAE.
If not, how have you solved this problem?
Yes, in App Engine Standard the highest tier is F4_HIGHMEM with 2048 MB of memory. You have 3 ways to scale with Standard:
Automatic: based on request rate, response latencies, and other application metrics.
Basic: creates dynamic instances when your application receives requests.
Manual: uses resident instances that continuously run the specified number of instances regardless of the load level.
Therefore, the question here is: how are you reaching this limit? How are you managing your memory? Take a look at your console metrics: memoryusage. A ladder-shaped graph indicates poor memory usage. When deploying apps in the cloud, you must keep in mind that resource usage has to be more carefully controlled.
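One way to read "ladder-shaped" is memory that only ever steps up and never falls back. The sketch below checks exported memory samples for that pattern. This interpretation, the function name, and the 50 MB growth cutoff are all assumptions made for illustration, not something the App Engine console provides.

```python
def looks_like_ladder(samples_mb, min_growth_mb=50.0):
    """Return True if memory never drops between samples and the total
    growth over the window is at least min_growth_mb."""
    if len(samples_mb) < 2:
        return False
    # Every sample is at least as high as the one before it
    never_released = all(b >= a for a, b in zip(samples_mb, samples_mb[1:]))
    total_growth = samples_mb[-1] - samples_mb[0]
    return never_released and total_growth >= min_growth_mb
```

A healthy process usually shows a sawtooth instead: memory rises under load and drops again after garbage collection.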
You can analyze and check if choosing an automatic scale based on Max concurrent Requests would be a good option for you to mitigate your issue with the memory.
This is for Standard; Flexible is managed differently. There you can specify from 0.9 to 6.5 GB of memory per CPU core.

AWS server became slow after traffic increase

I have a single-page Angular app that makes requests to a Rails API service. Both are running on a t2.xlarge Ubuntu instance. I am using a Postgres database.
We had an increase in traffic, and my Rails API became slow. Sometimes I get an error saying the Passenger queue is full for the Rails application.
Auto scaling on the server is working; three more instances are created. But I cannot trace this issue. I need root access to upgrade, which I do not have. Please help me with this.
As you mentioned, you are using the t2.xlarge instance type. First, I would not use T2 instance types in a production environment, because T2 instances run on CPU credits. Take a look at this:
What happens if I use all of my credits?
If your instance uses all of its CPU credit balance, performance
remains at the baseline performance level. If your instance is running
low on credits, your instance’s CPU credit consumption (and therefore
CPU performance) is gradually lowered to the base performance level
over a 15-minute interval, so you will not experience a sharp
performance drop-off when your CPU credits are depleted. If your
instance consistently uses all of its CPU credit balance, we recommend
a larger T2 size or a fixed performance instance type such as M3 or
C3.
I'm not sure whether you will run out of CPU credits, since you are using an xlarge type, but I think you should use a fixed-performance instance type. Instance performance may be one part of your problem. Use CloudWatch to monitor two metrics, CPUCreditUsage and CPUCreditBalance, to confirm the problem.
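To get a feel for the numbers, here is a back-of-the-envelope sketch of how long a credit balance lasts. It relies on the AWS definition that one CPU credit equals one vCPU running at 100% for one minute. The function name and all input values are hypothetical; take the real balance from CPUCreditBalance and the earn rate for your instance size from the AWS documentation.

```python
def minutes_until_credits_depleted(balance, earned_per_hour, vcpus, avg_utilization):
    """Estimate minutes until the CPU credit balance reaches zero.

    avg_utilization is the average per-vCPU utilization as a fraction (0.0-1.0).
    Returns None when credits are earned at least as fast as they are spent.
    """
    spent_per_minute = vcpus * avg_utilization  # 1 credit = 1 vCPU-minute at 100%
    earned_per_minute = earned_per_hour / 60.0
    net_burn = spent_per_minute - earned_per_minute
    if net_burn <= 0:
        return None  # balance is stable or growing
    return balance / net_burn
```

For example, with a balance of 600 credits, 60 credits earned per hour, 4 vCPUs and 50% average utilization, the balance would last 600 minutes. These inputs are made up for the arithmetic, not the figures for any particular instance type.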
Secondly, how about your ASG? After scale-out, did your service become stable? If so, I think you need not worry about this problem any more, because the ASG did its job.
Please check the following
If you are opening a connection to Database, make sure you close it.
If you are using jQuery, Bootstrap, DataTables, or other CSS/JS libraries, use CDN links like
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/bootstrap-select/1.12.4/css/bootstrap-select.min.css">
This reduces the load on your server. Do not host jQuery or other external libraries on your own server when you can fetch them directly from a CDN.
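For the first point in the list above (closing database connections), the pattern looks roughly like this. The sketch uses Python's built-in sqlite3 as a stand-in for Postgres so it runs anywhere; the function name fetch_one is hypothetical. In a Rails app the equivalent is to let ActiveRecord's connection pool manage connections.

```python
import sqlite3
from contextlib import closing


def fetch_one(db_path, sql, params=()):
    """Run one query and return the first row.

    closing() guarantees close() is called on the cursor and the
    connection even if the query raises.
    """
    with closing(sqlite3.connect(db_path)) as conn:
        with closing(conn.cursor()) as cur:
            cur.execute(sql, params)
            return cur.fetchone()
```

Leaked connections pile up under load and can exhaust the database's connection limit, which from the outside looks like the API becoming slow.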
There are a number of factors that can cause an EC2 instance (or any system) to appear to run slowly.
CPU Usage. The higher the CPU usage the longer to process new threads and processes.
Free Memory. Your system needs free memory to process threads, create new processes, etc. How much free memory do you have?
Free Disk Space. Operating systems tend to thrash when the file systems on system drives run low on free disk space. How much free disk space do you have?
Network Bandwidth. What are the average bytes in/out for your instance?
Database. Monitor connections, free memory, disk bandwidth, etc.
Amazon has CloudWatch which can provide you with monitoring for everything except for free disk space (you can add an agent to your instance for this metric). This will also help you quickly see what is happening with your instances.
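Since free disk space is the one item above that CloudWatch does not report without an agent, a quick local check can be sketched with the Python standard library. The function names and the 10% cutoff are assumptions for illustration.

```python
import shutil


def free_disk_percent(path="/"):
    """Return the percentage of free space on the filesystem holding path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # guard against pseudo-filesystems reporting zero size
    return 100.0 * usage.free / usage.total


def is_low_on_disk(path="/", min_free_percent=10.0):
    """Return True when free space is below min_free_percent."""
    return free_disk_percent(path) < min_free_percent
```

Run it from cron or a health-check endpoint on each instance, passing the mount point you care about.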
Monitor your EC2 instances and your database.
You mention T2 instances. These have burstable CPUs, which means that if you have consistently high CPU usage, you will want to switch to fixed-performance EC2 instances. CloudWatch should help you figure out what you need (CPU, memory, disk, or network performance).
This is totally independent of AWS Server. Looks like your software needs more juice (RAM, StorageIO, Network) and it is not sufficient with one machine. You need to evaluate the metric using cloudwatch and adjust software needs based on what is required for the software.
It could be memory leaks or processing leaks that may lead to this as well. You need to create clusters or server farm to handle the load.
Hope it helps.

Perfino System Requirements

We're planning to evaluate and eventually potentially purchase perfino. I went quickly through the docs and cannot find the system requirements for the installation. I also cannot find its compatibility with JBoss 7.1. Can you provide details please?
There are no hard system requirements for disk space, it depends on the amount of business transactions that you're recording. All data will be consolidated, so the database reaches a maximum size after a while, but it's not possible to say what that size will be. Consolidation times can be configured in the general settings.
There are also no hard system requirements for CPU and physical memory. A low-end machine will have no problems monitoring 100 JVMs, but the exact details again depend on the amount of monitored business transactions.
JBoss 7.1 is supported. "Supported" means that web service and EJB calls can be tracked between JVMs, otherwise all application servers work with perfino.
I haven't found any official system requirements, but this is what we figured out experimentally.
We collect about 10,000 transactions a minute from 8 JVMs. We have a lot of distinct and long SQL queries. We use AWS machine with 2 VCPUs and 8GB RAM.
When the Perfino GUI is not being used, the CPU load is low. However, for the GUI to work properly, we had to set -Xmx6000m in perfino_service.vmoptions. Before that we had experienced multiple OutOfMemoryError failures in Perfino when filtering in the transactions view. After changing the memory settings, the GUI is running fine.
This means that you need a machine with about 8GB RAM. I guess this depends on the number of distinct transactions you collect. Our limit is high, at 30,000.
After 6 weeks of usage, there's 7GB of files in the perfino directory. Perfino can clear old recordings after a configurable time.

Why is my web application's memory usage so high?

I have a C# MVC App that also uses EF.
It's working well, but on my local dev machine IIS Express uses on the order of 100 MB of memory, whereas in the production environment it uses 600 MB and seems to be challenging the specs of our VPS.
The 600 MB figure is taken from PerfMon's Private Bytes counter on the app pool process. Redgate's performance monitor, however, seems to say that private bytes are more on the order of 150 MB. I'm not sure what the difference between the two measures is.
What is a reasonable guide to the private bytes usage that I should expect PerfMon to report for a production site?
I read somewhere that private bytes may be reporting memory that is available to the application, not necessarily memory that is currently allocated by the application. I still find it alarming that it has reached 500-600 MB. Presumably the OS must think the application's memory demand may peak there?
Should I be alarmed, and is there any advice on how to figure out what is going on?
UPDATE
If I run it on Win7 with IIS it only consumes around 100 MB, similar to the result from IIS Express. Does this mean it has more to do with the IIS configuration on my production machine?

Windows Mobile memory corruption

Does the WM operating system protect each process's memory from other processes?
Can one badly written application crash another application just by mistakenly writing over its memory?
Windows Mobile, at least in all current incarnations, is built on Windows CE 5.0 and therefore uses CE 5.0's memory model (which is the same as it was in CE 3.0). The OS doesn't actually do a lot to protect process memory, but it does enough to generally keep processes from interfering with one another. It's not hard and fast though.
CE processes run in "slots", of which there are 32. The currently running process gets swapped to slot zero, and its addresses are re-based to zero (so all memory in the running process effectively has 2 addresses, the slot 0 address and its non-zero slot address). These addresses are protected (though there's a simple API call to cross the boundary). This means that pointer corruptions and the like will not step on other apps, but if you want to, you still can.
Also, CE has the concept of shared memory. All processes have access to this area and it is 100% unprotected. Your app may be using shared memory without knowing it (the memory manager can give you a shared address without you specifically asking, depending on your allocation and its size). If you have shared memory then yes, any process can access that data, including corrupting it, and you will get no error or warning in either process.
Does the WM operating system protect each process's memory from other processes?
Yes.
Can one badly written application crash another application just by mistakenly writing over its memory?
No (but it might do other things like use up all the 'disk' space).
Even if you're a device driver, to get permission to write to memory that's owned by a different process there's an API which you must invoke explicitly.
While ChrisW's answer is technically correct, my experience of Windows Mobile is that it is much easier to crash the entire device from an application than it is on the desktop. I could guess at a few reasons why this is the case:
The operating system is often much more heavily OEMed than desktop Windows; that is, the amount of manufacturer-specific low-level code can be very high, which leads to manufacturer-specific bugs at a level that can cause bad crashes. On many devices it is common to see a new firmware revision every month or so, where the revisions are fixes for such bugs.
Resources are scarcer, and an application that exhausts all available resources is liable to cause a crash.
The protection mechanisms and architecture vary quite a bit. The device I'm currently working with is SH4 based, while you mostly see ARM, x86 and the odd MIPS CPU.

Resources