Perfino System Requirements - perfino

We're planning to evaluate and potentially purchase perfino. I went quickly through the docs and cannot find the system requirements for the installation. I also cannot find its compatibility with JBoss 7.1. Can you provide details please?

There are no hard system requirements for disk space; it depends on the number of business transactions that you're recording. All data will be consolidated, so the database reaches a maximum size after a while, but it's not possible to say what that size will be. Consolidation times can be configured in the general settings.
There are also no hard system requirements for CPU and physical memory. A low-end machine will have no problems monitoring 100 JVMs, but the exact details again depend on the number of monitored business transactions.
JBoss 7.1 is supported. "Supported" means that web service and EJB calls can be tracked between JVMs, otherwise all application servers work with perfino.

I haven't found any official system requirements, but this is what we figured out experimentally.
We collect about 10,000 transactions a minute from 8 JVMs. We have a lot of distinct and long SQL queries. We use an AWS machine with 2 vCPUs and 8GB RAM.
When the Perfino GUI is not being used, the CPU load is low. However, for the GUI to work properly, we had to modify perfino_service.vmoptions:
-Xmx6000m. Before that we experienced multiple OutOfMemoryErrors in Perfino when filtering in the transactions view. After changing the memory settings, the GUI runs fine.
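For reference, the only change we made was in the perfino_service.vmoptions file in the installation directory; the value below is what worked for our load and will likely need tuning for yours:

-Xmx6000m

Since the JVM reads these options at startup, the perfino service has to be restarted for the new heap limit to take effect.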
This means that you need a machine with about 8GB RAM. I guess this depends on the number of distinct transactions you collect. Our limit is high, at 30,000.
After 6 weeks of usage, there's 7GB of files in the perfino directory. Perfino can clear old recordings after a configurable time.

Related

AWS server became slow after traffic increase

I have a single-page Angular app that makes requests to a Rails API service. Both are running on a t2xlarge Ubuntu instance. I am using a Postgres database.
We had an increase in traffic, and my Rails API became slow. Sometimes I get an error saying the Passenger queue is full for the Rails application.
Auto scaling on the server is working; three more instances are created. But I cannot trace this issue. I need root access to upgrade, which I do not have. Please help me with this.
You mentioned that you are using the T2.2xlarge instance type. First, I want to tell you that you should not use the T2 instance type for a production environment, because T2 instances use CPU credits. Let's take a look at this:
What happens if I use all of my credits?
If your instance uses all of its CPU credit balance, performance remains at the baseline performance level. If your instance is running low on credits, your instance's CPU credit consumption (and therefore CPU performance) is gradually lowered to the base performance level over a 15-minute interval, so you will not experience a sharp performance drop-off when your CPU credits are depleted. If your instance consistently uses all of its CPU credit balance, we recommend a larger T2 size or a fixed performance instance type such as M3 or C3.
I'm not sure whether you will run into the CPU credit problem, since you are using an xlarge type, but I think you should consider one of the fixed performance instance types. So the instance's performance may be one part of your problem. You should use CloudWatch to monitor two metrics, CPUCreditUsage and CPUCreditBalance, to confirm whether this is the issue.
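If you want to pull those two metrics programmatically instead of through the console, a minimal boto3 sketch (Python) could look like the following; the region, instance ID and time window are placeholders you would replace with your own:

# Sketch: fetch average CPUCreditUsage / CPUCreditBalance for one instance.
from datetime import datetime, timedelta, timezone
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")  # placeholder region
end = datetime.now(timezone.utc)
start = end - timedelta(hours=6)

for metric in ("CPUCreditUsage", "CPUCreditBalance"):
    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName=metric,
        Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # placeholder
        StartTime=start,
        EndTime=end,
        Period=300,             # 5-minute datapoints
        Statistics=["Average"],
    )
    for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
        print(metric, point["Timestamp"], round(point["Average"], 2))

If CPUCreditBalance trends toward zero while CPUCreditUsage stays high, the burstable credits are the bottleneck.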
Secondly, how about your Auto Scaling group? After scaling out, did your service become stable? If so, I think you do not need to worry about this problem any more, because the ASG did its job.
Please check the following
If you are opening a connection to Database, make sure you close it.
If you are using jQuery, Bootstrap, DataTables, or other CSS/JS libraries, use the CDN links, like
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/bootstrap-select/1.12.4/css/bootstrap-select.min.css">
This will reduce a great amount of load on your server. Do not copy jQuery or other external libraries onto your own server when you can fetch them directly from other servers.
There are a number of factors that can cause an EC2 instance (or any system) to appear to run slowly.
CPU Usage. The higher the CPU usage, the longer it takes to process new threads and processes.
Free Memory. Your system needs free memory to process threads, create new processes, etc. How much free memory do you have?
Free Disk Space. Operating systems tend to thrash when the file systems on system drives run low on free disk space. How much free disk space do you have? (A quick local check for memory and disk is sketched after this list.)
Network Bandwidth. What is the average bytes in / out for your instance?
Database. Monitor connections, free memory, disk bandwidth, etc.
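The following is only a quick local sketch for the free-memory and free-disk questions in the list above; it assumes the third-party psutil package is installed and is not a substitute for CloudWatch:

# Sketch: quick one-off resource check on the instance itself (requires psutil).
import psutil

print("CPU usage %:", psutil.cpu_percent(interval=1))
mem = psutil.virtual_memory()
print("Free memory (MB):", mem.available // (1024 * 1024))
disk = psutil.disk_usage("/")
print("Free disk (GB):", disk.free // (1024 ** 3))
net = psutil.net_io_counters()
print("Bytes out / in since boot:", net.bytes_sent, net.bytes_recv)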
Amazon has CloudWatch which can provide you with monitoring for everything except for free disk space (you can add an agent to your instance for this metric). This will also help you quickly see what is happening with your instances.
Monitor your EC2 instances and your database.
You mention T2 instances. These are burstable CPUs, which means that if you have consistently higher CPU usage, then you will want to switch to fixed performance EC2 instances. CloudWatch should help you figure out what you need (CPU, memory, disk, or network performance).
This is totally independent of the AWS server. It looks like your software needs more resources (RAM, storage I/O, network) than a single machine can provide. You need to evaluate the metrics using CloudWatch and adjust the resources based on what the software requires.
Memory leaks or processing leaks may lead to this as well. You need to create a cluster or server farm to handle the load.
Hope it helps.

Best setting for HTTPJVMMaxHeapSize in Domino 8.5.3 64 Bit

I am trying to find a definitive answer as to what the best setting is for the JVM Heap Size in Domino 8.5.3 FP4 - 64 Bit for Windows.
I know that by default it's set to 1024M. Some websites have suggested that the recommended value is 1G / 1024M - but that's the default setting, so is that as good as it gets?
Other sites have said 25% of available RAM.
My Domino server has 12GB RAM available. It currently has HTTPJVMMaxHeapSize = 1024M, and Task Manager tells me that around 7GB of RAM is in use and nhttp.exe is using around 1.1GB. I want to increase this heap to 2GB or 3GB if possible - will there be any issues doing so?
I'm running Windows Server 2008 R2 Standard Edition.
Speaking from an XPages perspective:
1 - Understand the dynamics of the working set and hardware
That is, understand what the application code, server runtime, and hardware profile are doing when processing a given working set (ie: the XPages application(s) within the server). Is the application coded in a non-optimal manner in terms of lifecycle execution and memory usage? Is the application making use of memory or disk persistence for component tree serialization? Is the server assigned an adequate amount of JVM memory? Is the hardware providing enough CPU and memory?
2 - Profile and monitor the working set with upper limit loads
To fully answer some of the questions in #1, detailed performance and scalability profiling must be carried out using tools like the XPages Toolbox and Eclipse Memory Analyzer. Furthermore, test the working set using Rational Performance Tester (or some other performance testing tool) to mimic real-life concurrent workloads in a test environment. This allows you to set up a test environment where you can hit your application with (n) number of concurrent users using automation and collect all that invaluable data on health etc.
3 - Analyse the profile information to identify bottlenecks within the working set
Remember your working set can be one or more applications. Each doing something different, and having different load requirements. Be specific about the task at hand - do you want to tune the system more generally for all applications (for an average scale) or do you want the server to be fully optimized for a specific application (for a targeted scale)?
4 - Optimize the working set where applicable
Get in and make changes to the XSP / Java / Server-Side JavaScript code where applicable - use your knowledge of the XPages Request Processing Lifecycle and also look out for those hungry JVM memory consumers under high-end load scenarios. Always favor disk persistence (disk storage is cheaper than RAM!) and code your custom Java objects and Managed Beans accordingly to cope with serialization and restoration. And make the trade-offs between function and speed in those scenarios where high scalability ends up burning CPU... a smarter UX with targeted functions / actions etc.
5 - Scale the hardware where applicable
Be prepared to increase cores, clock speeds, RAM, disk storage based on the needs of the working set - a cyclic approach to profiling, monitoring, and optimizing the working set will shed more and more light on the suitability of the hardware as this process evolves.
6 - Repeat from #2 until the working set and hardware perform and scale to the expected requirements / load expectancy of the system
I would recommend setting the server's notes.ini parameter JavaVerboseGC=1, with console output going into console.log in the IBM_TECHNICAL_SUPPORT directory. After some time, take that log file and analyze it in IBM Support Assistant with the tool IBM Monitoring and Diagnostic Tools for Java - Garbage Collection and Memory Visualizer (GCMV).
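For reference, the relevant notes.ini lines would look something like the following; the heap value is only an example matching the 2GB increase discussed in the question, and you should verify the exact behaviour against the Domino documentation for your version:

HTTPJVMMaxHeapSize=2048M
JavaVerboseGC=1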
This could help with interpreting the collected data: http://www-01.ibm.com/support/docview.wss?uid=swg27013824&aid=1.
You definitely should do that if you get an OutOfMemoryException.
A heap size set too high can lead to fewer GC runs that each take longer - the server will periodically "hang" for a very short time. So setting it too high is not recommended.

Why is membase server so slow in response time?

I have a problem: membase is being very slow in my environment.
I am running several production servers (Passenger) on Rails 2.3.10 / Ruby 1.8.7.
Those servers communicate with 2 membase machines in a cluster.
The membase machines each have 64GB of memory, a 100GB EBS volume attached, and 1GB of swap.
My problem is that membase is being VERY slow in response time and is actually the slowest part right now in all of the application lifecycle.
My question is: why?
The Rails gem I am using is memcache-northscale.
The membase server is 1.7.1 (latest).
The server is doing between 2K-7K ops per second (for the cluster).
The response time from membase (based on NewRelic) is 250ms on average, which is HUGE and unreasonable.
Does anybody know why this is happening?
What can I do in order to improve this time?
It's hard to immediately say with the data at hand, but I think I have a few things you may wish to dig into to narrow down where the issue may be.
First of all, do your stats with membase show a significant number of background fetches? This is in the Web UI statistics for "disk reads per second". If so, that's the likely culprit for the higher latencies.
You can read more about the statistics and sizing in the manual, particularly the sections on statistics and cluster design considerations.
Second, you're reporting 250ms on average. Is this a sliding average, or overall? Do you have something like max 90th or max 99th latencies? Some outlying disk fetches can give you a large average, when most requests (for example, those from RAM that don't need disk fetches) are actually quite speedy.
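To illustrate how a handful of slow disk fetches can hide behind an average, here is a small sketch with made-up latency numbers that compares the mean against the 90th and 99th percentiles:

# Sketch: mostly fast RAM hits plus a few slow disk fetches (all numbers invented).
latencies = [2, 3, 2, 4, 3, 2, 3, 5, 2, 3] * 90 + [1800, 2000, 2500] * 10

def percentile(values, pct):
    ordered = sorted(values)
    index = max(0, int(round(pct / 100.0 * len(ordered))) - 1)
    return ordered[index]

mean = sum(latencies) / float(len(latencies))
print("mean      %.1f ms" % mean)                      # dragged way up by the outliers
print("90th pct  %d ms" % percentile(latencies, 90))   # still in the single digits
print("99th pct  %d ms" % percentile(latencies, 99))   # exposes the slow disk fetches

Here the mean lands around 70 ms even though more than 9 out of 10 requests finish in 5 ms or less, which is exactly the kind of skew the 90th/99th percentile figures would expose.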
Are your systems spread throughout availability zones? What kind of instances are you using? Are the clients and servers in the same Amazon AWS region? I suspect the answer may be "yes" to the first, which means about 1.5ms overhead when using xlarge instances from recent measurements. This can matter if you're doing a lot of fetches synchronously and in serial in a given method.
I expect it's all in one region, but it's worth double checking since those latencies sound like WAN latencies.
Finally, there is an updated Ruby gem, backwards compatible with Fauna. Couchbase, Inc. has been working to contribute the changes back to Fauna upstream. If possible, you may want to try the gem referenced here:
http://www.couchbase.org/code/couchbase/ruby/2.0.0
You will also want to look at running Moxi on the client side. To access Membase, you need to go through a proxy (called Moxi). By default, it's installed on the server, which means you might make a request to one of the servers that doesn't actually have the key. Moxi will go get it... but then you're doubling the network traffic.
Installing Moxi on the client-side will eliminate this extra network traffic: http://www.couchbase.org/wiki/display/membase/Moxi
Perry

Passenger server upgrade: Processor (CPU) Cores VS Ram?

I went through the documentation of Passenger to find out how many application instances it can run with respect to the hardware configuration. The documentation only talks about RAM:
The optimal value depends on your system’s hardware and the server’s average load. You should experiment with different values. But generally speaking, the value should be at least equal to the number of CPUs (or CPU cores) that you have. If your system has 2 GB of RAM, then we recommend a value of 30. If your system is a Virtual Private Server (VPS) and has about 256 MB RAM, and is also running other services such as MySQL, then we recommend a value of 2.
It says the minimum value can be the number of CPUs/CPU cores we have. I have a VPS with one vCPU and 1GB RAM, and my service provider has an option to upgrade just the RAM. I'm wondering how far I can keep upgrading only the RAM. How important is it to upgrade the number of CPUs?
Quick Answer
Depends on what resources are the bottleneck for your app.
Long answer
You'll need to factor in a few things:
How much CPU time does your app need?
How much RAM does any given instance of your app use at peak load?
Does your app spend a lot of time doing IO intensive tasks? (ie: db and file reads/writes, network communication)
There can be other things to factor in, but your bottlenecks will probably be one of the above. If RAM is your main bottleneck, by all means use your newly available RAM. However, if it turns out that your app is being slowed down by CPU availability or flooded IO, no amount of RAM is going to speed things up.
On the topic of CPU cores; my understanding is that the main Apache process that runs Passenger is a single threaded process. Apache spawns new threads to handle concurrency on an as-needed basis. Each additional CPU core theoretically allows you to spawn x*n threads, where x is the number of threads you can optimally run under a single CPU core and n is the number of CPU cores available to Apache.
Disclaimer: I'm not very well read on Passenger internals; though this logic usually holds true for other kinds of Apache configurations.
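For completeness, if you do experiment with the pool size from the quoted documentation, the Apache directive involved is PassengerMaxPoolSize; a sketch for a 1 vCPU / 1GB VPS would be something like the line below, where the value is only illustrative, along the lines of the small-VPS example in the quote:

PassengerMaxPoolSize 2

Raise it gradually while watching memory and CPU rather than jumping straight to a large value.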

Monitoring CPU Core Usage on Terminal Servers

I have Windows 2003 terminal servers, multi-core. I'm looking for a way to monitor individual CPU core usage on these servers. It is possible for an end user to have a run-away process (e.g. Internet Explorer or Outlook). The core for that process may spike to near 100% while leaving the other cores 'normal'. Thus, the overall CPU usage on the server is just the average across all the cores; if 7 of the cores on an 8-core server are idle and the 8th is running at 100%, that shows up as 1/8 = 12.5% usage.
What utility can I use to monitor multiple servers? If the CPU usage for a core is "high", what would I use to determine the offending process, and then how could I automatically kill that process if it was on the 'approved kill process' list?
A product from http://www.packettrap.com/ called PT360 would be perfect, except they use SNMP to get data, and SNMP appears to only give total CPU usage; it's not broken out by individual core. Take a look at their Dashboard option with the CPU gauge 'gadget'. That's exactly what I need, if only it worked at the core level.
Any ideas?
Individual CPU usage is available through the standard Windows performance counters. You can monitor this in perfmon.
However, it won't give you the result you are looking for. Unless a thread/process has been explicitly bound to a single CPU, a run-away process will not spike one core to 100% while all the others idle. The run-away process will bounce around between all the processors. I don't know why Windows schedules threads this way; presumably because there is no gain from forcing affinity and some loss due to having to handle interrupts on particular cores.
You can see this easily enough just in task manager. Watch the individual CPU graphs when you have a single compute bound process running.
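If you would rather pull the same per-core counters from the command line (for example to script checks across several servers), something along these lines should work; the sample interval and count are arbitrary, and typeperf also accepts -s <computername> to point it at a remote machine:

typeperf "\Processor(*)\% Processor Time" -si 5 -sc 12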
You can give Spotlight on Windows a try. You can graphically drill into all sorts of performance and load indicators. It's freeware.
perfmon from Microsoft can monitor each individual CPU. perfmon also works remotely, and you can monitor various aspects of Windows.
I'm not sure it helps to find run-away processes, because the Windows scheduler does not always execute a process on the same CPU -> on your 8-CPU machine you will see 12.5% usage on all CPUs if one process runs away.
