We have enabled clustering on the two Ejabberd servers, but we still get a CPU overload alert once about 78 sessions (around 156 users) are connected, and the server hangs.
Since the alert appears after around 150+ users are connected, which hardware resources (memory, CPU, etc.) could we increase to resolve this issue?
Ejabberd Version: 17.01
CPU Count: 4 (each server)
Memory: 8GB (each server)
You get CPU overload with just 78 clients connected to each node? Something is clearly wrong there!
Are the clients just connected, or are they sending many messages?
Do the accounts have a small roster, or do they have thousands of contacts?
What happens if only one node is used, outside the cluster: does it handle many more accounts, or does the CPU overload just as it does in the cluster?
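To help narrow this down, here is a minimal monitoring sketch (assuming a Linux host with ejabberdctl on the PATH) that samples the session count next to the 1-minute load average while you ramp clients up, so you can see at which point the load climbs:

    import subprocess, time

    # Sample ejabberd's session count alongside the 1-minute load average
    # every 10 seconds; assumes ejabberdctl is on the PATH of this host.
    while True:
        sessions = subprocess.check_output(
            ["ejabberdctl", "connected_users_number"]).decode().strip()
        with open("/proc/loadavg") as f:
            load1 = f.read().split()[0]
        print(f"sessions={sessions} load1={load1}")
        time.sleep(10)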
I am running some tests on Hyperledger Fabric.
In one experiment I run a single organization of 16 peers and invoke some functions on each peer. In another experiment I run 8 organizations of 2 peers each and invoke some functions on each peer.
One of the measured metrics is the difference in RAM usage of all containers before and after all the invoke functions.
With one organization I get about 1GB of extra RAM usage; with 8 organizations I get about 6GB. Does anyone know the reason for this behaviour?
The invoke functions all store the same data in the blockchain.
I have also tried 2 organizations with 8 peers each and get the same result: as soon as the number of organizations increases, the RAM usage skyrockets.
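For reference, here is a minimal sketch of the measurement itself, using the Docker SDK for Python (the docker package); summing the per-container deltas is my assumption of how the aggregate figure is computed:

    import docker

    client = docker.from_env()

    def mem_usage(container):
        # One-shot stats snapshot; 'usage' is the container's memory use in
        # bytes (field availability can vary by platform and cgroup setup).
        return container.stats(stream=False)["memory_stats"]["usage"]

    before = {c.name: mem_usage(c) for c in client.containers.list()}
    # ... run all the invoke functions here ...
    after = {c.name: mem_usage(c) for c in client.containers.list()}

    delta = sum(after[n] - before[n] for n in after if n in before)
    print(f"extra RAM across all containers: {delta / 2**30:.2f} GiB")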
My InfluxDB measurement has 24 field keys and 5 tag keys.
When I run 'select last(cpu) from mymeasurement', I see the following:
When no clients are writing data into it, it takes around 2 seconds to get the result.
But when I run 95 clients writing data into it (every 5 seconds), the query takes more than 10 seconds to return a result. Is this normal?
Note:
My system is a CentOS 7 VM on XenServer with 4 vCPUs and 8 GB RAM; top shows about 30% CPU while the clients are writing data.
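For what it's worth, here is a minimal sketch (using the influxdb Python client; the host, port and database name are assumptions) to time the query reproducibly while the writers are running:

    import time
    from influxdb import InfluxDBClient

    # Assumed connection details; adjust to your setup.
    client = InfluxDBClient(host="localhost", port=8086, database="mydb")

    start = time.monotonic()
    result = client.query("select last(cpu) from mymeasurement")
    print(f"query took {time.monotonic() - start:.2f}s")
    print(list(result.get_points()))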
Some ideas:
Check your vCPU configuration on other VMs running on the same host. Other VMs you might have that don't need the extra vCPUs should only be configured with one vCPU, for a latency boost.
If your DB server requires 4 vCPUs and your host already has very little CPU% used during queries, you might want to check the storage and memory configurations of the VM in case your server is slow due to swap partition use, especially if your swap partition is located on a Virtual Disk over the network via iSCSI or NFS.
It might also be a memory allocation issue within the VM and server application. If you have XenTools installed on the VM, try on a system without the XenTools installed to rule out latency issues related to the XenTools driver.
I am new to Zabbix and am using version 3.4. I have installed the server on Linux and want to monitor the status of a Windows service using its Windows agent.
I can get the status of the service using the key below:
service.info[<serviceName>,state]
It returns the proper status of the service. Now I want to check how much CPU and how much memory the process uses.
I tried some keys, but they do not return proper values:
perf_counter[\Process(<processName>)\% User Time] // to get CPU utilization by process
proc_info[<processName>,wkset] // to get memory utilize by process
system.cpu.util[,system,avg5] // to get total CPU utilization
vm.memory.size[available] // to get total RAM utilization
None of the above work properly. I tried other keys as well, but the agent logs say they are unsupported. I checked the forum and searched Google but found nothing.
Usually there isn't a direct match between a Windows service and a specific process.
A service may spawn N processes for its internals and can also spawn additional processes to handle incoming connections, log requests, and so on.
Think about a classic httpd server: you should find at least one master process, various pre-forked server processes, and php/php-fpm processes for the current requests.
Regarding the keys you provided, what do you mean by "not working properly"?
You can refer to Zabbix documentation for Windows-specific items for the exact syntax of the items and the meaning of the return values.
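For example, per-process items on Windows usually look like the following (the process name is a placeholder; note that perf counter instance names typically omit the .exe extension, while proc_info expects the executable name, so check both forms against your system):

perf_counter["\Process(myprocess)\% Processor Time"]
proc_info[myprocess.exe,wkset]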
You can use the Zabbix item for average CPU utilization over 5 minutes:
system.cpu.util[,,avg5]
This gives you the average CPU usage over 5 minutes on the Windows server. You can then create an appropriate trigger for it.
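For example, a trigger that fires when the 5-minute average exceeds 80% might look like this in the Zabbix 3.x trigger expression syntax (the host name is a placeholder):

{WindowsHost:system.cpu.util[,,avg5].last()}>80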
We operate two dual-node brokers, each broker having quite different queues and workloads. Each box has 24 cores (H/T) worth of Xeon E5645 @ 2.4GHz with 48GB RAM, connected by Gigabit LAN with ~150μs latency, running RHEL 5.6, RabbitMQ 3.1, Erlang R16B with HiPE off. We've tried with HiPE on, but it made no noticeable performance impact and was very crashy.
We appear to have hit a ceiling for our message rates of between 1,000/s and 1,400/s both in and out. This is broker-wide, not per-queue. Adding more consumers doesn't improve throughput overall, just gives that particular queue a bigger slice of this apparent "pool" of resource.
Every queue is mirrored across the two nodes that make up the broker. Our publishers and consumers connect equally to both nodes in a persistent way. We also notice an ADSL-like asymmetry in the rates: if we manage to publish a high rate of messages, the deliver rate drops to high double digits. Testing with an un-mirrored queue gives much higher throughput, as expected. Queues and exchanges are durable; messages are not persistent.
We'd like to know what we can do to improve the situation. The CPU on the box is fine, beam takes a core and a half for 1 process, then another 80% each of two cores for another couple of processes. The rest of the box is essentially idle. We are using ~20GB of RAM in userland with system cache filling the rest. IO rates are fine. Network is fine.
Is there any Erlang/OTP tuning we can do? delegate_count is the default 16, could someone explain what this does in a bit more detail please?
This is difficult to answer without knowing more about how your producers and consumers are configured, which client library you're using, and so on. As discussed on IRC (http://dev.rabbitmq.com/irclog/index.php?date=2013-05-22) a minute ago, I'd suggest you attempt to reproduce the topology using the MulticastMain Java load test tool that ships with the RabbitMQ Java client. You can configure multiple producers/consumers, message sizes, and so on. I can certainly get 5kHz out of a two-node cluster with HA on my desktop, so this may be a client (or application code) related issue.
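If Java isn't convenient, a rough publish-rate sanity check can also be scripted with the pika Python client (the host, queue name and message count below are assumptions; point it at one broker node at a time):

    import time
    import pika

    # Assumed connection details; adjust to one of your broker nodes.
    conn = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    ch = conn.channel()
    ch.queue_declare(queue="throughput-test", durable=True)

    n = 10000
    body = b"x" * 256  # small, non-persistent messages, as in the setup described
    start = time.monotonic()
    for _ in range(n):
        ch.basic_publish(exchange="", routing_key="throughput-test", body=body)
    print(f"{n / (time.monotonic() - start):.0f} msg/s published")
    conn.close()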
I'm currently in the process of performance profiling. We have a basic client/server application. Would the TCP transfer speed be different if I ran client/server on the same machine (localhost) vs across two computers on a LAN?
Yes, the TCP transfer speed will be different, because when client and server run on the same computer, packets are forwarded locally over the loopback interface without ever touching the LAN or the network adapter.
But the overall speed of the client plus server may be better on different machines, especially if you do not communicate with the server too often.
When using localhost, local resources are more likely to be the performance bottleneck because of memory, disk, CPU, etc. When using two computers, it's more likely the network will be the bottleneck because of latency, bandwidth, throughput, packet loss, etc.
It depends on what your application does and how it uses the network, client, and server.
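To make that concrete, here is a minimal sketch (the port, payload size and chunk size are arbitrary) that times a bulk TCP transfer; run the server on localhost first, then on a LAN host, and compare the rates:

    import socket, sys, time

    PORT = 50007           # arbitrary test port
    CHUNK = 64 * 1024
    TOTAL = 100 * 2**20    # send 100 MiB in total

    if sys.argv[1] == "server":          # python bench.py server
        srv = socket.create_server(("", PORT))
        conn, _ = srv.accept()
        while conn.recv(CHUNK):          # drain until the client closes
            pass
    else:                                # python bench.py client <host>
        sock = socket.create_connection((sys.argv[2], PORT))
        buf = b"x" * CHUNK
        start = time.monotonic()
        for _ in range(TOTAL // CHUNK):
            sock.sendall(buf)
        sock.close()
        print(f"{TOTAL / (time.monotonic() - start) / 2**20:.1f} MiB/s")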
Yes, definitely: the latency of sending data across the network will slow the program down. The throughput won't necessarily change, but if you're waiting for replies before sending more data, the extra latency accumulates.
I just hit this issue on a project at work. Using UDP on localhost is at least an order of magnitude faster (maybe two) than over a network connection, and I believe that with localhost there is no MTU ceiling of 1500 as normally exists for network ports.
One unconfirmed suspicion is that the built-in network ports on PCs are not all the same quality, so even if they claim to be gigabit, you may not really be able to go that fast. It may also be that making lots of Windows system calls (one OS call per packet) is a significant overhead. With TCP I can hand the OS a large chunk of data to write in a single call; with UDP I have to hand it one packet at a time, limited by the MTU size, resulting in a much larger number of OS calls. But this is unconfirmed as yet.
Have not tried Linux yet.
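Here is a minimal sketch of the pattern being described (the peer address and ports are placeholders, and a listener is assumed to be running): UDP has to be handed to the OS one datagram at a time, bounded by the MTU, while TCP can take one large buffer in a single call:

    import socket

    HOST, UDP_PORT, TCP_PORT = "192.168.1.10", 9999, 9998  # placeholder peer
    data = b"x" * (16 * 2**20)   # 16 MiB payload
    MTU_PAYLOAD = 1472           # 1500-byte MTU minus IP and UDP headers

    # UDP: one sendto() system call per datagram.
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for i in range(0, len(data), MTU_PAYLOAD):
        udp.sendto(data[i:i + MTU_PAYLOAD], (HOST, UDP_PORT))

    # TCP: hand the kernel the whole buffer; it segments the stream itself.
    tcp = socket.create_connection((HOST, TCP_PORT))
    tcp.sendall(data)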
It really depends on what your application does....
As an example:
If it transfers 10GB files from client to server, then yes, it will make a difference.
I don't know if it is measurable (that also depends on your LAN's speed), but from a logical point of view, of course there is a difference. Localhost will always be the fastest, as the data does not have to be sent through another medium (like air or copper wire).
But depending on what your application does, this might or might not matter.
The transfer times would almost certainly be faster if the client and server were on the same machine. That may not actually matter to the performance of your program as a whole, depending on the other resources consumed by the client and server.