Docker containers running terribly slow on bare metal, but system resources are under-utilised - docker

We have multiple bare-metal servers that are part of our Docker and OpenShift (Kubernetes) cluster. For some reason the underlying pods are extremely slow, but only on the BM nodes; the traditional VMs hosted on ESXi servers work flawlessly. The pods take very long to come up, and liveness probes fail often. The BM nodes have 72 cores, 600 GB RAM and 2 network ports, and are under-utilised: the load average sits at just 10 ~ 20 and free RAM stays over 300 ~ 400 GB at all times. sar output looks normal, and /var/log/messages has nothing unusual. Not able to nail down what's causing the slowness..
Is there a Linux/Docker command that will help here, and what do I look for? Could this be a noisy-neighbour problem, or do I need to tweak some kernel parameter(s)? The slowness is always there, it's not intermittent. We have worked closely with RH support and got nothing from that exercise. Any suggestions welcome..
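For anyone debugging something similar, a minimal first-pass sketch of things worth checking on a "slow but idle" bare-metal node is the CPU frequency governor, the NUMA layout and cgroup CPU throttling; the cgroup path below is only an assumption and depends on the cgroup driver and version:

    # CPU frequency scaling - bare metal often ships with a power-saving BIOS/governor profile
    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
    cpupower frequency-info          # from the kernel-tools package on RHEL/CentOS

    # NUMA layout - a 72-core box is almost certainly multi-socket
    numactl --hardware
    numastat -m

    # cgroup CPU throttling - tight pod CPU limits can throttle even an otherwise idle node
    cat /sys/fs/cgroup/cpu/kubepods.slice/cpu.stat   # path varies; look at nr_throttled / throttled_time
    docker stats --no-stream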

Related

How big can a GKE container image get before it's a problem?

This question is admittedly somewhat vague. If you have suggestions how to better word it, please by all means, give me feedback...
I want to understand how big a GKE container image can get before there may be problems, either serious or minor. For example, I've built a docker image (not deployed yet) that is 683 MB.
(As an aside, the reason it's so big is that I'm running a computer vision library licensed from a company with certain attributes: (1) uses native libraries that are not compatible with Alpine; (2) uses Java; (3) uses Node.js to run a required licensing daemon in same container; (4) has some very large machine learning model files.)
Although the service will have auto-scaling enabled, I expect the auto-scaling to be fairly light. It might add a new pod occasionally, but not major spikes up and down.
The size of the container will determine how many resources to assign to it, and thus how much CPU, memory and disk space your nodes must have. I have seen containers require over 2 GB of memory and still work fine within the cluster.
There probably is an upper limit, but the containers would have to be enormous; your container size should not pose any issues aside from possibly slower container startup.
In practice, you're going to have issues pushing an image to GCR before you have issues running it on GKE, but there isn't a hard limit outside the storage capabilities of your nodes. You can get away with O(GB) pretty easily.
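As a rough sanity check before pushing, you can look at the local on-disk size and the compressed size (which is roughly what gets uploaded to the registry); my-image below is just a placeholder tag:

    docker image ls my-image --format '{{.Repository}}:{{.Tag}}  {{.Size}}'
    # compressed size, roughly what a push to GCR will transfer
    docker save my-image | gzip | wc -c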

What is the Impact of having more replicas in Docker Swarm mode?

I understand the use of replicas in Docker Swarm mode. It is mainly to eliminate points of failure and reduce the amount of downtime. It is well explained in this post.
Since having more replicas is more useful for a system as a whole, why don't companies just initialise as many replicas as possible, e.g. 1,000 replicas for a Docker service? I can imagine a large corporation running a back-end system may face multiple points of failure at any given time, and it would benefit from having more instances of a particular service.
I would like to know how many replicas are considered too many, and what factors affect the performance of a Docker Swarm?
I can think of hardware overhead being a limiting factor.
Let's say you're running a Rails app. Each instance requires 128 MB of RAM and 10% CPU usage. Nine instances is a touch over 1 GB of memory and nearly a full CPU.
While that does not sound like a lot, imagine an organization with 100+ teams, each with 3, 4, 5 applications. The hardware requirements to operate those applications at acceptable levels quickly ramp up.
Then there is network chatter. 10 MB/s is typical in big org/corp settings. While a heartbeat check for a couple of instances is barely noticeable, heartbeats across hundreds of instances can jam up the network.
At the end of the day it comes down to the constraints. What are the boundaries of the software, hardware, environment, budget, and support systems? It is often hard to imagine the pressures present when (technical) decisions are made.
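As a rough sketch of how the arithmetic above maps onto Swarm, the per-replica cost can be made explicit with reservations and limits at service creation time (the names and numbers here are illustrative):

    # 9 replicas, each reserving 128 MB of RAM and a tenth of a core
    docker service create --name rails-app \
      --replicas 9 \
      --reserve-memory 128M --limit-memory 256M \
      --reserve-cpu 0.1 --limit-cpu 0.25 \
      myorg/rails-app:latest
    # total reservation the scheduler has to find: ~1.1 GB of RAM and 0.9 of a CPU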

How to emulate 500-50000 worker (docker) nodes network?

So I have a worker Docker image. I want to spin up a network of 500-50000 nodes to emulate what happens to a private blockchain such as Ethereum at different scales. What would be a recommendation for an open-source tool/library for such a job:
a) one that would make sure that even on a low-endish setup (say one 40-core node) all workers are moved forward in time equally (not real time)
b) one that would allow (a) in a distributed setting (say 10 low-endish nodes on a single LAN)
In other words, I am not looking for real-time network emulation, so I can wait 10 hours to simulate 1 minute and that would be good enough for me. I thought about Kathara, yet one problem still stands - how do I make sure that, say, 10000 containers are given the same amount of ticks in a round-robin manner?
So how do I emulate a complex network of Docker workers?
I'm working on the assumption that you will run each worker inside of a container. To ensure each container runs with similar CPU access, you can configure CPU reservations and limits on each replica. These numbers get computed down to fractional slices of a core, so on an 8-core system you could give each container 0.01 of a core to run upwards of 800 containers. See the compose documentation on how to set resource constraints. And with swarm mode, you could spread these replicas across multiple nodes, sharing a network.
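A minimal single-node sketch of that fractional-CPU idea, using plain docker run (the image name is a placeholder; the same constraints can be expressed in a compose file or as swarm service limits):

    # cap each worker at 1% of a core and a small memory budget
    docker run -d --name worker-1 --cpus 0.01 --memory 64m myorg/sim-worker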
That said, I think the advice to run shorter simulations on more hardware is good. You will find a significant portion of the time is spent in context switching between each process, possibly invalidating any measurements you want to take.
You will also encounter scalability issues with Docker and the orchestration tool you choose. For example, you'll need to adjust the subnet size for any shared network, which defaults to a /24 with around 253 available IPs. The Docker engine itself will likely be spending a non-trivial amount of CPU time maintaining the state for all of the running containers.
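For the subnet point, a hedged example of creating the shared network with a larger address space up front (the network name and range are illustrative):

    # a /16 gives ~65k usable addresses instead of the default /24's ~253
    docker network create --driver overlay --attachable --subnet 10.123.0.0/16 sim-net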

Docker container hosting

Does anyone know if there is a Docker hosting provider where you can just rent resources per container? All the providers I know of require you to set up machines/nodes yourself first, so they are renting out machines, not container resources.
I need to run 50 to 200 containers that each need between 600 and 1000 MB of memory, but only for a few hours per day. When I look at Amazon, Google, DigitalOcean, Linode and others, they have a weird pricing structure: the more you pay, the less you get per dollar. More expensive machines offer less memory and fewer processors per dollar, while the smallest and cheapest machines seem to give you the most RAM and CPU per dollar.
This makes it harder to provision the machines. Using Docker Swarm does not add value, as I need one container per machine (to get the best price/performance). So I would really like to be able to rent per container, not per machine/node, but as far as I know nobody is offering that yet.
Not sure if you're still looking for a container service like you describe, but Cycle does exactly what you're looking for. It's super simple to use, and your containers run on bare metal for top performance.
I'm the CTO so if you have any questions or anything let me know.

Random Inode/Ram Cache Drops in CentOS

I run a CentOS 5.7 machine (64-bit) with 24 GB RAM and 4x SAS drives in a RAID10 setup.
This machine runs nginx/1.0.10, php-fpm and xcache. About a month back the RAM usage of this machine changed.
About every few hours the 'CACHE' is flushed from RAM, and this happens exactly when the 'Inode table usage' drops. I'm pretty sure these drops are related (see the 2 attached images).
This server hosts quite a lot of small files (20M of them, each only a few KB). Not many files are deleted (maybe 100 per hour, a few MB total at most) - not enough to account for the huge inode table drops.
I also have no crons running which could cause these drops.
sar -r output: http://pastebin.com/C4D0B79i
My question: why are these huge RAM/inode usage drops happening? How can I get nginx/PHP to use all of my server's RAM?
EDIT: I have put my configs here: http://pastebin.com/iEWJchc4 and the output of lsof here: http://hostlogr.com/lsof.txt. The thing I do notice is the VERY large number of php-fpm processes that go to /dev/zero, which is specified in my xcache configuration. Could that possibly be wrong?
Solved it by setting vm.zone_reclaim_mode = 0.
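For anyone who lands here with the same symptoms, a sketch of checking and applying that setting (and persisting it across reboots):

    # current value - 1 means the kernel prefers reclaiming local NUMA-node memory (dropping caches) over using remote nodes
    cat /proc/sys/vm/zone_reclaim_mode
    # apply immediately
    sysctl -w vm.zone_reclaim_mode=0
    # persist across reboots
    echo 'vm.zone_reclaim_mode = 0' >> /etc/sysctl.conf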
