Scaling: Docker containers vs VMs

I'm trying to understand the benefits of scaling an app via docker.
As I understand it, if I have a dockerized app running in a single Docker container (or pod in Kubernetes), then to scale it, e.g. in the case of increased load, I would create multiple instances of the same container and load-balance requests across them. The thing is, if all these containers are on the same VM, aren't they all drawing from the same system resources?
What does scaling via Docker mean, then? Is it real scaling if all the container instances of your app are drawing on the resources of the same host machine?

You are correct that scaling up and running more instances of a container on a single host will cause them to compete for resources. You'd want a Kubernetes cluster with more than one worker ('minion') node. Having more than one container on a single host can still be useful, though: if the load is balanced properly and a single container crashes, the others will still be up, with no delay or downtime while the replacement container is created.
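As a minimal sketch, assuming a cluster with several worker nodes and a Deployment named myapp (both hypothetical), scaling it out and checking how the replicas are spread across nodes could look like this:

    # Hypothetical Deployment "myapp"; ask Kubernetes for 3 replicas.
    # With multiple worker nodes, the scheduler can place them on different hosts.
    kubectl scale deployment myapp --replicas=3

    # Show each pod and the node it was scheduled on (assumes an app=myapp label)
    kubectl get pods -l app=myapp -o wide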

Related

What are a cluster and a node, in the context of containers?

Sorry for this question, but I just started with Docker and Docker Compose and I really didn't need any of this until I read that I need to use Docker Swarm or Kubernetes to have more stability in production. I started reading about Docker Swarm and they mentioned nodes and clusters.
I was really happy not knowing about this, since my understanding of docker-compose was that I could manage my services/containers from a single file, and only have to run a few commands to launch, build, delete, etc. all my services based on the docker-compose configuration.
But now nodes and clusters have come up and I've gotten a bit lost, which is why I'm asking if you can help me understand this next step in the life of containers. I've been googling and it's still not very clear to me.
I hope you can help me and explain it to me in a way that I can understand.
Thank you!
A node is just a physical or virtual machine.
In the Kubernetes/Docker Swarm context, each node must have the relevant binaries installed (Docker Engine, kubelet, etc.).
A cluster is a grouping of one or more nodes.
If you have just been testing on your local machine you have a single node.
If you were to add a second machine and link both machines together using Docker Swarm/Kubernetes, you would have created a two-node cluster.
You can then use docker swarm/kubernetes to run your services/containers on any or all nodes in your cluster. This allows your services to be more resilient and fault tolerant.
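As a rough sketch of turning two machines into a two-node Swarm cluster (the address and token below are placeholders):

    # On the first machine: initialise the swarm (this machine becomes a manager)
    docker swarm init --advertise-addr <MANAGER-IP>

    # The init output prints a ready-made "docker swarm join" command with a token;
    # run it on the second machine to add it to the cluster as a worker
    docker swarm join --token <TOKEN> <MANAGER-IP>:2377

    # Back on the manager: list the nodes in the cluster
    docker node ls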
By default Docker Compose runs a set of containers on a single system. If you need to run more containers than fit on one system, or you're just afraid of that system crashing, you need more than one system to do it. The cluster is the group of all of the systems (physical computers, virtual machines, cloud instances) that are working together to run the containers. Each of those individual systems is a node.
The other important part of the cluster container setups is that you can generally run multiple replicas of a given container, and you don't care where in the cluster they run. Say you have five nodes, and a Web server container, and you'd like to run three copies of it for redundancy. Instead of having to pick a node, ssh to it, and manually docker run there, you just tell the cluster manager "run me three of these", and it chooses a node and launches the container for you. You can also scale the containers up and down at runtime, or potentially set the cluster to do the scaling on its own based on load.
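In Swarm terms, "run me three of these" might look like the following sketch (the service name and image are just examples):

    # Ask the swarm for three replicas of an nginx-based web service;
    # the manager decides which nodes run them.
    docker service create --name web --replicas 3 -p 80:80 nginx

    # Scale up or down at runtime
    docker service scale web=5

    # See which node each replica landed on
    docker service ps web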
If your workload is okay running a single copy of containers on a single server, you don't need a cluster setup. (You might have some downtime during updates or if the single server dies.) Swarm has the advantages of being bundled with Docker and being able to use Docker-native tools (docker-compose can deploy to a Swarm cluster). Kubernetes is much more complex, but at this point most public cloud providers will sell you a preconfigured Kubernetes cluster, and it has better stories around security, storage management, and autoscaling. There are also a couple other less-prominent alternatives like Nomad and Mesos out there.
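To illustrate the Swarm point above: reusing an existing Compose file against a Swarm cluster is typically done with docker stack deploy from a manager node. A sketch, assuming a docker-compose.yml file and a stack named mystack:

    # Deploy the services described in the Compose file across the cluster
    docker stack deploy --compose-file docker-compose.yml mystack

    # Inspect the resulting services and where their tasks were scheduled
    docker stack services mystack
    docker stack ps mystack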

I'm still confused by Docker containers and images

I know that containers are a form of isolation between the app and the host (the managed running process). I also know that container images are basically the package for the runtime environment (hopefully I got that correct). What's confusing to me is when they say that a Docker image doesn't retain state. So if I create a Docker image with a database (like PostgreSQL), wouldn't all the data get wiped out when I stop the container and restart? Why would I use a database in a Docker container?
It's also difficult for me to grasp LXC. On another question page I see:
LinuX Containers (LXC) is an operating system-level virtualization
method for running multiple isolated Linux systems (containers) on a
single control host (LXC host)
What does that exactly mean? Does it mean I can have multiple versions of Linux running on the same host as long as the host support LXC? What else is there to it?
LXC and Docker are quite different, but both are container technologies.
There are two types of containers:
1. Application containers: their main purpose is to package an application together with its dependencies. These are Docker containers (lightweight containers). They run as processes on your host and do the job you want. They don't need an OS image or a boot-up phase; they come and go in a matter of seconds. You normally run a single process/service per Docker container; you can run multiple processes inside one if you want, but it is laborious. Here, resources (CPU, disk, memory) are shared with the host.
2. System containers: these are "fat" containers, meaning they are heavier and need an OS image to launch themselves; at the same time they are not as heavy as virtual machines. They are very similar to VMs but differ a bit in architecture.
For example, say Ubuntu is the host machine: if you have LXC installed and configured on that Ubuntu host, you can run a CentOS container, an Ubuntu container (of a different version), a RHEL container, a Fedora container, or any other Linux flavour on top of the Ubuntu host. You can also run multiple processes inside an LXC container. Resource sharing happens here as well.
So, if you have a huge application running in one LXC container that requires more resources, and another application in a different LXC container that requires fewer, the container with the smaller requirement will share its resources with the container that needs more.
Answering Your Question:
So if I create a Docker image with a database (like PostgreSQL), wouldn't all the data get wiped out when I stop the container and restart?
You wouldn't create a database Docker image with data baked into it (this is not recommended).
You run/create a container from an image and you attach/mount data to it.
So, when you stop/restart a container, the data is never lost as long as you attach it to a volume, because the volume resides somewhere other than the container itself (maybe an NFS server, or the host).
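A minimal sketch, using the official postgres image and a hypothetical named volume called pgdata:

    # Keep PostgreSQL's data directory on a named volume so it outlives the container
    docker volume create pgdata
    docker run -d --name db \
      -e POSTGRES_PASSWORD=example \
      -v pgdata:/var/lib/postgresql/data \
      postgres:15

    # Stopping and removing the container does not delete the volume
    docker stop db && docker rm db
    docker volume ls        # pgdata (and the data inside it) is still there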
Does it mean I can have multiple versions of Linux running on the same host as long as the host support LXC? What else is there to it?
Yes, you can do this. We run LXC containers in our production environment.
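As a rough sketch with the classic LXC tools (the container name and release are examples; what's available depends on the download template):

    # Create a CentOS 7 container on an Ubuntu host using the "download" template
    sudo lxc-create -n centos-test -t download -- --dist centos --release 7 --arch amd64

    sudo lxc-start -n centos-test     # start the container
    sudo lxc-attach -n centos-test    # get a shell inside it
    sudo lxc-ls --fancy               # list containers and their state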

Can Docker-Swarm run in fail-over-mode only?

I am facing a situation where I need to run about 100 different applications in Docker containers. It is not reasonable to run all 100 containers on a single piece of hardware, so I need to spread the applications over several machines.
As far as I understood, docker-swarm is for scaling only, which means when I run my containers in a swarm, then all 100 containers will automatically be deployed and started on every node of my docker-swarm. But this is not what I am looking for. I want to split the applications and, for example, run 50 on node1 and 50 on node2.
Question 1:
Is there a way to configure docker-swarm so that my applications are automatically dispatched to the node which has the most idle resources?
Question 2:
Is there a kind of fail-over mode in docker swarm which can stop a container on one node and try to start it on another in case something goes wrong?
all 100 containers will automatically be deployed and started on every node of my docker-swarm
This is not true. When you deploy 100 containers in a swarm, the containers will be distributed across the available nodes in the swarm. You will mostly get an even distribution of containers across all nodes.
Question 1: Is there a way to configure docker-swarm so that my applications are automatically dispatched to the node which has the most idle resources?
Docker Swarm does not check the resources (memory, CPU, ...) available on the nodes before deploying a container on them. The distribution of containers is balanced per node, without taking the available resources on each node into account.
You can however build a strategy for distributing containers across the nodes. You can use placement constraints, which let you restrict where a container can be deployed. You can label the nodes that have a lot of resources and restrict some heavy containers to only run on those nodes.
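A sketch of that approach, with a hypothetical node name, label, and image; the --reserve-memory flag additionally tells the scheduler how much memory the service needs:

    # Label the node(s) with plenty of resources
    docker node update --label-add size=large node1

    # Pin a heavy service to the labelled nodes and reserve memory for it
    docker service create --name heavy-app \
      --constraint 'node.labels.size == large' \
      --reserve-memory 1g \
      myorg/heavy-app:latest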
Question 2: Is there a kind of fail-over mode in docker swarm which can stop a container on one node and try to start it on another in case something goes wrong?
If a container crashes, docker swarm will ensure that a new container is started. Again, the decision of what node to deploy the new container on cannot be predetermined.

Running multiple docker containers in same host

I am new to Docker and have a question. As I understand it, Docker helps create a container for the application we want to deploy, along with the application's dependencies.
My question is: if I have a web application inside a Docker container, is it possible to run multiple containers on a single host? If yes, how will I make sure requests are directed to each app?
Will there be any change in performance depending on the number of cores of the host?
Is it possible to run multiple containers on a single host?
Yes, you can run many.
If yes, how will I direct requests to the right container?
You have many options; the simplest is just to run each container with port forwarding (which is built into Docker), but you could also run a load balancer or proxy on the host.
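A small sketch of the port-forwarding option (the image names are placeholders):

    # Two web apps on one host, each published on a different host port
    docker run -d --name app1 -p 8080:80 myorg/app1   # http://<host>:8080 -> app1
    docker run -d --name app2 -p 8081:80 myorg/app2   # http://<host>:8081 -> app2
    # A reverse proxy (e.g. nginx) on the host could then route by hostname or path.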
Will there be any change in performance depending on the number of cores of the host?
There can be, of course. It depends on whether or not you're already reaching a performance bottleneck of some sort before adding another container. All the containers are making use of the same hardware.
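Since they all share the same hardware, one option is to cap what each container may use; a hedged sketch with hypothetical image names:

    # Limit CPU and memory per container so one busy app can't starve the others
    docker run -d --name app1 --cpus="1.5" --memory="512m" myorg/app1
    docker run -d --name app2 --cpus="1.0" --memory="256m" myorg/app2

    docker stats    # live per-container CPU/memory usage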

How do I avoid downloading images on all Docker hosts which are part of my swarm?

I have a swarm setup with around 6 nodes. Whenever I execute a docker run or docker pull command from the swarm manager, it downloads the new image on all the swarm nodes.
This is creating data redundancy and choking my network.
Is there any way I can avoid this?
Swarm nodes need the images available to them by design. That lets the swarm start the container on another available node immediately when the node currently hosting the container crashes or goes into maintenance (drain mode).
On the other hand, Docker images are pulled only once per node, and you can use them until you upgrade your service.
Another point: Docker is designed for microservices. If your image is getting too large, maybe you should try to cut it down into multiple smaller images/containers.
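If only some nodes should ever run (and therefore need to pull) a given image, one option worth sketching is to constrain the service to labelled nodes; the node names, label, and image below are hypothetical:

    # Only the labelled nodes are eligible to run the service's tasks
    docker node update --label-add tier=backend node4
    docker node update --label-add tier=backend node5

    docker service create --name api \
      --constraint 'node.labels.tier == backend' \
      --replicas 2 \
      myorg/api:latest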
