Controlling the hosts where my containers run with docker swarm - docker

I'm jumping from a local docker-compose setup to a production environment in which I have 4 VPSs. The first (the manager) is the one with the least resources. The other 3 (the workers) have the same specs and are bigger. I decided to use Docker Swarm to manage this infrastructure. My question is: should I be concerned about which host a given container runs on, or is that the wrong way to think about it? I mean, is Docker Swarm meant to abstract away the underlying nodes, so that I just create the services and containers and trust that Docker will manage the resources successfully?

Answer is... both!
The goal is to let Docker Swarm manage things for you as much as possible, but also to add constraints so that your application is deployed on the hardware that best matches its requirements.
For example, if you have a reverse proxy and machine learning models, you might want to deploy your reverse proxy on a CPU optimized server, and your machine learning models on a memory optimized instance.
You need to label your nodes properly, and then add constraints so that services are only deployed to the nodes that match your labels. In the example above, you could add 2 labels: reverse-proxy and ml.
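As a rough sketch (the node names and the ML image below are made up), labelling the nodes and constraining the services could look like this:

    # on a manager node: label the workers (node names are hypothetical)
    docker node update --label-add type=reverse-proxy worker-1
    docker node update --label-add type=ml worker-2

    # in the stack file: pin each service to the nodes carrying the matching label
    services:
      proxy:
        image: nginx:alpine
        deploy:
          placement:
            constraints:
              - node.labels.type == reverse-proxy
      model:
        image: my-ml-api:latest        # hypothetical image
        deploy:
          placement:
            constraints:
              - node.labels.type == ml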
I explain how to do this in more detail in this article, in case you're interested: https://juliensalinas.com/en/container-orchestration-docker-swarm-nlpcloud/

Related

Should I dockerize mysql and nginx in production?

Our company has a dedicated Linux server and wants to host all services on it.
We have several WordPress, Laravel, ASP and Node websites. We want to dockerize all of these, but we want all services to use the same MySQL.
Should we also run MySQL in Docker, or not?
What happens when we bring the Docker Compose stack of one of the projects up or down? Do the projects affect each other?
I am a little confused.
Well, it all depends on the size of your application/services. On a single virtual machine, I would not suggest dockerizing everything and running a docker-compose to bring the services up. Take a database like MySQL: in a container there are constraints such as the maximum size of the volume/container and the networking, which with docker-compose you need to take care of with additional parameters and daemon changes. All of that can be configured, but finding out exactly what needs to be configured, and how, is a painful process. There can also be problems with database replication: you should not run a single database instance in production. What if the data gets lost? Shouldn't you have a second replica?
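If you do decide to keep MySQL in a container anyway, at the very least put its data on a named volume so it survives container restarts. A minimal sketch (the image tag and password are placeholders):

    services:
      db:
        image: mysql:8.0
        restart: unless-stopped
        environment:
          MYSQL_ROOT_PASSWORD: change-me    # placeholder, use a secret in practice
        volumes:
          - db-data:/var/lib/mysql          # named volume keeps the data outside the container
    volumes:
      db-data: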
Now, for the reverse proxy, it depends, again on the size of the production setup. What happens if the container is restarted or upgraded? Will the proxy be down and all your services be unavailable? YES! It may be only for a few minutes, but this is production we are talking about.
On the other hand, it all depends on the size of the project, the size of the traffic, and the budget. Take for example a deployment on Kubernetes (you did not specify the deployment target, only docker-compose, so I will default to Kubernetes), where everything runs as containers. Each node runs an ingress controller (one of the most popular is nginx). If this is production you are talking about, you can write Ingress rules to route the traffic. The ingress controller is deployed as a DaemonSet, so each node has its own ingress controller, and if one node goes down you still have another. The same goes for the database.
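For illustration only (the host and backend service names are made up), an Ingress rule routed by an nginx ingress controller could look like:

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: web-ingress
    spec:
      ingressClassName: nginx
      rules:
        - host: app.example.com            # hypothetical host
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: web              # hypothetical backend service
                    port:
                      number: 80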
What I am trying to say is that running a simple docker-compose on a single machine in production is very risky. Use an environment that can scale either horizontally or vertically (Docker Swarm, Kubernetes). I hope I have clarified the idea behind production deployment well.

Why use docker service?

This question illustrates the theoretical differences between docker run and docker service.
What I don't understand is when one would need to run the exact same container replicated multiple times (as per the Docker documentation example).
There, they run the same web app replicated 5 times.
Is deployment on Kubernetes (for example) a potential use case, where the developer does not want to centralize the app on one host in order to make it more resilient, hence the 5 replicas?
To help me understand, can someone please give an example use case where docker service is useful?
swarm is an orchestrator just like kubernetes. docker service deploys services to swarm just as you deploy your services to kubernetes using kubectl.
Swarm is essentially a built-in, primitive orchestrator. One possible case for replicas is running a proxy that directs requests to the proper containers. You could expose multiple machines and have one take the place of another in case one fails. Or any other high-availability case you can think of.
Your question could be rephrased as "What's the difference between running a single container and running containers in a cluster?", which would be another question altogether, but that rephrasing might help illustrate what docker service does.
If you want to scale your application, you can run multiple instances of it (horizontal scaling) or you beef up the machine(s) that it runs on (vertical scaling). For the first, you would have to put a load balancer in front of your application so that the traffic is evenly distributed between the different instances. The idea is that those instances run on different hosts, so if one goes down, your application is still up. Some controlling instance (a Kubernetes service, for example) will notice that one of your instances has gone south and won't direct any more traffic to it. Nowadays, with all the cloud stuff going on, this is typically the way to go.
You don't need Kubernetes for such a setup, but you're right, this would be a typical use case for it. At least if you run your application in a Docker container.
One use case is running on a Docker Swarm cluster that consists of n nodes. You can run replicas of your application on the swarm cluster with a load balancer/reverse proxy to balance the load across your setup. If any one of the nodes goes down, the application can still run.
But the main use case for running multiple instances is scalability. Suppose you know that one instance of your app can serve 10000 users at a time (assume bank authentication).
If you want your application to serve 50K users, just run 5 replicas (using docker service create).
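A minimal sketch (the image name is a placeholder):

    # create a service with 5 replicas behind swarm's built-in routing mesh
    docker service create --name web --replicas 5 --publish 8080:80 my-web-app:latest

    # scale it later if traffic grows
    docker service scale web=10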

Does it make sense to run Kubernetes on a single server?

Using Docker, I have implemented a system to deploy environments (on a single server) based on Git branches, using Traefik (*.dev.domain.com) and Docker Compose templates.
I like Kubernetes, but I've never switched to it since I'm limited to a single server for my infrastructure. I've only used it in local installations (Docker for Windows).
So, my question is: does it make sense to run a Kubernetes "cluster" (master and nodes) on a single server to orchestrate and route containers (in place of Traefik/Rancher/Docker Compose)?
This use is for development and staging only for the moment, so high availability is not a prerequisite.
Thanks.
If it is not a production environment, it doesn't matter how many nodes you are using. So yes, it should be just fine in this case. But make sure all the k8s features you will need in production are available in test/dev, to keep things similar and portable.
AFAIU,
I do not see a requirement for Kubernetes unless, at least on a single host, you are doing the following using native docker run, docker-compose, or Docker Engine swarm mode:
Make sure there are enough (>= 2) replicas of your app on the single server and that you are balancing the load across those containers.
If you want to go a bit more advanced, you should be able to scale up and down dynamically (Docker swarm mode supports this out of the box; otherwise use the jwilder nginx proxy).
Your deployment should not cause downtime. Make sure at least one container is healthy at any instant while deploying.
A container should auto-heal (restart automatically) in case its HTTP or TCP health check fails (see the sketch below).
Doing all of the above will certainly put you in a better place, but a single host is still a single point of failure, which you will have to deal with at regular intervals.
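As a sketch of the health-check and auto-heal points above (the image and the /health endpoint are assumptions), a swarm-mode stack file could declare:

    services:
      app:
        image: my-app:latest            # placeholder image
        healthcheck:
          test: ["CMD", "curl", "-f", "http://localhost:8080/health"]   # hypothetical endpoint
          interval: 30s
          timeout: 5s
          retries: 3
        deploy:
          replicas: 2
          update_config:
            order: start-first          # start the new task before stopping the old one
          restart_policy:
            condition: any              # replace the task if it dies or turns unhealthy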
Preferred: if possible, try to start with Docker Engine swarm mode, a single-master Kubernetes, or Minikube. These take care of all the above scenarios out of the box and also let you scale up further at any time by adding more nodes, without changing much in your YAML files for Docker Swarm or Kubernetes.
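For instance, a single-node setup with either option is only a couple of commands:

    # single-node Docker Swarm on the current machine
    docker swarm init

    # or a local single-node Kubernetes cluster
    minikube start
    kubectl get nodes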
Ref -
https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/
https://docs.docker.com/engine/swarm/
I would use single-host k8s only if I already managed clusters for the same project that I would like to deploy to that host. This lets you reuse manifests and all the automation you've created for your clusters.
If I only had single-host environments, I would probably stick to docker-compose.
If you're looking to try it out, your easiest options are probably minikube (an easy way to run a single-node cluster locally, but without some features) or one of the free trial accounts for a managed Kubernetes service from one of the big cloud providers (fully featured and multi-node, but with limited use before you have to pay).

Multiple images inside one container

So, here is the problem: I need to do some development, and for that I need the following packages:
MongoDb
NodeJs
Nginx
RabbitMq
Redis
One option is to take an Ubuntu image, create a container, install the packages one by one, start my server, and expose the ports.
But this can just as easily be done in a virtual machine, and it would not use the power of Docker. So instead I have to build my own image with these packages. Now here is the question: if I write my Dockerfile and place the commands to install Node.js (and the others) inside it, this again becomes the same thing as virtualization.
What I want is to start from Ubuntu, keep adding the references to MongoDB, Node.js, RabbitMQ, Nginx and Redis inside the Dockerfile, and finally expose the respective ports.
Here are the queries I have:
Is this possible? Like adding the references of other images inside the Dockerfile when you are starting FROM one base image.
If yes then how?
Also is this the correct practice or not?
How does one do this kind of thing in Docker?
Thanks in advance.
Keep images light. Run one service per container. Use the official images on Docker Hub for MongoDB, Node.js, RabbitMQ, nginx, etc. Extend them if needed. If you want to run everything in a fat container, you might as well just use a VM.
You can of course do crazy stuff in a dev setup, but why spend time setting up something that has zero value in a production environment? What if you need to scale up one of the services? How do you set memory and CPU constraints on each service? ...and the list goes on.
Don't make monolithic containers.
A good start is to use docker-compose to configure a set of services that can talk to each other. You can make a prod and dev version of your docker-compose.yml file.
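For the stack in the question, a rough docker-compose sketch (image tags and the app's build context are assumptions) could look like:

    services:
      mongo:
        image: mongo:6
        volumes:
          - mongo-data:/data/db
      redis:
        image: redis:7-alpine
      rabbitmq:
        image: rabbitmq:3-management
      app:
        build: .                        # your Node.js app, hypothetical build context
        depends_on: [mongo, redis, rabbitmq]
      nginx:
        image: nginx:alpine
        ports:
          - "80:80"
        depends_on: [app]
    volumes:
      mongo-data: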
Getting into the right frame of mind
In a perfect world you would run your containers in a clustered environment in production to be able to scale your system and have concurrency, but that might be overkill depending on what you are running. It's at least good to keep this in the back of your head, because it can help you make the right decisions.
Some points to think about if you want to be a purist:
How do you have persistent volume storage across multiple hosts?
Reverse proxy / load balancer should probably be the entry point into the system that talks to the containers using the internal network.
Is my service even able to run in a clustered environment (multiple instances of the container)?
You can of course do dirty things in dev such as mapping in host volumes for persistent storage (and many people who use docker standalone in prod do that as well).
Ideally we should separate Docker in dev from Docker in prod. Docker is a fantastic tool during development, as you can have redis, memcached, postgres, mongodb, rabbitmq, node or whatnot up and running in minutes and share that compose setup with the rest of the team. Docker in prod can be a completely different beast.
I would also like to add that I'm generally against the fanaticism that "everything should run in Docker" in prod. Run services in Docker when it makes sense. It's also not uncommon for larger companies to make their own base images. This can be a lot of work and requires maintenance to keep up with security fixes etc. It's not necessarily the first thing to jump on when starting with Docker.

Networking among kubernetes minions

I installed an 8-node Kubernetes cluster (1 master + 7 minions), but I am facing a networking problem among the minions.
I installed my cluster according to this step-by-step Fedora manual, so I use Fedora 20 with its testing repository to get kubernetes binaries.
After installing, I wanted to try the guestbook example, but it seems to me there is a problem with the inter-container networking.
Although the containers/pods are in the running state and I can reach my 3 frontend containers (via browser) as well as the redis containers (via netcat), a frontend that is not on the same host as the redis master cannot reach it. The frontend's PHP gives back a network exception.
Can anybody help me figure out why the containers cannot reach each other across hosts?
I hope I have described my setup accurately enough; thanks in advance.
The Fedora guide you followed will only get you running on a single machine. It avoids the issues around setting up networking across nodes.
For kubernetes to work, the following network set up must be satisfied:
Every container should be able to talk to every other container, even across nodes. This also means that the bridge IP ranges for those containers must not overlap.
Code running on any node that isn't in a container should be able to reach every container (and vice versa), even across nodes.
It is not necessary (but useful) for computers on the network that aren't part of the cluster to be able to reach the containers directly.
There are a lot of ways to achieve this; for instance, the Vagrant setup creates GRE tunnels between each node. On GCE we use features of the platform to do the routing. If you are on physical machines on a switch, you can probably just do a big layer 2 network with bridges. A bulletproof way to get started (but perhaps not the most performant, depending on your setup) is to use something like flannel.
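As a rough illustration of the flannel approach (the subnet and etcd endpoint are examples only), flannel reads its overlay network configuration from etcd and then runs on every node:

    # store the overlay network config in etcd (example subnet)
    etcdctl set /coreos.com/network/config '{ "Network": "10.244.0.0/16" }'

    # run flanneld on each node, pointed at the same etcd
    flanneld -etcd-endpoints=http://etcd-host:2379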
We are working on making this stuff easier to start up (without using a mess of shell scripts) and are thinking of building something like flannel in so that there is a reasonable default.
