Why use docker service? - docker

This question illustrates the theoretical differences between docker run and docker service.
What I don't understand is when would one need to use the exact same container replicated multiple times (as per the Docker documentation example)?
There, they run the same web app replicated 5 times.
Is deployment on Kubernetes (for example) a potential use case, where the developer does not want to centralize the app on one host, in order to make it more resilient, hence why 5 replicas are created?
Can someone please provide an example use case where docker service is useful?

Swarm is an orchestrator just like Kubernetes. docker service deploys services to Swarm just as kubectl deploys your services to Kubernetes.
Swarm is essentially a built-in, primitive orchestrator. One possible case for replicas is running a proxy that directs requests to the proper containers. You could expose multiple machines and have one take the place of another if it fails. Or any other high availability case you could think of.
Your question could be rephrased as "What's the difference between running a single container and running containers in a cluster?", which would be another question altogether, but that rephrasing might help illustrate what docker service does.

If you want to scale your application, you can run multiple instances of it (horizontal scaling) or you beef up the machine(s) that it runs on (vertical scaling). For the first, you would have to put a load balancer in front of your application so that the traffic is evenly distributed between the different instances. The idea is that those instances run on different hosts, so if one goes down, your application is still up. Some controlling instance (a Kubernetes service, for example) will notice that one of your instances has gone south and won't direct any more traffic to it. Nowadays, with all the cloud stuff going on, this is typically the way to go.
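To make the load balancer idea above concrete, here is a minimal sketch in Python. It is only a toy model, not how Kubernetes or Swarm implement it: the class name `RoundRobinBalancer` and the host names are made up for illustration. It rotates over the instances and skips any that have been marked as down.

```python
class RoundRobinBalancer:
    """Toy load balancer: rotates over instances, skipping unhealthy ones."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._next = 0  # index of the next instance to try

    def mark_down(self, instance):
        # Called when a health check fails (hypothetical hook).
        self.healthy.discard(instance)

    def mark_up(self, instance):
        # Called when a known instance passes its health check again.
        if instance in self.instances:
            self.healthy.add(instance)

    def pick(self):
        # Try each instance at most once per call, starting where we left off.
        for _ in range(len(self.instances)):
            candidate = self.instances[self._next]
            self._next = (self._next + 1) % len(self.instances)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy instances")


lb = RoundRobinBalancer(["host-a", "host-b", "host-c"])
lb.mark_down("host-b")
print([lb.pick() for _ in range(4)])
```

With `host-b` marked down, traffic keeps flowing to the two remaining hosts, which is the whole point of running the replicas on different machines.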
You don't need Kubernetes for such a setup, but you're right, this would be a typical use case for it. At least if you run your application in a Docker container.

One use case is running on Docker Swarm, where your cluster consists of n nodes. You can run replicas of your application across the cluster, with a load balancer/reverse proxy in front to distribute the traffic. If any one of the nodes goes down, the application can still run.
But the main use case for running multiple instances is scalability. Suppose you know that one instance of your app can serve 10,000 users at a time (assume bank authentication).
If you want your application to serve 50K users, just run 5 replicas (using docker service create).
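The arithmetic behind that replica count can be sketched as follows. The function name `replicas_needed` is made up for this example, and it assumes load spreads evenly across replicas, which a real system will only approximate.

```python
def replicas_needed(target_users, users_per_instance):
    """Smallest number of replicas whose combined capacity covers target_users."""
    if users_per_instance <= 0:
        raise ValueError("users_per_instance must be positive")
    # Integer ceiling division: round up so we never under-provision.
    return (target_users + users_per_instance - 1) // users_per_instance


print(replicas_needed(50_000, 10_000))
```

The result is the number you would pass to the `--replicas` option of docker service create. In practice you would probably add some headroom on top, so the service survives losing a replica at peak load.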

Related

Controlling the hosts where my containers run with docker swarm

I'm moving from a local docker-compose build to a production environment in which I have 4 VPSs. The first (the manager) has the fewest resources. The other 3 (the workers) are identical to each other and bigger. I decided to use Docker Swarm to manage this infrastructure. My question is: should I be concerned about which host each container runs on? Or is that the wrong way to think about it? I mean, is Docker Swarm meant to let me abstract away the underlying nodes, and just create services and containers, trusting that Docker will manage the resources successfully?
Answer is... both!
The goal is to let Docker Swarm manage things for you as much as possible, but also to add constraints so that your application is deployed on the hardware that best matches its requirements.
For example, if you have a reverse proxy and machine learning models, you might want to deploy your reverse proxy on a CPU optimized server, and your machine learning models on a memory optimized instance.
You need to label your nodes properly, and then add constraints so that services are only deployed to the nodes that match your labels. In the example above, you could add 2 labels: reverse-proxy and ml.
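As a rough illustration of what label matching does, here is a small Python model. It is not Swarm's scheduler: the node names, the `role` label key, and the function `eligible_nodes` are all assumptions made up for this sketch. In Swarm itself, labels are attached with docker node update --label-add, and a service is restricted with the --constraint option of docker service create.

```python
def eligible_nodes(nodes, required_labels):
    """Return names of nodes whose labels include every required key/value pair."""
    return [
        name
        for name, labels in nodes.items()
        if all(labels.get(key) == value for key, value in required_labels.items())
    ]


# Hypothetical cluster: one node for the reverse proxy, two for ML models.
nodes = {
    "worker-1": {"role": "reverse-proxy"},
    "worker-2": {"role": "ml"},
    "worker-3": {"role": "ml"},
}

print(eligible_nodes(nodes, {"role": "ml"}))
```

A service with no constraints matches every node, which is the "let Swarm manage things" half of the answer. Adding a constraint narrows the candidates, and Swarm still decides placement within that set.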
I am explaining how to do this more precisely in this article in case you're interested: https://juliensalinas.com/en/container-orchestration-docker-swarm-nlpcloud/

Benefit to placing Database and Application in same Kubernetes pod

I know just the bare minimum of Kubernetes. However, I wanted to know if there would be any benefit in running 2 containers in a single pod:
1 Container running the application (e.g. a NodeJS app)
1 Container running the corresponding local database (e.g. a PouchDB database)
Would this increase performance, or would the downsides of coupling the two containers outweigh any benefits?
Pods are designed to group containers that share the same lifecycle. Containers inside the same pod share some namespaces (like networking) and volumes.
So coupling an app with its database could look like a good idea, because the app could just connect to the database through localhost, etc. But it is not! As Diego Velez pointed out, one of the first limitations you could face is scaling your app. If you couple your app with your database, you are forced to scale your database whenever you scale your app, which is not optimal at all and keeps you from one of the main benefits of using a container orchestrator like Kubernetes.
Some good use cases are:
Container with app + container with agent for app metrics, ci agents, etc.
CI/CD container (like jenkins agents) + container(s) with tools for CI/CD.
Container with app + container with proxy (like in istio making use of the sidecar pattern).
Let's say you need to scale your app (the pod). The DB would be scaled along with it, and that would cause errors because it is set up as a single node, not as a cluster.

What's a typical ElasticSearch/Logstash/Kibana deployment model look like

Being a novice in the Docker and Elasticsearch worlds, I am trying to design a deployment model that uses Elasticsearch via containers in one of my projects.
I have a few application servers, each of which has some logs. I would like to have all these logs in one place. Below is what I have in mind.
All application servers install Filebeat to push data to a Logstash server (in a Docker image). This Logstash server forwards the logs to an Elasticsearch Docker image that also has Kibana.
Does this make sense? Is it OK to have Logstash in one image and Elasticsearch/Kibana in a different one? Are there any pros/cons to this approach? What alternative approaches could be used to architect this?
Docker's philosophy is that one container does one thing and does it well. So I would go for one Docker image for Elasticsearch, one for Kibana and one for Logstash. Tie them together with docker compose.
https://docs.docker.com/v17.09/engine/userguide/eng-image/dockerfile_best-practices/#use-multi-stage-builds
Each container should have only one concern
Decoupling applications into multiple containers makes it much easier to scale horizontally and reuse containers. For instance, a web application stack might consist of three separate containers, each with its own unique image, to manage the web application, database, and an in-memory cache in a decoupled manner.
You may have heard that there should be “one process per container”. While this mantra has good intentions, it is not necessarily true that there should be only one operating system process per container. In addition to the fact that containers can now be spawned with an init process, some programs might spawn additional processes of their own accord. For instance, Celery can spawn multiple worker processes, or Apache might create a process per request. While “one process per container” is frequently a good rule of thumb, it is not a hard and fast rule. Use your best judgment to keep containers as clean and modular as possible.
If containers depend on each other, you can use Docker container networks to ensure that these containers can communicate.

How to keep a certain number of Docker containers running the same application and add/remove them as needed?

I've been working with Docker containers. What I've done is launch 5 containers running the same application. I use HAProxy to redirect requests to them, I added a volume to preserve data, and I set the restart policy to always.
It works. (So far this is my load balancing approach.) But sometimes I need another container to join the pool because there might be more requests, or maybe at first I don't need 5 containers.
This is provided by the Swarm Mode addition in Docker 1.12. It includes orchestration that lets you not only scale your service up or down, but recover from an outage by automatically rescheduling the jobs to run on other nodes.
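To give a feel for what "rescheduling" means here, below is a simplified Python sketch of a reconciliation step. It is not Swarm's actual algorithm: the function `reconcile`, the node names, and the least-loaded placement rule are assumptions chosen to keep the example short. Given a desired replica count and the set of nodes that are still alive, it keeps the surviving replicas and places replacements on the remaining nodes.

```python
def reconcile(desired, placements, live_nodes):
    """Return a list of node names, one per replica, using only live nodes.

    placements: current list of node names, one entry per running replica.
    """
    if not live_nodes:
        raise RuntimeError("no live nodes to schedule on")
    # Keep replicas whose node is still alive, up to the desired count
    # (slicing also handles scaling down).
    kept = [node for node in placements if node in live_nodes][:desired]
    # Place each missing replica on the currently least-loaded live node.
    while len(kept) < desired:
        target = min(sorted(live_nodes), key=kept.count)
        kept.append(target)
    return kept


# Five replicas spread over three nodes, then node n2 goes away.
before = ["n1", "n2", "n3", "n1", "n2"]
after = reconcile(5, before, {"n1", "n3"})
print(after)
```

The same function covers scaling: calling it with a larger or smaller `desired` adds or drops replicas, which is roughly what docker service scale asks the orchestrator to do.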
If you don't want to use Docker 1.12 (yet!), you can also use a service discovery tool like Consul, register your containers in it, and use a tool like Consul Template to regenerate your load balancer configuration accordingly.
I gave a talk about it 6 months ago. You can find the code and the configuration I used during my demo here: https://github.com/bargenson/dockerdemo
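The "regenerate your load balancer configuration" step mentioned above can be pictured with a short Python sketch. This is not Consul Template and it does not talk to Consul: the function `render_backend`, the backend name, and the addresses are made up for illustration. It only shows the idea of turning a list of registered containers into an HAProxy backend section.

```python
def render_backend(name, servers):
    """Render an HAProxy backend section from (host, port) pairs."""
    lines = [f"backend {name}", "    balance roundrobin"]
    for index, (host, port) in enumerate(servers, start=1):
        # "check" asks HAProxy to health-check the server.
        lines.append(f"    server {name}{index} {host}:{port} check")
    return "\n".join(lines) + "\n"


# Hypothetical list of containers currently registered for the service.
registered = [("10.0.0.11", 8080), ("10.0.0.12", 8080)]
print(render_backend("app", registered))
```

Whenever a container registers or deregisters, the template tool would re-render this section and reload HAProxy, so the pool grows and shrinks without editing the config by hand.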

Should I use separate Docker containers for my web app?

Do I need to use separate Docker containers for my complex web application, or can I put all the required services in one container?
Could anyone explain to me why I should divide my app into many containers (for example a php-fpm container, a mysql container, a mongo container) when I am able to install and launch everything in one container?
Something to think about when working with Docker is how it works inside. Docker replaces your PID 1 with the command you specify in the CMD (and ENTRYPOINT, which is slightly more complex) directive in your Dockerfile. PID 1 is normally where your init system lives (sysvinit, runit, systemd, whatever). Your container lives and dies by whatever process is started there. When the process dies, your container dies. Stdout and stderr for that process in the container is what you are given on the host machine when you type docker logs myContainer. Incidentally, this is why you need to jump through hoops to start services and run cronjobs (things normally done by your init system). This is very important in understanding the motivation for doing things a certain way.
Now, you can do whatever you want. There are many opinions about the "right" way to do this, but you can throw all that away and do what you want. So you COULD figure out how to run all of those services in one container. But now that you know how docker replaces PID 1 with whatever command you specify in CMD (and ENTRYPOINT) in your Dockerfiles, you might think it prudent to try and keep your apps running each in their own containers, and let them work with each other via container linking. (Update -- 27 April 2017: Container linking has been deprecated in favor of regular ole container networking, which is much more robust, the idea being that you simply join your separate application containers to the same network so they can talk to one another).
If you want a little help deciding, I can tell you from my own experience that it ends up being much cleaner and easier to maintain when you separate your apps into individual containers and then link them together. Just now I am building a Wordpress installation from HHVM, and I am installing Nginx and HHVM/php-fpm with the Wordpress installation in one container, and the MariaDB stuff in another container. In the future, this will let me drop in a replacement Wordpress installation directly in front of my MariaDB data with almost no hassle. It is worth it to containerize per app. Good luck!
When you divide your web application into many containers, you don't need to restart all the services when you deploy your application. In the same way, you traditionally don't restart your MySQL server when you update your web layer.
Also, if you want to scale your application, it is easier if it is divided into separate containers. Then you can scale just the parts of your application that are causing your bottlenecks.
Some will tell you that you should run only 1 process per container. Others will say 1 application per container. That advice is based on the principles of microservices.
I don't believe microservices are the right solution for all cases, so I would not follow that advice blindly just for that reason. If it makes sense to have multiple processes in one container for your case, then do so. (See Supervisor and the Phusion baseimage for that matter.)
But there is also another reason to separate containers: In most cases, it is less work for you to do.
On the Docker Hub, there are plenty of ready to use Docker images. Just pull the ones you need.
What's remaining for you to do is then:
read the docs for those Docker images (which environment variables to set, etc.)
create a docker-compose.yml file to ease operating those containers
It is probably better to have your web app in a single container and your supporting services, like databases, in separate containers. That way, if you need to do rolling updates or restarts, you can keep your database online while your application nodes restart individually, so you won't experience downtime. If you have caching with something like Redis, this is useful for the same reason. It will also let you add nodes more easily and scale in a loosely coupled fashion, and manage each container in the way best suited to its specific purpose. For the type of application you are describing, I see very few arguments for running all services in a single container.
It depends on the vision and road map you have for your application. Putting all components of an application in one tier (in this case, one Docker container) is like putting all your eggs in one basket.
Whenever security or performance becomes a concern for your application, separating those three components into their own containers would be an ideal solution. Needless to say, this division of labor across containers comes at some cost, namely wiring the containers together for communication, security, etc.

Resources