What's a typical ElasticSearch/Logstash/Kibana deployment model look like - docker

Being a novice to docker/elastic search worlds, I am trying to build a deployment model of using elastic search via containers in one of my project.
I have few application servers, each of which have some logs. I would like to have all these logs at one place. Below is what I have in my mind.
All application servers install filebeat to push data to a Logstash server (in a docker image). This LogStash server forward these logs to elasticsearch docker image that also have kibana.
Does this make sense? Is it OK to have logstash in one image and ElasticSearch/Kibana on a different one? Are there any pros/cons of this approach? What could be alternative approaches to architect this?

The policy of Docker is that 1 container does 1 thing and 1 thing good. So I would go for a docker image for ElasticSearch, 1 for Kibana and one for LogStash. Add them together with docker compose.
https://docs.docker.com/v17.09/engine/userguide/eng-image/dockerfile_best-practices/#use-multi-stage-builds
Each container should have only one concern
Decoupling applications into multiple containers makes it much easier to scale horizontally and reuse containers. For instance, a web application stack might consist of three separate containers, each with its own unique image, to manage the web application, database, and an in-memory cache in a decoupled manner.
You may have heard that there should be “one process per container”. While this mantra has good intentions, it is not necessarily true that there should be only one operating system process per container. In addition to the fact that containers can now be spawned with an init process, some programs might spawn additional processes of their own accord. For instance, Celery can spawn multiple worker processes, or Apache might create a process per request. While “one process per container” is frequently a good rule of thumb, it is not a hard and fast rule. Use your best judgment to keep containers as clean and modular as possible.
If containers depend on each other, you can use Docker container networks to ensure that these containers can communicate.

Related

Why multi-container docker apps are built?

Can somebody explain it with some examples? Why multi-container docker apps are built? while you can contain your app in a single docker container.
When you make a multi-container app you have to do networking. Is not it easy to run a single image of a single container rather than two images of two containers?
There are several good reasons for this:
It's easier to reuse prebuilt images. If you need MySQL, or Redis, or an Nginx reverse proxy, these all exist as standard images on Docker Hub, and you can just include them in a multi-container Docker Compose setup. If you tried to put them into a single image, you'd have to install and configure them yourself.
The Docker tooling is built for single-purpose containers. If you want the logs of a multi-process container, docker logs will generally print out the supervisord logs, which aren't what you want; if you want to restart one of those containers, the docker stop; docker rm; docker run sequence will delete the whole thing. Instead with a multi-process container you need to use debugging tools like docker exec to do anything, which is harder to manage.
You can upgrade one part without affecting the rest. Upgrading the code in a container usually involves building a new image, stopping and deleting the old container, and running a new container from the new image. The "deleting the old container" part is important, and routine; if you need to delete your database to upgrade your application, you risk losing data.
You can scale one part without affecting the rest. More applicable in a cluster environment like Docker Swarm or Kubernetes. If your application is overloaded (especially in production) you'd like to run multiple copies of it, but it's hard to run multiple copies of a standard relational database. That essentially requires you to run these separately, so you can run one proxy, five application servers, and one database.
Setting up a multi-container application shouldn't be especially difficult; the easiest way is to use Docker Compose, which will deal with things like creating a network for you.
For the sake of simplification, I would say you can run only one application with a public entry point (like API) in a single container. Actually, this approach is recommended by Docker official documentation.
Microservices
Because of this single constraint, you cannot run microservices that require their own entry points in a single docker container.
It could be more a discussion on the advantages of Monolith application vs Microservices.
Database
Even if you decide to run the Monolith application only, still you need to connect some database there. As you noticed, Docker has an additional network-configuration layer, so if you want to run Database and application locally, the easiest way is to use docker-compose to run both images (Database and your Application) inside one, automatically configured network.
# Application definition
application: <your app definition>
# Database definition
database:
image: mysql:5.7
In my example, you can just connect to your DB via https://database:<port> URL from your main app (plus credentials eventually) and it will work.
Scalability
However, why we should split images for the database from the application? One word - scalability. For development purposes, you want to have your local DB, maybe with docker because it is handy. For production purposes, you will put the application image to run somewhere (Kubernetes, Docker-Swarm, Azure App Services, etc.). To handle multiple requests at the same time, you want to run multiple instances of your application. However what about the database? You cannot connect to the internal instance of DB hosted in the same container, because other instances of your app in other containers will have a completely different set of data (without synchronization).
Most often you are electing to use a separate Database server - no matter if running it on the container or fully manged databases (like Azure CosmosDB or Mongo Atlas), but with your own configuration, scaling, and synchronization dedicated for DB only. Your app just needs to worry about the proper URL to that. Most cloud providers are exposing such services out of the box, so you are not worrying about the configuration by yourself.
Easy to change
Last but not least argument is about changing the initial setup overtime. You might change the database provider, or upgrade the version of the image in the future (such things are required from time to time). When you separate images, you can modify one without touching others. It is decreasing the cost of maintenance significantly.
Also, you can add additional services very easy - different logging aggregator? No Problem, additional microservice running out-of-the-box? Easy.

Why use docker service?

This question illustrates the theoretical differences between docker run and docker service.
What I don't understand is when would one need to use the exact same container replicated multiple times (as per the Docker documentation example)?
There, they run the same web app replicated 5 times.
Is deployment on Kubernetes (for example) a potential use case, where the developer does not want to centralize the app on one host, in order to make it more resilient, hence why 5 replicas are created?
To understand, can someone please please with an example use case, where the docker service is useful?
swarm is an orchestrator just like kubernetes. docker service deploys services to swarm just as you deploy your services to kubernetes using kubectl.
swarm is essentially built-in primitive orchestrator. One possible case for replicas is running a proxy that directs requests to proper containers. You could expose multiple machines and have one take place of another in case another fails. Or any other high availability case you could think of.
Your question could be rephrased as "What's the difference between running a single container and running containers in a cluster?", which would be another question altogether, but that rephrasing might help illustrate what docker service does.
If you want to scale your application, you can run multiple instances of it (horizontal scaling) or you beef up the machine(s) that it runs on (vertical scaling). For the first, you would have to put a load balancer in front of your application so that the traffic is evenly distributed between the different instances. The idea is that those instances run on different hosts, so if one goes down, your application is still up. Some controlling instance (a Kubernetes service, for example) will notice that one of your instances has gone south and won't direct any more traffic to it. Nowadays, with all the cloud stuff going on, this is typically the way to go.
You don't need Kubernetes for such a setup, but you're right, this would be a typical use case for it. At least if you run your application in a Docker container.
Once use case is running on Docker swarm which consists of n number of nodes in your swarm cluster. You can run replicas of your application on the swarm cluster with a load balancer/reverse proxy to load balance your setup. If any one of the nodes goes down the application can still run.
But the exact use case for running multiple instances is scalabilty. Suppose you know that one instance of your app can serve 10000 users (Assume Bank authentication) at a time.
If you want your application to serve 50K users just run 5 replicas(using docker service create) .

Containers Orchestations and some docker functions

I am familiarizing with the architecture and practices to package, build and deploy software or unless, small pieces of software.
If suddenly I am mixing concepts with specific tools (sometimes is unavoidable), let me know if I am wrong, please.
On the road, I have been reading and learning about the images and containers terms and their respective relationships in order to start to build the workflow software systems of a better possible way.
And I have a question about the services orchestration in the context of the docker :
The containers are lightweight and portable encapsulations of an environment in which we have all the binary and dependencies we need to run our application. OK
I can set up communication between containers using container links --link flag.
I can replace the use of container links, with docker-compose in order to automate my services workflow and running multi-containers using .yaml file configurations.
And I am reading about of the Container orchestration term, which defines the relationship between containers when we have distinct "software pieces" separate from each other, and how these containers interact as a system.
Well, I suppose that I've read good the documentation :P
My question is:
A docker level, are container links and docker-compose a way of container orchestration?
Or with docker, if I want to do container orchestration ... should I use docker-swarm?
You should forget you ever read about container links. They've been obsolete in pure Docker for years. They're also not especially relevant to the orchestration question.
Docker Compose is a simplistic orchestration tool, but I would in fact class it as an orchestration tool. It can start up multiple containers together; of the stack it can restart individual containers if their configurations change. It is fairly oriented towards Docker's native capabilities.
Docker Swarm is mostly just a way to connect multiple physical hosts together in a way that docker commands can target them as a connected cluster. I probably wouldn't call that capability on its own "orchestration", but it does have some amount of "scheduling" or "placement" ability (Swarm, not you, decides which containers run on which hosts).
Of the other things I might call "orchestration" tools, I'd probably divide them into two camps:
General-purpose system automation tools that happen to have some Docker capabilities. You can use both Ansible and Salt Stack to start Docker containers, for instance, but you can also use these tools for a great many other things. They have the ability to say "run container A on system X and container B on system Y", but if you need inter-host communication or other niceties then you need to set them up as well (probably using the same tool).
Purpose-built Docker automation tools like Docker Compose, Kubernetes, and Nomad. These tend to have a more complete story around how you'd build up a complete stack with a bunch of containers, service replication, rolling updates, and service discovery, but you mostly can't use them to manage tasks that aren't already in Docker.
Some other functions you might consider:
Orchestration: How can you start multiple connected containers all together?
Networking: How can one container communicate with another, within the cluster? How do outside callers connect to the system?
Scheduling: Which containers run on which system in a multi-host setup?
Service discovery: When one container wants to call another, how does it know who to call?
Management plane: As an operator, how do you do things like change the number of replicas of some specific service, or cause an update to a newer image for a service?

Should I use separate Docker containers for my web app?

Do I need use separate Docker container for my complex web application or I can put all required services in one container?
Could anyone explain me why I should divide my app to many containers (for example php-fpm container, mysql container, mongo container) when I have ability to install and launch all stuff in one container?
Something to think about when working with Docker is how it works inside. Docker replaces your PID 1 with the command you specify in the CMD (and ENTRYPOINT, which is slightly more complex) directive in your Dockerfile. PID 1 is normally where your init system lives (sysvinit, runit, systemd, whatever). Your container lives and dies by whatever process is started there. When the process dies, your container dies. Stdout and stderr for that process in the container is what you are given on the host machine when you type docker logs myContainer. Incidentally, this is why you need to jump through hoops to start services and run cronjobs (things normally done by your init system). This is very important in understanding the motivation for doing things a certain way.
Now, you can do whatever you want. There are many opinions about the "right" way to do this, but you can throw all that away and do what you want. So you COULD figure out how to run all of those services in one container. But now that you know how docker replaces PID 1 with whatever command you specify in CMD (and ENTRYPOINT) in your Dockerfiles, you might think it prudent to try and keep your apps running each in their own containers, and let them work with each other via container linking. (Update -- 27 April 2017: Container linking has been deprecated in favor of regular ole container networking, which is much more robust, the idea being that you simply join your separate application containers to the same network so they can talk to one another).
If you want a little help deciding, I can tell you from my own experience that it ends up being much cleaner and easier to maintain when you separate your apps into individual containers and then link them together. Just now I am building a Wordpress installation from HHVM, and I am installing Nginx and HHVM/php-fpm with the Wordpress installation in one container, and the MariaDB stuff in another container. In the future, this will let me drop in a replacement Wordpress installation directly in front of my MariaDB data with almost no hassle. It is worth it to containerize per app. Good luck!
When you divide your web application to many containers, you don't need to restart all the services when you deploy your application. Like traditionally you don't restart your mysql server when you update your web layer.
Also if you want to scale your application, it is easier if your application is divided separate containers. Then you can just scale those parts of your application that are needed to solve your bottlenecks.
Some will tell you that you should run only 1 process per container. Others will say 1 application per container. Those advices are based on principles of microservices.
I don't believe microservices is the right solution for all cases, so I would not follow those advices blindly just for that reason. If it makes sense to have multiples processes in one container for your case, then do so. (See Supervisor and Phusion baseimage for that matter)
But there is also another reason to separate containers: In most cases, it is less work for you to do.
On the Docker Hub, there are plenty of ready to use Docker images. Just pull the ones you need.
What's remaining for you to do is then:
read the doc for those docker images (what environnement variable to set, etc)
create a docker-compose.yml file to ease operating those containers
It is probably better to have your webapp in a single container and your supporting services like databases etc. in a separate containers. By doing this if you need to do rolling updates or restarts you can keep your database online while your application nodes are doing individual restarts so you wont experience downtime. If you have caching with something like Redis etc this is also useful for the same reason. It will also allow you to more easily add nodes to scale in a loosely coupled fashion. It will also allow you to manage the containers in a manner more suitable to a specific purpose. For the type of application you are describing I see very few arguments for running all services on a single container.
It depends on the vision and road map you have for your application. Putting all components of an application in one tier in this case docker container is like putting all eggs in one basket.
Whenever your application would require security, performance related issues then separating those three components in their own containers would be an ideal solution. It's needless to mention that this division of labor across containers would come at some cost and which would be related to wiring up those containers together for communication and security etc.

Docker - one process per container?

I sometimes use Docker for my development work. When I do, I usually work on an out-of-the-box LAMP image from tutum.
My question is: Doesn't it defeat the purpose to work with Docker if it runs multiple processes in one container? (like the container started off Tutum's LAMP image) Isn't the whole idea of Docker to separate each process into a separate container?
While it is generally a good rule of thumb to separate processes into separate containers, that's not the main benefit/purpose of docker. The benefit of docker is immutability. And if throwing two processes into a single container makes for cleaner logic then go for it. Though in this case, I would definitely consider at least stripping out the DB into its own container, and talk to it through a docker link. The database shouldn't have to go down every time you rebuild your image.
Generally sometimes it is neccessary or more useful to use one container for more than one process like in this situation.
Such situation happens when processes are used together to fulfill its task. I can imagine for example situation when somebody want to add logging to the web application by using ELK (Elasticsearch, Logstash, Kibana). Those things run together and can have supervisor for monitoring processes inside one container.
But for most cases it is better to use one process per container. What is more docker command should start process itself, for example running java aplication by
/usr/bin/java -jar application.jar
apart from running external script:
./launchApplication.sh
See discussion on http://www.reddit.com/r/docker/comments/2t1lzp/docker_and_the_pid_1_zombie_reaping_problem/ where the problem is concerned.

Resources