Should I use separate Docker containers for my web app? - docker

Do I need to use separate Docker containers for my complex web application, or can I put all the required services in one container?
Could anyone explain why I should divide my app into many containers (for example a php-fpm container, a mysql container, a mongo container) when I have the ability to install and launch everything in one container?

Something to think about when working with Docker is how it works inside. Docker replaces your PID 1 with the command you specify in the CMD (and ENTRYPOINT, which is slightly more complex) directive in your Dockerfile. PID 1 is normally where your init system lives (sysvinit, runit, systemd, whatever). Your container lives and dies by whatever process is started there. When the process dies, your container dies. Stdout and stderr for that process in the container are what you are given on the host machine when you type docker logs myContainer. Incidentally, this is why you need to jump through hoops to start services and run cron jobs (things normally done by your init system). This is very important in understanding the motivation for doing things a certain way.
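To make that concrete, here is a minimal Dockerfile sketch (the image tag is arbitrary and the official nginx image already does essentially this, so treat it purely as an illustration of the PID 1 point):

FROM nginx:1.25
# Whatever CMD starts becomes PID 1 inside the container.
# It must stay in the foreground; when it exits, the container exits.
CMD ["nginx", "-g", "daemon off;"]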
Now, you can do whatever you want. There are many opinions about the "right" way to do this, but you can throw all that away and do what you want. So you COULD figure out how to run all of those services in one container. But now that you know how docker replaces PID 1 with whatever command you specify in CMD (and ENTRYPOINT) in your Dockerfiles, you might think it prudent to try and keep your apps running each in their own containers, and let them work with each other via container linking. (Update -- 27 April 2017: Container linking has been deprecated in favor of regular ole container networking, which is much more robust, the idea being that you simply join your separate application containers to the same network so they can talk to one another).
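For reference, the networking approach mentioned in the update looks roughly like this (the container names, the network name, and my-wordpress-image are made up for illustration):

docker network create app-net
docker run -d --name db --network app-net -e MYSQL_ROOT_PASSWORD=example mariadb
docker run -d --name web --network app-net -p 80:80 my-wordpress-image
# "web" can now reach the database at the hostname "db" over app-net.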
If you want a little help deciding, I can tell you from my own experience that it ends up being much cleaner and easier to maintain when you separate your apps into individual containers and then link them together. Just now I am building a WordPress installation running on HHVM: I am installing Nginx and HHVM/php-fpm along with the WordPress files in one container, and the MariaDB stuff in another container. In the future, this will let me drop in a replacement WordPress installation directly in front of my MariaDB data with almost no hassle. It is worth it to containerize per app. Good luck!

When you divide your web application into many containers, you don't need to restart all the services when you deploy your application. Traditionally you don't restart your MySQL server when you update your web layer, and the same idea applies here.
Also, if you want to scale your application, it is easier if it is divided into separate containers. Then you can scale just those parts of your application that are your bottlenecks, as in the example below.
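For example, with Docker Compose you can scale a single service while leaving the others alone (the service name "web" is a placeholder):

docker-compose up -d --scale web=3
# The database and cache containers are left untouched.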

Some will tell you that you should run only one process per container. Others will say one application per container. That advice is based on the principles of microservices.
I don't believe microservices are the right solution for all cases, so I would not follow that advice blindly for that reason alone. If it makes sense to have multiple processes in one container for your case, then do so. (See Supervisor and Phusion baseimage for that matter.)
But there is also another reason to separate containers: in most cases, it is less work for you to do.
On Docker Hub, there are plenty of ready-to-use Docker images. Just pull the ones you need.
What's remaining for you to do is then:
read the docs for those Docker images (which environment variables to set, etc.)
create a docker-compose.yml file to ease operating those containers (a minimal sketch follows)
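As a rough illustration (the service layout mirrors the php-fpm/mysql/mongo example from the question; image tags and the password are placeholders, so check each image's documentation):

version: "3"
services:
  app:
    image: php:8.2-fpm          # usually you build your own app image on top of this
    depends_on:
      - mysql
      - mongo
  mysql:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: example   # placeholder
  mongo:
    image: mongo:6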

It is probably better to have your webapp in a single container and your supporting services, like databases, in separate containers. By doing this, if you need to do rolling updates or restarts, you can keep your database online while your application nodes are doing individual restarts, so you won't experience downtime. If you have caching with something like Redis, this is also useful for the same reason. It will also allow you to more easily add nodes to scale in a loosely coupled fashion, and to manage the containers in a manner more suited to each specific purpose. For the type of application you are describing, I see very few arguments for running all services in a single container.

It depends on the vision and road map you have for your application. Putting all components of an application into one tier (in this case, one Docker container) is like putting all your eggs in one basket.
Whenever your application has security or performance requirements, separating those three components into their own containers is the better solution. Needless to say, this division of labor across containers comes at some cost, related to wiring those containers together for communication, security, and so on.

Related

What does a typical Elasticsearch/Logstash/Kibana deployment model look like?

Being a novice to the Docker and Elasticsearch worlds, I am trying to build a deployment model using Elasticsearch via containers in one of my projects.
I have a few application servers, each of which has some logs. I would like to have all these logs in one place. Below is what I have in mind.
All application servers run Filebeat to push data to a Logstash server (in a Docker container). This Logstash server forwards the logs to an Elasticsearch container that also has Kibana.
Does this make sense? Is it OK to have Logstash in one image and Elasticsearch/Kibana in a different one? Are there any pros/cons to this approach? What could be alternative approaches to architecting this?
The policy of Docker is that one container does one thing, and does it well. So I would go for one Docker image for Elasticsearch, one for Kibana, and one for Logstash. Tie them together with Docker Compose (a rough sketch follows the quoted best-practices excerpt below).
https://docs.docker.com/v17.09/engine/userguide/eng-image/dockerfile_best-practices/#use-multi-stage-builds
Each container should have only one concern
Decoupling applications into multiple containers makes it much easier to scale horizontally and reuse containers. For instance, a web application stack might consist of three separate containers, each with its own unique image, to manage the web application, database, and an in-memory cache in a decoupled manner.
You may have heard that there should be “one process per container”. While this mantra has good intentions, it is not necessarily true that there should be only one operating system process per container. In addition to the fact that containers can now be spawned with an init process, some programs might spawn additional processes of their own accord. For instance, Celery can spawn multiple worker processes, or Apache might create a process per request. While “one process per container” is frequently a good rule of thumb, it is not a hard and fast rule. Use your best judgment to keep containers as clean and modular as possible.
If containers depend on each other, you can use Docker container networks to ensure that these containers can communicate.
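As a sketch of that three-container split (the image tags, ports, and single-node setting are assumptions; Logstash will also need a pipeline configuration for its Beats input, which is omitted here):

version: "3"
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.0
    environment:
      - discovery.type=single-node
    ports:
      - "9200:9200"
  logstash:
    image: docker.elastic.co/logstash/logstash:7.17.0
    ports:
      - "5044:5044"            # Beats input from the Filebeat agents
    depends_on:
      - elasticsearch
  kibana:
    image: docker.elastic.co/kibana/kibana:7.17.0
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch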

Multiple images inside one container

So here is the problem: I need to do some development, and for that I need the following packages:
MongoDb
NodeJs
Nginx
RabbitMq
Redis
One option is that I take an Ubuntu image, create a container, install them one by one, start my server, and expose the ports.
But this can easily be done in a VirtualBox VM as well, and it is not going to use the power of Docker. So instead I have to start building my own image with these packages. Now here is the question: if I start writing my Dockerfile and place the commands to download Node.js (and the others) inside it, this again becomes the same thing as virtualization.
What I want is to start from Ubuntu, keep adding references to MongoDB, Node.js, RabbitMQ, Nginx, and Redis inside the Dockerfile, and finally expose the respective ports.
Here are the queries I have:
Is this possible? Like adding the references of other images inside the Dockerfile when you are starting FROM one base image.
If yes, then how?
Also, is this the correct practice or not?
How do you do this kind of thing in Docker?
Thanks in advance.
Keep images light. Run one service per container. Use the official images on Docker Hub for MongoDB, Node.js, RabbitMQ, Nginx, etc. Extend them if needed. If you want to run everything in one fat container, you might as well just use a VM.
You can of course do crazy stuff in a dev setup, but why spend time setting up something that has zero value in a production environment? What if you need to scale up one of the services? How do you set memory and CPU constraints on each service? ...and the list goes on.
Don't make monolithic containers.
A good start is to use docker-compose to configure a set of services that can talk to each other. You can make a prod and a dev version of your docker-compose.yml file (a sketch follows below).
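As a sketch of what that could look like for the stack in the question (image tags are examples; the app service is a placeholder for your own Node.js build):

version: "3"
services:
  app:
    build: .                   # your Node.js app, hypothetical
    depends_on:
      - mongo
      - redis
      - rabbitmq
  nginx:
    image: nginx:1.25
    ports:
      - "80:80"
  mongo:
    image: mongo:6
  redis:
    image: redis:7
  rabbitmq:
    image: rabbitmq:3-management
    ports:
      - "15672:15672"          # management UI

A docker-compose.override.yml (picked up automatically by docker-compose) can then add dev-only concerns such as host-mounted volumes, while a separate file passed with -f keeps production lean.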
Getting into the right frame of mind
In a perfect world you would run your containers in a clustered environment in production, to be able to scale your system and have concurrency, but that might be overkill depending on what you are running. It's at least good to have this in the back of your mind, because it can help you make the right decisions.
Some points to think about if you want to be a purist:
How do you have persistent volume storage across multiple hosts?
Reverse proxy / load balancer should probably be the entry point into the system that talks to the containers using the internal network.
Is my service even able to run in a clustered environment (multiple instances of the container)?
You can of course do dirty things in dev such as mapping in host volumes for persistent storage (and many people who use docker standalone in prod do that as well).
Ideally we should separate Docker in dev and Docker in prod. Docker is a fantastic tool during development, as you can have Redis, Memcached, Postgres, MongoDB, RabbitMQ, Node or whatnot up and running in minutes, sharing that compose setup with the rest of the team. Docker in prod can be a completely different beast.
I would also like to add that I'm generally against the fanaticism that "everything should be running in docker" in prod. Run services in docker when it makes sense. It's also not uncommon for larger companies to make their own base images. This can be a lot of work and will require maintenance to keep up with security fixes etc. It's not necessarily the first thing you jump on when starting with docker.

Is it best practice to daemonize a process within docker?

Many best-practice guides emphasize making your process a daemon and having something watch it to restart it in case of failure. This made sense for a while. A specific example is Sidekiq:
bundle exec sidekiq -d
However, with Docker, as I build I've found myself simply executing the command in the foreground; if the process stops or exits abruptly, the entire Docker container goes away and a new one is automatically spun up - which is basically the entire point of daemonizing a process and having something watch it (all STDOUT is sent to CloudWatch / Elasticsearch for monitoring).
I feel like this also tends to reinforce the idea of a single process per Docker container; daemonizing a process would, in my opinion, tend to encourage a violation of that general standard.
Is there any best practice documentation on this even if you're running only a single process within the container?
You don't daemonize a process inside a container.
The -d is usually seen in the docker run -d command, using a detached (not daemonized) mode, where the Docker container runs in the background, completely detached from your current shell.
For running multiple processes in a container, the foreground process that manages the background ones would be a supervisor.
See "Use of Supervisor in docker" (or the more recent docker --init).
Some relevant 12-factor app recommendations:
An app is executed in the execution environment as one or more processes
Concurrency is implemented by running additional processes (rather than threads)
Website:
https://12factor.net/
Docker was open-sourced by a PaaS operator (dotCloud), so it's entirely possible the authors were influenced by this architectural recommendation. That would explain why Docker is designed to normally run a single process.
The thing to remember here is that a Docker container is not a virtual machine, although it's entirely possible to make it quack like one. In practice a docker container is a jailed process running on the host server. Container orchestration engines like Kubernetes (Mesos, Docker Swarm mode) have features that will ensure containers stay running, replacing them should the need arise.
Remember my mention of duck vocalization? :-) If you want your container to run multiple processes, then it's possible to run a supervisor process that keeps everything healthy and running inside (a container dies when its main process stops).
https://docs.docker.com/engine/admin/using_supervisord/
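The linked guide boils down to something like the following sketch (the base image, packages, and program commands are illustrative; adjust them to whatever you actually need to run):

# Dockerfile
FROM ubuntu:20.04
RUN apt-get update && apt-get install -y supervisor openssh-server apache2 \
    && mkdir -p /var/run/sshd
COPY supervisord.conf /etc/supervisor/conf.d/supervisord.conf
CMD ["/usr/bin/supervisord"]

# supervisord.conf
[supervisord]
nodaemon=true

[program:sshd]
command=/usr/sbin/sshd -D

[program:apache2]
command=/usr/sbin/apache2ctl -D FOREGROUND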
The ultimate expression of this VM envy would be LXD from Ubuntu, where an entire set of VM-style services gets bootstrapped within LXC containers:
https://www.ubuntu.com/cloud/lxd
In conclusion, is it a best practice? I think there is no clear answer. Personally I'd say no, for two reasons:
I'm fixated on deploying 12-factor-compliant applications, so I'm married to the single-process model.
If I need to run two processes on the same set of data, then in Kubernetes I can run containers within the same Pod... which means Kubernetes manages the processes (running as separate containers with a common data volume); a rough sketch follows below.
Clearly my reasons are implementation specific.
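For the curious, that Pod approach looks roughly like this (the names, images, and mount path are made up; the point is two containers sharing an emptyDir volume):

apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  volumes:
    - name: shared-data
      emptyDir: {}
  containers:
    - name: app
      image: my-app:latest            # placeholder
      volumeMounts:
        - name: shared-data
          mountPath: /data
    - name: log-shipper
      image: my-shipper:latest        # placeholder
      volumeMounts:
        - name: shared-data
          mountPath: /data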
There are multiple process supervisors that can help you take a foreground process (or several), run them monitored, and restart them on failure (or exit the container).
One is runit (http://smarden.org/runit/), which I have not used myself.
My choice is s6 (http://skarnet.org/software/s6/). Someone already built a container envelope for it, named s6-overlay (https://github.com/just-containers/s6-overlay), which is what I usually use if/when I need to have a user-space process run as a daemon. It also has facilities to do prep work on container start, change permissions, and more, at runtime.
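For reference, a bare-bones s6-overlay setup looks roughly like this (the release version, download URL, and service name are placeholders; check the s6-overlay README for the current installation steps):

FROM ubuntu:20.04
ADD https://github.com/just-containers/s6-overlay/releases/download/v2.2.0.3/s6-overlay-amd64.tar.gz /tmp/
RUN tar xzf /tmp/s6-overlay-amd64.tar.gz -C /
# Each supervised service gets a run script under /etc/services.d/<name>/run
COPY myservice-run /etc/services.d/myservice/run
ENTRYPOINT ["/init"]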
tl;dr: I can't find a best practices document that relates directly to this for docker, but I agree with you.
The only best "Best Practices" for docker I could find was at dockers own site, which states that containers should be one process. In my mind, that means foregrounded processes as well. So basically, I've drawn the same conclusion as you. (You've probably read that too, but this is for anyone else reading this).
Honestly, I think we are still in (relatively) new territory with best practices for docker. Anecdotally, it has been a best practice in the organizations I've worked with. The number of times I've felt more satisfied with a foregrounded process has been significantly greater then the times I've said to myself "Boy, I sure wish I backgrounded that one." In fact, I don't think I've ever said that.
The only exception I can think of is when you are trying to evaluate software and need a quick and dirty way to ship infrastructure off to someone. EG: "Hey, there is this new thing called LAMP stacks I just heard of, here is a docker container that has all the components for you to play around with". Again, though, that's an outlier and I would shudder if something like that ever made it to production or even any sort of serious development environment.
Additionally, it certainly forces a microservices-style architecture, which I think is ultimately a good thing.

Dockerize Applications or Machines?

I apologize if my question is too basic, but I just started to learn docker and there are some concepts that are not clear to me.
I think of Docker as a fully functional virtual machine, and therefore I thought of transforming a server with a set of services into a container. Then I started reading about "dockerizing" applications (for instance PostgreSQL), and I learned that a Dockerfile can only have one default instruction to execute when the container starts. I read that it is possible to coordinate multiple processes using a supervisor, but I started wondering whether that is really the best approach, or whether it would be better to lean towards a microservices architecture.
To state my point better, I would like to describe my use case. I want to create an environment that provides Tomcat (with some services deployed through servlets) and a PostgreSQL database. Ideally, I would like the services (and the database) to run on the same host (on different ports).
Is it a best practice to create one container for Tomcat and one for the Database, or is it better to ship them in the same container?
And if I create two different containers, which framework should I use to orchestrate them? Is Docker Compose appropriate for this task?
1)
I think of Docker as a fully functional virtual machine, and therefore I thought of transforming a server with a set of services into a container.
NO, NO, and NO! It is not a virtual machine.
Run only one process per container
In almost all cases, you should only run a single process in a single container. Decoupling applications into multiple containers makes it much easier to scale horizontally and reuse containers. If that service depends on another service, make use of container linking.
Read Best practices for writing Dockerfiles https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/
2)
And if I create two different containers, which framework should I use to orchestrate them? Is it Docker compose appropriated for this task?
You need to start with docker-compose (a rough sketch follows). When you are more familiar with it, you can make your own well-founded decision.
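As a starting point for the Tomcat + PostgreSQL use case described above (image tags, the password value, and the volume name are placeholders; see each image's documentation):

version: "3"
services:
  tomcat:
    image: tomcat:9
    ports:
      - "8080:8080"
    depends_on:
      - db
  db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: example    # placeholder
    volumes:
      - pgdata:/var/lib/postgresql/data
volumes:
  pgdata: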

Docker - one process per container?

I sometimes use Docker for my development work. When I do, I usually work on an out-of-the-box LAMP image from Tutum.
My question is: doesn't it defeat the purpose of working with Docker if it runs multiple processes in one container (like the container started from Tutum's LAMP image)? Isn't the whole idea of Docker to separate each process into a separate container?
While it is generally a good rule of thumb to separate processes into separate containers, that's not the main benefit/purpose of Docker. The benefit of Docker is immutability. And if throwing two processes into a single container makes for cleaner logic, then go for it. Though in this case, I would definitely consider at least stripping out the DB into its own container and talking to it through a Docker link. The database shouldn't have to go down every time you rebuild your image.
Generally, sometimes it is necessary or more useful to use one container for more than one process, as in this situation.
Such a situation happens when the processes are used together to fulfill a task. I can imagine, for example, somebody wanting to add logging to a web application by using ELK (Elasticsearch, Logstash, Kibana). Those things run together and can have a supervisor monitoring the processes inside one container.
But for most cases it is better to use one process per container. What is more, the Docker command should start the process itself, for example running a Java application with
/usr/bin/java -jar application.jar
rather than running an external script:
./launchApplication.sh
See the discussion at http://www.reddit.com/r/docker/comments/2t1lzp/docker_and_the_pid_1_zombie_reaping_problem/ which covers this problem.
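To tie that back to the Dockerfile: the exec-form CMD below makes the Java process PID 1 directly, and if a wrapper script really is needed, ending it with exec keeps signals flowing to the real process (the base image tag and jar name are illustrative):

FROM eclipse-temurin:17-jre
COPY application.jar /app/application.jar
# Exec form: java itself becomes PID 1 and receives signals directly.
CMD ["java", "-jar", "/app/application.jar"]

# If a wrapper script is unavoidable, end it with exec, e.g.:
#   #!/bin/sh
#   exec java -jar /app/application.jar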
