I want to be able to develop microservices locally, but also to 'push' them into production with minimal configuration changes. I used to put all microservices into one docker-compose file locally, but I'm starting to see that this might not be practical.
The new idea is to have a single docker-compose file per service. That does not mean it will run with only one container; it might have more inside (like some datastore behind it, etc.).
From that point of view, let's take a look at the well-known Docker voting app example, which consists of 5 components:
(P) Python webapp which lets you vote between two options
(R) Redis queue which collects new votes
(J) Java worker which consumes votes and stores them in…
(S) Postgres database backed by a Docker volume
(N) Node.js webapp which shows the results of the voting in real time
Let's say you want to push this example into production (so having just one docker-compose file is not an option :). Don't forget that more infrastructure-related components may be added on top of it (like Kibana, Prometheus...). We also want to be able to scale what we need, and we use e.g. Swarm.
The question is:
How to organize this example: in a single docker-compose file or in many?
What microservices do we have here? In other words, which components would you combine into a single docker-compose file? For example, J and S?
If services are not in a single docker-compose file, do we add them to the same overlay network to use the Swarm DNS feature (as sketched below)?
and so on...
(I don't need details on how to install stuff, this question is about top-level organization)
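To make it concrete, here is roughly the split I am imagining, with every per-service compose file attaching to one pre-created overlay network so that Swarm DNS still works across stacks (the groupings, file layout, network name and image tags below are just my guesses):

# created once, outside any compose file
docker network create --driver overlay --attachable voting-net

# vote/docker-compose.yml  (P + R grouped, as one possible cut)
version: "3.7"
services:
  vote:
    image: dockersamples/examplevotingapp_vote
    networks: [voting-net]
  redis:
    image: redis:alpine
    networks: [voting-net]
networks:
  voting-net:
    external: true

# worker/docker-compose.yml  (J + S grouped, another possible cut)
version: "3.7"
services:
  worker:
    image: dockersamples/examplevotingapp_worker
    networks: [voting-net]
  db:
    image: postgres:9.6
    volumes:
      - db-data:/var/lib/postgresql/data
    networks: [voting-net]
volumes:
  db-data:
networks:
  voting-net:
    external: true

The N (results) app would get its own compose file in the same way.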
Docker Compose is mostly for defining different containers, configuring them, and making them available with a single command (it also handles startup ordering). So it is best suited for local development, integration testing, and use as part of your Continuous Integration process.
While not ruling out that Docker Compose can be used in a production environment, I think this would be a good case for Kubernetes, which gives more control over scaling and managing multiple containers.
This blog has some example scenarios to try out (and many other resources which can be helpful)
https://renzedevries.wordpress.com/2016/05/31/deploying-a-docker-container-to-kubernetes-on-amazon-aws/comment-page-1/#comment-10
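To give a rough sense of what that looks like, one component of the voting app could be described in Kubernetes with a Deployment and a Service along these lines (only a sketch; names, image, replica count and ports are illustrative):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: vote
spec:
  replicas: 2                 # scale by changing this, or with kubectl scale
  selector:
    matchLabels:
      app: vote
  template:
    metadata:
      labels:
        app: vote
    spec:
      containers:
        - name: vote
          image: dockersamples/examplevotingapp_vote
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: vote
spec:
  selector:
    app: vote
  ports:
    - port: 80
      targetPort: 80

Scaling then becomes 'kubectl scale deployment vote --replicas=4' rather than editing compose files.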
I have been assigned a problem statement which goes as follows:
I am building a platform-as-a-service from scratch, which has pipelined execution. Here, pipelined execution means that the output of one service can be the input to another service. The platform can offer a number of services, which can be pipelined together. The output of a service can be the input to multiple services.
I am new to this field, so how to go about this task is not very intuitive to me.
After researching a bit, I found out that I can use Docker to deploy services in containers. So I installed Docker on Ubuntu, pulled a few images, and ran them as services (for example, MongoDB). What I am thinking is that I need to run the services in containers and define a way of passing input and output between these services. But how exactly do I do this using Docker containers? As an example, I want to send a query as input to MongoDB (running as a service) and get an output, which I want to feed into another service.
Am I thinking in the right direction? If not in what direction should I be thinking of going about implementing this task?
Is there a standard way of exchanging data between services? (For example, the output of one service as the input to another.)
Is there something that Docker offers that I can leverage?
NOTE: I cannot use any high level API which does this for me. I need to implement it myself.
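To make the question more concrete, here is roughly the kind of wiring I have in mind (everything below is only an illustration; 'votes' is a made-up collection and my-next-service is a placeholder image):

# a user-defined network so containers can reach each other by name
docker network create pipeline-net

# stage 1: MongoDB running as a service
docker run -d --name mongo --network pipeline-net mongo:6

# stage 2: query MongoDB, then pipe the output into the next service via stdin
docker run --rm --network pipeline-net mongo:6 \
    mongosh --host mongo --quiet --eval 'db.votes.find().toArray()' \
  | docker run --rm -i my-next-service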
I am containerising a legacy application (web + service + C++ app) which runs in a Linux environment and currently has more than 10 clients.
I was able to set up and run the app (the C++ app) from Docker. The app reads a property file which will be different for different clients, so I mounted a volume for sharing data outside of Docker (some files may get changed at runtime).
But my biggest concern is: how do I run a single container for different clients whose runtime (in-memory) state will be different? (The application will run forever, until someone kills/stops it.)
Do I need to run n containers for n clients?
Does Docker Swarm/Kubernetes have some feature for such a scenario?
Will each client get its own dedicated container?
Can you also suggest some further reading/studying for such scenarios?
And for the database - since every client will have different data - should a different DB be used?
You can isolate containers by supplying them with a unique name and environment variables.
Example:
docker run --name client1 --env-file ./client1.env your-image-name-here
With this approach, each container gets an isolated environment and a configuration that is unique to its context.
You need N containers for N clients, but you can use the same image for all N containers. So: one container per customer, and each container is identified by its own unique name and environment variables.
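A minimal sketch of that, assuming one env file per client (the client names and variables are made up):

# client1.env, client2.env, ... each hold that client's settings, e.g.:
#   DB_NAME=client1_db
#   CONFIG_PATH=/conf/client1

for client in client1 client2 client3; do
  docker run -d \
    --name "$client" \
    --env-file "./$client.env" \
    your-image-name-here
done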
What you describe sounds like your basic needs are:
same application for everyone
one configuration per client
one DB per client
The first point is easy to solve: "same application = same image".
Then, what you'll need in order to personalize the application is the configuration and the DB path.
If you want to containerize the DB, the questions will be the same, so let's say you have a DB URL instead (it could be a container; it doesn't matter that much).
There are various options to personalize your applications:
inherit from the common image and derive it into as many images as you need... with a serious impact on maintainability
add customisation through "docker-compose", which is easier to read, write and maintain!
If one instance per client is ok, just go with a docker-compose per client.
If you happen to need more, go for swarm mode (you can use swarm mode for one instance as well).
In both cases, you'll need a docker-compose file (actually you don't really NEED it, since you can do all the same through the command line, but that's less easy to maintain in my opinion, and less easy to explain!).
It may look like this:
version: "3.7"
my-service:
image: your/common/image:1.0
volume:
- /a/path/from/host/with/confs:/a/path/to/container/conf/dir # will replace content there!
environment:
- "DB_URL=my-cny.denver.com:3121/db_client" # can vary or be the same if DB_NAME vary instead
- "DB_NAME=my-cny.denver.com:3121/db_client_1" #vary the name of the DB
- "DB_PSSWD=toto"
...
There are things you shouldn't do, such as writing a clear-text password here, but this is just an example.
There are better mechanisms for config files and sensitive data, which should be managed through the "config" and "secret" mechanisms.
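A rough sketch of what that could look like with swarm configs and secrets (the file paths and names are invented, and this form requires deploying with docker stack deploy):

version: "3.7"
services:
  my-service:
    image: your/common/image:1.0
    configs:
      - source: client_conf
        target: /a/path/to/container/conf/app.properties
    secrets:
      - db_password            # readable inside the container at /run/secrets/db_password
configs:
  client_conf:
    file: ./conf/client1.properties
secrets:
  db_password:
    file: ./secrets/db_password.txt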
We are moving some of our internal services to rely on Docker instead of direct installation on the host OS (good thing, right :).
We use the docker stack command with compose files (as it felt to us like the modern approach). But we are not sure how to properly make our stacks modular while still allowing composition:
Let's imagine we have two stacks: stackA and stackB. Those two can perfectly well be used in isolation, so for the moment we decided to host them in two separate repositories, each containing the docker-compose.yml of the corresponding stack.
Yet, there is also a mode where stackB can communicate with stackA to provide additional features. On some nodes, we might want to deploy both, and have them communicate.
By default, when we start both stacks on the same node with:
docker stack deploy -c stackA/docker-compose.yml A-stack
docker stack deploy -c stackB/docker-compose.yml B-stack
Both end up on different overlay networks, and cannot easily communicate.
It seems we are faced with a choice, for which we could only find 3 options at the moment:
We have seen ways to add external networks to stackB in its compose file, but that means stackB can now only be deployed if stackA is already running (because it wants to join an external network).
We could define another compose file, manually merging both. But that leads us to maintain another repo, and duplicate changes.
We could have the stack communicate over the host network through exposed ports, but it might feel a bit weird.
Is there a best/recommended approach to keep different stacks modular while still being able to compose them easily?
Or is it an implicit assumption that as soon as two containers are supposed to communicate, they have to be deployed from the same compose file?
I usually handle more than one stack in cases where I want to handle them separately. A common situation is horizontal scaling of the same web service image for different customer installations with different configurations, e.g. databases.
The separated stacks allow me to easily shut one down without any impact on the other installations.
I also like the standard naming conventions in multi-stack installations: the same services have the same names apart from the stack prefix.
To let the stacks communicate across their boundaries, they only have to share the same network.
In my case, the first stack implicitly defines a network and the other stack joins that network via its compose file configuration:
...
networks:
  default:
    external:
      name: FIRST_STACK_NAME_default
...
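A variation that avoids the ordering constraint from the question (stackB only being deployable once stackA runs) is to pre-create a shared, attachable overlay network yourself and have both stacks treat it as external; this is only a sketch, and the network name is arbitrary:

# once, before deploying either stack
docker network create --driver overlay --attachable shared-net

# then, in BOTH compose files
networks:
  default:
    external:
      name: shared-net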
I have a question related to best practices for deploying applications to production based on Docker Swarm.
In order to simplify the discussion related to this question/issue, let's consider the following scenario:
Our swarm contains:
6 servers (different hosts)
on each of these servers, we will have one service
each service will have only one task/replica running
Memcached1 and Memcached2 use public images from Docker Hub
"Recycle data 1" and "Recycle data 2" use a custom image from a private repository
"Client 1" and "Client 2" use a custom image from a private repository
So in the end, for our example application, we have 6 containers running across 6 different servers: 2 of them are memcached, and 4 of them are clients which communicate with memcached.
"Client 1" and "Client 2" are going to insert data into memcached based on some kind of rules. "Recycle data 1" and "Recycle data 2" are going to update or delete data from memcached based on some kind of rules. Simple as that.
Our applications which communicate with memcached are custom ones written by us. The code for these applications resides on GitHub (or any other repository). What is the best way to deploy this application to production:
1. Build images which contain the code copied into the image, and use those images to deploy to the swarm
2. Build an image which uses a volume where the code resides outside of the image.
Keeping in mind that I am deploying swarm to production for the first time, I can see a lot of issues with option 1. Having the code incorporated into the images seems illogical to me, given that 99% of the time the updates that happen will be code-based. This would require building an image every time you want to update the code which runs in a specific container (no matter how small the change is).
Option 2 seems much more logical to me, but at this moment I am not sure whether it is possible. So there are a number of questions here:
What is the best approach when we are going to host multiple containers which run the same code?
Is it possible, on Docker Swarm, to have one central host/server (manager, anywhere) where we can clone our repositories and share those repositories as volumes across the swarm? (In our example, all 4 custom services would mount the volume where our code is hosted.)
If this is possible, what is the docker-compose.yml implementation for it?
After digging deeper and working with Docker and Docker Swarm mode for the last 3 months, these are the answers to the questions above:
Answer 1: In general, you should consider your Docker image as a "compiled" version of your program. Your image should contain either the code base or a compiled version of the program (depending on which programming language you are using), and that specific image represents one version of the app. Every single time you want to deploy your next version, you generate a new image.
This is probably the best approach for 99% of the apps that are going to be hosted with Docker (exceptions are development environments and apps where you really want to shell in and control things directly from the container itself).
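In practice this becomes a build-tag-push cycle per release. A rough sketch, assuming a Python app and a private registry (all names, tags and the service name are placeholders):

# Dockerfile: the code is baked into the image at build time
FROM python:3.11-slim
WORKDIR /app
COPY . /app
RUN pip install --no-cache-dir -r requirements.txt
CMD ["python", "client.py"]

# every code change produces a new, versioned image
docker build -t registry.example.com/client:1.4.2 .
docker push registry.example.com/client:1.4.2

# roll the new version out to the corresponding swarm service
docker service update --image registry.example.com/client:1.4.2 mystack_client1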
Answer 2: It is possible, but it is an extremely bad approach. As mentioned in answer 1, the best option is to copy the app code directly into the image and "consider" your image (running container) as the app itself.
I was not able to wrap my head around this concept at the beginning, because it does not allow you to simply go to the server (or wherever you are hosting your container), change the app, and restart the container (obviously, because after a restart the container starts from the same image again, with the same code base you deployed with that image). Any kind of change SHOULD and NEEDS to be deployed as a different image with a different version. That is what Docker is all about.
Additionally, the initial idea of sharing the same code base across multiple swarm services is possible, but it totally defeats the purpose of versioning across the swarm.
Consider having 3 services which are used as redundant services (failover), where you want to run a new version on one of them as a beta test. This is not possible with a shared code base.
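With versioned images, on the other hand, that kind of beta test is just a matter of pointing one service at a different tag, along these lines (tags and names are made up):

version: "3.7"
services:
  app-1:
    image: registry.example.com/client:1.4.2
  app-2:
    image: registry.example.com/client:1.4.2
  app-beta:
    image: registry.example.com/client:1.5.0-beta   # the beta version runs on one service only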
I've been doing some tests with Docker, and so far I'm wondering why it's considered good practice to separate the DB and the app into two containers.
Having two containers seems to be cumbersome to manage and I don't really see the value in it.
I like the idea of having one self-sufficient container per app instead.
One reason is the separation of data storage and application. If you put each in its own container, you can update them independently. In my experience this is a common need, because usually the application evolves faster than the underlying database.
It also frees you to run the containers in different places, which might be a constraint in your operations. Or to run multiple containers from the same database image with different applications.
Often it is also good to be able to scale the UI from one instance to multiple instances, all connected to the same database (or cache instance, or HTTP backend). This is mentioned briefly in the Docker best practices.
I also understand the urge to run multiple processes in one container; that's why so many minimalist init systems/supervisors like s6 have come up lately. I prefer this for demos of applications which require a couple of things, like an nginx frontend, a database, and maybe a Redis instance. But you could also write a basic docker-compose file and run the demo with multiple containers.
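For such a demo, the multi-container alternative is only a handful of compose lines anyway; a minimal sketch (the images and the password are placeholders):

version: "3.7"
services:
  frontend:
    image: nginx:alpine
    ports:
      - "8080:80"
  db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: example   # demo only, don't hard-code real credentials
    volumes:
      - db-data:/var/lib/postgresql/data
  cache:
    image: redis:alpine
volumes:
  db-data: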
It depends on what you consider your "DB": is it the database application or the content?
The latter is easy: the content needs to be persisted beyond the lifetime of the application. The convention used to be to have a "data" container, which simplified linking it with the application (e.g. using the Docker Engine create command's --volumes-from parameter). Since Docker 1.9 there is a new volume API which has superseded the concept of "data" containers. Either way, you should never store your data in the overlay filesystem (if not for persistence, then for performance).
If you are referring to the database application, you really enter a semi-religious debate with the microservices crowd. Docker is built to run a single process. It is built for 12-factor apps. It is built for microservices. It is definitely possible to run more than one process in a container, but with that you have to consider the additional complexity of managing/monitoring these processes (e.g. using an init process like supervisord), dealing with logging, etc.
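For completeness, the multi-process route usually ends up looking like this kind of supervisord configuration (a sketch only; the program commands are placeholders):

; /etc/supervisor/conf.d/app.conf
[supervisord]
nodaemon=true                 ; keep supervisord in the foreground as PID 1

[program:web]
command=/usr/local/bin/my-web-app
autorestart=true
stdout_logfile=/dev/stdout    ; forward output so docker logs still works
stdout_logfile_maxbytes=0

[program:worker]
command=/usr/local/bin/my-worker
autorestart=true
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0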
I've delivered both. If you are managing the container deployment (e.g. you are hosting the app), it is actually less work to use multiple containers. This allows you to use Docker's abstraction layers for networking and persistent storage. It also provides maximum portability as you scale the application (perhaps using Convoy or Flocker volume drivers, or an overlay network, for hosting containers across multiple servers). If you are developing a product for distribution, it is more convenient to deliver a single Docker repository (with one image). This minimizes the support costs as you guide customers through deployment.