I am new with docker swarm and I'm having ambitious to deploy my application with docker swarm.
With the docker swarm, it has itself discovery service but I googled around and found out people are mentioning about the Consul as discovery service.
My question is. What is the advantage of Consul? Why don't we just use default discovery service?
Thanks,
Consul was used as a service discovery module in the standalone Swarm (prior to docker 1.12). However, since docker 1.12, Swarm mode was introduced with comes with default discovery service. So you don't need an external store.
Key point to notice is that if you had a swarm with an external store like consul, it would still have some data/metadata that needs to be preserved. Hence the use of Consul still exists.
Let us first look at the scope of service discovery provided by both swarm and Consul.
Swarm is to fascilitate service discovery on your docker network/infra only, while consul can be used with almost anything if you know how to use it, be it a monolythic application or a microservice, consul gives you all of that at one place.
Secondly, even though Swarm is great to handle a small infrastructure loads, it doesn't really go well with handling high production loads for a resource heavy infrastructure. This is why there are other tools in existance, for example kubernetes, ECS etc.
So considering that you have an application which you know is going to grow, I would rather go for a solution that works well with whatever I may try in future without having to change too much and works well with scaling on any IaaS provider. Hope that helps.
Related
First of all:
I have some poor experience with docker swarm (I mean I touch in production env). I read a lot of about it, and I know concepts like veth, overlay, labels-pinning, vtep, bridge and so on. I know also that docker swarm use some distributed key-value storage to deal with management of cluster.
However, there is something that I don't understand: service disorvery/DNS/resolving service name.
How does it work? Where is this DNS server placed? Who cares to resolve service names?
Is it possible to read the content of distributed key-value storage?
How does it work? Where is this DNS server placed? Who cares to resolve service names?
It's embedded in the docker daemon itself. Every container which is part of a user defined network, does is name resolving requests to 127.0.0.11.
https://docs.docker.com/v17.09/engine/userguide/networking/configure-dns/
Is it possible to read the content of distributed key-value storage?
Docker is using libkv. But I'm not sure if it's possible to bypass the docker daemon and access it.
https://github.com/docker/libkv
The DNS in Docker Swarm overlay networks works just like docker-compose bridge networks. Services resolve inside the same network by their service name. Other things like Swarm VIP and Routing Mesh make the whole solution slightly different but don't directly affect DNS resolution.
The Swarm raft log isn't meant to be easily read, but it's not more than just the service definitions of services and networks you create in Swarm. I've never needed to look at it directly in a production system.
Here's a 3.5 training video on all things Swarm (including lots of network details) from former Docker engineer Jerome Petazzoni and AJ Bowen
Also, Laura Frank has some details from last years DockerCon on how the raft log and consensus works and might point to some tools if you want to look under the hood.
I have a couple of Docker swarm questions (Sorry for not splitting them up but they are all closely related):
Do all instances in a swarm have to run on different machines or can they all run on the same? (if having limited amount of hardware and just wanting to try swarm mode)
Do I have to run swarm mode to be able to communicate between instances?
What is the key difference between swarm mode and just running a number of containers as regular?
What are the options of communication between instances of containers? (in swarm and in regular mode) http? named pipes? other?
If using http communication between containers on same machine, will it be roughly similarly as fast as named pipes?
Is there any built in support for a message bus or similar in Docker?
Is there support for any consensus protocol in Docker?
Are there any GUI's for designing, managing, testing and/or debugging Docker swarms?
Can a container list other containers, stop/restart some and start new ones? (to be able to function as a manager for other containers)
Can a container be given access to OS-features (Linux in my case) to configure for instance a reverse proxy or port forwarding on the WAN?
Background: What I'm trying to figure out is how I should go about and build a micro service mesh using Docker. The containers will be running .NET Core. I'm not too keen on relying too much on specifically Docker since it may not be the preferred tech in a couple of years. What can/should I do with Docker and what can/should I do inside the containers. That's what I'm trying to figure out.
I've copied your questions and tried to answer them.
Do all instances in a swarm have to run on different machines or can they all run on the same? (if having limited amount of hardware and just wanting to try swarm mode)
You can have only one machine in a swarm and run multiple tasks of the same service or in other words your scale of a service can be more than the number of actual machines. I have a testing swarm with a single machine and one with three and it works the same way.
Do I have to run swarm mode to be able to communicate between instances?
You have to run your docker in swarm mode in order to create a service, please see this link
What is the key difference between swarm mode and just running a number of containers as regular?
The key difference afaik is, that when a task goes down, docker puts another task up automatically. And you can easily scale your services, which means you can easily have multiple tasks just by scaling your service (up or down). As of running a container - when it goes down you have to manually start another.
What are the options of communication between instances of containers? (in swarm and in regular mode) http? named pipes? other?
I've currently only tested with a couple of wildfly servers in a swarm, which are on the same network. I'm not sure about others, but would love to find out. I've only read about RabbitMQ, but can't seem to find the link atm.
If using http communication between containers on same machine, will it be roughly similarly as fast as named pipes?
I can't say.
Is there any built in support for a message bus or similar in Docker?
I can't say.
Are there any GUI's for designing, managing, testing and/or debugging Docker swarms?
I've tested rancher and portainer.io, for a list of them I found this link
Can a container list other containers, stop/restart some and start new ones?
I'm not sure why would you want to do that? And I guess it's possible, see this link
Can a container be given access to OS-features (Linux in my case) to configure for instance a reverse proxy or port forwarding on the WAN?
I can't say.
#namokarm did a great job, and I'm filling in the gaps:
Benefits of Swarm over docker run or docker-compose.
All communications between containers has to be TCP/UDP etc. You could force two containers to only run on a single machine, then bind-mount their socket so they skip the network, but that would be a bit of an anti-pattern. Swarm is designed for everything to be distributed and TCP/UDP.
In a few cases, such as PHP-FPM + Nginx, I recommend bundling both in the same container (against docker best practices, but trust me it's easier than separate containers). This will ensure they scale together (1-to-1 relationship) and stay fast since they use local sockets to communicate). I only recommend this for a few setups like this, the other being ColdFusion + Nginx because they are two parts of the same tool that provide a HTTP response... I don't recommend bundling images together in nearly all other cases, but I'm open to ideas :).
Rancher is no longer supporting Swarm. Portainer and SwarmPit are GUI options.
Yes a container running something like Portainer/SwarmPit or controlling the Docker socket through a bind-mount or TCP can control the whole Swarm. This is how all docker management works :)
For reverse proxy, you would run a container-based proxy like Traefik or Docker Flow Proxy, which sets up HAProxy for Docker and Swarm.
Many of these topics are discussed in my DockerCon talks: https://www.bretfisher.com/dockercon18/
I'm new to docker and microservices. I've started to decompose my web-app into microservices and currently, I'm doing manual configuration.
After some study, I came across docker swarm mode which allows service discovery. Also, I came across other tools for service discovery such as Eureka and Consul.
My main aim is to replace IP addresses in curl call with service name and load balance between multiple instances of same service.
i.e. for ex. curl http://192.168.0.11:8080/ to curl http://my-service
I have to keep my services language independent.
Please suggest, Do I need to use Consul with docker swarm for service discovery or i can do it without Consul? What are the advantages?
With the new "swarm mode", you can use docker services to create clustered services across multiple swarm nodes. You can then access those same services, load-balanced, by using the service name rather than the node name in your requests.
This only applies to nodes within the swarm's overlay network. If your client systems are part of the same swarm, then discovery should work out-of-the-box with no need for any external solutions.
On the other hand, if you want to be able to discover the services from systems outside the swarm, you have a few options:
For stateless services, you could use docker's routing mesh, which will make the service port available across all swarm nodes. That way you can just point at any node in the swarm, and docker will direct your request to a node that is running the service (regardless of whether the node you hit has the service or not).
Use an actual load balancer in front of your swarm services if you need to control routing or deal with different states. This could either be another docker service (i.e. haproxy, nginx) launched with the --mode global option to ensure it runs on all nodes, or a separate load-balancer like a citrix netscaler. You would need to have your service containers reconfigure the LB through their startup scripts or via provisioning tools (or add them manually).
Use something like consul for external service discovery. Possibly in conjunction with registrator to add services automatically. In this scenario you just configure your external clients to use the consul server/cluster for DNS resolution (or use the API).
You could of course just move your service consumers into the swarm as well. If you're separating the clients from the services in different physical VLANs (or VPCs etc) though, you would need to launch your client containers in separate overlay networks to ensure you don't effectively defeat any physical network segregation already in place.
Service discovery (via dns) is built into docker since version 1.12. When you create a custom network (like bridge or overlay if you have multiple hosts) you can simply have the containers talk to each other via name as long as they are part of same network. You can also have an alias for each container which would round-robin the list of containers which have the same alias. For simple example see:
https://linuxctl.com/docker-networking-options-bridge
As long as you are using the bridge mode for your docker network and creating your containers inside that network, service discovery is available to you out of the box.
You will need to get help from other tools once your infrastructure starts to span in to multiple servers and microservices distributed on them.
Swarm is a good tool to start with, however, I would like to stick to consul if it comes to any IaaS provider like Amazon for my production loads.
We are currently moving towards microservices with Docker from a monolith application running in JBoss. I want to know the platform/tools/frameworks to be used to test these Docker containers in developer environment. Also what tools should be used to deploy these containers to this developer test environment.
Is it a good option to use some thing like Kubernetes with chef/puppet/vagrant?
I think so. Make sure to get service discovery, logging and virtual networking right. For the former you can check out skydns. Docker now has a few logging plugins you can use for log management. For virtual networking you can look for Flannel and Weave.
You want service discovery because Kubernetes will schedule the containers the way it sees fit and you need some way of telling what IP/port your microservice will be at. Virtual networking make it so each container has it's own subnet thus preventing port clashes in case you have two containers with the same ports exposed in the same host (kubernetes won't let it clash, it will schedule containers to run until you have hosts with ports available, if you try to create more it just won't run).
Also, you can try the built-in cluster tools in Docker itself, like docker service, docker network commands and Docker Swarm.
Docker-machine helps in case you already have a VM infrastructure in place.
We have created and open-sourced a platform to develop and deploy docker based microservices.
It supports service discovery, clustering, load balancing, health checks, configuration management, diagnosing and mini-DNS.
We are using it in our local development environment and production environment on AWS. We have a Vagrant box with everything prepared so you can give it a try:
http://armada.sh
https://github.com/armadaplatform/armada
I am trying to link 2 docker containers using mesos/marathon framework. As I understand there is no way to use the docker link feature in mesos/martahon. So the way to go forward is to use service discovery. Since zookeeper is already used my question is how to use zookeeper for service discovery so that 1 container can talk to another one.
For service discovery on Mesos/Marathon, you can use a proxy server (see https://mesosphere.github.io/marathon/docs/service-discovery-load-balancing.html) or a DNS server that derives settings from Mesos automatically (see https://github.com/mesosphere/mesos-dns).
Although possible I would reconsider using Zookeeper as a centralized KV store for your configuration and services information. You could try to implement a daemon to ask and save data in zookeeper in order to configure your container's config files and live patching, but it's a complex solution (there are examples of this approach in this post from Pinterest, or in Hadoop's ZKFailoverController daemon). From my point of view there are more suited solutions as Consul or etcd, with implementations of the daemons as kelseyhightower/confd or consul-template.