I want to find a way to deploy an etcd cluster as a Docker Swarm service that would automatically configure itself without any interaction. Basically, I think of something in spirit of this command:
docker service create --name etcd --replicas 3 my-custom-image/etcd
I'm assuming that overlay network is configured to be secure and provide both encryption and authentication, so I believe I don't need TLS, not even --auto-tls. Don't want an extra headache finding a way to provision the certificates, when this can be solved on the another layer.
I need an unique --name for each instance, but I can get that from an entrypoint script that would use export ETCD_NAME=$(hostname --short).
The problem is, I'm stuck on initial configuration. Based on the clustering guide there are three options, but none seems to fit:
The DNS discovery scenario is closest to what I'm looking for, but Docker doesn't support DNS SRV records discovery at the moment. I can lookup etcd and I will get all the IPs of my nodes' containers, but there are no _etcd-server._tcp records.
I cannot automatically build ETCD_INITIAL_CLUSTER because while I know the IPs, I don't know the names of the other nodes and I'm not aware about any way to figure those out. (I'm not going to expose Docker API socket to etcd container for this.)
There is no preexisting etcd cluster, and while supplying the initial configuration URI from discovery.etcd.io is a possible workaround I'm interested in not doing this. I'm aiming for "just deploy a stack from this docker-compose.yml and it'll automatically do the right thing, no questions asked" no-brainer scenario.
Is there any trick I can pull?
As you have correctly said you know the IPs of your nodes’ containers,
so the suggested trick is to simply build the required etcd names as derivatives of each node’s IP address.
inside each container etcd is named using this particular container's IP i.e. etcd-$ip
ETCD_INITIAL_CLUSTER is populated using other containers' IPs in a similar way
The names could be as simple as etcd-$ip or even better i.e. we could use the netmask to calculate the node’s IP on this network to make the names prettier.
In this case in a simple 3-nodes configuration one could end up having names like etcd-02 etcd-03 etc
No specific requirements exist for the name attribute, it just needs to be unique and human-readable. Although it indeed looks like a trick it might work
Related
First of all:
I have some poor experience with docker swarm (I mean I touch in production env). I read a lot of about it, and I know concepts like veth, overlay, labels-pinning, vtep, bridge and so on. I know also that docker swarm use some distributed key-value storage to deal with management of cluster.
However, there is something that I don't understand: service disorvery/DNS/resolving service name.
How does it work? Where is this DNS server placed? Who cares to resolve service names?
Is it possible to read the content of distributed key-value storage?
How does it work? Where is this DNS server placed? Who cares to resolve service names?
It's embedded in the docker daemon itself. Every container which is part of a user defined network, does is name resolving requests to 127.0.0.11.
https://docs.docker.com/v17.09/engine/userguide/networking/configure-dns/
Is it possible to read the content of distributed key-value storage?
Docker is using libkv. But I'm not sure if it's possible to bypass the docker daemon and access it.
https://github.com/docker/libkv
The DNS in Docker Swarm overlay networks works just like docker-compose bridge networks. Services resolve inside the same network by their service name. Other things like Swarm VIP and Routing Mesh make the whole solution slightly different but don't directly affect DNS resolution.
The Swarm raft log isn't meant to be easily read, but it's not more than just the service definitions of services and networks you create in Swarm. I've never needed to look at it directly in a production system.
Here's a 3.5 training video on all things Swarm (including lots of network details) from former Docker engineer Jerome Petazzoni and AJ Bowen
Also, Laura Frank has some details from last years DockerCon on how the raft log and consensus works and might point to some tools if you want to look under the hood.
We're dockerizing our micro services app, and I ran into some discovery issues.
The app is configured as follows:
When the a service is started in 'non-local' mode, it uses Consul as its Discovery registry.
When a service is started in 'local' mode, it automatically binds an address per service (For example, tcp://localhost:61001, tcp://localhost:61002 and so on. Hard coded addresses)
After dockerizing the app (for local mode only, for now) each service is a container (Docker images orchestrated with docker-compose. And with docker-machine, if that matters)
But one service can not interact with another service since they are not on the same machine and tcp://localhost:61001 will obviously not work.
Using docker-compose with links and specifying localhost as an alias (service:localhost) didn't work. Is there a way for 2 containers to "share" the same localhost?
If not, what is the best way to approach this?
I thought about using specific hostname per service, and then specify the hostname in the links section of the docker-compose. (But I doubt that this is the elegant solution)
Or maybe use a dockerized version of Consul and integrate with it?
This post: How to share localhost between two different Docker containers? provided some insights about why localhost shouldn't be messed with - but I'm still quite puzzled on what's the correct approach here.
Thanks!
But one service can not interact with another service since they are not on the same machine and tcp://localhost:61001 will obviously not work.
Actually, they can. You are right that tcp://localhost:61001 will not work, because using localhost within a container would be referring to the container itself, similar to how localhost works on any system by default. This means that your services cannot share the same host. If you want them to, you can use one container for both services, although this really isn't the best design since it defeats one of the main purposes of Docker Compose.
The ideal way to do it is with docker-compose links, the guide you referenced shows how to define them, but to actually use them you need to use the linked container's name in URLs as if the linked container's name had an IP mapping defined in the original container's /etc/hosts (not that it actually does, but just so you get the idea). If you want to change it to be something different from the name of the linked container, you can use a link alias, which are explained in the same guide you referenced.
For example, with a docker-compose.yml file like this:
a:
expose:
- "9999"
b:
links:
- a
With a listening on 0.0.0.0:9999, b can interact with a by making requests from within b to tcp://a:9999. It would also be possible to shell into b and run
ping a
which would send ping requests to the a container from the b container.
So in conclusion, try replacing localhost in the request URL with the literal name of the linked container (or the link alias, if the link is defined with an alias). That means that
tcp://<container_name>:61001
should work instead of
tcp://localhost:61001
Just make sure you define the link in docker-compose.yml.
Hope this helps
On production, never use docker or docker compose alone. Use an orchestrator (rancher, docker swarm, k8s, ...) and deploy your stack there. Orchestrator will take care of the networking issue. Your container can link each other, so you can access them directly by a name (don't care too much about the ip).
On local host, use docker compose to startup your containers and use link. do not use a local port but the name of the link. (if your container A need to access container B on port 1234, then do a link B linked to A with name BBBB and use tcp://BBBB:1234 to access the container from A )
If you really want to bind port to your localhost and use this, access port by your host IP, not localhost.
If changing the hard-coded addresses is not an option for now, perhaps you could modify the startup scripts of your containers to forward forward ports in each local container to the required services in other machines.
This would create some complications though, because you would have to setup ssh in each of your containers, and manage the corresponding keys.
Come to think of it, if encryption is not an issue, ssh is not necessary. Using socat or redir would probably be enough.
socat TCP4-LISTEN:61001,fork TCP4:othercontainer:61001
Elasticsearch is designed to run in cluster mode, all I have to do is to define the relevant node IPs in the cluster via environment variable and as long as network connectivity is available it will connect and join the other nodes to the cluster.
I have 3 nodes, 1 is acting as the docker swarm manager and the other two are workers. I have initialized the manager and joined the worker nodes and everything looks ok from that standpoint.
Now I'm trying to run the elasticsearch container in a way that will allow me to join all nodes to the same elasticsearch cluster, however, I want the nodes to join using their overlay network interface and that means that I need to know the container internal IP addresses at the time of running the docker service create command, how can I do this? Do I have to use something like consul to achieve this?
Some clarifications:
I need to know, at the time of service creation the IP addresses (or DNS names) for all Elasticsearch participants so I could start the cluster correctly. This has to be at the time of creation and not afterwards. Also, as I understand, I can expose ports 9200/9300 for all services and work with the external machine IPs and get it to work, but I would like to use the overlay network to do all these communications (I thought this is what swarm mode is for).
Only a partial solution here.
So, when attaching your services to a custom overlay network, you indeed have access to Docker's custom Service discovery feature. I'll detail the networking feature of Docker Swarm mode, before trying to tie it to your problem.
I'll be using the different term of services and tasks, in which a service could be elasticsearch, whereas a task is a single instance of that elasticsearch service.
Docker networking
The idea is that for each services you create, docker assigns a Virtual IP (VIP), and a custom dns alias. You can retrieve this VIP using the docker service inspect myservice command.
But, there is two modes to attach a service to an overlay network dnsrr and VIP. You can select these options using the --endpoint-mode options of docker service create.
The VIP mode (I believe it is the default one, or at least the most used), affects the virtual ip to the service's dns alias. This means that doing an nslookup servicename would return to you a single vip, that behind the scenes, would be linked to one of your container in a round robin fashion. But, there is also a special dns alias that lets you access all of your instances ips (all of your tasks ips) : tasks.myservice.
So in VIP mode you can retrieve all of your tasks ips using a simple nslookup tasks.myservice, where myservice is a service name.
The other mode is dnsrr. This mode simply gets rid of the VIP, and connects the dns alias to the different tasks (=service instances), in a round robin way. This way, you simply have to do a nslookup myservice to retrieve the different service instances ip.
Elasticsearch clustering
Ok so first of all I'm not really familiar with the way elasticsearch lets you cluster. From what I understood from your question, you need when running the elasticsearch binary, give it as a parameter, the adress of all of the other nodes it needs to cluster with.
So what I would do, is to create a custom Elasticsearch image, probably based on the one from the default library, to which I would add a custom Entrypoint that would firstly run a script to retrieve the other tasks ip.
I'd believe that staying in VIP mode is suitable for you, since there is the tasks.myservice dns alias. You'll then need to parse the output to retrieve the tasks ip (and probably remove yours). Then you'll be able to save them in a config file environment variable, or use them as a runtime option for your elasticsearch binary.
Edit: To create a custom overlay network, you will need to use the docker network create command, and use the --network option of docker service create
This is answer is mainly based on the Swarm mode networking documentation
I'm learning Docker-Swarm with Consul and found some issues I don't really understand. Basically, I created a Docker-Swarm cluster (node-01 and node-02) with Consul Sevice Discovery. I then run a multi-container application (Express app with Mongo) and I can see it is running on node-02. In order to run it, I have to go in and find the IP address of my node-02 and then open the browser.
It works fine, it's just that I was expecting that I could just go to some virtual IP (or DNS) and that it the Consul service (or Swarm) would then translate it to the correct IP address of node-02 in this example.
Next item is that when I log into Consul web UI, I was expecting to see the nodes under the 'nodes' menu, but that seems not to be the case. I was also expecting to get an overview of the 'applications' or 'services' I was running on the node-01 and node-02, but that is also not the case.
My questions are:
Can someone explain why I would need to manually find out on which node in the cluster my app is running. Cannot imagine this is done in larger deployments.
Can someone address why I don't see the 'nodes' and 'services' in the Consul UI?
Note: I tried to be as short as possible though I have been documenting the full setup in a blog post (with screenshots) for those who want to see more details. Go to blog post
Question 1
I would like to access the service without having to use the Swarm agent's IP address
Solution
It is feasible, you just need to start up a reverse proxy such as nginx in a container (here are the official nginx images). At the start up of this container use the option --link with the name of the application. Thus the IP address of this container will added in the file /etc/hosts of the reverse proxy container (remember to use --name and --hostname). Run this reverse proxy container on a specific node.
So the solution to get rid of the IP address issue is to deploy another container on a specific node (and then specific IP address)? Yes! But using --link will make this issue scalable ;)
Question 2
I was expecting to see the nodes under the 'nodes' menu, but that seems not to be the case.
What you do mean? What did you expect? Do you need to query the k,v-storage DB?
Check this: https://github.com/vmudigal/microservices-sample
Microservices Sample Architecture
I have a small service that is split into 3 docker containers. One backend, one frontend and a small logging part. I now want to start them using coreOS and fleet.
I want to try and start 3 redundant backend containers, so the frontend can switch between them, if one of them fails.
How do i link them? If i only use one, it's easy, i just give it a name, e.g. 'back' and link it like this
docker run --name front --link back:back --link graphite:graphite -p 8080:8080 blurio/hystrixfront
Is it possible to link multiple ones?
The method you us will be somewhat dependent on the type of backend service you are running. If the backend service is http then there are a few good proxy / load balancers to choose from.
nginx
haproxy
The general idea behind these is that your frontend service need only be introduced to a single entry point which nginx or haproxy presents. The tricky part with this, or any cloud service, is that you need to be able to introduce backend services, or remove them, and have them available to the proxy service. There are some good writeups for nginx and haproxy to do this. Here is one:
haproxy tutorial
The real problem here is that it isn't automatic. There may be some techniques that automatically introduce/remove backends for these proxy servers.
Kubernetes (which can be run on top of coreos) has a concept called 'Services'. Using this deployment method you can create a 'service' and another thing called 'replication controller' which provides the 'backend' docker process for the service you describe. Then the replication controller can be instructed to increase/decrease the number of backend processes. You frontend accesses the 'service'. I have been using this recently and it works quite well.
I realize this isn't really a cut and paste answer. I think the question you ask is really the heart of cloud deployment.
As Michael stated, you can get this done automatically by adding a discovery service and bind it to the backend container. The discovery service will add the IP address (usually you'll want to bind this to be the IP address of your private network to avoid unnecessary bandwidth usage) and port in the etcd key-value store and can be read from the load balancer container to automatically update the load balancer to add the available nodes.
There is a good tutorial by Digital Ocean on this:
https://www.digitalocean.com/community/tutorials/how-to-use-confd-and-etcd-to-dynamically-reconfigure-services-in-coreos