I am trying to link 2 docker containers using mesos/marathon framework. As I understand there is no way to use the docker link feature in mesos/martahon. So the way to go forward is to use service discovery. Since zookeeper is already used my question is how to use zookeeper for service discovery so that 1 container can talk to another one.
For service discovery on Mesos/Marathon, you can use a proxy server (see https://mesosphere.github.io/marathon/docs/service-discovery-load-balancing.html) or a DNS server that derives settings from Mesos automatically (see https://github.com/mesosphere/mesos-dns).
Although possible I would reconsider using Zookeeper as a centralized KV store for your configuration and services information. You could try to implement a daemon to ask and save data in zookeeper in order to configure your container's config files and live patching, but it's a complex solution (there are examples of this approach in this post from Pinterest, or in Hadoop's ZKFailoverController daemon). From my point of view there are more suited solutions as Consul or etcd, with implementations of the daemons as kelseyhightower/confd or consul-template.
Related
First of all:
I have some poor experience with docker swarm (I mean I touch in production env). I read a lot of about it, and I know concepts like veth, overlay, labels-pinning, vtep, bridge and so on. I know also that docker swarm use some distributed key-value storage to deal with management of cluster.
However, there is something that I don't understand: service disorvery/DNS/resolving service name.
How does it work? Where is this DNS server placed? Who cares to resolve service names?
Is it possible to read the content of distributed key-value storage?
How does it work? Where is this DNS server placed? Who cares to resolve service names?
It's embedded in the docker daemon itself. Every container which is part of a user defined network, does is name resolving requests to 127.0.0.11.
https://docs.docker.com/v17.09/engine/userguide/networking/configure-dns/
Is it possible to read the content of distributed key-value storage?
Docker is using libkv. But I'm not sure if it's possible to bypass the docker daemon and access it.
https://github.com/docker/libkv
The DNS in Docker Swarm overlay networks works just like docker-compose bridge networks. Services resolve inside the same network by their service name. Other things like Swarm VIP and Routing Mesh make the whole solution slightly different but don't directly affect DNS resolution.
The Swarm raft log isn't meant to be easily read, but it's not more than just the service definitions of services and networks you create in Swarm. I've never needed to look at it directly in a production system.
Here's a 3.5 training video on all things Swarm (including lots of network details) from former Docker engineer Jerome Petazzoni and AJ Bowen
Also, Laura Frank has some details from last years DockerCon on how the raft log and consensus works and might point to some tools if you want to look under the hood.
I am new with docker swarm and I'm having ambitious to deploy my application with docker swarm.
With the docker swarm, it has itself discovery service but I googled around and found out people are mentioning about the Consul as discovery service.
My question is. What is the advantage of Consul? Why don't we just use default discovery service?
Thanks,
Consul was used as a service discovery module in the standalone Swarm (prior to docker 1.12). However, since docker 1.12, Swarm mode was introduced with comes with default discovery service. So you don't need an external store.
Key point to notice is that if you had a swarm with an external store like consul, it would still have some data/metadata that needs to be preserved. Hence the use of Consul still exists.
Let us first look at the scope of service discovery provided by both swarm and Consul.
Swarm is to fascilitate service discovery on your docker network/infra only, while consul can be used with almost anything if you know how to use it, be it a monolythic application or a microservice, consul gives you all of that at one place.
Secondly, even though Swarm is great to handle a small infrastructure loads, it doesn't really go well with handling high production loads for a resource heavy infrastructure. This is why there are other tools in existance, for example kubernetes, ECS etc.
So considering that you have an application which you know is going to grow, I would rather go for a solution that works well with whatever I may try in future without having to change too much and works well with scaling on any IaaS provider. Hope that helps.
I'm new to docker and microservices. I've started to decompose my web-app into microservices and currently, I'm doing manual configuration.
After some study, I came across docker swarm mode which allows service discovery. Also, I came across other tools for service discovery such as Eureka and Consul.
My main aim is to replace IP addresses in curl call with service name and load balance between multiple instances of same service.
i.e. for ex. curl http://192.168.0.11:8080/ to curl http://my-service
I have to keep my services language independent.
Please suggest, Do I need to use Consul with docker swarm for service discovery or i can do it without Consul? What are the advantages?
With the new "swarm mode", you can use docker services to create clustered services across multiple swarm nodes. You can then access those same services, load-balanced, by using the service name rather than the node name in your requests.
This only applies to nodes within the swarm's overlay network. If your client systems are part of the same swarm, then discovery should work out-of-the-box with no need for any external solutions.
On the other hand, if you want to be able to discover the services from systems outside the swarm, you have a few options:
For stateless services, you could use docker's routing mesh, which will make the service port available across all swarm nodes. That way you can just point at any node in the swarm, and docker will direct your request to a node that is running the service (regardless of whether the node you hit has the service or not).
Use an actual load balancer in front of your swarm services if you need to control routing or deal with different states. This could either be another docker service (i.e. haproxy, nginx) launched with the --mode global option to ensure it runs on all nodes, or a separate load-balancer like a citrix netscaler. You would need to have your service containers reconfigure the LB through their startup scripts or via provisioning tools (or add them manually).
Use something like consul for external service discovery. Possibly in conjunction with registrator to add services automatically. In this scenario you just configure your external clients to use the consul server/cluster for DNS resolution (or use the API).
You could of course just move your service consumers into the swarm as well. If you're separating the clients from the services in different physical VLANs (or VPCs etc) though, you would need to launch your client containers in separate overlay networks to ensure you don't effectively defeat any physical network segregation already in place.
Service discovery (via dns) is built into docker since version 1.12. When you create a custom network (like bridge or overlay if you have multiple hosts) you can simply have the containers talk to each other via name as long as they are part of same network. You can also have an alias for each container which would round-robin the list of containers which have the same alias. For simple example see:
https://linuxctl.com/docker-networking-options-bridge
As long as you are using the bridge mode for your docker network and creating your containers inside that network, service discovery is available to you out of the box.
You will need to get help from other tools once your infrastructure starts to span in to multiple servers and microservices distributed on them.
Swarm is a good tool to start with, however, I would like to stick to consul if it comes to any IaaS provider like Amazon for my production loads.
In a microservice stack that uses docker for container orchestration, consul for service discovery and mesos for container scheduling, there are two services that user facing (with GUI) requiring to be configured with HAProxy for load balancing.
The question is, at which level should they be load-balanced. There are some implementations of LB that support each use case. dockercloud-haproxy, fabio with consul and marathon-lb if DC/OS is in place.
What would be a selection criteria ?
If your need is not that pressing you might want to wait until docker 1.12 is stabilised and use the included LB feature, which is pretty slick:
https://blog.docker.com/2016/06/docker-1-12-built-in-orchestration/
I have a small service that is split into 3 docker containers. One backend, one frontend and a small logging part. I now want to start them using coreOS and fleet.
I want to try and start 3 redundant backend containers, so the frontend can switch between them, if one of them fails.
How do i link them? If i only use one, it's easy, i just give it a name, e.g. 'back' and link it like this
docker run --name front --link back:back --link graphite:graphite -p 8080:8080 blurio/hystrixfront
Is it possible to link multiple ones?
The method you us will be somewhat dependent on the type of backend service you are running. If the backend service is http then there are a few good proxy / load balancers to choose from.
nginx
haproxy
The general idea behind these is that your frontend service need only be introduced to a single entry point which nginx or haproxy presents. The tricky part with this, or any cloud service, is that you need to be able to introduce backend services, or remove them, and have them available to the proxy service. There are some good writeups for nginx and haproxy to do this. Here is one:
haproxy tutorial
The real problem here is that it isn't automatic. There may be some techniques that automatically introduce/remove backends for these proxy servers.
Kubernetes (which can be run on top of coreos) has a concept called 'Services'. Using this deployment method you can create a 'service' and another thing called 'replication controller' which provides the 'backend' docker process for the service you describe. Then the replication controller can be instructed to increase/decrease the number of backend processes. You frontend accesses the 'service'. I have been using this recently and it works quite well.
I realize this isn't really a cut and paste answer. I think the question you ask is really the heart of cloud deployment.
As Michael stated, you can get this done automatically by adding a discovery service and bind it to the backend container. The discovery service will add the IP address (usually you'll want to bind this to be the IP address of your private network to avoid unnecessary bandwidth usage) and port in the etcd key-value store and can be read from the load balancer container to automatically update the load balancer to add the available nodes.
There is a good tutorial by Digital Ocean on this:
https://www.digitalocean.com/community/tutorials/how-to-use-confd-and-etcd-to-dynamically-reconfigure-services-in-coreos