service discovery in docker without using consul - docker

I'm new to docker and microservices. I've started to decompose my web-app into microservices and currently, I'm doing manual configuration.
After some study, I came across docker swarm mode which allows service discovery. Also, I came across other tools for service discovery such as Eureka and Consul.
My main aim is to replace IP addresses in curl call with service name and load balance between multiple instances of same service.
i.e. for ex. curl http://192.168.0.11:8080/ to curl http://my-service
I have to keep my services language independent.
Please suggest, Do I need to use Consul with docker swarm for service discovery or i can do it without Consul? What are the advantages?

With the new "swarm mode", you can use docker services to create clustered services across multiple swarm nodes. You can then access those same services, load-balanced, by using the service name rather than the node name in your requests.
This only applies to nodes within the swarm's overlay network. If your client systems are part of the same swarm, then discovery should work out-of-the-box with no need for any external solutions.
On the other hand, if you want to be able to discover the services from systems outside the swarm, you have a few options:
For stateless services, you could use docker's routing mesh, which will make the service port available across all swarm nodes. That way you can just point at any node in the swarm, and docker will direct your request to a node that is running the service (regardless of whether the node you hit has the service or not).
Use an actual load balancer in front of your swarm services if you need to control routing or deal with different states. This could either be another docker service (i.e. haproxy, nginx) launched with the --mode global option to ensure it runs on all nodes, or a separate load-balancer like a citrix netscaler. You would need to have your service containers reconfigure the LB through their startup scripts or via provisioning tools (or add them manually).
Use something like consul for external service discovery. Possibly in conjunction with registrator to add services automatically. In this scenario you just configure your external clients to use the consul server/cluster for DNS resolution (or use the API).
You could of course just move your service consumers into the swarm as well. If you're separating the clients from the services in different physical VLANs (or VPCs etc) though, you would need to launch your client containers in separate overlay networks to ensure you don't effectively defeat any physical network segregation already in place.

Service discovery (via dns) is built into docker since version 1.12. When you create a custom network (like bridge or overlay if you have multiple hosts) you can simply have the containers talk to each other via name as long as they are part of same network. You can also have an alias for each container which would round-robin the list of containers which have the same alias. For simple example see:
https://linuxctl.com/docker-networking-options-bridge

As long as you are using the bridge mode for your docker network and creating your containers inside that network, service discovery is available to you out of the box.
You will need to get help from other tools once your infrastructure starts to span in to multiple servers and microservices distributed on them.
Swarm is a good tool to start with, however, I would like to stick to consul if it comes to any IaaS provider like Amazon for my production loads.

Related

Using custom service discovery with docker swarm

I have a docker swarm mode orchestration on my servers, and as my business requirements I have a custom service discovery (it's run by swarm too).
Every services after running call register method on service discovery and introduce his contact information.
So service discovery could reversing traffics and balancing load between instance by introduced ip and port
My problem there is, when an instance(ruined in container) call discovery register method, his remote-addr is not real (mean it's not equal to hostname -i) and service discovery can not find it in network
Is any idea?
One option would be for the service discovery to also participate in the swarm. Then it should be able to find the instances which are in containers that are in the swarm.
Another would be for the containers to run with --net=host. Though this may defeat the reason for having them in a swarm in the first place.

How is load balancing done in Docker-Swarm mode

I'm working on a project to set up a cloud architecture using docker-swarm. I know that with swarm I could deploy replicas of a service which means multiple containers of that image will be running to serve requests.
I also read that docker has an internal load balancer that manages this request distribution.
However, I need help in understanding the following:
Say I have a container that exposes a service as a REST API or say its a web app. And If I have multiple containers (replicas) deployed in the swarm and I have other containers (running some apps) that talk to this HTTP/REST service.
Then, when I write those apps which IP:PORT combination do I use? Is it any of the worker node IP's running these services? Will doing so take care of distributing the load appropriately even amongst other workers/manager running the same service?
Or should I call the manager which in turn takes care of routing appropriately (even if the manager node does not have a container running this specific service)?
Thanks.
when I write those apps which IP:PORT combination do I use? Is it any
of the worker node IP's running these services?
You can use any node that is participating in the swarm, even if there is no replica of the service in question exists on that node.
So you will use Node:HostPort combination. The ingress routing mesh will route the request to an active container.
One Picture Worth Ten Thousand Words
Will doing so take care of distributing the load appropriately even
amongst other workers/manager running the same service?
The ingress controller will do round robin by default.
Now The clients should use dns round robin to access the service on the docker swarm nodes. The classic DNS cache problem will occur. To avoid that we can use external load balancer like HAproxy.
An important additional information to the existing answer
The advantage of using a proxy (HAProxy) in-front of docker swarm is, swarm nodes can reside on a private network that is accessible to the proxy server, but that is not publicly accessible. This will make your cluster secure.
If you are using AWS VPC, you can create a private subnet and place your swarm nodes inside the private subnet and place the proxy server in public subnet which can forward the traffic to the swarm nodes.
When you access the HAProxy load balancer, it forwards requests to nodes in the swarm. The swarm routing mesh routes the request to an active task. If, for any reason the swarm scheduler dispatches tasks to different nodes, you don’t need to reconfigure the load balancer.
For more details please read https://docs.docker.com/engine/swarm/ingress/

Cluster of forward proxies

I'm trying to figure out whether Docker Swarm or Kubernetes are a good choice for my use case.
Basically, I want to build a small cluster of forward proxies (via squid, nginx or a custom nodejs script), and be able to deploy/start/stop/purge them all together.
I should be able to access the proxy cluster via a single IP address, manager should be able to load-balance the request to a node, and each proxy node must use a unique outgoing IP address.
I'm wondering:
Are Docker Swarm and/or Kubernetes the right way to go about it?
If so, should I set-up Docker Swarm and/or Kubernetes and its worker nodes (running the proxy) on a single dedicated server or separate virtual servers?
Is it also possible for all the cluster nodes to share a file system storage for caching, common config etc.
Any other tips to get this working.
Thanks!
Docker running in swarm mode should work well for this
Run docker on a single dedicated server; I see no need for virtual servers. You could also run the swarm across multiple dedicated servers.
https://docs.docker.com/engine/swarm/secrets/ work well for some settings and configurations. If you require significant storage, simply add a database service to your cluster
Docker swarm mode fits your requirements quite well; requests are automatically balanced across your swarm and each service instance can be configured to have a unique address. You should check out the swarm mode tutorial: https://docs.docker.com/engine/swarm/swarm-tutorial/

What is the different between putting a separate service discovery and integrate it into the cluster machine in Docker Swarm

I am having problem understanding the need of a separated service discovery server while we could register the slave node to the master node at the slave node start-up through whatever protocol. Hosting another service seem redundant to me.
Docker Swarm is there to create a cluster of hosts running Docker and schedule containers across the cluster.
It does not include service discovery, which is provided by a backend service, such as etcd, consul or zookeeper.
The first problem: service registration and discovery is an infrastructure concern, not an application concern.
The second problem: implementing service registration and discovery when infrastructure and application implementation are mutually agnostic is tough.
The DockerCon makes that distinction clear this morning (Nov. 16th, 2015), with the "Docker Stack":
(Graphics from #laurelcomics)
Docker networking solves these problems by backing an interface (DNS) with pluggable infrastructure components that adhere to a common KV interface.
You can see consul.io used in:
"Easy routing and service discovery with Docker, Consul and nginx"
"Docker Overlay Networks: That was Easy"
"Docker DNS getaddrinfo ENOTFOUND"
That means:
Consul is a KV (Key/Value) store which can be plugged into Swarm in order to manage the service discovery aspect.
Swarm is the access layer, which is usually the layer that contains a gateway or routing component that allows others to actually reach your services.
(Image from the "Easy routing and service discovery with Docker, Consul and nginx" article written by Ladislav Gazo)
The goal is to isolate what is an infrastructure concern (Discovery service) in its own container, separate from a dev tool concern (Swarm).

How to connect two containers which run in two different hosts?

I run a web container in server A, and a DB container in server B.
How can I connect these two containers?
I only know how to connect two containers which run in same host.
You need a:
service discovery which will record your containers hosts (like Consul, a KV -- Key/Value store)
A good way to think of Consul is broken into 3 layers.
The middle layer is the actual config store, which is not that different from etcd or Zookeeper.
The layers above and below are pretty unique to Consul.
The killer feature of Consul is its service catalog. Instead of using the key-value store to arbitrarily model your service directory as you would with etcd or Zookeeper, Consul exposes a specific API for managing services.
a registrator (you can use swarm even though it is a bit overkill, or registrator)
Registrator is a single, host-level service you run as a Docker container.
It watches for new containers, inspects them for service information, and registers them with a service registry. It also deregisters them when the container dies.
It has a pluggable registry system, meaning it can work with a number of service discovery systems.

Resources