Docker Swarm - Is it possible to restrict network access workers? - docker

we built a Docker Swarm cluster, over several cloud providers.
Everything works but we have new constraints and need to restrict network communications between the cloud providers.
Is it possible to build a Docker Swarm cluster with "local load balancing"? What I mean by this question is, is it possible to use:
- one cloud provider for Swarm managers, with network access to Swarm workers;
- two cloud providers for Swarm workers, with network access to the Swarm managers, but no network access between these cloud providers?
In that case, would the load balancing still work if someone runs a web request towards one of the workers?
Please find below a drawing of the targeted architecture.

Related

Containerized applications with docker swarm on GCP

I have a project to containerize several applications (Gitlab, Jenkins, Wordpress, Python Flask app...). Currently each application runs on a Compute Engine VM each at GCP. My goal would be to move everything to a cluster (Swarm or Kubernetes).
However I have different questions about Docker Swarm on Google Cloud Platform:
How can I expose my Python application on the outside (HTTP load balancer) as well as the other applications only available in my private VPC ?
From what I've seen on the internet, I have the impression that docker swarm is very little used. Should I go for a kubernetes cluster instead ? (I have good knowledge of Docker/Kubernetes)
It is difficult to find information about Docker Swarm in cloud providers. What would be an architecture with Docker Swarm on GCP?
Thanks for your help.
I'd create a template and from that an instance group for all VM, which shall host the Docker swarm. And a separate instance or instance group for said internal purposes - so that there is a strict separation, which can then be used to route the internal & external traffic accordingly (this would apply in any case). Google Kubernetes Engine is about the same as such an instance group, but Google managed infrastructure. See the tutorial, there's not much difference - except that it better integrates with gcloud & kubectl. While there is no requirement to want or need to maintain the underlying infrastructure, GKE is probably less effort.
What you are basically asking is:
Kubernetes vs. Docker Swarm: What’s the Difference?
Docker Swarm vs Kubernetes: A Helpful Guide for Picking One
Kubernetes vs. Docker: What Does it Really Mean?
Docker Swarm vs. Kubernetes: A Comparison
Kubernetes vs Docker Swarm

How to expose the entire docker swarm cluster to the external world via a public IP?

Am trying to implement a cluster of containerised applications in the production using docker in the swarm mode.
Let me describe a very minimalist scenario.
All i have is just 5 aws-ec2 instances.
None of these nodes have a public IP assigned and all have private IPs assigned part of a subnet.
For example,
Manager Nodes
172.16.50.1
172.16.50.2
Worker Nodes
172.16.50.3
172.16.50.4
172.16.50.5
With the above infrastructure, have created a docker swarm with the first node's IP (172.16.50.1) as the --advertise-addr so that the other 4 nodes join the swarm as manager or worker with their respective tokens.
I didn't want to overload the Manager Nodes by making them doing the role of worker nodes too. (Is this a good idea or resource under-utilization?).
Being the nodes are 4 core each, am hosting 9 replicas of my web application which are distributed in the 3 worker nodes each running 3 containers hosting my web app.
Now with this setup in hand, how should i go about exposing the entire docker swarm cluster with a VIP (virtual IP) to the external world for consumption?
please validate my below thoughts:
1. Should I have a classic load-balancer setup like keeping a httpd or nginx or haproxy based reverse proxy which has a public IP assigned
and make it balance the load to the above 5 nodes where our
docker-swarm is deployed?
One downside I see here is that the above reverse-proxy would be Single Point of Failure? Any ideas how this could be made fault-tolerant/hightly available? should I try a AnyCast solution?
2. Going for a AWS ALB/ELB which would route the traffic to the above 5 nodes where our swarm is.
3. If keeping a separate Load Balancer is the way to go, then what does really docker-swarm load-balancing and service discovery is all
about?
what is docker swarm's answer to expose 1 virtual IP or host name to the external clients to access services in the swarm cluster?
Docker-swarm touts a lot about overlay networks but not sure how it
relates to my issue of exposing the cluster via VIP to clients in the
internet. Should we always keep the load balancer aware of the IP
addresses of the nodes that join the docker swarm later?
please shed some light!
On further reading, I understand that the Overlay Network we are creating in the swarm manager node only serves inter container communication.
The only difference from the other networking modes like bridge, host, macvlan is that the others enables communication among containers with in a single host and while the Overlay network facilitates communication among containers deployed in different subnets too. i.e., multi-host container communication.
with this knowledge as the headsup, to expose the swarm to the world via a single public IP assigned to a loadbalancer which would distribute requests to all the swarm nodes. This is just my understanding at a high level.
This is where i need your inputs and thoughts please...explaining the industry standard on how this is handled?

How is load balancing done in Docker-Swarm mode

I'm working on a project to set up a cloud architecture using docker-swarm. I know that with swarm I could deploy replicas of a service which means multiple containers of that image will be running to serve requests.
I also read that docker has an internal load balancer that manages this request distribution.
However, I need help in understanding the following:
Say I have a container that exposes a service as a REST API or say its a web app. And If I have multiple containers (replicas) deployed in the swarm and I have other containers (running some apps) that talk to this HTTP/REST service.
Then, when I write those apps which IP:PORT combination do I use? Is it any of the worker node IP's running these services? Will doing so take care of distributing the load appropriately even amongst other workers/manager running the same service?
Or should I call the manager which in turn takes care of routing appropriately (even if the manager node does not have a container running this specific service)?
Thanks.
when I write those apps which IP:PORT combination do I use? Is it any
of the worker node IP's running these services?
You can use any node that is participating in the swarm, even if there is no replica of the service in question exists on that node.
So you will use Node:HostPort combination. The ingress routing mesh will route the request to an active container.
One Picture Worth Ten Thousand Words
Will doing so take care of distributing the load appropriately even
amongst other workers/manager running the same service?
The ingress controller will do round robin by default.
Now The clients should use dns round robin to access the service on the docker swarm nodes. The classic DNS cache problem will occur. To avoid that we can use external load balancer like HAproxy.
An important additional information to the existing answer
The advantage of using a proxy (HAProxy) in-front of docker swarm is, swarm nodes can reside on a private network that is accessible to the proxy server, but that is not publicly accessible. This will make your cluster secure.
If you are using AWS VPC, you can create a private subnet and place your swarm nodes inside the private subnet and place the proxy server in public subnet which can forward the traffic to the swarm nodes.
When you access the HAProxy load balancer, it forwards requests to nodes in the swarm. The swarm routing mesh routes the request to an active task. If, for any reason the swarm scheduler dispatches tasks to different nodes, you don’t need to reconfigure the load balancer.
For more details please read https://docs.docker.com/engine/swarm/ingress/

Cluster of forward proxies

I'm trying to figure out whether Docker Swarm or Kubernetes are a good choice for my use case.
Basically, I want to build a small cluster of forward proxies (via squid, nginx or a custom nodejs script), and be able to deploy/start/stop/purge them all together.
I should be able to access the proxy cluster via a single IP address, manager should be able to load-balance the request to a node, and each proxy node must use a unique outgoing IP address.
I'm wondering:
Are Docker Swarm and/or Kubernetes the right way to go about it?
If so, should I set-up Docker Swarm and/or Kubernetes and its worker nodes (running the proxy) on a single dedicated server or separate virtual servers?
Is it also possible for all the cluster nodes to share a file system storage for caching, common config etc.
Any other tips to get this working.
Thanks!
Docker running in swarm mode should work well for this
Run docker on a single dedicated server; I see no need for virtual servers. You could also run the swarm across multiple dedicated servers.
https://docs.docker.com/engine/swarm/secrets/ work well for some settings and configurations. If you require significant storage, simply add a database service to your cluster
Docker swarm mode fits your requirements quite well; requests are automatically balanced across your swarm and each service instance can be configured to have a unique address. You should check out the swarm mode tutorial: https://docs.docker.com/engine/swarm/swarm-tutorial/

service discovery in docker without using consul

I'm new to docker and microservices. I've started to decompose my web-app into microservices and currently, I'm doing manual configuration.
After some study, I came across docker swarm mode which allows service discovery. Also, I came across other tools for service discovery such as Eureka and Consul.
My main aim is to replace IP addresses in curl call with service name and load balance between multiple instances of same service.
i.e. for ex. curl http://192.168.0.11:8080/ to curl http://my-service
I have to keep my services language independent.
Please suggest, Do I need to use Consul with docker swarm for service discovery or i can do it without Consul? What are the advantages?
With the new "swarm mode", you can use docker services to create clustered services across multiple swarm nodes. You can then access those same services, load-balanced, by using the service name rather than the node name in your requests.
This only applies to nodes within the swarm's overlay network. If your client systems are part of the same swarm, then discovery should work out-of-the-box with no need for any external solutions.
On the other hand, if you want to be able to discover the services from systems outside the swarm, you have a few options:
For stateless services, you could use docker's routing mesh, which will make the service port available across all swarm nodes. That way you can just point at any node in the swarm, and docker will direct your request to a node that is running the service (regardless of whether the node you hit has the service or not).
Use an actual load balancer in front of your swarm services if you need to control routing or deal with different states. This could either be another docker service (i.e. haproxy, nginx) launched with the --mode global option to ensure it runs on all nodes, or a separate load-balancer like a citrix netscaler. You would need to have your service containers reconfigure the LB through their startup scripts or via provisioning tools (or add them manually).
Use something like consul for external service discovery. Possibly in conjunction with registrator to add services automatically. In this scenario you just configure your external clients to use the consul server/cluster for DNS resolution (or use the API).
You could of course just move your service consumers into the swarm as well. If you're separating the clients from the services in different physical VLANs (or VPCs etc) though, you would need to launch your client containers in separate overlay networks to ensure you don't effectively defeat any physical network segregation already in place.
Service discovery (via dns) is built into docker since version 1.12. When you create a custom network (like bridge or overlay if you have multiple hosts) you can simply have the containers talk to each other via name as long as they are part of same network. You can also have an alias for each container which would round-robin the list of containers which have the same alias. For simple example see:
https://linuxctl.com/docker-networking-options-bridge
As long as you are using the bridge mode for your docker network and creating your containers inside that network, service discovery is available to you out of the box.
You will need to get help from other tools once your infrastructure starts to span in to multiple servers and microservices distributed on them.
Swarm is a good tool to start with, however, I would like to stick to consul if it comes to any IaaS provider like Amazon for my production loads.

Resources