What are the pros of having Docker containers running and communicating with each other through a Docker virtual network instead of simply having them communicate with each other through the host machine? Say I have container A exposed through port 8000 on the host machine (-p 8000:8000) and container B exposed through port 9000 on the host machine (-p 9000:9000). Container A can communicate with container B through host.docker.internal:9000 but, if they were deployed in the same Docker network, A would be able to communicate with B simply through <name of container B>:9000. The latter is obviously neater in my opinion, but other than that what are its benefits?
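For concreteness, here is a minimal sketch of the two setups I am comparing (image and container names are made up; each option stands on its own):

```
# Option 1: communicate through the host. Both containers publish a port,
# and A reaches B via host.docker.internal.
docker run -d --name svc-a -p 8000:8000 image-a
docker run -d --name svc-b -p 9000:9000 image-b
# from inside svc-a:  curl http://host.docker.internal:9000

# Option 2: a user-defined network. Only publish what the outside needs;
# containers resolve each other by container name.
docker network create app-net
docker run -d --name svc-a --network app-net -p 8000:8000 image-a
docker run -d --name svc-b --network app-net image-b    # no -p needed for B
# from inside svc-a:  curl http://svc-b:9000
```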
Security.
By creating a private network that is only accessible to internal Docker services, you remove an entry point for attacks. A common architecture is:
--pub--> PROXY --priv--> MAIN SERVICE --priv--> DATABASE
Only the proxy needs to be exposed to the public (host) network interface. All 3 services can be part of a private network where internal traffic occurs.
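A rough sketch of that layout with the plain Docker CLI (the network name and images are only examples):

```
# Private network: not reachable from outside the host.
docker network create backend

docker run -d --name database     --network backend postgres:16
docker run -d --name main-service --network backend my-app-image
docker run -d --name proxy        --network backend -p 443:443 nginx
# Only "proxy" publishes a port on the host. It reaches the app as
# http://main-service:<port>, and the app reaches the database as
# database:5432, all over the private "backend" network.
```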
Simplification.
The private network traffic is considered "trusted", so there is no need for an SSL certificate (HTTPS) or for every service to implement SSL/TLS verification.
Internal traffic is also typically (or should always be) much faster than public-facing networking, which means you can skip some of the optimisations used on the web (gzip or other compression schemes, caching).
Multiple VMs.
When services span multiple VMs, they are typically not tied to a specific VM. This allows components (containers, tasks, etc.) to be moved around to different or new VMs by orchestrators (Kubernetes, Mesos, ...). Communication between services happens over a private (overlay) network spanning all the VMs. Your service then only needs to refer to other services by name and let the orchestrator route the traffic correctly.
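In Docker's case this is what an overlay network in swarm mode provides; roughly (service, image, and network names are placeholders):

```
# On a manager node: create an overlay network spanning all swarm nodes.
docker network create --driver overlay app-net

# Services attached to it can land on any VM and still reach each other by name.
docker service create --name api   --network app-net --replicas 3 my-api-image
docker service create --name cache --network app-net redis:7
# Inside any "api" task, "cache" resolves to the cache service's virtual IP,
# regardless of which VM its containers are actually running on.
```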
Related
I am trying to build a container containing 3 applications, for example:
Grafana;
Node-RED;
NGINX.
So I will just need to expose one port, for example:
NGINX reverse proxy: 3001/grafana redirects to Grafana on port 3000;
NGINX reverse proxy: 3001/nodered redirects to Node-RED on port 1880.
Does this make sense in your view? Or is this architecture not feasible compared to using Docker Compose?
If I understand correctly, your concern is about opening only one port publicly.
For this, you would be better off building 3 separate containers, each with their own service, and all in the same docker network. You could plug your services like you described within the virtual network instead of within the same container.
Why? Because containers are specifically designed to hold the environment for a single application, in order to provide isolation and reduce compatibility issues, with all the network configuration done at a higher level, outside of the containers.
Having all your services inside the same container thwarts those advantages of containerized applications. It's almost as if you're not using containers at all.
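A minimal sketch of that three-container layout (the network name and nginx.conf path are placeholders; only NGINX publishes a port):

```
docker network create web

docker run -d --name grafana --network web grafana/grafana      # listens on 3000
docker run -d --name nodered --network web nodered/node-red     # listens on 1880
docker run -d --name proxy   --network web -p 3001:80 \
  -v "$PWD/nginx.conf:/etc/nginx/nginx.conf:ro" nginx

# In nginx.conf, the locations proxy by container name, e.g.
#   location /grafana/ { proxy_pass http://grafana:3000/; }
#   location /nodered/ { proxy_pass http://nodered:1880/; }
```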
I am trying to implement a cluster of containerised applications in production using Docker in swarm mode.
Let me describe a very minimalist scenario.
All I have is 5 AWS EC2 instances.
None of these nodes has a public IP assigned; all have private IPs that are part of a subnet.
For example,
Manager Nodes
172.16.50.1
172.16.50.2
Worker Nodes
172.16.50.3
172.16.50.4
172.16.50.5
With the above infrastructure, I have created a Docker swarm using the first node's IP (172.16.50.1) as the --advertise-addr, so that the other 4 nodes join the swarm as managers or workers with their respective tokens.
I didn't want to overload the manager nodes by making them take on the role of worker nodes too. (Is this a good idea, or is it resource under-utilisation?)
Since the nodes have 4 cores each, I am hosting 9 replicas of my web application, distributed across the 3 worker nodes with each node running 3 containers of my web app.
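For reference, this is roughly the sequence of commands behind that setup (the drain step is just the approach I am considering for keeping managers free of application work; the image name is a placeholder):

```
# On 172.16.50.1
docker swarm init --advertise-addr 172.16.50.1
# 172.16.50.2 joins with the manager token; .3, .4 and .5 join with the worker token

# Keep the managers from running application tasks (once per manager node):
docker node update --availability drain <manager-node-name>

# 9 replicas, spread over the 3 active worker nodes:
docker service create --name webapp --replicas 9 my-webapp-image
```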
Now with this setup in hand, how should I go about exposing the entire Docker swarm cluster to the external world via a VIP (virtual IP) for consumption?
Please validate my thoughts below:
1. Should I have a classic load-balancer setup, i.e. an httpd-, nginx-, or haproxy-based reverse proxy with a public IP assigned, and have it balance the load across the 5 nodes where our Docker swarm is deployed?
One downside I see here is that this reverse proxy would be a single point of failure. Any ideas how it could be made fault-tolerant/highly available? Should I try an anycast solution?
2. Going for an AWS ALB/ELB that would route the traffic to the 5 nodes where our swarm is running.
3. If keeping a separate load balancer is the way to go, then what are Docker swarm's load balancing and service discovery really all about?
What is Docker swarm's answer for exposing one virtual IP or hostname to external clients so they can access services in the swarm cluster?
Docker swarm touts overlay networks a lot, but I am not sure how that relates to my issue of exposing the cluster via a VIP to clients on the internet. Should we always keep the load balancer aware of the IP addresses of nodes that join the swarm later?
Please shed some light!
On further reading, I understand that the overlay network we create on the swarm manager node only serves inter-container communication.
The only difference from the other networking modes like bridge, host, and macvlan is that those enable communication among containers within a single host, while the overlay network also facilitates communication among containers deployed on different hosts/subnets, i.e. multi-host container communication.
With this knowledge in hand, my plan is to expose the swarm to the world via a single public IP assigned to a load balancer, which would distribute requests to all the swarm nodes. This is just my understanding at a high level.
This is where I need your input and thoughts, please: what is the industry standard for how this is handled?
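To make the part I think I understand concrete: with the ingress routing mesh, publishing a port on a service makes that port answer on every node, so an external load balancer only needs to target the node IPs. A sketch, continuing from the commands above:

```
# Publish the existing service through the ingress routing mesh: port 80 then
# answers on *every* swarm node, even the ones not running a webapp task.
docker service update --publish-add 80:8080 webapp

# An external load balancer (nginx/haproxy/ELB) can then simply forward to
# 172.16.50.1-5 on port 80, and the mesh routes each request to a healthy task.
```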
I want to host web services (say, a simple Node.js API service).
There is a limitation on the number of services that I can host on a single host, since the number of ports available on a host is only 65536.
I can think of having a virtual sub-network that is visible only within the host and then have a proxy server that sits on the host and routes the APIs to the appropriate web-service.
Is it possible to do this with Docker, where each service is deployed in a container and a proxy server routes the APIs to the appropriate container?
Is there any off-the-shelf solution for this (preferably free of cost)?
First of all, I doubt you can run 65536 processes per host, unless it's huge. Anyway, I would not recommend that because of availability and performance. Too many processes will be competing for the same resources, leading to a lot of context switches. That said, it's doable.
If your services are HTTP you can use a reverse proxy, like nginx or traefik. If not, you can use HAProxy for TCP services. Traefik is a better option because it performs service discovery, so you won't need to configure the endpoints manually.
In this setup the networking should be bridge, which is the default in Docker. Every container will have its own IP address, so you won't have any problem regarding port exhaustion.
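As a rough sketch of the Traefik variant (this assumes Traefik v2's Docker provider and its label conventions; names, labels and the image tag should be checked against the Traefik docs for your version):

```
docker network create web

# Traefik watches the Docker socket and discovers containers from their labels.
docker run -d --name traefik --network web -p 80:80 \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  traefik:v2.10 --providers.docker=true --entrypoints.web.address=:80

# Each API service joins the same network with no published port;
# routing is declared with labels instead of host ports.
docker run -d --name users-api --network web \
  -l 'traefik.http.routers.users.rule=PathPrefix(`/users`)' \
  -l 'traefik.http.services.users.loadbalancer.server.port=3000' \
  my-users-api-image
```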
I'm new to docker and microservices. I've started to decompose my web-app into microservices and currently, I'm doing manual configuration.
After some study, I came across docker swarm mode which allows service discovery. Also, I came across other tools for service discovery such as Eureka and Consul.
My main aim is to replace IP addresses in curl calls with service names and to load balance between multiple instances of the same service.
e.g. change curl http://192.168.0.11:8080/ to curl http://my-service
I have to keep my services language independent.
Please suggest: do I need to use Consul with Docker swarm for service discovery, or can I do it without Consul? What are the advantages?
With the new "swarm mode", you can use docker services to create clustered services across multiple swarm nodes. You can then access those same services, load-balanced, by using the service name rather than the node name in your requests.
This only applies to nodes within the swarm's overlay network. If your client systems are part of the same swarm, then discovery should work out-of-the-box with no need for any external solutions.
On the other hand, if you want to be able to discover the services from systems outside the swarm, you have a few options:
For stateless services, you could use docker's routing mesh, which will make the service port available across all swarm nodes. That way you can just point at any node in the swarm, and docker will direct your request to a node that is running the service (regardless of whether the node you hit has the service or not).
Use an actual load balancer in front of your swarm services if you need to control routing or deal with different states. This could either be another docker service (e.g. haproxy, nginx) launched with the --mode global option to ensure it runs on all nodes, or a separate load balancer like a Citrix NetScaler. You would need to have your service containers reconfigure the LB through their startup scripts or via provisioning tools (or add them manually).
Use something like consul for external service discovery. Possibly in conjunction with registrator to add services automatically. In this scenario you just configure your external clients to use the consul server/cluster for DNS resolution (or use the API).
You could of course just move your service consumers into the swarm as well. If you're separating the clients from the services in different physical VLANs (or VPCs etc) though, you would need to launch your client containers in separate overlay networks to ensure you don't effectively defeat any physical network segregation already in place.
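A quick sketch of the in-swarm case, which matches the curl http://my-service goal from the question (network, service and image names are made up):

```
docker network create --driver overlay app-net

docker service create --name my-service --network app-net --replicas 3 my-service-image
docker service create --name client     --network app-net my-client-image
# Inside any "client" task:
#   curl http://my-service
# "my-service" resolves to a virtual IP, and requests are load-balanced
# across the 3 replicas.
```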
Service discovery (via DNS) is built into Docker since version 1.12. When you create a custom network (like bridge, or overlay if you have multiple hosts) you can simply have the containers talk to each other by name, as long as they are part of the same network. You can also give several containers the same network alias, which will round-robin across the containers sharing that alias. For a simple example see:
https://linuxctl.com/docker-networking-options-bridge
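A tiny example of the alias behaviour (names and image are made up):

```
docker network create app-net

# Two containers sharing the network alias "app" on the same user-defined network.
docker run -d --name app1 --network app-net --network-alias app my-image
docker run -d --name app2 --network app-net --network-alias app my-image

# From any other container on app-net, the name "app" resolves to both containers:
docker run --rm --network app-net busybox nslookup app
```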
As long as you are using a user-defined bridge network and creating your containers inside that network, service discovery is available to you out of the box.
You will need help from other tools once your infrastructure starts to span multiple servers with microservices distributed across them.
Swarm is a good tool to start with; however, I would stick with Consul when it comes to an IaaS provider like Amazon for my production loads.
With the release of Docker 1.9 came container networks. I understand how to use them, but I'm not sure why I should use them. What are the benefits of using container networks?
The main reason I use container networks is so that I can give the containers the same IPs as real virtual machines. I use Docker a lot to test SaltStack states. I put all the machines I want into a docker-compose file and use Docker images with salt-master / salt-minion installed. I apply my states and I have a working copy of the target environment.
There are two main benefits: container isolation and communication in a clustered environment.
"Container networks" are virtual networks; they work on top of existing physical networks but allow you to create private networks that are only visible to the containers on that network. This isolation is an important security feature. You typically place all of your backend container services, like the database, indexing, and app server, in their own private network and only expose one container (nginx, apache, haproxy, ...) to the outside world (public network).
Virtual networks also allow the creation of overlay networks, which permit containers on different hosts to communicate as if they were on the same host. This is the basis for clustered solutions like Docker Swarm.