Is it possible to mount docker volume between two physical instances? - docker

I’m trying to understand Swarm and volumes at scale. I followed the tutorial at https://docs.docker.com/get-started until the end, and everything was clear and working as expected. Now I want to do the same thing (have two nodes, 1 manager, 1 worker) but on my physical devices. Let’s say I have two computers on the same network, or two AWS instances.
I tried to set it up between my two computers on the local network, but it seems that the manager’s IP only answers for the containers running on the manager node, and the worker’s IP only for the containers on the worker node. Also, Redis does not work when accessed through the worker’s IP.
I think it should be possible, but I can’t find an article where they are doing a similar thing.
I mean, there should be a way; I don’t think companies run all their Docker containers on the same server (AWS instance).
P.S.: I'm using all the same files and setups provided in the link above.
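For reference, a minimal sketch of how the tutorial's setup would typically translate to two machines on the same LAN (the 192.168.1.10 address and the join token are placeholders); the listed ports must be reachable between the hosts for the overlay network and routing mesh to work:

```sh
# On the manager machine (hypothetical LAN IP 192.168.1.10):
docker swarm init --advertise-addr 192.168.1.10
# This prints a "docker swarm join" command containing a join token.

# On the worker machine, paste the command printed above, e.g.:
docker swarm join --token SWMTKN-1-<placeholder-token> 192.168.1.10:2377

# Both hosts must allow traffic on:
#   2377/tcp  (cluster management)
#   7946/tcp and 7946/udp  (node discovery)
#   4789/udp  (overlay network / VXLAN)
```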

Related

Controlling the hosts where my containers run with docker swarm

I'm jumping from a local docker-compose setup to a production environment in which I have 4 VPSs. The first (the manager) is the one with the fewest resources. The other 3 (the workers) are identical and larger. I decided to use Docker Swarm to manage this infrastructure. My doubt is: should I be concerned about which host a given container runs on, or is that the wrong way to think about it? I mean, is Docker Swarm meant to let me abstract away the underlying nodes and create the services and containers, trusting that Docker will manage the resources successfully?
Answer is... both!
The goal is to let Docker Swarm manage things for you as much as possible, but also to add constraints so that your application is deployed on the hardware that best matches its requirements.
For example, if you have a reverse proxy and machine learning models, you might want to deploy your reverse proxy on a CPU optimized server, and your machine learning models on a memory optimized instance.
You need to label your nodes properly, and then add constraints so that services are only deployed to the nodes that match your labels. In the example above, you could add 2 labels: reverse-proxy and ml.
I am explaining how to do this more precisely in this article in case you're interested: https://juliensalinas.com/en/container-orchestration-docker-swarm-nlpcloud/
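As a rough sketch of that approach (the node names and images are hypothetical): first label the nodes, then constrain each service to its label.

```sh
# Label the nodes (node names are placeholders)
docker node update --label-add role=reverse-proxy cpu-node-1
docker node update --label-add role=ml mem-node-1
```

Then, in the stack file, each service declares a placement constraint on that label:

```yaml
services:
  proxy:
    image: nginx              # example image
    deploy:
      placement:
        constraints:
          - node.labels.role == reverse-proxy
  model:
    image: myorg/ml-model     # hypothetical image
    deploy:
      placement:
        constraints:
          - node.labels.role == ml
```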

Why use docker service?

This question illustrates the theoretical differences between docker run and docker service.
What I don't understand is when one would need to run the exact same container replicated multiple times (as per the Docker documentation example)?
There, they run the same web app replicated 5 times.
Is deployment on Kubernetes (for example) a potential use case, where the developer does not want to centralize the app on one host, in order to make it more resilient, hence why 5 replicas are created?
To understand, can someone please explain, with an example use case, where docker service is useful?
swarm is an orchestrator just like kubernetes. docker service deploys services to swarm just as you deploy your services to kubernetes using kubectl.
swarm is essentially a built-in, primitive orchestrator. One possible case for replicas is running a proxy that directs requests to the proper containers. You could expose multiple machines and have one take the place of another in case one fails. Or any other high-availability case you could think of.
Your question could be rephrased as "What's the difference between running a single container and running containers in a cluster?", which would be another question altogether, but that rephrasing might help illustrate what docker service does.
If you want to scale your application, you can either run multiple instances of it (horizontal scaling) or beef up the machine(s) it runs on (vertical scaling). For the first, you would have to put a load balancer in front of your application so that traffic is evenly distributed between the different instances. The idea is that those instances run on different hosts, so if one goes down, your application is still up. Some controlling instance (a Kubernetes service, for example) will notice that one of your instances has gone south and won't direct any more traffic to it. Nowadays, with all the cloud stuff going on, this is typically the way to go.
You don't need Kubernetes for such a setup, but you're right, this would be a typical use case for it. At least if you run your application in a Docker container.
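As a rough illustration of that Kubernetes-style setup (the image name and labels are hypothetical): a Deployment keeps 5 replicas running, and a Service load-balances across whichever replicas are currently healthy.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
spec:
  replicas: 5                     # horizontal scaling: 5 identical instances
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
        - name: webapp
          image: myorg/webapp:1.0 # hypothetical image
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: webapp
spec:
  type: LoadBalancer              # fronts the replicas and spreads traffic
  selector:
    app: webapp
  ports:
    - port: 80
      targetPort: 80
```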
One use case is running on a Docker Swarm cluster that consists of n nodes. You can run replicas of your application across the swarm cluster, with a load balancer/reverse proxy in front to distribute traffic. If any one of the nodes goes down, the application can still run.
But the main use case for running multiple instances is scalability. Suppose you know that one instance of your app can serve 10,000 users at a time (assume bank authentication).
If you want your application to serve 50K users, just run 5 replicas (using docker service create), as in the sketch below.
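A minimal sketch of that (the service and image names are placeholders):

```sh
# Run 5 replicas behind Swarm's built-in ingress load balancer
docker service create --name auth --replicas 5 --publish 8080:8080 myorg/auth-app

# Later, scale out if one instance serves ~10K users and you need more headroom
docker service scale auth=8
```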

Is it possible to run Kubernetes nodes on hosts that can be physically compromised?

Currently I am working on a project where we have a single trusted master server, and multiple untrusted (physically in an unsecured location) hosts (which are all replicas of each other in different physical locations).
We are using Ansible to automate the setup and configuration management; however, I am very unimpressed by how big a gap we have between our development/testing environments and our production environment, as well as by the general complexity of configuring the network and the containers themselves.
I'm curious if Kubernetes would be a good option for orchestrating this? Basically, multiple unique copies of the same pod(s) on all untrusted hosts must be kept running, and communication should be restricted between the hosts, and only allowed between specific containers in the same host and specific containers between the hosts and the main server.
There's a little bit of a lack of info here. I'm going to make the following assumptions:
K8s nodes are untrusted
K8s masters are trusted
K8s nodes cannot communicate with each other
Containers on the same host can communicate with each other
Kubernetes operates on the model that:
all containers can communicate with all other containers without NAT
all nodes can communicate with all containers (and vice-versa) without NAT
the IP that a container sees itself as is the same IP that others see it as
Bearing this in mind, you're going to have some difficulty here doing what you want.
If you can change your physical network requirements, and ensure that all nodes can communicate with each other, you might be able to use Calico's Network Policy to segregate access at the pod level, but that depends entirely on your flexibility.
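For illustration, a minimal sketch of a Kubernetes NetworkPolicy (enforced by Calico or a similar network plugin; the namespace and labels are hypothetical) that only allows pods labelled app=frontend to reach pods labelled app=backend and denies all other ingress to them:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-frontend-only
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: backend              # policy applies to backend pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend     # only frontend pods may connect
```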

Docker Swarm constraint to keep multiple containers together?

I have three containers that need to run on the same Swarm node/host in order to have access to the same data volume. I don't care which host they are delegated to - since it is running on Elastic AWS instances, they will come up and down without my knowing it.
This last fact makes it tricky even though it seems like it should be fairly common. Is there a placement constraint that would allow this? Obviously node.id or node.hostname are out as those are not constant. I thought about labels - that would work, but then I have no idea how to have a "replacement" AWS instance automatically get the label.
Swarm doesn't have the feature to put containers on the same host together yet (with your requirements of not using ID or hostname). That's known as "Pods" in Kubernetes. Docker Swarm takes a more distributed approach. You could try to hack together a label assignment on new instance startup but that isn't ideal.
In Swarm, the way to solve this problem today is by using a different volume driver plugin than the built-in "local" driver. Here's a list of certified ones. The key in Swarm is to not use local storage on a node for volumes. Those volumes will get lost when the node dies anyway, so it's best in Swarm to move your volumes to shared storage.
In AWS I'd suggest you try EFS as shared storage if you need multiple containers to access it at once, and use either Docker's CloudStor driver (comes with Docker for AWS template) or the REX-Ray storage orchestrator solution which ensures shared data paths (NFS, EFS, S3, etc.) are connected to the correct node for the correct Service task.
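As a rough sketch, assuming the CloudStor plugin (cloudstor:aws) is installed on the nodes and that the backing=shared option selects EFS-backed shared storage on the Docker for AWS template (volume, service, and image names are placeholders), three replicas could share one volume like this:

```sh
docker service create \
  --name data-app \
  --replicas 3 \
  --mount type=volume,volume-driver=cloudstor:aws,source=shared-data,target=/data,volume-opt=backing=shared \
  myorg/data-app    # hypothetical image
```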

Networking among kubernetes minions

I installed an 8-node kubernetes cluster (1 master + 7 minion) but I faced a networking problem among minions.
I installed my cluster according to this step-by-step Fedora manual, so I use Fedora 20 with its testing repository to get kubernetes binaries.
After installing, I wanted to try the guestbook example, but it seems to me there is a problem with the inter-container networking.
Although the containers/pods are in the running state and I can reach my 3 frontend containers (via browser) and the Redis containers as well (via netcat), the frontend that is not on the same host as the Redis master cannot reach it. The frontend's PHP throws a network exception.
Can anybody help me why the containers cannot reach each other among the hosts?
I hope I have described my setup accurately enough. Thanks in advance.
The Fedora guide you followed will only get you running on a single machine. It avoids the issues around setting up networking across nodes.
For kubernetes to work, the following network set up must be satisfied:
Every container should be able to talk to every other container, even across nodes. This means also that the bridge IP range for those containers must not overlap.
Code running on any node that isn't in a container should be able to reach every container (and vice versa), even across nodes.
It is not necessary (but useful) for computers on the network that aren't part of the cluster to be able to reach the containers directly.
There are a lot of ways to achieve this -- for instance the set up for vagrant sets up GRE tunnels between each node. On GCE we use features of the platform to do the routing. If you are on physical machines on a switch you can probably just do a big layer 2 network w/ bridges. A bulletproof way to get started (but perhaps not the most performant, depending on your set up) is to use something like flannel.
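For example, flannel's network configuration is a single JSON document stored in etcd; a minimal sketch (the subnet and backend here are only illustrative) looks like:

```sh
# Write flannel's network config to etcd (etcd v2 API, as used at the time of this guide)
etcdctl set /coreos.com/network/config \
  '{ "Network": "10.244.0.0/16", "Backend": { "Type": "vxlan" } }'

# Then run flanneld on every node so each host gets a non-overlapping subnet
# and cross-node container traffic is encapsulated over the chosen backend.
```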
We are working on making this stuff easier to start up (without using a mess of shell scripts) and are thinking of building something like flannel in so that there is a reasonable default.
