Define different network delays between docker containers

I want to simulate a network of computers with different network delays between different machines using Docker (ideally docker-compose). For example, containers A and B have no delay between them, but B and C have a 3-second delay. Linux's traffic control (tc) on its own does not seem to suffice, because it does not differentiate between containers.
This was already asked here but was not answered. Assuming there is a good solution to this, I feel it should be out there for whoever may need it in the future.
Thanks in advance!
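For what it's worth, tc can differentiate per destination if you attach a netem delay to one band of a prio qdisc and steer only the peer's IP into that band with a u32 filter, run from inside the container (which needs CAP_NET_ADMIN). A rough sketch, where the network name, subnet, static IPs and the alpine sleeper containers are just illustrative assumptions:

```sh
# User-defined bridge with predictable addresses (subnet and IPs are examples).
docker network create --subnet 172.20.0.0/24 simnet

docker run -d --name A --network simnet --ip 172.20.0.10 --cap-add NET_ADMIN alpine sleep infinity
docker run -d --name B --network simnet --ip 172.20.0.11 --cap-add NET_ADMIN alpine sleep infinity
docker run -d --name C --network simnet --ip 172.20.0.12 --cap-add NET_ADMIN alpine sleep infinity

# Inside B: delay only packets destined for C (172.20.0.12); traffic to A stays untouched.
docker exec B sh -c '
  apk add --no-cache iproute2 &&
  tc qdisc add dev eth0 root handle 1: prio &&
  tc qdisc add dev eth0 parent 1:3 handle 30: netem delay 3000ms &&
  tc filter add dev eth0 protocol ip parent 1:0 prio 3 u32 \
    match ip dst 172.20.0.12/32 flowid 1:3
'
# Repeat inside C with "match ip dst 172.20.0.11/32" if the delay should apply in both directions.
```

The same idea translates to docker-compose by giving the services cap_add: [NET_ADMIN] and static IPs and running the tc commands from an entrypoint script; tools such as Pumba wrap netem if you would rather not hand-roll the filters.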

Related

Can concurrent docker containers speed-up computations?

As a disclaimer, let me just say that I am a beginner with Docker, so this question might sound a bit dumb.
I am exploring parallelization options to speed-up some computations. I'm working with Python, so I followed the official guidelines to create my first image and then run it as a container.
For the time being, I use a dummy program that generates a very large random NumPy matrix (say 4000 x 4000) and then finds how many elements in each row fall into a predefined range [min, max].
I then launched a second container of the same image, obviously with a different port and name. I didn't get any speed-up in the computations, which I was somehow expecting, since:
a) I haven't developed any mechanism for the 2 containers to somehow "talk to each other" and share intermediate results
b) I am not sure if the program itself is suitable for speedups in such a way.
So my questions corresponding to a, b above are:
Is parallelism a "feature" supported by docker deployments and in what sense? Is it just load sharing? And if I implement a load balancer, how does docker know how to transfer intermediate results from one container to the other?
If the previous question is not "correct", do I then need to write "parallel" versions of my programs to assign to each container? Isn't this equivalent to writing MPI versions of my program and assigning them to different cores on my system? What would be the benefit of a Docker architecture then?
thanks in advance
Docker is just a way of deploying your application; it does not in itself give you parallelism. Your application itself needs to support parallelism. Docker (and Kubernetes, etc.) can help you scale out easily, but your applications need to be able to support that scaling out.
So if running multiple instances of your application in parallel today (however you might do that) would not deliver any performance improvement, then Docker will not help you. If running multiple instances does improve performance, then Docker will help you scale out.
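To make that concrete for the matrix example above: the program itself has to accept a partition of the work, and you then start one container per partition. A rough sketch, where the image name my-matrix-app and the ROW_START/ROW_END environment variables are hypothetical:

```sh
# Each container processes its own slice of rows; the speed-up comes from the
# application's partitioning, not from Docker itself.
docker run -d --name worker1 -e ROW_START=0    -e ROW_END=2000 my-matrix-app
docker run -d --name worker2 -e ROW_START=2000 -e ROW_END=4000 my-matrix-app

# Merging the per-slice counts afterwards (shared volume, queue, ...) is also the
# application's job; Docker will not combine intermediate results for you.
```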

Does Docker Swarm keep data synced among nodes?

I've never done anything with Docker Swarm, or Kubernetes so I'm trying to learn what does what, and which is best for my purpose before tackling it.
My scenario:
I have a desktop PC running Docker Desktop, and
I have a Raspberry Pi running Docker on Raspbian.
This is all on a home LAN, so I don't really want to get crazy with complicated things.
I want to run Pi-hole and DNSCrypt Proxy containers on both 'machines' (mostly for redundancy, because Docker Desktop seems to crash a lot, taking down my entire DNS with it when Pi-hole runs only on that machine).
My main thing is, I want all the data and configuration between them to stay in sync (i.e. Pi-hole's container data stays in sync on both devices), and I want the manager to make sure it's always up, in case of crashes, and so on.
My questions:
Being completely new to this area, and just doing a bit of poking around:
it seems that Kubernetes might be a bit much, and more complicated than I need for this?
That's why I was thinking Swarm instead, but I'm also not sure whether either of them will keep data synced?
And, say I create 2 Pi-hole containers on the Manager machine, does it create 1 on the manager machine, and 1 on the worker machine?
Any info is appreciated!
Docker doesn't quite have anything that directly meets your need, but if you've got a reliable file server on your home LAN, you could do it really easily.
Broadly speaking, you want to look at Docker volume plugins. Most of them ultimately work via an external storage provider and so won't be that helpful for you. There are a couple of more exotic ones, like Portworx or StorageOS, that can do portable/replicated storage purely in Docker, but I think most of them require a paid license.
But, if you have a fileserver that you trust to stay up and running, you can mount an NFS/CIFS share as a volume as mentioned in the Docker Docs, and Docker can handle re-connecting it when a container moves from one node to another due to a failure.
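As a concrete sketch (the file server address, the export path and the pihole/pihole mount are just placeholders for your setup), an NFS-backed named volume can be declared with the built-in local driver and mounted like any other volume:

```sh
# NFS-backed named volume; replace the address and export path with your file server's.
docker volume create --driver local \
  --opt type=nfs \
  --opt o=addr=192.168.1.50,rw \
  --opt device=:/export/pihole \
  pihole_data

# Mount it into the container as usual.
docker run -d --name pihole -v pihole_data:/etc/pihole pihole/pihole
```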
One other note: you want two manager nodes and one container per service in your swarm. The swarm needs a working manager node to function (which matters if a manager crashes). Multiple separate instances would generally only be helpful if the service were designed as a distributed, fault-tolerant application.

Docker Samba Container performance issue

Maybe some of you will find this question stupid, but that won't hurt me because I know I am a newbie with Docker.
I have successfully set up a running container:
a Samba v3 app (cf. https://github.com/dastrasmue/rpi-samba, but there are lots of others)
it successfully shares a USB drive mounted on my Pi over my network
But I have a performance issue: I cannot do two operations at the same time, for example:
copy a movie from this share to my computer
explore this share from another computer
In fact, the second task is frozen until the first one has finished.
It acts as if my Samba can handle only one task at a time (single-threaded)?
Maybe that is normal because of the use of Docker (one container for one task)?
Can anyone confirm this? Or has anyone managed to handle multiple operations at the same time on a Samba container (or another one)?
Or just tell me it is not a good idea to use Docker with Samba :)
Ty
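One thing worth checking before blaming Docker: Samba normally forks a separate smbd process per client connection, so the container itself should not make it single-threaded; on a Pi, the USB or network I/O is a more likely bottleneck. A quick way to look (the container name samba is an assumption, and smbstatus must be present in the image):

```sh
# Each connected client should show up with its own smbd process / session.
docker exec samba smbstatus

# Watch whether the container is CPU- or I/O-bound while both transfers run.
docker stats samba
```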

In Docker Swarm mode is there any point in replicating a service more than the number of hosts available?

I have been looking into the new Docker Swarm mode that will be available in Docker 1.12. In this Docker Swarm Mode Walkthrough video, they create a simple Nginx service that is composed of a single Nginx container. In the video, they have 4 nodes in the Swarm cluster. During the scaling demonstration, they increase the replication factor to 10, thus creating 10 copies of the Nginx container across all 4 machines in the cluster.
I get that the video is just a demonstration, but in the real world, what is the point of creating more replicas of a container (or service) than there are nodes in the Swarm cluster? It seems pointless, since two containers on the same machine would be sharing that machine's finite computing resources anyway. I don't get what the benefit is.
So my question is, is there any real world benefit to replicating a Docker service or container beyond the number of nodes in the Swarm cluster?
Thanks
It depends on how the application handles threading and multiple requests. A single threaded application, or job that only handles one request at a time, may use a fraction of the OS resources and benefit from running multiple instances on a single host. An application that's been tuned to process requests concurrently and which fully utilizes the OS will see no benefit and will in fact incur a penalty of taking away resources to run multiple instances of the application.
One advantage can be performing live zero-downtime software updates. See the Docker 1.12rc2 Swarm tutorial on rolling updates.
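For instance (the service name web and the nginx image tag are placeholders), if you keep more replicas than strictly needed, you can roll a new image out a few tasks at a time while the rest keep serving traffic:

```sh
# Update 2 tasks at a time, waiting 10s between batches, so some replicas always stay up.
docker service update --image nginx:1.25 \
  --update-parallelism 2 --update-delay 10s web
```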
Say you have RabbitMQ or another queue system with a high data load. You can start more worker containers than you have nodes to handle the load on your RabbitMQ.
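A sketch of that pattern (the worker image and the RABBITMQ_URL variable are hypothetical):

```sh
# 10 worker tasks spread over however many nodes you have, all draining the same queue.
docker service create --name queue-worker --replicas 10 \
  --env RABBITMQ_URL=amqp://rabbitmq my-worker-image

# Scale with the queue depth rather than with the node count.
docker service scale queue-worker=20
```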
Hardware resource constraints are not the only thing to consider when replicating services.
A simple example: suppose you have a service that provides security details. Its resource consumption is low (read a record from the DB/cache and send it out). However, if 20 or 30 requests have to be handled by the same single instance, the requests will queue up.
Yes, there are better ways to implement my example, but I believe it is good enough to illustrate why one might replicate a service on the same host/node.

One docker container per node or many containers per big node

We have a little farm of docker containers, spread over several Amazon instances.
Would it make sense to have fewer, bigger host instances (in terms of RAM and size) hosting multiple smaller containers at once, or to have one host instance per container, sized according to that container's needs?
EDIT #1
The issue here is that we need to decide up front. I understand that we can decide later using various monitoring stats, but we need to make some architecture and infrastructure decisions before anything is deployed. Moreover, we do not have control over what content is going to be deployed.
You should read:
An Updated Performance Comparison of Virtual Machines and Linux Containers: http://domino.research.ibm.com/library/cyberdig.nsf/papers/0929052195DD819C85257D2300681E7B/$File/rc25482.pdf
Resource management in Docker: https://goldmann.pl/blog/2014/09/11/resource-management-in-docker/
You need to check how much memory, CPU, I/O, etc. your containers consume, and then you can draw your conclusions.
At the very least, you can easily check a few things with docker stats and docker top my_container.
The associated docs:
https://docs.docker.com/engine/reference/commandline/stats/
https://docs.docker.com/engine/reference/commandline/top/
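For example, a one-shot snapshot of every running container (my_container is a placeholder name):

```sh
# CPU and memory use of all running containers, without the live refresh.
docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"

# Processes running inside one specific container.
docker top my_container
```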
