Docker and high performance computing (HPC) cluster capability

I'm looking for a way to combine the power of 4 HPC nodes. The nodes are identical: 32 cores and 128 GB RAM each. The idea is to give students at our university the ability to deploy their containers and run their programs on it.
I've read a bit about Docker Machine and Swarm, but that seems aimed more at high availability.. Any ideas? :)
Kind regards,
Sub

Be aware that the Docker daemon runs as root, so giving users access to it effectively lets them mount the host filesystem as root...
see
https://stackoverflow.com/a/30754910/771372
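To make that risk concrete, here is a minimal sketch (Python calling the docker CLI): any user who can talk to the Docker daemon can bind-mount the host's root filesystem into a container and read it as root. The alpine image and the /host mount point are just illustrative choices.

```python
# sketch_root_escape.py -- illustrates why access to the Docker daemon is
# effectively root on the host. Assumes the docker CLI is installed and the
# calling user can reach the daemon (e.g. is in the "docker" group).
import subprocess

# Bind-mount the host's / into the container at /host, then list a
# directory that is normally restricted to root on the host.
result = subprocess.run(
    ["docker", "run", "--rm",
     "-v", "/:/host",          # host root filesystem mounted into the container
     "alpine",                  # any small image works; alpine is just an example
     "ls", "/host/root"],       # /root on the host, readable here as root
    capture_output=True, text=True,
)
print(result.stdout or result.stderr)
```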

Related

Is it possible to run a large number of docker containers?

A little background first. I am building a small service (a website) that gives the user various tools which run according to parameters the user specifies. In my implementation the tools end up as one big script that runs inside Docker, so my service has to launch a new Docker container for each user.
I was thinking about using AWS Fargate or Google Cloud Run, or any other service that makes it possible to run a Docker container.
But I'm curious: what if there are 1000 or 10000 users, each with their own Docker container? Is that reasonable? Do the services (AWS, Google Cloud) have any restrictions, or is this just a bad design?
If I understand correctly, you are suggesting that you instantiate a Docker container for each of your users. I see a couple of issues with this:
Depending on how many users you have, you quickly get into the realm of too many containers: each container consumes resources, not just memory and CPU but also things like the TCP/IP port pool. (A sketch of capping each container's footprint follows below.)
Isolation -> read up on why containers are not VMs.
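One way to keep a large fleet of per-user containers from exhausting the host is to give every container hard resource caps at launch. A rough sketch, assuming the docker CLI is installed; the image name, naming scheme and limit values are placeholders, not recommendations:

```python
# sketch_limited_run.py -- start one per-user container with hard CPU and
# memory caps so a large fleet cannot exhaust the host.
import subprocess

def run_user_container(user_id: str) -> None:
    subprocess.run(
        ["docker", "run", "-d",
         "--name", f"tool-{user_id}",   # hypothetical naming scheme
         "--memory", "256m",            # hard memory cap per container
         "--cpus", "0.5",               # at most half a CPU core
         "--pids-limit", "100",         # guard against fork bombs
         "my-tool-image"],              # placeholder image name
        check=True,
    )

if __name__ == "__main__":
    run_user_container("user123")
```

Even with caps like these, 10000 concurrent containers on one host is unlikely to be practical; the limits just make the failure mode gradual instead of sudden.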

Would sharing memory across docker containers limit the overall memory usage? Is it possible on Windows?

The project I'm working on uses roughly 10 different Docker containers for various services that communicate with each other. Each one uses a lot of RAM, and it's hard to put tight limits on our containers because we need to err on the side of caution and leave extra headroom for each one.
What I would like to do is share RAM across containers so that they draw from a shared pool and each one needs less spare RAM of its own.
I see shared memory is possible with
https://docs.docker.com/engine/reference/run/#ipc-settings---ipc
I'm not sure whether that would actually solve this problem, or whether it's only useful for letting containers communicate with each other (named pipes, shared memory segments and the like).
Also, we're running Windows Docker containers on a Windows host, and it looks like this might only be possible on a Linux host? The Linux-host part is potentially fixable.
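For what it's worth, here is a sketch of what the --ipc setting gives you on a Linux host: two containers sharing one IPC namespace, so shared-memory objects (e.g. files under /dev/shm) created in one are visible in the other. It is a way to pass data between containers, not a way to pool their overall RAM budgets. The container names and the alpine image are placeholders:

```python
# sketch_shared_ipc.py -- two containers sharing one IPC namespace via --ipc.
# This shares /dev/shm and SysV/POSIX shm segments, not general memory limits.
# Linux hosts only.
import subprocess

# First container owns a shareable IPC namespace.
subprocess.run(
    ["docker", "run", "-d", "--name", "shm-owner",
     "--ipc", "shareable", "alpine", "sleep", "infinity"],
    check=True,
)

# Second container joins the first one's IPC namespace.
subprocess.run(
    ["docker", "run", "-d", "--name", "shm-client",
     "--ipc", "container:shm-owner", "alpine", "sleep", "infinity"],
    check=True,
)

# Data written to /dev/shm in shm-owner is now visible in shm-client.
subprocess.run(["docker", "exec", "shm-owner",
                "sh", "-c", "echo hello > /dev/shm/demo"], check=True)
subprocess.run(["docker", "exec", "shm-client", "cat", "/dev/shm/demo"],
               check=True)
```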

Can we run a single container over multiple machines (hosts)?

I just want to know whether any facility like this is available in Docker today. I have already gone through some of the Docker documentation on multi-host features, such as:
Docker swarm
Docker service (with replicas)
I am also aware of the volume problems in swarm mode, and that the maximum resources (RAM and CPU) available to a container vary depending on which machine the swarm manager schedules it on. So my question is:
How can I run a single container instance across multiple machines (not as a service)? (That is, a single container that can use all the resources [RAM1 + RAM2 + ... + RAMn] of the connected machines.)
Is there any way to achieve this?
My question may be naive, but I am curious to know how this could be done.
The answer is no. Containerization technologies cannot treat the compute, network and storage resources of a cluster as one unit; they only orchestrate them.
Docker and co. are built on cgroups, namespaces, layered filesystems, virtual networks, etc. All of these are tied to a specific machine and its running processes, and managing containers across a cluster rather than on a single machine requires additional services (for example Mesos, Kubernetes or Swarm).
You can look at products such as Hadoop, Spark, Cassandra, the Akka framework and other distributed-computation implementations for examples of how cluster resources can be managed as one unit.
PS: keep in mind that system complexity grows as you distribute components more widely.
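To illustrate that last suggestion, here is a minimal PySpark sketch of the kind of framework-level pooling the answer refers to: the work (and the memory it needs) is split across every worker node in the cluster, which is something Docker by itself cannot do for a single container. It assumes a standalone Spark cluster is already running; the master URL is a placeholder.

```python
# sketch_spark_cluster.py -- a minimal PySpark job that spreads work (and
# therefore memory) across all worker nodes of an existing Spark cluster.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("cluster-sum-sketch")
    .master("spark://spark-master:7077")   # placeholder master URL
    .getOrCreate()
)

# Partitions are scheduled onto different machines, so each node only holds
# its own slice of the data in memory.
rdd = spark.sparkContext.parallelize(range(10_000_000), numSlices=128)
print("sum of squares:", rdd.map(lambda x: x * x).reduce(lambda a, b: a + b))

spark.stop()
```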

How can Docker / containers / isolation make programs run faster?

Recently, I discovered docker containers. Could someone please explain the performance difference in running a program on the host compared to running in a container?
Also, what does "programs run faster" mean in terms of better performance and lower startup latency? What would this look like in real-world terms?
Does containerisation require more resources compared to running on the host?
Am I correct in thinking that containers are better than VMs since they don't have an OS of their own?
How do concepts like process scheduling work (are containers treated as processes by the host)?
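A minimal sketch of one way to check that last point yourself, assuming the docker CLI is available on the host and "my-container" is a placeholder for a running container's name: the container's main process has an ordinary host PID that the host kernel schedules like any other process.

```python
# sketch_container_pid.py -- shows that a running container is an ordinary
# process from the host's point of view: Docker reports its PID, and the
# host's /proc and scheduler see it like any other process.
import subprocess
from pathlib import Path

name = "my-container"   # placeholder: any running container's name or ID
pid = subprocess.run(
    ["docker", "inspect", "--format", "{{.State.Pid}}", name],
    capture_output=True, text=True, check=True,
).stdout.strip()

# The host kernel schedules this PID with its usual policy; its cgroup
# membership is visible in /proc/<pid>/cgroup.
print("host PID:", pid)
print(Path(f"/proc/{pid}/cgroup").read_text())
```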

Is there a formula for calculating the overhead of a Docker container?

Supposed I want to run several Docker containers at the same time.
Is there any formula I can use to find out in advance how many containers can run at the same time on a single Docker host? I.e., how much CPU, memory, etc. do I have to budget for the containers themselves?
It's not a formula per se, but you can gather information about a container's resource usage by examining the Linux control groups under /sys/fs/cgroup.
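As a rough sketch of that idea, the snippet below reads a running container's current memory usage straight from its cgroup files. The exact path depends on the cgroup version and driver, so the cgroup v2 / systemd layout assumed here may need adjusting for your host; the container name is a placeholder.

```python
# sketch_cgroup_memory.py -- read a container's current memory usage
# straight from /sys/fs/cgroup. Assumes cgroup v2 with the systemd cgroup
# driver; other hosts use a different directory layout.
import subprocess
from pathlib import Path

name = "my-container"   # placeholder container name or ID
cid = subprocess.run(
    ["docker", "inspect", "--format", "{{.Id}}", name],
    capture_output=True, text=True, check=True,
).stdout.strip()

mem_file = Path(f"/sys/fs/cgroup/system.slice/docker-{cid}.scope/memory.current")
print("memory in use (bytes):", mem_file.read_text().strip())
```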
Links
See this excellent post by Jérôme Petazzoni of Docker, Inc on the subject.
See also Google's cAdvisor tool to view container resource usage.
This IBM research paper found that Docker performance equals or exceeds KVM in every case they measured.
docker stats is also useful for getting a rough idea of how much CPU and memory your containers use.
cAdvisor provides resource usage and other interesting stats about all containers on a host. We have a preliminary setup for root usage, but we are adding a lot more this week.
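If you just want a quick scripted snapshot rather than a full cAdvisor setup, docker stats can also be driven non-interactively. A small sketch; the format fields are standard Go-template placeholders supported by the docker CLI:

```python
# sketch_docker_stats.py -- take one snapshot of per-container CPU and
# memory usage by shelling out to `docker stats`. A rough view only; use
# cAdvisor or the cgroup files for anything precise.
import subprocess

out = subprocess.run(
    ["docker", "stats", "--no-stream",
     "--format", "{{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"],
    capture_output=True, text=True, check=True,
).stdout

for line in out.strip().splitlines():
    name, cpu, mem = line.split("\t")
    print(f"{name}: cpu={cpu} mem={mem}")
```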
