Yum update gets stuck inside Docker

I have installed Docker 20.10 on a RHEL 9 system and started a CentOS 7 container on it. But when I try yum update inside the container, it takes a very long time while running the transaction, as if yum were stuck.
I ran strace -p 6351 to see what was happening inside yum, and it endlessly printed fcntl(765158398, F_GETFD) = -1 EBADF (Bad file descriptor).
The same thing happens when I try yum install openssh-server, but yum install telnet works fine.
I really want to know what is happening in my Docker setup. Any ideas how to fix it?

After some research, I found that the ulimit -n, ulimit -Hn and ulimit -Sn values inside the container were all 1073741824, which made yum check every possible file descriptor, from 0 to 1073741824.
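A quick way to confirm what limit a container actually gets, without opening an interactive shell (a minimal check using the same centos:7 image; in the broken case described above it prints 1073741824 for both values):
$ docker run --rm centos:7 /bin/sh -c 'ulimit -Sn; ulimit -Hn'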
I added --ulimit nofile=1024:262144 to the docker command line (e.g. docker run --ulimit nofile=1024:262144 --name test -p 2202:22/tcp -i -t centos:7 /bin/bash), and yum update worked fine! Now I can enjoy yum in Docker happily!
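If you don't want to pass --ulimit on every docker run, the Docker daemon can also apply it as a default. A sketch of /etc/docker/daemon.json with the same limits as above (restart the daemon afterwards, e.g. sudo systemctl restart docker):
{
  "default-ulimits": {
    "nofile": {
      "Name": "nofile",
      "Soft": 1024,
      "Hard": 262144
    }
  }
}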

Is there also a solution that does not require setting this on every Docker container, maybe in containerd? I am experiencing similar issues with CentOS 7.9 containers running on CentOS 9 hosts using Kubernetes/containerd. Yum installations take hours instead of minutes.
Update: I've added LimitNOFILE=1048576 to the containerd service unit and it works now.
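For reference, a systemd drop-in is a clean way to apply that without editing the packaged unit file (a sketch, using the same value as above):
$ sudo mkdir -p /etc/systemd/system/containerd.service.d
$ sudo tee /etc/systemd/system/containerd.service.d/override.conf <<'EOF'
[Service]
LimitNOFILE=1048576
EOF
$ sudo systemctl daemon-reload && sudo systemctl restart containerd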

Related

RAPIDS.ai dependencies cuml and cudf not found no matter how I install

I have followed every version of the instructions on the AWS-EC2 setup for RAPIDS.ai: https://rapids.ai/cloud#AWS-EC2
I can confirm that I am using the exact instance type in the instructions, and following the steps exactly.
When I try to use the Docker approach, the --gpus all flag is not accepted.
When I try to use the conda approach, the install fails with the error:
PackageNotFoundError: Packages missing in current channels:
- glibc
I have tried (many) different solutions provided to solve both of these problems, none of them seem to work. I really just need to test some python code with cuml and cudf imports in a notebook. Been at this for 7 hours (after giving up on my local and SageMaker).
You note that the --gpus all flag is not accepted, which suggests that you do not have the NVIDIA Docker runtime installed.
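A quick way to confirm that is to ask the daemon which runtimes it has registered; once nvidia-docker2 is installed and Docker has been restarted, nvidia should appear in the list (the exact output format varies by Docker version):
$ docker info | grep -i runtimes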
I followed the instructions you linked and did run into an issue where the sudo yum install -y nvidia-docker2 command failed; I needed to disable an Amazon yum repo that was causing some conflicts, as outlined in this issue.
$ sudo yum-config-manager --disable amzn2-graphics
$ sudo yum install -y nvidia-docker2
$ sudo yum-config-manager --enable amzn2-graphics
Once I'd done that and run sudo systemctl restart docker I was able to start the RAPIDS container.
$ docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 rapidsai/rapidsai:cuda11.2-runtime-ubuntu18.04-py3.7
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.download.nvidia.com/licenses/NVIDIA_Deep_Learning_Container_License.pdf
A JupyterLab server has been started!
To access it, visit http://localhost:8888 on your host machine.
Ensure the following arguments were added to "docker run" to expose the JupyterLab server to your host machine:
-p 8888:8888 -p 8787:8787 -p 8786:8786
Make local folders visible by bind mounting to /rapids/notebooks/host
(rapids) root@be7253bb4fdb:/rapids/notebooks#
Turns out, the first AMI suggested in the documentation is not compatible. Use the Deep Learning NVIDIA one instead.

Upload speed inside Docker Container is limited to 4 Mbit/s

I am new to Docker. I have a container running for a set of students with some version-specific compilers; it is part of a virtual laboratory setup.
Everything is fine with the setup except for the network. I have a 200 Mbps network and this is a speed test done on my phone on the same 200 Mbps network.
I did a speed test on the host machine where the docker container is running. It is running on Ubuntu 20.04 LTS. It is all good.
From inside the docker container running on the above host machine, I did a speed test with the same server ID 9898 and with an auto-selected server too.
We can see that the upload speed inside the Docker container is somehow limited to 4 Mbit/s. I cannot find a reason for it anywhere.
I have seen recently that many students experienced connection drops during their attempt to connect to our SSH server. I believe this has something to do with the bandwidth limit.
The docker run command I am using to run this container build is as follows.
$ sudo docker run -p 7766:22 --detach --name lahtp-serv-3 --hostname lahtp-server --mount source=lahtp-3-storage-hdd,target=/var/lahtp-storage main:0.1
I asked a few people, who suggested running the container with --net=host, which uses the host network instead of the Docker network. I would like to know why the Docker container limits the upload bandwidth, and how using the host network instead of the Docker network fixes the issue.
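A check worth running on the host is whether any traffic shaping (tc qdisc) is attached to the Docker bridge or to the container's veth interface (a diagnostic sketch only; the veth name below is a placeholder and differs per container):
$ tc qdisc show dev docker0
$ ip link show type veth
$ tc qdisc show dev vethXXXXXXX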
Update #1:
I tried to spawn a new Ubuntu 18.04 container with the following command:
$ sudo docker run --net=host -it ubuntu:18.04 /bin/bash
Once inside the container, I installed the following to run the speedtest.
root@lahtp-server:/# apt-get update && apt-get upgrade -y && apt-get install build-essential openssh-server speedtest-cli
Once the installation was done, here are the results.
But adding --net=host does not change the issue. The upload speed is still 4 Mbit/s.
How to remove this bandwidth throttling?
Update #2:
I spawned a new Ubuntu 14.04 Docker container using the following command:
$ sudo docker run -it ubuntu:14.04 /bin/bash
Once the container is up, I installed the following
$ apt-get install python3-dev python3-pip
$ pip3 install speedtest-cli
I tested inside this container, and here are the results.
NO THROTTLING.
I did the same with Ubuntu 16.04 LTS: no throttling.
$ sudo docker run -it ubuntu:16.04 /bin/bash
And once inside the container
$ apt-get install python3-dev python3-pip
$ pip3 install speedtest-cli
NO THROTTLING.

Docker build error

I'm trying to build a Docker image, following the basic tutorial on Docker's own page. My Dockerfile looks like:
FROM docker/whalesay:latest
RUN apt-get -y update && apt-get install -y fortunes
CMD /usr/games/fortune -a | cowsay
That is exactly the same as Docker provides.
I'm running Linux Mint 18, and Docker is installed. I'm able to run images, like hello-world or others that I've built earlier and pushed to Docker Hub. (I used Windows when I created them.)
If I try to rebuild images that I've created earlier, the same thing happens. It always crashes at RUN apt-get -y update && apt-get -y install.
Do anyone know how to solve this problem?
Thanks!
Picture of error message
As per the image, the build fails to resolve "archive.ubuntu.com".
Do the following, as per the references below.
Uncomment the following line in /etc/default/docker
DOCKER_OPTS="--dns 8.8.8.8 --dns 8.8.4.4"
Restart the Docker service: sudo service docker restart
Delete any images which have cached the invalid DNS settings.
Build again and the problem should be solved.
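Note: on newer installs where Docker runs under systemd, /etc/default/docker may be ignored; the equivalent is the dns key in /etc/docker/daemon.json (a sketch with the same resolvers, followed by sudo systemctl restart docker):
{
  "dns": ["8.8.8.8", "8.8.4.4"]
}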
Ref: Docker build "Could not resolve 'archive.ubuntu.com'" apt-get fails to install anything
Actual Ref: https://www.digitalocean.com/community/questions/docker-on-ubuntu-14-04-could-not-resolve-archive-ubuntu-com

Mesos master keeps starting automatically

I have Mesos installed in some Docker containers, and whenever I bring a container up, the mesos-master process starts by default in all of them, even in the ones that run mesos-agents.
I have no idea why this is happening and this is rather annoying.
I am installing Mesos the following way:
RUN rpm -i http://repos.mesosphere.io/el/7/noarch/RPMS/mesosphere-el-repo-7-1.noarch.rpm && \
yum -y install mesos-0.28.2
Any ideas on why this is happening? Is this the expected behavior?
Please let me know what I can do to stop this.
This is expected behavior
You need to explicitly disable the Mesos master (and ZooKeeper, if you installed it). Depending on your system version it can be done as follows:
On RedHat 6 / CentOS 6:
sudo stop mesos-master
sudo sh -c "echo manual > /etc/init/mesos-master.override"
On RedHat 7 / CentOS 7:
sudo systemctl stop mesos-master.service
sudo systemctl disable mesos-master.service
For more, take a look at the slave-setup tutorial.
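If you build the image yourself with the Dockerfile snippet above, the same disabling can be baked in at build time. A sketch only, not tested against that exact image; it assumes a CentOS 7 base where systemctl disable works offline by removing the unit's symlinks, and that the unit is named mesos-master.service as in the RedHat 7 commands above:
RUN rpm -i http://repos.mesosphere.io/el/7/noarch/RPMS/mesosphere-el-repo-7-1.noarch.rpm && \
    yum -y install mesos-0.28.2 && \
    systemctl disable mesos-master.service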

install/access executable for existing docker container

I want to run an executable and all of its libraries from within my container. How do I do that?
For my Ubuntu 14.04 server, I can do sudo apt-get install tetex-base tetex-bin
In this case, however, someone already set up a docker container for me, and I need to be able to run the program from within the container.
I got it working with
docker exec -it containerName apt-get install tetex-base tetex-bin
See docs.
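Keep in mind that packages installed this way only live in that container's writable layer; if the container is ever recreated from its image, they are gone. One way to persist them is to commit the container to a new image afterwards (the tag below is just an example) or to add the install to the image's Dockerfile:
$ docker exec -it containerName apt-get update
$ docker exec -it containerName apt-get install -y tetex-base tetex-bin
$ docker commit containerName my-image:with-tetex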
