How to handle Python module installation when running Jupyter Notebook in Docker?

I've recently started using the awesome Jupyter Notebook. Since I've always had trouble with things breaking because of different Python versions and Python module versions, I like to run Jupyter Notebook in a Docker container. I've created a Dockerfile to build my image (based on the official jupyter/scipy-notebook image on Docker Hub), everything is up and running, and it's working great.
The only thing that concerns me is how to handle the installation of different python modules I might need during the next week(s). How do you guys handle that?
1) Install the needed modules in the running docker container, then use docker commit and save the running container as a new image?
2) Always edit the Dockerfile to install the needed modules and re-build the image?
3) Don't delete the container (no --rm flag) and just restart it?
1) and 2) seem a little complicated, but I also want to be able to start from a "fresh" notebook in case I mess something up, so 3) is not perfect either. Is there something I missed?
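For illustration, option 2) can be kept fairly light by listing the extra modules in a requirements.txt and layering them on top of the base image. This is only a sketch under assumptions: the directory name my-notebook, the image tag, the example modules, and the port mapping are all hypothetical and not part of the question.

```shell
#!/bin/sh
# Sketch of option 2: keep extra modules in requirements.txt and rebuild.
# Directory name, image tag, and module names are hypothetical examples.
workdir="${NOTEBOOK_DIR:-./my-notebook}"
mkdir -p "$workdir"

# Modules to add on top of the base image (examples only).
cat > "$workdir/requirements.txt" <<'EOF'
tqdm
requests
EOF

# Derived image: the base layers are reused, so only the pip layer is
# rebuilt when requirements.txt changes.
cat > "$workdir/Dockerfile" <<'EOF'
FROM jupyter/scipy-notebook
COPY requirements.txt /tmp/requirements.txt
RUN pip install --no-cache-dir -r /tmp/requirements.txt
EOF

# Rebuild the image, then start a throwaway container from it.
# --rm keeps the "fresh notebook" property: nothing survives the container.
rebuild_and_run() {
    docker build -t my-notebook "$workdir" &&
    docker run --rm -p 8888:8888 my-notebook
}
```

After adding a line to requirements.txt, calling rebuild_and_run again gives a fresh container that already contains the new module.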

Related

How to load and run offline docker image built using docker-compose build?

I'm new to docker and have been dabbling with it for the past few days. I've managed to successfully use docker-compose for a multi-container deployment involving an app server (flask + gunicorn) and web server (nginx).
Now, I'd like to recreate the deployment on an offline machine. After doing some research, it seems that most people recommend using docker save and docker load to transfer the base images. However, I'm wondering whether it's possible to recreate the deployment from the image created by docker-compose build. The reason is that I would like to skip the entire process of building wheels of my Python package dependencies for offline use, which I would have to do if I started from the base images.
I've tried to save that particular image (the output of docker-compose build) and load it on the offline machine, and then tried docker run and docker-compose up, but neither seems to work. I'd like to check with the community whether this method is even possible, and if so, what the right way to go about it is.
Thanks!
To solve my issue, I ended up making an image of each individual container post pip install, then using docker-compose.yml simply to spin them up. As David mentioned, it doesn't seem possible to spin up the container from the single image output by docker-compose build.
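A rough sketch of that workflow, assuming one built image per service. The image names (myproject_app, myproject_web), the archive name, the port mapping, and the compose file name are hypothetical; the real names can be read from docker images after the build.

```shell
#!/bin/sh
# Hypothetical image names; list the real ones with `docker images`.
images="myproject_app:latest myproject_web:latest"
archive="images.tar"
compose_file="docker-compose.offline.yml"

# Compose file for the offline machine: it only references images by name,
# so nothing has to be built (or downloaded) there.
cat > "$compose_file" <<'EOF'
version: "3"
services:
  app:
    image: myproject_app:latest
  web:
    image: myproject_web:latest
    ports:
      - "80:80"
EOF

# On the online machine: write all built images into one archive.
save_images() {
    # $images is intentionally unquoted so each name is a separate argument.
    docker save -o "$archive" $images
}

# On the offline machine: load the archive, then start the services.
load_and_start() {
    docker load -i "$archive" &&
    docker-compose -f "$compose_file" up -d
}
```

Run save_images where the images were built, copy the archive and the compose file across, then run load_and_start on the offline machine.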

Using a pre-built docker container image with the requirements pre-installed

I'm looking for a way to run a Docker container with all the requirements already provided, to avoid waiting for the requirements to download.
I'm debugging python lambda locally.
I use the sam-cli integration in PyCharm.
To specify the requirements I have them all listed in a requirements.txt file.
When I run the debugger, sam build is executed with the use-container setting.
This goes and fetches all the requirements from the internet into the container and then executes it.
When I'm working offline or with slow internet I would like to be able to use a container that has all the requirements. This will also be great to speed up the debugging process.
How can I set up my environment so it uses a pre-built Docker container?
Build a new image based on the old one, and add a RUN instruction that installs all your requirements into a new layer of the new image.
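As a minimal sketch of that suggestion: the base image name my-lambda-base, the output tag, and the file name Dockerfile.prebuilt are hypothetical, and the script assumes a requirements.txt already exists in the project directory. How the SAM tooling is pointed at the resulting image is not covered here.

```shell
#!/bin/sh
# Hypothetical names: replace the base image and tag with your own.
project_dir="${PROJECT_DIR:-.}"
dockerfile="$project_dir/Dockerfile.prebuilt"

# The requirements are installed once, at build time, into their own layer.
# Later container starts reuse that layer and need no network access.
cat > "$dockerfile" <<'EOF'
FROM my-lambda-base:latest
COPY requirements.txt /tmp/requirements.txt
RUN pip install --no-cache-dir -r /tmp/requirements.txt
EOF

# Build while online; the image can then be used offline.
build_prebuilt() {
    docker build -f "$dockerfile" -t my-lambda-prebuilt "$project_dir"
}
```

The image only needs rebuilding when requirements.txt changes, because Docker caches the pip layer otherwise.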

Additional steps in Dockerfile

I have a Docker image which is a server for a web IDE (Jupyter notebook) for Haskell.
Each time I want to allow the usage of a library in the IDE, I have to go to the Dockerfile and add the install command into it, then rebuild the image.
Another drawback is that I have to fork the original image on GitHub, which keeps me from contributing to it.
I was thinking about writing another Dockerfile which pulls the base one with the FROM directive and then RUNs the commands to install the libraries. But, as they are in separate layers, the guest system does not find the Haskell package manager command.
TL;DR: I want to run stack install <library> (stack is like npm or pip, but for Haskell) from the Dockerfile, but I don't want to have a fork of the base image.
How could I solve this problem?
"I was thinking about writing another Dockerfile which pulls the base one with the FROM directive and then RUNs the commands to install the libraries. But, as they are in separate layers, the guest system does not find the Haskell package manager command."
This is indeed the correct way to do this, and it should work. I'm not sure I understand the "layers" problem here - the commands executed by RUN should be running in an intermediate container that contains all of the layers from the base image and the previous RUN commands. (Ignoring the possibility of multi-stage builds, but these were added in 17.05 and did not exist when this question was posted.)
The only scenario I can see where stack might work in the running container but not in the Dockerfile RUN command would be if the $PATH variable isn't set correctly at that point. Check this variable, and make sure RUN is running as the correct user.
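A sketch of such a derived Dockerfile, including a diagnostic step for the $PATH and user checks mentioned above. The base image name example/ihaskell-notebook, the user name notebook, and the library lens are placeholders, not taken from the question.

```shell
#!/bin/sh
# Placeholders: base image, user name, and library are hypothetical.
workdir="${HASKELL_DIR:-./ihaskell-extra}"
mkdir -p "$workdir"

cat > "$workdir/Dockerfile" <<'EOF'
FROM example/ihaskell-notebook
# Run as the user that owns the stack installation in the base image.
USER notebook
# Diagnostic: show the PATH seen at build time and fail early if stack
# cannot be found on it.
RUN echo "PATH at build time: $PATH" && command -v stack
# Install the extra library in a new layer on top of the base image.
RUN stack install lens
EOF

build_extra() {
    docker build -t ihaskell-extra "$workdir"
}
```

If the diagnostic step fails, an ENV PATH=... line placed before it, pointing at wherever the base image keeps stack, would be the next thing to try.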

How to make docker image of host operating system which is running docker itself?

I started using Docker and I can say, it is a great concept.
Everything is going fine so far.
I installed Docker on Ubuntu (my host operating system), played with images from the repository, and made new images.
Question:
I want to make an image of the current (host) operating system. How can I achieve this using Docker itself?
I am new to docker, so please ignore any silly things in my questions, if any.
I was doing maintenance on a server, the ones we pray not to crash, and I came across a situation where I had to replace sendmail with postfix.
I could not stop the server, nor could I use the image available on Docker Hub, because I needed to be absolutely sure I would not run into problems. That's why I wanted to make an image of the server.
I got to this thread and from it found ways to reproduce the procedure.
Below is the description of it.
We start by building a tar file of the entire filesystem of the machine we want to clone (as pointed out by @Thomasleveil in this thread), excluding some unnecessary and hardware-dependent directories. It may not be as perfect as I intended, but it seems fine to me; you'll need to find what works for you.
$ sudo su -
# cd /
# tar -cpzf backup.tar.gz --exclude=/backup.tar.gz --exclude=/proc --exclude=/tmp --exclude=/mnt --exclude=/dev --exclude=/sys /
Then download the file to your machine, import the tar.gz into Docker as an image, and start a container from it. Note that in the example I use the date the image was generated (year, month, day) as the image tag when importing the file.
$ scp user@server-uri:path_to_file/backup.tar.gz .
$ cat backup.tar.gz | docker import - imagename:20190825
$ docker run -t -i imagename:20190825 /bin/bash
IMPORTANT: This procedure produces an exact copy of the server's filesystem. If you plan to distribute the image to developers, testers, or anyone else, first remove or change anything in it that contains restricted passwords, keys, or user accounts, to avoid security breaches.
I'm not sure I understand why you would want to do such a thing, but that is not the point of your question, so here's how to create a new Docker image from nothing:
If you can come up with a tar file of your current operating system, then you can create a new docker image of it with the docker import command.
cat my_host_filesystem.tar | docker import - myhost
where myhost is the docker image name you want and my_host_filesystem.tar the archive file of your OS file system.
Also take a look at "Docker, start image from scratch" on Super User and this answer on Stack Overflow.
If you want to learn more about this, searching for docker "from scratch" is a good starting point.
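For reference, the "from scratch" route can be sketched with a Dockerfile instead of docker import. This assumes the archive my_host_filesystem.tar from above has been placed in the build directory and that it contains /bin/bash; the directory name and image tag are hypothetical.

```shell
#!/bin/sh
# Hypothetical build directory; the tar archive must be copied into it.
workdir="${SCRATCH_DIR:-./myhost-image}"
mkdir -p "$workdir"

cat > "$workdir/Dockerfile" <<'EOF'
# Start from the empty image, with no parent layers at all.
FROM scratch
# ADD unpacks a local tar archive into the target directory.
ADD my_host_filesystem.tar /
# Assumes the archived filesystem provides /bin/bash.
CMD ["/bin/bash"]
EOF

build_from_scratch() {
    # Fail with a clear message if the archive is not in the build context.
    if [ ! -f "$workdir/my_host_filesystem.tar" ]; then
        echo "missing $workdir/my_host_filesystem.tar" >&2
        return 1
    fi
    docker build -t myhost "$workdir"
}
```

Compared with docker import, this keeps the default command and any later customisation in a file that can be versioned.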

Dynamically get docker version during image build

I'm working on a project that requires me to run Docker within Docker. Currently, I just run the Docker client inside the container and pass in an environment variable with the TCP address of the Docker daemon I want to communicate with.
The line in the Dockerfile that I use to install the client looks like this:
RUN curl -s https://get.docker.io/builds/Linux/x86_64/docker-latest -o /usr/local/bin/docker
However, the problem is that this will always download the latest docker version. Ideally, I will always have the Docker instance running this container on the latest version, but occasionally it may be a version behind (for example I haven't yet upgraded from 1.2 to 1.3). What I really want is a way to dynamically get the version of the Docker instance that's building this Dockerfile, and then pass that in to the URL to download the appropriate version of Docker. Is this at all possible? The only thing I can think of is to have an ENV command at the top of the Dockerfile, which I need to manually set, but ideally I was hoping that it could be set dynamically based on the actual version of the Docker instance.
While your question makes sense from an engineering point of view, it is at odds with the intention of the Dockerfile. If the build process depended on the environment, it would not be reproducible elsewhere. There is not a convenient way to achieve what you ask.
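One workaround keeps the Dockerfile itself reproducible by letting the caller supply the version on the command line. The sketch below assumes a Docker release that supports ARG and --build-arg, which is newer than the 1.2 and 1.3 releases mentioned in the question. The base image, the image tag, and the directory name are hypothetical, and the actual download step is left out because the URL pattern for a specific version is not given here.

```shell
#!/bin/sh
# Hypothetical build directory and names.
workdir="${CLIENT_DIR:-./docker-client}"
mkdir -p "$workdir"

cat > "$workdir/Dockerfile" <<'EOF'
FROM debian:stable-slim
# Value supplied by the caller with --build-arg; the Dockerfile itself
# never inspects the machine it is built on.
ARG DOCKER_VERSION
ENV DOCKER_VERSION=${DOCKER_VERSION}
# Placeholder for the download step, which would use ${DOCKER_VERSION}
# in place of "latest".
RUN echo "Docker client version to install: ${DOCKER_VERSION}"
EOF

# Ask the daemon for its own version (runs on the host, not in the build).
host_docker_version() {
    docker version --format '{{.Server.Version}}'
}

build_with_host_version() {
    docker build \
        --build-arg "DOCKER_VERSION=$(host_docker_version)" \
        -t docker-client "$workdir"
}
```

The environment dependence then lives in the build command rather than in the Dockerfile, so passing the same version elsewhere reproduces the same image.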

Resources