Modifying a docker image - docker

I have recently started working with Docker. I have downloaded a Docker image and want to change it so that I can copy a folder and its contents from my local machine into the image, or edit a file inside the image.
I thought I might be able to extract the image somehow, make the changes, and then build an image again, but I am not sure it works like that. I looked for options but couldn't find a promising solution.
The current Dockerfile for the image is somewhat like this:
FROM abc/def
MAINTAINER Humpty Dumpty <#hd>
RUN sudo apt-get install -y vim
ADD . /home/humpty-dumpty
WORKDIR /home/humpty-dumpty
RUN cd lib && make
CMD ["bash"]
Note: I am looking for an easy and clean way to change the existing image itself, not to create a new image with the changes.

Since an existing Docker image cannot be changed, I wrote a Dockerfile for a new image based on the original one and modified it to include the test folder from my local machine.
This link was helpful: Build your own image - Docker Documentation
FROM abc/def:latest
The line above tells Docker which image your image is based on, so the new image starts with the contents of the parent image.
Finally, to include the test folder from my local drive, I added the command below to my Dockerfile:
COPY test /home/humpty-dumpty/test
and the test folder was added to the new image.
Below is the Dockerfile used to create the new image from the existing one.
FROM abc/def:latest
# Extras
RUN sudo apt-get install -y vim
# copies local folder into the image
COPY test /home/humpty-dumpty/test
Update: To edit a file inside a running container started from this image, open it with the vim editor installed by the Dockerfile above:
vim <filename>
The usual vim commands can then be used to edit and save the file. Note that such edits change only that container, not the image.

You don't change existing images. Images are identified by a checksum and are considered read-only. Containers that use an image point to the same files on the filesystem, adding their own read-write (RW) layer on top, and therefore depend on the image being unchanged. Layer caching adds to this dependency.
Because of the layered filesystem and caching, creating a new image with just your one folder addition only adds a layer containing that addition, not a full copy of the image. Therefore, the easy, clean, and correct way is to create a new image using a Dockerfile.
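A rough sketch of that workflow, assuming the placeholder base image abc/def from the question can be pulled. The directory name layer-demo and the tag abc/def-with-test are made up for illustration. It writes a two-line Dockerfile and, if the docker CLI is available, builds it and lists the layers of the result:

```shell
# Scratch build context; the directory name layer-demo is hypothetical
mkdir -p layer-demo/test
echo "example file" > layer-demo/test/example.txt

# New image = parent image + one extra layer holding the test folder
cat > layer-demo/Dockerfile <<'EOF'
FROM abc/def:latest
COPY test /home/humpty-dumpty/test
EOF

# Build and list the layers only when the docker CLI is present
if command -v docker >/dev/null 2>&1; then
  docker build -t abc/def-with-test layer-demo \
    && docker history abc/def-with-test \
    || echo "build failed (abc/def is a placeholder and may not be pullable)"
fi
```

In the docker history output, only the topmost entry (the COPY step) should be new; the entries below it belong to the parent image and are shared with it rather than duplicated.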

First of all, I would not recommend modifying someone else's image; it is better to create your own. That said, you can use the COPY instruction to add a folder from the host machine to a Docker image:
COPY <src> <dest>
The only caveat is that the <src> path must be inside the build context; you cannot COPY ../something /something, because the first step of a docker build is to send the context directory (and its subdirectories) to the Docker daemon.
FROM abc/def
MAINTAINER Humpty Dumpty <#hd>
RUN sudo apt-get install -y vim
# Make sure the /home/humpty-dumpty directory exists;
# create it if it does not
RUN mkdir -p /home/humpty-dumpty
# COPY copies the contents of a source directory, so name the target
# directory explicitly to end up with /home/humpty-dumpty/test
COPY test /home/humpty-dumpty/test
WORKDIR /home/humpty-dumpty
RUN cd lib && make
CMD ["bash"]

I think you can use the docker cp command to make changes to a container that was created from your Docker image, and then commit the changes as a new image.
References:
Guide for docker cp: https://docs.docker.com/engine/reference/commandline/cp/
Guide for docker commit: https://docs.docker.com/engine/reference/commandline/container_commit/
Remember, a Docker image is read-only, so you cannot make any changes to it. The only way to change an image is to modify your Dockerfile and rebuild it, but in that case you lose any data created in existing containers (unless it is stored on a Docker volume). A container, however, is not read-only, so you can make changes to it.
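The cp-then-commit approach could look roughly like the helper below. This is only a sketch: the function name add_path_to_image and the tag in the example call are hypothetical, and it assumes a running Docker daemon. It uses docker create so that no process has to be started in the container just to copy a file into it:

```shell
# Copy a local path into a throwaway container and commit the result as a new image.
# Arguments: base image, local source path, destination path in the container, new image tag
add_path_to_image() {
  base_image=$1; src=$2; dest=$3; new_image=$4

  # Create (but do not start) a container from the base image
  container_id=$(docker create "$base_image") || return 1

  # Copy the files in, then snapshot the container filesystem as a new image
  docker cp "$src" "$container_id:$dest" \
    && docker commit "$container_id" "$new_image"
  status=$?

  # Remove the temporary container whether or not the commit succeeded
  docker rm "$container_id" >/dev/null
  return "$status"
}

# Example call (abc/def is the placeholder image from the question):
# add_path_to_image abc/def ./test /home/humpty-dumpty/test abc/def:with-test
```

Unlike a Dockerfile, this leaves no record of how the new image was produced, which is one reason the other answers prefer a rebuild.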

Related

How can I reproduce docker commands seen in docker hub image layers?

I am wondering how it is possible to reproduce the docker commands seen in this docker image. The image copies certain versions of clang and gcc, which is something I wish to do in my own dockerfile. I cannot use the linked docker image, as it contains many commands that are unnecessary for the work I want to do.
The very first command is
ADD file:2cddee716e84c40540a69c48051bd2dcf6cd3bd02a3e399334e97f20a77126ff in /
Further down, there are many similar COPY commands. I wish to reproduce the following command in my own dockerfile:
COPY dir:49371ba683da700cabfad7284da39bd2144aa0c46086c3015a74737d7be6b51e in /compilers/clang/3.4.2
The command copies clang-3.4.2 into the given folder. I am unsure how I can do the same, or even what the hash is/means.
I tried looking, but I couldn't find the Dockerfile used to create the image. There is another way, though.
It's quite a large image and I'm on a terrible internet connection, so I haven't tested this myself, but you can copy the things you need from the image into a new one of your own with a multi-stage build, like this:
FROM cnsun/perses:perses_part_54_name_clang_trunk AS original
FROM ubuntu:latest
COPY --from=original /compilers/clang/3.4.2 /compilers/clang/3.4.2
You can also copy the files from the image to your computer. Then you can copy them from there into new images without referencing the cnsun image:
docker run --rm -v "$(pwd)":/dest --entrypoint /bin/bash cnsun/perses:perses_part_54_name_clang_trunk -c "cp -r /compilers/clang/3.4.2 /dest"
This will copy the /compilers/clang/3.4.2 directory into the current directory on the host. If your host is Windows, replace $(pwd) with %cd%.

Docker - how to ensure commit will persist a file?

I keep doing a pull, run, <UPLOAD FILE>, commit, tag, push cycle, only to be dismayed that my file is gone when I pull the pushed image. My goal is to include an ipynb file with my image that serves as a README/tutorial for my users.
Reading other posts, I see conflicting claims about whether commit is the way to add a file. What causes commit to persist or disregard a file? Am I supposed to use docker cp to add the file before committing?
If you need to publish your notebook file in a Docker image, use a Dockerfile, something like this:
FROM jupyter/datascience-notebook
COPY mynotebook.ipynb /home/jovyan/work
Then, once you have your notebook the way you want it, just run docker build and docker push. As for why you are having this problem: the jupyter images store the notebooks in a volume. Data in a volume is not part of the image; it lives on the filesystem of the host machine. That means a commit isn't going to save anything in the work folder.
Really, an ipynb is a data file, not an application. The right way to do this is probably to upload the ipynb file to a file store somewhere and tell your users to download it, since they could use one Docker image to run many data files. If you really want a prebuilt image using the workflow you described, you could put the file somewhere that isn't in a volume so that it gets captured in your commit.
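One way to check which paths an image declares as volumes is docker image inspect. The helper below is only a sketch: the function name image_volumes is hypothetical, and it assumes the image is already present locally. Paths mounted at run time with -v are not part of the image metadata and will not show up here.

```shell
# Print the volume mount points declared in an image's metadata.
# Files written under these paths in a container are not captured by docker commit.
image_volumes() {
  docker image inspect --format '{{json .Config.Volumes}}' "$1"
}

# Example call:
# image_volumes jupyter/datascience-notebook
```

If the directory you upload into is listed, or is mounted with -v when the container is started, put the notebook somewhere else before committing.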
For those of you looking for some place to start with docker build, below are the lines in the Dockerfile that I triggered with docker build -t your-image-name:your-new-t
Dockerfile
FROM jupyter/datascience-notebook:latest
MAINTAINER name <email>
# ====== PRE SUDO ======
ENV JUPYTER_ENABLE_LAB=yes
# If you run pip as sudo it continually prints errors.
# Tidyverse is already installed, and installing gorpyter installs the correct versions of other Python dependencies.
RUN pip install gorpyter
# commenting out our public repo
ENV R_HOME=/opt/conda/lib/R
# https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch04s09.html
# Looks like /usr/local/man is symlinking all R/W toward /usr/local/share/man instead
COPY python_sdk.ipynb /usr/local/share/man
COPY r_sdk.ipynb /usr/local/share/man
ENV NOTEBOOK_DIR=/usr/local/share/man
WORKDIR /usr/local/share/man
# ====== SUDO ======
USER root
# Spark requires Java 8.
RUN sudo apt-get update && sudo apt-get install openjdk-8-jdk -y
# JAVA_HOME should point to the Java installation directory, not to the java binary.
ENV JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre
# If you COPY files into the same VOLUME that you mount in docker-compose.yml, then those files will disappear at runtime.
# `user_notebooks/` is the folder that gets mapped as a VOLUME to the user's local folder during runtime.
RUN mkdir /usr/local/share/man/user_notebooks

Troubleshoot directory path error in COPY command in docker file

I am using the COPY command in my Dockerfile on top of ubuntu 16.04. I am getting a "no such file or directory" error even though the directory is present. In the Dockerfile below I want to copy the directory "auth", which is inside the workspace directory, into the Docker image (at path /home/ubuntu) and then build the image.
FROM ubuntu:16.04
RUN apt-get update
COPY /home/ubuntu/authentication/workspace /home/ubuntu
WORKDIR /home/ubuntu/auth
A Dockerfile COPY instruction can only refer to files inside the build context, which is the directory passed to docker build (usually the directory containing the Dockerfile, i.e. .). Absolute host paths such as /home/ubuntu/authentication/workspace are resolved relative to the context, so they are not found.
So you have a few options:
1. If possible, copy the contents of /home/ubuntu/authentication/workspace/ to somewhere inside your project before the build. They are then part of the build context and you can use COPY ./path/to/content /home/ubuntu. Sometimes you don't want that, though.
2. Instead of copying the directory, bind it into your container as a volume. When you run the container, add a -v option:
docker run [....] -v /home/ubuntu/authentication/workspace:/home/ubuntu [...]
Keep in mind that with a bind mount, any change made inside the container directory (/home/ubuntu) affects the bound directory on the host (/home/ubuntu/authentication/workspace), and vice versa.
3. Use the directory itself as the build context. I found an example where someone does exactly that: from inside the /home/ubuntu/authentication/workspace/ directory, run
docker build . -f /path/to/Dockerfile
Now the Dockerfile can refer to /home/ubuntu/authentication/workspace as its context (.).
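A small sketch of building with a context that differs from the Dockerfile's location. The relative paths, the dockerfiles directory, and the tag auth-image are stand-ins chosen for illustration, not the paths from the question:

```shell
# Hypothetical layout: the files to copy live in one directory,
# the Dockerfile lives in another
workspace=./authentication/workspace
mkdir -p "$workspace/auth" dockerfiles

cat > dockerfiles/Dockerfile <<'EOF'
FROM ubuntu:16.04
# "." is the build context, i.e. the directory passed to docker build
COPY . /home/ubuntu
WORKDIR /home/ubuntu/auth
EOF

# -f selects the Dockerfile; the last argument is the build context
if command -v docker >/dev/null 2>&1; then
  docker build -f dockerfiles/Dockerfile -t auth-image "$workspace" \
    || echo "build failed (is the Docker daemon running?)"
fi
```

Because the workspace is the context here, its auth subdirectory ends up at /home/ubuntu/auth in the image.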

Sending docker build contexts

My directory structure is as follows
cassandra/
    Dockerfile
    downloads/
        225M file
I am inside the cassandra directory. My build command is
docker build -t image_cassandra .
I know that this sends the entire contents of the current directory (.) as the build context, so it takes a long time to send the 225M file. I need this file in my Dockerfile:
Add downloads/ /tmp/
I want to avoid this delay. I know that we cannot use ../ in a Dockerfile ADD instruction. Is there any way to reduce the size of the build context and still keep this ADD instruction?
This file is not available on the web, so I cannot fetch it with apt-get or wget. Or is this not possible?
You could separate the project into two docker images, a big one that changes infrequently, and a small one that you can change fast.
Project1/
    Dockerfile
    bigfile
Project2/
    Dockerfile
Project1/Dockerfile would look like this:
FROM ubuntu
RUN apt-get update && apt-get install -y cassandra
ADD bigfile /tmp/
Then if you build it and tag it with docker build -t project1 Project1, you can use the result in Project2/Dockerfile:
FROM project1
RUN fast configuration commands

How do I dockerize an existing application...the basics

I am using Windows and have boot2docker installed. I've downloaded images from Docker Hub and run basic commands, but how do I take an existing application sitting on my local machine (let's just say it has one file, index.php, for simplicity), put it into a Docker image, and run it?
Imagine you have the following existing python2 application "hello.py" with the following content:
print "hello"
You have to do the following things to dockerize this application:
Create a folder where you'd like to store your Dockerfile in.
Create a file named "Dockerfile"
The Dockerfile consists of several parts which you have to define as described below:
Like a VM, an image has an operating system. In this example, I use ubuntu 16.04. Thus, the first part of the Dockerfile is:
FROM ubuntu:16.04
Imagine you have a fresh Ubuntu - VM, now you have to install some things to get your application working, right? This is done by the next part of the Dockerfile:
RUN apt-get update && \
apt-get upgrade -y && \
apt-get install -y python
Next, create a working directory in the image. The commands that you run later to start your application will look for files (in our case the Python file) in this directory. Thus, the next part of the Dockerfile creates a directory and sets it as the working directory:
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
As a next step, copy the content of the folder in which the Dockerfile is stored into the image. In our example, the hello.py file is copied to the directory we created in the step above.
COPY . /usr/src/app
Finally, the following line sets the command "python hello.py" to run when a container is started from your image:
CMD [ "python", "hello.py" ]
The complete Dockerfile looks like this:
FROM ubuntu:16.04
RUN apt-get update && \
apt-get upgrade -y && \
apt-get install -y python
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
COPY . /usr/src/app
CMD [ "python", "hello.py" ]
Save the file and build the image by typing in the terminal:
$ docker build -t hello .
This will take some time. Afterwards, check that the image "hello" (the name we gave it with -t in the build command) has been built successfully:
$ docker images
Run the image:
$ docker run hello
The output should be "hello" in the terminal.
This is a first start. When you use Docker for web applications, you have to configure ports etc.
Your index.php is not really an application. The application is your Apache or nginx or even PHP's own server.
Because Docker uses features not available in the Windows core, you are running it inside an actual virtual machine. The only purpose for that would be training or preparing images for your real server environment.
There are two main concepts you need to understand for Docker: Images and Containers.
An image is a template composed of layers. Each layer contains only the differences from the previous layer, plus some offline system information. Each layer is in fact an image. You should always build your image from an existing base, using the FROM directive in the Dockerfile (Reference docs at time of edit. Jan Vladimir Mostert's link is now a 404).
A container is an instance of an image that has run or is currently running. When creating a container (a.k.a. running an image), you can map one of its internal directories to a directory outside. If there are files in both locations, the external directory overrides the one inside the image, but the files in the image are not lost. To recover them you can commit a container to an image (preferably after stopping it), then launch a new container from the new image, without mapping that directory.
You'll need to build a Docker image first, using a Dockerfile. You'd probably set up Apache in it, tell the Dockerfile to copy your index.php file into Apache's document root, and expose a port.
See http://docs.docker.com/reference/builder/
See my other question for an example of a docker file:
Switching users inside Docker image to a non-root user (this is for copying over a .war file into tomcat, similar to copying a .php file into apache)
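For the index.php case, a minimal sketch could look like the following. It assumes the official php:apache image from Docker Hub, which ships Apache with PHP and serves files from /var/www/html; the directory name php-demo, the image tag, and the sample page are made up for illustration.

```shell
# Hypothetical project folder with a one-file PHP "application"
mkdir -p php-demo
cat > php-demo/index.php <<'EOF'
<?php echo "hello"; ?>
EOF

cat > php-demo/Dockerfile <<'EOF'
FROM php:apache
# Copy the page into Apache's document root
COPY index.php /var/www/html/
# Apache listens on port 80 inside the container
EXPOSE 80
EOF

# Build only when the docker CLI is present
if command -v docker >/dev/null 2>&1; then
  docker build -t php-demo php-demo \
    || echo "build failed (is the Docker daemon running?)"
fi

# To serve the page afterwards, publish the port when starting a container:
#   docker run --rm -p 8080:80 php-demo
```

With the port published as in the final comment, the page should be reachable on port 8080 of the Docker host, which with boot2docker is the VM's address rather than localhost.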
First off, you need to choose a platform to run your application on (for instance, Ubuntu). Then install all the system tools and libraries necessary to run your application; this can be done in a Dockerfile. Then push the Dockerfile and app to a Git host such as GitHub or Bitbucket. Later, you can set up automated builds in Docker Hub from GitHub or Bitbucket. The latter part of this tutorial here has more on that. If you know the basics, just fast forward to 50:00.
