How can I keep ALL changes made to a docker container?

I'm using docker as a "light" virtual machine. For example, when I need to do some experiments on Ubuntu and don't want to mess up the host OS, I simply run docker run -it ubuntu bash.
Generally I'm very happy with it, except that I cannot keep the changes after I exit, which means I need to rerun
apt update && apt install vim git python python3 <other_tools> && pip install flask coverage <other_libraries> && .....
every single time I start the docker container as a VM, which is very inefficient.
I've noticed this question, but it only lets me keep specific files from being erased, whereas I want the whole system (including, but not limited to, all configuration, caches and installed tools) to be retained across the container's life cycles.

To keep the changes, you need to use something like
docker commit mycontainer_id myuser/myimage:12
(see the docs: docker commit)
and then start new containers from your saved image myuser/myimage:12.
But you should definitely use a Dockerfile for this instead.
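For example, a minimal sketch of a Dockerfile that bakes your setup into a reusable image (the ubuntu:22.04 pin, the myuser/myimage:12 tag and the shortened package list are just examples based on your one-liner):
FROM ubuntu:22.04
# Install the tools once, at build time, so every container starts with them
RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y vim git python3 python3-pip && \
    pip3 install flask coverage
Build it once with docker build -t myuser/myimage:12 . and from then on start your "VM" with docker run -it myuser/myimage:12 bash.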


Docker - Extending a container with another image?

At my company, we have hardened containers created by the security team, and I would like to extend the hardened container with another docker image. For example, if we have a hardened Debian container, and I want to add Apache, how do I do this?
I understand I can use FROM to use a base, but the examples I've seen don't add another published image on top of an existing base, only specific commands. Do I just go to the official Docker Hub Apache (httpd) image and copy and paste the commands from its GitHub repo? I'm assuming there's a cleaner way (but I'm not sure there is).
For example, do I
FROM mycompanyprivaterepo/Debian:latest
# some command?
FROM httpd
docker build -t mynewimagewithapache .
UPDATE:
After attempting to install Apache via apt-get as suggested in some comments, the build kept hanging on interactive questions. Solved with the help of the comments, using:
My Dockerfile:
FROM myprivaterepo/hardened-ubuntu
RUN apt-get update && \
DEBIAN_FRONTEND=noninteractive apt-get -qq install apache2
and building via:
$ docker build -t hardened-ubuntu-apache .
Well, as far as I understand, you cannot just use a multi-stage build and
COPY --from=base-image /path/to/file/you-are-interested-in /path/inside/new-stage-image
to copy the required data into your preferred image. If that is the case, then you have to create your own Dockerfile with your company's mycompanyprivaterepo/Debian:latest as the base image, and then add a few layers on top of it, using RUN to install the required software.
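A minimal sketch of that approach, reusing the image name from your question (the apache2ctl foreground command is the usual Debian way to run Apache, but check what your hardened image actually ships):
FROM mycompanyprivaterepo/Debian:latest
# Add Apache as new layers on top of the hardened base
RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends apache2 && \
    rm -rf /var/lib/apt/lists/*
EXPOSE 80
CMD ["apache2ctl", "-D", "FOREGROUND"]
Then build it with docker build -t mynewimagewithapache . (note the trailing dot, which is the build context).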

Set ldconfig LD_LIBRARY_PATH in a docker container

I have a docker container which I use to build software and generate shared libraries in. I would like to use those libraries in another docker container for actually running applications. To do this, I am running the build container with a volume mounted so that the libraries end up on the host machine.
My Dockerfile for the runtime container looks like this:
FROM openjdk:8
RUN apt update
ENV LD_LIBRARY_PATH /build/dist/lib
RUN ldconfig
WORKDIR /build
and when I run with the following:
docker run -u $(id -u ${USER}):$(id -g ${USER}) -it -v $(realpath .):/build runtime_docker bash
I do not see any of the libraries from /build/dist/lib in the ldconfig -p cache.
What am I doing wrong?
You need to COPY the libraries into the image before you RUN ldconfig; volumes won't help you here.
Remember that first you run a docker build command. That runs all of the commands in the Dockerfile, without any volumes mounted. Then you take that image and docker run a container from it. Volume mounts only happen when the docker run happens, but the RUN ldconfig has already happened.
In your Dockerfile, you should COPY the files into the image. There's no particular reason to not use the "normal" system directories, since the image has an isolated filesystem.
FROM openjdk:8
# Copy shared-library dependencies in
COPY dist/lib/libsomething.so.1 /usr/lib
RUN ldconfig
# Copy the actual binary to run in and set it as the default container command
COPY dist/bin/something /usr/bin
CMD ["something"]
If your shared libraries are only available at container run time, the conventional solution (as far as I can tell) is to include the ldconfig command in a startup script and use the Dockerfile ENTRYPOINT directive so that your runtime container executes this script every time it starts.
This should achieve the behaviour you want, and (I think) avoids having to generate a new container image every time you rebuild your code. It is slightly different from the common Docker use case of building a new image for every change by running docker build at build time, but it is a perfectly valid use case and quite compatible with how Docker works. Docker has historically been used as a CI/CD tool to streamline post-build workflows, but it is increasingly being used for other things, such as the build step itself, so people naturally come up with slightly different ways of using it to support new kinds of workflow.
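A minimal sketch of that approach, reusing the paths from the question (the script name docker-entrypoint.sh is just an example, and note that ldconfig needs root, so this assumes the container is not started with -u):
docker-entrypoint.sh:
#!/bin/sh
# Register the run-time library directory, refresh the linker cache,
# then run whatever command was passed to `docker run`
echo /build/dist/lib > /etc/ld.so.conf.d/build-libs.conf
ldconfig
exec "$@"
Dockerfile:
FROM openjdk:8
COPY docker-entrypoint.sh /usr/local/bin/
RUN chmod +x /usr/local/bin/docker-entrypoint.sh
WORKDIR /build
ENTRYPOINT ["/usr/local/bin/docker-entrypoint.sh"]
CMD ["bash"]
With this, ldconfig runs after the volume has been mounted, so the libraries under /build/dist/lib show up in ldconfig -p inside the running container.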

Is it possible to remove unwanted packages from docker image?

I'm trying to reduce the size of my Docker image, which is based on CentOS 7.2.
The issue is that it's 257 MB, which is too large...
I have followed the best practices for writing Dockerfiles in order to reduce the size...
Is there a way to modify the image after the build and rebuild that image to see the size reduced ?
First of all, if you want a small OS, don't start with a big one like CentOS; you can start from Alpine, which is small.
Now if you are still keen on using CentOS, do the following:
docker run -d --name centos_minimal centos:7.2.1511 tail -f /dev/null
This will start a command in the background. You can then get into the container using
docker exec -it centos_minimal bash
Now start removing packages that you don't need using yum remove. Once you are done, you can commit the container to a new image:
docker commit centos_minimal centos_minimal:7.2.1511_trial1
Experimental Squash Image
Another option is to use an experimental feature of the build command. With this you can have a Dockerfile like the one below:
FROM centos:7
RUN yum -y remove package1 package2 package3
Then build this file using
docker build --squash -t centos_minimal:squash .
For this you need to add "experimental": true to your /etc/docker/daemon.json and then restart the Docker daemon.
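For reference, the daemon.json change is just the following (merge the key into the file if it already exists, then restart Docker, e.g. with sudo systemctl restart docker on systemd hosts):
{
  "experimental": true
}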
It is possible, but not at all elegant. Just like you can add software to the base image, you could also remove:
FROM centos:7
RUN yum -y update && yum clean all
RUN yum -y install new_software
RUN yum -y remove obsolete_software
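Keep in mind, though, that each RUN creates a new layer, so removing software in a later layer does not actually shrink the final image; if the goal is size, the install and the clean-up usually go into a single RUN (package names are placeholders, as above):
FROM centos:7
# One layer: install what is needed, remove what is not, and clean the caches
RUN yum -y update && \
    yum -y install new_software && \
    yum -y remove obsolete_software && \
    yum clean all && \
    rm -rf /var/cache/yum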
Ask yourself: does your OS have to be CentOS? If it does, I would recommend you use the default installation and make sure you have enough disk space and memory.
If it does not need to be CentOS, you should rather start with a more minimalistic image. See the discussion here:
Which Docker base image should be used to install Apps in a container without any additional OS?

Compatibility of Dockerfile RUN Commands Cross-OS (apt-get)

A beginner's question; how does Docker handle underlying operating system variations when using the RUN command?
Let's take, for example, a very simple Official Docker Hub Dockerfile, for JRE 1.8. When it comes to installing the packages for java, the Dockerfile uses apt-get:
RUN apt-get update && apt-get install -y --no-install-recommends ...
To the untrained eye, this appears to be a platform-specific instruction that will only work on Debian-based operating systems (or at least ones with APT installed).
How exactly would this work on a CentOS installation, for example, where the package manager would be yum? Or god forbid, something like Solaris.
If this pattern of using RUN to fork arbitrary shell commands is prevalent in docker, how does one avoid inter-platform, or even inter-version, dependencies?
i.e. what if the Dockerfile writer has a newer version of (say) grep than I do, and they've used some new CLI flag that isn't available on earlier versions?
The only two possible outcomes are: (1) the RUN command exits with a non-zero exit code, or (2) the Dockerfile changes the installed version of grep before running the command.
The common point shared by all Dockerfiles is the FROM statement. It is the first line in the file and indicates the parent Docker image you're building on. A typical base image could be one with Ubuntu (i.e.: https://hub.docker.com/_/ubuntu/). The snippet you share in your question would fit well in an Ubuntu image (with apt-get) but not in a CentOS image.
In summary, you're installing docker in your CentOS system, but you're building a Docker image with Ubuntu in it.
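To make that concrete, a minimal illustration: the apt-get RUN below works regardless of the host OS, simply because the parent image named in FROM is Debian-based and therefore ships APT (curl here is just an arbitrary example package):
FROM ubuntu:22.04
# apt-get exists inside this image because the parent image is Debian-based
RUN apt-get update && apt-get install -y --no-install-recommends curl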
As I commented on your question, you can add a FROM statement to specify which underlying OS you want. For example:
FROM docker.io/centos:latest
RUN yum update -y
RUN yum install -y java
...
Now you build the image with:
docker build -t <image-name> .
The idea is that you'll use the OS you are familiar with (for example, CentOS) and build an image of it. You can then take this image and run it on top of an Ubuntu/CentOS/RHEL/whatever host with
docker run -it <image-name> bash
(You just need to install Docker on the host OS.)

Get rid of VMware and move to Docker: how to properly set up the Dockerfile or the container?

I am a PHP developer, so most of the time, to test any application I am working on, what I do is:
Create a VMware VM and install a complete OS; most of the time I like to use CentOS
Setup everything on the VM meaning: Apache and modules, PHP and modules and MySQL or MariaDB
Anytime I start a new VM from scratch there are a few steps I run:
# Install EPEL and Remi Repos
wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-6.noarch.rpm
wget http://rpms.remirepo.net/enterprise/remi-release-6.rpm
rpm -Uvh remi-release-6.rpm epel-release-latest-6.noarch.rpm
# Install Apache, PHP and its dependencies
yum -y install php php-common php-cli php-fpm php-gd php-intl php-mbstring php-mcrypt php-opcache php-pdo php-pear php-pecl-apcu php-imagick php-pecl-xdebug php-pgsql php-xml php-mysqlnd php-pecl-zip php-process php-soap
# Start Apache on 235 run level
chkconfig --levels 235 httpd on
# Setup MariaDB repos
nano /etc/yum.repos.d/MariaDB.repo
# Write this inside the MariaDB.repo file
[mariadb]
name = MariaDB
baseurl = http://yum.mariadb.org/5.5/centos6-amd64
gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=1
# Install MariaDB
yum -y install MariaDB MariaDB-server
# Start service
service mysql start
# Start MariaDB on run level 235
chkconfig --levels 235 mysql on
# Setup MariaDB (this is interactive)
/usr/bin/mysql_secure_installation
# A few more steps
This is an annoying task that I need to do all the time (whenever I mess up the VM trying new things and changing things here and there). So this is where Docker, I think, comes to the rescue. After reading a bit, I know the basics of Docker and I have pulled a CentOS image by running docker run -it centos, but that just gives me a bash shell in a basic CentOS image, so it's still my task to install and set up everything.
Here are my doubts about Docker and how to handle this repetitive and common tasks:
Should I create a Dockerfile (this is my first Dockerfile, so perhaps the order is not right or I am completely mistaken) with the content below, and put all the repetitive tasks inside the run-setup.sh file?
FROM centos:latest
MAINTAINER MyName <MyEmail>
RUN yum -y update && yum clean all
ADD run-setup.sh /run-setup.sh
RUN chmod -v +x /run-setup.sh
CMD ["/run-setup.sh"]
EXPOSE 80
Should I run the repetitive tasks by hand as I did before on the VM?
The command /usr/bin/mysql_secure_installation is completely interactive, since I need to answer a few questions and set a password; how do I deal with this one, or any other interactive command?
Any better idea?
I will start answering your questions:
Yes, you could start with a Dockerfile. However, I recommend putting the commands straight into the file so that it's easier to maintain in the future. An example could be the Dockerfile of the Apache image on GitHub.
Repetitive tasks, no. You can save the resulting images by pushing them to a public registry like Docker Hub, or you can host a private registry, which can itself be a Docker container.
Interactivity has to be worked around with command-line options, bash read, passing a file where possible, etc.; I do not think there is a single straight answer to this (a sketch of one approach to mysql_secure_installation follows this answer).
Better ideas: the usual pattern is to host the Dockerfile in a public GitHub or Bitbucket repository and then configure automated builds against Docker Hub. They all come for free :)
There are also many live working examples you can find on Docker Hub. Search for an image, pick the most popular/official one, and you'll find links to its Dockerfile.
Let me know how it goes.
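For the mysql_secure_installation step specifically, one common workaround (sketched here for MariaDB 5.5; the password is a placeholder you would normally inject via a build argument or environment variable) is to script the same changes non-interactively inside run-setup.sh:
# Non-interactive replacement for mysql_secure_installation
service mysql start
mysql -u root <<'SQL'
-- set a root password, drop anonymous users and the test database
UPDATE mysql.user SET Password = PASSWORD('change_me') WHERE User = 'root';
DELETE FROM mysql.user WHERE User = '';
DROP DATABASE IF EXISTS test;
FLUSH PRIVILEGES;
SQL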
