Create new docker image vs run shell commands - docker

We are working with the fabric-ca Docker image. It does not come with scp installed, so we have two options:
Option 1: create a new image as described here
Option 2: install scp from the shell when the container is started
We'd like to understand the pros and cons of each.

Option 1: lets you build on it further, creates a stable state, and lets you verify and test an image before releasing it.
Option 2: takes longer to start up, requires being online during container start, and makes the software stack harder to trace, understand, and manage, because it is locked in e.g. bash scripts that start containers rather than in a Dockerfile and whatever technology you end up using for container orchestration.
Ultimately, I use option 2 only for discovery, proof of concept, or trying something out. Once I know I need a certain container on an ongoing basis, I build a proper image via a Dockerfile.
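As a minimal sketch of Option 1, the shell snippet below writes a Dockerfile that extends the fabric-ca image with an scp client. The image reference hyperledger/fabric-ca, the use of apt-get, and the openssh-client package name are assumptions and do not come from the question; if the image you use is Alpine-based, the RUN line would need apk instead.

```shell
# Write a Dockerfile that adds scp on top of the fabric-ca image.
# Assumed: image name "hyperledger/fabric-ca" and a Debian/Ubuntu base
# (apt-get, openssh-client). Adjust both for the image you actually use.
cat > Dockerfile <<'EOF'
FROM hyperledger/fabric-ca
RUN apt-get update \
 && apt-get install -y --no-install-recommends openssh-client \
 && rm -rf /var/lib/apt/lists/*
EOF
```

Running docker build -t fabric-ca-scp . in the same directory would then produce an image (fabric-ca-scp is a made-up tag) that can be tested once and started later without network access.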

You should consider your option 2 a non-starter. Either build a custom image or use a host directory bind-mount (docker run -v /host/path:/container/path option) to inject the data you need; I would probably prefer the bind-mount option.
It’s extremely routine to docker rm a container, and when you do, any changes you’ve made locally in a container are lost. For example, if there is a new software release or a critical security update, you have to recreate the container with a new image. You should pretty much never install software in an interactive shell in a container, especially if you’re going to use it to copy in data your application needs: you’ll have to repeat this step every single time you delete and recreate the container.

Option 1:
The BUILD of the image takes longer, but you execute it only once
The RUN is faster
You don't need an internet connection at RUN
You can include verification of the different steps
It allows traceability
Option 2:
The RUN takes longer
You need an internet connection at RUN
It is harder to trace

Related

Intro to Docker for FreeBSD Jail User - How and should I start the container with systemd?

We're currently migrating room server to the cloud for reliability, but our provider doesn't have a FreeBSD option. Although I'm prepared to pay for and upload a custom system image for deployment, I nonetheless want to learn how to start an application system instance using Docker.
In a FreeBSD jail, what I did was extract an entire base.txz directory hierarchy as system content into /usr/jail/app and run pkg -r /usr/jail/app install apache24 php perl; then I configured /etc/jail.conf to start the /etc/rc script in the jail.
I followed the official FreeBSD Handbook, and this is generally what I've worked out so far.
But Docker is another world entirely.
To build a Docker image, there are two options: a) import from a tarball, b) use a Dockerfile. The latter of which lets you specify a "CMD", which is the default command to run, but
Q1. Why isn't it available from a)?
Q2. Where is information like CMD and ENV stored? In the image? In the container?
Q3. How do I start a GNU/Linux system in a container? Do I just run systemd and let it figure out the rest from configuration? Do I need to pass it some special arguments or environment variables?
You should think of a Docker container as a packaging around a single running daemon. The ideal Docker container runs one process and one process only. Systemd in particular is so heavyweight and invasive that it's actively difficult to run inside a Docker container; if you need multiple processes in a container then a lighter-weight init system like supervisord can work for you, but that's usually an exception more than a standard packaging.
Docker has an official tutorial on building and running custom images which is worth a read through; this is a pretty typical use case for Docker. In particular, best practice is to write a Dockerfile that describes how to build an image and check it into source control. Containers should avoid having persistent data if they can (storing everything in an external database is ideal); if you change an image, you need to delete and recreate any containers based on it. If local data is unavoidable then either Docker volumes or bind mounts will let you keep data "outside" the container.
While Docker has several other ways to create containers and images, none of them are as reproducible. You should avoid the import, export, and commit commands; and you should only use save and load if you can't use or set up a Docker registry and are forced to move images between systems via a tar file.
On your specific questions:
Q1. I suspect the best reason the non-docker build paths to create images don't easily let you specify things like CMD is just an implementation detail: if you look at the docker history of an image you'll see the CMD winds up being its own layer. Don't worry about it and use a Dockerfile.
Q2. The default CMD, any set ENV variables, and other related metadata are stored in the image alongside the filesystem tree. (Once you launch a container, it has a normal Unix process tree, with the initial process being pid 1.)
Q3. You don't "start a system in a container". Generally run one process or service in a container, and manage their lifecycles independently.
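To make Q1 and Q2 a little more concrete, here is a small sketch of how the metadata stored in an image could be inspected, and how a default command might be attached when importing a tarball. The two function names are hypothetical helpers; docker image inspect --format and the --change option of docker import are existing CLI features, but check docker import --help on your version before relying on them. The Dockerfile route remains the reproducible one.

```shell
# show_image_defaults IMAGE
# Print the default command and environment stored in the image metadata.
show_image_defaults() {
    docker image inspect \
        --format 'CMD={{json .Config.Cmd}} ENV={{json .Config.Env}}' "$1"
}

# import_with_cmd TARBALL IMAGE
# Import a root filesystem tarball and record /bin/sh as the default command.
# (/bin/sh is only an illustrative choice.)
import_with_cmd() {
    docker import --change 'CMD ["/bin/sh"]' "$1" "$2"
}
```

For example, show_image_defaults myimage (myimage being a placeholder) reads the values from the image itself, without any container having been created from it.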

Best Practices for Cron on Docker

I've been using Docker with cron for some time, but I'm not sure my setup is optimal. I have one cron container that runs about 12 different scripts. I can edit the schedule of the scripts, but in order to deploy a new version of the software (some of the scripts run for about half a day) I have to create a new container to run some of the scripts while others finish.
I'm considering running one container per script (the containers would share everything in the image but the crontab), but this would still make it hard to coordinate updates to multiple containers sharing some of the same code.
The other alternative I'm considering is running cron on the host machine and each command would be a docker run command. Doing this would let me update the next run image by using an environment variable in the crontab.
Does anybody have any experience with either of these two solutions? Are there any other solutions that could help?
If you are just running Docker standalone (single host) and need to run a bunch of cron jobs without thinking too much about their impact on the host, then keeping it simple and running them on the host works just fine.
It would make sense to run them in Docker if you benefit from Docker features like limiting memory and CPU usage (so they don't do anything disruptive). If you also use a log driver that writes container logs to some external logging service so you can easily monitor the jobs, then that's another good reason to do it. The last (but obvious) advantage is that deploying new software using a Docker image instead of messing around on the host is often a winner.
It's a lot cleaner to make one single image containing all the code you need. Then you trigger docker run commands from the host's cron daemon and override the command/entrypoint. The container will then die and delete itself after the job is done (you might need to capture the container output to logs on the host depending on what logging driver is configured). Try not to send in config values or parameters you change often so you keep your cron setup as static as possible. It can get messy if a new image also means you have to edit your cron data on the host.
When you use docker run like this you don't have to worry when updating images while jobs are running. Just make sure you tag them with for example latest so that the next job will use the new image.
Having 12 containers running in the background, each with its own cron daemon, also wastes some memory, but the worst part is that cron doesn't use the environment variables from the parent process, so if you are injecting config with env vars you'll have to hack around that mess (write them to disk when the container starts and such).
If you worry about jobs running in parallel, there are tons of task scheduling services out there you can use, but that might be overkill for a single standalone Docker host.
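The host-cron approach described above could be sketched as a small wrapper function. The image name example/jobs:latest, the function name run_job, and the resource limits are illustrative assumptions, not part of the original setup.

```shell
# Image used for all jobs; "example/jobs:latest" is a placeholder name.
JOB_IMAGE="${JOB_IMAGE:-example/jobs:latest}"

# run_job COMMAND [ARGS...]
# Run one job in a throwaway container: --rm deletes the container when the
# job exits, and the resource limits (illustrative values) keep a runaway
# job from disturbing the host. COMMAND overrides the image's default command.
run_job() {
    docker run --rm --memory 512m --cpus 1 "$JOB_IMAGE" "$@"
}
```

The host crontab would then call a small script that defines this function and invokes, say, run_job /app/nightly.sh (a made-up path). Because every run creates a fresh container, a job started after the tag has been updated uses the new image while longer-running jobs finish on the old one.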

Services in CentOS 7 Docker image without systemd

I'm trying to create a Docker container based on CentOS 7 that will host R, shiny-server, and rstudio-server, but I need systemd in order for the services to start. I can use the systemd-enabled centos image as a basis, but then I need to run the container in privileged mode and allow access to /sys/fs/cgroup on the host. I might be able to tolerate the less secure situation, but then I'm not able to share the container with users running Docker on Windows or Mac.
I found this question but it is 2 years old and doesn't seem to have any resolution.
Any tips or alternatives are appreciated.
UPDATE: SUCCESS!
Here's what I found: For shiny-server, I only needed to execute shiny-server with the appropriate parameters from the command line. I captured the appropriate call into a script file and call that using the final CMD line in my Dockerfile.
rstudio-server was more tricky. First, I needed to install initscripts to get the dependencies in place so that some of the rstudio scripts would work. After this, executing rstudio-server start would essentially do nothing and provide no error. I traced the call through the various links and found myself in /usr/lib/rstudio-server/bin/rstudio-server. The daemonCmd() function tests cat /proc/1/comm to determine how to start the server. For some reason it was failing, but looking at the script, it seems clear that it needs to execute /etc/init.d/rstudio-server start. If I do that manually or in a Docker CMD line, it seems to work.
I've taken those two CMD line requirements and put them into an sh script that gets called from a CMD line in the Dockerfile.
A bit of a hack, but not bad. I'm happy to hear any other suggestions.
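A wrapper script along the lines described above might look like the following sketch. It assumes that the init script returns after launching the rstudio-server daemon and that shiny-server stays in the foreground; the variable and function names are made up for illustration, and the shiny-server parameters are not given in the post, so none are filled in here.

```shell
# Both commands can be overridden; the defaults follow the description above.
RSTUDIO_INIT="${RSTUDIO_INIT:-/etc/init.d/rstudio-server}"
SHINY_CMD="${SHINY_CMD:-shiny-server}"

# start_services
# Start rstudio-server via its init script, then replace this shell with
# shiny-server so that it becomes the long-running process of the container.
start_services() {
    "$RSTUDIO_INIT" start || return 1
    # Intentionally unquoted so SHINY_CMD may carry extra parameters.
    exec $SHINY_CMD
}
```

The script that the Dockerfile's CMD line points at would end with a call to start_services. Using exec means stop signals sent to the container reach shiny-server directly rather than the wrapper shell.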
You don't necessarily need to use an init system like systemd.
Essentially, you need to start multiple services, and there are existing patterns for this. Check out this page about how to use supervisord to achieve the same thing: https://docs.docker.com/engine/admin/using_supervisord/

Can you share Docker containers?

I have been trying to figure out why one might choose adding every "step" of their setup to a Dockerfile which will create your container in a certain state.
The alternative in my mind is to just create a container from a simple base image like ubuntu and then (via shell input) configure your container the way you'd like.
But can you share containers? If you can only share images with Docker then I'd understand why one would want every step of their container setup listed in a Dockerfile.
The reason I ask is because I imagine there is some amount of headache involved with porting shell commands, file changes for configs, etc. to correct Dockerfile syntax and have them work correctly? But as a novice with Docker I could be overestimating the difficulty of that task.
EDIT: I suppose another valid reason for having the Dockerfile with each setup step is for documentation as to the initial state of the container. As opposed to being given a container in a certain state, but not necessarily having a way to know what all was done from the container's image base state.
> But can you share containers? If you can only share images with Docker then I'd understand why one would want every step of their container setup listed in a Dockerfile.
Strictly speaking, no. However, you can create a new image from an existing container using the docker commit command:
$ docker commit <container-name> <image-name>
This command will create a new image from the existing container that you can push and pull from/to registries, export and import and create new containers from.
> The reason I ask is because I imagine there is some amount of headache involved with porting shell commands, file changes for configs, etc. to correct Dockerfile syntax and have them work correctly? But as a novice with Docker I could be overestimating the difficulty of that task.
If you're already using some other mechanism for automated configuration, you can simply integrate your existing automation into the Docker build. For instance, if you are already configuring your systems using shell scripts, simply add a build step to your Dockerfile that copies your install scripts into the image and executes them. In theory, this can also work with configuration management utilities like Puppet, Salt and others.
> EDIT: I suppose another valid reason for having the Dockerfile with each setup step is for documentation as to the initial state of the container. As opposed to being given a container in a certain state, but not necessarily having a way to know what all was done from the container's image base state.
True. As mentioned in the comments, there are clear advantages to having an automated and reproducible build of your image. If you build your containers manually and then create an image with docker commit, you don't necessarily know how to rebuild this image at a later point in time (which may become necessary when you want to release a new version of your application or rebuild the image on top of an updated base image).
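As a rough sketch of reusing existing automation in a build, the snippet below writes a Dockerfile that copies a provisioning script into the image and runs it. The script name install.sh and the ubuntu base image are placeholders chosen for illustration.

```shell
# Write a Dockerfile that reuses an existing provisioning script.
# "install.sh" and the ubuntu base image are placeholders.
cat > Dockerfile <<'EOF'
FROM ubuntu
COPY install.sh /tmp/install.sh
RUN sh /tmp/install.sh && rm /tmp/install.sh
EOF
```

With install.sh next to the Dockerfile, docker build can repeat the same setup whenever the base image or the application changes, which is exactly what a manually configured and committed container cannot offer.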

Is it possible/sane to develop within a container Docker

I'm new to Docker and was wondering if it was possible (and a good idea) to develop within a docker container.
I mean create a container, execute bash, install and configure everything I need, and start developing inside the container.
The container then becomes my main machine (for CLI-related work).
When I'm on the go (or when I buy a new machine), I can just push the container and pull it on my laptop.
This solves the problem of having to keep and synchronize your dotfiles.
I haven't started using Docker yet, so is this something realistic or something to avoid (disk space problems and/or pull/push timing issues)?
Yes. It is a good idea, with the correct set-up. You'll be running code as if it was a virtual machine.
The Dockerfile syntax for creating a build environment is not polished and will not expand shell variables, so pre-installing applications may be a bit tedious. On the other hand, once you have built your own image with your users and working environment, you won't need to build it again. You can also mount part of your own file system with the -v parameter of the run command, so the files you need are available on both the host and the container. It's versatile.
$ sudo docker run -t -i -v /home/user_name/Workspace/project:/home/user_name/Workspace/myproject <image-name>
I'll play the contrarian and say it's a bad idea. I've done work where I've tried to keep a container "long running" and have modified it, but then accidentally lost it or deleted it.
In my opinion containers aren't meant to be long running VMs. They are just meant to be instances of an image. Start it, stop it, kill it, start it again.
As Alex mentioned, it's certainly possible, but in my opinion goes against the "Docker" way.
I'd rather use VirtualBox and Vagrant to create VMs to develop in.
A Docker container for development can be very handy. Depending on your stack and preferred IDE, you might want to keep the editing part outside, on the host, and mount the directory with the sources from the host into the container instead, as per Alex's suggestion. If you do so, beware of potential performance issues on Mac OS X with boot2docker.
I would not expect much from a workflow that pushes images to sync between dev environments. IMHO keeping Dockerfiles together with the code and syncing via SCM is a more straightforward direction to start with. I also keep supporting Makefiles for building images and running containers in the same place.

Resources