I'm trying to create a Docker image that runs a script every minute using cron. Most of the time it finishes immediately, but sometimes it takes 10 minutes. I need to make sure multiple instances of the script aren't run at the same time. On Ubuntu I've used the run-one package, but it seems to be missing from Alpine. How can I work around this?
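For what it's worth, flock can serve the same purpose as run-one, and on Alpine it ships with BusyBox (util-linux also provides it via apk). A minimal crontab sketch, assuming the script lives at /usr/local/bin/job.sh and an arbitrary lock file path:

# -n makes flock give up immediately if another process holds the lock,
# so overlapping runs are skipped rather than queued.
* * * * * flock -n /tmp/job.lock /usr/local/bin/job.sh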
Why don't we use CMD apt update instead of RUN apt update in a Dockerfile?
We use RUN apt update to update an image, but that happens only once, at build time. Why don't we use CMD apt update instead, so that every container we create gets updated?
As it sounds like you already know, RUN is intended to "execute any commands in a new layer on top of the current image and commit the results", while the main purpose of CMD is to "provide defaults for an executing container". So RUN is a build-time instruction, while CMD is a run-time instruction.
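To make the distinction concrete, here is a minimal Dockerfile sketch (the package and command are just illustrations):

FROM ubuntu:22.04
# Build time: runs once while the image is built; the result is committed to an image layer.
RUN apt-get update && apt-get install -y curl
# Run time: the default command executed each time a container starts from this image.
CMD ["curl", "--version"]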
There are a few reasons this won't be a good idea:
Containers should be fast
Containers are usually expected to consume as few resources as possible, and to start up and shut down quickly and easily. If we update a container's packages every time we run it, the container might take many minutes (or even hours, on a bad network) before it can even start doing whatever it is actually intended for.
Unexpected behavior
Part of developing a new container image is ensuring that the packages the container needs all play well together. But if we upgrade all the packages each time the container runs, on whatever system it runs on, it is possible (if not inevitable) that a published package upgrade will eventually introduce a breaking change to the container, which is obviously not ideal.
This could be avoided by removing the default repositories and replacing them with your own, where you can vet each package upgrade, test the packages together, and then publish them; but that is probably far more effort than it is worth unless the repos would serve multiple container images.
Image Versioning
Many container images (e.g. Golang) are versioned according to the version of the software they support; but if the underlying packages on the container keep changing, how would you even begin to version the image?
This isn't necessarily a deal breaker, but it could cause confusion among the container's user base and ultimately undercut their trust in the image.
Unexpected network traffic
Even if it were well documented, most developers would not expect this type of behavior, and it would lead to development issues whenever your container requires internet access. For example, in a Kubernetes environment networking can be extremely strict, and the developer would need to manually open up a route to the internet (or to a set of custom repos).
Additionally, even where networking is not an issue, if you expect a lot of these containers to be started, the package downloads could clog the network and cause performance problems.
Wouldn't work for base images
It sounds like you are probably not developing an image intended to serve as a base image for anything else, but note that for a base image the CMD would likely be overridden by downstream images, so the update would never run at all.
I have a docker image that receives an argument and runs for roughly a minute.
From time to time I need to run it over a set of 10K-100K inputs.
I tried doing this using AWS Batch, but it ran very slowly, since at any given moment only a few containers were running.
Is there an easy alternative that allows configuring the number of containers running simultaneously, and thus controlling the overall run time?
As of December 2020, you can now run your docker containers on AWS Lambda:
https://aws.amazon.com/blogs/aws/new-for-aws-lambda-container-image-support/
Note that the maximum time a Lambda function can run is 15 minutes (up from the original 5-minute limit).
Since you indicate your process only runs for roughly 1 minute, Lambda could be an option for you.
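For a sense of what that looks like, a Lambda container image can be built from one of AWS's public base images; this is a minimal sketch (the Python version and handler name are assumptions):

FROM public.ecr.aws/lambda/python:3.12
# Copy the function code to where the base image expects it.
COPY app.py ${LAMBDA_TASK_ROOT}
# Tell the runtime which handler to invoke, in module.function form.
CMD ["app.handler"]

You would push this image to ECR, create the function from it, and then invoke it once per input (fanning out via a queue such as SQS is a common pattern for large input sets).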
In my project, I have a number of micro-services that rely upon each other. I am using Docker Compose to bring everything up in the right order.
During development, when I write new code for a container, that container needs to be restarted so the new code can be tried. So far I've simply been restarting the whole stack:
docker-compose down && docker-compose up -d
That works fine, but bringing everything down and up again takes ~20 seconds, which will be too long for a live environment. I am therefore looking into various strategies to ensure that micro-services may be rebooted individually with no interruption at all.
My first approach, which nearly works, is to scale the service that needs restarting from one instance to two. I then programmatically reset the reverse proxy (Traefik) to point at the new instance, and once that is happy, I run docker stop on the old one.
My scale command is the old variety, since I am using Compose 1.8.0. It looks like this:
docker-compose scale missive-storage-backend=2
The only problem is that if there is a new image, Docker Compose does not use it: the second instance stubbornly starts from the same image hash as the one already running. I've checked docker-compose scale --help, and there is nothing in there about forcing the use of a new image.
Now I could use an ordinary docker run, but then I'd have to replicate all the options I've set up for this service in my docker-compose.yml, and I don't know if something run outside of the Compose file would be understood as being part of that Compose application (e.g. would it be stopped with a docker-compose down despite having been started manually?).
It's possible also that later versions of Docker Compose may have more options in the scale function (it has been merged with up anyway).
What is the simplest way to get this feature?
(Aside: I appreciate there are a myriad of orchestration tools to do gentle reboots and other wizardry, and I will surely explore that bottomless pit when I have the time available. For now, I feel that writing a few scripts to do some deployment tasks is the quicker win.)
I've fixed this. First I tried upgrading to Compose 1.9, but that didn't seem to offer the changes I needed. I then bumped to 1.13, where scale is deprecated as a separate command and appears instead as a switch to the up command.
As a test, I have an image called missive-storage, and I add a dummy change to its Dockerfile and rebuild, so docker ps reports the image of the running container as the bare hash d4ebdee0f3e2 (since the missive-storage:latest tag now points to a different image).
The ps line looks like this:
CONTAINER ID   IMAGE          COMMAND                  CREATED         STATUS         PORTS   NAMES
45b8023f6ef1   d4ebdee0f3e2   "/usr/local/bin/du..."   4 minutes ago   Up 4 minutes           app_missive-storage-backend_1
I then issue this command (missive-storage-backend is the DC service name for image missive-storage):
docker-compose up -d --no-recreate --scale missive-storage-backend=2
which results in these containers:
CONTAINER ID   IMAGE             COMMAND                  CREATED         STATUS         PORTS   NAMES
0bd6577f281a   missive-storage   "/usr/local/bin/du..."   2 seconds ago   Up 2 seconds           app_missive-storage-backend_2
45b8023f6ef1   d4ebdee0f3e2      "/usr/local/bin/du..."   4 minutes ago   Up 4 minutes           app_missive-storage-backend_1
As you can see, this gives me two running containers, one based on the old image, and one based on the new image. From here I can just redirect traffic by sending a configuration change to the front-end proxy, then stop the old container.
Note that --no-recreate is important - without it, Docker Compose seems liable to reboot everything, defeating the object of the exercise.
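Putting the whole procedure together, here is a rough sketch of the rolling restart (service and container names follow the example above; the proxy step depends on your Traefik setup):

# Rebuild so missive-storage:latest points at the new code.
docker-compose build missive-storage-backend
# Start a second instance; --no-recreate leaves the old one running untouched.
docker-compose up -d --no-recreate --scale missive-storage-backend=2
# (Repoint the front-end proxy at app_missive-storage-backend_2 here.)
# Retire the old instance once the proxy has switched over.
docker stop app_missive-storage-backend_1
docker rm app_missive-storage-backend_1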
I've been running cron inside Docker for some time, but I'm not sure my setup is optimal. I have one cron container that runs about 12 different scripts. I can edit the schedule of the scripts, but in order to deploy a new version of the software (some scripts run for about half a day), I have to create a new container to run some of the scripts while others finish.
One option I'm considering is running one container per script (the containers would share everything in the image except the crontab), but this would still make it hard to coordinate updates across multiple containers that share some of the same code.
The other alternative I'm considering is running cron on the host machine, with each crontab command being a docker run command. That would let me control which image the next run uses via an environment variable in the crontab, as sketched below.
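Something like this, for illustration (the image name, tag, and script path are invented; most cron implementations accept variable assignments at the top of the crontab, and the shell expands the variable when the command runs):

# Bump this value so subsequent runs use a new image.
IMAGE_TAG=v42
0 * * * * docker run --rm mycompany/jobs:$IMAGE_TAG /app/scripts/nightly.sh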
Does anybody have any experience with either of these two solutions? Are there any other solutions that could help?
If you are just running Docker standalone (single host) and need to run a bunch of cron jobs without thinking too much about their impact on the host, then keeping it simple and running them on the host works just fine.
It would make sense to run them in Docker if you benefit from Docker features like limiting memory and CPU usage (so they can't do anything too disruptive). If you also use a log driver that writes container logs to some external logging service, so you can easily monitor the jobs, then that's another good reason to do it. The last (but obvious) advantage is that deploying new software as a Docker image, instead of messing around on the host, is often a winner.
It's a lot cleaner to make one single image containing all the code you need. You then trigger docker run commands from the host's cron daemon, overriding the command/entrypoint per job. The container dies when the job is done, and deletes itself if you run it with --rm (you might need to capture the container output to logs on the host, depending on which logging driver is configured). Try not to pass in config values or parameters you change often, so that your cron setup stays as static as possible; it can get messy if a new image also means you have to edit your cron data on the host.
When you use docker run like this, you don't have to worry about updating images while jobs are running. Just make sure you tag the image with, for example, latest, so that the next job will use the new image.
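For example, a host crontab entry along these lines (image name, limits, and script path are assumptions; it shows the resource limits, command override, --rm self-cleanup, and log capture mentioned above):

0 3 * * * docker run --rm --memory=256m --cpus=0.5 mycompany/jobs:latest /app/scripts/cleanup.sh >> /var/log/cron-cleanup.log 2>&1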
Having 12 containers running in the background, each with its own cron daemon, also wastes some memory; but the worst part is that cron doesn't inherit environment variables from the parent process, so if you are injecting config with env vars you'll have to hack around that mess (write them to disk when the container starts, and such).
If you are worried about jobs running in parallel, there are tons of task scheduling services out there you could use, but that might be overkill for a single standalone Docker host.
I'm trying to create a Docker container based on CentOS 7 that will host R, shiny-server, and rstudio-server, but I need systemd in order for the services to start. I can use the systemd-enabled CentOS image as a base, but then I need to run the container in privileged mode and give it access to /sys/fs/cgroup on the host. I might be able to tolerate the less secure setup, but then I wouldn't be able to share the container with users running Docker on Windows or Mac.
I found this question but it is 2 years old and doesn't seem to have any resolution.
Any tips or alternatives are appreciated.
UPDATE: SUCCESS!
Here's what I found: for shiny-server, I only needed to execute shiny-server with the appropriate parameters from the command line. I captured the appropriate call in a script file, which is invoked by the final CMD line in my Dockerfile.
rstudio-server was more tricky. First, I needed to install initscripts to get the dependencies in place so that some of the rstudio scripts would work. After this, executing rstudio-server start would essentially do nothing and provide no error. I traced the call through the various links and found myself in /usr/lib/rstudio-server/bin/rstudio-server. The daemonCmd() function tests cat /proc/1/comm to determine how to start the server. For some reason it was failing, but looking at the script, it seems clear that it needs to execute /etc/init.d/rstudio-server start. If I do that manually or in a Docker CMD line, it seems to work.
I've taken those two required commands and put them into a shell script that gets called from the CMD line in the Dockerfile.
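Roughly like this, as a sketch (the script name and the shiny-server invocation reflect my setup and may need adjusting):

#!/bin/sh
# start.sh: called from the Dockerfile's CMD.
# Start rstudio-server through its init script, as described above.
/etc/init.d/rstudio-server start
# Run shiny-server in the foreground so it keeps the container alive.
exec shiny-server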
A bit of a hack, but not bad. I'm happy to hear any other suggestions.
You don't necessarily need to use an init system like systemd.
Essentially, you need to start multiple services, and there are existing patterns for this. Check out this page on how to use supervisord to achieve exactly that: https://docs.docker.com/engine/admin/using_supervisord/
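As a rough illustration, a supervisord.conf for this case might look something like the following (program names and binary paths are assumptions; rserver's --server-daemonize=0 keeps it in the foreground so supervisord can manage it):

[supervisord]
nodaemon=true

[program:shiny-server]
command=/usr/bin/shiny-server
autorestart=true

[program:rstudio-server]
command=/usr/lib/rstudio-server/bin/rserver --server-daemonize=0
autorestart=true

The Dockerfile's CMD would then just run supervisord -c /etc/supervisord.conf, with no systemd required.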