What is image "containersol/minimesos" in minimesos? - docker

I was able to setup the minimesos cluster on my laptop and also could deploy a small command-line utility. Now the questions;
What is the image "containersol/minimesos" used for? It is pulled but I don't see it running, when I do "docker ps". "docker images" lists it.
How come when I run "top" inside the mesos-agent container, I see all the processes running in my host (laptop)? This is a bit strange.
I was trying to figure out what's inside minimesos script. I see that there's just one "docker run ... " command. Would really appreciate if I could get to know what the aforementioned command does that results into 4 containers (1 master, 1 slave, 1 zk, 1 marathon) running on my laptop.

containersol/minimesos runs the Java code that is the core of minimesos. It only runs until it executes the command from the CLI. When you do minimesos up the command name and the minimesosFile will be passed to this container. The container in turn will execute the Java code that will create the other containers that form the Mesos cluster specified in the minimesosFile. That should answer #3 also. Take a look at MesosCluster class thats the root of where the magic happens.
I don't know the answer to #2 will get back to you when I find out.

Every minimesos command runs as a short lived container, whose image is containersol/minimesos.
When you run 'minimesos up' it launches containersol/minimesos with 'up' as the argument. It then launches a cluster by starting other containers like containersol/mesos-agent and containersol/mesos-master. After the cluster is up the containersol/minimesos container exists and is removed.
We have separated cli and minimesos core as a refactoring to prepare for the upcoming API module. We are creating an API to support clients for different programming language. The first client will be a Golang client.
In this new setup minimesos will run launch a long running API server and any minimesos cli commands call the API. The clients will also launch the API server and call the API.

Related

Is there a way to set the "--rm" option for a docker container deployed in a GCP compute instance?

I'm admittedly very new to Docker so this might be a dumb question but here it goes.
I have a Python ETL script that I've packaged in a Docker container essentially following this tutorial, then using cloud functions and cloud scheduler, I have the instance turn start every hour, run the sync and then shut down the instance.
I've run into an issue though where after this process has been running for a while the VM runs out of hard drive space. The script doesn't require any storage or persistence of state - it pulls any state data from external systems and only uses temporary files which are supposed to be deleted when the machine shuts down.
This has caused particular problems where updates I make to the script stop working because the machine doesn't have the space to download the latest version of the container.
I'm guessing it's either logs or perhaps files created automatically to try to persist the state - either within the Docker container or on the VM.
I'm wondering whether if I could get the VM to run the instance with the "--rm" flag so that the image was removed when it was finished this could solve this problem. This would theoretically guarantee that I'm always starting with the most recent image.
The trouble is, I can't for the life of my find a way to configure the "rm" option within the instance settings and the documentation for container options only covers passing arguments to the container ENTRYPOINT and not the docker run options docker run [OPTIONS] IMAGE [COMMAND] [ARG...]
I feel like I'm either missing something obvious or it's not designed to be used this way. Is this something that can be configured in the Dockerfile or is there a different way I have to set up the VM in the first place?
Basically I just want the docker image to be pulled fresh and run each time and not leave any remnants on the VM that will slowly run out of space.
Also, I know Cloud Run might work in some similar situations but I need the script to be able to run for as long as it needs to (particularly at the start when it's backfilling data) and so the 15 minute cap on runtime would be a problem.
Any suggestions would be appreciated!
Note: I'm posting this as an answer as I need more space than a comment. If anyone feels it is not a good answer and wants it deleted, I will be delighted to do such.
Recapping the story, we have a Compute Engine configured to start a Docker Container. The Compute Engine runs the container and then we stop it. An hour later we restart it, let it run and then we stop it again. This continues on into the future. What we seem to find is that the disk associated with the Compute Engine fills up and we end up breaking. The thinking is that the container contained within the Compute Engine is created at first launch of the Compute Engine and then each time it is restarted, it is being "re-used" as opposed to a brand new container instance being created. This means that resources consumed by the container from one run to the next (eg disk storage) continues to grow.
What we would like to happen is that when the Compute Engine starts, it will always create a brand new instance of the container with no history / resource usage of the past. This means that we won't consume resources over time.
One way to achieve this outside of GCP would be to start the container through Docker with the "--rm" flag. This means that when the container ends, it will be auto-deleted and hence there will be no previous container to start the next time the Compute Engine starts. Again ... this is a recap.
If we dig through how GCP Compute Engines work as they relate to containers, we come across a package called "Konlet" (Konlet). This is the package responsible for loading the container in the Compute engine. This appears to be itself a Docker container application written in Go. It appears to read the metadata associated with the Compute Engine and based on that, performs API calls to Docker to launch the target container. The first thing to see from this is that the launch of the target Docker container does not appear to be executed through simple docker command line. This then implies that we can't "simply" edit a script.
Konlet is open source so in principle, we could study it in detail and see if there are special flags associated with it to achieve the equivalent of --rm. However, my immediate recommendation is to post an issue at the Konlet GitHub site and ask the author whether there is a --rm equivalent option for Konlet and, if not, could one be added (and if not, what is the higher level thinking).
In the meantime, let me offer you an alternative to your story. If I am hearing you correctly, every hour you fire a job to start a compute engine, do work and then shutdown the compute engine. This compute engine hosts your "leaky" docker container. What if instead of starting/stopping your compute engine you created/destroyed your compute engine? While the creation/destruction steps may take a little longer to run, given that you are running this once an hour, a minute or two delay might not be egregious.

Self updating docker stack

I have a docker stack deployed with 20+ services which comprise my application. I would like to know that is there a way to update this stack with the latest changes to the software from within one of the containers running as a part of the stack?
Approach i have tried:
In one of the containers for a service, mounted the docker socket and the /usr/bin/docker file and downloaded the latest compose file from the server.
Instantiated a script which downloads the latest images
Initiate a docker stack deploy with the new compose file
Everything works fine this way but if the service which is running this process itself has an update and if that docker stack deploy tries to create this service before any other service in the stack, then the stack update fails.
Any suggestion or alternative approaches for this?
There is no out of the box solution for docker swarm mode (something like watchtower for single docker). I think you already found the best solution for doing this automatically. I would suggest you put the update container (the one that is updating the services) on a ignore list. Then on one of your master nodes, create a cron that updates that one container. I know this is not a prefect solution, but it should work.
The standard way to do this is to build a new Docker image that contains your new application code. Tag it (as in the docker build -t argument) with some unique version, like a source control tag or date stamp. Start a new container with the new application code, then stop and delete the old container.
As a general rule you do not upgrade the software inside a running container. Delete the old container and start a new container with the software and version you want. Also, this is generally managed by an operator, a continuous deployment system, or an orchestration system, not by the container itself. (Mounting the Docker socket into a container is a significant security exposure.)
(Imagine setting up a second copy of your cluster that works exactly the same way as your production cluster, except that it has the software you want to deploy tomorrow. You don't want your production cluster picking that up on its own until you've tested it. This scheme should give you a reproducible deployment setup so that it's easy to start that pre-production cluster, but also give you control over which specific versions are running where.)

Is it possible to specify a Docker image build argument at pod creation time in Kubernetes?

I have a Node.JS based application consisting of three services. One is a web application, and two are internal APIs. The web application needs to talk to the APIs to do its work, but I do not want to hard-code the IP address and ports of the other services into the codebase.
In my local environment I am using the nifty envify Node.JS module to fix this. Basically, I can pretend that I have access to environment variables while I'm writing the code, and then use the envify CLI tool to convert those variables to hard-coded strings in the final browserified file.
I would like to containerize this solution and deploy it to Kubernetes. This is where I run into issues...
I've defined a couple of ARG variables in my Docker image template. These get turned into environment variables via RUN export FOO=${FOO}, and after running npm run-script build I have the container I need. OK, so I can run:
docker build . -t residentmario/my_foo_app:latest --build-arg FOO=localhost:9000 BAR=localhost:3000
And then push that up to the registry with docker push.
My qualm with this approach is that I've only succeeded in punting having hard-coded variables to the container image. What I really want is to define the paths at pod initialization time. Is this possible?
Edit: Here are two solutions.
PostStart
Kubernetes comes with a lifecycle hook called PostStart. This is described briefly in "Container Lifecycle Hooks".
This hook fires as soon as the container reaches ContainerCreated status, e.g. the container is done being pulled and is fully initialized. You can then use the hook to jump into the container and run arbitrary commands.
In our case, I can create a PostStart event that, when triggered, rebuilds the application with the correct paths.
Unless you created a Docker image that doesn't actually run anything (which seems wrong to me, but let me know if this is considered an OK practice), this does require some duplicate work: stopping the application, rerunning the build process, and starting the application up again.
Command
Per the comment below, this event doesn't necessarily fire at the right time. Here's another way to do it that's guaranteed to work (and hence, superior).
A useful Docker container ends with some variant on a CMD serving the application. You can overwrite this run command in Kubernetes, as explained in the "Define a Command and Arguments for a Container" section of the documentation.
So I added a command to the pod definition that ran a shell script that (1) rebuilt the application using the correct paths, provided as an environment variable to the pod and (2) started serving the application:
command: ["/bin/sh"]
args: ["./scripts/build.sh"]
Worked like a charm.

Best Practices for Cron on Docker

I've transitioned to using docker with cron for some time but I'm not sure my setup is optimal. I have one cron container that runs about 12 different scripts. I can edit the schedule of the scripts but in order to deploy a new version of the software running (some scripts which run for about 1/2 day) I have to create a new container to run some of the scripts while others finish.
I'm considering either running one container per script (the containers will share everything in the image but the crontab). But this will still make it hard to coordinate updates to multiple containers sharing some of the same code.
The other alternative I'm considering is running cron on the host machine and each command would be a docker run command. Doing this would let me update the next run image by using an environment variable in the crontab.
Does anybody have any experience with either of these two solutions? Are there any other solutions that could help?
If you are just running docker standalone (single host) and need to run a bunch of cron jobs without thinking too much about their impact on the host, then making it simple running them on the host works just fine.
It would make sense to run them in docker if you benefit from docker features like limiting memory and cpu usage (so they don't do anything disruptive). If you also use a log driver that writes container logs to some external logging service so you can easily monitor the jobs.. then that's another good reason to do it. The last (but obvious) advantage is that deploying new software using a docker image instead of messing around on the host is often a winner.
It's a lot cleaner to make one single image containing all the code you need. Then you trigger docker run commands from the host's cron daemon and override the command/entrypoint. The container will then die and delete itself after the job is done (you might need to capture the container output to logs on the host depending on what logging driver is configured). Try not to send in config values or parameters you change often so you keep your cron setup as static as possible. It can get messy if a new image also means you have to edit your cron data on the host.
When you use docker run like this you don't have to worry when updating images while jobs are running. Just make sure you tag them with for example latest so that the next job will use the new image.
Having 12 containers running in the background with their own cron daemon also wastes some memory, but the worst part is that cron doesn't use the environment variables from the parent process, so if you are injecting config with env vars you'll have to hack around that mess (write them do disk when the container starts and such).
If you worry about jobs running parallel there are tons of task scheduling services out there you can use, but that might be overkill for a single docker standalone host.

Docker: Run command while another command is running

I need to configure a program running in a docker container. To achieve that the program must be running (and provide an open port) so that the administration program can connect to the running process. Unfortunately there is no simple editable config file so this is the only way. The RUN command is obviously not the right one because it does not provide a running instance after docker went to the next command. The best way would be doing this while building the docker image but if it has to be done during container start it would be OK as well. But there is (as far as I know) also no easy way to run multiple commands on startup. Does anyone has an idea how to do that?
To make it a bit more clear, here is a simple example from my Dockerfile:
# this command should start the application which has to be configured
RUN /usr/local/server/server.sh
# I tried this command alternatively because the shell script is blocking
RUN nohup /usr/local/server/server.sh &
# this is the command which starts an administration program which connects to the running instance started above
RUN /usr/local/administration/adm [some configuration parameters...]
# afterwards the server process can be stopped
Downloading the complete program directory containing the correct state could be a solution, too. But then the configuration cannot changed easily in the Dockerfile, what would be great.
A Dockerfile is supposed to be a sequential list of instructions to produce an image. The image should contain your application's code, and all of its installable dependencies.
Each RUN instruction gets executed as its own container. Once the command that you run completes, any changed files get committed as a new image layer.
Trying to run a process in the background, will cause the command you are running to return immediately. Once that happens, the container is considered stopped, and the Dockerfile's next instruction will be executed in a new separate container.
If you really need two processes running, you will need to produce a command that you can pass to a single RUN instruction.

Resources