CI/CD results don't consider whether containers started successfully internally - docker

When I bring up a Docker container in detached mode, the command usually returns quickly, since the container's output is not printed to the console. This becomes a problem when I run it through GitLab's CI/CD.
So for example, when I have this command in the deployment stage of my gitlab-ci.yml:
ssh root@123.123.123.123 docker-compose up -d
This brings up all the containers in docker-compose in detached mode on my instance at the IP address. The console will usually output:
MyContainerA ... Done!
MyContainerB ... Done!
<exit>
To Gitlab CI, the deployment stage is completed successfully because there were no errors.
However, this doesn't actually guarantee that everything is fine, because a container may not have started up successfully internally. For example, npm start may have failed and the container exits a moment later.
This makes the CI result (success/failed) unreliable. Is this normal when deploying Docker containers through CI/CD?
What is the correct way to deploy a Docker container through GitLab CI/CD (or any other CI/CD, for that matter) so that the CI's final result (success/failed) reflects whether the containers actually started up successfully internally?
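One common way to handle this (a sketch under assumptions, not something stated in the question) is to follow docker-compose up -d with an explicit check that the containers are still running, or healthy if a HEALTHCHECK is defined, and to exit non-zero otherwise so the CI job fails. Assuming the container of interest is named myservice (a hypothetical name):

ssh root@123.123.123.123 'bash -s' <<'EOF'
set -e
docker-compose up -d
# give the application a moment to crash if it is going to
sleep 15
# fail the deployment if the container is no longer running
if [ "$(docker inspect -f '{{.State.Running}}' myservice)" != "true" ]; then
  echo "myservice is not running" >&2
  docker logs myservice >&2
  exit 1
fi
EOF

Because the remote script exits with a non-zero status when the container has died, GitLab CI marks the deployment stage as failed instead of reporting success.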

Related

Get output from Gitlab CI using docker logs

I am working on a script aggregating all job traces from a pipeline's jobs. My goal is to:
Send traces to Graylog server
Save job traces locally to make them accessible from the machine in case of Graylog shutdown.
My first thought was accessing the logs from my GitLab CI using docker logs (or some other CLI tool) on my machine with docker.
I know from this thread that it's possible to do this from docker containers using, for example:
echo "My output" >> /proc/1/fd/1
But is it possible to do this from GitLab Runner containers? My .gitlab-ci.yml for testing looks like this:
image: python:latest
stages:
  - test
test:
  stage: test
  tags:
    - test
  script:
    - echo "My output" >> /proc/1/fd/1
Generally, I would like to be able to get "My output" from the machine using the docker logs command, but I am not sure how to do this. I use the Docker executor for my GitLab Runner.
I hope my explanation is understandable.
You cannot do this with any of the official docker-based GitLab executors. Job output logs are not emitted by the runner or the containers it starts. All output from a job container is captured and transmitted to the GitLab server in real time, so it never reaches the docker logging driver. Therefore, you cannot use docker logs or similar utilities to obtain job logs.
You can obtain job logs either: (1) from the configured storage of the GitLab server or (2) by using the jobs API.
For example, you can run a log forwarder (like the Splunk Universal Forwarder, a Graylog forwarder, etc.) directly on a self-hosted GitLab instance to forward job traces to the respective external systems.
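If the jobs API route is chosen, a rough sketch (the token, host, and IDs are placeholders, not values from this thread): list the jobs of a pipeline, then download each job's trace with the /trace endpoint.

# list jobs of a pipeline to collect their IDs
curl --header "PRIVATE-TOKEN: <access-token>" \
  "https://gitlab.example.com/api/v4/projects/<project-id>/pipelines/<pipeline-id>/jobs"
# download one job's trace (log) and store it locally
curl --header "PRIVATE-TOKEN: <access-token>" \
  "https://gitlab.example.com/api/v4/projects/<project-id>/jobs/<job-id>/trace" \
  --output job-<job-id>.log

From there the saved files can be shipped to Graylog or kept locally, which matches both goals from the question.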
GitLab 15.6 (November 2022) might help here:
GitLab Runner 15.6
We’re also releasing GitLab Runner 15.6 today! GitLab Runner is the lightweight, highly-scalable agent that runs your CI/CD jobs and sends the results back to a GitLab instance.
GitLab Runner works in conjunction with GitLab CI/CD, the open-source continuous integration service included with GitLab.
What's new:
Service container logs
Bug Fixes
GitLab Runner on Windows in Kubernetes: error preparation failed
The list of all changes is in the GitLab Runner CHANGELOG.
See Documentation.
The "Service container logs" item means:
Logs generated by applications running in service containers can be captured for subsequent examination and debugging.
Please note:
Enabling CI_DEBUG_SERVICES may result in masked variables being revealed.
When CI_DEBUG_SERVICES is enabled, service container logs and the CI job’s logs are streamed to the job’s trace log concurrently, which makes it possible for a service container log to be inserted inside a job’s masked log.
This would thwart the variable masking mechanism and result in the masked variable being revealed.
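For reference, a minimal sketch of turning the feature on for one job (the service image is just an example):

test-job:
  variables:
    CI_DEBUG_SERVICES: "true"   # stream service container logs into the job trace
  services:
    - postgres:15
  script:
    - echo "service logs now appear in this job's trace"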

Jenkins: docker agent with docker container in it

I am about to create a new CI/CD structure for our Jenkins. My goal is to create an environment for building and compiling apps. The environment has to be the same on the server and on developers' local machines.
I need to come up with a solution that allows developers to build the app on their local machines in the same way it is compiled on the Jenkins worker nodes.
I think that using a docker container to provide one fixed environment is a good way to go. So I have created a docker container [1] that contains all the necessary tools to build the application. Now developers can build their apps on local machines in the same way Jenkins does. When someone needs to build the app, they just pull the container, mount the source code directory into it, and execute the build command inside it.
Building looks like this: docker run --rm -v $(pwd):/app env_cont build
On the server I use a plugin for docker pipelines.
This solution works fine. Building apps is platform independent and can be done on any machine.
Now I have started toying with the idea of using docker for my Jenkins worker nodes as well: having one (physical) node with an exposed docker API and using it as a docker cloud for spawning worker nodes [2]. I like this approach, but here comes the problem: how do I use the docker nodes [2] to run docker containers [1] inside them? I guess I could install the docker tooling inside the docker container [2] that is used as a worker node and run the container in it (see the sketch after the list below). So the process would look like this:
Job is added into Jenkins queue.
Jenkins connects to worker node's docker API and spawns docker container [2] as a new worker node.
Worker node (which is running as a container) runs another "env_cont" container [1] (with environment for building) and build the app inside the "env_cont" container.
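A sketch of how step 3 can be wired up without installing a full Docker engine inside container [2]: mount the host's Docker socket into the worker container, so env_cont [1] is started as a sibling container on the same daemon. This is one possible variant, not necessarily what the setup above uses, and my-jenkins-agent-image is a placeholder:

# start the worker-node container with access to the host's Docker daemon
docker run -d --name jenkins-worker \
  -v /var/run/docker.sock:/var/run/docker.sock \
  my-jenkins-agent-image

# inside that worker, a build can then start env_cont as a sibling container;
# note that -v paths are resolved on the host, not inside the worker
docker run --rm -v /host/path/to/app:/app env_cont build

The trade-off is that containers started this way share the host daemon, so the worker is not fully isolated from it.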
My question is: is this good practice? I am a little bit worried that I am kind of over-thinking the problem. What do you think is a good approach?

Cleaning up orphaned docker containers after Jenkins job is terminated

I work at a large organization that runs hundreds of jobs in a shared Jenkins cluster.
My Jenkins job needs to run integration tests against untrusted code running inside Docker containers. I am fearful that when my Jenkins job gets terminated abruptly (e.g. the job is aborted or times out) I will be left with orphaned containers.
I have tried https://github.com/moby/moby/issues/1905 and ulimits does not work for me (this is because it only works for containers that run bash, and I cannot guarantee that mine will do so).
I tried https://stackoverflow.com/a/26351355/14731 but --lxc-conf is not a recognized option for Docker for Windows (this needs to run across all platforms supported by docker).
Any ideas?
Well, you can have cleanup commands in the first and last steps of your job: for example, first prune old dead containers, then rename the existing container to old_$jobname and kill it.
docker container prune -f
docker rename $jobname old_$jobname
docker kill old_$jobname
Do whatever else you need, then launch your new container:
docker run --name $jobname ...
By the looks of things, people are handling this outside of docker.
They are adding Jenkins post-build steps that clean up orphaned docker containers on aborted or failed builds.
See Martin Kenneth's build script as an example.
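Another pattern that is sometimes used (a sketch, not taken from the linked script): start every container with a job-specific label and remove everything carrying that label in a post-build step, and again at the start of the next build, so containers orphaned by an abrupt termination are also caught. JOB_NAME is Jenkins' built-in job name variable; the label key ci.job is arbitrary and my-test-image is a placeholder (concurrent builds of the same job would need a finer-grained label such as BUILD_TAG):

# start test containers with a label that ties them to this job
docker run -d --label "ci.job=$JOB_NAME" my-test-image

# cleanup (post-build and/or at the start of the next run):
# force-remove every container carrying the label, running or not
docker ps -aq --filter "label=ci.job=$JOB_NAME" | xargs -r docker rm -f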

Docker pipeline's "inside" not working in Jenkins slave running within Docker container

I'm having issues getting a Jenkins pipeline script to work that uses the Docker Pipeline plugin to run parts of the build within a Docker container. Both Jenkins server and slave run within Docker containers themselves.
Setup
Jenkins server running in a Docker container
Jenkins slave based on custom image (https://github.com/simulogics/protokube-jenkins-slave) running in a Docker container as well
Docker daemon container based on docker:1.12-dind image
Slave started like so: docker run --link=docker-daemon:docker --link=jenkins:master -d --name protokube-jenkins-slave -e EXTRA_PARAMS="-username xxx -password xxx -labels docker" simulogics/protokube-jenkins-slave
Basic Docker operations (pull, build and push images) are working just fine with this setup.
(Non-)Goals
I want the server to not have to know about Docker at all. This should be a characteristic of the slave/node.
I do not need dynamic allocation of slaves or ephemeral slaves. One slave started manually is quite enough for my purposes.
Ideally, I want to move away from my custom Docker image for the slave and instead use the inside function provided by the Docker pipeline plugin within a generic Docker slave.
Problem
This is a representative build step that's causing the issue:
image.inside {
  stage ('Install Ruby Dependencies') {
    sh "bundle install"
  }
}
This would cause an error like this in the log:
sh: 1: cannot create /workspace/repo_branch-K5EM5XEVEIPSV2SZZUR337V7FG4BZXHD4VORYFYISRWIO3N6U67Q@tmp/durable-98bb4c3d/pid: Directory nonexistent
Previously, this warning would show:
71f4de289962-5790bfcc seems to be running inside container 71f4de28996233340c2aed4212248f1e73281f1cd7282a54a36ceeac8c65ec0a
but /workspace/repo_branch-K5EM5XEVEIPSV2SZZUR337V7FG4BZXHD4VORYFYISRWIO3N6U67Q could not be found among []
Interestingly enough, exactly this problem is described in CloudBees documentation for the plugin here https://go.cloudbees.com/docs/cloudbees-documentation/cje-user-guide/index.html#docker-workflow-sect-inside:
For inside to work, the Docker server and the Jenkins agent must use the same filesystem, so that the workspace can be mounted. The easiest way to ensure this is for the Docker server to be running on localhost (the same computer as the agent). Currently neither the Jenkins plugin nor the Docker CLI will automatically detect the case that the server is running remotely; a typical symptom would be errors from nested sh commands such as
cannot create /…@tmp/durable-…/pid: Directory nonexistent
or negative exit codes.
When Jenkins can detect that the agent is itself running inside a Docker container, it will automatically pass the --volumes-from argument to the inside container, ensuring that it can share a workspace with the agent.
Unfortunately, the detection described in the last paragraph doesn't seem to work.
Question
Since both my server and slave are running in Docker containers, what kind of volume mapping do I have to use to make it work?
I've seen variations of this issue, also with the agents powered by the kubernetes-plugin.
I think that for it to work the agent/jnlp container needs to share workspace with the build container.
By build container I am referring to the one that will run the bundle install command.
This could possibly work via withArgs.
The question is why would you want to do that? Most of the pipeline steps are being executed on master anyway and the actual build will run in the build container. What is the purpose of also using an agent?
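For completeness, the volume mapping that is usually suggested for this kind of dind setup (a sketch under assumptions, not confirmed in this thread): put the workspace on a named volume that is mounted at the same path in both the docker-daemon container and the slave container, so that when the plugin bind-mounts the workspace path it resolves to the same data on the daemon's side. This assumes the slave's workspace root is /workspace, which the error message suggests:

docker volume create jenkins-workspace

# the dind daemon container sees the workspace under the same path
docker run -d --privileged --name docker-daemon \
  -v jenkins-workspace:/workspace \
  docker:1.12-dind

# the slave keeps its workspace on the same shared volume
docker run -d --name protokube-jenkins-slave \
  --link=docker-daemon:docker --link=jenkins:master \
  -v jenkins-workspace:/workspace \
  -e EXTRA_PARAMS="-username xxx -password xxx -labels docker" \
  simulogics/protokube-jenkins-slave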

Gitlab Continuous Integration on Docker

I have a GitLab server running in a Docker container: gitlab docker
On GitLab there is a project with a simple Makefile that runs pdflatex to build a PDF file.
In the Docker container I installed texlive and make; I also installed the docker runner with this command:
curl -sSL https://get.docker.com/ | sh
the .gitlab-ci.yml looks as follows:
.build:
  script: &build_script
    - make
build:
  stage: test
  tags:
    - Documentation Build
  script: *build_script
The job is stuck running and a message is shown:
This build is stuck, because the project doesn't have any runners online assigned to it
any idea?
The top comment on your link is spot on:
"Gitlab is good, but this container is absolutely bonkers."
Secondly, looking at GitLab's own advice, you should not be using this container on Windows, ever.
If you want to use GitLab CI from a GitLab server, you should actually be installing a proper GitLab server instance on a properly supported Linux VM, with Omnibus, and should not attempt to use this container for a purpose it is manifestly unfit for: running GitLab in production for real.
Gitlab-omnibus contains:
a persistent (not stateless!) data tier powered by postgres.
a chat server whose entire point is to be a persistent log of your team chat.
not one but a series of server processes that work together to give you GitLab server functionality and a web admin/management frontend, in a design that does not seem ideal to me for running in production inside docker.
an integrated CI build manager that is itself a Docker container manager. Your docker instance is going to contain a cache of other docker instances.
That this container was built by Gitlab itself is no indication you should actually use it for anything other than as a test/toy or for what Gitlab themselves actually use it for, which is probably to let people spin up Gitlab nightly builds, probably via kubernetes.
I think you're slightly confused here. Judging by this comment:
On the Docker container I installed texlive and make, I also installed
docker runner, command:
curl -sSL https://get.docker.com/ | sh
It seems you've installed docker inside docker and not actually installed any runners? This won't work if that's the case. The steps to get this running are:
Deploy a new gitlab runner. The quickest way to do this will be to deploy another docker container with the gitlab runner docker image. You can't run a runner inside the docker container you've deployed gitlab in. You'll need to make sure you select an executor (I suggest using the shell executor to get you started) and then you need to register the runner. There is more information about how to do this here. What isn't detailed here is that if you're using docker for gitlab and docker for gitlab-runner, you'll need to link the containers or set up a docker network so they can communicate with each other
Once you've deployed and registered the runner with GitLab, you will see it appear in http(s)://your-gitlab-server/admin/runners - from here you'll need to assign it to a project. You can also mark it as a "Shared" runner, which will execute jobs from all projects.
Finally, add the .gitlab-ci.yml as you already have, and the build will work as expected.
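As a rough sketch of steps 1 and 2 (the network name is arbitrary, the GitLab container is assumed to be named gitlab, and the registration token is a placeholder taken from the admin/runners page):

# put the GitLab container and the runner container on a shared network
docker network create gitlab-net
docker network connect gitlab-net gitlab

# deploy a separate runner container
docker run -d --name gitlab-runner --restart always \
  --network gitlab-net \
  -v /srv/gitlab-runner/config:/etc/gitlab-runner \
  gitlab/gitlab-runner:latest

# register it with the shell executor and a tag matching the job
docker exec -it gitlab-runner gitlab-runner register \
  --non-interactive \
  --url "http://gitlab" \
  --registration-token "<token>" \
  --executor "shell" \
  --tag-list "Documentation Build" \
  --description "docs-runner"

With the shell executor, make and texlive have to be available inside the gitlab-runner container (or on whatever host the runner process runs on).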
Maybe you've set the wrong tags, like me. Make sure the tag name matches your available runner.
tags:
  - Documentation Build # tags is used to select specific Runners from the list of all Runners that are allowed to run this project.
see: https://docs.gitlab.com/ee/ci/yaml/#tags
