I am using gitlab runner to run my tests on Digital ocean servers
After I re-installed gitlab runner I started to get the following errors:
I have the following config for my runner:
concurrent = 10
check_interval = 0
[session_server]
session_timeout = 1800
[[runners]]
name = "Builds coordinator"
url = "https://gitlab.com/"
token = "<token>"
executor = "docker+machine"
limit = 10
[runners.docker]
tls_verify = false
image = "alpine:latest"
privileged = false
disable_entrypoint_overwrite = false
oom_kill_disable = false
disable_cache = false
volumes = ["/cache"]
shm_size = 0
[runners.machine]
IdleCount = 0
IdleTime = 0
OffPeakTimezone = ""
OffPeakIdleCount = 0
OffPeakIdleTime = 0
MachineDriver = "digitalocean"
MachineName = "gitlab-runner-autoscale-%s"
MachineOptions = [
"digitalocean-image=coreos-stable",
"digitalocean-ssh-user=core",
"digitalocean-access-token=<token>",
"digitalocean-region=lon1",
"digitalocean-size=4gb",
"digitalocean-private-networking"
]
[runners.cache]
Type = "s3"
Path = "cache_for_builds"
Shared = false
[runners.cache.s3]
ServerAddress = "ams3.digitaloceanspaces.com"
AccessKey = "<key>"
SecretKey = "<secret>"
BucketName = "cache-for-builds"
BucketLocation = "ams3"
Insecure = false
I tried to do docker-machine ls when the build was running and I saw the following output:
... ERRORS
... Unknown Unable to query docker version: Cannot connect to the docker engine endpoint
or
... ERRORS
... Unable to query docker version: Get https://<ip>:2376/v1.15/version: remote error: tls: bad certificate
I tried to search for certs on my server with find -name "*cert*" and found them in /root/.docker/machine/certs. I added path to the certs to the [runners.docker] section as described in the docs but it didn't help:
[runners.docker]
...
tls_cert_path = "/root/.docker/machine/certs"
When I view gitlab runner logs with journalctl -u gitlab-runner I see
Jan 13 21:06:59 builds-coordinator gitlab-runner[3360]: ERROR: Error creating machine: Error checking the host: Error checking and/or regenerating the certs: There was an error validating certificates for host "<ip>:2376": remote error: tls: bad certificate driver=digitalocean name=r
Jan 13 21:06:59 builds-coordinator gitlab-runner[3360]: ERROR: You can attempt to regenerate them using 'docker-machine regenerate-certs [name]'. driver=digitalocean name=runner-gitlab-runner-autoscale-<id> operation=create
Jan 13 21:06:59 builds-coordinator gitlab-runner[3360]: ERROR: Be advised that this will trigger a Docker daemon restart which might stop running containers. driver=digitalocean name=runner-gitlab-runner-autoscale-<id> operation=create
Jan 13 21:06:59 builds-coordinator gitlab-runner[3360]: The default lines below are for a sh/bash shell, you can specify the shell you're using, with the --shell flag. driver=digitalocean name=runner-gitlab-runner-autoscale-<id> operation=create
Jan 13 21:07:00 builds-coordinator gitlab-runner[3360]: WARNING: Machine creation failed, trying to provision error=exit status 1 name=runner-gitlab-runner-autoscale-<id>
Jan 13 21:07:01 builds-coordinator gitlab-runner[3360]: Waiting for SSH to be available... name=runner-gitlab-runner-autoscale-<id> operation=provision
Before I was using gitlab runner version 11.5.1
How can I fix these errors and make gitlab runner running the builds?
Related
Im trying to setup a gitlab runner with dind, to build docker images in Gitlab CI Pipelines, but im getting the following errors each build:
*** WARNING: Service runner-project-2-concurrentdocker-0 probably didn't start properly.
Health check error:
service "runner-project-2-concurrentdocker-0-wait-for-service" timeout
Health check container logs:
Service container logs:
2021-10-12T14:12:26.652132966Z time="2021-10-12T14:12:26.651911909Z" level=info msg="Starting up"
2021-10-12T14:12:26.653211174Z time="2021-10-12T14:12:26.653132075Z" level=warning msg="could not change group /var/run/docker.sock to docker: group docker not found"
2021-10-12T14:12:26.653320513Z time="2021-10-12T14:12:26.653240584Z" level=warning msg="Binding to IP address without --tlsverify is insecure and gives root access on this machine to everyone who has access to your network." host="tcp://0.0.0.0:2375"
2021-10-12T14:12:27.653863417Z time="2021-10-12T14:12:27.653434622Z" level=warning msg="Binding to an IP address without --tlsverify is deprecated. Startup is intentionally being slowed down to show this message" host="tcp://0.0.0.0:2375"
My Gitlab Runner config.toml looks like this:
concurrent = 10
check_interval = 0
[session_server]
session_timeout = 1800
[[runners]]
name = "gitlab-runner-docker"
url = "https://gitlab.my.host/"
token = "MYTOKEN"
executor = "docker"
environment = ["DOCKER_DRIVER=overlay2", "DOCKER_HOST=tcp://docker:2375/", "DOCKER_TLS_CERTDIR="]
[runners.docker]
tls_verify = false
image = "docker:dind"
privileged = true
[[runners.docker.services]]
name = "docker:dind"
command = ["--registry-mirror", "http://192.168.1.21"]
disable_entrypoint_overwrite = false
oom_kill_disable = false
disable_cache = false
volumes = ["/certs/client", "/cache"]
shm_size = 0
There are mysterious bug reports and user questions on those errors, but non of them seem to fix my problem. I tried removing the tls_verify settings, and setting the DOCKER_TLS_CERTDIR: "" CI Pipeline variable and what not. Is there any chance to get those runners booting up fast again with or without tls verification?
I'm trying to have several gitlab runners using different docker daemons on the same host
Currently using gitlab-runner 10.7.0 and docker 19.03.3. The goal is to maximize the usage of resources. Since I have two SSD disks on the machine, I want the runners to use both of them. The only way I found to have some runners use one disk while some others use the other disk is to have two docker daemons, one running on each disk.
I have one docker daemon running on unix:///var/run/docker-1.sock and one on unix:///var/run/docker-2.sock. They use each a dedicated bridge created manually. The (systemd) startup command line looks like /usr/bin/dockerd --host unix:///var/run/docker_socket/docker-%i.sock --containerd=/run/containerd/containerd.sock --pidfile /var/run/docker-%i.pid --data-root /data/local%i/docker/ --exec-root /data/local%i/docker_run/ --bridge docker-%i --fixed-cidr 172.%i0.0.1/17
The gitlab_runner mounts /var/run/docker_socket/ and runs on docker-1.sock.
I tried having one per docker daemon but then two jobs runs on the same runner although the limit is set to 1 (and also there are some sometimes errors appearing like ERROR: Job failed (system failure): Error: No such container: ...)
After registration the config.toml looks like:
concurrent = 20
check_interval = 0
[[runners]]
name = "[...]-large"
limit = 1
output_limit = 32768
url = "[...]"
token = "[...]"
executor = "docker"
[runners.docker]
host = "unix:///var/run/docker-1.sock"
tls_verify = false
image = "debian:jessie"
memory = "24g"
cpuset_cpus = "1-15"
privileged = false
security_opt = ["seccomp=unconfined"]
disable_cache = false
volumes = ["/var/run/docker-1.sock:/var/run/docker.sock"]
shm_size = 0
[runners.cache]
[[runners]]
name = "[...]-medium-1"
limit = 1
output_limit = 32768
url = "[...]"
token = "[...]"
executor = "docker"
[runners.docker]
host = "unix:///var/run/docker-2.sock"
tls_verify = false
image = "debian:jessie"
memory = "12g"
cpuset_cpus = "20-29"
privileged = false
security_opt = ["seccomp=unconfined"]
disable_cache = false
volumes = ["/var/run/docker-2.sock:/var/run/docker.sock"]
shm_size = 0
[runners.cache]
The two docker daemons are working fine. Tested with docker --host unix:///var/run/docker-<id>.sock ps
The current solution seems to be kind of OK but there are random errors in the gitlab_runner logs:
ERROR: Appending trace to coordinator... error couldn't execute PATCH against http://[...]/api/v4/jobs/223116/trace: Patch http://[...]/api/v4/jobs/223116/trace: read tcp [...] read: connection reset by peer runner=0ec8a845
Other people tried this, apparently with some success:
This one seems to list the whole set of options needed to properly run each instance of dockerd : Is it possible to start multiple docker daemons on the same machine. What are yours?
This other https://www.jujens.eu/posts/en/2018/Feb/25/multiple-docker/, does not speak about the possible extra bridge config.
NB: Docker documentation says the feature is experimental: https://docs.docker.com/engine/reference/commandline/dockerd/#run-multiple-daemons
I’m trying to use the gitlab-runner with docker-ssh. Here is how my config.toml looks like:
[[runners]]
name = “CI/CD docker-ssh alfa”
url = “https://gitlab.com/”
token = “<SOME_TOKEN>“
executor = “docker-ssh”
[runners.ssh]
user = “myuser”
password = “my password”
[runners.docker]
tls_verify = false
image = “ubuntu:latest”
privileged = false
disable_cache = false
volumes = ["/cache"]
shm_size = 0
[runners.cache]
[runners.cache.s3]
[runners.cache.gcs]
But I got this error:
Running with gitlab-runner 11.3.0 (d78e9e67)
on CI/CD docker-ssh alfa 1f147b76
Using Docker executor with image ubuntu:latest …
ERROR: Preparation failed: build directory needs to be absolute and non-root path
Will be retried in 3s …
Using Docker executor with image ubuntu:latest …
ERROR: Preparation failed: build directory needs to be absolute and non-root path
So I tried to change the build directory and here hows my config.toml file looks like now:
[[runners]]
name = “CI/CD docker-ssh alfa”
url = “https://gitlab.com/”
token = “<SOME_TOKEN>“
executor = “docker-ssh”
builds_dir = “/home/myuser/“
[runners.ssh]
user = “myuser”
password = “my password”
[runners.docker]
tls_verify = false
image = “ubuntu:latest”
privileged = false
disable_cache = false
volumes = [”/cache"]
shm_size = 0
[runners.cache]
[runners.cache.s3]
[runners.cache.gcs]
But I got this new error:
Running with gitlab-runner 11.3.0 (d78e9e67)
on CI/CD docker-ssh alfa 1f147b76
Using Docker executor with image ubuntu:latest …
WARNING: Since GitLab Runner 10.0 docker-ssh and docker-ssh+machine executors are marked as DEPRECATED and will be removed in one of the upcoming releases
Pulling docker image ubuntu:latest …
Using docker image sha256:cd6d8154f1e16e38493c3c2798977c5e142be5e5d41403ca89883840c6d51762 for ubuntu:latest …
ERROR: Preparation failed: dial tcp 172.17.0.2:22: getsockopt: connection refused
Will be retried in 3s …
Any idea what am I doing wrong?
Stick with an HTTPS URL, and try fixing instead the error:
build directory needs to be absolute and non-root path
See this thread
I was running my CI on an old gitlab-ci-multi-runner 9.5.1.
I update to gitlab-runner 10.8.0 and now it’s ok.
Or this thread:
Set build_dir="C:\\gitlab-runner\\builds" in the config.toml.
I am new bie to gitlab-runner, i have tried to setup gitlab-runner-autoscaling but i am unable to download ecr images in a build. When i try to ssh into docker-machine i am able to download images, i even tried to ssh into the VM and tried to pull ecr images as root and as ubuntu user(ubuntu 16.04 AMI), it only fails while running a build .
Please let me know how i can troubleshoot.
1. How can i find the command gitlab-runner is using to pull ecr image/
2. How to find the user its running the docker command.
Runner config:
[[runners]]
name = "registry-test4"
limit = 1
url = "http://gitlab.xxxxxxxx.com/"
token = "xxxxxxxxxxxxxxx"
executor = "docker+machine"
[runners.docker]
tls_verify = false
image = "ruby:2.1"
privileged = false
disable_cache = false
volumes = ["/cache"]
shm_size = 0
[runners.cache]
[runners.machine]
IdleCount = 1
MachineDriver = "amazonec2"
MachineName = "gitlab-runner-ci-%s"
MachineOptions = ["amazonec2-iam-instance-profile=xxxxxxxxxxx", "amazonec2-ssh-user=ubuntu", "amazonec2-region=us-east-1", "amazonec2-instance-type=t2.large", "amazonec2-ami=ami-xxxxx", "amazonec2-vpc-id=vpc-xxxxx", "amazonec2-subnet-id=subnet-xxxxx", "amazonec2-zone=a", "amazonec2-root-size=32", "amazonec2-keypair-name=spot", "amazonec2-ssh-keypath=/root/.ssh/spot", "amazonec2-userdata=/etc/gitlab-runner/bootstrap.sh", "amazonec2-request-spot-instance=true", "amazonec2-security-group=docker_machine_git_as_prod", "amazonec2-security-group=consul-agent-prod", "amazonec2-private-address-only", "amazonec2-spot-price=x.xx"]
OffPeakPeriods = ["* * 5-11 * * mon-fri *", "* * * * * sat,sun *"]
OffPeakTimezone = ""
OffPeakIdleCount = 1
OffPeakIdleTime = 1200
Error:
Running with gitlab-runner 10.2.0 (0a75cdd1)
on registry-test4 (31b91ac3)
Using Docker executor with image xxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/dev/sbt:latest ...
Using docker image sha256:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxfor predefined container...
Pulling docker image xxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/dev/sbt:latest ...
ERROR: Preparation failed: Error response from daemon: Get https://xxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/v2/dev/sbt/manifests/latest: no basic auth credentials
Will be retried in 3s ...
.gitlab-ci.yml
---
main:
image: xxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/dev/sbt:latest
script: sbt +runCI
Solved this issue , by installing ecr binary
https://github.com/awslabs/amazon-ecr-credential-helper
on gitlab-runner server passing these parameters in /root/.docker/config.json. (earlier ecr was installed only on the VM docker-machine was provisioning.)
{
"credsStore": "ecr-login"
}
I am trying to configure a simple Gitlab-ci build pipeline and am running all of the components in docker containers. I followed the general guides on docs.gitlab.com and got a runner registered with gitlab. But when a build kicks off, the runner tries to clone the repository in question and seems to use the gitlab instance's container-id in place of the url, and I get an unreachable-host error:
Cloning repository...
Cloning into '/builds/root/ci-demo'...
fatal: unable to access 'http://gitlab-ci-token:xxxxxxxxxxxxxxxxxxxx#cdfd596f2bc4/root/ci-demo.git/': Could not resolve host: cdfd596f2bc4
ERROR: Job failed: exit code 1
Is there something obvious that I've overlooked? There are quite a few similar questions on SO and the internet in general, but none seem to have a problem with the target container-id being substituted for the url.
gitlab-runner's config.toml:
concurrent = 1
check_interval = 0
[[runners]]
name = "runner_name"
url = "http://[ipaddr]:[port]/"
token = "xxxxxxx"
executor = "docker"
[runners.docker]
tls_verify = false
image = "maven:latest"
privileged = false
disable_cache = false
volumes = ["/cache"]
shm_size = 0
[runners.cache]