Run buildah within gitlab-ci - docker

I want to use buildah from gitlab-ci, in order to build an image, run a container from it and do some tests against it.
My current .gitlab-ci.yml is:
tests:
  tags:
    - docker
  image: quay.io/buildah/stable
  stage: test
  variables:
    STORAGE_DRIVER: "vfs"
    BUILDAH_FORMAT: "docker"
    BUILDAH_ISOLATION: "rootless"
  only:
    refs:
      - merge_requests
    changes:
      - "**/*"
  script:
    - buildah info --debug
    - buildah unshare docker/test/run.sh
My runner is a private GitLab runner, and I don't want to change its configuration (so as not to break other CI jobs).
The content of run.sh is:
#!/usr/bin/env bash
set -euo pipefail
container=$(buildah --ulimit nofile=8192 --name my-container from phusion/baseimage:bionic-1.0.0-amd64)
The error is:
level=warning msg="error reading allowed ID mappings: error reading subuid mappings for user \"root\" and subgid mappings for group \"root\": No subuid ranges found for user \"root\" in /etc/subuid"
level=warning msg="Found no UID ranges set aside for user \"root\" in /etc/subuid."
level=warning msg="Found no GID ranges set aside for user \"root\" in /etc/subgid."
No buildah sali-container already exists...
Package Sali
Creating sali-container
Completed short name "phusion/baseimage" with unqualified-search registries (origin: /etc/containers/registries.conf)
Getting image source signatures
Copying blob sha256:36505266dcc64eeb1010bd2112e6f73981e1a8246e4f6d4e287763b57f101b0b
Copying blob sha256:1907967438a7f3c5ff54c8002847fe52ed596a9cc250c0987f1e2205a7005ff9
Copying blob sha256:23884877105a7ff84a910895cd044061a4561385ff6c36480ee080b76ec0e771
Copying blob sha256:2910811b6c4227c2f42aaea9a3dd5f53b1d469f67e2cf7e601f631b119b61ff7
Copying blob sha256:bc38caa0f5b94141276220daaf428892096e4afd24b05668cd188311e00a635f
Copying blob sha256:53c90fd859186b7b770d65adcb6ae577d4c61133f033e628530b1fd8dc0af643
Copying blob sha256:d039079bb3a9bf1acf69e7c00db0e6559a86148c906ba5dab06b67c694bbe87c
Copying config sha256:32c929dd2961004079c1e35f8eb5ef25b9dd23f32bc58ac7eccd72b4aa19f262
Writing manifest to image destination
Storing signatures
level=error msg="Error while applying layer: ApplyLayer exit status 1 stdout: stderr: potentially insufficient UIDs or GIDs available in user namespace (requested 0:42 for /etc/gshadow): Check /etc/subuid and /etc/subgid: lchown /etc/gshadow: invalid argument"
4 errors occurred while pulling:
* Error initializing source docker://registry.fedoraproject.org/phusion/baseimage:bionic-1.0.0-amd64: Error reading manifest bionic-1.0.0-amd64 in registry.fedoraproject.org/phusion/baseimage: manifest unknown: manifest unknown
* Error initializing source docker://registry.access.redhat.com/phusion/baseimage:bionic-1.0.0-amd64: Error reading manifest bionic-1.0.0-amd64 in registry.access.redhat.com/phusion/baseimage: name unknown: Repo not found
* Error initializing source docker://registry.centos.org/phusion/baseimage:bionic-1.0.0-amd64: Error reading manifest bionic-1.0.0-amd64 in registry.centos.org/phusion/baseimage: manifest unknown: manifest unknown
* Error committing the finished image: error adding layer with blob "sha256:23884877105a7ff84a910895cd044061a4561385ff6c36480ee080b76ec0e771": ApplyLayer exit status 1 stdout: stderr: potentially insufficient UIDs or GIDs available in user namespace (requested 0:42 for /etc/gshadow): Check /etc/subuid and /etc/subgid: lchown /etc/gshadow: invalid argument
level=error msg="exit status 125"
level=error msg="exit status 125"
The result of buildah info --debug:
{
    "debug": {
        "buildah version": "1.18.0",
        "compiler": "gc",
        "git commit": "",
        "go version": "go1.15.2"
    },
    "host": {
        "CgroupVersion": "v1",
        "Distribution": {
            "distribution": "fedora",
            "version": "33"
        },
        "MemFree": 9021378560,
        "MemTotal": 15768850432,
        "OCIRuntime": "runc",
        "SwapFree": 0,
        "SwapTotal": 0,
        "arch": "amd64",
        "cpus": 4,
        "hostname": "runner-cvBUQadt-project-2197143-concurrent-0",
        "kernel": "4.14.83+",
        "os": "linux",
        "rootless": false,
        "uptime": "6391h 28m 15.45s (Approximately 266.29 days)"
    },
    "store": {
        "ContainerStore": {
            "number": 0
        },
        "GraphDriverName": "vfs",
        "GraphOptions": [
            "vfs.imagestore=/var/lib/shared"
        ],
        "GraphRoot": "/var/lib/containers/storage",
        "GraphStatus": {},
        "ImageStore": {
            "number": 0
        },
        "RunRoot": "/var/run/containers/storage"
    }
}
I read other posts about the errors I had and arrived at this configuration, but it is still not enough. I chose Buildah thinking it would be easy to use from CI, since it is supposed to run rootless, but this is a real nightmare... I am a poor lonesome developer, not a sysadmin, and I don't understand how to set up Linux for Buildah... Can somebody help me?

Buildah needs to run either as root or inside a user namespace with enough UIDs and GIDs to install files owned by different IDs.
Here it looks as though Buildah decided it should run inside a user namespace and then found no ID ranges listed for root in that namespace. This usually happens when the container is not run with enough privileges.
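One way to check from inside the CI job how many IDs the current user namespace actually maps is to sum the ranges in /proc/self/uid_map and /proc/self/gid_map. This is a minimal sketch, assuming a Linux runner; count_mapped_ids is a hypothetical helper name, not a Buildah command.

```shell
#!/bin/sh
# Sum the sizes of all ID ranges listed in a uid_map or gid_map file.
# Each line of such a file has three fields:
#   <first ID inside the namespace> <first ID outside> <range length>
# count_mapped_ids is a hypothetical helper, not part of Buildah.
count_mapped_ids() {
    map_file="${1:-/proc/self/uid_map}"
    # printf "%.0f" avoids scientific notation for very large totals.
    awk '{ total += $3 } END { printf "%.0f\n", total }' "$map_file"
}

# Usage inside the CI job (prints one number per map):
#   count_mapped_ids /proc/self/uid_map
#   count_mapped_ids /proc/self/gid_map
```

If the result is 1, only a single ID is mapped, so a layer that chowns a file to another UID or GID (such as 0:42 for /etc/gshadow in the log above) cannot be applied.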

Related

Missing Manifest from docker

The remote Docker repository seems to be well configured, but when I try to download any image through my virtual Docker repository, it always fails with this trace:
2020-04-01T16:20:32.969Z [jfrt ] [INFO ] [b6e7232e6d2e0cb4] [DockerV2VirtualRepoHandler:117] [http-nio-8081-exec-8] - Fetching docker manifest for repo 'thanosio/thanos' and tag 'latest'
2020-04-01T16:20:33.874Z [jfrt ] [ERROR] [b6e7232e6d2e0cb4] [.DockerV2RemoteRepoHandler:448] [http-nio-8081-exec-8] - Missing Manifest from docker-via-intranet 'v2/thanosio/thanos/manifests/latest' not found at docker-via-intranet:thanosio/thanos/latest/list.manifest.json
2020-04-01T16:20:34.703Z [jfrt ] [ERROR] [b6e7232e6d2e0cb4] [.DockerV2RemoteRepoHandler:448] [http-nio-8081-exec-8] - Missing Manifest from docker-remote 'v2/thanosio/thanos/manifests/latest' not found at docker-remote:thanosio/thanos/latest/list.manifest.json
2020-04-01T16:20:35.545Z [jfrt ] [ERROR] [b6e7232e6d2e0cb4] [.DockerV2RemoteRepoHandler:448] [http-nio-8081-exec-8] - Missing Manifest from quay-io 'v2/thanosio/thanos/manifests/latest' not found at quay-io:thanosio/thanos/latest/list.manifest.json
DockerHub requires token authentication. You should check the "Enable Token Authentication" box. After doing this, try to pull an image you have never pulled before (since JFrog Container Registry caches 404s for a period of time). You can also go to the Advanced settings and set the missed metadata retrieval cache period to zero (instead of waiting for the cache period to expire).

Sharing files with ECS and EFS

Could you help me, please?
I'm trying to configure an ECS cluster to share files using EFS but I'm facing the following issue:
level=info time=2020-03-02T17:30:27Z msg="TaskHandler: Sending task change: TaskChange:
[arn:aws:ecs:us-east-1:959242800104:task/74086a36-c405-4248-8475-3234b011bee8 -> STOPPED, Known
Sent: NONE, PullStartedAt: 2020-03-02 17:30:27.661062367 +0000 UTC m=+3131.201879282,
PullStoppedAt: 2020-03-02 17:30:27.744492758 +0000 UTC m=+3131.285309673, ExecutionStoppedAt:
2020-03-02 17:30:27.913073824 +0000 UTC m=+3131.453890739,
arn:aws:ecs:us-east-1:959242800104:task/74086a36-c405-4248-8475-3234b011bee8 redmine -> STOPPED, Reason
CannotCreateContainerError: Error response from daemon: failed to mount local volume: mount
:/mnt/efs/redmine:/var/lib/docker/volumes/ecs-redmine-22-attachments-cee2f0e7e0ebc5f55000/_data,
data: addr=10.0.0.127,nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport:
no such file or directory, Known Sent: NONE] sent: false" module=task_handler_types.go
If I only declare a volume inside my ECS task, the container starts normally, but if I try to map the external volume to the container folder, the issue occurs.
I followed this tutorial: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_efs.html and the problem seems to lie not in the security groups but in the container itself.
I'm using the alpine version of Redmine.
Here are the config snippets:
...
"mountPoints": [
    {
        "readOnly": null,
        "containerPath": "/usr/src/redmine/files",
        "sourceVolume": "attachments"
    }
],
...
"volumes": [
    {
        "efsVolumeConfiguration": {
            "fileSystemId": "fs-xxxxx",
            "rootDirectory": "/mnt/efs/redmine"
        },
        "name": "attachments",
        "host": null,
        "dockerVolumeConfiguration": null
    }
],
Thanks in advance.
The log says "no such file or directory": make sure the directory exists on EFS before using it.
Other considerations:
At the time of writing, you cannot use "efsVolumeConfiguration" with ECS on Fargate. It is currently available only for ECS on EC2 (Fargate support is in the making).
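Note that "rootDirectory" is a path inside the EFS file system, not a path on the host, so that directory has to exist on the file system itself. A minimal sketch of creating it, assuming the file system is already mounted somewhere on an instance; ensure_efs_dir and the /mnt/efs mount point are hypothetical.

```shell
#!/bin/sh
# Create the directory that the task definition names as "rootDirectory"
# on an EFS file system that is already mounted on the host.
# ensure_efs_dir is a hypothetical helper; the mount point is an assumption.
ensure_efs_dir() {
    mount_point="$1"      # where the EFS file system is mounted on the host
    root_directory="$2"   # "rootDirectory" value from the task definition
    # Strip the leading slash so the path is resolved below the mount point.
    mkdir -p "$mount_point/${root_directory#/}"
}

# Usage on an instance with the file system mounted at /mnt/efs:
#   ensure_efs_dir /mnt/efs /mnt/efs/redmine
# This creates /mnt/efs/mnt/efs/redmine on the host, which is the path
# /mnt/efs/redmine as seen from the root of the EFS file system.
```

With the configuration shown in the question, a shorter "rootDirectory" such as "/redmine" may be what was intended, since the value is resolved from the root of the file system.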
I followed the links below to solve my problem. I concluded that EFS was not ready to use directly in ECS.
I had to mount EFS on the EC2 instance, and after that I could access it from the Docker container.
https://gist.github.com/duluca/ebcf98923f733a1fdb6682f111b1a832#update-your-cloud-formation-template
https://xiaoyunyang.github.io/post/a-complete-guide-to-deploying-your-web-app-to-amazon-web-service/#set-up-efs-with-your-containers

docker: Error creating container: 400 Client Error: Bad Request ("invalid reference format")

While trying to build an awx image (Ansible works) for ppc64le, the following comes up:
TASK [image_build : Build AWX distribution using container] ***************************************************************************************************************************************************
fatal: [localhost -> localhost]: FAILED! => {"changed": false, "msg": "Error creating container: 400 Client Error: Bad Request (\"invalid reference format\")"}
to retry, use: --limit @/root/awx/installer/install.retry
PLAY RECAP ****************************************************************************************************************************************************************************************************
localhost : ok=10 changed=3 unreachable=0 failed=1
How can I see what really happens in the background? Any verbose docker logs that I can look at? The message itself is somewhat useless to me. I already set Ansible to verbose but this also was of no help.
Docker repository names must be lowercase: they may contain lowercase letters (a-z), digits, and the separators ".", "_" and "-".
Either you are passing an unsupported image name, or a variable (or path) passed to the build (or the container) cannot be resolved.
To enable debug logs, add "--debug" to the Docker daemon command line (/etc/systemd/system/multi-user.target.wants/docker.service on systemd-based Linux systems).
For reference: https://docs.docker.com/config/daemon/#configure-the-docker-daemon
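To catch a bad or empty image name before it reaches Docker, a rough pre-check of the repository part of the reference (the part before any ":tag") could look like this. It is a simplified approximation of the naming rule above, not Docker's full reference grammar, and is_plausible_repo_name is a hypothetical helper.

```shell
#!/bin/sh
# Rough pre-check for the repository part of an image reference.
# Accepts runs of lowercase letters and digits joined by ".", "_",
# "-" or "/". Registry hosts with ports and ":tag" suffixes are
# deliberately not handled here.
# is_plausible_repo_name is a hypothetical helper, not a Docker command.
is_plausible_repo_name() {
    # LC_ALL=C keeps the a-z range limited to ASCII lowercase letters.
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]+([._/-]+[a-z0-9]+)*$'
}

# Usage:
#   is_plausible_repo_name "$image_name" || echo "suspicious image name: '$image_name'"
```

An empty string fails the check too, which is the typical result of an unresolved variable in a playbook or script.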

Debugging Elastic Beanstalk Docker run failures?

I'm new to EB and AWS, and my docker images build fine but fail to run on Elastic Beanstalk. My suspicion is that they are not connecting to the database correctly, however, I'm not getting anything useful when I run "eb logs" from the commandline. Here are the errors:
{
    "status": "FAILURE",
    "api_version": "1.0",
    "results": [
        {
            "status": "FAILURE",
            "msg": "(TRUNCATED)...rrun.aws.json: No such file or directory
73927c49adff622a1a229d9369bdd80674d96d20f3eb99a9cdea786f4411a368
Docker container quit unexpectedly after launch: Docker container quit unexpectedly on Wed May 20 17:15:02 UTC 2015:.
Check snapshot logs for details.
Hook /opt/elasticbeanstalk/hooks/appdeploy/pre/04run.sh failed.
For more detail, check /var/log/eb-activity.log using console or EB CLI",
            "returncode": 1,
            "events": [
                {
                    "msg": "Successfully pulled node:0.12.2-slim",
                    "severity": "TRACE",
                    "timestamp": 1432142064
                },
                {
                    "msg": "Successfully built aws_beanstalk/staging-app",
                    "severity": "TRACE",
                    "timestamp": 1432142094
                },
                {
                    "msg": "Docker container quit unexpectedly after launch: Docker container quit unexpectedly on Wed May 20 17:15:02 UTC 2015:. Check snapshot logs for details.",
                    "severity": "ERROR",
                    "timestamp": 1432142102
                }
            ]
        }
    ],
    "truncated": "true"
}
And after the build completes:
[2015-05-20T17:15:02.694Z] INFO [8603] - [CMD-AppDeploy/AppDeployStage0/AppDeployPreHook/04run.sh] : Activity execution failed, because: cat: /var/app/current/Dockerrun.aws.json: No such file or directory
cat: /var/app/current/Dockerrun.aws.json: No such file or directory
73927c49adff622a1a229d9369bdd80674d96d20f3eb99a9cdea786f4411a368
Docker container quit unexpectedly after launch: Docker container quit unexpectedly on Wed May 20 17:15:02 UTC 2015:. Check snapshot logs for details. (ElasticBeanstalk::ExternalInvocationError)
caused by: cat: /var/app/current/Dockerrun.aws.json: No such file or directory
cat: /var/app/current/Dockerrun.aws.json: No such file or directory
73927c49adff622a1a229d9369bdd80674d96d20f3eb99a9cdea786f4411a368
Docker container quit unexpectedly after launch: Docker container quit unexpectedly on Wed May 20 17:15:02 UTC 2015:. Check snapshot logs for details. (Executor::NonZeroExitStatus)
The Docker containers work locally, so what else can I do to figure out what's going wrong? I keep hearing about "snapshot logs", but where do I find them? Are they the same output I already get from running "eb logs"?
I had this issue for a day or two. I managed to see the logs by going to AWS Console > Elastic Beanstalk > Environment > ${YOUR_APPLICATION_ENV}.
In the left pane:
Logs > Request Logs > Download > open in any text editor.
/var/log/eb-docker/containers/eb-current-app/
Look for that path in the downloaded logs and you will see what is causing the error, so you can fix it.
Assuming you have SSH access to the EC2 instance running your container, these are a few log files useful for debugging single container Docker instances in Beanstalk:
/tmp/docker_build.log
/tmp/docker_pull.log
/tmp/docker_run.log
In order to look at the error logs for the running process, first read the
/tmp/docker_run.log file. This file contains the Docker process id. Something like this:
c6ae58e4ad77e926f6a8230237acf95771c6b5d80d48fb1bc20591f964fd690c
The first few characters should match the process listed from the command docker ps. Use this value to find the corresponding log file in the following directory:
/var/log/eb-docker/containers/eb-current-app/
The format of the file name is eb-docker-ps-id-stdouterr.log
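The lookup described above can be sketched as a small shell function to run on the instance. find_container_log is a hypothetical helper; matching on the first 12 characters of the ID and the file-name glob are assumptions based on the naming described above.

```shell
#!/bin/sh
# Print the path of the stdout/stderr log for the container whose ID is
# recorded in the Docker run log.
# find_container_log is a hypothetical helper, not an Elastic Beanstalk tool.
find_container_log() {
    run_log="${1:-/tmp/docker_run.log}"
    log_dir="${2:-/var/log/eb-docker/containers/eb-current-app}"
    # Assumption: the container ID is the last line made of 64 hex characters.
    container_id=$(grep -E '^[0-9a-f]{64}$' "$run_log" | tail -n 1)
    [ -n "$container_id" ] || return 1
    # Assumption: the log file name contains the 12-character short ID.
    short_id=$(printf '%s' "$container_id" | cut -c1-12)
    for candidate in "$log_dir"/*"$short_id"*-stdouterr.log; do
        [ -e "$candidate" ] && printf '%s\n' "$candidate" && return 0
    done
    return 1
}

# Usage on the EC2 instance:
#   tail -n 100 "$(find_container_log)"
```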
I had this issue when my containers were crashing because no traffic was allowed between Elastic Beanstalk and RDS. If you use a database, try curling it. You might also give sudo docker logs CONTAINER_ID a try to catch something useful. It may also help to launch the container manually from the instance; there's a slight possibility something will come up.

How to run gerrit cookbook inside docker containers?

I'm running the community gerrit cookbook in Docker using chef-solo.
If I run the cookbook in a Dockerfile as a build step, it throws an error (see the log). But if I run the image, go inside the container, and run the same command, it works fine.
Any idea what's going on?
It complains about sudo, yet continues and creates the symbolic link. 'target_mode = nil' should not be a problem, since it complains about the same thing when I run the command inside the container, where it works fine. It ends up complaining about the init.d script, which does not make sense.
chef-solo as a build step:
RUN chef-solo --log_level debug -c /resources/solo.rb -j /resources/node.json
Logs:
[ :08+01:00] INFO: Processing ruby_block[gerrit-init] action run (gerrit::default line 225)
sudo: sorry, you must have a tty to run sudo
[ :08+01:00] INFO: /opt/gerrit/war/gerrit-2.7.war exist....initailizing gerrit
[ :08+01:00] INFO: ruby_block[gerrit-init] called
[ :08+01:00] INFO: Processing link[/etc/init.d/gerrit] action create (gerrit::default line 240)
[ :08+01:00] DEBUG: link[/etc/init.d/gerrit] created symbolic link from /etc/init.d/gerrit -> /opt/gerrit/install/bin/gerrit.sh
[ :08+01:00] INFO: link[/etc/init.d/gerrit] created
[ :08+01:00] DEBUG: found target_mode == nil, so no mode was specified on resource, not managing mode
[ :08+01:00] DEBUG: found target_uid == nil, so no owner was specified on resource, not managing owner
[ :08+01:00] DEBUG: found target_gid == nil, so no group was specified on resource, not managing group
[ :08+01:00] INFO: Processing link[/etc/rc3.d/S90gerrit] action create (gerrit::default line 244)
[ :08+01:00] DEBUG: link[/etc/rc3.d/S90gerrit] created symbolic link from /etc/rc3.d/S90gerrit -> ../init.d/gerrit
[ :08+01:00] INFO: link[/etc/rc3.d/S90gerrit] created
[ :08+01:00] DEBUG: found target_mode == nil, so no mode was specified on resource, not managing mode
[ :08+01:00] DEBUG: found target_uid == nil, so no owner was specified on resource, not managing owner
[ :08+01:00] DEBUG: found target_gid == nil, so no group was specified on resource, not managing group
[ :08+01:00] INFO: Processing service[gerrit] action enable (gerrit::default line 248)
[ :08+01:00] DEBUG: service[gerrit] supports status, running
================================================================================
Error executing action `enable` on resource 'service[gerrit]'
================================================================================
Chef::Exceptions::Service
-------------------------
service[gerrit]: unable to locate the init.d script!
Resource Declaration:
---------------------
# In /var/chef/cookbooks/gerrit/recipes/default.rb
248: service 'gerrit' do
249: supports :status => false, :restart => true, :reload => true
250: action [ :enable, :start ]
251: end
252:
Compiled Resource:
------------------
# Declared in /var/chef/cookbooks/gerrit/recipes/default.rb:248:in `from_file'
service("gerrit") do
action [:enable, :start]
supports {:status=>true, :restart=>true, :reload=>true}
retries 0
retry_delay 2
guard_interpreter :default
service_name "gerrit"
pattern "gerrit"
cookbook_name :gerrit
recipe_name "default"
end
Containers are not virtual machines: they typically run a single process and have no process manager running. This explains why chef-solo has trouble with service resources.
I would suggest reading about some of the emerging support that chef is designing for containers:
https://docs.getchef.com/containers.html
https://github.com/opscode/chef-init
I don't pretend it makes lots of sense at first read. I am yet to be convinced that chef is the best way to build a container.
The actual error was "sudo: sorry, you must have a tty to run sudo": no terminal is allocated during the build, and sudo refuses to run without one for security reasons.
By default Docker runs as root, so there is no need to use sudo. The cookbook I was running created a 'gerrit' user, which is what made sudo necessary. I removed the user and ran everything as root. Solved!

Resources