Molecule : Testing roles : Failed to get Dbus Connection Operation not permitted - docker

I facing an issue on my Molecule Test. I have begin to study this tool 2 days ago for information.
on a Ubuntu VM running with Vagrant,I have create a role and initialze Molecule's folder and create a testinfra test file ( with the docker provider ).
The error is when my task's role are running, at the step of checking service running, it failed.
fatal: [instance]: FAILED! => {"changed": false, "msg": "Could not find the requested service httpd: "}
I was design to simply install 2 packages including httpd on a Centos Image.
When im loggin directly to the Molecule VM ( so through docker ), when i simply type systemctl the error message is
Failed to get D-Bus connection: Operation not permitted
As adviced Geerlingguy, i have specify volume mapped on cgroup folder
platforms:
- name: instance
#image: docker.io/pycontribs/centos:7
image: geerlingguy/docker-${MOLECULE_DISTRO:-centos7}-ansible:latest
volumes:
- /sys/fs/cgroup:/sys/fs/cgroup:ro
The error is not related to Testinfra but only the docker built image.
Could someone help me to understand why this error message ?
Is that because im on a VirtualBox ran by Vagrant ?
Thanks all for reading :-)

I have added that on my mocule.yml file config according molecule documentation ( https://molecule.readthedocs.io/en/latest/examples.html#docker ) :
platforms:
- name: instance
#image: docker.io/pycontribs/centos:7
image: geerlingguy/docker-centos7-ansible:latest
capabilities:
- SYS_ADMIN
command: /sbin/init
systemctl working fine now

Related

How to deploy container Gitlab on a mounted point?

I'm trying to mount one container Gitlab on my CIFS mounted point, but that returns errors.
I have my Nas CIFS mounted point at /mnt/serveurwiki/ and i have 2 folders in "gitlab" and "wiki".
As you can see, I have all rights on the folder.
And this is my docker-compose.yml:
version: '3.6'
services:
gitlab:
image: gitlab/gitlab-ce:latest
user: root
ports:
- '42007:80'
- '42008:443'
- '42009:22'
volumes:
- /mnt/serveurwiki/gitlab/config:/etc/gitlab
- /mnt/serveurwiki/gitlab/logs:/var/log/gitlab
- /mnt/serveurwiki/gitlab/data:/var/opt/gitlab
networks:
- network
networks:
network:
For precision, I tested the compose in other positions and all work, but when I try to deploy my container in my point mount (/mnt/serveurwiki), I have that error in the logs of the container.
[2022-07-11T08:56:12+00:00] FATAL: Stacktrace dumped to /opt/gitlab/embedded/cookbooks/cache/chef-stacktrace.out
[2022-07-11T08:56:12+00:00] FATAL: ---------------------------------------------------------------------------------------
[2022-07-11T08:56:12+00:00] FATAL: PLEASE PROVIDE THE CONTENTS OF THE stacktrace.out FILE (above) IF YOU FILE A BUG REPORT
[2022-07-11T08:56:12+00:00] FATAL: ---------------------------------------------------------------------------------------
[2022-07-11T08:56:12+00:00] FATAL: Mixlib::ShellOut::ShellCommandFailed: storage_directory[/var/opt/gitlab/.ssh] (gitlab::gitlab-shell line 34) had an error: Mixlib::ShellOut::ShellCommandFailed: ruby_block[directory resource: /var/opt/gitlab/.ssh] (gitlab::gitlab-shell line 36) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '1'
---- Begin output of chgrp git /var/opt/gitlab/.ssh ----
STDOUT:
STDERR: chgrp: changing group of '/var/opt/gitlab/.ssh': Operation not permitted
---- End output of chgrp git /var/opt/gitlab/.ssh ----
Ran chgrp git /var/opt/gitlab/.ssh returned 1
Anyone have an idea what I can do for that and why I have that error?
For testing, you could:
build your own image based on gitlab/gitlab-ce:latest
change the ENTRYPOINT to call your own script before calling the one from GitLab
check/change the permissions of /var/opt/gitlab in your custom ENTRYPOINT script.
That way, you have a better control of the environment used by the GitLab image.

Skaffold dev fails

I am having this error, after running skaffold dev.
Step 1/6 : FROM node:current-alpine3.11
exiting dev mode because first build failed: unable to stream build output: Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io on 192.168.49.1:53: read udp 192.168.49.2:35889->192.168.49.1:53: i/o timeout. Please fix the Dockerfile and try again..
Here is skaffold.yml
apiVersion: skaffold/v2beta11
kind: Config
metadata:
name: *****
build:
artifacts:
- image: 127.0.0.1:32000/auth
context: auth
docker:
dockerfile: Dockerfile
deploy:
kubectl:
manifests:
- infra/k8s/auth-depl.yaml
local:
push: false
artifacts:
- image: 127.0.0.1:32000/auth
context: auth
docker:
dockerfile: Dockerfile
sync:
manual:
- src: "src/**/*.ts"
dest: .
I have tried all possible solutions I saw online, including adding 8.8.8.8 as the DNS, but the error still persists. I am using Linux and running ubuntu, I am also using Minikube locally. Please assist.
This is a Community Wiki answer, posted for better visibility, so feel free to edit it and add any additional details you consider important.
In this case:
minikube delete && minikube start
solved the problem but you can start from restarting docker daemon. Since this is Minikube cluster and Skaffold uses for its builds Minikube's Docker daemon, as suggested by Brian de Alwis in his comment, you may start from:
minikube stop && minikube start
or
minikube ssh
su
systemctl restart docker
I searched for similar errors and in many cases e.g. here or in this thread, setting up your DNS to something reliable like 8.8.8.8 may also help:
sudo echo "nameserver 8.8.8.8" >> /etc/resolv.conf
in case you use Minikube you should first:
minikube ssh
su ### to become root
and then run:
echo "nameserver 8.8.8.8" >> /etc/resolv.conf
The following error message:
Please fix the Dockerfile and try again
may be somewhat misleading in similar cases as Dockerfile is probably totally fine, but as we can read in other part:
lookup registry-1.docker.io on 192.168.49.1:53: read udp 192.168.49.2:35889->192.168.49.1:53: i/o timeout.
it's definitely related with failing DNS lookup. This is well described here as well known issue.
Get i/o timeout
Get https://index.docker.io/v1/repositories//images: dial tcp: lookup on :53: read udp :53: i/o timeout
Description
The DNS resolver configured on the host cannot resolve the registry’s
hostname.
GitHub link
N/A
Workaround
Retry the operation, or if the error persists, use another DNS
resolver. You can do this by updating your /etc/resolv.conf file
with these or other DNS servers:
nameserver 8.8.8.8 nameserver 8.8.4.4

Clustering vernemq docker containers running on different machines

I was hoping this would be an easy one by just using the below snippet on the second instance's docker-compose.yml file
- DOCKER_VERNEMQ_DISCOVERY_NODE=<ip address of the first instance>
but that doesn't seem to work.
Log of the second instance confirms it's attempting to cluster:
13:56:09.795 [info] Sent join request to: 'VerneMQ#<ip address of the first instance>'
13:56:16.800 [info] Unable to connect to 'VerneMQ#<ip address of the first instance>'
While the log of the first instance does not show anything at all.
From within the second instance I can confirm that the endpoint is accessible:
$ docker exec -it vernemq /bin/sh
$ curl <ip address of the first instance>:44053
curl: (56) Recv failure: Connection reset by peer
then in the log of the first instance I see an error which is totally expected and confirms I've reached the first instance
13:58:33.572 [error] CRASH REPORT Process <0.3050.0> with 0 neighbours crashed with reason: bad argument in vmq_cluster_com:process_bytes/3 line 142
13:58:33.572 [error] Ranch listener {{172,19,0,2},44053} terminated with reason: bad argument in vmq_cluster_com:process_bytes/3 line 142
It might have to do with the fact that ip address as seen from within the docker container is 172.19.0.2 while the external one is 10. ....
Also tried adding hostname of the first instance to known_hosts to no avail.
Please advise.
I'm using erlio/docker-vernemq:1.10.0
$ docker --version
Docker version 19.03.13, build 4484c46d9d
$ docker-compose --version
docker-compose version 1.27.2, build 18f557f9
I managed to get this sorted by creating a docker overlay network
on machine1: docker swarm init
on machine2: docker swarm join --token ...
on machine1: docker network create --driver=overlay --attachable vernemq-overlay-net
The relevant bits of my dockerfile are:
version: '3.6'
services:
vernemq:
container_name: ${NODE_NAME:?Node name not specified}
image: vernemq/vernemq:1.10.4.1
environment:
- DOCKER_VERNEMQ_NODENAME=${NODE_NAME:?Node name not specified}
- DOCKER_VERNEMQ_DISCOVERY_NODE=${DISCOVERY_NODE:-}
networks:
default:
external:
name: vernemq-overlay-net
with the following env vars:
machine1:
NODE_NAME=vernemq1.example.com
DISCOVERY_NODE=
machine2:
NODE_NAME=vernemq2.example.com
DISCOVERY_NODE=vernemq1.example.com
Note:
Chances are machine2 won't find vernemq-overlay-net due to a bug in docker-compose as far as I remember.
In that case you start a container with docker: docker run -dit --name alpine --net=vernemq-overlay-net alpine which will make it available for docker-compose.

Filebeat not running using docker-compose: setting 'filebeat.prospectors' has been removed

I'm trying to launch filebeat using docker-compose (I intend to add other services later on) but every time I execute the docker-compose.yml file, the filebeat service always ends up with the following error:
filebeat_1 | 2019-08-01T14:01:02.750Z ERROR instance/beat.go:877 Exiting: 1 error: setting 'filebeat.prospectors' has been removed
filebeat_1 | Exiting: 1 error: setting 'filebeat.prospectors' has been removed
I discovered the error by accessing the docker-compose logs.
My docker-compose file is as simple as it can be at the moment. It simply calls a filebeat Dockerfile and launches the service immediately after.
Next to my Dockerfile for filebeat I have a simple config file (filebeat.yml), which is copied to the container, replacing the default filebeat.yml.
If I execute the Dockerfile using the docker command, the filebeat instance works just fine: it uses my config file and identifies the "output.json" file as well.
I'm currently using version 7.2 of filebeat and I know that the "filebeat.prospectors" isn't being used. I also know for sure that this specific configuration isn't coming from my filebeat.yml file (you'll find it below).
It seems that, when using docker-compose, the container is accessing another configuration file instead of the one that is being copied to the container, by the Dockerfile, but so far I haven't been able to figure it out how, why and how can I fix it...
Here's my docker-compose.yml file:
version: "3.7"
services:
filebeat:
build: "./filebeat"
command: filebeat -e -strict.perms=false
The filebeat.yml file:
filebeat.inputs:
- paths:
- '/usr/share/filebeat/*.json'
fields_under_root: true
fields:
tags: ['json']
output:
logstash:
hosts: ['localhost:5044']
The Dockerfile file:
FROM docker.elastic.co/beats/filebeat:7.2.0
COPY filebeat.yml /usr/share/filebeat/filebeat.yml
COPY output.json /usr/share/filebeat/output.json
USER root
RUN chown root:filebeat /usr/share/filebeat/filebeat.yml
RUN mkdir /usr/share/filebeat/dockerlogs
USER filebeat
The output I'm expecting should be similar to the following, which comes from the successful executions I'm getting when I'm executing it as a single container.
The ERROR is expected because I don't have logstash configured at the moment.
INFO crawler/crawler.go:72 Loading Inputs: 1
INFO log/input.go:148 Configured paths: [/usr/share/filebeat/*.json]
INFO input/input.go:114 Starting input of type: log; ID: 2772412032856660548
INFO crawler/crawler.go:106 Loading and starting Inputs completed. Enabled inputs: 1
INFO log/harvester.go:253 Harvester started for file: /usr/share/filebeat/output.json
INFO pipeline/output.go:95 Connecting to backoff(async(tcp://localhost:5044))
ERROR pipeline/output.go:100 Failed to connect to backoff(async(tcp://localhost:5044)): dial tcp [::1]:5044: connect: cannot assign requested address
INFO pipeline/output.go:93 Attempting to reconnect to backoff(async(tcp://localhost:5044)) with 1 reconnect attempt(s)
ERROR pipeline/output.go:100 Failed to connect to backoff(async(tcp://localhost:5044)): dial tcp [::1]:5044: connect: cannot assign requested address
INFO pipeline/output.go:93 Attempting to reconnect to backoff(async(tcp://localhost:5044)) with 2 reconnect attempt(s)
I managed to figure out what the problem was.
I needed to map the location of the config file and logs directory in the docker-compose file, using the volumes tag:
version: "3.7"
services:
filebeat:
build: "./filebeat"
command: filebeat -e -strict.perms=false
volumes:
- ./filebeat/filebeat.yml:/usr/share/filebeat/filebeat.yml
- ./filebeat/logs:/usr/share/filebeat/dockerlogs
Finally I just had to execute the docker-compose command and everything start working properly:
docker-compose -f docker-compose.yml up -d

using ansible with docker-compose

I am trying to deploy a docker setup using Ansible playbook. For this, I am using docker_service.
My Playbook looks like:
---
- name: Run Docker compose
hosts: all
gather_facts: no
tasks:
- debug: msg="Container - {{ inventory_hostname }}"
- docker_service:
project_src: "compose"
state: absent
- docker_service:
project_src: "compose"
state: present
Upon running this simple playbook as:
ansible-playbook -v playbook.yml --ask-sudo-pass
I added --ask-sudo-pass to ensure that it was not a permission issue.
OUTPUT
SUDO password:
PLAY [Run Docker compose] ******************************************************
TASK [debug] *******************************************************************
ok: [prolims-staging] => {
"msg": "Container - prolims-staging"
}
TASK [docker_service] **********************************************************
fatal: [prolims-staging]: FAILED! => {"changed": false, "msg": "Error connecting: Error while fetching server API version: ('Connection aborted.', error(13, 'Permission denied'))"}
to retry, use: --limit #/data/prolims-provision/provision-docker.retry
PLAY RECAP *********************************************************************
prolims-staging : ok=1 changed=0 unreachable=0 failed=1
I did try looking out for this issue on other forums as well ( and similar questions on this StackOverflow too), but those were not helpful.
Note: I am able to run docker-compose successfully in the target machine from its CLI (using sudo).
Also, I tried playing around with docker_container as well. I tried to execute a playbook with contents below:
...
- name: check container status
command: docker ps
register: result
- name: Create a container
docker_container:
name: db_pg
image: "postgres:latest"
state: present
recreate: yes
...
and running this playbook works perfectly fine.
I assume, posting my docker-compose file might not be relevant here.
I followed this example, but did not work. Maybe, I might be missing some stupid or really important thing here.
Any help on understanding and resolving this issue would be appreciated.
I am able to run docker-compose successfully in the target machine from its CLI (using sudo).
So you need to use become declaration for the task.
I added --ask-sudo-pass to ensure that it was not a permission issue.
Just adding --ask-sudo-pass to the ansible-playbook parameters doesn't have any effect unless the relevant tasks/plays have become declaration (and become_method is set to sudo, but this is by default).
Reference.

Resources