How to make Ansible get access to an sshd container? - docker

I use an Ansible playbook to pull and start the https://hub.docker.com/r/rastasheep/ubuntu-sshd/ container.
It starts fine:
bash-4.4$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
8bedbd3b7d88 rastasheep/ubuntu-sshd "/usr/sbin/sshd -D" 37 minutes ago Up 36 minutes 0.0.0.0:49154->22/tcp test
bash-4.4$
After Ansible failed to reach it over SSH, I tested manually from the shell, and that works:
bash-4.4$ ssh root@172.17.0.2
The authenticity of host '172.17.0.2 (172.17.0.2)' can't be established.
ECDSA key fingerprint is SHA256:YtTfuoRRR5qStSVA5UuznGamA/dvf+djbIT6Y48IYD0.
ECDSA key fingerprint is MD5:43:3f:41:e9:89:45:06:6f:f6:42:c4:6a:70:37:f8:1d.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '172.17.0.2' (ECDSA) to the list of known hosts.
root@172.17.0.2's password:
root@8bedbd3b7d88:~# logout
Connection to 172.17.0.2 closed.
bash-4.4$
The step that fails is connecting to the container from the Ansible playbook in order to set up key access (ssh-copy-id).
The Ansible error message is:
Fatal: [172.17.0.2]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: Warning: Permanently added '172.17.0.2' (ECDSA) to the list of known hosts.\r\nPermission denied (publickey,password).\r\n", "unreachable": true}
---
- hosts: 127.0.0.1
  tasks:
    - name: start docker service
      service:
        name: docker
        state: started
    - name: load and start the container we wanna use
      docker_container:
        name: test
        image: rastasheep/ubuntu-sshd
        state: started
        ports:
          - "49154:22"
    - name: Wait maximum of 300 seconds for ports to be available
      wait_for:
        host: 0.0.0.0
        port: 49154
        state: started
- hosts: 172.17.0.2
  vars:
    passwordadmin: $6$pbE6yznA$AeFIdI.....K0
    passwordroot: $6$TMrxQUxT$I8.JIzR.....TV1
    ansible_ssh_extra_args: "-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null"
  tasks:
    - name: Build test container root user rsa ssh-key
      shell: docker exec test ssh-keygen -b 2048 -t rsa -f /root/.ssh/id_rsa -q -N ""
So I cannot even run the step needed to set up SSH.
How should I do this, then?
1st step (Ansible task): load the Docker container
2nd step (Ansible tasks on 172.17.0.2 only): connect to it and set it up
There will be a 3rd step to run the application on it after that.
The problem occurs only when starting the 2nd step.

OK, after many tries on a second container, my conclusion is that my procedure was wrong.
What I did to solve it:
built a directory tree separating ./, ./inventory and ./includes
built one YAML file per host (local, docker, labo)
built one main YAML file in ./
built a new hosts file in ./inventory
forced the connection to the Docker container with sshpass, using the default password
changed that password
added the host key to the authorized keys of a dedicated login
installed Python on the container (Ansible needs it on the target; without it I got seemingly random module errors or refused connections, depending on the current action)
set up an SSH login user in sudoers
Only then can I run the docker.yaml actions,
and only after that can I run the labo.yaml actions.
Thanks for the help,
now I'm able to build the missing tools.
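A minimal sketch of the "new hosts file" step, assuming an INI-style inventory. The function name `inventory_line`, the alias `sshd_test` and the choice to reach the container through its published port 49154 are illustrative assumptions, not part of the original setup. `ansible_host`, `ansible_port`, `ansible_user` and `ansible_ssh_common_args` are standard Ansible connection variables.

```shell
#!/bin/sh
# Print one Ansible inventory line for a container reachable over SSH.
# Arguments (all illustrative): alias, host, port, user.
# The password is deliberately not written to the file; it can be supplied
# at run time with --ask-pass, which needs sshpass on the control machine.
inventory_line() {
  alias_name="$1"; host="$2"; port="$3"; user="$4"
  printf "%s ansible_host=%s ansible_port=%s ansible_user=%s ansible_ssh_common_args='-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null'\n" \
    "$alias_name" "$host" "$port" "$user"
}

# Example: the sshd container published on host port 49154.
inventory_line sshd_test 127.0.0.1 49154 root
```

The output can be redirected into ./inventory/hosts. Until Python is installed in the container, only modules that do not need it on the target (such as raw) are likely to work there.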

Related

How to make Drone Docker Plugin use self-signed certs?

I'm facing the same problem as here - I have set up a private Docker Registry with TLS certification (certificates generated via Certbot), and I can interact with it directly via curl etc. (thus proving that the certificate is correct), but the Docker Plugin in my Drone flow gives an error x509: certificate signed by unknown authority.
As per this StackOverflow answer, I believe that putting the certificate at /etc/docker/certs.d/<my_registry_address:port>/ca.crt should fix this problem, but it doesn't appear to (neither does adding the certificate into the standard /etc/ssl/certs/ca-certificates.crt location)
Demonstration that the certificates work as-expected, having already built the Docker Drone Plugin locally as per https://github.com/drone-plugins/drone-docker:
$ docker run --rm -v <path_to_directory_containing_pems>:/custom-certs -it --entrypoint /bin/sh plugins/docker
/ # ls /custom-certs
accounts archive csr keys live renewal renewal-hooks
/ # apk add curl
...
OK: 28 MiB in 56 packages
/ # curl https://docker-registry.scubbo.org:8843/v2/_catalog
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.se/docs/sslcerts.html
curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
/ # curl https://docker-registry.scubbo.org:8843/v2/_catalog --cacert /custom-certs/live/docker-registry.scubbo.org/fullchain.pem
{"repositories":[...]}
/ # cat /custom-certs/live/docker-registry.scubbo.org/fullchain.pem >> /etc/ssl/certs/ca-certificates.crt
/ # curl https://docker-registry.scubbo.org:8843/v2/_catalog
{"repositories":[...]}
Here's my .drone.yml, for a Runner instantiated with --env=DRONE_RUNNER_VOLUMES=/var/run/docker.sock:/var/run/docker.sock,<path_to_directory_containing_pems>:/custom-certs:
kind: pipeline
name: hello-world
type: docker
platform:
  os: linux
  arch: arm64
steps:
- name: copy-cert-into-place
  image: busybox
  volumes:
  - name: docker-cert-persistence
    path: /etc/docker/certs.d/
  commands:
  # https://stackoverflow.com/a/56410355/1040915
  # Note that we need to mount the whole `custom-certs` directory into the workflow and then copy the file to `/etc/...`,
  # rather than mounting the file directly into `/etc/...`, because the original file is a symlink and it's not possible (AFAIK)
  # to instruct Docker to "mount the eventual-target-of this symlink into <location>"
  - mkdir -p /etc/docker/certs.d/docker-registry.scubbo.org:8843
  - cp -L /custom-certs/live/docker-registry.scubbo.org/fullchain.pem /etc/docker/certs.d/docker-registry.scubbo.org:8843/ca.crt
- name: check-cert-persists-between-stages
  image: alpine
  volumes:
  - name: docker-cert-persistence
    path: /etc/docker/certs.d/
  commands:
  - apk add curl
  # The command below would fail if the cert was unavailable or invalid
  - curl https://docker-registry.scubbo.org:8843/v2/_catalog --cacert /etc/docker/certs.d/docker-registry.scubbo.org:8843/ca.crt
- name: build-image
  # ...contents irrelevant to this question...
- name: push-built-image
  image: plugins/docker
  volumes:
  - name: docker-cert-persistence
    path: /etc/docker/certs.d/
  settings:
    repo: docker-registry.scubbo.org:8843/scubbo/blog_nginx
    tags: built_in_ci
    debug: true
    launch_debug: true
volumes:
- name: docker-cert-persistence
  temp: {}
giving these logs from push-built-image step - ending in...
+ /usr/local/bin/docker tag 472d41d9c03ee60fe9c1965ad9cfd36a1cdb6cbf docker-registry.scubbo.org:8843/scubbo/blog_nginx:built_in_ci
+ /usr/local/bin/docker push docker-registry.scubbo.org:8843/scubbo/blog_nginx:built_in_ci
The push refers to repository [docker-registry.scubbo.org:8843/scubbo/blog_nginx]
Get "https://docker-registry.scubbo.org:8843/v2/": x509: certificate signed by unknown authority
exit status 1
How should I go about providing the CA Certificate to my Drone Docker Plugin step to permit it to communicate over TLS with a secure Docker registry? This answer suggests simply reverting to insecure integration, which works but is unsatisfactory.
EDIT: After re-reading this documentation, I extended the copy-cert-into-place commands to copy all 3 certificate-related files:
commands:
- mkdir -p /etc/docker/certs.d/docker-registry.scubbo.org:8843
- cp -L /custom-certs/live/docker-registry.scubbo.org/fullchain.pem /etc/docker/certs.d/docker-registry.scubbo.org:8843/ca.crt
- cp -L /custom-certs/live/docker-registry.scubbo.org/privkey.pem /etc/docker/certs.d/docker-registry.scubbo.org:8843/client.key
- cp -L /custom-certs/live/docker-registry.scubbo.org/cert.pem /etc/docker/certs.d/docker-registry.scubbo.org:8843/client.cert
but that did not resolve the problem - same x509: certificate signed by unknown authority error.
EDIT2: I directly confirmed (directly on a host, outside the context of a plugin or docker container) that adding the certificate to the path used above is sufficient to permit interaction with the registry:
$ docker pull docker-registry.scubbo.org:8843/scubbo/blog_nginx:built_in_ci
Error response from daemon: Get "https://docker-registry.scubbo.org:8843/v2/": x509: certificate signed by unknown authority
$ sudo cp -L <path_to_directory_containing_pems>/live/docker-registry.scubbo.org/chain.pem /etc/docker/certs.d/docker-registry.scubbo.org\:8843/ca.crt
$ docker pull docker-registry.scubbo.org:8843/scubbo/blog_nginx:built_in_ci
built_in_ci: Pulling from scubbo/blog_nginx
Digest: sha256:3a17f86f23050303d94443f24318b49fb1a5e2d0cc9228270678c8aa55b4d2c2
Status: Image is up to date for docker-registry.scubbo.org:8843/scubbo/blog_nginx:built_in_ci
docker-registry.scubbo.org:8843/scubbo/blog_nginx:built_in_ci
This isn't a complete answer, but I was able to get secure registry access working by switching from mounting a directory, to mounting the file directly:
I changed the docker run option to --env=DRONE_RUNNER_VOLUMES=/var/run/docker.sock:/var/run/docker.sock,$(readlink -f <path_to_directory_containing_pems>/live/docker-registry.scubbo.org/chain.pem):/registry_cert.crt
I changed the commands in copy-cert-into-place to:
- mkdir -p /etc/docker/certs.d/docker-registry.scubbo.org:8843
- cp /registry_cert.crt /etc/docker/certs.d/docker-registry.scubbo.org:8843/ca.crt
I don't consider this a complete answer (and would love further input or advice!), because:
I don't know why copying the file out of the mounted directory into /etc/docker/... (as in the original question) didn't work, but mounting the file directly from the host filesystem worked. (Note that the check-cert-persists-between-stages stage confirms that the certificate is correct, so it's not a mistake of copying a wrong or empty file)
I don't know how to mount the file directly into an in-stage path that contains a colon - this answer indicates how to mount a path containing a colon directly into a container, but in this case we're passing the path to DRONE_RUNNER_VOLUMES
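For reference, the copy step used in both the question and this workaround can be sketched as a small shell function. This is only an illustration under assumptions: the name `install_registry_ca` and the optional third argument (an alternative certs.d root, handy for trying it outside /etc) are hypothetical, and it does not explain why the directory mount behaved differently.

```shell
#!/bin/sh
# Copy a CA certificate into Docker's per-registry trust directory.
#   $1  registry address as host:port
#   $2  path to the certificate (may be a symlink, e.g. under certbot's live/)
#   $3  certs.d root; defaults to /etc/docker/certs.d (hypothetical parameter)
install_registry_ca() {
  registry="$1"; src="$2"; root="${3:-/etc/docker/certs.d}"
  [ -r "$src" ] || { echo "cannot read $src" >&2; return 1; }
  mkdir -p "$root/$registry" || return 1
  # -L dereferences symlinks, so a regular file is written rather than a link
  cp -L "$src" "$root/$registry/ca.crt"
}
```

In the pipeline above this would be called as `install_registry_ca docker-registry.scubbo.org:8843 /registry_cert.crt`. The readability check fails early if the mounted path is a dangling symlink inside the container.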

How to make Jenkins use a private key and passphrase to run an Ansible playbook

I am using Jenkins to run some Ansible playbooks. One of the simple tests I did was to have the playbook to cat the fstab file on a remote server:
The playbook looks like this:
---
- hosts: "test-1-server"
  tasks:
    - name: display /etc/fstab
      shell: cat /etc/fstab
      register: fstab_reg
    - debug: msg="{{ fstab_reg.stdout }}"
In Jenkins, I have a freestyle project that uses Invoke Ansible Playbook to call the above playbook. The project credentials are set up with a different user, ansible-user, rather than the default jenkins user that runs Jenkins. User ansible-user can ssh to all my servers. I have ansible-user set up in Jenkins Credentials with its private key and passphrase. But when I run the project, I get an error:
[update_fstab] $ /usr/bin/ansible-playbook google/ansible/test-scripts/test/sub_book.yml -i /etc/ansible/hosts -f 5 --private-key /tmp/ssh14117407503194058572.key -u ansible-user
[WARNING]: Invalid characters were found in group names but not replaced, use
-vvvv to see details
fatal: [test-1-server]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ansible-user@test-1-server: Permission denied (publickey).", "unreachable": true}
I am not quite sure what exactly the error is saying, as I have set up the private key and passphrase in ansible-user's credentials. What do the group names in the message mean? Because this is done through Jenkins, I am not sure how to pass -vvvv as it suggests.
How can I make Jenkins pass the private key and passphrase to the Ansible playbook?
Thanks!
I think I have found the "issue". After I switched to a user other than ansible-user, the playbook worked. The interesting thing is that when I created the key pair for ansible-user, I used "-m PEM", which should be fine for Jenkins.
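On the group-names question: that warning is separate from the UNREACHABLE error. Ansible emits it when an inventory group name contains characters such as hyphens, or starts with a digit. A rough sketch for finding such names in an INI-style inventory follows; the function name `invalid_group_names` is made up for illustration, and the pattern is a simplified ASCII approximation of Ansible's rule.

```shell
#!/bin/sh
# List group names in an INI-style Ansible inventory that contain characters
# other than letters, digits and underscores, or that start with a digit.
#   $1  inventory path (the log above uses /etc/ansible/hosts)
invalid_group_names() {
  # Take "[name]" or "[name:children]" headers, keep only the name,
  # then drop every name that is already valid. grep exits 1 when
  # nothing is left, which is not an error here.
  sed -n 's/^\[\([^]:]*\).*\].*$/\1/p' "$1" |
    grep -Ev '^[A-Za-z_][A-Za-z0-9_]*$' || [ $? -eq 1 ]
}
```

Running `invalid_group_names /etc/ansible/hosts` prints one offending group name per line and prints nothing when all names are fine. Host names such as test-1-server are not affected; only group headers are checked.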

How to run minikube inside a docker container?

I intend to test a non-trivial Kubernetes setup as part of CI and wish to run the full system before CD. I cannot run --privileged containers and am running the docker container as a sibling to the host using docker run -v /var/run/docker.sock:/var/run/docker.sock
The basic docker setup seems to be working on the container:
linuxbrew@03091f71a10b:~$ docker run hello-world
Hello from Docker!
This message shows that your installation appears to be working correctly.
However, minikube fails to start inside the docker container, reporting connection issues:
linuxbrew@03091f71a10b:~$ minikube start --alsologtostderr -v=7
I1029 15:07:41.274378 2183 out.go:298] Setting OutFile to fd 1 ...
I1029 15:07:41.274538 2183 out.go:345] TERM=xterm,COLORTERM=, which probably does not support color
...
...
...
I1029 15:20:27.040213 197 main.go:130] libmachine: Using SSH client type: native
I1029 15:20:27.040541 197 main.go:130] libmachine: &{{{<nil> 0 [] [] []} docker [0x7a1e20] 0x7a4f00 <nil> [] 0s} 127.0.0.1 49350 <nil> <nil>}
I1029 15:20:27.040593 197 main.go:130] libmachine: About to run SSH command:
sudo hostname minikube && echo "minikube" | sudo tee /etc/hostname
I1029 15:20:27.040992 197 main.go:130] libmachine: Error dialing TCP: dial tcp 127.0.0.1:49350: connect: connection refused
This is despite the network being linked and the port being properly forwarded:
linuxbrew@51fbce78731e:~$ docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
93c35cec7e6f gcr.io/k8s-minikube/kicbase:v0.0.27 "/usr/local/bin/entr…" 2 minutes ago Up 2 minutes 127.0.0.1:49350->22/tcp, 127.0.0.1:49351->2376/tcp, 127.0.0.1:49348->5000/tcp, 127.0.0.1:49349->8443/tcp, 127.0.0.1:49347->32443/tcp minikube
51fbce78731e 7f7ba6fd30dd "/bin/bash" 8 minutes ago Up 8 minutes bpt-ci
linuxbrew@51fbce78731e:~$ docker network ls
NETWORK ID NAME DRIVER SCOPE
1e800987d562 bridge bridge local
aa6b2909aa87 host host local
d4db150f928b kind bridge local
a781cb9345f4 minikube bridge local
0a8c35a505fb none null local
linuxbrew@51fbce78731e:~$ docker network connect a781cb9345f4 93c35cec7e6f
Error response from daemon: endpoint with name minikube already exists in network minikube
The minikube container seems to be alive and well when trying to curl from the host, and even ssh is responding:
mastercook@linuxkitchen:~$ curl https://127.0.0.1:49350
curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 127.0.0.1:49350
mastercook@linuxkitchen:~$ ssh root@127.0.0.1 -p 49350
The authenticity of host '[127.0.0.1]:49350 ([127.0.0.1]:49350)' can't be established.
ED25519 key fingerprint is SHA256:0E41lExrrezFK1QXULaGHgk9gMM7uCQpLbNPVQcR2Ec.
This key is not known by any other names
What am I missing and how can I make minikube properly discover the correctly working minikube container?
Because minikube does not complete the cluster creation, running Kubernetes in a (sibling) Docker container favours kind.
Given that the (sibling) container does not know enough about its setup, the networking connections are a bit flawed. Specifically, a loopback IP is selected by kind (and minikube) upon cluster creation even though the actual container sits on a different IP in the host docker.
To correct the networking, the (sibling) container needs to be connected to the network actually hosting the Kubernetes image. To accomplish this, the procedure is illustrated below:
1.) Create a kubernetes cluster:
linuxbrew@324ba0f819d7:~$ kind create cluster --name acluster
Creating cluster "acluster" ...
 ✓ Ensuring node image (kindest/node:v1.21.1) 🖼
 ✓ Preparing nodes 📦
 ✓ Writing configuration 📜
 ✓ Starting control-plane 🕹️
 ✓ Installing CNI 🔌
 ✓ Installing StorageClass 💾
Set kubectl context to "kind-acluster"
You can now use your cluster with:
kubectl cluster-info --context kind-acluster
Thanks for using kind! 😊
2.) Verify if the cluster is accessible:
linuxbrew@324ba0f819d7:~$ kubectl cluster-info --context kind-acluster
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
The connection to the server 127.0.0.1:36779 was refused - did you specify the right host or port?
3.) Since the cluster cannot be reached, retrieve the control plane's master IP. Note the "-control-plane" suffix added to the cluster name:
linuxbrew@324ba0f819d7:~$ export MASTER_IP=$(docker inspect --format='{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' acluster-control-plane)
4.) Update the kube config with the actual master IP:
linuxbrew@324ba0f819d7:~$ sed -i "s/^    server:.*/    server: https:\/\/$MASTER_IP:6443/" $HOME/.kube/config
5.) This IP is still not accessible by the (sibling) container. To connect the container to the correct network, retrieve the docker network ID:
linuxbrew@324ba0f819d7:~$ export MASTER_NET=$(docker inspect --format='{{range .NetworkSettings.Networks}}{{.NetworkID}}{{end}}' acluster-control-plane)
6.) Finally connect the (sibling) container ID (which should be stored in the $HOSTNAME environment variable) with the cluster docker network:
linuxbrew@324ba0f819d7:~$ docker network connect $MASTER_NET $HOSTNAME
7.) Verify whether the control plane is accessible after the changes:
linuxbrew@324ba0f819d7:~$ kubectl cluster-info --context kind-acluster
Kubernetes control plane is running at https://172.18.0.4:6443
CoreDNS is running at https://172.18.0.4:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
If kubectl returns Kubernetes control plane and CoreDNS URL, as shown in the last step above, the configuration has succeeded.
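Steps 3 to 6 can be collected into one place. The following is a sketch under assumptions: the function names and the optional kubeconfig-path argument are made up for illustration, the control-plane container is assumed to sit on a single Docker network, and `sed -i` is used as in step 4 (GNU or BusyBox sed).

```shell
#!/bin/sh
# Rewrite every "server:" entry in a kubeconfig to https://<ip>:6443,
# keeping the original indentation. $1: IP address, $2: kubeconfig path.
set_kubeconfig_server() {
  sed -i "s|^\([[:space:]]*\)server:.*|\1server: https://$1:6443|" "$2"
}

# Steps 3 to 6 in one function. $1: kind cluster name,
# $2: kubeconfig path (defaults to $HOME/.kube/config).
connect_sibling_to_kind() {
  node="$1-control-plane"
  ip=$(docker inspect --format='{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$node") || return 1
  net=$(docker inspect --format='{{range .NetworkSettings.Networks}}{{.NetworkID}}{{end}}' "$node") || return 1
  set_kubeconfig_server "$ip" "${2:-$HOME/.kube/config}" || return 1
  # Inside a container, $HOSTNAME normally holds the container ID.
  docker network connect "$net" "$HOSTNAME"
}
```

With the cluster from step 1 this would be `connect_sibling_to_kind acluster`. Note that `set_kubeconfig_server` rewrites all server entries, so a kubeconfig holding several clusters would need a narrower edit.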
You can run minikube in a Docker-in-Docker container. It will use the docker driver.
docker run --name dind -d --privileged docker:20.10.17-dind
docker exec -it dind sh
/ # wget https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
/ # mv minikube-linux-amd64 minikube
/ # chmod +x minikube
/ # ./minikube start --force
...
* Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
/ # ./minikube kubectl -- run hello --image=hello-world
/ # ./minikube kubectl -- logs pod/hello
Hello from Docker!
...
Also, note that --force is needed to run minikube with the docker driver as root, which we shouldn't do according to the minikube instructions.

Skaffold dev fails

I am having this error, after running skaffold dev.
Step 1/6 : FROM node:current-alpine3.11
exiting dev mode because first build failed: unable to stream build output: Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io on 192.168.49.1:53: read udp 192.168.49.2:35889->192.168.49.1:53: i/o timeout. Please fix the Dockerfile and try again..
Here is skaffold.yml
apiVersion: skaffold/v2beta11
kind: Config
metadata:
name: *****
build:
artifacts:
- image: 127.0.0.1:32000/auth
context: auth
docker:
dockerfile: Dockerfile
deploy:
kubectl:
manifests:
- infra/k8s/auth-depl.yaml
local:
push: false
artifacts:
- image: 127.0.0.1:32000/auth
context: auth
docker:
dockerfile: Dockerfile
sync:
manual:
- src: "src/**/*.ts"
dest: .
I have tried all possible solutions I saw online, including adding 8.8.8.8 as the DNS, but the error still persists. I am using Linux and running ubuntu, I am also using Minikube locally. Please assist.
This is a Community Wiki answer, posted for better visibility, so feel free to edit it and add any additional details you consider important.
In this case:
minikube delete && minikube start
solved the problem, but you can start by restarting the Docker daemon. Since this is a Minikube cluster and Skaffold uses Minikube's Docker daemon for its builds, as suggested by Brian de Alwis in his comment, you may start with:
minikube stop && minikube start
or
minikube ssh
su
systemctl restart docker
I searched for similar errors and in many cases, e.g. here or in this thread, setting your DNS to something reliable like 8.8.8.8 may also help:
echo "nameserver 8.8.8.8" | sudo tee -a /etc/resolv.conf
If you use Minikube, you should first:
minikube ssh
su ### to become root
and then run:
echo "nameserver 8.8.8.8" >> /etc/resolv.conf
The following error message:
Please fix the Dockerfile and try again
may be somewhat misleading in cases like this, as the Dockerfile is probably totally fine, but as we can read in the other part:
lookup registry-1.docker.io on 192.168.49.1:53: read udp 192.168.49.2:35889->192.168.49.1:53: i/o timeout.
it's definitely related to a failing DNS lookup. This is well described here as a well-known issue.
Get i/o timeout
Get https://index.docker.io/v1/repositories//images: dial tcp: lookup on :53: read udp :53: i/o timeout
Description
The DNS resolver configured on the host cannot resolve the registry’s
hostname.
GitHub link
N/A
Workaround
Retry the operation, or if the error persists, use another DNS
resolver. You can do this by updating your /etc/resolv.conf file
with these or other DNS servers:
nameserver 8.8.8.8
nameserver 8.8.4.4
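The resolv.conf workaround can be made repeatable with a small helper that appends a resolver only when it is missing. This is a sketch, not part of the quoted documentation: the name `add_nameserver` and the optional file argument are assumptions for illustration.

```shell
#!/bin/sh
# Append "nameserver <ip>" to a resolv.conf-style file unless it is already there.
#   $1  resolver IP, e.g. 8.8.8.8
#   $2  file to edit; defaults to /etc/resolv.conf (needs root for that path)
add_nameserver() {
  ip="$1"; file="${2:-/etc/resolv.conf}"
  # -x: match the whole line, -F: treat the pattern as a fixed string
  if grep -qxF "nameserver $ip" "$file" 2>/dev/null; then
    return 0
  fi
  printf 'nameserver %s\n' "$ip" >> "$file"
}
```

Inside `minikube ssh`, after becoming root, this would be `add_nameserver 8.8.8.8`. Keep in mind that /etc/resolv.conf is often regenerated by the system, so the change may not survive a restart, and that resolvers are tried in the order listed.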

Cannot connect to Docker container running in VSTS

I have a test which starts a Docker container, performs the verification (which is talking to the Apache httpd in the Docker container), and then stops the Docker container.
When I run this test locally, this test runs just fine. But when it runs on hosted VSTS, thus a hosted build agent, it cannot connect to the Apache httpd in the Docker container.
This is the .vsts-ci.yml file:
queue: Hosted Linux Preview
steps:
- script: |
    ./test.sh
This is the test.sh shell script to reproduce the problem:
#!/bin/bash
set -e
set -o pipefail
function tearDown {
    docker stop test-apache
    docker rm test-apache
}
trap tearDown EXIT
docker run -d --name test-apache -p 8083:80 httpd
sleep 10
curl -D - http://localhost:8083/
When I run this test locally, the output that I get is:
$ ./test.sh
469d50447ebc01775d94e8bed65b8310f4d9c7689ad41b2da8111fd57f27cb38
HTTP/1.1 200 OK
Date: Tue, 04 Sep 2018 12:00:17 GMT
Server: Apache/2.4.34 (Unix)
Last-Modified: Mon, 11 Jun 2007 18:53:14 GMT
ETag: "2d-432a5e4a73a80"
Accept-Ranges: bytes
Content-Length: 45
Content-Type: text/html
<html><body><h1>It works!</h1></body></html>
test-apache
test-apache
This output is exactly as I expect.
But when I run this test on VSTS, the output that I get is (irrelevant parts replaced with …).
2018-09-04T12:01:23.7909911Z ##[section]Starting: CmdLine
2018-09-04T12:01:23.8044456Z ==============================================================================
2018-09-04T12:01:23.8061703Z Task : Command Line
2018-09-04T12:01:23.8077837Z Description : Run a command line script using cmd.exe on Windows and bash on macOS and Linux.
2018-09-04T12:01:23.8095370Z Version : 2.136.0
2018-09-04T12:01:23.8111699Z Author : Microsoft Corporation
2018-09-04T12:01:23.8128664Z Help : [More Information](https://go.microsoft.com/fwlink/?LinkID=613735)
2018-09-04T12:01:23.8146694Z ==============================================================================
2018-09-04T12:01:26.3345330Z Generating script.
2018-09-04T12:01:26.3392080Z Script contents:
2018-09-04T12:01:26.3409635Z ./test.sh
2018-09-04T12:01:26.3574923Z [command]/bin/bash --noprofile --norc /home/vsts/work/_temp/02476800-8a7e-4e22-8715-c3f706e3679f.sh
2018-09-04T12:01:27.7054918Z Unable to find image 'httpd:latest' locally
2018-09-04T12:01:30.5555851Z latest: Pulling from library/httpd
2018-09-04T12:01:31.4312351Z d660b1f15b9b: Pulling fs layer
[…]
2018-09-04T12:01:49.1468474Z e86a7f31d4e7506d34e3b854c2a55646eaa4dcc731edc711af2cc934c44da2f9
2018-09-04T12:02:00.2563446Z % Total % Received % Xferd Average Speed Time Time Time Current
2018-09-04T12:02:00.2583211Z Dload Upload Total Spent Left Speed
2018-09-04T12:02:00.2595905Z
2018-09-04T12:02:00.2613320Z 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0curl: (7) Failed to connect to localhost port 8083: Connection refused
2018-09-04T12:02:00.7027822Z test-apache
2018-09-04T12:02:00.7642313Z test-apache
2018-09-04T12:02:00.7826541Z ##[error]Bash exited with code '7'.
2018-09-04T12:02:00.7989841Z ##[section]Finishing: CmdLine
The key thing is this:
curl: (7) Failed to connect to localhost port 8083: Connection refused
10 seconds should be enough for apache to start.
Why can curl not communicate with Apache on its port 8083?
P.S.:
I know that a hard-coded port like this is rubbish and that I should use an ephemeral port instead. I wanted to get it running first with a hard-coded port, because that's simpler than using an ephemeral port, and then switch to an ephemeral port as soon as the hard-coded port works. And if the hard-coded port didn't work because the port was unavailable, the error should look different: in that case, docker run should fail because the port can't be allocated.
Update:
Just to be sure, I've rerun the test with sleep 100 instead of sleep 10. The results are unchanged, curl cannot connect to localhost port 8083.
Update 2:
When extending the script to execute docker logs, docker logs shows that Apache is running as expected.
When extending the script to execute docker ps, it shows the following output:
2018-09-05T00:02:24.1310783Z CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
2018-09-05T00:02:24.1336263Z 3f59aa014216 httpd "httpd-foreground" About a minute ago Up About a minute 0.0.0.0:8083->80/tcp test-apache
2018-09-05T00:02:24.1357782Z 850bda64f847 microsoft/vsts-agent:ubuntu-16.04-docker-17.12.0-ce-standard "/home/vsts/agents/2…" 2 minutes ago Up 2 minutes musing_booth
The problem is that the VSTS build agent runs in a Docker container. When the Docker container for Apache is started, it runs on the same level as the VSTS build agent Docker container, not nested inside the VSTS build agent Docker container.
There are two possible solutions:
Replacing localhost with the ip address of the docker host, keeping the port number 8083
Replacing localhost with the ip address of the docker container, changing the host port number 8083 to the container port number 80.
Access via the Docker Host
In this case, the solution is to replace localhost with the ip address of the docker host. The following shell snippet can do that:
host=localhost
if grep '^1:name=systemd:/docker/' /proc/1/cgroup
then
    apt-get update
    apt-get install -y net-tools
    host=$(route -n | grep '^0.0.0.0' | sed -e 's/^0.0.0.0\s*//' -e 's/ .*//')
fi
curl -D - http://$host:8083/
The if grep '^1:name=systemd:/docker/' /proc/1/cgroup inspects whether the script is running inside a Docker container. If so, it installs net-tools to get access to the route command, and then parses the default gw from the route command to get the ip address of the host. Note that this only works if the container's network default gw actually is the host.
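The gateway parsing can also be written with awk, which reads the column directly instead of chaining grep and sed. The two helpers below are a sketch; the function names are made up for illustration, and the second one assumes the `ip` command (iproute2) is present in the agent image, which would avoid installing net-tools.

```shell
#!/bin/sh
# Read `route -n` output on stdin and print the gateway of the default route.
default_gateway() {
  # The default route has destination 0.0.0.0; the gateway is column 2.
  awk '$1 == "0.0.0.0" { print $2; exit }'
}

# Read `ip route` output on stdin and print the gateway of the default route.
default_gateway_from_ip_route() {
  # The default route line looks like: default via <gateway> dev <iface>
  awk '$1 == "default" && $2 == "via" { print $3; exit }'
}
```

Usage would be `host=$(route -n | default_gateway)` or `host=$(ip route | default_gateway_from_ip_route)`. The same caveat applies: this only helps if the container's default gateway actually is the Docker host.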
Direct Access to the Docker Container
After launching the docker container, its ip addresses can be obtained with the following command:
docker container inspect --format '{{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}' <container-id>
Replace <container-id> with your container id or name.
So, in this case, it would be (assuming that the first ip address is okay):
ips=($(docker container inspect --format '{{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}' test-apache))
host=${ips[0]}
curl http://$host/

Resources