Installing ZooKeeper on offline OpenShift - Docker

I have an OpenShift Origin cluster running offline on 3 CentOS 7 VMs. It's working fine, and I have a registry where I push my images like this:
docker login -u <username> -e <any_email_address> -p <token_value> <registry_ip>:<port>
Login is successful, then:
oc tag <image-id> <docker-registry-IP>:<port>/<project-name>/<image>
So, for nginx, for example:
oc tag 49011ce3b713 172.30.222.111:5000/test/nginx
Then I push it to the internal registry:
docker push 172.30.222.111:5000/test/nginx
And finally:
oc new-app nginx --name="nginx"
With nginx, everything works fine. Now my problem:
I want to put ZooKeeper on it, so I follow the same steps as above. I also install "jboss/base-jdk:7", which is a dependency of ZooKeeper. The problem is:
docker push 172.30.222.111:5000/test/jboss/base-jdk:7
Giving :
[root@master 994089]# docker push 172.30.222.111:5000/test/jboss/base-jdk:7
The push refers to a repository [172.30.222.111:5000/test/jboss/base-jdk]
c4c6a9114a05: Layer already exists
3bf2c105669b: Layer already exists
85c6e373d858: Layer already exists
dc1e2dcdc7b6: Layer already exists
Received unexpected HTTP status: 500 Internal Server Error
The problem seems to be the "/" in jboss/base-jdk:7.
I also tried to push it like this:
docker push 172.30.222.111:5000/test/base-jdk:7
This works, but ZooKeeper is looking for exactly "jboss/base-jdk:7", not just "base-jdk:7".
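For what it's worth, the behaviour comes down to how the reference string is split up. Below is a minimal, illustrative sketch (not OpenShift's or Docker's actual parsing code) of how an image reference decomposes into registry, repository path and tag; it shows that retagging jboss/base-jdk:7 into the internal registry keeps the extra "/" in the repository path:

```python
# Illustrative sketch only: how a Docker image reference splits into
# registry / repository path / tag. Not OpenShift's real logic, just
# a demonstration of why the "/" in jboss/base-jdk matters.

def parse_reference(ref):
    """Split 'registry:port/path/name:tag' into (registry, repo, tag)."""
    parts = ref.split("/")
    registry = None
    # The leading component counts as a registry if it has a '.' or ':'
    if "." in parts[0] or ":" in parts[0]:
        registry = parts[0]
        parts = parts[1:]
    rest = "/".join(parts)
    if ":" in parts[-1]:
        rest, tag = rest.rsplit(":", 1)
    else:
        tag = "latest"
    return registry, rest, tag

# Retagging jboss/base-jdk:7 for the internal registry keeps the
# nested repository path, which the registry rejected with a 500 here:
print(parse_reference("172.30.222.111:5000/test/jboss/base-jdk:7"))
# ('172.30.222.111:5000', 'test/jboss/base-jdk', '7')

# Dropping the namespace changes the repository name entirely:
print(parse_reference("172.30.222.111:5000/test/base-jdk:7"))
# ('172.30.222.111:5000', 'test/base-jdk', '7')
```

That is why pushing test/base-jdk:7 succeeds but no longer satisfies anything that looks up jboss/base-jdk:7 by exact name.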
Finally, I'm blocked here when trying this command: oc new-app zookeeper --name="zookeeper" --loglevel=8 --insecure-registry --allow-missing-images
I0628 14:31:54.009713 53407 dockerimagelookup.go:92] checking local Docker daemon for "jboss/base-jdk:7"
I0628 14:31:54.030546 53407 dockerimagelookup.go:380] partial match on "172.30.222.111:5000/test/base-jdk:7" with 0.375000
I0628 14:31:54.030571 53407 dockerimagelookup.go:346] exact match on "jboss/base-jdk:7"
I0628 14:31:54.030578 53407 dockerimagelookup.go:107] Found local docker image match "172.30.222.111:5000/test/base-jdk:7" with score 0.375000
I0628 14:31:54.030589 53407 dockerimagelookup.go:107] Found local docker image match "jboss/base-jdk:7" with score 0.000000
I0628 14:31:54.032799 53407 componentresolvers.go:59] Error from resolver: [can't look up Docker image "jboss/base-jdk:7": Internal error occurred: Get http://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io on 10.253.158.90:53: no such host]
I0628 14:31:54.032831 53407 dockerimagelookup.go:169] Added missing image match for jboss/base-jdk:7
F0628 14:31:54.032882 53407 helpers.go:110] error: can't look up Docker image "jboss/base-jdk:7": Internal error occurred: Get http://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io on 10.253.158.90:53: no such host
We can see that 172.30.222.111:5000/test/base-jdk:7 is found, but since it's not an exact match for what the command is looking for, it doesn't use it...
So, if you have any idea how to solve this! :)

Resolved by upgrading to OpenShift 1.5.1; the previous version was 1.3.1.

Related

Logspout container in Docker

I am trying to deploy a logspout container in Docker, but I keep running into an issue which I have searched for on this website and GitHub to no avail, so I'm hoping someone knows.
I followed these commands, as per the Readme here: https://github.com/gliderlabs/logspout
(1) docker pull gliderlabs/logspout:latest (also tried with logspout:master, same results)
(2) docker run -d --name="logspout" --volume=/var/run/docker.sock:/var/run/docker.sock --publish=127.0.0.1:8000:80 gliderlabs/logspout (also tried with -v /var/run/docker.sock:/var/run/docker.sock, same results)
The container gets created but stops immediately. When I check the container logs (docker container logs logspout), I only see the following entries:
2021/12/19 06:37:12 # logspout v3.2.14 by gliderlabs
2021/12/19 06:37:12 # adapters: raw syslog tcp tls udp multiline
2021/12/19 06:37:12 # options :
2021/12/19 06:37:12 persist:/mnt/routes
2021/12/19 06:37:12 # jobs : pump routes http[health,logs,routes]:80
2021/12/19 06:37:12 # routes : none
2021/12/19 06:37:12 pump ended: Get http://unix.sock/containers/json?: dial unix /var/run/docker.sock: connect: no such file or directory
I checked docker.sock: ls -la /var/run/docker.sock gives srw-rw---- 1 root docker 0 Dec 12 09:49 /var/run/docker.sock. So docker.sock does exist, which adds to the confusion as to why the container can't find it.
I am new to Linux/Docker, but my understanding is that using -v or --volume would automatically mount the location into the container, which does not seem to be happening here. So I am wondering if anyone has any suggestion on what needs to be done so that the logspout container can find docker.sock.
System Info: Docker version 20.10.11, build dea9396; Raspberry Pi 4 ARM 64, OS: Debian GNU/Linux 11 (bullseye)
EDIT: added comment about -v tag in step (2) above
The container must be able to access the Docker Unix socket in order to mount it. This is typically a problem when user namespace remapping is enabled. To disable remapping for the logspout container, pass the --userns=host flag to docker run, docker create, etc.
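To narrow down whether the socket actually made it into the container, you can probe it the same way logspout does: by connecting to it as a Unix socket. This is a plain-Python sanity-check sketch, nothing logspout-specific, that distinguishes "path missing" from "present but not a socket" from "permission denied":

```python
import os
import socket
import stat

def check_docker_sock(path="/var/run/docker.sock"):
    """Report why connecting to a Unix socket path would fail."""
    if not os.path.exists(path):
        return "no such file or directory"   # what logspout printed
    mode = os.stat(path).st_mode
    if not stat.S_ISSOCK(mode):
        return "exists but is not a socket"
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        s.connect(path)
        return "ok"
    except PermissionError:
        return "permission denied"
    finally:
        s.close()
```

If a check like this reports "no such file or directory" inside a container while the file clearly exists on the host, the mount (or userns remapping) is the culprit rather than the Docker daemon itself.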

Clustering vernemq docker containers running on different machines

I was hoping this would be an easy one by just using the below snippet on the second instance's docker-compose.yml file
- DOCKER_VERNEMQ_DISCOVERY_NODE=<ip address of the first instance>
but that doesn't seem to work.
Log of the second instance confirms it's attempting to cluster:
13:56:09.795 [info] Sent join request to: 'VerneMQ@<ip address of the first instance>'
13:56:16.800 [info] Unable to connect to 'VerneMQ@<ip address of the first instance>'
While the log of the first instance does not show anything at all.
From within the second instance I can confirm that the endpoint is accessible:
$ docker exec -it vernemq /bin/sh
$ curl <ip address of the first instance>:44053
curl: (56) Recv failure: Connection reset by peer
Then, in the log of the first instance, I see an error which is totally expected and confirms I've reached the first instance:
13:58:33.572 [error] CRASH REPORT Process <0.3050.0> with 0 neighbours crashed with reason: bad argument in vmq_cluster_com:process_bytes/3 line 142
13:58:33.572 [error] Ranch listener {{172,19,0,2},44053} terminated with reason: bad argument in vmq_cluster_com:process_bytes/3 line 142
It might have to do with the fact that the IP address as seen from within the Docker container is 172.19.0.2 while the external one is 10. ....
I also tried adding the hostname of the first instance to known_hosts, to no avail.
Please advise.
I'm using erlio/docker-vernemq:1.10.0
$ docker --version
Docker version 19.03.13, build 4484c46d9d
$ docker-compose --version
docker-compose version 1.27.2, build 18f557f9
I managed to get this sorted by creating a Docker overlay network:
on machine1: docker swarm init
on machine2: docker swarm join --token ...
on machine1: docker network create --driver=overlay --attachable vernemq-overlay-net
The relevant bits of my docker-compose.yml are:
version: '3.6'
services:
  vernemq:
    container_name: ${NODE_NAME:?Node name not specified}
    image: vernemq/vernemq:1.10.4.1
    environment:
      - DOCKER_VERNEMQ_NODENAME=${NODE_NAME:?Node name not specified}
      - DOCKER_VERNEMQ_DISCOVERY_NODE=${DISCOVERY_NODE:-}
networks:
  default:
    external:
      name: vernemq-overlay-net
with the following env vars:
machine1:
NODE_NAME=vernemq1.example.com
DISCOVERY_NODE=
machine2:
NODE_NAME=vernemq2.example.com
DISCOVERY_NODE=vernemq1.example.com
Note:
Chances are machine2 won't find vernemq-overlay-net, due to a bug in docker-compose as far as I remember.
In that case, start a container directly with docker: docker run -dit --name alpine --net=vernemq-overlay-net alpine, which will make the network visible to docker-compose.
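As an aside, the ${NODE_NAME:?Node name not specified} and ${DISCOVERY_NODE:-} forms in the compose file are shell-style substitutions that docker-compose resolves from the environment (or a .env file). A rough sketch of their semantics in Python, covering just these two forms:

```python
import re

def substitute(template, env):
    """Resolve ${VAR:?msg} (required) and ${VAR:-default} (optional)."""
    def repl(m):
        var, op, arg = m.group(1), m.group(2), m.group(3)
        value = env.get(var, "")
        if op == ":?" and not value:
            # docker-compose aborts with this message if VAR is unset/empty
            raise ValueError(f"{var}: {arg}")
        if op == ":-" and not value:
            return arg
        return value
    return re.sub(r"\$\{(\w+)(:\?|:-)([^}]*)\}", repl, template)

print(substitute("container_name: ${NODE_NAME:?Node name not specified}",
                 {"NODE_NAME": "vernemq1.example.com"}))
# container_name: vernemq1.example.com
print(substitute("x=${DISCOVERY_NODE:-}", {}))
# x=
```

So on machine1 an empty DISCOVERY_NODE is legal (the :- default), while a missing NODE_NAME makes compose fail fast with "Node name not specified".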

Airflow pull docker image from private google container repository

I am using the https://github.com/puckel/docker-airflow image to run Airflow. I had to add pip install docker for it to support DockerOperator.
Everything seems OK, but I can't figure out how to pull an image from a private Google container registry.
I tried adding a connection of type Google Cloud in the admin section and running the DockerOperator as:
t2 = DockerOperator(
    task_id='docker_command',
    image='eu.gcr.io/project/image',
    api_version='2.3',
    auto_remove=True,
    command="/bin/sleep 30",
    docker_url="unix://var/run/docker.sock",
    network_mode="bridge",
    docker_conn_id="google_con"
)
But I always get an error...
[2019-11-05 14:12:51,162] {{taskinstance.py:1047}} ERROR - No Docker registry URL provided
I also tried the dockercfg_path option:
t2 = DockerOperator(
    task_id='docker_command',
    image='eu.gcr.io/project/image',
    api_version='2.3',
    auto_remove=True,
    command="/bin/sleep 30",
    docker_url="unix://var/run/docker.sock",
    network_mode="bridge",
    dockercfg_path="/usr/local/airflow/config.json",
)
I get the following error:
[2019-11-06 13:59:40,522] {{docker_operator.py:194}} INFO - Starting docker container from image eu.gcr.io/project/image
[2019-11-06 13:59:40,524] {{taskinstance.py:1047}} ERROR - ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))
I also tried using only dockercfg_path="config.json" and got the same error.
I can't really use BashOperator to try docker login, as it does not recognize the docker command...
t3 = BashOperator(
    task_id='print_hello',
    bash_command='docker login -u _json_key -p /usr/local/airflow/config.json eu.gcr.io'
)
line 1: docker: command not found
What am I missing?
airflow.hooks.docker_hook.DockerHook uses the docker_default connection when one isn't configured.
In your first attempt, you set google_con as docker_conn_id, and the error thrown shows that the host (i.e. the registry name) isn't configured.
Here are a couple of changes to make:
The image argument passed to DockerOperator should be set to the image name without the registry name prefixing it.
DockerOperator(
    api_version='1.21',
    # docker_url='tcp://localhost:2375',  # Set your docker URL
    command='/bin/ls',
    image='image',
    network_mode='bridge',
    task_id='docker_op_tester',
    docker_conn_id='google_con',
    dag=dag,
    # added this to map to host path in MacOS
    host_tmp_dir='/tmp',
    tmp_dir='/tmp',
)
Provide the registry name, username and password in your google_con connection for the underlying DockerHook to authenticate to Docker.
You can obtain long-lived credentials for authentication from a service account key. For the username, use _json_key, and in the password field paste the contents of the JSON key file.
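The _json_key convention means the entire service account JSON file becomes the password. If you go the dockercfg_path route instead, Docker's config.json stores credentials as base64("username:password") under the registry host. A sketch of generating such a file, using a dummy key rather than a real credential:

```python
import base64
import json

def make_docker_config(registry, username, password):
    """Build a Docker config.json 'auths' entry (base64 of user:pass)."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return {"auths": {registry: {"auth": token}}}

# A dummy service-account key standing in for the real file's contents.
key_contents = json.dumps({"type": "service_account", "project_id": "project"})
config = make_docker_config("https://eu.gcr.io", "_json_key", key_contents)
print(json.dumps(config, indent=2))
```

Writing the result to the path you pass as dockercfg_path, and making sure that path actually exists inside the Airflow worker, is one thing to check for the FileNotFoundError above.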
Here are logs from running my task:
[2019-11-16 20:20:46,874] {base_task_runner.py:110} INFO - Job 443: Subtask docker_op_tester [2019-11-16 20:20:46,874] {dagbag.py:88} INFO - Filling up the DagBag from /Users/r7/OSS/airflow/airflow/example_dags/example_docker_operator.py
[2019-11-16 20:20:47,054] {base_task_runner.py:110} INFO - Job 443: Subtask docker_op_tester [2019-11-16 20:20:47,054] {cli.py:592} INFO - Running <TaskInstance: docker_sample.docker_op_tester 2019-11-14T00:00:00+00:00 [running]> on host 1.0.0.127.in-addr.arpa
[2019-11-16 20:20:47,074] {logging_mixin.py:89} INFO - [2019-11-16 20:20:47,074] {local_task_job.py:120} WARNING - Time since last heartbeat(0.01 s) < heartrate(5.0 s), sleeping for 4.989537 s
[2019-11-16 20:20:47,088] {logging_mixin.py:89} INFO - [2019-11-16 20:20:47,088] {base_hook.py:89} INFO - Using connection to: id: google_con. Host: gcr.io/<redacted-project-id>, Port: None, Schema: , Login: _json_key, Password: XXXXXXXX, extra: {}
[2019-11-16 20:20:48,404] {docker_operator.py:209} INFO - Starting docker container from image alpine
[2019-11-16 20:20:52,066] {logging_mixin.py:89} INFO - [2019-11-16 20:20:52,066] {local_task_job.py:99} INFO - Task exited with return code 0
I know the question is about GCR, but it's worth noting that other container registries may expect the config in a different format.
For example, GitLab expects you to pass the fully qualified image name to the DAG and to put only the GitLab container registry host name in the connection:
DockerOperator(
    task_id='docker_command',
    image='registry.gitlab.com/group/project/image:tag',
    api_version='auto',
    docker_conn_id='gitlab_registry',
)
Then set up your gitlab_registry connection like:
docker://gitlab+deploy-token-1234:ABDCtoken1234@registry.gitlab.com
Based on recent Cloud Composer documentation, it's recommended to use KubernetesPodOperator instead, like this:
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

KubernetesPodOperator(
    task_id='docker_op_tester',
    name='docker_op_tester',
    dag=dag,
    namespace="default",
    image="eu.gcr.io/project/image",
    cmds=["ls"]
)
Further to @Tamlyn's answer, we can also skip the creation of the connection (docker_conn_id) in Airflow and use it with GitLab as follows:
On your development machine:
https://gitlab.com/yourgroup/yourproject/-/settings/repository (create a token here and get the details for logging in)
docker login registry.gitlab.com (log in to Docker on the machine you will push the image from - enter your GitLab credentials when prompted)
docker build -t registry.gitlab.com/yourgroup/yourproject . && docker push registry.gitlab.com/yourgroup/yourproject (builds and pushes to your project repo's container registry)
On your Airflow machine:
https://gitlab.com/yourgroup/yourproject/-/settings/repository (you can use the token created above for logging in)
docker login registry.gitlab.com (log in to Docker on the machine that will pull the image, which skips the need for creating a Docker registry connection - enter your GitLab credentials when prompted; this generates ~/.docker/config.json, which is required - see the Docker docs for reference)
In your DAG:
dag = DAG(
    "dag_id",
    default_args=default_args,
    schedule_interval="15 1 * * *"
)

docker_trigger = DockerOperator(
    task_id="task_id",
    api_version="auto",
    network_mode="bridge",
    image="registry.gitlab.com/yourgroup/yourproject",
    auto_remove=True,  # use if required
    force_pull=True,   # use if required
    xcom_all=True,     # use if required
    # tty=True,        # turning this on screws up the log rendering
    # command="",      # use if required
    environment={      # use if required
        "envvar1": "envvar1value",
        "envvar2": "envvar2value",
    },
    dag=dag,
)
This works on Ubuntu 20.04.2 LTS (tried and tested) with Airflow installed on the instance.
You will need to install the Cloud SDK on your workstation, which includes the gcloud command-line tool.
After installing the Cloud SDK and Docker version 18.03 or newer, according to their documentation, to pull from Container Registry use the command:
docker pull [HOSTNAME]/[PROJECT-ID]/[IMAGE]:[TAG]
or
docker pull [HOSTNAME]/[PROJECT-ID]/[IMAGE]@[IMAGE_DIGEST]
where:
[HOSTNAME] is listed under Location in the console. It's one of four options: gcr.io, us.gcr.io, eu.gcr.io, or asia.gcr.io.
[PROJECT-ID] is your Google Cloud Platform Console project ID.
[IMAGE] is the image's name in Container Registry.
[TAG] is the tag applied to the image. In a registry, tags are unique to an image.
[IMAGE_DIGEST] is the sha256 hash value of the image contents. In the console, click on the specific image to see its metadata. The digest is listed as the Image digest.
To get the pull command for a specific image:
Click on the name of an image to go to the specific registry.
In the registry, check the box next to the version of the image that you want to pull.
Click SHOW PULL COMMAND on the top of the page.
Copy the pull command, which identifies the image using either the tag or the digest.
* Also check that you have push and pull permissions for the registry.
** Configure Docker to use gcloud as a credential helper, or use another authentication method. To use gcloud as the credential helper, run the command:
gcloud auth configure-docker

Docker pull fails in a disconnected environment

I have to install OpenShift on a disconnected system, so I followed the steps below (the original installation requires more images, but for the sake of understanding I am providing the minimum steps).
On a system with internet access, I did the following:
docker pull docker.io/openshift/origin-node:v3.11.0
docker save -o openshift-origin-v3.11.0-images.tar \
docker.io/openshift/origin-node:v3.11.0
On the second, disconnected system, I did the following:
docker load -i openshift-origin-v3.11.0-images.tar
Now when I start the installation script, it pulls the image docker.io/openshift/origin-node:v3.11.0,
which throws the following error:
Error getting v2 registry: Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io on [::1]:53: dial udp [::1]:53: connect: no route to host
When I run docker images on the second system:
[root@x ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
docker.io/openshift/origin-node v3.11.0 14d965ab72d5 4 days ago 1.17 GB
It shows that the image is available. What's wrong here? My understanding is that it should look locally first and only then check Docker Hub.
Update 1:
If I pull directly, it says:
[root@x ~]# docker pull docker.io/openshift/origin-node:v3.11.0
Trying to pull repository docker.io/openshift/origin-node ...
Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io on 192.168.x.x:53: server misbehaving
I am expecting it to say:
Status: Image is up to date for
I changed the following in /etc/containers/registries.conf and it works:
From
[registries.search]
registries = ['registry.access.redhat.com', 'docker.io', 'registry.fedoraproject.org', 'quay.io', 'registry.centos.org']
To
[registries.search]
registries = []
[registries.block]
registries = ['registry.access.redhat.com', 'docker.io', 'registry.fedoraproject.org', 'quay.io', 'registry.centos.org']

Connection refused on pushing a docker image

I'm trying to set up a local registry by following https://docs.docker.com/registry/deploying/.
docker run -d -p 5000:5000 --restart=always --name reg ubuntu:16.04
When I try to run the following command:
$ docker push localhost:5000/my-ubuntu
I get the error:
Get http://localhost:5000/v2/: dial tcp 127.0.0.1:5000: connect: connection refused
Any idea?
Connection refused usually means that the service you are trying to connect to isn't actually up and running as it should be. There could be other reasons, as outlined in this question, but essentially, for your case, it simply means that the registry is not up yet.
Wait for the registry container to be created properly before you do anything else: docker run -d -p 5000:5000 --restart=always --name registry registry:2 creates a local registry from the official Docker image.
Make sure that the registry container is up by running docker ps | grep registry, and then proceed further.
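The "wait until it's up" step can also be scripted instead of eyeballed. A small sketch that polls the registry endpoint: any HTTP response, even an error status, means the process is listening, while connection refused means it isn't up yet:

```python
import time
import urllib.error
import urllib.request

def wait_for_registry(url, timeout=30.0, interval=1.0):
    """Poll url until any HTTP response arrives; False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            urllib.request.urlopen(url, timeout=2)
            return True
        except urllib.error.HTTPError:
            return True           # the server answered, just not with 200
        except (urllib.error.URLError, OSError):
            time.sleep(interval)  # connection refused: keep waiting
    return False

# e.g. wait_for_registry("http://localhost:5000/v2/")
```

Only once this returns True does a docker push to localhost:5000 stand a chance of succeeding.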
More comments about:
Kubernetes (K8s) / Minikube
Docker / image / registry, container
If you are using Minikube and want to pull down an image from 127.0.0.1:5000, you may meet the errors below:
Failed to pull image "127.0.0.1:5000/nginx_operator:latest": rpc error: code = Unknown desc = Error response from daemon: Get http://127.0.0.1:5000/v2/: dial tcp 127.0.0.1:5000: connect: connection refused
Full logs:
$ kubectl describe pod/your_pod
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m29s default-scheduler Successfully assigned tj-blue-whale-05-system/tj-blue-whale-05-controller-manager-6c8f564575-kwxdv to minikube
Normal Pulled 2m25s kubelet Container image "gcr.io/kubebuilder/kube-rbac-proxy:v0.5.0" already present on machine
Normal Created 2m24s kubelet Created container kube-rbac-proxy
Normal Started 2m23s kubelet Started container kube-rbac-proxy
Normal BackOff 62s (x5 over 2m22s) kubelet Back-off pulling image "127.0.0.1:5000/nginx_operator:latest"
Warning Failed 62s (x5 over 2m22s) kubelet Error: ImagePullBackOff
Normal Pulling 48s (x4 over 2m23s) kubelet Pulling image "127.0.0.1:5000/nginx_operator:latest"
Warning Failed 48s (x4 over 2m23s) kubelet Failed to pull image "127.0.0.1:5000/nginx_operator:latest": rpc error: code = Unknown desc = Error response from daemon: Get http://127.0.0.1:5000/v2/: dial tcp 127.0.0.1:5000: connect: connection refused
Warning Failed 48s (x4 over 2m23s) kubelet Error: ErrImagePull
Possible root cause:
The registry must be set up on the Minikube side instead of the host side.
i.e.
host: registry (127.0.0.1:5000)
minikube: no registry (so K8s could not find your image)
How to check?
Step1: check your Minikube container
$ docker ps -a
CONTAINER ID IMAGE ... STATUS PORTS NAMES
8c6f49491dd6 gcr.io/k8s-minikube/kicbase:v0.0.15-snapshot4 ... Up 15 hours 127.0.0.1:49156->22/tcp, 127.0.0.1:49155->2376/tcp, 127.0.0.1:49154->5000/tcp, 127.0.0.1:49153->8443/tcp minikube
# your Minikube is under running
# host:49154 <--> minikube:5000
# where:
# - port 49154 was allocated randomly by the docker service
# - port 22: for ssh
# - port 2376: for docker service
# - port 5000: for registry (image repository)
# - port 8443: for Kubernetes
Step2: login to your Minikube
$ minikube ssh
docker@minikube:~$ curl 127.0.0.1:5000
curl: (7) Failed to connect to 127.0.0.1 port 5000: Connection refused
# setup
# =====
# You did not setup the registry.
# Let's try to setup it.
docker@minikube:~$ docker run --restart=always -d -p 5000:5000 --name registry registry:2
# test
# ====
# test the registry using the following commands
docker@minikube:~$ curl 127.0.0.1:5000
docker@minikube:~$ curl 127.0.0.1:5000/v2
Moved Permanently.
docker@minikube:~$ curl 127.0.0.1:5000/v2/_catalog
{"repositories":[]}
# it's successful
docker@minikube:~$ exit
logout
Step3: build your image, and push it into the registry of your Minikube
# Let's take nginx as an example. (You can build your own image)
$ docker pull nginx
# modify the repository (the source and the name)
$ docker tag nginx 127.0.0.1:49154/nginx_operator
# check the new repository (source and the name)
$ docker images | grep nginx
REPOSITORY TAG IMAGE ID CREATED SIZE
127.0.0.1:49154/nginx_operator latest ae2feff98a0c 3 weeks ago 133MB
# push the image into the registry of your Minikube
$ docker push 127.0.0.1:49154/nginx_operator
Step4: login to your Minikube again
$ minikube ssh
# check the registry
$ curl 127.0.0.1:5000/v2/_catalog
{"repositories":["nginx_operator"]}
# it's successful
# get the image info
$ curl 127.0.0.1:5000/v2/nginx_operator/manifests/latest
docker@minikube:~$ exit
logout
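The curl checks in steps 2 and 4 can also be done programmatically against the Docker Registry HTTP API V2 (/v2/_catalog and /v2/<name>/tags/list are standard endpoints). A small sketch:

```python
import json
import urllib.request

def list_repositories(registry_url):
    """Return the repository names the registry reports in /v2/_catalog."""
    with urllib.request.urlopen(f"{registry_url}/v2/_catalog") as resp:
        return json.load(resp)["repositories"]

def list_tags(registry_url, repository):
    """Return the tags for one repository from /v2/<name>/tags/list."""
    with urllib.request.urlopen(
            f"{registry_url}/v2/{repository}/tags/list") as resp:
        return json.load(resp)["tags"]

# From inside the Minikube VM you would point these at
# "http://127.0.0.1:5000"; from the host, at whatever port
# is forwarded to the registry (49154 in the example above).
```

Seeing your image in the catalog from inside Minikube, but not from the host port you tagged against (or vice versa), pins down which side of the port mapping is wrong.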
Customize exposed ports of Minikube
If you would like to use port 5000 on the host side instead of 49154 (which was allocated randomly by the Docker service),
i.e.
host:5000 <--> minikube:5000
you need to recreate the Minikube instance with the --ports flag:
# delete the old minikube instance
$ minikube delete
# create a new one (with the docker driver)
$ minikube start --ports=5000:5000 --driver=docker
# or
$ minikube start --ports=127.0.0.1:5000:5000 --driver=docker
$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5d1e5b61a3bf gcr.io/k8s-minikube/kicbase:v0.0.15-snapshot4 "/usr/local/bin/entr…" About a minute ago Up About a minute 0.0.0.0:5000->5000/tcp, 127.0.0.1:49162->22/tcp, 127.0.0.1:49161->2376/tcp, 127.0.0.1:49160->5000/tcp, 127.0.0.1:49159->8443/tcp minikube
$ docker port minikube
22/tcp -> 127.0.0.1:49162
2376/tcp -> 127.0.0.1:49161
5000/tcp -> 127.0.0.1:49160
5000/tcp -> 0.0.0.0:5000
8443/tcp -> 127.0.0.1:49159
you can see: 0.0.0.0:5000->5000/tcp
Re-test your registry in the Minikube
# in the host side
$ docker pull nginx
$ docker tag nginx 127.0.0.1:5000/nginx_operator
$ docker ps -a
$ docker push 127.0.0.1:5000/nginx_operator
$ minikube ssh
docker@minikube:~$ curl 127.0.0.1:5000/v2/_catalog
{"repositories":["nginx_operator"]}
# Great!
