Push local image to local IBM Private Cloud cluster? - docker

On Ubuntu I have installed a local IBM Private Cloud cluster using this guide:
https://github.com/IBM/deploy-ibm-cloud-private/blob/master/docs/deploy-vagrant.md
Next, I would like to push some local Docker images from my host to the IBM cluster. I have found this guide:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_1.2.0/manage_images/using_docker_cli.html
where bullet 2 says:
Obtain the configure-registry-cert.sh script from your system administrator. The script is located in the /<installation_directory>/misc/configure-registry-cert.sh directory. You must obtain the IBM® Cloud private registry certificate script to pull and push images to the private image registry.
I have SSH'ed to the master container with:
vagrant ssh
but I have not been able to find /<installation_directory>/misc/configure-registry-cert.sh
in either /home/vagrant or /opt
UPDATE:
I have found this guide:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_2.1.0/manage_images/using_docker_cli.html
which says that you need to copy the cert from the master node to the client machine (my host) with:
scp /etc/docker/certs.d/<cluster_CA_domain>\:8500/ca.crt \
root@<client_node>:/etc/docker/certs.d/<cluster_CA_domain>\:8500/
I created a password for root and copied /etc/docker/certs.d/mycluster.icp:8500/ca.crt from the master node to /etc/docker/certs.d/mycluster.icp:8500/ca.crt in my local Docker installation.
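Concretely, the copy on my host looked roughly like this (a sketch; it assumes mycluster.icp resolves to the master, e.g. via /etc/hosts, and that root SSH login is permitted on the master, otherwise copy the file via vagrant ssh instead):
```
# on my host: create the registry cert directory Docker expects
sudo mkdir -p /etc/docker/certs.d/mycluster.icp:8500

# pull the CA cert from the master node (root password was set on the master first)
sudo scp root@mycluster.icp:/etc/docker/certs.d/mycluster.icp\:8500/ca.crt \
    /etc/docker/certs.d/mycluster.icp:8500/ca.crt
```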
But when I then try to login I get the below error:
$ docker login mycluster.icp:8500
Username: admin
Password:
Error response from daemon: Get https://mycluster.icp:8500/v2/: x509: certificate signed by unknown authority
I specified admin as the password (I use admin/admin to log in to the web interface), since I have not found information on which credentials to use for that login.
Based on:
https://www.ibm.com/developerworks/community/blogs/fe25b4ef-ea6a-4d86-a629-6f87ccf4649e/entry/Working_with_the_local_docker_registry_from_Spectrum_Conductor_for_Containers?lang=en
it seems I first need to create a namespace and then a user for that namespace. I can create a namespace, but I don't have an option to create a new user.
Any ideas on how to login to the docker registry?
As requested below, I can confirm that the ca.crt is indeed in the correct location on the master node:
$ vagrant ssh
Welcome to Ubuntu 16.04.5 LTS (GNU/Linux 4.4.0-131-generic x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
0 packages can be updated.
0 updates are security updates.
Last login: Thu Jul 26 19:59:18 2018 from 192.168.27.100
vagrant@master:~$ sudo passwd
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
vagrant@master:~$ su
Password:
root@master:/home/vagrant# ls -la /etc/docker/certs.d/mycluster.icp\:8500/
total 12
drwxr-xr-x 2 root root 4096 Jul 26 19:54 .
drwxr-xr-x 3 root root 4096 Jul 26 19:53 ..
-rw-r--r-- 1 root root 1850 Jul 26 19:54 ca.crt
root@master:/home/vagrant#

You can try updating your Docker configuration to put the <cluster_CA_domain>:8500 registry in the insecure-registry list, for example:
/usr/bin/docker --insecure-registry docker-reg:5000 -d
That is, update the docker service to add --insecure-registry mycluster.icp:8500 to the Docker options, then:
```
systemctl daemon-reload
systemctl restart docker
```
After that you can try docker login mycluster.icp:8500 again.
Remember to add mycluster.icp to your /etc/hosts.
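As an alternative sketch (not ICP-specific): on Docker versions that support /etc/docker/daemon.json, the insecure registry can be configured there instead of editing the service options:
```
# write the registry into Docker's daemon.json (merge by hand if the file already exists)
sudo tee /etc/docker/daemon.json <<'EOF'
{
  "insecure-registries": ["mycluster.icp:8500"]
}
EOF

# restart Docker to pick up the change
sudo systemctl restart docker
```
Note that this only bypasses certificate verification; docker login against mycluster.icp:8500 still requires valid ICP credentials.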

Related

How to run crictl command as non root user

How can I run crictl as a non-root user?
My docker commands work as a non-root user because my user is added to the docker group.
id
uid=1002(kube) gid=100(users) groups=100(users),10(wheel),1001(dockerroot),1002(docker)
I am running the dockerd daemon, which uses containerd and runc as its runtime.
I installed the crictl binary and pointed it at the existing dockershim socket with the config file below.
cat /etc/crictl.yaml
runtime-endpoint: unix:///var/run/dockershim.sock
image-endpoint: unix:///var/run/dockershim.sock
timeout: 2
debug: false
pull-image-on-create: false
crictl works fine with sudo, but without sudo it fails like this:
[user#hostname~]$ crictl ps
FATA[0002] connect: connect endpoint 'unix:///var/run/dockershim.sock', make sure you are running as root and the endpoint has been started: context deadline exceeded
I also tried changing the group of dockershim.sock from 'root' to 'docker', just like docker.sock, but I still get the same error.
srwxr-xr-x 1 root docker 0 Jan 2 23:36 /var/run/dockershim.sock
srw-rw---- 1 root docker 0 Jan 2 23:33 /var/run/docker.sock
sudo usermod -aG docker $USER
or see the Docker post-installation steps for Linux, as sketched below.
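For completeness, a sketch of the usual Docker post-install steps referenced above (this is what lets docker itself run without sudo; it does not by itself change who may access dockershim.sock):
```
# create the docker group if it does not already exist
sudo groupadd docker || true

# add the current user to the docker group
sudo usermod -aG docker $USER

# pick up the new group membership without logging out
newgrp docker

# verify that docker works without sudo
docker run --rm hello-world
```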

Ansible hangs using become when executed by jenkins user

Ansible playbook via Jenkins job
For my CI/CD pipeline, I'm using Jenkins to execute an Ansible playbook using the Ansible plugin. Jenkins and Ansible are running on the same GCP instance (Debian 10). Ansible runs the playbook against another GCP instance (also Debian 10) via the private network. When I run the playbook manually using my default GCP user, everything works flawlessly. However, when Jenkins tries to run the playbook, or when I run the playbook manually as the jenkins user, playbook execution hangs whenever become is set to yes.
Ansible stdout
[WindBox_main#2] $ ansible-playbook ./ansible/playbook.yml -i ./ansible/hosts
PLAY [all] *********************************************************************
TASK [Gathering Facts] *********************************************************
ok: [10.156.0.2]
TASK [docker_stack : Ping host] ************************************************
ok: [10.156.0.2]
TASK [docker_stack : Ping host with become] ************************************
I have the Ansible become password stored in a vars file (I know, it should be in the vault), and I verified that Ansible is able to access this variable. The private key used by Ansible to establish an SSH connection has the same owner and group as the vars file, and the SSH connection is not a problem.
Manually establishing an SSH connection with the jenkins user and private key works as expected, as does the sudo command when logged in as the ansible-cicd user.
Initial research pointed me towards this post, stating that the process running on the remote host might be waiting for user input, but this does not seem likely, as it does not happen when running as my default user. I assume that remotely everything runs as the ansible-cicd user, as specified in the hosts file.
I assume it is some sort of permission problem. It does not look like it's any of the files mentioned below. Is there something I am missing? Does Ansible's become require local sudo access?
Any help would be greatly appreciated.
Files
hosts
[dockermanager]
10.156.0.2 ansible_user=ansible-cicd ansible_ssh_private_key_file=/var/lib/jenkins/.ssh/id_rsa
[dockermanager:vars]
ansible_python_interpreter=/usr/bin/python3
ansible_connection=ssh
playbook.yml
---
- hosts: all
  vars:
    ansible_python_interpreter: /usr/bin/python3
    ansible_connection: local
  vars_files:
    - /etc/ansible/secrets.yml
  roles:
    - docker_stack
roles/docker_stack/main.yml (shortened for brevity, but these are the first 2 steps)
---
- name: Ping host
  ansible.builtin.ping:
- name: Ping host with become
  ansible.builtin.ping:
  become: yes
/var/lib/jenkins/.ssh/
total 20
drwx------ 2 jenkins jenkins 4096 Nov 23 22:12 .
drwxr-xr-x 26 jenkins jenkins 4096 Nov 25 15:27 ..
-rw------- 1 jenkins jenkins 1831 Nov 25 15:19 id_rsa
-rw-r--r-- 1 jenkins jenkins 405 Nov 25 15:19 id_rsa.pub
-rw-r--r-- 1 jenkins jenkins 666 Nov 25 15:13 known_hosts
/etc/ansible/
total 16
drwxr-xr-x 2 ansible ansible 4096 Nov 24 13:42 .
drwxr-xr-x 80 root root 4096 Nov 25 15:09 ..
-rw-r--r-- 1 jenkins jenkins 62 Nov 24 13:42 secrets.yml
As mentioned by @mdaniel in the comments, the ansible_connection variable was defined twice (in hosts and in the playbook) and ssh was overridden by local, meaning Ansible never actually connected to my remote machine.
Removing ansible_connection: local from the playbook vars solved the problem; the corrected vars are sketched below.
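A sketch of the corrected playbook header, with only the ansible_connection: local line removed (the connection settings stay in the hosts file):
```
---
- hosts: all
  vars:
    ansible_python_interpreter: /usr/bin/python3
  vars_files:
    - /etc/ansible/secrets.yml
  roles:
    - docker_stack
```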

Docker (Spotify) API - cannot connect to Docker

In my Docker (Spring Boot) application I would like to execute Docker commands. I use the docker-spotify-api (client).
I get different connection errors. I start the application as part of a docker-compose.yml.
This is what I tried so far on an EC2 AWS VPS:
docker = DefaultDockerClient.builder()
.uri(URI.create("tcp://localhost:2376"))
.build();
=> TCP protocol not supported.
docker = DefaultDockerClient.builder()
.uri(URI.create("tcp://localhost:2375"))
.build();
=> TCP protocol not supported.
docker = new DefaultDockerClient("unix:///var/run/docker.sock");
==> No such file
docker = DefaultDockerClient.builder()
.uri("unix:///var/run/docker.sock")
.build();
==> No such file
docker = DefaultDockerClient.builder()
.uri(URI.create("http://localhost:2375")).build();
or
docker = DefaultDockerClient.builder()
.uri(URI.create("http://localhost:2376")).build();
or
docker = DefaultDockerClient.builder()
.uri(URI.create("https://localhost:2376"))
.build();
==> Connect to localhost:2376 [localhost/127.0.0.1] failed: Connection refused (Connection refused)
This is my environment on the EC2 VPS:
$ ls -l /var/run
lrwxrwxrwx 1 root root 6 Nov 14 07:23 /var/run -> ../run
$ groups ec2-user
ec2-user : ec2-user adm wheel systemd-journal docker
$ ls -l /run/docker.sock
srw-rw---- 1 root docker 0 Feb 14 17:16 /run/docker.sock
echo $DOCKER_HOST $DOCKER_CERT_PATH
(empty)
This situation is similar to https://github.com/spotify/docker-client/issues/838#issuecomment-318261710.
You use docker-compose on the host to start up your application; within the container, the Spring Boot application is using docker-spotify-api.
What you can try is to mount /var/run/docker.sock:/var/run/docker.sock in your compose file.
As @Benjah1 indicated, /var/run/docker.sock has to be mounted first.
To do so in a docker-compose / Docker Swarm environment you can do:
volumes:
  - /var/run/docker.sock:/var/run/docker.sock
Furthermore, the other options resulted in errors because, by default, Docker won't open up to tcp/http connections. You can change this, of course, taking a small risk (see the sketch below).
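A sketch of how the TCP endpoint can be opened on a systemd-based host, should you really want to go that way (the plain-TCP API is unauthenticated, so keep it off public interfaces):
```
# open an editable systemd drop-in for the docker service
sudo systemctl edit docker.service
# add the following to the drop-in:
#   [Service]
#   ExecStart=
#   ExecStart=/usr/bin/dockerd -H fd:// -H tcp://127.0.0.1:2375

# reload systemd and restart the daemon
sudo systemctl daemon-reload
sudo systemctl restart docker
```
Note that from inside a container, localhost refers to the container itself, so the client would still need to reach the daemon via the host's address; the mounted socket above is the simpler route.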
What are the values of your DOCKER_HOST and DOCKER_CERT_PATH environment variables?
Try the below, as docker-client communicates with your local Docker daemon using the HTTP Remote API:
final DockerClient docker = DefaultDockerClient.builder()
.uri(URI.create("https://localhost:2376"))
.build();
Please also verify the privileges of docker.sock (is it visible to your app?) and check whether your Docker service is running or not; from the listing above your docker.sock looks empty, but if the service is running it should contain a pid.
It took me some time to figure out, but I was running https://hub.docker.com/r/alpine/socat/ locally, and also wanted to connect to my Docker daemon and couldn't (same errors). Then it struck me: the solution on that webpage uses 127.0.0.1 as the ip address to bind to. Instead, start that container with 0.0.0.0, and then inside your container, you can do this: DockerClient dockerClient = new DefaultDockerClient("http://192.168.1.215:2376"); (use your own ip-address of course).
This worked for me.

Permission issues in nexus3 docker container

When I start nexus3 in a docker container I get the following error messages.
$ docker run --rm sonatype/nexus3:3.8.0
Warning: Cannot open log file: ../sonatype-work/nexus3/log/jvm.log
Warning: Forcing option -XX:LogFile=/tmp/jvm.log
Java HotSpot(TM) 64-Bit Server VM warning: Cannot open file ../sonatype-work/nexus3/log/jvm.log due to Permission denied
Unable to update instance pid: Unable to create directory /nexus-data/instances
/nexus-data/log/karaf.log (Permission denied)
Unable to update instance pid: Unable to create directory /nexus-data/instances
It indicates that there is a file permission issue.
I am using Red Hat Enterprise Linux 7.5 as the host machine and the most recent Docker version.
On another machine (Ubuntu) it works fine.
The issue occurs in the persistent volume (/nexus-data). However, I do not mount a specific volume and let Docker use an anonymous one.
If I compare the volumes on both machines I can see the following permissions:
On Red Hat, where it is not working, it belongs to root.
$ docker run --rm sonatype/nexus3:3.8.0 ls -l /nexus-data
total 0
drwxr-xr-x. 2 root root 6 Mar 1 00:07 etc
drwxr-xr-x. 2 root root 6 Mar 1 00:07 log
drwxr-xr-x. 2 root root 6 Mar 1 00:07 tmp
On Ubuntu, where it is working, it belongs to nexus, which is also the default user in the container.
$ docker run --rm sonatype/nexus3:3.8.0 ls -l /nexus-data
total 12
drwxr-xr-x 2 nexus nexus 4096 Mar 1 00:07 etc
drwxr-xr-x 2 nexus nexus 4096 Mar 1 00:07 log
drwxr-xr-x 2 nexus nexus 4096 Mar 1 00:07 tmp
Changing the user with the -u option is not an option.
I could solve it by deleting all local docker images: docker image prune -a
Afterwards it downloaded the image again and it worked.
This is strange because I also compared the fingerprints of the images and they were identical.
An example docker-compose file for Nexus:
version: "3"
services:
  # Nexus
  nexus:
    image: sonatype/nexus3:3.39.0
    expose:
      - "8081"
      - "8082"
      - "8083"
    ports:
      # UI
      - "8081:8081"
      # repositories http
      - "8082:8082"
      - "8083:8083"
      # repositories https
      #- "8182:8182"
      #- "8183:8183"
    environment:
      - VIRTUAL_PORT=8081
    volumes:
      - "./nexus/data/nexus-data:/nexus-data"
Set up the volume:
mkdir -p ./nexus/data/nexus-data
sudo chown -R 200 nexus/ # 200 because it's the UID of the nexus user inside the container
Start Nexus
sudo docker-compose up -d
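To check that Nexus came up with the right permissions on the volume, something like this should do (a sketch):
```
# follow the startup logs; permission errors on /nexus-data would show up here
docker-compose logs -f nexus

# once started, the UI should answer on the published port
curl -sI http://localhost:8081
```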
You should give the correct rights to the folder where the persistent volume is located, e.g.:
chmod -R u+rwx <folder of /nexus-data volume>
Be careful: if you instead open it up with a+rwx (or 777), it gives read, write, and execute rights to all users. If you want more restricted rights, prefer the chown -R 200 approach shown above.
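If you want to double-check which UID to chown to, you can ask the image itself (a sketch; in the official image the nexus user has UID 200, matching the chown -R 200 above):
```
# print the uid/gid of the nexus user baked into the image
docker run --rm sonatype/nexus3:3.39.0 id nexus
```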

Docker-machine create with generic driver, Certificates not working but SSH does

I'm trying to get a docker-machine up and running on an Ubuntu 14.04 LTS server in our network. I have installed docker + docker-machine on the server and I'm able to create the docker-machine on the server with this command from my computer:
docker-machine create --driver generic --generic-ip-address 10.10.3.76 --generic-ssh-key "/Users/username/Documents/keys/mysshkey.pem" --generic-ssh-user ubuntuuser dockermachinename
The command above creates the docker-machine and I'm able to list it with
docker-machine ls
I'm able to SSH to it by running
docker-machine ssh dockermachinename
but when I try to connect to the server with (-D for debug information)
docker-machine -D env dockermachinename
I get the following message:
Docker Machine Version: 0.5.2 ( 0456b9f )
Found binary path at /usr/local/bin/docker-machine-driver-generic
Launching plugin server for driver generic
Plugin server listening at address 127.0.0.1:54213
() Calling .GetVersion
Using API Version 1
() Calling .SetConfigRaw
() Calling .GetMachineName
(dockermachinename) Calling .GetState
(dockermachinename) Calling .GetURL
Reading CA certificate from /Users/username/.docker/machine/certs/ca.pem
Reading server certificate from /Users/username/.docker/machine/machines/dockermachinename/server.pem
Reading server key from /Users/username/.docker/machine/machines/dockermachinename/server-key.pem
Error checking TLS connection: Error checking and/or regenerating the certs: There was an error validating certificates for host "10.10.3.76:2376": dial tcp 10.10.3.76:2376: i/o timeout
You can attempt to regenerate them using 'docker-machine regenerate-certs [name]'.
Be advised that this will trigger a Docker daemon restart which will stop running containers.
I really need to solve this so all help is appreciated!
On Ubuntu you will need to do the following steps:
1. Create a user which doesn't require a password
sudo visudo
At the end of the file add the following line (make sure to specify your username):
username ALL=(ALL:ALL) NOPASSWD: ALL
then save and exit. After that, add your username to the docker group like this (replace username with your actual username):
usermod -aG docker username
2. Edit docker config to open 2375 and 2376 ports
sudo systemctl edit docker.service
add the following snippet to that file:
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// -H tcp://0.0.0.0:2376 -H tcp://0.0.0.0:2375
then save and exit. After that, reload the config and restart the Docker daemon with:
sudo systemctl daemon-reload
sudo systemctl restart docker.service
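A quick sanity check that the daemon is now listening on the TCP ports (a sketch):
```
# confirm dockerd is listening on 2375/2376
sudo ss -lntp | grep -E '2375|2376'

# the unauthenticated 2375 endpoint should answer locally
curl -s http://localhost:2375/version
```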
3. Create docker-machine
Remove existing machine which is failing with:
docker-machine rm machine1
and try to create it one more time like this:
docker-machine create -d generic --generic-ip-address ip --generic-ssh-key ~/.ssh/key --generic-ssh-user username --generic-ssh-port 22 machine1
please replace ip, key, username, and machine1 with your actual values.
If this produces an error like this:
Error checking TLS connection: Error checking and/or regenerating the certs: There was an error validating certificates for host "192.168.0.26:2376": tls: oversized record received with length 20527
You can attempt to regenerate them using 'docker-machine regenerate-certs [name]'.
Be advised that this will trigger a Docker daemon restart which might stop running containers.
then SSH to your machine and cd into the following directory:
cd /etc/systemd/system/docker.service.d/
list all files in it with:
ls -l
you will probably have something like this:
-rw-r--r-- 1 root root 274 Jul 2 17:47 10-machine.conf
-rw-r--r-- 1 root root 101 Jul 2 17:46 override.conf
you will need to delete all files except 10-machine.conf with sudo rm.
After that, remove the machine you created and create it again. It should now work. I hope this helps. You may have already done steps 1 and 2; if so, skip them and just try to remove the override.conf file (or any file in that dir which is not 10-machine.conf).
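Once the machine is recreated, a quick way to verify the certificates and the connection from the client (a sketch using standard docker-machine commands):
```
# regenerate the client/server certs if docker-machine still complains about TLS
docker-machine regenerate-certs machine1

# load the client environment and talk to the remote daemon
eval "$(docker-machine env machine1)"
docker ps
```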
