Docker Swarm - Secrets problem when creating new container - docker

I am running a production Docker Swarm cluster with 3 manager nodes and many workers.
Every node (managers and workers) runs the same Docker and Btrfs version:
Server Version: 17.12.1-ce
Storage Driver: btrfs
Build Version: Btrfs v4.9.1
Library Version: 102
I have a deployed service with 1 replica. This service uses a secret:
"Secrets": [
{
"File": {
"Name": "/var/secret",
"UID": "0",
"GID": "0",
"Mode": 400
},
"SecretID": "vb8485hcixfhnqrp29m8lrfm2",
"SecretName": "supersecret"
}
This secret exists on the Docker Swarm manager leader:
{
    "ID": "vb8485hcixfhnqrp29m8lrfm2",
    "Version": {
        "Index": 124153
    },
    "CreatedAt": "2020-08-17T12:22:29.656205519Z",
    "UpdatedAt": "2020-08-17T12:22:29.656205519Z",
    "Spec": {
        "Name": "supersecret",
        "Labels": {}
    }
}
But I cannot start a container from this service. When I try to update it with "docker service update --force ${service_name}", I always get an exited container with this error:
Error response from daemon: unable to get secret from secret store: secret vb8485hcixfhnqrp29m8lrfm2 not found.
The container is created without a "secrets" folder in /var/lib/docker/containers/container_ID/:
drwx------. 1 root root 0 Aug 19 11:06 checkpoints
-rw-------. 1 root root 9305 Aug 19 11:41 config.v2.json
-rw-r--r--. 1 root root 1599 Aug 19 11:41 hostconfig.json
-rw-r--r--. 1 root root 13 Aug 19 11:41 hostname
-rw-r--r--. 1 root root 150 Aug 19 11:41 hosts
-rw-r--r--. 1 root root 48 Aug 19 11:41 resolv.conf
-rw-r--r--. 1 root root 71 Aug 19 11:41 resolv.conf.hash
drwx------. 1 root root 0 Aug 19 11:06 shm
I don't know what's wrong or what to do. Any help will be much appreciated.

In response to the comment:
Did you try to reference the secret by its name instead of its ID? Honestly, I never even thought about referencing secrets by their randomly generated ID when there is a stable, self-declared name. – Metin
Per the Docker API documentation:
SecretName is the name of the secret that this references, but this is
just provided for lookup/display purposes. The secret in the reference
will be identified by its ID.
This means that addressing the secret by its name likely won't solve the problem. I'm starting to suspect that this error message doesn't describe what is actually happening (or at least not fully).
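One workaround sometimes used for stale secret references is to recreate the secret under a new name and re-attach it, which forces Swarm to issue a fresh secret ID for the service to resolve. This is only a sketch, assuming the secret value is still at hand; the supersecret_v2 and my_service names are hypothetical:

```shell
# Create a fresh copy of the secret (the value is read from stdin),
# then swap the service over to it. Swarm assigns a new secret ID.
printf '%s' "$SECRET_VALUE" | docker secret create supersecret_v2 -

docker service update \
    --secret-rm supersecret \
    --secret-add source=supersecret_v2,target=/var/secret \
    my_service
```

If the new reference resolves, that would support the suspicion above that the stored reference, not the lookup itself, was the problem.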

Related

Elasticsearch create snapshot repo throws RepositoryVerificationException

I am trying to take a snapshot of an Elasticsearch cluster. The design is the following: there are 3 VMs, each running 1 master, 1 data, and 1 client node in Docker containers. Each VM has a volume attached for storage. So it's a cluster with 3 masters, 3 clients, 3 data nodes, and 3 volumes.
After reading the documentation I created a separate backup volume and attached it to one of the VMs. I then set up an NFS share between all 3 VMs that stores its data on the backup volume, and mounted the shared NFS directory as a volume on all the nodes in the cluster.
So now each VM has the following:
VM1:
drwxr-xr-x 16 root root 3560 Jul 24 10:30 dev
drwxr-xr-x 2 nobody nogroup 4096 Jul 24 11:49 elastic-backup
drwxr-xr-x 97 root root 4096 Jul 24 14:04 etc
drwxr-xr-x 5 root root 4096 Apr 27 12:53 home
VM2:
drwxr-xr-x 2 root root 4096 Jul 24 13:52 bin
drwxr-xr-x 3 root root 4096 Jul 24 12:09 boot
drwxr-xr-x 5 root root 4096 Jan 27 16:41 data
drwxr-xr-x 16 root root 3580 Jul 24 11:48 dev
drwxr-xr-x 2 nobody nogroup 4096 Jul 24 11:49 elastic-backup
VM3:
drwxr-xr-x 3 root root 4096 Jul 24 15:28 boot
drwxr-xr-x 5 root root 4096 Jan 27 16:41 data
drwxr-xr-x 16 root root 3560 Jul 24 10:30 dev
drwxr-xr-x 2 nobody nogroup 4096 Jul 24 15:34 elastic-backup
When I create a file in it, I can see and modify it, and the change is visible from each VM.
Elasticsearch docker nodes:
drwxr-xr-x 1 elasticsearch elasticsearch 4096 May 15 2018 config
drwxr-xr-x 4 elasticsearch elasticsearch 4096 Jul 23 12:15 data
drwxr-xr-x 2 elasticsearch elasticsearch 4096 Jul 24 15:08 elastic-backup
Each docker elasticsearch node has the same directory mounted. I can see all the files from each node.
The problem is that whenever I try to create a snapshot repository I get the following error:
Call:
PUT /_snapshot/elastic-backup-1
{
    "type": "fs",
    "settings": {
        "location": "/usr/share/elasticsearch/elastic-backup"
    }
}
Error:
{
"error": {
"root_cause": [
{
"type": "repository_verification_exception",
"reason": "[elastic-backup-1] [[some-id, 'RemoteTransportException[[master-2][VM2-ip][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[elastic-backup-1] a file written by master to the store [/usr/share/elasticsearch/elastic-backup] cannot be accessed on the node [{master-2}{some-id}{some-id}{VM2-ip}{VM2-ip}{zone=AZ2}]. This might indicate that the store [/usr/share/elasticsearch/elastic-backup] is not shared between this node and the master node or that permissions on the store don't allow reading files written by the master node];'], [some-id, 'RemoteTransportException[[data-2][VM2-ip][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[elastic-backup-1] a file written by master to the store [/usr/share/elasticsearch/elastic-backup] cannot be accessed on the node [{data-2}{some-id}{some-id}{VM2-ip}{VM2-ip}{zone=AZ2}]. This might indicate that the store [/usr/share/elasticsearch/elastic-backup] is not shared between this node and the master node or that permissions on the store don't allow reading files written by the master node];'], [some-id, 'RemoteTransportException[[data-1][VM1-ip][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[elastic-backup-1] a file written by master to the store [/usr/share/elasticsearch/elastic-backup] cannot be accessed on the node [{data-1}{some-id}{some-id}{VM1-ip}{VM1-ip}{zone=AZ1}]. This might indicate that the store [/usr/share/elasticsearch/elastic-backup] is not shared between this node and the master node or that permissions on the store don't allow reading files written by the master node];'], [some-id, 'RemoteTransportException[[master-1][VM1-ip][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[elastic-backup-1] a file written by master to the store [/usr/share/elasticsearch/elastic-backup] cannot be accessed on the node [{master-1}{some-id}{some-id}{VM1-ip}{VM1-ip}{zone=AZ1}]. 
This might indicate that the store [/usr/share/elasticsearch/elastic-backup] is not shared between this node and the master node or that permissions on the store don't allow reading files written by the master node];'], [some-id, 'RemoteTransportException[[data-3][VM3-ip][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[elastic-backup-1] a file written by master to the store [/usr/share/elasticsearch/elastic-backup] cannot be accessed on the node [{data-3}{some-id}{some-id}{VM3-ip}{VM3-ip}{zone=AZ1}]. This might indicate that the store [/usr/share/elasticsearch/elastic-backup] is not shared between this node and the master node or that permissions on the store don't allow reading files written by the master node];']]"
}
etc ..
Is there anything I am doing wrong? How can this be fixed?
As stated by Christian_Dahlqvist, you must provide a shared file system.
You need to have a shared volume, e.g. an NFS volume, behind the repository path that all nodes can access. This means that if node 1 writes a file, it will be visible to nodes 2 and 3. A directory in the local file system will therefore not work, even if the path is identical on all machines.
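A quick way to confirm the repository path really is one shared store is to write a marker file through one node's mount and read it back everywhere else, including from inside the containers, since that is the path the repository actually uses. A rough sketch with hypothetical hostnames and the container names from the question:

```shell
# Write a marker through the NFS mount on VM1...
ssh VM1 'echo marker > /elastic-backup/nfs-check'

# ...and confirm the same content is readable on the other VMs.
ssh VM2 'cat /elastic-backup/nfs-check'
ssh VM3 'cat /elastic-backup/nfs-check'

# Repeat from inside an Elasticsearch container on each VM, as the
# user the node runs as:
docker exec -u elasticsearch master-2 \
    cat /usr/share/elasticsearch/elastic-backup/nfs-check
```

If the in-container reads fail while the host reads succeed, a likely culprit is the nobody:nogroup ownership on the NFS mount versus the elasticsearch user inside the containers (e.g. root squashing on the export).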

devcontainer, how to make X display work (mount graphics inside docker in visual studio code)

Normally I use this trick to make X work inside docker:
docker run --rm -it -v /tmp/.X11-unix:/tmp/.X11-unix -v $HOME/Videos:/videos -e DISPLAY=unix$DISPLAY --name knl kdenlive
I tried doing the same on a devcontainer:
{
    "name": "example_dockerized_environment",
    "dockerFile": "Dockerfile",
    "extensions": [
        "ms-vscode.cpptools",
        "twxs.cmake",
        "eamodio.gitlens"
    ],
    "mounts": [
        "source=../,target=/home/project",
        "source=/tmp/.X11-unix, target=/tmp/.X11-unix"
    ],
    "containerEnv": {
        "DISPLAY": "unix:0"
    },
    "runArgs": ["--privileged"]
}
As you can see, I passed the DISPLAY value and also mounted /tmp/.X11-unix.
But I get
root#5e6a10efbea6:/workspaces/leuze_lidar_volume/sdk/quanergy_client-master# ./visualizer --host localhost
ERROR: In /build/vtk6-YpT4yb/vtk6-6.2.0+dfsg1/Rendering/OpenGL/vtkXOpenGLRenderWindow.cxx, line 1466
vtkXOpenGLRenderWindow (0x2293740): bad X server connection. DISPLAY=Aborted (core dumped)
When I do echo $DISPLAY inside the container I see nothing. I tried
export DISPLAY=unix:0, and then I get:
root#5e6a10efbea6:/workspaces/leuze_lidar_volume/sdk/quanergy_client-master# ./visualizer --host localhost
ERROR: In /build/vtk6-YpT4yb/vtk6-6.2.0+dfsg1/Rendering/OpenGL/vtkXOpenGLRenderWindow.cxx, line 1466
vtkXOpenGLRenderWindow (0x281dcf0): bad X server connection. DISPLAY=unix:0. Aborting.
Aborted (core dumped)
I also see no /tmp/.X11-unix inside the container:
root#5e6a10efbea6:/tmp# ls -la /tmp
total 16
drwxrwxrwt 1 root root 4096 Mar 18 03:47 .
drwxr-xr-x 1 root root 4096 Mar 18 03:45 ..
srwxr-xr-x 1 root root 0 Mar 18 03:42 vscode-git-askpass-c2ca47727522d7940b4cce1d99fcc88d32ccfefc.sock
srwxr-xr-x 1 root root 0 Mar 18 03:47 vscode-git-ipc-f52b0dbfd870db22481ea656170b7615ea1e6497.sock
srwxr-xr-x 1 root root 0 Mar 18 03:45 vscode-ipc-032f3099-16ea-4f5d-8561-586571a4aea9.sock
srwxr-xr-x 1 root root 0 Mar 18 03:32 vscode-ipc-425af2fc-ddb1-4554-b93b-3a5bede4c52d.sock
srwxr-xr-x 1 root root 0 Mar 18 03:47 vscode-ipc-58739ccc-fb7d-4289-808e-21d31c703d1a.sock
srwxr-xr-x 1 root root 0 Mar 18 03:42 vscode-ipc-aa7aed50-92e4-4b2b-b17e-d70c1bba595e.sock
-rw-r--r-- 1 root root 2342 Mar 18 03:46 vscode-remote-containers-6a199ce05d20a43a350860289798f388414d648c.js
srwxr-xr-x 1 root root 0 Mar 18 03:46 vscode-remote-containers-ipc-6a199ce05d20a43a350860289798f388414d648c.sock
srwxr-xr-x 1 root root 0 Mar 18 03:46 vscode-ssh-auth-6a199ce05d20a43a350860289798f388414d648c.sock
drwxr-xr-x 2 root root 4096 Mar 18 03:46 vscode-typescript0
root#5e6a10efbea6:/tmp#
This works:
{
    "name": "my_docker_environment",
    "dockerFile": "Dockerfile",
    "extensions": [
        "ms-vscode.cpptools",
        "twxs.cmake",
        "eamodio.gitlens",
        "ms-vscode.cmake-tools"
    ],
    "containerEnv": {
        "DISPLAY": "unix:0"
    },
    "mounts": [
        "source=/tmp/.X11-unix,target=/tmp/.X11-unix,type=bind,consistency=cached"
    ],
    "runArgs": ["--privileged"]
}
I don't recall whether "runArgs": ["--privileged"] is needed, but I believe it is not.
You might need to run
xhost local:root
in a terminal on the host before launching your app.
Tested on macOS; the only extra setting in devcontainer.json I used was:
"containerEnv": {
"DISPLAY": "docker.for.mac.host.internal:0"
},
On the host machine, run xhost +localhost before starting the devcontainer.
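For comparison, the working Linux devcontainer settings above correspond roughly to these plain docker run flags (the image name here is hypothetical):

```shell
# Allow the container's root user to talk to the local X server.
xhost +local:root

# Bind-mount the X socket and point DISPLAY at it, just as the
# devcontainer's mounts/containerEnv entries do.
docker run --rm -it --privileged \
    -v /tmp/.X11-unix:/tmp/.X11-unix \
    -e DISPLAY=unix:0 \
    my-dev-image
```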

Is it safe to delete docker logs generated at /var/lib/docker/containers/HASH

The log file is currently 13 GB. I don't know whether it is safe to delete it, or how to make it smaller.
root@faith:/var/lib/docker/containers/f1ac17e833be2e5d1586d34c51324178bd18f969d1046cbb59f10eaa4bcf84be# ls -alh
total 13G
drwx------ 2 root root 4.0K Mar 6 08:35 .
drwx------ 3 root root 4.0K Feb 24 11:00 ..
-rw-r--r-- 1 root root 2.1K Feb 24 10:15 config.json
-rw------- 1 root root 13G Feb 25 00:27 f1ac17e833be2e5d1586d34c51324178bd18f969d1046cbb59f10eaa4bcf84be-json.log
-rw-r--r-- 1 root root 611 Feb 24 10:15 hostconfig.json
-rw-r--r-- 1 root root 13 Feb 24 10:15 hostname
-rw-r--r-- 1 root root 175 Feb 24 10:15 hosts
-rw-r--r-- 1 root root 61 Feb 24 10:15 resolv.conf
-rw-r--r-- 1 root root 71 Feb 24 10:15 resolv.conf.hash
Congratulations, you have discovered one of The Big Unsolved Problems with Docker!
As Nathaniel says, Docker assumes it has complete ownership of things under /var/lib/docker so trying to delete files there from behind Docker's back may not work.
However, based on comments in issue 7333 and in PR 9753, it looks like people are successfully using logrotate with the copytruncate directive to rotate Docker logs. Both links are worth reading, because they contain a long discussion about the pitfalls of Docker logging and some potential solutions.
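The reason copytruncate is the directive that works here is that it empties the log file in place instead of deleting it, so the file descriptor dockerd holds open stays valid. The same effect can be had as a one-off with truncate, demonstrated below on a stand-in file (substitute the real *-json.log path at your own risk):

```shell
# Create a stand-in for a container's json.log file.
log=/tmp/demo-json.log
printf 'old log data\n' > "$log"

# Empty it in place: the inode, and any open file descriptor
# pointing at it, survives, unlike with rm.
truncate -s 0 "$log"

ls -l "$log"    # file still exists, now 0 bytes
```

Deleting the file instead would leave Docker writing to an unlinked inode, so the disk space would not even be reclaimed until the container stops.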
Ideally, Docker itself would have much better native support for log management. Until then, here are some alternatives to consider:
If you control the source for your applications, you can configure everything to log to syslog rather than to stdout/stderr. There are then a variety of solutions you can pursue, from running a syslog service inside your container to exposing the host's /dev/log inside the container.
Another option is to run systemd inside your container and use it to start your services. systemd will collect stdout/stderr from your services and feed it to journald, and journald will take care of things like log rotation (and also give you a reasonably flexible mechanism for querying the logs).
These ought to be cleaned up when you delete the container. (Thus, it is not OK to delete them, because Docker believes that it has control of /var/lib/docker.)

How to monitor docker containers log from non-root user?

I want to monitor Docker container logs as a non-root user (td-agent). On the host server, I ran:
sudo chmod o+rx /var/lib/docker
sudo find /var/lib/docker/containers/ -type d -exec chmod o+rx {} \;
sudo find /var/lib/docker/containers/ -type f -exec chmod o+r {} \;
But the containers directory gets rolled back to root-only permissions, and each container directory keeps them too:
# find /var/lib/docker/containers -ls
143142 4 drwx------ 4 root root 4096 Aug 14 12:01 /var/lib/docker/containers
146027 4 drwx------ 2 root root 4096 Aug 14 12:00 /var/lib/docker/containers/145efa73652aad14e1706e8fcd1597ccbbb49fd756047f3931270b46fe01945d
146031 4 -rw-r--r-- 1 root root 190 Aug 14 12:00 /var/lib/docker/containers/145efa73652aad14e1706e8fcd1597ccbbb49fd756047f3931270b46fe01945d/hostconfig.json
146046 4 -rw-r--r-- 1 root root 13 Aug 14 12:00 /var/lib/docker/containers/145efa73652aad14e1706e8fcd1597ccbbb49fd756047f3931270b46fe01945d/hostname
146047 4 -rw-r--r-- 1 root root 174 Aug 14 12:00 /var/lib/docker/containers/145efa73652aad14e1706e8fcd1597ccbbb49fd756047f3931270b46fe01945d/hosts
146030 4 -rw-r--r-- 1 root root 3305 Aug 14 12:00 /var/lib/docker/containers/145efa73652aad14e1706e8fcd1597ccbbb49fd756047f3931270b46fe01945d/config.json
146049 4 -rw------- 1 root root 1853 Aug 14 12:00 /var/lib/docker/containers/145efa73652aad14e1706e8fcd1597ccbbb49fd756047f3931270b46fe01945d/145efa73652aad14e1706e8fcd1597ccbbb49fd756047f3931270b46fe01945d-json.log
146050 4 drwx------ 2 root root 4096 Aug 14 12:01 /var/lib/docker/containers/f09796f978ef5bab1449d2d10d400228eb76376579e7e33c615313eeed53f370
146054 4 -rw-r--r-- 1 root root 190 Aug 14 12:01 /var/lib/docker/containers/f09796f978ef5bab1449d2d10d400228eb76376579e7e33c615313eeed53f370/hostconfig.json
146056 4 -rw-r--r-- 1 root root 13 Aug 14 12:01 /var/lib/docker/containers/f09796f978ef5bab1449d2d10d400228eb76376579e7e33c615313eeed53f370/hostname
146057 4 -rw-r--r-- 1 root root 174 Aug 14 12:01 /var/lib/docker/containers/f09796f978ef5bab1449d2d10d400228eb76376579e7e33c615313eeed53f370/hosts
146053 4 -rw-r--r-- 1 root root 3286 Aug 14 12:01 /var/lib/docker/containers/f09796f978ef5bab1449d2d10d400228eb76376579e7e33c615313eeed53f370/config.json
146058 4 -rw------- 1 root root 1843 Aug 14 12:01 /var/lib/docker/containers/f09796f978ef5bab1449d2d10d400228eb76376579e7e33c615313eeed53f370/f09796f978ef5bab1449d2d10d400228eb76376579e7e33c615313eeed53f370-json.log
How can I monitor each of these json.log files? Or is there another good way to do this monitoring?
logspout is another way to collect container logs. I'm not sure it is the best solution, but it is a very interesting and consistent approach.
You just need to run the logspout container. It forwards Docker containers' logs to a remote syslog server (you can also use the HTTP API; see the repository).
# (172.17.42.1 is host ip address)
$ docker run -v=/var/run/docker.sock:/tmp/docker.sock progrium/logspout syslog://172.17.42.1:5140
Fluentd running on the host can then handle these logs via the syslog protocol. Below is a td-agent.conf example. It receives logs over syslog and sends them to an Elasticsearch server (check this example project).
<source>
  type syslog
  port 5140
  bind 0.0.0.0
  tag syslog.udp
  format /^(?<time>.*?) (?<container_id>.*?) (?<container_name>.*?): (?<message>.*?)$/
  time_format %Y-%m-%dT%H:%M:%S%z
</source>
<match syslog.**>
  index_name <ES_INDEX_NAME>
  type_name <ES_TYPE_NAME>
  type elasticsearch
  host <ES_HOST>
  port <ES_PORT>
  flush_interval 3s
</match>
As I discussed in detail in this answer, I find the best approach is to configure the applications running within the container to log to syslog, and to mount the host's syslog socket into the container.
docker run -v /dev/log:/dev/log ...
The downside of this approach is that if the syslog daemon on the host is restarted, the container will lose its socket, since the daemon recreates the socket on restart.
A fix for this would be to add another socket (in rsyslog this can be done using the imuxsock module). Create the additional socket in some known directory, then bind mount the directory instead of /dev/log directly. The additional socket will also be removed when rsyslog restarts, but will be recreated and available to the application in the directory following the restart.
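In rsyslog's legacy configuration syntax, the extra socket described above might look like this (the file name and socket path here are illustrative, not canonical):

```
# /etc/rsyslog.d/container-socket.conf
$ModLoad imuxsock
$AddUnixListenSocket /var/run/rsyslog-docker/log
```

On the docker run side you would then bind-mount the directory, e.g. -v /var/run/rsyslog-docker:/var/run/rsyslog-docker, and point the application at the socket inside it, so the socket reappears in the mounted directory after an rsyslog restart.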
One easy way to deal with this issue is to mount the host's /sys/fs/cgroup into a Docker container that's running in_docker_metrics. See https://github.com/bdehamer/docker-librato
Sematext Docker Agent (open source, on GitHub) can do this for you; you won't need td-agent. SDA collects logs, but also events and metrics. See https://github.com/sematext/sematext-agent-docker and https://sematext.com/docker

docker error: x509: certificate signed by unknown authority

While running Docker commands, I keep getting this error:
$ sudo docker search mattdm/fedora
2014/06/05 22:12:25 Error: Get https://index.docker.io/v1/search?q=mattdm%2Ffedora: x509: certificate signed by unknown authority
I'm using Fedora 20 x86_64 without any HTTP proxy.
I searched with Google but failed to find any clue, and I have no idea how to troubleshoot this error. Could anyone give me some pointers on fixing it?
Here is some additional info that may help:
$ sudo docker version
Client version: 0.11.1
Client API version: 1.11
Go version (client): go1.2.1
Git commit (client): fb99f99/0.11.1
Server version: 0.11.1
Server API version: 1.11
Git commit (server): fb99f99/0.11.1
Go version (server): go1.2.1
$ curl https://index.docker.io/v1/search?q=mattdm/fedora
{"query": "mattdm/fedora", "num_results": 2, "results": [{"is_trusted": false, "is_official": false, "name": "mattdm/fedora", "star_count": 49, "description": "A basic Fedora image corresponding roughly to a minimal install, minus some things which don't make sense in a container. Use tag `f20` for Fedora 20 or `f19` for Fedora 19."}, {"is_trusted": false, "is_official": false, "name": "mattdm/fedora-small", "star_count": 8, "description": "A small Fedora image on which to build. Contains just enough that you'll be able to run `yum install` in your dockerfiles to create something useful. Use tag `f19` for Fedora 19."}]}
$ ls -l /etc/pki/tls/certs/
total 1500
lrwxrwxrwx. 1 root root 49 Feb 18 03:58 ca-bundle.crt -> /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
-rw-r--r--. 1 root root 713687 Jan 5 2013 ca-bundle.crt.rpmsave
lrwxrwxrwx. 1 root root 55 Feb 18 03:58 ca-bundle.trust.crt -> /etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt
-rw-r--r--. 1 root root 796502 Jan 5 2013 ca-bundle.trust.crt.rpmsave
-rw-r--r--. 1 root root 1338 Mar 14 12:13 ca-certificates.crt
-rw-------. 1 root root 1025 Sep 25 2012 localhost.crt
-rwxr-xr-x. 1 root root 610 Apr 8 08:36 make-dummy-cert
-rw-r--r--. 1 root root 2242 Apr 8 08:36 Makefile
-rwxr-xr-x. 1 root root 829 Apr 8 08:36 renew-dummy-cert
This turned out to be related to the CDN provider.
check here: https://github.com/dotcloud/docker/issues/6474
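When diagnosing this class of error, it helps to look at the certificate chain the endpoint actually serves, since a transparent proxy or a misbehaving CDN edge will present an issuer your CA bundle does not trust. A minimal sketch (requires network access to the registry):

```shell
# Print the issuer and subject of the certificate presented by the
# registry endpoint; an unexpected issuer points at interception or
# a CDN problem rather than at Docker itself.
openssl s_client -connect index.docker.io:443 \
    -servername index.docker.io </dev/null 2>/dev/null \
  | openssl x509 -noout -issuer -subject
```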
