How to start InfluxDB 2.0.2? - influxdb

OK, I installed it on Ubuntu 20.04 as described on the official InfluxDB downloads page https://portal.influxdata.com/downloads/, specifically with these commands:
wget https://dl.influxdata.com/influxdb/releases/influxdb_2.0.2_amd64.deb
sudo dpkg -i influxdb_2.0.2_amd64.deb
Then I ran the commands to start the daemon and make it persistent across reboots:
systemctl enable --now influxdb
systemctl status influxdb
and the output shows the service as active and running normally:
● influxdb.service - InfluxDB is an open-source, distributed, time series database
Loaded: loaded (/lib/systemd/system/influxdb.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2020-11-20 17:43:54 -03; 55min ago
Docs: https://docs.influxdata.com/influxdb/
Main PID: 750 (influxd)
Tasks: 7 (limit: 1067)
Memory: 33.8M
CGroup: /system.slice/influxdb.service
└─750 /usr/bin/influxd
Nov 20 17:44:03 hypercc influxd[750]: ts=2020-11-20T20:44:03.754479Z lvl=info msg="Open store (start)" log_id=0QarEkHl000 service=storage-engine op_name=tsdb_open op_event=start
Nov 20 17:44:03 hypercc influxd[750]: ts=2020-11-20T20:44:03.754575Z lvl=info msg="Open store (end)" log_id=0QarEkHl000 service=storage-engine op_name=tsdb_open op_event=end op_elapsed=0.098ms
Nov 20 17:44:03 hypercc influxd[750]: ts=2020-11-20T20:44:03.754661Z lvl=info msg="Starting retention policy enforcement service" log_id=0QarEkHl000 service=retention check_interval=30m
Nov 20 17:44:03 hypercc influxd[750]: ts=2020-11-20T20:44:03.754888Z lvl=info msg="Starting precreation service" log_id=0QarEkHl000 service=shard-precreation check_interval=10m advance_period=30m
Nov 20 17:44:03 hypercc influxd[750]: ts=2020-11-20T20:44:03.755164Z lvl=info msg="Starting query controller" log_id=0QarEkHl000 service=storage-reads concurrency_quota=10 initial_memory_bytes_quota_per_query=9223372036854775807 memory_bytes_quota_per_query=9223372036854775807 max_memory_bytes=0 queue_size=10
Nov 20 17:44:03 hypercc influxd[750]: ts=2020-11-20T20:44:03.755725Z lvl=info msg="Configuring InfluxQL statement executor (zeros indicate unlimited)." log_id=0QarEkHl000 max_select_point=0 max_select_series=0 max_select_buckets=0
Nov 20 17:44:04 hypercc influxd[750]: ts=2020-11-20T20:44:04.071001Z lvl=info msg=Starting log_id=0QarEkHl000 service=telemetry interval=8h
Nov 20 17:44:04 hypercc influxd[750]: ts=2020-11-20T20:44:04.071525Z lvl=info msg=Listening log_id=0QarEkHl000 transport=http addr=:8086 port=8086
Nov 20 18:14:03 hypercc influxd[750]: ts=2020-11-20T21:14:03.757182Z lvl=info msg="Retention policy deletion check (start)" log_id=0QarEkHl000 service=retention op_name=retention_delete_check op_event=start
Nov 20 18:14:03 hypercc influxd[750]: ts=2020-11-20T21:14:03.757233Z lvl=info msg="Retention policy deletion check (end)" log_id=0QarEkHl000 service=retention op_name=retention_delete_check op_event=end op_elapsed=0.074ms
What do I need to add to be able to type influx and go directly to the DB to run queries? Is it something to do with the IP address?
When I run influx, I only get the help options; it doesn't say anything about connecting or anything like that.
By the way, here https://docs.influxdata.com/influxdb/v2.0/get-started/ it is installed in a different way, but supposedly both ways work fine.
Thanks.

Usually tools like Telegraf are used to collect data and write it to InfluxDB. You can install Telegraf on each server you want to collect data from.
https://docs.influxdata.com/telegraf/v1.17/
You can browse to http://your_server_ip:8086 and log in to Chronograf (included in InfluxDB 2.0). There you can create dashboards and query data from InfluxDB.
It's also possible to run manual queries via the InfluxDB CLI, using the influx query command in your terminal.
https://docs.influxdata.com/influxdb/v2.0/query-data/
Note that some commands need authentication before you are allowed to execute them (e.g. the user command). You can authenticate by adding the -t parameter followed by a valid user token (which can be found in the web interface).
Example: influx -t token_here user list
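As a rough sketch of a query from the terminal (the bucket, org, and token names below are placeholders; if you have not gone through the initial setup yet, influx setup creates the first user, organization, and bucket):
influx setup                                                                          # one-time interactive setup
influx query 'from(bucket: "my-bucket") |> range(start: -1h)' -o my-org -t my-token   # run a Flux query against your bucket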
Hope this helps you out.

Related

Unable to fix error Cannot connect to the Docker daemon at tcp://localhost:2375/. Is the docker daemon running? for remote GitLab runner

I am struggling to resolve the issue
Cannot connect to the Docker daemon at tcp://localhost:2375/. Is the docker daemon running?
I am using our company's GitLab EE instance, which comes with a number of shared group runners. However, I would like to be able to use my own runners, especially since that will let me use the GPU for some machine learning tasks. I have the following .gitlab-ci.yml:
run_tests:
  image: python:3.9-slim-buster
  before_script:
    - apt-get update
    - apt-get install make
  script:
    - python --version
    - pip --version
    - make test

build_image:
  image: docker:20.10.23
  services:
    - docker:20.10.23-dind
  variables:
    DOCKER_TLS_CRETDIR: "/certs"
    DOCKER_HOST: tcp://localhost:2375/
  before_script:
    - echo "User $REGISTRY_USER"
    - echo "Token $ACCESS_TOKEN"
    - echo "Host $REGISTRY_HOST_ALL"
    - echo "$ACCESS_TOKEN" | docker login --username $REGISTRY_USER --password-stdin $REGISTRY_HOST_ALL
  script:
    - docker build --tag $REGISTRY_HOST_ALL/<PATH_TO_USER>/python-demoapp .
    - docker push $REGISTRY_HOST_ALL/<PATH_TO_USER>/python-demoapp
The application is currently a demo and it's used in the following tutorial. Note that <PATH_TO_USER> in the above URLs is just a placeholder (I cannot reveal the original one since it contains internal information) and points at my account space, where the project python-demoapp is located. With untagged jobs enabled, I am hoping to have the following workflow:
1. Push application code change
2. GitLab pipeline triggered
   2.1 Execute tests
   2.2 Build image
   2.3 Push image to container repository
3. Re-use image with application inside (e.g. run locally)
I have set up the variables accordingly to contain my username, an access token (generated in GitLab) and the registry host. All of these are correct, and I am able to execute everything up to the docker build ... step.
Now, as for the runner, I followed the instructions provided in GitLab to set it up. I chose to create a VM (QEMU+KVM+libvirt) with a standard minimal installation of Debian 11 and everything set to default (including NAT networking, which appears to be working since I can access the Internet through it), where the runner currently resides. I am doing this in order to save the setup and later transfer it onto a server and run multiple instances of the VM with slight modifications (e.g. GPU passthrough for an Nvidia CUDA Docker/Podman setup).
Besides the runner (the binary was downloaded from our GitLab instance), I installed Docker CE (which will in the future be replaced with Podman due to licensing and pricing) following the official instructions. Docker runs as a systemd service (docker.service, docker.socket), i.e. I need sudo to interact with it. The runner has its own user (also part of the sudo group), as the official documentation tells me to do.
The GitLab runner's configuration file gitlab-runner-config.toml contains the following information:
concurrent = 1
check_interval = 0
shutdown_timeout = 0

[session_server]
  session_timeout = 1800

[[runners]]
  name = "Test runner (Debian 11 VM, Docker CE, personal computer)"
  url = "<COMPANY_GITLAB_INSTANCE_URL>"
  id = <RUNNER_ID>
  token = "<ACCESS_TOKEN>"
  token_obtained_at = 2023-01-24T09:18:33Z
  token_expires_at = 2023-02-01T00:00:00Z
  executor = "docker"
  [runners.custom_build_dir]
  [runners.cache]
    MaxUploadedArchiveSize = 0
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.docker]
    tls_verify = false
    image = "python:3.9-slim-buster"
    privileged = true
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    cache_dir = "/cache"
    volumes = ["/cache", "/certs/client", "/var/run/docker.sock"]
    shm_size = 0
The configuration file was generated by running
sudo gitlab-runner register --url <COMPANY_GITLAB_INSTANCE_URL> --registration-token <ACCESS_TOKEN>
I added the extra cache volumes beside /cache and the cache_dir, and changed privileged to true (based on my research). All of this is based on various posts (including Docker's own issue tracker) from people having the same issue.
I have made sure that dockerd is listening on the respective port (see the comments below):
$ sudo ss -nltp
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
LISTEN 0 128 0.0.0.0:22 0.0.0.0:* users:(("sshd",pid=601,fd=3))
LISTEN 0 128 [::]:22 [::]:* users:(("sshd",pid=601,fd=4))
LISTEN 0 4096 *:2375 *:* users:(("dockerd",pid=618,fd=9))
In addition, I have added export DOCKER_HOST=tcp://0.0.0.0:2375 to the .bashrc of every user out there (except root - perhaps that's the problem?), including the gitlab-runner user.
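A quick, generic way to check whether that TCP endpoint is actually reachable for the runner's user (just an illustration; note that .bashrc is normally only read by interactive shells, so a service will not see the export):
sudo -u gitlab-runner docker -H tcp://localhost:2375 info   # should print daemon info if the TCP socket is reachable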
The Dockerfile within the repository contains the following:
FROM python:3.9-slim-buster
RUN apt-get update && apt-get install make
The log from the CI/CD pipeline for this job is (trimmed down) as follows:
Running with gitlab-runner 15.8.0 (12335144)
on Test runner (Debian 11 VM, Docker CE, personal computer) <IDENTIFIER>, system ID: <SYSTEM_ID>
Preparing the "docker" executor 02:34
Using Docker executor with image docker:20.10.23 ...
Starting service docker:20.10.23-dind ...
Pulling docker image docker:20.10.23-dind ...
Using docker image sha256:70ae571e74c1d711d3d5bf6f47eaaf6a51dd260fe0036c7d6894c008e7d24297 for docker:20.10.23-dind with digest docker#sha256:85a1b877d0f59fd6c7eebaff67436e26f460347a79229cf054dbbe8d5ae9f936 ...
Waiting for services to be up and running (timeout 30 seconds)...
*** WARNING: Service runner-dbms-tss-project-42787-concurrent-0-b0bbcfd1a821fc06-docker-0 probably didn't start properly.
Health check error:
service "runner-dbms-tss-project-42787-concurrent-0-b0bbcfd1a821fc06-docker-0-wait-for-service" timeout
Health check container logs:
Service container logs:
2023-01-26T10:09:30.933962365Z Certificate request self-signature ok
2023-01-26T10:09:30.933981575Z subject=CN = docker:dind server
2023-01-26T10:09:30.943472545Z /certs/server/cert.pem: OK
2023-01-26T10:09:32.607191653Z Certificate request self-signature ok
2023-01-26T10:09:32.607205915Z subject=CN = docker:dind client
2023-01-26T10:09:32.616426179Z /certs/client/cert.pem: OK
2023-01-26T10:09:32.705354066Z time="2023-01-26T10:09:32.705227099Z" level=info msg="Starting up"
2023-01-26T10:09:32.706355355Z time="2023-01-26T10:09:32.706298649Z" level=warning msg="could not change group /var/run/docker.sock to docker: group docker not found"
2023-01-26T10:09:32.707357671Z time="2023-01-26T10:09:32.707318325Z" level=info msg="libcontainerd: started new containerd process" pid=72
2023-01-26T10:09:32.707460567Z time="2023-01-26T10:09:32.707425103Z" level=info msg="parsed scheme: \"unix\"" module=grpc
2023-01-26T10:09:32.707466043Z time="2023-01-26T10:09:32.707433214Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
2023-01-26T10:09:32.707468621Z time="2023-01-26T10:09:32.707445818Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock <nil> 0 <nil>}] <nil> <nil>}" module=grpc
2023-01-26T10:09:32.707491420Z time="2023-01-26T10:09:32.707459517Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
2023-01-26T10:09:32.768123834Z time="2023-01-26T10:09:32Z" level=warning msg="containerd config version `1` has been deprecated and will be removed in containerd v2.0, please switch to version `2`, see https://github.com/containerd/containerd/blob/main/docs/PLUGINS.md#version-header"
2023-01-26T10:09:32.768761837Z time="2023-01-26T10:09:32.768714616Z" level=info msg="starting containerd" revision=5b842e528e99d4d4c1686467debf2bd4b88ecd86 version=v1.6.15
2023-01-26T10:09:32.775684382Z time="2023-01-26T10:09:32.775633270Z" level=info msg="loading plugin \"io.containerd.content.v1.content\"..." type=io.containerd.content.v1
2023-01-26T10:09:32.775764839Z time="2023-01-26T10:09:32.775729470Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.aufs\"..." type=io.containerd.snapshotter.v1
2023-01-26T10:09:32.779824244Z time="2023-01-26T10:09:32.779733556Z" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.aufs\"..." error="aufs is not supported (modprobe aufs failed: exit status 1 \"ip: can't find device 'aufs'\\nmodprobe: can't change directory to '/lib/modules': No such file or directory\\n\"): skip plugin" type=io.containerd.snapshotter.v1
2023-01-26T10:09:32.779836825Z time="2023-01-26T10:09:32.779790644Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.btrfs\"..." type=io.containerd.snapshotter.v1
2023-01-26T10:09:32.779932891Z time="2023-01-26T10:09:32.779904447Z" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.btrfs\"..." error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.btrfs (ext4) must be a btrfs filesystem to be used with the btrfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
2023-01-26T10:09:32.779944348Z time="2023-01-26T10:09:32.779929392Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.devmapper\"..." type=io.containerd.snapshotter.v1
2023-01-26T10:09:32.779958443Z time="2023-01-26T10:09:32.779940747Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.devmapper" error="devmapper not configured"
2023-01-26T10:09:32.779963141Z time="2023-01-26T10:09:32.779951447Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.native\"..." type=io.containerd.snapshotter.v1
2023-01-26T10:09:32.780022382Z time="2023-01-26T10:09:32.780000266Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.overlayfs\"..." type=io.containerd.snapshotter.v1
2023-01-26T10:09:32.780134525Z time="2023-01-26T10:09:32.780107812Z" level=info msg="loading plugin \"io.containerd.snapshotter.v1.zfs\"..." type=io.containerd.snapshotter.v1
2023-01-26T10:09:32.780499276Z time="2023-01-26T10:09:32.780466045Z" level=info msg="skip loading plugin \"io.containerd.snapshotter.v1.zfs\"..." error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter: skip plugin" type=io.containerd.snapshotter.v1
2023-01-26T10:09:32.780507315Z time="2023-01-26T10:09:32.780489797Z" level=info msg="loading plugin \"io.containerd.metadata.v1.bolt\"..." type=io.containerd.metadata.v1
2023-01-26T10:09:32.780548237Z time="2023-01-26T10:09:32.780529316Z" level=warning msg="could not use snapshotter devmapper in metadata plugin" error="devmapper not configured"
2023-01-26T10:09:32.780552144Z time="2023-01-26T10:09:32.780544232Z" level=info msg="metadata content store policy set" policy=shared
2023-01-26T10:09:32.795982271Z time="2023-01-26T10:09:32.795854170Z" level=info msg="loading plugin \"io.containerd.differ.v1.walking\"..." type=io.containerd.differ.v1
2023-01-26T10:09:32.795991535Z time="2023-01-26T10:09:32.795882407Z" level=info msg="loading plugin \"io.containerd.event.v1.exchange\"..." type=io.containerd.event.v1
2023-01-26T10:09:32.795993243Z time="2023-01-26T10:09:32.795894367Z" level=info msg="loading plugin \"io.containerd.gc.v1.scheduler\"..." type=io.containerd.gc.v1
2023-01-26T10:09:32.795994639Z time="2023-01-26T10:09:32.795932065Z" level=info msg="loading plugin \"io.containerd.service.v1.introspection-service\"..." type=io.containerd.service.v1
2023-01-26T10:09:32.795996061Z time="2023-01-26T10:09:32.795949931Z" level=info msg="loading plugin \"io.containerd.service.v1.containers-service\"..." type=io.containerd.service.v1
2023-01-26T10:09:32.795997456Z time="2023-01-26T10:09:32.795963627Z" level=info msg="loading plugin \"io.containerd.service.v1.content-service\"..." type=io.containerd.service.v1
2023-01-26T10:09:32.796001074Z time="2023-01-26T10:09:32.795983562Z" level=info msg="loading plugin \"io.containerd.service.v1.diff-service\"..." type=io.containerd.service.v1
2023-01-26T10:09:32.796219139Z time="2023-01-26T10:09:32.796194319Z" level=info msg="loading plugin \"io.containerd.service.v1.images-service\"..." type=io.containerd.service.v1
2023-01-26T10:09:32.796231068Z time="2023-01-26T10:09:32.796216520Z" level=info msg="loading plugin \"io.containerd.service.v1.leases-service\"..." type=io.containerd.service.v1
2023-01-26T10:09:32.796240878Z time="2023-01-26T10:09:32.796228403Z" level=info msg="loading plugin \"io.containerd.service.v1.namespaces-service\"..." type=io.containerd.service.v1
2023-01-26T10:09:32.796254974Z time="2023-01-26T10:09:32.796239993Z" level=info msg="loading plugin \"io.containerd.service.v1.snapshots-service\"..." type=io.containerd.service.v1
2023-01-26T10:09:32.796261567Z time="2023-01-26T10:09:32.796252251Z" level=info msg="loading plugin \"io.containerd.runtime.v1.linux\"..." type=io.containerd.runtime.v1
2023-01-26T10:09:32.796385360Z time="2023-01-26T10:09:32.796360610Z" level=info msg="loading plugin \"io.containerd.runtime.v2.task\"..." type=io.containerd.runtime.v2
2023-01-26T10:09:32.796451372Z time="2023-01-26T10:09:32.796435082Z" level=info msg="loading plugin \"io.containerd.monitor.v1.cgroups\"..." type=io.containerd.monitor.v1
2023-01-26T10:09:32.797042788Z time="2023-01-26T10:09:32.796984264Z" level=info msg="loading plugin \"io.containerd.service.v1.tasks-service\"..." type=io.containerd.service.v1
2023-01-26T10:09:32.797093357Z time="2023-01-26T10:09:32.797073997Z" level=info msg="loading plugin \"io.containerd.grpc.v1.introspection\"..." type=io.containerd.grpc.v1
2023-01-26T10:09:32.797100437Z time="2023-01-26T10:09:32.797091084Z" level=info msg="loading plugin \"io.containerd.internal.v1.restart\"..." type=io.containerd.internal.v1
2023-01-26T10:09:32.797148696Z time="2023-01-26T10:09:32.797138286Z" level=info msg="loading plugin \"io.containerd.grpc.v1.containers\"..." type=io.containerd.grpc.v1
2023-01-26T10:09:32.797164876Z time="2023-01-26T10:09:32.797153186Z" level=info msg="loading plugin \"io.containerd.grpc.v1.content\"..." type=io.containerd.grpc.v1
2023-01-26T10:09:32.797176732Z time="2023-01-26T10:09:32.797165488Z" level=info msg="loading plugin \"io.containerd.grpc.v1.diff\"..." type=io.containerd.grpc.v1
2023-01-26T10:09:32.797187328Z time="2023-01-26T10:09:32.797176464Z" level=info msg="loading plugin \"io.containerd.grpc.v1.events\"..." type=io.containerd.grpc.v1
2023-01-26T10:09:32.797208889Z time="2023-01-26T10:09:32.797196407Z" level=info msg="loading plugin \"io.containerd.grpc.v1.healthcheck\"..." type=io.containerd.grpc.v1
2023-01-26T10:09:32.797220812Z time="2023-01-26T10:09:32.797209290Z" level=info msg="loading plugin \"io.containerd.grpc.v1.images\"..." type=io.containerd.grpc.v1
2023-01-26T10:09:32.797232031Z time="2023-01-26T10:09:32.797221051Z" level=info msg="loading plugin \"io.containerd.grpc.v1.leases\"..." type=io.containerd.grpc.v1
2023-01-26T10:09:32.797242686Z time="2023-01-26T10:09:32.797231676Z" level=info msg="loading plugin \"io.containerd.grpc.v1.namespaces\"..." type=io.containerd.grpc.v1
2023-01-26T10:09:32.797254415Z time="2023-01-26T10:09:32.797243815Z" level=info msg="loading plugin \"io.containerd.internal.v1.opt\"..." type=io.containerd.internal.v1
2023-01-26T10:09:32.797484534Z time="2023-01-26T10:09:32.797456547Z" level=info msg="loading plugin \"io.containerd.grpc.v1.snapshots\"..." type=io.containerd.grpc.v1
2023-01-26T10:09:32.797500729Z time="2023-01-26T10:09:32.797487444Z" level=info msg="loading plugin \"io.containerd.grpc.v1.tasks\"..." type=io.containerd.grpc.v1
2023-01-26T10:09:32.797524336Z time="2023-01-26T10:09:32.797502098Z" level=info msg="loading plugin \"io.containerd.grpc.v1.version\"..." type=io.containerd.grpc.v1
2023-01-26T10:09:32.797535447Z time="2023-01-26T10:09:32.797526933Z" level=info msg="loading plugin \"io.containerd.tracing.processor.v1.otlp\"..." type=io.containerd.tracing.processor.v1
2023-01-26T10:09:32.797562995Z time="2023-01-26T10:09:32.797539848Z" level=info msg="skip loading plugin \"io.containerd.tracing.processor.v1.otlp\"..." error="no OpenTelemetry endpoint: skip plugin" type=io.containerd.tracing.processor.v1
2023-01-26T10:09:32.797570791Z time="2023-01-26T10:09:32.797558864Z" level=info msg="loading plugin \"io.containerd.internal.v1.tracing\"..." type=io.containerd.internal.v1
2023-01-26T10:09:32.797589770Z time="2023-01-26T10:09:32.797579849Z" level=error msg="failed to initialize a tracing processor \"otlp\"" error="no OpenTelemetry endpoint: skip plugin"
2023-01-26T10:09:32.797766243Z time="2023-01-26T10:09:32.797741256Z" level=info msg=serving... address=/var/run/docker/containerd/containerd-debug.sock
2023-01-26T10:09:32.797805542Z time="2023-01-26T10:09:32.797792483Z" level=info msg=serving... address=/var/run/docker/containerd/containerd.sock.ttrpc
2023-01-26T10:09:32.797836935Z time="2023-01-26T10:09:32.797820296Z" level=info msg=serving... address=/var/run/docker/containerd/containerd.sock
2023-01-26T10:09:32.797854712Z time="2023-01-26T10:09:32.797842891Z" level=info msg="containerd successfully booted in 0.029983s"
2023-01-26T10:09:32.802286356Z time="2023-01-26T10:09:32.802232926Z" level=info msg="parsed scheme: \"unix\"" module=grpc
2023-01-26T10:09:32.802291484Z time="2023-01-26T10:09:32.802269035Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
2023-01-26T10:09:32.802322916Z time="2023-01-26T10:09:32.802306355Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock <nil> 0 <nil>}] <nil> <nil>}" module=grpc
2023-01-26T10:09:32.802369464Z time="2023-01-26T10:09:32.802323232Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
2023-01-26T10:09:32.803417318Z time="2023-01-26T10:09:32.803366010Z" level=info msg="parsed scheme: \"unix\"" module=grpc
2023-01-26T10:09:32.803424723Z time="2023-01-26T10:09:32.803376046Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
2023-01-26T10:09:32.803426453Z time="2023-01-26T10:09:32.803384392Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock <nil> 0 <nil>}] <nil> <nil>}" module=grpc
2023-01-26T10:09:32.803428210Z time="2023-01-26T10:09:32.803389450Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
2023-01-26T10:09:32.837720263Z time="2023-01-26T10:09:32.837658881Z" level=info msg="Loading containers: start."
2023-01-26T10:09:32.886897024Z time="2023-01-26T10:09:32.886828923Z" level=info msg="Default bridge (docker0) is assigned with an IP address 172.18.0.0/16. Daemon option --bip can be used to set a preferred IP address"
2023-01-26T10:09:32.920867085Z time="2023-01-26T10:09:32.920800006Z" level=info msg="Loading containers: done."
2023-01-26T10:09:32.944768798Z time="2023-01-26T10:09:32.944696558Z" level=info msg="Docker daemon" commit=6051f14 graphdriver(s)=overlay2 version=20.10.23
2023-01-26T10:09:32.944804324Z time="2023-01-26T10:09:32.944774928Z" level=info msg="Daemon has completed initialization"
2023-01-26T10:09:32.973804146Z time="2023-01-26T10:09:32.973688991Z" level=info msg="API listen on /var/run/docker.sock"
2023-01-26T10:09:32.976059008Z time="2023-01-26T10:09:32.975992051Z" level=info msg="API listen on [::]:2376"
*********
Pulling docker image docker:20.10.23 ...
Using docker image sha256:25deb61ef2709b05249ad4e66f949fd572fb43d67805d5ea66fe3f86766b5cef for docker:20.10.23 with digest docker#sha256:2655039c6abfc8a1d75978c5258fccd5c5cedf880b6cfc72077f076d0672c70a ...
Preparing environment 00:00
Running on runner-dbms-tss-project-42787-concurrent-0 via debian...
Getting source from Git repository 00:02
Fetching changes with git depth set to 20...
Reinitialized existing Git repository in /builds/<PATH_TO_USER>/python-demoapp/.git/
Checking out 93e494ea as master...
Skipping Git submodules setup
Executing "step_script" stage of the job script 00:01
Using docker image sha256:25deb61ef2709b05249ad4e66f949fd572fb43d67805d5ea66fe3f86766b5cef for docker:20.10.23 with digest docker#sha256:2655039c6abfc8a1d75978c5258fccd5c5cedf880b6cfc72077f076d0672c70a ...
$ echo "User $REGISTRY_USER"
User [MASKED]
$ echo "Token $ACCESS_TOKEN"
Token [MASKED]
$ echo "Host $REGISTRY_HOST_ALL"
Host ..............
$ echo "$ACCESS_TOKEN" | docker login --username $REGISTRY_USER --password-stdin $REGISTRY_HOST_ALL
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store
Login Succeeded
$ docker build --tag $REGISTRY_HOST_ALL/<PATH_TO_USER>/python-demoapp .
Cannot connect to the Docker daemon at tcp://localhost:2375/. Is the docker daemon running?
Cleaning up project directory and file based variables 00:00
ERROR: Job failed: exit code 1
From my understanding I need two images here:
1. The Python-capable one - here the official Python image from Docker Hub, which is used to run the tests as well as for the image that is added to the container registry.
2. The Docker DinD one - this is the Docker-in-Docker setup, which allows building a Docker image inside a running Docker container.
The second one is way above my head and it's the (for me) obvious culprit for my headaches.
Perhaps important additional information: my computer is outside our company's network. The GitLab instance is accessible externally through user authentication (username + password for the WebUI, access tokens and SSH keys otherwise).
Do I need two separate runners? I have seen a lot of examples where people use a single runner for multiple jobs, including testing and image building (even packaging), so I don't believe I do. I am not really a Docker expert, as you can probably tell. :D If more information is required, please let me know in the comments below, especially if I am overdoing it and there is a much easier way to accomplish what I am trying to do.
DISCUSSION
Health check error regarding Docker volume
I can see the following error in the log posted above:
Health check error:
service "runner-dbms-tss-project-42787-concurrent-0-b0bbcfd1a821fc06-docker-0-wait-for-service" timeout
The name looked familiar, so I went back to check some old commands I had executed, and apparently this is a Docker volume. However, on my host
$ docker volume ls
DRIVER    VOLUME NAME
local     runner-...415a70
local     runner-...66cea8
neither volume has that name. So I am guessing this is a volume created by Docker in Docker.
Adding hosts to JSON configuration file for Docker daemon
I added the following configuration at /etc/systemd/system/docker.service.d/90-docker.conf:
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --config-file /etc/docker/daemon.json
with daemon.json containing the following:
{
  "hosts": [
    "tcp://0.0.0.0:2375",
    "unix:///var/run/docker.sock"
  ]
}
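After changing the drop-in and daemon.json, one generic way to confirm that dockerd actually picked up the TCP listener (these are the standard Docker Engine API endpoints, nothing specific to this setup):
sudo systemctl daemon-reload && sudo systemctl restart docker
curl http://localhost:2375/_ping            # should answer with OK if the TCP socket is live
docker -H tcp://localhost:2375 version      # talks to the daemon over TCP instead of the unix socket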
Now I am noticing an additional error in the job's log:
failed to load listeners: can't create unix socket /var/run/docker.sock: is a directory
On my host I checked, and the path is an actual socket file (information retrieved by running the file command on the path). This means that the issue is again inside the Docker container that is part of the DinD setup. I have read online that Docker apparently creates the path automatically and that, for some reason, it ends up being a directory.
In addition, the above-mentioned error from the original question has now changed to
unable to resolve docker endpoint: Invalid bind address format: http://localhost:2375/
even though I cannot find any http://localhost:2375 entry on my host, which again leads me to the conclusion that something in the DinD setup went wrong.

Telegraf boot-up issue

I tried to set up the tool chain of mosquitto, telegraf and influxdb. All three are installed on a Raspberry Pi using apt. To debug, I use a file output from telegraf.
This connection does not work when the Pi boots. Mosquitto is working if subscribed to from outside.
telegraf collects system and disk information; however, it does not collect any MQTT information.
When I restart mosquitto like
sudo service mosquitto stop
mosquitto -v
the connection is working.
When I restart mosquitto like
sudo service mosquitto stop
sudo service mosquitto start
it is again not working.
What could be the difference?
I just upgraded to the latest versions, but that did not change anything.
mosquitto 1.5.7
telegraf 1.15.3
influxdb 1.8.2
The boot messages of mosquitto are fine:
Sep 14 21:34:30 raspberrypi systemd[1]: Starting Mosquitto MQTT v3.1/v3.1.1 Broker...
Sep 14 21:34:31 raspberrypi systemd[1]: Started Mosquitto MQTT v3.1/v3.1.1 Broker.
The boot messages from telegraf report a connection to mosquitto, though there is some trouble with influxdb:
Sep 14 21:34:54 raspberrypi telegraf[401]: 2020-09-14T19:34:54Z I! Starting Telegraf 1.15.3
Sep 14 21:34:54 raspberrypi influxd[407]: ts=2020-09-14T19:34:54.300652Z lvl=info msg="Opened shard" log_id=0PFXdCuW000 service=store trace_id=0PFXdEbG000 op_name=tsdb_open index_version=inmem path=/var/lib/influxdb/data/base/autogen/16 duration=598.507ms
Sep 14 21:34:54 raspberrypi influxd[407]: ts=2020-09-14T19:34:54.300796Z lvl=info msg="Opened shard" log_id=0PFXdCuW000 service=store trace_id=0PFXdEbG000 op_name=tsdb_open index_version=inmem path=/var/lib/influxdb/data/base/autogen/152 duration=675.711ms
Sep 14 21:34:54 raspberrypi influxd[407]: ts=2020-09-14T19:34:54.366628Z lvl=info msg="Opened file" log_id=0PFXdCuW000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/base/autogen/2/000000001-000000001.tsm id=0 duration=11.324ms
Sep 14 21:34:54 raspberrypi influxd[407]: ts=2020-09-14T19:34:54.374469Z lvl=info msg="Opened file" log_id=0PFXdCuW000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/base/autogen/24/000000319-000000002.tsm id=0 duration=22.091ms
Sep 14 21:34:54 raspberrypi telegraf[401]: 2020-09-14T19:34:54Z I! Loaded inputs: system mqtt_consumer disk
Sep 14 21:34:54 raspberrypi telegraf[401]: 2020-09-14T19:34:54Z I! Loaded aggregators:
Sep 14 21:34:54 raspberrypi telegraf[401]: 2020-09-14T19:34:54Z I! Loaded processors:
Sep 14 21:34:54 raspberrypi telegraf[401]: 2020-09-14T19:34:54Z I! Loaded outputs: influxdb file
Sep 14 21:34:54 raspberrypi telegraf[401]: 2020-09-14T19:34:54Z I! Tags enabled: host=raspberrypi
Sep 14 21:34:54 raspberrypi telegraf[401]: 2020-09-14T19:34:54Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"raspberrypi", Flush Interval:10s
Sep 14 21:34:54 raspberrypi influxd[407]: ts=2020-09-14T19:34:54.489708Z lvl=info msg="Opened shard" log_id=0PFXdCuW000 service=store trace_id=0PFXdEbG000 op_name=tsdb_open index_version=inmem path=/var/lib/influxdb/data/base/autogen/2 duration=188.821ms
Sep 14 21:34:54 raspberrypi telegraf[401]: 2020-09-14T19:34:54Z I! [inputs.mqtt_consumer] Connected [tcp://localhost:1883]
Sep 14 21:34:54 raspberrypi influxd[407]: ts=2020-09-14T19:34:54.548591Z lvl=info msg="Opened shard" log_id=0PFXdCuW000 service=store trace_id=0PFXdEbG000 op_name=tsdb_open index_version=inmem path=/var/lib/influxdb/data/base/autogen/24 duration=239.663ms
Sep 14 21:34:54 raspberrypi influxd[407]: ts=2020-09-14T19:34:54.552787Z lvl=info msg="Opened file" log_id=0PFXdCuW000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/base/autogen/32/000000271-000000002.tsm id=0 duration=22.821ms
Sep 14 21:34:54 raspberrypi influxd[407]: ts=2020-09-14T19:34:54.788229Z lvl=info msg="Opened file" log_id=0PFXdCuW000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/base/autogen/62/000000006-000000002.tsm id=0 duration=203.005ms
Sep 14 21:34:54 raspberrypi influxd[407]: ts=2020-09-14T19:34:54.842928Z lvl=info msg="Opened shard" log_id=0PFXdCuW000 service=store trace_id=0PFXdEbG000 op_name=tsdb_open index_version=inmem path=/var/lib/influxdb/data/base/autogen/32 duration=352.965ms
Sep 14 21:34:56 raspberrypi influxd[407]: ts=2020-09-14T19:34:56.503706Z lvl=info msg="Opened file" log_id=0PFXdCuW000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/base/autogen/40/000000004-000000002.tsm id=0 duration=71.762ms
Sep 14 21:34:58 raspberrypi systemd[1]: systemd-fsckd.service: Succeeded.
Sep 14 21:34:59 raspberrypi influxd[407]: ts=2020-09-14T19:34:59.734290Z lvl=info msg="Opened shard" log_id=0PFXdCuW000 service=store trace_id=0PFXdEbG000 op_name=tsdb_open index_version=inmem path=/var/lib/influxdb/data/base/autogen/62 duration=5185.491ms
Sep 14 21:34:59 raspberrypi influxd[407]: ts=2020-09-14T19:34:59.762419Z lvl=info msg="Opened file" log_id=0PFXdCuW000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/base/autogen/41/000000001-000000001.tsm id=0 duration=8.874ms
Sep 14 21:34:59 raspberrypi influxd[407]: ts=2020-09-14T19:34:59.785965Z lvl=info msg="Opened shard" log_id=0PFXdCuW000 service=store trace_id=0PFXdEbG000 op_name=tsdb_open index_version=inmem path=/var/lib/influxdb/data/base/autogen/40 duration=4942.818ms
The relevant parts of telegraf.conf are:
[[outputs.influxdb]]
  urls = ["http://127.0.0.1:8086"]
  database = "base"
  skip_database_creation = true
  username = "telegraf"
  password = "****"
  content_encoding = "identity"

[[outputs.file]]
  files = ["stdout", "/tmp/metrics.out"]

[[inputs.mqtt_consumer]]
  servers = ["tcp://localhost:1883"]
  topics = ["home/garden/+"]
  topic_tag = "mqtt_topic"
  qos = 1
  max_undelivered_messages = 1
  persistent_session = true
  client_id = "lord_of_the_pis"
  data_format = "json"
The client_id was the problem.
client_id = "lord_of_the_pis"
With a shorter client_id, it works fine.
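For reference, after shortening the client_id a simple way to confirm that MQTT messages arrive again is to restart Telegraf and watch the file output that is already configured above:
sudo systemctl restart telegraf
tail -f /tmp/metrics.out   # MQTT metrics should start showing up here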

InfluxDB retention policy being activated incorrectly

I have an InfluxDB database that is losing data due to the activation of the retention policy.
I upgraded InfluxDB from v1.6.3 to v1.7.7, but the behavior is the same.
> SHOW RETENTION POLICIES ON "telegraf"
name    duration shardGroupDuration replicaN default
----    -------- ------------------ -------- -------
autogen 0s       168h0m0s           1        false
forever 0s       168h0m0s           1        true
Aug 16 06:02:25 influxdb influxd[805]: ts=2019-08-16T09:02:25.623073Z lvl=info msg="Retention policy deletion check (start)" log_id=0HEpQh70000 service=retention trace_id=0HIQTFLW000 op_name=retention_delete_check op_event=start
Aug 16 06:02:25 influxdb influxd[805]: ts=2019-08-16T09:02:25.623477Z lvl=info msg="Retention policy deletion check (end)" log_id=0HEpQh70000 service=retention trace_id=0HIQTFLW000 op_name=retention_delete_check op_event=end op_elapsed=0.487ms
Aug 16 06:32:25 influxdb influxd[805]: ts=2019-08-16T09:32:25.623033Z lvl=info msg="Retention policy deletion check (start)" log_id=0HEpQh70000 service=retention trace_id=0HISB6aW000 op_name=retention_delete_check op_event=start
Aug 16 06:32:25 influxdb influxd[805]: ts=2019-08-16T09:32:25.623339Z lvl=info msg="Retention policy deletion check (end)" log_id=0HEpQh70000 service=retention trace_id=0HISB6aW000 op_name=retention_delete_check op_event=end op_elapsed=0.352ms
Aug 16 07:02:25 influxdb influxd[805]: ts=2019-08-16T10:02:25.622970Z lvl=info msg="Retention policy deletion check (start)" log_id=0HEpQh70000 service=retention trace_id=0HITtyqW000 op_name=retention_delete_check op_event=start
Aug 16 07:02:25 influxdb influxd[805]: ts=2019-08-16T10:02:25.623272Z lvl=info msg="Retention policy deletion check (end)" log_id=0HEpQh70000 service=retention trace_id=0HITtyqW000 op_name=retention_delete_check op_event=end op_elapsed=0.362ms
Aug 16 07:32:25 influxdb influxd[805]: ts=2019-08-16T10:32:25.622899Z lvl=info msg="Retention policy deletion check (start)" log_id=0HEpQh70000 service=retention trace_id=0HIVbq5W000 op_name=retention_delete_check op_event=start
Aug 16 07:32:25 influxdb influxd[805]: ts=2019-08-16T10:32:25.623780Z lvl=info msg="Retention policy deletion check (end)" log_id=0HEpQh70000 service=retention trace_id=0HIVbq5W000 op_name=retention_delete_check op_event=end op_elapsed=0.917ms
Aug 16 08:02:25 influxdb influxd[805]: ts=2019-08-16T11:02:25.622839Z lvl=info msg="Retention policy deletion check (start)" log_id=0HEpQh70000 service=retention trace_id=0HIXKhLW000 op_name=retention_delete_check op_event=start
Aug 16 08:02:25 influxdb influxd[805]: ts=2019-08-16T11:02:25.622987Z lvl=info msg="Retention policy deletion check (end)" log_id=0HEpQh70000 service=retention trace_id=0HIXKhLW000 op_name=retention_delete_check op_event=end op_elapsed=0.171ms
I should not see the retention policy being activated ever, as the duration is set to '0s'. Any help is much appreciated.
If you don't want the forever retention policy to stay, just run the following query in influx:
> DROP RETENTION POLICY "forever" ON "telegraf"
And make the autogen retention policy the default for the telegraf database:
> ALTER RETENTION POLICY "autogen" ON "telegraf" DEFAULT
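Afterwards you can confirm the change non-interactively (same database name as above; autogen should now be the default and forever should be gone):
influx -database telegraf -execute 'SHOW RETENTION POLICIES'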

Sporadically my docker container gets 'orphaned'

My docker container builds a software product which takes more or less a couple of hours.
Most times, it runs fine. However, sometimes it gets 'orphaned' towards the end of execution.
By 'orphaned,' I mean:
1. 'docker ps' reports the container properly.
2. 'docker inspect' reports normally too.
3. however, 'docker exec' returns an error, saying "connect: connection refused": unknown"
[jenkins#aga-slave-jenkins-lnx1 ~]$ docker exec -it 6f667c2ca550 bash
connection error: desc = "transport: dial unix /var/run/docker/containerd/docker-containerd.sock: connect: connection refused": unknown
Once orphaned, it wouldn't go to the next step or exit; it's just hanging there forever.
So my only option is to restart the docker daemon in order to end this misery.
Here is my Dockerfile, with non-crucial parts omitted for brevity.
RUN svn update -q --no-auth-cache --username $SVN_USER --password $SVN_PASSWORD $WORKSPACE/_Build && \
svn update -q --no-auth-cache --username $SVN_USER --password $SVN_PASSWORD $IVY_REPOSITORY && \
ant -f $WORKSPACE/_Build/_Checkout.xml checkoutLibraries $ANT_ARGUMENTS -Daga.component=ui && \
ant -f $WORKSPACE/_Build/_BuildAll.xml retrieveAll && \
ant -f $WORKSPACE/_Build/_BuildAll.xml $ANT_ARGUMENTS -Daga.component=ui -Drun.tests=false -Dgenerate.javadoc=false -Drun.findbugs=false -Drun.checkstyle=false -Drun.pmd=false && \
ant -f $WORKSPACE/_Build/_BuildAll.xml gather
FROM ${AGA_REPO}base_aga${AGA_VERSION}
Once the 'orphan' problem happens, the docker container hangs between the two tasks.
The dockerd logs contain an interesting line which is the last one in the following snippet.
[jenkins#aga-slave-jenkins-lnx1 ~]$ journalctl -u docker.service |grep 'Jan 13' |tail
Jan 13 23:03:30 aga-slave-jenkins-lnx1.aga.net dockerd[11352]: time="2018-01-13T23:03:27-05:00" level=info msg="loading plugin "io.containerd.grpc.v1.namespaces"..." module=containerd type=io.containerd.grpc.v1
Jan 13 23:03:30 aga-slave-jenkins-lnx1.aga.net dockerd[11352]: time="2018-01-13T23:03:27-05:00" level=info msg="loading plugin "io.containerd.grpc.v1.snapshots"..." module=containerd type=io.containerd.grpc.v1
Jan 13 23:03:30 aga-slave-jenkins-lnx1.aga.net dockerd[11352]: time="2018-01-13T23:03:27-05:00" level=info msg="loading plugin "io.containerd.monitor.v1.cgroups"..." module=containerd type=io.containerd.monitor.v1
Jan 13 23:03:30 aga-slave-jenkins-lnx1.aga.net dockerd[11352]: time="2018-01-13T23:03:27-05:00" level=info msg="loading plugin "io.containerd.runtime.v1.linux"..." module=containerd type=io.containerd.runtime.v1
Jan 13 23:03:30 aga-slave-jenkins-lnx1.aga.net dockerd[11352]: time="2018-01-13T23:03:27-05:00" level=info msg="loading plugin "io.containerd.grpc.v1.tasks"..." module=containerd type=io.containerd.grpc.v1
Jan 13 23:03:30 aga-slave-jenkins-lnx1.aga.net dockerd[11352]: time="2018-01-13T23:03:27-05:00" level=info msg="loading plugin "io.containerd.grpc.v1.version"..." module=containerd type=io.containerd.grpc.v1
Jan 13 23:03:31 aga-slave-jenkins-lnx1.aga.net dockerd[11352]: time="2018-01-13T23:03:27-05:00" level=info msg="loading plugin "io.containerd.grpc.v1.introspection"..." module=containerd type=io.containerd.grpc.v1
Jan 13 23:03:31 aga-slave-jenkins-lnx1.aga.net dockerd[11352]: time="2018-01-13T23:03:27-05:00" level=info msg=serving... address="/var/run/docker/containerd/docker-containerd-debug.sock" module="containerd/debug"
Jan 13 23:03:31 aga-slave-jenkins-lnx1.aga.net dockerd[11352]: time="2018-01-13T23:03:27-05:00" level=info msg=serving... address="/var/run/docker/containerd/docker-containerd.sock" module="containerd/grpc"
Jan 13 23:03:31 aga-slave-jenkins-lnx1.aga.net dockerd[11352]: time="2018-01-13T23:03:27-05:00" level=info msg="containerd successfully booted in 0.274601s" module=containerd
I am a beginner with Docker and wonder whether the issue might have to do with the booted containerd.
Thanks for your help!
Thanks for looking.
I came to believe that too little memory for the VM was causing, or at least putting too much stress on, my Docker container.
It has been running fine, free of trouble, with doubled memory.
Not sure what exactly was going wrong with the smaller memory.
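If someone runs into something similar, a rough way to watch for memory pressure while the build is running (generic commands, not specific to this host):
docker stats --no-stream                     # snapshot of per-container memory/CPU usage
dmesg -T | grep -i -E 'out of memory|oom'    # check whether the kernel OOM killer has fired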

Docker swarm can be accessed only on nodes, where container is running

I'm currently running Docker Swarm on 3 nodes. First I created a network with
docker network create -d overlay xx_net
and after that a service with
docker service create --network xxx_net --replicas 1 -p 12345:12345 --name nameofservice nameofimage:1
If I read correctly, this uses the routing mesh (which is fine for me). But I can only access the service on the IP of the node where the container is running, even though it should be available on every node's IP.
If I drain a node, the container starts up on a different node, and then it's available on the new IP.
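One generic thing worth checking (service name as in the example above) is whether the port is really published in ingress mode, since only ingress-published ports go through the routing mesh:
docker service inspect --format '{{json .Endpoint.Ports}}' nameofservice   # PublishMode should be "ingress"
docker node ls                                                             # all nodes should be Ready and Active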
** More information added below:
I rebooted all servers - 3 workers, one of which is the manager.
After boot, everything seems to work OK!
I'm using the rabbitmq image from Docker Hub. The Dockerfile is quite small: FROM rabbitmq:3-management. The container was started on worker 2.
I can connect to rabbitmq's management page from all workers: worker1-ip:15672, worker2-ip:15672, worker3-ip:15672, so I think all the needed ports are open.
After about 1 hour, the rabbitmq container was moved from worker 2 to worker 3 - I do not know the reason.
After that I can no longer connect via worker1-ip:15672 or worker2-ip:15672, but via worker3-ip:15672 everything still works!
I drained worker3 with docker node update --availability drain worker3
The container started on worker1.
After that I can only connect via worker1-ip:15672, no longer from worker2 or worker3.
One more test:
I restarted all Docker services on all workers, and everything works again?!
- let's wait a few hours...
Today's status:
2 of 3 nodes are working OK. In the service log of the manager:
Jul 12 07:53:32 dockerswarmmanager dockerd[7180]: time="2017-07-12T07:53:32.787953754Z" level=info msg="memberlist: Marking dockerswarmworker2-459b4229d652 as failed, suspect timeout reached"
Jul 12 07:53:39 dockerswarmmanager dockerd[7180]: time="2017-07-12T07:53:39.787783458Z" level=info msg="memberlist: Marking dockerswarmworker2-459b4229d652 as failed, suspect timeout reached"
Jul 12 07:55:27 dockerswarmmanager dockerd[7180]: time="2017-07-12T07:55:27.790564790Z" level=info msg="memberlist: Marking dockerswarmworker2-459b4229d652 as failed, suspect timeout reached"
Jul 12 07:55:41 dockerswarmmanager dockerd[7180]: time="2017-07-12T07:55:41.787974530Z" level=info msg="memberlist: Marking dockerswarmworker2-459b4229d652 as failed, suspect timeout reached"
Jul 12 07:56:33 dockerswarmmanager dockerd[7180]: time="2017-07-12T07:56:33.027525926Z" level=error msg="logs call failed" error="container not ready for logs: context canceled" module="node/agent/taskmanager" node.id=b6vnaouyci7b76ol1apq96zxx
Jul 12 07:56:33 dockerswarmmanager dockerd[7180]: time="2017-07-12T07:56:33.027668473Z" level=error msg="logs call failed" error="container not ready for logs: context canceled" module="node/agent/taskmanager" node.id=b6vnaouyci7b76ol1apq96zxx
Jul 12 08:13:22 dockerswarmmanager dockerd[7180]: time="2017-07-12T08:13:22.787796692Z" level=info msg="memberlist: Marking dockerswarmworker2-03ec8453a81f as failed, suspect timeout reached"
Jul 12 08:21:37 dockerswarmmanager dockerd[7180]: time="2017-07-12T08:21:37.788694522Z" level=info msg="memberlist: Marking dockerswarmworker2-03ec8453a81f as failed, suspect timeout reached"
Jul 12 08:24:01 dockerswarmmanager dockerd[7180]: time="2017-07-12T08:24:01.525570127Z" level=error msg="logs call failed" error="container not ready for logs: context canceled" module="node/agent/taskmanager" node.id=b6vnaouyci7b76ol1apq96zxx
Jul 12 08:24:01 dockerswarmmanager dockerd[7180]: time="2017-07-12T08:24:01.525713893Z" level=error msg="logs call failed" error="container not ready for logs: context canceled" module="node/agent/taskmanager" node.id=b6vnaouyci7b76ol1apq96zxx
And from the worker's Docker log:
Jul 12 08:20:47 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:20:47.486202716Z" level=error msg="Bulk sync to node h999-99-999-185.scenegroup.fi-891b24339f8a timed out"
Jul 12 08:21:38 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:21:38.288117026Z" level=warning msg="memberlist: Refuting a dead message (from: h999-99-999-185.scenegroup.fi-891b24339f8a)"
Jul 12 08:21:39 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:21:39.404554761Z" level=warning msg="Neighbor entry already present for IP 10.255.0.3, mac 02:42:0a:ff:00:03"
Jul 12 08:21:39 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:21:39.404588738Z" level=warning msg="Neighbor entry already present for IP 104.198.180.163, mac 02:42:0a:ff:00:03"
Jul 12 08:21:39 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:21:39.404609273Z" level=warning msg="Neighbor entry already present for IP 10.255.0.6, mac 02:42:0a:ff:00:06"
Jul 12 08:21:39 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:21:39.404622776Z" level=warning msg="Neighbor entry already present for IP 104.198.180.163, mac 02:42:0a:ff:00:06"
Jul 12 08:21:47 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:21:47.486007317Z" level=error msg="Bulk sync to node h999-99-999-185.scenegroup.fi-891b24339f8a timed out"
Jul 12 08:22:47 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:22:47.485821037Z" level=error msg="Bulk sync to node h999-99-999-185.scenegroup.fi-891b24339f8a timed out"
Jul 12 08:23:17 dockerswarmworker2 dockerd[677]: time="2017-07-12T08:23:17.630602898Z" level=error msg="Bulk sync to node h999-99-999-185.scenegroup.fi-891b24339f8a timed out"
And this one is from the working worker:
Jul 12 08:33:09 h999-99-999-185.scenegroup.fi dockerd[10330]: time="2017-07-12T08:33:09.219973777Z" level=warning msg="Neighbor entry already present for IP 10.0.0.3, mac xxxxx"
Jul 12 08:33:09 h999-99-999-185.scenegroup.fi dockerd[10330]: time="2017-07-12T08:33:09.220539013Z" level=warning msg="Neighbor entry already present for IP "managers ip here", mac xxxxxx"
I restarted Docker on the problematic worker and it started to work again.
I'll be following...
** Today's results:
2 of the workers are available, one is not.
I didn't do a thing.
After 4 hours of leaving the swarm alone, everything seems to work again?!
Services have been moved from one worker to another without any good reason; everything points to a communication problem.
Quite confusing.
Upgrade to docker 17.06
Ingress overlay networking was broken for a long time until about 17.06-rc3
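For reference, a quick way to check which engine version each node is running (generic commands):
docker version --format '{{.Server.Version}}'
docker node ls   # recent releases also show the engine version per node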
