I am new to Prometheus, cAdvisor and docker-compose. I made a docker-compose file including my own application named chat, along with a mongo container. Those work fine. Now I want to monitor my containers with Prometheus and cAdvisor, but I'm getting the following errors:
cadvisor | W0419 11:41:00.576916 1 sysinfo.go:203] Nodes topology is not available, providing CPU topology
cadvisor | W0419 11:41:00.577437 1 sysfs.go:348] unable to read /sys/devices/system/cpu/cpu0/online: open /sys/devices/system/cpu/cpu0/online: no such file or directory
cadvisor | E0419 11:41:00.582000 1 info.go:114] Failed to get system UUID: open /etc/machine-id: no such file or directory
and
prometheus | ts=2022-04-19T11:54:19.051Z caller=main.go:438 level=error msg="Error loading config (--config.file=/etc/prometheus/prometheus.yml)" file=/etc/prometheus/prometheus.yml err="parsing YAML file /etc/prometheus/prometheus.yml: yaml: unmarshal errors:\n line 2: field scrape-interval not found in type config.plain"
I tried to change the config parameter in my docker-compose to the following, but it didn't change the error:
command:
  - '--config.file=./prometheus/prometheus.yml'
docker-compose.yml:
version: '3.7'
services:
  chat-api:
    container_name: chat-api
    build:
      context: .
      dockerfile: ./Dockerfile
    ports:
      - '4000:4000'
    networks:
      - cchat
    restart: 'on-failure'
  userdb:
    image: mongo:latest
    container_name: mongodb
    volumes:
      - userdb:/data/db
    networks:
      - cchat
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: always
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    ports:
      - '9080:9080'
    networks:
      - cloudchat
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    restart: always
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker:/var/lib/docker:ro
    devices:
      - /dev/kmsg:/dev/kmsg
    depends_on:
      - chat-api
    networks:
      - cchat
volumes:
  userdb:
networks:
  cchat:
prometheus.yml:
global:
  scrape-interval: 2s
scrape_configs:
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']
Project structure: [picture of the project structure]
I guess it's quite late, but you can try mounting /etc/machine-id:/etc/machine-id:ro.
Running in privileged mode could help too. This is my configuration, which works without problems:
cadvisor:
  image: gcr.io/cadvisor/cadvisor:v0.47.0
  container_name: cadvisor
  restart: unless-stopped
  privileged: true
  ports:
    - "8080:8080"
  volumes:
    - /:/rootfs:ro
    - /var/run:/var/run:ro
    - /sys:/sys:ro
    - /var/lib/docker/:/var/lib/docker:ro
    - /dev/disk/:/dev/disk:ro
One important note: don't use the latest tag; it seems it is not actually the latest version (source: https://github.com/google/cadvisor/issues/3066).
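The Prometheus error in the question is a separate issue: the config parser rejects the field name scrape-interval because Prometheus configuration keys use underscores. A corrected prometheus.yml, keeping the existing cadvisor job, would look like this:

global:
  scrape_interval: 2s
scrape_configs:
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']

Leaving --config.file pointed at /etc/prometheus/prometheus.yml is correct, since that is where the volume mounts the file inside the container; the earlier attempt with ./prometheus/prometheus.yml refers to a path that only exists on the host, not inside the container.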
I want to set up Node Exporter with docker-compose on a server to be monitored, but I do not want the metrics to be freely available to everyone.
My current docker-compose.yml file looks like this:
version: '3.8'
networks:
  monitoring:
    driver: bridge
volumes:
  prometheus_data: {}
services:
  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    restart: unless-stopped
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.rootfs=/rootfs'
      - '--path.sysfs=/host/sys'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
      - '--collector.netclass.ignored-devices=^(veth.*)$$'
    ports:
      - 9100:9100
    networks:
      - monitoring
    labels:
      org.label-schema.group: "monitoring"
When I add the lines below to my docker-compose.yml file, I get the error message "services.node-exporter Additional property basic_auth_users is not allowed".
basic_auth_users:
  prometheus: my_pass
Can someone please tell me where I am making a mistake, or how the whole thing should work?
PS: I would like to install only Node Exporter on the server to be monitored, since a Prometheus instance is not necessary there... (correct me if that is the wrong approach).
Best regards
Solution -> docker-compose.yml
version: '3.8'
networks:
  monitoring:
    driver: bridge
volumes:
  prometheus_data: {}
services:
  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    restart: unless-stopped
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
      - ./prometheus/web.yml:/etc/prometheus/web.yml
    command:
      - '--path.procfs=/host/proc'
      - '--path.rootfs=/rootfs'
      - '--path.sysfs=/host/sys'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
      - '--collector.netclass.ignored-devices=^(veth.*)$$'
      - '--web.config=/etc/prometheus/web.yml'
    ports:
      - 9100:9100
    networks:
      - monitoring
    labels:
      org.label-schema.group: "monitoring"
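For completeness, basic_auth_users belongs in the web configuration file that gets mounted above (it is an exporter-toolkit option, not a Compose property). A minimal sketch, assuming the file sits at ./prometheus/web.yml next to the compose file; the bcrypt hash is a placeholder you would generate yourself (for example with htpasswd -nBC 10 "" | tr -d ':\n'):

# ./prometheus/web.yml (exporter-toolkit web config)
basic_auth_users:
  # username mapped to a bcrypt hash of the password (placeholder below)
  prometheus: $2y$10$REPLACE_WITH_YOUR_BCRYPT_HASH

The Prometheus server scraping this exporter then needs matching basic_auth credentials in its scrape config for the node-exporter job.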
I should see three targets in my Prometheus dashboard: one from Prometheus itself, which works, one from my self-created Node.js application called chat-api, and one from cAdvisor. For cAdvisor I get the following error when I run docker-compose up:
cadvisor | W0419 22:12:08.195849 1 sysinfo.go:203] Nodes topology is not available, providing CPU topology
cadvisor | W0419 22:12:08.196364 1 sysfs.go:348] unable to read /sys/devices/system/cpu/cpu0/online: open /sys/devices/system/cpu/cpu0/online: no such file or directory
cadvisor | E0419 22:12:08.200398 1 info.go:114] Failed to get system UUID: open /etc/machine-id: no such file or directory
I changed the parameters in my docker-compose file, but it doesn't change anything. I'm a beginner with Docker.
docker-compose.yml:
version: '3.7'
services:
  chat-api:
    container_name: chat-api
    build:
      context: .
      dockerfile: ./Dockerfile
    ports:
      - '4000:4000'
    networks:
      - cchat
    restart: 'on-failure'
  userdb:
    image: mongo:latest
    container_name: mongodb
    volumes:
      - userdb:/data/db
    networks:
      - cchat
  cadvisor:
    image: gcr.io/cadvisor/cadvisor
    container_name: cadvisor
    privileged: true
    restart: always
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker:/var/lib/docker:ro
    devices:
      - /dev/kmsg:/dev/kmsg
    depends_on:
      - chat-api
    networks:
      - cchat
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: always
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
      - prometheus-data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    ports:
      - '9090:9090'
    depends_on:
      - chat-api
    networks:
      - cchat
volumes:
  userdb:
  prometheus-data:
networks:
  cchat:
prometheus.yml:
global:
  scrape-interval: 5s
scrape_configs:
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']
  - job_name: 'chat-api'
    static_configs:
      - targets: ['chat-api:4000']
Dockerfile:
FROM node:alpine
WORKDIR .
COPY package*.json ./
RUN npm install
COPY . .
EXPOSE 4000
CMD ["node", "server.js"]
chat-api is a Node.js application using Express.
My folder structure: [picture of the folder structure]
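A hedged observation rather than a confirmed fix: in this compose file the prometheus service never mounts prometheus.yml, so the container falls back to the default configuration baked into the prom/prometheus image and will only show the prometheus target. Assuming the file lives in a prometheus/ folder next to docker-compose.yml, mounting it (and renaming scrape-interval to scrape_interval) should make the other targets appear:

  prometheus:
    # ... existing settings ...
    volumes:
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
      - prometheus-data:/prometheus

The chat-api target will also only come up once the Express app actually serves Prometheus metrics on /metrics (for example via an instrumentation library); otherwise it will show as down even when the container is reachable.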
I am trying to set up Zipkin, Elasticsearch, Prometheus and Grafana with docker-compose.yml.
When I run the containers, I see this in the log:
dependencies_zipkin | 19/09/30 14:37:09 ERROR NetworkClient: Node [172.28.0.2:9200] failed (java.net.ConnectException: Connection refused (Connection refused)); no other nodes left - aborting...
I'm on Mac OS X with Docker 2.1.0.3.
The content of my docker-compose.yml is:
version: '3.7'
services:
  storage:
    image: openzipkin/zipkin-elasticsearch7
    container_name: elasticsearch
    ports:
      - "9200:9200"
    environment:
      - "xpack.security.enabled=false"
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    restart: unless-stopped
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    volumes:
      - $PWD/prometheus:/etc/prometheus/
      - /tmp/prometheus:/prometheus/data:rw
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/usr/share/prometheus/console_libraries'
      - '--web.console.templates=/usr/share/prometheus/consoles'
    ports:
      - "9090:9090"
    restart: unless-stopped
  zipkin:
    image: openzipkin/zipkin
    container_name: zipkin
    depends_on:
      - dependencies
      - storage
    environment:
      - "STORAGE_TYPE=elasticsearch"
      - "ES_HOSTS=storage"
    ports:
      - "9411:9411"
    restart: unless-stopped
  grafana:
    image: grafana/grafana
    container_name: grafana
    ports:
      - "3000:3000"
    restart: unless-stopped
  dependencies:
    image: openzipkin/zipkin-dependencies
    container_name: dependencies_zipkin
    depends_on:
      - storage
    environment:
      - "STORAGE_TYPE=elasticsearch"
      - "ES_HOSTS=storage"
When I connect to localhost:9200, I see that Elasticsearch is working fine, and on port 9411 Zipkin is deployed, but I get the error:
ERROR: cannot load service names: server error (Service Unavailable) (due to the network error)
In the log, I have this information:
dependencies_zipkin | 19/09/30 14:45:20 ERROR NetworkClient: Node [172.28.0.2:9200] failed (java.net.ConnectException: Connection refused (Connection refused)); no other nodes left - aborting...
and this one
zipkin | java.lang.IllegalStateException: couldn't connect any of [Endpoint{storage:80, ipAddr=172.28.0.2, weight=1000}]
Any idea?
UPDATE
By using MySQL it works fine, so the problem is at the level of Elasticsearch.
I also tried using
"STORAGE_PORT_9200_TCP_ADDR=127.0.0.1"
but the issue still occurs.
UPDATE
As mentioned in the solution given by Brian, I have to use:
ES_HOSTS=http://storage:9300
The key is the port; I was using port 9200.
The error disappears between Zipkin and ES but still occurs between ES and zipkin-dependencies.
The problem lies in your ES_HOSTS variable. From the docs:
ES_HOSTS: A comma separated list of elasticsearch base urls to connect to ex. http://host:9200.
Defaults to "http://localhost:9200".
So you will need: ES_HOSTS=http://storage:9200
Finally I have this file:
version: '3.7'
services:
  storage:
    image: openzipkin/zipkin-elasticsearch7
    container_name: elasticsearch
    ports:
      - 9200:9200
  zipkin:
    image: openzipkin/zipkin
    container_name: zipkin
    environment:
      - STORAGE_TYPE=elasticsearch
      - "ES_HOSTS=elasticsearch:9300"
    ports:
      - 9411:9411
    depends_on:
      - storage
  dependencies:
    image: openzipkin/zipkin-dependencies
    container_name: dependencies
    entrypoint: crond -f
    depends_on:
      - storage
    environment:
      - STORAGE_TYPE=elasticsearch
      - "ES_HOSTS=elasticsearch:9300"
      - "ES_NODES_WAN_ONLY=true"
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    volumes:
      - $PWD/prometheus:/etc/prometheus/
      - /tmp/prometheus:/prometheus/data:rw
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/usr/share/prometheus/console_libraries'
      - '--web.console.templates=/usr/share/prometheus/consoles'
    ports:
      - "9090:9090"
  grafana:
    image: grafana/grafana
    container_name: grafana
    depends_on:
      - prometheus
    ports:
      - "3000:3000"
The main differences are the usage of
"ES_HOSTS=elasticsearch:9300"
instead of
"ES_HOSTS=storage:9300"
and, in the dependencies configuration, the added entrypoint:
entrypoint: crond -f
This is really the key to avoiding the exception when I start docker-compose.
To solve this issue, I checked this project: https://github.com/openzipkin/docker-zipkin
The remaining question is: why do I need to use entrypoint: crond -f?
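(A plausible explanation, not confirmed in this thread: openzipkin/zipkin-dependencies runs as a one-shot Spark batch job that aggregates a day's worth of spans into dependency links and then exits, so with the default entrypoint the container stops right after each run. Overriding the entrypoint with crond -f, as the openzipkin/docker-zipkin examples do, keeps the container alive and re-runs the job on a cron schedule instead of letting it exit immediately.)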
My docker-compose.yml looks like the one below. When I run docker-compose up, I get the following error:
ERROR: In file './docker-compose.yml', the service name True must be a quoted string, i.e. 'True'.
version: '3'
services:
  db:
    restart: always
    image: postgres:9.6-alpine
    container_name: pleroma_postgres
    networks:
      - pleroma
    volumes:
      - ./postgres:/var/lib/postgresql/data
  web:
    build: .
    image: pleroma
    container_name: pleroma_web
    restart: always
    environment:
      - VIRTUAL_HOST=<myplaceholderhost>
      - VIRTUAL_PORT=4000
      - LETSENCRYPT_HOST=<myplaceholderhost>
      - LETENCRYPT_EMAIL=<myplaceholderemail>
    expose:
      - "4000"
    volumes:
      - ./uploads:/pleroma/uploads
    depends_on:
      - db
  nginx:
    image: jwilder/nginx-proxy
    container_name: nginx
    volumes:
      - /var/run/docker.sock:/tmp/docker.sock:ro
      - /apps/docker-articles/nginx/vhost.d:/etc/nginx/vhost.d
      - /apps/docker-articles/nginx/certs:/etc/nginx/certs:ro
      - /apps/docker-articles/nginx/html:/usr/share/nginx/html
    restart: always
    ports:
      - "80:80"
      - "443:443"
    labels:
      com.github.jrcs.letsencrypt_nginx_proxy_companion.nginx_proxy: "true"
    networks:
      - pleroma
  letsencrypt:
    image: jrcs/letsencrypt-nginx-proxy-companion:v1.5
    container_name: letsencrypt
    volumes_from:
      - nginx
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /apps/docker-articles/nginx/vhost.d:/etc/nginx/vhost.d
      - /apps/docker/articles/nginx/certs:/etc/nginx/certs:rw
      - /apps/docker-articles/nginx/html:/usr/share/nginx/html
networks:
  pleroma:
My Docker version is:
Docker version 18.06.1-ce, build e68fc7a
My docker-compose version is:
docker-compose version 1.23.1, build b02f1306
I am running CoreOS version 1911.3.0.
I ended up resolving this issue by modifying the nginx and letsencrypt portions of my docker-compose.yml file to be as follows.
nginx:
  image: jwilder/nginx-proxy
  container_name: nginx
  volumes:
    - /var/run/docker.sock:/tmp/docker.sock:ro
    - /apps/docker-articles/nginx/vhost.d:/etc/nginx/vhost.d
    - /apps/docker-articles/nginx/certs:/etc/nginx/certs:ro
    - /apps/docker-articles/nginx/html:/usr/share/nginx/html
  restart: always
  ports:
    - "80:80"
    - "443:443"
  labels:
    - "NGINX_PROXY_CONTAINER=true"
  networks:
    - pleroma
letsencrypt:
  image: jrcs/letsencrypt-nginx-proxy-companion:v1.5
  container_name: letsencrypt
  environment:
    - NGINX_PROXY_CONTAINER=true
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock:ro
    - /apps/docker-articles/nginx/vhost.d:/etc/nginx/vhost.d
    - /apps/docker/articles/nginx/certs:/etc/nginx/certs:rw
    - /apps/docker-articles/nginx/html:/usr/share/nginx/html
It seems "volumes_from" is deprecated in docker-compose v3. As well as I had forgotted quotes around my label and needed to set my environment within letsencrypt.
In a CentOS environment, your .yml file directory must be /usr/local/bin.
I have successfully set up a Prometheus service in a Docker container. I am also running services like node-exporter and cAdvisor on other ports on the same host.
All the services are run using docker-compose.
Here is the sample:
version: '2'
volumes:
  grafana_data: {}
services:
  prometheus:
    image: prom/prometheus
    privileged: true
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./alertmanager/alert.rules:/alertmanager/alert.rules
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    ports:
      - '9090:9090'
  node-exporter:
    image: prom/node-exporter
    ports:
      - '9100:9100'
  cadvisor:
    image: google/cadvisor:latest
    privileged: true
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
      - /cgroup:/sys/fs/cgroup:ro
    ports:
      - '8080:8080'
How do I make the cAdvisor service not accessible to the public? Right now anyone can reach cAdvisor and node-exporter by visiting the host URL on the ports they are published on. Since only Prometheus depends on them, only Prometheus should be able to access them.
If you don't need to access the service externally, simply don't publish the ports for that service: delete the ports section from each of those services. The resulting compose file will look like:
version: '2'
volumes:
  grafana_data: {}
services:
  prometheus:
    image: prom/prometheus
    privileged: true
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./alertmanager/alert.rules:/alertmanager/alert.rules
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    ports:
      - '9090:9090'
  node-exporter:
    image: prom/node-exporter
    # removed "ports" from here
  cadvisor:
    image: google/cadvisor:latest
    privileged: true
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
      - /cgroup:/sys/fs/cgroup:ro
    # removed "ports" from here
Containers talk to each other across a shared network, which you get by default with docker-compose or a docker stack. To use container-to-container networking, reference the target container by its service name (in this case node-exporter and cadvisor) and use the container port, not the published port, which in your case was the same anyway.
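As a sketch of what that looks like on the Prometheus side (job names are illustrative), the scrape config can point at the service names and container ports directly:

scrape_configs:
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']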
This configuration should work as intended. Please note that you'll need to update your Prometheus config to reference Node Exporter and cAdvisor by their service names (node-exporter, cadvisor) instead of their IPs.
version: '2'
volumes:
  grafana_data: {}
services:
  prometheus:
    image: prom/prometheus
    privileged: true
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./alertmanager/alert.rules:/alertmanager/alert.rules
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    ports:
      - '9090:9090'
    links:
      - 'node-exporter'
      - 'cadvisor'
  node-exporter:
    image: prom/node-exporter
  cadvisor:
    image: google/cadvisor:latest
    privileged: true
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
      - /cgroup:/sys/fs/cgroup:ro