Alertmanager docker container refuses connections - docker

I have a docker-compose file with one Django app, a Prometheus monitoring container and an Alertmanager container.
Everything builds fine, the app is running, and Prometheus is monitoring, but when an alert fires it does not reach the Alertmanager container, with the following error message:
prometheus_1 | level=error ts=2021-08-02T08:58:16.018Z caller=notifier.go:527 component=notifier alertmanager=http://0.0.0.0:9093/api/v2/alerts count=1 msg="Error sending alert" err="Post \"http://0.0.0.0:9093/api/v2/alerts\": dial tcp 0.0.0.0:9093: connect: connection refused"
Alertmanager also immediately closes a telnet test connection, like so:
klex@DESKTOP-PVC5EP:~$ telnet 0.0.0.0 9093
Trying 0.0.0.0...
Connected to 0.0.0.0.
Escape character is '^]'.
Connection closed by foreign host.
The docker-compose file is:
version: "3"
services:
  web:
    container_name: smsgate
    build: .
    command: sh -c "python manage.py migrate &&
             python manage.py collectstatic --no-input &&
             python manage.py runserver 0.0.0.0:15001"
    volumes:
      - .:/smsgate:rw
      - static_volume:/home/app/smsgate/static
      - /var/run/docker.sock:/var/run/docker.sock
    ports:
      - "15001:15001"
    env_file:
      - .env.prod
    image: smsgate
    restart: "always"
    networks:
      - promnet
  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus/:/etc/prometheus/
    depends_on:
      - alertmanager
    ports:
      - "9090:9090"
    networks:
      - promnet
  alertmanager:
    image: prom/alertmanager
    ports:
      - "9093:9093"
    volumes:
      - ./alertmanager/:/etc/alertmanager/
    restart: "always"
    command:
      - '--config.file=/etc/alertmanager/alertmanager.yml'
    networks:
      - promnet
volumes:
  static_volume:
  alertmanager_volume:
  prometheus_volume:
networks:
  promnet:
    driver: bridge
And the prometheus.yml configuration file is:
global:
  scrape_interval: 15s
  evaluation_interval: 15s
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - "0.0.0.0:9093"
rule_files:
  - alert.rules.yml
scrape_configs:
  - job_name: monitoring
    metrics_path: /metrics
    static_configs:
      - targets:
          - smsgate:15001
There is very likely a network configuration problem, as the service does not seem to accept any connections.
Prometheus and Alertmanager GUI interfaces can be accessed via browser on
http://127.0.0.1:9090/ and
http://127.0.0.1:9093/ respectively
Any help would be much appreciated.

Try using the service name instead of 0.0.0.0. Inside the Prometheus container, 0.0.0.0 does not point at the Alertmanager container; on a user-defined compose network, Docker's embedded DNS resolves service names instead. Change the targets in the alerting block to:
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - "alertmanager:9093"
Given that both services are on the same network, it should work just fine.
Update
I misunderstood the problem in the first place. Apologies. Please check the updated block above ☝🏽

Related

Custom metrics are not showing in the Prometheus web UI, nor in Grafana

First of all, I tried this solution and it didn't work for me.
I need to log some custom metrics using Prometheus.
docker-compose.yml
version: "3"
volumes:
  prometheus_data: {}
  grafana_data: {}
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    hostname: my_service
    ports:
      - 9090:9090
    depends_on:
      - my_service
  my-service:
    build: .
    ports:
      - 8080:8080
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    hostname: grafana
    ports:
      - 3000:3000
    depends_on:
      - prometheus
prometheus.yml
global:
  scrape_interval: 5s
  scrape_timeout: 10s
  external_labels:
    monitor: 'my-project'
rule_files:
scrape_configs:
  - job_name: myapp
    scrape_interval: 10s
    static_configs:
      - targets:
          - my_service:8080
I tried an external IP as well, but I can't see my metrics in the Prometheus UI. Also, the targets page shows that localhost:9090 is up.
What could be the problem? Can anyone correct the docker-compose and Prometheus files?
Thanks
So I found it. I had to set my scrape configs with the container name, something like this:
scrape_configs:
  - job_name: my-service
    scrape_interval: 15s
    scrape_timeout: 10s
    metrics_path: /metrics
    static_configs:
      - targets:
          - 'prometheus:9090'
          - 'my-service:8080'
Once you fix your Prometheus volumes for your data, you will see your service up and running at http://localhost:9090/targets
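The naming mismatch is the key detail here: compose DNS resolves the service name (my-service, with a hyphen), while the question's original scrape target used my_service (the underscore hostname set on the prometheus service). A minimal sketch of a compose file consistent with the working scrape config above (image and ports taken from the question):

```yaml
version: "3"
services:
  prometheus:
    image: prom/prometheus:latest
    ports:
      - 9090:9090
    depends_on:
      - my-service      # must match the service name exactly
  my-service:           # reachable from prometheus as "my-service:8080"
    build: .
    ports:
      - 8080:8080
```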

Prometheus keeps showing up on 9090 even though the docker container is not running

Following a tutorial, I have the following docker-compose:
version: "3.9"
services:
  grafana:
    image: grafana/grafana
    ports:
      - 3000:3000
  prometheus:
    image: prom/prometheus
    ports:
      - 9090:9090
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
  postgres:
    image: postgres:12
    ports:
      - 5432:5432
    volumes:
      - ./backup:/var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD: postgrespassword
      POSTGRES_DB: shop
  postgres-exporter:
    image: prometheuscommunity/postgres-exporter
    ports:
      - 9187:9187
    environment:
      DATA_SOURCE_NAME: "postgresql://postgres:postgrespassword@postgres:5432/shop?sslmode=disable"
    links:
      - postgres
      - prometheus
And this as my prometheus.yml:
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: postgres-exporter
    static_configs:
      - targets: ["postgres-exporter:9187"]
I realized I needed to make some changes, so I brought down the containers. But I noticed I could still access:
http://xxx.xx.xx.xxx:9090/config
Checking the containers, it seemed Prometheus had restarted. I stopped it again, checked in the browser, and saw it got restarted again. Additionally, any changes I make to prometheus.yml aren't reflected in the config.
I checked running services; there is no instance of Prometheus running outside of docker.
Disabling the docker service brings it down, but as soon as I re-enable it, it appears again, always with the old config.
Any ideas? (I have tried killing the port as well; it just respawns.)
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
34c66a78b75e prom/prometheus:latest "/bin/prometheus --cā€¦" 2 hours ago Up 2 hours 9090/tcp my-prometheus.1.iht6uk1okzlb3gc9y9jdaf2fg

Traefik - 502 Bad Gateway

Hello,
I'm trying to connect a Docker container to Traefik, but it doesn't work: I can't access my container through the host.
I tried several things: checked the ports, checked that the containers are on the same network, stopped and restarted the networks, looked for errors, etc. I can't see where the problem comes from. :(
I have two other containers that already work with Traefik, but this container doesn't want to...
I've been banging my head on it for two days now; I have a bump on my forehead the size of a horn.
My Traefik docker logs:
msg="'502 Bad Gateway' caused by: dial tcp 172.23.0.6:5345: connect: connection refused"
My docker-compose file:
services:
  db:
    image: mariadb:10.3
    env_file:
      - env/mysql.env
    volumes:
      - database_volume:/var/lib/mysql
    #ports:
    #  - "127.0.0.1:3306:3306"
    networks:
      - traefik-public
  passbolt:
    image: passbolt/passbolt:latest-ce
    # Alternatively you can use rootless:
    # image: passbolt/passbolt:latest-ce-non-root
    tty: true
    labels:
      - traefik.enable=true
      - traefik.docker.network=traefik-public
      - traefik.constraint-label=traefik-public
      - traefik.http.routers.vault-https.rule=Host(`vault.audiowizard.fr`)
      - traefik.http.routers.vault-https.entrypoints=websecure
      - traefik.http.routers.vault-https.tls=true
      - traefik.http.routers.vault-https.tls.certresolver=lets-encrypt
      - traefik.http.services.vault-https.loadbalancer.server.port=5345
    depends_on:
      - db
    env_file:
      - env/passbolt.env
    volumes:
      - gpg_volume:/etc/passbolt/gpg
      - images_volume:/usr/share/php/passbolt/webroot/img/public
    command: ["/usr/bin/wait-for.sh", "-t", "0", "db:3306", "--", "/docker-entrypoint.sh"]
    networks:
      - traefik-public
    #ports:
    #  - 5080:80
    #  - 5443:443
    # Alternatively for non-root images:
    #  - 80:8080
    #  - 443:4433
volumes:
  database_volume:
  gpg_volume:
  images_volume:
networks:
  traefik-public:
    external: true
I noticed that in the commented code you have used the following ports:
#ports:
#  - 5080:80
#  - 5443:443
# Alternatively for non-root images:
#  - 80:8080
#  - 443:4433
But the port you have used in the label is 5345:
- traefik.http.services.vault-https.loadbalancer.server.port=5345
Can you please double-check that this is the right port? The label must name the port the app listens on inside the container, and the commented mappings suggest that is 80 (or 8080 for the non-root image), not 5345.
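If the container does listen on port 80, as the commented 5080:80 mapping suggests, the label would presumably need to point at 80 instead. A minimal sketch of the relevant labels (router/service names taken from the question; the port value is the assumption to verify):

```yaml
labels:
  - traefik.enable=true
  - traefik.docker.network=traefik-public
  # loadbalancer.server.port is the port *inside* the container,
  # not a host-mapped port (80 assumed from the commented 5080:80 mapping):
  - traefik.http.services.vault-https.loadbalancer.server.port=80
```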

Change container port in docker compose

I am trying to get a docker-compose file working with both Airflow and Spark. Airflow's webserver typically runs on port 8080, which Spark needs as well. I have the following docker-compose file:
version: '3.7'
services:
  master:
    image: gettyimages/spark
    command: bin/spark-class org.apache.spark.deploy.master.Master -h master
    hostname: master
    environment:
      MASTER: spark://master:7077
      SPARK_CONF_DIR: /conf
      SPARK_PUBLIC_DNS: localhost
    expose:
      - 7001
      - 7002
      - 7003
      - 7004
      - 7005
      - 7077
      - 6066
    ports:
      - 4040:4040
      - 6066:6066
      - 7077:7077
      - 8080:8080
    volumes:
      - ./conf/master:/conf
      - ./data:/tmp/data
  worker:
    image: gettyimages/spark
    command: bin/spark-class org.apache.spark.deploy.worker.Worker spark://master:7077
    hostname: worker
    environment:
      SPARK_CONF_DIR: /conf
      SPARK_WORKER_CORES: 2
      SPARK_WORKER_MEMORY: 1g
      SPARK_WORKER_PORT: 8881
      SPARK_WORKER_WEBUI_PORT: 8081
      SPARK_PUBLIC_DNS: localhost
    links:
      - master
    expose:
      - 7012
      - 7013
      - 7014
      - 7015
      - 8881
    ports:
      - 8081:8081
    volumes:
      - ./conf/worker:/conf
      - ./data:/tmp/data
  postgres:
    image: postgres:9.6
    environment:
      - POSTGRES_USER=airflow
      - POSTGRES_PASSWORD=airflow
      - POSTGRES_DB=airflow
    logging:
      options:
        max-size: 10m
        max-file: "3"
  webserver:
    image: puckel/docker-airflow:1.10.9
    restart: always
    depends_on:
      - postgres
    environment:
      - LOAD_EX=y
      - EXECUTOR=Local
    logging:
      options:
        max-size: 10m
        max-file: "3"
    volumes:
      - ./dags:/usr/local/airflow/dags
      # Add this to have third party packages
      - ./requirements.txt:/requirements.txt
      # - ./plugins:/usr/local/airflow/plugins
    ports:
      - "8082:8080" # NEED TO CHANGE THIS LINE
    command: webserver
    healthcheck:
      test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-webserver.pid ]"]
      interval: 30s
      timeout: 30s
      retries: 3
but I specifically need to change the line:
ports:
  - "8082:8080" # NEED TO CHANGE THIS LINE
under webserver so there's no port conflict. However, when I change the container port to something other than 8080 it doesn't work (cannot connect/find the server). How do I change the container port successfully?
When you specify a port mapping in docker you give two ports, for instance 8082:8080.
The right-hand port is the one the process listens on within the container.
Several containers can listen internally on the same port; they are still not reachable from your localhost, which is what the ports section is for.
On your localhost, however, you cannot bind the same port more than once. That's why docker fails when you try to put 8080 on the left side more than one time.
In your current compose file, the spark service is mapped to host port 8080 (left side of 8080:8080) and the webserver service to host port 8082 (left side of 8082:8080).
If you want to access spark, go to http://localhost:8080, and for the webserver go to http://localhost:8082.
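A minimal sketch of the point above: both container ports can stay at 8080 as long as each service gets its own host port (service names from the question's compose file):

```yaml
services:
  master:
    ports:
      - 8080:8080   # host 8080 -> Spark master UI (container port 8080)
  webserver:
    ports:
      - 8082:8080   # host 8082 -> Airflow webserver (also container port 8080)
```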

consumer: Cannot connect to amqp://user:**@localhost:5672//: [Errno 111] Connection refused

I am trying to build my airflow setup using docker and RabbitMQ. I am using the rabbitmq:3-management image, and I am able to access the RabbitMQ UI and API.
In airflow I am building the airflow webserver, airflow scheduler, airflow worker and airflow flower. The airflow.cfg file is used to configure airflow,
where I am using broker_url = amqp://user:password@127.0.0.1:5672/ and celery_result_backend = amqp://user:password@127.0.0.1:5672/
My docker-compose file is as follows:
version: '3'
services:
  rabbit1:
    image: "rabbitmq:3-management"
    hostname: "rabbit1"
    environment:
      RABBITMQ_ERLANG_COOKIE: "SWQOKODSQALRPCLNMEQG"
      RABBITMQ_DEFAULT_USER: "user"
      RABBITMQ_DEFAULT_PASS: "password"
      RABBITMQ_DEFAULT_VHOST: "/"
    ports:
      - "5672:5672"
      - "15672:15672"
    labels:
      NAME: "rabbitmq1"
  webserver:
    build: "airflow/"
    hostname: "webserver"
    restart: always
    environment:
      - EXECUTOR=Celery
    ports:
      - "8080:8080"
    depends_on:
      - rabbit1
    command: webserver
  scheduler:
    build: "airflow/"
    hostname: "scheduler"
    restart: always
    environment:
      - EXECUTOR=Celery
    depends_on:
      - webserver
      - flower
      - worker
    command: scheduler
  worker:
    build: "airflow/"
    hostname: "worker"
    restart: always
    depends_on:
      - webserver
    environment:
      - EXECUTOR=Celery
    command: worker
  flower:
    build: "airflow/"
    hostname: "flower"
    restart: always
    environment:
      - EXECUTOR=Celery
    ports:
      - "5555:5555"
    depends_on:
      - rabbit1
      - webserver
      - worker
    command: flower
I am able to build the images using docker-compose. However, I am not able to connect my airflow scheduler to RabbitMQ. I am getting the following error:
consumer: Cannot connect to amqp://user:**@localhost:5672//: [Errno 111] Connection refused.
I have tried using both 127.0.0.1 and localhost.
What am I doing wrong?
From within your airflow containers, you should be able to connect to the rabbit1 service. So all you need to do is change amqp://user:**@localhost:5672// to amqp://user:**@rabbit1:5672// and it should work.
Docker-compose creates a default network and attaches services that do not explicitly define a network to it.
You do not need to expose the 5672 and 15672 ports on rabbit1 unless you want to be able to access it from outside the application.
Also, it is generally not recommended to build images inside docker-compose.
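Since the broker is configured in airflow.cfg, the same localhost-to-rabbit1 change can also be applied via environment variables in the compose file. A sketch for the scheduler service, assuming the standard AIRFLOW__CELERY__BROKER_URL config override (this variable name is an assumption, not taken from the question); the same substitution applies to the result-backend URL:

```yaml
scheduler:
  build: "airflow/"
  restart: always
  environment:
    - EXECUTOR=Celery
    # Point Celery at the compose service name, not localhost:
    - AIRFLOW__CELERY__BROKER_URL=amqp://user:password@rabbit1:5672/
  command: scheduler
```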
I solved this issue by installing the RabbitMQ server on my system with the command sudo apt install rabbitmq-server.