I'm running Prometheus and Telegraf on the same host.
I'm using a few input plugins:
inputs.cpu
inputs.ntpq
I've configured the prometheus_client output plugin to expose the data for Prometheus to scrape.
Here's my config:
[[outputs.prometheus_client]]
## Address to listen on.
listen = ":9126"
## Use HTTP Basic Authentication.
# basic_username = "Foo"
# basic_password = "Bar"
## If set, the IP Ranges which are allowed to access metrics.
## ex: ip_range = ["192.168.0.0/24", "192.168.1.0/30"]
# ip_range = []
## Path to publish the metrics on.
path = "/metrics"
## Expiration interval for each metric. 0 == no expiration
#expiration_interval = "0s"
## Collectors to enable, valid entries are "gocollector" and "process".
## If unset, both are enabled.
# collectors_exclude = ["gocollector", "process"]
## Send string metrics as Prometheus labels.
## Unless set to false all string metrics will be sent as labels.
# string_as_label = true
## If set, enable TLS with the given certificate.
# tls_cert = "/etc/ssl/telegraf.crt"
# tls_key = "/etc/ssl/telegraf.key"
## Export metric collection time.
#export_timestamp = true
Here's my Prometheus config:
# my global config
global:
scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
# scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
alertmanagers:
- static_configs:
- targets:
# - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
# - "first_rules.yml"
# - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
# The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
- job_name: 'prometheus'
# metrics_path defaults to '/metrics'
# scheme defaults to 'http'.
static_configs:
- targets: ['localhost:9090']
# - job_name: 'node_exporter'
# scrape_interval: 5s
# static_configs:
# - targets: ['localhost:9100']
- job_name: 'telegraf'
scrape_interval: 5s
static_configs:
- targets: ['localhost:9126']
If I go to http://localhost:9090/metrics I don't see any of the metrics coming from Telegraf.
I've captured some logs from Telegraf as well:
/opt telegraf --config /etc/telegraf/telegraf.conf --input-filter filestat --test
➜ /opt tail -F /var/log/telegraf/telegraf.log
2019-02-11T17:34:20Z D! [outputs.prometheus_client] wrote batch of 28 metrics in 1.234869ms
2019-02-11T17:34:20Z D! [outputs.prometheus_client] buffer fullness: 0 / 10000 metrics.
2019-02-11T17:34:30Z D! [outputs.file] wrote batch of 28 metrics in 384.672µs
2019-02-11T17:34:30Z D! [outputs.file] buffer fullness: 0 / 10000 metrics.
2019-02-11T17:34:30Z D! [outputs.prometheus_client] wrote batch of 30 metrics in 1.250605ms
2019-02-11T17:34:30Z D! [outputs.prometheus_client] buffer fullness: 9 / 10000 metrics.
I don't see an issue in the logs.
The /metrics endpoint of your Prometheus server exposes metrics about the server itself, not the metrics it scraped from targets such as the Telegraf exporter.
Go to http://localhost:9090/targets and you should see the list of targets your Prometheus server is scraping. If everything is configured correctly, the Telegraf exporter should be one of them.
To query Prometheus for metrics generated by the Telegraf exporter, point your browser at http://localhost:9090/graph and enter e.g. cpu_time_user in the query field. If the CPU input plugin is enabled, that metric (and more) should be available.
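If you want to rule out the Telegraf side first, you can also hit the exporter endpoint directly; a quick check, using the listen address and path from the config above:
curl -s http://localhost:9126/metrics | grep '^cpu_'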
You should use the following Prometheus config in order to scrape the metrics exposed by Telegraf's prometheus_client output:
scrape_configs:
- job_name: telegraf
static_configs:
- targets:
- "localhost:9126"
The path to this file must be passed to the --config.file command-line flag when starting Prometheus.
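For example, assuming the config is saved as prometheus.yml in the directory you start Prometheus from:
./prometheus --config.file=prometheus.yml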
See these docs for more details about Prometheus configuration.
P.S. There is an alternative solution: push the metrics collected by Telegraf directly to a Prometheus-like system such as VictoriaMetrics instead of InfluxDB - see these docs. These metrics can later be queried with a PromQL-compatible query language, MetricsQL.
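As a minimal sketch of that setup (the hostname victoriametrics and port 8428 are assumptions for a default single-node instance, which accepts the InfluxDB line protocol), Telegraf's regular influxdb output can point straight at it:
[[outputs.influxdb]]
  ## VictoriaMetrics ingests the InfluxDB line protocol directly (assumed default port 8428)
  urls = ["http://victoriametrics:8428"]
  ## no database has to be created on the VictoriaMetrics side
  skip_database_creation = true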
Related
I am scraping logs from Docker with Promtail and shipping them to Loki.
It works very well, but I would like to remove the timestamp from the log line once Promtail has extracted it.
The reason is that I end up with a log panel where half of the screen is occupied by the timestamp. If I want to display the timestamp in the panel I can do that, so I don't really need it in the log line.
I have been reading the documentation, but I'm not sure how to approach it. logfmt? replace? timestamp?
https://grafana.com/docs/loki/latest/clients/promtail/stages/logfmt/
promtail-config.yml
server:
http_listen_port: 9080
grpc_listen_port: 0
positions:
filename: /tmp/positions.yaml
clients:
- url: http://loki:3100/loki/api/v1/push
scrape_configs:
# local machine logs
- job_name: local logs
static_configs:
- targets:
- localhost
labels:
job: varlogs
__path__: /var/log/*log
# docker containers
- job_name: containers
docker_sd_configs:
- host: unix:///var/run/docker.sock
refresh_interval: 15s
pipeline_stages:
- docker: {}
relabel_configs:
- source_labels: ['__meta_docker_container_label_com_docker_compose_service']
regex: '(.*)'
target_label: 'service'
Thank you
Actually, I just realized I was looking for the wrong thing. I just wanted Grafana to display less of each log line; the logs were formatted properly. I just had to select which fields to display.
Thanks!
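For anyone who does need to strip the timestamp from the log line itself, the replace stage is one way to do it; a minimal sketch, assuming the line starts with an ISO-8601 timestamp (the regex is illustrative, adjust it to your format):
pipeline_stages:
  - docker: {}
  # remove a leading timestamp such as "2022-01-01T12:00:00.000Z " from the line
  - replace:
      expression: '^(?P<ts>\S+\s+)'
      replace: ''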
I've been working with a Docker deployment and I'm seeing an irksome behavior. The full project is here (I'm using the v1.0.0-CODI tag): https://github.com/NACHC-CAD/anonlink-entity-service
Sometimes (often) I get the following error when running:
docker-compose -p anonlink -f tools/docker-compose.yml up --remove-orphans
I get:
Pulling db_init (data61/anonlink-app:v1.15.0-22-gba57975)...
ERROR: manifest for data61/anonlink-app:v1.15.0-22-gba57975 not found: manifest unknown: manifest unknown
anonlink-app is specified in the .yml file as:
data61/anonlink-app:${TAG:-latest}
How is it that Docker is looking for a non-existent tag?
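The ${TAG:-latest} substitution only falls back to latest when TAG is unset, so a TAG value exported in the shell or defined in an env file that Compose reads would explain it asking for that tag. Some illustrative checks (the .env path is an assumption; run them from the project root):
env | grep '^TAG='
grep -n '^TAG=' .env
docker-compose -p anonlink -f tools/docker-compose.yml config | grep 'image:'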
The full .yml file is shown below.
version: '3.4'
services:
db:
image: postgres:11.13
environment:
- POSTGRES_PASSWORD=rX%QpV7Xgyrz
volumes:
- psql:/var/lib/postgresql/data
#ports:
#- 5432:5432
healthcheck:
test: pg_isready -q -h db -p 5432 -U postgres
interval: 5s
timeout: 30s
retries: 5
minio:
image: minio/minio:RELEASE.2021-02-14T04-01-33Z
command: server /export
env_file: .env
volumes:
- minio:/export
ports:
- 9000:9000
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
interval: 30s
timeout: 20s
retries: 3
redis:
image: redis:5.0
# The flask application server
backend:
image: data61/anonlink-app:${TAG:-latest}
env_file: .env
environment:
- FLASK_DB_MIN_CONNECTIONS=1
- FLASK_DB_MAX_CONNECTIONS=10
depends_on:
- db
- db_init
- redis
- minio
- objectstore_init
# The application server can also setup the database
db_init:
image:
data61/anonlink-app:${TAG:-latest}
env_file:
- .env
environment:
- DEBUG=true
- DATABASE_PASSWORD=rX%QpV7Xgyrz
- FLASK_APP=entityservice
entrypoint: /bin/sh -c "dockerize -wait tcp://db:5432 alembic upgrade head"
depends_on:
- db
# Set up the object store to have another more restricted user
objectstore_init:
image: minio/mc:RELEASE.2021-02-14T04-28-06Z
environment:
- OBJECT_STORE_SECURE=false
env_file:
- .env
entrypoint: |
/bin/sh /opt/init-object-store.sh
volumes:
- ./init-object-store.sh:/opt/init-object-store.sh:ro
depends_on:
- minio
# A celery worker
worker:
image: data61/anonlink-app:${TAG:-latest}
depends_on:
- redis
- db
command: celery -A entityservice.async_worker worker --loglevel=info -O fair -Q celery,compute,highmemory
env_file:
- .env
environment:
- CELERY_ACKS_LATE=true
- REDIS_USE_SENTINEL=false
- CELERYD_MAX_TASKS_PER_CHILD=2048
#- CHUNK_SIZE_AIM=300_000_000
- CELERY_DB_MIN_CONNECTIONS=1
- CELERY_DB_MAX_CONNECTIONS=3
nginx:
image: data61/anonlink-nginx:${TAG:-latest}
ports:
- 8851:8851
depends_on:
- backend
environment:
TARGET_SERVICE: backend
PUBLIC_PORT: 8851
# A celery monitor. Useful for debugging.
# celery_monitor:
# image: data61/anonlink-app:${TAG:-latest}
# depends_on:
# - redis
# - worker
# command: celery flower -A entityservice.async_worker
# ports:
# - 8888:8888
# Jaeger UI is available at http://localhost:16686
jaeger:
image: jaegertracing/all-in-one:latest
environment:
COLLECTOR_ZIPKIN_HTTP_PORT: 9411
# ports:
# - 5775:5775/udp
# - 6831:6831/udp
# - 6832:6832/udp
# - 5778:5778
# - 16686:16686
# - 14268:14268
# - 9411:9411
volumes:
psql:
minio:
This values.yaml file also exists in the project (note it uses v1.15.1, not v1.15.0-22-gba57975):
rbac:
## TODO still needs work to fully lock down scope etc
## See issue #88
create: false
anonlink:
## Set arbitrary environment variables for the API and Workers.
config: {
## e.g.: to control which task is added to which celery worker queue.
## CELERY_ROUTES: "{
## 'entityservice.tasks.comparing.create_comparison_jobs': { 'queue': 'highmemory' }, ...
## }"
}
objectstore:
## Settings for the Object Store that Anonlink Entity Service uses internally
## Connect to the object store using https
secure: false
## Settings for uploads via Object Store
## Toggle the feature providing client's with restricted upload access to the object store.
## By default we don't expose the Minio object store, which is required for clients to upload
## via the object store. See section `minio.ingress` to create an ingress for minio.
uploadEnabled: true
## Server used as the external object store URL - provided to clients so should be externally
## accessible. If not provided, the minio.ingress is used (if enabled).
#uploadServer: "s3.amazonaws.com"
## Tell clients to make secure connections to the upload object store.
uploadSecure: true
## Object store credentials used to grant temporary upload access to clients
## Will be created with an "upload only" policy for a upload bucket if using the default
## MINIO provisioning.
uploadAccessKey: "EXAMPLE_UPLOAD_KEY"
uploadSecretKey: "EXAMPLE_UPLOAD_SECRET"
## The bucket for client uploads.
uploadBucket:
name: "uploads"
## Settings for downloads via Object Store
## Toggle the feature providing client's with restricted download access to the object store.
## By default we don't expose the Minio object store, which is required for clients to download
## via the object store.
downloadEnabled: true
## Tell clients to make secure connections to the download object store.
downloadSecure: true
## Server used as the external object store URL for downloads - provided to clients so
## should be externally accessible. If not provided, the minio.ingress is used (if enabled).
#downloadServer: "s3.amazonaws.com"
## Object store credentials used to grant temporary download access to clients
## Will be created with an "get only" policy if using the default MINIO provisioning.
downloadAccessKey: "EXAMPLE_DOWNLOAD_KEY"
downloadSecretKey: "EXAMPLE_DOWNLOAD_SECRET"
api:
## Deployment component name
name: api
## Defines the serviceAccountName to use when `rbac.create=false`
serviceAccountName: default
replicaCount: 1
## api Deployment Strategy type
strategy:
type: RollingUpdate
# type: Recreate
## Annotations to be added to api pods
##
podAnnotations: {}
# iam.amazonaws.com/role: linkage
## Annotations added to the api Deployment
deploymentAnnotations: # {}
# This annotation enables jaeger injection for open tracing
"sidecar.jaegertracing.io/inject": "true"
## Settings for the nginx proxy
www:
image:
repository: data61/anonlink-nginx
tag: "v1.4.9"
# pullPolicy: Always
pullPolicy: IfNotPresent
## Nginx proxy server resource requests and limits
## Ref: http://kubernetes.io/docs/user-guide/compute-resources/
resources:
limits:
cpu: 200m
memory: 256Mi
requests:
cpu: 200m
memory: 256Mi
app:
image:
repository: data61/anonlink-app
tag: "v1.15.1"
pullPolicy: IfNotPresent
## Ref: http://kubernetes.io/docs/user-guide/compute-resources/
resources:
limits:
cpu: 1
memory: 8Gi
requests:
cpu: 500m
memory: 512Mi
dbinit:
enabled: "true"
## Database init runs migrations after install and upgrade
image:
repository: data61/anonlink-app
tag: "v1.15.1"
## Ref: http://kubernetes.io/docs/user-guide/compute-resources/
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 250m
memory: 256Mi
## Annotations added to the database init job's pod.
# podAnnotations: {}
# sidecar.istio.io/inject: "false"
## A job that creates an upload only object store user.
objectstoreinit:
enabled: true
image:
repository: minio/mc
tag: RELEASE.2020-01-13T22-49-03Z
## Annotations added to the object store init job's pod.
# podAnnotations: {}
# sidecar.istio.io/inject: "false"
ingress:
## By default, we do not want the service to be accessible outside of the cluster.
enabled: false
## Ingress annotations
annotations: {}
## Suggested annotations
## To handle large uploads we increase the proxy buffer size
#ingress.kubernetes.io/proxy-body-size: 4096m
## Redirect to ssl
#ingress.kubernetes.io/force-ssl-redirect: "true"
## Deprecated but common
## https://kubernetes.io/docs/concepts/services-networking/ingress/#deprecated-annotation
# kubernetes.io/ingress.class: ""
path: /
pathType: Prefix
## Entity Service API Ingress hostnames
## Must be provided if Ingress is enabled
hosts: []
## E.g:
#- beta.anonlink.data61.xyz
## Ingress TLS configuration
## This example setup is for nginx-ingress. We use certificate manager.
## to create the TLS secret in the namespace with the name
## below.
tls: []
## secretName is the kubernetes secret which will contain the TLS secret and certificates
## for the provided host url. It is automatically generated from the deployed cert-manager.
#- secretName: beta-anonlink-data61-tls
# hosts:
# - beta.anonlink.data61.xyz
service:
annotations: []
labels:
tier: frontend
clusterIp: ""
## Expose the service to be accessed from outside the cluster (LoadBalancer service).
## or access it from within the cluster (ClusterIP service).
## Set the service type and the port to serve it.
## Ref: http://kubernetes.io/docs/user-guide/services/
## Most likely ingress is enabled so this should be ClusterIP,
## Otherwise "LoadBalancer".
type: ClusterIP
servicePort: 80
## If using a load balancer on AWS you can optionally lock down access
## to a given IP range. Provide a list of IPs that are allowed via a
## security group.
loadBalancerSourceRanges: []
workers:
name: "matcher"
image:
repository: "data61/anonlink-app"
tag: "v1.15.1"
pullPolicy: Always
## The initial number of workers for this deployment
## Note the workers.highmemory.replicaCount are in addition
replicaCount: 1
## Enable a horizontal pod autoscaler
## Note: The cluster must have metrics-server installed.
## https://kubernetes.io/docs/tasks/debug-application-cluster/resource-metrics-pipeline/
autoscaler:
enabled: false
minReplicas: 1
maxReplicas: 20
podAnnotations: {}
deploymentAnnotations: # {}
# This annotation enables jaeger injection for open tracing
"sidecar.jaegertracing.io/inject": "true"
#strategy: ""
## Additional Entity Service Worker container arguments
##
extraArgs: {}
## Worker configuration
## These settings populate the deployment's configmap.
## Desired task size in "number of comparisons"
## Note there is some overhead creating a task and a single dedicated cpu core can do between 50M and 100M
## comparisons per second, so much lower that 100M isn't generally worth splitting across celery workers.
CHUNK_SIZE_AIM: "300_000_000"
## More than this many entities and we skip caching in redis
MAX_CACHE_SIZE: "1_000_000"
## How many seconds do we keep cache ephemeral data such as run progress
## Default is 30 days:
CACHE_EXPIRY_SECONDS: "2592000"
## Specific configuration for celery
## Note that these configurations are the same for a "normal" worker, and a "highmemory" one,
## except for the requested resources and replicaCount which can differ.
celery:
## Number of fork worker celery node will have. It is recommended to use the same concurrency
## as workers.resources.limits.cpu
CONCURRENCY: "2"
## How many messages to prefetch at a time multiplied by the number of concurrent processes. Set to 1 because
## our tasks are usually quite "long".
PREFETCH_MULTIPLIER: "1"
## Maximum number of tasks a pool worker process can execute before it’s replaced with a new one
MAX_TASKS_PER_CHILD: "2048"
## Late ack means the task messages will be acknowledged after the task has been executed, not just before.
ACKS_LATE: "true"
## Currently, enable only the monitoring of celery.
monitor:
enabled: false
## Ref: http://kubernetes.io/docs/user-guide/compute-resources/
resources:
requests:
memory: 500Mi
cpu: 500m
## It is recommended to set limits. celery does not like to share resources.
limits:
memory: 1Gi
cpu: 2
## At least one "high memory" worker is also required.
highmemory:
replicaCount: 1
## Ref: http://kubernetes.io/docs/user-guide/compute-resources/
resources:
requests:
memory: 2Gi
cpu: 1
## It is recommended to set limits. celery does not like to share resources.
limits:
memory: 2Gi
cpu: 2
postgresql:
## See available settings and defaults at:
## https://github.com/kubernetes/charts/tree/master/stable/postgresql
nameOverride: "db"
persistence:
enabled: false
size: 8Gi
metrics:
enabled: true
#serviceMonitor:
#enabled: true
#namespace:
## Ref: http://kubernetes.io/docs/user-guide/compute-resources/
resources:
#limits:
# memory: 8Gi
requests:
#memory: 1Gi
cpu: 200m
global:
postgresql:
postgresqlDatabase: postgres
postgresqlUsername: postgres
postgresqlPassword: "examplePostgresPassword"
## In this section, we are not installing Redis. The main goal is to define configuration values for
## other services that need to access Redis.
redis:
## Note the `server` options are ignored if provisioning redis
## using this chart.
## External redis server url/ip
server: ""
## Does the redis server support the sentinel protocol
useSentinel: true
sentinelName: "mymaster"
## Note if deploying redis-ha you MUST have the same password below!
password: "exampleRedisPassword"
redis-ha:
## Settings for configuration of a provisioned redis ha cluster.
## https://github.com/DandyDeveloper/charts/tree/master/charts/redis-ha#configuration
## Provisioning is controlled in the `provision` section
auth: true
redisPassword: "exampleRedisPassword"
#replicas: 3
redis:
resources:
requests:
memory: 512Mi
cpu: 100m
limits:
memory: 10Gi
sentinel:
## Ref: http://kubernetes.io/docs/user-guide/compute-resources/
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
memory: 256Mi
persistentVolume:
enabled: false
size: 10Gi
nameOverride: "memstore"
# Enable transparent hugepages
# https://github.com/helm/charts/tree/master/stable/redis-ha#host-kernel-settings
sysctlImage:
enabled: true
mountHostSys: true
command:
- /bin/sh
- -xc
- |-
sysctl -w net.core.somaxconn=10000
echo never > /host-sys/kernel/mm/transparent_hugepage/enabled
# Enable prometheus exporter sidecar
exporter:
enabled: true
minio:
## Configure the object storage
## https://github.com/helm/charts/blob/master/stable/minio/values.yaml
## Root access credentials for the object store
## Note no defaults are provided to help prevent data breaches where
## the object store is exposed to the internet
#accessKey: "exampleMinioAccessKey"
#secretKey: "exampleMinioSecretKet"
defaultBucket:
enabled: true
name: "anonlink"
## Settings for deploying standalone object store
## Can distribute the object store across multiple nodes.
mode: "standalone"
service.type: "ClusterIP"
persistence:
enabled: false
size: 50Gi
storageClass: "default"
metrics:
serviceMonitor:
enabled: false
#additionalLabels: {}
#namespace: nil
# If you'd like to expose the MinIO object store
ingress:
enabled: false
#labels: {}
#annotations: {}
#hosts: []
#tls: []
nameOverride: "minio"
## Ref: http://kubernetes.io/docs/user-guide/compute-resources/
resources:
requests:
memory: 256Mi
cpu: 100m
limits:
memory: 5Gi
provision:
# enable to deploy a standalone version of each service as part of the helm deployment
minio: true
postgresql: true
redis: true
## Tracing config used by jaeger-client-python
## https://github.com/jaegertracing/jaeger-client-python/blob/master/jaeger_client/config.py
tracingConfig: |-
logging: true
metrics: true
sampler:
type: const
param: 1
## Custom logging file used to override the default settings. Will be used by the workers and the api container.
## Example of logging configuration:
loggingCfg: |-
version: 1
disable_existing_loggers: False
formatters:
simple:
format: "%(message)s"
file:
format: "%(asctime)-15s %(name)-12s %(levelname)-8s: %(message)s"
filters:
stderr_filter:
(): entityservice.logger_setup.StdErrFilter
stdout_filter:
(): entityservice.logger_setup.StdOutFilter
handlers:
stdout:
class: logging.StreamHandler
level: DEBUG
formatter: simple
filters: [stdout_filter]
stream: ext://sys.stdout
stderr:
class: logging.StreamHandler
level: ERROR
formatter: simple
filters: [stderr_filter]
stream: ext://sys.stderr
info_file_handler:
class: logging.handlers.RotatingFileHandler
level: INFO
formatter: file
filename: info.log
maxBytes: 10485760 # 10MB
backupCount: 20
encoding: utf8
error_file_handler:
class: logging.handlers.RotatingFileHandler
level: ERROR
formatter: file
filename: errors.log
maxBytes: 10485760 # 10MB
backupCount: 20
encoding: utf8
loggers:
entityservice:
level: INFO
entityservice.database.util:
level: WARNING
entityservice.cache:
level: WARNING
entityservice.utils:
level: INFO
celery:
level: INFO
jaeger_tracing:
level: WARNING
propagate: no
werkzeug:
level: WARNING
propagate: no
root:
level: INFO
handlers: [stdout, stderr, info_file_handler, error_file_handler]
This is also interesting...
According to the web site https://docs.docker.com/language/golang/develop/ the tag v1.15.0 seems to exist.
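To see which tags actually exist for the image, you can query Docker Hub directly instead of relying on the docs; a sketch (the jq filter is illustrative):
curl -s 'https://hub.docker.com/v2/repositories/data61/anonlink-app/tags?page_size=100' | jq -r '.results[].name' | grep '^v1.15'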
The .env file looks like this:
SERVER=http://nginx:8851
DATABASE_PASSWORD=myPassword
# Object Store Configuration
# Provide root credentials to MINIO to set up more restricted service accounts
# MC_HOST_alias is equivalent to manually configuring a minio host
# mc config host add minio http://minio:9000 <MINIO_ACCESS_KEY> <MINIO_SECRET_KEY>
#- MC_HOST_minio=http://AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY#minio:9000
MINIO_SERVER=minio:9000
MINIO_ACCESS_KEY=AKIAIOSFODNN7EXAMPLE
MINIO_SECRET_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
MINIO_SECURE=false
# Object store account which will have upload only object store access.
#UPLOAD_OBJECT_STORE_SERVER=
UPLOAD_OBJECT_STORE_BUCKET=uploads
UPLOAD_OBJECT_STORE_SECURE=false
UPLOAD_OBJECT_STORE_ACCESS_KEY=EXAMPLE_UPLOAD_ACCESS_KEY
UPLOAD_OBJECT_STORE_SECRET_KEY=EXAMPLE_UPLOAD_SECRET_ACCESS_KEY
# Object store account which will have "read only" object store access.
#DOWNLOAD_OBJECT_STORE_SERVER=
DOWNLOAD_OBJECT_STORE_ACCESS_KEY=EXAMPLE_DOWNLOAD_ACCESS_KEY
DOWNLOAD_OBJECT_STORE_SECRET_KEY=EXAMPLE_DOWNLOAD_SECRET_ACCESS_KEY
DOWNLOAD_OBJECT_STORE_SECURE=false
# Logging, monitoring and metrics
LOG_CFG=entityservice/verbose_logging.yaml
JAEGER_AGENT_HOST=jaeger
SOLVER_MAX_CANDIDATE_PAIRS=500000000
SIMILARITY_SCORES_MAX_CANDIDATE_PAIRS=999000000
One of the targets under static_configs in my prometheus.yml config file is secured with basic authentication. As a result, a "Connection refused" error is always displayed against that target on the Prometheus Targets page.
I have researched how to set up Prometheus to provide the credentials when scraping that particular target, but couldn't find a solution. What I did find in the docs was how to set it up at the scrape_config level. That won't work for me, because the same job has other targets that are not protected with basic_auth.
Please help me out with this challenge.
Here is part of my .yml config as regards my challenge.
scrape_configs:
# The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
- job_name: 'prometheus'
# Override the global default and scrape targets from this job every 5 seconds.
scrape_interval: 5s
scrape_timeout: 5s
# metrics_path defaults to '/metrics'
# scheme defaults to 'http'.
static_configs:
- targets: ['localhost:5000']
labels:
service: 'Auth'
- targets: ['localhost:5090']
labels:
service: 'Approval'
- targets: ['localhost:6211']
labels:
service: 'Credit Assessment'
- targets: ['localhost:6090']
labels:
service: 'Sweep'
- targets: ['localhost:6500']
labels:
I would like to add more details to PatientZro's answer.
In my case, I needed to create another job (as specified), but basic_auth needs to be at the same level of indentation as job_name. See the example here.
In addition, my basic_auth targets require a metrics_path, as the metrics are not exposed at the root of my domain.
Here is an example with an API endpoint specified:
- job_name: 'myapp_health_checks'
scrape_interval: 5m
scrape_timeout: 30s
static_configs:
- targets: ['mywebsite.org']
metrics_path: "/api/health"
basic_auth:
username: 'email#username.me'
password: 'cfgqvzjbhnwcomplicatedpasswordwjnqmd'
Create another job for the target that needs auth.
So just under what you've posted, add another:
- job_name: 'prometheus-basic_auth'
scrape_interval: 5s
scrape_timeout: 5s
static_configs:
- targets: ['localhost:5000']
labels:
service: 'Auth'
basic_auth:
username: foo
password: bar
I'm monitoring multiple computers in the same cluster; for that I'm using Prometheus.
Here is my config file, prometheus.yml:
# my global config
global:
scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
# scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
alertmanagers:
- static_configs:
- targets:
# - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
# - "first_rules.yml"
# - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
# The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
- job_name: "Server-monitoring-Api"
# metrics_path defaults to '/metrics'
# scheme defaults to 'http'.
static_configs:
- targets: ["localhost:9090"]
- targets: ["localhost:9182"]
- targets: ["192.168.1.71:9182"]
- targets: ["192.168.1.84:9182"]
I'm new to Prometheus. I want to show the name of each target, i.e. rather than, for example, 192.168.1.71:9182, I only want the target's name to be shown. I did some research and found this:
relabel_configs:
- source_labels: [__meta_ec2_tag_Name]
target_label: instance
But I don't know how to use this to relabel my targets (instances). Any help will be appreciated, thanks.
The snippet that you found only works if you're using the EC2 service discovery feature of Prometheus (which doesn't seem to be your case, since you're using static targets).
I see a couple of options. You could expose an additional metric (e.g. hostname) directly from your exporter with the hostname as its value. Or you could use the textfile collector to expose the same metric as a static value (on a different port).
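As a sketch of the textfile collector option (assuming node_exporter is installed; the directory and metric name are examples):
# write a static metric into the textfile collector directory
echo 'machine_role{role="frontend"} 1' > /var/lib/node_exporter/textfile/role.prom
# node_exporter must be started with the matching flag:
#   --collector.textfile.directory=/var/lib/node_exporter/textfile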
I recommend reading this post which explains why having a different metric for the "name" or "role" of the machine is usually a better approach than having a hostname label in your metrics.
It is also possible to add a custom label directly in the Prometheus config, something like the following (since you have static targets anyhow). Finally, if you are already using the Prometheus node exporter you could use the node_uname_info metric (its nodename label); see the query sketch after the config below.
- job_name: 'Kafka'
metrics_path: /metrics
static_configs:
- targets: ['10.0.0.4:9309']
labels:
hostname: hostname-a
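And as a sketch of the node_uname_info approach mentioned above, you can join its nodename label onto any other metric at query time (node_cpu_seconds_total is just an example series):
node_cpu_seconds_total * on (instance) group_left (nodename) node_uname_info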
I have a Docker Swarm with a Prometheus container and 1-n containers for a specific microservice.
The microservice containers can be reached via a URL. I suppose requests to this URL are load-balanced in some way (of course...).
Currently I have spawned two microservice containers. Querying the metrics now seems to toggle between the two containers. Example: number of total requests: 10, 13, 10, 13, 10, 13, ...
This is my Prometheus configuration. What do I have to do? I do not want to adjust the Prometheus config each time I kill or start a microservice container.
scrape_configs:
- job_name: 'myjobname'
metrics_path: '/prometheus'
scrape_interval: 15s
static_configs:
- targets: ['the-service-url:8080']
labels:
application: myapplication
UPDATE 1
I changed my configuration as follows which seems to work. This configuration uses a dns lookup inside of the Docker Swarm and finds all instances running the specified service.
scrape_configs:
- job_name: 'myjobname'
metrics_path: '/prometheus'
scrape_interval: 15s
dns_sd_configs:
- names: ['tasks.myServiceName']
type: A
port: 8080
The question here is: Does this configuration recognize that a Docker instance is stopped and another one is started?
UPDATE 2
There is a parameter for what I am asking for:
scrape_configs:
- job_name: 'myjobname'
metrics_path: '/prometheus'
scrape_interval: 15s
dns_sd_configs:
- names: ['tasks.myServiceName']
type: A
port: 8080
# The time after which the provided names are refreshed
[ refresh_interval: <duration> | default = 30s ]
That should do the trick.
So the answer is very simple:
There are multiple, documented ways to scrape.
I am using the DNS-lookup way:
scrape_configs:
- job_name: 'myjobname'
metrics_path: '/prometheus'
scrape_interval: 15s
dns_sd_configs:
- names: ['tasks.myServiceName']
type: A
port: 8080
refresh_interval: 15s