Duplicate timestamp values in Grafana while the Thanos datasource shows a single timestamp value - monitoring

Grafana is showing duplicate timestamp values while Thanos shows the correct single timestamp value. When I query the application metric with a single curl request against Thanos, it returns the correct value, but when I run the same query in Grafana it shows two count values. I'm using the Telegraf agent to collect metrics into Prometheus.
My whole setup runs in Kubernetes, and I'm using Telegraf's statsd input for application monitoring.
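The direct check I mention is a plain curl against the Prometheus-compatible query API that Thanos Query exposes; roughly like this (host, port, and metric name are placeholders for my setup):

# query Thanos directly via the Prometheus-compatible HTTP API
# (host, port, and metric name below are placeholders)
curl -s 'http://thanos-query.monitoring.svc.cluster.local:9090/api/v1/query' \
  --data-urlencode 'query=my_app_request_count'

This returns a single sample per series, while the same PromQL expression in a Grafana panel shows two count values.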
Telegraf conf >>
[agent]
interval = "15s"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 10000
flush_buffer_when_full = true
collection_jitter = "0s"
flush_interval = "1s"
flush_jitter = "0s"
quiet = false
debug = false
logfile = "/var/log/telegraf/telegraf.log"
logfile_rotation_max_size = "10MB"
logfile_rotation_max_archives = 5
hostname = "${HOSTNAME}"
[global_tags]
dc = "${datacenter}"
component = "k8s"
role = "node"
job = "${job}"
service = "containerorchestration"
subcomponent = "worker"
organization = "${organization}"
environment = "${environment}"
environmentversion = "${environmentversion}"
infraversion = "${infraversion}"
[[inputs.cpu]]
percpu = false
totalcpu = true
fielddrop = ["time_*"]
[[inputs.disk]]
ignore_fs = ["tmpfs", "devtmpfs", "devfs", "overlay", "aufs", "squashfs"]
[[inputs.diskio]]
[[inputs.kernel]]
[[inputs.mem]]
[[inputs.processes]]
[[inputs.swap]]
[[inputs.system]]
[[inputs.net]]
fielddrop = ["icmp_*", "icmpmsg_*", "tcp_*", "udp_*", "udplite_*", "ip_*"]
[[inputs.netstat]]
[[inputs.linux_sysctl_fs]]
[[outputs.prometheus_client]]
listen = ":9273"
metric_version = 2
path = "/metrics"
expiration_interval = "16s"
export_timestamp = true
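As a sanity check, the scrape endpoint exposed by this output can be inspected by hand; with export_timestamp = true every exposed sample line ends with its timestamp, so duplicate samples for the same series would be visible here (the host and grep pattern are placeholders):

# inspect what Telegraf actually exposes for Prometheus to scrape
# (port 9273 and path /metrics come from the prometheus_client output above)
curl -s http://<telegraf-host>:9273/metrics | grep 'my_app_'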
Telegraf statsd conf >>
[[inputs.statsd]]
protocol = "udp"
max_tcp_connections = 250
tcp_keep_alive = false
service_address = ":8130"
delete_gauges = true
delete_counters = true
delete_sets = true
delete_timings = true
parse_data_dog_tags = true
percentiles = [90.0, 95.0, 99.0]
metric_separator = "_"
datadog_extensions = true
allowed_pending_messages = 10000
percentile_limit = 1000
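A quick way to exercise this listener is to push a one-off DogStatsD-style metric at it; the metric name and tag below are just examples:

# send a test counter to the statsd input on UDP port 8130 (as configured above);
# the "|#key:value" suffix is the DogStatsD tag syntax enabled by
# parse_data_dog_tags / datadog_extensions
echo -n "myapp.request.count:1|c|#environment:performance" | nc -u -w1 127.0.0.1 8130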
And the Prometheus job conf >>
- job_name: 'ec2-telegraf'
  sample_limit: 4000
  metrics_path: '/metrics'
  scrape_interval: '15s'
  ec2_sd_configs:
    - region: "XXXXXXX"
      profile: "XXXXXXXXXX"
      role_arn: XXXXXXXXXXXXXXX
      refresh_interval: 100s
      port: 9273
      filters:
        - name: instance-state-name
          values:
            - running
        - name: tag:Environment
          values:
            - performance
  relabel_configs:
    - source_labels: [__meta_ec2_tag_Businessunit]
      target_label: businessunit
    - source_labels: [__meta_ec2_tag_Environment]
      target_label: environment
    - source_labels: [__meta_ec2_tag_Techteam]
      target_label: techteam
    - source_labels: [__meta_ec2_tag_component]
      target_label: component
    - source_labels: [__meta_ec2_tag_subcomponent]
      target_label: subcomponent
    - source_labels: [__meta_ec2_tag_role]
      target_label: role
    - source_labels: [__meta_ec2_tag_aws_autoscaling_groupName]
      target_label: asgname
    - source_labels: [__meta_ec2_tag_Service]
      target_label: service
Any help or suggestions would be appreciated.
Thanks

Related

Docker-Compose GitLab / PostgreSQL won't start / database permission denied after copying the instance

To test an upgrade of my Docker container, I copied its service definition in docker-compose.yml. I also copied the Docker files and changed the configuration in the gitlab.rb file. I changed the domain and port as follows:
### OLD CONF
gitlab:
  image: gitlab/gitlab-ce:14.9.0-ce.0
  depends_on:
    - nginx-proxy
    - nginx-proxy-letsencrypt
  restart: always
  volumes:
    - './gitlab/config:/etc/gitlab'
    - './gitlab/logs:/var/log/gitlab'
    - './gitlab/data:/var/opt/gitlab'
  # volumes:
  #   - ./gitlab:/etc/gitlab
  #   - ./gitlab/backups:/var/opt/gitlab/backups
  #   - ./gitlab/ssh:/etc/ssh
  #   - ./gitlab/git-data:/var/opt/gitlab/git-data
  ports:
    - "10296:22"
  environment:
    GITLAB_OMNIBUS_CONFIG: |
      external_url 'https://gitlab.mysite.biz'
      # letsencrypt
      letsencrypt['enabled'] = false
      # email
      gitlab_rails['gitlab_email_enabled'] = true
      gitlab_rails['gitlab_email_from'] = ''
      gitlab_rails['gitlab_email_display_name'] = 'Gitlab'
      gitlab_rails['gitlab_email_reply_to'] = 'admin@mysite.de'
      # backups
      gitlab_rails['backup_keep_time'] = 604800 # 7 days
      gitlab_rails['backup_upload_connection'] = {
        :provider => 'Local',
        :local_root => '/backup'
      }
      gitlab_rails['backup_upload_remote_directory'] = '.'
      # ssh
      gitlab_rails['gitlab_shell_ssh_port'] = 10296
      # mailserver
      gitlab_rails['smtp_enable'] = true
      gitlab_rails['smtp_address'] = "mail.mysite.de"
      gitlab_rails['smtp_port'] = 587
      gitlab_rails['smtp_user_name'] = "admin@mysite.de"
      gitlab_rails['smtp_password'] = "1234"
      gitlab_rails['smtp_domain'] = "mysite.de"
      gitlab_rails['smtp_authentication'] = "plain"
      gitlab_rails['smtp_enable_starttls_auto'] = true
      gitlab_rails['smtp_tls'] = false
      gitlab_rails['gitlab_email_from'] = 'admin@mysite.de'
      gitlab_rails['gitlab_email_reply_to'] = 'admin@mysite.de'
    VIRTUAL_HOST: gitlab.mysite.biz
    LETSENCRYPT_HOST: gitlab.mysite.biz
  container_name: gitlab
  logging:
    options:
      max-size: "100m"
      max-file: "2"
### NEW CONF
gitlab-new:
  image: gitlab/gitlab-ce:14.9.0-ce.0
  depends_on:
    - nginx-proxy
    - nginx-proxy-letsencrypt
  restart: always
  volumes:
    - './gitlab-new/config:/etc/gitlab'
    - './gitlab-new/logs:/var/log/gitlab'
    - './gitlab-new/data:/var/opt/gitlab'
  ports:
    - "10297:22"
  environment:
    GITLAB_SKIP_UNMIGRATED_DATA_CHECK: 'true'
    GITLAB_OMNIBUS_CONFIG: |
      external_url 'https://gitlab-new.mysite.biz'
      # letsencrypt
      letsencrypt['enabled'] = false
      # email
      gitlab_rails['gitlab_email_enabled'] = true
      gitlab_rails['gitlab_email_from'] = ''
      gitlab_rails['gitlab_email_display_name'] = 'Gitlab'
      gitlab_rails['gitlab_email_reply_to'] = 'admin@mysite.de'
      # backups
      gitlab_rails['backup_keep_time'] = 604800 # 7 days
      gitlab_rails['backup_upload_connection'] = {
        :provider => 'Local',
        :local_root => '/backup'
      }
      gitlab_rails['backup_upload_remote_directory'] = '.'
      # ssh
      gitlab_rails['gitlab_shell_ssh_port'] = 10297
      # mailserver
      gitlab_rails['smtp_enable'] = true
      gitlab_rails['smtp_address'] = "mail.mysite.de"
      gitlab_rails['smtp_port'] = 587
      gitlab_rails['smtp_user_name'] = "admin@mysite.de"
      gitlab_rails['smtp_password'] = "1234"
      gitlab_rails['smtp_domain'] = "mysite.de"
      gitlab_rails['smtp_authentication'] = "plain"
      gitlab_rails['smtp_enable_starttls_auto'] = true
      gitlab_rails['smtp_tls'] = false
      gitlab_rails['gitlab_email_from'] = 'admin@mysite.de'
      gitlab_rails['gitlab_email_reply_to'] = 'admin@mysite.de'
    VIRTUAL_HOST: gitlab-new.mysite.biz
    LETSENCRYPT_HOST: gitlab-new.mysite.biz
  container_name: gitlab-new
  logging:
    options:
      max-size: "100m"
      max-file: "2"
At the first run of "docker-compose up gitlab-new", the problem was that the folder opt/gitlab/embedded/service/gitlab-rails/log/ didn't have the right permissions on all log files. I solved that with chmod 0666, but now I have the problem that my PostgreSQL database won't start.
The log:
2023-02-17T10:56:28.663Z: Cached record for ApplicationSetting couldn't be loaded, falling back to uncached record: could not connect to server: Permission denied
Is the server running locally and accepting
connections on Unix domain socket "/var/opt/gitlab/postgresql/.s.PGSQL.5432"?
When I try to start PostgreSQL it stays down with no message.
Where can I at least find more detailed information about why PostgreSQL won't start? I can't find any logs about this.
Also, what could be the reason? As you now know, I copied the instance to keep the data; could this be the cause, and if so, why?
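A few places that may surface more detail, assuming the gitlab-new service and volume paths from the compose file above (the Omnibus bundle writes PostgreSQL logs under /var/log/gitlab/postgresql):

# status and logs of the bundled PostgreSQL inside the new container
docker-compose exec gitlab-new gitlab-ctl status postgresql
docker-compose exec gitlab-new gitlab-ctl tail postgresql

# the same logs are reachable on the host through the ./gitlab-new/logs volume mount
tail -n 100 ./gitlab-new/logs/postgresql/current

Since the data directory was copied (and then chmod-ed), file ownership under ./gitlab-new/data is a plausible suspect for the "Permission denied" on the Unix socket; the GitLab Docker image also ships an update-permissions helper (docker exec gitlab-new update-permissions) that is sometimes suggested for permission problems after moving data around.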

Artillery + Playwright, StatsD data not being ingested in InfluxDB correctly by Telegraf (Template not working)

I have some tests written using Artillery + Playwright, and I am using the publish-metrics plugin with type influxdb-statsd. I then have the following Telegraf config:
[[outputs.influxdb_v2]]
urls = ["http://${INFLUX_DB2_HOST_ADDRESS}:8086"]
token = "${INFLUX_DB2_TOKEN}"
organization = "${INFLUX_DB2_ORGANIZATION}"
bucket = "${INFLUX_DB2_BUCKET}"
[[inputs.statsd]]
protocol = "udp"
max_tcp_connections = 250
tcp_keep_alive = false
service_address = ":8125"
delete_gauges = true
delete_counters = true
delete_sets = true
delete_timings = true
metric_separator = "_"
parse_data_dog_tags = true
datadog_extensions = true
datadog_distributions = false
Data from Artillery is sent to statsd in this format:
artillery.browser.page.FID.compliance-hub_dashboard.min:3.2|g
artillery.browser.page.FID.compliance-hub_dashboard.max:3.2|g
artillery.browser.page.FID.compliance-hub_dashboard.count:2|g
artillery.browser.page.FID.compliance-hub_dashboard.p50:3.2|g
I would like to set up a Telegraf template so that in InfluxDB,
artillery.browser.page.FID.compliance-hub_dashboard is a measurement and min, max, count, and p50 are fields.
How do I do that?
I tried:
templates = [
"measurement.measurement.measurement.measurement.measurement.field",
]
but it's not working. :(
What I see in InfluxDB is a measurement of artillery_browser_page_FID_compliance-hub_dashboard_min with a field of value = 3.2
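For reference, the statsd input parses metric names with graphite-style templates, and a template can be scoped to a name prefix. A sketch along those lines, assuming the six dot-separated segments above should map to five measurement parts plus a trailing field (untested, and not verified against the datadog_extensions settings):

[[inputs.statsd]]
  # ... existing settings as above ...
  templates = [
    "artillery.browser.page.* measurement.measurement.measurement.measurement.measurement.field",
  ]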

Gitlab-CE running in Docker-Container, but can't get Container Registry running

I have Gitlab-CE running in a Docker-Container and everything works fine, but I can't get the container-registry running.
This is my docker-compose.yml
web:
  image: 'gitlab/gitlab-ce:latest'
  restart: always
  hostname: 'git.mydomain.com'
  environment:
    GITLAB_OMNIBUS_CONFIG: |
      #SSL
      external_url 'https://git.mydomain.com'
      letsencrypt['enable'] = false
      # nginx['redirect_http_to_https'] = true
      #registry_nginx['redirect_http_to_https'] = true
      #mattermost_nginx['redirect_http_to_https'] = true
      #letsencrypt['enable'] = false
      nginx['enable'] = true
      nginx['listen'] = 443
      nginx['client_max_body_size']='250m'
      nginx['redirect_http_to_https'] = true
      nginx['ssl_certificate'] = '/etc/gitlab/ssl/git.mydomain.com.crt.pem'
      nginx['ssl_certificate_key'] = '/etc/gitlab/ssl/git.mydomain.com.key.pem'
      nginx['ssl_protocols']="TLSv1.1 TLSv1.2"
      #ENABLE CONTAINER REGISTRY
      registry_external_url = 'https://git.mydomain.com:4567'
      registry_nginx['listen_port'] = 4567
      registry_nginx['listen_https'] = false
      #gitlab_rails['registry_path'] = "/var/gitlab/gitlab-rails/shared/registry"
      #gitlab_rails['gitlab_default_projects_features_container_registry'] = true
      #registry_enable = true
      #registry_nginx['enable'] = true
      #gitlab_rails['lfs_enabled'] = true
      #registry_nginx['ssl_certificate'] = "/etc/gitlab/ssl/git.mydomain.com.crt.pem'"
      #registry_nginx['ssl_certificate_key'] = "/etc/gitlab/ssl/git.mydomain.com.key.pem"
      #SMTP
      gitlab_rails['smtp_enable'] = true
      gitlab_rails['smtp_address'] = 'smtp.mydomain.de'
      gitlab_rails['smtp_port'] = 465
      gitlab_rails['smtp_user_name'] = 'me@mydomain.com'
      gitlab_rails['smtp_password'] = '...'
      gitlab_rails['smtp_domain'] = 'https://com.mydomain.de'
      gitlab_rails['smtp_enable_starttls_auto'] = true
      gitlab_rails['smtp_tls'] = true
      gitlab_rails['smtp_openssl_verify_mode'] = 'none'
      # If your SMTP server does not like the default 'From: gitlab@localhost' you
      # can change the 'From' with this setting.
      gitlab_rails['gitlab_email_from'] = 'gitlab@mydomain.com'
      gitlab_rails['gitlab_email_reply_to'] = 'noreply@mydomain.com'
      # Add any other gitlab.rb configuration here, each on its own line
  ports:
    - '443:443'
    - '80:80'
    - '4567:4567'
  volumes:
    - '/srv/gitlab/config:/etc/gitlab'
    - '/srv/gitlab/config/ssl:/etc/gitlab/ssl'
    - '/srv/gitlab/logs:/var/log/gitlab'
    - '/srv/gitlab/data:/var/opt/gitlab'
I have already tried the commented-out lines regarding the container registry in all possible combinations, to no avail. In some cases GitLab runs but without the container registry, and in other cases GitLab fails to start.
What am I missing?
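One detail that might matter: in gitlab.rb syntax, registry_external_url is set like external_url, i.e. as a bare method call without an equals sign (with the = above, Ruby just assigns a local variable and the setting never reaches Omnibus). A minimal, untested sketch that reuses the certificate paths already mounted above:

# inside GITLAB_OMNIBUS_CONFIG
registry_external_url 'https://git.mydomain.com:4567'
registry_nginx['ssl_certificate'] = "/etc/gitlab/ssl/git.mydomain.com.crt.pem"
registry_nginx['ssl_certificate_key'] = "/etc/gitlab/ssl/git.mydomain.com.key.pem"

Port 4567 is already published in the ports section, so the bundled registry nginx only needs to listen there with a valid certificate.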

Telegraf http listener v2: unable to send JSON with string values

I'm trying to send this very simple JSON string to Telegraf to be saved into InfluxDB:
{ "id": "id_123", "value": 10 }
So the request would be this: curl -i -XPOST 'http://localhost:8080/telegraf' --data-binary '{"id": "id_123","value": 10}'
When I make that request, I get the following answer: HTTP/1.1 204 No Content Date: Tue, 20 Apr 2021 13:02:49 GMT, but when I check what was written to the database, there is only the value field:
select * from http_listener_v2
time host influxdb_database value
---- ---- ----------------- -----
1618923747863479914 my.host.com my_db 10
What am I doing wrong?
Here's my Telegraf config:
[global_tags]
[agent]
interval = "10s"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "0s"
flush_interval = "10s"
flush_jitter = "0s"
precision = ""
hostname = ""
omit_hostname = false
# OUTPUTS
[[outputs.influxdb]]
urls = ["http://127.0.0.1:8086"]
database = "telegraf"
username = "xxx"
password = "xxx"
[outputs.influxdb.tagdrop]
influxdb_database = ["*"]
[[outputs.influxdb]]
urls = ["http://127.0.0.1:8086"]
database = "httplistener"
username = "xxx"
password = "xxx"
[outputs.influxdb.tagpass]
influxdb_database = ["httplistener"]
# INPUTS
## system
[[inputs.cpu]]
percpu = true
totalcpu = true
collect_cpu_time = false
report_active = false
[[inputs.disk]]
ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]
[[inputs.mem]]
[[inputs.swap]]
[[inputs.system]]
## http listener
[[inputs.http_listener_v2]]
service_address = ":8080"
path = "/telegraf"
methods = ["POST", "PUT"]
data_source = "body"
data_format = "json"
[inputs.http_listener_v2.tags]
influxdb_database = "httplistener"
The solution: use json_string_fields = ["id"]
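For context, json_string_fields is an option of the JSON data format, so it sits next to data_format inside the input block; only that one line is new relative to the config above (the [inputs.http_listener_v2.tags] table stays as it is):

[[inputs.http_listener_v2]]
  service_address = ":8080"
  path = "/telegraf"
  methods = ["POST", "PUT"]
  data_source = "body"
  data_format = "json"
  json_string_fields = ["id"]

By default the JSON parser keeps only numeric values as fields; string values are dropped unless they are listed in json_string_fields (or promoted to tags via tag_keys).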

Unable to get response http Post to local express app from Kapacitor stream

I am following an SE thread to get a response to an HTTP POST on an Express node, but I am unable to get any response from Kapacitor.
Environment
I am using Windows 10 via PowerShell.
I am connected to an internal InfluxDB server, which is referenced in kapacitor.conf, and I have a TICKscript to stream data via it.
kapacitor.conf
hostname = "134.102.97.81"
data_dir = "C:\\Users\\des\\.kapacitor"
skip-config-overrides = true
default-retention-policy = ""
[alert]
persist-topics = true
[http]
bind-address = ":9092"
auth-enabled = false
log-enabled = true
write-tracing = false
pprof-enabled = false
https-enabled = false
https-certificate = "/etc/ssl/kapacitor.pem"
https-private-key = ""
shutdown-timeout = "10s"
shared-secret = ""
[replay]
dir = "C:\\Users\\des\\.kapacitor\\replay"
[storage]
boltdb = "C:\\Users\\des\\.kapacitor\\kapacitor.db"
[task]
dir = "C:\\Users\\des\\.kapacitor\\tasks"
snapshot-interval = "1m0s"
[load]
enabled = false
dir = "C:\\Users\\des\\.kapacitor\\load"
[[influxdb]]
enabled = true
name = "DB5Server"
default = true
urls = ["https://influxdb.internal.server.address:8086"]
username = "user"
password = "password"
ssl-ca = ""
ssl-cert = ""
ssl-key = ""
insecure-skip-verify = true
timeout = "0s"
disable-subscriptions = true
subscription-protocol = "https"
subscription-mode = "cluster"
kapacitor-hostname = ""
http-port = 0
udp-bind = ""
udp-buffer = 1000
udp-read-buffer = 0
startup-timeout = "5m0s"
subscriptions-sync-interval = "1m0s"
[influxdb.excluded-subscriptions]
_kapacitor = ["autogen"]
[logging]
file = "STDERR"
level = "DEBUG"
[config-override]
enabled = true
[[httppost]]
endpoint = "kapacitor"
url = "http://localhost:1440"
headers = { Content-Type = "application/json;charset=UTF-8"}
alert-template = "{\"id\": {{.ID}}}"
The daemon runs without any problems.
test2.tick
dbrp "DBTEST"."autogen"
stream
    |from()
        .measurement('humid')
    |alert()
        .info(lambda: TRUE)
        .post()
        .endpoint('kapacitor')
I already defined the task: .\kapacitor.exe define bc_1 -tick test2.tick
and enabled it: .\kapacitor.exe enable bc_1
The status shows nothing:
.\kapacitor.exe show bc_1
ID: bc_1
Error:
Template:
Type: stream
Status: enabled
Executing: true
Created: 13 Mar 19 15:33 CET
Modified: 13 Mar 19 16:23 CET
LastEnabled: 13 Mar 19 16:23 CET
Databases Retention Policies: ["NIMBLE"."autogen"]
TICKscript:
dbrp "TESTDB"."autogen"
stream
|from()
.measurement('humid')
|alert()
.info(lambda: TRUE)
.post()
.endpoint('kapacitor')
DOT:
digraph bc_1 {
graph [throughput="0.00 points/s"];
stream0 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
stream0 -> from1 [processed="0"];
from1 [avg_exec_time_ns="0s" errors="0" working_cardinality="0" ];
from1 -> alert2 [processed="0"];
alert2 [alerts_inhibited="0" alerts_triggered="0" avg_exec_time_ns="0s" crits_triggered="0" errors="0" infos_triggered="0" oks_triggered="0" warns_triggered="0" working_cardinality="0" ];
}
The Daemon logs provide this for the task
ts=2019-03-13T16:25:23.640+01:00 lvl=debug msg="starting enabled task on startup" service=task_store task=bc_1
ts=2019-03-13T16:25:23.677+01:00 lvl=debug msg="starting task" service=kapacitor task_master=main task=bc_1
ts=2019-03-13T16:25:23.678+01:00 lvl=info msg="started task" service=kapacitor task_master=main task=bc_1
ts=2019-03-13T16:25:23.679+01:00 lvl=debug msg="listing dot" service=kapacitor task_master=main dot="digraph bc_1 {\nstream0 -> from1;\nfrom1 -> alert2;\n}"
ts=2019-03-13T16:25:23.679+01:00 lvl=debug msg="started task during startup" service=task_store task=bc_1
ts=2019-03-13T16:25:23.680+01:00 lvl=debug msg="opened service" source=srv service=*task_store.Service
ts=2019-03-13T16:25:23.680+01:00 lvl=debug msg="opening service" source=srv service=*replay.Service
ts=2019-03-13T16:25:23.681+01:00 lvl=debug msg="skipping recording, metadata is already correct" service=replay recording_id=353d8417-285d-4fd9-b32f-15a82600f804
ts=2019-03-13T16:25:23.682+01:00 lvl=debug msg="skipping recording, metadata is already correct" service=replay recording_id=a8bb5c69-9f20-4f4d-8f84-109170b6f583
But I get nothing on the Express node side. The code is exactly the same as in the above-mentioned SE thread.
Any help on how to capture the stream from Kapacitor via HTTP POST? I already have a live system pushing data into the dedicated database.
I was able to get this working by shifting from stream to batch in the query above. I have documented the complete process on medium.com.
Some Files:
kapacitor.gen.conf
hostname = "my-windows-10"
data_dir = "C:\\Users\\<user>\\.kapacitor"
skip-config-overrides = true
default-retention-policy = ""
[alert]
persist-topics = true
[http]
bind-address = ":9092"
auth-enabled = false
log-enabled = true
write-tracing = false
pprof-enabled = false
https-enabled = false
https-certificate = "/etc/ssl/kapacitor.pem"
https-private-key = ""
shutdown-timeout = "10s"
shared-secret = ""
[replay]
dir = "C:\\Users\\des\\.kapacitor\\replay"
[storage]
boltdb = "C:\\Users\\des\\.kapacitor\\kapacitor.db"
[task]
dir = "C:\\Users\\des\\.kapacitor\\tasks"
snapshot-interval = "1m0s"
[load]
enabled = false
dir = "C:\\Users\\des\\.kapacitor\\load"
[[influxdb]]
enabled = true
name = "default"
default = true
urls = ["http://127.0.0.1:8086"]
username = ""
password = ""
ssl-ca = ""
ssl-cert = ""
ssl-key = ""
insecure-skip-verify = true
timeout = "0s"
disable-subscriptions = true
subscription-protocol = "http"
subscription-mode = "cluster"
kapacitor-hostname = ""
http-port = 0
udp-bind = ""
udp-buffer = 1000
udp-read-buffer = 0
startup-timeout = "5m0s"
subscriptions-sync-interval = "1m0s"
[influxdb.excluded-subscriptions]
_kapacitor = ["autogen"]
[logging]
file = "STDERR"
level = "DEBUG"
[config-override]
enabled = true
# Subsequent Section describes what this conf does
[[httppost]]
endpoint = "kap"
url = "http://127.0.0.1:30001/kapacitor"
headers = { "Content-Type" = "application/json"}
TICKscript
var data = batch
    | query('SELECT "v" FROM "telegraf_test"."autogen"."humid"')
        .period(5s)
        .every(10s)

data
    |httpPost()
        .endpoint('kap')
Define the Task
.\kapacitor.exe define batch_test -tick .\batch_test.tick -dbrp DBTEST.autogen
I suspect the hostname was the mischievous part: it was previously set to localhost, but I set it to my machine's hostname and used the IP address 127.0.0.1 wherever localhost was mentioned.
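For completeness, a minimal Express receiver matching the [[httppost]] section above (port 30001, path /kapacitor); this is only a sketch, not the exact code from the linked SE thread:

// minimal receiver for Kapacitor's httppost output (sketch)
const express = require('express');
const app = express();

app.use(express.json()); // Kapacitor posts the alert body as JSON

app.post('/kapacitor', (req, res) => {
  console.log('alert from Kapacitor:', JSON.stringify(req.body));
  res.sendStatus(200);
});

app.listen(30001, () => console.log('listening on http://127.0.0.1:30001/kapacitor'));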
