"Could not find expected ':' " error on Prometheus YAML file - oauth-2.0

I am trying to configure a Prometheus YAML file using OAuth 2.0, and I keep getting a "Could not find expected ':'" error on the line towards the end that says "scopes".
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    metrics_path: '/actuator/prometheus'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:8080"]
    oauth2:
      client_id: ''
      [ client_secret: '']
      scopes:
      [ - '']
      token_url: ''
    #tls_config:

Your YAML is invalid.
I find the Prometheus config documentation confusing to read.
In this case, you're misreading [ ... ]. It's used to denote that something is optional in the documentation. The [ and ] should not be literally included in your YAML.
Use a tool like yamllint to validate your YAML.
So, try:
global:
  scrape_interval: 15s
  evaluation_interval: 15s
alerting:
  alertmanagers:
    - static_configs:
        - targets: []
rule_files: []
scrape_configs:
  - job_name: "prometheus"
    metrics_path: "/actuator/prometheus"
    static_configs:
      - targets: ["localhost:8080"]
    oauth2:
      client_id: ""
      client_secret: ""
      scopes: []
      token_url: ""
Notes:
Per oauth2 you must have either client_secret or client_secret_file (a sketch of the file-based variant follows the list examples below).
I removed your comments for clarity, but comments (# ...) are fine.
I removed your commented-out list elements and replaced them with empty lists ([]). You will likely need to fix those; you're probably going to need at least one scope, for example.
It's good practice to be consistent in your use of " or ' to delimit strings; I've used " in the above.
There are 2 ways to represent a list in YAML: the JSON way and the YAML way. Somewhat awkwardly, I think you must use the JSON way to represent an empty list ([]). Again, it's good to be consistent and stick with one. The following are equivalent:
JSON-way:
some-list: ["a", "b", "c"]
YAML-way:
some-list:
  - "a"
  - "b"
  - "c"

Related

Cypher queries fails with Neo4jError: Unknown function 'apoc.convert.fromJsonMap' but apoc should be installed

I deployed Neo4j in my AKS cluster using the standalone Helm chart.
It all gets deployed and my Node.js server connects to Neo4j correctly.
However queries throw the Neo4jError: Unknown function 'apoc.convert.fromJsonMap' error, so apoc is clearly missing.
I followed the procedure described here https://neo4j.com/docs/operations-manual/current/kubernetes/configuration/#operations-installing-plugins and my Values are here below.
The only difference I can find is that in the guide APOC core is enabled afterwards by upgrading the Helm chart, while I'm installing the chart with the option already enabled.
Looking at https://neo4j.com/docs/apoc/current/config/ I saw
As of Neo4j v.5.0, APOC config settings are no longer supported in the neo4j.conf file. Please move all apoc.* settings to apoc.conf. It is also possible to set the config settings using environment variables.
so, since neo4j-standalone is using version 4.4.16, I moved the apoc configuration from the apoc config to the neo4j config, but the apoc procedures are still not found by the queries.
Is there something I'm missing in the configuration in order to enable apoc?
Thank you very much.
neo4j-db:
  # neo4j-standalone:
  nameOverride: "neo4j"
  fullnameOverride: 'neo4j'
  neo4j:
    # Name of your cluster
    name: "fixit-neo4j" # this will be the label: app: value for the service selector
    password: "password"
    ##
    passwordFromSecret: ""
    passwordFromSecretLookup: false
    edition: "community"
    acceptLicenseAgreement: "yes"
    offlineMaintenanceModeEnabled: false
    resources:
      cpu: "1000m"
      memory: "2Gi"
  volumes:
    data:
      mode: 'volumeClaimTemplate'
      volumeClaimTemplate:
        accessModes:
          - ReadWriteOnce
        storageClassName: neo4j-sc-data
        resources:
          requests:
            storage: 4Gi
    backups:
      mode: 'share' # share an existing volume (e.g. the data volume)
      share:
        name: 'logs'
    logs:
      mode: 'volumeClaimTemplate'
      volumeClaimTemplate:
        accessModes:
          - ReadWriteOnce
        storageClassName: neo4j-sc-logs
        resources:
          requests:
            storage: 4Gi
  services:
    # A ClusterIP service with the same name as the Helm Release name should be used for Neo4j Driver connections originating inside the
    # Kubernetes cluster.
    default:
      # Annotations for the K8s Service object
      annotations: { }
    # A LoadBalancer Service for external Neo4j driver applications and Neo4j Browser
    neo4j:
      ### this would create cluster-neo4j svc
      enabled: false
  # env:
  #   NEO4J_PLUGINS: '["graph-data-science"]'
  config:
    server.bolt.enabled: "true"
    server.bolt.tls_level: "REQUIRED"
    server.bolt.listen_address: "0.0.0.0:7687"
    dbms.ssl.policy.bolt.client_auth: "NONE"
    dbms.ssl.policy.bolt.enabled: "true"
    server.directories.plugins: "/var/lib/neo4j/labs"
    dbms.security.procedures.unrestricted: "apoc.*"
    server.config.strict_validation.enabled: "false"
    dbms.security.procedures.allowlist: "gds.*,apoc.*"
  apoc_config:
    apoc.trigger.enabled: "true"
    apoc.jdbc.neo4j.url: "jdbc:foo:bar"
    apoc.import.file.enabled: "true"
  startupProbe:
    failureThreshold: 1000
    periodSeconds: 50
  ssl:
    # setting per "connector" matching neo4j config
    bolt:
      privateKey:
        secretName: tls-secret
        subPath: tls.key
      publicCertificate:
        secretName: tls-secret
        subPath: tls.crt
      trustedCerts:
        sources: [ ]
      revokedCerts:
        sources: [ ]
OK, after looking at quite a few issues on the same subject, I found that one solution for this problem is to add dbms.directories.plugins: "/var/lib/neo4j/labs" and dbms.config.strict_validation: "false" to the config section, which, as I understand it, mirrors these settings for both the server and the dbms. It did indeed work, but it's odd that the official guide doesn't mention it.

The mirrored settings make sense (tell both the server and the dbms where to look for plugins), but they should still be documented. I see so many posts about this, which suggests the documentation isn't clear enough. It's easy to take things for granted, and because the need to mirror the plugin location for both the server AND the dbms isn't stated anywhere in the docs, I, like many others, assumed the dbms was already configured with the same location as server.directories.plugins: "/var/lib/neo4j/labs" (which the docs do say to configure) and didn't add it. But hey, nobody's perfect, I guess. I hope they update the docs for future devs' sake; meanwhile, this answer may be helpful.
So the correct configuration is
env:
  NEO4J_PLUGINS: '["graph-data-science"]'
config:
  server.bolt.enabled: 'true'
  server.bolt.tls_level: 'REQUIRED'
  server.bolt.listen_address: '0.0.0.0:7687'
  dbms.ssl.policy.bolt.client_auth: 'NONE'
  dbms.ssl.policy.bolt.enabled: 'true'
  ## apoc
  server.directories.plugins: '/var/lib/neo4j/labs'
  server.config.strict_validation.enabled: 'false'
  dbms.security.procedures.unrestricted: 'apoc.*'
  dbms.security.procedures.allowlist: 'gds.*,apoc.*'
  ### additional needed dbms config mirroring server config
  dbms.directories.plugins: "/var/lib/neo4j/labs"
  dbms.config.strict_validation: "false"
apoc_config:
  apoc.trigger.enabled: "true"
  apoc.jdbc.neo4j.url: "jdbc:foo:bar"
  apoc.import.file.enabled: "true"
It seems the docs are also missing the step of installing the APOC plugin itself. Change the following line to include APOC as well:
NEO4J_PLUGINS: '["graph-data-science", "apoc"]'
and you should be good.
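Putting the two answers together, the relevant part of the values file would look roughly like this (a sketch assembled from the snippets above, not a tested chart):
env:
  NEO4J_PLUGINS: '["graph-data-science", "apoc"]'   # install both plugins
config:
  server.directories.plugins: '/var/lib/neo4j/labs'
  server.config.strict_validation.enabled: 'false'
  dbms.directories.plugins: '/var/lib/neo4j/labs'   # mirrored dbms setting
  dbms.config.strict_validation: 'false'            # mirrored dbms setting
  dbms.security.procedures.unrestricted: 'apoc.*'
  dbms.security.procedures.allowlist: 'gds.*,apoc.*'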

cadvisor machine_cpu_cores metrics comes every 5 minutes

I am using VictoriaMetrics, cAdvisor and Docker to track our per-container CPU usage. I am running the following query against the cAdvisor metrics:
(sum(rate(container_cpu_usage_seconds_total{name != ''}[1m])) BY (instance, name) * 100) / ignoring(name) (sum by (instance) (machine_cpu_cores))
This should divide the CPU usage by machine_cpu_cores, but the metric machine_cpu_cores is only available every 5 minutes, despite the shorter collection interval in my config:
global:
  scrape_interval: 10s
scrape_configs:
  - job_name: node_exporter
    static_configs:
      - targets: ["localhost:9100"]
  - job_name: cadvisor
    static_configs:
      - targets: ["localhost:8080"]
This leads to the following break in the graph.
This metric should be available at a resolution of at least 1 minute. Has anyone else faced this issue with cAdvisor?
Could you please check what the following queries return over the same time interval:
scrape_interval(container_cpu_usage_seconds_total[5m])
scrape_interval(machine_cpu_cores[5m])
Do you have any non-default flags configured for your VictoriaMetrics process?
I was able to achieve this by using the on (instance) group_left() expression instead of the ignoring modifier:
(sum(rate(container_cpu_usage_seconds_total{name != ''}[1m])) BY (instance, name) * 100) / on(instance) group_left() (sum by (instance) (machine_cpu_cores))

how to promtail parse json to label and timestamp

I have a problem parsing a JSON log with Promtail; can somebody please help me? I have tried many configurations, but it doesn't parse the timestamp or the other labels.
log entry:
{timestamp=2019-10-25T15:25:41.041-03, level=WARN, thread=http-nio-0.0.0.0-8080-exec-2, mdc={handler=MediaController, ctxCli=127.0.0.1, ctxId=FdD3FVqBAb0}, logger=br.com.brainyit.cdn.vbox.
controller.MediaController, message=[http://localhost:8080/media/sdf],c[500],t[4],l[null], context=default}
promtail-config.yml
server:
  http_listen_port: 9080
  grpc_listen_port: 0
positions:
  filename: /tmp/positions.yaml
clients:
  - url: http://localhost:3100/loki/api/v1/push
scrape_configs:
  - job_name: vbox-main
    static_configs:
      - targets:
          - localhost
        labels:
          job: vbox
          appender: main
          __path__: /var/log/vbox/main.log
    pipeline_stages:
      - json:
          expressions:
            timestamp: timestamp
            message: message
            context: context
            level: level
        timestamp:
          source: timestamp
          format: RFC3339Nano
        labels:
          context:
          level:
        output:
          source: message
I've tried this setup of Promtail with Java Spring Boot applications (which generate logs to a file in JSON format via the Logstash Logback encoder) and it works.
An example log line generated by the application:
{"timestamp":"2020-06-06T01:00:30.840+02:00","version":1,"message":"Started ApiApplication in 1.431 seconds (JVM running for 6.824)","logger_name":"com.github.pnowy.spring.api.ApiApplication","thread_name":"main","level":"INFO","level_value":20000}
The Promtail config:
# Promtail Server Config
server:
  http_listen_port: 9080
  grpc_listen_port: 0
# Positions
positions:
  filename: /tmp/positions.yaml
clients:
  - url: http://localhost:3100/loki/api/v1/push
scrape_configs:
  - job_name: springboot
    pipeline_stages:
      - json:
          expressions:
            level: level
            message: message
            timestamp: timestamp
            logger_name: logger_name
            stack_trace: stack_trace
            thread_name: thread_name
      - labels:
          level:
      - template:
          source: new_key
          template: 'logger={{ .logger_name }} threadName={{ .thread_name }} | {{ or .message .stack_trace }}'
      - output:
          source: new_key
    static_configs:
      - targets:
          - localhost
        labels:
          job: applogs
          __path__: /Users/przemek/tools/promtail/*.log
Please notice that the output (the log text) is first built as new_key by Go templating and later set as the output source. The logger={{ .logger_name }} part helps to recognise the field as parsed in the Loki view (but how you configure it is an individual matter for your application).
Here you will find quite nice documentation about the entire process: https://grafana.com/docs/loki/latest/clients/promtail/pipelines/
The example was run on release v1.5.0 of Loki and Promtail (Update 2020-04-25: I've updated the links to the current version, 2.2, as the old links stopped working).
The section about the timestamp stage is here: https://grafana.com/docs/loki/latest/clients/promtail/stages/timestamp/ with examples. I've tested it and didn't notice any problem either. Hope that helps a little bit.
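If you also want Loki to use the application's own timestamp instead of the scrape time, a minimal sketch of that stage, added to the pipeline_stages above, could look like this (RFC3339Nano is one of the predefined format names; adjust it to whatever your timestamp field actually contains):
      - timestamp:
          source: timestamp     # the field extracted by the json stage
          format: RFC3339Nano   # or a Go reference-time layout matching your logs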
The JSON configuration part: https://grafana.com/docs/loki/latest/clients/promtail/stages/json/
Result on Loki:

How to configurate http_post_2xx in blackbox exporter?

I am new to Prometheus, and I have been trying to set up the blackbox exporter to monitor my server with the http_post_2xx module rather than the http_2xx module. However, despite a lot of research on the internet, I still cannot figure it out.
Here is the background of my situation: I used to monitor whether my website was available with Postman. After sending a POST request, I would manually check whether the status was 200 OK or whether there was no response at all. This is inefficient and irresponsible, as I should not be hearing about an error from my website's visitors rather than noticing it myself. Therefore, I turned to Prometheus.
The blackbox exporter seems like my solution. I run the blackbox exporter on my server, and its configuration file looks like this:
modules:
  http_post_2xx:
    prober: http
    timeout: 5s
    http:
      method: POST
      headers:
        Content-Type: application/json
      body: '{text: "hi"}'
I configure the prometheus.yml in this way:
- job_name: 'blackbox'
  metrics_path: /probe
  params:
    module: [http_post_2xx]
  static_configs:
    - targets:
        - 10.0.100.130:2001
        - 10.0.100.130:2002 # The IP addresses I want to monitor
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: 10.0.100.130:9115 # The blackbox exporter's address for serving the probe metrics
The dashboard I'm using is 5345, but the HTTP Status Code panel shows N/A or No, while the status from Postman is 200 OK. I do not know why. Is there anything wrong with my configuration?

Prometheus AlertManager - Send Alerts to different clients based on routes

I have 2 services, A and B, which I want to monitor. I also have 2 different notification channels, X and Y, in the form of receivers in the AlertManager config file.
I want to notify X if service A goes down and notify Y if service B goes down. How can I achieve this with my configuration?
My AlertManager YAML file is:
route:
  receiver: X
receivers:
  - name: X
    email_configs:
  - name: Y
    email_configs:
And the alert.rule file is:
groups:
  - name: A
    rules:
      - alert: A_down
        expr: expression
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "A is down"
  - name: B
    rules:
      - alert: B_down
        expr: expression
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "B is down"
The config should roughly look like this (not tested):
route:
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 2h
  receiver: 'default-receiver'
  routes:
    - match:
        alertname: A_down
      receiver: X
    - match:
        alertname: B_down
      receiver: Y
The idea is that each route can itself have a routes field, where you put a different configuration that takes effect when the alert's labels satisfy the match condition.
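Since your rules already attach different severity labels (critical for A, warning for B), a route matching on that label would be an equally valid sketch (again, not tested):
route:
  receiver: X
  routes:
    - match:
        severity: critical   # A_down carries severity: critical
      receiver: X
    - match:
        severity: warning    # B_down carries severity: warning
      receiver: Y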
To clarify, the general flow for handling an alert in Prometheus (the Alertmanager and Prometheus integration) is like this:
SomeErrorHappenInYourConfiguredRule (Rule) -> RouteToDestination (Route)
-> TriggeringAnEvent (Receiver) -> GetAMessageInSlack/PagerDuty/Mail/etc...
For example:
if my AWS machine cluster production-a1 is down, I want to trigger an event sending a PagerDuty page and a Slack message to my team with the relevant error.
There are 3 files that matter when configuring alerts on your Prometheus system:
alertmanager.yml - configuration of your routes (catching the triggered errors) and receivers (how to handle these errors)
rules.yml - these rules contain all the thresholds and rules you'll define in your system
prometheus.yml - global configuration that ties your rule files and Alertmanager (the two above) together
I'm attaching a dummy example to demonstrate the idea; in this example I'll watch for overload on my machine (using the node exporter installed on it):
On /var/data/prometheus-stack/alertmanager/alertmanager.yml
global:
  # The smarthost and SMTP sender used for mail notifications.
  smtp_smarthost: 'localhost:25'
  smtp_from: 'JohnDoe#gmail.com'
route:
  receiver: defaultTrigger
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 6h
  routes:
    - match_re:
        service: service_overload
        owner: ATeam
      receiver: pagerDutyTrigger
receivers:
  - name: 'pagerDutyTrigger'
    pagerduty_configs:
      - send_resolved: true
        routing_key: <myPagerDutyToken>
Add some rules in /var/data/prometheus-stack/prometheus/yourRuleFile.yml:
groups:
  - name: alerts
    rules:
      - alert: service_overload_more_than_5000
        expr: (node_network_receive_bytes_total{job="someJobOrService"} / 1000) >= 5000
        for: 10m
        labels:
          service: service_overload
          severity: pager
          dev_team: myteam
        annotations:
          dev_team: myteam
          priority: Blocker
          identifier: '{{ $labels.name }}'
          description: 'service overflow'
          value: '{{ humanize $value }}%'
On /var/data/prometheus-stack/prometheus/prometheus.yml add this snippet to integrate alertmanager:
global:
  ...
alerting:
  alertmanagers:
    - scheme: http
      static_configs:
        - targets:
            - "alertmanager:9093"
rule_files:
  - "yourRuleFile.yml"
...
Pay attention that the key point of this example is the service_overload label, which connects and binds the rule to the right receiver.
Reload the config (restart the service, or stop and start your Docker containers) and test it. If everything is configured correctly, you can watch the alerts at http://your-prometheus-url:9090/alerts
