Upgrading sssd on RHEL 8 breaks PAM auth in Docker container - docker

I am having a problem with PAM authentication in a Docker container (used for authentication to RStudio Server). /var/lib/sss is mounted in the container, so PAM authentication works. But with sssd-2.7.3-4.el8_7.3 it no longer works; the log below is from /var/log/sssd/sssd_pam.log. As a result I have had to version-lock sssd with 'yum versionlock add sssd-0:2.6.2-4.el8_6.1.*', which is not good practice. Does anyone know what could be wrong?
* (2023-02-08 9:24:58): [pam] [get_client_cred] (0x4000): Client [0x55d1b39ddf20][24] creds: euid[0] egid[0] pid[673277] cmd_line['/usr/lib/rstudio-server/bin/rserver-pam'].
* (2023-02-08 9:24:58): [pam] [setup_client_idle_timer] (0x4000): Idle timer re-set for client [0x55d1b39ddf20][24]
* (2023-02-08 9:24:58): [pam] [accept_fd_handler] (0x0400): [CID#1] Client [cmd /usr/lib/rstudio-server/bin/rserver-pam][uid 0][0x55d1b39ddf20][24] connected to privileged pipe!
* (2023-02-08 9:24:58): [pam] [sss_cmd_get_version] (0x0200): [CID#1] Received client version [3].
* (2023-02-08 9:24:58): [pam] [sss_cmd_get_version] (0x0200): [CID#1] Offered version [3].
* (2023-02-08 9:24:58): [pam] [pam_cmd_authenticate] (0x0100): [CID#1] entering pam_cmd_authenticate
* (2023-02-08 9:24:58): [pam] [sss_domain_get_state] (0x1000): [CID#1] Domain mydomain.com is Active
* (2023-02-08 9:24:58): [pam] [sss_parse_name] (0x0100): [CID#1] Domain not provided!
* (2023-02-08 9:24:58): [pam] [sss_parse_name_for_domains] (0x0200): [CID#1] name 'admin-jnk' matched without domain, user is admin-jnk
* (2023-02-08 9:24:58): [pam] [pam_print_data] (0x0100): [CID#1] command: SSS_PAM_AUTHENTICATE
* (2023-02-08 9:24:58): [pam] [pam_print_data] (0x0100): [CID#1] domain: not set
* (2023-02-08 9:24:58): [pam] [pam_print_data] (0x0100): [CID#1] user: admin-jnk
* (2023-02-08 9:24:58): [pam] [pam_print_data] (0x0100): [CID#1] service: rstudio
* (2023-02-08 9:24:58): [pam] [pam_print_data] (0x0100): [CID#1] tty: not set
* (2023-02-08 9:24:58): [pam] [pam_print_data] (0x0100): [CID#1] ruser: not set
* (2023-02-08 9:24:58): [pam] [pam_print_data] (0x0100): [CID#1] rhost: not set
* (2023-02-08 9:24:58): [pam] [pam_print_data] (0x0100): [CID#1] authtok type: 1 (Password)
* (2023-02-08 9:24:58): [pam] [pam_print_data] (0x0100): [CID#1] newauthtok type: 0 (No authentication token available)
* (2023-02-08 9:24:58): [pam] [pam_print_data] (0x0100): [CID#1] priv: 1
* (2023-02-08 9:24:58): [pam] [pam_print_data] (0x0100): [CID#1] cli_pid: 3667
* (2023-02-08 9:24:58): [pam] [pam_print_data] (0x0100): [CID#1] child_pid: 0
* (2023-02-08 9:24:58): [pam] [pam_print_data] (0x0100): [CID#1] logon name: admin-jnk
* (2023-02-08 9:24:58): [pam] [pam_print_data] (0x0100): [CID#1] flags: 0
* (2023-02-08 9:24:58): [pam] [cache_req_set_plugin] (0x2000): [CID#1] CR #0: Setting "Initgroups by name" plugin
* (2023-02-08 9:24:58): [pam] [cache_req_send] (0x0400): [CID#1] CR #0: REQ_TRACE: New request [CID #1] 'Initgroups by name'
* (2023-02-08 9:24:58): [pam] [cache_req_process_input] (0x0400): [CID#1] CR #0: Parsing input name [admin-jnk]
* (2023-02-08 9:24:58): [pam] [sss_domain_get_state] (0x1000): [CID#1] Domain mydomain.com is Active
* (2023-02-08 9:24:58): [pam] [sss_parse_name] (0x0100): [CID#1] Domain not provided!
* (2023-02-08 9:24:58): [pam] [sss_parse_name_for_domains] (0x0200): [CID#1] name 'admin-jnk' matched without domain, user is admin-jnk
* (2023-02-08 9:24:58): [pam] [cache_req_set_name] (0x0400): [CID#1] CR #0: Setting name [admin-jnk]
* (2023-02-08 9:24:58): [pam] [cache_req_domain_copy_cr_domains] (0x0040): [CID#1] No requested domains found, please check configuration options for typos.
My /etc/sssd/sssd.conf:
[sssd]
domains = mydomain.com
config_file_version = 2
services = nss, pam, autofs
[domain/mydomain.com]
ad_domain = mydomain.com
krb5_realm = MYDOMAIN.COM
realmd_tags = manages-system joined-with-adcli
cache_credentials = True
id_provider = ad
krb5_store_password_if_offline = True
default_shell = /bin/bash
ldap_id_mapping = True
use_fully_qualified_names = False
fallback_homedir = /mydomain/bruker/%u
access_provider = simple
simple_allow_groups = RBAG_Linux@mydomain.com
I tried changing sssd.conf, but nothing helped.
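For context, the mount setup being described is roughly the following (a minimal sketch; only the /var/lib/sss bind mount is from the original setup, the image name and other flags are placeholders):

# Sketch of running the RStudio Server container with the host's SSSD state
# mounted in, so pam_sss inside the container can talk to the host's sssd.
docker run -d \
  -v /var/lib/sss:/var/lib/sss \
  my-rstudio-server-image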

I have this exact same issue.
I upgraded Red Hat 8.6 (sssd 2.6.2-4) to Red Hat 8.7 (sssd 2.7.3-4) on the host, and once I do, Ubuntu 20.04-based containers can no longer authenticate via PAM (i.e. su and sudo fail).
The log is similar to what you posted.

Related

Traefik IngressRouteTCP can't handle HTTP2

I have an IngressRouteTCP configured for Traefik running in an AKS cluster (behind an Azure load balancer). I'm trying to route based on SNI rather than on the Host header. The certificate is generated by Cloudflare for test.example.com.
As you can see below, it doesn't work. What does work is setting a TLSOption with alpnProtocols set to http/1.1, but then everything is limited to HTTP/1.1, as I understand it. My application supports HTTP/2, so I'd prefer to use that.
I'm not sure why it fails. Is it Traefik, curl, or the application?
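For reference, the TLSOption workaround mentioned above would look roughly like this (a sketch; the option name force-http11 is made up, and it would be referenced from the IngressRouteTCP's tls.options):

apiVersion: traefik.containo.us/v1alpha1
kind: TLSOption
metadata:
  name: force-http11
spec:
  # restrict ALPN negotiation to HTTP/1.1 only
  alpnProtocols:
    - http/1.1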
Testing it with curl -svk --connect-to test.example.com:443:my-azure-load-balancer.cloudapp.azure.com:443 https://test.example.com gives this:
* Connecting to hostname: my-azure-load-balancer.cloudapp.azure.com
* Connecting to port: 443
* Trying x.x.x.x:443...
* TCP_NODELAY set
* Connected to my-azure-load-balancer.cloudapp.azure.com (x.x.x.x) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN, server accepted to use h2
* Server certificate:
* subject: O=CloudFlare, Inc.; OU=CloudFlare Origin CA; CN=CloudFlare Origin Certificate
* start date: Sep 20 12:40:00 2022 GMT
* expire date: Sep 16 12:40:00 2037 GMT
* issuer: C=US; O=CloudFlare, Inc.; OU=CloudFlare Origin SSL Certificate Authority; L=San Francisco; ST=California
* SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x558d30129860)
> GET / HTTP/2
> Host: test.example.com
> user-agent: curl/7.68.0
> accept: */*
>
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* http2 error: Remote peer returned unexpected data while we expected SETTINGS frame. Perhaps, peer does not support HTTP/2 properly.
* Connection #0 to host my-azure-load-balancer.cloudapp.azure.com left intact
I'm using Traefik version 2.9.1.
This is my Kubernetes configuration:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: whoami
spec:
  replicas: 2
  selector:
    matchLabels:
      app: whoami
  template:
    metadata:
      labels:
        app: whoami
    spec:
      containers:
        - name: whoami
          image: traefik/whoami:v1.6.0
          imagePullPolicy: IfNotPresent
---
apiVersion: v1
kind: Service
metadata:
  name: whoami
  labels:
    app: whoami
spec:
  type: ClusterIP
  ports:
    - port: 80
      name: whoami
  selector:
    app: whoami
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRouteTCP
metadata:
  name: whoami
spec:
  entryPoints:
    - websecure
  routes:
    - match: HostSNI(`test.example.com`)
      services:
        - name: whoami
          port: 80
  tls:
    secretName: cloudflare-cert
These are the only related lines I can find in the logs (the second line doesn't always come through):
traefik-c757597b9-2xv65 time="2022-10-31T08:52:59Z" level=debug msg="Handling connection from 10.9.3.227:61988 to 10.9.3.73:80"
traefik-c757597b9-2xv65 time="2022-10-31T08:52:59Z" level=debug msg="Error during connection: read tcp 10.9.3.58:34400->10.9.3.73:80: read: connection reset by peer"
Those IPs are:
10.9.3.227 - The Kubernetes node
10.9.3.58 - Traefik
10.9.3.73 - The service for the whoami pod
I initially tested this behind Cloudflare, and received this:

Getting "State of the connection with the Jaeger Collector backend.." (jaeger/TRANSIENT_FAILURE) while running OpenTelemetry Collector

I am trying to build a simple application that sends traces to the OpenTelemetry Collector, which exports the traces to a Jaeger backend.
But when I spin up the collector and the Jaeger backend, I get the following message:
info jaegerexporter/exporter.go:186 State of the connection with the Jaeger Collector backend {"kind": "exporter", "name": "jaeger", "state": "TRANSIENT_FAILURE"}
When I run the Go application, I see no traces in the Jaeger UI. There are also no logs from the collector in the shell.
main.go
package main

import (
    "context"
    "fmt"
    "time"

    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
    sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func initialize() {
    traceExp, err := otlptracehttp.New(
        context.TODO(),
        otlptracehttp.WithEndpoint("0.0.0.0:55680"),
        otlptracehttp.WithInsecure(),
    )
    if err != nil {
        fmt.Println(err)
    }
    bsp := sdktrace.NewBatchSpanProcessor(traceExp)
    tracerProvider := sdktrace.NewTracerProvider(
        sdktrace.WithSpanProcessor(bsp),
    )
    otel.SetTracerProvider(tracerProvider)
}

func main() {
    initialize()
    tracer := otel.Tracer("demo-client-tracer")
    ctx, span := tracer.Start(context.TODO(), "span-name")
    defer span.End()
    time.Sleep(time.Second)
    fmt.Println(ctx)
}
Following are the collector config and docker-compose file.
otel-collector-config
receivers:
  otlp:
    protocols:
      http:
processors:
  batch:
exporters:
  jaeger:
    endpoint: "http://jaeger-all-in-one:14250"
    insecure: true
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [jaeger]
docker-compose.yaml
version: "2"
services:
# Jaeger
jaeger-all-in-one:
image: jaegertracing/all-in-one:latest
ports:
- "16686:16686"
- "14268"
- "14250:14250"
# Collector
otel-collector:
image: otel/opentelemetry-collector:latest
command: ["--config=/etc/otel-collector-config.yaml"]
volumes:
- ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
ports:
- "4317"
- "55680:55680"
depends_on:
- jaeger-all-in-one
Additional logs while running docker-compose up:
Starting open-telemetry-collector-2_jaeger-all-in-one_1 ... done
Starting open-telemetry-collector-2_otel-collector_1 ... done
Attaching to open-telemetry-collector-2_jaeger-all-in-one_1, open-telemetry-collector-2_otel-collector_1
jaeger-all-in-one_1 | 2021/09/02 09:26:58 maxprocs: Leaving GOMAXPROCS=8: CPU quota undefined
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0155272,"caller":"flags/service.go:117","msg":"Mounting metrics handler on admin server","route":"/metrics"}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.015579,"caller":"flags/service.go:123","msg":"Mounting expvar handler on admin server","route":"/debug/vars"}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.016236,"caller":"flags/admin.go:106","msg":"Mounting health check on admin server","route":"/"}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0163133,"caller":"flags/admin.go:117","msg":"Starting admin HTTP server","http-addr":":14269"}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0163486,"caller":"flags/admin.go:98","msg":"Admin server started","http.host-port":"[::]:14269","health-status":"unavailable"}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.017912,"caller":"memory/factory.go:61","msg":"Memory storage initialized","configuration":{"MaxTraces":0}}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.018202,"caller":"static/strategy_store.go:138","msg":"Loading sampling strategies","filename":"/etc/jaeger/sampling_strategies.json"}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0273001,"caller":"server/grpc.go:82","msg":"Starting jaeger-collector gRPC server","grpc.host-port":":14250"}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0273921,"caller":"server/http.go:48","msg":"Starting jaeger-collector HTTP server","http host-port":":14268"}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0276191,"caller":"server/zipkin.go:49","msg":"Not listening for Zipkin HTTP traffic, port not configured"}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0276558,"caller":"grpc/builder.go:70","msg":"Agent requested insecure grpc connection to collector(s)"}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0276873,"caller":"channelz/logging.go:50","msg":"[core]parsed scheme: \"\"","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0277174,"caller":"channelz/logging.go:50","msg":"[core]scheme \"\" not registered, fallback to default scheme","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0277457,"caller":"channelz/logging.go:50","msg":"[core]ccResolverWrapper: sending update to cc: {[{:14250 <nil> 0 <nil>}] <nil> <nil>}","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0277772,"caller":"channelz/logging.go:50","msg":"[core]ClientConn switching balancer to \"round_robin\"","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0277963,"caller":"channelz/logging.go:50","msg":"[core]Channel switches to new LB policy \"round_robin\"","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0278597,"caller":"grpclog/component.go:55","msg":"[balancer]base.baseBalancer: got new ClientConn state: {{[{:14250 <nil> 0 <nil>}] <nil> <nil>} <nil>}","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0279217,"caller":"channelz/logging.go:50","msg":"[core]Subchannel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.028044,"caller":"channelz/logging.go:50","msg":"[core]Subchannel picks a new address \":14250\" to connect","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0284538,"caller":"grpclog/component.go:71","msg":"[balancer]base.baseBalancer: handle SubConn state change: 0xc000688840, CONNECTING","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.028513,"caller":"channelz/logging.go:50","msg":"[core]Channel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0280442,"caller":"grpc/builder.go:109","msg":"Checking connection to collector"}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.028587,"caller":"grpc/builder.go:120","msg":"Agent collector connection state change","dialTarget":":14250","status":"CONNECTING"}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0294988,"caller":"channelz/logging.go:50","msg":"[core]Subchannel Connectivity change to READY","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.029561,"caller":"grpclog/component.go:71","msg":"[balancer]base.baseBalancer: handle SubConn state change: 0xc000688840, READY","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0296533,"caller":"grpclog/component.go:71","msg":"[roundrobin]roundrobinPicker: newPicker called with info: {map[0xc000688840:{{:14250 <nil> 0 <nil>}}]}","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0297205,"caller":"channelz/logging.go:50","msg":"[core]Channel Connectivity change to READY","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0297425,"caller":"grpc/builder.go:120","msg":"Agent collector connection state change","dialTarget":":14250","status":"READY"}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0298278,"caller":"./main.go:233","msg":"Starting agent"}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0298927,"caller":"querysvc/query_service.go:137","msg":"Archive storage not created","reason":"archive storage not supported"}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0299237,"caller":"app/flags.go:124","msg":"Archive storage not initialized"}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0300004,"caller":"app/agent.go:69","msg":"Starting jaeger-agent HTTP server","http-port":5778}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0303733,"caller":"channelz/logging.go:50","msg":"[core]parsed scheme: \"\"","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0304158,"caller":"channelz/logging.go:50","msg":"[core]scheme \"\" not registered, fallback to default scheme","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0304341,"caller":"channelz/logging.go:50","msg":"[core]ccResolverWrapper: sending update to cc: {[{:16685 <nil> 0 <nil>}] <nil> <nil>}","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0304427,"caller":"channelz/logging.go:50","msg":"[core]ClientConn switching balancer to \"pick_first\"","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0304537,"caller":"channelz/logging.go:50","msg":"[core]Channel switches to new LB policy \"pick_first\"","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0305033,"caller":"channelz/logging.go:50","msg":"[core]Subchannel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0305545,"caller":"channelz/logging.go:50","msg":"[core]Subchannel picks a new address \":16685\" to connect","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"warn","ts":1630574818.0307937,"caller":"channelz/logging.go:75","msg":"[core]grpc: addrConn.createTransport failed to connect to {:16685 localhost:16685 <nil> 0 <nil>}. Err: connection error: desc = \"transport: Error while dialing dial tcp :16685: connect: connection refused\". Reconnecting...","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.030827,"caller":"channelz/logging.go:50","msg":"[core]Subchannel Connectivity change to TRANSIENT_FAILURE","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0308597,"caller":"grpclog/component.go:71","msg":"[core]pickfirstBalancer: UpdateSubConnState: 0xc00061fd40, {CONNECTING <nil>}","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0308924,"caller":"channelz/logging.go:50","msg":"[core]Channel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0309658,"caller":"grpclog/component.go:71","msg":"[core]pickfirstBalancer: UpdateSubConnState: 0xc00061fd40, {TRANSIENT_FAILURE connection error: desc = \"transport: Error while dialing dial tcp :16685: connect: connection refused\"}","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0309868,"caller":"channelz/logging.go:50","msg":"[core]Channel Connectivity change to TRANSIENT_FAILURE","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0314078,"caller":"app/static_handler.go:181","msg":"UI config path not provided, config file will not be watched"}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0315406,"caller":"app/server.go:197","msg":"Query server started"}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0315752,"caller":"healthcheck/handler.go:129","msg":"Health Check state change","status":"ready"}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0315914,"caller":"app/server.go:276","msg":"Starting GRPC server","port":16685,"addr":":16685"}
jaeger-all-in-one_1 | {"level":"info","ts":1630574818.0316222,"caller":"app/server.go:257","msg":"Starting HTTP server","port":16686,"addr":":16686"}
jaeger-all-in-one_1 | {"level":"info","ts":1630574819.031331,"caller":"channelz/logging.go:50","msg":"[core]Subchannel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574819.0314019,"caller":"channelz/logging.go:50","msg":"[core]Subchannel picks a new address \":16685\" to connect","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574819.0315094,"caller":"grpclog/component.go:71","msg":"[core]pickfirstBalancer: UpdateSubConnState: 0xc00061fd40, {CONNECTING <nil>}","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574819.0315537,"caller":"channelz/logging.go:50","msg":"[core]Channel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574819.0323153,"caller":"channelz/logging.go:50","msg":"[core]Subchannel Connectivity change to READY","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574819.0325227,"caller":"grpclog/component.go:71","msg":"[core]pickfirstBalancer: UpdateSubConnState: 0xc00061fd40, {READY <nil>}","system":"grpc","grpc_log":true}
jaeger-all-in-one_1 | {"level":"info","ts":1630574819.0325499,"caller":"channelz/logging.go:50","msg":"[core]Channel Connectivity change to READY","system":"grpc","grpc_log":true}
otel-collector_1 | 2021-09-02T09:26:59.628Z info service/collector.go:303 Starting otelcol... {"Version": "v0.33.0", "NumCPU": 8}
otel-collector_1 | 2021-09-02T09:26:59.628Z info service/collector.go:242 Loading configuration...
otel-collector_1 | 2021-09-02T09:26:59.630Z info service/collector.go:258 Applying configuration...
otel-collector_1 | 2021-09-02T09:26:59.630Z info builder/exporters_builder.go:264 Exporter was built. {"kind": "exporter", "name": "jaeger"}
otel-collector_1 | 2021-09-02T09:26:59.630Z info builder/pipelines_builder.go:214 Pipeline was built. {"pipeline_name": "traces", "pipeline_datatype": "traces"}
otel-collector_1 | 2021-09-02T09:26:59.630Z info builder/receivers_builder.go:227 Receiver was built. {"kind": "receiver", "name": "otlp", "datatype": "traces"}
otel-collector_1 | 2021-09-02T09:26:59.630Z info service/service.go:143 Starting extensions...
otel-collector_1 | 2021-09-02T09:26:59.630Z info service/service.go:188 Starting exporters...
otel-collector_1 | 2021-09-02T09:26:59.630Z info builder/exporters_builder.go:93 Exporter is starting... {"kind": "exporter", "name": "jaeger"}
otel-collector_1 | 2021-09-02T09:26:59.630Z info jaegerexporter/exporter.go:186 State of the connection with the Jaeger Collector backend {"kind": "exporter", "name": "jaeger", "state": "CONNECTING"}
otel-collector_1 | 2021-09-02T09:26:59.630Z info builder/exporters_builder.go:98 Exporter started. {"kind": "exporter", "name": "jaeger"}
otel-collector_1 | 2021-09-02T09:26:59.630Z info service/service.go:193 Starting processors...
otel-collector_1 | 2021-09-02T09:26:59.630Z info builder/pipelines_builder.go:52 Pipeline is starting... {"pipeline_name": "traces", "pipeline_datatype": "traces"}
otel-collector_1 | 2021-09-02T09:26:59.630Z info builder/pipelines_builder.go:63 Pipeline is started. {"pipeline_name": "traces", "pipeline_datatype": "traces"}
otel-collector_1 | 2021-09-02T09:26:59.630Z info service/service.go:198 Starting receivers...
otel-collector_1 | 2021-09-02T09:26:59.630Z info builder/receivers_builder.go:71 Receiver is starting... {"kind": "receiver", "name": "otlp"}
otel-collector_1 | 2021-09-02T09:26:59.630Z info otlpreceiver/otlp.go:93 Starting HTTP server on endpoint 0.0.0.0:4318 {"kind": "receiver", "name": "otlp"}
otel-collector_1 | 2021-09-02T09:26:59.630Z info otlpreceiver/otlp.go:159 Setting up a second HTTP listener on legacy endpoint 0.0.0.0:55681 {"kind": "receiver", "name": "otlp"}
otel-collector_1 | 2021-09-02T09:26:59.631Z info otlpreceiver/otlp.go:93 Starting HTTP server on endpoint 0.0.0.0:55681 {"kind": "receiver", "name": "otlp"}
otel-collector_1 | 2021-09-02T09:26:59.631Z info builder/receivers_builder.go:76 Receiver started. {"kind": "receiver", "name": "otlp"}
otel-collector_1 | 2021-09-02T09:26:59.631Z info service/collector.go:206 Setting up own telemetry...
otel-collector_1 | 2021-09-02T09:26:59.631Z info service/telemetry.go:99 Serving Prometheus metrics {"address": ":8888", "level": 0, "service.instance.id": "0fe56a33-e40e-4251-9a82-100fa600c4a0"}
otel-collector_1 | 2021-09-02T09:26:59.631Z info service/collector.go:218 Everything is ready. Begin running and processing data.
otel-collector_1 | 2021-09-02T09:27:00.631Z info jaegerexporter/exporter.go:186 State of the connection with the Jaeger Collector backend {"kind": "exporter", "name": "jaeger", "state": "TRANSIENT_FAILURE"}
Thanks!
Updating otel-collector-config.yaml to the following endpoint should work:
receivers:
  otlp:
    protocols:
      http:
processors:
  batch:
exporters:
  jaeger:
    endpoint: jaeger-all-in-one:14250
    insecure: true
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [jaeger]
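Separately, the collector logs above show the OTLP HTTP receiver listening on 0.0.0.0:4318 (plus the legacy 0.0.0.0:55681), while main.go points the HTTP exporter at 0.0.0.0:55680, so the exporter endpoint may also need adjusting. A rough sketch of aligning the two, assuming 4318 is also published in docker-compose:

traceExp, err := otlptracehttp.New(
    context.TODO(),
    // Point at a port the collector actually listens on (4318 per its logs);
    // this assumes "4318:4318" is added to the otel-collector ports in docker-compose.
    otlptracehttp.WithEndpoint("localhost:4318"),
    otlptracehttp.WithInsecure(),
)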

Traefik cannot fetch ACME certificate with Route 53

I'm having a little trouble configuring Traefik and ACME certs with AWS Route 53. I tried both the HTTP and the DNS challenge, to no avail. It keeps failing with this error: acme: error presenting token: route53: failed to determine hosted zone ID: NoCredentialProviders: no valid providers in chain
What am I doing wrong here? Thanks in advance.
httpChallenge error (note: there is no firewall enabled):
app_1 | time="2019-02-20T21:49:52Z" level=debug msg="Using HTTP Challenge provider."
app_1 | time="2019-02-20T21:50:04Z" level=error msg="Unable to obtain ACME certificate for domains \"monitor.example.net\" detected thanks to rule \"Host:monitor.example.net\" : unable to generate a certificate for the domains [monitor.example.net]: acme: Error -> One or more domains had a problem:\n[monitor.example.net] acme: error: 400 :: urn:ietf:params:acme:error:connection :: Fetching http://monitor.example.net/.well-known/acme-challenge/AwJq4WU0OKN943nyHW6e3jzirdsWw6QAeE-CXD7QRhQ: Timeout during connect (likely firewall problem), url: \n"
dnsChallenge error:
app_1 | time="2019-02-20T21:18:26Z" level=debug msg="Try to challenge certificate for domain [monitor.example.net] founded in Host rule"
app_1 | time="2019-02-20T21:18:26Z" level=debug msg="Looking for provided certificate(s) to validate [\"monitor.example.net\"]..."
app_1 | time="2019-02-20T21:18:26Z" level=debug msg="Domains [\"monitor.example.net\"] need ACME certificates generation for domains \"monitor.example.net\"."
app_1 | time="2019-02-20T21:18:26Z" level=debug msg="Loading ACME certificates [monitor.example.net]..."
app_1 | time="2019-02-20T21:18:26Z" level=debug msg="Building ACME client..."
app_1 | time="2019-02-20T21:18:26Z" level=debug msg="https://acme-v02.api.letsencrypt.org/directory"
app_1 | time="2019-02-20T21:18:26Z" level=debug msg="Using DNS Challenge provider: route53"
app_1 | time="2019-02-20T21:18:27Z" level=error msg="Unable to obtain ACME certificate for domains \"monitor.example.net\" detected thanks to rule \"Host:monitor.example.net\" : unable to generate a certificate for the domains [monitor.example.net]: acme: Error -> One or more domains had a problem:\n[monitor.example.net] [monitor.example.net] acme: error presenting token: route53: failed to determine hosted zone ID: NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\n"
Attached docker-compose.yml
version: '3'
services:
  app:
    image: traefik:alpine
    restart: always
    ports:
      - 80:80
      - 443:443
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./traefik.toml:/traefik.toml
      - ./acme.json:/acme.json
    labels:
      - traefik.frontend.rule=Host:monitor.example.net
      - traefik.port=8080
    networks:
      - web
networks:
  web:
    external: true
Attached traefik.toml
logLevel = "DEBUG"
defaultEntryPoints = ["http", "https"]
[entryPoints]
[entryPoints.dashboard]
address = ":8080"
[entryPoints.dashboard.auth]
[entryPoints.dashboard.auth.basic]
users = ["admin:foobar"]
[entryPoints.http]
address = ":80"
# [entryPoints.http.redirect]
# entryPoint = "https"
[entryPoints.https]
address = ":443"
[entryPoints.https.tls]
[api]
entrypoint="dashboard"
[acme]
email = "donotspam#me.com"
storage = "acme.json"
entryPoint = "https"
onHostRule = true
# [acme.httpChallenge] #<--tried both httpChallenge and dnsChallenge
# entryPoint = "http"
[acme.dnsChallenge]
provider = "route53"
delayBeforeCheck = 0
[docker]
domain = "example.net"
watch = true
network = "web"
The HTTP challenge requires that port 80 be accessible on the Internet.
For the DNS challenge you need to define the credentials:
AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, [AWS_REGION], [AWS_HOSTED_ZONE_ID] or a configured user/instance IAM profile.
https://docs.traefik.io/configuration/acme/#provider
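A minimal sketch of wiring those credentials into the compose service above (variable names from the list above; all values are placeholders):

services:
  app:
    image: traefik:alpine
    environment:
      - AWS_ACCESS_KEY_ID=AKIAXXXXXXXXXXXXXXXX
      - AWS_SECRET_ACCESS_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
      - AWS_REGION=us-east-1
      - AWS_HOSTED_ZONE_ID=ZXXXXXXXXXXXXX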

Consul Empty reply from server

I'm trying to get a Consul server cluster up and running. I have three dockerized Consul servers running, but I can't access the web UI, the HTTP API, or DNS.
$ docker logs net-sci_discovery-service_consul_1
==> WARNING: Expect Mode enabled, expecting 3 servers
==> Starting Consul agent...
==> Consul agent running!
Version: 'v0.8.5'
Node ID: 'ccd38897-6047-f8b6-be1c-2aa0022a1483'
Node name: 'consul1'
Datacenter: 'dc1'
Server: true (bootstrap: false)
Client Addr: 127.0.0.1 (HTTP: 8500, HTTPS: -1, DNS: 8600)
Cluster Addr: 172.20.0.2 (LAN: 8301, WAN: 8302)
Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
==> Log data will now stream in as it occurs:
2017/07/07 23:24:07 [INFO] raft: Initial configuration (index=0): []
2017/07/07 23:24:07 [INFO] raft: Node at 172.20.0.2:8300 [Follower] entering Follower state (Leader: "")
2017/07/07 23:24:07 [INFO] serf: EventMemberJoin: consul1 172.20.0.2
2017/07/07 23:24:07 [INFO] consul: Adding LAN server consul1 (Addr: tcp/172.20.0.2:8300) (DC: dc1)
2017/07/07 23:24:07 [INFO] serf: EventMemberJoin: consul1.dc1 172.20.0.2
2017/07/07 23:24:07 [INFO] consul: Handled member-join event for server "consul1.dc1" in area "wan"
2017/07/07 23:24:07 [INFO] agent: Started DNS server 127.0.0.1:8600 (tcp)
2017/07/07 23:24:07 [INFO] agent: Started DNS server 127.0.0.1:8600 (udp)
2017/07/07 23:24:07 [INFO] agent: Started HTTP server on 127.0.0.1:8500
2017/07/07 23:24:09 [INFO] serf: EventMemberJoin: consul2 172.20.0.3
2017/07/07 23:24:09 [INFO] consul: Adding LAN server consul2 (Addr: tcp/172.20.0.3:8300) (DC: dc1)
2017/07/07 23:24:09 [INFO] serf: EventMemberJoin: consul2.dc1 172.20.0.3
2017/07/07 23:24:09 [INFO] consul: Handled member-join event for server "consul2.dc1" in area "wan"
2017/07/07 23:24:10 [INFO] serf: EventMemberJoin: consul3 172.20.0.4
2017/07/07 23:24:10 [INFO] consul: Adding LAN server consul3 (Addr: tcp/172.20.0.4:8300) (DC: dc1)
2017/07/07 23:24:10 [INFO] consul: Found expected number of peers, attempting bootstrap: 172.20.0.2:8300,172.20.0.3:8300,172.20.0.4:8300
2017/07/07 23:24:10 [INFO] serf: EventMemberJoin: consul3.dc1 172.20.0.4
2017/07/07 23:24:10 [INFO] consul: Handled member-join event for server "consul3.dc1" in area "wan"
2017/07/07 23:24:14 [ERR] agent: failed to sync remote state: No cluster leader
2017/07/07 23:24:17 [WARN] raft: Heartbeat timeout from "" reached, starting election
2017/07/07 23:24:17 [INFO] raft: Node at 172.20.0.2:8300 [Candidate] entering Candidate state in term 2
2017/07/07 23:24:17 [INFO] raft: Election won. Tally: 2
2017/07/07 23:24:17 [INFO] raft: Node at 172.20.0.2:8300 [Leader] entering Leader state
2017/07/07 23:24:17 [INFO] raft: Added peer 172.20.0.3:8300, starting replication
2017/07/07 23:24:17 [INFO] raft: Added peer 172.20.0.4:8300, starting replication
2017/07/07 23:24:17 [INFO] consul: cluster leadership acquired
2017/07/07 23:24:17 [INFO] consul: New leader elected: consul1
2017/07/07 23:24:17 [WARN] raft: AppendEntries to {Voter 172.20.0.3:8300 172.20.0.3:8300} rejected, sending older logs (next: 1)
2017/07/07 23:24:17 [WARN] raft: AppendEntries to {Voter 172.20.0.4:8300 172.20.0.4:8300} rejected, sending older logs (next: 1)
2017/07/07 23:24:17 [INFO] raft: pipelining replication to peer {Voter 172.20.0.3:8300 172.20.0.3:8300}
2017/07/07 23:24:17 [INFO] raft: pipelining replication to peer {Voter 172.20.0.4:8300 172.20.0.4:8300}
2017/07/07 23:24:18 [INFO] consul: member 'consul1' joined, marking health alive
2017/07/07 23:24:18 [INFO] consul: member 'consul2' joined, marking health alive
2017/07/07 23:24:18 [INFO] consul: member 'consul3' joined, marking health alive
2017/07/07 23:24:20 [INFO] agent: Synced service 'consul'
2017/07/07 23:24:20 [INFO] agent: Synced service 'messaging-service-kafka'
2017/07/07 23:24:20 [INFO] agent: Synced service 'messaging-service-zookeeper'
$ curl http://127.0.0.1:8500/v1/catalog/service/consul
curl: (52) Empty reply from server
$ dig @127.0.0.1 -p 8600 consul.service.consul
; <<>> DiG 9.8.3-P1 <<>> @127.0.0.1 -p 8600 consul.service.consul
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
$ dig @127.0.0.1 -p 8600 messaging-service-kafka.service.consul
; <<>> DiG 9.8.3-P1 <<>> @127.0.0.1 -p 8600 messaging-service-kafka.service.consul
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
I can't get my services to register via the HTTP API either; those shown above are registered using a config script when the container launches.
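For reference, registering a service through the agent HTTP API looks roughly like this (a sketch; the service name and port are placeholders):

curl --request PUT \
  --data '{"Name": "example-service", "Port": 9999}' \
  http://127.0.0.1:8500/v1/agent/service/register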
Here's my docker-compose.yml:
version: '2'
services:
  consul1:
    image: "consul:latest"
    container_name: "net-sci_discovery-service_consul_1"
    hostname: "consul1"
    ports:
      - "8400:8400"
      - "8500:8500"
      - "8600:8600"
    volumes:
      - ./etc/consul.d:/etc/consul.d
    command: "agent -server -ui -bootstrap-expect 3 -config-dir=/etc/consul.d -bind=0.0.0.0"
  consul2:
    image: "consul:latest"
    container_name: "net-sci_discovery-service_consul_2"
    hostname: "consul2"
    command: "agent -server -join=consul1"
    links:
      - "consul1"
  consul3:
    image: "consul:latest"
    container_name: "net-sci_discovery-service_consul_3"
    hostname: "consul3"
    command: "agent -server -join=consul1"
    links:
      - "consul1"
I'm relatively new to both Docker and Consul. I've had a look around the web, and the above options are my understanding of what is required. Any suggestions on the way forward would be very welcome.
Edit:
Result of docker container ps --all:
$ docker container ps --all
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
e0a1c3bba165 consul:latest "docker-entrypoint..." 38 seconds ago Up 36 seconds 8300-8302/tcp, 8500/tcp, 8301-8302/udp, 8600/tcp, 8600/udp net-sci_discovery-service_consul_3
7f05555e81e0 consul:latest "docker-entrypoint..." 38 seconds ago Up 36 seconds 8300-8302/tcp, 8500/tcp, 8301-8302/udp, 8600/tcp, 8600/udp net-sci_discovery-service_consul_2
9e2dedaa224b consul:latest "docker-entrypoint..." 39 seconds ago Up 38 seconds 0.0.0.0:8400->8400/tcp, 8301-8302/udp, 0.0.0.0:8500->8500/tcp, 8300-8302/tcp, 8600/udp, 0.0.0.0:8600->8600/tcp net-sci_discovery-service_consul_1
27b34c5dacb7 messagingservice_kafka "start-kafka.sh" 3 hours ago Up 3 hours 0.0.0.0:9092->9092/tcp net-sci_messaging-service_kafka
0389797b0b8f wurstmeister/zookeeper "/bin/sh -c '/usr/..." 3 hours ago Up 3 hours 22/tcp, 2888/tcp, 3888/tcp, 0.0.0.0:2181->2181/tcp net-sci_messaging-service_zookeeper
Edit:
Updated docker-compose.yml to include long format for ports:
version: '3.2'
services:
  consul1:
    image: "consul:latest"
    container_name: "net-sci_discovery-service_consul_1"
    hostname: "consul1"
    ports:
      - target: 8400
        published: 8400
        mode: host
      - target: 8500
        published: 8500
        mode: host
      - target: 8600
        published: 8600
        mode: host
    volumes:
      - ./etc/consul.d:/etc/consul.d
    command: "agent -server -ui -bootstrap-expect 3 -config-dir=/etc/consul.d -bind=0.0.0.0 -client=127.0.0.1"
  consul2:
    image: "consul:latest"
    container_name: "net-sci_discovery-service_consul_2"
    hostname: "consul2"
    command: "agent -server -join=consul1"
    links:
      - "consul1"
  consul3:
    image: "consul:latest"
    container_name: "net-sci_discovery-service_consul_3"
    hostname: "consul3"
    command: "agent -server -join=consul1"
    links:
      - "consul1"
From the Consul Web GUI page: make sure you have launched the agent with the -ui parameter.
The UI is available at the /ui path on the same port as the HTTP API.
By default this is http://localhost:8500/ui.
I do see 8500 mapped to your host on all interfaces (0.0.0.0).
Also check (as in this answer) whether the client_addr setting can help (at least for testing).
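For example, the agent startup log above shows "Client Addr: 127.0.0.1", so the HTTP, DNS and UI listeners are only reachable from inside the container. A sketch of binding them to all interfaces instead (the same command line as in the compose file, with only the -client flag changed):

command: "agent -server -ui -bootstrap-expect 3 -config-dir=/etc/consul.d -bind=0.0.0.0 -client=0.0.0.0"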

Docker Hub registry: x509: certificate signed by unknown authority

I've spent hours trying to solve this issue, but I'm unable to find any related topics, since everything I find is about custom registries.
When running any of the docker commands that connect to Docker Hub, either through https://registry-1.docker.io/v2/ or https://index.docker.io/v1, all requests fail with "x509: certificate signed by unknown authority". However, querying the same endpoints with curl seems to work just fine.
I've reinstalled Docker completely, purging all configuration files, but it does not seem to make a difference.
Is there anything I'm missing?
docker info:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 0
Server Version: 17.05.0-ce
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9048e5e50717ea4497b757314bad98ea3763c145
runc version: 9c2d8d184e5da67c95d601382adf14862e4f2228
init version: 949e6fa
Security Options:
seccomp
Profile: default
Kernel Version: 4.9.35-1-lts
Operating System: Arch Linux
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 15.34GiB
ID: 5Q4D:TLJF:3I3U:O522:VQMK:24BU:H5ND:UPOU:MWYS:WGTB:XFXR:BQES
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Ena
Using docker:
[user@hostname]$ docker search ubunut
Error response from daemon: Get https://index.docker.io/v1/search?q=ubunut&n=25: x509: certificate signed by unknown authority
Using curl:
[user@hostname]$ curl -v https://index.docker.io/v1/search?q=ubunut&n=25
[1] 2152
[user@hostname]$ * Trying 34.200.194.233...
* TCP_NODELAY set
* Connected to index.docker.io (34.200.194.233) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
* subject: OU=GT98568428; OU=See www.rapidssl.com/resources/cps (c)15; OU=Domain Control Validated - RapidSSL(R); CN=*.docker.io
* start date: Mar 19 17:34:32 2015 GMT
* expire date: Apr 21 01:51:52 2018 GMT
* subjectAltName: host "index.docker.io" matched cert's "*.docker.io"
* issuer: C=US; O=GeoTrust Inc.; CN=RapidSSL SHA256 CA - G3
* SSL certificate verify ok.
> GET /v1/search?q=ubunut HTTP/1.1
> Host: index.docker.io
> User-Agent: curl/7.54.1
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.6.2
< Date: Wed, 05 Jul 2017 12:10:22 GMT
< Content-Type: application/json
< Transfer-Encoding: chunked
< Vary: Cookie
< X-Frame-Options: SAMEORIGIN
< Strict-Transport-Security: max-age=31536000
<
{"num_pages": 1, "num_results": 21, "results": [{"is_automated": true, "name": "han4wluc/try-docker-ubunut-node", "is_trusted": true, ... *truncated*
I solved the problem as follows:
I removed the file /etc/ssl/certs/ca-certificates.crt.
I ran the command sudo pacman -S ca-certificates-utils.
I restarted Docker with the systemctl restart docker command.
I got this hint from this link:
https://unix.stackexchange.com/questions/339613/arch-linux-ca-certificates-crt-not-found/396169#396169
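The same steps as commands (a sketch of exactly the steps described above, on Arch Linux):

sudo rm /etc/ssl/certs/ca-certificates.crt
sudo pacman -S ca-certificates-utils
sudo systemctl restart docker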
