Lost my openshift console ("Application is not available") - oauth

The console UI in my OpenShift 4.5.x installation has mysteriously stopped working. Visiting the console URL now results in the message:
Application is not available
The application is currently not serving requests at this endpoint. It may not have been started or is still starting.
One usually sees this when a route exists but cannot find a corresponding service or pod, but in this case the route exists:
$ oc -n openshift-console get route
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
console console-openshift-console.apps.example.com console https reencrypt/Redirect None
downloads downloads-openshift-console.apps.example.com downloads http edge/Redirect None
The service exists:
$ oc -n openshift-console get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
console ClusterIP 172.30.36.70 <none> 443/TCP 57d
downloads ClusterIP 172.30.190.186 <none> 80/TCP 57d
And the pods exist and are healthy:
$ oc -n openshift-console get pods
NAME READY STATUS RESTARTS AGE
console-76c8d7d755-gtfm8 0/1 Running 1 4m12s
console-76c8d7d755-mvf6n 0/1 Running 1 4m12s
downloads-9656c996-mmqhk 1/1 Running 0 53d
downloads-9656c996-z2khj 1/1 Running 0 53d
Looking at the logs for the console pods, there appears to be a problem contacting the oauth service:
2021-01-04T22:05:48Z auth: error contacting auth provider (retrying in 10s): Get https://kubernetes.default.svc/.well-known/oauth-authorization-server: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2021-01-04T22:05:58Z auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.example.com/oauth/token failed: Head https://oauth-openshift.apps.example.com: EOF
2021-01-04T22:06:13Z auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.example.com/oauth/token failed: Head https://oauth-openshift.apps.example.com: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2021-01-04T22:06:23Z auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.example.com/oauth/token failed: Head https://oauth-openshift.apps.example.com: EOF
2021-01-04T22:06:38Z auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.example.com/oauth/token failed: Head https://oauth-openshift.apps.example.com: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2021-01-04T22:06:53Z auth: error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.example.com/oauth/token failed: Head https://oauth-openshift.apps.example.com: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
But the pods in the openshift-authentication namespace appear to be healthy and are not reporting any errors in the logs. Where should I be looking for the source of the problem?
The expected route and service exist in the openshift-authentication namespace:
$ oc -n openshift-authentication get route
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
oauth-openshift oauth-openshift.apps.example.com oauth-openshift 6443 passthrough/Redirect None
$ oc -n openshift-authentication get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
oauth-openshift ClusterIP 172.30.233.202 <none> 443/TCP 57d
$ oc -n openshift-authentication get route oauth-openshift -o json | jq .status
{
  "ingress": [
    {
      "conditions": [
        {
          "lastTransitionTime": "2020-11-08T19:48:08Z",
          "status": "True",
          "type": "Admitted"
        }
      ],
      "host": "oauth-openshift.apps.example.com",
      "routerCanonicalHostname": "apps.example.com",
      "routerName": "default",
      "wildcardPolicy": "None"
    }
  ]
}

It turned out to be an issue with the default ingress routers. There were no obvious errors, but I was able to resolve the problem by restarting the routers:
oc -n openshift-ingress get pod -o json |
jq -r '.items[].metadata.name' |
xargs oc -n openshift-ingress delete pod
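If deleting the router pods feels heavy-handed, two follow-ups may help (these are my own suggestions, not part of the original troubleshooting): checking the relevant cluster operators first confirms which layer is actually degraded, and on 4.x a rolling restart of the router deployment achieves the same effect more gracefully, assuming the default deployment name router-default:
# check operator status for the layers involved (OCP 4.x)
oc get clusteroperators authentication ingress console
# gracefully restart the default ingress routers (assumes the default deployment name)
oc -n openshift-ingress rollout restart deployment/router-default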

I had the same issue on OpenShift 3.11.
I just deleted the secret containing the console serving certificate; OpenShift creates a new secret, and the console works again:
oc delete secret console-serving-cert -n openshift-console
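If the console pods do not pick up the regenerated certificate on their own, restarting them should force it (a sketch mirroring the router-restart approach above; note it restarts every pod in the namespace, including the harmless downloads pods):
oc -n openshift-console get pod -o name | xargs oc -n openshift-console delete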

Related

ImagePullBackOff after Kubectl run

I am new to Kubernetes. I am using Minikube for Mac with the hyperkit VM driver. I also have docker-desktop installed (in which I have tried both enabling and disabling Kubernetes).
docker pull executes smoothly with no error,
but on
kubectl run kubernetes-jenkins --image=jenkins:latest --port=8080
(or any image, be it gcr.io/google-samples/kubernetes-bootcamp:v1) it fails with ImagePullBackOff
Trimming a few parts from kubectl cluster-info dump:
I1230 10:20:56.812648 1 serving.go:312] Generated self-signed cert in-memory
W1230 10:20:58.777494 1 configmap_cafile_content.go:102] unable to load initial CA bundle for: "client-ca::kube-system::extension-apiserver-authentication::client-ca-file" due to: configmap "extension-apiserver-authentication" not found
W1230 10:20:58.778005 1 configmap_cafile_content.go:102] unable to load initial CA bundle for: "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file" due to: configmap "extension-apiserver-authentication" not found
W1230 10:20:58.849619 1 authorization.go:47] Authorization is disabled
W1230 10:20:58.850375 1 authentication.go:92] Authentication is disabled
"reason": "Failed",
"message": "Failed to pull image \"jenkins:latest\": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io on 192.168.64.1:53: read udp 192.168.64.3:38558-\u003e192.168.64.1:53: read: connection refused",
"source": {
  "component": "kubelet",
  "host": "minikube"
}
Why is kubectl unable to pull the image from the repository?
In Minikube, images in your local Docker daemon's registry can't be found by default, so you have to set your Docker environment to use Minikube's registry for the local images you build and pull:
eval $(minikube docker-env)
If that doesn't solve your problem, you can start Minikube and tell it about the registry explicitly:
minikube start --vm-driver="virtualbox" --insecure-registry=$(docker-machine ip registry):80
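Putting the pieces together, here is a minimal sketch of the local-image workflow; the image name my-app:local and the port are placeholders I made up for illustration:
# point the local docker CLI at minikube's docker daemon
eval $(minikube docker-env)
# build the image directly into minikube's image cache
docker build -t my-app:local .
# run it without attempting a pull from a remote registry
kubectl run my-app --image=my-app:local --image-pull-policy=Never --port=8080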

How to fix Spring Cloud Data Flow Kubernetes container Readiness probe failed: HTTP probe failed with statuscode: 401

I have deployed Spring Cloud Data Flow on Azure AKS using Helm: helm install --name my-release stable/spring-cloud-data-flow
Data Flow Server Implementation
Name: spring-cloud-dataflow-server
Version: 2.0.1.RELEASE
But I'm getting Liveness probe and Readiness probe failures with status 401:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 10m (x52 over 103m) kubelet, aks-nodepool1-28921497-0 Liveness probe failed: HTTP probe failed with statuscode: 401
Warning BackOff 6m8s (x138 over 73m) kubelet, aks-nodepool1-28921497-0 Back-off restarting failed container
Warning Unhealthy 67s (x220 over 104m) kubelet, aks-nodepool1-28921497-0 Readiness probe failed: HTTP probe failed with statuscode: 401
Reading this document https://docs.spring.io/spring-cloud-dataflow/docs/2.0.2.RELEASE/reference/htmlsingle/#_application_and_server_properties suggests setting:
deployer.appName.kubernetes.probeCredentialsSecret=myprobesecret
But how do I set/run these deployer properties if I'm using only Helm to deploy Data Flow on the AKS cluster?
Or how can I make the release use the default probe secret? I did not create or modify a probe secret when deploying Data Flow with Helm.
Thanks
We support a variety of deployer properties that you can override per stream/task deployment in SCDF. The probeCredentialsSecret property is one of them, and it is specifically designed to supply a secret used to gain access to protected liveness and readiness probes.
Whether or not you used Helm to provision SCDF on K8s, the actual property needs to be supplied at the time of stream/task deployment.
Unless you create a secret and configure it in SCDF, you will not be able to successfully handshake with the secured probes.
Please follow the ref. guide that walks through the configuration with an example.
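For example, once a secret exists in the cluster (named, say, myprobesecret, laid out as the ref. guide describes), the property is supplied per deployment from the SCDF shell rather than through Helm, along these lines (mystream and <appName> are placeholders):
dataflow:> stream deploy --name mystream --properties "deployer.<appName>.kubernetes.probeCredentialsSecret=myprobesecret"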

kubectl can't connect to docker registry to download image

I'm stepping through Kubernetes in Action to get more than just familiarity with Kubernetes.
I already had a Docker Hub account that I've been using for Docker-specific experiments.
As described in chapter 2 of the book, I built the toy "kubia" image, and I was able to push it to Docker Hub. I verified this again by logging into Docker Hub and seeing the image.
I'm doing this on Centos7.
I then run the following to create the replication controller and pod running my image:
kubectl run kubia --image=davidmichaelkarr/kubia --port=8080 --generator=run/v1
I waited a while for the status to change, but it never finishes downloading the image. When I describe the pod, I see something like this:
Normal Scheduled 24m default-scheduler Successfully assigned kubia-25th5 to minikube
Normal SuccessfulMountVolume 24m kubelet, minikube MountVolume.SetUp succeeded for volume "default-token-x5nl4"
Normal Pulling 22m (x4 over 24m) kubelet, minikube pulling image "davidmichaelkarr/kubia"
Warning Failed 22m (x4 over 24m) kubelet, minikube Failed to pull image "davidmichaelkarr/kubia": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
So I then constructed the following command:
curl -v -u 'davidmichaelkarr:**' 'https://registry-1.docker.io/v2/'
Which uses the same password I use for Docker Hub (they should be the same, right?).
This gives me the following:
* About to connect() to proxy *** port 8080 (#0)
* Trying **.**.**.**...
* Connected to *** (**.**.**.**) port 8080 (#0)
* Establish HTTP proxy tunnel to registry-1.docker.io:443
* Server auth using Basic with user 'davidmichaelkarr'
> CONNECT registry-1.docker.io:443 HTTP/1.1
> Host: registry-1.docker.io:443
> User-Agent: curl/7.29.0
> Proxy-Connection: Keep-Alive
>
< HTTP/1.1 200 Connection established
<
* Proxy replied OK to CONNECT request
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
CApath: none
* SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate:
* subject: CN=*.docker.io
* start date: Aug 02 00:00:00 2017 GMT
* expire date: Sep 02 12:00:00 2018 GMT
* common name: *.docker.io
* issuer: CN=Amazon,OU=Server CA 1B,O=Amazon,C=US
* Server auth using Basic with user 'davidmichaelkarr'
> GET /v2/ HTTP/1.1
> Authorization: Basic ***
> User-Agent: curl/7.29.0
> Host: registry-1.docker.io
> Accept: */*
>
< HTTP/1.1 401 Unauthorized
< Content-Type: application/json; charset=utf-8
< Docker-Distribution-Api-Version: registry/2.0
< Www-Authenticate: Bearer realm="https://auth.docker.io/token",service="registry.docker.io"
< Date: Wed, 24 Jan 2018 18:34:39 GMT
< Content-Length: 87
< Strict-Transport-Security: max-age=31536000
<
{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":null}]}
* Connection #0 to host *** left intact
I don't understand why this is failing auth.
Update:
Based on the first answer and the info I got from this other question, I edited the description of the service account, adding the "imagePullSecrets" key, then I deleted the replicationcontroller again and recreated it. The result appeared to be identical.
This is the command I ran to create the secret:
kubectl create secret docker-registry regsecret --docker-server=registry-1.docker.io --docker-username=davidmichaelkarr --docker-password=** --docker-email=**
Then I obtained the yaml for the serviceaccount, added the key reference for the secret, then set that yaml as the settings for the serviceaccount.
These are the current settings for the service account:
$ kubectl get serviceaccount default -o yaml
apiVersion: v1
imagePullSecrets:
- name: regsecret
kind: ServiceAccount
metadata:
  creationTimestamp: 2018-01-24T00:05:01Z
  name: default
  namespace: default
  resourceVersion: "81492"
  selfLink: /api/v1/namespaces/default/serviceaccounts/default
  uid: 38e2882c-009a-11e8-bf43-080027ae527b
secrets:
- name: default-token-x5nl4
Here's the updated events list from the describe of the pod after doing this:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 7m default-scheduler Successfully assigned kubia-f56th to minikube
Normal SuccessfulMountVolume 7m kubelet, minikube MountVolume.SetUp succeeded for volume "default-token-x5nl4"
Normal Pulling 5m (x4 over 7m) kubelet, minikube pulling image "davidmichaelkarr/kubia"
Warning Failed 5m (x4 over 7m) kubelet, minikube Failed to pull image "davidmichaelkarr/kubia": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Normal BackOff 4m (x6 over 7m) kubelet, minikube Back-off pulling image "davidmichaelkarr/kubia"
Warning FailedSync 2m (x18 over 7m) kubelet, minikube Error syncing pod
What else might I be doing wrong?
Update:
I think it's likely that all these issues with authentication are unrelated to the real issue. The key point is what I see in the pod description (breaking into multiple lines to make it easier to see):
Warning Failed 22m (x4 over 24m) kubelet,
minikube Failed to pull image "davidmichaelkarr/kubia": rpc error: code =
Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/:
net/http: request canceled while waiting for connection
(Client.Timeout exceeded while awaiting headers)
The last line seems like the most important piece of information at this point. It's not failing authentication; it's timing out the connection. In my experience, something like this is usually caused by trouble getting through a firewall/proxy. We do have an internal proxy, and I have the proxy environment variables set in my environment, but what about the "serviceaccount" that kubectl is using to make this connection? Do I have to somehow set a proxy configuration in the serviceaccount description?
You need to make sure the Docker daemon running in the Minikube VM uses your corporate proxy by starting minikube along these lines:
minikube start --docker-env http_proxy=http://proxy.corp.com:port --docker-env https_proxy=http://proxy.corp.com:port --docker-env no_proxy=192.168.99.0/24
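To confirm the daemon inside the Minikube VM actually picked up those settings, you can check its docker info output (assuming your Docker version reports proxy values there):
# shell into the Minikube VM, then inspect the daemon
minikube ssh
docker info | grep -i proxy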
I have faced this same issue a couple of times. Updating here in case it is useful for someone.
First, describe the pod (kubectl describe pod <pod_name>).
1. If you see access denied / repository-does-not-exist errors like
Error response from daemon: pull access denied for test/nginx,
repository does not exist or may require 'docker login': denied:
requested access to the resource is denied
Solution:
If it is a local K8s cluster, you need to log in to the Docker registry first, OR
if the Kubernetes cluster is on a cloud provider, create a secret for the registry and add imagePullSecrets with the secret name.
2. If you get a timeout error,
Error: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while
awaiting headers)
Solution:
Check that the node can reach the network and the private/public registry.
If it is an AWS EKS cluster, you need to enable auto-assign public IP on the subnet where the EC2 node is running.
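A quick way to tell the two failure modes apart from the affected node (or any shell on the same network path) is to hit the registry directly; a 401 response actually means the registry is reachable and only credentials are missing, while a hang or timeout points at network, DNS, or proxy problems:
curl -I https://registry-1.docker.io/v2/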
To fetch images stored in registries that require credentials, you need to create a docker-registry secret and reference it from the Pod's imagePullSecrets field.
kubectl create secret docker-registry regsecret --docker-server=<your-registry-server> --docker-username=<your-name> --docker-password=<your-pword> --docker-email=<your-email>
Then create the Pod specifying the imagePullSecrets field
apiVersion: v1
kind: Pod
metadata:
  name: private-reg
spec:
  containers:
  - name: private-reg-container
    image: <your-private-image>
  imagePullSecrets:
  - name: regsecret
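Alternatively, instead of listing the secret in every Pod spec, you can attach it to the namespace's default service account (effectively what the question author did by editing the YAML), for example with a one-line patch:
kubectl patch serviceaccount default -p '{"imagePullSecrets": [{"name": "regsecret"}]}'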
As mentioned in my comment on the original post, I had the same issue. The only thing of note is that minikube had been up since it was created. I restarted the underlying VM and image pulls started working.
This seems to be quite an old issue, but I had a similar issue and solved it by logging in to my Docker account.
You can try it by deleting the existing failed pods, running "docker login" (logging in to your account), then retrying the pod creation.

How to debug route not working in OCP 3.4

I have encountered an issue many times and fixed it by recreating the problematic app, but I still want to know how to debug it and what causes this issue.
For example:
I created a new Jenkins persistent application on my master. I'm able to curl the endpoint's IP address on the node/master, but not the application's exposed IP/hostname address. Sometimes I just need to remove the application and re-create it and the problem is fixed, but this time I really want to know how to fix it without re-creating. I followed the article at https://docs.openshift.com/enterprise/3.1/admin_guide/sdn_troubleshooting.html for debugging, and I'm pretty sure DNS is working. Just let me know if I need to provide any more information here. Thanks.
Here is my svc description output:
# oc describe svc/jenkins
Name: jenkins
Namespace: developer
Labels: app=jenkins-persistent
template=jenkins-persistent-template
Selector: name=jenkins
Type: ClusterIP
IP: 172.30.78.168
Port: web 80/TCP
Endpoints: 10.128.0.97:8080
Session Affinity: None
No events.
# oc get svc
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
jenkins 172.30.78.168 <none> 80/TCP 1h
jenkins-jnlp 172.30.87.38 <none> 50000/TCP 1h
# curl http://10.128.0.97:8080
<html><head><meta http-equiv='refresh' content='1;url=/securityRealm/commenceLogin?from=%2F'/><script>window.location.replace('/securityRealm/commenceLogin?from=%2F');</script></head><body style='background-color:white; color:white;'>
Authentication required
<!--
You are authenticated as: anonymous
Groups that you are in:
Permission you need to have (but didn't): hudson.model.Hudson.Read
... which is implied by: hudson.security.Permission.GenericRead
... which is implied by: hudson.model.Hudson.Administer
-->
</body></html>
# oc get oauthclient
NAME SECRET WWW-CHALLENGE REDIRECT URIS
cockpit-oauth-client user7IjHLvwuclbrHeVmi2pslHpSbmQuI3ePqjAHbLSS0aNBekio2aqDM3iBbx33Qwwp FALSE https://registry-console-default.com.cn,https://jenkins-developer.com.cn
# curl 172.30.78.168:8080
curl: (7) Failed connect to 172.30.78.168:8080; No route to host
# curl 172.30.78.168
<html><head><meta http-equiv='refresh' content='1;url=/securityRealm/commenceLogin?from=%2F'/><script>window.location.replace('/securityRealm/commenceLogin?from=%2F');</script></head><body style='background-color:white; color:white;'>
Authentication required
<!--
You are authenticated as: anonymous
Groups that you are in:
Permission you need to have (but didn't): hudson.model.Hudson.Read
... which is implied by: hudson.security.Permission.GenericRead
... which is implied by: hudson.model.Hudson.Administer
# curl jenkins-developer.com.cn
curl: (7) Failed connect to jenkins-developer.com.cn:80; Connection refused
# oc describe route
Name: jenkins
Namespace: developer
Created: 2 days ago
Labels: app=jenkins-persistent
template=jenkins-persistent-template
Annotations: <none>
Requested Host: jenkins-developer.paas.com.cn
exposed on router ose-router 2 days ago
Path: <none>
TLS Termination: <none>
Insecure Policy: <none>
Endpoint Port: web
Service: jenkins
Weight: 100 (100%)
Endpoints: 10.128.0.97:8080

Pod creation in ContainerCreating state always

I am trying to create a pod using kubernetes with the following simple command
kubectl run example --image=nginx
It runs and assigns the pod to the minion correctly, but the status is always ContainerCreating due to the following error. I have not hosted GCR or GCloud on my machine, so I'm not sure why it's pulling from there.
1h 29m 14s {kubelet centos-minion1} Warning FailedSync Error syncing pod, skipping:
failed to "StartContainer" for "POD" with ErrImagePull: "image pull failed
for gcr.io/google_containers/pause:2.0, this may be because there are no
credentials on this request. details: (unable to ping registry endpoint
https://gcr.io/v0/\nv2 ping attempt failed with error: Get https://gcr.io/v2/:
http: error connecting to proxy http://87.254.212.120:8080: dial tcp
87.254.212.120:8080: i/o timeout\n v1 ping attempt failed with error:
Get https://gcr.io/v1/_ping: http: error connecting to proxy
http://87.254.212.120:8080: dial tcp 87.254.212.120:8080: i/o timeout)
Kubernetes is trying to create a pause container for your pod; this container is used to create the pod's network namespace. See this question and its answers for more general information on the pause container.
To your specific error: Kubernetes tries to pull the pause container's image (which would be gcr.io/google_containers/pause:2.0, according to your error message) from the Google Container Registry (gcr.io). Apparently, your Docker engine tries to connect to GCR using a HTTP proxy located at 87.254.212.120:8080, to which it apparently cannot connect (i/o timeout).
To correct this error, either make sure that your HTTP proxy server is online and does not block HTTP requests to GCR, or (if you do have public Internet access) disable the proxy connection for your Docker engine (this is typically done via the http_proxy and https_proxy environment variables, which would have been set in /etc/sysconfig/docker or /etc/default/docker, depending on your Linux distribution).
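As an illustration only (the proxy address is a placeholder and the file location varies by distribution), the relevant lines in /etc/sysconfig/docker on CentOS 7 typically look like the following; commenting them out and restarting the daemon disables the proxy:
# /etc/sysconfig/docker -- placeholder values
HTTP_PROXY=http://proxy.example.com:8080
HTTPS_PROXY=http://proxy.example.com:8080
NO_PROXY=localhost,127.0.0.1
# after editing, restart the daemon:
# systemctl restart docker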
