I'm running OpenShift.
OpenShift Master: v3.3.1.7
Kubernetes Master: v1.3.0+52492b4
But I am having problems trying to run a build in Jenkins (running in a pod). This is not a problem with the Java code that I'm trying to build, but a problem in the Kubernetes/OpenShift setup.
The builds fail with:
Caused by: java.net.UnknownHostException: kubernetes.default: Name does not resolve
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
at java.net.InetAddress.getAllByName(InetAddress.java:1192)
at java.net.InetAddress.getAllByName(InetAddress.java:1126)
at okhttp3.Dns$1.lookup(Dns.java:39)
...
Does anyone know how to fix this?
First, confirm that DNS is actually working with:
kubectl run -i -t busybox --image=busybox --restart=Never
Waiting for pod default/busybox to be running, status is Pending, pod ready: false
If you don't see a command prompt, try pressing enter.
/ # nslookup kubernetes.default
Server: 192.168.60.10
Address 1: 192.168.60.10 kube-dns.kube-system.svc.cluster.local
Name: kubernetes.default
Address 1: 192.168.60.1 kubernetes.default.svc.cluster.local
If that doesn't work, check whether the DNS pods are running:
kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
It should respond with something like:
NAME READY STATUS RESTARTS AGE
kube-dns-v14-3u5zi 3/3 Running 36 166d
Finally, checking the related logs is worth a try:
kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c kube-dns
kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c dnsmasq
kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c healthz
Full instructions can be found on kubernetes.io
Please check under Manage Jenkins -> Configure Global Security that the agent port is 50000 and set to Fixed.
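To cross-check from the cluster side, here is a minimal sketch, assuming the JNLP service follows the common jenkins-discovery naming used in the GKE tutorial later in this thread (adjust the name and namespace to your setup):
# assumption: service is named jenkins-discovery in the jenkins namespace
kubectl -n jenkins get svc jenkins-discovery -o wide
# the PORT(S) column should list 50000, matching the fixed agent port in Jenkins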
I understand that this question has been asked dozens of times, but nothing I have found through internet searching has helped.
My set up:
CentOS Linux release 7.5.1804 (Core)
Docker Version: 18.06.1-ce
Kubernetes: v1.12.3
Installed following the official guide and this one: https://www.techrepublic.com/article/how-to-install-a-kubernetes-cluster-on-centos-7/
CoreDNS pods are in Error/CrashLoopBackOff state.
kube-system coredns-576cbf47c7-8phwt 0/1 CrashLoopBackOff 8 31m
kube-system coredns-576cbf47c7-rn2qc 0/1 CrashLoopBackOff 8 31m
My /etc/resolv.conf:
nameserver 8.8.8.8
I also tried my local DNS resolver (router):
nameserver 10.10.10.1
Setup and init:
kubeadm init --apiserver-advertise-address=10.10.10.3 --pod-network-cidr=192.168.1.0/16
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
I tried to solve this with:
Editing the coredns ConfigMap: [root@kub ~]# kubectl edit cm coredns -n kube-system
and changing
proxy . /etc/resolv.conf
directly to
proxy . 10.10.10.1
or
proxy . 8.8.8.8
Also tried to:
kubectl -n kube-system get deployment coredns -o yaml | sed 's/allowPrivilegeEscalation: false/allowPrivilegeEscalation: true/g' | kubectl apply -f -
And still nothing helped.
Error from the logs:
plugin/loop: Seen "HINFO IN 7847735572277573283.2952120668710018229." more than twice, loop detected
The other thread - coredns pods have CrashLoopBackOff or Error state - didn't help at all, because none of the solutions described there matched my case. Nothing helped.
I got the same error and successfully fixed it with the steps below.
However, you missed 8.8.4.4:
sudo nano /etc/resolv.conf
nameserver 8.8.8.8
nameserver 8.8.4.4
Run the following commands to reload the daemon and restart the Docker service:
sudo systemctl daemon-reload
sudo systemctl restart docker
If you are using kubeadm, make sure you delete the entire cluster from the master and provision the cluster again:
kubectl drain <node_name> --delete-local-data --force --ignore-daemonsets
kubectl delete node <node_name>
kubeadm reset
Once you provision the new cluster, run:
kubectl get pods --all-namespaces
It should give the expected result below:
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-node-gldlr 2/2 Running 0 24s
kube-system coredns-86c58d9df4-lpnj6 1/1 Running 0 40s
kube-system coredns-86c58d9df4-xnb5r 1/1 Running 0 40s
kube-system kube-proxy-kkb7b 1/1 Running 0 40s
kube-system kube-scheduler-osboxes 1/1 Running 0 10s
$ kubectl edit cm coredns -n kube-system
Delete the 'loop' line, save, and exit.
Then restart the master node. It worked for me.
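If rebooting the whole master is too disruptive, restarting only the CoreDNS pods is usually enough to pick up the edited ConfigMap; a sketch assuming kubectl 1.15+ and the default k8s-app=kube-dns label:
# re-create the CoreDNS pods so they reload the Corefile
kubectl -n kube-system rollout restart deployment coredns
# watch until both replicas report Running
kubectl -n kube-system get pods -l k8s-app=kube-dns -w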
I faced the same issue in my local Kubernetes in Docker (kind) setup. The CoreDNS pod was getting a CrashLoopBackOff error.
Steps followed to get the pod into the Running state:
As Tim Chan said in this post, and by referring to the GitHub issues link, I did the following:
kubectl -n kube-system edit configmaps coredns -o yaml
Modify the section: replace forward . /etc/resolv.conf with forward . 172.16.232.1 (in my case I set 8.8.8.8 for the time being).
Delete one of the CoreDNS pods, or wait for some time - the pods will reach the Running state.
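A one-liner for that last step, assuming the default k8s-app=kube-dns label that kind applies to CoreDNS:
# delete all CoreDNS pods; the Deployment recreates them with the new Corefile
kubectl -n kube-system delete pod -l k8s-app=kube-dns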
This usually happens when CoreDNS can't talk to the kube-apiserver:
Check that your kubernetes service is in the default namespace:
$ kubectl get svc kubernetes
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 130d
Then (you might have to create a pod):
$ kubectl -n kube-system exec -it <any-pod-with-shell> -- sh
# ping kubernetes.default.svc.cluster.local
PING kubernetes.default.svc.cluster.local (10.96.0.1): 56 data bytes
Also, try hitting port 443 from the pod:
# telnet kubernetes.default.svc.cluster.local 443 # or
# curl kubernetes.default.svc.cluster.local:443
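If the image has neither telnet nor curl, a bare TCP check with bash's /dev/tcp built-in works too; a sketch assuming the pod's shell is bash:
# exits 0 (prints OK) if something accepts the connection on port 443
(echo > /dev/tcp/kubernetes.default.svc.cluster.local/443) && echo OK || echo FAIL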
The error I got was:
connect: no route to host","time":"2021-03-19T14:42:05Z"}
along with a CrashLoopBackOff status, in the log shown by kubectl -n kube-system logs coredns-d9fdb9c9f-864rz.
The issue is mentioned in https://github.com/coredns/coredns/tree/master/plugin/loop#troubleshooting-loops-in-kubernetes-clusters
tldr;
Reason: /etc/resolv.conf got updated somehow. The original one is at /run/systemd/resolve/resolv.conf, e.g.:
nameserver 172.16.232.1
Quick fix: edit the Corefile:
$ kubectl -n kube-system edit configmaps coredns -o yaml
to replace forward . /etc/resolv.conf with forward . 172.16.232.1
e.g.:
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health {
           lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        prometheus :9153
        forward . 172.16.232.1 {
           max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  creationTimestamp: "2021-03-18T15:58:07Z"
  name: coredns
  namespace: kube-system
  resourceVersion: "49996"
  uid: 428a03ff-82d0-4812-a3fa-e913c2911ebd
Done. After that, you may need to restart Docker:
sudo systemctl restart docker
Update: it can be fixed by just running sudo systemctl restart docker.
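As an alternative to hardcoding the upstream in the Corefile, the CoreDNS troubleshooting page linked above also suggests pointing the kubelet at the real resolv.conf so that /etc/resolv.conf inside the CoreDNS pod no longer loops; a sketch assuming a kubeadm-style kubelet config at /var/lib/kubelet/config.yaml:
# /var/lib/kubelet/config.yaml (path is the kubeadm default; adjust for your distro)
resolvConf: /run/systemd/resolve/resolv.conf
Then restart the kubelet:
sudo systemctl restart kubelet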
What happened:
I have been following these guidelines: https://kubernetes.io/docs/setup/minikube/ and I get a "connection refused" error when trying to curl the application. Here are the steps I took:
~~> minikube status
minikube: Stopped
cluster:
kubectl:
~~> minikube start
Starting local Kubernetes v1.10.0 cluster...
Starting VM...
Getting VM IP address...
Moving files into cluster...
Setting up certs...
Connecting to cluster...
Setting up kubeconfig...
Starting cluster components...
Kubectl is now configured to use the cluster.
Loading cached images from config file.
~~> kubectl run hello-minikube --image=k8s.gcr.io/echoserver:1.10 --port=9500
deployment.apps/hello-minikube created
~~> kubectl expose deployment hello-minikube --type=NodePort
service/hello-minikube exposed
~~> kubectl get pod
NAME READY STATUS RESTARTS AGE
hello-minikube-79577c5997-24gt8 1/1 Running 0 39s
~~> curl $(minikube service hello-minikube --url)
curl: (7) Failed to connect to 192.168.99.100 port 31779: Connection refused
What I expect to happen:
When I curl the pod, it should give a proper reply (like in the quickstart: https://kubernetes.io/docs/setup/minikube/)
minikube logs: https://docs.google.com/document/d/1o2-ebiZTsoCzQNSn_rQSkcuVzOJABmwT2KKzGoUQNiQ/edit
Not sure where you got port 9500 from, but that's the reason it doesn't work. NGINX serves on port 8080. This should work (it does for me, at least):
$ kubectl expose deployment hello-minikube \
--type=NodePort \
--port=8080 --target-port=8080
$ curl $(minikube service hello-minikube --url)
Hostname: hello-minikube-79577c5997-tf49z
Pod Information:
-no pod information available-
Server values:
server_version=nginx: 1.13.3 - lua: 10008
Request Information:
client_address=172.17.0.1
method=GET
real path=/
query=
request_version=1.1
request_scheme=http
request_uri=http://192.168.64.11:8080/
Request Headers:
accept=*/*
host=192.168.64.11:32141
user-agent=curl/7.54.0
Request Body:
-no body in request-
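The host header above (192.168.64.11:32141) shows the NodePort that was actually allocated; to check it directly, something like this should work:
# the PORT(S) column shows the mapping, e.g. 8080:32141/TCP
kubectl get svc hello-minikube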
I'm going through this tutorial
Setting up Jenkins on Container Engine
https://cloud.google.com/solutions/jenkins-on-container-engine-tutorial
and failing on "Creating the Jenkins deployment and services" step
I got this error at one point:
jenkins- 0/1 rpc error: code = 2 desc = failed to start container "": Error response from daemon: {"message":"linux spec user: unable to find user jenkins: no matching entries in passwd file"}
And I get these results for the following commands:
> kubectl apply -f jenkins/k8s/
deployment "jenkins" configured
service "jenkins-ui" configured
service "jenkins-discovery" configured
> kubectl get pods --namespace jenkins
NAME READY STATUS RESTARTS AGE
jenkins-<some id> 0/1 CrashLoopBackOff 5 10m
I get that it is looking for the jenkins user in the passwd file, but I'm still not sure why this error took place or what the correct way to fix it is. Any insight would be highly appreciated.
Edit: output of running "kubectl get pods --namespace jenkins"
The very first time running it:
> kubectl get pods --namespace jenkins
NAME READY STATUS RESTARTS AGE
jenkins-1937056428-fp7vr 0/1 ContainerCreating 0 16s
Second time running it:
> kubectl get pods --namespace jenkins
NAME READY STATUS RESTARTS AGE
jenkins-1937056428-fp7vr 0/1 rpc error: code = 2 desc = failed to start container "10a8ab7e3eb0ad153fd6055d86336b1cdfe9642b6993684a7e01fefbeca7a566": Error response from
daemon: {"message":"linux spec user: unable to find user jenkins: no matching entries in passwd file"} 1 39s
Third and after:
> kubectl get pods --namespace jenkins
NAME READY STATUS RESTARTS AGE
jenkins-1937056428-fp7vr 0/1 CrashLoopBackOff 270 22h
It appears that the persistent disk volume for Jenkins is not properly set up. Try running the following commands to reconfigure the disk volumes and rerun the Jenkins pod:
kubectl delete -f jenkins/k8s/
gcloud compute disks delete jenkins-home
gcloud compute images delete jenkins-home-image
gcloud config set compute/zone us-east1-d
gcloud compute images create jenkins-home-image --source-uri https://storage.googleapis.com/solutions-public-assets/jenkins-cd/jenkins-home-v3.tar.gz
gcloud compute disks create jenkins-home --image jenkins-home-image --zone us-east1-d
kubectl apply -f jenkins/k8s/
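To confirm the fix took, watch the pod come back up (this assumes the deployment from jenkins/k8s/ creates a single replica, as in the tutorial):
# the jenkins pod should move from ContainerCreating to Running
kubectl get pods --namespace jenkins -w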
I basically did one step wrong:
Provision a Kubernetes cluster using Container Engine.
gcloud container clusters create jenkins-cd \
--network jenkins \
--scopes "https://www.googleapis.com/auth/projecthosting,storage-rw"
Here, make sure the options --network and --scopes actually get passed in. I guess I copied the command without fixing it up and the options got dropped.
I deployed a Kubernetes cluster following this guide: https://blog.hypriot.com/post/setup-kubernetes-raspberry-pi-cluster/. It basically uses HypriotOS and Kubernetes from the Debian repository.
After the deployment, all the pods were running and no faults were shown. However, the DNS server was not working properly on the worker node.
On the master:
$ kubectl -n kube-system get svc
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns 10.96.0.10 <none> 53/UDP,53/TCP 34m
kubernetes-dashboard 10.103.97.112 <nodes> 80:30518/TCP 31m
# I installed the dnsutils to have the dig command
$ dig @10.96.0.10 || echo "FAIL"
# shows a valid response (note that we are not resolving anything)
On the worker:
$ dig @10.96.0.10 || echo "FAIL"
....
FAIL
It turned out that the answer was in one of the comments, but it was not clear that this was my issue.
As the author of the comment stated, it is due to the iptables policies in Docker versions > 1.13.
To solve it, execute the following on both nodes:
sudo iptables -A FORWARD -i cni0 -j ACCEPT
sudo iptables -A FORWARD -o cni0 -j ACCEPT
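Note that rules added with iptables -A do not survive a reboot; one way to persist them on a Debian-based system such as HypriotOS (assuming the iptables-persistent package is available) is:
sudo apt-get install -y iptables-persistent
# saves the current rules so they are restored at boot
sudo netfilter-persistent save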
I am new to Kubernetes. I have an issue with the pods. When I run the command:
kubectl get pods
Result:
NAME READY STATUS RESTARTS AGE
mysql-apim-db-1viwg 1/1 Running 1 20h
mysql-govdb-qioee 1/1 Running 1 20h
mysql-userdb-l8q8c 1/1 Running 0 20h
wso2am-default-813fy 0/1 ImagePullBackOff 0 20h
Due to an issue with the "wso2am-default-813fy" pod, I need to restart it. Any suggestions?
In case you don't have the YAML file:
kubectl get pod PODNAME -n NAMESPACE -o yaml | kubectl replace --force -f -
Usually in the case of "ImagePullBackOff", the pull is retried after a few seconds/minutes. In case you want to retry manually, you can delete the old pod and recreate it. The one-line command to delete and recreate the pod would be:
kubectl replace --force -f <yml_file_describing_pod>
$ kubectl replace --force -f <resource-file>
If all goes well, you should see something like:
<resource-type> <resource-name> deleted
<resource-type> <resource-name> replaced
Details of this can be found in the Kubernetes documentation, on the "manage-deployment" and kubectl cheat sheet pages at the time of writing.
If the Pod is part of a Deployment or Service, deleting it will restart the Pod and, potentially, place it onto another node:
$ kubectl delete po $POD_NAME
replace it if it's an individual Pod:
$ kubectl get po -n $namespace $POD_NAME -o yaml | kubectl replace -f -
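Another option when the Pod belongs to a Deployment is to force fresh Pods by scaling down and back up; a sketch with a placeholder name:
# <deployment_name> is a placeholder; use the Deployment owning the pod
kubectl scale deployment <deployment_name> --replicas=0
kubectl scale deployment <deployment_name> --replicas=1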
Try deleting the pod; it will then try to pull the image again.
kubectl delete pod <pod_name> -n <namespace_name>
First try to see what's wrong with the pod:
kubectl logs -p <your_pod>
In my case it was a problem with the YAML file.
So, I needed to correct the configuration file and replace it:
kubectl replace --force -f <yml_file_describing_pod>
Most probably the ImagePullBackOff issue is due to either the image not being present or an issue with the pod YAML file.
What I would do is this:
kubectl get pod -n $namespace $POD_NAME -o yaml --export > pod.yaml
kubectl apply -f pod.yaml
I would also look at pod.yaml to see why the earlier pod didn't work.
There is also the possibility that the pull policy is not defined, or that Kubernetes is configured to pull from the hub but fails due to network issues. Try setting up a local secure registry and executing a pull; that should work.
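To separate registry or network problems from pod-spec problems, you can test the pull directly on a node, outside Kubernetes; the image reference below is a placeholder taken from your pod spec:
# run on the node where the pod is scheduled; replace with the image from the pod spec
docker pull <registry>/<image>:<tag>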