ImagePullBackOff with Fluxcd on Private Autopilot GKE cluster - gke-networking

I'm starting with a new project, the default VPC, and a fork of the fluxcd/flux2-kustomize-helm-example GitHub repository.
When I attempted to flux bootstrap into a clean new PRIVATE Autopilot K8s cluster, nothing became available (see below). The pods were stuck at ImagePullBackOff and the log traces looked like everything was in airplane mode.
I suspect I need to open up Cloud NAT access to ghcr.io/fluxcd/helm-controller, github.com/fluxcd, et al., unless there is a fluxcd mirror within gcr.io.
NAME                                           READY   STATUS             RESTARTS   AGE
pod/helm-controller-57ff7dd7b5-nnpm8           0/1     ImagePullBackOff   0          4m50s
pod/kustomize-controller-9f9bf46d9-wzcdr       0/1     ImagePullBackOff   0          4m50s
pod/notification-controller-64496c6d67-g6wpx   0/1     ImagePullBackOff   0          4m50s
pod/source-controller-7467658dcb-t6bsp         0/1     ImagePullBackOff   0          4m50s

NAME                              TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)   AGE
service/notification-controller   ClusterIP   10.42.1.103   <none>        80/TCP    4m51s
service/source-controller         ClusterIP   10.42.3.58    <none>        80/TCP    4m51s
service/webhook-receiver          ClusterIP   10.42.1.217   <none>        80/TCP    4m51s

NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/helm-controller           0/1     1            0           4m51s
deployment.apps/kustomize-controller      0/1     1            0           4m51s
deployment.apps/notification-controller   0/1     1            0           4m51s
deployment.apps/source-controller         0/1     1            0           4m51s

NAME                                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/helm-controller-57ff7dd7b5           1         1         0       4m50s
replicaset.apps/kustomize-controller-9f9bf46d9       1         1         0       4m50s
replicaset.apps/notification-controller-64496c6d67   1         1         0       4m50s
replicaset.apps/source-controller-7467658dcb         1         1         0       4m50s
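Before wiring up Cloud NAT, it can help to confirm which registries the stuck pods actually pull from. The following is a minimal sketch, not an official tool: it extracts the registry host from an image reference and flags anything that is not Google-hosted. The helper names, the list of Google suffixes, and the gcr.io example image are assumptions made for illustration, and it assumes Private Google Access is enabled on the subnet so that Google-hosted registries remain reachable from private nodes.

```python
def registry_host(image):
    """Return the registry host of a container image reference."""
    first, sep, _rest = image.partition("/")
    # Docker convention: the first path component is a registry only if it
    # looks like a hostname (contains "." or ":") or is "localhost".
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"


# Assumption: registries under these suffixes are reachable from private
# nodes through Private Google Access; everything else needs internet egress.
GOOGLE_SUFFIXES = ("gcr.io", "pkg.dev")


def needs_internet_egress(image):
    """True if pulling the image would require a route to the public internet."""
    host = registry_host(image).split(":")[0]  # drop an optional port
    return not any(host == s or host.endswith("." + s) for s in GOOGLE_SUFFIXES)


images = [
    "ghcr.io/fluxcd/helm-controller",      # from the question
    "gcr.io/example-project/example-app",  # hypothetical, for contrast
]
for image in images:
    verdict = "needs egress (e.g. Cloud NAT)" if needs_internet_egress(image) else "Google-hosted"
    print(f"{image}: {verdict}")
```

If every failing image resolves to a non-Google registry such as ghcr.io, that supports the suspicion that the private nodes simply have no path to the internet.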

Knative Parallel and Sequence demo fails

I am trying out Knative Sequence/Parallel flows.
I started with the Sequence example from the official website for release v1.1.
I created the steps, the Sequence, and the PingSource as described in the documentation, but the PingSource failed because the Sequence was not up.
The Sequence reports the exception below:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning UpdateFailed 8s (x6 over 8s) sequence-controller Failed to update status for "sequence": Sequence.flows.knative.dev "sequence" is invalid: [status.channelStatuses: Invalid value: "null": status.channelStatuses in body must be of type array: "null", status.subscriptionStatuses: Invalid value: "null": status.subscriptionStatuses in body must be of type array: "null"]
Warning UpdateFailed 3s (x5 over 8s) sequence-controller Failed to update status for "sequence": Sequence.flows.knative.dev "sequence" is invalid: [status.subscriptionStatuses: Invalid value: "null": status.subscriptionStatuses in body must be of type array: "null", status.channelStatuses: Invalid value: "null": status.channelStatuses in body must be of type array: "null"]
But the steps are running fine:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/first ExternalName <none> first.varadhi.example.com 80/TCP 9m46s
service/first-00001 ClusterIP 10.96.116.201 <none> 80/TCP 9m51s
service/first-00001-private ClusterIP 10.96.155.146 <none> 80/TCP,9090/TCP,9091/TCP,8022/TCP,8012/TCP 9m51s
service/second ExternalName <none> second.varadhi.example.com 80/TCP 9m45s
service/second-00001 ClusterIP 10.96.208.230 <none> 80/TCP 9m51s
service/second-00001-private ClusterIP 10.96.171.83 <none> 80/TCP,9090/TCP,9091/TCP,8022/TCP,8012/TCP 9m51s
service/third ExternalName <none> third.varadhi.example.com 80/TCP 9m45s
service/third-00001 ClusterIP 10.96.131.110 <none> 80/TCP 9m51s
service/third-00001-private ClusterIP 10.96.55.219 <none> 80/TCP,9090/TCP,9091/TCP,8022/TCP,8012/TCP 9m51s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/first-00001-deployment 0/0 0 0 9m52s
deployment.apps/second-00001-deployment 0/0 0 0 9m52s
deployment.apps/third-00001-deployment 0/0 0 0 9m52s
NAME DESIRED CURRENT READY AGE
replicaset.apps/first-00001-deployment-594dc84cb8 0 0 0 9m52s
replicaset.apps/second-00001-deployment-79d9f8b7b8 0 0 0 9m52s
replicaset.apps/third-00001-deployment-7479456fdf 0 0 0 9m51s
NAME URL AGE READY REASON
channel.messaging.knative.dev/varadhi-inmem-channel 26h Unknown NewObservedGenFailure
NAME URL LATESTCREATED LATESTREADY READY REASON
service.serving.knative.dev/first http://first.varadhi.example.com first-00001 first-00001 Unknown IngressNotConfigured
service.serving.knative.dev/second http://second.varadhi.example.com second-00001 second-00001 Unknown IngressNotConfigured
service.serving.knative.dev/third http://third.varadhi.example.com third-00001 third-00001 Unknown IngressNotConfigured
NAME CONFIG NAME K8S SERVICE NAME GENERATION READY REASON ACTUAL REPLICAS DESIRED REPLICAS
revision.serving.knative.dev/first-00001 first 1 True 0 0
revision.serving.knative.dev/second-00001 second 1 True 0 0
revision.serving.knative.dev/third-00001 third 1 True 0 0
NAME LATESTCREATED LATESTREADY READY REASON
configuration.serving.knative.dev/first first-00001 first-00001 True
configuration.serving.knative.dev/second second-00001 second-00001 True
configuration.serving.knative.dev/third third-00001 third-00001 True
NAME URL READY REASON
route.serving.knative.dev/first http://first.varadhi.example.com Unknown IngressNotConfigured
route.serving.knative.dev/second http://second.varadhi.example.com Unknown IngressNotConfigured
route.serving.knative.dev/third http://third.varadhi.example.com Unknown IngressNotConfigured
After spending ample time on the Knative Sequence, I decided to try out the Knative Parallel.
I followed the official Parallel documentation for v1.1 and used the multiple-branches example.
I created the filters, transformers, Parallel, and PingSource, but here too the Parallel didn't come up and reports the exception below:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning UpdateFailed 6m37s (x18 over 17m) parallel-controller Failed to update status for "odd-even-parallel": Parallel.flows.knative.dev "odd-even-parallel" is invalid: status.branchStatuses: Invalid value: "null": status.branchStatuses in body must be of type array: "null"
Here also parallel and sequence are up and running fine.
Has anyone faced similar issues, or am I missing something from the official documentation?
Environment info:
Using Kind Cluster
Using local docker registry (have bypassed digest check on images)
namespace : varadhi
Serving and Eventing CRDs are from Knative v1.1
Edit #1
I have not created any channels explicitly, and I do not see any channels being created implicitly by the controller either:
kubectl get channel -n varadhi
No resources found in varadhi namespace.
Also my default channel is InMemoryChannel
anil.gowda#faas-dev-kafka-8420816:~/knative$ kubectl get configmaps -n knative-eventing default-ch-webhook -o yaml
apiVersion: v1
data:
  default-ch-config: |
    clusterDefault:
      apiVersion: messaging.knative.dev/v1
      kind: InMemoryChannel
    namespaceDefaults:
      varadhi:
        apiVersion: messaging.knative.dev/v1
        kind: InMemoryChannel
kind: ConfigMap
Parallel :
Example Used : https://github.com/knative/docs/tree/main/code-samples/eventing/parallel/multiple-branches
Status
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/even-filter ExternalName <none> even-filter.varadhi.example.com 80/TCP 4d19h
service/even-filter-00001 ClusterIP 10.96.85.252 <none> 80/TCP 4d19h
service/even-filter-00001-private ClusterIP 10.96.98.109 <none> 80/TCP,9090/TCP,9091/TCP,8022/TCP,8012/TCP 4d19h
service/even-transformer ExternalName <none> even-transformer.varadhi.example.com 80/TCP 4d19h
service/even-transformer-00001 ClusterIP 10.96.152.53 <none> 80/TCP 4d19h
service/even-transformer-00001-private ClusterIP 10.96.130.58 <none> 80/TCP,9090/TCP,9091/TCP,8022/TCP,8012/TCP 4d19h
service/event-display ExternalName <none> event-display.varadhi.example.com 80/TCP 4d19h
service/event-display-00001 ClusterIP 10.96.237.175 <none> 80/TCP 4d19h
service/event-display-00001-private ClusterIP 10.96.81.3 <none> 80/TCP,9090/TCP,9091/TCP,8022/TCP,8012/TCP 4d19h
service/odd-filter ExternalName <none> odd-filter.varadhi.example.com 80/TCP 4d19h
service/odd-filter-00001 ClusterIP 10.96.84.239 <none> 80/TCP 4d19h
service/odd-filter-00001-private ClusterIP 10.96.16.17 <none> 80/TCP,9090/TCP,9091/TCP,8022/TCP,8012/TCP 4d19h
service/odd-transformer ExternalName <none> odd-transformer.varadhi.example.com 80/TCP 4d19h
service/odd-transformer-00001 ClusterIP 10.96.61.11 <none> 80/TCP 4d19h
service/odd-transformer-00001-private ClusterIP 10.96.203.185 <none> 80/TCP,9090/TCP,9091/TCP,8022/TCP,8012/TCP 4d19h
service/odd-transformer-00002 ClusterIP 10.96.115.147 <none> 80/TCP 4d19h
service/odd-transformer-00002-private ClusterIP 10.96.235.117 <none> 80/TCP,9090/TCP,9091/TCP,8022/TCP,8012/TCP 4d19h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/even-filter-00001-deployment 0/0 0 0 4d19h
deployment.apps/even-transformer-00001-deployment 0/0 0 0 4d19h
deployment.apps/event-display-00001-deployment 0/0 0 0 4d19h
deployment.apps/odd-filter-00001-deployment 0/0 0 0 4d19h
deployment.apps/odd-transformer-00001-deployment 0/0 0 0 4d19h
deployment.apps/odd-transformer-00002-deployment 0/0 0 0 4d19h
NAME DESIRED CURRENT READY AGE
replicaset.apps/even-filter-00001-deployment-6b7bdd866f 0 0 0 4d19h
replicaset.apps/even-transformer-00001-deployment-666bf9d776 0 0 0 4d19h
replicaset.apps/event-display-00001-deployment-758c9f7595 0 0 0 4d19h
replicaset.apps/odd-filter-00001-deployment-c86bd4799 0 0 0 4d19h
replicaset.apps/odd-transformer-00001-deployment-6bf46bc88f 0 0 0 4d19h
replicaset.apps/odd-transformer-00002-deployment-5c5f7b8b75 0 0 0 4d19h
NAME URL AGE READY REASON
parallel.flows.knative.dev/odd-even-parallel 4d19h
NAME URL AGE READY REASON
NAME LATESTCREATED LATESTREADY READY REASON
configuration.serving.knative.dev/even-filter even-filter-00001 even-filter-00001 True
configuration.serving.knative.dev/even-transformer even-transformer-00001 even-transformer-00001 True
configuration.serving.knative.dev/event-display event-display-00001 event-display-00001 True
configuration.serving.knative.dev/odd-filter odd-filter-00001 odd-filter-00001 True
configuration.serving.knative.dev/odd-transformer odd-transformer-00002 odd-transformer-00002 True
NAME URL READY REASON
route.serving.knative.dev/even-filter http://even-filter.varadhi.example.com Unknown IngressNotConfigured
route.serving.knative.dev/even-transformer http://even-transformer.varadhi.example.com Unknown IngressNotConfigured
route.serving.knative.dev/event-display http://event-display.varadhi.example.com Unknown IngressNotConfigured
route.serving.knative.dev/odd-filter http://odd-filter.varadhi.example.com Unknown IngressNotConfigured
route.serving.knative.dev/odd-transformer http://odd-transformer.varadhi.example.com Unknown IngressNotConfigured
NAME URL LATESTCREATED LATESTREADY READY REASON
service.serving.knative.dev/even-filter http://even-filter.varadhi.example.com even-filter-00001 even-filter-00001 Unknown IngressNotConfigured
service.serving.knative.dev/even-transformer http://even-transformer.varadhi.example.com even-transformer-00001 even-transformer-00001 Unknown IngressNotConfigured
service.serving.knative.dev/event-display http://event-display.varadhi.example.com event-display-00001 event-display-00001 Unknown IngressNotConfigured
service.serving.knative.dev/odd-filter http://odd-filter.varadhi.example.com odd-filter-00001 odd-filter-00001 Unknown IngressNotConfigured
service.serving.knative.dev/odd-transformer http://odd-transformer.varadhi.example.com odd-transformer-00002 odd-transformer-00002 Unknown IngressNotConfigured
NAME CONFIG NAME K8S SERVICE NAME GENERATION READY REASON ACTUAL REPLICAS DESIRED REPLICAS
revision.serving.knative.dev/even-filter-00001 even-filter 1 True 0 0
revision.serving.knative.dev/even-transformer-00001 even-transformer 1 True 0 0
revision.serving.knative.dev/event-display-00001 event-display 1 True 0 0
revision.serving.knative.dev/odd-filter-00001 odd-filter 1 True 0 0
revision.serving.knative.dev/odd-transformer-00001 odd-transformer 1 False ImagePullBackOff 0
revision.serving.knative.dev/odd-transformer-00002 odd-transformer 2 True 0 0
NAME SINK SCHEDULE AGE READY REASON
pingsource.sources.knative.dev/ping-source */1 * * * * 4d19h False NotFound
Edit #2
I installed a few more Eventing CRDs (eventing.yaml).
I can see a few changes now.
Channels are getting created and their status is True:
NAME URL AGE READY REASON
inmemorychannel.messaging.knative.dev/odd-even-parallel-kn-parallel http://odd-even-parallel-kn-parallel-kn-channel.varadhi.svc.cluster.local 73m True
inmemorychannel.messaging.knative.dev/odd-even-parallel-kn-parallel-0 http://odd-even-parallel-kn-parallel-0-kn-channel.varadhi.svc.cluster.local 73m True
inmemorychannel.messaging.knative.dev/odd-even-parallel-kn-parallel-1 http://odd-even-parallel-kn-parallel-1-kn-channel.varadhi.svc.cluster.local 73m True
inmemorychannel.messaging.knative.dev/sequence-kn-sequence-0 http://sequence-kn-sequence-0-kn-channel.varadhi.svc.cluster.local 70m True
inmemorychannel.messaging.knative.dev/sequence-kn-sequence-1 http://sequence-kn-sequence-1-kn-channel.varadhi.svc.cluster.local 50m True
inmemorychannel.messaging.knative.dev/sequence-kn-sequence-2 http://sequence-kn-sequence-2-kn-channel.varadhi.svc.cluster.local 50m True
But the Sequence and Parallel are not Ready yet; they report SubscriptionsNotReady:
$ kubectl -n varadhi get sequence
NAME URL AGE READY REASON
sequence http://sequence-kn-sequence-0-kn-channel.varadhi.svc.cluster.local 71m Unknown SubscriptionsNotReady
$ kubectl -n varadhi get parallel
NAME URL AGE READY REASON
odd-even-parallel http://odd-even-parallel-kn-parallel-kn-channel.varadhi.svc.cluster.local 5d21h False SubscriptionsNotReady
On debugging further, I could see the message below in the Sequence:
Ready:
Last Transition Time: 2022-01-12T08:19:17Z
Message: Failed to get subscription status: subscription "sequence-kn-sequence-0" not present in channel "sequence-kn-sequence-0" subscriber's list
Reason: SubscriptionNotMarkedReadyByChannel
Status: Unknown
Type: Ready
Subscription:
Name: sequence-kn-sequence-0
Namespace: varadhi
And looking into the channel, it gives the message below:
Last Transition Time: 2022-01-12T08:19:17Z
Message: The status of Dispatcher Deployment is False: MinimumReplicasUnavailable : Deployment does not have minimum availability.
Reason: DispatcherDeploymentFalse
Severity: Info
Status: False
Type: DispatcherReady
It looks like the channel is not able to fetch the service deployment status.
I also installed Kourier, but I cannot see an external IP. Note that I am running my own kind cluster for this.
$ kubectl --namespace kourier-system get service kourier
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kourier LoadBalancer 10.96.112.238 <pending> 80:32002/TCP,443:32733/TCP 73m
$ kubectl get pods -n knative-serving
NAME READY STATUS RESTARTS AGE
activator-d4cd7dfd5-mcxsl 1/1 Running 0 7d18h
autoscaler-69689d8b7-rx75h 1/1 Running 0 7d18h
controller-766f74d9f8-fwdk9 1/1 Running 0 7d18h
domain-mapping-7dbbb5c7d-xk5m5 1/1 Running 0 7d18h
domainmapping-webhook-747f79dbdc-qm5nn 1/1 Running 0 7d18h
net-kourier-controller-5657664b99-zr9cj 1/1 Running 0 73m
webhook-8f6866966-8z8tt 1/1 Running 0 7d18h
A few changes in the services after installing Kourier:
$ kubectl get service.serving.knative.dev -n varadhi
NAME URL LATESTCREATED LATESTREADY READY REASON
even-filter http://even-filter.varadhi.example.com even-filter-00001 even-filter-00001 Unknown
even-transformer http://even-transformer.varadhi.example.com even-transformer-00001 even-transformer-00001 Unknown
event-display http://event-display.varadhi.example.com event-display-00001 event-display-00001 Unknown
first http://first.varadhi.example.com first-00001 first-00001 Unknown
odd-filter http://odd-filter.varadhi.example.com odd-filter-00001 odd-filter-00001 Unknown
odd-transformer http://odd-transformer.varadhi.example.com odd-transformer-00002 odd-transformer-00002 Unknown
second http://second.varadhi.example.com second-00001 second-00001 Unknown
third http://third.varadhi.example.com third-00001 third-00001 Unknown
The IngressNotConfigured status goes away.
For the Sequence example, I think there might be something wrong with your default channel configuration.
Sequences create channels to communicate between every step, so three channels are expected to be created for that example:
$ kubectl get channel
inmemorychannel.messaging.knative.dev/sequence-kn-sequence-0 http://sequence-kn-sequence-0-kn-channel.default.svc.cluster.local 56s True
inmemorychannel.messaging.knative.dev/sequence-kn-sequence-1 http://sequence-kn-sequence-1-kn-channel.default.svc.cluster.local 56s True
inmemorychannel.messaging.knative.dev/sequence-kn-sequence-2 http://sequence-kn-sequence-2-kn-channel.default.svc.cluster.local 56s True
Can you check what channels are created for you and their status?
Can you also make sure your default channel is properly set up? See:
https://knative.dev/docs/eventing/channels/channel-types-defaults/
Can you also post here the Parallel you are using?
The status is up to the controller to fill in, so that one sounds like an issue.
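To make the UpdateFailed message easier to read: it says the status the controller tried to write contained null where the installed CRD schema demands an array. The sketch below is a toy check that mimics that one rule. It is not the API server's real validation code, and the function name is made up for illustration. One possible reason for such a mismatch (an assumption, not something confirmed here) is that the CRDs and the controller come from different releases.

```python
import json


def check_array_fields(status, fields):
    """Return an error string for every listed status field that is not a list."""
    errors = []
    for name in fields:
        value = status.get(name)
        if not isinstance(value, list):
            # json.dumps(None) gives "null", matching the wording in the event.
            errors.append(
                f'status.{name} in body must be of type array: "{json.dumps(value)}"'
            )
    return errors


# What the failing update effectively looked like: null instead of [].
bad_status = json.loads('{"channelStatuses": null, "subscriptionStatuses": null}')
for err in check_array_fields(bad_status, ["channelStatuses", "subscriptionStatuses"]):
    print(err)

# An empty array passes the same check.
ok_status = {"channelStatuses": [], "subscriptionStatuses": []}
print(check_array_fields(ok_status, ["channelStatuses", "subscriptionStatuses"]))
```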
In both of your outputs, for Sequence and Parallel, there is an issue with the services:
service.serving.knative.dev/even-filter http://even-filter.varadhi.example.com even-filter-00001 even-filter-00001 Unknown IngressNotConfigured
Although Knative Eventing does not depend on Serving, the examples for Sequence/Parallel require it to be properly installed because they use serverless services:
sequence example: event-display, first, second and third
parallel example: event-display, even-filter, even-transformer, odd-filter and odd-transformer.
Eventing can use regular Kubernetes services instead of Knative services, but I think the best way to make the examples work for you is making sure Knative Serving works as expected.
Did you configure a network provider for Knative Serving?
If you did not, can you go through this step:
https://knative.dev/docs/install/serving/install-serving-with-yaml/#install-a-networking-layer
If you are in doubt about which one to choose I would go for Kourier, which is maintained by the Knative project.

Kubernetes calico-node issue - running 0/1

Hi, I have two virtual machines on a local server running Ubuntu 20.04, and I want to build a small cluster for my microservices. I ran the following steps to set up my cluster, but I have an issue with the calico-node pods: they are Running with 0/1 ready.
master.domain.com
ubuntu 20.04
docker --version = Docker version 20.10.7, build f0df350
kubectl version = Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.4", GitCommit:"e87da0bd6e03ec3fea7933c4b5263d151aafd07c", GitTreeState:"clean", BuildDate:"2021-02-18T16:12:00Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
worker.domain.com
ubuntu 20.04
docker --version = Docker version 20.10.2, build 20.10.2-0ubuntu1~20.04.2
kubectl version = Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.4", GitCommit:"e87da0bd6e03ec3fea7933c4b5263d151aafd07c", GitTreeState:"clean", BuildDate:"2021-02-18T16:12:00Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
STEP-1
On the master.domain.com virtual machine I ran the following commands:
sudo kubeadm init --pod-network-cidr=192.168.0.0/16
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
curl https://docs.projectcalico.org/manifests/calico.yaml -O
kubectl apply -f calico.yaml
kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system calico-kube-controllers-7f4f5bf95d-gnll8 1/1 Running 0 38s 192.168.29.195 master <none> <none>
kube-system calico-node-7zmtm 1/1 Running 0 38s 195.251.3.255 master <none> <none>
kube-system coredns-74ff55c5b-ltn9g 1/1 Running 0 3m49s 192.168.29.193 master <none> <none>
kube-system coredns-74ff55c5b-nkhzf 1/1 Running 0 3m49s 192.168.29.194 master <none> <none>
kube-system etcd-kubem 1/1 Running 0 4m6s 195.251.3.255 master <none> <none>
kube-system kube-apiserver-kubem 1/1 Running 0 4m6s 195.251.3.255 master <none> <none>
kube-system kube-controller-manager-kubem 1/1 Running 0 4m6s 195.251.3.255 master <none> <none>
kube-system kube-proxy-2cr2x 1/1 Running 0 3m49s 195.251.3.255 master <none> <none>
kube-system kube-scheduler-kubem 1/1 Running 0 4m6s 195.251.3.255 master <none> <none>
STEP-2
On the worker.domain.com virtual machine I ran the following command:
sudo kubeadm join 195.251.3.255:6443 --token azuist.xxxxxxxxxxx --discovery-token-ca-cert-hash sha256:xxxxxxxxxxxxxxxxxxxxxxxxxxxx
STEP-3
On the master.domain.com virtual machine I ran the following commands:
kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system calico-kube-controllers-7f4f5bf95d-gnll8 1/1 Running 0 6m37s 192.168.29.195 master <none> <none>
kube-system calico-node-7zmtm 0/1 Running 0 6m37s 195.251.3.255 master <none> <none>
kube-system calico-node-wccnb 0/1 Running 0 2m19s 195.251.3.230 worker <none> <none>
kube-system coredns-74ff55c5b-ltn9g 1/1 Running 0 9m48s 192.168.29.193 master <none> <none>
kube-system coredns-74ff55c5b-nkhzf 1/1 Running 0 9m48s 192.168.29.194 master <none> <none>
kube-system etcd-kubem 1/1 Running 0 10m 195.251.3.255 master <none> <none>
kube-system kube-apiserver-kubem 1/1 Running 0 10m 195.251.3.255 master <none> <none>
kube-system kube-controller-manager-kubem 1/1 Running 0 10m 195.251.3.255 master <none> <none>
kube-system kube-proxy-2cr2x 1/1 Running 0 9m48s 195.251.3.255 master <none> <none>
kube-system kube-proxy-kxw4m 1/1 Running 0 2m19s 195.251.3.230 worker <none> <none>
kube-system kube-scheduler-kubem 1/1 Running 0 10m 195.251.3.255 master <none> <none>
kubectl logs -n kube-system calico-node-7zmtm
...
...
2021-06-20 17:10:25.064 [INFO][56] monitor-addresses/startup.go 774: Using autodetected IPv4 address on interface eth0: 195.251.3.255/24
2021-06-20 17:10:34.862 [INFO][53] felix/summary.go 100: Summarising 11 dataplane reconciliation loops over 1m3.5s: avg=4ms longest=13ms ()
kubectl logs -n kube-system calico-node-wccnb
...
...
2021-06-20 17:10:59.818 [INFO][55] felix/summary.go 100: Summarising 8 dataplane reconciliation loops over 1m3.6s: avg=3ms longest=13ms (resync-filter-v4,resync-nat-v4,resync-raw-v4)
2021-06-20 17:11:05.994 [INFO][51] monitor-addresses/startup.go 774: Using autodetected IPv4 address on interface br-9a88318dda68: 172.21.0.1/16
As you can see, both calico-node pods show 0/1 Running. Why?
Any idea how to solve this problem?
Thank you
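One detail in the logs above is worth checking: on the worker, calico-node autodetected its address on br-9a88318dda68 (172.21.0.1/16), while the node itself is listed as 195.251.3.230. The sketch below only automates that comparison; the function names are made up for illustration, and whether this mismatch is the actual cause of the 0/1 state is an assumption, not something the logs prove.

```python
import ipaddress
import re

# Matches the calico-node startup line that reports the autodetected address.
AUTODETECT_RE = re.compile(
    r"Using autodetected IPv4 address on interface (?P<iface>\S+): (?P<cidr>\S+)"
)


def autodetected_address(log_text):
    """Return (interface, ip) from the last autodetection line, or None."""
    matches = list(AUTODETECT_RE.finditer(log_text))
    if not matches:
        return None
    last = matches[-1]
    ip = ipaddress.ip_interface(last.group("cidr")).ip
    return last.group("iface"), str(ip)


def matches_node_ip(log_text, node_ip):
    """True if the autodetected address equals the node IP shown by kubectl."""
    found = autodetected_address(log_text)
    return found is not None and found[1] == node_ip


# Log lines copied from the question.
worker_log = (
    "2021-06-20 17:11:05.994 [INFO][51] monitor-addresses/startup.go 774: "
    "Using autodetected IPv4 address on interface br-9a88318dda68: 172.21.0.1/16"
)
master_log = (
    "2021-06-20 17:10:25.064 [INFO][56] monitor-addresses/startup.go 774: "
    "Using autodetected IPv4 address on interface eth0: 195.251.3.255/24"
)

print(autodetected_address(worker_log))
print("worker matches node IP:", matches_node_ip(worker_log, "195.251.3.230"))
print("master matches node IP:", matches_node_ip(master_log, "195.251.3.255"))
```

If the worker really is picking a Docker bridge, Calico's IP_AUTODETECTION_METHOD setting on the calico-node DaemonSet is the usual knob to look at.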
I got exactly the same issue.
CentOS 8
kubectl kubeadm kubelet v1.22.3
docker-ce version 20.10.9
The only difference worth mentioning is that I had to comment out the line
- --port=0
in /etc/kubernetes/manifests/kube-scheduler.yaml; otherwise the scheduler is reported as unhealthy by
kubectl get componentstatuses
The Kubernetes API is advertised on a public IP address.
The public IP address of the control plane node is replaced with 42.42.42.42 in the kubectl output;
the public IP address of the worker node is replaced with 21.21.21.21.
The public domain name (which is also the hostname of the control plane node) is replaced with public-domain.work.
>kubectl get pods -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-kube-controllers-5d995d45d6-rk9cq 1/1 Running 0 76m 192.168.231.193 public-domain.work <none> <none>
calico-node-qstxm 0/1 Running 0 76m 42.42.42.42 public-domain.work <none> <none>
calico-node-zmz5s 0/1 Running 0 75m 21.21.21.21 node1.public-domain.work <none> <none>
coredns-78fcd69978-5xsb2 1/1 Running 0 81m 192.168.231.194 public-domain.work <none> <none>
coredns-78fcd69978-q29fn 1/1 Running 0 81m 192.168.231.195 public-domain.work <none> <none>
etcd-public-domain.work 1/1 Running 3 82m 42.42.42.42 public-domain.work <none> <none>
kube-apiserver-public-domain.work 1/1 Running 3 82m 42.42.42.42 public-domain.work <none> <none>
kube-controller-manager-public-domain.work 1/1 Running 2 82m 42.42.42.42 public-domain.work <none> <none>
kube-proxy-5kkks 1/1 Running 0 81m 42.42.42.42 public-domain.work <none> <none>
kube-proxy-xsc66 1/1 Running 0 75m 21.21.21.21 node1.public-domain.work <none> <none>
kube-scheduler-public-domain.work 1/1 Running 1 (78m ago) 78m 42.42.42.42 public-domain.work <none> <none>
>kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
public-domain.work Ready control-plane,master 4h56m v1.22.3 42.42.42.42 <none> CentOS Stream 8 4.18.0-348.el8.x86_64 docker://20.10.9
node1.public-domain.work Ready <none> 4h50m v1.22.3 21.21.21.21 <none> CentOS Stream 8 4.18.0-348.el8.x86_64 docker://20.10.10
>kubectl logs -n kube-system calico-node-qstxm
2021-11-09 15:27:38.996 [INFO][86] felix/int_dataplane.go 1539: Received interface addresses update msg=&intdataplane.ifaceAddrsUpdate{Name:"eth1", Addrs:set.mapSet{}}
2021-11-09 15:27:38.996 [INFO][86] felix/hostip_mgr.go 85: Interface addrs changed. update=&intdataplane.ifaceAddrsUpdate{Name:"eth1", Addrs:set.mapSet{}}
2021-11-09 15:27:38.997 [INFO][86] felix/ipsets.go 130: Queueing IP set for creation family="inet" setID="this-host" setType="hash:ip"
2021-11-09 15:27:38.998 [INFO][86] felix/ipsets.go 785: Doing full IP set rewrite family="inet" numMembersInPendingReplace=7 setID="this-host"
2021-11-09 15:27:40.198 [INFO][86] felix/iface_monitor.go 201: Netlink address update. addr="here:is:some:ipv6:address:that:has:nothing:to:do:with:my:control:panel:server:public:ipv6" exists=true ifIndex=3
2021-11-09 15:27:40.198 [INFO][86] felix/int_dataplane.go 1071: Linux interface addrs changed. addrs=set.mapSet{"fe80::9132:a0df:82d8:e26c":set.empty{}} ifaceName="eth1"
2021-11-09 15:27:40.198 [INFO][86] felix/int_dataplane.go 1539: Received interface addresses update msg=&intdataplane.ifaceAddrsUpdate{Name:"eth1", Addrs:set.mapSet{"here:is:some:ipv6:address:that:has:nothing:to:do:with:my:control:panel:server:public:ipv6":set.empty{}}}
2021-11-09 15:27:40.199 [INFO][86] felix/hostip_mgr.go 85: Interface addrs changed. update=&intdataplane.ifaceAddrsUpdate{Name:"eth1", Addrs:set.mapSet{"here:is:some:ipv6:address:that:has:nothing:to:do:with:my:control:panel:server:public:ipv6":set.empty{}}}
2021-11-09 15:27:40.199 [INFO][86] felix/ipsets.go 130: Queueing IP set for creation family="inet" setID="this-host" setType="hash:ip"
2021-11-09 15:27:40.200 [INFO][86] felix/ipsets.go 785: Doing full IP set rewrite family="inet" numMembersInPendingReplace=7 setID="this-host"
2021-11-09 15:27:48.010 [INFO][81] monitor-addresses/startup.go 713: Using autodetected IPv4 address on interface eth0: 42.42.42.42/24
>kubectl logs -n kube-system calico-node-zmz5s
2021-11-09 15:25:56.669 [INFO][64] felix/int_dataplane.go 1071: Linux interface addrs changed. addrs=set.mapSet{} ifaceName="eth1"
2021-11-09 15:25:56.669 [INFO][64] felix/int_dataplane.go 1539: Received interface addresses update msg=&intdataplane.ifaceAddrsUpdate{Name:"eth1", Addrs:set.mapSet{}}
2021-11-09 15:25:56.669 [INFO][64] felix/hostip_mgr.go 85: Interface addrs changed. update=&intdataplane.ifaceAddrsUpdate{Name:"eth1", Addrs:set.mapSet{}}
2021-11-09 15:25:56.669 [INFO][64] felix/ipsets.go 130: Queueing IP set for creation family="inet" setID="this-host" setType="hash:ip"
2021-11-09 15:25:56.670 [INFO][64] felix/ipsets.go 785: Doing full IP set rewrite family="inet" numMembersInPendingReplace=7 setID="this-host"
2021-11-09 15:25:56.769 [INFO][64] felix/iface_monitor.go 201: Netlink address update. addr="here:is:some:ipv6:address:that:has:nothing:to:do:with:my:worknode:server:public:ipv6" exists=false ifIndex=3
2021-11-09 15:26:07.050 [INFO][64] felix/summary.go 100: Summarising 14 dataplane reconciliation loops over 1m1.7s: avg=5ms longest=11ms ()
2021-11-09 15:26:33.880 [INFO][59] monitor-addresses/startup.go 713: Using autodetected IPv4 address on interface eth0: 21.21.21.21/24
It seems the issue was that the BGP port (179/tcp) was closed by the firewall.
These commands on the master node solved it for me:
>firewall-cmd --add-port 179/tcp --zone=public --permanent
>firewall-cmd --reload
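To verify the fix from the other node, a plain TCP connect to port 179 is enough. This is a minimal sketch using only the standard library; the function name is made up for illustration, and the address in the usage comment is the placeholder IP from this post.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and "no route to host".
        return False


# Usage, run from the worker against the master (placeholder address):
#   tcp_port_open("42.42.42.42", 179)
```

It should return True in both directions once the firewall allows 179/tcp, since each calico-node peers with the others over BGP.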

Two coredns Pods in k8s cluster are in pending state

kube-system coredns-f68dcb75-f6smn 0/1 Pending 0 34m
kube-system coredns-f68dcb75-npc48 0/1 Pending 0 34m
kube-system etcd-master 1/1 Running 0 33m
kube-system kube-apiserver-master 1/1 Running 0 34m
kube-system kube-controller-manager-master 1/1 Running 0 33m
kube-system kube-flannel-ds-amd64-lngrx 1/1 Running 1 32m
kube-system kube-flannel-ds-amd64-qz2gn 1/1 Running 0 32m
kube-system kube-flannel-ds-amd64-w5lpc 1/1 Running 0 34m
kube-system kube-proxy-9l9nv 1/1 Running 0 32m
kube-system kube-proxy-hvd5g 1/1 Running 0 32m
kube-system kube-proxy-vdgth 1/1 Running 0 34m
kube-system kube-scheduler-master 1/1 Running 0 33m
I am using the latest k8s version: 1.16.0.
This is the command I am using to initialize the cluster:
kubeadm init --pod-network-cidr=10.244.0.0/16 --image-repository=<some-repo> --token=TOKEN --apiserver-advertise-address=<IP> --kubernetes-version=1.16.0
The current state of the cluster:
master NotReady master 42m v1.16.0
slave1 NotReady <none> 39m v1.16.0
slave2 NotReady <none> 39m v1.16.0
Please comment if you need any other info.
I think you need to wait for k8s v1.17.0 or update your current installation; this issue has been fixed upstream (see the original issue).

Kubernetes dial tcp myIP:10250: connect: no route to host

I have a Kubernetes cluster with 1 master and 3 worker nodes.
calico v3.7.3 kubernetes v1.16.0 installed via kubespray https://github.com/kubernetes-sigs/kubespray
Until now, I deployed all the pods without any problems.
Now I can't start a few pods (Ceph):
kubectl get all --namespace=ceph
NAME READY STATUS RESTARTS AGE
pod/ceph-cephfs-test 0/1 Pending 0 162m
pod/ceph-mds-665d849f4f-fzzwb 0/1 Pending 0 162m
pod/ceph-mon-744f6dc9d6-jtbgk 0/1 CrashLoopBackOff 24 162m
pod/ceph-mon-744f6dc9d6-mqwgb 0/1 CrashLoopBackOff 24 162m
pod/ceph-mon-744f6dc9d6-zthpv 0/1 CrashLoopBackOff 24 162m
pod/ceph-mon-check-6f474c97f-gjr9f 1/1 Running 0 162m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/ceph-mon ClusterIP None <none> 6789/TCP 162m
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/ceph-osd 0 0 0 0 0 node-type=storage 162m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/ceph-mds 0/1 1 0 162m
deployment.apps/ceph-mon 0/3 3 0 162m
deployment.apps/ceph-mon-check 1/1 1 1 162m
NAME DESIRED CURRENT READY AGE
replicaset.apps/ceph-mds-665d849f4f 1 1 0 162m
replicaset.apps/ceph-mon-744f6dc9d6 3 3 0 162m
replicaset.apps/ceph-mon-check-6f474c97f 1 1 1 162m
But the other ones are OK:
kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-6d57b44787-xlj89 1/1 Running 19 24d
calico-node-dwm47 1/1 Running 310 19d
calico-node-hhgzk 1/1 Running 15 24d
calico-node-tk4mp 1/1 Running 309 19d
calico-node-w7zvs 1/1 Running 312 19d
coredns-74c9d4d795-jrxjn 1/1 Running 0 2d23h
coredns-74c9d4d795-psf2v 1/1 Running 2 18d
dns-autoscaler-7d95989447-7kqsn 1/1 Running 10 24d
kube-apiserver-master 1/1 Running 4 24d
kube-controller-manager-master 1/1 Running 3 24d
kube-proxy-9bt8m 1/1 Running 2 19d
kube-proxy-cbrcl 1/1 Running 4 19d
kube-proxy-stj5g 1/1 Running 0 19d
kube-proxy-zql86 1/1 Running 0 19d
kube-scheduler-master 1/1 Running 3 24d
kubernetes-dashboard-7c547b4c64-6skc7 1/1 Running 591 24d
nginx-proxy-worker1 1/1 Running 2 19d
nginx-proxy-worker2 1/1 Running 0 19d
nginx-proxy-worker3 1/1 Running 0 19d
nodelocaldns-6t92x 1/1 Running 2 19d
nodelocaldns-kgm4t 1/1 Running 0 19d
nodelocaldns-xl8zg 1/1 Running 0 19d
nodelocaldns-xwlwk 1/1 Running 12 24d
tiller-deploy-8557598fbc-7f2w6 1/1 Running 0 131m
I use CentOS 7:
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
The error log:
Get https://10.2.67.203:10250/containerLogs/ceph/ceph-mon-744f6dc9d6-mqwgb/ceph-mon?tailLines=5000&timestamps=true: dial tcp 10.2.67.203:10250: connect: no route to host
Maybe someone has come across this and can help? I can provide any additional information.
Events from the pending pods:
Warning FailedScheduling 98s (x125 over 3h1m) default-scheduler 0/4 nodes are available: 4 node(s) didn't match node selector.
It seems that a firewall is blocking ingress traffic to port 10250 on the 10.2.67.203 node.
You can open the port by running the commands below (this assumes firewalld is installed; otherwise use the equivalent commands for your firewall):
sudo firewall-cmd --add-port=10250/tcp --permanent
sudo firewall-cmd --reload
sudo firewall-cmd --list-all # 10250/tcp should now be listed under ports
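To confirm that the change took effect, you could probe the port from the master. This is only a minimal sketch: it assumes bash (for its /dev/tcp redirection) and the coreutils timeout command are available, and port_open is a made-up helper name, not part of any tool.

```shell
# port_open HOST PORT [TIMEOUT_SECONDS]
# Returns 0 if a TCP connection to HOST:PORT can be opened, non-zero otherwise.
# The connection attempt runs in a child bash so that /dev/tcp is available
# and so that a hanging connect is cut off by `timeout`.
port_open() {
  local host="$1" port="$2" wait="${3:-3}"
  timeout "$wait" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}
```

For example, running `port_open 10.2.67.203 10250 && echo reachable || echo blocked` on the master uses the node IP from the error message. A non-zero result only says the TCP connection failed; it does not tell you whether the cause is the firewall, routing, or a stopped Kubelet.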
tl;dr: It looks like your cluster itself is fairly broken and should be repaired before looking at Ceph specifically.
Get https://10.2.67.203:10250/containerLogs/ceph/ceph-mon-744f6dc9d6-mqwgb/ceph-mon?tailLines=5000&timestamps=true: dial tcp 10.2.67.203:10250: connect: no route to host
10250 is the port that the Kubernetes API server uses to connect to a node's Kubelet to retrieve the logs.
This error indicates that the Kubernetes API server is unable to reach the node. This has nothing to do with your containers, pods or even your CNI network. "no route to host" indicates one of the following:
The host is unavailable
A network segmentation has occurred
The Kubelet is unable to answer the API server
Before addressing issues with the Ceph pods I would investigate why the Kubelet isn't reachable from the API server.
After you have solved the underlying network connectivity issues I would address the crash-looping Calico pods (You can see the logs of the previously executed containers by running kubectl logs -n kube-system calico-node-dwm47 -p).
Once you have both the underlying network and the pod network sorted I would address the issues with the Kubernetes Dashboard crash-looping, and finally, start to investigate why you are having issues deploying Ceph.
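To pick out the crash-looping pods quickly, you can filter on the RESTARTS column. This is a minimal sketch, assuming the default kubectl get pods column layout shown in the question (NAME READY STATUS RESTARTS AGE); the helper name high_restarts and the default threshold of 100 are arbitrary choices for illustration.

```shell
# high_restarts [THRESHOLD]
# Reads `kubectl get pods` output on stdin and prints "NAME RESTARTS" for
# every pod whose restart count is greater than THRESHOLD (default 100).
# NR > 1 skips the header row; "$4 + 0" forces a numeric comparison.
high_restarts() {
  awk -v limit="${1:-100}" 'NR > 1 && $4 + 0 > limit { print $1, $4 }'
}
```

Usage: `kubectl get pods -n kube-system | high_restarts`. Applied to the listing in the question, this would single out the three calico-node pods with around 310 restarts and the kubernetes-dashboard pod with 591.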

Pod does not respond properly

I have a local cluster (no cloud provider) made up of 3 VMs: the master and two nodes. I created a volume backed by NFS so that it can be reused if a pod dies and is rescheduled on another node, but I think some component is not working properly. To create the cluster I followed just this guide: kubernetes guide. After creating the cluster, this is the current state:
master#master-VirtualBox:~/Documents/KubeT/nfs$ sudo kubectl get pod --all-namespaces
[sudo] password for master:
NAMESPACE NAME READY STATUS RESTARTS AGE
default mysqlnfs3 1/1 Running 0 27m
kube-system etcd-master-virtualbox 1/1 Running 0 46m
kube-system kube-apiserver-master-virtualbox 1/1 Running 0 46m
kube-system kube-controller-manager-master-virtualbox 1/1 Running 0 46m
kube-system kube-dns-86f4d74b45-f6hpf 3/3 Running 0 47m
kube-system kube-flannel-ds-nffv6 1/1 Running 0 38m
kube-system kube-flannel-ds-rqw9v 1/1 Running 0 39m
kube-system kube-flannel-ds-s5wzn 1/1 Running 0 44m
kube-system kube-proxy-6j7p8 1/1 Running 0 38m
kube-system kube-proxy-7pj8d 1/1 Running 0 39m
kube-system kube-proxy-jqshs 1/1 Running 0 47m
kube-system kube-scheduler-master-virtualbox 1/1 Running 0 46m
master#master-VirtualBox:~/Documents/KubeT/nfs$ sudo kubectl get node
NAME STATUS ROLES AGE VERSION
host1-virtualbox Ready <none> 39m v1.10.2
host2-virtualbox Ready <none> 40m v1.10.2
master-virtualbox Ready master 48m v1.10.2
and this is the pod:
master#master-VirtualBox:~/Documents/KubeT/nfs$ sudo kubectl get pod
NAME READY STATUS RESTARTS AGE
mysqlnfs3 1/1 Running 0 29m
It is scheduled on host2. If I open a shell on host2 and run docker exec, the container works fine and the data is stored and retrieved, but when I try kubectl exec it does not work:
master#master-VirtualBox:~/Documents/KubeT/nfs$ sudo kubectl exec -it -n default mysqlnfs3 -- /bin/bash
error: unable to upgrade connection: pod does not exist
