Pod creation always stuck in ContainerCreating state - docker

I am trying to create a pod using Kubernetes with the following simple command:
kubectl run example --image=nginx
It runs and assigns the pod to the minion correctly, but the status always stays in ContainerCreating due to the following error. I have not set up GCR or GCloud on my machine, so I am not sure why it is trying to pull from there.
1h 29m 14s {kubelet centos-minion1} Warning FailedSync Error syncing pod, skipping:
failed to "StartContainer" for "POD" with ErrImagePull: "image pull failed
for gcr.io/google_containers/pause:2.0, this may be because there are no
credentials on this request. details: (unable to ping registry endpoint
https://gcr.io/v0/\nv2 ping attempt failed with error: Get https://gcr.io/v2/:
http: error connecting to proxy http://87.254.212.120:8080: dial tcp
87.254.212.120:8080: i/o timeout\n v1 ping attempt failed with error:
Get https://gcr.io/v1/_ping: http: error connecting to proxy
http://87.254.212.120:8080: dial tcp 87.254.212.120:8080: i/o timeout)

Kubernetes is trying to create a pause container for your pod; this container is used to create the pod's network namespace. See this question and its answers for more general information on the pause container.
To your specific error: Kubernetes tries to pull the pause container's image (which would be gcr.io/google_containers/pause:2.0, according to your error message) from the Google Container Registry (gcr.io). Your Docker engine apparently tries to connect to GCR using an HTTP proxy located at 87.254.212.120:8080, to which it cannot connect (i/o timeout).
To correct this error, either make sure that your HTTP proxy server is online and does not block HTTP requests to GCR, or (if you do have public Internet access) disable the proxy connection for your Docker engine. This is typically done using the http_proxy and https_proxy environment variables, which would have been set in /etc/sysconfig/docker or /etc/default/docker, depending on your Linux distribution.
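On CentOS, the proxy settings for the Docker daemon may also live in a systemd drop-in file. A minimal sketch of what to look for and adjust (the file path is the conventional one and the proxy address is taken from your error message; adapt both to your setup):

# /etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
# Either remove these lines to stop using the proxy entirely...
Environment="HTTP_PROXY=http://87.254.212.120:8080"
Environment="HTTPS_PROXY=http://87.254.212.120:8080"
# ...or keep the proxy but bypass it for registries you can reach directly:
Environment="NO_PROXY=localhost,127.0.0.1,gcr.io"

After editing, reload and restart the daemon so the change takes effect:

sudo systemctl daemon-reload
sudo systemctl restart docker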

Kubernetes Service Requests are sending back responses from Pod IP rather than Service IP, CoreDNS not working for istioctl install

Overview of my setup/problem
I'm running a K3s cluster inside a Docker container on a RHEL 7.9 box. This is all on an air-gapped network, so bear with me if you don't see copy-and-pasted examples below.
I'm trying to install Istio on the cluster, but the install hangs on setting up the ingress gateway deployment. It hangs there because it is unable to resolve the Istiod Kubernetes service from inside the ingress gateway pod.
What I've tried
I tested the image on an Ubuntu Vagrant box and the Istio install works fine there. I've also tested the install on a Windows 10 machine using Rancher Desktop and it works fine there as well. At one point it worked on the RHEL box, but my team did some security hardening in the meantime, and naturally they have no idea which change broke my cluster. So I'm trying to narrow down the search.
I've determined that the issue is with CoreDNS in my K3s cluster. I used the dnsutils Docker image and ran an nslookup kubernetes.default. I checked the logs of the CoreDNS pod: it shows the lookup, but the response it sends back to nslookup has the IP of the CoreDNS pod rather than that of the kube-dns Kubernetes Service. nslookup correctly notices that and says:
nslookup kubernetes.default
;; reply from unexpected source: 10.42.0.4#53, expected 10.43.0.2#53
;; reply from unexpected source: 10.42.0.4#53, expected 10.43.0.2#53
;; reply from unexpected source: 10.42.0.4#53, expected 10.43.0.2#53
;; connection timed out; no servers could be reached
10.42.0.4 is the CoreDNS pod and 10.43.0.2 is the kube-dns Kubernetes Service for that pod.
The logs from the failing Istio ingress gateway pod say that it is failing to retrieve a certificate from the Istiod pod because the connection to the Istiod Kubernetes service is timing out. That makes sense, considering I can't resolve kubernetes.default correctly either.
2021-05-27T10:28:07.342344Z warn ca ca request failed, starting attempt 1 in 91.589072ms
2021-05-27T10:28:07.434806Z warn ca ca request failed, starting attempt 2 in 203.792343ms
2021-05-27T10:28:07.639557Z warn ca ca request failed, starting attempt 3 in 364.729652ms
2021-05-27T10:28:08.005300Z warn ca ca request failed, starting attempt 4 in 830.723933ms
And then states that the request to the Istiod service timed out
transport: Error while dialing dial tcp: lookup istiod.istio-system.svc on 10.96.0.10:53: read udp 10.244.153.113:41187->10.96.0.10:53: i/o timeout
Again, my setup is on an air-gapped network, so ignore the IP addresses in the example above; they were copied from other posts related to my issue.
Where to go from here?
I'm trying to figure out what could be causing this problem. DNS resolution should be out-of-the-box functionality for K3s, and it's not working correctly. As I stated before, it's not the Docker image I'm running K3s from, since I've gotten K3s and Istio to work on other machines.
Any suggestions on what to do next or advice on how to troubleshoot this would be greatly appreciated. Let me know if there is any other info I can provide to help. Thanks!
TL;DR - bridge-nf-call-iptables and bridge-nf-call-ip6tables were disabled. They need to be enabled.
I found this using docker info, which listed a warning about bridge-nf-call-iptables and bridge-nf-call-ip6tables being disabled. There is a lot of discussion on the CoreDNS and k3s GitHub issue trackers about problems caused by iptables, and our suspicions were correct.
This link was the solution for us.
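For reference, a minimal sketch of how these settings can be checked and enabled on a RHEL/CentOS-style host (the file name under /etc/sysctl.d/ is just an example):

# load the bridge netfilter module and check the current values
sudo modprobe br_netfilter
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables

# enable both settings now
sudo sysctl -w net.bridge.bridge-nf-call-iptables=1
sudo sysctl -w net.bridge.bridge-nf-call-ip6tables=1

# and persist them across reboots
cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
sudo sysctl --system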

I am trying to deploy the Microsoft Fluid Framework on an AWS EKS cluster, but the pods go into CrashLoopBackOff

When I get the logs for one of the pods with the CrashLoopBackOff status
kubectl logs alfred
it returns the following errors.
error: alfred service exiting due to error {"label":"winston","timestamp":"2021-11-08T07:02:02.324Z"}
at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:66:26) {
errno: 'ENOTFOUND',
code: 'ENOTFOUND',
syscall: 'getaddrinfo',
hostname: 'mongodb'
} {"label":"winston","timestamp":"2021-11-08T07:02:02.326Z"}
error: Client Manager Redis Error: getaddrinfo ENOTFOUND redis {"errno":"ENOTFOUND","code":"ENOTFOUND","syscall":"getaddrinfo","hostname":"redis","stack":"Error: getaddrinfo ENOTFOUND redis\n at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:66:26)","label":"winston","timestamp":"2021-11-08T07:02:02.368Z"}
I am new to Kubernetes and AWS EKS, so I would appreciate any help. Thanks
If you look at the error, it is failing at getaddrinfo, which is the function that resolves a DNS name in order to connect to an external service. Here it is trying to reach a Redis instance, and it seems your EKS cluster does not have connectivity to it.
However, if you are running Redis as part of your EKS cluster, make sure to provide/update the Kubernetes service DNS name in the application code, or set it as an environment variable that can be configured just before deployment.
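A minimal sketch of what that could look like in the Deployment's pod spec, assuming the application reads hypothetical REDIS_HOST and MONGODB_HOST variables and the Services live in the default namespace (names are illustrative only):

containers:
- name: alfred
  image: <your-image>
  env:
  - name: REDIS_HOST            # hypothetical variable name
    value: redis.default.svc.cluster.local
  - name: MONGODB_HOST          # hypothetical variable name
    value: mongodb.default.svc.cluster.local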
It's both redis and mongodb: as the error says, you are providing the hostnames redis and mongodb, and those won't resolve to an IP address unless they are mapped in the /etc/hosts file, which is not the case here.
Provide the correct hostnames and the pods will come up; this is the root cause.
The errors above were being generated because mongo and redis were not exposed by a Service. After I created service.yaml files for those instances, the errors went away. AWS EKS deploys containers in pods which are scattered across different nodes, so in order to let other pods communicate with mongodb you must expose a Service (a stable "frontend") for the mongodb deployment.
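For reference, a minimal sketch of such a service.yaml for mongodb; the selector label and port are assumptions and must match your actual deployment (a matching Service is needed for redis as well):

apiVersion: v1
kind: Service
metadata:
  name: mongodb          # must match the hostname the application resolves
spec:
  selector:
    app: mongodb         # assumed pod label on the mongodb deployment
  ports:
  - port: 27017          # default MongoDB port
    targetPort: 27017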

Docker for Desktop Kubernetes Unable to connect to the server: dial tcp [::1]:6445

I am using Docker for Desktop on Windows 10 Professional with Hyper-V; I am not using minikube. I have installed a Kubernetes cluster via Docker for Desktop, and the settings screen shows that Kubernetes is successfully installed and running.
When I run the following command:
kubectl config view
I get the following output:
apiVersion: v1
clusters:
- cluster:
    insecure-skip-tls-verify: true
    server: https://localhost:6445
  name: docker-for-desktop-cluster
contexts:
- context:
    cluster: docker-for-desktop-cluster
    user: docker-for-desktop
  name: docker-for-desktop
current-context: docker-for-desktop
kind: Config
preferences: {}
users:
- name: docker-for-desktop
  user:
    client-certificate-data: REDACTED
    client-key-data: REDACTED
However, when I run the
kubectl cluster-info
I am getting the following error:
Unable to connect to the server: dial tcp [::1]:6445: connectex: No connection could be made because the target machine actively refused it.
It seems like there is some network issue, but I am not sure how to resolve it.
I know this is an old question, but the following helped me resolve a similar issue. The root cause was that I had minikube installed previously, and it was being used as my default context.
I was getting the following error:
Unable to connect to the server: dial tcp 192.168.1.8:8443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
In PowerShell, run the following command:
> kubectl config get-contexts
CURRENT   NAME                 CLUSTER          AUTHINFO         NAMESPACE
          docker-desktop       docker-desktop   docker-desktop
          docker-for-desktop   docker-desktop   docker-desktop
*         minikube             minikube         minikube
This will list all the contexts; check whether there are multiple. If you installed minikube in the past, it will show a * mark next to the currently selected default context. You can change that to point to the docker-desktop context as follows:
> kubectl config use-context docker-desktop
Run the get-contexts command again to verify the * mark.
Now, the following command should work:
> kubectl get pods
Posting a response to this very old question: I was searching for a solution, later found a different cause for my problem, and the solution was simple.
The cause was that the config file was missing from the $HOME/.kube directory.
A simple restart of Docker Desktop restored the file with some defaults, and things were back to normal.
Side note: the issue started after I upgraded my Docker Desktop installation to the latest version (when I got the update-available popup). I should also mention that the cluster had stopped working and I had to manually remove Docker Desktop and reinstall the latest version (this was the story before the problem occurred).

How to fix "failed to ensure load balancer" error for nginx ingress

When setting up a new nginx-ingress using Helm and a static IP on Azure, the nginx controller never gets the static IP assigned. It always says <pending>.
I install the helm chart as follows -
helm install stable/nginx-ingress --name <my-name> --namespace <my-namespace> --set controller.replicaCount=2 --set controller.service.loadBalancerIP="<static-ip-address>"
It says it installs correctly, but there is an error listed as well:
E0411 06:44:17.063913 13264 portforward.go:303] error copying from
remote stream to local connection: readfrom tcp4
127.0.0.1:57881->127.0.0.1:57886: write tcp4 127.0.0.1:57881->127.0.0.1:57886: wsasend: An established connection was aborted by the software in your host machine.
I then do a kubectl get all -n <my-namespace> and everything is listed correctly, just with the external IP shown as <pending> for the controller.
I then do a kubectl describe -n <my-namespace> service/<my-name>-nginx-ingress-controller and this error is listed under Events -
Warning CreatingLoadBalancerFailed 11s (x4 over 47s)
service-controller Error creating load balancer (will retry): failed
to ensure load balancer for service
my-namespace/my-name-nginx-ingress-controller: timed out waiting for the
condition.
Thank you kindly
For your issue, the likely reason is that your public IP is not in the same resource group and region as the AKS cluster. See the steps in Create an ingress controller with a static public IP address in Azure Kubernetes Service (AKS).
You can get the AKS node resource group through the CLI like this:
az aks show --resource-group myResourceGroup --name myAKSCluster --query nodeResourceGroup -o tsv
When your public IP is in a different resource group or region, you will get the timeout error you are seeing.
Make sure that your static IP is in the node resource group, and also that the SKU of the IP is Basic, not Standard.
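For example, a static public IP can be created directly in the node resource group with the Azure CLI (the resource group value comes from the az aks show command above; the IP name is a placeholder, and the SKU should match your cluster's load balancer SKU):

az network public-ip create --resource-group <node-resource-group> --name myAKSPublicIP --sku Basic --allocation-method static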

Use https for accessing Docker private registry

I have a private registry that is accessed through the HTTPS protocol.
But Kubernetes + Docker always tries to use the HTTP protocol (http://myserver.com:8080) instead of https://myserver.com:8080.
How can I force the HTTPS protocol?
Snippet of my YAML file that declares a Pod:
containers:
- name: apl
  image: myserver.com:8080/myimage
Details of my environment:
CentOS 7.3
Docker 18.06
Kubernetes (Minikube) 1.13.1
Error message in Kubernetes logs:
Normal Pulling 30s (x4 over 2m2s) kubelet, minikube pulling image "docker.mydomain.com:30500/vision-ssh"
Warning Failed 30s (x4 over 2m2s) kubelet, minikube Failed to pull image "docker.mydomain.com:30500/vision-ssh": rpc error: code = Unknown desc = Error response from daemon: Get http://docker.mydomain.com:30500/v2/: net/http: HTTP/1.x transport connection broken: malformed HTTP response "\x15\x03\x01\x00\x02\x02"
Warning Failed 30s (x4 over 2m2s) kubelet, minikube Error: ErrImagePull
Warning Failed 19s (x6 over 2m2s) kubelet, minikube Error: ImagePullBackOff
Normal BackOff 4s (x7 over 2m2s) kubelet, minikube Back-off pulling image "docker.fccma.com:30500/vision-ssh"
If I try to specify the protocol in the name of the image, it complains:
couldn't parse image reference "https://docker.mydomain.com:30500/vision-ssh": invalid reference format
I followed this guide in order to create the image registry. It is already secured (HTTPS protocol, protected by user/password).
In the /etc/hosts file, the server docker.mydomain.com is mapped to 127.0.0.1. I've read in the Docker docs that local registries are always considered insecure.
If I use a name that is mapped to the external IP, then Docker tries HTTPS.
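For reference, the registries the local daemon treats as insecure can be checked with docker info (a loopback range such as 127.0.0.0/8 is typically included by default):

docker info 2>/dev/null | grep -A 3 "Insecure Registries"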
Your private Docker registry might not be secured. If it is a secured private registry, Docker always uses HTTPS; otherwise it falls back to HTTP.
For more details, refer to the docs:
Docker uses the https:// protocol to communicate with a registry, unless the registry is allowed to be accessed over an insecure connection. Refer to the insecure registries section for more information.
https://docs.docker.com/engine/reference/commandline/dockerd/#insecure-registries
So to force HTTPS, secure your registry. There are many articles available on the net about securing a registry.
Run an HTTPS proxy service fronting the container registry service; look at nginx as the HTTPS proxy.
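A minimal sketch of such an nginx front end, assuming the registry itself listens on plain HTTP on 127.0.0.1:5000 and you have a certificate for docker.mydomain.com (paths and ports are illustrative only):

server {
    listen 30500 ssl;
    server_name docker.mydomain.com;

    ssl_certificate     /etc/nginx/certs/registry.crt;
    ssl_certificate_key /etc/nginx/certs/registry.key;

    # image layers can be large, so disable the request body size limit
    client_max_body_size 0;

    location /v2/ {
        proxy_pass http://127.0.0.1:5000;          # plain-HTTP registry behind the proxy
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-Proto https;
    }
}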
