I have created a Docker image for user-service and tagged it for localhost:5001. I have a local registry running on port 5001. I pushed user-service to that registry and then created a Pod using this deploy_new.yaml file:
apiVersion: v1
kind: Pod
metadata:
  name: user-service
  labels:
    component: web
spec:
  containers:
    - name: web
      image: localhost:5001/user-service
      resources:
        limits:
          memory: 512Mi
          cpu: "1"
        requests:
          memory: 256Mi
          cpu: "0.2"
      imagePullPolicy: Never
      ports:
        - name: http
          containerPort: 4006
          protocol: TCP
      livenessProbe:
        httpGet:
          path: /health/health
          port: 4006
        initialDelaySeconds: 3
        periodSeconds: 3
        failureThreshold: 2
      readinessProbe:
        httpGet:
          path: /health/health
          port: 4006
        initialDelaySeconds: 15
        periodSeconds: 10
But on describing the pod I get:
Questions:
What is ErrImageNeverPull and how do I fix it?
How do I test liveness and readiness probes?
Probe APIs
1. What is ErrImageNeverPull and how do I fix it?
As imagePullPolicy is set to Never, the kubelet won't fetch images but will only look for what is present locally. The error means it could not find the image locally, and it will not try to fetch it.
If the cluster can reach your local Docker registry, just make sure the image reference is fully qualified, e.g. image: localhost:5001/user-service:latest
If you are using minikube, check its README on reusing the Docker daemon so you can use your image without uploading it to a registry (the combined commands are sketched below):
Run eval $(minikube docker-env) in every shell session where you need it.
Build the image: docker build -t user-service .
Reference the image in your Pod manifest as image: user-service
Make sure you have imagePullPolicy: Never for your container (which you already have).
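A minimal sketch of that workflow, assuming a default minikube profile, the image name from the question, and that the manifest's image field is changed to user-service as described above:
# Per shell session: point the docker CLI at minikube's Docker daemon
eval $(minikube docker-env)
# Build the image directly inside minikube's daemon
docker build -t user-service .
# Create the Pod (manifest uses image: user-service with imagePullPolicy: Never)
kubectl apply -f deploy_new.yaml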
2. How do I test liveness and readiness probes?
I suggest you try the examples from the Kubernetes documentation; they explain really well the difference between the two and the different types of probes you can configure.
You first need to get your Pod running before you can check the liveness and readiness probes. In your case they should succeed as soon as the Pod starts; just describe it and look at the events.
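For a quick manual check (assuming the Pod name, port, and health path from your manifest), you can watch the probe results in the events and hit the same endpoint the probes use:
# Probe successes and failures show up in the Events section
kubectl describe pod user-service
# In another terminal: forward the container port and call the health endpoint yourself
kubectl port-forward pod/user-service 4006:4006
curl http://localhost:4006/health/health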
One more thing to note: eval $(minikube docker-env) will fail silently if you are using a non-default minikube profile, leading to the behavior below:
$ eval $(minikube docker-env)
$ minikube docker-env
🤷 Profile "minikube" not found. Run "minikube profile list" to view all profiles.
👉 To start a cluster, run: "minikube start"
$
To address this, re-run the command specifying the profile you are using:
$ eval $(minikube -p my-profile docker-env)
Related
I have a pod running Linux that I have let others use. Now I need to save the changes they made. Since I sometimes need to delete/restart the pod, the changes are reverted and a new pod gets created. So I want to save the pod's container as a Docker image and use that image to create a pod.
I have tried kubectl debug node/pool-89899hhdyhd-bygy -it --image=ubuntu and then installing docker and dockerd inside, but they don't have root permission to perform operations. I installed crictl, which listed the containers, but it has no option to save them.
I also created a privileged Docker image, created a pod from it, then used kubectl exec --stdin --tty app-7ff786bc77-d5dhg -- /bin/sh and tried to list the running containers, but none were listed. Below is the deployment I used for the privileged Docker container:
kind: Deployment
apiVersion: apps/v1
metadata:
  name: app
  labels:
    app: backend-app
    backend-app: app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: backend-app
      task: app
  template:
    metadata:
      labels:
        app: backend-app
        task: app
    spec:
      nodeSelector:
        kubernetes.io/hostname: pool-58i9au7bq-mgs6d
      volumes:
        - name: task-pv-storage
          hostPath:
            path: /run/docker.sock
            type: Socket
      containers:
        - name: app
          image: registry.digitalocean.com/my_registry/docker_app@sha256:b95016bd9653631277455466b2f60f5dc027f0963633881b5d9b9e2304c57098
          ports:
            - containerPort: 80
          volumeMounts:
            - name: task-pv-storage
              mountPath: /var/run/docker.sock
Is there any way I can achieve this, i.e. get the pod's container and save it as a Docker image? I am using DigitalOcean to run my Kubernetes apps, and I do not have SSH access to the node.
This is not a feature of Kubernetes or CRI. Docker does support snapshotting a running container to an image; however, Kubernetes no longer supports Docker.
Thank you all for your help and suggestions. I found a way to achieve it using the tool nerdctl - https://github.com/containerd/nerdctl.
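For reference, a minimal sketch of what that looks like with nerdctl on a containerd node (the container ID and target image name are placeholders; Kubernetes-managed containers live in containerd's k8s.io namespace):
# List the containers Kubernetes is running under containerd
nerdctl --namespace k8s.io ps
# Commit a running container to a new image (names are placeholders)
nerdctl --namespace k8s.io commit <container-id> my-registry/my-app:snapshot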
I've been looking at different references on how to enable k3s (running on my Pi) to pull Docker images from a private registry on my home network (a server laptop on my network). Could someone please point me in the right direction? This is my approach:
Created the Docker registry on my server (and made it accessible via port 10000):
docker run -d -p 10000:5000 --restart=always --name local-docker-registry registry:2
This worked, and I was able to push and pull images to it from the server PC. I didn't add authentication, TLS, etc. yet...
(viewing the images via the Docker plugin in VS Code).
Added the inbound firewall rule on my laptop server, and tested that the registry can be 'seen' from my Pi (so this also works):
$ curl -ks http://<server IP>:10000/v2/_catalog
{"repositories":["tcpserialpassthrough"]}
Added the registry link for k3s (running on my Pi) in the registries.yaml file, and restarted k3s and the Pi:
$ cat /etc/rancher/k3s/registries.yaml
mirrors:
  pwlaptopregistry:
    endpoint:
      - "http://<host IP here>:10000"
Put the registry prefix on my image reference in the deployment manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tcpserialpassthrough
spec:
  selector:
    matchLabels:
      app: tcpserialpassthrough
  replicas: 1
  template:
    metadata:
      labels:
        app: tcpserialpassthrough
    spec:
      containers:
        - name: tcpserialpassthrough
          image: pwlaptopregistry/tcpserialpassthrough:vers1.3-arm
          resources:
            limits:
              memory: "128Mi"
              cpu: "500m"
          ports:
            - containerPort: 8001
              hostPort: 8001
              protocol: TCP
          command: ["dotnet", "/app/TcpConnector.dll"]
However, when I check the deployment startup sequence, it's still not able to pull the image (and is it possibly also still referencing Docker Hub?):
kubectl get events -w
LAST SEEN TYPE REASON OBJECT MESSAGE
8m24s Normal SuccessfulCreate replicaset/tcpserialpassthrough-88fb974d9 Created pod: tcpserialpassthrough-88fb974d9-b88fc
8m23s Warning FailedScheduling pod/tcpserialpassthrough-88fb974d9-b88fc 0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports.
8m23s Warning FailedScheduling pod/tcpserialpassthrough-88fb974d9-b88fc 0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports.
8m21s Normal Scheduled pod/tcpserialpassthrough-88fb974d9-b88fc Successfully assigned default/tcpserialpassthrough-88fb974d9-b88fc to raspberrypi
6m52s Normal Pulling pod/tcpserialpassthrough-88fb974d9-b88fc Pulling image "pwlaptopregistry/tcpserialpassthrough:vers1.3-arm"
6m50s Warning Failed pod/tcpserialpassthrough-88fb974d9-b88fc Error: ErrImagePull
6m50s Warning Failed pod/tcpserialpassthrough-88fb974d9-b88fc Failed to pull image "pwlaptopregistry/tcpserialpassthrough:vers1.3-arm": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/pwlaptopregistry/tcpserialpassthrough:vers1.3-arm": failed to resolve reference "docker.io/pwlaptopregistry/tcpserialpassthrough:vers1.3-arm": pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed
6m3s Normal BackOff pod/tcpserialpassthrough-88fb974d9-b88fc Back-off pulling image "pwlaptopregistry/tcpserialpassthrough:vers1.3-arm"
3m15s Warning Failed pod/tcpserialpassthrough-88fb974d9-b88fc Error: ImagePullBackOff
I wondered if the issue was with authorization, so I added basic auth to the registry following a YouTube guide, but the same issue persists.
I also noted that /etc/docker/daemon.json must be edited to allow unauthenticated, non-TLS connections, via:
{
  "insecure-registries": [ "<host IP>:10000" ]
}
but it seemed that this needs to be done on the node side, whereas the nodes don't have the Docker CLI installed??
... this is so stupid; I have no idea why a domain name and port need to be specified as the "name" of the referenced registry, but anyway this solved my issue (for reference):
$ cat /etc/rancher/k3s/registries.yaml
mirrors:
  "<host IP>:10000":
    endpoint:
      - "http://<host IP>:10000"
and restarting k3s:
systemctl restart k3s
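Optionally, as a quick sanity check before touching the deployment (assuming the image name and tag from above, and using the crictl that ships with k3s), you can try pulling the image directly on the Pi:
# Pull through containerd so the registries.yaml mirror config is exercised
sudo k3s crictl pull <host IP>:10000/tcpserialpassthrough:vers1.3-arm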
Then, in your deployment, refer to that registry in your image path as:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tcpserialpassthrough
spec:
  selector:
    matchLabels:
      app: tcpserialpassthrough
  replicas: 1
  template:
    metadata:
      labels:
        app: tcpserialpassthrough
    spec:
      containers:
        - name: tcpserialpassthrough
          image: <host IP>:10000/tcpserialpassthrough:vers1.3-arm
          resources:
            limits:
              memory: "128Mi"
              cpu: "500m"
          ports:
            - containerPort: 8001
              hostPort: 8001
              protocol: TCP
          command: ["dotnet", "/app/TcpConnector.dll"]
      imagePullSecrets:
        - name: mydockercredentials
referring to the registry's basic auth details saved as a secret:
$ kubectl create secret docker-registry mydockercredentials --docker-server <host IP>:10000 --docker-username <username> --docker-password <password>
You'll be able to verify the pull process via
$ kubectl get events -w
I'm using Kubernetes 1.15.7 and my issue is similar to https://github.com/kubernetes/kubernetes/issues/62362
apiVersion: v1
kind: Pod
metadata:
  name: init-demo
spec:
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 80
      volumeMounts:
        - name: workdir
          mountPath: /usr/share/nginx/html
  # These containers are run during pod initialization
  restartPolicy: Always
  initContainers:
    - name: install
      image: busybox
      command:
        - sh
        - -c
        - sleep 60
      volumeMounts:
        - name: workdir
          mountPath: "/work-dir"
  dnsPolicy: Default
  volumes:
    - name: workdir
      emptyDir: {}
On the node where the container is running, if I issue docker container prune it removes the exited busybox (init) container, only for it to be restarted again, which triggers the pod to restart too.
I found a GitHub issue similar to this but without much explanation. These exited containers don't show up as consuming much space in docker system df, but this still doesn't allow me to run the prune command as a whole on the node.
The kubelet manages garbage collection of Docker images and containers, so you don't have to.
Take a look at the k8s documentation for more info on this topic.
From k8s documentation:
Garbage collection is a helpful function of kubelet that will clean up unused images and unused containers. Kubelet will perform garbage collection for containers every minute and garbage collection for images every five minutes.
External garbage collection tools are not recommended as these tools can potentially break the behavior of kubelet by removing containers expected to exist
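If you do need to tune this rather than prune by hand, the kubelet's image GC thresholds can be adjusted. A minimal sketch of the relevant KubeletConfiguration fields (the values are illustrative, not a recommendation):
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Image GC starts when disk usage exceeds the high threshold
imageGCHighThresholdPercent: 85
# ... and frees space until usage drops below the low threshold
imageGCLowThresholdPercent: 80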
I'm trying to build a docker image using DIND with Atlassian Bamboo.
I've created the deployment/ StatefulSet as follows:
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: bamboo
  name: bamboo
  namespace: csf
spec:
  replicas: 1
  serviceName: bamboo
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: bamboo
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: bamboo
    spec:
      containers:
        - image: atlassian/bamboo-server:latest
          imagePullPolicy: IfNotPresent
          name: bamboo-server
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          securityContext:
            privileged: true
          volumeMounts:
            - name: bamboo-home
              mountPath: /var/atlassian/application-data/bamboo
            - mountPath: /opt/atlassian/bamboo/conf/server.xml
              name: bamboo-server-xml
              subPath: bamboo-server.xml
            - mountPath: /var/run
              name: docker-sock
      volumes:
        - name: bamboo-home
          persistentVolumeClaim:
            claimName: bamboo-home
        - configMap:
            defaultMode: 511
            name: bamboo-server-xml
          name: bamboo-server-xml
        - name: docker-sock
          hostPath:
            path: /var/run
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
Note that I've set privileged: true in securityContext to enable this.
However, when trying to run docker images, I get a permission error:
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.40/containers/create: dial unix /var/run/docker.sock: connect: permission denied.
See '/var/atlassian/application-data/bamboo/appexecs/docker run --help'
Am I missing something wrt setting up DIND?
The /var/run/docker.sock file on the host system is owned by a different user than the user that is running the bamboo-server container process.
Without knowing any details about your cluster, I would assume docker runs as 'root' (UID=0). The bamboo-server runs as 'bamboo', as can be seen from its Dockerfile, which will normally map to a UID in the 1XXX range on the host system. As these users are different and the container process did not receive any specific permissions over the (host) socket, the error is given.
So I think two approaches are possible:
Either the container process continues to run as the 'bamboo' user but is given sufficient permissions on the host system to access /var/run/docker.sock. This would normally mean adding the UID the bamboo user maps to on the host system to the docker group on the host system. However, making changes to the host system may or may not be an option depending on the context of your cluster, and it is tricky in a cluster context because the pod could migrate to a different node where the changes were not applied and/or the UID changes.
Or the container is changed so that it runs as a sufficiently privileged user to begin with, i.e. the root user. There are two ways to accomplish this: 1. you extend and customize the Atlassian-provided base image to change the user, or 2. you override the user the container runs as at run time by means of the 'runAsUser' and 'runAsGroup' securityContext settings as specified here. Both should be '0'.
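For the second option, a minimal sketch of the relevant part of the StatefulSet above (only the securityContext changes; everything else stays as in your manifest):
containers:
  - name: bamboo-server
    image: atlassian/bamboo-server:latest
    securityContext:
      privileged: true
      # Run the container process as root so it can use the host's docker.sock
      runAsUser: 0
      runAsGroup: 0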
As mentioned in the documentation here
If you want to run Docker as a non-root user, then you need to add that user to the docker group.
Create the docker group if it does not exist
$ sudo groupadd docker
Add your user to the docker group.
$ sudo usermod -aG docker $USER
Log out and log back in so that your group membership is re-evaluated.
$ newgrp docker
Verify that you can run docker commands without sudo
$ docker run hello-world
If that doesn't help, you can change the permissions of the Docker socket /var/run/docker.sock to be able to connect to the Docker daemon:
sudo chmod 666 /var/run/docker.sock
A better way to handle this is to run a sidecar container, docker:dind, and export DOCKER_HOST=tcp://dind:2375 in the main Bamboo container. This way you invoke Docker in the dind container and won't need to mount /var/run/docker.sock.
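A minimal sketch of that idea applied to the StatefulSet above. Note that containers in the same pod share a network namespace, so the Bamboo container reaches the sidecar via localhost; DOCKER_TLS_CERTDIR is cleared so dind listens on plain TCP 2375, which is fine for a lab but not hardened for production:
spec:
  containers:
    - name: bamboo-server
      image: atlassian/bamboo-server:latest
      env:
        # Point the docker CLI inside Bamboo at the dind sidecar
        - name: DOCKER_HOST
          value: tcp://localhost:2375
    - name: dind
      image: docker:dind
      securityContext:
        privileged: true   # dind itself still needs privileged mode
      env:
        # Disable TLS so the daemon listens on tcp://0.0.0.0:2375
        - name: DOCKER_TLS_CERTDIR
          value: ""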
On Docker Hub there is an amazon/cloudwatch-agent image which is maintained by Amazon.
Does anyone know how to configure and start the container? I cannot find any documentation.
I got this working! I was having the same issue as you, where you see Reading json config file path: /opt/aws/amazon-cloudwatch-agent/bin/default_linux_config.json ... Cannot access /etc/cwagentconfig: lstat /etc/cwagentconfig: no such file or directory. Valid Json input schema.
What you need to do is put your config file in /etc/cwagentconfig. A functioning dockerfile:
FROM amazon/cloudwatch-agent:1.230621.0
COPY config.json /etc/cwagentconfig
Where config.json is some cloudwatch agent configuration, such as given by LinPy's answer.
You can ignore the warning about /opt/aws/amazon-cloudwatch-agent/bin/default_linux_config.json, or you can also COPY the config.json file to that location in the dockerfile as well.
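For example, a variant of the Dockerfile above that also silences the warning by copying the same config over the default path (a sketch; the config contents are whatever your agent setup needs):
FROM amazon/cloudwatch-agent:1.230621.0
# Config the container reads at startup
COPY config.json /etc/cwagentconfig
# Same config copied over the default, to silence the startup warning
COPY config.json /opt/aws/amazon-cloudwatch-agent/bin/default_linux_config.json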
I will also share how I found this answer:
I needed this to run in ECS as a sidecar, and I could only find docs on how to run it in Kubernetes. Following this documentation: https://docs.aws.amazon.com/en_pv/AmazonCloudWatch/latest/monitoring/Container-Insights-setup-StatsD.html I decided to download all the example k8s manifests, and that's when I saw this one:
apiVersion: v1
kind: Pod
metadata:
  namespace: default
  name: amazonlinux
spec:
  containers:
    - name: amazonlinux
      image: amazonlinux
      command: ["/bin/sh"]
      args: ["-c", "sleep 300"]
    - name: cloudwatch-agent
      image: amazon/cloudwatch-agent
      imagePullPolicy: Always
      resources:
        limits:
          cpu: 200m
          memory: 100Mi
        requests:
          cpu: 200m
          memory: 100Mi
      volumeMounts:
        - name: cwagentconfig
          mountPath: /etc/cwagentconfig
  volumes:
    - name: cwagentconfig
      configMap:
        name: cwagentstatsdconfig
  terminationGracePeriodSeconds: 60
So I saw that the cwagentconfig volume mounts at /etc/cwagentconfig, that it comes from the cwagentstatsdconfig ConfigMap, and that the ConfigMap is just the JSON config file.
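On Kubernetes, creating that ConfigMap from a local agent config would look roughly like this (the file name config.json is an assumption carried over from the Dockerfile above):
# Create the ConfigMap that the manifest above mounts at /etc/cwagentconfig
kubectl create configmap cwagentstatsdconfig --from-file=config.json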
You just need to run the container with log-opt options, as the log agent is the main process of the container.
docker run --log-driver=awslogs --log-opt awslogs-region=us-west-2 --log-opt awslogs-group=myLogGroup amazon/cloudwatch-agent
You can find more details here and here.
I do not know why you need an agent in a container, but the best practice is to send each container's logs directly to CloudWatch using the awslogs log driver.
By the way, this is the entrypoint of the container:
"Entrypoint": [
"/opt/aws/amazon-cloudwatch-agent/bin/start-amazon-cloudwatch-agent"
],
All you need to do is call
/opt/aws/amazon-cloudwatch-agent/bin/start-amazon-cloudwatch-agent
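Putting that together with the config location from the accepted answer, starting the agent under plain Docker might look roughly like this (mounting a local directory of agent config files at /etc/cwagentconfig is an assumption):
# The image's entrypoint starts the agent automatically;
# we only mount the configuration it expects under /etc/cwagentconfig
docker run -d \
  -v "$(pwd)/cwagentconfig":/etc/cwagentconfig \
  amazon/cloudwatch-agent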
Here is how I got it working in our Docker containers without systemctl or System V init.
This is from the official documentation:
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:configuration-file-path -s
Here are the docs:
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/install-CloudWatch-Agent-commandline-fleet.html#start-CloudWatch-Agent-EC2-commands-fleet
The installation path may be different, but that is how the agent is started per the docs.