Pod Stuck on `CrashLoopBackOff` even though it should go into /bin/bash - docker

I'm running a Kubernetes cluster with minikube and my deployment (or individual Pods) won't stay running, even though I specify in the Dockerfile that it should leave a terminal open (I've also tried it with sh). They keep getting restarted and sometimes they get stuck in a CrashLoopBackOff status before restarting again:
FROM ubuntu
EXPOSE 8080
CMD /bin/bash
My deployment file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sleeper-deploy
spec:
  replicas: 10
  selector:
    matchLabels:
      app: sleeper-world
  minReadySeconds: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    metadata:
      labels:
        app: sleeper-world
    spec:
      containers:
      - name: sleeper-pod
        image: kubelab
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080
All in all, my workflow is as follows (deploy.sh):
#!/bin/bash
# Cleaning
kubectl delete deployments --all
kubectl delete pods --all
# Building the Image
sudo docker build \
-t kubelab \
.
# Deploying
kubectl apply -f sleeper_deployment.yml
By the way, I've tested the Docker Container solo using sudo docker run -dt kubelab and it does stay up. Why doesn't it stay up within Kubernetes? Is there a parameter (in the YAML file) or a flag I should be using for this special case?

1. Original Answer (but edited...)
If you are familiar with Docker, check this.
If you are looking for an equivalent of docker run -dt kubelab, try kubectl run -it kubelab --restart=Never --image=ubuntu /bin/bash. In your case, the Docker -t flag allocates a pseudo-TTY; that's why your Docker container stays up.
Try:
kubectl run kubelab \
--image=ubuntu \
--expose \
--port 8080 \
-- /bin/bash -c 'while true;do sleep 3600;done'
Or:
kubectl run kubelab \
--image=ubuntu \
--dry-run -oyaml \
--expose \
--port 8080 \
-- /bin/bash -c 'while true;do sleep 3600;done'
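Or, if you want the pseudo-TTY behavior itself rather than a sleep loop, a Pod manifest sketch (not from the original answer; it mirrors what -i/-t do) would be:
apiVersion: v1
kind: Pod
metadata:
  name: kubelab
spec:
  containers:
  - name: kubelab
    image: ubuntu
    command: ["/bin/bash"]
    stdin: true   # keep stdin open, like docker run -i
    tty: true     # allocate a pseudo-TTY, like docker run -t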
2. Explaining what's going on (Added by Philippe Fanaro):
As stated by @David Maze, the bash process will exit immediately because there is no terminal attached and nothing coming in on stdin, which is slightly different behavior from Docker.
If you change the restartPolicy, the container will still terminate; the difference is that the Pod won't be regenerated or restarted.
One way of doing it is (pay attention to the indentation of restartPolicy):
apiVersion: v1
kind: Pod
metadata:
  name: kubelab-pod
  labels:
    zone: prod
    version: v1
spec:
  containers:
  - name: kubelab-ctr
    image: kubelab
    imagePullPolicy: IfNotPresent
    ports:
    - containerPort: 8080
  restartPolicy: Never
However, this will not work if it is specified inside a Deployment YAML, because Deployments continuously recreate Pods in order to reach the desired state. This is confirmed in the Deployment documentation:
Only a .spec.template.spec.restartPolicy equal to Always is allowed, which is the default if not specified.
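Since a Deployment cannot use restartPolicy: Never, here is a sketch of how the original Deployment could instead keep its Pods alive by overriding the container command (an illustration based on the manifest above, not part of the quoted answer):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sleeper-deploy
spec:
  replicas: 10
  selector:
    matchLabels:
      app: sleeper-world
  template:
    metadata:
      labels:
        app: sleeper-world
    spec:
      containers:
      - name: sleeper-pod
        image: kubelab
        imagePullPolicy: IfNotPresent
        command: ["/bin/bash", "-c", "while true; do sleep 3600; done"]  # overrides the image CMD
        ports:
        - containerPort: 8080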
3. If you really wish to force the Docker Container to Keep Running
In this case, you will need something that doesn't exit. A server-like process is one example. But you can also try something mentioned in this StackOverflow answer:
CMD exec /bin/bash -c "trap : TERM INT; sleep infinity & wait"
This will keep your container alive until it is told to stop. Using trap and wait makes your container react immediately to a stop request; without them, stopping takes a few seconds.
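The same pattern can also be expressed in the Kubernetes manifest instead of the Dockerfile; a container fragment sketch (assuming the kubelab image above):
containers:
- name: sleeper-pod
  image: kubelab
  imagePullPolicy: IfNotPresent
  command: ["/bin/bash", "-c"]
  args: ["trap : TERM INT; sleep infinity & wait"]  # reacts immediately to SIGTERM/SIGINT
  ports:
  - containerPort: 8080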

Related

Flaskapp is not running in desired port in Minikube Cluster

I am running a very basic blogging app using Flask. It runs fine when I run it using Docker, i.e. docker run -it -d -p 5000:5000 app.
* Serving Flask app 'app' (lazy loading)
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: on
* Running on all addresses.
WARNING: This is a development server. Do not use it in a production deployment.
* Running on http://10.138.0.96:5000/ (Press CTRL+C to quit)
* Restarting with stat
* Debugger is active!
* Debugger PIN: 144-234-816
This runs on my localhost:5000 just fine.
But when I deploy this in Minikube, it says
This site can’t be reached 34.105.79.215 refused to connect.
I use this workflow in Kubernetes
$ eval $(minikube docker-env)
$ docker build -t app:latest .
$ kubectl apply -f deployment.yaml (contains deployment & service)
kubectl logs app-7bf8f865cc-gb9fl returns
* Serving Flask app "app" (lazy loading)
* Environment: production
WARNING: Do not use the development server in a production environment.
Use a production WSGI server instead.
* Debug mode: on
* Running on all addresses.
WARNING: This is a development server. Do not use it in a production deployment.
* Running on http://172.17.0.3:5000/ (Press CTRL+C to quit)
* Restarting with stat
* Debugger is active!
* Debugger PIN: 713-503-298
Dockerfile
FROM ubuntu:18.04
WORKDIR /app
COPY . /app
RUN apt-get update && apt-get -y upgrade
RUN apt-get -y install python3 && apt-get -y install python3-pip
RUN pip3 install -r requirements.txt
EXPOSE 5000
ENTRYPOINT ["python3"]
CMD ["app.py"]
deployment.yaml
apiVersion: v1
kind: Service
metadata:
  name: app-service
spec:
  selector:
    app: app
  ports:
  - protocol: "TCP"
    port: 5000
    targetPort: 5000
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  selector:
    matchLabels:
      app: app
  replicas: 1
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
      - name: app
        image: app:latest
        imagePullPolicy: Never
        ports:
        - containerPort: 5000
I also noticed that when running the Docker container directly, docker ps shows PORTS as
0.0.0.0:5000->5000/tcp, whereas for the minikube container it shows
127.0.0.1:32792->22/tcp, 127.0.0.1:32791->2376/tcp, 127.0.0.1:32790->5000/tcp, 127.0.0.1:32789->8443/tcp, 127.0.0.1:32788->32443/tcp
The port: on a Service only controls the internal port, the one that's part of the ClusterIP service. By default the node port is randomly assigned from the available range. This is because while the port value only has to be unique within the Service itself (couldn't have the same port go to two places, would make no sense), node ports are a global resource and have to be globally unique. You can override it via nodePort: whatever in the Service definition but I wouldn't recommend it.
Minikube includes a helper to manage this for you: run minikube service app-service and it will open the URL in your browser, mapped through the correct node port.
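For reference, pinning the node port in the Service would look roughly like this (a sketch, not the recommended approach; the explicit nodePort value is a made-up example and must fall in the cluster's node-port range, 30000-32767 by default):
apiVersion: v1
kind: Service
metadata:
  name: app-service
spec:
  type: NodePort
  selector:
    app: app
  ports:
  - protocol: "TCP"
    port: 5000        # ClusterIP (in-cluster) port
    targetPort: 5000  # container port
    nodePort: 30500   # hypothetical fixed node port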

Translate docker run commands with initialization to a multi-container k8s pod or compose

I have a container that I need to configure for k8s yaml. The workflow with docker run in the terminal looks like this:
docker run -v $(pwd):/projects \
-w /projects \
gcr.io/base-project/myoh:v1 init myproject
This command creates a directory called myproject. To complete the workflow, I need to cd into this myproject folder and run:
docker run -v $(pwd):/project \
-w /project \
-p 8081:8081 \
gcr.io/base-project/myoh:v1
Any idea how to convert this to either a docker-compose or a k8s pods/deployment yaml? I have tried everything that came to mind, with no success.
The bind mount of the current directory can't be translated to Kubernetes at all. There's no way to connect a pod's filesystem back to your local workstation. A standard Kubernetes setup has a multi-node installation, and if it's possible to directly connect to a node (it may not be) you can't predict which node a pod will run on, and copying code to every node is cumbersome and hard to maintain. If you're using a hosted Kubernetes installation like GKE, it's even possible that the cluster autoscaler will create and delete nodes automatically, and you won't have an opportunity to manually copy things in.
You need to build your application code into a custom image. That can set the desired WORKDIR, COPY the code in, and RUN any setup commands that are required. Then you need to push that image to a registry such as GCR:
docker build -t gcr.io/base-project/my-project:v1 .
docker push gcr.io/base-project/my-project:v1
Once you have that, you can create a minimal Kubernetes Deployment to run it. Set the GCR name of the image you built and pushed as its image: value. You will also need a Service to make it accessible, even from other Pods in the same cluster (a sketch of one follows the Deployment below).
Try this (untested yaml, but you will get the idea)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myoh-deployment
  labels:
    app: myoh
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myoh
  template:
    metadata:
      labels:
        app: myoh
    spec:
      initContainers:
      - name: init-myoh
        image: gcr.io/base-project/myoh:v1
        command: ['sh', '-c', "mkdir -p /projects/myproject"]
        volumeMounts:              # mount the shared volume so the created directory survives
        - mountPath: /projects
          name: project-volume
      containers:
      - name: myoh
        image: gcr.io/base-project/myoh:v1
        ports:
        - containerPort: 8081
        volumeMounts:
        - mountPath: /projects
          name: project-volume
      volumes:
      - name: project-volume       # must match the volumeMounts name above
        hostPath:
          # directory location on host
          path: /data
          # this field is optional
          type: Directory
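As mentioned above, you will also need a Service; a minimal sketch (the Service name is made up, and the port comes from the question's -p 8081:8081):
apiVersion: v1
kind: Service
metadata:
  name: myoh-service
spec:
  selector:
    app: myoh
  ports:
  - protocol: TCP
    port: 8081
    targetPort: 8081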

Cannot start Spark on Kubernetes

I am trying to set up Spark on Kubernetes on Mac. I followed this tutorial web page and it seems straightforward to understand.
Below is the Dockerfile.
# base image
FROM java:openjdk-8-jdk
# define spark and hadoop versions
ENV SPARK_VERSION=3.0.0
ENV HADOOP_VERSION=3.3.0
# download and install hadoop
RUN mkdir -p /opt && \
cd /opt && \
curl http://archive.apache.org/dist/hadoop/common/hadoop-${HADOOP_VERSION}/hadoop-${HADOOP_VERSION}.tar.gz | \
tar -zx hadoop-${HADOOP_VERSION}/lib/native && \
ln -s hadoop-${HADOOP_VERSION} hadoop && \
echo Hadoop ${HADOOP_VERSION} native libraries installed in /opt/hadoop/lib/native
# download and install spark
RUN mkdir -p /opt && \
cd /opt && \
curl http://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop2.7.tgz | \
tar -zx && \
ln -s spark-${SPARK_VERSION}-bin-hadoop2.7 spark && \
echo Spark ${SPARK_VERSION} installed in /opt
# add scripts and update spark default config
ADD common.sh spark-master spark-worker /
ADD spark-defaults.conf /opt/spark/conf/spark-defaults.conf
ENV PATH $PATH:/opt/spark/bin
After building the Docker image, I ran the following commands, but the pod doesn't start.
$ kubectl create -f ./kubernetes/spark-master-deployment.yaml
$ kubectl create -f ./kubernetes/spark-master-service.yaml
spark-master-deployment.yaml
kind: Deployment
apiVersion: apps/v1
metadata:
  name: spark-master
spec:
  replicas: 1
  selector:
    matchLabels:
      component: spark-master
  template:
    metadata:
      labels:
        component: spark-master
    spec:
      containers:
      - name: spark-master
        image: spark-hadoop:3.0.0
        command: ["/spark-master"]
        ports:
        - containerPort: 7077
        - containerPort: 8080
        resources:
          requests:
            cpu: 100m
spark-master-service.yaml
kind: Service
apiVersion: v1
metadata:
  name: spark-master
spec:
  ports:
  - name: webui
    port: 8080
    targetPort: 8080
  - name: spark
    port: 7077
    targetPort: 7077
  selector:
    component: spark-master
To trace the problem, I ran the kubectl describe... command and got the following result.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 45s default-scheduler Successfully assigned default/spark-master-fc7c95485-zn6wf to minikube
Normal Pulled 21s (x3 over 44s) kubelet, minikube Container image "spark-hadoop:3.0.0" already present on machine
Normal Created 21s (x3 over 44s) kubelet, minikube Created container spark-master
Warning Failed 21s (x3 over 43s) kubelet, minikube Error: failed to start container "spark-master": Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "exec: \"/spark-master\": stat /spark-master: no such file or directory": unknown
Warning BackOff 8s (x3 over 42s) kubelet, minikube Back-off restarting failed container
It seems that the container didn't start, but I can't figure out why the pod does not start correctly even though I followed the instructions on the web page.
Below is the GitHub URL of the guide the web page provides for configuring Spark on Kubernetes.
https://github.com/testdrivenio/spark-kubernetes
I assume you are using Minikube.
For Minikube, make the following changes:
Eval the docker env using: eval $(minikube docker-env)
Build the docker image: docker build -t my-image .
Set the image name to just "my-image" in the pod specification in your yaml file.
Set the imagePullPolicy to Never in your yaml file. Here is an example:
apiVersion:
kind:
metadata:
spec:
  template:
    metadata:
      labels:
        app: my-image
    spec:
      containers:
      - name: my-image
        image: "my-image"
        imagePullPolicy: Never
It seems like you didn't copy the scripts the blogger developed, which are in this project; the Dockerfile contains the command ADD common.sh spark-master spark-worker /, so your image is missing the script needed to run the master (you will have the same problem with the workers). You can clone the project and then build the image, or use the image published by the blogger, mjhea0/spark-hadoop.
Here you are trying to set up a Spark standalone cluster on Kubernetes, but you can also use Kubernetes itself as the Spark cluster manager: as of release 3.1.0, Kubernetes officially became a supported Spark cluster manager (it had been experimental since release 2.3); here is the official documentation. You can also use spark-on-k8s-operator, developed by Google, to submit and manage Spark jobs on your Kubernetes cluster.
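For illustration only, a SparkApplication manifest for spark-on-k8s-operator looks roughly like this (the image, jar path, versions and service account are placeholders modeled on the operator's examples, not taken from this question):
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: spark-pi
spec:
  type: Scala
  mode: cluster
  image: "gcr.io/spark-operator/spark:v3.1.1"   # placeholder image
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: "local:///opt/spark/examples/jars/spark-examples_2.12-3.1.1.jar"
  sparkVersion: "3.1.1"
  driver:
    cores: 1
    memory: "512m"
    serviceAccount: spark   # needs RBAC permissions to create executor pods
  executor:
    cores: 1
    instances: 2
    memory: "512m"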

Init container to wait for rabbit-mq readiness

I saw the example of a Docker healthcheck for RabbitMQ at docker-library/healthcheck.
I would like to apply a similar mechanism to my Kubernetes deployment to wait for the RabbitMQ deployment to be ready. I'm doing something similar with MongoDB, using a container that busy-waits for Mongo with a ping command.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: app-1
  template:
    metadata:
      labels:
        app: app-1
    spec:
      initContainers:
      - name: wait-for-mongo
        image: gcr.io/app-1/tools/mongo-ping
      containers:
      - name: app-1-service
        image: gcr.io/app-1/service
        ...
However, when I tried to construct such an init container, I couldn't find any way to query the health of RabbitMQ from outside its cluster.
The following works without any extra images or scripts, but requires you to enable the Management Plugin, e.g. by using the rabbitmq:3.8-management image instead of rabbitmq:3.8.
initContainers:
- name: check-rabbitmq-ready
  image: busybox
  command: ['sh', '-c',
    'until wget http://guest:guest@rabbitmq:15672/api/aliveness-test/%2F;
    do echo waiting for rabbitmq; sleep 2; done;']
Specifically, this waits until the HTTP Management API is available, and then checks that the default vhost is running healthily. The %2F refers to the default / vhost, which has to be URL-encoded. If using your own vhost, enter that instead.
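As a complementary option (not part of the original answer), if you also control the RabbitMQ Deployment you can expose readiness there directly; a sketch assuming the rabbitmq:3.8-management image, which ships rabbitmq-diagnostics:
containers:
- name: rabbitmq
  image: rabbitmq:3.8-management
  ports:
  - containerPort: 5672    # AMQP
  - containerPort: 15672   # management API
  readinessProbe:
    exec:
      command: ["rabbitmq-diagnostics", "-q", "ping"]
    initialDelaySeconds: 20
    periodSeconds: 10
    timeoutSeconds: 10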
Adapted from this example, as suggested by @Hanx:
Dockerfile
FROM python:3-alpine
ENV RABBIT_HOST="my-rabbit"
ENV RABBIT_VHOST="vhost"
ENV RABBIT_USERNAME="root"
RUN pip install pika
COPY check_rabbitmq_connection.py /check_rabbitmq_connection.py
RUN chmod +x /check_rabbitmq_connection.py
CMD ["sh", "-c", "python /check_rabbitmq_connection.py --host $RABBIT_HOST --username $RABBIT_USERNAME --password $RABBIT_PASSWORD --virtual_host $RABBIT_VHOST"]
check_rabbitmq_connection.py
#!/usr/bin/env python3
# Check connection to the RabbitMQ server
# Source: https://blog.sleeplessbeastie.eu/2017/07/10/how-to-check-connection-to-the-rabbitmq-message-broker/

import argparse
import time

import pika

# define and parse command-line options
parser = argparse.ArgumentParser(description='Check connection to RabbitMQ server')
parser.add_argument('--host', required=True, help='Define RabbitMQ server hostname')
parser.add_argument('--virtual_host', default='/', help='Define virtual host')
parser.add_argument('--port', type=int, default=5672, help='Define port (default: %(default)s)')
parser.add_argument('--username', default='guest', help='Define username (default: %(default)s)')
parser.add_argument('--password', default='guest', help='Define password (default: %(default)s)')
args = vars(parser.parse_args())
print(args)

# set amqp credentials
credentials = pika.PlainCredentials(args['username'], args['password'])
# set amqp connection parameters
parameters = pika.ConnectionParameters(host=args['host'], port=args['port'], virtual_host=args['virtual_host'], credentials=credentials)

# try to establish connection and check its status
while True:
    try:
        connection = pika.BlockingConnection(parameters)
        if connection.is_open:
            print('OK')
            connection.close()
            exit(0)
    except Exception as error:
        print('No connection yet:', error.__class__.__name__)
        time.sleep(5)
Build and run:
docker build -t rabbit-ping .
docker run --rm -it \
--name rabbit-ping \
--net=my-net \
-e RABBIT_PASSWORD="<rabbit password>" \
rabbit-ping
and in deployment.yaml:
apiVersion: apps/v1
kind: Deployment
...
spec:
...
template:
...
spec:
initContainers:
- name: wait-for-rabbit
image: gcr.io/my-org/rabbit-ping
env:
- name: RABBIT_PASSWORD
valueFrom:
secretKeyRef:
name: rabbit
key: rabbit-password
containers:
...

Kubernetes Image goes into CrashLoopBackoff even if entry point is defined

I am trying to run an image on Kubernetes with the Dockerfile below
FROM centos:6.9
COPY rpms/* /tmp/
RUN yum -y localinstall /tmp/*
ENTRYPOINT service test start && /bin/bash
Now when I try to deploy this image using pod.yml as shown below,
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: testpod
  name: testpod
spec:
  containers:
  - image: test:v0.2
    name: test
    imagePullPolicy: Always
    volumeMounts:
    - mountPath: /data
      name: testpod
  volumes:
  - name: testpod
    persistentVolumeClaim:
      claimName: testpod
Now when I try to create the pod, the container goes into CrashLoopBackOff. How can I make the image wait in /bin/bash on Kubernetes, given that when I use docker run -d test:v0.2 it works fine and keeps running?
You need to attach a terminal to the running container. When starting a pod using kubectl run ... you can use -i --tty to do that. In the pod yml file, you can add the following to the container spec to attach a tty.
stdin: true
tty: true
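In context, that would look something like this (a sketch based on the pod above):
spec:
  containers:
  - image: test:v0.2
    name: test
    imagePullPolicy: Always
    stdin: true   # keep stdin open
    tty: true     # allocate a pseudo-TTY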
You can use a command like tail -f /dev/null to keep your container running; this can be done inside your Dockerfile or in your Kubernetes yaml file, as in the sketch below.
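A container fragment sketch for the yaml variant (note that overriding command skips the image's ENTRYPOINT, so the service start is repeated here):
containers:
- name: test
  image: test:v0.2
  command: ["/bin/bash", "-c", "service test start && tail -f /dev/null"]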

Resources