I am trying to set up Spark on Kubernetes on a Mac. I followed a tutorial web page, and it looked straightforward enough to follow.
Below is the Dockerfile.
# base image
FROM java:openjdk-8-jdk
# define spark and hadoop versions
ENV SPARK_VERSION=3.0.0
ENV HADOOP_VERSION=3.3.0
# download and install hadoop
RUN mkdir -p /opt && \
cd /opt && \
curl http://archive.apache.org/dist/hadoop/common/hadoop-${HADOOP_VERSION}/hadoop-${HADOOP_VERSION}.tar.gz | \
tar -zx hadoop-${HADOOP_VERSION}/lib/native && \
ln -s hadoop-${HADOOP_VERSION} hadoop && \
echo Hadoop ${HADOOP_VERSION} native libraries installed in /opt/hadoop/lib/native
# download and install spark
RUN mkdir -p /opt && \
cd /opt && \
curl http://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop2.7.tgz | \
tar -zx && \
ln -s spark-${SPARK_VERSION}-bin-hadoop2.7 spark && \
echo Spark ${SPARK_VERSION} installed in /opt
# add scripts and update spark default config
ADD common.sh spark-master spark-worker /
ADD spark-defaults.conf /opt/spark/conf/spark-defaults.conf
ENV PATH $PATH:/opt/spark/bin
After building the Docker image, I ran the following commands, but the pod does not start.
$ kubectl create -f ./kubernetes/spark-master-deployment.yaml
$ kubectl create -f ./kubernetes/spark-master-service.yaml
spark-master-deployment.yaml
kind: Deployment
apiVersion: apps/v1
metadata:
  name: spark-master
spec:
  replicas: 1
  selector:
    matchLabels:
      component: spark-master
  template:
    metadata:
      labels:
        component: spark-master
    spec:
      containers:
        - name: spark-master
          image: spark-hadoop:3.0.0
          command: ["/spark-master"]
          ports:
            - containerPort: 7077
            - containerPort: 8080
          resources:
            requests:
              cpu: 100m
spark-master-service.yaml
kind: Service
apiVersion: v1
metadata:
  name: spark-master
spec:
  ports:
    - name: webui
      port: 8080
      targetPort: 8080
    - name: spark
      port: 7077
      targetPort: 7077
  selector:
    component: spark-master
To trace the problem, I ran the kubectl describe... command and got the following result.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 45s default-scheduler Successfully assigned default/spark-master-fc7c95485-zn6wf to minikube
Normal Pulled 21s (x3 over 44s) kubelet, minikube Container image "spark-hadoop:3.0.0" already present on machine
Normal Created 21s (x3 over 44s) kubelet, minikube Created container spark-master
Warning Failed 21s (x3 over 43s) kubelet, minikube Error: failed to start container "spark-master": Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "exec: \"/spark-master\": stat /spark-master: no such file or directory": unknown
Warning BackOff 8s (x3 over 42s) kubelet, minikube Back-off restarting failed container
It seems the container never starts, but I can't figure out why, even though I followed the instructions on the web page exactly.
Below is the GitHub repository that the web page uses as the guide for configuring Spark on Kubernetes.
https://github.com/testdrivenio/spark-kubernetes
I assume you are using Minikube.
For Minikube, make the following changes:
Eval the Docker env using: eval $(minikube docker-env)
Build the Docker image: docker build -t my-image .
Set the image name to just "my-image" in the pod specification in your YAML file.
Set the imagePullPolicy to Never in your YAML file. Here is an example:
apiVersion:
kind:
metadata:
spec:
  template:
    metadata:
      labels:
        app: my-image
    spec:
      containers:
        - name: my-image
          image: "my-image"
          imagePullPolicy: Never
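Put together, the workflow would look roughly like this (a sketch; the image tag spark-hadoop:3.0.0 and the manifest paths are taken from your question, and eval $(minikube docker-env) only affects the current shell session):
# point the local Docker CLI at Minikube's Docker daemon (current shell only)
eval $(minikube docker-env)
# build the image inside Minikube so the kubelet can find it locally
docker build -t spark-hadoop:3.0.0 .
# create the master deployment and service
kubectl create -f ./kubernetes/spark-master-deployment.yaml
kubectl create -f ./kubernetes/spark-master-service.yaml
# confirm the pod comes up
kubectl get pods -l component=spark-master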
It seems like you didn't copy the scripts developed by the blogger, which are part of that project. The Dockerfile contains the instruction ADD common.sh spark-master spark-worker /, so without those files your image is missing the script needed to run the master (and you will hit the same problem with the workers). You can clone the project and then build the image, or use the image published by the blogger, mjhea0/spark-hadoop.
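For example, something like this should work (a sketch; I'm assuming the Dockerfile and the common.sh / spark-master / spark-worker scripts sit together in the cloned repo as the blog describes, so check the actual layout first):
git clone https://github.com/testdrivenio/spark-kubernetes.git
cd spark-kubernetes
# common.sh, spark-master and spark-worker must be in the build context
# for the ADD instruction in the Dockerfile to pick them up
docker build -t spark-hadoop:3.0.0 .
# alternatively, skip the build and reuse the published image
docker pull mjhea0/spark-hadoop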
Here you are trying to set up a Spark standalone cluster on Kubernetes, but you can also use Kubernetes itself as the Spark cluster manager: as of Spark 3.1, Kubernetes support is officially GA (it had been experimental since release 2.3); see the official documentation. You can also use spark-on-k8s-operator, developed by Google, to submit and manage Spark jobs on your Kubernetes cluster.
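For illustration, submitting directly to Kubernetes as the cluster manager looks roughly like this (a sketch based on the official docs; the API server address, image name and jar path are placeholders you would replace):
spark-submit \
  --master k8s://https://<k8s-apiserver-host>:<port> \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.container.image=<your-spark-image> \
  local:///path/to/examples.jar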
Related
I've built a custom-spark docker image with the following dependencies:
Python 3.6.9
Pip 1.18
Java OpenJDK 64-Bit Server VM, 1.8.0_212
Hadoop 3.2
Scala 2.13.0
Spark 3.0.3
which I pushed to Docker Hub: https://hub.docker.com/r/redaer7/custom-spark
The Dockerfile, spark-master, and spark-worker files are stored under: https://github.com/redaER7/Custom-Spark
I verified that /spark-master and /spark-worker work correctly when running a container from that image:
docker run -it -d --name spark_1 redaer7/custom-spark:1.0 bash
docker exec -it $CONTAINER_ID /bin/bash
My issue comes when I try to run the image on a K8s cluster, using the following YAML file for the Spark master pod:
kubectl create namespace sparkspace
kubectl -n sparkspace create -f ./spark-master-deployment.yaml
#yaml file
kind: Deployment
apiVersion: apps/v1
metadata:
  name: spark-master
spec:
  replicas: 1 # should always be one
  selector:
    matchLabels:
      component: spark-master
  template:
    metadata:
      labels:
        component: spark-master
    spec:
      containers:
        - name: spark-master
          image: redaer7/custom-spark:1.0
          imagePullPolicy: IfNotPresent
          command: ["/spark-master"]
          ports:
            - containerPort: 7077
            - containerPort: 8080
          resources:
            # limits:
            #   cpu: 1
            #   memory: 1G
            requests:
              cpu: 1 #100m
              memory: 1G
I get CrashLoopBackOff when viewing the pod with kubectl -n sparkspace get pods.
When inspecting with kubectl -n sparkspace describe pod $Pod_Name:
Any clue about that first warning? Thank you.
I solved it simply by making the Deployment re-pull the image:
imagePullPolicy: Always
I had edited the Docker image locally and pushed it to Docker Hub for later deployment, but had not changed the following in the config file, so the cluster kept using its cached copy:
imagePullPolicy: IfNotPresent
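In other words, after editing the image locally you have to push the new tag before the cluster pulls again; roughly like this (a sketch reusing the names from the question; kubectl rollout restart needs a reasonably recent kubectl):
# rebuild with the same tag and push the new layers to Docker Hub
docker build -t redaer7/custom-spark:1.0 .
docker push redaer7/custom-spark:1.0
# with imagePullPolicy: Always, restarting the Deployment picks up the new image
kubectl -n sparkspace rollout restart deployment spark-master
kubectl -n sparkspace get pods -w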
I have been trying to install a go-micro application, but it always fails with the error below:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned default/greeter-7d7c644bdc-dk5q2 to minikube
Normal Pulling 9s (x4 over 3m10s) kubelet, minikube Pulling image "12345.dkr.ecr.ap-south-1.amazonaws.com/micro:latest"
Normal Pulled 9s (x4 over 61s) kubelet, minikube Successfully pulled image "460378929709.dkr.ecr.ap-south-1.amazonaws.com/micro:latest"
Normal Created 8s (x4 over 59s) kubelet, minikube Created container greeter
Warning Failed 8s (x4 over 56s) kubelet, minikube Error: failed to start container "greeter": Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "exec: \"/greeter-srv\": stat /greeter-srv: no such file or directory": unknown
I used this doc for installation; as per the doc I installed the dependencies etcd and NATS, and they are running fine.
Has anyone set up this micro application on Kubernetes? I suspect my Dockerfile itself has an issue, or maybe the YAML.
Can you tell me if I am doing anything wrong? I could not find any proper doc for the Kubernetes installation on the site.
I used the YAML file below for the Kubernetes deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: default
  name: greeter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: greeter-srv
  template:
    metadata:
      labels:
        app: greeter-srv
    spec:
      containers:
        - name: greeter
          command: ["/greeter-srv"]
          image: 12345.dkr.ecr.ap-south-1.amazonaws.com/micro:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 8080
              name: greeter-port
          env:
            - name: MICRO_SERVER_ADDRESS
              value: "0.0.0.0:8080"
            - name: MICRO_BROKER
              value: "nats"
            - name: MICRO_BROKER_ADDRESS
              value: "nats-cluster"
            - name: MICRO_REGISTRY
              value: "etcd"
            - name: MICRO_REGISTRY_ADDRESS
              value: "etcd-cluster-client"
      imagePullSecrets:
        - name: ap-south-1-ecr-registry
Dockerfile
FROM alpine:latest
RUN apk --no-cache add make git go gcc libtool musl-dev
WORKDIR /go/src/
# Configure Go
ENV GOROOT /usr/lib/go
ENV GOPATH /go
ENV PATH /go/bin:$PATH
RUN mkdir -p ${GOPATH}/src ${GOPATH}/bin
COPY . .
COPY greeter-srv /go/src/
RUN make
RUN apk add ca-certificates && \
rm -rf /var/cache/apk/* /tmp/* && \
[ ! -e /etc/nsswitch.conf ] && echo 'hosts: files dns' > /etc/nsswitch.conf
ENTRYPOINT ["/greeter-srv"]
Service file greeter-srv:
package main

import (
    "github.com/micro/go-micro/v2"
)

func main() {
    service := micro.NewService(
        micro.Name("greeter"),
    )
    service.Init()
    service.Run()
}
I built the image with the above Dockerfile, then tagged and pushed it to AWS ECR, and used that registry image in the Kubernetes YAML file.
"stat /greeter-srv: no such file or directory" happens because the Dockerfile sets WORKDIR /go/src/ but uses ENTRYPOINT ["/greeter-srv"], which points at the wrong path; it should be ENTRYPOINT ["/go/src/greeter-srv"].
Your Dockerfile caused the issue: change the ENTRYPOINT to /go/src/greeter-srv.
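Before redeploying you can double-check where the binary actually lives inside the image; illustrative commands only, with the tag taken from your manifest:
# list the binary where the Dockerfile actually put it
docker run --rm --entrypoint ls 12345.dkr.ecr.ap-south-1.amazonaws.com/micro:latest -l /go/src/greeter-srv
# or open a shell and look around
docker run --rm -it --entrypoint sh 12345.dkr.ecr.ap-south-1.amazonaws.com/micro:latest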
I'm running a Kubernetes cluster with minikube, and my deployment (or individual Pods) won't stay running even though I specify in the Dockerfile that it should leave a terminal open (I've also tried it with sh). They keep getting restarted, and sometimes they get stuck in a CrashLoopBackOff status before restarting again:
FROM ubuntu
EXPOSE 8080
CMD /bin/bash
My deployment file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sleeper-deploy
spec:
  replicas: 10
  selector:
    matchLabels:
      app: sleeper-world
  minReadySeconds: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    metadata:
      labels:
        app: sleeper-world
    spec:
      containers:
        - name: sleeper-pod
          image: kubelab
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8080
All in all, my workflow is as follows (deploy.sh):
#!/bin/bash
# Cleaning
kubectl delete deployments --all
kubectl delete pods --all
# Building the Image
sudo docker build \
-t kubelab \
.
# Deploying
kubectl apply -f sleeper_deployment.yml
By the way, I've tested the Docker Container solo using sudo docker run -dt kubelab and it does stay up. Why doesn't it stay up within Kubernetes? Is there a parameter (in the YAML file) or a flag I should be using for this special case?
1. Original Answer (but edited...)
If you are familiar with Docker, check this.
If you are looking for an equivalent of docker run -dt kubelab, try kubectl run -it kubelab --restart=Never --image=ubuntu /bin/bash. In your case, the Docker -t flag allocates a pseudo-TTY, which is why your Docker container stays up.
Try:
kubectl run kubelab \
--image=ubuntu \
--expose \
--port 8080 \
-- /bin/bash -c 'while true;do sleep 3600;done'
Or:
kubectl run kubelab \
--image=ubuntu \
--dry-run -oyaml \
--expose \
--port 8080 \
-- /bin/bash -c 'while true;do sleep 3600;done'
2. Explaining what's going on (Added by Philippe Fanaro):
As stated by @David Maze, the bash process is going to exit immediately because the artificial terminal won't have anything going into it, which is slightly different behavior from Docker.
If you change the restartPolicy, it will still terminate; the difference is that the Pod won't regenerate or restart.
One way of doing it is (pay attention to the indentation of restartPolicy):
apiVersion: v1
kind: Pod
metadata:
  name: kubelab-pod
  labels:
    zone: prod
    version: v1
spec:
  containers:
    - name: kubelab-ctr
      image: kubelab
      imagePullPolicy: IfNotPresent
      ports:
        - containerPort: 8080
  restartPolicy: Never
However, this will not work if it is specified inside a deployment YAML. And that's because deployments force regeneration, trying to always get to the desired state. This can be confirmed in the Deployment Documentation Webpage:
Only a .spec.template.spec.restartPolicy equal to Always is allowed, which is the default if not specified.
3. If you really wish to force the Docker Container to Keep Running
In this case, you will need something that doesn't exit. A server-like process is one example. But you can also try something mentioned in this StackOverflow answer:
CMD exec /bin/bash -c "trap : TERM INT; sleep infinity & wait"
This will keep your container alive until it is told to stop. Using trap and wait will make your container react immediately to a stop request. Without trap/wait stopping will take a few seconds.
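If you'd rather not bake that into the image, the same idea can be applied as a command override in the Deployment; a trimmed-down sketch reusing the names from the question (rolling-update settings omitted, replicas reduced for brevity):
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sleeper-deploy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sleeper-world
  template:
    metadata:
      labels:
        app: sleeper-world
    spec:
      containers:
        - name: sleeper-pod
          image: kubelab
          imagePullPolicy: IfNotPresent
          # override the image's CMD so the container has a long-running process
          command: ["/bin/bash", "-c"]
          args: ["trap : TERM INT; sleep infinity & wait"]
EOF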
I am trying to run an image on Kubernetes with the Dockerfile below:
FROM centos:6.9
COPY rpms/* /tmp/
RUN yum -y localinstall /tmp/*
ENTRYPOINT service test start && /bin/bash
Now when I try to deploy this image using pod.yml as shown below,
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: testpod
  name: testpod
spec:
  containers:
    - image: test:v0.2
      name: test
      imagePullPolicy: Always
      volumeMounts:
        - mountPath: /data
          name: testpod
  volumes:
    - name: testpod
      persistentVolumeClaim:
        claimName: testpod
Now when I try to create the pod, it goes into a CrashLoopBackOff. So how can I make the container wait in /bin/bash on Kubernetes? When I use docker run -d test:v0.2 it works fine and keeps running.
You need to attach a terminal to the running container. When starting a pod using kubectl run ... you can use -i --tty to do that. In the pod YAML file, you can add the following to the container spec to attach a TTY:
stdin: true
tty: true
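For context, those two fields go on the container entry itself; a minimal sketch of the pod from your question with them added (volume details omitted for brevity):
apiVersion: v1
kind: Pod
metadata:
  name: testpod
spec:
  containers:
    - name: test
      image: test:v0.2
      imagePullPolicy: Always
      stdin: true   # keep stdin open, like docker run -i
      tty: true     # allocate a pseudo-TTY, like docker run -t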
You can use a command like tail -f /dev/null to keep your container running; this can be done either in your Dockerfile or in your Kubernetes YAML file.
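For example (illustrative only; either place works, you don't need both):
# in the Dockerfile
CMD ["tail", "-f", "/dev/null"]
# or in the container spec of the pod YAML
command: ["tail", "-f", "/dev/null"]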
Hope you are all doing well.
Env: CentOS 7.3.1611, Kubernetes 1.5, Docker 1.12
Problem 1: The extended JBoss Docker image is not working, although the image was built successfully.
The pod gets an error; see step 7 below.
Problem 2: Once problem #1 is fixed, I wish to upload the image to Docker Hub: https://hub.docker.com/
How can I upload it? Please share the steps if possible.
1) pull
docker pull jboss/wildfly
2) vi Dockerfile
FROM jboss/wildfly
RUN /opt/jboss/wildfly/bin/add-user.sh admin admin123$ --silent
CMD ["/opt/jboss/wildfly/bin/standalone.sh", "-b", "0.0.0.0", "-bmanagement", "0.0.0.0"]
3) Extend docker image
docker build --tag=nbasetty/wildfly-server .
4) [root@centos7 custom-jboss]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
nbasetty/wildfly-server latest c1fbb87faffd 43 minutes ago 583.8 MB
docker.io/httpd latest e0645af13ada 2 weeks ago 177.5 MB
5) vi jboss-wildfly-rc-service-custom.yaml
apiVersion: v1
kind: Service
metadata:
  name: wildfly-service
spec:
  externalIPs:
    - 10.0.2.15
  selector:
    app: wildfly-rc-pod
  ports:
    - name: web
      port: 8080
    #- name: admin-console
    #  port: 9990
  type: LoadBalancer
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: wildfly-rc
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: wildfly-rc-pod
    spec:
      containers:
        - name: wildfly
          image: nbasetty/wildfly-server
          ports:
            - containerPort: 8080
            #- containerPort: 9990
6) kubectl create -f jboss-wildfly-rc-service-custom.yaml
7) [root@centos7 jboss]# kubectl get pods
NAME READY STATUS RESTARTS AGE
mysql-pvc-pod 1/1 Running 6 2d
wildfly-rc-d0k3h 0/1 ImagePullBackOff 0 23m
wildfly-rc-hgsfj 0/1 ImagePullBackOff 0 23m
[root@centos7 jboss]# kubectl logs wildfly-rc-d0k3h
Error from server (BadRequest): container "wildfly" in pod
"wildfly-rc-d0k3h" is waiting to start:
trying and failing to pull image
Glad you have found a way to make it work. Here are the steps I followed:
I labeled node-01 with 'dbserver: mysql' (the labeling command is shown after the manifest below).
I built the Docker image on node-01.
I created this pod, and it worked:
apiVersion: v1
kind: ReplicationController
metadata:
  name: wildfly-rc
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: wildfly-rc-pod
    spec:
      containers:
        - name: wildfly
          image: nbasetty/wildfly-server
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8080
      nodeSelector:
        dbserver: mysql
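For that nodeSelector to match, the node has to carry the label first; something along these lines, assuming the node is registered as node-01:
kubectl label node node-01 dbserver=mysql
# verify
kubectl get nodes --show-labels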
Re-creating the issue:
docker pull jboss/wildfly
mkdir jw
cd jw
echo 'FROM jboss/wildfly
RUN /opt/jboss/wildfly/bin/add-user.sh admin admin123$ --silent
CMD ["/opt/jboss/wildfly/bin/standalone.sh", "-b", "0.0.0.0", "-bmanagement", "0.0.0.0"]' | tee Dockerfile
docker build --tag=docker.io/surajd/wildfly-server .
See the images available:
# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
docker.io/surajd/wildfly-server latest 10e96902ea12 11 seconds ago 583.8 MB
Create a config that works:
echo '
apiVersion: v1
kind: Service
metadata:
  name: wildfly
spec:
  selector:
    app: wildfly
  ports:
    - name: web
      port: 8080
  type: LoadBalancer
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: wildfly
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: wildfly
    spec:
      containers:
        - name: wildfly
          image: docker.io/surajd/wildfly-server
          imagePullPolicy: Never
          ports:
            - containerPort: 8080
' | tee config.yaml
kubectl create -f config.yaml
Notice the field imagePullPolicy: Never; it makes Kubernetes use the image already available on the node (the one we built with docker build). This works on a single-node cluster but may or may not work on a multi-node cluster, so that value is generally not recommended. Since we are experimenting on a single-node cluster we can set it to Never here; otherwise always set imagePullPolicy: Always, so that whenever the pod is scheduled the image is pulled from the registry. Read about imagePullPolicy and related config tips.
Now, for the image to be pulled from a registry it has to be on that registry in the first place. So, to answer your question about pushing it to Docker Hub, run:
docker push docker.io/surajd/wildfly-server
In the above example, replace surajd with your own Docker registry username.
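If your local image was built under a different name, retag it before pushing; a sketch with a placeholder username:
docker login
# retag the locally built image under your Docker Hub username, then push
docker tag nbasetty/wildfly-server docker.io/<your-username>/wildfly-server
docker push docker.io/<your-username>/wildfly-server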
Here are the steps I used to set up a single-node cluster on CentOS:
My machine version:
$ cat /etc/redhat-release
CentOS Linux release 7.3.1611 (Core)
Here is what I have done:
Set up a single-node k8s cluster on CentOS as follows (src1 & src2):
yum update -y
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg
https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
setenforce 0
yum install -y docker kubelet kubeadm kubectl kubernetes-cni
systemctl enable docker && systemctl start docker
systemctl enable kubelet && systemctl start kubelet
sysctl net.bridge.bridge-nf-call-iptables=1
sysctl net.bridge.bridge-nf-call-ip6tables=1
kubeadm init
cp /etc/kubernetes/admin.conf $HOME/
chown $(id -u):$(id -g) $HOME/admin.conf
export KUBECONFIG=$HOME/admin.conf
kubectl taint nodes --all node-role.kubernetes.io/master-
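To double-check that the single node is schedulable after removing the master taint (just a verification step):
kubectl get nodes
kubectl get pods --all-namespaces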
Now k8s version:
# kubectl version
Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.4", GitCommit:"d6f433224538d4f9ca2f7ae19b252e6fcb66a3ae", GitTreeState:"clean", BuildDate:"2017-05-19T18:44:27Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.4", GitCommit:"d6f433224538d4f9ca2f7ae19b252e6fcb66a3ae", GitTreeState:"clean", BuildDate:"2017-05-19T18:33:17Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}