Kubernetes: sharing a volume between containers in one pod - docker

I have a question about sharing a volume between containers in one pod.
Here is my YAML, pod-volume.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: volume-pod
spec:
  containers:
  - name: tomcat
    image: tomcat
    imagePullPolicy: Never
    ports:
    - containerPort: 8080
    volumeMounts:
    - name: app-logs
      mountPath: /usr/local/tomcat/logs
  - name: busybox
    image: busybox
    command: ["sh", "-c", "tail -f /logs/catalina.out*.log"]
    volumeMounts:
    - name: app-logs
      mountPath: /logs
  volumes:
  - name: app-logs
    emptyDir: {}
Create the pod:
kubectl create -f pod-volume.yaml
Watch the pod status:
watch kubectl get pod -n default
Finally, I got this:
NAME READY STATUS RESTARTS AGE
redis-php 2/2 Running 0 15h
volume-pod 1/2 CrashLoopBackOff 5 6m49s
Then I checked the logs of the busybox container:
kubectl logs pod/volume-pod -c busybox
tail: can't open '/logs/catalina.out*.log': No such file or directory
tail: no files
I don't know where it went wrong.
Is it related to the order in which containers start in a pod? Please help me, thanks.

For this case:
The Catalina log file is named catalina.$(date '+%Y-%m-%d').log, so nothing matches catalina.out*.log.
Also, you should not put a * inside the quoted shell command: when the glob matches nothing, the shell passes the pattern to tail literally, tail exits with "No such file or directory" (which is the error in your logs), and the container goes into CrashLoopBackOff.
So please try:
command: ["sh", "-c", "tail -f /logs/catalina.$(date '+%Y-%m-%d').log"]

Related

Build Kubernetes cluster with spark master and spark workers

I've built a custom-spark docker image with the following dependencies:
Python 3.6.9
Pip 1.18
Java OpenJDK 64-Bit Server VM, 1.8.0_212
Hadoop 3.2
Scala 2.13.0
Spark 3.0.3
which I pushed to Docker Hub: https://hub.docker.com/r/redaer7/custom-spark
The Dockerfile, spark-master and spark-worker files are stored under: https://github.com/redaER7/Custom-Spark
I verified that /spark-master and /spark-worker work well when creating a container from the image:
docker run -it -d --name spark_1 redaer7/custom-spark:1.0 bash
docker exec -it $CONTAINER_ID /bin/bash
My issue is when I try to build a K8s cluster from the previous image with the following YAML file for the spark master pod:
kubectl create namespace sparkspace
kubectl -n sparkspace create -f ./spark-master-deployment.yaml
#yaml file
kind: Deployment
apiVersion: apps/v1
metadata:
  name: spark-master
spec:
  replicas: 1 # should always be one
  selector:
    matchLabels:
      component: spark-master
  template:
    metadata:
      labels:
        component: spark-master
    spec:
      containers:
      - name: spark-master
        image: redaer7/custom-spark:1.0
        imagePullPolicy: IfNotPresent
        command: ["/spark-master"]
        ports:
        - containerPort: 7077
        - containerPort: 8080
        resources:
          # limits:
          #   cpu: 1
          #   memory: 1G
          requests:
            cpu: 1 #100m
            memory: 1G
I get CrashLoopBackOff when viewing the pod with kubectl -n sparkspace get pods.
When inspecting it with kubectl -n sparkspace describe pod $Pod_Name, the events show a warning.
Any clue about that first warning? Thank you.
I simply solved it by re-pulling the image:
imagePullPolicy: Always
I had edited the Docker image locally and pushed it to Docker Hub for later deployment, but I hadn't changed the following in the config file:
imagePullPolicy: IfNotPresent
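For reference, a minimal sketch of the relevant part of the container spec with the changed pull policy (the image and command are the ones from the deployment above; everything else stays as is):
      containers:
      - name: spark-master
        image: redaer7/custom-spark:1.0
        imagePullPolicy: Always # always pull, so a locally rebuilt image pushed to Docker Hub is picked up
        command: ["/spark-master"]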

Why can't I read files from a shared PersistentVolumeClaim between containers in Kubernetes?

I have a Docker image, felipeogutierrez/tpch-dbgen, that I build using docker-compose and push to the Docker Hub registry using Travis CI.
version: "3.7"
services:
other-images: ....
tpch-dbgen:
build: ../docker/tpch-dbgen
image: felipeogutierrez/tpch-dbgen
volumes:
- tpch-dbgen-data:/opt/tpch-dbgen/data/
- datarate:/tmp/
stdin_open: true
and this is the Dockerfile to build this image:
FROM gcc AS builder
RUN mkdir -p /opt
COPY ./generate-tpch-dbgen.sh /opt/generate-tpch-dbgen.sh
WORKDIR /opt
RUN chmod +x generate-tpch-dbgen.sh && ./generate-tpch-dbgen.sh
In the end, this script creates a directory /opt/tpch-dbgen/data/ with some files that I would like to read from another Docker image that I run on Kubernetes. I have a Flink image that I created to run on Kubernetes. This image starts 3 Flink Task Managers and one stream application that reads files from the tpch-dbgen-data volume. I think the right approach is to create a PersistentVolumeClaim so I can share the directory /opt/tpch-dbgen/data/ from the image felipeogutierrez/tpch-dbgen with my Flink image in Kubernetes. So, first I have this file to create the PersistentVolumeClaim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: tpch-dbgen-data-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 200Mi
Then I create an init container to launch the image felipeogutierrez/tpch-dbgen, and after that launch my image felipeogutierrez/explore-flink:1.11.1-scala_2.12:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flink-taskmanager
spec:
  replicas: 3
  selector:
    matchLabels:
      app: flink
      component: taskmanager
  template:
    metadata:
      labels:
        app: flink
        component: taskmanager
    spec:
      initContainers:
      - name: tpch-dbgen
        image: felipeogutierrez/tpch-dbgen
        #imagePullPolicy: Always
        env:
        command: ["ls"]
        # command: ['sh', '-c', 'for i in 1 2 3; do echo "job-1 `date`" && sleep 5s; done;', 'ls']
        volumeMounts:
        - name: tpch-dbgen-data
          mountPath: /opt/tpch-dbgen/data
      containers:
      - name: taskmanager
        image: felipeogutierrez/explore-flink:1.11.1-scala_2.12
        #imagePullPolicy: Always
        env:
        args: ["taskmanager"]
        ports:
        - containerPort: 6122
          name: rpc
        - containerPort: 6125
          name: query-state
        livenessProbe:
          tcpSocket:
            port: 6122
          initialDelaySeconds: 30
          periodSeconds: 60
        volumeMounts:
        - name: flink-config-volume
          mountPath: /opt/flink/conf/
        - name: tpch-dbgen-data
          mountPath: /opt/tpch-dbgen/data
        securityContext:
          runAsUser: 9999 # refers to user _flink_ from official flink image, change if necessary
      volumes:
      - name: flink-config-volume
        configMap:
          name: flink-config
          items:
          - key: flink-conf.yaml
            path: flink-conf.yaml
          - key: log4j-console.properties
            path: log4j-console.properties
      - name: tpch-dbgen-data
        persistentVolumeClaim:
          claimName: tpch-dbgen-data-pvc
The Flink stream application starts, but it cannot read the files in the directory /opt/tpch-dbgen/data of the image felipeogutierrez/tpch-dbgen. I am getting the error: java.io.FileNotFoundException: /opt/tpch-dbgen/data/orders.tbl (No such file or directory). It is strange because when I go into the container felipeogutierrez/tpch-dbgen I can list the files, so I suppose there is something wrong in my Kubernetes configuration. Can anyone point out what I am missing in the Kubernetes configuration files?
$ docker run -i -t felipeogutierrez/tpch-dbgen /bin/bash
root@10c0944a95f8:/opt# pwd
/opt
root@10c0944a95f8:/opt# ls tpch-dbgen/data/
customer.tbl dbgen dists.dss lineitem.tbl nation.tbl orders.tbl part.tbl partsupp.tbl region.tbl supplier.tbl
Also, when I list the logs of the container tpch-dbgen I can see the directory tpch-dbgen that I want to read, although I cannot execute command: ["ls tpch-dbgen"] inside my Kubernetes config file.
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
flink-jobmanager-n9nws 1/1 Running 2 17m
flink-taskmanager-777cb5bf77-ncdl4 1/1 Running 0 4m54s
flink-taskmanager-777cb5bf77-npmrx 1/1 Running 0 4m54s
flink-taskmanager-777cb5bf77-zc2nw 1/1 Running 0 4m54s
$ kubectl logs flink-taskmanager-777cb5bf77-ncdl4 tpch-dbgen
generate-tpch-dbgen.sh
tpch-dbgen
Docker has an unusual feature where, under some specific circumstances, it will populate a newly created volume from the image. You should not rely on this functionality, since it completely ignores updates in the underlying images and it doesn't work on Kubernetes.
In your Kubernetes setup, you create a new empty PersistentVolumeClaim, and then mount this over your actual data in both the init and main containers. As with all Unix mounts, this hides the data that was previously in that directory. Nothing causes data to get copied into that volume. This works the same way as every other kind of mount, except the Docker named-volume mount: you'll see the same behavior if you change your Compose setup to do a host bind mount, or if you play around with your local development system using a USB drive as a "volume".
You need to make your init container (or something else) explicitly copy data into the directory. For example:
initContainers:
- name: tpch-dbgen
  image: felipeogutierrez/tpch-dbgen
  command:
  - /bin/cp
  - -a
  - /opt/tpch-dbgen/data
  - /data
  volumeMounts:
  - name: tpch-dbgen-data
    mountPath: /data # NOT the same path as in the image
If the main process modifies these files in place, you can make the command be more intelligent, or write a script into your image that only copies the individual files in if they don't exist yet.
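A minimal sketch of such a copy-if-missing init container, assuming the same image and a /data mount point as in the example above (the loop itself is illustrative, not part of the original setup):
initContainers:
- name: tpch-dbgen
  image: felipeogutierrez/tpch-dbgen
  command:
  - /bin/sh
  - -c
  - |
    # copy each generated file into the volume only if it is not already there;
    # the resulting layout must match whatever path the main container reads from
    cd /opt/tpch-dbgen/data
    for f in *; do
      [ -e "/data/$f" ] || cp -a "$f" /data/
    done
  volumeMounts:
  - name: tpch-dbgen-data
    mountPath: /data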
It could potentially make more sense to have your image generate the data files at startup time, rather than at image-build time. That could look like:
FROM gcc
COPY ./generate-tpch-dbgen.sh /usr/local/bin/
RUN chmod +x /usr/local/bin/generate-tpch-dbgen.sh
CMD ["generate-tpch-dbgen.sh"]
Then in your init container, you can run the default command (the generate script) with the working directory set to the volume directory:
initContainers:
- name: tpch-dbgen
  image: felipeogutierrez/tpch-dbgen
  volumeMounts:
  - name: tpch-dbgen-data
    mountPath: /opt/tpch-dbgen/data # or anywhere really
  workingDir: /opt/tpch-dbgen/data # matching mountPath
I got the PersistentVolumeClaim to work and shared it between pods. Basically I had to use the subPath property, which I learned from this answer https://stackoverflow.com/a/43404857/2096986, and I am using a simple Job, which I learned from this answer https://stackoverflow.com/a/64023672/2096986. The final result is below.
The Dockerfile:
FROM gcc AS builder
RUN mkdir -p /opt
COPY ./generate-tpch-dbgen.sh /opt/generate-tpch-dbgen.sh
WORKDIR /opt
RUN chmod +x /opt/generate-tpch-dbgen.sh
ENTRYPOINT ["/bin/sh","/opt/generate-tpch-dbgen.sh"]
and the script generate-tpch-dbgen.sh has to end with the line sleep infinity & wait so that it does not exit. The PersistentVolumeClaim is the same as in the question. Then I create a Job with the subPath property:
apiVersion: batch/v1
kind: Job
metadata:
  name: tpch-dbgen-job
spec:
  template:
    metadata:
      labels:
        app: flink
        component: tpch-dbgen
    spec:
      restartPolicy: OnFailure
      volumes:
      - name: tpch-dbgen-data
        persistentVolumeClaim:
          claimName: tpch-dbgen-data-pvc
      containers:
      - name: tpch-dbgen
        image: felipeogutierrez/tpch-dbgen
        imagePullPolicy: Always
        volumeMounts:
        - mountPath: /opt/tpch-dbgen/data
          name: tpch-dbgen-data
          subPath: data
and I use it in the other deployment, also with the subPath property:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flink-taskmanager
spec:
  replicas: 3
  selector:
    matchLabels:
      app: flink
      component: taskmanager
  template:
    metadata:
      labels:
        app: flink
        component: taskmanager
    spec:
      volumes:
      - name: flink-config-volume
        configMap:
          name: flink-config
          items:
          - key: flink-conf.yaml
            path: flink-conf.yaml
          - key: log4j-console.properties
            path: log4j-console.properties
      - name: tpch-dbgen-data
        persistentVolumeClaim:
          claimName: tpch-dbgen-data-pvc
      containers:
      - name: taskmanager
        image: felipeogutierrez/explore-flink:1.11.1-scala_2.12
        imagePullPolicy: Always
        env:
        args: ["taskmanager"]
        ports:
        - containerPort: 6122
          name: rpc
        - containerPort: 6125
          name: query-state
        livenessProbe:
          tcpSocket:
            port: 6122
          initialDelaySeconds: 30
          periodSeconds: 60
        volumeMounts:
        - name: flink-config-volume
          mountPath: /opt/flink/conf/
        - name: tpch-dbgen-data
          mountPath: /opt/tpch-dbgen/data
          subPath: data
        securityContext:
          runAsUser: 9999 # refers to user _flink_ from official flink image, change if necessary
Maybe the issue is the accessMode you set on your PVC. ReadWriteOnce means the volume can be mounted read-write by a single node only, so pods scheduled on other nodes cannot use it.
See the Kubernetes documentation on access modes for details.
You could try to use ReadWriteMany.
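For illustration, a sketch of the PVC from the question with the access mode changed; note that ReadWriteMany is only offered by some storage backends (NFS-style provisioners, for example), so whether it is available is an assumption about your cluster:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: tpch-dbgen-data-pvc
spec:
  accessModes:
  - ReadWriteMany # requires a storage class/provisioner that supports it
  resources:
    requests:
      storage: 200Mi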
Your generate-tpch-dbgen.sh script is executed while building the Docker image, which puts those files in the /opt/tpch-dbgen/data directory. So when you run the image, you can see those files.
But the problem with the k8s PVC is that when you mount the (initially empty) volume into your containers, it hides the /opt/tpch-dbgen/data directory along with the files in it.
Solution:
Don't execute generate-tpch-dbgen.sh while building the Docker image; execute it at runtime instead. Then the files will be created in the shared PV by the init container.
Something like below:
FROM gcc AS builder
RUN mkdir -p /opt
COPY ./generate-tpch-dbgen.sh /opt/generate-tpch-dbgen.sh
RUN chmod +x /opt/generate-tpch-dbgen.sh
ENTRYPOINT ["/bin/sh","/opt/generate-tpch-dbgen.sh"]

Kubernetes Image goes into CrashLoopBackoff even if entry point is defined

I am trying to run an image on Kubernetes with the Dockerfile below:
FROM centos:6.9
COPY rpms/* /tmp/
RUN yum -y localinstall /tmp/*
ENTRYPOINT service test start && /bin/bash
Now when I try to deploy this image using pod.yml as shown below,
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: testpod
  name: testpod
spec:
  containers:
  - image: test:v0.2
    name: test
    imagePullPolicy: Always
    volumeMounts:
    - mountPath: /data
      name: testpod
  volumes:
  - name: testpod
    persistentVolumeClaim:
      claimName: testpod
Now when I try to create the pod, the container goes into CrashLoopBackOff. How can I make the container wait in /bin/bash on Kubernetes? When I use docker run -d test:v0.2 it works fine and keeps running.
You need to attach a terminal to the running container. When starting a pod using kubectl run ... you can use -i --tty to do that. In the pod YAML file, you can add the following to the container spec to attach a tty:
stdin: true
tty: true
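A minimal sketch of where those two fields go, reusing the container from the pod spec in the question:
containers:
- image: test:v0.2
  name: test
  imagePullPolicy: Always
  stdin: true
  tty: true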
You can use a command like tail -f /dev/null to keep your container running; this can be done inside your Dockerfile or in your Kubernetes YAML file.
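For example, a rough sketch that overrides the container command in the pod YAML (service test start is taken from the Dockerfile in the question; an equivalent CMD could be used in the Dockerfile instead):
containers:
- image: test:v0.2
  name: test
  command: ["/bin/sh", "-c", "service test start && tail -f /dev/null"]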

Disable Transparent Huge Pages from Kubernetes

I deploy a Redis container via Kubernetes and get the following warning:
WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled
Is it possible to disable THP via Kubernetes? Perhaps via init-containers?
Yes, with init-containers it's quite straightforward:
apiVersion: v1
kind: Pod
metadata:
  name: thp-test
spec:
  restartPolicy: Never
  terminationGracePeriodSeconds: 1
  volumes:
  - name: host-sys
    hostPath:
      path: /sys
  initContainers:
  - name: disable-thp
    image: busybox
    volumeMounts:
    - name: host-sys
      mountPath: /host-sys
    command: ["sh", "-c", "echo never >/host-sys/kernel/mm/transparent_hugepage/enabled"]
  containers:
  - name: busybox
    image: busybox
    command: ["cat", "/sys/kernel/mm/transparent_hugepage/enabled"]
Demo (notice that this is a system wide setting):
$ ssh THATNODE cat /sys/kernel/mm/transparent_hugepage/enabled
always [madvise] never
$ kubectl create -f thp-test.yaml
pod "thp-test" created
$ kubectl logs thp-test
always madvise [never]
$ kubectl delete pod thp-test
pod "thp-test" deleted
$ ssh THATNODE cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
I don't know if what I did is a good idea, but we needed to deactivate THP on all our K8s VMs for all our apps, so I used a DaemonSet instead of adding an init container to all our stacks:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: thp-disable
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: thp-disable
  template:
    metadata:
      labels:
        name: thp-disable
    spec:
      restartPolicy: Always
      terminationGracePeriodSeconds: 1
      volumes:
      - name: host-sys
        hostPath:
          path: /sys
      initContainers:
      - name: disable-thp
        image: busybox
        volumeMounts:
        - name: host-sys
          mountPath: /host-sys
        command: ["sh", "-c", "echo never >/host-sys/kernel/mm/transparent_hugepage/enabled"]
      containers:
      - name: busybox
        image: busybox
        command: ["watch", "-n", "600", "cat", "/sys/kernel/mm/transparent_hugepage/enabled"]
I think it's a little dirty but it works.

How to convert a docker run command to a .yaml file

Below is the docker run command I use to start the container:
docker run -it -d --name=aaa --net=host -v /opt/headedness/phantomjs:/data/phantomjs/bin/phantomjs -v /opt/ctcrawler/log:/data/log XXX/app/aaa:latest -id aaa -endpoint http://localhost:8080/c2/ -selenium http://localhost:4444/wd/hub
How do I convert it to a YAML file? I have tried many ways, but it still isn't working.
Below is my .yaml file (please help):
apiVersion: v1
kind: Pod
metadata:
  name: aaa
spec:
  containers:
  - name: aaa
    image: xxx/app/aaa:latest
    net: "host"
    args:
    - -id: aaa
    - -phantomjs: /data/phantomjs/bin/phantomjs
    - -capturedPath: /data/log
    - -endpoint: http://wwww/c2/
    - -selenium: http://localhost:4444/wd/hub
    imagePullPolicy: Always
  imagePullSecrets:
  - name: myregistrykey
Your spec is invalid.
For host networking, set hostNetwork: true in the pod spec (rather than net: "host" on the container).
Use hostPath volumes to mount directories from the host.
If it is configuration data, you can use a gitRepo volume or, starting from v1.2, a ConfigMap.
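Putting those points together, a rough sketch of what the pod could look like; the paths, image and arguments come from the docker run command in the question, while the exact argument format the aaa binary expects is an assumption:
apiVersion: v1
kind: Pod
metadata:
  name: aaa
spec:
  hostNetwork: true # replaces --net=host
  volumes:
  - name: phantomjs
    hostPath:
      path: /opt/headedness/phantomjs
  - name: logdir
    hostPath:
      path: /opt/ctcrawler/log
  containers:
  - name: aaa
    image: xxx/app/aaa:latest
    imagePullPolicy: Always
    args: ["-id", "aaa", "-endpoint", "http://localhost:8080/c2/", "-selenium", "http://localhost:4444/wd/hub"]
    volumeMounts:
    - name: phantomjs
      mountPath: /data/phantomjs/bin/phantomjs
    - name: logdir
      mountPath: /data/log
  imagePullSecrets:
  - name: myregistrykey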
