How to read multiple PDF files from a MapR volume, run OCR on them, and write the results back to a MapR volume using Kubernetes Jobs (Docker)

apiVersion: batch/v1
kind: Job
metadata:
  name: parallel-job
  namespace:
spec:
  completions: 6    # number of times to run
  parallelism: 2    # number of pods that can run in parallel
  backoffLimit: 6   # number of retries before throwing error
  activeDeadlineSeconds: 10   # time to allow job to run
  template:
    metadata:
      labels:
        app: kubernetes-series
        tier: job
    spec:
      restartPolicy: OnFailure
      containers:
      - name: job-fibonacci-2
        image:
        args:
        - sleep
        - "1000000"
        resources:
          requests:
            memory: 500M
            cpu: 100
        volumeMounts:
        - name: maprvolume
          mountPath:
      volumes:
      - name: maprvolume
        persistentVolumeClaim:
          claimName: pvc-mapr
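For the OCR step itself, a rough sketch of how the blanks could be completed is below. Everything in it is hypothetical (the image name, the ocrmypdf tool, the /mapr mount point and input/output directories); only the batch/v1 Job shape and the pvc-mapr claim come from the manifest above:
apiVersion: batch/v1
kind: Job
metadata:
  name: pdf-ocr                                      # hypothetical name
spec:
  backoffLimit: 6
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - name: pdf-ocr
        image: registry.example.com/pdf-ocr:latest   # hypothetical image with OCR tooling (e.g. ocrmypdf) installed
        command: ["/bin/sh", "-c"]
        args:
        - |
          # OCR every PDF found on the mounted MapR volume and write the
          # result into an output directory on the same volume
          for f in /mapr/input/*.pdf; do
            ocrmypdf "$f" "/mapr/output/$(basename "$f")"
          done
        volumeMounts:
        - name: maprvolume
          mountPath: /mapr                           # hypothetical mount point inside the container
      volumes:
      - name: maprvolume
        persistentVolumeClaim:
          claimName: pvc-mapr
Note that running several completions in parallel, as the original spec does, would need the PDFs split between pods (an Indexed Job or a work queue); otherwise every pod would OCR the same files. That part is beyond this sketch.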

Related

Install Cassandra exporter for Prometheus monitoring in a Cassandra pod in Kubernetes

I am using the Cassandra image with the following StatefulSet:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  serviceName: cassandra
  replicas: 3
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      terminationGracePeriodSeconds: 1800
      containers:
      - name: cassandra
        image: gcr.io/google-samples/cassandra:v13
        imagePullPolicy: Always
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042
          name: cql
        resources:
          limits:
            cpu: "500m"
            memory: 1Gi
          requests:
            cpu: "500m"
            memory: 1Gi
        securityContext:
          capabilities:
            add:
            - IPC_LOCK
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/sh
              - -c
              - nodetool drain
        env:
        - name: MAX_HEAP_SIZE
          value: 512M
        - name: HEAP_NEWSIZE
          value: 100M
        - name: CASSANDRA_SEEDS
          value: "cassandra-0.cassandra.default.svc.cluster.local"
        - name: CASSANDRA_CLUSTER_NAME
          value: "K8Demo"
        - name: CASSANDRA_DC
          value: "DC1-K8Demo"
        - name: CASSANDRA_RACK
          value: "Rack1-K8Demo"
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        readinessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - /ready-probe.sh
          initialDelaySeconds: 15
          timeoutSeconds: 5
        # These volume mounts are persistent. They are like inline claims,
        # but not exactly because the names need to match exactly one of
        # the stateful pod volumes.
        volumeMounts:
        - name: cassandra-data
          mountPath: /cassandra_data
  # These are converted to volume claims by the controller
  # and mounted at the paths mentioned above.
  # do not use these in production until ssd GCEPersistentDisk or other ssd pd
  volumeClaimTemplates:
  - metadata:
      name: cassandra-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: fast
      resources:
        requests:
          storage: 1Gi
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: fast
provisioner: k8s.io/minikube-hostpath
parameters:
  type: pd-ssd
Now I need to add the line below to cassandra-env.sh, either in a postStart hook or in the Cassandra YAML file:
JVM_OPTS="$JVM_OPTS -javaagent:$CASSANDRA_HOME/lib/cassandra-exporter-agent-<version>.jar"
I was able to achieve this, but Cassandra then requires a restart, and since it is already running as a pod I don't know how to restart the process. So is there any way to do this step before the pod starts rather than after it is up?
I was given the following suggestion:
This won't work. Commands run in postStart don't affect the already-running container. You need to change the startup command passed to Cassandra.
The only way I know to do this is to create a new container image in the artifactory, based on the existing image, and pull from there.
But I don't know how to achieve this.
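One way to build such a derived image is sketched below, under a few assumptions: the cassandra-exporter agent jar has been downloaded next to the Dockerfile, and Cassandra in the gcr.io/google-samples/cassandra:v13 image lives under the path given in the ARG (adjust it to wherever cassandra-env.sh actually sits in your image; the jar version stays a placeholder as in the question):
# Dockerfile - derive a new image from the one the StatefulSet already uses
FROM gcr.io/google-samples/cassandra:v13

# assumed install location; adjust to wherever cassandra-env.sh lives in your image
ARG CASSANDRA_HOME=/usr/local/apache-cassandra

# ship the exporter agent inside the image (jar name/version is a placeholder)
COPY cassandra-exporter-agent-<version>.jar ${CASSANDRA_HOME}/lib/

# append the -javaagent flag so it is picked up on every startup
RUN echo 'JVM_OPTS="$JVM_OPTS -javaagent:$CASSANDRA_HOME/lib/cassandra-exporter-agent-<version>.jar"' \
    >> ${CASSANDRA_HOME}/conf/cassandra-env.sh
Then build and push it under a tag of your own (hypothetical names here):
docker build -t myrepo/cassandra-exporter:v13 .
docker push myrepo/cassandra-exporter:v13
Finally, change image: gcr.io/google-samples/cassandra:v13 in the StatefulSet to the new tag and kubectl apply it; the StatefulSet controller replaces the pods one by one, and each new pod starts Cassandra with the agent already configured, so no in-pod restart is needed.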

Binary not found in Kubernetes deployment

I'm trying to deploy RocketMQ on my testing cluster. I started from the scripts provided in the apache/rocketmq-docker repo on GitHub, but they do not work. I created my own YAML deployment starting from the one in the repo cited above; it works for mqnamesrv, but not for the broker. The two deployments follow:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rocketmq-namesrv
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rocketmq-namesrv
  template:
    metadata:
      labels:
        app: rocketmq-namesrv
    spec:
      containers:
      - name: namesrv
        image: myrepo/rocketmq:4.9.3-alpine
        command: ["sh", "mqnamesrv"]
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            memory: "128Mi"
            cpu: "400m"
        ports:
        - containerPort: 9876
        volumeMounts:
        - name: namesrv-log
          mountPath: /var/log
      volumes:
      - name: namesrv-log
        persistentVolumeClaim:
          claimName: rocketmq-namesrv-pvc
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rocketmq-broker
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rocketmq-broker
  template:
    metadata:
      labels:
        app: rocketmq-broker
    spec:
      containers:
      - name: broker
        image: myrepo/rocketmq:4.9.3-alpine
        command: ["sh", "mqbroker", "-n", "localhost:9876"]
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            memory: "128Mi"
            cpu: "400m"
        ports:
        - containerPort: 10909
        - containerPort: 10911
        volumeMounts:
        - name: broker-log
          mountPath: /var/log
        - name: broker-store
          mountPath: /home/rocketmq
      volumes:
      - name: broker-log
        persistentVolumeClaim:
          claimName: rocketmq-broker-log-pvc
      - name: broker-store
        persistentVolumeClaim:
          claimName: rocketmq-broker-store-pvc
The image rocketmq:4.9.3-alpine was created following the procedure in the apache/rocketmq-docker repo.
After the deployment, rocketmq-namesrv works, but the broker's pod logs sh: can't open 'mqbroker': No such file or directory. But if I run the container manually with kubectl run -ti rocketmq-broker --image=myrepo/rocketmq:4.9.3-alpine --restart=Never -- sh mqbroker -n localhost:9876, it works...
What could the problem in the YAML be? Am I doing something wrong?
I think the problem is with the mount path:
- name: broker-store
  mountPath: /home/rocketmq
Mounting a volume there hides the directory where the image keeps the RocketMQ scripts and binaries, so mqbroker is not found and you get that error.
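A sketch of one way around it, assuming the broker's default store directory is ~/store (i.e. /home/rocketmq/store for the rocketmq user); the point is simply to mount the PVC somewhere that does not shadow /home/rocketmq itself:
        volumeMounts:
        - name: broker-log
          mountPath: /var/log
        - name: broker-store
          mountPath: /home/rocketmq/store   # assumed store path; any directory that does not hide the installation works
If your image keeps the store elsewhere, mount the claim at that path instead, or point the broker configuration's storePathRootDir at whatever path you do mount.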

Jenkins pod restarts: how to persist Jenkins configuration and plugins

I have deployed Jenkins as part of a Kubernetes YAML file and also enabled a persistent volume claim, but when my Jenkins pod restarts I lose all the jobs and configuration, which means I have to re-install all the suggested Jenkins plugins, configure the Kubernetes cloud, configure the Git repo, and create the pipeline jobs again.
Could you please help me avoid the above scenario?
vi jenkins-deployment.yaml
kind: Deployment
apiVersion: apps/v1
metadata:
  name: jenkins-master
  namespace: jenkins
  labels:
    app: jenkins-master
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jenkins-master
  template:
    metadata:
      labels:
        app: jenkins-master
    spec:
      securityContext:
        fsGroup: 1000
      containers:
      - name: jenkins
        image: jenkins/jenkins:lts
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
        - containerPort: 50000
        readinessProbe:
          httpGet:
            path: /login
            port: 8080
          initialDelaySeconds: 300
          periodSeconds: 10
          timeoutSeconds: 5
          successThreshold: 2
          failureThreshold: 5
        volumeMounts:
        - mountPath: "/var"
          name: jenkins-home
          subPath: jenkins_home
        resources:
          limits:
            cpu: 800m
            memory: 3Gi
          requests:
            cpu: 100m
            memory: 3Gi
      volumes:
      - name: jenkins-home
        persistentVolumeClaim:
          claimName: pvc-jenkins-home
vi jenkins-pvc.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-jenkins-home
  namespace: jenkins
spec:
  storageClassName: efs
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Mi
kubectl get pvc -n jenkins
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pvc-jenkins-home Bound pvc-4ccf3f55-6894-4fee-88d7-58dd7584b837 10Mi RWO efs 59m
Please let me know if any further details are required from my side.
Please remove the subPath from volumeMounts, as the subPath mount hides everything under the /var directory. It should look like this:
volumeMounts:
- mountPath: /var
  name: jenkins-home
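An alternative worth considering (my suggestion, not part of the answer above): mount the claim directly at Jenkins' home directory. The jenkins/jenkins:lts image keeps all jobs, plugins and configuration under /var/jenkins_home, so persisting just that path survives pod restarts without putting the rest of /var on the EFS volume:
volumeMounts:
- mountPath: /var/jenkins_home
  name: jenkins-home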

Issue with Jenkins Deployment File: Unknown resource kind: Deployment

I'm struggling to figure out what the solution might be, so I thought I'd ask here. I'm trying to use the code below to deploy a Jenkins pod to Kubernetes, but it fails with an Unknown resource kind: Deployment error:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: jenkins-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jenkins
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      containers:
      - name: jenkins
        image: jenkins/jenkins:lts-alpine
        ports:
        - name: http-port
          containerPort: 8080
        volumeMounts:
        - name: jenkins-home
          mountPath: /var/jenkins_home
      volumes:
      - name: jenkins-home
        emptyDir: {}
The output of kubectl api-versions is:
admissionregistration.k8s.io/v1
admissionregistration.k8s.io/v1beta1
apiextensions.k8s.io/v1
apiextensions.k8s.io/v1beta1
apiregistration.k8s.io/v1
apiregistration.k8s.io/v1beta1
apps/v1
authentication.k8s.io/v1
authentication.k8s.io/v1beta1
authorization.k8s.io/v1
authorization.k8s.io/v1beta1
autoscaling/v1
autoscaling/v2beta1
autoscaling/v2beta2
batch/v1
batch/v1beta1
certificates.k8s.io/v1beta1
coordination.k8s.io/v1
coordination.k8s.io/v1beta1
discovery.k8s.io/v1beta1
events.k8s.io/v1beta1
extensions/v1beta1
networking.k8s.io/v1
networking.k8s.io/v1beta1
node.k8s.io/v1beta1
policy/v1beta1
rbac.authorization.k8s.io/v1
rbac.authorization.k8s.io/v1beta1
scheduling.k8s.io/v1
scheduling.k8s.io/v1beta1
storage.k8s.io/v1
storage.k8s.io/v1beta1
v1
Does anyone know what the problem might be?
If this is an indentation issue, I'm failing to see it.
It seems the apiVersion you are using is deprecated. You can simply convert the manifest to the current apiVersion and apply it.
$ kubectl convert -f jenkins-dep.yml
kubectl convert is DEPRECATED and will be removed in a future version.
In order to convert, kubectl apply the object to the cluster, then kubectl get at the desired version.
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: jenkins
  name: jenkins-deployment
spec:
  progressDeadlineSeconds: 2147483647
  replicas: 1
  revisionHistoryLimit: 2147483647
  selector:
    matchLabels:
      app: jenkins
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: jenkins
    spec:
      containers:
      - image: jenkins/jenkins:lts-alpine
        imagePullPolicy: IfNotPresent
        name: jenkins
        ports:
        - containerPort: 8080
          name: http-port
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/jenkins_home
          name: jenkins-home
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - emptyDir: {}
        name: jenkins-home
status: {}
$ kubectl convert -f jenkins-dep.yml -oyaml > jenkins-dep-latest.yml
Alternatively, change the apiVersion from extensions/v1beta1 to apps/v1, and use kubectl version to check that the kubectl client and the kube-apiserver versions match and are not too old.
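For reference, a couple of plain kubectl commands (nothing specific to this cluster) that confirm whether the apps/v1 Deployment API is actually served before you change the manifest:
kubectl version                          # client and server versions
kubectl api-versions | grep '^apps/'     # should list apps/v1
kubectl api-resources --api-group=apps   # should list deployments under apps/v1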

Jenkins container persistence on Kubernetes cluster - PersistentVolumeClaim (VMware/Vsphere)

I'm trying to persist my Jenkins jobs on vSphere storage so they survive when I delete the deployments/services.
I've tried the standard approach: I created a StorageClass, then a PersistentVolumeClaim that is referenced in the .yml file that creates the deployments.
storage-class.yml:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mystorage
provisioner: kubernetes.io/vsphere-volume
parameters:
  diskformat: zeroedthick
persistent-volume-claim.yml:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc0003
spec:
  storageClassName: mystorage
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 15Gi
jenkins.yml:
---
apiVersion: v1
kind: Service
metadata:
  name: jenkins-auto-ci
  labels:
    app: jenkins-auto-ci
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: jenkins-auto-ci
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: jenkins-auto-ci
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: jenkins-auto-ci
    spec:
      containers:
      - name: jenkins-auto-ci
        image: jenkins
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        env:
        - name: GET_HOSTS_FROM
          value: dns
        ports:
        - name: http-port
          containerPort: 80
        - name: jnlp-port
          containerPort: 50000
        volumeMounts:
        - name: jenkins-home
          mountPath: "/var"
      volumes:
      - name: jenkins-home
        persistentVolumeClaim:
          claimName: pvc0003
I expect the jenkins jobs to persist when I delete and recreate the deployments.
You should create a VMDK, which is a Virtual Machine Disk.
You can do that using govc, the vSphere CLI:
govc datastore.disk.create -ds datastore1 -size 2G volumes/myDisk.vmdk
Or using the ESXi CLI, by SSHing into the host as root and executing:
vmkfstools -c 2G /vmfs/volumes/datastore1/volumes/myDisk.vmdk
Once this is done, create your PV; let's call it vsphere_pv.yaml. It might look like the following:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0001
spec:
  capacity:
    storage: 2Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  vsphereVolume:
    volumePath: "[datastore1] volumes/myDisk"
    fsType: ext4
The datastore1 in this example was created in the root folder of vCenter; if yours is in a different location, you need to change the volumePath. If it is located in a DatastoreCluster, set volumePath to "[DatastoreCluster/datastore1] volumes/myDisk".
Apply the YAML to Kubernetes with kubectl apply -f vsphere_pv.yaml.
You can check that it was created by describing it: kubectl describe pv pv0001.
Now you need a PVC, let's call it vsphere_pvc.yaml, to consume the PV:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc0001
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
Apply the YAML to Kubernetes with kubectl apply -f vsphere_pvc.yaml.
You can check that it was created by describing it: kubectl describe pvc pvc0001.
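If the claim ends up bound to some dynamically provisioned volume instead of pv0001 (for example because the cluster has a default StorageClass), you can pin it explicitly; spec.volumeName and an empty storageClassName are standard PVC fields, shown here as a sketch:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc0001
spec:
  volumeName: pv0001          # bind to the statically created PV above
  storageClassName: ""        # opt out of dynamic provisioning for this claim
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi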
Once this is done, your YAML might look like the following:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: jenkins-auto-ci
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: jenkins-auto-ci
    spec:
      containers:
      - name: jenkins-auto-ci
        image: jenkins
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        env:
        - name: GET_HOSTS_FROM
          value: dns
        ports:
        - name: http-port
          containerPort: 80
        - name: jnlp-port
          containerPort: 50000
        volumeMounts:
        - name: jenkins-home
          mountPath: "/var"
      volumes:
      - name: jenkins-home
        persistentVolumeClaim:
          claimName: pvc0001
All of this is nicely explained in VMware's vsphere-storage-for-kubernetes guide on GitHub.
