I want to transfer files generated by one of my application's pods to a Kubernetes CronJob pod.
Is there any way to transfer files between two pods in a cluster?
Create a PersistentVolume and mount the same volume into both containers.
You can use a hostPath volume if you don't need the data to persist. Otherwise, you can use a PVC or any type of shared volume to store data that is accessible to both the CronJob pod and the application pod.
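As a minimal sketch of the shared-PVC approach (names, schedule, and the RWX-capable storage class are hypothetical), the application pod and the CronJob both mount the same claim at the same path; the application writes its files under /data and the CronJob reads them:

# Assumes a storage class that supports ReadWriteMany so both pods can mount it at once.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-files
spec:
  accessModes: ["ReadWriteMany"]
  storageClassName: nfs-client        # hypothetical RWX-capable class
  resources:
    requests:
      storage: 1Gi
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: file-consumer
spec:
  schedule: "0 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: consumer
            image: busybox
            command: ["sh", "-c", "ls -l /data"]   # read the files the app pod wrote
            volumeMounts:
            - name: shared
              mountPath: /data
          volumes:
          - name: shared
            persistentVolumeClaim:
              claimName: shared-files

The application pod's spec would reference the same persistentVolumeClaim (claimName: shared-files) under its own volumes and volumeMounts.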
Related
We have created a deployment in the EKS cluster. The underlying pods are supposed to have existing content inside a particular directory, created through the Dockerfile COPY command. But when the pod is created, an external EFS volume is mounted on the same directory path due to application requirements. When we log in to the pod and check the contents, we find that the existing files have been overwritten by the EFS volume contents. We would like to have both sets of files in place once the EFS volume is mounted on the Pod. Please help us achieve this.
I'm trying to run Spark in a Kubernetes cluster as described here: https://spark.apache.org/docs/latest/running-on-kubernetes.html
It works fine for some basic scripts like the provided examples.
I noticed that the config folder, despite being added to the image built by "docker-image-tool.sh", is overwritten by a mount of a ConfigMap volume.
I have two questions:
What sources does Spark use to generate that ConfigMap, and how do you edit it? As far as I understand, the volume gets deleted when the last pod is deleted and regenerated when a new pod is created.
How are you supposed to handle the spark-env.sh script, which can't be added to a simple ConfigMap?
One initially non-obvious thing about Kubernetes is that changing a ConfigMap (a set of configuration values) is not detected as a change to the Deployments (how a Pod, or set of Pods, should be deployed onto the cluster) or Pods that reference that configuration. That can result in unintentionally stale configuration persisting until the Pod spec changes. This could include freshly created Pods due to an autoscaling event, or even restarts after a crash, resulting in misconfiguration and unexpected behaviour across the cluster.
Note: This doesn’t impact ConfigMaps mounted as volumes, which are periodically synced by the kubelet running on each node.
To update the ConfigMap, execute:
$ kubectl replace -f file.yaml
You must create a ConfigMap before you can use it, so I recommend first modifying the ConfigMap and then redeploying the pod.
Note that a container using a ConfigMap as a subPath volume mount will not receive ConfigMap updates.
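For example (the Deployment name is hypothetical), after editing the ConfigMap manifest you can apply it and then force the Deployment to recreate its Pods, so that environment variables and subPath mounts pick up the new values:

$ kubectl replace -f file.yaml
$ kubectl rollout restart deployment/my-app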
The configMap resource provides a way to inject configuration data into Pods. The data stored in a ConfigMap object can be referenced in a volume of type configMap and then consumed by containerized applications running in a Pod.
When referencing a ConfigMap object, you simply provide its name in the volume. You can also customize the path used for a specific entry in the ConfigMap.
When a ConfigMap already being consumed in a volume is updated, projected keys are eventually updated as well. The kubelet checks whether the mounted ConfigMap is fresh on every periodic sync. However, it uses its local TTL-based cache to get the current value of the ConfigMap. As a result, the total delay from the moment the ConfigMap is updated to the moment new keys are projected to the Pod can be as long as the kubelet sync period (1 minute by default) plus the TTL of the ConfigMap cache (1 minute by default) in the kubelet.
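As a minimal sketch (the ConfigMap name, key, and paths are hypothetical), referencing a ConfigMap by name in a volume and customizing the path of one entry looks like this:

apiVersion: v1
kind: Pod
metadata:
  name: config-demo
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "cat /etc/app/app.properties && sleep 3600"]
    volumeMounts:
    - name: config
      mountPath: /etc/app              # the ConfigMap keys appear as files here
  volumes:
  - name: config
    configMap:
      name: app-config                 # the ConfigMap referenced by name
      items:
      - key: application.properties
        path: app.properties           # custom file name for this entry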
But what I strongly recommend is to use the Kubernetes Operator for Spark. It supports mounting volumes and ConfigMaps into Spark pods to customize them, a feature that is not available in Apache Spark as of version 2.4.
A SparkApplication can specify a Kubernetes ConfigMap storing Spark configuration files such as spark-env.sh or spark-defaults.conf using the optional field .spec.sparkConfigMap, whose value is the name of the ConfigMap. The ConfigMap is assumed to be in the same namespace as the SparkApplication.
Spark on Kubernetes also provides configuration options for mounting certain volume types into the driver and executor pods. Volumes are "delivered" from the Kubernetes side, but they can serve as local storage in Spark. If no volume is set as local storage, Spark uses temporary scratch space to spill data to disk during shuffles and other operations. When using Kubernetes as the resource manager, the pods are created with an emptyDir volume mounted for each directory listed in spark.local.dir or the environment variable SPARK_LOCAL_DIRS. If no directories are explicitly specified, a default directory is created and configured appropriately.
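A rough sketch of that field, assuming the operator's v1beta2 API (the image, jar path, and resource values are purely illustrative; sparkConfigMap is the relevant part):

apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi
spec:
  type: Scala
  mode: cluster
  image: my-registry/spark:2.4.5           # hypothetical image
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.11-2.4.5.jar
  sparkVersion: "2.4.5"
  sparkConfigMap: spark-config             # ConfigMap holding spark-env.sh / spark-defaults.conf
  driver:
    cores: 1
    memory: 512m
    serviceAccount: spark
  executor:
    instances: 1
    cores: 1
    memory: 512m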
Useful blog: spark-kubernetes-operator.
I am trying to set up AKS, where I have used an Azure disk to mount the application's source code. When I run the kubectl describe pods command it shows the disk as mounted, but I don't know how to copy the code onto it.
I got recommendations to use the kubectl cp command, but my pod name changes on every deployment, so please let me know what I should do.
You'd need to copy the files to the disk directly (not to the pod), and you can use your pod or a worker node to do that. You can use kubectl cp to copy the files into the pod and then move them to the mounted disk as you normally would, or you can SSH to the worker node, copy the files over SSH, and put them on the mounted disk.
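Since the pod name changes on every deployment, one common trick (label, namespace, and mount path below are hypothetical) is to look the current name up by label before copying:

$ POD=$(kubectl get pods -l app=my-app -o jsonpath='{.items[0].metadata.name}')
$ kubectl cp ./src "$POD":/mnt/azuredisk/src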
Is there a recommended way of copying files from a pod periodically? I have a pod with only ephemeral storage and no PersistentVolume, so I wanted to periodically copy some log files from the pod's containers to an NFS share. I can run a CronJob and invoke kubectl cp, but I'm wondering if there is a better way of doing this.
I think the better approach in your case is to mount the NFS volume in your Pod and write the logs to it directly: https://kubernetes.io/docs/concepts/storage/volumes/#nfs
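A minimal sketch of that idea (server, export path, image, and mount point are hypothetical): the container writes its logs straight onto the NFS share instead of ephemeral storage.

apiVersion: v1
kind: Pod
metadata:
  name: app-with-nfs-logs
spec:
  containers:
  - name: app
    image: my-app:latest              # hypothetical image
    volumeMounts:
    - name: logs
      mountPath: /var/log/app         # the app writes its logs here
  volumes:
  - name: logs
    nfs:
      server: nfs.example.com
      path: /exports/app-logs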
Run a CronJob as a scheduled job to copy the files to the target location.
I have mounted a hostPath volume in a Kubernetes container. Now I want to mount a ConfigMap file onto the hostPath volume.
Is that possible?
Not really, and a larger question would be why you'd want to do that.
The standard way to add configuration in Kubernetes is using ConfigMaps. They are stored in etcd and the size limit is 1MB. When your pod comes up, the configuration is mounted at a mount point that you specify in the pod spec.
You may want the opposite, which is to use a hostPath that contains some configuration, and that's possible. Say you want some config that is larger than 1MB (which is unusual) and have your pod use it. The gotcha is that you need to place the hostPath directory and its files on every cluster node where your pod may start.
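A sketch of that hostPath approach (names and paths are hypothetical); the directory must exist with the same contents on every node where the pod may be scheduled:

apiVersion: v1
kind: Pod
metadata:
  name: hostpath-config-demo
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "ls /etc/bigconfig && sleep 3600"]
    volumeMounts:
    - name: bigconfig
      mountPath: /etc/bigconfig
      readOnly: true
  volumes:
  - name: bigconfig
    hostPath:
      path: /opt/app-config           # pre-populated on each node
      type: Directory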
No. Volume mounts are all about pushing data into pods or persisting data that originates in a pod; they aren't usually a bidirectional data-transfer mechanism.
If you want to see what's in a ConfigMap, you can always run kubectl get configmap NAME -o yaml to dump it out.
(With some exceptions around things like the Docker socket, hostPath volumes aren't that common in non-Minikube Kubernetes installations, especially once you get into multi-host setups, and I'd investigate other paths to do whatever you're using it for now.)