Copy files from container to host while inside the container - docker

I'm working on an automation pipeline using Kubernetes and Jenkins. All my commands run from inside the jnlp-slave container. The jnlp-slave is deployed onto a worker node by Kubernetes, and I have /var/run/docker.sock mounted (-v) on my jnlp-slave so it can run docker commands from inside the container.
Issue:
I'm trying to copy files from inside the jnlp-slave container to the host machine (worker node), but the command below does not copy files to the host machine; it copies them to the destination path inside the container itself:
docker cp <container_id>:/home/jenkins/workspace /home/jenkins/workspace
Clarification:
Since the container is executing the command, files located inside the container are copied to the destination path, which is also inside the container.
Normally, docker commands are executed on the host machine, so docker cp can be used to copy files from container to host and from host to container. But in this case, docker cp is executed from inside the container.
How can I make the container copy files to the host machine without running docker commands on the host? Is there a command the container can run to copy files to the host?
P.S. I've tried mounting a volume on the host, but the files can only be shared from the host to the container and not the other way around. Any help is appreciated, thanks.

As suggested in the comments, you should probably redesign your solution entirely.
But let's summarize what you currently have and try to figure out what you can do with it without making your solution even more complicated.
What I did was copy the files from jnlp-slave container to the other containers.
Copying files from one container to all the others is a bit of an overkill (by the way, how many of them do you place in one Pod?).
Maybe your containers don't have to be deployed within the same Pod? If for some reason this is currently impossible, maybe at least the content of the /home/jenkins/workspace directory shouldn't be an integral part of your docker image, as it is now? If this is also impossible, you have no other remedy than to somehow copy those files from the original container based on that image to a shared location that is also available to the other containers existing within the same Pod.
emptyDir, mentioned in the comments, might be an option that allows you to achieve it. However, keep in mind that the data stored in an emptyDir volume is not persistent in any way and is deleted along with the deletion of the Pod. If you're ok with that, it may be a solution for you.
If your data is originally part of your image, it first needs to be transferred to your emptyDir volume, which by its very definition is initially empty. Simply mounting it under /home/jenkins/workspace won't make the data that was originally available in this directory in your container automagically appear in the emptyDir volume. From the moment you mount your emptyDir under /home/jenkins/workspace, the directory will contain the content of the emptyDir, i.e. nothing.
Therefore you need to pre-populate it somehow, and one of the available solutions is to use an initContainer. As your data is originally an integral part of your docker image, you must use the same image for the initContainer as well.
Your deployment may look similar to the one below:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      initContainers:
      - name: pre-populate-empty-dir
        image: <image-1>
        command: ['sh', '-c', 'cp -a /home/jenkins/workspace/* /mnt/empty-dir-content/']
        volumeMounts:
        - name: cache-volume
          mountPath: "/mnt/empty-dir-content/"
      containers:
      - name: app-container-1
        image: <image-1>
        ports:
        - containerPort: 8081
        volumeMounts:
        - name: cache-volume
          mountPath: "/home/jenkins/workspace"
      - name: app-container-2
        image: <image-2>
        ports:
        - containerPort: 8082
        volumeMounts:
        - name: cache-volume
          mountPath: "/home/jenkins/workspace"
      - name: app-container-3
        image: <image-3>
        ports:
        - containerPort: 8083
        volumeMounts:
        - name: cache-volume
          mountPath: "/home/jenkins/workspace"
      volumes:
      - name: cache-volume
        emptyDir: {}
Once your data has been copied to an emptyDir volume by the initContainer, it will be available for all the main containers and can be mounted under the same path by each of them.

Related

How to attach a volume to a kubernetes pod container like in docker?

I am new to Kubernetes but familiar with docker.
Docker Use Case
Usually, when I want to persist data I just create a named volume and attach it to the container, and even when I stop it and then start another one with the same image I can see the data persisting.
So this is what I used to do in Docker:
docker volume create nginx-storage
docker run -it --rm -v nginx-storage:/usr/share/nginx/html -p 80:80 nginx:1.14.2
then I:
Create a new html file in /usr/share/nginx/html
Stop container
Run the same docker run command again (will create another container with same volume)
html file exists (which means data persisted in that volume)
Kubernetes Use Case
Usually, when I work with Kubernetes volumes I specify a PVC (PersistentVolumeClaim) and PV (PersistentVolume) using hostPath, which will bind mount a directory or a file from the host machine to the container.
What I want to do is reproduce the same behavior described in the previous example (Docker Use Case), so how can I do that? Is the process of creating volumes in Kubernetes different from Docker? If possible, providing a YAML file would help me understand.
To a first approximation, you can't (portably) do this. Build your content into the image instead.
There are two big practical problems, especially if you're running a production-oriented system on a cloud-hosted Kubernetes:
If you look at the list of PersistentVolume types, very few of them can be used in ReadWriteMany mode. It's very easy to get, say, an AWSElasticBlockStore volume that can only be used on one node at a time, and something like this will probably be the default cluster setup. That means you'll have trouble running multiple pod replicas serving the same (static) data.
Once you do get a volume, it's very hard to edit its contents. Consider the aforementioned EBS volume: you can't edit it without being logged into the node on which it's mounted, which means finding the node, convincing your security team that you can have root access over your entire cluster, enabling remote logins, and then editing the file. That's not something that's actually possible in most non-developer Kubernetes setups.
The thing you should do instead is build your static content into a custom image. An image registry of some sort is all but required to run Kubernetes and you can push this static content server into the same registry as your application code.
FROM nginx:1.14.2
COPY . /usr/share/nginx/html
# Base image has a working CMD, no need to repeat it
Then in your deployment spec, set image: registry.example.com/nginx-frontend:20220209 or whatever you've chosen to name this build of this image, and do not use volumes at all. You'd deploy this the same way you deploy other parts of your application; you could use Helm or Kustomize to simplify the update process.
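A minimal sketch of such a deployment, reusing the placeholder image name from the paragraph above (the deployment name and the app label are illustrative, not taken from the question):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-frontend
spec:
  replicas: 3          # multiple replicas are fine because no ReadWriteOnce volume is involved
  selector:
    matchLabels:
      app: nginx-frontend
  template:
    metadata:
      labels:
        app: nginx-frontend
    spec:
      containers:
      - name: nginx
        # The static content is baked into the image, so no volumes or volumeMounts are needed
        image: registry.example.com/nginx-frontend:20220209
        ports:
        - containerPort: 80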
Correspondingly, in the plain-Docker case, I'd avoid volumes here. You don't discuss how files get into the nginx-storage named volume; if you're using imperative commands like docker cp or debugging tools like docker exec, those approaches are hard to script and are intrinsically local to the system they're running on. It's not easy to copy a Docker volume from one place to another. Images, though, can be pushed and pulled through a registry.
I managed to do that by creating a PVC only. This is how I did it (with an Nginx image):
nginx-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nginx-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
nginx-deployment.yaml
# Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template: # template for the pods
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        volumeMounts:
        - mountPath: /usr/share/nginx/html
          name: nginx-data
      volumes:
      - name: nginx-data
        persistentVolumeClaim:
          claimName: nginx-data
      restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: nginx
  ports:
  - name: http
    port: 80
    nodePort: 30080
  type: NodePort
Once I run kubectl apply on the PVC and then on the deployment, going to localhost:30080 shows a 404 Not Found page, which means that all data in /usr/share/nginx/html was deleted once the container started. That's because a directory from the k8s cluster node is bind mounted into that container as a volume:
/usr/share/nginx/html <-- dir in volume
/var/lib/k8s-pvs/nginx2-data/pvc-9ba811b0-e6b6-4564-b6c9-4a32d04b974f <-- dir from node (was automatically created)
I tried adding a new index.html file into that container's html dir, then deleted the container. A new container was created by the pod, and checking localhost:30080 worked with the newly created home page.
I tried deleting the deployment and reapplying it (without deleting the PVC), checked localhost:30080, and everything still persists.
An alternative solution, specified in the comments by larsks: kubernetes.io/docs/tasks/configure-pod-container/…

How to merge data from docker image with kubernetes volume (NFS)

I have a docker image that contains data in the directory /opt/myfiles, let's say the following:
/opt/myfiles/file1.txt
/opt/myfiles/file2.dat
I want to deploy that image to kubernetes and mount an NFS volume to that directory so that the changes to these files are persisted when I delete the pod.
When I do it in docker swarm, I simply mount an empty NFS volume to /opt/myfiles/. When my docker swarm service is started, the volume is populated with the files from the image and I can then work with my service. When I delete the service, I still have the files on my NFS server, so on the next start of the service I have my previous state back.
In kubernetes, when I mount an empty NFS volume to /opt/myfiles/, the pod is started and /opt/myfiles/ is overwritten with an empty directory, so my pod does not see the files from the image anymore.
My volume mount and volume definition:
[...]
volumeMounts:
  - name: myvol
    mountPath: /opt/myfiles
[...]
volumes:
  - name: myvol
    nfs:
      server: nfs-server.mydomain.org
      path: /srv/shares/myfiles
I read some threads about similar problems (for example K8s doesn't mount files on Persistent Volume) and tried some stuff using subPath and subPathExpr as in the documentation (https://kubernetes.io/docs/concepts/storage/volumes/#using-subpath), but none of my changes does what docker swarm does by default.
The behaviour of kubernetes seems strange to me, as I have worked with docker swarm for quite a while now and am familiar with the way docker swarm handles this. But I am sure that there is a reason why kubernetes handles it differently and that there are some possibilities to get what I need.
So please, can someone have a look at my problem and help me find a way to get the following behaviour?
I have files in my image in some directory
I want to mount an NFS volume to that directory, because I need to have the data persisted, when for example my pod crashes and moves to another host or when my pod is temporarily shut down for whatever reason.
When my pod starts, I want the volume to be populated with the files from the image
And of course I would be really happy if someone could explain to me why kubernetes and docker swarm behave so differently.
Thanks a lot in advance
This can usually be achieved in Kubernetes with init containers.
Let's say we have an image with files stored in the /mypath folder. If you are able to reconfigure the container to use a different path (like /persistent), you can use an init container to copy the files from /mypath to /persistent on the pod's startup.
containers:
  - name: myapp-container
    image: myimage
    env:
      - name: path
        value: /persistent
    volumeMounts:
      - name: myvolume
        mountPath: /persistent
initContainers:
  - name: copy-files
    image: myimage
    volumeMounts:
      - name: myvolume
        mountPath: /persistent
    command: ['sh', '-c', 'cp /mypath/*.* /persistent']
In this case you have a main container, myapp-container, using files from the /persistent folder on the NFS volume, and each time the pod starts the files from /mypath will be copied into that folder by the init container copy-files.
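Note that the snippet above assumes a matching volumes entry for myvolume. A minimal sketch of what it could look like, reusing the NFS server and path from the question:
volumes:
  - name: myvolume
    nfs:
      server: nfs-server.mydomain.org
      path: /srv/shares/myfiles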

Kubernetes mountPropagation clarifications

I have a docker image A that contains a folder I need to share with another container B in the same K8s pod.
At first I decided to use a shared volume (emptyDir) and launched A as an init container to copy all the content of the folder into the shared volume. This works fine.
Then looking at k8s doc I realised I could use mountPropagation between the containers.
So I changed the initContainer to a plain container (sidecar) in the same pod and performed a mount of the container A folder I want to share with container B. This works fine, but I need to keep container A running with a wait loop. Or not...
Then I decided to come back to the initContainer pattern and do the same thing, meaning mount the folder from A inside the shared volume; the container then finishes because it is an initContainer, and I use the newly mounted folder in container B. And it works!
So my question is: can someone explain whether this is expected on all Kubernetes clusters, and why the folder mounted from A, which is no longer running as a container, can still be seen by my other container?
Here is a simple manifest to demonstrate it.
apiVersion: v1
kind: Pod
metadata:
  name: testvol
spec:
  initContainers:
  - name: busybox-init
    image: busybox
    securityContext:
      privileged: true
    command: ["/bin/sh"]
    args: ["-c", "mkdir -p /opt/connectors; echo \"bar\" > /opt/connectors/foo.txt; mkdir -p /opt/connectors_new; mount --bind /opt/connectors /opt/connectors_new; echo connectors mount is ok"]
    volumeMounts:
    - name: connectors
      mountPath: /opt/connectors_new
      mountPropagation: Bidirectional
  containers:
  - name: busybox
    image: busybox
    command: ["/bin/sh"]
    args: ["-c", "cat /opt/connectors/foo.txt; trap : TERM INT; (while true; do sleep 1000; done) & wait"]
    volumeMounts:
    - name: connectors
      mountPath: /opt/connectors
      mountPropagation: HostToContainer
  volumes:
  - name: connectors
    emptyDir: {}
This works because your containers run in a pod. The pod is where your volume is defined, not the container. So you are creating a volume in your pod that is an empty directory. Then you are mounting it in your init container and making changes. That makes changes to the volume on the pod.
Then when your init container finishes, the files at the pod level don't go away, they are still there, so your second container picks up the files when it mounts the same volume from the pod.
This is expected behavior and doesn't need mountPropagation fields at all. The mountPropagation fields may have some effect on emptyDir volumes, but it is not related to preserving the files:
https://kubernetes.io/docs/concepts/storage/volumes/#emptydir
All containers in the Pod can read and write the same files in the emptyDir volume, though that volume can be mounted at the same or different paths in each container. When a Pod is removed from a node for any reason, the data in the emptyDir is deleted permanently.
Note: A container crashing does not remove a Pod from a node. The data in an emptyDir volume is safe across container crashes.
The note here doesn't explicitly state it, but this implies it is also safe across initContainer to Container transitions. As long as your pod exists on the node, your data will be there in the volume.
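As a sketch of that point, the manifest from the question can be reduced to a plain copy into the shared emptyDir, with no privileged securityContext, no bind mount and no mountPropagation fields at all. The pod name testvol-simple is made up for illustration; the paths and foo.txt file are reused from the manifest above:
apiVersion: v1
kind: Pod
metadata:
  name: testvol-simple
spec:
  initContainers:
  - name: busybox-init
    image: busybox
    # Write directly into the shared emptyDir; the file survives the init container exiting
    command: ["/bin/sh", "-c", "echo bar > /opt/connectors/foo.txt"]
    volumeMounts:
    - name: connectors
      mountPath: /opt/connectors
  containers:
  - name: busybox
    image: busybox
    # The main container sees the file written by the init container
    command: ["/bin/sh", "-c", "cat /opt/connectors/foo.txt; sleep 3600"]
    volumeMounts:
    - name: connectors
      mountPath: /opt/connectors
  volumes:
  - name: connectors
    emptyDir: {}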

Is there any better way to change the source code of a container instead of creating a new image?

What is the best way to change the source code of my application running as a Kubernetes pod without creating a new version of the image, so I can avoid the time taken for pushing and pulling the image from the repository?
You may enter the container using bash, if it is installed in the image, and modify it using:
docker exec -it <CONTAINERID> /bin/bash
However, this isn't an advisable solution. If your modifications succeed, you should update the Dockerfile accordingly, or else you risk losing your work and the ability to share it with others.
Have the container pull from git on creation?
Set up CI/CD?
Another way to achieve a similar result is to leave the application source outside of the container and mount the application source folder in the container.
This is especially useful when developing web applications in environments such as PHP: your container is set up with your Apache/PHP stack and /var/www/html is configured to mount your local filesystem.
If you are using minikube, it already mounts a host folder within the minikube VM. You can find the exact paths mounted, depending on your setup, here:
https://kubernetes.io/docs/getting-started-guides/minikube/#mounted-host-folders
Putting it all together, this is what an nginx deployment would look like on kubernetes, mounting a local folder containing the web site being displayed:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
        volumeMounts:
        - mountPath: /var/www/html/
          name: sources
          readOnly: true
      volumes:
      - name: sources
        hostPath:
          path: /Users/<username>/<source_folder>
          type: Directory
Finally, we have resolved the issue. We changed our image repository from Docker Hub to AWS ECR in the same region where we are running the Kubernetes cluster. Now it takes much less time to push/pull images.
This is definitely not recommended for production.
But if your intention is local development with kubernetes, take a look at these tools:
Telepresence
Telepresence is an open source tool that lets you run a single service locally, while connecting that service to a remote Kubernetes cluster.
Kubectl warp
Warp is a kubectl plugin that allows you to execute your local code directly in Kubernetes without slow image build process.
The kubectl warp command runs your command inside a container, the same way as kubectl run does, but before executing the command, it synchronizes all your files into the container.
I think creating a new image for each deployment should be treated as the standard process.
A few benefits:
Immutable images: no intervention in the running instance; this ensures the image runs the same in any environment
Rollback: if you encounter issues in the new version, roll back to the previous version
Dependencies: new versions may have new dependencies

How to reboot kubernetes pod and keep the data

I'm now using kubernetes to run the Docker container. I just create the container and I use SSH to connect to my pods.
I need to make some system config changes, so I need to reboot the container, but when I reboot it, it loses all the data in the pod. Kubernetes runs a new pod that looks just like the original Docker image.
So how can I reboot the pod and keep the data in it?
The Kubernetes cluster is offered by Bluemix.
You need to learn more about containers as your question suggests that you are not fully grasping the concepts.
Running SSH in a container is an anti-pattern; a container is not a virtual machine. So remove the SSH server from it.
The fact that you run SSH indicates that you may be running more than one process per container. This is usually bad practice, so remove that supervisor and call your main process directly in your entrypoint.
Set up your container image's main process to use environment variables or configuration files for configuration at runtime.
The last item means that you can define environment variables in your Pod manifest or use Kubernetes ConfigMaps to store configuration files. Your Pod will read those and the process in your container will get configured properly. If not, your Pod will die or your process will not run properly, and you can just edit the environment variable or the ConfigMap.
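For illustration, a minimal Pod sketch showing both approaches. The ConfigMap name app-config, its key and the environment variable names are made up for this example; the image is the one from the Bluemix command further down:
apiVersion: v1
kind: Pod
metadata:
  name: myapplication
spec:
  containers:
  - name: myapplication
    image: registry.eu-gb.bluemix.net/<my_namespace>/my_image
    env:
    # A plain environment variable defined directly in the manifest (illustrative name)
    - name: LOG_LEVEL
      value: "info"
    # An environment variable read from a ConfigMap key (illustrative names)
    - name: APP_MODE
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: app-mode
    volumeMounts:
    # The same ConfigMap mounted as configuration files
    - name: config-files
      mountPath: /etc/myapplication
  volumes:
  - name: config-files
    configMap:
      name: app-config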
My main suggestion here is to not use Kubernetes until you have your docker image properly written and your configuration thought through; you should not have to exec into the container to get your process running.
Finally, more generally, you should not keep state inside a container.
To store your data you need to set up persistent storage. If you're using, for example, Google Cloud as your platform, you would need to create a disk to store your data on and define the use of this disk in your manifest.
With Bluemix it looks like you just have to create the volumes and use them.
bx ic volume-create myapplication_volume ext4
bx ic run --volume myapplication_volume:/data --name myapplication registry.eu-gb.bluemix.net/<my_namespace>/my_image
Bluemix - Persistent storage documentation
I don't use Bluemix myself, so I'll proceed with an example manifest using Google's persistent disks.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: myapplication
  namespace: default
spec:
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  selector:
    matchLabels:
      app: myapplication
  template:
    metadata:
      labels:
        app: myapplication
    spec:
      containers:
      - name: myapplication
        image: eu.gcr.io/myproject/myimage:latest
        imagePullPolicy: Always
        ports:
        - containerPort: 80
        - containerPort: 443
        volumeMounts:
        - mountPath: /data
          name: myapplication-volume
      volumes:
      - name: myapplication-volume
        gcePersistentDisk:
          pdName: mydisk-1
          fsType: ext4
Here the disk mydisk-1 is mapped to the /data mountpoint.
The only data that will persist after reboots will be inside that folder.
If you want to store your logs, for example, you could symlink the logs folder.
/var/log/someapplication -> /data/log/someapplication
It works, but this is NOT recommended!
It's not clear to me whether you're SSHing into the nodes or using some tool to execute a shell inside the containers. Even though running multiple processes per container is bad practice, it seems to work very well if you keep tabs on memory and CPU use.
Running an SSH server and cron jobs in the same container, for example, will absolutely work, though it's not the best of solutions.
We've been using supervisor with multiple (2-5) processes in production for over a year now and it's working surprisingly well.
For more information about persistent volumes on a variety of platforms, see:
https://kubernetes.io/docs/concepts/storage/persistent-volumes/
