Access Kubernetes pod's log files from inside the pod? - docker

I'm currently migrating a legacy server to Kubernetes, and I found that kubectl or dashboard only shows the latest log file, not the older versions. In order to access the old files, I have to ssh to the node machine and search for it.
In addition to being a hassle, my team wants to restrict access to the node machines themselves, because they will be running pods from many different teams and unrestricted access could be a security issue.
So my question is: can I configure Kubernetes (or a Docker image) so that these old (rotated) log files are stored in some directory accessible from inside the pod itself?
Of course, in a pinch, I could probably just execute something like run_server.sh | tee /var/log/my-own.log when the pod starts... but then, to do it correctly, I'll have to add the whole logfile rotation functionality, basically duplicating what Kubernetes is already doing.

So there are a couple of ways to and scenarios for this. If you are just interested in the log of the same pod from before last restart, you can use the --previous flag to look at logs:
kubectl logs -f <pod-name-xyz> --previous
But since in your case, you are interested in looking at log files beyond one rotation, here is how you can do it. Add a sidecar container to your application container:
volumeMounts:
- name: varlog
mountPath: /tmp/logs
- name: log-helper
image: busybox
args: [/bin/sh, -c, 'tail -n+1 -f /var/log/*.log']
volumeMounts:
- name: varlog
mountPath: /tmp/logs
volumes:
- name: varlog
hpostPath: /var/log
This will allow the directory which has all logs from /var/log directory from host to /tmp/log inside the container and the command will ensure that content of all files is flushed. Now you can run:
kubectl logs <pod-name-abc> -c count-log-1
This solution does away with SSH access, but still needs access to kubectl and adding a sidecar container. I still think this is a bad solution and you consider of one of the options from the cluster level logging architecture documentation of Kubernetes such as 1 or 2

Related

How to merge data from docker image with kubernetes volume (NFS)

I have a docker image that contains data in directors /opt/myfiles, Lets say the following:
/opt/myfiles/file1.txt
/opt/myfiles/file2.dat
I want to deploy that image to kubernetes and mount an NFS volume to that directory so that the changes to these files are persisted when I delete the pod.
When I do it in docker swarm I simply mount an empty NFS volume to /opt/myfiles/ and then my docker swarm service is started, the volume is populated with the files from the image and I can then work with my service and when I delete the service, I still have the files on my NFS server, so on next start of the service, I have my previous state back.
In kubernetes, when I mount an empty NFS volume to /opt/myfiles/, the pod is started and /opt/myfiles/ is overwritten with an empty directory, so my pod does not see the files from the image anymore.
My volme mount and volume definition:
[...]
volumeMounts:
- name: myvol
mountPath: /opt/myfiles
[...]
volumes:
- name: myvol
nfs:
server: nfs-server.mydomain.org
path: /srv/shares/myfiles
I read some threads about similar problems (for example K8s doesn't mount files on Persistent Volume) and tried some stuff using subPath and subPathExpr as in the documentation (https://kubernetes.io/docs/concepts/storage/volumes/#using-subpath) but none of my changes does, what docker swarm does by default.
The behaviour of kubernetes seems strange to my as I have worked with docker swarm for quite a while now and I am familiar with the way docker swarm handles that. But I am sure that there is a reason why kubernetes handles that in another way and that there are some possibilities to get, what I need.
So please, can someone have a look at my problem and help me find a way to get the following behaviour?
I have files in my image in some directory
I want to mount an NFS volume to that directory, because I need to have the data persisted, when for example my pod crashes and moves to another host or when my pod is temporarily shut down for whatever reason.
When my pod starts, I want the volume to be populated with the files from the image
And of course I would be really happy if someone could explain me, why kubernetes and docker swarm behave so different.
Thanks a lot in advance
This can be usually achieved in Kubernetes with init-containers.
Let's have an image with files stored in /mypath folder. If you are able to reconfigure the container to use a different path (like /persistent) you can use init container to copy files from /mypath to /persistent on pod's startup.
containers:
- name: myapp-container
image: myimage
env:
- name: path
value: /persistent
volumeMounts:
- name: myvolume
path: /persistent
initContainers:
- name: copy-files
image: myimage
volumeMounts:
- name: myvolume
path: /persistent
command: ['sh', '-c', 'cp /mypath/*.* /persistent']
In this case you have a main container myapp-container using files from /persistent folder from the NFS volume and each time when container starts the files from /mypath will be copied into that folder by init container copy-files.

Migrating Docker Compose to Kubernetes Volume Mount Isn't Supported

I am trying to change my existing deployment logic/switch to kubernetes (My server is in gcp and till now I used docker-compose to run my server.) So I decided to start by using kompose and generating services/deployments using my existing docker-compose file. After running
kompose --file docker-compose.yml convert
#I got warnings indicating Volume mount on the host "mypath" isn't supported - ignoring path on the host
After a little research I decided to use the command below to "fix" the issue
kompose convert --volumes hostPath
And what this command achieved is -> It replaced the persistent volume claims that were generated with the first command to the code below.
volumeMounts:
- mountPath: /path
name: certbot-hostpath0
- mountPath: /somepath
name: certbot-hostpath1
- mountPath: /someotherpath
name: certbot-hostpath2
- hostPath:
path: /path/certbot
name: certbot-hostpath0
- hostPath:
path: /path/cert_challenge
name: certbot-hostpath1
- hostPath:
path: /path/certs
name: certbot-hostpath2
But since I am working in my local machine
kubectl apply -f <output file>
results in The connection to the server localhost:8080 was refused - did you specify the right host or port?
I didn't want to connect my local env with gcp just to generate the necessary files, is this a must? Or can I move this to startup-gcp etc
I feel like I am in the right direction but I need a confirmation that I am not messing something up.
1)I have only one compute engine(VM instance) and lots of data in my prod db. "How do I"/"do I need to" make sure I don't lose any data in db by doing something?
2)In startup-gcp after doing everything else (pruning docker images etc) I had a docker run command that makes use of docker/compose 1.13.0 up -d. How should I change it to switch to kubernetes?
3)Should I change anything in nginx.conf as it referenced to 2 different services in my docker-compose (I don't think I should since same services also exist in kubernetes generated yamls)
You should consider using Persistent Volume Claims (PVCs). If your cluster is managed, in most cases it can automatically create the PersistentVolumes for you.
One way to create the Persistent Volume Claims corresponding to your docker compose files is using Move2Kube(https://github.com/konveyor/move2kube). You can download the release and place it in path and run :
move2kube translate -s <path to your docker compose files>
It will then interactively allow you configure PVCs.
If you had a specific cluster you are targeting, you can get the specific storage classes supported by that cluster using the below command in a terminal where you have set your kubernetes cluster as context for kubectl.
move2kube collect
Once you do collect, you will have a m2k_collect folder, which you can then place it in the folder where your docker compose files are. And when you run move2kube translate it will automatically ask you whether to target this specific cluster, and also option to choose the storage class from that cluster.
1)I have only one compute engine(VM instance) and lots of data in my
prod db. "How do I"/"do I need to" make sure I don't lose any data in
db by doing something?
Once the PVC is provisioned you can copy the data to the PVC by using kubectl cp command into a pod where the pvc is mounted.
2)In startup-gcp after doing everything else (pruning docker images
etc) I had a docker run command that makes use of docker/compose
1.13.0 up -d. How should I change it to switch to kubernetes?
You can potentially change it to use helm chart. Move2Kube, during the interactive session, can help you create helm chart too. Once you have the helm chart, all you have to do is "helm upgrade -i
3)Should I change anything in nginx.conf as it referenced to 2
different services in my docker-compose (I don't think I should since
same services also exist in kubernetes generated yamls)
If the services names are name in most cases it should work.

How do I install my plugins to the app's filesystem before the app container starts?

I've an app, packaged as docker image. The app has some default plugins, installed in /opt/myapp/plugins/. I want to install some additional plugins, which basically means copying the plugins to the app's aforementioned path:
cp /path/to/more-plugins/* /opt/myapp/plugins/
How do I do that? Is initContainers helpful in such cases? Does init-container have access to the filesystem of the app-container, so that the above command can be executed? I tried to use busybox in initContainers and ran ls -lR /opt/myapp just to see if such a path even exists. But it seems, init container doesn't have access to the app-filesystem.
So what are the different solutions to this problem? And what is the best one?
I'm not a container expert, but my understanding is that the best way to do this is to create a new container, based on your current docker image, that copies the files into place. Your image, with your plugins and configs, is what k8s loads and manages.
FROM my/oldimage:1.7.1
COPY more-plugins/* /opt/myapp/plugins/
$ docker build -t my/newimage:1.7.1 .
So what are the different solutions to this problem? And what is the
best one?
The best solution has been already provided by #PaulProgrammer and as long as there are no contraindications you should make your app plugins an integral part of your docker image. The reasoning in favour of such solutions was already well explained by #Matt:
I would only do the copy in a container image build though. It means
you always have a complete, reproducible artefact of what is running.
– Matt
I fully agree with it and I would also strongly recommend you rebuilding your docker image rather that solve this issue on kubernetes level.
However to make this thread complete and provide you with answers to your additional questions and clear all possible doubts I'd like to add a few words to what was already written.
I'd like to emphasize again that this is not an optimal solution but from technical perspective it is also possible to do it with the help of init containers in kubernetes. Furthermore I will even try to explain why it is not the best solution or more specifically why it is not applicable in your particular usecase.
How do I do that? Is initContainers helpful in such cases? Does
init-container have access to the filesystem of the app-container, so
that the above command can be executed? I tried to use busybox in
initContainers and ran ls -lR /opt/myapp just to see if such a path
even exists. But it seems, init container doesn't have access to the
app-filesystem.
Keep in mind that init containers are always run before the app containers are started and they cannot modify the original filesystem of the app container as when they run it doesn't even exist yet. However they can be used to pre-populate the additional volume that subsequently can be mounted in the app container. Look at the example below:
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-app
labels:
app: my-app
spec:
replicas: 1
selector:
matchLabels:
app: debian
template:
metadata:
labels:
app: debian
spec:
containers:
- name: debian
image: debian
command: ['sh', '-c', 'sleep 3600']
volumeMounts:
- mountPath: "/app/data"
name: my-volume
readOnly: true
volumes:
- name: my-volume
persistentVolumeClaim:
claimName: example-pvc
initContainers:
- name: init-myservice
image: busybox
command: ['sh', '-c', 'echo "Content of my file" > /mnt/my_file']
volumeMounts:
- mountPath: "/mnt"
name: my-volume
In this example we populate the /app/data directory of the app container with some content with the help of the init container. The command isn't really important here. It can be anything from getting some content using wget to cloning git repo or scp some content from the remote server. Note that you cannot add this way an additional content to the already existing one. If the directory have already existed and had already some content in it, it is wiped out and populated with the new one. Actually it isn't from technical point of view. The directory is used as a new mounpoint for another volume. As it points now to another disk and its content, the original directory content isn't available any more from our perspective.
So what is the typical usecase ? What can it be useful for ?
If your aplication needs some data to process, you typically don't want to make it an integral part of your docker image. Imagine that it changes quite often and you always need to download its most recent version. This way you don't have to worry about rebuilding your images each time.
But this is not applicable in your case. First, because you need to add some data (additional plugins) to something that is already an integral part of your app container image. In this situation init containers won't be very helpful. Second, you rather want to have consistent, reproducible image, containing all the required plugins.
As I said, technically it is feasible as you can upload somewhere all the plugins you need (including the original ones that are already present in the base image) and copy the whole content using an init container. The advantage of such approach is that it doesn't require from you building custom images when the new version of your base app is released. You only change the image tag and init container will take care of populating the /opt/myapp/plugins/ directory with all the required plugins you want and ensure it doesn't contain anything else apart from them.
I hope that it brought something new to this thread and clarified a bit in what situations it is worth using an init container to populate your app container with data and in what situations it is better to stick to your custom docker image.

Is there any better way for changing the source code of a container instead of creating a new image?

What is the best way to change the source code of my application running as Kubernetes pod without creating a new version of image so I can avoid time taken for pushing and pulling image from repository?
You may enter the container using bash if it installed on the image and modify it using -
docker exec -it <CONTAINERID> /bin/bash
However, this isn’t advisable solution. If your modifications succeed, you should update the Dockerfile accordingly or else you risk losing your work and ability to share it with others.
Have the container pull from git on creation?
Setup CI/CD?
Another way to achieve a similar result is to leave the application source outside of the container and mount the application source folder in the container.
This is especially useful when developing web applications in environments such as PHP: your container is setup with your Apache/PHP stack and /var/www/html is configured to mount your local filesystem.
If you are using minikube, it already mounts a host folder within the minikube VM. You can find the exact paths mounted, depending on your setup, here:
https://kubernetes.io/docs/getting-started-guides/minikube/#mounted-host-folders
Putting it all together, this is what a nginx deployment would look like on kubernetes, mounting a local folder containing the web site being displayed:
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
spec:
selector:
matchLabels:
app: nginx
replicas: 1
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.7.9
ports:
- containerPort: 80
volumeMounts:
- mountPath: /var/www/html/
name: sources
readOnly: true
volumes:
- name: sources
hostPath:
path: /Users/<username>/<source_folder>
type: Directory
Finally we have resolved the issue. Here, we changed our image repository from docker hub to aws ecr in the same region where we are running kubernetes cluster. Now, it is taking very lesstime for pushing/pulling images.
This is definitely not recommended for production.
But if your intention is local development with kubernetes, take a look at these tools:
Telepresence
Telepresence is an open source tool that lets you run a single service
locally, while connecting that service to a remote Kubernetes cluster.
Kubectl warp
Warp is a kubectl plugin that allows you to execute your local code
directly in Kubernetes without slow image build process.
The kubectl warp command runs your command inside a container, the same
way as kubectl run does, but before executing the command, it
synchronizes all your files into the container.
I think it should be taken as process to create new images for each deployment.
Few benefits:
immutable images: no intervention in running instance this will ensure image run in any environment
rollback: if you encounter issues in new version, rollback to previous version
dependencies: new versions may have new dependencies

How to reboot kubernetes pod and keep the data

I'm now using kubernetes to run the Docker container.I just create the container and i use SSH connect to my pods.
I need to do some system config change so i need to reboot the container but when i`reboot the container it will lose all the data in the pod. kubernetes will run a new pod just like the Docker image original.
So how can i reboot the pod and just keep the data in it?
The kubernetes was offered my Bluemix
You need to learn more about containers as your question suggests that you are not fully grasping the concepts.
Running SSH in a container is an anti-pattern, a container is not a virtual machine. So remove the SSH server from it.
the fact that you run SSH indicates that you may be running more than one process per container. This is usually bad practice. So remove that supervisor and call your main process directly in your entrypoint.
Setup your container image main process to use environment variables or configuration files for configuration at runtime.
The last item means that you can define environment variables in your Pod manifest or use Kubernetes configmaps to store configuration file. Your Pod will read those and your process in your container will get configured properly. If not your Pod will die or your process will not run properly and you can just edit the environment variable or config map.
My main suggestion here is to not use Kubernetes until you have your docker image properly written and your configuration thought through, you should not have to exec in the container to get your process running.
Finally, more generally, you should not keep state inside a container.
For you to store your data you need to set up persistent storage, if you're using for example Google Cloud as your platform, you would need to create a disk to store your data on and define the use of this disk in your manifest.
With Bluemix it looks like you just have to create the volumes and use them.
bx ic volume-create myapplication_volume ext4
bx ic run --volume myapplication_volume:/data --name myapplication registry.eu-gb.bluemix.net/<my_namespace>/my_image
Bluemix - Persistent storage documentation
I don't use Bluemix myself so i'll proceed with an example manifest using Google's persistent disks.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: myapplication
namespace: default
spec:
replicas: 1
strategy:
rollingUpdate:
maxSurge: 1
maxUnavailable: 0
type: RollingUpdate
selector:
matchLabels:
app: myapplication
template:
metadata:
labels:
app: myapplication
spec:
containers:
- name: myapplication
image: eu.gcr.io/myproject/myimage:latest
imagePullPolicy: Always
ports:
- containerPort: 80
- containerPort: 443
volumeMounts:
- mountPath: /data
name: myapplication-volume
volumes:
- name: myapplication-volume
gcePersistentDisk:
pdName: mydisk-1
fsType: ext4
Here the disk mydisk-1 is mapped to the /data mountpoint.
The only data that will persist after reboots will be inside that folder.
If you want to store your logs for example you could symlink the logs folder.
/var/log/someapplication -> /data/log/someapplication
It works, but this is NOT recommended!
It's not clear to me if you're sshing to the nodes or using some tool to execute a shell inside the containers. Even though running multiple processes per container is bad practice it seems to be working very well, if you keep tabs on memory and cpu use.
Running a ssh server and cronjobs in the same container for example will absolutely work though it's not the best of solutions.
We've been using supervisor with multiple (2-5) processses in production for over a year now and it's working surprisingly well.
For more information about persistent volumes in a variety of platforms.
https://kubernetes.io/docs/concepts/storage/persistent-volumes/

Resources