Upgrade Jenkins within a Kubernetes container without losing my data?

I have a container deployed in a pod by Kubernetes running Jenkins. The container is mounted with a persistent storage volume (AWS Elastic File System) that's currently storing all of the Jenkins instance's users, configuration, job configurations, etc.
I need to update Jenkins. Normally when I do this, the process wipes out the storage, since the whole container gets re-launched. However, I need to figure out how to do this without losing the data.
How do I update Jenkins without losing the info on the storage volume attached to the container?

I ended up finding the answer I needed by first reading this article, which helped me understand the underlying concepts:
http://www.monkeylittle.com/blog/2017/02/08/adding-persistent-volumes-to-jenkins-with-kubernetes-volumes.html
Then I found this article, which shows you how to do it with EFS, specifically:
https://itnext.io/efs-persistent-volumes-on-aws-kubernetes-193e0035bbfb
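To make the idea from those articles concrete, here is a minimal sketch, not the exact setup from either article. It assumes a PersistentVolumeClaim already exists for the EFS volume; the claim name jenkins-efs-claim and the resource names are hypothetical. The claim is mounted at /var/jenkins_home, the Jenkins home directory in the official jenkins/jenkins image, so upgrading only means changing the image tag while the claim stays untouched. The script prints the manifest as JSON, which kubectl accepts as well as YAML.

```python
import json


def jenkins_deployment(claim_name: str, image: str) -> dict:
    """Build a Deployment whose Jenkins home lives on an existing PVC."""
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "jenkins"},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": {"app": "jenkins"}},
            "template": {
                "metadata": {"labels": {"app": "jenkins"}},
                "spec": {
                    "containers": [{
                        "name": "jenkins",
                        # Upgrading Jenkins means changing only this tag.
                        "image": image,
                        "volumeMounts": [{
                            "name": "jenkins-home",
                            "mountPath": "/var/jenkins_home",
                        }],
                    }],
                    "volumes": [{
                        "name": "jenkins-home",
                        # The claim outlives any pod or image version.
                        "persistentVolumeClaim": {"claimName": claim_name},
                    }],
                },
            },
        },
    }


if __name__ == "__main__":
    # "jenkins-efs-claim" is a hypothetical claim name.
    manifest = jenkins_deployment("jenkins-efs-claim", "jenkins/jenkins:lts")
    print(json.dumps(manifest, indent=2))
```

If the script is saved as, say, jenkins_deployment.py, its output could be piped to kubectl apply -f - to roll out a new image against the same claim.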

Related

How to save doccano database to Google Cloud Storage after deploying to Cloud Run?

I deployed a doccano docker container to Cloud Run and I am successfully able to reach the WebApp.
Everything works fine, such as log in, data import and annotation.
Now I would like to connect the container to Google Cloud Storage in order to save all annotations in a bucket. Currently, all data is lost after the container restarts.
Any hints on how to accomplish that are highly appreciated!
What I (kind of) tried:
The container is up and running and some environment variables are set, but I don't know how to set a bucket URI within the doccano Docker container (doccano's documentation is a bit sparse in that regard).
Maybe this can be helpful for anyone with a similar use case:
My solution/workaround for deploying doccano on GCP was to deploy the Docker container to Compute Engine (and open a port to the app) instead of Cloud Run. Cloud Run does indeed seem to be the wrong service for this use case. Compute Engine has persistent storage, which keeps all of the data even if the container has to restart.

Sharing docker volumes within the workplace

I have taken some time to create a useful Docker volume for use at work. It has a restored backup of one of our software databases (SQL Server) on it, and I use it for testing/debug by just attaching it to whatever Linux SQL Container I feel like running at the time.
When I make useful Docker images at work, I share them with our team using either the Azure Container Registry or the AWS Elastic Container Registry. If there's a Dockerfile I've made as part of a solution, I can store that in our Git repo for others to access.
But what about volumes? Is there a way to share these with colleagues so they don't need to go through the process I went through to build the volume in the first place? So if I've got this 'databasevolume' is there a way to source control it? Or share it as a file to other users of Docker within my team? I'm just looking to save them the time of creating a volume, downloading the .bak file from its storage location, restoring it etc.
The short answer is that there is no default Docker functionality to export the contents of a Docker volume, and docker export explicitly does not export the contents of the volumes associated with the container. You can, however, back up, restore or migrate data volumes.
Note: if you're backing up a database, I'd suggest using the appropriate tools for that database.
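As a rough illustration of the backup and restore approach, the sketch below builds two docker run commands: one archives a named volume into a tarball on the host, the other unpacks that tarball into a volume on a colleague's machine. The helper names, the archive file name and the /srv/share directory are hypothetical, and the ubuntu image is used only because it ships with tar. The script prints the commands rather than running them.

```python
import shlex


def backup_cmd(volume: str, backup_dir: str, archive: str = "volume.tar.gz") -> list:
    """Command that archives a named volume into backup_dir on the host.

    backup_dir should be an absolute host path.
    """
    return [
        "docker", "run", "--rm",
        "-v", f"{volume}:/data:ro",      # source volume, mounted read-only
        "-v", f"{backup_dir}:/backup",   # host directory that receives the archive
        "ubuntu", "tar", "czf", f"/backup/{archive}", "-C", "/data", ".",
    ]


def restore_cmd(volume: str, backup_dir: str, archive: str = "volume.tar.gz") -> list:
    """Command that unpacks the archive into a (new or existing) named volume."""
    return [
        "docker", "run", "--rm",
        "-v", f"{volume}:/data",
        "-v", f"{backup_dir}:/backup:ro",
        "ubuntu", "tar", "xzf", f"/backup/{archive}", "-C", "/data",
    ]


if __name__ == "__main__":
    # 'databasevolume' is the volume from the question; /srv/share is hypothetical.
    print(shlex.join(backup_cmd("databasevolume", "/srv/share")))
    print(shlex.join(restore_cmd("databasevolume", "/srv/share")))
```

The resulting tarball is an ordinary file, so it can be shared through whatever file storage the team already uses. Stop the SQL Server container before archiving so the data files are not copied mid-write.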

Manually deleting unused Images on kubernetes (GKE)

I am running a managed kubernetes cluster on Google Cloud Platform with a single node for development.
However, when I update Pod images too frequently, the ImagePull step fails due to insufficient disk space on the boot disk.
According to the documentation, images should be garbage-collected automatically, but I have no idea what the setting is on GKE or how to change it.
https://kubernetes.io/docs/concepts/cluster-administration/kubelet-garbage-collection/#image-collection
Can I manually trigger an unused image cleanup, using kubectl or a Google Cloud console command?
How do I check/change the GC setting above so that I won't encounter this issue in the future?
Since Garbage Collector is an automated service, there are no kubectl commands or any other commands within GCP to manually trigger Garbage Collector.
In regards to your second inquiry, Garbage Collector is handled by the Master node. The Master node is not accessible to users since it is a managed service. As such, users cannot configure Garbage Collection within GKE.
The only workaround I can offer is to create a custom cluster from scratch within Google Compute Engine. This would provide you access to the Master node of your cluster so you can have the flexibility of configuring the cluster to your liking.
Edit: If you need to delete old images, I would suggest removing them using docker commands. I have attached a GitHub article that provides several different commands that you can run at the node level to remove old images here.

keep CDH container running

I am learning CDH and Docker and have no prior experience in setting up either tool. After reading the documentation I managed to run the CDH Docker image on a Mac and also completed the example given in the quick start guide. But when I started my MacBook again the next day to learn something new, I couldn't find my previous work, which I found very strange. I also couldn't see the container running, which seems fine to me.
What I really want is to not lose my work even after stopping the Docker container. Could you please guide me on how to configure Docker so that I will not lose my work even after restarting Docker?
Every instance of a docker run will allocate a new filesystem, essentially starting from scratch.
If you actually want to "save" your work, then you need to volume mount (using the -v docker flag) your local filesystem into the container for at least the following directories:
HDFS Data Directory
NameNode Data Directory
/home/cloudera
I think the hadoop data folders are somewhere under /var/lib/hadoop-*, by default
The better alternative for saving your workloads would be the CDH VM, where it actually has a persistent HDD associated with it.
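As a sketch of the -v approach, the snippet below assembles a docker run command with bind mounts for /home/cloudera and the Hadoop data directory. Several things here are assumptions rather than verified facts about your setup: the cloudera/quickstart image name and /usr/bin/docker-quickstart entry point follow the quick start instructions, /var/lib/hadoop-hdfs is only a guess at where the HDFS and NameNode data live, and the host base directory is hypothetical.

```python
import shlex


def cdh_run_cmd(host_base: str) -> list:
    """Build a docker run command that bind-mounts host directories into CDH."""
    # host directory -> container directory
    mounts = {
        f"{host_base}/home-cloudera": "/home/cloudera",
        # Assumed location of the HDFS/NameNode data; check your image.
        f"{host_base}/hadoop-hdfs": "/var/lib/hadoop-hdfs",
    }
    cmd = ["docker", "run", "--hostname=quickstart.cloudera",
           "--privileged=true", "-t", "-i"]
    for host_dir, container_dir in mounts.items():
        cmd += ["-v", f"{host_dir}:{container_dir}"]
    cmd += ["cloudera/quickstart", "/usr/bin/docker-quickstart"]
    return cmd


if __name__ == "__main__":
    # /Users/me/cdh-data is a hypothetical host directory.
    print(shlex.join(cdh_run_cmd("/Users/me/cdh-data")))
```

One caveat: a bind mount hides whatever the image already has at that path, so an empty host directory mounted over the Hadoop data folder means the services start without their pre-populated data.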

How do I use Docker on cloud or datacenter

I never had the courage to start using Docker, and now I feel like I came from the last century. I want to clear up my doubts about Docker before getting started. My question is mainly about deploying/running Docker images in a cloud or hosting environment.
Can I build a docker image with any type of server (eg. wildfly, payara) and/or database server (eg. mysql, oracle) and will it work on docker enabled cloud/datacenter?
If yes, what about persistent data such as database files and static storage (e.g. images, uploaded documents, logs)? Is it stored in the Docker image or somewhere else? What will happen to those files when I update my application and redeploy a new image?
I read posts about what Docker is but I couldn't find a specific answer. Forgive me for not doing enough googling.
I have run Docker on AWS and other cloud providers. It is really not that hard if you have some experience with system administration and/or DevOps. Regarding cloud hosters and getting started, most providers have some sort of tutorial on how to get started using Docker with their infrastructure:
http://docs.aws.amazon.com/AmazonECS/latest/developerguide/docker-basics.html
https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-linux-dockerextension/
Can I build a docker image with any type of server (eg. wildfly,
payara) and/or database server (eg. mysql, oracle) and will it work on
docker enabled cloud/datacenter?
To get a server up and running, you just need the docker engine installed on the host, there are packages for many distros:
https://docs.docker.com/engine/installation/
After Docker Engine is installed, you can create Dockerfiles for basically any server or service. Hopefully you do not need to in most cases, since there are countless Dockerfiles and pre-configured, vendor-maintained images already available on Docker Hub (I use wildfly, elk-stack, and mysql, for example). Be careful to select images that are actively maintained, otherwise you end up with security issues in your images that might never get fixed, or that you have to fix yourself!
Example images:
https://hub.docker.com/r/jboss/wildfly/
https://hub.docker.com/_/mysql/
https://hub.docker.com/_/oraclelinux/
https://hub.docker.com/u/payara/
If yes, what about persistent data such as database files and static
storage (e.g. images, uploaded documents, logs)? Is it stored in the
Docker image or somewhere else? What will happen to those files when
I update my application and redeploy a new image?
In general, you will want to store persistent data outside the Docker image and mount it into the container as a volume:
https://docs.docker.com/engine/tutorials/dockervolumes/
Some cloud based storage providers might be easier to mount or connect to in other ways, but this volume approach is standard, IMO.
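To show what this looks like across a redeploy, here is a small sketch using the official mysql image, which keeps its data under /var/lib/mysql. The container name, the volume name mysql-data and the image tags are illustrative choices, not something prescribed by Docker. The point is that the old container is removed and a new one started from a newer image, while the named volume is reused unchanged.

```python
import shlex


def mysql_run_cmd(image_tag: str, volume: str = "mysql-data") -> list:
    """docker run command that keeps MySQL's data directory on a named volume."""
    return [
        "docker", "run", "-d", "--name", "mysql",
        # Passes MYSQL_ROOT_PASSWORD through from the host environment,
        # so the secret is not written into the command itself.
        "-e", "MYSQL_ROOT_PASSWORD",
        "-v", f"{volume}:/var/lib/mysql",
        f"mysql:{image_tag}",
    ]


def redeploy_cmds(new_tag: str, volume: str = "mysql-data") -> list:
    """Replace the running container with a new image, reusing the same volume."""
    return [
        ["docker", "stop", "mysql"],
        # Without -v, docker rm leaves named volumes in place.
        ["docker", "rm", "mysql"],
        mysql_run_cmd(new_tag, volume),
    ]


if __name__ == "__main__":
    # The tags below are only examples.
    print(shlex.join(mysql_run_cmd("8.0")))
    for cmd in redeploy_cmds("8.4"):
        print(shlex.join(cmd))
```

The same pattern applies to uploaded files or any other directory the application writes to: give it a volume, and the image becomes replaceable.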
For logfiles, I actually push them to an ELK server, so having a volume for the logs is not necessarily required. However, since the ELK server is also a docker image, it does have a volume where the data is persisted.
So you have:
documentation from your cloud hoster (or docker themselves)
a host in your cloud running docker engine
0..n images that you can either grab from dockerhub or build yourself.
storage for persistent data, on this host or mounted from elsewhere, that you mount into your Docker containers on startup. This is where e.g. mysql data folders live, or where you can persist logs, etc.
Of course, it can get much more complex from there, e.g. how to transparently scale and update your environment etc., but that is something for e.g. kubernetes or docker swarm or some other solution (I've scripted a bit on my own but do not need the robustness or elastic scalability of large systems).
Regarding cluster management, it should be noted that Swarm is now included in the Docker Core. This has created some controversy in the community and even talks of a fork of the core:
https://technologyconversations.com/2015/11/04/docker-clustering-tools-compared-kubernetes-vs-docker-swarm/
https://jaxenter.com/docker-1-12-is-probably-the-most-important-release-since-1-0-129080.html
http://searchitoperations.techtarget.com/news/450303918/Docker-fork-talk-prompts-container-standardization-brawl
http://www.infoworld.com/article/3118345/cloud-computing/why-kubernetes-is-winning-the-container-war.html
I have experience running Docker on Alibaba Cloud and AWS as well, and I did not see any difference between the two providers. Docker images are built the same way on any Linux platform, regardless of the cloud provider. Data persistence, however, has to be handled with Docker volumes. For databases, it is recommended to use a managed service such as RDS on Alibaba Cloud instead of running them in Docker.
Can I build a docker image with any type of server (eg. wildfly,
payara) and/or database server (eg. mysql, oracle) and will it work on
docker enabled cloud/datacenter?
You can build your own Docker images or use solutions that are already pre-packaged and proven by cloud providers. For example, here is an auto-clustering Docker-based implementation of GlassFish that can be run and managed on Jelastic PaaS.
If yes, what about persistent data such as database files and static
storage (e.g. images, uploaded documents, logs)? Is it stored in the
Docker image or somewhere else? What will happen to those files when
I update my application and redeploy a new image?
With the above mentioned cluster, all data is kept inside containers and stays without changes after restart. As an option, you can also connect a separate data storage container if you wish to share it across other containers.

Resources