Google Cloud docker image file getting deleted - docker

I am running a Docker image for Jupyter and TensorBoard. The data seems to get deleted every time the VM instance is stopped. Is there a way to stop this from happening? I couldn't find anything on the web that would allow me to do this.

TL;DR: You are not persisting your data.
Docker containers do not persist data out of the box; you need to explicitly tell Docker to keep any data created inside the container when the container is removed.
You can read more on the Use volumes page of the Docker documentation.
If you want to persist data you need to follow these steps:
Create a local directory inside the VM where you want to persist data. This command should be executed on the GCE instance:
mkdir -p /opt/data/jupyterdata
Set the ownership of the folder to the user ID that the user inside your container uses. For example, imagine that your container lspvic/tensorboard-notebook runs the application as the user tensorflow with UID 1500. You then need to set the ownership of your folder to UID 1500:
chown 1500:1500 /opt/data/jupyterdata -R
Modify your docker run command to mount the local directory as a volume inside the container. For example, imagine that inside your container you want to save the files at /var/lib/jupyter (this is an example); you would modify the docker run command as follows:
docker run -it --rm -p 8888:8888 \
-v /opt/data/jupyterdata:/var/lib/jupyter:Z \
lspvic/tensorboard-notebook
NOTE: the :Z option is needed to avoid SELinux issues.
With these steps, data saved in the folder /var/lib/jupyter inside the container is stored in /opt/data/jupyterdata on the VM, so it is no longer lost.
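Before starting the container, a small pre-flight check can catch ownership mistakes that would otherwise surface as permission errors inside Jupyter. A sketch (the directory path and UID 1500 are the example values used above, not something your image necessarily uses):

```shell
#!/bin/sh
# Verify that a data directory exists and is owned by the UID the
# container runs as, before bind-mounting it.
check_data_dir() {
    dir=$1
    want_uid=$2
    [ -d "$dir" ] || { echo "missing: $dir" >&2; return 1; }
    have_uid=$(stat -c %u "$dir")
    if [ "$have_uid" != "$want_uid" ]; then
        echo "owner of $dir is UID $have_uid, expected $want_uid" >&2
        return 1
    fi
}
# Usage: check_data_dir /opt/data/jupyterdata 1500 || exit 1
```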

Related

How can I update the Prometheus config file without losing data on Docker

I have a Docker container running Prometheus, and sometimes I have to update a config file inside the container. The problem is that I don't know how I can update this file without deleting and recreating the container.
docker run --network="host" -d --name=prometheus -p 9090:9090 -v ~/prometheus.yaml:/etc/prometheus/prometheus.yml prom/prometheus --config.file=/etc/prometheus/prometheus.yml
I want to know how I can update prometheus.yaml without deleting and creating the Docker container again.
You should mount Prometheus's data path as a volume outside of your container.
That way, if the container is recreated, you still have your previous data.
The default data path of Prometheus is ./data, but in Docker it depends on your base image.
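In the official prom/prometheus image the data directory is /prometheus, so one way to apply this advice is to add a named volume there. A sketch reusing the asker's flags (the volume and container names are examples):

```shell
#!/bin/sh
# Run Prometheus with the config bind-mounted from the host and the data
# directory backed by a named volume, so metrics survive recreation.
run_prometheus() {
    docker run -d --name prometheus -p 9090:9090 \
        -v "$HOME/prometheus.yaml":/etc/prometheus/prometheus.yml \
        -v prometheus-data:/prometheus \
        prom/prometheus --config.file=/etc/prometheus/prometheus.yml
}
```

Deleting and recreating the container then only discards the container layer; the prometheus-data volume, and the metrics in it, remain.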
In theory you can't, since by design containers are ephemeral, meaning they're supposed to be disposable upon exit. However, there are a few ways out of your predicament:
#1. Create a new image from your running container to persist its state: https://www.scalyr.com/blog/create-docker-image/
#2. Copy your data from within the container to the outside world as a backup, if option 1 is not right for you (here's an explanation of how to do so: https://linuxhandbook.com/docker-cp-example/). You could also log in to the container (docker exec -it <container-name> bash) and then use yum or apt (depending on your base image) to install the tools you need for the backup (rsync, ...), if the sometimes very barebones base image does not provide them.
#3. As @Amir already mentioned, you should always create a volume inside your container and map it to the outside world to have persistent data storage. You create a volume with the VOLUME keyword in the Dockerfile: https://docs.docker.com/storage/volumes/. By doing so you can restart the container every time the config changes without worrying about data loss.
HTH
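Option #2 can be scripted with docker cp, so nothing has to be installed inside the container. A sketch (container name and paths are placeholders):

```shell
#!/bin/sh
# Copy a directory out of a (running or stopped) container and archive it.
backup_container_dir() {
    container=$1   # e.g. prometheus
    src=$2         # path inside the container, e.g. /prometheus
    dest=$3        # host directory to receive the copy
    mkdir -p "$dest" || return 1
    docker cp "$container:$src" "$dest/" || return 1
    tar -czf "$dest.tar.gz" -C "$dest" .
}
# Usage: backup_container_dir prometheus /prometheus ./prom-backup
```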
Use the reload URL
Prometheus can reload its configuration at runtime. If the new configuration is not well-formed, the changes will not be applied. A configuration reload is triggered by sending a SIGHUP to the Prometheus process or by sending an HTTP POST request to the /-/reload endpoint (when the --web.enable-lifecycle flag is enabled). This will also reload any configured rule files.
Use the following to change the config inside the container:
docker exec -it <container_name> sh
Map the config to outside the docker container for persistence using
-v <host-path>:<container_path>
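Both reload routes mentioned above can be driven from the host. A sketch (URL, port, and container name are examples):

```shell
#!/bin/sh
# Ask a running Prometheus to re-read its configuration.
reload_prometheus() {
    # HTTP route: requires Prometheus started with --web.enable-lifecycle.
    curl -fsS -X POST "${1:-http://localhost:9090}/-/reload"
}
sighup_prometheus() {
    # Signal route: send SIGHUP to PID 1 of the Prometheus container.
    docker kill --signal=HUP "${1:-prometheus}"
}
```

Since Prometheus keeps the old configuration when the new file is malformed, a reload is safe to attempt after every edit.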

Volume doesn't reflect data

When I created my installation like this:
sudo docker run -d --name myBlog -p 3001:2368 -e url=http://xxxx.com -v /var/www/myBlog/:/var/lib/ghost --restart always ghost
I was under the impression that I was specifying that whatever is in /var/lib/ghost would be available in /var/www/myBlog/, but it seems that's not the case: when I check /var/www/myBlog/ there is an empty folder. I created a new post so there would be some data, but nothing is there.
Where exactly is the data being stored, then? And is there a way I can access the image I'm currently using to see the files inside? I tried sudo docker run -it ghost, but that gets me to the base image, not the one I'm using.
Please run the command below and you should be able to see the files under the target directory:
docker run -d -e url=http://xxxx.com --name some-ghost -p 3001:2368 -v "$(pwd)/target":/var/lib/ghost/content ghost:1-alpine
Volumes work the other way around. The source is mapped into the container, mounted on top of whatever exists in that directory inside the container. When you mount over an existing directory in Linux, the parent filesystem at that location is no longer visible.
That means you'll see the host directory contents inside the container. Note this also includes filesystem permissions, and uid/gid ownership.
Named volumes (not the host bind mount used here) also have an initialization step: when a container is created with an empty named volume, the image's contents at the mount point are copied into the volume before it is mounted. An existing, non-empty volume is never modified.
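The difference is easy to see side by side. A sketch, assuming the ghost:1-alpine image mentioned above (directory and volume names are examples):

```shell
#!/bin/sh
# Bind mount: the container sees exactly what is in the host directory,
# hiding whatever the image shipped at that path.
list_content_bind() {
    docker run --rm -v "$1":/var/lib/ghost/content ghost:1-alpine \
        ls /var/lib/ghost/content
}
# Named volume: on first use Docker seeds the empty volume with the
# image's files at that path, so the listing is not empty.
list_content_volume() {
    docker volume create "$1" >/dev/null
    docker run --rm -v "$1":/var/lib/ghost/content ghost:1-alpine \
        ls /var/lib/ghost/content
}
# Usage: list_content_bind "$PWD/myBlog"    # empty if the host dir is empty
#        list_content_volume ghost-content  # shows files seeded from the image
```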

Docker NodeRed committed container does not maintain flows and modules

I'm working on a project using Node-RED deployed with Docker, and I would like to save the state of my deployment, including flows, settings, and newly added modules, so that I can save the image and load it on another host, replicating exactly the same Node-RED instance.
I created the container using:
docker run -itd --name my-nodered node-red
After implementing the flows and installing some custom modules, with the container running I used this command:
docker commit my-nodered my-project-nodered/my-nodered:version1
docker save my-project-nodered/my-nodered:version1 > tar-archive.tar.gz
And on another machine I imported the image using:
docker load < tar-archive.tar.gz
And run it using:
docker run -itd my-project-nodered/my-nodered:version1
And I obtain a vanilla Node-RED Docker container with a default /data directory; only the files outside the /data directory are maintained.
What am I missing? Could it be that my /data directory is overwritten, as well as my settings.js file in the home directory? And in that case, what is the best practice to achieve my goal?
Thanks a lot in advance.
commit will not work here because, as you can see, there is a volume defined in the Dockerfile:
# User configuration directory volume
VOLUME ["/data"]
That makes it impossible to create a derived image with any different content in that directory tree. (This is the same reason you can't create a mysql or postgresql image with prepopulated data.)
docker commit doesn't consider volumes at all, so you'll get an unchanged image with nothing preloaded in it.
You can see the official documentation:
Managing User Data
Once you have Node-RED running with Docker, we need to ensure any added nodes or flows are not lost if the container is destroyed. This user data can be persisted by mounting a data directory to a volume outside the container. This can either be done using a bind mount or a named data volume.
Node-RED uses the /data directory inside the container to store user configuration data.
nodered-user-data-in-docker
One way is to restore your config file on another machine, for example backup-config, then:
docker run -it -p 1880:1880 -v $PWD/backup-config/:/data --name mynodered nodered/node-red-docker
Or, if you want to pull the flows from some repo, fetch them first and then mount the directory, for example:
wget https://raw.githubusercontent.com/openenergymonitor/oem_node-red/master/flows_emonpi.json -P backup-config/
docker run -it --rm -p 1880:1880 -v "$PWD/backup-config":/data nodered/node-red-docker

What is the use of VOLUME in this official Dockerfile of postgresql

I found the following code in the Dockerfile of the official postgresql image: https://github.com/docker-library/postgres/blob/master/11/Dockerfile
ENV PGDATA /var/lib/postgresql/data
RUN mkdir -p "$PGDATA" && chown -R postgres:postgres "$PGDATA" && chmod 777 "$PGDATA" # this 777 will be replaced by 700 at runtime (allows semi-arbitrary "--user" values)
VOLUME /var/lib/postgresql/data
I want to know what is the purpose of VOLUME in this regard.
VOLUME /var/lib/postgresql/data
As per my understanding, it will create a new storage volume when we start a container, and that storage volume will also be deleted permanently when the container is removed (docker stop containerid; docker rm containerid).
Then, if the data is not going to persist, why use this at all? VOLUMEs are supposed to be used when we want data to persist.
My question is: what is its use if the postgres data only remains while the container is running and after that everything is wiped out? If I have done a lot of work and in the end everything is gone, then what's the use?
As per my understanding, it will create a new storage volume when we start a container, and that storage volume will also be deleted permanently when the container is removed (docker stop containerid; docker rm containerid)
If you run a container with the --rm option, anonymous volumes are deleted when the container exits. If you do not pass the --rm option when creating the container, then the -v option to docker container rm will also delete volumes. Otherwise, these anonymous volumes will persist after a stop/rm.
That said, anonymous volumes are difficult to manage since it's not clear which volume contains what data. Particularly with images like postgresql, I would prefer if they removed the VOLUME line from their Dockerfile, and instead provided a compose file that defined the volume with a name. You can see more about what the VOLUME line does and why it creates problems in my answer over here.
Your understanding of how volumes work is almost correct, but not completely.
As you stated, when you create a container from an image defining a VOLUME, docker will indeed create an anonymous volume (i.e. with a random name).
When you stop/remove the container the volume itself will not be deleted and will still be accessible by the docker volume family of commands.
Indeed, in order to remove a container and delete its associated volumes you have to use the -v flag, as in docker rm -v container-name. This command removes the container and deletes all the anonymous volumes associated with it (named volumes are never deleted unless explicitly requested via docker volume rm volume-name).
So, to sum up, the VOLUME directive inside a Dockerfile identifies the places that will host persistent data, ensuring the following:
data will survive the life of the container by default
data can be shared with other containers (i.e. --volumes-from)
The most important aspect to me is that it also serves as a sort of implicit documentation for your user to let them know where the persistent state is kept (so that they can name the volumes via the -v flag of docker run).
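The lifecycle described above is easy to observe with docker inspect. A sketch (container name and image tag are examples):

```shell
#!/bin/sh
# Print the name of the anonymous volume backing a container's VOLUME mount.
anon_volume_of() {
    docker inspect -f '{{ range .Mounts }}{{ .Name }}{{ end }}' "$1"
}
# Usage:
#   docker run -d --name pgtest -e POSTGRES_PASSWORD=secret postgres:11
#   anon_volume_of pgtest          # a random 64-character hex name
#   docker stop pgtest && docker rm pgtest
#   docker volume ls               # the anonymous volume is still listed
#   # docker rm -v pgtest would have deleted it along with the container
```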

How to run a PrestaShop Docker container with persistent data?

There is something I'm missing in many Docker examples, and that is persistent data. Am I right to conclude that every container that is stopped will lose its data?
I got this PrestaShop image running with its internal database:
https://hub.docker.com/r/prestashop/prestashop/
You just run docker run -ti --name some-prestashop -p 8080:80 -d prestashop/prestashop
Well, then you have your demo, but it's not very practical.
First of all I need to hook up an external MySQL container, but that one will also lose all its data if, for example, my server reboots.
And what about all the modules and themes that are going to be added to the PrestaShop container?
It has to do with volumes, but it is not clear to me how the host volumes need to be mapped correctly and which path on the host is normally chosen. /opt/prestashop or something?
First of all, I don't have any experience with PrestaShop. This is an example you can use for every Docker container whose data you want to persist.
With the new version of Docker (1.11) it's pretty easy to persist your data.
First create your named volume:
docker volume create --name prestashop-volume
You will see this volume in /var/lib/docker/volumes:
prestashop-volume
After you've created your named volume you can connect your container to it:
docker run -ti --name some-prestashop -p 8080:80 -d -v prestashop-volume:/path/to/what/you/want/to/persist prestashop/prestashop
(when you really want to persist everything, I think you can use the path /)
Now you can do what you want with your database.
When your container goes down or you delete it, the named volume will still be there and you can reconnect your container to it.
To make it even easier you can create a cron job which creates a .tar of the contents of /var/lib/docker/volumes/prestashop-volume/.
When really everything is gone, you can restore your volume by recreating the named volume and untarring your .tar file into it.
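The backup-and-restore flow described above boils down to two tar invocations. A sketch (all paths are examples; reading /var/lib/docker/volumes directly requires root and assumes the local volume driver's default layout):

```shell
#!/bin/sh
# Archive a directory (e.g. a named volume's backing dir) into a dated
# tarball, and unpack such a tarball into a (possibly new) directory.
backup_dir() {
    src=$1; dest_dir=$2; name=$3
    mkdir -p "$dest_dir" || return 1
    tar -czf "$dest_dir/$name-$(date +%F).tar.gz" -C "$src" .
}
restore_dir() {
    archive=$1; target=$2
    mkdir -p "$target" && tar -xzf "$archive" -C "$target"
}
# Usage (e.g. from a daily cron entry, run as root):
#   backup_dir /var/lib/docker/volumes/prestashop-volume/_data /backups prestashop
```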
