File storage options with Docker - docker

We plan to use Docker with our new asp.net core project and one of the requirements is that app will upload files and we need to have them stored permanently.
We have read that Docker creates filesystem/volumes (i might be imprecise in terminology here) per container and if container is recreated for whatever reason - filesystem/volume exposed to container is lost.
We would like to avoid storing files in our database (mongodb).
What is the usual, best practice way to have files permanently&reliably stored with Docker?

Keeping non-ephemeral data in external storage servers is one solution. An more recent approach is to use S3 or a local equivalent like minio to store shared or private data that needs to outlive the lifetime of the container.

Refer to similar question
It's possible to create data volumes in the docker image/container.
$ docker run -d -P --name web -v /webapp training/webapp python app.py
This will create a new volume inside a container at /webapp. But the files stored will be lost once the container is destroyed.
On the other hand, we can mount a host directory into a container. The host directory will then by accessible inside the container.
$ docker run -d -P --name web -v /src/webapp:/webapp training/webapp python app.py
This command mounts the host directory, /src/webapp, into the container at /webapp.
The files stored by the docker container into this mounted directory will be available even if the container is destroyed. If you are planning to persist the files beyond the life time of container this will be a good option.

Related

How to work with the files from a docker container

I need to work with all the files from a docker container, my approach is to copy all the list of files from the container to my host.
I'm using the next docker commands, for example with the postgres image:
docker create -ti --name dummy_1 postgres bash
docker cp dummy_1:/. Documents/docker/dockerOne
With this I have all the container folders and files in my host.
And then the idea is to transverse all the files with the java API, and work with them and finally delete the files and folders from local, but I would like to know if is it a better approach, maybe with Java and access directly to the container files, instead of create a local copy of the container files in my host.
Any ideas?
You can build a small server app inside your docker container which feeds you the information you need at an exposed port. Thats how i would have done it.
Maybe I don't understand the question, but you can mount a volume when you run, not create the container
docker run -v /host/path:/container/path your_container
Any code in the container (e.g. Java) that modifies files at /container/path will be reflected on the host, and not need to be copied back in/out. Similarly, any modifications on the host filesystem will be seen in the container.
I don't think I can implement an API in the docker container
Yes you can. You bind a TCP port using -p flag

Difference between data and host volume in Docker containers?

As far as I know, there are 2 options for using volume in Docker:
1. Mount a host directory as data volume:
docker run -e "ACCEPT_EULA=Y" -e "MSSQL_SA_PASSWORD=<YourStrong!Passw0rd>" -p 1433:1433
-v <host directory>/data:/var/opt/mssql/data
-d mcr.microsoft.com/mssql/server:2019-latest
2. Use data volume containers:
docker run -e "ACCEPT_EULA=Y" -e "MSSQL_SA_PASSWORD=<YourStrong!Passw0rd>" -p 1433:1433
-v sqlvolume:/var/opt/mssql -d mcr.microsoft.com/mssql/server:2019-latest
My questions:
1) In the 1st option, I think data in the /var/opt/mssql/data is completely kept on <host directory>. But in the 2nd option, where is it kept in the Docker data files?
2) Let's say I have a currently used container where /var/opt/mssql/data is stored and then decide to mount it to a <host directory>. In this scene, can I move this data directly to the host directory using the docker run command in the 2nd option? Or should I backup and restore this data?
3) When mounting data in a host directory and deleting all the containers and uninstalling Docker, we still have the data. So, after reinstalling the Docker and mounting the same database to the same host, will we get the same data before uninstalling Docker?
Your bind mounted data is kept in the bind-mounted folder on your host. Nothing really mysterious about that. For docker volumes, in most typical situations, the data is kept on your docker host. The exact place and folder architecture depends on your configuration, more specifically on the storage driver in use. Meanwhile, inspecting /var/lib/docker/volumes is probably a good start to look under the hood. Note that some drivers support storing data over the network (e.g. vieux/sshfs...), but from your question I doubt you use any of those at the moment.
I'm not totally sure what you are asking here. Meanwhile, one feature available with docker volumes (and not with bind mounts, i.e. mounting a host folder to the container) is the ability to copy the content from the path in the image to a freshly created volume when the container starts.
Data is data, wherever you keep it. If you have any sort of (thoroughly tested!) backup, you will always be able to restore it and feed it to your new container. The situation is usually quite easy to visualize regarding bind-mounts for docker beginners (i.e. backup the folder on the host, restore it and bind-mount it back to a new container). If you decide to work with volumes (which is probably a good suggestion as explained in the volume documentation introduction), see Backup, restore, or migrate data volumes to get a first idea of how to proceed.

Docker: How to create an environment variable in the host machine that points to a directory in a docker container?

I am using Docker to run four containers to run a backend web application. The backend web application uses buildout to assemble the software.
However, the frontend, which is installed and runs on the host machine (that is, not using Docker), needs to access the buildout directory inside one of the four docker containers.
Moreover, the frontend uses an environment variable called NTI_BUILDOUT_PATH that is defined on the host machine. NTI_BUILDOUT_PATH must point to the buildout directory, which is inside the aforementioned container.
My problem is that I do not know how to define NTI_BUILDOUT_PATH such that it contains a directory that points towards the buildout directory that is needed by the front end for SSL certificates and other purposes.
I have researched around the web and read about volumes and bind mounts but I do not think they can help me in my case.
One way you can do that is by copying your buildout folder into the host machine using docker cp
docker cp <backend-container-id>:<path-to-buildout> <path-to-host-folder>
For Example if your backend's container_id is d1b5365c5bca and your buildout folder is in /app/buildout inside the container. You can use the following command to copy it to the host.
docker cp d1b5365c5bca:/app/buildout /home/mahmoud/app/buildout
After that you docker rm all your containers and recreate new ones with a bind mount to the buildout folder in the host. So following the previous example we'll have
docker run -v /home/mahmoud/app/buildout:/app/buildout your-backend-image
docker run -v /home/mahmoud/app/buildout:/app/buildout -e NTI_BUILDOUT_PATH=/app/buildout your-frontend-image

Docker NodeRed committed container does not maintain flows and modules

I'm working on a project using NodeRed deployed with docker and I would like to save the state of my deployment, including flows, settings and new added modules so that I can save the image and load it on another host replicating exactly the same NodeRed instance.
I created the container using:
docker run -itd --name my-nodered node-red
After implementing the flows and installing some custom modules, with the container running I used this command:
docker commit my-nodered my-project-nodered/my-nodered:version1
docker save my-project-nodered/my-nodered:version1 > tar-archive.tar.gz
And on another machine I'd imported the image using:
docker load < tar-archive.tar.gz
And run it using:
docker run -itd my-project-nodered/my-nodered:version1
And I obtain a vanilla NodeRed docker container with a default /data directory and just the files on the data directory maintained.
What am I missing? It could be possibile that my /data directory is overwrittenm as well as my settings.js file in the home directory? And in this case, which is the best practice to achieve my target?
Thank you a lot in advance
commit will not work, as you can see that there is volume defined in the Dockerfile.
# User configuration directory volume
VOLUME ["/data"]
That makes it impossible to create a derived image with any different content in that directory tree. (This is the same reason you can't create a mysql or postgresql image with prepopulated data.)
docker commit doesn't consider volumes at all, so you'll get an unchanged image with nothing preloaded in it.
You can see the offical documentation
Managing User Data
Once you have Node-RED running with Docker, we need to ensure any
added nodes or flows are not lost if the container is destroyed. This
user data can be persisted by mounting a data directory to a volume
outside the container. This can either be done using a bind mount or a
named data volume.
Node-RED uses the /data directory inside the container to store user
configuration data.
nodered-user-data-in-docker
one way is to restore the your config file on another machine, for example backup-config then
docker run -it -p 1880:1880 -v $PWD/backup-config/:/data --name mynodered nodered/node-red-docker
or if you want to full for some repo then you can try
docker run -it --rm -v "$PWD/$(wget https://raw.githubusercontent.com/openenergymonitor/oem_node-red/master/flows_emonpi.json)":/data/ nodered/node-red-docker

Docker data container, boot2docker, and the local file system

I'm slowly working my way through understanding current Docker practices. I'm on a Mac, and I'm using boot2docker.
I've been able to use the docker -v local/directory:container/directory method to link a container directory to my local file system. Great, now I can easily edit things like site code in my local Mac file system and have the changes immediately available to my container (e.g. /var/www/html).
I'm now trying to separate my containers into discrete concerns. For example, a Web, Database, and File (e.g. busybox) container would be useful for a Wordpress site. Thing is, I don't know how to make my file container define volumes that I can then link to my local OS (similar to the -v local/directory:container/directory used by boot2docker).
This is probably not the most eloquent question, as I'm still fumbling through learning Docker, but if you can understand what I'm trying to achieve, I'd really appreciate any guidance provided.
Thanks!
Docker Volumes User Guide
I will use two docker containers for my simple example
marginalized_liskov and plagiarized_engelbart
Mount a Host Directory as a Data Volume (at runtime)
docker run -d -P --name marginalized_liskov -v /host/directory/context:/container/directory/context poop python server.py
marginalized_liskov is the name of the container.
poop is not only my favorite palindrome, but also the name of the volume that we're creating.
"/host/directory/context" is the location on the host that you want to mount
"/container/directory/context" is the location you want your new volume to be created in your container
python is of course the application to run
server.py is the argument provided to "python" for this sample.
Create a Named Volume in a container and mount another container to that volume
docker create -v /poop --name marginalized_liskov training/postgres
docker run -d --volumes-from marginalized_liskov --name plagiarized_engelbart ubuntu
This creates two containers.
marginalized_liskov gets a volume created named poop I built it from the postgres training image because that's what was used in the User Guide. Since we're just setting up a container to contain a data volume and not host applications, using the training/postgres image provides our functionality while remaining lean.
plagiarized_engelbart mounts the volumes from marginalized_liskov with the --volumes-from flag.

Resources