Blazor Server app: multi-platform and Docker data persistence

I am creating a core library for Blazor Server apps that creates a core DB automatically at runtime.
Until now, I have created the database in Environment.SpecialFolder.LocalApplicationData, which I got working on multiple platforms (OSX, Ubuntu and Windows).
As I just discovered Docker's simplicity for deploying images, I am trying to make my library compatible with it.
So I face two issues:
1. Determine whether the app is hosted on a Docker image or not.
2. Persist data on a different volume that is NOT on the host if running in Docker.
Of course, if in Docker, I shall not use Environment.SpecialFolder.LocalApplicationData as this is not a persistent location on the image itself. I can mount a volume when starting the image as described here.
So my natural idea is to assume users will mount a volume with a specific path when starting the image, say
docker volume create MyAppDB
and run it with
docker run -dp 3000:3000 -v MyAppDB:/app/Data/MyAppDB myBuildDockerImage
Issue 1 can be verified by testing the existence of the folder /app/Data/MyAppDB, and once 1. is verified, 2. becomes trivial.
If the folder does not exist, I am for sure not on a Docker image... Well, am I? What if users forgot to mount the volume? Or misspelled it? Maybe the folder does not exist because I am running in a non-Docker environment...!
Is there a way to tweak my docker image when building it to force the mount volumes - i.e. created by ME and not the end-user? That seems safest... Alternatively, if not possible, can I add some specific element in Docker image to make absolutely sure I am running on the docker image I built or not?
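One possible approach, sketched here in Python rather than C# purely for illustration: have the image's Dockerfile set an environment variable with `ENV` and declare the mount point with `VOLUME`, then let the library pick the database folder from that variable instead of probing for the folder. A `VOLUME /app/Data/MyAppDB` instruction makes Docker create an anonymous volume at that path when the user mounts nothing there. The variable name `MYAPP_IN_DOCKER` and the helper `resolve_data_dir` are hypothetical; only the container path comes from the question.

```python
import os
from pathlib import Path

# Hypothetical marker; the image's Dockerfile would set it, e.g. `ENV MYAPP_IN_DOCKER=1`.
DOCKER_MARKER = "MYAPP_IN_DOCKER"
# Mount point used in the question's `docker run -v MyAppDB:/app/Data/MyAppDB ...`.
DOCKER_DATA_DIR = Path("/app/Data/MyAppDB")


def resolve_data_dir(environ, local_app_data):
    """Return the folder that should hold the database file.

    environ: mapping of environment variables (os.environ in production).
    local_app_data: per-user data folder used when not running in Docker.
    """
    if environ.get(DOCKER_MARKER) == "1":
        return DOCKER_DATA_DIR
    return Path(local_app_data) / "MyAppDB"


if __name__ == "__main__":
    # Outside Docker the marker is unset, so the local folder is chosen.
    print(resolve_data_dir(os.environ, Path.home() / ".local" / "share"))
```

Because the marker is baked into the image, a missing or misspelled `-v` option no longer changes the answer to "am I in Docker?"; it only changes which volume backs the folder.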

Related

How does Docker detect which changes should be saved and which should not?

I know that when we stop Docker our changes are lost. There are many answers on how to prevent this: commit each time. The idea is that when Docker runs, it spins up a fresh container based on the image. On the other hand, a container persists some data after it exits unless you use --rm.
Just to simplify:
If you run apt-get install vim, you must commit to save the change
BUT If you change nginx.conf or upload new file to HDFS, you do not lose the data.
So, just curious:
How does Docker know what to save and what not? E.g., at the end of apt-get install we have new files in the system. The same happens when I upload a new file. For the container/image there is NO difference, right? It is just an I/O modification. So how does Docker know which modifications should be saved when we stop the image?
The basic rules here:
Anything you explicitly store outside the container — a database, S3 — will outlive the container.
If you attach a volume to the container when you create the container using a docker run -v option or a Docker Compose volumes: option, any data written to that directory outlives the container. (If it’s a named volume, it lasts until you docker volume rm it.)
Anything else in the container filesystem is lost as soon as you docker rm the container.
If you need things like your application source code or a helper tool installed in an image, write a Dockerfile to describe how to build the image and run docker build. Check the Dockerfile into source control alongside your application.
The general theory of working with Docker is that you always start from a clean slate. When you docker build an image, you start from a base image and install your application into it; you never try to upgrade an installed application. Similarly, when you docker run a container, you start from a fresh copy of its image.
So the clearest answer to the question you ask is really, if you consistently docker rm a container when you stop it, when you docker run a new container, it will have the base image plus the content from the mounted volumes. Docker will never automatically persist anything outside of this.
You should never run docker commit: this leads to magic images that can’t be recreated later (in six months when you discover a critical security issue that risks taking your site down). Similarly, you should never install software in a running container, because it will be lost as soon as the container exits; add it to your Dockerfile and rebuild.
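The rules above can be restated as a tiny path check. This is only an illustrative model, not anything Docker itself exposes: the helper `survives_docker_rm` and the example mount point are hypothetical. A path outlives `docker rm` only if it sits under a directory that was mounted with `-v` or `volumes:`.

```python
from pathlib import PurePosixPath


def survives_docker_rm(path, volume_mounts):
    """Return True if `path` inside the container lies on a mounted volume.

    volume_mounts: container-side mount points given with `-v` / `volumes:`.
    Everything else lives in the container's own writable layer, which
    `docker rm` deletes.
    """
    target = PurePosixPath(path)
    for mount in volume_mounts:
        mount_point = PurePosixPath(mount)
        if target == mount_point or mount_point in target.parents:
            return True
    return False


# Hypothetical setup: only the nginx config directory is a volume.
mounts = ["/etc/nginx/conf.d"]
print(survives_docker_rm("/etc/nginx/conf.d/site.conf", mounts))  # True
print(survives_docker_rm("/usr/bin/vim", mounts))  # False
```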
By default, all data that a container generates is temporary: nothing persists unless you have mounted part of the host filesystem into the container or attached volumes to it.
If you find that nginx.conf is reused even after changes, I would suggest checking which directories you have mounted or mapped to Docker volumes.
The nginx configuration resides at /etc/nginx/conf.d/*, and you might be mapping a volume to this directory. In that case, if you make changes in a running container and then remove the container, the data will still persist, because it was written to the volume rather than to the container's writable layer. If you later deploy a new container with the same volume mapping, you will find that all the changes you made earlier are reflected in the new container as well.

Deploy web app in docker data container vs volume

I'm confused about common consensus that one shouldn't use data containers. I have specific use case that I want to accomplish.
I want to have docker nginx container and behind it some other container with application. To run newest version of my app I want to download ready container from my private docker registry. The application is for now purely static html, javascript something.
So my plan is to create a Docker image which will hold the files and will specify a named volume in some /webapp folder. The nginx container will serve this volume. I do not see any other way to move a bunch of files to a remote system the "docker containerized" way. Am I not actually creating a cursed data container?
Anyway, what happens when app containers are exchanged? When I stop the app container, the volume remains accessible, as it is placed on the host. When I pull and start a new version of the app container, the volume will be created again and prefilled with the image files stored at the same location, replacing the content on the host, so the nginx container will from now on serve the new version of the application. Right? What happens when I reference a volume that does not exist yet from the nginx container?
It seems that named volumes are not automatically filled with the content of the image. Also, I'm not sure how to create a named volume in a Dockerfile, as this syntax taken from here doesn't work:
FROM training/webapp
VOLUME webapp:/webapp
I think you might want what i have described here https://stackoverflow.com/a/41576040/3625317
The problem with volumes is that when a container is recreated (not with docker-compose down, but rather with docker-compose pull + up), the new container will not have your new code stored in the volume; because the volume is recycled, it still gets the old anonymous volume. The point is, you will need an anonymous volume for the code anyway, since you want it to be redeployable; not a named volume, since you want the code to be exchangeable.
On re-create the anonymous volume is not removed. Let's say you have image:v1 right now, you pull image:v2 and then do a docker-compose up. It will recreate your container based on image:v2. When this has finished, you will have a new container, but the code is still from the old container, which was based on image:v1, since the anonymous volume has not been replaced but re-assigned. docker-compose down && docker-compose up will resolve that for you, but you have to keep this in mind when dealing with your idea. (After down, the recreated container gets a fresh anonymous volume; the old one is deleted only if you pass -v.)
In general, there is a pro / con, see my other post.
Data containers in general have a different meaning and have been replaced by so-called named volumes. Data containers were used to establish a volume mount which is "named" rather than based on an anonymous volume.
In the past, you had to create a container with a volume and later mount this volume by container name (the container was the static, named part). Today, you just create a named volume and mount it by volume name; there is no need for a busybox container that exits right after start just to provide a container-name-based volume mount.

Deploy a docker app using volume create

I have a Python app using a SQLite database (it's a data collector that runs daily by cron). I want to deploy it, probably on AWS or Google Container Engine, using Docker. I see three main steps:
1. Containerize and test the app locally.
2. Deploy and run the app on AWS or GCE.
3. Backup the DB periodically and download back to a local archive.
Recent posts (on Docker, StackOverflow and elsewhere) say that since 1.9, Volumes are now the recommended way to handle persisted data, rather than the "data container" pattern. For future compatibility, I always like to use the preferred, idiomatic method, however Volumes seem to be much more of a challenge than data containers. Am I missing something??
Following the "data container" pattern, I can easily:
Build a base image with all the static program and config files.
From that image create a data container image and copy my DB and backup directory into it (simple COPY in the Dockerfile).
Push both images to Docker Hub.
Pull them down to AWS.
Run the data and base images, using "--volumes-from" to refer to the data.
Using "docker volume create":
I'm unclear how to copy my DB into the volume.
I'm very unclear how to get that volume (containing the DB) up to AWS or GCE... you can't PUSH/PULL a volume.
Am I missing something regarding Volumes?
Is there a good overview of using Volumes to do what I want to do?
Is there a recommended, idiomatic way to backup and download data (either using the data container pattern or volumes) as per my step 3?
When you first use an empty named volume, it will receive a copy of the image's volume data where it's first used (unlike a host based volume that completely overlays the mount point with the host directory). So you can initialize the volume contents in your main image as a volume, upload that image to your registry and pull that image down to your target host, create a named volume on that host, point your image to that named volume (using docker-compose makes the last two steps easy, it's really 2 commands at most docker volume create <vol-name> and docker run -v <vol-name>:/mnt <image>), and it will be populated with your initial data.
Retrieving the data from a container-based volume or a named volume is an identical process: you need to mount the volume in a container and run an export/backup to your outside location. The only difference is in the command line: instead of --volumes-from <container-id> you have -v <vol-name>:/mnt. You can use this same process to import data into the volume as well, removing the need to initialize the app image with data in its volume.
The biggest advantage of the new process is that it clearly separates data from containers. You can purge all the containers on the system without fear of losing data, and any volumes listed on the system are clear in their name, rather than a randomly assigned name. Lastly, named volumes can be mounted anywhere on the target, and you can pick and choose which of the volumes you'd like to mount if you have multiple data sources (e.g. config files vs databases).
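As a rough sketch of the backup step described above, the following Python helper assembles a `docker run` command that mounts a named volume read-only next to a host directory and writes a tar archive of the volume into it. The helper name `backup_command`, the volume name `collector-db`, the host path, and the choice of the `ubuntu` image are all assumptions for illustration; any image that ships `tar` should do. The host directory needs to be an absolute path so that Docker treats it as a bind mount rather than a volume name.

```python
import shlex


def backup_command(volume, host_dir, archive="backup.tar", image="ubuntu"):
    """Build a `docker run` argv that tars a named volume into host_dir."""
    return [
        "docker", "run", "--rm",
        "-v", f"{volume}:/mnt:ro",    # named volume, mounted read-only
        "-v", f"{host_dir}:/backup",  # host directory that receives the archive
        image,
        "tar", "cf", f"/backup/{archive}", "-C", "/mnt", ".",
    ]


# Hypothetical volume name and host path.
cmd = backup_command("collector-db", "/tmp/backups")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from a cron job; restoring works the same way with the volume mounted writable and `tar xf` in place of `tar cf`.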

How to deal with files of web applications in docker?

How do you guys deal with files of web applications for your docker containers? We are using same application for >400 customers. It's the same application with enabled/disabled modules (there are extra files).
I am currently using this approach: build the images, e.g. for Mysql, nginx+php, and then start the container with specific prepared application folder:
docker create -v /dbdata --name dbstore x/mysql /bin/true
docker run -d --volumes-from dbstore --name db1 x/mysql
docker run -d -P --name web --link db1:db1 -v /webapp:/opt/webapp x/webapp php-start index.php
IMHO, this wastes space.
I think it's a little bit complex to create >100 tags (revisions) of a webapp Docker data container.
Please advise how to manage this problem.
First, recent versions of Docker let you create and use named volumes. This means that "data-only containers" are antiquated and no longer necessary, and in fact are considered an anti-pattern these days. It's pretty straightforward to create and use a named volume:
docker volume create --name=foo
docker run -d -v "foo:/dbdata" --name "db1" x/mysql
You can view your volumes with:
docker volume ls
As far as your main question, you could take advantage of Docker's union filesystem (which could also more simply be called a "shared layer") design. What this means is that if you create two containers from the ubuntu image (e.g. docker run -d --name=one ubuntu and docker run -d --name=two ubuntu), both of those containers are going to use the same filesystem objects in the base ubuntu image. So for example the /etc/passwd file in both of those containers point to the same /etc/passwd data stored on disk. This is part of what is meant by the term "union filesystem" in the context of Docker.
So just take this knowledge a step further and "bake" those modules into your base image for use by all of the containers for your different customers. That just means creating your own image from a Dockerfile which uses FROM wordpress:latest at the top. Continuing with the WordPress example, if you wanted to make a bunch of WP plugins available, you could just store them in /var/www/html/wp-plugins (or whatever) and only enable certain ones in your configuration. Since they're baked into the image you have created (and you used the same image to create all of your different containers), all of those module files point to the same exact data stored on disk, via the union filesystem. Of course, if someone changes the code in one of their modules, for example, that individual container will store the changes in its own writable layer, but the base files will all be from the same data, not taking up any extra space. Of course, you can substitute in whichever CMS you're using.
Now, where I work, I've recently created a Docker-based hosting system for people to use. The issue is that we wanted each and every customer to have their own copy of the CMS filesystem. Even though the union filesystem means that changes to the base image would be stored in their own image layers, that wasn't good enough for the guy that signs my paycheck. They wanted each customer to have their own EBS volume with their own copy of the CMS filesystem on it. So in that situation, where you want each and every customer to have their own volume (for example in order to transport them for backup, or move to a new host, etc), then you won't be able to get around the issue of using extra storage for those files.
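To make the storage trade-off concrete, here is a back-of-the-envelope sketch. The sizes are made-up placeholders, not measurements, and `disk_usage_mb` is a hypothetical helper: with shared image layers the image is stored once and each container adds only its own changes, whereas with a full copy per customer every customer pays for the whole filesystem.

```python
def disk_usage_mb(image_mb, changes_mb, containers, shared_layers=True):
    """Estimate total disk usage for `containers` instances of one image.

    shared_layers=True: the image is stored once; each container adds only
    its own changes. shared_layers=False: every container gets a full copy.
    """
    if shared_layers:
        return image_mb + containers * changes_mb
    return containers * (image_mb + changes_mb)


# Placeholder numbers: a 500 MB image, 5 MB of changes per customer, 400 customers.
print(disk_usage_mb(500, 5, 400))                       # 2500
print(disk_usage_mb(500, 5, 400, shared_layers=False))  # 202000
```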
It depends:
If the files are static and you want to be able to move the container around easily, then I keep the files in the container by just copying them into the web location as a single directory.
If you have a reliable external location and you change the files more regularly (for example by using some kind of CMS), you could just run an Apache or nginx container and mount the volume.

Appropriate use of Volumes - to push files into container?

I was reading Project Atomic's guidance for images which states that the 2 main use cases for using a volume are:-
sharing data between containers
when writing large files to disk
I have neither of these use cases in my example using an Nginx image. I intended to mount a host directory as a volume in the path of the Nginx docroot in the container. This is so that I can push changes to a website's contents onto the host rather than addressing the container. I feel it is easier to use this approach since I can, for example, just add my ssh key once to the host.
My question is, is this an appropriate use of a data volume and if not can anyone suggest an alternative approach to updating data inside a container?
One of the primary reasons for using Docker is to isolate your app from the server. This means you can run your container anywhere and get the same result. This is my main use case for it.
If you look at it from that point of view, having your container depend on files on the host machine for a deployed environment is counterproductive: running the same container on a different machine may result in different output.
If you do NOT care about that, and are just using docker to simplify the installation of nginx, then yes you can just use a volume from the host system.
Think about this though...
# Dockerfile
FROM nginx
ADD . /myfiles

# docker-compose.yml
web:
  build: .
You could then use docker-machine to connect to your remote server and deploy a new version of your software with easy commands
docker-compose build
docker-compose up -d
even better, you could do
docker build -t me/myapp .
docker push me/myapp
and then deploy with
docker pull
docker run
There's a number of ways to achieve updating data in containers. Host volumes are a valid approach and probably the simplest way to achieve making your data available.
You can also copy files into and out of a container from the host. You may need to commit afterwards if you are stopping and removing the running web host container at all.
docker cp /src/www webserver:/www
You can copy files into a docker image build from your Dockerfile, which is the same process as above (copy and commit). Then restart the webserver container from the new image.
COPY /src/www /www
But I think the host volume is a good choice.
docker run -v /src/www:/www webserver command
Docker data containers are also an option for mounted volumes but they don't solve your immediate problem of copying data into your data container.
If you ever find yourself thinking "I need to ssh into this container", you are probably doing it wrong.
I'm not sure I fully understand your request. Why do you need to push files into the Nginx container that way?
My suggestion, as recommended by Docker.io, is to manage the volume in a separate Docker container.
Data volumes
A data volume is a specially-designated directory within one or more containers that bypasses the Union File System. Data volumes provide several useful features for persistent or shared data:
Volumes are initialized when a container is created. If the container’s base image contains data at the specified mount point, that existing data is copied into the new volume upon volume initialization.
Data volumes can be shared and reused among containers.
Changes to a data volume are made directly.
Changes to a data volume will not be included when you update an image.
Data volumes persist even if the container itself is deleted.
refer: Manage data in containers
As said, one of the main reasons to use docker is to achieve always the same result. A best practice is to use a data only container.
With docker inspect <container_name> you can know the path of the volume on the host and update data manually, but this is not recommended;
or you can retrieve data from an external source, like a git repository

Resources