How do I use Docker in the cloud or a datacenter?

I haven't had the courage to start using Docker until now, and I feel like I've arrived from the last century. I want to clear up my doubts about Docker before getting started. My question is mainly about deploying/running Docker images in a cloud or hosting environment.
Can I build a Docker image with any type of server (e.g. WildFly, Payara) and/or database server (e.g. MySQL, Oracle), and will it work on a Docker-enabled cloud/datacenter?
If yes, what about persistent data like database files and static storage (e.g. images, uploaded documents, logs)? Is that stored in the Docker images or somewhere else? What will happen to those files when I update my application and redeploy a new image?
I have read posts about what Docker is, but I couldn't find a specific answer. Forgive me if I haven't done enough googling.

I have run Docker on AWS and other cloud providers. It is really not that hard if you have some experience with system administration and/or DevOps. Regarding cloud hosters and getting started, most providers have some sort of tutorial on how to get started using Docker with their infrastructure:
http://docs.aws.amazon.com/AmazonECS/latest/developerguide/docker-basics.html
https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-linux-dockerextension/
Can I build a Docker image with any type of server (e.g. WildFly, Payara) and/or database server (e.g. MySQL, Oracle), and will it work on a Docker-enabled cloud/datacenter?
To get a server up and running, you just need the Docker engine installed on the host; there are packages for many distros:
https://docs.docker.com/engine/installation/
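For example, on a Debian/Ubuntu host the installation can be as short as the sketch below. This assumes Docker's convenience script, which is only one of the options described in the documentation above; package names and repositories differ per distro.

# Minimal sketch, assuming a Debian/Ubuntu host and Docker's convenience script.
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh

# Verify that the engine is installed and can run containers.
sudo docker version
sudo docker run --rm hello-world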
Once the Docker engine is installed, you can create Dockerfiles for basically any server or service. In most cases you hopefully will not need to, since there are countless Dockerfiles and pre-configured, vendor-maintained images already available on Docker Hub (I use WildFly, the ELK stack, and MySQL, for example). Be careful to select images that are actively maintained, otherwise you end up with security issues in your images that might never get fixed! Or you have to fix them yourself! A quick example of running one of these images follows the links below.
Example images:
https://hub.docker.com/r/jboss/wildfly/
https://hub.docker.com/_/mysql/
https://hub.docker.com/_/oraclelinux/
https://hub.docker.com/u/payara/
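Just to illustrate how little is needed to try one of these vendor images, here is a rough sketch (tags, ports and the password are placeholders; check each image's documentation on Docker Hub):

# Official MySQL image; MYSQL_ROOT_PASSWORD is required by this image.
docker pull mysql:8.0
docker run -d --name mydb -e MYSQL_ROOT_PASSWORD=changeme -p 3306:3306 mysql:8.0

# WildFly image from the jboss organisation, exposing the default HTTP port.
docker pull jboss/wildfly
docker run -d --name myapp -p 8080:8080 jboss/wildfly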
If yes, what about persistent data like database files and static storage (e.g. images, uploaded documents, logs)? Is that stored in the Docker images or somewhere else? What will happen to those files when I update my application and redeploy a new image?
In general, you will want to store persistent data externally to the Docker image and mount it into the container as a volume:
https://docs.docker.com/engine/tutorials/dockervolumes/
Some cloud based storage providers might be easier to mount or connect to in other ways, but this volume approach is standard, IMO.
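As a minimal sketch of that approach, assuming the official MySQL image (names and the password are placeholders): the named volume lives outside the container, so redeploying a new image simply reattaches the same data.

# Create a named volume and mount it over MySQL's data directory.
docker volume create mysql-data
docker run -d --name mydb \
  -e MYSQL_ROOT_PASSWORD=changeme \
  -v mysql-data:/var/lib/mysql \
  mysql:8.0

# Redeploying: remove the container, start a new one, reattach the same volume.
docker rm -f mydb
docker run -d --name mydb \
  -e MYSQL_ROOT_PASSWORD=changeme \
  -v mysql-data:/var/lib/mysql \
  mysql:8.0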
For logfiles, I actually push them to an ELK server, so having a volume for the logs is not necessarily required. However, since the ELK server is also a docker image, it does have a volume where the data is persisted.
So you have:
documentation from your cloud hoster (or docker themselves)
a host in your cloud running docker engine
0..n images that you can either grab from dockerhub or build yourself.
storage for persistent data on this host, or mounted from elsewhere, that you mount into your containers on startup. This is where e.g. the MySQL data folders live, or where you can persist logs, etc.
Of course, it can get much more complex from there, e.g. how to transparently scale and update your environment, but that is something for Kubernetes, Docker Swarm, or some other solution (I've scripted a bit on my own but do not need the robustness or elastic scalability of large systems).
Regarding cluster management, it should be noted that Swarm is now included in Docker core. This has created some controversy in the community and even talk of a fork of the core:
https://technologyconversations.com/2015/11/04/docker-clustering-tools-compared-kubernetes-vs-docker-swarm/
https://jaxenter.com/docker-1-12-is-probably-the-most-important-release-since-1-0-129080.html
http://searchitoperations.techtarget.com/news/450303918/Docker-fork-talk-prompts-container-standardization-brawl
http://www.infoworld.com/article/3118345/cloud-computing/why-kubernetes-is-winning-the-container-war.html

I have experience running Docker on Alibaba Cloud and AWS as well. I did not see any difference in working with Docker between the two cloud providers. Docker images can be built the same way on any Linux platform regardless of the cloud provider. Persistence of data, however, needs to be taken care of using Docker volumes. For databases it is recommended to use a managed service, such as RDS on Alibaba Cloud, instead of running them in Docker.

Can I build a Docker image with any type of server (e.g. WildFly, Payara) and/or database server (e.g. MySQL, Oracle), and will it work on a Docker-enabled cloud/datacenter?
You can build your own Docker images or use solutions that are already pre-packaged and proven by cloud providers. For example, there is an auto-clustering, Docker-based implementation of GlassFish that can be run and managed on the Jelastic PaaS.
If yes, what about persistent data like database files and static storage (e.g. images, uploaded documents, logs)? Is that stored in the Docker images or somewhere else? What will happen to those files when I update my application and redeploy a new image?
With the above-mentioned cluster, all data is kept inside the containers and persists unchanged across restarts. As an option, you can also connect a separate data storage container if you wish to share data across other containers.
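Outside of that platform, the generic Docker equivalent of such a data storage container is simply a named volume mounted into several containers. A small sketch (the image and path names are made up):

# One named volume shared by an application container and a second container.
docker volume create shared-uploads
docker run -d --name app -v shared-uploads:/opt/app/uploads my-app-image
docker run --rm -v shared-uploads:/data:ro alpine ls /data   # sees the same files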

Related

Does Docker collect data processed by the container (Docker Desktop)

I am going to use Azure's Computer Vision (Read 3.2) and run it on-premises as a container, and I will therefore be using Docker Desktop.
However, I have not been able to figure out if Docker collects any data that is being processed by containers running on Docker.
In https://www.docker.com/legal/security-and-privacy-guidelines under the header 'Data Privacy and Security' Docker writes:
"In general, Docker does not collect or store personal data and the use of Docker products does not result in personal data being collected or stored."
Now, to me, this sounds ambiguous. We are using Azure's on-premises container in order to stay compliant, and that part works since Azure does not collect any data from the container. However, if Docker itself collects data then that is a showstopper. FYI, I am a beginner with Docker and I might be completely off.
EDIT: So my question is, does Docker collect any of the input or the output going in and out of the container?
Thankful for any answers or wisdom you might be able to share.
Regards
As you saw, the Docker privacy policy "applies to Docker websites, products and services offered by Docker". I do not think running a Docker image as a container would be covered by those terms, and so I do not think Docker collects any information produced by the container as 'output', i.e. the standard output/error streams or the like.
Docker Desktop may collect statistics, metrics and information on the images/containers run directly under Docker Desktop, where it has access to that information, but many Docker-built images will also be run under non-Docker environments (such as Kubernetes) where Docker could not have access to such information.
As an aside, the images themselves are built up in layers, and you (or other interested parties) have access to the layers within an image, so you can see what has been added and what the effect of each layer is. Thus you could also verify that Docker (or other parties) is not harvesting data from a running container.
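For example, you can inspect the layer history and metadata of any image before running it (the image name here is just an example):

# Show every layer and the instruction that created it.
docker history --no-trunc mysql:8.0

# Dump the full image metadata (entrypoint, environment, labels, layer digests).
docker image inspect mysql:8.0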

Trying to understand how docker works in production for a scalable symfony application

I'm trying to understand how docker works in production for a scalable symfony application.
Let's say we start with a basic LAMP stack:
Apache container
php container
mysql container
According to my research, once our images are created, we push them to the registry (a Docker registry).
For production, the orchestrator will take care of creating the PODS (in the case of kubernetes) and will call the images that we have uploaded to the registry.
Did I understand right? Do we push 3 separate images to the registry?
Let me clarify a couple of things:
push them to the registry (docker registry)
You can push them to the Docker registry; however, depending on your company policy this may not be allowed, since Docker Hub (Docker's registry) is hosted somewhere on the internet and not at your company.
For that reason many enterprises deploy their own registry, something like JFrog's artifactory for example.
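Whichever registry you choose, the push itself looks the same; only the image name prefix changes. A small sketch (the registry host and repository names are placeholders):

# Build and tag the image with the address of your registry.
docker build -t registry.example.com/myteam/php-app:1.0 .

# Log in and push; your cluster nodes will later pull from the same address.
docker login registry.example.com
docker push registry.example.com/myteam/php-app:1.0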
For production, the orchestrator will take care of creating the PODS (in the case of kubernetes)
Well, you will have to tell the orchestrator what it needs to create; in the case of Kubernetes you will need to write a YAML file that describes what you want, and then it will take care of the creation.
And having an orchestrator is not only suitable for production. Ideally you would want to have a staging environment as similar to your production environment as possible.
will call the images that we have uploaded to the registry
That may need to be configured. As mentioned above, dockerhub is not the only registry out there but in the end you need to make sure that it is somehow possible to connect to the registry in order to pull the images you pushed.
Do we push 3 separate images on the registry?
You could do that; however, I would not advise you to do so.
As good as containers may have become, they also have downsides.
One of their biggest downsides is still stateful applications (read up on the difference between stateful and stateless applications).
Although it is possible to have stateful applications (such as MySQL) inside a container and orchestrate it by using something like Kubernetes, it is not advised to do so.
Containers should be ephemeral, something that does not work well with databases for example.
For that reason I would not recommend having your database within a container but instead use a virtual or even a physical machine for it.
Regarding PHP and Apache:
You may have 2 separate images for these two but it honestly is not worth the effort since there are images that already have both of them combined.
The official PHP image has variants that include Apache; better to use one of those and save yourself some maintenance effort.
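For example, the Apache variants of the official PHP image let you serve code with a single container. A sketch (the tag and paths are examples; for production you would COPY the code into your own image instead of mounting it):

# Serve the code in the current directory with PHP + Apache in one container.
docker run --rm -p 8080:80 -v "$PWD":/var/www/html php:8.2-apache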
Lastly I want to say that you cannot simply take everything from a virtual/physical server, put it into a container and expect it to work just as it used to.

Sharing docker volumes within the workplace

I have taken some time to create a useful Docker volume for use at work. It has a restored backup of one of our software databases (SQL Server) on it, and I use it for testing/debug by just attaching it to whatever Linux SQL Container I feel like running at the time.
When I make useful Docker images at work, I share them with our team using either the Azure Container Registry or the AWS Elastic Container Registry. If there's a Dockerfile I've made as part of a solution, I can store that in our Git repo for others to access.
But what about volumes? Is there a way to share these with colleagues so they don't need to go through the process I went through to build the volume in the first place? So if I've got this 'databasevolume' is there a way to source control it? Or share it as a file to other users of Docker within my team? I'm just looking to save them the time of creating a volume, downloading the .bak file from its storage location, restoring it etc.
The short answer is that there is no default Docker functionality to export the contents of a Docker volume, and docker export explicitly does not export the contents of the volumes associated with the container. You can, however, back up, restore or migrate data volumes.
Note: if you're backing up a database I'd suggest using the appropriate tools for that database.
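The usual pattern for backing up and restoring a volume is a throwaway container that mounts both the volume and a host directory and tars the contents across. A sketch using the volume name from the question (file names are examples):

# Back up the contents of 'databasevolume' into a tarball on the host.
docker run --rm \
  -v databasevolume:/volume:ro \
  -v "$PWD":/backup \
  alpine tar czf /backup/databasevolume.tar.gz -C /volume .

# A colleague recreates the volume from that tarball.
docker volume create databasevolume
docker run --rm \
  -v databasevolume:/volume \
  -v "$PWD":/backup \
  alpine tar xzf /backup/databasevolume.tar.gz -C /volume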

Kubernetes - from Minikube to production

I have created a simple PHP api application that works with a mysql database to store data. I have been experimenting with Kubernetes on my Windows 10 machine through Minikube.
I have just about got my head round the ideas involved, yet I’m not sure about how to implement this properly. So far I have used Kompose to create a set of yaml files from an existing docker-compose file. This has been half successful.
To get my application code into a pod hosting PHP, I have been using hostPath to share from my local machine. I mount to the minikube machine and share from there. I was having trouble sharing by other means. The application code is hosted in a github repo.
My questions are:
Is mounting my application code into a pod (assuming this is similar to what happens in Docker) the correct way to do this? I'm not clear exactly what information is held in an image retrieved from Docker Hub, although I have read up on how containers isolate the build environment from your machine.
How does this approach translate into a production environment hosted in the cloud? I see there are various storage types. I had, for example, wanted to try deploying on AWS just to see how this would work in practice.
I’m really looking for guidance to go from the tutorials found on the web working on my machine, to something that could be done for a customer hosted on the cloud. This might scale up to a more microservices style architecture over time.
The approach you are describing is mostly for development setups, where you want to mount your code into the container as a volume so you don't have to rebuild every time your code changes. Typically done with a docker-compose file.
For production setups, you want the Docker image to work on its own and only mount volumes for the data you want to persist; databases are the typical example. For this, EKS is deeply integrated into the AWS infrastructure and will create EBS volumes on demand. You don't need to provision any volume, or even care in most cases (unless you need multiple read-write volumes for scaling).
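For example, on EKS a PersistentVolumeClaim is usually all you need for an EBS volume to be provisioned on demand. This is a sketch assuming the EBS CSI driver is installed and a StorageClass named gp3 exists:

# Request 10Gi of storage; EKS provisions and attaches an EBS volume on demand.
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: gp3   # assumed StorageClass backed by the EBS CSI driver
  resources:
    requests:
      storage: 10Gi
EOF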
For a PHP application you really should not persist any data in the pod, because it will create other issues when you need to scale the application. Also, a good approach for managing files that need to persist is S3 (AWS simple storage service).
So generally speaking, you need a Deployment per application, a Service to access the pods of that application, and then an Ingress object to route traffic from the internet to those pods.
Your application's Docker image is really the core. You just build it with your code inside. Make sure to pass configuration using environment variables or a configuration file so you can connect to the database.
Now for Kubernetes: for each component (e.g. the PHP application, MySQL) you will most likely create a Deployment manifest that points to the Docker image and adds some configuration environment variables.
For production, you will need persistent volumes. On AWS you can simply use EBS-backed volumes.
To get traffic from the internet to your PHP application, you will need to add one or more k8s components (a minimal sketch follows this list):
A k8s Service manifest that exposes your PHP deployment/pods on a stable address. If you only have one or very few services, you can use a LoadBalancer Service, which on a cloud like AWS will create an ALB/ELB (you might need to add an annotation to your Service).
An Ingress, which is just a reverse proxy (Contour, nginx, Traefik). In a cloud environment it will map to an ALB/ELB. The advantage of this is that you can have a single ALB for all your services, i.e. save money. You can also configure routing paths or TLS termination in one place.
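A minimal sketch of those three objects using kubectl's imperative commands (the image name, hostname and ports are placeholders; in practice you would keep the generated YAML in version control):

# Deployment: runs the PHP application image pulled from your registry.
kubectl create deployment php-app --image=registry.example.com/myteam/php-app:1.0

# Service: a stable in-cluster address in front of the php-app pods.
kubectl expose deployment php-app --port=80

# Ingress: routes external traffic for a hostname to that Service.
kubectl create ingress php-app --rule="app.example.com/*=php-app:80"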

Best practice to automatically backup remotely hosted server

I am trying to set up a server for team note taking, and I am wondering what the best way is to back up its data, a.k.a. my notes, automatically.
Currently I plan to run the server in a docker image.
The docker image will be hosted by a hosting service (such as Google).
I found a free hosting service that fits my need, but it does not allow mounting volumes to a docker image.
Therefore, I think the only way for me to back up my data is to transfer it to some other cloud service.
However, this requires that I store some sort of sensitive authentication data in my Docker image, and apparently this is not cool.
So:
Is it possible to transfer data from a docker image to a cloud service without taking the risk of leaking password/private key?
Is there any other way to back up my data?
I don't have to use docker as all I need is actually Node.js.
But the server must be hosted on some remote machines because I don't have the ability/time/money to host a machine on my own...
I use Borg Backup to back up our servers (including Docker volumes) ... and it has saved the day many times due to failure and stupidity.
It transfers over SSH so comms are encrypted. The repositories it uses are also encrypted on disk so that makes all your data safe. It de-duplicates, snapshots, prunes, compresses ... the feature list is quite large.
After the first backup, subsequent backups are much faster because it only submits the changes since the previous backup.
You can also mount the snapshots as filesystems so you can hunt down the single file you deleted or just restore the whole lot. The mounts can also be done remotely.
I've configured ours to back up /home, /etc and the /var/lib/docker/volumes directories (among others).
We rent a few cheap storage VPSs and send the data up to them nightly. They're in different geographic locations with different hosting providers, you know, because we're paranoid.
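For reference, a nightly borg run over SSH roughly looks like this (the repository address and paths are examples; see the borg docs for key management and retention policies):

# One-time: create an encrypted repository on the remote storage box.
borg init --encryption=repokey ssh://backup@storage.example.com/./backups/server1

# Nightly (e.g. from cron): archive the interesting directories, then prune.
borg create --compression zstd --stats \
  ssh://backup@storage.example.com/./backups/server1::'{hostname}-{now}' \
  /home /etc /var/lib/docker/volumes

borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6 \
  ssh://backup@storage.example.com/./backups/server1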
Besides Docker Swarm secrets, don't forget bind mount strategies: you could have your data in a volume.
In that case, you can have a backup strategy run on the host (instead of in the container at runtime), which would take that volume, compress it, and save it elsewhere.
