I want to know if there is a recommended approach for configuring Docker machines using configuration files. I have a service, basically a Django app, that I configure for several users.
Until now I had a shared base image and a bunch of scripts. When I need to create a new machine for a new user, I create it in Google Compute Engine using the base image. Then I:
SSH into it
Launch a script that downloads everything via git and launches all services
Copy required credential files using scp
Is there a way to optimize some steps with Docker (using secrets or some external config management tool)?
Thanks!
Related
I have been developing a web application using Python, Flask, Docker(-Compose) and git/GitHub, and I am getting to the point where I am trying to figure out the best way/workflow to bring it to production. I have read some articles but am not sure which of the different approaches is best practice.
My current setup is purely development oriented:
Local Docker using docker-compose to build various service images (such as db, backend workers, webapp (Flask & uWSGI), nginx).
Using a .env file for docker-compose to pass configuration to the services
Source code is bind mounted from the local docker host
db data is stored in a named volume
Using local git for source control (I have connected it to a GitHub repository, but have not been using it much since I am currently the only one developing the application)
From what I understand the steps to production could be the following:
Implement docker-compose override to distinguish between dev and prod
Implement Dockerfile Multistage builds to create prod images which include the source code in the image and do not include dev dependencies
Tag and push the production images to a registry (docker, google?) or better push the git to github?
[do security scans of the prod images]
deploy/pull the prod images from the registry (or build from github) on a service like GKE for instance
Is this a common way to do it? Am I missing something?
How would I best go about using an integration/staging environment between dev and prod, so that I can first test new prod builds or debug prod images in integration?
Does GKE, for instance, offer an easy way to set up an integration environment? Or could I use the Docker installation on my NAS for that?
Any best practices for backing up production (like db data most importantly)?
Thanks in advance!
I can't find much information on what the differences are in running Airflow on Google Cloud Composer vs Docker. I am trying to switch our data pipelines that are currently on Google Cloud Composer onto Docker to just run locally but am trying to conceptualize what the difference is.
Cloud Composer is a GCP managed service for Airflow. Composer runs in something known as a Composer environment, which runs on a Google Kubernetes Engine cluster. It also makes use of various other GCP services such as:
Cloud SQL - stores the metadata associated with Airflow,
App Engine Flex - the Airflow web server runs as an App Engine Flex application, which is protected using an Identity-Aware Proxy,
GCS bucket - in order to submit a pipeline to be scheduled and run on Composer, all that we need to do is copy our Python code into a GCS bucket. Within that bucket there is a folder called DAGs. Any Python code uploaded into that folder is automatically picked up and processed by Composer.
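As a rough sketch of that upload step in Python: this assumes the google-cloud-storage client library is installed and credentials are already available, and that the folder is named dags inside the environment's bucket. The bucket name and file path are hypothetical.

```python
from pathlib import PurePosixPath


def dag_object_name(local_path: str, dags_folder: str = "dags") -> str:
    """Return the object name a DAG file gets inside the bucket's DAGs folder."""
    filename = PurePosixPath(local_path).name
    return f"{dags_folder}/{filename}"


def upload_dag(bucket_name: str, local_path: str) -> None:
    """Copy one DAG file into the Composer environment's bucket."""
    # Imported lazily so dag_object_name() works without the library installed.
    from google.cloud import storage

    client = storage.Client()
    blob = client.bucket(bucket_name).blob(dag_object_name(local_path))
    blob.upload_from_filename(local_path)


# Hypothetical usage (bucket name is an assumption):
# upload_dag("my-composer-bucket", "pipelines/my_dag.py")
```

Only the destination name matters to Composer here; how the file gets there (this client, gsutil, or the console) is a matter of taste.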
How does Cloud Composer benefit you?
Focus on your workflows, and let Composer manage the infrastructure (creating the workers, setting up the web server, the message brokers),
One-click to create a new Airflow environment,
Easy and controlled access to the Airflow Web UI,
Provide logging and monitoring metrics, and alert when your workflow is not running,
Integrate with all of Google Cloud services: Big Data, Machine Learning and so on. You can also run jobs elsewhere, e.g. on another cloud provider (Amazon).
Of course you have to pay for the hosting service, but the cost is low compared to hosting a production Airflow server on your own.
Airflow on-premise
DevOps work that needs to be done: create a new server, manage the Airflow installation, take care of dependency and package management, check server health, scaling and security.
pulling an Airflow image from a registry and creating the container,
creating a volume that maps the directory on local machine where DAGs are held, and the locations where Airflow reads them on the container,
whenever you want to submit a DAG that needs to access a GCP service, you need to take care of setting up credentials. A service account for the application should be created and its key downloaded as a JSON file that contains the credentials. This JSON file must be mounted into your Docker container, and the GOOGLE_APPLICATION_CREDENTIALS environment variable must contain the path to the JSON file inside the container.
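The volume and credentials steps above can be sketched as a small Python helper that assembles the docker run arguments. The image name, host paths and in-container paths are assumptions for illustration (check the DAG folder of the image you actually use), and the Airflow command to start inside the container is left out.

```python
import shlex


def airflow_run_command(dags_dir, key_file, image="apache/airflow",
                        container_dags="/opt/airflow/dags",
                        container_key="/secrets/gcp-key.json"):
    """Build a `docker run` argument list for a local Airflow container."""
    return [
        "docker", "run", "-d",
        # Map the host DAG directory to where Airflow reads DAGs in the container.
        "-v", f"{dags_dir}:{container_dags}",
        # Mount the service account key read-only.
        "-v", f"{key_file}:{container_key}:ro",
        # Point Google client libraries at the key's path inside the container.
        "-e", f"GOOGLE_APPLICATION_CREDENTIALS={container_key}",
        image,
    ]


# Hypothetical host paths; print the command instead of running it.
cmd = airflow_run_command("/home/me/dags", "/home/me/key.json")
print(shlex.join(cmd))
```

Note that the environment variable holds the path as seen from inside the container, not the path on the host.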
To sum up, if you don't want to deal with all of those DevOps problems, and instead just want to focus on your workflow, then Google Cloud Composer is a great solution for you.
Additionally, I would like to share with you tutorials that set up Airflow with Docker and on GCP Cloud Composer.
I am trying to copy files from a Linux directory to a GCP bucket using the "Transfer for on-premises" option. I've run the Docker installation script on Linux and the GCP bucket is created. I now need to run the docker run command to copy files. My question is: how do I specify the source and target locations in the docker command? For example:
sudo docker run --source --target --hostname=$(hostname) --agent-id-prefix=ID123456789
The short answer is you can't supply a source/destination to this command, because its purpose is not to transfer the data. This command starts the agents for the service - agents are always-running processes that help you move data.
After starting agents that have access to your files, you issue a copy command in the Cloud Console, where you can specify a source directory and target bucket+prefix. When you do this, the service will contact the agents and use them to push the data to Google Cloud in parallel, for faster transfers. See the following links for more details:
Overview of how Transfer Service for on-premises data works
Setting up the service, and how to submit a transfer job
Is it possible to configure Nexus repository manager (3.9.0) in a way which is suitable for a Docker based containerized environment?
We need a customized Docker image which contains basic configuration for the Nexus repository manager, like project-specific repositories and LDAP-based authentication for users. We found that most of the Nexus configuration lives in the database (OrientDB) used by Nexus. We also found that there is a REST interface offered by Nexus to handle configuration by third parties, but we found no configuration exporter/importer capabilities besides backup (directory servers have LDIF, application servers have command line scripts, etc.).
Right now we export the configuration as backup files, and during the customized Docker image build we copy those backup files back to the file system in the container:
FROM sonatype/nexus3:latest
[...]
# Copy backup files
COPY backup/* ${NEXUS_DATA}/backup/
When the container starts up it will pick up the backup files and Nexus will be configured the way we need. However, it would be much better if there was a way to handle this configuration via a set of config files.
All that data is stored under /nexus-data, so you can create an initial Docker container with a Docker volume or a host directory that keeps all that data. After you have preconfigured that instance, you can distribute your customized Docker image together with the Docker volume containing the Nexus data. Or, if you used a host directory, you can simply copy all that data over in a similar fashion to what you do now, but using the /nexus-data directory instead.
You can find more information at DockerHub under Persistent Data.
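A minimal sketch of that setup, written as a Python helper that assembles the docker run arguments. The volume name, host directory and published port are assumptions; only the /nexus-data mount point comes from the answer above.

```python
import shlex


def nexus_run_command(data_source, image="sonatype/nexus3", host_port=8081):
    """Build a `docker run` argument list that keeps Nexus state outside the container.

    data_source is either a named volume (e.g. "nexus-data") or an
    absolute host directory that already holds preconfigured data.
    """
    return [
        "docker", "run", "-d",
        # Publish the Nexus web port on the host.
        "-p", f"{host_port}:8081",
        # Everything Nexus persists lives under /nexus-data.
        "-v", f"{data_source}:/nexus-data",
        image,
    ]


# Named volume for the first, to-be-configured instance (hypothetical name).
print(shlex.join(nexus_run_command("nexus-data")))
# Host directory holding data copied from a preconfigured instance (hypothetical path).
print(shlex.join(nexus_run_command("/srv/nexus-data")))
```

With a host directory, the directory must be writable by the user the Nexus process runs as inside the container.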
How should applications be scripted/automatically deployed when in LXD containers?
For example, is the best way to deploy applications in LXD containers to use a bash script (which deploys an application)? How can this bash script be executed inside the container by running a command on the host?
Are there any tools/methods of doing this in a similar way to Docker recipes?
In my case, I use Ansible to:
build the LXD containers (web, database, redis for example).
connect to the containers and deploy the services and code needed.
You can also build your own images, for example with the services and/or code already deployed, and build specific containers from these images.
I was doing this before there was LXD support in Ansible (Ansible 2.2), so I prefer to use SSH instead of the lxd connection when I connect to the containers to deploy services/code. The containers come with a profile in which I have set up my SSH public key (for direct key-based SSH access, no passwords).
Take a look at my open source project on Bitbucket, devops_lxd_containers. It includes:
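Without Ansible, the plain-script route from the question can be driven from the host with the lxc client: push the script into the container, then execute it there. A minimal Python sketch, where the container name, script name and remote path are hypothetical:

```python
import subprocess


def lxd_deploy_commands(container, script, remote_path="/tmp/deploy.sh"):
    """Return the host-side commands that push a script into a container and run it."""
    return [
        # lxc file push <source> <container>/<path>
        ["lxc", "file", "push", script, f"{container}{remote_path}"],
        # Everything after `--` is executed inside the container.
        ["lxc", "exec", container, "--", "bash", remote_path],
    ]


def deploy(container, script):
    """Run the commands in order, stopping at the first failure."""
    for cmd in lxd_deploy_commands(container, script):
        subprocess.run(cmd, check=True)


# Hypothetical usage on a host with LXD installed:
# deploy("web", "deploy.sh")
```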
Scripts to build lxd image templates including Apache, tomcat, haproxy.
Scripts to demonstrate custom application image builds such as Apache hosting and key/value content and haproxy configured as a router.
Code to launch the containers and map ports so they are accessible to the larger network
Code to configure haproxy as a layer 7 proxy to route HTTP requests between boxes and containers based on URI prefix routing, using the record of where it previously deployed containers and mapped ports.
At the higher level it accepts a data-driven spec and will deploy an entire environment composed of many containers spread across many hosts and hook them all up to act as a cohesive whole via a layer 7 proxy.
Extensive documentation showing how I accomplished each major step using code snippets before automating.
Code to support zero-outage upgrades using the layer7 ability to gracefully bleed off old connections while accepting new connections at the new layer.
The entire system is built on the premise that image building is best done in layers. We build an updated Ubuntu image. From it we build a hardened Ubuntu image. From it we build a basic Apache image. From it we build an application-specific image like our apacheKV sample. The goal is to never rebuild anything more than once and to re-use common functionality, such as the basicJDK as the source for all JDK-dependent images, so we can avoid having duplicate code in any location. I have strived to keep image or template creation completely separate from deployment and port mapping. The exception is that I could not complete creation of the layer 7 routing image until we knew everything about how other images would be mapped.
I've been using Hashicorp Packer with the ansible provisioner using ansible_connection = lxd
Some notes here for constructing a template
When iterating through local files on your host system you may need to use ansible_connection = local (e.g. for stat & friends)
Using local_action in Ansible with the lxd connection still acts inside the container when using stat (but not with include_vars and the lookup function for files)
Using lots of debug messages in Ansible is helpful to know which local environment ansible is actually operating in.
I'm surprised no one here mentioned Canonical's own tool for managing LXD.
https://juju.is
It is super simple and well supported; the only caveat is that it requires you to turn off IPv6 on the LXD/LXC side of things (in the network bridge).
snap install juju --classic
juju bootstrap localhost
From there you can learn about Juju models and deploy machines or prebaked images such as Ubuntu:
juju deploy ubuntu