Just setting up a new Rails app and I have my Vagrant files along with a folder full of dev machine provisioning files for Ansible. These allow me to spin up a dev virtual machine, provision it and have everything up and running really quickly.
My question is, should all that be in my projects version control repository? I will be working on this project across several machines so have it accessible and synced would be useful but on the other hand I don't wish those items to be deployed when I finally deploy it to production? Also, having those files committed would keep a history of them which would also be nice.
What would you recommend?
This is very much a thing of your personal preference.
Some people keep everything in a single self-contained repo. Other people keep application code in a separate repo from their configuration/provisioning/deployment code.
Either way have their own benefits and drawbacks and there's no wrong of doing it as long as you do keep in some version control system.
When I set up new projects I create a directory structure along the lines of:
/<application_name>
./src
./deployment
./docs
Actual source code goes in src, any deployment-specific scripts (e.g. Ansible playbook dirs, Vagrant files) go in deployment and of course any documentation goes in docs.
Then I commit all this to source control. The deployment scripts are then written to be executed from their directory but change into the src directory to perform their actions.
Related
I'm using Django but I guess the question is applicable to any web project.
In our case, there are two types of codes, the first one being python code (run in django), and others are static files (html/js/css)
I could publish new image when there is a change in any of the code.
Or I could use bind mounts for the code. (For django, we could bind-mount the project root and static directory)
If I use bind mounts for code, I could just update the production machine (probably with git pull) when there's code change.
Then, docker image will handle updates that are not strictly our own code changes. (such as library update or new setup such as setting up elasticsearch) .
Does this approach imply any obvious drawback?
For security reasons is advised to keep an operating system up to date with the last security patches but docker images are meant to be released in an immutable fashion in order we can always be able to reproduce productions issues outside production, thus the OS will not update itself for security patches being released. So this means we need to rebuild and deploy our docker image frequently in order to stay on the safe side.
So I would prefer to release a new docker image with my code and static files, because they are bound to change more often, thus requiring frequent release, meaning that you keep the OS more up to date in terms of security patches without needing to rebuild docker images in production just to keep the OS up to date.
Note I assume here that you release new code or static files at least in a weekly basis, otherwise I still recommend to update at least once a week the docker images in order to get the last security patches for all the software being used.
Generally the more Docker-oriented solutions I've seen to this problem learn towards packaging the entire application in the Docker image. That especially includes application code.
I'd suggest three good reasons to do it this way:
If you have a reproducible path to docker build a self-contained image, anyone can build and reproduce it. That includes your developers, who can test a near-exact copy of the production system before it actually goes to production. If it's a Docker image, plus this code from this place, plus these static files from this other place, it's harder to be sure you've got a perfect setup matching what goes to production.
Some of the more advanced Docker-oriented tools (Kubernetes, Amazon ECS, Docker Swarm, Hashicorp Nomad, ...) make it fairly straightforward to deal with containers and images as first-class objects, but trickier to say "this image plus this glop of additional files".
If you're using a server automation tool (Ansible, Salt Stack, Chef, ...) to push your code out, then it's straightforward to also use those to push out the correct runtime environment. Using Docker to just package the runtime environment doesn't really give you much beyond a layer of complexity and some security risks. (You could use Packer or Vagrant with this tool set to simulate the deploy sequence in a VM for pre-production testing.)
You'll also see a sequence in many SO questions where a Dockerfile COPYs application code to some directory, and then a docker-compose.yml bind-mounts the current host directory over that same directory. In this setup the container environment reflects the developer's desktop environment and doesn't really test what's getting built into the Docker image.
("Static files" wind up in a gray zone between "is it the application or is it data?" Within the context of this question I'd lean towards packaging them into the image, especially if they come out of your normal build process. That especially includes the primary UI to the application you're running. If it's things like large image or video assets that you could reasonably host on a totally separate server, it may make more sense to serve those separately.)
I'm new to docker so I have a very simple question: Where do you put your config files?
Say you want to install mongodb. You install it but then you need to create/edit a file. I don't think they fit on github since they're used for deployment though it's not a bad place to store the files.
I was just wondering if docker had any support for storing such config files so you can add them as part of running an image.
Do you have to use swarms?
Typically you'll store the configuration files on the Docker host and then use volumes to bind mount your configuration files in the container. This allows you to separately manage the configuration file from the running containers. When you make a change to the configuration, you can just restart the container.
You can then use a configuration management tool like Salt, Puppet, or Chef to manage copying/storing the configuration file onto the Docker host. Things like passwords can be managed by the secrets capabilities of the tool. When set up this way, changing a configuration file just means you need to restart your container and not build a new image.
Yes, in most cases you definitely want to keep your Dockerfiles in version control. If your org (or you personally) use GitHub for this, that's fine, but stick them wherever your other repos are. One of the main ideas in DevOps is to treat infrastructure as code. In fact, one of the main benefits of something like a Dockerfile (or a chef cookbook, or a puppet file, etc) is that it is "used for deployment" but can also be version-controlled, meaningfully diffed, etc.
We are discussing how we should deploy our application running in a docker container. At the moment, we build our application image in the pipeline containing the application code. Which means we have to build the docker image every time the application updates.
Another approach we consider is putting the application code in a volume on the server. We then pull the latest release with git on the server. So the image has not to be rebuilt.
So our discussed options are:
Build the image containing the application code
Use a volume and store the application code on the server
What is best practice to do and why?
While the other answers here have explained the point of building code into your image, I'd like to go one step further and show you how to get the benefits of both worlds while following this best practice.
Docker best practices call for building source code into your image before deployment, rather than deploying an image with dependencies installed and then source code mounted in as a volume.
This gives you a self-contained, portable container that is straightforward to test, deploy, or rollback.
May I take a stab at why you are considering hot-mounting code?
Hot-mounting code is appealing for several reasons — and they're all easy to achieve without sacrificing this best practice of building a self-contained image:
Building Docker images can be slow, so why rebuild for a minor change when you can just hot-mount the code?
A complementary best practice is to use a "base image" that installs all dependencies -- usually the slow part of building a docker image. The key insight is that this base image won't change often!
But the image that derives from it -- your application image, which installs source code -- will change with every commit you want to deploy. That derived image will be very fast to build. The Dockerfile could be as simple as:
FROM myapp/base . # all dependencies installed in base image
ADD code.tar.gz /src # automatic untaring!
CMD [...] # whatever it takes to run your app
Hot-mounting enables faster development cycles, because a developer won't need to flush their docker container, rebuild, and run a new container just to see a change.
This is a fair point. I recommend making a "dev" image (which will also derive from your base image) that enables code mounting via a volume rather than the source code installation steps you'd have in your testing and deployment images.
When you build image every time with new application you have easy way to deploy it later on to the customer or on your production server. When the docker image is ready you can keep it in the repository. Additionally you have full control on that that your docker is working with current application.
In case of keeping the application in mounted volume you have to keep in mind following problems:
life cycle of application - what to do with container when you have to update the application - gently stop, overwrite and run again
how do you deploy your application - you have to do it manually over SSH, or you want just to run simple command docker run, and it runs your latest version from your repository
The mounted volumes are rather for following casses:
you want to have externally exposed settings for container - what is also not a good idea
you want to have externally access to the data produced by the application like logs, db, etc
To automate it totally, you can:
build image for each application and push to the repository
use for example watchtower to automatic update of the system on your production servers
I believe you should follow the first approach i.e. rebuilding the docker image every time there are changes in code. Reasons are-
Firstly, if you are using volume, every time you have to manage the clean closing and removing of the previous version of the application and check whether the new version of the application is running correctly. Your new application might get affected dependencies of your previous version of the application. That need to be taken care too.
Secondly, there might be some version updates of the frameworks installed and some new frameworks are to be installed with the current application. In this case, the first approach seems to be the only option.
Thirdly, As when you are using docker volume you will be removing the most important feature of docker i.e. abstraction from outside environment. Also, the image might become machine dependent because of it, which might affect if you want to publish the app in multiple environments.
My suggestion would be creating a pipeline using some continuous integration tool and fully automate the process starting from code building, building of docker image and deploying it to your environment.
In our team we currently use vagrant as a development environment. Now I want to replace it with docker, but I can't understand the team workflow with it.
This is what confuses me: with vagrant I create a project repo with a Vagrantfile in it, and every developer pulls a repo and runs vagrant up. If the project needs some changes in environment, I edit Vagrantfile, chef recipe or requirements-file, and developers must run vagrant provision to get an updated environment.
But with docker I see at least two options:
create a Dockerfile and put it in repo, every developer builds an image from it. On every change they rebuild their own image.
build an image, put it on server, every developer pulls it and run. On every change rebuild and image on server (maybe some auto-rebuilds on server and auto-pull scripts).
Docker phylosophy is 'build once, run anywhere', but the same time we have a Dockerfile in repo... What do you think about it? How do you do this in your team?
Docker is more for production as for development
Docker is a deployment tool to package apps for production (or tests environments). It is not so much a tool for development. It is meant to create an isolated environment to run your already developed application somewhere on a server, the cloud or your laptop.
Use a Dockerfile to document the packaging
I think it is nice to have a Dockerfile in your project. Similar to a Vagrant file, it is a kind of executable documentation which describes how your production environment should look like. Somebody who is new to your project could just run the file and will get a packaged and ready-to-run container. Cool!
Use a registry to integrate Docker
I think you should provide a (private) Docker registry if you integrate Docker into your (CI) workflow (e.g. into test and build systems). A single repository to store validated and tested images of all your products will definitely speed-up your time to create new test or production systems (e.g. to scale your app or to setup an installation for a demo or a customer). If your product is open source, consider the public Docker index so people could find your stuff there. You can configure your build system to create a new Docker image after each (successful) build and to push it to the registry. Since the images are layered (and those layers are shared), this will be fast and will not take to much disk space.
If you want to integrate Docker in your development, I don't see so much possibilities:
You can create a repository with final images (as described before)
Or you can use Docker images to develop against them (e.g. to run a MongoDB)
Maybe you have a team A which programs against the API of team B and always needs a running instance of team B's product. Then you could package this product into a Docker image and share it with team A. In this case, team B should provide the image in a repository (and team A shouldn't take care how to build it and use it as a blackbox).
Edit: If you depend on many external apps
To make this "team A and team B" thing more clear: If you develop an app against many other tools, e.g. an app from another team, a MongoDB or an Elasticsearch, you can package those apps into Docker images, run them (locally) and develop against them. You will also have a good chance to find popular apps (such as MongoDB) in the public Docker Index. So instead of installing them manually, you can just pull and start them. But to put together an environment like this, you will need Vagrant again.
You could also use Docker for test environments (build and run an image and test against it). But this wouldn't be a replacement for Vagrant in development.
Vagrant + Docker
I would suggest to use both. Provide a Vagrantfile to build the development environment and provide a Dockerfile to build the production environment.
Also take a look at http://docs.vagrantup.com/v2/provisioning/docker.html. Vagrant has a Docker integration since a while, so you can create Docker containers/environments with Vagrant.
I work for Docker.
Think of Docker (and the Docker index/registry) as equivalent to Git. You don't have to make this very hard. If you change a Dockerfile, it is a cheap and quick operation to update an image. If you use "Trusted Builds" in our registry, then you can have it build automatically off of any branch at any time you want.
These are basic building blocks, but it works great for development. Docker itself is built and developed inside of Docker containers, so we know it works fine.
Authentication information such as database connection strings or passwords should almost never be stored in version control systems.
It looks like the only method of specifying environment variables for an app hosted on OpenShift is to commit them to the Git repository. There is a discussion about this on the OpenShift forums, but no useful suggested workarounds for the problem.
Is there another approach I can use to add authentication information to my app without having to commit it to the repository?
SSH into you application and navigate to your data directory
cd app-root/data
in this directory create a file with your variables (e.g. ".myenv") with content like
export MY_VAR="something"
and then in your repository in ".openshift/action_hooks/pre_start" add this line
source ${OPENSHIFT_DATA_DIR}/.myenv
Openshift supports now setting environment vaiables with the rhc commandline tool like this:
rhc set-env HEROKU_POSTGRESQL_DB_URL='jdbc:postgresql://myurl' -a myapp
I think thats way easier than all the other answers...
See: https://blog.openshift.com/taking-advantage-of-environment-variables-in-openshift-php-apps/
Adding .openshift/action_hooks/pre_start_* is not very cool, because you have to modify your repository in addition to add a file by SSH.
For nodejs, editting nodejs/configuration/node.env works well for some days, but I experienced the file got reverted several times. So it is not stable.
I found a much better solution.
echo -n foobar > ~/.env/user_vars/MY_SECRET
This works perfectly.
(Maybe this is what is done with rhc set-env ...?)
Hope this helps!
Your other option is to create an openshift branch of your project on your local machine. You can create a folder/files for the private information that only lives in your openshift branch. You would still need to source the files in your pre_start hook, something like source ${OPENSHIFT_REPO_DIR}/.private.
Then develop in your master branch, merge into your openshift branch, and push from your openshift branch to OpenShift master branch. This sound convoluted at first, but does make for a very easy workflow, especially if you're origin is shared.
This would be the workflow if your origin was on GitHub.
github/master <--> local/master --> local/openshift --> openshift/master
Notice the only bidirectional link is between github and your local master, so there should be no reason for your credentials to "escape".
This approach also has the added benefit of being able to keep any OpenShift specific changes confined to the openshift branch (like for Gemfiles, ENV variables, paths, etc).
As for security, on the OpenShift server, the repo should have the same security as your $OPENSHIFT_DATA_DIR, so you're not really exposing yourself any more.
Caveat:
Depending on your framework, the files in your $OPENSHIFT_REPO_DIR may be directly accessible via HTTP. You should be able to prevent this with an .htaccess file.