Can I put my docker repository/image on GitHub/Bitbucket? - docker

I know that Docker Hub is there, but it only allows one private repository. Can I put these images on GitHub/Bitbucket?

In general you don't want to use version control on large binary images (like videos or compiled files), as git and the like were intended for 'source control', emphasis on the source. Technically, there's nothing preventing you from putting the docker image files into git (beyond the limits of the service you're using).
One major issue is that GitHub/Bitbucket have no integration with Docker, as neither provides the Docker Registry API that a docker host needs to pull down images on demand. This means you'll have to manually pull the image files out of the version control system before you can use them.
If you're going to do that, why not just use S3 or something like that?
If you really want 'version control' on your images (which docker hub does not do...) you'd need to look at something like: https://about.gitlab.com/2015/02/17/gitlab-annex-solves-the-problem-of-versioning-large-binaries-with-git/
Finally, docker hub only allows one FREE private repo. You can pay for more.
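For completeness, the manual route would look roughly like this (the repository URL and file names are made up for illustration):

    # clone the repo that holds the committed image tarball (hypothetical URL)
    git clone https://github.com/someuser/docker-image-store.git
    cd docker-image-store

    # load the tarball into the local docker image cache and run it
    docker load -i myimage.tar
    docker run --rm myimage:latest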

So the way to go is:
Create a repository on Github or Bitbucket
Commit and push your Dockerfile (with config files if necessary)
Create an automated build on Docker Hub which uses the Github / Bitbucket repo as source.
In case you need it all private, you can self-host a git service like GitLab or Gogs, and of course you can also self-host a docker registry service for the images.
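For the self-hosted registry, a minimal sketch (host and image names are placeholders) looks like this:

    # run the open-source registry image on your own host
    docker run -d -p 5000:5000 --name registry registry:2

    # tag a local image so it points at your registry, then push it
    docker tag myapp:latest localhost:5000/myapp:latest
    docker push localhost:5000/myapp:latest

    # any docker host that can reach the registry can now pull the image
    docker pull localhost:5000/myapp:latest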

Yes, since Sept. 2020.
See "Introducing GitHub Container Registry" from Kayla Ngan:
Since releasing GitHub Packages last year (May 2019), hundreds of millions of packages have been downloaded from GitHub, with Docker as the second most popular ecosystem in Packages behind npm.
Available today as a public beta, GitHub Container Registry improves how we handle containers within GitHub Packages.
With the new capabilities introduced today, you can better enforce access policies, encourage usage of a standard base image, and promote innersourcing through easier sharing across the organization.
Our users have asked for anonymous access for public container images, similar to how we enable anonymous access to public repositories of source code today.
Anonymous access is available with GitHub Container Registry today, and we’ve gotten things started today by publishing a public image of our own super-linter.
GitHub Container Registry is free for public images.
With GitHub Actions, publishing to GitHub Container Registry is easy. Actions automatically suggests workflows for you based on your work, and we’ve updated the “Publish Docker Container” workflow template to make publishing straightforward.
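Outside of Actions, pushing to GitHub Container Registry from the command line is the usual tag-and-push flow. This sketch assumes a personal access token in the CR_PAT environment variable and uses placeholder owner/image names:

    # authenticate against ghcr.io with a personal access token
    echo $CR_PAT | docker login ghcr.io -u USERNAME --password-stdin

    # tag a local image for the registry and push it
    docker tag myimage:latest ghcr.io/OWNER/myimage:latest
    docker push ghcr.io/OWNER/myimage:latest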

GitHub is in the process of releasing something similar to ECR or Docker Hub. At the time of writing this, it's in Alpha phase and you can request access.
From GitHub:
"GitHub Package Registry is a software package hosting service, similar to npmjs.org, rubygems.org, or hub.docker.com, that allows you to host your packages and code in one place. You can host software packages privately or publicly and use them as dependencies in your projects."
https://help.github.com/en/articles/about-github-package-registry

I guess you are talking about docker images. You can set up your own private registry to hold them. If you are not pushing only Dockerfiles but want to push whole images, then pushing the images to GitHub is a very bad idea. Say you have a 600 MB docker image: pushing it to GitHub means putting 600 MB of binary data into a git repo, and if you keep pushing more images there it quickly becomes unmanageable.
Also, a docker registry is smart enough to store only a single copy of each layer (and a layer can be referenced by multiple images). If you use GitHub, you lose that benefit and end up storing multiple copies of large files, which is really wasteful.
I would definitely suggest going with a private docker registry rather than GitHub.
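You can see that layer sharing for yourself: two images built from the same base report identical layer digests, and a registry stores each digest only once (the image names here are just examples):

    # compare the filesystem layer digests of two images built from the same base
    docker inspect --format '{{json .RootFS.Layers}}' myapp:1.0
    docker inspect --format '{{json .RootFS.Layers}}' myapp:1.1

    # digests common to both are stored once in a registry; pushing the second
    # image reports those layers as "Layer already exists"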

If there is a real need to put a docker image on GitHub/Bitbucket, you can save it to an archive (using https://docs.docker.com/engine/reference/commandline/save/) and commit/push the archive to your repository.
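A minimal sketch of that approach (image and file names are placeholders):

    # export the image to a tarball
    docker save -o myimage.tar myimage:latest

    # commit and push the tarball like any other file
    git add myimage.tar
    git commit -m "add myimage as a tarball"
    git push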

Related

What is the recommended way of adding documentation to docker images

It seems like there are two ways to add documentation to a docker image:
You can add a readme.md in the root folder (where your Dockerfile is located); this is meant to be parsed by the Docker Hub automated build system.
The second way is by using the manifest:
https://docs.docker.com/docker-hub/publish/publish/#prepare-your-image-manifest-materials
But the documentation doesn't really explain well how to annotate the manifest file for an image. Also it looks like the manifest command is considered experimental.
What is the recommended way of documenting a docker image?
Personally I prefer not having to add documentation when the container is being built; I would much rather have a file in source control. However, the md file method seems to have minimal support.
Most modern container registries (like Docker Hub, Quay, Harbor) have a web interface that can render and display documentation in Markdown format. When you do automated builds on Docker Hub from a GitHub repo, the git repo's README.md can be automatically synced to the repository on Docker Hub. If you build your images locally (or via a CI runner) and push them to Docker Hub, you can also push the README file using the docker-pushrm tool. It supports other container registries besides Docker Hub as well.
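As a rough sketch, assuming docker-pushrm is installed as a Docker CLI plugin and using a placeholder repository name, the push would be something like:

    # push the README.md from the current directory to the Docker Hub repo page
    docker pushrm myuser/myrepo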

What is the purpose of pushing an image in a CI/CD pipeline?

Context: Reading through this blog post.
Pushing images to a registry seems to be the "right thing to do" ... but I don't understand why.
What purpose does this serve? Is it because the server I ssh into needs to have a local copy of the image? And to do that, one approach is to pull an image from a registry?
From the CI/CD perspective, a docker registry is the equivalent of an artifact repository for images. You want a central source of these images to download from as you go from one docker host to another since your build server is most likely different than your dev and prod servers.
Couldn't I just upload an image from one machine (say a CI/CD server) via ssh? Using Docker Hub seems needlessly ceremonious to me. Like in this example (I know this API is deprecated but it illustrates my point).
It is possible to save/load images directly to a docker host, but there are a few major downsides. First, you lose any benefit from docker's layered filesystem. When building an app in CI/CD, most of the time only the last few layers need to be rebuilt with your application changes; the base image and the various common layers used to build your app remain identical. With a registry, these common layers are recognized and only the difference is pushed and pulled, making your deploys faster and saving you disk space. With a save/load command, all layers are sent every time, since you do not know the state of the remote server when you run the save.
Second, this doesn't scale as you add hosts to run images. Every host would need the image copied on the chance you want to run it on that host, e.g. to handle failover or load balancing. It also won't work if you move to swarm mode or kubernetes since you could easily add new nodes to the cluster that won't have your image. Swarm mode defaults to looking up the sha256 of the image on the registry to guarantee the same image is always used even if the tag is modified on the registry after the initial deploy.
Keep in mind you can run your own registry server (there's a docker image and the api is open). Many artifact repositories (e.g. artifactory and nexus) include support for a docker registry. And many cloud providers include a registry with their container offerings. So you do not need to push to a remote docker hub to deploy locally.
One last point is that a registry server is useful to developers who can now pull the same image used in dev and prod to test against other microservices they are writing locally without the need to build everything locally or ssh to a CI/CD server or even prod to save and scp images back to their laptops.
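For comparison, the ssh-based alternative from the question looks roughly like this (host and image names are placeholders); note that the whole image is streamed on every deploy:

    # ship the entire image over ssh, bypassing any registry
    docker save myapp:latest | gzip | ssh deploy@prod-host 'gunzip | docker load'

    # versus the registry flow, where only missing layers are transferred
    docker push registry.example.com/myapp:latest        # from the CI server
    ssh deploy@prod-host docker pull registry.example.com/myapp:latest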
Usually you use a CI/CD pipeline when you want to streamline your build/test/deploy process, and usually this happens if you have a production infrastructure to maintain that is actually critical to your business.
There is no need for a CI/CD pipeline if you're just playing around / prototyping, IMO, in which case you can build your docker images on the machine directly, or ssh an image over. That's perfectly reasonable.
Look at the 'registry' as a repository of your binary image (i.e. a fixed version of your code that ideally is versioned and you know works)
Then deploying is as simple as telling your servers to pull the image and run it, from anywhere.
On a flexible architecture, you might have nodes coming up or going down at any time, and they need to be able to pull the latest code from somewhere to get back up and running automatically, at any time, without intervention.
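In that setup a node only needs to know the image reference to get back up; a minimal sketch (registry URL, image name and ports are placeholders):

    # any node can bootstrap itself by pulling a known-good, versioned image
    docker pull registry.example.com/myapp:1.4.2
    docker run -d --name myapp -p 80:8080 registry.example.com/myapp:1.4.2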
The registry is the single source of truth in this case. It means that you can have multiple nodes (servers) or clusters and still have a single place from which to get your images. Also, if one of your nodes goes down, you can quickly start your image on a new one. You can also automate image updates using the registry's webhooks: for example, when you push a new version of an image, the registry sends a webhook to any service that can upgrade your containers to the newest version.
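As a sketch of the webhook idea, the open-source registry image supports notification endpoints in its config file; the endpoint URL below is hypothetical:

    # write a registry config that posts push events to a deploy service
    cat > config.yml <<'EOF'
    version: 0.1
    storage:
      filesystem:
        rootdirectory: /var/lib/registry
    http:
      addr: :5000
    notifications:
      endpoints:
        - name: deployer
          url: https://deploy.example.com/hooks/registry
          timeout: 1s
          threshold: 5
          backoff: 10s
    EOF

    # run the registry with that config mounted over the default one
    docker run -d -p 5000:5000 \
      -v "$(pwd)/config.yml:/etc/docker/registry/config.yml" registry:2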
Consider a docker image as a new way of distributing your software to your servers, and a docker registry as centralized storage for shared images (like npm.org for JS or maven.org for Java).
For example, if you develop a Java application, in the years before docker you might have distributed it as .jar files. A docker image is better in that it also includes all OS-level dependencies like the JDK/JRE and system configuration, which helps you avoid the "it works on my machine" effect.
To distribute a docker image you could also ship just the Dockerfile and build it every time on every machine. A docker registry instead gives you centralized storage of pre-built images.
Pushing to a docker registry in your CI/CD lets you build your artifact once and then work with the same artifact in both your integration and production environments.
Using just a Dockerfile will not guarantee the same state on every build at every moment in time, because you may install external dependencies in your Dockerfile that get updated or even removed between two sequential builds.
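A sketch of that "build once, run the same artifact everywhere" idea, using a placeholder registry and the git commit as an immutable tag:

    # CI: build once, tag with an immutable identifier, push
    TAG=$(git rev-parse --short HEAD)
    docker build -t registry.example.com/myapp:$TAG .
    docker push registry.example.com/myapp:$TAG

    # integration and prod both pull exactly that artifact instead of rebuilding
    docker pull registry.example.com/myapp:$TAG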

Web development workflow using Github and Docker

I learnt the basics of github and docker and both work well in my environment. On my server, I have project directories, each with a docker-compose.yml to run the necessary containers. These project directories also have the actual source files for that particular app which are mapped to virtual locations inside the containers upon startup.
My question is now- how to create a pro workflow to encapsulate all of this? Should the whole directory (including the docker-compose files) live on github? Thus each time changes are made I push the code to my remote, SSH to the server, pull the latest files and rebuild the container. This rebuilding of course means pulling the required images from dockerhub each time.
Should the whole directory (including the docker-compose files) live on github?
It is best practice to keep all source code, including Dockerfiles, configuration, etc., versioned. Thus you should put the source code, Dockerfile, and docker-compose file in a git repository. This is very common for projects on GitHub that have a docker image.
Thus each time changes are made I push the code to my remote, SSH to the server, pull the latest files and rebuild the container
Ideally this process should be encapsulated in a CI workflow using a tool like Jenkins. You basically push the code to the git repository, which triggers a Jenkins job that compiles the code, builds the image, and pushes the image to a docker registry.
This rebuilding of course means pulling the required images from dockerhub each time.
Docker is smart enough to cache the base images that have been previously pulled. Thus it will only pull the base images once on the first build.
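The build step of such a CI job is typically just a couple of docker commands; a minimal sketch (the registry name and the GIT_COMMIT variable are assumptions about your CI environment):

    # inside the CI job, after checking out the code
    docker build -t registry.example.com/myapp:${GIT_COMMIT} .
    docker push registry.example.com/myapp:${GIT_COMMIT}

    # on the server, deploying is then just a pull and a re-create of the containers
    docker-compose pull && docker-compose up -d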

Build chain in the cloud?

(I understand this question is somewhat out of scope for Stack Overflow, because it contains several problems and is somewhat vague. Suggestions on how to ask it properly are welcome.)
I have some open source projects depending in each other.
The code resides in github, the builds happen in shippable, using docker images which in turn are built on docker hub.
I have set up an artifact repo and a debian repository where shippable builds put the packages, and docker builds use them.
The build chain looks like this in terms of deliverables:
pre-zenta docker image
zenta docker image (two steps of docker build because it would time out otherwise)
zenta debian package
zenta-tools docker image
zenta-tools debian package
xslt docker image
adadocs artifacts
Currently I am triggering the builds by pushing to github and sometimes rerunning failed builds on shippable after the docker build ran.
I am looking for solutions for the following problems:
Where to put the Dockerfiles? Right now they live in the repo of the package that needs the resulting docker image for its build. This way all the information needed to build the package is in one place, but sometimes I have to trigger an extra build to get the package actually built.
How to trigger builds automatically?
..., in a way that supports git-flow? For example, if I change the code on the zenta develop branch, I want to make sure that zenta-tools will build and test against the development version of it before merging to master.
Is there a tool with which I can get an overview of the health of the whole build chain?
Since your question is related to Shippable, I've created a support issue for you here - https://github.com/Shippable/support/issues/2662. If you are interested in discussing the best way to handle your scenario, you can also send me an email at support#shippable.com. You can set up your entire flow, including building the docker images, using Shippable.

How to install Dockerfile from GitLab to allow pull and commit

Is there a way to clone a Dockerfile from GitLab with the docker command?
I want to use the feature that allows pull and commit.
I am not sure I understand correctly, but do pull and commit update the Dockerfile in the git repository? Or do they only apply locally, to the next images?
If not, is there a way to get all the changes you made since the previous image built from the Dockerfile into another Dockerfile?
I know you can clone with Git directly, but as with npm, can you also use a Git URL like git+https:// or git+ssh://?
The pull/commit commands affect the related image and operate directly against your configured registry, which is the official Docker Hub Registry unless configured otherwise. Perhaps some confusion may arise from the registry's support for Automated Builds, where the registry is directly bound to a repository and rebuilds the image every time the targeted repository branch changes.
If you wish to reuse someone's Docker image, the best approach is to simply reference it via the FROM instruction in your Dockerfile and effectively fork the image. While it's certainly possible to clone the original source repository and continue editing the Dockerfile contained therein, you usually do not want to go down that path.
So if there exists such a foo/bar image you want to continue building upon, the best, most direct approach is to create your own Dockerfile, inherit the image by setting it as the base for your succeeding instructions via FROM foo/bar, and possibly push your baz/bar image back into the registry if you want it to be publicly available for others to re-base upon.
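A minimal sketch of that approach, reusing the foo/bar and baz/bar names from above (the COPY line just stands in for whatever you add on top of the base):

    # write a Dockerfile that uses foo/bar as its base image
    cat > Dockerfile <<'EOF'
    FROM foo/bar
    COPY my-config.conf /etc/my-config.conf
    EOF

    # build your derived image and push it back to the registry
    docker build -t baz/bar .
    docker push baz/bar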
