Storing repositories - docker

I am new to this. I am working on a project to develop an application. From what I know, people normally store the image on Docker Hub, create a Docker container from it, and pass it to Kubernetes. What we want to do instead is store the repositories without using Docker Hub. Is there a way to build something like that from scratch? Which direction should I look into?

You can install a JFrog Artifactory or Nexus artifact management server to create a repository and store your images in that artifact management repo.
Please check this link for more details.
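For instance, here is a minimal sketch of running Sonatype Nexus 3 in a container and pushing an image to it; the host name, the 8082 connector port, and the hosted-repository setup are assumptions (a real install also needs persistent storage and TLS):

$ docker run -d --name nexus -p 8081:8081 -p 8082:8082 sonatype/nexus3
# after creating a "docker (hosted)" repository bound to port 8082 in the Nexus UI:
$ docker login my-nexus-host:8082
$ docker tag myapp:1.0 my-nexus-host:8082/myapp:1.0
$ docker push my-nexus-host:8082/myapp:1.0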

Related

How to distribute docker-compose files?

I've managed to create a docker-compose file which runs my application. Now I'm wondering if there's a standard way of distributing this file? I mean, with Docker I would distribute the image built from my Dockerfile and uploaded to Docker Hub; can I also upload docker-compose files to Docker Hub?
What would the deployment flow look like here?
You can publish single images on Docker Hub, but you can't upload a docker-compose file to Docker Hub.
The approach I have seen most often is (a sketch of this flow follows below):
Create a GitHub repository containing your project (with the docker-compose file).
Explain how to build the different images in a README.md.
Push each image to Docker Hub and link your Docker Hub images to your git repositories so people can check out the whole stack.
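A minimal sketch of that flow, assuming hypothetical image and repository names:

# build and push each service image to Docker Hub
$ docker build -t my-dockerhub-user/web:1.0 ./web
$ docker build -t my-dockerhub-user/api:1.0 ./api
$ docker push my-dockerhub-user/web:1.0
$ docker push my-dockerhub-user/api:1.0
# the docker-compose.yml in the GitHub repo references those images,
# so consumers only need to clone the repo and run:
$ git clone https://github.com/my-user/my-stack.git && cd my-stack
$ docker-compose up -d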

What is the recommended way of adding documentation to docker images

It seems like there are two ways to add documentation to a docker image:
You can add a README.md in the root folder (where your Dockerfile is located), and this is meant to be parsed by the Docker Hub automated build system.
The second way is by using the manifest
https://docs.docker.com/docker-hub/publish/publish/#prepare-your-image-manifest-materials
But the documentation doesn't really explain well how to annotate the manifest file for an image. Also it looks like the manifest command is considered experimental.
What is the recommended way of documenting a docker image?
Personally I prefer not having to add documentation when the container is being built; I would much rather have a file in source control. However, the md file method seems to have minimal support.
Most modern container registries (like Docker Hub, Quay, Harbor) have a web interface that can render and display documentation in Markdown format. When you do automatic builds on Docker Hub from a GitHub repo, the git repo's README.md can be automatically synced to the repo on Docker Hub. If you build your images locally (or via a CI runner) and push them to Docker Hub, you could also push the README file using the docker-pushrm tool. It also supports container registries other than Docker Hub.
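A minimal sketch of the docker-pushrm approach, assuming the tool is installed as a Docker CLI plugin, a README.md sits in the current directory, and the repository name is hypothetical:

# push the local README.md as the repository description on Docker Hub
$ docker pushrm my-dockerhub-user/my-repo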

What is the purpose of pushing an image in a CI/CD pipeline?

Context: Reading through this blog post.
Pushing images to a registry seems to be the "right thing to do" ... but I don't understand why.
What purpose does this serve? Is it because the server I ssh into needs to have a local copy of the image? And to do that, one approach is to pull an image from a registry?
From the CI/CD perspective, a docker registry is the equivalent of an artifact repository for images. You want a central source of these images to download from as you go from one docker host to another since your build server is most likely different than your dev and prod servers.
Couldn't I just upload an image from one machine (say a CI/CD server) via ssh? Using Docker Hub seems needlessly ceremonious to me. Like in this example (I know this API is deprecated, but it illustrates my point).
It is possible to save/load images directly to a docker host, but there are a few major downsides. First, you lose any benefit from docker's layered filesystem. When building an app in CI/CD, most of the time only the last few layers should need to be rebuilt with your application changes. There should be the same previous base image and various common layers to build your app that remain identical. With a registry, these common layers are recognized, and only the difference is pushed and pulled, making your deploys faster and saving you disk space. With a save/load command, all layers are sent every time, since you do not know the state of the remote server when you run the save.
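A minimal sketch of the difference (the registry host, image name, and server name are hypothetical):

# via a registry: only layers the other side doesn't already have get transferred
$ docker push registry.example.com/myapp:1.2.3    # from the CI server
$ docker pull registry.example.com/myapp:1.2.3    # on each target host
# via save/load over ssh: the full image, all layers, is re-sent on every deploy
$ docker save myapp:1.2.3 | ssh prod-host 'docker load'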
Second, this doesn't scale as you add hosts to run images. Every host would need the image copied on the chance you want to run it on that host, e.g. to handle failover or load balancing. It also won't work if you move to swarm mode or kubernetes since you could easily add new nodes to the cluster that won't have your image. Swarm mode defaults to looking up the sha256 of the image on the registry to guarantee the same image is always used even if the tag is modified on the registry after the initial deploy.
Keep in mind you can run your own registry server (there's a docker image for it, and the API is open). Many artifact repositories (e.g. Artifactory and Nexus) include support for a Docker registry. And many cloud providers include a registry with their container offerings. So you do not need to push to the remote Docker Hub to deploy locally.
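For example, a minimal sketch using the open-source registry image (a real setup would add TLS and authentication; the image name is hypothetical):

# run a private registry locally
$ docker run -d -p 5000:5000 --name registry registry:2
# retag an image for it and push, instead of pushing to Docker Hub
$ docker tag myapp:1.2.3 localhost:5000/myapp:1.2.3
$ docker push localhost:5000/myapp:1.2.3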
One last point: a registry server is useful to developers, who can pull the same image used in dev and prod to test against other microservices they are writing locally, without needing to build everything locally or ssh to a CI/CD server (or even prod) to save and scp images back to their laptops.
Usually you use a CI/CD pipeline when you want to streamline your build/test/deploy process, and usually this happens if you have a production infrastructure to maintain that is actually critical to your business.
There is no need for a CI/CD pipeline if you're just playing around / prototyping, IMO, in which case you can build your docker images on the machine directly, or ssh an image over. That's perfectly reasonable.
Look at the 'registry' as a repository of your binary image (i.e. a fixed version of your code that ideally is versioned and you know works)
Then deploying is as simple as telling your servers to pull the image and run it, from anywhere.
On a flexible architecture, you might have nodes coming up or going down at any time, and they need to be able to pull the latest code from somewhere to get back up and running automatically, at any time, without intervention.
The registry is the single source of truth in this case. It means that you can have multiple nodes (servers) or clusters and still have a single place from which to get your images. Also, if one of your nodes goes down, you can quickly start your image on a new one. You can also automate image updates using the registry's webhooks: for example, when you push a new version of an image, the registry sends a webhook to any service that can upgrade your containers to the newest version.
Consider a Docker image as a new way of distributing your software to your servers, and a Docker registry as centralized storage for shared images (like npmjs.org for JS or Maven Central for Java).
For example, if you develop a Java application, in the years before Docker you might have shipped it as .jar files. A Docker image is better in that it also includes all OS-level dependencies, such as the JDK/JRE and system configuration. This helps you avoid the "it works on my machine" effect.
To distribute a Docker image you could also ship just the Dockerfile and build it every time on every machine; a Docker registry instead gives you centralized storage of pre-built images.
Pushing to a Docker registry in your CI/CD pipeline lets you build your artifact once and then work with the same artifact in both integration and production environments.
Using just a Dockerfile will not guarantee the same state on every build at every moment in time, because your Dockerfile may install external dependencies that can be updated or even removed between two sequential builds.
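A minimal sketch of the "build once, run the same bits everywhere" idea (registry and image names are hypothetical):

# build and push once from CI
$ docker build -t registry.example.com/myapp:1.2.3 .
$ docker push registry.example.com/myapp:1.2.3
# the pushed image has an immutable digest you can pin in deployment configs
$ docker image inspect --format '{{ index .RepoDigests 0 }}' registry.example.com/myapp:1.2.3
# pulling by that digest always gives the identical image, unlike rebuilding from the Dockerfile
$ docker pull registry.example.com/myapp@sha256:<digest-from-previous-command>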

How to tell the software version under a tag on Docker hub

I am quite new to Docker, and I am trying to find a way to tell the software version behind a Docker Hub image tag.
For instance, the jenkins/jenkins:lts-latest image listed here, https://hub.docker.com/r/jenkins/jenkins/tags/: what image version does it actually alias? And how can I infer the corresponding Dockerfile/branch in the Jenkins repo?
I tried with docker search but couldn't find it. I also tried to find a clue in the official Jenkins GitHub Dockerfile repo, https://github.com/jenkinsci/docker, but I don't see any binding tag or anything that gives me a hint about the source of the image.
Another example: I have a Kubernetes cluster, and when I check my Nexus pod I see likewise that the image is defined as sonatype/nexus3:latest.
In this case I at least have the image ID, docker-pullable://sonatype/nexus3@sha256:434a2564aa64646464afaf.., but once again I don't know how to map it to the actual version of the software.
For the repos you asked about, the answer is no.
When setting up a repo on Docker Hub, there are two options for the user to choose from:
1) Create Repository:
In this case, Docker Hub just creates a repo for the user; the user has to build the image on a local machine, tag it, and push it to Docker Hub.
When the user pushes an image to Docker Hub, no additional information about the source version is attached, so you can't get any mapping back to the source from Docker Hub.
jenkins/jenkins is this kind of repo.
2) Create Automated Build:
In this case, Docker Hub fetches the code from GitHub or Bitbucket and builds the image on its own cloud infrastructure, so it knows exactly which source commit corresponds to the current Docker image.
jenkins/jnlp-slave is this kind of repo.
You can click its Build Details on the web page, then click into one of the links, e.g. 3.26-1-alpine, and you will see the log mention that 0a0239228bf2fd26d2458a91dd507e3c564bc7d0 is the source commit.
To sum up: the repos you mentioned in the question are not Automated Builds, so you cannot get the mapping between image and source code. But if you later come across a Docker Hub repo that is an Automated Build and want to know the mapping, then you can.
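As a supplementary check (not part of the answer above), many images also embed version information in labels or environment variables that you can read locally once the image is pulled; whether a given image does so is an assumption you have to verify per image:

$ docker pull jenkins/jenkins:lts-latest
# look for version-related OCI labels
$ docker image inspect --format '{{ json .Config.Labels }}' jenkins/jenkins:lts-latest
# look for version-related environment variables
$ docker image inspect --format '{{ json .Config.Env }}' jenkins/jenkins:lts-latest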
As far as I understand your question, you are trying to tag the Docker image with exactly the same version as your software. For that, I create the image tag like this:
$ export VERSION="2.31-b19"
$ docker tag "<user>/<image>:latest" "<docker_hub_user>/<repo>:${VERSION}"
If this is not the case, please explain your use case a bit more so that we can suggest a better workaround.

Can I put my docker repository/image on GitHub/Bitbucket?

I know that Docker hub is there but it allows only for 1 private repository. Can I put these images on Github/Bitbucket?
In general you don't want to use version control on large binary images (like videos or compiled files), as git and the like were intended for 'source control', emphasis on the source. Technically, there's nothing preventing you from putting the Docker image files into git (outside the limits of the service you're using).
One major issue you'll have is that GitHub/Bitbucket have no integration with Docker, as neither provides the Docker Registry API needed for a Docker host to pull down the images as needed. This means you'll have to manually pull the image files out of the version control system if you want to use them.
If you're going to do that, why not just use S3 or something like that?
If you really want 'version control' on your images (which docker hub does not do...) you'd need to look at something like: https://about.gitlab.com/2015/02/17/gitlab-annex-solves-the-problem-of-versioning-large-binaries-with-git/
Finally, docker hub only allows one FREE private repo. You can pay for more.
So the way to go is:
Create a repository on GitHub or Bitbucket
Commit and push your Dockerfile (with config files if necessary)
Create an automated build on Docker Hub which uses the GitHub/Bitbucket repo as source.
In case you need it all to be private, you can self-host a git service like GitLab or Gogs, and of course you can also self-host a Docker registry service for the images.
Yes, since Sept. 2020.
See "Introducing GitHub Container Registry" from Kayla Ngan:
Since releasing GitHub Packages last year (May 2019), hundreds of millions of packages have been downloaded from GitHub, with Docker as the second most popular ecosystem in Packages behind npm.
Available today as a public beta, GitHub Container Registry improves how we handle containers within GitHub Packages.
With the new capabilities introduced today, you can better enforce access policies, encourage usage of a standard base image, and promote innersourcing through easier sharing across the organization.
Our users have asked for anonymous access for public container images, similar to how we enable anonymous access to public repositories of source code today.
Anonymous access is available with GitHub Container Registry today, and we’ve gotten things started today by publishing a public image of our own super-linter.
GitHub Container Registry is free for public images.
With GitHub Actions, publishing to GitHub Container Registry is easy. Actions automatically suggests workflows for you based on your work, and we’ve updated the “Publish Docker Container” workflow template to make publishing straightforward.
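A minimal sketch of pushing an image to GitHub Container Registry by hand, assuming a personal access token with the packages write scope is stored in $CR_PAT and the image name is hypothetical:

$ echo $CR_PAT | docker login ghcr.io -u <github-username> --password-stdin
$ docker tag myapp:1.2.3 ghcr.io/<github-username>/myapp:1.2.3
$ docker push ghcr.io/<github-username>/myapp:1.2.3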
GitHub is in the process of releasing something similar to ECR or Docker Hub. At the time of writing this, it's in Alpha phase and you can request access.
From GitHub:
"GitHub Package Registry is a software package hosting service, similar to npmjs.org, rubygems.org, or hub.docker.com, that allows you to host your packages and code in one place. You can host software packages privately or publicly and use them as dependencies in your projects."
https://help.github.com/en/articles/about-github-package-registry
I guess you are asking about Docker images. You can set up your own private registry to contain the Docker images. If you are not pushing only Dockerfiles but are interested in pushing whole images, then pushing images wholesale to GitHub is a very bad idea. Consider a case where you have a 600 MB Docker image: pushing it to GitHub means putting 600 MB of data into a GitHub repo, and if you keep pushing more images there, it gets terribly bad.
Also, a Docker registry intelligently stores only a single copy of each layer (a layer can be referenced by multiple images). If you use GitHub, you don't get this behaviour; you will end up storing multiple copies of large files, which is really, really bad.
I would definitely suggest going with a private Docker registry rather than GitHub.
If there is a real need to put a Docker image into GitHub/Bitbucket, you can try saving it into an archive (using https://docs.docker.com/engine/reference/commandline/save/) and committing/pushing it to your repository.
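A minimal sketch of that approach (the image name is hypothetical; for files this large, Git LFS or git-annex is usually a better fit than plain git):

# export the image to a tar archive and commit it
$ docker save -o myapp-1.2.3.tar myapp:1.2.3
$ git add myapp-1.2.3.tar && git commit -m "Add image archive" && git push
# on the consuming machine, re-import the archive into the local Docker engine
$ docker load -i myapp-1.2.3.tar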
