What exactly is a Docker repository?

I can't seem to find a definition of exactly what a Docker repository is. The general approach to labelling seems to be username/imagename.
Of course, it contains Docker images, BUT do they need to be different versions of the same image, or can they be different Docker images?
For example, could I keep App1, App2, ... in the same Docker repository and just use labels to distinguish them?

By convention, Docker images are named as <owner>/<application>:<tag>. There is no technical restriction if you want to keep different applications under the same repository (i.e. the different tags don't have to be related unless you force that relation), and you can have, for example, mycompany/myuser:app1 and mycompany/myuser:app2. But this is not the way you will find most of the public images, which are tagged as I indicated before.
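A minimal sketch of that approach, assuming a repository named mycompany/myuser and two hypothetical application directories:
# Build two unrelated applications into the same repository, distinguished only by tag
docker build -t mycompany/myuser:app1 ./app1
docker build -t mycompany/myuser:app2 ./app2
# Both tags land in the same repository on the registry
docker push mycompany/myuser:app1
docker push mycompany/myuser:app2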

I believe the answer is that a Docker Repository is a labelled set of versions of an image.
Although it seems technically possible for them to be semantically different images, with the difference perhaps denoted by the label, this will be 1) confusing, because they all share the same image name (confusing for humans and for software that uses the images), 2) not in line with the intended use of repositories as far as I can tell, and 3) probably in opposition to the business model of the Docker Hub hosted registry for public and private repositories.
It was never my intention to break that business model; I was just somewhat confused, since the term repository often means something more general than a single conceptual entity. Docker provides the ability to privately host your own registries, each of which can contain many repositories.

According to the Docker docs:
A registry is a collection of repositories, and a repository is a collection of images – sort of like a GitHub repository, except the code is already built.
The notation for associating a local image with a repository on a registry is
username/repository:tag.
The :tag is optional, but recommended; it’s the mechanism that registries use to give Docker images a version. So, putting all that together, enter your username and your repository and tag names, so your existing image will upload to your desired destination:
docker tag friendlyhello username/repository:tag
I also find using the word repository a bit confusing in that context. In the end, a repository is meant to contain all the versions of an image, with the tag latest used to designate the image that will be pulled when no tag is specified in the docker pull command.
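A short sketch of how the pieces fit together, assuming a hypothetical Docker Hub user alice and the friendlyhello image from the docs:
# Tag the local image twice under the same repository: an explicit version and latest
docker tag friendlyhello alice/friendlyhello:1.0
docker tag friendlyhello alice/friendlyhello:latest
docker push alice/friendlyhello:1.0
docker push alice/friendlyhello:latest
# With no tag specified, docker pull falls back to :latest
docker pull alice/friendlyhello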

Related

Is there a search for docker images based on their content?

Let me start with the preface that I'm pretty new to the docker world.
I've found myself (and colleagues) searching for docker images on several occasions, where we looked for any docker image that contains specific binaries. E.g. we want to run selenium-based tests in a headless Chrome, so we look for an image that contains node, Chrome and selenium.
This kind of search, at least in my experience, is not well supported by Docker Hub. You can search there for each ingredient separately and find great images for them. But you can't find images containing all the ingredients at once.
Am I missing an obvious place to look for?
Needless to say we want to avoid creating our own images if we can, as we would have to maintain them in the long run, and we were of the impression that we couldn't be the first ones needing a similar image.

Microservice instances (Dockerized) with slightly different parameters in Kubernetes

I’m somewhat new to Kubernetes and not sure of the standard way to do this. I’d like to have many instances of a single microservice, but with each of the containers parameterized slightly differently. (Perhaps an environment variable passed to the container that’s different for each instance, as specified in the container spec of the .yaml file?)
It seems like a single deployment with multiple replicas wouldn’t work. Yet, having n different deployments with very slightly different .yaml files seems a bit redundant. Is there some sort of templating solution perhaps?
Or should each microservice be identical and seek out its parameters from a central service?
I realize this could be interpreted as an “opinion question” but I am looking for typical solutions.
There are definitely several ways of doing it. One popular option is to use Helm. Helm lets you define Kubernetes manifests using Go templates and package them into a single unit called a Helm Chart. Later on, you can install this Chart (install is what Helm calls saving these manifests to the Kubernetes API). When installing the Helm Chart, you can pass arguments that will be used when rendering the templates. That way you can re-use pretty much everything and just replace the significant bits of your manifests: Deployments, Services, etc.
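A minimal sketch of that workflow, assuming a hypothetical chart directory ./my-service whose templates read a value called env.MODE:
# Install the same chart twice, overriding the templated value per instance
helm install instance-a ./my-service --set env.MODE=alpha
helm install instance-b ./my-service --set env.MODE=beta
# Each release renders the same manifests, differing only in the parameters passed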
There are plenty of Helm charts available as open source projects that you can use as an example of how to create your own Chart.
And many useful guides on how to create your first Helm Chart.
Here you can find the official docs on developing your own Charts.
As an option, you can use a StatefulSet with InitContainers plus a ConfigMap.
A StatefulSet will guarantee you proper naming and ordering.
ConfigMap will let you store fine-grained information like individual properties or coarse-grained information like entire config files.
Configuration data can be consumed in pods in a variety of ways. ConfigMaps can be used to:
1) Populate the values of environment variables
2) Set command-line arguments in a container
3) Populate config files in a volume
To begin with, you can review the Kubernetes – StatefulSets article, where you can find a good explanation of how these pieces work together and inspect a prepared example of how to deploy containers from the same image but with different properties.
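A minimal sketch of option 1 above, assuming a hypothetical Deployment named app1 and a per-instance property INSTANCE_MODE:
# Store the per-instance property in a ConfigMap
kubectl create configmap app1-config --from-literal=INSTANCE_MODE=alpha
# Expose every key in that ConfigMap to the containers as environment variables
kubectl set env deployment/app1 --from=configmap/app1-config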

How to know which Docker base image will be the right one for the requirement, out of all the images on DockerHub?

How can I know which Docker base image will be the best one for the requirement, out of all the related images available on DockerHub?
Thank you.
This is a question that doesn't have an exact answer: what base image to use will depend on multiple factors, and even after considering those, you'll probably have multiple alternatives.
Here's an article that talks about this same question, and goes into detail about some tips when choosing. The conclusion ends up being:
To summarize, selecting the appropriate OS base image depends on the following factors:
What technologies are used to build the application being containerized?
What is the intended target platform on which these images will run?
What is the desirable size of the application image? (including all layers and base images)
Pure Taste!
Still, always make sure to check DockerHub for an existing image that meets the requirements you're looking to solve. You may be surprised and find exactly what you need!
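One of those factors, image size, is easy to check locally before committing to a base image; a quick sketch (exact tags and reported sizes will vary):
# Pull two candidate base images and compare their footprints
docker pull alpine:3.19
docker pull ubuntu:22.04
docker images alpine:3.19
docker images ubuntu:22.04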

Is there a way of identifying base layers for docker so I can reuse them?

I have a complex docker image that includes Linux plus a number of packages. This uses several GB for the image and each instance.
Now I want to create a simpler image on the same server, but it still needs Linux. If I use the same base layer rather than a completely different version, this would presumably save space. But how do I know what the base layer is?
I have tried docker history, but I can't find the base layer's image ID, and the other layers are all listed as "missing".
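For reference, this is the kind of inspection the question describes, using a hypothetical image name; shared layer digests between two images indicate a shared base:
# Show the layer history of the image (base layer IDs often appear as <missing>)
docker history my-complex-image
# List the layer digests directly; compare these across images to spot a common base
docker image inspect --format '{{json .RootFS.Layers}}' my-complex-image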

Should I use Dockerfiles or image commits?

I'm a little bit confused about these two options. They appear to be related. However, they're not really compatible.
For example, it seems that using Dockerfiles means that you shouldn't really be committing to images, because you should really just track the Dockerfile in git and make changes to that. Then there's no ambiguity about what is authoritative.
However, image commits seem really nice. It's so great that you could just modify a container directly and tag the changes to create another image. I understand that you can even get something like a filesystem diff from an image commit history. Awesome. But then you shouldn't use Dockerfiles. Otherwise, if you made an image commit, you'd have to go back to your Dockerfile and make some change which represents what you did.
So I'm torn. I love the idea of image commits: that you don't have to represent your image state in a Dockerfile -- you can just track it directly. But I'm uneasy about giving up the idea of some kind of manifest file which gives you a quick overview of what's in an image. It's also disconcerting to see two features in the same software package which seem to be incompatible.
Does anyone have any thoughts on this? Is it considered bad practice to use image commits? Or should I just let go of my attachment to manifest files from my Puppet days? What should I do?
Update:
To all those who think this is an opinion-based question, I'm not so sure. There are some subjective qualities to it, but I think it's mostly an objective question. Furthermore, I believe a good discussion on this topic will be informative.
In the end, I hope that anyone reading this post will come away with a better understanding of how Dockerfiles and image commits relate to each other.
Update - 2017/7/18:
I just recently discovered a legitimate use for image commits. We just set up a CI pipeline at our company and, during one stage of the pipeline, our app tests are run inside of a container. We need to retrieve the coverage results from the exited container after the test runner process has generated them (in the container's file system) and the container has stopped running. We use image commits to do this by committing the stopped container to create a new image and then running commands which display and dump the coverage file to stdout. So it's handy to have this. Apart from this very specific case, we use Dockerfiles to define our environments.
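A rough sketch of that CI step, with hypothetical container, image, and path names:
# The test container has exited; snapshot its filesystem as a throwaway image
docker commit test-run test-results:latest
# Run a one-off command against the snapshot to dump the coverage file to stdout
docker run --rm test-results:latest cat /app/coverage/coverage.xml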
Dockerfiles are a tool that is used to create images.
The result of running docker build . is an image with a commit, so it's not possible to use a Dockerfile without creating a commit. The question is: should you update the image by hand each time anything changes, and thus doom yourself to the curse of the golden image?
The curse of the golden image is a terrible curse cast upon people who must continue living with a buggy, security-hole-ridden base image to run their software on, because the person who created it was long ago devoured by the ancient ones (or moved on to a new job) and nobody knows where they got the version of imagemagick that went into that image, and it is the only thing that will link against the C++ module that was provided by that consultant the boss's son hired three years ago; and anyway it doesn't matter, because even if you figured out where imagemagick came from, the version of libstdc++ used by the JNI calls in the support tool that intern with the long hair created only exists in an unsupported version of Ubuntu anyway.
Knowing both solutions' advantages and inconveniences is a good start, because a mix of the two is probably a valid way to go.
Con: avoid the golden image dead end:
Using only commits is bad if you lose track of how to rebuild your image. You don't want to be in the state where you can't rebuild the image. This final state is here called the golden image, as the image will be your only reference, starting point and ending point at each stage. If you lose it, you'll be in a lot of trouble, since you can't rebuild it. The fatal dead end is that one day you'll need to rebuild a new one (because all the system libs are obsolete, for instance), and you'll have no idea what to install... ending in a big loss of time.
As a side note, using commits upon commits would probably be nicer if the history log were easily usable (consulting diffs, and replaying them on other images) as it is in git: you'll notice that git doesn't have this dilemma.
Pro: slick upgrades to distribute
On the other hand, layering commits has some considerable advantages in terms of distributed upgrades, and thus in bandwidth and deploy time. If you start to handle docker images the way a baker handles pancakes (which is precisely what docker permits), or want to deploy test versions instantly, you'll be happier to send just a small update in the form of a small commit rather than a whole new image. This is especially true when doing continuous integration for your customers, where bug fixes should be deployed soon and often.
Try to get the best of both worlds:
In this type of scenario, you'll probably want to tag major versions of your images, and those should come from Dockerfiles. You can then provide continuous integration versions as commits based on the tagged versions. This balances the advantages and inconveniences of the Dockerfile and layered-commit scenarios. Here, the key point is that you never stop keeping track of your images, by limiting the number of commits you allow on top of them.
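A rough sketch of that mixed workflow, with hypothetical image names and tags:
# Major versions are built reproducibly from a Dockerfile
docker build -t myapp:2.0 .
# A quick CI fix is layered on top as a commit, so the tagged base stays traceable
docker run --name ci-fix myapp:2.0 sh -c "echo patched > /tmp/hotfix-marker"
docker commit ci-fix myapp:2.0-ci42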
So I guess it depends on your scenario, and you probably shouldn't try to find a single rule. However, there are some real dead ends you should avoid (such as ending up in a "golden image" scenario), whatever the solution.
