What is the use case of Docker Ubuntu images?

I'm new to containers. I have seen Docker doing a really awesome job with virtualization. What is the point of using OS images like Ubuntu, CentOS, etc.? For example, if I need to run a MySQL server, I can pull the image and simply run it. I don't think I need the help of another OS image. Can anyone clarify this? Thanks.

These Linux-distribution images are useful as a base for further images. It’s common enough to build an Ubuntu-based image for some specific application, for example:
FROM ubuntu:18.04
RUN apt-get update && apt-get install ...
WORKDIR /app
COPY . ./
CMD ["./myapp"]
To the extent that you might need, say, a PostgreSQL client library, getting it via a standard distribution package manager is much more convenient than building it from source.
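For example, a sketch of how that client library might be pulled in on top of the Ubuntu base above (libpq-dev is the Debian/Ubuntu package name for the PostgreSQL client library; adjust it for your distribution):
FROM ubuntu:18.04
# Install the PostgreSQL client library from the distribution's package manager
# instead of compiling it from source.
RUN apt-get update \
 && apt-get install -y --no-install-recommends libpq-dev \
 && rm -rf /var/lib/apt/lists/*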
You’re right that there’s basically no point in directly running these images.
(You’re also right that you don’t need a distribution: if you have a statically linked binary, you can build an image FROM scratch that contains only the application itself, without common niceties like the system C library or a shell. I’ve mostly only seen this for Go applications; such images can be very hard to work with and debug if you’re not confident in Docker.)

Related

How do I make a Docker image of my Pi’s setup?

I’m trying to share my setup on a Raspberry Pi 4 with my teammates, so that they can all use the same setup I have now.
So far, on my Raspberry Pi, I have installed TensorFlow, everything needed for object detection, and also set up a web server with APM.
I heard that I can share this whole setup by making a Docker image, but I don’t know how.
I’ve tried pulling images of TensorFlow and APM from Docker Hub, running them all in one container, and then sharing it after committing it as an image, but then I realized that I wouldn’t be able to share the files I have for object detection.
Can anyone please explain how to make a Docker image of the entire setup on my Pi?
If you want to share your whole Raspberry Pi setup, you have to share an image of your SD card.
Such an image contains the whole filesystem and is normally a few gigabytes in size; it's not practical to share a new one with teammates for every little change.
There are different ways to create one; just search for "create image SD card".
A Docker image is similar, but much smaller, because Docker images do not contain the whole operating system.
If you want to share a Docker image, your teammates have to set up Docker on their Pis first.
You also need a registry where you can upload your images (and from which your teammates can download them). A good platform is https://hub.docker.com
It sounds to me like you mainly want to share code. For code sharing I recommend Git; there are good tutorials for https://github.com
My hint: set up one Pi with all the things you need (TensorFlow, Git, ...).
You can share that SD-card image with your teammates.
Store the project's code on GitHub. When you update the code, your teammates can get the changes with git pull; those changes are normally tiny, not the gigabytes of a whole image.
If you train big, complex models and want to share them, then a Docker image is the best choice.
In that case, share a Docker image built for the Raspberry Pi. Docker images are split into layers; with a good design those layers stay small and are easy to share.
Regarding the questions in your comment:
You can install TensorFlow in a Docker image; there are good instructions on the TensorFlow site: https://www.tensorflow.org/install/docker
I would recommend a Dockerfile like this:
# latest stable release
FROM tensorflow/tensorflow
# https://pip.pypa.io/en/stable/user_guide/#requirements-files
ADD requirements.txt requirements.txt
RUN pip install -r requirements.txt && rm -f requirements.txt
ADD src/* /app/
WORKDIR /app
ENTRYPOINT ["python"]
CMD ["-u", "yourscript.py"]
Every time you change your code, you can rebuild the image with docker build . -t <IMAGE-NAME>.
Then you have to push the image somewhere your teammates can pull it from (like hub.docker.com): docker push <IMAGE-NAME>
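Put together, the workflow might look roughly like this (yourname/pi-object-detection is a placeholder image name):
# Build the image from the Dockerfile in the current directory
docker build . -t yourname/pi-object-detection:latest
# Push it to Docker Hub so your teammates can pull it
docker push yourname/pi-object-detection:latest
# On a teammate's Pi (with Docker installed):
docker pull yourname/pi-object-detection:latest
docker run --rm yourname/pi-object-detection:latest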

Docker containers with multiple bases?

I don't really understand something basic about Docker, specifically whether I can build from multiple bases within the same Dockerfile. For example, I know that these two lines wouldn't work:
FROM X
FROM Y
(well, it would build, but then it seems to only include the image from X in the final build). Or perhaps I'm wrong and this is correct, but I still haven't seen any other Dockerfiles like this.
Why would I want to do this? For instance, if X and Y are images I found on DockerHub that I would like to build from. For a concrete example if I wanted ubuntu and I also wanted python:
FROM python:2
FROM ubuntu:latest
What is the best way to go about it? Am I just limited to one base? If I want the functionality from both, am I supposed to dig through the Dockerfiles until I find something common to both of them, copy one Dockerfile's instructions manually all the way up through its parent images until I reach the common base, and add those lines to the other Dockerfile? I imagine this is not the correct way to do it, as it seems quite involved and not in line with the simplicity that Docker aims to provide.
For a concrete example if I wanted ubuntu and I also wanted python:
FROM python:2
FROM ubuntu:latest
Ubuntu is an OS, not Python, so what you need is a base image that has Python installed.
If you check the official Python images on Docker Hub, they are based on a Linux distribution (Debian), so with a single image you get both the OS and Python; there is no reason to bother with two FROM lines, which doesn't work anyway.
Some of these tags may have names like buster or stretch in them. These are the suite code names for releases of Debian and indicate which release the image is based on. If your image needs to install any additional packages beyond what comes with the image, you'll likely want to specify one of these explicitly to minimize breakage when there are new releases of Debian.
So, for your question:
What is the best way to go about it? Am I just limited to one base? If I want the functionality from both am I supposed to go into the docker files
Yes, limit it to one base image, for example:
python:3.7-stretch
With this base image you get both the OS (Debian stretch) and Python, so you do not need a Dockerfile with two FROM lines.
Also, you do not need to maintain and build the image from scratch; use the official one and extend it as per your need.
For example
FROM python:3.7-stretch
RUN apt-get update && apt-get install -y vim
RUN pip install mathutils
Multiple FROM lines in a Dockerfile are used to create a multi-stage build. The result of a build will still be a single image, but you can run commands in one stage, and copy files to the final stage from earlier stages. This is useful if you have a full compile environment with all of the build tools to compile your application, but only want to ship the runtime environment with the image you deploy. Your first stage would be something like a full JDK with Maven or similar tools, while your final stage would be just your JRE and the JAR file copied from the first stage.
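A minimal sketch of that pattern (the image tags and the app.jar artifact name are illustrative):
# Build stage: full JDK plus Maven to compile the application
FROM maven:3.8-openjdk-11 AS build
WORKDIR /src
COPY . .
RUN mvn package
# Final stage: only the JRE and the built artifact
FROM openjdk:11-jre-slim
COPY --from=build /src/target/app.jar /app.jar
CMD ["java", "-jar", "/app.jar"]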

How to compose it?

Goal: build an OpenCV Docker image.
Dockerfile creation:
FROM ubuntu:14.04
or
FROM python:3.7
Which should I choose, and why?
I'm trying to write the Dockerfile from scratch, without copy-pasting from other people's Dockerfiles.
I would usually pick the highest-level Docker Hub library image that matches what I need. It's also worth using the https://hub.docker.com/ search box, which will often turn up relevant images, though of rather varied ownership and maintenance levels.
The official Docker Hub images tend to have thought through a lot of issues around persistence and configuration and first-time setup. Compare "I'll just apt-get install mysql-server" with all of the parts that go into the official mysql image; just importing that real-world experience and reusing it can save you some trouble.
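As a rough illustration, running the official mysql image already handles first-time initialization, configuration, and data persistence through documented environment variables and volume paths:
# MYSQL_ROOT_PASSWORD and /var/lib/mysql are documented by the official image;
# the container name and volume name here are placeholders.
docker run -d --name some-mysql \
  -e MYSQL_ROOT_PASSWORD=my-secret-pw \
  -v mysql-data:/var/lib/mysql \
  -p 3306:3306 \
  mysql:8.0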
I'd consider building my own from an OS base like ubuntu:16.04 if:
There is a requirement that Docker images must be built from some specific distribution base ("my job requires everything to be built off of CentOS so I need a CentOS-based MySQL image")
I need a combination of software versions or patches that the Docker Hub image no longer supports (jruby:9.1.16.0 is no longer being built, so if I need OS updates, I need to build my own base image)
I need an especially exotic set of build options for whatever reason ("I have a C extension that only works if the interpreter is specifically built with UTF-16 Unicode support")
I need or want very detailed control over what version(s) of software are embedded; for example if it's something Java-based where there's a JVM version and a runtime version and an application version that all could matter
In my opinion you should choose FROM python:3.7.
Since you are writing a Dockerfile for OpenCV, an open-source computer vision and machine learning library, you will probably need Python in your container as well.
If you start FROM ubuntu:14.04 you would also need to install Python in the Dockerfile, whereas starting FROM python:3.7 makes that step redundant and keeps the Dockerfile a bit shorter.
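A rough sketch of that approach (opencv-python-headless is the prebuilt OpenCV wheel from PyPI without GUI dependencies; your_script.py is a placeholder):
FROM python:3.7
# Install prebuilt OpenCV bindings instead of compiling OpenCV from source
RUN pip install --no-cache-dir opencv-python-headless numpy
WORKDIR /app
COPY . .
CMD ["python", "your_script.py"]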

What Docker scratch contains by default?

There is an option to use FROM scratch, which to me looks like a really attractive way of building my Go containers.
My question is: what does it still contain natively for running binaries? Do I need to add anything in order to reliably run Go binaries? A compiled Go binary seems to run, at least on my laptop.
My goal is to keep the image size to a minimum, both for security and for infrastructure-management reasons. In an optimal situation, my container would not be able to execute binaries or shell commands outside of the build phase.
The scratch image contains nothing. No files. But actually, that can work to your advantage. It turns out, Go binaries built with CGO_ENABLED=0 require absolutely nothing, other than what they use. There are a couple things to keep in mind:
With CGO_ENABLED=0, you can't use any C code. Actually not too hard.
With CGO_ENABLED=0, your app will not use the system DNS resolver. I don't think it does by default anyway, because that resolver is blocking while Go's native DNS resolver is non-blocking.
Your app may depend on some things that are not present:
Apps that make HTTPS calls (as in, to other services, e.g. Amazon S3 or the Stripe API) will need CA certificates in order to verify HTTPS certificate authenticity. These also have to be updated over time. This is not needed for serving HTTPS content.
Apps that need timezone awareness will need the timezone info files.
A nice alternative to FROM scratch is FROM alpine, which will include a base Alpine image - which is very tiny (5 MiB I believe) and includes musl libc, which is compatible with Go and will allow you to link to C libraries as well as compile without setting CGO_ENABLED=0. You can also leverage the fact that alpine is regularly updated, using its tzinfo and ca-certs.
(It's worth noting that the overhead of Docker layers is amortized a bit because of Docker's deduplication, though of course that is negated by how often your base image is updated. Still, it helps sell the idea of using the quite small Alpine image.)
You may not need tzinfo or ca-certs now, but it's better to be safe than sorry; you can accidentally add a dependency without realizing it breaks your build. So I recommend using alpine as your base. alpine:latest should be fine.
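On an Alpine base, those two pieces can be added like this (the package names are the standard Alpine ones; myapp is a placeholder for your compiled binary):
FROM alpine:latest
# CA certificates for outbound HTTPS calls, tzdata for timezone-aware code
RUN apk add --no-cache ca-certificates tzdata
COPY myapp /myapp
ENTRYPOINT ["/myapp"]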
Bonus: If you want the advantages of reproducible builds inside Docker, but with small image sizes, you can use the newer Docker multi-stage builds, available in Docker 17.05+.
It works a bit like this:
FROM golang:alpine
# may need some go getting if you don't vendor
ADD . /go/src/github.com/some/gorepo
RUN go build -o /app github.com/some/gorepo
# second stage: scratch (or alpine)
FROM scratch
COPY --from=0 /app /app
ENTRYPOINT ["/app"]
(I apologize if I've made any mistakes, I'm typing that from memory.)
Note that when using FROM scratch you must use the exec form of ENTRYPOINT, because the shell form won't work (it depends on the image having /bin/sh, which it won't). This isn't an issue on Alpine, which does include /bin/sh.
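To illustrate the difference between the two forms (the /app binary is hypothetical):
# exec form: runs /app directly; works on scratch
ENTRYPOINT ["/app"]
# shell form: runs /bin/sh -c '/app'; fails on scratch because there is no /bin/sh
ENTRYPOINT /app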

Additional steps in Dockerfile

I have a Docker image which is a server for a web IDE (Jupyter notebook) for Haskell.
Each time I want to allow the usage of a library in the IDE, I have to go to the Dockerfile and add the install command into it, then rebuild the image.
Another drawback of this is that I have to fork the original image on GitHub, and those changes aren't something I can contribute back to it.
I was thinking about writing another Dockerfile which pulls the base one with the FROM directive and then RUNs the commands to install the libraries. But, as they are in separate layers, the guest system does not find the Haskell package manager command.
TL;DR: I want to run stack install <library> (stack is like npm or pip, but for Haskell) from my own Dockerfile, but I don't want to maintain a fork of the base image.
How could I solve this problem?
I was thinking about writing another Dockerfile which pulls the base one with the FROM directive and then RUNs the commands to install the libraries. But, as they are in separate layers, the guest system does not find the Haskell package manager command.
This is indeed the correct way to do this, and it should work. I'm not sure I understand the "layers" problem here - the commands executed by RUN should be running in an intermediate container that contains all of the layers from the base image and the previous RUN commands. (Ignoring the possibility of multi-stage builds, but these were added in 17.05 and did not exist when this question was posted.)
The only scenario I can see where stack might work in the running container but not in a Dockerfile RUN command would be if the $PATH variable isn't set correctly at that point. Check this variable, and make sure RUN is running as the correct user.
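A sketch of that derived Dockerfile (the base image name is a placeholder for whichever Haskell/Jupyter image you are using, and lens and aeson are just example libraries):
# Hypothetical base image that already provides the Haskell Jupyter setup
FROM some/haskell-jupyter-image:latest
# Install additional Haskell libraries on top of the base image; this assumes
# stack is already on the PATH inside that image
RUN stack install lens aeson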
