Docker DinD fails to build images after upgrading

We are using Docker 18.09.8-dind. DinD (Docker-in-Docker) means running Docker inside a separate container: we send build requests to that container instead of running Docker on the machine that wants the built image.
We needed to upgrade from 18.09.8-dind to 20.10.14-dind. Since we use Kubernetes, we just updated the image version in some YAML files:
spec:
  containers:
  - name: builder
-   image: docker:18.09.8-dind
+   image: docker:20.10.14-dind
    args: ["--storage-driver", "overlay2", "--mtu", "1460"]
    imagePullPolicy: Always
    resources:
Alas, things stopped working after that. Builds failed, and we found these error messages in the code that reaches out to our Docker builder:
{"errno":-111,"code":"ECONNREFUSED","syscall":"connect","address":"123.456.789.10","port":2375}
Something went wrong and the entire build was interrupted due to an incorrect configuration file or build step,
check your source code.
What could be going on?

We checked the logs in the Docker pod, and found this message at the end:
API listen on [::]:2376
Well, the error message in the question shows we tried to connect to port 2375, which used to work. Why did the port change?
Docker enables TLS by default from version 19.03 onwards, and when TLS is enabled it listens on port 2376.
We had three alternatives here:
- change the port to 2375 (which sounds like a bad idea: we would be using the default plaintext port for TLS traffic, a very confusing setup);
- connect to the new port; or
- disable TLS.
In general, connecting to the new port is probably the best solution. However, for reasons specific to us, we chose to disable TLS, which only requires setting an environment variable in yet another YAML file:
- name: builder
  image: docker:20.10.14-dind
  args: ["--storage-driver", "overlay2", "--mtu", "1460"]
+ env:
+ - name: DOCKER_TLS_CERTDIR
+   value: ""
  imagePullPolicy: Always
  resources:
    requests:
In most scenarios, though, it is probably better to have TLS enabled and change the port in the client.
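For reference, a rough sketch of the client-side settings for that route, assuming the builder is reachable as docker-builder (a placeholder name) and the client certificates that the dind container generates (under /certs/client by default) are shared with the client, e.g. via a volume:

export DOCKER_HOST=tcp://docker-builder:2376   # placeholder hostname for the dind service
export DOCKER_TLS_VERIFY=1                     # require TLS verification
export DOCKER_CERT_PATH=/certs/client          # dind writes client certs here by default
docker version                                 # quick connectivity check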
(Sharing this in the spirit of Can I answer my own questions? because it took us some time to piece the parts together. Maybe sharing this information together with the error message will make it easier for other affected people to find.)

Related

How to call Redis inside Kubernetes? Problems removing Old Redis service

Previously I had been experimenting with this command on Docker for Desktop Kubernetes
helm install my-release --set password=password bitnami/redis
I had since issued the command helm uninstall my-release.
Now I am trying to make my todolistclient work inside (Docker for Desktop) Kubernetes with redis:
kubectl run redis --image=bitnami/redis:latest --replicas=1 --port=6379 --labels="ver=1,app=todo,env=proto" --env="REDIS_PASSWORD=password" --env="REDIS_REPLICATION_MODE=master" --env="REDIS_MASTER_PASSWORD=password"
kubectl run todolistclient --image=siegfried01/todolistclient:latest --replicas=3 --port=5000 --labels="ver=1,app=todo,env=proto"
When I look at the log for ToDoListClient, I see a stack trace indicating that it is failing to connect to the redis server with this error message:
System.AggregateException: One or more errors occurred. (No connection is available to service this operation: EVAL; SocketFailure on my-release-redis-master.default.svc.cluster.local:6379/Subscription, origin: Error, input-buffer: 0, outstanding: 0, last-read: 0s ago, last-write: 0s ago, unanswered-write: 9760s ago, keep-alive: 60s, pending: 0, state: Connecting, last-heartbeat: never, last-mbeat: -1s ago, global: 0s ago)
What is this my-release-redis-master.default.svc.cluster.local? This has been uninstalled and I'm not running that any more.
My C# code is connecting to Redis with
.AddDistributedRedisCache(options => { options.InstanceName = "OIDCTokens"; options.Configuration = "redis,password=password"; })
Just to be certain that I was indeed using the above code and specifically "redis", I recompiled my code and pushed to DockerHub again and I am getting the same error again.
So apparently there is something left over from the helm version of redis that is translating "redis" into "my-release-redis-master". How do I remove this so I can connect to my current redis?
Thanks
Siegfried
In the todolistclient application you are using my-release-redis-master.default.svc.cluster.local:6379/Subscription. This is the URL of a Service exposing the redis pod, and it was created automatically by the Helm release.
If that is not desired, you need to change that URL in the todolistclient application to point to your own redis Service.
You have deployed redis but have not created any Service to expose it, hence you cannot use a Service URL to connect to it unless you create one.
So you have two options:
- Use the redis pod IP in the todolistclient application. This is not recommended, because the pod IP changes when the pod is restarted.
- Create a Service and then use that Service's URL in the todolistclient application, for example:
apiVersion: v1
kind: Service
metadata:
  name: redis-master
  labels:
    run: redis
spec:
  ports:
  - port: 6379
    targetPort: 6379
  selector:
    run: redis
Here is a guide on how to deploy a guestbook application on kubernetes and connect to redis.
One suggestion: don't use the same labels for both todolistclient and redis.
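For example, if the redis pod carried a label of its own (app: redis is just a hypothetical value here) and the Service were simply named redis, the "redis,password=password" connection string from the question would resolve without any code changes; a minimal sketch:

apiVersion: v1
kind: Service
metadata:
  name: redis            # the client can then connect to plain "redis"
spec:
  ports:
  - port: 6379
    targetPort: 6379
  selector:
    app: redis           # hypothetical label applied only to the redis pod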
The problem was that I had originally changed my source code to accommodate the name generated by helm: my-release-redis-master and later restored the code to just use the domain name redis.
The confusion came from the fact that even though I intended to compile and deploy (to Kubernetes) a debug version (which is the setting I had in Visual Studio), Visual Studio kept recompiling the debug version but deploying that ancient release version with the bad domain name.
The GUI for the Visual Studio 2019 publish dialog apparently is broken and won't let you deploy in debug mode. (I wish I could find the file where that publish dialog stored its settings so I could correct it with notepad). It would have been nice if I had received a warning indicating that it was not deploying my latest build.
Arghya Sadhu's response was helpful because it gave me the confidence to say that this was not some weird Kubernetes feature causing my domain name to be translated to the bogus my-release-redis-master.
Thank you Arghya.
So the solution was simple: recompile in release mode and deploy.
Siegfried

Docker: Artifactory CE-C++ behind a reverse proxy (using traefik)

I'm trying to run Artifactory (Artifactory CE-C++, V7.6.1) behind a reverse proxy (Traefik v2.2, latest).
Both are official unaltered docker-images. For starting them up I'm using docker-compose.
My docker-compose.yml uses the following Traefik configuration for the Artifactory service:
image: docker.bintray.io/jfrog/artifactory-cpp-ce
[...]
labels:
  - "traefik.enable=true"
  - "traefik.http.routers.artifactory.rule=Host(`localhost`) && PathPrefix(`/artifactory`)"
  - "traefik.http.routers.artifactory.middlewares=artifactory-stripprefix"
  - "traefik.http.middlewares.artifactory-stripprefix.stripprefix.prefixes=/"
  - "traefik.docker.network=docker-network"
Note: docker-network is just a simple external Docker network. I still have it in there from my Traefik v1 setup.
Artifactory is initially reachable at http://localhost/artifactory/, but only while it is starting up. As soon as Artifactory redirects me to its UI, it sends me to http://localhost/ui/ instead of (I guess?) http://localhost/artifactory/ui/, which is invalid.
I'm looking either for a way to tell Artifactory to take the /artifactory prefix into account when redirecting, or for a way in Traefik to rewrite Artifactory's redirect response so that the redirect URL matches the path.
I'm also using Jenkins with Traefik; there it was as simple as adding
JENKINS_OPTS: "--prefix=/jenkins"
Artifactory CE-C++ opens two ports: 8081 and 8082. I suspect your reverse proxy points to the former, port 8081, doesn't it? The UI, as far as I know, is served on port 8082. Did you try that port?
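If that turns out to be the case, a single extra label should be enough to point Traefik v2 at the UI port explicitly (the service name artifactory below is an assumption and only needs to be consistent with your router):

labels:
  - "traefik.http.services.artifactory.loadbalancer.server.port=8082"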

Consistent DNS between Kubernetes and docker-compose

(As far as I'm concerned, this is more of a development question than a server question, but it lies very much on the boundary of the two, so feel free to migrate to serverfault.com if that's the consensus.)
I have a service, let's call it web, and it is declared in a docker-compose.yml file as follows:
web:
  image: webimage
  command: run start
  build:
    context: ./web
    dockerfile: Dockerfile
In front of this, I have a reverse-proxy server running Apache Traffic Server. There is a simple mapping rule in the url remapping config file
map / http://web/
So all incoming requests are mapped onto the web service described above. This works just peachily in docker-compose; however, when I move the service to kubernetes with the following service description:
apiVersion: v1
kind: Service
metadata:
  labels:
    io.kompose.service: web
  name: web
spec:
  clusterIP: None
  ports:
  - name: headless
    port: 55555
    targetPort: 0
  selector:
    io.kompose.service: web
status:
  loadBalancer: {}
...traffic server complains because it cannot resolve the DNS name web.
I can resolve this by slightly changing the DNS behaviour of traffic server with the following config change:
CONFIG proxy.config.dns.search_default_domains INT 1
(see https://docs.trafficserver.apache.org/en/7.1.x/admin-guide/files/records.config.en.html#dns)
This config change is described as follows:
Traffic Server can attempt to resolve unqualified hostnames by expanding to the local domain. For example if a client makes a request to an unqualified host (e.g. host_x) and the Traffic Server local domain is y.com, then Traffic Server will expand the hostname to host_x.y.com.
Now everything works just great in kubernetes.
However, when running in docker-compose, traffic-server complains about not being able to resolve web.
So, I can get things working on both platforms, but this requires config changes to do so. I could fire a start-up script for traffic-server to establish if we're running in kube or docker and write the config line above depending on where we are running, but ideally, I'd like the DNS to be consistent across platforms. My understanding of DNS (and in particular, DNS default domains/ local domains) is patchy.
Any pointers? Ideally, a local domain for docker-compose seems like the way to go here.
The default kubernetes local domain is
default.svc.cluster.local
which means that the fully qualified name of the web service under kubernetes is web.default.svc.cluster.local
So, in the docker-compose file, under the trafficserver config section, I can create an alias for web as web.default.svc.cluster.local with the following docker-compose.yml syntax:
version: "3"
services:
web:
# ...
trafficserver:
# ...
links:
- "web:web.default.svc.cluster.local"
and update the mapping config in trafficserver to:
map / http://web.default.svc.cluster.local/
and now the web service is reachable using the same domain name across docker-compose and kubernetes.
I found the same problem but solved it in another way, after much painful debugging.
With CONFIG proxy.config.dns.search_default_domains INT 1, Apache Traffic Server will append the names found under search in /etc/resolv.conf one by one until it gets a hit.
In my case resolv.conf points to company.intra, so I could name my services (all services used from Apache Traffic Server) accordingly:
version: '3.2'
services:
  # This hack is ugly, but we need to name this service
  # (and all other services called from ATS) with the same
  # name as found under search in /etc/resolv.conf.
  web.company.intra:
    image: web-image:1.0.0
With this change I don't need to make any changes to remap.config at all. The URL used can still be just "web", since it gets expanded to a name that matches both environments:
web.company.intra in docker-compose
web.default.svc.cluster.local in kubernetes
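If in doubt, you can check which search domains are actually in play inside the container running Apache Traffic Server; the output below is only an example (127.0.0.11 is Docker's embedded DNS resolver):

$ docker exec trafficserver cat /etc/resolv.conf
search company.intra
nameserver 127.0.0.11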

Use the Kubernetes tool Kompose to start multiple containers in a single pod

I have been searching Google for a solution to the problem below for longer than I care to admit.
I have a docker-compose.yml file, which allows me to fire up an ecosystem of 2 containers on my local machine. Which is awesome. But I need to be able to deploy to Google Container Engine (GCP). To do so, I am using Kubernetes; deploying to a single node only.
In order to keep the deployment process simple, I am using kompose, which allows me to deploy my containers on Google Container Engine using my original docker-compose.yml. Which is also very cool. The issue is that, by default, Kompose will deploy each docker service (I have 2) in a separate pod: one container per pod. But I really want all containers/services to be in the same pod.
I know there are ways to deploy multiple containers in a single pod, but I am unsure if I can use Kompose to accomplish this task.
Here is my docker-compose.yml:
version: "2"
services:
server:
image: ${IMAGE_NAME}
ports:
- "3000"
command: node server.js
labels:
kompose.service.type: loadbalancer
ui:
image: ${IMAGE_NAME}
ports:
- "3001"
command: npm run ui
labels:
kompose.service.type: loadbalancer
depends_on:
- server
Thanks in advance.
The thing is that docker-compose doesn't launch them like this either. They are completely separate. That means, for example, that you can have two containers listening on port 80, because they are independent. If you try to pack them into the same pod you will get a port conflict and end up with a mess. The scenario you want to achieve would have to be handled at the Dockerfile level to make any sense (although fat [supervisor-based] containers can be considered an antipattern in many cases), which in turn makes your compose file obsolete...
IMO you should embrace how things are, because it does not make sense to map a docker-compose-defined stack to a single pod.
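For completeness: if you still wanted everything in one pod, the manifest would have to be written and maintained by hand rather than generated by Kompose; a rough sketch based on the compose file above, with my-image standing in for ${IMAGE_NAME}:

apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: server
    image: my-image                 # stand-in for ${IMAGE_NAME}
    command: ["node", "server.js"]
    ports:
    - containerPort: 3000
  - name: ui
    image: my-image
    command: ["npm", "run", "ui"]
    ports:
    - containerPort: 3001

which is exactly the kind of hand-maintained duplication the compose-based workflow was meant to avoid.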

Setting a policy for RabbitMQ as a part of Dockerfile process

I'm trying to make a Dockerfile based on the RabbitMQ repository with a customized policy set. The problem is that I can't use CMD or ENTRYPOINT, since that would override the base Dockerfile's, and then I would have to come up with my own; I don't want to go down that path. Let alone the fact that if I don't use RUN, it will be part of the runtime commands, and I want this baked into the image, not just the container.
The other thing I could do is use a RUN command, but the problem with that is that the RabbitMQ server is not running at build time, and there's also no --offline flag for the set_policy command of the rabbitmqctl program.
When I use docker's RUN command to set the policy, here's the error I face:
Error: unable to connect to node rabbit#e06f5a03fe1f: nodedown
DIAGNOSTICS
===========
attempted to contact: [rabbit#e06f5a03fe1f]
rabbit#e06f5a03fe1f:
* connected to epmd (port 4369) on e06f5a03fe1f
* epmd reports: node 'rabbit' not running at all
no other nodes on e06f5a03fe1f
* suggestion: start the node
current node details:
- node name: 'rabbitmq-cli-136#e06f5a03fe1f'
- home dir: /var/lib/rabbitmq
- cookie hash: /Rw7u05NmU/ZMNV+F856Fg==
So is there any way I can set a policy for the RabbitMQ without writing my own version of CMD and/or ENTRYPOINT?
You're in a slightly tricky situation with RabbitMQ, as its mnesia data path is based on the host name of the container.
root#bf97c82990aa:/# ls -1 /var/lib/rabbitmq/mnesia
rabbit#bf97c82990aa
rabbit#bf97c82990aa-plugins-expand
rabbit#bf97c82990aa.pid
For other image builds you could seed the data files, or write a script that RUN calls to launch the application or database and configure it. With RabbitMQ, the container host name will change between image build and runtime so the image's config won't be picked up.
I think you are stuck with doing the config on container creation or at startup time.
Options
Creating a wrapper CMD script to apply the policy after startup is a bit complex, as /usr/lib/rabbitmq/bin/rabbitmq-server runs rabbit in the foreground, which means you don't have access to an "after startup" point. Docker doesn't really do background processes, so rabbitmq-server -detached isn't much help.
If you were to use something like Ansible, Chef or Puppet to set up the containers, you could configure a fixed hostname for the container's startup, then start it up and configure the policy as the next step. This only needs to be done once, as long as the hostname is fixed and you are not using the --rm flag.
At runtime, systemd could complete the configuration of the service with ExecStartPost. I'm sure most service managers have the same feature. I guess you could end up dropping messages, or at least causing errors at every startup, if anything came in before configuration was finished?
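As a sketch of that start-then-configure approach (whether driven by a config-management tool or an ExecStartPost-style hook), assuming a fixed container name and hostname of rabbitmq and using the ha-all policy purely as an example:

# start the broker with a fixed hostname so the mnesia path stays stable
docker run -d --name rabbitmq --hostname rabbitmq rabbitmq:3.7
# wait until the broker answers before touching it
until docker exec rabbitmq rabbitmqctl status >/dev/null 2>&1; do sleep 2; done
docker exec rabbitmq rabbitmqctl set_policy ha-all ".*" '{"ha-mode":"all"}'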
You can configure the policy as described here.
Docker compose:
rabbitmq:
  image: rabbitmq:3.7.8-management
  container_name: rabbitmq
  volumes:
    - ~/rabbitmq/data:/var/lib/rabbitmq:rw
    - ./rabbitmq/rabbitmq.conf:/etc/rabbitmq/rabbitmq.conf
    - ./rabbitmq/definitions.json:/etc/rabbitmq/definitions.json
  ports:
    - "5672:5672"
    - "15672:15672"
