Keycloak Docker container fails to start after restarting the container - docker

I have a Keycloak installation running as a Docker container in a docker-compose environment. Every night, my backup stops the relevant containers, performs a DB and volume backup, and restarts the containers again. For most containers this works, but Keycloak seems to have a problem with it and does not come up again afterwards. Looking at the logs, the error message is:
The batch failed with the following error: :
keycloak | WFLYCTL0062: Composite operation failed and was rolled back. Steps that failed:
keycloak | Step: step-9
keycloak | Operation: /subsystem=datasources/jdbc-driver=postgresql:add(driver-name=postgresql, driver-module-name=org.postgresql.jdbc, driver-xa-datasource-class-name=org.postgresql.xa.PGXADataSource)
keycloak | Failure: WFLYCTL0212: Duplicate resource [
keycloak | ("subsystem" => "datasources"),
keycloak | ("jdbc-driver" => "postgresql")
keycloak | ]
The docker-compose.yml entry for Keycloak looks as follows (sensitive data removed):
keycloak:
  image: jboss/keycloak:8.0.1
  container_name: keycloak
  environment:
    - PROXY_ADDRESS_FORWARDING=true
    - DB_VENDOR=postgres
    - DB_ADDR=db
    - DB_DATABASE=keycloak
    - DB_USER=keycloak
    - DB_PASSWORD=<password>
    - VIRTUAL_HOST=<url>
    - VIRTUAL_PORT=8080
    - LETSENCRYPT_HOST=<url>
  volumes:
    - /opt/docker/keycloak-startup:/opt/jboss/startup-scripts
The volume I'm mapping is there to make some changes to WildFly to make sure it behaves well with the reverse proxy:
embed-server --std-out=echo
# Enable https listener for the new security realm
/subsystem=undertow/ \
server=default-server/ \
http-listener=default \
:write-attribute(name=proxy-address-forwarding, \
value=true)
# Create new socket binding with proxy https port
/socket-binding-group=standard-sockets/ \
socket-binding=proxy-https \
:add(port=443)
# Redirect the http listener to the proxy https socket binding
/subsystem=undertow/ \
server=default-server/ \
http-listener=default \
:write-attribute(name=redirect-socket, \
value="proxy-https")
After stopping the container, it's not starting anymore, with the messages shown above. Removing the container and re-creating it works fine, however. I tried removing the volume after the initial start; this doesn't really make a difference either. I already learned that I have to remove the KEYCLOAK_USER=admin and KEYCLOAK_PASSWORD environment variables after the initial boot, as otherwise the container complains that the user already exists and doesn't start anymore. Any idea how to fix this?

Update on 23rd of May 2021:
The issue has been resolved on Red Hat's Jira; it appears to be fixed in version 12. The related GitHub pull request can be found here: https://github.com/keycloak/keycloak-containers/pull/286
Before that, according to Red Hat support, this was a known "issue" and not supposed to be fixed. They wanted to concentrate on a workflow where a container is removed and recreated, not stopped and started. They agreed with the general problem, but stated that no resources were available at the time. Stopping and starting the container was an operation that was simply not supported.
See for example https://issues.redhat.com/browse/KEYCLOAK-13094?jql=project%20%3D%20KEYCLOAK%20AND%20text%20~%20%22docker%20restart%22 for reference
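For anyone who cannot upgrade yet, the remove-and-recreate workflow can be approximated with docker-compose in the backup script; a minimal sketch, assuming the service is named keycloak as in the compose file above:

# Stop and remove only the Keycloak container; named volumes and other services stay untouched
docker-compose stop keycloak
docker-compose rm -f keycloak
# ... run the DB/volume backup here ...
# Recreate the container from the compose definition
docker-compose up -d keycloak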

A legitimate use case for restarting is to add debug logging, for example to debug authentication with an external identity provider.
I ended up creating a shell script that does the following (rough sketch below):
docker stop [container]
docker rm [container]
recreate the image I want, with changes to the logging configuration
docker run [options] [container]
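A rough sketch of that script, with the image name, container name and options as placeholders rather than my real ones:

#!/bin/sh
# Stop and remove the existing container so the name can be reused
docker stop keycloak
docker rm keycloak
# Rebuild the image with the changed logging configuration (Dockerfile assumed to live here)
docker build -t keycloak-debug .
# Start a fresh container from the rebuilt image
docker run -d --name keycloak -p 8080:8080 keycloak-debug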
However, a nice feature of Docker is the ability to restart a stopped container automatically, decreasing downtime. This Keycloak bug takes that feature away.

I had the same problem, and my solution was:
1 - Export the Docker container to a .tar file:
docker export CONTAINER_NAME > latest.tar
2 - Create a new volume in Docker:
docker volume create VOLUME_NAME
3 - Start a new Docker container, mapping the created volume to the container's DB path, something like this:
docker run --name keycloak2 -v keycloak_db:/opt/jboss/keycloak/standalone/data/ -p 8080:8080 -e PROXY_ADDRESS_FORWARDING=true -e KEYCLOAK_USER=admin -e KEYCLOAK_PASSWORD=root jboss/keycloak
4 - Stop the container
5 - Unpack the tar file and find the database path, something like this:
tar unpack path: /opt/jboss/keycloak/standalone/data
6 - Move the path contents into the Docker volume; if you don't know where the physical path is, use docker volume inspect VOLUME_NAME to find it (a command-line sketch of steps 5 and 6 follows below)
7 - Start the stopped container
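A command-line sketch of steps 5 and 6, assuming the container, volume and paths shown above:

# 5 - unpack the exported container and locate the data directory
mkdir export && tar -xf latest.tar -C export
ls export/opt/jboss/keycloak/standalone/data
# 6 - copy the old data into the new volume's physical path
VOLUME_PATH=$(docker volume inspect -f '{{ .Mountpoint }}' keycloak_db)
sudo cp -a export/opt/jboss/keycloak/standalone/data/. "$VOLUME_PATH"/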
This worked for me; I hope it helps the next person who hits this problem.

Related

Fixing my connection to a Docker postgres after moving to Apple silicon

I have a local project early in development which uses Nestjs and TypeORM to connect to a Docker postgres instance (called 'my_database_server'). Things were working on my old computer, an older Macbook Pro.
I've just migrated everything onto a new Macbook Pro with the new M2 chip (Apple silicon). I've downloaded the version of Docker Desktop that's appropriate for Apple silicon. It runs fine, it still shows 'my_database_server', it can launch that fine, and I can even use the Terminal to go into its Postgres db and see the data that existed in my old computer.
But I can't figure out how to adjust my project's config to get it to connect to this database. I've read in other articles that because Docker is now running on Apple silicon and using emulation, the host should be different.
This is what my .env used to look like:
POSTGRES_HOST=127.0.0.1
POSTGRES_PORT=5432
POSTGRES_USER=postgres
On my new computer, the above doesn't connect. I have tried these other values for POSTGRES_HOST, many inspired by other SO posts, but these all yield Error: getaddrinfo ENOTFOUND _____ errors:
my_database_server (the container name)
docker (since I didn't use a docker-compose.yaml file - see below - I don't know what the 'service name' is in this case)
192.168.65.0/24 (the "Docker subnet" value in Docker Desktop > Preferences > Resources > Network)
Next, for some other values I tried, the code is trying to connect for a longer time, but it's getting stuck on something later in the process. With these, eventually I get Error: connect ETIMEDOUT ______:
192.168.65.0
172.17.0.2 (from another SO post, I tried the terminal command docker inspect --format='{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' 78f6e532b324 - the last part being the container ID of my_database_server)
In case it helps, I originally set up this docker container using the script I found here, not using a docker-compose.yaml file. Namely, I ran this script once at the beginning:
#!/bin/bash
set -e
SERVER="my_database_server";
PW="mysecretpassword";
DB="my_database";
echo "echo stop & remove old docker [$SERVER] and starting new fresh instance of [$SERVER]"
(docker kill $SERVER || :) && \
(docker rm $SERVER || :) && \
docker run --name $SERVER -e POSTGRES_PASSWORD=$PW \
-e PGPASSWORD=$PW \
-p 5432:5432 \
-d postgres
# wait for pg to start
echo "sleep wait for pg-server [$SERVER] to start";
sleep 3;
# create the db
echo "CREATE DATABASE $DB ENCODING 'UTF-8';" | docker exec -i $SERVER psql -U postgres
echo "\l" | docker exec -i $SERVER psql -U postgres
What should be my new db config settings?
I never figured the above problem out, but it was blocking me, so I found a different way around it.
Per other SO questions, I decided to go with the more typical route of using a docker-compose.yml file to create the Docker container. In case it helps others in this problem, this is what the main part of my docker-compose.yml looks like:
version: '3'
services:
  db:
    image: postgres
    restart: always
    environment:
      - POSTGRES_USER=${DATABASE_USER}
      - POSTGRES_PASSWORD=${DATABASE_PASSWORD}
      - POSTGRES_DB=${DB_NAME}
    container_name: postgres-db
    volumes:
      - ./pgdata:/var/lib/postgresql/data
    ports:
      - "54320:5432"
I then always run this with docker-compose up -d, not starting the container through the Docker Desktop app (though after that command, you should see the new container light up in the app).
Then in .env, I have this critical part:
POSTGRES_HOST=localhost
POSTGRES_PORT=54320
I mapped Docker's internal 5432 to the localhost-accessible 54320 (a suggestion I found here). Doing "5432:5432" as other articles suggest was not working for me, for reasons I don't entirely understand.
Other articles will suggest changing the host to whatever the service name is in your docker-compose.yml (for the example above, it would be db) - this also did not work for me, probably because the service name only resolves inside the Docker network, not from the host. I believe the "54320:5432" part maps the ports correctly so that the host can remain localhost.
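A quick way to sanity-check the mapping from the host, assuming the Postgres client tools are installed locally (pg_isready ships with them); the user and database names below stand in for the values from the compose environment:

# Should report "accepting connections" if the 54320 -> 5432 mapping works
pg_isready -h localhost -p 54320
# Or connect directly with the credentials from the compose environment
psql -h localhost -p 54320 -U "$DATABASE_USER" "$DB_NAME"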
Hope this helps others!

First attempt at docker compose, status is "restarting" - what have I done wrong?

This is my first attempt at docker-compose (and Docker, as of yesterday); however, the container is stuck in a restarting state.
The application is Grafana which I normally run with:
docker volume create grafana-storage
docker run -d -p 3000:3000 --name=grafana -v grafana-storage:/var/lib/grafana grafana/grafana
Today I thought I'd try using docker-compose; here is what I have done.
Create a folder for the Docker App(s):
sudo mkdir Docker_Applications
cd Docker_Applications
sudo mkdir Grafana
Go into the directory
cd Grafana
sudo nano docker-compose.yml
add
version: '3'
services:
  grafana:
    image: "grafana/grafana:7.3.7"
    volumes:
      # Data persistency
      # sudo mkdir -p /Docker_Applications/Grafana
      - "./database:/var/lib/grafana"
      - "./config:/etc/grafana"
    ports:
      - 3000:3000
    restart: always
Then ran it
root#grafana-dev:/Docker_Applications/Grafana$ sudo docker-compose up -d
Building with native build. Learn about native build in Compose here: https://docs.docker.com/go/compose-native-build/
Starting grafana_grafana_1 ... done
status
root#grafana-dev:/Docker_Applications/Grafana$ sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a347b12ae9a3 grafana/grafana:7.3.7 "/run.sh" 18 minutes ago Restarting (1) 4 seconds ago grafana_grafana_1
Hopefully you can see I've tried my best. I wonder if it's to do with the volumes.
Any help would be really appreciated.
I removed the volumes section of the yaml file and it runs now. It looks like a permissions issue, or it can't locate my folders; I'm not sure what permissions/commands to try.
grafana-dev:/Docker_Applications/Grafana$ sudo docker-compose up
Building with native build. Learn about native build in Compose here: https://docs.docker.com/go/compose-native-build/
Creating grafana_grafana_1 ... done
Attaching to grafana_grafana_1
grafana_1 | mkdir: can't create directory '/var/lib/grafana/plugins': Permission denied
grafana_1 | GF_PATHS_CONFIG='/etc/grafana/grafana.ini' is not readable.
grafana_1 | GF_PATHS_DATA='/var/lib/grafana' is not writable.
grafana_1 | You may have issues with file permissions, more information here: http://docs.grafana.org/installation/docker/#migration-from-a-previous-version-of-the-docker-container-to-5-1-or-later
Most likely the container exits during startup due to an error, and since you've set restart: always in your docker-compose file, the container automatically restarts.
Check the logs, or just run docker-compose up non-detached by removing the -d flag, to find out what the problem is; fix that and your container will stop restarting itself continuously.
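Given the "Permission denied" lines above, the likely fix is to hand the bind-mounted folders to the user the Grafana image runs as. Recent official images run as UID 472 (treat that as an assumption and check the Grafana Docker docs for your version); a sketch:

# Run from /Docker_Applications/Grafana, where ./database and ./config are mounted from
sudo mkdir -p database config
sudo chown -R 472:472 database config
sudo docker-compose up -d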

Elasticsearch service doesn't start on gitlab - docker container already in use

I have my own CI server with GitLab and I'm trying to run a Docker runner (version 10.6) with this configuration:
image: php:7.1
services:
  - mysql:latest
  - redis:latest
  - elasticsearch:latest
before_script:
  - bash ci/install.sh > /dev/null
  - php composer install -a
stages:
  - test
test:
  stage: test
  variables:
    API_ENVIRONMENT: 'test'
  script:
    - echo "Running tests"
    - php composer app:tests
But every time the runner pulls the Elasticsearch container, I get this error message:
*** WARNING: Service runner-1de473ae-project-225-concurrent-0-elasticsearch-2 probably didn't start properly.
Error response from daemon: Conflict. The container name "/runner-1de473ae-project-225-concurrent-0-elasticsearch-2-wait-for-service" is already in use by container "f26f56b2905e8c3da1977bc7c48e7eba00e943532146b7a8711f91fe67b67c3b". You have to remove (or rename) that container to be able to reuse that name.
*********
I also tried logging into the server and listing all containers, but there is only the Redis one:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5cec961e03b2 811c03fb36bc "gitlab-runner-ser..." 39 hours ago Up 39 hours runner-1de473ae-project-247-concurrent-1-redis-1-wait-for-service
After googling this problem I found this issue: https://gitlab.com/gitlab-org/gitlab-runner/issues/2667, so I updated the runner to 10.6, but the problem persists.
In the end there is no Elasticsearch running on my server, so my tests fail with:
FAILED: Battle/BattleDataElasticProviderTest.php method=testGetLocalBattles
Exited with error code 255 (expected 0)
Elasticsearch\Common\Exceptions\NoNodesAvailableException: No alive nodes found in your cluster
Is there any way to start ES, or at least to put ES into a more verbose mode?
Thanks!
When a container is stopped, it still exists; it is just in an exited state. The command docker ps -a shows you all containers, both running and exited.
To start a new container with an already existing name, you need to first manually remove the old container occupying this name by using docker rm.
A convenient option is the --rm argument when starting a container; the container will then be removed automatically once it stops.
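For example, using the container name from the error above (the final docker run line is just a generic illustration, with the image tag taken from the CI config):

# List all containers, including exited ones that still occupy their names
docker ps -a
# Remove the stale container so the name becomes available again
docker rm runner-1de473ae-project-225-concurrent-0-elasticsearch-2-wait-for-service
# When starting containers by hand, --rm deletes them automatically on exit
docker run --rm --name my-test-es elasticsearch:latest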

Why is my Docker volume not working in a remote build box?

I am attempting to add a volume to a Docker container that will be built and run in a Docker Compose system on a hosted build service (CircleCI). It works fine locally, but not remotely. CircleCI provides an SSH facility I can use to debug why a container is not behaving as expected.
The relevant portion of the Docker Compose file is thus:
missive-mongo:
  image: missive-mongo
  command: mongod -v --logpath /var/log/mongodb/mongodb.log --logappend
  volumes:
    - ${MONGO_LOCAL}:/data/db
    - ${LOGS_LOCAL_PATH}/mongo:/var/log/mongodb
  networks:
    - storage_network
Locally, if I do docker inspect integration_missive-mongo_1 (i.e. the running container name), I get the volumes as expected:
...
"HostConfig": {
    "Binds": [
        "/tmp/missive-volumes/logs/mongo:/var/log/mongodb:rw",
        "/tmp/missive-volumes/mongo:/data/db:rw"
    ],
...
On the same container, I can shell in and see that the volume works fine:
docker exec -it integration_missive-mongo_1 sh
/ # tail /var/log/mongodb/mongodb.log
2017-11-28T22:50:14.452+0000 D STORAGE [initandlisten] admin.system.version: clearing plan cache - collection info cache reset
2017-11-28T22:50:14.452+0000 I INDEX [initandlisten] build index on: admin.system.version properties: { v: 2, key: { version: 1 }, name: "incompatible_with_version_32", ns: "admin.system.version" }
2017-11-28T22:50:14.452+0000 I INDEX [initandlisten] building index using bulk method; build may temporarily use up to 500 megabytes of RAM
2017-11-28T22:50:14.452+0000 D INDEX [initandlisten] bulk commit starting for index: incompatible_with_version_32
2017-11-28T22:50:14.452+0000 D INDEX [initandlisten] done building bottom layer, going to commit
2017-11-28T22:50:14.454+0000 I INDEX [initandlisten] build index done. scanned 0 total records. 0 secs
2017-11-28T22:50:14.455+0000 I COMMAND [initandlisten] setting featureCompatibilityVersion to 3.4
2017-11-28T22:50:14.455+0000 I NETWORK [thread1] waiting for connections on port 27017
2017-11-28T22:50:14.455+0000 D COMMAND [PeriodicTaskRunner] BackgroundJob starting: PeriodicTaskRunner
2017-11-28T22:50:14.455+0000 D COMMAND [ClientCursorMonitor] BackgroundJob starting: ClientCursorMonitor
OK, now for the remote. I kick off a build; it fails because Mongo won't start, so I use the SSH facility that keeps a box alive after a failed build.
I first hack the DC file so that it does not try to launch Mongo, as it will fail. I just get it to sleep instead:
missive-mongo:
  image: missive-mongo
  command: sleep 1000
  volumes:
    - ${MONGO_LOCAL}:/data/db
    - ${LOGS_LOCAL_PATH}/mongo:/var/log/mongodb
  networks:
    - storage_network
I then run the docker-compose up script to bring all containers up, and then examine the problematic box: docker inspect integration_missive-mongo_1:
"HostConfig": {
"Binds": [
"/tmp/missive-volumes/logs/mongo:/var/log/mongodb:rw",
"/tmp/missive-volumes/mongo:/data/db:rw"
],
That looks fine. So on the host I create a dummy log file, and list it to prove it is there:
bash-4.3# ls /tmp/missive-volumes/logs/mongo
mongodb.log
So I try shelling in, docker exec -it integration_missive-mongo_1 sh again. This time I find that the folder exists, but not the volume contents:
/ # ls /var/log
mongodb
/ # ls /var/log/mongodb/
/ #
This is very odd, because the reliability of volumes in the remote Docker/Compose config has been exemplary up until now.
Theories
My main one at present is that the differing versions of Docker and Docker Compose could have something to do with it. So I will list out what I have:
Local
Host: Linux Mint
Docker version 1.13.1, build 092cba3
docker-compose version 1.8.0, build unknown
Remote
Host: I suspect it is Alpine (it uses apk for installing)
I am using the docker:17.05.0-ce-git image supplied by CircleCI, the version shows as Docker version 17.05.0-ce, build 89658be
Docker Compose is installed via pip, and getting the version produces docker-compose version 1.13.0, build 1719ceb.
So, there is some version discrepancy. As a shot in the dark, I could try bumping up Docker/Compose, though I am wary of breaking other things.
What would be ideal though, is some sort of advanced Docker commands I can use to debug why the volume appears to be registered but is not exposed inside the container. Any ideas?
On CircleCI, docker-compose runs remotely from the Docker daemon, so local bind mounts don't work: the paths you bind exist on the build host, not on the machine where the containers actually run.
A named volume defaults to the local driver and does work in CircleCI's Compose setup; the volume will exist wherever the container runs.
Logging should generally be left to stdout and stderr in a single-process-per-container setup. You can then use a logging driver plugin to ship the logs to a central collector. MongoDB logs to stdout/stderr by default when run in the foreground.
Combining the volumes and logging:
version: "2.1"
services:
  syslog:
    image: deployable/rsyslog
    ports:
      - '1514:1514/udp'
      - '1514:1514/tcp'
  mongo:
    image: mongo
    command: mongod -v
    volumes:
      - 'mongo_data:/data/db'
    depends_on:
      - syslog
    logging:
      options:
        tag: '{{.FullID}} {{.Name}}'
        syslog-address: "tcp://10.8.8.8:1514"
      driver: syslog
volumes:
  mongo_data:
This is a little bit of a hack, as the logging endpoint would normally be external rather than a container in the same group. That is why the logging configuration uses the external address and port mapping to reach the syslog server: the connection is made from the Docker daemon to the log server, rather than container to container.
I wanted to add an additional answer to accompany the accepted one. My use-case on CircleCI is to run browser-based integration tests, in order to check that a whole stack is working correctly. A number of the 11 containers in use have volumes defined for various things, such as log output and raw database file storage.
What I had not realised until now was that the volumes in CircleCI's Docker executor do not work, due to a technical Docker limitation. Because of that, in each previous case the files were simply written to an empty folder inside the container.
In my new case however, this issue was causing Mongo to fail. The reason for that was that I'm using --logappend to prevent Mongo from doing its own log rotation on start-up, and this switch requires the path specified in --logpath to exist. Since it existed on the host, but the volume creation failed, the container could not see the log file.
To fix this, I have modified my Mongo service entry to call a script in the command section:
missive-mongo:
  image: missive-mongo
  command: sh /root/mongo-logging.sh
And the script looks like this:
#!/bin/sh
#
# The command sets up logging in Mongo. The touch is for the benefit of any
# environment in which the logs do not already exist (e.g. Integration, since
# CircleCI does not support volumes)
touch /var/log/mongodb/mongodb.log \
&& mongod -v --logpath /var/log/mongodb/mongodb.log --logappend
In the two possible use cases, this will act as follows:
In the case of the mount working (dev, live) it will simply touch a file if it exists, and create it if it does not (e.g. a completely new environment),
In the case of the mount not working (CircleCI) it will create the file.
Either way, this is a nice safety feature to prevent Mongo from blowing up.

Difference between docker-compose and manual commands

What I'm trying to do
I want to run a yesod web application in one docker container, linked to a postgres database in another docker container.
What I've tried
I have the following file hierarchy:
/
  api/
    Dockerfile
  database/
    Dockerfile
  docker-compose.yml
The docker-compose.yml looks like this:
database:
  build: database
api:
  build: api
  command: .cabal/bin/yesod devel # dev setting
  environment:
    - HOST=0.0.0.0
    - PGHOST=database
    - PGPORT=5432
    - PGUSER=postgres
    - PGPASS
    - PGDATABASE=postgres
  links:
    - database
  volumes:
    - api:/home/haskell/
  ports:
    - "3000:3000"
Running sudo docker-compose up fails either to start the api container at all or, just as often, with the following error:
api_1 | Yesod devel server. Press ENTER to quit
api_1 | yesod: <stdin>: hGetLine: end of file
personal_api_1 exited with code 1
If, however, I run sudo docker-compose up database & and then start the api container without using Compose, but instead using
sudo docker run -p 3000:3000 -itv /home/me/projects/personal/api/:/home/haskell --link personal_database_1:database personal_api /bin/bash
I can export the environment variables being set up in the docker-compose.yml file, then manually run yesod devel and visit my site successfully on localhost.
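Concretely, the manual part inside that shell looks roughly like this, mirroring the environment section of the compose file (PGPASS left empty, as above):

export HOST=0.0.0.0
export PGHOST=database
export PGPORT=5432
export PGUSER=postgres
export PGPASS=
export PGDATABASE=postgres
.cabal/bin/yesod devel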
Finally, I obtain a third different behaviour if I run sudo docker-compose run api on its own. This seems to start successfully but I can't access the page in my browser. By running sudo docker-compose run api /bin/bash I've been able to explore this container and I can confirm the environment variables being set in docker-compose.yml are all set correctly.
Desired behaviour
I would like to get the result I achieve from running the database in the background then manually setting the environment in the api container's shell simply by running sudo docker-compose up.
Question
Clearly the three different approaches I'm trying do slightly different things. But from my understanding of docker and docker-compose I would expect them to be essentially equivalent. Please could someone explain how and why they differ and, if possible, how I might achieve my desired result?
The error message suggests the API container is expecting input from the command line, which requires a TTY to be present in your container.
In your "manual" start, you tell docker to create a TTY in the container via the -t flag (-itv is shorthand for -i -t -v), so the API container runs successfully.
To achieve the same in docker-compose, you'll have to add a tty key to the API service in your docker-compose.yml and set it to true:
database:
  build: database
api:
  build: api
  tty: true # <--- enable TTY for this service
  command: .cabal/bin/yesod devel # dev setting
