ERROR Disk error while locking directory /var/kafka-logs in Kafka 3.1.0

I am using Kafka 3.1.0, Portainer 2.9.0, and Docker 20.10.11 to build a cluster with one broker, one consumer, and one producer.
I am trying to map the log directory from the container to the host machine via docker-compose in order to persist its contents (if the container goes down, that data would otherwise be lost). I know it is recommended to run more than one broker, but since I am only testing this feature, I don't want to overcomplicate things.
The error I get is:
ERROR Disk error while locking directory /var/kafka-logs (kafka.server.LogDirFailureChannel)
java.nio.file.AccessDeniedException: /var/kafka-logs/.lock
[2022-03-31 12:00:53,986] ERROR [KafkaServer id=1] Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
I have checked that the user running the broker has full permissions on that directory, since I create it in my Dockerfile:
RUN mkdir /var/kafka-logs \
    && chown -R kafka:kafka /var/kafka-logs \
    && chmod -R 777 /var/kafka-logs
I have read that this problem existed in version 3.0 and was fixed in 3.1, and that it only occurred on Windows, so I don't know where it is coming from here.
Edit: I have checked, and the error is printed even without the volume mapping. It must be related to changing the log.dirs property to a directory outside /tmp, because with the default configuration everything works fine.
By default I mean the following:
log.dirs=/tmp/kafka-logs
My docker-compose:
version: "3.8"
networks:
  net:
    external: true
services:
  kafka-broker1:
    image: registry.gitlab.com/repo/kafka:2.13_3.1.0_v0.1
    volumes:
      - /var/volumes/kafka/config/server1.properties:/opt/kafka/config/server.properties
    networks:
      - net
  kafka-producer:
    image: registry.gitlab.com/repo/kafka:2.13_3.1.0_v0.1
    stdin_open: true
    tty: true
    networks:
      - net
  kafka-consumer:
    image: registry.gitlab.com/repo/kafka:2.13_3.1.0_v0.1
    stdin_open: true
    tty: true
    networks:
      - net

The problem was that I had built several Docker images and kept reusing a container with the same name, so the container was not picking up the newest image.
Once I deleted the other images and the container picked up the latest one, everything worked fine. So it was essentially a matter of the broker not having enough permissions to acquire the lock on that directory.
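To confirm from inside the running container whether the current user can actually create and lock a file in the log directory, a small probe along these lines may help. This is only a sketch: can_take_lock is a hypothetical helper, it uses flock rather than whatever locking call Kafka makes internally, it is POSIX-only, and it leaves the .lock file behind. Run it as the same user the broker runs as.

```python
import fcntl
import os


def can_take_lock(log_dir: str, lock_name: str = ".lock") -> bool:
    """Return True if the current user can create and lock log_dir/lock_name.

    Hypothetical helper for illustration; not part of Kafka.
    """
    lock_path = os.path.join(log_dir, lock_name)
    try:
        # Creating the file requires write permission on the directory.
        fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    except OSError:
        # Covers missing directories and permission errors alike.
        return False
    try:
        # Non-blocking exclusive lock; raises OSError if it cannot be taken.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        fcntl.flock(fd, fcntl.LOCK_UN)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)
```

For example, can_take_lock("/var/kafka-logs") returning False for the broker user would point at directory permissions in the image that is actually running, rather than at Kafka itself.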

Related

RabbitMQ error: "ssl_options.keyfile invalid, file does not exist or cannot be read by the node"

I'm trying to add SSL to a RabbitMQ deployment via Docker Compose:
# rabbitmq.conf
ssl_options.certfile = /container/path/to/certfile.crt
ssl_options.keyfile = /container/path/to/keyfile.key
# docker-compose.yml
rabbitmq:
  image: rabbitmq:3.10.7-management
  ...
  volumes:
    - /host/path/to/certfile.crt:/container/path/to/certfile.crt
    - /host/path/to/keyfile.crt:/container/path/to/keyfile.key
    - ...
  ...
However, when spinning up the container, I get the error:
ssl_options.keyfile invalid, file does not exist or cannot be read by the node
I have double-checked that the volume mount is working and that the keyfile is actually there.

It turned out to be a permissions issue. I solved it by running the following on the host machine:
chmod 664 /host/path/to/certfile.crt
chmod 664 /host/path/to/keyfile.crt
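Bind-mounted files keep the ownership and mode they have on the host, so a file that only its host owner can read may be unreadable to the user the node runs as inside the container. A quick way to check this on the host is to look at the "other" read bit. The helper name readable_by_others is hypothetical, and this sketch assumes the container user is neither the file's owner nor in its group.

```python
import os
import stat


def readable_by_others(path: str) -> bool:
    """Return True if the file's mode grants read access to 'other' users."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IROTH)
```

Called on the host paths from the question, it would report False for a file with mode 600 and True after the chmod 664 shown above.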

Apache Nifi (on docker): only one of the HTTP and HTTPS connectors can be configured at one time error

I have a problem adding authentication, which a new requirement calls for, to an Apache NiFi instance that has been running in a container without SSL.
The image version is apache/nifi:1.13.0.
SSL is reportedly mandatory for enabling authentication, and the recommended way to add it is the tls-toolkit shipped in the NiFi image. I followed this process:
I removed the nifi.web.http.port environment variable used for HTTP and started the container in standalone mode with nifi.web.https.port=9443:
docker-compose up
I attached to the container and ran the tls-toolkit script from nifi-toolkit:
cd /opt/nifi/nifi-toolkit-1.13.0/bin && \
sh tls-toolkit.sh standalone \
    -n 'localhost' \
    -C 'CN=yangeok,OU=nifi' \
    -O -o $NIFI_HOME/conf
Attempt 1
I organized the files in $NIFI_HOME/conf. The script created three files, keystore.jks, truststore.jks, and nifi.properties, in a folder named localhost, which is the value passed to the -n option of the tls-toolkit script.
cd $NIFI_HOME/conf &&
cp localhost/*.jks .
I did not copy $NIFI_HOME/conf/localhost/nifi.properties over the existing file as-is; instead, I carried only the following properties over into $NIFI_HOME/conf/nifi.properties:
nifi.web.http.host=
nifi.web.http.port=
nifi.web.https.host=localhost
nifi.web.https.port=9443
I restarted the container:
docker-compose restart
The container died with the following error:
Only one of the HTTP and HTTPS connectors can be configured at one time
Attempt 2
After running the tls-toolkit script, I overwrote all the files, including nifi.properties:
cd $NIFI_HOME/conf &&
cp localhost/* .
I restarted the container:
docker-compose restart
The container died with the same error.
Hint
The dead container's volume was still accessible, so I copied out nifi.properties and inspected it. Whenever I ran docker-compose up or restart, the file changed as follows.
The part I had overwritten or modified:
nifi.web.http.host=
nifi.web.http.port=
nifi.web.http.network.interface.default=
#############################################
nifi.web.https.host=localhost
nifi.web.https.port=9443
The same part after the container was restarted:
nifi.web.http.host=a8e283ab9421
nifi.web.http.port=9443
nifi.web.http.network.interface.default=
#############################################
nifi.web.https.host=a8e283ab9421
nifi.web.https.port=9443
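The difference between the two snippets can be checked mechanically. The sketch below assumes that NiFi reports the connector error whenever both nifi.web.http.port and nifi.web.https.port are non-empty; parse_properties and connector_conflict are hypothetical helpers, not part of NiFi.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def connector_conflict(props: dict) -> bool:
    """Return True if both the HTTP and the HTTPS port are set."""
    http_port = props.get("nifi.web.http.port", "")
    https_port = props.get("nifi.web.https.port", "")
    return bool(http_port) and bool(https_port)
```

Applied to the two snippets above, the hand-edited file reports no conflict, while the file rewritten at container start does, because nifi.web.http.port has been filled in again with 9443.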
I'd like to know how to start the container with http.host and http.port left empty. My docker-compose.yml file is as follows:
version: '3'
services:
  nifi:
    build:
      context: .
      args:
        NIFI_VERSION: ${NIFI_VERSION}
    container_name: nifi
    user: root
    restart: unless-stopped
    network_mode: bridge
    ports:
      - ${NIFI_HTTP_PORT}:8080/tcp
      - ${NIFI_HTTPS_PORT}:9443/tcp
    volumes:
      - ./drivers:/opt/nifi/nifi-current/drivers
      - ./templates:/opt/nifi/nifi-current/templates
      - ./data:/opt/nifi/nifi-current/data
    environment:
      TZ: 'Asia/Seoul'
      ########## JVM ##########
      NIFI_JVM_HEAP_INIT: ${NIFI_HEAP_INIT} # The initial JVM heap size.
      NIFI_JVM_HEAP_MAX: ${NIFI_HEAP_MAX} # The maximum JVM heap size.
      ########## Web ##########
      # NIFI_WEB_HTTP_HOST: ${NIFI_HTTP_HOST} # nifi.web.http.host
      # NIFI_WEB_HTTP_PORT: ${NIFI_HTTP_PORT} # nifi.web.http.port
      NIFI_WEB_HTTPS_HOST: ${NIFI_HTTPS_HOST} # nifi.web.https.host
      NIFI_WEB_HTTP_PORT: ${NIFI_HTTPS_PORT} # nifi.web.https.port
Thank you

Elastic search TestContainers Timed out waiting for URL to be accessible in Docker

Local env:
MacOS 10.14.6
Docker Desktop 2.0.1.2
Docker Engine 19.03.2
Compose Engine 1.24.1
Test containers 1.12.1
I'm using Elasticsearch in an app, and I want to be able to use Testcontainers in my integration tests. Sample code from a Play Framework app that uses the Elasticsearch Testcontainer:
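// Declared as a class field; modifiers are not allowed on local variables.
private static final ElasticsearchContainer ES = new ElasticsearchContainer();

@BeforeAll
public static void setup() {
    // Start the container once, before any test in the class runs.
    ES.start();
}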
This works when testing locally, but I want to be able to run this inside a Docker container to run on my CI server. I'm getting this exception when running the tests inside the Docker container:
[warn] o.t.u.RegistryAuthLocator - Failure when attempting to lookup auth config (dockerImageName: alpine:3.5, configFile: /root/.docker/config.json. Falling back to docker-java default behaviour. Exception message: /root/.docker/config.json (No such file or directory)
[warn] o.t.u.RegistryAuthLocator - Failure when attempting to lookup auth config (dockerImageName: quay.io/testcontainers/ryuk:0.2.3, configFile: /root/.docker/config.json. Falling back to docker-java default behaviour. Exception message: /root/.docker/config.json (No such file or directory)
ℹ︎ Checking the system...
✔ Docker version should be at least 1.6.0
✔ Docker environment should have more than 2GB free disk space
[warn] o.t.u.RegistryAuthLocator - Failure when attempting to lookup auth config (dockerImageName: docker.elastic.co/elasticsearch/elasticsearch:7.1.1, configFile: /root/.docker/config.json. Falling back to docker-java default behaviour. Exception message: /root/.docker/config.json (No such file or directory)
[error] d.e.c.1.1] - Could not start container
org.testcontainers.containers.ContainerLaunchException: Timed out waiting for URL to be accessible (http://172.17.0.1:32911/ should return HTTP [200])
at org.testcontainers.containers.wait.strategy.HttpWaitStrategy.waitUntilReady(HttpWaitStrategy.java:197)
at org.testcontainers.containers.wait.strategy.AbstractWaitStrategy.waitUntilReady(AbstractWaitStrategy.java:35)
at org.testcontainers.containers.GenericContainer.waitUntilContainerStarted(GenericContainer.java:675)
at org.testcontainers.containers.GenericContainer.tryStart(GenericContainer.java:332)
at org.testcontainers.containers.GenericContainer.lambda$doStart$0(GenericContainer.java:285)
at org.rnorth.ducttape.unreliables.Unreliables.retryUntilSuccess(Unreliables.java:81)
at org.testcontainers.containers.GenericContainer.doStart(GenericContainer.java:283)
at org.testcontainers.containers.GenericContainer.start(GenericContainer.java:272)
at controllers.HomeControllerTest.setup(HomeControllerTest.java:56)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
I've read the instructions here: https://www.testcontainers.org/supported_docker_environment/continuous_integration/dind_patterns/
So my docker-compose.yml looks like this. (Note: I've been testing with another ES container, commented out below, but I am not using it in this test. $INSTANCE is a random 16 char string for a particular test run.)
version: '3'
services:
#  elasticsearch:
#    container_name: elasticsearch_${INSTANCE}
#    image: docker.elastic.co/elasticsearch/elasticsearch:6.7.2
#    ports:
#      - 9200:9200
#      - 9300:9300
#    command: elasticsearch -E transport.host=0.0.0.0
#    logging:
#      driver: 'none'
#    environment:
#      ES_JAVA_OPTS: "-Xms750m -Xmx750m"
  mainapp:
    container_name: mainapp_${INSTANCE}
    image: test_image:${INSTANCE}
    stop_signal: SIGKILL
    stdin_open: true
    tty: true
    working_dir: $PWD
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - $PWD:$PWD
    environment:
      ES_JAVA_OPTS: "-Xms1G -Xmx1G"
    command: /bin/bash /projectfolder/build/tests/wrapper.sh
I've also tried running my tests with this command but received the same error:
docker run -it --rm -v $PWD:$PWD -w $PWD -v /var/run/docker.sock:/var/run/docker.sock test_image:68F75D8FD4C7003772C7E52B87B774F5 /bin/bash /testproject/build/tests/wrapper.sh
I tried creating a postgres container the same way inside my testing container and had no issues. I've also tried making a GenericContainer with the Elasticsearch image with no luck.
I don't think this is a connection issue, because if I run curl 172.17.0.1:{port printed to test console} from inside my test container, I get a valid Elasticsearch response with status code 200. It almost seems as if it's timing out while trying to connect even though the connection is there.
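To tell a slow start apart from an unreachable address, the kind of readiness check described in the error (poll a URL until it returns HTTP 200 or a timeout expires) can be reproduced by hand from inside the test container. The following is a minimal sketch, not Testcontainers' own implementation; http_status and wait_until are hypothetical helpers.

```python
import http.client
import time
import urllib.error
import urllib.request


def http_status(url: str, timeout_s: float = 2.0):
    """Return the HTTP status code for url, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 401 or 503.
        return err.code
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, malformed response, etc.
        return None


def wait_until(check, timeout_s: float, interval_s: float = 0.5) -> bool:
    """Call check() repeatedly until it is truthy or timeout_s has elapsed."""
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

With the address from the log, the call would be wait_until(lambda: http_status("http://172.17.0.1:32911/") == 200, timeout_s=60). Logging the intermediate status codes shows whether the endpoint is refusing connections, answering with a non-200 status, or simply becoming ready later than the wait allows.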
Thanks.

Confluence on Docker runs setup assistant on existing installation after update

A few days ago, my Watchtower instance updated Confluence on Docker, which runs Atlassian's official image with the 6.15.1-alpine tag. Since that update, Confluence shows the setup screen, and I have no way to get into the admin panel. When I continued through the wizard and entered the server credentials of the existing installation, it reported that an installation already exists and would be overwritten if I continued.
It was a re-push of the exact same 6.15.1 version tag, not a regular version update, so there seems to be no way to go back to the old, working image. Other versions also seem to have been re-pushed: I tried some older ones and a newer one, without success.
docker-compose.yml
version: "2"
volumes:
  confluence-home:
services:
  confluence:
    container_name: confluence
    image: atlassian/confluence-server:6.15.1-alpine
    #restart: always
    mem_limit: 6g
    volumes:
      - confluence-home:/var/atlassian/application-data/confluence
      - ./confluence.cfg.xml:/var/atlassian/application-data/confluence/confluence.cfg.xml
      - ./server.xml:/opt/atlassian/confluence/conf/server.xml
      - ./mysql-connector-java-5.1.42-bin.jar:/opt/atlassian/confluence/lib/mysql-connector-java-5.1.42-bin.jar
    networks:
      - traefik
    environment:
      - "TZ=Europe/Berlin"
      - JVM_MINIMUM_MEMORY=4096m
      - JVM_MAXIMUM_MEMORY=4096m
    labels:
      - "traefik.port=8090"
      - "traefik.backend=confluence"
      - "traefik.frontend.rule=Host:confluence.my-domain.com"
networks:
  traefik:
    external: true

I found the following changes in the images:
Ownership
The logs showed errors about not being able to write to log files, because nearly the entire home directory was owned by a user called bin:
root@8ac38faa94f1:/var/atlassian/application-data/confluence# ls -l
total 108
drwx------ 2 bin bin 4096 Aug 19 00:03 analytics-logs
drwx------ 3 bin bin 4096 Jun 15 2017 attachments
drwx------ 2 bin bin 24576 Jan 12 2019 backups
[...]
This could be fixed by executing a chown:
docker exec -it confluence bash
chown confluence:confluence -R /var/atlassian/application-data/confluence
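Before and after the chown, it can be useful to list exactly which entries are owned by the wrong user. The sketch below walks a directory tree and reports every entry whose owner differs from an expected UID; foreign_owned is a hypothetical helper, and the UID of the confluence user has to be looked up in the container (for example with id -u confluence).

```python
import os


def foreign_owned(root: str, expected_uid: int) -> list:
    """Return sorted paths below root whose owner UID is not expected_uid."""
    mismatched = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat inspects symlinks themselves instead of following them.
            if os.lstat(path).st_uid != expected_uid:
                mismatched.append(path)
    return sorted(mismatched)
```

An empty result for /var/atlassian/application-data/confluence would indicate that the chown reached every file and directory below the home directory.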
Mounts inside a mount
My docker-compose.yml mounts a volume at /var/atlassian/application-data/confluence, and inside that volume the confluence.cfg.xml file was mounted from the current directory. This is an older approach, meant to separate the user data in the volume from configuration files such as docker-compose.yml and from the application's own configuration such as confluence.cfg.xml.
This no longer seems to work properly with Docker 17.05 and Docker Compose 1.8.0 (at least in combination with Confluence), so I simply removed the second mount and placed the configuration file inside the volume.
Atlassian now creates config files dynamically
It was noticeable that my mounted configuration files, such as confluence.cfg.xml and server.xml, were overwritten by Atlassian's container. Their source code shows that they now use Jinja2, a common Python template engine used in e.g. Ansible. A Python script processes the templates on startup and creates Confluence's configuration files, without properly checking for each of those files whether it already exists.
Mounting them read-only caused the app to crash, because the Python script does not handle this case either. By analyzing their templates, I learned that they replaced nearly every config item with environment variables. Not a bad approach, so I specified my server.xml parameters via environment variables instead of mounting the entire file.
In my case, Confluence is behind a Traefik reverse proxy, so Confluence must be told its final application URL for end users:
environment:
  - ATL_proxyName=confluence.my-domain.com
  - ATL_proxyPort=443
  - ATL_tomcat_scheme=https
Final working docker-compose.yml
By applying all modifications above, accessing the existing installation works again using the following docker-compose.yml file:
version: "2"
volumes:
  confluence-home:
services:
  confluence:
    container_name: confluence
    image: atlassian/confluence-server:6.15.1
    #restart: always
    mem_limit: 6g
    volumes:
      - confluence-home:/var/atlassian/application-data/confluence
      - ./mysql-connector-java-5.1.42-bin.jar:/opt/atlassian/confluence/lib/mysql-connector-java-5.1.42-bin.jar
    networks:
      - traefik
    environment:
      - "TZ=Europe/Berlin"
      - JVM_MINIMUM_MEMORY=4096m
      - JVM_MAXIMUM_MEMORY=4096m
      - ATL_proxyName=confluence.my-domain.com
      - ATL_proxyPort=443
      - ATL_tomcat_scheme=https
    labels:
      - "traefik.port=8090"
      - "traefik.backend=confluence"
      - "traefik.frontend.rule=Host:confluence.my-domain.com"
networks:
  traefik:
    external: true

Filebeat not running using docker-compose: setting 'filebeat.prospectors' has been removed

I'm trying to launch filebeat using docker-compose (I intend to add other services later on) but every time I execute the docker-compose.yml file, the filebeat service always ends up with the following error:
filebeat_1 | 2019-08-01T14:01:02.750Z ERROR instance/beat.go:877 Exiting: 1 error: setting 'filebeat.prospectors' has been removed
filebeat_1 | Exiting: 1 error: setting 'filebeat.prospectors' has been removed
I discovered the error by accessing the docker-compose logs.
My docker-compose file is as simple as it can be at the moment. It simply calls a filebeat Dockerfile and launches the service immediately after.
Next to my Dockerfile for filebeat I have a simple config file (filebeat.yml), which is copied to the container, replacing the default filebeat.yml.
If I execute the Dockerfile using the docker command, the filebeat instance works just fine: it uses my config file and identifies the "output.json" file as well.
I'm currently using version 7.2 of filebeat, and I know that "filebeat.prospectors" is no longer used. I also know for sure that this specific setting isn't coming from my filebeat.yml file (you'll find it below).
It seems that, when using docker-compose, the container reads a different configuration file from the one the Dockerfile copies into the container, but so far I haven't been able to figure out how or why this happens, or how to fix it.
Here's my docker-compose.yml file:
version: "3.7"
services:
  filebeat:
    build: "./filebeat"
    command: filebeat -e -strict.perms=false
The filebeat.yml file:
filebeat.inputs:
- paths:
    - '/usr/share/filebeat/*.json'
  fields_under_root: true
  fields:
    tags: ['json']
output:
  logstash:
    hosts: ['localhost:5044']
The Dockerfile file:
FROM docker.elastic.co/beats/filebeat:7.2.0
COPY filebeat.yml /usr/share/filebeat/filebeat.yml
COPY output.json /usr/share/filebeat/output.json
USER root
RUN chown root:filebeat /usr/share/filebeat/filebeat.yml
RUN mkdir /usr/share/filebeat/dockerlogs
USER filebeat
The output I'm expecting should be similar to the following, which comes from the successful executions I'm getting when I'm executing it as a single container.
The ERROR is expected because I don't have logstash configured at the moment.
INFO crawler/crawler.go:72 Loading Inputs: 1
INFO log/input.go:148 Configured paths: [/usr/share/filebeat/*.json]
INFO input/input.go:114 Starting input of type: log; ID: 2772412032856660548
INFO crawler/crawler.go:106 Loading and starting Inputs completed. Enabled inputs: 1
INFO log/harvester.go:253 Harvester started for file: /usr/share/filebeat/output.json
INFO pipeline/output.go:95 Connecting to backoff(async(tcp://localhost:5044))
ERROR pipeline/output.go:100 Failed to connect to backoff(async(tcp://localhost:5044)): dial tcp [::1]:5044: connect: cannot assign requested address
INFO pipeline/output.go:93 Attempting to reconnect to backoff(async(tcp://localhost:5044)) with 1 reconnect attempt(s)
ERROR pipeline/output.go:100 Failed to connect to backoff(async(tcp://localhost:5044)): dial tcp [::1]:5044: connect: cannot assign requested address
INFO pipeline/output.go:93 Attempting to reconnect to backoff(async(tcp://localhost:5044)) with 2 reconnect attempt(s)

I managed to figure out what the problem was.
I needed to map the location of the config file and logs directory in the docker-compose file, using the volumes tag:
version: "3.7"
services:
  filebeat:
    build: "./filebeat"
    command: filebeat -e -strict.perms=false
    volumes:
      - ./filebeat/filebeat.yml:/usr/share/filebeat/filebeat.yml
      - ./filebeat/logs:/usr/share/filebeat/dockerlogs
Finally, I just had to run the docker-compose command and everything started working properly:
docker-compose -f docker-compose.yml up -d
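When it is unclear which configuration file a container is actually reading, one option is to copy the file out of the container and scan it for the removed setting. The sketch below is a plain text check under the assumption that the key appears in its dotted, top-level form; it does not parse YAML and would miss a nested filebeat: / prospectors: layout. uses_removed_prospectors is a hypothetical helper.

```python
import re

# Matches an uncommented "filebeat.prospectors:" key at the start of a line.
REMOVED_KEY = re.compile(r"^\s*filebeat\.prospectors\s*:", re.MULTILINE)


def uses_removed_prospectors(config_text: str) -> bool:
    """Return True if the config text still sets 'filebeat.prospectors'."""
    return REMOVED_KEY.search(config_text) is not None
```

Running it over the filebeat.yml shown in the question returns False, which is consistent with the removed setting coming from some other file inside the container.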
