Monitor ksqlDB with JMX - docker

I have followed the instructions on this page: https://docs.ksqldb.io/en/latest/operate-and-deploy/monitoring/
So this is my ksqldb-server part of docker-compose:
ksqldb-server:
image: confluentinc/ksqldb-server:0.15.0
hostname: ksqldb-server
container_name: ksqldb-server
depends_on:
- kafka
- schema-registry
- kafka-connect
ports:
- "8088:8088"
- "1099:1099"
environment:
KSQL_LISTENERS: http://0.0.0.0:8088
KSQL_BOOTSTRAP_SERVERS: kafka:29092
KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: "true"
KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: "true"
KSQL_KSQL_SCHEMA_REGISTRY_URL: http://schema-registry:8081
KSQL_KSQL_CONNECT_URL: http://kafka-connect:8083
KSQL_KSQL_QUERY_PULL_METRICS_ENABLED: "true"
KSQL_JMX_OPTS: >
-Djava.rmi.server.hostname=localhost
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=1099
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.rmi.port=1099
I have setup Prometheus in the same docker-compose file, and when I visit {prometheus-url}/targets, I see Get "http://ksqldb-server:1099/metrics": EOF
I have already tried plenty configurations during my research, including changing the -Djava.rmi.server.hostname either to the host's IP address or to ksqldb-server's container IP address, but none of them worked. Does anyone have a solution?

Well, six months later after having dealt with this topic once again, I managed to set this up. This follows the approach suggested by swist in my GitHub issue I created back then when I created this issue, too.
You need JMX Exporter. Download it here
You need a YAML file, telling the JMX exporter which metrics to export. You can get it here. If you are only interested in the ksqlDB metrics, remove all other patterns, e.g. the kafka patterns.
Place the JMX Exporter and the YAML file on every node on which you want to monitor a ksqlDB instance
Before starting ksqlDB, create the environment variable KSQL_JMX_OPTS as follows:
export KSQL_JMX_OPTS="-Dcom.sun.management.jmxremote \
-Dcom.sun.management.jmxremote.authenticate=false \
-Dcom.sun.management.jmxremote.ssl=false \
-Djava.util.logging.config.file=logging.properties \
-javaagent:[BLUB]/jmx_prometheus_javaagent.jar=7010:ksqldb.yml"
You need to either create this variable every time you have a new session or create it permantently. [BLUB] is the absolute path to your JMX JAR.
Now you can run ksqlDB and the metrics become available at port 7010 (you can specify any other free port). If you want to have a good dashboard, go with this one.

The jmxremote.port value is also not a proper Prometheus target; it's for jconsole, Visualvm, or other JMX monitoring tools, as the documentation you've linked to says
If you want to use Prometheus, you need to download and mount the JMX exporter agent JAR into the container and modify the JVM arguments to include the agent+scraper port+mbeans config file...
You could also switch to using minikube and apply the Confluent ksqlDB Helm Chart, which does this for you

Related

Landoop. Kafka connect how to change worker properties

landoop/fast-data-dev:2.6
I want to change default batch.size using 'producer.override.batch.size=65536' when creating new connector.
But in order to do that, it's required to apply override policy on the worker side
connector.client.config.override.policy=All
Otherwise there is exception
"producer.override.batch.size" : The 'None' policy does not allow
'batch.size' to be overridden in the connector configuration.
It's not clear, how exactly:
to change the default worker properties
where it expects them to be placed
which name they should have
So that landoop sees them
I start the landoop using the following docker-compose.
version: '2'
services:
kafka-cluster:
image: landoop/fast-data-dev:2.6
environment:
ADV_HOST: 127.0.0.1
RUNTESTS: 0
ports:
- 2181:2181 # Zookeeper
- 3030:3030 # Landoop UI
- 8081-8083:8081-8083 # REST Proxy, Schema Registry, Kafka Connect ports
- 9581-9585:9581-9585 # JMX Ports
- 9092:9092 # Kafka Broker
volumes:
- ./connectors/news/crypto-panic-connector-1.0.jar:/connectors/crypto-panic-connector-1.0.jar
distributed.properties at folder /connect/connect-avro-distributed.properties generated by Landoop
offset.storage.partitions=5
key.converter.schema.registry.url=http://127.0.0.1:8081
value.converter.schema.registry.url=http://127.0.0.1:8081
config.storage.replication.factor=1
offset.storage.topic=connect-offsets
status.storage.partitions=3
offset.storage.replication.factor=1
key.converter=io.confluent.connect.avro.AvroConverter
config.storage.topic=connect-configs
config.storage.partitions=1
group.id=connect-fast-data
rest.advertised.host.name=127.0.0.1
port=8083
value.converter=io.confluent.connect.avro.AvroConverter
rest.port=8083
status.storage.replication.factor=1
status.storage.topic=connect-statuses
access.control.allow.origin=*
access.control.allow.methods=GET,POST,PUT,DELETE,OPTIONS
jmx.port=9584
plugin.path=/var/run/connect/connectors/stream-reactor,/var/run/connect/connectors/third-party,/connectors
bootstrap.servers=PLAINTEXT://127.0.0.1:9092
crypto-panic-connector-1.0 connector directories structure:
/config:
> worker.properties
/src:
> ...
UPDATE
Adding to environment properties:
CONNECT_CONNECT_CLIENT_CONFIG_OVERRIDE_POLICY: 'All'
CONNECT_PRODUCER_OVERRIDE_BATCH_SIZE: 65536
Doesn't work for landoop/fast-data-dev:2.6
In logs it's still
'connector.client.config.override.policy = None'
And warning
WARN The configuration 'connect.client.config.override.policy' was supplied but isn't a known config.
Changing this to
CONNECTOR_CONNECTOR_CLIENT_CONFIG_OVERRIDE_POLICY: 'All'
CONNECT_PRODUCER_OVERRIDE_BATCH_SIZE: 65536
Removes warning, but at the end the override policy is still 'None' and it's not possible to override properties for client when creating connector.
Changing to
CONNECTOR_CLIENT_CONFIG_OVERRIDE_POLICY: 'All'
CONNECT_PRODUCER_OVERRIDE_BATCH_SIZE: 65536
has same effect, policy 'None'.
Also batch size overriding is not aplied. So I assume those overriding features are not supported in Landoop.
WARN The configuration 'producer.override.batch.size' was supplied but isn't a known config.
I assume 'confluentinc/cp-kafka-connect' doesn't have UI built-in, and for learning purposes seems better to have it. So it's more preferable to do it in Landoop. But thanks for recommendation to use 'confluentinc/cp-kafka-connect'. I will try to do this config overriding there also
For starters, that image is very old and no longer maintained. I'd recommend you use confluentinc/cp-kafka-connect
In any case, for both images, you can use
environment:
CONNECT_CONNECT_CLIENT_CONFIG_OVERRIDE_POLICY: 'All'
CONNECT_PRODUCER_OVERRIDE_BATCH_SIZE: 65536
It's not clear, how exactly ... change the default worker properties
Look at the source code

PySpark doesn't find Kafka source

I am trying to deploy a docker container with Kafka and Spark and would like to read to Kafka Topic from a pyspark application. Kafka is working and I can write to a topic and also spark is working. But when I try to read the Kafka stream I get the error message:
pyspark.sql.utils.AnalysisException: Failed to find data source: kafka. Please deploy the application as per the deployment section of "Structured Streaming + Kafka Integration Guide".
My Docker Compose yaml looks like this:
---
version: '3.7'
services:
zookeeper:
image: bitnami/zookeeper:3
ports:
- 2181:2181
environment:
ALLOW_ANONYMOUS_LOGIN: "yes"
kafka:
image: bitnami/kafka:2
ports:
- 9092:9092
environment:
KAFKA_CFG_ZOOKEEPER_CONNECT: zookeeper:2181
ALLOW_PLAINTEXT_LISTENER: "yes"
KAFKA_LISTENERS: >-
INTERNAL://:29092,EXTERNAL://:9092
KAFKA_ADVERTISED_LISTENERS: >-
INTERNAL://kafka:29092,EXTERNAL://localhost:9092
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: >-
INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
KAFKA_INTER_BROKER_LISTENER_NAME: "INTERNAL"
depends_on:
- zookeeper
spark:
image: docker.io/bitnami/spark:3-debian-10
environment:
- SPARK_MODE=master
- SPARK_RPC_AUTHENTICATION_ENABLED=no
- SPARK_RPC_ENCRYPTION_ENABLED=no
- SPARK_LOCAL_STORAGE_ENCRYPTION_ENABLED=no
- SPARK_SSL_ENABLED=no
ports:
- '8080:8080'
volumes:
- ./:/home/workspace/
- ./spark/jars:/opt/bitnami/spark/.ivy2
spark-worker-1:
image: docker.io/bitnami/spark:3-debian-10
environment:
- SPARK_MODE=worker
- SPARK_MASTER_URL=spark://spark:7077
- SPARK_WORKER_MEMORY=1G
- SPARK_WORKER_CORES=1
- SPARK_RPC_AUTHENTICATION_ENABLED=no
- SPARK_RPC_ENCRYPTION_ENABLED=no
- SPARK_LOCAL_STORAGE_ENCRYPTION_ENABLED=no
- SPARK_SSL_ENABLED=no
volumes:
- ./:/home/workspace/
- ./spark/jars:/opt/bitnami/spark/.ivy2
kafdrop:
image: obsidiandynamics/kafdrop:latest
ports:
- 9000:9000
environment:
KAFKA_BROKERCONNECT: kafka:29092
depends_on:
- kafka
and the pyspark app:
from pyspark.sql import SparkSession
import os
#os.environ['PYSPARK_SUBMIT_ARGS'] = '--packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.2.0,org.apache.kafka:kafka-clients:2.8.1'
# the source for this data pipeline is a kafka topic, defined below
spark = SparkSession.builder.appName("fuel-level").master("local[*]").getOrCreate()
spark.sparkContext.setLogLevel('WARN')
kafkaRawStreamingDF = spark \
.readStream \
.format("kafka") \
.option("kafka.bootstrap.servers", "localhost:9092") \
.option("subscribe","SimLab-KUKA") \
.option("startingOffsets","earliest")\
.load()
#this is necessary for Kafka Data Frame to be readable, into a single column value
kafkaStreamingDF = kafkaRawStreamingDF.selectExpr("cast(key as string) key", "cast(value as string) value")
kafkaStreamingDF.writeStream.outputMode("append").format("console").start().awaitTermination()
I am new to Spark and docker, so maybe It's an obvious mistake, I hope you can help me
EDIT
When I uncomment os.env I get the following error:
Error: Missing application resource.
Usage: spark-submit [options] <app jar | python file | R file> [app arguments]
Usage: spark-submit --kill [submission ID] --master [spark://...]
Usage: spark-submit --status [submission ID] --master [spark://...]
Usage: spark-submit run-example [options] example-class [example args]
Options:
--master MASTER_URL spark://host:port, mesos://host:port, yarn,
k8s://https://host:port, or local (Default: local[*]).
--deploy-mode DEPLOY_MODE Whether to launch the driver program locally ("client") or
on one of the worker machines inside the cluster ("cluster")
(Default: client).
--class CLASS_NAME Your application's main class (for Java / Scala apps).
--name NAME A name of your application.
--jars JARS Comma-separated list of jars to include on the driver
and executor classpaths.
--packages Comma-separated list of maven coordinates of jars to include
on the driver and executor classpaths. Will search the local
maven repo, then maven central and any additional remote
repositories given by --repositories. The format for the
coordinates should be groupId:artifactId:version.
--exclude-packages Comma-separated list of groupId:artifactId, to exclude while
resolving the dependencies provided in --packages to avoid
dependency conflicts.
--repositories Comma-separated list of additional remote repositories to
search for the maven coordinates given with --packages.
--py-files PY_FILES Comma-separated list of .zip, .egg, or .py files to place
on the PYTHONPATH for Python apps.
--files FILES Comma-separated list of files to be placed in the working
directory of each executor. File paths of these files
in executors can be accessed via SparkFiles.get(fileName).
--archives ARCHIVES Comma-separated list of archives to be extracted into the
working directory of each executor.
--conf, -c PROP=VALUE Arbitrary Spark configuration property.
--properties-file FILE Path to a file from which to load extra properties. If not
specified, this will look for conf/spark-defaults.conf.
--driver-memory MEM Memory for driver (e.g. 1000M, 2G) (Default: 1024M).
--driver-java-options Extra Java options to pass to the driver.
--driver-library-path Extra library path entries to pass to the driver.
--driver-class-path Extra class path entries to pass to the driver. Note that
jars added with --jars are automatically included in the
classpath.
--executor-memory MEM Memory per executor (e.g. 1000M, 2G) (Default: 1G).
--proxy-user NAME User to impersonate when submitting the application.
This argument does not work with --principal / --keytab.
--help, -h Show this help message and exit.
--verbose, -v Print additional debug output.
--version, Print the version of current Spark.
Cluster deploy mode only:
--driver-cores NUM Number of cores used by the driver, only in cluster mode
(Default: 1).
Spark standalone or Mesos with cluster deploy mode only:
--supervise If given, restarts the driver on failure.
Spark standalone, Mesos or K8s with cluster deploy mode only:
--kill SUBMISSION_ID If given, kills the driver specified.
--status SUBMISSION_ID If given, requests the status of the driver specified.
Spark standalone, Mesos and Kubernetes only:
--total-executor-cores NUM Total cores for all executors.
Spark standalone, YARN and Kubernetes only:
--executor-cores NUM Number of cores used by each executor. (Default: 1 in
YARN and K8S modes, or all available cores on the worker
in standalone mode).
Spark on YARN and Kubernetes only:
--num-executors NUM Number of executors to launch (Default: 2).
If dynamic allocation is enabled, the initial number of
executors will be at least NUM.
--principal PRINCIPAL Principal to be used to login to KDC.
--keytab KEYTAB The full path to the file that contains the keytab for the
principal specified above.
Spark on YARN only:
--queue QUEUE_NAME The YARN queue to submit to (Default: "default").
Traceback (most recent call last):
File "/Users/janikbischoff/Documents/Uni/PuL/BA/Code/Tests/spark-test.py", line 6, in <module>
spark = SparkSession.builder.appName("fuel-level").master("local[*]").getOrCreate()
File "/Users/janikbischoff/Library/Python/3.8/lib/python/site-packages/pyspark/sql/session.py", line 228, in getOrCreate
sc = SparkContext.getOrCreate(sparkConf)
File "/Users/janikbischoff/Library/Python/3.8/lib/python/site-packages/pyspark/context.py", line 392, in getOrCreate
SparkContext(conf=conf or SparkConf())
File "/Users/janikbischoff/Library/Python/3.8/lib/python/site-packages/pyspark/context.py", line 144, in __init__
SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
File "/Users/janikbischoff/Library/Python/3.8/lib/python/site-packages/pyspark/context.py", line 339, in _ensure_initialized
SparkContext._gateway = gateway or launch_gateway(conf)
File "/Users/janikbischoff/Library/Python/3.8/lib/python/site-packages/pyspark/java_gateway.py", line 108, in launch_gateway
raise RuntimeError("Java gateway process exited before sending its port number")
RuntimeError: Java gateway process exited before sending its port number
Missing application resource
This implies you're running the code using python rather than spark-submit
I was able to reproduce the error by copying your environment, as well as using findspark, it seems PYSPARK_SUBMIT_ARGS aren't working in that container, even though the variable does get loaded...
The workaround would be to pass the argument at execution time.
spark-submit \
--packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.2.0 \
script.py

Graylog in Docker persistent

I'm trying to make a Graylog Docker Container persistent.
Meaning that after restarting (docker-compose down; docker-compose up) the logs will still be there alongside the configuration.
I've used the documentation at https://docs.graylog.org/en/3.1/pages/installation/docker.html I created a yml file with the content under the topic "Persisting data".
I only edited the line "GRAYLOG_HTTP_EXTERNAL_URI=http://127.0.0.1:9000/" to not use localhost but the external ip the machine is using.
Docker works, i can create an input and collect logfiles. What does not work is the data being persistent. Also every time i restart the node id changes, so i have to reconfigure the input. Running docker volume ls lists five volumes 3 of which are the ones created in the yml file.
I don't understand why data is not persistent. Can anybody help?
I had the same problem and I'd been struggling for a while before I found a solution. I'm on 3.2 and also had issues with node persistence. The documentation doesn't seem to directly state that there is one more configuration folder you need to persist, which is:
/usr/share/graylog/data/config
They actually mention it in the Custom configuration files section and when I took a look via CLI in that directory, it turns out that it's where the graylog.conf and node-id (the file Graylog uses to store information about its nodes) are stored as well!
Here's my docker-compose.override.yml section with the necessary changes (marked with '# ADDED' comments)
services:
graylog:
environment:
# CHANGE ME (must be at least 16 characters)!
- GRAYLOG_PASSWORD_SECRET=somepasswordpepper
# Password: admin
- GRAYLOG_ROOT_PASSWORD_SHA2=8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918
- GRAYLOG_HTTP_EXTERNAL_URI=http://127.0.0.1:9000/
- GRAYLOG_IS_MASTER=true
#- GRAYLOG_NODE_ID_FILE=/usr/share/graylog/data/config/node-id
ports:
# Graylog web interface and REST API
- 9000:9000
# Syslog TCP
- 1514:1514
# Syslog UDP
- 1514:1514/udp
# GELF TCP
- 12201:12201
# GELF UDP
- 12201:12201/udp
volumes:
- "graylogjournal:/usr/share/graylog/data/journal"
- "graylogconfig:/usr/share/graylog/data/config" # ADDED
volumes:
graylogjournal:
driver: local
graylogconfig: # ADDED
driver: local # ADDED
Hope this helps
You can add into daemon.json file these lines ;
{
"log-driver": "gelf",
"log-opts": {
"gelf-address": "udp://1.2.3.4:12201"
}
}
https://docs.docker.com/config/containers/logging/gelf/

Containerized Kafka client errors when producing messages to the host Kafka server

There are a number of similar types of queries on stackoverflow, but none quite match the problem that I am seeing.
I have a zookeeper/kafka setup on my server which work perfectly. One can produce
bin/kafka-console-producer.sh --broker-list 192.168.2.80:9092 --topic test
and consume
bin/kafka-console-consumer.sh --bootstrap-server 192.168.2.80:9092 --topic test --from-beginning
locally on the Linux Ubuntu 16.04 server.
From a Docker container - also running Ubuntu 16.04 - I want to produce and consume. The container's Kafka code was copied from that on the server.
Firstly I can create a new topic
bin/kafka-topics.sh --create --zookeeper 192.168.2.80:2181 --replication-factor 1 --partitions 1 --topic test2
from the container and then list it again
bin/kafka-topics.sh --list --zookeeper 192.168.2.80:2181
However when I try to produce new messages, using the above (kafka-console-producer.sh) command it fails with the following message:
[2017-06-05 13:59:05,317] ERROR Error when sending message to topic test2 with key: null, value: 2 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for test2-0: 1526 ms has passed since batch creation plus linger time
immediately after entering the text of the message and pressing enter.
It may seem strange running a Docker container on the same host, but once this works I will move the container to a separate host for production.
My kafka server.properties file:
listeners=PLAINTEXT://0.0.0.0:9092
Kafka version:
2.12-0.10.2.1
Docker version:
Docker version 1.12.6, build 78d1802
The problem is (slightly simplified) caused by how Kafka's protocol works. Given a list of "bootstrap servers" (e.g. localhost:9092), a Kafka client will contact those bootstrap servers, but then use the hostnames of the actual Kafka brokers as returned by the bootstrap servers (the broker's advertised.listeners config, depending on your Kafka/Docker setup, might be set to e.g. kafka:9092). So here, the client would talk to localhost:9092 for bootstrapping (which will work), but then switch to kafka:9092 (which will not work, "thanks" to the networking setup).
Fortunately there is a way to configure Kafka + Docker in a way that "just works", and it doesn't require shenanigans such as fiddling with your host's /etc/hosts file and such. As part of this you need to set a few (new) Kafka settings though, which were added in kafka's KIP-103: Separation of Internal and External traffic.
Here's a snippet for Docker Compose (docker-compose.yml) that demonstrates how to do this:
---
version: '2'
services:
zookeeper:
image: confluentinc/cp-zookeeper:3.2.1
hostname: zookeeper
ports:
- '32181:32181'
environment:
ZOOKEEPER_CLIENT_PORT: 32181
kafka:
image: confluentinc/cp-kafka:3.2.1
hostname: kafka
ports:
- '9092:9092'
- '29092:29092'
depends_on:
- zookeeper
environment:
KAFKA_BROKER_ID: 1
KAFKA_ZOOKEEPER_CONNECT: zookeeper:32181
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
# Following line is needed for Kafka versions 0.11+
# in case you run less than 3 Kafka brokers in your
# cluster because the broker config
# `offsets.topic.replication.factor` (default: 3)
# is now enforced upon topic creation
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
Here, the key settings are:
listener.security.protocol.map (which is being set via KAFKA_LISTENER_SECURITY_PROTOCOL_MAP)
inter.broker.listener.name
advertised.listeners
In the setup above, the containerized Kafka broker listens on localhost:9092 for access from your host machine (e.g. your Mac laptop) and on kafka:29092 for access from other containers.
A full end-to-end example is available at:
https://github.com/confluentinc/cp-docker-images/blob/v3.2.1/examples/kafka-streams-examples/docker-compose.yml (documentation at http://docs.confluent.io/3.2.1/cp-docker-images/docs/tutorials/kafka-streams-examples.html).
Your producer (in the container) can't resolve the host name of your Linux guest OS which is returned in the Kafka producers initial metadata request to the bootstrap server. You can add it manually to the /etc/hosts file inside the container or add "--add-host" parameter to the docker run command that launches the image running your producer
Aha!
After further reading and the answers given above the solution came. As is often the case it is an easy one.
A simple edit of the kafka server.properties file:
advertised.listeners=PLAINTEXT://192.168.2.80:9092
Also note, the parameter 'listeners' is not set in this file.

Docker-compose : understanding linking environment variables

I'm now using docker-compose for all of my projects. Very convenient. Much more comfortable than manual linking through several docker commands.
There is something that is not clear to me yet though: the logic behind the linking environment variables.
Eg. with this docker-compose.yml:
mongodb:
image: mongo
command: "--smallfiles --logpath=/dev/null"
web:
build: .
command: npm start
volumes:
- .:/myapp
ports:
- "3001:3000"
links:
- mongodb
environment:
PORT: 3000
NODE_ENV: 'development'
In the node app, I need to retrieve the mongodb url. And if I console.log(process.env), I get so many things that it feels very random (just kept the docker-compose-related ones):
MONGODB_PORT_27017_TCP: 'tcp://172.17.0.2:27017',
MYAPP_MONGODB_1_PORT_27017_TCP_PORT: '27017',
MYAPP_MONGODB_1_PORT_27017_TCP_PROTO: 'tcp',
MONGODB_ENV_MONGO_VERSION: '3.2.6',
MONGODB_1_ENV_GOSU_VERSION: '1.7',
'MYAPP_MONGODB_1_ENV_affinity:container': '=d5c9ebd7766dc954c412accec5ae334bfbe836c0ad0f430929c28d4cda1bcc0e',
MYAPP_MONGODB_1_ENV_GPG_KEYS: 'DFFA3DCF326E302C4787673A01C4E7FAAAB2461C \t42F3E95A2C4F08279C4960ADD68FA50FEA312927',
MYAPP_MONGODB_1_PORT_27017_TCP: 'tcp://172.17.0.2:27017',
MONGODB_1_PORT: 'tcp://172.17.0.2:27017',
MYAPP_MONGODB_1_ENV_MONGO_VERSION: '3.2.6',
MONGODB_1_ENV_MONGO_MAJOR: '3.2',
MONGODB_ENV_GOSU_VERSION: '1.7',
MONGODB_1_PORT_27017_TCP_ADDR: '172.17.0.2',
MONGODB_1_NAME: '/myapp_web_1/mongodb_1',
MONGODB_1_PORT_27017_TCP_PORT: '27017',
MONGODB_1_PORT_27017_TCP_PROTO: 'tcp',
'MONGODB_1_ENV_affinity:container': '=d5c9ebd7766dc954c412accec5ae334bfbe836c0ad0f430929c28d4cda1bcc0e',
MONGODB_PORT: 'tcp://172.17.0.2:27017',
MONGODB_1_ENV_GPG_KEYS: 'DFFA3DCF326E302C4787673A01C4E7FAAAB2461C \t42F3E95A2C4F08279C4960ADD68FA50FEA312927',
MYAPP_MONGODB_1_ENV_GOSU_VERSION: '1.7',
MONGODB_ENV_MONGO_MAJOR: '3.2',
MONGODB_PORT_27017_TCP_ADDR: '172.17.0.2',
MONGODB_NAME: '/myapp_web_1/mongodb',
MONGODB_1_PORT_27017_TCP: 'tcp://172.17.0.2:27017',
MONGODB_PORT_27017_TCP_PORT: '27017',
MONGODB_1_ENV_MONGO_VERSION: '3.2.6',
MONGODB_PORT_27017_TCP_PROTO: 'tcp',
MYAPP_MONGODB_1_PORT: 'tcp://172.17.0.2:27017',
'MONGODB_ENV_affinity:container': '=d5c9ebd7766dc954c412accec5ae334bfbe836c0ad0f430929c28d4cda1bcc0e',
MYAPP_MONGODB_1_ENV_MONGO_MAJOR: '3.2',
MONGODB_ENV_GPG_KEYS: 'DFFA3DCF326E302C4787673A01C4E7FAAAB2461C \t42F3E95A2C4F08279C4960ADD68FA50FEA312927',
MYAPP_MONGODB_1_PORT_27017_TCP_ADDR: '172.17.0.2',
MYAPP_MONGODB_1_NAME: '/myapp_web_1/novatube_mongodb_1',
Don't know what to pick, and why so many entries? Is it better to use the general ones, or the MYAPP-prefixed one? Where does the MYAPP name comes from? Folder name?
Could someone clarify this?
Wouldn't it be easier to let the user define the ones he needs in the docker-compose.yml file with a custom mapping? Like:
links:
- mongodb:
- MONGOIP: IP
- MONGOPORT : PORT
What I'm saying might not have sense though. :-)
Environment variables are a legacy way of defining links between containers. If you are using a newer version of compose, you don't need the links declaration at all. Trying to connect to mongodb from your app container will work fine by just using the name of the service (mongodb) as a hostname, without any links defined in the compose file (instead using docker's builtin DNS resolution, check /etc/hosts, nothing in there either!)
In answer to your question about why the prefix with MYAPP, you're right. Compose prefixes the service name with the name of the folder (or 'project', in compose nomenclature). It does the same thing when creating custom networks and volumes.

Resources