I'm using Docker Compose with multiple containers (a customised version of Full Stack FastAPI, but with Neo4j included).
The full docker-compose.yml is here; this is the excerpt for neo4j:
neo4j:
image: neo4j
networks:
- ${TRAEFIK_PUBLIC_NETWORK?Variable not set}
- default
ports:
- "6477:6477"
- "7474:7474"
- "7687:7687"
volumes:
- app-neo4j-data:/data
- app-neo4j-plugins:/plugins
env_file:
- .env
environment:
- NEO4J_ACCEPT_LICENSE_AGREEMENT=yes
- NEO4J_AUTH=${NEO4J_USERNAME?Variable not set}/${NEO4J_PASSWORD?Variable not set}
- NEO4J_dbms_default__advertised__address=0.0.0.0
- NEO4J_dbms_default__listen__address=0.0.0.0
- NEO4J_dbms_connector_bolt_advertised__address=0.0.0.0:7687
- NEO4J_dbms_connector_bolt_listen__address=0.0.0.0:7687
- NEO4J_dbms_connector_http_listen__address=:7474
- NEO4J_dbms_connector_http_advertised__address=:7474
- NEO4J_dbms_connector_https_listen__address=:6477
- NEO4J_dbms_connector_https_advertised__address=:6477
- NEO4J_dbms_connector_bolt_listen__address=:7687
- NEO4J_dbms_mode=SINGLE
deploy:
labels:
- traefik.enable=true
- traefik.docker.network=${TRAEFIK_PUBLIC_NETWORK?Variable not set}
- traefik.constraint-label=${TRAEFIK_PUBLIC_TAG?Variable not set}
- traefik.http.routers.${STACK_NAME?Variable not set}-neo4j-http.rule=Host(`neo4j.${DOMAIN?Variable not set}`)
- traefik.http.routers.${STACK_NAME?Variable not set}-neo4j-http.entrypoints=http
- traefik.http.routers.${STACK_NAME?Variable not set}-neo4j-http.middlewares=${STACK_NAME?Variable not set}-https-redirect
- traefik.http.routers.${STACK_NAME?Variable not set}-neo4j-https.rule=Host(`neo4j.${DOMAIN?Variable not set}`)
- traefik.http.routers.${STACK_NAME?Variable not set}-neo4j-https.entrypoints=https
- traefik.http.routers.${STACK_NAME?Variable not set}-neo4j-https.tls=true
- traefik.http.routers.${STACK_NAME?Variable not set}-neo4j-https.tls.certresolver=le
- traefik.http.services.${STACK_NAME?Variable not set}-neo4j.loadbalancer.server.port=7474
replicas: 1
resources:
limits:
memory: 1024M
reservations:
memory: 500M
restart_policy:
condition: on-failure
I try to reach bolt from the backend with bolt://login:password@neo4j:7687 but get the following error:
neo4j.exceptions.ServiceUnavailable: Couldn't connect to neo4j:7687 (resolved to ('10.0.3.11:7687',)):
Failed to establish connection to ResolvedIPv4Address(('10.0.3.11', 7687)) (reason [Errno 111] Connection refused)
I have reviewed an extraordinary number of answers on Stack Overflow, but I'm not getting anywhere. This does work on dev, but I haven't implemented HTTPS there, so I'm not sure if that's what's causing the problem.
I'm at a loss and would appreciate any guidance.
This particular issue wasn't related to Docker or Neo4j directly. Rather, it is that it takes a finite amount of time to initialise the database. The solution is to retry the connection. Here's the way I went about it:
import logging

from tenacity import after_log, before_log, retry, stop_after_attempt, wait_fixed

logger = logging.getLogger(__name__)

max_tries = 60 * 5  # 5 minutes
wait_seconds = 1

@retry(
    stop=stop_after_attempt(max_tries),
    wait=wait_fixed(wait_seconds),
    before=before_log(logger, logging.INFO),
    after=after_log(logger, logging.WARN),
)
def initNeo4j() -> None:
    try:
        init_gdb()
    except Exception as e:
        # Re-raise so tenacity sees the failure and schedules the next attempt
        raise e
Where init_gdb() is the script initialising Neo4j.
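For context, init_gdb() might look roughly like this — a minimal sketch assuming the official neo4j Python driver; the NEO4J_URI fallback and the use of verify_connectivity() are my illustrative assumptions, not from the original setup:

import os

from neo4j import GraphDatabase

def init_gdb() -> None:
    # URI matches the compose service above; credentials come from the same .env
    uri = os.environ.get("NEO4J_URI", "bolt://neo4j:7687")
    auth = (os.environ["NEO4J_USERNAME"], os.environ["NEO4J_PASSWORD"])
    driver = GraphDatabase.driver(uri, auth=auth)
    try:
        # Raises neo4j.exceptions.ServiceUnavailable while the DB is still
        # starting up, which is exactly what the retry loop above waits out.
        driver.verify_connectivity()
    finally:
        driver.close()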
The issue arises after Docker finishes instantiating the containers, when the main backend server begins to initialise and build itself. By the time the admin user is created, the database must be up and running; if it isn't, you get this error.
Given that the same error is used to cover a multitude of issues, it isn't necessarily clear that the database isn't ready, as opposed to some setting for reaching it being incorrect.
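Since the same exception covers both cases, a bare TCP probe can help tell "not ready yet" apart from "wrong host/port" — a small sketch using only the standard library, with the host and port taken from the compose file above:

import socket
import time

def wait_for_port(host: str = "neo4j", port: int = 7687, timeout: int = 300) -> bool:
    """Return True once the TCP port accepts connections, False on timeout."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with socket.create_connection((host, port), timeout=2):
                return True  # port is open: any remaining failure is config/auth
        except OSError:
            time.sleep(1)  # connection refused: the container is still starting
    return False  # never came up: likely a wrong address or network issue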
I had a problem with my mail server: it was not working properly when I added the mailbox to Outlook. Outlook warned me that I was using the wrong certificate; on closer inspection it turned out to be the default certificate generated by the mail Docker image, even though the webmail is protected with SSL from Traefik.
After researching, I managed to come up with this. It is a combination of another person's docker-compose (I cannot find the link to his post) and my previous attempt to host the ACME challenge on my Flask website, which had been overwritten by default by Nginx Proxy Manager, so I abandoned it.
Now, working with Traefik, which provides much more flexibility, I was able to do it.
This is one page on my Flask website that returns files from within the .well-known folder, which is mapped into each Docker container that needs access to the Let's Encrypt certification process behind the reverse proxy:
import json
import os

from flask import Response, render_template

# app, config and _auto_values come from the surrounding Flask application

@app.route('/.well-known/acme-challenge/<acmechalleng>')
def acme(acmechalleng):
    try:
        # acme_folder is the path to the folder where challenges are stored,
        # for example: acmechall_folder = os.path.abspath('acmefiles/')
        acme_folder = config.acmechall_folder
        extr = {}
        # File names listed here will never be served
        ignore_files = {'.paths.json'}
        try:
            items = json.load(open(os.path.join(acme_folder, '.paths.json')))
        except Exception:
            items = {}
        for root, dirs, files in os.walk(acme_folder):
            for f in files:
                # this filters out files we do not want to expose, i.e. the paths file
                if f not in ignore_files:
                    full_path = os.path.join(root, f)
                    path = full_path.split(acme_folder, 1)[1]
                    path = path.replace("\\", "/")
                    extr.update({path[1:]: 0})
        if items != extr:
            with open(os.path.join(acme_folder, '.paths.json'), 'w', encoding='utf-8') as f:
                json.dump(extr, f, ensure_ascii=False, indent=4)
        # Only files listed in .paths.json are allowed to be read
        items = json.load(open(os.path.join(acme_folder, '.paths.json')))
        if acmechalleng in items:
            with open(os.path.join(acme_folder, acmechalleng), "rb") as f:
                content = f.readlines()
            return Response(content)
        return render_template('404.html', **_auto_values), 404
    except Exception:
        return render_template('404.html', **_auto_values), 404
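To sanity-check the route, you can mimic what Let's Encrypt's HTTP-01 validation does: drop a file into the mapped folder and fetch it over plain HTTP. A hedged example using the requests library; the token name is made up:

import requests

token = "test-token-123"  # hypothetical; Let's Encrypt generates its own
resp = requests.get(
    f"http://mydomain.com/.well-known/acme-challenge/{token}", timeout=10
)
print(resp.status_code)
print(resp.text[:100])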
Poste.io mail server docker-compose file, with access to the .well-known challenges for Let's Encrypt:
version: '3'
networks:
traefikauth_net:
external: true
services:
mailserver:
image: analogic/poste.io
container_name: mailserver
hostname: mail.mydomain.com
networks:
- traefikauth_net
restart: unless-stopped
ports:
- "25:25"
- "587:587"
- "993:993"
environment:
- HOSTNAME=poste.mydomain.com
- TZ=Europe/London
- LETSENCRYPT_EMAIL=admin@mydomain.com
- LETSENCRYPT_HOST=mydomain.com
- VIRTUAL_HOST=poste.mydomain.com
- HTTPS=OFF
- DISABLE_CLAMAV=TRUE
- DISABLE_RSPAMD=TRUE
volumes:
- /opt/traefik/.well-known:/opt/www/.well-known/acme-challenge
- /opt/mailserver:/data
labels:
- "traefik.enable=true"
- "traefik.http.routers.poste-letsencrypt.rule=HostRegexp(`mydomain.com`,`{subdomain:[a-z]*}.mydomain.com`) && PathPrefix(`/.well-known/`)"
- "traefik.http.routers.poste-letsencrypt.entrypoints=http"
- "traefik.http.routers.poste-letsencrypt.service=mail"
- "traefik.http.services.poste-letsencrypt.loadbalancer.server.port=80"
- "traefik.http.routers.mail.rule=Host(`poste.mydomain.com`)"
- "traefik.http.routers.mail.entrypoints=https"
- "traefik.http.routers.mail.tls.certresolver=cloudflare"
- "traefik.http.routers.mail.service=mail"
- "traefik.http.services.mail.loadbalancer.server.port=80"
The main website, on mydomain.com, also needs Traefik labels:
volumes:
- /opt/traefik/.well-known:/var/www/wellknown
labels:
- "traefik.enable=true"
- "traefik.http.routers.flask-letsencrypt.entrypoints=http"
- "traefik.http.routers.flask-letsencrypt.rule=HostRegexp(`mydomain.com`) && PathPrefix(`/.well-known/`)"
- "traefik.http.services.flask-letsencrypt.loadbalancer.server.port=80"
- "traefik.http.routers.flask-letsencrypt.service=myflask"
- "traefik.http.routers.myflask.entrypoints=https"
- "traefik.http.routers.myflask.rule=Host(`mydomain.com`)"
- "traefik.http.routers.myflask.service=myflask"
- "traefik.http.routers.myflask.tls=true"
- "traefik.http.routers.myflask.tls.certresolver=cloudflare"
- "traefik.http.services.myflask.loadbalancer.server.port=80"
I also have a normal certResolver for all the other domains, which is in the main Traefik file, traefik.yml:
certificatesResolvers:
cloudflare:
acme:
email: admin@mydomain.com
storage: /letsencrypt/acme.json
dnsChallenge:
provider: cloudflare
resolvers:
- "1.1.1.1:53"
- "1.0.0.1:53"
entryPoints:
http:
address: ":80"
https:
address: ":443"
Traefik 2 docker-compose:
version: '3'
services:
traefik:
image: traefik
restart: always
volumes:
- /etc/localtime:/etc/localtime:ro
- /var/run/docker.sock:/var/run/docker.sock
- /opt/traefik/traefik.yml:/traefik.yml:ro
- /opt/traefik/conf:/additionalsettings
- /opt/traefik/folderacme:/letsencrypt
labels:
- 'traefik.enable=true'
- 'traefik.http.routers.api.rule=Host(`traefik.mydomain.com`)'
- 'traefik.http.routers.api.entrypoints=https'
- 'traefik.http.routers.api.service=api@internal'
- 'traefik.http.routers.api.tls=true'
- 'traefik.http.routers.api.middlewares=authelia@docker'
- "traefik.http.routers.api.tls.certresolver=cloudflare"
- "traefik.http.routers.api.tls.domains[0].main=mydomain.com"
- "traefik.http.routers.api.tls.domains[0].sans=*.mydomain.com"
- "traefik.http.middlewares.authelia.forwardauth.address=http://authelia:9091/api/verify?rd=https://logmein.mydomain.com/"
- "traefik.http.middlewares.authelia.forwardauth.trustforwardheader=true"
- 'traefik.http.middlewares.authelia.forwardauth.authResponseHeaders=Remote-User, Remote-Groups, Remote-Name, Remote-Email'
- 'traefik.http.middlewares.authelia-basic.forwardauth.address=http://authelia:9091/api/verify?auth=basic'
- 'traefik.http.middlewares.authelia-basic.forwardauth.trustForwardHeader=true'
- 'traefik.http.middlewares.authelia-basic.forwardauth.authResponseHeaders=Remote-User, Remote-Groups, Remote-Name, Remote-Email'
# global redirect to https
- "traefik.http.routers.http-catchall.rule=hostregexp(`{host:.+}`)"
- "traefik.http.routers.http-catchall.entrypoints=http"
- "traefik.http.routers.http-catchall.middlewares=redirect-to-https"
# middleware redirect
- "traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https"
environment:
- CF_API_EMAIL=<cloudflare api email>
- CF_API_KEY=<cloudflare api key for certresolver>
ports:
- 80:80
- 443:443
command:
- '--api'
- '--providers.docker=true'
- '--providers.docker.exposedByDefault=false'
- '--entrypoints.http=true'
- '--entrypoints.http.address=:80'
- '--entrypoints.http.http.redirections.entrypoint.to=https'
- '--entrypoints.http.http.redirections.entrypoint.scheme=https'
- '--entrypoints.https=true'
- '--entrypoints.https.address=:443'
What do you think? I believe the Python code could be much smaller, especially the part that stores the paths in JSON, but I reuse it from another project where I needed more than just the path to the file. This way, I believe, nobody should be able to access other files, only those in the ACME folder.
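As a follow-up on "could be much smaller": Flask's send_from_directory already refuses paths that resolve outside the given folder, so the JSON allow-list could arguably be dropped entirely. A sketch under that assumption, reusing the same config.acmechall_folder:

from flask import abort, send_from_directory

@app.route('/.well-known/acme-challenge/<acmechalleng>')
def acme(acmechalleng):
    if acmechalleng == '.paths.json':
        abort(404)  # keep the bookkeeping file private
    # send_from_directory rejects anything that escapes acmechall_folder,
    # so only files inside the ACME folder can ever be served.
    return send_from_directory(config.acmechall_folder, acmechalleng)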
Tracing makes it much easier to find the parts of the code that are worth a developer's time and attention. For that reason, I attached Jaeger as a tracer to a set of microservices inside Docker containers. I use Traefik as the ingress controller/service mesh to route and proxy requests.
The problem I am facing is that something is wrong with the tracing config in Traefik. Jaeger cannot find the span context to connect the individual, service-dependent spans into a whole trace.
The following line appears in the logs:
{
"level":"debug",
"middlewareName":"tracing",
"middlewareType":"TracingEntryPoint",
"msg":"Failed to extract the context: opentracing: SpanContext not found in Extract carrier",
"time":"2021-02-02T23:16:51+01:00"
}
What I have tried/searched/confirmed so far:
I already checked the ports (they are open inside the Docker host network) and everything is reachable, so interconnectivity is not the problem here.
The forwarding of headers is set via the Docker Compose label loadbalancer.passhostheader=true.
The following snippets describe the Docker Compose setup.
Traefik: Ingress Controller
This is a stripped-down version of the Traefik container.
# Network
ROOT_DOMAIN=example.test
DEFAULT_NETWORK=traefik
---
version: '3'
services:
controller: # service key missing in the original excerpt; named after the hostname below
image: "traefik:2.4.2"
hostname: "controller"
restart: on-failure
security_opt:
- no-new-privileges:true
ports:
- "443:443"
- "80:80"
# The Web UI (enabled by --api.insecure=true)
- "8080:8080"
- "8082:8082"
- "8083:8083"
networks:
- default
working_dir: /etc/traefik
volumes:
- /private/etc/localtime:/etc/localtime:ro
- ${PWD}/controller/static.yml:/etc/traefik/traefik.yml:ro
- ${PWD}/controller/dynamic.yml:/etc/traefik/dynamic.yml:ro
- /var/run/docker.sock:/var/run/docker.sock:ro
- cert-storage:/usr/local/share/ca-certificates:ro
- ${PWD}/logs/traefik:/var/log/traefik
volumes:
cert-storage:
driver_opts:
type: none
o: bind
device: ${PWD}/certs/certs
networks:
default:
external: true
name: ${DEFAULT_NETWORK}
Traefik is set up using the file provider as base and Docker Compose labels on top of it:
# static.yaml (Traefik conf)
debug: true
log:
level: DEBUG
filePath: /var/log/traefik/error.log
format: json
serversTransport:
insecureSkipVerify: true
api:
dashboard: true
insecure: true
debug: true
providers:
docker:
exposedByDefault: false
swarmMode: false
watch: true
defaultRule: "Host(`{{ normalize .Name }}.example.test`)"
endpoint: "unix:///var/run/docker.sock"
network: traefik
file:
filename: /etc/traefik/dynamic.yml
watch: true
tracing:
serviceName: "controller"
spanNameLimit: 250
jaeger:
samplingType: const
samplingParam: 1.0
samplingServerURL: http://tracer:5778/sampling
localAgentHostPort: 127.0.0.1:6831
gen128Bit: true
propagation: jaeger
traceContextHeaderName: "traefik-trace-id"
collector:
endpoint: http://tracer:14268/api/traces?format=jaeger.thrift
Jaeger: OpenTracing / OpenTelemetry
---
version: '3'
services:
tracer:
image: "jaegertracing/all-in-one:1.21.0"
hostname: "tracer"
command:
- "--log-level=info"
- "--admin.http.host-port=:14269"
- "--query.ui-config=/usr/local/share/jaeger/ui/conf.json"
environment:
SPAN_STORAGE_TYPE: memory
restart: on-failure
security_opt:
- no-new-privileges:true
expose:
- 5775/udp
- 6831/udp
- 6832/udp
- 5778
- 14250
- 14268
- 14269
- 14271
- 16686
- 16687
volumes:
- /private/etc/localtime:/etc/localtime:ro
- ${PWD}/tracer/conf:/usr/local/share/jaeger
- ${PWD}/logs/jaeger:/var/log/#TODO
- cert-storage:/usr/local/share/ca-certificates
networks:
- default
labels:
- "traefik.enable=true"
- "traefik.docker.network=${DEFAULT_NETWORK}"
# Admin UI router
- "traefik.http.routers.tracer-router.rule=Host(`tracer.$ROOT_DOMAIN`)"
- "traefik.http.routers.tracer-router.entrypoints=https"
- "traefik.http.routers.tracer-router.tls=true"
- "traefik.http.routers.tracer-router.tls.options=default"
- "traefik.http.routers.tracer-router.service=tracer"
# Service/ Load Balancer
- "traefik.http.services.tracer.loadbalancer.passhostheader=true"
- "traefik.http.services.tracer.loadbalancer.server.port=16686"
- "traefik.http.services.tracer.loadbalancer.server.scheme=http"
I'm not 100% sure what the problem is you're experiencing, but here are some things to consider.
According to this post on the Traefik forums, the message you're seeing is at debug level because it's not something you should be worried about. It's just logging that no trace context was found, so a new one will be created. That second part is not in the message, but apparently that's what happens.
You should check whether you're getting data appearing in Jaeger. If you are, that message is probably nothing to worry about.
If you are getting data in Jaeger but it's not connected, that will be because Traefik can only work with trace context that is already in inbound requests; it can't add trace context to outbound requests. Within your application, you'll need to implement trace propagation so that your outbound requests include the trace context that was received as part of the incoming request. Without that, every request will be sent without trace context and will start a new trace when it is received at the next Traefik ingress point.
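To illustrate what in-app propagation looks like, here is a minimal sketch using the Python opentracing and jaeger-client packages; the service name and URL handling are made up, and note that trace_id_header must match Traefik's traceContextHeaderName:

import requests
from jaeger_client import Config
from opentracing.propagation import Format

tracer = Config(
    config={
        "sampler": {"type": "const", "param": 1},
        # Must match Traefik's traceContextHeaderName so both sides
        # read and write the same header.
        "trace_id_header": "traefik-trace-id",
    },
    service_name="my-service",
).initialize_tracer()

def call_downstream(url: str, parent_span_context) -> requests.Response:
    with tracer.start_active_span("outbound", child_of=parent_span_context) as scope:
        headers = {}
        # Copy the current span context into the outgoing HTTP headers, so the
        # next Traefik ingress extracts it instead of starting a fresh trace.
        tracer.inject(scope.span.context, Format.HTTP_HEADERS, headers)
        return requests.get(url, headers=headers)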
The problem actually was with the traceContextHeaderName. Sadly, I cannot tell exactly what the problem was, as the git diff only shows that nothing changed around Traefik and Jaeger at the point where I fixed it. I assume the config got "stuck" somehow. I tracked down the related lines in the source, but as I am no Go dev, I can only guess whether there's a bug.
What I did was switch back to uber-trace-id, which magically "fixed" it. After I ran some traces and connected another service (Node, npm jaeger-client, with process.env.TRACER_STATE_HEADER_NAME set to an equal value), I switched back to traefik-trace-id and things worked.
I'm using a K3S cluster in a Docker (Compose) container in my CI/CD pipeline to test my application code. However, I have a problem with the cluster's certificate: I need to communicate with the cluster using the external address. My docker-compose script looks as follows:
version: '3'
services:
server:
image: rancher/k3s:v0.8.1
command: server --disable-agent
environment:
- K3S_CLUSTER_SECRET=somethingtotallyrandom
- K3S_KUBECONFIG_OUTPUT=/output/kubeconfig.yaml
- K3S_KUBECONFIG_MODE=666
volumes:
- k3s-server:/var/lib/rancher/k3s
# get the kubeconfig file
- .:/output
ports:
# - 6443:6443
- 6080:6080
- 192.168.2.110:6443:6443
node:
image: rancher/k3s:v0.8.1
tmpfs:
- /run
- /var/run
privileged: true
environment:
- K3S_URL=https://server:6443
- K3S_CLUSTER_SECRET=somethingtotallyrandom
ports:
- 31000-32000:31000-32000
volumes:
k3s-server: {}
Accessing the cluster from Python gives me:
MaxRetryError: HTTPSConnectionPool(host='192.168.2.110', port=6443): Max retries exceeded with url: /apis/batch/v1/namespaces/mlflow/jobs?pretty=True (Caused by SSLError(SSLCertVerificationError("hostname '192.168.2.110' doesn't match either of 'localhost', '172.19.0.2', '10.43.0.1', '172.23.0.2', '172.18.0.2', '172.23.0.3', '127.0.0.1', '0.0.0.0', '172.18.0.3', '172.20.0.2'")))
Here are my two (three) questions:
How can I add additional IP addresses to the certificate generation? I was hoping the --bind-address in the server command would trigger that.
How can I fall back to HTTP? Providing an --http-listen-port didn't achieve the expected result.
Any other suggestions on how I can enable communication with the cluster?
Changing the Python code is not really an option, as I would like to keep the code unaltered for testing. (Falling back to HTTP works via kubeconfig.)
The solution is to use the --tls-san parameter:
server --disable-agent --tls-san 192.168.2.110
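To verify the SAN actually made it into the serving certificate, you can inspect it from Python without touching the application code. A sketch that assumes the cryptography package is installed; the host and port come from the question:

import ssl

from cryptography import x509

# get_server_certificate fetches the PEM without verifying it,
# which is exactly what we want here.
pem = ssl.get_server_certificate(("192.168.2.110", 6443))
cert = x509.load_pem_x509_certificate(pem.encode())
san = cert.extensions.get_extension_for_class(x509.SubjectAlternativeName)
print(san.value.get_values_for_type(x509.DNSName))
print(san.value.get_values_for_type(x509.IPAddress))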
I am trying to capture syslog messages sent over the network using rsyslog, and then have rsyslog transform and send these messages to Elasticsearch.
I found a nice article on the configuration on https://www.reddit.com/r/devops/comments/9g1nts/rsyslog_elasticsearch_logging/
The problem is that rsyslog keeps raising an error at startup that it cannot connect to Elasticsearch on the same machine on port 9200. The error I get is:
Failed to connect to localhost port 9200: Connection refused
2020-03-20T12:57:51.610444+00:00 53fd9e2560d9 rsyslogd: [origin software="rsyslogd" swVersion="8.36.0" x-pid="1" x-info="http://www.rsyslog.com"] start
rsyslogd: omelasticsearch: we are suspending ourselfs due to server failure 7: Failed to connect to localhost port 9200: Connection refused [v8.36.0 try http://www.rsyslog.com/e/2007 ]
Can anyone help with this?
Everything is running in docker on a single machine. I use below docker compose file to start the stack.
version: "3"
services:
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:7.6.1
environment:
- discovery.type=single-node
- xpack.security.enabled=false
ports:
- 9200:9200
networks:
- logging-network
kibana:
image: docker.elastic.co/kibana/kibana:7.6.1
depends_on:
- logstash
ports:
- 5601:5601
networks:
- logging-network
rsyslog:
image: rsyslog/syslog_appliance_alpine:8.36.0-3.7
environment:
- TZ=UTC
- xpack.security.enabled=false
ports:
- 514:514/tcp
- 514:514/udp
volumes:
- ./rsyslog.conf:/etc/rsyslog.conf:ro
- rsyslog-work:/work
- rsyslog-logs:/logs
volumes:
rsyslog-work:
rsyslog-logs:
networks:
logging-network:
driver: bridge
rsyslog.conf file below:
global(processInternalMessages="on")
#module(load="imtcp" StreamDriver.AuthMode="anon" StreamDriver.Mode="1")
module(load="impstats") # config.enabled=`echo $ENABLE_STATISTICS`)
module(load="imrelp")
module(load="imptcp")
module(load="imudp" TimeRequery="500")
module(load="omstdout")
module(load="omelasticsearch")
module(load="mmjsonparse")
module(load="mmutf8fix")
input(type="imptcp" port="514")
input(type="imudp" port="514")
input(type="imrelp" port="1601")
# includes done explicitely
include(file="/etc/rsyslog.conf.d/log_to_logsene.conf" config.enabled=`echo $ENABLE_LOGSENE`)
include(file="/etc/rsyslog.conf.d/log_to_files.conf" config.enabled=`echo $ENABLE_LOGFILES`)
#try to parse a structured log
action(type="mmjsonparse")
# this is for index names to be like: rsyslog-YYYY.MM.DD
template(name="rsyslog-index" type="string" string="rsyslog-%$YEAR%.%$MONTH%.%$DAY%")
# this is for formatting our syslog in JSON with #timestamp
template(name="json-syslog" type="list") {
constant(value="{")
constant(value="\"#timestamp\":\"") property(name="timereported" dateFormat="rfc3339")
constant(value="\",\"host\":\"") property(name="hostname")
constant(value="\",\"severity\":\"") property(name="syslogseverity-text")
constant(value="\",\"facility\":\"") property(name="syslogfacility-text")
constant(value="\",\"program\":\"") property(name="programname")
constant(value="\",\"tag\":\"") property(name="syslogtag" format="json")
constant(value="\",") property(name="$!all-json" position.from="2")
# closing brace is in all-json
}
# this is where we actually send the logs to Elasticsearch (localhost:9200 by default)
action(type="omelasticsearch" template="json-syslog" searchIndex="rsyslog-index" dynSearchIndex="on")
#################### default ruleset begins ####################
# we emit our own messages to docker console:
syslog.* :omstdout:
include(file="/config/droprules.conf" mode="optional") # this permits the user to easily drop unwanted messages
action(name="main_utf8fix" type="mmutf8fix" replacementChar="?")
include(text=`echo $CNF_CALL_LOG_TO_LOGFILES`)
include(text=`echo $CNF_CALL_LOG_TO_LOGSENE`)
First of all, you need to run all the containers on the same Docker network, which in this case they are not: the rsyslog service has no networks entry, so it does not join logging-network like elasticsearch does. Second, after running the containers on the same network, log in to the rsyslog container and check whether port 9200 is reachable.
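A quick way to confirm the second point from inside the rsyslog container — a minimal sketch using only the standard library, where elasticsearch is the compose service name above:

import socket

# "localhost" fails inside the rsyslog container; the service name resolves
# only once both containers share logging-network.
for host in ("localhost", "elasticsearch"):
    s = socket.socket()
    s.settimeout(3)
    try:
        s.connect((host, 9200))
        print(f"{host}:9200 reachable")
    except OSError as exc:
        print(f"{host}:9200 NOT reachable: {exc}")
    finally:
        s.close()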
I'm trying to set up a network of 2 organisations, each having two peers, and a 3rd organisation having 2 orderer nodes with a Kafka-ZooKeeper ensemble of 4 Kafka and 3 ZooKeeper nodes.
Below is the relevant part of my crypto-config.yaml file:
OrdererOrgs:
- Name: Orderer
Domain: ordererOrg.example.com
Template:
Count: 2
Below is the relevant part of my configtx.yaml file:
- &OrdererOrg
Name: OrdererOrg
ID: OrdererMSP
MSPDir: crypto-config/ordererOrganizations/ordererOrg.example.com/msp
Policies:
Readers:
Type: Signature
Rule: "OR('OrdererMSP.member')"
Writers:
Type: Signature
Rule: "OR('OrdererMSP.member')"
Admins:
Type: Signature
Rule: "OR('OrdererMSP.admin')"
.................
Orderer: &OrdererDefaults
OrdererType: kafka
Addresses:
- orderer0.ordererOrg.example.com:7050
- orderer1.ordererOrg.example.com:7040
BatchTimeout: 2s
BatchSize:
MaxMessageCount: 10
AbsoluteMaxBytes: 99 MB
PreferredMaxBytes: 512 KB
Kafka:
Brokers:
- kafka0.ordererOrg.example.com:9092
- kafka1.ordererOrg.example.com:9092
- kafka2.ordererOrg.example.com:9092
- kafka3.ordererOrg.example.com:9092
...............
Below is the relevant part of my Docker base file:
zookeeper:
image: hyperledger/fabric-zookeeper
environment:
- ZOO_SERVERS=server.1=zookeeper0.ordererOrg.example.com:2888:3888 server.2=zookeeper1.ordererOrg.example.com:2888:3888 server.3=zookeeper2.ordererOrg.example.com:2888:3888
restart: always
kafka:
image: hyperledger/fabric-kafka
restart: always
environment:
- KAFKA_MESSAGE_MAX_BYTES=103809024 # 99 * 1024 * 1024 B
- KAFKA_REPLICA_FETCH_MAX_BYTES=103809024 # 99 * 1024 * 1024 B
- KAFKA_UNCLEAN_LEADER_ELECTION_ENABLE=false
- KAFKA_MIN_INSYNC_REPLICAS=2
- KAFKA_DEFAULT_REPLICATION_FACTOR=3
- KAFKA_ZOOKEEPER_CONNECT=zookeeper0.ordererOrg.example.com:2181,zookeeper1.ordererOrg.example.com:2181,zookeeper2.ordererOrg.example.com:2181
Below is the relevant part of my Docker Compose file:
zookeeper0.ordererOrg.example.com:
container_name: zookeeper0.ordererOrg.example.com
extends:
file: base/kafka-base.yaml
service: zookeeper
environment:
- ZOO_MY_ID=1
ports:
- '2181:2181'
- '2888:2888'
- '3888:3888'
networks:
- byfn
kafka0.ordererOrg.example.com:
container_name: kafka0.ordererOrg.example.com
extends:
file: base/kafka-base.yaml
service: kafka
depends_on:
- zookeeper0.ordererOrg.example.com
- zookeeper1.ordererOrg.example.com
- zookeeper2.ordererOrg.example.com
environment:
- KAFKA_BROKER_ID=0
ports:
- '9092:9092'
- '9093:9093'
networks:
- byfn
-----------------------
Note: The same structure is being followed for:
- zookeeper1.ordererOrg.example.com
- zookeeper2.ordererOrg.example.com
And
- kafka1.ordererOrg.example.com
- kafka2.ordererOrg.example.com
- kafka3.ordererOrg.example.com
When I run the network start command I get the following error messages:
✖ Starting business network definition. This may take a minute...
Error: Error trying to start business network. Error: No valid
responses from any peers. Response from attempted peer comms was an
error: Error: REQUEST_TIMEOUT
And when I run the same network start command again, I get the following:
✖ Starting business network definition. This may take a minute...
Error: Error trying to start business network. Error: No valid
responses from any peers. Response from attempted peer comms was an
error: Error: chaincode registration failed: timeout expired while
starting chaincode tt_poc:0.0.1 for transaction
Image files are also not being created for the chaincode (the BNA file), as you can see from the ccenv containers and orderer logs in the image below:
I also get the following logs on the console after the peer channel create command, though the channel gets created successfully:
2019-03-25 15:20:34.567 UTC [channelCmd] InitCmdFactory -> INFO 001 Endorser and orderer connections initialized
2019-03-25 15:20:34.956 UTC [cli.common] readBlock -> INFO 002 Got status: &{SERVICE_UNAVAILABLE}
I have tried to provide as much information as possible, but please let me know if you require logs from any other container. Thanks for your time.
I was finally able to resolve this issue. There was nothing wrong with these YAML configurations. The issue was with the Docker configuration: it was lacking resources, and the strange thing is that I didn't get any resource-related error in any container log file. So I just increased the CPU and memory settings in Docker's advanced configuration, like below:
After these configuration changes, my network started successfully and is working properly.
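If you want to check from code how many CPUs and how much memory the Docker daemon actually has, the Docker SDK for Python exposes it — a small sketch, assuming pip install docker:

import docker

client = docker.from_env()
info = client.info()
# NCPU and MemTotal reflect what the daemon (e.g. Docker Desktop's VM) was given.
print("CPUs:", info["NCPU"])
print("Memory (GiB):", round(info["MemTotal"] / 2**30, 1))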
Thanks to my colleague @Rafiq, who helped me sort out this issue.