Understanding Docker Container Internals in Hyperledger Fabric

Understanding Docker Container Internals in Hyperledger Fabric - docker

I think to understood how fabric mainly works and how consens is reached. What I am still missing in the documentation is the part of what happens inside of a docker container of fabric to take part in communication process.
So, communication starting from a client (e.g. an app) takes place in using gRPC messages between peers and orderer.
But what happens inside of the containers?
I imagine it for myself as a process that is only receiving gRPC message and answering them in using functions in the background of a peer/orderer, to hands out its response for further processing in another unit like the client to collect the responses of multiple peers for a smart contract.
But what happens really inside a container? I mean, a container spawns, when the docker image file is loaded and launched by the yaml config file. But what is started there inside of it (is there only a single peer binary started, e.g. like the command "peer node start") - I mean the compiled go binary file "peer" only?? What is listening? What is responding there? I discovered only one port for every container that is exposed out. This seems for me to be the gate for gRPC (cause it is often used as Port ID: **51).
The same questions goes for the orderer, the chaincode and the cli. How are they talking to each other or is gRPC the only way of communication and processing (excluded of the discovery service and gossip, how is this started inside of the containers (in using the yaml files for lauchun only or is there further internal configuration or a startupscript in the image files (cause I cannot look inside the images, only login on running containers while runtime).

When your client sends request to one of the peers, peer instance checks if requested chaincode (CC) installed on it. If CC not installed: Obviously you'll get an error.
If CC is installed: Peer checks if a dedicated container is already started for the given CC and corresponding version. If container is started, peer sends transaction request to that CC instance and returns back the response to your client after signing the transaction. Signing guarantees that response is really sent by that peer.
If container not started:
It builds a docker image and starts that instance (docker container). New image would be based on one of the hyperledger images. i.e. if your CC is GO, then hyperledger/baseos, which is very basic linux os, will be used. This new image contains CC binary and META-DATA as well.
That peer instance is using underlying (your) machine's docker server to do all of those. That's the reason why we need to pass /var/run:/host/var/run into volume mapping and CORE_VM_ENDPOINT=unix:///host/var/run/docker.sock into environment variables.
Once the CC container starts, it connects to its parent peer node which is defined with
CORE_PEER_CHAINCODEADDRESS attribute. Peer dictates to child (probably during image creation) to use this address, so they obey. Peer node defines its own listen URL with CORE_PEER_CHAINCODELISTENADDRESS attribute.
About your last question; communication is with gRPC in between nodes also with clients. If TLS is enabled, then it's for sure secure communication. Entry point for orderers to know about peers and peers know about other organizations' peers is the definition of anchor peers defined during channel creation. Discovery service is running in peer nodes, so they can hold a close to real-time network layout. Discovery service also provides peers' identity, that's how clients can detect other organizations' peers when endorsement policy requires multiple organizations' endorsement policy (i.e. if policy look like AND(Org1MSP.member, Org2MSP.member)).

Related

How to setup a connection between Python Couchbase SDK and containerized Couchbase nodes?

I want to have the following setup:
3 Couchbase nodes, each running on a separate container, all in the same cluster
Python application running in another container (querying, inserting, deleting data from the Couchbase cluster)
What I managed to do:
Set up a cluster, bucket, query the bucket via UI (accessed by localhost:8091)
What I didn't manage to do:
Create a connection between a Python application (which would at the end be Dockerized, for now for the sake of simplicity, let's treat it as local) and the (already working) cluster. Unfortunately, I cannot access it via Docker containers IP's with 8091 port, via localhost too. Unfortunately, the Couchbase documentation is either severely lacking here, or I just don't understand it. I tried to even use the setting-alternate-address option, but without much success (maybe I used it wrongly, so if you have any "how-to's" explaining the process, I'd still be grateful)
The connection works if there is one node, but throws Timeout if I set up 3 nodes.
I would really appreciate any tips leading to solving this problem.
EDIT: Adding code and error message:
connection_string = "couchbase://localhost"
cluster = Cluster.connect(connection_string, ClusterOptions(PasswordAuthenticator(os.getenv("LOGIN"), os.getenv("PASSWORD"))))
# following a successful authentication, a bucket can be opened.
# access a bucket in that cluster
bucket = cluster.bucket('travel-sample')
coll = bucket.default_collection()
result = coll.get('airline_10')
print(result.content_as[dict])
Error message:
couchbase.exceptions.UnAmbiguousTimeoutException: <ec=14, category=couchbase.common, message=unambiguous_timeout, context=KeyValueErrorContext:{'key': 'airline_10', 'bucket_name': 'travel-sample', 'scope_name': '_default', 'collection_name': '_default', 'opaque': 0}, C Source=C:\Jenkins\workspace\python\sdk\python-scripted-build-pipeline\py-client\src\kv_ops.cxx:209>

Couchbase SDKs need to be able to connect to every node on the cluster.
If you are running an app outside of the Docker host, it cannot connect to every node (you can't expose every node on the same port).
This is exactly why it will work fine with one node, but not with multiple (more details in the documentation)
If you run the Python app inside of a container that runs in the Docker host, it should connect just fine (or stick to a single node for development - which is mostly fine if you're not testing something specific to clustering/failover/replication).

Microservices with dynamic ports

I have a series of microservices that I have been testing. Originally it was using Service Fabric however I have switched to using Consul, Fabio, Nomad which I like better.
In development on my machine things work well however I am running into some issues actually getting Fabio to work in a cluster format.
I have a cluster of 5 nodes each running Consul, Fabio, Nomad.
Each service gets a dynamic port at runtime and successfully registers itself.
On the node which the service is running Fabio correctly forwards traffic.
However if the same fabio url is used on a different node then traffic is forwarded to the correct node/port however that is closed so the connection doesn't work.
For instance if ServiceA running on MachineA on port 1234 then http://MachineA:9999/ServiceA correctly works.
However http://MachineB/ServiceA fails after MachineA tries to initiate a connection to MachineB on port 1234.
A solution would be to add firewall rules, I would imagine, however this requires all the Services to run as Admin which I don't want.
Is there a way to support this through Fabio?

Can't mirror messages from Kafka producer in container to consumer

I am trying to mirror a kafka topic from a provider in a container in an ec2 instance to a consumer, and messages are not coming through. I suspect that I am messing things up with the .properties configs, as this is my first time using MirrorMaker. I may have also messed up with pointing to the wrong ports somewhere along the way.
The would-be provider broker is running in a centOS container in an ec2 instance. The provider is receiving data from a remote MySQL server through a custom-configured jdbc source connector to a topic called mysql-jdbc-events. The provider is successfully receiving messages.
The would-be consumer is currently on the host ec2 instance, although that will change once it's successfully been tested. I mapped port 12181 of the host to port 2181 of the container (where zookeeper is running). I am running the MirrorMaker command from the consumer.
I ran the command
./kafka-run-class.sh kafka.tools.MirrorMaker --consumer.config /path/to/config/consumer.properties --producer.config /path/to/config/producer.properties --whitelist "mysql-jdbc-events"
consumer.properties:
# format: host1:port1,host2:port2 ...
zookeeper.connect=(host ip-address):12181
zookeeper.connection.timeout.ms=10000
bootstrap.servers=localhost:9092
# consumer group id
group.id=mirror_group
producer.properties:
# format: host1:port1,host2:port2 ...
zookeeper.connect=(host ip-address):2181
bootstrap.servers=localhost:9092
# specify the compression codec for all data generated: none, gzip, snappy, lz4, zstd
compression.type=none
I tried both with and without the zookeeper.connect parameter in the producer config because I found conflicting about it being necessary. I also got a warning to the effect of WARN The configuration 'zookeeper.connect' was supplied but isn't a known config., but I read elsewhere on SO that this could be ignored.
I did not get any messages populated to the topic at the consumer, but there are messages in the topic at the producer.
If any more information would be helpful, please let me know.
I am also not married to this configuration - if there is a simpler way to keep the jdbc in the container and forward the messages to a kafka instance outside the container, that's good too.

Hyperledger Composer Identity Issue error after network restart (code:20, authorization failure)

I am using Docker Swarm and docker-compose to setup my Fabric (v1.1) and Composer (v0.19.18) networks.
I wanted to test how my Swarm/Fabric networks would respond to a host/ec2 failure, so I manually reboot the host which is running the fabric-ca, orderer, and peer0 containers.
Before the reboot, everything runs perfectly with respect to issuing identities. After the reboot, though all of the Fabric containers are restarted and appear to be functioning properly, I am unable to issue identities with the main admin#network card.
After reboot, composer network ping -c admin#network returns successfully, but composer identity issue (via CLI or javascript) both return code 20 errors as described here:
"fabric-ca request register failed with errors [[{\"code\":20,\"message\":\"Authorization failure\"}]]"
I am guessing the issue is stemming from the host reboot and some difference in how it is restarting the Fabric containers. I can post my docker-compose files if necessary.

If your fabric-ca-server has restarted and it's registration database hasn't been persisted (for example the database is stored on the file system of the container and loss of that container means loss of the contents of that file system) then the ca-server will create a completely new bootstrap identity called admin for issuing identities and it won't be the one you have already have and therefore isn't a valid identity anymore for the fabric-ca-server. Note that it will be a valid identity for the fabric network still. So this is why you now get authorisation failure from the fabric-ca-server. The identity called admin that you currently have is not known to your fabric-ca-server anymore.

Docker services stops communicating after some time

I have together 6 containers running in docker swarm. Kafka+Zookeeper, MongoDB, A, B, C and Interface. Interface is the main access point from public - only this container publish the port - 5683. The interface container connects to A, B and C during startup. I am using docker-compose file + docker stack deploy, each service has a name which is used as host for interface. Everything starts successfully and works fine. After some time (20 mins,1h,..) I am not able to make request to interface. Interface receives my requests but application lost connection with service A,B,C or all of them. If I restart interface, it's able to reconnect to services A,B,C.
I firstly thought it's problem of application so I expose 2 new ports on each service (interface, A,B,C) and connect with profiler and debugger to them. Application is running properly, no leaks, no blocked threads, normally working and waiting for connections. Debugger shows me that when I make a request to interface and interface tries to request service A, Connection reset by peer exception was thrown.
During this debugging I found out interesting stuff. I attached debugger to interface when the services started and also debugger was disconnected after some time. + I was not able to reconnect it, until I made request to the container -> application. PRoblem - handshake failed.
Another interesting stuff that I found out was that I was not able to request neither interface. So I used wireshark to see what's going on and: SYN - ACK was fine. Then application post some data and interface respond with FIN,ACK. I assume that this also happen when interface tries to request service A and it FIN the connection. Codebase of Interface, A,B and C is the same regarding netty server.
Finally, I don't think it's a application issue. Why? I tried to deploy containers not as services. I run each container separately, published the ports of each and endpoint of services were set to localhost. (not overlay network). And it is working. Containers run without problem. + I didn't say at the beginning, that the the java applications (interface, A,B,C) runs without problem when they are running as standalone application - not in docker.
Could you please help me what could be the issue? Why the docker in case of overlay network is closing sockets?
I am using newest docker. I used also older.

Finally, I was able to solve the problem.
What was happening, one more time. Interface opens permanent TCP connection to A,B,C. When you try to run these services A,B,C as a standalone java applications, everything is working. When we dockerize them and run in swarm, it was working only few minutes. Strange was that the connection between Interface and another service was interrupted in the moment when you made a request from client to interface.
After many many unsuccessful tests and debugging each container I tried to run each docker container separately, with mapped ports and as endpoint I specified localhost. (each container exposed ports and interface was connecting to localhost) Funny thing happen, it was working. When you run containers like this, different network driver for container is used. Bridge one. If you run it in swarm, overlay network driver is used.
So it had to be something with the docker network, not with application itself. Next step was tcpdump from each container after couple of minutes, when it should stop working. It was very interesting.
Client -> Interface (OK, request accepted)
Interface ->(forward request because it belongs to A) A
Interface -> A [POST]
A -> Interface [RESET]
A was reseting opened TCP communication after couple of minutes without communication. Why?
Docker uses IP Virtual Server and IPVS maintains its own connection table. The default timeout for CLOSE_WAIT connections in IPVS table is 60 seconds. Hence when the server sends something after 60 seconds, the IPVS connection is no longer available and the packet looks invalid for a new TCP session and gets RST. On the client side, the connection remains forever in FIN_WAIT2 state because the app still has the socket open; kernel's fin_wait timer kicks in only for orphaned TCP sockets.
This is what I read about it and how understand it. I am not sure if my explanation of problem is correct, but based on these assumptions I implemented ping-pong between Interface and A,B,C services in case there is no communication for <60seconds. And, it’s working.

Got the same issue.
Specified
endpoint_mode: dnsrr
to properties of the service which plays "server" role and it works just fine.
https://forums.docker.com/t/tcp-timeout-that-occurs-only-in-docker-swarm-not-simple-docker-run/58179

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart