How to appropriately determine minPeers for the readiness endpoint of Hyperledger Besu, so as to detect a network in DOWN status - hyperledger-besu

Hyperledger Besu configuration:
6 validators, 3 Tx nodes
When all the nodes are up, the number of peers connected is 6, for UP status.
When the network is down, what is the optimal minPeers value to report DOWN status?
How is the logic for the minPeers count arrived at?
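One way to reason about the threshold (a sketch, not official Besu guidance): for a BFT consensus such as IBFT2/QBFT, a node that cannot reach enough validator peers to form a quorum can be considered DOWN. The quorum arithmetic below assumes a BFT consensus and 6 validators as in the question:

```python
def bft_fault_tolerance(validators: int) -> int:
    """Maximum Byzantine faults f an IBFT2/QBFT network tolerates: n >= 3f + 1."""
    return (validators - 1) // 3

def suggested_min_peers(validators: int) -> int:
    """A validator needs 2f other validators reachable (plus itself) to
    collect the 2f + 1 messages a consensus round requires, so below
    2f peers the node can no longer help finalise blocks."""
    return 2 * bft_fault_tolerance(validators)

# 6 validators -> f = 1 -> a readiness threshold of minPeers = 2
print(suggested_min_peers(6))  # -> 2
```

With a value like that, the readiness endpoint can be queried as `GET /readiness?minPeers=2` and Besu reports not-ready when the check fails; verify the parameter name and response codes against your Besu version's documentation.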

Related

CANopen auto-addressing with LSS: how to architect the system

I am new to CANopen and need to architect a system with the following characteristics:
1 CANopen master (also a gateway)
multiple CANopen slave nodes, composed of multiple instances of the same device (with unique serial numbers, as required by LSS)
I would like to design this device to not require any pre-configuration before connecting it to the bus, and also to allow devices that were previously connected to another CANopen bus (and therefore had a previous node ID) to be seamlessly connected to a new bus (so their node IDs should not persist after a reboot).
After learning about CANopen and the LSS service, I think a good solution would be:
The device has no persistent node ID, and at every boot it needs to be addressed by the master through LSS.
The master will periodically scan and address new nodes through the LSS service (allowing device hot-plug).
If for any reason the master reboots, it can re-detect all already-addressed nodes through a simple node scan (an SDO info upload across all addresses).
Now my questions:
It is not clear to me how to have an "invalid CANopen node ID" (referenced here: https://www.can-cia.org/can-knowledge/canopen/cia305/) at boot. If the device has no initial node ID (and therefore only replies to the LSS addressing service), it should be completely silent on the bus, not even sending a boot-up message when powered (which is not CANopen compliant) until it gets addressed by the LSS service. But if I give it any default initial node ID, it would cause collisions when multiple nodes are powered on simultaneously (which will be the normal behaviour at every system boot-up: all devices, including the master, will be powered on at the same time). Is it valid to have a CANopen device "unaddressed" and silent like this, and still be CANopen compliant? How should this case be handled?
I read that node ID 0 means broadcast, so does that mean my master could ask for all (addressed) node infos (through an SDO upload) with just one command (an SDO info upload on node ID 0)? Or is that not allowed, so that I should query all 127 addresses on the bus to remap the network?
Thanks
I hope I have understood your questions, because they are a bit long:
Question 1
Yes, it is CANopen compliant to have a node which has no node ID. That is what the LSS service is for. As long as the LSS master has not assigned a node ID to the slave, you are not able to talk to the slave via SDO requests. PDO communication is also not possible in the unconfigured state.
Question 2
The ID 0 broadcast is only available for the NMT master command. That means the CANopen master can set the NMT state of all nodes in the system at the same time. SDO communication is only available between the master and one slave, so you have to ask every node individually.
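The per-node SDO scan described above can be sketched as follows. This is a transport-agnostic illustration: `sdo_read` is a hypothetical callback standing in for your CAN stack's SDO client, and object 0x1018 (the identity object) is a convenient thing to probe, since every CANopen device must implement it.

```python
def scan_nodes(sdo_read, node_ids=range(1, 128)):
    """Probe every possible CANopen node ID (1..127) with an SDO upload.

    SDO is point-to-point, so node ID 0 (broadcast) cannot be used;
    each ID must be queried individually. `sdo_read(node_id, index,
    subindex)` should return the read value, or raise TimeoutError
    when no device answers at that ID.
    """
    found = {}
    for nid in node_ids:
        try:
            # 0x1018 sub-index 1 is the vendor ID of the identity object
            found[nid] = sdo_read(nid, 0x1018, 1)
        except TimeoutError:
            continue  # no configured device at this node ID
    return found

# usage with a fake bus where only nodes 3 and 42 answer
def fake_sdo_read(nid, index, sub):
    if nid in (3, 42):
        return 0x1234  # pretend vendor ID
    raise TimeoutError

print(scan_nodes(fake_sdo_read))  # -> {3: 4660, 42: 4660}
```

In a real stack the timeout per silent ID dominates the scan time, which is why masters often cache the last known node map and only fall back to a full 1..127 sweep after a reboot.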

What factors determine "catch-up speed" on IoT Edge after a network outage?

I have a customer with IoT Edge deployed to manufacturing plants in remote areas with spotty internet. They have leaf devices sending messages to IoT Edge and then on to IoT Hub. They frequently have small outages (5, 10, 15 minutes). They often need to make timely decisions based on the data that makes it to IoT Hub from the plants. They've noticed that after a 15-minute outage, it can take anywhere from 15-30 minutes for IoT Edge to catch up.
Besides network speed itself, what are the factors that would influence that? For example:
- If we were hitting throttling based on their number of IoT Hub units, would that be surfaced in the edgeHub logs?
- If disk, network, etc. can keep up, does edgeHub pretty much upload data as fast as possible (given throttling), or are there other limits imposed by default?
- What is the default connection retry policy in edgeHub? Is it the same exponential backoff policy as in the C# SDK? If so, could it be that after a 15-minute outage, edgeHub takes a while after network recovery to 'try again'? If so, is that policy configurable in edgeHub (via an environment variable or something)?
Any other things to check?
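To illustrate the backoff concern in the last bullet: a capped exponential retry policy (the general shape used by the Azure IoT device SDKs; edgeHub's actual defaults are something to verify, not something asserted here) produces a delay sequence like the one below, and a client that is already sitting at the cap when the network recovers may wait the full cap before its next attempt.

```python
def backoff_delays(base=1.0, cap=60.0, retries=8):
    """Capped exponential backoff: the delay doubles on every failed
    attempt until it hits `cap`. After a long outage the client may
    already be at the cap, so the first retry after the network
    recovers can still be a full `cap` seconds away."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]

print(backoff_delays())  # -> [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

Real SDK policies usually add random jitter on top of this so reconnecting clients do not stampede the hub at the same instant.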

How do ThingsBoard servers in a cluster communicate with each other?

Currently, I am doing some R&D on the ThingsBoard IoT platform. I am planning to deploy it in cluster mode.
When it is deployed, how do two ThingsBoard servers communicate with each other?
This question came to mind because a particular device can send a message to one ThingsBoard server (A), but the message might actually need to be transferred to another server (B), since a node on server B is processing that particular device's messages (as I understand it, ThingsBoard nodes use a device hash to route messages).
How does Kafka forward that message accordingly in a cluster?
I read the official documentation and did some googling, but couldn't find exact answers.
ThingsBoard uses Zookeeper for service discovery.
Each ThingsBoard microservice knows which other services run somewhere in the cluster.
All communication goes through message queues (Kafka is a good choice).
Each topic has several partitions, and each partition is assigned to a particular node.
A message for a device is hashed by originator ID and always pushed to the same partition number. There is no direct communication between nodes.
If some nodes crash or are simply scaled up/down, Zookeeper fires a repartition event on each node, and the existing partitions are reassigned according to the live node count. The device service follows the same logic.
That is all the magic. Simple and effective. I hope this helps with the ThingsBoard cluster architecture.
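The hash-to-partition routing described above can be sketched like this. The MD5-based hash is an illustration chosen because it is stable across processes and restarts, not ThingsBoard's actual hash function:

```python
import hashlib

def partition_for(originator_id: str, num_partitions: int) -> int:
    """Map a device (originator) ID to a fixed partition. Every message
    from the same device lands on the same partition, so the node that
    currently owns that partition handles all of that device's messages
    and no direct node-to-node forwarding is ever needed."""
    digest = hashlib.md5(originator_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# the mapping is stable: the same device always hits the same partition
assert partition_for("device-123", 12) == partition_for("device-123", 12)
```

When the node count changes, only the partition-to-node assignment changes; the device-to-partition mapping stays fixed, which is why a repartition event on each node is enough to rebalance the cluster.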

Clustering setups?

I am new to ejabberd clustering. I have been trying to set up an ejabberd cluster for the past week, but I still haven't got it working.
1. After the clustering setup I got output like running db nodes = ['ejabberd@first.example.com','ejabberd@second.example.com'], which is fine so far.
After that I logged in with the PSI+ client using the credentials username: one@first.example.com and password: xxxxx.
Then I stopped the ejabberd@first.example.com node, and my PSI+ client also went down.
So why does it not automatically connect to my second server, ejabberd@second.example.com?
How can I achieve ejabberd clustering such that, if one node crashes, another node maintains the connection automatically?
Are you trying to set up one cluster, or federate two clusters? If just one cluster, they should share the same domain (either first.example.com or second.example.com).
Also, when there's a node failure, your client must reconnect (not sure what PSI does), and you need to have all nodes in your cluster behind a VIP so the reconnect attempt will find the next available node.

Ejabberd Clustering understanding

Let's assume I have two ejabberd servers, X and Y, which run the same source, and I set up ejabberd clustering for those servers. Now assume A and B are users connected to server X. Both A and B are in the ONLINE state and connected via server X. Suppose server X gets shut down or crashes due to some issue. In this scenario, do A and B go into the OFFLINE state, or do they stay ONLINE, handled by server Y? I don't know whether my thinking is right or not. I would appreciate any suggestions about it.
If you have nodes in different physical locations, you should set them up as separate clusters (even if it's a cluster of 1 node) and federate them. Clustering should only be done at datacenter level since there are mnesia transactional locks between all nodes in a cluster (e.g. creating a MUC room).
"Load balancing" is not what you are describing in your question.
In load balancing, incoming connections are distributed in a balanced fashion over multiple nodes. This is so that no one server has too high a load (hence the name "load balancing"). It also provides fail-over capability if your load balancer is smart enough to detect and remove dead nodes.
A smart load balancer can make it so that new connections always succeed as long as there is at least one working node in your cluster. However, in your question, you talk about clients "maintaining the connection". That's something quite different.
To do that, you'd either need the connection to be stateless or you'd need each client to connect to all nodes. That's not how XMPP works: it's a stateful connection to a single server. You must rely on your clients to reconnect if they get disconnected.
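The client-side behaviour both answers point at (reconnect after failure, trying each cluster node in turn, much as a VIP or DNS round-robin would) can be sketched as follows; `try_connect` is a placeholder for whatever connect call your XMPP library provides:

```python
def reconnect(nodes, try_connect, max_rounds=3):
    """Cycle through the cluster's nodes until one accepts the
    connection. XMPP sessions are stateful and bound to a single
    server, so after a node failure the client itself must
    re-establish the session; the server cannot hand it over."""
    for _ in range(max_rounds):
        for node in nodes:
            try:
                return try_connect(node)  # returns a session on success
            except ConnectionError:
                continue  # dead node, try the next one
    raise ConnectionError("no node in the cluster is reachable")
```

With a VIP in front of the cluster the loop collapses to retrying a single address, because the load balancer does the dead-node skipping for you.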
