How do ThingsBoard servers in a cluster communicate with each other?

Currently, I am doing some R&D on the ThingsBoard IoT platform. I am planning to deploy it in cluster mode.
When it is deployed, how do two ThingsBoard servers communicate with each other?
I have this question in mind because a particular device can send a message to one ThingsBoard server (A), but the message might actually need to be transferred to another server (B), since a node on server B is processing that particular device's messages (as I understand it, ThingsBoard nodes use a device hash to handle messages).
How does Kafka forward that message accordingly in a cluster?
I read the official documentation and did some googling, but couldn't find an exact answer.

ThingsBoard uses ZooKeeper for service discovery.
Each ThingsBoard microservice knows which other services are running somewhere in the cluster.
All communication is performed through message queues (Kafka is a good choice).
Each topic has several partitions, and each partition is assigned to a particular node.
Messages for a device are hashed by originator ID and always pushed to the same partition number. There is no direct communication between nodes.
If some nodes crash, or the cluster is simply scaled up or down, ZooKeeper fires a repartition event on each node, and the existing partitions are reassigned according to the live node count. The device service follows the same logic.
That is all the magic. Simple and effective. Hope it helps with the ThingsBoard cluster architecture.
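To make the routing concrete, here is a minimal Python sketch of the idea described above. This is not ThingsBoard's actual code; the partition count and the round-robin ownership scheme are assumptions for illustration.

```python
# Minimal sketch of hash-based partition routing (not ThingsBoard's code).
# A device's messages always land on the same partition; which node owns
# that partition is recomputed from the live node count on repartition.
import hashlib

PARTITION_COUNT = 12  # hypothetical number of partitions per topic

def partition_for(originator_id: str, partitions: int = PARTITION_COUNT) -> int:
    """Stable hash of the originator ID -> constant partition number."""
    digest = hashlib.md5(originator_id.encode()).digest()
    return int.from_bytes(digest[:4], "big") % partitions

def assign_partitions(live_nodes: int, partitions: int = PARTITION_COUNT) -> dict:
    """Hypothetical round-robin ownership, recomputed on each repartition event."""
    return {p: p % live_nodes for p in range(partitions)}

p = partition_for("sensor-42")      # same partition every time
print(p, assign_partitions(3)[p])   # owning node with 3 live nodes
print(p, assign_partitions(2)[p])   # owning node after one node crashes
```

The key property is that the device-to-partition mapping never changes; only the partition-to-node ownership is recomputed when the repartition event fires.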

Related

On-prem docker swarm deployment with HA

I’m doing on-prem deployments using docker swarm and I need application and DB high availability.
As far as application HA is concerned, it works great within docker (service discovery and load balancing), but I'm not sure how to use it on my network. I mean, how can I assign a virtual IP to all of my docker managers so that if any of them goes down, that virtual IP automatically points to another docker manager in the cluster? I don't want a single point of failure in my architecture, which is why I'm not inclined to use any (single) reverse-proxy solution in front of my swarm cluster (because, to my understanding, if nginx/HAProxy goes down, the whole system goes into the abyss. I would love to learn that I'm wrong).
Secondly, I use WebSockets in my application for push notifications, which don't behave normally with all the load-balancing stuff because the socket handshakes get disrupted.
I want a solution to these problems without writing anything HA-specific and non-generic (like hard-coding IPs) in code. Any suggestions? I hope I explained my problem correctly.
Docker Flow Proxy or Traefik can be placed on the set of swarm nodes that you want to receive incoming traffic, using DNS routing to get packets to the correct containers. Both have a sticky-sessions option (I know Docker Flow does; not sure about Traefik).
Then you can do one of the following:
1. If your incoming connections are just client HTTP/S requests, use DNS round robin with multiple A records, which works great, or
2. Buy an expensive hardware fault-tolerant reverse proxy like an F5, or
3. Use some network-layer IP failover at the OS and physical-network level (not really related to Docker), but I'm not sure how well that would work with Swarm.
Number 2 is the typical solution in private datacenters that need full HA at all layers.
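To illustrate option 1, here is a minimal Python sketch of what multiple A records give you on the client side (the hostname swarm.example.com is hypothetical): the resolver returns every A record, and the client can fail over to the next address when one manager is down.

```python
# Minimal sketch of client-side failover over DNS round robin.
import socket

def connect_with_failover(host: str, port: int) -> socket.socket:
    """Try each resolved A record in turn until one node accepts."""
    last_error = None
    for family, socktype, proto, _, addr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        try:
            return socket.create_connection(addr[:2], timeout=5)
        except OSError as exc:  # this node is down; try the next record
            last_error = exc
    raise last_error or OSError("no addresses resolved")

# sock = connect_with_failover("swarm.example.com", 443)
```

Note that this only helps new connections; clients holding a dead connection still have to reconnect, which is also why sticky sessions matter for WebSockets.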

Docker Swarm - Route a request to ALL containers

Is there any sort of way to broadcast an incoming request to all containers in a swarm?
EDIT: More info
I have a distributed application with many docker containers. The client can send requests to the swarm and have it respond. However, in some cases, the client needs to change a state on all server instances and therefore I would either need to be able to broadcast a message or have all the Docker containers talk to each other similar to MPI, which I'm trying to avoid.
There is no built-in way to turn a unicast packet into a multicast packet, nor any common third-party way of doing it (that I've seen or heard of).
I'm not sure what "change a state on all server instances" means. Are we talking about the running state on all containers in a single service?
Or the actual underlying OS? All containers on all services? etc.
Without knowing more about your use case, I'd say it's likely better to design something where the request is received by one Swarm service, and then it's stored in a queue system where a backend worker would pick it up and "change the state on all server instances" for you.
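As one way to realize that queue-based design, here is a minimal sketch using Redis pub/sub as the broker. Redis is just one example of a queue system; the channel name, host, and the apply_state_change helper are hypothetical.

```python
# Minimal sketch of queue-based fan-out: one Swarm service publishes the
# state change once, and a worker inside every container applies it.
import redis  # pip install redis

r = redis.Redis(host="redis", port=6379)  # hypothetical service name

def publish_state_change(payload: str) -> None:
    """Called by the single service that received the client's request."""
    r.publish("state-updates", payload)

def run_worker() -> None:
    """Runs inside every container that must apply the state change."""
    pubsub = r.pubsub()
    pubsub.subscribe("state-updates")
    for message in pubsub.listen():
        if message["type"] == "message":
            apply_state_change(message["data"].decode())

def apply_state_change(new_state: str) -> None:
    print("applying state:", new_state)  # placeholder for real logic
```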
It depends on your specific use case. One way to do it is to run docker service update --force, which will cause all containers to restart. If your containers fetch the changed information at startup, this would have the required effect.

Cluster in MQTT for IoT and Push Notification

I have started reading some details about the MQTT protocol and its implementation. I came across the term 'cluster' a lot. Can anyone help me understand what 'cluster' means for the MQTT protocol?
In this comparison of various MQTT brokers, there is a column for the term 'cluster'.
Forwarding messages with topic bridging will not result in a true MQTT broker cluster, and it leads to the drawbacks outlined above.
A true MQTT broker cluster is a distributed system that represents one logical MQTT broker. A cluster consists of multiple individual MQTT broker nodes, which are typically installed on separate physical or virtual machines and connected over a network.
Typical advantages of MQTT broker clusters include:
Elimination of the single point of failure
Load distribution across multiple cluster nodes
The ability for clients to resume sessions on any cluster node
Scalability
Resilience and fault tolerance - especially useful in cloud environments
I recommend this blog post if you're looking for a more detailed explanation.
A cluster is a collection of MQTT brokers set up to bridge all topics between each other, so that a client can connect to any one of the cluster members and still publish messages to, and receive messages from, all other clients no matter which cluster member they are connected to.
A few things to be aware of:
Topic bridge loops, where a message is published to one cluster member, which forwards it to another cluster member, then another, and finally back to the original. If this happens, the original broker has no way to know that it was the one that first pushed the message to the other cluster members, so the message can end up in a loop. Shared message-state databases, or using a single bridging replication broker, can fix this.
Persistent subscriptions/sessions: unless the brokers have a pooled session cache, clients will not retain their session or subscription state if they connect to a different cluster member when reconnecting.
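To illustrate the session caveat, here is a minimal sketch using the paho-mqtt 1.x client API (the broker hostnames are hypothetical). With clean_session=False and a stable client_id, a broker keeps the subscription and queued QoS ≥ 1 messages across reconnects, but in a cluster that only works if the member the client reconnects to actually shares that session state.

```python
# Minimal sketch of a persistent MQTT session against a broker cluster
# (paho-mqtt 1.x API; hostnames are hypothetical).
import paho.mqtt.client as mqtt

CLUSTER = ["broker1.example.com", "broker2.example.com"]

def on_message(client, userdata, msg):
    print(msg.topic, msg.payload)

# clean_session=False + a fixed client_id asks the broker to keep the
# session (subscriptions and queued QoS >= 1 messages) between connects.
client = mqtt.Client(client_id="sensor-42", clean_session=False)
client.on_message = on_message

for host in CLUSTER:  # try each cluster member until one is reachable
    try:
        client.connect(host, 1883, keepalive=60)
        break
    except OSError:
        continue
else:
    raise RuntimeError("no cluster member reachable")

client.subscribe("devices/sensor-42/cmd", qos=1)
client.loop_forever()
```

Without a shared (pooled) session store, reconnecting to a different member silently starts from scratch, which is exactly the second caveat above.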

Clustering setups?

I am new to ejabberd clustering. I have been trying to set up an ejabberd cluster for the past week, but I still haven't got it working.
After the clustering setup I got output like running db nodes = ['ejabberd@first.example.com','ejabberd@second.example.com'], which is fine so far.
After that I logged into the PSI+ client with the credentials username: one@first.example.com and password: xxxxx.
Then I stopped the ejabberd@first.example.com node, and my PSI+ client went down too.
So why does it not automatically connect to my second server, ejabberd@second.example.com?
How can I achieve ejabberd clustering such that, if one node crashes, another node maintains the connection automatically?
Are you trying to set up one cluster, or to federate two clusters? If it's just one cluster, the nodes should share the same domain (either first.example.com or second.example.com).
Also, when there is a node failure your client must reconnect (I'm not sure what PSI does), and you need to have all the nodes in your cluster behind a VIP so that the reconnect attempt will find the next available node.

Ejabberd Clustering understanding

Let's assume I have two ejabberd servers, X and Y, built from the same source, and I set up ejabberd clustering for those servers by using this. Now consider two users, A and B, who are connected to server X. Both A and B are in the ONLINE state via server X. Suppose server X gets shut down or crashes due to some issue. In this scenario, do A and B go to the OFFLINE state, or do A and B stay in the ONLINE state, handled by server Y? I don't know whether my thinking is right or not. I'd be grateful if anyone could give me a suggestion about it.
If you have nodes in different physical locations, you should set them up as separate clusters (even if a cluster has only one node) and federate them. Clustering should only be done at the datacenter level, since there are mnesia transactional locks between all nodes in a cluster (e.g. when creating a MUC room).
"Load balancing" is not what you are describing in your question.
In load balancing, incoming connections are distributed in a balanced fashion over multiple nodes. This is so that no one server has too high a load (hence the name "load balancing"). It also provides failover capability if your load balancer is smart enough to detect and remove dead nodes.
A smart load balancer can make it so that new connections always succeed as long as there is at least one working node in your cluster. However, in your question, you talk about clients "maintaining the connection". That's something quite different.
To do that, you'd either need the connection to be stateless or you'd need each client to connect to all nodes. That's not how XMPP works: it's a stateful connection to a single server. You must rely on your clients to reconnect if they get disconnected.
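To illustrate that reconnect logic, here is a minimal sketch (plain TCP, not a full XMPP client; the hostnames are hypothetical): the client holds a single stateful connection and, on failure, walks the node list to establish a new one and rebuild its session.

```python
# Minimal sketch of client-side reconnect across cluster nodes.
import socket
import time

NODES = [("first.example.com", 5222), ("second.example.com", 5222)]

def connect_to_any_node() -> socket.socket:
    """Loop over the nodes until one accepts the connection."""
    while True:
        for host, port in NODES:
            try:
                return socket.create_connection((host, port), timeout=5)
            except OSError:
                continue  # this node is down; try the next one
        time.sleep(2)     # all nodes down; back off and retry

sock = connect_to_any_node()
# ...run the session; when the socket dies, call connect_to_any_node()
# again and re-authenticate, since the server-side connection state
# died with the old node.
```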
