Clustering setups? - erlang

I am new to ejabberd clustering. I have been trying to set up an ejabberd cluster for the past week, but I still have not gotten it working.
1. After the clustering setup I got output like running db nodes = ['ejabberd@first.example.com','ejabberd@second.example.com'], which is fine so far.
After that I logged in with the PSI+ client using the credentials username: one@first.example.com and password: xxxxx.
Then I stopped the ejabberd@first.example.com node, and my PSI+ client went down as well.
So why does it not automatically connect to my second server, ejabberd@second.example.com?
How can I achieve ejabberd clustering so that if one node crashes, another node maintains the connection automatically?

Are you trying to set up one cluster, or federate two clusters? If just one cluster, they should share the same domain (either first.example.com or second.example.com).
Also, when there's a node failure, your client must reconnect (not sure what PSI does), and you need to have all nodes in your cluster behind a VIP so the reconnect attempt will find the next available node.
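If you want to sanity-check the cluster membership from an Erlang shell, here is a minimal sketch (the node names are the ones from your output; the -remsh invocation is an assumption about how you attach to the node):

    %% attach a remote shell to one of the ejabberd nodes, e.g.:
    %%   erl -sname debug -remsh ejabberd@first.example.com
    %% then list the Mnesia nodes that are currently up:
    mnesia:system_info(running_db_nodes).
    %% => ['ejabberd@first.example.com','ejabberd@second.example.com']

If the second node disappears from that list when you stop it, the cluster itself is working; client-side failover still depends on the reconnect-plus-VIP setup described above.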

Related

How do ThingsBoard servers in a cluster communicate with each other?

Currently, I am doing some R&D on the ThingsBoard IoT platform. I am planning to deploy it in cluster mode.
When it is deployed, how do two ThingsBoard servers communicate with each other?
I ask because a device can send a message to one ThingsBoard server (A), but the message might actually need to be transferred to another server (B), since a node on server B is processing that particular device's messages (as I understand it, ThingsBoard nodes use a device hash to route messages).
How do Kafka streams forward that message accordingly in a cluster?
I read the official documentation and did some googling, but couldn't find exact answers.
ThingsBoard uses ZooKeeper for service discovery.
Each ThingsBoard microservice knows which other services are running somewhere in the cluster.
All communication happens through message queues (Kafka is a good choice).
Each topic has several partitions, and each partition is assigned to a particular node.
Messages for a device are hashed by originator id and always pushed to the same partition number. There is no direct communication between nodes.
In case some nodes crash, or the cluster is simply scaled up or down, ZooKeeper fires a repartition event on each node, and the existing partitions are reassigned according to the live node count. The device service follows the same logic.
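To make the hashing concrete, here is a minimal language-agnostic sketch (written in Erlang to match the rest of this page; ThingsBoard itself is Java, and this is not its actual code):

    -module(partitioning).
    -export([partition/2]).

    %% same originator id -> same partition, as long as the
    %% partition count stays the same
    partition(OriginatorId, NumPartitions) ->
        erlang:phash2(OriginatorId, NumPartitions).

    %% after a repartition event the only input that changes is
    %% NumPartitions, so every node recomputes the same mapping:
    %% partition(<<"device-123">>, 10) vs. partition(<<"device-123">>, 8)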
That is all the magic. Simple and effective. Hope it helps with the ThingsBoard cluster architecture.

Need help setting up a dev/test Corda Network with docker

I want to set up an environment where I have several VMs, representing several partners, and where each VM hosts one or more nodes. Ideally, I would use Kubernetes to bring my environment up and down. I have understood from the docs that this has to be done as a dev network, not as my own compatibility zone or anything.
However, the steps to follow are not clear (to me). I have used Dockerform and the Docker image provided, but this does not seem to be the way to do what I need.
My current (it changes with the hours) understanding is that:
a) I should create a network between the VMs that will be hosting nodes. To do so, I understand I should use Cordite or the bootstrapper JAR. The Cordite documentation seems clearer than the Corda docs, but I haven't been able to try it yet. Should one or the other be my first step? Can anyone shed some light on how?
b) Once I have my network created, I need a certifying entity (thanks @Chris_Chabot for pointing it out!).
c) The next step should be running deployNodes so I create the config files. Here, I am not sure whether I can indicate in deployNodes at which IPs the nodes should be created, or whether I just need to create the Dockerfiles, certificate folders and so on, and distribute them across the VMs accordingly. I am not sure either about how to point to the network service.
Personally, I guess that I will not use the Dockerfiles if I am going to use Kubernetes, and that I only need to distribute the certificates and config files to all the slave VMs so they are available to the nodes when they are launched.
To be clear, and honest :D, this is even before including any CorDapp in the containers; I am just trying to have the environment ready. Basically, starting a process that builds the nodes, distributes the config files among the slave VMs, and runs the Docker containers with the nodes. As explained in a comment, the goal here is not testing CorDapps, it is testing how to deploy an operative distributed dev environment.
ANY help is going to be ABSOLUTELY welcome.
Thanks!
(Developer Relations @ R3 here)
A network of Corda nodes needs three things:
- A notary node, or a pool of multiple notary nodes
- A certification manager
- A network map service
The certification manager is the root of the trust in the network, and, well, manages certificates. These need to be distributed to the nodes to declare and prove their identity.
The nodes connect to the network map service, which checks their certificate to see if they have access to the network and, if so, adds them to the list of nodes that it manages, and distributes this list of node identities + IP addresses to all the nodes on that network.
Finally the nodes use the notaries to sign the transactions that happen on the network.
Generally we find that most people start developing on the https://testnet.corda.network/ network, and later deploy to the production corda.network.
One benefit of that is that it already comes with all these pieces (certification manager, network map, and a geographically distributed pool of notaries). The other benefit is that it guarantees interoperability with other parties in the future, as everyone uses the same root certificate authority; with your own network, other third parties couldn't just connect, as they'd be on a different cert chain and couldn't be validated.
If however you have a strong reason to want to build your own network, you can use Cordite to provide the network map and certman services.
In that case step 1 is to go through the setup and configuration instructions on https://gitlab.com/cordite/network-map-service
Once that is fully set up and running, https://docs.corda.net/permissioning.html has some more information on how the certificates are set up, and the "Joining an existing Compatibility Zone" section in https://docs.corda.net/docker-image.html has instructions on how to get a Corda Docker image / node to join that network by specifying which network map / certman URLs to use.
Oh, and on the IP network question: the network map stores a combination of the X.509 identity and the IP address for each node, which it distributes to the network. This means that every participant, including the notaries, certman and network map, needs to be able to connect to those IP addresses, either by all being on the same network that you created, or by having public IP addresses.
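To make the joining step concrete, a minimal node.conf sketch for joining such a zone (the URLs, legal name and port are placeholders; the authoritative key names are in the Corda docs linked above):

    // node.conf (HOCON) -- all values here are illustrative placeholders
    myLegalName = "O=PartnerA, L=Madrid, C=ES"
    p2pAddress = "partner-a.example.com:10200"
    devMode = false
    networkServices {
        doormanURL = "https://nms.example.com"      // certman / doorman
        networkMapURL = "https://nms.example.com"   // network map service
    }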

Monitor Failovercluster roles with Icinga2

I'm using Icinga2 with NSClient++
I have a PowerShell check for certain cluster roles which is installed on every cluster node.
Should a cluster role fail, all cluster nodes would send out identical notifications which will result in a lot of spam for just one actual service problem.
Installing the check on only one cluster node is not an option, as it would create a single point of failure for role monitoring: a failing cluster node should not affect the cluster roles (aside from a short timeout), but I would not be able to check any cluster role as soon as that node is down.
Is it possible to assign a service to a hostgroup in a way that only one notification will be sent if this service fails?
I ended up having the check itself decide whether it should report a problem as critical (the service on the node itself failed) or as warning/OK (the service on another node failed).
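A minimal PowerShell sketch of that idea (the role name is a placeholder and the exit codes follow the usual Nagios/NSClient++ convention; adapt to your actual check):

    # Report CRITICAL only from the node that owns the failed role,
    # WARNING from the other nodes, so only one real alarm is raised.
    Import-Module FailoverClusters

    $role = Get-ClusterGroup -Name "MyClusterRole"   # placeholder name

    if ($role.State -ne "Online") {
        if ($role.OwnerNode.Name -eq $env:COMPUTERNAME) {
            Write-Output "CRITICAL: $($role.Name) is $($role.State) on this node"
            exit 2
        }
        Write-Output "WARNING: $($role.Name) is $($role.State) on $($role.OwnerNode.Name)"
        exit 1
    }
    Write-Output "OK: $($role.Name) is online"
    exit 0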

Ejabberd Clustering understanding

Let's assume I have two ejabberd servers, X and Y, which run the same source, and I set up ejabberd clustering for those servers by using this. Now consider two users, A and B, who are connected to server X. Both A and B are in the ONLINE state and connected via server X. Suppose server X gets shut down or crashes because of some issue. In this scenario, do A and B go to the OFFLINE state, or do A and B stay in the ONLINE state, handled by server Y? I don't know whether my thinking is right or not. I would appreciate any suggestions about it.
If you have nodes in different physical locations, you should set them up as separate clusters (even if it's a cluster of 1 node) and federate them. Clustering should only be done at datacenter level since there are mnesia transactional locks between all nodes in a cluster (e.g. creating a MUC room).
"Load balancing" is not what you are describing in your question.
In load balancing, incoming connections are distributed in a balanced fashion over multiple nodes, so that no one server has too high a load (hence the name "load balancing"). It also provides fail-over capability if your load balancer is smart enough to detect and remove dead nodes.
A smart load balancer can make it so that new connections always succeed as long as there is at least one working node in your cluster. However, in your question, you talk about clients "maintaining the connection". That's something quite different.
To do that, you'd either need the connection to be stateless or you'd need each client to connect to all nodes. That's not how XMPP works: it's a stateful connection to a single server. You must rely on your clients to reconnect if they get disconnected.
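A minimal sketch of that client-side reconnect logic (generic Erlang, not tied to any particular XMPP library; connect/1 and the host names are placeholders):

    -module(reconnect).
    -export([connect_any/1]).

    %% Try each known server in order until one accepts the connection.
    connect_any([]) ->
        {error, no_servers_available};
    connect_any([Host | Rest]) ->
        case connect(Host) of
            {ok, Session}    -> {ok, Session};
            {error, _Reason} -> connect_any(Rest)
        end.

    %% placeholder; swap in your XMPP library's connect call
    connect(_Host) -> {error, not_implemented}.

    %% usage: connect_any(["first.example.com", "second.example.com"]).

Note that even with this, the client starts a fresh session on the surviving node; the old stateful connection is gone.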

Is this the right way of building an Erlang network server for multi-client apps?

I'm building a small network server for a multi-player board game using Erlang.
This network server uses a local instance of Mnesia DB to store a session for each connected client app. Inside each client's record (session) stored in this local Mnesia, I store the client's PID and NODE (the node where a client is logged in).
I plan to deploy this network server on at least 2 connected servers (Node A & B).
So in order to allow a Client A who is logged in on Node A to search (query Mnesia) for a Client B who is logged in on Node B, I replicate the Mnesia session table from Node A to Node B and vice versa.
After Client A queries the PID and NODE of the Client B, then Client A and B can communicate with each other directly.
Is this the right way of establishing connection between two client apps that are logged-in on two different Erlang nodes?
Creating a system where two or more nodes are perfectly in sync is by definition impossible. In practice however, you might get close enough that it works for your particular problem.
You don't say the exact reason behind running on two nodes, so I'm going to assume it is for scalability. With many nodes, your system will also be more available and fault-tolerant if you get it right. However, the problem could be simplified if you know you only ever will run in a single node, and need the other node as a hot-slave to take over if the master is unavailable.
To establish a connection between two processes on two different nodes, you need some global addressing (user id 123 is pid <123,456,0>). If you also care that only one process for user A runs at a time, you also need a lock, or you must allow only unique registrations in the addressing. If you also want to grow, you need a way to add more nodes, either while your system is running or while it is stopped.
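For a concrete feel of addressing plus locking, a minimal sketch using OTP's built-in global module (the {user, Id} key shape is just an illustration):

    %% in an Erlang shell (or inside your session process):
    %% global enforces uniqueness, so a second registration returns no,
    %% which doubles as the "only one process per user" lock.
    case global:register_name({user, 123}, self()) of
        yes -> registered;
        no  -> {error, already_registered}
    end.

    %% any node in the cluster can then look the process up:
    Pid = global:whereis_name({user, 123}).
    Pid ! {msg, <<"hello">>}.

global has its own trade-offs (for example under netsplits), so it is worth comparing the options below.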
Now, there are already some solutions out there that help solve your problem, with different trade-offs:
gproc in global mode allows registering a process under a given key (which gives you addressing and locking; see the sketch after this list). This is distributed to the entire cluster with no single point of failure; however, the leader election (at least when I last looked at it) works only for nodes that were available when the system started. Adding new nodes requires an experimental version of gen_leader or stopping the system. Within your own code, if you know two players are only ever going to talk to each other, you could start them on the same node.
riak_core allows you to build on top of the well-tested and proven architecture used in Riak KV and Riak Search. It maps keys onto buckets in a fashion that allows you to add new nodes and have the keys redistributed. You can plug into this mechanism and move your processes. This approach does not let you decide where to start your processes, so if they communicate heavily, that communication will go across the network.
Using mnesia with distributed transactions allows you to guarantee that every node has the data before the transaction is committed. This would give you distribution of the addressing and locking, but you would have to do everything else on top of it (like releasing the lock). Note: I have never used distributed transactions in production, so I cannot tell you how reliable they are. Also, since they are distributed, expect latency. Note 2: You should check exactly how you would add more nodes and have the tables replicated, for example whether it is possible without stopping mnesia.
ZooKeeper/doozer/roll your own provides a centralized highly-available database which you can use to store the addressing. In this case you would need to handle unregistering yourself. Adding nodes while the system is running is easy from the addressing point of view, but you need some way for your application to learn about the new nodes and start spawning processes there.
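The gproc sketch referenced above (this assumes gproc is running on all nodes with its distributed mode enabled; the {user, Id} key is again just an illustration):

    %% n = unique name, g = global (cluster-wide) scope
    true = gproc:reg({n, g, {user, 123}}).

    %% from any node, find and message the process:
    Pid = gproc:where({n, g, {user, 123}}).
    Pid ! {msg, <<"hello">>}.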
Also, it is not necessary to store the node, as the pid contains enough information to send the messages directly to the correct node.
As a cool trick which you may already be aware of, pids may be serialized (as may all data within the VM) to a binary. Use term_to_binary/1 and binary_to_term/1 to convert between the actual pid inside the VM and a binary which you may store in whatever accepts binary data without mangling it in some stupid way.
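For example (a shell-style sketch; where you store the binary is up to you):

    %% serialize a pid (this works for any Erlang term)
    Bin = term_to_binary(self()).
    %% ...store Bin anywhere that keeps binaries intact, then later:
    Pid = binary_to_term(Bin).
    node(Pid).  %% the pid still knows which node it lives on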
