Perl or any script to fail over Solace VPNs between data centers - solace

I have multiple VPNs on a Solace appliance. We need to fail over all VPNs at the same time from the production to the DR data center. Is there any built-in script I can use, or can you suggest how to develop such a script?
Thanks,
Ramesh

A controlled replication switch-over is intended to be a manual activity, so there is no built-in script to complete it. It is intended to be manual because you must wait until the replication queue is fully drained before completing the switch-over. If the message VPNs are using XA transactions, you may also need to heuristically roll back or commit any prepared transactions that were on the formerly active site.
The steps to perform a controlled switch-over are below. In the example, NY_Appliance1 is the initially active appliance and NJ_Appliance1 is the initially standby appliance. The example shows only one message VPN named Trading_VPN, but the steps must be repeated for each message VPN. If multiple VPNs need to be switched over at the same time, it may be easier to use SolAdmin or the Solace WebUI.
Verify that the replication bridge is bound to the replication queue:
Run "show message-vpn Trading_VPN replication" on each replication site. The remote bridge state should be "Up" for the replication-active site and the local bridge state should be "Up" for the replication-standby site
NY_Appliance1> show message-vpn Trading_VPN replication
Message VPN A C B R Q S M T
-------------------------------- - - - - - - - - -
Trading_VPN U A - U U - N A
NJ_Appliance1> show message-vpn Trading_VPN replication
Message VPN A C B R Q S M T
-------------------------------- - - - - - - - - -
Trading_VPN U S U - - - N A
Switch the currently replication-active Message VPNs to standby.
NY_Appliance1(configure)# message-vpn Trading_VPN
NY_Appliance1(configure/message-vpn)# replication state standby
Repeat for each Message VPN.
Allow any in-progress messages or transactions from the formerly replication-active Message VPN to arrive at the corresponding Message VPN on its replication mate. Allowing all messages and transactions to propagate to the standby Message VPN prevents the loss of asynchronously replicated messages and transactions.
NY_Appliance1(configure)# show queue #MSGVPN_REPLICATION_DATA_QUEUE message-vpn Trading_VPN
Name : #MSGVPN_REPLICATION_DATA_QUEUE
Message VPN : Trading_VPN
...
Current Messages Spooled : 1
Current Spool Usage (MB) : 0.0006
...
The system administrator should not configure the Message VPN on the other replication mate (NJ_Appliance1) as replication-active until "Current Messages Spooled" is 0 on the replication queue of the Message VPN that was just switched to replication standby.
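If you want to automate this wait, the check can be scripted by polling the same "show queue" command. Below is a minimal, unofficial sketch in Python using paramiko, assuming the appliance accepts CLI commands over an SSH exec channel (if it only offers an interactive shell, adapt it with invoke_shell() or drive the SEMP management API instead); the hostname, credentials and VPN name are placeholders.

import re
import time

import paramiko


def replication_queue_depth(host, user, password, vpn):
    """Return 'Current Messages Spooled' for the VPN's replication queue."""
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(host, username=user, password=password)
    try:
        cmd = "show queue #MSGVPN_REPLICATION_DATA_QUEUE message-vpn %s" % vpn
        _, stdout, _ = ssh.exec_command(cmd)
        output = stdout.read().decode()
    finally:
        ssh.close()
    match = re.search(r"Current Messages Spooled\s*:\s*(\d+)", output)
    if not match:
        raise RuntimeError("Could not parse queue depth from:\n" + output)
    return int(match.group(1))


def wait_until_drained(host, user, password, vpn, poll_seconds=5):
    """Block until the replication queue for 'vpn' reports 0 spooled messages."""
    while replication_queue_depth(host, user, password, vpn) > 0:
        time.sleep(poll_seconds)


# Example: only activate Trading_VPN on NJ_Appliance1 after this returns.
# wait_until_drained("ny-appliance1.example.com", "admin", "secret", "Trading_VPN")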
If the Message VPN is using XA transactions, there may be some prepared transactions on the formerly active site that need to be heuristically committed or rolled back. Only prepared transactions have to be addressed. Transactions in other states can be ignored.
To decide whether an XA transaction should be committed or rolled back, check the logs or the state of the transaction manager on the application side.
NY_Appliance1> show transaction message-vpn Trading_VPN state PREPARED replicated detail
Switch the formerly replication-standby Message VPN to replication-active.
NJ_Appliance1(configure)# message-vpn Trading_VPN
NJ_Appliance1(configure/message-vpn)# replication state active
Repeat for each message VPN.
If you previously heuristically completed transactions, you should delete them to free up the resources. You must always delete the completed transactions on the formerly active site.
solace(admin/message-spool)# delete-transaction xid <xid>

Thanks for the detailed steps.
My concern is that I have around 300 VPNs. If I follow the manual process it will take more than 15 hours to fail over all of them, so I am looking for an alternative way to fail over that can save time.
Thanks,
Ramesh
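There is no supported built-in script, but the CLI steps above can be driven from a script so that hundreds of VPNs are handled in a loop rather than by hand. The sketch below is a rough, unofficial outline in Python with paramiko, under the same assumptions as the earlier polling sketch (SSH CLI access via an exec channel); the hostnames, credentials, VPN list and imported helper module name are placeholders, and any prepared XA transactions still need to be reviewed manually before activating the standby side.

import paramiko

from wait_for_drain import wait_until_drained  # polling helper sketched above (placeholder module name)

VPNS = ["Trading_VPN"]  # e.g. load your ~300 VPN names from a file
ACTIVE = {"host": "ny-appliance1.example.com", "user": "admin", "pw": "secret"}
STANDBY = {"host": "nj-appliance1.example.com", "user": "admin", "pw": "secret"}


def run_cli(box, commands):
    """Send a list of CLI commands to an appliance over one SSH exec channel."""
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh.connect(box["host"], username=box["user"], password=box["pw"])
    try:
        _, stdout, _ = ssh.exec_command("\n".join(commands))
        return stdout.read().decode()
    finally:
        ssh.close()


for vpn in VPNS:
    # 1. Switch the currently replication-active VPN to standby on NY_Appliance1.
    run_cli(ACTIVE, ["enable", "configure",
                     "message-vpn %s" % vpn,
                     "replication state standby"])
    # 2. Wait until #MSGVPN_REPLICATION_DATA_QUEUE for this VPN is fully drained.
    wait_until_drained(ACTIVE["host"], ACTIVE["user"], ACTIVE["pw"], vpn)
    # 3. Switch the VPN to replication-active on NJ_Appliance1.
    run_cli(STANDBY, ["enable", "configure",
                      "message-vpn %s" % vpn,
                      "replication state active"])
    print("Switched over", vpn)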

Related

CANopen auto-addressing with LSS, how to architect the system

I am new to CANopen and need to architect a system with the following characteristics:
1 CANopen master (also a gateway)
multiple CANopen slave nodes, consisting of multiple instances of the same device (with unique serial numbers, as required by LSS)
I would like to design this device to not require any pre-configuration before connecting it to the bus, and also to allow devices that were previously connected to another CANopen bus (and therefore had a previous node ID) to be seamlessly connected to a new bus (so their node IDs should not persist after a reboot).
After learning about CANopen and the LSS service, I think a good solution would be:
The device has no persistent node ID, and at every boot it needs to be addressed by the master through LSS
The master will periodically scan for and address new nodes through the LSS service (allowing device hot-plug)
If for any reason the master reboots, it can re-detect all already-addressed nodes through a simple node scan (an SDO info upload from all addresses)
Now my questions:
It is not clear to me how to have an "invalid CANopen node-ID" (referenced here: https://www.can-cia.org/can-knowledge/canopen/cia305/) when the devices boot. If a device has no initial node ID (and therefore only replies to the LSS addressing service), it should be completely silent on the bus, not even sending a boot-up message when powered (which is not CANopen compliant), until it gets addressed by the LSS service. But if I give it any default initial node ID, it would cause collisions when multiple nodes are powered on simultaneously (which will be the normal behaviour at every system boot-up: all devices, including the master, are powered on at the same time). Is it valid to have a CANopen device "unaddressed" and silent like this and still be CANopen compliant? How should I handle this case?
I read that node ID 0 means broadcast. Does that mean my master could ask all (addressed) nodes for their info (through an SDO upload) with just one command (an SDO upload on node ID 0)? Or is that not allowed, and should I query all 127 addresses on the bus to remap the network?
Thanks
I hope I understood your questions correctly because they are a bit long:
Question 1
Yes, it is CANopen compliant to have a node which has no Node-ID. That's what the LSS service is for. As long as the LSS master has not assigned a Node-ID to the slave, you are not able to talk to the slave via SDO requests. PDO communication is also not possible in the unconfigured state.
Question 2
The node ID 0 broadcast is only available for the NMT master command. That means the CANopen master can set the NMT state of all nodes in the system at the same time. SDO communication is only possible between the master and one slave, so you have to query every node individually.
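To make the per-node query concrete, here is a minimal sketch (not a full CANopen stack) using the python-can library on a SocketCAN interface named "can0" (both are assumptions). It sends an expedited SDO upload request for object 0x1000 (device type) to every possible node ID and records which nodes answer, which is one way for a rebooted master to rediscover already-addressed nodes.

import can


def scan_nodes(bus, timeout=0.05):
    """Probe node IDs 1..127 with an SDO read of 0x1000 and return responders."""
    found = []
    for node_id in range(1, 128):
        # Expedited SDO upload (read) request: index 0x1000, sub-index 0.
        request = can.Message(
            arbitration_id=0x600 + node_id,
            data=[0x40, 0x00, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00],
            is_extended_id=False,
        )
        bus.send(request)
        reply = bus.recv(timeout)
        # Skip unrelated traffic until the matching SDO response (0x580 + ID) or timeout.
        while reply is not None and reply.arbitration_id != 0x580 + node_id:
            reply = bus.recv(timeout)
        if reply is not None:
            found.append(node_id)
    return found


bus = can.Bus(interface="socketcan", channel="can0")
print("Nodes that answered:", scan_nodes(bus))
bus.shutdown()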

emqx MQTT broker doesn’t persist session after restart

I'm using the emqx broker and I want to persist sessions on disk so that I can recover them if the broker reboots for any reason.
What I do:
start the emqx broker with a docker-compose file:
emqx1:
  image: emqx/emqx:v4.0.0
  environment:
    - EMQX_NAME=emqx
    - EMQX_NODE__NAME=emqx.local.node
    - EMQX_HOST=node1.emqx.io
    - EMQX_CLUSTER__DISCOVERY=static
    - EMQX_RETAINER__STORAGE_TYPE=disc
  volumes:
    - emqx-data:/opt/emqx/data
    - emqx-etc:/opt/emqx/etc
    - emqx-log:/opt/emqx/log
  ports:
    - 18083:18083
    - 1883:1883
    - 8081:8081
  networks:
    gateway-api:
      aliases:
        - node1.emqx.io
start a Go subscriber client with the Paho MQTT lib with the following config. The code of the client can be found in the "stdinpub" and "stdoutsub" folders in the Paho repo:
clientId = "sub1"
qos = 1
clean = false
topic_subscribe = "topic1"
start a Go publish client with this config and publish a message:
clientId = ""
clean = true
and the message:
qos = 1
retain = false
topic = "topic1"
payload = "test"
then I disconnect the client "sub1" and send a 2nd message with qos=1:
qos = 1
retain = false
topic = "topic1"
payload = "test2"
This message is not delivered to the client "sub1", so the broker queues it (qos=1). Indeed, if I restart the sub1 client, it does get the message "test2".
But if I reboot the broker before restarting the client "sub1", then "test2" gets lost and is not delivered.
I tried the same test with retain set to true, and the message "test2" is delivered even after the broker is rebooted. So the broker persists retained messages on disk, but not the client session.
Any idea why? Is there a configuration I should change to persist the client session on disk?
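For reference, here is a minimal Python equivalent of the persistent-session subscriber setup described above, using paho-mqtt 1.x against a broker assumed to be on localhost:1883; it mirrors the clientId/clean/qos settings listed rather than the actual Go client.

import paho.mqtt.client as mqtt


def on_connect(client, userdata, flags, rc):
    print("Connected, rc =", rc)
    client.subscribe("topic1", qos=1)


def on_message(client, userdata, msg):
    print("Received:", msg.topic, msg.payload.decode())


# clean_session=False asks the broker to keep the session for client "sub1".
client = mqtt.Client(client_id="sub1", clean_session=False)
client.on_connect = on_connect
client.on_message = on_message
client.connect("localhost", 1883)
client.loop_forever()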
As hashed out in the comments.
Client session storage is a feature only available in the "Enterprise" paid-for version of emqx, not the free-to-use version.
This can be seen from the feature list and from issues 1 & 2, which also ask about the feature.
Retainer message storage on disk:
# etc/plugins/emqx_retainer.conf
## Where to store the retained messages.
retainer.storage_type = disc_only
The EMQ X open source product does not support the persistence of messages within the server, which is an architectural design choice. First of all, the core problem solved by EMQ X is connection and routing; secondly, we believe that built-in persistence is a wrong design.
Traditional MQ servers with built-in message persistence, such as the widely used JMS server ActiveMQ, are redesigning the persistence part in almost every major version. There are two problems with the design of built-in message persistence:
How to balance the use of memory and disk? Message routing is memory-based, while message storage is disk-based.
Under the multi-server distributed cluster architecture, how to place the Queue and how to replicate the messages of the Queue?
Kafka made the right design choice for the above problems: a message server built entirely around a distributed, disk-based commit log.
Because EMQ X separates message routing and message storage responsibilities in its design, data replication, disaster-recovery backup and even application integration can be implemented flexibly at the data level.
In EMQ X Enterprise Edition products, you can persist messages to databases such as Redis, MongoDB, Cassandra, MySQL, PostgreSQL, and message queues such as RabbitMQ and Kafka through a rule engine or plug-in.

EMQ MQTT clustering: client session management

I've gone through the documentation for emq clustering but I couldn't find a clear explanation of how session management is done. I understand the topics table is shared among clustered nodes, but client connection information is not? What happens if a node goes down? Would it lose all the info about the client sessions it was managing?
The route is shared in the cluster, and the session is stored on the connected node. If the node goes down and the client reconnects, the session will be lost.
https://docs.emqx.io/tutorial/latest/en/backend/whats_backend.html
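Since the session lives only on the node the client was connected to, a client should not assume its subscriptions survive a node failure. One defensive pattern is to check the "session present" flag on reconnect and re-subscribe when the broker did not restore the session; a minimal sketch with paho-mqtt 1.x follows, where the broker address and topic are placeholders.

import paho.mqtt.client as mqtt


def on_connect(client, userdata, flags, rc):
    # If the node holding our session went down, the broker reports no stored
    # session ("session present" is 0) and we must subscribe again ourselves.
    if not flags.get("session present"):
        client.subscribe("some/topic", qos=1)


client = mqtt.Client(client_id="sub1", clean_session=False)
client.on_connect = on_connect
client.connect("localhost", 1883)
client.loop_forever()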

Can MQTT (such as Mosquitto) be used so that a published topic is picked up by one, and only one, of the subscribers?

I have a system that relies on a message bus and broker to spread messages and tasks from producers to workers.
It benefits from being able to do true pub/sub-type communication for the messages.
However, it also needs to communicate tasks. These should be done by a worker and reported back to the broker when/if the worker is finished with the task.
Can MQTT be used to publish this task by a producer, so that it is picked up by a single worker?
In my mind, the producer would publish the task with a topic "TASK_FOR_USER_A", and there are a number of workers subscribed to that topic.
The MQTT broker would then determine that it is a task and send it selectively to one of the workers.
Can this be done or is it outside the scope of MQTT brokers such as Mosquitto?
MQTT v5 has an optional extension called Shared Subscriptions which delivers messages to a group of subscribers in a round-robin fashion, so each message will only be delivered to one member of the group.
Mosquitto v1.6.x has implemented MQTT v5 and the shared subscription capability.
It's not clear what you mean by 1 message at a time. Messages will be delivered as they arrive and the broker will not wait for one subscriber to finish working on a message before delivering the next message to the next subscriber in the group.
If you have low-level enough control over the client, you can withhold the high-QoS responses so the client does not acknowledge the message, forcing the broker to allow only one message to be in flight at a time, which effectively throttles message delivery. You should only do this if message processing is very quick, to prevent the broker from deciding delivery has failed and attempting to deliver the message to another client in the shared group.
Normally the broker will not do any routing above and beyond that based on the topic. That said, as mentioned in a comment on this answer, Flespi has implemented "sticky sessions" so that messages from a specific publisher are delivered to the same client in the shared subscription pool, but this is a custom add-on and not part of the spec.
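To illustrate the shared-subscription approach with the topic from the question, here is a minimal worker sketch in Python using paho-mqtt 1.5+ with MQTT v5 against a Mosquitto 1.6+ broker assumed to be on localhost; the group name "workers" is arbitrary. Every worker subscribes to the same $share topic, and the broker delivers each task to only one of them.

import paho.mqtt.client as mqtt


def on_connect(client, userdata, flags, reason_code, properties=None):
    # Every worker joins the same shared subscription group "workers";
    # the broker hands each published task to exactly one group member.
    client.subscribe("$share/workers/TASK_FOR_USER_A", qos=1)


def on_message(client, userdata, msg):
    print("Worker got task:", msg.payload.decode())


client = mqtt.Client(client_id="worker-1", protocol=mqtt.MQTTv5)
client.on_connect = on_connect
client.on_message = on_message
client.connect("localhost", 1883)
client.loop_forever()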
What you're looking for is a message broker for a producer/consumer scenario. MQTT is a lightweight messaging protocol based on the pub/sub model. If you start using any MQTT broker for this, you might face issues depending on your use case. A few issues to list:
You need ordering of the messages (the consumer must get the messages in the same order the producer published them). While QoS 2 guarantees message order without shared subscriptions, shared subscriptions do not provide ordered-topic guarantees.
The consumer gets the message but fails before processing it, and the MQTT broker has already acknowledged delivery. In this case, the consumer needs to specifically handle the reprocessing of failed messages.
If you go with a single topic with multiple subscribers, you must have idempotency in your consumer.
I would suggest going with a message broker suited to this purpose, e.g. Kafka or RabbitMQ, to name a few.
As far as I know, MQTT is not meant for this purpose. It doesn't have any internal mechanism to distribute tasks across workers (consumers). On the other hand, AMQP can be used here. One hack would be to configure the workers to accept only a particular type of task, but that requires the producers to send the task type as well. In this case, you won't be able to scale well either.
It's better if you explore other protocols for this type of usecase.

WebSphere MQ recovery of in-flight messages after failover

Does WebSphere MQ v7 guarantee the recovery of in-flight messages after failover to a standby queue manager?
If so, how is this accomplished?
Thanks
There are two primary types of standby instances which support this level of recovery. The first is in a traditional hardware cluster such as Power HA, HACMP, Veritas, MSCS and so forth. The other is a Multi-Instance Queue Manager (MIQM). Both of these are capable of running the queue manager on more than one server with data and log files occupying shared disk which is accessible to all instances.
In both cases, persistent messages that were committed prior to the termination of the primary QMgr will be recovered. The secondary QMgr assumes possession of the data and log files during the failover event. From the perspective of the failover node, it is the same as if the QMgr were just starting up after a shutdown or crash; it just happens to now be running on a different server.
The main difference between a hardware cluster and an MIQM is that a hardware cluster fails over the IP address, and possibly non-MQ processes as well. The MIQM recovers only the MQ processes and comes up on a different IP address. Applications with V7 clients can be configured with multi-instance connection details to allow for the multiple IP addresses.
So for these solutions in which the state of the QMgr and any in-flight messages is stored on shared disk, bringing the QMgr up with the same shared disk but on a different node recovers the state of the QMgr, including any in-flight messages.
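On the client side, one common way to take advantage of this with a multi-instance queue manager is to list both instances in the connection name so the client can reach whichever one currently owns the shared data. A minimal sketch using the pymqi Python client follows; the queue manager name, channel, hostnames and queue are placeholders, and JMS or C clients would achieve the same thing with a CCDT or an MQSERVER-style connection list.

import pymqi

queue_manager = "QM1"
channel = "APP.SVRCONN"
# Connection name list with both instances of the multi-instance queue manager;
# the client tries the first address and falls back to the second after failover.
conn_info = "mqhost1.example.com(1414),mqhost2.example.com(1414)"

qmgr = pymqi.connect(queue_manager, channel, conn_info)
queue = pymqi.Queue(qmgr, "APP.REQUEST.QUEUE")
queue.put("hello after failover")
queue.close()
qmgr.disconnect()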
