Apart from the Master, Which part can be used as a SYNC message sender in CANopen? - can-bus

I encountered a problem, the message that goes from the NMT to the nodes for synchronization is usually done through the Master in CANopen.
Can another part of the network do this?

First to clear out some terms:
NMT = Network Management, a protocol used in CANopen.
Master = NMT master, the node responsible for supervision and initiating state changes of other nodes. Outside of NMT, there is no master.
Therefore the sentence "the message that goes from the NMT to the nodes" doesn't make any sense, and has nothing to do with the SYNC message either.
The SYNC feature is a feature of its own, related to PDOs rather than NMT. Nodes have SYNC producer or SYNC consumer capabilities. For convenience, it is often the NMT Master acting as SYNC producer, but this isn't a requirement by the standard. Any node supporting the SYNC producer feature can act as one.
(Not to be confused with Heartbeat producer/consumer, which is part of NMT.)
For reference see Object Dictionary entries 1005h to 1007h, CiA 301 chapter 7.5.2.
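To make the SYNC producer role concrete, here is a minimal sketch in Python of what a SYNC producer actually puts on the bus (no real CAN hardware involved; the default COB-ID 080h comes from object 1005h, and the optional one-byte counter is the feature configured via the sync counter overflow object):

```python
# Sketch of the CANopen SYNC frame per CiA 301 (assumptions: default COB-ID
# 080h from object 1005h; counter range 1..240 per the overflow object).
SYNC_COB_ID = 0x080  # default value of object 1005h

def build_sync_frame(counter=None):
    """Return (cob_id, data) for a SYNC message.

    Without a counter the data length is 0; with a counter the frame
    carries exactly one byte. Any node supporting the SYNC producer
    feature sends this frame cyclically -- nothing here is NMT-specific.
    """
    if counter is None:
        return SYNC_COB_ID, b""
    if not 1 <= counter <= 240:
        raise ValueError("SYNC counter must be in 1..240")
    return SYNC_COB_ID, bytes([counter])
```

Note how small the message is: that is what makes it cheap for any capable node, master or not, to act as the SYNC producer.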


Distributed computing in a network - Framework/SDK

I need to build a system that consists of:
Nodes, where each node can accept one input.
The node that received the input shares it with all nodes in the network.
Each node does a computation on the input (the same computation, but each node has a different database, so the results differ per node).
The node that received the input consolidates each node's result and applies logic to determine the overall result.
This result is returned to the caller.
It's very similar to a map-reduce use case, except there will only be a few nodes (maybe 10~20), so solutions like Hadoop seem like overkill.
Do you know of any simple framework/sdk to build:
Network (discovery, maybe gossip protocol)
Distribute a task/data to each node
Aggregate the results
Can be in any language.
Thanks very much
Regards,
fernando
OK, to begin with: there are many ways to do this. I would suggest the following if you are just starting to tackle this architecture:
Pub/Sub with Broker
Programs like RabbitMQ are meant to easily allow variable numbers of nodes to connect and speak to one another. Most importantly, they allow for transparency and observability. You can easily ask the broker which nodes are connected and even view messages in transit. Basically they are a 'batteries included' means of dealing with a large number of clients.
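To show the shape of the pattern (not RabbitMQ's actual API), here is a toy in-process broker in Python; a real broker adds network transport, persistence, and acknowledgements on top of exactly this topic-to-subscribers routing:

```python
from collections import defaultdict

class Broker:
    """Toy in-process pub/sub broker. Illustrates the pattern only;
    RabbitMQ adds network transport, durability and acknowledgements."""

    def __init__(self):
        self._subs = defaultdict(list)  # topic -> list of subscriber callbacks

    def subscribe(self, topic, callback):
        self._subs[topic].append(callback)

    def publish(self, topic, message):
        # fan out the message to every subscriber of the topic
        for cb in self._subs[topic]:
            cb(message)

    def subscriber_count(self, topic):
        # the observability point made above: ask the broker who is listening
        return len(self._subs[topic])
```

In your use case, the node that received the input would publish to a "work" topic, and every compute node's subscription callback would run its own database query against the shared input.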
Brokerless (Update)
I was looking for a more 'symmetric' architecture, where each node is the same and there is no centralized broker/queue manager.
You can use a brokerless Pub/Subs, but I personally avoid them. While they have tooling, it is hard to understand their registration protocols if something odd happens. I generally just use Multicast as it is very straight forward, especially if each node has just one network interface, and you can extend/modify behavior just with routing infra.
Here is how your scheme would work with Multicast:
All nodes join a known multicast address (e.g. 239.1.2.3:8000)
All nodes would need to respond to a 'who's here' message
All nodes would need a 'do work' API, either via multicast or from consumer to node (with the node address grabbed from the 'who's here' response)
You would need to define these messages yourself, but given how short I expect them to be, it should be pretty simple.
The 'who's here' message from the consumer could just be a message containing a single binary zero.
The 'who's here' response could just be a 1 followed by the node's information (making it a TLV would probably be best, though).
I'm not sure whether each node takes unique arguments, so I can't say how to design your 'do work' message or response.
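A sketch of the wire format suggested above, in Python (the type bytes and field contents are hypothetical, just to show the zero-byte probe and the 1-plus-TLV response round-tripping):

```python
import struct

# Hypothetical discovery wire format, per the scheme above:
#   'who's here' probe -> a single zero byte
#   response           -> 0x01 followed by TLV fields (type:1, length:1, value)
WHOS_HERE = b"\x00"

def encode_response(fields):
    """fields: list of (type_byte, value_bytes) pairs, e.g. name, address."""
    out = bytearray(b"\x01")
    for t, v in fields:
        out += struct.pack("BB", t, len(v)) + v
    return bytes(out)

def decode_response(data):
    """Parse a response back into its (type, value) fields."""
    assert data[0] == 0x01, "not a 'who's here' response"
    fields, i = [], 1
    while i < len(data):
        t, ln = data[i], data[i + 1]
        fields.append((t, data[i + 2 : i + 2 + ln]))
        i += 2 + ln
    return fields
```

The consumer would multicast `WHOS_HERE`, collect responses for a short window, and then address nodes directly for the 'do work' exchange.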

Canopen auto addressing with LSS, how to architect the system

I am new to Canopen and need to architect a system with the following characteristics:
1 CANopen master (also a gateway)
multiple CANopen slave nodes, consisting of multiple instances of the same device (with unique serial numbers, as required by LSS)
I would like to design this device so that it requires no pre-configuration before connecting it to the bus, and so that devices previously connected to another CANopen bus (and therefore holding an old node-ID) can be seamlessly connected to a new bus (their node-IDs should not persist across a reboot).
After learning about CANopen and the LSS service, I think a good solution would be:
The device has no persistent node-ID, and at every boot it must be addressed by the master through LSS
The master periodically scans for and addresses new nodes through the LSS service (allowing device hot-plug)
If for any reason the master reboots, it can re-detect all already-addressed nodes through a simple node scan (an SDO info upload on all addresses)
Now my questions:
It is not clear to me how to have an "invalid CANopen node-ID" (referenced here: https://www.can-cia.org/can-knowledge/canopen/cia305/) at boot. If the device has no initial node-ID (and therefore only replies to the LSS addressing service), it should be completely silent on the bus, not even sending a boot-up message when powered (which is not CANopen compliant) until it gets addressed by the LSS service. But if I give it any default initial node-ID, it would cause collisions when multiple nodes are powered on simultaneously (which will be the normal behaviour at every system boot-up: all devices, including the master, are powered on at the same time). Is it valid for a CANopen device to be "unaddressed" and silent like this and still be CANopen compliant? How should I handle this case?
I read that node-ID 0 means broadcast. Does that mean my master could ask for all (addressed) nodes' info (through an SDO upload) with just one command (an SDO upload on node-ID 0)? Or is that not allowed, so that I should query all 127 addresses on the bus to remap the network?
Thanks
I hope I understand your questions correctly, because they are a bit long:
Question 1
Yes, it is CANopen compliant to have a node with no node-ID; that is what the LSS service is for. As long as the LSS master has not assigned a node-ID to the slave, you are not able to talk to the slave via SDO requests. PDO communication is also not possible in the unconfigured state.
Question 2
The node-ID 0 broadcast is only available for the NMT master command, meaning the CANopen master can set the NMT state of all nodes in the system at the same time. SDO communication only takes place between the master and one slave, so you have to query every node individually.
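A sketch of what such a per-node scan sends, in Python (frames only, no bus I/O; object 1000h, the device type, is a mandatory entry, which makes it a common probe, though any mandatory object would do):

```python
import struct

def sdo_upload_request(node_id, index=0x1000, subindex=0x00):
    """Build the 8-byte SDO 'initiate upload' (read) request for one node.

    The request COB-ID is 600h + node-ID; command byte 40h asks the
    server to upload the addressed object. Index is little-endian,
    and the last four bytes are unused (zero) in the request.
    """
    cob_id = 0x600 + node_id
    data = struct.pack("<BHB4x", 0x40, index, subindex)
    return cob_id, data

def node_scan_frames():
    # No SDO broadcast exists, so the master probes every possible
    # node-ID (1..127) individually and notes which ones answer.
    return [sdo_upload_request(n) for n in range(1, 128)]
```

Nodes that respond with an SDO upload reply are alive and addressed; silent IDs are free or unconfigured.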

akka stream ActorSubscriber does not work with remote actors

http://doc.akka.io/docs/akka-stream-and-http-experimental/1.0-M2/scala/stream-integrations.html says:
"ActorPublisher and ActorSubscriber cannot be used with remote actors, because if signals of the Reactive Streams protocol (e.g. request) are lost the the stream may deadlock."
Does this mean akka stream is not location transparent? How do I use akka stream to design a backpressure-aware client-server system where client and server are on different machines?
I must have misunderstood something. Thanks for any clarification.
They are strictly a local facility at this time.
You can connect it to a TCP sink/source, though, and it will apply back-pressure using TCP as well (that's what Akka HTTP does).
How do I use akka stream to design a backpressure-aware client-server system where client and server are on different machines?
Check out streams in Artery (Dec. 2016, so 18 months later):
The new remoting implementation for actor messages was released in Akka 2.4.11 two months ago.
Artery is the code name for it. It’s a drop-in replacement to the old remoting in many cases, but the implementation is completely new and it comes with many important improvements.
(Remoting enables Actor systems on different hosts or JVMs to communicate with each other)
Regarding back-pressure, this is not a complete solution, but it can help:
What about back-pressure? Akka Streams is all about back-pressure but actor messaging is fire-and-forget without any back-pressure. How is that handled in this design?
We can’t magically add back-pressure to actor messaging. That must still be handled on the application level using techniques for message flow control, such as acknowledgments, work-pulling, throttling.
When a message is sent to a remote destination it's added to a queue that the first stage, called SendQueue, is processing. This queue is bounded and if it overflows the messages will be dropped, which is in line with the actor messaging at-most-once delivery nature. Large amounts of messages should not be sent without application level flow control. For example, if serialization of messages is slow and can't keep up with the send rate this queue will overflow.
Aeron will propagate back-pressure from the receiving node to the sending node, i.e. the AeronSink in the outbound stream will not progress if the AeronSource at the other end is slower and the buffers have been filled up.
If messages are sent at a higher rate than what can be consumed by the receiving node the SendQueue will overflow and messages will be dropped. Aeron itself has large buffers to be able to handle bursts of messages.
The same thing will happen in the case of a network partition. When the Aeron buffers are full messages will be dropped by the SendQueue.
In the inbound stream the messages are in the end dispatched to the recipient actor. That is an ordinary actor tell that will enqueue the message in the actor’s mailbox. That is where the back-pressure ends on the receiving side. If the actor is slower than the incoming message rate the mailbox will fill up as usual.
Bottom line, flow control for actor messages must be implemented at the application level. Artery does not change that fact.
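The SendQueue behaviour described above can be sketched in a few lines of Python (capacity and counters here are illustrative, not Akka's actual implementation): the key design point is that on overflow the sender is never blocked; the message is simply dropped, preserving at-most-once semantics.

```python
from collections import deque

class SendQueue:
    """Sketch of Artery-style outbound queueing: bounded, and dropping
    on overflow instead of back-pressuring the sending actor."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()
        self.dropped = 0  # messages lost to overflow

    def offer(self, msg):
        if len(self.queue) >= self.capacity:
            self.dropped += 1   # overflow: drop, never block the sender
            return False
        self.queue.append(msg)
        return True

    def poll(self):
        # the outbound stream stage drains from here toward Aeron
        return self.queue.popleft() if self.queue else None
```

This is why application-level flow control (acks, work-pulling, throttling) remains necessary: nothing upstream of this queue ever slows down automatically.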

Is this the right way of building an Erlang network server for multi-client apps?

I'm building a small network server for a multi-player board game using Erlang.
This network server uses a local instance of Mnesia DB to store a session for each connected client app. Inside each client's record (session) stored in this local Mnesia, I store the client's PID and NODE (the node where a client is logged in).
I plan to deploy this network server on at least 2 connected servers (Node A & B).
So in order to allow a Client A who is logged in on Node A to search (query Mnesia) for a Client B who is logged in on Node B, I replicate the Mnesia session table from Node A to Node B, or vice versa.
After Client A queries the PID and NODE of the Client B, then Client A and B can communicate with each other directly.
Is this the right way of establishing connection between two client apps that are logged-in on two different Erlang nodes?
Creating a system where two or more nodes are perfectly in sync is by definition impossible. In practice however, you might get close enough that it works for your particular problem.
You don't say the exact reason behind running on two nodes, so I'm going to assume it is for scalability. With many nodes, your system will also be more available and fault-tolerant if you get it right. However, the problem could be simplified if you know you only ever will run in a single node, and need the other node as a hot-slave to take over if the master is unavailable.
To establish a connection between two processes on two different nodes, you need some global addressing (user id 123 is pid <123,456,0>). If you also care about only one process for User A running at a time, you also need a lock, or must allow only unique registrations in the addressing. If you also want to grow, you need a way to add more nodes, either while your system is running or when it is stopped.
Now, there are already some solutions out there that helps solving your problem, with different trade-offs:
gproc in global mode allows registering a process under a given key (which gives you addressing and locking). This is distributed to the entire cluster with no single point of failure; however, the leader election (at least when I last looked at it) works only for nodes that were available when the system started. Adding new nodes requires an experimental version of gen_leader or stopping the system. Within your own code, if you know two players are only ever going to talk to each other, you could start them on the same node.
riak_core, allows you to build on top of the well-tested and proved architecture used in riak KV and riak search. It maps the keys into buckets in a fashion that allows you to add new nodes and have the keys redistributed. You can plug into this mechanism and move your processes. This approach does not let you decide where to start your processes, so if you have much communication between them, this will go across the network.
Using mnesia with distributed transactions allows you to guarantee that every node has the data before the transaction is committed. This would give you distribution of the addressing and locking, but you would have to do everything else on top of it (like releasing the lock). Note: I have never used distributed transactions in production, so I cannot tell you how reliable they are. Also, being distributed, expect latency. Note 2: You should check exactly how you would add more nodes and have the tables replicated, for example whether it is possible without stopping mnesia.
ZooKeeper/Doozer/roll your own provides a centralized, highly available database which you may use to store the addressing. In this case you would need to handle unregistering yourself. Adding nodes while the system is running is easy from the addressing point of view, but you need some way for your application to learn about the new nodes and start spawning processes there.
Also, it is not necessary to store the node, as the pid contains enough information to send the messages directly to the correct node.
As a cool trick which you may already be aware of, pids may be serialized (as may all data within the VM) to a binary. Use term_to_binary/1 and binary_to_term/1 to convert between the actual pid inside the VM and a binary which you may store in whatever accepts binary data without mangling it in some stupid way.
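The addressing idea behind the riak_core option can be sketched in a few lines (Python here for neutrality; this is plain modulo hashing, heavily simplified, with no ring or handoff, and the node names are made up):

```python
import hashlib

def owner_node(key, nodes):
    """Deterministically map a key (e.g. a user id) to one node.

    Because the mapping is a pure function of (key, node list), every
    node computes the same answer locally: 'where does user X live?'
    needs no lookup service at all.
    """
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]
```

Real systems use consistent hashing instead of modulo so that adding a node only moves a small fraction of keys; the principle of deriving the owner from the key is the same.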

Erlang remote procedure call module internals

I have several Erlang applications on Node A, and they make rpc calls to Node B, on which I have Mnesia stored procedures (database querying functions) as well as my Mnesia DB. Occasionally, the number of simultaneous processes making rpc calls to Node B for data can rise to 150. Now, I have several questions:
Question 1: For each rpc call to a remote node, does Node A make a completely new connection (say TCP/IP or UDP, or whatever they use at the transport layer)? Or is there only one connection that all rpc calls share (since Node A and Node B are connected [something to do with that epmd process])?
Question 2: If I have data-centric applications on one node and a centrally managed Mnesia database on another, and these applications' tables share the same schema (which may be replicated, fragmented, indexed, etc.), which is the better option: to use rpc calls to fetch data from data nodes to application nodes, or to develop a whole new framework, say over TCP/IP (the way the Scalaris guys did for their failure detector), to cater for network latency problems?
Question 3: Has anyone out there ever tested or benchmarked rpc call efficiency in a way that can answer the following?
(a) What is the maximum number of simultaneous rpc calls an Erlang node can push onto another without breaking down?
(b) Is there a way of increasing this number, either by a system configuration or an operating system setting? (refer to OpenSolaris for x86 in your answer)
(c) Is there any other way for applications to request data from Mnesia running on remote Erlang nodes besides rpc? (say CORBA, REST [requires HTTP end-to-end], Megaco, SOAP, etc.)
Mnesia runs over Erlang distribution, and in Erlang distribution there is only one TCP/IP connection between any pair of nodes (usually in a full-mesh arrangement, so one connection for every pair of nodes). All rpc/inter-node communication happens over this distribution connection.
Additionally, it's guaranteed that message ordering is preserved between any pair of communicating processes over distribution. Ordering between more than two processes is not defined.
Mnesia gives you a lot of options for data placement. If you want your persistent storage on node B, but processing done on node A, you could have disc_only_copies of your tables on B and ram_copies on node A. That way applications on node A can get quick access to data, and you'll still get durable copies on node B.
I'm assuming that the network between A and B is a reliable LAN that is seldom going to partition (otherwise you're going to spend a bunch of time getting mnesia back online after a partition).
If both A and B are running mnesia, then I would let mnesia do all the RPC for me - this is what mnesia is built for and it has a number of optimizations. I wouldn't roll my own RPC or distribution mechanism without a very good reason.
As for benchmarks, this is entirely dependent on your hardware, mnesia schema and network between nodes (as well as your application's data access patterns). No one can give you these benchmarks, you have to run them yourself.
As for other RPC mechanisms for accessing mnesia, I don't think there are any out of the box, but there are many RPC libraries you could use to present the mnesia API to the network with a small amount of effort on your part.
