Node with higher ID always losing arbitration in CAN Bus

I am a beginner with the CAN protocol, and I am referring to Texas Instruments Application Report SLOA101B - Introduction to the Controller Area Network (CAN).
What happens when 2 nodes are continuously sending CAN frames? Will the node sending the frame with the higher CAN ID always lose arbitration?
My understanding is: in the initial arbitration the node with the lower ID wins and sends its data frame, after which the bus sees the 3 recessive bits of the interframe space (IFS); then both nodes again find the bus idle and start arbitration, the node with the lower ID wins again, and so on. This means the node sending the frame with the higher CAN ID always loses arbitration.

Yes, that is correct. But apart from the identifier, arbitration also involves the RTR/SRR bit as well as the IDE (identifier extension) bit. So an 11-bit frame with identifier 0x123 always wins over a 29-bit frame with base identifier 0x123.
Note, though, that continuous, 100% bus load often means there is something seriously wrong with the bus.
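As an illustration, here is a minimal sketch (plain Python, not tied to any real controller) of the bitwise arbitration: both nodes drive their identifier MSB first, the bus behaves like a wired-AND so a dominant 0 overrides a recessive 1, and a node that reads back a dominant level while sending a recessive one drops out:

def arbitrate(id_a, id_b, bits=11):
    """Return the identifier that wins arbitration (dominant 0 beats recessive 1)."""
    for i in range(bits - 1, -1, -1):            # MSB first
        bit_a = (id_a >> i) & 1
        bit_b = (id_b >> i) & 1
        bus = bit_a & bit_b                      # wired-AND bus level
        if bit_a != bus:                         # A sent recessive, read dominant -> A loses
            return id_b
        if bit_b != bus:                         # B sent recessive, read dominant -> B loses
            return id_a
    return id_a                                  # identical IDs must not exist on one bus

print(hex(arbitrate(0x123, 0x456)))              # 0x123 -> the lower ID wins every time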

Related

Who wins arbitration between a standard remote frame (11-bit identifier) and an extended data frame (29-bit identifier)?

So in the case of a CAN bus having both CAN 2.0A and CAN 2.0B nodes, who wins arbitration when the CAN 2.0A node tries to send a remote frame (RTR bit = 1, IDE = 0) and the CAN 2.0B node tries to send a data frame (SRR = 1 and IDE = 1)? I have attached an image below for reference.
Will CAN 2.0A win the arbitration, and if so, how? The IDE bit is dominant in the CAN 2.0A frame, but the CAN 2.0A controller does not include the IDE bit in arbitration, since its arbitration field is just the 11-bit message ID and the RTR bit.
No matter whether 11-bit or 29-bit, everything from the start-of-frame bit up to and including the RTR bit is called the "arbitration phase". Bus arbitration only happens during the transmission of these bits. Therefore a 29-bit frame with the same base identifier as an 11-bit frame always loses arbitration. This is for two reasons:
The SRR (substitute remote request) bit of a 29-bit frame is always recessive (1).
The IDE (identifier extension) bit of an 11-bit frame is always dominant (0). In a 29-bit frame the IDE bit is recessive, and the RTR bit is moved to after the 18-bit identifier extension.
Even though the IDE bit isn't included in the arbitration field of CAN 2.0A controllers, it is always transmitted dominant (0) by them. During the transmission of the IDE bit, the CAN 2.0B controller detects that it has lost arbitration, because it transmits 1 but reads back 0. In that case the CAN 2.0B controller yields, and arbitration is won by the CAN 2.0A controller.
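To make that bit-level comparison concrete, here is a small sketch (plain Python, simplified bit layout) that builds only the bits up to and including IDE for a standard remote frame and an extended data frame with the same base identifier 0x123, and resolves them with the same wired-AND rule:

def std_remote_bits(ident):
    """Bits a CAN 2.0A node drives after SOF, up to and including IDE,
    for a remote frame: 11 ID bits, RTR = 1 (remote), IDE = 0 (standard)."""
    return [(ident >> i) & 1 for i in range(10, -1, -1)] + [1, 0]

def ext_data_bits(base_id):
    """Bits a CAN 2.0B node drives for the same base identifier:
    11 base ID bits, SRR = 1, IDE = 1 (the 18-bit extension and RTR follow later)."""
    return [(base_id >> i) & 1 for i in range(10, -1, -1)] + [1, 1]

std = std_remote_bits(0x123)
ext = ext_data_bits(0x123)

for pos, (a, b) in enumerate(zip(std, ext)):
    bus = a & b                                  # wired-AND: dominant 0 wins
    if a != bus or b != bus:
        loser = "extended data frame" if b != bus else "standard remote frame"
        print(f"bit {pos}: standard sends {a}, extended sends {b} -> the {loser} backs off")
        break
# -> bit 12 (the IDE position): the CAN 2.0A remote frame wins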

CAN bus sending data from two masters with equal balance

I have two master nodes connected to the same CAN bus, both send data to my PC.
first master ID = 0xFFA1
second master ID = 0xFFA2
Since the first master's ID is lower than the second's, it takes control of the bus more often than the second master, and this causes some delay in the data.
Is there a way to do load balancing between the two nodes so that each node sends an almost equal amount of messages?
I tried making the first node send data while switching between two IDs, 0xFFA1 and 0xFFB2,
and having the second node send data with ID 0xFFB1. It didn't help.
There is no such thing as "masters" in CAN, nor in higher layer protocols like CANopen for that matter (a "master" in CANopen is just a supervisor node). Who gets to send what is defined by the CAN identifiers - CAN primarily focuses on data, not nodes. What matters is what is sent, rather than who is sending/receiving, since every message is broadcast.
It sounds as if you have 2 nodes that wildly spam the bus with identifier 0xFFA1 and 0xFFA2 messages as fast as they are able, leading to 100% bus load. Then the node sending 0xFFA2 will "starve". Sending data "as fast as you are able" is never the correct way to use CAN.
Instead you need to define a higher layer protocol that dictates real-time characteristics. In control systems, this is most commonly done by having nodes send data at fixed intervals, such as once per 10ms or 100ms. This alone should fix your starvation problem.
If you want to prevent nodes from sending at the same time, then you could provide a means for them to synchronize. A trick used in CANopen and other protocols is to have one node send out a "sync" message at fixed time intervals.
After reading this sync message, all nodes should act within x ms of receiving it.
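A minimal sketch of the fixed-interval approach, assuming a Linux SocketCAN interface called can0 and the python-can package (both assumptions; the payload here is just a placeholder):

import time
import can

bus = can.Bus(interface="socketcan", channel="can0")    # assumed SocketCAN setup

PERIOD_S = 0.1                                          # one message every 100 ms
while True:
    payload = bytes(8)                                  # put the real signal data here
    # 0xFFA1 does not fit into 11 bits, so it has to be an extended (29-bit) identifier
    msg = can.Message(arbitration_id=0xFFA1, data=payload, is_extended_id=True)
    bus.send(msg)
    time.sleep(PERIOD_S)

python-can also provides bus.send_periodic(msg, 0.1), which hands the cyclic transmission over to a background task so the application does not have to run its own timing loop.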

How a CAN message is re-transmitted after losing arbitration

I understand the CAN arbitration process. But I am very curious about how the node that loses arbitration re-transmits its message until it succeeds.
As far as I know, many CAN messages are sent repeatedly on the CAN bus. For example, Node A and Node B simultaneously send messages every 100 ms.
Assuming Node A has the lower identifier value and Node B the higher one, Node A will always win arbitration and repeatedly send its message on the CAN bus. Since Node A and Node B always send their messages at the same time, it seems Node B will always lose arbitration and its message can never reach the other nodes...
What CAN mechanism is used for this situation?
Node B does not wait for the next 100 ms cycle: a node that loses arbitration automatically switches to receiving and re-transmits as soon as the bus goes idle again, i.e. right after Node A's frame ends, which happens much earlier than 100 ms.
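As a rough back-of-the-envelope check (the 500 kbit/s bit rate and the ~135-bit worst-case stuffed frame length are assumptions, not values from the question):

BIT_RATE = 500_000      # bits per second (assumed bus speed)
FRAME_BITS = 135        # worst-case 8-byte, 11-bit-ID data frame incl. stuff bits (approx.)

frame_time_ms = FRAME_BITS / BIT_RATE * 1000
print(f"Node A occupies the bus for about {frame_time_ms:.2f} ms")   # ~0.27 ms

So Node B's automatic retry starts a few hundred microseconds after it lost arbitration, long before the next 100 ms cycle.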

Missing master heartbeat does not cause node to react in a CANopen system

I have a strange finding about the heartbeat-protocol in CANopen. Maybe somebody else has seen something like this and maybe it is supposed to work like this... Anyway, here's what it's about:
In CANopen there are two timeout-based life-guarding mechanisms: the first is node guarding, which I will not mention further, since it's considered old news.
The other one is called heartbeat. It is pretty simple: any participant on the network sends a regular message stating its node ID and its state. The interval is defined by object 0x1017sub0 and is called the heartbeat producer time. If it is set to zero, no heartbeat is sent.
Any other participant can then define a number of nodes it wants to monitor on the network, plus the maximum time there may be between two consecutive heartbeat messages. This information is stored in object 0x1016sub1..n as 32-bit entries, one for each node this particular node wants to listen to.
Each entry consists of the node ID (bits 22 to 16) and the mentioned maximum time that may elapse between heartbeats, called the heartbeat consumer time (bits 15..0). Again, if an entry is zero, it is ignored.
As you may have gathered, there is no distinction between network-master (node ID 1) and slaves (node IDs 2 to 127).
So much for the theory; now for my problem:
I configure one of the slave nodes in my network as a heartbeat consumer for the master, so there's an entry in object 0x1016sub1 that looks like this: 0x000107D0. This means that a heartbeat message from the master (node ID 1) is expected at least every two seconds (0x07D0 = 2000 ms).
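For reference, here is a small sketch (plain Python, following the bit layout described above) that decodes such a 0x1016 entry:

entry = 0x000107D0                 # consumer heartbeat entry, e.g. object 0x1016sub1

node_id = (entry >> 16) & 0x7F     # bits 22..16: node ID of the monitored heartbeat producer
timeout_ms = entry & 0xFFFF        # bits 15..0: consumer heartbeat time in milliseconds

print(f"monitor node {node_id}, expect a heartbeat at least every {timeout_ms} ms")
# -> monitor node 1, expect a heartbeat at least every 2000 ms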
I have observed that this works in two examples. If I send a master-heartbeat for a time and then stop, the node either returns to pre-operational mode or sends an appropriate emergency-message.
If I don't send any master-heartbeat-messages, I would expect that after I start the node (send it into operational mode) it takes at most two seconds for the node to either return to pre-operational mode or send an appropriate emergency-message or perhaps even both. But in the two examples I tried, nothing happened. If I never send any heartbeat, the node never expects one and just keeps on running.
The two examples are very different from each other. I am not sure whether they use the same CANopen-stack library perhaps.
Is there an explanation?
If you read the CANopen User Manual, section 1.3.1.6, page 39, you will notice that the heartbeat consumer is first activated upon receiving a heartbeat from the producer. I would assume, then, that since in your example the first heartbeat is never sent, the consumer is never activated.

Is spatial search in a P2P network possible?

I want to build a Javascript/HTML5 geolocation-based social network, and I wonder what the best choice of architecture is. Client-server would be simple to develop, but the drawback is that the system resources required could be very high, especially because the application must handle movement (worst case: a user in a car must see the other users around him who are also in cars).
Basically, in a client-server architecture, the server tasks would be:
collect and store the latitude and longitude of the users (there could be thousands of them)
perform a geo-distance search for each user (to get the list of users present around him within a radius)
build and send to the client an XML file with the positions of the users in that list
These 3 operations must be done periodically, every 3 or 5 seconds, because I want a "live" map that shows the users in the list moving in their environment (city, town).
All 3 points could be optimized:
the client sends its position only after moving 10 meters, to reduce the amount of data to process
a "spherical rectangle" search in a MyISAM table with a spatial index (using MBRContains), to reduce the load on the MySQL database
a shared output file: the XML that is sent can be the same for 2 users located within a radius of x meters (the 2 users are close to each other)
It is hard to make a load estimation at this stage, but I think a client-server architecture is not appropriate for this type of application, and peer-to-peer could be a nice answer if 2 clients could communicate when they are near each other.
My point is:
Is there any method that makes it possible for a client to blindly search for other clients located within a certain radius, without the help of a central server? (It is possible with UDP broadcast :-)
edit: Correction. UDP broadcast allows a client to poll any machine, wherever it is, within a certain range of IP addresses.
Thank you for your help,
Florent
You will have to have central peers/servers, because you need to centralize some information to be able to perform your functionality.
I would go for the following:
Assign square miles (or whatever size you want) to specific servers.
Have devices send a 'I am here' message with their coordinates to some dispatcher that will forward these to the correct square mile server for handling.
Have servers register when a device enters a square mile they manage. This could be a central map to make sure a device is registered to one and only one square.
Forward this message to all other devices in the square.
And/or make sure you include which square the message is intended for, and have the device check it before displaying it to the user.
Tune the size of the squares and the rate of the 'I am here' messages. That's it.
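A minimal sketch of that grid routing, assuming a hypothetical cell size, server list, and dispatcher (none of these names come from the answer above):

CELL_DEG = 0.014                            # roughly one mile in latitude degrees (assumed)
SERVERS = ["srv-a", "srv-b", "srv-c"]       # hypothetical square-mile servers

def cell_for(lat, lon):
    """Quantize a position into a grid cell key."""
    return (int(lat // CELL_DEG), int(lon // CELL_DEG))

def server_for(cell):
    """Deterministically map each cell to one and only one server."""
    return SERVERS[(cell[0] * 31 + cell[1]) % len(SERVERS)]

# The dispatcher routes an "I am here" message like this:
cell = cell_for(48.8566, 2.3522)
print(f"position falls in cell {cell}, handled by {server_for(cell)}")
# that server then forwards the message to every other device registered in the cell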
The answer actually depends on many things, so I'll help out with a basic strategy. To work this out you'll need to understand how Kademlia works (Kademlia is a DHT P2P network that stores information).
In Kademlia, at first startup each node picks a random ID, which is a 160-bit number that represents a point in the space of all possible 160-bit IDs.
The ID of the information that needs to be stored is obtained with the SHA-1 function (it receives an arbitrary string and outputs a 160-bit number that is treated as the ID of the information to be stored).
Once you have the ID of the information, you publish it, and the information is physically stored on a node whose ID is close to the information ID.
The information is queried via its ID. Both information lookups and node lookups take O(log N) hops to find the required information. The XOR metric is used in Kademlia (in your case it could be an ordinary Euclidean metric).
Each node maintains an array of buckets; each bucket contains addresses of nodes that are appropriate to that bucket. Appropriateness is a measure of how close the IDs are. Consider this example:
The IDs are 160 bits long; only the first bits are shown below:
Node 1 ID: 1101000101011111101110101001010...
Node 2 ID: 1101011101011111101110101001010...
Node 3 ID: 1101000101011001101110101001010...
Applying the XOR metric to Nodes 1 and 2 (i.e. computing the number that represents the virtual distance between these nodes) gives:
index: 012345678901234    (MSB first)
XOR:   000001100000000... (the first difference is at the 5th bit from the MSB)
Applying the XOR metric to Nodes 1 and 3 gives:
index: 012345678901234    (MSB first)
XOR:   000000000000011... (the first difference is at the 13th bit from the MSB)
Apparently Node 1 is closer to Node 3 than to Node 2, since their IDs first differ at a less significant bit. Therefore, from the point of view of Node 1, its neighbor Node 3 goes into the 13th bucket (a higher index means closer IDs), and Node 2 goes into the 5th bucket, which contains the group of nodes whose IDs differ from the current node's ID at the 5th most significant bit.
Such a data structure allows each node to know its surroundings at 160 different levels of distance.
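Here is that bucket computation as a small sketch (plain Python, no libraries, using the convention above that a higher bucket index means a closer ID; the 16-bit truncation is just for readability):

def bucket_index(own_id, other_id, bits=160):
    """Bucket that other_id falls into, seen from own_id: the number of leading
    (most significant) bits the two IDs share. A higher index means a smaller
    XOR distance, i.e. a closer node."""
    distance = own_id ^ other_id
    if distance == 0:
        return bits                          # identical IDs; does not occur in practice
    return bits - distance.bit_length()      # index of the first differing bit, MSB side

# The three example IDs from above, truncated to their first 16 bits:
node1 = 0b1101000101011111
node2 = 0b1101011101011111
node3 = 0b1101000101011001

print(bucket_index(node1, node2, bits=16))   # 5  -> Node 2 lands in bucket 5
print(bucket_index(node1, node3, bits=16))   # 13 -> Node 3 lands in bucket 13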
Back to your example: to allow efficient geospatial queries you'll need to replace Kademlia's XOR metric with an ordinary Euclidean metric. In this case your IDs would be 2D or 3D vectors, and unfortunately the Euclidean metric yields floating-point distances that are not directly suitable for this type of algorithm, so you will need to convert the coordinates into discrete binary numbers in a way similar to what the XOR scheme does. After that, finding a node's neighboring nodes is a trivial task.
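One common way to do that conversion (my suggestion, not something the answer above prescribes) is a Z-order/Morton code: quantize latitude and longitude to integers and interleave their bits, so that positions that are close on the map tend to share a long ID prefix. A rough sketch:

def interleave_bits(x, y, bits=16):
    """Morton (Z-order) code: interleave the bits of x and y into one integer."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)        # even bit positions come from x
        code |= ((y >> i) & 1) << (2 * i + 1)    # odd bit positions come from y
    return code

def position_id(lat, lon, bits=16):
    """Quantize a lat/lon pair and turn it into a single discrete key."""
    x = min(int((lon + 180.0) / 360.0 * (1 << bits)), (1 << bits) - 1)
    y = min(int((lat + 90.0) / 180.0 * (1 << bits)), (1 << bits) - 1)
    return interleave_bits(x, y, bits)

print(hex(position_id(48.8566, 2.3522)))         # a 32-bit key usable as a DHT-style ID

Nearby positions then share long common prefixes, which is what a Kademlia-style bucket structure needs; the usual caveat is that Z-order curves have locality jumps at cell boundaries, so a radius query still has to check a few neighbouring prefixes.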
Hope this helps. Oh, by the way, have a look at HyperDex, a new searchable distributed datastore closely tied to the Euclidean metric; it might help...
