How much time does a node have to reply? - can-bus

Is there a time limit for a node to respond to a message?
Is there a difference between PDO and SDO messages in terms of response time?
Or does this only depend on how the Master gets implemented?
Is there any documentation about this?

Generally, no. CiA 301 says (Annex A, informative):
Time-out's [sic]
Since COB's may be ignored, the response of a confirmed service may never arrive. To resolve this
situation, an implementation may, after a certain amount of time, indicate this to the service user
(time-out). A time-out is not a confirmation of that service. A time-out indicates that the service has not
completed yet. The application may deal with this situation. Time-out values are considered to be
implementation specific and do not fall within the scope of this specification. However, it is
recommended that an implementation provides facilities to adjust these time-out values to the
requirements of the application.
Furthermore, CANopen separates data and supervision, so checking if nodes are alive should be done with the Heartbeat protocol separately.
It is reasonably in most applications to implement a timeout, especially on mission-critical data. My general recommendation for control systems (automotive/industrial) that is used to steer some manner of machine or is otherwise safety-related, is to use PDO inhibit time and event timer on the PDO producer, to have it send out data cyclically every 10ms or 100ms.
And then the PDO consumer should have a corresponding timeout after which it enters a safe mode/error state. Some safety-related protocols expect data to arrive within a certain time window even - neither too late nor too early. It all depends on the specific application, how fast it needs to be reacting and if there are safety-related aspects or not.

Related

Inhibit Time in Tx-PDO

Objects 180Nh have the following subindices:
0x00:----
0x01:----
0x02:----
0x03 (inhibit time): This subindex contains a time lock in 100 µs steps (see following figure). This can be used to set a time that must elapse after the sending of a PDO before the PDO is sent another time. This time only applies for asynchronous PDOs. This is intended to prevent PDOs from being sent continuously if the mapped object constantly changes.
0x04 (compatibility entry): This subindex has no function and exists only for compatibility reasons.
0x05 (event timer): This time (in ms) can be used to trigger an Event which handles the copying of the data and the sending of the PDO.
According to the above point, we realize that when the event occurs, a certain time is determined, which is blocked, and it is for Tx-PDO; now, if the event occurs in this interval, it will be executed in the next section.
Why should the whole section be implemented? Why is the second, third, and fourth event executed in the last part?
Shouldn't the third and fourth events be executed separately?
By default, common CANopen device profiles like for example CiA 401 "generic I/O module" are configured to suit large automation networks. That is: a large network with lots of nodes where it is important to keep bus traffic low. On such networks nodes only transmit PDOs when there has been a data update (an internal event has occurred).
However, such a setup is very much unsuitable when CANopen is used for real-time control systems, like for example having a PLC controlling a bunch of actuator I/O modules that control motions of a machine. Which could also be a safety-related application. In such systems, it is custom to always send data repeatedly at even intervals, even if it has not changed. For example send all data once every 10ms/100ms.
Only the last data sent is used by the receiving node(s), so in case data goes missing/corrupt, new reliable data will arrive soon again. And if no data arrives at all, that's an indication that something is broken and the system ought to revert to a safe state, after receiving no new data in a certain time period. This is how mobile/automotive control systems are most commonly designed, since it is safe, deterministic and proven in use. Custom, non-standard CAN bus protocols by OEM are often implemented exactly like this.
Now, to achieve this with CANopen, we have to configure the TPDO communication parameters. Event timer to set the interval and inhibit time to prevent the node spamming extra data as soon as something has changed. If I remember correctly we also need to set 180N:2 transmission type to asynchronous (which sounds counter-intuitive).
With a setup like this, only the most recent event matters. The most up to date data will always get sent, at fixed intervals.

Time to send SDO

I am working on CANopen architecture and had three questions:
1- When the 'synchronous window' is closed until the next SYNC message, should we send the SDO message? Can we not send a message during this period?
2- Is it possible not to send the PDO message during the simultaneous window?
3- What is the answer that the slaves give in the SYNC message?
Disclaimer: I don't have exact answers but I just wanted to share my assumptions & thoughts.
CiA 301 doesn't mention the relation between synchronous window and SDOs. In normal operation after the initial configuration, one may assume that SDOs aren't present on the system, or at least they are rare compared to PDOs. Although not strictly necessary, SDOs are generally initiated by a device which has the master role, and that device also produces the SYNC messages (again, it's not strictly necessary but it's the usual/common implementation). So, the master device may adjust the timing of SDO requests according to the synchronous window.
Here is a quote from CiA 301:
If the synchronous window length expires all synchronous TPDOs may be
discarded and an EMCY message may be transmitted; all synchronous
RPDOs may be discarded until the next SYNC message is received.
Synchronous RPDO processing is resumed with the next SYNC message.
CiA 301 uses the word "may" (see the quote above). So I'm not sure if it's mandatory or not. In my opinion, it makes sense to follow the advice and abort synchronous TPDO transmissions after the synchronous window and send an EMCY message. Event-driven (non-synchronous) TPDOs can be sent within or after the synchronous window.
There is no direct response to SYNC messages. On SYNC reception, SYNC consumers (slaves) sample their inputs, drive their outputs according to the previous RPDOs, and start transmitting their TPDOs containing the previous samples (or the current ones? I'm not sure about this).
Synchronous windows are for specific PDO synchronization only. For hard real-time systems, data might be required to arrive within certain fixed time intervals - not too early, not too late. That is, it acts as a real-time deadline. If such features are enabled, you need to take that in consideration when doing the specific CANopen bus implementation.
For example if some SDO communication would occupy the bus so that the PDO can't meet its time window, that would be a problem. But this is easily solved by giving the PDO a lower COBID than the SDO, which should already be the case in most default device profile setups like "DS401 GPIO module". Other than that, you would have to make sure there is no ridiculous bus loads or that nodes hang up or get busy doing other things.
In systems with hard real-time requirements you probably don't want to allow any SDO communication during operational mode to begin with.
What is the answer that the slaves give in the SYNC message?
That question doesn't make any sense. You need to study what the SYNC message does and what it is for.

CAN J1939 device stops responding after communication timeout

I'm a higher layer guy, I don't and don't want to know much about can-bus, j1939 or even particular ECUs. I just don't like the software solution, so I'd like to ask, if customer's requirements are legitimate.
If particular ECU doesn't receive CAN frame within 300 ms timeout after powerup, it stops responding to any further frames and must be power cycled. This is a information from customer's technicians, I have to just believe it.
It is possible to powerup ECU after CAN driver thread is ready, but it would require some extra wiring by end customers.
Software solutions are all bad or worse, like running FreeRTOS before important checks, put CAN driver code to code common with other products, or start CAN periphery in the bootloader and left running without software control until driver starts.
The sensitive part is, that we have no explicit demand to start CAN driver within such a short time in specification. Customer says, that it's part of J1939 specification.
Can someone confirm or disprove, that J1939 allows devices to unrecoverably stop receiving after 300 ms of silence or requires devices to start transmitting within 300 ms after powerup? Or at least guide me to parts of J1939 standard, which could possibly regard this?
Thank you
If particular ECU doesn't receive CAN frame within 300 ms timeout after powerup, it stops responding to any further frames and must be power cycled. This is a information from customer's technicians, I have to just believe it.
This does of course entirely depend on what task it is performing.
Generally, an ECU, as in an automotive computer in a car/truck etc is never allowed to hang up/latch up. The normal course of action would be for the ECU to either reboot/reset itself or revert to a fail-safe mode.
But in case of tractors and heavy machinery the normal safe mode is "stop everything".
It is possible to powerup ECU after CAN driver thread is ready, but it would require some extra wiring by end customers.
I don't know what this is supposed to mean. What is "extra wiring"? Something to keep other nodes in common mode while one is rebooting? Terminating resistors? Some dirty power-up delay circuit?
Software solutions are all bad or worse, like running FreeRTOS before important checks, put CAN driver code to code common with other products, or start CAN periphery in the bootloader and left running without software control until driver starts.
Generally speaking, it's custom to initialize critical hardware like clocks, watchdogs, prescalers, pull resistors etc very early on. Initializing hardware peripherals may or may not be critical. It's custom to do this after the CRT has been executed, at the beginning of main() and the order of initialization usually matters a lot.
If you have a delay longer than 300ms from power-on reset to the start of main(), something is terribly wrong with the program.
The sensitive part is, that we have no explicit demand to start CAN driver within such a short time in specification. Customer says, that it's part of J1939 specification.
I haven't worked much with J1939 and I don't remember what it says specifically, but 300ms is an eternity in a real-time system! It's not a "short time".
In general, correctly designed mission-/safety-critical CAN control systems in automotive/industrial settings work like this:
All data is sent repeatedly in fixed intervals, regardless of if it has changed or not. Commonly once per 10ms or once per 100ms.
A node which has not received new data will use the previously received data for now.
There is a timeout from the point of when last valid data was received, when the receiving node must stop using old data and revert to a fail-safe mode. This time is often relative to how fast the controlled object can move. It's common to have timeouts after some multiple of 100ms.
I would say that your customer's requirements are very reasonable, it's nothing out of the ordinary.
My colleague answered, that there's no such demand, only vague 300 ms timeout.

How to synchronize message status updates in Delphi Indy

RFC 3501 states in section 6.1.2. that you should use the NOOP command for polling.
Though in TIdIMAP4 there's only the KeepAlive method using it, which is implemented as a procedure, i.e. doesn't return anything.
So how to check for status updates like e.g. new messages or read status changes? I.e. how can I do manual polling with TIdIMAP4? Which methods and properties are involved in doing that? And how to get the (U)IDs these messages?
Or is it even possible to use the IDLE command specified in RFC 2177 to avoid polling and to get updates automatically?
IMAP is technically an asynchronous protocol, but TIdIMAP4 is currently implemented as a synchronous client. As such, unexpected/out-of-order data is either discarded, treated as untagged data, or treated as error data, depending on timing and context. Untagged/extra data is accessible from the TIdIMAP4.LastCmdResult property, which you can type-cast to TIdReplyIMAP4 to access its Extra sub-property.
IDLE is not currently supported in TIdIMAP4. There are tickets in Indy's issue trackers (see here and here) to add IDLE support in a future release, maybe in Indy 11. Until then, you will have to poll the mailbox envelopes periodically, keeping track of messages you have already seen so you can detect new messages.
Yes, you can use IDLE to avoid NOOP and in general it's a good idea.
However, that won't give you any results. In a way, IMAP commands don't have results. They tell the server what you want, and the server tells you things. The server is free to tell you things for other reasons as well, including the goodness of its heart.
You might say that NOOP means "hi server, now is a good time to tell me things, I'm listening" and IDLE means "hi server, I'm listening all the time, so just tell me whatever you want whenever you want". Both also mean "and btw, restart your inactivity timeout if you have one".
The server will send you EXISTS, FETCH and other responses, which I expect TIdIMAP4 forwards to you in some way. (Yes, they're called responses even though they're not in response to any command of yours. They may be sent in response to someone else having sent you mail, for instance. Stupid naming.)

Anyone know average HL7 clinical message response times?

I'm designing a .net interface for sending and receiving a HL7 message and noticed on this forum theres a few people with this experience.
My question is.... Would anyone be able to share their experience on how long it could take to get a message response back from a hospital HL7 server. (Particularly when requesting patient demographics) - seconds / minutes / Hours?
My dilemma is do I design my application to make the user wait for the message to come back.
(Sorry if this is a little off topic, it’s still kinda programming related? – I searched the web for HL7 forums but got stuck so again if anyone knows of any please let me know )
cheers,
Jason
In my experience, you should receive an ACK or NAK back within a few seconds. The receiving application shouldn't do something like making you wait while it performs operations on the message. We have timeouts set to 30 seconds, and we almost never wait that long for a response.
This is quite dependent on the kind of HL7 message sent, typically messages like ADT's are sent as essentially updates to the server, and are acknowledged almost immediately if the hospital system is behaving well. This will result in a protocol level acknowledgement, which indicates that the peer has received the message but not necessarily processed it yet.
Typically, most systems will employ a broker or message queue in their integration engines so you get your ack almost immediately.
Other messages like lab request messages may actually send another non-ack message back which contains the information requested. These requests can take longer.
You can check with the peer you're communicating with to see what integration engine they are using, and if a queue sits on that end which would help ensure the response times are short.
In the HL7 integration tool I work on, we use queues for inbound data so we can responde immediately. And for our outbound connections, 10s timeouts are default, and seem to work fine for most of our customers.
When sending a Query type event in HL7, it could take a number of seconds to get the proper response back. You also need to code for the possibility that you will never get a response back, and the possibility that connected systems "don't do" queries.
Most HL7 nets that I have worked on, assume that all interested systems are listening for demographic updates at all times. Usually, receiving systems process these updates into a patient database that documents both the Person and Encounter (Stay) information on the fly.
In my location, my system usually gets about 10-20 thousand messages a day, most of which are patient demographic updates.
It depends if the response is generated automatically by a system or if the response is generated after an user does something on the system. For an automatic response it might take less than a second, depending of course on the processing that is done by the system and the current work load of that system. If the system is not too busy and processing is just a couple of queries and verification of some conditions, considering network delays, response time should be a few seconds or less.

Resources