How does error handling work in SCTP Sockets API Extensions? - sctp

I have been trying to implement a wrapper library for the Linux interface to SCTP sockets, and I am not sure how to integrate the asynchronous style of errors (where they are delivered via events). All example code I have seen, if it deals with the errors at all, simply prints out the information related to the error when it is received, but inserting error-handling code there seems like it would be ineffective, because by that point all of the context related to the original message which was sent has been lost and only a 32-bit integer sinfo_context remains. It also seems that there is no way to directly tell when a given message has been acknowledged successfully by the remote peer, which would make it impossible to implement an approach which listens for errors after sending a message, because the context information for successfully-delivered messages could never be freed.
Is there a way to handle the errors related to a given sending operation as part of the call to a send function, or is there a different way to approach error handling for SCTP which does not lose the context of the error?
One solution which I have considered is using the SCTP_SENDER_DRY notification to tell when packets have been sent, however this requires sending only one packet at a time. Another idea is to use the peer's receiver window size together with the sinfo_cumtsn field of sctp_sndrcvinfo to calculate how much data has been acknowledged as fully received using the cumulative TSN, however there are a couple of disadvantages to this: first, it requires bookkeeping overhead to calculate a number of bytes received by the peer based on the cumulative TSN (especially if the peer's window size may change); second, it requires waiting until all earlier packets were received before reporting success, which seems to defeat the purpose of SCTP's multistreaming; and third, it seems like it would not work for unordered packets.

Related

Time to send SDO

I am working on CANopen architecture and had three questions:
1- When the 'synchronous window' is closed until the next SYNC message, should we send the SDO message? Can we not send a message during this period?
2- Is it possible not to send the PDO message during the simultaneous window?
3- What is the answer that the slaves give in the SYNC message?
Disclaimer: I don't have exact answers but I just wanted to share my assumptions & thoughts.
CiA 301 doesn't mention the relation between synchronous window and SDOs. In normal operation after the initial configuration, one may assume that SDOs aren't present on the system, or at least they are rare compared to PDOs. Although not strictly necessary, SDOs are generally initiated by a device which has the master role, and that device also produces the SYNC messages (again, it's not strictly necessary but it's the usual/common implementation). So, the master device may adjust the timing of SDO requests according to the synchronous window.
Here is a quote from CiA 301:
If the synchronous window length expires all synchronous TPDOs may be
discarded and an EMCY message may be transmitted; all synchronous
RPDOs may be discarded until the next SYNC message is received.
Synchronous RPDO processing is resumed with the next SYNC message.
CiA 301 uses the word "may" (see the quote above). So I'm not sure if it's mandatory or not. In my opinion, it makes sense to follow the advice and abort synchronous TPDO transmissions after the synchronous window and send an EMCY message. Event-driven (non-synchronous) TPDOs can be sent within or after the synchronous window.
There is no direct response to SYNC messages. On SYNC reception, SYNC consumers (slaves) sample their inputs, drive their outputs according to the previous RPDOs, and start transmitting their TPDOs containing the previous samples (or the current ones? I'm not sure about this).
Synchronous windows are for specific PDO synchronization only. For hard real-time systems, data might be required to arrive within certain fixed time intervals - not too early, not too late. That is, it acts as a real-time deadline. If such features are enabled, you need to take that in consideration when doing the specific CANopen bus implementation.
For example if some SDO communication would occupy the bus so that the PDO can't meet its time window, that would be a problem. But this is easily solved by giving the PDO a lower COBID than the SDO, which should already be the case in most default device profile setups like "DS401 GPIO module". Other than that, you would have to make sure there is no ridiculous bus loads or that nodes hang up or get busy doing other things.
In systems with hard real-time requirements you probably don't want to allow any SDO communication during operational mode to begin with.
What is the answer that the slaves give in the SYNC message?
That question doesn't make any sense. You need to study what the SYNC message does and what it is for.

Linux recv returns data not seen in Wireshark capture

I am receiving data through a TCP socket and although this code has been working for years, I came across a very odd behaviour trying to integrate a new device (that acts as a server) into my system:
Before receiving the HTTP Body response, the recv() kernel function gives me strange characters like '283' or '7b'.
I am actually debuging with gdb and I can see that the variables hold these values right after recv() was called (so it is not just what printf shows me)
I always read byte-after-byte (one at a time) with the recv() function and the returned value is always positive.
This first line of the received HTTP Body cannot be seen in Wireshark (!) and is also not expected to be there. In Wireshark I see what I would expect to receive.
I changed the device that sends me the data and I still receive the exact same values
I performed a clean debug build and also tried a release version of my programm and still get the exact same values, so I assume these are not random values that happened to be in memory.
i am running Linux kernel 3.2.58 without the option to upgrade/update.
I am not sure what other information i should provide and I have no idea what else to try.
Found it. The problem is that I did not take the Transfer-Encoding into consideration, which is chunked. I was lucky because also older versions of Wireshark were showing these bytes in the payload so other people also posted similar problems in the wireshark forum.
Those "strange" bytes show you the payload length that you are supposed to receive. When you are done reading this amount of bytes, you will receive again a number that tells you whether you should continue reading (and, again, how many bytes you will receive). As far as I understood, this is usefull when you have data that change dynamically and you might want to continuously get their current value.

Zero byte receives: purpose clarification

I am learning server development with IO Completion Ports. My book, "Network Programming for Microsoft Windows - Second Edition", states the following:
With every overlapped send or receive operation, it is probable that
the data buffers submitted will be locked. When memory is locked, it
cannot be paged out of physical memory. The operating system imposes a
limit on the amount of memory that may be locked. When this limit is
reached, overlapped operations will fail with the WSAENOBUFS error. If
a server posts many overlapped receives on each connection, this limit
will be reached as the number of connections grow. If a server
anticipates handling a very high number of concurrent clients, the
server can post a single zero byte receive on each connection. Because
there is no buffer associated with the receive operation, no memory
needs to be locked. With this approach, the per-socket receive buffer
should be left intact because once the zero-byte receive operation
completes, the server can simply perform a non-blocking receive to
retrieve all the data buffered in the socket's receive buffer. There
is no more data pending when the non-blocking receive fails with
WSAEWOULDBLOCK.
Now, I'm trying to understand this paragraph; I think I've got it but want to make sure please.
I understand about memory being locked if I post make multiple WSARecv() calls with large buffers. But I am not entirely sure how a zero byte buffer prevents this.
I am thinking it is this (and would like confirmation please):
If I have n connections, and I post 50 WSARecv() calls with a 1KB buffer on each connection, that is n * 50KB total memory locked. All of that memory is locked, regardless of whether or not it is actually being used (i.e. whether or not anything is being copied into it from the TCP buffers). Hence if I keep adding more connections, I will keep locking more memory that may or may ever be used. Thus I can run out, with WSAENOBUFS error.
If I however post a zero byte receive on each connection, a completion packet will be generated on that connection only when there is data available for reading. (That is my first assumption, is that correct?)
Now, when I know there is some data, I can then post a WSARecv() with a buffer of 1KB (or however much) - or indeed loop repeatedly reading it all as suggested in my book - knowing that it will be filled immediately hence not remain unused and locked (second assumption, is that correct?)
Question 1
Thus, if my two assumptions are correct, then I have understood my book :) This means then that my server could, in theory, post a zero byte receive when a new connection is established, then when a completion packet is generated, read all of the data until there is no more, then post another zero byte receive - is that correct?
Question 2
However, isn't there still a risk that if I receive completion packets for lots of my zero byte receive posts at once, and I then go onto make multiple WSARecv() calls, that I will still end up with some failing with WSAENOBUFS?
Hopefully someone can clarify these two assumptions and two questions for me.
OK I've done research into this along with experimentation and have found the following:
Assumptions:
1) Assumption 1 is correct.
2) I believe assumption 2 is correct.
Questions
1) I have tested this and this seems to work.
2) This I guess remains a possibility but much less likely than if I posted receives with a none-zero buffer.
Note that we can still raise the WSAENOBUF error when sending too fast; more details here.

Unordered socket read & close notification using IOCP

Most server framework/examples using sockets and I/O completion ports makes notifications in a way I couldn't completely figure out the purpose.
Upon read packets are processed, usually they are reordered to circumvent thread scheduling issues processing packets out of order no matter IOCP ensure a FIFO queue.
The problem is when a socket is closed gracefully or by an error. I saw in both situation, and again by the o.s. thread scheduler, the close notification may be sent to the application (i.e. http server using the framework) "before" the notification of data previously readed.
I think that the close notification should be queued in such way so the application receives it after previous reads.
Is there any intended use in most code I saw or my behavior may be correct depending on the situation?
What you suggest makes sense and I would imagine that any code that handles graceful close (a read returning 0 bytes) would do so by processing it after any proceeding successful read. Errors coming out of GetQueuedCompletionStatus(), such as connection reset errors, etc, are harder to integrate into the receive flow as they occur out of band as far as the receive data is concerned. Your question's a bit vague and depends very much on the code you're using and how you (or the people who wrote that code) want to handle these things. There is no single correct way, IMHO.

Read unknown number of incoming bytes

My app communicates with a server over TCP, using AsyncSocket. There are two situations in which communication takes place:
The app sends the server something, the server responds. The app needs to read this response and do something with the information in it. This response is always the same length, e.g., a response is always 6 bytes.
The app is "idling" and the server initiates communication at some time (unknown to the app). The app needs to read whatever the server is sending (could be any number of bytes, but the first byte will indicate how many bytes are following so I know when to stop reading) and process this information.
The first situation is working fine. readDataToLength:timeout:tag returns what I need and I can do with it what I want. It's the second situation that I'm unsure of how to implement. I can't use readDataToLength:timeout:tag, since I don't know the length beforehand.
I'm thinking I could do something with readDataWithTimeout:tag:, setting the timeout to -1. That makes the socket to constantly listen for anything that's coming in, I believe. However, that will probably interfere with data that's coming in as response to something I sent out (situation 1). The app can't distinguish incoming data from situation 1 or situation 2 anymore.
Anybody here who can give me help me solve this?
Your error is in the network protocol design.
Unless your protocol has this information, there's no way to distinguish the response from the server-initiated communication. And network latency prevents obvious time-based approach you've described from working reliably.
One simple way to fix the protocol in your case (if the server-initiated messages are always less then 255 bytes) - add the 7-th byte to the beginning of the response, with the value FF.
This way you can readDataWithTimeout:tag: for 1 byte.
On timeout you retry until there's a data.
If the received value is FF, you read 6 more bytes with readDataToLength:6 timeout: tag:, and interpret it as the response to the request you've sent earlier.
If it's some other value, you read the message with readDataToLength:theValue timeout: tag:, and process the server-initiated message.

Resources