Why is it not safe to use Socket.ReceiveLength? - delphi

Well, even Embarcadero states that it is not guaranteed to return accurate result of the bytes ready to read in the socket buffer, but if you look at it, when you place -1 at Socket.ReceiveBuf (this is what ReceiveLength wraps) it calls ioctlsocket with FIONREAD to determine the amount of data pending in the network's input buffer that can be read from socket s.
so, how is it not safe or bad ?
e.g: ioctlsocket(Socket.SocketHandle, FIONREAD, Longint(i));

The documentation you mention specifically says (emphasis mine)
Note: ReceiveLength is not guaranteed to be accurate for streaming socket connections.
This means that the length is not known ahead of time because it's being supplied by a stream of data. Obviously, if you don't know how big the data is that's being sent ahead of time, you can't properly set the length the client should expect.
Consider it like generic code to copy a file. If you don't know ahead of time how big the file is you'll be copying, you can't predict how many bytes you'll be copying. In the case of the socket, the stream size that's supplying the socket isn't known in advance (for instance, for data being generated real-time and sent), so there's no way to inform the client socket how much to expect.

Related

What does CBATTError Code insufficientResources really mean?

I'm trying to send data over BLE from my iPhone to an ESP32 board. I'm developing in flutter platform and I'm using flutter_reactive_ble library.
My iPhone can connect to the other device and it can also send 1 byte using writeCharacterisiticWithResponse function. But when I try to send my real data which is large (>7000 bytes), it then gives me the error:
flutter: Error occured when writing 9f714672-888c-4450-845f-602c1331cdeb :
Exception: GenericFailure<WriteCharacteristicFailure>(
code: WriteCharacteristicFailure.unknown,
message: "Error Domain=CBATTErrorDomain Code=17
"Resources are insufficient."
UserInfo={NSLocalizedDescription=Resources are insufficient.}")
I tried searching for this error but didn't find additional info, even in Apple Developer website. It just says:
Resources are insufficient to complete the ATT request.
What does this error really means? Which resources are not sufficient and how to work around this problem?
This is almost certainly larger than this characteristic's maximum value length (which is probably on the order of 10s of bytes, not 1000s of bytes). Before writing, you need to call maximumWriteValueLength(for:) to see how much data can be written. If you're trying to send serial data over a characteristic (which is common, but not really what they were designed for), you'll need to break your data up into chunks and reassemble them on the peripheral. You will likely either need an "end" indicator of some kind, or you will need to send the length of the payload first so that the receiver knows how much to excpect.
First of all, a characteristic value cannot be larger than 512 bytes. This is set by the ATT standard (Bluetooth Core Specification v5.3, Vol 3, Part F (ATT), section 3.2.9). This number has been set arbitrarily by the protocol designers and does not map to any technical limitation of the protocol.
So, don't send 7000 bytes in a single write. You need to keep it at most 512 to be standard compliant.
If you say that it works with another Bluetooth stack running on the GATT server, then I guess CoreBluetooth does not enforce/check the maximum length of 512 bytes on the client side (I haven't tested). Therefore I also guess the error code you see was sent by the remote device rather than by CoreBluetooth locally as a pre-check.
There are three different common ways of writing a characteristic on the protocol level (Bluetooth Core Specification v5.3, Vol 3, Part G (GATT), section 4.9 Characteristic Value Write):
Write Without Response (4.9.1)
Write Characteristic Value (4.9.3)
Write Long Characteristic Values (4.9.4)
Number one is unidirectional and does not result in a response packet. It uses a single ATT_WRITE_CMD packet where the value must be at most ATT_MTU-3 bytes in length. This length can be retrieved using maximumWriteValueLength(for:) with .withoutResponse. The requestMtu method in flutter_reactive_ble uses this method internally. If you execute many writes of this type rapidly, be sure to add flow control to avoid CoreBluetooth dropping outgoing packets before they are sent. This can be done through peripheralIsReadyToSendWriteWithoutResponse by simply always waiting for this callback after each write, before you write the next packet. Unfortunately, it seems flutter_reactive_ble does not implement this flow control mechanism.
Number two uses a single ATT_WRITE_REQ where the value must be at most ATT_MTU-3 bytes in length, just as above. Use the same approach as above to retrieve that maximum length (note that maximumWriteValueLength with .withResponse always returns 512 and is not what you want). Here however, either an ATT_WRITE_RSP will be returned on success or an error packet will be received with an error code. Only one ATT transaction can be outstanding at a time, which significantly lowers throughput compared to Write Without Response.
Number three uses a sequence of multiple ATT_PREPARE_WRITE_REQ packets (containing offset and value) followed by an ATT_EXECUTE_WRITE_REQ. The maximum length of the value in each each chunk is ATT_MTU-5. Each _REQ packet also requires a corresponding _RSP packet before it can continue (alternatively, an error code could be sent by the remote device). This approach is used when the characteristic value to be written is too long to be sent using a single ATT_WRITE_REQ.
For any of the above write methods, you are always also limited by the maximum attribute size of 512 bytes as per the specification.
Any Bluetooth stack I know of transparently chooses between "Write Characteristic Value" and "Write Long Characteristic Values" when you tell it to write with response, depending on the value length and MTU. Server side it's a bit different. Some stacks put the burden on the user to combine all packets but it seems nimble handles that on its own. From what I can see in the source code (https://github.com/apache/mynewt-nimble/blob/26ccb8af1f3ea6ad81d5d7cbb762747c6e06a24b/nimble/host/src/ble_att_svr.c#L2099) it can return the "Insufficient Resources" error code when it tries to allocate memory but fails (most likely due to too much buffered data). This is what might happen for you. To answer your first actual question, the standard itself does not say anything else about this error code than simply "Insufficient Resources to complete the request".
The error has nothing to do with LE Data Length extension, which is simply an optimization for a lower layer (BLE Link Layer) that does not affect the functionality of the host stack. The L2CAP layer will take care of the reassembling of smaller link layer packets if necessary, and must always support up to the negotiated MTU without overflowing any buffers.
Now, to answer your second question, if you send very large amounts of data (7000 bytes), you must divide the data in multiple chunks and come up with a way to correctly be able to combine them. Each chunk is written as a full characteristic value. When you do this, be sure to send values at most of size ATT_MTU-3 (but never larger than 512 bytes), to avoid the inefficient overheads of "Write Long Characteristic Values". It's then up to your application code to make sure you don't run out of memory in case too much data is sent.

Read Timeout TIdTCPClient

Good day. I use the TIdTCPClient component to send requests to the server and read the response. I know the size of the response for certain requests, but not for others.
When I know the size of the response, then my data reading code looks like this:
IdTCPClient1->Socket->Write(requestBuffer);
IdTCPClient1->Socket->ReadBytes(answerBuffer, expectSize);
When the size of the response is not known to me, then I use this code:
IdTCPClient1->Socket->Write(requestBuffer);
IdTCPClient1->Socket->ReadBytes(answerBuffer, -1);
In both cases, I ran into problems.
In the first case, if the server does not return all the data (less than expectSize), then IdTCPClient1 will wait for ReadTimeout to finish, but there will be no data at all in the answerBuffer (even if the server sent something). Is this the logic behind TIdTCPClient? It is right?
In the second case, ReadTimeout does not work at all. That is, the ReadBytes function ends immediately and nothing is written to the answerBuffer, or several bytes from the server are written. However, I expected that since this function in this case does not know the number of bytes to read, it must wait for ReadTimeout and read the bytes, who came during this time. For the experiment, I inserted Sleep (500) between writing and reading, and then I read all the data that arrived.
May I ask you to answer why this is happening?
Good day. I use the TIdTCPClient component to send requests to the server and read the response. I know the size of the response for certain requests, but not for others.
Why do you not know the size of all of the responses? What does your protocol actually look like? TCP is a byte stream, each message MUST be framed in such a way that a receiver can know where each message begins and ends in order to read the messages correctly and preserve the integrity of the stream. As such, messages MUST either include their size in their payload, or be uniquely delimited between messages. So, which is the case in your situation? It doesn't sound like you are handling either possibility.
When the size of the response is not known to me, then I use this code:
IdTCPClient1->Socket->Write(requestBuffer);
IdTCPClient1->Socket->ReadBytes(answerBuffer, -1);
When you set AByteCount to -1, that tells ReadBytes() to return whatever bytes are currently available in the IOHandler's InputBuffer. If the InputBuffer is empty, ReadBytes() waits, up to the ReadTimeout interval, for at least 1 byte to arrive, and then it returns whatever bytes were actually received into the InputBuffer, up to the maximum specified by the IOHandler's RecvBufferSize. So it may still take multiple reads to read an entire message in full.
In general, you should NEVER set AByteCount to -1 when dealing with an actual protocol. -1 is good to use only when proxying/streaming arbitrary data, where you don't care what the bytes actually are. Any other use require knowledge of the protocol's details of how messages are framed.
In the first case, if the server does not return all the data (less than expectSize), then IdTCPClient1 will wait for ReadTimeout to finish, but there will be no data at all in the answerBuffer (even if the server sent something). Is this the logic behind TIdTCPClient? It is right?
Yes. When AByteCount is > 0, ReadBytes() waits for the specified number of bytes to be available in the InputBuffer before then extracting that many bytes into your output TIdBytes. Your answerBuffer will not be modified unless all of the requested bytes are available. If the ReadTimeout elapses, an EIdReadTimeout exception is raised, and your answerBuffer is left untouched.
If that is not the behavior you want, then consider using ReadStream() instead of ReadBytes(), using a TIdMemoryBufferStream or TBytesStream to read into.
In the second case, ReadTimeout does not work at all. That is, the ReadBytes function ends immediately and nothing is written to the answerBuffer.
I have never heard of ReadBytes() not waiting for the ReadTimeout. What you describe should only happen if there are no bytes available in the InputBuffer and the ReadTimeout is set to some very small value, like 0 msecs.
or several bytes from the server are written.
That is a perfectly reasonable outcome given you are asking ReadBytes() to read an arbitrary number of bytes between 1..RecvBufferSize, inclusive, or read no bytes if the timeout elapses.
However, I expected that since this function in this case does not know the number of bytes to read, it must wait for ReadTimeout and read the bytes, who came during this time.
That is how it should be working, yes. And how it has always worked. So I suggest you debug into ReadBytes() at runtime and find out why it is not working the way you are expecting. Also, make sure you are using an up-to-date version of Indy to begin with (or at least a version from the last few years).
Why do you not know the size of all of the responses?
Because, in fact, I'm doing a survey of an electronic device. This device has its own network IP address and port. So, the device can respond to the same request in different ways, depending on its status. Strictly speaking, there can be two answers to some queries and they have different lengths. It is in these cases, when reading, I specify AByteCount = -1 to read any device response.
I have never heard of ReadBytes() not waiting for the ReadTimeout.
You're right! I was wrong. When specifying AByteCount = -1, I get one byte. As you said, if at least one byte arrives, it returns and ReadBytes() ends.
Also, make sure you are using an up-to-date version of Indy to begin with (or at least a version from the last few years).
I am working with C++ Builder 10.3 Community Edition, Indy version 10.6.2.5366.

How do I receive arbitrary length data using a UdpSocket?

I am writing an application which sends and receives packages using UDP. However, the documentation of recv_from states:
If a message is too long to fit in the supplied buffer, excess bytes may be discarded.
Is there any way to receive all bytes and write them into a vector? Do I really have to allocate an array with the maximum packet length (which, as far as I know, is 65,507 bytes for IPv4) in order to be sure to receive all data? That seems a bit much for me.
Check out the next method in the docs, UdpSocket::peek_from (emphasis mine):
Receives a single datagram message on the socket, without removing it from the queue.
You can use this method to read a known fixed amount of data, such as a header which contains the length of the entire packet. You can use crates like byteorder to decode the appropriate part of the header, use that to allocate exactly the right amount of space, then call recv_from.
This does require that the protocol you are implementing always provides that total size information at a known location.
Now, is this a good idea?
As ArtemGr states:
Because extra system calls are much more expensive than getting some space from the stack.
And from the linked question:
Obviously at some point you will start wondering if doubling the number of system calls to save memory is worth it. I think it isn't.
With the recent Spectre / Meltdown events, now's a pretty good time to be be reminded to avoid extra syscalls.
You could, as suggested, just allocate a "big enough" array ahead of time. You'll need to track how many bytes you've actually read vs allocated though. I recommend something like arrayvec to make it easier.
You could instead implement a pool of pre-allocated buffers on the heap. When you read from the socket, you use a buffer or create a new one. When you are done with the buffer, you put it back in the pool for reuse. That way, you incur the memory allocation once and are only passing around small Vecs on the stack.
See also:
How can I create a stack-allocated vector-like container?
How large should my recv buffer be when calling recv in the socket library
How to read UDP packet with variable length in C

Is transmitted bytes event exist in Linux kernel?

I need to write a rate limiter, that will perform some stuff each time X bytes were transmitted.
The straightforward is to check the length of each transmitted packet, but I think it will be to slow for me.
Is there a way to use some kind of network event, that will be triggered by transmitted packets/bytes?
I think you may look at netfilter.
Using its (kernel level) api, you can have your custom code triggered by network events, modify received messages before passing it to application, and so on.
http://www.netfilter.org/
It's protocol dependent, actually. But for TCP, you can setsockopt the SO_RCVLOWAT option to define the minimum number of bytes (watermark) to permit the read operation.
If you need to enforce the maximum size too, adjust the receive buffer size using SO_RCVBUF.

Double system call to write() causes massive network slowdown

In a partially distributed network app I'm working on in C++ on Linux, I have a message-passing abstraction which will send a buffer over the network. The buffer is sent in two steps: first a 4-byte integer containing the size is sent, and then the buffer is sent afterwards. The receiving end then receives in 2 steps as well - one call to read() to get the size, and then a second call to read in the payload. So, this involves 2 system calls to read() and 2 system calls to write().
On the localhost, I setup two test processes. Both processes send and receive messages to each other continuously in a loop. The size of each message was only about 10 bytes. For some reason, the test performed incredibly slow - about 10 messages sent/received per second. And this was on localhost, not even over a network.
If I change the code so that there is only 1 system call to write, i.e. the sending process packs the size at the head of the buffer and then only makes 1 call to write, the whole thing speeds up dramatically - about 10000 messages sent/received per second. This is an incredible difference in speed for only one less system call to write.
Is there some explanation for this?
You might be seeing the effects of the Nagle algorithm, though I'm not sure it is turned on for loopback interfaces.
If you can combine your two writes into a single one, you should always do that. No sense taking the overhead of multiple system calls if you can avoid it.
Okay, well I'm using TCP/IP (SOCK_STREAM) sockets. The example code is pretty straight forward. Here is a basic snippet that reproduces the problem. This doesn't include all the boiler plate setup code, error-checking, or ntohs code:
On the sending end:
// Send size
uint32_t size = strlen(buffer);
int res = write(sock, &size, sizeof(size));
// Send payload
res = write(sock, buffer, size);
And on the receiving end:
// Receive size
uint32_t size;
int res = read(sock, &size, sizeof(size));
// Receive payload
char* buffer = (char*) malloc(size);
read(sock, buffer, size);
Essentially, if I change the sending code by packing the size into the send buffer, and only making one call to write(), the performance increase is almost 1000x faster.
This is essentially the same question: C# socket abnormal latency .
In short, you'll want to use the TCP_NODELAY socket option. You can set it with setsockopt.
You don't give enough information to say for sure. You don't even say which protocol you're using.
Assuming TCP/IP, the socket could be configured to send a packet on every write, instead of buffering output in the kernel until the buffer is full or the socket is explicitly flushed. This means that TCP sends the two pieces of data in different fragments and has to defeagment them at the other end.
You might also be seeing the effect of the TCP slow-start algorithm. The first data sent is transmitted as part of the connection handshake. Then the TCP window size is slowly ramped up as more data is transmitted until it matches the rate at which the receiver can consume data. This is useful in long-lived connections but a big performance hit in short-lived ones. You can turn off slow-start by setting a socket option.
Have a look at the TCP_NODELAY and TCP_NOPUSH socket options.
An optimization you can use to avoid multiple system calls and fragmentation is scatter/gather I/O. Using the sendv or writev system call you can send the 4-byte size and variable sized buffer in a single syscall and both pieces of data will be sent in the same fragment by TCP.
The problem is that with the first call to send, the system has no idea the second call is coming, so it sends the data immediately. With the second call to send, the system has no idea a third call isn't coming, so it delays the data in hopes that it can combine the data with a subsequent call.
The correct fix is to use a 'gather' operation such as writev if your operating system supports it. Otherwise, allocate a buffer, copy the two chunks in, and make a single call to write. (Some operating systems have other solutions, for example Linux has a 'TCP cork' operation.)
It's not as important, but you should optimize your receiving code too. Call 'read' asking for as many bytes as possible and then parse them yourself. You're tying to teach the operating system your protocol, and that's not a good idea.

Resources