We're having a bizarre problem with Indy10 where two large strings (a few hundred characters each) that we send out one after the other using TCP are appearing at the other end intertwined oddly. This happens extremely infrequently.
Each string is a complete XML message terminated with a LF and in general the READ process reads an entire XML message, returning when it sees the LF.
The call to actually send the message is protected by a critical section around the call to the IOHandler's writeln method and so it is not possible for two threads to send at the same time. (We're certain the critical section is implemented/working properly). This problem happens very rarely. The symptoms are odd...when we send string A followed by string B what we received at the other end (on the rare occasions where we have failure) is the trailing section of string A by itself (i.e., there's a LF at the end of it) followed by the leading section of string A and then the entire string B followed by a single LF. We've verified that the "timed out" property is not true after the partial read - we log that property after every read that returns content. Also, we know there are no embedded LF characters in the string, as we explicitly replace all non-alphanumeric characters in the string with spaces before appending the LF and sending it.
We have log mechanisms inside the critical sections on both the transmission and receiving ends and so we can see this behavior at the "wire".
We're completely baffled and wondering (although always the lowest possibility) whether there could be some low-level Indy issues that might cause this issue, e.g., buffers being sent in the wrong order....very hard to believe this could be the issue but we're grasping at straws.
Does anyone have any bright ideas?
You could try Wireshark to find out how the data is tranferred. This way you can find out whether the problem is in the server or in the client. Also remember to use TCP to get "guaranteed" valid data in right order.
Are you using TCP or UDP? If you are using UDP, it is possible (and expected) that the UDP packets can be received in a different order than they were transmitted due to the routing across the network. If this is the case, you'll need to add some sort of packet ID to each UDP packet so that the receiver can properly order the packets.
Do you have multiple threads reading from the same socket at the same time on the receiving end? Even just to query the Connected() status causes a read to occur. That could cause your multiple threads to read the inbound data and store it into the IOHandler.InputBuffer in random order if you are not careful.
Have you checked the Nagle settings of the IOHandler? We had a similar problem that we fixed by setting UseNagle to false. In our case sending and receiving large amounts of data in bursts was slow due to Nagle coalescing, so it's not quite the same as your situation.
Related
I have been trying to implement a wrapper library for the Linux interface to SCTP sockets, and I am not sure how to integrate the asynchronous style of errors (where they are delivered via events). All example code I have seen, if it deals with the errors at all, simply prints out the information related to the error when it is received, but inserting error-handling code there seems like it would be ineffective, because by that point all of the context related to the original message which was sent has been lost and only a 32-bit integer sinfo_context remains. It also seems that there is no way to directly tell when a given message has been acknowledged successfully by the remote peer, which would make it impossible to implement an approach which listens for errors after sending a message, because the context information for successfully-delivered messages could never be freed.
Is there a way to handle the errors related to a given sending operation as part of the call to a send function, or is there a different way to approach error handling for SCTP which does not lose the context of the error?
One solution which I have considered is using the SCTP_SENDER_DRY notification to tell when packets have been sent, however this requires sending only one packet at a time. Another idea is to use the peer's receiver window size together with the sinfo_cumtsn field of sctp_sndrcvinfo to calculate how much data has been acknowledged as fully received using the cumulative TSN, however there are a couple of disadvantages to this: first, it requires bookkeeping overhead to calculate a number of bytes received by the peer based on the cumulative TSN (especially if the peer's window size may change); second, it requires waiting until all earlier packets were received before reporting success, which seems to defeat the purpose of SCTP's multistreaming; and third, it seems like it would not work for unordered packets.
I am receiving data through a TCP socket and although this code has been working for years, I came across a very odd behaviour trying to integrate a new device (that acts as a server) into my system:
Before receiving the HTTP Body response, the recv() kernel function gives me strange characters like '283' or '7b'.
I am actually debuging with gdb and I can see that the variables hold these values right after recv() was called (so it is not just what printf shows me)
I always read byte-after-byte (one at a time) with the recv() function and the returned value is always positive.
This first line of the received HTTP Body cannot be seen in Wireshark (!) and is also not expected to be there. In Wireshark I see what I would expect to receive.
I changed the device that sends me the data and I still receive the exact same values
I performed a clean debug build and also tried a release version of my programm and still get the exact same values, so I assume these are not random values that happened to be in memory.
i am running Linux kernel 3.2.58 without the option to upgrade/update.
I am not sure what other information i should provide and I have no idea what else to try.
Found it. The problem is that I did not take the Transfer-Encoding into consideration, which is chunked. I was lucky because also older versions of Wireshark were showing these bytes in the payload so other people also posted similar problems in the wireshark forum.
Those "strange" bytes show you the payload length that you are supposed to receive. When you are done reading this amount of bytes, you will receive again a number that tells you whether you should continue reading (and, again, how many bytes you will receive). As far as I understood, this is usefull when you have data that change dynamically and you might want to continuously get their current value.
I am learning server development with IO Completion Ports. My book, "Network Programming for Microsoft Windows - Second Edition", states the following:
With every overlapped send or receive operation, it is probable that
the data buffers submitted will be locked. When memory is locked, it
cannot be paged out of physical memory. The operating system imposes a
limit on the amount of memory that may be locked. When this limit is
reached, overlapped operations will fail with the WSAENOBUFS error. If
a server posts many overlapped receives on each connection, this limit
will be reached as the number of connections grow. If a server
anticipates handling a very high number of concurrent clients, the
server can post a single zero byte receive on each connection. Because
there is no buffer associated with the receive operation, no memory
needs to be locked. With this approach, the per-socket receive buffer
should be left intact because once the zero-byte receive operation
completes, the server can simply perform a non-blocking receive to
retrieve all the data buffered in the socket's receive buffer. There
is no more data pending when the non-blocking receive fails with
WSAEWOULDBLOCK.
Now, I'm trying to understand this paragraph; I think I've got it but want to make sure please.
I understand about memory being locked if I post make multiple WSARecv() calls with large buffers. But I am not entirely sure how a zero byte buffer prevents this.
I am thinking it is this (and would like confirmation please):
If I have n connections, and I post 50 WSARecv() calls with a 1KB buffer on each connection, that is n * 50KB total memory locked. All of that memory is locked, regardless of whether or not it is actually being used (i.e. whether or not anything is being copied into it from the TCP buffers). Hence if I keep adding more connections, I will keep locking more memory that may or may ever be used. Thus I can run out, with WSAENOBUFS error.
If I however post a zero byte receive on each connection, a completion packet will be generated on that connection only when there is data available for reading. (That is my first assumption, is that correct?)
Now, when I know there is some data, I can then post a WSARecv() with a buffer of 1KB (or however much) - or indeed loop repeatedly reading it all as suggested in my book - knowing that it will be filled immediately hence not remain unused and locked (second assumption, is that correct?)
Question 1
Thus, if my two assumptions are correct, then I have understood my book :) This means then that my server could, in theory, post a zero byte receive when a new connection is established, then when a completion packet is generated, read all of the data until there is no more, then post another zero byte receive - is that correct?
Question 2
However, isn't there still a risk that if I receive completion packets for lots of my zero byte receive posts at once, and I then go onto make multiple WSARecv() calls, that I will still end up with some failing with WSAENOBUFS?
Hopefully someone can clarify these two assumptions and two questions for me.
OK I've done research into this along with experimentation and have found the following:
Assumptions:
1) Assumption 1 is correct.
2) I believe assumption 2 is correct.
Questions
1) I have tested this and this seems to work.
2) This I guess remains a possibility but much less likely than if I posted receives with a none-zero buffer.
Note that we can still raise the WSAENOBUF error when sending too fast; more details here.
My app communicates with a server over TCP, using AsyncSocket. There are two situations in which communication takes place:
The app sends the server something, the server responds. The app needs to read this response and do something with the information in it. This response is always the same length, e.g., a response is always 6 bytes.
The app is "idling" and the server initiates communication at some time (unknown to the app). The app needs to read whatever the server is sending (could be any number of bytes, but the first byte will indicate how many bytes are following so I know when to stop reading) and process this information.
The first situation is working fine. readDataToLength:timeout:tag returns what I need and I can do with it what I want. It's the second situation that I'm unsure of how to implement. I can't use readDataToLength:timeout:tag, since I don't know the length beforehand.
I'm thinking I could do something with readDataWithTimeout:tag:, setting the timeout to -1. That makes the socket to constantly listen for anything that's coming in, I believe. However, that will probably interfere with data that's coming in as response to something I sent out (situation 1). The app can't distinguish incoming data from situation 1 or situation 2 anymore.
Anybody here who can give me help me solve this?
Your error is in the network protocol design.
Unless your protocol has this information, there's no way to distinguish the response from the server-initiated communication. And network latency prevents obvious time-based approach you've described from working reliably.
One simple way to fix the protocol in your case (if the server-initiated messages are always less then 255 bytes) - add the 7-th byte to the beginning of the response, with the value FF.
This way you can readDataWithTimeout:tag: for 1 byte.
On timeout you retry until there's a data.
If the received value is FF, you read 6 more bytes with readDataToLength:6 timeout: tag:, and interpret it as the response to the request you've sent earlier.
If it's some other value, you read the message with readDataToLength:theValue timeout: tag:, and process the server-initiated message.
I am using the following few lines of code to write and read from an external Modem/Router (aka device) via IP.
TCPClient.IOHandler.Write(MsgStr);
TCPClient.IOHandler.InputBuffer.Clear;
TCPClient.IOHandler.ReadBytes(Buffer, 10, True);
MsgStr is a string type which contains the text that I am sending to my device.
Buffer is declared as TIdBytes.
I can confirm that IOHandler.InputBufferIsEmpty returns True immediately prior to calling ReadBytes.
I'm expecting the first 10 bytes received to be very specific hence from my point of view I am only interested in the first 10 bytes received after I've sent my string.
The trouble I am having is, when talking to certain devices, the first byte returned the first time I've sent a string after establishing a connection puts a rogue (random) byte in my Buffer output. The subsequent bytes following are correct.
eg 10 bytes I'm expecting might be: #6A1EF1090#3 but what I get is .#6A1EF1090. in this example I have a full stop where there shouldn't be one.
If I try to send again, it works fine. (ie the 2nd Write sent after a connection has been established). What's weird (to me) is using a Socket Sniffer doesn't show the random byte being returned. If I create my own "server" to receive the response and send something back it works fine 100% of the time. Other software - ie, not my software - communicates fine with the device (but of course I have no idea how they're parsing the data).
Is there anything that I'm doing incorrectly above that would cause this - bearing in mind it only occurs the first time I'm using Write after establishing a connection?
Thanks
EDIT
I'm using Delphi 7 and Indy 10.5.8
UPDATE
Ok. After much testing and looking, I am no closer to finding this solution. I am getting two main scenarios. 1 - First byte missing and 2 - "introduced" byte at the start of received packet. Using TIdLogEvent and TIdLogDebug both either show the missing byte or the initial introduced byte as appropriate. So my ReadBytes statement above is showing consistently what Indy believes is there (in my opinion).
Also, to test it further, I downloaded and installed ICS components. Unfortunately (or fortunately depending on how you look at it) this didn't show the same issues as Indy. This didn't show the first byte missing nor did it show an introduced byte at the beginning. However, I have only done superficial testing, but Indy produces the behaviour "pretty much straight away" whereas ICS hasn't produced it at all yet.
If anyone is interested I can supply a small demo app illustrating the issue and IP I connect to - it's a public IP so anyone can access it. Otherwise for now, I'll just have to work around it. I'm reluctant to switch to ICS as ICS may work fine in this instance and given the use of this socket stuff is pretty much the whole crux of the program, it would be nasty to have to entirely replace Indy with ICS.
The last parameter (True)
TCPClient.IOHandler.ReadBytes(Buffer, 10, True);
causes the read to append instead of replace the buffer content.
This requires that size and content of the buffer are set up correctly first.
If the parameter is False, the buffer content will be replaced for the given number of bytes.
ReadBytes() does not inject rogue bytes into the buffer, so there are only two possibilities I can think of right now given the limited information you have provided:
The device really is sending an extra byte upon initial connection, like mj2008 suggested. If a packet sniffer is not detecting it, try attaching one of Indy's own TIdLog... components to your TIdTCPClient, such as TIdLogFile or TIdLogEvent, to verify what TIdTCPClient is actually receiving from the socket.
you have another thread reading from the same connection at the same time, corrupting the InputBuffer. Even a call to TIdTCPClient.Connected() will perform a read. Don't perform reads in multiple threads at the same time, if you are using the threads.