If a client connects to a server over a normal TCP connection, and later the client's connection cuts out, the server will get (assuming active mode) {tcp_closed,Socket}. But there are cases where the server won't know that the client has disconnected, such as a power failure or a crash (I believe, I could be wrong). In these cases the client is gone but the server still believes it's connected. If the server attempts to send the client a message in these cases, will it assume that the client got the message, or will the TCP stack sort that out at a low level so the server gets back some kind of error?
I know this is a simplistic question, but I've been having trouble testing it myself, as I can't get a client to catastrophically fail like I need it to (even kill -9 isn't doing it). Does anyone have any experience with this?
The answer depends. When you try to send out data, the kernel's TCP send buffer will slowly fill until it can't take any more data, and then your send will block because that internal kernel buffer is full. TCP has timers which will trigger after some time. When that happens, the kernel fails the send request, and Erlang's VM runtime transforms it into {error, Reason}, where Reason is the posix() error from the underlying system.
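To make that concrete, here is a minimal sketch in Python (rather than Erlang) of what the caller actually sees: a send that returns normally only means the kernel buffered the bytes, and a silently dead peer only surfaces later as a POSIX error on a subsequent send or recv. The host/port in the usage note are placeholders.

    import errno
    import socket

    def send_or_report(sock: socket.socket, payload: bytes) -> bool:
        """Try to send; a normal return only means the kernel buffered the data,
        not that the peer received it. A dead peer shows up later as an OSError."""
        try:
            sock.sendall(payload)
            return True
        except OSError as exc:
            # The moral equivalent of Erlang's {error, Reason}:
            # exc.errno carries the POSIX reason from the underlying stack.
            print(f"send failed: {errno.errorcode.get(exc.errno, exc.errno)} - {exc.strerror}")
            return False

    # Usage sketch (host/port are placeholders):
    # sock = socket.create_connection(("example.com", 4000))
    # send_or_report(sock, b"hello")   # may "succeed" even if the peer just vanished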
If you want to be sure the data got through, you have to acknowledge it on the stream in the other direction. Or you can make the data idempotent so you can resend it without trouble. This is especially important if the other endpoint, the client, is a device like a mobile phone where disconnects happen all the time.
To test it, you can block the communication with a firewall rule on lo.
I simulated a disconnect of multiple clients by cutting their internet connection. I found that TIdTCPServer did not release their threads; it did not detect their disconnect. By comparison, when I closed a client manually, the server detected the disconnection and released its thread.
Abnormal disconnects are not detected by the OS in a timely manner. It can take a considerable amount of time for a lost socket connection to timeout internally so the OS can invalidate it. Until the OS does that, Indy has no way of knowing that the client connection is gone.
To account for that, you should either:
implement a timeout in your application-layer data protocol. If you are expecting a client to send something to your server, and it does not do so for a certain amount of time, assume the client is gone and close the connection. During periods of idle activity, require clients to send a heartbeat command to your server at regular intervals to keep their connections alive. You can use the AContext.Connection.IOHandler.CheckForDataOnSource() method to wait for data to arrive, or you can use the AContext.Binding.SetSockOpt() method to specify an SO_RCVTIMEO timeout on blocking reads.
if you cannot change your data protocol, you can at least enable TCP-level keep-alives on the socket itself. In the server's OnConnect event, you can call the AContext.Binding.SetKeepAliveValues() method to enable keep-alives. The OS will then handle the keep-alives for you, and will invalidate the connection if the timeout elapses. (Both approaches are sketched after this list.)
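Neither idea is specific to Indy. Here is a rough sketch of both in Python rather than Delphi, assuming a made-up 30-second heartbeat interval; the read timeout stands in for SO_RCVTIMEO and the SO_KEEPALIVE option is the generic form of what SetKeepAliveValues() enables.

    import socket

    HEARTBEAT_INTERVAL = 30          # assumption: clients ping every 30 seconds

    def configure_server_side(conn: socket.socket) -> None:
        # Idea 1: application-level timeout - if the client sends nothing
        # (not even a heartbeat) within two intervals, treat it as gone.
        conn.settimeout(2 * HEARTBEAT_INTERVAL)   # analogous to SO_RCVTIMEO

        # Idea 2: TCP-level keep-alives - let the OS probe the idle connection
        # and invalidate it when the probes go unanswered.
        conn.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    def serve_client(conn: socket.socket) -> None:
        configure_server_side(conn)
        try:
            while True:
                data = conn.recv(4096)
                if not data:              # orderly close by the client
                    break
                # ... handle commands / heartbeats here ...
        except socket.timeout:
            pass                          # nothing heard for too long: assume the client is gone
        except OSError:
            pass                          # keep-alive or send/recv error: the connection is dead
        finally:
            conn.close()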
With that said, also make sure that your server event handlers are not swallowing Indy exceptions (derived from EIdException). That can also cause the server to not terminate threads correctly, if a connection is lost and Indy raises an exception about it but you are not allowing the server to process it. If you need to catch exceptions (for logging, etc), make sure to re-raise any EIdException-derived exception and let the server handle it.
I have a server application which runs on a Linux machine. I can connect to this application from Windows/Linux machines and can send/receive data. After a few hours, something occurs and I get the following error on the client side.
On Windows: An existing connection was forcibly closed by the remote host
On Linux: Connection timed out
I have searched the web and found some posts suggesting increasing/decreasing the OS's keep-alive time. However, it didn't work for me.
Is there a solution to this problem, or should I simply try to reconnect to the server when the connection is forcibly closed?
EDIT: I have tracked down the situation. I sent data to the remote node and then sent more data after waiting 5 hours. The sender transmitted the first data, but when it sent the second data it got no response. The sender's TCP/IP stack retried 5 times, increasing the time between retries, and finally reset the connection. I can't be sure why this is happening (maybe because of a firewall or NAT - see Section 2.4), but I applied two different approaches to solve the problem:
Use TCP/IP keep-alives via setsockopt (Section 4.2), as sketched below.
Implement an application-level keep-alive. This is more reliable, since the first approach is OS-dependent.
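For reference, the setsockopt approach might look like this in Python on Linux; the TCP_KEEPIDLE/TCP_KEEPINTVL/TCP_KEEPCNT option names and the chosen values are Linux-specific assumptions, not something dictated by the question.

    import socket

    def enable_keepalive(sock: socket.socket,
                         idle: int = 60, interval: int = 10, count: int = 5) -> None:
        """Turn on TCP keep-alives so an idle connection that a firewall/NAT has
        silently dropped gets detected instead of hanging for hours."""
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)      # seconds idle before probing
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval) # seconds between probes
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)      # failed probes before giving up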
It depends on what your application is supposed to do. A little more information and perhaps the code you use for listening and handling connections could be of help.
Regardless, a longer keep-alive time should technically prevent the OS from cutting you off. So perhaps something else is causing the trouble.
That could be a router malfunction, or traffic causing your keep-alive packets to get lost.
If you aren't already testing on a LAN (without heavy traffic), I suggest doing so.
It might also be due to how your socket is handled (which I can't determine from your question).
This article might help.
Non blocking socket with timeout
I'm not used to how connections are handled on Linux, but I expect the OS won't cut off a connection unnecessarily.
You can re-establish the connection as a recovery step, but you need to take into account that not all disconnects are gentle, so you could end up re-establishing a connection that you actually want to stay closed.
Since it is TCP, it will do its best to make a gentle disconnect, but you can send a custom message right before disconnecting that tells the server or client not to re-establish the connection. That way you can be absolutely sure, even though it should be unnecessary.
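A minimal sketch of that idea in Python, assuming a made-up "BYE" line as the "please do not reconnect" message:

    import socket

    GOODBYE = b"BYE\n"   # hypothetical application-level "do not reconnect" message

    def close_politely(sock: socket.socket) -> None:
        """Tell the peer this is an intentional disconnect, then close gently."""
        try:
            sock.sendall(GOODBYE)
            sock.shutdown(socket.SHUT_WR)   # signal end-of-stream (FIN) to the peer
            sock.recv(4096)                 # optionally wait for the peer's close
        except OSError:
            pass                            # the connection may already be gone
        finally:
            sock.close()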
So I'm making an iOS app, but this is more of a general networking question.
What I have is one phone that acts as the server and a bunch of phones that connect to it as clients. Basically it's a game/music sharer.
It's kind of hard to really get into the semantics of it, but that isn't important.
What is important is that the server and clients are repeatedly and rapidly sending each other commands and positions over a TCP connection, and sometimes a client wants to send the server a music file (usually about 4 MB) to play as the music.
The problem I initially encountered was that when sending the large file, it would hang the sending of commands from the client to the server.
My naive solution was to create another socket to the server just to send the file; the server would check the IP of the new socket, and if it matched the IP of an existing connection it would tie the new socket to that connection, receive the file, and then disconnect the socket.
But the problem with this is that there's a 1-2 second delay for the socket to connect, and I'm aware that man-in-the-middle attacks can occur.
Is there a more elegant solution to this problem?
I would not call your solution naive; this is largely how FTP works, and separating the data and control paths is a good design pattern in my view.
I wouldn't worry about the man-in-the-middle thing. If you wanted, you could add a command where the client responds over the data connection with a secret the server supplies; this would let you associate the connections without relying on IP addresses (see the sketch below).
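A rough sketch of that token idea in Python; every name in it (pending_tokens, issue_token, the newline-terminated token format) is made up for illustration and not taken from any iOS API:

    import secrets
    import socket

    # Server side: hand the client a one-time token over the existing command
    # connection, and require it as the first line of the new data connection,
    # so the two can be paired without trusting the IP address.

    pending_tokens = {}                     # token -> client session object

    def issue_token(session) -> bytes:
        token = secrets.token_hex(16).encode()
        pending_tokens[token] = session
        return token + b"\n"                # send this to the client on the command socket

    def recv_line(sock: socket.socket) -> bytes:
        """Read up to a newline, one byte at a time (fine for a short token)."""
        buf = b""
        while not buf.endswith(b"\n"):
            chunk = sock.recv(1)
            if not chunk:
                break
            buf += chunk
        return buf.strip()

    def accept_data_connection(data_sock: socket.socket):
        token = recv_line(data_sock)
        session = pending_tokens.pop(token, None)
        if session is None:
            data_sock.close()               # unknown token: reject the connection
            return None
        return session                      # now receive the music file on data_sock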
If the delay is a problem, then why not establish both connections at the start? The overhead of a few TCP connections on an operating system is not usually significant.
You could also use the two connections for both commands and data, alternating between them. Since both the server and client know when a connection is busy, they can choose to use the idle one. The advantage is that keeping both connections busy ensures they are both known to be working.
You should probably also use a different thread for each socket, but I suspect you are already doing this, since it won't work too well without it.
I am using TIdCmdTCPClient and TIdCmdTCPServer. Suddenly I find that I might like to have bi-directional communication.
What would be best? Should I possibly use some other components? If so, which? Or should I kludge it and have the 'client' poll the 'server' to ask if it wishes to communicate anything?
This is a very small system. Two clients and ten servers, with a burst of one transaction every 30 to 60 seconds for a few minutes once a day, so the overhead of polling is inconsequential.
I'm just wondering if there is a 'correct' way.
Update: this really is an incredibly simple system. Very little traffic, and all of it simple. All transmissions are an indication of an event type and an optional single parameter.
<event type> [ <parameter>] e.g. "HERE_IS_SOME_DATA 42"
This can be sent in both directions; however, there is no "reply" as such. Just fire off a message (and hope that it got there)? Receive an ack with no data? Does the absence of an exception indicate that the message was successfully sent?
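For what it's worth, a protocol that simple can be sketched in a few lines; this is just an illustration in Python of the "<event type> [<parameter>]" format with an optional ack, where the "ACK" wording is made up:

    from typing import Optional, Tuple

    def encode(event: str, parameter: Optional[str] = None) -> bytes:
        body = f"{event} {parameter}" if parameter is not None else event
        return body.encode() + b"\n"

    def decode(line: bytes) -> Tuple[str, Optional[str]]:
        parts = line.decode().strip().split(" ", 1)
        return parts[0], (parts[1] if len(parts) > 1 else None)

    # Fire-and-forget:  sock.sendall(encode("HERE_IS_SOME_DATA", "42"))
    # With an ack:      the receiver answers with encode("ACK") and the sender
    #                   waits for that line before treating the message as delivered.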
Would it be possible (would it be overkill) to use two TIdCmdTCPServer?
Both TIdCmdTCPClient and TIdCmdTCPServer continuously poll their socket endpoints for inbound data during the lifetime of the connection. You do not have to do anything extra for that. So, as soon as a TIdCmdTCPClient connects to the TIdCmdTCPServer, both components will initially be in a reading state until one of them sends a command to the other.
Now, there is a problem with doing that - as soon as either component sends that first command, the receiving component will interpret it as a command and send back a reply, which the other component will interpret as a command and send back a reply, which will again be interpreted as a command and answered with a reply, and so on, causing an endless cycle of replies back and forth. For that reason, it is not wise to use TIdCmdTCPClient and TIdCmdTCPServer together. You should either use TIdTCPClient with TIdCmdTCPServer, or TIdCmdTCPClient with TIdTCPServer. Depending on what exactly your protocol looks like, you may have to forgo TIdCmdTCPClient and TIdCmdTCPServer altogether and just use TIdTCPClient with TIdTCPServer so you have more control over reading and writing on both ends. It is hard to answer with actual code without first knowing what the communication protocol should look like.
A single TCP socket connection can be used in both directions. The server can send data asynchronously to the client at any time. It is up to the client, however, to read the socket; for asynchronous processing this is done in a listener thread which reads from the socket and synchronizes incoming data operations with the main worker thread.
An example use case in the Indy components is the Telnet client component (TIdTelnet), which has a receive thread listening for server messages.
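The same listener-thread pattern, sketched generically in Python (the 4096-byte read size and the queue hand-off are just illustrative choices):

    import queue
    import socket
    import threading

    def start_listener(sock: socket.socket) -> queue.Queue:
        """One thread blocks on the socket and hands incoming data to the
        main/worker thread through a queue."""
        inbox: queue.Queue = queue.Queue()

        def reader() -> None:
            try:
                while True:
                    data = sock.recv(4096)
                    if not data:                 # peer closed the connection
                        inbox.put(None)
                        return
                    inbox.put(data)              # hand off to the main thread
            except OSError:
                inbox.put(None)                  # connection error: signal shutdown

        threading.Thread(target=reader, daemon=True).start()
        return inbox

    # Main thread: msg = inbox.get()  ->  None means "connection gone", bytes means data.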
But you also asked about the 'correct' way - and then the answer depends on other factors such as network stability, guaranteed delivery, and how to handle temporary server outages. In enterprise environments, one central messaging hub is preferred in many use cases, so that all parties connect only to this central server, which is solely responsible for reliable message delivery and keeps messages until the recipient is available.
You can download the INDY 10 TCP server demo sample code here.
I'm looking to detect local connection loss. Is there a means to do that, as with the events on the Corelabs components?
Thanks
EDIT:
Sorry, I'm going to try to be more specific:
I'm currently designing a prototype using datasnap 2009. So I've got a thin client, a stateless server app and a database server.
What I would like to be able to do is detect and handle connection loss (internet connectivity) between the client and the server app appropriately, i.e.: display an informative error message to the user, or detect a server shutdown and silently redirect to another app server.
In a 2-tier setup I used to manage that with ODAC components; the TOraSession has some events to handle these issues.
Normally no event is fired when a connection is broken, unless a statement is fired against the database. This is because there is no way of knowing about a connection loss unless some sort of is-alive pinging is going on.
Many frameworks check whether a connection is still valid by running a very small query against the server - for example, getting the time from the server. This is especially common in connection pooling environments.
You can implement a connection-checking function in your application in one of the database events (BeforeExecute?), or make a timer that checks every 10 seconds, roughly as sketched below.
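As a sketch of that timer-based check in Python: conn is whatever session object your framework gives you, and conn.execute("SELECT 1") stands in for the cheapest request it supports - both are hypothetical here, not a DataSnap or ODAC API.

    import threading

    def is_alive(conn) -> bool:
        """Issue the smallest possible round trip and report failure instead of raising."""
        try:
            conn.execute("SELECT 1")      # hypothetical: any tiny, harmless request
            return True
        except Exception:
            return False

    def start_connection_watchdog(conn, on_lost, interval: float = 10.0) -> None:
        """Check the connection every `interval` seconds; call on_lost() once it fails."""
        def check() -> None:
            if not is_alive(conn):
                on_lost()                 # e.g. show a message, or fail over to another server
            else:
                threading.Timer(interval, check).start()
        threading.Timer(interval, check).start()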
Spawn a thread on the client which periodically sends some RPC 'Ping' or 'Heartbeat' commands to the server.
if this fails, the client knows that something happened to the connection
if the server does not hear from the client for some time period (for example, two heartbeat intervals), it can conclude that the client disconnected; however, this requires a stateful server (and your design is stateless, so it would require event processing in a secondary system, which could be fed through a message queue). Both sides are sketched below.
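A rough sketch of both sides in Python, assuming a 10-second heartbeat and a made-up "PING" command:

    import socket
    import threading
    import time

    HEARTBEAT_INTERVAL = 10.0   # assumption: 10-second heartbeat

    # Client side: a background thread pings the server; if the ping ever
    # fails, the client knows the connection is gone.
    def start_heartbeat(sock: socket.socket, on_lost) -> None:
        def beat() -> None:
            while True:
                try:
                    sock.sendall(b"PING\n")   # "PING" is a made-up command name
                except OSError:
                    on_lost()
                    return
                time.sleep(HEARTBEAT_INTERVAL)
        threading.Thread(target=beat, daemon=True).start()

    # Server side (stateful variant): remember when each client was last heard
    # from and declare it disconnected after two missed heartbeats.
    last_seen = {}   # client_id -> monotonic timestamp of last message

    def note_activity(client_id: str) -> None:
        last_seen[client_id] = time.monotonic()

    def is_disconnected(client_id: str) -> bool:
        seen = last_seen.get(client_id)
        return seen is None or time.monotonic() - seen > 2 * HEARTBEAT_INTERVAL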