Sudden exception during UDP streaming

I have an open UDP connection that streams video for several hours between two machines on different VLANs.
After several hours I get the following exception on the server side (the transmitter):
System.Net.Sockets.SocketException: A blocking operation was interrupted by a call to WSACancelBlockingCall
   at System.Net.Sockets.Socket.Send(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
From that moment on, from time to time (not on every send), I see the following:
System.Net.Sockets.SocketException: A non-blocking socket operation could not be completed immediately
   at System.Net.Sockets.Socket.Send(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
On the client side I see no exception or abnormal behavior.
Is it possible that I'm getting this exception due to a network problem, for example something in the switch?
Any other ideas about what could cause these exceptions?
Thanks

I will make a wild guess about the WSACancelBlockingCall exception.
Most likely you are closing the socket from another thread, or the socket is being disposed somehow, for example by the garbage collector.
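A minimal sketch of that first scenario (the class name, endpoint, and frame size are illustrative, not from the original post): closing a socket from a second thread while Send is blocked produces exactly this exception.

using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

class UdpCloseRepro
{
    // Shared socket; Close() from another thread interrupts a blocked Send().
    static readonly Socket Sock =
        new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp);

    static void Main()
    {
        Sock.Connect(new IPEndPoint(IPAddress.Loopback, 9000)); // illustrative endpoint

        var sender = new Thread(() =>
        {
            var frame = new byte[1400]; // illustrative datagram size
            try
            {
                while (true)
                    Sock.Send(frame, 0, frame.Length, SocketFlags.None);
            }
            catch (SocketException ex)
            {
                // "A blocking operation was interrupted by a call to WSACancelBlockingCall"
                Console.WriteLine(ex.Message);
            }
            catch (ObjectDisposedException)
            {
                // Raised instead if the socket was already disposed before Send was entered.
            }
        });
        sender.Start();

        Thread.Sleep(100);
        Sock.Close(); // the cross-thread close that triggers the exception
        sender.Join();
    }
}

If something like this is happening, it could also explain the later intermittent "could not be completed immediately" (WSAEWOULDBLOCK) errors, since an interrupted socket can be left in an inconsistent, non-blocking state afterwards.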

Transactions not expiring after timeout expires

We're using neo4j (3.1.5-enterprise) for one of our services (over HTTP).
We set dbms.transaction.timeout=150s in our neo4j config file.
We have a scenario which may take more than 150 seconds, but we would like the transaction to expire after 150 seconds anyway.
For some reason this is not happening: the transaction continues until it is fully executed and is not stopped after 150 seconds. Any guess why?
In our application logs I can see the following exception (more stacktrace details below):
neo.db.NeoHttpDriver - Errors in response:
[NeoResponseError{
code='Neo.DatabaseError.Statement.ExecutionFailed',
message='Transaction timeout. (Overtime: 23793 ms).',
stackTrace='org.neo4j.kernel.guard.GuardTimeoutException: Transaction timeout. (Overtime: 23793 ms).
...
Also, in the specific scenario that may take a long time, our service's steps are generally: open a transaction, lock some common entity, and proceed. Since the transaction is not expired and released after 150 seconds (and therefore the common entity stays locked), other threads may also be blocked for a long time.
Thanks!
Orel
Exception stacktrace:
15:00:59.627 [DefaultThreadPool-7] DEBUG c.e.e.m.neo.db.NeoHttpDriver - Errors in response: [NeoResponseError{code='Neo.DatabaseError.Statement.ExecutionFailed', message='Transaction timeout. (Overtime: 23793 ms).', stackTrace='org.neo4j.kernel.guard.GuardTimeoutException: Transaction timeout. (Overtime: 23793 ms).
at org.neo4j.kernel.guard.TimeoutGuard.check(TimeoutGuard.java:71)
at org.neo4j.kernel.guard.TimeoutGuard.check(TimeoutGuard.java:57)
at org.neo4j.kernel.guard.TimeoutGuard.check(TimeoutGuard.java:49)
at org.neo4j.kernel.impl.api.GuardingStatementOperations.nodeCursorById(GuardingStatementOperations.java:300)
at org.neo4j.kernel.impl.api.OperationsFacade.nodeHasProperty(OperationsFacade.java:343)
at org.neo4j.cypher.internal.spi.v3_1.TransactionBoundQueryContext$NodeOperations.hasProperty(TransactionBoundQueryContext.scala:319)
at org.neo4j.cypher.internal.compatibility.ExceptionTranslatingQueryContextFor3_1$ExceptionTranslatingOperations$$anonfun$hasProperty$1.apply$mcZ$sp(ExceptionTranslatingQueryContextFor3_1.scala:245)
at org.neo4j.cypher.internal.compatibility.ExceptionTranslatingQueryContextFor3_1$ExceptionTranslatingOperations$$anonfun$hasProperty$1.apply(ExceptionTranslatingQueryContextFor3_1.scala:245)
at org.neo4j.cypher.internal.compatibility.ExceptionTranslatingQueryContextFor3_1$ExceptionTranslatingOperations$$anonfun$hasProperty$1.apply(ExceptionTranslatingQueryContextFor3_1.scala:245)
at org.neo4j.cypher.internal.spi.v3_1.ExceptionTranslationSupport$class.translateException(ExceptionTranslationSupport.scala:32)
at org.neo4j.cypher.internal.compatibility.ExceptionTranslatingQueryContextFor3_1.translateException(ExceptionTranslatingQueryContextFor3_1.scala:34)
at org.neo4j.cypher.internal.compatibility.ExceptionTranslatingQueryContextFor3_1$ExceptionTranslatingOperations.hasProperty(ExceptionTranslatingQueryContextFor3_1.scala:245)
at org.neo4j.cypher.internal.compiler.v3_1.spi.DelegatingOperations.hasProperty(DelegatingQueryContext.scala:221)
at org.neo4j.cypher.internal.compiler.v3_1.pipes.AbstractSetPropertyOperation.setProperty(SetOperation.scala:98)
at org.neo4j.cypher.internal.compiler.v3_1.pipes.SetEntityPropertyOperation.set(SetOperation.scala:117)
at org.neo4j.cypher.internal.compiler.v3_1.pipes.SetPipe$$anonfun$internalCreateResults$1.apply(SetPipe.scala:31)
at org.neo4j.cypher.internal.compiler.v3_1.pipes.SetPipe$$anonfun$internalCreateResults$1.apply(SetPipe.scala:30)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator$$anonfun$next$1.apply(ResultIterator.scala:71)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator$$anonfun$next$1.apply(ResultIterator.scala:68)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator$$anonfun$failIfThrows$1.apply(ResultIterator.scala:94)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator.decoratedCypherException(ResultIterator.scala:103)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator.failIfThrows(ResultIterator.scala:92)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator.next(ResultIterator.scala:68)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator.next(ResultIterator.scala:49)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator.foreach(ResultIterator.scala:49)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
at scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:183)
at scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:45)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator.to(ResultIterator.scala:49)
at scala.collection.TraversableOnce$class.toList(TraversableOnce.scala:294)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator.toList(ResultIterator.scala:49)
at org.neo4j.cypher.internal.compiler.v3_1.EagerResultIterator.<init>(ResultIterator.scala:35)
at org.neo4j.cypher.internal.compiler.v3_1.ClosingIterator.toEager(ResultIterator.scala:53)
at org.neo4j.cypher.internal.compiler.v3_1.executionplan.DefaultExecutionResultBuilderFactory$ExecutionWorkflowBuilder.buildResultIterator(DefaultExecutionResultBuilderFactory.scala:109)
at org.neo4j.cypher.internal.compiler.v3_1.executionplan.DefaultExecutionResultBuilderFactory$ExecutionWorkflowBuilder.createResults(DefaultExecutionResultBuilderFactory.scala:99)
at org.neo4j.cypher.internal.compiler.v3_1.executionplan.DefaultExecutionResultBuilderFactory$ExecutionWorkflowBuilder.build(DefaultExecutionResultBuilderFactory.scala:68)
at org.neo4j.cypher.internal.compiler.v3_1.executionplan.InterpretedExecutionPlanBuilder$$anonfun$getExecutionPlanFunction$1.apply(ExecutionPlanBuilder.scala:164)
at org.neo4j.cypher.internal.compiler.v3_1.executionplan.InterpretedExecutionPlanBuilder$$anonfun$getExecutionPlanFunction$1.apply(ExecutionPlanBuilder.scala:148)
at org.neo4j.cypher.internal.compiler.v3_1.executionplan.InterpretedExecutionPlanBuilder$$anon$1.run(ExecutionPlanBuilder.scala:123)
at org.neo4j.cypher.internal.compatibility.CompatibilityFor3_1$ExecutionPlanWrapper$$anonfun$run$1.apply(CompatibilityFor3_1.scala:275)
at org.neo4j.cypher.internal.compatibility.CompatibilityFor3_1$ExecutionPlanWrapper$$anonfun$run$1.apply(CompatibilityFor3_1.scala:273)
at org.neo4j.cypher.internal.compatibility.exceptionHandlerFor3_1$runSafely$.apply(CompatibilityFor3_1.scala:190)
at org.neo4j.cypher.internal.compatibility.CompatibilityFor3_1$ExecutionPlanWrapper.run(CompatibilityFor3_1.scala:273)
at org.neo4j.cypher.internal.PreparedPlanExecution.execute(PreparedPlanExecution.scala:26)
at org.neo4j.cypher.internal.ExecutionEngine.execute(ExecutionEngine.scala:107)
at org.neo4j.cypher.internal.javacompat.ExecutionEngine.executeQuery(ExecutionEngine.java:59)
at org.neo4j.server.rest.transactional.TransactionHandle.safelyExecute(TransactionHandle.java:371)
at org.neo4j.server.rest.transactional.TransactionHandle.executeStatements(TransactionHandle.java:323)
at org.neo4j.server.rest.transactional.TransactionHandle.execute(TransactionHandle.java:230)
at org.neo4j.server.rest.transactional.TransactionHandle.execute(TransactionHandle.java:119)
at org.neo4j.server.rest.web.TransactionalService.lambda$executeStatements$0(TransactionalService.java:203)
Most likely the problem is that the tx is waiting on a lock. Prior to Neo4j 3.2, dbms.transaction.timeout cannot cover the case of terminating a transaction that's waiting on a lock (or rather, it will mark it for termination, but the actual termination won't happen until the lock is acquired).
In Neo4j 3.2, dbms.lock.acquisition.timeout was introduced, which interrupts waiting on a lock and allows the thread to check if the tx has been terminated and take appropriate action.
The following is based on an answer provided by Neo4j Support:
dbms.lock.acquisition.timeout
As a starting point, dbms.lock.acquisition.timeout was only added in Neo4j 3.2; it does not exist in 3.1. Without a lock acquisition timeout, waits on locks can overrun the configured transaction timeout, and things like GC pauses can extend the time further. Since you are currently on 3.1.5, dbms.lock.acquisition.timeout cannot be enforced.
dbms.transaction.timeout
dbms.transaction.timeout marks a transaction for termination, but the logic that checks this mark and performs the termination runs on an executing thread, not on one waiting for locks, and setting the mark does not cause a waiting thread to wake up and check it. Presumably some other thread periodically checks each transaction's execution time and, if it has exceeded the transaction timeout, sets a boolean flag on the transaction to mark it for termination. The actual termination likely happens in an event loop for the transaction, which checks that flag and, if it is set, terminates the transaction and rolls it back.
A thread that attempts to acquire a lock already held by another thread enters a waiting state. During this waiting state the event loop is not being processed, so the thread never reaches the point in the event loop where it could see that it has been marked for termination and act on it.
Bottom line:
dbms.transaction.timeout does not cause a hard timeout; it only marks the transaction as timed out, which causes it to roll back once the flag is checked.
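For reference, on Neo4j 3.2 or later the two settings discussed above would sit together in neo4j.conf roughly like this (the 150s transaction timeout is from the question; giving the lock timeout the same value is illustrative):

# neo4j.conf (Neo4j 3.2+; dbms.lock.acquisition.timeout does not exist in 3.1.x)
# Mark transactions for termination after 150 seconds.
dbms.transaction.timeout=150s
# Interrupt lock waits so a marked transaction can actually stop and roll back.
dbms.lock.acquisition.timeout=150s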

Is NSStream.close() synchronous wrt TCP?

I have input and output NSStreams for a TCP connection obtained from NSStream.getStreamsToHostWithName(). If I call the close() method on those input and output streams, will my TCP connection be in the CLOSED state by the time those calls return?
If not, how could I determine the time at which the underlying TCP connection actually closes?
Closing the stream terminates the flow of bytes and releases system resources that were reserved for the stream when it was opened. If the stream has been scheduled on a run loop, closing the stream implicitly removes the stream from the run loop. A stream that is closed can still be queried for its properties.
var streamStatus: NSStreamStatus { get }
The receiver’s status
enum NSStreamStatus : UInt {
    case NotOpen
    case Opening
    case Open
    case Reading
    case Writing
    case AtEnd
    case Closed
    case Error
}
As far as I can tell, the answer is no, the TCP connection will not necessarily be in the CLOSED state. To make sure the TCP connection does a graceful close and find out when that happens, one must use a lower-level API than NSStream.

Indy server TIdTCPServer.OnExecute called continuously after client disconnect

I have a situation where I know how many bytes I expect to get from the client (in this code it is 100). If the client doesn't give me 100 bytes, I don't need the packet at all, because it is corrupted.
Sometimes the client sends fewer than 100 bytes and disconnects. In that case I catch the exception, and everything would be fine, except that after this exception my ServerTCPExecute is called repeatedly. There are no other clients connected. Why is that? If I call TIdYarnOfThread(AContext.Yarn).Thread.Terminate; everything goes fine, but I'm not sure it is a good solution.
procedure TForm1.ServerTCPExecute(AContext: TIdContext);
begin
  try
    AContext.Connection.IOHandler.ReadBytes(b, 100, False);
  except
    //TIdYarnOfThread(AContext.Yarn).Thread.Terminate;
  end;
end;
Solution: do not use try ... except to catch and ignore all exceptions within OnExecute.
This swallows the Indy exception that is raised when the client disconnects. The server then executes the next iteration of OnExecute if the IOHandler.InputBuffer still has data in it, ReadBytes raises another exception because the peer has disconnected, and the cycle repeats.
If any exception is raised and leaves the method, the Indy TCP server will close and clean up the connection.
Indy uses exceptions for error handling and notifications in OnExecute, so do not suppress them with an "empty" exception handler. If you need to catch exceptions, re-raise any EIdException-derived exceptions you catch and let the server handle them.
p.s.: the posted code seems to use a non-local variable b. Because TIdTCPServer is a multi-threaded component, non-local variables must always be accessed in a thread-safe way, for example using a critical section, or by putting the variable inside the TIdContext object.

libevent: whether an event can be triggered if the related socket is closed by the local program

If I add an event for a connection socket returned by accept(), as below:
event_set(&conn_ev, connfd, EV_READ|EV_PERSIST, on_recv, NULL);
event_base_set(base, &conn_ev);
event_add(&conn_ev, NULL);
If at some point the local program (not the peer) closes the socket, will conn_ev be triggered?
If so, how can I detect whether the event is due to the closing of the socket?
Is it that recv(connfd, ...) returns -1 and errno is set to EBADF, or are there other cases?
Thanks!
All sockets are marked as readable if the socket is nicely closed by the other end, with read returning zero. When an error occurs, they are marked as either readable or writable, with read or write returning -1.
See e.g. the socket(7) manual page for a table of states.

MQ Connection - reason code 2009

I am connecting to MQ with the code below, and I can connect successfully. My scenario is that I put messages to MQ once every minute. After disconnecting the cable, I get a ReasonCode error, but the IsConnected property still shows true. Is this the right way to check whether the connection is still connected? Or are there any best practices around that?
I would like to open the connection when the application starts and keep it open forever.
// Assumes the IBM.WMQ .NET classes and a static field holding the connection.
private static MQQueueManager queueManager;

public static MQQueueManager ConnectMQ()
{
    // Reconnect if we never connected, lost the connection, or last saw MQRC 2009.
    if ((queueManager == null) || (!queueManager.IsConnected) || (queueManager.ReasonCode == 2009))
    {
        queueManager = new MQQueueManager();
    }
    return queueManager;
}
The behavior of the WMQ client connection is that when idle it will appear to be connected until an API call fails or the connection times out. So IsConnected will likely report true until a get, put, or inquire call is attempted and fails, at which point the QMgr will report disconnected.
The other thing to consider here is that 2009 is not the only code you might get. It happens to be the one you get when the connection is severed, but there are also reason codes for the QMgr shutting down, the channel shutting down, and a variety of resource and other errors.
Typically for a requirement to maintain a constant connection you would want to wrap the connect and message processing loop inside a try/catch block nested inside a while statement. When you catch an exception other than an intentional exit, close the objects and QMgr, sleep at least 5 seconds, then loop around to the top of the while. The sleep is crucial because if you get caught in a tight reconnect loop and throw hundreds of connection attempts at the QMgr, you can bring even a mainframe QMgr to its knees.
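A minimal sketch of that connect-and-process loop (hedged: it assumes the IBM.WMQ .NET classes, and the queue name and payload are hypothetical; the one-minute put interval and five-second sleep come from the question and the advice above):

using System;
using System.Threading;
using IBM.WMQ; // amqmdnet; assumed available

class MqSender
{
    static volatile bool running = true;

    static void Main()
    {
        while (running)
        {
            MQQueueManager qMgr = null;
            try
            {
                // Connect and run the message-processing loop inside one try block.
                qMgr = new MQQueueManager();
                MQQueue queue = qMgr.AccessQueue("MY.QUEUE", // hypothetical queue name
                    MQC.MQOO_OUTPUT | MQC.MQOO_FAIL_IF_QUIESCING);
                while (running)
                {
                    MQMessage msg = new MQMessage();
                    msg.WriteString("payload"); // illustrative message body
                    queue.Put(msg);
                    Thread.Sleep(TimeSpan.FromMinutes(1)); // "every 1 min" per the question
                }
            }
            catch (MQException)
            {
                // 2009 (connection broken), QMgr quiescing, channel stopped, etc.
            }
            finally
            {
                try { if (qMgr != null) qMgr.Disconnect(); }
                catch (MQException) { /* connection already gone */ }
            }
            Thread.Sleep(5000); // sleep at least 5 seconds before reconnecting
        }
    }
}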
An alternative is to use a v7 WMQ client and QMgr. With this combination, automatic reconnection can be configured on the channel.
