MQ Connection - 2009 error - checking whether the connection is still connected

I am connecting to MQ with the code below and can connect successfully. My application puts a message to MQ once every minute. After disconnecting the cable I get a ReasonCode error, but the IsConnected property still shows true. Is this the right way to check whether the connection is still connected, or are there any best practices around that?
I would like to open the connection when the application starts and keep it open forever.
public static MQQueueManager ConnectMQ()
{
    // Reconnect if we have never connected, the client reports the connection
    // as broken, or the last call failed with MQRC_CONNECTION_BROKEN (2009).
    if ((queueManager == null) || (!queueManager.IsConnected) || (queueManager.ReasonCode == 2009))
    {
        queueManager = new MQQueueManager();
    }
    return queueManager;
}

The behavior of the WMQ client connection is that, when idle, it will appear to be connected until an API call fails or the connection times out. So IsConnected will likely report true until a get, put, or inquire call is attempted and fails, at which point the QMgr will report itself as disconnected.
The other thing to consider here is that 2009 is not the only reason code you might get. It happens to be the one you get when the connection is severed, but there are also codes for the QMgr shutting down, the channel shutting down, and a variety of resource and other errors.
Typically for a requirement to maintain a constant connection you would want to wrap the connect and message processing loop inside a try/catch block nested inside a while statement. When you catch an exception other than an intentional exit, close the objects and QMgr, sleep at least 5 seconds, then loop around to the top of the while. The sleep is crucial because if you get caught in a tight reconnect loop and throw hundreds of connection attempts at the QMgr, you can bring even a mainframe QMgr to its knees.
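As a rough illustration of that pattern, here is a minimal C# sketch; the queue manager name, queue name, and payload are placeholders, and real code would add logging and a clean shutdown path:
using System;
using System.Threading;
using IBM.WMQ;

// Sketch of a keep-it-connected loop: connect, process, and on any MQ error
// clean up, back off, and reconnect. Names are placeholders for this example.
public static void RunForever()
{
    while (true)
    {
        MQQueueManager qMgr = null;
        try
        {
            qMgr = new MQQueueManager("QMGR.NAME");
            MQQueue queue = qMgr.AccessQueue("QUEUE.NAME",
                MQC.MQOO_OUTPUT | MQC.MQOO_FAIL_IF_QUIESCING);

            while (true)   // message-processing loop
            {
                MQMessage msg = new MQMessage();
                msg.WriteString("payload " + DateTime.UtcNow.ToString("o"));
                queue.Put(msg);
                Thread.Sleep(TimeSpan.FromMinutes(1));   // one message per minute
            }
        }
        catch (MQException)
        {
            // 2009 or any other MQ reason code: fall through and reconnect.
        }
        finally
        {
            try { if (qMgr != null) qMgr.Disconnect(); } catch (MQException) { }
        }
        Thread.Sleep(5000);   // the crucial back-off before the next connection attempt
    }
}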
An alternative is to use a v7 WMQ client and QMgr. With that combination, automatic reconnection can be configured on the channel.

Related

Bidirectional gRPC stream sometimes stops processing responses after stopping and starting

In short
We have a mobile app that streams fairly high volumes of data to and from a server through various bidirectional streams. The streams need to be closed on occasion (for example when the app is backgrounded). They are then reopened as needed. Sometimes when this happens, something goes wrong:
From what I can tell, the stream is up and running on the device's side (the status of both the GRPCProtoCall and the GRXWriter involved is either started or paused)
The device sends data on the stream fine (the server receives the data)
The server seems to send data back to the device fine (the server's Stream.Send calls return as successful)
On the device, the result handler for data received on the stream is never called
More detail
Our code is heavily simplified below, but this should hopefully provide enough detail to indicate what we're doing. A bidirectional stream is managed by a Switch class:
class Switch {

    /** The protocall over which we send and receive data. */
    var protocall: GRPCProtoCall?

    /** The writer object that writes data to the protocall. */
    var writer: GRXBufferedPipe?

    /** A static GRPCProtoService as per the .proto. */
    static let service = APPDataService(host: Settings.grpcHost)

    /** A response handler. APPData is the datatype defined by the .proto. */
    func rpcResponse(done: Bool, response: APPData?, error: Error?) {
        NSLog("Response received")
        // Handle response...
    }

    func start() {
        // Create a (new) instance of the writer
        // (a writer cannot be used on multiple protocalls).
        self.writer = GRXBufferedPipe()
        // Set up the protocall.
        self.protocall = Switch.service.rpcToStream(withRequestWriter: self.writer!,
                                                    eventHandler: self.rpcResponse(done:response:error:))
        // Start the stream.
        self.protocall?.start()
    }

    func stop() {
        // Stop the writer if it is started.
        if self.writer?.state == .started || self.writer?.state == .paused {
            self.writer?.finishWithError(nil)
        }
        // Stop the protocall if it is started.
        if self.protocall?.state == .started || self.protocall?.state == .paused {
            self.protocall?.cancel()
        }
        self.protocall = nil
    }

    private var needsRestart: Bool {
        if let protocall = self.protocall, let writer = self.writer {
            if protocall.state == .notStarted || protocall.state == .finished {
                // protocall exists, but isn't running.
                return true
            } else if writer.state == .notStarted || writer.state == .finished {
                // writer isn't running.
                return true
            } else {
                // protocall and writer are running.
                return false
            }
        } else {
            // protocall (or writer) doesn't exist.
            return true
        }
    }

    func restartIfNeeded() {
        guard self.needsRestart else { return }
        self.stop()
        self.start()
    }

    func write(data: APPData) {
        self.writer?.writeValue(data)
    }
}
Like I said, heavily simplified, but it shows how we start, stop, and restart streams, and how we check whether a stream is healthy.
When the app is backgrounded, we call stop(). When it is foregrounded and we need the stream again, we call start(). And we periodically call restartIfNeeded(), e.g. when screens that use the stream come into view.
As I mentioned above, what happens occasionally is that our response handler (rpcResponse) stops getting called when the server writes data to the stream. The stream appears to be healthy (the server receives the data we write to it, and protocall.state is neither .notStarted nor .finished). But not even the log on the first line of the response handler is executed.
First question: Are we managing the streams correctly, or is our way of stopping and restarting streams prone to errors? If so, what is the correct way of doing something like this?
Second question: How do we debug this? Everything we can query for a status tells us that the stream is up and running, but it feels like the objc gRPC library keeps a lot of its mechanics hidden from us. Is there a way to see whether responses from the server do reach us but fail to trigger our response handler?
Third question: As per the code above, we use the GRXBufferedPipe provided by the library. Its documentation advises against using it in production because it doesn't have a push-back mechanism. To our understanding, the writer is only used to feed data to the gRPC core in a synchronised, one-at-a-time fashion, and since the server receives data from us fine, we don't think this is an issue. Are we wrong, though? Is the writer also involved in feeding data received from the server to our response handler? I.e. if the writer broke due to overload, could that manifest as a problem reading data from the stream, rather than writing to it?
UPDATE: Over a year after asking this, we finally found a deadlock bug in our server-side code that was causing this behaviour on the client side. The streams appeared to hang because no communication sent by the client was handled by the server, and vice versa, but the streams were actually alive and well. The accepted answer provides good advice for how to manage these bidirectional streams, which I believe is still valuable (it helped us a lot!). But the issue was actually due to a programming error.
Also, for anyone running into this type of issue, it might be worth investigating whether you're experiencing this known issue where a channel gets silently dropped when iOS changes its network. This readme provides instructions for using Apple's CFStream API rather than TCP sockets as a possible fix for that issue.
First question: Are we managing the streams correctly, or is our way of stopping and restarting streams prone to errors? If so, what is the correct way of doing something like this?
From what I can tell by looking at your code, the start() function seems to be right. In the stop() function, you do not need to call cancel() on self.protocall; the call will already be finished by the preceding self.writer.finishWithError(nil).
needsRestart is where it gets a bit messy. First, you are not supposed to poll or set the state of protocall yourself; that state is managed by the call itself. Second, setting those states does not close your stream; it only pauses a writer, and if the app is in the background, pausing a writer is effectively a no-op. If you want to close a stream, use finishWithError to terminate the call, and start a new call later when needed.
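For example (reusing the Switch names from the question purely as a sketch, not a drop-in implementation), stopping could be reduced to finishing the request writer:
func stop() {
    // Finishing the request writer ends the call; no separate cancel() is needed.
    if self.writer?.state == .started || self.writer?.state == .paused {
        self.writer?.finishWithError(nil)
    }
    self.writer = nil
    self.protocall = nil
}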
Second question: How do we debug this?
One way is to turn on the gRPC logs (GRPC_TRACE and GRPC_VERBOSITY). Another way is to set a breakpoint at the point in the gRPC objc library where it receives a gRPC message from the server.
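For instance, assuming the variables are set before the gRPC runtime is first touched (e.g. very early in application launch), enabling tracing from Swift could look roughly like this; the chosen tracer list is just an example:
import Foundation

// Enable verbose gRPC core logging; must run before the first gRPC call is created.
// "all" is very noisy; a narrower list such as "api,http" can be easier to read.
setenv("GRPC_TRACE", "all", 1)
setenv("GRPC_VERBOSITY", "DEBUG", 1)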
Third question: Is the writer also involved in feeding data received from server to our response handler?
No. If you create a buffered pipe and feed it in as the request of your call, it only feeds data to be sent to the server. The receiving path is handled by another writer (which is in fact your protocall object).
I don't see where the usage of GRXBufferedPipe in production is discouraged. The known drawback of this utility is that if you pause the writer but keep writing data to it with writeValue, you end up buffering a lot of data without being able to flush it, which may cause memory issues.

Is NSStream.close() synchronous wrt TCP?

I have input and output NSStreams as part of a TCP connection after using NSStream.getStreamsToHostWithName(). If I call the close() method on those input and output streams, then by the time those calls return, will my TCP connection be in the CLOSED state?
If not, how could I determine the time at which the underlying TCP connection actually closes?
Closing the stream terminates the flow of bytes and releases system resources that were reserved for the stream when it was opened. If the stream has been scheduled on a run loop, closing the stream implicitly removes the stream from the run loop. A stream that is closed can still be queried for its properties.
var streamStatus: NSStreamStatus { get }
The receiver’s status
enum NSStreamStatus : UInt {
    case NotOpen
    case Opening
    case Open
    case Reading
    case Writing
    case AtEnd
    case Closed
    case Error
}
As far as I can tell, the answer is no, the TCP connection will not necessarily be in the CLOSED state. To make sure the TCP connection does a graceful close and find out when that happens, one must use a lower-level API than NSStream.
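To give a sense of what "lower-level" means here, the sketch below uses plain BSD sockets from Swift (not NSStream) to perform a graceful close and observe when the peer has finished; the function name and buffer size are arbitrary:
import Foundation

// Sketch: gracefully close a connected socket `fd` and wait for the peer's FIN.
// read() returning 0 means the other side has also shut down its sending half,
// so the close handshake has completed from our point of view (the kernel may
// still hold the socket in TIME_WAIT afterwards).
func gracefulClose(fd: Int32) {
    shutdown(fd, SHUT_WR)                      // send our FIN; no more writes
    var buffer = [UInt8](repeating: 0, count: 4096)
    while read(fd, &buffer, buffer.count) > 0 {
        // Drain any data the peer sends before closing its side.
    }
    close(fd)                                  // release the descriptor
}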

What does it mean that `gen_server` dodges auto-connections on sends but not suspends?

The gen_server implementation has this fun little function:
do_send(Dest, Msg) ->
    case catch erlang:send(Dest, Msg, [noconnect]) of
        noconnect ->
            spawn(erlang, send, [Dest, Msg]);
        Other ->
            Other
    end.
The entry for erlang:send/3 says of the noconnect option
If the destination node would have to be auto-connected before doing the send, noconnect is returned instead.
The function here avoids the delay in setting up a connection between nodes by forcing a spawned process to do the waiting. Clever!
There's another option to erlang:send/3, nosuspend:
If the sender would have to be suspended to do the send, nosuspend is returned instead.
Per erlang:send_nosuspend/2, the sender will be suspended if the connection is overloaded. Why would gen_server not want to pull the same trick to avoid suspension of the sending process?
It does this when Dest is on another Erlang node. It first tries to send the message without forcing a connection to be set up if the nodes aren't connected (the [noconnect] option). If this can be done, then erlang:send/3 sends the message. If it can't, we spawn a process which does a send that waits for the connection to be set up. Setting up a connection between two nodes can take time. This is, of course, so we don't sit and wait unnecessarily for the send.
EDIT:
The gen_server doesn't handle the nosuspend case at all; it just worries about the case where sending a message to a remote process could take time because a connection has to be set up first, in which case a process is spawned so the caller can carry on. This does not change the semantics. Handling nosuspend means dealing with eventual network problems, which would probably need more complex handling than should be provided in a standard API.
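Purely to illustrate what such a variant would look like (this is hypothetical; it is not what gen_server does), the same spawn trick could be extended to the nosuspend return value:
%% Hypothetical sketch only -- not the OTP implementation. A spawned process
%% absorbs the wait whether the problem is a missing connection or a busy port.
do_send_nowait(Dest, Msg) ->
    case catch erlang:send(Dest, Msg, [noconnect, nosuspend]) of
        noconnect -> spawn(erlang, send, [Dest, Msg]);
        nosuspend -> spawn(erlang, send, [Dest, Msg]);
        Other     -> Other
    end.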

libevent : whether event can be triggered if the related socket is closed by local program

if I add an event for a connection socket which is returned by accept(), as below
event_set(&conn_ev, connfd, EV_READ|EV_PERSIST, on_recv, NULL);
event_base_set(base, &conn_ev);
event_add(&conn_ev, NULL);
If at some time the local program (not the peer) closes the socket, will conn_ev be triggered?
If so, how do I detect whether the event is due to the closing of the socket?
Is it that recv(connfd, ...) returns -1 with errno set to EBADF, or are there other cases?
Thanks!
All sockets are marked as readable if the socket is nicely closed by the other end, with read returning zero. When an error occurs they are marked as either readable or writable, with read or write returning -1.
See e.g. the socket(7) manual page for a table of states.
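To make that concrete, an on_recv callback for the event registered above could distinguish the cases roughly like this (a sketch; buffer size and logging are arbitrary):
#include <errno.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>

/* Read callback for conn_ev: distinguish normal data, peer close, and errors. */
static void on_recv(int fd, short event, void *arg)
{
    char buf[4096];
    ssize_t n = recv(fd, buf, sizeof(buf), 0);

    if (n > 0) {
        printf("received %zd bytes\n", n);         /* normal data */
    } else if (n == 0) {
        printf("peer closed the connection\n");    /* orderly shutdown by the peer */
        close(fd);
    } else {
        printf("recv error, errno=%d\n", errno);   /* inspect errno for the cause */
    }
}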

Does erlang:disconnect_node/1 immediately stop queued messages?

If I send a lot of messages to a remote node and immediately call erlang:disconnect_node/1 to drop the connection, is there a chance some messages don't make it over the wire? In other words, does that function perform a brutal disconnection, regardless of waiting messages?
There is no delivery guarantee, even with two local nodes!
Setup: I have a node a@super, on which a dummy receive-print loop runs, registered as a. On another node, I run
(b@super)1> [{a, a@super} ! X || X <- lists:seq(0,10000)], erlang:disconnect_node(a@super).
That is, many messages, and then a brutal disconnection.
Result: the receiver printed the full 10001 messages only once over 10 runs.
So you definitely do not have any guarantee that the receiver got all the messages. You should use another technique (I'm a novice at Erlang, sorry), or use an ack message before the disconnect.
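A minimal sketch of that ack idea (assuming, as in the experiment above, a receiver registered as a on a@super; the names and timeout are illustrative only):
%% Sender side (e.g. on b@super): send the batch, then a sync request, and
%% only disconnect once the receiver has acknowledged it.
send_and_disconnect(Msgs) ->
    Dest = {a, a@super},
    [Dest ! M || M <- Msgs],
    Dest ! {sync, self()},
    receive
        ack -> ok
    after 5000 ->
        {error, no_ack}
    end,
    erlang:disconnect_node(a@super).

%% Receiver side (registered as a on a@super): print messages and answer sync.
loop() ->
    receive
        {sync, From} ->
            From ! ack,
            loop();
        Msg ->
            io:format("got ~p~n", [Msg]),
            loop()
    end.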
