ZeroMQ Router Losing Messages - network-programming

The scenario is as follows:
2 ZMQ_ROUTER sockets, A and B;
A binds to an address and is assigned an identity;
B binds to an address, is assigned an identity and also connects to A;
They talk for a while, everything is fine;
I intentionally close the B socket;
The B socket is reopened, rebound and reconnected using all the same parameters, assigned the same identity as previously;
B sends message to A;
A is not seeing the message. I am, however, noticing B's message in tcpdump output (tcpdump is monitoring all traffic on A's port).
This leads me to believe that zmq for some reason drops B's second message. Can anyone explain what's going on here?

Related

Erlang unlinked process terminates

If I have a process A that makes call to a function in process B (procB:func().), and func() generates an error during execution. Process B would terminate, but what about process A? Consider the following in process A:
Case 1:
{ok, Reply} = procB:func().
Case 2:
procB:func().
Will process A terminate in both cases? Or just in case 1 because of mismatch? Please note that the two processes are not linked.
Thanks in advance!
There is no such thing as calling a function in another process, you can send a message to a process that it then may choose to call a function based on message content.
gen_servers work this way, you send a message to the gen_server, and it does a match on the message and chooses if it should invoke call/cast/info/terminate functions.
Assuming you are really talking about sending a message from A to B and B decides to exit, it's all about if process A is linked/monitoring process B.
If you monitor B, you are sent a message saying that B went down and the reason.
If you are linked to B, I believe the rule is you are killed if B died with a status other than 'normal'
A could also have set the flag trap_exit, which means that even if linked and B dies, A is sent a message that he should die and you get to interact with that message (ie: you may restart B, if you choose)
learn you some erlang has a good tutorial on how this works.
You are not able to call function in another process. That is the beauty of Erlang: all communication between processes is via message passing. People sometimes confuse modules with processes. I even wrote article about it.
For example process A:
spawns process B
sends message which is for example tuple {fun_to_call, Args, self()} (you need the self() to know, where to respond
waits for reply using receive
Process B:
immediately after start waits for message
when receives message, does some computation and sends response back
This looks like a lot of boilerplate, so this exact pattern is abstracted in gen_server

What does it mean that `gen_server` dodges auto-connections on sends but not suspends?

The gen_server implementation has this fun little function:
do_send(Dest, Msg) ->
case catch erlang:send(Dest, Msg, [noconnect]) of
noconnect ->
spawn(erlang, send, [Dest,Msg]);
Other ->
Other
end.
The entry for erlang:send/3 says of the noconnect option
If the destination node would have to be auto-connected before doing the send, noconnect is returned instead.
The function here avoids the delay in setting up a connection between nodes by forcing a spawned process to do the waiting. Clever!
There's another option to erlang:send/3, nosuspend:
If the sender would have to be suspended to do the send, nosuspend is returned instead.
Per, erlang:send_nosuspend/2 the sender will be suspended if the connection is overloaded. Why would not gen_server wish to pull the same trick to avoid suspension of the sending process?
It does this when Dest is on another erlang node. It first tries to send the message without forcing a connection to be set-up if the nodes aren't connected, the [noconnect] option. If this can be done then erlang:send/3 sends the message. If this can't be done then we spawn a process which does a send which waits for the connection to be set up. Setting up a connection between two nodes can take time. This is, of course, so we don't sit and wait unnecessarily for the send.
EDIT:
The gen_server doesn't handle the nosuspend case at all, it just worries about the case where sending a message to a remote process could take time because of the need to wait for a connection to be set up. In which case a process is spawned so we can go on. This does not change the semantics. The nosuspend does a more complex handling of eventual network problems which would probably need more complex handling than should be provided in a standard API.

path of packets through network stack

I'm trying to study and understand operations of the Linux tcp/ip stack, specifically how 'ping' sends packets down and receives them.
Ping creates raw socket in AF_INET family, therefore I placed printk in inet_sendmsg() at net/ipv4/af_inet.c to print out the socket protocol name (RAW, UDP etc.) and the address of protocol specific sendmsg function which correctly appears to be raw_sendmsg() from net/ipv4/raw.c
Now, I'm sending a single packet and observe that I'm getting printk form inet_sendmsg() twice.This puzzles me -- is it normal (has something to do with interrupts etc. ?) or there's something broken in the kernel?
Platform - ARM5te, kernel 2.6.31.8
Looking forward to hearing from you !
Mark

Does erlang:disconnect_node/2 immediately stop queued messages?

If I sent a lot of messages to a remote node and immediately call erlang:disconnect_node/2 to drop the connection, is there a chance some messages don't get through the wire? In other words, does that method perform a brutal disconnection, regardless of waiting messages?
No, even with two local nodes!
Setup: I got a node a#super, on witch a dummy receive-print loop runs, registered with a. On another node, I run
(b#super)1> [{a, a#super} ! X || X <- lists:seq(0,10000)], erlang:disconnect_node(a#super).
That is, many messages, and then a brutal disconnection.
Result: the receiver printed the full 10001 messages only once over 10 runs.
So, you definitely do not have any guarantee the receiver got all the messages. You should use another technique (novice at erlang, sorry), or use an ack message before the disconnect.

MQ Connection - 2009 error

am connectting the MQ with below code. I am able connected to MQ successfully. My case is i place the messages to MQ every 1 min once. After disconnecting the cable i get a ResonCode error but IsConnected property still show true. Is this is the right way to check if the connection is still connected ? Or there any best pratcices around that.
I would like to open the connection when applicaiton is started keep it open for ever.
public static MQQueueManager ConnectMQ()
{
if ((queueManager == null) || (!queueManager.IsConnected)||(queueManager.ReasonCode == 2009))
{
queueManager = new MQQueueManager();
}
return queueManager;
}
The behavior of the WMQ client connection is that when idle it will appear to be connected until an API call fails or the connection times out. So isConnected() will likely report true until a get, put or inquire call is attempted and fails, at which point QMgr will then report disconnected.
The other thing to consider here is that 2009 is not the only code you might get. It happens to be the one you get when the connection is severed but there are connection codes for QMgr shutting down, channel shutting down, and a variety of resource and other errors.
Typically for a requirement to maintain a constant connection you would want to wrap the connect and message processing loop inside a try/catch block nested inside a while statement. When you catch an exception other than an intentional exit, close the objects and QMgr, sleep at least 5 seconds, then loop around to the top of the while. The sleep is crucial because if you get caught in a tight reconnect loop and throw hundreds of connection attempts at the QMgr, you can bring even a mainframe QMgr to its knees.
An alternative is to use a v7 WMQ client and QMgr. With this combination, automatic reconnection is configurable as a channel configuration.

Resources