I have a .NET client that connects and works fine for a while, then disconnects and fails to reconnect because of an Invalid Selector error. No selector is being passed, so I don't know what the issue is.
See log below:
2019-01-28T19:47:56.8258311Z SolaceConsumer FlowEvent ParentSessionDown Session for Flow disconnected
2019-01-28T19:47:56.8264139Z SolaceConsumer SessionEvent Reconnecting solClientOS.c:5745 (7f50727bb700) Peer closed socket, fd 92, cannot read
2019-01-28T19:47:57.1187156Z SolaceConsumer SessionEvent Reconnected host 'serverSolace:55003', hostname 'serverSolace:55003' IP 54.245 (host 1 of 1) (host connection attempt 1 of 1) (total reconnection attempt 1 of -1)
2019-01-28T19:47:57.1204652Z SolaceConsumer solClientFlow.c:4286 (7f50727bb700) Selector Invalid: exceeds 1023 bytes
2019-01-28T19:47:57.1205778Z SolaceConsumer FlowEvent BindFailedError Selector Invalid: exceeds 1023 bytes
This is a known bug in Solace version 10.3.0.0.
Solace fixed it in version 10.3.2.0, so upgrading to that version or higher should resolve the issue.
We implemented connection pooling in our client code to call a server that closes each connection (it sends Connection: close in the response headers) after 2.5 minutes. Because of this server behaviour we intermittently get NoHttpResponseException, and it can occur at high TPS as well as at low TPS.
We are using Apache HttpClient version 4.5.11. PoolingHttpClientConnectionManager has a validateAfterInactivity setting, which defaults to 2000 ms. But I think we could still get the same exception if we lease a connection within that 2000 ms window.
We could set a more aggressive value for validateAfterInactivity, but I have heard that it can degrade performance by roughly 20 to 30 ms per request.
Is retrying on this exception a good solution?
In the same context, can we also retry on java.net.SocketException: Connection reset?
@ok2c, any suggestions here?
Thanks in advance.
NoHttpResponseException is considered safe to retry for idempotent methods.
In your particular case, however, I would consider limiting the TTL (time to live) of client connections to 2.5 minutes to match that of the server endpoints.
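To illustrate the "safe to retry for idempotent methods" rule, here is a minimal library-independent sketch. The class and method names (IdempotentRetry, execute) are hypothetical and not part of Apache HttpClient; the sketch only relies on the fact that NoHttpResponseException and SocketException are both subclasses of IOException.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.concurrent.Callable;

// Hypothetical helper: retries a call on IOException, but only for idempotent HTTP methods.
public class IdempotentRetry {
    // Methods that RFC 7231 defines as idempotent.
    private static final Set<String> IDEMPOTENT = new HashSet<>(
            Arrays.asList("GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"));

    public static boolean isIdempotent(String method) {
        return IDEMPOTENT.contains(method.toUpperCase(Locale.ROOT));
    }

    // Runs the call up to maxAttempts times for idempotent methods, exactly once otherwise.
    public static <T> T execute(String method, int maxAttempts, Callable<T> call) throws Exception {
        int attempts = isIdempotent(method) ? Math.max(1, maxAttempts) : 1;
        IOException last = null;
        for (int i = 1; i <= attempts; i++) {
            try {
                return call.call();
            } catch (IOException e) {
                // Covers NoHttpResponseException and SocketException, among others.
                last = e;
            }
        }
        throw last; // every attempt failed with an IOException
    }
}
```

With this sketch, a GET that fails twice and then succeeds returns normally on the third attempt, while a POST is attempted only once and the exception is propagated to the caller.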
I am trying to set a maximum number of connection retries from my app to the RabbitMQ broker.
I have the retry interceptor,
@Bean
public RetryOperationsInterceptor retryOperationsInterceptor() {
    return RetryInterceptorBuilder.stateless()
            .maxAttempts(CommonConstants.MAX_AMQP_RETRIES)
            .backOffOptions(500, 2.0, 3000)
            .build();
}
and this is used inside listener container,
container.setAdviceChain(new Advice[]{retryOperationsInterceptor()});
However, after a couple of retries, the consumer attempts connection all over again in an endless loop,
2017-02-21 15:03:12.229 WARN 9292 --- [nsumerThread_92] o.s.a.r.l.SimpleMessageListenerContainer : Consumer raised exception, processing can restart if the connection factory supports it. Exception summary: org.springframework.amqp.AmqpConnectException: java.net.ConnectException: Connection refused: connect
2017-02-21 15:03:12.229 INFO 9292 --- [nsumerThread_92] o.s.a.r.l.SimpleMessageListenerContainer : Restarting Consumer: tags=[{}], channel=null, acknowledgeMode=AUTO local queue size=0
2017-02-21 15:03:13.245 WARN 9292 --- [nsumerThread_93] o.s.a.r.l.SimpleMessageListenerContainer : Consumer raised exception, processing can restart if the connection factory supports it. Exception summary: org.springframework.amqp.AmqpConnectException: java.net.ConnectException: Connection refused: connect
2017-02-21 15:03:13.245 INFO 9292 --- [nsumerThread_93] o.s.a.r.l.SimpleMessageListenerContainer : Restarting Consumer: tags=[{}], channel=null, acknowledgeMode=AUTO local queue size=0
2017-02-21 15:03:13.261 ERROR 9292 --- [nsumerThread_83] o.s.a.r.l.SimpleMessageListenerContainer : Failed to check/redeclare auto-delete queue(s).
I want the app to fail with an error once it has been unable to connect to the broker after MAX_RETRY attempts.
thanks for the help
EDIT
As suggested by @artem-bilan, I ended up using a Component:
public class BrokerFailureEventListener implements ApplicationListener<ListenerContainerConsumerFailedEvent>
In this class's onApplicationEvent method, I count the number of failures and then take appropriate action.
The producer side is a slightly different scenario. As explained by @artem-bilan, the application needs to take care of any issues itself. I explored netflix-hystrix, added a fallback method for the producing method, and will go with that route. Thanks again.
Well, you have slightly misunderstood container.setAdviceChain(new Advice[]{retryOperationsInterceptor()});. It is meant for business errors during message processing:
Business exception handling, as opposed to protocol errors and dropped connections, might need more thought and some custom configuration, especially if transactions and/or container acks are in use. Prior to 2.8.x, RabbitMQ had no definition of dead letter behaviour, so by default a message that is rejected or rolled back because of a business exception can be redelivered ad infinitum. To put a limit in the client on the number of re-deliveries, one choice is a StatefulRetryOperationsInterceptor in the advice chain of the listener. The interceptor can have a recovery callback that implements a custom dead letter action: whatever is appropriate for your particular environment.
By contrast, for connection failures:
In fact it loops endlessly trying to restart the consumer, and only if the consumer is very badly behaved indeed will it give up. One side effect is that if the broker is down when the container starts, it will just keep trying until a connection can be established.
What you need is ListenerContainerConsumerFailedEvent, which is emitted as:
private void logConsumerException(Throwable t) {
    if (logger.isDebugEnabled()
            || !(t instanceof AmqpConnectException || t instanceof ConsumerCancelledException)) {
        logger.warn(
                "Consumer raised exception, processing can restart if the connection factory supports it",
                t);
    }
    else {
        logger.warn("Consumer raised exception, processing can restart if the connection factory supports it. "
                + "Exception summary: " + t);
    }
    publishConsumerFailedEvent("Consumer raised exception, attempting restart", false, t);
}
So, you can listen for those events and stop your application when some condition is reached.
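The "stop when some condition is reached" part could look roughly like the sketch below. ConsumerFailureBudget is a hypothetical, Spring-independent helper; the assumption is that an ApplicationListener<ListenerContainerConsumerFailedEvent> calls recordFailure() from its onApplicationEvent method and shuts the application down when it returns true. The Spring wiring itself is left out so that the example stays self-contained.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: counts consumer failures and reports when a limit is reached.
public class ConsumerFailureBudget {
    private final int maxFailures;
    private final AtomicInteger failures = new AtomicInteger();

    public ConsumerFailureBudget(int maxFailures) {
        this.maxFailures = maxFailures;
    }

    // Records one failure; returns true once the limit has been reached.
    public boolean recordFailure() {
        return failures.incrementAndGet() >= maxFailures;
    }

    // Call after a successful (re)connection so that only consecutive failures count.
    public void reset() {
        failures.set(0);
    }
}
```

Whether to reset the counter on a successful reconnect is a design choice: without reset() the limit applies to failures over the whole lifetime of the application rather than to one outage.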
I'm using a RabbitMQ server with amq.
I am having a difficult problem. After leaving the server alone for about 10 min, the connection is lost.
What could be causing this?
If you look at the Erlang client documentation http://www.rabbitmq.com/erlang-client-user-guide.html you will see a section titled Connecting To A Broker.
It lists several options that you can specify when setting up your connection to the RabbitMQ server. One of them is the heartbeat; as you can see, the default is 0, so no heartbeat is specified.
I don't know the exact Erlang notation, but you will need to do something like:
{ok, Connection} = amqp_connection:start(#amqp_params_network{heartbeat = 5})
The heartbeat timeout is specified in seconds, so this would cause your consumer to heartbeat back to the server every 5 seconds.
Also take a look at this discussion: https://groups.google.com/forum/?fromgroups=#!topic/rabbitmq-discuss/u227xzvqOr8
The default connection timeout for the RabbitMQ connection factory is 600 seconds (at least in the Java client API), hence your 10 minutes. You can change this by specifying to the connection factory your timeout of choice.
It is good practice to ensure your connection is released and recreated after a specific amount of time, to prevent eventual leaks and excessive resource usage. Your code should make sure it picks a valid connection that is not close to timing out, and re-establish the connections that did time out. Overall, adopt a connection-pooling approach.
- Java example:
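import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

ConnectionFactory factory = new ConnectionFactory();
factory.setHost(this.serverName);
factory.setPort(this.serverPort);
factory.setUsername(this.userName);
factory.setPassword(this.userPassword);
// Note: the Java client expects this value in milliseconds.
factory.setConnectionTimeout(YOUR_TIMEOUT_IN_MILLISECONDS);
Connection connection = factory.newConnection();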
I am making a simple UDP P2P Chat Program with a well known server.
The clients send and receive data to and from the server and other clients through a single IdUDPServer.
As of now the clients can log in and log out, i.e. they can send data to the server.
Whenever the server sends any data, it gets dropped at the NIC side of the node because the embedded IP header checksum is 0x00, as reported by Wireshark.
IdUDPServer Settings (Client/Server)
Active : True
Bindings :
Broadcast : False
BufferSize : 8192
DefaultPort : 10000
IPVersion : Id_IPv4
ThreadedEvent : False
Command Used
only one command is used within
UDPServer.SendBuffer ( ED_Host.Text, StrToInt ( ED_Port.Text ), Buffer );
A similar configuration is working perfectly in another program of mine.
Most NICs these days perform checksum validation and generation themselves instead of leaving it to the OS network stack. This improves performance and is known as checksum offloading. As a result, Wireshark reports the missing checksum as an error, but this can usually be ignored, or the error can be turned off in the Wireshark settings.
Some NIC drivers allow you to turn off checksum offloading. Try this and retest the code.
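For reference, the value that the NIC (or the OS stack, when offloading is disabled) writes into the header is the standard Internet checksum from RFC 1071. The sketch below computes it over an IPv4 header; the class name Ipv4Checksum is hypothetical, and the code assumes the header length is a multiple of two bytes, which always holds for IPv4.

```java
// Hypothetical helper illustrating the RFC 1071 checksum used in the IPv4 header.
public class Ipv4Checksum {
    // Computes the one's-complement checksum over the header bytes.
    // Pass the header with its checksum field (bytes 10 and 11) set to zero.
    public static int compute(byte[] header) {
        long sum = 0;
        // Add the header up as a sequence of big-endian 16-bit words.
        for (int i = 0; i + 1 < header.length; i += 2) {
            sum += ((header[i] & 0xFF) << 8) | (header[i + 1] & 0xFF);
        }
        // Fold any carry bits back into the low 16 bits.
        while ((sum >> 16) != 0) {
            sum = (sum & 0xFFFF) + (sum >> 16);
        }
        // The checksum is the one's complement of the folded sum.
        return (int) (~sum & 0xFFFF);
    }
}
```

For the 20-byte sample header 45 00 00 73 00 00 40 00 40 11 00 00 c0 a8 00 01 c0 a8 00 c7, compute returns 0xb861. Running it over a header that already contains a correct checksum returns 0, which is how a receiver validates the header.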
Using Delphi 2010 and Indy 10.5.8.0.
When connecting to a Titan FTP server, I keep getting the exception "Invalid argument to time encode" (EConvertError).
The server log tells me:
FEAT<EOL>
211-Extensions Supported<EOL> COMB<EOL> MLST type*;size*;modify*;create*;perm*;<EOL> SIZE<EOL> MDTM<EOL> XCRC<EOL> REST STREAM<EOL> AUTH SSL<EOL> AUTH TLS<EOL> CCC<EOL> PBSZ<EOL> PROT<EOL> EPRT<EOL> EPSV<EOL> DQTA<EOL>211 End<EOL>
TYPE A<EOL>
200 Type set to A.<EOL>
The user "*****" has initiated a session on "217.********:21"
SYST<EOL>
215 UNIX Type: L8<EOL>
SITE ZONE<EOL>
210 UTC-2147483647<EOL>
QUIT<EOL>
221 Session Ended. Downloaded 0KB, Uploaded 0KB. Goodbye *** from 130.******.<EOL>
Any ideas?
The server is sending a faulty UTC offset in response to the SITE ZONE command. That is a bug in Titan. When Indy tries to parse the value for use in later TDateTime operations, the parse fails. Contact the Titan devs and let them know about the bug. In the meantime, I will look into updating TIdFTP to handle that error in the future.
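As an illustration of the kind of defensive parsing a client could apply to such a reply, here is a minimal sketch. It is not Indy's implementation. The class name SiteZoneReply is hypothetical, and the sketch assumes the offset is expressed in minutes, which the log above does not state. Note that 2147483647 is the largest 32-bit signed integer, so the value in the log cannot be a real offset in any plausible unit.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts a UTC offset from a SITE ZONE reply and rejects implausible values.
public class SiteZoneReply {
    private static final Pattern UTC_OFFSET = Pattern.compile("UTC([+-]?\\d{1,10})");
    // Assumption: the offset is in minutes. 18 hours is the largest offset java.time.ZoneOffset accepts.
    private static final long MAX_ABS_OFFSET_MINUTES = 18 * 60;

    // Returns the offset if the reply contains a plausible one, otherwise an empty result.
    public static OptionalInt parseOffsetMinutes(String reply) {
        Matcher m = UTC_OFFSET.matcher(reply);
        if (!m.find()) {
            return OptionalInt.empty();
        }
        long value = Long.parseLong(m.group(1));
        if (Math.abs(value) > MAX_ABS_OFFSET_MINUTES) {
            return OptionalInt.empty(); // e.g. the faulty -2147483647 from the log
        }
        return OptionalInt.of((int) value);
    }
}
```

A client using something like this could fall back to treating the server time as UTC when the result is empty, instead of raising an exception during connect.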