spring amqp rabbit max consumer connection retries - spring-amqp

I am trying to establish the max number of retries from my app to the rabbit broker.
I have the retry interceptor,
@Bean
public RetryOperationsInterceptor retryOperationsInterceptor() {
    return RetryInterceptorBuilder.stateless()
            .maxAttempts(CommonConstants.MAX_AMQP_RETRIES)
            .backOffOptions(500, 2.0, 3000)
            .build();
}
and this is used inside the listener container,
container.setAdviceChain(new Advice[]{retryOperationsInterceptor()});
However, after a couple of retries, the consumer starts attempting to connect all over again in an endless loop:
2017-02-21 15:03:12.229 WARN 9292 --- [nsumerThread_92] o.s.a.r.l.SimpleMessageListenerContainer : Consumer raised exception, processing can restart if the connection factory supports it. Exception summary: org.springframework.amqp.AmqpConnectException: java.net.ConnectException: Connection refused: connect
2017-02-21 15:03:12.229 INFO 9292 --- [nsumerThread_92] o.s.a.r.l.SimpleMessageListenerContainer : Restarting Consumer: tags=[{}], channel=null, acknowledgeMode=AUTO local queue size=0
2017-02-21 15:03:13.245 WARN 9292 --- [nsumerThread_93] o.s.a.r.l.SimpleMessageListenerContainer : Consumer raised exception, processing can restart if the connection factory supports it. Exception summary: org.springframework.amqp.AmqpConnectException: java.net.ConnectException: Connection refused: connect
2017-02-21 15:03:13.245 INFO 9292 --- [nsumerThread_93] o.s.a.r.l.SimpleMessageListenerContainer : Restarting Consumer: tags=[{}], channel=null, acknowledgeMode=AUTO local queue size=0
2017-02-21 15:03:13.261 ERROR 9292 --- [nsumerThread_83] o.s.a.r.l.SimpleMessageListenerContainer : Failed to check/redeclare auto-delete queue(s).
I want the app to fail and error out because of the lack of connectivity to the broker once a MAX_RETRY limit is reached.
thanks for the help
EDIT
As suggested by @artem-bilan, I ended up using a Component
public class BrokerFailureEventListener implements ApplicationListener<ListenerContainerConsumerFailedEvent>
In this class's onApplicationEvent I count the number of failures and then take the appropriate action (sketched below).
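Roughly, the listener looks like this (a simplified sketch; the threshold check and shutting down via context.close() are illustrative choices, not the exact code, and CommonConstants is the same constants class as in the retry bean above):
import java.util.concurrent.atomic.AtomicInteger;
import org.springframework.amqp.rabbit.listener.ListenerContainerConsumerFailedEvent;
import org.springframework.context.ApplicationListener;
import org.springframework.context.ConfigurableApplicationContext;
import org.springframework.stereotype.Component;

@Component
public class BrokerFailureEventListener implements ApplicationListener<ListenerContainerConsumerFailedEvent> {

    private final AtomicInteger failures = new AtomicInteger();

    private final ConfigurableApplicationContext context;

    public BrokerFailureEventListener(ConfigurableApplicationContext context) {
        this.context = context;
    }

    @Override
    public void onApplicationEvent(ListenerContainerConsumerFailedEvent event) {
        // count consumer failures and give up once the configured limit is reached
        if (failures.incrementAndGet() >= CommonConstants.MAX_AMQP_RETRIES) {
            context.close();
        }
    }
}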
On the producer side it is a slightly different scenario. As explained by @artem-bilan, the application needs to take care of any broker-connectivity issues itself. I explored netflix-hystrix, added a fallback method for the producing method, and will go with that route (a rough sketch is below). Thanks much again.
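Sketch of that producer-side fallback using the Hystrix javanica annotation (this assumes Hystrix's Spring support, e.g. @EnableCircuitBreaker, is enabled; the class, method names and fallback behaviour are illustrative):
import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.stereotype.Service;

@Service
public class GuardedProducer {

    private final RabbitTemplate rabbitTemplate;

    public GuardedProducer(RabbitTemplate rabbitTemplate) {
        this.rabbitTemplate = rabbitTemplate;
    }

    // Hystrix routes to the fallback when the publish fails (e.g. the broker is down)
    @HystrixCommand(fallbackMethod = "sendFallback")
    public void send(String exchange, String routingKey, Object payload) {
        rabbitTemplate.convertAndSend(exchange, routingKey, payload);
    }

    // same signature as send(); for example, persist the message for later redelivery
    public void sendFallback(String exchange, String routingKey, Object payload) {
        // handle the failed publish here
    }
}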

Well, you have misunderstood container.setAdviceChain(new Advice[]{retryOperationsInterceptor()}); a bit. It is for business errors during message processing:
Business exception handling, as opposed to protocol errors and dropped connections, might need more thought and some custom configuration, especially if transactions and/or container acks are in use. Prior to 2.8.x, RabbitMQ had no definition of dead letter behaviour, so by default a message that is rejected or rolled back because of a business exception can be redelivered ad infinitum. To put a limit in the client on the number of re-deliveries, one choice is a StatefulRetryOperationsInterceptor in the advice chain of the listener. The interceptor can have a recovery callback that implements a custom dead letter action: whatever is appropriate for your particular environment.
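For reference, that business-error setup from the docs would look something like this (a sketch only; the RejectAndDontRequeueRecoverer and back-off values are assumptions, and it does not address the connection problem you are seeing):
@Bean
public StatefulRetryOperationsInterceptor statefulRetryInterceptor() {
    return RetryInterceptorBuilder.stateful()
            .maxAttempts(CommonConstants.MAX_AMQP_RETRIES)
            .backOffOptions(500, 2.0, 3000)
            // once retries are exhausted, reject without requeue (the broker can then dead-letter it)
            .recoverer(new RejectAndDontRequeueRecoverer())
            .build();
}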
That is in contrast to the behaviour for lost connections, which the documentation describes as follows:
In fact it loops endlessly trying to restart the consumer, and only if the consumer is very badly behaved indeed will it give up. One side effect is that if the broker is down when the container starts, it will just keep trying until a connection can be established.
What you need is the ListenerContainerConsumerFailedEvent, which is published like this:
private void logConsumerException(Throwable t) {
    if (logger.isDebugEnabled()
            || !(t instanceof AmqpConnectException || t instanceof ConsumerCancelledException)) {
        logger.warn(
                "Consumer raised exception, processing can restart if the connection factory supports it",
                t);
    }
    else {
        logger.warn("Consumer raised exception, processing can restart if the connection factory supports it. "
                + "Exception summary: " + t);
    }
    publishConsumerFailedEvent("Consumer raised exception, attempting restart", false, t);
}
So, you can listen for those events and stop your application when some condition is reached.

Related

Devops2019 JobAgent Down after URL Change

We are using DevOps 2019 (the app and data tiers are on two different servers). As part of our domain migration we have changed our DevOps URL
from: https://domain.wireless.com
to: https://domain.wire.com
But after the URL change, all the build agents (self-hosted agents) stopped working, and when the admin console is launched and a test mail is sent, I see the following:
Exception Message: The underlying connection was closed:
An unexpected error occurred on a send. (type WebException)Exception Stack Trace:
at System.Net.HttpWebRequest.GetResponse()
at Microsoft.TeamFoundation.Admin.Console.Models.DlgSendTestMailViewModel.SendEmail()
Inner Exception Details:
Exception Message: Unable to read data from the transport connection:
An existing connection was forcibly closed by the remote host. (type IOException)
Exception Stack Trace:
at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
at System.Net.FixedSizeReader.ReadPacket(Byte[] buffer, Int32 offset, Int32 count)
I need some assistance on how to resolve this issue.
Probably not your ideal approach, but you could just remove and re-register the agents using the new URL. You'd have to reset any custom capabilities afterwards.
For Windows, the URL is stored in a hidden file in the agent root folder. I don't think just modifying it in the agent configuration works, but you could try changing one and stopping/starting the agent if it is running as a service. If it is running with auto-login (maybe for UI testing), I've never had any success except for removing and reconfiguring it.

How to check to see if Ruby-Kafka retries works?

The documentation mentions that the producer retries sending the message to the queue based on max_retries.
So I shut down Kafka and then tried my producer. I get this error:
Fetching cluster metadata from kafka://localhost:9092
[topic_metadata] Opening connection to localhost:9092 with client id MYCLIENTID
ERROR -- : [topic_metadata] Failed to connect to localhost:9092: Connection refused
DEBUG -- : Closing socket to localhost:9092
ERROR -- : Failed to fetch metadata from kafka://localhost:9092
Completed 500 Internal Server Error in 486ms (ActiveRecord: 33.9ms)
which makes sense, however the retries never happen after that. I have read the docs inside out and I can't figure out how these retries are actually triggered.
Here is my code:
def self.deliver_message(kafka, message, topic, transactional_id)
  producer = kafka.producer(idempotent: true,
                            transactional_id: transactional_id,
                            required_acks: :all,
                            max_retries: 5,
                            retry_backoff: 5)
  producer.produce(message, topic: topic)
  producer.deliver_messages
end
link to doc:
https://www.rubydoc.info/gems/ruby-kafka/Kafka/Producer#initialize-instance_method
Thank you in advance.
The retries are based on the type of exception thrown in the producer callback. According to the Callback docs, the following exceptions can occur during the callback:
The exception thrown during processing of this record. Null if no error occurred. Possible thrown exceptions include:
Non-Retriable exceptions (fatal, the message will never be sent):
InvalidTopicException
OffsetMetadataTooLargeException
RecordBatchTooLargeException
RecordTooLargeException
UnknownServerException
Retriable exceptions (transient, may be covered by increasing #retries):
CorruptRecordException
InvalidMetadataException
NotEnoughReplicasAfterAppendException
NotEnoughReplicasException
OffsetOutOfRangeException
TimeoutException
UnknownTopicOrPartitionException
Shutting down Kafka completely looks more like a non-retriable exception.
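To illustrate that split with the Java client those Callback docs describe (a sketch only; the topic name, serializers and handling are assumptions, and ruby-kafka surfaces the same distinction through its own exception classes):
// uses org.apache.kafka.clients.producer.* and org.apache.kafka.common.errors.RetriableException
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("retries", 5);

KafkaProducer<String, String> producer = new KafkaProducer<>(props);
producer.send(new ProducerRecord<>("my-topic", "payload"), (metadata, exception) -> {
    if (exception == null) {
        return; // delivered successfully
    }
    if (exception instanceof RetriableException) {
        // transient: the client retries internally, up to the configured retries
    } else {
        // fatal: this record will never be sent; handle or dead-letter it here
    }
});
producer.close();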

javax.jms.JMSException: Channel was inactive for too long: localhost/127.0.0.1:7676

I am trying to connect to a JMS server using ActiveMQ from stand-alone Java code, but I am struggling with the following exception. I tried various options but was not able to figure out the root cause.
It is failing at the following line of code:
jmsConnection.start();
My broker URL is:
tcp://localhost:7676?wireFormat.maxInactivityDuration=0
Stack Trace is as follows:
javax.jms.JMSException: Channel was inactive for too long: localhost/127.0.0.1:7676
at org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:62)
at org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1206)
at org.apache.activemq.ActiveMQConnection.ensureConnectionInfoSent(ActiveMQConnection.java:1289)
at org.apache.activemq.ActiveMQConnection.start(ActiveMQConnection.java:456)
at com.bt.ccdr.dbtoqueue.DBToQueueHelperImpl.getJMSConnection(DBToQueueHelperImpl.java:172)
at com.bt.ccdr.dbtoqueue.DBToQueueHelperImpl.main(DBToQueueHelperImpl.java:42)
Caused by: org.apache.activemq.transport.InactivityIOException: Channel was inactive for too long: localhost/127.0.0.1:7676
at org.apache.activemq.transport.InactivityMonitor.oneway(InactivityMonitor.java:225)
at org.apache.activemq.transport.TransportFilter.oneway(TransportFilter.java:83)
at org.apache.activemq.transport.WireFormatNegotiator.oneway(WireFormatNegotiator.java:100)
at org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:40)
at org.apache.activemq.transport.ResponseCorrelator.asyncRequest(ResponseCorrelator.java:74)
at org.apache.activemq.transport.ResponseCorrelator.request(ResponseCorrelator.java:79)
at org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1195)
... 4 more
Change the connection string to failover:(tcp://localhost:7676). The failover transport layers reconnect logic on top of the standard OpenWire TCP transport.
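A minimal sketch of that change (using the broker URL from the question; exception handling omitted):
// javax.jms.Connection / org.apache.activemq.ActiveMQConnectionFactory
ActiveMQConnectionFactory connectionFactory =
        new ActiveMQConnectionFactory("failover:(tcp://localhost:7676)");
Connection jmsConnection = connectionFactory.createConnection();
jmsConnection.start(); // with failover: the transport reconnects instead of failing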

Cowboy websocket termination error

I am implementing a Cowboy websocket handler. Everything works fine, except that when the user closes the browser it fires websocket_terminate, and at the server end it generates the following error:
Error in process <0.298.0> on node 'ews_2#servername.com' with exit value: {function_clause,
[{cowboy_req,ensure_response,[{[]},204],[{file,"src/cowboy_req.erl"},{line,1112}]},
{cowboy_protocol,next_request,3,[{file,"src/cowboy_protocol.erl"},{line,545}]}]}
The code in websocket_terminate is:
websocket_terminate(Reason, Req, State) ->
    io:format("~nWebsocket connection termination~n"),
    ok.
Resolved: the problem was that Req was not being passed through and got manipulated between the callbacks... Cowboy needs a proper Req parameter to be passed at the time of connection termination.

Connection to RabbitMQ server automatically lost after 600s

I'm using a RabbitMQ server with amq.
I am having a difficult problem: after leaving the server alone for about 10 minutes, the connection is lost.
What could be causing this?
If you look at the Erlang client documentation http://www.rabbitmq.com/erlang-client-user-guide.html you will see a section titled Connecting To A Broker
This gives you a few different options that you can specify when setting up your connection to the RabbitMQ server. One of the options is the heartbeat; as you can see, the default is 0, so no heartbeat is specified.
I don't know the exact Erlang notation, but you will need to do something like:
{ok, Connection} = amqp_connection:start(#amqp_params_network{heartbeat = 5})
The heartbeat timeout is specified in seconds, so this would cause your consumer to heartbeat back to the server every 5 seconds.
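For comparison, the same heartbeat option with the Java client would look roughly like this (a sketch; the 30-second value is an arbitrary choice):
// com.rabbitmq.client.ConnectionFactory / com.rabbitmq.client.Connection
ConnectionFactory factory = new ConnectionFactory();
factory.setHost("localhost");
factory.setRequestedHeartbeat(30); // heartbeat interval in seconds; 0 disables heartbeats
Connection connection = factory.newConnection();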
Also take a look at this discussion: https://groups.google.com/forum/?fromgroups=#!topic/rabbitmq-discuss/u227xzvqOr8
The default connection timeout for the RabbitMQ connection factory is 600 seconds (at least in the Java client API), hence your 10 minutes. You can change this by specifying to the connection factory your timeout of choice.
It is good practice to ensure your connection is released and recreated after a specific amount of time, to prevent leaks and excessive resource usage. Your code should ensure that it obtains a valid connection that is not close to being timed out, and re-establish a new connection for the ones that did time out. Overall, adopt a connection-pooling approach.
- Java example:
ConnectionFactory factory = new ConnectionFactory();
factory.setHost(this.serverName);
factory.setPort(this.serverPort);
factory.setUsername(this.userName);
factory.setPassword(this.userPassword);
factory.setConnectionTimeout( YOUR-TIMEOUT-IN-SECONDS );
Connection connection = factory.newConnection();
