Why does the RSocket connection retry use a different thread every time?

I have the program below, which connects to a Spring Boot RSocket server running on localhost:7999.
I have configured the connector with Retry.fixedDelay(Integer.MAX_VALUE, Duration.ofSeconds(5)).
As you can see, the RSocketRequester is wrapped in a Mono, so it should hold a single connection.
When the connection fails and the retry begins, I see that every retry is made from a different thread, i.e. parallel-1 through parallel-8 in the log below. What is the reason for this?
12:08:24.463550|parallel-1|WARN |RSocketRefDataReceiver |doAfterRetry===>attempt #1 (1 in a row), last failure={io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: localhost/127.0.0.1:7999}
12:08:30.470593|parallel-2|WARN |RSocketRefDataReceiver |doAfterRetry===>attempt #2 (2 in a row), last failure={io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: localhost/127.0.0.1:7999}
12:08:36.475666|parallel-3|WARN |RSocketRefDataReceiver |doAfterRetry===>attempt #3 (3 in a row), last failure={io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: localhost/127.0.0.1:7999}
12:08:42.494801|parallel-4|WARN |RSocketRefDataReceiver |doAfterRetry===>attempt #4 (4 in a row), last failure={io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: localhost/127.0.0.1:7999}
12:08:48.499084|parallel-5|WARN |RSocketRefDataReceiver |doAfterRetry===>attempt #5 (5 in a row), last failure={io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: localhost/127.0.0.1:7999}
12:08:54.503385|parallel-6|WARN |RSocketRefDataReceiver |doAfterRetry===>attempt #6 (6 in a row), last failure={io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: localhost/127.0.0.1:7999}
12:09:00.509830|parallel-7|WARN |RSocketRefDataReceiver |doAfterRetry===>attempt #7 (7 in a row), last failure={io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: localhost/127.0.0.1:7999}
12:09:06.545815|parallel-8|WARN |RSocketRefDataReceiver |doAfterRetry===>attempt #8 (8 in a row), last failure={io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: localhost/127.0.0.1:7999}
12:09:12.553582|parallel-1|WARN |RSocketRefDataReceiver |doAfterRetry===>attempt #9 (9 in a row), last failure={io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: localhost/127.0.0.1:7999}
My program is as follows:
RSocketStrategies strategies = RSocketStrategies.builder()
        .encoders(e -> e.add(new Jackson2CborEncoder()))
        .decoders(e -> e.add(new Jackson2CborDecoder()))
        .build();
Mono<RSocketRequester> requester = Mono.just(RSocketRequester.builder()
        .rsocketConnector(connector -> {
            connector.reconnect(
                    Retry.fixedDelay(Integer.MAX_VALUE, Duration.ofSeconds(5))
                            .doAfterRetry(e -> LOG.warn("doAfterRetry===>{}", e)))
                    .acceptor(RSocketMessageHandler.responder(strategies, this))
                    .payloadDecoder(PayloadDecoder.ZERO_COPY);
        })
        .dataMimeType(MediaType.APPLICATION_CBOR)
        .setupRoute("test")
        .setupData("test-123")
        .rsocketStrategies(strategies)
        .tcp("localhost", 7999));

This article (Flight of the Flux 3) is a good introduction to the Reactor threading model. Reactor is the library that provides the Rx functionality in rsocket-java.
The key sentence is:
"Schedulers.parallel() is good for CPU-intensive but short-lived tasks. It can execute N such tasks in parallel (by default N == number of CPUs)"
Also read up on https://projectreactor.io/docs/core/release/api/reactor/core/scheduler/Schedulers.html
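Retry.fixedDelay schedules its delay on Schedulers.parallel() unless you tell it otherwise, and that scheduler hands each new delayed task to the next of its worker threads in turn. That is why your log cycles through parallel-1 to parallel-8 and then starts again at parallel-1.
If all operations were guaranteed to be on a single thread, it would likely cause noisy latency, as two different clients that happened to get the same thread initially would compete for that thread throughout the lifetime of your program. So it's better that general work gets spread evenly across a limited pool of threads.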
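The round-robin effect can be reproduced with the JDK alone. The following is a minimal sketch and only an analogy: it is not Reactor's implementation, and the class RoundRobinTimers and its methods are hypothetical names made up for this illustration. It models a "parallel" scheduler as a fixed pool of single-threaded timers and gives each delayed task to the next timer in turn.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a "parallel" scheduler: a fixed pool of
// single-threaded timers, chosen round-robin for every delayed task.
class RoundRobinTimers {
    private final List<ScheduledExecutorService> workers = new ArrayList<>();
    private int next = 0;

    RoundRobinTimers(int poolSize) {
        for (int i = 1; i <= poolSize; i++) {
            String name = "parallel-" + i; // thread names chosen to mimic the log above
            workers.add(Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, name);
                t.setDaemon(true);
                return t;
            }));
        }
    }

    // Schedules one delayed task on the next worker and returns the name
    // of the thread that actually ran it.
    String runDelayed(long delayMillis) {
        ScheduledExecutorService worker = workers.get(next);
        next = (next + 1) % workers.size();
        Callable<String> task = () -> Thread.currentThread().getName();
        try {
            return worker.schedule(task, delayMillis, TimeUnit.MILLISECONDS).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    void shutdown() {
        workers.forEach(ScheduledExecutorService::shutdownNow);
    }

    // Simulates a number of consecutive "retries" and records the thread of each.
    static List<String> simulate(int poolSize, int attempts) {
        RoundRobinTimers timers = new RoundRobinTimers(poolSize);
        List<String> threads = new ArrayList<>();
        try {
            for (int i = 0; i < attempts; i++) {
                threads.add(timers.runDelayed(10));
            }
        } finally {
            timers.shutdown();
        }
        return threads;
    }

    public static void main(String[] args) {
        System.out.println(simulate(8, 9)); // pool of 8: thread names cycle
        System.out.println(simulate(1, 3)); // pool of 1: always the same thread
    }
}
```

With a pool of 8, nine attempts run on parallel-1 through parallel-8 and then on parallel-1 again, the same pattern as in the question's log. With a pool of 1, every attempt runs on the same thread.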

Thanks @Yuri, @bruto and @OlegDokuka for your suggestions and answers. I have changed my program as below to force the retry to run on a single thread.
    connector.reconnect(
            Retry.fixedDelay(Integer.MAX_VALUE, Duration.ofSeconds(5))
                    .scheduler(Schedulers.single()) // <---- This forces the retry to run on a single thread
                    .doAfterRetry(e -> LOG.warn("doAfterRetry===>{}", e)))
            .acceptor(RSocketMessageHandler.responder(strategies, this))
            .payloadDecoder(PayloadDecoder.ZERO_COPY);
})

Related

Exception while executing Tell method in Akka

I am using Akka.NET (version="1.0.8"). I have two actors: actor "A" reads input from the database, and actor "B" accepts that input and contains some processing logic.
When we send around 50 000 messages (around 30 KB each) to actor "B",
for example:
[1..10000000]
|> List.iter(fun msg ->
actRef.Tell(JsonConvert.SerializeObject(ProcessData (MessageData))))
I get below errors with some dead letters:
[WARNING][4/29/2016 10:08:18 AM][Thread 0009]
[[akka://Pankaj/system/endpointManager/reliableEndpointWriterakka.tcp%3a%2f%2fRemoteFSharp%40***.**.**.**%3a8788-1]]
Association with remote system akka.tcp://RemoteFSharp@***.**.**.**:8788 has failed;
address is now gated for 5000 ms.
Reason is: [Akka.Remote.EndpointDisassociatedException: Disassociated …
[INFO]
[4/29/2016 10:08:18 AM][Thread 0021][akka://Pankaj/deadLetters]
Message String from akka://Pankaj/deadLetters to akka://Pankaj/deadLetters was not delivered.
1 dead letters encountered…
But when I use Ask method it works fine.
[1..1000000]
|> List.iter(fun msg ->
actRef.Ask(JsonConvert.SerializeObject(ProcessData (MessageData))).Wait())
If I put a delay of 10 ms before every Tell call, I get an error after processing 200 000 records.
What exactly am I missing here?
What is the dead letter exception and how do I fix it?

Configuring zabbix to monitor ping from a server

I am new to Zabbix. I would like to monitor ping from my server and fire a trigger if the host stops responding to ping or the ping time exceeds 20 milliseconds.
I don't know how to configure the trigger expression to suit my needs. Please help. Thanks.
I used
type -> Simple check
key -> icmppingsec
Type of information -> Numeric(Float)
Units -> s
Flexible interval -> 10secs, from 7:00-24:00
This is the trigger expression.
And a graph I created.
According to simple check documentation, icmppingsec item returns ping time in seconds or 0 if the host is not reachable. Therefore, your trigger can be as follows:
{Template ICMP Ping:icmppingsec.avg(5m)} > 0.020 |
{Template ICMP Ping:icmppingsec.max(5m)} = 0
If you are using at least Zabbix 2.4, you should use or instead of | (see What's new in Zabbix 2.4.0).
Note also that there is no point in using a "1-7,00:00-24:00" flexible interval. You can just put "10" into the "Update interval (in sec)" field.
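To make the two conditions of that trigger concrete, here is a minimal sketch of the same logic in plain Java. It is not Zabbix code: the class PingTrigger and the method shouldFire are hypothetical, and the list of samples stands in for the icmppingsec values Zabbix collected over the last 5 minutes (seconds, with 0 meaning the host was unreachable).

```java
import java.util.List;

// Hypothetical illustration of the trigger expression above; not a Zabbix API.
class PingTrigger {
    static final double MAX_AVG_SECONDS = 0.020; // 20 milliseconds

    // samples: icmppingsec values from the evaluation window, in seconds;
    // 0 means the ping got no reply.
    static boolean shouldFire(List<Double> samples) {
        if (samples.isEmpty()) {
            return false; // assumption: no data, no alert
        }
        double sum = 0.0;
        double max = 0.0;
        for (double s : samples) {
            sum += s;
            max = Math.max(max, s);
        }
        double avg = sum / samples.size();
        // avg(5m) > 0.020: the host answers, but too slowly.
        // max(5m) = 0: every sample was 0, so the host never answered.
        return avg > MAX_AVG_SECONDS || max == 0.0;
    }

    public static void main(String[] args) {
        System.out.println(shouldFire(List.of(0.005, 0.007))); // fast replies
        System.out.println(shouldFire(List.of(0.030, 0.025))); // slow replies
        System.out.println(shouldFire(List.of(0.0, 0.0)));     // no replies at all
    }
}
```

Using max rather than avg for the unreachable check means a single lost ping inside the window does not fire the trigger. Only a window in which every check returned 0 does.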

Wireshark: Flag abbreviations and Exchange type

I was told to ask this here:
10:53:04.042608 IP 172.17.2.12.42654 > 172.17.2.6.6000: Flags [FPU], seq 3891587770, win 1024, urg 0, length 0
10:53:04.045939 IP 172.17.2.6.6000 > 172.17.2.12.42654: Flags [R.], seq 0, ack 3891587770, win 0, length 0
This states that the flags set are FPU and R. What flags do these stand for and what kind of exchange is this?
The flags are:
F - FIN, used to terminate an active TCP connection from one end.
P - PUSH, asks that any data the receiving end is buffering be sent to the receiving process.
U - URGENT, indicating that there is data referenced by the urgent "pointer."
R - RESET, indicating that a packet was received that was NOT part of an existing connection.
It looks like the first packet was manufactured, or possibly delayed. The argument for it being manufactured is the urgent flag being set, with no urgent data. If it was delayed, it indicates the normal end of a connection between .12 and .6 on port 6000, along with a request that the last of any pending data sent across the wire be flushed to the service on .6.
.6 has clearly forgotten about this connection, if it even existed. .6 is indicating that while it got the FIN packet, it believes that the connection that FIN packet refers to did not exist.
If .6 had a current matching connection, it would have replied with a FIN-ACK instead of RST, acknowledging the termination of the connection.
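As a small illustration of how the bracketed flag strings in that capture expand, here is a minimal sketch in Java. The class TcpdumpFlags is a hypothetical helper, not part of any capture tool, and it covers only the most common letters. Note that the "." in [R.] stands for ACK in this output format, so the second packet is RST plus ACK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: expands tcpdump-style flag letters into flag names.
class TcpdumpFlags {
    static List<String> decode(String flags) {
        List<String> names = new ArrayList<>();
        for (char c : flags.toCharArray()) {
            switch (c) {
                case 'S': names.add("SYN"); break;
                case 'F': names.add("FIN"); break;
                case 'P': names.add("PSH"); break;
                case 'R': names.add("RST"); break;
                case 'U': names.add("URG"); break;
                case '.': names.add("ACK"); break;
                default:
                    // Other letters exist; this sketch only handles the common ones.
                    throw new IllegalArgumentException("unhandled flag: " + c);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(decode("FPU")); // first packet in the capture
        System.out.println(decode("R."));  // reply from .6
    }
}
```

For the two packets in the question this gives FIN, PSH, URG for the first and RST, ACK for the second.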

tracert command returns timed out

tracert returns "Request timed out". What I understand from this is that the packets are being lost somewhere on the network.
Does it mean the issue is with the ISP, with the hosting provider, or with my Windows system?
10 * * * Request timed out.
11 * * * Request timed out.
12 * * * Request timed out.
13 * * * Request timed out.
14 * * * Request timed out.
15 * * * Request timed out.
16 * * * Request timed out.
17 * * * Request timed out.
18 * * * Request timed out.
19 * * * Request timed out.
20 * * * Request timed out.
21 * * * Request timed out.
22 * * * Request timed out.
23 * * * Request timed out.
24 * * * Request timed out.
25 * * * Request timed out.
26 * * * Request timed out.
27 * * * Request timed out.
28 * * * Request timed out.
29 * * * Request timed out.
30 * * * Request timed out.
The first 9 were successful.
I can't see the first 9 hops but if they are all the same then you may have a firewall configuration issue that prevents the packets from either getting out or getting back.
Try again with your firewall turned off (temporarily!). The other option is that your ISP may drop ICMP traffic as a matter of course, or only when they are busy with other traffic.
ICMP (the protocol used by traceroute) is of the lowest priority, and when higher priority traffic is ongoing the router may be configured to simply drop ICMP packets. There is also the possibility that the ISP drops all ICMP packets as a matter of security, since many DoS (Denial of Service) attacks are based on probing done with ICMP packets.
Some routers treat all pings as a port scan and block them for that reason (the first step in any attack is determining which ports are open). However, blocking ping packets, tracert packets, etc. is only partially effective at mitigating a denial-of-service attack, because such an attack can use any protocol it wants (TCP or UDP packets, for example), so long as the targeted machine has an open port to receive the packet. For example, to target an HTTP server, we only need to use an intercepting proxy to repeatedly send a null TCP packet to the server on port 80 or port 8080, since we know that these are the two most common ports for HTTP. Likewise, if the target machine is running an IRCd, we know the port is most likely 6667 (unless the server is using SSL), which is the most common port for that kind of service. Therefore, dropping ping packets does not prevent a DDoS attack; it just makes that type of attack a bit more difficult.
This is what I found in the Wireshark documentation (I had the same problem):
"The tracert program provided with Windows does not allow one to change the size of the ICMP message sent by tracert. So it won’t be possible to use a Windows machine to generate ICMP messages that are large enough to force IP fragmentation. However, you can use tracert to generate small, fixed-length packets"
https://danielgraham.files.wordpress.com/2021/09/wireshark_ip_v8.1-2.pdf
Use tracert -h 1.
The -h option sets the maximum number of hops (h = hops), so this makes tracert give up after the first hop instead of working through all 30. I wrote a batch script a while back to scan my entire network and get a list of IP addresses and computers, and it would waste time on the firewall that wouldn't answer and on IP addresses that weren't assigned to any computer. Wicked annoying! So I added -h 1 to the script. It runs through and writes a list to a text file. I hope to improve it in the future by running arp -a first to get a quick list of IPs and then feeding that list into a script similar to this one, so that it doesn't waste time on unassigned IPs.
@echo off
rem -h 1 limits each trace to a single hop, as described above
set trace=tracert -h 1
rem Address to start scanning from: byte1.byte2.byte3.byte4
set /a byte1=222
set /a byte2=222
set /a byte3=222
set /a byte4=100
set loop=0
:loop
rem Append the result for the current address to ips.txt
%trace% %byte1%.%byte2%.%byte3%.%byte4%>>ips.txt
set /a loop=%loop% + 1
set /a byte4=%byte4% + 1
echo %byte4%
if %loop%==255 goto next
goto loop
:next
exit /b
Your antivirus may be blocking the incoming packets. Blocking such packets is a basic function of an antivirus, meant to protect the computer from ordinary attacks as well as DoS (Denial of Service) attacks, and it often cannot be turned off.

Error -206 table (aus_command) not in database

Informix 11.70.TC4DE on Windows Vista SP2, i7 Dual Core, 8GB RAM:
I searched for the aus_command table and it is in the database. Any idea why the server says it could not find it?
Tue May 15 22:07:21 2012
22:07:21 Booting Language <c> from module <>
22:07:21 Loading Module <CNULL>
22:07:21 Booting Language <builtin> from module <>
22:07:21 Loading Module <BUILTINNULL>
22:07:28 DR: DRAUTO is 0 (Off)
22:07:28 DR: ENCRYPT_HDR is 0 (HDR encryption Disabled)
22:07:28 IBM Informix Dynamic Server Version 11.70.TC4DE Software Serial Number AAA#B000000
22:07:29 Performance Advisory: The physical log size is smaller than the recommended size for a
server configured with RTO_SERVER_RESTART.
22:07:29 Results: Fast recovery performance might not be optimal.
22:07:29 Action: For best fast recovery performance when RTO_SERVER_RESTART is enabled,
increase the physical log size to at least 242000 KB. For servers
configured with a large buffer pool, this might not be necessary.
22:07:29 IBM Informix Dynamic Server Initialized -- Shared Memory Initialized.
22:07:29 Started 1 B-tree scanners.
22:07:29 B-tree scanner threshold set at 5000.
22:07:29 B-tree scanner range scan size set to -1.
22:07:29 B-tree scanner ALICE mode set to 6.
22:07:29 B-tree scanner index compression level set to med.
22:07:29 Physical Recovery Started at Page (2:5459).
22:07:29 Physical Recovery Complete: 0 Pages Examined, 0 Pages Restored.
22:07:29 Logical Recovery Started.
22:07:29 5 recovery worker threads will be started.
22:07:30 Logical Recovery has reached the transaction cleanup phase.
22:07:30 Logical Recovery Complete.
6 Committed, 0 Rolled Back, 0 Open, 0 Bad Locks
22:07:31 Onconfig parameter STORAGE_FULL_ALARM modified from 0 to 3.
22:07:31 Dataskip is now OFF for all dbspaces
22:07:31 Init operation complete - Mode Online
22:07:31 Checkpoint Completed: duration was 0 seconds.
22:07:31 Tue May 15 - loguniq 21, logpos 0x500b4, timestamp: 0xa252b Interval: 62
22:07:31 Maximum server connections 0
22:07:31 Checkpoint Statistics - Avg. Txn Block Time 0.000, # Txns blocked 0, Plog used 16, Llog used 1
22:07:31 On-Line Mode
22:07:34 SCHAPI: Started dbScheduler thread.
22:07:36 Defragmenter cleaner thread now running
22:07:36 Defragmenter cleaner thread cleaned:0 partitions
22:07:36 Booting Language <spl> from module <>
22:07:36 Loading Module <SPLNULL>
22:07:36 Auto Registration is synced
22:07:36 SCHAPI: Started 2 dbWorker threads.
22:07:38 SCHAPI: [Auto Update Statistics Refresh 33-4] Error -206 The specified table (aus_command) is not in the database.
22:07:38 SCHAPI: [Auto Update Statistics Refresh 33-4] Error -111 ISAM error: no record found.
22:07:38 SCHAPI: [Auto Update Statistics Refresh 33-4] Error -206 The specified table (aus_command) is not in the database.
22:07:38 SCHAPI: [Auto Update Statistics Refresh 33-4] Error -111 ISAM error: no record found.
22:07:40 SCHAPI: [Auto Update Statistics Evaluation 32-8] Error -242 Could not open database table (informix.aus_command).
22:07:40 SCHAPI: [Auto Update Statistics Evaluation 32-8] Error -106 ISAM error: non-exclusive access.
22:07:45 Logical Log 21 Complete, timestamp: 0xae6d3.
22:23:40 Explain file for session 31 : C:\PROGRA~1\IBM\Informix\11.70\sqexpln\cost.out
22:24:07 Explain file for session 31 : C:\PROGRA~1\IBM\Informix\11.70\sqexpln\cost.out
22:24:24 Explain file for session 31 : C:\PROGRA~1\IBM\Informix\11.70\sqexpln\cost.out
22:27:10 Checkpoint Completed: duration was 0 seconds.
22:27:10 Tue May 15 - loguniq 22, logpos 0x545018, timestamp: 0xb4f29 Interval: 63
22:27:10 Maximum server connections 1
22:27:10 Checkpoint Statistics - Avg. Txn Block Time 0.000, # Txns blocked 0, Plog used 657, Llog used 3573
22:27:11 IBM Informix Dynamic Server Stopped.
