Error: unable to perform an operation on node 'rabbit@localhost' - docker

So I have an issue with docker-compose and rabbitmq.
I run docker-compose up. Everything spins up. Docker-compose:
services:
  rabbitmq3:
    image: "rabbitmq:3-management"
    hostname: "localhost"
    command: rabbitmq-server
    ports:
      - 5672:5672
      - 15672:15672
Then I run sudo rabbitmqctl status on the host to check the connection to the node. I get this error:
Error: unable to perform an operation on node 'rabbit@localhost'. Please see diagnostics information and suggestions below.
Most common reasons for this are:
* Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
* CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
* Target node is not running
In addition to the diagnostics info below:
* See the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more
* Consult server logs on node rabbit@localhost
* If target node is configured to use long node names, don't forget to use --longnames with CLI tools
DIAGNOSTICS
===========
attempted to contact: [rabbit@localhost]
rabbit@localhost:
* connected to epmd (port 4369) on localhost
* epmd reports: node 'rabbit' not running at all
no other nodes on localhost
* suggestion: start the node
Current node details:
* node name: 'rabbitmqcli-25456-rabbit@localhost'
* effective user's home directory: /Users/olof.grund
* Erlang cookie hash: d1oONiVA/qogGxkf6vs9Rw==
When I run it in the container, with docker-compose exec -T rabbitmq3 rabbitmqctl status, it works.
Do I need to expose something from docker somehow? Some rabbitmq client or node maybe?

I tried all the tips I found in other sources (adding the IP to /etc/hosts, restarting containers and services). It took me a day to finally get this to work, and it boils down to this:
<wait for 60secs since the rabbit container has been started>
rabbitmqctl stop_app
rabbitmqctl reset
rabbitmqctl force_boot
rabbitmqctl start_app
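
The sequence above can be scripted. What follows is a minimal sketch, not the answerer's own tooling: it assumes the commands are run inside the container through docker-compose exec, and that the service is named rabbitmq3 as in the question's compose file. The names build_commands and reset_node are hypothetical. Note that rabbitmqctl reset wipes the node's data, so this is only suitable for a throwaway development broker.

```python
import subprocess
import time

# The four rabbitmqctl subcommands from the answer, in the same order.
RESET_STEPS = ["stop_app", "reset", "force_boot", "start_app"]


def build_commands(service="rabbitmq3"):
    """Return one docker-compose argv list per rabbitmqctl step.

    `service` is an assumption: the service name used in the question's
    compose file.
    """
    return [
        ["docker-compose", "exec", "-T", service, "rabbitmqctl", step]
        for step in RESET_STEPS
    ]


def reset_node(service="rabbitmq3", startup_wait=60,
               run=subprocess.run, sleep=time.sleep):
    """Wait for the broker to boot, then run each step in order.

    check=True makes subprocess.run raise on the first failing step,
    so later steps are not attempted.
    """
    sleep(startup_wait)
    for argv in build_commands(service):
        run(argv, check=True)
```

Calling reset_node() with no arguments waits 60 seconds and then runs the four steps. The run and sleep parameters exist only so the function can be exercised without Docker.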

RabbitMQ uses Erlang's distribution protocol, which needs port 4369 open for EPMD (the Erlang Port Mapper Daemon). Expose that port in docker-compose, and stop the EPMD running on your host so that the port is free for the container's.
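
A quick way to see whether anything is answering on the EPMD port is a plain TCP probe. This is a minimal sketch; port_is_open is a hypothetical helper name. It only shows that something is listening on the port, and cannot tell the host's EPMD from the container's.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if port_is_open("localhost", 4369):
    print("something is listening on the EPMD port (4369)")
else:
    print("nothing is listening on the EPMD port (4369)")
```

If the probe succeeds while the container is stopped, the listener is most likely an EPMD running on the host.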

Related

SSL(curl) connection error in ElasticSearch setup

I have set up a 3-node Elasticsearch cluster using docker-compose, following the steps below.
On one of the master nodes, es11, I get the error below; the same curl command works fine on the other 2 nodes, es12 and es13:
Error:
curl -X GET 'https://localhost:9316'
curl: (35) Encountered end of file
Below error in logs:
"stacktrace": ["org.elasticsearch.transport.RemoteTransportException: [es13][SOMEIP:9316][internal:cluster/coordination/join]",
"Caused by: org.elasticsearch.transport.ConnectTransportException: [es11][SOMEIP:9316] handshake failed. unexpected remote node {es13}{SOMEVALUE}{SOMEVALUE
"at org.elasticsearch.transport.TransportService.lambda$connectionValidator$6(TransportService.java:468) ~[elasticsearch-7.17.6.jar:7.17.6]",
"at org.elasticsearch.action.ActionListener$MappedActionListener.onResponse(ActionListener.java:95) ~[elasticsearch-7.17.6.jar:7.17.6]",
"at org.elasticsearch.transport.TransportService.lambda$handshake$9(TransportService.java:577) ~[elasticsearch-7.17.6.jar:7.17.6]",
Opening https://localhost:9316 in a browser gives a "site can't be reached" error as well. It seems the SSL certificate created in step 4 below has some issue on es11.
Any leads, please? Or, if I repeat step 4, do I need to copy the certs to es12 and es13 again?
Below elasticsearch.yml
cluster.name: "docker-cluster"
network.host: 0.0.0.0
Ports as defined in all 3 nodes docker-compose.yml
environment:
  - node.name=es11
  - transport.port=9316
ports:
  - 9216:9200
  - 9316:9316
1. Initialize a docker swarm. On ES11 run docker swarm init. Follow the instructions to join 12 and 13 to the swarm.
2. Create an overlay network docker network create -d overlay --attachable elastic
3. If necessary, bring down the current cluster and remove all the associated volumes by running docker-compose down -v
4. Create SSL certificates for ES with docker-compose -f create-certs.yml run --rm create_certs
5. Copy the certs for es12 and 13 to the respective servers
6. Use this busybox to create the overlay network on 12 and 13 sudo docker run -itd --name containerX --net [network name] busybox
7. Configure certs on 12 and 13 with docker-compose -f config-certs.yml run --rm config_certs
8. Start the cluster with docker-compose up -d on each server
9. Set the passwords for the built-in ES accounts by logging into the cluster docker exec -it es11 sh then running bin/elasticsearch-setup-passwords interactive --url localhost:9316
(as per your https://discuss.elastic.co thread)
You cannot talk HTTP to the transport protocol port, which you have defined in transport.port. You need to talk to port 9200 in the container, which you have mapped to 9216 outside the container.
The transport port runs a binary protocol that is not accessible over HTTP.
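
To make the mapping concrete, here is a small sketch that reads compose-style port mappings and reports which host port reaches the container's HTTP port. The helper name host_port_for is hypothetical; the mappings are the ones from the question.

```python
def host_port_for(mappings, container_port):
    """Return the host port published for container_port, or None.

    Each mapping is a compose-style "HOST:CONTAINER" or
    "IP:HOST:CONTAINER" string.
    """
    for mapping in mappings:
        host, _, container = mapping.rpartition(":")
        if not host:
            # A bare "CONTAINER" entry gets a random host port; skip it.
            continue
        if int(container) == container_port:
            # Keep only the port when the host side is "IP:PORT".
            return int(host.rpartition(":")[2])
    return None


ports = ["9216:9200", "9316:9316"]
http_port = host_port_for(ports, 9200)
print(f"https://localhost:{http_port}")  # prints https://localhost:9216
```

From the host, that is the URL to pass to curl; 9316 remains the transport port and will not answer HTTP requests.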

Error accessing Scylladb cluster outside docker container

I'm running Scylladb locally in a docker container and I want to access the cluster outside the docker container. That's when I'm getting the following error: cassandra.cluster.NoHostAvailable: ('Unable to connect to any servers')
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 172.17.0.2 776 KB 256 ? ad698c75-a465-4deb-a92c-0b667e82a84f rack1
Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless
Cluster Information:
Name: Test Cluster
Snitch: org.apache.cassandra.locator.SimpleSnitch
DynamicEndPointSnitch: disabled
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
443048b2-c1fe-395e-accd-5ae9b6828464: [172.17.0.2]
I have no problem accessing the cluster using cqlsh on port 9042:
Connected to at 172.17.0.2:9042.
[cqlsh 5.0.1 | Cassandra 3.0.8 | CQL spec 3.3.1 | Native protocol v4]
Now I'm trying to access the cluster from my fastapi app that is outside the docker container.
from cassandra.cluster import Cluster
cluster = Cluster(['172.17.0.2'])
session = cluster.connect('Test Cluster')
And here's the Error that I'm getting:
raise NoHostAvailable("Unable to connect to any servers", errors)
cassandra.cluster.NoHostAvailable: ('Unable to connect to any servers', {'172.17.0.2:9042': OSError(51, "Tried connecting to [('172.17.0.2', 9042)]. Last error: Network is unreachable")})
With a little bit of tinkering, it's possible to connect from outside the container to Scylla running inside it for local development.
I've tried this on an M1 Mac with Docker Desktop:
Run the scylla container with a couple of new parameters:
--listen-address 0.0.0.0, for simplicity, since Scylla runs inside the container and this allows connections to it from any network
--broadcast-rpc-address 127.0.0.1, which is required when --listen-address is set to 0.0.0.0. We are going to forward port 9042 from the container to the host (local) machine, so this is the IP where it will be accessible.
The final command to spawn the container is:
$ docker run --rm -ti \
-p 127.0.0.1:9042:9042 \
scylladb/scylla \
--smp 1 \
--listen-address 0.0.0.0 \
--broadcast-rpc-address 127.0.0.1
The -p 127.0.0.1:9042:9042 makes port 9042 accessible on the host (local) machine.
Install the driver with pip3 install scylla-driver, as it supports the darwin/arm64 architecture.
Write a simple python script:
# so74265199.py
from cassandra.cluster import Cluster
cluster = Cluster(['127.0.0.1'])
session = cluster.connect()
# Select from a table that is available without keyspace
res = session.execute('SELECT * FROM system.versions')
print(res.one())
Run your script
$ python3 so74265199.py
Row(key='local', build_id='71178cf6db7021896cd8251751b78b3d9e3afa8d', build_mode='release', version='5.0.5-0.20221009.5a97a1060')
Disclaimer: I'm not an expert in Scylla's configuration, so feel free to point out a better approach.
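
The original error came from using the container-internal address 172.17.0.2 from the host. As a rough sketch, an application could guard against that by swapping such an address for the published one. This assumes Docker's default bridge subnet of 172.17.0.0/16 and that port 9042 is published on 127.0.0.1 as in the command above; the function names are hypothetical.

```python
import ipaddress

# Assumption: Docker's default bridge network uses 172.17.0.0/16.
DEFAULT_BRIDGE = ipaddress.ip_network("172.17.0.0/16")


def is_container_internal(address, bridge=DEFAULT_BRIDGE):
    """Return True if address falls inside the (assumed) bridge subnet."""
    return ipaddress.ip_address(address) in bridge


def pick_contact_point(address, published_host="127.0.0.1"):
    """Use the published host address instead of a container-internal one."""
    return published_host if is_container_internal(address) else address


print(pick_contact_point("172.17.0.2"))  # prints 127.0.0.1
```

The result would then be passed to Cluster([...]) in place of the hard-coded IP.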

Running a Chainlink Node - Can't connect to database

Using docker-desktop on macOS.
I'm trying to run a node following the instructions on this page.
The database name is node, which is the same as the username: node. The user has access to the database and can log in using psql client.
Connection strings I've tried in the .env file:
postgresql://node@localhost/node
postgresql://node:password@localhost/node
postgresql://node:password@localhost:5432/node
postgresql://node:password@127.0.0.1:5432/node
postgresql://node:password@127.0.0.1/node
When I run the start command: cd ~/.chainlink-kovan && docker run -p 6688:6688 -v ~/.chainlink-kovan:/chainlink -it --env-file=.env smartcontract/chainlink local n , using docker-desktop on macOS, I get the following stack trace:
2020-09-15T14:24:41Z [INFO] Starting Chainlink Node 0.8.15 at commit a904730bd62c7174b80a2c4ccf885de3e78e3971 cmd/local_client.go:50
2020-09-15T14:24:41Z [INFO] SGX enclave *NOT* loaded cmd/enclave.go:11
2020-09-15T14:24:41Z [INFO] This version of chainlink was not built with support for SGX tasks cmd/enclave.go:12
2020-09-15T14:24:41Z [INFO] Locking postgres for exclusive access with 500ms timeout orm/orm.go:69
2020-09-15T14:24:41Z [ERROR] unable to lock ORM: dial tcp 127.0.0.1:5432: connect: connection refused logger/default.go:139 stacktrace=github.com/smartcontractkit/chainlink/core/logger.Error
/chainlink/core/logger/default.go:117
...
Does anyone know how I can resolve this?
The problem is probably that your chainlink database is held by an exclusive lock that was never released when the node was stopped.
What worked for me in this situation: use the pgAdmin UI (or a similar tool) to list all locks, find the exclusive lock held on the chainlink database, and note its process ID (or IDs, if there are several exclusive locks on that database).
Then log in with your pg client and run SELECT pg_terminate_backend(<pid>); or SELECT pg_cancel_backend(<pid>); with each PID, unquoted, in place of <pid>. Keep refreshing pgAdmin to see whether those processes have stopped; once they have, rerun your chainlink node.
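The problem is with Docker networking: inside the container, 127.0.0.1 refers to the container itself, not to the host where Postgres is listening.
Add --network host to the docker run command, before the image name (anything after the image name is passed to the container as arguments), so that it is:
cd ~/.chainlink-kovan && docker run --network host -p 6688:6688 -v ~/.chainlink-kovan:/chainlink -it --env-file=.env smartcontract/chainlink local n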
This fixes the issue.
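
A different approach, offered here as an assumption and not taken from the answer above, is to leave the network mode alone and point the connection string at host.docker.internal, the name Docker Desktop provides for reaching the host from a container. The sketch below rewrites a loopback host in a database URL; point_at_docker_host is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit

LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def point_at_docker_host(url, docker_host="host.docker.internal"):
    """Replace a loopback host in url with docker_host; keep everything else."""
    parts = urlsplit(url)
    if parts.hostname not in LOOPBACK_HOSTS:
        return url
    # Rebuild the user:password@ prefix exactly as it was given.
    userinfo = ""
    if parts.username is not None:
        userinfo = parts.username
        if parts.password is not None:
            userinfo += ":" + parts.password
        userinfo += "@"
    port = f":{parts.port}" if parts.port is not None else ""
    return urlunsplit(parts._replace(netloc=f"{userinfo}{docker_host}{port}"))


print(point_at_docker_host("postgresql://node:password@127.0.0.1:5432/node"))
# prints postgresql://node:password@host.docker.internal:5432/node
```

The rewritten value would go into the .env file in place of the localhost connection string.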

Starting Redis cluster hangs when calling redis-trib

I have tried to setup a Redis cluster running docker but it hangs when I try to join them. My docker ps gives me this:
Notice the port mapping.
All containers have this basic redis.conf file
port 6379
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes
cluster-announce-ip 127.0.0.1
cluster-announce-port [7001, 7002, 7003, 7004, 7005 or 7006]
cluster-announce-bus-port [7101, 7102, 7103, 7104, 7105 or 7106]
Where the only change is the cluster-announce-port and cluster-announce-bus-port for each docker container. I hope you get the point.
I try to join the nodes with ./redis-trib.rb create --replicas 1 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 127.0.0.1:7006
And it discovers it perfectly and asking if the config should be accepted:
But then redis-trib hangs indefinitely with "Waiting for the cluster to join". I can see through docker logs r_1 to r_6, that the epoch is getting set:
1:M 15 Jul 10:38:08.493 # configEpoch set to 1 via CLUSTER SET-CONFIG-EPOCH
So redis-trib does call the different nodes.
I can't really find anything about the cluster-announce variables anywhere. Does anyone here know how to do this? I think my problem lies in this part.
The redis version I am using is 4.0.10.
OK, so I figured it out. I needed to:
set my cluster-announce-ip to the address of the Ethernet adapter that was created when installing Docker (open up a terminal and run ipconfig)
update redis-trib.rb to reflect this IP
map port 16379 (the cluster bus port) when the docker container is created
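
By default the Redis Cluster bus listens on the client port plus 10000, which is why 16379 has to be published alongside 6379. The following sketch generates the docker -p flags for the six nodes, assuming the host ports 7001-7006 and 7101-7106 from the question's cluster-announce settings; publish_flags is a hypothetical helper name.

```python
REDIS_PORT = 6379
BUS_OFFSET = 10000  # default cluster bus port = client port + 10000


def publish_flags(node_index, client_base=7000, bus_base=7100):
    """Return the docker `-p` arguments for node 1..N.

    The host ports mirror the cluster-announce-port and
    cluster-announce-bus-port values from the question.
    """
    client = client_base + node_index
    bus = bus_base + node_index
    return [
        "-p", f"{client}:{REDIS_PORT}",
        "-p", f"{bus}:{REDIS_PORT + BUS_OFFSET}",
    ]


for i in range(1, 7):
    print(" ".join(publish_flags(i)))
```

The first line printed is -p 7001:6379 -p 7101:16379, and the rest follow the same pattern up to node 6.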

Cannot find main class SolrCLI when running bin/solr -e cloud

I want to compile solr from the main trunk and run it.
I did the following:
git clone https://github.com/apache/lucene-solr.git
cd lucene-solr/solr
ant dist
bin/solr -e cloud
This creates the relevant solr nodes but fails to create a collection with the following error:
$ bin/solr -e cloud
Welcome to the SolrCloud example!
This interactive session will help you launch a SolrCloud cluster on your local workstation.
To begin, how many Solr nodes would you like to run in your local cluster? (specify 1-4 nodes) [2]
Ok, let's start up 2 Solr nodes for your example SolrCloud cluster.
Please enter the port for node1 [8983]
8983
Please enter the port for node2 [7574]
7574
Starting up SolrCloud node1 on port 8983 using command:
solr start -cloud -s example/cloud/node1/solr -p 8983
Waiting to see Solr listening on port 8983 [|]
Started Solr server on port 8983 (pid=94888). Happy searching!
Starting node2 on port 7574 using command:
solr start -cloud -s example/cloud/node2/solr -p 7574 -z localhost:9983
Waiting to see Solr listening on port 7574 [|]
Started Solr server on port 7574 (pid=94979). Happy searching!
Now let's create a new collection for indexing documents in your 2-node cluster.
Please provide a name for your new collection: [gettingstarted]
gettingstarted
How many shards would you like to split gettingstarted into? [2]
2
How many replicas per shard would you like to create? [2]
2
Please choose a configuration for the gettingstarted collection, available options are:
basic_configs, data_driven_schema_configs, or sample_techproducts_configs [data_driven_schema_configs]
Error: Could not find or load main class org.apache.solr.util.SolrCLI
I am sure this used to work before.
But I am not able to figure out what's wrong.
Any help would be appreciated.
You need to run ant server to fix the classpath issue (or ant example on older versions).

Resources