Neo4j backup error when backing up from HA cluster

I'm trying to set up backups for a Neo4j cluster with 3 instances. Neo4j is embedded.
If I run:
./neo4j-backup -from ha://10.106.4.80:5001,10.106.4.203:5001,10.106.14.164:5001 -to /tmp/neobak2/
from a host outside the 10.106.4.0 network, I get this error:
Could not find backup server in cluster neo4j.ha at 10.106.4.80:5001,10.106.4.203:5001,10.106.14.164:5001, operation timed out.
If I run it from a cluster member, it works just fine. Also, if I run the backup script with single:// instead of ha://, it works fine from anywhere.
Below is the basic cluster config I'm using:
ha.server_id: 1
ha.initial_hosts: 10.106.4.80:5001,10.106.4.203:5001,10.106.14.164:5001
ha.tx_push_factor: 2
I already checked for firewall issues; there aren't any. The Neo4j version used is 1.9.5.
The webadmin interface shows the cluster has online backup enabled and listening to the default port.
Any help will be appreciated.

According to RFC 5735, addresses in 10.0.0.0/8 are private, so I assume they're not routable from an external host.
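If the private coordinator addresses really aren't routable from the backup host, one workaround, building on the asker's observation that single mode works from anywhere, is to point the backup at one reachable instance directly (a sketch; the address and target directory are taken from the question, and that instance must have online backup enabled):
./neo4j-backup -from single://10.106.4.80 -to /tmp/neobak2/
This skips the cluster discovery step that is timing out, at the cost of always pulling from that one instance.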

Related

How to setup a connection between Python Couchbase SDK and containerized Couchbase nodes?

I want to have the following setup:
3 Couchbase nodes, each running on a separate container, all in the same cluster
Python application running in another container (querying, inserting, deleting data from the Couchbase cluster)
What I managed to do:
Set up a cluster, bucket, query the bucket via UI (accessed by localhost:8091)
What I didn't manage to do:
Create a connection between a Python application (which will eventually be Dockerized; for now, for the sake of simplicity, let's treat it as local) and the (already working) cluster. Unfortunately, I cannot access it via the Docker containers' IPs on port 8091, nor via localhost. The Couchbase documentation is either severely lacking here, or I just don't understand it. I even tried the setting-alternate-address option, but without much success (maybe I used it wrongly, so if you have any how-to's explaining the process, I'd still be grateful).
The connection works if there is one node, but throws Timeout if I set up 3 nodes.
I would really appreciate any tips leading to solving this problem.
EDIT: Adding code and error message:
# imports assume the 4.x Python SDK (the C-core error below suggests 4.x)
import os

from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions

connection_string = "couchbase://localhost"
cluster = Cluster.connect(connection_string, ClusterOptions(PasswordAuthenticator(os.getenv("LOGIN"), os.getenv("PASSWORD"))))
# following a successful authentication, a bucket can be opened.
# access a bucket in that cluster
bucket = cluster.bucket('travel-sample')
coll = bucket.default_collection()
result = coll.get('airline_10')
print(result.content_as[dict])
Error message:
couchbase.exceptions.UnAmbiguousTimeoutException: <ec=14, category=couchbase.common, message=unambiguous_timeout, context=KeyValueErrorContext:{'key': 'airline_10', 'bucket_name': 'travel-sample', 'scope_name': '_default', 'collection_name': '_default', 'opaque': 0}, C Source=C:\Jenkins\workspace\python\sdk\python-scripted-build-pipeline\py-client\src\kv_ops.cxx:209>
Couchbase SDKs need to be able to connect to every node in the cluster.
If you are running the app outside of the Docker host, it cannot connect to every node (you can't expose every node on the same port).
This is exactly why it works fine with one node but not with multiple (more details in the documentation).
If you run the Python app inside a container on the same Docker host, it should connect just fine (or stick to a single node for development, which is mostly fine unless you're testing something specific to clustering/failover/replication).
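For reference, a minimal sketch of the "run it on the same Docker network" approach (the network, container, and app image names are placeholders):
docker network create cbnet
docker run -d --name cb1 --network cbnet couchbase
docker run -d --name cb2 --network cbnet couchbase
docker run -d --name cb3 --network cbnet couchbase
docker run --network cbnet -e LOGIN=... -e PASSWORD=... my-python-app
Inside my-python-app, the connection string would then be couchbase://cb1 instead of couchbase://localhost, and all three nodes are reachable by container name.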

RabbitMQ Unable to Join Cluster

I am trying to learn how to cluster RabbitMQ nodes, and I am following this tutorial as well as the official documentation.
I have 2 physical machines with RabbitMQ deployed on them through Docker: machine1 (192.168.1.2) is to seed the cluster, and machine2 (192.168.1.3) is to join it.
When I attempt to run rabbitmqctl join_cluster rabbit@192.168.1.2 from machine2, it fails with the following message.
Clustering node rabbit@node2.rabbit with rabbit@192.168.1.2
Error: unable to perform an operation on node 'rabbit@192.168.1.2'. Please see diagnostics information and suggestions below.
Most common reasons for this are:
* Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
* CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
* Target node is not running
In addition to the diagnostics info below:
* See the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more
* Consult server logs on node rabbit@192.168.1.2
* If target node is configured to use long node names, don't forget to use --longnames with CLI tools
DIAGNOSTICS
===========
attempted to contact: ['rabbit@192.168.1.2']
rabbit@192.168.1.3:
* connected to epmd (port 4369) on 192.168.1.2
* epmd reports node 'rabbit' uses port 25672 for inter-node and CLI tool traffic
* TCP connection succeeded but Erlang distribution failed
* suggestion: check if the Erlang cookie is identical for all server nodes and CLI tools
* suggestion: check if all server nodes and CLI tools use consistent hostnames when addressing each other
* suggestion: check if inter-node connections may be configured to use TLS. If so, all nodes and CLI tools must do that
* suggestion: see the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more
Current node details:
* node name: 'rabbitmqcli-1352-rabbit@node2.rabbit'
* effective user's home directory: /var/lib/rabbitmq
* Erlang cookie hash: XXXXXXXXXXXXX
The error logs on machine1 show nothing related to such a connection attempt. I have verified the md5sum of the cookies on both docker containers and they are exactly the same. So are the permissions.
I assumed perhaps the port 4369 isn't reachable, but it is.
I am unsure what I am doing wrong. Can someone help here?
Additional information:
I am using the rabbitmq:3.8.5-management image. It uses Erlang/OTP 23 [erts-11.0.3].
I have been checking the troubleshooting guide, but I am unsure what seems wrong here. Please let me know if I can provide more information.
So thanks to @NeoAnderson and @José M, I was able to understand what happened.
The containers running RMQ need to be reachable across the network via the hostnames that Erlang uses within the service. Since the containers' hostnames were not resolvable from a container on the other machine, clustering failed.
A simple fix is to edit the /etc/hosts file in each container so that the other node's hostname resolves to that machine's IP.
I was only doing this to avoid installing RMQ directly, not because I thought it was the best way to do this. Alternatively, Docker Swarm or k8s would have provided the right networking for me.
But the root cause was definitely the node-name problem.
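A sketch of that fix, assuming node1.rabbit is the hostname the Erlang node on machine1 reports (the hostname is a placeholder for whatever your setup actually uses):
# inside machine2's container, map machine1's node hostname to its IP
echo "192.168.1.2 node1.rabbit" >> /etc/hosts
# or, equivalently, inject the mapping at container start time
docker run --add-host node1.rabbit:192.168.1.2 ... rabbitmq:3.8.5-management
Here "..." stands for the rest of the original docker run arguments.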

Neo4j setup in OpenShift

I am having difficulties deploying the official Neo4j Docker image https://hub.docker.com/_/neo4j to an OpenShift environment and accessing it from outside (from my local machine).
I have performed the following steps:
oc new-app neo4j
Created route for port 7474
Set the environment variable NEO4J_dbms_connector_bolt_listen__address to 0.0.0.0:7687, which is the equivalent of setting dbms.connector.bolt.listen_address=0.0.0.0:7687 in the neo4j.conf file.
Access the route URL from the local machine, which opens the Neo4j Browser, which requires authentication. At this point I am blocked, because any combination of URLs I try is unsuccessful.
As a workaround I have managed to forward port 7687 to my local machine, install the Neo4j Desktop client, and connect via bolt://localhost:7687, but this is not the ideal solution.
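The forwarding itself can be done with the oc CLI (the pod name is a placeholder for whatever oc get pods shows):
oc port-forward neo4j-1-abcde 7687:7687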
Therefore there are two questions:
1. How can I connect from the Neo4j Browser to its own database?
2. How can I connect from an external environment (through the OpenShift route) to the Neo4j DB?
I have no experience with OpenShift, but try adding the following config:
dbms.default_listen_address=0.0.0.0
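Since the image is being configured through environment variables, this setting should map to the following variable, following the same double-underscore convention the asker already used for the Bolt connector:
NEO4J_dbms_default__listen__address=0.0.0.0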
Is there any other way for you to connect to Neo4j, so that you could further inspect the issue?
Short answer:
Failing to connect to the DB is most likely a configuration issue; maybe Tomaž Bratanič's answer is the solution. As for accessing the DB from outside, you will most likely need a NodePort.
Long answer:
Note that OpenShift Routes are for HTTP / HTTPS traffic and not for any other kind of traffic. Typically, the "Routers" of an OpenShift cluster listen only on Port 80 and 443, so connecting to your database on any other port will most likely not work (although this heavily depends on your cluster configuration).
The solution for non-HTTP(S) traffic is to use NodePorts as described in the OpenShift documentation: https://docs.openshift.com/container-platform/3.11/dev_guide/expose_service/expose_internal_ip_nodeport.html
Note that for NodePorts, too, you might need your cluster administrator to add additional ports to the load balancer, or you might need to connect to the OpenShift nodes directly. Refer to the documentation on how to use NodePorts.
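A sketch of one way to do that with the oc client (assuming the deployment config is named neo4j, as created by oc new-app; the flags mirror kubectl expose and may vary by OpenShift version):
oc expose dc neo4j --type=NodePort --name=neo4j-bolt --port=7687
oc get service neo4j-bolt
Clients outside the cluster would then connect with bolt://<node-ip>:<allocated-node-port>.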

Connecting to Cassandra nodes on a DataStax cluster on EC2 from Ruby on Rails

I created a DataStax Enterprise cluster with 2 Cassandra nodes, 2 Search nodes, and 2 Analytics nodes.
Everything seems to work correctly EXCEPT that I can't connect to it from outside. If I'm on the node0 server I can run cassandra-cli and connect to the Cassandra nodes on port 9160, but when I try to connect using the datastax-rails gem, I get "No live servers". I also tried DataStax DevCenter, which connects to the native port 9042, but that didn't work either. I'm really puzzled; any help is appreciated.
So after some digging I found some issues:
1. Port 9160 is open, and I can connect to it with telnet node0_ip 9160.
2. When I run rake ds:migrate, I get "No live servers in node0_ip".
3. I tried connecting with the 'cassandra' gem instead, from IRB:
a. client = Cassandra.new('example', 'node0_ip:9160')
b. client.insert(:users, "5", {'screen_name' => "buttonscat4"})
This gave a similar error, ThriftClient::NoServersAvailable: No live servers, but this time listing the IPs of all the nodes in the cluster.
4. I tried adding client.disable_node_auto_discovery! and was then able to connect and add data using the 'cassandra' gem.
5. I also found on https://github.com/cassandra-rb/cassandra/issues/171 that I need to change my server to bind to a non-loopback address, but I have no idea what that means.
The question now is how to do that.
Sounds like you need to open up your EC2 security group to the outside on port 9160, specifically the security group that your node0 is using.
You can find more information about them here:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-network-security.html
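For illustration, a sketch of opening that port with the AWS CLI (the group name and source CIDR are placeholders; the same can be done in the EC2 console):
aws ec2 authorize-security-group-ingress --group-name node0-sg --protocol tcp --port 9160 --cidr 203.0.113.7/32
Restricting the source CIDR to the client's address is safer than opening 9160 to the whole internet.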
I was getting the same error and got this to work by using disable_node_auto_discovery!
You can see in the documentation for this method that it says "This is primarily helpful when the cassandra cluster is communicating internally on a different ip address than what you are using to connect. A prime example of this would be when using EC2 to host a cluster. Typically, the cluster would be communicating over the local ip addresses issued by Amazon, but any clients connecting from outside EC2 would need to use the public ip."
http://rdoc.info/github/cassandra-rb/cassandra/master/Cassandra:disable_node_auto_discovery!

Cassandra Cluster Setup getting JMX error

I'm trying to set up a Cassandra cluster as a test bed, but I got a JMX remote connection error. I seem to have found the answer to my error on the Cassandra FAQ page:
Nodetool says "Connection refused to host: 127.0.1.1" for any remote host. What gives?
Nodetool relies on JMX, which in turn relies on RMI, which in turn sets up its own listeners and connectors as needed on each end of the exchange. Normally all of this happens behind the scenes transparently, but incorrect name resolution for either the host connecting, or the one being connected to, can result in crossed wires and confusing exceptions.
If you are not using DNS, then make sure that your /etc/hosts files are accurate on both ends. If that fails try passing the -Djava.rmi.server.hostname=$IP option to the JVM at startup (where $IP is the address of the interface you can reach from the remote machine).
But can somebody help me with how to pass -Djava.rmi.server.hostname=$IP?
Or what to add to the hosts file? I know that hosts entries are normally "IP alias" pairs, but whose IP and which alias?
I don't know much Java or Linux.
I'm currently working on Ubuntu 10.04 and Cassandra 0.7.4.
Sudesh
For JMX you need to enable JMX remoting:
java -Dcom.sun.management.jmxremote
Depending on where you want to access the JMX server from, you also need to specify a port:
-Dcom.sun.management.jmxremote.port=12345
and set or disable password authentication.
Have a look at http://download.oracle.com/javase/1.5.0/docs/guide/management/agent.html for more details.
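To make the FAQ's advice concrete for Cassandra 0.7, where the JVM options live in conf/cassandra-env.sh, a minimal sketch (192.168.1.10 and cassandra1 are placeholders for your node's externally reachable IP and hostname):
# in conf/cassandra-env.sh on the node nodetool cannot reach:
JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=192.168.1.10"
Alternatively (or additionally), in /etc/hosts on that node, map its own hostname to the reachable interface instead of 127.0.1.1:
192.168.1.10    cassandra1
After restarting Cassandra, nodetool -h 192.168.1.10 ring should connect from the remote machine.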
