Does Neo4j clustering require at least 3 nodes? - neo4j

I'm playing with Neo4j high availability clustering. Whilst the documentation indicates a cluster requires at least 3 nodes, or 2 with an arbiter, I'm wondering what the implications of running with only 2 nodes are.
If I set up a 3-node cluster and remove a node, I have no issues adding data. Likewise, if I set up the cluster with only 2 nodes, I can still add data and don't seem to be restricted in functionality. What limitations should I expect? For example, the following is the trace of a slave started in a 2-node cluster. Data can be added to the master with no issues, and queried.
2013-11-06 10:34:50.403+0000 INFO [Cluster] Attempting to join cluster of [127.0.0.1:5001, 127.0.0.1:5002]
2013-11-06 10:34:54.473+0000 INFO [Cluster] Joined cluster:Name:neo4j.ha Nodes:{1=cluster://127.0.0.1:5001, 2=cluster://127.0.0.1:5002} Roles:{coordinator=1}
2013-11-06 10:34:54.477+0000 INFO [Cluster] Instance 2 (this server) joined the cluster
2013-11-06 10:34:54.512+0000 INFO [Cluster] Instance 1 was elected as coordinator
2013-11-06 10:34:54.530+0000 INFO [Cluster] Instance 1 is available as master at ha://localhost:6363?serverId=1
2013-11-06 10:34:54.531+0000 INFO [Cluster] Instance 1 is available as backup at backup://localhost:6366
2013-11-06 10:34:54.537+0000 INFO [Cluster] ServerId 2, moving to slave for master ha://localhost:6363?serverId=1
2013-11-06 10:34:54.564+0000 INFO [Cluster] Checking store consistency with master
2013-11-06 10:34:54.620+0000 INFO [Cluster] The store does not represent the same database as master. Will remove and fetch a new one from master
2013-11-06 10:34:54.646+0000 INFO [Cluster] ServerId 2, moving to slave for master ha://localhost:6363?serverId=1
2013-11-06 10:34:54.658+0000 INFO [Cluster] Copying store from master
2013-11-06 10:34:54.687+0000 INFO [Cluster] Copying index/lucene-store.db
2013-11-06 10:34:54.688+0000 INFO [Cluster] Copied index/lucene-store.db
2013-11-06 10:34:54.688+0000 INFO [Cluster] Copying neostore.nodestore.db
2013-11-06 10:34:54.689+0000 INFO [Cluster] Copied neostore.nodestore.db
2013-11-06 10:34:54.689+0000 INFO [Cluster] Copying neostore.propertystore.db
2013-11-06 10:34:54.689+0000 INFO [Cluster] Copied neostore.propertystore.db
2013-11-06 10:34:54.689+0000 INFO [Cluster] Copying neostore.propertystore.db.arrays
2013-11-06 10:34:54.690+0000 INFO [Cluster] Copied neostore.propertystore.db.arrays
2013-11-06 10:34:54.690+0000 INFO [Cluster] Copying neostore.propertystore.db.index
2013-11-06 10:34:54.690+0000 INFO [Cluster] Copied neostore.propertystore.db.index
2013-11-06 10:34:54.690+0000 INFO [Cluster] Copying neostore.propertystore.db.index.keys
2013-11-06 10:34:54.691+0000 INFO [Cluster] Copied neostore.propertystore.db.index.keys
2013-11-06 10:34:54.691+0000 INFO [Cluster] Copying neostore.propertystore.db.strings
2013-11-06 10:34:54.691+0000 INFO [Cluster] Copied neostore.propertystore.db.strings
2013-11-06 10:34:54.691+0000 INFO [Cluster] Copying neostore.relationshipstore.db
2013-11-06 10:34:54.692+0000 INFO [Cluster] Copied neostore.relationshipstore.db
2013-11-06 10:34:54.692+0000 INFO [Cluster] Copying neostore.relationshiptypestore.db
2013-11-06 10:34:54.692+0000 INFO [Cluster] Copied neostore.relationshiptypestore.db
2013-11-06 10:34:54.692+0000 INFO [Cluster] Copying neostore.relationshiptypestore.db.names
2013-11-06 10:34:54.693+0000 INFO [Cluster] Copied neostore.relationshiptypestore.db.names
2013-11-06 10:34:54.693+0000 INFO [Cluster] Copying nioneo_logical.log.v0
2013-11-06 10:34:54.693+0000 INFO [Cluster] Copied nioneo_logical.log.v0
2013-11-06 10:34:54.693+0000 INFO [Cluster] Copying neostore
2013-11-06 10:34:54.694+0000 INFO [Cluster] Copied neostore
2013-11-06 10:34:54.694+0000 INFO [Cluster] Done, copied 12 files
2013-11-06 10:34:55.101+0000 INFO [Cluster] Finished copying store from master
2013-11-06 10:34:55.117+0000 INFO [Cluster] Checking store consistency with master
2013-11-06 10:34:55.123+0000 INFO [Cluster] Store is consistent
2013-11-06 10:34:55.124+0000 INFO [Cluster] Catching up with master
2013-11-06 10:34:55.125+0000 INFO [Cluster] Now consistent with master
2013-11-06 10:34:55.172+0000 INFO [Cluster] ServerId 2, successfully moved to slave for master ha://localhost:6363?serverId=1
2013-11-06 10:34:55.207+0000 INFO [Cluster] Instance 2 (this server) is available as slave at ha://localhost:6364?serverId=2
2013-11-06 10:34:55.261+0000 INFO [API] Successfully started database
2013-11-06 10:34:55.265+0000 INFO [Cluster] Database available for write transactions
2013-11-06 10:34:55.318+0000 INFO [API] Starting HTTP on port :8574 with 40 threads available
2013-11-06 10:34:55.614+0000 INFO [API] Enabling HTTPS on port :8575
2013-11-06 10:34:56.256+0000 INFO [API] Mounted REST API at: /db/manage/
2013-11-06 10:34:56.261+0000 INFO [API] Mounted discovery module at [/]
2013-11-06 10:34:56.341+0000 INFO [API] Loaded server plugin "CypherPlugin"
2013-11-06 10:34:56.344+0000 INFO [API] Loaded server plugin "GremlinPlugin"
2013-11-06 10:34:56.347+0000 INFO [API] Mounted REST API at [/db/data/]
2013-11-06 10:34:56.355+0000 INFO [API] Mounted management API at [/db/manage/]
2013-11-06 10:34:56.435+0000 INFO [API] Mounted webadmin at [/webadmin]
2013-11-06 10:34:56.477+0000 INFO [API] Mounting static content at [/webadmin] from [webadmin-html]
2013-11-06 10:34:57.923+0000 INFO [API] Remote interface ready and available at [http://localhost:8574/]
2013-11-06 10:35:52.829+0000 INFO [API] Available console sessions: SHELL: class org.neo4j.server.webadmin.console.ShellSessionCreator
CYPHER: class org.neo4j.server.webadmin.console.CypherSessionCreator
GREMLIN: class org.neo4j.server.webadmin.console.GremlinSessionCreator
Thanks

There are no implications in terms of Neo4j server functionality.
But in terms of high availability, it is better to have more than 2 servers in the cluster.

If there is a network failure between the 2 nodes and both are running but can't see each other, each will promote itself to master (a split-brain).
This may cause problems re-forming the cluster when the network recovers.
Adding a 3rd node (or an arbiter) means a master can only be elected by a majority of the cluster, so at most one partition can ever hold a master.
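For reference, here is a minimal sketch of the per-instance HA settings for a 3-member cluster, using the property names from the Neo4j 1.9/2.x HA configuration; the host and the third instance's ports are assumptions extrapolated from the log excerpt above:
# neo4j.properties on instance 1 (instances 2 and 3 use ha.server_id=2 and 3 plus their own ports)
ha.server_id=1
# list every cluster member so an instance can discover the others at startup
ha.initial_hosts=127.0.0.1:5001,127.0.0.1:5002,127.0.0.1:5003
# cluster-management channel for this instance
ha.cluster_server=127.0.0.1:5001
# HA data/transaction channel for this instance
ha.server=127.0.0.1:6363
With only two entries in ha.initial_hosts the cluster still forms, which is why the 2-node setup appears to work, but there is no tie-breaker left if the two members lose sight of each other.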

Related

Graphaware Framework and UUID not starting on Neo4j GrapheneDB

I am trying to get the GraphAware Framework and UUID module running on a GrapheneDB instance. I have followed the instructions to zip the JAR and neo4j.properties files and uploaded them using the GrapheneDB web interface, but UUIDs are not added when I create a new node.
neo4j.properties file
dbms.unmanaged_extension_classes=com.graphaware.server=/graphaware
com.graphaware.runtime.enabled=true
#UIDM becomes the module ID:
com.graphaware.module.UIDM.1=com.graphaware.module.uuid.UuidBootstrapper
#optional, default is uuid:
com.graphaware.module.UIDM.uuidProperty=uuid
#optional, default is false:
com.graphaware.module.UIDM.stripHyphens=true
#optional, default is all nodes:
#com.graphaware.module.UIDM.node=hasLabel('Label1') || hasLabel('Label2')
#optional, default is no relationships:
#com.graphaware.module.UIDM.relationship=isType('Type1')
com.graphaware.module.UIDM.relationship=com.graphaware.runtime.policy.all.IncludeAllBusinessRelationships
#optional, default is uuidIndex
com.graphaware.module.UIDM.uuidIndex=uuidIndex
#optional, default is uuidRelIndex
com.graphaware.module.UIDM.uuidRelationshipIndex=uuidRelIndex
Log Output
2017-03-02 10:20:40.184+0000 INFO Neo4j Server shutdown initiated by request
2017-03-02 10:20:40.209+0000 INFO [c.g.s.f.b.GraphAwareServerBootstrapper] stopped
2017-03-02 10:20:40.209+0000 INFO Stopping...
2017-03-02 10:20:40.982+0000 INFO Stopped.
2017-03-02 10:20:43.402+0000 INFO Starting...
2017-03-02 10:20:43.820+0000 INFO Bolt enabled on 0.0.0.0:7475.
2017-03-02 10:20:45.153+0000 INFO [c.g.r.b.RuntimeKernelExtension] GraphAware Runtime disabled.
2017-03-02 10:20:48.130+0000 INFO Started.
2017-03-02 10:20:48.343+0000 INFO [c.g.s.f.b.GraphAwareServerBootstrapper] started
2017-03-02 10:20:48.350+0000 INFO Mounted unmanaged extension [com.graphaware.server] at [/graphaware]
2017-03-02 10:20:48.724+0000 INFO Mounting GraphAware Framework at /graphaware
2017-03-02 10:20:48.755+0000 INFO Will try to scan the following packages: {com..graphaware.,org..graphaware.,net..graphaware.}
2017-03-02 10:20:52.633+0000 INFO Remote interface available at http://localhost:7474/
Messages.log Extract
2017-03-02 10:33:59.991+0000 INFO [o.n.k.i.DiagnosticsManager] --- STARTED diagnostics for KernelDiagnostics:StoreFiles END ---
2017-03-02 10:34:01.846+0000 INFO [o.n.k.i.DiagnosticsManager] --- SERVER STARTED START ---
2017-03-02 10:34:02.526+0000 INFO [c.g.s.f.b.GraphAwareBootstrappingFilter] Mounting GraphAware Framework at /graphaware
2017-03-02 10:34:02.547+0000 INFO [c.g.s.f.c.GraphAwareWebContextCreator] Will try to scan the following packages: {com..graphaware.,org..graphaware.,net..graphaware.}
2017-03-02 10:34:06.100+0000 INFO [o.n.k.i.DiagnosticsManager] --- SERVER STARTED END ---
It looks like the framework is not starting, but I have set enabled=true in the properties file.
Environment Setup
Neo4j Community Edition 3.1.1
graphaware-server-3.1.0.44
graphaware-uuid-3.1.0.44.13
Thanks

Hazelcast memory is continuously increasing

I have a Hazelcast cluster with two machines.
The only object in the cluster is a map. Analysing the log files, I noticed that the health monitor starts to report a slow increase in memory consumption even though no new entries are being added to the map (see the sample log entries below).
Any ideas what may be causing the memory increase?
2015-09-16 10:45:49 INFO HealthMonitor:? - [10.11.173.129]:5903 [dev] [3.2.1] memory.used=97.6M, memory.free=30.4M, memory.total=128.0M, memory.max=128.0M, memory.used/total=76.27%, memory.used/max=76.27%, load.process=0.00%, load.system=1.00%, load.systemAverage=3.00%, thread.count=96, thread.peakCount=107, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.priorityOperation.size=0, executor.q.response.size=0, operations.remote.size=1, operations.running.size=0, proxy.count=2, clientEndpoint.count=0, connection.active.count=2, connection.count=2
2015-09-16 10:46:02 INFO InternalPartitionService:? - [10.11.173.129]:5903 [dev] [3.2.1] Remaining migration tasks in queue = 51
2015-09-16 10:46:12 DEBUG TeleavisoIvrLoader:71 - Checking for new files...
2015-09-16 10:46:13 INFO InternalPartitionService:? - [10.11.173.129]:5903 [dev] [3.2.1] All migration tasks has been completed, queues are empty.
2015-09-16 10:46:19 INFO HealthMonitor:? - [10.11.173.129]:5903 [dev] [3.2.1] memory.used=103.9M, memory.free=24.1M, memory.total=128.0M, memory.max=128.0M, memory.used/total=81.21%, memory.used/max=81.21%, load.process=0.00%, load.system=1.00%, load.systemAverage=2.00%, thread.count=73, thread.peakCount=107, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.priorityOperation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=0, proxy.count=2, clientEndpoint.count=0, connection.active.count=2, connection.count=2
2015-09-16 10:46:49 INFO HealthMonitor:? - [10.11.173.129]:5903 [dev] [3.2.1] memory.used=105.1M, memory.free=22.9M, memory.total=128.0M, memory.max=128.0M, memory.used/total=82.11%, memory.used/max=82.11%, load.process=0.00%, load.system=1.00%, load.systemAverage=1.00%, thread.count=73, thread.peakCount=107, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.priorityOperation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=0, proxy.count=2, clientEndpoint.count=0, connection.active.count=2, connection.count=2

Neo4j randomly shutting down

I am running Neo4j on an EC2 instance, but for some reason it randomly shuts down from time to time. Is there a way to check the shutdown logs? And is there a way to automatically restart the server? I couldn't locate the log folder, but here's what my messages.log file looks like. This section covers the timeframe when the server went down (before 2015-04-13 05:39:59.084+0000) and when I manually restarted the server (at 2015-04-13 05:39:59.084+0000). You can see that there is no record of a server issue or shutdown. The time frame before 2015-03-05 08:18:47.084+0000 contains info from the previous server restart.
2015-03-05 08:18:44.180+0000 INFO [o.n.s.m.Neo4jBrowserModule]: Mounted Neo4j Browser at [/browser]
2015-03-05 08:18:44.253+0000 INFO [o.n.s.w.Jetty9WebServer]: Mounting static content at [/webadmin] from [webadmin-html]
2015-03-05 08:18:44.311+0000 INFO [o.n.s.w.Jetty9WebServer]: Mounting static content at [/browser] from [browser]
2015-03-05 08:18:47.084+0000 INFO [o.n.s.CommunityNeoServer]: Server started on: http://0.0.0.0:7474/
2015-03-05 08:18:47.084+0000 INFO [o.n.s.CommunityNeoServer]: Remote interface ready and available at [http://0.0.0.0:7474/]
2015-03-05 08:18:47.084+0000 INFO [o.n.k.i.DiagnosticsManager]: --- SERVER STARTED END ---
2015-04-13 05:39:59.084+0000 INFO [o.n.s.CommunityNeoServer]: Setting startup timeout to: 120000ms based on -1
2015-04-13 05:39:59.265+0000 INFO [o.n.k.InternalAbstractGraphDatabase]: No locking implementation specified, defaulting to 'community'
2015-04-13 05:39:59.383+0000 INFO [o.n.k.i.DiagnosticsManager]: --- INITIALIZED diagnostics START ---
2015-04-13 05:39:59.384+0000 INFO [o.n.k.i.DiagnosticsManager]: Neo4j Kernel properties:
2015-04-13 05:39:59.389+0000 INFO [o.n.k.i.DiagnosticsManager]: neostore.propertystore.db.mapped_memory=78M
2015-04-13 05:39:59.389+0000 INFO [o.n.k.i.DiagnosticsManager]: neostore.nodestore.db.mapped_memory=21M

Troubles while running Neo4j & Rails in HA

I have configured an HA cluster with 3 nodes.
The first is a Rails server, and the others are two Neo4j servers.
I started the Rails server, then I started another instance, which failed with the following error.
Does anyone have an idea?
Paolo
2013-11-21 23:48:57.473+0000 INFO [Cluster] Attempting to join cluster of [127.0.0.1:5001, 127.0.0.1:5002, 127.0.0.1:5003]
2013-11-21 23:49:02.568+0000 INFO [Cluster] Joined cluster:Name:neo4j.ha Nodes:{1=cluster://127.0.0.1:5001, 3=cluster://127.0.0.1:5003} Roles:{coordinator=1}
2013-11-21 23:49:02.582+0000 INFO [Cluster] Instance 3 (this server) joined the cluster
2013-11-21 23:49:06.845+0000 INFO [Cluster] Instance 1 was elected as coordinator
java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2598)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1318)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at org.neo4j.cluster.member.paxos.PaxosClusterMemberEvents$HighAvailabilitySnapshotProvider.setState(PaxosClusterMemberEvents.java:179)
at org.neo4j.cluster.protocol.snapshot.SnapshotMessage$SnapshotState.setState(SnapshotMessage.java:91)
at org.neo4j.cluster.protocol.snapshot.SnapshotState$2.handle(SnapshotState.java:105)
at org.neo4j.cluster.protocol.snapshot.SnapshotState$2.handle(SnapshotState.java:89)
at org.neo4j.cluster.statemachine.StateMachine.handle(StateMachine.java:88)
at org.neo4j.cluster.StateMachines$1.run(StateMachines.java:135)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)

Unable to load webadmin for Neo4j High Availability

I have installed 3 instances of Neo4j version 1.9.4 on a Linux machine, in 3 different directories: neo4j01, neo4j02, neo4j03.
I have updated the configuration files neo4j.properties and neo4j-server.properties as mentioned in the link (http://docs.neo4j.org/chunked/milestone/ha-setup-tutorial.html).
When I start all the Neo4j instances one after the other, they start up successfully, but after some time 2 of the 3 Neo4j processes/instances disappear. I noticed this via ps -aef | grep neo4j.
When I checked the console logs, I found the errors below:
2013-11-12 16:37:32.512+0000 INFO [Cluster] Checking store consistency with master
2013-11-12 16:37:33.174+0000 INFO [Cluster] Store is consistent
2013-11-12 16:37:33.176+0000 INFO [Cluster] Catching up with master
2013-11-12 16:37:33.276+0000 INFO [Cluster] Now consistent with master
2013-11-12 16:37:34.442+0000 INFO [Cluster] ServerId 2, successfully moved to slave for master ha://localhost.localdomain:6363?serverId=1
2013-11-12 16:37:34.689+0000 INFO [Cluster] Instance 1 is available as backup at backup://localhost.localdomain:6366
2013-11-12 16:37:34.798+0000 INFO [Cluster] Instance 2 (this server) is available as slave at ha://localhost.localdomain:6364?serverId=2
2013-11-12 16:37:35.036+0000 INFO [Cluster] Database available for write transactions
2013-11-12 16:37:35.360+0000 INFO [API] Successfully started database
2013-11-12 16:37:36.079+0000 INFO [API] Starting HTTP on port :7474 with 10 threads available
2013-11-12 16:37:40.596+0000 INFO [Cluster] Instance 3 has failed
2013-11-12 16:37:43.654+0000 INFO [API] Enabling HTTPS on port :7473
2013-11-12 16:38:01.081+0000 INFO [API] Mounted REST API at: /db/manage/
2013-11-12 16:38:01.158+0000 INFO [API] Mounted discovery module at [/]
2013-11-12 16:38:02.375+0000 INFO [API] Loaded server plugin "CypherPlugin"
2013-11-12 16:38:02.449+0000 INFO [API] Loaded server plugin "GremlinPlugin"
2013-11-12 16:38:02.462+0000 INFO [API] Mounted REST API at [/db/data/]
2013-11-12 16:38:02.534+0000 INFO [API] Mounted management API at [/db/manage/]
2013-11-12 16:38:03.568+0000 INFO [API] Mounted webadmin at [/webadmin]
2013-11-12 16:38:06.189+0000 INFO [API] Mounting static content at [/webadmin] from [webadmin-html]
2013-11-12 16:38:30.844+0000 DEBUG [API] Failed to start Neo Server on port [7474], reason [org.mortbay.util.MultiException[java.net.BindException: Address already in use, java.net.BindException: Address already in use]]
2013-11-12 16:38:30.880+0000 DEBUG [API] org.neo4j.server.ServerStartupException: Starting Neo4j Server failed: org.mortbay.util.MultiException[java.net.BindException: Address already in use, java.net.BindException: Address already in use]
at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:211) ~[neo4j-server-1.9.4.jar:1.9.4]
at org.neo4j.server.Bootstrapper.start(Bootstrapper.java:86) [neo4j-server-1.9.4.jar:1.9.4]
at org.neo4j.server.Bootstrapper.main(Bootstrapper.java:49) [neo4j-server-1.9.4.jar:1.9.4]
Caused by: java.lang.RuntimeException: org.mortbay.util.MultiException[java.net.BindException: Address already in use, java.net.BindException: Address already in use]
at org.neo4j.server.web.Jetty6WebServer.startJetty(Jetty6WebServer.java:334) ~[neo4j-server-1.9.4.jar:1.9.4]
at org.neo4j.server.web.Jetty6WebServer.start(Jetty6WebServer.java:154) ~[neo4j-server-1.9.4.jar:1.9.4]
at org.neo4j.server.AbstractNeoServer.startWebServer(AbstractNeoServer.java:344) ~[neo4j-server-1.9.4.jar:1.9.4]
at org.neo4j.server.AbstractNeoServer.start(AbstractNeoServer.java:187) ~[neo4j-server-1.9.4.jar:1.9.4]
... 2 common frames omitted
Caused by: org.mortbay.util.MultiException: Multiple exceptions
at org.mortbay.jetty.Server.doStart(Server.java:188) ~[jetty-6.1.25.jar:6.1.25]
at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50) ~[jetty-util-6.1.25.jar:6.1.25]
at org.neo4j.server.web.Jetty6WebServer.startJetty(Jetty6WebServer.java:330) ~[neo4j-server-1.9.4.jar:1.9.4]
... 5 common frames omitted
2013-11-12 16:38:30.894+0000 DEBUG [API] Failed to start Neo Server on port [7474]
Now only the neo4j01 process is running; the neo4j02 and neo4j03 processes have disappeared. But even though the neo4j01 process is up and running, I am unable to access the webadmin page at http://htname:7474/webadmin/#/info/org.neo4j/High%20Availability/.
Please, can someone shed some light on this?
You might want to take a look at https://github.com/neo-technology/neo4j-enterprise-local-qa. This contains a rakefile that automates a local setup of 3 instances. Clone the repo locally, and use
rake setup_cluster start_cluster
to bring a locally running cluster online. Shutdown can be done via
rake stop_cluster
Find the configs in machine[ABC]/conf/.
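Separately, the BindException in the console log (Address already in use on port 7474) suggests the local instances are competing for the same web server port. As a rough sketch, assuming the standard 1.9.4 property names, these are the values that would have to differ per directory (the ports for neo4j02 and neo4j03 are made up for illustration):
# neo4j01/conf/neo4j-server.properties
org.neo4j.server.webserver.port=7474
org.neo4j.server.webserver.https.port=7473
# neo4j02 could use e.g. 7475/7485 and neo4j03 7476/7486; each instance also
# needs its own ha.server_id and its own ha.server / ha.cluster_server ports
# in its neo4j.properties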
