Spring Data Neo4j load balancing - neo4j

I'm working on an application using Spring Data Neo4j that works with an embedded Neo4j server. I would like my application to work with a cluster of 3 Neo4j nodes, one of these nodes being the embedded server.
I am trying to accomplish some sort of load balancing within the cluster: 1. round-robin requests across the servers, or 2. write requests on the embedded master server and read requests on the other two servers.
Does Spring Data Neo4j have any kind of load balancing mechanism out of the box? What configuration is necessary to achieve this? Do I need additional tools like HAProxy or mod_proxy? Is there any example of how they can be integrated with the Neo4j cluster and Spring Data Neo4j?

A load balancer component is part of neither Neo4j nor Spring Data Neo4j. A sample setup using Neo4j in server mode is documented at http://docs.neo4j.org/chunked/stable/ha-haproxy.html.
Since your application uses SDN in embedded HA mode, you need to expose the status of your local instance (master or slave) yourself, to achieve what /db/manage/server/ha/master does in server mode. You might use HighlyAvailableGraphDatabase.isMaster() in your implementation.
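A minimal sketch of such a status endpoint, using only the JDK's built-in HTTP server. The class name HaStatusEndpoint, the /ha/master path and the port are hypothetical choices, and the 200/404 convention is an assumption modelled on the server-mode endpoint. The master check is passed in as a BooleanSupplier so the sketch does not depend on a particular Neo4j version.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.function.BooleanSupplier;

public class HaStatusEndpoint {

    /**
     * Starts a small HTTP server that answers requests to /ha/master with
     * 200 "true" when the supplied check says this instance is master,
     * and 404 "false" otherwise. Path and status codes are illustrative
     * assumptions, not something Neo4j or SDN provides in embedded mode.
     */
    public static HttpServer start(int port, BooleanSupplier isMaster) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/ha/master", exchange -> {
            // Evaluate the role on every request, since it can change after an election.
            boolean master = isMaster.getAsBoolean();
            byte[] body = String.valueOf(master).getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(master ? 200 : 404, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }
}
```

In the embedded HA application this could be wired up roughly as HaStatusEndpoint.start(8085, () -> haDb.isMaster()), where haDb is your HighlyAvailableGraphDatabase instance, and the load balancer's health check would then point at that port.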

Related

queries regarding neo4j HA setup

Hi, I am new to HA concepts and Neo4j HA. I have gone through the Neo4j docs but I still have a couple of questions.
When using a PHP script to connect to the Neo4j database via REST, what IP should I use for the cluster? Is there a common IP for the cluster?
I ask this because if the master fails, a new Neo4j instance becomes the master. How should my script connect to the new master? Should I use third-party software to point to the new master, or can that happen automatically with Neo4j through a common cluster IP? Pardon me if my concepts are weak; I just need some guidance.
How can I direct all reads and writes to the master only and use the slaves only for replication? Or is this the default setting? I see multiple-read and multiple-write scenarios, so I am getting confused.
Is there any doc/material that explains further how to set up an arbiter instance, or should I just configure a 3-node Neo4j HA cluster as explained in http://neo4j.com/docs/stable/ha-setup-tutorial.html and run the command below for one of the instances?
neo4j_home$ ./bin/neo4j-arbiter start
Any help is appreciated. Thanks!
Welcome to the community of Neo4j Users ;)
First, I recommend you look at neo4j-php-client, because it supports Neo4j HA clusters and could solve your problems without you having to build your own solution.
Best practice is to put some kind of load balancer in front of the Neo4j HA cluster. Here is a great article about it: http://blog.armbruster-it.de/2015/08/neo4j-and-haproxy-some-best-practices-and-tricks/
You can do that at the load balancer level based on HTTP methods (GET goes to the slaves; POST, PUT and DELETE go to the master). But there is a problem with the Cypher endpoint, because it uses only the POST method. You can use an additional HTTP header to distinguish between read and write requests, but the logic that sets it must be in your application.
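The routing rule described above can be sketched as a small decision function. This is only an illustration of the logic a balancer would apply, not HAProxy configuration; the class name ReadWriteRouter and the header name X-Query-Mode are hypothetical, and the application and the balancer simply have to agree on whatever header is used.

```java
import java.util.Map;

public class ReadWriteRouter {

    public enum Backend { MASTER, SLAVES }

    // Hypothetical header name that the application sets on read-only Cypher requests.
    public static final String QUERY_MODE_HEADER = "X-Query-Mode";

    /**
     * Chooses a backend for one request.
     * GET requests go to the slaves. POST requests go to the slaves only when
     * the application has marked them as reads via the header. Everything else
     * (unmarked POST, PUT, DELETE) goes to the master.
     * Assumes header names in the map are already normalized to this exact spelling.
     */
    public static Backend choose(String method, Map<String, String> headers) {
        if ("GET".equalsIgnoreCase(method)) {
            return Backend.SLAVES;
        }
        if ("POST".equalsIgnoreCase(method)
                && "read".equalsIgnoreCase(headers.get(QUERY_MODE_HEADER))) {
            return Backend.SLAVES;
        }
        return Backend.MASTER;
    }
}
```

Defaulting unmarked POST requests to the master is the safe choice here: a write that lands on the master always works, while a misrouted read only costs some master capacity.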
To start with, the official documentation is good enough.
Resources
Neo4j HA cluster configuration (example)
Neo4j cluster and firewalls
As my friend MicTech mentioned, we generally use HAProxy as a load balancer in front of Neo4j.
The PHP client mentioned above has a configuration mechanism that covers both cases:
When using HAProxy, you define your queries as read or write, and the client automatically adds a header to the HTTP request. The header is configurable too.
When not using HAProxy, you can define all your Neo4j instances in the client setup and activate the High-Availability extension (works only with cache enabled). When the master is down, the client will automatically try to detect the newly elected master and rewrite the connection configuration in the cache for further requests.
I tried to make the README as good as possible; please read it and open issues on the repository if anything is missing.
https://github.com/graphaware/neo4j-php-client
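For readers who are not on PHP, the idea behind detecting the newly elected master can be sketched as follows. This is not the client's actual implementation, just an illustration under assumptions: the class name MasterLocator is hypothetical, each instance is assumed to run in server mode and to answer /db/manage/server/ha/master with status 200 only when it is the master, and the timeouts are arbitrary.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;
import java.util.Optional;

public class MasterLocator {

    /**
     * Probes each configured instance in order and returns the base URL of the
     * first one that reports itself as master (HTTP 200 on the HA status
     * endpoint). Unreachable instances are skipped. Returns an empty Optional
     * if no instance currently claims to be master, e.g. during an election.
     */
    public static Optional<String> findMaster(List<String> baseUrls) {
        for (String baseUrl : baseUrls) {
            try {
                HttpURLConnection conn = (HttpURLConnection)
                        new URL(baseUrl + "/db/manage/server/ha/master").openConnection();
                conn.setConnectTimeout(1000); // arbitrary illustrative timeouts
                conn.setReadTimeout(1000);
                int status = conn.getResponseCode();
                conn.disconnect();
                if (status == 200) {
                    return Optional.of(baseUrl);
                }
            } catch (IOException e) {
                // Instance is down or unreachable; try the next one.
            }
        }
        return Optional.empty();
    }
}
```

A caller would cache the returned URL and only probe again when a write against the cached master fails.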

Social Networking Site with Grails and Neo4j

I am developing a social networking website with Grails and a MySQL DB, but I am planning to move to Neo4j. Grails supports the complete set of GORM features with MySQL, but I am not sure the same features are available with Neo4j. There is a Grails plugin for Neo4j, but it does not support some of the features required for big websites, so I am planning to use the native Neo4j API. How do I connect to a Neo4j DB from Grails? There are two scenarios in my case.
Case 1:
The Neo4j server is up and running. How do I connect and perform database transactions?
Case 2:
The Neo4j server is not running. How do I connect and perform database transactions? I was able to connect using the GraphDatabaseService class. But why would one need to connect to a DB that is not running? What is the GraphDatabaseService class specifically used for?
I want to use the native Neo4j API to get access to the maximum set of features. Is there a better approach to building the application?
The Grails Neo4j GORM plugin in version 2.0.0.M01 can run in embedded mode only, so the JVM running the Grails application spawns the database inside the same JVM. The next milestone (M02) will add support for accessing Neo4j via REST.
If you don't want to use the GORM methods to talk to Neo4j transparently, you can always use direct access by issuing Cypher statements or by accessing GraphDatabaseService directly (only in embedded mode, of course).

Run multiple web applications with the same embedded neo4j db

I need to run multiple applications on the same Neo4j DB, but when I try to do that I run into a locking problem.
Neo4j locks the store while an application is using it, so a second application can't start against it.
The exception looks like this:
Unable to lock store [/opt/neo4j-lojika-db/neostore.relationshiptypestore.db.names], this is usually a result of some other Neo4j kernel running using the same store
Is there a way to run multiple web applications with the same embedded Neo4j DB?
Thank you!
You can't do it this way. You have two options:
Use Neo4j HA, or
Run Neo4j in server mode rather than embedded mode. If your application is simple, you can use the REST API that Neo4j provides out of the box. If your service layer is more involved, put a service layer on top of a single embedded Neo4j instance and let each application talk to Neo4j through that service layer.

How to connect to neo4j server in spring via rmi?

How do I connect to a Neo4j server in Spring via RMI? I found http://mvnrepository.com/artifact/org.neo4j/neo4j-remote-graphdb. Is the neo4j-remote-graphdb component supported?
You can connect with that library, which is also in http://github.com/neo4j/java-rest-binding
But for performance-sensitive work you shouldn't send individual operations over the wire; use Cypher with parameters instead, via RestGraphDatabase.query(query, params).

clustering jsf 2.0 web application

I am going to create an application using jsf 2.x, glassfish 3.1 open source, JPA + postgresql . I want to develop it in such a way, that my app can be clustered on several physical servers and load balanced.
What are the recommended free and open source technologies for clustering and load balancing a jsf 2.0 web application?
What are the best approaches and what should I keep in mind before planning and designing my application?
Any other useful information related to this question is also appreciated.
Thanks in advance.
Glassfish application server has built-in cluster support. You have to run your application on multiple Glassfish instances and configure the servers to replicate data to each other (bind the servers in a cluster).
To enable replication for your application, put the following tag in web.xml:
<distributable />
When the cluster is set up properly, HTTP sessions will be replicated among the cluster nodes. What's left is to configure a load balancer such as Apache httpd that accepts requests and routes them to a specific server in the cluster.
In general, avoid storing data in the session as much as possible. Make any bean whose scope outlives a request (for example, a session-scoped bean) serializable.
Search Google for more information.
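As a rough illustration of the serializable-bean advice, here is a plain Java sketch. The class name UserPreferences and its fields are made up for the example; in a real JSF 2.x application the class would also carry the managed-bean and session-scope annotations, which are left out so the sketch compiles with the JDK alone. The roundTrip helper only mimics what session replication has to be able to do with the bean.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical session-scoped bean; JSF annotations omitted to keep this JDK-only.
public class UserPreferences implements Serializable {
    private static final long serialVersionUID = 1L;

    // Small, serializable state is safe to replicate between cluster nodes.
    private String locale = "en";

    // State that cannot or need not be replicated is marked transient;
    // it is null after deserialization and must be rebuilt on demand.
    private transient Object cachedResource;

    public String getLocale() { return locale; }
    public void setLocale(String locale) { this.locale = locale; }
    public Object getCachedResource() { return cachedResource; }
    public void setCachedResource(Object cachedResource) { this.cachedResource = cachedResource; }

    /** Serializes the bean to bytes and reads it back, as a replication smoke test. */
    public static UserPreferences roundTrip(UserPreferences bean)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(bean);
        }
        try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (UserPreferences) in.readObject();
        }
    }
}
```

A round-trip check like this in a unit test catches a non-serializable field long before the cluster does.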

Resources