Create Node and Relationship in one query : Spring Data Neo4j - neo4j

I am trying to create new nodes and relationships in Neo4j using Spring Data Neo4j. My use case is to add a friend relationship between two user nodes, which boils down to:
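User user1 = userRepo.findByPropertyValue("userId1", userId1);
User user2 = userRepo.findByPropertyValue("userId2", userId2);
// Assumes createUserObject returns the newly created User;
// otherwise user1/user2 would still be null below.
if (user1 == null) {
    user1 = createUserObject(userId1);
}
if (user2 == null) {
    user2 = createUserObject(userId2);
}
user1.isFriend(user2);
userRepo.save(user1);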
This involves two calls to the DB (findByPropertyValue). Is this correct, or is there another way to do it? Could the whole thing be batched into one request?
Thanks..

You can do both with a single Cypher query:
START user1=node:User(userId={userId1}),
user2=node:User(userId={userId2})
CREATE UNIQUE (user1)-[:FRIEND]-(user2);
The user IDs are passed in as parameters in a map.
You can also use a repository method annotated with @Query for that.
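As a minimal sketch of the parameter map mentioned above, assuming plain Java and the legacy `{param}` placeholder syntax used in the query: the class name `FriendQuery` and its members are hypothetical, and how the query is finally executed (template, execution engine, or repository) is left out. Note that the query only links users that already exist in the index.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: holds the Cypher text and builds its parameter map.
final class FriendQuery {
    // Looks up both users in the legacy "User" index, then creates the
    // FRIEND relationship only if it does not exist yet.
    static final String CYPHER =
        "START user1=node:User(userId={userId1}), "
      + "user2=node:User(userId={userId2}) "
      + "CREATE UNIQUE (user1)-[:FRIEND]-(user2)";

    // The map keys must match the {placeholders} in CYPHER.
    static Map<String, Object> params(String userId1, String userId2) {
        Map<String, Object> params = new HashMap<>();
        params.put("userId1", userId1);
        params.put("userId2", userId2);
        return params;
    }
}
```

Both lookups and the relationship creation then travel to the database as one statement plus one map, instead of two separate finder calls.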

Related

Neo4j BatchInserter - create relationship using Node properties

I'm using BatchInserter to initialise my Neo4j database - the data is coming from XML files on my local filesystem.
Suppose one set of files contains node information / properties, and another set has relationship information. I wanted to do two passes: create all the nodes, then set about creating the relationships.
However, the createRelationship method accepts a long id for the nodes, which I don't have in my relationship XML - all of my nodes have a GUID as a property called ID which I use to reference them.
Does BatchInsert mean it hasn't been indexed yet, so I won't be able to create relationships on nodes based on some other property?
I usually keep the attribute-to-node-id mapping in an in-memory cache, using an efficient collection implementation such as Trove.
Then for the relationships you can look up the node-id by attribute.
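A rough sketch of such a cache, assuming the GUIDs are strings and using a plain `HashMap` rather than Trove to keep it dependency-free. The class `NodeIdCache` and its method names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache mapping each node's GUID property to its node id.
final class NodeIdCache {
    private final Map<String, Long> idByGuid = new HashMap<>();

    // Pass 1: call right after creating a node, with the id the inserter returned.
    void remember(String guid, long nodeId) {
        idByGuid.put(guid, nodeId);
    }

    // Pass 2: resolve a GUID found in the relationship XML to a node id.
    long nodeIdFor(String guid) {
        Long id = idByGuid.get(guid);
        if (id == null) {
            throw new IllegalArgumentException("Unknown node GUID: " + guid);
        }
        return id;
    }
}
```

In the first pass this would be fed with the value returned by `inserter.createNode(...)`; in the second pass the two resolved ids go into `inserter.createRelationship(...)`. For very large imports a primitive-valued map would use less memory than boxed `Long` values.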
I found I was able to add nodes to the index as I go.
Creating index:
BatchInserter inserter = BatchInserters.inserter( "data/folder" );
BatchInserterIndexProvider indexProvider = new LuceneBatchInserterIndexProvider( inserter );
BatchInserterIndex index = indexProvider.nodeIndex("myindex", MapUtil.stringMap( "type", "exact" ) );
Then each time I insert a node, add it to the index as well:
Label label = DynamicLabel.label("person");
Map<String, Object> properties = new HashMap<>();
properties.put("ID", <some-value-here>);
long newNode = inserter.createNode(properties, label);
index.add(newNode, properties);
index.flush();
Which I can query as I like:
IndexHits<Long> hits = index.get("ID", <some-value-here>);
if(hits.size() > 0) {
long existing = hits.getSingle();
}
I have no idea whether this is any good. I guess calling flush on the index often is a bad idea, but it seems to work for me.

Using Merge in BatchInserter?

I am using the BatchInserter to create some nodes and relationships. However, my nodes are unique, and I want to create multiple relationships between them.
I can easily do that in Cypher, and likewise with the Java Core API:
ResourceIterator<Node> existedNodes = graphDBService.findNodesByLabelAndProperty( DynamicLabel.label( "BaseProduct" ), "code", source.getBaseProduct().getCode() ).iterator();
if ( !existedNodes.hasNext() )
{
//TO DO
}
else {
// create relationship with the retrieved node
}
In Cypher I can simply use MERGE.
Is there any way to do the same with the BatchInserter?
No, that is not possible in the batch inserter, as those APIs are not available there.
That's why I usually keep in-memory maps with the information I need to look up.
See this blog post for a groovy script:
http://jexp.de/blog/2014/10/flexible-neo4j-batch-import-with-groovy/
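One way to get MERGE-like behaviour from an in-memory map is a get-or-create helper keyed by the unique property. The sketch below is only an illustration: `MergeByKey` is a hypothetical name, and the `createNode` callback stands in for whatever call actually creates the node and returns its id.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical get-or-create helper keyed by a unique property (e.g. "code").
final class MergeByKey {
    private final Map<String, Long> idByCode = new HashMap<>();
    private final ToLongFunction<String> createNode;

    // createNode stands in for the real node creation call and returns the new id.
    MergeByKey(ToLongFunction<String> createNode) {
        this.createNode = createNode;
    }

    // Returns the known node id for this code, creating the node only on first sight.
    long merge(String code) {
        Long existing = idByCode.get(code);
        if (existing != null) {
            return existing;
        }
        long created = createNode.applyAsLong(code);
        idByCode.put(code, created);
        return created;
    }
}
```

Every relationship can then be created against `merge(code)` without risking a duplicate node for the same code.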

neo4jClient foreach create

How can I create multiple nodes in a single transaction using Neo4jClient? The current code works fine but is a bit slow:
foreach (UserInfo _ui in users)
{
client.Cypher.Create("(n:User{param})")
.WithParam("param", _ui).ExecuteWithoutResults();
}
In Cypher, you can create multiple nodes in a single transaction using a parameter that is a collection of maps. Since your users variable already seems to be such a collection, try replacing your loop with:
client.Cypher.Create("(n:User{param})")
.WithParam("param", users).ExecuteWithoutResults();
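To make the shape of "a collection of maps" concrete, here is a small sketch in Java (not Neo4jClient code). The `UserInfo` fields `id` and `name` are assumptions, since the question does not show the type; each map becomes the property set of one created node.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class UserParams {
    // Hypothetical stand-in for the question's UserInfo type.
    static final class UserInfo {
        final long id;
        final String name;

        UserInfo(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Converts the users into a list of property maps, one map per node.
    static List<Map<String, Object>> toParam(List<UserInfo> users) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (UserInfo u : users) {
            Map<String, Object> row = new LinkedHashMap<>();
            row.put("id", u.id);
            row.put("name", u.name);
            rows.add(row);
        }
        return rows;
    }
}
```

The whole list is sent as a single parameter, so the loop runs on the client only to build data, not to issue one request per user.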

create if not exists... with multiple properties (and unique ID)

First, sorry for my pretty bad English, I'm French :p
I'm currently switching from MySQL to Neo4j and I have a question about my scripts.
I have artists and music albums, linked where applicable as (artist)-[:OWNS]->(album).
I am now developing the API for updating this information, and I have run into a problem:
How can I get an existing node, or create it if it does not exist?
Elsewhere in my code I do it like this:
MATCH (u:User) WHERE u.id='83cac821-1607-49a3-e124-07431ef375ce' MERGE (c:Country {name:'France'}) CREATE UNIQUE (u)-[:FROM]->(c) RETURN u,c;
So, if the country "France" already exists, Neo4j will not create a second one. Perfect, because my countries have no IDs.
But artists and albums need a unique identifier, and I can't work out how to write the query:
MATCH (ar:Artist) WHERE ar.id='83cac821-1607-49a3-e124-07431ef375ce' MERGE (al:Album {name:'Title01', id:'31efc821-1607-49a3-e124-074383ca75ce'}) CREATE UNIQUE (ar)-[:OWNS]->(al) RETURN ar,al;
Written this way, I need to know the album's ID, and in my API I don't. What I need is for Neo4j to get the album "Title01" if it exists, and to create it (with a fresh new ID) if not. In my example, if I don't give the ID, the query finds the album if it exists, but otherwise creates a new one without an ID. And if I do send an ID, Neo4j will never find the existing album (because the title already exists, but not with that particular ID).
(Before, in MySQL, I used multiple queries: 1. check whether the album exists; if so, return its ID, otherwise create it with a new ID and return that. 2. the same for the artist. 3. create the link between them.)
Thanks for your help!
The MERGE command can be extended with ON MATCH and ON CREATE; see http://docs.neo4j.org/chunked/stable/query-merge.html#_use_on_create_and_on_match. I guess you need something like:
MATCH (ar:Artist) WHERE ar.id='83cac821-1607-49a3-e124-07431ef375ce'
MERGE (al:Album {name:'Title01'})
ON CREATE SET al.id = '31efc821-1607-49a3-e124-074383ca75ce'
CREATE UNIQUE (ar)-[:OWNS]->(al) RETURN ar,al
Here's a page that shows how to create a node if it doesn't exist: Link
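Since the API does not know the album ID in advance, one option is to generate a candidate ID on the client for every call and let ON CREATE decide whether it is used. The following is a sketch under that assumption, in Java with the legacy `{param}` syntax. `AlbumMerge` and the parameter names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical helper for the "get the album, or create it with a fresh id" case.
final class AlbumMerge {
    static final String CYPHER =
        "MATCH (ar:Artist) WHERE ar.id = {artistId} "
      + "MERGE (al:Album {name: {albumName}}) "
      + "ON CREATE SET al.id = {newAlbumId} "
      + "CREATE UNIQUE (ar)-[:OWNS]->(al) "
      + "RETURN ar, al";

    static Map<String, Object> params(String artistId, String albumName) {
        Map<String, Object> params = new HashMap<>();
        params.put("artistId", artistId);
        params.put("albumName", albumName);
        // Candidate id: stored only if MERGE actually creates the album.
        params.put("newAlbumId", UUID.randomUUID().toString());
        return params;
    }
}
```

If the album already exists, the candidate ID is simply discarded, and the returned `al` carries whichever ID is actually stored.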

Neo4jClient: doubts about CRUD API

My persistence layer essentially uses Neo4jClient to access a Neo4j 1.9.4 database. More specifically, to create nodes I use IGraphClient#Create() in Neo4jClient's CRUD API and to query the graph I use Neo4jClient's Cypher support.
All was well until a friend of mine pointed out that for every query, I essentially did two HTTP requests:
one request to get a node reference from a legacy index by the node's unique ID (not its node ID! but a unique ID generated by SnowMaker)
one Cypher query that started from this node reference that does the actual work.
For read operations, I did the obvious thing and moved the index lookup into my Start() call, i.e.:
GraphClient.Cypher
.Start(new { user = Node.ByIndexLookup("User", "Id", userId) })
// ... the rest of the query ...
For create operations, on the other hand, I don't think this is actually possible. What I mean is: the Create() method takes a POCO, a couple of relationship instances and a couple of index entries in order to create a node, its relationships and its index entries in one transaction/HTTP request. The problem is the node references that you pass to the relationship instances: where do they come from? From previous HTTP requests, right?
My questions:
Can I use the CRUD API to look up node A by its ID, create node B from a POCO, create a relationship between A and B and add B's ID to a legacy index in one request?
If not, what is the alternative? Is the CRUD API considered legacy code and should we move towards a Cypher-based Neo4j 2.0 approach?
Does this Cypher-based approach mean that we lose POCO-to-node translation for create operations? That was very convenient.
Also, can Neo4jClient's documentation be updated because it is, frankly, quite poor. I do realize that Readify also offers commercial support so that might explain things.
Thanks!
I'm the author of Neo4jClient. (The guy who gives his software away for free.)
Q1a:
"Can I use the CRUD API to look up node A by its ID, create node B from a POCO, create a relationship between A and B"
Cypher is the way of not just the future, but also the 'now'.
Start with the Cypher (lots of resources for that):
START user=node:User(Id = "1234")
CREATE user-[:INVITED]->(user2 { Id: 4567, Name: "Jim" })
RETURN user2
Then convert it to C#:
graphClient.Cypher
.Start(new { user = Node.ByIndexLookup("User", "Id", userId) })
.Create("user-[:INVITED]->(user2 {newUser})")
.WithParam("newUser", new User { Id = 4567, Name = "Jim" })
.Return(user2 => user2.Node<User>())
.Results;
There are lots more similar examples here: https://github.com/Readify/Neo4jClient/wiki/cypher-examples
Q1b:
" and add B's ID to a legacy index in one request?"
No, adding entries to legacy indexes is not supported in Cypher. If you really want to keep using them, then you should stick with the CRUD API. That's ok: if you want to use legacy indexes, use the legacy API.
Q2.
"If not, what is the alternative? Is the CRUD API considered legacy code and should we move towards a Cypher-based Neo4j 2.0 approach?"
That's exactly what you want to do. Cypher, with labels and automated indexes:
// One time op to create the index
// Yes, this syntax is a bit clunky in C# for now
graphClient.Cypher
.Create("INDEX ON :User(Id)")
.ExecuteWithoutResults();
// Find an existing user, create a new one, relate them,
// and index them, all in a single HTTP call
graphClient.Cypher
.Match("(user:User)")
.Where((User user) => user.Id == userId)
.Create("user-[:INVITED]->(user2:User {newUser})")
.WithParam("newUser", new User { Id = 4567, Name = "Jim" })
.ExecuteWithoutResults();
More examples here: https://github.com/Readify/Neo4jClient/wiki/cypher-examples
Q3.
"Does this Cypher-based approach mean that we lose POCO-to-node translation for create operations? That was very convenient."
Correct. But that's what we collectively all want to do, where Neo4j is going, and where Neo4jClient is going too.
Think about SQL for a second (something that I assume you are familiar with). Do you run a query to find the internal identifier of a row, including its file offset on disk, then use this internal identifier in a second query to manipulate it? No. You run a single query that does all that in one hit.
Now, a common reason people like passing around Node<T> or NodeReference instances is to reduce repetition in queries. This is a legitimate concern; however, because the fluent queries in .NET are immutable, we can just construct a base query:
public ICypherFluentQuery FindUserById(long userId)
{
return graphClient.Cypher
.Match("(user:User)")
.Where((User user) => user.Id == userId);
// Nothing has been executed here: we've just built a query object
}
Then use it like so:
public void DeleteUser(long userId)
{
FindUserById(userId)
.Delete("user")
.ExecuteWithoutResults();
}
Or, add even more Cypher logic to delete all the relationships too:
public void DeleteUser(long userId)
{
FindUserById(userId)
.OptionalMatch("user-[rel]-()")
.Delete("rel, user")
.ExecuteWithoutResults();
}
This way, you can effectively reuse references, but without ever having to pull them back across the wire in the first place.
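The reason a base query can be shared safely is that every fluent call returns a new query object instead of changing the one it was called on. The toy builder below illustrates that idea in Java. It is not Neo4jClient's implementation, and `FluentQuery` with its methods is invented for this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy immutable query builder: each call returns a new instance.
final class FluentQuery {
    private final List<String> clauses;

    private FluentQuery(List<String> clauses) {
        this.clauses = Collections.unmodifiableList(clauses);
    }

    static FluentQuery start() {
        return new FluentQuery(new ArrayList<>());
    }

    // Copies the clause list, so the receiver is never modified.
    private FluentQuery with(String clause) {
        List<String> next = new ArrayList<>(clauses);
        next.add(clause);
        return new FluentQuery(next);
    }

    FluentQuery match(String pattern) { return with("MATCH " + pattern); }
    FluentQuery where(String predicate) { return with("WHERE " + predicate); }
    FluentQuery delete(String identifiers) { return with("DELETE " + identifiers); }

    // Nothing is executed anywhere in this class; this only renders the text.
    String text() { return String.join(" ", clauses); }
}
```

A base query built with `start().match(...).where(...)` can be extended with `delete(...)` in one method and with something else in another, and the base itself stays unchanged.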

Resources