Create multiple nodes and relationships in several Cypher statements - neo4j

I want to create multiple neo4j nodes and relationships in one Cypher transaction. I'm using py2neo, which allows issuing multiple Cypher statements in one transaction.
I thought I'd add a statement for each node and relationship I create:
tx.append('CREATE (n:Label { prop: val })')
tx.append('CREATE (m:Label { prop: val2 })')
Now I want to create a relationship between the two created nodes:
tx.append('CREATE (n)-[:REL]->(m)')
This doesn't work as expected. No relationship is created between the first two nodes, since there's no n or m in the context of the last statement. Instead, a new relationship is created between two new nodes, so four nodes are created in total.
Is there a way around this? Or should I combine all the calls to CREATE (around 100,000 per logical transaction) in one statement?
It just hurts my brain thinking about such a statement, because I'll need to store everything on one big StringIO, and I lose the ability to use Cypher query parameters - I'll need to serialize dictionaries to text myself.
UPDATE:
The actual graph layout is more complicated than that. I have multiple relationship types, and each node is connected to at least two other nodes, while some nodes are connected to hundreds of nodes.

You don't need multiple queries. You can use a single CREATE to create each relationship and its related nodes:
tx.append('CREATE (:Label { prop: val })-[:REL]->(:Label { prop: val2 })')

Do something like this:
rels = [(1,2), (3,4), (5,6)]
query = """
CREATE (n:Label {prop: {val1} }),
(m:Label {prop: {val2} }),
(n)-[:REL]->(m)
"""
tx = graph.cypher.begin()
for val1, val2 in rels:
    tx.append(query, val1=val1, val2=val2)
tx.commit()
If your data is large, consider committing it in batches of 5000 or so.
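The batching advice above can be sketched as a small helper. This is only an illustration: the name chunked and the batch size of 2 are assumptions made for the example, and the py2neo calls appear only in comments, since they need a running database.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


rels = [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]
batches = list(chunked(rels, 2))  # a real import would use ~5000 per batch

# With py2neo, each batch would then get its own transaction, roughly:
#   tx = graph.cypher.begin()
#   for val1, val2 in batch:
#       tx.append(query, val1=val1, val2=val2)
#   tx.commit()
```

Committing per batch keeps each transaction small, at the cost of losing all-or-nothing behaviour across the whole import.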

Related

Why can we create a null (empty) node or a multi-label node in neo4j?

In Neo4j we can create null (empty) nodes and multi-label nodes.
CREATE () // Create an empty node
CREATE (:l1 :l2) // Create a multi-label node
Why does it allow creating a null (empty) node? What is the benefit and use of a null node? And why do we need multi-label nodes?
A node is the most basic entity in Neo4j; it stores data. Labels provide a way to group nodes into sets, and they enable fast lookups.
The statement CREATE () creates a node with no labels and no properties. To query such a node later, for example with
MATCH (n) WHERE size(labels(n)) = 0 RETURN n
Neo4j has to perform an all-nodes scan. This is inefficient, and this is where labels help.
The statement CREATE (:l1 :l2) creates a node with two labels, l1 and l2. This node can easily be found with either of these queries:
MATCH (n:l1) return n
OR
MATCH (n:l2) return n
In these queries, Neo4j only looks at the nodes grouped under these labels, which makes lookups faster because the set of nodes to search is smaller.
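The difference between a label lookup and a full scan can be illustrated with a toy model. This is not how Neo4j stores nodes internally; the dictionaries and function names below are assumptions made purely for illustration.

```python
from collections import defaultdict

# Toy model only: node id -> set of labels.
nodes = {
    0: set(),          # like CREATE ()
    1: {"l1", "l2"},   # like CREATE (:l1 :l2)
    2: {"l1"},
}

# Build a label -> node ids index once, up front.
label_index = defaultdict(set)
for node_id, labels in nodes.items():
    for label in labels:
        label_index[label].add(node_id)


def find_by_label(label):
    """Look at the nodes grouped under one label only."""
    return sorted(label_index.get(label, set()))


def find_unlabeled():
    """No label to look up, so every node has to be inspected."""
    return sorted(node_id for node_id, labels in nodes.items() if not labels)
```

In this model find_by_label touches only the nodes in one group, while find_unlabeled has to walk the whole collection.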

MERGE Nodes by a Property field

I have different nodes that share the same value of one property. I need to merge these nodes into one and, at the same time, copy all their other properties onto the merged node.
example:
(n1,g,p1) (n2,g,p2) (n3,g,p3) =>(n,g,p1,p2,p3)
Note that I can't use the APOC solutions, since user-defined functions don't work in CAPS, which is what I'm working with.
Update:
geohash is the field that has repeated values, so I want to merge the nodes on this field.
The CAPS team gave me this Cypher query to get distinct geohash nodes from the initial graph:
CATALOG CREATE GRAPH temp {
FROM GRAPH session.inputGraph
MATCH (n)
WITH DISTINCT n.geohash AS geohash
CONSTRUCT
CREATE (:HashNode {geohash: geohash})
RETURN GRAPH
}
However, what is still missing is collecting the remaining properties onto the merged nodes.
Relationships aren't a problem, because we can copy them later from the initial graph:
FROM GRAPH inputGraph
MATCH (from)-[via]->(to)
FROM GRAPH temp
MATCH (n), (m)
WHERE from.geohash = n.geohash AND to.geohash = m.geohash
CONSTRUCT
CREATE (n)-[COPY OF via]->(m)
RETURN GRAPH
It's not 100% possible in pure Cypher, which is why there is an APOC procedure for it.
To merge two nodes, you have to:
create the merged node with all the properties
recreate all the relationships of the original nodes on the merged one
The first part is possible in Cypher. For example:
MATCH (n) WHERE id(n) IN [106, 68]
WITH collect(n) AS nodes
CREATE (new:MyNode)
with nodes, new
UNWIND nodes as node
SET new += properties(node)
RETURN new
But for the second part, you need to be able to create relationships with a dynamic type and a dynamic direction, and this is not allowed in Cypher.
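The property-merging step (the first part) can also be mimicked on the client side before loading. The sketch below is only an illustration under assumptions: merge_by_key and the sample geohash values are made up, and, as with SET new += properties(node), a later node's value overwrites an earlier one when property names collide.

```python
def merge_by_key(property_maps, key="geohash"):
    """Combine all property maps that share the same value for `key`."""
    merged = {}
    for props in property_maps:
        group = merged.setdefault(props[key], {})
        group.update(props)  # later maps overwrite earlier ones on collisions
    return list(merged.values())


# Hypothetical input: three nodes, two of them sharing a geohash.
nodes = [
    {"geohash": "u09t", "p1": 1},
    {"geohash": "u09t", "p2": 2},
    {"geohash": "u09w", "p3": 3},
]
merged_nodes = merge_by_key(nodes)
```

This covers properties only; relationships would still have to be copied separately, as described above.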

Neo4j : Does creating nodes with 500 properties slow down the creating process?

I am trying to create nodes with 500 to 600 properties (columns), because the requirement is to show all the properties in a single dataframe/table. The nodes are created through the Bolt driver API from Java (in Eclipse).
It would also be tedious to manually select so many columns from different nodes in order to show all the properties as a table.
If all the properties were on a single node, I could easily return all of them at once.
This is how I am trying to create the nodes. The total number of nodes to be created is about 20K to 40K.
Example:
List<String> nodes = {create(s:TEST{a:"", b:"", .... })
create(s:TEST{a:"", b:"", .... })
...
create(s:TEST{a:"", b:"", .... })
};
This is how I am creating the nodes, for example:
try (Session session = driver.session()) {
    for (String q : nodes) {
        StatementResult st = session.run(q);
    }
}
This kind of iterative approach, one node created at a time, isn't recommended.
You may want to review Michael Hunger's batching tips on better ways to insert your data.
In your case, providing a parameter of a list of property maps allows you to unwind the maps, create the nodes, and add the property values to the nodes directly:
UNWIND $propsList as props
CREATE (n:Test)
SET n = props
If you run into slowdowns or hangs when inserting, take a look at Michael's advice on using APOC Procedures to batch insert with periodic commits.
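Building the propsList parameter for the UNWIND query above might look like the following. This is a sketch in Python rather than Java; to_props_list, the column names, and the sample rows are assumptions for illustration, and the driver call is shown only as a comment because it needs a live session.

```python
QUERY = "UNWIND $propsList AS props CREATE (n:Test) SET n = props"


def to_props_list(columns, rows):
    """Turn tabular rows into a list of property maps, skipping None values."""
    return [
        {col: val for col, val in zip(columns, row) if val is not None}
        for row in rows
    ]


# Hypothetical input: three columns instead of 500, two rows instead of 20K.
columns = ["a", "b", "c"]
rows = [("x", 1, None), ("y", 2, True)]
props_list = to_props_list(columns, rows)

# With a Bolt driver session, the whole list would be sent in one call, e.g.
#   session.run(QUERY, propsList=props_list)
```

One query with a list parameter replaces thousands of individually parsed CREATE statements, which is the main point of the advice above.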

Is it possible to merge using data-driven node or relationship labels?

I'm working with prepared statements via the Neo4J JDBC driver, and have a need to create node and relationship labels whose names are driven by the data we will be receiving.
For example, I'd like to create a prepared statement along these lines:
MERGE (test:{1} {id: {2}}) ON CREATE SET test.id = {2}
OR
MERGE (test:Test)-[:{1}]->(test2:Test)
These don't currently work, as it seems Neo4J doesn't interpret the placeholder {1} as a placeholder, instead seeing it as an invalid label name.
Another possibility I'm exploring is that we may be able to extend Cypher via a stored procedure, though I suspect we may run into the same limitation.
Hoping someone can provide some insight as to whether there is any way to accomplish this with Cypher.
Thanks!
UPDATE:
An answer below suggests using APOC's apoc.create.node procedure, but what I need is to merge on a dynamic label. Updated the title to reflect this.
I ended up using a different procedure from APOC - apoc.cypher.doIt, as it turns out APOC doesn't have a way to merge with dynamic labels as of yet. See feature request: https://github.com/neo4j-contrib/neo4j-apoc-procedures/issues/271
Below is what I ended up doing. Note that the requirement was to iterate (in this case using UNWIND) over a collection and merge nodes with dynamic labels pulled from this collection, and then merge a relationship between a pre-existing node and this new node:
WITH myNode, categories
UNWIND categories AS catArray
WITH myNode, 'MERGE (cat:' + catArray[0] + ' {value: "' + catArray[1] + '" }) ON CREATE SET cat.value = \"' + catArray[1] + '\" RETURN cat' AS cypher
CALL apoc.cypher.doIt(cypher, {}) YIELD value
WITH myNode, value.cat as cat
MERGE (myNode)-[:IN_CATEGORY]->(cat)
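Because the query above splices the label into the Cypher text by string concatenation, it may be worth validating the label first and passing the value as a parameter instead of concatenating it. The following is a minimal sketch under assumptions: merge_category_query and the allowed-label pattern are made up for illustration, and the $value parameter syntax assumes a Neo4j version that supports it (older versions use {value}).

```python
import re

# Assumed whitelist: plain identifiers only, so no backticks, spaces or parentheses.
_LABEL_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def merge_category_query(label):
    """Build a MERGE statement for a validated, data-driven label."""
    if not _LABEL_RE.fullmatch(label):
        raise ValueError(f"unsafe label: {label!r}")
    # Only the label is spliced into the text; the value stays a parameter.
    return f"MERGE (cat:{label} {{value: $value}}) RETURN cat"


cypher = merge_category_query("Genre")
params = {"value": "Sci-Fi"}
# The pair could then be passed on, e.g. CALL apoc.cypher.doIt(cypher, params)
```

Rejecting anything that is not a plain identifier keeps data-driven labels from altering the structure of the generated statement.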
You can use APOC to create dynamic relationships. Similar APOC procedures exist for creating nodes with dynamic labels or adding dynamic labels to nodes.
MERGE (test:Test {name: 'Test'})
WITH test
MERGE (test2:Test {name: 'Test 2'})
WITH test, test2
CALL apoc.create.relationship(test, {new_rel_type}, {}, test2) YIELD rel
RETURN test, test2, rel
After trying for almost half a day, I finally found this method:
UNWIND {batch} as row
MERGE (n { id: row.id }) SET n += row.properties
WITH n
CALL apoc.create.addLabels(id(n), [n.label]) YIELD node
RETURN node
The performance is almost the same as a plain MERGE.
Many thanks to this reference: SET label : pass label name as parameter

neo4j - Better way of creating nodes having same type

I am new to neo4j. I am writing a script that imports records from MySQL into neo4j. I want to know the better way of creating nodes that have the same type. I have created the nodes related to likes as in the following snippet:
CREATE (Like:like_1 { node_type:"likes", like_name:"abc" })
CREATE (Like:like_2 { node_type:"likes", like_name:"def" })
CREATE (Like:like_3 { node_type:"likes", like_name:"ghi" })
And in a similar fashion I created users:
CREATE (User:user_1 { node_type:"user", user_name:"alpha" })
CREATE (User:user_2 { node_type:"user", user_name:"beta" })
CREATE (User:user_3 { node_type:"user", user_name:"gamma" })
Thus, a total of 6 nodes were created, where the n in like_n and user_n is the id (primary key) of the SQL record. I thought this would be better for retrieval, since the label of the node is known to me (like_ followed by the id).
MATCH (l:like_1) RETURN l
Is the way the nodes are created above better? Or should I use the following alternative pattern, in which I put the id in a node property:
CREATE (Like:like { node_type:"likes", like_name:"abc", like_id:"1" })
CREATE (Like:like { node_type:"likes", like_name:"def", like_id:"2" })
CREATE (Like:like { node_type:"likes", like_name:"ghi", like_id:"3" })
CREATE (User:user { node_type:"user", user_name:"alpha", user_id:"1" })
CREATE (User:user { node_type:"user", user_name:"beta", user_id:"2" })
CREATE (User:user { node_type:"user", user_name:"gamma", user_id:"3" })
Continuing with the same scenario, if the second approach is better, how could I create relationships between user_1 and all the likes, and retrieve them?
I think you need to reread parts of the Cypher dev guide, or at least the parts on node labeling, and using variables in your queries.
In short, though, the syntax is (variableName:nodeLabel {<params>})
nodeLabel is the equivalent of a type or a table in a relational db, so it makes sense to use User as a node label, but not user_1.
variableName only lasts for the duration of a query. It binds the element (or elements) to that variable for use later in the query. If you aren't planning on using the variable for anything in the rest of the query, don't use a variable at all.
For unique identifiers (like your id primary key), you'll want to set that as a property on your nodes, and additionally create a unique constraint on that label/property combination (that's the equivalent of a unique constraint on a column in a table).
As for Likes...I've got to ask, does a Like make more sense as a node, or as a relationship? Do Users like each other (and other things)? How does a Like fit into your data model?
Rather than just look at your db and try to translate it directly to neo4j, you might want to draw out or refer to an entity relationship diagram or similar. In neo4j, the physical model IS the logical model, so going from a diagram to the actual db should be easy.
For example, let's say that it makes more sense to model Likes as a relationship between users. You might do that like this:
MERGE (user1:User { id:1, name:"alpha" })
MERGE (user2:User { id:2, name:"beta" })
MERGE (user1)-[:Likes]->(user2)
In the above I'm using MERGE instead of CREATE so that if I run this again, it won't create duplicate nodes or relationships (you'll want to read up on MERGE, it's useful but tricky, you'll usually want to use it piecemeal, not for an entire pattern). The nodes I created are nodes with the :User label. The id is a property (and you really should create a constraint on the id property of the User label before doing any create script). After the two nodes are created we create the relationship between them.
An alternate approach, rather than doing node creation and relationship creation all at once, is to separate them. This also makes sense if you're tracking your Likes in a separate table.
You would create your nodes similarly, but without variables, like this:
MERGE (:User { id:1, name:"alpha" })
MERGE (:User { id:2, name:"beta" })
And in your separate query, adding the Likes relationship, assuming you have the ids of the users and assuming there's a unique constraint on the ids of Users:
MATCH (aUser:User{ id:1})
MATCH (bUser:User{ id:2})
MERGE (aUser)-[:Likes]->(bUser)
Remember, variables are only in-scope for the duration of the query to help you refer to and use already defined elements elsewhere in the query.
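For an import script, the two-step approach above (nodes first, relationships second) could be driven by parameters built from the MySQL rows. This is only a sketch: the query strings, the helper names, and the row shapes are assumptions, the $param syntax assumes a Neo4j version that supports it, and no database call is made here.

```python
# Step 1: one node per user row, with the SQL primary key stored as a property.
USER_QUERY = "UNWIND $users AS u MERGE (n:User {id: u.id}) SET n.name = u.name"

# Step 2: one relationship per row of a (hypothetical) likes table.
LIKES_QUERY = (
    "UNWIND $pairs AS p "
    "MATCH (a:User {id: p.from_id}) "
    "MATCH (b:User {id: p.to_id}) "
    "MERGE (a)-[:Likes]->(b)"
)


def user_params(rows):
    """rows: iterable of (id, name) tuples as fetched from MySQL."""
    return {"users": [{"id": uid, "name": name} for uid, name in rows]}


def like_params(rows):
    """rows: iterable of (liker_id, liked_id) tuples."""
    return {"pairs": [{"from_id": a, "to_id": b} for a, b in rows]}


users = user_params([(1, "alpha"), (2, "beta")])
likes = like_params([(1, 2)])
```

Merging on id alone and setting name afterwards follows the earlier remark about using MERGE piecemeal rather than on an entire pattern.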

Resources