Neo4j: Does creating nodes with 500 properties slow down the creation process?

I am trying to create nodes with 500 to 600 properties each, because the requirement is to show all the properties in a single dataframe/table. The nodes are created through the Bolt driver API from Java (in Eclipse).
If the properties were spread across different nodes, it would be tedious to select so many columns manually from each of them in order to show all the properties as a table.
If all the properties were on a single node, I could simply return all of its properties.
This is how I am trying to create the nodes. The total number of nodes to be created is about 20K to 40K.
Example:
List<String> nodes = {create(s:TEST{a:"", b:"", .... })
create(s:TEST{a:"", b:"", .... })
...
create(s:TEST{a:"", b:"", .... })
};
This is how I am creating the nodes, for example:
try (Session session = driver.session()) {
    for (String q : nodes) {
        StatementResult st = session.run(q);
    }
}

This kind of iterative approach, with one node created per query (each session.run call is a separate round trip and its own auto-commit transaction), isn't recommended.
You may want to review Michael Hunger's batching tips on better ways to insert your data.
In your case, providing a parameter of a list of property maps allows you to unwind the maps, create the nodes, and add the property values to the nodes directly:
UNWIND $propsList as props
CREATE (n:Test)
SET n = props
If you run into slowdowns or hangs when inserting, take a look at Michael's advice on using APOC Procedures to batch insert with periodic commits.
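As a rough sketch of how the list of property maps for the UNWIND query above could be assembled on the client side (shown in Python rather than Java to keep it short): the helper name build_props_list, the column names, and the sample rows are made up for illustration, and the driver call is left as a comment because it needs a running database.

```python
# The query from the answer above; it expects a parameter named propsList.
UNWIND_QUERY = """
UNWIND $propsList AS props
CREATE (n:Test)
SET n = props
"""


def build_props_list(columns, rows):
    """Hypothetical helper: pair each row's values with the column names.

    None values are skipped, since Neo4j does not store null properties.
    """
    props_list = []
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("row length does not match number of columns")
        props_list.append({c: v for c, v in zip(columns, row) if v is not None})
    return props_list


# Illustrative data: three columns here, but the same code handles 500 to 600.
columns = ["a", "b", "c"]
rows = [("x1", "", 1), ("x2", None, 2)]
props_list = build_props_list(columns, rows)
print(props_list)

# With the official Python driver, all nodes could then be sent in one query:
#   session.run(UNWIND_QUERY, propsList=props_list)
```

The point of the design is that the query text stays constant while only the parameter changes, so the server does not have to parse 20K to 40K different statements.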

Related

Why can we create null (empty) nodes or multi-label nodes in Neo4j?

In Neo4j we can create null (empty) nodes and multi-label nodes.
CREATE () // create an empty node
CREATE (:l1:l2) // create a multi-label node
Why does it allow creating a null (empty) node? What is the benefit and use of such a node? And why do we need multi-label nodes?
A node is the most basic entity in Neo4j, and it stores data. Labels provide a way to group nodes into sets, and they help with fast lookups.
The statement CREATE () creates a node with no labels and no properties. To query this node with
MATCH (n) WHERE size(labels(n)) = 0 RETURN n
Neo4j has to perform a scan of all nodes. This is inefficient, and this is where labels help you.
The statement CREATE (:l1:l2) creates a node with two labels, l1 and l2. This node can easily be queried with either of the following:
MATCH (n:l1) RETURN n
OR
MATCH (n:l2) RETURN n
In these queries, Neo4j only looks at the nodes grouped under these labels, which makes lookups faster because the set of nodes to search is smaller.
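To make the difference concrete, here is a toy model in Python. It is only an illustration of the idea of grouping nodes by label, not how Neo4j stores data internally, and the ToyGraph class and its method names are invented for this sketch.

```python
from collections import defaultdict


class ToyGraph:
    """Toy model only: NOT Neo4j's actual storage or index implementation."""

    def __init__(self):
        self.nodes = {}                    # node id -> set of labels
        self.by_label = defaultdict(set)   # label -> ids of nodes with it

    def create(self, *labels):
        node_id = len(self.nodes)
        self.nodes[node_id] = set(labels)
        for label in labels:
            self.by_label[label].add(node_id)
        return node_id

    def match_label(self, label):
        # Like MATCH (n:l1): only touches nodes registered under the label.
        return set(self.by_label.get(label, set()))

    def match_unlabeled(self):
        # Like MATCH (n) WHERE size(labels(n)) = 0: has to visit every node.
        return {nid for nid, labels in self.nodes.items() if not labels}


g = ToyGraph()
empty = g.create()             # CREATE ()
multi = g.create("l1", "l2")   # CREATE (:l1:l2)
other = g.create("l2")

print(g.match_label("l1"))
print(g.match_label("l2"))
print(g.match_unlabeled())
```

A multi-label node simply shows up in more than one group, which is why it can be found through either label.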

Create multiple nodes and relationships in several Cypher statements

I want to create multiple Neo4j nodes and relationships in one Cypher transaction. I'm using py2neo, which allows issuing multiple Cypher statements in one transaction.
I thought I'd add a statement for each node and relationship I create:
tx.append('CREATE (n:Label { prop: val })')
tx.append('CREATE (m:Label { prop: val2 })')
Now I want to create a relationship between the two created nodes:
tx.append('CREATE (n)-[:REL]->(m)')
This doesn't work as expected. No relationship is created between the first two nodes, since there's no n or m in the context of the last statement. Instead, a new relationship is created between two new nodes, so four nodes are created in total.
Is there a way around this? Or should I combine all the calls to CREATE (around 100,000 per logical transaction) in one statement?
It just hurts my brain thinking about such a statement, because I'll need to store everything on one big StringIO, and I lose the ability to use Cypher query parameters - I'll need to serialize dictionaries to text myself.
UPDATE:
The actual graph layout is more complicated than that. I have multiple relationship types, and each node is connected to at least two other nodes, while some nodes are connected to hundreds of nodes.
You don't need multiple queries. You can use a single CREATE to create each relationship and its related nodes:
tx.append('CREATE (:Label { prop: val })-[:REL]->(:Label { prop: val2 })')
Do something like this:
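from py2neo import Graph

graph = Graph()  # connects to the default local server

rels = [(1, 2), (3, 4), (5, 6)]
# One parameterized statement creates both nodes and the relationship,
# so n and m are still in scope when the relationship is created.
query = """
CREATE (n:Label {prop: {val1} }),
       (m:Label {prop: {val2} }),
       (n)-[:REL]->(m)
"""
tx = graph.cypher.begin()
for val1, val2 in rels:
    tx.append(query, val1=val1, val2=val2)
tx.commit()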
And if your data is large enough consider doing this in batches of 5000 or so.
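A minimal sketch of that batching, assuming the same statement as above. The helper names batched and commit_in_batches are made up, and RecordingTx is only a stand-in for a py2neo transaction so the example runs without a server; in real use you would pass graph.cypher.begin instead.

```python
# Same statement as in the answer above (legacy {param} placeholder syntax).
QUERY = """
CREATE (n:Label {prop: {val1} }),
       (m:Label {prop: {val2} }),
       (n)-[:REL]->(m)
"""


def batched(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def commit_in_batches(rels, size, begin_tx):
    """Open and commit one transaction per batch; return the number of commits.

    `begin_tx` is any callable returning an object with
    append(query, **params) and commit().
    """
    commits = 0
    for batch in batched(rels, size):
        tx = begin_tx()
        for val1, val2 in batch:
            tx.append(QUERY, val1=val1, val2=val2)
        tx.commit()
        commits += 1
    return commits


class RecordingTx:
    """Stand-in transaction that records how many statements each commit had."""
    log = []

    def __init__(self):
        self.statements = []

    def append(self, query, **params):
        self.statements.append((query, params))

    def commit(self):
        RecordingTx.log.append(len(self.statements))


rels = [(i, i + 1) for i in range(0, 24000, 2)]  # 12,000 illustrative pairs
commits = commit_in_batches(rels, 5000, RecordingTx)
print(commits, RecordingTx.log)
```

Keeping each transaction to a few thousand statements bounds the memory the server needs per commit, at the cost of losing all-or-nothing behaviour across the whole load.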

How to create a relationship to existing nodes based on a condition of matching properties

In Neo4j, I am trying to load a CSV file whilst creating a relationship between nodes based on the condition that a certain property is matched.
My Cypher code is:
LOAD CSV WITH HEADERS FROM "file:C:/Users/George.Kyle/Simple/Simple scream v3.csv" AS csvLine
MATCH (g:simplepages { page: csvLine.page}),(y:simplepages {pagekeyword: csvLine.keyword} )
MATCH (n:sensitiveskin)
WHERE g.keyword = n.keyword
CREATE (f)-[:_]->(n)
You can see I am trying to create a relationship between 'simplepages' and 'sensitiveskin' based on their keyword properties being the same.
The query is executing but relationships won't form.
What I hope for is that when I execute a query such as
MATCH (n:sensitiveskin) RETURN n LIMIT 25
You will see all nodes (both sensitive skin and simple pages) with auto-complete switched on.
CREATE (f)-[:_]->(n) is using an f variable that was not previously defined, so it is creating a new node (with no label or properties) instead, and then creating a relationship from that new node. I think you meant to use either g or y instead of f. (Probably y, since you don't otherwise use it?)

Neo4j server plugin basic questions

I am using Neo4j v2.1.5 and creating a server plugin.
How to create a unique node i.e. guarantee uniqueness of a property?
Is there a hook where in the plugin lifecycle, constraints and indexes can be created?
Returning a node returns the complete database. How can I return just a node or a pojo list as JSON? Are there any working examples or explanation of Representation available?
I am using Java API and not Cypher.
How to create a unique node i.e. guarantee uniqueness of a property?
You can create a unique constraint on a (label, property) pair, which ensures the uniqueness of that property.
e.g.
CREATE CONSTRAINT ON (p:Person) ASSERT p.name IS UNIQUE
would ensure you can't have two Person nodes with the same name. If you want to do that from the Java API, you'd do something like this:
try ( Transaction tx = graphdb.beginTx() )
{
    graphdb.schema()
        .constraintFor( DynamicLabel.label( "Person" ) )
        .assertPropertyIsUnique( "name" )
        .create();
    tx.success();
}
Is there a hook where in the plugin lifecycle, constraints and indexes can be created?
You can do that in a transaction but IIRC you can only create one index/constraint per transaction.
Returning a node returns the complete database. How can I return just a node or a pojo list? Are there any working examples or explanation of Representation available?
Do you mean from cypher? A simple query which will only return one node would be this:
MATCH (n)
RETURN n
LIMIT 1
In cypher land that will return you a map of the properties that the node has on it. If you want to get something more specific you could try this:
MATCH (n:Person)
RETURN n.name AS personName
LIMIT 1
So then you'd get a String back for that column in the result set.
-- Updating for Java API --
From the Java API you can write your own traversals which will give you back 'Node' and 'Relationship' objects. From those you'd then have to extract any properties that you're interested in.
try ( Transaction tx = graphDatabaseService.beginTx() )
{
    ResourceIterable<Node> people = GlobalGraphOperations.at( graphDatabaseService ).getAllNodesWithLabel( DynamicLabel.label( "Person" ) );
    for ( Node node : people )
    {
        String name = (String) node.getProperty( "name" );
    }
    tx.success();
}
Hi, with Cypher I can suggest a few things.
Q. How to create a unique node, i.e. guarantee uniqueness of a property?
Ans. First choose a property that can be unique for that node, like a primary key in a relational database system, e.g. Id.
Now use MERGE to create the node:
MERGE (u:User { Id:1 })
SET u.Name='Charlie'
RETURN u
If a user with that Id does not exist, it will be created. Then, using the SET clause, you can set other properties, or the whole object as well.
Q. Returning a node returns the complete database. How can I return just a node or a pojo list as JSON? Are there any working examples or explanation of Representation available?
Ans. In the same way, if you match on the unique Id, the query returns only that particular node, i.e.
MATCH (u:User { Id:1 }) RETURN u
To create such an Id, I suggest a GUID generated in your programming language, such as C#,
but with Neo4j 3.x you can also use an auto-incremented property.
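As a small illustration of the GUID suggestion, here is a Python sketch (the answer mentions C#, the idea is the same). The function name new_user_params and the parameter names id and name are invented for the example, and the $param placeholder syntax assumes a Neo4j version that supports it (3.x or later) rather than the 2.1.5 server from the question.

```python
import uuid

# Parameterized form of the MERGE statement from the answer above.
MERGE_USER = """
MERGE (u:User { Id: $id })
SET u.Name = $name
RETURN u
"""


def new_user_params(name):
    """Build parameters for MERGE_USER with a freshly generated GUID as Id."""
    return {"id": str(uuid.uuid4()), "name": name}


params = new_user_params("Charlie")
print(params)
```

A random GUID can be generated on the client without asking the database for the next free value, which is convenient when many clients insert concurrently.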

Inserting Data into Neo4j through neo4j rest binding Batch REST API becomes slow as more data is inserted

I am currently trying to insert a lot of data into Neo4j. Using the neo4j java-rest-binding library, I am doing batch insertions of 500 Cypher queries each. Currently I have at most 200k nodes and 1.4m relationships stored in my graph.
With my current data I am already experiencing request timeouts during insertion. I was wondering if there are any configurations that could make the batch requests faster,
or maybe some improvements to the query I am currently using.
Here is a sample query being used:
MERGE (firstNode {id:'ABC'})
ON CREATE SET firstNode.type="RINGCODE", firstNode.created = 100, firstNode:rbt
ON MATCH SET firstNode.type="RINGCODE", firstNode:rbt
MERGE (secondNode{id:'RBT-TC664'})
WITH firstNode, secondNode OPTIONAL MATCH firstNode - [existing:`sku`] - ()
DELETE existing
CREATE UNIQUE p = (firstNode)-[r:`sku`]-(secondNode) RETURN p;
Use labels.
Create an index or unique constraint for the label + property (id).
Represent types with labels instead of a type property.
Otherwise Neo4j has to scan all nodes to find out whether the node you want to merge is already in the database.
If you don't need the uniqueness check, you can also use CREATE, which just creates the node without looking for an existing one and so doesn't slow down as the graph grows.
What does rbt stand for?
CREATE CONSTRAINT ON (n:Rbt) ASSERT n.id IS UNIQUE;
MERGE (firstNode:Rbt:RingCode {id:'ABC'})
ON CREATE SET firstNode.created = 100
MERGE (secondNode:Rbt {id:'RBT-TC664'})
WITH firstNode, secondNode
OPTIONAL MATCH (firstNode)-[existing:`sku`]-()
DELETE existing
MERGE p = (firstNode)-[r:`sku`]-(secondNode)
RETURN p;
