Neo4j 2.0 Merge with unique constraints performance bug?

Here's the situation:
I have a node that has a property ContactId which is set as unique and indexed. The node label is :Contact
(node:Contact {ContactId:1})
I have another node similar to that pattern for Address:
(node2:Address {AddressId:1})
I now try to add a new node that, among other properties, includes ContactId (for referencing):
(node3:ContactAddress {AddressId:1,ContactId:1})
When I run a MERGE command for each, adding a node that contains a property set as unique on another node type is much slower.
The ContactAddress node only contains relational properties between the Contact and Address nodes. Contact and Address nodes contain up to 10 properties each. Is this a bug, where Neo4j is checking the property key -> value -> then node label?
Code and screenshot below:
// {0} = parameter name, {1} = node label, {2} = primary-key property
string strForEach = string.Format(
    "(n in {{{0}}} | MERGE (c:{1} {{{2} : n.{2}}}) SET c = n)",
    propKey, label, PK_Field);
var query = client
    .Cypher
    .ForEach(strForEach)
    .WithParam(propKey, entities.ToList());

Constraint checks are more expensive than plain inserts. They also take a global lock on the constraint to prevent multiple insertions.
I saw that you use string substitution rather than parameters. I really recommend changing that and using parameters.
Also, setting the whole node c to n triggers the constraint check again.
You probably want to use the ON CREATE SET clause of MERGE:
(n in {nodes} |
MERGE (c:Label {key : n.key}) ON CREATE SET c.foo = n.foo, c.bar = n.bar )
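Labels and property keys cannot be passed as Cypher parameters, so some string building remains unavoidable for those parts of the query. The following is a minimal Python sketch, not the C# client code from the question, of one way to keep that substitution safe while leaving the data itself in a parameter. The function name `build_merge_query`, the `Name` property, and the default parameter name `nodes` are assumptions made for illustration; the `{nodes}` parameter syntax follows this answer (newer Neo4j versions use `$nodes`).

```python
import re

# Only plain identifiers are allowed in the parts of the query that
# cannot be parameterised (label, property keys, parameter name).
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_merge_query(label, key, other_props, param="nodes"):
    """Build the FOREACH/MERGE text; the row data is passed separately as a parameter."""
    for name in (label, key, param, *other_props):
        if not _IDENT.match(name):
            raise ValueError(f"not a plain identifier: {name!r}")
    # MERGE only on the unique key, so the constraint is checked once per row.
    query = f"FOREACH (n IN {{{param}}} | MERGE (c:{label} {{{key}: n.{key}}})"
    # Remaining properties are written only when the node is newly created.
    sets = ", ".join(f"c.{p} = n.{p}" for p in other_props)
    if sets:
        query += f" ON CREATE SET {sets}"
    return query + ")"


# "Name" is a hypothetical extra property on :Contact.
query = build_merge_query("Contact", "ContactId", ["Name"])
print(query)
```

The list of entities would then be supplied under the `nodes` parameter by whichever driver or client is in use.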

Related

Add Integer Number to existing Values - Neo4j

Using Neo4j.
I would like to add an integer to values that already exist in properties of several relationships, which I match this way:
MATCH x=(()-[y]->(s:SOL{PRB:"Taking time"})) SET y.points=+2
But it doesn't add anything; it just replaces the value I want to increment with 2.

To achieve this use
SET y.points = y.points + 2
From your original question it looks like you were trying to use the addition assignment operator, which exists in many languages (e.g. Python, TypeScript/JavaScript, C#). However, in Cypher += works a little differently: it adds or updates properties on an entire node or relationship based on a map.
Suppose you had a parameter like the one below (copy this into the Neo4j browser to create a param).
:param someMapping: {a:1, b:2}
The query below would create a property b on the node with value 2, and set the value of property a on that node to 1.
MATCH (n:SomeLabel) WHERE n.a = 0
SET n += $someMapping
RETURN n
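The difference between the three forms discussed above can be illustrated with a plain-Python model that uses dicts in place of relationship and node properties. This is only a sketch of the semantics, not Neo4j code; the helper names and the `name` property are made up for the example, and special cases such as null values in the map are not modelled.

```python
def set_property(props, key, value):
    """Model of `SET y.key = value`: the new value overwrites the old one."""
    updated = dict(props)
    updated[key] = value
    return updated


def set_plus_equals(props, mapping):
    """Model of `SET n += $mapping`: add or overwrite keys, keep the rest."""
    updated = dict(props)
    updated.update(mapping)
    return updated


rel = {"points": 5}

# SET y.points=+2 is an assignment of the value +2, so the old value is lost.
replaced = set_property(rel, "points", +2)

# SET y.points = y.points + 2 reads the old value first, then adds to it.
incremented = set_property(rel, "points", rel["points"] + 2)

# SET n += {a: 1, b: 2} updates a, creates b, and leaves other properties alone.
node = {"a": 0, "name": "kept"}
merged = set_plus_equals(node, {"a": 1, "b": 2})

print(replaced, incremented, merged)
```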

Neo4j display only one node for 1 to many relationship

I'm trying to solve a problem with how a 1:many relationship is displayed in Neo4j. My dataset is as below:
child,desc,type,parent
1,PGD,Exchange,0
2,MSE 1,MSE,1
3,MSE 2,MSE,1
4,MSE 3,MSE,1
5,MSE 4,MSE,1
6,BRAS 1,BRAS,2
6,BRAS 1,BRAS,3
7,BRAS 2,BRAS,4
7,BRAS 2,BRAS,5
10,NPE 1,NPE,6
11,NPE 2,NPE,7
12,OLT,OLT,10
12,OLT,OLT,11
13,FDC,FDC,12
14,FDP,FDP,13
15,Cust 1,Customer,14
16,Cust 2,Customer,14
17,Cust 3,Customer,14
LOAD CSV WITH HEADERS FROM 'file:///FTTH_sample.csv' AS line
CREATE(:ftthsample
{child_id:line.child,
desc:line.desc,
type:line.type,
parent_id:line.parent});
//Relations
match (child:ftthsample),(parent:ftthsample)
where child.child_id=parent.parent_id
create (child)-[:test]->(parent)
//Query:
MATCH (child)-[childrel:test*]-(elem)-[parentrel:test*]->(parent)
WHERE elem.desc='FDP'
RETURN child,childrel,elem,parentrel
It returns a display as below.
I want the duplicate nodes to be displayed as one. I'm a newbie with Neo4j. Can any of the experts help, please?

This seems like an error in your graph creation query. You have a few lines in your query specifying the same node multiple times, but with multiple parents:
6,BRAS 1,BRAS,2
6,BRAS 1,BRAS,3
I'm guessing you actually want this to be a single node, with parent relationships to nodes with the given parent ids, instead of two separate nodes.
Let's adjust your import query. Instead of using a CREATE on each line, we'll use MERGE, and just on the child_id, which seems to be your primary key (maybe consider just using id instead, as a node can have an id on its own, without having to consider the context of whether it's a parent or child). We can use the ON CREATE clause after MERGE to add in the remaining properties only if the MERGE resulted in node creation (instead of matching an existing node).
That will ensure we only have one node created per child_id.
Rather than having to rematch the child, we can use the child node we just created, match on the parent, and create the relationship.
LOAD CSV WITH HEADERS FROM 'file:///FTTH_sample.csv' AS line
MERGE(child:ftthsample {child_id:line.child})
ON CREATE SET
child.desc = line.desc,
child.type = line.type
WITH child, line.parent as parentId
MATCH (parent:ftthsample)
WHERE parent.child_id = parentId
MERGE (child)-[:test]->(parent)
Note that we haven't added line.parent as a property. It's not needed, since we only use that to create relationships, and after the relationships are there, we won't need those again.
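To see what the adjusted import does with the sample data, here is a plain-Python model of it. It is only a sketch of the MERGE logic using a dict and a set, not Neo4j itself; the function name `import_rows` is made up for illustration. Like the query above, it processes rows in file order and only links a child to a parent that already exists.

```python
import csv
import io

CSV_TEXT = """child,desc,type,parent
1,PGD,Exchange,0
2,MSE 1,MSE,1
3,MSE 2,MSE,1
4,MSE 3,MSE,1
5,MSE 4,MSE,1
6,BRAS 1,BRAS,2
6,BRAS 1,BRAS,3
7,BRAS 2,BRAS,4
7,BRAS 2,BRAS,5
10,NPE 1,NPE,6
11,NPE 2,NPE,7
12,OLT,OLT,10
12,OLT,OLT,11
13,FDC,FDC,12
14,FDP,FDP,13
15,Cust 1,Customer,14
16,Cust 2,Customer,14
17,Cust 3,Customer,14
"""


def import_rows(csv_text):
    nodes = {}    # child_id -> properties; one entry per MERGE'd node
    rels = set()  # (child_id, parent_id) pairs; a set de-duplicates like MERGE
    for line in csv.DictReader(io.StringIO(csv_text)):
        child_id = line["child"]
        if child_id not in nodes:
            # ON CREATE SET: properties are written only for a new node.
            nodes[child_id] = {"child_id": child_id,
                               "desc": line["desc"],
                               "type": line["type"]}
        if line["parent"] in nodes:
            # MATCH found the parent, so MERGE the relationship.
            rels.add((child_id, line["parent"]))
    return nodes, rels


nodes, rels = import_rows(CSV_TEXT)
print(len(nodes), len(rels))  # prints "15 17"
```

The 18 data rows collapse to 15 nodes, and node 6 ends up with two parents (2 and 3) instead of existing twice. The first row creates no relationship because there is no node with id 0.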

How to use Create Unique with Cypher

My goal is to create a node and set a new property on it if the node does not exist.
If it exists, I just want to update its property.
I tried this:
MATCH (user:C9 {userId:'44'})
CREATE UNIQUE (user{timestamp:'1111'})
RETURN user
*If the node with the property userId=44 already exists, I just want to set its property to 1111; otherwise just create it and set the property.
The error I am getting:
user already declared (line 2, column 16 (offset: 46))
"CREATE UNIQUE (user{timestamp:'1111'})"
Should I switch to MERGE?
Thanks.

Yes, you should use the MERGE statement.
MERGE (user:C9 {userId:'44'})
// you can set some initial properties when the node is created if required
//ON CREATE SET user.propertykey = 'propertyvalue'
ON MATCH SET user.timestamp = '1111'
RETURN user
You mention unique constraints, so I assume you have one set up. You definitely should, to prevent duplicate nodes from being created. It will also create a schema index to improve the performance of your node lookup. If you do not yet have a unique constraint, it can be created like so:
CREATE CONSTRAINT ON (u:C9) ASSERT u.userId IS UNIQUE
See the Neo4j MERGE documentation.
Finally, to understand what is going on in your query let's have a quick look line by line.
MATCH (user:C9 { userId:'44' })
This matches the node with label :C9 that has a userId property with value '44' and assigns it the identifier user.
CREATE UNIQUE (user{timestamp:'1111'})
This line is simply trying to create a new node with no label and a property timestamp with value '1111'. The exception you are seeing is a result of reusing the user identifier that was already declared in the first line. In any case, this is not a supported way to use CREATE UNIQUE, as it requires a match first and will then create the bits of the pattern that do not exist. The upside is that the error stops this unwanted node (user{timestamp:'1111'}) from being created in the graph.
RETURN user
This line is pretty self explanatory and is not being reached.
EDIT
There seems to be some confusion surrounding CREATE UNIQUE and when it should be used. This query
CREATE UNIQUE (user:C9 {timestamp:'1111'})
will fail with the message
This pattern is not supported for CREATE UNIQUE
To use CREATE UNIQUE you would first match an existing node and then use that to create a unique pattern in the graph. So to create a relationship from user to a second node you would use
MATCH (user:C9 { userId: '44' })
CREATE UNIQUE (user)-[r:FOO]-(bar)
RETURN r
If there are no relationships of type FOO from user, then a new node will be created to represent bar and a relationship of type :FOO will be created between them. Conversely, if the MATCH statement does not make a match, then no nodes or relationships will be created.
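The behaviour of the MERGE query recommended at the top of this answer can be modelled in a few lines of plain Python. This is a sketch of the match-or-create semantics only, using a list of dicts in place of the graph; `merge_node` and its arguments are hypothetical names, not a Neo4j API.

```python
def merge_node(store, label, key, value, on_create=None, on_match=None):
    """Return (node, created): match on label + key, otherwise create the node."""
    for node in store:
        if label in node["labels"] and node["props"].get(key) == value:
            # ON MATCH SET: applied only when the node already existed.
            node["props"].update(on_match or {})
            return node, False
    # ON CREATE SET: applied only when the node is newly created.
    node = {"labels": {label}, "props": {key: value, **(on_create or {})}}
    store.append(node)
    return node, True


graph = []

# First run: no :C9 node with userId '44' exists yet, so it is created.
first, created_first = merge_node(
    graph, "C9", "userId", "44", on_match={"timestamp": "1111"})
props_after_create = dict(first["props"])

# Second run: the node is found, so ON MATCH sets the timestamp.
second, created_second = merge_node(
    graph, "C9", "userId", "44", on_match={"timestamp": "1111"})

print(props_after_create, second["props"], len(graph))
```

Note that with only ON MATCH SET, the timestamp is not written on the run that creates the node. If the property should be set in both cases, a plain SET after the MERGE applies whether the node was matched or created.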

CREATE UNIQUE with labels and properties

I'm using Neo4j 2.0.0-M06. I'm just learning Cypher and reading the docs. In my mind this query should work, but no such luck...
I'm importing tweets into a MySQL database, and from there importing them into Neo4j. If a tweet already exists in the Neo4j database, it should be updated.
My query:
MATCH (y:Tweet:Socialmedia) WHERE
HAS (y.tweet_id) AND y.tweet_id = '123'
CREATE UNIQUE (n:Tweet:Socialmedia {
body : 'This is a tweet', tweet_id : '123', tweet_userid : '321', tweet_username : 'example'
} )
Neo4j says: This pattern is not supported for CREATE UNIQUE
The database currently contains no nodes with the matching labels, so there are no tweets whatsoever in the Neo4j database.
What is the correct query?

You want to use MERGE for this query, along with a unique constraint.
CREATE CONSTRAINT ON (t:Tweet) ASSERT t.tweet_id IS UNIQUE;
MERGE (t:Tweet {tweet_id:'123'})
ON CREATE
SET t:Socialmedia,
t.body = 'This is a tweet',
t.tweet_userid = '321',
t.tweet_username = 'example';
This will use an index to lookup the tweet by id, and do nothing if the tweet exists, otherwise it will set those properties.

I would like to point out that one can use a combination of
CREATE CONSTRAINT and then a normal
CREATE (without UNIQUE)
This is for cases where one expects a unique node and wants an exception thrown if the node unexpectedly exists. (This is far cheaper than looking for the node before creating it.)
Also note that MERGE seems to take more CPU cycles than a CREATE. (It also takes more CPU cycles even if an exception is thrown.)
This is an alternative scenario covering CREATE CONSTRAINT, CREATE and MERGE (though admittedly not the primary purpose of this post).

Node identifiers in neo4j

I'm new to Neo4j - just started playing with it yesterday evening.
I've noticed that all nodes are identified by an auto-incremented integer that is generated during node creation. Is this always the case?
My dataset has natural string keys so I'd like to avoid having to map between the Neo4j assigned ids and my own. Is it possible to use string identifiers instead?

Think of the node id as an implementation detail (like the rowid in relational databases: it can be used to identify nodes, but you should not rely on it never being reused).
You would add your natural keys as properties to the node and then index your nodes by the natural key (or enable auto-indexing for them).
E.g. in the Java API:
Index<Node> idIndex = db.index().forNodes("identifiers");
Node n = db.createNode();
n.setProperty("id", "my-natural-key");
idIndex.add(n, "id",n.getProperty("id"));
// later
Node found = idIndex.get("id", "my-natural-key").getSingle(); // node or null
With the auto-indexer you would enable auto-indexing for your "id" property.
// via configuration
GraphDatabaseService db = new EmbeddedGraphDatabase("path/to/db",
MapUtils.stringMap(
Config.NODE_KEYS_INDEXABLE, "id", Config.NODE_AUTO_INDEXING, "true" ));
// programmatic (not persistent)
db.index().getNodeAutoIndexer().startAutoIndexingProperty( "id" );
// Nodes with property "id" will be automatically indexed at tx-commit
Node n = db.createNode();
n.setProperty("id", "my-natural-key");
// Usage
ReadableIndex<Node> autoIndex = db.index().getNodeAutoIndexer().getAutoIndex();
Node indexed = autoIndex.get("id", "my-natural-key").getSingle(); // node or null
See: http://docs.neo4j.org/chunked/milestone/auto-indexing.html
And: http://docs.neo4j.org/chunked/milestone/indexing.html

This should help:
Create the index to back automatic indexing during batch import: We know that if auto indexing is enabled in neo4j.properties, each node that is created will be added to an index named node_auto_index. Now, here's the cool bit. If we add the original manual index (at the time of batch import) and name it as node_auto_index and enable auto indexing in neo4j, then the batch-inserted nodes will appear as if auto-indexed. And from there on each time you create a node, the node will get indexed as well.
Source : Identifying nodes with Custom Keys

According to the Neo4j docs, there should be automatic indexes in place:
http://neo4j.com/docs/stable/query-schema-index.html
but there are still a lot of limitations.

Beyond the other answers: Neo4j still creates its own ids so that it can work faster and serve better. Please make sure your own ids do not conflict with the internal ones; otherwise it will create nodes with the same properties, which show up in the system as empty nodes.

The ids are generated by default and can't be modified by users. You can store your string identifiers as a property on the node.
