Improving cypher queries and avoiding cartesian product

I'm pretty new to Neo4j and Cypher. I have a table Transaction and a table Person that I'd like to link as follows (only the rel3 relationship matters in the end):
MATCH (t:Transaction),(p1:Person)
WHERE p1.id = t.id1
CREATE (p1)-[:rel1]->(t)
MATCH (t:Transaction),(p2:Person)
WHERE p2.id = t.id2
CREATE (t)-[:rel2]->(p2)
MATCH (p1:Person)-[:rel1]->(t:Transaction)-[:rel2]->(p2:Person)
CREATE (p1)-[:rel3]->(p2)
However, I was wondering whether there is a way to avoid this double cartesian product and still achieve the same goal. Performance is a big concern for me since I have millions of rows to process. So I tried some modifications and ended up with this version:
MATCH (t:Transaction)
WITH t
MATCH (p1:Person {id : t.id1})
WITH p1, t
MATCH (p2:Person {id : t.id2})
CREATE (p1)-[:rel3]->(p2)
It's easier to read and understand, but according to the PROFILE command, it does exactly the same thing. Any ideas on how to improve this code?

I wonder why you're doing that in two steps if you can do
MATCH (t:Transaction),(p1:Person),(p2:Person)
WHERE p1.id = t.id1
AND p2.id = t.id2
CREATE (p1)-[:REL3]->(p2)
If you have a unique constraint on the id property for the Person label, that's about as fast as you'll get this. My biggest question, however: why don't you create the relationship the moment you create the Transaction itself, instead of storing foreign keys (a big no-no in a graph database) and then doing the work afterwards?
Hope this helps.
Regards, Tom

Related

Bulk merge of relations in Neo4J using Cypher

I am trying to create/update distinct relations between two nodes with a single bulk operation in Cypher, leveraging the MERGE and FOREACH clauses.
Right now, I am trying to do it with the following, but it is not syntactically correct:
MERGE (u1:Person {id:1})
MERGE (u2:Person {id:3})
FOREACH (score IN [{name:'R1',val:1.0},{name:'R2',val:0.5}]|
MERGE (u1)-[r]-(u2)
WHERE type(r) = score.name
ON CREATE SET r.weight=score.val,r.created=timestamp(),r.updated=r.created
ON MATCH SET r.weight=score.val,r.updated=timestamp()
)
Could you please suggest a query that achieves this?

I think the problem with your query is this:
MERGE (u1)-[r]-(u2)
WHERE type(r) = score.name
Creating relationships without a type is not allowed, nor is using a variable (score.name) as the type of the relationship. I can only suggest two partial solutions:
1) If you are writing the query from some code, insert the name from there. For instance, in PHP:
....
$rels[] = ['val' => 1.0, 'name' => 'R1'];
foreach ($rels as $rel) {
    // Splice the relationship type and the weight into the Cypher text
    $query[] = 'MERGE (u1)-[r:' . $rel['name'] . ']-(u2)'
        . ' ON CREATE SET r.weight=' . $rel['val'] . ', r.created=timestamp(), r.updated=r.created'
        . ' ON MATCH SET r.weight=' . $rel['val'] . ', r.updated=timestamp()';
}
....
This probably would give an error because of reusing the "r" identifier for the relationship, but that could be avoided making it variable too.
2) A cleaner solution, but maybe not available in your environment, is to use APOC. In Neo4j 3.0+ it's available to install with many functions for you to use, in particular apoc.create.relationship. I am not familiar with this, but here it's quite well explained.
Anyway, here I also leave the current open issue at the Neo4j repository in case it's useful.
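To make option 1 concrete, here is a minimal Python sketch of the same idea: the relationship type is spliced into the query text (after checking that it is a plain identifier), while the values travel as parameters. The function name build_bulk_merge, the r0/r1 identifiers and the $name parameter syntax (used by newer Neo4j versions) are assumptions for illustration, not something taken from the question.

```python
import re

# Assumption: relationship types are plain identifiers such as R1 or KNOWS.
_REL_TYPE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_bulk_merge(id1, id2, scores):
    """Return (query, params) merging one typed relationship per score."""
    lines = [
        "MERGE (u1:Person {id: $id1})",
        "MERGE (u2:Person {id: $id2})",
    ]
    params = {"id1": id1, "id2": id2}
    for i, score in enumerate(scores):
        name = score["name"]
        # The type ends up in the query text, so reject anything unexpected.
        if not _REL_TYPE.fullmatch(name):
            raise ValueError(f"unsafe relationship type: {name!r}")
        rel = f"r{i}"    # a distinct identifier per relationship
        val = f"val{i}"  # a distinct parameter per weight
        params[val] = score["val"]
        lines.append(f"MERGE (u1)-[{rel}:{name}]-(u2)")
        lines.append(
            f"ON CREATE SET {rel}.weight = ${val}, "
            f"{rel}.created = timestamp(), {rel}.updated = timestamp()"
        )
        lines.append(
            f"ON MATCH SET {rel}.weight = ${val}, {rel}.updated = timestamp()"
        )
    return "\n".join(lines), params


query, params = build_bulk_merge(
    1, 3, [{"name": "R1", "val": 1.0}, {"name": "R2", "val": 0.5}]
)
print(query)
```

Using a different identifier for each relationship sidesteps the reuse of "r" mentioned above.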

Neo4j relate nodes by same property

I have a Neo4J DB up and running with currently 2 Labels: Company and Person.
Each Company Node has a Property called old_id.
Each Person Node has a Property called company.
Now I want to establish a relation between each Company and each Person where old_id and company share the same value.
Already tried suggestions from: Find Nodes with the same properties in Neo4J and
Find Nodes with the same properties in Neo4J
Following the first link, I tried:
MATCH (p:Person)
MATCH (c:Company) WHERE p.company = c.old_id
CREATE (p)-[:BELONGS_TO]->(c)
This resulted in no change at all. As suggested by the second link, I then tried:
START
p=node(*), c=node(*)
WHERE
HAS(p.company) AND HAS(c.old_id) AND p.company = c.old_id
CREATE (p)-[:BELONGS_TO]->(c)
RETURN p, c;
This resulted in a runtime of more than 36 hours, and I had to abort the command without knowing whether it would eventually have worked. Therefore I'd like to ask whether it's theoretically correct and I'm just impatient (the dataset is quite big, to be honest), or whether there's a more efficient way of doing it.

This simple console shows that your original query works as expected, assuming:
1) Your stated data model is correct.
2) Your data actually has Person and Company nodes with matching company and old_id values, respectively.
Note that, in order to match, the values must be of the same type (e.g., both are strings, or both are integers, etc.).
So, check that #1 and #2 are true.

Depending on the size of your dataset, you may want to page it:
create constraint on (c:Company) assert c.old_id is unique;
MATCH (p:Person)
WITH p SKIP 100000 LIMIT 100000
MATCH (c:Company) WHERE p.company = c.old_id
CREATE (p)-[:BELONGS_TO]->(c)
RETURN count(*);
Just increase the skip value from zero to your total number of people in 100k steps.
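As a rough sketch of that loop, the following Python helpers generate the skip values and the query text for each batch. The names batch_offsets and paged_query are made up for this example, and actually sending each statement to the database (with whatever driver you use) is left out.

```python
def batch_offsets(total_people, batch_size=100_000):
    """Skip values 0, batch_size, 2 * batch_size, ... below total_people."""
    return list(range(0, total_people, batch_size))


def paged_query(skip, batch_size=100_000):
    """Cypher text for one batch of the paged query shown above."""
    return (
        "MATCH (p:Person)\n"
        f"WITH p SKIP {int(skip)} LIMIT {int(batch_size)}\n"
        "MATCH (c:Company) WHERE p.company = c.old_id\n"
        "CREATE (p)-[:BELONGS_TO]->(c)\n"
        "RETURN count(*);"
    )


# Example: 250,000 people are covered by three batches.
for skip in batch_offsets(250_000):
    print(paged_query(skip))
```

Each batch runs as its own statement, so a failure part-way through only affects the current 100k slice.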

Move/copy all relationships to different node

Is there any way to copy or move a relationship from one node to another?
I have a situation similar to that here:
neo4j merge 2 or multiple duplicate nodes
and here:
Copy relationships of different type using Cypher
Say I have this pattern in the graph
(a)-[r:FOO]->(b)
(a)<-[r2:BAR]-(c)
I then have another node, (d), which may or may not be a duplicate of (a). My thinking is that it does not matter whether the nodes are duplicate or not from a functionality point of view. I want to be able to move or copy the relationships r:FOO and r2:BAR so that the graph now includes
(d)-[r:FOO]->(b)
(d)<-[r2:BAR]-(c)
If I was then doing this to merge nodes when I have duplicates I would like to be able to move the relationships as opposed to copy and then (perhaps optionally) delete (a). Note that there is more than one relationship type and I do not know for sure what the types will be. I realise I can do this in stages but thought it would be great if there was an efficient way to do it in one cypher query. My current strategy is something like (not exact syntax but just to give an idea)
// get all outgoing relationships
MATCH (a:Label1 { title : 'blah' })-[r]->(o)
RETURN r
// returns FOO and BAR
// for each relationship type, create one from (d) and copy the properties over
MATCH (a:Label1 { title : 'blah' })-[r:FOO]->(o), (d:Label1 { title : 'blah blah' })
CREATE (d)-[r2:FOO]->(o)
SET r2 = r
...etc...
// now do the same for incoming relationships
MATCH (a:Label1 { title : 'blah' })<-[r]-(o)
RETURN r
// returns FOO and BAR
// for each relationship type, create one from (d) and copy the properties over
MATCH (a:Label1 { title : 'blah' })<-[r:FOO]-(o), (d:Label1 { title : 'blah blah' })
CREATE (d)<-[r2:FOO]-(o)
SET r2 = r
...etc...
// finally delete node and relationships (if required)
MATCH (a:Label1 { title : 'blah' })-[r]-(o)
DELETE r, a
However this relies on a number of queries and hence transactions. It would be much preferable (in my simpleton view) to be able to achieve this in one query. However, I do not believe anything like this exists in cypher. Am I wrong?
Any ideas? Let me know if this is not clear and I shall try to elaborate and explain further.
For info, I am using Neo4j 2.1.6 community edition (with neo4jclient from a .NET application).
Just realised that I would also have to repeat my process to account for the direction of the relationship unless I am mistaken? i.e. get all outgoing relationships from (a), recreate them as outgoing from (d), and then do the same for all incoming relationships. Cypher above has been edited accordingly.
UPDATE: I am guessing this is a pipe dream and not at all possible. Can anyone confirm this? It would be good to have a definitive answer even if it is "No!". If so I would consider asking the Neo4j guys if this functionality is even feasible and worth considering.
UPDATE 2: From the lack of ideas I am guessing this can't be done. I certainly have got no further in my research or experimentation. Looks like a feature request is the way to go. I can't be the only person who would find this functionality exceptionally useful.

I think you can just chain these together:
// get all relationships
MATCH
(a:Label1 { title : 'blah' })-[r]-(o),
(d:Label1 { title : 'blah blah' })
CREATE (d)-[r2:type(r)]-(o)
DELETE r, a
The only thing I'm not entirely sure about is the ability to use the type() function where it's being used there. I'll try it out now
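In case type() turns out not to be usable in that position, the staged strategy from the question can at least be generated rather than written by hand. Below is a minimal Python sketch that emits one copy statement per relationship type and direction. The function name copy_statements and the $from_title/$to_title parameters are assumptions for illustration, and the list of types still has to come from a first query.

```python
import re

# Assumption: relationship types are plain identifiers such as FOO or BAR.
_REL_TYPE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def copy_statements(rel_types):
    """One statement per type and direction, copying relationships from (a) to (d)."""
    statements = []
    for rel_type in rel_types:
        # The type is spliced into the query text, so validate it first.
        if not _REL_TYPE.fullmatch(rel_type):
            raise ValueError(f"unsafe relationship type: {rel_type!r}")
        # Outgoing first, then incoming, as in the question.
        for left, right in (("-", "->"), ("<-", "-")):
            statements.append(
                "MATCH (a:Label1 {title: $from_title})"
                f"{left}[r:{rel_type}]{right}(o), "
                "(d:Label1 {title: $to_title})\n"
                f"CREATE (d){left}[r2:{rel_type}]{right}(o)\n"
                "SET r2 = r"
            )
    return statements


for statement in copy_statements(["FOO", "BAR"]):
    print(statement, end="\n\n")
```

This still needs several statements, but they can be sent together in one transaction before deleting (a).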

Can I create and relate two nodes with the same name but different ids in neo4j

I have created two nodes in neo4j with the same name and label but with different ids:
CREATE (P:names {id:"1"})
CREATE (P:names{id:"2"})
My question is if I can create a relationship between these two nodes like this:
MATCH (P:names),(P:names)
WHERE P.id = "1" AND P.id = "2"
CREATE (P)-[r:is_connected_with]->(P) RETURN r
I try it but it doesn't work.
Is it that I shouldn't create nodes with the same name or there is a workaround?

How about the following?
First run the create statements:
CREATE (p1:Node {id:"1"}) // note, named p1 here
CREATE (p2:Node {id:"2"})
Then, do the matching:
MATCH (pFirst:Node {id:"1"}), (pSecond:Node {id:"2"}) // and here we can call it something else
CREATE (pFirst)-[r:is_connected_with]->(pSecond)
RETURN r
Basically, you are matching two nodes (with the label Node). In the CREATE statements they are called p1 and p2, and in the MATCH they are called pFirst and pSecond; you can change these identifiers if you wish. Then, simply create the relationship between them.
You should not create identifiers with the same name. Also note that p1 and p2 are not the names of the nodes; they are the names of the identifiers in this particular query.
EDIT: After input from the OP I have created a small Gist that illustrates some basics regarding Cypher.

@wassgren has the right answer about how to fix your query, but I might be able to fill in some details about why, and it's too long to leave in a comment.
The characters before the colon when describing a node or relationship are referred to as an identifier; it's just a variable representing a node/rel within a Cypher query. Neo4j has some naming conventions that you are not following, which makes your query harder to read and will make it harder for you to get help in the future. Best practices are:
Identifiers start lowercase: person instead of Person1, p instead of P
Labels are singular and have their first character capitalized: (p1:Name), not (p1:Names) or (p1:names) or (p1:name)
Relationships are all caps, [r:IS_CONNECTED_WITH], not [r:is_connected_with], though this one gets broken all the time ;-)
Back to your query, it both won't work and it doesn't follow conventions.
Won't work:
MATCH (P:names),(P:names)
WHERE P.id = "1" AND P.id = "2"
CREATE (P)-[r:is_connected_with]->(P) RETURN r
Will work, looks so much better(!):
MATCH (p1:Name),(p2:Name)
WHERE p1.id = "1" AND p2.id = "2"
CREATE (p1)-[r:IS_CONNECTED_WITH]->(p2) RETURN r
The reason your query doesn't work, though, is that by writing MATCH (P:names),(P:names) WHERE P.id = "1" AND P.id = "2", you are essentially saying "find a node, call it 'P', with an ID of both 1 and 2." That's not what you want and it obviously won't work!
If you're trying to create many nodes, you would rerun this query for each pair of nodes you want to create, changing the ID you assign each time. You can create the nodes and their relationship in one query, too:
CREATE (p1:Name {id:"1"})-[r:IS_CONNECTED_WITH]->(p2:Name {id:"2"}) RETURN r
In the app, just change the ID you want to assign to the nodes before you run the query. The identifiers are instance variables, they disappear when the query is complete.
EDIT #1!
One more thing, setting the id property within your app and assigning it to the database instead of relying on the Neo4j-created internal ID is a best practice. I suggest avoiding sequential IDs and instead using something to create a unique ID. In Ruby, many people use SecureRandom::uuid for this, I'm sure there's a parallel in whatever language(s) you are using.
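As an illustration in Python (an assumption, since the question does not say which language the app uses), the standard uuid module plays the same role as SecureRandom::uuid. The helper names and the $id1/$id2 parameters are made up for this sketch.

```python
import uuid

# Same statement as above, with the ids passed in as parameters.
CREATE_PAIR = (
    "CREATE (p1:Name {id: $id1})-[r:IS_CONNECTED_WITH]->(p2:Name {id: $id2}) "
    "RETURN r"
)


def new_node_id():
    """A random (version 4) UUID as a string, generated in the app."""
    return str(uuid.uuid4())


def create_pair_params():
    """Fresh ids for one pair of nodes; pass these alongside CREATE_PAIR."""
    return {"id1": new_node_id(), "id2": new_node_id()}


print(CREATE_PAIR)
print(create_pair_params())
```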
EDIT #2!
Neo4j supports integer properties. {id:"1"} != {id: 1}. If your field is supposed to be an integer, use an integer.

Cypher - multiple relationships with same label, i want to delete just one

I have the following schema:
(Node a and b are identified by id and are the same in both relationships)
(a)-[r:RelType {comment:'a comment'} ]-(b)
(a)-[r:RelType {comment:'another comment'} ]-(b)
So I have 2 nodes, and an arbitrary number of relationships between them. I want to delete just one of the relationships, and I don't care which one. How can I do this?
I have tried this, but it does not work:
match (a {id:'aaa'})-[r:RelType]-(b {id:'bbb'}) where count(r)=1 delete r;
Any ideas?
Here is the real-world query:
match (order:Order {id:'order1'}),(produs:Product {id:'supa'}),
(order)-[r:ordprod {status:'altered'}]->(produs) with r limit 1 set r.status='alteredAgain'
return (r);
The problem is that Cypher says
Set 1 property, returned 1 row in 219 ms
but when I inspect the database, it turns out all relationships have been updated.

Use the following:
match (a {id:'aaa'})-[r:RelType]-(b {id:'bbb'})
with r
limit 1
delete r

I tried to implement a data structure like yours (Mihai's) and tried both solutions, i.e., Stefan's and Sumit's. Stefan's solution works on my side. Mihai, are you still facing any problems?

Hope this helps. (As I understand it, you are trying to modify a relationship between two given nodes.)
MATCH (order:Order {id:'order1'})-[r:ordprod {status:'altered'}]->(produs:Product {id:'supa'})
WITH order,r,produs
LIMIT 1
DELETE r
WITH order,produs
CREATE (order)-[r:ordprod {status:'alteredAgain'}]->(produs)
return (r);
And the reason for all your relationships getting updated is that after your WITH clause you are passing only r, i.e., the relationship, which may be the same between all such nodes of label Order and Product. So when you do r.status = 'alteredAgain', it changes all the relationships instead of changing only the one between those two specific nodes that you matched at the beginning of your Cypher query. Pass them too in the WITH and it will work fine!
