Is there any way to copy or move a relationship from one node to another?
I have a situation similar to that here:
neo4j merge 2 or multiple duplicate nodes
and here:
Copy relationships of different type using Cypher
Say I have this pattern in the graph
(a)-[r:FOO]->(b)
(a)<-[r2:BAR]-(c)
I then have another node, (d), which may or may not be a duplicate of (a). My thinking is that it does not matter whether the nodes are duplicate or not from a functionality point of view. I want to be able to move or copy the relationships r:FOO and r2:BAR so that the graph now includes
(d)-[r:FOO]->(b)
(d)<-[r2:BAR]-(c)
If I were doing this to merge duplicate nodes, I would like to be able to move the relationships, as opposed to copying them, and then (perhaps optionally) delete (a). Note that there is more than one relationship type and I do not know for sure what the types will be. I realise I can do this in stages, but thought it would be great if there was an efficient way to do it in one Cypher query. My current strategy is something like this (not exact syntax, but just to give an idea):
// get all outgoing relationships
MATCH (a:Label1 { title : 'blah' })-[r]->(o)
RETURN r
// returns FOO and BAR
// for each relationship type, create one from (d) and copy the properties over
MATCH (a:Label1 { title : 'blah' })-[r:FOO]->(o), (d:Label1 { title : 'blah blah' })
CREATE (d)-[r2:FOO]->(o)
SET r2 = r
...etc...
// now do the same for incoming relationships
MATCH (a:Label1 { title : 'blah' })<-[r]-(o)
RETURN r
// returns FOO and BAR
// for each relationship type, create one from (d) and copy the properties over
MATCH (a:Label1 { title : 'blah' })<-[r:FOO]-(o), (d:Label1 { title : 'blah blah' })
CREATE (d)<-[r2:FOO]-(o)
SET r2 = r
...etc...
// finally delete node and relationships (if required)
MATCH (a:Label1 { title : 'blah' })-[r]-(o)
DELETE r, a
However, this relies on a number of queries and hence transactions. It would be much preferable (in my simpleton view) to achieve this in one query, but I do not believe anything like this exists in Cypher. Am I wrong?
Any ideas? Let me know if this is not clear and I shall try to elaborate and explain further.
For info, I am using Neo4j 2.1.6 community edition (with neo4jclient from a .NET application).
Just realised that I would also have to repeat my process to account for the direction of the relationship unless I am mistaken? i.e. get all outgoing relationships from (a), recreate them as outgoing from (d), and then do the same for all incoming relationships. Cypher above has been edited accordingly.
UPDATE: I am guessing this is a pipe dream and not at all possible. Can anyone confirm this? It would be good to have a definitive answer even if it is "No!". If so I would consider asking the Neo4j guys if this functionality is even feasible and worth considering.
UPDATE 2: From the lack of ideas I am guessing this can't be done. I certainly have got no further in my research or experimentation. Looks like a feature request is the way to go. I can't be the only person who would find this functionality exceptionally useful.
I think you can just chain these together:
// get all relationships
MATCH
(a:Label1 { title : 'blah' })-[r]-(o),
(d:Label1 { title : 'blah blah' })
CREATE (d)-[r2:type(r)]-(o)
DELETE r, a
The only thing I'm not entirely sure about is whether the type() function can be used in the position where I have used it there. I'll try it out now.
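As far as I know, Cypher in Neo4j 2.x does not accept an expression such as type(r) in the type position of a pattern, and CREATE needs a direction. One workaround is to read the relationship types first and then generate one statement per type and direction from application code. Below is a minimal sketch in Python, for illustration only (the question uses neo4jclient from .NET). build_move_queries is a hypothetical helper, and $src/$dst is the newer parameter syntax; Neo4j 2.x would write {src}/{dst} instead.

```python
import re

# Only plain identifiers are accepted, because the relationship type is
# spliced into the query text and cannot be passed as a parameter.
_SAFE_TYPE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_move_queries(rel_types):
    """Return one Cypher statement per relationship type and direction.

    Hypothetical helper: each statement recreates the relationship on (d),
    copies its properties, and deletes the original from (a).
    """
    queries = []
    for rel_type in rel_types:
        if not _SAFE_TYPE.fullmatch(rel_type):
            raise ValueError("unsafe relationship type: %r" % (rel_type,))
        # ("-", "->") is outgoing from (a); ("<-", "-") is incoming to (a).
        for left, right in (("-", "->"), ("<-", "-")):
            queries.append(" ".join([
                f"MATCH (a:Label1 {{title: $src}}){left}[r:{rel_type}]{right}(o),",
                "(d:Label1 {title: $dst})",
                f"CREATE (d){left}[r2:{rel_type}]{right}(o)",
                "SET r2 = r",
                "DELETE r",
            ]))
    return queries


for query in build_move_queries(["FOO", "BAR"]):
    print(query)
```

Running the generated statements inside a single transaction would keep the move atomic; deleting (a) itself would be one more statement at the end.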
I have in my database Company with:
@Relationship(type = "COLLECTS_FROM")
private Set<Company> subCompanies
What I'm trying to achieve is to create two queries:
get the company with limited depth 2 for nodes of the same type (so company which collects from different companies, which collects from different companies)
get the company with full depth for nodes of the same type (similar to the previous one, but all the way down without any limit).
I had a few attempts to solve this and here are those:
first approach was to use findById(ID,2) and findById(ID,999999), but this query has a huge performance drop and I ran into StackOverflow exception, that's why I'm trying to create my own methods similar to what I found here.
second approach was to use a custom query (this actually worked when the node types were different):
@Query("Match (company:Company {name:$name})-[relation:COLLECTS_FROM*0..2]->(secondCompany:Company) RETURN company,relation,secondCompany")
Company customerWithCustomDepth(String name);
Unfortunately, with nodes of the same type I ran into IncorrectResultSizeDataAccessException: Incorrect result size: expected at most 1. When I changed the return type to a list, 3 records were returned (the first one is actually what I want, but it seems that specifying secondCompany in the RETURN clause populates nodes that I don't want in my result).
on the third approach I did similar to the second, but it also returned 3 rows (the first one is what I wanted).
@Query("Match p = (company:Company {name:$name})-[relation:COLLECTS_FROM*0..2]->(secondCompany:Company) RETURN p")
and fourth one once again 3 companies, instead of a single one (the first one is what I wanted)
@Query("MATCH (company:Company {name:$name} ) WITH company MATCH p=(company)-[COLLECTS_FROM*0..2]->(m) RETURN p")
And that's basically it. I feel like I'm close, but a little detail is missing. Maybe instead of trying to fix the query, I should somehow limit the result to get the first row from the Query? Any help will be really appreciated.
update after receiving @cybersam answer:
The general purpose of this query is to create a tree graph, that's why I would like to get a company with their sub-companies inside (and companies in the sub-companies), not only the parent one.
I could use second, or third approach and just grab the first element on the server side:
result.stream()
.findFirst()
.orElseThrow();
but I believe it should be possible to do this on the service side.
To return the Company with the specified name if it is connected by a path of 1 to 2 COLLECTS_FROM relationships to descendant Company nodes:
@Query("MATCH p=(c:Company {name:$name})-[:COLLECTS_FROM*..2]->(:Company) WHERE ALL(n IN NODES(p) WHERE 'Company' IN LABELS(n)) RETURN c")
Company customerWithCustomDepth(String name);
I assume that you do not want to get a result if there are no such descendants, which is why the variable-length relationship pattern does not use *0..2.
Note: If you want to test for paths of exactly length 2, change ..2 to just 2.
The corresponding query for paths of any length can be obtained by changing ..2 to .. (that is, dropping the upper bound). However, variable-length relationship patterns with no upper bound are not recommended, since they can take "forever" to run or run out of memory.
I'm pretty new to Neo4j and Cypher. I have a table Transaction and a table Person that I'd like to link as follows (only the rel3 relationship matters in the end):
MATCH (t:Transaction),(p1:Person)
WHERE p1.id = t.id1
CREATE (p1)-[:rel1]->(t)
MATCH (t:Transaction),(p2:Person)
WHERE p2.id = t.id2
CREATE (t)-[:rel2]->(p2)
MATCH (p1:Person)-[:rel1]->(t:Transaction)-[:rel2]->(p2:Person)
CREATE (p1)-[:rel3]->(p2)
However, I was wondering whether there is a way to avoid this double Cartesian product and still achieve the same goal. Performance is a big concern for me since I have millions of rows to process. So I tried some modifications and ended up with this version:
MATCH (t:Transaction)
WITH t
MATCH (p1:Person {id : t.id1})
WITH p1, t
MATCH (p2:Person {id : t.id2})
CREATE (p1)-[:rel3]->(p2)
It's easier to read and understand, but according to the PROFILE command it does exactly the same thing. Any ideas on how to improve this code?
I wonder why you're doing that in two steps if you can do
MATCH (t:Transaction),(p1:Person),(p2:Person)
WHERE p1.id = t.id1
AND p2.id = t.id2
CREATE (p1)-[:REL3]->(p2)
If you have a unique constraint on the id property for the Person label, that's about as fast as you'll get this. My biggest question, however: why don't you create the relationship the moment you create the Transaction itself, instead of storing foreign keys (a big no-no in a graph database) and then doing the work afterwards?
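To illustrate why the constraint matters: a unique constraint is backed by an index, so each Person can be found by a direct lookup on id rather than by scanning every Person for every Transaction. Here is a rough analogy in Python, with a dictionary standing in for the index. All names and data are made up for illustration, and this is not a description of Neo4j's internals.

```python
def link_people(transactions, people_by_id):
    """Pair up the two people of each transaction with one lookup per side."""
    pairs = []
    for t in transactions:
        # Dictionary lookups play the role of index seeks on Person.id.
        p1 = people_by_id.get(t["id1"])
        p2 = people_by_id.get(t["id2"])
        if p1 is not None and p2 is not None:
            pairs.append((p1["name"], p2["name"]))
    return pairs


# Hypothetical sample data.
people_by_id = {1: {"name": "Ann"}, 2: {"name": "Bob"}, 3: {"name": "Cy"}}
transactions = [
    {"id1": 1, "id2": 2},
    {"id1": 3, "id2": 1},
    {"id1": 2, "id2": 99},  # id2 has no matching person, so no pair
]

print(link_people(transactions, people_by_id))
# [('Ann', 'Bob'), ('Cy', 'Ann')]
```

The work grows with the number of transactions, not with transactions times people, which is the point of avoiding the Cartesian product.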
Hope this helps.
Regards, Tom
I am trying to create/update distinct relationships between two nodes with a single bulk operation in Cypher, using the MERGE and FOREACH clauses.
Right now I am trying the following, but it is not syntactically correct:
MERGE (u1:Person {id:1})
MERGE (u2:Person {id:3})
FOREACH (score IN [{name:'R1',val:1.0},{name:'R2',val:0.5}]|
MERGE (u1)-[r]-(u2)
WHERE type(r) = score.name
ON CREATE SET r.weight=score.val,r.created=timestamp(),r.updated=r.created
ON MATCH SET r.weight=score.val,r.updated=timestamp()
)
Could you please suggest a query to achieve that?
I think the problem with your query is this:
MERGE (u1)-[r]-(u2)
WHERE type(r) = score.name
Creating relationships without a type is not allowed, nor is using a variable (score.name) as the relationship type. I can only suggest two partial solutions:
1) If you are building the query from application code, insert the name there. For instance, in PHP:
....
$rels[] = ['val' => 1.0, 'name' => 'R1'];
foreach ($rels as $rel) {
// splice the relationship type and the weight into the query text
$query[] = 'MERGE (u1)-[r:' . $rel['name'] . ']-(u2)'
. ' ON CREATE SET r.weight=' . $rel['val'] . ', r.created=timestamp(), r.updated=r.created'
. ' ON MATCH SET r.weight=' . $rel['val'] . ', r.updated=timestamp()';
}
....
This would probably give an error because the "r" identifier is reused for each relationship, but that could be avoided by making it variable too.
2) A cleaner solution, but maybe not available in your environment, is to use APOC. In Neo4j 3.0+ it's available to install with many functions for you to use, in particular apoc.create.relationship. I am not familiar with this, but here it's quite well explained.
Anyway, here I also leave the current open issue at the neo4j repository in case it's useful.
I have a Neo4J DB up and running with currently 2 Labels: Company and Person.
Each Company Node has a Property called old_id.
Each Person Node has a Property called company.
Now I want to establish a relation between each Company and each Person where old_id and company share the same value.
Already tried suggestions from: Find Nodes with the same properties in Neo4J and
Find Nodes with the same properties in Neo4J
following the first link I tried:
MATCH (p:Person)
MATCH (c:Company) WHERE p.company = c.old_id
CREATE (p)-[:BELONGS_TO]->(c)
resulting in no change at all and as suggested by the second link I tried:
START
p=node(*), c=node(*)
WHERE
HAS(p.company) AND HAS(c.old_id) AND p.company = c.old_id
CREATE (p)-[:BELONGS_TO]->(c)
RETURN p, c;
resulting in a runtime of more than 36 hours. I had to abort the command without knowing whether it would eventually have worked. Therefore I'd like to ask whether it's theoretically correct and I'm just impatient (the dataset is quite big, to be honest), or whether there's a more efficient way of doing it.
This simple console shows that your original query works as expected, assuming:
1) Your stated data model is correct.
2) Your data actually has Person and Company nodes with matching company and old_id values, respectively.
Note that, in order to match, the values must be of the same type (e.g., both are strings, or both are integers, etc.).
So, check that #1 and #2 are true.
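The type caveat is easy to overlook, so here is a small Python illustration of the same effect (the sample values are invented; the point is only that a string "1" and an integer 1 never compare equal, so the query silently creates nothing):

```python
def count_matches(person_values, company_ids):
    """Count how many Person.company values equal some Company.old_id."""
    ids = set(company_ids)
    return sum(1 for value in person_values if value in ids)


person_company = ["1", "2", "3"]  # hypothetical: stored as strings
company_old_id = [1, 2, 3]        # hypothetical: stored as integers

# Mixed types: nothing matches.
print(count_matches(person_company, company_old_id))                    # 0
# Same types after conversion: everything matches.
print(count_matches([int(v) for v in person_company], company_old_id))  # 3
```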
Depending on the size of your dataset, you may want to page through it:
create constraint on (c:Company) assert c.old_id is unique;
MATCH (p:Person)
WITH p SKIP 100000 LIMIT 100000
MATCH (c:Company) WHERE p.company = c.old_id
CREATE (p)-[:BELONGS_TO]->(c)
RETURN count(*);
Just increase the skip value from zero to your total number of people in 100k steps.
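If you drive this from a script, the batches can be generated rather than typed by hand. Here is a minimal sketch in Python that only produces the parameters for each run. PAGED_QUERY and batch_params are hypothetical names, the $skip/$limit parameter syntax is the newer form (older versions use {skip}/{limit}), and it assumes the order in which Person nodes are scanned stays the same between runs.

```python
# The paged statement from above, with SKIP and LIMIT turned into parameters.
PAGED_QUERY = """
MATCH (p:Person)
WITH p SKIP $skip LIMIT $limit
MATCH (c:Company) WHERE p.company = c.old_id
CREATE (p)-[:BELONGS_TO]->(c)
RETURN count(*)
"""


def batch_params(total_people, batch_size=100_000):
    """Return one parameter dict per batch, covering 0..total_people."""
    return [
        {"skip": offset, "limit": batch_size}
        for offset in range(0, total_people, batch_size)
    ]


for params in batch_params(250_000):
    print(params)
```

Each parameter dict would be sent together with PAGED_QUERY in its own transaction, so no single transaction has to hold the whole dataset.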
I have created two nodes in neo4j with the same name and label but with different ids:
CREATE (P:names {id:"1"})
CREATE (P:names{id:"2"})
My question is if I can create a relationship between these two nodes like this:
MATCH (P:names),(P:names)
WHERE P.id = "1" AND P.id = "2"
CREATE (P)-[r:is_connected_with]->(P) RETURN r
I tried it but it doesn't work.
Is it that I shouldn't create nodes with the same name or there is a workaround?
How about the following?
First run the create statements:
CREATE (p1:Node {id:"1"}) // note, named p1 here
CREATE (p2:Node {id:"2"})
Then, do the matching:
MATCH (pFirst:Node {id:"1"}), (pSecond:Node {id:"2"}) // and here we can call it something else
CREATE (pFirst)-[r:is_connected_with]->(pSecond)
RETURN r
Basically, you are matching two nodes (with the label Node). The CREATE statements call them p1 and p2 and the MATCH calls them pFirst and pSecond, but you can choose these identifiers freely. Then simply create the relationship between them.
You should not use the same identifier for two different nodes in one query. Also note that p1 and p2 are not the names of the nodes; they are the names of the identifiers in that particular query.
EDIT: After input from the OP I have created a small Gist that illustrates some basics regarding Cypher.
@wassgren has the right answer about how to fix your query, but I might be able to fill in some details about why, and it's too long to leave in a comment.
The name before the colon when describing a node or relationship is referred to as an identifier; it's just a variable representing a node/rel within a Cypher query. Neo4j has some naming conventions that you are not following; as a result, your query is harder to read and it will be harder for you to get help in the future. Best practices are:
Identifiers start lowercase: person instead of Person1, p instead of P
Labels are singular and have their first character capitalized: (p1:Name), not (p1:Names) or (p1:names) or (p1:name)
Relationships are all caps, [r:IS_CONNECTED_WITH], not [r:is_connected_with], though this one gets broken all the time ;-)
Back to your query, it both won't work and it doesn't follow conventions.
Won't work:
MATCH (P:names),(P:names)
WHERE P.id = "1" AND P.id = "2"
CREATE (P)-[r:is_connected_with]->(P) RETURN r
Will work, looks so much better(!):
MATCH (p1:Name),(p2:Name)
WHERE p1.id = "1" AND p2.id = "2"
CREATE (p1)-[r:IS_CONNECTED_WITH]->(p2) RETURN r
The reason your query doesn't work, though, is that by writing MATCH (P:names),(P:names) WHERE P.id = "1" AND P.id = "2", you are essentially saying "find a node, call it 'P', with an ID of both 1 and 2." That's not what you want and it obviously won't work!
If you're trying to create many nodes, you would rerun this query for each pair of nodes you want to create, changing the ID you assign each time. You can create the nodes and their relationship in one query, too:
CREATE (p1:Name {id:"1"})-[r:IS_CONNECTED_WITH]->(p2:Name {id:"2"}) RETURN r
In the app, just change the ID you want to assign to the nodes before you run the query. The identifiers are local to the query; they disappear when the query is complete.
EDIT #1!
One more thing, setting the id property within your app and assigning it to the database instead of relying on the Neo4j-created internal ID is a best practice. I suggest avoiding sequential IDs and instead using something to create a unique ID. In Ruby, many people use SecureRandom::uuid for this, I'm sure there's a parallel in whatever language(s) you are using.
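For example, the Python counterpart of SecureRandom::uuid would be the standard uuid module. A minimal sketch (new_node_id is just a hypothetical helper name):

```python
import uuid


def new_node_id():
    """Return a random (version 4) UUID as a string, for use as the id property."""
    return str(uuid.uuid4())


print(new_node_id())
```

The resulting string would then be passed to the CREATE query as the id value instead of "1", "2" and so on.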
EDIT #2!
Neo4j supports integer properties. {id:"1"} != {id: 1}. If your field is supposed to be an integer, use an integer.