Relatively new to Neo4j. I realize the way I originally posted this it was too ambiguous. Below is hopefully a better explanation.
//Subgraph 1
Create (p1:Person {name: 'Person1'})
Create (p2:Person {name: 'Person2'})
Create (a1:Address {street: 'Suspicious'})
Create (p1)-[:Resides]->(a1)
Create (p2)-[:Resides]->(a1)
//Subgraph 2
Create (p3:Person {name: 'Person3'})
Create (p4:Person {name: 'Person4'})
Create (a2:Address {street: 'Double'})
Create (p3)-[:Resides]->(a2)
Create (p4)-[:Resides]->(a2)
Create (p3)-[:Knows]->(p4)
//Subgraph 3
Create (p5:Person {name: 'Person5'})
Create (a3:Address {street: 'Single'})
Create (p5)-[:Resides]->(a3)
What I would like to write is a query to detect the following:
- All addresses (and people) that have 2 or more People residing there that do not know each other.
This means that only Subgraph1 should be found.
Subgraph2 would not be found because there are 2 people that reside there but they know each other.
Subgraph3 would not be found because there is only 1 person residing there.
Again, thanks for the help.
This Cypher query should work:
MATCH (n1)-[:RESIDES_AT]->()<-[:RESIDES_AT]-(n2)
WHERE NOT exists((n1)-[:KNOWS]-(n2))
RETURN n1, n2
start by matching on nodes that have a RESIDES_AT relationship to the same node, then filter out nodes that have a KNOWS relationship.
Related
I have several separated graphs in a single database and I am currently searching for a way to get a list of all similar graphs.
For instance, I have the following three graphs:
As you can see, graph 1 and 2 are similar and graph 3 is different, because the last node of graph 3 has Label_4 and not Label_3 (as it is the case for 1 and 2).
Therefore, I would like to get as a result of the query something like:
[a1->b1->c1,a2->b2->c2],[a3->b3->d3]
whereas a1->b1->c1 is graph 1, a2->b2->c2 is graph 2, and a3->b3->d3 is graph 3.
Is there a way to achieve this with Cypher? The representation of the result can also be different, as long as it groups similar graphs (e.g., also a list node IDs or only the start node IDs is fine).
For the creation of the example I used the following commands:
CREATE (a1:Label_1 {name: "Label_1"})
CREATE (b1:Label_2 {name: "Label_2"})
CREATE (c1:Label_3 {name: "Label_3"})
CREATE (a2:Label_1 {name: "Label_1"})
CREATE (b2:Label_2 {name: "Label_2"})
CREATE (c2:Label_3 {name: "Label_3"})
CREATE (a3:Label_1 {name: "Label_1"})
CREATE (b3:Label_2 {name: "Label_2"})
CREATE (d3:Label_4 {name: "Label_4"})
CREATE (a1)-[:FOLLOWS]->(b1)
CREATE (b1)-[:FOLLOWS]->(c1)
CREATE (a2)-[:FOLLOWS]->(b2)
CREATE (b2)-[:FOLLOWS]->(c2)
CREATE (a3)-[:FOLLOWS]->(b3)
CREATE (b3)-[:FOLLOWS]->(d3)
If you are: (A) trying to group complete directed graphs (i.e., directed graphs that start at a root node and end at a leaf node), and (B) only interested in using one of the (possibly many) labels for each node, this should work (but, due to the unbounded variable-length relationship, it could take a very long time or run out of memory in large DBs):
MATCH p = (n)-[*]->(m)
WHERE NOT ()-->(n) AND NOT (m)-->()
RETURN [x IN NODES(p) | LABELS(x)[0]] as labelPath, COLLECT(p)
You can remove the (A) constraint by removing the WHERE clause, but then you'd have a much bigger result set (and increase the time to completion and the risk of running out of memory).
Initial Situation
Large Neo4j 3.4.6 graph with a tree-like structure (10 levels deep, 10 million nodes).
Unexceptional all nodes are connected with each other. The nodes as well as the relationships are in each case of the same type.
Exactly one central root node.
Reduced and simplified example:
Graphic representation
CREATE (Root:CustomType {name: 'Root'})
CREATE (NodeA:CustomType {name: 'NodeA'})
CREATE (NodeB:CustomType {name: 'NodeB'})
CREATE (NodeC:CustomType {name: 'NodeC'})
CREATE (NodeD:CustomType {name: 'NodeD'})
CREATE (NodeE:CustomType {name: 'NodeE'})
CREATE (NodeF:CustomType {name: 'NodeF'})
CREATE (NodeG:CustomType {name: 'NodeG'})
CREATE (NodeH:CustomType {name: 'NodeH'})
CREATE (NodeI:CustomType {name: 'NodeI'})
CREATE (NodeJ:CustomType {name: 'NodeJ'})
CREATE (NodeK:CustomType {name: 'NodeK'})
CREATE (NodeL:CustomType {name: 'NodeL'})
CREATE (NodeM:CustomType {name: 'NodeM'})
CREATE (NodeN:CustomType {name: 'NodeN'})
CREATE (NodeO:CustomType {name: 'NodeO'})
CREATE (NodeP:CustomType {name: 'NodeP'})
CREATE (NodeQ:CustomType {name: 'NodeQ'})
CREATE
(Root)-[:CONTAINS]->(NodeA),
(Root)-[:CONTAINS]->(NodeB),
(Root)-[:CONTAINS]->(NodeC),
(NodeA)-[:CONTAINS]->(NodeD),
(NodeA)-[:CONTAINS]->(NodeE),
(NodeA)-[:CONTAINS]->(NodeF),
(NodeE)-[:CONTAINS]->(NodeG),
(NodeE)-[:CONTAINS]->(NodeH),
(NodeF)-[:CONTAINS]->(NodeI),
(NodeF)-[:CONTAINS]->(NodeJ),
(NodeF)-[:CONTAINS]->(NodeK),
(NodeI)-[:CONTAINS]->(NodeL),
(NodeI)-[:CONTAINS]->(NodeM),
(NodeJ)-[:CONTAINS]->(NodeN),
(NodeK)-[:CONTAINS]->(NodeO),
(NodeK)-[:CONTAINS]->(NodeP),
(NodeM)-[:CONTAINS]->(NodeQ);
To be solved challenge
By means of a MATCH-WITH-UNWIND Cypher query I’m successfully able to select a subtree and bind it to a path. Let’s say the subtree spans over the nodes A,E,F,I and J.
Based on this path I need all leaves of the subtree, not the complete tree now.
.
MATCH
path = (:CustomType {name:'NodeA'})-[:CONTAINS*]->(:CustomType {name:'NodeJ'}) /* simplified */
WITH
nodes(path) as selectedPath
/* here: necessary magic to identify the leaf nodes of the subtree */
RETURN
leafNode;
Among other things I tried to solve the requirement with a WHERE NOT(node-->()) approach, but realized this works for leaves of the complete tree only. Unfortunately I was not able to convince the WHERE NOT(node-->()) clause to respect the selected subtree boundaries.
So, how can I find all leaves of a selected subgraph with Cypher and Neo4j? Can you please give me an advice how to solve this challenge? Many thanks in advance for pointing me into the right direction!
You correctly noted that the check node with no children is suitable only for the entire tree. So you need to go through all the relationships in the subtree, and find such a node of the subtree that is as the end of the relationship, but not as the start of the relationship:
MATCH
path = (:CustomType {name:'NodeA'})-[:CONTAINS*]->(:CustomType {name:'NodeJ'})
UNWIND relationShips(path) AS r
WITH collect(DISTINCT endNode(r)) AS endNodes,
collect(DISTINCT startNode(r)) AS startNodes
UNWIND endNodes AS leaf
WITH leaf WHERE NOT leaf IN startNodes
RETURN leaf
According to the neo4j documentation:
CREATE UNIQUE is in the middle of MATCH and CREATE — it will match
what it can, and create what is missing. CREATE UNIQUE will always
make the least change possible to the graph — if it can use parts of
the existing graph, it will.
This sounds great, but CREATE UNIQUE doesn't seem to follow the 'least possible change' rule. e.g., here is some Cypher to create two people:
CREATE (n:Person {name: 'Alice'})
CREATE (n:Person {name: 'Bob'})
CREATE INDEX ON :Person(name)
and here's two CREATE UNIQUE statements, to create a relationship between those people. Since both people already exist in the graph, only the relationships should be newly created:
MATCH (a:Person {name: 'Alice'})
CREATE UNIQUE (a)-[:knows]->(b:Person {name: 'Bob'})
RETURN a
MATCH (a:Person {name: 'Alice'})
CREATE UNIQUE (a)<-[:knows]-(b:Person {name: 'Bob'})
RETURN a
After this, the graph should look like
(Alice)<---KNOWS--->(Bob).
But when you run a MATCH query:
MATCH (a:Person)
RETURN a
it seems that the graph now looks like
(Bob)
(Bob)--KNOWS-->(Alice)--KNOWS-->(Bob);
two extra Bobs have been created.
I looked a bit through the other Cypher commands, but none of them seem intended for this use case: create a link between existing node A and existing node B if B exists, and otherwise create a link between existing node A and a newly created node B. How can this problem best be solved within the Cypher framework?
This query should do what you want (if you always want to end up with a single knows relationship between the 2 nodes):
MATCH (a:Person {name: 'Alice'})
MERGE (b:Person {name: 'Bob'})
MERGE (a)-[:knows]->(b)
RETURN a;
Here is how you can do it with CREATE UNIQUE
MATCH (a:Person {name: 'Alice'}), (b:Person {name:'Bob'})
CREATE UNIQUE (a)-[:knows]->(b), (b)-[:knows]->(a)
You need 2 match clauses otherwise you are always creating the node in the CREATE UNIQUE statement, not matching existing nodes.
I want to add a "created by" relationship on nodes in my database. Any node should be able of having this relationship but there can never be more than one.
Right now my query looks something like this:
MATCH (u:User {email: 'my#mail.com'})
MERGE (n:Node {name: 'Node name'})
ON CREATE SET n.name='Node name', n.attribute='value'
CREATE UNIQUE (n)-[:CREATED_BY {date: '2015-02-23'}]->(u)
RETURN n
As I have understood Cypher there is no way to achieve what I want, the current query will only make sure there are no unique relationships based on TWO nodes, not ONE. So, this will create more CREATED_BY relationships when run for another User and I want to limit the outgoing CREATED_BY relationship to just one for all nodes.
Is there a way to achieve this without running multiple queries involving program logic?
Thanks.
Update
I tried to simplyfy the query by removing implementation details, if it helps here's the updated query based on cybersams response.
MERGE (c:Company {name: 'Test Company'})
ON CREATE SET c.uuid='db764628-5695-40ee-92a7-6b750854ebfa', c.created_at='2015-02-23 23:08:15', c.updated_at='2015-02-23 23:08:15'
WITH c
OPTIONAL MATCH (c)
WHERE NOT (c)-[:CREATED_BY]-()
CREATE (c)-[:CREATED_BY {date: '2015-02-23 23:08:15'}]->(u:User {token: '32ba9d2a2367131cecc53c310cfcdd62413bf18e8048c496ea69257822c0ee53'})
RETURN c
Still not working as expected.
Update #2
I ended up splitting this into two queries.
The problem I found was that there was two possible outcomes as I noticed.
The CREATED_BY relationship was created and (n) was returned using OPTIONAL MATCH, this relationship would always be created if it didn't already exist between (n) and (u), so when changing the email attribute it would re-create the relationship.
The Node (n) was not found (because of not using OPTIONAL MATCH and the WHERE NOT (c)-[:CREATED_BY]-() clause), resulting in no relationship created (yay!) but without getting the (n) back the MERGE query looses all it's meaning I think.
My Solution was the following two queries:
MERGE (n:Node {name: 'Name'})
ON CREATE SET
SET n.attribute='value'
WITH n
OPTIONAL MATCH (n)-[r:CREATED_BY]-()
RETURN c, r
Then I had program logic check the value of r, if there was no relationship I would run the second query.
MATCH (n:Node {name: 'Name'})
MATCH (u:User {email: 'my#email.com'})
CREATE UNIQUE (n)-[:CREATED_BY {date: '2015-02-23'}]->(u)
RETURN n
Unfortunately I couldn't find any real solution to combining this in one single query with Cypher. Sam, thanks! I've selected your answer even though it didn't quite solve my problem, but it was very close.
This should work for you:
MERGE (n:Node {name: 'Node name'})
ON CREATE SET n.attribute='value'
WITH n
OPTIONAL MATCH (n)
WHERE NOT (n)-[:CREATED_BY]->()
CREATE UNIQUE (n)-[:CREATED_BY {date: '2015-02-23'}]->(:User {email: 'my#mail.com'})
RETURN n;
I've removed the starting MATCH clause (because I presume you want to create a CREATED_BY relationship even when that User does not yet exist in the DB), and simplified the ON CREATE to remove the redundant setting of the name property.
I have also added an OPTIONAL MATCH that will only match an n node that does not already have an outgoing CREATED_BY relationship, followed by a CREATE UNIQUE clause that fully specifies the User node.
How to create unique CONSTRAINT to relationship by neo4j cypher?
At present, there is only one kind of CONSTRAINT neo4j will let you create, and that's a UNIQUENESS constraint. That link cites what's in the internal API, and you'll notice there's only one type at present.
Here's a link on how to create a uniqueness constraint.
This lets you assert that a certain property of a node must be unique, but it doesn't say anything about relationships. I don't think it's possible to constrain what sort of relationships can come off of various nodes.
If I understood your problem correctly, you want to enforce uniqueness of certain kind of relation rather than uniqueness of relation's certain attribute. If that's what you want, then you enforce such uniqueness by using "CREATE UNIQUE":
MATCH (root { name: 'root' })
CREATE UNIQUE (root)-[:LOVES]-(someone)
RETURN someone
The Neo4j Manual: Create unique relationships
it seems that a relationship constraint can only enforce the existence of a relationship property but not its uniqueness
CREATE CONSTRAINT ON ()-[like:LIKED]-() ASSERT exists(like.day)
http://neo4j.com/docs/developer-manual/current/cypher/#query-constraints-prop-exist-rels
Though you can't do this as a constraint just yet you can use the following work-around to get similar behavior at the query level (instead of the constraint level) by using MERGE in your queries. You used to be able to use CREATE UNIQUE to do this, but that has since been deprecated, but the CREATE UNIQUE docs here have a good introduction section that covers the details pretty well and shows you how to do alternatively in the non-deprecated MERGE way.
So, you can use these docs to see how you can create unique nodes and relationships through queries using MERGE. Also, since this uniqueness is decided at the query level instead of the constraint level you should be very cautious of accidentally creating duplicate data where it should be unique.
(I'll put the current relevant doc sections provided above for CREATE UNIQUE with the MERGE alternatives here in case they disappear.)
CREATE UNIQUE is in the middle of MATCH and CREATE — it will match what it can, and create what is missing.
We show in the following example how to express using MERGE the same level of uniqueness guaranteed by CREATE UNIQUE for nodes and relationships.
Assume the original set of queries is given by:
MERGE (p:Person {name: 'Joe'})
RETURN p
MATCH (a:Person {name: 'Joe'})
CREATE UNIQUE (a)-[r:LIKES]->(b:Person {name: 'Jill'})-[r1:EATS]->(f:Food {name: 'Margarita Pizza'})
RETURN a
MATCH (a:Person {name: 'Joe'})
CREATE UNIQUE (a)-[r:LIKES]->(b:Person {name: 'Jill'})-[r1:EATS]->(f:Food {name: 'Banana'})
RETURN a
This will create two :Person nodes, a :LIKES relationship between them, and two :EATS relationships from one of the :Person nodes to two :Food nodes. No node or relationship is duplicated.
The following set of queries — using MERGE — will achieve the same result:
MERGE (p:Person {name: 'Joe'})
RETURN p
MATCH (a:Person {name: 'Joe'})
MERGE (b:Person {name: 'Jill'})
MERGE (a)-[r:LIKES]->(b)
MERGE (b)-[r1:EATS]->(f:Food {name: 'Margarita Pizza'})
RETURN a
MATCH (a:Person {name: 'Joe'})
MERGE (b:Person {name: 'Jill'})
MERGE (a)-[r:LIKES]->(b)
MERGE (b)-[r1:EATS]->(f:Food {name: 'Banana'})
RETURN a
We note that all these queries can also be combined into a single, larger query.
The CREATE UNIQUE examples below use the following graph:
--- source: Cypher Manual v3.5: Section 3.18, Introduction
As in Neo4j community edition version 2.3.1 there seems to be no constraints on relations.
neo4j-sh (?)$ schema ls
Indexes
ON :RELTYPE(id) ONLINE (for uniqueness constraint)
Constraints
ON (reltype:RELTYPE) ASSERT reltype.id IS UNIQUE
You can easily create multiple relations with of type RELTYPE and the same id globally or even between the same nodes
MATCH (s:Person {name:"foo"}), (t:Target {name:"target"})
CREATE (s)-[r:RELTYPE {id:"baz"}]-(t)
This constraint seems to be applied to node only , I can not find anything mentioning the relation in the neo4j documentation
http://neo4j.com/docs/stable/rest-api-schema-constraints.html
What I would like to see (but from my reading of the Neo4J documentation is not currently possible) is to constrain (for example) the ACTED_IN relationship:
(:Person)-[ACTED_IN]->(:Movie)
to prevent the erroneous relationship:
(:Movie)-[ACTED_IN]->(:Person)
Obviously, you can find the bad backwards relationships this way but it would be nice to prevent it from happening with a constraint:
match((m:Movie)-[:ACTED_IN]->(p:Person)) return m,p