I'm stumped on this one, and I think the answer will be straightforward, so let me cut right to it.
Given a graph created by a query that looks like this:
CREATE (simpsons:Family {name: "Simpson"})
CREATE (homer:Father {name: "Homer"})
CREATE (lisa:Daughter {name: "Lisa"})
CREATE (snowball:Pet {name: "snowball"})
CREATE (lisa)-[:owns]->(snowball)-[:has]->(:Item {name: "catnip"})
CREATE (homer)-[:has]->(:Item {name: "beer"})
CREATE (lisa)-[:has]->(:Item {name: "saxophone"})
CREATE (lisa)<-[:memberOf]-(simpsons)-[:memberOf]->(homer)
Why would a query that looks like this fail?
MATCH (f:Family),
(f)-[*1..10]-(lisa:Daughter),
(lisa)-[*1..10]-(:Item {name: "saxophone"}),
(f)-[*1..10]-(snowball:Pet),
(snowball)-[*1..10]-(:Item {name: "catnip"})
RETURN f;
Taken separately, its two components both find matches.
MATCH (f:Family),
(f)-[*1..10]-(lisa:Daughter),
(lisa)-[*1..10]-(:Item {name: "saxophone"})
RETURN f;
and
MATCH (f:Family),
(f)-[*1..10]-(snowball:Pet),
(snowball)-[*1..10]-(:Item {name: "catnip"})
RETURN f;
But when pieced together there are no matches.
I have tried PROFILEing the query and it seems like Cypher works backwards from Snowball. It can make that first connection between the family and Snowball.
After that it does a VarLengthExpand(All)
snowball, f, lisa
(f)-[ UNNAMED22:*..10]-(lisa)
Which yields 6 rows. We then drop to 0 rows with this Filter:
snowball, f, lisa
lisa: Daughter
I can get the match to work if I declare a connection between the family and a daughter in the first line of the match statement, but for reasons having to do with my particular application this is not a useful workaround.
MATCH (f:Family)-[*1..10]-(lisa:Daughter),
(lisa)-[*1..10]-(:Item {name: "saxophone"}),
(lisa)-[*1..10]-(snowball:Pet {name: "snowball"})-[*1..10]-(:Item {name: "catnip"})
RETURN f;
I think I'm missing something about how Cypher searches for these patterns. Does anyone have insight into what that might be? Thank you for your time!
This isn't a Cypher bug; it is a side effect of relationship uniqueness within a given MATCH pattern.
From the uniqueness section of the docs:
While pattern matching, Neo4j makes sure to not include matches where the same graph relationship is found multiple times in a single pattern.
This type of uniqueness is usually correct, and is great for preventing infinite loops when using variable-length relationships which traverse a cycle.
Relationship uniqueness is preserved for patterns from a MATCH or an OPTIONAL MATCH, even when these include multiple comma separated paths, as in your case.
You have all of the paths within the pattern of a single MATCH, so relationships must be unique; if used in one path, they will not be reused for another path.
The real problem is here: (f)-[*1..10]-(snowball:Pet). The only route from the family to Snowball runs through Lisa, so this path needs the :memberOf relationship between the Simpsons and Lisa, and you've already traversed that relationship when you did (f)-[*1..10]-(lisa:Daughter) earlier. Since the relationship cannot be reused, one of those two paths cannot be matched, so the entire MATCH fails: no such pattern exists with unique relationships.
Note that when you break up the single MATCH into multiple MATCHes, as in stdob--'s answer, the query succeeds. There is no uniqueness in play here between separate MATCH clauses.
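As a rough sketch of what splitting the pattern could look like (not necessarily the exact query from that answer), the Lisa paths and the Snowball paths can each get their own MATCH clause, using the labels and properties from the question's CREATE statements:

```cypher
// First MATCH: family -> Lisa -> saxophone.
// Relationship uniqueness applies only inside this clause.
MATCH (f:Family)-[*1..10]-(lisa:Daughter),
      (lisa)-[*1..10]-(:Item {name: "saxophone"})
// Second MATCH: family -> Snowball -> catnip.
// The :memberOf relationship to Lisa may be traversed again here,
// because it was used in a different MATCH clause.
MATCH (f)-[*1..10]-(snowball:Pet),
      (snowball)-[*1..10]-(:Item {name: "catnip"})
RETURN DISTINCT f;
```

DISTINCT is there only because variable-length patterns can reach the same family through more than one path.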
I'm looking into neo4j as a Graph database, and variable length path queries will be a very important use case. I now think I've found an example query that Cypher will not support.
The main issue is that I want to treat composed relations as a single relation. Let me give an example: finding co-actors. I've done this using the standard database of movies. The goal is to find all actors that have acted alongside Tom Hanks. This can be found with the query:
MATCH (tom {name: "Tom Hanks"})-[:ACTED_IN]->()<-[:ACTED_IN]-(a:Person) return a
Now, what if we want to find co-actors of co-actors recursively?
We can rewrite the above query to:
MATCH (tom {name: "Tom Hanks"})-[:ACTED_IN*2]-(a:Person) return a
And then it becomes clear we can do this with
MATCH (tom {name: "Tom Hanks"})-[:ACTED_IN*]-(a:Person) return a
Notably, all odd-length paths are excluded because they do not end in a Person.
Now, I have found a query that I cannot figure out how to make recursive:
MATCH (tom {name: "Tom Hanks"})-[:ACTED_IN]->()<-[:DIRECTED]-()-[:DIRECTED]->()<-[:ACTED_IN]-(a:Person) return DISTINCT a
In words, all actors that have a director in common with Tom Hanks.
In order to make this recursive I tried:
MATCH (tom {name: "Tom Hanks"})-[:ACTED_IN|DIRECTED*]-(a:Person) return DISTINCT a
However, besides not seeming to complete at all, this will also capture co-actors.
That is, it will match paths of the form
()-[:ACTED_IN]->()<-[:ACTED_IN]-()
So what I am wondering is:
Can we somehow restrict the order in which relations occur in a multi-path query?
Something like:
MATCH (tom {name: "Tom Hanks"}){-[:ACTED_IN]->()<-[:DIRECTED]-()-[:DIRECTED]->()<-[:ACTED_IN]-}*(a:Person) return DISTINCT a
Where the * applies to everything in the curly braces.
The path expander procs from APOC Procedures should help here, as we added the ability to express repeating sequences of labels, relationships, or both.
In this case, since you want to match on the actor of the pattern rather than the director (or any of the movies in the path), we need to specify which nodes in the path you want to return. That requires either using the labelFilter in addition to the relationshipFilter, or just using the combined sequence config property to specify the alternating labels/relationships expected, making sure we use an end node filter on the :Person node at the point in the pattern that you want.
Here's how you would do this after installing APOC:
MATCH (tom:Person {name: "Tom Hanks"})
CALL apoc.path.expandConfig(tom, {sequence:'>Person, ACTED_IN>, *, <DIRECTED, *, DIRECTED>, *, <ACTED_IN', maxLevel:12}) YIELD path
WITH last(nodes(path)) as person, min(length(path)) as distance
RETURN person.name
We would usually use subgraphNodes() for these, since it's efficient at expanding out and pruning paths to nodes we've already seen, but in this case, we want to keep the ability to revisit already visited nodes, as they may occur in further iterations of the sequence, so to get a correct answer we can't use this or any of the procs that use NODE_GLOBAL uniqueness.
Because of this, we need to guard against exploring too many paths, as the permutations of relationships to explore that fit the path will skyrocket, even after we've already found all distinct nodes possible. To avoid this, we'll have to add a maxLevel, so I'm using 12 in this case.
This procedure will also produce multiple paths to the same node, so we're going to get the minimum length of all paths to each node.
The sequence config property lets us specify alternating label and relationship type filterings for each step in the sequence, starting at the starting node. We are using an end node filter symbol, > before the first Person label (>Person) indicating that we only want paths to the Person node at this point in the sequence (as the first element in the sequence it will also be the last element in the sequence as it repeats). We use the wildcard * for the label filter of all other nodes, meaning the nodes are whitelisted and will be traversed no matter what their label is, but we don't want to return any paths to these nodes.
If you want to see all the actors who acted in movies directed by directors who directed Tom Hanks, but who have never acted with Tom, here is one way:
MATCH (tom {name: "Tom Hanks"})-[:ACTED_IN]->(m)
MATCH (m)<-[:ACTED_IN]-(ignoredActor)
WITH COLLECT(DISTINCT m) AS ignoredMovies, COLLECT(DISTINCT ignoredActor) AS ignoredActors
UNWIND ignoredMovies AS movie
MATCH (movie)<-[:DIRECTED]-()-[:DIRECTED]->(m2)
WHERE NOT m2 IN ignoredMovies
MATCH (m2)<-[:ACTED_IN]-(a:Person)
WHERE NOT a IN ignoredActors
RETURN DISTINCT a
The top 2 MATCH clauses are deliberately not combined into one clause, so that the Tom Hanks node will be captured as an ignoredActor. (A single MATCH clause filters out any result that uses the same relationship twice.)
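To illustrate that parenthetical, here is a small sketch assuming the standard movies dataset from the question. The two queries differ only in whether the pattern is split across MATCH clauses, and they are meant to be run separately:

```cypher
// One MATCH: both ACTED_IN hops belong to the same pattern, so they must
// be different relationships. Tom's own ACTED_IN cannot be the second hop,
// so Tom is not counted.
MATCH (tom {name: "Tom Hanks"})-[:ACTED_IN]->(m)<-[:ACTED_IN]-(actor)
RETURN count(DISTINCT actor) AS actorsWithoutTom;

// Two MATCHes: uniqueness is not enforced across clauses, so the second
// hop may reuse Tom's relationship and Tom is counted among the actors.
MATCH (tom {name: "Tom Hanks"})-[:ACTED_IN]->(m)
MATCH (m)<-[:ACTED_IN]-(actor)
RETURN count(DISTINCT actor) AS actorsWithTom;
```

If the dataset has a single Tom Hanks node, the second count should be exactly one higher than the first.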
We are trying to achieve something similar, but I am just using a general example below:
(A) - [r:ACT_IN {role:'main actor'}]->(m:movie)
(A) - [r:ACT_IN {role:'side actor'}]->(m:movie)
Is it possible to create such scenarios? When I try to add the second row, it doesn't add anything.
Thanks,
Pk!
It is certainly possible. You did not provide your exact Cypher queries, so it is not clear where you went wrong.
However, here is a working example (assuming that the DB already contains those Actor and Movie nodes):
MATCH (a:Actor {name: 'Keanu Reeves'}), (m:Movie {title: 'The Matrix'})
CREATE
(a)-[r1:ACT_IN {role:'main actor'}]->(m),
(a)-[r2:ACT_IN {role:'side actor'}]->(m);
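If the query might be run more than once, a MERGE-based variant is one possible alternative. This is only a sketch under the same assumption that the Actor and Movie nodes already exist. MERGE matches a relationship on its type and the properties given in the pattern, so two different role values yield two distinct relationships, and rerunning the query does not add duplicates:

```cypher
MATCH (a:Actor {name: 'Keanu Reeves'}), (m:Movie {title: 'The Matrix'})
// Each MERGE creates its relationship only if one with this exact
// type and role property does not already exist between a and m.
MERGE (a)-[:ACT_IN {role: 'main actor'}]->(m)
MERGE (a)-[:ACT_IN {role: 'side actor'}]->(m);
```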
Is there a better way to do this than 3 separate queries?
MATCH (you:User {name: "Alexander"})-[:LIKES]->(youLike:User)
RETURN youLike
MATCH (likesYou:User)-[:LIKES]->(you:User {name: "Alexander"})
RETURN likesYou
MATCH (mutualLike:User)-[:LIKES]->(you:User {name: "Alexander"})-[:LIKES]->(mutualLike:User)
RETURN mutualLike
Here is a shot at a single query.
Essentially, find yourself first, optionally find people that you like and collect them, optionally find people that like you and collect them, then return both collections and the intersection of the two.
By matching the node that identifies you and reusing it you match yourself once instead of three times.
Using the collection filter function allows you to find the intersection of the two :LIKES populations without rematching those nodes.
The OPTIONAL keyword allows the query to continue if either :LIKES population is empty.
MATCH (you:User {name: "Alexander"})
WITH you
OPTIONAL MATCH(you)-[:LIKES]->(youLike:User)
WITH you, collect(youLike) as youLike
OPTIONAL MATCH (likesYou:User)-[:LIKES]->(you)
WITH you, youLike, collect(likesYou) as likesYou
RETURN you
, youLike
, likesYou
, filter(n in youLike where n in likesYou) as mutualLike
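One caveat for newer installations: the filter() function was removed in Neo4j 4.0. A list comprehension expresses the same intersection, so a version of the query for current releases might look like this sketch:

```cypher
MATCH (you:User {name: "Alexander"})
OPTIONAL MATCH (you)-[:LIKES]->(youLike:User)
WITH you, collect(youLike) AS youLike
OPTIONAL MATCH (likesYou:User)-[:LIKES]->(you)
WITH you, youLike, collect(likesYou) AS likesYou
RETURN you,
       youLike,
       likesYou,
       // keep only the users that appear in both collections
       [n IN youLike WHERE n IN likesYou] AS mutualLike
```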
I want to add a "created by" relationship on nodes in my database. Any node should be able to have this relationship, but there can never be more than one.
Right now my query looks something like this:
MATCH (u:User {email: 'my#mail.com'})
MERGE (n:Node {name: 'Node name'})
ON CREATE SET n.name='Node name', n.attribute='value'
CREATE UNIQUE (n)-[:CREATED_BY {date: '2015-02-23'}]->(u)
RETURN n
As I understand Cypher, there is no way to achieve what I want: the current query will only make sure there are no duplicate relationships based on TWO nodes, not ONE. So, this will create more CREATED_BY relationships when run for another User, and I want to limit the outgoing CREATED_BY relationship to just one for all nodes.
Is there a way to achieve this without running multiple queries involving program logic?
Thanks.
Update
I tried to simplify the query by removing implementation details. If it helps, here's the updated query based on cybersam's response.
MERGE (c:Company {name: 'Test Company'})
ON CREATE SET c.uuid='db764628-5695-40ee-92a7-6b750854ebfa', c.created_at='2015-02-23 23:08:15', c.updated_at='2015-02-23 23:08:15'
WITH c
OPTIONAL MATCH (c)
WHERE NOT (c)-[:CREATED_BY]-()
CREATE (c)-[:CREATED_BY {date: '2015-02-23 23:08:15'}]->(u:User {token: '32ba9d2a2367131cecc53c310cfcdd62413bf18e8048c496ea69257822c0ee53'})
RETURN c
Still not working as expected.
Update #2
I ended up splitting this into two queries.
The problem, as I noticed, was that there were two possible outcomes:
The CREATED_BY relationship was created and (n) was returned using OPTIONAL MATCH. This relationship would always be created if it didn't already exist between (n) and (u), so when changing the email attribute it would re-create the relationship.
The Node (n) was not found (because of not using OPTIONAL MATCH and the WHERE NOT (c)-[:CREATED_BY]-() clause), resulting in no relationship created (yay!), but without getting (n) back the MERGE query loses all its meaning, I think.
My Solution was the following two queries:
MERGE (n:Node {name: 'Name'})
ON CREATE SET n.attribute='value'
WITH n
OPTIONAL MATCH (n)-[r:CREATED_BY]-()
RETURN n, r
Then I had program logic check the value of r, if there was no relationship I would run the second query.
MATCH (n:Node {name: 'Name'})
MATCH (u:User {email: 'my#email.com'})
CREATE UNIQUE (n)-[:CREATED_BY {date: '2015-02-23'}]->(u)
RETURN n
Unfortunately I couldn't find any real solution to combining this in one single query with Cypher. Sam, thanks! I've selected your answer even though it didn't quite solve my problem, but it was very close.
This should work for you:
MERGE (n:Node {name: 'Node name'})
ON CREATE SET n.attribute='value'
WITH n
OPTIONAL MATCH (n)
WHERE NOT (n)-[:CREATED_BY]->()
CREATE UNIQUE (n)-[:CREATED_BY {date: '2015-02-23'}]->(:User {email: 'my#mail.com'})
RETURN n;
I've removed the starting MATCH clause (because I presume you want to create a CREATED_BY relationship even when that User does not yet exist in the DB), and simplified the ON CREATE to remove the redundant setting of the name property.
I have also added an OPTIONAL MATCH that will only match an n node that does not already have an outgoing CREATED_BY relationship, followed by a CREATE UNIQUE clause that fully specifies the User node.
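For comparison, one way this might be folded into a single query is the FOREACH/CASE conditional idiom. The sketch below is an assumption-laden variant, not a tested solution for the asker's schema: it assumes the User already exists (as in the question's original MATCH), and it is not safe against two concurrent runs for the same node.

```cypher
MATCH (u:User {email: 'my#mail.com'})
MERGE (n:Node {name: 'Node name'})
  ON CREATE SET n.attribute = 'value'
WITH n, u
// Count existing outgoing CREATED_BY relationships; count() ignores nulls,
// so a node without one still yields a row with existing = 0.
OPTIONAL MATCH (n)-[r:CREATED_BY]->()
WITH n, u, count(r) AS existing
// The CASE yields a one-element list only when no relationship exists,
// so the CREATE inside FOREACH runs at most once.
FOREACH (_ IN CASE WHEN existing = 0 THEN [1] ELSE [] END |
  CREATE (n)-[:CREATED_BY {date: '2015-02-23'}]->(u)
)
RETURN n
```

Because the condition lives inside FOREACH rather than in a WHERE clause, n is returned whether or not a relationship was created.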
How to create unique CONSTRAINT to relationship by neo4j cypher?
At present, there is only one kind of CONSTRAINT neo4j will let you create, and that's a UNIQUENESS constraint. That link cites what's in the internal API, and you'll notice there's only one type at present.
Here's a link on how to create a uniqueness constraint.
This lets you assert that a certain property of a node must be unique, but it doesn't say anything about relationships. I don't think it's possible to constrain what sort of relationships can come off of various nodes.
If I understood your problem correctly, you want to enforce uniqueness of a certain kind of relationship rather than uniqueness of a certain attribute of a relationship. If that's what you want, then you enforce such uniqueness by using "CREATE UNIQUE":
MATCH (root { name: 'root' })
CREATE UNIQUE (root)-[:LOVES]-(someone)
RETURN someone
The Neo4j Manual: Create unique relationships
It seems that a relationship constraint can only enforce the existence of a relationship property, but not its uniqueness:
CREATE CONSTRAINT ON ()-[like:LIKED]-() ASSERT exists(like.day)
http://neo4j.com/docs/developer-manual/current/cypher/#query-constraints-prop-exist-rels
Though you can't do this as a constraint just yet, you can use the following workaround to get similar behavior at the query level (instead of the constraint level) by using MERGE in your queries. You used to be able to use CREATE UNIQUE to do this, but that has since been deprecated. The CREATE UNIQUE docs here have a good introduction section that covers the details pretty well and shows how to do the same thing with the non-deprecated MERGE.
So, you can use these docs to see how you can create unique nodes and relationships through queries using MERGE. Also, since this uniqueness is decided at the query level instead of the constraint level you should be very cautious of accidentally creating duplicate data where it should be unique.
(I'll put the current relevant doc sections provided above for CREATE UNIQUE with the MERGE alternatives here in case they disappear.)
CREATE UNIQUE is in the middle of MATCH and CREATE — it will match what it can, and create what is missing.
We show in the following example how to express using MERGE the same level of uniqueness guaranteed by CREATE UNIQUE for nodes and relationships.
Assume the original set of queries is given by:
MERGE (p:Person {name: 'Joe'})
RETURN p
MATCH (a:Person {name: 'Joe'})
CREATE UNIQUE (a)-[r:LIKES]->(b:Person {name: 'Jill'})-[r1:EATS]->(f:Food {name: 'Margarita Pizza'})
RETURN a
MATCH (a:Person {name: 'Joe'})
CREATE UNIQUE (a)-[r:LIKES]->(b:Person {name: 'Jill'})-[r1:EATS]->(f:Food {name: 'Banana'})
RETURN a
This will create two :Person nodes, a :LIKES relationship between them, and two :EATS relationships from one of the :Person nodes to two :Food nodes. No node or relationship is duplicated.
The following set of queries — using MERGE — will achieve the same result:
MERGE (p:Person {name: 'Joe'})
RETURN p
MATCH (a:Person {name: 'Joe'})
MERGE (b:Person {name: 'Jill'})
MERGE (a)-[r:LIKES]->(b)
MERGE (b)-[r1:EATS]->(f:Food {name: 'Margarita Pizza'})
RETURN a
MATCH (a:Person {name: 'Joe'})
MERGE (b:Person {name: 'Jill'})
MERGE (a)-[r:LIKES]->(b)
MERGE (b)-[r1:EATS]->(f:Food {name: 'Banana'})
RETURN a
We note that all these queries can also be combined into a single, larger query.
The CREATE UNIQUE examples below use the following graph:
--- source: Cypher Manual v3.5: Section 3.18, Introduction
As of Neo4j Community Edition 2.3.1, there seem to be no constraints on relationships.
neo4j-sh (?)$ schema ls
Indexes
ON :RELTYPE(id) ONLINE (for uniqueness constraint)
Constraints
ON (reltype:RELTYPE) ASSERT reltype.id IS UNIQUE
You can easily create multiple relationships of type RELTYPE with the same id, globally or even between the same nodes:
MATCH (s:Person {name:"foo"}), (t:Target {name:"target"})
CREATE (s)-[r:RELTYPE {id:"baz"}]->(t)
This constraint seems to apply to nodes only; I cannot find anything mentioning relationships in the Neo4j documentation:
http://neo4j.com/docs/stable/rest-api-schema-constraints.html
What I would like to see (but from my reading of the Neo4J documentation is not currently possible) is to constrain (for example) the ACTED_IN relationship:
(:Person)-[:ACTED_IN]->(:Movie)
to prevent the erroneous relationship:
(:Movie)-[:ACTED_IN]->(:Person)
Obviously, you can find the bad backwards relationships this way but it would be nice to prevent it from happening with a constraint:
MATCH (m:Movie)-[:ACTED_IN]->(p:Person) RETURN m, p