Why is a single Neo4j relationship shown twice in Cypher query results? - neo4j

Let's consider a trivial graph with a directed relationship:
CREATE
(`0` :Car {value:"Ford"})
, (`1` :Car {value:"Subaru"})
, (`0`)-[:`DOCUMENT` {value:"DOC-1"}]->(`1`);
The following query MATCH (n1:Car)-[r:DOCUMENT]-(n2:Car) RETURN * returns:
╒══════════════════╤══════════════════╤═════════════════╕
│"n1" │"n2" │"r" │
╞══════════════════╪══════════════════╪═════════════════╡
│{"value":"Subaru"}│{"value":"Ford"} │{"value":"DOC-1"}│
├──────────────────┼──────────────────┼─────────────────┤
│{"value":"Ford"} │{"value":"Subaru"}│{"value":"DOC-1"}│
└──────────────────┴──────────────────┴─────────────────┘
The graph defines only a Ford->Subaru relationship, why are two relationships?
How to interpret the reversed one (line 1; not specified in the CREATE) statement?
Note: This is a follow-up to Convert multiple relationships between 2 nodes to a single one with weight asked by me earlier. I solved my problem, but I'm not convinced my answer is the best solution.

Your MATCH statement here doesn't specify the direction, therefore there are two possible paths that will match the pattern (remember that the ordering of nodes in the path is important and distinguishes paths from each other), thus your two answers.
If you specify the direction of the relationship instead you'll find there is only one possible path that matches:
MATCH (n1:Car)-[r:DOCUMENT]->(n2:Car)
RETURN *
As for the question of why we get two paths back when we omit the direction, remember that paths are order-sensitive: two paths that have the same elements but with a different order of the elements are different paths.
To help put this into perspective, consider the following two queries:
# Query 1
MATCH (n1:Car)-[r:DOCUMENT]-(n2:Car)
WHERE n1.value = 'Ford'
RETURN *
╒══════════════════╤══════════════════╤═════════════════╕
│"n1" │"n2" │"r" │
╞══════════════════╪══════════════════╪═════════════════╡
│{"value":"Ford"} │{"value":"Subaru"}│{"value":"DOC-1"}│
└──────────────────┴──────────────────┴─────────────────┘
# Query 2
MATCH (n1:Car)-[r:DOCUMENT]-(n2:Car)
WHERE n1.value = 'Subaru'
RETURN *
╒══════════════════╤══════════════════╤═════════════════╕
│"n1" │"n2" │"r" │
╞══════════════════╪══════════════════╪═════════════════╡
│{"value":"Subaru"}│{"value":"Ford"} │{"value":"DOC-1"}│
└──────────────────┴──────────────────┴─────────────────┘
Conceptually (and also used by the planner, in absence of indexes), to get to each of the above results you start off with the results of the full match as in your description, then filter to the only one which meets the given criteria.
The results above would not be consistent with the original directionless match query if that original query only returned a single row instead of two.
Additional information from the OP
It will take a while to wrap my head around it, but it does work this way and here's a piece of documentation to confirm it's by design:
When your pattern contains a bound relationship, and that relationship pattern doesn’t specify direction, Cypher will try to match the relationship in both directions.
MATCH (a)-[r]-(b)
WHERE id(r)= 0
RETURN a,b
This returns the two connected nodes, once as the start node, and once as the end node.

Related

Neo4J Matching Nodes Based on Multiple Relationships

I had another thread about this where someone suggested to do
MATCH (p:Person {person_id: '123'})
WHERE ANY(x IN $names WHERE
EXISTS((p)-[:BELONGS]-(:Face)-[:CORRESPONDS]-(:Image)-[:HAS_ACCESS_TO]-(:Dias {group_name: x})))
MATCH path=(p)-[:ASSOCIATED_WITH]-(:Person)
RETURN path
This does what I need it to, returns nodes that fit the criteria without returning the relationships, but now I need to include another param that is a list.
....(:Dias {group_name: x, second_name: y}))
I'm unsure of the syntax.. here's what I tried
WHERE ANY(x IN $names and y IN $names_2 WHERE..
this gives me a syntax error :/
Since the ANY() function can only iterate over a single list, it would be difficult to continue to use that for iteration over 2 lists (but still possible, if you create a single list with all possible x/y combinations) AND also be efficient (since each combination would be tested separately).
However, the new existenial subquery synatx introduced in neo4j 4.0 will be very helpful for this use case (I assume the 2 lists are passed as the parameters names1 and names2):
MATCH (p:Person {person_id: '123'})
WHERE EXISTS {
MATCH (p)-[:BELONGS]-(:Face)-[:CORRESPONDS]-(:Image)-[:HAS_ACCESS_TO]-(d:Dias)
WHERE d.group_name IN $names1 AND d.second_name IN $names2
}
MATCH path=(p)-[:ASSOCIATED_WITH]-(:Person)
RETURN path
By the way, here are some more tips:
If it is possible to specify the direction of each relationship in your query, that would help to speed up the query.
If it is possible to remove any node labels from a (sub)query and still get the same results, that would also be faster. There is an exception, though: if the (sub)query has no variables that are already bound to a value, then you would normally want to specify the node label for the one node that would be used to kick off that (sub)query (you can do a PROFILE to see which node that would be).

Query intersection of Paths in Neo4j using Cypher

Having this query working in Cypher (Neo4j):
MATCH p=(g:Node)-[:FOLLOWED_BY *2..2]->(g2:Node)
WHERE g.group=10 AND g2.group=10
RETURN p
which returns all possible paths belonging a specific group (group is just a property to classify nodes), I am struggling to get a query that returns the paths in common between both collection of paths. It would be something like this:
MATCH p=(g:Node)-[:FOLLOWED_BY *2..2]->(g2:Node)
WHERE g.group=10 AND g2.group=10
MATCH p=(g3:Node)-[:FOLLOWED_BY *2..2]->(g4:Node)
WHERE g3.group=15 AND g4.group=15
RETURN INTERSECTION(path1, path2)
Of course I made that up. The goal is to get all the paths in common between both queries.
The start/end nodes of your 2 MATCHes have different groups, so they can never find common paths.
Therefore, when you ask for "paths in common", I assume you actually want to find the shared middle nodes (between the 2 sets of 3-node paths). If so, this query should work:
MATCH p1=(g:Node)-[:FOLLOWED_BY *2]->(g2:Node)
WHERE g.group=10 AND g2.group=10
WITH COLLECT(DISTINCT NODES(p1)[1]) AS middle1
MATCH p2=(g3:Node)-[:FOLLOWED_BY *2]->(g4:Node)
WHERE g3.group=15 AND g4.group=15 AND NODES(p2)[1] IN middle1
RETURN DISTINCT NODES(p2)[1] AS common_middle_node;

NEO4J nodes under a node filter by relationship

How can we add a relationship to the query.
Say A-[C01]-B-[C02]-D and A-[C01]-B-[C03]-E
C01 C02 C03 are relationship codes I want to get output
B E
because I want only nodes that can be reached unbroken by C01 or C03
How can I get this result in Cypher?
You may want to clarify, what you're asking for seems like a very simple case of matching. You may want to provide some more info, such as node labels and how you're matching to your start nodes, since without these we have to make things up for example code.
MATCH (a:Thing)
WHERE a.ID = 123
WITH a
MATCH (a)-[:C01|C03*]->(b:Thing)
RETURN b
The key here is specifying multiple relationship types to traverse, using * for multiplicity, so it will match on all nodes that can be reached by any chain of those relationships.

Cypher Query not returning nonexistent relationships

I have a graph database where there are user and interest nodes which are connected by IS_INTERESTED relationship. I want to find interests which are not selected by a user. I wrote this query and it is not working
OPTIONAL MATCH (u:User{userId : 1})-[r:IS_INTERESTED] -(i:Interest)
WHERE r is NULL
Return i.name as interest
According to answers to similar questions on SO (like this one), the above query is supposed to work.However,in this case it returns null. But when running the following query it works as expected:
MATCH (u:User{userId : 1}), (i:Interest)
WHERE NOT (u) -[:IS_INTERESTED] -(i)
return i.name as interest
The reason I don't want to run the above query is because Neo4j gives a warning:
This query builds a cartesian product between disconnected patterns.
If a part of a query contains multiple disconnected patterns, this
will build a cartesian product between all those parts. This may
produce a large amount of data and slow down query processing. While
occasionally intended, it may often be possible to reformulate the
query that avoids the use of this cross product, perhaps by adding a
relationship between the different parts or by using OPTIONAL MATCH
(identifier is: (i))
What am I doing wrong in the first query where I use OPTIONAL MATCH to find nonexistent relationships?
1) MATCH is looking for the pattern as a whole, and if can not find it in its entirety - does not return anything.
2) I think that this query will be effective:
// Take all user interests
MATCH (u:User{userId: 1})-[r:IS_INTERESTED]-(i:Interest)
WITH collect(i) as interests
// Check what interests are not included
MATCH (ni:Interest) WHERE NOT ni IN interests
RETURN ni.name
When your OPTIONAL MATCH query does not find a match, then both r AND i must be NULL. After all, since there is no relationship, there is no way get the nodes that it points to.
A WHERE directly after the OPTIONAL MATCH is pulled into the evaluation.
If you want to post-filter you have to use a WITH in between.
MATCH (u:User{userId : 1})
OPTIONAL MATCH (u)-[r:IS_INTERESTED] -(i:Interest)
WITH r,i
WHERE r is NULL
Return i.name as interest

Grouping nodes by their relationship

I've built a simple graph of one node type and two relationship types: IS and ISNOT. IS relationships means that the node pair belongs to the same group, and obviouslly ISNOT represents the not belonging rel.
When I have to get the groups of related nodes I run the following query:
"MATCH (a:Item)-[r1:IS*1..20]-(b:Item) RETURN a,b"
So this returns a lot of a is b results and I added some code to group them after.
What I'd like is to group them modifying the query above, but given my rookie level I haven't yet figured it out. What I'd like is to get one row per group like:
(node1, node3, node5)
(node2,node4,node6)
(node7,node8)
I assume what you call groups are nodes present in a path where all these nodes are connected with a :IS relationship.
I think this query is what you want :
MATCH p=(a:Item)-[r1:IS*1..20]-(b:Item)
RETURN nodes(p) as nodes
Where p is a path describing your pattern, then you return all the nodes present in the path in a collection.
Note that, a simple graph (http://console.neo4j.org/r/ukblc0) :
(item1)-[:IS]-(item2)-[:IS]-(item3)
will return already 6 paths, because you use undericted relationships in the pattern, so there are two possible paths between item1 and item2 for eg.

Resources