How do I get all relationships between nodes on a path - neo4j

I am trying to follow a certain relationship type and return all the nodes and (other) relationships on that path but not to follow paths through nodes that are not part of the path.
Below is the live query that I have set up to demonstrate.
http://console.neo4j.org/?id=b6sxoh
In the example I do not want the relationship through B->E->C to be included in the results because there is no 'depends_on' relationship between them.
Below is one of my many attempts... (also in the console).
START me=node:node_auto_index(name='A')
MATCH p=me-[d:depends_on*]->others
WITH me,others
MATCH p=me-[r*]-others
RETURN DISTINCT relationships(p);
I would love some help please!

One way to do this is to iterate through each pair of nodes on the matched path for the pattern "p=me-[d:depends_on*]->others", and find any other relationships between them.
START me=node:node_auto_index(name='A')
MATCH me-[:depends_on*0..]->(previous)-[:depends_on]->last
With previous, last
Match previous-[r]-last
Where type(r) <> 'depends_on'
Return r
Since each matched path for the pattern "me-[d:depends_on*]->others" is augmented with a new relationship as the last relationship, to iterate through all the relationships on the matched path is to iterate through each last relationship on the matched paths. So for each matched path, we capture the start node and the end node of the last relationship as "previous" and "last" and then find the relationships "r" between them, filter "r" with the "Where" clause based on the type of the relationship r, return only those relationships r that are not of the type "depends_on".

Related

Enforce order of relations in multipath queries

I'm looking into neo4j as a Graph database, and variable length path queries will be a very important use case. I now think I've found an example query that Cypher will not support.
The main issue is that I want to treat composed relations as a single relation. Let my give an example: finding co-actors. I've done this using the standard database of movies. The goal is to find all actors that have acted alongside Tom Hanks. This can be found with the query:
MATCH (tom {name: "Tom Hanks"})-[:ACTED_IN]->()<-[:ACTED_IN]-(a:Person) return a
Now, what if we want to find co-actors of co-actors recursively.
We can rewrite the above query to:
MATCH (tom {name: "Tom Hanks"})-[:ACTED_IN*2]-(a:Person) return a
And then it becomes clear we can do this with
MATCH (tom {name: "Tom Hanks"})-[:ACTED_IN*]-(a:Person) return a
Notably, all odd-length paths are excluded because they do not end in a Person.
Now, I have found a query that I cannot figure out how to make recursive:
MATCH (tom {name: "Tom Hanks"})-[:ACTED_IN]->()<-[:DIRECTED]-()-[:DIRECTED]->()<-[:ACTED_IN]-(a:Person) return DISTINCT a
In words, all actors that have a director in common with Tom Hanks.
In order to make this recursive I tried:
MATCH (tom {name: "Tom Hanks"})-[:ACTED_IN|DIRECTED*]-(a:Person) return DISTINCT a
However, (besides not seeming to complete at all). This will also capture co-actors.
That is, it will match paths of the form
()-[:ACTED_IN]->()<-[:ACTED_IN]-()
So what I am wondering is:
can we somehow restrict the order in which relations occur in a multi-path query?
Something like:
MATCH (tom {name: "Tom Hanks"}){-[:ACTED_IN]->()<-[:DIRECTED]-()-[:DIRECTED]->()<-[:ACTED_IN]-}*(a:Person) return DISTINCT a
Where the * applies to everything in the curly braces.
The path expander procs from APOC Procedures should help here, as we added the ability to express repeating sequences of labels, relationships, or both.
In this case, since you want to match on the actor of the pattern rather than the director (or any of the movies in the path), we need to specify which nodes in the path you want to return, which requires either using the labelFilter in addition to the relationshipFilter, or just to use the combined sequence config property to specify the alternating labels/relationships expected, and making sure we use an end node filter on the :Person node at the point in the pattern that you want.
Here's how you would do this after installing APOC:
MATCH (tom:Person {name: "Tom Hanks"})
CALL apoc.path.expandConfig(tom, {sequence:'>Person, ACTED_IN>, *, <DIRECTED, *, DIRECTED>, *, <ACTED_IN', maxLevel:12}) YIELD path
WITH last(nodes(path)) as person, min(length(path)) as distance
RETURN person.name
We would usually use subgraphNodes() for these, since it's efficient at expanding out and pruning paths to nodes we've already seen, but in this case, we want to keep the ability to revisit already visited nodes, as they may occur in further iterations of the sequence, so to get a correct answer we can't use this or any of the procs that use NODE_GLOBAL uniqueness.
Because of this, we need to guard against exploring too many paths, as the permutations of relationships to explore that fit the path will skyrocket, even after we've already found all distinct nodes possible. To avoid this, we'll have to add a maxLevel, so I'm using 12 in this case.
This procedure will also produce multiple paths to the same node, so we're going to get the minimum length of all paths to each node.
The sequence config property lets us specify alternating label and relationship type filterings for each step in the sequence, starting at the starting node. We are using an end node filter symbol, > before the first Person label (>Person) indicating that we only want paths to the Person node at this point in the sequence (as the first element in the sequence it will also be the last element in the sequence as it repeats). We use the wildcard * for the label filter of all other nodes, meaning the nodes are whitelisted and will be traversed no matter what their label is, but we don't want to return any paths to these nodes.
If you want to see all the actors who acted in movies directed by directors who directed Tom Hanks, but who have never acted with Tom, here is one way:
MATCH (tom {name: "Tom Hanks"})-[:ACTED_IN]->(m)
MATCH (m)<-[:ACTED_IN]-(ignoredActor)
WITH COLLECT(DISTINCT m) AS ignoredMovies, COLLECT(DISTINCT ignoredActor) AS ignoredActors
UNWIND ignoredMovies AS movie
MATCH (movie)<-[:DIRECTED]-()-[:DIRECTED]->(m2)
WHERE NOT m2 IN ignoredMovies
MATCH (m2)<-[:ACTED_IN]-(a:Person)
WHERE NOT a IN ignoredActors
RETURN DISTINCT a
The top 2 MATCH clauses are deliberately not combined into one clause, so that the Tom Hanks node will be captured as an ignoredActor. (A MATCH clause filters out any result that use the same relationship twice.)

Cypher - traversal of nodes

I have a simple graph with nodes labeled as Organization and two directed relationship types 'Control' and 'Influence'. My objective is -
Step-1
Given a node, find all nodes connected to it with 'Control' relationship (in any direction)
Step-2
For all nodes found by Step-1, find any outbound 'Influence' relationships (of any length) and include those nodes as well
What I have been able to come up thus far:
MATCH (x:Organization {ORGID: "5621"})-[:Control*1..]-(y) WITH y MATCH y-[:Influence*0..]-(z) RETURN y,z;
Questions
1) This query does not include the starting node, how can I get that in the result?
2) Ideally I'd like to get the relationships in the result also, it just returns the nodes
TIA
Here is one way to do what you want:
MATCH (x:Organization { ORGID: "5621" }), p1 = x-[:Influence*0..]->(z)
WHERE NOT z-[:Influence]->()
WITH x, COLLECT(p1) AS c1
OPTIONAL MATCH x-[:Control*]-(y), p2=y-[:Influence*0..]->(z)
WHERE NOT z-[:Influence]->()
RETURN c1 + COLLECT(p2) AS result;
The query returns a collection of matching Influence paths. The WHERE clauses are used to make sure that each path is as long as possible (to avoid a lot of duplicate subpaths from showing up in the results). Every path is a collection of node(s) separated by relationships, and can consist of just a node if the path has no relationships.

How do I return nodes in neo4j that have a certain relationship, and then return nodes that have a different relationship with the first nodes?

I have a bunch of nodes that "refer" to other nodes. The nodes referred to (refer_to being the relationship) may then have a relationship to another node called changed_to. Those nodes that are related by the changed_to relationship may also have another changed_to relationship with yet another node. I want to return the nodes referred to, but also the nodes that the referred nodes were changed into. I tried a query that returns referred to nodes combined with a union with an optional match of ReferencedNode changed to ResultNode, but I don't think this will work as it will only get me the referenced node plus the first changed to node and nothing after that, assuming that would work at all to begin with. How can I make a query with the described behavior?
Edit:
This is an example of the relationships going on. I would like to return the referenced to node and the node that the referenced node was ultimately became, with some indicator showing that it became that node ultimately.
Can you give some examples of queries you've tried? Here's what I'm thinking:
MATCH path=(n)-[:refer_to]->(o)-[:changed_to*1..5]->(p)
WHERE n.id = {start_id}
RETURN nodes(path), rels(path)
Of course I don't know if you have an id property so that might need to change. Also you'll be passing in a start_id parameter here.
If you want to return the "references" node and the last "changed_to" node if there is one, you can first match the relationship you know is there, then optionally match at variable depth the path that may be there. If there is more than one "changed_to" relationship you will have more than one result item at this point. If you want all the "changed_to" nodes you can return now, but if you want only the last one you can order the result items by path depth descending with a limit of 1 to get the longest path and then return the last node in that path. That query could look something like
MATCH (n)-[:REFERENCES]->(o)
WHERE n.uid = {uid}
OPTIONAL MATCH path=o-[:CHANGED_TO*1..5]->(p)
WITH n, o, path
ORDER BY length(path) DESC
LIMIT 1
RETURN n, o, nodes(path)[-1]
This returns the start node, the "references" node and
nothing when there is no "changed_to" node
the one "changed_to" node when there is only one
the last "changed_to" node when there is more than one
You can test the query in this console. It contains these three cases and you can test them by replacing {uid} above with the values 1,5 and 8 to get the starting nodes for the three paths.

Grouping nodes by their relationship

I've built a simple graph of one node type and two relationship types: IS and ISNOT. IS relationships means that the node pair belongs to the same group, and obviouslly ISNOT represents the not belonging rel.
When I have to get the groups of related nodes I run the following query:
"MATCH (a:Item)-[r1:IS*1..20]-(b:Item) RETURN a,b"
So this returns a lot of a is b results and I added some code to group them after.
What I'd like is to group them modifying the query above, but given my rookie level I haven't yet figured it out. What I'd like is to get one row per group like:
(node1, node3, node5)
(node2,node4,node6)
(node7,node8)
I assume what you call groups are nodes present in a path where all these nodes are connected with a :IS relationship.
I think this query is what you want :
MATCH p=(a:Item)-[r1:IS*1..20]-(b:Item)
RETURN nodes(p) as nodes
Where p is a path describing your pattern, then you return all the nodes present in the path in a collection.
Note that, a simple graph (http://console.neo4j.org/r/ukblc0) :
(item1)-[:IS]-(item2)-[:IS]-(item3)
will return already 6 paths, because you use undericted relationships in the pattern, so there are two possible paths between item1 and item2 for eg.

Cypher query to find all paths with same relationship type

I'm struggling to find a single clean, efficient Cypher query that will let me identify all distinct paths emanating from a start node such that every relationship in the path is of the same type when there are many relationship types.
Here's a simple version of the model:
CREATE (a), (b), (c), (d), (e), (f), (g),
(a)-[:X]->(b)-[:X]->(c)-[:X]->(d)-[:X]->(e),
(a)-[:Y]->(c)-[:Y]->(f)-[:Y]->(g)
In this model (a) has two outgoing relationship types, X and Y. I would like to retrieve all the paths that link nodes along relationship X as well as all the paths that link nodes along relationship Y.
I can do this programmatically outside of cypher by making a series of queries, the first to
retrieve the list of outgoing relationships from the start node, and then a single query (submitted together as a batch) for each relationship. That looks like:
START n=node(1)
MATCH n-[r]->()
RETURN COLLECT(DISTINCT TYPE(r)) as rels;
followed by:
START n=node(1)
MATCH n-[:`reltype_param`*]->()
RETURN p as path;
The above satisfies my need, but requires at minimum 2 round trips to the server (again, assuming I batch together the second set of queries in one transaction).
A single-query approach that works, but is horribly inefficient is the following single Cypher query:
START n=node(1)
MATCH p = n-[r*]->() WHERE
ALL (x in RELATIONSHIPS(p) WHERE TYPE(x) = TYPE(HEAD(RELATIONSHIPS(p))))
RETURN p as path;
That query uses the ALL predicate to filter the relationships along the paths enforcing that each relationship in the path matches the first relationship in the path. This, however, is really just a filter operation on what it essentially a combinatorial explosion of all possible paths --- much less efficient than traversing a relationship of a known, given type first.
I feel like this should be possible with a single Cypher query, but I have not been able to get it right.
Here's a minor optimization, at least non-matching the paths will fail fast:
MATCH n-[r]->()
WITH distinct type(r) AS t
MATCH p = n-[r*]->()
WHERE type(r[-1]) = t // last entry matches
RETURN p AS path
This is probably one of those things that should be in the Java API if you want it to be really performant, though.

Resources