How *not* to return the relationships that link a node to itself in Neo4j

In the database that I am using, nodes often have multiple relationships to themselves, which makes the resulting graph very messy. As this is for a presentation, how do we structure a Cypher query that does not return the self-referencing relationships?
I have tried
match p=((n:actor) -[*1..3]-> (nd:movie)) where n.name='Craig' and nd.name='Pride_and_prejudice' and not (n)-[]->(n) return p
but it didn't give the desired result.

If you have a lot of relationships from actor nodes to themselves, a variable-length path query might not be ideal: it will always include the self-referencing relationships, which hurts performance and returns too many results. One solution is to MATCH the first step explicitly and filter on the label:
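MATCH p=( (n:actor)-[r1]-(n1)-[*0..2]->(nd:movie) )
WHERE NOT n1:actor
RETURN ...
The *0..2 part also covers the case where n1 is already the movie node (zero further hops). Labels are case-sensitive, so the label must be written exactly as it is stored (movie in the question).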
Alternatively, you can filter the variable length path for a property as described here: http://neo4j.com/docs/stable/query-match.html#match-match-with-properties-on-a-variable-length-path
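As for why the attempt in the question did not help: NOT (n)-[]->(n) only tests whether the start node n has any self-relationship, so it keeps or discards every path from n as a whole instead of removing self-relationships from the paths. A minimal sketch of a per-path filter, assuming the labels and property names used in the question, might look like this:

```cypher
// Keep only paths in which no relationship starts and ends on the same node.
MATCH p = (n:actor)-[*1..3]->(nd:movie)
WHERE n.name = 'Craig'
  AND nd.name = 'Pride_and_prejudice'
  AND none(r IN relationships(p) WHERE startNode(r) = endNode(r))
RETURN p
```

This only filters the returned paths; it is not claimed to be faster than the label-based approach.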

Related

Count the number of Relationships between two specific Nodes - Neo4j / Cypher

I would like to input two specific nodes and return the number of relationships along the path that connects them.
(There is only 1 path possible in every case)
In some cases, two specific nodes are related through two relationships like this:
(Tim)-[]-()-[]-(Bill)
Should return 2 (relationships).
In other cases there are more nodes between my specific start and end nodes. Like this:
(Tim)-[]-()-[]-()-[]-()-[]-(Bill)
Should return 4 (relationships).
I have two types of relationships that could exist between nodes, so I need to avoid being specific about the type of relationship if possible.
I am new to this and searched extensively before asking, but no one seemed to discuss relationships between specific nodes.
Many thanks for your help!
This query should work:
MATCH p = (:Person {name:'Tim'})-[*]->(:Person {name:'Bill'})
RETURN length(p)
That is: return the length() of path p.
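The patterns in the question are written without arrows, so the relationships along the path may not all point from Tim towards Bill. In that case a directed pattern would find nothing. A hedged variant that ignores direction, keeping the Person label and name property assumed by the answer, could be:

```cypher
// Undirected variable-length match; length(p) is the number of relationships in the path.
// The Person label and name property are assumptions carried over from the answer.
MATCH p = (:Person {name: 'Tim'})-[*]-(:Person {name: 'Bill'})
RETURN length(p) AS relationshipCount
```

On a large graph an unbounded undirected expansion can be expensive, so an upper bound such as [*..10] (an arbitrary illustrative value) may be worth adding.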

NEO4J shortestPath taking into account a particular relationship pattern

I have a graph where I have chains of nodes that have a relationship [:LINKS_TO] and I can successfully get the shortestPath function to work.
For most of my users this level of detail is fine.
I have another set of users who need a richer set of information on the relationship. Given that properties on relationships are supposed to represent strengths or scores for the relationship, I have created specific nodes to hold descriptive metadata.
This means I have a pattern that says (start)-[:PARTICIPATES]-(middle)-[:REFERENCES]->(end)
There can be any number of nodes between the start and end points in the chain.
I am struggling to get the shortestPath function to return any results for the more detailed chain. Is there a way to do this using Cypher?
You could also have kept your metadata as properties on the relationships.
For your needs, this should work:
MATCH p = shortestPath((start)-[:PARTICIPATES|:REFERENCES*]->(end))
RETURN nodes(p)
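In practice the two endpoints are usually matched first, so that shortestPath runs between one known pair of nodes. A sketch under that assumption, where the name property and the $startName and $endName parameters are hypothetical and not taken from the question:

```cypher
// Look up the two endpoints, then search for the shortest chain between them.
// The alternative relationship types are written without the second colon,
// a form that current Cypher versions also accept.
MATCH (a {name: $startName}), (b {name: $endName})
MATCH p = shortestPath((a)-[:PARTICIPATES|REFERENCES*]->(b))
RETURN nodes(p) AS chain, length(p) AS hops
```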

Is a DFS Cypher Query possible?

My database contains about 300k nodes and 350k relationships.
My current query is:
start n=node(3) match p=(n)-[r:move*1..2]->(m) where all(r2 in relationships(p) where r2.GameID = STR(id(n))) return m;
The nodes touched in this query are all of the same kind: they are different positions in a game. Each relationship has a property "GameID", which identifies the right relationship to follow when traversing the graph along a path. So if you start traversing the graph at a node and follow the relationship with the right GameID, there won't be another path starting at the first node with a relationship that fits the GameID.
There are nodes that have hundreds of in and outgoing relationships, some others only have a few.
The problem is that I don't know how to tell Cypher to do this. The above query works for a depth of 1 or 2, but it should look like [r:move*] to return the whole path, which is about 20-200 hops.
But if I raise the values, the queries won't finish. I think that Cypher looks at each outgoing relationship at every single path depth relative to the start node, but as I already explained, there is only one right path. So it should do some kind of DFS instead of a BFS. Is there a way to do so?
I would consider configuring a relationship index for the GameID property. See http://docs.neo4j.org/chunked/milestone/auto-indexing.html#auto-indexing-config.
Once you have done that, you can try a query like the following (I have not tested this):
START n=node(3), r=relationship:rels(GameID = 3)
MATCH (n)-[r*1..]->(m)
RETURN m;
Such a query would limit the relationships considered by the MATCH clause to just the ones with the GameID you care about. Getting that initial collection of relationships would be fast because of the index.
As an aside: since neo4j reuses its internally generated IDs (for nodes that are deleted), storing those IDs as GameIDs will make your data unreliable (unless you never delete any such nodes). You may want to generate your own unique IDs, store them in your nodes, and use them for your GameIDs. If you do this, you should also create a uniqueness constraint for your own IDs, which, as a nice side effect, automatically creates an index for them.
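Independently of indexing, the GameID condition can be attached to the variable-length pattern itself, so that only relationships carrying that value are considered as part of the path. A minimal sketch, assuming the caller supplies the hypothetical parameters $startId and $gameId, and that $gameId has the same type (string or number) as the stored property:

```cypher
// Every relationship in the variable-length part must have GameID = $gameId.
// The upper bound of 200 reflects the path lengths mentioned in the question.
MATCH (n)
WHERE id(n) = $startId
MATCH p = (n)-[:move*1..200 {GameID: $gameId}]->(m)
RETURN m
```

Whether this finishes in acceptable time on the graph described in the question has not been tested here.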

Neo4j: Conditions on Relationships with Depth

What I'm trying to do
Being relatively new to Neo4j, I'm trying to find certain nodes with Cypher in a Neo4j graph database. The nodes should be connected by a chain of relationships of a certain type with further conditions on the relationships:
// Cypher
START self = node(3413)
MATCH (self)<-[rel:is_parent_of*1..100]-(ancestors)
WHERE rel.some_property = 'foo'
RETURN DISTINCT ancestors
What goes wrong
If I drop the depth part *1..100, the query works, but of course, then allows only one relationship between self and the ancestors.
But, if I allow the ancestors to be several steps away from self by introducing the depth *1..100, the query fails:
Error: Expected rel to be a Map but it was a Collection
I thought, maybe this syntax defines rel to be is_parent_of*1..100 rather than defining rel to be a relationship of type is_parent_of and allowing a larger relationship depth.
So, I've tried to make my intentions clear by using parentheses: [(rel:is_parent_of)*1..100]. But this causes a syntax error.
I'd appreciate any help to fix this. Thanks!
Nomenclature
Calling *1..100 depth originates in the nomenclature of the neography ruby gem, where this is done using the abstract depth method.
In neo4j, this is called variable length relationships, as can be seen here in the documentation: MATCH / Variable length relationships.
Reason for the error
The reason for the "Expected rel to be a Map but it was a Collection" error is, indeed, that rel does not refer to each single relationship but rather to the entire collection of matched relationships.
For an example, see here in the documentation: MATCH / Relationship variable in variable length relationships.
Solution
First, acknowledge that the identifier refers to a collection (i.e. a set of multiple items) and call it rels instead of rel. Then, in the WHERE clause, state that the condition has to apply to all rel items in the rels collection using the ALL predicate.
// Cypher
START self = node(3413)
MATCH (self)<-[rels:is_parent_of*1..100]-(ancestors)
WHERE ALL (rel in rels WHERE rel.some_property = 'foo')
RETURN DISTINCT ancestors
The ALL predicate is explained here in the documentation: Functions / Predicate Functions.
I was led to this solution by this stackoverflow answer of a related question.
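The START clause used here is legacy syntax that later Neo4j releases deprecated. A sketch of the same condition without START, naming the path and reading its relationships with relationships(p) instead of binding a variable inside the brackets, could look like this (same node id, relationship type, and property as above):

```cypher
// Every relationship on the path must satisfy the property condition.
MATCH p = (self)<-[:is_parent_of*1..100]-(ancestors)
WHERE id(self) = 3413
  AND all(rel IN relationships(p) WHERE rel.some_property = 'foo')
RETURN DISTINCT ancestors
```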
Long query time
Unfortunately, checking relationship properties costs a lot of time. The above query, with only a couple of nodes in the database, takes over 3000ms on my development machine.

How to find distinct nodes in a Neo4j/Cypher query

I'm trying to do some pattern matching in neo4j/cypher and I came across this issue:
There are two types of graphs I want to search for:
Star graphs: A graph with one center node and multiple outgoing relationships.
n-length line graphs: A line graph with length n where none of the nodes are repeats (I have some bidirectional edges and cycles in my graph)
So the main problem is that when I do something such as:
MATCH a-->b, a-->c, a-->d
MATCH a-->b-->c-->d
Cypher doesn't guarantee (when I tried it) that a, b, c, and d are all different nodes. For small graphs, this can easily be fixed with
WHERE not(a=b) AND not(a=c) AND ...
But I'm trying to match graphs of size 10+, so checking equality between all pairs of nodes isn't a viable option. As far as I know, RETURN DISTINCT does not help either, since it doesn't check equality among variables within a row, only across different rows. Is there any simple way I can specify the query to make the differently named nodes distinct?
Old question, but look to APOC Path Expander procedures for how to address these kinds of use cases, as you can change the traversal uniqueness behavior for expansion (the same way you can when using the traversal API...which these procedures use).
Cypher implicitly uses RELATIONSHIP_PATH uniqueness, meaning that per path returned, a relationship must be unique, it cannot be used multiple times in a single path.
While this is good for queries where you need all possible paths, it's not a good fit for queries where you want distinct nodes or a subgraph or to prevent repeating nodes in a path.
For an n-length path, let's say depth 6 with only outgoing relationships of any type, we can change the uniqueness to NODE_PATH, where a node must be unique per path, no repeats in a path:
MATCH (n)
WHERE id(n) = 12345
CALL apoc.path.expandConfig(n, {maxLevel:6, uniqueness:'NODE_PATH'}) YIELD path
RETURN path
If you want all reachable nodes up to a certain depth (or at any depth by omitting maxLevel), you can use NODE_GLOBAL uniqueness, or instead just use apoc.path.subgraphNodes():
MATCH (n)
WHERE id(n) = 12345
CALL apoc.path.subgraphNodes(n, {maxLevel:6}) YIELD node
RETURN node
NODE_GLOBAL uniqueness means that a node must be unique across all paths: it will only be visited once, and there will only be one path to a node from a given start node. This keeps the number of paths that need to be evaluated down significantly, but as a consequence not all relationships will be traversed; those that expand to an already visited node are skipped.
You will not get relationships back with this procedure (you can use apoc.path.spanningTree() for that, although as previously mentioned not all relationships will be included, as we will only capture a single path to each node, not all possible paths to nodes). If you want all nodes up to a max level and all possible relationships between those nodes, then use apoc.path.subgraphAll():
MATCH (n)
WHERE id(n) = 12345
CALL apoc.path.subgraphAll(n, {maxLevel:6}) YIELD nodes, relationships
RETURN nodes, relationships
Richer options exist for label and relationship filtering, or filtering (whitelist, blacklist, endnode, terminator node) based on lists of pre-matched nodes.
We also support repeating sequences of relationships or node labels.
If you need filtering by node or relationship properties during expansion, then this won't be a good option, as that feature is not yet supported.
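Where APOC is not available, the fixed-length line-graph case from the question can also be sketched in plain Cypher by naming the path and requiring every node to occur exactly once in nodes(p). This is an alternative idiom, not part of the APOC approach described above, and the three-hop pattern is only an example length:

```cypher
// For each node on the path, exactly one node of the path may be equal to it,
// which rules out repeated nodes without listing every pairwise comparison.
MATCH p = (a)-->(b)-->(c)-->(d)
WHERE all(n IN nodes(p) WHERE single(m IN nodes(p) WHERE m = n))
RETURN p
```

The check is quadratic in the path length, which should be acceptable for paths of around ten nodes.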
