How do I create relationships between nodes only where specific relationships specified by pattern do not already exist?
The closest I can get is:
MATCH (d1:Document),(d2:Document), (d1)-[r]-()
WHERE d1.acct = d2.acct
AND d1.doc_type = 'ID' AND d2.doc_type = 'PA'
AND d1.DD_sd = d2.DD_sd
AND NOT TYPE (r) =~ ':IDRelated*.'
CREATE (d1) -[:IDRelated_SameDD]-> (d2)
but that would not be as accurate as listing out the specific relationships to exclude.
I would like something like:
MATCH (d1:Document),(d2:Document)
WHERE d1.acct = d2.acct
AND d1.doc_type = 'ID' AND d2.doc_type = 'PA'
AND d1.DD_sd = d2.DD_sd
AND NOT ( (d1) -[':IDRelated*.']- () )
CREATE (d1) -[:IDRelated_SameDD]-> (d2)
where I can use a regular expression (or STARTS WITH) to specify a pattern to exclude multiple relationships matching that pattern.
Thanks!
At the moment there's no such way to do this inline in a pattern.
However, you may want to PROFILE your query, and compare it against a query where you save your match to d2 until you finish filtering for the correct d1 matches. Something like this:
MATCH (d1:Document)-[r]-()
WHERE d1.doc_type = 'ID'
WITH d1, COLLECT(r) as rels
WHERE NONE(r in rels WHERE type(r) STARTS WITH 'IDRelated')
MATCH (d2:Document)
WHERE d1.acct = d2.acct
AND d1.DD_sd = d2.DD_sd
AND d2.doc_type = 'PA'
CREATE (d1) -[:IDRelated_SameDD]-> (d2)
Related
I have for example the following graph in Neo4j
(startnode)-[:BELONG_TO]-(Interface)-[:IS_CONNECTED]-(Interface)-[:BELONG_TO]-
#the line below can repeat itself 0..n times
(node)-[:BELONG_TO]-(Interface)-[:IS_CONNECTED]-(Interface)-[:BELONG_TO]-
#up to the endnode
(endnode)
There is an Interface properties I also need to match on. I do not want to follow all the paths, I just the one with Interface Node property I am looking for. For example Interface.VlanList CONTAINS ",23,"
I have done the following in Cypher but it applies that I already know how many iterations I am going to find which in reality is not the case.
match (n:StartNode {name:"device name"}) -[:BELONG_TO]- (i:Interface) -[:IS_CONNECTED]- (ii:Interface)-[:BELONG_TO]-(nn:Node) -[:BELONG_TO]- (iii:Interface) -[:IS_CONNECTED]- (iiii:Interface) -[:BELONG_TO]-(nnn:Node)
where i.VlanList CONTAINS ",841,"
AND ii.VlanList CONTAINS ",841,"
AND iii.VlanList CONTAINS ",841,"
return n, i,ii,nn,iii,iiii,nnn
I have been looking at the documentation but can not work out how the above could be resolved.
This should work:
// put the searchstring in a variable
WITH ',841,' AS searchstring
// look up start end endnode
MATCH (startNode: .... {...}), (endNode: .... {...})
// look for paths of variable length
// that have your search string in all nodes,
// except the first and the last one
WITH searchstring,startNode,endNode
MATCH path=(startnode)-[:BELONG_TO|IS_CONNECTED*]-(endnode)
WHERE ALL(i IN nodes(path)[1..-1] WHERE i.VlanList CONTAINS searchstring)
RETURN path
You can also look at https://neo4j.com/labs/apoc/4.1/graph-querying/path-expander/ for more ideas about how you can limit the pathfinding.
This query should work for you (assuming that the relationship directions I chose are correct):
MATCH p = (sNode:StartNode)-[:BELONG_TO]->(i1:Interface)-[:IS_CONNECTED]->(i2:Interface)-[:BELONG_TO]->(n1)-[:BELONG_TO|IS_CONNECTED*0..]->(eNode:Node)
WHERE sNode.name = "device name" AND eNode.name = "foo" AND LENGTH(p)%3 = 0
WITH p, i1, i2, n1, eNode, RELATIONSHIPS(p) AS rels, NODES(p) AS ns
WHERE n1 = eNode OR (
ALL(j IN RANGE(3, SIZE(rels)-3, 3) WHERE
'BELONG_TO' = TYPE(rels[j]) = TYPE(rels[j+2]) AND
'IS_CONNECTED' = TYPE(rels[j+1])) AND
ALL(x IN ([i1, i2] + REDUCE(s = [], i IN RANGE(3, SIZE(ns)-2, 3) | CASE WHEN i%3 = 0 THEN s ELSE s +ns[i] END))
WHERE x:Interface AND x.VlanList CONTAINS $substring)
)
RETURN p
It checks that the returned paths have the required pattern of node labels, node property value, and relationship types. It takes advantage of the variable length relationship syntax, using zero as the lower bound. Since there is no upper bound, the variable length relationship query query can take "forever" to finish (and in such a situation, you should use a reasonable upper bound).
I am trying to get the connecting nodes. The final node should have a type(r) of pobj. How can I specify it with the shortest path?
match(c:fdnode{name:'flights'})
match(d:fdnode)
match p = shortestPath((c)-[*..15]-(e)-[r]-(d))
where d.name = '700' and type(r) = 'pobj'
RETURN nodes(p)
If I remove r, the code returns desired output. But I need the type(r).
pobj is only for this specific case.I have multiple traversal criteria.
The following query may do what you want. It returns the nodes in the shortest undirected path (of up to length 15) that meets these criteria:
The first node has the label 'fdnode', and its name value is 'flights'.
The last node has the label 'fdnode', and its name value is '700'.
The last relationship has the type, pobj.
MATCH p = shortestPath((c:fdnode)-[*..15]-(d:fdnode))
WHERE c.name = 'flights' AND d.name = '700' AND TYPE(LAST(RELATIONSHIPS(p))) = 'pobj'
RETURN NODES(p);
NOTE: The matched path will be "undirected" because this query mimicked your query by not specifying any directionality in the MATCH pattern. So, every matched relationship is allowed to be in either direction. If this is not what you intended, you need to explicitly specify the directionality in your pattern.
Figured it out. This works.
match (a:iknode)-[*..10]->(b:iknode)-[r]->(c:iknode) where type(r) = 'pobj' and a.name = 'flights' return c
This also works:
MATCH p = ((c:fdnode)-[*..4]->(d:fdnode))
WHERE c.name = 'flights' AND TYPE(LAST(RELATIONSHIPS(p))) = 'pobj'
RETURN NODES(p) order by length(p) ;
I want to test all nodes in the path from node a to node b (with only MATCH statement), where the depth is changing (could be any number). In the example below the depth is 2.
START a = node(86)
MATCH p0 = a-[*..2]-b
WHERE (b.attr = 'true') AND (a.attr = 'true')
RETURN p0
My question is how do I test the nodes between a and b for a certain attribute (attr = 'true'), using the MATCH statement, without knowing the depth required.
I find that using filter method I can filter out all the unwanted nodes.
like:
START a = node(86)
MATCH p0 = a-[*..2]-b
RETURN filter(x IN nodes(p0) WHERE x.attr = 'true')
But that is not what I need, I need to use MATCH.
Take a look at the Cypher refcard, specifically to the List Predicates section. The all() function should do the trick.
Something like:
START a=node(86)
MATCH p0=(a)-[*..2]-(b)
WHERE ALL(node in nodes(p0) WHERE node.attr = true)
RETURN p0
This will only match patterns where all the nodes in the pattern have that attribute as true.
I've been playing with neo4j for a geneology site and it's worked great!
I've run into a snag where finding the starting node isn't as easy. Looking through the docs and the posts online I haven't seen anything that hints at this so maybe it isn't possible.
What I would like to do is pass in a list of genders and from that list follow a specific path through the nodes to get a single node.
in context of the family:
I want to get my mother's father's mother's mother. so I have my id so I would start there and traverse four nodes from mine.
so pseudo query would be
select person (follow childof relationship)
where starting node is me
where firstNode.gender == female
AND secondNode.gender == male
AND thirdNode.gender == female
AND fourthNode.gender == female
Focusing on the general solution:
MATCH p = (me:Person)-[:IS_CHILD_OF*]->(ancestor:Person)
WHERE me.uuid = {uuid}
AND length(p) = size({genders})
AND extract(x in tail(nodes(p)) | x.gender) = {genders}
RETURN ancestor
here's how it works:
match the starting node by id
match all the variable-length paths going to any ancestor
constrain the length of the path (i.e. the number of relationships, which is the same as the number of ancestors), as you can't parameterize the length in the query
extract the genders in the path
nodes(p) returns all the nodes in the path, including the starting node
tail(nodes(p)) skips the first element of the list, i.e. the starting node, so now we only have the ancestors
extract() extracts the genders of all the ancestor nodes, i.e. it transforms the list of ancestor nodes into their genders
the extracted list of genders can be compared to the parameter
if the path matched, we can return the bound ancestor, which is the end of the path
However, I don't think it will be faster than the explicit solution, though the performance could remain comparable. On my small test data (just 5 nodes), the general solution does 26 DB accesses whereas the specific solution only does 22, as reported by PROFILE. Further profiling would be needed on a larger database to compare the performances:
PROFILE MATCH p = (me:Person)-[:IS_CHILD_OF*]->(ancestor:Person)
WHERE me.uuid = {uuid}
AND length(p) = size({genders})
AND extract(x in tail(nodes(p)) | x.gender) = {genders}
RETURN ancestor
The general solution has the advantage of being a single query which won't need to be parsed again by the Cypher engine, whereas each generated query will need to be parsed.
It was more simple than I thought. Maybe there is still a better way so I'll leave this open for a bit.
the query would be
MATCH (n1:Person { Id: 'f59c40de-506d-4829-a765-7a3ae94af8d1' })
<-[:CHILDOF]-(n2 { Gender:'0'})
<-[:CHILDOF]-(n3 { Gender:'1'})
<-[:CHILDOF]-(n4 { Gender:'1'})
RETURN n4
and for each generation back would add a new row.
The equivalent query would look something like this:
MATCH (me:Person)
WHERE me.ID = ?
WITH me
MATCH (me)-[r:childof*4]->(ancestor:Person)
WITH ancestor, EXTRACT(rel IN r | endNode(rel).gender) AS genders
WHERE genders = ?
RETURN ancestor
Disclaimer, I haven't double-checked the syntax.
In Neo4j you typically find your start node first, typically by an ID of some sort (modify as required to match on a unique property). We then traverse a number of relationships to an ancestor, extract the gender property of all end nodes in the traversed relationships, and compare the genders to the expected list of genders (you'll need to make sure the argument is a bracketed list in the desired order).
Note that this approach filters down all possible results with that degree of childof relationship as opposed to walking your graph, so higher degrees of relationship (the higher the degree of ancestry you're querying), the slower the call will get.
I'm also unsure if you can parameterize the degree of the variable relationship, so that might prevent this from being a generalized solution for any degree of ancestry.
I'm not sure if you want a generic query which can work whatever the collection of genders you pass, or a specific solution.
Here's the specific solution: you match the path with the wanted length, and match each gender, as you've already noted in your own answer.
MATCH (me:Person)-[:IS_CHILD_OF]->(p1:Person)
-[:IS_CHILD_OF]->(p2:Person)
-[:IS_CHILD_OF]->(p3:Person)
-[:IS_CHILD_OF]->(p4:Person)
WHERE me.uuid = {uuid}
AND p1.gender = {genders}[0]
AND p2.gender = {genders}[1]
AND p3.gender = {genders}[2]
AND p4.gender = {genders}[3]
RETURN p4
Now, if you want to pass in a list of genders of an arbitrary length, it's actually possible. You match a variable-length path, make sure it has the right length (matching the number of genders), then match each gender in sequence.
MATCH p = (me:Person)-[:IS_CHILD_OF*]->(ancestor:Person)
WHERE me.uuid = {uuid}
AND length(p) = size({genders})
AND all(i IN range(0, size({genders}) - 1)
WHERE {genders}[i] = extract(x in tail(nodes(p)) | x.gender)[i])
RETURN ancestor
Building on #InverseFalcon's answer, you can actually compare collections, which simplifies the query:
MATCH p = (me:Person)-[:IS_CHILD_OF*]->(ancestor:Person)
WHERE me.uuid = {uuid}
AND length(p) = size({genders})
AND extract(x in tail(nodes(p)) | x.gender) = {genders}
RETURN ancestor
I want to do something like this in cypher:
MATCH (n:node) WHERE n.ID = x //x is an integer value
FOREACH (num in n.IDs:
MATCH (p:node) WHERE p.ID = num
CREATE (n)-[:LINK]->(p) )
where num is an array of integer values referring to the IDs of nodes that need to be linked to the node matched in the first line.
When I run this query, I get the error: Invalid use of MATCH inside FOREACH.
I'm in the early stages of teaching myself both Cypher and Neo4j. How can I achieve my desired functionality here? Or am I barking up the wrong tree - am I failing to grasp something that makes it unnecessary for me to do so?
This is not allowed, instead use the top-level MATCH like http://gist.neo4j.org/?8332363
MATCH (n:node), (p:node)
WHERE n.ID = 1 AND p.ID in [2,3,4]
CREATE (n)-[:LINK]->(p)