Getting first root of a leaf in neo4j - neo4j

I have the following simple graph:
CREATE (:leaf)<-[:rel]-(:nonleaf)<-[:rel]-(:nonleaf)<-[:rel]-(:nonleaf)<-[:rel]-(r1:nonleaf{root:true})<-[:rel]-(r2:nonleaf{root:true})
I want to get the first ancestor of (:leaf) that has root:true set on it. That is, I want to get r1. For this I wrote the following Cypher:
MATCH (:leaf)<-[*]-(m)<-[*]-(r:nonleaf{root:true}) WHERE m.root<>true OR NOT exists(m.root)
RETURN r
But it returned both (r1) and (r2). The same happened with the following Cypher:
MATCH shortestPath((l:leaf)<-[*]-(r:nonleaf{root:true}))
RETURN r
What's going on here?
Update
OK, after thinking about it more, I realized that (r2) is also returned because the path from (:leaf) to (r2) contains nodes with no root property set on them (a subtle interpretation mistake that should have been obvious earlier). In other words, the query returns (:nonleaf{root:true}) if the following condition is true for at least one m: m.root<>true OR NOT exists(m.root). The requirement is that the condition should hold for all ms on the path, not just for at least one. Now it remains to figure out how to express this in Cypher, and my grip on Cypher isn't that tight...
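The difference between "at least one m" and "all ms" can be illustrated outside Cypher. The following is a minimal Python sketch, not Neo4j code: it models the example chain as a plain list (the names n1 to n3 and both helper functions are hypothetical) and compares the two readings of the condition.

```python
# Hypothetical in-memory model of the example chain, ordered from the leaf upward.
CHAIN = [
    {"name": "leaf"},
    {"name": "n1"},
    {"name": "n2"},
    {"name": "n3"},
    {"name": "r1", "root": True},
    {"name": "r2", "root": True},
]


def is_root(node):
    # Mirrors `m.root = true`; a missing property counts as "not root".
    return node.get("root") is True


def roots_if_any_intermediate_is_non_root(chain):
    # What the original query effectively asks: SOME m between leaf and r is non-root.
    return [
        r["name"]
        for i, r in enumerate(chain)
        if i > 0 and is_root(r) and any(not is_root(m) for m in chain[1:i])
    ]


def roots_if_all_intermediates_are_non_root(chain):
    # The intended requirement: EVERY m between leaf and r is non-root.
    return [
        r["name"]
        for i, r in enumerate(chain)
        if i > 0 and is_root(r) and all(not is_root(m) for m in chain[1:i])
    ]
```

With this model, the "any" reading accepts both r1 and r2, while the "all" reading accepts only r1, which is the behavior the question is after.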

You can enforce that there is a single root node on the matched path with the help of the single() predicate function:
MATCH p=(:leaf)<-[*]-(r:nonleaf{root:true})
WHERE SINGLE(m IN nodes(p) WHERE exists(m.root) AND m.root=true )
RETURN r
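As a rough illustration of what the single() predicate checks, here is a Python sketch (not Cypher; the list-based paths and helper names are assumptions made for the example, and Cypher's null handling is not modeled). A path qualifies only when exactly one of its nodes is a root.

```python
def single(items, predicate):
    # True when exactly one item satisfies the predicate, roughly like Cypher's single().
    return sum(1 for item in items if predicate(item)) == 1


def is_root(node):
    # A missing `root` property counts as "not root".
    return node.get("root") is True


# Hypothetical paths from the leaf to each candidate root.
path_to_r1 = [{"name": "leaf"}, {"name": "n1"}, {"name": "r1", "root": True}]
path_to_r2 = path_to_r1 + [{"name": "r2", "root": True}]
```

The path ending at r1 contains one root and passes; the path ending at r2 contains two roots and is rejected.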

You just need to adjust your where condition a little so that it says "and the :nonleaf node before the root :nonleaf node matched is not itself marked as a root node". I think this will satisfy your needs.
MATCH (l:leaf)<-[*]-(r:nonleaf {root: true})
WHERE NOT (:nonleaf {root: true})<--(r)
RETURN r
UPDATED
Reading the updated example in the comments, I thought of another way to solve your problem using the APOC procedure apoc.path.expandConfig.
It does require a slight change to your data. Each root: true node should have a :root label set on it. Here is an update statement...
MATCH (n:nonleaf {root:true})
SET n:root
RETURN n
And here is the updated query
MATCH (leaf:leaf {name: 'leaf'})
WITH leaf
CALL apoc.path.expandConfig(leaf, {relationshipFilter:'<rel',labelFilter:'/root'} ) yield path
RETURN last(nodes(path))
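The idea behind the termination filter, expanding upward and stopping at the first node that carries the root label, can be sketched in Python. This is only an illustration under simplifying assumptions: every node has at most one parent (as in the example chain), and the dictionaries parent_of and labels are hypothetical stand-ins for the graph; it does not reproduce what APOC does internally.

```python
def first_root(start, parent_of, labels):
    # Follow the chain upward and stop at the first node labelled "root".
    seen = {start}
    node = parent_of.get(start)
    while node is not None and node not in seen:
        if "root" in labels.get(node, set()):
            return node
        seen.add(node)
        node = parent_of.get(node)
    return None  # no root found above the start node


# Hypothetical data mirroring the example chain.
parent_of = {"leaf": "n1", "n1": "n2", "n2": "n3", "n3": "r1", "r1": "r2"}
labels = {"r1": {"nonleaf", "root"}, "r2": {"nonleaf", "root"}}
```

Because the walk returns as soon as it meets a root, r2 is never visited when starting from the leaf.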

Related

neo4j - puzzling behavior where should be very simple

I'm totally baffled. Have been using neo4j for a while now but my file just got much much bigger (1.4G) and all of a sudden simple queries just don't work anymore. Does cypher break down when the file gets big?
MATCH (n:Node)
WHERE n.ID = "myid"
WITH DISTINCT n
OPTIONAL MATCH (n)-[rel:RELATIONSHIP]->(:Node)
REMOVE rel.Property
WITH DISTINCT n
OPTIONAL MATCH (n)-[rel:RELATIONSHIP]->(other:Node)
WHERE other.ID in ["this","that"]
SET rel.Property=true //I added this inside a foreach and both "this" and "that" started getting set properly, but I'm not sure why that would make a difference...
return n, other
This query invariably only sets Property to true for "that" and not "this". I'm totally baffled.
When instead I end with RETURN rel {.*}, id(rel) it shows two trues with one of the rel ids 67876, but then when I
MATCH ()-[rel]-()
WHERE id(rel)=67876
RETURN rel {}
I get {} as a result (ie the property is not there at all!!)
I added foreach but does not seem to really make a difference (nor should it I don't think).
Even more confusing, if I end with WITH DISTINCT n return [(n)-[rel:RELATIONSHIP]->(o) | rel.filter] it will be missing one. However, if I remove the DISTINCT n, I get more than one row and the last ones are correct-- ie in the results the exact same relationship is coming back as having Property first null then true. It's like I've come across Schrödinger's cat.
Could my file be corrupt and how would I fix it? TIA!
Addendum: I got it to work by repeating the match at the end and doing away with comprehensive maps for the return, but I'm still puzzled about why the comprehensive maps are returning incorrect information in first row and correct information in second row - all referring to the same relationship property.

Neo4J Matching Nodes Based on Multiple Relationships

I had another thread about this where someone suggested to do
MATCH (p:Person {person_id: '123'})
WHERE ANY(x IN $names WHERE
EXISTS((p)-[:BELONGS]-(:Face)-[:CORRESPONDS]-(:Image)-[:HAS_ACCESS_TO]-(:Dias {group_name: x})))
MATCH path=(p)-[:ASSOCIATED_WITH]-(:Person)
RETURN path
This does what I need it to, returns nodes that fit the criteria without returning the relationships, but now I need to include another param that is a list.
....(:Dias {group_name: x, second_name: y}))
I'm unsure of the syntax.. here's what I tried
WHERE ANY(x IN $names and y IN $names_2 WHERE..
this gives me a syntax error :/
Since the ANY() function can only iterate over a single list, it would be difficult to continue to use that for iteration over 2 lists (but still possible, if you create a single list with all possible x/y combinations) AND also be efficient (since each combination would be tested separately).
However, the new existential subquery syntax introduced in neo4j 4.0 will be very helpful for this use case (I assume the 2 lists are passed as the parameters names1 and names2):
MATCH (p:Person {person_id: '123'})
WHERE EXISTS {
MATCH (p)-[:BELONGS]-(:Face)-[:CORRESPONDS]-(:Image)-[:HAS_ACCESS_TO]-(d:Dias)
WHERE d.group_name IN $names1 AND d.second_name IN $names2
}
MATCH path=(p)-[:ASSOCIATED_WITH]-(:Person)
RETURN path
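To see why two membership tests can replace iteration over every x/y combination, here is a small Python sketch. It is not Cypher, and the dictionary standing in for a :Dias node as well as the function names are hypothetical.

```python
from itertools import product


def matches_by_combinations(dias, names1, names2):
    # Tests every (x, y) pair separately, as an ANY() over a combined list would.
    return any(
        dias["group_name"] == x and dias["second_name"] == y
        for x, y in product(names1, names2)
    )


def matches_by_membership(dias, names1, names2):
    # Two independent membership tests, as in the WHERE clause of the subquery.
    return dias["group_name"] in names1 and dias["second_name"] in names2
```

Both functions accept and reject the same nodes, but the first one does work proportional to the product of the two list lengths, while the second only scans each list once.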
By the way, here are some more tips:
If it is possible to specify the direction of each relationship in your query, that would help to speed up the query.
If it is possible to remove any node labels from a (sub)query and still get the same results, that would also be faster. There is an exception, though: if the (sub)query has no variables that are already bound to a value, then you would normally want to specify the node label for the one node that would be used to kick off that (sub)query (you can do a PROFILE to see which node that would be).

Neo4j Cypher find loop from specific node

I want to find all loops that originate and terminate with a specific node in a Neo4j database. I tried:
START n=node:Event(time=",timestamp,")
MATCH p=(n)-[:LINKED_TO*1..5]->(n)
WHERE NONE (n IN nodes(p) WHERE size(filter(x IN nodes(p) WHERE n = x))> 2)
RETURN p, length(p)
This is the best I can mashup from what is on the web. There are two things I don't like about this:
1. it crashes
2. the count threshold must be ">2" to allow for the start+termination node. That means that loops that visit the same intermediate node twice will be included, which I wish was not the case.
I'm not interested in the shortest path. I want to know all loops that return to my starting node.
Thank you in advance!
This query should return all loops that start and end at the specified node and have no other repeated nodes:
START n=node:Event(time=",timestamp,")
MATCH p=(n)-[:LINKED_TO*1..5]->(n)
UNWIND TAIL(NODES(p)) AS m
WITH p, COUNT(DISTINCT m) AS cm
WHERE LENGTH(p)-1 = cm
RETURN p, LENGTH(p);
Thank you, cybersam! That was helpful. As typed, it gave a few errors and warned me that "START" is deprecated. I found the following modifications worked:
MATCH (n:Event{time:1458238060505007})
MATCH p=(n)-[:LINKED_TO*1..5]->(n)
UNWIND TAIL(NODES(p)) AS m WITH p RETURN p
The only problem with this is that it appears to give all paths that go through the desired start node, n. Is that true? If so, is there a way to correct this?
This is what finally worked for me. It is very close to what cybersam suggested. Apologies for doing this "the wrong way". I'm sure cybersam will yell at me, again, but adding code via Comment is not very easy to read.
MATCH p=(n:Event{time:",timestamp,"})-[:LINKED_TO*1..5]->(n)
UNWIND TAIL (NODES(p)) AS m
WITH p,COUNT(DISTINCT m) AS cm
WHERE LENGTH(p) = cm
RETURN p
As I noted earlier, one sticking point was the use of "START", which is deprecated and causes errors (for example, when using RNeo4j in R, which I'm using). The new way appears to be to use MATCH and specify your starting node in the path pattern. The other confusing thing for me was the use of "LENGTH(p)-1" instead of "LENGTH(p)". For a loop from one node to another and back, the path has a length of 2, not 3, and there are only 2 distinct nodes. For my application, "LENGTH(p)=cm" worked.
Finally, if you want the nodes in the paths, do NOT try to use "WITH m,..." because this messes up the "COUNT(DISTINCT(m))" computation for some reason that I do not understand.
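The check used above (the number of distinct nodes in the tail of the path equals the path length) can be sketched in Python to show why it filters out loops that revisit a node. This is an illustration only; the function name and the list-of-node-names representation are hypothetical. As for the "WITH m,..." problem: in Cypher, any non-aggregated expression next to an aggregate acts as a grouping key, so adding m would compute COUNT(DISTINCT m) once per (p, m) pair, where it is always 1.

```python
def is_simple_loop(nodes):
    # `nodes` lists the path's nodes in order; a loop starts and ends on the same node.
    if len(nodes) < 2 or nodes[0] != nodes[-1]:
        return False
    length = len(nodes) - 1          # number of relationships, like LENGTH(p)
    tail = nodes[1:]                 # like TAIL(NODES(p))
    return len(set(tail)) == length  # like COUNT(DISTINCT m) = LENGTH(p)
```

For the loop n, a, n the tail holds 2 distinct nodes and the length is 2, so it passes; a loop that visits a twice has fewer distinct nodes than relationships and is rejected.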

Neo4j cypher - search for nodes with no path between them

I'm trying to find a generic way to search for a node or set of nodes which does not have a link to another node or set of nodes.
As an example, I was able to find all the nodes of a specific type (e.g. :Style) which are connected somehow to a specific set of nodes (e.g. :MetadataRoot), with the following:
match (root:MetadataRoot),
(n:Style),
p=shortestPath((root)-[*]-(n))
return p
Using this, I was able to subtract the set of all :Style nodes from the nodes returned by the above query, but that doesn't seem like the best way to go about this.
If you know the label of the start nodes you can use the EXISTS function:
MATCH (n:Style)
WHERE NOT EXISTS((n)-[]-())
RETURN n
If you know the end node:
MATCH (n:Style)
WHERE NOT EXISTS ((n)-[*]-(:MetadataRoot))
RETURN n
EDIT :
Not sure, but regarding the performance issues in your comment, a workaround could be something like this:
MATCH p=allShortestPaths((n:Style)-[*]-(:MetadataRoot))
WITH collect(DISTINCT n) AS nodesRelated
MATCH (s:Style) WHERE NOT s IN nodesRelated
RETURN s
This should be way faster and it should need less resources to execute:
MATCH (n:Style)
OPTIONAL MATCH p=shortestPath((:MetadataRoot)-[*0..40]-(n))
WITH n, p
WHERE p IS NULL
RETURN n
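The underlying question, which :Style nodes have no path to any :MetadataRoot node, is a reachability problem. The Python sketch below shows one way to think about it: a single breadth-first search from all roots, then a set difference. It is not Neo4j code, the edge list and node names are hypothetical, and unlike the query above it puts no bound on path length.

```python
from collections import deque


def unreachable_styles(edges, styles, roots):
    # Build an undirected adjacency map, since the queries ignore relationship direction.
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    # Breadth-first search from all root nodes at once.
    reached = set(roots)
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in reached:
                reached.add(nxt)
                queue.append(nxt)

    # Styles that were never reached have no path to any root.
    return sorted(s for s in styles if s not in reached)
```

Each node and edge is visited at most once, which is the same reason a per-node shortestPath lookup tends to be cheaper than enumerating all paths.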

MATCH prior to MERGE in single Cypher query confuses ON MATCH and ON CREATE

I've run into this on Neo4j 2.1.5. I have a query which I'm issuing from Node.js using the Neo4j REST API. The point of this query is to be able to create or update a given Node and set its state (including labels and properties) to some known state. The MATCH and REMOVE clause prior to the WITH is to work around the fact that there's no direct way to remove all of a Node's labels nor is there a way to update a Node's labels with a given set of labels. You have to explicitly remove the labels you don't want and add the labels you do want. And there's no way to remove labels in the MERGE clause.
A somewhat simplified version of the query looks like:
MATCH (m {name:'Brian'})
REMOVE m:l1:l2
WITH m
MERGE (n {name:'Brian'})
ON MATCH SET n={mprops} ON CREATE SET n={cprops}
RETURN n
where mprops = {updated:true, created:false} and cprops = {updated:false, created:true}. I do this so that in a single Cypher query I can remove all of the Node's existing labels and set new labels using the ON MATCH clause. The problem is that including the initial MATCH seems to confuse the ON MATCH vs ON CREATE logic.
Assuming the Brian Node already exists, the result of this query should show that n.created = false and n.updated = true. However, I get the opposite result, n.created=true, n.updated=false. If I remove the initial MATCH (and WITH) clause and execute only the MERGE clause, the results are as expected. So somehow, the inclusion of the MATCH clause causes the MERGE clause to think that a CREATE vs MATCH is happening.
I realize this is a weird use of the WITH clause, but it did seem like it would work around the limitation in manipulating labels. And Cypher thinks that it's valid Cypher. I'm assuming this is just a bug and an edge case, but I wanted to get others' insights and possible alternatives before I report it.
I realize that I could have created a transaction and issued the MATCH and MERGE as separate queries within that transaction, but there are reasons that this does not work well in the design of the API I'm writing.
Thanks!
If you prefix your query with MATCH, the MERGE will never execute when there is no existing 'Brian' node.
You also overwrite all properties with SET n = {param}; you should use SET n += {param} instead:
MERGE (n:Label { name:'Brian' })
ON MATCH SET n += {create :false,update:true }
ON CREATE SET n += {create :true,update:false }
REMOVE n:WrongLabel
RETURN n
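The difference between the two SET forms can be pictured with plain dictionaries. This Python sketch is only an analogy (the function names are hypothetical, and Cypher's handling of null values is not modeled): "=" replaces the whole property map, while "+=" keeps existing properties and overwrites only the keys that are supplied.

```python
def set_replace(node_props, new_props):
    # Like `SET n = {param}`: existing properties are discarded.
    return dict(new_props)


def set_merge(node_props, new_props):
    # Like `SET n += {param}`: existing properties are kept unless overwritten.
    merged = dict(node_props)
    merged.update(new_props)
    return merged


brian = {"name": "Brian", "age": 30}
mprops = {"updated": True, "created": False}
```

Note that with the replacing form the name property itself disappears, so a later lookup by {name:'Brian'} would no longer find the node.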
I don't see why your query would not work, but the issues brought up by @FrobberOfBits are valid.
However, logically, your example query is equivalent to this one:
MATCH (m {name:'Brian'})
REMOVE m:l1:l2
SET m={mprops}
RETURN m
This query is simpler, avoids the use of MERGE entirely, and may avoid whatever issue you are seeing. Does this represent what you were trying to do?
