Neo4j - traversal to find specific connected component - neo4j

Using Neo4j 1.9.4, I'm trying to find the connected component (all reachable nodes) of a starting node, following only relationships that have a certain property ('since') set to a given integer value, e.g. 20130101.
My initial approach was a Cypher query, but I have the feeling that it loops forever if the graph contains a cycle, at least when I don't restrict the path length, and restricting the length is not what I want to do.
So in the meantime I started using a traversal. With neo4jphp, a traversal looks like this:
$traversal->setOrder(Everyman\Neo4j\Traversal::OrderBreadthFirst)
->setPruneEvaluator(Everyman\Neo4j\Traversal::PruneNone)
->setReturnFilter(Everyman\Neo4j\Traversal::ReturnAll)
->setUniqueness(Everyman\Neo4j\Traversal::UniquenessNodeGlobal);
What I think I need is something like this:
->setPruneEvaluator('javascript', "position.RELATIONSHIP().getProperty('since').EQUALS(20130101)")
Obviously, RELATIONSHIP and EQUALS seem to be wrong.
I adapted this from the example at https://github.com/jadell/neo4jphp/wiki/Traversals, where the following valid and working prune evaluator is set:
->setPruneEvaluator('javascript', "position.endNode().getProperty('name').toLowerCase().contains('t')")
I'm not at all familiar with JavaScript, so I can't figure out how to do that. Additionally, how can I make sure the traversal does not fail with an error if there is a relationship that does not have the property "since"?
If I can achieve the same using a cypher query I would accept that, too.
EDIT: By the way, my approach using cypher was this:
START n=node({start_node}) MATCH p = n-[*]-m WHERE ALL(x IN RELATIONSHIPS(p) WHERE HAS(x.since) AND x.since = 20130101) RETURN DISTINCT m
EDIT2: Trying the suggested Cypher query from ulkas gives me the following error:
Invalid query
string matching regex ``(``|[^`])*`' expected but `*' found
Think we should have better error message here? Help us by sending this query to cypher#neo4j.org.
Thank you, the Neo4j Team.
"START n=node(40317) MATCH p = n-[r:*..]-m WHERE has(r.since) AND r.since = 20130101 RETURN DISTINCT m"
^
EDIT3: The suggestion of LameCode looked really promising, but still it returns an error:
Fatal error: Uncaught exception 'Everyman\Neo4j\Exception' with message 'Unable to execute traversal [400]: Headers: Array ( [Content-Length] => 5183 [Content-Type] => application/json; charset=UTF-8 [Access-Control-Allow-Origin] => * [Server] => Jetty(6.1.25) ) Body: Array ( [message] => Failed to execute script, see nested exception. [exception] => EvaluationException [fullname] => org.neo4j.server.rest.domain.EvaluationException [stacktrace] => Array ( [0] => org.neo4j.server.scripting.javascript.JavascriptExecutor.execute(JavascriptExecutor.java:118) [1] => org.neo4j.server.rest.domain.EvaluatorFactory$ScriptedEvaluator.evalPosition(EvaluatorFactory.java:140) [2] => org.neo4j.server.rest.domain.EvaluatorFactory$ScriptedPruneEvaluator.evaluate(EvaluatorFactory.java:161) [3] => org.neo4j.graphdb.traversal.Evaluator$AsPathEvaluator.evaluate(Evaluator.java:69) [4] => org.neo4j.kernel.impl.traversal.TraverserIterator.eva in /var/www/vendor/everyman/neo4jphp/lib/Everyman/Neo4j/Command.php on line 116
And I used the following pruneEvaluator:
->setPruneEvaluator('javascript', "position.lastRelationship().hasProperty('since') && position.lastRelationship().getProperty('since') == 20130101")
When I change lastRelationship() to endNode(), it at least doesn't return an error, but I'm puzzled by the many results it returns, as none of the nodes has this since property. So it seems that even then the prune evaluator has no effect. I expected it to stop at each end node that has no since property or whose value differs from the given date. What am I doing wrong? Any ideas?

Regarding the traverser you are using: the 'position' variable in the JavaScript prune evaluator is a Path object. See: http://components.neo4j.org/neo4j/1.9.4/apidocs/org/neo4j/graphdb/Path.html
Its methods should be available to you.
Use lastRelationship() (because all the earlier relationships will already have passed through the prune evaluator).
The Relationship object inherits from PropertyContainer, which has a hasProperty() method.
setPruneEvaluator('javascript', "position.lastRelationship().hasProperty('since') && position.lastRelationship().getProperty('since') == 20130101")
I'm not sure whether you need to use the equals method, since it's JavaScript.
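Two details may be worth checking here. As far as I can tell from the Path Javadoc linked above, lastRelationship() returns null for the zero-length path at the start node, which could be what triggers the EvaluationException. Also, a prune evaluator returns true to stop expanding, so the condition probably needs a null check and a negation. Independent of the Neo4j API, the intended logic can be sketched in plain Python. The in-memory graph and the function name reachable are made up for illustration; this is not neo4jphp or Neo4j code.

```python
from collections import deque


def reachable(graph, start, since):
    """Breadth-first search that only follows relationships whose
    'since' property exists and equals the given value."""
    seen = {start}              # global node uniqueness: each node is visited once
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour, props in graph.get(node, []):
            # props.get returns None when 'since' is absent, so relationships
            # without the property are skipped instead of raising an error.
            if props.get("since") != since:
                continue
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


# Hypothetical undirected toy graph: node -> list of (neighbour, relationship properties).
# a, b and c form a cycle of matching relationships.
S = 20130101
graph = {
    "a": [("b", {"since": S}), ("c", {"since": S}), ("d", {})],
    "b": [("a", {"since": S}), ("c", {"since": S})],
    "c": [("a", {"since": S}), ("b", {"since": S}), ("e", {"since": 20120101})],
    "d": [("a", {})],
    "e": [("c", {"since": 20120101})],
}

component = reachable(graph, "a", S)
```

The seen set plays the role of UniquenessNodeGlobal: because every node is enqueued at most once, the cycle a-b-c does not cause an endless loop, and d and e are never reached because their relationships fail the property check.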

Related

I can't guess legacy code's purpose (Neo4j, cypher query)

I'm new to programming with Neo4j, so I don't yet know enough of its Cypher language to solve, without help, an annoying bug in legacy, undocumented code.
My main problem is that I can't guess the purpose of the following query.
This is the problematic query:
START
n=node({self})
MATCH
n-[:RECOMMENDATION]->(m)
WHERE
m.concept_type='unifying_theme' AND
not( ()-[:REQUIRED]->m ) RETURN m
The query itself is written on a single line; I've formatted it to make it more readable. The error message is the following (reformatted to be easier to read):
PatternException: Some identifiers are used as both relationships and nodes:
UNNAMED1 Query:
START n=node({self})
MATCH n-[:RECOMMENDATION]->(m)
WHERE m.concept_type='unifying_theme' AND not( ()-[:REQUIRED]->m )
RETURN m
Params: {'self': 423}
Trace:
org.neo4j.cypher.internal.pipes.matching.PatternGraph.validatePattern(PatternGraph.scala:98)
org.neo4j.cypher.internal.pipes.matching.PatternGraph.<init>(PatternGraph.scala:36)
org.neo4j.cypher.internal.executionplan.builders.PatternGraphBuilder$cla...
The query is embedded inside a "StructuredNode" instance from the Python library neomodel. I guess that {self} refers to the node represented by the StructuredNode instance, and that the error in this query is related to the m variable.
I suppose I should maybe use more variable names to avoid conflicts, but I'm suspicious. I think there may be more errors in this query, because I've seen more ugly and buggy code from this disastrous programmer.
I don't know what the author was trying to do with the not( ()-[:REQUIRED]->m ) block. Is that a legal Cypher sub-expression if m represents a node?
P.S.: I'm using Neo4j 1.9.7.
Thank you in advance.

How to use START with Cypher / Neo4j 2.0

I am trying the example provided in the Graph Databases book (PDF pages 51-52) with Neo4j 2.0.1 (latest). It appears that I cannot just copy and paste the code sample from the book (I guess the syntax is no longer valid).
START bob=node:user(username='Bob'),
charlie=node:user(username='Charlie')
MATCH (bob)-[e:EMAILED]->(charlie)
RETURN e
Got #=> Index `user` does not exist.
So, I tried without 'user'
START bob=node(username='Bob'),
charlie=node(username='Charlie')
MATCH (bob)-[e:EMAILED]->(charlie)
RETURN e
Got #=> Invalid input 'u': expected whitespace, an unsigned integer, a parameter or '*'
Tried this but didn't work
START bob=node({username:'Bob'}),
(charlie=node({username:'Charlie'})
MATCH (bob)-[e:EMAILED]->(charlie)
RETURN e
Got #=> Invalid input ':': expected an identifier character, whitespace or '}'
I want to use START and then MATCH to achieve this. I would appreciate a little direction to get started.
The syntax changed in version 2.0.
http://docs.neo4j.org/chunked/stable/query-match.html
Your first query should look like this.
MATCH (bob {username:'Bob'})-[e:EMAILED]->(charlie {username:'Charlie'})
RETURN e
The query does not work out of the box because you'll need to create the user index first. This can't be done with Cypher though, see the documentation for more info. Your syntax is still valid, but Lucene indexes are considered legacy. Schema indexes replace them, but they are not fully mature yet (e.g. no wildcard searches, IN support, ...).
You'll want to use labels as well, in your case a User label. The query can be refactored to:
MATCH (b:User { username:'Bob' })-[e:EMAILED]->(c:User { username:'Charlie' })
RETURN e
For good performance, add a schema index on the username property as well:
CREATE INDEX ON :User(username)
START is optional, as noted above. Given that it's listed under the "deprecated" section in the Cypher 2.0 refcard, I would avoid using it going forward, just to be safe.
However, the refcard does state that you can prepend your Cypher query with "CYPHER 1.9" (without the quotes) in order to make explicit use of the older syntax.

How to check if an element is in a node.collection using Cypher?

I'm just beginning with Neo4j/Cypher. I have some nodes containing a property which is an array of integers. I want to check whether a given number is in a node's collection and, if so, include this node in the results. My query looks like this:
MATCH (a) WHERE has(a.user_ids) and (13 IN a.user_ids) RETURN a
where 13 is the given user_id. It throws a syntax error:
Type mismatch: a already defined with conflicting type Node (expected Collection<Any>)
Any idea how can I accomplish that?
Thanks in advance.
You can try the predicate ANY, which returns true if any member of a collection matches some criterion.
MATCH (a) WHERE has(a.user_ids) and ANY(user_id IN a.user_ids WHERE user_id = 13) RETURN a
It looks a bit backwards now that I'm looking at it, but it should work.
Edit:
It was bugging me why your query didn't work and why my answer seemed backwards and indirect so I did a simple test. Basically, your original query works if you put the property reference in parentheses:
MATCH (a)
WHERE has(a.user_ids) and (13 IN (a.user_ids))
RETURN a
That's easier to read, so that's what I should have answered. But I still couldn't see why the parentheses were necessary here, when they are not in other cases. They were not necessary inside the ANY() above, and if you 'detach' the collection from the node
MATCH (a)
WITH a.user_ids as user_ids, a
WHERE 13 IN user_ids
RETURN a
there's no problem. For some reason Cypher needs to be told to evaluate a.user_ids before IN, or it ignores user_ids and tries to evaluate 13 IN a. IN is listed as an operator in the documentation, but in this regard it works differently from other operators. For example
MATCH (a) RETURN 13 + a.user_ids
returns fine and
MATCH (a) RETURN 13 * a.user_ids
MATCH (a) RETURN 13 < a.user_ids
fail, but because a.user_ids is a collection, not because a is a node. It's probably not very important, since it's easy enough to use parentheses, but it would be interesting to learn why they are necessary.
I also compared my answer to your original query with added parentheses, to see whether there was any performance drawback to the more indirect way. It turns out the execution plan is almost identical: 13 IN (a.user_ids) is rewritten to use ANY(), as in my answer.
My answer:
Filter(pred="any(user_id in Product(a,user_ids(6),true) where user_id == Literal(13))", _rows=1, _db_hits=8)
AllNodes(identifier="a", _rows=8, _db_hits=8)
Your query + ():
Filter(pred="any(-_-INNER-_- in Product(n,user_ids(6),true) where Literal(13) == -_-INNER-_-)", _rows=1, _db_hits=8)
AllNodes(identifier="n", _rows=8, _db_hits=8)
Finally, in your case you probably don't have to check for existence of property with has(). Absent properties and null are handled differently in 2.0 and if the property doesn't exist 13 IN (a.user_ids) will evaluate to false, so usually there is no reason to test for property existence before property evaluation for fear of the query breaking. The place to use has() would be when property existence is relevant in itself, and that would probably be a different property than the one evaluated, i.e. WHERE has(a.someProperty) AND 13 IN (a.someOtherProperty).
Since there is no performance difference, the more readable query is better, and since you, as far as I can see, don't really need to test for property existence, I think your query should be
MATCH (a)
WHERE 13 IN (a.user_ids)
RETURN a
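To make the point about missing properties concrete, here is a minimal Python sketch of the filtering behaviour described above for 2.0. It only imitates the outcome (a node without the property is filtered out rather than causing an error); the node dictionaries and the helper name matches are hypothetical, and this is not how Cypher evaluates the predicate internally.

```python
def matches(node, user_id):
    # A missing property behaves like null: the membership test is not true,
    # so the node is filtered out without any explicit has() check.
    user_ids = node.get("user_ids")
    return user_ids is not None and user_id in user_ids


# Hypothetical nodes, represented as plain property dictionaries.
nodes = [
    {"name": "n1", "user_ids": [7, 13]},
    {"name": "n2", "user_ids": [1, 2]},
    {"name": "n3"},  # no user_ids property at all
]

result = [n["name"] for n in nodes if matches(n, 13)]
```

Here n3 simply drops out of the result, which is why the separate has(a.user_ids) test adds nothing in this case.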

Lambda expression in WHERE clause not working as expected

I'm new to Neo4j and trying to do a simple Cypher query using a lambda expression in the where clause but for some reason I can't seem to figure out why this isn't working.
Looks like:
class HealthNode {
public string Name{get;set;}
//Other Stuff
}
string Name = "Foobar";
var query = client
.Cypher
.Start(new { n = Neo4jClient.Cypher.All.Nodes })
.Where((HealthNode n) => n.Name == Name)
.Return<HealthNode>("n");
If I dump the Text and Parameters I'm getting:
START n=node(*)
WHERE (n.Name! = {p0})
RETURN n
//P0 Foobar
When I execute this, I of course get:
Cypher does not support != for inequality comparisons. Use <> instead
Why in the world is an extra exclamation point appended to the property name?
The ! means that the result will be false if the property doesn't exist. So, if you have more than one type in the graph, and that other type doesn't have a 'Name' property, neo4j won't bother matching.
See Neo4J Documentation for more info.
As to getting the != warning, are you changing the query at all when you paste it? Reformatting it? As I get the same warning if I do:
WHERE (n.Name != {p0})
but don't get any warning, and a successful completion if I use:
WHERE (n.Name! = {p0})
I think I found the cause of the problem here:
There was a change made to the 2.0 parser that implements NULL IF by default (instead of returning an error on a missing property) and removes the ! and ? operators since they no longer do anything.
neo4j pull request 1014
I suspect this will break a lot of things and not just Neo4J Client.
Fixed in Neo4jClient 1.0.0.625 and above, when talking to Neo4j 2.0.
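For readers unfamiliar with the old suffixes, the following Python sketch models how a property comparison behaves in Cypher 1.9 with no suffix, with ! and with ?, as I understand the 1.9 documentation. The function names and the MissingProperty exception are invented for illustration; nothing here is Neo4j or Neo4jClient code.

```python
class MissingProperty(Exception):
    """Stands in for the error 1.9 raises when a bare property is absent."""


def plain_equals(node, key, value):
    # 1.9: n.Name = value  -> error if the property does not exist
    if key not in node:
        raise MissingProperty(key)
    return node[key] == value


def bang_equals(node, key, value):
    # 1.9: n.Name! = value -> false if the property does not exist
    return key in node and node[key] == value


def question_equals(node, key, value):
    # 1.9: n.Name? = value -> true if the property does not exist
    return key not in node or node[key] == value


with_name = {"Name": "Foobar"}
without_name = {"Other": 1}
```

With the 2.0 change described above, a missing property no longer raises an error and the comparison is simply not true, which matches the bang_equals case and makes the suffix redundant.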

Incorrect sort order with Neo4jClient Cypher query

I have the following Neo4jClient code
var queryItem = _graphClient
.Cypher
.Start(new
{
n = Node.ByIndexLookup("myindex", "Name", sku),
})
.Match("p = n-[r:Relationship]->ci")
.With("ci , r")
.Return((ci, r) => new
{
N = ci.Node<Item>(),
R = r.As<RelationshipInstance<Payload>>()
})
.Limit(5)
.Results
.OrderByDescending(u => u.R.Data.Frequency);
The query is executing fine but the results are not sorted correctly (i.e. in descending order). Here is the Payload class as well.
Please let me know if you see something wrong with my code. TIA.
You're doing the sorting after the .Results call. This means that you're doing it back in .NET, not on Neo4j. Neo4j is returning any 5 results, because the Cypher query doesn't contain a sort instruction.
Change the last three lines to:
.OrderByDescending("r.Frequency")
.Limit(5)
.Results;
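The difference between the two orderings can be shown with a small Python sketch. The frequency values are made up, and list slicing stands in for LIMIT; it only illustrates why sorting after the limit gives a different answer.

```python
# Hypothetical Frequency values, in the order the database happens to return them.
frequencies = [3, 9, 1, 7, 5, 8, 2]

# Original code: LIMIT 5 runs on the server, then the 5 rows are sorted in .NET.
limit_then_sort = sorted(frequencies[:5], reverse=True)

# Corrected query: ORDER BY runs on the server first, then LIMIT 5.
sort_then_limit = sorted(frequencies, reverse=True)[:5]
```

The first list is sorted but is built from an arbitrary five rows, so the value 8 is missing; the second list holds the five highest frequencies.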
As a general debugging tip, Neo4jClient does two things:
1. It helps you construct Cypher queries using the fluent interface.
2. It executes these queries for you. This is a fairly dumb process: we send the text to Neo4j, and it gives the objects back.
The execution is obviously working, so you need to work out why the queries are different:
1. Read the doco at http://hg.readify.net/neo4jclient/wiki/cypher (we write it for a reason)
2. Read the "Debugging" section on that page, which tells you how to get the query text
3. Compare the query text with what you expected to be run
4. Resolve the difference (or report an issue at http://hg.readify.net/neo4jclient/issues/new if it's a library bug)

Resources