Neo 2.3.2 performance issues - neo4j

I just upgraded from 2.2.5 to 2.3.2, and a query that used to return immediately now takes considerable time. It seems linked to the depth, as reducing it from 5 to 3 makes it quicker.
More details below.
The following Neo4j query searches for recommended restaurants for the given user_id, with the search depth set to 5.
MATCH (u:User {user_id:"bf9203fba4484f96b4983152e9ee859a"})-[r*1..5]-(place:Place)
WHERE ALL (rel in r WHERE rel.rating >= 0)
RETURN DISTINCT place.place_id, place.title, length(r) as LoR ORDER BY LoR, place.title
The old server instance runs Neo4j 2.2.5, where the result is displayed instantly, but on the new VM with Neo4j 2.3.2 it takes quite a long time to return the result.
If we decrease the search depth to 2 or 3, the query runs faster.
Anyone else experiencing this?

How do the query times compare when running completely non-warmed-up, e.g. just start server and run this query? I'm suspecting property loading may be the biggest problem due to the removal of the object cache in the 2.3 series.
Do the Place nodes and relationships w/ the rating property have many properties each?
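A rough illustration of why the depth matters so much for a pattern like [r*1..5]: the number of candidate paths can grow very quickly with every extra hop. The sketch below is plain Python, not Neo4j. The fully connected six-node graph and the name count_trails are made up for illustration, and the code only mimics the rule that a single path may not reuse a relationship. It does not explain the difference between 2.2.5 and 2.3.2; it only shows why depth 5 is so much more work than depth 3.

```python
from collections import defaultdict

def count_trails(edges, start, max_depth):
    """Count paths of length 1..max_depth from start that never reuse an edge."""
    adjacency = defaultdict(list)
    for edge_id, (a, b) in enumerate(edges):
        adjacency[a].append((edge_id, b))
        adjacency[b].append((edge_id, a))  # undirected, like -[r*1..5]-

    counts = [0] * (max_depth + 1)

    def walk(node, used, depth):
        if depth == max_depth:
            return
        for edge_id, neighbour in adjacency[node]:
            if edge_id in used:
                continue  # a relationship may appear only once per path
            counts[depth + 1] += 1
            walk(neighbour, used | {edge_id}, depth + 1)

    walk(start, frozenset(), 0)
    return counts[1:]

# Hypothetical toy graph: 6 nodes, every pair connected once.
nodes = range(6)
edges = [(a, b) for a in nodes for b in nodes if a < b]
paths_per_depth = count_trails(edges, 0, 5)
print(paths_per_depth)  # one count per depth, from 1 hop to 5 hops
```

Every one of those paths also has its relationships checked by the ALL(...) predicate, so any extra cost per relationship is multiplied by the path count.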

Related

neo4j browser reports completely unrealistic runtime

I am using Neo4j community 4.2.1, playing with graph databases. I plan to operate on lots of data and want to get familiar with indexes and stuff.
However, I'm stuck at a very basic level because in the browser Neo4j reports query runtimes which have nothing to do with reality.
I'm executing the following query in the browser at http://localhost:7687/:
match (m:Method),(o:Method) where m.name=o.name and m.name <> '<init>' and
m.signature=o.signature and toInteger(o.access)%8 in [1,4]
return m,o
The DB has ~5000 Method labels.
The browser returns data after about 30sec. However, Neo4j reports
Started streaming 93636 records after 1 ms and completed after 42 ms, displaying first 1000 rows.
Well, 42ms and 30sec is really far away from each other! What am I supposed to do with this message? Did the query take only milliseconds and the remaining 30secs were spent rendering the stuff in the browser? What is going on here? How can I improve my query if I cannot even tell how long it really ran?
I modified the query to return count(m) + count(o) instead of m,o, which changed things: the runtime is now about 2 seconds, and Neo4j reports about the same figure.
Can somebody tell me how I can get realistic runtime figures for my queries without using the stopwatch on my cell phone?
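One way to think about the gap between the two numbers is that "started streaming after" measures the time to the first record, while the wall-clock time you experience includes fetching and handling every record. A minimal sketch of measuring both on any lazy row source follows. The helper name time_stream is hypothetical, and the generator stands in for a real query result; wiring this to an actual Neo4j driver is not shown and would be an assumption.

```python
import time

_MISSING = object()  # sentinel so a legitimate None row is still counted

def time_stream(rows):
    """Return (seconds to first row, seconds to consume all rows, row count)."""
    start = time.perf_counter()
    iterator = iter(rows)
    first = next(iterator, _MISSING)
    first_row_s = time.perf_counter() - start
    count = 0 if first is _MISSING else 1
    for _ in iterator:
        count += 1
    total_s = time.perf_counter() - start
    return first_row_s, total_s, count

# Stand-in for a lazily produced result set.
fake_rows = (i * i for i in range(1000))
first_row_s, total_s, row_count = time_stream(fake_rows)
print(f"first row after {first_row_s:.6f}s, all {row_count} rows after {total_s:.6f}s")
```

Timing the full consumption from client code, rather than relying on the browser's status line, at least separates query time from rendering time.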

Can "DISTINCT" in a CYPHER query be responsible of a memory error when the query returns no result?

Working on a pretty small graph of 5000 nodes with low density (mean connectivity < 5), I get the following error, which I never got before upgrading to Neo4j 3.3.0. The graph contains 900 molecules and their scaffold hierarchy, down to 5 levels.
(:Molecule)<-[:substructureOf*1..5]-(:Scaffold)
Neo.TransientError.General.StackOverFlowError
There is not enough stack size to perform the current task. This is generally considered to be a database error, so please contact Neo4j support. You could try increasing the stack size: for example to set the stack size to 2M, add `dbms.jvm.additional=-Xss2M' to in the neo4j configuration (normally in 'conf/neo4j.conf' or, if you are using Neo4j Desktop, found through the user interface) or if you are running an embedded installation just add -Xss2M as command line flag.
The query is actually very simple. I use DISTINCT because several paths may lead to a single scaffold.
match (m:Molecule) <-[:substructureOf*3]- (s:Scaffold) return distinct s limit 20
This query returns the above error message whereas the next query does work.
match (m:Molecule) <-[:substructureOf*3]- (s:Scaffold) return s limit 20
Interestingly, the query works on a much larger database. In this small one, the deepest hierarchy happens to be 2, so the result of the last query is "(no changes, no records)".
How come adding DISTINCT makes the query fail with that error? Is there a way to avoid it? I cannot guess the depth of the hierarchy, which can be different for each molecule.
I tried the following heap settings, as suggested in other posts.
#dbms.memory.heap.initial_size=512m
#dbms.memory.heap.max_size=512m
dbms.memory.heap.initial_size=512m
dbms.memory.heap.max_size=4096m
dbms.memory.heap.initial_size=4096m
dbms.memory.heap.max_size=4096m
None of these addressed the issue.
Thanks in advance for any help or clues.
Thanks for the additional info, I was able to replicate this on Neo4j 3.3.0 and 3.3.1, and this likely has to do with the behavior of the pruning-var-expand operation (that is meant to help when using variable-length expansions and distinct results) that was introduced in 3.2.x, and only when using an exact number of expansions (not a range). Neo4j engineering will be looking into this.
In the meantime, your requirement is such that a different kind of query can get the results you want while avoiding this operation. Give this one a try:
match (s:Scaffold)
where (s)-[:substructureOf*3]->(:Molecule)
return distinct s limit 20
And if you do need to perform queries that may produce this error, you may be able to circumvent them by prepending your query with CYPHER 3.1, which will execute this with a plan produced by an older version of Cypher which doesn't use the pruning var expand operation.
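To see why the rewritten query should return the same scaffolds, compare the two shapes on a toy hierarchy. This is a plain Python sketch, not Neo4j: the node names and the substructure_of mapping are invented, and it only models "exactly 3 hops to a molecule". The original shape produces one row per path and then deduplicates; the rewritten shape asks one yes/no question per scaffold.

```python
# Hypothetical hierarchy: key -[:substructureOf]-> each listed value.
substructure_of = {
    "s3": ["s2"],
    "s2": ["s1a", "s1b"],
    "s1a": ["m1"],
    "s1b": ["m1"],
    "s1x": ["m2"],
}
scaffolds = {"s3", "s2", "s1a", "s1b", "s1x"}
molecules = {"m1", "m2"}

def paths_to_molecules(node, hops):
    """Number of paths of exactly `hops` steps from node ending at a molecule."""
    if hops == 0:
        return 1 if node in molecules else 0
    return sum(paths_to_molecules(nxt, hops - 1)
               for nxt in substructure_of.get(node, []))

def reaches_molecule(node, hops):
    """True if at least one path of exactly `hops` steps ends at a molecule."""
    if hops == 0:
        return node in molecules
    return any(reaches_molecule(nxt, hops - 1)
               for nxt in substructure_of.get(node, []))

# Original shape: one row per path, then DISTINCT.
rows = [s for s in sorted(scaffolds) for _ in range(paths_to_molecules(s, 3))]
distinct_rows = sorted(set(rows))

# Rewritten shape: one existence check per scaffold.
filtered = sorted(s for s in scaffolds if reaches_molecule(s, 3))

print(rows, distinct_rows, filtered)
```

In this toy data the scaffold s3 reaches m1 by two different paths, which is exactly the duplication DISTINCT was there to remove.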

NullPointerException on query in Neo4j 3.0.4 enterprise

I am running Neo4j 3.0.4 Enterprise in HA and am encountering an extremely mysterious issue. On doing a very basic Cypher query from our Spring Data Neo4j application (following is taken from the Neo4j DB query.log):
MATCH (n:`Node`)
WHERE n.`key` = { `key` }
WITH n
MATCH p=(n)-[*0..1]-(m)
RETURN p, ID(n) - {key: 123456}
I get an exception related to the page cache:
java.lang.NullPointerException
at org.neo4j.io.pagecache.impl.muninn.MuninnPageCursor.assertPagedFileStillMappedAndGetIdOfLastPage(MuninnPageCursor.java:369)
at org.neo4j.io.pagecache.impl.muninn.MuninnReadPageCursor.next(MuninnReadPageCursor.java:55)
at org.neo4j.io.pagecache.impl.muninn.MuninnPageCursor.next(MuninnPageCursor.java:121)
at org.neo4j.kernel.impl.store.CommonAbstractStore.readIntoRecord(CommonAbstractStore.java:1039)
at org.neo4j.kernel.impl.store.CommonAbstractStore.access$000(CommonAbstractStore.java:64)
at org.neo4j.kernel.impl.store.CommonAbstractStore$1.next(CommonAbstractStore.java:1179)
at org.neo4j.kernel.impl.api.store.StoreSingleNodeCursor.next(StoreSingleNodeCursor.java:64)
at org.neo4j.kernel.impl.api.StateHandlingStatementOperations.nodeCursorById(StateHandlingStatementOperations.java:137)
at org.neo4j.kernel.impl.api.ConstraintEnforcingEntityOperations.nodeCursorById(ConstraintEnforcingEntityOperations.java:422)
at org.neo4j.kernel.impl.api.OperationsFacade.nodeGetProperty(OperationsFacade.java:333)
at org.neo4j.cypher.internal.spi.v3_0.TransactionBoundQueryContext$NodeOperations.getProperty(TransactionBoundQueryContext.scala:316)
And then from here on out, the same query will fail intermittently and will be logged as a one liner:
java.lang.NullPointerException
On the app server side, we get the following generic error code:
org.neo4j.ogm.exception.CypherException: Error executing Cypher "Neo.DatabaseError.Statement.ExecutionFailed"
Have any Neo4j experts seen this issue before?
Moving the master in my cluster (i.e. restarting) seems to fix it, but the problem resurfaces if any significant load is placed on the server. The page cache should be fairly large: I have an 8 GB database and have left both the heap size and the page cache size unset, so that they take their default values.
The logs indicate that there are a large number of concurrent queries right before this exception (all happening within the same second and quite a few querying for the exact same thing). Could there be some type of race condition in how the page cache works? What is a realistic limit on concurrent reads?
Any advice at all is greatly appreciated!

Different results of two (synonymous) queries in Neo4j

I have found that some queries return fewer results than expected. I took one of the missing results and tried to force Neo4j to return it, and I succeeded with the following query:
match (q0),(q1),(q2),(q3),(q4),(q5)
where
q0.name='v4' and q1.name='v3' and q2.name='v5' and
q3.name='v1' and q4.name='v3' and q5.name='v0' and
(q1)-->(q0) and (q0)-->(q3) and (q2)-->(q0) and (q4)-->(q0) and
(q5)-->(q4)
return *
I have supposed that the following query is semantically equivalent to the previous one. However in this case, Neo4j returns no result at all.
match (q1)-->(q0), (q0)-->(q3), (q2)-->(q0), (q4)-->(q0), (q5)-->(q4)
where
q0.name='v4' and q1.name='v3' and q2.name='v5' and
q3.name='v1' and q4.name='v3' and q5.name='v0'
return *
I have also manually verified that the required edges among vertices v0, v1, v3, v4 and v5 are present in the database with right directions.
Am I missing some important difference between these queries or is it just a bug of Neo4j? (I have tested these queries on Neo4j 2.1.6 Community Edition.)
Thank you for any advice
/EDIT: Updating to newest version 2.2.1 was of no help.
This might not be a complete answer, but here's what I found out.
These queries aren't synonymous, if I understand correctly.
First of all, use EXPLAIN (or even PROFILE) to look under the hood. The first query will be executed as follows:
The second query:
As you can see (even without going deep down), those are different queries in terms of both efficiency and semantics.
Next, what's actually going on here:
The 1st query looks through all (single) nodes, filters them by name, then tries to group them according to your pattern, which involves computing a Cartesian product (hence the enormous space complexity), then collects those groups into larger ones, and then evaluates your other conditions.
The 2nd query first picks a pair of nodes connected by some relationship (which satisfy the condition on the name property), then adds the third node and filters again, and so on until the end. The number of candidate nodes is expected to decrease after every filter cycle.
By the way, is it possible that you accidentally set the same name twice (for q1 and q4, which are both 'v3')?
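One possible explanation, which the answer above does not spell out: within a single MATCH clause Cypher will not use the same relationship twice, whereas pattern predicates in WHERE are each checked on their own. Since q1 and q4 are both named 'v3', (q1)-->(q0) and (q4)-->(q0) would need two different v3-->v4 relationships in the second query. The sketch below simulates that rule in plain Python. The data is hypothetical (one node per name, one relationship per required edge), and this is an assumption about the asker's database, not a confirmed diagnosis.

```python
from itertools import product

# Hypothetical data: relationship id -> (start node name, end node name).
relationships = {0: ("v3", "v4"), 1: ("v4", "v1"), 2: ("v5", "v4"), 3: ("v0", "v3")}

# Variable -> required name, as in the WHERE clause of both queries.
names = {"q0": "v4", "q1": "v3", "q2": "v5", "q3": "v1", "q4": "v3", "q5": "v0"}

# Directed pattern edges shared by both queries.
pattern = [("q1", "q0"), ("q0", "q3"), ("q2", "q0"), ("q4", "q0"), ("q5", "q4")]

def count_matches(require_distinct_relationships):
    """Count ways to pick one relationship per pattern edge."""
    candidates = [
        [rid for rid, ends in relationships.items() if ends == (names[a], names[b])]
        for a, b in pattern
    ]
    total = 0
    for chosen in product(*candidates):
        if require_distinct_relationships and len(set(chosen)) != len(chosen):
            continue  # the same relationship would be used twice
        total += 1
    return total

separate_predicates = count_matches(False)  # shape of the first query
single_match = count_matches(True)          # shape of the second query
print(separate_predicates, single_match)
```

If this is what is happening, giving q1 and q4 the same variable name (or removing the duplicate pattern) in the second query should bring the result back.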

Cypher query resultset growing over subsequent runs?

I'm new to Neo4j CYPHER query language. I'm discovering it, while analyzing a graph of person to person relationships coming from a CRM system. I'm using Neo4j 2.1.2 Community Edition with Oracle Java JDK 1.7.0_45 on Windows 7 Enterprise, and interacting with Neo4j thru the web interface.
One thing puzzles me: I noticed that the result sets of some of my queries grow over time. That is, if I run the same query after having used the database for quite a long time (1 or 2 hours later), I get a few more results the second time, without having updated, deleted or added anything in the database.
Is that possible? Are there special cases where it could happen? I would expect the database results to be consistent over time, as long as there is no change to the database.
It feels as if the database were growing its indexes over time in the background, and as if the query results depended on the database engine's ability to reach more nodes and relationships through the grown indexes. Could it be a memory or index configuration issue? Or did I possibly have too much coffee? Alas, it is not easily reproducible.
Sample query:
MATCH (pf:Portfolio)<-[:withRelation]-(p1:Partner)-[:JOINTACC]->(p2:Partner)
WHERE (pf.dateBoucl = '') AND (pf.catClient = 'NO')
AND NOT (p2)-[:relTo]->(:Partner)
MATCH (p1)-[r]->(p3:Partner)
WHERE NOT (p3)-[:relTo]->(:Partner)
AND NOT TYPE( r) IN [ 'relTo', 'ADRESSAT', 'MEMBER']
WITH pf, p1, p2, COLLECT( TYPE( r)) AS types
WHERE ALL( t IN types WHERE t = 'JOINTACC')
RETURN pf.catClient, pf.natureTitulaire, COUNT( DISTINCT pf);
At first I got 98 results. When running it 2 hours later, I get 103 results, and then it seems stable for subsequent runs. And I'm pretty sure I did not change the database contents.
Any hints are much appreciated! Kind regards
Schema looks like this:
:schema
Indexes
ON :Country(ID) ONLINE (for uniqueness constraint)
ON :Partner(partnerID) ONLINE (for uniqueness constraint)
ON :Portfolio(partnerID) ONLINE
ON :Portfolio(noCli) ONLINE
ON :Portfolio(noDos) ONLINE
Constraints
ON (partner:Partner) ASSERT partner.partnerID IS UNIQUE
ON (country:Country) ASSERT country.ID IS UNIQUE
Dump / download your query results from both runs and do a diff on them. Then you see what differs and you can investigate where it came from.
Perhaps you also should update to 2.1.3 which has one caching problem resolved that could be related to this.
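A minimal sketch of the diff step, assuming each run has been exported as a list of row tuples. The row values below are invented placeholders, and diff_rows is a hypothetical helper name. It uses multisets so that duplicate rows are not silently collapsed.

```python
from collections import Counter

def diff_rows(first_run, second_run):
    """Return (rows only in the second run, rows only in the first run)."""
    first, second = Counter(first_run), Counter(second_run)
    added = sorted((second - first).elements())
    removed = sorted((first - second).elements())
    return added, removed

# Placeholder exports from two runs of the same query.
run_1 = [("NO", "A", 1), ("NO", "B", 1)]
run_2 = [("NO", "A", 1), ("NO", "B", 1), ("NO", "C", 1)]

added, removed = diff_rows(run_1, run_2)
print("only in second run:", added)
print("only in first run:", removed)
```

For this to be useful with the sample query, the RETURN clause would need to expose something that identifies each row (for example an identifying property of pf) rather than only the aggregated count.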
