neo4j REST API slow - neo4j

I am using Neo4j 2.0.0M4 Community Edition with Node.js, using https://github.com/thingdom/node-neo4j to send Cypher queries to the Neo4j server over the REST API.
I have observed that Neo4j returns data quite slowly, both from the webadmin and through the REST API. For example,
a query returning 900 records takes 1.2 s on the first run, and subsequent runs take around 200 ms.
Similarly, when the number of records goes up to 27,000, the query takes 21 s in the webadmin browser.
I am wondering what's causing the REST API to be so slow, and how to go about improving the performance.
a) Is it Cypher, the JSON parsing, or
b) the HTTP overhead itself? A similar query returning 27,000 records takes 11 ms in MySQL.
Any help is highly appreciated

Neo4j 2.0 is currently a milestone build that is not yet performance optimized.
Consider enabling streaming and make sure you use parameterized Cypher.
For large result sets the browser consumes a lot of time for rendering. You might try the same query using cURL to see a difference.
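
As a rough illustration of the two suggestions above (streaming and parameterized Cypher), here is a minimal Python sketch that builds, but does not send, a request for the legacy Cypher endpoint of a 2.x server. The host, the `Person` label, the `age` property, and the helper name `build_cypher_request` are assumptions made up for the example; the `X-Stream: true` header is what asks the server to stream the result.

```python
import json
import urllib.request


def build_cypher_request(base_url, query, params):
    """Build (without sending) a POST request for the legacy Cypher endpoint."""
    # The query text stays constant; the values travel separately in "params".
    body = json.dumps({"query": query, "params": params}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/db/data/cypher",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json; charset=UTF-8",
            "X-Stream": "true",  # ask the server to stream the response
        },
        method="POST",
    )


# Hypothetical server address, label, and property names.
req = build_cypher_request(
    "http://localhost:7474",
    "MATCH (n:Person) WHERE n.age > {min_age} RETURN n.name LIMIT {limit}",
    {"min_age": 30, "limit": 1000},
)

# To actually run it against a live server:
# with urllib.request.urlopen(req) as resp:
#     rows = json.loads(resp.read().decode("utf-8"))["data"]
```

Timing a request like this outside the browser separates server and transport time from the time the web UI spends rendering.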

Related

cypher query using http/bolt into Neo4j hangs Java thread

I'm using Neo4j 3.5.14 Enterprise (Cypher over HTTP/Bolt). I'm seeing an issue where a Cypher query randomly gets stuck and never returns, which takes out a worker thread. Eventually, if the service is not redeployed, all worker threads get stuck and the service stops doing its job.
I tried using apoc.cypher.runTimeboxed, but that appears to cause my queries not to return until the time limit is over (20000 ms in this case), even though in some cases they could return faster than that. I'm not sure runTimeboxed would work anyway, because I believe the query is stuck forever and might not respond to the time limit, depending on how that's implemented.
My question is - how would you end a runaway query like that? Any tricks?

Monitoring CYPHER queries performance in Neo4J

I am using Neo4JClient to connect to my Neo4J database and execute CYPHER queries. My goal is to check the performance of the queries I send to the database. The problem is that I have to measure it on the database side, so I can't use Stopwatch in .NET. Queries have to be executed using Neo4JClient. I don't need to know execution times for specific queries; for example, an average over the last 1000 queries would be enough.
I can use only Neo4J Community Edition.
Thanks in advance!
Neo4j Enterprise Edition has the capability to log slow queries taking longer than a given threshold, see the config settings containing querylog on http://neo4j.com/docs/stable/configuration-settings.html.

Neo4j performance difference in using shell and API

I understand that Neo4j supports different options to run the Cypher queries. The web browser, neo4j shell and the REST API.
Is there a difference in performance when using the shell and the API?
I'm working on a dataset that has around 10 million objects(nodes+edges).
Thanks!
The web browser uses the REST API in the backend. The shell is connected directly to Neo4j.
So yes, you will see performance differences; the shell will generally be faster. However, using the shell will be slower than connecting to the REST API from your application, because in the shell you can't pass parameters.
In your application, passing parameters allows the execution plan to be cached (after the warm-up).
Also, if you have bad indexes and bad queries, running them on a dataset of 10 million objects will perform poorly in the shell, in the browser, and in your application alike.
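
To make the caching argument concrete, here is a toy Python model of a plan cache keyed by query text. It is only a sketch of the idea, not Neo4j's actual cache; the class `ToyPlanCache`, the `User` label, and the `id` property are hypothetical names invented for the example.

```python
class ToyPlanCache:
    """Toy stand-in for a query-plan cache keyed by the exact query text."""

    def __init__(self):
        self.plans = {}
        self.compilations = 0

    def plan_for(self, query_text):
        # Compile only when this exact text has not been seen before.
        if query_text not in self.plans:
            self.compilations += 1
            self.plans[query_text] = "plan#%d" % self.compilations
        return self.plans[query_text]


def run_with_literals(cache, ids):
    # Each id is baked into the text, so every query string is different.
    for i in ids:
        cache.plan_for("MATCH (n:User) WHERE n.id = %d RETURN n" % i)


def run_with_parameters(cache, ids):
    # The text is identical on every call; the id would be sent as a parameter.
    for _ in ids:
        cache.plan_for("MATCH (n:User) WHERE n.id = {id} RETURN n")


literal_cache = ToyPlanCache()
run_with_literals(literal_cache, range(100))

param_cache = ToyPlanCache()
run_with_parameters(param_cache, range(100))

print(literal_cache.compilations, param_cache.compilations)
```

In this model the literal variant compiles once per distinct id, while the parameterized variant compiles once and reuses the plan afterwards.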

Neo4j 2.0.4 browser cannot query large datasets

Whenever I try to run cypher queries in Neo4j browser 2.0 on large (anywhere from 3 to 10GB) batch-imported datasets, I receive an "Unknown Error." Then Neo4j server stops responding, and I need to exit out using Task Manager. Prior to this operation, the server shuts down quickly and easily. I have no such issues with smaller batch-imported datasets.
I work on a Win 7 64bit computer, using the Neo4j browser. I have adjusted the .properties file to allow for much larger memory allocations. I have configured my JVM heap to 12g, which should be fine for 64bit JDK. I just recently doubled my RAM, which I thought would fix the issue.
My CPU usage is pegged. I have the logs enabled but I don't know where to find them.
I really like the visualization capabilities of the 2.0.4 browser. Does anyone know what might be going wrong?
Your query is taking a long time, and the web browser interface reports "Unknown Error" after a certain timeout period. The query is still running, but you won't see the results in the browser. This drove me nuts too when it first happened to me. If you run the query in the neo4j shell you can verify whether or not this is the problem, because the shell won't time out.
Once this timeout occurs, you can find that the whole system becomes quite non-responsive, especially if you re-run the query, because now you have two extremely long queries running in parallel!
Depending on the type of query, you may be able to improve performance. Sometimes it's as simple as limiting the number of returned nodes (in cases where you only need to find one node or path).
Hope this helps.
Grace and peace,
Jim

Neo4j batching using REST interface locking database?

When batching several queries in a single HTTP request for Neo4j, does that cause the graph database to perform all the queries in that request before moving on to the next request?
Could this potentially mean that a large enough batch would lock the whole database for the time it takes to perform all queries in the batch? Or are they somehow run in parallel?
Is batching through the REST interface (and py2neo) using the batch inserter (so it's non-transactional) or normal transactional insertion?
Thanks
It performs all queries in the batch request, but other queries can come in in parallel and are executed on other threads. It is only if your batch-request consumes all CPU, Memory, IO that it affects other queries.
I would use the transactional API from 2.x on.
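
For reference, a minimal Python sketch of what a multi-statement request body for the transactional endpoint (`/db/data/transaction/commit` in 2.x) could look like. The helper name `build_transaction_payload`, the `Person` label, and the sample names are assumptions for illustration only; the sketch builds the JSON body and does not contact a server.

```python
import json


def build_transaction_payload(statements):
    """Serialize (statement, parameters) pairs into one transactional request body."""
    return json.dumps({
        "statements": [
            {"statement": text, "parameters": params}
            for text, params in statements
        ]
    })


# Two parameterized statements that would run in a single transaction.
payload = build_transaction_payload([
    ("CREATE (p:Person {name: {name}})", {"name": "Alice"}),
    ("CREATE (p:Person {name: {name}})", {"name": "Bob"}),
])

# The body would be POSTed with Content-Type: application/json to, for example,
# http://localhost:7474/db/data/transaction/commit (hypothetical host).
```

All statements in one such request are committed together, which is different from the non-transactional batch inserter.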
