Error while running huge Gremlin queries in DataStax Graph

All of this was run in DataStax Studio.
I need to add about 10 vertices and many edges in one go.
I wrote a Gremlin query for this, and it runs successfully.
Then I copied the same query twice, changing some values; again it ran successfully.
Then I tried the same query with different values in a different cell in DataStax Studio and got the following error:
java.util.concurrent.CompletionException: org.apache.tinkerpop.gremlin.driver.exception.ResponseException
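For context, each cell held one long chained traversal along these lines. This is only a sketch: the labels, properties, and endpoint are placeholders, and the gremlinpython client stands in for Studio, where you would paste just the query string into a cell.

    # A sketch of the failing kind of cell: one chained traversal adding a
    # handful of vertices and the edges between them. Labels, property names,
    # and the endpoint URL are placeholders, not the actual schema.
    from gremlin_python.driver import client

    c = client.Client("ws://localhost:8182/gremlin", "g")
    query = """
    g.addV('item').property('name', 'v1').as('a')
     .addV('item').property('name', 'v2').as('b')
     .addV('item').property('name', 'v3').as('c')
     .addE('linked').from('a').to('b')
     .addE('linked').from('b').to('c')
    """
    # Submit the whole script as one request and wait for the result.
    c.submit(query).all().result()
    c.close()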

Related

Monitoring Cypher query performance in Neo4j

I am using Neo4jClient to connect to my Neo4j database and execute Cypher queries. My goal is to measure the performance of the queries I send to the database. The problem is that I have to measure it on the DB side, so I can't use a Stopwatch in .NET. The queries have to be executed through Neo4jClient. I don't need execution times for specific queries; an average over the last 1000 queries would be enough.
I can only use Neo4j Community Edition.
Thanks in advance!
Neo4j Enterprise Edition can log slow queries that take longer than a given threshold; see the config settings containing querylog at http://neo4j.com/docs/stable/configuration-settings.html.
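For illustration, the Enterprise slow-query log is controlled by settings roughly like these (setting names as in the Neo4j 2.x docs; verify them against the page above for your version):

    # conf/neo4j.properties (Enterprise Edition only)
    dbms.querylog.enabled=true
    # log queries taking longer than this threshold
    dbms.querylog.threshold=500ms
    dbms.querylog.filename=data/log/queries.log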

Neo4j performance difference between the shell and the API

I understand that Neo4j supports several options for running Cypher queries: the web browser, the neo4j-shell, and the REST API.
Is there a difference in performance between using the shell and the API?
I'm working on a dataset that has around 10 million objects (nodes + edges).
Thanks!
The web browser uses the REST API in the backend. The shell is connected directly to Neo4j.
So yes, you will see performance differences; the shell will generally be faster. However, the shell will perform worse than connecting to the REST API from your application, because in the shell you can't pass parameters.
In your application, passing parameters allows the execution to be cached (after the warm-up); see the sketch below.
Also, if you have bad indexes and bad queries, running them against a 10-million-object dataset will simply not perform well in the shell, in the browser, or in your application.
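To make the parameter point concrete, here is a minimal sketch over the REST API using py2neo (the py2neo 2.x API and the Person/name schema are assumptions, not from the question):

    # The query text stays constant across calls, so the server can cache the
    # execution plan; only the parameter values change between runs.
    from py2neo import Graph

    graph = Graph("http://localhost:7474/db/data/")
    query = "MATCH (p:Person) WHERE p.name = {name} RETURN p"

    for name in ["Alice", "Bob"]:
        results = graph.cypher.execute(query, name=name)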

Neo4j broken/corrupted after ungraceful shutdown

I'm using Neo4j on Windows for testing purposes, and I'm working with a DB containing ~2 million relationships and about the same number of nodes. After an ungraceful shutdown of Neo4j while writing a batch of relationships, the DB got corrupted.
It seems there are some broken nodes/relationships in the DB, and whenever I try to read them I get this error (I'm using py2neo):
Error: NodeImpl#1292315 not found. This can be because someone else deleted this entity while we were trying to read properties from it, or because of concurrent modification of other properties on this entity. The problem should be temporary.
I tried rebooting, but Neo4j fails to recover from this error. I found this question:
Neo4j cannot read certain nodes. Throws NotFoundException. Corrupt database
but the answer there is no good for me, because it involves going over the DB and redoing the indexing, and I can't even read those broken nodes/relationships, so I can't fix their index (I tried it and got the same error).
In general I've had many stability issues with Neo4j (on multiple platforms, not just Windows). If no decent solution is found for this problem, I will have to switch to a different database.
thanks in advance!
I wrote a tool a while ago that lets you copy a broken store while keeping the good records intact.
You might want to check it out. I assume you used the 2.1.x version of Neo4j.
https://github.com/jexp/store-utils/tree/21
For 2.0.x check out:
https://github.com/jexp/store-utils/tree/20
To verify whether your datastore is consistent, follow the steps described at http://www.markhneedham.com/blog/2014/01/22/neo4j-backup-store-copy-and-consistency-check/.
Are you referring to the batch inserter API when you say "writing a batch of relations"?
If so, be aware that the batch inserter API requires a clean shutdown; see the big fat red warning at http://docs.neo4j.org/chunked/stable/batchinsert.html.
Are the broken nodes schema-indexed, and are you attempting to read them via this indexed label/property? If so, it's possible you have a broken index following the sudden shutdown.
Assuming this is the case, you could try deleting the schema subdirectory within the graph store directory while the server is not running and let the database rebuild the index on restart. While this isn't an official way to recover from a broken index, it can sometimes work. Obviously, I suggest you back up your store before trying this.
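Scripted, that step amounts to something like this (a sketch only; the store path is a placeholder, the server must be stopped, and you should back up the store first):

    # Delete the schema index directory so Neo4j rebuilds its indexes on the
    # next start. Only do this with the server stopped and a backup in place.
    import shutil

    shutil.rmtree("/path/to/data/graph.db/schema")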

Neo4j JSON download import

I'm just getting into Neo4j, and I spent a couple of hours mapping out some nodes and relationships.
I downloaded the JSON and I'm trying to move the nodes to another computer. It seems like it should be a pretty simple query, but everything I'm finding about batch import is for CSVs and a bit more involved.
Is there a simple Cypher query to import the JSON downloaded from the local Neo4j server?
Moving a full graph DB to another box is most simply done by copying over the data/graph.db directory.
Alternatively, you can use neo4j-shell's dump command.
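For the dump route, the usual invocation looks roughly like this (paths are placeholders; flags as in the 2.x neo4j-shell):

    # On the source machine: export the graph as Cypher statements
    neo4j-shell -path data/graph.db -c "dump" > backup.cql

    # On the target machine: replay the dump into the new store
    neo4j-shell -path data/graph.db -file backup.cql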

Neo4j REST API slow

I am using Neo4j 2.0.0-M4 Community Edition with Node.js and https://github.com/thingdom/node-neo4j to access the Neo4j DB server over the REST API by passing Cypher queries.
I have observed that the data returned by Neo4j, both from the Neo4j webadmin and from the REST API, arrives pretty slowly. For example:
a query returning 900 records takes 1.2 s, and subsequent runs take around 200 ms;
similarly, if the number of records goes up to 27,000, the query in the webadmin browser takes 21 seconds.
I am wondering what is causing the REST API to be so slow, and how to go about improving the performance.
a) Is it the Cypher execution or the JSON parsing, or
b) the HTTP overhead itself? A similar query returning 27,000 records in MySQL takes 11 ms.
Any help is highly appreciated.
Neo4j 2.0 is currently a milestone build that is not yet performance optimized.
Consider enabling streaming and make sure you use parameterized Cypher.
For large result sets, the browser consumes a lot of time rendering. You might try the same query using cURL to see the difference.
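As a rough sketch of both suggestions against the 2.0-era REST endpoint (the /db/data/cypher endpoint and the X-Stream header come from that version's REST API; the query and label are placeholders):

    # Parameterized Cypher over the legacy REST endpoint, with streaming
    # enabled so the server starts sending results instead of buffering
    # the whole response first.
    import requests

    resp = requests.post(
        "http://localhost:7474/db/data/cypher",
        json={
            "query": "MATCH (n:Record) WHERE n.qty > {min} RETURN n",
            "params": {"min": 10},
        },
        headers={"X-Stream": "true"},
    )
    rows = resp.json()["data"]
    print(len(rows))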
