Batch-imported data not available in Neo4j until server restart

I'm using batch-import to load a very small graph (5 nodes; 3 rels) into neo4j 1.9.1. The import reports success, yet the data is not available through webadmin or REST queries until I restart the neo4j server. Very strange. Can somebody enlighten me?

You should not have the server running while you do the batch import, since both processes access the same store files, and having the server running might corrupt your DB. First do the batch import, then start your server to see the correct numbers.
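For example, a minimal sketch of the right ordering, driven from Python (the install path, jar name, and CSV arguments are illustrative assumptions, not from the question):

import subprocess

NEO4J_HOME = "/opt/neo4j"  # illustrative install path

# 1. Stop the server first: the importer needs exclusive access to the store files.
subprocess.run([NEO4J_HOME + "/bin/neo4j", "stop"], check=True)

# 2. Run the batch import against the now-unlocked store directory.
subprocess.run(["java", "-jar", "batch-import-jar-with-dependencies.jar",
                NEO4J_HOME + "/data/graph.db", "nodes.csv", "rels.csv"], check=True)

# 3. Start the server again; it picks up the freshly imported data on boot.
subprocess.run([NEO4J_HOME + "/bin/neo4j", "start"], check=True)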

Related

How to know when Neo4j is ready to serve

I've developed an application which connects to Neo4j and creates a bunch of nodes. I've also developed a plugin for Neo4j using GraphAware. Both of these run in separate Docker containers (one for the code and one for Neo4j with the plugin).
Now, since I start these containers automatically and simultaneously, the code should wait for Neo4j to start completely before it tries creating the nodes. For that, I'm testing the availability of Neo4j by trying to connect to it using the bolt protocol (Neo4j's driver).
The problem is that Neo4j seems to start accepting incoming connections before it has completely loaded the plugins. As a result, the connection is made before Neo4j is actually ready, something goes wrong (I don't know what), and the whole application halts (I don't think that part is important), all because the connection is made before the plugins are loaded. I know this because if I delay the connection manually, everything proceeds smoothly.
So my question is: how can I make sure that Neo4j is fully warmed up before connecting to it? Right now I'm checking the availability of the management interface (http://localhost:7474), but what if there's no management interface to begin with?
At the moment you'll find that you can keep the management interface local, but you can't actually turn it off (unless you're working in embedded mode), so waiting for http://localhost:7474 is a good approach. If you want to be more fine-grained, you can check <your installation>\logs\debug.log for lines like:
2017-07-27 03:58:53.643+0000 INFO [o.n.k.AvailabilityGuard] Fulfilling of requirement makes database available: Database available
2017-07-27 03:58:53.644+0000 INFO [o.n.k.i.f.GraphDatabaseFacadeFactory] Database is now ready
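If you script the wait, a minimal Python sketch of that approach might look like this (the port, credentials, and timeout are illustrative, and the official neo4j Python driver is assumed):

import time
import requests
from neo4j import GraphDatabase  # official Python driver, used here for bolt

def wait_for_neo4j(url="http://localhost:7474", timeout=120):
    # Poll the HTTP management endpoint until it answers, or give up.
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            if requests.get(url, timeout=2).status_code == 200:
                return
        except requests.ConnectionError:
            pass  # server not listening yet
        time.sleep(2)
    raise TimeoutError("Neo4j did not become available in %ds" % timeout)

wait_for_neo4j()
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))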
Hope this helps.
Regards,
Tom

Google Cloud SQL Redmine MySQL not responding

I've been trying to set up Redmine on Google Compute Engine with the MySQL 5.5 database hosted on Google Cloud SQL (D1, 512 MB of RAM, always-on, Europe, package-billed).
Unfortunately, Redmine stops responding to requests after a few minutes (really stops: I set the timeout to 1 hour and nothing happens). Using New Relic I found out that it's database-related: ActiveRecord seems to have some problems with the database.
In order to find out if the problems are really related to the cloud sql database, I set up a new database on my own server and it's working fine since then. So there definitely is an issue with the cloud sql database and redmine/ruby.
Does anyone have an idea what I can try to solve the problem?
Best,
Jan
GCE idle connections are closed automatically after 10 minutes, as explained in [1]. As you are connecting to Cloud SQL from a GCE instance, this is most likely the cause of your issue.
Additionally, take into account that Cloud SQL instances can go down and come back at any time due to maintenance, and connections must be managed accordingly. Checking the Cloud SQL instance operation list would confirm this. Hope this helps.
[1] https://cloud.google.com/sql/docs/gce-access
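The general fix is to validate or reopen connections that may have sat idle past that 10-minute window instead of assuming they are still alive. Redmine is Ruby/ActiveRecord (where the mysql2 adapter's reconnect option in database.yml serves a similar purpose), but a Python sketch with PyMySQL illustrates the pattern (host and credentials are placeholders):

import pymysql

conn = pymysql.connect(host="CLOUD_SQL_IP", user="redmine",
                       password="secret", database="redmine")

def run_query(sql):
    # ping(reconnect=True) transparently reopens the connection if the
    # 10-minute idle timeout silently dropped it.
    conn.ping(reconnect=True)
    with conn.cursor() as cur:
        cur.execute(sql)
        return cur.fetchall()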

Neo4j broken/corrupted after ungraceful shutdown

I'm using Neo4j on Windows for testing purposes, working with a db containing ~2 million relations and about the same number of nodes. After an ungraceful shutdown of Neo4j while writing a batch of relations, the db got corrupted.
It seems like there are some broken nodes/relations in the db, and whenever I try to read them I get this error (I'm using py2neo):
Error: NodeImpl#1292315 not found. This can be because someone else deleted this entity while we were trying to read properties from it, or because of concurrent modification of other properties on this entity. The problem should be temporary.
I tried rebooting but neo4j fails to recover from this error. I found this question:
Neo4j cannot read certain nodes. Throws NotFoundException. Corrupt database
but the answer there is no good for me because it involves going over the db and redoing the indexing, and I can't even read those broken nodes/relations, so I can't fix their index (I tried it and got the same error).
In general I've had many stability issues with Neo4j (and on multiple platforms, not just Windows). If no decent solution is found for this problem I will have to switch to a different database.
Thanks in advance!
I wrote a tool a while ago that allows you to copy a broken store and keeps the good records intact.
You might want to check it out. I assume you used the 2.1.x version of Neo4j.
https://github.com/jexp/store-utils/tree/21
For 2.0.x check out:
https://github.com/jexp/store-utils/tree/20
To verify that your datastore is consistent, follow the steps mentioned in http://www.markhneedham.com/blog/2014/01/22/neo4j-backup-store-copy-and-consistency-check/.
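For reference, the check described there boils down to running the consistency checker that ships with Neo4j 2.x against a copy of the store while the server is stopped; a hedged sketch of invoking it from Python (the store path is illustrative):

import subprocess

# Run from the Neo4j installation directory; java expands the classpath
# wildcards itself. Point the tool at a *copy* of the store, server stopped.
subprocess.run(["java", "-cp", "lib/*:system/lib/*",
                "org.neo4j.consistency.ConsistencyCheckTool",
                "/tmp/graph.db"], check=True)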
Are you referring to batch inserter API when speaking of "while writing a batch of relations"?
If so, be aware that batch inserter API requires a clean shutdown, see the big fat red warning on http://docs.neo4j.org/chunked/stable/batchinsert.html.
Are the broken nodes schema indexed and are you attempting to read them via this indexed label/property? If so, it's possible you may have a broken index following the sudden shutdown.
Assuming this is the case, you could try deleting the schema subdirectory within the graph store directory while the server is not running and let the database rebuild the index on restart. While this isn't an official way to recover from a broken index, it can sometimes work. Obviously, I suggest you back up your store before trying this.
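A minimal sketch of that recovery step in Python (the store path is illustrative, the server must be stopped, and the backup line is not optional):

import shutil

store = "/var/lib/neo4j/data/graph.db"  # illustrative path to the graph store

# Back up the entire store before touching anything.
shutil.copytree(store, store + ".backup")

# Remove the schema index directory; Neo4j rebuilds the indexes on restart.
shutil.rmtree(store + "/schema")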

Neo4j 2.0.4 browser cannot query large datasets

Whenever I try to run Cypher queries in the Neo4j browser 2.0 on large (anywhere from 3 to 10 GB) batch-imported datasets, I receive an "Unknown Error." Then the Neo4j server stops responding, and I have to kill it using Task Manager. Before this happens, the server shuts down quickly and easily. I have no such issues with smaller batch-imported datasets.
I work on a Win 7 64bit computer, using the Neo4j browser. I have adjusted the .properties file to allow for much larger memory allocations. I have configured my JVM heap to 12g, which should be fine for 64bit JDK. I just recently doubled my RAM, which I thought would fix the issue.
My CPU usage is pegged. I have the logs enabled but I don't know where to find them.
I really like the visualization capabilities of the 2.0.4 browser, does anyone know what might be going wrong?
Your query is taking a long time, and the web browser interface reports "Unknown Error" after a certain timeout period. The query is still running, but you won't see the results in the browser. This drove me nuts too when it first happened to me. If you run the query in the neo4j shell you can verify whether or not this is the problem, because the shell won't time out.
Once this timeout occurs, you can find that the whole system becomes quite non-responsive, especially if you re-run the query, because now you have two extremely long queries running in parallel!
Depending on the type of query, you may be able to improve performance. Sometimes it's as simple as limiting the number of returned nodes (in cases where you only need to find one node or path).
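For example, the difference can be just a LIMIT clause on the query (shown here through the modern Python driver purely for illustration; bolt postdates Neo4j 2.0.4, and the Person/KNOWS schema is made up, but the same Cypher runs in the 2.0 shell):

from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

with driver.session() as session:
    # Unbounded: may traverse millions of paths and outlive the browser timeout.
    #   MATCH (n:Person)-[:KNOWS]->(m) RETURN n, m
    # Bounded: can return as soon as the first 25 rows are produced.
    for record in session.run("MATCH (n:Person)-[:KNOWS]->(m) RETURN n, m LIMIT 25"):
        print(record["n"], record["m"])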
Hope this helps.
Grace and peace,
Jim

Neo4j, There is not enough space on the disk

In our application, which uses neo4j-1.8.2, we have a so-called synchronization process. This process reads some data from a SQL Server db, processes it in some way, and makes the appropriate changes to the graph database. If the disk where the Neo4j database is located runs out of space, the Neo4j server stops working (it is still running but stops answering queries). In the Neo4j web admin I get the following response for each Cypher query: "Failed to get current transaction.". In the log file I see:
SEVERE: The RuntimeException could not be mapped to a response, re-throwing to the HTTP container
org.neo4j.graphdb.TransactionFailureException: Unable to commit transaction
...
Caused by: java.io.IOException: There is not enough space on the disk
My question is: when I clean some content from the disk and around 10 GB of free space appears, does that mean the Neo4j server will start working (answering queries) again automatically, or do I need to restart it?
What I see is that it does not work after cleaning some content; I have to restart it, and then it starts working again. I want to know whether this is expected, or whether I can do something to avoid restarting the Neo4j server.
Thanks in advance,
Denys
You have to restart Neo4j after it runs out of disk space. Best practice is to set up a monitoring system that alerts you when free disk space drops below a certain threshold, as sketched below.
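A minimal sketch of such a check in Python (the path and threshold are illustrative; in practice you would wire this into whatever alerting you already run):

import shutil

def check_free_space(path="/var/lib/neo4j", min_free_gb=10):
    # Alert when free disk space under the Neo4j data directory drops too low.
    free_gb = shutil.disk_usage(path).free / 1024 ** 3
    if free_gb < min_free_gb:
        print("ALERT: only %.1f GB free under %s" % (free_gb, path))
    return free_gb

check_free_space()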
