How to change the read timeout of cypher-shell? - neo4j

Using Neo4j 4.4.11 (community edition), I'm trying to delete all relationships of a certain type from my graph with cypher-shell:
MATCH ()-[r:MYRELATIONSHIPLABEL]->() CALL { WITH r DETACH DELETE r } IN TRANSACTIONS OF 10000 ROWS;
But I always end up with this error:
Connection read timed out due to it taking longer than the server-supplied timeout value via configuration hint.
Is it possible to increase the read timeout directly in cypher-shell without changing the server settings? (I did not find anything in the docs.)

Yes, that is possible by running the following command from cypher-shell:
CALL dbms.setConfigValue('dbms.transaction.timeout', '0');
The zero means no timeout. As Christophe points out in the comments, this call changes the server setting via cypher-shell, so all applications connecting to Neo4j are affected; there is no way around this other than restoring the original setting once you are done.
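If you go that route, it is worth recording the current value first so you can put it back afterwards. A minimal sketch (the 30s in the restore step is only an example; use whatever value listConfig reported):
// record the current value before changing anything
CALL dbms.listConfig('dbms.transaction.timeout') YIELD name, value;
// disable the timeout for the duration of the batch delete
CALL dbms.setConfigValue('dbms.transaction.timeout', '0');
// ...run the DETACH DELETE batches, then restore the original value
CALL dbms.setConfigValue('dbms.transaction.timeout', '30s');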

Related

NEO4J BoltConnectionError: No connection found, did you connect to Neo4j?

I am new to Neo4j, using Desktop (3.4.1 Enterprise Version).
I have used the LOAD CSV utility, executed from Cypher Shell, on a file with close to 1 million records. I monitored the load in the Neo4j Browser by watching the count of properties and relationships created. Every time, the load stops with the error "BoltConnectionError: No connection found, did you connect to Neo4j?". I have also tried monitoring through the browser at localhost:7474; the error there is different ("server connection time out.."), but the end result is that the LOAD CSV fails to complete. Could someone please advise what needs to be done to resolve this issue?
You should use USING PERIODIC COMMIT when loading data, to batch the load into smaller transactions and avoid killing your heap.
Also, you may want to EXPLAIN your query and ensure your load is using index lookups, especially if you're doing MERGEs on node properties.
In your query plan, watch out for Eager operations, which will effectively kill your periodic commit approach (the browser should warn you about this when the query is in the query box, prior to executing). If the previous advice isn't enough to help you pinpoint the issue, include your query here for analysis and troubleshooting, along with the query plan.
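A minimal sketch of the batched-load approach; the file name, label, and properties here are hypothetical, and the index is what turns the MERGE into an index lookup rather than a label scan:
CREATE INDEX ON :Person(id);

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file:///people.csv' AS row
MERGE (p:Person {id: row.id})
SET p.name = row.name;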

Neo4j query monitoring / profiling for long running queries

I have some really long-running queries. As background: I am crawling my graph for all instances of a specific meta path, for example, counting all instances of a given meta path found in the graph.
MATCH (a:Content)-[:isTaggedWith]->(t:Term)<-[:isTaggedWith]-(b:Content) RETURN count(*)
First, I want to measure the runtimes. Is there any way to do so, especially in the Community Edition?
Furthermore, I have the problem that I do not know whether a query is still running in Neo4j or whether it has already been terminated. I issue the query from a REST client, but I am open to other options if necessary. For example, I queried Neo4j with a REST client and set the (client-side) read timeout to 2 days. The problem is that I can't verify whether the query is still running or whether the client is simply waiting for a Neo4j answer that will never appear because the query was already killed in the backend. Is there really no way to check, from the browser or another client, which queries are currently running, ideally with an option to terminate them as well?
Thanks in advance!
Measuring Query Performance
To answer your first question, there are two main options for measuring the performance of a query. The first is to use PROFILE; put it in front of a query (like PROFILE MATCH (a:Content)-[:IsTaggedWith]->(t:Term)...), and it will execute the query and display the execution plan used, including the native API calls, number of results from each operation, number of total database hits, and total time of execution.
The downside is that PROFILE will execute the query, so if it is an operation that writes to the database, the changes are persisted. To profile a query without actually executing it, EXPLAIN can be used instead of PROFILE. This will show the query plan and native operations that will be used to execute the query, as well as the estimated total database hits, but it will not actually run the query, so it is only an estimate.
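Applied to the query above, the two variants look like this; PROFILE actually runs the count, while EXPLAIN only plans it:
PROFILE MATCH (a:Content)-[:isTaggedWith]->(t:Term)<-[:isTaggedWith]-(b:Content) RETURN count(*);
EXPLAIN MATCH (a:Content)-[:isTaggedWith]->(t:Term)<-[:isTaggedWith]-(b:Content) RETURN count(*);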
Checking Long Running Queries (Enterprise only)
Checking for running queries can be accomplished using Cypher in Enterprise Edition: CALL dbms.listQueries;. You must be logged in as an admin user to perform the query. If you want to stop a long-running query, use CALL dbms.killQuery() and pass in the ID of the query you wish to terminate.
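For example (logged in as an admin user; the query id below is a placeholder, use the one returned by listQueries):
// list what is currently running and note the offender's id
CALL dbms.listQueries() YIELD queryId, query RETURN queryId, query;
// terminate it by id
CALL dbms.killQuery('query-123');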
Note that apart from a manual kill or the configured query timeout, queries should not, in general, be getting killed on the backend unless you have something else set up to kill long-runners; with the above method, however, you can double-check your assumption that the queries are indeed still executing after being sent.
These are available only in Enterprise Edition; there is no way that I am aware of to use these functions or replicate their behavior in Community.
For measuring long-running queries, I figured out the following approach:
Use a tmux terminal session (see any tmux crash course), which is really easy: you can execute your query, close the terminal, and re-attach to the session later.
New session: tmux new -s <sessionName>
Detach from current session (within session): tmux detach
List sessions: tmux ls
Re-attach to session: tmux a -t <sessionName>
Within the tmux session, execute the query via cypher-shell, either directly in the shell or by piping the command into the shell. The latter approach is preferable because you can use the Unix command time to measure the runtime, as follows:
time cat query.cypher | cypher-shell -u neo4j -p n > result.txt
The file query.cypher simply contains the regular query, including the terminating semicolon at the end. The result of the query is piped into result.txt, and the runtime of the execution is displayed in the terminal.
Moreover, listing the currently running queries is possible only in the Enterprise Edition, as correctly stated by @rebecca.

MySQL Error 2013: Lost connection to MySQL server during query

I've read every post with the same or a very similar headline, but still can't find a proper solution or explanation for my problem.
I'm working with MySQL Workbench 6.3 CE. I have been able to create a database with several tables, and a connection from Python to write data to them. Still, I had a problem with a VARCHAR field that needed to hold more than 45 characters. When I try to raise its limit, e.g. to VARCHAR(70), I get the 2013 error saying my connection was closed during the query, no matter how many times I try or how high I set the timeout limits.
I'm using the above version of Workbench on Windows 10, and I'm trying to modify that field from Workbench. After that first failure, I can't drop the table either, nor can I connect from Python.
What is happening?
OK, apparently what was happening is that I had a lock, and there were a lot of queries stuck in the "Waiting for table metadata lock" state.
I did the following in the Workbench console:
SELECT CONCAT('KILL ', id, ';') FROM information_schema.processlist WHERE user = 'root';
That generates a KILL statement for each of those processes. I copied the list into a new tab and executed it to kill them all. After that, everything worked again.
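For reference, before killing anything you can confirm the pile-up; blocked statements show "Waiting for table metadata lock" in the State column of the process list:
SHOW FULL PROCESSLIST;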
Can anybody explain to me how I got into that situation, and what precautions to take in my Python scripts to avoid it?
Thank you

How to configure a query timeout in Neo4j 3.0.1

I'd like to set a query timeout in neo4j.conf for Neo4j 3.0.1. Any query taking longer than the timeout should get killed. I'm primarily concerned with setting the timeout for queries originating from the Neo4j Browser.
It looks like this was possible in the past with:
execution_guard_enabled=true
org.neo4j.server.webserver.limit.executiontime=20000
However, this old method doesn't work for me. I see Neo4j 3.0 has a dbms.transaction_timeout option, defined as a "timeout for idle transactions", but that setting doesn't seem to do the trick either.
Thanks to @stdob for the comment explaining a solution.
In Neo4j 3.0.1 Community, I verified that the following addition to neo4j.conf enabled a query timeout of 1 second for Browser queries:
unsupported.dbms.executiontime_limit.enabled=true
unsupported.dbms.executiontime_limit.time=1s
I did not check whether the timeout applies to queries outside of Neo4j Browser, but I assume so. I did find some documentation in the Neo4j codebase for unsupported.dbms.executiontime_limit.time:
If execution time limiting is enabled in the database, this configures the maximum request execution time.
I believe dbms.transaction.timeout is the current way of limiting execution time.
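For reference, where dbms.transaction.timeout is supported, it is a one-line neo4j.conf entry; a sketch assuming you want a 20-second limit, matching the old execution-guard value above:
dbms.transaction.timeout=20s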

neo4j broken/corrupted after ungraceful shutdown

I'm using Neo4j on Windows for testing purposes, working with a db containing ~2 million relationships and about the same number of nodes. After an ungraceful shutdown of Neo4j while writing a batch of relationships, the db got corrupted.
It seems like there are some broken nodes/relationships in the db, and whenever I try to read them I get this error (I'm using py2neo):
Error: NodeImpl#1292315 not found. This can be because someone else deleted this entity while we were trying to read properties from it, or because of concurrent modification of other properties on this entity. The problem should be temporary.
I tried rebooting, but Neo4j fails to recover from this error. I found this question:
Neo4j cannot read certain nodes. Throws NotFoundException. Corrupt database
but the answer there is no good for me, because it involves going over the db and redoing the indexing, and I can't even read those broken nodes/relationships, so I can't fix their index (I tried it and got the same error).
In general, I've had many stability issues with Neo4j (and on multiple platforms, not just Windows). If no decent solution is found for this problem, I will have to switch to a different database.
Thanks in advance!
I wrote a tool a while ago that allows you to copy a broken store while keeping the good records intact.
You might want to check it out. I assume you used the 2.1.x version of Neo4j.
https://github.com/jexp/store-utils/tree/21
For 2.0.x check out:
https://github.com/jexp/store-utils/tree/20
To verify that your datastore is consistent, follow the steps described at http://www.markhneedham.com/blog/2014/01/22/neo4j-backup-store-copy-and-consistency-check/.
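That post boils down to running the backup tool, which performs a full consistency check as part of the copy. A sketch, assuming a 2.x install where the neo4j-backup script is available (it may be Enterprise-only) and a locally running server:
# copies the store and runs a consistency check on the copy
bin/neo4j-backup -from single://127.0.0.1 -to /tmp/graph-backup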
Are you referring to the batch inserter API when you say "while writing a batch of relations"?
If so, be aware that the batch inserter API requires a clean shutdown; see the big fat red warning on http://docs.neo4j.org/chunked/stable/batchinsert.html.
Are the broken nodes schema indexed and are you attempting to read them via this indexed label/property? If so, it's possible you may have a broken index following the sudden shutdown.
Assuming this is the case, you could try deleting the schema subdirectory within the graph store directory while the server is not running and let the database rebuild the index on restart. While this isn't an official way to recover from a broken index, it can sometimes work. Obviously, I suggest you back up your store before trying this.
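A sketch of that recovery attempt, assuming a Unix-like shell and a default 2.x store layout (on Windows, rename the folder instead); paths will vary with your install:
# stop the server before touching the store
bin/neo4j stop
# move the schema indexes aside rather than deleting them outright
mv data/graph.db/schema data/graph.db/schema.broken
# per the advice above, the indexes are rebuilt on restart
bin/neo4j start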
