I'm new to Neo4j - just started playing with it yesterday.
I have a question - is this statement an atomic operation?
start n = node(68362), n1 = node(68363) match n-[r]->n1 delete r;
YES. If you haven't started a transaction explicitly, the Neo4j server will start one, execute the statement, and commit the transaction.
If you are asking whether Neo4j locks any data while executing this delete statement, then the answer is NO.
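For reference, the same deletion written in the newer MATCH syntax (a sketch only; the node ids are taken from the question, and `id()` lookups assume the internal ids are still valid):

```cypher
// Delete the relationship between the two nodes from the question
MATCH (n)-[r]->(n1)
WHERE id(n) = 68362 AND id(n1) = 68363
DELETE r;
```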
Related
Maybe I am very stupid or Neo4j is not supposed to be fast. (Disclaimer: I am a Neo4j noob)
I have the following simple Dijkstra query which is taking forever to run. I have to wait at least 5-10 minutes for it to execute. Sometimes my Chrome browser crashes because of it.
Sample Graph
Cypher Query
PROFILE MATCH (startNode:Stop)--(st:Stoptime),
(endNode:Stop)--(et:Stoptime)
WHERE endNode.name = 'Hauptbahnhof Süd' AND
(startNode.name = 'Schlump' OR startNode.name = 'U Schlump')
CALL apoc.algo.dijkstra(st, et, 'PRECEDES', 'weight') YIELD path, weight
RETURN startNode, endNode, path, weight
LIMIT 100;
Computer Config
I am using an Ubuntu VM on a Windows machine which has 24 GB RAM and 6 CPUs.
Indexes
Sysinfo
When I run PROFILE on the above query, I get the following information:
Profile Information
For the love of God, I can't figure out where the bottleneck lies. I have checked all the other answers on this, but to no avail.
Since I don't have the data set to test out my suggestion with, I can only point you in the direction that I would look. Hopefully, it leads you to the answer.
In looking at the profile and query I see that startNode and endNode are both type :Stop and that the Stop.name property is indexed.
When looking for endNode.name = 'Hauptbahnhof Süd' there are 3 estimated rows and 3 rows are returned.
However when looking for (startNode.name = 'Schlump' or startNode.name = 'U Schlump') there are 6 estimated rows, but 14827 returned.
Are there indeed 14827 :Stop nodes that contain either 'Schlump' or 'U Schlump'?
Or should the 6 estimated rows be correct? If the latter is the case, can you run the query without the OR:
where endNode.name = 'Hauptbahnhof Süd' and startNode.name = 'Schlump'
to see what the profiler comes up with.
If that performs as expected, then the solution may be to rewrite the query to express that OR logic in a different form.
Perhaps
where endNode.name = 'Hauptbahnhof Süd' and startNode.name IN ['Schlump','U Schlump']
I also found an older answer indicating an issue with the OR operator and indexes prior to 3.2. I remember seeing another recent answer about some issue with OR, but I can't seem to locate it now.
Good luck!
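Putting those suggestions together, a sketch of the rewritten query (same labels, properties, and `apoc.algo.dijkstra` call as in the question; only the OR predicate is replaced with IN so the index can be used for both names):

```cypher
PROFILE MATCH (startNode:Stop)--(st:Stoptime),
      (endNode:Stop)--(et:Stoptime)
WHERE endNode.name = 'Hauptbahnhof Süd'
  AND startNode.name IN ['Schlump', 'U Schlump']
CALL apoc.algo.dijkstra(st, et, 'PRECEDES', 'weight') YIELD path, weight
RETURN startNode, endNode, path, weight
LIMIT 100;
```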
I am using Py2neo 3.0 and Neo4j 3.0 to create nodes. I followed the transaction statements to create the nodes, but it failed.
Syntax:
tx = graph.begin()
a = Node("Person1", name="Alicedemo")
tx.create(a)
tx.commit
And, then did the same without transaction, and succeeded.
Syntax:
a = Node("Person1", name="Alicedemo")
graph.create(a)
Is there any problem with transactions in py2neo, or am I missing something?
I believe you forgot to use parentheses:
tx.commit()
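The difference matters because `tx.commit` without parentheses is just a reference to the bound method and never invokes it, so the transaction is never committed. A minimal sketch, with no py2neo dependency (the `FakeTransaction` class is hypothetical, purely to illustrate the Python behaviour):

```python
class FakeTransaction:
    """Stand-in for a py2neo transaction; records whether commit() ran."""
    def __init__(self):
        self.committed = False

    def commit(self):
        self.committed = True

tx = FakeTransaction()
tx.commit              # a bare reference to the method; nothing is executed
assert tx.committed is False

tx.commit()            # actually invokes commit
assert tx.committed is True
```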
I am using this example, http://neo4j.com/docs/stable/cypher-cookbook-newsfeed.html, to maintain newsfeeds for my users. So I use the following to post a status update:
MATCH (me)
WHERE me.name='Bob'
OPTIONAL MATCH (me)-[r:STATUS]-(secondlatestupdate)
DELETE r
CREATE (me)-[:STATUS]->(latest_update { text:'Status',date:123 })
WITH latest_update, collect(secondlatestupdate) AS seconds
FOREACH (x IN seconds | CREATE (latest_update)-[:NEXT]->(x))
RETURN latest_update.text AS new_status
I encountered a severe flaw in this and don't know how to fix it. In the very rare scenario where two status updates are posted at exactly the same time (e.g. 10 ms apart), instead of replacing the current status, Neo4j creates two status updates. This leads to a much bigger problem where the next updates are posted twice!
This looks like a race condition. To resolve that you basically need to make sure that at a given time only one transaction is modifying the status for this specific user.
Neo4j's Java API does have the ability to set locks to achieve this. Cypher doesn't have an explicit feature for this, but you can e.g. remove a non-existent property to force a lock on the given node. With a lock in place, concurrent transactions need to wait until the holder of the lock has finished its transaction.
So grab a lock early in your statement:
MATCH (me)
WHERE me.name='Bob'
REMOVE me._not_existing // side effect: grab a lock early
WITH me
OPTIONAL MATCH (me)-[r:STATUS]-(secondlatestupdate)
DELETE r
CREATE (me)-[:STATUS]->(latest_update { text:'Status',date:123 })
WITH latest_update, collect(secondlatestupdate) AS seconds
FOREACH (x IN seconds | CREATE (latest_update)-[:NEXT]->(x))
RETURN latest_update.text AS new_status
In the following scenario, node "x" does not exist.
start x=node:node_auto_index(key="x"), y=node(*)
return count(x), count(y)
It seems that if any of the starting points can't be found, nothing is returned.
Any suggestions how to work around this issue?
This is like saying the following (in SQL): what do you expect will happen if table X is empty?
select count(x), count(y)
from x, y
I'm not sure exactly what you're trying to query here, but you might need to get your counts one at a time, if there's a chance that x will come back with no results:
start x=node:node_auto_index(key="x")
with count(x) as cntx
start y=node(*)
return cntx, count(y) as cnty
Thanks to Wes, I figured out how to do a "conditional add" with the old Cypher syntax (pre 2.0):
START x=node:node_auto_index(key="x")
with count(x) as exists
start y=node:node_auto_index(key="y")
where exists = 0
create (n {key:"y"})<-[:rel]-y
return n, y
The crux here is that you can't fire another "start" after a "where" clause. You need to query for the second node before checking the conditions (which is kind of bad for performance). This is remedied in 2.0 with conditional constructs anyway...
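In Cypher 2.0 and later, this kind of "create only if it doesn't exist" is usually expressed with MERGE, which matches an existing node or creates it atomically. A minimal sketch using the key property from the question:

```cypher
// Matches the node if it exists, otherwise creates it
MERGE (n {key: "x"})
RETURN n;
```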
In production, I am facing this problem.
There is a delete which is taking a long time to execute and finally throws SQL error -243.
I got the query using onstat -g.
Is there any way to find out what is causing it to take this long and finally error out?
It uses COMMITTED READ isolation.
This is causing high Informix CPU usage as well.
EDIT
Environment - Informix 9.2 on Solaris
I do not see any issue related to indexes or application logic, but I suspect some Informix corruption.
The session holds 8 locks on different tables while executing this DELETE query.
But, I do not see any locks on the table on which the delete is performed.
Could it be that Informix is unable to get a lock on the table?
DELETE doesn't care about your isolation level. You are getting 243 because another process is locking the table while you're trying to run your delete operation.
I would put your delete into an SP and commit every Nth record:
CREATE PROCEDURE tmp_delete_sp (
p_commit_records INTEGER
)
RETURNING
INTEGER,
VARCHAR(64);
DEFINE l_current_count INTEGER;

LET l_current_count = 0; -- initialise the counter before use
SET LOCK MODE TO WAIT 5; -- Wait 5 seconds if another process is locking the table.

BEGIN WORK;
FOREACH WITH HOLD
    SELECT .....
    DELETE FROM table WHERE ref = ^^ Ref from above;
    LET l_current_count = l_current_count + 1;
    IF (l_current_count >= p_commit_records) THEN
        COMMIT WORK;
        BEGIN WORK;
        LET l_current_count = 0;
    END IF;
END FOREACH;
COMMIT WORK;
RETURN 0, 'Deleted records';
END PROCEDURE;
There are some syntax issues there, but it's a good starting point for you. Remember, inserts and updates get incrementally slower as you use more logical logs.
Informix was restarted ungracefully many times, which led to instability.
This was the root cause.