While importing a ton of data from spreadsheets, I tried to follow a naming convention where node labels are capitalized like "This" and relationship types are written like "THIS". In one case, I accidentally used the relationship format for a set of nodes. I then deleted those nodes and reimported them with the correct label format. (Side question: was there a way to rename a label that I didn't see, which could have avoided the delete/reimport?)
My problem is that in the built-in Cypher browser (Neo4j 2.1.3), both the right and wrong labels show up on the node list, even though there are zero nodes with the wrong label. So while I successfully removed the nodes, I can't figure out how to remove the label - not from nodes, which is easy enough using the REMOVE command, but from the database entirely. Why didn't it remove this label automatically when the items it was assigned to reached zero?
To be more specific, I can click on the node label for MEASURES and this query fires:
MATCH (n:`MEASURES`) RETURN n LIMIT 25
with these results:
Returned 0 rows in 77 ms
I would like to completely remove the label 'MEASURES' from the database since nothing is using it. Please let me know if you need further info.
I don't think there is yet a builtin way to remove no-longer-used labels entirely from a neo4j DB. I have also been annoyed by obsolete labels still showing up in places like the neo4j browser web UI.
I know of one way to remove them, but it may not be practical if your DB is enormous, and it might not be totally safe. Therefore, if you choose to do the following, you should make sure you have your original DB backed up (for example, you should make a copy of your original graph.db folder or rename it).
The technique is actually very simple. You just export all the data, shut down neo4j, delete or rename the original graph.db folder, restart neo4j, and then re-import the data. The following steps assume that you are in your neo4j installation folder in a linux environment, and neo4j is not running as a service.
Export the data (as CYPHER statements that will recreate the data):
./bin/neo4j-shell -c "dump" > mydump.cql
Shut down neo4j (as it is not safe to remove or rename graph.db while the DB is running):
./bin/neo4j stop
Rename the current graph.db folder, just in case you need to replace the new folder created below:
mv data/graph.db data/graph.db.archive
Restart neo4j, which will automatically create a new graph.db folder:
./bin/neo4j start
Re-import the data from the dump:
./bin/neo4j-shell -file mydump.cql
The obsolete labels should be gone at this point from everywhere (you should reload any neo4j web pages).
Here is how to do it.
What you have to make sure is that
1) no node is using the label
2) and there are no indices or constraints on the label
1) Removing/renaming the label on nodes with a cypher query:
MATCH (n:OldLabel)
SET n:NewLabel /* Optional line if you want to rename the label */
REMOVE n:OldLabel
RETURN n
2a) Check if indices or constraints are using the label using the schema command in the neo4j-shell:
$ neo4j-shell
Welcome to the Neo4j Shell! Enter 'help' for a list of commands
NOTE: Remote Neo4j graph database service 'shell' at port 1337
neo4j-sh (?)$ schema
Indexes
ON :OldLabel(id) ONLINE (for uniqueness constraint)
ON :Person(name) ONLINE (for uniqueness constraint)
ON :Person(id) ONLINE (for uniqueness constraint)
Constraints
ON (person:Person) ASSERT person.name IS UNIQUE
ON (person:Person) ASSERT person.id IS UNIQUE
ON (oldlabel:OldLabel) ASSERT oldlabel.id IS UNIQUE
2b) Remove index and constraint in a cypher query:
DROP CONSTRAINT ON (n:OldLabel)
ASSERT n.id IS UNIQUE;
DROP INDEX ON :OldLabel(id);
Remember to make new indices and constraints if you just wanted to rename the label.
After this the label should no longer show up in the web interface.
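Step 2 above can be sketched as a small helper that reads the text printed by the schema command and builds the matching DROP statements for one label. This is only a sketch: the function name drop_statements is hypothetical, and it assumes the legacy neo4j-shell output format shown above. It also assumes that an index marked "(for uniqueness constraint)" is removed together with its constraint, so no separate DROP INDEX is generated for it.

```python
import re

# Line shapes taken from the `schema` output shown above (legacy format).
INDEX_RE = re.compile(r"^ON :(\w+)\((\w+)\)\s+\w+(.*)$")
CONSTRAINT_RE = re.compile(r"^ON \((\w+):(\w+)\) ASSERT (.+)$")


def drop_statements(schema_text, label):
    """Return Cypher DROP statements for every index/constraint on `label`."""
    statements = []
    for raw in schema_text.splitlines():
        line = raw.strip()
        constraint = CONSTRAINT_RE.match(line)
        if constraint and constraint.group(2) == label:
            # Reuse the constraint text exactly as the shell printed it.
            statements.append("DROP CONSTRAINT " + line)
            continue
        index = INDEX_RE.match(line)
        if index and index.group(1) == label:
            # Assumption: constraint-backed indexes go away with the constraint.
            if "for uniqueness constraint" in index.group(3):
                continue
            statements.append(
                "DROP INDEX ON :{}({})".format(index.group(1), index.group(2))
            )
    return statements
```

The returned statements can be pasted into neo4j-shell or the browser one at a time; checking them by eye before running them is a good idea.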
Labels don't really exist apart from nodes that use them. You can always query for nonexistent labels, and you'll always get zero nodes back.
Here, you're querying for MEASURES and getting nothing. That's pretty much the same thing as the label not existing.
Here's an example with a database I made just now:
$ neo4j-shell -path test
NOTE: Local Neo4j graph database service at 'test'
Welcome to the Neo4j Shell! Enter 'help' for a list of commands
neo4j-sh (?)$ MATCH (m:TotallyNonExistantLabel) return m;
+---+
| m |
+---+
+---+
0 row
1946 ms
So, the bottom line is that you can't really delete labels from your database other than by removing them from (or deleting) all of the nodes that use them. You can remove a label from its nodes like this:
MATCH (f:ThisLabelGonnaDieSucka)
REMOVE f:ThisLabelGonnaDieSucka
RETURN f;
That's basically deleting ThisLabelGonnaDieSucka from the database.
Related
I want to delete every node and edge of any type from a Neo4j database. There are different ways of deleting nodes and edges suggested on SO. However, since my database is huge, and since all these methods rely on first querying for edges/nodes and then deleting them, which leads to loading (at least their indexes) into memory, these methods fail for my use case with the out-of-memory error. See the following example.
match ()-[r]->() delete r
match (n) delete n
Neo.TransientError.General.OutOfMemoryError: There is not enough memory to perform the current task. Please try increasing 'dbms.memory.heap.max_size' in the neo4j configuration (normally in 'conf/neo4j.conf' or, if you are using Neo4j Desktop, found through the user interface) or if you are running an embedded installation increase the heap by using '-Xmx' command line flag, and then restart the database.
For different reasons, I cannot increase the amount of configured memory.
One radical solution is to delete the database's files which would lead to deleting the database effectively (even resetting the indexes). In my use case, this approach has its downsides, e.g., some of our applications rely on a set import path to bulk load data (e.g., Neo4jDesktop\relate-data\dbmss\dbms-...\import\) where deleting and re-creating the database requires updating all those dependent applications.
I was wondering if there is any efficient approach other than these to delete all nodes and edges from a huge Neo4j database, ideally without needing to load/query the nodes/edges first.
If you are using Neo4j 4.3 and above, you can simply use:
DROP DATABASE database_name IF EXISTS // best way
Or you can use the CALL { ... } IN TRANSACTIONS syntax (available from Neo4j 4.4), like this:
MATCH (n)
CALL { WITH n
DETACH DELETE n
} IN TRANSACTIONS OF 10000 ROWS;
The above query might not work in Neo4j Browser, which runs queries in an explicit transaction by default. To run it in Neo4j Browser, prefix it with :auto:
:auto MATCH (n)
CALL { WITH n
DETACH DELETE n
} IN TRANSACTIONS OF 10000 ROWS;
Finally, if your version is older than 4.x, you can try APOC as suggested in another answer, or simply run this query repeatedly until the returned count is zero.
MATCH (n)
WITH n LIMIT 10000
DETACH DELETE n
RETURN count(*);
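Running that query "until the output is zero" can be scripted. The following is a minimal sketch, not a definitive implementation: run_query is a hypothetical callable that executes one statement in its own transaction and returns the integer from the deleted column (for example a thin wrapper around whichever driver you use). The $batch parameter syntax assumes Neo4j 3.0 or later.

```python
# The batch query from above, with the limit turned into a parameter.
BATCH_DELETE = (
    "MATCH (n) WITH n LIMIT $batch "
    "DETACH DELETE n RETURN count(*) AS deleted"
)


def delete_all_in_batches(run_query, batch_size=10000):
    """Repeat the batch delete until a run reports zero deleted nodes.

    Returns the total number of nodes deleted across all runs.
    """
    total = 0
    while True:
        deleted = run_query(BATCH_DELETE, {"batch": batch_size})
        if deleted == 0:
            return total
        total += deleted
```

Because each call is a separate transaction, only one batch has to fit in memory at a time; a smaller batch_size trades speed for a lower memory footprint.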
You can use the apoc.periodic.iterate procedure; the documentation explains the details:
https://neo4j.com/labs/apoc/4.1/overview/apoc.periodic/apoc.periodic.iterate/#usage-apoc.periodic.iterate
CALL apoc.periodic.iterate(
"MATCH (n) RETURN n",
"DETACH DELETE n",
{batchSize:10000, parallel:true})
This will delete the nodes and edges in batches of 10,000.
You can adjust the batch size to suit the memory you have available.
I'm using Neo4j Enterprise Edition. I want to clear the whole database I created earlier, meaning every single node along with its relationships and properties. I found this syntax in a Neo4j book and ran it:
MATCH (a)
OPTIONAL MATCH (a)-[r]-()
DELETE a, r
But I can still see the properties listed under property keys.
What's wrong?
What should I do so that even properties get deleted?
Neo4j Browser just shows the data returned from CALL db.propertyKeys(). Currently the procedure db.propertyKeys() returns unused properties too, as you can see in this GitHub issue at the Neo4j repo.
That is: your database is totally empty, but Neo4j Browser is still showing the properties that existed in your database at some point in time.
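To see which of the reported keys are merely leftovers, you can compare the procedure's output with the keys that are actually present on data. A minimal sketch, assuming you have already fetched the three result lists yourself; the function name stale_keys and the two "used keys" queries are illustrative, not part of the original answer, and those two queries scan the whole graph, so they can be slow on a large database.

```python
# Keys the store still remembers (the call the Browser relies on).
REPORTED_KEYS = "CALL db.propertyKeys()"
# Keys actually present on nodes and on relationships (full scans).
USED_NODE_KEYS = "MATCH (n) UNWIND keys(n) AS key RETURN DISTINCT key"
USED_REL_KEYS = "MATCH ()-[r]->() UNWIND keys(r) AS key RETURN DISTINCT key"


def stale_keys(reported, used_on_nodes, used_on_rels):
    """Return reported property keys that no node or relationship carries."""
    in_use = set(used_on_nodes) | set(used_on_rels)
    return sorted(set(reported) - in_use)
```

On a database that has just been emptied, both "used" lists are empty, so every reported key comes back as stale.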
Since you are deleting all your nodes and relationships, you can alternatively delete all contents of the <neo4j-home>/data/databases/graph.db/ folder and restart the Neo4j service. But you will need to recreate all indexes and constraints and authenticate again.
Tip: You can use DETACH DELETE to delete a node and any relationship going to or from it. So instead of the query you wrote, you can use:
match (node)
detach delete node
My import.csv creates many nodes, and the merge produces a huge cartesian product and runs into a transaction timeout now that the data has grown so much. I've currently set the transaction timeout to 1 second because every other query is very quick and is not supposed to take any longer than one second to finish.
Is there a way to split or execute this specific query in smaller chunks to prevent a timeout?
Upping or disabling the transaction timeout in the neo4j.conf is not an option because the neo4j service needs a restart for every change made in the config.
The query hitting the timeout from my import script:
MATCH (l:NameLabel)
MATCH (m:Movie {id: l.id,somevalue: l.somevalue})
MERGE (m)-[:LABEL {path: l.path}]->(l);
Nodecounts: 1000 Movie, 2500 Namelabel
You can try installing APOC Procedures and using the procedure apoc.periodic.commit.
call apoc.periodic.commit("
MATCH (l:NameLabel)
WHERE NOT (l)<-[:LABEL]-(:Movie)
WITH l LIMIT {limit}
MATCH (m:Movie {id: l.id,somevalue: l.somevalue})
MERGE (m)-[:LABEL {path: l.path}]->(l)
RETURN count(*)
",{limit:1000})
The query above will be executed repeatedly in separate transactions until it returns 0.
You can change the value of {limit : 1000}.
Note: remember to install the APOC Procedures release that matches the version of Neo4j you are using. Take a look at the Version Compatibility Matrix.
The number of nodes and labels in your database suggest this is an indexing problem. Do you have constraints on both the Movie and Namelabel (which should be NameLabel since it is a node) nodes? The appropriate constraints should be in place and active.
Indexing and Performance
Make sure to have indexes and constraints declared and ONLINE for entities you want to MATCH or MERGE on
Always MATCH and MERGE on a single label and the indexed primary-key property
Prefix your load statements with USING PERIODIC COMMIT 10000
If possible, separate node creation from relationship creation into different statements
If your import is slow or runs into memory issues, see Mark’s blog post on Eager loading.
If your Movie nodes have unique names, then create a uniqueness constraint on them (CREATE CONSTRAINT ... ASSERT ... IS UNIQUE).
If one of the nodes is not unique but will be used in a relationship definition, then use the CREATE INDEX ON statement. With such a small dataset it may not be readily apparent how inefficient your queries are. Try the PROFILE command and see how many nodes are being searched. Your MERGE statement should only check a couple of nodes at each step.
I deleted all nodes and relationships. Now I want to delete all existing labels with a Cypher query, but I can't.
You are probably referring to the neo4j browser's "Node labels" display. The browser can continue to display labels that have been deleted from all nodes (or even if the DB no longer has any nodes). This is really just a minor nuisance.
As long as your Cypher queries show that there are no nodes with that label, rest assured that the label does not "really" exist in the DB.
If you're removing all of the data (nodes and relationships) anyway, you might as well delete your graph.db directory or wherever you store your data. This will also result in having pre-existing labels not show up in the browser.
This will also remove all indexes you might have had set up.
How do I delete labels in neo4j? I deleted all nodes and relationships, then recreated the movie database, and the labels I had created before still appeared in the web interface. I also tried to use a different location for the database, and even after an uninstall and reinstall the labels still appeared. Why? Where are the labels stored? After uninstalling the program, the database folder and the appdata folder were deleted.
How to reproduce? Install neo4j -> use the movie database example -> create (l:SomeLabel {name:"A freaky label"}) -> delete the node -> stop neo, create new folder -> start neo -> create movie schema -> match (n) return (n) -> SomeLabel appears, even if you changed the folder or did an uninstall / install.
Is there a way to delete labels even if there is no node with it?
There isn't at the moment (Neo4j 2.0.1) a way to explicitly delete a label once it has been created. Neo4j Browser will display all labels which are reported by the REST endpoint at:
http://localhost:7474/db/data/labels
Separately, the Neo4j Browser sidebar which displays labels doesn't properly refresh the listing when it loses connection with Neo4j. A web browser reload should work.
Lastly, there was a bug in Neo4j Browser's visualization which would display all labels for which a style had been created. If using a version of Neo4j which has the bug, you can clear the styling by clicking on "View Stylesheet" in the property inspector, then clicking the fire extinguisher icon. All of that needs usability improvement, admittedly.
Cheers,
Andreas
This seems to have been fixed as of version 2.3.0.
As an example, suppose we had created a movie in the data browser such as:
CREATE(m:Movie:Cinema:Film:Picture{title:"The Matrix"})
We could query it with
MATCH(m:Movie)
WHERE m.title = "The Matrix"
RETURN m
It would have 4 labels: Movie, Cinema, Film, and Picture
To remove the Picture label from all movies:
MATCH(m:Movie)
REMOVE m:Picture
RETURN m
To remove the Picture label from only that one movie:
MATCH(m:Movie)
WHERE m.title = "The Matrix"
REMOVE m:Picture
RETURN m
Let us assume that we have created a PRODUCT_MASTER node and an index on it, as below:
CREATE (:PRODUCT_MASTER { product_code: "ABC", product_name: "XYX" });
CREATE INDEX ON :PRODUCT_MASTER (product_code);
Now even if I delete all PRODUCT_MASTER nodes from the graph, PRODUCT_MASTER keeps showing up in the browser under Node labels. To get rid of it, we need to drop the index as well.
DROP INDEX ON :PRODUCT_MASTER (product_code);
In neo4j-shell, type the "schema" command to get the list of indexes and corresponding properties.
To summarize, if you delete all of the nodes with a particular label, you need to drop the indexes on that label as well.
I simply:
stop neo4j
delete the entire database, and that removes everything
start neo4j
on a mac the db is here
/usr/local/var/neo4j/data/databases/graph.db
The reason is that when a label is created, Neo4j indexes this label. You can delete the node but the index will remain.
At a guess - if you drop the index on the label, it will disappear from the GUI (NOTE- I've not got access to Neo4j at the moment to check this theory)
If you delete the index on that label, the label will be removed from the database as well.
I just found a workaround (with neo4j 2.2 M04). Dump the content of the DB to a file, throw away the DB, then insert the dump again. Only works for small DBs, though.
Step1: dump the content, using neo4j-shell
$NEO4J_HOME/bin/> neo4j-shell -c 'dump match a return a;' > dump.temp
Step2: throw away DB
(there are plenty of ways to delete the folder $NEO4J_HOME/data/graph.db/ or wherever your DB folder is)
Step3: insert the dump again, using neo4j-shell
$NEO4J_HOME/bin/> neo4j-shell -file dump.temp
This should bring up statistics on how many nodes, relationships, properties and labels have been created.
(And Step4 would be to delete that dump.temp file, it has no reason to live inside the bin folder.)
What I find odd (and maybe Michael or somebody else from neo4j could shed some light on this): in my case, Step3 told me that some 50+ labels had been created. However, when I open the web interface, only those 15 or so labels, which I actually use, are listed. So the DB feels clean now. Not entirely sure that it is clean.
As of today, with Neo4j Desktop Version: 1.1.10 and DB Version: 3.4.7
Delete data + delete Index + delete any unique constraints + Developer > Refresh clears all Labels