using a query I have a list of nodes.
match (n) where n.afield is null return count(n),labels(n) ;
which gives
+---------------------------+
| count(n) | labels(n) |
+---------------------------+
| 7 | ["foo"] |
| 21 | [] |
(...)
If I want to delete all foo labeled node, I would use
match(n:foo) detach delete(n) ;
now, how can I delete all labelless node ? (those 21 in sample above)
match (n) where magic(n) detach delete ;
anyone know some kind of magic(n) ?
You can try this query to delete node without labels:
MATCH (n) where size(labels(n)) = 0
DETACH DELETE n
Related
I'm using neo4j to develop a proof of concept and I want to get all Nodes ID for all paths from my root node to leafs, example with ids :
ROOT1-->N1--->SN2--->L1
ROOT1-->N2--->SN3--->L3
What I want to get in my result query is : ROO1,N1,SN2 and ROOT1,N2,SN3
Im new to cypher and I struggle to get this result, any help would be usefull .
I assume that the ID that you mention is an id property.
To get a collection of the node ids in each full path (except for the leaf node):
MATCH p=(root {id: 'ROOT1'})-[*]->(leaf)
WHERE NOT (leaf)-->()
RETURN EXTRACT(x IN NODES(p)[..-1] | x.id) AS result;
Here is a sample result:
+----------------------+
| result |
+----------------------+
| ["ROOT1","N1","SN2"] |
| ["ROOT1","N2","SN3"] |
+----------------------+
I have some nodes like these:
neo4j-sh (?)$ match (n:Woka) return count (n);
+-----------+
| count (n) |
+-----------+
| 19798966 |
+-----------+
1 row
How I can change them from "Woka" to "Book" without delete and recreate?
you can use set and remove
MATCH (n:Woka)
SET n:Book
REMOVE n:Woka
I'm using neo4j 2.1.7 Recently i was experimenting with Match queries, searching for nodes with several labels. And i found out, that generally query
Match (p:A:B) return count(p) as number
and
Match (p:B:A) return count(p) as number
works different time, extremely in cases when you have for example 2 millions of Nodes A and 0 of Nodes B.
So do labels order effects search time? Is this future is documented anywhere?
Neo4j internally maintains a labelscan store - that's basically a lookup to quickly get all nodes carrying a definied label A.
When doing a query like
MATCH (n:A:B) return count(n)
labelscanstore is used to find all A nodes and then they're filtered if those nodes carry label B as well. If n(A) >> n(B) it's way more efficient to do MATCH (n:B:A) instead since you look up only a few B nodes and filter those for A.
You can use PROFILE MATCH (n:A:B) return count(n) to see the query plan. For Neo4j <= 2.1.x you'll see a different query plan depending on the order of the labels you've specified.
Starting with Neo4j 2.2 (milestone M03 available as of writing this reply) there's a cost based Cypher optimizer. Now Cypher is aware of node statistics and they are used to optimize the query.
As an example I've used the following statements to create some test data:
create (:A:B);
with 1 as a foreach (x in range(0,1000000) | create (:A));
with 1 as a foreach (x in range(0,100) | create (:B));
We have now 100 B nodes, 1M A nodes and 1 AB node. In 2.2 the two statements:
MATCH (n:B:A) return count(n)
MATCH (n:A:B) return count(n)
result in the exact same query plan (and therefore in the same execution speed):
+------------------+---------------+------+--------+-------------+---------------+
| Operator | EstimatedRows | Rows | DbHits | Identifiers | Other |
+------------------+---------------+------+--------+-------------+---------------+
| EagerAggregation | 3 | 1 | 0 | count(n) | |
| Filter | 12 | 1 | 12 | n | hasLabel(n:A) |
| NodeByLabelScan | 12 | 12 | 13 | n | :B |
+------------------+---------------+------+--------+-------------+---------------+
Since there are only few B nodes, it's cheaper to scan for B's and filter for A. Smart Cypher, isn't it ;-)
Could anyone please tell me the difference between Merge and Create in Cypher Querying.
How Neo4j stores data physically?
Thanks in advance..
CREATE does just what it says. It creates, and if that means creating duplicates, well then it creates. MERGE does the same thing as create, but also checks to see if a node already exists with the properties you specify. If it does, then it doesn't create. This helps avoid duplicates. Here's an example: I use CREATE twice to create a person with the same name.
neo4j-sh (?)$ create (p:Person {name: "Bob"});
+-------------------+
| No data returned. |
+-------------------+
Nodes created: 1
Properties set: 1
Labels added: 1
9 ms
neo4j-sh (?)$ create (p:Person {name: "Bob"});
+-------------------+
| No data returned. |
+-------------------+
Nodes created: 1
Properties set: 1
Labels added: 1
5 ms
So now when we query, there are two Bob's.
neo4j-sh (?)$ match (p:Person {name:"Bob"}) return p;
+--------------------------+
| p |
+--------------------------+
| Node[222124]{name:"Bob"} |
| Node[222125]{name:"Bob"} |
+--------------------------+
2 rows
46 ms
Let's MERGE in another Bob and see what happens.
neo4j-sh (?)$ merge (p:Person {name:"Bob"});
+--------------------------------------------+
| No data returned, and nothing was changed. |
+--------------------------------------------+
2 ms
neo4j-sh (?)$ match (p:Person {name:"Bob"}) return p;
+--------------------------+
| p |
+--------------------------+
| Node[222124]{name:"Bob"} |
| Node[222125]{name:"Bob"} |
+--------------------------+
2 rows
11 ms
Bob already existed, so MERGE did nothing here. Querying again, the same two Bobs are present. Had there been no Bobs in the database, MERGE would have done the same thing as CREATE.
I am trying to find the shortest path between two nodes but I need to exclude some nodes from the path. The cypher that I am trying is
START a=node(1), b=node(2), c=node(3,4)
MATCH p=a-[rel:RELATIONSHIP*]-b
WHERE NOT (c in nodes(p))
RETURN p
ORDER BY length(p) LIMIT 1
But this is giving me a path which includes one of the nodes in c.
Is there a way to do a traversal excluding some nodes?
Thanks
The MATCH ... WHERE part should be fine, but your START clause may not do what you expect. Do the following and consider the result
START a=node(1), b=node(2), c=node(3,4)
RETURN ID(a), ID(b), ID(c)
You get back
==> +-----------------------+
==> | ID(a) | ID(b) | ID(c) |
==> +-----------------------+
==> | 1 | 2 | 3 |
==> | 1 | 2 | 4 |
==> +-----------------------+
That means the rest of the query is executed twice, once excluding (3) from the path and once excluding (4). But that also means of course that it is run once not excluding each of them, which means you can indeed get results with those nodes present on the path.
If you want to exclude both of those nodes from the path, try collecting them and filtering with NONE or NOT ANY or similar. I think something like this should do it (can't test at the moment).
START a=node(1), b=node(2), c=node(3,4)
WITH a, b, collect (c) as cc
MATCH p=a-[rel:RELATIONSHIP*]-b
WHERE NONE (n IN nodes(p) WHERE n IN cc)
RETURN p
ORDER BY length(p) LIMIT 1