I have a Neo4j database and I would like to use the result of one part of a Cypher query (a set of node ids) in a second part of the same query.
Something like:
MATCH ()-[:KNOWS]->(b)
FOREACH (n IN distinct(id(b))| SET n :label)
In a pure cypher code, is there a way to loop over the result "distinct(id(b))" and apply to each element another query?
Two problems with the original query:
You have to have a collection to use FOREACH.
You bind n to a node id, and you can't set labels on node ids, only on nodes.
You can use FOREACH to set labels by doing
MATCH ()-[:KNOWS]->(b)
WITH collect (distinct b) as bb
FOREACH (b IN bb | SET b:MyLabel)
In this case you don't need a collection at all; you can just do
MATCH ()-[:KNOWS]->(b)
WITH distinct b
SET b:MyLabel
In general, you can pipe results to an additional query part with WITH.
I obtained the needed result with:
MATCH ()-[:KNOWS]->(b)
WITH DISTINCT (b)
RETURN id(b)
I'm trying to use Neo4j Cypher to implement the following function: given a node, check whether it has any outgoing edges of a specific relationship type. If so, return the nodes it can reach through those edges; otherwise, delete the node. My code looks like this:
MATCH (m:Node{Properties})
WITH (size((m)-[:type]->(:Node))) AS c,m
WHERE c=0
DETACH DELETE m
However I don't know how to apply the if/else condition here, and this code only implements part of what I need. I'd really appreciate your help and suggestions!
For example the database is like this:
A-[type]->B
A-[type]->C
If the original node is A and it has two edges with that type to B and C, then I want the query to return B and C as result.
If the original node is B, it should be deleted because there's no such outgoing edge from B.
[UPDATED]
The following query uses a FOREACH hack to conditionally delete m, and returns either the found n nodes, or NULL if there were none.
MATCH (m:Node {...Properties...})
OPTIONAL MATCH (m)-[:type]->(n:Node)
FOREACH(x IN CASE WHEN n IS NULL THEN [1] ELSE [] END | DETACH DELETE m)
RETURN n
You could also use the APOC procedure apoc.do.when instead of the FOREACH hack:
MATCH (m:Node {...Properties...})
OPTIONAL MATCH (m)-[:type]->(n:Node)
CALL apoc.do.when(n IS NULL, 'DETACH DELETE m', '', {m: m}) YIELD value
RETURN n
In Neo4j, is it faster to run a query against all nodes (AllNodesScan) and then filter on their labels with a WHERE clause, or to run multiple queries with a NodeByLabelScan?
To illustrate, I want all nodes that are labeled with one of the labels in label_list:
label_list = ['label_1', 'label_2', ...]
Which would be faster in an application (this is pseudo-code):
for label in label_list:
    run.query("MATCH (n:{label}) return n")
or
run.query("MATCH (n) WHERE (n:label_1 or n:label_2 or ...) return n")
EDIT:
Actually, I just realized that the best option might be to run multiple NodeByLabelScan in a single query, with something looking like this:
MATCH (a:label_1)
MATCH (b:label_2)
...
UNWIND [a, b ..] as foo
RETURN foo
Could someone speak to it?
Yes, it would be better to run multiple NodeByLabelScans in a single query.
For example:
OPTIONAL MATCH (a:label_1)
WITH COLLECT(a) AS list
OPTIONAL MATCH (b:label_2)
WITH list + COLLECT(b) AS list
OPTIONAL MATCH (c:label_3)
WITH list + COLLECT(c) AS list
UNWIND list AS n
RETURN DISTINCT n
Notes on the query:
It uses OPTIONAL MATCH so that the query can proceed even if a wanted label is not found in the DB.
It uses multiple aggregation steps to avoid cartesian products.
And it uses UNWIND so that it can use DISTINCT to return distinct nodes (since a node can have multiple labels).
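Since the question starts from a label_list in application code, the query above can be generated for any number of labels. The following is a rough Python sketch, assuming the labels have to be written into the query text; `build_label_scan_query` is a hypothetical helper name, and running the resulting string is left to whatever driver call you already use.

```python
def build_label_scan_query(labels):
    """Build a single Cypher query that scans each label separately and merges the results."""
    if not labels:
        raise ValueError("at least one label is required")
    lines = []
    for i, label in enumerate(labels):
        var = f"n{i}"
        # Backtick-quote the label; a backtick inside the name is escaped by doubling it.
        quoted = "`" + label.replace("`", "``") + "`"
        lines.append(f"OPTIONAL MATCH ({var}:{quoted})")
        if i == 0:
            lines.append(f"WITH COLLECT({var}) AS list")
        else:
            # Aggregate after every scan so the matches never multiply into a cartesian product.
            lines.append(f"WITH list + COLLECT({var}) AS list")
    lines.append("UNWIND list AS n")
    lines.append("RETURN DISTINCT n")
    return "\n".join(lines)


scan_query = build_label_scan_query(["label_1", "label_2"])
print(scan_query)
```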
I run a Cypher query and update the labels of the nodes matching certain criteria. In the same query, I also want to update the nodes that do not meet those criteria, before I update the matched ones. Is there a construct in Cypher that can help me achieve this?
Here is a concrete formulation. I have a pool of labels from which I choose and assign to nodes. When I run a certain query, I assign one of those labels, l, to the nodes returned under the conditions specified by WHERE clause in the query. However, l could have been assigned to other nodes previously, and I want to rid all those nodes of l which are not the result of this query.
The conditions in WHERE clause could be arbitrary; hence simple negation would probably not work. An example code is as follows:
MATCH (v)
WHERE <some set of conditions>
// here I want to remove 'l' from the nodes
// not satisfied by the above condition
SET v:l
I have solved this problem by using a temporary label through this process:
Assign x to v.
Remove l from all nodes.
Assign l to all nodes containing x.
Remove x from all nodes.
Is there a better way to achieve this in Cypher?
This seems like one reasonable solution:
MATCH (v)
// <some set of conditions> must be written in terms of x here
WITH REDUCE(s = {a:[], d:[]}, x IN COLLECT(v) |
CASE
WHEN <some set of conditions> AND NOT('l' IN LABELS(x)) THEN {a: s.a+x, d: s.d}
WHEN NOT (<some set of conditions>) AND 'l' IN LABELS(x) THEN {a: s.a, d: s.d+x}
ELSE s
END) AS actions
FOREACH (a IN actions.a | SET a:l)
FOREACH (d IN actions.d | REMOVE d:l)
The above query tests every node, and remembers in the actions.a list the nodes that need the l label but do not yet have it, and in the actions.d list the nodes that have the label but should not. Then it performs the appropriate action for each list, without updating any nodes that are already OK.
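The bookkeeping described above can be modelled outside the database to check the logic. The following is only a rough Python sketch, not Cypher: `plan_label_updates`, the `nodes` dictionary and the `should_have` predicate are hypothetical names made up for illustration, standing in for the graph and for `<some set of conditions>`.

```python
def plan_label_updates(nodes, label, should_have):
    """Split nodes into those that need `label` added and those that need it removed.

    nodes: dict mapping a node id to the set of labels it currently has.
    should_have: predicate (node_id, labels) -> bool, standing in for the WHERE conditions.
    """
    to_add, to_remove = [], []
    for node_id, labels in nodes.items():
        wanted = should_have(node_id, labels)
        has_label = label in labels
        if wanted and not has_label:
            to_add.append(node_id)      # needs the label but lacks it
        elif has_label and not wanted:
            to_remove.append(node_id)   # has the label but should not
        # every other node is already in the right state and is left alone
    return to_add, to_remove


# Hypothetical data: nodes 1 and 3 carry 'l'; the condition selects nodes 2 and 3.
nodes = {1: {"l"}, 2: set(), 3: {"l"}, 4: set()}
to_add, to_remove = plan_label_updates(nodes, "l", lambda node_id, labels: node_id in {2, 3})
print(to_add, to_remove)
```

Node 3 already has the label and satisfies the condition, so it appears in neither list, which mirrors the "without updating any nodes that are already OK" behaviour.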
A node A has 3 connected nodes B1, B2 and B3. Those Bx nodes in turn have connected nodes C1, C2, C3 and C4. Node A also has 2 directly connected nodes, C5 and C6.
Starting with node A, I want to collect all C nodes. I ran a query for the A node and collected the two C nodes, then ran a query for the B nodes, collected all their C nodes, and merged both arrays. This works but is not very clever.
I tried (Pseudocode)
MATCH (g)<-[:IS_SUBGROUP_OF*1]-(i)-[:HAS_C_NODES]->(c) WHERE g = A.uuid RETURN C_NODES
But I get either all C nodes for A or all C nodes for the B nodes, not both.
How would I do a query that collects all C-Nodes starting with Node A?
* edited *
Here is some example data:
CREATE (a:A), (b1:B1), (b2:B2), (b3:B3), (c1:C1), (c2:C2), (c3:C3), (c4:C4), (a)-[r:HAS]->(c4), (a)-[r1:HAS]->(b1), (a)-[r2:HAS]->(b2), (a)-[r3:HAS]->(b3), (b1)-[r4:HAS]->(c1), (b1)-[r5:HAS]->(c2), (b2)-[r6:HAS]->(c3)
A query should return all nodes starting with C, no matter to which node they are connected (A or B).
You can add multiple labels for each node. You should use this to your advantage and segregate all the B and C nodes into a second label.
E.g.:
CREATE (a:A), (b1:B1:BType), (b2:B2:BType), (b3:B3:BType), (c1:C1:CType), (c2:C2:CType), (c3:C3:CType), (c4:C4:CType), (a)-[r:HAS]->(c4), (a)-[r1:HAS]->(b1), (a)-[r2:HAS]->(b2), (a)-[r3:HAS]->(b3), (b1)-[r4:HAS]->(c1), (b1)-[r5:HAS]->(c2), (b2)-[r6:HAS]->(c3)
I have modified your create statement to group all the B nodes as :BType label and all the C nodes as :CType label.
You can simply use the optional match keyword to selectively traverse through the relationships if they exist and obtain the results you want.
MATCH (a:A)-[:HAS]->(b:BType)-[:HAS]->(c:CType)
OPTIONAL MATCH (a:A)-[:HAS]->(xc:CType)
RETURN c, xc
If you would like both sets of nodes to be grouped together you could try this statement instead which uses collect().
MATCH (a:A)-[:HAS]->(b:BType)-[:HAS]->(c:CType)
WITH a, collect(DISTINCT c) AS set1
OPTIONAL MATCH (a:A)-[:HAS]->(xc:CType)
RETURN set1 + collect(DISTINCT xc) AS output
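To make clear what the combined result should contain, here is a small Python sketch of the same traversal over the sample data from the CREATE statement. It is only a model for checking the expected output: `collect_c_nodes`, `edges` and `labels` are hypothetical names, and the :BType/:CType labels are the ones introduced in this answer.

```python
from collections import defaultdict


def collect_c_nodes(edges, labels, start):
    """Return CType nodes reachable from `start` directly or through one BType node."""
    children = defaultdict(list)
    for src, dst in edges:
        children[src].append(dst)

    found = set()
    for child in children[start]:
        if "CType" in labels[child]:
            found.add(child)  # C node attached directly to the start node
        elif "BType" in labels[child]:
            # C nodes attached to an intermediate B node
            found.update(g for g in children[child] if "CType" in labels[g])
    return found


# The HAS relationships and labels from the sample CREATE statement.
edges = [("a", "c4"), ("a", "b1"), ("a", "b2"), ("a", "b3"),
         ("b1", "c1"), ("b1", "c2"), ("b2", "c3")]
labels = {"a": {"A"},
          "b1": {"B1", "BType"}, "b2": {"B2", "BType"}, "b3": {"B3", "BType"},
          "c1": {"C1", "CType"}, "c2": {"C2", "CType"},
          "c3": {"C3", "CType"}, "c4": {"C4", "CType"}}

result = collect_c_nodes(edges, labels, "a")
print(sorted(result))
```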
I've got a graph where each node has label either A or B, and an index on the id property for each label:
CREATE INDEX ON :A(id);
CREATE INDEX ON :B(id);
In this graph, I want to find the node(s) with id "42", but I don't know the label a priori. To do this I am executing the following query:
MATCH (n {id:"42"}) WHERE (n:A OR n:B) RETURN n;
But this query takes 6 seconds to complete. However, doing either of:
MATCH (n:A {id:"42"}) RETURN n;
MATCH (n:B {id:"42"}) RETURN n;
takes only ~10 ms.
Am I not formulating my query correctly? What is the right way to formulate it so that it takes advantage of the installed indices?
Here is one way to use both indices. result will be a collection of matching nodes.
OPTIONAL MATCH (a:A {id:"42"})
OPTIONAL MATCH (b:B {id:"42"})
RETURN
(CASE WHEN a IS NULL THEN [] ELSE [a] END) +
(CASE WHEN b IS NULL THEN [] ELSE [b] END)
AS result;
You should use PROFILE to verify that the execution plan in your Neo4j environment uses the NodeIndexSeek operation for both OPTIONAL MATCH clauses. If it does not, you can use the USING INDEX clause to give Cypher a hint.
You should use UNION to make sure that both indexes are used. In your question you almost had the answer.
MATCH (n:A {id:"42"}) RETURN n
UNION
MATCH (n:B {id:"42"}) RETURN n;
This will work. To check your query, put PROFILE or EXPLAIN before the statement to see whether the indexes are used.
Indexes are created and used on a combination of node label and property, so to use them you need to write your query the same way. A query without a label will scan all nodes, which gives the slow result you saw.
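When the set of candidate labels lives in application code, the UNION form can be generated instead of typed by hand. This is a minimal sketch under some assumptions: `build_id_lookup_query` and `quote_label` are hypothetical helpers, and the id value is meant to be passed to your driver as a query parameter named `id` rather than written into the query text.

```python
def quote_label(label):
    """Backtick-quote a label; a backtick inside the name is escaped by doubling it."""
    return "`" + label.replace("`", "``") + "`"


def build_id_lookup_query(labels, prop="id", param="id"):
    """Build one labelled MATCH per label, joined by UNION, so each can use its own index."""
    if not labels:
        raise ValueError("at least one label is required")
    parts = [
        f"MATCH (n:{quote_label(label)} {{{prop}: ${param}}}) RETURN n"
        for label in labels
    ]
    return "\nUNION\n".join(parts)


lookup_query = build_id_lookup_query(["A", "B"])
print(lookup_query)
```

Prefixing the generated text with PROFILE or EXPLAIN is still the way to confirm which operators the planner actually picks.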