Iterating through a collection with MATCH and CREATE clauses - neo4j

I want to do something like this in cypher:
MATCH (n:node) WHERE n.ID = x //x is an integer value
FOREACH (num in n.IDs:
MATCH (p:node) WHERE p.ID = num
CREATE (n)-[:LINK]->(p) )
where num is an array of integer values referring to the IDs of nodes that need to be linked to the node matched in the first line.
When I run this query, I get the error: Invalid use of MATCH inside FOREACH.
I'm in the early stages of teaching myself both Cypher and Neo4j. How can I achieve my desired functionality here? Or am I barking up the wrong tree - am I failing to grasp something that makes it unnecessary for me to do so?

This is not allowed, instead use the top-level MATCH like http://gist.neo4j.org/?8332363
MATCH (n:node), (p:node)
WHERE n.ID = 1 AND p.ID in [2,3,4]
CREATE (n)-[:LINK]->(p)

Related

Update nodes by a list of ids and values in one cypher query

I've got a list of id's and a list of values. I want to catch each node with the id and set a property by the value.
With just one Node that is super basic:
MATCH (n) WHERE n.id='node1' SET n.name='value1'
But i have a list of id's ['node1', 'node2', 'node3'] and same amount of values ['value1', 'value2', 'value3'] (For simplicity i used a pattern but values and id's vary a lot). My first approach was to use the query above and just call the database each time. But nowadays this isn't appropriate since i got thousand of id's which would result in thousand of requests.
I came up with this approach that I iterate over each entry in both lists and set the values. The first node from the node list has to get the first value from the value list and so on.
MATCH (n) WHERE n.id IN["node1", "node2"]
WITH n, COLLECT(n) as nodeList, COLLECT(["value1","value2"]) as valueList
UNWIND nodeList as nodes
UNWIND valueList as values
FOREACH (index IN RANGE(0, size(nodeList)) | SET nodes.name=values[index])
RETURN nodes, values
The problem with this query is that every node gets the same value (the last of the value list). The reason is in the last part SET nodes.name=values[index] I can't use the index on the left side nodes[index].name - doesn't work and the database throws error if i would do so. I tried to do it with the nodeList, node and n. Nothing worked out well. I'm not sure if this is the right way to achieve the goal maybe there is a more elegant way.
Create pairs from the ids and values first, then use UNWIND and simple MATCH .. SET query:
// THe first line will likely come from parameters instead
WITH ['node1', 'node2', 'node3'] AS ids,['value1', 'value2', 'value3'] AS values
WITH [i in range(0, size(ids)) | {id:ids[i], value:values[i]}] as pairs
UNWIND pairs AS pair
MATCH (n:Node) WHERE n.id = pair.id
SET n.value = pair.value
The line
WITH [i in range(0, size(ids)) | {id:ids[i], value:values[i]}] as pairs
combines two concepts - list comprehensions and maps. Using the list comprehension (with omitted WHERE clause) it converts list of indexes into a list of maps with id,value keys.

cypher - relationship traversal vs ordering by property

graph snippet
I have a Shape with a list of Points.
I have the following requirements:
1) retrieve an ordered set of Points;
2) insert/remove a Point and preserve the order of the rest of the Points
I can achieve this by:
A) Point has a sequence integer property that could be used to order;
B) Add a :NEXT relationship between each Point to create a linked list.
I'm new to Neo4j so not sure which approach is preferable to satisfy the requirements?
For the first requirement, I wrote the following queries and found the performance for the traversal to be poor but Im sure its a badly constructed query:
//A) 146 ms
Match (s:Shape {id: "1-700-y11-1.1.I"})-[:POINTS]->(p:Point)
return p
order by p.sequence;
//B) Timeout! Bad query I know, but dont know the right way to go about it!
Match path=(s:Shape {id: "1-700-y11-1.1.I"})-[:POINTS]->(p1:Point)-[:NEXT*]->(p2:Point)
return collect(p1, p2);
To get ordered list of points use a slightly modified version of the second query:
Match path=(s:Shape {id: "1-700-y11-1.1.I"})-[:POINTS]->(P1:Point)
-[:NEXT*]-> (P2:Point)
WHERE NOT (:Point)-[:NEXT]->(P1) AND
NOT (P2)-[:NEXT]->(:Point)
RETURN TAIL( NODES( path) )
And for example query to delete:
WITH "id" as pointToDelete
MATCH (P:Point {id: pointToDelete})
OPTIONAL MATCH (Prev:Point)-[:NEXT]->(P)
OPTIONAL MATCH (P)-[:NEXT]->(Next:Point)
FOREACH (x in CASE WHEN Prev IS NOT NULL THEN [1] ELSE [] END |
MERGE (Prev)-[:NEXT]->(Next)
)
DETACH DELETE P

Neo4j indices slow when querying across 2 labels

I've got a graph where each node has label either A or B, and an index on the id property for each label:
CREATE INDEX ON :A(id);
CREATE INDEX ON :B(id);
In this graph, I want to find the node(s) with id "42", but I don't know a-priori the label. To do this I am executing the following query:
MATCH (n {id:"42"}) WHERE (n:A OR n:B) RETURN n;
But this query takes 6 seconds to complete. However, doing either of:
MATCH (n:A {id:"42"}) RETURN n;
MATCH (n:B {id:"42"}) RETURN n;
Takes only ~10ms.
Am I not formulating my query correctly? What is the right way to formulate it so that it takes advantage of the installed indices?
Here is one way to use both indices. result will be a collection of matching nodes.
OPTIONAL MATCH (a:B {id:"42"})
OPTIONAL MATCH (b:A {id:"42"})
RETURN
(CASE WHEN a IS NULL THEN [] ELSE [a] END) +
(CASE WHEN b IS NULL THEN [] ELSE [b] END)
AS result;
You should use PROFILE to verify that the execution plan for your neo4j environment uses the NodeIndexSeek operation for both OPTIONAL MATCH clauses. If not, you can use the USING INDEX clause to give a hint to Cypher.
You should use UNION to make sure that both indexes are used. In your question you almost had the answer.
MATCH (n:A {id:"42"}) RETURN n
UNION
MATCH (n:B {id:"42"}) RETURN n
;
This will work. To check your query use profile or explain before your query statement to check if the indexes are used .
Indexes are formed and and used via a node label and property, and to use them you need to form your query the same way. That means queries w/out a label will scan all nodes with the results you got.

How to query for multiple OR'ed Neo4j paths?

Anyone know of a fast way to query multiple paths in Neo4j ?
Lets say I have movie nodes that can have a type that I want to match (this is psuedo-code)
MATCH
(m:Movie)<-[:TYPE]-(g:Genre { name:'action' })
OR
(m:Movie)<-[:TYPE]-(x:Genre)<-[:G_TYPE*1..3]-(g:Genre { name:'action' })
(m)-[:SUBGENRE]->(sg:SubGenre {name: 'comedy'})
OR
(m)-[:SUBGENRE]->(x)<-[:SUB_TYPE*1..3]-(sg:SubGenre {name: 'comedy'})
The problem is, the first "m:Movie" nodes to be matched must match one of the paths specified, and the second SubGenre is depenedent on the first match.
I can make a query that works using MATCH and WHERE, but its really slow (30 seconds with a small 20MB dataset).
The problem is, I don't know how to OR match in Neo4j with other OR matches hanging off of the first results.
If I use WHERE, then I have to declare all the nodes used in any of the statements, in the initial MATCH which makes the query slow (since you cannot introduce new nodes in a WHERE)
Anyone know an elegant way to solve this ?? Thanks !
You can try a variable length path with a minimal length of 0:
MATCH
(m:Movie)<-[:TYPE|:SUBGENRE*0..4]-(g)
WHERE g:Genre and g.name = 'action' OR g:SubGenre and g.name='comedy'
For the query to use an index to find your genre / subgenre I recommend a UNION query though.
MATCH
(m:Movie)<-[:TYPE*0..4]-(g:Genre { name:'action' })
RETURN distinct m
UNION
(m:Movie)-[:SUBGENRE]->(x)<-[:SUB_TYPE*1..3]-(sg:SubGenre {name: 'comedy'})
RETURN distinct m
Perhaps the OPTIONAL MATCH clause might help here. OPTIONAL MATCH beavior is similar to the MATCH statement, except that instead of an all-or-none pattern matching approach, any elements of the pattern that do not match the pattern specific in the statement are bound to null.
For example, to match on a movie, its genre and a possible sub-genre:
OPTIONAL MATCH (m:Movie)-[:IS_GENRE]->(g:Genre)<-[:IS_SUBGENRE]-(sub:Genre)
WHERE m.title = "The Matrix"
RETURN m, g, sub
This will return the movie node, the genre node and if it exists, the sub-genre. If there is no sub-genre then it will return null for sub. You can use variable length paths as you have above as well with OPTIONAL MATCH.
[EDITED]
The following MATCH clause should be equivalent to your pseudocode. There is also a USING INDEX clause that assumes you have first created an index on :SubGenre(name), for efficiency. (You could use an index on :Genre(name) instead, if Genre nodes are more numerous than SubGenre nodes.)
MATCH
(m:Movie)<-[:TYPE*0..4]-(g:Genre { name:'action' }),
(m)-[:SUBGENRE]->()<-[:SUB_TYPE*0..3]-(sg:SubGenre { name: 'comedy' })
USING INDEX sg:SubGenre(name)
Here is a console that shows the results for some sample data.

Cypher path needs to exclude a certain relation

I have this graph:
A-[:X]->B-> a whole tree of badness
A-[:Y]->C-> a whole tree of goodness
I would like to know how to specify a path starting with A that excludes the :X relationship.
In this case "Y" could be any one of a number of different edge types. I do not want to specify them explicitly.
How do I write a path statement that includes A-[*]-B where * is not :X but can be anything else?
Solution for a fixed number of relationships between A and B
You can exclude a relationship type by matching all relationships from A to B and then filter out a specific type with WHERE NOT
MATCH p = (a:Label1)-[]-(b:Label2)
WHERE NOT (a)-[:X]-(b)
RETURN p
Solution for a variable length path between A and B
If you have a variable length path between A and B you cannot put the exact pattern in the WHERE NOT. Instead, you can use a NONE predicate on the path:
MATCH p = (a:Label1)-[*]-(b:Label2)
// this WHERE makes sure that none of the relationships in the
// returned path fulfill the criterion type(relationship) = 'X'
WHERE NONE (r in relationships(p) WHERE type(r) = 'X')
RETURN p
This Cypher query is simpler than the variable-length path query from #MartinPreusse, as it avoids using the RELATIONSHIPS function. Profiling shows that its execution plan is also a bit simpler, so it might be faster.
MATCH p=(a:Label1)-[rels*]-(b:Label2)
WHERE NONE (r IN rels WHERE type(r)= 'X')
RETURN p

Resources