Cypher - Best way to create relationships between two arrays of nodes - neo4j

I have two arrays of nodes
TYPE1 : [Node1, Node2, ...NodeN]
TYPE2 : [OtherNode1, OtherNode2....OtherNodeN]
I'm trying to connect each TYPE1 node to its corresponding TYPE2 node as follows.
(Node1) -[:RELATED_TO] -> (OtherNode1)
It's a simple one-to-one correspondence.
I used
MATCH (x:TYPE1),(y:TYPE2)
with x, y
with COLLECT(x) as n1, COLLECT(y) as n2
FOREACH(i in RANGE(0, 9) |
CREATE (n1[i])-[:RELATED_TO]->(n2[i])
)
which fails with
Error: Invalid input '[': expected an identifier character, node labels, a property map, ')' or a relationship pattern (line 4, column 21)
I have two questions.
What am I doing wrong in the query?
What is the best way to accomplish what I'm doing?
Many thanks!

Consider the following example data:
FOREACH (i IN range(1,10) | CREATE (:TYPE1), (:TYPE2))
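Your query fails because a pattern in CREATE or MERGE accepts only variables, not expressions such as n1[i]. Because you aren't interested in ordering your collections by any properties, you'll just be joining nodes in whatever order they are found by MATCH. The following query will do what (I think) you are trying to do, though it is inelegant; the nested FOREACH loops exist only to bind each list element to a variable that the pattern can use: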
MATCH (x:TYPE1), (y:TYPE2)
WITH COLLECT(DISTINCT x) AS n1, COLLECT(DISTINCT y) AS n2
WHERE SIZE(n1) = SIZE(n2)
FOREACH (i IN RANGE(0, SIZE(n1) - 1) |
  FOREACH (node IN [n1[i]] |
    FOREACH (othernode IN [n2[i]] |
      MERGE (node)-[:RELATED_TO]->(othernode)
    )
  )
)
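An alternative that avoids both the nested FOREACH loops and the Cartesian product of the first MATCH is to collect each label separately and UNWIND an index range. This is only a minimal sketch, assuming a Neo4j version that supports UNWIND and that pairing nodes in whatever order MATCH returns them is acceptable:

```cypher
// Collect each label on its own so no TYPE1 x TYPE2 cross product is built.
MATCH (x:TYPE1)
WITH collect(x) AS n1
MATCH (y:TYPE2)
WITH n1, collect(y) AS n2
// Only pair up when both lists have the same length.
WHERE size(n1) = size(n2)
// Turn the index range into rows, then bind the list elements to variables.
UNWIND range(0, size(n1) - 1) AS i
WITH n1[i] AS a, n2[i] AS b
MERGE (a)-[:RELATED_TO]->(b)
```

Binding n1[i] and n2[i] to the variables a and b in a WITH clause is what lets the MERGE pattern refer to them.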

Related

Neo4j - creating multiple relationships with FOREACH/CASE WHEN trick, and then return number of relationships

I am creating a node and, at the same time, creating relationships between it and a number of existing nodes. Some of the other nodes may not exist, so I am using the FOREACH/CASE WHEN trick described by Neo4j staff and on Stack Overflow.
I'd like to return the number of relationships of each type that were created, but I can't pull that out of the FOREACH. My current query is something like:
CREATE (p:Paper {title: $title})
WITH p
OPTIONAL MATCH (a:Author) WHERE a.name IN $a_names
OPTIONAL MATCH (p2:Paper) WHERE p2.title IN $citee_titles
OPTIONAL MATCH (p3:Paper) WHERE p3.title IN $citer_titles
FOREACH (_ IN CASE WHEN a IS NOT NULL THEN [1] END | MERGE (a)-[:AUTHORED]->(p))
FOREACH (_ IN CASE WHEN p2 IS NOT NULL THEN [1] END | MERGE (p)-[:CITES]->(p2))
FOREACH (_ IN CASE WHEN p3 IS NOT NULL THEN [1] END | MERGE (p3)-[:CITES]->(p))
Any of $a_names, $citee_titles or $citer_titles can be an empty set or have no elements matching any existing nodes.
I have tried adding a WITH and additional MATCHes on the end, something like:
WITH p, a, p2, p3
OPTIONAL MATCH (a)-[r1:AUTHORED]->(p)
OPTIONAL MATCH (p)-[r2:CITES]->(p2)
OPTIONAL MATCH (p3)-[r3:CITES]->(p)
RETURN COUNT(DISTINCT r1), COUNT(DISTINCT r2), COUNT(DISTINCT r3)
It works, but runs into the original problem that the FOREACH solution was trying to solve - if there are no matches for r1 then I don't get results for r2 and r3.
Any thoughts on how to deal with this? My goal is to test that the number of relationships created matches what I expect based on the parameters passed in.
In your query, every relationship is created against the newly created paper node, p. Since there is only one p node, each matched a, p2 or p3 node yields exactly one relationship. To get the number of relationships of each type created by the FOREACH statements, you can therefore simply return the counts of the distinct a, p2 and p3 nodes matched by the filters, like this:
CREATE (p:Paper {title: $title})
WITH p
OPTIONAL MATCH (a:Author) WHERE a.name IN $a_names
OPTIONAL MATCH (p2:Paper) WHERE p2.title IN $citee_titles
OPTIONAL MATCH (p3:Paper) WHERE p3.title IN $citer_titles
FOREACH (_ IN CASE WHEN a IS NOT NULL THEN [1] END | MERGE (a)-[:AUTHORED]->(p))
FOREACH (_ IN CASE WHEN p2 IS NOT NULL THEN [1] END | MERGE (p)-[:CITES]->(p2))
FOREACH (_ IN CASE WHEN p3 IS NOT NULL THEN [1] END | MERGE (p3)-[:CITES]->(p))
RETURN COUNT(DISTINCT a), COUNT(DISTINCT p2), COUNT(DISTINCT p3)
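If you would rather count the relationships themselves, one option is a separate verification query run after the write. The sketch below assumes that $title identifies the new paper uniquely; the aliases authored, cites and citedBy are illustrative names, not part of the original query. Aggregating after each OPTIONAL MATCH keeps the three counts independent, so an empty result for one type does not affect the others:

```cypher
MATCH (p:Paper {title: $title})
// count() ignores nulls, so a paper with no authors yields 0 here.
OPTIONAL MATCH (:Author)-[r1:AUTHORED]->(p)
WITH p, count(r1) AS authored
OPTIONAL MATCH (p)-[r2:CITES]->(:Paper)
WITH p, authored, count(r2) AS cites
OPTIONAL MATCH (:Paper)-[r3:CITES]->(p)
RETURN authored, cites, count(r3) AS citedBy
```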

Query to write hops and return all the properties from the middle nodes or a better way to do it and skip hops?

Node A is connected to Node E through different nodes B (B can be repeating), C and D etc as given below.
(A)--(C)--(D)--(E)
(A)--(B)--(C)--(D)--(E)
(A)--(B)--(B)--(C)--(D)--(E)
(A)--(B)--(B)--(B)--(C)--(D)--(E)
There could be up to 7 B nodes between A and C or no B node at all (like the first case above).
Question: How to get all the E1, E2, E3, E4 connected to A1 with a single query and return properties from all A, B, C, D and E nodes? I could not return the properties using the hops.
MATCH (A {Id:30})-[*1..6]-(E) RETURN DISTINCT A.Name, E.Name;
But we want to return B.Name (If there are multiple B nodes in the middle their names too), C.Name and D.Name too. Happy to skip hopping completely if required. Help please? Thanks in advance.
Try this
// in case there is always A,C and E, you can look for
// paths with length 3 to 6
MATCH path=(A)-[*3..6]-(E)
// return the name of each node in the same order
// (property names are case-sensitive; the question uses Name)
RETURN [n IN nodes(path) | n.Name] AS nodeNames
Assuming A through E are node labels, this query should get all paths that match your pattern (with 0 to 7 B nodes between the A and C nodes), and return distinct lists of node Name values:
MATCH p=(:A)-[*..8]-(:C)--(:D)--(:E)
WHERE ALL(n IN NODES(p)[1..-3] WHERE 'B' IN LABELS(n))
RETURN DISTINCT [m IN NODES(p) | m.Name] AS names
In general, the query would be more efficient if you could also specify the relationship types and their directionality.
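As a variation on the query above, the following sketch anchors the path at the start node from the question and returns each node's labels next to its Name, which makes it easier to see which entries are B, C or D nodes. It still assumes that A through E are node labels and that Id is a property on the A node, as in the question's own query:

```cypher
MATCH p = (:A {Id: 30})-[*..8]-(:C)--(:D)--(:E)
// Every node between the A node and the trailing C, D, E must be a B node.
WHERE all(n IN nodes(p)[1..-3] WHERE n:B)
RETURN DISTINCT [m IN nodes(p) | {labels: labels(m), name: m.Name}] AS hops
```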

Cypher - Only show node name, not full node in path variable

In Cypher I have the following query:
MATCH p=(n1 {name: "Node1"})-[r*..6]-(n2 {name: "Node2"})
RETURN p, reduce(cost = 0, x in r | cost + x.cost) AS cost
It is working as expected. However, it prints the full n1 node, then the full r relationship (with all its attributes), and then full n2.
What I want instead is to just show the value of the name attribute of n1, the type attribute of r and again the name attribute of n2.
How could this be possible?
Thank you.
The tricky part of your request is the type attribute of r, as r is a collection of relationships of the path, not a single relationship. We can use EXTRACT to produce a list of relationship types for all relationships in your path. See if this will work for you:
MATCH (n1 {name: "Node1"})-[r*..6]-(n2 {name: "Node2"})
RETURN n1.name, EXTRACT(rel in r | TYPE(rel)) as types, n2.name, reduce(cost = 0, x in r | cost + x.cost) AS cost
You also seem to be calculating a cost for the path. Have you looked at the shortestPath() function?
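Note that EXTRACT was removed in Neo4j 4.0. On newer versions a list comprehension does the same job. A sketch of the equivalent query, where the aliases startName and endName are added only for readability:

```cypher
MATCH (n1 {name: "Node1"})-[r*..6]-(n2 {name: "Node2"})
RETURN n1.name AS startName,
       // List comprehension: one relationship type per hop in the path.
       [rel IN r | type(rel)] AS types,
       n2.name AS endName,
       reduce(cost = 0, x IN r | cost + x.cost) AS cost
```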

How to remove Neo4j nodes with duplicate properties?

In Neo4j 2.1.6, I have nodes that are non-unique in respect of a certain property, inputID.
Using Cypher, how do I remove all nodes that are duplicates in terms of a given property, leaving only uniques?
I have tried the following...
MATCH (n:Input)
WITH n.inputID, collect(n) AS nodes
WHERE size(nodes) > 1
FOREACH (n in tail(nodes) | DELETE n)
...but it results in...
Expression in WITH must be aliased (use AS) (line 2, column 6)
"WITH n.inputID, collect(n) AS nodes"
      ^
Thanks,
G
You're not aliasing that WITH variable. Change this:
WITH n.inputID, collect(n) AS nodes
To this:
WITH n.inputID AS inputID, collect(n) AS nodes
As you correctly found out, using tail on a collection lets you remove the duplicates while keeping the first node of each group. Don't forget to remove each node's relationships before deleting the node (DETACH), and alias the field as FrobberOfBits mentioned:
MATCH (n:Input)
WITH n.inputID AS inputID, collect(n) AS nodes
WHERE size(nodes) > 1
FOREACH (n in tail(nodes) | DETACH DELETE n)
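Before running a destructive query like this, it may be worth previewing what it would remove. A minimal read-only sketch using the same grouping, where the alias duplicatesToDelete is illustrative:

```cypher
MATCH (n:Input)
WITH n.inputID AS inputID, collect(n) AS nodes
WHERE size(nodes) > 1
// tail(nodes) drops one node per group, so size - 1 nodes would be deleted.
RETURN inputID, size(nodes) - 1 AS duplicatesToDelete
ORDER BY duplicatesToDelete DESC
```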

neo4j collecting nodes and relations type b-->a<--c,a<--d

I am extending maxdemarzi's excellent graph visualisation example (http://maxdemarzi.com/2013/07/03/the-last-mile/) using VivaGraph backed by neo4j.
I want to display relationships of the type
a-->b<--c,b<--d
I tried the query
MATCH p = (a)--(b:X)--(c),(b:X)--(d)
RETURN EXTRACT(n in nodes(p) | {id:ID(n), name:COALESCE(n.name, n.title, ID(n)), type:LABELS(n)}) AS nodes,
EXTRACT(r in relationships(p)| {source:ID(startNode(r)) , target:ID(endNode(r))}) AS rels
It looks like the named query picks up only the a-->b<--c pattern and omits the b<--d patterns.
Am I missing something? Can I not add multiple patterns in a named query?
The most immediate problem is that the comma in the MATCH clause separates the first pattern from the second. The variable 'p' only stores the first pattern. This is why you aren't getting the results you desire. Independent of that, you are at risk of having a 'loose binding' by putting a label on both of your nodes named 'b' in the two patterns. The second 'b' node should not have a label.
So here is a version of your query that should work.
MATCH p1=(a)-->(b:X)<--(c), p2=(b)<--(d)
WITH nodes(p1) + d AS ns, relationships(p1) + relationships(p2) AS rs
RETURN EXTRACT(n IN ns | {id:ID(n), name:COALESCE(n.name, n.title, ID(n)), type:LABELS(n)}) AS nodes,
EXTRACT(r in rs| {source:ID(startNode(r)) , target:ID(endNode(r))}) AS rels
Capture both paths, then build collections from the nodes and relationships of both paths. The collection of nodes actually only extracts the nodes from p1 and adds the 'd' node. You could write that part as
nodes(p1) + nodes(p2) as ns
but then the 'b' node will appear in the list twice.
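On Neo4j 4.0 and later, where EXTRACT is no longer available, the same result can be built with list comprehensions. This is a sketch of the query above in that form, with the pattern and aliases unchanged:

```cypher
MATCH p1 = (a)-->(b:X)<--(c), p2 = (b)<--(d)
// nodes(p1) already contains b, so only d is appended.
WITH nodes(p1) + d AS ns, relationships(p1) + relationships(p2) AS rs
RETURN [n IN ns | {id: id(n), name: coalesce(n.name, n.title, id(n)), type: labels(n)}] AS nodes,
       [r IN rs | {source: id(startNode(r)), target: id(endNode(r))}] AS rels
```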
