Neo4j collection function error? - neo4j

I am running the following query, which is meant to compare two collections of nodes, set1 and set2. All nodes in set2 are in set1, and I would like to identify all the nodes in set1 that are NOT in set2. However, the query returns a set of nodes that includes some of the nodes in set1. I am running this query on v2.1.7. Suggestions?
Query:
MATCH p=(a:ObjectConcept{sctid:233604007})<-[:ISA*]-(b:ObjectConcept)
with nodes(p) as set1, p
MATCH q=(a:ObjectConcept{sctid:34020007})<-[:ISA*]-(b:ObjectConcept)
with nodes(q) as set2,set1, p
WHERE ALL(x in set2 WHERE NOT x in set1)
with nodes(p) as pneumo
UNWIND pneumo AS pneumolist
RETURN distinct pneumolist.FSN,pneumolist.sctid
Alternative query, same result:
Query:
MATCH p=(a:ObjectConcept{sctid:233604007})<-[:ISA*]-(b:ObjectConcept)
with nodes(p) as set1, p
MATCH q=(a:ObjectConcept{sctid:34020007})<-[:ISA*]-(b:ObjectConcept)
with nodes(q) as set2,set1, p
WHERE NONE(x in set2 WHERE x in set1)
with nodes(p) as pneumo
UNWIND pneumo AS pneumolist
RETURN distinct pneumolist.FSN,pneumolist.sctid

Your matches don't return just one row as you might expect, but many rows (one per matched path), and your comparison is evaluated for every combination in the cross product of those rows. You probably want to first build one set for each of your two subtrees, using a combination of UNWIND and collect(DISTINCT ...).
The code below will not be especially fast, as Cypher doesn't have a Set concept internally yet.
Try this:
MATCH p=(a:ObjectConcept{sctid:233604007})<-[:ISA*]-(b:ObjectConcept)
unwind nodes(p) as n
with collect(distinct n) as set1
MATCH q=(a:ObjectConcept{sctid:34020007})<-[:ISA*]-(b:ObjectConcept)
unwind nodes(q) as m
// carry set1 along, otherwise it is out of scope after this WITH
with set1, collect(distinct m) as set2
UNWIND set1 AS pneumolist
// keep only the nodes of set1 that are not in set2
with pneumolist, set2 WHERE NOT pneumolist in set2
RETURN distinct pneumolist.FSN,pneumolist.sctid
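To see why the original comparison ran once per row combination, here is a minimal sketch that needs no graph data. The two literal lists are hypothetical stand-ins for the rows produced by the two MATCH clauses; chaining them multiplies the row count, just as two consecutive MATCH clauses do.

```cypher
// Hypothetical stand-ins: 2 rows from the first "match", 3 rows from the second.
UNWIND [1, 2] AS rowFromFirstMatch
UNWIND ['x', 'y', 'z'] AS rowFromSecondMatch
// Every combination becomes its own row: 2 * 3 = 6.
RETURN count(*) AS combinedRows
```

Any WHERE placed after the second clause is evaluated against each of those six rows separately, which is why collecting each side into a single list first changes the result.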

The following query was successful, and addresses Michael's discussion regarding cross products (above).
MATCH p=(a:ObjectConcept{sctid:233604007})<-[:ISA*]-(b:ObjectConcept)
with distinct nodes(p) as set1
UNWIND set1 as x1
with collect(DISTINCT x1) as set11
MATCH q=(a:ObjectConcept{sctid:34020007})<-[:ISA*]-(b:ObjectConcept)
with distinct nodes(q) as set2,set11
UNWIND set2 as x2
with collect(distinct x2) as set22,set11
with REDUCE(pneumo=[], x in set11 | case when x in set22 then pneumo else pneumo + [x] END) AS pneumo
return pneumo
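The REDUCE step above computes a difference of two collections. As a sketch only, the same result can be expressed with a list comprehension, assuming your Neo4j version supports that syntax. The literal lists below are placeholders for set11 and set22.

```cypher
// Placeholder values standing in for the two collected node lists.
WITH [1, 2, 3, 4] AS set11, [2, 4] AS set22
// Keep every element of set11 that does not appear in set22.
RETURN [x IN set11 WHERE NOT x IN set22] AS pneumo
```

With these placeholder values the result is [1, 3]. In the real query, the comprehension would replace the REDUCE expression in the final WITH.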

Related

Combine two cypher queries

Currently this is the data stored in the database
Org Name Org ID
A 1
B 2
C 5
D 9
I'm trying to combine these 2 queries:
MATCH (n:Org)
WHERE n.id in [1,2]
RETURN n.name as group1_name, n.id as group1_id
MATCH (n:Org)
WHERE n.id in [5,9]
RETURN n.name as group2_name, n.id as group2_id
I need the result to be shown like this:
group1_id group1_name group2_id group2_name
1 A 5 C
2 B 9 D
Assuming the two id lists are always the same size (in your example, 2), here is one approach (assuming you also want the id values sorted in ascending order):
MATCH (n:Org)
WHERE n.id in [1, 2]
WITH n ORDER BY n.id
WITH COLLECT(n) AS ns
MATCH (m:Org)
WHERE m.id in [5, 9]
WITH ns, m ORDER BY m.id
WITH ns, COLLECT(m) AS ms
UNWIND [i IN RANGE(0, SIZE(ns)-1) | {a: ns[i], b: ms[i]}] AS row
RETURN
row.a.id as group1_id, row.a.name as group1_name,
row.b.id as group2_id, row.b.name as group2_name
And here is a simpler approach:
WITH [1, 2] AS xs, [5, 9] AS ys
UNWIND RANGE(0, SIZE(xs)-1) AS i
MATCH (n:Org), (m:Org)
WHERE n.id = xs[i] AND m.id = ys[i]
RETURN n.id as group1_id, n.name as group1_name, m.id as group2_id, m.name as group2_name
And finally, if the xs and ys lists are passed to the query as parameters:
UNWIND RANGE(0, SIZE($xs)-1) AS i
MATCH (n:Org), (m:Org)
WHERE n.id = $xs[i] AND m.id = $ys[i]
RETURN n.id as group1_id, n.name as group1_name, m.id as group2_id, m.name as group2_name
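The index-based pairing these queries rely on can be tried on its own, without any Org nodes. This is a minimal sketch using the same literal id lists as above. It assumes both lists have the same length, as stated at the start of the answer.

```cypher
WITH [1, 2] AS xs, [5, 9] AS ys
// Pair the i-th element of xs with the i-th element of ys.
RETURN [i IN range(0, size(xs) - 1) | [xs[i], ys[i]]] AS pairs
```

This yields [[1, 5], [2, 9]]. If ys were shorter than xs, the out-of-range lookups would produce null rather than an error.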

Neo4j: Find nodes with property name that contains string (in the property name, not the property value)

Is there a way to find all nodes with properties that have a certain string?
Eg here with "ID":
match (n) where exists( n[".*"+"ID"]) return n
(this does not work).
Thanks!
This will give you the matching nodes along with all of their keys.
MATCH (n) WHERE ANY(x IN KEYS(n) WHERE x =~".*ID") RETURN n, KEYS(n) AS myKeys
This will give you just the values.
MATCH (n) WHERE ANY(x IN KEYS(n) WHERE x =~".*ID")
RETURN n, [x IN KEYS(n) WHERE x =~".*ID" | n[x]] AS myValues
If you have apoc, this will give you the keys and the values.
MATCH (n) WHERE ANY(x IN KEYS(n) WHERE x =~".*ID")
WITH n, [x IN KEYS(n) WHERE x =~".*ID" | x] AS myKeys
RETURN id(n) AS nodeId, apoc.map.submap(n, myKeys) as submap
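Note that the regular expression ".*ID" only matches property names that end in "ID". If the string may appear anywhere in the property name, as the title suggests, a sketch along the following lines may help. It assumes a Neo4j version that provides the CONTAINS operator and dynamic property access (n[k]), and it does not need APOC. The substring "ID" is taken from the question.

```cypher
// Assumption: the substring to look for in property names is "ID".
MATCH (n)
WHERE ANY(k IN keys(n) WHERE k CONTAINS "ID")
// Return each matching key together with its value, as a list of maps.
RETURN id(n) AS nodeId,
       [k IN keys(n) WHERE k CONTAINS "ID" | {key: k, value: n[k]}] AS matches
```

Like the queries above, this scans every node and its keys, so it may be slow on a large graph.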

Is there a way I can return all the nodes, their relationships, and their properties for the following query?

I want to get the list of all distinct nodes and relationships that this query returns.
MATCH (a:Protein{name:'9606.ENSP00000005995'})-[r:ON_INTERACTION_WITH]-(b:Protein)-[d:ON_INTERACTION_WITH]-(c:Protein)
Return a,b,c,d,r
limit 10
This should work:
MATCH (a:Protein{name:'9606.ENSP00000005995'})-[r:ON_INTERACTION_WITH]-(b:Protein)-[d:ON_INTERACTION_WITH]-(c:Protein)
WITH * LIMIT 10
RETURN
COLLECT(DISTINCT a) AS aList,
COLLECT(DISTINCT b) AS bList,
COLLECT(DISTINCT c) AS cList,
COLLECT(DISTINCT r) AS rList,
COLLECT(DISTINCT d) AS dList

Neo4j Cypher query and index of element in the collection

I'm trying to find the index of a Decision by {decisionGroupId}, {decisionId} and {criteriaIds}.
This is my current Cypher query:
MATCH (dg:DecisionGroup)-[:CONTAINS]->(childD:Decision)
WHERE dg.id = {decisionGroupId}
OPTIONAL MATCH (childD)-[vg:HAS_VOTE_ON]->(c:Criterion)
WHERE c.id IN {criteriaIds}
WITH childD, vg.avgVotesWeight as weight, vg.totalVotes as totalVotes
ORDER BY weight DESC, totalVotes DESC
WITH COLLECT(childD) AS ps
RETURN REDUCE(ix = -1, i IN RANGE(0, SIZE(ps)-1)
| CASE ps[i].id WHEN {decisionId} THEN i ELSE ix END) AS ix
I have only 3 Decision nodes in the database, but this query returns the following indices:
2
3
4
while I am expecting something like this (starting from 0, and -1 if not found):
0
1
2
What is wrong with my query and how to fix it?
UPDATED
This query is working fine with COLLECT(DISTINCT childD) AS ps:
MATCH (dg:DecisionGroup)-[:CONTAINS]->(childD:Decision)
WHERE dg.id = {decisionGroupId}
OPTIONAL MATCH (childD)-[vg:HAS_VOTE_ON]->(c:Criterion)
WHERE c.id IN {criteriaIds}
WITH childD, vg.avgVotesWeight as weight, vg.totalVotes as totalVotes
ORDER BY weight DESC, totalVotes DESC
WITH COLLECT(DISTINCT childD) AS ps
RETURN REDUCE(ix = -1, i IN RANGE(0, SIZE(ps)-1)
| CASE ps[i].id WHEN {decisionId} THEN i ELSE ix END) AS ix
Please help me to refactor this query and get rid of heavy REDUCE.
Let's try to get the reduce part right with a simpler query:
WITH ['a', 'b', 'c'] AS ps
RETURN
reduce(ix = -1, i IN RANGE(0, SIZE(ps)-1) |
CASE ps[i] WHEN 'b' THEN i ELSE ix END) AS ix
As I stated in the comments, it is usually better to avoid reduce if possible. So, to express the same using a list comprehension, use WHERE for filtering.
WITH ['a', 'b', 'c'] AS ps
RETURN [i IN RANGE(0, SIZE(ps)-1) WHERE ps[i] = 'b'][0]
The list comprehension results in a list with a single element, and we will use the [0] indexer to select that element.
After adapting this to your query, we'll get something like this:
MATCH (dg:DecisionGroup)-[:CONTAINS]->(childD:Decision)
WHERE dg.id = {decisionGroupId}
OPTIONAL MATCH (childD)-[vg:HAS_VOTE_ON]->(c:Criterion)
WHERE c.id IN {criteriaIds}
WITH childD, vg.avgVotesWeight as weight, vg.totalVotes as totalVotes
ORDER BY weight DESC, totalVotes DESC
WITH COLLECT(DISTINCT childD) AS ps
RETURN [i IN RANGE(0, SIZE(ps)-1) WHERE ps[i].id = {decisionId}][0]
If you have APOC installed, you can also use the function:
return apoc.coll.indexOf([1,2,3],2)
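One caveat with the list-comprehension form: when no element matches, indexing the empty list with [0] yields null rather than the -1 that the REDUCE version produced. Here is a minimal sketch, using literal values rather than your data, that restores the -1 default with coalesce(). The value 'z' is deliberately absent from the list.

```cypher
WITH ['a', 'b', 'c'] AS ps
// The comprehension is empty for 'z', so [0] is null and coalesce falls back to -1.
RETURN coalesce([i IN range(0, size(ps) - 1) WHERE ps[i] = 'z'][0], -1) AS ix
```

The same coalesce() wrapper can be placed around the final RETURN expression of the adapted query if the caller expects -1 for a missing Decision.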

Neo4j - Return a path only when relationship property between all of the pairs of nodes exists

Let's say I have a path A->B->C->D, and the relationships have a property val.
Now I want to take every pair of nodes from the path and check that the relationship between them has rel.val > 0.8.
If that holds for all pairs of nodes, then return the path.
Ex:
P = A-->B-->C-->D
All nodes = [A,B,C,D]
return p if{
rel.val of (A,B) >0.8
rel.val of (A,C) >0.8
rel.val of (A,D) >0.8
rel.val of (B,C) >0.8
rel.val of (B,D) >0.8
rel.val of (C,D) >0.8
}
Here is my query, (of course the query is wrong):
MATCH p=(a{word:"quality"})-[r*1..2]->(b)
WHERE NONE (n IN nodes(p) WHERE size(filter(x IN nodes(p) WHERE n = x))> 1)
MATCH q = (a)-[r:coocr]->(b) where a in nodes(p) AND b in nodes(p) AND NOT b = a AND None(rel IN rels(q) WHERE rel.val < 0.8 )
RETURN p
In summary, you want to MATCH a path and then make sure that all pairs of nodes in your path are connected by a relationship which fulfills a certain criterion (rel.val > 0.8).
Interesting question, I think this is not really straightforward. Maybe I am overlooking something obvious?
Here is an idea for how to approach the problem. You first MATCH your path, then MATCH between all nodes in the path and count the number of relationships with rel.val > 0.8. This number has to equal the number of unordered pairs of nodes, which is n * (n - 1) / 2 for n nodes (6 for the four nodes in your example), not the factorial of n.
The following query counts those relationships and keeps the path only when the count matches the number of pairs:
// match your path like before
MATCH p=(a:Uselabel {word:"quality"})-[r:USETYPE*1..2]->(b)
// use unwind to get the nodes from the path
UNWIND nodes(p) AS x
// do this twice to match the nodes onto themselves
UNWIND nodes(p) AS y
// match your relationship
MATCH (x)-[rel:USETYPE]-(y)
// criterion for your relationship, and only between two different nodes
WHERE rel.val > 0.8 AND x <> y
// get the count of qualifying relationships per path
WITH p, count(DISTINCT rel) AS num_pairs, size(nodes(p)) AS n
// keep the path only if every unordered pair is covered
WHERE num_pairs = n * (n - 1) / 2
RETURN p
No factorial function is needed, since the pair count is plain integer arithmetic. Note that this assumes at most one USETYPE relationship between any two nodes; parallel relationships would inflate the count.
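As a quick check of the counting argument: the number of unordered pairs among n nodes is n * (n - 1) / 2, so no factorial is involved. The sketch below uses the literal node names from the question in place of real nodes and enumerates each pair once.

```cypher
WITH ['A', 'B', 'C', 'D'] AS ns
// i walks over every node except the last; j only over the nodes after i.
UNWIND range(0, size(ns) - 2) AS i
UNWIND range(i + 1, size(ns) - 1) AS j
RETURN ns[i] AS first, ns[j] AS second
```

For four nodes this returns six rows, (A,B), (A,C), (A,D), (B,C), (B,D) and (C,D), which matches the six checks listed in the question and the value 4 * 3 / 2 = 6.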

Resources