I couldn't find a similar post, so if you already know one or if my question is not the proper one, please let me know.
I have this query
MATCH
(t:Taxi {name:'Taxi1813'})<-[:ASSIGNED]-(u2:User)-[rd2:DROP_OFF]->
(g2:Grid)-[r:TO*1..2]-(g:Grid)<-[rd:DROP_OFF]-(u:User)-[:ASSIGNED]->(t)
WHERE ID(u2) < ID(u) AND rd2.time >= '04:38' AND rd2.time <= '04:42'
WITH DISTINCT u2, g2, u, g, rd2, rd
MATCH p=shortestPath((g2)-[r:TO*1..2]-(g))
WITH rd2, rd,u2, g2, u, g, p, REDUCE(totalTime = 0, x IN RELATIONSHIPS(p) | totalTime + x.time) AS totalTime
WHERE totalTime <= 4
RETURN u2.name, u.name
So at the end i got two columns
u2.name u.name
User179 UserTest
User177 User179
Is there is a way or function to merge both columns into a single one and remove duplicates
Users
User179
User177
UserTest
Any suggestions? Thank you
You can combine the two collections into a single collection and then return just the distinct items.
WITH ['User179', 'User177'] AS list1
, ['UserTest', 'User179'] AS list2
UNWIND list1 + list2 AS item
RETURN DISTINCT item
Alternatively, if you are using APOC you could use apoc.coll.union() instead.
WITH ['User179', 'User177'] AS list1
, ['UserTest', 'User179'] AS list2
RETURN apoc.coll.union(list1,list2)
u2.name + u.name combines the list
you could make something like "where u2.name is not equal to u.name(not right syntax)"
Related
I want to get all the list of distinct nodes and relationship that I am getting through this query.
MATCH (a:Protein{name:'9606.ENSP00000005995'})-[r:ON_INTERACTION_WITH]-(b:Protein)-[d:ON_INTERACTION_WITH]-(c:Protein)
Return a,b,c,d,r
limit 10
This should work:
MATCH (a:Protein{name:'9606.ENSP00000005995'})-[r:ON_INTERACTION_WITH]-(b:Protein)-[d:ON_INTERACTION_WITH]-(c:Protein)
WITH * LIMIT 10
RETURN
COLLECT(DISTINCT a) AS aList,
COLLECT(DISTINCT b) AS bList,
COLLECT(DISTINCT c) AS cList,
COLLECT(DISTINCT r) AS rList,
COLLECT(DISTINCT d) AS dList
match(n)-[r:LIKES]->(m) with count(n) as cnt, m where cnt = max(cnt) return m
Above query results in following error:
Invalid use of aggregating function max(...) in this context (line 1,
column 61 (offset: 60))
This query should return a collection of the m nodes that have the maximum count. It only needs to perform a single MATCH operation, and should be relatively performant.
MATCH (n)-[:LIKES]->(m)
WITH m, COUNT(n) AS cnt
WITH COLLECT({m: m, cnt: cnt}) AS data, MAX(cnt) AS maxcnt
RETURN REDUCE(ms = [], d IN data | CASE WHEN d.cnt = maxcnt THEN ms + d.m ELSE ms END) AS ms;
If you're just trying to find a single node that has the most LIKES relationships leading to it, then you can add an ORDER BY and LIMIT:
MATCH (n)-[:LIKES]->(m)
WITH m, count(n) AS cnt
ORDER BY cnt DESC
LIMIT 1
RETURN m
However, that query has the limitation in that if more than one node has the maximum number of inbound relationships, then those tied nodes won't be returned. To achieve that result, you might try something like this:
MATCH (n)-[:LIKES]->(m)
WITH m, count(n) AS cnt
WITH MAX(cnt) AS maxcnt
MATCH (o)
WHERE size((o)<-[:LIKES]-()) = maxcnt
RETURN o
Using Neo4J 2.1.5.
Data:
2000 persons
Goal: For each person, calculate the total of friends, friends' friends, friends' friends' friends.
Result is like follows:
Person FullName | Friends total | Friends-2 total | Friends-3 total | global total
MATCH (person:Person)
WITH person
OPTIONAL MATCH person-[:KNOWS]-(p2:Person)
WITH person, count(p2) as f1
OPTIONAL MATCH path = shortestPath(person-[:KNOWS*..2]-(f2:Person))
WHERE length(path) = 2
WITH count(nodes(path)[-1]) AS f2, person, f1
OPTIONAL MATCH path = shortestPath(person-[:KNOWS*..3]-(f3:Person))
WHERE length(path) = 3
WITH count(nodes(path)[-1]) AS f3, person, f2, f1
RETURN person._firstName + " " + person._lastName, f1, f2, f3, f1+f2+f3 AS total
The tricks is to avoid wrong calculations with cylic graph; that's why I use shortestPath.
However, this query lasts long: 60 seconds!
Any optimization possible?
[EDITED]
Does this work for you?
MATCH (person:Person)
OPTIONAL MATCH (person)-[:KNOWS]-(p1:Person)
WITH person, COALESCE(COLLECT(p1),[]) AS p1s
WITH person, CASE p1s WHEN [] THEN [NULL] ELSE p1s END AS p1s
UNWIND p1s AS p1
OPTIONAL MATCH (p1)-[:KNOWS]-(p2:Person)
WHERE NOT ((p2 = person) OR (p2 IN p1s))
WITH person, p1s, COALESCE(COLLECT(DISTINCT p2),[]) AS p2s
WITH person, p1s, CASE p2s WHEN [] THEN [NULL] ELSE p2s END AS p2s UNWIND p2s AS p2
OPTIONAL MATCH (p2)-[:KNOWS]-(p3:Person)
WHERE NOT ((p3 = person) OR (p3 IN p1s) OR (p3 IN p2s))
WITH person,
CASE p1s WHEN [NULL] THEN 0 ELSE SIZE(p1s) END AS f1,
CASE p2s WHEN [NULL] THEN 0 ELSE SIZE(p2s) END AS f2,
COUNT(DISTINCT p3) AS f3
RETURN person.firstName + " " + person.lastName, f1, f2, f3, f1+f2+f3 AS total;
Each friend is only counted once.
Here is an explanation of some of the more obscure tactics used. The query has to to replace empty p1s and p2s collections with [NULL] so that UNWIND will not abort the rest of the query. Then, when counting the size of the collections, we need give [NULL] collections a count of 0.
I am running the following query that is meant to compare two collections nodes set1 and set2. All nodes in set2 are in set1, and I would like to identify all the nodes in set1 that are NOT in set2. However, the query returns a set of nodes that includes some of the nodes in set1. I am running this query on v2.1.7. Suggestions?
Query:
MATCH p=(a:ObjectConcept{sctid:233604007})<-[:ISA*]-(b:ObjectConcept)
with nodes(p) as set1, p
MATCH q=(a:ObjectConcept{sctid:34020007})<-[:ISA*]-(b:ObjectConcept)
with nodes(q) as set2,set1, p
WHERE ALL(x in set2 WHERE NOT x in set1)
with nodes(p) as pneumo
UNWIND pneumo AS pneumolist
RETURN distinct pneumolist.FSN,pneumolist.sctid
Alternative query, same result:
Query:
MATCH p=(a:ObjectConcept{sctid:233604007})<-[:ISA*]-(b:ObjectConcept)
with nodes(p) as set1, p
MATCH q=(a:ObjectConcept{sctid:34020007})<-[:ISA*]-(b:ObjectConcept)
with nodes(q) as set2,set1, p
WHERE NONE(x in set2 WHERE x in set1)
with nodes(p) as pneumo
UNWIND pneumo AS pneumolist
RETURN distinct pneumolist.FSN,pneumolist.sctid
Your matches don't return just one row as you might expect but many rows,
and your comparison is done between the cross product of those many row combinations. You probably want to create a set for each of your two subtrees first with a combination of unwind + collect(distinct)
The code below will not be as fast, as cypher internally doesn't have a Set concept yet.
try this
MATCH p=(a:ObjectConcept{sctid:233604007})<-[:ISA*]-(b:ObjectConcept)
unwind nodes(p) as n
with collect(distinct n) as set1
MATCH q=(a:ObjectConcept{sctid:34020007})<-[:ISA*]-(b:ObjectConcept)
unwind nodes(q) as m
with collect(distinct m) as set2
WHERE NONE(x in set2 WHERE x in set1)
UNWIND set1 AS pneumolist
RETURN distinct pneumolist.FSN,pneumolist.sctid
The following query was successful, and addresses Michael's discussion regarding cross products (above).
MATCH p=(a:ObjectConcept{sctid:233604007})<-[:ISA*]-(b:ObjectConcept)
with distinct nodes(p) as set1
UNWIND set1 as x1
with collect(DISTINCT x1) as set11
MATCH q=(a:ObjectConcept{sctid:34020007})<-[:ISA*]-(b:ObjectConcept)
with distinct nodes(q) as set2,set11
UNWIND set2 as x2
with collect(distinct x2) as set22,set11
with REDUCE(pneumo=[],x in set11|case when x in set22 then pneumo else pneumo
+ [x] END) AS pneumo
return pneumo
I have two cypher queries and would like to know if there is a possibility to get the Points and HeadHunter information within one single statement.
MATCH (s:SEASON)-[*]->(e:EVENT)<-[f:FINISHED]-(p:PLAYER)
WHERE s.id = 8 AND e.id <= 1197
RETURN p, sum(f.points) AS Points
ORDER BY Points DESC
MATCH (s:SEASON)-[*]->(e:EVENT)<-[el:ELIMINATED]-(p:PLAYER)
WHERE s.id = 8 AND e.id <= 1197
RETURN p, sum(el.points) AS HeadHunter
ORDER BY HeadHunter DESC
I played around with different approaches but none of them worked. I would like to do something like this:
MATCH (s:SEASON)-[*]->(e:EVENT)<-[el:ELIMINATED]-(p:PLAYER),(e)<-[f:FINISHED]-(p)
WHERE s.id = 8 AND e.id <= 1197
RETURN p, sum(el.points) AS HeadHunter, sum(f.points) AS Points
Does this work for you?
MATCH (s:SEASON)-[*]->(e:EVENT)<-[rel:FINISHED|ELIMINATED]-(p:PLAYER)
WHERE s.id = 8 AND e.id <= 1197
WITH p, COLLECT(rel) AS rels
RETURN p,
REDUCE(s = 0, x IN rels | CASE WHEN TYPE(x) = 'FINISHED' THEN s + x.points ELSE s END) AS Points,
REDUCE(s = 0, x IN rels | CASE WHEN TYPE(x) = 'ELIMINATED' THEN s + x.points ELSE s END) AS HeadHunter
This should return the relevant sums for each PLAYER that either finished and/or got eliminated.
In principle I think your query can work, but you may have made a mistake with your p variable. In this query:
MATCH (s:SEASON)-[*]->(e:EVENT)<-[el:ELIMINATED]-(p:PLAYER),(e)<-[f:FINISHED]-(p)
WHERE s.id = 8 AND e.id <= 1197
RETURN p, sum(el.points) AS HeadHunter, sum(f.points) AS Points
Note that p has to be a player with both the ELIMINATED relationship and the FINISHED relationship. I'm guessing based on the semantics of your domain, that doesn't make sense; you can't be eliminated and have finished, right?
So you might want to try some variant of this:
MATCH (s:SEASON)-[*]->(e:EVENT)<-[el:ELIMINATED]-(pElim:PLAYER),(e)<-[f:FINISHED]-(pFinish)
WHERE s.id = 8 AND e.id <= 1197
RETURN pElim, sum(el.points) AS HeadHunter, pFinish, sum(f.points) AS Points
Note with this you'll have a lot of extra rows of output for various combinations of pElim and pFinish but the respective values should be correct.