neo4j distinct two columns - neo4j

How do I return two distinct columns with Cypher in Neo4j? The query I have is:
MATCH (a:Person)-[r:WorksFOR]->(b:Boss), (c:Boss)<-[r2:WorksFOR]-(d:Person)
WHERE b.sex = c.sex
RETURN a, d;
And it returns:
a d
John Will
Will John
I want to get rid of one of the columns.

OP needs to reword the question to clarify OP wants to get rid of one of the rows.
Here's a query that does that:
MATCH (a:Person)-[r:WorksFOR]->(b:Boss), (c:Boss)<-[r2:WorksFOR]-(d:Person)
WHERE b.name < c.name AND
b.sex = c.sex AND
b <> c
RETURN a, d;
The problem with your query is that b and c can each match any Boss, so every pair is found once in each order. To force them to match in only one order, I've added b.name < c.name. Which order you pick doesn't matter; the comparison just allows one direction and rejects the other. I've also added b <> c to handle the case where both people work for the same boss, which I don't think you want.
Once you add the ordering, the boss matching (b and c) can only happen one way, not the other way, so your second row of results gets eliminated.
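The effect of the ordering predicate can be tried outside Neo4j. The following is a minimal Python sketch, not the database's actual matching logic: the boss names (Ann, Beth), their sex values, and the helper matched_pairs are hypothetical, made up for illustration.

```python
from itertools import permutations

# Hypothetical data: (person, boss) pairs standing in for WorksFOR relationships.
works_for = [("John", "Ann"), ("Will", "Beth")]
boss_sex = {"Ann": "F", "Beth": "F"}

def matched_pairs(ordered):
    """Return (a, d) rows; with ordered=True, also require b.name < c.name."""
    rows = []
    # permutations(..., 2) yields every ordered pair of two different relationships.
    for (a, b), (d, c) in permutations(works_for, 2):
        if boss_sex[b] != boss_sex[c]:
            continue  # WHERE b.sex = c.sex
        if ordered and not (b < c):
            continue  # WHERE b.name < c.name
        rows.append((a, d))
    return rows

print(matched_pairs(ordered=False))
print(matched_pairs(ordered=True))
```

Without the ordering, both (John, Will) and (Will, John) come back; with it, only the row whose bosses are in ascending name order survives.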

Related

Separating matching nodes in a query result

I defined the directed relation Know on person nodes. For example, if Sara knows Alice then Sara-> Alice. I wrote this Cypher query to find all the people who know both the right and left side of the directed relation.
match ((n:Person)-[:Know]-> (m:Person)),(p:Person)
where EXISTS ((m)<-[:Know]-(p)-[:Know]->(n))
RETURN m,n,p
I need to get subgraphs with 3 nodes in the query's result, but the result I get is a graph with many nodes. Is there any way to change the query so that it generates subgraphs with just 3 nodes (for example, a subgraph of Alex-> Sara, Alex-> Alice, Sara-> Alice, with another subgraph shown if Sara meets the same condition with two other people)? This requires repeating some nodes in the output.
MATCH clauses are more flexible than that. Try this:
MATCH (n:Person)-[:Know]->(m:Person)<-[:Know]-(p:Person)-[:Know]->(n)
WHERE NOT EXISTS (()-[:Know]->(p))
AND NOT EXISTS {
WITH m, n, p
MATCH (q:Person)-[:Know]->(m)
WHERE q <> n
AND q <> p
}
AND NOT EXISTS {
WITH m, n, p
MATCH (q:Person)-[:Know]->(n)
WHERE q <> p
}
RETURN m, n, p
You might have to use a unique ID property, and I'm not sure the WITH clause will work here as I've written it; but with subqueries, you are generally able to import variables from above using WITH.
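To see what the first line of that query asks for, here is a small Python sketch of just the triangle pattern (n knows m, p knows m, p knows n). It leaves out the NOT EXISTS filters, and the edge set and the helper name triangles are hypothetical.

```python
# Hypothetical data: (knower, known) pairs standing in for :Know relationships.
knows = {("Alex", "Sara"), ("Alex", "Alice"), ("Sara", "Alice")}

def triangles(edges):
    """Return (m, n, p) rows where n->m, p->m and p->n all exist."""
    return sorted(
        (m, n, p)
        for n, m in edges            # (n)-[:Know]->(m)
        for p, m2 in edges           # (p)-[:Know]->(m)
        if m2 == m and p != n and (p, n) in edges  # (p)-[:Know]->(n)
    )

for row in triangles(knows):
    print(row)
```

Each returned row names exactly three people, so every row corresponds to one 3-node subgraph, and a person can appear in several rows.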

Filtering out nodes on two cypher paths

I have a simplified Neo4j graph (old version 2.x), shown in the image, with 'defines' and 'same' edges. Assume the number on each define edge is a property on that edge.
The queries I would like to run are:
1) Find nodes defined by both A and B -- Required result: C, C, D
START A=node(885), B=node(996) MATCH (A-[:define]->(x)<-[:define]-B) RETURN DISTINCT x
The above works and returns C and D. But I want C twice since it's defined twice. Without the DISTINCT on x, however, it returns all the paths from A to B.
2) Find nodes that are NOT (defined by both A,B OR are defined by both A,B but connected via a same edge) -- Required result: G
Something like:
R1: MATCH (A-[:define]->(x)<-[:define]-B) RETURN DISTINCT x
R2: MATCH (A-[:define]->(e)-(:similar)-(f)<-[:define]-B) RETURN e,f
(Nodes defined by A - (R1+R2) )
3) Find 'middle' nodes that do not have matching calls from both A and B --Required result: C,G
I want to output C due to the 1 define(either 45/46) that does not have a matching define from B.
Also output G because there's no define to G from B.
Appreciate any help on this!
Your syntax is a bit strange to me, so I'm going to assume you're using an older version of Neo4j. We should be able to use the same approaches, though.
For #1, your proposed match without DISTINCT really should work. The only change I can see is adding the missing parentheses around the A and B node variables.
START A=node(885), B=node(996)
MATCH (A)-[:define]->(x)<-[:define]-(B)
RETURN x
Also, I'm not sure what you mean by "returns all paths from A to B." Can you clarify that, and provide an example of the output?
As for #2, we'll need several parts to this query, separated with WITH accordingly.
START A=node(885), B=node(996)
MATCH (A)-[:define]->(x)<-[:define]-(B)
WITH A, B, COLLECT(DISTINCT x) as exceptions
OPTIONAL MATCH (A)-[:define]->(x)-[:same]-(y)<-[:define]-(B)
WHERE x NOT IN exceptions AND y NOT IN exceptions
WITH A, B, exceptions + COLLECT(DISTINCT x) + COLLECT(DISTINCT y) as allExceptions
MATCH (aNode)
WHERE aNode NOT IN allExceptions AND aNode <> A AND aNode <> B
RETURN aNode
Also, you should really be using labels on your nodes. The final match will match all nodes in your graph and will have to filter down otherwise.
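The set logic behind the #2 query can be sketched in Python as follows. The graph here is hypothetical (the original image is not available), as is the helper name remaining; it only illustrates the "collect exceptions, then exclude them" approach.

```python
# Hypothetical graph: define edges as (source, target), same edges as unordered pairs.
define = {("A", "C"), ("A", "D"), ("A", "E"), ("A", "G"),
          ("B", "C"), ("B", "D"), ("B", "F")}
same = {frozenset(("E", "F"))}
nodes = {"A", "B", "C", "D", "E", "F", "G"}

def remaining(a, b):
    """Nodes left after removing a, b, and every 'exception' node."""
    by_a = {t for s, t in define if s == a}
    by_b = {t for s, t in define if s == b}
    # First MATCH: nodes defined by both a and b.
    exceptions = by_a & by_b
    # OPTIONAL MATCH: x defined by a, y defined by b, joined by a same edge.
    linked = {
        n
        for x in by_a - exceptions
        for y in by_b - exceptions
        if frozenset((x, y)) in same
        for n in (x, y)
    }
    return sorted(nodes - exceptions - linked - {a, b})

print(remaining("A", "B"))
```

With this sample graph, C and D are excluded as defined by both, E and F are excluded because of the same edge, and only G is left.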
EDIT
Regarding your #3 requirement, the SIZE() function will be very helpful here, as you can get the size of a pattern match, and it will tell you the number of occurrences of that pattern.
The approach in this query is to first get the collection of nodes defined by A or B, then filter down to the nodes where the number of :define relationships from A is not equal to the number of :define relationships from B.
Ideally we would use something like a UNION WITH to combine the nodes defined by A with the nodes defined by B. Neo4j's UNION support is weak right now, though, as it doesn't let you do any additional operations after the UNION happens. Instead we have to resort to adding both sets of nodes into the same collection and then unwinding them back into rows.
START A=node(885), B=node(996)
MATCH (A)-[:define]->(x)
WITH A, B, COLLECT(x) as middleNodes
MATCH (B)-[:define]->(x)
WITH A, B, middleNodes + COLLECT(x) as allMiddles
UNWIND allMiddles as middle
WITH DISTINCT A, B, middle
WHERE SIZE((A)-[:define]->(middle)) <> SIZE((B)-[:define]->(middle))
RETURN middle
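The counting idea can be checked with a short Python sketch. The edge list is hypothetical (chosen so that A defines C twice and B defines it once), and unbalanced_middles is an illustrative helper, not part of any Neo4j API.

```python
from collections import Counter

# Hypothetical edge list: (source, target) pairs standing in for :define relationships.
define_edges = [
    ("A", "C"), ("A", "C"), ("A", "D"), ("A", "G"),
    ("B", "C"), ("B", "D"),
]

def unbalanced_middles(edges, left, right):
    """Targets whose number of define edges from left and right differ."""
    left_counts = Counter(t for s, t in edges if s == left)
    right_counts = Counter(t for s, t in edges if s == right)
    # Union of both target sets, like the COLLECT + UNWIND + DISTINCT steps.
    middles = set(left_counts) | set(right_counts)
    # A Counter returns 0 for missing keys, like SIZE() of a pattern with no matches.
    return sorted(m for m in middles if left_counts[m] != right_counts[m])

print(unbalanced_middles(define_edges, "A", "B"))
```

For this sample data the result is C (two defines from A, one from B) and G (one from A, none from B), while D is balanced and dropped.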

Cypher query that will return only 1 relation of each type between two nodes

How can I craft a query that will return only one relation of a certain type between two nodes?
For example:
MATCH (a)-[r:InteractsWith*..5]->(b) RETURN a,r,b
Because (a) may have interacted with (b) many times, the result will contain many relations between the two. However, the relations are not identical. They have different properties because they occurred at different points in time.
But what if you're only interested in the fact that they have interacted at least once?
Instead of the result as it appears currently I'd like to receive a result that has either:
Only one random relation from the set of relations between (a) and (b)
Only those relations that fit to some criteria (e.g. "newest" or one of each type, ...)
One approach I have thought of is creating new relations of the type "hasEverInteractedWith". But there should be another way, right?
Use shortestPath() to get the quickest single result.
MATCH (a)-[:InteractsWith*..5]->(b)
WITH DISTINCT a, b
MATCH p = shortestPath((a)-[:InteractsWith*..5]->(b))
RETURN a, b, RELATIONSHIPS(p) AS r
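If you want to get a specific one, you'll have to get all of the r and then filter them down, which will be slower (but provide more context). Because a variable-length pattern binds r to a list of relationships rather than a single one, this version matches a single hop so that r.date is defined (date is an assumed property name):
MATCH (a)-[r:InteractsWith]->(b)
WITH a, b, COLLECT(r) AS rs
RETURN a, b, REDUCE(s = HEAD(rs), r IN TAIL(rs) | CASE WHEN s.date > r.date THEN s ELSE r END) AS newest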
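The REDUCE expression keeps, at each step, whichever of two relationships has the larger date. A rough Python equivalent using functools.reduce is sketched below; the sample interactions, the date field, and the helper newest_per_pair are all hypothetical.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical data: one dict per InteractsWith relationship.
interactions = [
    {"a": "x", "b": "y", "date": 3},
    {"a": "x", "b": "y", "date": 7},
    {"a": "x", "b": "z", "date": 5},
]

def newest_per_pair(rels):
    """Group relationships by (a, b) and keep the one with the largest date."""
    groups = defaultdict(list)
    for rel in rels:
        groups[(rel["a"], rel["b"])].append(rel)  # WITH a, b, COLLECT(r) AS rs
    return {
        # REDUCE(s = HEAD(rs), r IN TAIL(rs) | CASE WHEN s.date > r.date THEN s ELSE r END)
        pair: reduce(lambda s, r: s if s["date"] > r["date"] else r, rs[1:], rs[0])
        for pair, rs in groups.items()
    }

for pair, rel in newest_per_pair(interactions).items():
    print(pair, rel["date"])
```

Swapping the comparison in the lambda is enough to select by another criterion, such as the oldest interaction.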

Cypher query doesn't return all the expected nodes

I have this graph:
A<-B->C
B is the root of a tiny tree. There is exactly one relation between A and B, and one between B and C.
When I run the following, one node is returned. Why does this Cypher query not return the A and C nodes?
MATCH(a {name:"A"})<-[]-(rewt)-[]->(c) RETURN c
It would seem to be that the first half of that query would find the root, and the second half would find both child nodes.
Until a few minutes ago, I would have thought it logically identical to the following query which works. What's the difference?
MATCH (a {name:"A"})<-[]-(rewt)
MATCH (rewt)-[]->(c)
RETURN c
EDIT for cybersam
I have abstracted my database so we could discuss my specific issue. Now, we still have a tiny tree, but there are 4 nodes that are children of the root.(Sorry this is different, but I'm developing and don't want to change my environment too much.)
This query returns all 4:
match(a)<-[]-(b:ROOT)-[]->(c) return c
One of them has a name of "dddd"...
match(a {name:"dddd"})<-[]-(b:ROOT)-[]->(c) return c
This query only returns three of them. "dddd" is not included. omg.
To answer cybersam's specific question, this query:
MATCH (a {name:"dddd"})<--(rewt:CODE_ROOT)
MATCH (rewt)-->(c)
RETURN a = c;
Returns four rows. The values are true, false, false, false
[UPDATED]
There is a difference between your 2 queries. Within a single MATCH clause, the same relationship will never be matched more than once.
Therefore, your first query would filter out all matches where the left-side relationship is the same as the right-side relationship:
MATCH(a {name:"A"})<--(rewt)-->(c)
RETURN c;
Your second query would allow the 2 relationships to be the same, since the relationships are found by 2 separate MATCH clauses:
MATCH (a {name:"A"})<--(rewt)
MATCH (rewt)-->(c)
RETURN c;
If I am right, then the following query should return N rows (where N is the number of outgoing relationships from rewt) and only one value should be true:
MATCH (a {name:"A"})<--(rewt)
MATCH (rewt)-->(c)
RETURN a = c;
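The difference can be modelled with a few lines of Python. This is only a sketch of the uniqueness rule, with a hypothetical two-edge tree and hypothetical helper names, not a reproduction of how Neo4j evaluates patterns.

```python
# Hypothetical tree: (relationship id, child name) for each edge leaving the root.
root_edges = [(1, "A"), (2, "C")]

def single_match(edges, name):
    """One MATCH clause: the two relationships in the pattern must differ."""
    return [c for r1, a in edges for r2, c in edges if a == name and r1 != r2]

def two_matches(edges, name):
    """Two MATCH clauses: uniqueness is checked per clause, so r1 may equal r2."""
    return [c for r1, a in edges for r2, c in edges if a == name]

print(single_match(root_edges, "A"))
print(two_matches(root_edges, "A"))
```

The single-clause version returns only C, because reaching A again would reuse the same relationship; the two-clause version returns both A and C.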
Both work just fine for me. I've tried on 2.3.0 Community.
Do you mind posting your CREATE command?
In each MATCH clause, each relationship will be matched only once. See http://neo4j.com/docs/stable/cypherdoc-uniqueness.html for reference.
See this related question as well: What does a comma in a Cypher query do?

neo4j cypher: stacking results with UNION and WITH

I'm doing a query like
MATCH (a)
WHERE id(a) = {id}
WITH a
MATCH (a)-->(x:x)-->(b:b)
WITH a, x, b
MATCH (a)-->(y:y)-->(b:b)
WITH a, x, y, b
MATCH (b)-->(c:c)
RETURN collect(a), collect(x), collect(y), collect(b), collect(c)
What I want here is for the b from MATCH (a)-->(y:y)-->(b:b) to be composed of the nodes matched on that line plus the ones from the previous MATCH (a)-->(x:x)-->(b:b). The problem I'm having with UNION is that it's picky about the number and kind of nodes passed on to the next query, and I'm having trouble understanding how to make it all fit together.
What other solution could I use to merge these nodes during the query or just before returning them? (Or, if I should do it with UNION, how would I do it that way?)
(Of course the query up there could be done in other better ways. My real one can't. That is just meant to give a visual example of what I'm looking to do.)
Much obliged!
This simplified query might suit your needs.
I took out all the collect() function calls, as it is not clear that you really need to aggregate anything. For example, there will only be a single 'a' node, so aggregating the 'a's does not make sense.
Please be aware that every row of the result will be for a node labelled either 'x' or 'y'. But, since every row has to have both the x and y values -- every row will have a null value for one of them.
START a=node({id})
MATCH (a)-->(x:x)-->(b:b)-->(c:c)
RETURN a, x, null AS y, b, c
UNION
START a=node({id})
MATCH (a)-->(y:y)-->(b:b)-->(c:c)
RETURN a, null AS x, y, b, c
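The shape of the UNION result, with one of x or y always null, can be sketched in Python like this. The path tuples and the helper union_rows are hypothetical and only illustrate the padding and de-duplication.

```python
# Hypothetical matched paths: (a, x-or-y, b, c) for each branch of the UNION.
x_paths = [("a1", "x1", "b1", "c1")]
y_paths = [("a1", "y1", "b2", "c2")]

def union_rows(x_paths, y_paths):
    """Pad each branch to the same columns, then drop duplicate rows."""
    rows = [{"a": a, "x": x, "y": None, "b": b, "c": c} for a, x, b, c in x_paths]
    rows += [{"a": a, "x": None, "y": y, "b": b, "c": c} for a, y, b, c in y_paths]
    unique = []
    for row in rows:
        if row not in unique:  # UNION (without ALL) removes duplicates
            unique.append(row)
    return unique

for row in union_rows(x_paths, y_paths):
    print(row)
```

Both branches must produce the same column names in the same order, which is why each one fills its missing column with None (null in Cypher).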
The best solution I could come up with in the end was something like this:
MATCH (a)-->(x:x)-->(b1:b)-->(c1:c)
WHERE id(a) = {id} AND NOT (a)-->(:y)-->(b1)
WITH a, collect(x) as xs, collect(DISTINCT b1) as b1s, collect(c1) as c1s
MATCH (a)-->(y:y)-->(b2:b)-->(c2:c)
RETURN a, xs, collect(y), (b1s + collect(b2)), c1s + collect(c2)

Resources