Most efficient way to get result from Neo4j - neo4j

I need to return a node and all the nodes associated with it based on a relationship. An example of the query would be like:
MATCH (n) where id(n)= {neo_id}
OPTIONAL MATCH p=(n)-[:OWNS]->(q)
Would it be more efficient to get node 'n' on its own and then get the paths in a separate call, or should I do a callthat returns 'n' and 'p'.
Addt. information: I have to do this for multiple relationships and I have noticed that every time I add a relationship, the combinatorics between all the paths causes the performance to deteriorate. Example:
MATCH (n) where id(n)= {neo_id}
OPTIONAL MATCH p=(n)-[:OWNS]->(q:Something)
OPTIONAL MATCH o=(n)-[:USES]->(r:SomethingElse)
.
.
.
OPTIONAL MATCH l=(n)-[:LOCATED_IN]->(r:NthSomethingElse)
RETURN n, p, o,..., l
or
//Call 1
MATCH (n) where id(n)= {neo_id}
RETURN n
//Call 2
MATCH (n) where id(n)= {neo_id}
OPTIONAL MATCH p=(n)-[:OWNS]->(q:Something)
RETURN p
//Call 3
MATCH (n) where id(n)= {neo_id}
OPTIONAL MATCH o=(n)-[:USES]->(r:SomethingElse)
RETURN o
.
.
.
//Call nth
MATCH (n) where id(n)= {neo_id}
OPTIONAL MATCH l=(n)-[:LOCATED_IN]->(r:NthSomethingElse)
RETURN l

If you always want to get n (if it exists), even if it has no relationships, then your first query is really the only way to do that. If you don't want that, then combining the 2 clauses into 1 probably has little impact on performance.
The reason you notice a slow down every time you add another MATCH is because of "cartesian products". That is: if a MATCH or OPTIONAL MATCH clause would "normally" produce N rows of data, but previous clauses in the same query had already produced M rows of data, the actual resulting number of rows would be M*N. So, every extra MATCH has a multiplicative effect on the number of data rows.
Depending on your use case, you might be able to avoid cartesian products by using aggregation on the results of all (or hopefully most) MATCH clauses, which can turn N into 1 (or some other smaller number). For example:
MATCH (n) where id(n)= {neo_id}
OPTIONAL MATCH p=(n)-[:OWNS]->(:Something)
WITH n, COLLECT(p) AS owns
OPTIONAL MATCH o=(n)-[:USES]->(:SomethingElse)
WITH n, owns, COLLECT(o) AS uses
OPTIONAL MATCH l=(n)-[:LOCATED_IN]->(:NthSomethingElse)
WITH n, owns, uses, COLLECT(l) AS located_in
RETURN n, owns, uses, located_in;

Related

OPTIONAL MATCH returns no path for disconnect nodes

I find weird that using OPTIONAL MATCH nodes that don’t have the expected relationship are not returned as a single node in path.
OPTIONAL MATCH path = (:Person) -[:LIKES]- (:Movie)
UNWIND nodes(p) as n
UNWIND rels(p) as e
WITH n
WHERE HEAD(LABELS(n)) = “Person”
return COUNT(DISTINCT n)
The number of people returned only includes those who liked a movie. By using OPTIONAL I would have expected all people to be returned.
Is there a workaround to this or am I doing some this wrong in the query?
A better way to go about this would be to match to all :People nodes first, then use the OPTIONAL MATCH to match to movies (or, if you want a collection of the movies they liked, use pattern comprehension).
If you do need to perform an UNWIND on an empty collection without wiping out the row, use a CASE around some condition to use a single-element list rather than the empty list.
MATCH (n:Person) // match all persons
OPTIONAL MATCH p = (n) -[:LIKES]- (m:Movie) // p and m are the optionals
UNWIND CASE WHEN p is null THEN [null] ELSE nodes(p) END as nodes // already have n, using a different variable
UNWIND CASE WHEN p is null THEN [null] ELSE rels(p) END as e // forcing a single element list means UNWIND won't wipe out the row
WITH n
WHERE HEAD(LABELS(n)) = “Person” // not really needed at all, and bad practice, you don't know the order of the labels on a node
return COUNT(DISTINCT n) // if this is really all you need, just keep the first match and the return of the query (without distinct), don't need anything else

How to write cypher statement to combine nodes when an OPTIONAL MATCH is null?

Background
Hi all, I am currently trying to write a cypher statement that allows me to find a set of paths on a map from a starting point. I want my search result to always return connecting streets within 5 nodes. Optionally, if there's a nearby hospital, I would like my search pattern to also indicate nearby hospitals.
Main Problem
Because there isn't always a nearby hospital to the current street, sometimes my optional match search pattern comes back as null. Here's the current cypher statement I'm using:
MATCH path=(a:Street {id: 123})-[:CONNECTED_TO*..5]-(b:Street)
OPTIONAL MATCH optionalPath=(b)-[:CONNECTED_TO]->(hospital:Hospital)
WHERE ALL (x IN nodes(path) WHERE (x:Street))
WITH DISTINCT nodes(path) + nodes(optionalPath) as n
UNWIND n as nodes
RETURN DISTINCT nodes;
However, this syntax only works if optionalPath contains nodes. If it doesn't, the statement nodes(path) + nodes(optionalPath) is an operation adding null and I get no records. This is true even the nodes(path) term does contain nodes.
What's the best way to get around this problem?
You can use COALESCE to replace a NULL with some other value. For example:
MATCH path=(:Street {id: 123})-[:CONNECTED_TO*..5]-(b:Street)
WHERE ALL (x IN nodes(path) WHERE x:Street)
OPTIONAL MATCH optionalPath=(b)-[:CONNECTED_TO]->(hospital:Hospital)
WITH nodes(path) + COALESCE(nodes(optionalPath), []) as n
UNWIND n as nodes
RETURN DISTINCT nodes;
I have also made a few other improvements:
The WHERE clause was moved up right after the first MATCH. This eliminates the unwanted path values immediately. Your original query would get all path values (even unwanted ones) and always the perform the second MATCH query, and only eliminate unwanted paths afterwards. (But, it is actually not clear if you even need the WHERE clause at all; for example, if the CONNECTED_TO relationship is only used between Street nodes.)
The DISTINCT in your WITH clause would have prevented duplicate n collections, but the collections internally could have had duplicate paths. This was probably not what you wanted.
It seems you don't really want the path, just all the street nodes within 5 steps, plus any connected hospitals. So I would simplify your query to just that, and then condense the 3 columns down to 1.
MATCH (a:Street {id: 123})-[:CONNECTED_TO*..5]-(b:Street)
OPTIONAL MATCH (b)-[:CONNECTED_TO]->(hospital:Hospital)
WITH collect(a) + collect(b) + collect(hospital) as n
UNWIND n as nodez
RETURN DISTINCT nodez;
If Streets can be indirectly connected (hospital in between), Than I'd adjust like this
MATCH (a:Street {id: 123})-[:CONNECTED_TO]-(b:Street)
WITH a as nodez, b as a
MATCH (a)-[:CONNECTED_TO]-(b:Street)
WITH nodez+collect(b) as nodez, b as a
MATCH (a)-[:CONNECTED_TO]-(b:Street)
WITH nodez+collect(b) as nodez, b as a
MATCH (a)-[:CONNECTED_TO]-(b:Street)
WITH nodez+collect(b) as nodez, b as a
MATCH (a)-[:CONNECTED_TO]-(b:Street)
WITH nodez+collect(b) as nodez, b as a
OPTIONAL MATCH (b)-[:CONNECTED_TO]->(hospital:Hospital)
WITH nodez + collect(hospital) as n
UNWIND n as nodez
RETURN DISTINCT nodez;
It's a bit more verbose, but just says exactly what you want (and also adds the start node to the hospital check list)

cypher to combine nodes and relationships into a single column

So as a complication to this question, I basically want to do
MATCH (n:TEST) OPTIONAL MATCH (n)-[r]->() RETURN DISTINCT n, r
And I want to return n and r as one column with no repeat values. However, running
MATCH (n:TEST) OPTIONAL MATCH (n)-[r]->() UNWIND n+r AS x RETURN DISTINCT x
gives a "Type mismatch: expected List but was Relationship (line 1, column 47)" error. And this query
MATCH (n:TEST) RETURN DISTINCT n UNION MATCH ()-[n]->() RETURN DISTINCT n
Puts nodes and relationships in the same column, but the context from the first match is lost in the second half.
So how can I return all matched nodes and relationships as one minimal list?
UPDATE:
This is the final modified version of the answer query I am using
MATCH (n:TEST)
OPTIONAL MATCH (n)-[r]->()
RETURN n {.*, rels:collect(r {properties:properties(r), id:id(r), type:type(r), startNode:id(startNode(r)), endNode:id(endNode(r))})} as n
There are a couple ways to handle this, depending on if you want to hold these within lists, or within maps, or if you want a map projection of a node to include its relationships.
If you're using Neo4j 3.1 or newer, then map projection is probably the easiest approach. Using this, we can output the properties of a node and include its relationships as a collected property:
MATCH (n:TEST)
OPTIONAL MATCH (n)-[r]->()
RETURN n {.*, rels:collect(r)} as n
Here's what you might do if you wanted each row to be its own pairing of a node and a single one of its relationships as a list:
...
RETURN [n, r] as pair
And as a map:
...
RETURN {node:n, rel:r} as pair
EDIT
As far as returning more data from each relationship, if you check the Code results tab, you'll see that the id, relationship type, and start and end node ids are included, and accessible from your back-end code.
However, if you want to explicitly return this data, then we just need to include it in the query, using another map projection for each relationship:
MATCH (n:TEST)
OPTIONAL MATCH (n)-[r]->()
RETURN n {.*, rels:collect(r {.*, id:id(r), type:type(r), startNode:startNode(r), endNode:endNode(r)})} as n

Cypher Optional Match

I have a graph in that contains two types of nodes (objects and pieces) and two types of links (similarTo and contains). Some pieces are made of the pieces.
I would like to extract the path to each piece starting from a set of objects.
MATCH (o:Object)
WITH o
OPTIONAL MATCH path = (p:Piece) <-[:contains*]- (o) -[:similarTo]- (:Object)
RETURN path
The above query only returns part of the pieces. In the returned graph, some objects do not directly connect to any pieces, the latter are not returned, although they actually do!
I can change the query to:
MATCH (o:Object) -[:contains*]-> (p:Piece)
OPTIONAL MATCH (o) –[:similarTo]- (:Object)
However, I did not manage to return the whole path for that query, which I need to return collection of nodes and links with:
WITH rels(path) as relations , nodes(path) as nodes
UNWIND relations as r unwind nodes as n
RETURN {nodes: collect(distinct n), links: collect(distinct {source: id(startNode(r)), target: id(endNode(r))})}
I'd be grateful to any recommendation.
Would something like this do the trick ?
I created a small graph representing objects and pieces here : http://console.neo4j.org/r/abztz4
Execute distinct queries with UNION ALL
Here you'll combine the two use cases in one set of paths :
MATCH (o:Object)
WITH o
OPTIONAL MATCH p=(o)-[:CONTAINS]->(piece)
RETURN p
UNION ALL
MATCH (o:Object)
WITH o
OPTIONAL MATCH p=(o)-[:SIMILAR_TO]-()-[:CONTAINS]->(piece)
RETURN p

Multiple Outer Joins / OPTIONAL MATCH without a hierarchy in neo4j

I am trying to match multiple outer joins on the same level in neo4j.
My database consists of users and a count of common up ur downratings on articles. The ratings counts are on seprate edges for up and downratings between the users.
---------- -----------
| User n | -[:rating_positive {weight}]-> | User n2 |
---------- -----------
| ^
\-----[:rating_negative {weight}]-------/
Now i want to produce edges that sum up these ratings.
I would love to use multiple optional merges, that do so sch as e.g.:
MATCH (n:`User`)
OPTIONAL MATCH (n:`User`)-[rating_positive:rating_positive]-(n2:`User`)
OPTIONAL MATCH (n:`User`)-[rating_negative:rating_negative]-(n2:`User`)
RETURN n.uid, n2.uid, rating_positive.weight, rating_negative.weight
But: In this example I get all users without any positive ratings and those with positive and negatice ratings but none with only negative ratings. So there seems to be a sequence in OPTIONAL MATCH.
If I swap the order of the "OPTIONAL MATCHes" I get those with only negative ratings but not those with onl positive ratings.
So "OPTIONAL MATCH" is somehow a sequence where only when the first
sequence is met I get something from the second and so on?
Is there a workaround?
Neo4j Version is 2.1.3.
P.S.:
Even more confusing matching against NULL does not seem to work. So this query:
MATCH (n:`User`)
OPTIONAL MATCH (n:`User`)-[rating_positive:rating_positive]-(n2:`User`)
OPTIONAL MATCH (n:`User`)-[rating_negative:rating_negative]-(n2:`User`)
WHERE rating_positive IS NULL AND rating_negative IS NOT NULL
RETURN n.uid, n2.uid, rating_positive.weight, rating_negative.weight
will give me lots of edges with NULL rating_negative and NON NULL rating_positive. I don't know what is happening with null matching in WHERE?
Anyway I found a way to recode the nulls to 0 values using "coalesce":
MATCH (n:`User`)
OPTIONAL MATCH (n:`User`)-[rating_positive:rating_positive]-(n2:`User`)
OPTIONAL MATCH (n:`User`)-[rating_negative:rating_negative]-(n2:`User`)
WITH n, n2, coalesce(rating_positive.weight, 0) AS rating_positive, coalesce(rating_negative.weight, 0) as rating_negative
WHERE rating_positive = 0 AND rating_negative > 0
RETURN n.uid, n2.uid, rating_positive, rating_negative
With this query it works as expected.
I believe more than the sequencing of the optional match, it's the fact that you've bound n2.
So the next optional match is restricted to only match the nodes identified to be candidates for n2 in the previous match. And so it appears that the order of the optional match influences it.
If you take a look at a small sample graph I set up here http://console.neo4j.org/r/lrp55o , the following query
MATCH (n:User)
OPTIONAL
MATCH (n)-[:rating_negative]->(n2)
OPTIONAL
MATCH (n)-[:rating_positive]->(n2)
RETURN n,n2
returns B-[:rating_negative]->C and C-[:rating_negative]->D but it leaves out A-[:rating_positive]->B.
The first optional match for rating_negative bound C and D as nodes for "n2". The second optional match found no n which has a rating_positive to C or D and hence the results.
I'm a bit unclear about what you are trying to do with the query and null checks but a union would be one way to give you all the positive and negative relations (which you can add your filters to):
MATCH (n:User)
OPTIONAL
MATCH (n)-[rating:rating_negative]->(n2)
RETURN n,n2,rating.weight
UNION ALL
MATCH (n:User)
OPTIONAL
MATCH (n)-[rating:rating_positive]->(n2)
RETURN n, n2, rating.weight
If this is not what you're looking for, a small subgraph at http://console.neo4j.org?init=0 would be great to help you further.
EDIT: Since comments indicated that the sum of ratings was required between a pair of users, the following query does the job:
MATCH (u:User)-[rating:rating_positive|:rating_negative]->(u2)
RETURN u,u2,sum(rating.weight)
I can't be entirely sure whether this is what is causing your problem but it appears to me that you should be omitting the labels in the OPTIONAL MATCH clauses.
Perhaps try the query below
MATCH (n:`User`)
OPTIONAL MATCH (n)-[rating_positive:rating_positive]-(n2:`User`)
OPTIONAL MATCH (n)-[rating_negative:rating_negative]-(n2)
RETURN n.uid, n2.uid, rating_positive.weight, rating_negative.weight
It may also be worth including the relationship directions.
MATCH (n:`User`)
OPTIONAL MATCH (n)-[rating_positive:rating_positive]->(n2:`User`)
OPTIONAL MATCH (n)-[rating_negative:rating_negative]->(n2)
RETURN n.uid, n2.uid, rating_positive.weight, rating_negative.weight

Resources