I have a graph in Neo4j (first time using it) of about 10 different nodes that are connected in various ways. Not all nodes are connected to each other, as some have up to 6 or 7 neighbors, while some have only 1. What query would I write/use to check if a path exists from NodeA to NodeB? It doesn't have to be the shortest path, just if a path exists.
Along with this, is there a way to count who has the most or least neighbors? Thanks everyone for help in advance.
Return Foo nodes a and b if there is at least one path between them. (This variable-length path query with unbounded length could take a very long time or run out of memory if there are a lot of paths or very long paths).
MATCH (a:Foo {id: 'a'}), (b:Foo {id: 'b'})
WHERE (a)-[*]-(b)
RETURN a, b;
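For intuition about what an existence check has to do, here is a minimal Python sketch of a breadth-first reachability test. It runs outside Neo4j on a hypothetical in-memory adjacency map; the `graph` dictionary and the `has_path` helper are assumptions made up for this illustration, not part of any Neo4j API. The search stops as soon as the target is seen, which is why checking for existence can be much cheaper than enumerating every path.

```python
from collections import deque


def has_path(graph, start, goal):
    """Return True if goal can be reached from start by following edges."""
    if start == goal:
        return True
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == goal:
                return True  # stop at the first hit; no need to list all paths
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


# Hypothetical toy graph; each undirected edge is listed in both directions.
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a"],
    "d": ["b"],
    "e": [],
}

print(has_path(graph, "a", "d"))  # True
print(has_path(graph, "a", "e"))  # False
```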
Return all paths between a and b. (This query could require even more time and memory than the previous query, since it will attempt to return all matching paths).
MATCH path=(a:Foo {id: 'a'})-[*]-(b:Foo {id: 'b'})
RETURN path;
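Return the 10 nodes with the most neighbors, in descending order (for the fewest, sort with ORDER BY c ASC instead; nodes with no relationships at all do not match this pattern and will not appear):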
MATCH (n)--()
WITH n, COUNT(*) AS c
RETURN n
ORDER BY c DESC
LIMIT 10;
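The counting step can be pictured with a small Python sketch. The edge list below is hypothetical and simply stands in for the relationships in the graph; it is meant to show the idea of tallying one count per relationship end, not how Neo4j executes the query.

```python
from collections import Counter

# Hypothetical undirected edge list standing in for the graph's relationships.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]

degree = Counter()
for left, right in edges:
    # Every relationship adds one to the tally of each of its two end nodes.
    degree[left] += 1
    degree[right] += 1

most_connected = degree.most_common(1)[0]                   # highest tally
least_connected = min(degree.items(), key=lambda kv: kv[1])  # lowest tally

print(most_connected)   # ('a', 3)
print(least_connected)  # ('e', 1)
```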
When I ran this Cypher query:
match p=(n)-[r*8]-(n)
where id(n)=548
with p
where ALL(x IN nodes(p)[1..length(p)] WHERE SINGLE(y IN nodes(p)[1..length(p)] WHERE x=y))
return count(p)
it took 51922 ms to return the result, which is a really long time. How can I optimize this Cypher query? Any help would be appreciated.
Looks like you want a simple circuit with no repeating nodes (except the start and end node).
There's an APOC Procedure to get all simple paths between two nodes, with a maximum path length. It doesn't currently work when the start and end nodes are the same, but if we set the end node as any adjacent node to your start node, and filter to only keep paths of length 7 (since the paths exclude the last hop back to the start node), then we should be able to get the right answer extremely fast.
match (n)--(m)
where id(n) = 548
with distinct n, m
call apoc.algo.allSimplePaths(n, m, '', 7) YIELD path
with path
where length(path) = 7
return count(path)
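To make the "simple circuit" condition concrete, here is a brute-force Python sketch that counts circuits of a fixed length through one start node. It is only an illustration under stated assumptions: the graph is a hypothetical simple undirected graph stored as a dictionary of neighbor sets, and `count_simple_circuits` is a made-up helper, not the way APOC implements its procedure.

```python
def count_simple_circuits(graph, start, length):
    """Count closed walks of exactly `length` hops that begin and end at
    `start` and visit no other node more than once.

    Assumes a simple undirected graph given as {node: set_of_neighbors}.
    Every circuit is counted twice, once per direction of travel.
    """
    if length < 3:
        raise ValueError("a simple circuit in a simple graph needs at least 3 hops")

    def extend(node, hops_left, visited):
        if hops_left == 1:
            # The last hop has to close the circuit back at the start node.
            return 1 if start in graph[node] else 0
        total = 0
        for neighbor in graph[node]:
            # Interior nodes must be new and must not be the start node.
            if neighbor != start and neighbor not in visited:
                total += extend(neighbor, hops_left - 1, visited | {neighbor})
        return total

    return extend(start, length, frozenset())


# Hypothetical graph: a square 1-2-3-4-1 with one diagonal 1-3.
graph = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2, 4}, 4: {1, 3}}

print(count_simple_circuits(graph, 1, 4))  # 2 (the square, in both directions)
print(count_simple_circuits(graph, 1, 3))  # 4 (two triangles, in both directions)
```

The number of candidate walks grows quickly with the hop count, which fits the long runtime reported for the 8-hop query.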
I'm using Neo4j to try to find any node that is not connected to a specific node "a". The query that I have so far is:
MATCH p = shortestPath((a:Node {id:"123"})-[*]-(b:Node))
WHERE p IS NULL
RETURN b.id as b
So it tries to find the shortest path between a and b. If it doesn't find a path, then it returns that node's id. However, this causes my query to run for a few minutes then crashes when it runs out of memory. I was wondering if this method would even work, and if there is a more efficient way? Any help would be greatly appreciated!
edit:
MATCH (a:Node {id:"123"})-[*]-(b:Node),
(c:Node)
WITH collect(b) as col, a, b, c
WHERE a <> b AND NOT c IN col
RETURN c.id
So col (collect(b)) contains every node connected to a, therefore if c is not in col then c is not connected to a?
For one, you're giving this MATCH an impossible predicate to fulfill, so it will never find the shortest path.
WHERE clauses are associated with MATCH, OPTIONAL MATCH, and WITH clauses, so your query is asking for the shortest path where the path doesn't exist. That will never return anything.
Also, the shortestPath will start at the node you DON'T want to be connected, so this has no way of finding the nodes that aren't connected to it.
Probably the easiest way to approach this is to MATCH to all nodes connected to your node in question, then MATCH to all :Nodes checking for those that aren't in the connected set. That means you won't have to do a shortestPath from every single node in the db, just a membership check in a collection.
You'll need APOC Procedures for this, as it has the fastest means of matching to nodes within a subgraph.
MATCH (a:Node {id:"123"})
CALL apoc.path.subgraphNodes(a, {}) YIELD node
WITH collect(node) as subgraph
MATCH (b:Node)
WHERE NOT b in subgraph
RETURN b.id as b
EDIT
Your edited query is likely to blow up, because it will generate a huge result set: every node reachable from your start node by a unique path, in a Cartesian product with every :Node.
Instead, go step by step, collect the distinct nodes (because otherwise you'll get multiples of the same nodes that can be reached via different paths), and then only after you have your collection should you start your match for nodes that aren't in the list.
MATCH (:Node {id:"123"})-[*0..]-(b:Node)
WITH collect(DISTINCT b) as col
MATCH (a:Node)
WHERE NOT a IN col
RETURN a.id
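The same two-step idea (collect the connected set once, then do a membership check) can be sketched in plain Python. The adjacency map and the `reachable_from` helper are hypothetical and exist only for this illustration.

```python
def reachable_from(graph, start):
    """Collect every node reachable from start, including start itself."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen


# Hypothetical graph with three separate components.
graph = {
    "123": ["x"], "x": ["123", "y"], "y": ["x"],
    "p": ["q"], "q": ["p"],
    "solo": [],
}

connected = reachable_from(graph, "123")      # step 1: build the set once
not_connected = sorted(set(graph) - connected)  # step 2: membership check only

print(not_connected)  # ['p', 'q', 'solo']
```

Including the start node in the connected set plays the same role as the zero-length lower bound in the query above: it keeps the start node itself out of the result.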
For example, I want to query allShortestPaths between 3 nodes (A, B, C), which means I want to query:
1. the allShortestPaths between A and B
2. the allShortestPaths between C and B
3. the allShortestPaths between A and C
But I can only find an allShortestPaths query that takes two nodes, as follows:
MATCH (node1:E { eid:"a9c2f114-796f-4934-a2d0-04bb3345e1d2" }),
(node2:E { eid:"01968dd2-1ed6-472d-82e9-be7701036b3b" }),
p = allShortestPaths((node1)-[*]-(node2))
RETURN p LIMIT 25
I am wondering whether there is an allShortestPaths query that supports more than two input nodes.
Right now, to search 3 nodes, I have to invoke allShortestPaths three times, as follows:
MATCH (node1:E { eid:"b73ade90-dfa1-4b94-bd0f-c16fd93bd680" }),
(node2:E { eid:"ddb5c52d-7002-4ac7-87d5-0f727f2ab3e7" }),
(node3:E { eid:"0398b081-6676-4a91-856b-abbabaee5e70" }) ,
p = allShortestPaths((node1)-[*]-(node2)),
q = allShortestPaths((node3)-[*]-(node2)),
m = allShortestPaths((node3)-[*]-(node1))
RETURN p,q,m LIMIT 10
What I want to do is search allShortestPaths between an arbitrary number of nodes.
So far, I intend to write a user-defined procedure, but that will take more time. I wonder whether anyone can suggest a better approach.
I want to search allShortestPaths between several nodes, such as:
allShortestPaths((a)-[*]-(b)-[*]-(c)-[*]-(a))
I want to get all the shortest paths between a and b, b and c, and c and a in a single query.
You need nested loops:
// Array of id
WITH ["b73ade90-dfa1-4b94-bd0f-c16fd93bd680",
"ddb5c52d-7002-4ac7-87d5-0f727f2ab3e7",
"0398b081-6676-4a91-856b-abbabaee5e70"] as IDS
UNWIND IDS as vid
// Looking for the desired nodes
MATCH (N:E {eid: vid})
WITH collect(N) as NS
// Nested loops
UNWIND RANGE(0, size(NS)-2) as i1
UNWIND RANGE(i1+1, size(NS)-1) as i2
WITH NS[i1] as N1,
NS[i2] as N2
// Get paths
MATCH ps = allShortestPaths((N1)-[*]-(N2))
RETURN ps
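The index arithmetic in the nested loops can be checked with a short Python sketch. The `index_pairs` helper and the sample list are hypothetical; the point is only to show that each unordered pair is produced exactly once, which gives n(n-1)/2 pairs for n nodes.

```python
from itertools import combinations


def index_pairs(items):
    """Pair every item with each later item, mirroring the two UNWIND ranges."""
    pairs = []
    for i1 in range(0, len(items) - 1):        # like RANGE(0, size(NS)-2), which is inclusive
        for i2 in range(i1 + 1, len(items)):   # like RANGE(i1+1, size(NS)-1), which is inclusive
            pairs.append((items[i1], items[i2]))
    return pairs


nodes = ["A", "B", "C", "D"]  # hypothetical stand-ins for the matched nodes
pairs = index_pairs(nodes)

print(pairs)
# [('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'C'), ('B', 'D'), ('C', 'D')]
print(len(pairs))  # 6, i.e. 4 * 3 / 2
```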
Neo4j doesn't provide a version of allShortestPaths taking multiple patterns, which is what you want:
allShortestPaths((node1)-[*]-(node2), (node1)-[*]-(node3), (node2)-[*]-(node3))
You wish to optimize the traversals by piggy-backing on the first one to do the second one at the same time, but there's no such thing out of the box, and it wouldn't do the third one either. It's a really specific use case.
You either have to call allShortestPaths n*(n-1)/2 times (once per unordered pair of the n nodes) in Cypher, or try implementing it yourself server-side in a procedure using the Traversal framework.
Here is a sample Cypher query:
MATCH (n:Entity) where n.name IN $names
WITH collect(n) as nodes
UNWIND nodes as n
UNWIND nodes as m
WITH * WHERE id(n) < id(m)
MATCH path = allShortestPaths( (n)-[*..4]-(m) )
RETURN path
See https://neo4j.com/developer/kb/all-shortest-paths-between-set-of-nodes/ for more.
I am looking for the longest path in my graph, and I want to count the number of distinct nodes on that path.
I want to use count(distinct()).
I tried two queries.
The first is:
match p=(primero)-[:ResponseTo*]-(segundo)
with max(length(p)) as lengthPath
match p1=(primero)-[:ResponseTo*]-(segundo)
where length(p1) = lengthPath
return nodes(p1)
The query result is a graph with the path nodes.
But when I tried this query:
match p=(primero)-[:ResponseTo*]-(segundo)
with max(length(p)) as lengthPath
match p1=(primero)-[:ResponseTo*]-(segundo)
where length(p1) = lengthPath
return count(distinct(primero))
The result is
count(distinct(primero))
2
How can I use count(distinct()) on the node primero?
The primero node has a property called id.
You should bind at least one of those nodes, add a direction, and also consider a path-length limit; otherwise this is an extremely expensive query.
match p=(primero)-[:ResponseTo*..30]-(segundo)
with p order by length(p) desc limit 1
unwind nodes(p) as n
return distinct n;
I am very new to Cypher and need help with a problem I am facing.
In my graph I have a path representing a data stream, and I need to know, for each node in the path, its distance from the last node of the path.
For example, if I have the following path:
(a)->(b)->(c)->(d)
the distance must be 3 for a, 2 for b, 1 for c and 0 for d.
Is there an efficient way to obtain this result in Cypher?
Thanks a lot!
Mauro
If it is just hops between nodes, then I think this will fit the bill.
match p=(a:Test {name: 'A'})-[r*3]->(d:Test {name: 'D'})
with p, range(length(p),0,-1) as idx
unwind idx as elem
return (nodes(p)[elem]).name as Node
, length(p) - elem as Distance
order by Node
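The arithmetic behind the Distance column is simply "path length minus position". A tiny Python sketch, using a hypothetical list of node names in path order, shows the same calculation:

```python
def hops_to_end(path_nodes):
    """Pair each node with its number of hops to the last node of the path."""
    last_index = len(path_nodes) - 1  # equals the path length in hops
    return [(name, last_index - i) for i, name in enumerate(path_nodes)]


# Hypothetical path (a)->(b)->(c)->(d), listed from first node to last.
print(hops_to_end(["a", "b", "c", "d"]))
# [('a', 3), ('b', 2), ('c', 1), ('d', 0)]
```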
In this answer, I define a path to be "complete" if its start node has no incoming relationship and its end node has no outgoing relationship.
This query returns, for each "complete" path, a collection of objects containing each node's neo4j-generated ID and the number of hops to the end of that path:
MATCH p=(x)-[*]->(y)
WHERE (NOT ()-->(x)) AND (NOT (y)-->())
WITH NODES(p) AS np, LENGTH(p) AS lp
RETURN [i IN RANGE(0, lp, 1) | {id: ID(np[i]), hops: lp - i}]
NOTE: Matching with [*] will be costly with large graphs, so you may need to limit the maximum hop value. For example, use [*..4] instead to limit the max hop value to 4.
Also, qualifying the query with appropriate node labels and relationship types may speed it up.