I have a simple graph with nodes labeled as Organization and two directed relationship types 'Control' and 'Influence'. My objective is -
Step-1
Given a node, find all nodes connected to it with 'Control' relationship (in any direction)
Step-2
For all nodes found by Step-1, find any outbound 'Influence' relationships (of any length) and include those nodes as well
What I have been able to come up thus far:
MATCH (x:Organization {ORGID: "5621"})-[:Control*1..]-(y) WITH y MATCH y-[:Influence*0..]-(z) RETURN y,z;
Questions
1) This query does not include the starting node, how can I get that in the result?
2) Ideally I'd like to get the relationships in the result also, it just returns the nodes
TIA
Here is one way to do what you want:
MATCH (x:Organization { ORGID: "5621" }), p1 = x-[:Influence*0..]->(z)
WHERE NOT z-[:Influence]->()
WITH x, COLLECT(p1) AS c1
OPTIONAL MATCH x-[:Control*]-(y), p2=y-[:Influence*0..]->(z)
WHERE NOT z-[:Influence]->()
RETURN c1 + COLLECT(p2) AS result;
The query returns a collection of matching Influence paths. The WHERE clauses are used to make sure that each path is as long as possible (to avoid a lot of duplicate subpaths from showing up in the results). Every path is a collection of node(s) separated by relationships, and can consist of just a node if the path has no relationships.
Related
Having this query working in Cypher (Neo4j):
MATCH p=(g:Node)-[:FOLLOWED_BY *2..2]->(g2:Node)
WHERE g.group=10 AND g2.group=10
RETURN p
which returns all possible paths belonging a specific group (group is just a property to classify nodes), I am struggling to get a query that returns the paths in common between both collection of paths. It would be something like this:
MATCH p=(g:Node)-[:FOLLOWED_BY *2..2]->(g2:Node)
WHERE g.group=10 AND g2.group=10
MATCH p=(g3:Node)-[:FOLLOWED_BY *2..2]->(g4:Node)
WHERE g3.group=15 AND g4.group=15
RETURN INTERSECTION(path1, path2)
Of course I made that up. The goal is to get all the paths in common between both queries.
The start/end nodes of your 2 MATCHes have different groups, so they can never find common paths.
Therefore, when you ask for "paths in common", I assume you actually want to find the shared middle nodes (between the 2 sets of 3-node paths). If so, this query should work:
MATCH p1=(g:Node)-[:FOLLOWED_BY *2]->(g2:Node)
WHERE g.group=10 AND g2.group=10
WITH COLLECT(DISTINCT NODES(p1)[1]) AS middle1
MATCH p2=(g3:Node)-[:FOLLOWED_BY *2]->(g4:Node)
WHERE g3.group=15 AND g4.group=15 AND NODES(p2)[1] IN middle1
RETURN DISTINCT NODES(p2)[1] AS common_middle_node;
I have a graph consisting of paths.
I need to delete all nodes, that have a property:
linksTo:'javascript'
After deleting i have to reconnect the paths. This means i need to create a new relationship for each gap. This relationship has a property named deltaTime that holds some integer value. This value (deltaTime) should be the sum of all deltaTime-properties of the deleted relationships of this path. Please look at the following picture for better understanding.
I don't know how to detect multiple "bad" nodes in a row, with a variable row-length.
It would help if you could provide the labels involved in your graph, as well as the relationship types you're using.
Assuming all these paths are chains (only single relationships connecting each node), something like this should work for you:
// first, add a label on all the nodes we plan on deleting
// it helps if they already have the same label especially if linksTo property is indexed.
MATCH (n{linksTo:'javascript'})
SET n:ToDelete
With the appropriate nodes labeled, we'll find the segments of nodes to delete, the surrounding nodes that need to be connected, and create the new relationship. We ensure a and b are start and end nodes of the chain by ensuring there are no :ToDelete nodes linking to a, or from b.
MATCH (a:ToDelete)
WHERE NOT (:ToDelete)-->(a)
AND ()-->(a)
MATCH p = (a)-[rel*0..]->(b:ToDelete)
WHERE ALL(node in nodes(p) WHERE node:ToDelete)
AND NOT (b)-->(:ToDelete)
AND (b)-->()
WITH a, b, REDUCE(s = 0, r IN rel | s + r.value) as sum
// now get the adjacent nodes we need to connect
MATCH (x)-[r1]->(a), (b)-[r2]->(y)
WITH x, y, sum + r1.value + r2.value AS sumValue
// making up relationship type and property name as I don't know what you're using
MERGE (x)-[:Rel{value: sumValue}]->(y)
Finally, when you're sure the new relationships look correct, delete the nodes you no longer need.
MATCH (n:ToDelete)
DETACH DELETE n
I have a huge database of size 260GB, which is storing a ton of transaction information. It has Agent, Customer,Phone,ID_Card as the nodes. Relationships are as follows:
Agent_Send, Customer_Send,Customer_at_Agent, Customer_used_Phone,Customer_used_ID.
A single agent is connected to many customers .And hence hitting the agent node while querying a path is not feasible. Below is my query:
match p=((ph: Phone {Phone_ID : "3851308.0"})-[r:Customer_Send
| Customer_used_ID | Customer_used_Phone *1..5]-(n2))
with nodes(p) as ns
return extract (node in ns | Labels(node) ) as Labels
I am starting with a phone number and trying to extract a big "Customer" network. I am intentionally not touching the "Customer_at_Agent" relationship in the above networked query as it is not optimal as far as performance is concerned.
So, the idea is to extract all the "Customer" labeled nodes from the path and match it with [Customer_at_Agent] relationship.
For instance , something like:
match p=((ph: Phone {Phone_ID : "3851308.0"})-[r:Customer_Send
| Customer_used_ID | Customer_used_Phone *1..5]-(n2))
with nodes(p) as ns
return extract (node in ns | Labels(node) ) as Labels
of "type customer as c "
optional match (c)-[r1:Customer_at_Agent]-(n3)
return distinct p,r1
I am still new to neo4j and cypher and I am not able to figure out a hack to extract only "customer" nodes from the path and use that in the optional match.
Thanks in advance.
Use filter notation instead of extract and you can drop any nodes that aren't labelled right. Try out this query instead:
MATCH p = (ph:Phone {Phone_ID : "3851308.0"}) - [:Customer_Send|:Customer_used_ID|:Customer_used_Phone*1..5] - ()
WITH ph, [node IN NODES(p) WHERE node:Customer] AS customer_nodes
UNWIND customer_nodes AS c_node
OPTIONAL MATCH (c_node) - [r1:Customer_at_Agent] - ()
RETURN ph, COLLECT(DISTINCT r1)
So the second line takes the phone number and the path generated and gives you a list of nodes that have the Customer label as customer_nodes. You then unwind this list so you have individual nodes you can use in path matching. Line 4 performs your optional match and finds the r1 you're interested in, then line 5 will return the phone number node you started with and a collection of all of the r1 relationships that you found on customer nodes hooked up to that phone number.
UPDATE: I added some modifications to clean up your first query line as well. If you aren't going to use an alias (like r or n2 in the first line), then don't assign them in the first place; they can affect performance and cause confusion. Empty nodes and relationships are totally fine if you don't actually have any restrictions to place on them. You also don't need parentheses to mark off a path; they are used as a core part of Cypher's ASCII art to signify nodes, so I find they are more confusing than helpful.
In a graph where the following nodes
A,B,C,D
have a relationship with each nodes successor
(A->B)
and
(B->C)
etc.
How do i make a query that starts with A and gives me all nodes (and relationships) from that and outwards.
I do not know the end node (C).
All i know is to start from A, and traverse the whole connected graph (with conditions on relationship and node type)
I think, you need to use this pattern:
(n)-[*]->(m) - variable length path of any number of relationships from n to m. (see Refcard)
A sample query would be:
MATCH path = (a:A)-[*]->()
RETURN path
Have also a look at the path functions in the refcard to expand your cypher query (I don't know what exact conditions you'll need to apply).
To get all the nodes / relationships starting at a node:
MATCH (a:A {id: "id"})-[r*]-(b)
RETURN a, r, b
This will return all the graphs originating with node A / Label A where id = "id".
One caveat - if this graph is large the query will take a long time to run.
I'm struggling to find a single clean, efficient Cypher query that will let me identify all distinct paths emanating from a start node such that every relationship in the path is of the same type when there are many relationship types.
Here's a simple version of the model:
CREATE (a), (b), (c), (d), (e), (f), (g),
(a)-[:X]->(b)-[:X]->(c)-[:X]->(d)-[:X]->(e),
(a)-[:Y]->(c)-[:Y]->(f)-[:Y]->(g)
In this model (a) has two outgoing relationship types, X and Y. I would like to retrieve all the paths that link nodes along relationship X as well as all the paths that link nodes along relationship Y.
I can do this programmatically outside of cypher by making a series of queries, the first to
retrieve the list of outgoing relationships from the start node, and then a single query (submitted together as a batch) for each relationship. That looks like:
START n=node(1)
MATCH n-[r]->()
RETURN COLLECT(DISTINCT TYPE(r)) as rels;
followed by:
START n=node(1)
MATCH n-[:`reltype_param`*]->()
RETURN p as path;
The above satisfies my need, but requires at minimum 2 round trips to the server (again, assuming I batch together the second set of queries in one transaction).
A single-query approach that works, but is horribly inefficient is the following single Cypher query:
START n=node(1)
MATCH p = n-[r*]->() WHERE
ALL (x in RELATIONSHIPS(p) WHERE TYPE(x) = TYPE(HEAD(RELATIONSHIPS(p))))
RETURN p as path;
That query uses the ALL predicate to filter the relationships along the paths enforcing that each relationship in the path matches the first relationship in the path. This, however, is really just a filter operation on what it essentially a combinatorial explosion of all possible paths --- much less efficient than traversing a relationship of a known, given type first.
I feel like this should be possible with a single Cypher query, but I have not been able to get it right.
Here's a minor optimization, at least non-matching the paths will fail fast:
MATCH n-[r]->()
WITH distinct type(r) AS t
MATCH p = n-[r*]->()
WHERE type(r[-1]) = t // last entry matches
RETURN p AS path
This is probably one of those things that should be in the Java API if you want it to be really performant, though.