Merging tracks in Neo4j

(Using Neo4j 3.x and neo4j.v1 Python driver)
I have two tracks T1 and T2 to the same target. Somewhere before reaching the target, the two tracks meet at node X and become one until the target is reached.
Track T1: T----------X-----------A
Track T2: '-----Q
I use the following Cypher query to generate each one of the tracks:
UNWIND {coords} AS coordinates
UNWIND {pax} AS pax
CREATE (n:Node)
SET n = coordinates
SET n.pax = pax
RETURN n
using the parameter list, e.g. {'pax': 'A', 'coords': [{'id': 0, 'lon': '8.553095', 'lat': '47.373146'}, etc.]}
and then link the nodes using the id only for the purpose of keeping the sequence of the trackpoints:
UNWIND {pax} AS pax
MATCH (n:Node {pax: pax})
WITH n
ORDER BY n.id
WITH COLLECT(n) AS nodes
UNWIND RANGE(0, SIZE(nodes) - 2) AS idx
WITH nodes[idx] AS n1, nodes[idx+1] AS n2
MERGE (n1)-[:NEXT]->(n2)
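The index walk in the linking query can be sketched outside the database. This is a minimal plain-Python illustration, not driver code; the function name `link_consecutive` and the sample data are hypothetical.

```python
def link_consecutive(nodes):
    """Return (n1, n2) pairs for neighbouring trackpoints, ordered by 'id'."""
    ordered = sorted(nodes, key=lambda n: n["id"])
    # Same walk as RANGE(0, SIZE(nodes) - 2): pair index idx with idx + 1.
    return [(ordered[i], ordered[i + 1]) for i in range(len(ordered) - 1)]


# Hypothetical trackpoints of one pax, deliberately out of order.
track = [{"id": 2, "pax": "A"}, {"id": 0, "pax": "A"}, {"id": 1, "pax": "A"}]
pairs = [(a["id"], b["id"]) for a, b in link_consecutive(track)]
print(pairs)  # [(0, 1), (1, 2)]
```

A track with fewer than two points yields no pairs, which matches the empty range produced by the query in that case.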
From the (unknown) point X (CS1 in the picture above) on, both tracks have identical trackpoints. I can match those using:
MATCH (n:Node), (m:Node)
WHERE m <> n AND n.id < m.id AND n.lat = m.lat AND n.lon = m.lon
MERGE (n)-[:IS]->(m)
with lat, lon being the (identical) coordinates. This is just my clumsy way to determine the first joint trackpoint. What I really need is to have one (linked) track from point X onward with the pax property updated, e.g. as ['A', 'B']
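To make the desired outcome concrete, here is a minimal plain-Python sketch of it, assuming each track is an ordered list of (lat, lon) tuples and that the two tracks are identical from X to the target. The function name `merge_tracks` and the sample coordinates are hypothetical, and this illustrates the target data shape rather than a Cypher solution.

```python
def merge_tracks(track_a, track_b, pax_a, pax_b):
    """Split two tracks into their private heads and one shared tail.

    The shared tail is found by walking both tracks backwards from the
    target for as long as the coordinates agree.
    """
    shared = 0
    while (shared < min(len(track_a), len(track_b))
           and track_a[-1 - shared] == track_b[-1 - shared]):
        shared += 1
    cut_a = len(track_a) - shared
    cut_b = len(track_b) - shared
    head_a = [{"coord": c, "pax": [pax_a]} for c in track_a[:cut_a]]
    head_b = [{"coord": c, "pax": [pax_b]} for c in track_b[:cut_b]]
    # From X onward there is a single list whose pax names both travellers.
    tail = [{"coord": c, "pax": [pax_a, pax_b]} for c in track_a[cut_a:]]
    return head_a, head_b, tail


# Hypothetical coordinates: the last two points are common to both tracks.
a = [("47.1", "8.1"), ("47.2", "8.2"), ("47.3", "8.3"), ("47.4", "8.4")]
b = [("47.0", "8.0"), ("47.3", "8.3"), ("47.4", "8.4")]
head_a, head_b, tail = merge_tracks(a, b, "A", "B")
print(tail[0])  # {'coord': ('47.3', '8.3'), 'pax': ['A', 'B']}
```

The first element of `tail` corresponds to point X, the first joint trackpoint.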
Question 1 (generalized):
How can I merge two nodes with a relationship into one node with an updated property? C3 and S3 merge into a new node CS3.
Question 2:
How can I do this if I have two linked lists with a set of pairwise identical properties?
(Ax)-[:NEXT]-> (A1)-[:NEXT]->(A2)-[:NEXT]->(A3)
(Bx)-[:NEXT]-> (B1)-[:NEXT]->(B2)-[:NEXT]->(B3)
where Ax.x <> Bx.x but A1.x = B1.x and A2.x = B2.x, etc.
Thank you all for your hints and helpful ideas.

Related

Neo4J: How can I find if a path traversing multiple nodes given in a list exist?

I have a graph of nodes with a relationship NEXT with 2 properties sequence (s) and position (p). For example:
N1-[NEXT{s:1, p:2}]-> N2-[NEXT{s:1, p:3}]-> N3-[NEXT{s:1, p:4}]-> N4
A node N might have multiple outgoing NEXT relationships with different property values.
Given a list of node names, e.g. [N2,N3,N4], representing a sequential path, I want to check that the graph contains the nodes and that the nodes are connected by NEXT relationships in order.
For example, if the list contains [N2,N3,N4], then check whether there is a NEXT relationship between nodes N2 and N3 and between N3 and N4.
In addition, I want to make sure that the nodes are part of the same sequence, i.e. that the property s is the same for each NEXT relationship. To ensure that the order is maintained, I need to verify that the property p increases by one at each step: if the value of p on the relationship N2 -> N3 is 3, the value of p on N3 -> N4 must be 3 + 1 = 4, and so on.
I tried using APOC to retrieve the possible paths from an initial node N using Python (library: neo4jrestclient) and then process the paths manually to check whether a sequence exists, using the following query:
q = "MATCH (n:Node) WHERE n.name = 'N' CALL apoc.path.expandConfig(n, {relationshipFilter:'NEXT>', maxLevel:4}) YIELD path RETURN path"
results = db.query(q,data_contents=True)
However, the query ran for so long that I eventually stopped it. Any ideas?
This one is a bit tough.
First, pre-match the nodes in the path. We can then use the collected nodes as a whitelist for the nodes in the path.
Assuming the start node is included in the list, a query might go like:
UNWIND $names as name
MATCH (n:Node {name:name})
WITH collect(n) as nodes
WITH nodes, nodes[0] as start, tail(nodes) as tail, size(nodes)-1 as depth
CALL apoc.path.expandConfig(start, {whitelistNodes:nodes, minLevel:depth, maxLevel:depth, relationshipFilter:'NEXT>'}) YIELD path
WHERE all(index in range(0, size(nodes)-1) WHERE nodes[index] = nodes(path)[index])
// we now have only paths with the given nodes in order
WITH path, relationships(path)[0].s as sequence
WHERE all(rel in tail(relationships(path)) WHERE rel.s = sequence)
// now each path only has relationships of common sequence
WITH path, apoc.coll.pairsMin([rel in relationships(path) | rel.p]) as pairs
WHERE all(pair in pairs WHERE pair[0] + 1 = pair[1])
RETURN path
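The two relationship checks at the end of the query (common sequence, then positions increasing by one) can be sketched in plain Python. This is only an illustration of the filtering logic under the assumption that the relationships of one candidate path are available as a list of dicts; the function name `is_valid_sequence` is hypothetical.

```python
def is_valid_sequence(rels):
    """rels: list of dicts with keys 's' and 'p', one per NEXT hop, in path order."""
    if not rels:
        return True
    # Every relationship must carry the same sequence value as the first one.
    same_sequence = all(r["s"] == rels[0]["s"] for r in rels[1:])
    # Each adjacent pair of positions must differ by exactly 1.
    positions = [r["p"] for r in rels]
    increasing = all(b == a + 1 for a, b in zip(positions, positions[1:]))
    return same_sequence and increasing


# Hops N2->N3 and N3->N4 from the example in the question.
print(is_valid_sequence([{"s": 1, "p": 3}, {"s": 1, "p": 4}]))  # True
print(is_valid_sequence([{"s": 1, "p": 3}, {"s": 2, "p": 4}]))  # False
```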

Is it possible to iterate through a property of a relationship in Cypher?

This is related to this question: How to store properties of a neo4j node as an array?
I would like to iterate through a property of a relationship, find the maximum of that value, create a new relationship between node1 and node2, remove node1 from the pool, and move on to the next one. In other words, in the context of my previous question: how can I assign an employee to a position based on max(r.score), and then move on to the employee who has the maximum r.score for another position? Thanks
I have this basic query that assigns a position to the employee who has the maximum r.score for that position and removes him from the pool of candidates. However, I have to run it again manually for the second position. Ideally I want something that checks the number of available positions, fills them by max(r.score), and stops when all positions are filled. Maybe it could also return a report of hired employees.
MATCH (e:Employee)-[r:FUTURE_POSITION]->(p:Position)
WITH MAX(r.score) as s
MATCH (e)-[r]->(p) WHERE r.score = s
CREATE (e)-[r2:YOUAREHIRED]->(p)
DELETE r
RETURN e.name, s
This query may work for you:
MATCH (:Employee)-[r:FUTURE_POSITION]->(p:Position)
WITH p, COLLECT(r) AS rs
WITH p, REDUCE(t = rs[0], x IN rs[1..] |
CASE WHEN x.score > t.score THEN x ELSE t END) AS maxR
WITH p, maxR, maxR.score AS maxScore, STARTNODE(maxR) AS e
CREATE (e)-[:YOUAREHIRED]->(p)
DELETE maxR
RETURN p, e.name AS name, maxScore;
The first WITH clause collects all the FUTURE_POSITION relationships for each p.
The second WITH clause obtains, for each p, the relationship with the maximum score.
The third WITH clause extracts the variables needed by subsequent clauses.
The CREATE clause creates the YOUAREHIRED relationship between e (the employee with the highest score for a given p) and p.
The DELETE clause deletes the FUTURE_POSITION relationship between e and p.
The RETURN clause returns each p, along with the name of the employee who was just hired for p, and his score, maxScore.
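The COLLECT and REDUCE steps described above can be mirrored in plain Python. This is a minimal sketch, assuming each FUTURE_POSITION relationship is represented as an (employee, position, score) tuple; the function name `best_per_position` and the sample data are hypothetical.

```python
from functools import reduce


def best_per_position(rels):
    """rels: list of (employee, position, score) tuples, one per relationship."""
    # Group relationships by position, like WITH p, COLLECT(r) AS rs.
    by_position = {}
    for rel in rels:
        by_position.setdefault(rel[1], []).append(rel)
    # Start from rs[0] and switch only when a later score is strictly higher.
    return {
        pos: reduce(lambda t, x: x if x[2] > t[2] else t, rs[1:], rs[0])
        for pos, rs in by_position.items()
    }


rels = [("Ann", "Dev", 7), ("Bob", "Dev", 9), ("Ann", "QA", 8), ("Cid", "QA", 8)]
hired = {pos: rel[0] for pos, rel in best_per_position(rels).items()}
print(hired)  # {'Dev': 'Bob', 'QA': 'Ann'}
```

Because the comparison is strict, a tie keeps the relationship that was collected first. Note also that each position is decided independently, so in this sketch the same employee can come out on top for more than one position.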
[UPDATE]
If you want to delete all FUTURE_POSITION relationships of each p node that gets a YOUAREHIRED relationship, you can use this slightly different query:
MATCH (:Employee)-[r:FUTURE_POSITION]->(p:Position)
WITH p, COLLECT(r) AS rs
WITH p, rs, REDUCE(t = rs[0], x IN rs[1..] |
CASE WHEN x.score > t.score THEN x ELSE t END) AS maxR
WITH p, rs, maxR.score AS maxScore, STARTNODE(maxR) AS e
CREATE (e)-[:YOUAREHIRED]->(p)
FOREACH(x IN rs | DELETE x)
RETURN p, e.name AS name, maxScore;

Neo4j - Intersect two node lists using Cypher

Having the following graphs:
node g1 with child nodes (a, b)
node g2 with child nodes (b, c)
using the query
MATCH (n)-[]-(m) WHERE ID(m) = id RETURN n
where id is the id of node g1, I get a and b, and likewise b and c when using the id of g2. What I would like to understand is how I can get the intersection of those two results: with the first query returning (a, b) and the second returning (b, c), the final result should be (b).
I tried using the WITH clause but I wasn't able to achieve the desired result. Keep in mind that I'm new to Neo4j and only came here after a few failed attempts, research in the Neo4j documentation, general Google searches and Stack Overflow.
Edit1 (one of my tries):
MATCH (n)-[]->(m)
WHERE ID(m) = 750
WITH n
MATCH (o)-[]->(b)
WHERE ID(b) = 684 and o = n
RETURN o
Edit2:
The node (b), which I represented as being the same in both graphs, is in fact two different nodes in the db, each one belonging to a different graph (g1 and g2). Conceptually they are the same, as they have exactly the same info (labels and attributes), but in the database they are not. I'm sorry, it was my fault for not being more explicit on this matter :(
Edit3:
Why I'm not using a single node (b) for both graphs
Using the graphs above as an example, imagine that I have yet another layer: in g1 the child node (b) has a child (e), while in g2 the child node (b) has a child (f). If I had (b) as a single node, then when I create (e) and (f) I could only add them to (b), losing the hierarchy, and it would become impossible to distinguish which of them, (e) or (f), belongs to g1 or g2.
This should work (assuming you pass id1 and id2 as parameters):
MATCH (a)--(n)--(c)
WHERE ID(a) = {id1} AND ID(c) = {id2}
RETURN n;
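What this pattern computes is the set of nodes adjacent to both endpoints, i.e. the intersection of the two neighbour sets. A minimal plain-Python sketch, assuming the graph is given as undirected (u, v) pairs and that b is a single node shared by g1 and g2; the function name `common_neighbours` is hypothetical.

```python
def common_neighbours(edges, id1, id2):
    """edges: iterable of (u, v) pairs, treated as undirected like (a)--(n)."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    # Nodes reachable in one hop from both id1 and id2.
    return neighbours.get(id1, set()) & neighbours.get(id2, set())


edges = [("g1", "a"), ("g1", "b"), ("g2", "b"), ("g2", "c")]
print(common_neighbours(edges, "g1", "g2"))  # {'b'}
```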
[UPDATED, based on new info from comments]
If you have multiple "clones" of the "same" node and you want to quickly determine which clones are related without having to perform a lot of (slow) property comparisons, you can add a relationship (say, typed ":CLONE") between clones. That way, a query like this would work:
MATCH (a)--(m)-[:CLONE]-(n)--(c)
WHERE ID(a) = {id1} AND ID(c) = {id2}
RETURN m, n;
You can find the duplicated nodes by using one of these queries:
[1] For a single node:
MATCH pathx = (n)-[:Relationship]-(find) WHERE find.name = "action" RETURN pathx;
[2] For two nodes, giving only the immediate parent node:
MATCH pathx = (n)-[:Relationship]-(find), pathy = (p)-[:Relationship]-(seek)
WHERE find.name = "action" AND seek.name = "requestID"
RETURN pathx, pathy;
[3] To find the entire network, i.e. all the connected nodes:
MATCH pathx = (n)--()-[:Relationship]-(find), pathy = (p)--()-[:Relationship]-(seek)
WHERE find.name = "action" AND seek.name = "requestID"
RETURN pathx, pathy;

Count duplicated

Imagine that I have a graph in which, for every pair of nodes m, n of type Nod1, there can be a node k of type Nod2 that connects them through relationships of type Rel; that is, there can be multiple patterns of the kind p=(m:Nod1)-[r:Rel]-(k:Nod2)-[s:Rel]-(n:Nod1). For a given node m (satisfying, for example, m.key="whatever"), how can I find the node n that maximizes the number of nodes k that connect m to n? For example, imagine that there are 3 nodes k that connect m to n1 (with n1.key="hello") and 10 nodes k that connect m to n2 (with n2.key="world"); how do I build a query that retrieves the node n2? :)
The title of the question is "count duplicated" because I think the problem is solved if I can count all the "duplicated" patterns for each node n (that is, all patterns that have n as the end node)! :)
Start by matching your m; then match the pattern you want, then filter by distinct n nodes, and count the number of k nodes connected via that n node, and you should be there.
MATCH (m:Nod1 { key: "whatever" })
WITH m
MATCH (m)-[r:Rel]-(k:Nod2)-[s:Rel]-(n:Nod1)
RETURN distinct(n), count(k) as x
ORDER BY x DESC;
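The grouping and ordering can be pictured in plain Python as a counter over the matched patterns. This is a minimal sketch, assuming every match is available as an (m, k, n) triple; the function name `rank_by_connectors` and the sample node names are hypothetical.

```python
from collections import Counter


def rank_by_connectors(paths, m):
    """paths: iterable of (m, k, n) triples, one per matched pattern."""
    # One count per matched pattern ending in n, like count(k) grouped by n.
    counts = Counter(n for start, k, n in paths if start == m)
    # Highest count first, like ORDER BY x DESC.
    return counts.most_common()


# 3 connectors lead to n1 ("hello") and 10 lead to n2 ("world").
paths = [("m", f"k{i}", "hello") for i in range(3)]
paths += [("m", f"k{i}", "world") for i in range(3, 13)]
ranking = rank_by_connectors(paths, "m")
print(ranking[0])  # ('world', 10)
```

The first entry of the ranking is the node n the question asks for.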

Find the distance in a path between each node and the last node of the path

I am very new to Cypher and I need help to solve a problem I am facing.
In my graph I have a path representing a data stream and I need to know, for each node in the path, its distance from the last node of the path.
For example, if I have the following path:
(a)->(b)->(c)->(d)
the distance must be 3 for a, 2 for b, 1 for c and 0 for d.
Is there an efficient way to obtain this result in Cypher?
Thanks a lot!
Mauro
If it is just hops between nodes, then I think this will fit the bill.
match p=(a:Test {name: 'A'})-[r*3]->(d:Test {name: 'D'})
with p, range(length(p),0,-1) as idx
unwind idx as elem
return (nodes(p)[elem]).name as Node
, length(p) - elem as Distance
order by Node
In this answer, I define a path to be "complete" if its start node has no incoming relationship and its end node has no outgoing relationship.
This query returns, for each "complete" path, a collection of objects containing each node's neo4j-generated ID and the number of hops to the end of that path:
MATCH p=(x)-[*]->(y)
WHERE (NOT ()-->(x)) AND (NOT (y)-->())
WITH NODES(p) AS np, LENGTH(p) AS lp
RETURN EXTRACT(i IN RANGE(0, lp, 1) | {id: ID(np[i]), hops: lp - i})
NOTE: Matching with [*] will be costly with large graphs, so you may need to limit the maximum hop value. For example, use [*..4] instead to limit the max hop value to 4.
Also, qualifying the query with appropriate node labels and relationship types may speed it up.
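Both answers rely on the same arithmetic: the distance of the node at index i is the path length minus i. A minimal plain-Python sketch of that calculation, assuming the nodes of one path are already available as an ordered list; the function name `hops_to_end` is hypothetical.

```python
def hops_to_end(path_nodes):
    """Return each node with its number of hops to the last node of the path."""
    # The path length in hops is the number of nodes minus one.
    last = len(path_nodes) - 1
    return [{"node": node, "hops": last - i} for i, node in enumerate(path_nodes)]


result = hops_to_end(["a", "b", "c", "d"])
print([(r["node"], r["hops"]) for r in result])
# [('a', 3), ('b', 2), ('c', 1), ('d', 0)]
```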
