Is it possible to iterate through a property of a relationship in Cypher? - neo4j

This is related to this question: How to store properties of a neo4j node as an array?
I would like to iterate through a property of a relationship, find its maximum value, create a new relationship between node1 and node2, remove node1 from the pool, and then move on to the next node. In other words, in the context of my previous question: how do I assign an employee to a position based on max(r.score), and then move on to the next employee who has the maximum r.score for another position? Thanks.
I have this basic query, which assigns a position to the employee with the maximum r.score for that position and removes him from the pool of candidates. However, I have to run it again manually for the second position. Ideally I want something that checks the number of available positions, fills each position with the employee who has max(r.score), and stops when all positions are filled. Maybe it could also return a report of the hired employees.
MATCH (e:Employee)-[r:FUTURE_POSITION]->(p:Position)
WITH MAX(r.score) as s
MATCH (e)-[r]->(p) WHERE r.score = s
CREATE (e)-[r2:YOUAREHIRED]->(p)
DELETE r
RETURN e.name, s

This query may work for you:
MATCH (:Employee)-[r:FUTURE_POSITION]->(p:Position)
WITH p, COLLECT(r) AS rs
WITH p, REDUCE(t = rs[0], x IN rs[1..] |
CASE WHEN x.score > t.score THEN x ELSE t END) AS maxR
WITH p, maxR, maxR.score AS maxScore, STARTNODE(maxR) AS e
CREATE (e)-[:YOUAREHIRED]->(p)
DELETE maxR
RETURN p, e.name AS name, maxScore;
The first WITH clause collects all the FUTURE_POSITION relationships for each p.
The second WITH clause obtains, for each p, the relationship with the maximum score.
The third WITH clause extracts the variables needed by subsequent clauses.
The CREATE clause creates the YOUAREHIRED relationship between e (the employee with the highest score for a given p) and p.
The DELETE clause deletes the FUTURE_POSITION relationship between e and p.
The RETURN clause returns each p, along with the name of the employee who was just hired for p and his score, maxScore.
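To make the selection logic easier to follow, here is a minimal Python sketch of what the COLLECT and REDUCE steps compute for each position. It is only an in-memory model, not a replacement for the Cypher query: the sample triples and the name pick_hires are hypothetical, and the tie-breaking shown here depends on list order, whereas the order of COLLECT in Cypher is not guaranteed.

```python
from collections import defaultdict

# Hypothetical sample data: (employee, position, score) triples standing in
# for FUTURE_POSITION relationships.
future_positions = [
    ("Ann", "Analyst", 7),
    ("Bob", "Analyst", 9),
    ("Cid", "Analyst", 9),
    ("Ann", "Manager", 8),
    ("Bob", "Manager", 5),
]


def pick_hires(rels):
    # Counterpart of: WITH p, COLLECT(r) AS rs
    by_position = defaultdict(list)
    for rel in rels:
        by_position[rel[1]].append(rel)

    hires = {}
    for position, rs in by_position.items():
        # Counterpart of: REDUCE(t = rs[0], x IN rs[1..] | CASE ... END)
        # The running best is replaced only by a strictly higher score,
        # so the earliest relationship in the list wins a tie.
        best = rs[0]
        for x in rs[1:]:
            if x[2] > best[2]:
                best = x
        hires[position] = (best[0], best[2])
    return hires


hires = pick_hires(future_positions)
print(hires)
```

Each position is handled independently here, just as in the query, so nothing in this sketch stops the same employee from being picked for two positions.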
[UPDATE]
If you want to delete all FUTURE_POSITION relationships of each p node that gets a YOUAREHIRED relationship, you can use this slightly different query:
MATCH (:Employee)-[r:FUTURE_POSITION]->(p:Position)
WITH p, COLLECT(r) AS rs
WITH p, rs, REDUCE(t = rs[0], x IN rs[1..] |
CASE WHEN x.score > t.score THEN x ELSE t END) AS maxR
WITH p, rs, maxR.score AS maxScore, STARTNODE(maxR) AS e
CREATE (e)-[:YOUAREHIRED]->(p)
FOREACH(x IN rs | DELETE x)
RETURN p, e.name AS name, maxScore;

Related

Separating matching nodes in a query result

I defined the directed relation Know on person nodes. For example, if Sara knows Alice then Sara-> Alice. I wrote this Cypher query to find all the people who know both the right and left side of the directed relation.
match ((n:Person)-[:Know]-> (m:Person)),(p:Person)
where EXISTS ((m)<-[:Know]-(p)-[:Know]->(n))
RETURN m,n,p
I need to get subgraphs with 3 nodes in the query's result but the result I get is a graph with many nodes. Is there any method to change the query to generate subgraphs with just 3 nodes (for example, a subgraph of Alex-> Sara, Alex-> Alice, Sara-> Alice and if Sara has the same condition on two other people it is shown in another subgraph). This requires repeating some nodes in the output.
MATCH clauses are more flexible than that. Try this:
MATCH (n:Person)-[:Know]->(m:Person)<-[:Know]-(p:Person)-[:Know]->(n)
WHERE NOT EXISTS (()-[:Know]->(p))
AND NOT EXISTS {
WITH m, n, p
MATCH (q:Person)-[:Know]->(m)
WHERE q <> n
AND q <> p
}
AND NOT EXISTS {
WITH m, n, p
MATCH (q:Person)-[:Know]->(n)
WHERE q <> p
}
RETURN m, n, p
You might have to use a unique ID property, and I'm not sure whether the WITH clause will work here as I've written it; but with subqueries, you are generally able to import variables from the outer query using WITH.

Count the number of relationships between a pair of nodes and set it as parameter in Neo4J

I have a graph where a pair of nodes can have several relationships between them.
I would like to count these relationships between each pair of nodes, and set the count as a property of each relationship.
I tried something like:
MATCH (s:LabeledExperience)-[r:NextExp]->(e:LabeledExperience)
with s, e, r, length(r) as cnt
MATCH (s2:LabeledExperience{name:s.name})-[r2:NextExp{name:r.name}]->(e2:LabeledExperience{name: e.name})
SET r2.weight = cnt
But this always sets the weight to one.
I also tried:
MATCH ()-[r:NextExp]->()
with r, length(r) as cnt
MATCH ()-[r2:NextExp{name:r.name}]->()
SET r2.weight = cnt
But this takes too much time since there are more than 90k relationships and there is no index (since it is not possible to have them on edges).
They are always set to 1 because of the way you are counting.
When you group by s, e, r that is always going to result in a single row. But if you collect(r) for every s, e then you will get a collection of all of the :NextExp relationships between those two nodes.
Also, length() is for measuring the length of a matched path (its number of relationships) and is not meant to be applied directly to a relationship.
Match the relationship and put them in a collection for each pair of nodes. Iterate over each rel in the collection and set the size of the collection of rels.
MATCH (s:LabeledExperience)-[r:NextExp]->(e:LabeledExperience)
WITH s, e, collect(r) AS rels
UNWIND rels AS rel
SET rel.weight = size(rels)
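The grouping idea can be checked outside the database with a small Python sketch. It assumes the relationships are available as hypothetical (start, end) name pairs; weight_edges is an illustrative name, not part of any Neo4j API.

```python
from collections import Counter

# Hypothetical NextExp relationships as (start, end) name pairs.
edges = [("a", "b"), ("a", "b"), ("a", "b"), ("b", "c"), ("a", "c"), ("b", "c")]


def weight_edges(edge_list):
    # One count per (start, end) pair, like collect(r) grouped by s, e.
    counts = Counter(edge_list)
    # Every relationship in a group receives the size of its group,
    # like SET rel.weight = size(rels) after the UNWIND.
    return [(s, e, counts[(s, e)]) for s, e in edge_list]


weighted = weight_edges(edges)
for row in weighted:
    print(row)
```

As in the directed MATCH pattern, ("a", "b") and ("b", "a") would be counted as different pairs.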

Neo4j - Intersect two node lists using Cypher

Having the following graphs:
node g1 with child nodes (a, b)
node g2 with child nodes (b, c)
using the query
MATCH (n)-[]-(m) WHERE ID(m) = id RETURN n
where id is the id of node g1, I get a and b, and likewise b and c when using the id of g2. What I would like to understand is how to get the intersection of those two results: with the first query returning (a, b) and the second returning (b, c), the final result should be (b).
I tried using the WITH clause but I wasn't able to achieve the desired result. Keep in mind that I'm new to Neo4j and only came here after a few failed attempts, research in the Neo4j documentation, general Google searches and Stack Overflow.
Edit1 (one of my tries):
MATCH (n)-[]->(m)
WHERE ID(m) = 750
WITH n
MATCH (o)-[]->(b)
WHERE ID(b) = 684 and o = n
RETURN o
Edit2:
The node (b), which I represented as being the same in both graphs, is in fact two different nodes in the db, each one belonging to a different graph (g1 and g2). Conceptually they are the same, as they have exactly the same info (labels and attributes), but in the database they are not. I'm sorry, it was my fault for not being more explicit on this matter :(
Edit3:
Why I don't use a single node (b) for both graphs
Using the graphs above as an example, imagine that I have yet another layer: in g1 the child node (b) has a child (e), while in g2 the child node (b) has a child (f). If I had (b) as a single node, then when I create (e) and (f) I could only add them to (b), losing the hierarchy and making it impossible to distinguish which of them, (e) or (f), belongs to g1 or g2.
This should work (assuming you pass id1 and id2 as parameters):
MATCH (a)--(n)--(c)
WHERE ID(a) = {id1} AND ID(c) = {id2}
RETURN n;
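The pattern (a)--(n)--(c) amounts to intersecting the neighbours of the two nodes. A minimal Python sketch of that idea follows, assuming (as this first query does) that b is a single node shared by both graphs; the edge list and the helper name neighbours are hypothetical.

```python
# Hypothetical undirected relationships as (node, node) pairs.
edges = [("g1", "a"), ("g1", "b"), ("g2", "b"), ("g2", "c")]


def neighbours(node, edge_list):
    # Collect every node directly connected to `node`, ignoring direction,
    # like the undirected pattern (node)--(n).
    found = set()
    for left, right in edge_list:
        if left == node:
            found.add(right)
        elif right == node:
            found.add(left)
    return found


# Nodes adjacent to both g1 and g2, like MATCH (a)--(n)--(c).
common = neighbours("g1", edges) & neighbours("g2", edges)
print(common)
```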
[UPDATED, based on new info from comments]
If you have multiple "clones" of the "same" node and you want to quickly determine which clones are related without having to perform a lot of (slow) property comparisons, you can add a relationship (say, typed ":CLONE") between clones. That way, a query like this would work:
MATCH (a)--(m)-[:CLONE]-(n)--(c)
WHERE ID(a) = {id1} AND ID(c) = {id2}
RETURN m, n;
You can find the duplicates of a node by using one of these queries:
[1]
Duplicates with a single node:
MATCH pathx = (n)-[:Relationship]-(find) WHERE find.name = "action" RETURN pathx;
[2]
Or, for two nodes, giving only the immediate parent node:
MATCH pathx = (n)-[:Relationship]-(find), pathy = (p)-[:Relationship]-(seek)
WHERE find.name = "action" AND seek.name = "requestID"
RETURN pathx, pathy;
[3]
Or, to find the entire network, i.e. all the connected nodes:
MATCH pathx = (n)--()-[:Relationship]-(find), pathy = (p)--()-[:Relationship]-(seek)
WHERE find.name = "action" AND seek.name = "requestID"
RETURN pathx, pathy;

cypher query to return or keep only the final sequence when variable length relationship identifiers are used

Is there a way to keep or return only the final, full sequences of nodes, rather than all subpaths, when variable-length relationships are used, so that further operations can be done on each full sequence path?
MATCH path = (S:Person)-[rels:NEXT*]->(E:Person)................
E.g.: find all sequences of nodes whose names are in a given list, say ['graph','server','db'], where the relationships in between all share the same 'seqid' property.
i.e.
(graph)->(server)->(db) with same seqid :1
(graph)->(db)->(server) with same seqid :1 //there can be another matching sequence with same seqid
(graph)->(db)->(server) with same seqid :2
Is there a way to keep only the full sequence of nodes, say '(graph)->(server)->(db)', for each sequence, instead of every subpath of a larger sequence, like (graph)->(server) or (server)->(db)?
Please help me solve this.
(I am using neo4j 2.3.6 community edition via java api in embedded mode..)
What we could really use here is a longestSequences() function that would do exactly what you want: expand the pattern so that a and b are always matched to the start and end points of a sequence, and the matched pattern is not a subset of any other matched pattern.
I created a feature request on neo4j for exactly this: https://github.com/neo4j/neo4j/issues/7760
And until that gets implemented, we'll have to make do with some alternate approach. I think what we'll have to do is add additional matching to restrict a and b to start and end nodes of full sequences.
Here's my proposed query:
WITH ['graph', 'server' ,'db'] as names
MATCH p=(a)-[rels:NEXT*]->(b)
WHERE ALL(n in nodes(p) WHERE n.name in names)
AND ALL( r in rels WHERE rels[0]['seqid'] = r.seqid )
WITH names, p, a, rels, b
// check if b is a subsequence node instead of an end node
OPTIONAL MATCH (b)-[rel:NEXT]->(c)
WHERE c.name in names
AND rel.seqid = rels[0]['seqid']
// remove any existing matches where b is a subsequence node
WITH names, p, a, rels, b, c
WHERE c IS NULL
WITH names, p, a, rels, b
// check if a is a subsequence node instead of a start node
OPTIONAL MATCH (d)-[rel:NEXT]->(a)
WHERE d.name in names
AND rel.seqid = rels[0]['seqid']
// remove any existing matches where a is a subsequence node
WITH p, a, b, d
WHERE d IS NULL
RETURN p, a as startNode, b as endNode
MATCH (S:Person)-[r:NEXT]->(:Person)
// Possible starting node
WHERE NOT ( (:Person)-[:NEXT {seqid: r.seqid}]->(S) )
WITH S,
// Collect all possible values of `seqid`
collect (distinct r.seqid) as seqids
UNWIND seqids as seqid
// Possible terminal node
MATCH (:Person)-[r:NEXT {seqid: seqid}]->(E:Person)
WHERE NOT ( (E)-[:NEXT {seqid: seqid}]->(:Person) )
WITH S,
seqid,
collect(distinct E) as ES
UNWIND ES as E
MATCH path = (S)-[rels:NEXT* {seqid: seqid}]->(E)
RETURN S,
seqid,
path
[EDITED]
This query might do what you want:
MATCH (p1:Person)-[rel:NEXT]->(:Person)
WHERE NOT (:Person)-[:NEXT {seqid: rel.seqid}]->(p1)
WITH DISTINCT p1, rel.seqid AS seqid
MATCH path = (p1)-[:NEXT* {seqid: seqid}]->(p2:Person)
WHERE NOT (p2)-[:NEXT {seqid: seqid}]->(:Person)
RETURN path;
It first identifies all Person nodes (p1) with at least one outgoing NEXT relationship that have no incoming NEXT relationships (with the same seqid), and their distinct outgoing seqid values. Then it finds all "complete" paths (i.e., paths whose start and end nodes have no incoming or outgoing NEXT relationships with the desired seqid, respectively) starting at each p1 node and having relationships all sharing the same seqid. Finally, it returns each complete path.
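The notion of a "complete" path can be modelled in a few lines of Python. This is only a sketch under assumptions: the relationships are hypothetical (start, end, seqid) triples, complete_paths is an illustrative name, and, unlike Cypher (which forbids repeating a relationship within a path), the sketch refuses to revisit a node so that cycles cannot loop forever.

```python
# Hypothetical NEXT relationships as (start, end, seqid) triples.
rels = [
    ("graph", "server", 1),
    ("server", "db", 1),
    ("graph", "db", 2),
    ("db", "server", 2),
]


def complete_paths(rel_list):
    paths = []
    for seqid in sorted({r[2] for r in rel_list}):
        same = [(s, e) for s, e, q in rel_list if q == seqid]
        sources = {s for s, _ in same}
        targets = {e for _, e in same}
        # Start nodes have an outgoing but no incoming relationship
        # with this seqid (the first MATCH ... WHERE NOT).
        for start in sorted(sources - targets):
            stack = [[start]]
            while stack:
                path = stack.pop()
                # End nodes have no outgoing relationship with this seqid
                # (the second WHERE NOT).
                if path[-1] not in sources:
                    paths.append((seqid, path))
                for s, e in same:
                    # Simplification: never revisit a node already on the path.
                    if s == path[-1] and e not in path:
                        stack.append(path + [e])
    return paths


for seqid, path in complete_paths(rels):
    print(seqid, " -> ".join(path))
```

Subpaths such as graph -> server are never reported, because server still has an outgoing relationship with the same seqid.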
If you just want to get the name property of all the Person nodes in each path, try this query (with a different RETURN clause):
MATCH (p1:Person)-[rel:NEXT]->(:Person)
WHERE NOT (:Person)-[:NEXT {seqid: rel.seqid}]->(p1)
WITH DISTINCT p1, rel.seqid AS seqid
MATCH path = (p1)-[:NEXT* {seqid: seqid}]->(p2:Person)
WHERE NOT (p2)-[:NEXT {seqid: seqid}]->(:Person)
RETURN EXTRACT(n IN NODES(path) | n.name);

Create relationship between nodes having same property value in common, using one Cypher query

Beginning with Neo4j 1.9.2, and using Cypher query language, I would like to create relationships between nodes having a specific property value in common.
I have set of nodes G having a property H, without any relationship currently existing between G nodes.
In a Cypher statement, is it possible to group G nodes by their H property value and create an HR relationship between the nodes belonging to the same group? Note that each group has between 2 and 10 nodes, and I have more than 15k such groups (15k different H values) for about 50k G nodes.
I've tried hard to write such a query without finding the correct syntax. Below is a small sample dataset:
create
(G1 {name:'G1', H:'1'}),
(G2 {name:'G2', H:'1'}),
(G3 {name:'G3', H:'1'}),
(G4 {name:'G4', H:'2'}),
(G5 {name:'G5', H:'2'}),
(G6 {name:'G6', H:'2'}),
(G7 {name:'G7', H:'2'})
return * ;
At the end, I'd like such relationships:
G1-[:HR]-G2-[:HR]-G3-[:HR]-G1
And:
G4-[:HR]-G5-[:HR]-G6-[:HR]-G7-[:HR]-G4
In another case, I may want to perform a mass update of the relationships between nodes by using or comparing some of their properties. Imagine nodes of type N and nodes of type M, with N nodes related to M by a relationship named :IS_LOCATED_ON. The order of the location can be stored as a property of N nodes (N.relativePosition being a Long from 1 to MAX_POSITION), but we may later need to update the graph model so that N nodes are linked to each other by a new :PRECEDES relationship, making it easier and faster to find the next node N in a given set.
I'd expect such a language to allow updating massive sets of nodes/relationships by manipulating their properties.
Is it not possible?
If not, is it planned, or might it be planned?
Any help would be greatly appreciated.
Since there's nothing in the data you supplied to get a rank from, I've played with collections to get one as follows:
START
n=node(*), n2=node(*)
WHERE
HAS(n.H) AND HAS(n2.H) AND n.H = n2.H
WITH n, n2 ORDER BY n2.name
WITH n, COLLECT(n2) as others
WITH n, others, LENGTH(FILTER(x IN others : x.name < n.name)) as rank
RETURN n.name, n.H, rank ORDER BY n.H, n.name;
Building on that, you can then start determining relationships:
START
n=node(*), n2=node(*)
WHERE
HAS(n.H) AND HAS(n2.H) AND n.H = n2.H
WITH n, n2 ORDER BY n2.name
WITH n, COLLECT(n2) as others
WITH n, others, LENGTH(FILTER(x IN others : x.name < n.name)) as rank
WITH n, others, rank, COALESCE(
HEAD(FILTER(x IN others : x.name > n.name)),
HEAD(others)
) as next
RETURN n.name, n.H, rank, next ORDER BY n.H, n.name;
Finally (and slightly more condensed):
START
n=node(*), n2=node(*)
WHERE
HAS(n.H) AND HAS(n2.H) AND n.H = n2.H
WITH n, n2 ORDER BY n2.name
WITH n, COLLECT(n2) as others
WITH n, others, COALESCE(
HEAD(FILTER(x IN others : x.name > n.name)),
HEAD(others)
) as next
CREATE n-[:HR]->next
RETURN n, next;
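The COALESCE/HEAD/FILTER combination boils down to "link each node to the next name in its group, wrapping around at the end". A short Python sketch of that rule, using the sample data from the question, may make it easier to verify; ring_links is a hypothetical helper name, and the sketch only computes the pairs rather than writing anything to a database.

```python
from collections import defaultdict

# Sample data from the question as (name, H) pairs.
nodes = [
    ("G1", "1"), ("G2", "1"), ("G3", "1"),
    ("G4", "2"), ("G5", "2"), ("G6", "2"), ("G7", "2"),
]


def ring_links(node_list):
    # Group node names by their H value.
    groups = defaultdict(list)
    for name, h in node_list:
        groups[h].append(name)

    links = []
    for h in sorted(groups):
        names = sorted(groups[h])
        for i, name in enumerate(names):
            # The next name in sorted order; the modulo wraps the last
            # node back to the first, like COALESCE(..., HEAD(others)).
            links.append((name, names[(i + 1) % len(names)]))
    return links


for start, end in ring_links(nodes):
    print(f"{start}-[:HR]->{end}")
```

With groups of two nodes this yields one relationship in each direction, which is also what the query above would create.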
You can just do it like this, perhaps indicating a direction in your relationships:
CREATE
(G1 { name:'G1', H:'1' }),
(G2 { name:'G2', H:'1' }),
(G3 { name:'G3', H:'1' }),
(G4 { name:'G4', H:'2' }),
(G5 { name:'G5', H:'2' }),
(G6 { name:'G6', H:'2' }),
(G7 { name:'G7', H:'2' }),
G1-[:HR]->G2-[:HR]->G3-[:HR]->G1,
G4-[:HR]->G5-[:HR]->G6-[:HR]->G7-[:HR]->G4
See http://console.neo4j.org/?id=ujns0x for an example.
