Travel graph within one level only - neo4j

Within a graph there is a group G1 - this group G1 has 3 subgroups S1, S2 and S3. The relation is classified as IS_SUBGROUP_OF.
G1 itself is in turn a subgroup of another group, let's call it D1. D1 has many subgroups, of which G1 is only one.
A user U1 is a member of one subgroup of G1, here S1. I want a query that starts at U1, traverses to S1 and from there up to G1, collects the users of G1, then goes back down from G1 to S1, S2 and S3 and collects their users as well. The final result should be all users in the subgroups S1, S2 and S3 of the parent group G1, including the users of G1 itself.
I have tried:
MATCH (d:User) --> (S1:Subgroup)-[:IS_SUBGROUP_OF*0..]->(G1:Group)
WHERE d.name = "U1"
RETURN d
Unfortunately this traverses all groups and gives back all users of every group in the graph. I tried changing the hop count in the relationship (e.g. to 1 only) but didn't succeed. Do you have a hint on how to write the query so that it returns only this subset of users?
The group names are just for the example and are not known in the real world. All I know is the username (here: U1), and from there I need to find the relevant groups depending on where the user sits. So the query cannot refer to groups by name, only through variables.
* EDITED *
Sorry for the confusion, I wrongly labeled S1 as Subgroup. Only the relationship is called 'IS_SUBGROUP_OF'; all group nodes have the label 'Group', and D1 would also have the label 'Group'. I have also added the relationship type for users, so the statement now looks like this:
MATCH (d:User) -[:IS_MEMBER_OF]-> (S1:Group)-[:IS_SUBGROUP_OF*0..]->(G1:Group)
WHERE d.name = "U1"
RETURN d

Let's try this, a minor tweak on Dave's answer (which should work fine, as far as I can tell...)
MATCH (:User {name: 'U1'})-[:IS_MEMBER_OF]->(:Group)-[:IS_SUBGROUP_OF]->(superGroup:Group)
WITH superGroup
MATCH (superGroup)<-[:IS_SUBGROUP_OF*0..1]-(:Group)<-[:IS_MEMBER_OF]-(users:User)
RETURN COLLECT(DISTINCT users)
Based upon the starting user, this finds the grandparent group or supergroup (G1 according to your example), then matches on users that are members of G1 or any of its immediate subgroups and returns the distinct collection. It will include the original matched user.
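To make the up-then-down traversal concrete, here is a minimal sketch in plain Python rather than Cypher. The group and user data (including U2 to U5 and the sibling group X1) are made up for illustration, and each user or group is assumed to have a single parent.

```python
# Toy in-memory model of the example graph; all data here is hypothetical.
IS_SUBGROUP_OF = {"S1": "G1", "S2": "G1", "S3": "G1", "G1": "D1", "X1": "D1"}
IS_MEMBER_OF = {"U1": "S1", "U2": "S2", "U3": "S3", "U4": "G1", "U5": "X1"}


def peers_one_level(user):
    """Users of the parent of the user's group and of its direct subgroups."""
    own_group = IS_MEMBER_OF[user]
    # Exactly one hop up: (:Group)-[:IS_SUBGROUP_OF]->(superGroup)
    super_group = IS_SUBGROUP_OF[own_group]
    # Zero or one hop down: superGroup itself plus its immediate subgroups
    groups = {super_group} | {
        child for child, parent in IS_SUBGROUP_OF.items() if parent == super_group
    }
    return sorted(u for u, g in IS_MEMBER_OF.items() if g in groups)


print(peers_one_level("U1"))  # ['U1', 'U2', 'U3', 'U4']
```

U5 is left out because its group X1 hangs off D1, not G1: the walk goes up exactly one level and never reaches D1.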

This answer assumes the user is identified as a member of a group by the relationship IS_MEMBER_OF.
The query first determines the parent group G1 based on the supplied user U1. It then determines all of the users of the child groups of G1 (S1, S2, S3) and returns the collection of distinct users across the child groups.
This is a somewhat generalized approach that could be used to traverse more levels by modifying the number of levels to traverse in each situation.
// follow IS_MEMBER_OF or IS_SUBGROUP_OF relationships up
// the group/user hierarchy to find the parent group two
// levels up
match (u:User {name: 'U1'})-[:IS_MEMBER_OF|IS_SUBGROUP_OF*2]->(g:Group)
// using the parent group
with g
// follow the IS_MEMBER_OF or IS_SUBGROUP_OF relationships back down
// the hierarchy to find all of the peer users or the original user
match (g)<-[:IS_MEMBER_OF|IS_SUBGROUP_OF*2]-(u:User)
return collect(distinct u)

Would this work?
MATCH (d:User)-[*0..1]-(G1:Group)
WHERE d.name= 'U1'
RETURN DISTINCT d

Related

Nodes with relationship to multiple nodes

I want to get the Persons that know everyone in a group of persons which know some specific places.
This:
MATCH (:Place {name:'Breiter Weg'})<-[:knows]-(b:Person)-[:knows]->(:Place {name:'Buchhandel'})
WITH collect(DISTINCT b) as persons
Match (a:Person)
WHERE ALL(b in persons WHERE (a)-[:knows]->(b))
RETURN a
works, but the second part does a full NodeByLabelScan before applying the WHERE clause, which is extremely slow: in a bigger db it takes 8-9 seconds. I also tried this:
MATCH (:Place {name:'Breiter Weg'})<-[:knows]-(b:Person)-[:knows]->(:Place {name:'Buchhandel'})
Match (a:Person)-[:knows]->(b)
RETURN a
This only needs 2 ms, but it returns all persons who know any person in group b, instead of those who know everyone in it.
So my question is: is there an efficient/fast query to get what I want?
We have a knowledge base article for this kind of query that shows a few approaches.
One of these is to match to the :Person nodes that know members of the group, and then count the number of times each of those persons shows up in the results. Provided there aren't multiple :knows relationships between the same two people, if the count is equal to the size of the collection of people from your first match, then that person must know all of the people in the collection.
MATCH (:Place {name:'Breiter Weg'})<-[:knows]-(b:Person)-[:knows]->(:Place {name:'Buchhandel'})
WITH collect(b) as persons
UNWIND persons as b // so we have the entire list of persons along with each person
WITH size(persons) as total, b
MATCH (a:Person)-[:knows]->(b)
WITH total, a, count(a) as knownCount
WHERE total = knownCount
RETURN a
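The counting idea can be sketched outside Cypher as well. The following is a minimal Python illustration with made-up names; `KNOWN_BY` is a hypothetical reverse index (person to the set of people who know them), standing in for following :knows relationships backwards from each group member.

```python
from collections import Counter

# Hypothetical reverse index: person -> people who know that person.
KNOWN_BY = {
    "bob": {"anna", "dave", "eve"},
    "carl": {"anna", "eve"},
}


def knows_everyone(group, known_by):
    """Return the people who know every member of `group`."""
    total = len(group)
    hits = Counter()
    # Expand only from the group members instead of scanning all persons.
    for member in group:
        for admirer in known_by.get(member, ()):
            hits[admirer] += 1
    # Someone counted once per group member knows the whole group,
    # assuming at most one knows-edge between any two people.
    return sorted(a for a, n in hits.items() if n == total)


print(knows_everyone({"bob", "carl"}, KNOWN_BY))  # ['anna', 'eve']
```

As in the query, the work is proportional to the relationships touching the group, not to the total number of persons.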
Here is a simpler Cypher query that also compares counts -- the same basic idea used by @InverseFalcon.
MATCH (:Place {name:'Breiter Weg'})<-[:knows]-(b:Person)-[:knows]->(:Place {name:'Buchhandel'}), (a:Person)-[:knows]->(b)
WITH COLLECT({a:a, b:b}) as data, COUNT(DISTINCT b) AS total
UNWIND data AS d
WITH total, d.a AS a, COUNT(d.b) AS bCount
WHERE total = bCount
RETURN a

How to perform content based filtering neo4j with all item nodes connected to each other

I have three kinds of nodes in my database:
1) User
2) Media
3) Tag
I also have a relationship with all Media nodes like such:
(:Media)-[:IS_SIMILAR]-(:Media)
And another relationship (:Media)-[:HAS_TAG]-(:Tag)
And another relationship (:User)-[:LIKES]-(:Media)
Here's a visualization:
The green nodes are media and the blue node is a user (I excluded the tag nodes).
This IS_SIMILAR relationship has an attribute similarity. This attribute similarity is computed by calculating the number of tags each node pair has in common.
I am trying to perform content-based filtering by finding the media a user likes and getting top 10 media based on the similarity attribute.
I construct the following query:
Match(u:User{id:"Dorian"})-[:LIKES]-(m:Media)
WITH collect(m) as mu
UNWIND mu as m
Match(m)-[s:IS_SIMILAR]-(o:Media)
WHERE NOT o in mu
RETURN DISTINCT o,s ORDER BY s.similarity DESC
With the following results:
Unfortunately, there are repeated Media nodes because each Media node that is liked by a user also has an IS_SIMILAR relationship with other media nodes.
Can you suggest:
1) how I can avoid this problem
2) another method to perform content-based recommendation with my schema?
You were almost there. This should work:
MATCH (u:User{id:"Dorian"})-[:LIKES]-(m:Media)
WITH collect(m) as mu
UNWIND mu as m
MATCH (m)-[s:IS_SIMILAR]-(o:Media)
WHERE NOT o IN mu
WITH o ORDER BY s.similarity DESC
RETURN DISTINCT o;
Unfortunately, Cypher does not like RETURN DISTINCT o ORDER BY s.similarity DESC, but accepts the logically equivalent WITH o ORDER BY s.similarity DESC RETURN DISTINCT o.
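A related way to remove the duplicates is to keep, for each candidate, only its highest similarity and rank on that. This is a variant of the query above, not the same thing, sketched here in Python; the rows and the `top_candidates` helper are hypothetical.

```python
# Hypothetical (candidate media, similarity) rows, one per IS_SIMILAR edge
# between a liked media node and a not-yet-liked one.
ROWS = [("m3", 2), ("m4", 5), ("m3", 4), ("m5", 1), ("m4", 3)]


def top_candidates(rows, limit=10):
    """Deduplicate candidates, keeping each one's best similarity."""
    best = {}
    for media, similarity in rows:
        if similarity > best.get(media, float("-inf")):
            best[media] = similarity
    # Highest similarity first, then cut to the requested number.
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


print(top_candidates(ROWS))  # [('m4', 5), ('m3', 4), ('m5', 1)]
```

Each media node appears once even though m3 and m4 are similar to more than one liked item.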

CASE WHEN to RETURN maximum edge property in Cypher

I am trying to find out how to use a condition in a query. Basically, in the relationship between teachers and schools below, I need to get only teachers who moved from one school to another with a promotion. To do this, I use the property HAD (from 1 to 4, where 1 is the lowest role and 4 is the highest) of [:HAD] in the WHERE clause, but the query also returns teachers who were promoted within the same school, i.e. who have more than one [:HAD] there, which I don't need. So I am thinking of introducing an if statement that says: if there are multiple [:HAD] between a teacher and a school, take just the maximum [:HAD] property.
This is the original code, which returns teachers who were promoted within the same school as well as those promoted to another school:
//CREATE LINK BETWEEN SCHOOLS FOR CENTRALITY MEASURE
//BASED ON TEACHERS MOVING TO ANOTHER SCHOOL BASED ON PROMOTION
MATCH (s1:School)<-[:WITH]-(:Contract)<-[x:HAD]-(:Teacher)-[y:HAD]->(:Contract)-[:WITH]->(s2:School)
WHERE y.HAD>x.HAD and s1 <> s2
MERGE (s1)-[:TRANSFER_ON_PROMOTION]->(s2)
RETURN s1, s2
This is my alteration to the original code:
//CREATE LINK BETWEEN SCHOOLS FOR CENTRALITY MEASURE
//BASED ON TEACHERS MOVING TO ANOTHER SCHOOL BASED ON PROMOTION
MATCH (s1:School)<-[:WITH]-()<-[x:HAD]-(:Teacher)-[y:HAD]->()-[:WITH]->(s2:School)
WHERE y.HAD>x.HAD and s1 <> s2
RETURN toInteger(x.HAD) AS x,
CASE
WHEN (x ORDER BY x DESC LIMIT 1) > 1 THEN x
ELSE 1
END as highest
MATCH (s1:School)<-[:WITH]-(:Contract)<-[x:HAD]-(:Teacher)-[y:HAD]->(:Contract)-[:WITH]->(s2:School)
WHERE y.HAD>highest and s1 <> s2
MERGE (s1)-[:TRANSFER_ON_PROMOTION]->(s2)
RETURN s1, s2
You can use WITH to chain queries and max to select the maximum HAD value:
MATCH (s1:School)<-[:WITH]-(:Contract)<-[x:HAD]-(:Teacher)-[y:HAD]->(:Contract)-[:WITH]->(s2:School)
WHERE s1 <> s2
WITH s1, max(x.HAD) AS xMaxHad, y, s2
WHERE y.HAD > xMaxHad
MERGE (s1)-[:TRANSFER_ON_PROMOTION]->(s2)
RETURN s1, s2
Note that modeling-wise, adding a HAD property to the HAD relationship seems like bad practice. I'd recommend using a different name for the property instead (e.g. level or rank). Even HAD is not a very good name for the relationship; maybe HELD would be nicer.
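To see what grouping by s1, y and s2 and taking max(x.HAD) does, here is a small Python sketch over made-up rows. It assumes one teacher held levels 1 and 3 at school A and level 2 at school B, and a second teacher held level 2 at A and level 3 at C; the row layout and the `promoted_transfers` name are hypothetical.

```python
# Hypothetical rows as the MATCH would produce them:
# (s1, x.HAD, identifier of y, y.HAD, s2)
ROWS = [
    ("A", 1, "y_b", 2, "B"),
    ("A", 3, "y_b", 2, "B"),
    ("B", 2, "y_a1", 1, "A"),
    ("B", 2, "y_a3", 3, "A"),
    ("A", 2, "y_c", 3, "C"),
    ("C", 3, "y_a2", 2, "A"),
]


def promoted_transfers(rows):
    """Keep (s1, s2) pairs where y.HAD beats the highest x.HAD at s1."""
    max_x = {}  # (s1, y id, s2) -> highest x.HAD seen in that group
    y_had = {}  # (s1, y id, s2) -> y.HAD of that relationship
    for s1, x, y_id, y, s2 in rows:
        key = (s1, y_id, s2)
        max_x[key] = max(x, max_x.get(key, x))
        y_had[key] = y
    return sorted({(k[0], k[2]) for k in max_x if y_had[k] > max_x[k]})


print(promoted_transfers(ROWS))  # [('A', 'C'), ('B', 'A')]
```

A to B is dropped because the first teacher's highest level at A (3) is above the level held at B (2). B to A still appears, since nothing in the pattern encodes which contract came first.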

Neo4j: Cypher Query With Variable Length and Condition on Node Labels

What I'm looking for
With variable length relationships (see here in the neo4j manual), it is possible to have a variable number of relationships with a certain label between two nodes.
# Cypher
match (g1:Group)-[:sub_group*]->(g2:Group) return g1, g2
I'm looking for the same thing with nodes, i.e. a way to query for two nodes with a variable number of nodes in between, but with a label condition on the nodes rather than the relationships:
# Looking for something like this in Cypher:
match (g1:Group)-->(:Group*)-->(g2:Group) return g1, g2
Example
I would use this mechanism, for example, to find all (direct or indirect) members of a group within a group structure.
# Looking for something like this in Cypher:
match (group:Group)-->(:Group*)-->(member:User) return member
Take, for example, this structure:
group1:Group
|-------> group2:Group -------> user1:User
|-------> group3:Group
|--------> page1:Page -----> group4:Group -----> user2:User
In this example, user1 is a member of group1 and group2, but user2 is only a member of group4 and not of the other groups, because a node without the Group label sits in between.
Abstraction
A more abstract pattern would be a kind of repeat operator |...|* in Cypher:
# Looking for repeat operator in Cypher:
match (g1:Group)|-[:is_subgroup_of]->(:Group)|*-[:is_member_of]->(member:User)
return member
Does anyone know of such a repeat operator? Thanks!
Possible Solution
One solution I've found is to put a condition on the nodes using where, but I hope there is a better (and shorter) solution out there!
# Cypher
match path = (member:User)<-[*]-(g:Group{id:1})
where all(node in tail(nodes(path)) where ('Group' in labels(node)))
return member
Explanation
In the above query, all(node in tail(nodes(path)) where ('Group' in labels(node))) is one single where condition, which consists of the following key parts:
all: ALL(x in coll where pred): TRUE if pred is TRUE for all values in coll
nodes(path): NODES(path): Returns the nodes in path
tail(): TAIL(coll): coll except its first element. I'm using this because the first node is a User, not a Group.
Reference
See Cypher Cheat Sheet.
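The effect of the label condition can be illustrated with a small Python walk over the example structure. This is only a sketch: the diagram's indentation is lost, so attaching page1 directly to group1 is an assumption, and the `members` helper is hypothetical.

```python
# Node labels and outgoing edges of the example structure.
# Assumption: page1 hangs off group1 (the original layout is ambiguous).
LABELS = {
    "group1": "Group", "group2": "Group", "group3": "Group",
    "group4": "Group", "page1": "Page", "user1": "User", "user2": "User",
}
EDGES = {
    "group1": ["group2", "group3", "page1"],
    "group2": ["user1"],
    "page1": ["group4"],
    "group4": ["user2"],
}


def members(start):
    """Users reachable from `start` through Group-labelled nodes only."""
    found, stack, seen = set(), [start], {start}
    while stack:
        node = stack.pop()
        for nxt in EDGES.get(node, []):
            if nxt in seen:
                continue
            seen.add(nxt)
            if LABELS[nxt] == "User":
                found.add(nxt)
            elif LABELS[nxt] == "Group":
                stack.append(nxt)
            # Any other label (e.g. Page) ends the walk on this branch.
    return sorted(found)


print(members("group1"))  # ['user1']
print(members("group4"))  # ['user2']
```

user2 is not reported for group1 because the only route to it passes through page1.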
How about this:
MATCH (:Group {id:1})<-[:IS_SUBGROUP_OF|IS_MEMBER_OF*]-(u:User)
RETURN DISTINCT u
This will:
find all subtrees of the group with ID 1
only traverse the relationships IS_SUBGROUP_OF and IS_MEMBER_OF in the incoming direction (meaning sub-groups or users that belong to the group with ID 1 or one of its sub-groups)
only return nodes which have an IS_MEMBER_OF relationship to a group in the subtree
and discard duplicate results (users who belong to more than one of the groups in the tree would otherwise appear multiple times)
I know this relies on relationships types rather than node labels, but IMHO this is a more graphy approach.
Let me know if this would work or not.

CYPHER: Querying the next hop node

Good evening,
as a beginner I struggle with transferring my relational db knowledge to a graph DB and its queries. Let's assume I have a graph with the following nodes:
a PERSON node
two group graphs like MAIN GROUP A and MAIN GROUP B
MAIN GROUP A has another node like SUB GROUP 1 which has another node DETAIL GROUP Z
MAIN GROUP B has another node like SUB GROUP 2
The user node is related to SUB GROUP 2 and DETAIL GROUP Z.
With the query
MATCH (user:PERSON {name: "user"})-[relation:IS_MEMBER_OF*0..]->(team:GROUP)
RETURN team
I find directly the groups the user belongs to.
What I would like is to also get the groups the user is indirectly connected to, as PERSON is by definition also a member of SUB GROUP 1, MAIN GROUP A and MAIN GROUP B.
Anybody able to push me into the right direction? Thanks a lot.
Balael
Assuming you have a HAS_SUBGROUP relationship linking a parent group to each child group, this query should return each team the user is a direct member of, and for each team, the distinct collection of ancestor teams.
MATCH (:PERSON {name: "user"})-[:IS_MEMBER_OF*]->(team:GROUP)
OPTIONAL MATCH (team)<-[:HAS_SUBGROUP*]-(ancestor_team)
RETURN team, COLLECT(DISTINCT ancestor_team);
