Match nodes of variable depth - neo4j

I have users who like different geographies (could be a country, state or city) and I want to match those users who like geographies in the same country.
For example:
user A likes USA
user B likes USA
user C likes San Jose
user D likes France
then I want user A to be matched to users B and C.
What cypher query will get me the results? This is what I tried:
/** node id of user A is 0 **/
START u=node(0) MATCH (u:users) - [:likes] - (g1) - [:contains*0..5] - (g2) - [:likes] - (o:users) RETURN o;
This query is not working as expected. What would be the right syntax?

If I understood you correctly, something like this might work in your case. Be careful, though: circular paths can cause problems.
The main idea is to specify not only the relationship types but also their directions.
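To illustrate why direction matters, here is a minimal Python sketch of the matching logic, not the Cypher itself. It assumes contains edges point from the larger geography to the smaller one and that each user likes exactly one geography; the intermediate "California" node and the PARENT/LIKES tables are hypothetical data modelled on the question.

```python
# Hypothetical data: PARENT[child] = parent, i.e. (parent)-[:contains]->(child)
PARENT = {"San Jose": "California", "California": "USA", "Paris": "France"}
LIKES = {"A": "USA", "B": "USA", "C": "San Jose", "D": "France"}


def country_of(geo):
    """Follow contains edges upward (child to parent) until the root."""
    while geo in PARENT:
        geo = PARENT[geo]
    return geo


def matches_for(user):
    """Other users whose liked geography lies in the same country."""
    target = country_of(LIKES[user])
    return sorted(
        u for u, g in LIKES.items()
        if u != user and country_of(g) == target
    )
```

Walking only upward from each liked geography keeps the traversal from wandering sideways into unrelated branches, which is what an undirected variable-length pattern can do.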

Related

neo4j - Return single instance of node - querying by property?

I am building a social network that has a specialized audience.
Users are related to each other by three primary relationship types.
[:FRIENDS]->(:USER),
[:WORKS_AT]->(:COMPANY),
[:WORKED_AT]->(:COMPANY),
[:FOLLOWS].
When working through a search scenario (a user wants to find another user), I've given each relationship a "priority" (so to speak).
For example, if a user wants to find another user named "Bart Simpson" - first, we will check co-worker relationships ([:WORKS_AT],[:WORKED_AT]). I've assigned those relationships a priority of 1. That way, "Bart Simpson" who works with me will appear in the search results before "Bart Simpson" - who lives hundreds of miles away in Springfield.
The second priority is [:FRIENDS]->(:USER). Do any of my friends have a friend named "Bart Simpson?" Priority #2.
The last priority is a global search. I don't have any co-workers named "Bart Simpson", my friends don't have any friends named "Bart Simpson" - but I met Bart at a conference, and I want to "friend" him. So, I've added a "Global" search. Find any users named "Bart Simpson".
So far, this is my Cypher:
optional match (u:USER {id:'1'})-[:WORKS_AT|:WORKED_AT]-(w:COMPANY)-[r]-(f:USER)
with collect(f{.*, priority:1,relationship:r.title,type:type(r)}) as user
optional match (u:USER {id: '1'})-[:FRIENDS]-(:USER)-[r:FRIENDS]-(f:USER)
with user + collect(f{.*, priority:2,relationship:r.title,type:type(r)}) as user
optional match (f:USER)
where f.id <> '1'
with user + collect(f{.*, priority:3,relationship:'',type:''}) as user
unwind user as users
with users as user
where toLower(user.last_name) STARTS WITH toLower('Sc') OR toLower(user.first_name) STARTS WITH toLower('Sc')
return distinct user
This is fantastic - however, a user could work at the same company, as well as be friends, as well as appear in the global search. So - we have the potential for three (or more) "copies" of the same user - with different relationship attributes. The relationship attributes are important because in the app, they provide important context to the search. "Bart Simpson - Works at XYZ Company."
So what I'm really looking for is the ability to either return the user record with the highest priority - and do that based on the "ID" field. If that doesn't work, I could see a situation where we try to update the property of a node. So, when the query hits the priority 2 search, if there is already a user in the collection with the same "ID", it just appends the P2 relationship type to the record. Either is fine with me.
I'm open to suggestions and listening!
So, I've made some progress!
MATCH
(subject:USER {id:'1'})
MATCH
(subject)-[:WORKS_AT|:WORKED_AT]-(w:COMPANY)-[r]-(f1:USER)
WHERE
toLower(f1.last_name) STARTS WITH toLower('Sc') or
toLower(f1.first_name) STARTS WITH toLower('Sc')
WITH
COLLECT(f1.id) AS userIds,
COLLECT(f1{.*,priority:1,rType:type(r), title:r.title, detail:w.name}) AS users
OPTIONAL MATCH
(subject)-[:FRIEND]-(fw:USER)-[r:FRIEND]-(f2:USER)
WHERE
NOT(f2.id in userIds) AND
(
toLower(f2.last_name) STARTS WITH toLower('Sc') or
toLower(f2.first_name) STARTS WITH toLower('Sc')
)
WITH
users + COLLECT(f2{.*,priority:2,rType:"FRIEND", title:"Friends with " + fw.first_name + " " + fw.last_name, detail:''}) AS users,
userIds + collect(f2.id) AS userIds
OPTIONAL MATCH
(f3:USER)
WHERE
NOT(f3.id in userIds) AND
(
toLower(f3.last_name) starts with toLower('Sc') OR
toLower(f3.first_name) starts with toLower('Sc')
)
WITH
users + COLLECT(f3{.*,priority:3,rType:"GLOBAL", title:"", detail:''}) AS users
RETURN
users
The query has evolved a bit. Essentially, at the first stage, we collect the userIds of the items that were returned. At each subsequent stage, the results returned are compared against the running list of ids. If the id of the result is already in the list of ids, it is filtered out - thus ensuring a unique id in the set.
This is working - and for now, I'm going to run with it. Is this the most efficient query, or is there a better way to deal with this scenario?
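One possible alternative, assuming it is acceptable to post-process the rows in application code rather than in Cypher: collect the results of all three stages without filtering, then keep only the highest-priority record per id. The sketch below is illustrative only; the record fields mirror the map projections in the query, and dedupe_by_priority is a hypothetical helper name.

```python
def dedupe_by_priority(records):
    """Keep one record per id, preferring the lowest priority number."""
    best = {}
    for rec in records:
        current = best.get(rec["id"])
        if current is None or rec["priority"] < current["priority"]:
            best[rec["id"]] = rec
    # Sort so that co-workers come first, then friends, then global hits.
    return sorted(best.values(), key=lambda r: (r["priority"], r["id"]))


# Made-up sample rows: user 7 shows up as a co-worker and in the global search.
results = [
    {"id": "7", "first_name": "Bart", "priority": 3, "rType": "GLOBAL"},
    {"id": "7", "first_name": "Bart", "priority": 1, "rType": "WORKS_AT"},
    {"id": "9", "first_name": "Lisa", "priority": 2, "rType": "FRIEND"},
]
unique = dedupe_by_priority(results)
```

Whether this beats the running id list inside the query depends on how many rows come back; it mainly trades query complexity for a little client-side work.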

How to count cypher labels with specific condition?

I have a graph database with information about different companies and their subsidiaries. Now my task is to display the structure of the company. This I have achieved with d3 and vertical tree.
But additionally I have to write summary statistics about the company that is currently displayed. Companies can be chosen from a dropdown list which is fetching this data dynamically via AJAX call.
I have to write a short summary in the same HTML, like:
Total amount of subsidiaries for CompanyA: 300
Companies in Corporate Havens: 45%
Companies in Tax Havens: 5%
My database consists of two kinds of nodes, Company and Country, and a country can carry an additional label such as CH or TH.
CREATE (:TH:Country{name:'Nauru', capital:'Yaren', lng:166.920867,lat:-0.5477})
WITH 1 as dummy MATCH (a:Company), (b:Country) WHERE a.name='CompanyA' AND b.name='Netherlands' CREATE (a)-[:IS_REGISTERED]->(b)
So how can I find the number of subsidiaries of CompanyA that are registered in corporate and tax havens? And how do I pass this info on to the HTML?
I found different Cypher queries that list all the labels, as well as the APOC stats procedures (for example apoc.meta.stats), but these do not let me filter on the mother company. I'd appreciate any help.
The cypher is good because you write a query almost in natural language (the query below may be incorrect - did not check, but the idea is clear):
MATCH (motherCompany:Company {name: 'CompanyA'})-[:HAS_SUBSIDIARY]->(childCompany:Company)
MATCH (childCompany)-[:IS_REGISTERED]->(country:Country)
// one list of labels per subsidiary's country
WITH motherCompany,
     collect(labels(country)) AS countriesLabels
// count each haven label explicitly; a country may carry neither label
WITH motherCompany,
     countriesLabels,
     size([countryLabels IN countriesLabels WHERE 'TH' IN countryLabels]) AS inTaxHaven,
     size([countryLabels IN countriesLabels WHERE 'CH' IN countryLabels]) AS inCorporateHaven
RETURN motherCompany,
       size(countriesLabels) AS total,
       inTaxHaven,
       inCorporateHaven
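For the "pass it on to the HTML" part, a rough sketch of what the server side might do with the label lists the query collects (countriesLabels): turn them into percentages and serialise them as JSON for the AJAX response. The function name haven_summary and the sample rows are made up for illustration, and counting CH explicitly is an assumption that some countries carry neither label.

```python
import json


def haven_summary(country_labels):
    """country_labels: one list of labels per subsidiary's country."""
    total = len(country_labels)
    tax = sum(1 for labels in country_labels if "TH" in labels)
    corporate = sum(1 for labels in country_labels if "CH" in labels)

    def pct(n):
        # Guard against a company without subsidiaries.
        return round(100 * n / total, 1) if total else 0.0

    return {
        "total": total,
        "tax_haven_pct": pct(tax),
        "corporate_haven_pct": pct(corporate),
    }


# Made-up sample: four subsidiaries, two in CH countries, one in a TH country.
rows = [["Country", "CH"], ["Country", "TH"], ["Country"], ["Country", "CH"]]
summary = haven_summary(rows)
payload = json.dumps(summary)  # body of the AJAX response
```

The page script can then read the three fields from the JSON and write them into the summary paragraph next to the d3 tree.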

In- and excluding nodes in a cypher query

Good morning,
I want to build a structure in Neo4J where I can handle my users and groups (kind of ACL). The idea is to have for each user and for each group a node with all the details. The groups shall become a graph where a root group will have sub-groups that can have also sub-groups without limit. The relation will be -[:IS_SUBGROUP_OF]- - so far nothing exciting. Every user will be related to a group with -[:IS_MEMBER_OF]- to have a clear assignment. Of course a user can be a member of 1 or more groups. Some users will have a different relation like -[:IS_LEADER_OF]- to identify teamlead of the groups.
My tasks:
Assignment: I can query each member of a group with a simple query, I can even query members of the subgroups using the current logged in and asking user:
MATCH (d1:Group:Local) -- (c:User)
MATCH (d:User) -[:IS_MEMBER_OF|IS_LEADER_OF]- (g:Group:Local)-[:IS_SUBGROUP_OF*0..]->(d1)
WHERE c.login = userLogin
RETURN DISTINCT d.lastname, d.firstname
I get every user related to every group of the current user and the subgroups below. Maybe you have a hint on how I can improve the query or the model.
Approval
Here I am stuck: I want all users of the querying user's current group and all members of all subgroups, except the leader of the current group. The reason is that a teamlead should not be able to approve actions for himself, but should be able to for every other member of his group and all members of subgroups, including their teamleads.
I tried to use the -[:IS_LEADER_OF]- relation to exclude them, but then I also lose the teamleads of the subgroups. Does anyone have an idea how I could either change the model or query the graph to get all users except the teamlead of the current group?
Thanks for your time,
Balael
* EDIT *
I think I am getting close, I just need to understand the results of these two queries:
MATCH (d:User) -- (g:Group) WHERE g.uuid = "xx"
RETURN d.lastname, d.firstname
Returns all users in this group, no matter what relationship (leader / member).
MATCH (d:User) -- (g:Group), (g)--(c:User{uuid:"yy"})
RETURN d.lastname, d.firstname
Returns all users of that group except the user c. I would have expected to get c as well in the list of d-users, as c is part of that group and should be found with (d:User).
I do not understand the difference between both queries, maybe someone has a hint for me?
You can simplify your query slightly (however this should not have an impact on performance):
MATCH (d:User) -[:IS_MEMBER_OF|IS_LEADER_OF]- (g:Group:Local)-[:IS_SUBGROUP_OF*0..]->(d1:Group:Local)--(c:User{login:"userlogin"})
RETURN DISTINCT d.lastname, d.firstname
I don't completely understand your question, but I assume you want to make sure that d1 and c are not connected by an IS_LEADER_OF relationship. If so, try:
MATCH (d:User) -[:IS_MEMBER_OF|IS_LEADER_OF]- (g:Group:Local)-[:IS_SUBGROUP_OF*0..]->(d1:Group:Local)-[r]-(c:User{login:"userlogin"})
WHERE type(r)<>'IS_LEADER_OF'
RETURN DISTINCT d.lastname, d.firstname
Following up on * EDIT * in the question:
In a MATCH you specify a path. By definition, a path does not use the same relationship twice; otherwise there is a danger of running into infinite recursion. Looking at the second query in the "EDIT" section above: the right part matches yy's relationship to the group, whereas the left part matches all users related to this group. To prevent the same relationship from being used twice, the left part does not match user yy.
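The effect can be emulated in a few lines of Python. This is only a toy model of the uniqueness rule, not how Neo4j evaluates the pattern: each relationship of the group is one tuple, and the two pattern parts must bind to two different tuples. The user names and the match_d_given_c helper are hypothetical.

```python
from itertools import permutations

# Hypothetical relationships attached to one group: (user, relationship type)
group_rels = [
    ("xx_leader", "IS_LEADER_OF"),
    ("yy", "IS_MEMBER_OF"),
    ("zz", "IS_MEMBER_OF"),
]


def match_d_given_c(rels, c_user):
    """Emulate MATCH (d)--(g), (g)--(c {uuid: c_user}).

    permutations(rels, 2) yields ordered pairs of two different
    relationships, so d and c can never be bound via the same one.
    """
    return sorted(
        {d for (d, _), (c, _) in permutations(rels, 2) if c == c_user}
    )
```

Because yy has a single relationship to the group and it is already taken by the (g)--(c) part, yy never shows up as d. If yy had two relationships to the group (say member and leader), it would appear.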

Get introduced [linkedin] like

I am using neo4j with people and companies as nodes and friend_of/works_at relationship between these.
I would like to know how to implement the "get introduced to a second-degree connection" feature that LinkedIn uses. The idea is to get your second-degree connections at the company you wish to apply to. If there are such second-degree connections, you would then like to know who among your first-degree connections can introduce you to them.
For this I'm trying this query :
START from = node:Nodes(startNode), company = node:Nodes(endNode)
MATCH from-[:FRIEND_OF]->f-[:FRIEND_OF]-fof-[:WORKS_AT]->company
WHERE not(fof = from) and not (from-[:FRIEND_OF]->fof)
RETURN distinct f.name, fof.name, company.name
But this returns duplicate friend-of-friend names (fof.name), since the distinct is applied to all the returned values as a whole. For example, I may have friends X and Y who are both connected to Z, who works at company C. This way, I get both X-Z-C and Y-Z-C. But I want to apply distinct on Z, so that I get either X-Z-C or Y-Z-C, or maybe a list/collection/aggregate of all friends that connect to Z. This could look like ["X","Y"..]->Z. How should I modify my query?
http://console.neo4j.org/?id=s1m14g
start joe=node:node_auto_index(name = "Joe")
match joe-[:knows]->friend-[:knows]->friend_of_friend
where not(joe-[:knows]-friend_of_friend)
return collect(friend.name), friend_of_friend.name
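What the collect() in the RETURN does, sketched in Python: returning an aggregate next to a plain value groups the rows by the plain value, so each friend of friend appears once with the list of friends who can make the introduction. The (friend, friend_of_friend) pairs below are made-up sample rows and introducers is a hypothetical helper name.

```python
from collections import defaultdict


def introducers(paths):
    """paths: (friend, friend_of_friend) pairs, one per matched path."""
    grouped = defaultdict(list)
    for friend, fof in paths:
        if friend not in grouped[fof]:  # skip repeated introducers
            grouped[fof].append(friend)
    return dict(grouped)


# X and Y both know Z; only X knows W.
paths = [("X", "Z"), ("Y", "Z"), ("X", "W")]
result = introducers(paths)
```

Note that the sketch drops repeated introducers, which plain collect() does not; that behaviour corresponds to collect(DISTINCT friend.name).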

How to get friends of friends that have the same interest?

Getting friends of friends is pretty easy; I got this, which seems to work great.
g.v(1).in('FRIEND').in('FRIEND').filter{it != g.v(1)}
But what I want to do is only get friends of friends that have the same interests. Below I want Joe to be suggested Moe but not Noe because they do not have the same interest.
You simply need to extend your gremlin traversal to go over the LIKES edges too:
g.v(1).in('FRIEND').in('FRIEND').filter{it != g.v(1)}.dedup(). \
as('friend').in('LIKES').out('LIKES').filter{it == g.v(1)}. \
back('friend').dedup()
Basically, this goes out to friends of friends, as you had before, and saves the position in the pipe under the name friend. It then goes out to mutual likes and searches for the original source node. If it finds one, it jumps back to friend. The dedup() just removes duplicates and may speed up traversals.
The directionality of this may not be 100% correct as you haven't indicated direction of edges in your diagram.
Does this have to be in Gremlin? If Cypher is acceptable, you can do:
START s=node(Joe)
MATCH s-[:FRIEND]-()-[:FRIEND]-fof, s-[:LIKES]-()-[:LIKES]-fof
WHERE s != fof
RETURN fof
This query finds mutual friends whether or not they share likes, but candidates with common likes come out on top.
Take a look at the ORDER BY.
MATCH (me:User{userid:'34219'})
MATCH (me)-[:FRIEND]-()-[:FRIEND]-(potentialFriend)
WITH me, potentialFriend, COUNT(*) AS friendsInCommon
WITH me,
potentialFriend,
SIZE((potentialFriend)-[:LIKES]->()<-[:LIKES]-(me)) AS sameInterest,
friendsInCommon
WHERE NOT (me)-[:FRIEND]-(potentialFriend)
RETURN potentialFriend, sameInterest, friendsInCommon,
friendsInCommon + sameInterest AS score
ORDER BY score DESC;
If you want only candidates with common likes, add the following condition:
WHERE sameInterest > 0
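The scoring itself is simple enough to sketch outside the database. The Python below mirrors the score = friendsInCommon + sameInterest ordering and the optional sameInterest > 0 filter; the candidate names and counts are made up, and rank is a hypothetical helper.

```python
def rank(candidates, only_common_likes=False):
    """candidates: name -> (friends_in_common, same_interest)."""
    rows = [
        (name, friends_in_common + same_interest)
        for name, (friends_in_common, same_interest) in candidates.items()
        if not only_common_likes or same_interest > 0
    ]
    # Highest score first; ties broken by name for a stable order.
    return sorted(rows, key=lambda row: (-row[1], row[0]))


# Moe shares two likes with me, Noe shares none.
candidates = {"Moe": (1, 2), "Noe": (2, 0)}
everyone = rank(candidates)
with_likes = rank(candidates, only_common_likes=True)
```

With the filter switched on, Noe drops out even though he has more friends in common, which is the behaviour asked for in the question.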
