Neo4j Cypher query with variable relationship path length

I'm moving my complex user database, where users can be on one of many teams, be friends with each other and more, to Neo4j. Doing this in an RDBMS was painful and slow, but is simple and blazing fast with Neo4j. :)
I was hoping there is a way to query for a relationship that is 1 hop away and another relationship that is 2 hops away in the same query.
START n=node:myIndex(user='345')
MATCH n-[:IS_FRIEND|ON_TEAM*2]-m
RETURN DISTINCT m;
The reason is that users that are friends are one edge from each other, but users linked by teams are linked through that team node, so they are two edges away. This query does IS_FRIEND*2 and ON_TEAM*2, which gets teammates (yeah) and friends of friends (boo).
Is there a succinct way in Cypher to get both differing length relations in a single query?

I rewrote it to return a collection:
start person=node(1)
match person-[:IS_FRIEND]-friend
with person, collect(distinct friend) as friends
match person-[:ON_TEAM*2]-teammate
with person, friends, collect(distinct teammate) as teammates
return person, friends + filter(dupcheck in teammates: not(dupcheck in friends)) as teammates_and_friends
http://console.neo4j.org/r/oo4dvx
Thanks for putting together the sample db, Werner.

I have created a small test database at http://console.neo4j.org/?id=sqyz7i
I have also created a query which will work as you described:
START n=node(1)
MATCH n-[:IS_FRIEND]-m
WITH collect(distinct id(m)) as a, n
MATCH n-[:ON_TEAM*2]-m
WITH collect(distinct id(m)) as b, a
START n=node(*)
WHERE id(n) in a + b
RETURN n
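On newer Neo4j versions, which have dropped START and bare node identifiers, the same result can probably be expressed with UNION. This is only a sketch: the User label and the user property are assumptions standing in for the myIndex lookup in the question.

```cypher
// Direct friends: one IS_FRIEND hop.
MATCH (n:User {user: '345'})-[:IS_FRIEND]-(m)
RETURN m
UNION
// Teammates: two ON_TEAM hops through the shared team node.
MATCH (n:User {user: '345'})-[:ON_TEAM*2]-(m)
RETURN m
```

Plain UNION (as opposed to UNION ALL) removes duplicate rows, so someone who is both a friend and a teammate comes back only once.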

Related

Is there a simpler version of this cypher query?

I have constructed a query to find the people who follow each other and who have read books in the same genre. Here it is:
MATCH (u1:User)-[:READ]->(b1:Book)
WITH collect(DISTINCT b1.genre) AS genres,u1 AS user1
MATCH (u2:User)-[:READ]->(b2:Book)
WHERE (user1)<-[:FOLLOWS]->(u2) AND b2.genre IN genres
RETURN DISTINCT user1.username AS user1,u2.username AS user2
The idea is that we collect all the book genres for one of them, and if a book read by the other is in that list of genres (and they follow each other), then we return those users. This seems to work: we get a list of distinct pairs of individuals. I wonder, though, if there is a quicker way to do this? My solution seems somewhat clumsy, but I found it surprisingly finicky trying to specify that they have read a book in the same genre without getting back all the pairs of books and duplicating individuals. For example, I first wrote the following:
MATCH (b1:Book)<-[:READ]-(u1:User)-[:FOLLOWS]-(u2:User)-[:READ]->(b2:Book)
WHERE b1.genre = b2.genre
RETURN DISTINCT u1.username AS user1, u2.username AS user2
Which seems simpler, but in fact it returned repeated names for all the books that were read in the same genre. Is my solution the simplest, or is there a simpler one?
This is one way of rewriting the query
MATCH (n1:User)-[:FOLLOWS]-(n2:User)
MATCH (n1)-[:READ]->(book), (n2)-[:READ]->(book2)
WHERE book.genre = book2.genre
RETURN n1.username, n2.username, count(*)
Here is another, collecting the genres for each user:
MATCH (n1:User)-[:FOLLOWS]-(n2:User)
WITH n1, n2,
[(n1)-[:READ]->(book) | book.genre] AS g1,
[(n2)-[:READ]->(book) | book.genre] AS g2
WHERE ANY(x IN g1 WHERE x IN g2)
RETURN n1, n2, count(*)
Note that query length alone does not make a query better or worse; the way the data is retrieved needs to make sense to you.
Your model, however, clearly shows that you would benefit from a bit of graph refactoring, extracting the genre into its own node, for example:
MATCH (n:Book)
MERGE (g:Genre {name: n.genre})
MERGE (n)-[:HAS_GENRE]->(g)
And this would be the new query which leverages a graph model
PROFILE
MATCH (n1:User)-[:FOLLOWS]-(n2:User)
WHERE (n1)-[:READ]->()-[:HAS_GENRE]->()<-[:HAS_GENRE]-()<-[:READ]-(n2)
RETURN n1.username, n2.username, count(*)

Cypher - given relationship, get the nodes

If I do the query
MATCH (:Label1 {prop1: "start node"}) -[relationships*1..10]-> ()
UNWIND relationships as relationship
RETURN DISTINCT relationship
How do I get the nodes for each of the acquired relationships, to get a result in this format:
╒════════╤════════╤═══════╕
│"from" │"type" │"to" │
╞════════╪════════╪═══════╡
├────────┼────────┼───────┤
└────────┴────────┴───────┘
Is there a function such as type(r) but for getting nodes from relationship?
RomanMitasov and ray have working answers above.
I don't think they quite get at what you want to do though, because you're basically returning every relationship in the graph in a sort of inefficient way. I say that because without a start or end position, specifying a path length of 1-10 doesn't do anything.
For example:
CREATE (r1:Temp)-[:TEMP_REL]->(r2:Temp)-[:TEMP_REL]->(r3:Temp)
Now we have a graph with 3 Temp nodes with 2 relationships: from r1 to r2, from r2 to r3.
Run your query on these nodes:
MATCH (:Temp)-[rels*1..10]->(:Temp)
UNWIND rels as rel
RETURN startNode(rel), type(rel), endNode(rel)
And you'll see you get four rows. Which is not what you want because there are only two distinct relationships.
You could modify that to return only distinct values, but you're still over-searching the graph.
To get an idea of what relationships are in the graph and what they connect, I use a query like:
MATCH (n)-[r]->(m)
RETURN labels(n), type(r), labels(m), count(r)
The downside of that, of course, is that it can take a while to run if you have a very large graph.
If you just want to see the structure of your graph:
CALL db.schema.visualization()
Best wishes and happy graphing! :)
Yes, such functions do exist!
startNode(r) to get the start node from relationship r
endNode(r) to get the end node
Here's the final query:
MATCH () -[relationships*1..10]-> ()
UNWIND relationships as r
RETURN startNode(r) as from, type(r) as type, endNode(r) as to
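Putting this together with the anchored start node from the question, something like the following should return each relationship only once. It is a sketch: the column aliases fromNode, relType and toNode are my own naming, not anything required by Cypher.

```cypher
// Walk up to 10 hops out from the start node given in the question.
MATCH (:Label1 {prop1: "start node"})-[relationships*1..10]->()
UNWIND relationships AS r
// A relationship can sit on several paths; keep one copy of each.
WITH DISTINCT r
RETURN startNode(r) AS fromNode, type(r) AS relType, endNode(r) AS toNode
```

Deduplicating on r before calling startNode and endNode keeps the result to one row per relationship.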

Aggregating relationships via cypher

I am fairly certain I have seen it somewhere but all keywords I have tried came up empty.
I have a graph that connects persons and companies via documents:
(:Person/:Company)-[ ]-(:Document)-[ ]-(:Person/:Company)
What I would like to do is return a graph that shows the connection between persons and companies directly with the relationship strength based on the number of connections between them.
I get the data with
MATCH (p)-[]-(d:Document)-[]-(c)
WHERE (p:Person or p:Company) and (c:Person or c:Company)
WITH p,c, count(d) as rel
RETURN p,rel,c
However, in the Neo4j Browser the nodes appear without any relationships. Is there a way to achieve this, or do I have to create some kind of meta relationship?
If you install APOC Procedures, you'll be able to create virtual relationships which are used for visualization but aren't actually stored in the db.
MATCH (p)-[]-(d:Document)-[]-(c)
WHERE (p:Person or p:Company) AND (c:Person or c:Company)
AND id(p) < id(c)
WITH p,c, count(d) as relStrength
CALL apoc.create.vRelationship(p,'REL',{strength:relStrength}, c) YIELD rel
RETURN p,rel,c
I also added a predicate on ids of p and c so you don't repeat the same two nodes with p and c switched.
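If APOC is not available, the "meta relationship" mentioned in the question can be stored as a real relationship instead. A minimal sketch, assuming you are happy to write to the database; CONNECTED_TO is a hypothetical relationship type chosen for illustration.

```cypher
MATCH (p)--(d:Document)--(c)
WHERE (p:Person OR p:Company) AND (c:Person OR c:Company)
  AND id(p) < id(c)
WITH p, c, count(d) AS relStrength
// CONNECTED_TO is a made-up type; MERGE reuses it if it already exists.
MERGE (p)-[r:CONNECTED_TO]-(c)
SET r.strength = relStrength
RETURN p, r, c
```

Unlike the virtual relationship, this one persists, so the strength has to be recomputed when documents are added or removed.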

Neo4j: multiple counts from multiple matches

Given a neo4j schema similar to
(:Person)-[:OWNS]-(:Book)-[:CATEGORIZED_AS]-(:Category)
I'm trying to write a query to get the count of books owned by each person as well as the count of books in each category so that I can calculate the percentage of books in each category for each person.
I've tried queries along the lines of
match (p:Person)-[:OWNS]-(b:Book)-[:CATEGORIZED_AS]-(c:Category)
where p.name in []
with p, b, c
match (p)-[:OWNS]-(b2:Book)-[:CATEGORIZED_AS]-(c2:Category)
with p, b, c, b2
return p.name, b.name, c.name,
count(distinct b) as count_books_in_category,
count(distinct b2) as count_books_total
But the query plan is absolutely horrible when trying to do the second match. I've tried to figure out different ways to write the query so that I can do the two different counts, but haven't figured out anything other than doing two matches. My schema isn't really about people and books. The :CATEGORIZED_AS relationship in my example is actually a few different relationship options, specified as [:option1|option2|option3]. So in my 2nd match I repeat the relationship options so that my total count is constrained by them.
Ideas? This feels similar to "Neo4j - apply match to each result of previous match", but there didn't seem to be a good answer for that one.
UNWIND is your friend here. First, calculate the total books per person, collecting them as you go.
Then unwind them so you can match which categories they belong to.
Aggregate by category and person, and you should get the number of books in each category for each person:
match (p:Person)-[:OWNS]->(b:Book)
with p,collect(b) as books, count(b) as total
with p,total,books
unwind books as book
match (book)-[:CATEGORIZED_AS]->(c)
return p,c, count(book) as subtotal, total
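Since the goal in the question was a percentage per category, the two counts can be combined in the same query. A sketch along the same lines, assuming the name properties used in the question; the Category label and the column aliases are illustrative.

```cypher
MATCH (p:Person)-[:OWNS]->(b:Book)
WITH p, collect(b) AS books, count(b) AS total
UNWIND books AS book
MATCH (book)-[:CATEGORIZED_AS]->(c:Category)
// total is carried along as a grouping key next to p and c.
WITH p, c, count(book) AS subtotal, total
RETURN p.name AS person, c.name AS category, subtotal, total,
       100.0 * subtotal / total AS percent
```

Multiplying by 100.0 forces floating-point division; with two integers Cypher would truncate the result.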

Mutual Friends of Friends Neo4j Cypher

I am new to Neo4j and Cypher and am writing my BA thesis, in which I compare an RDBMS against the Neo4j graph database for social networks. I've defined some queries in SQL and Cypher for a performance test over JDBC and the REST API in JMeter. However, I have a problem writing the Cypher query to get the nodes which are the mutual friends of friends for a certain node.
My first approach was like so:
MATCH (me:Enthusiast {Id: 488})-[:abonniert]->(f:Enthusiast)-[:abonniert]->(fof:Enthusiast)<-[:abonniert]-(f) RETURN o
I guess you're pretty close with your Cypher statement. I assume that "mutual friend of 2nd degree" means that I am a mutual friend of someone the target is also a mutual friend of?
If so (shortening labels and relationship types for readability):
MATCH
(me:En {Id: 488})-[:abonniert]->(f:En)-[:abonniert]->(fof:En),
(fof)-[:abonniert]->(f)-[:abonniert]->(me)
RETURN fof
It would be nice if you could create an example scenario at http://console.neo4j.org/ .
I would also omit the relationship direction.
MATCH (me:Enthusiast {Id: 488})-[:abonniert]->(f:Enthusiast),
(f)-[:abonniert]-(x:Enthusiast)-[:abonniert]-(y:Enthusiast)
WHERE (f)--(y) AND y.Id <> 488
RETURN f, y, count(x) as NrMutFr
Edit
Try this console query, which works for the scenario: http://console.neo4j.org/r/tws07k
In that case my query above would be:
MATCH (me:Enthusiast {Id: 488})-[:abonniert]->(f:Enthusiast),
(f)-[:abonniert]->(x:Enthusiast)<-[:abonniert]-(y:Enthusiast)
WHERE (me)--(y)
RETURN f, y, count(x) as NrMutFr
The difference from the query in your question is that the pattern must end in a new node variable, y, rather than reusing f. Then, if necessary, match that y against the starting me node as well.
Once you've matched your friends, you should be able to express the rest of the query as a path predicate: match "my friends", then filter out everyone except "those of my friends who have some friend in common", which amounts to the same as "those of my friends who have a friend-of-friend who is a friend of mine".
MATCH (me:Enthusiast { Id: 488 })-[:abonniert]->(f)
WHERE (f)-[:abonniert]-()-[:abonniert]-()<-[:abonniert]-(me)
RETURN f
Here's a console: http://console.neo4j.org/r/87n0j9. If I have misunderstood your question you can make changes in that console, click "share" and post back the link here with an explanation of what result you expect to get back.
Edit
If you want to get the nodes that two or more of your friends are related to in common, you can do:
MATCH (me:Enthusiast { Id: 488 })-[:subscribed]->(f)-[:subscribed]->(common)
WITH common, count(common) AS cnt
WHERE cnt > 1
RETURN common
A node that is a common neighbour of your neighbours can be described as a node you can reach on at least two paths. You can therefore match your neighbour-of-neighbours, count the times each "non" is matched, and if it is matched more than once then it is a "non" that is common to at least two of your neighbours. If you want you can return that count and order the result by it as a type of scoring (since this seems to be for recommendation purposes).
MATCH (me:Enthusiast { Id: 488 })-[:subscribed]->(f)-[:subscribed]->(common)
WITH common, count(common) AS score
WHERE score > 1
RETURN common, score
ORDER BY score DESC

Resources