Neo4j: query to find nodes with the most relationships

I'm trying to find which movie in my database has the most actors. Here's what I came up with, but it keeps returning a blank result.
MATCH (m:Movie)
WITH m, SIZE(()-[:ACTED_IN]->(m)) as actorCnt
MATCH (a)-[:ACTED_IN]->(m)
RETURN m, a

Maybe you did not wait long enough, because your query is trying to return all the actors for every movie.
This query should return a list of the actors for the (single) movie with the most actors:
MATCH (m:Movie)
WITH m
ORDER BY SIZE(()-[:ACTED_IN]->(m)) DESC
LIMIT 1
RETURN m, [(a)-[:ACTED_IN]->(m)|a] AS actors
It orders the movies by descending number of actors, takes just the first one, and returns it and a list of all its actors.
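As a rough illustration of what that query computes, here is a small in-memory Python model of the same steps (count actors per movie, sort descending, keep the first). The actor and movie names are made up, and `movie_with_most_actors` is a hypothetical helper, not part of any Neo4j API. Unlike the Cypher query, this sketch only sees movies that have at least one actor.

```python
# Hypothetical ACTED_IN relationships as (actor, movie) pairs; all names are made up.
acted_in = [
    ("Ann", "Movie A"),
    ("Bob", "Movie A"),
    ("Cy", "Movie A"),
    ("Ann", "Movie B"),
    ("Dee", "Movie C"),
    ("Bob", "Movie C"),
]


def movie_with_most_actors(rels):
    # Group actors by movie: the counterpart of counting incoming ACTED_IN relationships.
    cast = {}
    for actor, movie in rels:
        cast.setdefault(movie, []).append(actor)
    # Counterpart of ORDER BY <actor count> DESC LIMIT 1.
    top = max(cast, key=lambda movie: len(cast[movie]))
    # Counterpart of RETURN m, [(a)-[:ACTED_IN]->(m)|a] AS actors.
    return top, cast[top]


movie, actors = movie_with_most_actors(acted_in)
print(movie, actors)
```

Only one movie and its cast come back, which is why this is much cheaper than returning every actor of every movie.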

Related

Neo4j Cypher- With clause query

I'm working through some exercises on Neo4j's movies dataset. The question was:
Retrieve the actors who have acted in exactly five movies, returning the name of the actor, and the list of movies for that actor.
I wrote the following query, but I'm not getting any result; it shows "no changes no result":
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
WITH a,m, count(m) AS numMovies
WHERE numMovies = 5
RETURN a.name,collect(m.title) AS movies
However, when I wrote this query for the same statement, this time putting "collect(m.title) AS movies" in the WITH clause, I got the desired result:
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
WITH a, count(m) AS numMovies, collect(m.title) AS movies
WHERE numMovies = 5
RETURN a.name, movies
My question is: why does the result differ when I put "collect(m.title) AS movies" in the RETURN clause instead?
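Your first query has m, count(m) in its WITH clause. Every non-aggregated expression there acts as a grouping key, so the count is computed per (a, m) pair, which results in a count of 1 for each Movie node m.
You can check this by returning the same expressions that the WITH clause on the second line uses: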
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
RETURN a, m, count(m) AS numMovies
The solution is to remove the separate m variable from the WITH clause as shown in your second query.
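The grouping behaviour can be mimicked outside Neo4j. The following Python sketch is only a model of Cypher's implicit grouping, with made-up rows and a hypothetical `count_by` helper; it shows why keeping m as a grouping key makes every count equal to 1.

```python
from collections import defaultdict

# Hypothetical rows produced by MATCH (a:Person)-[:ACTED_IN]->(m:Movie); names are made up.
rows = [("Ann", f"Movie {i}") for i in range(1, 6)] + [("Bob", "Movie 1"), ("Bob", "Movie 2")]


def count_by(rows, key):
    # Model of implicit grouping: rows with the same key form one group, and count() runs per group.
    counts = defaultdict(int)
    for row in rows:
        counts[key(row)] += 1
    return dict(counts)


# WITH a, m, count(m): one group per (actor, movie) pair, so every count is 1.
per_pair = count_by(rows, key=lambda row: row)
# WITH a, count(m): one group per actor.
per_actor = count_by(rows, key=lambda row: row[0])

print(set(per_pair.values()))
print(per_actor)
```

With the (a, m) grouping, no group ever reaches a count of 5, so the WHERE numMovies = 5 filter removes every row.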

Why is DISTINCT needed in this Cypher query?

The query below is taken from the Neo4j movie review dataset sandbox:
MATCH (u:User {name: "Some User"})-[r:RATED]->(m:Movie)
WITH u, avg(r.rating) AS mean
MATCH (u)-[r:RATED]->(m:Movie)-[:IN_GENRE]->(g:Genre)
WHERE r.rating > mean
WITH u, g, COUNT(*) AS score
MATCH (g)<-[:IN_GENRE]-(rec:Movie)
WHERE NOT EXISTS((u)-[:RATED]->(rec))
RETURN rec.title AS recommendation, rec.year AS year, COLLECT(DISTINCT g.name) AS genres, SUM(score) AS sscore
ORDER BY sscore DESC LIMIT 10
What I cannot understand is why the DISTINCT keyword is required in the query's RETURN clause, because the expected result of the last MATCH clause is something like this:
g1,x
g1,y
...
g2,z
g2,v
g2,m
...
gn,m
gn,b
gn,x
where g1, g2, ..., gn are the genres and x, y, z, v, m, b, ... are movies (there are also user and score columns, omitted here for readability).
So, as I understand it, the query returns, for each movie, its genres and the sum of their scores.
Assumptions:
Every Movie has a unique title. (This is required for the query to work as is.)
Every Genre has a unique name.
Every Movie has at most one IN_GENRE relationship to each distinct Genre.
Given the above assumptions, you are correct that the DISTINCT is not necessary. That is because the RETURN clause is using rec.title as one of the aggregation grouping keys.
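To make the role of the third assumption concrete, here is a minimal Python model of the final aggregation. It is a sketch only: `recommend` is a hypothetical helper, the data is made up, and the WHERE NOT EXISTS filter on already-rated movies is left out.

```python
from collections import defaultdict


def recommend(genre_scores, in_genre):
    # genre_scores: one entry per genre, like the rows of WITH u, g, COUNT(*) AS score.
    # in_genre: (movie title, genre name) pairs, one per IN_GENRE relationship.
    genres = defaultdict(list)
    total = defaultdict(int)
    for title, genre in in_genre:
        if genre in genre_scores:
            genres[title].append(genre)          # COLLECT(g.name) without DISTINCT
            total[title] += genre_scores[genre]  # SUM(score)
    return dict(genres), dict(total)


scores = {"Drama": 3, "Comedy": 1}
clean = [("Movie X", "Drama"), ("Movie X", "Comedy"), ("Movie Y", "Drama")]
genres, total = recommend(scores, clean)
print(genres, total)

# Breaking the third assumption: a duplicated IN_GENRE relationship for Movie X.
dup_genres, dup_total = recommend(scores, clean + [("Movie X", "Drama")])
print(dup_genres, dup_total)
```

In this model, duplicates in the collected genres appear only once a movie has two IN_GENRE relationships to the same genre. In that case DISTINCT would tidy the genre list, but the summed score would still count that genre twice.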

Using UNWIND on a list I created to return multiple values (Cypher)

I am using the "Movies" database in Neo4j to simplify my question (type :play movies in the query box of an empty sandbox). For a list of 3 actors that I specify, I want to determine the total number of movies they've worked on, the number of movies they've acted in, and the number of movies they've directed (if any). Here is what I came up with:
MATCH (p:Person)-->(m:Movie)
WITH p, m, count(m) AS total
MATCH (p)-[:ACTED_IN]->(m)
WITH p, m, total, count(DISTINCT m) AS actedIn
MATCH (p)-[:DIRECTED]->(m)
WITH p, m, total, actedIn, count(DISTINCT m) AS directed
UNWIND ["Tom Hanks", "Clint Eastwood", "Charlize Theron"] AS actors
RETURN DISTINCT actors, total, actedIn, directed
Currently, it is returning that each actor acted in 1 movie and directed 1 movie, which is incorrect. I need to keep the WITH clauses in the query and I need to define the list of actors.
In the real query I am working on that compares to this simpler one, the same thing is happening where each element of the list I defined returns the same numbers as the other elements in the list. I am not sure what I am doing wrong here.
I think this query will work for you.
Since every person has been involved in a movie in some capacity, the first MATCH can assert that, and the subsequent ones can then be optional.
// Find the people in your list and every movie they worked on, in any capacity
MATCH (p:Person)-->(m:Movie)
WHERE p.name IN ["Tom Hanks", "Clint Eastwood", "Charlize Theron"]
// carry the people and the total movies per person
WITH p, count(m) AS total
// find the movies those people acted in
OPTIONAL MATCH (p)-[:ACTED_IN]->(m:Movie)
// carry the people, total movies and the movies acted in
WITH p, total, count(m) AS actedIn
// find the movies they directed
OPTIONAL MATCH (p)-[:DIRECTED]->(m:Movie)
RETURN p.name, total, actedIn, count(m) AS directed

Get the full graph of a query in Neo4j

Suppose that I have the default Movies database and I want to find the total number of people who have participated in each movie, regardless of their role (i.e. including the actors, the producers, the directors, etc.).
I have already done that using the query:
MATCH (m:Movie)<-[r]-(n:Person)
WITH m, COUNT(n) as count_people
RETURN m, count_people
ORDER BY count_people DESC
LIMIT 3
Ok, I have included some extra options but that doesn't really matter in my actual question. From the above query, I will get 3 movies.
Q. How can I extend the above query so that I get a graph including all the relationships for these 3 movies (such as DIRECTED, ACTED_IN, PRODUCED, etc.)?
I know that I can expand all the relationships of each movie through the buttons on each movie node, but I would like to know whether I can do so through Cypher.
Use an additional OPTIONAL MATCH:
MATCH (m:Movie)<--(n:Person)
WITH m,
COUNT(n) as count_people
ORDER BY count_people DESC
LIMIT 3
OPTIONAL MATCH p = (m)-[r]-(RN) WHERE type(r) IN ['DIRECTED', 'ACTED_IN', 'PRODUCED']
RETURN m,
collect(p) as graphPaths,
count_people
ORDER BY count_people DESC

Neo4j Cypher Query include start node

I want to list all the movies that I act in and the total number of actors in each movie, but the query below only counts the actors other than me and does not return movies that have no other actors.
start me=node({0})
match me-[:ACTS_IN]->movie<-[:ACTS_IN]-otherActors
return movie, count(*) as actorSum
You need to break it up with WITH. The problem with your query is that you're claiming the me node in the first part of the match, so me can't ever be in otherActors.
start me=node({0})
match me-[:ACTS_IN]->movie
with movie
match movie<-[:ACTS_IN]-actors
return movie, count(*) as actorSum
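The difference between the two queries comes from the rule that a relationship can be used only once within a single MATCH pattern. The Python sketch below is an in-memory model of that rule, not Cypher itself; the data and the `single_match` and `split_match` helpers are hypothetical.

```python
# Hypothetical ACTS_IN relationships as (actor, movie) pairs; all names are made up.
acts_in = [("me", "Solo Film"), ("me", "Duo Film"), ("Pat", "Duo Film")]


def single_match(rels, me):
    # One pattern: me-[:ACTS_IN]->movie<-[:ACTS_IN]-otherActors.
    # The relationship already bound on the left cannot be reused on the right.
    counts = {}
    for i, (actor1, movie1) in enumerate(rels):
        if actor1 != me:
            continue
        for j, (_, movie2) in enumerate(rels):
            if j != i and movie2 == movie1:
                counts[movie1] = counts.get(movie1, 0) + 1
    return counts


def split_match(rels, me):
    # First MATCH finds my movies; after WITH, a fresh MATCH may use any relationship again.
    movies = {movie for actor, movie in rels if actor == me}
    return {movie: sum(1 for _, m in rels if m == movie) for movie in movies}


print(single_match(acts_in, "me"))
print(split_match(acts_in, "me"))
```

In the single-pattern model the movie with no other actors disappears and the remaining count excludes me, whereas the split version counts every actor, including me.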
