Neo4j Cypher Query include start node - neo4j

I want to list all the movies that I act in, along with the number of actors in each movie. The query below only counts the actors other than me, and it does not return movies that have no other actors.
start me=node({0})
match me-[:ACTS_IN]->movie<-[:ACTS_IN]-otherActors
return movie, count(*) as actorSum

You need to break it up with WITH. The problem with your query is that the first part of the match already claims me (strictly speaking, its ACTS_IN relationship, which cannot be matched twice within one pattern), so me can never appear in otherActors.
start me=node({0})
match me-[:ACTS_IN]->movie
with movie
match movie<-[:ACTS_IN]-actors
return movie, count(*) as actorSum
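
To see why splitting the match changes the count, here is a minimal Python sketch that imitates the two queries on in-memory data. It is only an illustration, not how Neo4j evaluates patterns internally; the ACTS_IN list, the movie titles, and the function names are made up for this example.

```python
from collections import Counter

# Hypothetical toy data: one (actor, movie) tuple per ACTS_IN relationship.
ACTS_IN = [("me", "Solo Film"), ("me", "Duo Film"), ("ann", "Duo Film")]

def single_match(start):
    """Imitates one MATCH pattern: both relationships must be different ones."""
    counts = Counter()
    for i, (a1, m1) in enumerate(ACTS_IN):
        if a1 != start:
            continue
        for j, (a2, m2) in enumerate(ACTS_IN):
            # The relationship already used for `start` may not be reused.
            if m2 == m1 and j != i:
                counts[m1] += 1
    return dict(counts)

def split_match(start):
    """Imitates MATCH ... WITH movie MATCH ...: the second match starts fresh."""
    movies = [m for a, m in ACTS_IN if a == start]
    counts = Counter()
    for movie in movies:
        for a, m in ACTS_IN:
            if m == movie:
                counts[movie] += 1
    return dict(counts)

print(single_match("me"))  # {'Duo Film': 1}
print(split_match("me"))   # {'Solo Film': 1, 'Duo Film': 2}
```

The single-pattern version drops the movie with no other actors and undercounts the rest by one, which matches the behaviour described in the question.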

Related

Neo4j Cypher- With clause query

I'm working through some exercises on Neo4j's movies dataset. The question was:
Retrieve the actors who have acted in exactly five movies, returning the name of the actor, and the list of movies for that actor.
I wrote the following query, but I'm not getting any results; it shows "no changes no result":
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
WITH a,m, count(m) AS numMovies
WHERE numMovies = 5
RETURN a.name,collect(m.title) AS movies
Whereas when I wrote this query for the same statement, this time putting "collect(m.title) AS movies" in the WITH clause, I got the desired result.
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
WITH a, count(m) AS numMovies, collect(m.title) AS movies
WHERE numMovies = 5
RETURN a.name, movies
My question is: why does the result differ when I put "collect(m.title) AS movies" in the RETURN clause?
Your first query has a, m, count(m) in its WITH clause. Because a and m are both grouping keys, the count is computed per (a, m) pair and is 1 for every row, so the condition numMovies = 5 never matches.
You can check this by returning those same expressions directly after the MATCH:
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
RETURN a, m, count(m) AS numMovies
The solution is to remove the separate m variable from the WITH clause as shown in your second query.
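
The effect of the extra grouping key can be sketched in plain Python. This only imitates the grouping behaviour on made-up rows; the names ROWS and count_by are hypothetical and are not part of any Neo4j API.

```python
from collections import defaultdict

# Hypothetical rows, as MATCH (a:Person)-[:ACTED_IN]->(m:Movie) might yield them.
ROWS = [("Ann", f"Movie {i}") for i in range(1, 6)] + [("Bob", "Movie 1")]

def count_by(rows, key):
    """Count rows per grouping key, the way count(m) is computed per group."""
    groups = defaultdict(int)
    for row in rows:
        groups[key(row)] += 1
    return dict(groups)

# WITH a, m, count(m): both a and m are grouping keys.
per_actor_and_movie = count_by(ROWS, key=lambda r: (r[0], r[1]))
# WITH a, count(m): only a is a grouping key.
per_actor = count_by(ROWS, key=lambda r: r[0])

print(set(per_actor_and_movie.values()))            # {1}
print([a for a, n in per_actor.items() if n == 5])  # ['Ann']
```

With both keys every group has exactly one row, so filtering on a count of 5 removes everything; with the actor as the only key, the filter finds the actor with five movies.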

Why is DISTINCT needed in this Cypher query?

The query below is taken from the Neo4j movie review dataset sandbox:
MATCH (u:User {name: "Some User"})-[r:RATED]->(m:Movie)
WITH u, avg(r.rating) AS mean
MATCH (u)-[r:RATED]->(m:Movie)-[:IN_GENRE]->(g:Genre)
WHERE r.rating > mean
WITH u, g, COUNT(*) AS score
MATCH (g)<-[:IN_GENRE]-(rec:Movie)
WHERE NOT EXISTS((u)-[:RATED]->(rec))
RETURN rec.title AS recommendation, rec.year AS year, COLLECT(DISTINCT g.name) AS genres, SUM(score) AS sscore
ORDER BY sscore DESC LIMIT 10
What I cannot understand is why the DISTINCT keyword is required in the query's RETURN clause, because the expected result of the last MATCH clause is something like this:
g1,x
g1,y
...
g2,z
g2,v
g2,m
...
gn,m
gn,b
gn,x
where g1, g2, ..., gn are the genres and x, y, z, v, m, b, ... are movies (there are also user and score columns, omitted here for readability).
So, according to my understanding, this query returns, for each movie, its genres and the sum of their scores.
Assumptions:
Every Movie has a unique title. (This is required for the query to work as is.)
Every Genre has a unique name.
Every Movie has at most one IN_GENRE relationship to each distinct Genre.
Given the above assumptions, you are correct that the DISTINCT is not necessary. That is because the RETURN clause is using rec.title as one of the aggregation grouping keys.

Neo4j: query to find nodes with most relationship

I'm trying to find which movie in my database has the most actors. Here's what I came up with, but it kept returning nothing.
MATCH (m:Movie)
WITH m, SIZE(()-[:ACTED_IN]->(m)) as actorCnt
MATCH (a)-[:ACTED_IN]->(m)
RETURN m, a
Maybe you did not wait long enough, because your query is trying to return all the actors for every movie.
This query should return a list of the actors for the (single) movie with the most actors:
MATCH (m:Movie)
WITH m
ORDER BY SIZE(()-[:ACTED_IN]->(m)) DESC
LIMIT 1
RETURN m, [(a)-[:ACTED_IN]->(m)|a] AS actors
It orders the movies by descending number of actors, takes just the first one, and returns it and a list of all its actors.
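
As a rough illustration of that order-then-limit step, the same idea in Python on made-up data looks like this. The ACTED_IN dictionary and the function name are hypothetical and stand in for the graph.

```python
# Hypothetical toy data: movie title -> actors with an ACTED_IN relationship to it.
ACTED_IN = {"A": ["x", "y"], "B": ["x", "y", "z"], "C": []}

def movie_with_most_actors(cast_by_movie):
    # ORDER BY actor count DESC ...
    ranked = sorted(cast_by_movie.items(), key=lambda kv: len(kv[1]), reverse=True)
    # ... LIMIT 1, then return the movie together with its actor list.
    movie, actors = ranked[0]
    return movie, actors

print(movie_with_most_actors(ACTED_IN))  # ('B', ['x', 'y', 'z'])
```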

Neo4j: query to find 2 node with the most

I'm trying to find which pair of actors has acted together in the most movies in my database, but my query keeps returning nothing. Any suggestions?
MATCH (actor1:Actor)<-[st1:ACTED_IN]-(mv1:Movie)-[st2:ACTED_IN]->(actor2:Actor)
RETURN distinct actor1,actor2,count(mv1)
It looks like you have written the relationship arrows in the reverse direction.
They should probably point from Actor to Movie:
MATCH (actor1:Actor)-[st1:ACTED_IN]->(mv1:Movie)<-[st2:ACTED_IN]-(actor2:Actor)
RETURN distinct actor1, actor2, count(mv1)
Even though you are using DISTINCT, your query will return duplicates, because the following two are different records:
actor1, actor2, movie_count
actor2, actor1, movie_count
To get rid of these duplicate entries, you can use the simple trick of comparing node ids, like this:
MATCH (actor1:Actor)-[:ACTED_IN]->(mv1:Movie)<-[:ACTED_IN]-(actor2:Actor)
WHERE id(actor1)>id(actor2)
RETURN actor1,actor2,count(mv1)
To find the pair of actors who acted together in the most movies:
MATCH (actor1:Actor)-[:ACTED_IN]->(mv1:Movie)<-[:ACTED_IN]-(actor2:Actor)
WHERE id(actor1)>id(actor2)
RETURN actor1,actor2,count(mv1) AS movie_count
ORDER BY movie_count DESC
LIMIT 1
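
The id-comparison trick can be checked with a small Python sketch. It only mimics the row set that the pattern would produce; the CAST data is invented for the example, and plain integers stand in for node ids.

```python
from collections import Counter
from itertools import permutations

# Hypothetical toy cast lists keyed by movie; integers stand in for node ids.
CAST = {"m1": [1, 2, 3], "m2": [1, 2], "m3": [1, 2]}

def pair_counts(dedupe):
    counts = Counter()
    for actors in CAST.values():
        # The pattern yields every ordered pair of distinct actors in a movie.
        for a1, a2 in permutations(actors, 2):
            if dedupe and not a1 > a2:  # WHERE id(actor1) > id(actor2)
                continue
            counts[(a1, a2)] += 1
    return counts

print(len(pair_counts(dedupe=False)))           # 6
print(len(pair_counts(dedupe=True)))            # 3
print(pair_counts(dedupe=True).most_common(1))  # [((2, 1), 3)]
```

Without the filter each pair shows up twice, once in each order; with it, only one ordering survives, and taking the top entry corresponds to ORDER BY movie_count DESC LIMIT 1.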

Use vars from before WITH statement in RETURN statement in Neo4j Cypher

I'm starting with Neo4j (v2.1.5) and I'm having an issue with the following Cypher query:
MATCH (actor:Person{name:"Tom Cruise"})-[role:ACTED_IN]->(movies)<-[r:ACTED_IN]-(coactors)
WITH coactors, count(coactors) as TimesCoacted
RETURN coactors.name, avg(TimesCoacted)
ORDER BY avg(TimesCoacted) DESC
It is based on the mini movie graph which comes with Neo4j installation.
Everything works fine: it shows all the coactors who acted in movies with Tom Cruise and how many times they acted together. The problem occurs when I want to list the movies in which they coacted. Placing the 'movies' variable in the RETURN statement throws the following error:
movies not defined (line 3, column 9)
"RETURN movies, coactors.name, avg(TimesCoacted)"
^
Is there any way I can do it in one query?
Try the following:
MATCH
(actor:Person{name:"Tom Cruise"})-[role:ACTED_IN]->(movies)<-[r:ACTED_IN]-(coactors)
WITH
coactors,
count(coactors) as TimesCoacted,
movies // You have to declare "movies" here in order to use it later in the query
RETURN
movies,
coactors.name,
avg(TimesCoacted)
ORDER BY
avg(TimesCoacted) DESC
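What you define in the WITH clause is the only thing that is available for further processing. In the original question, movies was not carried over to the next part of the query (it was not part of the WITH), and therefore it could not be used in the RETURN clause. Note that adding movies to the WITH clause also makes it a grouping key, so count(coactors) is then computed per coactor and movie rather than per coactor.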
Edit: After updates from the OP the following was added.
Another example: if you wish to count the number of movies in which the actors have coacted and list the movie titles as well, try the following:
MATCH
(actor:Person {name:"Tom Cruise"})-[:ACTED_IN]->(movie)<-[:ACTED_IN]-(coactor:Person)
WITH
actor,
coactor,
collect (distinct movie.title) as movieTitles
RETURN
actor.name as actorName,
coactor.name as coactorName,
movieTitles,
size(movieTitles) as numberOfMovies
MATCH
(actor:Person{name:"Tom Cruise"})-[role:ACTED_IN]->(movies)<-[r:ACTED_IN]-(coactors)
WITH
coactors,
count(coactors) as TimesCoacted,
collect(DISTINCT movies.title) as movies // <=- this line was crucial!
RETURN
movies,
coactors.name,
avg(TimesCoacted)
ORDER BY
avg(TimesCoacted) DESC
