Cypher Neo4j ORDER BY DESC query - neo4j

I want to order the COUNT(Movie.title) in descending order.
But it gives an error.
This is the query.
MATCH (Movie {genre:"Action"})<-[:ACTS_IN]-(Person)
RETURN Person.name, Movie.genre, COUNT(Movie.title)
ORDER BY COUNT(Movie.title) DESC
LIMIT 100
Thanks!

You can use this query:
MATCH (movie:Movie {genre:"Action"})<-[:ACTS_IN]-(person:Person)
RETURN person.name, movie.genre, COUNT(distinct movie.title) AS cnt
ORDER BY cnt DESC
LIMIT 100

The error occurs because the Cypher version you are using does not let you order by an aggregate expression directly. To order by an aggregate, compute it in a WITH clause first.
So your query should be (assuming that you want to list the titles per actor per genre):
MATCH (Movie {genre:"Action"})<-[:ACTS_IN]-(Person)
WITH Person.name AS name, Movie.genre AS genre, COLLECT(Movie.title) AS titles
RETURN name, genre, titles
ORDER BY SIZE(titles) DESC
LIMIT 100
The LIMIT 100 here applies to the aggregated rows. If you would rather cap the matched pairs before aggregating, move it up into the query:
MATCH (Movie {genre:"Action"})<-[:ACTS_IN]-(Person)
WITH Person, Movie
LIMIT 100
WITH Person.name AS name, Movie.genre AS genre, COLLECT(Movie.title) AS titles
RETURN name, genre, titles
ORDER BY SIZE(titles) DESC
Aside: to make your queries perform well you should have an Index on the Movie.genre property and you should introduce labels for Movie and Person.
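A minimal sketch of what the aside suggests, assuming Neo4j 4.0 or later index syntax. The index name movie_genre and the variable names m and p are hypothetical, chosen only for illustration:

```cypher
// Index on the genre property of nodes labelled Movie.
// (Neo4j 2.x and 3.x used the older form: CREATE INDEX ON :Movie(genre))
CREATE INDEX movie_genre FOR (m:Movie) ON (m.genre);

// Same query as above, with labels on both nodes so the planner
// can start from the indexed Movie.genre lookup.
MATCH (m:Movie {genre: "Action"})<-[:ACTS_IN]-(p:Person)
RETURN p.name AS name, m.genre AS genre, count(m.title) AS cnt
ORDER BY cnt DESC
LIMIT 100;
```

The two statements are meant to be run separately: the index once, the query as often as needed.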

Related

Find n most referenced records by foreign_key in related table

I have a table skills and a table programs_skills that references skill_id as a foreign key. I want to retrieve the 10 skills that appear most often in programs_skills (that is, count the occurrences of each skill_id in programs_skills and sort by that count in descending order).
I wrote this in my skill model:
def self.most_used(limit)
  Skill.find(
    ActiveRecord::Base.connection.execute(
      'SELECT programs_skills.skill_id, count(*) FROM programs_skills GROUP BY skill_id ORDER BY count DESC'
    ).to_a.first(limit).map { |record| record['skill_id'] }
  )
end
This works, but I would like to find a more elegant, performant, "ActiveRecord-like" way to perform this query.
Could you help me rewrite this query?
Just replace your query with:
WITH T AS
(
    SELECT skill_id, COUNT(*) AS NB, RANK() OVER(ORDER BY COUNT(*) DESC) AS RNK
    FROM programs_skills
    GROUP BY skill_id
)
SELECT skill_id, NB
FROM T
WHERE RNK <= 10
This uses a CTE and a window function.
ProgramsSkills.select("skill_id, COUNT(*) AS nb_skills")
.group(:skill_id).order("nb_skills DESC").limit(limit)
.first(limit).pluck(:skill_id)

How to limit overall union in cypher

I have these nodes:
user{user_id}: users
thread{thread_id, post_date} : posts
tag_id{tag_id}: the tag of the post
And these relationships:
(user) - [: FOLLOWED] -> (tag) // the user follows the tag
(thread) - [: BELONG_TO] -> (tag) // the post belongs to tag
(user) - [: READ{read_date}] -> (thread) // user reads the post
(user) - [: BEING_REPLIED{post_date}] -> (thread) // the user is given a reply by another user to his / her comment in a post
(user) - [: BEING_MENTIONED{post_date}] -> (thread) // the user is given a mention by another user comment in a post
I want to get 10 posts for each user's feed: first the posts in which the user was replied to or mentioned by another user, then the posts that belong to a tag the user follows but that the user has not read. I use multiple UNIONs in the query, but I cannot limit the total: the LIMIT applies only to the last part of the union.
I wrote cypher as follows:
MATCH (u:User {user_id:3})-[rp:BEING_REPLIED]->(th:Thread)<-[r:READ]-(u:User {user_id:3})
WHERE rp.post_date> r.read_date
return u.user_id as user_id,th.thread_id as thread_id,
duration.inDays(datetime(),datetime(rp.post_date)).days*10 + 1000000 AS point
UNION ALL
MATCH (u:User {user_id:3})-[m:BEING_MENTIONED]->(th:Thread)<-[r:READ]-(u:User {user_id:3})
WHERE m.post_date> r.read_date
return u.user_id as user_id,th.thread_id as thread_id,
duration.inDays(datetime(),datetime(m.post_date)).days*10 + 1000000 AS point
UNION ALL
MATCH (u:User {user_id:3})-[m:BEING_MENTIONED]->(th:Thread)
WHERE NOT EXISTS ((u)-[:READ]->(th))
return u.user_id as user_id,th.thread_id as thread_id,
duration.inDays(datetime(),datetime(m.post_date)).days*10 + 1000000 AS point
MATCH (u:User)-[:FOLLOWED]->(t:Tag)<-[:BELONG_TO]->(th)
WHERE u.user_id = 3 AND NOT EXISTS((u)-[]->(th))
WITH u.user_id AS user_id, th.thread_id AS thread_id,
(0.5*th.like_count + 0.3*th.comment_count + 0.005*th.view_count
+ duration.inDays(datetime(),datetime(th.published_date)).days*100) AS point
ORDER BY point desc
RETURN DISTINCT user_id, thread_id, point
UNION
MATCH (u:User)-[:FOLLOWED]->(t:Tag)<-[:BELONG_TO]->(th)
WHERE u.user_id = 3 AND NOT EXISTS((u)-[]->(th))
AND NOT th.rating_total IS NULL
WITH u.user_id AS user_id, th.thread_id AS thread_id,
(duration.inDays(datetime(),datetime(th.published_date)).days*150 + 30*th.rating_total) AS point
ORDER BY point desc, th.published_date desc
RETURN DISTINCT user_id, thread_id, point
LIMIT 10
How can I apply the limit to the query as a whole?
Thanks for your help!
You need a subquery for this, which requires Neo4j 4.0.x or later; subqueries allow post-UNION processing.
Put the UNION ALL branches inside the subquery and the LIMIT 10 outside it, and you should get what you want.
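The shape of such a query can be sketched as follows. This is a simplified illustration, not a rewrite of the full query from the question: it keeps only two branches, and the point expressions are placeholders rather than the original scoring formulas.

```cypher
CALL {
  // Branch 1: threads in which the user was replied to
  MATCH (u:User {user_id: 3})-[:BEING_REPLIED]->(th:Thread)
  RETURN u.user_id AS user_id, th.thread_id AS thread_id,
         1000000 AS point            // placeholder score
  UNION ALL
  // Branch 2: unread threads under a tag the user follows
  MATCH (u:User {user_id: 3})-[:FOLLOWED]->(:Tag)<-[:BELONG_TO]-(th:Thread)
  WHERE NOT (u)-[:READ]->(th)
  RETURN u.user_id AS user_id, th.thread_id AS thread_id,
         0.5 * coalesce(th.like_count, 0) AS point   // placeholder score
}
// Post-UNION processing: sort and limit across all branches together
RETURN user_id, thread_id, point
ORDER BY point DESC
LIMIT 10
```

Every branch must return the same column names. Because UNION ALL keeps duplicates, a thread matched by more than one branch would appear more than once unless it is deduplicated in the outer RETURN.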

Neo4j Match with multiple relationships

I need a MATCH where either relationship is true. I understand the (person1)-[:r1|:r2]-(person2) syntax. The problem I am having is that one of the MATCH patterns traverses through another node, i.e.:
(p1:person)-[:FRIEND]-(p2:person)-[:FRIEND]-(p3:person)
So I want this kind of logic. The enemy of my enemy is my friend. And my friend is my friend. Output list of all the names who are my friend. I also limit the relationship to a particular value.
Something like:
MATCH (p1:Person)-[:ENEMY{type:'human'}]-(myEnemy:Person)-[enemy2:ENEMY{type:'human'}]-(myFriend:Person)
OR (p1:Person)-[friend:FRIEND{type:'human'}]-(myFriend:Person)
RETURN p1.name, myFriend.name
I need one list that I can then do aggregation on.
This is my first posting....so if my question is a mess...hit me with your feedback and I will clarify :)
You can use the UNION clause to combine 2 queries and also remove duplicate results:
MATCH (p:Person)-[:ENEMY{type:'human'}]-(:Person)-[:ENEMY{type:'human'}]-(f:Person)
WHERE ID(p) < ID(f)
RETURN p.name AS pName, f.name AS fName
UNION
MATCH (p:Person)-[:FRIEND{type:'human'}]-(f:Person)
WHERE ID(p) < ID(f)
RETURN p.name AS pName, f.name AS fName
The ID(p) < ID(f) filtering is done to avoid having the same pair of Person names being returned twice (in reverse order).
[UPDATE]
To get a count of how many friends each Person has, you can take advantage of the new CALL subquery syntax (in neo4j 4.0) to do post-union processing:
CALL {
MATCH (p:Person)-[:ENEMY{type:'human'}]-(:Person)-[:ENEMY{type:'human'}]-(f:Person)
WHERE ID(p) < ID(f)
RETURN p.name AS pName, f
UNION
MATCH (p:Person)-[:FRIEND{type:'human'}]-(f:Person)
WHERE ID(p) < ID(f)
RETURN p.name AS pName, f
}
RETURN pName, COUNT(f) AS friendCount

Neo4j cypher ALL IN

I have the following Cypher query:
MATCH (genre:Genre)<-[:BELONGS_TO]-(t:Title)
WHERE genre.name IN ["Comedy", "Drama"]
RETURN t
Which returns titles that belong to Comedy OR Drama genres.
How to change this query in order to return all titles that belong to Comedy AND Drama genres?
SIZE is your friend.
MATCH (t:Title)-[:BELONGS_TO]->(g:Genre)
WHERE g.name IN ["Comedy", "Drama"]
WITH t, COLLECT(g) AS g
WHERE SIZE(g) >= x
RETURN t
where x is the number of elements in the IN list (2 in this example).
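A variant that avoids hard-coding x is sketched below. It assumes the genre names are unique, and the variable names wanted and matched are hypothetical:

```cypher
// Keep the list in one place so the count threshold follows it automatically
WITH ["Comedy", "Drama"] AS wanted
MATCH (t:Title)-[:BELONGS_TO]->(g:Genre)
WHERE g.name IN wanted
// Count distinct matching genres per title
WITH t, wanted, count(DISTINCT g) AS matched
// A title qualifies only if it matched every genre in the list
WHERE matched = size(wanted)
RETURN t
```

Counting distinct genres guards against a title that has two BELONGS_TO relationships to the same genre being counted twice.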

match in clause in cypher

How can I do a match with an IN clause in Cypher?
E.g. I'd like to find movies with ids 1, 2, or 3.
match (m:movie {movie_id:("1","2","3")}) return m
If you were going against an auto index, the syntax was
START n=node:node_auto_index('movie_id:("123", "456", "789")')
How is this different with a MATCH clause?
The idea is that you can do:
MATCH (m:movie)
WHERE m.movie_id in ["1", "2", "3"]
However, this will not use the index as of 2.0.1. This is a missing feature in the new label indexes that I hope will be resolved soon. https://github.com/neo4j/neo4j/issues/861
I've found a (somewhat ugly) temporary workaround for this.
The following query doesn't make use of an index on Person(name):
match (p:Person)... where p.name in ['JOHN', 'BOB'] return ...;
So one option is to repeat the entire query n times:
match (p:Person)... where p.name = 'JOHN' return ...
union
match (p:Person)... where p.name = 'BOB' return ...
If this is undesirable then another option is to repeat just a small query for the id n times:
match (p:Person) where p.name ='JOHN' return id(p)
union
match (p:Person) where p.name ='BOB' return id(p);
and then perform a second query using the results of the first:
match (p:Person)... where id(p) in [8,16,75,7] return ...;
Is there a way to combine these into a single query? Can a union be nested inside another query?
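On Neo4j 4.0 or later, the CALL subquery syntax does allow a UNION to be nested inside a larger query. A minimal sketch, assuming Person nodes with a name property; the outer RETURN is a stand-in for the rest of the elided query:

```cypher
CALL {
  // One equality lookup per name, combined with UNION
  MATCH (p:Person) WHERE p.name = 'JOHN' RETURN p
  UNION
  MATCH (p:Person) WHERE p.name = 'BOB' RETURN p
}
// The remainder of the query continues from the combined p rows
RETURN p.name
```

This does not work on the 2.0.x versions discussed above, where the two-step approach remains the workaround.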
