How to group and count relationships in cypher neo4j - neo4j

How can I quickly count the number of "posts" made by one person and group them by person in a cypher query?
Basically I have message label nodes and users that posted (Relationship) those messages. I want to count the number of messages posted by each user.
Its a group messages by sender ID and count the number of messages per user.
Here is what I have so far...
START n=node(*) MATCH (u:User)-[r:Posted]->(m:Message)
RETURN u, r, count(r)
ORDER BY count(r)
LIMIT 10

How about this?
MATCH (u:User)-[r:POSTED]->(m:Message)
RETURN id(u), count(m)
ORDER BY count(m)
Have you had a chance to check out the current reference card?
https://neo4j.com/docs/cypher-refcard/current/
EDIT:
Assuming that the relationship :POSTED is only used for posts then one could do something like this instead
MATCH (u:User {name: 'my user'})
RETURN u, size((u)-[:POSTED]->())
This is significantly cheaper as it does not force a traversal to the actual Message.

Related

Return count of relationships instead of all

I wonder if anyone can advise how to adjust the following query so that it returns one relationship with a count of the number of actual relationships rather than every relationship? I have some nodes with many relationships and it's killing the graph's performance.
MATCH (p:Provider{countorig: "XXXX"})-[r:supplied]-(i:Importer)
RETURN p, i limit 100
Many thanks
To return the relationship name along with a count, change your "return" statement, like this:
MATCH (p:Provider{countorig: "XXXX"})-[r:supplied]-(i:Importer)
RETURN type(r), count(r)
Using type(r) will return the type of the relationship, which looks to be "supplied" in your example. And then count(r) is just using the built-in function to count the number of occurrences of that relationship in the query.

Using UNWIND on a list I created to return multiple values (Cypher)

I am using the "Movies" database in Neo4j to simplify my question (type :play movies in the query box of an empty sandbox). For a list of 3 actors that I specify, I want to determine the total number of movies they've worked on, the number of movies they've acted in, and the number of movies they've directed (if any). Here is what I came up with:
MATCH (p:Person)-->(m:Movie)
WITH p, m, count(m) AS total
MATCH (p)-[:ACTED_IN]->(m)
WITH p, m, total, count(DISTINCT m) AS actedIn
MATCH (p)-[:DIRECTED]->(m)
WITH p, m, total, actedIn, count(DISTINCT m) AS directed
UNWIND ["Tom Hanks", "Clint Eastwood", "Charlize Theron"] AS actors
RETURN DISTINCT actors, total, actedIn, directed
Currently, it is retuning that each actor acted in 1 movie and directed 1 movie, which is incorrect. I need to keep the WITH clauses in the query and I need to define the list of actors.
In the real query I am working on that compares to this simpler one, the same thing is happening where each element of the list I defined returns the same numbers as the other elements in the list. I am not sure what I am doing wrong here.
I think this query will work for you.
Since every person has been involved in a movie in some capacity the first MATCH can asser that and then the subsequent ones can be optional.
// Find the people that worked in total movies controlled by your list
MATCH (p:Person)-->(m:Movie)
WHERE p.name IN ["Tom Hanks", "Clint Eastwood", "Charlize Theron"]
// carry the people and the total movies per person
WITH p, count(m) AS total
// find the movies those people acted in
OPTIONAL MATCH (p)-[:ACTED_IN]->(m:Movie)
// carry the people, total movies and the movies acted in
WITH p, total, count(m) AS actedIn
// find the movies they directed
OPTIONAL MATCH (p)-[:DIRECTED]->(m:Movie)
RETURN p.name, total, actedIn, count(m) AS directed

Nodes with relationship to multiple nodes

I want to get the Persons that know everyone in a group of persons which know some specific places.
This:
MATCH (:Place {name:'Breiter Weg'})<-[:knows]-(b:Person)-[:knows]->(:Place {name:'Buchhandel'})
WITH collect(DISTINCT b) as persons
Match (a:Person)
WHERE ALL(b in persons WHERE (a)-[:knows]->(b))
RETURN a
works, but for the second part does a full nodelabelscan, before applying the where clause, which is extremely slow - in a bigger db it takes 8~9 seconds. I also tried this:
MATCH (:Place {name:'Breiter Weg'})<-[:knows]-(b:Person)-[:knows]->(:Place {name:'Buchhandel'})
Match (a:Person)-[:knows]->(b)
RETURN a
This only needs 2ms, however it returns all persons that know any person of group b, instead of those that know everyone.
So my question is: Is there a effective/fast query to get what i want?
We have a knowledge base article for this kind of query that show a few approaches.
One of these is to match to :Persons known by the group, and then count the number of times each of those persons shows up in the results. Provided there aren't multiple :knows relationships between the same two people, if the count is equal to the collection of people from your first match, then that person must know all of the people in the collection.
MATCH (:Place {name:'Breiter Weg'})<-[:knows]-(b:Person)-[:knows]->(:Place {name:'Buchhandel'})
WITH collect(b) as persons
UNWIND persons as b // so we have the entire list of persons along with each person
WITH size(persons) as total, b
MATCH (a:Person)-[:knows]->(b)
WITH total, a, count(a) as knownCount
WHERE total = knownCount
RETURN a
Here is a simpler Cypher query that also compares counts -- the same basic idea used by #InverseFalcon.
MATCH (:Place {name:'Breiter Weg'})<-[:knows]-(b:Person)-[:knows]->(:Place {name:'Buchhandel'}), (a:Person)-[:knows]->(b)
WITH COLLECT({a:a, b:b}) as data, COUNT(DISTINCT b) AS total
UNWIND data AS d
WITH total, d.a AS a, COUNT(d.b) AS bCount
WHERE total = bCount
RETURN a

Neo4j: Query to find the nodes with most relationships, and their connected nodes

I am using Neo4j CE 3.1.1 and I have a relationship WRITES between authors and books. I want to find the N (say N=10 for example) books with the largest number of authors. Following some examples I found, I came up with the query:
MATCH (a)-[r:WRITES]->(b)
RETURN r,
COUNT(r) ORDER BY COUNT(r) DESC LIMIT 10
When I execute this query in the Neo4j browser I get 10 books, but these do not look like the ones written by most authors, as they show only a few WRITES relationships to authors. If I change the query to
MATCH (a)-[r:WRITES]->(b)
RETURN b,
COUNT(r) ORDER BY COUNT(r) DESC LIMIT 10
Then I get the 10 books with the most authors, but I don't see their relationship to authors. To do so, I have to write additional queries explicitly stating the name of a book I found in the previous query:
MATCH ()-[r:WRITES]->(b)
WHERE b.title="Title of a book with many authors"
RETURN r
What am I doing wrong? Why isn't the first query working as expected?
Aggregations only have context based on the non-aggregation columns, and with your match, a unique relationship will only occur once in your results.
So your first query is asking for each relationship on a row, and the count of that particular relationship, which is 1.
You might rewrite this in a couple different ways.
One is to collect the authors and order on the size of the author list:
MATCH (a)-[:WRITES]->(b)
RETURN b, COLLECT(a) as authors
ORDER BY SIZE(authors) DESC LIMIT 10
You can always collect the author and its relationship, if the relationship itself is interesting to you.
EDIT
If you happen to have labels on your nodes (you absolutely SHOULD have labels on your nodes), you can try a different approach by matching to all books, getting the size of the incoming :WRITES relationships to each book, ordering and limiting on that, and then performing the match to the authors:
MATCH (b:Book)
WITH b, SIZE(()-[:WRITES]->(b)) as authorCnt
ORDER BY authorCnt DESC LIMIT 10
MATCH (a)-[:WRITES]->(b)
RETURN b, a
You can collect on the authors and/or return the relationship as well, depending on what you need from the output.
You are very close: after sorting, it is necessary to rediscover the authors. For example:
MATCH (a:Author)-[r:WRITES]->(b:Book)
WITH b,
COUNT(r) AS authorsCount
ORDER BY authorsCount DESC LIMIT 10
MATCH (b)<-[:WRITES]-(a:Author)
RETURN b,
COLLECT(a) AS authors
ORDER BY size(authors) DESC

Get the full graph of a query in Neo4j

Suppose tha I have the default database Movies and I want to find the total number of people that have participated in each movie, no matter their role (i.e. including the actors, the producers, the directors e.t.c.)
I have already done that using the query:
MATCH (m:Movie)<-[r]-(n:Person)
WITH m, COUNT(n) as count_people
RETURN m, count_people
ORDER BY count_people DESC
LIMIT 3
Ok, I have included some extra options but that doesn't really matter in my actual question. From the above query, I will get 3 movies.
Q. How can I enrich the above query, so I can get a graph including all the relationships regarding these 3 movies (i.e.DIRECTED, ACTED_IN,PRODUCED e.t.c)?
I know that I can deploy all the relationships regarding each movie through the buttons on each movie node, but I would like to know whether I can do so through cypher.
Use additional optional match:
MATCH (m:Movie)<--(n:Person)
WITH m,
COUNT(n) as count_people
ORDER BY count_people DESC
LIMIT 3
OPTIONAL MATCH p = (m)-[r]-(RN) WHERE type(r) IN ['DIRECTED', 'ACTED_IN', 'PRODUCED']
RETURN m,
collect(p) as graphPaths,
count_people
ORDER BY count_people DESC

Resources