I have a graph db of Node User (properties: uid, name) and Relationship Invitation (properties: invitation_id, invitation_time).
The relationship is built when one user invite other users. That means every time one user invites, it will build the same relationship between him and the users he invited.
I want to count the unique invitations of each user.
My cyper query is:
match (u:User)-[r:Invitation]->()
return u, count(distinct r)
order by count(distinct r) desc
Instead of meet my expectation, this query did not drop the duplicates.
So what should be the right query?
I got the answer by myself just after posting the question:
match (u:User)-[r:Invitation]->()
return u, count(distinct r.invitation_id)
order by count(distinct r.invitation_id) desc
Related
I'm trying to find all the relationships of the nodes which have one specific relationship. People can be connected to events which in turn are connected to churches. I'm interested in the people who are connected as witnesses to events (marriages) in the following manner:
(p:person)-[:ACTED_AS_BEKENDE]-(e:event)
What I'm struggling with is that when I write a simple MATCH statement with a WHERE clause (see below), I only get the events to which people were connected via this specific relationship.
MATCH (p:person)--(e:event)--(c:church)
WHERE (p:person)-[:ACTED_AS_BEKENDE]-(e:event)
RETURN distinct p.ID AS ID, p.Name AS NAME, labels(e) AS Event_name, e.Event_year AS year, labels(c) AS Church ORDER BY e.Event_year ASC
To reiterate: I need a query which first selects the people who are tied to events via the [:ACTED_AS_BEKENDE] edge and then retrieves all the events to which these people were tied.
Do you need something like this?
MATCH (p:person)-[:ACTED_AS_BEKENDE]-(:event)
WITH p
MATCH (p)--(e:event)--(c:church)
RETURN distinct p.ID AS ID, p.Name AS NAME, labels(e) AS Event_name, e.Event_year AS year, labels(c) AS Church ORDER BY e.Event_year ASC
This will first find all persons that are ACTED_AS_BEKENDE, and for them it will find the events and churches as you wanted
The below query is taken from neo4j movie review dataset sandbox:
MATCH (u:User {name: "Some User"})-[r:RATED]->(m:Movie)
WITH u, avg(r.rating) AS mean
MATCH (u)-[r:RATED]->(m:Movie)-[:IN_GENRE]->(g:Genre)
WHERE r.rating > mean
WITH u, g, COUNT(*) AS score
MATCH (g)<-[:IN_GENRE]-(rec:Movie)
WHERE NOT EXISTS((u)-[:RATED]->(rec))
RETURN rec.title AS recommendation, rec.year AS year, COLLECT(DISTINCT g.name) AS genres, SUM(score) AS sscore
ORDER BY sscore DESC LIMIT 10
what I can not understand is: why the DISTINCT keyword is required in the query's return statement?. Because the expected results from the last MATCH statement is something like this:
g1,x
g1,y
...
g2,z
g2,v
g2,m
...
gn,m
gn,b
gn,x
where g1,g2,..gn are the set of genres and x,y,z,v,m,b... are a set of movies (in addition there is a user and score column deleted for readability).
So according to my understanding what this query is returning: For each movie return its genres and the sum of their scores.
Assumptions:
Every Movie has a unique title. (This is required for the query to work as is.)
Every Genre has a unique name.
Every Movie has at most one IN_GENRE relationship to each distinct Genre.
Given the above assumptions, you are correct that the DISTINCT is not necessary. That is because the RETURN clause is using rec.title as one of the aggregation grouping keys.
Suppose tha I have the default database Movies and I want to find the total number of people that have participated in each movie, no matter their role (i.e. including the actors, the producers, the directors e.t.c.)
I have already done that using the query:
MATCH (m:Movie)<-[r]-(n:Person)
WITH m, COUNT(n) as count_people
RETURN m, count_people
ORDER BY count_people DESC
LIMIT 3
Ok, I have included some extra options but that doesn't really matter in my actual question. From the above query, I will get 3 movies.
Q. How can I enrich the above query, so I can get a graph including all the relationships regarding these 3 movies (i.e.DIRECTED, ACTED_IN,PRODUCED e.t.c)?
I know that I can deploy all the relationships regarding each movie through the buttons on each movie node, but I would like to know whether I can do so through cypher.
Use additional optional match:
MATCH (m:Movie)<--(n:Person)
WITH m,
COUNT(n) as count_people
ORDER BY count_people DESC
LIMIT 3
OPTIONAL MATCH p = (m)-[r]-(RN) WHERE type(r) IN ['DIRECTED', 'ACTED_IN', 'PRODUCED']
RETURN m,
collect(p) as graphPaths,
count_people
ORDER BY count_people DESC
How can I quickly count the number of "posts" made by one person and group them by person in a cypher query?
Basically I have message label nodes and users that posted (Relationship) those messages. I want to count the number of messages posted by each user.
Its a group messages by sender ID and count the number of messages per user.
Here is what I have so far...
START n=node(*) MATCH (u:User)-[r:Posted]->(m:Message)
RETURN u, r, count(r)
ORDER BY count(r)
LIMIT 10
How about this?
MATCH (u:User)-[r:POSTED]->(m:Message)
RETURN id(u), count(m)
ORDER BY count(m)
Have you had a chance to check out the current reference card?
https://neo4j.com/docs/cypher-refcard/current/
EDIT:
Assuming that the relationship :POSTED is only used for posts then one could do something like this instead
MATCH (u:User {name: 'my user'})
RETURN u, size((u)-[:POSTED]->())
This is significantly cheaper as it does not force a traversal to the actual Message.
I'm moving my complex user database where users can be on one of many teams, be friends with each other and more to Neo4j. Doing this in a RDBMS was painful and slow, but is simple and blazing with Neo4j. :)
I was hoping there is a way to query for
a relationship that is 1 hop away and
another relationship that is 2 hops away
from the same query.
START n=node:myIndex(user='345')
MATCH n-[:IS_FRIEND|ON_TEAM*2]-m
RETURN DISTINCT m;
The reason is that users that are friends are one edge from each other, but users linked by teams are linked through that team node, so they are two edges away. This query does IS_FRIEND*2 and ON_TEAM*2, which gets teammates (yeah) and friends of friends (boo).
Is there a succinct way in Cypher to get both differing length relations in a single query?
I rewrote it to return a collection:
start person=node(1)
match person-[:IS_FRIEND]-friend
with person, collect(distinct friend) as friends
match person-[:ON_TEAM*2]-teammate
with person, friends, collect(distinct teammate) as teammates
return person, friends + filter(dupcheck in teammates: not(dupcheck in friends)) as teammates_and_friends
http://console.neo4j.org/r/oo4dvx
thanks for putting together the sample db, Werner.
I have created a small test database at http://console.neo4j.org/?id=sqyz7i
I have also created a query which will work as you described:
START n=node(1)
MATCH n-[:IS_FRIEND]-m
WITH collect(distinct id(m)) as a, n
MATCH n-[:ON_TEAM*2]-m
WITH collect(distinct id(m)) as b, a
START n=node(*)
WHERE id(n) in a + b
RETURN n