It seems the following WHERE clause does not work because the query contains two relationships (WorksAt and ResponsibleFor). With only one relationship it would work like magic. The query below returns all the courses in the Science department, but it does not filter out courses NOT taught by Maria Smith. All I want is to get only the courses taught by Maria Smith, who works in the Science department.
I came across the WITH and START clauses, which look like candidates to make this work, since they let you filter one part of the query before passing it to another.
http://neo4j.com/docs/stable/query-with.html
but I haven't been able to grasp the concept yet. Is anyone up for helping?
MATCH (d:Department)<-[w:WorksAt]-(t:Tutor)-[r:ResponsibleFor]->(c:Courses)
WHERE d.name='Science'
AND t.name='Maria Smith'
return c,r
There are a number of ways to skin this particular cat. Let's break it down.
Find the tutor whose name is 'Maria Smith' that works in the 'Science' department
MATCH (d:Department)<-[:WorksAt]-(t:Tutor)
WHERE d.name = 'Science' AND t.name = 'Maria Smith'
RETURN t
Find the courses that a tutor teaches
MATCH (t:Tutor)-[:ResponsibleFor]->(c:Courses)
RETURN t.name, c
Bring these two together to get the courses that Maria Smith from the Science department teaches
MATCH (d:Department)<-[:WorksAt]-(t:Tutor)
WHERE d.name = 'Science' AND t.name = 'Maria Smith'
WITH t
MATCH (t)-[r:ResponsibleFor]->(c:Courses)
RETURN t.name, r, c
This can also be written as
MATCH (d:Department { name : 'Science' })<-[:WorksAt]-(t:Tutor { name : 'Maria Smith' })
WITH t
MATCH (t)-[r:ResponsibleFor]->(c:Courses)
RETURN t.name, r, c
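To see what the WITH step is doing, here is a rough plain-Python model of the two-stage query. The sample data and the names works_at, responsible_for and courses_for are made up for illustration; this is not Neo4j code, only a sketch of the filter-then-expand logic.

```python
# Hypothetical sample data standing in for the graph (not from the original post).
works_at = {"Maria Smith": "Science", "John Doe": "Science", "Ann Lee": "Arts"}
responsible_for = {
    "Maria Smith": ["Physics", "Chemistry"],
    "John Doe": ["Biology"],
    "Ann Lee": ["Painting"],
}


def courses_for(department, tutor_name):
    # Stage 1 (first MATCH ... WHERE, then WITH t): keep only matching tutors.
    tutors = [t for t, d in works_at.items() if d == department and t == tutor_name]
    # Stage 2 (second MATCH): expand only the tutors that survived stage 1.
    return [(t, c) for t in tutors for c in responsible_for.get(t, [])]


print(courses_for("Science", "Maria Smith"))
```

Because the second stage only ever sees the tutors kept by the first, courses taught by other Science tutors never appear in the result.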
To maximise query performance you can use schema indexes to quickly locate your Department and Tutor nodes. Are you doing this? To create the indexes use
CREATE INDEX ON :Department(name)
CREATE INDEX ON :Tutor(name)
Run these lines separately.
As an aside, if you wanted to list the courses that each tutor teaches, as in the second query above, you could use the following query to aggregate the courses for each tutor.
MATCH (t:Tutor)-[:ResponsibleFor]->(c:Courses)
RETURN t.name as CourseTutor, collect(c.name) as CourseName
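For anyone unsure what collect() does here, this is a small plain-Python sketch of the same grouping. The rows and the helper name collect_courses are hypothetical; it only models the aggregation, not Neo4j itself.

```python
from collections import defaultdict

# Hypothetical (tutor, course) rows, as the MATCH would produce them.
rows = [
    ("Maria Smith", "Physics"),
    ("John Doe", "Biology"),
    ("Maria Smith", "Chemistry"),
]


def collect_courses(rows):
    # Group by tutor name and gather course names, like collect(c.name).
    grouped = defaultdict(list)
    for tutor, course in rows:
        grouped[tutor].append(course)
    return dict(grouped)


print(collect_courses(rows))
```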
Hope this helps.
Nice breakdown. For performance details on this type of query, refer to Wes Freeman's Pragmatic Cypher Optimization. In setting up the match, start with the smaller node set and work toward the larger (Wes's Rule 4).
Related
I have constructed a query to find the people who follow each other and who have read books in the same genre. Here it is:
MATCH (u1:User)-[:READ]->(b1:Book)
WITH collect(DISTINCT b1.genre) AS genres,u1 AS user1
MATCH (u2:User)-[:READ]->(b2:Book)
WHERE (user1)<-[:FOLLOWS]->(u2) AND b2.genre IN genres
RETURN DISTINCT user1.username AS user1,u2.username AS user2
The idea is that we collect all the book genres for one of them, and if a book read by the other is in that list of genres (and they follow each other), then we return those users. This seems to work: we get a list of distinct pairs of individuals. I wonder, though, if there is a quicker way to do this? My solution seems somewhat clumsy, but I found it surprisingly finicky trying to specify that they have read a book in the same genre without getting back all the pairs of books and duplicating individuals. For example, I first wrote the following:
MATCH (b1:Book)<-[:READ]-(u1:User)-[:FOLLOWS]-(u2:User)-[:READ]->(b2:Book)
WHERE b1.genre = b2.genre
RETURN DISTINCT u1.username AS user1, u2.username AS user2
Which seems simpler, but in fact it returned repeated names for all the books that were read in the same genre. Is my solution the simplest, or is there a simpler one?
This is one way of rewriting the query
MATCH (n1:User)-[:FOLLOWS]-(n2:User)
MATCH (n1)-[:READ]->(book), (n2)-[:READ]->(book2)
WHERE book.genre = book2.genre
RETURN n1.username, n2.username, count(*)
Here is another, collecting the genres for each user:
MATCH (n1:User)-[:FOLLOWS]-(n2:User)
WITH n1, n2,
[(n1)-[:READ]->(book) | book.genre] AS g1,
[(n2)-[:READ]->(book) | book.genre] AS g2
WHERE ANY(x IN g1 WHERE x IN g2)
RETURN n1, n2, count(*)
Note that query length is not in itself a measure of quality; the way the data is retrieved needs to make sense to you.
Your model, however, clearly shows that you would benefit from a bit of graph refactoring, extracting the genre into its own node, for example:
MATCH (n:Book)
MERGE (g:Genre {name: n.genre})
MERGE (n)-[:HAS_GENRE]->(g)
And this would be the new query which leverages a graph model
PROFILE
MATCH (n1:User)-[:FOLLOWS]-(n2:User)
WHERE (n1)-[:READ]->()-[:HAS_GENRE]->()<-[:HAS_GENRE]-()<-[:READ]-(n2)
RETURN n1.username, n2.username, count(*)
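As a rough illustration of what the refactored query checks, here is a plain-Python sketch over made-up data. The names follows, read, genre_of and pairs_sharing_genre are hypothetical, and unlike the Cypher query (which can report a pair once per direction and per matching path) this sketch reports each pair once.

```python
# Hypothetical sample data (not from the original post).
follows = {("alice", "bob"), ("bob", "carol")}
read = {"alice": ["Dune", "Emma"], "bob": ["Neuromancer"], "carol": ["Persuasion"]}
genre_of = {
    "Dune": "sci-fi",
    "Neuromancer": "sci-fi",
    "Emma": "romance",
    "Persuasion": "romance",
}


def genres(user):
    # Set of genres reached via READ and then HAS_GENRE.
    return {genre_of[book] for book in read.get(user, [])}


def pairs_sharing_genre():
    # FOLLOWS is treated as undirected; each pair is stored in sorted order.
    result = set()
    for a, b in follows:
        if genres(a) & genres(b):
            result.add(tuple(sorted((a, b))))
    return sorted(result)


print(pairs_sharing_genre())
```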
I need to find out what publications have collaborated with a university.
I am new to neo4j and I'm not sure exactly how to go about it. I tried the query below, but it only returns records from within that university. I need to pull in the other universities that have collaborated.
This is what I have tried:
MATCH (f:FACULTY)-[p:PUBLISH]->(P:PUBLICATION),(f)<-[a:AFFILIATION_WITH]-(i:INSTITUION)
WHERE i.name = "UNIVERSITY_NAME"
RETURN i.name;
Also here is a description of the graph:
If I got your question right, you're almost there:
MATCH (i:INSTITUTION)<-[:AFFILIATION_WITH]-(:FACULTY)-[:PUBLISH]->(p:PUBLICATION)
WHERE i.name = "UNIVERSITY_NAME"
RETURN p.name;
Here, I am traversing the graph from an institution (university) down to the publication, through the faculty, and returning the name of the publication.
A couple of notes compared to your initial try:
The direction of the relationship Institution -> Faculty in your query does not match the one on your graph
You want to return publication names, not institutions
I'm new to neo4j and I'm struggling with the task of building a simple filter.
I played around and found the IN operator, but it lists every "Person" for which at least one match is found. I want to list only the "Person" nodes that match every item in the list.
MATCH (p:Person)-[l:LIKES]->(f:Food) WHERE f.name in ["Spaghetti","Cheese","Chicken","Eggs"]
RETURN p
Desired result: show only the "Person" nodes that like all of "Spaghetti","Cheese","Chicken","Eggs", ...
We have a knowledge base article on performing match intersection that should address this.
Applied to your case, here's one of the approaches you can use:
WITH ["Spaghetti","Cheese","Chicken","Eggs"] as foods
MATCH (p:Person)-[:LIKES]->(f:Food)
WHERE f.name in foods
WITH p, foods, count(f) as foodsLiked
WHERE foodsLiked = size(foods)
RETURN p
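The counting trick can be sketched in plain Python as follows. This is only a model of the logic, under the assumptions that the foods list has no duplicates and each person likes a given food at most once; the sample data and the name likes_all are hypothetical.

```python
def likes_all(likes, foods):
    wanted = set(foods)
    result = []
    for person, liked in likes.items():
        # count(f) over the rows that pass "f.name IN foods"
        foods_liked = sum(1 for f in liked if f in wanted)
        # Keep the person only if every requested food was matched.
        if foods_liked == len(foods):
            result.append(person)
    return result


# Hypothetical sample data (not from the original post).
likes = {
    "Ann": ["Spaghetti", "Cheese", "Chicken", "Eggs", "Rice"],
    "Bob": ["Spaghetti", "Cheese"],
}
print(likes_all(likes, ["Spaghetti", "Cheese", "Chicken", "Eggs"]))
```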
I have a graph database that contains a pattern like this one:
(n1)-[:a]->(n2),
(n1)-[:b]->(n2),
(n1)-[:c]->(n2),
(n1)-[:e]->(n2),
(n1)-[:d]->(n3),
(n2)-[:b]->(n4)
And I want to get all the graphs matching this pattern:
MATCH p={
(n3)<-[:d]-(n1)-[:a]->(n2)-[:b]->(n4),
(n1)-[:b]->(n2)<-[:c]-(n1),
(n1)-[:e]->(n2)
}
RETURN p
Is it possible? I've searched a little but I haven't found how to do it.
I know we can use "|" for a type like this
()-[:a|b]->()
but there is no "&", and path assignment only works on patterns that are written without ",".
Thanks
EDIT:
If it could help, here is another example of what I'm seeking:
In a database with movies, persons and relationships like ACTED_IN, KNOWS, FRIEND and HATE,
I want all the graphs containing an actor "Actor1" (who ACTED_IN a movie "M") who KNOWS "Person1", is a FRIEND of "Person2" and HATEs "Person3", who ACTED_IN the same movie "M".
A UNION like the one in Michael Hunger's answer does not work because it yields multiple subgraphs rather than graphs. Moreover, some of those subgraphs might not be correct answers for the bigger pattern.
Your query will be very inefficient, as you don't restrict your search to a set of start nodes, either with labels or with label+property combinations.
You can use UNION for that:
MATCH p=(n3)<-[:d]-(n1)-[:a]->(n2)-[:b]->(n4) RETURN p
UNION
MATCH p=(n1)-[:b]->(n2)<-[:c]-(n1) RETURN p
UNION
MATCH p=(n1)-[:e]->(n2) RETURN p
Suppose I have two kinds of nodes, Person and Competency. They are related by a KNOWS relationship. For example:
(:Person {id: 'thiago'})-[:KNOWS]->(:Competency {id: 'neo4j'})
How do I query this schema to find every Person that knows all the nodes in a set of Competency nodes?
Suppose that I need to find every Person that knows "java" and "haskell", and I'm only interested in the nodes that know all of the listed Competency nodes.
I've tried this query:
match (p:Person)-[:KNOWS]->(c:Competency) where c.id in ['java','haskell'] return p.id;
But I get back a list of every Person that knows either "java" or "haskell", with duplicated entries for those who know both.
Adding a count(c) at the end of the query eliminates the duplicates:
match (p:Person)-[:KNOWS]->(c:Competency) where c.id in ['java','haskell'] return p.id, count(c);
Then, in this particular case, I can iterate over the result and filter out rows whose count is less than two to get the nodes I want.
I've found out that I could do it appending consecutive match clauses to keep filtering the nodes to get the result I want, in this case:
match (p:Person)-[:KNOWS]->(:Competency {id:'haskell'})
match (p)-[:KNOWS]->(:Competency {id:'java'})
return p.id;
Is this the only way to express this query? I mean, do I need to build the query by concatenating strings? I'm looking for a fixed query with parameters.
with ['java','haskell'] as skills
match (p:Person)-[:KNOWS]->(c:Competency)
where c.id in skills
with p, count(*) as c1, size(skills) as c2
where c1 = c2
return p.id
One thing you can do is count all the skills, then find the users whose number of skill relationships equals that count:
MATCH (n:Skill) WITH count(n) as skillMax
MATCH (u:Person)-[:HAS]->(s:Skill)
WITH u, count(s) as skillsCount, skillMax
WHERE skillsCount = skillMax
RETURN u, skillsCount
Chris
Untested, but this might do the trick:
match (p:Person)-[:KNOWS]->(c:Competency)
with p, collect(c.id) as cs
where all(x in ['java', 'haskell'] where x in cs)
return p.id;
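One difference between this membership check and the count-based answers is worth spelling out: if a Person happens to have two KNOWS relationships to the same Competency, the MATCH produces two rows, so a plain count can reach the target without every competency being present. A plain-Python sketch of both checks, with hypothetical data and function names:

```python
def knows_all_by_count(knows, required):
    # Mirrors "count(*) = size(skills)": counts matching rows, duplicates included.
    return [
        p for p, cs in knows.items()
        if sum(1 for c in cs if c in required) == len(required)
    ]


def knows_all_by_membership(knows, required):
    # Mirrors "all(x in required where x in cs)": each required id must be present.
    return [p for p, cs in knows.items() if all(x in cs for x in required)]


# Hypothetical data: "dup" has two KNOWS rows for java and none for haskell.
knows = {"thiago": ["java", "haskell", "neo4j"], "dup": ["java", "java"]}
required = ["java", "haskell"]

print(knows_all_by_count(knows, required))
print(knows_all_by_membership(knows, required))
```

If duplicate relationships are possible in your data, counting distinct competencies (count(DISTINCT c) in Cypher) closes the gap for the count-based variants.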
How about this...
WITH ['java','haskell'] AS comp_col
MATCH (p:Person)-[:KNOWS]->(c:Competency)
WHERE c.id in comp_col
WITH comp_col
, p
, count(*) AS total
WHERE total = size(comp_col)
RETURN p.id, total
Put the competencies you want in a collection.
Match all the people that have any of those competencies.
Count the competencies per person and keep only the people whose count equals the size of the collection from the start.
I think this will work for what you need, but if you are building these queries programmatically, the best performance might come from successive match clauses. Especially if you knew which competencies were most and least common when building your queries, you could order the matches so that the least common came first and the most common last. I think that would narrow down to your desired persons the fastest.
It would be interesting to see what the plan analyzer in the shell says about the different approaches.
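The ordering idea (rarest competency first) can be sketched in plain Python as successive set intersections. This is only a model of the reasoning, not a claim about how Neo4j plans the query; the data and the name people_knowing_all are hypothetical.

```python
def people_knowing_all(knowers_by_competency, required):
    # Visit the rarest competency first so the candidate set shrinks early.
    ordered = sorted(required, key=lambda c: len(knowers_by_competency.get(c, ())))
    candidates = None
    for competency in ordered:
        knowers = set(knowers_by_competency.get(competency, ()))
        candidates = knowers if candidates is None else candidates & knowers
        if not candidates:
            break  # nobody can satisfy the remaining competencies
    return candidates or set()


# Hypothetical data: java is common, haskell is rare.
knowers_by_competency = {
    "java": {"ana", "bruno", "carla", "diego"},
    "haskell": {"bruno", "elisa"},
}
print(people_knowing_all(knowers_by_competency, ["java", "haskell"]))
```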