I am doing some research into graph database systems as part of a POC to possibly move a system across to one. I am really new to the concept and come from an RDBMS background.
I have a model (schema?) that looks like:
Person-[HAS_NAME {type:First|Middle|Last}]-Name
Person-[WAS_BORN_ON]-DateOfBirth
Person-[RESIDES_AT {type:Current|Previous}]-Address
I am able to store this data perfectly using Neo4J and Neo4JClient in a C# application.
Where I am falling flat on my backside is that I want to get out of the store a list of People and all nodes that are connected to each person (e.g. their Names, DateOfBirth and Addresses) where certain conditions are met.
I have started with this:
MATCH (dob:DateOfBirth)-[WAS_BORN_ON]-(person:Person)-[HAS_NAME]-(name:Name) WHERE dob.Id = '1954-05-09' RETURN person, dob, name
And it produces something like this which is great:
But I want to restrict it to people that have a (last) name of "Williams", so I go with this:
MATCH (dob:DateOfBirth)-[WAS_BORN_ON]-(person:Person)-[HAS_NAME]-(name:Name) WHERE dob.Id = '1954-05-09' and name.Value = 'Williams' RETURN person, dob, name
Unfortunately it removes all the other names:
What I actually want is this:
Maybe something like this would work?
MATCH (dob:DateOfBirth{Id: "1954-05-09"})<-[:WAS_BORN_ON]-(person:Person)
WITH dob, person
MATCH (person)-[:HAS_NAME]->(surname:Name{Value: "Williams"})
WITH person, dob
MATCH (person)-[:HAS_NAME]->(name:Name)
RETURN person, dob, name
Edit: Updated query to improve performance as suggested by ulkas.
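To make the logic of the query above easier to see, here is a rough plain-Python sketch of the same three steps (filter by date of birth, keep people who have a matching name, then expand to all of their names). The `people` dictionary and the function name are made up for illustration; this is an in-memory stand-in, not Neo4j or Neo4jClient code.

```python
# Hypothetical toy data standing in for Person/DateOfBirth/Name nodes.
people = {
    "p1": {"dob": "1954-05-09", "names": [("First", "John"), ("Last", "Williams")]},
    "p2": {"dob": "1954-05-09", "names": [("First", "Mary"), ("Last", "Jones")]},
    "p3": {"dob": "1960-01-01", "names": [("First", "Ann"), ("Last", "Williams")]},
}


def people_with_all_names(dob, surname):
    """Return (person, dob, name) rows, one row per name of each matching person."""
    rows = []
    for person_id, person in people.items():
        # Step 1: restrict to the requested date of birth.
        if person["dob"] != dob:
            continue
        # Step 2: keep the person only if one of their names equals `surname`.
        if not any(value == surname for _, value in person["names"]):
            continue
        # Step 3: expand again to every name of the person, not just the match.
        for _, value in person["names"]:
            rows.append((person_id, dob, value))
    return rows


rows = people_with_all_names("1954-05-09", "Williams")
for row in rows:
    print(row)
```

The point is that the surname is only used to decide which people survive; the final expansion over names is a separate step, which is why the other names are not filtered away.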
I would have gone the easy route...
match (dob:DateOfBirth{Id: "1954-05-09"})--(p:Person)--(surname:Name{Value: "Williams"})
optional match (p)--(n:Name)
return dob, p, n
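alternatively with a WHERE clause (the WHERE has to follow the first MATCH; placed after the OPTIONAL MATCH it would only filter the optional part):
match (dob:DateOfBirth)--(p:Person)--(surname:Name)
where dob.Id = "1954-05-09" and surname.Value = "Williams"
optional match (p)--(n:Name)
return dob, p, n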
But then again, I'm not the expert on performance; I focus first on simplicity.
[newbie question] I'm kinda in doubt if it's better using specific relationships or labels, but first let me give you a little bit more context.
Suppose that the graph should be able to answer the following questions/queries:
Given a person, return all the emails associated;
Given a person, return all the contacts;
I've come up with these 2 possible models:
They're both able to fulfill the required requests with the following queries:
First model:
MATCH (p:Person {name:"Bob"})-[:REACHABLE_BY]-(e:Email)
RETURN p.name AS Name, e.contact AS Contact
MATCH (p:Person {name:"Bob"})-[:REACHABLE_BY]-(c:Contact)
RETURN p.name AS Name, c.contact AS Contact
Second model:
MATCH (p:Person {name:"Bob"})-[:REACHABLE_BY_EMAIL]-(c:Contact)
RETURN p.name AS Name, c.contact AS Contact
MATCH (p:Person {name:"Bob"})-[:REACHABLE_BY_FAX]-(c:Contact)
RETURN p.name AS Name, c.contact AS Contact
UNION ALL
MATCH (p:Person {name:"Bob"})-[:REACHABLE_BY_EMAIL]-(c2:Contact)
RETURN p.name AS Name, c2.contact AS Contact
But I'm wondering if there's a best practice to follow in this case. I know that having specific relationships is better in some cases, since we reduce the number of nodes involved in the query (instead of filtering later by some property), but I feel that in this case we can achieve the same result (maybe also in performance) by using different labels.
Both of your models will work fine. But to obtain the best performance, you can combine these two models into a single one. Like this:
All the Contact nodes that store emails, will have Email label as well.
All the Contact nodes that store faxes will have Fax label as well.
The relationship type between Person and Email nodes will be REACHABLE_BY_EMAIL
The relationship type between Person and Fax nodes will be REACHABLE_BY_FAX
Using this model, you can easily query a person's emails or faxes with these queries:
MATCH (p:Person)-[:REACHABLE_BY_EMAIL]->(email)
RETURN p, email
MATCH (p:Person)-[:REACHABLE_BY_FAX]->(fax)
RETURN p, fax
Note that I have not specified the Email or Fax labels in the queries, as the relationship type already implies them.
Also, you can now query all the emails and faxes, using simply
MATCH (e:Email) RETURN e
MATCH (f:Fax) RETURN f
If the need arises.
You can also use
(:Person)-[:REACHABLE_BY]->(:Contact)-[:HAS_TYPE]->(:ContactType)
with 'Fax', 'Phone', 'Email' as ContactType nodes.
In your queries, using directions will help you speed up things. Your 2nd query of the second model can be written as
MATCH (p:Person {name:"Bob"})-->(c:Contact)
RETURN p.name AS Name, c.contact AS Contact
For the model I suggest, the queries would be:
MATCH (p:Person {name:"Bob"})-->(c:Contact)-->(:ContactType {name:'Email'})
RETURN p.name AS Name, c.contact AS Contact
and
MATCH (p:Person {name:"Bob"})-->(c:Contact)
RETURN p.name AS Name, c.contact AS Contact
I've a graph database consisting of two types of nodes - persons and businesses, and one type of relationship - payment.
A person may pay either another person or a business. Likewise, a business may pay a person or another business. That is, all four of these types of paths are possible:
(person)-[:PAYS]->(person)
(person)-[:PAYS]->(business)
(business)-[:PAYS]->(person)
(business)-[:PAYS]->(business)
In a use case of detecting possible money laundering, I would like to extract cases where payment made by a person went through several businesses before reaching another person. That is (omitting the relationship for convenience):
(person)-(business)-(business)-(business)-(person)
My cypher query should therefore look something like this:
(person)-[:PAYS*0..3]-(person)
However, this will also return me the following relationship, which isn't what I want:
(person)-(business)-(person)-(business)-(person)
What can I do to exclude (person) from the variable length relationship [:PAYS*0..3]?
I've followed the solution given here and tried this:
MATCH path=((person)-[:PAYS*0..3]-(person))
WHERE NONE(n IN nodes(path) WHERE n:person)
RETURN path
However, this query ran for a long time before giving an output of zero results (which isn't correct). Another obvious solution is to change my relationships to distinguish between [:PAYS_BUSINESS] and [:PAYS_PERSON], but I would like to find out whether a solution is possible without changing my graph schema.
The reason that
MATCH path=((person)-[:PAYS*0..3]-(person))
WHERE NONE(n IN nodes(path) WHERE n:person)
RETURN path
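does not return anything seems to be that the first and the last node are persons, so the NONE predicate over all nodes of the path can never hold.
If you want to find the paths from :Person to :Person with only :Business nodes in between, you could do this (labels are case-sensitive, so adjust the case to match your data):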
MATCH path=((p1:Person)-[:PAYS*1..3]-(p2:Person))
WHERE ALL(n IN nodes(path)[1..-1] WHERE n:Business)
RETURN path
You may also want to look at the apoc.path.expand and apoc.path.expandConfig procedures (https://neo4j.com/labs/apoc/4.1/overview/apoc.path/). Powerful, but you introduce a dependency on the APOC library.
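As an illustration of what the interior-node filter (`nodes(path)[1..-1]`) in the query above is doing, here is a small plain-Python sketch over a made-up toy graph. The node names, the `pays` edge list and both functions are assumptions for the example; it does not talk to Neo4j. Like Cypher, the walk never reuses a relationship within one path.

```python
# Hypothetical toy graph: node -> label, plus PAYS edges treated as undirected.
labels = {"alice": "Person", "bob": "Person", "carol": "Person",
          "b1": "Business", "b2": "Business"}
pays = [("alice", "b1"), ("b1", "b2"), ("b2", "bob"),
        ("b1", "carol"), ("carol", "bob")]

neighbours = {}
for a, b in pays:
    neighbours.setdefault(a, []).append(b)
    neighbours.setdefault(b, []).append(a)


def paths_from(start, max_hops):
    """Yield node paths of 1..max_hops edges that never reuse an edge."""
    stack = [([start], frozenset())]
    while stack:
        path, used = stack.pop()
        if len(path) - 1 >= max_hops:
            continue  # already at the hop limit, do not extend further
        for nxt in neighbours.get(path[-1], []):
            edge = frozenset((path[-1], nxt))
            if edge in used:
                continue
            new_path = path + [nxt]
            yield new_path
            stack.append((new_path, used | {edge}))


def person_to_person_via_businesses(max_hops=3):
    """Keep paths that start and end at a Person with only Businesses in between."""
    result = []
    for start, label in labels.items():
        if label != "Person":
            continue
        for path in paths_from(start, max_hops):
            interior = path[1:-1]  # same idea as nodes(path)[1..-1]
            if labels[path[-1]] == "Person" and all(
                labels[n] == "Business" for n in interior
            ):
                result.append(path)
    return result


result = person_to_person_via_businesses()
for path in result:
    print(" - ".join(path))
```

One thing the sketch makes visible: a direct Person to Person payment has an empty interior, so it passes the "all interior nodes are businesses" test. If that is not wanted, the minimum path length would need to be 2 instead of 1.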
5 minutes after I posted this question, I thought of and tried a possible solution that seems to work. Not sure if this is against the rules, but here's a possible way out of my own problem (in case someone else is facing the same problem):
MATCH x=(p1:person)-[:PAYS]-(b1:business)
WITH *
MATCH y=(b1:business)-[:PAYS*..3]-(b2:business)-[:PAYS]-(p2:person)
RETURN x, y
You might want to look at how I handled this with X-linked inheritance. In that use case you aggregate the sex of each parent (M or F) along the path and can then exclude MM from the aggregated string, since a man never passes an X to his son.
http://stumpf.org/genealogy-blog/graph-databases-in-genealogy
The query excludes every concatenated string that contains MM and accepts anything else:
match p=(n:Person{RN:32})<-[:father|mother*..99]-(m)
with m,
     reduce(status = '', q IN nodes(p) | status + q.sex) AS c,
     reduce(srt2 = '|', q IN nodes(p) | srt2 + q.RN + '|') AS PathOrder
where c = replace(c, 'MM', '')
return distinct m.fullname as Fullname
In your case it's P and B (person or business).
I'm trying to get a named relationship with an or in the query. I'm thinking the query should look similar to:
MATCH (A:person)-[B (:ACTED_IN|:DIRECTED)]->(C:person) RETURN A, B, C
but no matter how I put in the parens I get an error. I suppose a UNION would do the trick but was hoping that there was some way of doing it similar to the above. TIA.
EDIT: This does what I want but seems not the way to do it.
MATCH (A:person)-[B]->(C:person) WHERE type(B)="ACTED_IN" OR type(B)="DIRECTED" RETURN A,B,C
I am a new user so I do not have the option to comment on questions yet. I am guessing you are trying to get the person who either acted in or directed the movie. It's described in the official Cypher documentation: Match on multiple relationship types.
With Demo Movie Data on Neo4j to get persons from The Matrix movie, I would use this:
MATCH (TheMatrix { title: 'The Matrix' })<-[rel:ACTED_IN|:DIRECTED]-(person)
RETURN person.name, rel
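For anyone unsure what the `|` between relationship types means, it is simply "type is one of these". A tiny plain-Python sketch with made-up toy triples (the `rels` list and the `related` function are illustrative only, not the demo movie dataset itself):

```python
# Hypothetical (person, relationship type, movie title) triples.
rels = [
    ("Keanu", "ACTED_IN", "The Matrix"),
    ("Lana", "DIRECTED", "The Matrix"),
    ("Joel", "PRODUCED", "The Matrix"),
    ("Keanu", "ACTED_IN", "Speed"),
]


def related(title, types):
    """Return (person, type) pairs whose relationship type is any of `types`."""
    return [(person, rel_type)
            for person, rel_type, movie in rels
            if movie == title and rel_type in types]


matches = related("The Matrix", {"ACTED_IN", "DIRECTED"})
print(matches)
```

This is equivalent to the `type(B)="ACTED_IN" OR type(B)="DIRECTED"` filter from the question, just expressed as a membership test.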
Context:
I'm working on an Alumni project to understand the difference between giving and engagement (engagement = showing up, attending events, volunteering, etc.). The value in the work will come from the insight gained from understanding the behavior patterns.
In the query below I've been effective at bringing back the "Biggest spenders"; however, I'd like to list the name of the (n) Alumni and the (a,b) gifts. There are 30 gift types that fit into (a,b).
Please let me know your thoughts... Innosoljim
//Who are Alumni that give the most?
MATCH (n:Alumni)-[r:Supportfin]->(b)
MATCH (n:Alumni)-[t:Gavefin]->(a)
RETURN n,b,a LIMIT 1500
Thanks for the answer. Let me restate the goal for clarity: I'm trying to consolidate (into n:Alumni) many relationships -[Gave|Support]-> to unique nodes (various gifts) so that I can obtain a report on an Alumni's activity (giving, support) by n.name. The graph model places the Alumni node at the center of each unique behavior (giving, support, graddate, address, degree, greeklife, etc.). Does this help?
MATCH (a:Alumni)-[r:Supportfin|Gavefin]->(gift)
RETURN a.name, collect(gift)
ORDER BY (a)-[r:Supportfin|Gavefin]-> count(*) DESC
Something like this maybe although this isn't working (syntaxerror)
Match Alumni to the gifts with both relationship types and return:
MATCH (a:Alumni)-[r:Supportfin|Gavefin]->(gift)
RETURN a.name, collect(gift)
Or split it by the different relationship types:
MATCH (a:Alumni)
OPTIONAL MATCH (a)-[:Supportfin]->(sup_gift)
OPTIONAL MATCH (a)-[:Gavefin]->(gave_gift)
RETURN a.name, collect(DISTINCT sup_gift), collect(DISTINCT gave_gift)
Without a proper description of your graph model and problem, the question is difficult to answer.
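Since the graph model is not fully described, the following is only a guess at the kind of report being asked for: one row per alumnus with the distinct gifts per relationship type, ordered by how many gifts they have in total. It is a plain-Python sketch over made-up rows (`gift_rows`, the names and the gift titles are all hypothetical), not a query against the real data.

```python
from collections import defaultdict

# Hypothetical (alumni name, relationship type, gift) rows.
gift_rows = [
    ("Ann", "Gavefin", "Annual Fund"),
    ("Ann", "Supportfin", "Scholarship"),
    ("Ann", "Gavefin", "Library"),
    ("Bo", "Gavefin", "Annual Fund"),
    ("Bo", "Gavefin", "Annual Fund"),  # duplicate, collapsed like collect(DISTINCT ...)
]


def giving_report(rows):
    """Group distinct gifts per alumnus and type, biggest givers first."""
    by_alum = defaultdict(lambda: {"Supportfin": [], "Gavefin": []})
    for name, rel_type, gift in rows:
        if rel_type not in ("Supportfin", "Gavefin"):
            continue
        if gift not in by_alum[name][rel_type]:
            by_alum[name][rel_type].append(gift)
    report = [(name, g["Supportfin"], g["Gavefin"]) for name, g in by_alum.items()]
    # Order by the total number of distinct gifts, descending.
    report.sort(key=lambda row: len(row[1]) + len(row[2]), reverse=True)
    return report


report = giving_report(gift_rows)
for name, supported, gave in report:
    print(name, supported, gave)
```

Unlike the OPTIONAL MATCH version, this sketch only lists alumni that appear in at least one row; alumni without any gifts would have to be added separately.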
I have a graph database that contains a pattern like this one:
(n1)-[:a]->(n2),
(n1)-[:b]->(n2),
(n1)-[:c]->(n2),
(n1)-[:e]->(n2),
(n1)-[:d]->(n3),
(n2)-[:b]->(n4)
And I want to get all the graphs that match this pattern:
MATCH p={
(n3)<-[:d]-(n1)-[:a]->(n2)-[:b]->(n4),
(n1)-[:b]->(n2)<-[:c]-(n1),
(n1)-[:e]->(n2)
}
RETURN p
Is it possible? I've searched a little but I haven't found how to do it.
I know we can use "|" for alternative types, like this:
()-[:a|b]->()
but there is no "&", and path assignment only works on patterns that are written without ",".
Thanks
EDIT:
In case it helps, here is another example of what I'm looking for:
In a database with movies, persons and relationships like ACTED_IN, KNOWS, FRIEND and HATE,
I want all the graphs containing an actor "Actor1" (who ACTED_IN a movie "M") who KNOWS "Person1", is a FRIEND of "Person2" and HATEs "Person3", who ACTED_IN the same movie "M".
A UNION like the one in the answer from "Michael Hunger" does not work, because it returns multiple subgraphs rather than whole graphs. Moreover, some subgraphs might not be correct answers for the bigger pattern.
Your query will be very inefficient, as you don't restrict your search to a set of start nodes, either with labels or with label+property combinations!
You can use UNION for that:
MATCH p=(n3)<-[:d]-(n1)-[:a]->(n2)-[:b]->(n4) RETURN p
UNION
MATCH p=(n1)-[:b]->(n2)<-[:c]-(n1) RETURN p
UNION
MATCH p=(n1)-[:e]->(n2) RETURN p
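To make the difference between "any of these paths" (what UNION returns) and "all of these relationships at once" (what the question asks for) concrete, here is a plain-Python sketch over a made-up edge set. The `edges` data and `match_pattern` are assumptions for illustration only; it checks the whole pattern for one binding of n1..n4 rather than returning each path separately.

```python
# Hypothetical edges as (source, relationship type, target).
edges = {
    ("x", "a", "y"), ("x", "b", "y"), ("x", "c", "y"), ("x", "e", "y"),
    ("x", "d", "z"), ("y", "b", "w"),
    ("u", "a", "v"),  # only part of the pattern, so it must not match
}


def match_pattern(edge_set):
    """Return (n1, n2, n3, n4) bindings that satisfy every edge of the pattern."""
    out = {}
    for src, rel_type, dst in edge_set:
        out.setdefault((src, rel_type), set()).add(dst)

    results = []
    for (n1, rel_type), targets in out.items():
        if rel_type != "a":
            continue
        for n2 in targets:
            # n1 must also reach the same n2 via b, c and e (the "&" part).
            if not all(n2 in out.get((n1, k), ()) for k in ("b", "c", "e")):
                continue
            for n3 in out.get((n1, "d"), ()):
                for n4 in out.get((n2, "b"), ()):
                    results.append((n1, n2, n3, n4))
    return sorted(results)


bindings = match_pattern(edges)
print(bindings)
```

With UNION, the lone `(u)-[:a]->(v)` style fragments could show up in one of the branches even though the full pattern is not present around them; requiring all edges for the same binding avoids that.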