I want to create an ontology about genealogy. I have the classes Person and Gender and the relation has_child between two Persons and the opposite relation has_parent. Each Person has a Gender. I would like to define some new properties like has_father defined by has_parent P1 and P1 has_gender MALE or has_sibling defined by has_sibling(X1, X2) = has_father(X1, F) and has_father(X2, F). For this example, I ignored the mother, but it's for the sake of simplicity.
I could create explicit relations and add them to the Persons, but I want the system to infer the relations.
So I found how to do it. I used the property chains mechanism.
Related
I have difficulty understanding this statement in DL:
∃R.∃S.C(a)
What does this proposition exactly mean?
Thanks in advance!
This means that the individual named a belongs to the concept ∃R.∃S.C. The concept ∃R.∃S.C represents the class of things that have a relation R with something that has a relation S with something that is in the class denoted by C. For instance, if R is the relation married to, S is the relation works for and C is the class of Public organisations, then ∃R.∃S.C represents all the entities that are married to something (or someone) that works for a public organisation. Then ∃R.∃S.C(a) means that the individual named a is such an entity.
I'm new to Neo4J and I have a question about the most appropriate data model for a the problem domain described below.
Background
As I understand it, in Neo4J every relationship has a direction of sorts, outgoing, incoming or undirected. I read that a common mistake newbies make in "bi-directional" relationships is that they might model the relationship in both directions where in reality one undirected relationship would serve the purpose well. I also understand that irrespective of the direction it is possible at query time to ignore it and query based on either side of the relationship.
Problem Domain
This is a bit cliché, but stick with me, in a graph where "People" can connect with one another I felt tempted to model that as an undirected relationship. However, perhaps in my problem domain I want to store meta data on the relationship edge rather than at either of the People nodes. E.g. timestamp of when then connected or type of relationship (friend, family, employer etc...). Perhaps this is a "side effect" of using the spring-data-neo4j libraries but it seems that if I wish to data meta data on the edge rather than at the node I must create a class which is annotated as #RelationshipEntity rather than #NodeEntity. This necessitates a #StartNode and #EndNode which then seems to imply a direction to my otherwise undirected relationship...
Now, as it turns out this might not be a bad thing in my case because perhaps after all it turns out there is something useful about this additional directed context (e.g. perhaps I want to know who initiated the relationship so that the target node (person) must accept the invitation to be friends).
Now imagine that each person can place the "relationship" into a "group" like "friends, family, colleagues" etc I feel like I'd now need to actually have two distinct edges that point in either direction so that the meta data specific to the given direction has a natural place to reside. But this seems to have been described as a newbie anti-pattern.
Questions
I have two questions:
1) Should I use two separate distinct relationship edges that essentially point either way as a bi-directional relationship if I need to store meta data that is specific to the direction. e.g. Person A <--> Person B but person A placed Person B in the friends group whereas Person B placed A in the colleagues group.
2) Given the Java data model below, I am unclear what the direction attribute on the #Relationship annotation of Person should be. If I specify nothing it default to OUTGOING. But since it's a reflective relationship depending on which person instance you look at the relationship could be either outgoing or incoming, e.g. if person A adds person B, both are Person instances, but the direction is outgoing on the person A instance and incoming on the person B instance. Should the annotation be omitted altogether since I'm using a #RelationshipEntity?
Java Data Model
#NodeEntity
#EqualsAndHashCode(of = {"id"})
#NoArgsConstructor
public abstract class Person {
#GraphId
private Long id;
... other attributes
#Relationship(type = "CONNECTION_OF", direction = UNDIRECTED)
private Set<Connection> connections;
}
#Data
#RelationshipEntity(type = "CONNECTION_OF")
public class Connection {
#GraphId
private Long relationshipId;
... other meta-data
#StartNode
private Person from;
#EndNode
private Person to;
}
1) The rule of thumb that works well is to answer a question - if a relationship from A to B exists, can another relationship from B to A still be created, with different metadata? And can you delete one direction of the relationship independently of the other?
If the answer is yes they go for two directed relationships, otherwise stay with UNDIRECTED, and create initiatedBy=userId property or similar.
For your case where you put connections into groups - the thing is that you really categorize the people from another person's view, maybe it is a completely different fact independent of the CONNECTED_TO relationship?
You could e.g. create a group node and link that to owner and all people in the group, with following schema:
(:Person)-[:OWNS]-(:Group)-[:CATEGORIZED]-(:Person)
2) Keep the #Relationship(type = "CONNECTION_OF", direction = UNDIRECTED). For a given person X the connections Set is going to have elements that have from=X for outgoing edges mixed with elements that have to=X for incoming. All of these will be mixed in one collection.
how are the two different? Since both of the relationships consist of one set of objects say A which cannot exist independently and another set of objects say B whose existence is necessary for the existence of objects in the set A.
So are the two same or are their some fundamental differences that I am missing out on?
"Composition relationship" isn't a formal relationship type in ER. If I take it to mean a binary relationship (a,b) which is left-total (every a is related to some b) and in which a->b (an a is related to only one b), then an identifying relationship would be a specific kind of composition relationship. An identifying relationship has the form ((x,b),b) in which (x,b)->b. Non-identifying composition relationships would then include cases in which the component set is identified by its own attributes.
In EER modelling is it possible and correct to have a disjoint specialization where none of the sub-classes have any specific attributes(local to them) but are entirely grouped on the basis of a defining attribute.For example we can have a USER entity with some attributes of which one is "role".Based on value of role(admin or author or editor) we will have the subclass entities ADMIN ,AUTHOR and EDITOR.None of them has any attributed which are only specific to them.Also please note that the specialization is disjoint and the superclass entity USER has total participation.
And if this is possible, can I convert it to relational model by creating a single relation for the superclass entity USER
Yes, it's possible and valid to do so if you want to record relationships that are restricted to the subtypes. If you don't have subtype-specific attributes or relationships, there's no point in distinguishing subtypes in the ER model. A simple role attribute on the User is sufficient for queries.
If you define subtypes in the ER model, this converts to a relation per subtype in the relational model. If you're looking for something like dependent types, you won't find it in the relational model, which corresponds to first-order logic. ER is even more limited.
I'm attempting to use Neo4j to model the relationship between projects, staff, and project roles. Each project has a role called "project manager" and a role called "director". What I'm trying to accomplish in the data model is the ability to say "for project A, the director is staff X." For my purposes, it's important that "project", "staff", and "role" are all entities (as opposed to properties). Is this possible in Neo4j? In simpler terms, are associative entities possible in Neo4j? In MySQL, this would be represented with a junction table with a unique id column and three foreign key columns, one for project, staff, and role respectively, which would allow me to identify the relationship between those entities as an entity itself. Thoughts?
#wassgren's answer is a solid one, worth considering.
I'll offer one additional option. That is, you can "Reify" that relationship. Reificiation is sort of when you take a relationship and turn it into a node. You're taking an abstract association (relationship between staff and project) and your'e turning it into a concrete entity (a Role) All of the other answer options involve basically two nodes Project and Staff, with variations on relationships between them. These approaches do not reify role, but store it as a property, or a label, of a relationship.
(director:Staff {name: "Joe"})-[:plays]->(r:Role {label:"Director"})-[:member_of]->(p:Project { name: "Project X"});
So...people don't contribute to projects directly, roles do. And people play roles. Which makes an intuitive sense.
The advantages of this approach is that you get to treat the "Role" as a first-class citizen, and assert relationships and properties about it. If you don't split the "Role" out into a separate node, you won't be able to hang relationships off of the node. Further, if you add extra properties to a relationship that is masquerading as a role, you might end up with confusions about when a property applies to the role, and when it applies to the association between a staff member and a project.
Want to know who is on a project? That's just:
MATCH (p:Project {label: "Project X"})<-[:member_of]-(r:Role)<-[:plays]-(s:Staff)
RETURN s;
So I think what I'm suggesting is more flexible for the long term, but it might also be overkill for you.
Consider a hypothetical future requirement: we want to associate roles with a technical level or job category. I.e. the project manager should always be a VP or higher (silly example). If your role is a relationship, you can't do that. If your role is a proper node, you can.
Conceptually, a Neo4j graph is built based on two main types - nodes and relationships. Nodes can be connected with relationships. However, both nodes and relationships can have properties.
To connect the Project and Staff nodes, the following Cypher statement can be used:
CREATE (p:Project {name:"Project X"})-[:IS_DIRECTOR]->(director:Staff {firstName:"Jane"})
This creates two nodes. The project node has a Label of type Project and the staff node has a Label of type Staff. Between these node there is a relationhip of type IS_DIRECTOR which indicates that Jane is the director of the project. Note that a relationship is always directed.
So, to find all directors of all project the following can be used:
MATCH (p:Project)-[:IS_DIRECTOR]->(director:Staff) RETURN director
The other approach is to add properties to a more general relationship type:
create (p:Project {name:"Project X"})<-[:MEMBER_OF {role:"Director"}]-(director:Staff {firstName:"Jane"})
This shows how you can add properties to a relationship. Notice that the direction of the relationship was changed for the second example.
To find all directors using the property based relationship the following can be used:
MATCH (p:Project)<-[:MEMBER_OF {role:"Director"}]-(directors:Staff) RETURN directors
To retrieve all role types (e.g. director) the following can be used:
MATCH
(p:Project)-[r]->(s:Staff)
RETURN
r, // The actual relationship
type(r), // The relationship type e.g. IS_DIRECTOR
r.role // If properties are available they can be accessed like this
And, to get a unique list of role names COLLECT and DISTINCT can be used:
MATCH
(p:Project)-[r]->(s:Staff)
RETURN
COLLECT(DISTINCT type(r)) // Distinct types
Or, for properties on the relationship:
MATCH
(p:Project)-[r]->(s:Staff)
RETURN
COLLECT(DISTINCT r.role) // The "role" property if that is used
The COLLECT returns a list result and the DISTINCT keyword makes sure that there are no duplicates in the list.