Is this the major difference between RDF and Labelled Property Graph? - neo4j

For example, take Barack Obama. In a Labelled Property Graph, the birth date (1961) would be modeled as a property of the entity BarackObama, without using a relationship. But in RDF, e.g. Freebase, BarackObama is linked through a relationship (birth_date) to the value '1961'. The former involves only one entity and no relationship, whereas the latter involves two entities, "BarackObama" and "1961", joined by a relationship.
Is this the major difference?
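To make the two shapes concrete, here is a minimal Python sketch of the same fact in both styles. The function names and the dictionary layout are made up for illustration; they are not the API of any RDF or Neo4j library.

```python
def to_rdf(subject, predicate, value):
    """RDF style: the fact is one triple whose object is the literal value."""
    return [(subject, predicate, value)]


def to_lpg(subject, key, value):
    """Property-graph style: the value lives in the node's own property map."""
    return {"id": subject, "properties": {key: value}, "relationships": []}


def birth_date_from_rdf(triples, subject):
    """Look the value up by following the birth_date predicate."""
    return next(o for s, p, o in triples if s == subject and p == "birth_date")


def birth_date_from_lpg(node):
    """Read the value directly off the node, no edge traversal involved."""
    return node["properties"]["birth_date"]
```

In the triple form the value sits at the far end of an edge, while in the property-graph form the node has no relationships at all and still carries the value.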

Related

Difference between associative entities and an entity that is dependent on all of its foreign keys?

I was interested in storing the history of properties, which involves two entities: properties and owners. Right now I am not sure what approach to take and need some help.
I was thinking of creating an associative entity whose identity would be a combination of property and owner, but the textbook comes up with this solution instead:
Solution
What's the difference between the solution above and an associative entity?
The owns table in the given solution would be called an associative entity set in the network data model. This data model supports only one-to-one and one-to-many binary relationships, and resolves many-to-many binary relationships as well as ternary and higher relationships into an associative entity set with binary one-to-many relationships.
However, in the entity-relationship model, the owns table represents a many-to-many relationship relation. The ER model directly supports many-to-many binary relationships as well as ternary and higher relationships, and uses "associative entities" to refer only to relationships which are the subjects of other relationships.

ER Diagram design issues

I'm designing an ER diagram for a social network, and recently I got into an argument with my colleagues about whether this part is right or wrong:
ER DIAGRAM PROBLEM
Here Faqet (Pages) is connected with Shfrytezuesi (User) through three actions: pelqen stores likes, krijon faqe records who created the page, and udheheq stores all page admins. So my questions are:
Is this design wrong?
Can two tables be linked by more than one action? This is where I'm not certain.
It's perfectly valid to have any number of relationships between any number of entity sets. My only concern with the diagram is that multiple role lines below Shfrytezuesi are merged into one - I recommend keeping them distinct.
Note that in the entity-relationship model, we don't link tables. That idea comes from the old network data model, in which rows represented entities, tables represented entity sets, and links between rows/tables represented relationships.
One disadvantage with that model is that it supports only directed binary relationships - many-to-many binary, ternary and higher relationships and relationships with attributes all required associative entities to be introduced. However, three binary relationships aren't equivalent to a ternary relationship, and not all relationships can be represented in binary data models.
The ER model supports n-ary relationships and attributes on relationships. Entity sets are represented by their primary keys and relationships by combinations of entity keys. Entity sets plus attributes form entity relations, and relationship sets plus attributes form relationship relations. These relations get mapped to tables. In practice, tables with the same primary keys get combined to reduce the number of tables, which means one-to-one and one-to-many relationships get combined into the relations for one of their associated entity sets.
Regardless of how tables are combined, attributes and relationships are represented by sets of columns. For example, based on your diagram, Pelqen would be represented as (FID PK, SID) (assuming SID is the primary key of Shfrytezuesi). These columns might have different names, e.g. SID might be renamed to AdminSID, especially if the relationship was combined into Faqet. The old network data model would view FID FK -> FID PK as a relationship, which as described above and below is a very limited kind of relationship and not the approach taken by the ER model.
Another disadvantage of the network data model is predetermined access paths, which means we have to navigate from table to table using the predefined relationships. This complicated queries and data processing significantly. This limitation was one of the main drivers for the development of the relational model, to which the ER model maps. The understanding of tables as relations in the RM enables us to construct and navigate arbitrary access paths using joins. So, we do link tables in the RM, but at query time and as needed rather than at design time. The ER model is used for conceptual design only and doesn't describe relationships between tables, only relationships between entity sets.
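As a rough illustration of linking tables at query time rather than at design time, here is a minimal Python/sqlite3 sketch. The Krijon table (standing in for the "krijon faqe" relationship), the name and title columns, and all sample rows are assumptions made up for the example; only FID, SID, Faqet and Shfrytezuesi come from the discussion above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Shfrytezuesi (SID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Faqet (FID INTEGER PRIMARY KEY, title TEXT);
    -- The relationship is just a pair of entity keys: one creator per page.
    CREATE TABLE Krijon (FID INTEGER PRIMARY KEY, SID INTEGER NOT NULL);
    INSERT INTO Shfrytezuesi VALUES (1, 'ana'), (2, 'ben');
    INSERT INTO Faqet VALUES (10, 'Chess'), (11, 'Hiking'), (12, 'Jazz');
    INSERT INTO Krijon VALUES (10, 1), (11, 1), (12, 2);
""")

# Access path 1: user -> relationship -> page, built by joins when needed.
pages_by_user = conn.execute("""
    SELECT s.name, f.title
    FROM Shfrytezuesi s
    JOIN Krijon k ON k.SID = s.SID
    JOIN Faqet f ON f.FID = k.FID
    ORDER BY s.name, f.title
""").fetchall()

# Access path 2: pairs of pages sharing a creator. Nothing in the schema
# predefines this path; the self-join constructs it at query time.
same_creator = conn.execute("""
    SELECT a.FID, b.FID
    FROM Krijon a
    JOIN Krijon b ON a.SID = b.SID AND a.FID < b.FID
""").fetchall()
```

Neither query relies on a navigation link declared in advance; both simply match key values.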
Now, the ER model isn't a complete and consistent logical model like the RM, but it is a significant improvement over the network data model. An even more rigorous approach than ER would be object-role modeling, but that's a different topic.

Are optionality (mandatory, optional) and participation (total, partial) the same?

As far as I know, optionality means the minimum cardinality of a relationship, which is denoted as optional to optional, mandatory to optional, mandatory to mandatory, and so on.
Participation is denoted by a bold line or a normal line.
On the Internet, some sources describe participation as the dependency of the entity on the relationship, which also looks like identifying and non-identifying relationships,
and some describe it as the minimum cardinality.
What are the correct definitions of these concepts, and what is the difference between them?
Let's start with definitions and examples of each of the concepts:
Total and partial participation:
Total participation (indicated by a double or thick association line) means that all the entities in an entity set must participate in the relationship. Partial participation (indicated by a single thin line) means that there can be entities in the entity set that don't participate in the relationship.
Medicine participates totally in the Produce relationship, meaning that Medicine can't exist unless Produced by a Laboratory. In contrast, a Laboratory can exist without Producing Medicine - Laboratory participates partially in the Produce relationship.
Mandatory and optional roles:
In a relationship, roles can be optional or mandatory. This affects whether a relationship instance can exist without an entity in a given role. Mandatory roles are indicated with a solid association line, optional roles are indicated with a dotted line.
Roles aren't often talked about in database tutorials, but they're an important concept. Consider a marriage - a relationship with two mandatory roles filled by the same entity set. In most relationships, the entity sets also define the roles, but when an entity set appears multiple times in a single relationship, we distinguish them in different roles.
In the example above, a Patient can Purchase Medicine with or without a Prescription. A Purchase can't exist without a Patient and Medicine, but a Prescription is optional (overall, though it may be required in specific cases).
Identifying relationship / weak entity:
A weak entity is an entity that can't be identified by its own attributes and therefore has another entity's key as part of its own. An identifying relationship is the relationship between a weak entity and its parent entity. Both the identifying relationship and the weak entity are indicated with double borders. Weak entity sets must necessarily participate totally in their identifying relationship.
In this example, a Prescription contains LineItems which are identified by the Prescription's key and a line number. In other words, the LineItems table will have a composite key (Prescription_ID, Line_Number).
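A minimal Python/sqlite3 sketch of that composite key might look like the following. The medicine column and the sample rows are assumptions for illustration; the table and key column names follow the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE Prescription (Prescription_ID INTEGER PRIMARY KEY);
    CREATE TABLE LineItems (
        Prescription_ID INTEGER NOT NULL
            REFERENCES Prescription(Prescription_ID),
        Line_Number INTEGER NOT NULL,
        medicine TEXT,  -- hypothetical payload column
        PRIMARY KEY (Prescription_ID, Line_Number)
    );
    INSERT INTO Prescription VALUES (1), (2);
    -- Line number 1 can repeat because the parent key is part of the identity.
    INSERT INTO LineItems VALUES (1, 1, 'A'), (1, 2, 'B'), (2, 1, 'C');
""")


def try_insert(row):
    """Return True if the row is accepted, False on a constraint violation."""
    try:
        conn.execute("INSERT INTO LineItems VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


orphan_ok = try_insert((99, 1, "D"))    # no Prescription 99: total participation
duplicate_ok = try_insert((1, 1, "E"))  # key (1, 1) is already taken
line_count = conn.execute("SELECT COUNT(*) FROM LineItems").fetchone()[0]
```

Both attempted inserts are rejected: a line item cannot exist without its parent prescription, and within one prescription the line number must be unique.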
For examples of non-identifying relationships, see the previous examples. While Medicine participates totally in the Produce relationship, it has its own identity (e.g. a surrogate key, though I didn't indicate it). Note that surrogate keys always imply regular entities.
Mandatory/optional vs total/partial participation
Mandatory or optional roles indicate whether a certain role (with its associated entity set) is required for the relationship to exist. Total or partial participation indicate whether a certain relationship is required for an entity to exist.
Mandatory partial participation: See above: A Laboratory can exist without producing any medicine, but Medicine can't be Produced without a Laboratory.
Mandatory total participation: See above: Medicine can't exist without being Produced, and a Laboratory can't Produce something unspecified.
Optional partial participation: See above: A Prescription can exist without being Purchased, and a Purchase can exist without a Prescription.
That leaves optional total participation, which I had to think about a bit to find an example:
Some Patients Die of an unknown Cause, but a Cause of death can't exist without a Patient Dying of it.
Total/partial participation vs identifying/non-identifying relationships
As I said before, weak entity sets always participate totally in their identifying relationship. See above: a LineItem must be Contained in a Prescription; its identity and existence depend on that. Partial participation in an identifying relationship isn't possible.
Total participation doesn't imply an identifying relationship - Medicine can't exist without being Produced by a Laboratory but Medicine is identified by its own attributes.
Partial participation in a non-identifying relationship is very common. For example, Medicine can exist without being Purchased, and Medicine is identified by its own attributes.
Mandatory/optional vs identifying/non-identifying relationships
It's unusual for a relationship to have fewer than two mandatory roles. Identifying relationships are binary relationships, so the parent and child roles will be mandatory - the Contain relationship between Prescription and LineItem can't exist without both entities.
Optional roles are usually only found on ternary and higher relationships (though see the example of patients dying of causes), and aren't involved in identification. An alternative to an optional role is a relationship on a relationship:
By turning Purchase into an associative entity, we can have it participate in a Fill relationship with Prescription. To maintain the same semantics as above I specified that a Purchase can only Fill one Prescription.
Physical modeling
If we translate from conceptual to physical model (skipping logical modeling / further normalization), making separate tables for each entity and relationship, things look pretty similar, though you have to know how to read the cardinality indicators on the foreign key lines to recover the ER semantics.
However, it's common to denormalize tables with the same primary keys, meaning one-to-many relationships are combined with the entity table on the many side:
A relationship is physically represented as two or more entity keys in a table. In this case, the entity keys - patient_id and cause_of_death_id are both found in the Patient table. Many people think the foreign key line represents the relationship, but this comes from confusing the entity-relationship model with the old network data model.
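To illustrate the point that the relationship is the pair of keys rather than the foreign key line, here is a small Python/sqlite3 sketch. The CauseOfDeath table name, the name and description columns, and the sample rows are assumptions; patient_id and cause_of_death_id come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CauseOfDeath (
        cause_of_death_id INTEGER PRIMARY KEY,
        description TEXT
    );
    CREATE TABLE Patient (
        patient_id INTEGER PRIMARY KEY,
        name TEXT,
        cause_of_death_id INTEGER REFERENCES CauseOfDeath(cause_of_death_id)
    );
    INSERT INTO CauseOfDeath VALUES (1, 'cause A');
    INSERT INTO Patient VALUES (10, 'p1', 1), (11, 'p2', NULL), (12, 'p3', 1);
""")

# Projecting the two key columns out of the combined Patient table recovers
# the recorded (patient, cause) pairs of the relationship.
die_of = conn.execute("""
    SELECT patient_id, cause_of_death_id
    FROM Patient
    WHERE cause_of_death_id IS NOT NULL
    ORDER BY patient_id
""").fetchall()
```

The foreign key constraint only checks that each cause_of_death_id value exists in CauseOfDeath; the association itself is the set of key pairs returned by the projection.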
This is a crucial point - in order to understand different kinds of relationships and constraints on relationships, it's essential to understand what relationships are first. Relationships in ER are associations between keys, not between tables. A relationship can have any number of roles of different entity sets, while foreign key constraints enforce a subset constraint between two columns of one entity set. Now, armed with this knowledge, read my whole answer again. ;)
I hope this helps. Feel free to ask questions.

How to implement an EAV model in Neo4j?

The Entity-Attribute-Value (EAV) model is really powerful, but complex to implement using SQL, so people often look for alternatives to EAV. It seems like the perfect candidate for graph databases. I understand how to build a movie database where you have nodes with the Neo4j label "Movie" with the property "release_date" right on the node. How would you make this more generic, such that movies have the Neo4j label "Entity" following the general EAV model?
I've thought a lot about this, but I'm not confident I have a good solution. I'll take a stab at it anyway. Here's the most basic model:
<node> <relationship> <node>
Attribute --> :VALUE --> Entity
name="Label",type="string" --> value="Movie" --> name="The Matrix"
With this model, you can write code for how to display and edit Attribute.type. For example, maybe all labels have a text field with finite options on the front-end and all dates have a date-picker. You could break Attribute.type out into its own node, Type, if that was preferable (this would make particular sense for handling composite types). In that case, you have the relationship TYPE between Attribute and Type nodes.
This becomes a problem if entities have multiple relationships, as is the case for reviews, or if you want to relate the value to something else, such as the user who assigned the value. Now, I think, the relationship "VALUE" has to be its own node of type "Value" (i.e. has the Neo4j label "Value") with an incoming relationship from both Attribute and User nodes.
The full form has Type nodes, Attribute nodes, User nodes, Value nodes, and Entity nodes, where the relationships have basically no properties on them.
Why do you need it in the first place?
I always thought that EAV was just a workaround for relational databases not being schema free.
Neo4j, like other NoSQL databases, is schema-free, so you can just add the attributes that you want to both nodes and relationships.
If you need to, you can also record the EAV model in a meta-schema within the graph, but in most cases it is good enough if the meta-schema lives within the application that creates and uses your attributes.
Usually I treat labels as roles which, in a certain context, provide certain properties and relationships. A node can have many labels, each representing one of those roles.
E.g. for the same node
:Person(name)-[:LIVES_IN]->(:City)
:Employee(empNo)-[:WORKS_AT]->(:Company)
:Developer()-[:HAS_SKILL]->(:CompSkill)
...
So in your case :Entity would just be a label that implies the name property.
And :Movie is a label that implies a release_date property and e.g. ACTED_IN relationships.
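A meta-schema that lives in the application could be as small as the following Python sketch. The dictionary layout and the function name are assumptions for illustration; this is plain Python, not the Neo4j driver API.

```python
# Hypothetical application-side meta-schema: each label (role) implies
# a set of properties that a node carrying that label should have.
LABEL_PROPERTIES = {
    "Entity": {"name"},
    "Movie": {"release_date"},
}


def missing_properties(node):
    """Return the properties implied by the node's labels but not yet set."""
    required = set()
    for label in node["labels"]:
        required |= LABEL_PROPERTIES.get(label, set())
    return required - set(node["properties"])


# A node with both roles: :Entity implies name, :Movie implies release_date.
matrix = {"labels": ["Entity", "Movie"], "properties": {"name": "The Matrix"}}
```

Here missing_properties(matrix) reports that release_date is still to be filled in, which is the kind of check a generic editing front-end would need.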

Core Data model - entities and inverses

I'm new to Core Data and I'm trying to implement it into my existing project. Here is my model:
Now, there are some things that don't make sense to me, likely because I haven't modelled it correctly.
CMAJournal is my top level object with an ordered set of CMAEntry objects and an ordered set of CMAUserDefine objects.
Here's my problem:
Each CMAUserDefine object has an ordered set of objects. For example, the "Baits" CMAUserDefine will have an ordered set of CMABait objects, the "Species" CMAUserDefine will have an ordered set of CMASpecies objects, etc.
Each CMAEntry object has attributes like baitUsed, fishSpecies, etc. that point to an object in the respective CMAUserDefine object. This is so if changes are made, each CMAEntry that references that object is also changed.
Now, from what I've read I should have inverses for each of my relationships. This doesn't make sense in my model. For example, I could have 5 CMAEntry objects whose baitUsed property points to the same CMABait object. Which CMAEntry does the CMABait's entry property point to if there are 5 CMAEntry objects that reference that CMABait? I don't think it should point to anything.
What I want is for all CMAUserDefine objects (i.e. all CMABait, CMASpecies, CMALocation, etc. objects) to be stored in the CMAJournal userDefines set, and have those objects be referenced in each CMAEntry.
I originally had this working great with NSArchiving, but the archive file size was MASSIVE. I mean, 18+ MB for 16 or so entries (which included about 20 images). And from what I've read, Core Data is something I should learn anyway.
So I'm wondering, is my model wrong? Did I take the wrong approach? Is there a more efficient way of using NSArchiver that will better fit my needs?
I hope that makes sense. Please let me know if I need to explain it better.
Thanks!
Edit: What led me to this question is getting a bunch of "Dangling reference to an invalid object." = "" errors when trying to save.
A. Some Basics
Core Data needs an inverse relationship to model the relationship. To make a long story short:
In an object graph as modeled by Core Data, a reference semantically points from a source object to a destination object. Therefore you use a single reference, such as CMAEntry's fishSpecies, to model a to-one relationship, and a collection, such as NSSet, to model a to-many relationship. You do not care about the type of the inverse relationship. In many cases you do not have one at all.
In a relational database, relationships are modeled differently: if you have a 1:N (one-to-many) relationship, the relationship is stored on the destination side. The reason is that in an rDB every entity has a fixed size and therefore cannot reference a variable number of destinations. If you have a many-to-many (N:M) relationship, an additional table is needed.
As you can see, in an object graph the types of relationships are to-one and to-many, depending only on the source, while in an rDB the types of relationships are one-to-one, one-to-many, and many-to-many, depending on both source and destination.
To select the right kind of rDB modeling, Core Data wants to know the type of the inverse relationship.
Type  Object graph  Inverse     | rDB
1:1   to-one id     to-one id   | source or destination attribute
1:N   collection    to-one id   | destination attribute
N:M   collection    collection  | additional table with two attributes
B. To your Q
In your case, if a CMAEntry object refers to exactly one CMASpecies object, but a CMASpecies object can be referred to by many CMAEntry objects, this simply means that the inverse relationship is a to-many relationship.
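To see why the relational side wants to know about the inverse, here is a rough Python/sqlite3 sketch of how such a relationship could be stored. This is not Core Data's actual storage schema; the id and name columns and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CMASpecies (id INTEGER PRIMARY KEY, name TEXT);
    -- The to-one side is a plain column on the entry row.
    CREATE TABLE CMAEntry (
        id INTEGER PRIMARY KEY,
        fishSpecies INTEGER REFERENCES CMASpecies(id)
    );
    INSERT INTO CMASpecies VALUES (1, 'Bass'), (2, 'Pike');
    INSERT INTO CMAEntry VALUES (100, 1), (101, 1), (102, 2);
""")


def entries_for_species(species_id):
    """The to-many inverse is not stored anywhere; it is derived by a query."""
    rows = conn.execute(
        "SELECT id FROM CMAEntry WHERE fishSpecies = ? ORDER BY id",
        (species_id,),
    )
    return [row[0] for row in rows]
```

The species row never holds a list of entries. Several entries can point at the same species, and the inverse collection is simply whatever rows currently match.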
Yes, it is strange for an OOP developer to have such inverse relationships. For a SQL developer, it is the usual case. This is one of the problems in developing an ORM (object-relational mapper). (I know that because I'm doing it for Objective-Cloud right now. But I did it differently, more from the OOP point of view.) Every solution is somewhat unusual for one side. Somebody called ORM the "Vietnam of software development".
To have a simpler example: modeling a sports league, you will find yourself having an entity Match with the properties homeTeam and guestTeam. As an inverse relationship on the team you would want not homeMatches and guestMatches, but simply matches. This is obviously not an inverse. Simply add the inverse relationships if Core Data wants them, and don't care about them.
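The league example can be sketched in plain Python as follows. The class layout and the snake_case names are assumptions for illustration and not Core Data code; the point is only that the two formal inverses exist while the property you actually want is derived from them.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity comparison; avoids recursing through cycles
class Team:
    name: str
    home_matches: list = field(default_factory=list)   # inverse of home_team
    guest_matches: list = field(default_factory=list)  # inverse of guest_team

    @property
    def matches(self):
        """What you actually want: all matches, which is not a true inverse."""
        return self.home_matches + self.guest_matches


@dataclass(eq=False)
class Match:
    home_team: Team
    guest_team: Team

    def __post_init__(self):
        # Keep both formal inverses in sync when a match is created.
        self.home_team.home_matches.append(self)
        self.guest_team.guest_matches.append(self)
```

With this layout the two inverse collections are maintained mechanically and can otherwise be ignored, while callers only ever read matches.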
