Which entity relationship method looks right or better? - entity-relationship

I see two different ways to model the entity relationship in the example illustrated below (either one-to-one or many-to-many). Which is the better method in terms of common practice or widely accepted convention? Possibly, which one is more efficient? If neither is better, what is the trade-off of using one instead of the other?
One-to-one method
Many-to-many method

First of all, neither diagram is an entity-relationship diagram. Entity-relationship diagrams should be able to represent entity-relationship concepts, but the notation you used doesn't distinguish between entity relations and relationship relations, and shows columns, types and foreign key constraints, which belong in a physical model rather than a conceptual one. What you have is better described as table diagrams. For ERDs, I recommend Chen's original notation or something close to it.
The first diagram mixes a higher-level abstraction into an otherwise physical model, and for that reason, I recommend the second style as it's more consistent.
Note that in either diagram, CompanyType_ID in General appears at odds with the type of relationship you're trying to represent. It may not necessarily be wrong (entities described in General may each have a primary or distinguished CompanyType in addition to a set of secondary types) but even if it's modeled that way intentionally, it warrants a second look at least.
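To make the many-to-many variant concrete, here is a minimal sketch using Python's built-in sqlite3 module. Only the names General, CompanyType and CompanyType_ID come from the question; the junction table General_CompanyType, the Name and Label columns, and the sample rows are assumptions made for illustration. Note that General carries no CompanyType_ID column here, since the junction table holds the whole relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE General (
    General_ID INTEGER PRIMARY KEY,
    Name       TEXT NOT NULL            -- assumed column
);
CREATE TABLE CompanyType (
    CompanyType_ID INTEGER PRIMARY KEY,
    Label          TEXT NOT NULL        -- assumed column
);
-- Hypothetical junction table: one row per (company, type) pair.
CREATE TABLE General_CompanyType (
    General_ID     INTEGER NOT NULL REFERENCES General(General_ID),
    CompanyType_ID INTEGER NOT NULL REFERENCES CompanyType(CompanyType_ID),
    PRIMARY KEY (General_ID, CompanyType_ID)
);
""")
conn.execute("INSERT INTO General VALUES (1, 'Acme')")
conn.executemany("INSERT INTO CompanyType VALUES (?, ?)",
                 [(1, "Supplier"), (2, "Customer")])
conn.executemany("INSERT INTO General_CompanyType VALUES (?, ?)",
                 [(1, 1), (1, 2)])


def types_of(general_id):
    """Return the labels of every company type linked to one General row."""
    rows = conn.execute(
        """
        SELECT ct.Label
        FROM CompanyType AS ct
        JOIN General_CompanyType AS gct
          ON gct.CompanyType_ID = ct.CompanyType_ID
        WHERE gct.General_ID = ?
        ORDER BY ct.Label
        """,
        (general_id,),
    ).fetchall()
    return [label for (label,) in rows]
```

If a primary or distinguished type is also needed, a CompanyType_ID column in General could sit alongside the junction table, but then the two should be kept consistent.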

Related

One-to-many relationship between same entity in Core Data

I have an entity called Item. It has an attribute title, and I want it to have a collection of subitems (of type Item).
One item can have many (sub)items. Each (sub)item is part of exactly one item. For example, there is an item titled car. It has subitems titled wheels, engine and cabin. Cabin has subitems seat and steering wheel.
How should I model it? Should I set an inverse for subitems? If I set no inverse, I get a warning. And whether there is an inverse or not, it is still many-to-many. There is no way to set it to one-to-many.
How should I think about this problem? I don't have much experience with databases, and I think there is also a difference between modeling in Core Data and in SQL.
EDIT: There should be subitems instead of subitem in the picture
I've added relationship superitem as inverse to subitems. superitem is to-one type with nullify delete rule and subitems is to-many type with cascade delete rule. Seems to be the most perfect solution for my case. As bonus I don't have to write my own - addSubitem: method (as it is not generated for Swift) because it is automatically added if I set item's superitem.
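The shape of that solution can be sketched outside Core Data. The following is plain Python, not Core Data API: the class, set_superitem and delete are hypothetical stand-ins that only imitate a to-one superitem with a to-many subitems inverse, where setting the parent updates the parent's collection and deleting a parent cascades to its subitems.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity comparison, so list.remove() finds this exact object
class Item:
    title: str
    superitem: Optional[Item] = field(default=None, repr=False)
    subitems: List[Item] = field(default_factory=list)

    def __post_init__(self):
        # Route the constructor argument through set_superitem so that
        # the inverse side (subitems) is filled in as well.
        parent, self.superitem = self.superitem, None
        self.set_superitem(parent)

    def set_superitem(self, parent: Optional[Item]) -> None:
        """Set the to-one side and keep the to-many inverse in sync."""
        if self.superitem is not None:
            self.superitem.subitems.remove(self)
        self.superitem = parent
        if parent is not None:
            parent.subitems.append(self)

    def delete(self) -> List[Item]:
        """Detach from the parent, then cascade to all subitems.

        Returns every item removed, this one first.
        """
        self.set_superitem(None)
        removed = [self]
        for child in list(self.subitems):  # copy: children detach themselves
            removed.extend(child.delete())
        return removed


car = Item("car")
wheels = Item("wheels", superitem=car)
engine = Item("engine", superitem=car)
cabin = Item("cabin", superitem=car)
seat = Item("seat", superitem=cabin)
steering_wheel = Item("steering wheel", superitem=cabin)
```

Because each item holds at most one superitem, the relationship is one-to-many by construction, even though both ends refer to the same type.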
Object modeling and relational database design are quite different, at least on the surface. The concepts of encapsulation, inheritance, and polymorphism have no exact analog in the relational data model. You are going to have to think about the problem in two different ways in order to do both object modeling and relational database design.
There is a model that is sort of half way between them. It's called the "Entity Relationship model", and this has been around almost as long as the relational model. This is useful for thinking about the problem and analyzing the data requirements at a conceptual level. ER modeling is very parallel to object modeling, except that object modeling models behavior as well as data, and ER modeling only models data.
The problem with learning ER modeling for this purpose is that in the present state of affairs, most of the professionals who use ER diagrams do not use them to depict a conceptual model. They use them to depict a relational design for a database. So if you learn ER modeling from them, you'll learn a design methodology, and not an analysis methodology.
Data analysis and database design are really very different activities, and it's useful to keep them separate in your mind, even if a single project requires you to do both of them. Oddly enough, the same division ultimately comes up in object modeling as well. Some object models are analysis models, and try to clarify the problem space. Other object models are design models, and try to clarify the solution space.
Acknowledging what Mitty said. You need to wrap your brain around objects (not relational tables). Considering your example, I would break it down as follows. The top-level object is an item such as a car, truck, airplane, boat, etc. Items can have systems such as engines, transmissions, cabins. Systems can have components such as pistons, spark plugs, seats, steering wheels, tires. If you think of all these things as objects, then perhaps the beginning of a model would look like this:
An item may have many systems. Systems may have many components. Apple recommends setting the inverse, but you should worry more about the relationships and their cardinality (i.e. one-to-one, one-to-many). You can use a reflexive relationship (to self) as you depicted, but I think that limits your ability to really leverage the power of the object model as all 'things' would be represented as 'item' and you wouldn't have the nice distinction of system and component (IMO)

What to consider when deciding to use Single Table Inheritance

I'm getting ready to start a small project that provides an opportunity to use single table inheritance. As I read through prior posts on STI on Stack Overflow, there seem to be some strong opinions on both sides of the argument.
My application is related to my horse racing hobby. A horse's connections are defined as its current jockey, trainer and owner. The jockey, trainer and owner could be modeled using three separate tables (models/classes) or as one class with several subclasses through single table inheritance.
When faced with a decision like this, is there a checklist of questions that one can go through to determine which approach is preferable? I'm assuming that using STI would reduce the number of potential joins. What are the other practical considerations?
There are a few things you should think about:
Are the objects, conceptually, children of a single parent?
Don't use single table inheritance just because your classes share some attributes; make sure there is actually an OO inheritance relationship between each of them and an understandable parent class.
Do you need to do database queries on all objects together?
If you want to list the objects together or run aggregate queries on all of the data, you’ll probably want everything in the same database table for speed and simplicity.
Do the objects have similar data but different behavior?
If you have a larger number of model-specific columns, you should consider polymorphic associations instead.
The article linked goes in depth a bit more.
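As a rough illustration of the questions above, here is a sketch of the single-table idea in Python with sqlite3 rather than Rails. The connections table, its columns, the Connection parent class and the duty method are all assumptions; the point is only that one table with a type discriminator column can feed several subclasses that share data but differ in behavior, and that queries across all of them need no joins.

```python
import sqlite3


class Connection:
    """Hypothetical parent class for anyone connected to a horse."""

    def __init__(self, name):
        self.name = name

    def duty(self):
        raise NotImplementedError


class Jockey(Connection):
    def duty(self):
        return f"{self.name} rides the horse"


class Trainer(Connection):
    def duty(self):
        return f"{self.name} trains the horse"


class Owner(Connection):
    def duty(self):
        return f"{self.name} owns the horse"


# Maps the value stored in the type column back to a class.
CLASSES = {cls.__name__: cls for cls in (Jockey, Trainer, Owner)}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE connections ("
    " id INTEGER PRIMARY KEY,"
    " type TEXT NOT NULL,"   # discriminator: which subclass the row belongs to
    " name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO connections (type, name) VALUES (?, ?)",
    [("Jockey", "Ana"), ("Trainer", "Ben"), ("Owner", "Cy")],
)


def load_all(conn):
    """Build one object per row, choosing the class from the type column."""
    rows = conn.execute("SELECT type, name FROM connections ORDER BY id")
    return [CLASSES[type_name](name) for type_name, name in rows]


def count_by_type(conn):
    """Aggregate over every subclass at once, with no joins."""
    return conn.execute(
        "SELECT type, COUNT(*) FROM connections GROUP BY type ORDER BY type"
    ).fetchall()
```

If each subclass needed many columns of its own, this single table would fill up with mostly empty columns, which is the situation the last question warns about.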

Modeling a database (ERD) that has quirky behavior

One of the databases that I'm working on has some quirky behavior that I want to account for in the entity-relationship diagram.
One of the behaviors is that there is a 'booking' table and an 'invoice' table. When a 'booking' is invoiced, the record is inserted into the 'invoice' table and then deleted from the 'booking' table.
However, a reference is still kept of the booking number.
How do we model this? Big arrow between the tables and some text beside it describing what happens?
No, changing the database schema is not possible at this point in time
Edit: This is the type of diagram that I want to use:
alt text http://img813.imageshack.us/img813/5601/erdartistperformssong.png
If, by ERD, you mean the original "Chen" diagrams where the relationship was words written in a diamond, then you have a relationship between Booking and Invoice. It's a special kind of relationship that's NOT implemented with a simple foreign key; it's implemented via a complicated move and a constraint.
If, by ERD, you mean the diagrams that ERwin draws, then you don't have an easy way to do this. It tends to focus you on drawing PK-FK relationships. You have a non-PK-FK relationship between these things. Some kind of line with text is about all you can do.
Arrows, BTW, aren't appropriate because the ERD shows the "state" of the database. Data flowing around isn't part of an ERD. You do have a relationship, it's just not a typical PK-FK relationship. It's an atypical relationship based on rows existing in some places and not existing in others.
In the UML you can easily draw this as a "constraint" among the relationships.
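For readers who want to see what "a move and a constraint" could look like in practice, here is a small sketch in Python with sqlite3. The column names, the invoice_booking function and the overlap check are assumptions, since the question does not show the real schema; the sketch only illustrates that the relationship is enforced by rows existing in one table and not the other, rather than by a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE booking (
    booking_no INTEGER PRIMARY KEY,
    customer   TEXT NOT NULL
);
CREATE TABLE invoice (
    invoice_no INTEGER PRIMARY KEY,
    booking_no INTEGER NOT NULL UNIQUE,  -- reference kept, but not a foreign key
    customer   TEXT NOT NULL
);
""")


def invoice_booking(conn, booking_no):
    """Move one booking into invoice; both steps commit or neither does."""
    with conn:  # commits on success, rolls back if an exception escapes
        row = conn.execute(
            "SELECT customer FROM booking WHERE booking_no = ?", (booking_no,)
        ).fetchone()
        if row is None:
            raise LookupError(f"no booking {booking_no}")
        conn.execute(
            "INSERT INTO invoice (booking_no, customer) VALUES (?, ?)",
            (booking_no, row[0]),
        )
        conn.execute("DELETE FROM booking WHERE booking_no = ?", (booking_no,))


def overlapping_booking_numbers(conn):
    """The constraint: no booking number may appear in both tables."""
    rows = conn.execute(
        "SELECT b.booking_no FROM booking AS b"
        " JOIN invoice AS i ON i.booking_no = b.booking_no"
    ).fetchall()
    return [number for (number,) in rows]


conn.execute("INSERT INTO booking VALUES (101, 'Alice')")
conn.execute("INSERT INTO booking VALUES (102, 'Bob')")
invoice_booking(conn, 101)
```

The overlap check always returning an empty list is the state an ERD annotation would need to describe, whatever notation is used for it.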
I don't know what these people are talking about.
The Entity Relation Diagram doesn't describe the data fully; yes of course, it only shows Entities and Relations, it doesn't show Attributes. That's why it is called an ERD and not a Data Model. Evidently many people here can't tell the difference.
The Data Model is supposed to show as much as possible. But it depends on (a) the standard [if any] that you use and (b) the Notation. Some show more than others. IDEF1X is the only Relational modelling Standard (NIST FIPS 184 of 1993). It is the most complete, and shows intricacies and complexities that other notations do not show. Recently MS and others have come out with "simplified" notations; of course, much is lost in the "ERDs".
It is not "process flow", it is a relation in a database.
UML is completely inappropriate for modelling data, especially when there is at least one Standard plus several non-standard but commonly used data modelling notations. There is nothing that can be shown in UML that can't be shown in IDEF1X. But most developers here have never heard of it (developers should not be modelling unless they have acquired modelling skills, but that is another story).
This is perfectly legal; it may not be commonly known, but it is legal and named. It is a Supertype-Subtype relation, except that the Cardinality is 1::0-n instead of 1::0-1. The IDEF1X Notation (right) has a Subtype symbol. Note there is only one relation at the parent end, and one each at the child end. And of course the crows feet show the cardinality. These relations can be Exclusive or Non-exclusive; yours is Exclusive; that is what the X through the half-circle means.
ERwin is the only modelling (not diagramming) tool that implements IDEF1X, and thus has the full complement of the IDEF1X Notation.
Of course, the Standard, the modelling capability, are all in the mind, not in the tool. I draw Data Models that are IDEF1X-compliant using a simple drawing tool.
I find that some developers baulk at the Subtype symbol, so I show a simplified version (left) in my IDEF1X models; it is intended to convey the sense of exclusivity, while the retention of the single line at the parent end indicates it is a subtype.
Link to Data Model
Link to IDEF1X Notation for those who are unfamiliar with the Relational Modelling Standard.
Sounds like a process flow, not an entity relationship. If at the time the entry is added to invoice, and the entry is deleted from booking, then there is never a relationship between the two. There is never a situation where you can traverse that relationship because there is never a record in both places that can be related together.
ERDs don't describe the database fully. There are other things like process flow and use cases that detail other facets of the system.
This is kind of an analogy to UML for software. A class diagram doesn't show you all the different ways classes interact. One class might initialize locally and call functions of another class, but because there is not composition or inheritance that relates those two classes, then the class diagram doesn't show this relationship. Only when you fully document the system with all the various types of diagrams can you see all the facets of how it operates.

Bad practice to have models made up of other models?

I have a situation where I have Model A that has a variety of properties. I have discovered that some of the properties are similar across other models. My thought was I could create Model B and Model C and have Model A be a composite with a Model B property and a Model C property.
Just trying to determine if this is the best way to handle this situation.
It's definitely valid in certain situations. Let's say you have a Person class and a Company class, and they have the common properties streetNumber, streetName, postcode, etc. It makes sense to make a new model class called Address that both Person and Company contain. Inheritance is the completely wrong way to go in such a situation.
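A minimal Python sketch of that Address example follows. The class names and the streetNumber, streetName and postcode properties come from the answer (renamed to snake_case); the name and legal_name fields and the one_line helper are assumptions added so the example has something to run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """The shared properties, pulled out into a model of their own."""
    street_number: str
    street_name: str
    postcode: str

    def one_line(self) -> str:
        return f"{self.street_number} {self.street_name}, {self.postcode}"


@dataclass
class Person:
    name: str
    address: Address  # a Person has an Address; it is not a kind of Address


@dataclass
class Company:
    legal_name: str
    address: Address  # the same model reused, with no common base class


office = Address("12", "High Street", "AB1 2CD")
alice = Person("Alice", office)
acme = Company("Acme Ltd", office)
```

Person and Company stay unrelated types, which is the point: they share data through a contained model, not through a parent class.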
When properties (e.g. state) are the elements of commonality, I definitely tend towards using composition rather than inheritance. When using inheritance, it's perhaps best to wait until behavior is the commonality, and overrides are needed now or imminently.
What you're looking at is creating an Aggregate Root, a core paradigm of the Domain Driven Design (DDD) principles.
Certain models in your app will appear to belong "at the top" or "as root" to other objects. For example in the case of customers you might have a Contact model which then contains a collection of ContactPoints (names, addresses, etc).
Or a Post (in the case of a blog), which contains a collection of Comments, a Title, Body and a TagSet (for tagging). Notice the items I've highlighted as objects: these are other model types as opposed to simple types (strings, ints, etc).
The trick will come when and how you decide to 'fill' these Aggregate Root trees/graphs, i.e. how will you query just for a single TagSet? Will you go to the top and get the corresponding Post first? Maybe you just wanted to rename the tag "aspnetmvc" to "asp.net-mvc" for all Posts, so you want to cut in and just get the TagSet item.
The MVC Storefront tutorial has some good examples of this pattern. Take a look if you can.
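To illustrate the tag-renaming question, here is a small Python sketch in which Post acts as the root and TagSet is reached through it. The field names, the rename method and the rename_tag helper are assumptions, not part of any DDD library or of the tutorial mentioned above.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class TagSet:
    tags: Set[str] = field(default_factory=set)

    def rename(self, old: str, new: str) -> bool:
        """Rename one tag; report whether this set actually contained it."""
        if old not in self.tags:
            return False
        self.tags.remove(old)
        self.tags.add(new)
        return True


@dataclass
class Post:
    """The root: other model types hang off it alongside simple types."""
    title: str
    body: str
    tag_set: TagSet = field(default_factory=TagSet)
    comments: List[str] = field(default_factory=list)


def rename_tag(posts: List[Post], old: str, new: str) -> int:
    """Go through each root to reach its TagSet; return how many changed."""
    return sum(post.tag_set.rename(old, new) for post in posts)


posts = [
    Post("Routing", "...", TagSet({"aspnetmvc", "routing"})),
    Post("Views", "...", TagSet({"aspnetmvc"})),
    Post("Unrelated", "...", TagSet({"sql"})),
]
changed = rename_tag(posts, "aspnetmvc", "asp.net-mvc")
```

Here every Post is loaded just to touch its tags. Whether that is acceptable, or whether TagSet deserves its own direct query path, is the design decision the answer is pointing at.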

Database and relationship design when subclassing a model

I have a model "Task" which will HABTM many "TaskTargets".
When it comes to TaskTargets however, I'm writing the base TaskTarget class which is abstract (as much as can be in Rails). TaskTarget will be subclassed by various different conceptualizations of anything that can be the target of a task. So say, software subsystem, customer site, bathroom, etc...
The design of the classes here is fairly straightforward, but where I'm hitting a snag is in how I will relate it all together and how I will have rails manipulate those relationships.
My first thought is that I will have a TaskTarget table which will contain the basic common fields (name, description...). It will then also have a polymorphic relationship out to a table specific to the type of data the implementing class wraps.
This implies that the data for one instance of a class implementing TaskTarget will be found in two tables.
The second approach is to create a polymorphic HABTM relationship between Task and the subclasses of TaskTarget, for which I thought I could reuse the table name TaskTarget for the join table.
Option #2 I suspect is the most robust, but maybe I'm missing something. Thanks for any help and of course I'm really just asking to make sure I get it done right, once!
I think the two approaches (easily) available to you in Rails are:
1) Single Table Inheritance: You create a single TaskTarget table that has every field that every subclass might want. You then also add a "type" field that stores the class name, and Rails will pretty much do the rest for you. See the ActiveRecord api docs for more info, especially the "Single Table Inheritance" section.
2) Concrete Table Inheritance: There is no table for the base TaskTarget class. Instead, simply create a table for each concrete class in your hierarchy with only the fields needed by that class.
The first option makes it easier to do things like "Show me all the TaskTargets, regardless of subclass," and results in fewer tables. It does make it a little harder to tell exactly what one subclass can do, as opposed to another, and if you have a lot of TaskTargets, I suppose eventually having them all in one table could be a performance concern.
The second option makes for a cleaner schema that is somewhat easier to read, and each class will work pretty much just like any normal ActiveRecord model. However, joining across all TaskTarget tables can be cumbersome, especially as you add more subclasses in the future. Implementing any necessary polymorphic associations may also involve some extra complexity.
Which option is better in your situation will depend on what operations you need to implement, and the characteristics of your data set.
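The trade-off around "show me all the TaskTargets" can be sketched at the SQL level, here driven from Python with sqlite3 rather than ActiveRecord. All table and column names below are hypothetical, as are the two example subclasses' specific fields; the sketch only contrasts one query against a single table with a UNION ALL across one table per concrete class.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Option 1, single table: every subclass's fields plus a type column.
CREATE TABLE task_targets (
    id             INTEGER PRIMARY KEY,
    type           TEXT NOT NULL,
    name           TEXT NOT NULL,
    repo_url       TEXT,   -- only meaningful for software subsystems
    street_address TEXT    -- only meaningful for customer sites
);
INSERT INTO task_targets (type, name, repo_url, street_address) VALUES
    ('SoftwareSubsystem', 'Billing', 'git://example/billing', NULL),
    ('CustomerSite', 'Harbour Office', NULL, '1 Quay Road');

-- Option 2, concrete tables: one table per class, only its own fields.
CREATE TABLE software_subsystems (
    id INTEGER PRIMARY KEY, name TEXT NOT NULL, repo_url TEXT
);
CREATE TABLE customer_sites (
    id INTEGER PRIMARY KEY, name TEXT NOT NULL, street_address TEXT
);
INSERT INTO software_subsystems (name, repo_url)
    VALUES ('Billing', 'git://example/billing');
INSERT INTO customer_sites (name, street_address)
    VALUES ('Harbour Office', '1 Quay Road');
""")


def all_targets_single_table(conn):
    """One plain query lists every target, whatever its subclass."""
    return conn.execute(
        "SELECT type, name FROM task_targets ORDER BY name"
    ).fetchall()


def all_targets_concrete_tables(conn):
    """Each concrete table adds another branch to the UNION ALL."""
    return conn.execute(
        "SELECT 'SoftwareSubsystem' AS type, name FROM software_subsystems"
        " UNION ALL"
        " SELECT 'CustomerSite' AS type, name FROM customer_sites"
        " ORDER BY name"
    ).fetchall()
```

Both functions return the same rows for this data. The second has to be edited every time a subclass is added, while the first accumulates nullable columns instead.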

Resources