Should all the attributes be included in an ER diagram or not?
My database has 8 tables and nearly 50 attributes. Should I include all 50 attributes in a single diagram or not?
First thing: an ER diagram is usually created first and converted later into a data model such as the relational model. So the data storage requirements are modeled in an ER diagram first and converted into tables afterwards. There is rarely a case where you have to reverse engineer an existing data model back to the ER diagram level (the conceptual level).
During this conversion, tables are created for entities and relationships accordingly (using some algorithmic process).
A good practice is to draw the ER diagram with only entities and relationships, without the attributes, because attributes make the diagram cluttered and make it difficult to concentrate on the actual entities and the relationships between them. Later, in a separate document or page, you can list the attributes of each entity individually.
Hope this gives you a basic idea.
I am denormalising an OLTP database for use in a DWH.
At the moment I am denormalising studygroups.
Each studygroup has a key pointing towards 1 project.
Each project has a key pointing towards 1 department.
Each department has a key pointing towards 1 university.
Each university has a key pointing to 1 city.
Now I know that you are supposed to denormalize the sh*t out of your OLTP, but in this DWH department will be a dimension on its own. This goes for university also. Would it suffice to add a key from studygroup pointing at department, or is it wiser to denormalize as far as you can and add all attributes from the department and all attributes from its M:1 related tables to the dimension studygroup? Even when department and university will be dimensions by themselves?
In other words: how far/deep do you go when denormalizing?
The key concept behind a dimensional model is:
Keep your fact tables in 3NF (third normal form);
De-normalize your dimensions into 2NF (second normal form)
So ideally, the only joins you should have in your model are the joins between fact tables and relevant dimensions.
As part of this philosophy:
Avoid "snow flake" designs, where dimensions contain keys to other dimensions. It's always possible to come up with a data model that allows the same functionality as the snow flakes, without violating 3NF/2NF rule;
Never have any direct joins between 2 separate dimensions (e.g. department and study group). All relations among dimensions must be resolved via fact tables;
Never have any direct joins between 2 separate fact tables. Any relations among fact tables must be resolved via shared dimensions.
Finally, consider that dimensional design, besides optimizing the data for querying, serves a second important purpose: it's a semantic model of the business (or whatever else it represents). So, when making decisions about combining data elements into dimensions and facts, consider their "logical affinity" - they should make intuitive sense to the end users. If you have a hard time explaining the meaning of your dimension or fact table to a BI analyst, most likely you've made a modeling mistake.
For example, in your case you should consider the logical relations between universities, departments, study groups, etc. It's very likely that University/Department form a natural hierarchy. If so, they should belong to the same dimension. Study group, on the other hand, might not - let's assume it's possible to form study groups across multiple universities and/or multiple departments. Such Many:Many relations are a clear indication that they should be resolved via fact tables. In addition, relations between universities and departments are stable (they rarely change), while study groups are formed and dissolved very often, and thus should be modeled separately.
In general, if you see 1:1 or 1:M relations between dimensional elements, it's often an indication that they should be de-normalized into the same table (again, only if their combination makes logical sense). If the relations are M:M, most likely they belong to different tables (you can force them into the same table, but often such tables look like Frankenstein creatures).
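To make the advice above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, sample rows and the student_count measure are all assumptions made up for illustration; they are not taken from the asker's schema. The stable department/university/city hierarchy is flattened into one dimension, study groups get their own dimension, and the two meet only in a fact table.

```python
import sqlite3


def build_star_schema() -> sqlite3.Connection:
    """Create a tiny, hypothetical star schema in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Department dimension: university and city are flattened into it,
        -- so no key points from this dimension to another dimension.
        CREATE TABLE dim_department (
            department_key  INTEGER PRIMARY KEY,
            department_name TEXT NOT NULL,
            university_name TEXT NOT NULL,
            city_name       TEXT NOT NULL
        );
        -- Study group dimension: holds no department or university key.
        CREATE TABLE dim_studygroup (
            studygroup_key  INTEGER PRIMARY KEY,
            studygroup_name TEXT NOT NULL,
            project_name    TEXT NOT NULL
        );
        -- Fact table: the only place where the two dimensions meet.
        CREATE TABLE fact_enrollment (
            studygroup_key INTEGER NOT NULL REFERENCES dim_studygroup(studygroup_key),
            department_key INTEGER NOT NULL REFERENCES dim_department(department_key),
            student_count  INTEGER NOT NULL  -- hypothetical measure
        );
    """)
    conn.executemany(
        "INSERT INTO dim_department VALUES (?, ?, ?, ?)",
        [(1, "Physics", "Uni A", "City X"), (2, "Chemistry", "Uni B", "City Y")],
    )
    conn.executemany(
        "INSERT INTO dim_studygroup VALUES (?, ?, ?)",
        [(1, "Optics group", "Lasers"), (2, "Crossover group", "Materials")],
    )
    # Study group 2 spans two departments, which the fact table allows.
    conn.executemany(
        "INSERT INTO fact_enrollment VALUES (?, ?, ?)",
        [(1, 1, 12), (2, 1, 5), (2, 2, 7)],
    )
    conn.commit()
    return conn


def students_per_university(conn: sqlite3.Connection) -> dict:
    """Aggregate the fact table with a single fact-to-dimension join."""
    rows = conn.execute("""
        SELECT d.university_name, SUM(f.student_count)
        FROM fact_enrollment AS f
        JOIN dim_department AS d ON d.department_key = f.department_key
        GROUP BY d.university_name
    """).fetchall()
    return dict(rows)
```

Calling students_per_university(build_star_schema()) sums the measure per university with one join and no dimension-to-dimension hop. If university also needs to be a dimension of its own, it would get its own key in whichever fact tables need it rather than a key inside dim_department.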
You can get much better help by making your question more specific - draw your dimensional model, post it, and ask for specific issues/challenges you have. For general concepts, books from Kimball and Inmon are your best friends.
I'm designing an ER diagram for a social network, and recently I got into an argument with my colleagues about whether this part is right or wrong:
ER DIAGRAM PROBLEM
Here Faqet (Pages) is connected with Shfrytezuesi (User) using three actions: pelqen stores likes, krijon faqe records who created the page, and udheheq stores all page admins. So my questions are:
Is this design wrong?
Can two tables be linked with more than one action? This is where I'm not certain.
It's perfectly valid to have any number of relationships between any number of entity sets. My only concern with the diagram is that multiple role lines below Shfrytezuesi are merged into one - I recommend keeping them distinct.
Note that in the entity-relationship model, we don't link tables. That idea comes from the old network data model, in which rows represented entities, tables represented entity sets, and links between rows/tables represented relationships.
One disadvantage with that model is that it supports only directed binary relationships - many-to-many binary, ternary and higher relationships and relationships with attributes all required associative entities to be introduced. However, three binary relationships aren't equivalent to a ternary relationship, and not all relationships can be represented in binary data models.
The ER model supports n-ary relationships and attributes on relationships. Entity sets are represented by their primary keys and relationships by combinations of entity keys. Entity sets plus attributes form entity relations; relationship sets plus attributes form relationship relations. These relations get mapped to tables. In practice, tables with the same primary keys get combined to reduce the number of tables, which means one-to-one and one-to-many relationships get combined into the relations for one of their associated entity sets.
Regardless of how tables are combined, attributes and relationships are represented by sets of columns. For example, based on your diagram, Pelqen would be represented as (FID PK, SID) (assuming SID is the primary key of Shfrytezuesi). These columns might have different names, e.g. SID might be renamed to AdminSID, especially if the relationship was combined into Faqet. The old network data model would view FID FK -> FID PK as a relationship, which as described above and below is a very limited kind of relationship and not the approach taken by the ER model.
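As a rough illustration of the mapping described above, the sketch below uses Python's sqlite3 module to turn the three relationships into columns. The cardinalities are my assumptions and are not read from the diagram: each page has exactly one creator (so krijon faqe is folded into Faqet as a CreatorSID column), while likes and admins are treated as many-to-many and get their own tables. The title and name columns and the sample rows are hypothetical.

```python
import sqlite3


def build_social_schema() -> sqlite3.Connection:
    """Three distinct relationships between the same two entity sets."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE Shfrytezuesi (
            SID  INTEGER PRIMARY KEY,
            name TEXT NOT NULL              -- hypothetical attribute
        );
        -- "krijon faqe" is assumed one-to-many, so it is combined into Faqet.
        CREATE TABLE Faqet (
            FID        INTEGER PRIMARY KEY,
            title      TEXT NOT NULL,       -- hypothetical attribute
            CreatorSID INTEGER NOT NULL REFERENCES Shfrytezuesi(SID)
        );
        -- "pelqen" (likes), assumed many-to-many: keyed by both entity keys.
        CREATE TABLE Pelqen (
            FID INTEGER NOT NULL REFERENCES Faqet(FID),
            SID INTEGER NOT NULL REFERENCES Shfrytezuesi(SID),
            PRIMARY KEY (FID, SID)
        );
        -- "udheheq" (admins), assumed many-to-many; SID renamed to AdminSID.
        CREATE TABLE Udheheq (
            FID      INTEGER NOT NULL REFERENCES Faqet(FID),
            AdminSID INTEGER NOT NULL REFERENCES Shfrytezuesi(SID),
            PRIMARY KEY (FID, AdminSID)
        );
        INSERT INTO Shfrytezuesi VALUES (1, 'creator'), (2, 'visitor');
        INSERT INTO Faqet VALUES (10, 'demo page', 1);
        INSERT INTO Udheheq VALUES (10, 1);
        INSERT INTO Pelqen VALUES (10, 2);
    """)
    return conn


def roles_of(conn: sqlite3.Connection, sid: int, fid: int) -> set:
    """Return which of the three relationships hold between a user and a page."""
    checks = {
        "krijon": "SELECT 1 FROM Faqet WHERE FID = ? AND CreatorSID = ?",
        "pelqen": "SELECT 1 FROM Pelqen WHERE FID = ? AND SID = ?",
        "udheheq": "SELECT 1 FROM Udheheq WHERE FID = ? AND AdminSID = ?",
    }
    return {
        role
        for role, sql in checks.items()
        if conn.execute(sql, (fid, sid)).fetchone() is not None
    }
```

The same pair of entity sets participates in all three relationships without conflict, because each relationship is its own set of columns.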
Another disadvantage of the network data model is predetermined access paths, which means we have to navigate from table to table using the predefined relationships. This complicated queries and data processing significantly. This limitation was one of the main drivers for the development of the relational model, to which the ER model maps. The understanding of tables as relations in the RM enables us to construct and navigate arbitrary access paths using joins. So, we do link tables in the RM, but at query time and as needed rather than at design time. The ER model is used for conceptual design only and doesn't describe relationships between tables, only relationships between entity sets.
Now, the ER model isn't a complete and consistent logical model like the RM, but it is a significant improvement over the network data model. An even more rigorous approach than ER would be object-role modeling, but that's a different topic.
I have an entity called Item. It has an attribute title, and I want it to have a collection of subitems (of type Item).
One item can have many (sub)items. Each (sub)item is part of exactly one item. For example, there is an item titled car. It has subitems titled wheels, engine and cabin. Cabin has subitems seat and steering wheel.
How do I model it? Should I set an inverse for subitems? If I set no inverse, I get a warning. And whether it has an inverse or not, it is still many-to-many. There seems to be no way to set it to one-to-many.
How should I think about this problem? I don't have much experience with databases, and I think there is also a difference between modeling in Core Data and in SQL.
EDIT: There should be subitems instead of subitem in the picture
I've added a relationship superitem as the inverse of subitems. superitem is a to-one relationship with the nullify delete rule, and subitems is a to-many relationship with the cascade delete rule. This seems to be the best solution for my case. As a bonus, I don't have to write my own - addSubitem: method (it is not generated for Swift), because the subitem is added automatically when I set an item's superitem.
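For comparison with the SQL side of the question, the following is a relational analogue of that solution, sketched with Python's sqlite3 module. It is not Core Data code. The table and column names are assumptions: the to-one superitem becomes a nullable self-referencing foreign key, and the cascade rule on subitems is approximated with ON DELETE CASCADE.

```python
import sqlite3


def build_item_tree() -> sqlite3.Connection:
    """Self-referencing one-to-many: each item has at most one superitem."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.execute("""
        CREATE TABLE item (
            id           INTEGER PRIMARY KEY,
            title        TEXT NOT NULL,
            -- NULL for top-level items; deleting a parent deletes its subitems
            superitem_id INTEGER REFERENCES item(id) ON DELETE CASCADE
        )
    """)
    conn.executemany(
        "INSERT INTO item VALUES (?, ?, ?)",
        [
            (1, "car", None),
            (2, "wheels", 1),
            (3, "engine", 1),
            (4, "cabin", 1),
            (5, "seat", 4),
            (6, "steering wheel", 4),
        ],
    )
    conn.commit()
    return conn


def subitems(conn: sqlite3.Connection, title: str) -> list:
    """List the titles of the direct subitems of the item with this title."""
    rows = conn.execute(
        """
        SELECT child.title
        FROM item AS child
        JOIN item AS parent ON child.superitem_id = parent.id
        WHERE parent.title = ?
        ORDER BY child.title
        """,
        (title,),
    ).fetchall()
    return [row[0] for row in rows]
```

There is no separate subitems column here: the to-many side is derived by querying on superitem_id, which loosely corresponds to the inverse relationship being maintained automatically.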
Object modeling and relational database design are quite different, at least on the surface. The concepts of encapsulation, inheritance, and polymorphism have no exact analog in the relational data model. You are going to have to think about the problem in two different ways in order to do both object modeling and relational database design.
There is a model that is sort of half way between them. It's called the "Entity Relationship model", and this has been around almost as long as the relational model. This is useful for thinking about the problem and analyzing the data requirements at a conceptual level. ER modeling is very parallel to object modeling, except that object modeling models behavior as well as data, and ER modeling only models data.
The problem with learning ER modeling for this purpose is that in the present state of affairs, most of the professionals who use ER diagrams do not use them to depict a conceptual model. They use them to depict a relational design for a database. So if you learn ER modeling from them, you'll learn a design methodology, and not an analysis methodology.
Data analysis and database design are really very different activities, and it's useful to keep them separate in your mind, even if a single project requires you to do both of them. Oddly enough, the same division ultimately comes up in object modeling as well. Some object models are analysis models, and try to clarify the problem space. Other object models are design models, and try to clarify the solution space.
Acknowledging what Mitty said. You need to wrap your brain around objects (not relational tables). Considering your example, I would break it down as follows. The top level object is an item such as a car, truck, airplane, boat, etc. Items can have systems such as engines, transmissions, cabins. Systems can have components such as pistons, spark plugs, seats, steering wheels, tires. If you think of all these things as objects, then perhaps the beginning of a model would look like this:
An item may have many systems. Systems may have many components. Apple recommends setting the inverse, but you should worry more about the relationships and their cardinality (i.e. one-to-one, one-to-many). You can use a reflexive relationship (to self) as you depicted, but I think that limits your ability to really leverage the power of the object model, as all 'things' would be represented as 'item' and you wouldn't have the nice distinction of system and component (IMO).
What is the difference between ER diagrams and a database schema? MySQL Workbench has a facility to draw ER diagrams, but the symbols it uses differ from the ER diagram symbols in other drawing tools.
A database schema is usually a relational model/diagram. It shows the links between tables: primary keys and foreign keys.
In a database diagram, the relation between an apple and an apple tree would be:
A foreign key "ID_TRE", which cannot be null, in the table "APPLE" is linked to a primary key "ID_TRE" in the table "TREE".
An entity relationship diagram shows the links between the entities and the kind of relation between them. We are not talking about tables or keys there! Usually the entity relationship diagram follows the Merise model. Database managers and developers like myself usually build an entity relationship model before designing the relational model/diagram.
The set of symbols in Merise is: (0-1, 0-n, 1-1, 1-n). The first number, 0 or 1, describes whether the other part of the association is required for an object to exist. If it is 0, the object can exist without being associated. If it is 1, the object exists only in relation with another object (e.g. an apple needs a tree to exist --> 1, a tree doesn't need an apple to exist --> 0).
The second character tells us how many objects are accepted in the other part of the association. If it is 1, only one object can exist in the relation; if it is n, an unlimited number of objects can be linked (e.g. an apple can have one tree --> 1, a tree can have multiple apples --> n).
With an entity relationship diagram, the relationship would be described as:
An apple has to belong to at least one tree to exist and can belong to only one tree (1-1). A tree doesn't need an apple to exist, but it can have an unlimited number of apples (0-n).
In fact both descriptions mean the same thing, but one is database oriented while the other is modelling oriented. Some modelling software such as DB-MAIN automatically converts an ER diagram to the relational diagram.
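The database-oriented side of the apple/tree example can be sketched as follows, using Python's sqlite3 module. The column names loosely follow the description above, and ID_APPLE and the helper function are my own additions. The NOT NULL foreign key is what encodes the 1-1 side (an apple needs exactly one tree), and the absence of any constraint on TREE encodes the 0-n side.

```python
import sqlite3


def build_orchard() -> sqlite3.Connection:
    """TREE (0-n apples) and APPLE (exactly one tree) as two tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript("""
        CREATE TABLE TREE (
            ID_TRE INTEGER PRIMARY KEY
        );
        CREATE TABLE APPLE (
            ID_APPLE INTEGER PRIMARY KEY,                       -- hypothetical key
            ID_TRE   INTEGER NOT NULL REFERENCES TREE(ID_TRE)   -- mandatory tree
        );
    """)
    return conn


def can_insert_apple(conn: sqlite3.Connection, apple_id: int, tree_id) -> bool:
    """Try to insert an apple; report whether the constraints accepted it."""
    try:
        conn.execute("INSERT INTO APPLE VALUES (?, ?)", (apple_id, tree_id))
        return True
    except sqlite3.IntegrityError:
        # Raised for a NULL tree as well as for a tree that does not exist.
        return False
```

A tree row can be inserted with no apples at all, while an apple without a tree, or with a tree that does not exist, is rejected.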
ENTITY RELATIONSHIP DIAGRAMS (ERDs) are just that: DIAGRAMS which describe the RELATIONSHIPS between ENTITIES. Now let's look closer...
ERDs are often created by Business Analysts (NOT DBAs);
ERDs are often described in LAYMAN's terms (NOT techno-speak of DBAs or other);
ERDs are meant to summarize & clarify understanding for End Users and Business SMEs (again, NOT the DBAs or Developers)
ERDs work best when each entity is described in the SINGULAR, and the lines connecting various entities to other entities in the ERD use verbs (of action or possession, or existence) to describe each relation;
ERDs can (and do) include lines which denote n:n relationships, but this is not a requirement.
Examples of entities in an ERD for a blog: Member, Post, Comment, Category
Examples of relationships described in an ERD:
Member "posts" 1 to n Posts; (note we AREN'T describing WHAT a post looks like)
Post "relevant-to" 1 to n Categories
etc.
DATA SCHEMAS bear some resemblance to ERDs, but they should NOT be considered either equivalent or interchangeable. If you make an ERD which can be used as a data schema... be open to the possibility you DIDN'T make an ERD ! ;-)
DATA schemas are diagrams used to describe to a DBA how data will be stored in a database (relational or non-relational).
Data Schemas almost invariably describe the structure & characteristics of TABLES;
Tables are "containers" (cardboard boxes);
As such tables in a data schema are BEST named in the PLURAL
Examples of the TABLES in a SCHEMA for the same blog:
MEMBERS, POSTS, CATEGORIES, COMMENTS (relational database)
or
POSTS (keyed by Member-Date, with all other columns in 1 table; a non-relational database, as for a "big data" project);
a data schema would then describe the data contained in each table:
MEMBER
FirstName (char:25)
LastName (char:25)
etc.
the lines between tables in a data schema would NOT try to represent any 'relation' other than a "KEY" between 2 fields which could be used to "join" the tables, and some additional characteristics of those lines to denote n:n relationships.
BOTH diagrams serve quite DIFFERENT purposes:
ERD: to make mere mortal end-users (and business owners) UNDERSTAND the model of a given business solution; and
DATA SCHEMA: a "blueprint" used by DBAs to BUILD databases, and by DEVELOPERS to CONSUME the data in that database.
A database schema is a description of the actual construction of the database. It is an all-encompassing term that refers to the collection of tables, columns, triggers, relationships, key constraints, functions and procedures. It can refer to a document that describes all of this (such as an XML Schema) or to an abstraction of the database makeup itself ("It would be difficult to change the schema of the database at this point"). It does not refer to rows inserted into the schema, or to the data itself. You would insert data into an existing schema.
An Entity Relationship Diagram is a visualization of the relationships between tables in a database. At the very least, it includes table names visualized as squares connected by lines that represent primary and foreign key constraints. It often includes the column names and symbols that include information about what kind of relationship exists between the columns (one-to-one, one-to-many, many-to-many).
One of the databases that I'm working on has some quirky behavior that I want to account for in the entity-relationship diagram.
One of the behaviors is that there is a 'booking' table and an 'invoice' table. When a 'booking' is invoiced, the record is inserted into the 'invoice' table and then deleted from the 'booking' table.
However, a reference to the booking number is still kept.
How do we model this? A big arrow between the tables and some text beside it describing what happens?
No, changing the database schema is not possible at this point in time.
Edit: This is the type of diagram that I want to use:
(Image: http://img813.imageshack.us/img813/5601/erdartistperformssong.png)
If, by ERD, you mean the original "Chen" diagrams where the relationship was words written in a diamond, then you have a relationship between Booking and Invoice. It's a special kind of relationship that's NOT implemented with a simple foreign key; it's implemented via a complicated move and a constraint.
If, by ERD, you mean the diagrams that ERwin draws, then you don't have an easy way to do this. It tends to focus you on drawing PK-FK relationships. You have a non-PK-FK relationship between these things. Some kind of line with text is about all you can do.
Arrows, BTW, aren't appropriate because the ERD shows the "state" of the database. Data flowing around isn't part of an ERD. You do have a relationship, it's just not a typical PK-FK relationship. It's an atypical relationship based on rows existing in some places and not existing in others.
In the UML you can easily draw this as a "constraint" among the relationships.
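To show what "a move and a constraint" could look like in practice, here is a small sketch with Python's sqlite3 module. The real schema is not shown in the question, so every table, column and function name here is an assumption. The move runs in one transaction, and the constraint (a booking number never appears in both tables at once) is expressed as a check query, since it is not a PK-FK rule.

```python
import sqlite3


def build_billing() -> sqlite3.Connection:
    """Hypothetical booking and invoice tables with no foreign key between them."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE booking (
            booking_no INTEGER PRIMARY KEY,
            customer   TEXT NOT NULL
        );
        CREATE TABLE invoice (
            invoice_no INTEGER PRIMARY KEY,
            booking_no INTEGER NOT NULL UNIQUE,  -- kept reference, deliberately not an FK
            customer   TEXT NOT NULL
        );
    """)
    return conn


def invoice_booking(conn: sqlite3.Connection, booking_no: int) -> int:
    """Move one booking into invoice; both steps commit or roll back together."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT customer FROM booking WHERE booking_no = ?", (booking_no,)
        ).fetchone()
        if row is None:
            raise LookupError(f"no booking {booking_no}")
        cur = conn.execute(
            "INSERT INTO invoice (booking_no, customer) VALUES (?, ?)",
            (booking_no, row[0]),
        )
        conn.execute("DELETE FROM booking WHERE booking_no = ?", (booking_no,))
        return cur.lastrowid


def violates_exclusivity(conn: sqlite3.Connection) -> bool:
    """True if any booking number exists in both tables at the same time."""
    count = conn.execute(
        "SELECT COUNT(*) FROM booking JOIN invoice USING (booking_no)"
    ).fetchone()[0]
    return count > 0
```

A foreign key from invoice.booking_no to booking would make the delete fail, which is one reason the relationship cannot be drawn as an ordinary PK-FK line.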
I don't know what these people are talking about.
The Entity Relation Diagram doesn't describe the data fully; yes of course, it only shows Entities and Relations, it doesn't show Attributes. That's why it is called an ERD and not a Data Model. Evidently many people here can't tell the difference.
The Data Model is supposed to show as much as possible. But it depends on (a) the standard [if any] that you use and (b) the Notation. Some show more than others. IDEF1X is the only Relational modelling Standard (NIST FIPS 184 of 1993). It is the most complete, and shows intricacies and complexities that other notations do not show. Recently MS and others have come out with "simplified" notations; of course, much is lost in the "ERDs".
It is not "process flow", it is a relation in a database.
UML is completely inappropriate for modelling data, especially when there is at least one Standard plus several non-standard but commonly used data modelling notations. There is nothing that can be shown in UML that can't be shown in IDEF1X. But most developers here have never heard of it (developers should not be modelling unless they have acquired modelling skills, but that is another story).
This is perfectly legal; it may not be commonly known, but it is legal and named. It is a Supertype-Subtype relation, except that the Cardinality is 1::0-n instead of 1::0-1. The IDEF1X Notation (right) has a Subtype symbol. Note there is only one relation at the parent end, and one each at the child end. And of course the crow's feet show the cardinality. These relations can be Exclusive or Non-exclusive; yours is Exclusive; that is what the X through the half-circle means.
ERwin is the only modelling (not diagramming) tool that implements IDEF1X, and thus has the full complement of the IDEF1X Notation.
Of course, the Standard, the modelling capability, are all in the mind, not in the tool. I draw Data Models that are IDEF1X-compliant using a simple drawing tool.
I find that some developers baulk at the Subtype symbol, so I show a simplified version (left) in my IDEF1X models; it is intended to convey the sense of exclusivity, while the retention of the single line at the parent end indicates it is a subtype.
Link to Data Model
Link to IDEF1X Notation for those who are unfamiliar with the Relational Modelling Standard.
Sounds like a process flow, not an entity relationship. If the entry is deleted from booking at the same time it is added to invoice, then there is never a relationship between the two. There is never a situation where you can traverse that relationship, because there is never a record in both places that can be related together.
ERDs don't describe the database fully. There are other things, like process flows and use cases, that detail other facets of the system.
This is somewhat analogous to UML for software. A class diagram doesn't show you all the different ways classes interact. One class might instantiate another class locally and call its functions, but because no composition or inheritance relates those two classes, the class diagram doesn't show this relationship. Only when you fully document the system with all the various types of diagrams can you see all the facets of how it operates.