What is the difference between ER Diagrams and Database Schema? MySQL Workbench has facility to draw ER diagrams, but the symbols for ER diagrams different in other drawing tools than MySQL Workbench method.
A database schema is usually a relational model/diagram. it shows the link between tables: primary keys and foreign keys.
In database diagram the relation between an apple and a apple tree would be:
A foreign key "ID__TRE" which cannot be null in the table "APPLE" is linked to a primary key "ID_TRE" in the table "TREE".
An entity relationship diagram. Shows links between the entities and the kind of relation between them. We are not talking about tables or keys there! Usually the entity relationship diagram follows Merise model. Database manager and developer as myself usually build an entity relationship model before conceiving the relational model/diagram.
The set of symbol in Merise are:(0-1, 0-n, 1-1, 1-n). The first number 0 or 1 describes whether the other part of the association is required for an object to exist. If it is zero, it means it can exists without being associated. If it is One it means that the object only exist in relation with an other object (e.g an apple need a tree to exist --> 1, a tree needn't apple to exists -->0)
The second character tell us how many objects are accepted in the other part of the association. If it is 1, then only one object can exists in the relation, if it is n, a infinite number of object can be linked (e.g.: an apple can have one tree --> 1, a tree can have multiples apples --> n)
With Entity relationship the relationship will be described as :
An apple has to belong to at least one tree to exists and can belong to only one tree(1-1). A tree needn't an apple to exist but it can have an infinite number of apples (0-n).
In fact both description mean the same but one is database oriented while the other is modelling oriented. Some modelling software such as DB-MAIN convert automatically an ER diagram to the relational diagram.
ENTITY RELATIONSHIP DIAGRAMS (ERDs) are just that: DIAGRAMS which describe the RELATIONSHIPS between ENTITIES. Now let's look closer...
ERDs are often created by Business Analysts (NOT DBAs);
ERDs are often described in LAYMAN's terms (NOT techno-speak of DBAs or other);
ERDs are meant to summarize & clarify understanding for End Users and Business SMEs (again, NOT the DBAs or Developers)
ERDs work best when each entity is described in the SINGULAR, and the lines connecting various entities to other entities in the ERD use verbs (of action or possession, or existence) to describe each relation;
ERDs can (and do) include lines which denote n:n relationships, but this is not a requirement.
Examples of entities in an ERD for a blog: Member, Post, Comment, Category
Examples of relationships described in an ERD:
Member "posts" 1 to n Posts; (note we AREN'T describe WHAT a post looks like)
Post "relevant-to" 1 to n Categories
etc.
DATA SCHEMAS bear some resemblance to ERDs, but they should NOT be considered either equivalent or interchangeable. If you make an ERD which can be used as a data schema... be open to the possibility you DIDN'T make an ERD ! ;-)
DATA schemas are diagrams used to describe to a DBA how data will be stored in a database (relational or non-relational).
Data Schemas almost invariable describe the structure & characteristics of TABLES;
Tables are "containers" (cardboard boxes);
As such tables in a data schema are BEST named in the PLURAL
Examples of the TABLES in a SCHEMA for the same blog:
MEMBERS, POSTS, CATEGORIES, COMMENTS (relational database)
or
POSTS (keyed by Member-Date and all other columns in 1 table (non-relational database like for a "big data" project);
a data schema would then describe the data contained in each table:
MEMBER
FirstName (char:25)
LastName (char:25)
etc.
the lines between tables in a data schema would NOT try to represent any 'relation' other than a "KEY" between 2 fields which could be used to "join" the tables, and some additional characteristics of those lines to denote n:n relationships.
BOTH diagrams serve quite DIFFERENT purposes:
ERD: to make mere mortal end-users (and business owners) UNDERSTAND the model of a given business solution; and
DATA SCHEMA: a "blueprint" used by DBAs to BUILD databases, and by DEVELOPERS to CONSUME the data in that database.
A database schema is a description of the actual construction of the database. It is an all-encompassing term that refers to the collective of tables, columns, triggers, relationships, key constraints, functions and procedures. It can refer to a document that describes all of this (such as an XML Schema) or as an abstraction of database makeup itself ("It would be difficult to change the schema of the database at this point"). It does not refer to rows inserted into the schema, or data itself. You would insert data into an existing schema.
An Entity Relationship Diagram is a visualization of the relationships between tables in a database. At the very least, it includes table names visualized as squares connected by lines that represent primary and foreign key constraints. It often includes the column names and symbols that include information about what kind of relationship exists between the columns (one-to-one, one-to-many, many-to-many).
Related
I'm designing an ER diagram for a social network and recently I got involved in an argument with my colleagues whether this part is right or wrong
ER DIAGRAM PROBLEM
Where Faqet(Pages) is connected with Shfrytezuesi(User) using three actions, pelqen is for storing likes, krijon faqe to know who created the page, and udheheq to store all page admins, so my question is
is this design wrong?
Can two tables be linked with more than one action, this is where I'm not certain
It's perfectly valid to have any number of relationships between any number of entity sets. My only concern with the diagram is that multiple role lines below Shfrytezuesi are merged into one - I recommend keeping them distinct.
Note that in the entity-relationship model, we don't link tables. That idea comes from the old network data model, in which rows represented entities, tables represented entity sets, and links between rows/tables represented relationships.
One disadvantage with that model is that it supports only directed binary relationships - many-to-many binary, ternary and higher relationships and relationships with attributes all required associative entities to be introduced. However, three binary relationships aren't equivalent to a ternary relationship, and not all relationships can be represented in binary data models.
The ER model supports n-ary relationships and attributes on relationships. Entity sets are represented by their primary keys and relationships by combinations of entity keys. Entity sets plus attributes form entity relations, relationship sets plus attributes form entity relations. These relations get mapped to tables. In practice, tables with the same primary keys get combined to reduce the number of tables, which means one-to-one and one-to-many relationships get combined into the relations for one of their associated entity sets.
Regardless of how tables are combined, attributes and relationships are represented by sets of columns. For example, based on your diagram, Pelqen would be represented as (FID PK, SID) (assuming SID is the primary key of Shfrytezuesi). These columns might have different names, e.g. SID might be renamed to AdminSID, especially if the relationship was combined into Faqet. The old network data model would view FID FK -> FID PK as a relationship, which as described above and below is a very limited kind of relationship and not the approach taken by the ER model.
Another disadvantage of the network data model is predetermined access paths, which means we have to navigate from table to table using the predefined relationships. This complicated queries and data processing significantly. This limitation was one of the main drivers for the development of the relational model, to which the ER model maps. The understanding of tables as relations in the RM enables us to construct and navigate arbitrary access paths using joins. So, we do link tables in the RM, but at query time and as needed rather than at design time. The ER model is used for conceptual design only and doesn't describe relationships between tables, only relationships between entity sets.
Now, the ER model isn't a complete and consistent logical model like the RM, but it is a significant improvement over the network data model. An even more rigorous approach than ER would be object-role modeling, but that's a different topic.
How can I represent my lookup tables in technical reports?
In other words, the ER model is used to represent a database,
but what about lookup tables?
To recover a conceptual model (entity sets, attributes and relationships) from a physical model (tables and columns), we first have to understand the logical model. This means understanding the domains and functional dependencies which are represented by the lookup table.
Lookup table is a common term which can mean different things. I generally understand it as a table which represents a domain with a surrogate key, and associates it with a name and/or a few other attributes. In the ER model, these would be simple entity relations, and leaves / terminal nodes in the graph of entity sets.
If a lookup table records facts about only one type of thing (represented by the key of the lookup table), then you can represent that type as an entity set (rectangle) with an attribute (oval) for each dependent column, and draw relationships (diamonds) to connect it to other entity sets as required. Look for foreign key columns / constraints in other tables to find these relationships.
For example, consider the following physical model:
CarMake and CarModel are examples of lookup tables. This isn't a very good model, since in the real world CarModelId determines CarMakeId, while the model treats them as independent elements in CarSales. However, since the point of the example is to focus on lookup tables, I'll use it as is.
In this case, CarMake and CarModel describe a single entity set each. Their functional dependencies are CarMakeId -> CarMakeName and CarModelId -> CarModelName. In CarSales, we've got CarSaleId -> RegNumber, Price, SoldOn (attributes) and CarSaleId -> CarMakeId, CarModelId (relationships).
In this case our ER model is similar to the physical model:
However, in some cases, you may find multiple types of things combined into one lookup table due to the similar physical structure. This doesn't affect the logical or conceptual models, but makes it more complicated to recover since we have to understand how the table is used to unpack it.
I was going through this site to understand ER to relational model mapping.
Below is the link:
ER Model to Relational model
Consider case 1: It says that since the passport entity type is in total participation, we can merge person and passport tables along with the has relationship into one table with all the attributes of the above three and primary key as Person_id.
My doubt is that wont it lead to a lot of NULL values for those people who do not own a passport. I was thinking that a better solution would be to include Person_id as a foreign key in the Passport relation and a separate relation for Person entity type itself.
Both the solutions seems to have their pros and cons:
1) One big table means a possibility of lot of NULL values but ease of access of passport details of a person.
2) Two separate tables mean that no NULL values but to find the passport details of people, we have to perform a join operation or search through two separate tables.
Which of these two solutions is correct? By correct, I mean to ask that in common practice in such cases, which solution is used?
Both solutions are commonly used. I would only consider option 1 if no other information depended on the passport number, but in this case I'd model it as an (optional) attribute in ER and not a separate entity. If a passport has any dependent attributes, such as country of origin or expiry date, I would model it as a separate entity and implement it using option 2.
I need to create an ER diagram and a relational model for a hospital. I have kept it simple to 3 entities. Please can someone have a look and tell me if I have normalised this correctly?
I am not sure it there should be a : relationship between the trust and patients? Does a person assume that the first entity is a single unit, in this case a hospital has x relationships to....
Diagram 2:
ER Diagram & Relational Schema
Your normalization isn't correct. First of all, you don't list functional dependencies. Without that, any attempt to normalize is just guesswork.
Now, I can make a reasonable guess, but even with that you have some problems. The Hires and Cares for relations aren't reflected in your 1NF/2NF/3NF tables. In your 2NF, you introduce a Shift relation that was presumably derived from Doctor in 1NF, but Patient_ID wasn't present in the latter, unless that's actually what the Shift column was for, but renaming fields isn't part of normalization. Also, what happened to Hospital in 2NF? It reappears in 3NF but with a Location attribute that was derived from where? Also, you're not indicating a primary key for Shift, is it correct to assume the combination of all 4 fields constitute a candidate key? It's good practice to be explicit about your keys.
Can an entity be without attributes?
Your Hospital relation does have an attribute - Hospital_ID, which represents an identity relation. Remember an attribute is a binary relation, and Hospital_ID -> Hospital_ID qualifies.
I am not sure it there should be a : relationship between the trust and patients?
What is the trust? It isn't indicated in your diagram at all.
Does a person assume that the first entity is a single unit, in this case a hospital has x relationships to....
Many-to-many relationships are decomposed into two or more one-to-many relationships, and in reading the cardinality indicators we tend to put the singular entity first, but this isn't a rule or a safe assumption. Be explicit about cardinalities.
I am a DB newbie to the bitemporal world and had a naive question.
Say you have a master-satellite relationship between two tables - where the master stores essential information and the satellite stores information that is relevant to only few of the records of the master table. Example would be 'trade' as a master table and 'trade_support' as the satellite table where 'trade_support' will only house supporting information for non-electronic trades (which will be a small minority).
In a non-bitemporal landscape, we would model it as a parent-child relationship. My question is: in a bitemporal world, should such a use case be still modeled as a two-table parent-child relationship with 4 temporal columns on both tables? I don't see a reason why it can't be done, but the question of "should it be done" is quite hazy in my mind. Any gurus to help me out with the rationale behind the choice?
Pros:
Normalization
Cons:
Additional table and temporal columns to maintain and manage via DAO's
Defining performant join conditions
I believe this should be a pretty common use-case and wanted to know if there are any best practices that I can benefit from.
Thanks in advance!
Bitemporal data management and foreign keys can be quite tricky. For a master-satellite relationship between bitemporal tables, an "artificial key" needs to be introduced in the master table that is not unique but identical for different temporal or historical versions of an object. This key is referenced from the satellite. When joining the two tables a bitemporal context (T_TIME, V_TIME) where T_TIME is the transaction time and V_TIME is the valid time must be given for the join. The join would be something like the following:
SELECT m.*, s.*
FROM master m
LEFT JOIN satellite s
ON m.key = s.master_key
AND <V_TIME> between s.valid_from and s.valid_to
AND <T_TIME> between s.t_from and s.t_to
WHERE <V_TIME> between m.valid_from and m.valid_to
AND <T_TIME> between m.t_from and m.t_to
In this query the valid period is given by the columns valid_from and valid_to and the transaction period is given by the columns t_from and t_to for both the master and the satellite table. The artificial key in the master is given by the column m.key and the reference to this key by s.master_key. A left outer join is used to retrieve also those entries of the master table for which there is no corresponding entry in the satellite table.
As you have noted above, this join condition is likely to be slow.
On the other hand this layout may be more space efficient in that if only the master data (in able trade) or only the satellite data (in table trade_support) is updated, this will only require a new entry in the respective table. When using one table for all data, a new entry for all columns in the combined table would be necessary. Also you will end up with a table with many null values.
So the question you are asking boils down to a trade-off between space requirements and concise code. The amount of space you are sacrificing with the single-table solution depends on the number of columns of your satellite table. I would probably go for the single-table solution, since it is much easier to understand.
If you have any chance to switch database technology, a document oriented database might make more sense. I have written a prototype of a bitemporal scala layer based on mongodb, which is available here:
https://github.com/1123/bitemporaldb
This will allow you to work without joins, and with a more flexible structure of your trade data.