Representation of Lookup table - entity-relationship

How can I represent my lookup tables in technical reports?
In other words, the ER model is used to represent a database,
but what about lookup tables?

To recover a conceptual model (entity sets, attributes and relationships) from a physical model (tables and columns), we first have to understand the logical model. This means understanding the domains and functional dependencies which are represented by the lookup table.
Lookup table is a common term which can mean different things. I generally understand it as a table which represents a domain with a surrogate key, and associates it with a name and/or a few other attributes. In the ER model, these would be simple entity relations, and leaves / terminal nodes in the graph of entity sets.
If a lookup table records facts about only one type of thing (represented by the key of the lookup table), then you can represent that type as an entity set (rectangle) with an attribute (oval) for each dependent column, and draw relationships (diamonds) to connect it to other entity sets as required. Look for foreign key columns / constraints in other tables to find these relationships.
For example, consider the following physical model:
CarMake and CarModel are examples of lookup tables. This isn't a very good model, since in the real world CarModelId determines CarMakeId, while the model treats them as independent elements in CarSales. However, since the point of the example is to focus on lookup tables, I'll use it as is.
In this case, CarMake and CarModel describe a single entity set each. Their functional dependencies are CarMakeId -> CarMakeName and CarModelId -> CarModelName. In CarSales, we've got CarSaleId -> RegNumber, Price, SoldOn (attributes) and CarSaleId -> CarMakeId, CarModelId (relationships).
In this case our ER model is similar to the physical model:
However, in some cases, you may find multiple types of things combined into one lookup table due to the similar physical structure. This doesn't affect the logical or conceptual models, but makes it more complicated to recover since we have to understand how the table is used to unpack it.

Related

How deep to go when denormalising

I denormalising a OLTP database for use in a DWH.
At the moment I am denormalising studygroups.
Each studygroup has a key pointing towards 1 project.
Each project has a key pointing towards 1 department.
Each department has a key pointing towards 1 university.
Each universityhas a key pointing to 1 city.
Now I know that you are supposed to denormalize the sh*t out your OLTP but in this dwh department will be a dimension on its own. This goes for university also. Would it suffise to add a key from studygroup pointing at department or is it wiser to denormalize as far as you can and add all attributes from the department and all attributes from its M:1 related tables to the dimension studygroup? Even when department and university will be dimensions by themselves?
In other words: how far/deep do you go when denormalizing?
The key concept behind a dimensional model is:
Keep your fact tables in 3NF (third normal form);
De-normalize your dimensions into 2NF (second normal form)
So ideally, the only joins you should have in your model are the joins between fact tables and relevant dimensions.
As part of this philosophy:
Avoid "snow flake" designs, where dimensions contain keys to other dimensions. It's always possible to come up with a data model that allows the same functionality as the snow flakes, without violating 3NF/2NF rule;
Never have any direct joins between 2 separate dimensions (i.e, department and study group) directly. All relations among dimensions must be resolved via fact tables;
Never have any direct joins between 2 separate fact tables. Any relations among fact tables must be resolved via shared dimensions.
Finally, consider that dimensional design, besides optimization of the data for querying, serves a second important purpose: it's a semantic model of the business (or whatever else it represents). So, when making decisions about combining data elements into dimensions and facts, consider their "logical affinity" - they should make intuitive sense to the end users. If you have hard times explaining to a BI analyst the meaning of your dimension or fact table, most likely you've made a modeling mistake.
For example, in your case you should consider logical relations between universities, departments, study groups, etc. It's very likely that University/Department form a natural hierarchy. If so, they should belong to the same dimension. Study group, on the other hand, might not - let's assume, it's possible to form study groups across multiple universities and/or multiple departments. Such Many:Many relations are clear indication that they should be resolved via fact tables. In addition, relations between universities and departments are stable (rarely change), while study groups are formed and dissolved very often, and thus should be modeled separately.
In general, if you see 1:1 or 1:M relations between dimensional elements, it's often an indication that they should be de-normalized into the same table (again, only if their combination makes logical sense). If the relations are M:M, most likely they belong to different tables (you can force them into the same table, but often such tables look like Frankenstein creatures).
You can get much better help by making your question more specific - draw your dimensional model, post it, and ask for specific issues/challenges you have. For general concepts, books from Kimball and Inmon are your best friends.

ER Diagram design issues

I'm designing an ER diagram for a social network and recently I got involved in an argument with my colleagues whether this part is right or wrong
ER DIAGRAM PROBLEM
Where Faqet(Pages) is connected with Shfrytezuesi(User) using three actions, pelqen is for storing likes, krijon faqe to know who created the page, and udheheq to store all page admins, so my question is
is this design wrong?
Can two tables be linked with more than one action, this is where I'm not certain
It's perfectly valid to have any number of relationships between any number of entity sets. My only concern with the diagram is that multiple role lines below Shfrytezuesi are merged into one - I recommend keeping them distinct.
Note that in the entity-relationship model, we don't link tables. That idea comes from the old network data model, in which rows represented entities, tables represented entity sets, and links between rows/tables represented relationships.
One disadvantage with that model is that it supports only directed binary relationships - many-to-many binary, ternary and higher relationships and relationships with attributes all required associative entities to be introduced. However, three binary relationships aren't equivalent to a ternary relationship, and not all relationships can be represented in binary data models.
The ER model supports n-ary relationships and attributes on relationships. Entity sets are represented by their primary keys and relationships by combinations of entity keys. Entity sets plus attributes form entity relations, relationship sets plus attributes form entity relations. These relations get mapped to tables. In practice, tables with the same primary keys get combined to reduce the number of tables, which means one-to-one and one-to-many relationships get combined into the relations for one of their associated entity sets.
Regardless of how tables are combined, attributes and relationships are represented by sets of columns. For example, based on your diagram, Pelqen would be represented as (FID PK, SID) (assuming SID is the primary key of Shfrytezuesi). These columns might have different names, e.g. SID might be renamed to AdminSID, especially if the relationship was combined into Faqet. The old network data model would view FID FK -> FID PK as a relationship, which as described above and below is a very limited kind of relationship and not the approach taken by the ER model.
Another disadvantage of the network data model is predetermined access paths, which means we have to navigate from table to table using the predefined relationships. This complicated queries and data processing significantly. This limitation was one of the main drivers for the development of the relational model, to which the ER model maps. The understanding of tables as relations in the RM enables us to construct and navigate arbitrary access paths using joins. So, we do link tables in the RM, but at query time and as needed rather than at design time. The ER model is used for conceptual design only and doesn't describe relationships between tables, only relationships between entity sets.
Now, the ER model isn't a complete and consistent logical model like the RM, but it is a significant improvement over the network data model. An even more rigorous approach than ER would be object-role modeling, but that's a different topic.

Mapping ER Model to Relational Model

I was going through this site to understand ER to relational model mapping.
Below is the link:
ER Model to Relational model
Consider case 1: It says that since the passport entity type is in total participation, we can merge person and passport tables along with the has relationship into one table with all the attributes of the above three and primary key as Person_id.
My doubt is that wont it lead to a lot of NULL values for those people who do not own a passport. I was thinking that a better solution would be to include Person_id as a foreign key in the Passport relation and a separate relation for Person entity type itself.
Both the solutions seems to have their pros and cons:
1) One big table means a possibility of lot of NULL values but ease of access of passport details of a person.
2) Two separate tables mean that no NULL values but to find the passport details of people, we have to perform a join operation or search through two separate tables.
Which of these two solutions is correct? By correct, I mean to ask that in common practice in such cases, which solution is used?
Both solutions are commonly used. I would only consider option 1 if no other information depended on the passport number, but in this case I'd model it as an (optional) attribute in ER and not a separate entity. If a passport has any dependent attributes, such as country of origin or expiry date, I would model it as a separate entity and implement it using option 2.

Relationship Modelling Assistance - Normalisation

I need to create an ER diagram and a relational model for a hospital. I have kept it simple to 3 entities. Please can someone have a look and tell me if I have normalised this correctly?
I am not sure it there should be a : relationship between the trust and patients? Does a person assume that the first entity is a single unit, in this case a hospital has x relationships to....
Diagram 2:
ER Diagram & Relational Schema
Your normalization isn't correct. First of all, you don't list functional dependencies. Without that, any attempt to normalize is just guesswork.
Now, I can make a reasonable guess, but even with that you have some problems. The Hires and Cares for relations aren't reflected in your 1NF/2NF/3NF tables. In your 2NF, you introduce a Shift relation that was presumably derived from Doctor in 1NF, but Patient_ID wasn't present in the latter, unless that's actually what the Shift column was for, but renaming fields isn't part of normalization. Also, what happened to Hospital in 2NF? It reappears in 3NF but with a Location attribute that was derived from where? Also, you're not indicating a primary key for Shift, is it correct to assume the combination of all 4 fields constitute a candidate key? It's good practice to be explicit about your keys.
Can an entity be without attributes?
Your Hospital relation does have an attribute - Hospital_ID, which represents an identity relation. Remember an attribute is a binary relation, and Hospital_ID -> Hospital_ID qualifies.
I am not sure it there should be a : relationship between the trust and patients?
What is the trust? It isn't indicated in your diagram at all.
Does a person assume that the first entity is a single unit, in this case a hospital has x relationships to....
Many-to-many relationships are decomposed into two or more one-to-many relationships, and in reading the cardinality indicators we tend to put the singular entity first, but this isn't a rule or a safe assumption. Be explicit about cardinalities.

What is different between ER Diagram and Database Schema?

What is the difference between ER Diagrams and Database Schema? MySQL Workbench has facility to draw ER diagrams, but the symbols for ER diagrams different in other drawing tools than MySQL Workbench method.
A database schema is usually a relational model/diagram. it shows the link between tables: primary keys and foreign keys.
In database diagram the relation between an apple and a apple tree would be:
A foreign key "ID__TRE" which cannot be null in the table "APPLE" is linked to a primary key "ID_TRE" in the table "TREE".
An entity relationship diagram. Shows links between the entities and the kind of relation between them. We are not talking about tables or keys there! Usually the entity relationship diagram follows Merise model. Database manager and developer as myself usually build an entity relationship model before conceiving the relational model/diagram.
The set of symbol in Merise are:(0-1, 0-n, 1-1, 1-n). The first number 0 or 1 describes whether the other part of the association is required for an object to exist. If it is zero, it means it can exists without being associated. If it is One it means that the object only exist in relation with an other object (e.g an apple need a tree to exist --> 1, a tree needn't apple to exists -->0)
The second character tell us how many objects are accepted in the other part of the association. If it is 1, then only one object can exists in the relation, if it is n, a infinite number of object can be linked (e.g.: an apple can have one tree --> 1, a tree can have multiples apples --> n)
With Entity relationship the relationship will be described as :
An apple has to belong to at least one tree to exists and can belong to only one tree(1-1). A tree needn't an apple to exist but it can have an infinite number of apples (0-n).
In fact both description mean the same but one is database oriented while the other is modelling oriented. Some modelling software such as DB-MAIN convert automatically an ER diagram to the relational diagram.
ENTITY RELATIONSHIP DIAGRAMS (ERDs) are just that: DIAGRAMS which describe the RELATIONSHIPS between ENTITIES. Now let's look closer...
ERDs are often created by Business Analysts (NOT DBAs);
ERDs are often described in LAYMAN's terms (NOT techno-speak of DBAs or other);
ERDs are meant to summarize & clarify understanding for End Users and Business SMEs (again, NOT the DBAs or Developers)
ERDs work best when each entity is described in the SINGULAR, and the lines connecting various entities to other entities in the ERD use verbs (of action or possession, or existence) to describe each relation;
ERDs can (and do) include lines which denote n:n relationships, but this is not a requirement.
Examples of entities in an ERD for a blog: Member, Post, Comment, Category
Examples of relationships described in an ERD:
Member "posts" 1 to n Posts; (note we AREN'T describe WHAT a post looks like)
Post "relevant-to" 1 to n Categories
etc.
DATA SCHEMAS bear some resemblance to ERDs, but they should NOT be considered either equivalent or interchangeable. If you make an ERD which can be used as a data schema... be open to the possibility you DIDN'T make an ERD ! ;-)
DATA schemas are diagrams used to describe to a DBA how data will be stored in a database (relational or non-relational).
Data Schemas almost invariable describe the structure & characteristics of TABLES;
Tables are "containers" (cardboard boxes);
As such tables in a data schema are BEST named in the PLURAL
Examples of the TABLES in a SCHEMA for the same blog:
MEMBERS, POSTS, CATEGORIES, COMMENTS (relational database)
or
POSTS (keyed by Member-Date and all other columns in 1 table (non-relational database like for a "big data" project);
a data schema would then describe the data contained in each table:
MEMBER
FirstName (char:25)
LastName (char:25)
etc.
the lines between tables in a data schema would NOT try to represent any 'relation' other than a "KEY" between 2 fields which could be used to "join" the tables, and some additional characteristics of those lines to denote n:n relationships.
BOTH diagrams serve quite DIFFERENT purposes:
ERD: to make mere mortal end-users (and business owners) UNDERSTAND the model of a given business solution; and
DATA SCHEMA: a "blueprint" used by DBAs to BUILD databases, and by DEVELOPERS to CONSUME the data in that database.
A database schema is a description of the actual construction of the database. It is an all-encompassing term that refers to the collective of tables, columns, triggers, relationships, key constraints, functions and procedures. It can refer to a document that describes all of this (such as an XML Schema) or as an abstraction of database makeup itself ("It would be difficult to change the schema of the database at this point"). It does not refer to rows inserted into the schema, or data itself. You would insert data into an existing schema.
An Entity Relationship Diagram is a visualization of the relationships between tables in a database. At the very least, it includes table names visualized as squares connected by lines that represent primary and foreign key constraints. It often includes the column names and symbols that include information about what kind of relationship exists between the columns (one-to-one, one-to-many, many-to-many).

Resources