Related
Let's assume i have some 2 simple tables:
IMPORTANT: This is about relational algebra, not SQL.
Band table:
band_name founded
Gambo 1975
John. 1342
Album table:
album_name band_name
Celsius. Gambo
Trambo Gambo
Now, since the Band and the Album table share the same column name "band_name", would it be necessary to rename it when i would join them?
As far as i know, the join eliminates the duplicate entry that is shared amongst the join. This example, where i simply pick all Bands that are existing in the Album table (obviously just 'Gambo' in this giving example)
Πfounded, band_name(Band ⋈ Album)
should therefore work fine, right? Can somebody confirm?
(Have to enter a caveat that there are many variants of Relational Algebra; that they differ in semantics; and they differ in syntax. Assuming you intend a variant similar to that in wikipedia ...)
Yes that expression should work fine. The natural join operator ⋈ matches same-named attributes between its two operands. So the subexpression Band ⋈ Album produces a result with attributes {band_name, founded, album_name}. Your expression projects two of those.
Note the attributes for a relation value are a set not a sequence; therefore any operation over relation operands with same-named attributes must match attributes.
In contrast, Cartesian Product × requires its operands to have disjoint attribute names. Then Band × Album is ill-formed and would be rejected. (So you'd need to Rename band_name in one of them, to get relations that could be operands.)
I'm not all that happy with your way of putting it "the join eliminates the duplicate entry that is shared amongst the join." Because only in SQL do you get a duplicate (from SELECT * FROM Band, Album ... -- which results in a table with four columns, of which two are named band_name). SQL FROM list of tables is a botch-up: neither join nor Cartesian Product, but something trying to be both, and succeeding only in being neither. RA's ⋈ never produces a "duplicate" so never does it "eliminate" anything.
Particularly if there's Keys declared and a Foreign Key constraint (from Album's band_name to Band's) I see those as identifying the same band, then the natural operation is to bring together that which has been taken apart, so the name 'Natural Join'.
As i know optionality means the minimum cardinality of a relationship which is denoted as optional to optional, mandatory to optional, mandatory to mandatory..
And Participation denoted as bold line and a normal line.
In the Internet some refer participation as the dependency of the entity to the relationship which is also looks like identifying and non identifying relationship.
and some refer it as the minimum cardinality
What is the correct definitions of those relationships and what is the difference..
Let's start with definitions and examples of each of the concepts:
Total and partial participation:
Total participation (indicated by a double or thick association line) means that all the entities in an entity set must participate in the relationship. Partial participation (indicated by a single thin line) means that there can be entities in the entity set that don't participate in the relationship.
Medicine participates totally in the Produce relationship, meaning that Medicine can't exist unless Produced by a Laboratory. In contrast, a Laboratory can exist without Producing Medicine - Laboratory participates partially in the Produce relationsip.
Mandatory and optional roles:
In a relationship, roles can be optional or mandatory. This affects whether a relationship instance can exist without an entity in a given role. Mandatory roles are indicated with a solid association line, optional roles are indicated with a dotted line.
Roles aren't often talked about in database tutorials, but they're an important concept. Consider a marriage - a relationship with two mandatory roles filled by the same entity set. In most relationships, the entity sets also define the roles, but when an entity set appears multiple times in a single relationship, we distinguish them in different roles.
In the example above, a Patient can Purchase Medicine with or without a Prescription. A Purchase can't exist without a Patient and Medicine, but a Prescription is optional (overall, though it may be required in specific cases).
Identifying relationship / weak entity:
A weak entity is an entity that can't be identified by its own attributes and therefore has another entity's key as part of its own. An identifying relationship is the relationship between a weak entity and its parent entity. Both the identifying relationship and the weak entity are indicated with double borders. Weak entity sets must necessarily participate totally in their identifying relationship.
In this example, a Prescription contains LineItems which are identified by the Prescription's key and a line number. In other words, the LineItems table will have a composite key (Prescription_ID, Line_Number).
For examples of non-identifying relationships, see the previous examples. While Medicine participates totally in the Produce relationship, it has its own identity (e.g. a surrogate key, though I didn't indicate it). Note that surrogate keys always imply regular entities.
Mandatory/optional vs total/partial participation
Mandatory or optional roles indicate whether a certain role (with its associated entity set) is required for the relationship to exist. Total or partial participation indicate whether a certain relationship is required for an entity to exist.
Mandatory partial participation: See above: A Laboratory can exist without producing any medicine, but Medicine can't be Produced without a Laboratory.
Mandatory total participation: See above: Medicine can't exist without being Produced, and a Laboratory can't Produce something unspecified.
Optional partial participation: See above: A Prescription can exist without being Purchased, and a Purchase can exist without a Prescription.
That leaves optional total participation, which I had to think about a bit to find an example:
Some Patients Die of an unknown Cause, but a Cause of death can't exist without a Patient Dying of it.
Total/partial participation vs identifying/non-identifying relationships
As I said before, weak entity sets always participate totally in their identifying relationship. See above: a LineItem must be Contained in a Prescription, it's identity and existence depends on that. Partial participation in an identifying relationship isn't possible.
Total participation doesn't imply an identifying relationship - Medicine can't exist without being Produced by a Laboratory but Medicine is identified by its own attributes.
Partial participation in a non-identifying relationship is very common. For example, Medicine can exist without being Purchased, and Medicine is identified by its own attributes.
Mandatory/optional vs identifying/non-identifying relationships
It's unusual for a relationship to have less than two mandatory roles. Identifying relationships are binary relationships, so the parent and child roles will be mandatory - the Contain relationship between Prescription and LineItem can't exist without both entities.
Optional roles are usually only found on ternary and higher relationships (though see the example of patients dying of causes), and aren't involved in identification. An alternative to an optional role is a relationship on a relationship:
By turning Purchase into an associative entity, we can have it participate in a Fill relationship with Prescription. To maintain the same semantics as above I specified that a Purchase can only Fill one Prescription.
Physical modeling
If we translate from conceptual to physical model (skipping logical modeling / further normalization), making separate tables for each entity and relationship, things look pretty similar, though you have to know how to read the cardinality indicators on the foreign key lines to recover the ER semantics.
However, it's common to denormalize tables with the same primary keys, meaning one-to-many relationships are combined with the entity table on the many side:
A relationship is physically represented as two or more entity keys in a table. In this case, the entity keys - patient_id and cause_of_death_id are both found in the Patient table. Many people think the foreign key line represents the relationship, but this comes from confusing the entity-relationship model with the old network data model.
This is a crucial point - in order to understand different kinds of relationships and constraints on relationships, it's essential to understand what relationships are first. Relationships in ER are associations between keys, not between tables. A relationship can have any number of roles of different entity sets, while foreign key constraints enforce a subset constraint between two columns of one entity set. Now, armed with this knowledge, read my whole answer again. ;)
I hope this helps. Feel free to ask questions.
Im new to Rails and I'm in the middle of sketching up an ERD for my new app. A Yelp-sort of app, where a Client is sorted by price.
So I want one Client to have many priceranges - One Client can both have pricerange $ and Pricerange $$$$ for example. The priceranges are:
$ - $$ - $$$ - $$$$ - $$$$$
How would this look in a table? Would I create a table called PriceRange with Range1, Range2, Range3, Range4, Range5 to be booleans?
Doesn't the PriceRange-table need any foreign/primary keys?
PriceRange
Range1 (Boolean)
Range2 (Boolean)
Range3 (Boolean)
Range4 (Boolean)
Range5 (Boolean)
Look, I'm Brazilian and I'm not very knowledgeable about yelp applications. I do not quite know what it is, but from what I saw, they are systems to assess/measure/evaluate (perhaps the translation is wrong here for you) things, in this case, companies, right?
Following this logic, let's think...
By the description of your problem (context), you have clients (companies), and they can have price ranges, correct? If:
A price interval is represented by textual names, such as "$", "$$",
and so on,
and the same price range may have (numeric) values for different companies,
And the same price range (type) can be (or not) assigned to different
companies,
Then here is what we have:
By decomposing this conceptual model, you would end up with three tables:
Companies
Price Ranges
Price Ranges from Companies
The primary keys of Company and Price Ranges will be passed to Price Ranges from Companies as foreign keys. You can use them as a composite primary key, or use a surrogate key. If using a surrogate key, you will permit/allow a company to have the same kind of price range more than once, which I believe is not the case.
Let's look at another situation, if things are simpler as:
If there is no need to store prices,
and an company may have or not one or more price ranges represented by "$", "$$", and so on,
Then here is what we have:
Similarly, we'll have the same 3 tables. Likewise, you still must pass the primary keys of Companies and Price Ranges to Price Ranges from Companies as foreign keys.
So I want one Client to have many priceranges - One Client can both
have pricerange $ and Pricerange $$$$ for example
Notice how N-N relationships allow us to create optional relationships between entities. This will allow a company to have zero, one, two, (etc.) or all price ranges defined. Again, so that is not allowed a company to have a price range more than once, set the foreign keys as composite primary key in Price Ranges from Companies.
If you have any questions or anything I explained has nothing to do with your context, please do not hesitate to comment.
EDIT
Is the Price ranges from companies what is called a Joint table?
Yes. There are also other terms used, some in different areas of computer science, such as Link Table, or Intermediate Table.
Actually we do not have a table here in the diagram, but an entity. In the Conceptual Model there are no tables, but entities and relationships. Be careful with this terminology when developing the Conceptual Model, or else you may get confused (I say this from experience).
However, yes, once decomposed, we will have a table from this relationship. When decomposed, N-N relationships will always become tables, no exception. Differently, 1-1 and 1-N (or N-1) relationships do not become tables. These tables with these special names (Join/Link/Intermediate Tables) serves to associate records from different tables, hence the name.
And is it necessary to have a column called Price Range Id? I mean
what is it there for?
At where? If you say at the Price Ranges entity, it is rather necessary. Must We not identify records in a table in some way? Here I set what is called a Surrogate Key. If on the other hand, you have a column with unique values for each record in the table, you can also use this column. I highly recommend that you consider the use of surrogate keys. Read the link I gave you.
In the Conceptual Model, we have to define the properties and also the primary keys. During the phase of the conceptual model, natural attributes of entities can become primary keys if you so desire. In this case, we have what is called a Natural Key.
If on the other hand you refer to Price Ranges from Companies entity, so the question is another ("And is it necessary to have a column called Price Range Id?"). Here we have a table with two columns, as I told you. The two are foreign keys. You need it so you can relate rows from the two tables... I think you were not referring to that, is not it? If so, no problem, you can comment and ask more questions. I do not care to answer. To be honest, I did not quite understand your question.
EDIT 2
So that Company 28 can be identified in the Price Ranges (for instance
ID 40) Which would make it easier to call out the price ranges it has?
Maybe my English is not very good, but it seems to me that you have a beginner's doubt/question in relation to the concept of tables and relationships between them. If not that, I apologize because maybe I did not understand. But let's see...
The tables in a database have rows / records. Each line has its own data. Even with this, each line / record needs to be differentiated and identified somehow. That is why we attach to each line an identifier, known as the primary key (this, and this). In summary, the primary key is how we identify, differentiate, separate and organize different records.
Even if all records have different values, you must select a field (column) that represents the primary key of the table. By obligation, every record MUST have a primary key. Although you can choose which field is a primary key, you are allowed to choose one or more fields to serve as the primary key. When this happens, that is, when more than one field participates/serves as the primary key, we have a table with something called Composite Primary Key. Similarly, it has the ability to identify records. Note that, because of that, primary key values must be unique, otherwise you may have 2 identical records.
This is the basic concept so that we can relate tables to each other, in case, records/rows of tables together. If we have a Company identified by the ID 28 (a line/record), and we want to relate it to a Price Range identified by the ID 40, then we need to store somewhere that relationship (28 <--> 40). This is where the role of intermediate/link/join tables comes in (but only to relationships N-N! For 1-N or N-1 relationships it works similarly, but not identical).
My original question was whether it was necessary, and why a company
ID had to link up with a price range ID at all.
With this table storing records which relates to other records (for their primary keys), we can perform a SQL join operation (If you have questions about this, see this image). Depending on how you perform this operation, you'll get:
All companies that have Price Ranges.
All companies that do not have Price Ranges.
All the Price Ranges of a given company.
All companies that have or not a X Price Range.
All price ranges that are given or not to companies.
...
Anyway, you get all this because of the established relationship.
If it could just be taken out and then the table of price ranges would
only involve Pricerange1-5.
This sentence I did not understand. What should be taken out? Could you please explain this sentence better?
I need to create an ER diagram and a relational model for a hospital. I have kept it simple to 3 entities. Please can someone have a look and tell me if I have normalised this correctly?
I am not sure it there should be a : relationship between the trust and patients? Does a person assume that the first entity is a single unit, in this case a hospital has x relationships to....
Diagram 2:
ER Diagram & Relational Schema
Your normalization isn't correct. First of all, you don't list functional dependencies. Without that, any attempt to normalize is just guesswork.
Now, I can make a reasonable guess, but even with that you have some problems. The Hires and Cares for relations aren't reflected in your 1NF/2NF/3NF tables. In your 2NF, you introduce a Shift relation that was presumably derived from Doctor in 1NF, but Patient_ID wasn't present in the latter, unless that's actually what the Shift column was for, but renaming fields isn't part of normalization. Also, what happened to Hospital in 2NF? It reappears in 3NF but with a Location attribute that was derived from where? Also, you're not indicating a primary key for Shift, is it correct to assume the combination of all 4 fields constitute a candidate key? It's good practice to be explicit about your keys.
Can an entity be without attributes?
Your Hospital relation does have an attribute - Hospital_ID, which represents an identity relation. Remember an attribute is a binary relation, and Hospital_ID -> Hospital_ID qualifies.
I am not sure it there should be a : relationship between the trust and patients?
What is the trust? It isn't indicated in your diagram at all.
Does a person assume that the first entity is a single unit, in this case a hospital has x relationships to....
Many-to-many relationships are decomposed into two or more one-to-many relationships, and in reading the cardinality indicators we tend to put the singular entity first, but this isn't a rule or a safe assumption. Be explicit about cardinalities.
I'm new to ER diagrams. I noticed that draw.io (which was recommended on Stackoverflow) does not have a one (optional) to one (mandatory) relationship.
Let's say, I have two tables "user" (id, affiliate_id) and "affiliate" (id). There doesn't have to be an affiliate, in which case user.affiliate_id would be null.
However, if there is an affiliate, then user.affiliate_id will link to affiliate.id.
So wouldn't that be a one (optional) to one (mandatory) relationship?
PS: I was thinking that maybe user.affiliate_id must not be null in a strict sense. However, it doesn't break the foreign key constraints (at least for SQLite 3).
I think you are describing a one-to-many relationship. One user can (optionally) be associated to one affiliate, but the same affiliate can be associated to more than one user.
Or am I misunderstanding?
Yes, you are right, that would be a zero/one (or zero/many) to one relationship type, which has to be shown in the diagram. There are different notations for ER diagrams (and it's therefore, in fact, preferable to use UML class diagrams). For instance, in the notation used by Oracle the optional end of the connection line representing the relationship type is annotated with both a zero and a one symbol. In UML, the annotation of the optional association end would be 0..1 (if single-valued) or 0..* (if many-valued).