Relational Algebra "Only Once" or "Exists once" - join

So I have 3 relations
Student = {student id, name, address}
Course = {course no, title, subject}
Completed = {course no, student id, grade, semester}
and I want to display the names of students who have COMPLETED only one COURSE in "Physics" (which is a subject).
I don't have problems joining the tables to get the data together; my problem is how to get values that appear only once.
What I have so far
π Course_no (σ Subject='Physics' (COURSE))
That gets me all the course numbers that are Physics related
π Student_Id (COMPLETED ⋈ π Course_no (σ Subject='Physics' (COURSE)))
And with that I think I'm getting the IDs of all the students who have completed a physics-related course... but now here is my problem: how do I remove the students who have MORE than one physics-related course?

"how do I remove the students who have MORE than one physics related course?"
That is done by the relational MINUS operator or one of its nephews (sometimes known as antijoin). As indicated in the comments, there are a large number of distinct sets of operators all called "relational algebra". You really have to check which one you are supposed to be using.
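One way to read this advice is "students with at least one Physics course MINUS students with at least two". Below is a minimal Python sketch of that idea, using sets of tuples as stand-ins for relations. The sample rows, the function name and the dropped columns (address, title, grade, semester) are assumptions made for illustration, and the sketch treats each (course no, student id) pair as appearing once.

```python
# Toy relations as sets of tuples (hypothetical sample data).
student = {(1, "Ann"), (2, "Ben"), (3, "Cy")}             # (student_id, name)
course = {(10, "Physics"), (11, "Physics"), (20, "Art")}  # (course_no, subject)
completed = {(10, 1), (10, 2), (11, 2), (20, 3)}          # (course_no, student_id)


def students_with_exactly_one(subject):
    """Names of students who completed exactly one course of `subject`."""
    # Selection + projection: course numbers of the given subject.
    subject_courses = {c for (c, s) in course if s == subject}
    # Join with Completed: (student_id, course_no) pairs for that subject.
    took = {(sid, c) for (c, sid) in completed if c in subject_courses}
    at_least_one = {sid for (sid, _) in took}
    # Self-join on student_id, keeping pairs with different course numbers.
    at_least_two = {
        s1 for (s1, c1) in took for (s2, c2) in took if s1 == s2 and c1 != c2
    }
    # Relational MINUS.
    exactly_one = at_least_one - at_least_two
    # Join back to Student to get the names.
    return {name for (sid, name) in student if sid in exactly_one}


print(students_with_exactly_one("Physics"))
```

With this sample data only Ann qualifies for Physics, because Ben completed two Physics courses and Cy completed none.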

Related

ER Model representing entities not stored in DB and user choice

I'm trying to create an ER diagram of a simple retail-chain-type database model. You have your customers, the various stores, inventory, etc.
My first question is how to represent a customer placing an order in a store. If the customer is a discount card holder, the company has their name, address, etc., so I can have a cardHolder entity connect to item and store with an order relationship. But how do I represent an order being placed by a customer who is not really an entity in the database?
Secondly, how are conditional choices represented in ER diagrams? E.g. in a car dealership, a customer may choose one or more optional extras when buying a car. I would think that there is a Car entity with the relevant attributes and the options as a multi-valued attribute, but how do you represent a user picking those options (i.e. the order table shows the car ordered, the extras chosen and the added cost of the extras) in the order relationship?
First, do you really need to model customers as distinct entities, or do you just need order, payment and delivery details? Many retail systems don't track individual customers. If you need to, you can have a customer table with a surrogate key and unique constraints on identifying attributes like SSN or discount card number (even if those attributes are optional). It's generally hard to prevent duplication in customer tables since there's no ideal natural key for people, so consider whether this is really required.
How to model optional extras depends on what they depend on. Some extras might be make- or model-specific, e.g. the choice of certain colors or manual/automatic transmission. Extended warranties might be available across the board.
Here's an example of car-specific optional extras:
car (car_id PK, make, model, color, vin, price, ...)
car_extras (extra_id PK, car_id FK, option_name, price)
order (order_id PK, date_time, car_id FK, customer_id FK, payment_id FK, discount)
order_extras (order_id PK/FK, car_id FK, extra_id PK/FK)
I excluded price totals since those can be calculated via aggregate queries.
In my example, order_extras.car_id is redundant, but supports better integrity via the use of composite FK constraints (i.e. (order_id, car_id) references the corresponding columns in order, and (car_id, extra_id) references the corresponding columns in car_extras to prevent invalid extras from being linked to an order).
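To make the composite FK idea concrete, here is a rough sketch using Python's built-in sqlite3 module. It is only an illustration under some assumptions: the order table is renamed orders (ORDER is a reserved word in SQL), the customer, payment and other columns are left out, the UNIQUE constraints are added because a composite foreign key must reference a unique column set, and the helper try_add_extra is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE car (car_id INTEGER PRIMARY KEY, make TEXT, model TEXT, price REAL);
CREATE TABLE car_extras (
    extra_id    INTEGER PRIMARY KEY,
    car_id      INTEGER NOT NULL REFERENCES car(car_id),
    option_name TEXT,
    price       REAL,
    UNIQUE (car_id, extra_id)      -- target of the composite FK below
);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    car_id   INTEGER NOT NULL REFERENCES car(car_id),
    UNIQUE (order_id, car_id)      -- target of the composite FK below
);
CREATE TABLE order_extras (
    order_id INTEGER NOT NULL,
    car_id   INTEGER NOT NULL,
    extra_id INTEGER NOT NULL,
    PRIMARY KEY (order_id, extra_id),
    FOREIGN KEY (order_id, car_id) REFERENCES orders(order_id, car_id),
    FOREIGN KEY (car_id, extra_id) REFERENCES car_extras(car_id, extra_id)
);
INSERT INTO car VALUES (1, 'MakeA', 'ModelX', 20000), (2, 'MakeB', 'ModelY', 30000);
INSERT INTO car_extras VALUES (10, 1, 'sunroof', 900), (20, 2, 'tow bar', 400);
INSERT INTO orders VALUES (100, 1);
""")


def try_add_extra(order_id, car_id, extra_id):
    """Return True if the extra could be linked to the order, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO order_extras VALUES (?, ?, ?)", (order_id, car_id, extra_id)
        )
        return True
    except sqlite3.IntegrityError:
        return False


print(try_add_extra(100, 1, 10))  # extra 10 belongs to car 1, the car on order 100
print(try_add_extra(100, 1, 20))  # extra 20 belongs to car 2, so it is rejected
print(try_add_extra(100, 2, 20))  # order 100 is not for car 2, so it is rejected
```

The redundant car_id column is what lets both foreign keys be checked: without it, nothing would tie the extra's car to the order's car.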
Here's an ER diagram for the tables above:
First, as you suggest, you can definitely have two kinds of customers: discount card holders, whose details the company has, and new customers, whose details it does not have.
There are three possible ways to achieve what you are trying to do:
1) Have two different order tables in the system (which I personally wouldn't suggest).
2) Have a single order table in the system and look up customer details only for those who are discount card holders.
3) Have a single order table in the system and insert a row in the discount card holder table for new/unregistered customers.
Having a single order table would standardize the system and be more convenient for many other operations.
Secondly, to address your concern, you need to follow normalization. It will reduce the current problem, remove redundancy from the system, and keep the entities lightweight, which will directly affect performance as you grow.
The chosen extras can be listed in the order against the customer by adding them, via foreign keys, at the time the bill is generated. Working with keys gives fast and robust results, instead of storing redundant/repeating details in various places.
By following normalization, the problem can be handled by applying foreign keys wherever you need to refer to data, which avoids problems and errors.
Preferably aim for 4NF. Have a look at the following link to get started with normalization.
http://www.w3schools.in/dbms/database-normalization/

Select rows with "one of each" in relational algebra

Say I have a Persons table with attributes {name, pet}. How do I select the names of people who have one of each kind of pet (dog, cat, bird), where a kind of pet only counts if that pet is in the table?
Example: (Bob, Dog) and (Bob, Cat) are the only rows in the table. Therefore, Bob has one of each kind of pet. But the moment (Lynda, Bird) is added, Bob doesn't have one of each kind of pet anymore.
I think the first step is π(pet). You get a list of all kinds of pets, since relational algebra removes duplicates. I'm not sure what to do after this, but I think I need to join π(pet) and Persons.
I've tried a few things like Natural Join and Cross products but I haven't arrived at a result yet and I'm out of ideas.
The answer to the question can be found with the Division operator:
Persons ÷ πpet(Persons)
This relational algebra expression returns a relation with only the column name, containing the names of all persons who have every kind of pet currently present in the Persons table itself.
The division is an operator that, in some sense, is the inverse of the product operator (the name is derived exactly from this fact). It is a derived operator that can be defined in terms of projection, set difference and product (see for instance this answer).
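As a rough illustration of how division can be derived from projection, product and set difference, here is a small Python sketch over sets of tuples. The function name divide and the sample rows are assumptions for the example, not part of any particular relational algebra dialect.

```python
def divide(r, s):
    """r: set of (name, pet) tuples; s: set of pets.
    Returns the names that are paired in r with every pet in s."""
    names = {name for (name, _) in r}                # projection on name
    all_pairs = {(n, p) for n in names for p in s}   # product of names with s
    missing = all_pairs - r                          # pairs that r lacks
    disqualified = {n for (n, _) in missing}         # names missing some pet
    return names - disqualified                      # set difference


persons = {("Bob", "Dog"), ("Bob", "Cat")}
pets = {pet for (_, pet) in persons}
print(divide(persons, pets))  # Bob has every kind of pet in the table

persons.add(("Lynda", "Bird"))
pets = {pet for (_, pet) in persons}
print(divide(persons, pets))  # now nobody has every kind of pet
```

This mirrors the example in the question: once (Lynda, Bird) is added, Bob is missing a bird and Lynda is missing a dog and a cat, so the result is empty.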

E-R diagram confusion

I am in the process of designing this E-R diagram for a shop of which I have shown part of below (the rest is not relevant). See the link please:
E-R diagram
The issue that I have is that the shop only sells two items, Socks and Shoes.
Have I correctly detailed this in my diagram? I'm not sure if my cardinalities and/or my design is correct. A customer has to buy at least one of these items for the order to exist (but has the liberty to buy any number).
The Shoe and Sock entities would have their respective ID attribute, and I am planning to translate to a relational schema like this:
(I forgot to add to my diagram the ORDER_CONTAINS relationship to have an attribute called "Quantity". )
Table: Order_Contains
ORDER_ID | SHOEID | SOCKID | QTY
primary key | FK, could be null |FK, could be null | INT
This clearly won't work since the Qty would be meaningless. Is there a way I can reduce the products to just two products and make all this work?
Having two one-to-many relationships combined into one with nullable fields is a poor design. How would you record an order containing both shoes and socks: a row per shoe with SOCKID set to NULL and vice versa for socks, or would you combine rows? In the former case the meaning of QTY is clear, though it depends on the contents of the SHOEID/SOCKID fields, but what would QTY mean in the latter case? How would you deal with rows where both SHOEID and SOCKID are NULL and the QTY is positive? Keep in mind Murphy's law of databases: if it can be recorded, it will be. Worse, your primary key (ORDER_ID) will prevent you from recording more than one row per order, so a customer couldn't buy more than one kind of socks or shoes in the same order.
A better design would be to have two separate relations:
Order_Socks (ORDER_ID PK/FK, SOCKID PK/FK, QTY)
Order_Shoes (ORDER_ID PK/FK, SHOEID PK/FK, QTY)
With this, there's only one way to record the contents of an order and it's unambiguous.
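A minimal sketch of the two relations using Python's sqlite3 module might look like the following. The CHECK on QTY, the in-memory database and the helper add_line are assumptions added for illustration, and the foreign keys to the order, sock and shoe tables are omitted to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Order_Socks (
    ORDER_ID INTEGER NOT NULL,
    SOCKID   INTEGER NOT NULL,
    QTY      INTEGER NOT NULL CHECK (QTY > 0),
    PRIMARY KEY (ORDER_ID, SOCKID)
);
CREATE TABLE Order_Shoes (
    ORDER_ID INTEGER NOT NULL,
    SHOEID   INTEGER NOT NULL,
    QTY      INTEGER NOT NULL CHECK (QTY > 0),
    PRIMARY KEY (ORDER_ID, SHOEID)
);
""")

INSERTS = {
    "socks": "INSERT INTO Order_Socks VALUES (?, ?, ?)",
    "shoes": "INSERT INTO Order_Shoes VALUES (?, ?, ?)",
}


def add_line(kind, order_id, item_id, qty):
    """Return True if the order line was stored, False if a constraint rejected it."""
    try:
        conn.execute(INSERTS[kind], (order_id, item_id, qty))
        return True
    except sqlite3.IntegrityError:
        return False


print(add_line("socks", 1, 7, 3))   # order 1: three pairs of sock 7
print(add_line("socks", 1, 8, 1))   # same order, a different sock
print(add_line("shoes", 1, 42, 1))  # same order, one pair of shoe 42
print(add_line("socks", 1, 7, 2))   # duplicate (order, sock) is rejected
```

The composite primary key allows several different items per order while still giving each (order, item) pair exactly one quantity.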
You have not explained the context very well. I'll try to explain based on what I understand, and give you some hints.
Does your shop sell only these 2 products, and will it always do so? Do the details of these products (color, model, weight, width, etc.) need to be persisted in the database? If yes, then we have two entities in the model, SOCKS and SHOES. Each entity has its own properties. A purchase or an order is usually seen as an event in the ERD. If your customers always buy (or order) socks together with shoes, then there will always be a link between three entities:
CLIENTS --- SHOES --- SOCKS
This connection / association / relationship is an event, and this would be the purchase (or order).
If a customer can buy shoes and socks separately, then socks and shoes are subtypes of a super-entity called PRODUCTS, and a purchase is an event between CUSTOMERS and PRODUCTS. In this case we have a partitioning (subtype) relationship.
If, however, your customers buy products separately, your store will not sell only 2 products forever, and the product details are not always the same and will not be saved as columns in a table, then the case is different.
Shoes and socks are considered products, as well as other items that can be considered in future. Thus, we have records/rows in a PRODUCTS table.
When a customer places an order (or makes a purchase), he or she is acquiring products. There is a strong link between customers and products here, again usually an event, which would be the purchase (or order).
I do not know whether you already do this, but before starting a diagram, write the problem context down on paper or in a document. Include all the details of the situation.
Entities become apparent when they have properties. If you need to save a customer's name, eye color, e-mail and so on, then you will certainly have a CUSTOMER entity.
If you see that entities relate in some way, then you have a relationship, and you should ask yourself what kind of relationship these entities form. In your case, there is a purchasing relationship between products and customers: a purchase (or an order, as you call it). One customer can buy various products, and one product (meaning the type or model, not the individual item on the shelf) can be purchased by several customers, so we have a many-to-many relationship.
The relationship depends on the context. Let's invent something crazy as an example. Say we have customers and products, and you want to persist a situation where customers lick products (something really crazy, just so you can see how the context determines the relationship).
There would be an intimate connection between the customer and product entities (really close... I think...). In this case, the relationship represents a history of customers licking products. This would generate an EVENT. In this event you could put properties such as the date, the number of times a customer licked a given product, the weather, the time, the traffic light color on the street, etc.; only what you need to persist according to your context and needs.
Remember that for each N-N relationship, we need to see whether new entities will emerge from the relationship. This usually happens when you decompose the conceptual model into the logical model. Product orders will probably generate not one but two entities: the ORDER and the products of the order. It is in the products of the order that you place the list of products ordered by each customer, and the quantity.
I would like to point you to study material on ERDs, but unfortunately what I have is all in Portuguese. I hope I have helped you in some way. If you can be more specific about your problem, I think I can help you better. If you have any questions, please ask.

Matching students to courses with course limit (Hungarian, Max Flow, Min-Cost-Flow, ...)

I am currently writing a program which maps students to courses. Currently, I am using a SAT solver, but I am trying to implement a polynomial-time, non-greedy algorithm which solves the following sub-problem:
There are students (50-150)
There are subjects (10-20), e.g. 'math', 'biology', 'art'
There are courses per subject (at least one), e.g. 'math-1', 'math-2', 'biology-1', 'art-1', 'art-2', 'art-3'
A student selects some (fixed) subjects (10-12) and for each subject the student has to be assigned to exactly one of the existing courses (if possible). It does not matter which course 'math-1' or 'math-2' is being selected.
The courses have a maximum number of allowed students (20-34)
Each course is in a fixed block (= timeslot 1 to 13)
A student may not be assigned to courses being in the same block
I am now describing what I have done so far.
(1) Ignoring the course-student-limit
I was able to solve this with the Hungarian algorithm / bipartite matching. Each student can be computed individually by modelling the problem as follows:
left nodes represent the subjects 'math', 'biology', 'art' (of the student)
right nodes represent the blocks '1', '2', .... '13'
an edge is inserted for each course from 'subject' to 'block'
This way the student is assigned for every subject to a course while not attending courses which are in the same block. But course-limits are ignored.
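The per-student model in (1) can be sketched as follows. This is not the Hungarian algorithm itself but a plain augmenting-path bipartite matching (often called Kuhn's algorithm), which is enough when the edges carry no weights. The function name assign_blocks, the input format and the sample courses are assumptions for illustration.

```python
def assign_blocks(courses_by_subject):
    """courses_by_subject: {subject: [(course_name, block), ...]}.
    Returns {subject: course_name} with all blocks distinct, or None if
    no assignment covers every subject."""
    block_owner = {}  # block -> (subject, course_name) currently using it

    def try_place(subject, visited):
        # Try each course of the subject; if its block is taken, try to
        # move the current owner of that block to another of its courses.
        for course_name, block in courses_by_subject[subject]:
            if block in visited:
                continue
            visited.add(block)
            owner = block_owner.get(block)
            if owner is None or try_place(owner[0], visited):
                block_owner[block] = (subject, course_name)
                return True
        return False

    for subject in courses_by_subject:
        if not try_place(subject, set()):
            return None
    return {subject: course for (subject, course) in block_owner.values()}


# Hypothetical student: blocks are the second element of each pair.
example = {
    "math": [("math-1", 1), ("math-2", 2)],
    "biology": [("biology-1", 1)],
    "art": [("art-1", 2), ("art-2", 3)],
}
print(assign_blocks(example))
```

In this example biology exists only in block 1, so math is pushed to math-2 in block 2 and art ends up in art-2 in block 3. As in the text, course limits are ignored.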
(2) Ignoring the selected subjects of the student
I was able to solve this with a max-flow-algorithm. For each student the following is modelled:
Layer 1: From source to each student with a capacity of 13
Layer 2: From each student to his/her personal block with a capacity of 1
Layer 3: From each student-block to each course in that block with a capacity of 1
Layer 4: From each course to the sink with a capacity of 'max-student-limit'
This way the student selects arbitrary courses and the course limit is fulfilled. But he/she may be unlucky and be assigned to 'math-1', 'math-2' and 'math-3', ignoring the subjects 'biology' and 'art'.
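The layered network in (2) could be built and solved roughly like this. The sketch uses a straightforward Edmonds-Karp max-flow, which is one possible choice and not necessarily the algorithm used above. The node naming, the input format of build_network and the sample data are assumptions for illustration.

```python
from collections import defaultdict, deque


def build_network(students, courses, blocks_per_student=13):
    """courses: {course_name: (block, max_students)}. Returns a capacity map."""
    cap = defaultdict(lambda: defaultdict(int))
    for s in students:
        cap["source"][("student", s)] = blocks_per_student        # layer 1
        for name, (block, _limit) in courses.items():
            cap[("student", s)][("slot", s, block)] = 1            # layer 2
            cap[("slot", s, block)][("course", name)] = 1          # layer 3
    for name, (_block, limit) in courses.items():
        cap[("course", name)]["sink"] = limit                      # layer 4
    return cap


def max_flow(cap, source, sink):
    """Edmonds-Karp: repeatedly augment along shortest paths (mutates cap)."""
    total = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck capacity on the path, then push flow along it.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck  # residual (reverse) edge
            v = u
        total += bottleneck


# Hypothetical data: three students, three courses in two blocks.
courses = {"math-1": (1, 2), "art-1": (1, 5), "biology-1": (2, 1)}
network = build_network(["a", "b", "c"], courses)
print(max_flow(network, "source", "sink"))
```

Here every student gets a seat in block 1, but biology-1 in block 2 has only one seat, so four seats are filled in total. As described above, this model respects the course limits but does not know which subjects a student selected.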
(3) Greedy Hungarian
Another idea I had was to match one student at a time with the Hungarian algorithm, adjusting the weights so that emptier courses are preferred. For example, one could model:
left nodes are subjects of the student
right nodes are blocks
for each course insert an edge from subject to the block of the course with weight = number of free seats
And then compute a maximum-weight matching.
I would really appreciate any suggestions / help.
Thank you!

ER Model Diagram Good design? how to express myself?

I am trying to understand the concept of ER modelling, but I have not yet succeeded. I have designed an ER model for a movie database, but I do not know whether it is a good design or how to connect the entities:
Between Actor and Film I want to say "an actor can play in each film only once" and at the same time "many actors can play in many movies". Is it a 1-to-1 relation or many-to-many?
And HOW do we need to think about entities and the relations between them? In terms of one user, one film, one actor, one director, or in general?
UPDATE: new question: should the relation between Director and Film be 1-to-many or many-to-many? I want to say: "one director can have many films && many directors can have many films".
Think about it like this: There are many movies. There are many actors. It makes sense that you would only want to include each actor in a particular movie once, but otherwise you want to be able to "mix and match" the movies and actors to express the relationship.
Looking at your diagram, you don't seem to have any fields which express the relationship between Film and Actor - those lines need to match actual fields. Read up on foreign keys: http://en.wikipedia.org/wiki/Foreign_key
The relationship between Actor and Movie that you want is actually many-to-many. You can express this with a "join table" (you'd need to add this to your diagram).
Something like this would work:
FilmActor
-------
uidFilm
uidActor
And put a unique constraint on those two fields together so the pair can't be duplicated (i.e. the same Actor can't appear in a Film twice).
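As a small illustration, the join table and its unique constraint could be tried out with Python's sqlite3 module as below. The Film and Actor columns, the sample rows and the helper cast are assumptions for the example; only FilmActor, uidFilm and uidActor come from the sketch above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE Film  (uidFilm  INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE Actor (uidActor INTEGER PRIMARY KEY, name  TEXT);
CREATE TABLE FilmActor (
    uidFilm  INTEGER NOT NULL REFERENCES Film(uidFilm),
    uidActor INTEGER NOT NULL REFERENCES Actor(uidActor),
    UNIQUE (uidFilm, uidActor)  -- an actor appears in a given film at most once
);
INSERT INTO Film  VALUES (1, 'Film A'), (2, 'Film B');
INSERT INTO Actor VALUES (1, 'Actor X'), (2, 'Actor Y');
""")


def cast(film_id, actor_id):
    """Return True if the pairing was stored, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO FilmActor VALUES (?, ?)", (film_id, actor_id))
        return True
    except sqlite3.IntegrityError:
        return False


print(cast(1, 1))  # Actor X in Film A
print(cast(1, 2))  # Actor Y in Film A: many actors per film
print(cast(2, 1))  # Actor X in Film B: many films per actor
print(cast(1, 1))  # the same pairing again is rejected
```

The table as a whole is many-to-many, while the unique constraint enforces "only once" for each individual film/actor pair.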
