Entity relationship diagram advice needed - entity-relationship

I am creating a database based on an ERD I have designed according to some business rules, where I am allowed to make assumptions and implement them for the future.
Business rule:
Entity relationship diagram
Based on the business rules, the customer is invoiced for the holiday, so the relationship would be 1..1. However, I have been left to assume that the customer may receive one or more invoices for the same reservation, for example if the customer makes changes to the reservation or a reminder invoice is raised.
If I leave the relationship as 1..1, then I might as well get rid of the invoice table and use the reservation as the invoice, since they share the same attributes, and link it to payment_method.
I don't know which way is best; this is my first time working with databases.
Please advise

It sounds to me like you should make it a one-to-many relationship between the reservation and the invoice. You say that a customer may receive multiple invoices for a single reservation, such as if the reservation changes. That makes me think that it should be one reservation to one or more invoices.
What I might include on the invoice table would be a field telling if it is the latest invoice, or a nullable field pointing to the next invoice. If an invoice becomes invalid/outdated/superseded, then a new invoice is created and all previous invoices then have their superseded field filled in to point to the most current invoice. That way you can still keep a trail of previous invoices as well as the current one.
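A minimal sketch of the "nullable pointer to the newer invoice" idea, using Python's built-in sqlite3 module. The table and column names (reservation, invoice, superseded_by) and the helper issue_invoice are assumptions made up for illustration, not taken from your ERD:

```python
import sqlite3

# Hypothetical schema: names are illustrative, not from the original ERD.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE reservation (
    reservation_id INTEGER PRIMARY KEY
);
CREATE TABLE invoice (
    invoice_id     INTEGER PRIMARY KEY,
    reservation_id INTEGER NOT NULL REFERENCES reservation(reservation_id),
    amount         INTEGER NOT NULL,
    -- NULL means "this is the current invoice"
    superseded_by  INTEGER REFERENCES invoice(invoice_id)
);
""")


def issue_invoice(conn, reservation_id, amount):
    """Create a new invoice and point every earlier one for the reservation at it."""
    cur = conn.execute(
        "INSERT INTO invoice (reservation_id, amount) VALUES (?, ?)",
        (reservation_id, amount),
    )
    new_id = cur.lastrowid
    conn.execute(
        "UPDATE invoice SET superseded_by = ? "
        "WHERE reservation_id = ? AND invoice_id <> ?",
        (new_id, reservation_id, new_id),
    )
    return new_id


conn.execute("INSERT INTO reservation (reservation_id) VALUES (1)")
first = issue_invoice(conn, 1, 500)
second = issue_invoice(conn, 1, 650)  # reservation changed, so invoice again

# The current invoice is the one nothing has superseded.
current = conn.execute(
    "SELECT invoice_id FROM invoice "
    "WHERE reservation_id = 1 AND superseded_by IS NULL"
).fetchall()
```

The old rows stay in the table, so the invoice trail is preserved, and the current invoice is always the one whose superseded_by is NULL.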

Related

Is a table (from source system) that contains only relationships and current status of a row from another table a fact table in Data Warehouse?

I am developing a BI system for our company, from scratch, and currently, I am designing a data warehouse. I am completely new to this so there are many things that I don't really understand, so I need to hear some more insights into this.
My problems are:
1) In our source system, there are tables called "Booking" and "BookingAccess". The Booking table holds the data of a booking, such as check-in time and check-out time, booking date, booking number, and gross amount of that booking.
BookingAccess, in turn, holds foreign keys related to the booking, such as bookerID, customerID, processID, hotelID, and paymentproviderID, plus the current status of that booking. Booking and BookingAccess have a 1:1 relationship.
Our source system is about checking the validity of those bookings; the bookings are not ours. We receive the booking information from other sources and carry out the validation process for them as an outsourced service. The gross amount is just a piece of information about the booking that we need to validate; it is not part of our business. The current status held in the BookingAccess table is the status of that booking in our system, which can be "Processing" or "Finished".
From what I have read of Ralph Kimball, in this situation "Booking" is the dimension table and BookingAccess should be the fact. I feel that BookingAccess is somewhat like an accumulating snapshot table, in which I should track the time when a booking enters "Processing" and when it becomes "Finished".
Do I get it right?
2) In the "Booking" table, there is also a foreign key called "ImportID". This key links to a table called "Import". The "Import" table holds history records of the files that were imported into our system (these files contain the bookings that are written to the "Booking" table), including attributes such as file name, import date, total bookings imported...
From my point of view, this is clearly a fact table.
But the problem is that the "Import" table and the "Booking" table have a one-to-many relationship (one ImportID in the "Import" table can have 1, 2, or more records with the same ImportID in the "Booking" table). This goes against the idea of fact tables, which insists that the relationship between fact and dimension must be many-to-one, with the fact always on the many side.
So what approach should I use to solve this case? I'm thinking of using bridge tables. But I don't know if this is good practice: there are a lot of records in the "Import" table, so I would have to create a big bridge table just to cover all of them.
3) Should I split a table (from the source system) that contains a mix of relationships and information into a fact table containing only relationships and a dimension table containing only information? (For example, a table called "Customer" in the source system. This table contains things like customer name, customer address, customertype id, customer parentID....)
I am asking this because if I use BI tools to analyze things (for example, the number of customers which have customertypeid = 1), it feels somewhat weird if there are no fact tables involved.
Or should I treat it as a mere dimension table and use a snowflake schema? But this will lead to a mix of star schema and snowflake schema in our data warehouse. Is this normal? I have read some official sources (most likely Oracle) stating that one should avoid using and mixing in snowflake schemas as much as possible. But some sources like Microsoft say that this is very normal. Even the AdventureWorks data warehouse sample database uses this kind of approach.
Or should I denormalize every relation in that "Customer" table? I don't think this is a good approach, as it will make the Customer dimension contain a lot of columns, and it will be very hard to track the history of every row in the "DIM_Customer" table. For example, if any change occurs in any relation of the "Customer" table, the whole "DIM_Customer" table will need to be updated.
I still have a lot of questions regarding data warehouses. I am working on this nearly alone, without any help or consultant, so pardon me if I have made any mistakes or caused any inconvenience.

How to determine if something should be an entity or an identifying relationship?

I'm trying to sketch out an ERD for booking hotel rooms. I had the "Reservation" as its own entity, so a User makes a reservation, and a reservation is for a room. But I guess it could also be an identifying relationship between the user and the room, as it joins them both. A user reserves a room. The reservation table would have a user_id and room_id.
I would think other entities should be related to the reservation though, such as payment, rate of cost. Any input would help, I'm quite new to these. Thanks!
An identifying relationship is the relationship between a weak entity and its parent entity. A weak entity is an entity that can't be identified by its own attributes and has another entity's key as part of its own.
Does this apply to your situation? Is a User partially identified by the Room they reserve, or a Room by the User who reserved it? I would think not.
The other possibility is to make Reservation a regular relationship. Can a Reservation be identified by some combination of the entities it's related to? I don't think so. I imagine any Reservation could be repeated at a later date, but dates are usually seen as value sets, not entity sets.
Reservation should probably be an entity set.
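A rough sketch of Reservation as its own entity, using Python's sqlite3 module. All table and column names, and the payments table, are assumptions for illustration. The point is that the reservation has its own key, so the same user can reserve the same room more than once, and other entities can reference the reservation directly:

```python
import sqlite3

# Hypothetical schema: names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE rooms (room_id INTEGER PRIMARY KEY, number TEXT NOT NULL);
CREATE TABLE reservations (
    reservation_id INTEGER PRIMARY KEY,  -- the reservation's own identity
    user_id   INTEGER NOT NULL REFERENCES users(user_id),
    room_id   INTEGER NOT NULL REFERENCES rooms(room_id),
    check_in  TEXT NOT NULL,
    check_out TEXT NOT NULL
);
CREATE TABLE payments (
    payment_id     INTEGER PRIMARY KEY,
    reservation_id INTEGER NOT NULL REFERENCES reservations(reservation_id),
    amount_cents   INTEGER NOT NULL
);
""")

conn.execute("INSERT INTO users VALUES (1, 'Ann')")
conn.execute("INSERT INTO rooms VALUES (1, '101')")

# The same user books the same room twice, at different dates.
conn.executemany(
    "INSERT INTO reservations (user_id, room_id, check_in, check_out) "
    "VALUES (?, ?, ?, ?)",
    [(1, 1, "2024-05-01", "2024-05-03"), (1, 1, "2024-09-10", "2024-09-12")],
)

repeat_stays = conn.execute(
    "SELECT COUNT(*) FROM reservations WHERE user_id = 1 AND room_id = 1"
).fetchone()[0]
```

Had (user_id, room_id) been the key, the second stay could not be recorded.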

E-R diagram confusion

I am in the process of designing an E-R diagram for a shop, part of which I have shown below (the rest is not relevant). Please see the link:
E-R diagram
The issue that I have is that the shop only sells two items, Socks and Shoes.
Have I correctly detailed this in my diagram? I'm not sure if my cardinalities and/or my design is correct. A customer has to buy at least one of these items for the order to exist (but has the liberty to buy any number).
The Shoe and Sock entities would have their respective ID attribute, and I am planning to translate to a relational schema like this:
(I forgot to add an attribute called "Quantity" to the ORDER_CONTAINS relationship in my diagram.)
Table: Order_Contains
ORDER_ID    | SHOEID            | SOCKID            | QTY
primary key | FK, could be null | FK, could be null | INT
This clearly won't work since the Qty would be meaningless. Is there a way I can reduce the products to just two products and make all this work?
Having two one-to-many relationships combined into one with nullable fields is a poor design. How would you record an order containing both shoes and socks - a row per shoe with SOCKID set to NULL and vice-versa for socks, or would you combine rows? In the former case the meaning of QTY is clear though it depends on the contents of SHOEID/SOCKID fields, but what would the QTY mean in the latter case? How would you deal with rows where both SHOEID and SOCKID are NULL and the QTY is positive? Keep in mind Murphy's law of databases - if it can be recorded it will be. Worse, your primary key (ORDER_ID) will prevent you from recording more than one row, so a customer couldn't buy more than one (pair of) socks or shoes.
A better design would be to have two separate relations:
Order_Socks (ORDER_ID PK/FK, SOCKID PK/FK, QTY)
Order_Shoes (ORDER_ID PK/FK, SHOEID PK/FK, QTY)
With this, there's only one way to record the contents of an order and it's unambiguous.
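As a sketch of how those two relations could be declared, here is a version using Python's sqlite3 module. The parent tables (Orders, Socks, Shoes) and the CHECK on QTY are my own assumptions, and the order table is called Orders only because ORDER is a reserved word in SQL:

```python
import sqlite3

# Hypothetical parent tables; only Order_Socks / Order_Shoes come from the answer.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Orders (ORDER_ID INTEGER PRIMARY KEY);
CREATE TABLE Socks  (SOCKID   INTEGER PRIMARY KEY);
CREATE TABLE Shoes  (SHOEID   INTEGER PRIMARY KEY);
CREATE TABLE Order_Socks (
    ORDER_ID INTEGER NOT NULL REFERENCES Orders(ORDER_ID),
    SOCKID   INTEGER NOT NULL REFERENCES Socks(SOCKID),
    QTY      INTEGER NOT NULL CHECK (QTY > 0),  -- assumption: no empty rows
    PRIMARY KEY (ORDER_ID, SOCKID)
);
CREATE TABLE Order_Shoes (
    ORDER_ID INTEGER NOT NULL REFERENCES Orders(ORDER_ID),
    SHOEID   INTEGER NOT NULL REFERENCES Shoes(SHOEID),
    QTY      INTEGER NOT NULL CHECK (QTY > 0),
    PRIMARY KEY (ORDER_ID, SHOEID)
);
""")

conn.execute("INSERT INTO Orders VALUES (1)")
conn.executemany("INSERT INTO Socks VALUES (?)", [(1,), (2,)])
conn.execute("INSERT INTO Shoes VALUES (1)")

# One order with two kinds of socks and one kind of shoe.
conn.executemany("INSERT INTO Order_Socks VALUES (?, ?, ?)", [(1, 1, 3), (1, 2, 1)])
conn.execute("INSERT INTO Order_Shoes VALUES (1, 1, 1)")

# A second row for the same sock in the same order violates the composite key.
try:
    conn.execute("INSERT INTO Order_Socks VALUES (1, 1, 5)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The composite primary key lets one order hold several different socks and shoes, while still allowing only one row (and so one QTY) per item per order.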
You have not explained the context very well. I'll try to explain based on what I understand and give you some hints.
Does your shop sell only these 2 products, always (forever)? Do the details of these products (color, model, weight, width, etc.) need to be persisted in the database? If yes, then we have two entities in the model, SOCKS and SHOES. Each entity has its own properties. A purchase or an order is usually seen as an event in the ERD. If your customers always buy (or order) socks together with shoes, then there will always be a link between three entities:
CLIENTS --- SHOES --- SOCKS
This connection / association / relationship is an event, and this would be the purchase (or order).
If a customer can buy shoes and socks separately, then socks and shoes are subtypes of a super entity called PRODUCTS, and a purchase is an event between CUSTOMERS and PRODUCTS. In this case we have a partitioning relationship.
If, however, your customers buy products separately, your store will not sell only 2 products forever, and the details of the products are not always the same and will not be saved as columns in a table, then the case is different.
Shoes and socks are then considered products, along with any other items that may be added in the future. Thus, we have records/rows in a PRODUCTS table.
When a customer places an order (or makes a purchase), he or she is acquiring products. There is a strong link between customers and products here, again usually an event, which would be the purchase (or order).
I do not know if you already do this, but before starting a diagram, write down the problem context on paper or in a document. List all the details present in the situation.
Entities become apparent when they have properties. If you need to save the name of a customer, the customer's eye color, the customer's e-mail, and so on, then you will certainly have a CUSTOMER entity.
If you see that entities relate in some way, then you have a relationship, and you should ask yourself what kind of relationship these entities form. In your case of products and customers, we have a purchasing relationship between them. The established relationship is a purchase (or an order, as you call it). One customer can buy various products, and one product (not the individual item on the shelf, but the type or model) can be purchased by several customers; thus, we have a many-to-many relationship.
The relationship created changes according to the context. Let's invent something crazy as an example. Say we have customers and products, and you want to persist a situation where customers lick products (something really crazy, just so you can see how the context determines the relationship).
There would be an intimate connection between the customer and product entities (really close... I think...). In this case, the relationship represents a history of customers licking products. This would generate an EVENT. On this event you could put properties such as the date, the number of times a customer licked a given product, the weather, the time, the traffic light color on the street, etc.: only what you need to persist according to your context and your needs.
Remember that for the N-N relationships created, we need to see whether new entities will emerge out of the relationship. This usually happens when you are decomposing the conceptual model into the logical model. Product orders will probably generate not one but two entities: the ORDER and the products of the order. It is within the products of the order that you place the list of products ordered by each customer, and the quantity.
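To make that decomposition concrete, here is a small sketch with Python's sqlite3 module. The table and column names (customers, products, orders, order_items, kind, quantity) are only my assumptions for illustration; adapt them to your context:

```python
import sqlite3

# Hypothetical logical model: the N-N purchase relationship becomes
# two tables, orders (the event) and order_items (the products of the order).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    kind       TEXT NOT NULL,   -- e.g. 'shoes' or 'socks'
    name       TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
);
CREATE TABLE order_items (
    order_id   INTEGER NOT NULL REFERENCES orders(order_id),
    product_id INTEGER NOT NULL REFERENCES products(product_id),
    quantity   INTEGER NOT NULL CHECK (quantity > 0),
    PRIMARY KEY (order_id, product_id)
);
""")

conn.execute("INSERT INTO customers VALUES (1, 'Ann')")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, "shoes", "Trail runner"), (2, "socks", "Wool crew")],
)
conn.execute("INSERT INTO orders VALUES (1, 1)")
conn.executemany("INSERT INTO order_items VALUES (?, ?, ?)", [(1, 1, 1), (1, 2, 4)])

# List what order 1 contains, with the quantity of each product.
items = conn.execute(
    "SELECT p.kind, oi.quantity FROM order_items oi "
    "JOIN products p ON p.product_id = oi.product_id "
    "WHERE oi.order_id = 1 ORDER BY p.kind"
).fetchall()
```

A new kind of product is then just a new row in products; no table or column has to be added.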
I would like to point you to some materials for studying ERDs, but unfortunately they are all in Portuguese. I hope I have helped in some way. If you can be more specific about your problem, I think I can help you better. If anything is unclear, please ask.

How to design/model a has many relationship that has a meaningful join table?

I wasn't able to put into words well (in the question title) what I'm trying to do, so in honor of the saying that an image is worth a thousand words, in a nutshell what I'm trying to do is..
Basically, what I have is: a Teacher has many Appointments and a Student has many Appointments, which roughly translates to:
I'm trying to stay away from using the has_and_belongs_to_many macro, because my appointments model has some meaning (operations); for instance, it has a Boolean field: confirmed.
So, I was thinking about using the has_many :through macro, and perhaps using an "Appointable" join table model? What do you guys think?
The scenario I'm trying to code is simple:
A Student requests an Appointment with a Teacher at a certain Date/Time.
If the Teacher is available (and wants to give a lesson at that Date/Time), she confirms the Appointment.
I hope you can tell me how you would approach this problem. Is my assumption of using the has_many :through macro correct?
Thank you!
Both teachers and students could inherit from a single class e.g. Person. Then create an association between Person and Appointments. This way you keep the architecture open so that if in the future you want to add 'Parents' then they could easily be integrated and may participate in appointments.
It may not be completely straightforward how you do the joins with the child classes (Students, Parents, Teachers). It may involve polymorphic relationships, which I don't particularly like. You should, though, be able to get away with a single join table.
In any case, you want to design so that your system can be extended. Some extra work early on will save you a lot of work later.
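As a rough sketch of that single join table at the database level, here it is in plain SQL via Python's sqlite3 module rather than as Rails models. Every table and column name is made up for illustration, and the role column is just one assumed way to tell teachers, students, and parents apart:

```python
import sqlite3

# Hypothetical layout: one people table, one appointments table,
# and a single join table linking the two.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE people (
    person_id INTEGER PRIMARY KEY,
    role      TEXT NOT NULL CHECK (role IN ('teacher', 'student', 'parent')),
    name      TEXT NOT NULL
);
CREATE TABLE appointments (
    appointment_id INTEGER PRIMARY KEY,
    starts_at      TEXT NOT NULL,
    confirmed      INTEGER NOT NULL DEFAULT 0   -- Boolean flag, 0 = not confirmed
);
CREATE TABLE appointment_participants (
    appointment_id INTEGER NOT NULL REFERENCES appointments(appointment_id),
    person_id      INTEGER NOT NULL REFERENCES people(person_id),
    PRIMARY KEY (appointment_id, person_id)
);
""")

conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "teacher", "Ms. Lee"), (2, "student", "Sam")],
)

# The student requests an appointment; it starts out unconfirmed.
cur = conn.execute(
    "INSERT INTO appointments (starts_at) VALUES (?)", ("2024-06-01 10:00",)
)
appt_id = cur.lastrowid
conn.executemany(
    "INSERT INTO appointment_participants VALUES (?, ?)",
    [(appt_id, 1), (appt_id, 2)],
)

# The teacher is available and confirms.
conn.execute(
    "UPDATE appointments SET confirmed = 1 WHERE appointment_id = ?", (appt_id,)
)

roles = [
    row[0]
    for row in conn.execute(
        "SELECT p.role FROM appointment_participants ap "
        "JOIN people p ON p.person_id = ap.person_id "
        "WHERE ap.appointment_id = ? ORDER BY p.role",
        (appt_id,),
    )
]
```

Adding parents later means adding rows with role 'parent', not another table or association.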

Freezing associated objects

Does anyone know of any method in Rails by which an associated object may be frozen? The problem I am having is that I have an order model with many line items, which in turn belong to a product or service. When the order is paid for, I need to freeze the details of the ordered items so that when the price is changed, the order's totals are preserved.
I worked on an online purchase system before. What you want to do is have an Order class and a LineItem class. LineItems store product details like price, quantity, and maybe some other information you need to keep for records. It's more complicated but it's the only way I know to lock in the details.
An Order is simply made up of LineItems and probably contains shipping and billing addresses. The total price of the Order can be calculated by adding up the LineItems.
Basically, you freeze the data before the person makes the purchase. When products are added to an order, the data is frozen because LineItems duplicate the necessary product information. This way, when a product is removed from your system, you can still make sense of old orders.
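The idea is not specific to Rails, so here is a language-neutral sketch in plain Python. The class and attribute names (Product, LineItem, Order, unit_price, from_product) are hypothetical, chosen only to illustrate copying the details at the moment an item is added:

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class Product:
    name: str
    price: Decimal  # current price; may change at any time


@dataclass(frozen=True)
class LineItem:
    # Copies of the product details, not a reference to the product.
    product_name: str
    unit_price: Decimal
    quantity: int

    @classmethod
    def from_product(cls, product, quantity):
        """Snapshot the product's name and price as they are right now."""
        return cls(product.name, product.price, quantity)

    @property
    def total(self):
        return self.unit_price * self.quantity


@dataclass
class Order:
    line_items: list = field(default_factory=list)

    def add(self, product, quantity):
        self.line_items.append(LineItem.from_product(product, quantity))

    @property
    def total(self):
        return sum((item.total for item in self.line_items), Decimal("0"))


widget = Product("Widget", Decimal("1.00"))
order = Order()
order.add(widget, 3)

widget.price = Decimal("5.00")  # later price change
order_total = order.total       # still computed from the copied price
```

Because the line item holds its own copy of the name and price, the order still makes sense after the product's price changes or the product is removed.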
You may want to look at a Rails plugin called 'AASM' (formerly acts_as_state_machine) to handle the state of an order.
Edit: AASM can be found here http://github.com/rubyist/aasm/tree/master
A few options:
1) Add a version number to your model. At the day job we do course scheduling. A particular course might be updated occasionally but, for business rule reasons, it's important to know what it looked like on the day you signed up. Add :version_number to the model and find_latest_course(course_id), alter code as appropriate, stir a bit. In this case you don't "edit" models so much as you do a new save of the new, updated version. (Then, obviously, your LineItems carry an item_id and an item_version_number.)
This generic pattern can be extended to cover, shudder, audit trails.
2) Copy data into LineItem objects at LineItem creation time. Just because you can slap has_a on anything, doesn't mean you should. If a 'LineItem' is supposed to hold a constant record of one item which appeared on an invoice, then make the LineItem hold a constant record of one item which appeared on an invoice. You can then update InventoryItem#current_price at will without affecting your previously saved LineItems.
3) If you're lazy, just freeze the price on the order object. Not really much to recommend this but, hey, it works in a pinch. You're probably just delaying the day of reckoning though.
"I ordered from you 6 months ago and now am doing my taxes. Why won't your bookstore show me half of the books I ordered? What do you mean their IDs were purged when you stopped selling them?! I need to know which I can get deductions for!"
Shouldn't the prices already be frozen when the items are added to the order? Say I put a widget into my shopping basket thinking it costs $1, and by the time I'm at the register it costs $5 because you changed the price.
Back to your problem: I don't think it's a language issue, but a functional one. Instead of associating the prices with items, you need to copy the prices. If every item in the order has its own version of a price, future price changes won't affect it, you can add discounts, etc.
Actually, to be clean you need to add versioning to your prices. When an item's price changes, you don't overwrite the price, you add a newer version. The line items in your order will still be associated with the old price.
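A minimal sketch of price versioning, written with Python's sqlite3 module rather than Rails. The tables (products, prices, line_items), the amount_cents column, and the set_price helper are all assumptions made up for this example:

```python
import sqlite3

# Hypothetical schema: a price change inserts a new row, it never updates one.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE prices (
    price_id     INTEGER PRIMARY KEY,
    product_id   INTEGER NOT NULL REFERENCES products(product_id),
    version      INTEGER NOT NULL,
    amount_cents INTEGER NOT NULL,
    UNIQUE (product_id, version)
);
CREATE TABLE line_items (
    line_item_id INTEGER PRIMARY KEY,
    price_id     INTEGER NOT NULL REFERENCES prices(price_id),
    quantity     INTEGER NOT NULL
);
""")


def set_price(conn, product_id, amount_cents):
    """Add a new price version instead of overwriting the old one."""
    (latest,) = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM prices WHERE product_id = ?",
        (product_id,),
    ).fetchone()
    cur = conn.execute(
        "INSERT INTO prices (product_id, version, amount_cents) VALUES (?, ?, ?)",
        (product_id, latest + 1, amount_cents),
    )
    return cur.lastrowid


conn.execute("INSERT INTO products VALUES (1, 'Widget')")
old_price = set_price(conn, 1, 100)  # $1.00
conn.execute(
    "INSERT INTO line_items (price_id, quantity) VALUES (?, 2)", (old_price,)
)
set_price(conn, 1, 500)              # price goes up to $5.00 later

# The existing line item still resolves to the price version it was sold at.
(line_total,) = conn.execute(
    "SELECT li.quantity * p.amount_cents FROM line_items li "
    "JOIN prices p ON p.price_id = li.price_id"
).fetchone()
```

The line item points at a specific price version, so later versions never change what an existing order adds up to.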
