database design for composite elements - ruby-on-rails

I'm building a site that tracks donations and sales of items in a school auction.
Items can be sold individually or in lots, which are just groups of items bundled for sale as a single unit (like a gift certificate for a dinner Item bundled with a gift certificate for movie tickets Item).
Both of these things (Items and Lots) share fields like name, description, value. But Items have additional fields, like the donor, restrictions of use, type of item, etc.
I started by creating a table called Lot and an association table that lets Lots contain 1+ Items.
That works great for Lots. But that leaves me with a problem:
When Buyers win I need to record the win and the price. I'm doing that with a Win table that associates the Buyer with the Lot and the winning price.
But how do I deal with all the Items that aren't assigned to Lots? Should every item be in a Lot, just singly? That would make sense because it would work with the Win table scheme above, but I would need to automatically create a Lot for every Item that isn't already in another Lot. Which seems weird.
I'm sure this is a simple problem, but I can't figure it out!
Thanks!

Your approach of treating every item as a lot should be the winning one. It may sound weird, but it will make things way easier in the long run.
I deal on a daily basis with a database where a similar problem was 'solved' the other way round, keeping bundles of items and single items apart, and that has proved to be a great pita (and for sure I'm not talking about a flat round bread here).
This database is the backbone for both statistical evaluations and a bunch of (web) applications, and on countless occasions I have run into trouble deciding which table to choose, or how to level the differences between those two groups in queries and in code.
So, even if your project ends up being rather small, that is a good idea.
Yes, you need to provide a method to put every item in a lot, but you only have to take that trouble once. On the other hand, your queries won't become significantly more complex because of that 'extra' table, so I would definitely choose this way.
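To make the "every item lives in a lot" idea concrete, here is a minimal plain-Ruby sketch (no ActiveRecord, so it runs on its own). The names `Lot.for_single_item` and `single?` are hypothetical helpers invented for illustration, not part of the asker's schema.

```ruby
# Plain-Ruby sketch: every Item is sold through a Lot,
# so a Win only ever needs to reference a Lot.
Item = Struct.new(:name, :value)
Win  = Struct.new(:buyer, :lot, :price)

class Lot
  attr_reader :name, :items

  def initialize(name, items)
    @name  = name
    @items = items
  end

  # Hypothetical helper: wrap a lone item in its own single-item lot.
  def self.for_single_item(item)
    new(item.name, [item])
  end

  # A lot's value is the sum of the values of its items.
  def value
    items.sum(&:value)
  end

  def single?
    items.size == 1
  end
end

dinner = Item.new("Dinner certificate", 50)
movies = Item.new("Movie tickets", 20)
quilt  = Item.new("Handmade quilt", 120)

bundle = Lot.new("Night out", [dinner, movies])
solo   = Lot.for_single_item(quilt)

wins = [Win.new("Alice", bundle, 65), Win.new("Bob", solo, 140)]
wins.each { |w| puts "#{w.buyer} won #{w.lot.name} for #{w.price}" }
```

In a Rails app the wrapping step could plausibly live in a callback or a small service object that runs when an item is created without a lot; that part is an assumption, not something the question specifies.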

It sounds like you have an Auction model that could have one or many Items. Then you could have two different types of Auctions, Auction::Single and Auction::Lot. Price would be a column on Auction. Auction has many Bids which is a join model between the Auction and the User (or Bidder). That join model would also store the bid price. When the Auction is over, you could create a separate Win record if you want, or just find the Winner through the highest Bid from Auction.
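A rough plain-Ruby sketch of that idea, assuming a single `Auction` class with a `lot?` predicate instead of the `Auction::Single` / `Auction::Lot` subclasses (a simplification made here for brevity). The method names `place_bid` and `winning_bid` are hypothetical.

```ruby
# Plain-Ruby sketch: the winner is derived from the highest bid
# rather than stored in a separate Win record.
Bid = Struct.new(:bidder, :amount)

class Auction
  attr_reader :items, :bids

  def initialize(items)
    @items = items
    @bids  = []
  end

  # More than one item means the auction is for a lot.
  def lot?
    items.size > 1
  end

  def place_bid(bidder, amount)
    @bids << Bid.new(bidder, amount)
  end

  # Returns the bid with the largest amount, or nil if nobody has bid.
  def winning_bid
    bids.max_by(&:amount)
  end
end

auction = Auction.new(["Dinner certificate", "Movie tickets"])
auction.place_bid("Alice", 40)
auction.place_bid("Bob", 55)
puts "Winner: #{auction.winning_bid.bidder}"
```

Deriving the winner keeps one source of truth, at the cost of having to decide separately how ties and withdrawn bids are handled.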

It would be helpful if you showed some code. But, what you want is a polymorphic association. So, something like:
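class Item < ActiveRecord::Base
  has_one :win, as: :winnable
  belongs_to :lot
end

class Lot < ActiveRecord::Base
  has_one :win, as: :winnable
  has_many :items
end

class Win < ActiveRecord::Base
  belongs_to :buyer
  # A win points at either an Item or a Lot
  belongs_to :winnable, polymorphic: true
end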
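A polymorphic `belongs_to :winnable` stores two columns on the wins table, `winnable_type` and `winnable_id`, and uses the pair to find the right record. The plain-Ruby sketch below imitates that lookup with in-memory hashes; `REGISTRY`, `ITEMS`, and `LOTS` are stand-ins invented for illustration, not Rails APIs.

```ruby
# Plain-Ruby imitation of a polymorphic lookup: a (type, id) pair
# selects which "table" to read and which row to fetch.
ITEMS = { 1 => "Handmade quilt" }.freeze
LOTS  = { 1 => "Night out" }.freeze

# Maps the stored type name to the collection it refers to.
REGISTRY = { "Item" => ITEMS, "Lot" => LOTS }.freeze

Win = Struct.new(:buyer, :winnable_type, :winnable_id, :price) do
  # Resolve the thing that was won, whatever its type.
  def winnable
    REGISTRY.fetch(winnable_type).fetch(winnable_id)
  end
end

wins = [
  Win.new("Alice", "Lot", 1, 65),
  Win.new("Bob", "Item", 1, 140)
]
wins.each { |w| puts "#{w.buyer} won #{w.winnable} for #{w.price}" }
```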

Related

Small number of set categories with many-to-many relation?

I'm relatively new to the Rails framework and I'm not sure if the approach I am taking is the most efficient/effective way or if I am following Rails conventions well.
The basic issue I have is that my application will have a Company model and various set Categories (not editable by the user). Each Company can be part of multiple Categories. My understanding, from other examples, is that I should set the relationships as something like:
Company has_and_belongs_to_many Categories
Category has_and_belongs_to_many Companies
However, since there will not be that many categories (<10), and since they will not change/be editable/be added/be removed by users, I'm not sure I need to create a whole new table for categories then join them onto Companies? Is there a better way to do this in Rails that I'm missing? Thanks in advance!
Even though you may only have 10 categories or so, I would say it is still fine to keep them in their own table. Setting up the relationships gives you the programmatic power to retrieve the companies related to a single category, and vice versa, without having to construct the queries yourself.
An example of the simplicity for keeping those in the database:
# Get all companies under a specific Category
@category = Category.find(1)
@companies = @category.companies
That's pretty simple if you ask me. And if you add another category to the table in the future, you won't need to write any new code to get it to work.
Another thing, I would check out using has_many :through instead of has_and_belongs_to_many (habtm), as habtm can cause unforeseen problems as your application gets bigger. Here is a great article that goes into that problem a lot deeper: Why You Don’t Need Has_and_belongs_to_many Relationships. Not saying you can't use it (if the shoe fits), but generally it's good to be aware of potential problems so you can make the right decision for you and your app.
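What distinguishes has_many :through from habtm is that the join row is a model in its own right. The plain-Ruby sketch below shows that shape without ActiveRecord; `Categorization` and `Directory` are hypothetical names chosen for illustration.

```ruby
# Plain-Ruby sketch of a many-to-many link with an explicit join object.
Company        = Struct.new(:name)
Category       = Struct.new(:name)
Categorization = Struct.new(:company, :category)

class Directory
  def initialize
    @categorizations = []
  end

  # Creating a link means creating a join record.
  def categorize(company, category)
    @categorizations << Categorization.new(company, category)
  end

  def companies_in(category)
    @categorizations.select { |c| c.category == category }.map(&:company)
  end

  def categories_of(company)
    @categorizations.select { |c| c.company == company }.map(&:category)
  end
end

retail = Category.new("Retail")
tech   = Category.new("Tech")
acme   = Company.new("Acme")

directory = Directory.new
directory.categorize(acme, retail)
directory.categorize(acme, tech)
puts directory.categories_of(acme).map(&:name).join(", ")
```

Because the join is a real object, extra attributes (who assigned the category, when) could be added to it later without restructuring the two sides.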

How to design/model a has many relationship that has a meaningful join table?

I wasn't able to put into words (in the question title) what I'm trying to do, so in honor of the saying that an image is worth a thousand words, in a nutshell what I'm trying to do is:
Basically, what I have is A Teacher has many Appointments and A Student has many Appointments which roughly translates to:
I'm trying to stay away from the has_and_belongs_to_many macro, because my Appointment model carries some meaning (operations) of its own; for instance, it has a Boolean field: confirmed.
So, I was thinking about using the has_many :through macro, and perhaps using an "Appointable" join table model? What do you guys think?
The Scenario I'm trying to code is simple;
A Student requests an Appointment with a Teacher at certain Date/Time
If the Teacher is available (and wants to give a lesson at that Date/Time), she confirms the Appointment.
I hope you can tell me how you would approach this problem. Is my assumption of using the has_many :through macro correct?
Thank you!
Both teachers and students could inherit from a single class e.g. Person. Then create an association between Person and Appointments. This way you keep the architecture open so that if in the future you want to add 'Parents' then they could easily be integrated and may participate in appointments.
It may not be completely straightforward how you do the joins with the children classes (Students, Parents, Teachers). It may involve polymorphic relationships which I don't particularly like. You should though get away with a single join table.
In any case, you want to design so that your system can be extended. Some extra work early on will save you a lot of work later.
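As a rough illustration of both the request-then-confirm scenario and the shared `Person` parent, here is a plain-Ruby sketch with no ActiveRecord involved. The method names `confirm!` and `confirmed?` are assumptions based on the Boolean `confirmed` field mentioned in the question.

```ruby
# Plain-Ruby sketch: Appointment is a meaningful join between two people.
class Person
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

class Teacher < Person; end
class Student < Person; end

class Appointment
  attr_reader :student, :teacher, :starts_at

  # A student requests an appointment; it starts out unconfirmed.
  def initialize(student, teacher, starts_at)
    @student   = student
    @teacher   = teacher
    @starts_at = starts_at
    @confirmed = false
  end

  # The teacher accepts the requested date and time.
  def confirm!
    @confirmed = true
  end

  def confirmed?
    @confirmed
  end
end

appointment = Appointment.new(Student.new("Sam"), Teacher.new("Ms. Lee"),
                              Time.new(2024, 1, 15, 10, 0))
appointment.confirm!
puts "#{appointment.teacher.name} confirmed: #{appointment.confirmed?}"
```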

Simple Ruby on Rails database design

For practice I'm writing a shopping website where we have tables User and Item. A user obviously has_many items (when they are added to their basket), but does the item belong_to a User, even though many users will have the same item in their basket?
Furthermore, what if I want a list of items a user has added to their basket, but also a list of items they have viewed (for making suggestions based on searches), would it be better to have some 'through' tables: Basket and Viewed?
When you have a many-to-many relationship like this, you can use the HABTM schema:
class User < ActiveRecord::Base
  has_and_belongs_to_many :items
  # ...
end
However, most of the time webshops use orderlines to keep track of the items that users are purchasing. This means that a 'user' 'has_many' 'orderlines', an 'item' 'has_many' 'orderlines', and an 'orderline' 'belongs_to' a 'user' and an 'item'.
And maybe your orderlines will just be copies of items, and won't have a direct link because you don't want to alter the orderline after they have been processed. It really depends on the focus of your shop which scheme suits your needs.
Try to find some examples on the web and think about how you want to handle items, orders and baskets.
I'm used to separating things that are not the same, even if the relationship is one-to-one. So first of all I would recommend separating users from baskets (1:1 relationship).
After that, a basket contains many items and items can be in multiple baskets (m:n relationship). Make sure you account for a user wanting to buy the same item multiple times.
Views can be realised as a linking table between users and items: users have many views and items have many views, but one view is always linked to exactly one user and one item.
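The basket and view links described above can be sketched in plain Ruby as follows. `BasketLine`, its `quantity` field, and the `viewed_items` helper are hypothetical names; the quantity is one possible way to handle a user buying the same item several times.

```ruby
# Plain-Ruby sketch: a basket line links a basket to an item and carries
# a quantity; a view links a user to an item.
Item       = Struct.new(:name, :price)
BasketLine = Struct.new(:item, :quantity)
View       = Struct.new(:user, :item)

class Basket
  attr_reader :lines

  def initialize
    @lines = []
  end

  # Adding an item already in the basket bumps its quantity
  # instead of creating a second line.
  def add(item, quantity = 1)
    line = @lines.find { |l| l.item == item }
    if line
      line.quantity += quantity
    else
      @lines << BasketLine.new(item, quantity)
    end
  end

  def total
    lines.sum { |l| l.item.price * l.quantity }
  end
end

# Distinct items a given user has looked at.
def viewed_items(views, user)
  views.select { |v| v.user == user }.map(&:item).uniq
end

shirt  = Item.new("Shirt", 10)
basket = Basket.new
basket.add(shirt)
basket.add(shirt, 2)
puts "Basket total: #{basket.total}"
```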

A database design for variable column names

I have a situation that involves Companies, Projects, and Employees who write Reports on Projects.
A Company owns many projects, many reports, and many employees.
One report is written by one employee for one of the company's projects.
Companies each want different things in a report. Let's say one company wants to know about project performance and speed, while another wants to know about cost-effectiveness. There are 5-15 criteria, set differently by each company, which ALL apply to all of that company's project reports.
I was thinking about different ways to do this, but my current stalemate is this:
To company table, add text field criteria, which contains an array of the criteria desired in order.
In the report table, have a company_id and columns criterion1, criterion2, etc.
I am completely aware that this is typically considered horrible database design - inelegant and inflexible. So, I need your help! How can I build this better?
Conclusion
I decided to go with the serialized option in my case, for these reasons:
My requirements for the criteria are simple - no searching or sorting will be required of the reports once they are submitted by each employee.
I wanted to minimize database load - where these are going to be implemented, there is already a large page with overhead.
I want to avoid complicating my database structure for what I believe is a relatively simple need.
CouchDB and Mongo are not currently in my repertoire so I'll save them for a more needy day.
This would be a great opportunity to use NoSQL! Seems like the textbook use-case to me. So head over to CouchDB or Mongo and start hacking.
With conventional DBs you are slightly caught in the problem of how much to normalize your data:
A sort of "good" way (meaning very normalized) would look something like this:
class Company < ActiveRecord::Base
  has_many :reports
  has_many :criteria
end

class Report < ActiveRecord::Base
  belongs_to :company
  has_many :criteria_values
  has_many :criteria, :through => :criteria_values
end

class Criteria < ActiveRecord::Base # should be Criterion but whatever
  belongs_to :company
  has_many :criteria_values
  # one attribute 'name' (or 'type' and you can mess with STI)
end

# Singular class name, so that has_many :criteria_values resolves to it
class CriteriaValue < ActiveRecord::Base
  belongs_to :report
  belongs_to :criteria
  # one attribute 'value'
end
This makes something very simple and fast in NoSQL a triple or quadruple join in SQL and you have many models that pretty much do nothing.
Another way is to denormalize:
class Company < ActiveRecord::Base
  has_many :reports
  serialize :criteria # list of criterion names, stored in one text column
end

class Report < ActiveRecord::Base
  belongs_to :company
  serialize :criteria_values

  # A report is always judged against its company's criteria
  def criteria
    company.criteria
  end
  # custom code here to validate that criteria_values correspond to criteria etc.
end
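The "custom code here" in the denormalized Report could look roughly like the plain-Ruby check below. It assumes the company's criteria are an array of names and the report's values are a hash keyed by those names; `criteria_errors` is a hypothetical helper, and in a Rails model the same logic would sit inside a custom validation.

```ruby
# Plain-Ruby sketch: check that a report's values line up with the
# criteria its company asked for. Returns a list of error messages.
def criteria_errors(criteria, criteria_values)
  missing = criteria - criteria_values.keys
  unknown = criteria_values.keys - criteria

  errors = []
  errors << "missing values for: #{missing.join(', ')}" unless missing.empty?
  errors << "unknown criteria: #{unknown.join(', ')}" unless unknown.empty?
  errors
end

company_criteria = ["performance", "speed"]

puts criteria_errors(company_criteria, { "performance" => 4, "speed" => 5 }).inspect
puts criteria_errors(company_criteria, { "performance" => 4, "cost" => 2 }).inspect
```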
Related to that, a rather clever way of serializing at least the criteria (and maybe the values, if they were all boolean) is to use bit fields. This gives you more or less easy migrations (hard to delete and modify, but easy to add) and searchability without any overhead.
A good plugin that implements this is Flag Shih Tzu which I've used on a few projects and could recommend.
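For readers unfamiliar with bit fields, the underlying idea can be sketched in a few lines of plain Ruby. This is a hand-rolled illustration and not the Flag Shih Tzu API; `CRITERIA_FLAGS`, `encode_criteria`, and `decode_criteria` are names made up for the example.

```ruby
# Plain-Ruby sketch of a bit field: each criterion gets one bit, and a
# company's whole selection fits in a single integer column.
CRITERIA_FLAGS = {
  "performance"        => 1 << 0,
  "speed"              => 1 << 1,
  "cost_effectiveness" => 1 << 2
}.freeze

# Combine the selected criteria into one integer.
def encode_criteria(names)
  names.reduce(0) { |mask, name| mask | CRITERIA_FLAGS.fetch(name) }
end

# Recover the list of criterion names from the integer.
def decode_criteria(mask)
  CRITERIA_FLAGS.select { |_name, bit| (mask & bit) != 0 }.keys
end

mask = encode_criteria(["performance", "cost_effectiveness"])
puts mask
puts decode_criteria(mask).inspect
```

Adding a criterion only means adding a new bit, which is why additions are easy, while removing or reordering bits would change the meaning of values already stored.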
Variable columns (eg. crit1, crit2, etc.).
I'd strongly advise against it. You don't get much benefit (it's still not very searchable since you don't know in which column your info is) and it leads to maintainability nightmares. Imagine your db gets to a few million records and suddenly someone needs 16 criteria. What could have been a complete no-issue is suddenly a migration that adds a completely useless field to millions of records.
Another problem is that a lot of the ActiveRecord magic doesn't work with this: you'll have to figure out what crit1 means by yourself, and if you want to add validations on these fields, that adds a lot of pointless work.
So to summarize: Have a look at Mongo or CouchDB and if that seems impractical, go ahead and save your stuff serialized. If you need to do complex validation and don't care too much about DB load then normalize away and take option 1.
Well, when you say "To company table, add text field criteria, which contains an array of the criteria desired in order" that smells like the company table wants to be normalized: you might break out each criterion in one of 15 columns called "criterion1", ..., "criterion15" where any or all columns can default to null.
To me, you are on the right track with your report table. Each row in that table might represent one report; and might have corresponding columns "criterion1",...,"criterion15", as you say, where each cell says how well the company did on that column's criterion. There will be multiple reports per company, so you'll need a date (or report-number or similar) column in the report table. Then the date plus the company id can be a composite key; and the company id can be a non-unique index. As can the report date/number/some-identifier. And don't forget a column for the reporting-employee id.
Any and every criterion column in the report table can be null, meaning (maybe) that the employee did not report on this criterion; or that this criterion (column) did not apply in this report (row).
It seems like that would work fine. I don't see that you ever need to do a join. It looks perfectly straightforward, at least to these naive and ignorant eyes.
Create a criteria table that lists the criteria for each company (company 1 .. * criteria).
Then, create a report_criteria table (report 1 .. * report_criteria) that lists the criteria for that specific report based on the criteria table (criteria 1 .. * report_criteria).

How to model "products" in an online store application

I'm building an online store to sell products like "Green Extra-large, T-shirts". I.e., the same shirt can have many sizes / colors, different combination can be sold out, different combination might have different prices, etc.
My question is how I should model these products in my Rails application (or really how to do it in any application).
My current thinking is:
class Product < ActiveRecord::Base
  has_many :variants, :through => :characteristics
  has_many :characteristics
end

class Characteristic < ActiveRecord::Base
  belongs_to :product
  belongs_to :variant
end

class Variant < ActiveRecord::Base
  has_many :products, :through => :characteristics
  belongs_to :characteristic
end
So each product will have one or more characteristics (e.g., "Color", "Size", etc), and each characteristic will then have one or more variants (e.g., "Red", "Blue", etc).
The problem with this method is where do I store price and inventory? I.e., a given product's price and inventory are determined by the variants its characteristics take. (Green might be more expensive than red, large might be out of stock, etc).
One thought I had was to give products a "base_price", and let variants modify it, but this seems overly complex (and might not work).
I have seen two solutions to this kind of dilemma. The first is to try to use characteristics to define subordinate products to the "main" product. The challenge here is that, in addition to your thoughts so far, in most cases the product will evolve as new manufacturers bring new aspects to the table. For example, one manufacturer may make a cheaper product, but use a different application method for the logo or stitching that may be significant enough to track.
I think that carrying a non significant product number for each product and then attaching the characteristics as attributes works out the best. It is easily searched and extensible. If a group of products are strongly related, a ProductGroup that the individual products attach to works well.
In tables:
ProductGroup
--------------------
ProductGroupID
ProductGroupName
ProductGroupDescription

Product
--------------------
ProductID
ProductGroupID
QtyOnHand
BasePrice
ProductColorID
ProductSizeID

ProductColor
--------------------
ProductColorID
ProductColorName

ProductSize
--------------------
ProductSizeID
ProductSizeName

...more attributes...
The advantages here are that you can easily query for specific attributes, and that attributes are "flexible": more can be added and old ones adjusted. If you started with "Red" but then added another "Red" to the color pool, you can change them to "Maroon" and "Bright Red".
You can control price and inventory at the detail product level (although more tables may be required to account for sourcing costs).
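A plain-Ruby sketch of the table layout above, with price and stock held on each detail product. The `Catalog` class and its `in_stock` and `find` methods are hypothetical helpers, and the color and size are stored as plain strings here rather than as references to lookup tables.

```ruby
# Plain-Ruby sketch: one Product row per color/size combination, each
# with its own stock level and price, grouped under a ProductGroup.
ProductGroup = Struct.new(:name)
Product      = Struct.new(:group, :color, :size_label, :qty_on_hand, :base_price)

class Catalog
  def initialize(products)
    @products = products
  end

  # All combinations of a group that can currently be sold.
  def in_stock(group)
    @products.select { |prod| prod.group == group && prod.qty_on_hand > 0 }
  end

  # The single product for a given color and size, or nil.
  def find(group, color:, size_label:)
    @products.find do |prod|
      prod.group == group && prod.color == color && prod.size_label == size_label
    end
  end
end

tshirt  = ProductGroup.new("T-shirt")
catalog = Catalog.new([
  Product.new(tshirt, "Green", "XL", 3, 15.0),
  Product.new(tshirt, "Red", "L", 0, 12.0)
])

catalog.in_stock(tshirt).each do |prod|
  puts "#{prod.color} #{prod.size_label}: #{prod.qty_on_hand} at #{prod.base_price}"
end
```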
This all assumes that your characteristics are universally shared. If they are not, your characteristic subtable approach can work by creating a join table between characteristics and the product detail tables and populating it as needed. This will require more business logic to ensure each product category gets all the necessary characteristics. In this latter case I would use "prototype" products in the base product table (with Qty and Cost of 0) that I would clone the characteristics from and then adjust as each new product is entered. As you move forward, when a new variation appears, having a "clone this product" function that allows you to adjust just the differences from the base product would be valuable.
Finally, as far as managing the inventory and pricing, this is going to happen at the UI layer. Being able to generate queries for related products (product groups) and manage all the pricing for related products will go a long way to making this livable.
Just a quick note: you can always take a look at the source code of other e-commerce products like Spree and Substruct. They have probably already answered that question for you.
