What to consider when deciding to use Single Table Inheritance


I'm getting ready to start a small project that provides an opportunity to use single table inheritance. As I read through prior posts on STI on Stack Overflow, there seem to be some strong opinions on both sides of the argument.
My application is related to my horse racing hobby. A horse's connections are defined as its current jockey, trainer and owner. The jockey, trainer and owner could be modeled using three separate tables (models/classes) or as one class with several sub-classes through single table inheritance.
When faced with a decision like this, is there a checklist of questions that one can go through to determine which approach is preferable? I'm assuming that using STI would reduce the number of potential joins. What are the other practical considerations?

There are a few things you should think about:
Are the objects, conceptually, children of a single parent?
Don't use single table inheritance just because your classes share some attributes; make sure there is actually an OO inheritance relationship between each of them and an understandable parent class.
Do you need to do database queries on all objects together?
If you want to list the objects together or run aggregate queries on all of the data, you’ll probably want everything in the same database table for speed and simplicity.
Do the objects have similar data but different behavior?
If you have a larger number of model-specific columns, you should consider polymorphic associations instead.
The article linked goes in depth a bit more.
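To make the "similar data but different behavior" point concrete, here is a minimal plain-Ruby sketch of the idea behind STI. It does not use Rails at all: the rows are just hashes, and the names Connection, Jockey, Trainer, Owner and the sample rows are made up for illustration. The "type" key plays the role of the type column that decides which subclass a row becomes.

```ruby
# Plain-Ruby simulation of the idea behind single table inheritance.
# No Rails involved; all class names and rows here are hypothetical.
class Connection
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # Default behavior; each subclass overrides it.
  def role
    "is connected to the horse"
  end

  # Build the right subclass from a row, using the "type" value the
  # same way an STI table uses its type column.
  def self.instantiate(row)
    Object.const_get(row.fetch("type")).new(row.fetch("name"))
  end
end

class Jockey < Connection
  def role
    "rides the horse"
  end
end

class Trainer < Connection
  def role
    "trains the horse"
  end
end

class Owner < Connection
  def role
    "owns the horse"
  end
end

# One "table" holding every kind of connection: same columns, different type.
ROWS = [
  { "type" => "Jockey",  "name" => "Sample Jockey" },
  { "type" => "Trainer", "name" => "Sample Trainer" },
  { "type" => "Owner",   "name" => "Sample Owner" }
].freeze

ROWS.each do |row|
  connection = Connection.instantiate(row)
  puts "#{connection.name} (#{connection.class}) #{connection.role}"
end
```

If the three classes would end up with mostly different columns rather than mostly different methods, that is the sign to look at separate tables or polymorphic associations instead.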

Related

Core Data design principles

I just started reading this guide: https://developer.apple.com/library/content/documentation/Cocoa/Conceptual/CoreData/KeyConcepts.html#//apple_ref/doc/uid/TP40001075-CH30-SW1
And it basically has (in my opinion) two big contradictions:
I get them both, but basically, if I follow the first "implement a custom class to the entity from which classes representing subentities also inherit"-statement, then ALL my entities will be put in the same table. Which could cause performance issues, according to the NOTE.
How big of a performance hit would I run into if I create a "custom super entity"?

You can use the inheritance mechanism to get a default database structure. From your link:
If you have a number of entities that are similar, you can factor the common properties into a superentity, also known as a parent entity.
There is no contradiction. The documentation is just telling you what the database structure is going to be when you use a certain facility. (And it is the standard database table idiom for inheritance.) Using the entity inheritance mechanism automatically declares and implements default parent-child class inheritance functionality along with a parent table. Otherwise you do any parent-child class inheritance declaration and implementation by hand. Each comes with certain performance and other characteristics.
Design involves tradeoffs between costs and benefits over multiple dimensions. "Performance" itself involves multiple dimensions, and has no meaning outside of given application usage patterns. Other dimensions relevant here include complexity of both construction and maintenance.
If you query about entities as parents sufficiently frequently then it can be better to have all parent data in its own table. But if you sufficiently rarely ask for the parent data while querying about a given child type or if you sufficiently frequently need both child and parent data then it can be better to only have parent data in the child tables or table. But notice that each design performs worse at the other kind of query.

The first is talking about sub-entities. The second is talking about subclasses. These are 2 different hierarchies.
One use for sub-entities is if you have a table where you want to show cells displaying different entities. By making them sub-entities, you can fetch the parent entity and all sub-entities will be returned. This is actually how the Notes app shows the "All Notes" cell above folders: that cell is actually displaying the Account entity, and both Account and Folder are sub-entities of NoteContainer, which is what is fetched. This does mean all of the rows are in the same table. Personally I have not experienced any performance problems, but it is something to keep in mind when modifying the entities in other ways, for example with indexes, relations or constraints.

I'm not familiar with this quirk of SQLite, but modeling base class/subclass relationships is usually done with different tables. There is one table that represents the base class, which contains attributes common to all derivative classes (Vehicles), and a different table for each subclass, which contains attributes unique to that subclass (Cars, Trains, Airplanes).
Performance is no better or worse than any entity normalized across different tables.

Rails self-referential hierarchical relationship

I've read Rails guides and tried a few different things w/Active Record but haven't been able to figure out what the best way to do this is.
I need to set up a self-referential (users to users) relationship that is hierarchical. It usually would be no more than 5 levels high, but should be able to scale up infinitely.
I've tried creating a UserHierarchy model with a DB schema like this:
parent | child | level
However, managing this is a bit too difficult and too complicated to handle.
What's the best way in Rails to do a self-referential hierarchical relationship? I've checked out gems like ancestry, but the majority of them use class inheritance and don't work well for self-referential relationships. It's a many-to-many, self-referential hierarchy (in MySQL).

Ancestry is one of the gems that is esp. well suited for object tree structures (what you call self-referential hierarchical relationships). You should have a more detailed look at it.
Generally, there are about four common ways to store trees in a SQL database:
Simple parent pointers. You just add a new column called parent_id to your model holding the ID of the parent object. This allows easy inserts and is well suited for single-level hierarchies but is generally difficult to use for deeper hierarchies and is thus generally not used as the primary mechanism (although it is sometimes combined with other mechanisms)
Nested Sets. You define your trees as a structure of nested sets. This is typically implemented with a right and a left column which are populated with numbers to define the set. It allows efficient querying but is a bit tricky when inserting values. Esp. when having concurrent changes to a tree, it is sometimes prone to inconsistencies. This model is e.g. used by the awesome_nested_set gem.
Materialized Paths. This is the model e.g. used by ancestry. It stores the full parent path of all elements. This allows for efficient inserting and querying. Changing a tree is a bit more expensive.
Closure Trees. This mechanism stores for each element all of its parents in a table. This is e.g. used by the closure_tree gem.
Generally, all these options allow to store a tree of objects, i.e. a hierarchical structure of objects of the same class (an ActiveRecord model in this case).
Which one to use depends on which trade-offs are more important for your specific use-case. Most importantly, you should figure out if you are changing trees often (e.g. moving sub-trees around or adding only leaves) and how you are querying the tree (e.g. do you only need direct children, do you need whole sub-trees, do you need to filter) and choose the appropriate solution based on that.
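To illustrate the difference between parent pointers and materialized paths, here is a small plain-Ruby sketch over an in-memory hash. It is not how ancestry or any other gem is implemented; the USERS data, the "/"-separated path format and the method names are assumptions made up for this example.

```ruby
# Hypothetical in-memory "users" table: id => row.
# parent_id is the simple parent pointer; path is a materialized path
# listing all ancestor ids, root first, separated by "/".
USERS = {
  1 => { name: "root",  parent_id: nil, path: "" },
  2 => { name: "alice", parent_id: 1,   path: "1" },
  3 => { name: "bob",   parent_id: 2,   path: "1/2" },
  4 => { name: "carol", parent_id: 3,   path: "1/2/3" },
  5 => { name: "dave",  parent_id: 1,   path: "1" }
}.freeze

# Parent pointers: one lookup per level, which against a real
# database would mean one query (or one join) per level.
def ancestors_by_pointer(id)
  result = []
  parent = USERS.fetch(id)[:parent_id]
  while parent
    result.unshift(parent)
    parent = USERS.fetch(parent)[:parent_id]
  end
  result
end

# Materialized path: the ancestors are already stored on the row.
def ancestors_by_path(id)
  USERS.fetch(id)[:path].split("/").map(&:to_i)
end

# Materialized path: a whole subtree is a prefix match on the path
# (in SQL roughly: path = '1/2' OR path LIKE '1/2/%').
def descendants_by_path(id)
  row = USERS.fetch(id)
  own_path = row[:path].empty? ? id.to_s : "#{row[:path]}/#{id}"
  USERS.select do |_, r|
    r[:path] == own_path || r[:path].start_with?("#{own_path}/")
  end.keys
end

p ancestors_by_pointer(4)
p ancestors_by_path(4)
p descendants_by_path(2)
```

Both ancestor lookups return the same ids, but the pointer version walks the tree level by level while the path version reads a single row. The price is paid on writes: moving a subtree means rewriting the path of every descendant.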

Rails 3.0 - best practices: multiple subtypes of a model object

So this is probably a fairly easy question to answer but here goes anyway.
I want to have this view, say media_objects/ that shows a list of media objects. Easy enough, right? However, I want the list of media objects to be a collection of things that are subtypes of MediaObject, CDMediaObject, DVDMediaObject, for example. Each of these subtypes needs to be represented with a db table for specific set of metadata that is not entirely common across the subtypes.
My first pass at this was to create a model for each of the subtypes, alter the MediaObject to be smart enough to join into those tables on its conceptual 'all' behavior. This seems straightforward enough but I end up doing a lot of little things that feel not so rails-O-rific so I wanted to ask for advice here.
I don't have any concrete code for this example yet, obviously, but if you have questions I'll gladly edit this question to provide that information...
thanks!

Creating a model for each sub-type is the way to go, but what you're talking about is multiple-table inheritance. Rails assumes single-table inheritance and provides really easy support for setting it up. Add a type column to your media_objects table, and add all the columns for each of the specific types of MediaObject to the table. Then make each of your models a sub-class of MediaObject:
class MediaObject < ActiveRecord::Base
end
class CDMediaObject < MediaObject
end
Rails will handle pulling the records out and instantiating the correct subclass, so that when you call MediaObject.find(:all) the results will contain a mixture of instances of the various subclasses of MediaObject.
Note this doesn't meet your requirement:
Each of these subtypes needs to be represented with a db table for specific set of metadata that is not entirely common across the subtypes.
Rails is all about convention-over-configuration, and it will make your life very easy if you write your application to its strengths rather than expecting Rails to adapt to your requirements. Yes, STI will waste space leaving some columns unpopulated for every record. Should you care? Probably not; database storage is cheap, and extra columns won't affect lookup performance if your important columns have indexes on them.
That said, you can set up something very close to multiple-table inheritance, but you probably shouldn't.

I know this question is pretty old but just putting down my thoughts, if somebody lands up here.
In case the DB is Postgres, I would suggest using STI along with an hstore column for storing attributes not common across different objects. This will avoid wasting space in the DB, yet the attributes can still be accessed for different operations.

I would say, it depends on your data: For example, if the differences between the specific media objects do not have to be searchable, you could use a single db table with a TEXT column, say "additional_attributes". With rails, you could then serialize arbitrary data into that column.
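As a rough illustration of the single TEXT column idea, here is a plain-Ruby sketch that encodes the type-specific attributes as JSON. It does not use ActiveRecord (whose serialize feature would normally handle this for you); MediaRow, extras and the sample values are hypothetical names invented for the example.

```ruby
require "json"

# Hypothetical stand-in for one row of a media_objects table.
# additional_attributes plays the role of the TEXT column.
MediaRow = Struct.new(:type, :title, :additional_attributes) do
  # Decode the TEXT column into a Hash (empty when nothing is stored).
  def extras
    additional_attributes ? JSON.parse(additional_attributes) : {}
  end

  # Encode a Hash into the TEXT column.
  def extras=(hash)
    self.additional_attributes = JSON.generate(hash)
  end
end

cd = MediaRow.new("CDMediaObject", "Sample CD")
cd.extras = { "tracks" => 12 }

dvd = MediaRow.new("DVDMediaObject", "Sample DVD")
dvd.extras = { "region" => 2, "runtime_minutes" => 95 }

puts cd.additional_attributes   # the raw text that would be stored
puts dvd.extras["region"]       # the decoded value
```

The catch is the one mentioned above: the database only sees an opaque string, so you cannot sensibly filter or index on "tracks" or "region" in SQL.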
If you can't go with that, you could have a general table "media_objects" which "has one :dataset". Within the dataset, you could then store the specifics between CDMediaObject, DVDMediaObject, etc.
A completely different approach would be to go with MongoDB (instead of MySQL) which is a document store. Each document can have a completely different form. The entire document tree is also searchable.

Database and relationship design when subclassing a model

I have a model "Task" which will HABTM many "TaskTargets".
When it comes to TaskTargets however, I'm writing the base TaskTarget class which is abstract (as much as can be in Rails). TaskTarget will be subclassed by various different conceptualizations of anything that can be the target of a task. So say, software subsystem, customer site, bathroom, etc...
The design of the classes here is fairly straightforward, but where I'm hitting a snag is in how I will relate it all together and how I will have rails manipulate those relationships.
My first thought is that I will have a TaskTarget table which will contain the basic common fields (name, description...). It will then also have a polymorphic relationship out to a table specific to the type of data the implementing class wraps.
This implies that the data for one instance of a class implementing TaskTarget will be found in two tables.
The second approach is to create a polymorphic HABTM relationship between Task and the subclasses of TaskTarget, for which I thought I could reuse the table name TaskTarget as the join table.
Option #2 I suspect is the most robust, but maybe I'm missing something. Thanks for any help and of course I'm really just asking to make sure I get it done right, once!

I think the two approaches (easily) available to you in Rails are:
1) Single Table Inheritance: You create a single TaskTarget table that has every field that every subclass might want. You then also add a "type" field that stores the class name, and Rails will pretty much do the rest for you. See the ActiveRecord api docs for more info, especially the "Single Table Inheritance" section.
2) Concrete Table Inheritance: There is no table for the base TaskTarget class. Instead, simply create a table for each concrete class in your hierarchy with only the fields needed by that class.
The first option makes it easier to do things like "Show me all the TaskTargets, regardless of subclass," and results in fewer tables. It does make it a little harder to tell exactly what one subclass can do, as opposed to another, and if you have a lot of TaskTargets, I suppose eventually having them all in one table could be a performance concern.
The second option makes for a cleaner schema that is somewhat easier to read, and each class will work pretty much just like any normal ActiveRecord model. However, joining across all TaskTarget tables can be cumbersome, especially as you add more subclasses in the future. Implementing any necessary polymorphic associations may also involve some extra complexity.
Which option is better in your situation will depend on what operations you need to implement, and the characteristics of your data set.

Best Practice for Model Design in Ruby on Rails

The RoR tutorials posit one model per table for the ORM to work.
My DB schema has some 70 tables divided conceptually into 5 groups of functionality
(i.e., any given table lives in one and only one functional group, and relations between tables of different groups are minimised.)
So: should I design a model per conceptual group, or should I simply have 70 Rails models and leave the grouping 'conceptual'?
Thanks!

Most likely, you should have 70 models. You could namespace the models to have 5 namespaces, one for each group, but that can be more trouble than it's worth. More likely, you have some common functionality throughout each group. In that case, I'd make a module for each group containing its behavior, and include that in each relevant model. Even if there's no shared functionality, doing this can let you quickly query a model for its conceptual group.
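A minimal sketch of the module-per-group idea, in plain Ruby so it runs on its own. In a real app the classes would be ActiveRecord models, each with its own table; Publishing::Base, group_name and the class names are hypothetical names for illustration only.

```ruby
# Hypothetical group module; in a Rails app the classes below would
# inherit from ActiveRecord::Base and each map to its own table.
module Publishing
  module Base
    # Shared behavior for every model in the group.
    def group_name
      "publishing"
    end
  end
end

class Article
  include Publishing::Base
end

class Comment
  include Publishing::Base
end

# A model that belongs to some other group.
class User
end

puts Article.new.group_name
# Asking a model which conceptual group it belongs to:
puts Article.include?(Publishing::Base)
puts User.include?(Publishing::Base)
```

This keeps one model per table while still giving each group a single place for shared methods, without the renaming that full namespacing would require.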

I cover this in one of my large apps by just making sure that the tables/models are conceptually grouped by name (with almost 1:1 table-model relationship). Example:
events
event_types
event_groups
event_attendees
etc...
That way when I'm using TextMate or whatever, the model files are nicely grouped together by the alpha sort. I have 80 models in this app, and it works well enough to keep things organised.

You should definitely use one model per table in order to take advantage of all the ActiveRecord magic.
But you could also group your models together into namespaces using modules and sub-directories, in order to avoid having to manage 70 files in your models directory.
For example, you could have:
app/models/admin/user.rb
app/models/admin/group.rb
for models Admin::User and Admin::Group, and
app/models/publishing/article.rb
app/models/publishing/comment.rb
for Publishing::Article and Publishing::Comment
And so forth...

Without knowing more details about the nature of the seventy tables and their conceptual relations it isn't really possible to give a good answer. Are these legacy tables or have you designed this from scratch?
Are the tables related by some kind of inheritance pattern or could they be? Rails can do a limited form of inheritance. Look up Single Table Inheritance (STI).
Personally, I would put a lot of effort into avoiding working with seventy tables simply because that is an awful lot of work: seventy Models & Controllers and their 4+ views, helpers, layouts, and tests, not to mention the memory load issue of keeping the design in mind. Unless of course I was getting paid by the hour and well enough to compensate for the repetition.

Before jumping in and making 70 models, please consider this question to help you decide:
Would each of your tables be considered an "object" for example a "cars" table or are some of the tables holding only relationship information, all foreign key columns for example?
In Rails only the "object" tables become models! (With some exception for specific types of associations) So it is very likely that if you have only 5 groups of functionality, you might not have 70 models. Also, if the groups of functionality you mentioned are vastly different, they may even be best suited in their own app.

There may be a small number of cases where you can use the Rails standard single-table-inheritance model. Perhaps all of the classes in one particular functional grouping have the same fields (or nearly all the same). In that case, take advantage of the DRYness STI offers. When it doesn't make sense, though, use class-per-table.
In the class-per-table version, you can't easily pull common functionality into a base class. Instead, pull it into a module. A hierarchy like the following might prove useful:
app/models/admin/base.rb - module Admin::Base, included by all other Admin::xxx
app/models/admin/user.rb - class Admin::User, includes Admin::Base
app/models/admin/group.rb - class Admin::Group, includes Admin::Base

As already mentioned, it's hard to give decent advice without knowing your database schema etc. However, I would lean towards creating the 70+ models (one for each of your tables).
You may be able to get away with ditching some models, but for the (negligible) cost you may as well have them there.
You don't need to create a controller + views for each model (as answered by srboisvert). You only need a controller for each resource (which I would expect to be a lot less than 70 - probably only 10 or 15 or so judging by your description).
