Rails - Should I use boolean fields or relational table? - ruby-on-rails

My User model has three boolean columns: is_master, is_standard, and is_guest, because I originally wanted to use ActiveRecord's boolean query methods like is_master? or is_power?.
However, I'm wondering whether it would be better to create a UserType association and write my own methods like this:
def master?
  self.user_type == 1
end

If the master/standard/guest relationship is mutually exclusive (that is, you can only ever be one of them) then a field that stores the type (in a human-readable form, please -- no opaque numbers) is better. You can always reimplement is_foo? trivially.
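For example, a minimal sketch of that approach, assuming the type is stored as a human-readable string in a user_type column (the column name is illustrative):

class User < ActiveRecord::Base
  def master?
    user_type == "master"
  end

  def guest?
    user_type == "guest"
  end
end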
On the other hand, if you could have an account that can be more than one of master/standard/guest/whatever at once, then stick with the separate boolean fields.

As a DBA, I always hate it when people use columns as flags; it leads to a lot of extra columns.
If it is all the same type (like account type), I would do as the first answer suggests (including using text, not numbers).
If, on the other hand, you need separate flags or multiple types (but a restricted number of them), I would actually go for a bitwise calculation.
That is, have one column in the table representing all the flags and then assign each flag a number.
ex.
FLAGS = { :master => 1, :standard => 2, :guest => 4, :power => 8,
          :another_flag => 32, :yet_another_flag => 64 }

def is_master?
  (self.flags & FLAGS[:master]) != 0   # nonzero only when the bit is set
end

def is_standard?
  (self.flags & FLAGS[:standard]) != 0
end
It requires a bit more work when setting the values, but doesn't clutter up the table with a lot of columns used only for flags.
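For instance, setting and clearing flags is a matter of bitwise OR and AND-NOT (a sketch, assuming the flags integer column defaults to 0):

user.flags |= FLAGS[:master]    # set the master bit
user.flags |= FLAGS[:guest]     # set the guest bit as well
user.flags &= ~FLAGS[:guest]    # clear the guest bit again
user.save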

If you look at how the restful_authentication plugin does it (it includes user roles), you'll see it uses a join table.
I think a join is much more readable. Using a join also allows you to be more flexible with your role system.
If you require that a user can only have a single role I would put that logic in your model.
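As a rough sketch of the join-table approach (model names are illustrative, not necessarily what the plugin uses):

class User < ActiveRecord::Base
  has_and_belongs_to_many :roles

  def master?
    roles.any? { |r| r.name == "master" }
  end
end

class Role < ActiveRecord::Base
  has_and_belongs_to_many :users
end

If a user may only ever have a single role, that rule can live in a validation on User.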

I'd start with the simplest thing that solves your current need. Refactor and add complexity from there as needed. Sounds like a set of boolean columns is just what you need.

Related

In Rails, what are the disadvantages of using model methods for constants, instead of Active Record columns?

I'm building a Rails app that will have a very high number of models using single-table inheritance. For the model subclasses, I want to be able to set constant values for things like name and description. I know I can set defaults using attribute, like this:
class SpecialWidget < Widget
  attribute :name, :string, default: "Special Widget"
  attribute :description, :text, default: "This is an awfully important widget."
end
The advantage here, as I understand it, is that by storing the defaults in the database, I retain the ability to do things like use #order to sort by name, and to paginate. But it seems bad to store constants in the database like that. It seems better to use constant methods, like this:
class SpecialWidget < Widget
  def name
    "Special Widget"
  end

  def description
    "This is an awfully important widget."
  end
end
In fact, that's what I was doing originally, but then I read posts like these (one, two, three), which pointed out that if I wanted to do nice things like sort by those methods, I'd have to load the entire Widget.all into memory and then do a plain-old Ruby sort.
My application is built quite heavily around these STI models, and I will definitely have to sort by constants like name. Are the concerns about sorting and pagination significant disadvantages that will cause me to come to regret using methods in the future, or will the difference be negligible? What other disadvantages/problems might I have? I'd really like to be able to use methods instead of storing constants in the database, if possible without crippling my app's performance.
There are many benefits and few downsides to storing the default values in the database. But if it troubles you, you can have similar sorting efficiency by constructing your sort like this:
class SpecialWidget < Widget
  DefaultAttrs = { name: 'Special Widget', description: 'This is... etc' }
end

class Widget < ApplicationRecord
  def self.sort_by_name
    types = pluck(:type).uniq
    case_statements = types.map do |type|
      "WHEN '#{type}' THEN '#{type.constantize.const_get(:DefaultAttrs)[:name]}'"
    end
    case_sql = "CASE type #{case_statements.join(' ')} END"
    order(case_sql)
  end
end
... not very elegant, but it does the job!
Maybe it's better to put the constants in the database!
It depends entirely on the shape of your data and how you want to use it. You haven't provided enough contextual specifics to guarantee that my recommendation applies to your situation, but it's a recommendation that's specifically designed to work for 95+% of all situations.
Just Put the Data in the Relational Database
The database is the store for everything in your domain that is dynamic and needs to be persisted, i.e. state. It should be internally consistent, meaningfully self-descriptive, and well-structured in order to fully leverage the power of a relational DB to flexibly manipulate and represent complex inter-related data.
Based on what you've said, and assuming that there are a bunch of different "widget types" implemented using Rails' STI implementation with a type column, I would model Widget and SpecialWidget in the database like this:
widgets

id | type
---+----------------
 1 | 'Widget'
 2 | 'SpecialWidget'
 3 | 'Widget'
 4 | 'Widget'

widget_types

type            | name             | description
----------------+------------------+----------------------------------------
'Widget'        | 'Normal Widget'  | 'A normal widget.'
'SpecialWidget' | 'Special Widget' | 'This is an awfully important widget.'
You called these values a "constant", but are they really? For the purposes of your domain, will they never change, like the value of Math::PI never changes? Or will descriptions be changed, widgets renamed, widgets added, and widgets expired? Without knowing for sure, I'm going to assume they're not actually constant.
Having name and description as methods is effectively storing that widget_types table in your application source code, moving data out of your database. If you really can't afford the extra millisecond a simple JOIN for two small strings on each Widget incurs, then just load the full widget_types table into cache once on application startup, and it'll perform the same as saving it in source code.
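As a hedged sketch of that cache-at-boot idea, assuming a WidgetType model backed by the widget_types table (the names are assumptions):

class WidgetType < ApplicationRecord
  self.primary_key = "type"

  # Load the whole lookup table once and keep it in memory.
  def self.lookup
    @lookup ||= all.index_by(&:type)
  end
end

WidgetType.lookup["SpecialWidget"].name  # => "Special Widget"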
This schema is more normalized (incurring benefits), the data itself describes all I need to know, and as you've pointed out, I can flexibly operate on that data (important since you "will definitely have to sort"). The data in this form is also extensible for future changes as they come.
Again: the database stores structured data for the purpose of on-demand flexible manipulation -- you can make up queries on the fly, and the DB can answer them.
I Really Don't Want to Put Data in the Database
Okay... then you'll have to pass that data into the database every time you want to operate on it. You can do it like so:
SELECT w.id, w.type, wt.name
FROM widgets w
INNER JOIN (
VALUES ('Widget', 'Normal Widget'), ('SpecialWidget', 'Special Widget')
) wt(type, name) ON wt.type = w.type
ORDER BY wt.name
The VALUES expression creates an ad-hoc table mapping the class to the name. By passing in that mapping and joining on it (every time), you can tell the DB to ORDER BY it.
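From ActiveRecord you could issue that query with find_by_sql, building the VALUES list from a constant like the DefaultAttrs hash in the earlier answer (a sketch only; it assumes subclasses are loaded and does no SQL quoting):

values = Widget.descendants.map do |klass|
  "('#{klass.name}', '#{klass::DefaultAttrs[:name]}')"
end.join(', ')

Widget.find_by_sql(<<-SQL)
  SELECT w.*
  FROM widgets w
  INNER JOIN (VALUES #{values}) wt(type, name) ON wt.type = w.type
  ORDER BY wt.name
SQL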

Rails - best database design for existing model

So I have inherited some Rails 4 code and database models which I need to add to.
The model, called mpb_item, has a table called mpb_items.
Inside this items table there are columns such as:
role1_start_date, role2_start_date, role3_start_date, role4_start_date
Not ideal but this is what it is. They should have been in a separate roles table I guess.
I need to add functionality to Suspend any one of these roles (or all of them).
I guess I can either:
To the existing table, I can add a new boolean column for each existing role column, e.g. role1_suspended, role2_suspended, etc.
Create table called mpb_suspensions, with 2 columns: mpb_item_id and role_name. Since the roles don't have ids themselves, the role_name column will store 'role1', or 'role2' etc depending on which role was suspended.
In my View, I need to have the ability to "suspend" each job, or all of them. I'm not sure how the model code would look to do this and which approach would be best.
If I had to build this, given the minimal info you provided, I'd start by picking option 2 as you described it (a RoleSuspension is its own table). Benefits include:
fewer new columns total, and doesn't make any table indefinitely large
no redundant columns (role1, role2, etc.)
no need to add more columns if the # of possible roles changes (ie. it's more normalized)
no nil values (unlike approach 1)
you can easily query for the overall state of RoleSuspensions (whereas with approach 1 you'd need to count the non-nil values in role1_suspended, role2_suspended, etc. then sum them up), and easier to index
you can attach logic to a RoleSuspension; it's a separate animal from its parent MbpItem. If it's just a bunch of boolean columns, any complex logic would need to be mushed into the MbpItem model, and would likely be much messier to maintain.
Following option 2's logic, suspending a role would involve creating a new record like this:
@mbp_item.role_suspensions.create!(role_number: 2)
and checking for suspension status would involve something like this:
if @mbp_item.role_suspensions.any?
  # ...
end

# or....
RoleSuspension.where(role_number: 3).each do |s|
  puts "Item #{s.mbp_item_id} is suspended for role #{s.role_number}."
end
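A minimal sketch of the models that option 2 implies (class and column names follow the examples above and are assumptions):

class MbpItem < ActiveRecord::Base
  has_many :role_suspensions, dependent: :destroy
end

class RoleSuspension < ActiveRecord::Base
  belongs_to :mbp_item

  validates :role_number, presence: true,
                          uniqueness: { scope: :mbp_item_id }
end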
Database performance will be a larger consideration with this approach, depending on the answers to the following questions:
Where and how often do you need to check for role suspensions?
When you check for role suspensions, how do you approach asking the question? In other words is some global task asking "What role suspensions exist, and for what mbp_items?" or do you check for suspensions on a given mbp_item when you render that object?
If you'll need to check for role suspensions very frequently, perhaps you should add a boolean column to mbp_items called has_suspensions, which will be a partial cache in that it will indicate whether any suspensions exist for this MbpItem (and must be maintained in after_create and after_destroy hooks).
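Extending the RoleSuspension sketch above, that cached boolean could be maintained in callbacks (the has_suspensions column name is an assumption):

class RoleSuspension < ActiveRecord::Base
  belongs_to :mbp_item

  after_create  { mbp_item.update_column(:has_suspensions, true) }
  after_destroy { mbp_item.update_column(:has_suspensions, mbp_item.role_suspensions.exists?) }
end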
On the other hand, if you know that suspension info will never need its own logic and will never need to be queried directly, you could add a single column to MbpItem, role_suspensions, containing a serialized array of integers for the suspended roles. This would be a much less invasive approach in terms of database structure, and probably much simpler even than your option 1, since it would let you suspend and de-suspend any role number with less code and less metaprogramming. But if you ever need to add any fancy logic to the suspension or de-suspension process (i.e. if a RoleSuspension deserves status as its own object), you'd regret this approach.

In Rails, what is the best way to store multiple boolean attributes in a model?

I have a model House that has many boolean attributes, like has_fireplace, has_basement, has_garage, and so on. House has around 30 such boolean attributes. What is the best way to structure this model for efficient database storage and search?
I would like to eventually search for all Houses that have a fireplace and a garage, for example.
The naive way, I suppose, would be to simply add 30 boolean attributes in the model that each corresponds to a column in the database, but I'm curious if there's a Rails best practice I'm unaware of.
Your 'naive' assumption is correct - the most efficient way from a query speed and productivity perspective is to add a column for each flag.
You could get fancy as some others have described, but unless you're solving some very specific performance problems, it's not worth the effort. You'd end up with a system that's harder to maintain, less flexible, and that takes longer to develop.
For that many booleans in a single model you might consider using a single integer and bitwise operations to represent, store and retrieve values. For example:
class Model < ActiveRecord::Base
  HAS_FIREPLACE = (1 << 0)
  HAS_BASEMENT  = (1 << 1)
  HAS_GARAGE    = (1 << 2)
  # ...
end
Then some model attribute called flags would be set like this:
flags |= HAS_FIREPLACE
flags |= (HAS_BASEMENT | HAS_GARAGE)
And tested like this:
flags & HAS_FIREPLACE
flags & (HAS_BASEMENT | HAS_GARAGE)
which you could abstract into methods. This should be pretty efficient in time and space as an implementation.
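For example, those tests could be wrapped in predicate methods like this (a sketch, assuming an integer flags attribute):

class Model < ActiveRecord::Base
  def has_fireplace?
    (flags & HAS_FIREPLACE) != 0
  end

  def has_basement_and_garage?
    (flags & (HAS_BASEMENT | HAS_GARAGE)) == (HAS_BASEMENT | HAS_GARAGE)
  end
end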
I suggest the flag_shih_tzu gem. It helps you store many boolean attributes in one integer column. It gives you named scopes for each attribute and a way to chain them together as active record relations.
Here's another solution.
You could make a HouseAttribute model and set up a two-way has_and_belongs_to_many association:
# house.rb
class House < ActiveRecord::Base
  has_and_belongs_to_many :house_attributes
end

# house_attribute.rb
class HouseAttribute < ActiveRecord::Base
  has_and_belongs_to_many :houses
end
Then each attribute for a house would be a database entry.
Don't forget to set up your join table on your database.
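For reference, a minimal migration sketch for that join table (Rails' conventional HABTM table name combines both table names alphabetically):

class CreateHouseAttributesHousesJoinTable < ActiveRecord::Migration
  def change
    create_table :house_attributes_houses, id: false do |t|
      t.integer :house_id
      t.integer :house_attribute_id
    end
    add_index :house_attributes_houses, [:house_id, :house_attribute_id]
  end
end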
If you're wanting to query on those attributes, then you're unfortunately probably stuck with first-class fields, if performance is a consideration. Bitfields and flag strings are an easy way to solve the problem, but they don't scale well against production data sets.
If you aren't going to worry about performance, then I'd use an implementation where each property is represented by a character ("a" = "garage", "b" = "fireplace", etc), and you just build a string that represents all the flags a record has. The primary advantage this has over a bitfield is that a) it's easier for a human to debug, and b) you don't need to worry about the size of your data types.
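A hypothetical sketch of that character-flag idea (the letter assignments are illustrative):

FEATURE_CHARS = { :garage => "a", :fireplace => "b", :basement => "c" }

def feature_string(features)
  features.map { |f| FEATURE_CHARS[f] }.sort.join
end

def has_feature?(flag_string, feature)
  flag_string.include?(FEATURE_CHARS[feature])
end

feature_string([:garage, :fireplace])  # => "ab"
has_feature?("ab", :fireplace)         # => true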
If performance is a concern, then you will likely need to promote them to first-class fields.
Normally I'd agree that your naive assumption is correct.
If the number of boolean fields keeps growing and growing (has_fusion_reactor?), you may also consider serializing an array of flags:
# house.rb
class House < ActiveRecord::Base
  serialize :flags
  # …
end

# Setting flags
@house.flags = [:fireplace, :pool, :doghouse]

# Appending
@house.flags << :sauna

# Querying
@house.flags.include? :porch

# Searching
House.where "flags LIKE ?", "%pool%"
I'm thinking about something like this:
You have a House table (for details of the house).
You have another master table called Features (which has features like 'fireplace', 'basement', etc.).
And you have a joining table like houses_features, which has house_id and feature_id.
That way you can assign features to a given house. I don't know whether this matches your needs, but just think about it :D
You could always have a TEXT column that you hold JSON in (say, data), and then your queries could use SQL's LIKE.
E.g.: house.data #=> '{"has_fireplace":true,"has_basement":false,"has_garage":true}'
Thus, doing a find using LIKE '%"has_fireplace":true%' would return anything with a fireplace.
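A rough sketch of that approach (the data column and keys are the answer's assumptions):

class House < ActiveRecord::Base
  serialize :data, JSON
end

house = House.new(:data => { "has_fireplace" => true, "has_basement" => false })
house.save

House.where("data LIKE ?", '%"has_fireplace":true%')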
Using model relationships (eg, a model for Fireplace, Basement, and Garage in addition to just House) would be extremely cumbersome in this case, since you have so many models.

Add fields to ActiveRecord model dynamically in Rails 2.2.2?

Say I wanted to allow an administrative user to add a field to an ActiveRecord Model via an interface in the Rails app. I believe the normal ActiveRecord::Migration code would be adequate for modifying the AR Model's table structure (something that would not be wise for many applications - I know). Of course, only certain types of fields could be added...in theory.
Obviously, the forms that add (or edit) records for this newly modified ActiveRecord Model would need to be built dynamically at run-time. A common form_for approach won't do. This discussion suggests this can only be accomplished with JavaScript:
http://groups.google.com/group/rubyonrails-talk/browse_thread/thread/fc0b55fd4b2438a5
I've used Ruby in the past to query an object for its available methods. I seem to remember it was insanely slow. I'm too green with Ruby and Rails to know an elegant way to approach this. I hope someone here may. I'm also open to entirely different approaches to this problem that don't involve modifying the database.
To access the columns which are currently defined for a model, use the columns method - it will give you, for each column, its name, type and other information (such as whether it is a primary key, etc.)
However, modifying the schema at runtime is delicate.
The schema is pre-loaded (and cached, from the DB driver) by each model class when it is first loaded. In production mode, Rails only does this once per model, around startup.
In order to force Rails to refresh its cached schema following your modification, you should force Ruby to reload the affected model's class (pretty much what Rails does for you automatically, after each request, when running in development mode - see how to reload a class using remove_const followed by load.)
If you have a Mongrel cluster, you also have to inform the other processes in the cluster, which run in their own separate memory space, to also reload their model's classes (some clusters will allow you to create a 'restart.txt' file, which will cause an automatic soft-restart of all processes in your cluster with no additional work required on your behalf.)
Now, these having been said, depending on the actual problem that you need to solve you may not need to dynamically alter the schema after all. Instead of adding, say, columns col1, col2 and col3 to some table entries (model Entry), you can use a table called dyn_attribs, where Entry has_many :dyn_attribs, and where dyn_attribs has both a key column (which in this case can have values col1, col2 or col3) and a value column (which lists the corresponding values for col1, col2 etc.)
Thus, instead of:
my_entry = Entry.find(123)
col1 = my_entry.col1
#do something with col1
you would use:
my_entry = Entry.find(123, :include => :dyn_attribs)
dyn_attribs = my_entry.dyn_attribs.inject(HashWithIndifferentAccess.new) { |s, a|
  s[a.key] = a.value; s
}
col1 = dyn_attribs[:col1]
#do something with col1
The above inject call can be factored away into the model, or even into a base class inherited from by all models that may require additional, dynamic columns/attributes (see Polymorphic associations on how to make several models share the same dyn_attribs table for dynamic attributes.)
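A hedged sketch of that polymorphic arrangement (association and column names are assumptions):

class DynAttrib < ActiveRecord::Base
  belongs_to :owner, :polymorphic => true   # owner_id + owner_type columns
end

class Entry < ActiveRecord::Base
  has_many :dyn_attribs, :as => :owner
end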
UPDATE
Adding or renaming a column via a regular HTML form.
Assume that you have a DynAttrTable model representing a table with dynamic attributes, as well as a DynAttrDef defining the dynamic attribute names for a given table.
Run:
script/generate scaffold_resource DynAttrTable name:string
script/generate scaffold_resource DynAttrDef name:string
rake db:migrate
Then edit the generated models:
class DynAttrTable < ActiveRecord::Base
  has_many :dyn_attr_defs
end

class DynAttrDef < ActiveRecord::Base
  belongs_to :dyn_attr_table
end
You may continue to edit the controllers and the views like in this tutorial, replacing Recipe with DynAttrTable, and Ingredient with DynAttrDef.
Alternatively, use one of the plugins reviewed here to automatically put the dyn_attr_tables and dyn_attr_defs tables under management by an automated interface (with all its bells and whistles), with virtually zero implementation effort on your behalf.
This should get you going.
"Say I wanted to allow an administrative user to add a field to an ActiveRecord Model via an interface in the Rails app."
I've solved this sort of problem before by having an extra model called AdminAdditions. The table includes an id, an admin user id, a model name string, a type string, and a default value string.
I override the model's find and save methods to add attributes from its admin_additions, and save them appropriately when changed. The model table has a large text field, initially empty, where I save nondefault values of the added attributes.
Essentially the views and controllers can pretend that every attribute of the model has its own column. This means form_for and so on all work.
ActiveRecord::Migration.add_column(:users, :email, :string)
You could use Flex Attributes for this, though if you want to be able to search or order by these new columns you'll have to write (a lot of) custom SQL.
I have seen the dynamic alteration/migration of tables offered as a solution many times but I have never actually seen it implemented. There are many reasons why this solution is rarely implemented.
If the table is large then the table may/will be locked for extended periods of what is supposed to be up-time.
Why is your model changing dynamically? It is quite rare for a model's structure to need to change dynamically. It is more often an indication that you are trying to model something specific in a generalised way.
This is often an attempt at producing a "categorised" model that could be better solved by another approach.
DDL statements are often not allowed by the same user that is used for day-to-day DML requirements. Whilst this could be the case, and often is in the RoR arena, it is not always the "right" way to do it.
What are you trying to achieve here? A better understanding of the problem would probably reveal a more natural solution.
If you were doing this with PostgreSQL now you could probably get away with a JSON type field and then just store whatever in the json hash.

Best way to store constants referenced in the DB?

In my database, I have a model which has a field which should be selected from one of a list of options. As an example, consider a model which needs to store a measurement, such as 5ft or 13cm or 12.24m3. The obvious way to achieve this is to have a decimal field and then some other field to store the unit of measurement.
So what is the best way to store the unit of measurement? I've used a couple of approaches in the past:
1) Storing the various options in another DB table (and associated model), and linking the two with a standard foreign key (and usually eager loading the associated model). This seems like overkill, as you are forcing the DB to perform a join on every query.
2) Storing the options as a constant Hash, loaded in one of the initializers, where the key into the Hash is stored in the unit of measurement field. This way, you effectively do the join in Ruby (which may or may not be a performance increase), but you lose the ability to query from the "unit of measurement" side. This wouldn't be a problem provided it's unlikely you'd need to do queries like "find me all measurements with units of cm".
Neither of these feels particularly elegant to me... can anyone suggest something better?
Have you seen constant_cache? It's sort of the combination of the best of 1 and 2 - lookup data is stored in the DB, but it's exposed as class constants on the lookup model and only loaded at application start, so you don't suffer the join penalties constantly. The following example comes from the README:
migration:
create_table :account_statuses do |t|
  t.string :name, :description
end

AccountStatus.create!(:name => 'Active', :description => 'Active user account')
AccountStatus.create!(:name => 'Pending', :description => 'Pending user account')
AccountStatus.create!(:name => 'Disabled', :description => 'Disabled user account')
model:
class AccountStatus < ActiveRecord::Base
  caches_constants
end
using it:
Account.new(:username => 'preagan', :status => AccountStatus::PENDING)
I would go with option one. How large will the UnitOfMeasurement table be? And, if you're using an integer primary key, why worry so much about speed?
Option 1 is the way to go for design reasons. Just declare it with an integer (even smallint) primary key and a field for the unit description.
Has ActiveRecord gotten support for natural keys yet? If it has, you can just make the name (or whatever) column of the UnitOfMeasure table the PK; that way the value of the FK column has all the info you need, and you still have a fully normalized DB with a canonical set of UnitOfMeasurement values.
Do you need to perform lookups on these values? If not, you could just as well store them as a string and parse the string later on in the application that reads the values. While you risk storing unparseable data, you gain speed and reduce DB complexity. Sometimes normalizing a database is not helpful. In the end, /something/ within your system needs to know that "cm" is a length measure and "m3" is a volume measure, and comparing "3cm" to "1m3" doesn't make any sense anyway, so you might just as well put all that knowledge in code.
If you are only going to display that data anyway, what is normalizing good for here?
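A hypothetical sketch of parsing such a stored string in Ruby (it assumes the format "number followed by unit"):

def parse_measurement(raw)
  if raw =~ /\A([\d.]+)\s*([a-z0-9]+)\z/i
    { :value => $1.to_f, :unit => $2 }
  end
end

parse_measurement("12.24m3")  # => { :value => 12.24, :unit => "m3" }
parse_measurement("5ft")      # => { :value => 5.0,   :unit => "ft" }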
