mongodb data design question - ruby-on-rails

I'm trying my first application with mongodb on Rails using mongo_mapper and I'm weighing my options on an STI model like below.
It works fine, and I will of course add to this in more ways than I can currently count. I'm just curious whether I'd be better off with Embedded Documents or some such.
I'd like my models to share as much as possible, IE since they all inherit certain attributes, a shared form partial on property/_form.html.erb... in addition to their own unique form elements etc. I know the views will differ but I'm not sure on the controllers yet, as I could use property controller I assume for most things? And I'm sure it will get more complex as I go along.
Any pointers, resources and/or wisdom (pain-saving tips) would be greatly appreciated.
property.rb
class Property
include MongoMapper::Document
key :name, String, :required => true
key :_type, String, :required => true
key :location_id, Integer, :required => true
key :description, String
key :phone, String
key :address, String
key :url, String
key :lat, Numeric
key :lng, Numeric
key :user_id, Integer, :required => true
timestamps!
end
restaurant
class Restaurant < Property
key :cuisine_types, Array, :required => true
end
bar
class Bar < Property
key :beers_on_tap, Array
end

Don't be afraid of more models, the idea of OO is to be able to cut up your concerns into tiny pieces and then treat each of them in the way they need to be treated.
For example, your Property model seems to be doing a whole lot. Why not split out the geo stuff you've got going on into an EmbeddedDocument (lat, lng, address, etc)? That way your code will remain simpler and more readable.
I use this sort of STI myself and I find it makes my code much simpler and more useable. One of the beauties of using a DB like Mongo is that you can do very complex STI like this and still have a manageable collection of data.
Regarding your cuisine_types and beers_on_tap etc, I think those are fine concepts. It might be useful to have Cuisine and Beer models too, so your database remains more normalized (a concept that is easy to lose in Mongo). e.g.:
class Bar < Property
key :beer_ids, Array
many :beers, :in => :beer_ids
end
class Beer
include MongoMapper::Document
key :name, String
end

Do you expect to return both Restaurants and Bars in the same query?
If not, you might want to reconsider having them derive from a base type.
By default, Mongo_Mapper is going to put both Restaurants and Bars in a single collection. This could hamper performance and make things harder to scale in the future.
Looking through some of the Mongo_Mapper code, it looks like you might be able to set this on the fly with set_collection_name.
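To illustrate what sharing one collection means, here is a minimal plain-Ruby sketch. It does not use MongoMapper; the array stands in for the shared collection, and the sample documents and the `find_by_type` helper are assumptions made for illustration. Every typed query has to filter on `_type`, because both kinds of document live side by side:

```ruby
# One array stands in for the single collection shared by all subclasses.
PROPERTIES = [
  { _type: 'Restaurant', name: 'Chez Nous', cuisine_types: ['French'] },
  { _type: 'Bar',        name: 'The Tap',   beers_on_tap: ['Stout'] },
  { _type: 'Bar',        name: 'Night Owl', beers_on_tap: [] }
].freeze

# Hypothetical helper: a typed query is a filter on the _type discriminator.
def find_by_type(collection, type)
  collection.select { |doc| doc[:_type] == type }
end

find_by_type(PROPERTIES, 'Bar').map { |doc| doc[:name] } # => ["The Tap", "Night Owl"]
```

If the two types are never queried together, separate collections would remove the need for that filter.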

Related

Is there a better way to manage has_many relationship with jsonb in Rails?

Considering Students who can study various things, I'm storing those in a jsonb column referencing a Studies table. Indexing the studies isn't really important (for now) and I prefer to avoid a relationship table.
Therefore: add_column :students, :studies, :jsonb, default: []
And in my simple form (in slim):
= simple_form_for @student do |f|
= f.input :studies, as: :check_boxes, collection: Study.all, label_method: :name
This works stupendously well considering the brevity and the simplicity of it. Except for one small detail: the form doesn't check previously saved studies as their IDs are stored as strings in the jsonb array ["", "2", "12"] and the form apparently requires integers.
I resorted to add a studies' value function in the Student model, but it seems sooo overkill (also the .reject(&:zero?) to remove the empty array value):
def studies=(array)
# transform strings to integers and remove leading empty value
super(array.map(&:to_i).reject(&:zero?))
end
Is there a better way?
I would say the better way is just using the relationship table. Overriding the assignment method on a model is generally not the right approach.
JSONB is nice, gives flexibility, and can be even queried nicely, but unless you have a really strong reason to go with it in this case, you should probably stick to has_many :through... association.
Either way, depending on how you wired everything, maybe instead of overriding the assignment method you would be better by putting your logic in action filters or somewhere where you do model validation...
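If the jsonb column stays, one way to avoid overriding the setter is to clean the submitted IDs in a small helper that the controller (or a callback) calls before assignment. This is only a sketch in plain Ruby; `normalize_study_ids` is a hypothetical name, not something Rails or simple_form provides:

```ruby
# Hypothetical helper: turns raw check-box params into clean integer IDs.
def normalize_study_ids(raw_ids)
  Array(raw_ids)
    .map(&:to_s)
    .reject { |id| id.strip.empty? } # drop the blank value the form prepends
    .map { |id| Integer(id, 10) }    # raises ArgumentError on non-numeric input
    .uniq
end

normalize_study_ids(["", "2", "12"]) # => [2, 12]
```

Unlike `to_i` followed by `reject(&:zero?)`, this raises on garbage input instead of silently discarding it.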

Flattening a polymorphic AR relation with Elasticsearch/Tire

I'm working with a Rails 3 application to allow people to apply for grants and such. We're using Elasticsearch/Tire as a search engine.
Documents, e.g., grant proposals, are composed of many answers of varying types, like contact information or essays. In AR, (relational dbs in general) you can't specify a polymorphic "has_many" relation directly, so instead:
class Document < ActiveRecord::Base
has_many :answerings
end
class Answering < ActiveRecord::Base
belongs_to :document
belongs_to :question
belongs_to :payload, :polymorphic => true
end
"Payloads" are models for individual answer types: contacts, narratives, multiple choice, and so on. (These models are namespaced under "Answerable.")
class Answerable::Narrative < ActiveRecord::Base
has_one :answering, :as => :payload
validates_presence_of :narrative_content
end
class Answerable::Contact < ActiveRecord::Base
has_one :answering, :as => :payload
validates_presence_of :fname, :lname, :city, :state, :zip...
end
Conceptually, the idea is an answer is composed of an answering (functions like a join table, stores metadata common to all answers) and an answerable (which stores the actual content of the answer.) This works great for writing data. Search and retrieval, not so much.
I want to use Tire/ES to expose a more sane representation of my data for searching and reading. In a normal Tire setup, I'd wind up with (a) an index for answerings and (b) separate indices for narratives, contacts, multiple choices, and so on. Instead, I'd like to just store Documents and Answers, possibly as parent/child. The Answers index would merge data from Answerings (id, question_id, updated_at...) and Answerables (fname, lname, email...). This way, I can search Answers from a single index, filter by type, question_id, document_id, etc. The updates would be triggered from Answering, but each answering will then pull in information from its answerable. I'm using RABL to template my search engine inputs, so that's easy enough.
Answering.find(123).to_indexed_json # let's say it's a narrative
=> { id: 123, question_id: 10, document_id: 24, updated_at: ..., updated_by: "root@me.com", narrative_content: "Back in the day, when I was a teenager, before I had...", answerable_type: "narrative" }
So, I have a couple of questions.
The goal is to provide a single-query solution for all answers, regardless of underlying (answerable) type. I've never set something like this up before. Does this seem like a sane approach to the problem? Can you foresee wrinkles I can't? Alternatives/suggestions/etc. are welcome.
The tricky part, as I see it, is mapping. My plan is to put explicit mappings in the Answering model for the fields that need indexing options, and just let the default mappings take care of the rest:
mapping do
indexes :question_id, :index => :not_analyzed
indexes :document_id, :index => :not_analyzed
indexes :narrative_content, :analyzer => :snowball
indexes :junk_collection_total, :index => :not_analyzed
indexes :some_other_crazy_field, :index
[...]
If I don't specify a mapping for some field, (say, "fname") will Tire/ES fall back on dynamic mapping? (Should I explicitly map every field that will be used?)
Thanks in advance. Please let me know if I can be more specific.
Indexing is the right way to go about this. Along with indexing field names, you can index the results of methods.
mapping do
indexes :payload_details, :as => 'payload_details', :analyzer => 'snowball',:boost => 0
end
def payload_details
"#{payload.fname} #{payload.lname}" #etc.
end
The indexed value becomes a duck type, so if you index all of the values that you reference in your view, the data will be available. If you access an attribute of the indexed item's own model that is not indexed, it will grab the instance from ActiveRecord. If you access an attribute of a related model, I am pretty sure you get a reference error, but the dynamic finder may take over.
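The flattening described in the question can be sketched in plain Ruby, independent of Tire or RABL. The structs below stand in for the ActiveRecord models, and `flat_answer` is a hypothetical helper; only the field names come from the question:

```ruby
# Plain structs stand in for the ActiveRecord models.
Narrative = Struct.new(:narrative_content)
Answering = Struct.new(:id, :question_id, :document_id, :payload)

# Hypothetical helper: merges the answering's own columns with its payload's.
def flat_answer(answering)
  payload = answering.payload
  {
    id: answering.id,
    question_id: answering.question_id,
    document_id: answering.document_id,
    answerable_type: payload.class.name.split('::').last.downcase
  }.merge(payload.to_h)
end

flat_answer(Answering.new(123, 10, 24, Narrative.new('Back in the day...')))
# => {:id=>123, :question_id=>10, :document_id=>24,
#     :answerable_type=>"narrative", :narrative_content=>"Back in the day..."}
```

One thing to watch with this shape is a payload column that shares a name with an answering column: `merge` lets the payload value win.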

Relations with mongoid, what should i use?

I'm using Ruby on Rails 3.1 with mongoid and trying to set up some rather simple relations between posts, comments, users and tags.
I'm very new to mongodb, and no-sql in general so I'm a bit confused.
What I am trying to accomplish is this:
Users, posts and comments should be able to have multiple tags.
Tags should have name, type and a count of how many times it has been used.
I need to be able to get all available tags so that users can choose from them.
And the other way around, be able to retrieve tags from users, posts and comments.
I've read a lot about it, and still can't seem to figure out which approach I should take. Should I use referential or embedded relationships?
I've looked at a couple of gems but none seems to work as I described above.
Sidenote: I am going to use Tire for my search-function later on.
Cool, welcome to MongoDB! This is hard to get right and depends on your application but I'll try and give you some pointers based on what you've written and what I think will work best.
This isn't always the case but the general theory is that if an object is always manipulated and viewed in context of another you should embed it inside that object. This is probably the case with comments and posts in your application. Therefore you may want to embed comments inside posts.
However, because you use the tag object in multiple contexts I would make it its own collection like this:
class Tag
include Mongoid::Document
field :tag, type: String
field :type, type: String
field :count, type: Integer
end
Let's run down your requirements and build the models.
Tags should have name, type and a count of how many times it has been used.
Done via above code for Tag class.
Users, posts and comments should be able to have multiple tags.
Ok so let's give each of these classes a "tags" field that has an array of tag IDs, this would be a referential relationship.
class User
include Mongoid::Document
field :first_name, type: String
field :last_name, type: String
field :email, type: String
field :tags, type: Array
end
Also here we will embed comments inside of the posts along with having the array of tag IDs like we do for Users.
class Post
include Mongoid::Document
field :subject, type: String
field :body, type: String
field :tags, type: Array
embeds_many :comments
end
class Comment
include Mongoid::Document
field :name, type: String
field :type, type: String
field :count, type: Integer
embedded_in :post
end
Make sense? There is some more info here on modeling these sorts of relationships in Rails but using mongomapper instead of mongoid (so don't pay attention to the syntax but pay attention to the ideas presented)
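The difference between the two kinds of relationship can be sketched without Mongoid, using plain hashes as stand-ins for documents. The sample data, the `body` key on comments and the `count_tag_usage` helper are all assumptions made for illustration:

```ruby
# The tag "collection", keyed by ID: tags live on their own and are referenced.
TAGS = {
  1 => { name: 'ruby',    type: 'language', count: 0 },
  2 => { name: 'mongodb', type: 'database', count: 0 }
}

# A post references tags by ID and embeds its comments directly.
POST = {
  subject: 'Hello',
  tags: [1, 2],
  comments: [{ body: 'Nice post', tags: [2] }]
}

# Hypothetical helper: bump the usage counter of every referenced tag.
def count_tag_usage(tags, tag_ids)
  tag_ids.each { |id| tags.fetch(id)[:count] += 1 }
end

count_tag_usage(TAGS, POST[:tags])
POST[:comments].each { |comment| count_tag_usage(TAGS, comment[:tags]) }

POST[:tags].map { |id| TAGS.fetch(id)[:name] } # => ["ruby", "mongodb"]
TAGS[2][:count]                                # => 2
```

Listing all available tags is then a read of the tag collection alone, while a post's comments arrive with the post in one fetch.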

Rails Models: how would you create a pre-defined set of attributes?

I'm trying to figure out the best way to design a rails model. For purposes of the example, let's say I'm building a database of characters, which may have several different fixed attributes. For instance:
Character
- Morality (may be "Good" or "Evil")
- Genre (may be "Action", "Suspense", or "Western")
- Hair Color (may be "Blond", "Brown", or "Black")
... and so on.
So, for the Character model there are several attributes where I want to basically have a fixed list of possible selections.
I want users to be able to create a character, and in the form I want them to pick one from each of the available options. I also want to be able to let users search using each of these attributes... ( ie, "Show me Characters which are 'Good', from the 'Suspense' genre, and have 'Brown' hair).
I can think of a couple ways to do this...
1: Create a string for each attribute and validate limited input.
In this case I would define an string column "Morality" on the character table, then have a class constant with the options specified in it, and then validate against that class constant.
Finding good characters would be like Character.where(:morality=>'Good').
This is nice and simple, the downside is if I wanted to add some more detail to the attribute, for instance to have a description of "Good" and "Evil", and a page where users could view all the characters for a given morality.
2: Create a model for each attribute
In this case Character belongs_to Morality, there would be a Morality model and a moralities table with two records in it: Morality id:1, name:Good etc.
Finding good characters would be like Morality.find_by_name('Good').characters... or
Character.where(:morality => Morality.find(1)).
This works fine, but it means you have several tables that exist only to hold a small number of predefined attributes.
3: Create a STI model for attributes
In this case I could do the same as #2, except create a general "CharacterAttributes" table and then subclass it for "MoralityAttribute" and "GenreAttribute" etc. This makes only one table for the many attributes, otherwise it seems about the same as idea #2.
So, those are the three ways I can think of to solve this problem.
My question is, how would you implement this, and why?
Would you use one of the approaches above, and if so which one? Would you do something different? I'd especially be interested to hear performance considerations for the approach you would take. I know this is a broad question, thank you for any input.
EDIT:
I'm adding a Bounty of 250 (more than 10% of my reputation!!) on this question because I could really use some more extended discussion of pros / cons / options. I'll give upvotes to anyone who weighs in with something constructive, and if someone can give me a really solid example of which approach they take and WHY it'll be worth +250.
I'm really agonizing over the design of this aspect of my app and it's now time to implement it. Thanks in advance for any helpful discussion!!
FINAL NOTE:
Thank you all for your thoughtful and interesting answers, all of them are good and were very helpful to me. In the end (coming in right before the bounty expired!) I really appreciated Blackbird07's answer. While everyone offered good suggestions, for me personally his was the most useful. I wasn't really aware of the idea of an enum before, and since looking into it I find it solves many of the issues I've been having in my app. I would encourage everyone who discovers this question to read all the answers, there are many good approaches offered.
I assume that you are going to have more than a few of these multiple-choice attributes, and would like to keep things tidy.
I would recommend the store-it-in-the-database approach only if you want to modify the choices at runtime; otherwise it quickly becomes a performance hit. If a model has three such attributes, it takes four database calls instead of one to retrieve it.
Hardcoding the choices into validations is a fast way, but it becomes tedious to maintain. You have to make sure that every similar validator, drop-down list etc. uses matching values. And it becomes quite hard and cumbersome if the list becomes long. It's only practical if you have 2-5 choices that really won't change much, like male, female, unspecified.
What I'd recommend is a configuration YAML file. This way you can have a single tidy document for all your choices:
# config/choices.yml
morality:
- Good
- Evil
genre:
- Action
- Suspense
- Western
hair_color:
- Blond
- Brown
- Black
Then you can load this file into a constant as a Hash
# config/initializers/load_choices.rb
Choices = YAML.load_file("#{Rails.root}/config/choices.yml")
Use it in your models;
# app/models/character.rb
class Character < ActiveRecord::Base
validates_inclusion_of :morality, in: Choices['morality']
validates_inclusion_of :genre, in: Choices['genre']
# etc…
end
Use them in views;
<%= select @character, :genre, Choices['genre'] %>
etc…
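The YAML approach can be tried outside Rails. In this sketch an inline string stands in for config/choices.yml so that it runs on its own, and `valid_choice?` is a hypothetical helper that mirrors the check `validates_inclusion_of` performs:

```ruby
require 'yaml'

# Inline stand-in for config/choices.yml.
CHOICES_SOURCE = <<~CONFIG
  morality:
    - Good
    - Evil
  genre:
    - Action
    - Suspense
    - Western
CONFIG

CHOICES = YAML.safe_load(CHOICES_SOURCE).freeze

# Hypothetical helper: true when the value is one of the configured choices.
def valid_choice?(attribute, value)
  CHOICES.fetch(attribute.to_s, []).include?(value)
end

valid_choice?(:genre, 'Western') # => true
valid_choice?(:genre, 'Romance') # => false
```

Because validators and drop-downs both read from the same hash, adding a choice means editing one line of YAML.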
Put simply, you're asking how to enumerate ActiveRecord attributes. There are a lot of discussions around the web and even on SO for using enums in rails applications, e.g. here, here or here to name a few.
I never used one of the many gems there are for enums, but active_enum gem sounds particularly suited for your use case. It doesn't have the downsides of an activerecord-backed attribute set and makes maintenance of attribute values a piece of cake. It even comes with form helpers for formtastic or simple form (which I assume could help you for attribute selection in your character search).
If a change in any of these attributes would be strongly tied to a change in the code (ie: when a new Hair Color is introduced, a new page is created or a new action is implemented), then I'd say add them as a string hash (option 1). You could store them in the Character model as constant hashes along with other metadata.
class Character < ActiveRecord::Base
# each entry is [name, description], so MORALITY[:good][0] is the stored string
MORALITY = {:good => ['Good', 'Person is being good'], :evil => ['Evil', 'Person is being Evil']}
...
end
Character.where(:morality => Character::MORALITY[:good][0])
Edit to add the code from comment:
Given Character::MORALITY = {:good => {:name => 'Good', :icon => 'good.png'}, ...
- Character::MORALITY.each do |k,v|
= check_box_tag('morality', k.to_s)
= image_tag(v[:icon], :title => v[:name])
= Character::MORALITY[@a_character.morality.to_sym][:name]
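The nested-hash variant from the comment can be exercised in plain Ruby. In this sketch the constant sits at top level rather than on the model, 'evil.png' is an assumed value, and both helper methods are hypothetical names:

```ruby
# Plain-Ruby stand-in for Character::MORALITY.
MORALITY = {
  good: { name: 'Good', icon: 'good.png' },
  evil: { name: 'Evil', icon: 'evil.png' } # icon name is an assumption
}.freeze

# Hypothetical helper: the stored value is the key as a string, e.g. "good".
def morality_name(stored_value)
  MORALITY.fetch(stored_value.to_sym)[:name]
end

# Hypothetical helper: [label, value] pairs in the shape select helpers expect.
def morality_options
  MORALITY.map { |key, meta| [meta[:name], key.to_s] }
end

morality_name('evil') # => "Evil"
morality_options      # => [["Good", "good"], ["Evil", "evil"]]
```

Using `fetch` means an unknown stored value raises a `KeyError` rather than returning `nil` halfway through a view.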
My suggestion is to use a NoSQL database such as MongoDB.
MongoDB supports embedded documents. An embedded document is saved in the same entry as the parent, so it is very fast to retrieve; it is like accessing a common field. But embedded documents can be very rich.
class Character
include Mongoid::Document
embeds_one :morality
embeds_many :genres
embeds_one :hair_colour
index 'morality._type'
index 'genres._type'
end
class Morality
include Mongoid::Document
field :name, default: 'undefined'
field :description, default: ''
embedded_in :character
end
class Evil < Morality
include Mongoid::Document
field :name, default: 'Evil'
field :description,
default: 'Evil characters try to harm people when they can'
field :another_field
end
class Good < Morality
include Mongoid::Document
field :name, default: 'Good'
field :description,
default: 'Good characters try to help people when they can'
field :a_different_another_field
end
Operations:
character = Character.create(
morality: Evil.new,
genres: [Action.new, Suspense.new],
hair_colour: Yellow.new )
# very very fast operations because it is accessing an embed document
character.morality.name
character.morality.description
# Very fast operation because you can build an index on the _type field.
Character.where('morality._type' => 'Evil').execute.each { |doc| p doc.morality }
# Matches all characters that have a genre of type Western.
Character.where('genres._type' => 'Western')
# Matches all characters that have a genre of type Western or Suspense.
Character.any_in('genres._type' => ['Western','Suspense'])
This approach has the advantage that adding a new type of Morality is just adding a new Model that inherits from Morality. You don't need to change anything else.
Adding new Morality types does not carry any performance penalty. The index takes care of maintaining fast query operations.
Accessing the embedded fields is very fast. It is like accessing a common field.
The advantage of this approach over just a YML file is that you can have very rich embedded documents. Each of these documents can grow to fit your needs. Need a description field? Add it.
But I would combine the two options. The YML file could be very useful as a reference that you can use in select boxes, for example, while embedded documents give you the desired flexibility.
I'll follow two principles: DRY, and developer happiness over complicated code.
First of all, the predefined Character data will be in the model as a constant.
The second is about validation: we will do a bit of metaprogramming here, as well as searching with scopes.
#models/character.rb
class Character < ActiveRecord::Base
DEFAULT_VALUES = {:morality => ['Good', 'Evil'], :genre => ['Action', 'Suspense', 'Western'], :hair_color => ['Blond', 'Brown', 'Black']}
include CharacterScopes
end
#models/character_scopes.rb
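module CharacterScopes
def self.included(base)
base.class_eval do
# base::DEFAULT_VALUES, because a bare constant would be looked up in CharacterScopes
base::DEFAULT_VALUES.each do |k,v|
validates_inclusion_of k, :in => v
# one scope per attribute, e.g. Character.with_morality('Good');
# the with_ prefix avoids clashing with the attribute's own reader
scope :"with_#{k}", lambda { |value| where(k => value) }
end
end
end
end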
#app/views/characters/form.html
<% Character::DEFAULT_VALUES.each do |k,v| %>
<%= select_tag k, options_for_select(v) %>
<% end %>
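The metaprogramming idea above, generating one finder per attribute from a single constant, can be shown without ActiveRecord. In this sketch `CharacterList` is a hypothetical stand-in that filters an in-memory array, and the generated `with_*` method names are an assumption:

```ruby
# Plain-Ruby stand-in for the model: records are hashes in an array.
class CharacterList
  DEFAULT_VALUES = {
    morality: %w[Good Evil],
    hair_color: %w[Blond Brown Black]
  }.freeze

  def initialize(records)
    @records = records
  end

  # One finder per attribute, generated from the constant
  # (with_morality, with_hair_color).
  DEFAULT_VALUES.each do |attribute, allowed|
    define_method("with_#{attribute}") do |value|
      raise ArgumentError, "unknown #{attribute}: #{value}" unless allowed.include?(value)

      @records.select { |record| record[attribute] == value }
    end
  end
end

list = CharacterList.new([
  { name: 'A', morality: 'Good', hair_color: 'Black' },
  { name: 'B', morality: 'Evil', hair_color: 'Brown' }
])
list.with_morality('Good').map { |r| r[:name] } # => ["A"]
```

Adding an attribute to `DEFAULT_VALUES` is then enough to get both the allowed-value check and the finder.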
For the multiple values case, one option is to use bit fields as implemented in the FlagShihTzu gem. This stores a number of flags in a single integer field.
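To show what storing flags in a single integer means, here is a hand-rolled sketch in plain Ruby. It does not use FlagShihTzu and says nothing about that gem's API; the constant and both method names are assumptions:

```ruby
# Each flag owns one bit. The positions are arbitrary but must never change
# once data has been stored.
GENRE_FLAGS = { action: 1 << 0, suspense: 1 << 1, western: 1 << 2 }.freeze

# Combine the selected flags into one integer with bitwise OR.
def encode_genres(genres)
  genres.reduce(0) { |bits, genre| bits | GENRE_FLAGS.fetch(genre) }
end

# Recover the flags whose bit is set.
def decode_genres(bits)
  GENRE_FLAGS.select { |_genre, mask| (bits & mask) != 0 }.keys
end

stored = encode_genres(%i[action western]) # => 5
decode_genres(stored)                      # => [:action, :western]
```

The trade-off is that the stored integer is opaque to anyone reading the database without the flag table.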

Parse before storing in MVC

I'm getting started with parsing data and getting some structure from user supplied strings (mostly pulling out digits and city names).
I've run a bit of code in the ruby interpreter, and now I want to use that same code in a web application.
I'm struggling as to where in the code my parsing should be, or how it is structured.
My initial instinct was that it belongs in the model, because it is data logic. For example, does the entry have an integer, does it have two integers, does it have a city name, etc. etc.
However, my model would need to inherit both ActiveRecord, and Parslet (for the parsing), and Ruby apparently doesn't allow multiple inheritance.
My current model is looking like this
#concert model
require 'parslet'
class Concert < Parslet::Parser
attr_accessible :date, :time, :city_id, :band_id, :original_string
rule(:integer) {match('[0-9]').repeat(1)}
root(:integer)
end
Really not much there, but I think I'm stuck because I've got the structure wrong and don't know how to connect these two pieces.
I'm trying to store the original string, as well as components of the parsed data.
I think what you want is:
#concert model
require 'parslet'
class Concert < ActiveRecord::Base
before_save :parse_fields
attr_accessible :date, :time, :city_id, :band_id, :original_string
# the rule/root definitions belong in a separate Parslet::Parser subclass, not here
private
def parse_fields
# the method_on_... names are placeholders for your own parser calls;
# self. is required, otherwise these lines only assign local variables
self.date = Parslet::Parser.method_on_original_string_to_extract_date
self.time = Parslet::Parser.method_on_original_string_to_extract_time
self.city_id = Parslet::Parser.method_on_original_string_to_extract_city_id
self.band_id = Parslet::Parser.method_on_original_string_to_extract_band_id
end
end
It looks to me as though you need several parsers (one for city names, one for digits). I would suggest that you create an informal interface for such parsers, such as
class Parser
def parse(str) # returning result
end
end
Then you would create several Ruby classes that each do a parse task in ./lib.
Then in the model, you'd require all these Ruby classes and put them to the task, let's say in a before_save hook or such.
As the author of parslet, I might add that parsing digits or city names is probably not the sweet spot for parslet. Might want to consider regular expressions there.
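Putting the last two suggestions together, small parser classes that share a `parse(str)` method and rely on regular expressions, might look like the sketch below. The class names, the fixed city list and the sample input are all assumptions:

```ruby
# Each parser exposes the same informal interface: parse(str) returns a result.
class IntegerParser
  # Returns every run of digits in the string as an Integer.
  def parse(str)
    str.scan(/\d+/).map(&:to_i)
  end
end

class CityParser
  # Hypothetical fixed list; a real app would probably look cities up in the DB.
  def initialize(known_cities)
    @known_cities = known_cities
  end

  # Returns the known cities that appear in the string as whole words.
  def parse(str)
    @known_cities.select { |city| str =~ /\b#{Regexp.escape(city)}\b/i }
  end
end

input = 'The Foo Band, Berlin, 21 June at 20'
IntegerParser.new.parse(input)                # => [21, 20]
CityParser.new(%w[Berlin Paris]).parse(input) # => ["Berlin"]
```

A before_save hook in the model could then call each parser on the original string and assign the results to the corresponding attributes.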
