I have an app that consists mainly of restaurant model instances. One of the essential attributes for these restaurants is labeling the cuisine it falls under. I'm currently at odds with myself in regards to designing this. On one hand I thought of creating a Cuisine model and creating either a HMT or HABTM association between Restaurants and Cuisines.
More recently I came across this post which shows how to create a pre-defined set of attributes. To take the answer one step further I'm assuming (in my case) I'd add a string-based cuisine column to my restaurant model and setup a select box in my restaurant form that would save the selected value.
What I was wondering was what would be the most efficient way of doing this? The goal is to eventually be able to query restaurants based what cuisine(s) they fall under. I wasn't sure if a model would be the best choice due to it only serving as a join table in a sense with a name attribute. Wasn't sure if having this extra table for something so minute would be optimal.
On the other hand I didn't know if using YAML for this would be conducive since the values are essentially dummy strings with no tangible records on file like I'd have with a model instance. Can someone help me sort out this confusion?
There are many benefits of normalizing many-to-many relationships in the db. Here are some:
Searching, sorting, and creating indexes is faster, since tables are narrower, and more rows fit on a data page.
You can have more clustered indexes (one per table), so you get more flexibility in tuning queries.
Index searching is often faster, since indexes tend to be narrower and shorter.
More tables allow better use of segments to control physical placement of data.
You usually have fewer indexes per table, so data modification commands are faster.
Fewer null values and less redundant data, making your database more compact.
Triggers execute more quickly if you are not maintaining redundant data.
Data modification anomalies are reduced.
Normalization is conceptually cleaner and easier to maintain and change as your needs change.
Also, by normalizing you get the cleaner syntax and other infrastructure benefits from ActiveRecord, e.g.
cuisine.restaurants.where(city: 'Toledo')
Related
We have to create a request system which will have roughly 10 different types of requests. All of these requests will belong to the 'accounting' aspect of our application. Therefore we've called them "Accounting requests".
All requests share maybe only a few columns and each has up to 20 columns individually.
We started to wonder if having separate tables for each request type would be practical in terms of speed when we start to have to do very complicated joins or queries, for example, fetching ALL requests types into a single table and then sorting it.
Maybe it would be easier to just use Single Table Inheritance since it will have a type column and we'd be using one table to store all 10 accounting request types.
What do you think regarding using STI for this many polymorphic associations and requirements?
Essentially, it would have models like so:
AccountingRequest
BillingRequest < AccountingRequest
CheckRequest < AccountingRequest
CancellationRequest < AccountingRequest
Each subclass has roughly 10+ fields.
Currently reading about Multiple Table Inheritance here. This seems like the solution that fits my requirements in this case. Not sure yet though.
STI is a good fit if your models all share the same attributes.
However if your sub classes start having attributes specific to them and not applicable to others, then STI can result in a lot of null columns. In that case, I usually prefer to go with polymorphic association.
This railscast episode is a great example of the difference between the 2
You can use STI in that situation. But making STI will require all the columns into one single table and that's not the good think. The table will go very large in the number of fields.
I think you should divide into two tables like as below...
Request: A request table will be the polymorphic table which saved the information for the type of requests.
RequestItem: The request item table will save all the 20 fields records into the table and will have a foreign key of request table. The request item table will have two fields into the database that's called key and value.
It sounds do-able.
When I've looked into this, I found that making extensive use of value objects helped to control the non-applicability of some attributes to some of the types.
In my case I had types of products, some of which would not have particular measurements for example. In those cases I used a Null Object to indicate "Not applicable" where appropriate.
Edit: I also found the composed_of syntax very convenient: https://apidock.com/rails/ActiveRecord/Aggregations/ClassMethods/composed_of
For now I'm using a bit of NoSQL for such cases. Postgresql's JSONB type allows to store multilevel ruby hash. It also provides rich functionality: DB level constraints, indexes and query operators.
So common attributes are stored in standard way and child specific - in jsonb. Then you can use whatever you need on top of this: STI, Value Objects pattern, serialization or just create scopes for each child. I prefer the last one - my models are thin, most of constraints are DB level and all business logic is in service classes.
Pros:
Avoiding alter table on big tables when need to add one more child type
Keeping my queries efficient
Preventing storing and selecting unnecessary columns
Serialization out of the box for JSON APIs
Cons:
A bit of schemaless
Vendor lock
I'm creating a jewellery product catalogue application and I need to store properties for each product such as material, finishes, product type etc.
I've concluded that there needs to be a model for each property, mainly because things like material and finishes might have prices and weights and other things associated with them.
Which of the two options will be the most efficient way to store data and be scalable
Create a model PropertyMap that will map property types and IDs to a Product ID.
Create several other models such as ProductMaterial, ProductFinish etc that will made a property to a product
All the data needs to be searchable & filterable. The database will probably index around 10K products.
Open to other smarter ways to store this data as well!
As a rule of thumb, to get the most out of your database tools, it's best to normalize your data according to the typical SQL conventions. That means that a bunch of fields that have a one-to-one relationship with each other should be collected together into the same table. That way you can grab them all (and they're frequently needed together) with a simple and efficient query.
If you instead have to gather them up from some different organization, both you and the database will end up having to do a lot more work. It will scale poorly, both on the hardware and in your brain as you struggle to maintain and extend it.
I am using postgresql.
I started to realise that I have created too many columns for the User model, and most of them are boolean fields.
Correct me if I am wrong, if I just update one boolean value, the whole table are being updated even though "Patch" verb is being used.
So I decided to create a specific model for some boolean columns, however, this would also trigger two queries, one for the User load and the other for the newly created model load.
My question is: Would it be better if I chop some of the columns to form a new model? Or a model with many columns just don't affect the performance of a rails app.
My main concern is the data connection speed, please advise.
Avoiding a join will be better than having two tables.
You can limit which columns are returned by using select in ActiveRecord. If you have large text fields, but don't need them at a particular time, this can be helpful in improving performance. The impact is probably negligible with boolean columns.
In one of my views, I need to present alot of statistics about the data in the model and the nested models. For the nested models, there are alot of sums and counts with various conditions.
Is it better to write individual methods for each statistic I need, which generates lots of SQL calls but makes nice short pieces of code? Or should I write a method that just loops over each nested model once, computes all the counts/sums manually, and returns the data in a hash?
It depends, if breaking the big query into independent queries forces you to, for example, join twice a same pair of tables, or reading twice a same table, then it is not a good strategy.
If that's not the case, you may want to break the query into independent queries for better legibility and reusability.
Use EXPLAIN to analyze your queries, try different ones and test performance with considerable amount of test data. Consider creating indexes for big tables with many records.
So this is probably a fairly easy question to answer but here goes anyway.
I want to have this view, say media_objects/ that shows a list of media objects. Easy enough, right? However, I want the list of media objects to be a collection of things that are subtypes of MediaObject, CDMediaObject, DVDMediaObject, for example. Each of these subtypes needs to be represented with a db table for specific set of metadata that is not entirely common across the subtypes.
My first pass at this was to create a model for each of the subtypes, alter the MediaObject to be smart enough to join into those tables on it's conceptual 'all' behavior. This seems straightforward enough but I end up doing a lot of little things that feel not so rails-O-rific so I wanted to ask for advice here.
I don't have any concrete code for this example yet, obviously, but if you have questions I'll gladly edit this question to provide that information...
thanks!
Creating a model for each sub-type is the way to go, but what you're talking about is multiple-table inheritance. Rails assumes single-table inheritance and provides really easy support for setting it up. Add a type column to your media_objects table, and add all the columns for each of the specific types of MediaObject to the table. Then make each of your models a sub-class of MediaObject:
class MediaObject < ActiveRecord::Base
end
class CDMediaObject < MediaObject
end
Rails will handle pulling the records out and instantiating the correct subclass, so that when you MediaObject.find(:all) the results will contain a mixture of instances of the various subclasses of MediaObject.
Note this doesn't meet your requirement:
Each of these subtypes needs to be represented with a db table for specific set of metadata that is not entirely common across the subtypes.
Rails is all about convention-over-configuration, and it will make your life very easy if you write your application to it's strengths rather than expecting Rails to adapt to your requirements. Yes, STI will waste space leaving some columns unpopulated for every record. Should you care? Probably not; database storage is cheap, and extra columns won't affect lookup performance if your important columns have indexes on them.
That said, you can setup something very close to multiple-table inheritance, but you probably shouldn't.
I know this question is pretty old but just putting down my thoughts, if somebody lands up here.
In case the DB is postgres, I would suggest use STI along hstore column for storing attributes not common across different objects. This will avoid wasting space in DB yet the attributes can be accessed for different operations.
I would say, it depends on your data: For example, if the differences between the specific media objects do not have to be searchable, you could use a single db table with a TEXT column, say "additional_attributes". With rails, you could then serialize arbitrary data into that column.
If you can't go with that, you could have a general table "media_objects" which "has one :dataset". Within the dataset, you could then store the specifics between CDMediaObject, DVDMediaObject, etc.
A completely different approach would be to go with MongoDB (instead of MySQL) which is a document store. Each document can have a completely different form. The entire document tree is also searchable.