I have two models in a one-to-one relationship: A currently has one B.
Lately I have run into many cases where it would be desirable for A to also keep the id of B, in order to simplify logic and boost performance. However, I wonder:
whether this is possible
whether it would breach any convention
any other thoughts, really
UPDATE
I was wrong: the left outer join would not benefit from the extra foreign key.
The only use I can think of is finding all A's which do not have a B. Today that requires a join across my 100,000+ records, but if A stored the id then I would know straight away which A has a B.
I don't believe this is possible: you have to decide where to keep the foreign key. Would it make sense for you to use a join model, with both of your existing models declaring has_one :through that join?
Another alternative is to put A's id on B, i.e. denormalize. This would allow you to figure out which A's have no B and which A's have one. This might be appropriate for reporting scenarios where B's don't move between A's often.
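As a rough illustration of the "A's without a B" lookup discussed above, here is a minimal sketch using Python's built-in sqlite3 as a stand-in database. The table and column names (a, b, a_id) are assumptions, not taken from the original application; the point is only that the lookup is an anti-join, which can use the index on b.a_id and does not require storing B's id on A.

```python
import sqlite3

# Hypothetical schema: the names a, b and a_id are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE b (
        id   INTEGER PRIMARY KEY,
        a_id INTEGER NOT NULL UNIQUE REFERENCES a(id)  -- UNIQUE keeps it one-to-one and indexed
    );
    INSERT INTO a (id, name) VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO b (id, a_id) VALUES (10, 1), (11, 3);
""")

# Anti-join, variant 1: LEFT JOIN and keep only the rows with no match.
a_without_b = [row[0] for row in conn.execute("""
    SELECT a.id
    FROM a
    LEFT JOIN b ON b.a_id = a.id
    WHERE b.id IS NULL
    ORDER BY a.id
""")]

# Anti-join, variant 2: NOT EXISTS, which expresses the same question.
a_without_b_exists = [row[0] for row in conn.execute("""
    SELECT a.id
    FROM a
    WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.a_id = a.id)
    ORDER BY a.id
""")]

print(a_without_b, a_without_b_exists)
```

Whether this is fast enough on 100,000+ rows depends on the database and its indexes, so it is worth measuring before denormalizing.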
Related
We have a Ruby on Rails application that is getting quite big. We are trying to split it up into separate logical domains, and we are wondering how to enforce that the boundaries between those domains are not crossed.
Let's say we have models A, B, C, D. Models A and B are used in one part of the app, models C and D in another. We would like to be able to draw some logical system boundaries, for the sake of faster continuous integration and independent teams.
We can reorganize the code and break some dependencies.
For example,
let's say model B is a join (mapping) table that joins A and C in a many-to-many relationship:
class A < ApplicationRecord
  has_many :bs
  has_many :cs, through: :bs
end
We could change this to:
class A < ApplicationRecord
  has_many :bs
end
From A, you get a list of c_id values through the mapping table B, and then you can call a GET API on C to fetch the list of C objects. This is similar to what we would do if A and C lived in physically separate databases and we wanted to display a list of C objects when showing A.
Eventually, we would probably want A and B in one service and C in a separate service. In the meantime, we do want to keep A, B and C in the same Postgres database and just make a logical extraction.
Question: What are ways we can enforce a logical system boundary in Rails + Postgres, so that team members don't take class A, accidentally join it with C, and introduce unintended coupling before we do a full service split?
This blog suggests using test suites and selectively loading the necessary dependencies: https://medium.com/#dan_manges/the-modular-monolith-rails-architecture-fb1023826fc4. But I feel that might not prevent a developer from using SQL directly to join tables across a logical boundary. I am wondering what others have tried.
I have run into the typical case of a model (Address) whose data is needed by multiple other models (Company, Person). Let's work with these models as an example, but it can be generalized to any others.
Making the correct choice is important, as changing the database schema or making deep changes is not easy later.
Initially it will be a one-to-one association. We then have these possibilities:
1) Use classical normalization and put all the address data into each model that requires it. Then create an address subform, rendered as a partial in each one, and put the behavior code into a Module to be called from the controller of each model that uses it.
This could look fine, but it makes changes hard, as any change to the Address data needs to be repeated in each model using it. It is also hard to change to a one-to-many association later, if required.
2) Rails' polymorphic feature. Address is a model itself with belongs_to :addressable, polymorphic: true; Company and Person then declare has_one :address, as: :addressable (or has_many :addresses, as: :addressable). Then add the polymorphic FK columns to the addresses table and it's done.
I like this solution: it is clear, and it takes only 2 extra columns. The only drawback I can think of is that we rely on a framework feature, so if we someday migrate to another framework, it should support polymorphic associations too. This is not strictly true, because we can always write the SQL manually, filtering by addressable_type = "type", but I don't think that is standard database design.
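To make the "write the SQL manually" remark concrete, here is a small sketch of what the polymorphic layout looks like at the database level, using Python's built-in sqlite3 as a stand-in. The addressable_type and addressable_id columns follow the Rails naming convention for belongs_to :addressable, polymorphic: true; the other table names, columns and sample rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE people    (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE addresses (
        id               INTEGER PRIMARY KEY,
        street           TEXT,
        addressable_type TEXT    NOT NULL,  -- owner's class name, e.g. 'Company'
        addressable_id   INTEGER NOT NULL   -- owner's id; no FK constraint is possible here
    );
    CREATE INDEX idx_addresses_owner
        ON addresses (addressable_type, addressable_id);

    INSERT INTO companies (id, name) VALUES (1, 'Acme');
    INSERT INTO people    (id, name) VALUES (1, 'Alice');
    INSERT INTO addresses (street, addressable_type, addressable_id) VALUES
        ('1 Factory Rd', 'Company', 1),
        ('2 Home St',    'Person',  1);
""")

def addresses_for(owner_type, owner_id):
    """Hand-written equivalent of owner.addresses for a polymorphic owner."""
    rows = conn.execute(
        "SELECT street FROM addresses "
        "WHERE addressable_type = ? AND addressable_id = ? ORDER BY id",
        (owner_type, owner_id),
    )
    return [row[0] for row in rows]

company_addresses = addresses_for("Company", 1)
person_addresses = addresses_for("Person", 1)
print(company_addresses, person_addresses)
```

Note that the company and the person share id 1, so the type column is what keeps their addresses apart. Because addressable_id can point at either table, the database cannot enforce it with a foreign key, which is the main trade-off against option 3.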
3) Join tables, one for each parent. So we have JT_Company_Address and JT_Person_Address, and we use the Rails has_one/has_many :through associations with those join tables.
I think using join tables is closer to standard database design, but it adds some extra tables.
4) Reverse the relationship, making Address the parent and each of the others a child. Each child would then have an address_id FK.
I don't much like this one, as I will mainly handle companies and persons and then look up their address if required, not the opposite (looking at addresses and then looking up what belongs to them). We can always make the association bidirectional using Rails' inverse_of, but at the database layer Address would be the parent in any case.
My favorites at this moment are 2 and 3. I will probably use Rails and not move from it, so the polymorphic looks very easy and clean.
Then, what do you think, which one is the best one? Any advice is welcome.
Thanks.
I'm modeling legal cases with Attorneys, Firms, Judges, Cases and Parties and I'm trying to model the relationship between an Attorney and a Party on a Case. The trouble is that my current model doesn't scope the relationship to a particular case, that is, once an Attorney has a REPRESENTS relationship to a Party, then she is always associated with that Party, even on unrelated Cases. I know that relationships can only have two nodes on them, so how do I model this without creating a SQL-like join table? That is, I want this (even though I can't have it):
(Attorney)-[REPRESENTS]->(Party+Case)
Here's a simplified sketch of my models:
(Attorney {email:, ...})
  -[REPRESENTS]->(Party)
  -[MEMBER_OF]->(Firm)
(Party {name:, ...})
  -[PARTY_IN {role: <plaintiff, defendant, ...>}]->(Case)
(Firm {email_domain:, ...})
(Case {title:, case_number:, court_house:, ...})
(Judge {name:, ...})
  -[PRESIDING_OVER]->(Case)
Look into the use of intermediate nodes. Not a perfect example, but this might help you think through the data model.
http://www.markhneedham.com/blog/2013/10/22/neo4j-modelling-hyper-edges-in-a-property-graph/
The idea is that you might want to create a relationship node that connects the Case, the Party, and the Attorney.
Seems like you may want to break out your PARTY_IN roles as their own nodes.
So you might have something like this:
(:Party)-[:PARTY_AS]->(:Defendant)-[:IN]->(:Case)
(:Attorney)-[:REPRESENTS]->(:Defendant)-[:IN]->(:Case)
You can either use separate labels for :Defendant, :Plaintiff, etc (recommended), or have a more generalized (:Role) with a type field.
If you wanted a query to give you the parties and attorneys for a case, you might use something like:
MATCH (case:Case)
WHERE case.id = 123
WITH case
MATCH (party:Party)-[:PARTY_AS]->(role)-[:IN]->(case)
WITH party, role
MATCH (attorney:Attorney)-[:REPRESENTS]->(role)
RETURN LABELS(role) AS role, COLLECT(attorney) AS attorneys, party
(using collect here as multiple attorneys may represent a party in a case)
I am working on a project that currently has tons of HABTM associations. Essentially, everything is related to everything else. I am considering setting up a single intermediate table/model that has two polymorphic fields. This way, if I add another model I can easily connect it to the remaining models. Is this a good idea? If not, why not? If it is, why don't all rails projects have this kind of intermediate table?
I see two other options. I could keep adding intermediate tables or I could add a table that contains one of each type. The former option is kind of a hassle and the latter option does not allow for self joins.
While a polymorphic join table sounds like it would make things easier, I think you will end up creating more headache for yourself than it's worth. Here are a few potential challenges/problems off the top of my head:
You will not be able to use ActiveRecord's has_and_belongs_to_many association or related helpers without a ton of hacking/monkeypatching, which will immediately eclipse the time it would take to set up individual pairwise link tables.
Your join table will have two id columns, let's call them a_id and b_id. For any given pair of models you will have to ensure that the ids always end up in the same column.
Example: If you have two models called User and Role, you would have to ensure for that pair that the user_id is always stored in col a_id and the role_id is always stored in col b_id, otherwise you will not be able to index the table in any kind of meaningful way (and will run the risk of defining the same relationship twice).
If you ever want to use database enforcement of FOREIGN KEY constraints it is unlikely that this polymorphic link table scheme will be supported.
The universal link table will hold the rows of all n separate link tables combined, so it will be roughly n times larger than any one of them. It shouldn't matter much with good indexing, but as your application and data grow this could become a headache and limit some of your options with regard to scaling. Give your DB a break.
Most or least importantly (I can't decide) you will be bucking the norm which means a lot fewer (if any) resources out there to help you when you run into trouble. Basically the Adam Sandler "they're all gonna laugh at you" rationale.
Last thought: Can you eliminate any of the link tables by using has_many :xxx, :through => :xxx relationships?
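The column-ordering problem described above can be sketched as follows, using Python's built-in sqlite3. The links table, its column names and the link helper are all hypothetical; the sketch only shows that a universal link table needs some canonical ordering rule (here, sorting the two ends) before a unique index can stop the same relationship from being stored twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE links (
        a_type TEXT NOT NULL, a_id INTEGER NOT NULL,
        b_type TEXT NOT NULL, b_id INTEGER NOT NULL,
        UNIQUE (a_type, a_id, b_type, b_id)
    )
""")

def link(left, right):
    """Store a link between two (type, id) pairs in a canonical order.

    Sorting the two ends means ('User', 1) <-> ('Role', 7) always lands in
    the same columns, no matter which side the caller passes first.
    """
    (a_type, a_id), (b_type, b_id) = sorted([left, right])
    conn.execute(
        "INSERT OR IGNORE INTO links (a_type, a_id, b_type, b_id) "
        "VALUES (?, ?, ?, ?)",
        (a_type, a_id, b_type, b_id),
    )

link(("User", 1), ("Role", 7))
link(("Role", 7), ("User", 1))  # same relationship, arguments swapped

link_count = conn.execute("SELECT COUNT(*) FROM links").fetchone()[0]
stored = conn.execute(
    "SELECT a_type, a_id, b_type, b_id FROM links"
).fetchone()
print(link_count, stored)
```

Every piece of code that writes to the table has to go through a helper like this, which is part of the extra upkeep that separate pairwise link tables avoid.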
Thinking it all through, you could actually do this, but I wouldn't. Join tables grow fast enough as it is, and I like to keep model relationships simple and easy to alter.
I'm used to working on very large systems / data sets though, so if you're going to have much in each join then OK. I'd still do it separately for joins, however, and I really like my polymorphics.
I think it would be cleaner and more flexible if you were to use multiple join tables as opposed to one giant multipurpose join table.
I'm using MS SQL Server 2008R2, but I believe this is database agnostic.
I'm redesigning some of my sql structure, and I'm looking for the best way to set up 1 to many relationships.
I have 3 tables, Companies, Suppliers and Utilities, any of these can have a 1 to many relationship with another table called VanInfo.
A van info record can either belong to a company, supplier or utility.
I originally had a company_id in the VanInfo table that pointed to the company table, but then when I added suppliers, they needed vaninfo records as well, so I added another column in VanInfo for supplier_id, and set a constraint that either supplier_id or company_id was set and the other was null.
Now I've added Utilities, and now they need access to the VanInfo table, and I'm realizing that this is not the optimum structure.
What would be the proper way of setting up these relationships? Should I just continue adding foreign keys to the VanInfo table, or set up some sort of cross reference table?
The application isn't technically live yet, but I want to make sure that this is set up using the best possible practices.
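For reference, the either-or constraint described in the question, extended to a third parent, might look roughly like the sketch below. It uses Python's built-in sqlite3 as a stand-in for SQL Server; the utility_id column and the try_insert helper are assumptions, and the CHECK relies on SQLite treating comparisons as 0/1 integers (SQL Server would need CASE expressions instead).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE van_info (
        id          INTEGER PRIMARY KEY,
        company_id  INTEGER,
        supplier_id INTEGER,
        utility_id  INTEGER,
        -- exactly one of the three parent columns must be set
        CHECK ((company_id  IS NOT NULL)
             + (supplier_id IS NOT NULL)
             + (utility_id  IS NOT NULL) = 1)
    )
""")

def try_insert(company_id, supplier_id, utility_id):
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        conn.execute(
            "INSERT INTO van_info (company_id, supplier_id, utility_id) "
            "VALUES (?, ?, ?)",
            (company_id, supplier_id, utility_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False

one_parent = try_insert(1, None, None)      # accepted
two_parents = try_insert(1, 2, None)        # rejected: two parents set
no_parent = try_insert(None, None, None)    # rejected: no parent set
print(one_parent, two_parents, no_parent)
```

Each new kind of parent means another nullable column and a longer CHECK, which is why this layout gets harder to maintain as the list grows.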
UPDATE:
Thank you for all the quick responses.
I've read all the suggestions and checked out all the links. My main criterion is something that is easy to modify and maintain, as clients' requirements always tend to change without much notice. After studying, research and planning, I think it is best to go with a cross reference table of sorts named Organizations, with 1 to 1 relationships between Companies/Utilities/Suppliers and the Organizations table, allowing a clean relationship to the VanInfo table. This is going to be easy to maintain and will still properly model my business objects.
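A minimal sketch of the Organizations layout described in this update, again using Python's built-in sqlite3 as a stand-in for SQL Server. Only the table names come from the post; the kind discriminator, the name and van_code columns, and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE organizations (
        id   INTEGER PRIMARY KEY,
        kind TEXT NOT NULL CHECK (kind IN ('company', 'supplier', 'utility'))
    );
    -- Each subtype shares its primary key with organizations (1 to 1).
    CREATE TABLE companies (
        organization_id INTEGER PRIMARY KEY REFERENCES organizations(id),
        name TEXT
    );
    CREATE TABLE suppliers (
        organization_id INTEGER PRIMARY KEY REFERENCES organizations(id),
        name TEXT
    );
    CREATE TABLE utilities (
        organization_id INTEGER PRIMARY KEY REFERENCES organizations(id),
        name TEXT
    );
    -- VanInfo needs a single foreign key, whatever the owner's type.
    CREATE TABLE van_info (
        id              INTEGER PRIMARY KEY,
        organization_id INTEGER NOT NULL REFERENCES organizations(id),
        van_code        TEXT
    );

    INSERT INTO organizations (id, kind) VALUES (1, 'company'), (2, 'utility');
    INSERT INTO companies (organization_id, name) VALUES (1, 'Acme');
    INSERT INTO utilities (organization_id, name) VALUES (2, 'City Water');
    INSERT INTO van_info (organization_id, van_code) VALUES (1, 'VAN-1'), (2, 'VAN-2');
""")

rows = conn.execute("""
    SELECT o.kind, v.van_code
    FROM van_info v
    JOIN organizations o ON o.id = v.organization_id
    ORDER BY v.id
""").fetchall()

# A van_info row pointing at a non-existent organization is rejected.
try:
    conn.execute(
        "INSERT INTO van_info (organization_id, van_code) VALUES (999, 'VAN-X')"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(rows, orphan_rejected)
```

Adding a fourth kind of owner then means one new subtype table and one new allowed value for kind, while van_info stays unchanged.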
With your example I would always go for 'some sort of cross reference table' - adding columns to the VanInfo table smells.
Ultimately you'll have more joins in your SP's but I think the overhead is worth it.
When you design a database you should not think about where the primary/foreign key goes, because those are concepts that don't belong to the design stage. I know it sounds weird, but you should not think about tables either! (You could implement your E/R model using XML, files, whatever.)
Sticking to E/R relationship design, you should just identify your entities (in your case Company/Supplier/Utilities/VanInfo) and then think about what kind of relationship there is between them (if there is any). For example, you said a company can have one or more VanInfo records, but a VanInfo record can belong to only one company. That is a one-to-many relationship, as you have already guessed. When you then "convert" your design model (a one-to-many relationship) to database tables, you will know where to put the keys and foreign keys. In a one-to-many relationship the foreign key goes on the "many" side. In this case VanInfo will have a foreign key to Company (so the VanInfo table will contain the company id). You have to follow the same approach for all the other tables.
Have a look at the link below:
https://homepages.westminster.org.uk/it_new/BTEC%20Development/Advanced/Advanced%20Data%20Handling/ERdiagrams/build.htm
Consider making the Com, Sup and Util PKs a GUID; this should be enough to solve the problem. However, this situation may be a good indicator of poor database design, but to propose a different solution one would need to know the broader database context, i.e. what you are trying to achieve. To me it seems that VanInfo should be a separate entity for each of the tables (yes, exact duplicates like Com_VanInfo, Sup_VanInfo etc.), unless VanInfo is shared between these entities (in which case the relationships should be inverted, i.e. Com, Sup and Util should contain an FK to VanInfo).
Your database basically needs normalization, and I think your database should be in fifth normal form, where you have two tables linked by a third table. Please see this article; it will help you:
http://en.wikipedia.org/wiki/Fifth_normal_form
You may also want to see this, database normalization:
http://en.wikipedia.org/wiki/Database_normalization