Where should I write the script that fills new data in my updated schema? - ruby-on-rails

I am working on a web app that already has a schema in place (aka in prod) with a certain number of tables (A, B, C...).
Table A has an attribute that corresponds to an enum from table B. Problem is that I can have only one item of that enum list in my dedicated column in table A. But I want my objects from table A to have many of them. So I created a join table A_B and a has_many through association with my table B.
The first consequence is that I need to fill my join table with data from the previous schema architecture. To be clearer, there were objects from table A that were associated with one element of the table B enum. I need to carry these simple relationships (only one element from the enum list in table B is associated with each table A object) over into my newly created join table.
Here's the type of things I'd like to do:
list_of_ids = []
Model_A.where(attribute: 0).each { |r| list_of_ids << r.id }
list_of_ids.each { |el| A_B.create(tableA_id: el, tableB_id: 0) }
Where should I write and execute these lines of code that will update my data?

As stated in my comments, I would put this "data-update" logic in the same migration file that creates the join table.
Why in a migration file?
- the data conversion needs to be done only once, AFTER you have created the join table and BEFORE you remove the column which holds the foreign key. If you do this data conversion after the column removal, you will get an error saying that your code is trying to access a column that does not exist anymore.
- the migration is responsible for changing the DB structure AND for the data integrity.
Why not a rake task?
- rake tasks are meant to be run several times, not only once. The usual tasks are "send_emails", "update_expiration_dates", "compute_cache", "close_inactive_users_account", etc.
- the data you have before the conversion (has_many -> HABTM) has to be updated to follow the new structure. A rake task might never be run, in which case your data would not be updated and you would lose the association between your models (removing the foreign key column before running the rake task would make you lose this data).
- your data-conversion logic must happen after the migration creating the join table and BEFORE the migration removing the foreign key's column. There is no clear way to tell your Rails app: "do this migration, then stop, run this task, then do this migration". It will run the migrations consecutively. If you have these 2 migrations pending, they will be run at once unless you specify otherwise (which is not common at all), and then your rake task will be useless because it relies on the foreign key column still existing.
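The ordering argued for here (create the join table, copy the data, only then drop the old column) can be sketched in plain Ruby, with hashes standing in for table rows so that it runs without a database. This is only an illustration under assumed names: migrate_to_join_table, b_value, table_a_id and table_b_id are hypothetical. In a real app the three steps would sit in one migration, using create_table, a copy loop like the one in the question, and remove_column.

```ruby
# Plain-Ruby stand-in for the migration steps. Each hash plays the role of
# a row in table A; :b_value is a hypothetical name for the old column.
def migrate_to_join_table(table_a_rows)
  # Step 1: create the (empty) join table.
  join_rows = []

  # Step 2: copy each existing single association while the old column
  # is still there to read from.
  table_a_rows.each do |row|
    next if row[:b_value].nil? # nothing to carry over for this row
    join_rows << { table_a_id: row[:id], table_b_id: row[:b_value] }
  end

  # Step 3: only now remove the old column.
  migrated_rows = table_a_rows.map { |row| row.reject { |key, _| key == :b_value } }

  [migrated_rows, join_rows]
end

rows = [{ id: 1, b_value: 0 }, { id: 2, b_value: 2 }, { id: 3, b_value: nil }]
migrated, joins = migrate_to_join_table(rows)
p joins    # one join row per record that had a value
p migrated # the same records, without the old column
```

Swapping steps 2 and 3 would leave nothing to copy from, which is the failure the answer warns about.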

In addition to MrYoshiji's answer (too long for comments):
Migrations are intended for one-off changes to both database schemas and the data therein. The official guide mentions this, too.
In your case, populating the join table is an appropriate thing to do in a migration. Otherwise data would be left in an invalid or incorrect state vis-a-vis the schema and model changes you have made. Running migrations is a typical step in deploying a Rails app. Since you probably do not want to deploy without updating the data, having it as part of a migration is a great solution. In contrast, if you created a custom rake task to update the data then you would need to remember to run it manually after deploying or add it as a deployment step, neither of which is a very good option.

Related

Make a new column (migration) between two other columns?

Is there a quick/easy way to make a migration that adds a new column between two existing columns?
Note: I googled and couldn't find an obvious answer. But I also am curious if it's even good practice (since if for some reason other column/s were removed, then the migration may fail?)
If you haven't committed your changes to production, you can reorder migrations by rolling them back and then changing the timestamps in the file names.
If that's not an option, you can actually reorder the columns on some databases directly with SQL, even though it's not part of the migrations DSL. Migrations are, after all, really just a DSL to create SQL strings and run them in a repeatable way across environments.
If you can't generate the SQL you want with the DSL, you can always use execute to run a raw SQL string.
# MySQL example
class ReorderYourTableName < ActiveRecord::Migration[5.2]
  def change
    execute "alter table yourTableName change column yourColumnName yourColumnName dataType after yourSpecificColumnName;"
  end
end
However on some DBs you can't actually reorder the columns without extensive steps of creating new columns and shuffling the data around.

What is the proper way of merging multiple postgres schemas into a single schema preserving the foreign key relationships?

We have a SaaS platform written in Rails that uses Postgres schema-based multitenancy and the Apartment gem. The schemas are identical, with the same tables and the same columns in each table. We want to migrate to a foreign-key-based multitenant system, merging all the records from the different schemas into a single schema and identifying each record with a tenant_id. What is the proper way of merging all the records from the different schemas while preserving the foreign key relationships?
This is a situation that will need care. I think (I could be wrong) that the best approach is to add tenant_id and original_id columns to all tables. Before attempting the merge, populate original_id in every table with the id of that record. Essentially, this keeps a record of what the value of id was before the merging.
After merge you can then run a rake task that rebuilds the associations. So if you had...
class Foo
  has_many :bars
end
Your migration script would then do (after the merge):
Bar.find_each do |bar|
  # Look up the merged parent by tenant and pre-merge id
  foo = Foo.find_by(tenant_id: bar.tenant_id, original_id: bar.foo_id)
  bar.update_column(:foo_id, foo.id)
end
You'd need to do something similar for every relation, so it's a bit of a slog.
Hopefully, someone else will come up with a better solution.
NOTE THIS IS NOT IDEMPOTENT. If it errors, you can't restart it except by redoing the merging completely.
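As a rough illustration of the remapping idea, here is a plain-Ruby sketch that uses hashes in place of Foo and Bar records, so it runs without a database. The function name remap_foo_ids and the sample data are hypothetical. It builds one lookup table keyed by (tenant_id, original_id) instead of querying once per child row, and it shares the limitation noted above: running it twice on the same data would look up the wrong ids.

```ruby
# Remap child foreign keys after a merge. `foos` and `bars` are arrays of
# hashes standing in for merged records (hypothetical data, no database).
def remap_foo_ids(foos, bars)
  # Index merged parents by their tenant and their pre-merge id.
  new_id_for = foos.to_h { |foo| [[foo[:tenant_id], foo[:original_id]], foo[:id]] }

  bars.map do |bar|
    key = [bar[:tenant_id], bar[:foo_id]]
    # fetch raises KeyError when a parent is missing, rather than
    # silently writing nil into the foreign key.
    bar.merge(foo_id: new_id_for.fetch(key))
  end
end

foos = [
  { id: 101, tenant_id: 1, original_id: 1 },
  { id: 102, tenant_id: 2, original_id: 1 },
]
bars = [{ id: 201, tenant_id: 2, foo_id: 1 }]

p remap_foo_ids(foos, bars) # the bar now points at foo 102, not foo 101
```

Both tenants had a foo with id 1 before the merge, which is why tenant_id has to be part of the lookup key.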

How to create postgres database for Rails app manually?

How can I create postgres db for Rails app properly but in psql, not via rake db:create?
I mean, one can always write CREATE DATABASE project_name, but I don't know what happens in that rake task under the hood. Maybe there are a lot of additional params.
Update
After the first answer I decided to clarify: I know how to write and use migrations, they are awesome, but my question is not about them. It's about the rake db:create task and the pg adapter.
In other words, I just want to know which command in psql is equal to rake db:create.
If you select the database in pgAdmin III, it will show you the SQL statements used to create it, including the locale settings. They are very important if you use full-text indexes. You must run them while connected to the postgres database.
Rails expects table names to match model names but be plural and snake_case. For example, a User model will store records in a users table and a BlogEntry model will store records in a blog_entries table.
Rails expects a table's primary key to be named id and it expects foreign keys to match model names but be snake_case and end with _id. For example, if BlogEntry belongs_to User, Rails will expect the blog_entries table to have a user_id column.
Join tables (such as used with many-to-many relations) are expected to be named with the two models' names in plural snake case and alphabetical order. For example, a join table describing a many-to-many relation between a User model and a Blog model would be expected to have the name blogs_users and have, at the least, the columns blog_id and user_id.
Those are the basics. Of course, all of this is configurable: For example, you can use the table_name class method to tell a model to use a table with a different name, and the relation methods (belongs_to, has_many, etc.) all take options letting you specify different names.
Apart from these naming conventions Rails doesn't require anything special from a database, as long as the correct credentials and configuration are specified in config/database.yml.
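The foreign key and join table conventions described above can be mimicked in a few lines of plain Ruby. This is a simplified sketch, not Rails's own implementation: the helper names are hypothetical, snake_case is a naive stand-in for ActiveSupport's underscore, and the join table helper expects table names that are already plural.

```ruby
# Naive snake_case conversion (Rails itself uses ActiveSupport's underscore).
def snake_case(name)
  name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

# Foreign key column Rails would expect when a model belongs_to `model_name`.
def foreign_key_for(model_name)
  "#{snake_case(model_name)}_id"
end

# Join table name built from two already-pluralized table names,
# in alphabetical order.
def join_table_name(table_a, table_b)
  [table_a, table_b].sort.join("_")
end

puts foreign_key_for("User")           # prints user_id
puts foreign_key_for("BlogEntry")      # prints blog_entry_id
puts join_table_name("users", "blogs") # prints blogs_users
```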

Model/Migration creation order matters in Rails?

I have generated 4 models in Rails. Some of them require foreign key columns that reference each other.
Suppose I have a table User and another table Post. Do I need to generate the models in order, e.g. User and then Post, so that the Posts table can contain a user_id column?
I am not running rake db:migrate now. What I'm doing is generating models and specifying the columns that might be necessary.
What I want to know is: does rake db:migrate take care of the order automatically, so that I can create the models in any order? Or, since migration files have timestamps in their file names, will they be processed in order of creation and give me an error, e.g. user_id foreign key dependency, table Users not found?
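Migrations run in the order of the timestamps in their file names, not in any dependency order. A column such as user_id is, by default, just an integer column with no database-level foreign key constraint, so as long as you are not adding such constraints or creating dependent data for the models in the migrations, it will work whichever order you generate the models in.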
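To make the ordering concrete: pending migrations are applied in ascending order of the timestamp prefix in their file names. The sketch below reproduces only that sorting in plain Ruby; the function name and the file names are hypothetical examples.

```ruby
# Sort migration file names by their leading timestamp, which is the order
# in which pending migrations are applied.
def migration_order(filenames)
  filenames.sort_by { |name| File.basename(name)[/\A\d+/].to_i }
end

files = [
  "db/migrate/20240102120000_create_posts.rb",
  "db/migrate/20240101120000_create_users.rb",
]

puts migration_order(files)
# prints the create_users file first, then the create_posts file
```

So if create_posts declared a real foreign key constraint on users, it would need the later timestamp, as it has here.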

When to assign a foreign key and when to just create an association?

I'm learning about model associations in rails. I've learned how to give the table for a model a column to hold foreign keys of another model like so:
rails generate model User account_id:integer
I would then take the primary key of an account from an Account table and assign it to the account_id for the designated user.
However, I am also told to create the association in User.rb like so:
has_one :account
I understand the difference between these two things. One creates a column in the table (the first line after the migration), and the other generates a series of helpers (the latter).
However, what I am seeing in tutorials is that sometimes both are done, while other times, only the association (has_one :account) is done. How do I go about deciding when to create a column in the table to hold foreign keys, and when to just create the association in the model .rb file?
You will always have to create the column in the database (1)
It is not mandatory to define the relation(s) inside the models, but it is highly recommended (2)
(1): The database needs this foreign key column to be able to retrieve the corresponding record. Without the column, the DB cannot find the related record. You can use a migration to create this column, not only a scaffold.
(2): You can skip the relation declarations in the models, but declaring them is highly recommended because:
- it generates methods corresponding to the relations (e.g. if User belongs_to :role, you can call user.role directly instead of Role.where(id: user.role_id).first)
- not everyone can remember all the associations. It is better to list everything that is linked to your model
You asked:
How do I go about deciding when to create a column in the table to hold foreign keys, and when to just create the association in the model .rb file?
I would answer:
Always create the column (cannot work if you don't) AND define every association(s) (relation(s)) inside the model.
1. Your first example is a generator command that creates the model file and the migration.
2. has_one helps ActiveRecord understand the relationships between the tables so that it can, among other things, generate proper SQL queries for you.
But #1 has to be in place for #2 to even work. However, creating the db column via a generator command isn't necessary - it's just convenient sometimes, because it creates both the model file and the associated migration.
You could just write the migration by hand. Just because whatever thing you're following doesn't always mention adding the db column doesn't mean it's not necessary. It's probably just assumed that you've already done it because these foreign key migrations are so common after you get up and running with Rails that they sort of go without saying after a while.
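Why the column has to exist can be seen from a plain-Ruby imitation of what an association helper does: it reads the foreign key stored on one record and uses it to find the other. The Struct definitions, the ACCOUNTS list and account_for are hypothetical stand-ins, not ActiveRecord. With the key stored on the users table as in the question, the corresponding declaration on User would typically be belongs_to :account.

```ruby
# Stand-ins for two tables: users carries the foreign key account_id.
Account = Struct.new(:id, :name)
User = Struct.new(:id, :account_id)

ACCOUNTS = [Account.new(1, "Basic"), Account.new(2, "Pro")].freeze

# Roughly what an association helper such as user.account does:
# follow the stored foreign key to the matching record.
def account_for(user)
  ACCOUNTS.find { |account| account.id == user.account_id }
end

user = User.new(10, 2)
puts account_for(user).name # prints Pro
```

Without the account_id value there is nothing to follow, no matter what the model file declares.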
