For example, in this migration I have a one-to-many relation "Category has many Subcategories". If I don't put "add_index :subcategories, :category_id", it works anyway.
class CreateSubcategories < ActiveRecord::Migration
  def change
    create_table :subcategories do |t|
      t.string :nombre
      t.string :descripcion
      t.integer :category_id
      t.timestamps
    end
    add_index :subcategories, :category_id
  end
end
To validate the foreign key I use this:
validates :category, presence: true
It's advisable to add an index on such a column, since you will presumably be performing many lookups across the two tables. In a relational database, the category_id column is a foreign key on the subcategories table that references the id column of the categories table. You'll find more information on database indexes at Wikipedia (quickest available reference).
Sure, you could skip creating the index for this column, but you would eventually pay a performance penalty. It would work without the index, but I assume you also want an application that is good usability-wise, usability in terms of performance. When your table grows large, you'll start noticing that data retrieval involving joins across two or more tables, categories and subcategories in this case, is relatively slower.
One can argue that there is a performance penalty for maintaining an index, i.e. the DBMS has to do extra work on every write. So it really comes down to your business requirements: do you perform more data retrievals or more data writes? If you have more retrievals, then definitely go for the index; if you think there won't be many reads and mostly writes, and your application can live with that (less likely), then you could skip it.
Given that you're validating the category's presence, I would definitely add the index.
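For illustration, the query that the index speeds up is the one Rails issues whenever you fetch a category's subcategories. A minimal sketch (assuming the Subcategory model that goes with the migration above):
# Loading the subcategories of category 4 issues roughly:
#   SELECT "subcategories".* FROM "subcategories" WHERE "subcategories"."category_id" = 4
# Without the index this is a full table scan; with it, a fast index lookup.
subcategories = Subcategory.where(category_id: 4)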
Related
I have a Ruby on Rails 4 app, using Devise, with a User model and a Deal model.
I am creating a user_deals table for a many-to-many relationship between User and Deal.
Here is the migration
class CreateUserDeals < ActiveRecord::Migration
  def change
    create_table :user_deals do |t|
      t.belongs_to :user
      t.belongs_to :deal
      t.integer :nb_views
      t.timestamps
    end
  end
end
When a user loads a Deal (for example Deal id = 4), I use a method called show
controllers/deal.rb
#for the view of the Deal page
def show
end
In the view of this Deal id = 4 page, I need to display the number of views (nb_views) that Devise's current_user has for the Deal page the user is currently on.
deal/show.html
here is the nb of views of user: <% current_user.#{deal_id}.nb_views%>
Let's say I have 10M+ user_deals rows; I wanted to know if I should use an index
add_index :user_deals, :user_id
add_index :user_deals, :deal_id
or maybe
add_index :user_deals, [:user_id, :deal_id]
Indeed, in other situations I would have said yes, but here I don't know how Rails works behind the scenes. It feels as if Rails knows what to do without me needing to speed up the process, as if when Rails loads this view there is no SQL query (such as 'find the nb of views WHERE user_id = x AND deal_id = y'), because I'm doing it just for the current user who is logged in (via Devise's current_user), and Rails knows the deal_id as we are on the very page of this deal (show page), so I just pass it as a parameter.
So do I need an index to speed it up or not?
Your question on indexes is a good one. Rails does generate SQL* to do its magic so the normal rules for optimising databases apply.
The magic of Devise only extends to current_user. It fetches their details with a SQL query, which is efficient because the users table created by Devise has helpful indexes on it by default. But those aren't the indexes you'll need.
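For reference, the standard Devise install generates a users migration containing unique indexes roughly like these (a sketch; your generated migration may differ):
add_index :users, :email,                unique: true
add_index :users, :reset_password_token, unique: true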
Firstly, there's a neater, more idiomatic way to do what you're after:
class CreateUserDeals < ActiveRecord::Migration
  def change
    # The default join table name would be deals_users;
    # keep user_deals to match the UserDeal model.
    create_join_table :users, :deals, table_name: :user_deals do |t|
      t.integer :nb_views
      t.index [:user_id, :deal_id]
      t.index [:deal_id, :user_id]
      t.timestamps
    end
  end
end
You'll notice that migration includes two indexes. If you never expect to list all users for a given deal, then you won't need the second of those indexes. However, as @chiptuned says, indexing each foreign key is nearly always the right call. An index on an integer column costs few write resources but pays out big savings on read. It's a very low-cost, defensive default position to take.
You'll have a better time and things will feel clearer if you put your data-fetching logic in the controller. Also, you're showing a deal, so it will feel right to make that, rather than current_user, the centre of your data fetch.
You can actually do this query without using the through association because you can do it without touching the users table. (You'll likely want that through association for other circumstances though.)
Just has_many :user_deals will do the job for this.
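For clarity, the association setup this assumes would look roughly like this (model names taken from the question; the through association is the optional part):
class Deal < ActiveRecord::Base
  has_many :user_deals
end

class UserDeal < ActiveRecord::Base
  belongs_to :user
  belongs_to :deal
end

class User < ActiveRecord::Base
  has_many :user_deals
  has_many :deals, through: :user_deals # optional, for other circumstances
end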
To best take advantage of the database engine and do this in one query your controller can look like this:
def show
  @deal = Deal.includes(:user_deals)
              .joins(:user_deals)
              .where("user_deals.user_id = ?", current_user.id)
              .find(params["deal_id"])
end
Then in your view...
I can get info about the deal: <%= @deal.description %>
And thanks to the includes I can get user nb_views without a separate SQL query:
<%= @deal.user_deals.first.nb_views %>
* If you want to see what SQL Rails is magically generating, just put .to_sql on the end of a relation, e.g. sql_string = current_user.deals.to_sql. (.to_sql works on relations; a single record fetched with find won't respond to it.)
Yes, you should use an index to speed up the querying of the user_deals records. Definitely at least on user_id, but probably the composite [:user_id, :deal_id] as you stated.
As for why you don't see a SQL query...
First off, your code in the view appears to be incorrect. nb_views lives on the join model, so assuming you have set up has_many :user_deals (and optionally has_many :deals, through: :user_deals) on your User class, it should be something like:
here is the nb of views of user: <%= current_user.user_deals.find_by(deal_id: deal_id).nb_views %>
If you see the right number showing up for nb_views, then a query should be made when the view is rendered, unless current_user's user_deals are already loaded earlier in the processing or you've got some kind of caching going on.
If Rails is "aware", there is some kind of reason behind it which you should figure out. Expected base Rails behavior is to have a SQL query issued there.
Isn't a cleaner way of indexing these columns the following? (Note that in Rails 4, t.references does not create an index unless you pass index: true; Rails 5 adds it by default.)
class CreateUserDeals < ActiveRecord::Migration
  def change
    create_table :user_deals do |t|
      t.references :user, index: true
      t.references :deal, index: true
      t.integer :nb_views
      t.timestamps
    end
  end
end
Firstly, can anyone explain how a unique index works in databases?
Suppose I have a User model with a name column, and I add a unique index on it, but in the model (user.rb) I just have a presence validator on the name field.
So now when I try to create two users with the same name, I get a PGError:
duplicate key value violates unique constraint "index_users_on_name"
So it looks to me like the unique index works the same as a uniqueness validator(?)
If so, then what about foreign keys?
Let's say I have a Post model with a belongs_to :user association, and User has_many :posts.
And a foreign key user_id in the posts table with a unique index. Then multiple posts could not have the same user_id.
Can someone explain how a unique index works?
I'm on Rails 4 with Ruby 2.0.0.
Here is the difference between a unique index and validates_uniqueness_of.
This is a patch to enable ActiveRecord to identify db-generated errors for unique constraint violations. For example, it makes the following work without declaring a validates_uniqueness_of:
create_table "users" do |t|
t.string "email", null: false
end
add_index "users", ["email"], unique: true
class User < ActiveRecord::Base
end
User.create!(email: 'abc#abc.com')
u = User.create(email: 'abc#abc.com')
u.errors[:email]
=> "has already been taken"
The benefits are speed, ease of use, and completeness:
Speed
With this approach you don't need to do a db lookup to check for uniqueness when saving (which can sometimes be quite slow when the index is missed; see https://rails.lighthouseapp.com/projects/8994/tickets/2503-validate... ). If you really care about uniqueness you're going to have to use database constraints anyway, so the database will validate uniqueness no matter what, and this approach removes an extra query. Checking the index twice isn't a problem for the DB (it's cached the 2nd time around), but saving a DB round-trip from the application is a big win.
Ease of use
Given that you have to have db constraints for true uniqueness anyway, this approach will let everything just happen automatically once the db constraints are in place. You can still use validates_uniqueness_of if you want to.
Completeness
validates_uniqueness_of has always been a bit of a hack -- it can't handle race conditions properly and results in exceptions that must be handled using somewhat redundant error handling logic. (See "Concurrency and integrity" section in http://api.rubyonrails.org/classes/ActiveRecord/Validations/ClassMe...)
validates_uniqueness_of is not sufficient to ensure the uniqueness of a value. The reason is that in production, multiple worker processes can cause race conditions:
1. Two concurrent requests try to create a user with the same name (and we want user names to be unique)
2. The requests are accepted on the server by two worker processes that will now process them in parallel
3. Both requests scan the users table and see that the name is available
4. Both requests pass validation and create a user with the seemingly available name
For a clearer understanding, please check this.
If you create a unique index for a column, it means you're guaranteed the table won't have more than one row with the same value for that column. Using only the validates_uniqueness_of validation in your model isn't enough to enforce uniqueness, because there can be concurrent users trying to create the same data.
Imagine that two users try to register accounts with the same email, where you have added validates_uniqueness_of :email in your User model. If they hit the "Sign up" button at the same time, Rails looks in the users table for that email and responds back that everything is fine and it's OK to save the record. Rails then saves both records to the users table with the same email, and now you have a really shitty problem to deal with.
To avoid this you need to create a unique constraint at the database level as well:
class CreateUsers < ActiveRecord::Migration
  def change
    create_table :users do |t|
      t.string :email
      ...
    end
    add_index :users, :email, unique: true
  end
end
So by creating the index_users_on_email unique index you get two very nice benefits: data integrity and good performance, because unique indexes tend to be very fast.
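With the unique index in place, the loser of a race gets a database error instead of silently writing a duplicate. A minimal sketch of catching it (the create_user_account helper is hypothetical, purely for illustration):
def create_user_account(email)
  User.create!(email: email)
rescue ActiveRecord::RecordNotUnique
  # The database rejected the duplicate row; recover gracefully,
  # e.g. by returning the record that won the race.
  User.find_by(email: email)
end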
If you put unique: true on user_id in your posts table, then it will not allow duplicate records with the same user_id to be entered.
A DB unique index, and I quote from this SO question, is:
an index on that column that also enforces the constraint that you cannot have two equal values in that column in two different rows
The RoR uniqueness validation does the same, but at the application level, meaning the following scenario could (rarely, but easily) happen:
1. User A submits form
2. Rails checks database for existing ID for User A: none found
3. User B submits form
4. Rails checks database for existing ID for User B: none found
5. Rails saves User A's record
6. Rails saves User B's record
This happened to me a month ago, and I was advised to solve it using a DB unique index in this SO question.
By the way this workaround is well documented in Rails:
The best way to work around this problem is to add a unique index to the database table using ActiveRecord::ConnectionAdapters::SchemaStatements#add_index. In the rare case that a race condition occurs, the database will guarantee the field's uniqueness.
As far as uniqueness validation goes:
Uniqueness validates that the attribute's value is unique right before the object gets saved. It does not create a uniqueness constraint in the database, so it may happen that two different database connections create two records with the same value for a column that you intend to be unique. To avoid that, you must create a unique index on both columns in your database.
Also, if you just have validates_uniqueness_of at the model level, then duplicate inserts are blocked from the Rails side BUT not at the database level.
SQL queries run directly through dbconsole would insert duplicate records without any problem.
When you say you created a foreign key with an index on user_id in the posts table: by default Rails only creates a plain index on it, NOT a unique index. For a 1-M relationship there is no point in a unique index in your case.
If you had unique: true on user_id in your posts table, then there would be no way for duplicate records with the same user_id to get through.
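To make the distinction concrete, here is what each form would mean for the posts table (a sketch; either line would go in a migration):
# Plain index: many posts may share the same user_id; lookups by user stay fast.
add_index :posts, :user_id

# Unique index: at most ONE post per user_id, which only suits a 1-1 association.
add_index :posts, :user_id, unique: true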
What's the best way of searching across multiple ActiveRecord models without using something like Sphinx or Solr?
I use IndexTank http://indextank.com/
The simplest solution is to inherit all the models from one base model and store them in a single table. However, that's a bad fit if you have an existing schema or if the models differ much.
Another (and, I think, much better) solution, if your database allows it, is to use UNION in your SQL to get results from multiple tables. In that case you should use find_by_sql.
For example, if you have Post and Question models and want to list both in one list (and possibly filter them by some conditions), you should:
Add a type field to each table with a default value matching the model name. For example:
create_table :questions do |t|
  t.text :text
  t.string :type, :null => false, :default => 'Question'
  t.timestamps
end
Query both models as follows:
Post.find_by_sql("SELECT id, type, created_at, updated_at, title, description, NULL AS text FROM posts
  UNION SELECT id, type, created_at, updated_at, NULL AS title, NULL AS description, text FROM questions
  ORDER BY created_at DESC")
Using the type field, Rails will distinguish between the models and return a list of both Post and Question records, so you can search (or paginate) across both.
Assuming you are using a relational database, writing raw SQL statements is probably your best bet.
Take a look at http://api.rubyonrails.org/classes/ActiveRecord/Base.html#method-c-find_by_sql
Item.find_by_sql("SELECT items.name, users.last_name, products.reference_number ... JOIN... ")
...would return a collection of Item instances, but with the attributes returned by the SQL query. So in the above example each Item returned would have name, last_name and reference_number attributes, even though they are not really attributes of an Item.
Is it generally better practice (and why) to validate attributes in the model or in the database definition?
For (a trivial) example:
In the user model:
validates_presence_of :name
versus in the migration:
t.string :name, :null => false
On the one hand, including it in the database seems more of a guarantee against any type of bad data sneaking in. On the other hand, including it in the model makes things more transparent and easier to understand by grouping it in the code with the rest of the validations. I also considered doing both, but this seems both un-DRY and less maintainable.
I would highly recommend doing it in both places. Doing it in the model saves you a database query (possibly across the network) that will essentially error out, and doing it in the database guarantees data consistency.
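To illustrate the saved round-trip, a quick sketch (assuming a User model with validates_presence_of :name and a NOT NULL name column, as in the example above):
u = User.new(name: nil)
u.save # => false: the model validation fails in Ruby, so no INSERT is ever sent

# Without the model validation, save would send the INSERT and the database
# would reject it, surfacing as an ActiveRecord::StatementInvalid exception.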
Also note that
validates_presence_of :name
is not the same as
t.string :name, :null => false
If you just set a NOT NULL column in your DB, you can still insert a blank value (""). If you're using the model's validates_presence_of, you can't.
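A quick console sketch of that difference (assuming a users table whose name column is NOT NULL and a model with no validations):
User.create(name: "")  # saved: "" is not NULL, so the database allows it
User.create(name: nil) # raises: blocked by the database's NOT NULL constraint

# With validates_presence_of :name in the model, both are rejected,
# since presence treats blank strings as missing too.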
It is good practice to do both. Model validation is user-friendly, while database validation adds a last-resort component that hardens your code and reveals missing validations in your application logic.
It varies. I think that simple, data-related validation (such as string lengths, field constraints, etc...) should be done in the database. Any validation that is following some business rules should be done in the model.
I would recommend the Migration Validators project ( https://rubygems.org/gems/mv-core ) to define validations at the db level and then transparently promote them to your ActiveRecord model.
Example:
in migration:
def change
  create_table :posts do |t|
    t.string :title, length: 1..30
  end
end
in your model:
class Post < ActiveRecord::Base
  enforce_migration_validations
end
As a result you will have two levels of data validation. The first is implemented in the db (as a condition in a trigger or a check constraint) and the second as an ActiveModel validation in your model.
It depends on your application design.
If you have a small or medium-sized application, you can do it in both places or just in the model.
But if you have a large application, probably service-oriented or layered, then put basic validation (i.e. mandatory/nullable, min/max length, etc.) in the database, and stricter rules (i.e. patterns or business rules) in the model.
I would like to confirm that the following analysis is correct:
I am building a web app in RoR. I have a data structure for my Postgres db designed (around 70 tables; this design may need changes and additions during development to reflect Rails ways of doing things, e.g. I designed some user and role tables, but if it makes sense to use Restful Authentication, I will scrub them and replace them with whatever RA requires).
I have a shell script which calls a series of .sql files to populate the empty database with tables and initial data (e.g. Towns gets pre-filled with post towns) as well as test data (e.g. Companies gets a few dummy companies so I have data to play with).
for example:
CREATE TABLE towns (
  id integer PRIMARY KEY DEFAULT nextval('towns_seq'),
  county_id integer REFERENCES counties ON DELETE RESTRICT ON UPDATE CASCADE,
  country_id integer REFERENCES countries ON DELETE RESTRICT ON UPDATE CASCADE NOT NULL,
  name text NOT NULL UNIQUE
);
Proposition 0: Data lasts longer than apps, so I am convinced that I want referential integrity enforced at the DB level as well as validations in my RoR models, despite the lack of DRYNESS.
Proposition 1: If I replace the script and .sql files with migrations, there is currently no way to tell my Postgres database, from within the migration code, about the foreign key and other constraints I currently set in my SQL DDL files.
Proposition 2: The touted benefit of migrations is that changes to the schema are versioned along with the RoR model code. But if I keep my scripts and .sql files in railsapp/db, I can version them just as easily.
Proposition 3: Given that migrations lack functionality I want, and provide benefits I can replicate, there is little reason for me to consider using them. So I should pass --skip-migration at script/generate model time.
My question: If Proposition 0 is accepted, are Propositions 1,2,3 true or false, and why?
Thanks!
Proposition 1 is false in at least two situations. First, you can use plugins like foreign_key_migrations to do the following:
def self.up
  create_table :users do |t|
    t.column :department_id, :integer, :references => :departments
  end
end
which creates the appropriate foreign key constraint in your DB.
Of course, you might have other things that you want to do in your DDL, in which case the second situation becomes more compelling: you're not forced to use the Ruby DSL in migrations. Try the execute method, instead:
def self.up
  execute 'YOUR SQL HERE'
end
With that, you can keep the contents of your SQL scripts in migrations, gaining the benefits of the latter (most prominently the down methods, which you didn't address in your original question) and retaining the lower-level control you prefer.
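For example, a migration wrapping your existing DDL while keeping a reversible down (a sketch based on the towns table from the question; the constraint name is made up):
class AddTownsCountyFk < ActiveRecord::Migration
  def self.up
    execute <<-SQL
      ALTER TABLE towns
        ADD CONSTRAINT towns_county_id_fkey
        FOREIGN KEY (county_id) REFERENCES counties (id)
        ON DELETE RESTRICT ON UPDATE CASCADE;
    SQL
  end

  def self.down
    execute 'ALTER TABLE towns DROP CONSTRAINT towns_county_id_fkey;'
  end
end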
Proposition 1 is mistaken: you can definitely define referential integrity using migrations, if only by using direct SQL inside the migration; see this post for more details.
Proposition 2: The touted benefit of migrations is being able to define your database model incrementally while keeping track of what each change added, and being able to easily roll back any such change at a later time.
You have to be careful with the order you create/modify things in but you can do it.
One thing to keep in mind: Rails is better suited for application-centric design. In the Rails Way(tm), the database is only ever accessed through the application's Active Record layer and exposes data to the outside using web services.
1: You may want to try out this plugin. I didn't try it myself though, but it seems to be able to add foreign key constraints through migrations.
2: The real benefit of migrations is the ability to go back and forth in the history of your database. That's not as easy with your .sql files.
3: See if the above-mentioned plugin works for you, then decide :) At any rate, it's not a capital sin if you don't use them!
Since you are using Postgres and may not want to install the foreign_key_migrations plugin, here is what I do when I want to use both migrations and foreign key constraints.
I add a method called add_fk_constraint to ActiveRecord::ConnectionAdapters::SchemaStatements.
This could go in some centralized file, but in the example migration file below I have just put it inline.
module ActiveRecord
  module ConnectionAdapters # :nodoc:
    module SchemaStatements
      # Example call:
      #   add_fk_constraint 'orders', 'advertiser_id', 'advertisers', 'id'
      # "If you want to add/alter an 'orders' record, then its 'advertiser_id' had
      # better point to an existing 'advertisers' record with a corresponding 'id'."
      def add_fk_constraint(table_name, referencing_col, referenced_table, referenced_col)
        fk_name = "#{table_name}_#{referencing_col}"
        sql = <<-ENDSQL
          ALTER TABLE #{table_name}
            ADD CONSTRAINT #{fk_name}
            FOREIGN KEY (#{referencing_col}) REFERENCES #{referenced_table} (#{referenced_col})
            ON UPDATE NO ACTION ON DELETE CASCADE;
          CREATE INDEX fki_#{fk_name} ON #{table_name}(#{referencing_col});
        ENDSQL
        execute sql
      end
    end
  end
end
class AdvertisersOrders < ActiveRecord::Migration
  def self.up
    create_table :advertisers do |t|
      t.column :name, :string, :null => false
      t.column :net_id, :integer, :null => false
      t.column :source_service_id, :integer, :null => false, :default => 1
      t.column :source_id, :integer, :null => false
    end

    create_table :orders do |t|
      t.column :name, :string, :null => false
      t.column :advertiser_id, :integer, :null => false
      t.column :source_id, :integer, :null => false
    end

    add_fk_constraint 'orders', 'advertiser_id', 'advertisers', 'id'
  end

  def self.down
    drop_table :orders
    drop_table :advertisers
  end
end
I hope this helps someone. It has been very useful to me, since I need to load a lot of externally supplied data with SQL "COPY" calls, yet I find the migrations system very convenient.