Rails: what's the difference between a unique index and validates_uniqueness_of? - ruby-on-rails

Firstly, can anyone explain how a unique index works in databases?
Suppose I have a User model with a name column and I add a unique index on it, but in the model (user.rb) I only have a presence validator on the name field.
Now when I try to create two users with the same name, I get a PGError:
duplicate key value violates unique constraint "index_users_on_name"
So it looks to me like the unique index works the same as a uniqueness validator(?)
If so, then what about foreign keys?
Let's say I have a Post model with a belongs_to :user association, and User has_many :posts.
And there is a foreign key user_id in the posts table with a unique index. Then multiple posts cannot have the same user_id.
Can someone explain how a unique index works?
I'm on Rails 4 with Ruby 2.0.0.

Here is the difference between a unique index and validates_uniqueness_of:
This is a patch to enable ActiveRecord to identify db-generated errors for unique constraint violations. For example, it makes the following work without declaring a validates_uniqueness_of:
create_table "users" do |t|
  t.string "email", null: false
end
add_index "users", ["email"], unique: true
class User < ActiveRecord::Base
end
User.create!(email: 'abc@abc.com')
u = User.create(email: 'abc@abc.com')
u.errors[:email]
=> "has already been taken"
The benefits are speed, ease of use, and completeness:
Speed
With this approach you don't need to do a db lookup to check for uniqueness when saving (which can sometimes be quite slow when the index is missed -- https://rails.lighthouseapp.com/projects/8994/tickets/2503-validate... ). If you really care about validating uniqueness you're going to have to use database constraints anyway so the database will validate uniqueness no matter what and this approach removes an extra query. Checking the index twice isn't a problem for the DB (it's cached the 2nd time around), but saving a DB round-trip from the application is a big win.
Ease of use
Given that you have to have db constraints for true uniqueness anyway, this approach will let everything just happen automatically once the db constraints are in place. You can still use validates_uniqueness_of if you want to.
Completeness
validates_uniqueness_of has always been a bit of a hack -- it can't handle race conditions properly and results in exceptions that must be handled using somewhat redundant error handling logic. (See "Concurrency and integrity" section in http://api.rubyonrails.org/classes/ActiveRecord/Validations/ClassMe...)
validates_uniqueness_of is not sufficient to ensure the uniqueness of a value. The reason for this is that in production, multiple worker processes can cause race conditions:
1. Two concurrent requests try to create a user with the same name (and we want user names to be unique)
2. The requests are accepted on the server by two worker processes who will now process them in parallel
3. Both requests scan the users table and see that the name is available
4. Both requests pass validation and create a user with the seemingly available name
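The interleaving in the steps above can be reproduced deterministically in plain Ruby. This is only a sketch: FakeUsersTable, DuplicateKeyError and simulate_race are made-up names standing in for the users table, the database error and the two worker processes, and no ActiveRecord or real database is involved.

```ruby
# Hypothetical error standing in for the database's unique-constraint failure.
class DuplicateKeyError < StandardError; end

# Hypothetical in-memory stand-in for the users table (not ActiveRecord).
class FakeUsersTable
  attr_reader :rows

  def initialize(unique_index: false)
    @rows = []
    @unique_index = unique_index
  end

  # Roughly what a uniqueness validation does: a lookup before the insert.
  def name_available?(name)
    @rows.none? { |row| row[:name] == name }
  end

  # With a unique index, the check happens as part of the write itself.
  def insert(name)
    if @unique_index && !name_available?(name)
      raise DuplicateKeyError, "duplicate key value violates unique constraint"
    end
    @rows << { name: name }
  end
end

# Two requests interleave: both validate before either one saves.
def simulate_race(table)
  a_ok = table.name_available?("alice") # request A validates
  b_ok = table.name_available?("alice") # request B validates
  table.insert("alice") if a_ok         # request A saves
  table.insert("alice") if b_ok         # request B saves
  :both_saved
rescue DuplicateKeyError
  :second_insert_rejected
end

validation_only = FakeUsersTable.new
with_index = FakeUsersTable.new(unique_index: true)

puts simulate_race(validation_only) # the validation alone lets both rows in
puts simulate_race(with_index)      # the unique index stops the second insert
```

The point of the sketch is only that the validation's check and the write are two separate steps, whereas the index check is part of the write.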
For a clearer understanding, please check this explanation:
If you create a unique index for a column it means you’re guaranteed the table won’t have more than one row with the same value for that column. Using only validates_uniqueness_of validation in your model isn’t enough to enforce uniqueness because there can be concurrent users trying to create the same data.
Imagine that two users try to register an account with the same email, where you have added validates_uniqueness_of :email to your user model. If they hit the “Sign up” button at the same time, Rails will look in the user table for that email, find nothing for either request, and respond that it’s OK to save the record. Rails will then save both records to the user table with the same email, and now you have a really shitty problem to deal with.
To avoid this you need to create a unique constraint at the database level as well:
class CreateUsers < ActiveRecord::Migration
  def change
    create_table :users do |t|
      t.string :email
      ...
    end
    add_index :users, :email, unique: true
  end
end
So by creating the index_users_on_email unique index you get two very nice benefits: data integrity, and good performance, because lookups on a unique index tend to be very fast.
If you put unique: true on user_id in your posts table, the database will not allow a second record with the same user_id.

A database unique index, and I quote from this SO question, is:
Unique Index in a database is an index on that column that also
enforces the constraint that you cannot have two equal values in that
column in two different rows
The Rails uniqueness validation is meant to do the same, but at the application level, which means the following scenario is rare but can easily happen:
User A submits form
Rails checks database for existing ID for User A- none found
User B submits form
Rails checks database for existing ID for User B- none found
Rails Saves user A record
Rails saves user B record
This happened to me a month ago, and I was advised to solve it with a DB unique index in this SO question.
By the way this workaround is well documented in Rails:
The best way to work around this problem is to add a unique index to
the database table using
ActiveRecord::ConnectionAdapters::SchemaStatements#add_index. In the
rare case that a race condition occurs, the database will guarantee
the field’s uniqueness

As far as uniqueness goes,
Uniqueness validates that the attribute's value is unique right before
the object gets saved. It does not create a uniqueness constraint in
the database, so it may happen that two different database connections
create two records with the same value for a column that you intend to
be unique. To avoid that, you must create a unique index on both
columns in your database.
Also, if you only have validates_uniqueness_of at the model level, duplicate records are blocked on the Rails side BUT not at the database level.
SQL queries run directly through dbconsole would insert duplicate records without any problem.
When you say that you created a foreign key with an index on "user_id" in the "posts" table, by default Rails only creates a plain index on it and NOT a unique index. If you have a 1-M relationship then there is no point in a unique index in your case.
If you had unique: true on "user_id" in your posts table, there is no way that duplicate records with the same "user_id" would go through.
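To make the foreign-key case concrete, here is a small plain-Ruby sketch of what a plain index and a unique index on user_id each allow. PostsTable and UniqueViolation are invented for illustration and are not part of Rails; the `unique:` flag merely mimics passing unique: true to add_index.

```ruby
# Hypothetical error standing in for the database's unique-constraint failure.
class UniqueViolation < StandardError; end

# Hypothetical in-memory posts table (not ActiveRecord).
class PostsTable
  def initialize(unique:)
    @unique = unique
    @rows = []
  end

  # Rejects a second row for the same user_id only when the index is unique.
  def insert(user_id, title)
    if @unique && @rows.any? { |row| row[:user_id] == user_id }
      raise UniqueViolation, "user_id #{user_id} already has a post"
    end
    @rows << { user_id: user_id, title: title }
    self
  end

  def count_for(user_id)
    @rows.count { |row| row[:user_id] == user_id }
  end
end

# Plain index: a user can have many posts, which is what has_many needs.
plain = PostsTable.new(unique: false)
plain.insert(1, "first post").insert(1, "second post")

# Unique index: the second post for the same user is rejected.
strict = PostsTable.new(unique: true)
strict.insert(1, "first post")
begin
  strict.insert(1, "second post")
rescue UniqueViolation => e
  puts e.message
end
```

In other words, a unique index on the foreign key effectively turns the association into a one-to-one relationship.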

Related

Rubocop Uniqueness validation should be with a unique index, in values that start from some specific values

I have a Rails model that validates the uniqueness of the order_number value, which should start from 1_000_000, so I added a constant that is used as the first value:
# order model
STARTING_NUMBER = 1_000_000
validates :order_number, uniqueness: true
When I checked my code with Rubocop I got an error:
app/models/order.rb:3:3: C: Rails/UniqueValidationWithoutIndex: Uniqueness validation should be with a unique index.
validates :order_number, uniqueness: true
I fixed it by adding disable/enable Rubocop comments:
STARTING_NUMBER = 1_000_000
# rubocop:disable Rails/UniqueValidationWithoutIndex
validates :order_number, uniqueness: true
# rubocop:enable Rails/UniqueValidationWithoutIndex
Is there a better solution?
The proper fix is to add a unique index to your database with a migration:
def change
  add_index :orders, :order_number, unique: true
end
That will fix the underlying problem and keep Rubocop from complaining.
From the fine Rubocop manual:
When you define a uniqueness validation in Active Record model, you also should add a unique index for the column. There are two reasons. First, duplicated records may occur even if Active Record’s validation is defined. Second, it will cause slow queries.
Rubocop sees that you have a uniqueness validation but didn't find a corresponding unique index in your db/schema.rb. A uniqueness validation in a model is subject to race conditions so you can still get duplicate values.
Rubocop is telling you to add a unique index/constraint in the database to ensure uniqueness. The Rails guides say the same thing:
It does not create a uniqueness constraint in the database, so it may happen that two different database connections create two records with the same value for a column that you intend to be unique. To avoid that, you must create a unique index on that column in your database.
The validation will also do a potentially expensive database query so you really want to index that column anyway, might as well make it a unique index to ensure data integrity while you're at it (broken code is temporary, broken data is forever).
Don't suppress the warning, address it.

Is it necessary to put "add_index" in a migration for a one-to-many relation?

For example, in this migration I have a one-to-many relation, "Category has many Subcategories". If I don't put "add_index :subcategories, :category_id", it works anyway.
class CreateSubcategories < ActiveRecord::Migration
  def change
    create_table :subcategories do |t|
      t.string :nombre
      t.string :descripcion
      t.integer :category_id
      t.timestamps
    end
    add_index :subcategories, :category_id
  end
end
To validate the foreign key I use this:
validates :category, presence: true
It's advisable to add an index on such a column, since you will presumably be performing many lookups across the two tables. In a relational database, the category_id column is a foreign key on the subcategories table that references the id column of the categories table. You'll find more information on database indexes on Wikipedia (the quickest available reference).
Sure, you could skip creating an index for this column, but you would eventually pay a performance penalty. It works without the index, but you presumably also want an application that performs well. As the table grows large, you'll start noticing that data retrieval involving joins across two or more tables (categories and subcategories in this case) becomes relatively slow.
One can argue that there is also a performance cost to maintaining an index, since the DBMS has to perform extra writes. So it really depends on your business requirements and whether you do more reads or more writes. If reads dominate, definitely go for the index; if you expect mostly writes and few reads (less likely), and your application can live with slower lookups, you could skip it.
Given that you're validating the category's presence, I would definitely add the index.
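As a rough illustration of the read-versus-write trade-off described above, the following plain-Ruby sketch contrasts a full scan with a lookup through a precomputed structure. It is not how a database implements indexes (real ones typically use B-trees), and the sample rows and helper names are made up for the example.

```ruby
# Hypothetical rows standing in for the subcategories table.
SUBCATEGORIES = [
  { id: 1, nombre: "Laptops", category_id: 10 },
  { id: 2, nombre: "Phones",  category_id: 10 },
  { id: 3, nombre: "Novels",  category_id: 20 }
].freeze

# Without an index: every row is inspected on every lookup.
def find_by_scan(rows, category_id)
  rows.select { |row| row[:category_id] == category_id }
end

# With an index: an extra structure maps category_id to its rows.
# Building and maintaining it is the write-side cost of an index.
def build_index(rows)
  rows.group_by { |row| row[:category_id] }
end

# Reads go straight to the matching rows instead of scanning the table.
def find_by_index(index, category_id)
  index.fetch(category_id, [])
end

index = build_index(SUBCATEGORIES)
puts find_by_scan(SUBCATEGORIES, 10).size
puts find_by_index(index, 10).size
```

Both lookups return the same rows; the difference is how much work each read does and how much bookkeeping each write needs.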

Does Rails 3.2+ require a join table *in addition to* declaring the has_and_belongs_to_many relation?

I have a number of many-to-many relationships in my app.
I do not need to store information about the relationships themselves, so am using the has_and_belongs_to_many relation in my models.
I've read the Active Record documentation and it seems to confirm my strategy, BUT I'm not clear if I still need to create join tables in the database or if ActiveRecord in Rails 3.2 is smart enough to handle it using the model relations alone.
Any references or explanations would be appreciated.
----- Break -----
If I did need to store data about the relationship itself and I were using has_many :through in my model, would I need to remove the primary key from the "through" table (e.g. so that it only has the two foreign keys)?
Thank you!
Yes, you need to create the join table for a has_and_belongs_to_many association. Remember in Rails you need to 'migrate to create'. Using this article as an example, say we have an Account model and a Role model, we can create a join table through this migration:
rails generate migration create_accounts_roles_join_table
Now we will edit the migration file that was just created
create_table :accounts_roles, :id => false do |t|
  t.integer :account_id
  t.integer :role_id
end
It is important to include :id => false as this will leave off the primary key that is normally generated when you create a table. Also, we specified the two foreign keys account_id and role_id.
Run rake db:migrate and add the HABTM associations in both models, and everything is set up.
Also, as a side note, adding join_table to the end of the migration name is not required but is more descriptive, and integer can be replaced with references when adding the foreign keys. The two are equivalent, but references is maybe a bit more descriptive.
On the second part of your question, you don't need to remove the primary key from the table.

Rails migration assumes a relationship where there is none

I have a Rails 3.1 app with a User model and a Venue model. These two models have a HABTM relationship - A user may manage many venues and a venue may be managed by many users.
I'd like users to be able to select a default venue so I'm trying to add a default_venue_id attribute to User with the following migration:
class AddDefaultVenueIdToUser < ActiveRecord::Migration
  def self.up
    add_column :users, :default_venue_id, :integer
  end
  def self.down
    remove_column :users, :default_venue_id
  end
end
The problem is that when I run that migration against my PostgreSQL database, it's assuming that default_venue_id is the foreign key for a relationship with the non-existent default_venues table and throws the following error:
PGError: ERROR: relation "default_venues" does not exist
: ALTER TABLE "users" ADD FOREIGN KEY ("default_venue_id") REFERENCES "default_venues" ("id")
Should I be doing something to tell the database that I'm not trying to create a relationship or am I going about this the wrong way?
Edit: I've just realised that another developer who worked on the project added the schema_plus gem which automatically defines constraints for columns ending in _id
That explains why I've never run into this behaviour before!
It would be helpful to see the Models as well to diagnose this problem.
However, with that being said, if you are using HABTM associations here it might be a good idea to take a look at a has_many :through relationship, for example a VenueManagement model which would have your user_id and venue_id. That way you could handle extra attributes on the association where it makes sense, like a default flag.
Hope that helps.
Since Rails seems to pick up on the _id part of the name and tries linking it to a table, the easy solution is to name the column differently, as convention says a field ending in _id links to a table.
An example could be
:default_id_for_venue, or just :default_venue

Searching across multiple active record models without using sphinx or solr

What's the best way of searching across multiple active record models without using something like sphinx or solr?
I use IndexTank http://indextank.com/
The simplest solution is to have the models inherit from one base model and store them in one table. However, that is a bad fit if you have an existing schema or if the models differ a lot.
Another (and I think much better) solution, if your database allows it, is to use UNION in your SQL to get results from multiple tables. In this case you should use find_by_sql.
For example, if you have Post and Question models and want to show both in one list (and possibly filter them by some conditions), you should:
Add a type field to each table with a default value matching the model name. For example:
create_table :questions do |t|
  t.text :text
  t.string :type, :null => false, :default => 'Question'
  t.timestamps
end
Query both models as follows:
Post.find_by_sql("SELECT id,type,created_at,updated_at,title,description,NULL AS text FROM posts
UNION SELECT id,type,created_at,updated_at,NULL as title, NULL AS description, text FROM questions
ORDER BY created_at DESC")
Using the type field, Rails will distinguish the different models and return a list of both Post and Question records, so you can search (or paginate) across both.
Assuming you are using a relational database, writing raw SQL statements is probably your best bet.
Take a look at http://api.rubyonrails.org/classes/ActiveRecord/Base.html#method-c-find_by_sql
Item.find_by_sql("SELECT items.name, users.last_name, products.reference_number ... JOIN... ")
...would return a collection of Item but with the attributes returned by the SQL query. So in the above example each instance of Item returned would have a name, last_name and reference_number attribute even though they are not really attributes of an Item.
