MongoDB indexes - ruby-on-rails

I'm in the process of converting my Rails app to use MongoDB through Mongoid. I have two questions relating to indexes. I think I know the answers, but I want confirmation from someone who has more experience with MongoDB.
Let's look at the following example where I have one relational association between Users and Posts.
user.rb
class User
  has_many_related :posts
end
post.rb
class Post
  belongs_to_related :user
end
Now when I look at the indexes created through the MongoHQ interface, I notice the following two:
Key Name: _id_
Indexed Field: _id
Unique: <blank>
Is the id guaranteed to be unique? If so, why isn't unique set? If not, how can I set this, and do I need to?
Key Name: user_id_1
Indexed Field: user_id
Unique: false
Am I correct in assuming the Indexed Field is the field name in the collection? Just want to confirm as Key Name has the _1 after it.

Yes, _id in MongoDB is always unique. It's the primary key, which is why setting UNIQUE isn't necessary.
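On the second question: the Indexed Field is indeed the field name in the collection. The Key Name is just the index's default name, which MongoDB builds by joining each indexed field with its sort direction (1 for ascending, -1 for descending). The sketch below mimics that naming rule; `default_index_name` is a hypothetical helper written for illustration, not part of Mongoid or the MongoDB driver, and the automatic `_id` index is a special case that is simply named `_id_`.

```ruby
# Hypothetical helper (not a Mongoid or driver API): mimics how MongoDB
# derives the default name of a user-created index from its key specification
# by joining every field name and direction with underscores.
def default_index_name(key_spec)
  key_spec.map { |field, direction| "#{field}_#{direction}" }.join('_')
end

single   = default_index_name(user_id: 1)
compound = default_index_name(user_id: 1, created_at: -1)

puts single    # user_id_1
puts compound  # user_id_1_created_at_-1
```

So `user_id_1` reads as "an ascending index on user_id".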

Indexes are described very clearly in the MongoDB Indexing Overview:
_id
The _id index is a unique index** on the _id field, and MongoDB creates this index by default on all collections. You cannot delete the index on _id.
The _id field is the primary key for the collection, and every document must have a unique _id field. You may store any unique value in the _id field. The default value of _id is ObjectID on every insert()
** Although the index on _id is unique, the getIndexes() method will not print unique: true in the mongo shell.

If you do not specify the _id value manually in MongoDB, then the type will be set to a special BSON datatype that consists of a 12-byte binary value.
The 12-byte value consists of a 4-byte timestamp, a 3-byte machine id, a 2-byte process id, and a 3-byte counter. Due to this design, this value has a reasonably high probability of being unique.
Reference: The Definitive Guide to MongoDB: The NoSQL Database for Cloud and Desktop Computing (book)
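The 12-byte layout described above can be sketched in plain Ruby. This is only an illustration of the byte layout as the book describes it, not the driver's actual implementation; the helper names and the sample machine id, process id, and counter values are made up.

```ruby
# Illustration of the 12-byte layout only. NOT the MongoDB driver's code;
# build_object_id / parse_object_id and all sample values are hypothetical.
def build_object_id(timestamp:, machine_id:, process_id:, counter:)
  [timestamp].pack('N') +            # 4-byte big-endian timestamp
    [machine_id].pack('N')[1, 3] +   # low 3 bytes of the machine id
    [process_id].pack('n') +         # 2-byte process id
    [counter].pack('N')[1, 3]        # low 3 bytes of the counter
end

def parse_object_id(bytes)
  {
    timestamp:  bytes[0, 4].unpack1('N'),
    machine_id: ("\x00".b + bytes[4, 3]).unpack1('N'),
    process_id: bytes[7, 2].unpack1('n'),
    counter:    ("\x00".b + bytes[9, 3]).unpack1('N')
  }
end

oid = build_object_id(timestamp: 1_300_000_000, machine_id: 0xABCDEF,
                      process_id: 4242, counter: 7)
parts = parse_object_id(oid)

puts oid.bytesize         # 12
puts oid.unpack1('H*')    # the value as 24 hex characters
p parts
```

Two ids generated in the same second on the same machine and process still differ in the counter, which is where the "reasonably high probability" of uniqueness comes from.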

Related

Rails PG::UniqueViolation: ERROR: duplicate key value violates unique constraint "table_pkey"

I created something like a JSON backup for my project's database, and then I populate it like this
Model.find_or_initialize_by(:id => h["id"]).update(h)
where h is a hash of the model's attributes for one instance.
The records are created correctly, but when I want to create a new record, Rails raises this error:
PG::UniqueViolation: ERROR: duplicate key value violates unique constraint "table_pkey"
What could I be doing wrong? It happens for all models that were created using scaffold. Here is a migration as an example.
class CreateModel < ActiveRecord::Migration[6.1]
  def change
    create_table :models do |t|
      t.string :attribute1
      t.string :attribute2
      t.string :attribute3
      t.timestamps
    end
  end
end
According to the migration, your table uses sequential integer ids. This works well as long as you let the database assign the ids for you: every time a new record comes in, the database takes the next number from the sequence and assigns it to that record (simplifying here).
Let's assume the database id sequence is currently at 3 and the records you imported have ids 4, 37 and 143025. Importing rows with explicit ids does not advance the sequence. You insert a new record, the database assigns id 3, all good, and the sequence is now at 4. You insert another one, the database assigns id 4, but there is already a row with id 4 in the table:
PG::UniqueViolation: ERROR: duplicate key value violates unique constraint "table_pkey"
A few possible solutions:
After importing, change the database id sequence to something bigger than the largest id you imported (hacky, but works). See: Postgres manually alter sequence.
Import the items without hardcoding their ids (complicated).
Change your database to use UUIDs instead of integer ids (an architectural change, difficult if the app is live, but the best solution if you're still in development).
Use a proper database backup system, such as pg_dump, rather than building your own.
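To make the collision and the first fix concrete, here is a toy in-memory model of a table with a sequence-backed primary key. It is a sketch of the mechanism only: `ToyTable` and its methods are hypothetical names, and a real database handles all of this internally.

```ruby
# Toy model of a table whose ids come from a sequence. All names here are
# hypothetical; this only imitates the behaviour described above.
class ToyTable
  class UniqueViolation < StandardError; end

  attr_reader :next_id

  def initialize
    @rows = {}
    @next_id = 1
  end

  # Insert with an explicit id, as the JSON import does.
  # Note that the sequence is NOT advanced.
  def import(id, attrs)
    raise UniqueViolation, "duplicate key #{id}" if @rows.key?(id)
    @rows[id] = attrs
  end

  # Insert and let the "database" take the next id from the sequence.
  def create(attrs)
    id = @next_id
    @next_id += 1 # the sequence moves on even if the insert then fails
    raise UniqueViolation, "duplicate key #{id}" if @rows.key?(id)
    @rows[id] = attrs
    id
  end

  # The fix: move the sequence past the largest id in the table.
  def reset_sequence!
    @next_id = (@rows.keys.max || 0) + 1
  end
end

table = ToyTable.new
table.create(name: 'a') # id 1
table.create(name: 'b') # id 2, sequence is now at 3
[4, 37, 143_025].each { |id| table.import(id, name: "imported #{id}") }

table.create(name: 'c') # id 3, still fine
collided = false
begin
  table.create(name: 'd') # the sequence hands out 4, which already exists
rescue ToyTable::UniqueViolation => e
  collided = true
  puts "insert failed: #{e.message}"
end

table.reset_sequence!
new_id = table.create(name: 'd')
puts new_id # 143026
```

In a Rails app on PostgreSQL, the adapter's `reset_pk_sequence!` method (for example `ActiveRecord::Base.connection.reset_pk_sequence!('models')`, assuming the table is called models) should do the equivalent of `reset_sequence!` for a real table.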

Postgresql depend on record id

I am working on an application design, using Ruby on Rails and Postgresql. I have a table with the following fields
Table: account_type
Fields: id(primary key), name(String)
The AccountType name is a unique string (so I am thinking about putting a unique constraint on it). Depending on the name (type), I'm going to make some checks in my models. Something like this:
def urban?
  self.name == 'Some long type'
end
The question is: do I leave it like that? Or, as the other option, I can depend on some ID. So, assuming that my 'Some long type' is always created with ID=1, I can check for
def urban?
  self.id == 1
end
Is it a good practice if I do depend on the ID? What about readability? Are there other solutions to that problem?
The second example is a textbook case of how NOT to use surrogate keys.
Your real (natural) key is account_type.name, and it should have a unique constraint. There is endless debate about the 'goodness' of using auto-increment id columns for primary keys, but which id a row gets depends on the order in which the rows were inserted, whereas the name identifies the type no matter how the table was populated.
Readability? The id field gives no information about what the record really means.
Other solutions? I don't really see what the problem is, but you could also use an enum type (although it is much less flexible than a lookup table).
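One way to keep the name-based check readable is to put the string in a named constant, so the literal lives in one place. This is a minimal plain-Ruby sketch: the class below is a stand-in for the ActiveRecord model, and the `URBAN` constant name is an assumption made for illustration.

```ruby
# Plain-Ruby stand-in for the AccountType model (in the app this would be an
# ActiveRecord class). The URBAN constant is a hypothetical name.
class AccountType
  URBAN = 'Some long type'

  attr_reader :id, :name

  def initialize(id:, name:)
    @id = id
    @name = name
  end

  # Depends on what the record means, not on the order it was inserted in.
  def urban?
    name == URBAN
  end
end

urban = AccountType.new(id: 17, name: AccountType::URBAN)
other = AccountType.new(id: 1, name: 'Another type')

puts urban.urban? # true
puts other.urban? # false
```

Note that the record with id 1 is not the urban one here, which is exactly the situation the id-based check would get wrong.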

What's the _type column for a polymorphic used for?

From the Rails Guide, to establish a polymorphic relationship on one model, I need to add two columns for the corresponding table.
As the image below shows, the _id column is used as a foreign key. But I cannot figure out what the _type column is for. What is it used for?
The _type column is used to identify which kind of resource the record belongs to. In this case, the polymorphic resource could be either an Employee or a Product. In other words: an image can relate to either a product or an employee.
The _type column will simply contain the string of either "Employee" or "Product". When this association is accessed, Rails will use it to know what model to use to load the associated object.
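The lookup can be imitated in plain Ruby: the stored string is turned back into a class, and that class is asked for the record with the stored id. This is only a sketch of the idea, not Rails' actual code; the `Picture` class and the `imageable_type` / `imageable_id` names are assumptions that follow the usual Rails Guide naming.

```ruby
# Plain-Ruby stand-ins for the models. In Rails these would be ActiveRecord
# classes and the framework would perform the lookup itself.
class Employee
  def self.find(id)
    "Employee #{id}"
  end
end

class Product
  def self.find(id)
    "Product #{id}"
  end
end

# Hypothetical picture row with the two polymorphic columns.
class Picture
  def initialize(imageable_type:, imageable_id:)
    @imageable_type = imageable_type # class name stored as a string
    @imageable_id = imageable_id     # foreign key into that class's table
  end

  # Resolve the class from the _type string, then load by the _id value.
  def imageable
    Object.const_get(@imageable_type).find(@imageable_id)
  end
end

first  = Picture.new(imageable_type: 'Employee', imageable_id: 1).imageable
second = Picture.new(imageable_type: 'Product', imageable_id: 1).imageable

puts first  # Employee 1
puts second # Product 1
```

Both pictures store the same id, 1; without the _type column there would be no way to tell which table that id refers to.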

Rails 3: what does "a model with a uniquely indexed column" exactly mean

"a model with a uniquely indexed column"
Does this mean just a model with a unique validation on the column? Or does it mean the column needs add_index in the migration?
And could you explain what exactly add_index does? For example, if you have an Authors model with a name column, what does adding an index to 'name' accomplish?
Thanks.
I am taking it to mean that the model has a column that is guaranteed to be unique and that there is an index on it. I take it you are reading about models in general in Rails.
A unique column means that no two records (such as User1 and User2) can have the same value for that column. For example, users would have unique logins. No two users should exist that have the same login (or username or email). But Rails automatically gives models an ID column that is always unique. Unless you change it, the first record will have ID 1, then 2, then 3, etc.
An index on a column makes it faster to find records by that column's value. Think of an encyclopedia: there is so much information in there, but the index at the back helps you quickly find what you are looking for. It lists key terms and tells you where to find each one. That's what an index on a column does.
So "a model with a uniquely indexed column" in Rails, by default, is the ID column: it is unique and will automatically get an index on it to more quickly find records.
Extra: when you make a model with a foreign key (example: model User may have a gender_id, and you may have a table called Gender that defines Male and Female and the gender_id corresponds to a Gender object), then you should add an index to that foreign key to make searches on it faster.
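The two ideas, "unique" and "indexed", can be illustrated with a toy in-memory table. This is a rough analogy only, not how a database actually implements indexes; `ToyAuthors` and its methods are hypothetical names.

```ruby
# Toy in-memory "authors" table with a unique index on name. An analogy
# only; every name here is made up for illustration.
class ToyAuthors
  class NotUnique < StandardError; end

  def initialize
    @rows = []
    @name_index = {} # name => position of the row in @rows
  end

  def insert(name)
    # The unique index rejects duplicates at the storage level.
    raise NotUnique, "name #{name.inspect} already taken" if @name_index.key?(name)

    @name_index[name] = @rows.size
    @rows << { id: @rows.size + 1, name: name }
  end

  # Without an index: look at every row until one matches.
  def find_by_name_scan(name)
    @rows.find { |row| row[:name] == name }
  end

  # With an index: jump straight to the row.
  def find_by_name_indexed(name)
    position = @name_index[name]
    position && @rows[position]
  end
end

authors = ToyAuthors.new
authors.insert('Ann')
authors.insert('Bob')

duplicate_rejected = false
begin
  authors.insert('Ann')
rescue ToyAuthors::NotUnique
  duplicate_rejected = true
end

found = authors.find_by_name_indexed('Bob')
puts duplicate_rejected # true
p found
```

In a Rails migration, the corresponding call would be something like `add_index :authors, :name, unique: true`, assuming the table is named authors. A uniqueness validation in the model checks in Ruby before saving, while the unique index makes the database itself refuse duplicates.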
More information: http://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/SchemaStatements.html#method-i-add_index

Ruby On Rails custom ID

Hi, I want to have custom ids for a table and not use the default auto_increment. Is that possible? I have set :id => false and later on I create the column with t.column :id, :bigint
But when I create a new record, the id is set to NULL.
Is there any way to bypass this behaviour?
I could set my primary key to a field uid (it is actually the Facebook id), but that usually breaks things for me, as relations usually look for the id field.
It's almost certainly going to be easier to use Rails' surrogate autoincrement ID as your primary ID.
There's no problem having a second unique column facebook_id and performing all your searches based on that field, but for the compatibility reasons you mention (i.e. plugin authors making assumptions they shouldn't) I'd stick with Rails defaults as much as is sensible. The performance differences between primary keys and unique keys are minimal.
I don't know where the problem is. Of course id is NULL when you disable auto_increment for the id column and don't insert an id value by hand.
Btw: You can define the foreign and primary keys for relations. It is no problem to use facebook_id as primary key.
