I need to create a product code which will be generated with a custom function. It will start with a letter, have the id of the category and then have a random 7 digit number. For this, I can set the primary_key to a string and generate the code or I can use FriendlyID. What might be the best for this situation?
Short answer: Use something like friendly_id
The longer story:
Choosing a natural key as the primary_key of a table should always be weighed carefully, because it has an impact on your whole data model.
The first issue is regarding related records in other tables. If you will have related tables, the foreign key in those tables should also be VARCHAR and match your generated primary key. If you are not sure what to do, avoid custom primary keys.
Another issue in your question may be:
It will start with a letter, have the id of the category
Is this id the primary key of a Category model? If so, you are generating a key with DB isolation in mind, but then tying it back to the database through this id. Think again about this one.
Go for a slug generated by your function, and you will stay free to change things in the future. You may create a brand new algorithm later and then only need a one-time change of the slugs. You may even keep two slugs: the old one, which redirects to the new one, and the new one.
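To make the slug idea concrete, here is a minimal sketch in plain Ruby. The helper name generate_product_code and the prefix letter "P" are my own assumptions, not something from the question. The result would live in an ordinary string column while the table keeps its default integer id.

```ruby
require "securerandom"

# Hypothetical helper: builds a code of the form
# <letter><category id><7 random digits>.
def generate_product_code(category_id, prefix: "P")
  # random_number(10_000_000) returns an integer in 0...10_000_000;
  # format pads it with leading zeros to exactly 7 digits.
  random_part = format("%07d", SecureRandom.random_number(10_000_000))
  "#{prefix}#{category_id}#{random_part}"
end

puts generate_product_code(12)
```

Random digits can collide, so the column should still have a unique index, and the caller should be ready to generate a fresh code if a save fails because of a duplicate.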
"a model with a uniquely indexed column"
Does this mean just a model and a column with a unique validation on the column? Or does it mean the column needs add_index in the migration?
And could you explain what exactly add_index does? For example, if you have an Authors model with a name column, what does adding an index to 'name' accomplish?
Thanks.
I am taking it to mean that the model has a column that is guaranteed to be unique and that there is an index on it. I take it you are reading about models in general in Rails.
A unique column means that no two records (such as User1 and User2) can have the same value for that column. For example, users would have unique logins. No two users should exist that have the same login (or username or email). But Rails automatically gives models an ID column that is always unique. Unless you change it, the first record will have ID 1, then 2, then 3, etc.
An index on a column makes it faster to find rows by the value in that column. Think of an encyclopedia. There is so much information in there, but the index at the back helps you quickly find what you are looking for: it lists the key terms and tells you where to find each one. That's what an index on a column does.
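The encyclopedia analogy can be sketched in plain Ruby. This is only an illustration with made-up data and hypothetical helper names (find_by_scan, build_index); a real database index is a more elaborate structure, but the trade-off is the same: pay once to build a lookup structure, then skip the full scan.

```ruby
# Without an index: check every row until one matches (a full scan).
def find_by_scan(rows, column, value)
  rows.find { |row| row[column] == value }
end

# Building the index: one pass that maps each column value to its row.
# (If two rows share a value, the later one wins in this simplified sketch.)
def build_index(rows, column)
  rows.each_with_object({}) { |row, index| index[row[column]] = row }
end

# Illustrative data only.
authors = [
  { id: 1, name: "Austen" },
  { id: 2, name: "Borges" },
  { id: 3, name: "Calvino" }
]

name_index = build_index(authors, :name)

puts find_by_scan(authors, :name, "Borges")[:id] # walks the rows one by one
puts name_index["Borges"][:id]                   # direct lookup by name
```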
So "a model with a uniquely indexed column" in Rails, by default, is the ID column: it is unique and will automatically get an index on it to more quickly find records.
Extra: when you make a model with a foreign key (example: model User may have a gender_id, and you may have a table called Gender that defines Male and Female and the gender_id corresponds to a Gender object), then you should add an index to that foreign key to make searches on it faster.
More information: http://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/SchemaStatements.html#method-i-add_index
I'm writing my first rails app & want to get into some good habits from the start. The table in question is to hold employee data, one of the fields being the manager's ID. To reflect the hierarchical structure, I'm thinking of using acts_as_tree, so the parent_id would be the manager's id field (right?). If we are to use (import) data from our existing HR application - PeopleSoft - the employee ID is a string. Employee ID seems to make the most sense as a PK (coming from the PeopleSoft developer perspective, I realize I may be biased and/or not seeing all of the possibilities -- I welcome suggestions on this as well)
I know that one of the philosophies behind rails is "convention over configuration", so I'd like to use the defaults - the PK being the autoincrementing integer. Would it make sense in this case to create a "lookup table" or something in order to maintain the use/association of the ID coming from PS? There will be reports/exports going back into the PS world....
Thanks
You're correct in that the convention in Rails is to use the default auto-incrementing id. If you have a one-to-one relationship between people and employee IDs, then employee ID should just be a field (column) on your person model. Make it a key (but not a primary key) if you're going to be doing a lot of lookups using it.
Hi, I want to have custom ids for a table and not use the default auto_increment. Is that possible? I have set :id => false and later on I create a t.column :id, :bigint
But when I create a new record the id is set to NULL.
Is there any way to bypass this behaviour?
I could set my primary key to a field uid (it is actually facebook id) but usually it breaks for me as relations usually look for the id field
It's almost certainly going to be easier to use Rails' surrogate autoincrement ID as your primary ID.
There's no problem having a second unique column facebook_id and performing all your searches based on that field, but for the compatibility reasons you mention (i.e. plugin authors making assumptions they shouldn't) I'd stick with Rails defaults as much as sensible. The performance issues relating to primary keys vs unique keys are minimal.
I don't know where the problem is. Of course id is NULL when you disable auto_increment for the id column and don't insert an id value by hand.
Btw: You can define the foreign and primary keys for relations. It is no problem to use facebook_id as primary key.
I was watching a screencast where the author said it is not good to have a primary key on a join table but didn't explain why.
The join table in the example had two columns defined in a Rails migration and the author added an index to each of the columns but no primary key.
Why is it not good to have a primary key in this example?
create_table :categories_posts, :id => false do |t|
  t.column :category_id, :integer, :null => false
  t.column :post_id, :integer, :null => false
end
add_index :categories_posts, :category_id
add_index :categories_posts, :post_id
EDIT: As I mentioned to Cletus, I can understand the potential usefulness of an auto number field as a primary key even for a join table. However in the example I listed above, the author explicitly avoids creating an auto number field with the syntax ":id => false" in the "create table" statement. Normally Rails would automatically add an auto-number id field to a table created in a migration like this and this would become the primary key. But for this join table, the author specifically prevented it. I wasn't sure why he decided to follow this approach.
Some notes:
The combination of category_id and post_id is unique in and of itself, so an additional ID column is redundant and wasteful.
The phrase "not good to have a primary key" is incorrect in the screencast. You still have a Primary Key -- it is just made up of the two columns (e.g. CREATE TABLE foo ( cid, pid, PRIMARY KEY ( cid, pid ) )). For people who are used to tacking on ID values everywhere this may seem odd but in relational theory it is quite correct and natural; the screencast author would better have said it is "not good to have an implicit integer attribute called 'ID' as the primary key".
It is redundant to have the extra column because you will place a unique index on the combination of category_id and post_id anyway to ensure no duplicate rows are inserted.
Finally, although common nomenclature is to call it a "composite key" this is also redundant. The term "key" in relational theory is actually the set of zero or more attributes that uniquely identify the row, so it is fine to say that the primary key is category_id, post_id.
Place the MOST SELECTIVE column FIRST in the primary key declaration. A discussion of the construction of b(+/*) trees is out of the scope of this answer ( for some lower-level discussion see: http://www.akadia.com/services/ora_index_selectivity.html ) but in your case, you'd probably want it on post_id, category_id since post_id will show up less often in the table and thus make the index more useful. Of course, since the table is so small and the index will be, essentially, the data rows, this is not very important. It would be in broader cases where the table is wider.
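To illustrate the point about the pair of columns being the key, here is a small plain-Ruby sketch. The JoinTable class is a hypothetical in-memory stand-in for categories_posts, not anything Rails provides; its Set plays the role of the unique index on (category_id, post_id).

```ruby
require "set"

# Hypothetical in-memory stand-in for the categories_posts join table.
class JoinTable
  def initialize
    @keys = Set.new # plays the role of the unique index on the pair
  end

  # Returns true if the row was inserted, false if the pair already exists.
  def insert(category_id, post_id)
    # Set#add? returns nil when the element is already present.
    !@keys.add?([category_id, post_id]).nil?
  end

  def size
    @keys.size
  end
end

table = JoinTable.new
puts table.insert(1, 10) # new pair
puts table.insert(1, 11) # same category, different post: still a new pair
puts table.insert(1, 10) # duplicate pair is rejected
puts table.size
```

Neither column is unique on its own; only the combination is, which is why no extra ID column is needed to tell rows apart.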
It is a bad idea not to have a primary key on any table, period (if the DBMS is a relational DBMS - or an SQL DBMS). Primary keys are a crucial part of the integrity of your database.
I suppose if you don't mind your database being inaccurate and providing incorrect answers every so often, then you could do without...but most people want accurate answers from their DBMS and for such people, primary keys are crucial.
A DBA would tell you that the primary key in this case is actually the combination of the two FK columns. Since Rails/ActiveRecord doesn't play nice with composite PKs (by default, at least), that may be the reason.
The combination of foreign keys can be a primary key (called a composite primary key). Personally I favour using a technical primary key instead of that (auto number field, sequence, etc). Why? Well, it makes it much easier to identify the record, which you may need to do if you're going to delete it.
Think about it: if you're going to present a Webpage of all the linkages, having a primary key to identify the record makes it much easier.
Basically because there's no need for it. The combination of the two foreign key fields adequately uniquely identifies any row.
But that merely says why it's not a Good Idea.... but why would it be a Bad Idea?
Consider the overhead that adding an identity column would bring. The table would take up 50% more disk space. Worse is the index situation. With an identity field, you have to maintain the identity count, plus a second index. You'll be tripling the disk space and tripling the work that needs to be performed on every insert, with the only advantage being a slightly shorter WHERE clause in a DELETE command.
On the other hand, if the composite key fields are the entire table, then the index can be the table.
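The 50% figure is simple arithmetic on column widths. As a rough sketch, assuming every column is a 4-byte integer and ignoring per-row and per-page overhead (both are my simplifications, and real storage engines differ):

```ruby
INT_BYTES = 4 # assumed size of one integer column

# Raw column bytes per row for a table with the given number of integer columns.
def row_bytes(column_count)
  column_count * INT_BYTES
end

# Percentage growth from one row size to another (integer arithmetic).
def growth_percent(before, after)
  (after - before) * 100 / before
end

two_columns   = row_bytes(2) # category_id, post_id
three_columns = row_bytes(3) # id, category_id, post_id

puts "#{two_columns} -> #{three_columns} bytes per row " \
     "(+#{growth_percent(two_columns, three_columns)}%)"
```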
Placing the most selective column first should only be relevant in the INDEX declaration. In the KEY declaration, it should not matter (because, as has been correctly pointed out, the KEY is a SET, and inside a set, order doesn't matter - the set {a1,a2} is the same set as {a2,a1}).
If a DBMS product is such that ordering of attributes inside a KEY declaration makes a difference, then that DBMS product is guilty of not properly distinguishing between the logical design of a database (the part where you do the KEY declaration) and the physical design of the database (the part where you do the INDEX declaration).
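The distinction between a key (a set) and an index (an ordered list of columns) can be shown in a few lines of Ruby. The helper names same_key? and same_index? are made up for this illustration.

```ruby
require "set"

# A key is a set of attributes: declaration order is irrelevant.
def same_key?(columns_a, columns_b)
  Set.new(columns_a) == Set.new(columns_b)
end

# An index is an ordered list of columns: order is part of its identity.
def same_index?(columns_a, columns_b)
  columns_a == columns_b
end

a = [:category_id, :post_id]
b = [:post_id, :category_id]

puts same_key?(a, b)   # compared as sets
puts same_index?(a, b) # compared as ordered lists
```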
I wanted to comment on the following comment : "It is not correct to say zero or more".
I wanted to remark that the text to which this comment was added simply did not contain the text "zero or more", so the author of the comment I wanted to comment on was criticizing someone else for something that hadn't been said.
I also wanted to comment that it is not correct to say that it is not correct to say "zero or more". Relational theory, as commonly known today among the few people who still bother to study the details of that theory, actually REQUIRES the possibility of a key with no attributes.
But when I pressed the button "comment", the system responded to me that commenting requires a reputation score of 50 (or some such).
A sad illustration of how the world seems to have forgotten that science is not democracy, and that in science, the truth is not determined by whoever happens to be the majority, nor by whoever happens to have "enough reputation".
Pros of having a single PK
Uniquely identifies a row with a single value
Makes it easy to reference the relationship from elsewhere if needed
Some tools want you to have a single integer value pk
Cons of having a single PK
Uses more disk space
Need 3 indexes rather than 1
Without a unique constraint you could end up with multiple rows for the same relationship
Notes
You need to define a unique constraint if you want to avoid duplicates
In my opinion, don't use the single pk if your table is going to be huge; otherwise trade off some disk space for the convenience. Yes it's wasteful, but who cares about a few MB on disk in real world applications?