I'm looking for a way to create a column that autoincrements the way the automatic :id column does. I could probably handle this somehow in the model, but that seems kludgey. I haven't found anything in stock Rails 3 that handles this; are there gems available that might handle this? I'm surprised it's not already an option, since Rails handles this behavior for primary key columns.
Normally auto-incrementing columns are implemented using database sequences. The advantage of using a sequence over calculating the next increment yourself is that getting the next value from a sequence is atomic: if multiple processes are creating new records, the sequence makes sure your numbers are really unique.
Sequences are available in PostgreSQL, Oracle, and other databases (MySQL uses AUTO_INCREMENT columns rather than standalone sequences).
Here is how to implement this if you are using Postgres, for instance:
create a sequence:
Operator.connection.execute("CREATE SEQUENCE #{sequence_name}")
set the start value of the sequence:
Operator.connection.execute("SELECT setval('#{sequence_name}', #{new_start_serial})")
select the next value from the sequence:
Integer(Operator.connection.select_value("SELECT nextval('#{sequence_name}')"))
Hope this helps.
If you really think you need this, you could create a before_create filter in the model that reads the attribute value of the last record and adds 1 to it. Feels hacky, though.
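A rough sketch of that filter (the number column is a hypothetical name, and note the race condition between the read and the write):

class Operator < ActiveRecord::Base
  before_create :assign_next_number

  private

  # Racy: two concurrent creates can read the same maximum
  # and end up with duplicate numbers.
  def assign_next_number
    self.number = (self.class.maximum(:number) || 0) + 1
  end
end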
I'm trying to figure out the best way to model a simple lookup in my rails app.
I have a model, Coupon, which can be of two "types", either Percent or Fixed Amount. I don't want to build a database table around the "type" lookup, so what I'd like to do is just have a coupon_type(integer) field on the Coupon table which can be either 1 (for Percent) or 2 (for Fixed).
What is the best way to handle this?
I found this: Ruby on Rails Static List Options for Drop Down Box, but I'm not sure how that would work when I want each value to have two fields, both an ID and a description.
I'd like this to populate a select list as well.
Thank you for the feedback!
If this is really unlikely to change, or if changing it would be an event significant enough to require redeployment anyway, the easiest approach is to have a constant that defines the mappings:
COUPON_TYPES = {
:percent => 1,
:fixed => 2
}
COUPON_TYPES_FOR_SELECT = COUPON_TYPES.to_a
The first constant defines the forward mapping; the second presents it in a format suitable for options_for_select.
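Usage might look something like this (a sketch; the coupon_type_name helper and the form builder f are assumptions):

class Coupon < ActiveRecord::Base
  COUPON_TYPES = { :percent => 1, :fixed => 2 }
  COUPON_TYPES_FOR_SELECT = COUPON_TYPES.to_a

  # Reverse lookup for display: 1 => :percent, 2 => :fixed
  def coupon_type_name
    COUPON_TYPES.invert[coupon_type]
  end
end

And in the view, to populate the select list:

<%= f.select :coupon_type, options_for_select(Coupon::COUPON_TYPES_FOR_SELECT) %>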
It's important to note that this sort of thing would take, at most, a millisecond to read from a database. Unless you're rendering hundreds of forms per second it would hardly impact performance.
Don't worry about optimizing things that aren't perceptible problems.
We are building ASP.NET MVC3 web applications using Visual Studio, SQL Server 2008 R2 & EF Code First 4.1.
Quite often we have smaller, what we call "lookup", tables. For example, a "Status" table containing an "Id" and a "Name". As the application grows these tables become quite frequent, and I would like to know the best way to "group" these less important tables away from the crux of the application.
It has been suggested to me to add a prefix like "LkStatus" to help me, but what about moving all the lookup tables out of dbo and into their own schema?
Can anyone see any drawbacks in this method?
Thanks Paul
No drawbacks with this method. I'm a fan of schemas personally, though I'd use Lookup as the name.
To move a table to the new schema:

ALTER SCHEMA Lookup TRANSFER dbo.SomeTable

(Note that ALTER AUTHORIZATION changes an object's owner, not its schema, so ALTER SCHEMA ... TRANSFER is the statement you want here.)
This comes down to preference; there really isn't a "gotcha" either way. I prefer a table prefix (we use LU_*) but wouldn't be bothered either way. As long as either option is enforced consistently, maintenance down the line will be easy.
Since the tables are small, what about grouping them together into a single table? Instead of using the table name as a pseudo-key, use a real key. For example, you could have a table called Lookup with Id, Type, Name and Value columns, where Type = 'Status' for your status values. Setting the clustered index to (Type, Name) would physically group all rows of the same type together, which makes it fast to read them all as a group if needed.
If your Names can have different data types, add an extra column for each required type: one for integers, one for strings, one for floats, etc. You can do something similar using an XML column; the T-SQL takes just a little more effort.
What's the best way to guarantee that a code is unique? The code is XXX-XXXXX where X is a number only.
Other than searching for the code in a database table, what ways are there to make the process faster and cleaner?
Regards.
The normal approach is to use a :uniqueness validation; that handles the database search for you.
More bulletproof is to combine that validation with a unique index on the column. If the save fails without validation errors, you can generate a new code and try again.
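A minimal sketch of that combination (the Voucher model and its code column are hypothetical; the unique index itself is added in a migration, sketched a little further down):

class Voucher < ActiveRecord::Base
  validates :code, :presence => true,
                   :uniqueness => true,
                   :format => { :with => /\A\d{3}-\d{5}\z/ }

  # Retry with a fresh code when the unique index rejects a
  # duplicate that slipped past the validation.
  def self.generate!
    create!(:code => "%03d-%05d" % [rand(1000), rand(100_000)])
  rescue ActiveRecord::RecordNotUnique
    retry
  end
end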
Since no two times are the same, using some kind of hash based on the time is an easy way to get uniqueness. If you are storing XXX-XXXXX, though, you are limiting yourself. You may also use a unique auto-incrementing value: store the next number to be assigned on the server side and increment it whenever you issue a new ID.
Both are acceptable options, absent additional information.
A hash based on time is actually not guaranteed to be unique. A hash is just a way to create a digest from a larger source; since all data is reduced to 128 bits (using MD5), hash collisions are possible.
The validates :uniqueness will do a query to determine whether the field's value has been used before. You can use this, but it should not be your only solution. If the field is intended to be unique, you should place a unique index on the column in the database. If you rely only on the Rails validation, you run the risk of a race condition on insertion: two concurrent writes can each pass the validation and both end up in the table.
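The unique index itself goes in a migration, something like this sketch (the vouchers table and code column are hypothetical names):

class AddUniqueIndexToVouchersCode < ActiveRecord::Migration
  def self.up
    add_index :vouchers, :code, :unique => true
  end

  def self.down
    remove_index :vouchers, :code
  end
end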
Are you generating the value or is it user input?
I'm using Rails3.rc and Active Record 3 (with meta_where) and just started to switch to Sequel, because it seems to be much faster and offers some really great features.
I'm already using the active_model plugin (and some others).
As far as I know, I should use User[params[:id]] instead of User.find(params[:id]). But this doesn't raise if no record exists, and it doesn't convert the value to an integer (the type of the PK), so it ends up as a string in the WHERE clause. I'm not sure whether this causes any performance issues. Does it harm identity_map? What's the best way to solve both of these issues?
Is there an easy way to flip the usage of associations like User.messages_dataset and User.messages, so that User.messages behaves like it does in Active Record (i.e. like User.messages_dataset)? I guess I'd use the #..._dataset form a lot but never need the array method, because I could just add .all.
I noticed some same (complex) queries are executed several times within one action sometimes. Is there something like the Active Record query cache? (identity_map doesn't seem to work for these cases).
Is there a to_sql I can call to get the raw SQL a dataset would produce?
You can either use:

User[params[:id].to_i] || raise(Sequel::Error)
Or write your own method that does something like that. Sequel supports non-integer primary keys, so it wouldn't do the conversion automatically. It shouldn't have any effect on an identity map. Note that Sequel doesn't use an identity map unless you are using the identity_map plugin. I guess the best way is to write your own helper method.
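For example, a hypothetical helper that mimics Active Record's find-or-raise behavior:

class User < Sequel::Model
  # Coerce the id to an integer, then raise if no row matches.
  def self.find!(id)
    self[Integer(id)] || raise(Sequel::Error, "no #{name} with pk #{id.inspect}")
  end
end

user = User.find!(params[:id])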
Not really. You can use the association_proxies plugin so that non-array methods are sent to the dataset instead of the array of objects. In general, you shouldn't be using the association dataset method much. If you are using it a lot, it's a sign that you should have an association for that specific usage.
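For example (a sketch; the read column is a hypothetical name):

Sequel::Model.plugin :association_proxies

user.messages.size                   # array method, answered from the cached array
user.messages.where(:read => false)  # anything else is forwarded to the dataset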
There is no query cache, and there never will be. You should write your actions so that the result of the first query is cached and reused.
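For example, memoizing the result keeps the same query from running twice in one action (a sketch with hypothetical names):

def unread_messages
  @unread_messages ||= current_user.messages_dataset.where(:read => false).all
end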
Dataset#sql gives you the SELECT SQL for the dataset.
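For example (the exact quoting of the output depends on the adapter):

ds = User.where(:active => true).order(:name)
ds.sql  # => "SELECT * FROM users WHERE (active IS TRUE) ORDER BY name"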
In my present Rails application, I am resolving scheduling conflicts by sorting the models by the "created_at" field. However, I realized that when inserting multiple models from a form that allows this, all of the created_at times are exactly the same!
This is more a question of best programming practices: Can your application rely on your ID column in your database to increment greater and greater with each INSERT to get their order of creation? To put it another way, can I sort a group of rows I pull out of my database by their ID column and be assured this is an accurate sort based on creation order? And is this a good practice in my application?
The generated identification numbers will be unique, regardless of whether you use sequences (as in PostgreSQL and Oracle) or another mechanism such as MySQL's auto-increment.
However, sequence numbers are often preallocated in blocks (for example 20 at a time, depending on the cache setting). So with PostgreSQL you cannot always tell which row was inserted first, and there may even be gaps in the IDs of inserted records. Therefore you shouldn't use a generated ID field for a task like this; it would rely on database implementation details.
Generating a created or updated column when the statement executes is a much better basis for sorting by creation or update time later on.
For example:
INSERT INTO A (data, created) VALUES ('something', NOW());
UPDATE A SET data = 'something', updated = NOW();
That depends on your database vendor.
MySQL, I believe, strictly orders auto-increment keys. I don't know for certain about SQL Server, but I believe it does as well.
Where you'll run into problems is with databases that don't support this functionality, most notably Oracle, which uses sequences that are roughly, but not absolutely, ordered.
An alternative might be to go for created time and then ID.
I believe the answer to your question is yes. Reading between the lines, I think you are concerned that the system may re-use ID numbers that are 'missing' from the sequence. If you had used 1, 2, 3, 5, 6, 7 as ID numbers, then in all the implementations I know of, the next ID will always be 8 (or possibly higher); I don't know of any DB that would notice that record ID #4 is missing and attempt to re-use it.
Though I am most familiar with SQL Server, I don't know why any vendor would try to fill the gaps in a sequence: think of the overhead of keeping a list of unused IDs, as opposed to just tracking the last number used and adding 1.
I'd say you can safely rely on each newly assigned ID being higher than the last, not just unique.
Yes, the ID will be unique, but no, you cannot and should not rely on it for sorting; it is there to guarantee row uniqueness only. The best approach is, as emktas indicated, to use a separate "updated" or "created" column for exactly this information.
For setting the creation time, you can just use a default value, like this:

CREATE TABLE foo (
  id INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
  created TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  updated TIMESTAMP NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB; # whatever :P
Now, that takes care of the creation time. For the update time I would suggest a BEFORE UPDATE trigger like this one (it must be BEFORE, since NEW cannot be modified in an AFTER UPDATE trigger; of course you could set the column in a separate query instead, but the trigger is, in my opinion, a better solution: more transparent):

DELIMITER $$
CREATE TRIGGER foo_b_upd BEFORE UPDATE ON foo
FOR EACH ROW BEGIN
  SET NEW.updated = NOW();
END$$
DELIMITER ;
And that should do it.
EDIT:
Woe is me. I foolishly failed to specify that this is for MySQL; other databases may differ in function names (namely NOW) and other subtle itty-bitty details.
One caveat to EJB's answer:
SQL gives no guarantee of ordering if you don't specify an ORDER BY clause. E.g. if you delete some early rows and then insert new ones, the new rows may end up stored in the same physical location in the database that the old ones occupied (albeit with new IDs), and that physical order is what the database may fall back on as its default sort.
FWIW, I typically use ORDER BY id as an effective version of ORDER BY created_at. It's cheaper because it doesn't require an index on a datetime column (which is bigger, and therefore slower, than a simple integer primary-key index); it's guaranteed to be different for every row; and I don't really care if a few rows that were added at about the same time sort in a slightly different order.
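In Rails terms, the two options look like this (Message is a hypothetical model):

Message.order(:id)          # creation order for practical purposes; uses the PK index
Message.order(:created_at)  # needs its own index on created_at to be fast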
This is probably DB-engine dependent. I would check how your DB implements sequences, and if there are no documented problems, I would rely on the ID.
E.g. a PostgreSQL sequence is OK unless you play with the sequence cache parameters.
There is a possibility that another programmer will manually create or copy records from a different DB with wrong ID values. However, I would simplify the problem: do not bother with low-probability cases where someone manually destroys data integrity. You cannot protect against everything.
My advice is to rely on sequence generated IDs and move your project forward.
In theory, yes: the highest ID number is the last created. Remember, though, that databases have the ability to temporarily turn off the insertion of the auto-generated value, insert some records manually, and then turn it back on. Such inserts are not typical on a production system, but they can happen occasionally when moving a large chunk of data from another system.