I am trying to move my application from Postgres to Oracle, and I am running into some surprises with Oracle sequence management while seeding initial data.
=> The objective is to run the same application on various databases (Postgres, Oracle, MSSQL), and this initial data (admin user, parameters, ...) is supposed to have specific ids, starting from 1 and assigned in the order of creation. Of course, for this specific purpose I could hardcode the ids.
=> Migration and seeding are done by
rake db:migrate RAILS_ENV=ORACLE
rake db:seed RAILS_ENV=ORACLE
The environments contain nothing specific apart from the relevant ActiveRecord adapter.
With Oracle, the seeded data ids do not start from 1 as expected (the behaviour in Postgres or MS SQL); they start at 10000.
Looking at the sequences created during the db migration, they all start at 10000 (LAST_NUMBER).
Is it an Oracle way, or is it an activerecord-oracle_enhanced-adapter way of doing things?
Why is it set like this?
Is there a way to start numbering from 1?
Is it an Oracle way, or is it an activerecord-oracle_enhanced-adapter way of doing things?
This is the adapter's way of doing things. Oracle will by default use the min value of the sequence that is created (so typically 1).
The adapter, as of version 1.6.7, sets this in oracle_enhanced_adapter.rb:
self.default_sequence_start_value = 10000
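If you want to change the default globally rather than per table, the line above looks like an ordinary writable class attribute, so overriding it in an initializer should work; note this is an assumption based on that snippet rather than documented behaviour:

# config/initializers/oracle_enhanced.rb
# Assumption: default_sequence_start_value is a writable class attribute on the
# adapter, so sequences created by later migrations should start at 1.
ActiveRecord::ConnectionAdapters::OracleEnhancedAdapter.default_sequence_start_value = 1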
Is there a way to start numbering from 1?
You can override this. Passing the :sequence_start_value option to the create_table method allows you to specify your own.
In a migration this might look like:
create_table :table_name, primary_key_trigger: true, sequence_start_value: 1 do |t|
  ...
end
IDs should have no business value. I would change your approach so that you don't care what value the DBMS assigns.
I would consider adding an additional key column, populated manually by a trigger and/or stored procedure, that starts at 1 and increments by 1.
The setup
We have some tables with very high id values, so in production they are bigints; this was achieved by running migrations that change the id columns with limit: 8. The methodology is outlined here: https://stackoverflow.com/a/5870148/2240218
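For reference, the technique from that answer is just a change_column with limit: 8; a rough sketch of such a migration (the events table name is made up here) looks like:

class ChangeEventIdsToBigint < ActiveRecord::Migration
  def up
    # limit: 8 makes PostgreSQL store the column as an 8-byte bigint
    change_column :events, :id, :integer, limit: 8
  end

  def down
    change_column :events, :id, :integer, limit: 4
  end
end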
Those migrations don't modify db/schema.rb, so when we run rake db:test:prepare, the test database is created with normal 4-byte integer columns which have a maximum of 2.1 billion (for what it's worth, we are using Postgres).
A note about our ids
For legacy reasons they are tied to foreign keys from a third-party system. Ideally the id column would be an internal surrogate primary key and the third-party key would live in a separate column entirely (which would remove this whole problem), but the overhead of that change is beyond what I'm trying to tackle at the moment.
The problem
I'm trying to put some integration tests in place with real-world data, and some of these records have an id larger than 2.1 billion. The tests will make some calls into these external systems (which we'll ultimately stub using VCR), so the ids need to be correct. However, when I try to use this data it blows up because the value is too large for the column in the test database.
So my question is: is there any non-massively-hacky way to ensure these id columns are bigints in the test database after running db:test:prepare?
Change the schema format from :ruby to :sql so that your schema dump is pure SQL. This should keep those large integers intact (as well as any stored procedures, etc. you might have).
In config/application.rb:
config.active_record.schema_format = :sql
http://guides.rubyonrails.org/active_record_migrations.html#types-of-schema-dumps
I've got a large web app which writes many millions of rows into partitioned tables in PostgreSQL each day (meaning there's a new table for each day's data).
We're using PostgreSQL's table inheritance and partitioning to speed things along.
Because there are years' worth of data in our DB, we can't effectively use insert triggers to route the content to the correct table (the functions are getting very, very long).
Long story short, we need ActiveRecord to know which table to insert and update data in, but not change the table that is used for selects and other DB tasks.
Obviously it's simple to define the table name for a model, but is it possible to override the table name for just particular actions?
Here's a little more detail:
Database:
Table: dashboard.impressions (id, host, data, created_on, etc)
Table: data.impressions_20120801 (inherits from dashboard.impressions, with a constraint that created_on equals the table's date)
Impression.create :host => "localhost", :data => "{...}", :created_on => DateTime.now should write to the data.impressions_20120801 table, whereas Impression.where(:host => "localhost") should search the dashboard.impressions table, since that contains all the data.
Edit: I'm running PostgreSQL 9.1 and Rails 3.2.6
I don't do Rails, so I can't help with the ActiveRecord side, but I can offer a pure Pg fallback solution in case you can't get ActiveRecord to do what you want. It will cost you a little insert performance, so it would be better to teach ActiveRecord to do the inserts to the right place.
Personally I'd just do the INSERTs directly via the pg gem and bypass ActiveRecord completely. If you can't do that, or if ActiveRecord caching means you shouldn't, try this alternative partitioning trigger implementation.
Instead of explicitly listing every partition in your trigger function, consider EXECUTE ... USING for the insertion and generate the partition name from your naming scheme. Something like this (untested):
CREATE OR REPLACE FUNCTION partition_trigger() RETURNS trigger AS $$
DECLARE
  target_partition text;
BEGIN
  IF tg_op = 'INSERT' THEN
    -- derive the partition name from NEW, e.g. from NEW.created_on
    target_partition := ( ... work out the partition name ... );
    EXECUTE 'INSERT INTO '||quote_ident(target_partition)||' (col1,col2) VALUES ($1, $2)'
      USING NEW.col1, NEW.col2;
  END IF;
  -- returning NULL keeps the row out of the parent table
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;
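The function still has to be attached to the parent table with a trigger. If you would rather drive that from Rails than from psql, a migration can issue the DDL directly; this is just a sketch, where the trigger name is made up and dashboard.impressions matches the parent table from the question:

class AddImpressionsPartitionTrigger < ActiveRecord::Migration
  def up
    execute <<-SQL
      CREATE TRIGGER impressions_partition_insert
      BEFORE INSERT ON dashboard.impressions
      FOR EACH ROW EXECUTE PROCEDURE partition_trigger();
    SQL
  end

  def down
    execute "DROP TRIGGER impressions_partition_insert ON dashboard.impressions"
  end
end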
In my Rails app I have a seeds.rb script which inserts lots of records. In fact I'm trying to load 16 million of them, and it's taking a long time.
One thing I wanted to try to speed this up is to drop the table indexes and re-add them afterwards. If it sounds like I'm doing something insane, please let me know, but that seems to be one recommendation for bulk loading into Postgres.
I use the add_index and remove_index commands in migrations, but the same syntax doesn't work in a seeds.rb file. Is it possible to do this outside a migration at all? (I imagine it might not be best practice, because it represents a schema change.)
rails v2.3.8,
postgres v8.4.8
One possibility is just to indulge in a little raw SQL within seeds.rb
ActiveRecord::Base.connection.execute("DROP INDEX myindex ON mytable")
At 16 million records, I would recommend managing the whole thing via raw SQL (contained within seeds.rb if you like). Do all 16 million records go into a single table? There ought to be some PostgreSQL magic to bulk import a file (in a PostgreSQL specific format) into a table.
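That PostgreSQL-specific magic is the COPY command, which loads a delimited file server-side far faster than row-by-row INSERTs. A hedged sketch, assuming the file already sits on the database server and using placeholder table, column and path names:

ActiveRecord::Base.connection.execute(
  "COPY mytable (col1, col2) FROM '/path/to/seed_data.csv' WITH CSV"
)

Note that COPY ... FROM a file path runs on the server and needs the appropriate privileges; psql's \copy is the client-side equivalent.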
I'm looking for a way to create a column that autoincrements the way the automatic :id column does. I could probably handle this somehow in the model, but that seems kludgey. I haven't found anything in stock Rails 3 that handles this; are there gems available that might handle this? I'm surprised it's not already an option, since Rails handles this behavior for primary key columns.
Normally auto-incrementing columns are implemented using database sequences. The advantage of using a sequence over calculating the next increment, is that getting the next value from a sequence is atomic. So if you have multiple processes creating new elements, the sequence will make sure your numbers are really unique.
Sequences can be used in PostgreSQL, Oracle, MySQL, ...
Here is how to implement this if you are using Postgres, for instance (a fuller sketch follows these snippets):
select the next value from the sequence:
Integer(Operator.connection.select_value("SELECT nextval('#{sequence_name}')"))
create a sequence:
Operator.connection.execute("CREATE sequence #{sequence_name}")
set the start-value of a sequence :
Operator.connection.execute("SELECT setval('#{sequence_name}', #{new_start_serial})")
Hope this helps.
If you really think you need this, you could add a before_create filter in the model that checks the last record's attribute value and adds 1 to it. It feels hacky, though.
Is there anything (warnings, advice) I should know if I want to develop an inventory management system using Ruby on Rails? The biggest problem I can think of is how to do long calculations on the stock. The other one is how to cache stock counts. BTW, I'll be using MySQL as the database. Thanks in advance.
I think there is no reason not to write it with Rails.
For caching stock counts, Rails has a feature called counter_cache, which keeps the number of associated records in a column.
As for doing big calculations on stock, I don't see why this should be a problem.
And if it turns out to be heavy work, you can push it into a background worker.
There is no argument that speaks against using Ruby on Rails for that.
If you want to do big calculations at the database level (like SUM), be sure to use BIGINT explicitly in your migrations for that column, as a signed MySQL INT supports a maximum of 2,147,483,647 and MySQL will compute the result of your calculation in the same data type.
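As a hedged example of what that migration could look like, with a made-up stock_movements table and quantity column:

class WidenStockMovementQuantity < ActiveRecord::Migration
  def up
    # limit: 8 maps to a signed BIGINT column in MySQL
    change_column :stock_movements, :quantity, :integer, limit: 8
  end

  def down
    change_column :stock_movements, :quantity, :integer, limit: 4
  end
end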
To keep track of cached stock counts, use counter_cache
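A minimal counter_cache sketch, where Warehouse, StockItem and the stock_items_count column are invented for illustration:

class AddStockItemsCountToWarehouses < ActiveRecord::Migration
  def up
    add_column :warehouses, :stock_items_count, :integer, default: 0, null: false
  end

  def down
    remove_column :warehouses, :stock_items_count
  end
end

class Warehouse < ActiveRecord::Base
  has_many :stock_items
end

class StockItem < ActiveRecord::Base
  # Rails keeps warehouses.stock_items_count up to date on create/destroy
  belongs_to :warehouse, counter_cache: true
end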