Mass inserting into a model in Rails: how to auto-increment the id field?

I have a model for stocks and a model for stock_price_history.
I want to mass insert with this
sqlstatement = "INSERT INTO stock_histories SELECT datapoint1 AS id,
datapoint2 AS `date` ...UNION SELECT datapoint9,10,11,12,13,14,15,16,
UNION SELECT datapoint 17... etc"
ActiveRecord::Base.connection.execute sqlstatement
However, I don't actually want to use datapoint1 AS id. If I leave it blank I get an error that my model has 10 fields and I'm inserting only 9 and that it is missing the primary key.
Is there a way to force an auto increment on the id when inserting by SQL?
Edit: Bonus question because I'm a noob. I am developing in SQLite3 and deploying to Postgres (i.e. Heroku). Will I need to modify the above mass insert statement so that it works on a Postgres database?
2nd edit: my initial question had Assets and AssetHistory instead of Stocks and Stock_Histories. I changed it to Stocks / Stock price histories because I thought it was more intuitive to understand. Therefore some answers refer to Asset Histories for this reason.

You can change your SQL and be more explicit about which fields you're inserting, and leave id out of the list:
insert into asset_histories (date) select datapoint2 as `date` ...etc
Here's a long real example:
jim=# create table test1 (id serial not null, date date not null, name text not null);
NOTICE: CREATE TABLE will create implicit sequence "test1_id_seq" for serial column "test1.id"
CREATE TABLE
jim=# create table test2 (id serial not null, date date not null, name text not null);
NOTICE: CREATE TABLE will create implicit sequence "test2_id_seq" for serial column "test2.id"
CREATE TABLE
jim=# insert into test1 (date, name) values (now(), 'jim');
INSERT 0 1
jim=# insert into test1 (date, name) values (now(), 'joe');
INSERT 0 1
jim=# insert into test1 (date, name) values (now(), 'bob');
INSERT 0 1
jim=# select * from test1;
id | date | name
----+------------+------
1 | 2013-03-14 | jim
2 | 2013-03-14 | joe
3 | 2013-03-14 | bob
(3 rows)
jim=# insert into test2 (date, name) select date, name from test1 where name <> 'jim';
INSERT 0 2
jim=# select * from test2;
id | date | name
----+------------+------
1 | 2013-03-14 | joe
2 | 2013-03-14 | bob
(2 rows)
As you can see, only the selected rows were inserted, and they were assigned new id values in table test2. You'll have to be explicit about all the fields you want to insert, and ensure that the ordering of the insert and the select match.
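The same idea can be sketched from the Ruby side: build one statement that names every column except id, so the database fills the id in. This is only a minimal sketch. The helper build_bulk_insert, the price column, and the sample values are assumptions made up for illustration, and the quoting is deliberately naive. Multi-row VALUES lists work in PostgreSQL and in reasonably recent SQLite versions, but check this against the versions you actually run.

```ruby
require "date"

# Hypothetical helper (not part of Rails): builds one multi-row INSERT that
# lists every column except id, so the database assigns the ids itself.
def build_bulk_insert(table, columns, rows)
  quote = lambda do |value|
    case value
    when nil     then "NULL"
    when Numeric then value.to_s
    else "'#{value.to_s.gsub("'", "''")}'" # naive quoting, illustration only
    end
  end

  tuples = rows.map { |row| "(#{row.map(&quote).join(', ')})" }
  "INSERT INTO #{table} (#{columns.join(', ')}) VALUES #{tuples.join(', ')}"
end

sql = build_bulk_insert(
  "stock_histories",
  %w[date price], # assumed column names; id is deliberately absent
  [[Date.new(2013, 3, 14), 10.5], [Date.new(2013, 3, 15), 11.0]]
)
puts sql
```

In a Rails app the resulting string could be passed to ActiveRecord::Base.connection.execute as in the question. For real data it is safer to quote values with the connection's own quote method than with the hand-rolled lambda used here.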
Having said all that, you might want to look into the activerecord-import gem, which makes this sort of thing a lot more Railsy. Assuming you have a bunch of new AssetHistory objects (not persisted yet), you could insert them all with:
asset_histories = []
asset_histories << AssetHistory.new(date: some_date)
asset_histories << AssetHistory.new(date: some_other_date)
AssetHistory.import asset_histories
That will generate a single efficient insert into the table, and handle the id for you. You'll still need to query some data and construct the objects, which may not be faster than doing it all with raw SQL, but may be a better alternative if you've already got the data in Ruby objects.

Related

Sorting by rank and total where multiple entries may exist

Ruby 2.1.5
Rails 4.2.1
My model is contributions, with the following fields:
event, contributor, date, amount
The table would have something like this:
earth_day, joe, 2014-04-14, 400
earth_day, joe, 2015-05-19, 400
lung_day, joe, 2015-05-20, 800
earth_day, john, 2015-05-19, 600
lung_day, john, 2014-04-18, 900
lung_day, john, 2015-05-21, 900
I have built an index view that shows all these fields and I implemented code to sort (and reverse order) by clicking on the column titles in the Index view.
What I would like to do is have the Index view displayed like this:
Event Contributor Total Rank
Where event is only listed once per contributor and the total is sum of all contributions for this event by the contributor and rank is how this contributor ranks relative to everyone else for this particular event.
I am toying with having a separate table where only a running tally is kept for each event/contributor and a piece of code to compute rank and re-insert it in the table, then use that table to drive views.
Can you think of a better approach?
Keeping a running tally is a fine option. Writes will slow down, but reads will be fast.
Another way is to create a database view, if you are using PostgreSQL, something like:
-- Your table structure and data
create table whatever_table (event text, contributor text, amount int);
insert into whatever_table values ('e1', 'joe', 1);
insert into whatever_table values ('e2', 'joe', 1);
insert into whatever_table values ('e1', 'jim', 0);
insert into whatever_table values ('e1', 'joe', 1);
insert into whatever_table values ('e1', 'bob', 1);
-- Your view
create view event_summary as (
select
event,
contributor,
sum(amount) as total,
rank() over (order by sum(amount) desc) as rank
from whatever_table
group by event, contributor
);
-- Using the view
select * from event_summary order by rank;
event | contributor | total | rank
-------+-------------+-------+------
e1 | joe | 2 | 1
e1 | bob | 1 | 2
e2 | joe | 1 | 2
e1 | jim | 0 | 4
(4 rows)
Then you have an ActiveRecord class like:
class EventSummary < ActiveRecord::Base
self.table_name = :event_summary
end
and you can do stuff like EventSummary.order(rank: :desc) and so on. This won't slow down writes, but reads will be a little slower, depending on how much data you are working with.
PostgreSQL also has support for materialized views, which could give you the best of both worlds, assuming you can have a little bit of lag between when the data is entered and when the summary table is updated.
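For small data sets, the same summary can also be computed in plain Ruby, which makes the ranking logic easy to see. The sketch below is an assumption-laden illustration rather than a replacement for the view: the Contribution struct and the event_summary method are hypothetical names, and the sample rows are the ones from the question. Note that it ranks contributors within each event, which is what the question asks for. The view above ranks across all events, and adding a partition by event clause to its window would be the SQL counterpart.

```ruby
# Hypothetical in-memory stand-in for rows of the contributions table.
Contribution = Struct.new(:event, :contributor, :amount)

# Sums contributions per (event, contributor) and ranks each contributor
# within their event; ties share a rank, like SQL's rank().
def event_summary(contributions)
  totals = Hash.new(0)
  contributions.each { |c| totals[[c.event, c.contributor]] += c.amount }

  totals.map do |(event, contributor), total|
    # Rank = 1 + number of contributors with a larger total in the same event.
    higher = totals.count do |key, other_total|
      key.first == event && other_total > total
    end
    { event: event, contributor: contributor, total: total, rank: higher + 1 }
  end
end

contributions = [
  Contribution.new("earth_day", "joe", 400),
  Contribution.new("earth_day", "joe", 400),
  Contribution.new("lung_day", "joe", 800),
  Contribution.new("earth_day", "john", 600),
  Contribution.new("lung_day", "john", 900),
  Contribution.new("lung_day", "john", 900)
]

summary = event_summary(contributions)
summary.sort_by { |row| [row[:event], row[:rank]] }.each do |row|
  puts row.values.join(" | ")
end
```

This scans the totals once per row, so it is only reasonable for modest amounts of data. The database view or a running tally scales better.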

Generate a Rails model from within code (invoke generator from a controller)

I have a particular need to be able to invoke a Rails command from within code (i.e. when some action occurs). I need a model to be created with specific fields (which will be determined by a form) and to run the created migration at the end.
So this form would define all the fields, which would then result in the creation of a model with those fields (table and columns).
So, is there a way to invoke rails generate model NAME [field[:type][:index] field[:type]] and bundle exec rake db:migrate from within a controller/Ruby code?
Rather than having one table per category, here's a more relational-database-y approach:
create table category (
id serial primary key,
name text not null
);
create table attribute (
id serial primary key,
name text not null
);
create table item (
id serial primary key,
category_id integer not null references category (id),
description text
);
create table category_attribute (
attribute_id integer not null references attribute (id),
category_id integer not null references category (id)
);
create table item_attribute (
attribute_id integer not null references attribute (id),
item_id integer not null references item (id),
value text
);
When you create a category, you store its name (and any other one-to-one attributes) in the category table. You make sure the attribute table has an entry for every attribute the category has, and then use the category_attribute table to link those attributes to that category.
When you add a new member of a category, you use the item table to store the main things about the item and the item_attribute table to store the values of each of its attributes. So with the cars and pets approach you mentioned, your database might look like
category
id | name
----+------
1 | car
2 | pet
attribute
id | name
----+------------
1 | make
2 | breed
3 | model_year
4 | name
category_attribute
attribute_id | category_id
--------------+-------------
1 | 1
2 | 2
3 | 1
4 | 2
item
id | category_id | description
----+-------------+----------------
1 | 1 | Hyundai Accent
2 | 2 | Fuzzy kitty
item_attribute
attribute_id | item_id | value
--------------+---------+---------
1 | 1 | Hyundai
3 | 1 | 2007
2 | 2 | DSH
4 | 2 | Sam
This approach can feel pretty non-obvious, because it doesn't match the "one object with many attributes" style you use with Rails models. This is how relational databases work, however. I believe there's some ActiveRecord magic you can do to make the object/relational translation a little more automatic, but I don't remember what it's called at the moment.
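To make the object/relational translation concrete, here is a rough Ruby sketch of turning item_attribute rows back into one hash of attributes per item. The ItemAttribute struct and the attributes_by_item method are hypothetical names, and the rows are assumed to come from joining item_attribute to attribute (the values are the ones in the example tables above).

```ruby
# Hypothetical shape of one row from joining item_attribute to attribute.
ItemAttribute = Struct.new(:item_id, :attribute_name, :value)

rows = [
  ItemAttribute.new(1, "make", "Hyundai"),
  ItemAttribute.new(1, "model_year", "2007"),
  ItemAttribute.new(2, "breed", "DSH"),
  ItemAttribute.new(2, "name", "Sam")
]

# Groups the flat rows into { item_id => { attribute_name => value } }.
def attributes_by_item(rows)
  rows.each_with_object({}) do |row, result|
    (result[row.item_id] ||= {})[row.attribute_name] = row.value
  end
end

items = attributes_by_item(rows)
puts items[1].inspect
puts items[2].inspect
```

All values come back as text because the value column is text. Converting "2007" to an integer would need extra type information, for example a type column on attribute, which the schema above does not have.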

Creating a PostgreSQL sequence to a field (which is not the ID of the record)

I am working on a Ruby on Rails app. We are using a PostgreSQL database.
There is a table named scores with the following columns:
Column | Type
--------------+-----------------------
id | integer
value | double precision
ran_at | timestamp
active | boolean
build_id | bigint
metric_id | integer
platform_id | integer
mode_id | integer
machine_id | integer
higher_better | boolean
job_id | integer
variation_id | integer
step | character varying(255)
I need to add a sequence to job_id (note: there is no model for job).
How do I create this sequence?
Use CREATE SEQUENCE:
CREATE SEQUENCE scores_job_id_seq; -- = default name for a plain serial
Then add a column default to scores.job_id:
ALTER TABLE scores ALTER COLUMN job_id SET DEFAULT nextval('scores_job_id_seq');
If you want to bind the sequence to the column (so it is deleted when the column is deleted), also run:
ALTER SEQUENCE scores_job_id_seq OWNED BY scores.job_id;
All of this can be replaced with using the pseudo data type serial for the column job_id to begin with:
Safely and cleanly rename tables that use serial primary key columns in Postgres?
If your table already has rows, you may want to set the SEQUENCE to the next highest value and fill in missing serial values in the table:
SELECT setval('scores_job_id_seq', COALESCE(max(job_id), 1)) FROM scores;
Optionally:
UPDATE scores
SET job_id = nextval('scores_job_id_seq')
WHERE job_id IS NULL;
How to check a sequence efficiently for used and unused values in PostgreSQL
Postgres manually alter sequence
How to reset postgres' primary key sequence when it falls out of sync?
The only remaining difference: a serial column is also set to NOT NULL. You may or may not want that, too:
ALTER TABLE scores ALTER COLUMN job_id SET NOT NULL;
But you cannot just alter the type of an existing integer:
ALTER TABLE scores ALTER job_id TYPE serial;
serial is not an actual data type. It's just a notational convenience feature for CREATE TABLE.
In Postgres 10 or later consider an IDENTITY column:
Auto increment table column
So I figured out how to do this using ActiveRecord migrations on Ruby on Rails. I basically used Erwin's commands and help from this page and put them in the migration files. These are the steps:
1. In the terminal, type:
rails g migration CreateJobIdSequence
rails g migration AddJobIdSequenceToScores
2. Edit the migration files as follows:
20140709181616_create_job_id_sequence.rb :
class CreateJobIdSequence < ActiveRecord::Migration
  def up
    execute <<-SQL
      CREATE SEQUENCE job_id_seq;
    SQL
  end

  def down
    execute <<-SQL
      DROP SEQUENCE job_id_seq;
    SQL
  end
end
20140709182313_add_job_id_sequence_to_scores.rb :
class AddJobIdSequenceToScores < ActiveRecord::Migration
  def up
    execute <<-SQL
      ALTER SEQUENCE job_id_seq OWNED BY scores.job_id;
      ALTER TABLE scores ALTER COLUMN job_id SET DEFAULT nextval('job_id_seq');
    SQL
  end

  # Reverse both statements from up: remove the default, then detach the sequence.
  def down
    execute <<-SQL
      ALTER TABLE scores ALTER COLUMN job_id DROP DEFAULT;
      ALTER SEQUENCE job_id_seq OWNED BY NONE;
    SQL
  end
end
3. Migrate the database. In the terminal, type:
rake db:migrate

Updating inserted record within MERGE statement in SQL Server 2008 R2

I have the following code in my SQL Server 2008 R2 stored procedure. In that stored procedure, I am copying one city to another city along with its families and persons.
Here I maintain each family's source and target id in @FamilyIdMap.
The left column indicates the line numbers of the code.
-- Copy Person
1> DECLARE @PersonIdMap table (TargetId int, SourceId int)
2> MERGE Person as PersonTargetTable
3> USING (SELECT PersonID, FamilyID, PersonName, ParentID FROM Person
4> WHERE FamilyID in (SELECT FamilyID from Family where FamilyName like '%DA%'))
5> AS PersonSourceTable ON (0=1)
6> WHEN NOT MATCHED THEN
7> INSERT(FamilyID, PersonName, ParentID)
8> VALUES
9> ((SELECT TOP 1 TargetID from @FamilyIdMap WHERE SourceID=FamilyID),PersonName,
10> ParentID) OUTPUT
11> INSERTED.PersonID, PersonSourceTable.PersonID
12> INTO @PersonIdMap;
It gives the output like this:
Source Table
PersonID FamilyID PersonName ParentID
1 1 ABC Null
2 1 Son of ABC 1
3 1 Son of ABC 1
4 2 XYZ NULL
5 2 Son of XYZ 4
Target Table (Copied from Source Table using above given code)
PersonID FamilyID PersonName ParentID
6 1 ABC Null
7 1 Son of ABC 1 <-- ParentID Remains as it is
8 1 Son of ABC 1 <--
9 2 XYZ NULL
10 2 Son of XYZ 4 <--
The problem with the above output is that the ParentID is not updated. I want the output to be this:
Expected Target Table
PersonID FamilyID PersonName ParentID
6 1 ABC Null
7 1 Son of ABC 6 <-- ParentID should be updated
8 1 Son of ABC 6 <--
9 2 XYZ NULL
10 2 Son of XYZ 9 <--
I know the problem is at line 10 of the code
10> ParentID) OUTPUT
but what should I replace ParentID with to update it? Thanks in advance.
What you are trying to do cannot be done in a single step in SQL Server 2008 R2.
Updating the ParentId has to be a second step, as you cannot access OUTPUT values in one row that were the result of the insert of another row. However, you are already collecting the information for the second step. So, you just need to add a simple update.
IF OBJECT_ID('dbo.Person') IS NOT NULL DROP TABLE dbo.Person;
IF OBJECT_ID('dbo.Family') IS NOT NULL DROP TABLE dbo.Family;
CREATE TABLE dbo.Family(FamilyID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, FamilyName NVARCHAR(60));
CREATE TABLE dbo.Person(PersonID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, FamilyID INT REFERENCES dbo.Family(FamilyID), PersonName NVARCHAR(60), ParentID INT);
INSERT INTO dbo.Family(FamilyName) VALUES
('DA1'),
('DA2');
INSERT INTO dbo.Person(FamilyID, PersonName, ParentID) VALUES
(1, 'ABC', NULL),
(1, 'Son of ABC', 1),
(1, 'Son of ABC', 1),
(2, 'XYZ', NULL),
(2, 'Son of XYZ', 4 );
DECLARE @FamilyIdMap table (TargetId int, SourceId int)
MERGE dbo.Family tf
USING (SELECT * FROM dbo.Family WHERE FamilyName like '%DA%') AS sf
ON 1=0
WHEN NOT MATCHED THEN
INSERT (FamilyName)
VALUES(sf.FamilyName)
OUTPUT INSERTED.FamilyID, sf.FamilyID
INTO @FamilyIdMap;
DECLARE @PersonIdMap table (TargetId int, SourceId int)
MERGE dbo.Person as tp
USING (SELECT p.PersonID, p.FamilyID, p.PersonName, p.ParentID, fm.SourceId,fm.TargetId FROM Person AS p
INNER JOIN @FamilyIdMap AS fm
ON p.FamilyID = fm.SourceId) AS sp
ON (0=1)
WHEN NOT MATCHED THEN
INSERT(FamilyID, PersonName, ParentID)
VALUES
(sp.TargetId,PersonName, ParentID) OUTPUT
INSERTED.PersonID, sp.PersonID
INTO @PersonIdMap;
UPDATE p SET
ParentID = pm.TargetId
FROM dbo.Person AS p
JOIN @PersonIdMap pm
ON pm.SourceId = p.ParentID
WHERE EXISTS(SELECT 1 FROM @PersonIdMap pmf WHERE pmf.TargetId = p.PersonID);
SELECT * FROM dbo.Family;
SELECT * FROM @FamilyIdMap;
SELECT * FROM dbo.Person;
SELECT * FROM @PersonIdMap;
I added code to create and fill the @FamilyIdMap table. I also cleaned up your original MERGE a little. It now uses the @FamilyIdMap table to select the rows instead of joining to the dbo.Family table again. If you run this only on a small subset of families, this should be faster. If you have a lot of families and you copy them all, going against the dbo.Family table again might be faster.
The final UPDATE updates only new rows in the Person table (all newly created PersonIds can be found in the TargetId column of the @PersonIdMap table), changing old ParentId values to new ParentId values using the information in the @PersonIdMap table.
I did not include transaction management, but at least the MERGE dbo.Person and the following UPDATE dbo.Person should be executed inside the same transaction.
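The two-step logic (copy first while recording a source-to-target id map, then rewrite ParentID on the new rows only) can be sketched outside the database as well. The following Ruby is only an illustration of the idea under assumed names: the Person struct and copy_people method are hypothetical, and the family id map {1 => 3, 2 => 4} assumes the two copied families received the next two identity values.

```ruby
# Hypothetical in-memory stand-in for rows of the Person table.
Person = Struct.new(:id, :family_id, :name, :parent_id)

# Copies every person whose family appears in family_id_map and returns the
# copies with parent ids pointing at the copied parents.
def copy_people(people, family_id_map)
  next_id = people.map(&:id).max.to_i + 1
  person_id_map = {} # source PersonID => target PersonID

  # Step 1: insert the copies, carrying the old parent_id over unchanged.
  copies = people.select { |p| family_id_map.key?(p.family_id) }.map do |source|
    copy = Person.new(next_id, family_id_map[source.family_id], source.name, source.parent_id)
    person_id_map[source.id] = copy.id
    next_id += 1
    copy
  end

  # Step 2: now that every new id is known, remap parent_id on new rows only.
  copies.each do |copy|
    copy.parent_id = person_id_map.fetch(copy.parent_id, copy.parent_id)
  end
  copies
end

people = [
  Person.new(1, 1, "ABC", nil),
  Person.new(2, 1, "Son of ABC", 1),
  Person.new(3, 1, "Son of ABC", 1),
  Person.new(4, 2, "XYZ", nil),
  Person.new(5, 2, "Son of XYZ", 4)
]

copies = copy_people(people, { 1 => 3, 2 => 4 })
copies.each { |c| puts [c.id, c.family_id, c.name, c.parent_id.inspect].join(" | ") }
```

The remap cannot happen inside step 1 in general, because a child may be copied before its parent has been assigned a new id. That is the same reason the SQL needs a separate UPDATE after the MERGE.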

Newly assigned Sequence is not working

In PostgreSQL, I created a new table and assigned a new sequence to the id column. If I insert a record from the PostgreSQL console it works, but when I try to import a record from Rails, it raises an exception saying that it is unable to find the associated sequence.
Here is the table:
\d+ user_messages;
Table "public.user_messages"
Column | Type | Modifiers | Storage | Description
-------------+-----------------------------+------------------------------------------------------------+----------+-------------
id | integer | not null default nextval('new_user_messages_id'::regclass) | plain |
But when I try to get the sequence with the SQL query which Rails uses, it returns NULL:
select pg_catalog.pg_get_serial_sequence('user_messages', 'id');
pg_get_serial_sequence
------------------------
(1 row)
The error being raised by Rails is:
UserMessage.import [UserMessage.new]
NoMethodError: undefined method `split' for nil:NilClass
from /app/vendor/bundle/ruby/1.9.1/gems/activerecord-3.2.3/lib/active_record/connection_adapters/postgresql_adapter.rb:910:in `default_sequence_name'
This problem only occurs when I use the ActiveRecord extension for importing bulk records. Single records are saved fine through ActiveRecord.
How do I fix it?
I think your problem is that you set all this up by hand rather than by using a serial column. When you use a serial column, PostgreSQL will create the sequence, set up the appropriate default value, and ensure that the sequence is owned by the table and column in question. From the fine manual:
pg_get_serial_sequence(table_name, column_name)
get name of the sequence that a serial or bigserial column uses
But you're not using serial or bigserial so pg_get_serial_sequence won't help.
You can remedy this by doing:
alter sequence new_user_messages_id owned by user_messages.id
I'm not sure if this is a complete solution and someone (hi Erwin) will probably fill in the missing bits.
You can save yourself some trouble here by using serial as the data type of your id column. That will create and hook up the sequence for you.
For example:
=> create sequence seq_test_id;
=> create table seq_test (id integer not null default nextval('seq_test_id'::regclass));
=> select pg_catalog.pg_get_serial_sequence('seq_test','id');
pg_get_serial_sequence
------------------------
(1 row)
=> alter sequence seq_test_id owned by seq_test.id;
=> select pg_catalog.pg_get_serial_sequence('seq_test','id');
pg_get_serial_sequence
------------------------
public.seq_test_id
(1 row)
