How can I safely assign an incrementing ID to a subset of objects? - ruby-on-rails

I have an Order object which can be in an unpaid or paid state. When an order is paid, I want to set an order_number which should be an incrementing number.
Sounds easy enough, but I'm worried about collisions. I can imagine one order holding an order_number in memory, about to save, when another order saves itself using that same number. At that point the one in memory should be recalculated, but how?

You can create a database table that just contains an AUTO_INCREMENT primary key. When you need a new order_number, just do an insert into this table and read the value of the primary key for the created row.
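A minimal sketch of that approach, assuming MySQL, a dedicated order_numbers table, and an order_number column on orders (all names here are hypothetical):

# CREATE TABLE order_numbers (id INT AUTO_INCREMENT PRIMARY KEY);
class Order < ActiveRecord::Base
  def assign_order_number!
    conn = self.class.connection
    # Each INSERT atomically reserves the next AUTO_INCREMENT value, so
    # two concurrent requests can never be handed the same number.
    conn.execute("INSERT INTO order_numbers () VALUES ()")
    # LAST_INSERT_ID() is per-connection, so it's concurrency-safe too.
    update!(order_number: conn.select_value("SELECT LAST_INSERT_ID()"))
  end
end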

There are a lot of approaches. Essentially you need a lock to ensure that each request to the counter always returns a different value.
Memcache, Redis and some other key-value stores have this kind of counter feature. E.g., each time you want a new order_number, just call the INCR command of Redis; it will increment the counter and return the new value.
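For example, with the redis-rb gem (a sketch; the key name and the order variable are placeholders):

require "redis"

redis = Redis.new
# INCR is atomic on the Redis server, so concurrent callers always
# receive distinct, monotonically increasing values.
order.update!(order_number: redis.incr("order_number_counter"))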
A more complex solution can be implemented via the trigger/stored procedure/sequence features of an RDBMS (like MySQL). For MySQL, create a new table containing only an AUTO_INCREMENT primary key. When you want a new order_number, insert into this table and read last_insert_id(). If you want ACID, just wrap the procedure in a transaction.

Related

Primary and Foreign Key in DW tables

I've read that dimension tables hold the primary key and fact tables contain the foreign key which references the primary key of dimension tables.
Now the confusion I am having is this - suppose I have an ETL pipeline which populates the dimension table (let's say customer) from a source (say another DB). Let's assume this is a frequently changing table and has over 200 columns. How do I incorporate these changes in the dimension tables? I want to have only the latest record for each customer (type 1 SCD) in the DWH.
One thing what I could do is delete the row in the dimension table and re-insert the new updated row. But this approach won't work because of the primary key - foreign key constraint (which will not allow me to delete the record).
Should I write an update statement with all 200 columns in the ETL script? Or is there any other approach?
Strictly speaking you just need to update the fields that changed. But the cost of updating all columns in a single row is probably similar (assuming it’s row-based storage), and it’s probably easier to write.
You can’t delete and re-insert, as the new row will have a new PK and old facts will no longer be linked.
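For illustration, a hedged sketch of that type 1 overwrite as it might look from a Rails ETL script, assuming Postgres-flavored SQL, a staging table customer_stage loaded by the pipeline, and a natural key customer_id (all names hypothetical). Only existing dimension rows are rewritten in place, so the surrogate PK that the fact tables reference never changes:

# Type 1 SCD: overwrite the dimension row with the latest staged values.
ActiveRecord::Base.connection.execute(<<~SQL)
  UPDATE customer_dim d
  SET    name  = s.name,
         email = s.email   -- ...and so on for the remaining columns
  FROM   customer_stage s
  WHERE  d.customer_id = s.customer_id
SQL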

Retrieve data quickly and efficiently

I am using Ruby on Rails and PostgreSQL as a DB.
My users are subscribed to a plan, and based on their plan they are allowed a number of resources. I have a DB table which keeps track of their usage. I need to check their usage on multiple actions, and I would like to know if there is a best practice for working with this data (storing and retrieving).
My table:
UserStats: id(pk), projects_left, keys_left, user_id
Usually on create actions I retrieve the data and then update the UserStats table; there are also many places where I just do a select on the table.
If resources are also stored as database tables, you can consider creating a BEFORE INSERT trigger on those tables which causes the insert to fail when a user exceeds their limit.
This way you never need to update UserStats; you just store each user's maximum allowed.
I believe it's less error-prone, handles deletes without extra code, and allows other apps to modify the DB.
E.g.:
CREATE OR REPLACE FUNCTION check_limits_projects() RETURNS TRIGGER AS $$
DECLARE
  my_projects_count INT;
  my_projects_limit INT;
BEGIN
  SELECT count(*) INTO my_projects_count FROM projects WHERE user_id = NEW.user_id;
  SELECT projects_left INTO my_projects_limit FROM UserStats WHERE user_id = NEW.user_id;
  IF my_projects_count >= my_projects_limit THEN
    -- Raising an exception aborts the INSERT, which is the desired behaviour
    RAISE EXCEPTION 'project limit reached for user %', NEW.user_id;
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER limit_check BEFORE INSERT ON projects
  FOR EACH ROW EXECUTE PROCEDURE check_limits_projects();

Rails model.create set id

I wonder if it's possible to run Model.create() such that instead of taking the next free id integer it takes the lowest free integer.
For example, assume we have records for id=10..20 and no records for id=0..9. I want to create an instance of Model with an id starting from 0 (a normal Model.create() would create an instance starting from 21).
Preferably I want to do it in an automatic manner; I don't want to change the id by explicitly defining it.
DB
You'll be best doing this at the database level (look at altering the auto-increment number).
Although I think you can do this in Rails, I would highly recommend using the DB functionality to make it happen. You can do something like this in phpMyAdmin (for MySQL):
If you set the auto-increment to the number you wish to start at, every time you save data into the DB, it will just save with that number. I think using any Rails-based method will just overcomplicate things unnecessarily.
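For instance, a sketch assuming MySQL and a hypothetical models table:

# Note: MySQL silently bumps an AUTO_INCREMENT value lower than the
# current maximum id back up to max(id) + 1, so this resets the counter
# rather than letting you reuse gaps below existing rows.
ActiveRecord::Base.connection.execute("ALTER TABLE models AUTO_INCREMENT = 1")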
I'd discourage it.
Those ids serve solely as unique identifiers for rows in a table, and it's the database's job to assign one. You can verify that the model doesn't require an id to be saved:
m = Model.new
# populate m with data
m.name = "Name"
# look at what m contains
m
# and save it
m.save
# now inspect it again and see it got its unique id
m
While it might be possible to modify ids, it's not good practice to give ids any meaning beyond row identity. When each new record gets a fresh unique id at all times, it's easier to debug DB structure errors that might occur during development. Like, say, some associated objects suddenly showing up in a new user's account. Weird enough, right? That can happen and, worst case, can show up in production, resulting in a severe security breach.
Keeping ids unique at all times eliminates this bug's effect. That seems much more important if the associated objects store confidential information and you care about keeping it safe. Encryption concerns aside.
So, to be safe in every situation, developers have adopted the practice of not giving the id any role other than uniquely identifying a row in a table. If you want it to do something else, consider adding another field for that purpose.

How to create table in erlang mnesia with multiple unique columns?

Something like a unique column in SQL. Any suggestions?
Your question is quite "open", so I tried to figure out what you want to do.
If you need to add a column which is not the primary key to store something like a unique ID, you can store an Erlang reference there (Ref = make_ref()), which is almost guaranteed to be unique (it cycles after about 2^82 calls). I don't know what the behavior is across multiple nodes, but if there is a problem you can tag the record with {node(), make_ref()}.
If you want to create records that are unique for a combination of several keys K1, K2, K3, you can use the tuple {K1,K2,K3} as the key of the table and use a set or ordered_set table, but it will be more complex to look records up.
If it is something else, some complementary information would help.

2 column table, ignore duplicates on mass insert postgresql

I have a Join table in Rails which is just a 2 column table with ids.
In order to mass insert into this table, I use
ActiveRecord::Base.connection.execute("INSERT INTO myjointable (first_id, second_id) VALUES #{values}")
Unfortunately this gives me errors when there are duplicates. I don't need to update any values, simply move on to the next insert if a duplicate exists.
How would I do this?
As an FYI, I have searched Stack Overflow and most of the answers are a bit advanced for me to understand. I've also checked the PostgreSQL documentation and played around in the Rails console, but still to no avail. I can't figure this one out, so I'm hoping someone else can help tell me what I'm doing wrong.
The closest statement I've tried is:
INSERT INTO myjointable (first_id,second_id) SELECT 1,2
WHERE NOT EXISTS (
SELECT first_id FROM myjointable
WHERE first_id = 1 AND second_id IN (...))
Part of the problem with this statement is that I am only inserting 1 value at a time, whereas I want a statement that mass inserts. Also, the second_id IN (...) section of the statement can include up to 100 different values, so I'm not sure how slow that will be.
Note that for the most part there should not be many duplicates so I am not sure if mass inserting to a temporary table and finding distinct values is a good idea.
Edit to add context:
The reason I need a mass insert is because I have a many-to-many relationship between 2 models where 1 of the models is never populated by a form. I have stocks, and stock price histories. The stock price histories are never created in a form, but rather mass-inserted by pulling the data from Yahoo Finance with their API. I use the activerecord-import gem to mass insert stock price histories (i.e. Model.import columns, values), but I can't write jointable.import columns, values because I get "jointable is an undefined local variable".
I ended up using the WITH clause to select my values and give it a name. Then I inserted those values and used WHERE NOT EXISTS to effectively skip any items that are already in my database.
So far it looks like it is working...
WITH withqueryname(first_id, second_id) AS (VALUES (1,2),(3,4),(5,6)) -- ...etc
INSERT INTO jointablename (first_id, second_id)
SELECT w.first_id, w.second_id
FROM withqueryname w
WHERE NOT EXISTS (
SELECT 1 FROM jointablename j
WHERE j.first_id = w.first_id
AND j.second_id = w.second_id)
Note the NOT EXISTS subquery is correlated with the incoming rows, so each pair gets checked against the table individually instead of guarding the whole insert with fixed values.
You can interchange the VALUES list with a variable. Mine was VALUES#{values}.
Here's how I'd tackle it: Create a temp table and populate it with your new values. Then lock the old join values table to prevent concurrent modification (important) and insert all value pairs that appear in the new table but not the old one.
One way to do this is by doing a left outer join of the old values onto the new ones and filtering for rows where the old join table values are null. Another approach is to use an EXISTS subquery. The two are highly likely to result in the same query plan once the query optimiser is done with them anyway.
Example, untested (since you didn't provide an SQLFiddle or sample data) but should work:
BEGIN;
CREATE TEMPORARY TABLE newjoinvalues(
first_id integer,
second_id integer,
primary key(first_id,second_id)
);
-- Now populate `newjoinvalues` with multi-valued inserts or COPY
COPY newjoinvalues(first_id, second_id) FROM stdin;
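-- (data rows would go here, terminated by a line containing only \.)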
LOCK TABLE myjoinvalues IN EXCLUSIVE MODE;
INSERT INTO myjoinvalues
SELECT n.first_id, n.second_id
FROM newjoinvalues n
LEFT OUTER JOIN myjoinvalues m ON (n.first_id = m.first_id AND n.second_id = m.second_id)
WHERE m.first_id IS NULL AND m.second_id IS NULL;
COMMIT;
This won't update existing values, but you can do that fairly easily too with a second query that does an UPDATE ... FROM while still holding the write table lock.
Note that the lock mode specified above will not block SELECTs, only writes like INSERT, UPDATE and DELETE, so queries can continue to be made against the table while the process is ongoing; you just can't write to it.
If you can't accept that, an alternative is to run the update at SERIALIZABLE isolation (which only works properly for this purpose in Pg 9.1 and above). This will cause the query to fail whenever a concurrent write occurs, so you have to be prepared to retry it over and over again. For that reason it's likely better to just live with locking the table for a while.
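A hedged sketch of that retry loop, assuming Rails 4+ (where ActiveRecord::Base.transaction accepts an isolation option and raises ActiveRecord::SerializationFailure on a serialization conflict); insert_sql is a placeholder for the INSERT ... SELECT above, run without the LOCK TABLE:

begin
  ActiveRecord::Base.transaction(isolation: :serializable) do
    ActiveRecord::Base.connection.execute(insert_sql)  # placeholder SQL string
  end
rescue ActiveRecord::SerializationFailure
  retry  # a concurrent write invalidated our snapshot; try again
end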
