I want to write a stored procedure that performs the steps below:
1. Get all the rows from a table where flag = 'Y' and status != 'PROCESSED'.
2. Update the rows from step 1, setting status = 'PROCESSED'.
I want to do this because the SP will be called every 5 minutes from my Java program, and I do not want to pick up rows that the SP has already returned. That's why I need to mark them as processed.
Something like this?
Retrieve the rows you're interested in. Use the holdlock keyword to ensure nothing can sneak in an extra row between the select and the update. The lock is held until the end of the transaction.
The stored procedure performs the retrieval with the shared lock and then upgrades that to exclusive with the update statement.
When the transaction commits the locks are released.
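create proc update_status as
begin transaction
    -- holdlock keeps the shared locks until the transaction ends
    select *
    from t1 holdlock
    where flag = 'Y'
      and status != 'PROCESSED'

    -- the update upgrades those locks to exclusive
    update t1
    set status = 'PROCESSED'
    where flag = 'Y'
      and status != 'PROCESSED'
commit
go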
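The same select-then-mark pattern can be sketched in a runnable form. This is only an illustration, assuming SQLite (through Python's sqlite3 module) as a stand-in for the real server and an `id` primary key on `t1` that the original question does not mention. `BEGIN IMMEDIATE` plays the role of `holdlock` here: it takes the write lock before the select, so nothing can change between the two statements.

```python
import sqlite3


def fetch_and_mark_processed(conn):
    """Return unprocessed flagged rows and mark exactly those rows PROCESSED."""
    cur = conn.cursor()
    # Take the write lock up front so no other writer can slip in a row
    # between the SELECT and the UPDATE.
    cur.execute("BEGIN IMMEDIATE")
    try:
        rows = cur.execute(
            "SELECT id, flag, status FROM t1 "
            "WHERE flag = 'Y' AND status != 'PROCESSED' ORDER BY id"
        ).fetchall()
        # Update by key (the assumed id column) so that only the rows
        # actually returned to the caller are marked.
        cur.executemany(
            "UPDATE t1 SET status = 'PROCESSED' WHERE id = ?",
            [(row[0],) for row in rows],
        )
        cur.execute("COMMIT")
    except Exception:
        cur.execute("ROLLBACK")
        raise
    return rows


def demo():
    # isolation_level=None: we issue BEGIN/COMMIT ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE t1 (id INTEGER PRIMARY KEY, flag TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO t1 (flag, status) VALUES (?, ?)",
        [("Y", "NEW"), ("N", "NEW"), ("Y", "PROCESSED"), ("Y", "NEW")],
    )
    first = fetch_and_mark_processed(conn)   # rows 1 and 4
    second = fetch_and_mark_processed(conn)  # nothing left to pick up
    return first, second
```

Calling the function twice returns the two eligible rows the first time and an empty list the second time, which is the behaviour the five-minute polling job needs.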
I will try to keep the question as short as possible. This involves 2 tables; let's call them staging_data and audit_data. STAGING_DATA has 3 columns:
user_no with data type number,
update_date_time with data type as date in DD-MON-YYYY HH24:MI:SS format
status_code which is varchar(1).
audit_data table also has the same 3 columns. The ask is to add 3 columns to audit_data table
seq_no (which will be unique to every user),
active_from (date type without the time format)
active_to (date type without the time format).
There is a procedure that inserts data from staging_data to audit_data.
Sample of the table audit_data
That data in audit table should look like :
For the next record for user_no 523 (let's assume update_date_time is '23-Nov-2020 10:20'), seq_no becomes 3, active_from becomes '23-Nov-2020', active_to becomes '31-Dec-99', and the active_to of the user_no 523 row with seq_no 2 becomes '22-Nov-2020'. So the data should look like this:
Highlighted the 3rd record which will be added later in light green.
So here goes my solution: I suggested using the row_number() over(partition by user_no) analytic function to get seq_no for each user. I wanted to create a view based on that, but my boss doesn't want a view. He strictly wants a procedure. The procedure should check whether the user_no exists (in this example 523). If it exists, then seq_no increases and the active_to of the previous record for 523 changes to the latest active_from minus 1 day. I will be honest: I have no clue how to achieve this in a procedure. I understand I can create a cursor with the query I had in mind for the view. But adding seq_no and changing the active_to date is what has puzzled me. Can anyone please point me in the right direction? Also, I apologise in advance if I have left out any details. It's midnight here now, and after 8 hours of racking my brain on this I am very hungry!
edit 11th Mar : here is the code for the procedure I wrote to insert data into the audit table for situation when a particular user_no has no record in audit table :
create or replace procedure test_aud IS
user_found_audit number;
lv_user_no AUDIT_DATA.user_no%TYPE;
cursor member_no is select distinct user_no from STAGING_DATA;
begin
open member_no;
loop
fetch member_no into lv_user_no;
exit when member_no%notfound;
select count(*) into user_found_audit from AUDIT_DATA where user_no = lv_user_no;
if user_found_audit = 0 then
insert into AUDIT_DATA(user_no, update_date_time,status_code, seq_no, last_update_date, active_from, active_to)
select user_no, update_date_time,status_code,row_number() over(partition by user_no order by UPDATE_DATE_TIME) as seqno,
to_char(trunc(update_date_time),'DD-MON-YYYY'),
to_char(trunc(update_date_time),'DD-MON-YYYY'),
lead(to_char(trunc(update_date_time)-1,'DD-MON-YYYY'),1,'31-DEC-99') over(PARTITION BY user_no ORDER BY UPDATE_DATE_TIME) from STAGING_DATA where user_no = lv_user_no;
commit;
else
dbms_output.put_line(lv_user_no||' exists in audit table');
-- to code the block when user_no exists, involves an update and insert
end if;
end loop;
close member_no;
end;
/
Well, you need to collect a couple of things: the latest stage row and the latest audit row. Then it is just a matter of generating the new audit information and updating the previous latest one. The following makes a couple of simplifying assumptions:
Only the latest stage data for a given user_no needs to be processed, as
all prior rows have been processed. However, it does not assume the stage
table has been cleared.
The 'Y' and 'N' status_codes arrive properly ordered. In fact,
it does not even check the value.
It need not concern itself with the inherent race condition. The
condition derives from seq_no being generated as Max()+1. This
structure virtually guarantees a duplicate will eventually be
created.
The nested procedure "establish_audit" does all the actual work. The rest are just supporting characters, including a couple just for debug purpose. See fiddle.
create or replace
procedure generate_stage_audit(user_no_in staging_data.user_no%type)
as
k_end_of_time constant date := date '9999-12-31';
l_latest_user_stage staging_data%rowtype;
l_latest_user_audit audit_data%rowtype;
procedure establish_audit
is
begin
insert into audit_data(user_no, update_date_time, status_code
,seq_no, active_from, active_to)
select l_latest_user_stage.user_no
, l_latest_user_stage.update_date_time
, l_latest_user_stage.status_code
, coalesce(l_latest_user_audit.seq_no,0) + 1
, trunc(l_latest_user_stage.update_date_time)
, k_end_of_time
from dual;
update audit_data
set active_to = trunc(l_latest_user_stage.update_date_time - 1)
where user_no = l_latest_user_audit.user_no
and seq_no = l_latest_user_audit.seq_no;
end establish_audit;
procedure retrieve_latest_stage
is
begin
select *
into l_latest_user_stage
from staging_data
where (user_no, update_date_time) =
( select user_no, max(update_date_time)
from staging_data
where user_no = user_no_in
group by user_no
);
end retrieve_latest_stage;
procedure retrieve_latest_audit
is
begin
select *
into l_latest_user_audit
from audit_data
where (user_no, seq_no) =
( select user_no, max(seq_no)
from audit_data
where user_no = user_no_in
group by user_no
);
exception
when no_data_found then
null;
end retrieve_latest_audit;
---- for debugging ---
procedure show_stage
is
begin
dbms_output.put_line('-------- Stage Row -------');
dbms_output.put_line(' user_no==>' || to_char(l_latest_user_stage.user_no));
dbms_output.put_line('update_date_time==>' || to_char(l_latest_user_stage.update_date_time));
dbms_output.put_line(' status_code==>' || to_char(l_latest_user_stage.status_code));
end show_stage;
procedure show_audit
is
begin
dbms_output.put_line('-------- Audit Row -------');
dbms_output.put_line(' user_no==>' || to_char(l_latest_user_audit.user_no));
dbms_output.put_line('update_date_time==>' || to_char(l_latest_user_audit.update_date_time));
dbms_output.put_line(' status_code==>' || to_char(l_latest_user_audit.status_code));
dbms_output.put_line(' seq_no==>' || to_char(l_latest_user_audit.seq_no));
dbms_output.put_line(' active_from==>' || to_char(l_latest_user_audit.active_from));
dbms_output.put_line(' active_to==>' || to_char(l_latest_user_audit.active_to));
end show_audit;
begin -- the main event
retrieve_latest_stage;
show_stage;
retrieve_latest_audit;
show_audit;
establish_audit;
end generate_stage_audit;
A couple warnings:
It seems you may be tempted to use a string data type for the audit
columns Active_From and Active_To, as you are trying to declare them as
"date type without the time". However, there is no such data type in
Oracle; time is part of all dates. Do not do so; store them as
standard dates. (Note that dates are not stored in any format, but in an
internal structure. Formats are strictly a visual representation.)
Just throw away the time with the format in the query or by setting
nls_date_format.
You may be tempted to call this through a trigger. Do not;
it will likely result in an "ORA-04091: Table is mutating"
exception.
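The core of establish_audit (close the previous open interval, then insert the next sequence number) can be sketched in a runnable form. This is only an illustration, assuming SQLite through Python's sqlite3 module instead of Oracle, ISO date strings instead of Oracle dates, and a made-up first row dated 2020-11-20; only the '23-Nov-2020 10:20' update comes from the question. Like the procedure above, it uses Max()+1 and therefore shares the same race condition.

```python
import sqlite3

END_OF_TIME = "9999-12-31"  # mirrors k_end_of_time in the procedure above


def add_audit_row(conn, user_no, update_date_time, status_code):
    """Close the user's latest audit row, then insert the next one."""
    with conn:  # commits on success, rolls back on error
        latest = conn.execute(
            "SELECT MAX(seq_no) FROM audit_data WHERE user_no = ?",
            (user_no,),
        ).fetchone()[0]  # None when the user has no audit rows yet
        active_from = update_date_time[:10]  # keep only YYYY-MM-DD
        if latest is not None:
            # The previous row stays active until the day before the new one.
            conn.execute(
                "UPDATE audit_data SET active_to = date(?, '-1 day') "
                "WHERE user_no = ? AND seq_no = ?",
                (active_from, user_no, latest),
            )
        conn.execute(
            "INSERT INTO audit_data (user_no, update_date_time, status_code,"
            " seq_no, active_from, active_to) VALUES (?, ?, ?, ?, ?, ?)",
            (user_no, update_date_time, status_code,
             (latest or 0) + 1, active_from, END_OF_TIME),
        )


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE audit_data (user_no INTEGER, update_date_time TEXT,"
        " status_code TEXT, seq_no INTEGER, active_from TEXT, active_to TEXT)"
    )
    add_audit_row(conn, 523, "2020-11-20 09:00:00", "Y")  # illustrative row
    add_audit_row(conn, 523, "2020-11-23 10:20:00", "N")
    return conn.execute(
        "SELECT seq_no, active_from, active_to FROM audit_data "
        "WHERE user_no = 523 ORDER BY seq_no"
    ).fetchall()
    # -> [(1, '2020-11-20', '2020-11-22'), (2, '2020-11-23', '9999-12-31')]
```

The first row's active_to moves from the end-of-time marker to the day before the second row's active_from, which is the behaviour the question asks for.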
I want to retrieve the value of a field and increment it safely in Informix 12.1 when multiple users are connected.
What I want in C terms is lastnumber = counter++; in a concurrent environment.
The documentation mentions one way of doing this which is to make everyone connect with a wait parameter, lock the row, read the data, increment it and release the lock.
So this is what I tried:
begin work;
select
lastnum
from tbllastnums
where id = 1
for update;
And I can see that the row is locked until I commit or end my session.
However when I put this in a stored procedure:
create procedure "informix".select_for_update_test();
define vLastnum decimal(15);
begin work;
select
lastnum
into vLastnum
from tbllastnums
where id = 1
for update;
commit;
end procedure;
The database gives me a syntax error (I tried with different editors). So why is it a syntax error to write a for update clause within a stored procedure? Is there an alternative to this?
Edit
Here's what I ended up with:
DROP TABLE if exists tstcounter;
^!^
CREATE TABLE tstcounter
(
id INTEGER NOT NULL,
counter INTEGER DEFAULT 0 NOT NULL
)
EXTENT SIZE 16
NEXT SIZE 16
LOCK MODE ROW;
^!^
ALTER TABLE tstcounter
ADD CONSTRAINT PRIMARY KEY (id)
CONSTRAINT tstcounter00;
^!^
insert into tstcounter values(1, 0);
^!^
select * from tstcounter;
^!^
drop function if exists tstgetlastnumber;
^!^
create function tstgetlastnumber(pId integer)
returning integer as lastCounter
define vCounter integer;
foreach curse for
select counter into vCounter from tstcounter where id = pId
update tstcounter set counter = vCounter + 1 where current of curse;
return vCounter with resume;
end foreach;
end function;
^!^
SPL and cursors 'FOR UPDATE'
If you manage to find the right bit of the manual — Updating or Deleting Rows Identified by Cursor Name under the FOREACH statement in the SPL (Stored Procedure Language) section of the Informix Guide to SQL: Syntax manual — then you'll find the magic information:
Specify a cursor name in the FOREACH statement if you intend to use the WHERE CURRENT OF cursor clause in UPDATE or DELETE statements that operate on the current row of cursor within the FOREACH loop. Although you cannot include the FOR UPDATE keywords in the SELECT ... INTO segment of the FOREACH statement, the cursor behaves like a FOR UPDATE cursor.
So, you'll need to create a FOREACH loop with a cursor name and take it from there.
Access to the manuals
Incidentally, if you go to the IBM Informix Knowledge Center and see this icon:
that is the 'show table of contents' icon and you need to press it to see the useful information for navigating to the manuals. If you see this icon:
it is the 'hide table of contents' icon, but you should be able to see the contents down the left side. It took me a while to find out this trick. And I've no idea why the contents were hidden by default for me, but I think that was a UX design mistake if other people also suffer from it.
I'm experiencing a race condition in ActiveRecord with PostgreSQL where I'm reading a value then incrementing it and inserting a new record:
num = Foo.where(bar_id: 42).maximum(:number)
Foo.create!({
bar_id: 42,
number: num + 1
})
At scale, multiple threads will simultaneously read then write the same value of number. Wrapping this in a transaction doesn't fix the race condition because the SELECT doesn't lock the table. I can't use an auto increment, because number is not unique, it's only unique given a certain bar_id. I see 3 possible fixes:
Explicitly use a postgres lock (a row-level lock?)
Use a unique constraint and retry on fails (yuck!)
Override save to use a subselect, i.e.
INSERT INTO foo (bar_id, number) VALUES (42, (SELECT MAX(number) + 1 FROM foo WHERE bar_id = 42));
All these solutions seem like I'd be reimplementing large parts of ActiveRecord::Base#save! Is there an easier way?
UPDATE:
I thought I found the answer with Foo.lock(true).where(bar_id: 42).maximum(:number), but that uses SELECT FOR UPDATE, which isn't allowed on aggregate queries.
UPDATE 2:
I've just been informed by our DBA that even if we could do INSERT INTO foo (bar_id, number) VALUES (42, (SELECT MAX(number) + 1 FROM foo WHERE bar_id = 42)); it wouldn't fix anything, since the SELECT runs in a different lock than the INSERT.
Your options are:
Run in SERIALIZABLE isolation. Interdependent transactions will be aborted on commit as having a serialization failure. You'll get lots of error log spam, and you'll be doing lots of retries, but it'll work reliably.
Define a UNIQUE constraint and retry on failure, as you noted. Same issues as above.
If there is a parent object, you can SELECT ... FOR UPDATE the parent object before doing your max query. In this case you'd SELECT 1 FROM bar WHERE bar_id = $1 FOR UPDATE. You are using bar as a lock for all foos with that bar_id. You can then know that it's safe to proceed, so long as every query that's doing your counter increment does this reliably. This can work quite well.
This still does an aggregate query for each call, which (per next option) is unnecessary, but at least it doesn't spam the error log like the above options.
Use a counter table. This is what I'd do. Either in bar, or in a side-table like bar_foo_counter, acquire a row ID using
UPDATE bar_foo_counter SET counter = counter + 1
WHERE bar_id = $1 RETURNING counter
or the less efficient option if your framework can't handle RETURNING:
SELECT counter FROM bar_foo_counter
WHERE bar_id = $1 FOR UPDATE;
-- $2 is the value just read, plus 1
UPDATE bar_foo_counter SET counter = $2
WHERE bar_id = $1;
Then, in the same transaction, use the generated counter value for the number. When you commit, the counter table row for that bar_id gets unlocked for the next query to use. If you roll back, the change is discarded.
I recommend the counter approach, using a dedicated side table for the counter instead of adding a column to bar. That's cleaner to model, and means you create less update bloat in bar, which can slow down queries to bar.
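The counter-table flow can be sketched end to end. This is only an illustration, assuming SQLite through Python's sqlite3 module rather than PostgreSQL, so `BEGIN IMMEDIATE` stands in for the row lock that the UPDATE takes in PostgreSQL, and the counter is read back with a plain SELECT instead of RETURNING. The helper name `create_foo` and the lazy seeding of the counter row are assumptions, not part of the answer.

```python
import sqlite3


def create_foo(conn, bar_id):
    """Reserve the next number for bar_id and insert the foo row atomically."""
    cur = conn.cursor()
    cur.execute("BEGIN IMMEDIATE")  # serialises concurrent writers
    try:
        # Seed the counter row on first use (assumption: counters start at 0).
        cur.execute(
            "INSERT OR IGNORE INTO bar_foo_counter (bar_id, counter) "
            "VALUES (?, 0)",
            (bar_id,),
        )
        cur.execute(
            "UPDATE bar_foo_counter SET counter = counter + 1 "
            "WHERE bar_id = ?",
            (bar_id,),
        )
        number = cur.execute(
            "SELECT counter FROM bar_foo_counter WHERE bar_id = ?",
            (bar_id,),
        ).fetchone()[0]
        # Same transaction: the counter bump and the new row commit together
        # or roll back together.
        cur.execute(
            "INSERT INTO foo (bar_id, number) VALUES (?, ?)",
            (bar_id, number),
        )
        cur.execute("COMMIT")
    except Exception:
        cur.execute("ROLLBACK")
        raise
    return number


def demo():
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE bar_foo_counter ("
        " bar_id INTEGER PRIMARY KEY, counter INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE foo (bar_id INTEGER, number INTEGER,"
        " UNIQUE (bar_id, number))"
    )
    numbers_42 = [create_foo(conn, 42) for _ in range(3)]
    number_7 = create_foo(conn, 7)
    return numbers_42, number_7  # -> ([1, 2, 3], 1)
```

Each bar_id gets its own independent sequence, and the UNIQUE constraint on foo remains as a safety net if some code path bypasses the counter.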
OK, so I have a boss who's a bit of a nut when it comes to using the date as an indicator of change. He doesn't trust it.
What I want to do is have something work the same way as the date update that comes native with ActiveRecord, but instead base it on an ever-increasing number.
I know... the number of seconds since 1970 is constantly getting bigger. Well, unless you count daylight saving and things.
I'm wondering if there are any thoughts, on how to do this gracefully..
Note I have 20 tables that need this and I am a big fan of DRY.
Have a look at http://api.rubyonrails.org/classes/ActiveRecord/Locking/Optimistic.html, I think this is exactly what you want.
Optimistic locking within ActiveRecord means that if a lock_version column is present on a specific table then it will be updated (+1) every time you change that record (via ActiveRecord, of course).
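ActiveRecord handles this for you, but the underlying idea is a version-guarded UPDATE. The sketch below is only an illustration, assuming SQLite through Python's sqlite3 module and a hypothetical `products` table; `StaleRecordError` and `update_name` are made-up names, not ActiveRecord API.

```python
import sqlite3


class StaleRecordError(Exception):
    """Raised when the row changed since the caller last read it."""


def update_name(conn, record_id, new_name, expected_version):
    """Update a row only if its lock_version is still the one we read."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE products SET name = ?, lock_version = lock_version + 1 "
            "WHERE id = ? AND lock_version = ?",
            (new_name, record_id, expected_version),
        )
        # Zero rows touched means someone else bumped the version first.
        if cur.rowcount != 1:
            raise StaleRecordError(f"products.id={record_id} is stale")
    return expected_version + 1


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT,"
        " lock_version INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute("INSERT INTO products (id, name) VALUES (1, 'widget')")
    conn.commit()
    new_version = update_name(conn, 1, "gadget", expected_version=0)
    try:
        # A second writer still holding version 0 must be rejected.
        update_name(conn, 1, "gizmo", expected_version=0)
        stale = False
    except StaleRecordError:
        stale = True
    row = conn.execute(
        "SELECT name, lock_version FROM products WHERE id = 1"
    ).fetchone()
    return new_version, stale, row  # -> (1, True, ('gadget', 1))
```

Because lock_version only ever increases by one per successful update, it also serves as the ever-increasing change indicator the question asks for.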
I ended up using a mass trigger inside the database.
The function creates a record (or updates it) in a new table called data_changes.
def create_trigger_function(schema)
puts "DAVE: creating trigger function for tenant|schema #{schema.to_s}"
sql = "CREATE OR REPLACE FUNCTION \""+schema+"\".insert_into_delta_table() RETURNS TRIGGER AS 'BEGIN
UPDATE \""+schema+"\".data_changes SET status = 1, created_at = now() where table_name = TG_TABLE_NAME and record_id = NEW.id;
INSERT INTO \""+schema+"\".data_changes (status, table_name, market_id, record_id, created_at)
( select m.* from \""+schema+"\".data_changes as ds right outer join
(select 1, CAST (TG_TABLE_NAME AS text ) as name , markets.id, NEW.id as record_id, now() from \""+schema+"\".markets) as m
on
ds.record_id = m.record_id
and ds.market_id = m.id
and table_name = name
where ds.id is null );
RETURN NULL;
END;' LANGUAGE plpgsql;"
connection.execute(sql);
end
Now all I have to do to find all the changed "products" is
update data_changes set status = 2 where status = 1 and table_name = 'products'
select * from products where id in (select record_id from data_changes where status = 2 and table_name = 'products')
update data_changes set status = 3 where status = 2 and table_name = 'products'
If a product gets updated after I do my first update, but before I do the select, then it won't show up in my select, because its status will be reset to 1.
If a product gets updated after I do my select, but before I do the last update, then again it will not be affected by the last update.
The contents of my select, will be out of date, but there's no real way of avoiding that.
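The three-step status hand-off (1 = changed, 2 = being read, 3 = read) can be sketched in a runnable form. This is only an illustration, assuming SQLite through Python's sqlite3 module, a trimmed-down `data_changes` table without market_id or created_at, and a manual UPDATE standing in for the trigger; `take_changed_ids` is a made-up helper name.

```python
import sqlite3


def take_changed_ids(conn, table_name):
    """Claim pending changes (1 -> 2), read them, then retire them (2 -> 3)."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE data_changes SET status = 2 "
            "WHERE status = 1 AND table_name = ?",
            (table_name,),
        )
        ids = [
            row[0]
            for row in conn.execute(
                "SELECT record_id FROM data_changes "
                "WHERE status = 2 AND table_name = ? ORDER BY record_id",
                (table_name,),
            )
        ]
        # Rows that a trigger reset to 1 in the meantime are not status 2
        # any more, so this step leaves them for the next pass.
        conn.execute(
            "UPDATE data_changes SET status = 3 "
            "WHERE status = 2 AND table_name = ?",
            (table_name,),
        )
    return ids


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data_changes ("
        " table_name TEXT, record_id INTEGER, status INTEGER)"
    )
    conn.executemany(
        "INSERT INTO data_changes VALUES (?, ?, 1)",
        [("products", 1), ("products", 2), ("markets", 9)],
    )
    conn.commit()
    first = take_changed_ids(conn, "products")  # -> [1, 2]
    # Stand-in for the trigger firing when product 2 is updated again.
    conn.execute(
        "UPDATE data_changes SET status = 1 "
        "WHERE table_name = 'products' AND record_id = 2"
    )
    conn.commit()
    second = take_changed_ids(conn, "products")  # -> [2]
    return first, second
```

A record that changes again after being read simply goes back to status 1 and is picked up on the next pass, so no change is lost even though a given read may be slightly out of date.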
EDIT:
I've narrowed my mysql wait timeout down to this line:
IF @resultsFound > 0 THEN
INSERT INTO product_search_query (QueryText, CategoryId) VALUES (keywords, topLevelCategoryId);
END IF;
Any idea why this would cause a problem? I can't work it out!
I've written a stored proc to search for products in certain categories. Due to certain constraints I came across, I was unable to do what I wanted (limiting, while still returning the total number of rows found, with sorting, etc.).
It splits a string of category IDs, such as 1,2,3, into a temporary table, then builds the full-text search query based on sorting options and limits, executes the query string, and then selects out the total number of results.
Now, I know I'm no MySQL guru, very far from it. I've got it working, but I keep getting timeouts with product searches etc. So I'm thinking this may be causing some kind of problem?
Does anyone have any ideas how I can tidy this up, or even do it in a much better way that I probably don't know about?
Thanks.
DELIMITER $$
DROP PROCEDURE IF EXISTS `product_search` $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `product_search`(keywords text, categories text, topLevelCategoryId int, sortOrder int, startOffset int, itemsToReturn int)
BEGIN
declare foundPos tinyint unsigned;
declare tmpTxt text;
declare delimLen tinyint unsigned;
declare element text;
declare resultingNum int unsigned;
drop temporary table if exists categoryIds;
create temporary table categoryIds
(
`CategoryId` int
) engine = memory;
set tmpTxt = categories;
set foundPos = instr(tmpTxt, ',');
while foundPos <> 0 do
set element = substring(tmpTxt, 1, foundPos-1);
set tmpTxt = substring(tmpTxt, foundPos+1);
set resultingNum = cast(trim(element) as unsigned);
insert into categoryIds (`CategoryId`) values (resultingNum);
set foundPos = instr(tmpTxt,',');
end while;
if tmpTxt <> '' then
insert into categoryIds (`CategoryId`) values (tmpTxt);
end if;
CASE
WHEN sortOrder = 0 THEN
SET @sortString = "ProductResult_Relevance DESC";
WHEN sortOrder = 1 THEN
SET @sortString = "ProductResult_Price ASC";
WHEN sortOrder = 2 THEN
SET @sortString = "ProductResult_Price DESC";
WHEN sortOrder = 3 THEN
SET @sortString = "ProductResult_StockStatus ASC";
END CASE;
SET @theSelect = CONCAT(CONCAT("
SELECT SQL_CALC_FOUND_ROWS
supplier.SupplierId as Supplier_SupplierId,
supplier.Name as Supplier_Name,
supplier.ImageName as Supplier_ImageName,
product_result.ProductId as ProductResult_ProductId,
product_result.SupplierId as ProductResult_SupplierId,
product_result.Name as ProductResult_Name,
product_result.Description as ProductResult_Description,
product_result.ThumbnailUrl as ProductResult_ThumbnailUrl,
product_result.Price as ProductResult_Price,
product_result.DeliveryPrice as ProductResult_DeliveryPrice,
product_result.StockStatus as ProductResult_StockStatus,
product_result.TrackUrl as ProductResult_TrackUrl,
product_result.LastUpdated as ProductResult_LastUpdated,
MATCH(product_result.Name) AGAINST(?) AS ProductResult_Relevance
FROM
product_latest_state product_result
JOIN
supplier ON product_result.SupplierId = supplier.SupplierId
JOIN
category_product ON product_result.ProductId = category_product.ProductId
WHERE
MATCH(product_result.Name) AGAINST (?)
AND
category_product.CategoryId IN (select CategoryId from categoryIds)
ORDER BY
", @sortString), "
LIMIT ?, ?;
");
set @keywords = keywords;
set @startOffset = startOffset;
set @itemsToReturn = itemsToReturn;
PREPARE TheSelect FROM @theSelect;
EXECUTE TheSelect USING @keywords, @keywords, @startOffset, @itemsToReturn;
SET @resultsFound = FOUND_ROWS();
SELECT @resultsFound as 'TotalResults';
IF @resultsFound > 0 THEN
INSERT INTO product_search_query (QueryText, CategoryId) VALUES (keywords, topLevelCategoryId);
END IF;
END $$
DELIMITER ;
Any help is very very much appreciated!
There is little you can do with this query.
Try this:
Create a PRIMARY KEY on categoryIds (categoryId)
Make sure that supplier (SupplierId) is a PRIMARY KEY
Make sure that category_product (ProductID, CategoryID) (in this order) is a PRIMARY KEY, or you have an index with ProductID leading.
Update:
If it's the INSERT that causes the problem and product_search_query is a MyISAM table, the issue can be with MyISAM locking.
MyISAM locks the whole table if it decides to insert a row into a free block in the middle of the table, which can cause the timeouts.
Try using INSERT DELAYED instead:
IF @resultsFound > 0 THEN
INSERT DELAYED INTO product_search_query (QueryText, CategoryId) VALUES (keywords, topLevelCategoryId);
END IF;
This will put the records into the insertion queue and return immediately. The record will be added later asynchronously.
Note that you may lose information if the server dies after the command is issued but before the records are actually inserted.
Update:
Since your table is InnoDB, it may be an issue with table locking. INSERT DELAYED is not supported on InnoDB.
Depending on the nature of the query, DML queries on an InnoDB table may place gap locks, which will block the inserts.
For instance:
CREATE TABLE t_lock (id INT NOT NULL PRIMARY KEY, val INT NOT NULL) ENGINE=InnoDB;
INSERT
INTO t_lock
VALUES
(1, 1),
(2, 2);
This query performs ref scans and places the locks on individual records:
-- Session 1
START TRANSACTION;
UPDATE t_lock
SET val = 3
WHERE id IN (1, 2);
-- Session 2
START TRANSACTION;
INSERT
INTO t_lock
VALUES (3, 3);
-- Success
This query, while doing the same, performs a range scan and places a gap lock after key value 2, which will not let insert key value 3:
-- Session 1
START TRANSACTION;
UPDATE t_lock
SET val = 3
WHERE id BETWEEN 1 AND 2;
-- Session 2
START TRANSACTION;
INSERT
INTO t_lock
VALUES (3, 3);
-- Locks
Try wrapping your EXECUTE with the following:
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED ;
EXECUTE TheSelect USING @keywords, @keywords, @startOffset, @itemsToReturn;
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ ;
I do something similar in T-SQL for all report stored procs and searches where repeatable reads aren't important, to reduce locking/blocking issues with other processes running on the database.
Turn on the slow query log; that will give you an idea of what is taking so long to execute that there is a timeout.
http://dev.mysql.com/doc/refman/5.1/en/slow-query-log.html
Pick the slowest query and optimise that. Then run for a while and repeat.
There is some excellent information and tools here http://hackmysql.com/nontech
DC
UPDATE:
Either you have a network problem causing the timeout (unlikely if you are using a local MySQL instance), or something is locking a table for far too long and causing a timeout. The process that is locking the table or tables for far too long will be listed in the slow log as a slow query. You can also get the slow query log to record any queries that fail to use an index, resulting in an inefficient query.
If you can get the problem to occur while you are present, then you can also use a tool like phpMyAdmin or the command line to run "SHOW PROCESSLIST\G". This will give you a list of the queries that are running while the problem is occurring.
You think the problem is in your insert statement, so something is locking that table. You therefore need to find what is running so slowly that it locks the table for far too long. The slow query log is one way to do that.
Other things to look at
CPU - is it idle or running at full pelt
IO - is io causing holdups
RAM - are you swapping all the time (will cause excessive io)
Does the table product_search_query use an index?
What is the primary key?
Does your index use strings that are too long? If so, you may build a huge index file that causes very slow inserts (the slow query log will also show that).
And yes, the problem may be elsewhere, but you must start somewhere, mustn't you?
DC