I'm extracting data from various sources into one table. In this new table, there's a field called lineno. This field's value should be in sequence based on company code and batch number. I've written the following procedure:
CREATE PROCEDURE update_line(company CHAR(4), batch CHAR(8), rcptid CHAR(12));
DEFINE lineno INT;
SELECT Count(*)
INTO lineno
FROM tmp_cb_rcpthdr
WHERE cbrh_company = company
AND cbrh_batchid = batch;
UPDATE tmp_cb_rcpthdr
SET cbrh_lineno = lineno + 1
WHERE cbrh_company = company
AND cbrh_batchid = batch
AND cbrh_rcptid = rcptid;
END PROCEDURE;
This procedure will be called using the following trigger
CREATE TRIGGER tmp_cb_rcpthdr_ins INSERT ON tmp_cb_rcpthdr
REFERENCING NEW AS n
FOR EACH ROW
(
EXECUTE PROCEDURE update_line(n.company, cbrh_batchid, cbrh_rcptid)
);
However, I got the following error
SQL Error = -747 Table or column matches object referenced in
triggering statement.
From oninit.com, I learned that the error is caused by a triggered SQL statement acting on the triggering table, which in this case is the UPDATE statement.
So my question is, how do I solve this problem? Is there any workaround or better solution?
I think the design needs to be reconsidered. For a start, what happens if some rows get deleted from tmp_cb_rcpthdr? The COUNT(*) query will then produce duplicate lineno values.
Even if this is an ETL-only process and you can be confident the data won't be manipulated from elsewhere, performance will be an issue, and it will only get worse the more data you have for any one combination of company and batch_id.
Is it necessary for the lineno to increment from zero, or is it just there to maintain the original load order? Because if it's the latter, a SEQUENCE or a SERIAL field on the table will achieve the same end and be a lot more efficient.
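For instance, a sequence is enough to preserve load order. A minimal sketch (the sequence name and the inserted values are made up for illustration):

CREATE SEQUENCE cbrh_lineno_seq START WITH 1 INCREMENT BY 1;

INSERT INTO tmp_cb_rcpthdr (cbrh_company, cbrh_batchid, cbrh_rcptid, cbrh_lineno)
VALUES ('C001', 'B0000001', 'R00000000001', cbrh_lineno_seq.NEXTVAL);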
If you must generate lineno in this way, I would suggest you create a second control table, keyed on company and batch_id, that tracks the current lineno value, i.e. (untested):
CREATE PROCEDURE update_line(company CHAR(4), batch CHAR(8));
DEFINE lineno INT;
SELECT cbrh_lineno INTO lineno
FROM linenoctl
WHERE cbrh_company = company
AND cbrh_batchid = batch;
UPDATE linenoctl
SET cbrh_lineno = lineno + 1
WHERE cbrh_company = company
AND cbrh_batchid = batch;
-- A test that no other process has grabbed this record
-- might need to be considered here, ie cbrh_lineno = lineno
RETURN lineno + 1;
END PROCEDURE;
Then use it as follows:
CREATE TRIGGER tmp_cb_rcpthdr_ins INSERT ON tmp_cb_rcpthdr
REFERENCING NEW AS n
FOR EACH ROW
(
EXECUTE PROCEDURE update_line(n.cbrh_company, n.cbrh_batchid) INTO cbrh_lineno
);
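For reference, the control table itself might be as simple as this (untested; the column names are assumed to mirror those used in the procedure above):

CREATE TABLE linenoctl
(
    cbrh_company CHAR(4) NOT NULL,
    cbrh_batchid CHAR(8) NOT NULL,
    cbrh_lineno  INT DEFAULT 0 NOT NULL,
    PRIMARY KEY (cbrh_company, cbrh_batchid)
);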
See the IDS documentation for more on using calculated values with triggers.
Consider an enterprise that captures sensor data for different production facilities. Per facility, we create an aggregation query that averages the values into 5-minute timeslots. This query consists of a long list of WITH clauses and writes data to a table (called aggregation_table).
Now my problem: currently we have n queries running that execute exactly the same logic; the only things that differ are the table names (and sometimes column names, but let's ignore that for now).
Instead of managing n different scripts that are basically the same, I would like to put the logic in a stored procedure that works like this:
CALL aggregation_query(facility_name) -> resolve the different tables for that facility and then use them in the different WITH clauses
On top of that, instead of having this long set of clauses that produces the end result, I would like to chunk them up into logical, parameterizable blocks. So, for example, if I call the aforementioned stored procedure for facility A, I want to be able to pass and use that table name in the different functions, where the output of one can be reused in the next statement (as you would do with WITH clauses).
Another reason I want to chunk this up into reusable blocks is that we have many "derivatives" of this aggregation query, for example to manage historical data, to correct data, or to produce the sensor data at another aggregation level. As these become increasingly complex, it is much easier to manage them without having to copy, paste, and adjust them every time.
About the current set-up, it is useful to know that I am only entitled to use plain BigQuery, as my team is not allowed to access the CI/CD, scheduling, or repository tooling (meaning I cannot solve the issue by having CI/CD deploy the n different versions of the procedures and functions).
So in the end, I would like to end up with something like this, using only BigQuery:
CREATE OR REPLACE PROCEDURE `aggregation_function`()
BEGIN
  DECLARE tablename STRING;
  DECLARE active_table_name STRING;

  ## get list of tables
  CREATE TEMP TABLE tableNames AS
  SELECT table_catalog, table_schema, table_name
  FROM `catalog.schema.INFORMATION_SCHEMA.TABLES`
  WHERE table_name = tablename;

  WHILE (SELECT COUNT(*) FROM tableNames) >= 1 DO
    ## build dataset + table name
    SET active_table_name = CONCAT('`', table_catalog, '.', table_schema, '.', table_name, '`');

    ## use concat to build the string and execute
    EXECUTE IMMEDIATE '''
      INSERT INTO `aggregation_table_for_facility` (timeslot, sensor_name, AVG_VALUE)
      WITH
      STEP_1 AS (
        SELECT *
        FROM my_table_function_step_1(active_table_name, parameter1, parameter2)
      ),
      STEP_2 AS (
        SELECT *
        FROM my_table_function_step_2(STEP_1, parameter1, parameter2)
      )
      SELECT * FROM STEP_2
    '''
    USING active_table_name AS active_table_name;

    DELETE FROM tableNames WHERE table_name = tablename;
  END WHILE;
END;
I was hoping someone could provide a snippet showing how I can do this in Standard SQL / BigQuery, so basically:
a stored procedure that takes in a string variable and is able to use it as a table (partly solved in the approach above, but I'm not sure whether there are better ways)
a (table) function that is able to take this table_name parameter as well and return a table that can be used in the next WITH clause (or, alternatively, writes to a temp table)
I think the code snippets below should give you some insight into dealing with procedures, inserts, and EXECUTE IMMEDIATE statements.
Here I'm creating a procedure that inserts values into a table listed in the information schema. Also, because I want to return a value, I use an OUT parameter, active_table_name, to return the value assigned inside the procedure.
CREATE OR REPLACE PROCEDURE `project-id.dataset`.custom_function(tablename STRING,OUT active_table_name STRING)
BEGIN
DECLARE query STRING;
SET active_table_name= (SELECT CONCAT('`',table_catalog,'.',table_schema,'.' ,table_name,'`')
FROM `project-id.dataset.INFORMATION_SCHEMA.TABLES`
WHERE table_name = tablename);
# a multiline query can be handled by using ''' or """
SET query =
'''
insert into %s (string_field_0,string_field_1,string_field_2,string_field_3,string_field_4,int64_field_5)
with custom_query as (
select string_field_0,string_field_2,'169 BestCity',string_field_3,string_field_4,55677 from %s limit 1
)
select * from custom_query;
''';
# queries must perform operations and must be the last thing performed
# pass parameters using FORMAT
execute immediate (format(query,active_table_name,active_table_name));
END
You can also use a loop to iterate through the records of a working table, executing the procedure for each one and capturing the value it returns so it can be used somewhere else, e.g. in a second procedure that performs a delete operation (a sketch of such a procedure follows the loop below).
DECLARE tablename STRING;
DECLARE out_value STRING;
FOR record IN
(SELECT tablename from `my-project-id.dataset.table`)
DO
SET tablename = record.tablename;
call `project-id.dataset`.custom_function(tablename, out_value);
select out_value;
END FOR;
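That second procedure is not shown above, but a sketch of it could look like this (the procedure name and the DELETE condition are assumptions, not part of the original answer):

CREATE OR REPLACE PROCEDURE `project-id.dataset`.cleanup_table(active_table_name STRING)
BEGIN
  # the identifier cannot be bound as a query parameter, so FORMAT builds the statement text
  EXECUTE IMMEDIATE FORMAT('DELETE FROM %s WHERE TRUE', active_table_name);
END;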
To recap, there are some restrictions, such as not being able to call procedures inside an EXECUTE IMMEDIATE statement or to nest an EXECUTE IMMEDIATE inside another EXECUTE IMMEDIATE, to name a few. I think these snippets should help you deal with your current situation.
For this sample I use the following documentation:
Data Manipulation Language
Dealing with outputs
Information Schema Tables
Execute Immediate
For...In
Loops
Would you please provide an example of a Redshift procedure where you have used a cursor and an UPDATE statement in conjunction? Is that even feasible? I couldn't find an example. I'm looking for simple template code to learn how to have these two together in a single procedure on Redshift.
Here is an example use case:
I have a table like this:
CREATE TABLE test_tbl
(
Contactid VARCHAR(500),
sfdc_OppId_01 VARCHAR(500),
sfdc_OppId_02 VARCHAR(500),
sfdc_OppId_03 VARCHAR(500),
sfdc_OppId_04 VARCHAR(500),
sfdc_OppId_05 VARCHAR(500),
sfdc_OppId_06 VARCHAR(500)
)
I want to update each sfdc_OppId_xx with the corresponding value from another table, sfdc_tbl. Here is what sfdc_tbl looks like:
sfdc_contactId    sfdc_Opp_Id
AA123hgt          999999
AA123hgt          888888
AA123hgt          777777
AA123hgt          432567
AA123hgt          098765
AA123hgt          112789
So as you can see, there are duplicate sfdc_contactid values in sfdc_tbl. My final goal is to list all the sfdc_Opp_Id values for a given contactid horizontally in test_tbl; test_tbl must not contain duplicate contactid values.
INSERT INTO test_tbl (Contactid)
SELECT sfdc_contactId
FROM sfdc_tbl
GROUP BY sfdc_contactId
And here is what I'm trying to do:
CREATE OR REPLACE PROCEDURE testing_procedure (results INOUT refcursor)
AS
$$
BEGIN
OPEN cursor_testing FOR
SELECT
Ops.sfdc_Opp.id,
ROW_NUMBER () OVER(PARTITION BY Ops.sfdc_contactId ORDER BY sfdc_Opp_Id ) RWN
FROM sfdc_tbl Ops
INNER JOIN test_tbl tbl
ON Ops.sfdc_contactId = tbl.contactid;
UPDATE test_tbl
SET sfdc_Opp_01 = CASE WHEN cursor_testing.RWN = 1 THEN cursor_testing.sfdc_Ops_id ELSE NULL END,
sfdc_Opp_02 = CASE WHEN cursor_testing.RWN = 2 THEN cursor_testing.sfdc_Ops_id ELSE NULL END,
sfdc_Opp_03 = CASE WHEN cursor_testing.RWN = 3 THEN cursor_testing.sfdc_Ops_id ELSE NULL END,
sfdc_Opp_04 = CASE WHEN cursor_testing.RWN = 4 THEN cursor_testing.sfdc_Ops_id ELSE NULL END,
sfdc_Opp_05 = CASE WHEN cursor_testing.RWN = 5 THEN cursor_testing.sfdc_Ops_id ELSE NULL END,
sfdc_Opp_06 = CASE WHEN cursor_testing.RWN = 6 THEN cursor_testing.sfdc_Ops_id ELSE NULL END
;
END;
$$
LANGUAGE plpgsql;
I keep getting an error
incorrect syntax at or near "cursor_testing"
I've answered a question with a similar solution. The SQL uses a cursor's data to INSERT into a table and this same path should work for UPDATE - How to join System tables or Information Schema tables with User defined tables in Redshift
That being said and looking at your code I really think you would be better off using a temp table rather than a cursor. The first thing to note is that a cursor is not a table. Your use pattern won't work AFAIK. You read a cursor row by row (or bunches) which is contrary to Redshift's columnar table storage. So you will need to loop on the rows from the cursor and perform N updates. This will be extremely slow! You would be querying columnar data, storing the results in a cursor as rows, reading these row one by one, and then performing a new query (UPDATE) for each row. Ick! Stay in "columnar land" and use a temp table.
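To make that concrete, a temp-table version of what you appear to be after could look roughly like this (an untested sketch; it assumes the target columns really are named sfdc_OppId_01 through sfdc_OppId_06 as in your DDL):

-- build the ranked opportunity list once, as a temp table
CREATE TEMP TABLE ranked_opps AS
SELECT sfdc_contactId,
       sfdc_Opp_Id,
       ROW_NUMBER() OVER (PARTITION BY sfdc_contactId ORDER BY sfdc_Opp_Id) AS rwn
FROM sfdc_tbl;

-- then pivot it with a single set-based UPDATE instead of row-by-row cursor work
UPDATE test_tbl
SET sfdc_OppId_01 = r.opp_01,
    sfdc_OppId_02 = r.opp_02,
    sfdc_OppId_03 = r.opp_03,
    sfdc_OppId_04 = r.opp_04,
    sfdc_OppId_05 = r.opp_05,
    sfdc_OppId_06 = r.opp_06
FROM (
    SELECT sfdc_contactId,
           MAX(CASE WHEN rwn = 1 THEN sfdc_Opp_Id END) AS opp_01,
           MAX(CASE WHEN rwn = 2 THEN sfdc_Opp_Id END) AS opp_02,
           MAX(CASE WHEN rwn = 3 THEN sfdc_Opp_Id END) AS opp_03,
           MAX(CASE WHEN rwn = 4 THEN sfdc_Opp_Id END) AS opp_04,
           MAX(CASE WHEN rwn = 5 THEN sfdc_Opp_Id END) AS opp_05,
           MAX(CASE WHEN rwn = 6 THEN sfdc_Opp_Id END) AS opp_06
    FROM ranked_opps
    GROUP BY sfdc_contactId
) r
WHERE test_tbl.Contactid = r.sfdc_contactId;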
I will try to keep this as short as possible. It involves two tables - let's call them staging_data and audit_data. staging_data has 3 columns:
user_no with data type number,
update_date_time with data type as date in DD-MON-YYYY HH24:MI:SS format
status_code which is varchar(1).
The audit_data table also has the same 3 columns. The ask is to add 3 more columns to the audit_data table:
seq_no (which will be unique to every user),
active_from (date type without the time format)
active_to (date type without the time format).
There is a procedure that inserts data from staging_data to audit_data.
Here is a sample of the audit_data table.
The data in the audit table should look like this:
For the next record for user_no 523 (let's assume update_date_time is '23-Nov-2020 10:20'), seq_no becomes 3, active_from becomes '23-Nov-2020', active_to becomes '31-Dec-99', and the active_to of the user_no 523 record with seq_no 2 becomes '22-Nov-2020'. So the data should look like this:
The 3rd record, which will be added later, is highlighted in light green.
So here goes my solution: I suggested using the row_number() over (partition by user_no) analytic function to get the seq_no for each user. I wanted to create a view based on that, but my boss doesn't want a view; he strictly wants a procedure. The procedure should check whether the user_no exists (in this example 523). If it exists, then seq_no increases and the active_to of the previous record for 523 changes to the latest active_from minus one day. I will be honest - I have no clue how to achieve this in a procedure. I understand I can create a cursor with the query I had in mind for the view, but adding seq_no and changing the active_to date is what has puzzled me. Can anyone please guide me in the right direction? Also, I apologise in advance if I have left out any other details. It's midnight here now, and after 8 hours of racking my brain on this I am very hungry!
Edit 11th Mar: here is the code for the procedure I wrote to insert data into the audit table for the situation where a particular user_no has no record in the audit table:
create or replace procedure test_aud IS
user_found_audit number;
lv_user_no AUDIT_DATA.user_no%TYPE;
cursor member_no is select distinct user_no from STAGING_DATA;
begin
open member_no;
loop
fetch member_no into lv_user_no;
exit when member_no%notfound;
select count(*) into user_found_audit from AUDIT_DATA where user_no = lv_user_no;
if user_found_audit = 0 then
insert into AUDIT_DATA(user_no, update_date_time,status_code, seq_no, last_update_date, active_from, active_to)
select user_no, update_date_time,status_code,row_number() over(partition by user_no order by UPDATE_DATE_TIME) as seqno,
to_char(trunc(update_date_time),'DD-MON-YYYY'),
to_char(trunc(update_date_time),'DD-MON-YYYY'),
lead(to_char(trunc(update_date_time)-1,'DD-MON-YYYY'),1,'31-DEC-99') over(PARTITION BY user_no ORDER BY UPDATE_DATE_TIME) from STAGING_DATA where user_no = lv_user_no;
commit;
else
dbms_output.put_line(lv_user_no||' exists in audit table');
-- to code the block when user_no exists, involves an update and insert
end if;
end loop;
close member_no;
end;
/
Well, you need to collect a couple of things: the latest stage row and the latest audit row. Then it is just a matter of generating the new audit information and updating the previously latest one. The following makes a couple of simplifying assumptions:
Only the latest stage data for a given user_no needs to be processed, as all prior rows have already been processed. However, it does not assume the stage table has been cleared.
The 'Y' and 'N' status_codes arrive properly sequenced in that order. In fact, it does not even check the value.
It need not concern itself with the inherent race condition. That condition derives from seq_no being generated as MAX()+1, a structure that virtually guarantees a duplicate will eventually be created.
The nested procedure establish_audit does all the actual work. The rest are just supporting characters, including a couple that exist purely for debugging. See fiddle.
create or replace
procedure generate_stage_audit(user_no_in staging_data.user_no%type)
as
k_end_of_time constant date := date '9999-12-31';
l_latest_user_stage staging_data%rowtype;
l_latest_user_audit audit_data%rowtype;
procedure establish_audit
is
begin
insert into audit_data(user_no, update_date_time, status_code
,seq_no, active_from, active_to)
select l_latest_user_stage.user_no
, l_latest_user_stage.update_date_time
, l_latest_user_stage.status_code
, coalesce(l_latest_user_audit.seq_no,0) + 1
, trunc(l_latest_user_stage.update_date_time)
, k_end_of_time
from dual;
update audit_data
set active_to = trunc(l_latest_user_stage.update_date_time - 1)
where user_no = l_latest_user_audit.user_no
and seq_no = l_latest_user_audit.seq_no;
end establish_audit;
procedure retrieve_latest_stage
is
begin
select *
into l_latest_user_stage
from staging_data
where (user_no, update_date_time) =
( select user_no, max(update_date_time)
from staging_data
where user_no = user_no_in
group by user_no
);
end retrieve_latest_stage;
procedure retrieve_latest_audit
is
begin
select *
into l_latest_user_audit
from audit_data
where (user_no, seq_no) =
( select user_no, max(seq_no)
from audit_data
where user_no = user_no_in
group by user_no
);
exception
when no_data_found then
null;
end retrieve_latest_audit;
---- for debugging ---
procedure show_stage
is
begin
dbms_output.put_line('-------- Stage Row -------');
dbms_output.put_line(' user_no==>' || to_char(l_latest_user_stage.user_no));
dbms_output.put_line('update_date_time==>' || to_char(l_latest_user_stage.update_date_time));
dbms_output.put_line(' status_code==>' || to_char(l_latest_user_stage.status_code));
end show_stage;
procedure show_audit
is
begin
dbms_output.put_line('-------- Audit Row -------');
dbms_output.put_line(' user_no==>' || to_char(l_latest_user_audit.user_no));
dbms_output.put_line('update_date_time==>' || to_char(l_latest_user_audit.update_date_time));
dbms_output.put_line(' status_code==>' || to_char(l_latest_user_audit.status_code));
dbms_output.put_line(' seq_no==>' || to_char(l_latest_user_audit.seq_no));
dbms_output.put_line(' active_from==>' || to_char(l_latest_user_audit.active_from));
dbms_output.put_line(' active_to==>' || to_char(l_latest_user_audit.active_to));
end show_audit;
begin -- the main event
retrieve_latest_stage;
show_stage;
retrieve_latest_audit;
show_audit;
establish_audit;
end generate_stage_audit;
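A call for the sample user might then look like this (assuming server output is enabled so the debug lines are visible):

SET SERVEROUTPUT ON

BEGIN
  generate_stage_audit(523);
END;
/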
A couple of warnings:
It seems you may be tempted to use a string data type for the audit columns active_from and active_to, since you are trying to declare them as "date type without the time". There is no such data type in Oracle; time is part of every DATE. Do not do it - store them as standard dates. (Note that dates are not stored in any format, but in an internal structure; formats are strictly a visual representation.) Just throw away the time with a format mask on the query, or by setting nls_date_format, as sketched after these warnings.
You may be tempted to call this through a trigger. Do not - it will likely result in an "ORA-04091: Table is mutating" exception.
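For example (illustrative only; column names as above):

-- keep real DATE values and strip the time only when displaying them
SELECT user_no,
       TO_CHAR(active_from, 'DD-MON-YYYY') AS active_from_display,
       TO_CHAR(active_to,   'DD-MON-YYYY') AS active_to_display
FROM   audit_data;

-- or set the session-wide default display format once
ALTER SESSION SET nls_date_format = 'DD-MON-YYYY';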
It's been a while since I have done any serious T-SQL. I thought I had this right, but I am receiving an error whose cause I cannot figure out. This is for a stored procedure that is not complex. The code is below:
--======================================================
-- Create Natively Compiled Stored Procedure Template
--======================================================
USE Viper;
GO
-- Drop stored procedure if it already exists
IF OBJECT_ID('API.newCategory','P') IS NOT NULL
DROP PROCEDURE API.newCategory;
GO
CREATE PROCEDURE API.newCategory
-- Add the parameters for the stored procedure here
@Category varchar(20) = null
WITH NATIVE_COMPILATION, SCHEMABINDING
AS BEGIN ATOMIC WITH
(
TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english'
)
--Insert statements for the stored procedure here
DECLARE @tmp int = 0;
IF @Category IS NOT NULL
AND @Category != ''
SET @tmp = ISNULL(
(SELECT idCategory
FROM Products.Category
WHERE Category = @Category),0);
IF @tmp = 0
INSERT INTO Products.Category (Category)
OUTPUT inserted.idCategory INTO @tmp
VALUES (@Category);
ELSE
UPDATE Category
SET active = 1
WHERE idCategory = @tmp;
RETURN @tmp;
END;
GO
The error message that I am receiving is:
Msg 1087, Level 16, State 1, Procedure newCategory, Line 22 [Batch Start Line 11]
Must declare the table variable "@tmp".
I hope someone can point me in the right direction. I am sure that it is something stupidly simple; I just can't see it right now. A bit rusty, I'm afraid.
Just to be clear, the operational goal of the SP is to:
1/ Check that there is actually a Category supplied to work with
2/ If there is then try and get its primary key id (idCategory)
3/ If there is no PK for the Category then insert it and return the idCategory into @tmp
4/ If there is a PK then make sure the active column is set to 1
5/ return @tmp as the result (either the PK or 0)
Cheers and thanks for any help
The Frog
Your problem is in this statement:
IF @tmp = 0
INSERT INTO Products.Category (Category)
OUTPUT inserted.idCategory INTO @tmp
VALUES (@Category);
You are doing an OUTPUT ... INTO where the target is your previously declared @tmp, which is declared as an int. OUTPUT can only write into tables, temp tables, or table variables.
A workaround could be to declare a table variable @catTab - DECLARE @catTab AS TABLE(CatID int) - OUTPUT into that variable, and follow it with a SELECT @tmp = CatID FROM @catTab. That should do it.
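Putting that together, the relevant part of the procedure might look roughly like this (an untested sketch; note that inside a natively compiled procedure the table variable would have to be based on a memory-optimized table type, which is omitted here for clarity):

DECLARE @tmp int = 0;
DECLARE @catTab TABLE (CatID int);   -- table variable to receive the OUTPUT row

IF @Category IS NOT NULL AND @Category != ''
    SET @tmp = ISNULL(
        (SELECT idCategory
         FROM Products.Category
         WHERE Category = @Category), 0);

IF @tmp = 0
BEGIN
    INSERT INTO Products.Category (Category)
    OUTPUT inserted.idCategory INTO @catTab   -- OUTPUT must target a table-like object
    VALUES (@Category);

    SELECT @tmp = CatID FROM @catTab;         -- copy the new PK back into the int variable
END
ELSE
    UPDATE Products.Category
    SET active = 1
    WHERE idCategory = @tmp;

RETURN @tmp;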
I'm creating a procedure to update/insert a table using a MERGE statement (an upsert). Now I have a problem: I have to do this upsert using the procedure's parameters.
procedure xyz( a in table.a%type,b in table.b%type,....)
is
some local variables;
begin
merge into target_table
using source_table --instead of the source table, i have to use procedure parameters here
on (condition on primary key in the table)
when matched then
update the table
when not matched then
insert the table ;
end xyz;
So how do I use procedure parameters instead of a source table in the MERGE statement? Or can you suggest a query that fetches the procedure parameters and uses them as the source table values?
Help me, please.
Thanks in advance.
I know that I'm eight years late to the party, but I was trying to do something similar: upserting based on parameters passed into a stored procedure that returns an empty string on success, or an error message on failure, back to my VB code. Below is all of my code, along with comments explaining what I did and why. Let me know if this helps you or anyone else. This is my first time answering a post.
PROCEDURE UpsertTSJobData(ActivitySeq_in IN NUMBER,
Owner_in In VARCHAR2,
NumTrailers_in IN NUMBER,
ReleaseFormReceived_in IN NUMBER,
Response_out OUT VARCHAR2) AS
err_num NUMBER;
err_msg VARCHAR2(4000);
BEGIN
--This top line essentially does a "SELECT *" from the named table
--and looks for a match based on the "ON" statement below
MERGE INTO glob1app.GFS_TS_JOBDATA_TAB tsj
--This select statement is used for the INSERT when no match
--is found and the UPDATE when a match is found.
--It creates a "pseudo-table"
USING (
SELECT ActivitySeq_in AS ActSeq,
Owner_in As Owner,
NumTrailers_in As NumTrailers,
ReleaseFormReceived_in As ReleaseFormReceived
FROM DUAL) input
--This ON statement is what we're doing the match on to find
--matching records. This decides whether it will be an
--INSERT or an UPDATE
ON (tsj.Activity_seq = ActivitySeq_in)
WHEN MATCHED THEN
--Here we UPDATE based on the passed in input table
UPDATE
SET OWNER = input.owner,
NUMTRAILERS = input.NumTrailers,
RELEASEFORMRECEIVED = input.releaseformreceived
WHEN NOT MATCHED THEN
--Here we INSERT based on the passed in input table
INSERT (
ACTIVITY_SEQ,
OWNER,
NUMTRAILERS,
RELEASEFORMRECEIVED
)
VALUES (
input.actseq,
input.owner,
input.numtrailers,
input.releaseformreceived
);
Response_out := '';
EXCEPTION
WHEN OTHERS THEN
err_num := SQLCODE;
err_msg := SUBSTR(SQLERRM, 1, 3900);
Response_out := TO_CHAR (err_num) || ': ' || err_msg;
END;
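For what it's worth, calling it from an anonymous block might look like this (the values are hypothetical, and the package prefix, if any, is omitted):

DECLARE
  l_response VARCHAR2(4000);
BEGIN
  -- the procedure returns an empty string (NULL) on success or the error text on failure
  UpsertTSJobData(ActivitySeq_in         => 12345,
                  Owner_in               => 'ACME',
                  NumTrailers_in         => 3,
                  ReleaseFormReceived_in => 1,
                  Response_out           => l_response);
  dbms_output.put_line(NVL(l_response, 'success'));
END;
/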
Maybe something like:
DECLARE V_EXISTS NUMBER;
BEGIN
SELECT COUNT(*) INTO V_EXISTS FROM TARGET_TABLE WHERE PK_ID = :ID;
IF V_EXISTS > 0 THEN
-- UPDATE
ELSE
-- INSERT
END IF;
END;
Also, you may try to use a so-called temporary table (a SELECT from DUAL):
CREATE TABLE TEST (N NUMBER(2), NAME VARCHAR2(20), ADRESS VARCHAR2(100));
INSERT INTO TEST VALUES(1, 'Name1', 'Adress1');
INSERT INTO TEST VALUES(2, 'Name2', 'Adress2');
INSERT INTO TEST VALUES(3, 'Name3', 'Adress3');
SELECT * FROM TEST;
-- test update
MERGE INTO TEST trg
USING (SELECT 1 AS N, 'NameUpdated' AS NAME,
'AdressUpdated' AS ADRESS FROM Dual ) src
ON ( src.N = trg.N )
WHEN MATCHED THEN
UPDATE
SET trg.NAME = src.NAME,
trg.ADRESS = src.ADRESS
WHEN NOT MATCHED THEN
INSERT VALUES (src.N, src.NAME, src.ADRESS);
SELECT * FROM TEST;
-- test insert
MERGE INTO TEST trg
USING (SELECT 34 AS N, 'NameInserted' AS NAME,
'AdressInserted' AS ADRESS FROM Dual ) src
ON ( src.N = trg.N )
WHEN MATCHED THEN
UPDATE
SET trg.NAME = src.NAME,
trg.ADRESS = src.ADRESS
WHEN NOT MATCHED THEN
INSERT VALUES (src.N, src.NAME, src.ADRESS);
SELECT * FROM TEST;
DROP TABLE TEST;
see here
It's very difficult to tell from your question exactly what you want, but I gather you want the table that you are merging into (or on) to be dynamic. In that case, what you should be using is the DBMS_SQL package to create dynamic SQL.
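As a very rough illustration of the dynamic-SQL idea (shown here with EXECUTE IMMEDIATE rather than DBMS_SQL for brevity; the table, column, and parameter names are placeholders, not taken from the question):

CREATE OR REPLACE PROCEDURE upsert_dynamic(p_table_name IN VARCHAR2,
                                           p_id         IN NUMBER,
                                           p_name       IN VARCHAR2) AS
BEGIN
  -- the table name cannot be a bind variable, so it is concatenated;
  -- DBMS_ASSERT.sql_object_name guards against SQL injection
  EXECUTE IMMEDIATE
    'MERGE INTO ' || DBMS_ASSERT.sql_object_name(p_table_name) || ' trg
     USING (SELECT :1 AS id, :2 AS name FROM dual) src
     ON (trg.id = src.id)
     WHEN MATCHED THEN UPDATE SET trg.name = src.name
     WHEN NOT MATCHED THEN INSERT (id, name) VALUES (src.id, src.name)'
    USING p_id, p_name;
END upsert_dynamic;
/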