I'm trying to left join two stored procedures in a Firebird query.
In my example data the first returns 70 records, the second just 1 record.
select
--...
from MYSP1('ABC', 123) s1
left join MYSP2('DEF', 456) s2
on s1.FIELDA = s2.FIELDA
and s1.FIELDB = s2.FIELDB
The problem is performances: it takes 10 seconds, while each procedure takes less than 1 second. I suspect that procedures are run multiple times instead of just once. It would make sense to execute them just once, because I pass fixed parameters to them.
Is there a way to oblige Firebird to simply execute once each procedure and then join their results?
Since it seems there is no way, I solved this issue running this query inside a new stored procedure, where I cache all results from MYSP2 into a global temporary table and make the join between MYSP1 and the temporary table.
This is temporary table definition:
create global temporary table MY_TEMP_TABLE
(
FIELDA varchar(3) not null,
FIELDB smallint not null,
FIELDC varchar(10) not null
);
This is stored procedure body:
--cache MYSP2 results
delete from MY_TEMP_TABLE;
insert into MY_TEMP_TABLE
select *
from MYSP2('DEF', 456)
;
--join data
for
select
--...
from MYSP1('ABC', 123) s1
left join MY_TEMP_TABLE s2
on s1.FIELDA = s2.FIELDA
and s1.FIELDB = s2.FIELDB
into
--...
do
suspend;
But if there is another solution without temporary tables it would be great!
Maybe this can help:
with MYSP2W as (MYSP2('DEF', 456))
select
--...
from MYSP1('ABC', 123) s1
left join MYSP2W s2
on s1.FIELDA = s2.FIELDA
and s1.FIELDB = s2.FIELDB
Related
Would you please provide an an example for a Redshift procedure where you have used a cursor and an UPDATE statement in conjunction? Is that even feasible, I couldn't find an example. I'm looking for a simple template code to learn how to have these 2 together in a single procedure on Redshift.
Here is an example use case:
I have a table like this:
CREATE TABLE test_tbl
(
Contactid VARCHAR(500),
sfdc_OppId_01 VARCHAR(500),
sfdc_OppId_02 VARCHAR(500),
sfdc_OppId_03 VARCHAR(500),
sfdc_OppId_04 VARCHAR(500),
sfdc_OppId_05 VARCHAR(500),
sfdc_OppId_06 VARCHAR(500)
)
I want to update each sfdc_OppId_xx with the relative value from another table; sfdc_tbl. Here is what sfdc_tbl looks like:
sfdc_contactId
sfdc_Opp_Id
AA123hgt
999999
AA123hgt
888888
AA123hgt
777777
AA123hgt
432567
AA123hgt
098765
AA123hgt
112789
So as you see, there are duplicate sfdc_contactid in the sfdc_tbl. My final goal is to list all the sfdc_Opp_Id for given contactid horizontally in the test_tbl. I shall not have duplicate contactid in the test_tbl.
INSERT INTO test_tbl (Contactid)
SELECT sfdc_contactId
FROM sfdc_tbl
GROUP BY sfdc_contactId
And here is what I'm trying to do:
CREATE OR REPLACE PROCEDURE testing_procedure (results INOUT refcursor)
AS
$$
BEGIN
OPEN cursor_testing FOR
SELECT
Ops.sfdc_Opp.id,
ROW_NUMBER () OVER(PARTITION BY Ops.sfdc_contactId ORDER BY sfdc_Opp_Id ) RWN
FROM sfdc_tbl Ops
INNER JOIN test_tbl tbl
ON Ops.sfdc_contactId = tbl.contactid;
UPDATE test_tbl
SET sfdc_Opp_01 = CASE WHEN cursor_testing.RWN = 1 THEN cursor_testing.sfdc_Ops_id ELSE NULL END,
sfdc_Opp_02 = CASE WHEN cursor_testing.RWN = 2 THEN cursor_testing.sfdc_Ops_id ELSE NULL END,
sfdc_Opp_03 = CASE WHEN cursor_testing.RWN = 3 THEN cursor_testing.sfdc_Ops_id ELSE NULL END,
sfdc_Opp_04 = CASE WHEN cursor_testing.RWN = 4 THEN cursor_testing.sfdc_Ops_id ELSE NULL END,
sfdc_Opp_05 = CASE WHEN cursor_testing.RWN = 5 THEN cursor_testing.sfdc_Ops_id ELSE NULL END,
sfdc_Opp_06 = CASE WHEN cursor_testing.RWN = 6 THEN cursor_testing.sfdc_Ops_id ELSE NULL END
;
END;
$$
LANGUAGE plpgsql;
I keep getting an error
incorrect syntax at or near "cursor_testing"
I've answered a question with a similar solution. The SQL uses a cursor's data to INSERT into a table and this same path should work for UPDATE - How to join System tables or Information Schema tables with User defined tables in Redshift
That being said and looking at your code I really think you would be better off using a temp table rather than a cursor. The first thing to note is that a cursor is not a table. Your use pattern won't work AFAIK. You read a cursor row by row (or bunches) which is contrary to Redshift's columnar table storage. So you will need to loop on the rows from the cursor and perform N updates. This will be extremely slow! You would be querying columnar data, storing the results in a cursor as rows, reading these row one by one, and then performing a new query (UPDATE) for each row. Ick! Stay in "columnar land" and use a temp table.
I would like to change a couple of column values before they get inserted.
I am using Informix as database.
I have a table consisting of 3 columns: Name (NVARCHAR), Type (INT), Plan (NVARCHAR).
Every time a new record is inserted, I would like to check the Name value before inserting it. If the Name starts with an F, I would like to set the Type value to 1 and the Plan Name to "Test"
In short, what I want the trigger to do is:
For every new insertion, first check if Name value starts with F.
If yes, set the Type and Plan to 1 and "Test" then insert.
If no, insert the values as-is.
I have looked up the CREATE TRIGGER statement with BEFORE and AFTER. However, I would like to have a clearer example. My case would probably involve BEFORE though.
The answer of #user3243781 get close, but did not work because it returns the error:
-747 Table or column matches object referenced in triggering statement.
This error is returned when a triggered SQL statement acts on the
triggering table, or when both statements are updates, and the column
that is updated in the triggered action is the same as the column that
the triggering statement updates.
So the alternative is handle with the NEW variable directly.
For that you need to use a procedure with the triggers reference resource, which means the procedure will able to act like the trigger by self.
Below is my example which I run with dbaccess over a Informix v11.70.
This resource is available only for versions +11 of the engine, as far I remember.
create table teste ( Name NVARCHAR(100), Type INT , Plan NVARCHAR(100) );
Table created.
create procedure check_name_values()
referencing new as n for teste ;;
define check_type integer ;;
define check_plan NVARCHAR ;;
if upper(n.name) like 'F%' then
let n.type = 1;;
let n.plan = "Test";;
end if
end procedure ;
Routine created.
;
create trigger trg_tablename_ins
insert on teste
referencing new as new
for each row
(
execute procedure check_name_values() with trigger references
);
Trigger created.
insert into teste values ('cesar',99,'myplan');
1 row(s) inserted.
insert into teste (name) values ('fernando');
1 row(s) inserted.
insert into teste values ('Fernando',100,'your plan');
1 row(s) inserted.
select * from teste ;
name cesar
type 99
plan myplan
name fernando
type 1
plan Test
name Fernando
type 1
plan Test
3 row(s) retrieved.
drop table if exists teste;
Table dropped.
drop procedure if exists check_name_values;
Routine dropped.
create trigger trg_tablename_ins
insert on tablename
referencing new as new
for each row
(
execute procedure check_name_values
(
new.name,
new.type,
new.plan
)
);
create procedure check_name_values
(
name NVARCHAR,
new_type integer,
new_plan NVARCHAR,
)
define check_type integer ;
define check_plan NVARCHAR ;
let check_type = 1;
let check_plan = "Test";
if name = 'F%'
then
insert into tablename (name,type,plan) values (name,check_type,check_plan);
else
insert into tablename (name,type,plan) values (name,new_type,new_plan);
end if ;
end procedure ;
Here is my version an adaptation of an old example I found in the informix usenet group.
It is possible to update columns in a trigger statement but not very straight forward. You have to use stored procedures an the into statement with the execute procedure command.
It worked here for IBM Informix Dynamic Server Version 12.10.FC11WE.
drop table if exists my_table;
drop sequence if exists my_table_seq;
create table my_table (
id INTEGER
NOT NULL,
col_a char(32)
NOT NULL,
col_b char(20)
NOT NULL,
hinweis char(64),
uslu char(12)
DEFAULT USER
NOT NULL,
dtlu DATETIME YEAR TO SECOND
DEFAULT CURRENT YEAR TO SECOND
NOT NULL
)
;
create sequence my_table_seq
increment 1
start 1;
drop procedure if exists get_user_datetime();
create function get_user_datetime() returning char(12),datetime year to second;
return user, current year to second;
end function
;
drop trigger if exists ti_my_table;
create trigger ti_my_table insert on my_table referencing new as n for each row (
execute function get_user_datetime() into uslu, dtlu
)
;
drop trigger if exists tu_my_table;
create trigger tu_my_table update on my_table referencing new as n for each row (
execute function get_user_datetime() into uslu, dtlu
)
;
insert into my_table values (my_table_seq.nextval, "a", "b", null, "witz", mdy(1,1,1900)) ;
SELECT *
FROM my_table
WHERE 1=1
;
I am writing a PL/SQL stored procedure which will be called from within a .NET application.
My stored procedure must return
the count of values in a table of part revisions, based on an input part number,
the name of the lowest revision level currently captured in this table for the input part number
the name of the revision level for a particular unit in the database associated with this part number and an input unit ID.
The unit's revision level name is captured within a separate table with no direct relationship to the part revision table.
Relevant data structure:
Table Part has columns:
Part_ID int PK
Part_Number varchar2(30)
Table Part_Revisions:
Revision_ID int PK
Revision_Name varchar2(100)
Revision_Level int
Part_ID int FK
Table Unit:
Unit_ID int PK
Part_ID int FK
Table Unit_Revision:
Unit_ID int PK
Revision_Name varchar2(100)
With that said, what is the most efficient way for me to query these three data elements into a ref cursor for output? I am considering the following option 1:
OPEN cursor o_Return_Cursor FOR
SELECT (SELECT COUNT (*)
FROM Part_Revisions pr
inner join PART pa on pa.part_id = pr.part_id
WHERE PA.PART_NO = :1 )
AS "Cnt_PN_Revisions",
(select pr1.Revision_Name from Part_Revisions pr1
inner join PART pa1 on pa1.part_id = pr1.part_id
WHERE PA.PART_NO = :1 and pr1.Revision_Level = 0)
AS "Input_Revison_Level",
(select ur.Revision_Name from Unit_Revision ur
WHERE ur.Unit_ID = :2) as "Unit_Revision"
FROM DUAL;
However, Toad's Explain Plan returns Cost:2 Cardinality: 1, which I suspect is due to me using DUAL in my main query. Comparing that to option 2:
select pr.Revision_Name, (select count(*)
from Part_Revisions pr1
where pr1.part_id = pr.part_id) as "Count",
(select ur.Revision_Name
from Unit_Revision ur
where ur.Unit_ID = :2) as "Unit_Revision"
from Part_Revisions pr
inner join PART pa on pa.part_id = pr.part_id
WHERE PA.PART_NO = :1 and pr.Revision_Level = 0
Essentially I don't really know how to compare the results from my execution plans, to chose the best design. I have also considered a version of option 1, where instead of joining twice to the Part table, I select the Part_ID into a local variable, and simply query the Part_Revisions table based on that value. However, this is not something I can use the Explain Plan to analyze.
Your description and select statements look different... I based the procedure on the SQL statements.
PROCEDURE the_proc
(
part_no_in IN NUMBER
, revision_level_in IN NUMBER
, unit_id_in IN NUMBER
, part_rev_count_out OUT NUMBER
, part_rev_name_out OUT VARCHAR2
, unit_rev_name_out OUT VARCHAR2
)
AS
BEGIN
SELECT COUNT(*)
INTO part_rev_count_out
FROM part pa
WHERE pa.part_no = part_no_in
AND EXISTS
(
SELECT 1
FROM part_revisions pr
WHERE pa.part_id = pr.part_id
);
SELECT pr1.revision_name
INTO part_rev_name_out
FROM part_revisions pr1
WHERE pr1.revision_level = revision_level_in
AND EXISTS
(
SELECT 1
FROM part pa1
WHERE pa1.part_id = pr1.part_id
AND pa.part_no = part_no_in
);
SELECT ur.revision_name
INTO unit_rev_name_out
FROM unit_revision ur
WHERE ur.unit_id = unit_id_in;
END the_proc;
It looks like you are obtaining scalar values. Rather than return a cursor, just return the values using clean sql statements. I have done this numerous times from .net, it works fine.
Procedure get_part_info(p_partnum in part.part_number%type
, ret_count out integer
, ret_revision_level out part_revisions.revision_level%type
, ret_revision_name out part_revisions.revision_name%type) as
begin
select count(*) into ret_count from ....;
select min(revision_level) into ret_revision_level from ...;
select revision_name in ret_revision_name...;
return;
end;
I have two Hive tables and I am trying to join both of them. The tables are not clustered or partitioned by any field. Though the tables contain records for common key fields, the join query always returns 0 records. All the data types are 'string' data types.
The join query is simple and looks something like below
select count(*) cnt
from
fsr.xref_1 A join
fsr.ipfile_1 B
on
(
A.co_no = B.co_no
)
;
Any idea what could be going wrong? I have just one record (same value) in both the tables.
Below are my table definitions
CREATE TABLE xref_1
(
co_no string
)
clustered by (co_no) sorted by (co_no asc) into 10 buckets
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
CREATE TABLE ipfile_1
(
co_no string
)
clustered by (co_no) sorted by (co_no asc) into 10 buckets
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
Hi You are using Star Schema Join. Please use your query like this:
SELET COUNT(*) cnt FROM A a JOIN B b ON (a.key1 = b.key1);
If still have issue Then use MAPJOIN:
set hive.auto.convert.join=true;
select count(*) from A join B on (key1 = key2)
Please see Link for more detail.
I have one table (Table1) with some info and a string ID
I have another table (Table2) with some more info and a similar string ID (it is missing an extra char in the middle).
I was originally joining the tables on
t2.StringID = substring(t1.StringID,0,2)+substring(t1.StringID,4,7)
But that was too slow, so I decided to create a new column on Table1 which is already mapped to the PrimaryID of Table2, and then index that col.
So, to update that new column I do this:
select distinct PrimaryID,
substring(t2.StringID,0,2)+
substring(t2.StringID,4,7)) as StringIDFixed
into #temp
from Table2 t2
update Table1 tl
set t1.T2PrimaryID = isnull(t.PrimaryID, 0)
from Table1 t11, #temp t
where t11.StringID = t.StringIDFixed
and t1.T2PrimaryID is null
It creates the temp table in a few seconds, but the update has been running for 25 minutes now, and I dont know if it will even ever finish.
Table 1 has 45MM rows, Table 2 has 1.5MM
I know that's a chunky amount of data, but still, i feel like this shouldnt be that hard.
It's Sybase IQ 12.7
Any ideas?
Thanks.
Created an index on the temp table which took a few seconds, and then re ran the same update which then only took 7 seconds.
create index idx_temp_temp on #temp (StringIDFixed)
I hate Sybase.
select distinct isnull(t2.PrimaryID, 0),
substring(t2.StringID,0,2)+
substring(t2.StringID,4,7)) as StringIDFixed
into #temp
from Table2 t2
create HG index idx_temp_temp_HG on #temp (StringIDFixed)
or
create LF index idx_temp_temp_LF on #temp (StringIDFixed)
--check if in Table1 exists index HG or LF in StringID if not.. create index
update Table1 tl
set t1.T2PrimaryID = t.PrimaryID
from Table1 t11, #temp t
where t11.StringID = t.StringIDFixed
-- check if is necesary
-- and t1.T2PrimaryID is null replace for t11.T2PrimaryID is null
Consider replacing your update with an inner join to avoid the isnull() function on a big dataset.
update Table1
set a.T2PrimaryID = b.PrimaryID
from Table1 a
inner join #temp b
on a.StringID = b.StringIDFixed