Adding multiple records using a CLOB - stored procedures

I have an external Oracle stored procedure which takes INTEGER, CLOB, VARCHAR as parameters and inserts a record into a table when executed. It is called from a DAO layer built with Java + Spring.
I have been asked to insert multiple records (thousands) using the same procedure, so I am thinking of writing a PL/SQL block which accepts either a String or a CLOB, substrings the values in a loop, and calls the procedure for each record. For that I would either build a delimited String covering all records and pass it as a parameter, or create a CLOB from that String and pass that instead.
E.g. String param = "value1,value2,value3 | value1,value2,value3 | value1,value2,value3 | ...";
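Roughly what I have in mind (the procedure name and the per-record comma splitting are placeholders):

DECLARE
  p_data   CLOB := :input_clob;  -- delimited data bound from the DAO layer
  v_record VARCHAR2(4000);
  v_pos    PLS_INTEGER := 1;
  v_next   PLS_INTEGER;
BEGIN
  LOOP
    v_next := DBMS_LOB.INSTR(p_data, '|', v_pos);
    IF v_next = 0 THEN
      v_record := DBMS_LOB.SUBSTR(p_data, DBMS_LOB.GETLENGTH(p_data) - v_pos + 1, v_pos);
    ELSE
      v_record := DBMS_LOB.SUBSTR(p_data, v_next - v_pos, v_pos);
    END IF;
    -- v_record now holds "value1,value2,value3"; split on the commas and
    -- call the existing procedure (hypothetical name):
    -- my_insert_proc(p_id => ..., p_clob => ..., p_text => ...);
    EXIT WHEN v_next = 0;
    v_pos := v_next + 1;
  END LOOP;
END;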
My questions are:
Is there a better solution than the one I am considering (I think it is better to loop inside the database server rather than looping in the DAO layer and making thousands of DB calls)?
If I go ahead with my solution, are there limitations that could get in the way, such as the size of the data I can pass to the PL/SQL block?

I would refer you to this SO question:
Bulk insert from Java into Oracle
Basically, you should be doing a few bulk operations rather than thousands of individual ones.
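For example, one way to keep it down to a handful of calls is to have the procedure accept collection parameters and insert them with FORALL - a rough sketch (all type, procedure, table and column names here are hypothetical; the Java side would bind one array per column):

CREATE OR REPLACE TYPE t_number_tab  AS TABLE OF NUMBER;
/
CREATE OR REPLACE TYPE t_clob_tab    AS TABLE OF CLOB;
/
CREATE OR REPLACE TYPE t_varchar_tab AS TABLE OF VARCHAR2(4000);
/
CREATE OR REPLACE PROCEDURE insert_records_bulk (
  p_ids   IN t_number_tab,
  p_texts IN t_clob_tab,
  p_codes IN t_varchar_tab
) AS
BEGIN
  -- FORALL sends the whole batch to the SQL engine in one round trip
  FORALL i IN 1 .. p_ids.COUNT
    INSERT INTO target_table (id, text_value, code)
    VALUES (p_ids(i), p_texts(i), p_codes(i));
END;
/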

Related

Implementing a unique surrogate key in Advantage Database Server

I've recently taken over support of a system which uses Advantage Database Server as its back end. For some background, I have years of database experience but have never used ADS until now, so my question is purely about how to implement a standard pattern in this specific DBMS.
There's a previously developed stored procedure which manages an ID column in this manner:
@ID = (SELECT ISNULL(MAX(ID), 0) FROM ExampleTable);
@ID = @ID + 1;
INSERT INTO ExampleTable (ID, OtherStuff)
VALUES (@ID, 'Things');
-- Do some other stuff.
UPDATE ExampleTable
SET AnotherColumn = 'FOO'
WHERE ID = @ID;
My problem is that I now need to run this stored procedure multiple times in parallel. As you can imagine, when I do this, the same ID value is getting grabbed multiple times.
What I need is a way to consistently create a unique value which I can be sure will be unique even if I run the stored procedure multiple times at the same moment. In SQL Server I could create an IDENTITY column called ID, and then do the following:
INSERT INTO ExampleTable (OtherStuff)
VALUES ('Things');
SET @ID = SCOPE_IDENTITY();
ADS has autoinc which seems similar, but I can't find anything conclusively telling me how to return the value of the newly created value in a way that I can be 100% sure will be correct under concurrent usage. The ADS Developer's Guide actually warns me against using autoinc, and the online help files offer functions which seem to retrieve the last generated autoinc ID (which isn't what I want - I want the one created by the previous statement, not the last one created across all sessions). The help files also list these functions with a caveat that they might not work correctly in situations involving concurrency.
How can I implement this in ADS? Should I use autoinc, some other built-in method that I'm unaware of, or do I genuinely need to do as the developer's guide suggests, and generate my unique identifiers before trying to insert into the table in the first place? If I should use autoinc, how can I obtain the value that has just been inserted into the table?
You use LastAutoInc(STATEMENT) with autoinc.
From the documentation (under Advantage SQL->Supported SQL Grammar->Supported Scalar Functions->Miscellaneous):
LASTAUTOINC(CONNECTION|STATEMENT)
Returns the last used autoinc value from an insert or append. Specifying CONNECTION will return the last used value for the entire connection. Specifying STATEMENT returns the last used value for only the current SQL statement. If no autoinc value has been updated yet, a NULL value is returned.
Note: Triggers that operate on tables with autoinc fields may affect the last autoinc value.
Note: SQL script triggers run on their own SQL statement. Therefore, calling LASTAUTOINC(STATEMENT) inside a SQL script trigger would return the lastautoinc value used by the trigger's SQL statement, not the original SQL statement which caused the trigger to fire. To obtain the last original SQL statement's lastautoinc value, use LASTAUTOINC(CONNECTION) instead.
Example: SELECT LASTAUTOINC(STATEMENT) FROM System.Iota
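Applied to the table in the question, the pattern might look like this in an ADS SQL script (a sketch, assuming ID is an autoinc column):

DECLARE @ID Integer;

INSERT INTO ExampleTable (OtherStuff) VALUES ('Things');

-- STATEMENT scope returns the value generated by this script's insert,
-- not the last value generated across other connections.
@ID = (SELECT LASTAUTOINC(STATEMENT) FROM System.Iota);

UPDATE ExampleTable SET AnotherColumn = 'FOO' WHERE ID = @ID;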
Another option is to use GUIDs.
(I wasn't sure but you may have already been alluding to this when you say "or do I genuinely need to do as the developer's guide suggests, and generate my unique identifiers before trying to insert into the table in the first place." - apologies if so, but still this info might be useful for others :) )
The use of GUIDs as a surrogate key allows either the application or the database to create a unique identifier, with a guarantee of no clashes.
Advantage 12 has built-in support for a GUID datatype:
GUID and 64-bit Integer Field Types
Advantage server and clients now support GUID and Long Integer (64-bit) data types in all table formats. The 64-bit integer type can be used to store integer values between -9,223,372,036,854,775,807 and 9,223,372,036,854,775,807 with no loss of precision. The GUID (Global Unique Identifier) field type is a 16-byte data structure. A new scalar function NewID() is available in the expression engine and SQL engine to generate new GUID. See ADT Field Types and Specifications and DBF Field Types and Specifications for more information.
http://scn.sap.com/docs/DOC-68484
For earlier versions, you could store the GUIDs as a char(36). (Think about your performance requirements here of course.) You will then need to do some conversion back and forth in your application layer between GUIDs and strings. If you're using some intermediary data access layer, e.g. NHibernate or Entity Framework, you should be able to at least localise the conversions to one place.
If some part of your logic is in a stored procedure, you should be able to use the newid() or newidstring() function, depending on the type of the backing column:
INSERT INTO Example_Table (ID, OtherStuff) VALUES (NewID(), 'Things');

Are there any extensions to TADOQuery that include client indexes

Quick question (hopefully)
I have a large dataset (>100,000 records) that I would like to use as a lookup to determine existence or non-existence of multiple keys. The purpose of this is to find FK violations before trying to commit them to the database to try and avoid the resultant EDatabaseError messing up my transaction.
I had been using TClientDataSet/TDatasetProvider with the FindKey method, as this allowed a client-side index to be set up and was faster (2s to scan each key rather than 10s for ADO). However, moving to large datasets the population of the CDS is starting to take far more time than the local index is saving.
I see that I have a few options for alternatives:
client cursor with TADOQuery.locate method
ADO SELECT statements for each check (no client cache)
ADO SEEK method
Extend TADOQuery to mimic FindKey
The Locate method seems easiest and doesn't spam the server with the SELECT/SEEK methods. I like the idea of extending the TADOQuery, but was wondering whether anyone knew of any ready-made solutions for this rather than having to create my own?
I would create a temporary table on the database server and insert all 100,000 records into it, doing bulk inserts of, say, 3,000 records at a time to minimise round trips to the server. Then run select statements on this temp table to check for foreign key violations etc. If all is okay, insert from the temp table into the main table with a single SQL statement.
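A rough sketch of what that could look like, assuming a SQL Server back end (the staging, parent and main table names are all hypothetical):

CREATE TABLE #ImportRows
(
    ParentID   INT NOT NULL,
    OtherStuff VARCHAR(100) NULL
);

-- ...bulk-insert the 100,000 rows into #ImportRows in batches of ~3,000...

-- Any staged row without a matching parent would violate the FK:
SELECT i.ParentID
FROM #ImportRows AS i
LEFT JOIN ParentTable AS p ON p.ParentID = i.ParentID
WHERE p.ParentID IS NULL;

-- If the check comes back empty, move everything across in one statement:
INSERT INTO MainTable (ParentID, OtherStuff)
SELECT ParentID, OtherStuff
FROM #ImportRows;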

Avoiding round-trips when importing data from Excel

I'm using EF 4.1 (Code First). I need to add/update products in a database based on data from an Excel file. As discussed here, one way to achieve this is to use dbContext.Products.ToList() to force loading all products from the database, then use db.Products.Local.FirstOrDefault(...) to check whether a product from Excel exists in the database and proceed accordingly with an insert or an update. This is only one round-trip.
Now, my problem is that there are too many products in the database, so it's not possible to load all products into memory. What's the way to achieve this without multiplying round-trips to the database? My understanding is that if I just do a search with db.Products.FirstOrDefault(...) for each Excel product to process, this will perform a round-trip each time, even if I issue the statement for the exact same product several times! What's the purpose of EF caching objects and returning the cached value if it goes to the database anyway?
There is actually no way to make this better. EF is not a good fit for this kind of task. You must know whether a product already exists in the database to use the correct operation, so you always need an additional query - you can group multiple products into a single query using .Contains (like SQL IN), but that only solves the existence check. The worse problem is that each INSERT or UPDATE is executed in a separate round-trip as well, and there is no way around this because EF doesn't support command batching.
Create a stored procedure and pass the product information to it. The stored procedure will perform an insert or update based on whether the record already exists in the database.
You can even use more advanced features like table-valued parameters to pass multiple records from Excel into the procedure with a single call, or import the Excel data into a temporary table (for example with SSIS) and process it all directly on the SQL Server. Finally, you can use a bulk insert to get all records into a dedicated import table and again process them with a single stored procedure call.
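A sketch of the table-valued-parameter variant (SQL Server 2008+; all names here are hypothetical):

CREATE TYPE dbo.ProductImportType AS TABLE
(
    ProductCode NVARCHAR(50)  NOT NULL PRIMARY KEY,
    Name        NVARCHAR(200) NOT NULL,
    Price       DECIMAL(18, 2) NOT NULL
);
GO
CREATE PROCEDURE dbo.UpsertProducts
    @Rows dbo.ProductImportType READONLY
AS
BEGIN
    SET NOCOUNT ON;

    -- One call from the application carries every Excel row; the MERGE
    -- decides per row whether to insert or update.
    MERGE dbo.Products AS target
    USING @Rows AS source
        ON target.ProductCode = source.ProductCode
    WHEN MATCHED THEN
        UPDATE SET Name = source.Name, Price = source.Price
    WHEN NOT MATCHED THEN
        INSERT (ProductCode, Name, Price)
        VALUES (source.ProductCode, source.Name, source.Price);
END;
GO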

Should parameters to SP's of SqlDataSources be ordered identically?

I'm writing some stored procedures to do CRUD operations against some tables in a SQL Server database, which will be used in a FormView on an ASP.NET 2.0 page. I've already written the hardest one, which is the insert SP. Now I'm going to work on the select, update and delete SP's. What I'd like to know is, do the parameters to the SP's that are used by the SqlDataSource have to be in exactly the same order? For example, the insert operation requires about a dozen parameters, all of which are stored into the 3 tables that the insert handles. However, to retrieve the same data, all I need is the primary keys, which are just 2 parameters. Do I need to provide all of the parameters, in the same order, as I've specified for the insertion stored procedure?
No. Each command has its own parameter collection (e.g. SqlDataSource.DeleteParameters), so each stored procedure only needs the parameters it actually uses, in whatever order you define them.
The SSMS Tools Pack has a CRUD generator that might be useful too.
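To illustrate the separate parameter lists: the procedure behind the SelectCommand might take only the two key parameters, in whatever order you like (a sketch with hypothetical names):

CREATE PROCEDURE dbo.GetRecordByKey
    @KeyPart1 INT,
    @KeyPart2 INT
AS
BEGIN
    SET NOCOUNT ON;
    -- Only the primary key parts are needed here, regardless of the dozen
    -- parameters the insert procedure takes.
    SELECT *
    FROM dbo.MainTable
    WHERE KeyPart1 = @KeyPart1
      AND KeyPart2 = @KeyPart2;
END;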

MSSQL2000: Using a stored procedure results as a table in sql

Let's say I have 'myStoredProcedure' that takes in an Id as a parameter, and returns a table of information.
Is it possible to write a SQL statement similar to this?
SELECT
    MyColumn
FROM
    Table-ify('myStoredProcedure ' + @MyId) AS [MyTable]
I get the feeling that it's not, but it would be very beneficial in a scenario I have with legacy code & linked server tables
Thanks!
You can use a table-valued function in this way.
Here are a few tricks...
No, it is not - at least not in any official or documented way - unless you change your stored procedure to a TVF.
However, there are ways (read: hacks) to do it. All of them basically involve a linked server and OPENQUERY - for example, see here. Do note that it is quite fragile, as you need to hardcode the name of the server, so it can be problematic if you have multiple SQL Server instances with different names.
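For illustration, the hack looks roughly like this (a sketch assuming the local server is registered as a linked server to itself under the hard-coded name MYSERVER):

-- The inner query must be a string literal, so the Id value cannot come
-- from a variable here - part of what makes this approach so fragile.
SELECT MyColumn
FROM OPENQUERY(MYSERVER, 'EXEC MyDatabase.dbo.myStoredProcedure 42') AS MyTable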
Here is a pretty good summary of the ways of sharing data between stored procedures: http://www.sommarskog.se/share_data.html
Basically it depends on what you want to do. The most common ways are creating a temporary table prior to calling the stored procedure and having the procedure fill it, or having one permanent table (which also contains the process ID) that the stored procedure dumps the data into.
Table Valued functions have been mentioned, but there are a number of restrictions when you create a function as opposed to a stored procedure, so they may or may not be right for you. The link provides a good guide to what is available.
SQL Server 2005 and SQL Server 2008 change the options a bit. SQL Server 2005+ makes working with XML much easier, so XML can be passed as an output variable and fairly easily "shredded" into a table using the XML methods nodes and value. I believe SQL 2008 allows table variables to be passed into stored procedures (although read-only). Since you cited SQL 2000, the 2005+ enhancements don't apply to you, but I mention them for completeness.
Most likely you'll go with a table valued function, or creating the temporary table prior to calling the stored procedure and then having it populate that.
While working on the project, I used the following to insert the results of xp_readerrorlog (which, afaik, returns a table) into a temporary table created ahead of time.
INSERT INTO [tempdb].[dbo].[ErrorLogsTMP]
EXEC master.dbo.xp_readerrorlog
From the temporary table, select the columns you want.
