I have a Document Sequence table with one row and one column; all it holds is the current sequence number. Every time we need to create a new document, we call a stored proc that increments the sequence number in this table by 1, reads it back, and uses it as the id in the document table.
My question is: if multiple requests call this stored proc, which updates the sequence number and returns it, is there a chance it will give the same number to multiple callers? Right now I have another stored proc that calls this sequence-number generator proc inside a transaction and then creates a document with the obtained id. I was wondering: if I wanted to do this with just Entity Framework in code, without the stored proc, is it possible? Does Entity Framework 4 support transactions by itself, so that only one process calls the seq# updater at a time?
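For what it's worth, on SQL Server the update-and-read can be collapsed into a single atomic statement, so no two callers can ever receive the same number regardless of how the proc is invoked. A minimal sketch, with illustrative table and column names:

-- Hypothetical names; the single UPDATE both increments and returns the value,
-- so concurrent callers are serialised by the lock the UPDATE takes on the row.
CREATE PROCEDURE dbo.GetNextDocumentNumber
AS
BEGIN
    UPDATE dbo.DocumentSequence
    SET SequenceNumber = SequenceNumber + 1
    OUTPUT inserted.SequenceNumber;  -- returns the new value to the caller
END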
I have a main stream with some fields and hundreds of thousands of records.
I created a Table Input step that just queries the max value of a date column. It returns one unique record.
Now I need to do some kind of CROSS join of this Table Input into the main stream, adding the new column to its column set. There's no ON clause; all records will have the same value for that column.
I tried using a Merge Join, but instead of adding the value to all records it added an extra record to the stream. This extra record has null in all fields and the date value in the new field, while all the original records have null in the new field.
You could use a "Join Rows (cartesian product)" step for this case.
You could use a Stream Lookup step. You would just need to make sure your main stream has a constant lookup value (add an Add Constants step right before the Stream Lookup) and add the same constant value as a new column in your query stream. The Stream Lookup will then find the query result and append it to every row of your main stream.
I have an SSIS routine that reads from a very dynamic table and inserts whichever rows it finds into a table in a different database, before truncating the original source table.
Due to the dynamic nature of the source table, this truncation unsurprisingly leads to lost rows: anything inserted between the read and the truncate never makes it to the second database.
What is the best way of deleting only those rows that have been migrated?
There is an identity column on the source table but it is not migrated across.
I can't change either table schema.
An option that might sound stupid, but works, is to delete first and use the OUTPUT clause.
I created a simple control flow that populates a table for me.
IF EXISTS
(
SELECT 1 FROM sys.tables AS T WHERE T.name = 'DeleteFirst'
)
BEGIN
DROP TABLE dbo.DeleteFirst;
END
CREATE TABLE dbo.DeleteFirst
(
[name] sysname
);
INSERT INTO
dbo.DeleteFirst
SELECT
V.name
FROM
master.dbo.spt_values V
WHERE
V.name IS NOT NULL;
In my OLE DB Source, instead of using a SELECT, DELETE the data you want to send down the pipeline and OUTPUT the DELETED virtual table. Something like:
DELETE
DF
OUTPUT
DELETED.*
FROM
dbo.DeleteFirst AS DF;
It works, it works!
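If the pipeline can fail mid-flight, a slightly safer variant (a sketch, assuming you are able to create a staging table) lands the deleted rows somewhere durable first, and the OLE DB Source then reads from the staging table:

-- dbo.MigrationStaging is a hypothetical staging table matching the source schema
DELETE
    DF
OUTPUT
    DELETED.* INTO dbo.MigrationStaging
FROM
    dbo.DeleteFirst AS DF;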
One option would be to create a table to log the identity of your processed records into, and then a separate package (or dataflow) to delete those records. If you're already logging processed records somewhere then you could just add the identity there - otherwise, create a new table to store the data.
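The cleanup then becomes a simple join against the log; a sketch, assuming the identity values were logged to a table called dbo.MigratedRows:

-- Both names are illustrative; RowId is the source table's identity column
DELETE S
FROM dbo.SourceTable AS S
INNER JOIN dbo.MigratedRows AS M
    ON M.RowId = S.RowId;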
A second option: if you're trying to avoid creating additional tables, then separate the record selection and record processing into two stages. Broadly, you'd select all your records in the control flow, then process them one-by-one in the dataflow.
Specifically:
Create a variable of type Object to store your record list, and another variable matching your identity type (int presumably) to store the 'current record identity'.
In the control flow, add an Execute SQL task which uses a query to build a list of identity values to process, then stores them into the recordlist variable.
Add a Foreach Loop Container to process that list; the foreach task would load the current record identifier into the second variable you defined above.
In the foreach task, add a dataflow to copy that single record, then delete it from the source (see the SQL sketch after this list).
There are quite a few examples of this online; e.g. this one from the venerable Jamie Thomson, or this one which includes a bit more detail.
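A sketch of the two queries the loop needs, with illustrative table and column names: the list builder for the Execute SQL task, and the parameterised delete the dataflow runs on each iteration:

-- Execute SQL task: full result set stored into the Object variable
SELECT RowId FROM dbo.SourceTable;

-- Per-iteration delete: ? is mapped to the 'current record identity' variable
DELETE FROM dbo.SourceTable WHERE RowId = ?;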
Note that you didn't talk about the scale of the data; if you have very large numbers of records the first suggestion is likely a better choice. Note that in both cases you lose the advantage of the table truncation (because you're using a standard delete call).
I am just now starting to dig into Teradata's locking features, and the explanations Google turns up are fairly convoluted. Hopefully I can get a very simple and streamlined answer from SE.
After encountering numerous issues with identity columns in Teradata, I've decided to create a mechanism that mimics Oracle's sequences. To do this, I am creating a table with two fields: one that holds a table name and another that stores its last-used sequence value. I will then create a stored procedure that takes a table name. Within the procedure, it will perform the following operations:
Select the last-used sequence from the sequence table into a variable (select LastId from mydb.sequence where tablename = :tablename)
Add 1 to the variable's value, thus incrementing it
Update the sequence table to use the incremented value
Return the sequence variable to the procedure's OUT param so I can access the sequenced ID in my .NET app
While all of those operations are taking place, I need to lock the sequence table for all read and write access to ensure that other calls to the procedure do not attempt to sequence the table while it is currently in the process of being sequenced. This is obviously to keep the same sequence from being used twice for the same table.
If this were .NET, I'd use a Sync Lock (VB.NET) or lock (C#) to keep other threads from entering a block of code until the current thread was finished. I am wondering if there's a way to lock a table much in the same way that I would lock a thread in .NET.
Consider taking an explicit rowhash lock for the transaction:
BEGIN TRANSACTION;
LOCKING ROW EXCLUSIVE
SELECT LastId + 1 INTO :LastID
FROM MyDB.SequenceCols
WHERE TableName = :TableName
AND DatabaseName = :DatabaseName;
UPDATE MyDB.SequenceCols
SET LastId = :LastID
WHERE TableName = :TableName
AND DatabaseName = :DatabaseName;
END TRANSACTION;
The rowhash lock will allow the procedure to be used concurrently by other processes against other tables. To ensure row-level locking, you must fully qualify the primary index of the SequenceCols table in the WHERE clause. In fact, the primary index of the SequenceCols table should be UNIQUE on (DatabaseName, TableName).
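A sketch of SequenceCols DDL that satisfies this (the column types are assumptions):

CREATE TABLE MyDB.SequenceCols
(
    DatabaseName VARCHAR(128) NOT NULL,
    TableName    VARCHAR(128) NOT NULL,
    LastId       BIGINT       NOT NULL
)
UNIQUE PRIMARY INDEX (DatabaseName, TableName);
-- The unique primary index means the fully qualified WHERE clause hashes to
-- exactly one row, so the EXCLUSIVE lock covers only that row.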
EDIT:
The exclusive rowhash lock will prevent another process from reading the row until the END TRANSACTION is processed by the owner of the rowhash lock.
I have a sequence used in a stored procedure that updates multiple tables, just like below:
CREATE PROCEDURE insert_document()
    -- retrieve new sequence number
    -- (NEXTVAL increments the sequence and sets CURRVAL for this session)
    INSERT INTO table_A(theID) VALUES (doc_seq.NEXTVAL);

    -- update table_B using the newly created sequence number
    INSERT INTO table_B(theID) VALUES (doc_seq.CURRVAL);
END PROCEDURE;
May I know whether the code above is a thread-safe implementation? For each execution of the procedure, can I guarantee that theID in table_A and table_B always receives the same sequence number when there is more than one execution at a time?
One of Informix's primary objectives is to make sure that a procedure works exactly as you need it to, regardless of how many thousands of users are running the same procedure at the same time. Each session is isolated from all the others; in particular, CURRVAL returns the value most recently generated by NEXTVAL in your own session, never a value generated by someone else's.
So, the code shown is 'thread safe'.
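For completeness, a minimal sketch of the sequence the procedure assumes (the name is illustrative):

-- Each session's NEXTVAL draws a distinct value; CURRVAL is per-session
CREATE SEQUENCE doc_seq
    INCREMENT BY 1
    START WITH 1
    NOCYCLE;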
I'm using EF 4.1 (Code First). I need to add/update products in a database based on data from an Excel file. Discussing here, one way to achieve this is to use dbContext.Products.ToList() to force loading all products from the database, then use db.Products.Local.FirstOrDefault(...) to check whether a product from Excel exists in the database, and proceed accordingly with an insert or an update. This is only one round-trip.
Now, my problem is there are too many products in the database, so it's not possible to load them all into memory. What's the way to achieve this without multiplying round-trips to the database? My understanding is that if I just do a search with db.Products.FirstOrDefault(...) for each Excel product to process, this performs a round-trip each time, even if I issue the statement for the exact same product several times! What's the purpose of EF caching objects and returning the cached value if it goes to the database anyway?
There is actually no way to make this better. EF is not a good solution for this kind of task. You must know whether a product already exists in the database to choose the correct operation, so you always need an additional query. You can group multiple products into a single query using .Contains (like SQL IN), but that solves only the existence check. The worse problem is that each INSERT or UPDATE is executed in a separate round-trip as well, and there is no way around that because EF doesn't support command batching.
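To be concrete, a grouped .Contains check translates to a single IN query on the server; one round-trip like the following sketch (key values illustrative) replaces one query per product:

SELECT P.ProductCode
FROM dbo.Products AS P
WHERE P.ProductCode IN (N'A-001', N'A-002', N'A-003');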
Create a stored procedure and pass the product's information to it. The stored procedure will perform an insert or an update based on the existence of the record in the database.
You can even use more advanced features, like table-valued parameters, to pass multiple records from Excel into the procedure with a single call, or import the Excel file into a temporary table (for example with SSIS) and process the rows directly on the SQL Server. Lastly, you can use bulk insert to get all the records into a special import table and again process them with a single stored procedure call.
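A sketch of the table-valued parameter route, with all names and column types as assumptions; one call hands the whole Excel batch to the server, which upserts it in a single statement:

-- Hypothetical table type and procedure; adjust the columns to the real schema.
CREATE TYPE dbo.ProductTableType AS TABLE
(
    ProductCode NVARCHAR(50) PRIMARY KEY,
    Name        NVARCHAR(200),
    Price       DECIMAL(18, 2)
);
GO

CREATE PROCEDURE dbo.UpsertProducts
    @Products dbo.ProductTableType READONLY
AS
BEGIN
    -- MERGE decides insert vs. update per row, all in one round-trip
    MERGE dbo.Products AS T
    USING @Products AS S
        ON T.ProductCode = S.ProductCode
    WHEN MATCHED THEN
        UPDATE SET T.Name = S.Name, T.Price = S.Price
    WHEN NOT MATCHED THEN
        INSERT (ProductCode, Name, Price)
        VALUES (S.ProductCode, S.Name, S.Price);
END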