Resetting next sequence ID on SERIAL column - Informix

After I commit a row with a SERIAL (autonumber) column and then delete it, the deleted row's sequence ID is not reused when another row is added.
The only way I have found for reusing the deleted row's sequence ID is to ALTER the SERIAL column to an INTEGER, then change it back to a SERIAL.
Is there an easier, quicker way to reset the next sequence ID so that there are no gaps in the sequence?
NOTE: This is a single-user application, so no worries about multiple users simultaneously inserting rows.

There isn't a particularly easy way to do that. You can reset the number by inserting … hmmm, once upon a long time ago there were bugs in this, and you're using ancient enough versions of the software that on occasion the bug might still be relevant, though no current version of the Informix products has the bug.
The safe technique is to insert 2^31 - 2 (note the minus 2; that's +2,147,483,646), then insert a row with 0 in the SERIAL column (to generate +2,147,483,647), then insert another row with 0 to wrap the next sequence number back to 1. That insert will fail if there's already a row with 1 and you have a unique constraint on the SERIAL column. You then need to insert the maximum value, or the value just before the first gap that you want to fill (another failing insert). However, note that after filling the gap, the inserted values will increase by one, bumping into any still-existing rows and causing insertion failures (because you do have a unique constraint/index on each SERIAL column, don't you? And don't 'fess up if you do not have such indexes; just go and add them!).
If you have a more recent version of Informix, you can insert +2,147,483,647 and then a single row with 0 to wrap the value without running into trouble. If you have an old version of Informix with the bug, inserting +2,147,483,647 directly caused problems; IIRC, the trouble was that you ended up with NULLs being generated, but it was long enough ago (another millennium) that I'm no longer absolutely sure of that.
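For illustration, here is roughly what the safe technique looks like in SQL, assuming a hypothetical table t with a SERIAL column s, a data column, and a unique index on s (inserting 0 asks Informix to generate the next serial value):

INSERT INTO t (s, data) VALUES (2147483646, 'bump');   -- push the counter to 2^31 - 2
INSERT INTO t (s, data) VALUES (0, 'wrap1');            -- generates +2,147,483,647
INSERT INTO t (s, data) VALUES (0, 'wrap2');            -- wraps: generates 1 (fails if a row with 1 exists)
DELETE FROM t WHERE data IN ('bump', 'wrap1');          -- remove the two huge helper rows;
                                                        -- the row that received 1 fills the gap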
None of this is really easy, in case you hadn't noticed.
Generally speaking, it is unwise to fill the gaps; you're better off leaving them and not worrying about them, or inserting some sort of dummy record that says "this isn't really a record — but the serial value is otherwise missing so I'm here to show that we know about it and it isn't missing but isn't really used either".
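If you do go the dummy-record route, it can be as simple as inserting the missing value explicitly, since Informix accepts an explicit value in a SERIAL column (hypothetical table t and gap value 1234):

INSERT INTO t (s, data) VALUES (1234, 'placeholder: id reserved, not a real record');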

How to highlight changed cells when updating a DBgrid?

Let's say I am showing stock prices, or sports scores, or movie attendance or something.
Periodically, I will refresh the grid by Close() and then Open() of a query linked to its associated datasource.
I know how to owner draw a cell with OnDrawCell() - what I can't figure out is how to know if the new value is the same as or different from the previous value for a given cell.
I suppose there are two use cases here, one where the number of rows is fixed and they remain in the same row order and one where rows can change (insert/delete or reorder).
For the former, I can take a snapshot before updating and compare after the update, but that might be a lot of data. I am not sure whether I want to restrict the operation to the currently visible rows; I think a user might want to scroll down and still be notified of any rows which have changed during the last update.
For the latter, I am stumped, unless, of course, each row has a unique key.
How can I do this (efficiently)? A solution for TDbGrid would help everyone, a solution with TMS Software's TAdvDbGrid would be fine by me (as would a (preferably free) 3rd party component).
TDBGrid reads the data currently contained in its assigned dataset. It has no capacity to remember prior values, perform calculations, or anything else. If you want to track changes, you have to do it yourself. You can do it by multiple means (a prior value column, a history table, or whatever), but it can't be done by the grid itself. TDBGrid is for presenting data, not analyzing or storing it.
One suggestion would be to track it in the dataset using the BeforePost event, where you can store the field's old value (OldValue) in a LastValue column, and then check in your TDBGrid.OnDrawColumnCell event whether the value has changed and alter the drawing/coloring as needed. Something like if LastValue <> CurrValue then ... should work.
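If you would rather keep the tracking on the server than in BeforePost, the "prior value column" idea above could be maintained by a trigger instead. A minimal sketch, assuming a SQL Server backend and a hypothetical Scores table with ID, Score and LastScore columns:

CREATE TRIGGER trg_Scores_TrackChange ON Scores
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- Remember the previous value so the grid can compare Score <> LastScore
    -- in OnDrawColumnCell. (Recursive triggers are off by default, so this
    -- self-update does not re-fire the trigger.)
    UPDATE s
    SET    LastScore = d.Score
    FROM   Scores s
    JOIN   deleted d ON d.ID = s.ID
    WHERE  d.Score <> s.Score;   -- only rows whose value actually changed
END;

The grid-side check then stays exactly as described: if LastValue <> CurrValue then draw the cell differently.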

Informatica writes rejected rows into a bad file, how to avoid that?

I have developed an Informatica PowerDesigner 9.1 ETL job which uses a lookup and an Update Strategy transformation to detect whether the target table already contains the incoming rows from the source. I have set the following condition on the Update Strategy transformation:
IIF(ISNULL(target_table_surrogate_id), DD_INSERT, DD_REJECT)
Now, when the incoming row is already in the target table, the row is rejected. Informatica writes these rejected rows into a .bad file. How can I prevent this? Is there a way to ensure that the rejected rows are not written to a .bad file? Or should I use e.g. a router instead of an Update Strategy transformation to determine whether the row is an insert row, and then discard the other rows?
Put a filter transformation before the update strategy transformation and filter away the bad rows
Well, typically when we check for the presence of a row in the target, the decision is between insert and update; however, that's a business decision.
As long as you are marking rows as DD_REJECT, they will be written to a bad file. Avoiding the bad file can mean multiple things here...
One, to not have the file created at all: use a filter to block the rows. You don't need an Update Strategy transformation for that; a simple filter should be good enough.
Second, if you want to process your rows differently, don't mark them as rejected; use a router and process them differently...
Hope this helps,
Raghav
If you don't need the rejected rows, you can uncheck the "Forward Rejected Rows" option in the Update Strategy transformation.

SQL Server table ID approaches its maximum value

I have a code-first MVC 4 application with SQL Server 2008.
One of my tables is used heavily: a lot of data is stored in it every day, and after some time I delete the old data. Because of that, the ID values grow quickly.
I defined its ID as int in the model. I am worried that the table's ID will run out of values after some time.
What should I do if the table's ID reaches its maximum value? I have never encountered this situation before.
My second question: if I change the ID's type from int to long (bigint) and then export and import the database, will the larger type reduce the speed of the site?
If you use an INT IDENTITY starting at 1, and you insert a row every second, every day of the year, all year long -- then you need roughly 68 years before you hit the 2,147,483,647 limit ...
If you're afraid - what does this command tell you?
SELECT IDENT_CURRENT('your-table-name-here')
Are you really getting close to 2,147,483,647 - or are you still a ways away??
If you should be getting "too close": you can always change the column's datatype to BIGINT - if you use a BIGINT IDENTITY starting at 1, and you insert one thousand rows every second, you need a mind-boggling 292 million years before you hit the limit of 9,223,372,036,854,775,807 ...
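To make that concrete, a quick sketch (hypothetical table name dbo.YourTable with an identity column ID; the ALTER assumes no constraints or indexes still reference the column, otherwise they have to be dropped and recreated first):

-- How much room is left?
SELECT IDENT_CURRENT('dbo.YourTable')              AS last_identity_value,
       2147483647 - IDENT_CURRENT('dbo.YourTable') AS int_values_remaining;

-- Rough headroom math: 2,147,483,647 ids / ~31.6 million seconds per year
-- is roughly 68 years at one insert per second.

-- If the limit really is in sight, widen the column:
ALTER TABLE dbo.YourTable ALTER COLUMN ID BIGINT NOT NULL;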
As far as I know, there isn't any effective method to prevent reaching the limit of an auto-incremented identity. You can choose a data type big enough to last a long time when you create the table. But here's one solution I can think of.
Create a new temp table with the same data structure, with the auto-increment column already included and set as the primary key. Then, inside Management Studio, import the data into the new table from the old table. When you are asked to copy the data or write your own query, choose to write the query, and select everything from your old table except the ID. This resets the identity so that it starts back at 1. You can delete the old table after that and rename the new temp table. Although you have to right-click the database itself to access the Import and Export command, you can set the source and destination database to be the same in the options.
This method is pretty easy, and I've done it myself a couple of times.
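In T-SQL, the same idea looks roughly like this (hypothetical table and column names; note that the copied rows get brand-new IDs, so anything that stores the old IDs elsewhere would need attention):

-- 1. New table with the same structure; the identity restarts at 1.
CREATE TABLE dbo.YourTable_New (
    ID   INT IDENTITY(1, 1) PRIMARY KEY,
    Col1 NVARCHAR(100),
    Col2 DATETIME
    -- ...remaining columns
);

-- 2. Copy everything except the old ID so fresh identity values are generated.
INSERT INTO dbo.YourTable_New (Col1, Col2)
SELECT Col1, Col2
FROM dbo.YourTable;

-- 3. Swap the tables.
DROP TABLE dbo.YourTable;
EXEC sp_rename 'dbo.YourTable_New', 'YourTable';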

Speed of ALTER TABLE ADD COLUMN in Sqlite3?

I have an iOS app that uses sqlite3 databases extensively. I need to add a column to a good portion of those tables. None of the tables are what I'd really consider large (I'm used to dealing with many millions of rows in MySQL tables), but given the hardware constraints of iOS devices I want to make sure it won't be a problem. The largest tables would be a few hundred thousand rows. Most of them would be a few hundred to a few thousand or tens of thousands.
I noticed that sqlite3 can only add columns to the end of a table. I'm assuming that's for some type of speed optimization, though possibly it's just a constraint of the database file format.
What is the time cost of adding a column to an sqlite3 table?
Does it simply update the schema and not change the table data?
Does the time increase with number of rows or number of columns already in the table?
I know the obvious answer to this is "just test" and I'll be doing that soon, but I couldn't find an answer on StackOverflow after a few minutes of searching so I figured I'd ask so others can find this information easier in the future.
From the SQLite ALTER TABLE documentation:
The execution time of the ALTER TABLE command is independent of the
amount of data in the table. The ALTER TABLE command runs as quickly
on a table with 10 million rows as it does on a table with 1 row.
The documentation implies the operation is O(1). It should run in negligible time.
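For reference, the operation in question is purely a schema change; existing rows are not rewritten (hypothetical table and column names, and remember that SQLite requires a non-NULL default when the new column is declared NOT NULL):

ALTER TABLE measurements ADD COLUMN note TEXT;                          -- appended to the end of the schema
ALTER TABLE measurements ADD COLUMN synced INTEGER NOT NULL DEFAULT 0;  -- NOT NULL needs a default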

using triggers to update Values

I'm trying to enhance the performance of a SQL Server 2000 job. Here's the scenario:
Table A has a maximum of 300,000 rows. If I update or delete the 100th row (based on insertion time), all the rows that were added after that row should update their values: row no. 101 should update its value based on row no. 100, and row no. 102 should update its value based on row no. 101's updated value. For example:
Old table:

ID     Value
100    220
101    (220/2) = 110
102    (110/2) = 55
...

Row no. 100 is updated with a new value of 300.

New table:

ID     Value
100    300
101    (300/2) = 150
102    (150/2) = 75
...
The actual value calculation is more complex; the formula above is simplified.
Right now, a trigger is defined for update/delete statements. When a row is updated or deleted, the trigger adds the row's data to a log table. Also, after each update/delete the code-behind creates a SQL job that fires a stored procedure, which then iterates through all the following rows of table A and updates their values. The process takes ~10 days to complete for 300,000 rows.
When the SP fires, it updates the following rows' values. I think this causes the trigger to run again for each update the SP makes, adding those rows to the log table too. Also, the customer requires that the task be done on the database side.
To solve the problem:
Modify the stored procedure and call it directly from the trigger. The stored procedure then drops the trigger, updates the following rows' values, and recreates the trigger.
There will be multiple instances of the program running simultaneously. If another user modifies a row while the SP is being executed, the system will not fire the trigger and I'll be in trouble! Is there any workaround for this?
What's your opinion about this solution? Is there any better way to achieve this?
Thank you.
First, about the update process. I understand your procedure is simply calling itself when it comes to updating the next row. With 300K rows this is certainly not going to be very fast, even without the logging (though it would most probably take far fewer days to accomplish). But what is absolutely beyond me is how it is possible to update more than 32 rows that way without reaching the maximum nesting level. Maybe I've got the sequence of actions wrong.
Anyway, I would probably do that differently, with just one instruction:
UPDATE yourtable
SET @value = Value = CASE ID
                         WHEN @id THEN @value
                         ELSE @value / 2   /* basically, your formula */
                     END
WHERE ID >= @id
OPTION (MAXDOP 1);
The OPTION (MAXDOP 1) bit of the statement limits the degree of parallelism for the statement to 1, thus making sure the rows are updated sequentially and every value is based on the previous one, i.e. on the value from the row with the preceding ID value. Also, the ID column should be made a clustered index, which it typically is by default, when it's made the primary key.
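For completeness, a sketch of how that statement might be driven (SQL Server 2000 syntax, so the variables are declared and assigned separately; the types are hypothetical):

DECLARE @id INT, @value DECIMAL(18, 4)
SELECT @id = 100, @value = 300   -- the row the user changed and its new value

UPDATE yourtable
SET @value = Value = CASE ID WHEN @id THEN @value ELSE @value / 2 END
WHERE ID >= @id
OPTION (MAXDOP 1)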
The other functionality of the update procedure, i.e. dropping and recreating the trigger, should probably be replaced by disabling and re-enabling it:
ALTER TABLE yourtable DISABLE TRIGGER yourtabletrigger
/* the update part */
ALTER TABLE yourtable ENABLE TRIGGER yourtabletrigger
But then, you are saying the trigger shouldn't actually be dropped/disabled, because several users might update the table at the same time.
All right then, we are not touching the trigger.
Instead I would suggest adding a special column to the table, the one the users shouldn't be aware of, or at least shouldn't care much of and should somehow be made sure never to touch. That column should only be updated by your 'cascading update' process. By checking whether that column was being updated or not you would know whether you should call the update procedure and the logging.
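The marker column itself could be added along these lines (hypothetical type and default; the NOT NULL default is applied to existing rows automatically):

ALTER TABLE yourtable ADD SpecialColumn TINYINT NOT NULL DEFAULT 0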
So, in your trigger there could be something like this:
IF NOT UPDATE(SpecialColumn) BEGIN
    /* assuming that without SpecialColumn only one row can be updated */
    SELECT TOP 1 @id = ID, @value = Value FROM inserted;
    EXEC UpdateProc @id, @value;
    EXEC LogProc ...;
END
In UpdateProc:
UPDATE yourtable
SET @value = Value = @value / 2,
    SpecialColumn = SpecialColumn /* basically, just anything, since it can
                                     only be updated by this procedure */
WHERE ID > @id
OPTION (MAXDOP 1);
You may have noticed that the UPDATE statement is slightly different this time. I understand your trigger is FOR UPDATE (= AFTER UPDATE), which means that the @id row has already been updated by the user. So the procedure should skip it and start from the very next row, and the update expression can now be just the formula.
In conclusion I'd like to say that my test update involved 299,995 of my table's 300,000 rows and took approximately 3 seconds on my not so very fast system. No logging, of course, but I think that should give you the basic picture of how fast it can be.
Big theoretical problem here. It is always extremely suspicious when updating one row REQUIRES updating 299,900 other rows. It suggests a deep flaw in the data model. Not that it is never appropriate, just that it is required far far less often than people think. When things like this are absolutely necessary, they are usually done as a batch operation.
The best you can hope for, in some miraculous situation, is to turn that 10 days into 10 minutes, but never even 10 seconds. I would suggest explaining thoroughly WHY this seems necessary, so that another approach can be explored.
