What's the difference between Jet OLEDB:Transaction Commit Mode and Jet OLEDB:User Commit Sync? - oledb

Although both Jet/OLE DB parameters are relatively well documented, I fail to understand the difference between these two connection parameters:
The first one:
Jet OLEDB:Transaction Commit Mode
(DBPROP_JETOLEDB_TXNCOMMITMODE)
Indicates whether Jet writes data to
disk synchronously or asynchronously
when a transaction is committed.
The second one:
Jet OLEDB:User Commit Sync
(DBPROP_JETOLEDB_USERCOMMITSYNC)
Indicates whether changes that were
made in transactions are written in
synchronous or asynchronous mode.
What's the difference? When to use which?

This is very long, so here's the short answer:
Don't set either of these. The default settings for these two options are likely to be correct. The first, Transaction Commit Mode, controls Jet's implicit transactions, and applies outside of explicit transactions, and is set to YES (asynchronous). The second controls how Jet interacts with its temporary database during an explicit transaction and is set to NO (synchronous). I can't think of a situation where you'd want to override the defaults here. However, you might want to set them explicitly just in case you're running in an environment where the Jet database engine settings have been altered from their defaults.
Now, the long explanation:
I have waded through a lot of Jet-related resources to see if I can find out what the situation here is. The two OLEDB constants seem to map onto these two members of the SetOptionEnum of the top-level DAO DBEngine object (details here for those who don't have the Access help file available):
dbImplicitCommitSync
dbUserCommitSync
These options are there for overriding the default registry settings for the Jet database engine at runtime for any particular connection, or for permanently altering the stored settings for it in the registry. If you look in the Registry under HKLM\Software\Microsoft\Jet\X.X\, you'll find that the key for the Jet version you're using contains a number of entries, two of which are these:
ImplicitCommitSync
UserCommitSync
The Jet 3.5 Database Engine Programmer's Guide defines these:
ImplicitCommitSync: A value of Yes indicates that Microsoft Jet will wait for commits to finish. A value other than Yes means that Microsoft Jet will perform commits asynchronously.
UserCommitSync: When the setting has a value of Yes, Microsoft Jet will wait for commits to finish. Any other value means that Microsoft Jet will perform commits asynchronously.
Now, this is just a restatement of what you'd already said. The frustrating thing is that the first has a default value of NO while the second defaults to YES. If they really were controlling the same thing, you'd expect them to have the same value, or that conflicting values would be a problem.
But the key actually turns out to be in the name, and it reflects the history of Jet in regard to how data writes are committed within and outside of transactions. Before Jet 3.0, Jet defaulted to synchronous updates outside of explicit transactions, but starting with Jet 3.0, IMPLICIT transactions were introduced, and were used by default (with caveats in Jet 3.5 -- see below). So, one of these two options applies to commits OUTSIDE of transactions (dbImplicitCommitSync) and the other for commits INSIDE of transactions (dbUserCommitSync). I finally located a verbose explanation of these in the Jet Database Engine Programmer's Guide (p. 607-8):
UserCommitSynch
The UserCommitSynch setting determines
whether changes made as part of an
explicit transaction...are written to
the database in synchronous mode or
asynchronous mode. The default value...is Yes, which specifies
asynchronous mode. It is not
recommended that you change this value
because in synchronous mode, there is
no guarantee that information has been
written to disk before your code
proceeds to the next command.
ImplicitCommitSync
By default, when
performing operations that add,
delete, or update records outside of
explicit transactions, Microsoft Jet
automatically performs internal
transactions called implicit
transactions that temporarily save
data in its memory cache, and then
later write the data as a chunk to the
disk. The ImplicitCommitSync setting
determines whether changes made by
using implicit transactions are
written to the database in synchronous
mode or asynchronous mode. The default
value...is No, which specifies that
these changes are written to the
database in asynchronous mode; this
provides the best performance. If you
want implicit transactions to be
written to the database in synchronous
mode, change the value...to Yes. If
you change the value...you get
behavior similar to Microsoft Jet
versions 2.x and earlier when you
weren't using explicit transactions.
However, doing so can also impair
performance considerably, so it is not
recommended that you change the value
of this setting.
Note: There is no longer a need to use
explicit transactions to improve the
performance of Microsoft Jet. A
database application using Microsoft
Jet 3.5 should use explicit
transactions only in situations where
there may be a need to roll back
changes. Microsoft Jet can now
automatically perform implicit
transactions to improve performance
whenever it adds, deletes or changes
records. However, implicit
transactions for SQL DML statements
were removed in Microsoft Jet
3.5...see "Removal of Implicit Transactions for SQL DML Statements"
later in this chapter.
That section:
Removal of Implicit Transactions for SQL DML Statements
Even with all the work in Microsoft
Jet 3.0 to eliminate transactions in
order to obtain better performance,
SQL DML statements were still placed
in an implicit transaction. In
Microsoft Jet 3.5, SQL DML statements
are not placed in an implicit
transaction. This substantially
improves performance when running SQL
DML statements that affect many
records of data.
Although this change provides a
substantial performance improvement,
it also introduces a change to the
behavior of SQL DML statements. When
using Microsoft Jet 3.0 and previous
versions that use implicit
transactions for SQL DML statements,
an SQL DML statement rolls back if any
part of the statement is not
completed. When using Microsoft Jet
3.5, it is possible to have some of the records committed by SQL DML
statement while others are not. An
example of this would be when the
Microsoft Jet cache is exceeded. The
data in the cache is written to disk
and the next set of records is
modified and placed in the cache.
Therefore, if the connection is
terminated, it is possible that some
of the records were saved to disk, but
others were not. This is the same
behavior as using DAO looping routines
to update data without an explicit
transaction in Microsoft Jet 3.0. If
you want to avoid this behavior, you
need to add explicit transactions
around the SQL DML statement to define
a set of work and you must sacrifice
the performance gains.
Confused yet? I certainly am.
The key point seems to me to be that dbUserCommitSync controls the way Jet writes to the TEMPORARY database it uses for staging EXPLICIT transactions, while dbImplicitCommitSync applies where Jet uses its implicit transactions OUTSIDE of an explicit transaction. In other words, dbUserCommitSync controls the behavior of the engine inside a BeginTrans/CommitTrans block, while dbImplicitCommitSync controls whether Jet writes asynchronously or synchronously outside of explicit transactions.
Now, as to the "Removal of Implicit Transactions" section: my reading is that implicit transactions apply to updates when you're looping through a recordset outside of a transaction, but no longer apply to a SQL UPDATE statement outside a transaction. It stands to reason that an optimization that improves the performance of row-by-row updates would be good and wouldn't actually help so much with a SQL batch update, which is already going to be pretty darned fast (relatively speaking).
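To make the "wrap it in an explicit transaction" advice from the quoted guide concrete, here is a minimal sketch. It is an analogy only: it uses Python's built-in sqlite3 module as a stand-in, because Jet is not assumed to be available, and the table name item, the CHECK constraint, and the function name are made up for illustration. The BEGIN/COMMIT/ROLLBACK statements play the role of BeginTrans/CommitTrans/Rollback.

```python
import sqlite3


def update_all_or_nothing(conn, new_prices):
    """Apply every (price, item_id) update inside one explicit transaction.

    Returns True if all updates were committed, False if any update failed
    and the whole set of work was rolled back.
    """
    try:
        conn.execute("BEGIN")  # explicit transaction starts here
        for price, item_id in new_prices:
            conn.execute(
                "UPDATE item SET price = ? WHERE id = ?", (price, item_id)
            )
        conn.execute("COMMIT")
        return True
    except sqlite3.Error:
        # One row failed: undo the rows that had already been changed.
        conn.execute("ROLLBACK")
        return False


def make_demo_connection():
    """Hypothetical demo table; negative prices violate the CHECK constraint."""
    # isolation_level=None: the module issues no BEGIN of its own, so the
    # only transactions are the explicit ones above.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE item (id INTEGER PRIMARY KEY, price REAL CHECK (price >= 0))"
    )
    conn.executemany("INSERT INTO item VALUES (?, ?)", [(1, 10.0), (2, 20.0)])
    return conn
```

Without the explicit transaction, the first update of a failing batch would stay applied, which is the partial-commit behavior the guide warns about for looping updates.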
Also note that the fact that it is possible to do it both ways is what enables DoCmd.RunSQL to make incomplete updates. That is, a SQL command that would fail with CurrentDB.Execute strSQL, dbFailOnError, can run to completion if executed with DoCmd.RunSQL. If you turn off warnings with DoCmd.SetWarnings, you don't get a report of the error, and you don't get the chance to roll back to the initial state (or, having been informed of the errors, to decide to commit anyway).
So, what I think is going on is that SQL executed through the Access UI is wrapped in a transaction by default (that's how you get a confirmation prompt), but if you turn off the prompts and there's an error, you get the incomplete updates applied. This has nothing to do with the DBEngine settings -- it's a matter of the way the Access UI executes SQL (and there's an option to turn it off/on).
This contrasts to updates in DAO, which were all wrapped in the implicit transactions starting with Jet 3.0, but starting with Jet 3.5, only sequential updates were wrapped in the implicit transactions -- batch SQL commands (INSERT/UPDATE/DELETE) are not.
At least, that's my reading.
So, in regard to the issue in your actual question, in setting up your OLEDB connection, you'd set the options for the Jet DBEngine for that connection according to what you were doing. It seems to me that the default Jet DBEngine settings are correct and shouldn't be altered -- you want to use implicit transactions for edits where you're walking through a recordset and updating one row at a time (outside of an explicit transaction). On the other hand, you can wrap the whole thing in a transaction and get the same result, so really, this only applies to cases where you're walking a recordset and updating and have not used an explicit transaction, and the default setting seems quite correct to me.
The other setting, UserCommitSync, is something you'd definitely want to leave alone as well, as it seems to me to apply to the way Jet interacts with its temp database during an explicit transaction. Setting it to asynchronous would be quite dangerous, as you'd basically not know the state of the operation at the point that you committed the data.

You'd think that USERCOMMITSYNC=YES would be the option to commit synchronously. And that is the cause of the confusion.
I spent ages googling this topic because I found that the behavior I was getting with old VB6 applications was not the same as what I get in .NET with OLEDB/Jet 4.
Now I really should back up what I'm going to say with a link to the actual page(s) I read but I can't find those pages now.
Anyway, I was browsing MSDN website and found a page that described a 'by design' error in Jet3 which transposed the functionality of USERCOMMITSYNC meaning a value of NO gets synchronous commit.
Therefore MS set the default to NO and we get synchronous commit by default. Exactly as described above by David Fenton. A behavior we've all come to accept.
But, the document then went on to explain that the behavior in oledb/Jet4 has been changed. Basically MS fixed their bug and now a setting of USERCOMMITSYNC=YES does what it says.
But did they change the default? I think not because now my explicit transactions are NOT committing synchronously in .Net applications using oledb/jet4.

Related

Keep data available during a transaction (postgresql)

Why can't I access data during a transaction with a truncate statement ?
I thought it would be possible to read the data of a table while a transaction is running on the same table.
It is possible as long as you don't do a truncate statement.
And I didn't find anything against it in the documentation.
Here is a sample project to illustrate this behavior; the README tries to explain how to reproduce it: https://github.com/Haelle/pg_transaction_tests
If it is not supposed to happen, it could be an issue with one of the following:
my code...
ActiveRecord
gem pg
an option in Postgresql
This is something specific to postgresql, from the TRUNCATE documentation:
TRUNCATE acquires an ACCESS EXCLUSIVE lock on each table it operates on, which blocks all other concurrent operations on the table. When RESTART IDENTITY is specified, any sequences that are to be restarted are likewise locked exclusively. If concurrent access to a table is required, then the DELETE command should be used instead.
This is a very specific use-case you are stuck with. Not sure why one would need to be able to access the table, which is about to be truncated, during the transaction of the truncate. But, as the note says: use delete instead. In rails this means: .destroy_all (checks rails validations) or .delete_all (does not check rails validations)
The documentation states:
TRUNCATE acquires an ACCESS EXCLUSIVE lock on each table it operates on, which blocks all other concurrent operations on the table.
This will even block the ACCESS SHARE lock required by a SELECT.
This is necessary because the table receives a new file during that operation (the old one is deleted at COMMIT time).
If you need an operation that won't block concurrent readers, use the (much more expensive)
DELETE FROM tablename;

Avoid too long active transaction with IBX components

My question is very simple: what is the best practice for avoiding overly long active transactions in an application that uses many TIBDataSet components? I want to avoid having a very old OAT (oldest active transaction) and the very bad performance that comes with it.
My application has several datasets that must always be open (for as long as the application is running). I would like to avoid closing and reopening the transaction, because I would then have to reopen all the datasets.
Must I replace this component?
And if yes, what is the best choice?
ClientDataSet with DataSetProvider, or a switch to the IBO components (even though I would rather not install other components in my IDE)?
Read-only transactions don't affect the performance of the FB server. In our project we use a single project-wide, read-only, always-open transaction for data fetching and multiple short-lived transactions for data modification.
We use modified IBX components to which a second, separate transaction for data reading was added.

Interbase transaction monitoring

I have a very strange problem with transactions in Interbase 7.5 which seem to be stuck.
I can track the problem with IBConsole -> right click DB -> Performance Monitor -> Transactions
Usually this list should show only a few active transactions. But I get several hundred active transactions when I start my application (a web module for an Apache web server using Delphi 7 Interbase components, e.g. IBQuery, IBTransaction, ...)
Transaction type is always listed as snapshot, if this is of relevance.
I have already triple checked all sql statements and cannot find anything that should produce such problems...
Is there any way to get the SQL statements of a specific transaction?
Any other suggestion how to find such a problem would be very welcome.
Is there any way to get the SQL statements of a specific transaction?
Yes, you can SELECT from TMP$STATEMENTS WHERE TRANSACTION_ID = .... That's from memory, but should get you started.
In IB Performance Monitor, you can locate the transaction from the statements tab, using the button on the toolbar. Can't remember if you can go the other way in that app. It's been a long time since I wrote it!
Active IBX data-sets require an active transaction all the time. If you don't have active data-sets just don't forget to commit all the active transactions.
If you have active data-sets, you can configure all your components to use the same TIBTransaction object, and you can also configure that single TIBTransaction to commit or roll back after an idle time-out period via the IdleTimer and DefaultAction properties.
Terminating the transaction (by manually or automatically committing or rolling back) will close all the linked datasets (TIBQuery, TIBTable and the like).
You may be tempted to use the CommitRetaining or RollbackRetaining methods to terminate the transaction without closing the related data-sets, but this may affect the performance of the server, and my advice is to always avoid using them.
If you want to improve your application, you should consider changing your database connection layer or introducing an in-memory capable dataset over IBX, for example Delphi's TClientDataSet, which allows you to retrieve data and retain it in memory while closing all the underlying datasets (and transactions). You can still use the traditional Insert/Append/Edit/Delete methods to modify the data and then apply those changes to the database in a new short-lived transaction.

How to revert changes with procedural memory?

Is it possible to store all changes of a set by using some means of logical paths - of the changes as they occur - such that one may revert the changes by essentially "stepping back"? I assume that something would need to map the changes as they occur, and the process of reverting them would thus ultimately be linear.
Apologies for any incoherence; this isn't specific to any particular language. Rather, it's a problem of memory, i.e. can a set (e.g. some store of user input) of a finite size that's changed continuously (e.g. at any given time for any amount of time; there's no limit on how much it can be changed) be mapped procedurally, such that new (future) changes are assumed to be the consequence of prior changes (in a second, mirror store that can be used to revert the state of the set all the way to its initial state)?
You might want to look at some functional data structures. Functional languages, like Erlang, make it easy to roll back to an earlier state, since changes are always made on new data structures instead of mutating existing ones. While this feature can be used repeatedly internally, Erlang programming typically uses it at the top level of a "process", so that on any kind of failure it aborts both the processing and all the changes in their entirety simply by throwing an exception (in a non-functional language, using mutable data structures, you'd be able to throw an exception to abort, but restoring the originals would be your program's job, not the runtime's). This is one reason that Erlang has a solid reputation.
Some of this functional style of programming is usefully applied to non-functional languages, in particular, use of immutable data structures, such as immutable sets, lists, or trees.
Regarding immutable sets, for example, one might design a functionally-oriented data structure where modifications always generate a new set given some changes and an existing set (a change set consisting of additions and removals). You'd leave the old set hanging around for reference (by whomever); languages with automatic garbage collection reclaim the old ones when they're no longer being used (referenced).
You can put an id or tag into your set data structure; this way you can do some introspection to see which data structure id someone has a hold of. You can also capture the id of the base from which each new version was generated; this gives you some history or lineage.
If desired, you can also capture a reference to the entire old data structure in the new one, or maintain a global list of all of the sets as they are being generated. If you do, however, you'll have to take over more responsibility for storage management, as an automatic collector will probably not find any unused (unreferenced) garbage to collect without some additional help.
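As a rough sketch of the ideas above (new version per change, an id per version, and a reference back to the base it was generated from), something like the following could work. The class name VersionedSet and its methods are invented for this illustration, and it assumes the set holds hashable items.

```python
import itertools
from dataclasses import dataclass
from typing import FrozenSet, Optional

_next_id = itertools.count(1)  # simple source of unique version ids


@dataclass(frozen=True)
class VersionedSet:
    items: FrozenSet[str]
    version_id: int
    parent: Optional["VersionedSet"] = None  # the version this one was built from

    @staticmethod
    def initial(items=()):
        return VersionedSet(frozenset(items), next(_next_id))

    def change(self, add=(), remove=()):
        """Return a NEW version; the current one is left untouched."""
        new_items = (self.items | frozenset(add)) - frozenset(remove)
        return VersionedSet(new_items, next(_next_id), parent=self)

    def lineage(self):
        """Version ids from this version back to the initial one."""
        node, ids = self, []
        while node is not None:
            ids.append(node.version_id)
            node = node.parent
        return ids
```

Stepping back is then just following parent. Because every version keeps its parent alive, nothing is ever garbage collected, which is the storage-management cost mentioned above.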
Database designs do some of this in their transaction controllers. For the purposes of your question, you can think of a database as a glorified set. You might look into MVCC (Multi-Version Concurrency Control) as one example that is reasonably well written up in the literature. This technique keeps old snapshot versions of data structures around (temporarily), meaning that mutations always appear to be in new versions of the data. An old snapshot is maintained until no active transaction references it; then it is discarded.
When two concurrently running transactions both modify the database, they each get a new version based off the same current and latest data set. (The transaction controller knows exactly which version each transaction is based off of, though the transaction's client doesn't see the version information.) Assuming both concurrent transactions choose to commit their changes, the versioning control in the transaction controller recognizes that the second committer is trying to commit a change set that is not a logical successor to the first (since both change sets, as we postulated above, were based on the same earlier version).
If possible, the transaction controller will merge the changes as if the second committer had really been working off the other, newer version committed by the first committer. (There are varying definitions of when this is possible; MVCC says it is when there are no write conflicts, which is a less-than-perfect answer but fast and scalable.) If it is not possible, it will abort the second committer's transaction and inform the second committer thereof (they then have the opportunity, should they like, to retry their transaction starting from the newer base).
Under the covers, the various snapshot versions in flight for concurrent transactions will probably share the bulk of the data (with some transaction-specific change sets that are consulted first) in order to make the snapshots cheap.
There is usually no API provided to access older versions, so in this domain, the transaction controller knows that as transactions retire, the original snapshot versions they were using can also be (reference counted and) retired.
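The commit rule described above (merge when there is no write conflict, otherwise abort the second committer) can be sketched in a few lines. This is a toy model, not how any particular database implements MVCC: the names Store, Transaction and Conflict are invented here, and each snapshot is a full copy rather than shared data.

```python
class Conflict(Exception):
    """Raised when a commit overlaps a write committed after our snapshot."""


class Store:
    def __init__(self, data=None):
        self.data = dict(data or {})
        self.version = 0
        self.history = []  # (version, keys written by that commit)

    def begin(self):
        # Each transaction gets its own snapshot. A real engine would share
        # unchanged data between snapshots instead of copying everything.
        return Transaction(self, self.version, dict(self.data))


class Transaction:
    def __init__(self, store, base_version, snapshot):
        self.store = store
        self.base_version = base_version
        self.snapshot = snapshot
        self.writes = {}

    def read(self, key):
        # The transaction's own change set is consulted first.
        return self.writes.get(key, self.snapshot.get(key))

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        mine = set(self.writes)
        for version, keys in self.store.history:
            # Only commits made after our snapshot was taken can conflict.
            if version > self.base_version and keys & mine:
                raise Conflict(sorted(keys & mine))
        # No write conflict: merge as if we had started from the newer version.
        self.store.data.update(self.writes)
        self.store.version += 1
        self.store.history.append((self.store.version, mine))
```

Two transactions that start from the same version and write different keys both commit; if they write the same key, the second commit raises Conflict and can be retried from a fresh begin().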
Another area where this is done is append-only files. Logging is a way of recording changes; some databases are based 100% on log-oriented designs.
BerkeleyDB has a nice log structure. Though used mostly for recovery, it does contain all the history so you can recreate the database from the log (up to the point you purge the log in which case you should also archive the database). Again someone has to decide when they can start a new log file, and when they can purge old log files, which you'd do to conserve space.
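The log idea can be tried in memory too. The following sketch records every change in an append-only list and rebuilds any earlier state by replaying a prefix of that list; the class LoggedSet and its upto parameter are hypothetical names chosen for this example, and it has nothing to do with BerkeleyDB's actual log format.

```python
class LoggedSet:
    """A set whose every change is recorded in an append-only log."""

    def __init__(self, initial=()):
        self._initial = frozenset(initial)
        self._log = []  # entries are ("add" | "remove", item); never rewritten

    def add(self, item):
        self._log.append(("add", item))

    def remove(self, item):
        self._log.append(("remove", item))

    def __len__(self):
        """Number of recorded changes."""
        return len(self._log)

    def state(self, upto=None):
        """Rebuild the set by replaying the first `upto` log entries.

        upto=None replays the whole log (the current state);
        upto=0 gives back the initial state.
        """
        current = set(self._initial)
        for op, item in self._log[:upto]:
            if op == "add":
                current.add(item)
            else:
                current.discard(item)
        return current
```

Reverting is linear in the length of the log, as the question anticipated. Purging old entries to save space would require first storing a snapshot of the state at the purge point.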
These database techniques can be applied in memory as well. (Nothing is free, though, of course ;)
Anyway, yes, there are fields where this is done.
Immutable data structures help preserve history, by simply keeping old copies; changes always go to new copies. (And efficiency techniques can make this not as bad as it sounds.)
Id's can help understand lineage without necessarily holding onto all the old copies.
If you do want to hold onto all the old copies, you have to look at your domain design to understand when/how/if old data structures can possibly get accessed, with an eye toward how to eventually reclaim them. You'll most likely have to get involved in defining how they get released, if ever, or how they get archived for posterity, though at the cost of slower access later.

Interprocess SQLite Thread Safety (on iOS)

I'm trying to determine if my sqlite access to a database is thread-safe on iOS. I'm writing a non App Store app (or possibly a launch daemon), so Apple's approval isn't an issue. The database in question is the built-in sms.db, so for sure the OS is also accessing this database for reading and writing. I only want to be able to safely read it.
I've read this about reading from multiple processes with sqlite:
Multiple processes can have the same database open at the same time.
Multiple processes can be doing a SELECT at the same time. But only
one process can be making changes to the database at any moment in
time, however.
I understand that thread-safety can be compiled out of sqlite, and that sqlite3_threadsafe() can be used to test for this. Running this on iOS 5.0.1
int safe = sqlite3_threadsafe();
yields a result of 2. According to this, that means mutex locking is available. But, that doesn't necessarily mean it's in use.
I'm not entirely clear on whether thread-safety is dynamically enabled on a per connection, per database, or global basis.
I have also read this. It looks like sqlite3_config() can be used to enable safe multi-threading, but of course, I have no control, or visibility into how the OS itself may have used this call (do I?). If I were to make that call again in my app, would it make it safe to read the database, or would it only deconflict concurrent access for multiple threads in my app that used the same sqlite3 database handle?
Anyway, my question is ...
can I safely read this database that's also accessed by iOS, and if so, how?
I've never used SQLite, but I've spent a decent amount of time reading its docs because I plan on using it in the future (and the docs are interesting). I'd say that thread safety is independent of whether multiple processes can access the same database file at once. SQLite, regardless of what threading mode it is in, will lock the database file, so that multiple processes can read from the database at once but only one can write.
Thread safety only affects how your process can use SQLite. Without any thread safety, you can only call SQLite functions from one thread. But it should still, say, take an EXCLUSIVE lock before writing, so that other processes can't corrupt the database file. Thread safety just protects data in your process's memory from getting corrupted if you use multiple threads. So I don't think you ever need to worry about what another process (in this case iOS) is doing with an SQLite database.
Edit: To clarify, any time you write to the database, including a plain INSERT/UPDATE/DELETE, it will automatically take an EXCLUSIVE lock, write to the database, then release the lock. (And it actually takes a SHARED lock, then a RESERVED lock, then a PENDING lock, then an EXCLUSIVE lock before writing.) By default, if the database is already locked (say from another process), then SQLite will return SQLITE_BUSY without waiting. You can call sqlite3_busy_timeout() to tell it to wait longer.
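The locking behavior described above can be observed with a small experiment. This is a sketch using Python's built-in sqlite3 module rather than the C API, with two connections in one process standing in for two processes (an assumption made so the example is self-contained); the table name message and the function name are invented. Note that Python's module waits 5 seconds by default, so timeout=0 is passed to get the immediate SQLITE_BUSY behavior.

```python
import os
import sqlite3
import tempfile


def demo_locking(db_path):
    """Return (rows_seen_by_reader, second_writer_was_blocked)."""
    # isolation_level=None: the module issues no BEGIN of its own, so the
    # transactions below are exactly the ones written here.
    writer = sqlite3.connect(db_path, isolation_level=None)
    writer.execute("CREATE TABLE IF NOT EXISTS message (body TEXT)")
    writer.execute("DELETE FROM message")
    writer.execute("INSERT INTO message VALUES ('committed')")

    # timeout=0: report a busy database immediately instead of retrying.
    other = sqlite3.connect(db_path, timeout=0, isolation_level=None)

    writer.execute("BEGIN IMMEDIATE")  # takes the RESERVED (write-intent) lock
    writer.execute("INSERT INTO message VALUES ('uncommitted')")

    # Reading is still allowed while the writer only holds RESERVED, and the
    # reader sees only committed data.
    rows_seen = other.execute("SELECT COUNT(*) FROM message").fetchone()[0]

    try:
        other.execute("BEGIN IMMEDIATE")  # a second writer is refused
        other.execute("ROLLBACK")
        blocked = False
    except sqlite3.OperationalError:  # "database is locked"
        blocked = True

    writer.execute("COMMIT")
    writer.close()
    other.close()
    return rows_seen, blocked


def make_temp_db_path():
    """Path for a throwaway database file used by the demo."""
    return os.path.join(tempfile.mkdtemp(), "demo.db")
```

Passing a larger timeout to sqlite3.connect is the Python counterpart of calling sqlite3_busy_timeout(): the second writer would then wait for the lock instead of failing at once.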
I don't think any of this is news to you, but a few thoughts:
In terms of enabling multi-threading (either serialized or multi-threaded), the general counsel is that one can invoke sqlite3_config() (but you may have to do a shutdown first as suggested in the docs or as discussed on SO here) to enable the sort of multi-threading you want. That may be of diminished usefulness here, though, where you have no control over what sort of access iOS is requesting of sqlite and/or this database.
Thus, I would have thought that, from an academic perspective, it would not be safe to read this system database (because as you say, you have no assurance of what the OS is doing). But I wouldn't be surprised if iOS is opening the database using whatever the default mode is, so from a more pragmatic perspective, you might be fine.
Clearly, for most users concerned about multi-threaded access within a single app, the best counsel would be to bypass the sqlite3_config() silliness and simply ensure coordinated access through your own GCD serial queue (i.e., have a dedicated queue through which all database interactions go, gracefully eliminating the multi-thread issue altogether). Sadly, that's not an option here because you're trying to coordinate database interaction with iOS itself.
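For readers who want to see the serial-queue pattern outside of GCD, here is a rough Python analogue: a single-worker executor plays the role of the dedicated serial queue, and every database call is funneled through it. The class name SerialDatabase and its methods are made up for this sketch, and it only addresses threads within one process, exactly like the GCD approach.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


class SerialDatabase:
    """All SQLite work runs one task at a time on a single worker thread."""

    def __init__(self, path):
        # max_workers=1 makes the executor behave like a serial queue.
        self._queue = ThreadPoolExecutor(max_workers=1)
        # Open the connection on the worker thread, since sqlite3 connections
        # are by default usable only from the thread that created them.
        self._conn = self._queue.submit(sqlite3.connect, path).result()

    def run(self, sql, params=()):
        """Execute one statement on the queue and return its rows."""
        def work():
            with self._conn:  # commit on success, roll back on error
                return self._conn.execute(sql, params).fetchall()
        return self._queue.submit(work).result()

    def close(self):
        self._queue.submit(self._conn.close).result()
        self._queue.shutdown()
```

Callers on any thread use db.run(...) and never touch the connection directly, so no two statements ever execute concurrently within the app.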

Resources