Trying to hack my way around SQLite3 concurrent writing, any better way to do this? - delphi

I use Delphi XE2 along with DISQLite v3 (which is basically a port of SQLite3). I love everything about SQLite3, except the lack of concurrent writing, especially that I extensively rely on multi-threading in this project :(
My profiler made it clear I needed to do something about it, so I decided to use this approach:
Whenever I need to insert a record in DB, Instead of doing an INSERT, I write the SQL query in a special foler, ie.
WriteToFile_Inline(SPECIAL_FOLDER_PATH + '\' + GUID, FileName + '|' + IntToStr(ID) + '|' + Hash + '|' + FloatToStr(ModifDate) + '|' + ...);
I added a timer (in the main app thread) that fires every minute, parse these files and then INSERT the queries using a transaction.
Delete those temporary files at the end.
The result is I have like 500% performance gain. Plus this technique is ACID, as I can always scan the SPECIAL_FOLDER_PATH after a power failure and execute the INSERTs I find.
Despite the good results, I'm not very happy with the method used (hackish to say the least), I keep thinking that if I could have a generics-like with fast lookup access, thread-safe, ACID list, this would be much cleaner (and possibly faster?)
So my question is: do you know anything like that for Delphi XE2?
PS. I trust many of you reading the code above be in shock and will start insulting me at this point! Please be my guest, but if you know a better (ie. faster) ACID approach, please share your thoughts!

Your idea of sending the inserts to a queue, which will rearrange the inserts, and join them via prepared statements is very good. Using a timer in the main thread or a separated thread is up to you. It will avoid any locking.
Do not forget to use a transaction, then commit it every 100/1000 inserts for instance.
About high performance using SQLite3, see e.g. this blog article (and graphic below):
In this graphic, best performance (file off) comes from:
PRAGMA synchronous = OFF
Using prepared statements
Inside a transaction
In WAL mode (especially in concurrency mode)
You may also change the page size, or the journal size, but settings above are the best. See https://stackoverflow.com/search?q=sqlite3+performance
If you do not want to use a background thread, ensure WAL is ON, prepare your statements, use batchs, and regroup your process to release the SQLite3 lock as soon as possible.
The best performance will be achieved by adding a Client-Server layer, just as we did for mORMot.

With files you organized an asynchronous job queue with persistance. It allows you to avoid one-by-one and use batch (records group) approach to insert the records. Comparing one-by-one and batch:
first works in auto-commit mode (probably) for each record, second wraps a batch into a single transaction and gives greatest performance gain.
first prepares an INSERT command each time when you need to insert a record (probably), second once per batch and gives second by value gain.
I dont think, that SQLite concurrency is a problem in your case (at least not the main issue). Because in SQLite a single insert is comparably fast and concurrency performance issues you will get with high workload. Probably similar results you will get with other DBMS, like Oracle.
To improve your batch approach, consider the following:
consider to set journal_mode to WAL and disable shared cache mode.
use a background thread to process your queue. Instead of a fixed time interval (1 min), check SPECIAL_FOLDER_PATH more often. And if the queue has more than X Kb of data, then start processing. Or use a count of queued records and event to notify the thread, that the queue should start processing.
use multy-record prepared INSERT instead of single-record INSERT. You can build an INSERT for 100 records and process your queue data in a single batch, but by 100 record chanks.
consider to write / read a binary field values instead of a text values.
consider to use a set of files with preallocated size.
etc

sqlite3_busy_timeout is pretty inefficient because it doesn't return immediately when the table it's waiting on is unlocked.
I would try creating a critical section (TCriticalSection?) to protect each table. If you enter the critical section before inserting a row and exit it immediately thereafter, you will create better table locks than SQLite provides.
Without knowing your access patterns, though, it's hard to say if this will be faster than batching up a minute's worth of inserts into single transactions.

Related

How to ensure insert rate 1 insert per second when using ClickhouseIO

I'm using Apache Beam Java SDK to process events and write them to the Clickhouse Database.
Luckily there is ready to use ClickhouseIO.
ClickhouseIO accumulates elements and inserts them in batch, but because of the parallel nature of the pipeline it still results in a lot of inserts per second in my case. I'm frequently receiving "DB::Exception: Too many parts" or "DB::Exception: Too much simultaneous queries" in Clickhouse.
Clickhouse documentation recommends doing 1 insert per second.
Is there a way I can ensure this with ClickhouseIO?
Maybe some KV grouping before ClickhouseIO.Write or something?
It looks like you interpret these errors not quite correct:
DB::Exception: Too many parts
It means that insert affect more partitions than allowed (by default this value is 100, it is managed by parameter max_partitions_per_insert_block).
So either the count of affected partition is really large or the PARTITION BY-key was defined pretty granular.
How to fix it:
try to group the INSERT-batch such way it contains data related to less than 100 partitions
try to reduce the size of insert-block (if it quite huge) - withMaxInsertBlockSize
increase the limit max_partitions_per_insert_block in SQL-query (like this, INSERT .. SETTINGS max_partitions_per_insert_block=300 (I think ClickhouseIO should have the ability to set custom options on query level)) or on server-side by modifying userprofile-settings
DB::Exception: Too much simultaneous queries
This one managed by param max_concurrent_queries.
How to fix it:
reduce the count of concurrent queries by Beam means
increase the limit on the server-side in userprofile- or server-settings (see https://github.com/ClickHouse/ClickHouse/issues/7765)

Apache-camel Xpathbuilder performance

I have following question. I set up an camel -project to parse certain xml files. I have to selecting take out certain nodes from a file.
I have two files 246kb and 347kb in size. I am extracting a parent-child pair of 250 nodes in the above given example.
With the default factory here are the times. For the 246kb file respt 77secs and 106 secs. I wanted to improve the performance so switched to saxon and the times are as follows 47secs and 54secs. I was able to cut the time down by at least half.
Is it possible to cut the time further, any other factory or optimizations I can use will be appreciated.
I am using XpathBuilder to cut the xpaths out. here is an example. Is it possible to not to have to create XpathBuilder repeatedly, it seems like it has to be constructed for every xpath, I would have one instance and keep pumping the xpaths into it, maybe it will improve performance further.
return XPathBuilder.xpath(nodeXpath)
.saxon()
.namespace(Consts.XPATH_PREFIX, nameSpace)
.evaluate(exchange.getContext(), exchange.getIn().getBody(String.class), String.class);
Adding more details based on Michael's comments. So I am kind of joining them, will become clear with my example below. I am combining them into a json.
So here we go, Lets say we have following mappings for first and second path.
pData.tinf.rexd: bm:Document/bm:xxxxx/bm:PmtInf[{0}]/bm:ReqdExctnDt/text()
pData.tinf.pIdentifi.instId://bm:Document/bm:xxxxx/bm:PmtInf[{0}]/bm:CdtTrfTxInf[{1}]/bm:PmtId/bm:InstrId/text()
This would result in a json as below
pData:{
tinf: {
rexd: <value_from_xml>
}
pIdentifi:{
instId: <value_from_xml>
}
}
Hard to say without seeing your actual XPath expression, but given the file sizes and execution time my guess would be that you're doing a join which is being executed naively as a cartesian product, i.e. with O(n*m) performance. There is probably some way of reorganizing it to have logarithmic performance, but the devil is in the detail. Saxon-EE is quite good at optimizing join queries automatically; if not, there are often ways of doing it manually -- though XSLT gives you more options (e.g. using xsl:key or xsl:merge) than XPath does.
Actually I was able to bring the time down to 10 secs. I am using apache-camel. So I added threads there so that multiple files can be read in separate threads. Once the file was being read, it had serial operation to based on the length of the nodes that had to be traversed. I realized that it was not necessary to be serial here so introduced parrallelStream and that now gave it enough power. One thing to guard agains is not to have a proliferation of threads since that can degrade the performance. So I try to restrict the number of threads to twice or thrice the number of cores on the operating machine.

Memory usage increases with every record read

I have a couple of database management tasks that need to go through every record in the database. It was my understanding that with the CakePHP 3.x ORM, I could do something like this, and it would only ever have one record in memory at a time:
$records = TableRegistry::get('Whatever')->find();
foreach ($records as $record) {
// do some processing
}
However, this is eventually crashing with an "out of memory" exception. I've added a bit of logging of memory_get_peak_usage, and it's increasing with every iteration, even if there is nothing other than the logging happening inside the foreach loop. The delta is around 12K every time through the loop.
I'm running 3.2.7, and results are similar whether I have debugging and/or SQL logging enabled or not. Adding frequent calls to gc_collect_cycles() only slows the process down, it doesn't help with the memory usage.
Is this expected, or a bug? If the former, is there anything I can do differently in this code to prevent it? (Obviously, I could process it in smaller batches, but that's not an elegant solution.)
CakePHP 3.x ORM has built in query caching for the ResultSet object. When you iterator over the result set the entities are stored in an internal array. This is done so that you can rewind the iterator and loop again.
If you are going to iterate over a large result set only once, and you want to reduce memory usage then you have to disable result buffering.
$records = TableRegistry::get('Whatever')->find()->bufferResults(false);
foreach ($records as $record) {
// do some processing
}
With buffering turned off the entity is fetched from the result set and there should be no references to it afterwards.
Documentation for this feature is available in the CakePHP book: https://book.cakephp.org/3.0/en/orm/retrieving-data-and-resultsets.html#working-with-result-sets
Here's the API reference: https://api.cakephp.org/3.6/class-Cake.Database.Query.html#_bufferResults
From my understanding it is the expected behaviour, as you execute the query build with the ORM when you start iterating over the object($records). Thus all the data is loaded into memory, and you then iterate over each entry one by one.
If you want to limit the memory usage I would suggest you look into limit and offset. With these you can extract subsets to work on, thus limiting memory usage.

Optimizing looping through dataset

I have a code some thing like this
dxMemOrdered : TdxMemData;
while not qrySandbox2.EOF do
begin
dxMemOrdered.append;
dxMemOrderedTotal.asCurrency := qrySandbox2.FieldByName('TOTAL').asCurrency;
dxMemOrdered.post;
qrySandbox2.Next;
end;
this code executes in a thread. When there are huge records say "400000" it is taking around 25 minutes to parse through it. Is there any way that i can reduce the size by optimizing the loop? Any help would be appreciated.
Update
Based on the suggestions i made the following changes
dxMemOrdered : TdxMemData;
qrySandbox2.DisableControls;
while not qrySandbox2.Recordset.EOF do
begin
dxMemOrdered.append;
dxMemOrderedTotal.asCurrency := Recordset.Fields['TOTAL'].Value;
dxMemOrdered.post;
qrySandbox2.Next;
end;
qrySandbox2.EnableControls;
and my output time have improved from 15 mins to 2 mins. Thank you guys
Without seeing more code, the only suggestion I can make is make sure that any visual control that is using the memory table is disabled. Suppose you have a cxgrid called Grid that is linked to your dxMemOrdered memory table:
var
dxMemOrdered: TdxMemData;
...
Grid.BeginUpdate;
try
while not qrySandbox2.EOF do
begin
dxMemOrdered.append;
dxMemOrderedTotal.asCurrency := qrySandbox2.FieldByName('TOTAL').asCurrency;
dxMemOrdered.Post;
qrySandbox2.Next;
end;
finally
Grid.EndUpdate;
end;
Some ideas in order of performance gain vs work to do by you:
1) Check if the SQL dialect that you are using lets you use queries that directly SELECT from/INSERT to. This depends on the database you're using.
2) Make sure that if your datasets are not coupled to visual controls, that you call DisableControls/EnableControls around this loop
3) Does this code have to run in the main program thread? Maybe you can send if off to a separate thread while the user/program continues doing something else
4) When you have to deal with really large data, bulk insertion is the way to go. Many databases have options to bulk insert data from text files. Writing to a text file first and then bulk inserting is way faster than individual inserts. Again, this depends on your database type.
[Edit: I just see you inserting the info that it's TdxMemData, so some of these no longer apply. And you're already threading, missed that ;-). I leave this suggestions in for other readers with similar problems]
It's much better to let SQL do the work instead of iterating though a loop in Delphi. Try a query such as
insert into dxMemOrdered (total)
select total from qrySandbox2
Is 'total' the only field in dxMemOrdered? I hope that it's not the primary key otherwise you are likely to have collisions, meaning that rows will not be added.
There's actually a lot you could do to speed up your thread.
The first would be to look at the problem in a broader perspective:
Am I fetching data from a cached / fast disk, possibly moved in memory?
Am I doing the right thing, when aggregating totals by hand? SQL engines are expecially optimized to do those things, all you'd need to do is to define an additional logical field where to store the SQL aggregated result.
Another little optimization that may bring an improvement over large amounts of looping is to not use constructs like:
Recordset.Fields['TOTAL'].Value
Recordset.FieldByName('TOTAL').Value
but to add the fields with the fields editor and then directly accessing the right field. You'll save a whole loop through the fields collection, that otherwise is performed on every field, on every next record.

ActiveRecord bulk data, memory grows forever

I am using ActiveRecord to bulk migrate some data from a table in one database to a different table in another database. About 4 million rows.
I am using find_each to fetch in batches. Then I do a little bit of logic to each record fetched, and write it to a different db. I have tried both directly writing one-by-one, and using the nice activerecord-import gem to batch write.
However, in either case, my ruby process memory usage is growing quite a bit throughout the life of the export/import. I would think that using find_each, I'm getting batches of 1000, there should only be 1000 of them in memory at a time... but no, each record I fetch seems to be consuming memory forever, until the process is over.
Any ideas? Is ActiveRecord caching something somewhere that I can turn off?
update 17 Jan 2012
I think I'm going to give up on this. I have tried:
* Making sure everything is wrapped in a ActiveRecord::Base.uncached do
* Adding ActiveRecord::IdentityMap.enabled = false (I think that should turn off the identity map for the current thread, although it's not clearly documented, and I think the identity map isn't on by default in current Rails anyhow)
Neither of those seem to have much effect, memory is still leaking.
I then added a periodic explicit:
GC.start
That seems to slow down the rate of memory leak, but the memory leak still happens (eventually exhausting all memory and bombing).
So I think I'm giving up, and deciding it is not currently possible to use AR to read millions of rows from one db and insert them into another. Perhaps there is a memory leak in MySQL-specific code being used (that's my db), or somewhere else in AR, or who knows.
I would suggest queueing each unit of work into a Resque queue . I have found that ruby has some quirks when iterating over large arrays like these.
Have one main thread that queue's up the work by ID, then have multiple resque workers hitting that queue to get the work done.
I have used this method on approx 300k records, so it would most likely scale to millions.
Change line #86 to bulk_queue = [] since bulk_queue.clear only sets the length of the arrya to 0 makeing it impossible for the GC to clear it.

Resources