I have two applications (server and client) that use a TQuery connected to a TClientDataSet through a TDCOMConnection. In some cases the ClientDataSet opens about 300,000 records, and then the application throws the exception "Temporary table resource limit".
Is there any workaround to fix this (other than "don't open such a huge dataset")?
Update: oops, I'm sorry, there are 300K records, not 3 million.
The error might be coming from the TQuery rather than the TClientDataSet. A TQuery creates a temporary table, and it may be that limit you are hitting. That said, loading 3,000,000 records into a TClientDataSet is also a bad idea, since it will try to load every record into memory. That may be possible if the records are only a few bytes each, but it is probably still going to kill your machine (at 1 KB each you would obviously need a minimum of 3 GB of RAM).
You should try to break your data into smaller chunks. If it is the TQuery that is failing, that means adjusting the SQL (fewer fields / fewer records) or moving to a better database (the BDE is getting a little tired, after all).
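If the limit turns out not to be on the BDE side, another option is to keep the ClientDataSet from materializing everything at once. A minimal sketch (assuming a TClientDataSet wired to a provider; the procedure and packet size are placeholders) using PacketRecords and FetchOnDemand:

uses
  DBClient;

// Sketch: open a ClientDataSet so it fetches records in packets instead of
// loading the whole result set at once. CDS is any connected TClientDataSet.
procedure OpenIncrementally(CDS: TClientDataSet);
begin
  CDS.Close;
  CDS.PacketRecords := 500;   // records per packet; -1 would mean "fetch everything"
  CDS.FetchOnDemand := True;  // further packets are pulled as the user scrolls
  CDS.Open;                   // only the first packet travels over DCOM here
end;

With PacketRecords set, only the first packet crosses the DCOM connection when the dataset opens, and additional packets are requested only as they are needed.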
You have the answer already. Don't open such a huge dataset in a ClientDataSet (CDS).
Three million rows in a CDS is a huge memory load (depending on the size of each row, it can be gigantic).
The whole purpose of using a CDS is to work quickly with small datasets that can be manipulated in memory. Adding that many rows is ridiculous; use a real dataset instead, or redesign things so you don't need to retrieve so many rows at a time.
Over 3 million records is way too much to handle at once. My guess is that you are performing an export or something similar that requires that many records to be sent down the wire. One way to reduce this problem would be to have the middle tier generate an export file and then deliver that file to the client (preferably compressing it first using ZLIB or something similar).
If you are pulling data back to the client for viewing purposes, then consider sending summary information only, and then allowing the client to dig their way through the data a portion at a time. The users will thank you, because your performance will go way up and they won't have to wade through records they don't care about.
EDIT
Even 300,000 records is way too much to handle at once. If you had that many pennies, would you be able to carry them all? But if you converted them into larger denominations, you could. If you're sending data to the client for a report, then I strongly suggest a summary approach: give them the big picture and let them drill slowly into the data. Send grouped data and then let them open it up gradually.
If this is a search results screen, then set a limit on the number of records to be returned, plus 1. For example, to display 100 records, set the limit to 101. Still display only 100; the presence of that last record means there were MORE than 100 matches, so the customer needs to adjust their search criteria to return a smaller subset.
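As a sketch of that limit-plus-one idea (the dataset, field, and page size here are placeholders), read at most 101 rows, display 100, and use the extra row only as a "there is more" flag:

uses
  DB, Classes;

const
  PageSize = 100;

// Sketch: read at most PageSize + 1 rows from an already-opened dataset.
// MoreResults comes back True if a 101st row exists, i.e. the user should
// narrow the search criteria.
procedure FetchLimited(DS: TDataSet; Rows: TStrings; out MoreResults: Boolean);
var
  Count: Integer;
begin
  Rows.Clear;
  Count := 0;
  MoreResults := False;
  DS.First;
  while not DS.Eof do
  begin
    Inc(Count);
    if Count > PageSize then
    begin
      MoreResults := True;                // the extra row is never displayed
      Break;
    end;
    Rows.Add(DS.Fields[0].DisplayText);   // placeholder: build the real display row here
    DS.Next;
  end;
end;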
The temporary table resource limit is not a limit for one single query; it is the limit for all open queries together. So one solution may be to close all the other queries at that time.
If it is not possible for you to switch to an ADO connection, you can also design a paging mechanism to query the data page by page.
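For example, a keyset-style paging sketch (it assumes an indexed integer ID column and a BDE TQuery; the table and column names are placeholders, and the caller reads at most a page's worth of rows before asking for the next page):

uses
  DBTables;  // BDE TQuery

// Sketch of keyset paging: fetch rows with ID greater than the last one seen.
// The caller reads up to its page size, remembers the largest ID it saw, and
// passes that back in for the next page.
procedure OpenNextPage(Qry: TQuery; LastID: Integer);
begin
  Qry.Close;
  Qry.SQL.Text :=
    'select * from ORDERS ' +     // placeholder table name
    'where ID > :LastID ' +
    'order by ID';
  Qry.ParamByName('LastID').AsInteger := LastID;
  Qry.Open;
end;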
GOOD LUCK
Related
I have an application that, at its core, is a sort of data warehouse and report generator. People use it to "mine" through a large amount of data with ad-hoc queries, produce a report page with a bunch of distribution graphs, and click through those graphs to look at specific result sets of the underlying items being "mined." The problem is that the database is now many hundreds of millions of rows of data, and even with indexing, some queries can take longer than a browser is willing to wait for a response.
Ideally, at some arbitrary cutoff, I want to "offline" the user's query, and perform it in the background, save the result set to a new table, and use a job to email a link to the user which could use this as a cached result to skip directly to the browser rendering the graphs. These jobs/results could be saved for a long time in case people wanted to revisit the particular problem they were working on, or emailed to coworkers. I would be tempted to just create a PDF of the result, but it's the interactive clicking of the graphs that I'm trying to preserve here.
None of the standard Rails caching techniques really captures this idea, so maybe I just have to do this all by hand, but I wanted to check to see if I wasn't missing something that I could start with. I could create a keyed model result in the in-memory cache, but I want these results to be preserved on the order of months, and I deploy at least once a week.
Considering that the data is being loaded from lots of joined tables, that's why it's taking time to load.
You are also performing calculation/visualization tasks with the data you fetch from the DB before showing it in the UI.
I'd like to recommend some approaches to your problem:
Minimize the number of joins/nested-join DB queries.
Add some denormalized tables/columns. For example, if you are showing the count of a user's comments, you can add a new column to the user table to store that count in the user table itself. You can add a scheduled job to update the data, or add a callback to update the count.
Also try to minimize the calculations (if any) performed on the UI side.
You can also use lazy loading to fetch the data in chunks.
Thanks, hope this will help you to decide where to start 🙂
By using both the synchronous=OFF and journal_mode=MEMORY options, I am able to reduce the time for updates from 15 ms to around 2 ms, which is a major performance improvement. These updates happen one at a time, so many other optimizations (like wrapping a bunch of them in a transaction) are not applicable.
According to the SQLite documentation, the DB can go 'corrupt' in the worst case if there is a power outage of some type. However, isn't the worst that can happen that data is lost, or possibly that part of a transaction is lost (which I guess is a form of corruption)? Is it really possible for arbitrary corruption to occur with either of these options? If so, why?
I am not using any transactions, so partially written data from transactions is not a concern, and I can handle losing data once in a blue moon. But if 'corruption' means that all the data in the DB can be randomly changed in an unpredictable way, that would be a strong reason not to use these options.
Does anyone know what the real worst-case behavior would be on iOS?
Tables are organized as B-trees with the rowid as the key.
If some writes get lost while SQLite is updating the tree structure, the entire table might become unreadable.
(The same can happen with indexes, but those could be simply dropped and recreated.)
Data is organized in pages (typically 1 KB or 4 KB). If some page update gets lost while a tree is being reorganized, all the data in those pages (i.e., some random rows from the table with nearby rowid values) might become corrupted.
If SQLite needs to allocate a new page, and that page contains plausible data (e.g., deleted data from the same table), and the writing of that page gets lost, then you have incorrect data in the table, without the ability to detect it.
I need to insert 800000 records into an MS Access table. I am using Delphi 2007 and the TAdoXxxx components. The table contains some integer fields, one float field and one text field with only one character. There is a primary key on one of the integer fields (which is not autoinc) and two indexes on another integer and the float field.
Inserting the data using AdoTable.AppendRecord(...) takes more than 10 minutes, which is not acceptable since this is done every time the user starts using a new database with the program. I cannot prefill the table because the data comes from another database (which is not accessible through ADO).
I managed to get it down to around 1 minute by writing the records to a tab-separated text file and using a TAdoCommand object to execute
insert into table (...) select * from [filename.txt] in "c:\somedir" "Text;HDR=Yes"
But I don't like the overhead of this.
There must be a better way, I think.
EDIT:
Some additional information:
MS Access was chosen because it does not need any additional installation on the target machine(s) and the whole database is contained in one file which can be easily copied.
This is a single user application.
The data will be inserted only once and will not change for the lifetime of the database. However, the table contains one additional field that is used as a flag to indicate that the corresponding record in another database has been processed by the user.
One minute is acceptable (even up to 3 minutes would be), and my solution works, but it seems too complicated to me, so I thought there should be an easier way to do this.
Once the data has been inserted, the performance of the table is quite good.
When I started planning/implementing the feature of the program working with the Access database the table was not required. It only became necessary later on, when another feature was requested by the customer. (Isn't that always the case?)
EDIT:
From all the answers I got so far, it seems that I already have the fastest method for inserting that much data into an Access table. Thanks to everybody; I appreciate your help.
Since you've said that the 800K records won't change for the life of the database, I'd suggest linking to the text file as a table and skipping the insert altogether.
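As a minimal sketch of that idea, reusing the Jet text-driver syntax the question already has (file name, folder, and component are placeholders), you can simply query the text file in place; note that a tab-separated file usually needs a schema.ini with Format=TabDelimited sitting next to it:

uses
  ADODB;

// Sketch: read the text file directly through Jet's text ISAM, skipping the
// 800,000-row insert entirely. qryData is any TADOQuery on the application's
// TADOConnection.
procedure OpenTextAsTable(qryData: TADOQuery);
begin
  qryData.Close;
  qryData.ParamCheck := False;   // the ':' in the path is not a parameter
  qryData.SQL.Text :=
    'select * from [data.txt] in "c:\somedir" "Text;HDR=Yes"';  // placeholder file and folder
  qryData.Open;                  // behaves like a read-only table
end;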
If you insist on pulling it into the database, then 800,000 records in 1 minute is over 13,000 / second. I don't think you're gonna beat that in MS Access.
If you want it to be more responsive for the user, then you might want to consider loading some minimal set of data, and setting up a background thread to load the rest while they work.
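A rough sketch of that background-load idea, assuming the bulk INSERT ... SELECT from the question is kept (the connection string and SQL are passed in; everything else is a placeholder). ADO lives in COM apartments, so the worker thread needs its own connection and its own CoInitialize call:

uses
  Classes, ActiveX, ADODB;

type
  // Sketch: run the existing bulk-insert statement off the main thread so the
  // user can start working with a minimal data set immediately.
  TBulkLoadThread = class(TThread)
  private
    FConnectionString: string;
    FSQL: string;
  protected
    procedure Execute; override;
  public
    constructor Create(const AConnectionString, ASQL: string);
  end;

constructor TBulkLoadThread.Create(const AConnectionString, ASQL: string);
begin
  inherited Create(False);
  FreeOnTerminate := True;
  FConnectionString := AConnectionString;
  FSQL := ASQL;   // e.g. the insert ... select from [filename.txt] statement from the question
end;

procedure TBulkLoadThread.Execute;
var
  Cmd: TADOCommand;
begin
  CoInitialize(nil);   // ADO needs COM initialized in each thread
  try
    Cmd := TADOCommand.Create(nil);
    try
      Cmd.ConnectionString := FConnectionString;  // this thread gets its own connection
      Cmd.ParamCheck := False;                    // the path in the SQL contains ':'
      Cmd.CommandText := FSQL;
      Cmd.Execute;
    finally
      Cmd.Free;
    end;
  finally
    CoUninitialize;
  end;
end;

Something like TBulkLoadThread.Create(ConnStr, BulkSQL) can then be started right after the minimal data set is shown; just keep the user away from the big table until the thread reports that it is done.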
It would be quicker without the indexes. Can you add them after the import?
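If dropping the indexes does help, a hedged sketch of wrapping the existing bulk insert looks like this (index, table, and column names are invented; Jet accepts plain DDL for this):

uses
  ADODB;

// Sketch: drop the secondary indexes, run the bulk insert, then rebuild them.
// The primary key is left alone here.
procedure ImportWithoutIndexes(Con: TADOConnection; const BulkInsertSQL: string);
begin
  Con.Execute('drop index idxIntField on MyTable');
  Con.Execute('drop index idxFloatField on MyTable');
  try
    Con.Execute(BulkInsertSQL);   // the existing insert ... select from [filename.txt] statement
  finally
    Con.Execute('create index idxIntField on MyTable (IntField)');
    Con.Execute('create index idxFloatField on MyTable (FloatField)');
  end;
end;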
There are a number of suggestions that may be of interest in this thread: Slow MSAccess disk writing
What about skipping the text file and using ODBC or OLEDB to import directly from the source table? That would mean altering your FROM clause to use the source table name and an appropriate connect string as the IN '' part of the FROM clause.
EDIT:
Actually I see you say the original format is xBase, so it should be possible to use the xBase ISAM that is part of Jet instead of needing ODBC or OLEDB. That would look something like this:
INSERT INTO table (...)
SELECT *
FROM tablename IN 'c:\somedir\'[dBase 5.0;HDR=NO;IMEX=2;];
You might have to tweak that -- I just grabbed the connect string for a linked table pointing at a DBF file, so the parameters might be slightly different.
Your text-based solution seems the fastest, but you can make it quicker if you can get a preallocated MS Access file close to its final size. You can do that by filling a typical user database, closing the application (so the buffers are flushed), and manually deleting all the records of that big table, but not shrinking/compacting it.
Then use that file to start the real filling; Access will not request any (or very little) additional disk space. I don't remember whether MS Access has a way to automate this, but it can help a lot...
How about an alternate arrangement...
Would it be an option to make a copy of an existing Access database file that already has this table you need, and then just delete all the other data in there besides this one large table (I don't know if Access has an equivalent to something like "truncate table" in SQL Server)?
I would replace MS Access with another database, and for your situation I think SQLite is the best choice: it doesn't require any installation on the client machine, it's a very fast database, and it's one of the best embedded database solutions.
You can use it in Delphi in two ways:
You can download the database engine DLL from the SQLite website and use a free Delphi component to access it, such as Delphi SQLite components or SQLite4Delphi.
Use DISQLite3, which has the engine built in, so you don't have to distribute the DLL with your application; they have a free version ;-)
If you still need to use MS Access, try using TAdoCommand with a SQL INSERT statement directly instead of TADOTable; that should be faster than TADOTable.Append.
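A sketch of that suggestion (table, field, and source names are invented): prepare one parameterized INSERT in a TADOCommand and execute it per record. It is still row by row, but it skips TADOTable's dataset overhead:

uses
  DB, ADODB;

// Sketch: row-by-row insert through a prepared TADOCommand instead of
// TADOTable.Append/Post. All names are placeholders.
procedure InsertWithCommand(Con: TADOConnection; Source: TDataSet);
var
  Cmd: TADOCommand;
begin
  Cmd := TADOCommand.Create(nil);
  try
    Cmd.Connection := Con;
    Cmd.CommandText :=
      'insert into MyTable (IntField, FloatField, CharField) ' +
      'values (:IntField, :FloatField, :CharField)';
    Cmd.Prepared := True;
    Source.First;
    while not Source.Eof do
    begin
      Cmd.Parameters.ParamByName('IntField').Value := Source.FieldByName('IntField').Value;
      Cmd.Parameters.ParamByName('FloatField').Value := Source.FieldByName('FloatField').Value;
      Cmd.Parameters.ParamByName('CharField').Value := Source.FieldByName('CharField').Value;
      Cmd.Execute;
      Source.Next;
    end;
  finally
    Cmd.Free;
  end;
end;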
You won't be importing 800,000 records in less than a minute, as someone mentioned; that's really fast already.
You can skip the annoying translate-to-text-file step however if you use the right method (DAO recordsets) for doing the inserts. See a previous question I asked and had answered on StackOverflow: MS Access: Why is ADODB.Recordset.BatchUpdate so much slower than Application.ImportXML?
Don't use INSERT INTO even with DAO; it's slow. Don't use ADO either; it's slow. But DAO + Delphi + Recordsets + instantiating the DbEngine COM object directly (instead of via the Access.Application object) will give you lots of speed.
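From Delphi that could look roughly like the following late-bound sketch; the DAO.DBEngine.36 ProgID targets Jet 4.0 DAO, and the file, table, field names, and values are all placeholders:

uses
  ComObj;

// Sketch: append rows through a DAO recordset, talking to DBEngine directly
// via late-bound COM instead of going through Access.Application.
procedure InsertWithDAO(const MdbFile: string);
var
  Engine, Db, Rs: OleVariant;
  i: Integer;
begin
  Engine := CreateOleObject('DAO.DBEngine.36');  // Jet 4.0 DAO; adjust to your install
  Db := Engine.OpenDatabase(MdbFile);
  try
    Rs := Db.OpenRecordset('MyTable');           // placeholder table
    try
      for i := 1 to 800000 do
      begin
        Rs.AddNew;
        Rs.Fields['IntField'].Value := i;        // placeholder fields and values
        Rs.Fields['FloatField'].Value := i * 0.5;
        Rs.Fields['CharField'].Value := 'X';
        Rs.Update;
      end;
    finally
      Rs.Close;
    end;
  finally
    Db.Close;
  end;
end;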
You're looking in the right direction in one way. Using a single statement to bulk insert will be faster than trying to iterate through the data and insert it row by row. Access, being a file-based database, will be exceedingly slow at iterative writes.
The problem is that Access is handling how it optimizes writes internally and there's not really any way to control it. You've probably reached the maximum efficiency of an INSERT statement. For additional speed, you should probably evaluate if there's any way around writing 800,000 records to the database every time you start the application.
Get SQL Server Express (free) and connect to it from Access as an external table. SQL Server Express is much faster than MS Access.
I would prefill the database, and hand them the file itself, rather than filling an existing (but empty) database.
If the data you have to fill changes, then keep an Access database (MDB file) synchronized on the server via ODBC, using a bit of code that watches for changes in the main database and copies them to the Access database.
When the user requests a new database, zip up the MDB, transfer it to them, and open it.
Alternately, you may be able to find code that opens and inserts data into databases directly.
Alternately, alternately, you may be able to find another format (other than CSV) that Access can import faster.
-Adam
Also check to see how long it takes to copy the file. That will be the lower bound on how fast you can write data. In databases like SQL Server, it usually takes a bulk-load utility to get close to that speed. As far as I know, MS never created a tool that writes directly to MS Access tables the way bcp does. Specialized ETL tools will also optimize some of the steps surrounding the insert; for example, SSIS does transformations in memory, and DTS likewise has some optimizations.
Perhaps you could open an ADO Recordset on the table with lock mode adLockBatchOptimistic and CursorLocation adUseClient, write all the data into the recordset, then do a batch update (rs.UpdateBatch).
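In Delphi terms, that might look like this sketch (placeholder table and field names; TADOTable exposes the same LockType, CursorLocation, and UpdateBatch as a raw ADO recordset):

uses
  DB, ADODB;

// Sketch: buffer all inserts client-side and send them in one batch.
// Tbl is a TADOTable pointed at the target Access table.
procedure BatchInsert(Tbl: TADOTable; Source: TDataSet);
begin
  Tbl.Close;
  Tbl.CursorLocation := clUseClient;    // adUseClient
  Tbl.LockType := ltBatchOptimistic;    // adLockBatchOptimistic
  Tbl.Open;
  Source.First;
  while not Source.Eof do
  begin
    Tbl.Append;
    Tbl.FieldByName('IntField').Value := Source.FieldByName('IntField').Value;     // placeholder fields
    Tbl.FieldByName('FloatField').Value := Source.FieldByName('FloatField').Value;
    Tbl.FieldByName('CharField').Value := Source.FieldByName('CharField').Value;
    Tbl.Post;                           // buffered client-side, nothing written yet
    Source.Next;
  end;
  Tbl.UpdateBatch(arAll);               // one batch of writes to the MDB
end;

With 800,000 rows the client-side buffer gets large, so it may be worth calling UpdateBatch every few thousand rows instead of only once at the end.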
If it's coming from dBase, can you just copy the data and index files and attach them directly without loading? That should be pretty efficient (from the people who bring you FoxPro). I imagine it would use the existing indexes too.
At the least, it should be a pretty efficient single-command import.
How much do the 800,000 records change from one creation to the next? Would it be possible to pre-populate the records and then just update the ones that have changed in the external database when creating the new database?
This may allow you to create the new database file quicker.
How fast is your disk spinning? If it's 7,200 RPM, then 800,000 rows in 3 minutes is still 37 rows per disk revolution. I don't think you're going to do much better than that.
Meanwhile, if the goal is to streamline the process, how about a table link?
You say you can't access the source database via ADO. Can you set up a table link in MS Access to a table or view in the source database? Then a simple append query from the table link would copy the data over from the source database to the target database for you. I'm not sure, but I think this would be pretty fast.
If you can't set up a table link until runtime, maybe you could build the table link programmatically via ADO, then build the append query programmatically, and then invoke the append query.
Hi,
The best way is a bulk insert from a text file, as others have said.
You should write your records to a text file and then bulk-insert that file into the table.
That should take less than 3 seconds.
This is very strange. I have a Crystal Report (CR) that takes over 30 minutes to run. It uses 5 large tables and queries the server. I made a view on the server (which is an IBM i) to gather the data there. For some reason the CR is not giving me data past 08/12. When I query past that date on the server, it does have data, and even if I make a quick report in CR it will show all the data, including 2013.
The reason could possibly be this:
When I made the view, I mistakenly used a mix of databases, and one of the two databases was being used as part of a data purge, so it may not have had data past 8/12.
But since then, I have also modified the view to add some new columns, and the CR does pick these up and even shows them in the data it does return (up to 8/12).
So that would tell me that the CR is fully using the new view.
I could re-create the CR, but that is rather tedious. Perhaps there is something I am not doing?
Crystal Reports generally does better at reporting than at processing a query. For faster and easier debugging, it's often better to create a procedure in your database that joins together the data from the various sources. Once you have the data you want, use Crystal to display it.
In other words, try to avoid doing any more work in Crystal than you have to. Sure, the grouping and headers and pretty formatting will be done there. But all of the querying, joining, and sorting is better done in your database. If the query is slow there, then you can optimize there. If the wrong data is returned, you fix your procedure until it is returning what you want.
An additional benefit is when the report needs to change. If the data needs to come from a different location, you can modify the procedure and never touch Crystal. If the formatting needs to change, you can modify the Crystal and never touch the procedure. You're changing less and thus don't have to test everything.
Is the Crystal report attached to a scratch server?
If you are using SQL Server, then you can modify the SQL that constitutes your view by qualifying the table names like this: databasename..tablename. I'm not certain how to do the equivalent in other DBMSes.
If you modify your view like that, so that it is querying tables from the correct, non-purged database, and you are still not getting data more recent than 8/12, then check whether there are constraints in the WHERE and/or HAVING clauses, or implicit/explicit constraints in the ON sections of the JOINs.