Is it possible to change the way indexes are used in TClientDataSet to sort records? After reading this question, I thought it would be nice to be able to sort string fields logically in a client dataset. But I have no idea how to override default behavior of client dataset when it comes to indexes. Any ideas?
PS: My CDS is not linked to any provider. I'm looking for a way to modify the sort mechanism of the TClientDataSet (or the parent in which the mechanism is implemented) itself.
You cannot override the sort mechanism of a ClientDataSet, unless you rewrite the corresponding part of Midas.
To achieve the correct sorting (whatever "logical" means), you can introduce a new field and set its values so that, sorted with the standard mechanism, they give the required sort order.
Read the excellent on-line article Understanding ClientDataSet Indexes by Cary Jensen.
It explains how to use various ways of sorting and indexing using IndexDefs, IndexFieldNames and IndexName.
Edit: reply to your comment.
You cannot override a sorting method in TClientDataSet, but you can do this:
If you want to do custom sorting on anything other than existing fields, you have to add a Calculated Field, perform some kind of order calculation in the OnCalcFields event, and then add that field to the IndexDefs.
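A minimal sketch of such an order calculation, assuming that "logical" means that runs of digits should compare as numbers. NaturalSortKey and PadWidth are hypothetical names, not part of the VCL; the function pads every digit run with zeros so that the standard string comparison gives the numeric order. As far as I know, a TClientDataSet index needs the field kind to be fkInternalCalc rather than fkCalculated, so treat that as something to check for your Delphi version.

```pascal
program NaturalKeyDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  PadWidth = 10; // assumption: no digit run is longer than 10 digits

// Returns S with every run of digits left-padded with zeros to PadWidth,
// so that a plain string comparison orders 'a_2' before 'a_10'.
function NaturalSortKey(const S: string): string;
var
  I, Start: Integer;
  Digits: string;
begin
  Result := '';
  I := 1;
  while I <= Length(S) do
  begin
    if (S[I] >= '0') and (S[I] <= '9') then
    begin
      // Collect the whole digit run starting at position I
      Start := I;
      while (I <= Length(S)) and (S[I] >= '0') and (S[I] <= '9') do
        Inc(I);
      Digits := Copy(S, Start, I - Start);
      Result := Result + StringOfChar('0', PadWidth - Length(Digits)) + Digits;
    end
    else
    begin
      // Non-digit characters are copied unchanged
      Result := Result + S[I];
      Inc(I);
    end;
  end;
end;

begin
  WriteLn(NaturalSortKey('a_2'));   // prints a_0000000002
  WriteLn(NaturalSortKey('a_10'));  // prints a_0000000010
end.
```

In the OnCalcFields handler you would assign NaturalSortKey of the original field's AsString to the key field, and then name the key field in IndexFieldNames or in an IndexDef.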
I would try to achieve the desired sort with an SQL statement that feeds the ClientDataSet.
For example, if I was dealing with the following strings in FieldN:
a_1
a_20
a_10
a_2
and I wanted them sorted like this (I assume this is similar to what you mean by "logically"):
a_1
a_2
a_10
a_20
then I would write the SQL as
SELECT FieldA,
FieldB,
... ,
FieldN,
CAST(SUBSTRING(FieldN, 3, 2) AS INTEGER) AS FieldM -- pseudocode, adjust for your DBMS
FROM TableA
ORDER BY FieldM
The exact syntax of the SubString and Cast to Integer operations will depend on which DBMS you're using.
I located some records by this code:
ADOQuery1.Locate('field1',ADOQuery2.FieldByName('field2').Value,[])
How to go to the last one of these records?
You have a number of options. The best depends on a whole lot of considerations you haven't mentioned in your question. I'll provide a very brief overview of the options to avoid this becoming "too broad". It'll be up to you to make your choice and figure out the details. If you get stuck, you can ask a new, more specific question.
Using Locate
A solution involving Locate is only feasible if your dataset is sorted by the same field you're searching on.
Clearly your Search Value is not a unique key. So I'm guessing that you're trying to find the last row matching Search Key in data sorted by some other unique field. (Otherwise the concept of last is meaningless.)
So it's highly probable this is not appropriate for you; unless your data is ordered by a composite key of your search field followed by a unique key.
The approach is simple: navigate forwards until you find a row where the search value doesn't match, then backtrack by 1 row.
if not DataSet.Locate(SearchField, SearchValue, []) then
  { handle not found case as desired }
else
begin
  while (not DataSet.Eof) and (DataSet.FieldByName(SearchField).Value = SearchValue) do
    DataSet.Next;
  { Watch out for case that last row in dataset matches search value }
  if (DataSet.FieldByName(SearchField).Value <> SearchValue) then
    DataSet.Prior;
end;
Implement your own search
This is straightforward and will always work. But it is inefficient, having O(n) complexity, so it is not advised for large datasets.
DataSet.Last;
while (not DataSet.Bof) and (DataSet.FieldByName(SearchField).Value <> SearchValue) do
  DataSet.Prior;
NOTE: In order to mirror behaviour of Locate it would be advisable to enhance this method to deal with the case where a match is not found at all. In that case the active record should not be inadvertently changed as a side-effect of the search.
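One way to handle the not-found case is to remember the current position with a bookmark and return to it when nothing matches. The following is only a sketch: LocateLast is a hypothetical helper name, and the unit is called DB in Delphi 2010 but Data.DB in later versions.

```pascal
unit DataSetSearch;

interface

uses
  DB; // Data.DB in newer Delphi versions

// Moves to the last record whose field equals Value and returns True.
// Returns False and restores the original record when nothing matches.
function LocateLast(DataSet: TDataSet; const FieldName: string;
  const Value: Variant): Boolean;

implementation

uses
  Variants;

function LocateLast(DataSet: TDataSet; const FieldName: string;
  const Value: Variant): Boolean;
var
  Saved: TBookmark;
  Field: TField;
begin
  Result := False;
  Field := DataSet.FieldByName(FieldName); // look the field up only once
  DataSet.DisableControls;                 // avoid UI updates while scanning
  try
    Saved := DataSet.GetBookmark;
    try
      DataSet.Last;
      while not DataSet.Bof do
      begin
        if Field.Value = Value then
        begin
          Result := True;
          Break;
        end;
        DataSet.Prior;
      end;
      // Nothing matched: go back to where the caller was
      if not Result then
        DataSet.GotoBookmark(Saved);
    finally
      DataSet.FreeBookmark(Saved);
    end;
  finally
    DataSet.EnableControls;
  end;
end;

end.
```

With the datasets from the question, a call would look like LocateLast(ADOQuery1, 'field1', ADOQuery2.FieldByName('field2').Value).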
Use filtering
Obviously this solution depends on whether filtering the dataset is appropriate to the rest of your code. But it is a fairly simple option and, depending on factors beyond the scope of this answer, it can be more performant than the previous option.
DataSet.Filtered := False;
{ The next line may be a little tricky.
Ensure the filter string is appropriate for the data-types involved. }
DataSet.Filter := '<string of the form SearchField = SearchValue>';
DataSet.Filtered := True;
DataSet.Last;
See documentation on the Filter property.
NOTE: It may be advisable to take precautions against setting the filter redundantly.
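A possible precaution is to compare the new filter string with the one already active before touching the dataset. This is only a sketch; ApplyFilterOnce is a hypothetical helper name, and the unit is DB in Delphi 2010 but Data.DB in later versions.

```pascal
unit FilterHelper;

interface

uses
  DB; // Data.DB in newer Delphi versions

// Applies NewFilter only when it differs from the filter already in effect.
procedure ApplyFilterOnce(DataSet: TDataSet; const NewFilter: string);

implementation

procedure ApplyFilterOnce(DataSet: TDataSet; const NewFilter: string);
begin
  // Same filter is already active: nothing to do
  if DataSet.Filtered and (DataSet.Filter = NewFilter) then
    Exit;
  DataSet.Filtered := False;
  DataSet.Filter := NewFilter;
  DataSet.Filtered := True;
end;

end.
```

For a string field, the filter string can be built as 'field1 = ' + QuotedStr(SearchText), where QuotedStr comes from SysUtils and SearchText is whatever string you are looking for; call DataSet.Last afterwards as before.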
Use a master-detail relationship
This option is included because your question code indicates the SearchValue comes from the active record of another dataset. You're using ADO, so this option is available to you.
DataSet.MasterSource := <Appropriate DataSource>;
DataSet.MasterFields := SearchField;
DataSet.Last;
See documentation on master-detail relationships and on ADO MasterFields.
Offload the work to the RDBMS
Finally, it's worth considering using a stored procedure to get the information you need directly from the database. The advantage is that the server can leverage available indexes, which potentially makes this the most performant option. Again though, a lot depends on the particulars of your application.
A query along the following lines can form the basis of your stored procedure.
select MAX(UniqueField) as RowKey
from Table
where SearchField = SearchValue
Then call your stored procedure, and use its result to find the desired row.
DataSet.Locate(UniqueField, RowKey, []);
NOTE: Don't forget to consider the stored procedure returning NULL if no rows with SearchValue exist.
General Disclaimer
All the above code is extremely brief and for illustrative purposes only. In many cases additional code is required for a robust implementation.
E.g. it might be necessary to call DisableControls before navigating and EnableControls afterwards.
NOTE: It's very important with the above to be aware of the actual ordering of the data in your datasets. Failure to take this into account can lead to incorrect behaviour. Even the last option may exhibit worse than expected performance if your dataset is not sorted by UniqueField.
If your table has an auto-increment identity field, you can do this:
ADOQuery1.SQL.Clear;
ADOQuery1.SQL.Add('select top 1 * from yourtablename where field1=value1 and field2=value2 order by yourAIcolumn desc');
ADOQuery1.Open; // Open rather than ExecSQL, because the statement returns rows
value1 and value2 are your desired values. Pass them as parameters or put them in the command text.
This way you get only the row you want, and there is no need to loop.
I am scanning an SQLite database looking for all matches and using
OneFound:=False;
if tbl1.FieldByName('Name').AsString = 'jones' then
begin
  OneFound:=True;
  tbl1.Next;
end;
if OneFound then // Do something
or should I be using
if not(OneFound) then OneFound:=True;
Is it faster to just assign "True" to OneFound no matter how many times it is assigned or should I do the comparison and only change OneFound the first time?
I know a better way would be to use FTS3, but for now I have to scan the database and the question is more on the approach to setting OneFound as many times as a match is encountered or using the compare-approach and setting it just once.
Thanks
Your question is, which is faster:
if not(OneFound) then OneFound:=True;
or
OneFound := True;
The answer is probably that the second is faster. Conditional statements involve branches, which risk branch misprediction.
However, that line of code is trivial compared to what is around it. Running across a database one row at a time is going to be outrageously expensive. I bet that you will not be able to measure the difference between the two options because the handling of that little Boolean is simply swamped by the rest of the code. In which case choose the more readable and simpler version.
But if you care about the performance of this code you should be asking the database to do the work, as you yourself state. Write a query to perform the work.
It would be better to change your SQL statement so that the work is done in the database. If you want to know whether there is a tuple which contains the value 'jones' in the field 'name', then a quicker query would be
with TQuery.Create(nil) do
try
  SQL.Add('select name from tbl1 where name = :p1 limit 1');
  Params[0].AsString := 'jones'; // Params belongs to the query, not to SQL
  Open;
  OneFound := not IsEmpty;
  Close;
finally
  Free; // release the query even if Open raises
end;
Your syntax may vary regarding the 'limit' clause, but the idea is to return only one tuple from the database which matches the 'where' clause; it doesn't matter which one.
I used a parameter to avoid problems delimiting the value.
1. Search one field
If you want to search one particular field content, using an INDEX and a SELECT will be the fastest.
SELECT * FROM MYTABLE WHERE NAME='Jones';
Do not forget to create an INDEX on the column, first!
2. Fast reading
But if you want to search within a field, or within several fields, you may have to read and check the whole content. In this case, the slow part will be calling FieldByName() for each data row: it is better to use a local TField variable.
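The local TField idea can be sketched as follows. CountMatches is a hypothetical helper name, and the unit is DB in Delphi 2010 but Data.DB in later versions; the point is only that FieldByName is called once, outside the loop.

```pascal
unit ScanDemo;

interface

uses
  DB; // Data.DB in newer Delphi versions

// Counts the rows whose field content equals Wanted (case-sensitive).
function CountMatches(DataSet: TDataSet;
  const FieldName, Wanted: string): Integer;

implementation

function CountMatches(DataSet: TDataSet;
  const FieldName, Wanted: string): Integer;
var
  NameField: TField;
begin
  Result := 0;
  NameField := DataSet.FieldByName(FieldName); // resolved once, not per row
  DataSet.DisableControls;
  try
    DataSet.First;
    while not DataSet.Eof do
    begin
      if NameField.AsString = Wanted then
        Inc(Result);
      DataSet.Next;
    end;
  finally
    DataSet.EnableControls;
  end;
end;

end.
```

For the question's case, OneFound := CountMatches(tbl1, 'Name', 'jones') > 0 would do, although stopping at the first match is cheaper if you only need a yes/no answer.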
Or forget about TDataSet, and switch to direct access to SQLite3. In fact, using DB.pas and TDataSet requires a lot of data marshalling, so is slower than a direct access.
See e.g. DiSQLite3 or our DB classes, which are very fast, but a bit higher level. Or you can use our ORM on top of those classes. Our classes are able to read more than 500,000 rows per second from a SQLite3 database, including JSON marshalling into object fields.
3. FTS3/FTS4
But, as you guessed, the fastest would indeed be to use the FTS3/FTS4 feature of SQLite3.
You can think of FTS3/FTS4 as a "meta-index" or a "full-text index" on a supplied blob of text. It is just like how Google is able to find a word in millions of web pages: it does not use a regular database, but full-text indexing.
In short, you create a virtual FTS3/FTS4 table in your database, then you insert in this table the whole text of your main records in the FTS TEXT field, forcing the ID field to be the one of the original data row.
Then, you will query for some words on your FTS3/FTS4 table, which will give you the matching IDs, much faster than a regular scan.
Note that our ORM has dedicated TSQLRecordFTS3 / TSQLRecordFTS4 kind of classes for direct FTS process.
I have an unsorted dataset (a TMSQuery from Devart) that I cannot sort using ORDER BY because I manipulate the records after opening the query so the order given by "ORDER BY" is lost.
I don't want to rewrite the whole logic so I should find a way to sort a dataset.
I can Assign the dataset to a TMemDataSet (TMemDataSet is a DevArt class) descendant (TVirtualTable from Devart), but after this how do I sort (I need to sort by a date field)?
I read this question but it doesn't really contain the answer I am looking for.
I solved the problem using IndexFieldNames; it was what I was looking for. Directly from the TMSQuery component:
MSQuery1.IndexFieldNames := 'EXECUTION_DATE'; //this does the job
I am using Delphi 2010 and I'm trying to do an insert into multiple tables. I don't know the best way to do this. What I'm wondering is whether there is a way to do one insert using one of Delphi's tools like TQuery or TClientDataSet, or whether it would be better to use code (we use the Pascal language). An array maybe? I haven't been using Delphi that long, but I have inserted and updated info in one table before, not multiple. Also, these tables use pretty much the same field names.
Any help would be greatly appreciated.
Thanks in advance!!
Call a stored procedure to update your tables simultaneously, with a transaction wrapper. Or re-design your database to eliminate duplicate/redundant data, so that you would never need to update several tables at once.
Note that this answer is perfectly valid, given the information provided in the question...
(Note: it's late, couldn't sleep, bored. This is what you get, given the quality of the information in the question!)
Another possible solution could be to make an updatable view in the database, and update the view from Delphi.
Making an updatable view just moves the work of updating 2 tables to the SQL instead of doing it in Delphi.
This moves the business logic to the SQL instead of keeping it in Delphi. It probably also generates less network traffic.
As Chris writes: use transactions when 2 or more updates/inserts are dependent on each other.
Which data access components do you use?
Which restrictions do you have?
Do you want to insert the same values into both tables?
Why not something easy like:
for i := Low(tables) to High(tables) do
begin
  query.SQL.Text := 'insert into ' + tables[i] + ' (fields) values (' + ... + ')';
  query.ExecSQL;
end;
I have a dataset, and I want to apply a filter based on the record count of a dataset-type field, something like: 'NESTED_DATASET_FIELD.RecordCount > 0'
If the dataset comes from an SQL-based storage engine, use a SELECT DISTINCT query on the joined table with only fields from the master table in the result set. Let the SQL engine do the work for you.
Depending on your situation, you can use:
1. In the OnFilterRecord event you can have:
Accept := myDataSetField.NestedDataSet.RecordCount > 0;
2. If you have an SQL backend, you can use EXISTS or COUNT in order to fetch only the records you need. This is perhaps the best approach if you are working over a network. However, I don't know what infrastructure you have.
3. In the OnFilterRecord event you can have:
Accept := not myDataSetField.IsNull;
// Just testing whether the DataSet field is empty, which is one of the fastest ways to do it
...but this depends on the structure of your data / dataset etc.
4. Sometimes it is better to have a dedicated field in your DataSet / table to specify this status, because getting such info from the nested dataset can usually be expensive. (One must fetch it at least partially etc.)
5. Also, for the same considerations (see 4. above), perhaps you can have a stored procedure (if your DB backend permits) to get this info.
HTH