I am looking for an efficient way to detect the number of unique values in an array.
My current approach:
Quicksort array of integers
Then run a loop to compare elements.
In code:
yearHolder := '';
for I := 0 to High(yearArray) do
begin
currYear := yearArray[i];
if (yearHolder <> currYear) then
begin
yearHolder := currYear;
Inc(uniqueYearNumber);
end;
end;
Here is an example with the THashedStringList:
hl := THashedStringList.Create; // in Inifiles
try
hl.Sorted := True;
hl.Duplicates := dupIgnore; // ignores attempts to add duplicates
for i := 0 to High(yearArray) do
hl.Add(yearArray[i]);
uniqueYearCount := hl.Count;
finally
hl.Free;
end;
In general, you can use this algorithm:
Create a hash table that maps year to count of occurrences.
For each number in your array, put a corresponding entry in a hash table.
When done, get the number of entries in the hash.
However, in your case, your variables are named "year". If this is really a year, this is simpler, because years have a very limited range. Say, the range 0-3000 should be enough. So, instead of a hash table, you can use a simple array of counters. Initialize it with 0s. Then when you see the year 2009, increment the element arr[2009]. At the end, count the number of elements with arr[i] >= 1.
A minor deviation from the plan could be more worthwhile: never add duplicates to the array in the first place, or add them directly to the proposed hash array.
Up till D2009, there is only THashedStringList (which needs a bunch of costly number -> string conversions and hashes on strings to operate), but if you have D2009 then the Generics.Collections unit has some interesting data structures.
In Delphi using DeHL we say:
List uniqueWidgets := List.Create( MassiveListOfNonUniqueWidgets.Distinct());
:P
I'd recommend adding the items to a Set and, once completed, reading the size of the resulting Set. Because Sets do not allow duplicates, in Java, DDL, .Net, and many (if not all languages), this is a safe, cheap and reliable method.
A more efficient algorithm would be to dump everything a hash table (not sure if delphi even has this).
Iterate through the list (in this case, yearArray) and use the values as keys in the hash table.
Retreive the number of keys in the hash table.
Related
I am dynamically adding fields to a TDataSet using the following code:
while not ibSQL.Eof do
fieldname := Trim(ibSql.FieldByName('columnnameofchange').AsString);
TDataSet.FieldDefs.Add(fieldname , ftString, 255);
end
Problem is that I might get duplicate names so what is the easiest way to screen for duplicates and not add the duplicates that are already added.
I hope not to traverse through the TDataSet.FieldDefList for each column added as this would be tedious for every single column addition. And there can be many additions.
Please supply another solution if possible. If not then I am stuck using the FieldDefList iteration.
I will also add that screening out duplicates on the SQL query is an option but not a desired option.
Thanks
TFieldDefs has a method IndexOf that returns -1 when a field with the given name does not exist.
If I understand you correctly, the easiest way would probably be to put all of the existing field names in a TStringList. You could then check for the existence before adding a new field, and if you add it you simply add the name to the list:
var
FldList: TStringList;
i: Integer;
begin
FldList := TStringList.Create;
try
for i := 0 to DataSet.FieldCount - 1 do
FldList.Add(DataSet.Fields[i].FieldName);
while not ibSQL.Eof do
begin
fieldname := Trim(ibSql.FieldByName('columnnameofchange').AsString);
if FldList.IndexOf(fieldName) = -1 then
begin
FldList.Add(fieldName);
DataSet.FieldDefs.Add(fieldname , ftString, 255);
end;
ibSQL.Next;
end;
finally
FldList.Free;
end;
end;
I'm posting this anyway as I finished writing it, but clearly screening on the query was my preference for this problem.
I'm having a bit of trouble understanding what you're aiming for so forgive me if I'm not answering your question. Also, it has been years since I used Delphi regularly so this is definitely not a specific answer.
If you're using the TADOQuery (or whatever TDataSet you're using) in the way I expect my workaround was to do something like:
//SQL
SELECT
a.field1,
a.... ,
a.fieldN,
b.field1 as "AlternateName"
FROM
Table a INNER JOIN Table b
WHERE ...
As which point it automatically used AlternateName instead of field1 (thus the collision where you're forced to work by index or rename the columns.
Obviously if you're opening a table for writing this isn't a great solution. In my experience with Delphi most of the hardship could be stripped out with simple SQL tricks so that you did not need to waste time playing with the fields.
Essentially this is just doing what you're doing at the source instead of the destination and it is a heck of a lot easier to update.
What I'd do is keep a TStringList with Sorted := true and Duplicates := dupError set. For each field, do myStringList.Add(UpperCase(FieldName)); inside a try block, and if it throws an exception, you know it's a duplicate.
TStringList is really an incredibly versatile class. It's always a bit surprising all the uses you can find for it...
Delphi does not seem to like multi-field indexes.
How do I physically sort a a table so that I wind up with a table that has the rows in the desired order?
Example:
mytable.dbf
Field Field-Name Field-Type Size
0 Payer Character 35
1 Payee Character 35
2 PayDate Date
3 Amount Currency
I need to produce a table sorted alphabetically by "Payee"+"Payer"
When I tried using an index of "Payee+Payer", I got an error:
"Field Index out of range"
The index field names need to be separated by semicolons, not plus symbols. Try that and it should work.
Ok, let's try to put some order.
First, isn't advisable to physically sort a table. In fact the most RDBMS even don't provide you this feature. Usually, one, in order to not force a full table scan (it is called sometimes natural scan) creates indexes on the table fields on which he thinks that the table will be sorted / searched.
As you see, the first step in order to sort a table is usually index creation. This is a separate step, it is done once, usually at, let's say, "design time". After this, the DB engine will take care to automatically update the indexes.
The index creation is done by you (the developer) using (usually) not Delphi (or any other development tool) but the admin tool of your RDBMS (the same tool which you used when you created your table).
If your 'DB engine' is, in fact, a Delphi memory dataset (TClientDataSet) then you will go to IndexDefs property, open it, add a new index and set the properties there accordingly. The interesting property in our discussion is Fields. Set it to Payee;Payer. Set also the Name to eg. "idxPayee". If you use other TDataSet descendant, consult the docs of your DB engine or ask another question here on SO.com providing the details.
Now, to use the index. (IOW, to sort the table, as you say). In your program (either at design time either at run time) set in your 'Table' the IndexName to "idxPayee" or any other valid name you gave or set IndexFieldNames to Payee;Payer.
Note once again that the above is an example based on TClientDataSet. What you must retain from the above (if you don't use it) is that you must have an already created index in order to use it.
Also, to answer at your question, yes, there are some 'table' types (TDataSet descendants in Delphi terminology) which support sorting, either via a Sort method (or the like) either via a SortFields property.
But nowadays usually when one works with a SQL backend, the preferred solution is to create the indexes using the corresponding admin tool and then issue (using Delphi) an SELECT * FROM myTable ORDER BY Field1.
HTH
If you're still using BDE you can use the BDE API to physically sort the DBF table:
uses
DbiProcs, DbiTypes, DBIErrs;
procedure SortTable(Table: TTable; const FieldNums: array of Word; CaseInsensitive: Boolean = False; Descending: Boolean = False);
var
DBHandle: hDBIDb;
RecordCount: Integer;
Order: SORTOrder;
begin
if Length(FieldNums) = 0 then
Exit;
Table.Open;
RecordCount := Table.RecordCount;
if RecordCount = 0 then
Exit;
DBHandle := Table.DBHandle;
Table.Close;
if Descending then
Order := sortDESCEND
else
Order := sortASCEND;
Check(DbiSortTable(DBHandle, PAnsiChar(Table.TableName), nil, nil, nil, nil, nil,
Length(FieldNums), #FieldNums[0], #CaseInsensitive, #Order, nil, False, nil, RecordCount));
end;
for example, in your case:
SortTable(Table1, [2, 1]); // sort by Payee, Payer
Cannot check, but try IndexFieldNames = "Payee, Payer".
Sure indexes by these 2 fields should exist.
You can create an index on your table using the TTable.AddIndex method in one call. That will sort your data when you read it, that is if you use the new index by setting the TTable.IndexName property to the new index. Here's an example:
xTable.AddIndex('NewIndex','Field1;Field2',[ixCaseInsensitive]);
xTable.IndexName := 'NewIndex';
// Read the table from top to bottom
xTable.First;
while not xTable.EOF do begin
..
xTable.Next;
end;
I am writing some matrix routines in Delphi and this problem came up. I have defined a real matrix thus:-
RealArrayNPbyNP = Array[1..200,1..200] of Extended;
I have populated this array with a 5 x 6 matrix.
How do I query the array to get the number of rows (which in this case will be 5) and the number of cols (which in this case will be 6) in delphi code.
You declared the matrix as 200 x 200. No matter how much of it you use, the matrix is always 200 x 200. All fields outside of your 5 x 6 range contain at least some data, be it useful or not.
Perhaps you should consider using dynamic arrays:
var
arr: array of array of Extended
With this you can use Setlength to fit the array dimensions to your needs. To get a 5x6 matrix you can use this code (thanks to Rob for the hint):
SetLength(arr, 5, 6);
As you can see, the actual dimension can be queried with the Length function. Length(arr) gets the first dimension while Length(arr[I) will give the second dimension.
With this construct each "row" of the matrix can have an independent number of "columns".
If you don't want a dynamic array and have no additional information as to which values constitute valid ones (i.e., if you can't search for them/count them), you'll essentially have to have additional information. In other words, you'd need two more variables, NRows and NColumns which you set when you populate the array.
If you can guarantee that the matrix is initially populated with known values, e.g., 0 or some invalid value, then you can do what amounts to strlen() and simply count the number of elements in the first row and column that are present before the flag value. This tends to be inefficient, however. Why can't you just pass the current size around as you need it? Or better yet, encapsulate your matrix functionality in a proper object.
If I understand your problem correctly, what your trying to do is to have a preallocated array, which you are only filling partially and your wanting to determine how much is filled.
I would create a class which contains all of the logic for dealing with this "array" and write a property setter for the array which contained the following logic:
Procedure SetArrayValue(X,Y:Extended);
begin
fInternalArray[x,y] := Extended;
fInternalArrayMaxX := Max(fInternalArrayMaxX,X);
fInternalArrayMaxY := Max(fInternalArrayMaxY,Y);
end;
and an array initialization/clear function which looked like the following:
Procedure ClearArray;
begin
FillMemory(#fInternalArray,SizeOf(fInternalArray),0);
fInternalArrayMaxX := 0;
fInternalArrayMaxY := 0;
end;
You can also extend the determination of if an array element has a value by adding another array which matches the bounds of boolean and modifying it appropriately.
What would be the best way, in delphi, to create and store data which will often be searched on and modified?
Basically, I would like to write a function that searches an existing database for telephone numbers and keeps track of how many times each telephone number has been used, the first date used, and the latest date used. The database that is being searched is basically a log of orders placed, containing the telephone number that was used to place the order. It's not an SQL database or anything that can easily be queried for such things (it's an old btrieve database), so I need to create a way of gaining this information (to eventually output to a text file).
I am thinking of creating a record containing the phone number, the two dates, and the number of times used, and then adding a record to a dynamic array for each telephone number. I would then search the array, entry by entry, for each record in the database, to see if the phone number for the current record is already in the array. Then updating or creating a record as necessary.
This seems like it would work, but as there are tens of thousands of entries in the database, it may not be the best way, and a rather slow and inefficient way of doing things. Is there a better way, given the limited actions I can perform on the database?
Someone suggested that rather than using an array, use a MySQL table to keep track of the numbers, and then query each number for every database record. This seems like even more overhead though!
Thanks a lot for your time.
I would register the aggregates in a totally disconnected TClientDataset(cds), and updating the values as you get them from the looping. If the Btrieve could be sorted by telephone number, much better. Then use the data on the cds to generate the report.
(If you go this way, I suggest get Midas SpeedFix from the Andreas Hausladen' blog, along with the other finest stuff you can find there).
Ok, here is a double pass old-school method that works well and should scale well (I used this approach against a multi-million record database once, it took time but gave accurate results).
Download and install Turbo Power
SysTools -- the sort engine
works very well for this process.
create a sort, with a fixed record
of phone number, you will be using
this to sort.
Loop thru your records, at each order, add the
phone number to the sort.
Once the first iteration is done, start
popping the phone numbers from the
sort, increment a counter if the
phone number is the same as the last
one read, otherwise report the
number and clear your counter.
This process can also be done with any SQL Database, but my experience has been that the sort method is faster than managing a temporary table and generates the same results.
EDIT -- You stated that this is a BTrieve database, why not just create a key on the phone number, sort on that key, then apply step 4 over this table (next instead of pop). Either way you will need to touch every record in your database to get counts, the index/sort just makes your decision process easier.
For example, lets say that you have two tables, one the customer table is where the results will be stored, and the other the orders table. Sort both by the same phone number. Then start a cursor at the top of both lists and then apply the following psuedocode:
Count := 0;
While (CustomerTable <> eof) and (OrderTable <> eof) do
begin
comp = comparetext( customer.phone, order.phone );
while (comp = 0) and (not orderTable eof) do
begin
inc( Count );
order.next;
comp = comparetext( customer.phone, order.phone );
end;
if comp < 0 then
begin
Customer.TotalCount = count;
save customer;
count := 0;
Customer.next;
end
else if (Comp > 0) and (not OrderTable EOF) then
begin
Order.Next; // order no customer
end;
end;
// handle case where end of orders reached
if (OrdersTable EOF) and (not CustomersTable EOF) then
begin
Customer.TotalCount = count;
save customer;
end;
This code has the benefit of walking both lists once. There are no lookups necessary since both lists are sorted the same, they can be walked top to bottom taking action only when necessary. The only requirement is that both lists have something in common (in this example phone number) and both lists can be sorted.
I did not handle the case where there is an order and no customer. My assumption was that orders do not exist without customers and would be skipped for counting.
Sorry, couldn't edit my post (wasn't registered at the time). The data will be thrown away once all the records in the database have been iterated through. The function won't be called often. It's basically going to be used as a way of determining how often people have ordered over a period of time from records we already have, so really it's just needed to produce a one off list.
The data will be persistent for the duration of the creation of the list. That is, all telephone numbers will need to be present to be searched on until the very last database record is read.
If you were going to keep it in memory and don't want anything fancy, you'd be better off using a TStringList so you can use the Find function. Find uses Hoare's selection or Quick-select, an O(n) locator. For instance, define a type:
type
TPhoneData = class
private
fPhone:string;
fFirstCalledDate:TDateTime;
fLastCalledDate:TDateTime;
fCallCount:integer;
public
constructor Create(phone:string; firstDate, lastDate:TDateTime);
procedure updateCallData(date:TDateTime);
property phoneNumber:string read fPhone write fPhone;
property firstCalledDate:TDateTime read fFirstCalledDate write fFirstCalledDate;
property lastCalledDate:TDateTime read fLastCalledDate write fLastCalledDate;
property callCount:integer read fCallCount write fCallCount;
end;
{ TPhoneData }
constructor TPhoneData.Create(phone: string; firstDate, lastDate: TDateTime);
begin
fCallCount:=1;
fFirstCalledDate:=firstDate;
fLastCalledDate:=lastDate;
fPhone:=phone;
end;
procedure TPhoneData.updateCallData(date: TDateTime);
begin
inc(fCallCount);
if fFirstCalledDate<date then fFirstCalledDate:=date;
if date>fLastCalledDate then fLastCalledDate:=date;
end;
and then fill it, report on it:
procedure TForm1.btnSortExampleClick(Sender: TObject);
const phoneSeed:array[0..9] of string = ('111-111-1111','222-222-2222','333-333-3333','444-444-4444','555-555-5555','666-666-6666','777-777-7777','888-888-8888','999-999-9999','000-000-0000');
var TSL:TStringList;
TPD:TPhoneData;
i,index:integer;
phone:string;
begin
randseed;
TSL:=TStringList.Create;
TSL.Sorted:=true;
for i := 0 to 100 do
begin
phone:=phoneSeed[random(9)];
if TSL.Find(phone, index) then
TPhoneData(TSL.Objects[index]).updateCallData(now-random(100))
else
TSL.AddObject(phone,TPhoneData.Create(phone,now,now));
end;
for i := 0 to 9 do
begin
if TSL.Find(phoneSeed[i], index) then
begin
TPD:=TPhoneData(TSL.Objects[index]);
ShowMessage(Format('Phone # %s, first called %s, last called %s, num calls %d', [TPD.PhoneNumber, FormatDateTime('mm-dd-yyyy',TPD.firstCalledDate), FormatDateTime('mm-dd-yyyy',TPD.lastCalledDate), TPD.callCount]));
end;
end;
end;
Instead of a TStringList I would recommend using DeCAL's (on sf.net) DMap to store the items in memory. You could specify the phone is the key and store a Record/Class structure containing the rest of the record.
So your Record class will be:
TPhoneData = class
number: string;
access_count: integer;
added: TDateTime.
...
end;
Then in code:
procedure TSomeClass.RegisterPhone(number, phoneData);
begin
//FStore created in Constructor as FStore := DMap.Create;
FStore.putPair([number, phoneData])
end;
...
procedure TSoemClass.GetPhoneAndIncrement(number);
var
Iter: DIterator;
lPhoneData: TPhoneData;
begin
Iter := FStore.locate([number]);
if atEnd(Iter) then
raise Exception.CreateFmt('Number %s not found',[number])
else
begin
lPhoneData := GetObject(Iter) as TPhoneData;
lPhoneData.access_count = lPhoneData.access_count + 1;
//no need to save back to FStore as it holds a pointer to lPhoneData
end;
end;
DMap implements a red/black tree so the data structure sorts the keys for you for free. You can also use a DHashMap for the same affect and (arguably) increased speed.
DeCAL is one of my favourite data structure libraries and would recommend anybody doing in-memory storage operations to have a look.
Hope that helps
I am using JvMemoryData to populate a JvDBUltimGrid. I'm primarily using this JvMemoryData as a data structure, because I am not aware of anything else that meets my needs.
I'm not working with a lot of data, but I do need a way to enumerate the records I am adding to JvMemoryData. Has anyone done this before? Would it be possible to somehow "query" this data using TSQLQuery?
Or, is there a better way to do this? I'm a bit naive when it comes to data structures, so maybe someone can point me in the right direction. What I really need is like a Dictionary/Hash, that allows for 1 key, and many values. Like so:
KEY1: val1;val2;val3;val4;val5;etc...
KEY2: val1;val2;val3;val4;val5;etc...
I considered using THashedStringList in the IniFiles unit, but it still suffers from the same problem in that it allows only 1 key to be associated with a value.
One way would be to create a TStringList, and have each item's object point to another TList (or TStringList) which would contain all of your values. If the topmost string list is sorted, then retrieval is just a binary search away.
To add items to your topmost list, use something like the following (SList = TStringList):
Id := SList.AddObject( Key1, tStringList.Create );
InnerList := tStringList(SList.Objects[id]);
// for each child in list
InnerList.add( value );
When its time to dispose the list, make sure you free each of the inner lists also.
for i := 0 to SList.count-1 do
begin
if Assigned(SList.Objects[i]) then
SList.Objects[i].free;
SList.Objects[i] := nil;
end;
FreeAndNil(SList);
I'm not a Delphi programmer but couldn't you just use a list or array as the value for each hash entry? In Java terminology:
Map<String,List>
You already seem to be using Jedi. Jedi contains classes that allow you to map anything with anything.
Take a look at this related question.
I have been using an array of any arbitrarily complex user defined record types as a cache in conjunction with a TStringList or THashedStringList. I access each record using a key. First I check the string list for a match. If no match, then I get the record from the database and put it in the array. I put its array index into the string list. Using the records I am working with, this is what my code looks like:
function TEmployeeCache.Read(sCode: String): TEmployeeData;
var iRecNo: Integer;
oEmployee: TEmployee;
begin
iRecNo := CInt(CodeList.Values[sCode]);
if iRecNo = 0 then begin
iRecNo := FNextRec;
inc(FNextRec);
if FNextRec > High(Cache) then SetLength(Cache, FNextRec * 2);
oEmployee := TEmployee.Create;
oEmployee.Read(sCode);
Cache[iRecNo] := oEmployee.Data;
oEmployee.Free;
KeyList.Add(Format('%s=%s', [CStr(Cache[iRecNo].hKey), IntToStr(iRecNo)]));
CodeList.Add(Format('%s=%s', [sCode, IntToStr(iRecNo)]));
end;
Result := Cache[iRecNo];
end;
I have been getting seemingly instant access this way.
Jack