Does generic dictionary class have a method to get keys by index? - delphi

This might be a really easy question, but I think I am mentally blind or something: how can I get a key by its index in the TDictionary class in Delphi (10.1)? I mean, the structure has a property called Count, so it must have some sort of array or list in it; why can't I get the keys by index?
I also tried the KeyCollection property of the Dictionary class, but it doesn't have anything useful either. I need something like:
key: string;
key := dicTest.GetKey(keyIndex);
Thanks a lot.

The Delphi RTL generic dictionary is unordered. As a consequence of being unordered, items in the container do not possess a meaningful index.
The keys can be enumerated using the Keys property:
var
  dict: TDictionary<string, Integer>;
  key: string;
....
for key in dict.Keys do
  Writeln(key);
Likewise, the values can be enumerated using the Values property:
var
  dict: TDictionary<string, Integer>;
  value: Integer;
....
for value in dict.Values do
  Writeln(value);
If you wish to enumerate key/value pairs then the dictionary itself provides an enumerator for that:
var
  dict: TDictionary<string, Integer>;
  item: TPair<string, Integer>;
....
for item in dict do
  Writeln(item.Key, ', ', item.Value);
Note that for each of these enumerators, no guarantees are made about the order of the items. Simple acts like adding a new item to the dictionary can result in a change in the order of the items under enumeration.

To add to David's answer, the whole point of a dictionary or hash-table structure is to very efficiently store and retrieve key-value pairs in memory.
This is achieved as follows:
Items are placed into a predictable location within a large block of memory, based on the key.
When trying to find the item, you know where it should be stored (based on the key), and can immediately go to that location to hopefully find it.
The following diagram illustrates:
          +--------------+
          | ..           |
          | ..           |
   Add    | World (Data) |    Find
  Hello   | Abc (Data)   |   Hello
    |     | ..           |     |
    |     | ..           |     |
    +---> | Hello (Data) | <---+
          | ..           |
          | ..           |
          | Xyz (Data)   |
          +--------------+
Note the following:
There is no ordering as to where items will be inserted.
The basic operations require knowing the key to look them up.
Although Delphi allows the items to be iterated, their positions are not guaranteed to be consistent.
The big advantage of this kind of structure is that it doesn't matter how large the collection grows: adding, looking up, and deleting items all take roughly the same time. This is referred to as having order of complexity O(1).
These structures tend to be inefficient in terms of memory requirements. You may have noted the "gaps" in the allocated memory in the illustration above.
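The bucket-placement idea described above can be sketched in a few lines. The following is an illustrative Python toy using separate chaining, not Delphi's actual TDictionary internals:

```python
# Illustrative toy only -- not Delphi's actual TDictionary implementation.
# Separate chaining: the key's hash picks a bucket, so a lookup jumps
# straight to one small bucket no matter how large the table grows.
class ToyHashTable:
    def __init__(self, bucket_count=8):
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        # "Predictable location based on the key", as in the diagram above.
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite existing key
                return
        bucket.append((key, value))

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)

t = ToyHashTable()
t.put("Hello", "Data")
t.put("World", "Data2")
print(t.get("Hello"))  # -> Data
```

Note there is no meaningful positional index here: which bucket an item lands in depends on its hash, not on insertion order.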
There's no built-in mechanism for knowing the number of items in the collection, but it's trivial to track the number as items are added/removed using a single extra field (which is what Delphi's implementation does). Knowing the count does not imply you can access items by index.
In summary
If you're unable to keep track of your keys, then a dictionary is not the right tool for you; you may be better off with a list or array. Either way, I suggest you make an effort to understand the benefits and limitations of these structures to help decide which is the best tool for the job.

The easiest way to access the keys by index is to retrieve them as an array. The Keys property provides a function for that, ToArray:
var
  keys: TArray<string>;
....
keys := dicTest.Keys.ToArray;
key := keys[keyIndex];
Bear in mind that the order of the resulting array is arbitrary and can change as the dictionary is modified.

Related

Data structures in Rascal

I am looking for a data structure that can mimic an Object or a struct; really, just some compact way to pass around different types of variables. Currently I am using a tuple, but referencing various parts of the tuple is less pleasant than I would like. I've just created aliases that represent the various locations in the tuple:
alias AuxClass = tuple[str,str,list[int],list[int],Dec];
int ACLS = 0;
But I've had to restructure this tuple and thus had to change the indexing. Is there something I can use here that I've missed or perhaps a feature coming in the future?
Thanks!
Please take a look at the algebraic data types feature:
http://tutor.rascal-mpl.org/Rascal/Rascal.html#/Rascal/Declarations/AlgebraicDataType/AlgebraicDataType.html
You can create a constructor to represent the type of data that you are trying to define above, similar to what you would do with a struct, and give each element in the constructor a field name:
data AuxClass = auxClass(str f1, str f2, list[int] f3, list[int] f4, Dec f5)
You can then create new instances of this just using the constructor name and providing the data:
a = auxClass("Hello", "World", [1,2,3], [4,5,6], D1) (where D1 is a Dec).
Once you have an instance, you can access information using the field names:
a.f1 // which equals "Hello"
a.f3 // which equals [1,2,3]
size(a.f3) // which currently equals 3
and you can update information using the field names:
a.f2 = "Rascal"
a.f4 = a.f4 + 7 // f4 is now [4,5,6,7]
Algebraic data types are actually quite flexible, so there is a lot you can do with them beyond this. Feel free to look through the documentation and ask questions here.
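The question is Rascal-specific, but the same move from positional tuple indexing to named fields exists in most languages. For comparison, a Python sketch using dataclasses (illustrative only, not Rascal):

```python
from dataclasses import dataclass

# Named fields instead of tuple positions: reordering or inserting
# fields no longer silently breaks index-based access.
@dataclass
class AuxClass:
    f1: str
    f2: str
    f3: list
    f4: list

a = AuxClass("Hello", "World", [1, 2, 3], [4, 5, 6])
a.f2 = "Rascal"        # update by name
a.f4 = a.f4 + [7]      # f4 is now [4, 5, 6, 7]
print(a.f1, len(a.f3))  # -> Hello 3
```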

Most appropriate data structure for a CSV table?

I'm looking for advice on the most appropriate data structure for holding a CSV (comma-separated values) table in memory.
It should cover both cases: table with and without a header.
If the table contains a header, all fields of all rows are determined by key->value pairs, where the key is a name from a header and value is an appropriate content of a field.
If the table does not contain a header, then rows are simply lists of strings or also key->value pairs with key names generated (like 'COL1', 'COL2', ... 'COLn').
I'm looking for the simplest (least code) and most generic solution at the same time.
I'm thinking about the following subclassing, but doubt if it's the right/effective way of implementation:
TCSV = class(TObjectList<TDictionary<string, string>>)
...
public
  constructor Create(fileName: string; header: Boolean; encoding: string = '';
    delimiter: Char = ';'; quoteChar: Char = '"'); overload;
...
end;
It looks like I have to keep keys for every row of fields. What about TDictionary<string, TStringList>? Would that be a better solution?
What about a TClientDataset? Seems quite easy.
Just a simple guide on how to use TClientDataSet as an in-memory dataset, can be found here.
The structure you are proposing would mean a TDictionary instance for every row of your CSV file, in essence duplicating the column names for every row. That seems like a bit of a waste.
Assuming that with TDictionary<string, TStringList> you would fill each TStringList with the values of a single column: that could work, but it still won't be easy to iterate over all columns per row of data.
As GolezTrol suggests, TClientDataSet comes standard with Delphi, is very powerful, and, being a dataset, is intended for columnar data. Also, although it is a dataset, it does not require a database (connection), and it is used in many applications for exactly the goal you are trying to achieve: an in-memory dataset.
I recommend you try the TJvCsvDataSet, which I wrote and contributed to the JEDI JVCL. It works on CSV files with and without headers. It works with data aware controls including DB Grids.
It parses CSV data, and works entirely like the Client Dataset that others have suggested.
Internally it uses an array of byte records and parses each row, keeping an integer "lookup" so that it knows where each individual column starts in that particular row. That makes swapping one value for another (modifying a field in a row) a very fast operation.
It supports most common field types (although not blob or currency right now) and it parses CSV features including embedded carriage return + linefeeds that are inside a field value, and embedded CSV "escape codes" so that you can put a double quote character inside a string, for instance.
It has a property called FieldDef which can be used to define the types of the columns, or it can simply read the header of the file, and treat each value inside as a string (if you don't tell it otherwise).
It can modify a CSV by adding or removing columns, and do most common things you'd want to do with a CSV table. I have used it and tested it heavily, and it works fine.
Depending on the usage, instead of TDataSet you may also use Synopse's TSynBigTable, which is more performant and has fewer limitations.
For applications that are not "time or size critical", TDataSet is OK.
So you basically want to be able to access elements like:
for RowNum := 0 to csv.Count - 1 do
begin
  Name := csv[RowNum]['Name'];
  // Do something
end;
TObjectList<TDictionary<string, string>> would certainly do the job, but it's not very efficient.
Loading the csv into a Dataset would probably be the least amount of code but would have slightly more overhead.
You might want to consider a combination of a simple TStringList or TList<string> for the header, and break the data into a new class that takes the header list in its constructor. You would get the same result:
TCSVRow = class
private
  FHeaders: TList<string>;
  FFields: TList<string>;
public
  constructor Create(Headers: TList<string>);
  function GetField(index: string): string;
  property Fields[index: string]: string read GetField; default;
end;

TCSV = class
private
  FHeaders: TList<string>;
  FRows: TList<TCSVRow>;
public
  function GetRow(Index: Integer): TCSVRow;
  property Rows[index: Integer]: TCSVRow read GetRow; default;
end;

implementation

function TCSVRow.GetField(index: string): string;
begin
  Result := FFields[FHeaders.IndexOf(index)];
end;

function TCSV.GetRow(Index: Integer): TCSVRow;
begin
  Result := FRows[Index];
end;
This is incomplete and I typed it directly into the browser, so I haven't tested it for correctness, but you get the general idea. This way the header information is stored only once instead of duplicated for each row.
You could save a small bit of memory by making FFields a string array instead of a TList<string> but TList<string> is easier to work with IMHO.
Update
On second thought David has a point. The CSVRow class could be eliminated. You could simply have either TList<TList<string>> or a 2d array. Either way I still think you should keep the headers in a separate list. In which case TCSV would look more like:
TCSV = class
private
  FHeaders: TList<string>;
  FData: TList<TList<string>>;
public
  function GetData(Row: Integer; Column: string): string;
  property Data[Row: Integer; Column: string]: string read GetData; default;
end;

function TCSV.GetData(Row: Integer; Column: string): string;
begin
  Result := FData[Row][FHeaders.IndexOf(Column)];
end;
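The headers-stored-once design above can be sketched compactly in another language. A minimal Python illustration (class and method names are made up for the sketch):

```python
class CSV:
    """Headers stored once; each row is a plain list of strings."""
    def __init__(self, headers):
        self.headers = headers
        self.rows = []

    def add_row(self, values):
        self.rows.append(values)

    def get(self, row, column):
        # One shared header list instead of a dictionary per row.
        return self.rows[row][self.headers.index(column)]

csv = CSV(["Name", "Age"])
csv.add_row(["Alice", "30"])
csv.add_row(["Bob", "25"])
print(csv.get(1, "Name"))  # -> Bob
```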
There are many possible solutions to this.
If you want something really simple and generic as per your request (not necessarily the fanciest solution), why not just...
TMyRec = record
  HeaderNames: array of string;
  StringValues: array of array of string;
end;
Just set the length of the arrays as needed (using SetLength).

F# -> Seq to Map

I'm trying to load all my Categories from the database and then map them to a Map (dictionary?), however when I use the following code:
[<StructuralComparison>]
type Category = {
    mutable Id: string;
    Name: string;
    SavePath: string;
    Tags: ResizeArray<Tag> }

let categories = session.Query<Category>()
                 |> Seq.map (fun z -> (z, 0))
                 |> Map.ofSeq
it simply throws an error that says:
The struct, record or union type 'Category' has the 'StructuralComparison' attribute but the component type 'ResizeArray<Tag>' does not satisfy the 'comparison' constraint
I have absolutely no clue about what to do, so any help is appreciated!
F# is rightly complaining that ResizeArray<_> doesn't support structural comparison: its instances are mutable, so you can't use them as fields of records that serve as keys to Map<_,_>s (which are sorted by key). You have a few options:
Have Tags use a collection type that does support structural comparison (such as an immutable Tag list).
Use [<CustomComparison>] instead of [<StructuralComparison>], which requires implementing IComparable.
Use a mutable dictionary (or another relevant collection type) instead of a Map. Try using the dict function instead of Map.ofSeq, for instance.
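The underlying constraint is not F#-specific: map keys must be comparable (or hashable), and mutable collections usually are not. A rough Python analogue of the same problem and of the immutable-collection fix (illustrative only):

```python
from dataclasses import dataclass

# A frozen record whose collection field is an immutable tuple is
# hashable, so it can serve as a map key; a mutable list would not be.
@dataclass(frozen=True)
class Category:
    name: str
    tags: tuple  # cf. replacing ResizeArray<Tag> with an immutable Tag list

lookup = {Category("books", ("fiction", "scifi")): 0}
print(Category("books", ("fiction", "scifi")) in lookup)  # -> True

# The mutable-collection version is rejected, just as in F#:
try:
    hash(Category("books", ["fiction"]))  # list field -> unhashable
except TypeError:
    print("mutable list rejected as part of a key")
```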
The problem here is that by adding StructuralComparison attribute to the type Category you've asked F# to use structural comparison when comparing instances of Category. In short it will compare every member individually to see if they are equal to determine if two instances of Category are equal.
This puts an implicit constraint on every member of Category to themselves be comparable. The type ResizeArray<Tag> is generating an error because it's not comparable. This is true for most collection types.
In order to get rid of this error you'll need to make the ResizeArray<T> type comparable or choose a different key for the Map. Don has a great blog post that goes into this topic in depth and provides a number of different ways to achieve this. It's very much worth the read
http://blogs.msdn.com/b/dsyme/archive/2009/11/08/equality-and-comparison-constraints-in-f-1-9-7.aspx

how to sort documents using the erlang map reduce in riak

I'm using Riak to store JSON documents right now, and I want to sort them based on some attribute. Let's say there's a key, i.e.
{
  "someAttribute": "whatever",
  "order": 1
}
so I want to sort the documents based on "order".
I am currently retrieving the documents in Riak with the Erlang interface. I can retrieve a document back as a string, but I don't really know what to do after that. I'm thinking the map function just returns the JSON document itself, and in the reduce function I'd check whether the item I'm looking at has a higher "order" than the head of the rest of the list, and if so append it to the beginning, and then return lists:reverse.
Despite my ideas above, I've had zero results after almost an entire day; I'm so confused by the Erlang interface in Riak. Can someone provide insight on how to write this map/reduce function, or just how to parse the JSON document?
As far as I know, you do not have access to the input list in Map; you emit a document from Map as a one-element list.
Inputs (all the docs to handle, as {Bucket, Key}) -> Map (handles a single doc) -> Reduce (the whole list emitted from Map).
Maps are executed per doc on many nodes, whereas Reduce is done once on the so-called coordinator node (the one where the query was called).
Solution:
Define Inputs (as a list or bucket)
Retrieve the Value in Map and emit the whole doc or {Id, Val_to_sort_by}
Sort in Reduce (using regular lists:keysort)
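The map-emits-pairs, reduce-sorts shape described above can be illustrated outside Erlang. This is a plain Python simulation of the data flow, not Riak's actual API:

```python
# Plain-Python simulation of the flow above -- not Riak's API.

# Map phase: runs per document, emits a {Id, Val_to_sort_by}-style pair.
def map_doc(doc):
    return [(doc["order"], doc)]

# Reduce phase: runs once on the coordinator; merge and sort,
# mirroring lists:keysort/2 in Erlang.
def reduce_pairs(pairs):
    return sorted(pairs, key=lambda kv: kv[0])

docs = [{"firstName": "Billie", "order": 2},
        {"firstName": "John", "order": 1}]

emitted = [pair for d in docs for pair in map_doc(d)]
ordered = [doc for _, doc in reduce_pairs(emitted)]
print([d["firstName"] for d in ordered])  # -> ['John', 'Billie']
```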
This is not a map reduce solution but you should check out Riak Search.
So I "solved" the problem using JavaScript; I still can't do it using Erlang.
Here is my query:
{"inputs": "test",
 "query": [
   {"map": {"language": "javascript",
            "source": "function(value, keyData, arg){ var data = Riak.mapValuesJson(value)[0]; var obj = {}; obj[data.order] = data; return [ obj ]; }"}},
   {"reduce": {"language": "javascript",
               "source": "function(values, arg){ return [ values.reduce(function(acc, item){ for(var order in item){ acc[order] = item[order]; } return acc; }) ]; }",
               "keep": true}}
 ]
}
So in the map phase, all I do is create a new object, obj, with the order as the key and the data itself as the value, returned in a single-element array. Visually, obj looks like this:
{"1":{"firstName":"John","order":1}}
In the reduce phase I just merge everything into the accumulator, which is basically the sort if you think about it, because when you're done everything is put in order for you. I stored two JSON documents for testing: one is above, the other is just firstName: Billie, order: 2. Here is my result for the query above:
[{"1":{"firstName":"John","order":1},"2":{"firstName":"Billie","order":2}}]
So it works! But I still need to do this in Erlang; any insights?

How do I enumerate JvMemoryData...Or, how do I create a hash with a single key and multiple values?

I am using JvMemoryData to populate a JvDBUltimGrid. I'm primarily using this JvMemoryData as a data structure, because I am not aware of anything else that meets my needs.
I'm not working with a lot of data, but I do need a way to enumerate the records I am adding to JvMemoryData. Has anyone done this before? Would it be possible to somehow "query" this data using TSQLQuery?
Or, is there a better way to do this? I'm a bit naive when it comes to data structures, so maybe someone can point me in the right direction. What I really need is like a Dictionary/Hash, that allows for 1 key, and many values. Like so:
KEY1: val1;val2;val3;val4;val5;etc...
KEY2: val1;val2;val3;val4;val5;etc...
I considered using THashedStringList from the IniFiles unit, but it suffers from the same problem: it associates only one value with each key.
One way would be to create a TStringList, and have each item's object point to another TList (or TStringList) which would contain all of your values. If the topmost string list is sorted, then retrieval is just a binary search away.
To add items to your topmost list, use something like the following (SList = TStringList):
Id := SList.AddObject(Key1, TStringList.Create);
InnerList := TStringList(SList.Objects[Id]);
// for each child in list
InnerList.Add(value);
When it's time to dispose of the list, make sure you also free each of the inner lists:
for i := 0 to SList.Count - 1 do
begin
  if Assigned(SList.Objects[i]) then
    SList.Objects[i].Free;
  SList.Objects[i] := nil;
end;
FreeAndNil(SList);
I'm not a Delphi programmer but couldn't you just use a list or array as the value for each hash entry? In Java terminology:
Map<String,List>
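That one-key-to-many-values shape is usually called a multimap; for illustration, a minimal Python sketch of the same idea:

```python
from collections import defaultdict

# Each key maps to a growable list of values: KEY1 -> val1, val2, ...
multimap = defaultdict(list)
multimap["KEY1"].append("val1")
multimap["KEY1"].append("val2")
multimap["KEY2"].append("val1")

print(multimap["KEY1"])  # -> ['val1', 'val2']
```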
You already seem to be using Jedi. Jedi contains classes that allow you to map anything to anything.
Take a look at this related question.
I have been using an array of arbitrarily complex user-defined record types as a cache, in conjunction with a TStringList or THashedStringList. I access each record using a key: first I check the string list for a match; if there is no match, I get the record from the database, put it in the array, and put its array index into the string list. Using the records I am working with, this is what my code looks like:
function TEmployeeCache.Read(sCode: string): TEmployeeData;
var
  iRecNo: Integer;
  oEmployee: TEmployee;
begin
  iRecNo := CInt(CodeList.Values[sCode]);
  if iRecNo = 0 then begin
    iRecNo := FNextRec;
    Inc(FNextRec);
    if FNextRec > High(Cache) then
      SetLength(Cache, FNextRec * 2);
    oEmployee := TEmployee.Create;
    oEmployee.Read(sCode);
    Cache[iRecNo] := oEmployee.Data;
    oEmployee.Free;
    KeyList.Add(Format('%s=%s', [CStr(Cache[iRecNo].hKey), IntToStr(iRecNo)]));
    CodeList.Add(Format('%s=%s', [sCode, IntToStr(iRecNo)]));
  end;
  Result := Cache[iRecNo];
end;
I have been getting seemingly instant access this way.
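The pattern in this answer is a read-through cache: look up an index first, and only on a miss fetch from the database and remember the result. A minimal Python sketch of that pattern, where load_from_db is a hypothetical stand-in for the real database read:

```python
# Read-through cache sketch; load_from_db stands in for the real
# database fetch (hypothetical -- not part of the original code).
class EmployeeCache:
    def __init__(self, load_from_db):
        self._load = load_from_db
        self._cache = {}  # code -> record, replacing the array + index list

    def read(self, code):
        if code not in self._cache:          # miss: hit the database once
            self._cache[code] = self._load(code)
        return self._cache[code]             # later reads are dictionary-fast

calls = []
def load_from_db(code):
    calls.append(code)
    return {"code": code, "name": "Employee " + code}

cache = EmployeeCache(load_from_db)
cache.read("E1")
cache.read("E1")
print(len(calls))  # -> 1: the second read came from the cache
```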
Jack