I've seen some algorithms designed to append an element at the end of a linked list, both here and on other websites, and then I wrote a small procedure that I believe should append a given element at the end of the list, but it doesn't seem to work.
My question is: why doesn't it work?
I defined pointers and nodes as follows:
Pointer = ^Node;
Node = record
  about : element;
  next  : Pointer;
end;
The following procedure receives a linked list L and an element q that should be appended to the end of L.
First I define the node that I'll insert afterwards:
var INS : Pointer;
........
INS^.about := q;
And the procedure goes as follows:
{temp := L}  { I'll use this for my attempt below }
if L <> NIL then
begin
  while L^.next <> NIL do
  begin
    L := L^.next;
  end;
  L^.next := INS;
  INS^.next := NIL;
  {L := temp;}  { I'll explain this in my attempt below }
end
else
begin
  L := INS;
end;
I also have a small procedure that prints all the elements of the linked list
procedure displayElements(L : Pointer);
begin
  while L <> nil do
  begin
    writeln(L^.about);
    L := L^.next;
  end;
end;
The problem: after I run the program it only displays the last two entries of the list.
Conjecture about the problem: I believe it only shows the last two because when I call displayElements the pointer L is already pointing at the element just before NIL (because I used the second algorithm).
Attempt at a solution: I think I need to put L back at the very first node, so that when I use displayElements it gets all the elements in the list. But how could I do that? I tried what I commented above (saving L in temp), but it didn't work.
Any ideas? Thanks.
Here's a very simple program which will do what you want. Maintaining a 'tail' pointer means that you don't have to traverse the list every time that you want to add a value. If this were your code, then you would be missing the 'tail:= tmp' line in Insert: without this, Display prints the first and last entries, but not those in the middle.
type
  node = ^MyRec;
  MyRec = record
    value: integer;
    next: node
  end;

var
  head, tail: node;

procedure Insert (v: integer);
var
  tmp: node;
begin
  new (tmp);
  tmp^.value := v;
  tmp^.next := nil;
  if head = nil
    then head := tmp
    else tail^.next := tmp;
  tail := tmp;
end;

procedure Display;
var
  tmp: node;
begin
  tmp := head;
  while tmp <> nil do
  begin
    writeln (tmp^.value);
    tmp := tmp^.next
  end;
end;

begin
  head := nil;
  Insert (5);
  Insert (10);
  Display;
  Insert (15);
  Display;
  readln
end.
[Edit]
Judging by your comments, some further explanation is required. [Professorial mode on] When I started programming some thirty years ago (OMSI Pascal on a PDP 11/70), linked lists and pointers appeared in every self-respecting program, but since the rise of Delphi in 1995, such complexities have been hidden and most programmers now never see a naked pointer.
Linked lists come in different formats: the simple and the complex. The simple types differ at the points of insertion and deletion: a stack inserts and deletes at the same end, a queue inserts at one end and deletes at the other, a list allows insertion and deletion at any place. More complex types are trees and graphs. Your question is asking about the implementation of a queue - insertion is always at the rear, deletion is at the front.
In order to implement this properly, we need two variables: 'head' points to the head of the queue and 'tail' points to the end of the queue. These variables are normal global variables; memory is allocated for them in the data segment of the program. A pointer is a simple variable whose value is the memory address of another variable. Initially, 'head' does not point to anything so its value is nil (think of this as 0).
Normally in textbooks, the construction of a queue is accompanied by little boxes showing how memory is allocated, but I don't know how to do that here and so the explanation will be a little wordy.
When the first insertion occurs, the memory manager in the run time system allocates 12 bytes from the heap and sets the value of the local variable 'tmp' to be the address of the first of those 12 bytes (this is 'new (tmp)'). Of those 12 bytes, the 'value' part is set to 5 and the 'next' part is set to nil. Then the program checks what the value of 'head' is: if it is nil, then the value (ie address of the memory block allocated above) is copied from 'tmp' to 'head'. If 'head' already points to something, then the value of 'tmp' is copied to 'tail^.next' (which previously would have been nil). Then the value of 'tmp' is copied to the tail variable. Thus 'head' always points to the beginning of the queue and does not change whereas 'tail' points to the end of the queue and changes every time a new node is inserted.
Let's do some debugging:
When the program starts, head = nil, tail is undefined
After 'insert (5)', head = $870874, tail = $870874
After 'insert (10)', head = $870874, tail = $870880
After 'insert (15)', head = $870874, tail = $87088C
When displaying,
tmp:= head ......... tmp = $870874 (ie head)
tmp:= tmp^.next .... tmp = $870880
tmp:= tmp^.next .... tmp = $87088C
tmp:= tmp^.next .... tmp = nil
If your program has more than one queue, then you will need two variables for each queue, and 'Insert' will have to be changed in order to accept two parameters (which will be the head and tail of the given queue).
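For example, a rough sketch of such a two-parameter Insert (a hypothetical InsertInto procedure, reusing the node type from the program above):
procedure InsertInto (var head, tail: node; v: integer);
var
  tmp: node;
begin
  new (tmp);
  tmp^.value := v;
  tmp^.next := nil;
  if head = nil
    then head := tmp         { the queue was empty: tmp becomes the first node }
    else tail^.next := tmp;  { otherwise link it after the current tail }
  tail := tmp;               { the new node is always the new tail }
end;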
There is no need to write 'new (head)' and 'new (tail)' - doing so will cause the original pointers to the beginning and end of the queue to be lost.
[Professorial mode off] I hope that this explanation helps.
Related
I have two TBytes variables:
A: Tbytes;
B: Tbytes;
Now I want to swap them like this:
tmp := A;
A := B;
B := tmp;
But I'm not sure if this is the most efficient way, especially with copy-on-write (if it works the same as with String).
Maybe something like this:
Tmp := Pointer(A);
Pointer(A) := Pointer(B);
Pointer(B) := Tmp;
There is no copy-on-write for dynamic arrays, but if there were, it would not matter, because nothing is written (to the contents of the arrays).
Your way is the most efficient: only references are copied, and a few reference counts are updated.
The way using pointers would be slightly more efficient (no refcounting), but also a bit more risky. You can do this because in the end, the reference counts of both arrays should be the same as they were before. If nothing can access the (local) references during the swap, it should not matter.
Update
And if you do what David recommended, i.e. put this code in a separate procedure, then it doesn't matter a lot whether you use a local Temp variable or an external one. But the swap using Pointer casts is 10x (ten times) as fast as the normal swap using TBytes!
See my comment to the other answer: it doesn't matter whether you use an external or a local Temp variable; they are almost equally fast. I measured the version with a local Temp variable at an average of 6512 milliseconds, the one with the external Temp variable at 6729 milliseconds, and the one using pointers at 589 milliseconds. I ran several tests in different orders to rule out timing errors. There are timing differences when swapping empty (nil) arrays, but I assume those don't matter much.
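For what it's worth, a minimal sketch of that combination (a hypothetical SwapBytesFast procedure; TBytes lives in SysUtils, and the Pointer casts simply exchange the array references without touching the reference counts):
procedure SwapBytesFast(var A, B: TBytes);
var
  Tmp: Pointer;
begin
  Tmp := Pointer(A);        { take A's reference without changing its refcount }
  Pointer(A) := Pointer(B);
  Pointer(B) := Tmp;        { both refcounts end up exactly as they were before }
end;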
As already answered, your code for swapping two TBytes variables is the most efficient.
So my post here isn't an answer to your question; instead I'm trying to warn you about how you can hurt performance by using this code improperly, where the loss is actually caused by the code that calls the swap.
Based on the fact that you are even thinking about the performance of such a small piece of code, I'm guessing you are planning to execute it in one large loop, where the slightest gain might have a big effect on the overall performance of your application. If you only called this code a few times, I bet you wouldn't worry about its performance at all, since it would be negligible compared to the rest of your application.
So if you follow David's suggestion of putting this code into a procedure, I'm guessing you might write something like this:
procedure SwapBytes(var A, B: TBytes);
var
  Temp: TBytes;
begin
  Temp := A;
  A := B;
  B := Temp;
end;
Nothing fancy. But the problem with this is that every time you call this procedure in your loop, your application has to initialize (allocate memory for) that local variable upon entering the procedure and then finalize it (release its memory) on exiting the procedure. Now why is this so bad? Because allocating and deallocating memory is much slower than actually writing to or reading from already allocated memory.
So how can you avoid this problem? By initializing the Temp variable outside your procedure and passing it to the procedure as an additional parameter instead. The performance gain this way can be significant.
Here is my test example where I used both approaches and measured their performance.
//Basic procedure for swapping two TBytes values between each other.
//It has a local variable Temp of TBytes type which is automatically created when
//entering the procedure and released when exiting the procedure.
procedure SwapBytesLocalTempVariable(var A, B: TBytes);
var
  Temp: TBytes;
begin
  Temp := A;
  A := B;
  B := Temp;
end;

//Same as above, but this procedure does not contain any local variable, so you
//need to pass the Temp variable as an additional input parameter.
procedure SwapBytesExternalTempVariable(var A, B, Temp: TBytes);
begin
  Temp := A;
  A := B;
  B := Temp;
end;
//Quick procedure for testing
procedure TForm1.Button1Click(Sender: TObject);
var
  A, B: TBytes;
  I: Integer;
  SW: TStopwatch;
  Temp: TBytes;
begin
//Calling the first procedure, with its local temp variable, in a loop many times can
//be quite slow because your program needs to initialize and release that local
//variable in each loop cycle.
SW := TStopWatch.Create;
SW.Start;
for I := 0 to 100000000 do
begin
SwapBytesLocalTempVariable(A,B);
end;
SW.Stop;
Memo1.Lines.Add(Format('Swap bytes with local variable: %f',[SW.Elapsed.TotalMilliseconds]));
//Calling the second procedure, which has no local temp variable, and passing
//the Temp variable as an additional parameter is much quicker: this way the
//Temp variable isn't initialized and then released in each loop cycle, but
//instead is created (initialized) once outside the loop (in our TButton OnClick
//method) and is therefore reused in each loop cycle.
SW := TStopWatch.Create;
SW.Start;
for I := 0 to 100000000 do
begin
SwapBytesExternalTempVariable(A,B,Temp);
end;
SW.Stop;
Memo1.Lines.Add(Format('Swap bytes with external variable: %f',[SW.Elapsed.TotalMilliseconds]));
end;
As you can see, the performance difference between these two approaches is quite significant. During my testing, calling the first procedure with the local variable took about 1800 milliseconds (almost two seconds), while calling the second procedure, where I pass the Temp variable as an additional parameter, took only about 800 milliseconds. That is a one-second performance gain between the two approaches.
Anyway, the general advice is to reduce the number of memory allocations as much as possible and to reuse variables where possible.
Long time reader, first time poster, I'm turning to you because I so many times found answers to my questions here, that I'm sure this one will be just a formality for this great community :)
My question might seem odd, even newbish, but I'm building an application for parsing text lines with urls.
At the beginning of the code, the first step is to determine how many urls there are in the text block. I do it by using the "Copy" function from the beginning to the end of the text block, looking for the "a href=" tag.
This works fine.
Here is the code :
Tag := '<a href="';
Longueur := Length(ArtistNBSource);
Result := 0;
for i := 1 to Longueur do
begin
  Copied := Copy(ArtistNBSource, i, Length(Tag));
  if Copied = Tag then
    Inc(Result);
end;
ARTIST_COUNT := Result;
Now, depending on the number of urls found, I'm going to loop through the text block.
What I would like to avoid is things like this...
if Result = 1 then
begin
  some instruction
end
else if Result = 2 then
begin
  other instruction
end
else if Result = 3 ...
...because with a maximum of 5 urls possible in the text block, that would give me very long code.
What I imagined was this :
First of all, I declare variables up to the maximum known possible.
var
AUPOS1, AUPOS2, AUPOS3, AUPOS4, AUPOS5, ANPOS1, ANPOS2, ANPOS3, ANPOS4, ANPOS5, ia : Integer;
As the parsing pattern is fixed, I imagined this:
For ia := 1 to ARTIST_COUNT do
begin
(AUPOS+IntToStr(ia)):= Pos('">', ArtistNBSource);
(AURL+IntToStr(ia)) := Copy(ArtistNBSource,11,(AUPOS+IntToStr(ia))-11);
Delete(ArtistNBSource,1,(AUPOS+IntToStr(ia))+1);
(ANPOS+IntTostr(ia)) := Pos('</a>', ArtistNBSource);
(ANAME+IntToStr(ia)) := Copy(ArtistNBSource,1,(ANPOS+IntToStr(ia))-1);
Delete(ArtistNBSource, 1,(ANPOS+IntToStr(ia))+4);
end;
Since the ia variable matches both the loop number AND the variable names for each loop, I thought I could auto-increment the variable names and assign their values to the previously declared variables.
But of course this does not work :)
My question :
Do any of you see a way out of this?
Am I condemned to writing the long 'if then' sequence, or can I dynamically adjust variable names through the loop?
Thank you all in advance for any comment that might give me a clue of what direction to follow.
Cheers
Mathmathou.
I would recommend having just one variable - a dictionary/hash table and then have the 'dynamic variable names' be keys in that dictionary and the values be what you would store in those 'dynamically named' variables.
Here is a tutorial about dictionaries:
http://beensoft.blogspot.se/2008/09/simple-generic-dictionary-tdictionary.html
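To make that concrete, here is a rough sketch of the idea (assuming Delphi 2009 or later, where Generics.Collections is available; the procedure name CollectArtists and the key names 'AURL1', 'ANAME1', ... are just illustrative, and the extraction follows the Pos/Copy/Delete pattern from the question):
uses
  SysUtils, Generics.Collections;

procedure CollectArtists(const ArtistNBSource: string);
var
  S: string;
  Values: TDictionary<string, string>;  { the "dynamic variable names" become keys }
  ia, p: Integer;
begin
  S := ArtistNBSource;
  Values := TDictionary<string, string>.Create;
  try
    ia := 0;
    while Pos('<a href="', S) > 0 do
    begin
      Inc(ia);
      p := Pos('">', S);
      Values.Add('AURL' + IntToStr(ia), Copy(S, 11, p - 11));
      Delete(S, 1, p + 1);
      p := Pos('</a>', S);
      Values.Add('ANAME' + IntToStr(ia), Copy(S, 1, p - 1));
      Delete(S, 1, p + 4);
    end;
    { later, look a value up by its "name": Values['AURL1'], Values['ANAME3'], ... }
  finally
    Values.Free;
  end;
end;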
I wrote this function to remove duplicates from a TList descendant. Now I was wondering if this could give me problems under certain conditions, and how it does performance-wise.
It seems to work with object pointers.
function TListClass.RemoveDups: integer;
var
  total, i, j: integer;
begin
  total := 0;
  i := 0;
  while i < count do begin
    j := i + 1;
    while j < count do begin
      if items[i] = items[j] then begin
        remove(items[j]);
        inc(total);
      end
      else
        inc(j);
    end;
    inc(i);
  end;
  result := total;
end;
Update:
Does this work faster?
function TDrawObjectList.RemoveDups: integer;
var
  total, i: integer;
  templist: TList;
begin
  templist := TList.Create;
  total := 0;
  i := 0;
  while i < count do
    if templist.IndexOf(items[i]) = -1 then begin
      templist.Add(items[i]);
      inc(i);
    end else begin
      remove(items[i]);
      inc(total);
    end;
  result := total;
  templist.Free;
end;
You do need another List.
As noted, the solution is O(N^2), which makes it really slow on a big set of items (thousands), but as long as the count stays low it's the best bet because of its simplicity and ease of implementation, whereas pre-sorted and other solutions need more code and are more prone to implementation errors.
This may be the same code written in a different, more compact form. It runs through all elements of the list and, for each, removes duplicates to the right of the current element. Removal is safe as long as it's done in a reverse loop.
function TListClass.RemoveDups: Integer;
var
  I, K: Integer;
begin
  Result := 0;
  for I := 0 to Count - 1 do              //Compare to everything on the right
    for K := Count - 1 downto I + 1 do    //Reverse loop allows to Remove items safely
      if Items[K] = Items[I] then
      begin
        Remove(Items[K]);
        Inc(Result);
      end;
end;
I would suggest leaving optimizations to a later time, if you really end up with a 5000-item list. Also, as noted above, if you check for duplicates when adding items to the list (see the sketch below), you gain two things:
The check for duplicates gets distributed over time, so it won't be as noticeable to the user
You can hope to quit early if a dupe is found
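A minimal sketch of that check-on-add idea (a hypothetical AddUnique helper on your TList descendant, not part of the original code):
function TListClass.AddUnique(Item: Pointer): Integer;
begin
  Result := IndexOf(Item);   { linear search, but spread over the individual adds }
  if Result = -1 then
    Result := Add(Item);     { not in the list yet: append it }
  { otherwise the item is already present and we quit early with its index }
end;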
Just hypothetical:
Interfaces
If you have interfaced objects in a TInterfaceList that are only in that list, you could check the reference count of each object. Just loop through the list backwards and delete all objects with a refcount > 1.
Custom counter
If you can edit these objects, you could do the same without interfaces. Increment a counter on the object when it is added to the list and decrease it when it is removed.
Of course, this only works if you can actually add a counter to these objects, but the boundaries weren't exactly clear in your question, so I don't know if this is allowed.
The advantage is that you don't need to look for other items at all, neither when inserting nor when removing duplicates. Finding a duplicate in a sorted list could be faster (as mentioned in the comments), but not having to search at all beats even the fastest lookup.
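A rough sketch of what that could look like (a hypothetical TCountedObject class and standalone helpers working on a plain TList from the Classes unit; in practice you would fold these into your own list and object classes):
uses
  Classes;

type
  TCountedObject = class            { hypothetical: tracks how often it sits in the list }
  public
    InListCount: Integer;
  end;

procedure AddCounted(List: TList; Obj: TCountedObject);
begin
  List.Add(Obj);
  Inc(Obj.InListCount);             { one more occurrence in the list }
end;

function RemoveDupsByCounter(List: TList): Integer;
var
  I: Integer;
  Obj: TCountedObject;
begin
  Result := 0;
  for I := List.Count - 1 downto 0 do   { backwards, so Delete doesn't shift unvisited items }
  begin
    Obj := TCountedObject(List[I]);
    if Obj.InListCount > 1 then
    begin
      List.Delete(I);                   { drop this occurrence ... }
      Dec(Obj.InListCount);             { ... and keep the counter in sync }
      Inc(Result);
    end;
  end;
end;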
I have to check if I have duplicate paths in a FileListBox (FileListBox has the role of some kind of job list or play list).
Using Delphi's SameText, CompareStr, or CompareText takes 6 seconds. So I came up with my own compare function, which is (just) a bit faster but not fast enough. Any ideas on how to improve it?
function SameFile(CONST Path1, Path2: string): Boolean;
VAR i: Integer;
begin
  Result := Length(Path1) = Length(Path2);  { if they have different lengths then obviously they are not the same file }
  if Result then
    for i := Length(Path1) downto 1 DO      { start from the end because it is more likely to find the difference there }
      if Path1[i] <> Path2[i] then
      begin
        Result := FALSE;
        Break;
      end;
end;
I use it like this:
for x := JList.Count - 1 downto 1 DO
begin
  sMaster := JList.Items[x];
  for y := x - 1 downto 0 DO
    if SameFile(sMaster, JList.Items[y]) then
    begin
      JList.Items.Delete(x);  { REMOVE DUPLICATES }
      Break;
    end;
end;
Note: The chance of having duplicates is small so Delete is not called often. Also the list cannot be sorted because the items are added by user and sometimes the order may be important.
Update:
The thing is that I lose the advantage of my code because it is Pascal. It would be nice if the comparison loop (Path1[i] <> Path2[i]) were optimized to use Borland's ASM code.
Delphi 7, Win XP 32-bit. Tests were done with 577 items in the list. Deleting the items from the list IS NOT A PROBLEM because it happens rarely.
CONCLUSION
As Svein Bringsli pointed out, my code is slow not because of the comparison algorithm but because of TListBox. The BEST solution was provided by Marcelo Cantos. Thanks a lot, Marcelo.
I accepted Svein's answer because it directly answers my question "how to make my comparison function faster" with "there is no point in making it faster".
For the moment I implemented the quick-and-dirty solution: when I have under 200 files, I use my slow code to check for duplicates. If there are more than 200 files, I use dwrbudr's solution (which is damn fast), on the grounds that if the user has that many files the order is irrelevant anyway (the human brain cannot track so many items).
I want to thank you all for ideas and especially Svein for revealing the truth: (Borland's) visual controls are damn slow!
Don't waste time optimising the assembler. You can go from O(n^2) to O(n log n), bringing the time down to milliseconds, by sorting the list and then doing a linear scan for duplicates.
While you're at it, forget the SameFile function. The algorithmic improvement will dwarf anything you can achieve there.
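A minimal sketch of that approach (a hypothetical DeDupeSorted helper working on a TStringList copy of the paths; note that, as discussed below, this plain version does not preserve the original order):
procedure DeDupeSorted(List: TStringList);
var
  i: Integer;
begin
  List.Sort;                        { O(n log n) }
  for i := List.Count - 1 downto 1 do
    if List[i] = List[i - 1] then   { after sorting, duplicates sit next to each other }
      List.Delete(i);               { one linear pass removes them all }
end;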
Edit: Based on feedback in the comments...
You can perform an order-preserving O(n log(n)) de-duplication as follows:
Sort a copy of the list.
Identify and copy duplicated entries to a third list along with their duplication count minus one.
Walk the original list backwards as per your original version.
In the inner (for y := ...) loop, traverse the duplication list instead. If an outer item matches, delete it, decrement the duplication count, and delete the duplication entry if the count reaches zero.
This is obviously more complicated but it will still be orders of magnitude faster, even if you do horrible dirty things like storing duplication counts as strings, C:\path1\file1=2, and using code like:
y := dupes.IndexOfName(sMaster);
if y <> -1 then
begin
  JList.Items.Delete(x);
  c := StrToInt(dupes.ValueFromIndex[y]);
  if c > 1 then
    dupes.Values[sMaster] := IntToStr(c - 1)
  else
    dupes.Delete(y);
end;
Side note: A binary chop would be more efficient than the for y := ... loop, but given that duplicates are rare, the difference ought to be negligible.
Using your code as a starting point, I modified it to take a copy of the list before searching for duplicates. The time went from 5,5 seconds to about 0,5 seconds.
vSL := TStringList.Create;
try
  vSL.Assign(jList.Items);
  vSL.Sorted := true;
  for x := vSL.Count - 1 downto 1 do
  begin
    sMaster := vSL[x];
    for y := x - 1 downto 0 do
      if SameFile(sMaster, vSL[y]) then
      begin
        vSL.Delete(x);           { REMOVE DUPLICATES }
        jList.Items.Delete(x);
        Break;
      end;
  end;
finally
  vSL.Free;
end;
Obviously, this is not a good way to do it, but it demonstrates that TFileListBox is in itself quite slow. I don't believe you can gain much by optimizing your compare-function.
To demonstrate this, I replaced your SameFile function with the following, but kept the rest of your code:
function SameFile(CONST Path1, Path2: string): Boolean;
VAR i: Integer;
begin
Result := false; //Pretty darn fast code!!!
end;
The time went from 5,6 seconds to 5,5 seconds. I don't think there's much more to gain there :-)
Create another sorted list with sortedList.Duplicates := dupIgnore and add your strings to that list, then back.
vSL := TStringList.Create;
try
  vSL.Sorted := true;
  vSL.Duplicates := dupIgnore;
  for x := 0 to jList.Count - 1 do
    vSL.Add(jList[x]);
  jList.Clear;
  for x := 0 to vSL.Count - 1 do
    jList.Add(vSL[x]);
finally
  vSL.Free;
end;
The absolute fastest way, bar none (as alluded to before), is to use a routine that generates a unique 64/128/256-bit hash code for a string (I use the SHA256Managed class in C#). Run down the list of strings, generate the hash code for each string, and check for it in the sorted hash code list: if it is found, the string is a duplicate; otherwise, add the hash code to the sorted hash code list.
This will work for strings, file names, images (you can get the unique hash code for an image), etc., and I guarantee that this will be as fast as or faster than any other implementation.
PS You can use a string list for the hash codes by representing the hash codes as strings. I've used a hex representation in the past (256 bits -> 64 characters) but in theory you can do it any way you like.
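In Delphi terms, a rough sketch of that idea might look like this (assuming a Delphi version that ships the System.Hash unit, e.g. XE8 or later; the helper name DeDupeByHash and the use of a sorted TStringList for the hex hash codes are my own choices, not from the answer):
uses
  System.Classes, System.SysUtils, System.Hash;

procedure DeDupeByHash(Items: TStrings);
var
  Hashes: TStringList;
  i: Integer;
  H: string;
begin
  Hashes := TStringList.Create;
  try
    Hashes.Sorted := True;                      { sorted list gives O(log n) lookups }
    i := 0;
    while i < Items.Count do
    begin
      H := THashSHA2.GetHashString(Items[i]);   { 256-bit hash as a 64-char hex string }
      if Hashes.IndexOf(H) >= 0 then
        Items.Delete(i)                         { hash already seen: duplicate string }
      else
      begin
        Hashes.Add(H);
        Inc(i);                                 { keep the first occurrence, move on }
      end;
    end;
  finally
    Hashes.Free;
  end;
end;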
4 seconds for how many calls? Great performance if you call it a billion times...
Anyway, does Length(Path1) get evaluated every time through the loop? If so, store that in an Integer variable prior to looping.
Pointers may yield some speed over the strings.
Try in-lining the function with:
function SameFile(blah blah): Boolean; Inline;
That will save some time, if this is being called thousands of times per second. I would start with that and see if it saves anything.
EDIT: I didn't realize that your list wasn't sorted. Obviously, you should do that first! Then you don't have to compare against every other item in the list - just the prior or next one.
I use a modified Ternary Search Tree (TST) to dedupe lists. You simply load the items into the tree, using the whole string as the key, and on each item you get back an indication of whether the key is already there (so you can delete your visible entry). Then you throw away the tree. Our TST load function can typically load 100000 80-byte items in well under a second, and repainting your list should take no longer than that, with proper use of BeginUpdate and EndUpdate. The TST is memory-hungry, but not so much that you would notice it at all if you only have on the order of 500 items. And it is much simpler than sorting, comparisons and assembler (if you have a suitable TST implementation, of course).
No need to use a hash table; a single sorted list gives me a result of 10 milliseconds (0.01 seconds), which is about 500 times faster! Here is my test code using a TListBox:
procedure TForm1.Button1Click(Sender: TObject);
var
lIndex1: Integer;
lString: string;
lIndex2: Integer;
lStrings: TStringList;
lCount: Integer;
lItems: TStrings;
begin
ListBox1.Clear;
for lIndex1 := 1 to 577 do begin
lString := '';
for lIndex2 := 1 to 100 do
if (lIndex2 mod 6) = 0 then
lString := lString + Chr(Ord('a') + Random(2))
else
lString := lString + 'a';
ListBox1.Items.Add(lString);
end;
CsiGlobals.AddLogMsg('Start', 'Test', llBrief);
lStrings := TStringList.Create;
try
lStrings.Sorted := True;
lCount := 0;
lItems := ListBox1.Items;
with lItems do begin
BeginUpdate;
try
for lIndex1 := Count - 1 downto 0 do begin
lStrings.Add(Strings[lIndex1]);
if lStrings.Count = lCount then
Delete(lIndex1)
else
Inc(lCount);
end;
finally
EndUpdate;
end;
end;
finally
lStrings.Free;
end;
CsiGlobals.AddLogMsg('Stop', 'Test', llBrief);
end;
I'd also like to point out that your solution would take an extreme amount of time if applied to a huge list (like containing 100,000,000 items or more). Even constructing a hashtable or sorted list would take too much time.
In cases like that you could try another approach : Hash each member, but instead of populating a full-blown hashtable, create a bitset (large enough to contain a close factor to as many slots as there are input items) and just set each bit at the offset indicated by the hashfunction. If the bit was 0, change it to 1. If it was already 1, take note of the offending string-index in a separate list and continue. This results in a list of string-indexes that had a collision in the hash, so you'll have to run it a second time to find the first cause of those collisions. After that, you should sort & de-dupe the string-indexes in this list (as all indexes apart from the first one will be present twice). Once that's done you should sort the list again, but this time sort it on the string-contents in order to easily spot duplicates in a following single scan.
Granted, it could be a bit extreme to go to all this length, but at least it's a workable solution for very large volumes! (Oh, and this still won't work if the number of duplicates is very high, when the hash function has a bad spread, or when the number of slots in the 'hashtable' bitset is chosen too small - which would give many collisions that aren't really duplicates.)
Whilst I'd love to solve this problem in Python, I'm stuck with Delphi for this one. I have nested lists (actually objects with nested lists as properties, but never mind), and I want to iterate over them in a generator fashion. That is, I want to write a Next function which gives me the next item from the leaves of the tree described by the nested lists.
For example, lets say I have
[[1,2,3],[4,5],[],[6],[7,8]]
I want 8 consecutive calls to Next() to return 1..8.
How can I do this in a language without yield and generators?
Note that the depth of the nesting is fixed (2 in this example, 4 in real life), but answers which solve the more general case where depth is variable are welcome.
EDIT: Sorry, I should have mentioned, this is Delphi 2007.
If you're using any Delphi list that has a built-in enumerator, it can be done easily enough with a bit of recursion. Your base list is a list of numbers, like a TList<integer>. Then you have nested lists implemented as TList<TList<integer>>. (I'm assuming you have Delphi 2009 or 2010. If not, it gets a bit trickier.)
What you need is to make your own specialized list class descended from TList<T> and add a virtual Next() function to it, and a field for an enumerator for your list. The compiler uses enumerators internally when you set up a for..in loop, but you can run them manually. The Next() function creates the enumerator if it's not already assigned and puts it in the field, then calls MoveNext() on it. If this succeeds, call GetCurrent and get your number. Otherwise, you're done. FreeAndNil the enumerator, and signal to the calling function that you've got nothing to return. Probably the simplest way to do this is to put a var boolean parameter in Next() that returns the result of its call to MoveNext, and have the calling function check its value.
For the higher lists, it's a little more complicated, but not much. Descend from the generic class you just set up, and override Next(). This one will get an enumerator that enumerates over the lists that it holds, and return the value of FEnumerator.GetCurrent.Next() until that sub-list is exhausted, then call MoveNext on its enumerator.
This should work for any depth of nested lists. Just don't try to make a list that contains both numbers and lists.
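A rough sketch of that idea (assuming Delphi 2009/2010 generics, as the answer does; the type names TLeafList and TNestedList are mine, I return the success flag as the function result rather than through a var parameter, and I keep the two Next functions separate instead of using virtual/override):
uses
  SysUtils, Generics.Collections;

type
  TLeafList = class(TList<Integer>)
  private
    FEnum: TEnumerator<Integer>;
  public
    function Next(out Value: Integer): Boolean;
  end;

  TNestedList = class(TList<TLeafList>)
  private
    FEnum: TEnumerator<TLeafList>;
  public
    function Next(out Value: Integer): Boolean;
  end;

function TLeafList.Next(out Value: Integer): Boolean;
begin
  if FEnum = nil then
    FEnum := GetEnumerator;          { create the enumerator lazily on first call }
  Result := FEnum.MoveNext;
  if Result then
    Value := FEnum.Current
  else
    FreeAndNil(FEnum);               { exhausted: clean up so the list could be re-walked }
end;

function TNestedList.Next(out Value: Integer): Boolean;
begin
  Result := False;
  if FEnum = nil then
  begin
    FEnum := GetEnumerator;
    if not FEnum.MoveNext then       { empty outer list }
    begin
      FreeAndNil(FEnum);
      Exit;
    end;
  end;
  { pull from the current sub-list; when it runs dry, advance to the next one }
  while not FEnum.Current.Next(Value) do
    if not FEnum.MoveNext then
    begin
      FreeAndNil(FEnum);
      Exit;
    end;
  Result := True;
end;
With the example data from the question, eight consecutive calls to Next on the outer list would then yield 1 through 8.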
Method 1: a "layered list" is a tree; use (or write) a tree/treenode class and do a simple depth-first search of its values.
Method 2: explicitly write the depth-first search for your layered list type:
TEnumerator = RECORD
private
FStack : Stack of TNestedList; //a stack, holding which level of the tree you are exploring
FIndexStack : Stack of Integer; //a twin stack, holding the index of the child you are exploring at each layer
public
Init( AList : TNestedList );
Next( var AVal : Integer ) : boolean;
end;
procedure TEnumerator.Init( AList : TNestedList );
begin
SetLength(FStack, 1);
FStack[0] := AList;
SetLength( FIndexStack, 1 );
FIndexStack[0] := 0;
_GoToNextValue();
end;
procedure TEnumerator._GoToNextValue();
begin
while FStack.notEmpty()
and FIndexStack.last > FStack.last.length do
begin
//we have finished exploring FStack.last, we need to explore its next sibling :
//pop FStack.last :
FIndexStack.pop;
FStack.pop;
//go to next sibling :
FIndexStack.last := FIndexStack.last + 1;
end;
//nothing left to explore :
if FStack.empty then Exit;
//else :
// dig through the layers of list until you reach a value
while FStack.last[ FIndexStack.last ].isAList() do
begin
FStack.push( FStack.last[ FIndexStack.last ] );
FIndexStack.push( 0 );
end;
end;
function TEnumerator.Next( var AVal : Integer ) : boolean;
begin
_GoToNextValue();
Result := FStack.notEmpty();
if Result then
begin
AVal := FStack.last[ FIndexStack.last ];
FIndexStack.last := FIndexStack.last + 1;
end;
end;
You basically do explicitly what a "yield" would do (i.e. save a "frozen copy" of the call stack and return the current value).
Usage :
LEnum : TEnumerator;
LEnum.Init( myList );
while LEnum.Next( LVal ) do
begin
<do something>
end;
To solve this problem, I ended up making a flat list of indexes and remembering my position in that (in pythonesque, because it's shorter):
class Nested:
    nestedList = [[1,2,3],[4,5],[],[6],[7,8]]
    positions = []
    curr = 0
    def Setup:
        for i in range(len(nestedList)):
            for j in range(len(nestedList[i])):
                positions.append((i, j))
    def Next:
        p = positions[curr]
        curr += 1
        return nestedList[p[0]][p[1]]
Obviously my list doesn't change during iteration or this probably wouldn't work...
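For completeness, here is a rough Delphi 2007-compatible sketch of the same flattening idea (the type names TIntArray, TNestedData and TPosition and the global variables are just illustrative):
type
  TIntArray   = array of Integer;
  TNestedData = array of TIntArray;
  TPosition   = record
    Outer, Inner: Integer;
  end;

var
  Data: TNestedData;
  Positions: array of TPosition;
  Curr: Integer;

procedure Setup;
var
  a, b, n: Integer;
begin
  n := 0;
  for a := 0 to High(Data) do
    for b := 0 to High(Data[a]) do
    begin
      SetLength(Positions, n + 1);   { record the (outer, inner) index of every leaf }
      Positions[n].Outer := a;
      Positions[n].Inner := b;
      Inc(n);
    end;
  Curr := 0;
end;

function Next(out Value: Integer): Boolean;
begin
  Result := Curr <= High(Positions);
  if Result then
  begin
    Value := Data[Positions[Curr].Outer][Positions[Curr].Inner];
    Inc(Curr);                       { remember our position for the next call }
  end;
end;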