Delphi's generic TQueue class has a property called Capacity. If the number of items in the TQueue exceeds its capacity, additional items are still added to the queue. The documentation says the property "gets or sets the queue capacity, that is, the maximum size of the queue without resizing." It sounds like a queue is kind of like a fixed length array (memory-wise)--until it's full, at which point it becomes more like a dynamic array? Is that accurate?
When would a programmer want or need to get or set a TQueue's capacity?
Theory
Consider the following example, which generates a dynamic array of random integers:
program DynArrAlloc;
{$APPTYPE CONSOLE}
{$R *.res}
uses
Windows, System.SysUtils;
const
N = 100000000;
var
a: TArray<Integer>;
i: Integer;
tc1, tc2: Cardinal;
begin
tc1 := GetTickCount;
SetLength(a, 0);
for i := 1 to N do
begin
SetLength(a, Succ(Length(a)));
a[High(a)] := Random(1000);
end;
tc2 := GetTickCount;
Writeln(tc2 - tc1);
Readln;
end.
On my system, it takes 4.5 seconds to run it.
Notice that I -- in each iteration -- reallocate the array so it can hold one more item.
It would be better if I allocated a large enough array from the beginning:
program DynArrAlloc;
{$APPTYPE CONSOLE}
{$R *.res}
uses
Windows, System.SysUtils;
const
N = 100000000;
var
a: TArray<Integer>;
i: Integer;
tc1, tc2: Cardinal;
begin
tc1 := GetTickCount;
SetLength(a, N);
for i := 1 to N do
a[i - 1] := Random(1000);
tc2 := GetTickCount;
Writeln(tc2 - tc1);
Readln;
end.
This time, the program only takes 0.6 seconds.
Hence, one should always try not to reallocate unnecessarily. Each time I reallocate in the first example, I need to ask for more memory; then I need to copy the array to the new location, and finally free the old memory. Clearly, this is very inefficient.
Unfortunately, it isn't always possible to allocate a large enough array at the start. You simply might not know the final element count.
A common strategy then is to allocate in steps: when the array is full and you need one more slot, allocate several more slots but keep track of the actual number of used slots:
program DynArrAlloc;
{$APPTYPE CONSOLE}
{$R *.res}
uses
Windows, System.SysUtils;
const
N = 100000000;
var
a: TArray<Integer>;
i: Integer;
tc1, tc2: Cardinal;
ActualLength: Integer;
const
AllocStep = 1024;
begin
tc1 := GetTickCount;
SetLength(a, AllocStep);
ActualLength := 0;
for i := 1 to N do
begin
if ActualLength = Length(a) then
SetLength(a, Length(a) + AllocStep);
a[ActualLength] := Random(1000);
Inc(ActualLength);
end;
// Trim the excess:
SetLength(a, ActualLength);
tc2 := GetTickCount;
Writeln(tc2 - tc1);
Readln;
end.
Now we need 1.3 seconds.
In this example, I allocate in fixed-sized blocks. A more common strategy is probably to double the array at each reallocation (or multiply by 1.5 or something) or combine these options in a smart way.
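The doubling strategy mentioned above can be sketched as follows. This is only an illustration under assumptions: the initial capacity of 16 and the smaller N are arbitrary values chosen for the example, and the Reallocations counter is a hypothetical helper that is not part of the programs above.

```pascal
program DynArrGrowth;

{$APPTYPE CONSOLE}

const
  N = 1000000; // hypothetical element count, smaller than in the examples above

var
  a: TArray<Integer>;
  i, ActualLength, Reallocations: Integer;
begin
  ActualLength := 0;
  Reallocations := 0;
  SetLength(a, 16); // hypothetical initial capacity
  for i := 1 to N do
  begin
    if ActualLength = Length(a) then
    begin
      // Double the capacity instead of adding a fixed-size block
      SetLength(a, 2 * Length(a));
      Inc(Reallocations);
    end;
    a[ActualLength] := Random(1000);
    Inc(ActualLength);
  end;
  // Trim the excess
  SetLength(a, ActualLength);
  Writeln('Elements: ', Length(a));
  Writeln('Reallocations: ', Reallocations);
  Readln;
end.
```

With doubling, the number of reallocations grows only logarithmically with N, at the cost of up to half of the allocated slots being unused before the final trim.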
Applying the theory
Under the hood, TList<T>, TQueue<T>, TStack<T>, TStringList etc. need to dynamically allocate space for an unlimited number of items. To make this performant, these classes allocate more than necessary. The Capacity is the number of elements that fit in the currently allocated memory, while the Count (always <= Capacity) is the actual number of elements in the container.
You can set the Capacity property to reduce the need for intermediate allocation when you fill a container and you do know the final number of elements from the beginning:
var
L: TList<Integer>;
begin
L := TList<Integer>.Create;
try
while not Something.EOF do
L.Add(Something.GetNextValue);
finally
L.Free;
end;
is OK and requires probably only a few reallocations, but
L := TList<Integer>.Create;
try
L.Capacity := Something.Count;
while not Something.EOF do
L.Add(Something.GetNextValue);
finally
L.Free;
end;
will be faster since there will be no intermediate reallocations.
Internally, TQueue contains a dynamic array that stores the elements.
When the item count reaches the current capacity, the array is reallocated (for example, its size is doubled), so you can keep adding more elements.
If you know a reliable upper limit for the item count, it is worth setting Capacity: you avoid the memory reallocations and save some time.
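Applied to TQueue<T> itself, the idea might look like the sketch below. It assumes a Delphi version in which TQueue<T>.Capacity is writable, as the documentation quoted in the question describes; ExpectedCount is a hypothetical value chosen for illustration.

```pascal
program QueueCapacityDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Generics.Collections;

const
  ExpectedCount = 1000; // hypothetical: item count known in advance

var
  Q: TQueue<Integer>;
  i: Integer;
begin
  Q := TQueue<Integer>.Create;
  try
    Q.Capacity := ExpectedCount; // reserve room once, up front
    for i := 1 to ExpectedCount do
      Q.Enqueue(i);
    Writeln('Count: ', Q.Count, ', Capacity: ', Q.Capacity);

    // Going past the capacity is allowed; the queue simply grows.
    Q.Enqueue(ExpectedCount + 1);
    Writeln('Count: ', Q.Count, ', Capacity: ', Q.Capacity);
  finally
    Q.Free;
  end;
  Readln;
end.
```

The second Writeln should show a Capacity larger than ExpectedCount; by how much depends on the growth policy of the RTL version in use.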
Related
With this program, I am trying to read a file and randomly print it to console. I am wondering if I have to use arrays for that. For example, I could assign my strings into an array, and randomly print from my array. But I'm not sure how to approach that. Also another problem is that, my current program does not read the first line from my file. I have a text file text.txt that contains
1. ABC
2. ABC
...
6. ABC
And below is my code.
type
arr = record
end;
var
x: text;
s: string;
SpacePos: word;
myArray: array of arr;
i: byte;
begin
Assign(x, 'text.txt');
reset(x);
readln(x, s);
SetLength(myArray, 0);
while not eof(x) do
begin
SetLength(myArray, Length(myArray) + 1);
readln(x, s);
WriteLn(s);
end;
end.
Please let me know how I could approach this problem!
There are a few issues with your program.
Your first Readln reads the first line of the file into s, but you don't use this value at all. It is lost. The first time you do a Readln in the loop, you get the second line of the file (which you do print to the console using Writeln).
Your arr record type is completely meaningless in this case (and in most cases), since it is a record without any members. It cannot store any data, because it has no members.
In your loop, you expand the length of the array, one item at a time. But you don't set the new item's value to anything, so you do this in vain. (And, because of the previous point, there isn't any value to set in any case: the elements of the array are empty records that cannot contain any data.)
Increasing the length of a dynamic array one item at a time is very bad practice, because it might cause a new heap allocation each time. The entire existing array might need to be copied to a new location in your computer's memory, every time.
The contents of the loop seem to be trying to do two things: saving the current line in the array, and printing it to the console. I assume the latter is only for debugging?
Old-style Pascal I/O (text, Assign, Reset) is obsolete. It is not thread-safe, possibly slow, handles Unicode badly, etc. It was used in the 90s, but shouldn't be used today. Instead, use the facilities provided by your RTL. (In Delphi, for instance, you can use TStringList, IOUtils.TFile.ReadAllLines, streams, etc.)
A partly fixed version of the code might look like this (still using old-school Pascal I/O and the inefficient array handling):
program Project1;
{$APPTYPE CONSOLE}
{$R *.res}
uses
System.SysUtils;
var
x: text;
arr: array of string;
begin
// Load file to string array (old and inefficient way)
AssignFile(x, 'D:\test.txt');
Reset(x);
try
while not Eof(x) do
begin
SetLength(arr, Length(arr) + 1);
Readln(x, arr[High(Arr)]);
end;
finally
CloseFile(x);
end;
Randomize;
// Print strings randomly
while True do
begin
Writeln(Arr[Random(Length(Arr))]);
Readln;
end;
end.
If you want to fix the inefficient array issue, but still not use modern classes, allocate in chunks:
program Project1;
{$APPTYPE CONSOLE}
{$R *.res}
uses
System.SysUtils;
var
x: text;
s: string;
arr: array of string;
ActualLength: Integer;
procedure AddLineToArr(const ALine: string);
begin
if Length(arr) = ActualLength then
SetLength(arr, Round(1.5 * Length(arr)) + 1);
arr[ActualLength] := ALine;
Inc(ActualLength);
end;
begin
SetLength(arr, 1024);
ActualLength := 0; // not necessary, since a global variable is always initialized
// Load file to string array (old-style I/O, but the array now grows in chunks)
AssignFile(x, 'D:\test.txt');
Reset(x);
try
while not Eof(x) do
begin
Readln(x, s);
AddLineToArr(s);
end;
finally
CloseFile(x);
end;
SetLength(arr, ActualLength);
Randomize;
// Print strings randomly
while True do
begin
Writeln(Arr[Random(Length(Arr))]);
Readln;
end;
end.
But if you have access to modern classes, things get much easier. The following examples use the modern Delphi RTL:
The generic TList<T> handles efficient expansion automatically:
program Project1;
{$APPTYPE CONSOLE}
{$R *.res}
uses
System.SysUtils, Generics.Defaults, Generics.Collections;
var
x: text;
s: string;
list: TList<string>;
begin
list := TList<string>.Create;
try
// Load file into the list (still using old-style I/O)
AssignFile(x, 'D:\test.txt');
Reset(x);
try
while not Eof(x) do
begin
Readln(x, s);
list.Add(s);
end;
finally
CloseFile(x);
end;
Randomize;
// Print strings randomly
while True do
begin
Writeln(list[Random(list.Count)]);
Readln;
end;
finally
list.Free;
end;
end.
But you could simply use a TStringList:
program Project1;
{$APPTYPE CONSOLE}
{$R *.res}
uses
System.SysUtils, Classes;
var
list: TStringList;
begin
list := TStringList.Create;
try
list.LoadFromFile('D:\test.txt');
Randomize;
// Print strings randomly
while True do
begin
Writeln(list[Random(list.Count)]);
Readln;
end;
finally
list.Free;
end;
end.
Or you could keep the array approach and use IOUtils.TFile.ReadAllLines:
program Project1;
{$APPTYPE CONSOLE}
{$R *.res}
uses
System.SysUtils, IOUtils;
var
arr: TArray<string>;
begin
arr := TFile.ReadAllLines('D:\test.txt');
Randomize;
// Print strings randomly
while True do
begin
Writeln(arr[Random(Length(arr))]);
Readln;
end;
end.
As you can see, the modern approaches are much more convenient (less code). They are also faster and give you Unicode support.
Note: All snippets above assume that the file contains at least a single line. They will fail if this is not the case, and in real/production code, you must verify this, e.g. like
if Length(arr) = 0 then
raise Exception.Create('Array is empty.');
or
if List.Count = 0 then
raise Exception.Create('List is empty.');
before the // Print strings randomly part, which assumes that the array/list isn't empty.
Also another problem is that, my current program does not read the first line from my file.
Yes it does. But you don't write it to the console. See the third line, readln(x, s);
I am trying to read a file and randomly print it to console. I am wondering if I have to use arrays for that.
Yes that is a sound approach.
Instead of using an array of a record, just declare:
myArray : array of string;
To get a random value from the array, use Randomize to initialize the random generator, and Random() to get a random index.
var
x: text;
myArray: array of String;
ix: Integer;
begin
Randomize; // Initiate the random generator
Assign(x, 'text.txt');
reset(x);
ix := 0;
SetLength(myArray, 0);
while not eof(x) do
begin
SetLength(myArray, Length(myArray) + 1);
readln(x, myArray[ix]);
WriteLn(myArray[ix]);
ix := ix + 1;
end;
WriteLn('Random line:');
WriteLn(myArray[Random(ix)]); // Random(ix) returns a random number 0..ix-1
end.
I edited this answer after reading your question more carefully.
As for the missing first line: you read a line before entering your read loop, and that line is never used. The program below is yours, from begin to end, without that extra readln().
The routine itself is simple, and I can think of more than one way to do it. With the first method, you read each line into an array, then traverse the array and generate a random number, 0 or 1, for each element. If it is 1, print that line.
With the second method, you generate a random number, 0 or 1, each time you read a line from the file. If it is 1, print that line.
Remember to call Randomize before you call Random(), so that you do not get the same random sequence as in the previous program run. Another thing to consider: with a large text file, each SetLength call can be costly. If you go the array route, it is better to keep track of the number of lines stored and grow the array in chunks of 20 to 30 elements, or even 100.
The code below is for the array method. The method that does not use an array is very simple once you have seen the routine below.
var
x: text;
SpacePos: word;
myArray: array of string;
i: integer;
begin
Randomize;
Assign(x, 'text.txt');
reset(x);
SetLength(myArray, 0);
i := 0;
while not eof(x) do
begin
SetLength(myArray, Length(myArray) + 1);
readln(x, myArray[i]);
i := i + 1;
end;
for i:= 0 to Length(myArray) - 1 do
begin
if random(2) = 1 then
WriteLn(myArray[i]);
end;
end.
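For completeness, the second method (no array) could be sketched like this. It is an illustration rather than part of the original answer; the file name is the one from the question, and the try..finally block with CloseFile is an addition.

```pascal
program RandomLinesNoArray;

{$APPTYPE CONSOLE}

var
  x: Text;
  s: string;
begin
  Randomize;
  AssignFile(x, 'text.txt'); // file name taken from the question
  Reset(x);
  try
    while not Eof(x) do
    begin
      Readln(x, s);
      // Decide per line: roughly half of the lines get printed
      if Random(2) = 1 then
        Writeln(s);
    end;
  finally
    CloseFile(x);
  end;
end.
```

Because each line is handled as soon as it is read, nothing needs to be stored, but the lines can only be printed in file order.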
I'm working on a program for school (a cinema app), but I have a problem with my array. My app closes, but nothing is shown.
program TFE;
{$APPTYPE CONSOLE}
uses
SysUtils,
StrUtils,
Crt;
var
MovieList, MovieInfo: Text;
Choice: Byte;
i: Integer;
L: String;
S: array of String[14];
begin
i := 0;
Assign(MovieInfo, 'MovieInfo.txt');
Reset(MovieInfo);
Readln(Choice);
i := 0;
ClrScr;
While not eof (MovieInfo) do
begin
Readln(MovieInfo, L);
S[i] := L;
i := i + 1;
end;
Writeln(S[Choice]);
Readln;
end.
That's all my code for the moment.
Can somebody help me?
In the title you speak about a variable MyVar, but the code doesn't show any such variable. For future reference, please carefully proofread your question before posting.
You have declared a dynamic array:
S: array of String[14];
that is, an array of 14-character strings (short strings). But you have never set the length of this array, and so it cannot hold any strings at all.
Use procedure SetLength(var S: <string or dynamic array>; NewLength: Integer); to allocate space for items in the array.
As you don't know (I presume) how many movies there might be in the file, you must first allocate some amount, and then be prepared to expand the array (with a new call to SetLength()) if the array fills up before all movies are read from the file. For example, initialize (before the while loop) with space for 10 movies:
SetLength(S, 10);
and then in the while loop, e.g. just before ReadLn(),
if i > (Length(S)-1) then
SetLength(S, Length(S)+10);
Another comment is that the user is not shown any prompt when asked for a choice, but maybe this is still under development ;-)
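Putting these pieces together, a complete version might look like the sketch below. It is not the asker's program: it uses plain string instead of String[14], adds a prompt and a range check on the choice, and Count is a hypothetical name for the number of lines actually read.

```pascal
program MovieLines;

{$APPTYPE CONSOLE}

var
  MovieInfo: Text;
  S: array of string;
  Count, Choice: Integer;
begin
  SetLength(S, 10); // initial room for 10 movies
  Count := 0;
  AssignFile(MovieInfo, 'MovieInfo.txt');
  Reset(MovieInfo);
  try
    while not Eof(MovieInfo) do
    begin
      // Expand the array when it is full
      if Count > High(S) then
        SetLength(S, Length(S) + 10);
      Readln(MovieInfo, S[Count]);
      Inc(Count);
    end;
  finally
    CloseFile(MovieInfo);
  end;
  SetLength(S, Count); // trim the unused slots

  Write('Movie number (0..', Count - 1, '): ');
  Readln(Choice);
  // Only index the array when the choice is in range
  if (Choice >= 0) and (Choice < Count) then
    Writeln(S[Choice])
  else
    Writeln('No such movie.');
  Readln;
end.
```

Reading the file before asking for the choice makes it possible to show the valid range in the prompt.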
The message you got is correct, because you only work with the array inside the while not eof loop. At that point, the compiler cannot know the content of the file. It may be blank, as far as the compiler is concerned. If the file is indeed blank, the program skips the while not eof part entirely and goes straight to the part that writes the array element. Because the array was never used, it doesn't have a defined value, hence this message.
The solution is simple: give the array a length and initialize its elements, here with empty strings:
program TFE;
{$APPTYPE CONSOLE}
uses
SysUtils,
StrUtils,
Crt;
var
MovieList, MovieInfo: Text;
Choice: Byte;
i: Integer;
L: String;
S: array of String[14];
begin
SetLength(s,10); //10 is an example
for i:=0 to High(s) do // High(s) = Length(s)-1 is the last valid index
s[i]:='';
Assign(MovieInfo, 'MovieInfo.txt');
Reset(MovieInfo);
Readln(Choice);
i := 0;
ClrScr;
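While not eof (MovieInfo) do
begin
if i > High(S) then
SetLength(S, Length(S) + 10); // grow when the file has more lines than expected
Readln(MovieInfo, L);
S[i] := L;
i := i + 1;
end;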
Writeln(S[Choice]);
Readln;
end.
Digging deep into your code, you define MovieList and MovieInfo, but you only use MovieInfo. Why?
I use Delphi 10.1 Berlin in Windows 10.
I have two records of different sizes. I wrote code to loop through two TList<T> of these records to test elapsed times. Looping through the list of the larger record runs much slower.
Can anyone explain the reason, and provide a solution to make the loop run faster?
type
tTestRecord1 = record
Field1: array[0..4] of Integer;
Field2: array[0..4] of Extended;
Field3: string;
end;
tTestRecord2 = record
Field1: array[0..4999] of Integer;
Field2: array[0..4999] of Extended;
Field3: string;
end;
procedure TForm1.Button1Click(Sender: TObject);
var
_List: TList<tTestRecord1>;
_Record: tTestRecord1;
_Time: TTime;
i: Integer;
begin
_List := TList<tTestRecord1>.Create;
for i := 0 to 4999 do
begin
_List.Add(_Record);
end;
_Time := Time;
for i := 0 to 4999 do
begin
if _List[i].Field3 = 'abcde' then
begin
Break;
end;
end;
Button1.Caption := FormatDateTime('s.zzz', Time - _Time); // 0.000
_List.Free;
end;
procedure TForm1.Button2Click(Sender: TObject);
var
_List: TList<tTestRecord2>;
_Record: tTestRecord2;
_Time: TTime;
i: Integer;
begin
_List := TList<tTestRecord2>.Create;
for i := 0 to 4999 do
begin
_List.Add(_Record);
end;
_Time := Time;
for i := 0 to 4999 do
begin
if _List[i].Field3 = 'abcde' then
begin
Break;
end;
end;
Button2.Caption := FormatDateTime('s.zzz', Time - _Time); // 0.045
_List.Free;
end;
First of all, I want to consider the entire code, even the code that populates the list which I do realise you have not timed. Because the second record is larger in size more memory needs to be copied when you make an assignment of that record type. Further when you read from the list the larger record is less cache friendly than the smaller record which impacts performance. This latter effect is likely less significant than the former.
Related to this is that as you add items the list's internal array of records has to be resized. Sometimes the resizing leads to a reallocation that cannot be performed in-place. When that happens a new block of memory is allocated and the previous content is copied to this new block. That copy is clearly more expensive for the larger record. You can mitigate this by allocating the array once up front if you know its length. The list Capacity is the mechanism to use. Of course, you will not always know the length ahead of time.
Your program does very little beyond memory allocation and memory access. Hence the performance of these memory operations dominates.
Now, your timing is only of the code that reads from the lists. So the memory copy performance difference on population is not part of the benchmarking that you performed. Your timing differences are mainly down to excessive memory copy when reading, as I will explain below.
Consider this code:
if _List[i].Field3 = 'abcde' then
Because _List[i] is a record, a value type, the entire record is copied to an implicit hidden local variable. The code is actually equivalent to:
var
tmp: tTestRecord2;
...
tmp := _List[i]; // copy of entire record
if tmp.Field3 = 'abcde' then
There are a few ways to avoid this copy:
1. Change the underlying type to be a reference type. This changes the memory management requirements. And you may have good reason to want to use a value type.
2. Use a container class that can return the address of an item rather than a copy of an item.
3. Switch from TList<T> to dynamic array TArray<T>. That simple change will allow the compiler to access individual fields directly without copying entire records.
4. Use the TList<T>.List to obtain access to the list object's underlying array holding the data. That would have the same effect as the previous item.
Item 4 above is the simplest change you could make to see a large difference. You would replace
if _List[i].Field3 = 'abcde' then
with
if _List.List[i].Field3 = 'abcde' then
and that should yield a very significant change in performance.
Consider this program:
{$APPTYPE CONSOLE}
uses
System.Diagnostics,
System.Generics.Collections;
type
tTestRecord2 = record
Field1: array[0..4999] of Integer;
Field2: array[0..4999] of Extended;
Field3: string;
end;
procedure Main;
const
N = 100000;
var
i: Integer;
Stopwatch: TStopwatch;
List: TList<tTestRecord2>;
Rec: tTestRecord2;
begin
List := TList<tTestRecord2>.Create;
List.Capacity := N;
for i := 0 to N-1 do
begin
List.Add(Rec);
end;
Stopwatch := TStopwatch.StartNew;
for i := 0 to N-1 do
begin
if List[i].Field3 = 'abcde' then
begin
Break;
end;
end;
Writeln(Stopwatch.ElapsedMilliseconds);
end;
begin
Main;
Readln;
end.
I had to compile it for 64 bit to avoid an out of memory condition. The output on my machine is around 700. Change List[i].Field3 to List.List[i].Field3 and the output is in single figures. The timing is rather crude, but I think this demonstrates the point.
The issue of the large record not being cache friendly remains. That is more complicated to deal with and would require a detailed analysis of how the real world code operated on its data.
As an aside, if you care about performance then you won't use Extended. Because it has size 10, not a power of two, memory access is frequently misaligned. Use Double, or Real, which is an alias for Double.
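The third option from the list above, a dynamic array instead of TList<T>, could be sketched as follows. The record type is the one from the question; N is a hypothetical count, kept small here to limit memory use, so the timing is not comparable with the figures quoted above.

```pascal
program ArrayFieldAccess;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Diagnostics;

type
  tTestRecord2 = record
    Field1: array[0..4999] of Integer;
    Field2: array[0..4999] of Extended;
    Field3: string;
  end;

const
  N = 1000; // hypothetical count, kept small to limit memory use

var
  Arr: TArray<tTestRecord2>;
  Stopwatch: TStopwatch;
  i: Integer;
begin
  SetLength(Arr, N); // one allocation; elements are zero-initialized
  Stopwatch := TStopwatch.StartNew;
  for i := 0 to High(Arr) do
  begin
    // The element is addressed in place; only Field3 is read
    if Arr[i].Field3 = 'abcde' then
      Break;
  end;
  Writeln(Stopwatch.ElapsedMilliseconds);
  Readln;
end.
```

The trade-off is that a plain dynamic array offers none of the conveniences of TList<T>, such as Add or automatic growth.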
Following on from this question (Dynamic arrays and memory management in Delphi), if I create a dynamic array in Delphi, how do I access the reference count?
SetLength(a1, 100);
a2 := a1;
// The reference count for the array pointed to by both
// a1 and a2 should be 2. How do I retrieve this?
Additionally, if the reference count can be accessed, can it also be modified manually? This latter question is mainly theoretical rather than for use practically (unlike the first question above).
You can see how the reference count is managed by inspecting the code in the System unit. Here are the pertinent parts from the XE3 source:
type
PDynArrayRec = ^TDynArrayRec;
TDynArrayRec = packed record
{$IFDEF CPUX64}
_Padding: LongInt; // Make 16 byte align for payload..
{$ENDIF}
RefCnt: LongInt;
Length: NativeInt;
end;
....
procedure _DynArrayAddRef(P: Pointer);
begin
if P <> nil then
AtomicIncrement(PDynArrayRec(PByte(P) - SizeOf(TDynArrayRec))^.RefCnt);
end;
function _DynArrayRelease(P: Pointer): LongInt;
begin
Result := AtomicDecrement(PDynArrayRec(PByte(P) - SizeOf(TDynArrayRec))^.RefCnt);
end;
A dynamic array variable holds a pointer. If the array is empty, then the pointer is nil. Otherwise the pointer contains the address of the first element of the array. Immediately before the first element of the array is stored the metadata for the array. The TDynArrayRec type describes that metadata.
So, if you wish to read the reference count you can use the exact same technique as does the RTL. For instance:
function DynArrayRefCount(P: Pointer): LongInt;
begin
if P <> nil then
Result := PDynArrayRec(PByte(P) - SizeOf(TDynArrayRec))^.RefCnt
else
Result := 0;
end;
If you want to modify the reference count then you can do so by exposing the functions in System:
procedure DynArrayAddRef(P: Pointer);
asm
JMP System.@DynArrayAddRef
end;
function DynArrayRelease(P: Pointer): LongInt;
asm
JMP System.@DynArrayRelease
end;
Note that the name DynArrayRelease as chosen by the RTL designers is a little misleading because it merely reduces the reference count. It does not release memory when the count reaches zero.
I'm not sure why you would want to do this, mind you. Bear in mind that once you start modifying the reference count, you have to take full responsibility for getting it right. For instance, this program leaks:
{$APPTYPE CONSOLE}
var
a, b: array of Integer;
type
PDynArrayRec = ^TDynArrayRec;
TDynArrayRec = packed record
{$IFDEF CPUX64}
_Padding: LongInt; // Make 16 byte align for payload..
{$ENDIF}
RefCnt: LongInt;
Length: NativeInt;
end;
function DynArrayRefCount(P: Pointer): LongInt;
begin
if P <> nil then
Result := PDynArrayRec(PByte(P) - SizeOf(TDynArrayRec))^.RefCnt
else
Result := 0;
end;
procedure DynArrayAddRef(P: Pointer);
asm
JMP System.@DynArrayAddRef
end;
function DynArrayRelease(P: Pointer): LongInt;
asm
JMP System.@DynArrayRelease
end;
begin
ReportMemoryLeaksOnShutdown := True;
SetLength(a, 1);
Writeln(DynArrayRefCount(a));
b := a;
Writeln(DynArrayRefCount(a));
DynArrayAddRef(a);
Writeln(DynArrayRefCount(a));
a := nil;
Writeln(DynArrayRefCount(b));
b := nil;
Writeln(DynArrayRefCount(b));
end.
And if you make a call to DynArrayRelease that takes the reference count to zero then you would also need to dispose of the array, for reasons discussed above. I've never encountered a problem that would require manipulation of the reference count, and strongly suggest that you avoid doing so.
One final point. The RTL does not offer this functionality through its public interface. Which means that all of the above is private implementation detail. And so is subject to change in a future release. If you do attempt to read or modify the reference count then you must recognise that doing so relies on such implementation detail.
After some googling, I found an excellent article by Rudy Velthuis. I highly recommend reading it. Quoting the part on dynamic arrays from http://rvelthuis.de/articles/articles-pointers.html#dynarrays
At the memory location below the address to which the pointer points, there are two more fields, the number of elements allocated, and the reference count.
If, as in the diagram above, N is the address in the dynamic array variable, then the reference count is at address N-8, and the number of allocated elements (the length indicator) at N-4. The first element is at address N.
How to access these:
SetLength(a1, 100);
a2 := a1;
// Reference Count = 2
refCount := PInteger(NativeUInt(@a1[0]) - SizeOf(NativeInt) - SizeOf(Integer))^;
// Array Length = 100
arrLength := PNativeInt(NativeUInt(@a1[0]) - SizeOf(NativeInt))^;
The trick in computing the proper offsets is to account for the differences between 32-bit and 64-bit platforms. The field sizes in bytes are as follows:
Field     32-bit  64-bit
RefCount  4       4
Length    4       8
I have a dynamic array, but initially I do not know its length. Can I first set its length to 1 and then increase the length as needed without losing the previously stored data?
I know I can do this using TList, but I want to know whether I can do it with an array.
Dynamic Arrays can be resized to a larger size without losing the contained data.
The following program demonstrates this in action.
program Project7;
{$APPTYPE CONSOLE}
uses
SysUtils;
var
A : Array of Integer;
I : Integer;
begin
for I := 0 to 19 do
begin
SetLength(A,I+1);
A[I] := I;
end;
for I := Low(A) to High(A) do
begin
writeln(A[I]);
end;
readln;
end.
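A smaller sketch makes the effect of each SetLength call visible. It is an illustration only: the values are arbitrary, and it additionally shows that shrinking drops the tail of the array.

```pascal
program ResizeKeepsData;

{$APPTYPE CONSOLE}

var
  A: array of Integer;
  I: Integer;
begin
  SetLength(A, 3);
  for I := 0 to High(A) do
    A[I] := (I + 1) * 10;

  SetLength(A, 5); // grow: the first three values are kept, the new slots are 0
  for I := 0 to High(A) do
    Write(A[I], ' ');
  Writeln; // prints: 10 20 30 0 0

  SetLength(A, 2); // shrink: the tail is dropped
  for I := 0 to High(A) do
    Write(A[I], ' ');
  Writeln; // prints: 10 20
  Readln;
end.
```

Growing one element at a time works, as the program above shows, but each call may reallocate and copy the array, so growing in larger steps is usually preferable for big arrays.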