Accessing a container object vs a locally assigned variable - delphi

Can anyone tell me whether there is any performance benefit to assigning an object in a container to a local variable if it's used a lot in a tight loop?
I have a large for loop, and inside the loop an object from a container is accessed often.
i.e.
for i := 0 to 100000 do
begin
my_list[i].something := something;
my_list[i].something_else := something;
my_list[i].something_else := something;
my_list[i].something_else := something;
my_list[i].something_else := something;
end;
Would I see a performance improvement by assigning
local_ref := my_list[i];
at the start of each iteration?
I am using a generic container (TList<MyObject>).

Making the change you suggest will certainly result in faster code. Accessing a local variable is always going to be faster than accessing a property getter on TList<T>. For a start, those getters perform validity checking on the index. But even for a class with the most simple getter possible, it would be hard to beat the cached local for performance.
Now, whether this matters in your case is impossible to say from here. If you do anything remotely non-trivial inside the loop then the runtime of the item getter will be irrelevant.
The fact that the loop might run for a large number of iterations is not the key factor. What counts most of all is how much time you spend in each iteration. If it costs you 1 time unit to call the item getter, and 1000 time units to do whatever you do with each item, then the getter performance is a non-issue.
Ultimately the definitive way to answer the question is to time the alternatives. Only optimise based on measurement.
There's a much better reason to copy the item into a local variable: clarity of expression. Your current code is an egregious violation of the DRY principle.
Finally, this code would read best of all if it used a for in loop:
for Item in List do
....
Now, for in loops can be slower than traditional loops, but you should weigh that against clarity and maintainability. Optimisation usually makes code harder to maintain and more prone to faults. Conclusion: only optimise bottlenecks.
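As a rough illustration of the two forms discussed above, here is a minimal sketch. TMyObject and its fields are hypothetical stand-ins for the question's class, and the unit name System.Generics.Collections assumes XE2 or later (older versions call it Generics.Collections). It makes no claim about how large the difference is; that still has to be measured.

```pascal
program CachedItemDemo;
{$APPTYPE CONSOLE}

uses
  System.Generics.Collections;

type
  // Hypothetical stand-in for the question's MyObject.
  TMyObject = class
  public
    Something: Integer;
    SomethingElse: Integer;
  end;

procedure Run;
var
  MyList: TObjectList<TMyObject>;
  Item: TMyObject;
  I: Integer;
begin
  MyList := TObjectList<TMyObject>.Create(True); // the list owns its items
  try
    for I := 0 to 9 do
      MyList.Add(TMyObject.Create);

    // Call the Items getter once per iteration, then work on the local.
    for I := 0 to MyList.Count - 1 do
    begin
      Item := MyList[I];
      Item.Something := I;
      Item.SomethingElse := I * 2;
    end;

    // The for-in form: the enumerator hands over each item in turn.
    for Item in MyList do
      Writeln(Item.Something, ' ', Item.SomethingElse);
  finally
    MyList.Free;
  end;
end;

begin
  Run;
end.
```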

It all depends on how my_list[i] is retrieved. If it results in a bunch of function calls, it can potentially make a difference (not to mention any side effect).
As usual, you should measure before doing any kind of performance refactoring. Premature optimization.....
For the record, this was one of the "good" uses of with in the original Pascal design:
with my_list[i] do
begin
something := something_else;
[...]
end;

Related

creating a fast case statement (assembly)

I have a project that makes intensive use of the case statement, with many procedures coming off it. I know you can arrange case statements in two tiers: a first case statement that divides the range into blocks of 10, and a second that selects the individual procedure. But I have a better idea, if I can pull it off.
I want to call it assembly case
Prolist: array [1..500] of Pointer =
(@Procedure1, @Procedure2, @Procedure3, @Procedure4, @Procedure5);
Procedure ASMCase(Prolist: array of Pointer; No: Word; Var InRange: Boolean);
var Count : DWord;
PTR: Pointer;
Pro : Procedure;
begin
Count := No * 4;
InRange := boolean(Count <= SizeOf(Prolist));
If not InRange then Exit;
PTR := Pointer(DWord(@Prolist[1]) + Count);
If PTR <> nil then Pro := @PTR else Exit;
Pro; // run procedure
end;
The point is I'm creating a direct jump to the procedure.
In my case the procedures can have an identical header, and global data can be accessed for any odd information. Writing it in assembly would be faster, I think, but what I'm not sure about is how to run the procedures. Please do not ask why I am doing this: I have 500 procedures with many calls on the case statement, and time is of the essence even with a fast processor.
It's expensive to pass that array by value. Pass it by const.
I can't see the point of the InRange flag and test. Don't pass out of range indices. And if you have to test, do it right. Don't use SizeOf which measures byte size. Use high or perhaps Length, if you have to. Which I doubt.
The pointer assignment test (PTR <> nil) is bogus. That condition always evaluates true. And the array indexing is very weird. What's wrong with []?
On top of that, your array is 1-based (usually a bad choice) but open arrays are always 0-based. Likely that's going to trip you up.
In short, I'd throw away all of that code. It's both wrong and needless. I'd just write it like this:
ProList[No]();
In order for this to compile your array would need to be defined as an array of procedural type rather than array of Pointer. Adding some type safety would be a good move.
It's pretty hard to see asm making much difference here. The compiler is going to emit optimal code.
If you are concerned with out of bound access, enable range checking in debug mode. Disable it for release if performance is paramount.
Bear in mind that global data structures don't tend to scale well as you add complexity. Most experienced programmers go to some length to avoid global state. Are you sure that global state is the right choice for you?
If you do need to improve performance, first identify opportunity for improvement. Reading from an array and calling a function are not likely candidates. Look at the procedures that you call. The bottlenecks are surely there.
One final point. Try to forget that you ever learned to use @ with function pointers. Doing so yields an untyped pointer, of type Pointer, that can be assigned to any pointer type. And thus you completely abandon type checking. Your procedure could have the wrong signature altogether and the compiler would not be able to tell you. Declare your array of procedures with a type-safe procedural type.
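A minimal sketch of the typed-table approach described above might look like this. THandler, Dispatch and the three tiny procedures are hypothetical names chosen for illustration, and the table is 0-based rather than 1-based as in the question.

```pascal
program DispatchDemo;
{$APPTYPE CONSOLE}

type
  // All handlers share one signature, so the compiler can check them.
  THandler = procedure;

procedure Procedure1;
begin
  Writeln('Procedure1');
end;

procedure Procedure2;
begin
  Writeln('Procedure2');
end;

procedure Procedure3;
begin
  Writeln('Procedure3');
end;

const
  // 0-based table of typed procedure references (no @, no Pointer).
  ProList: array[0..2] of THandler = (Procedure1, Procedure2, Procedure3);

procedure Dispatch(No: Integer);
begin
  // Explicit bounds test; alternatively rely on {$R+} in debug builds.
  if (No >= Low(ProList)) and (No <= High(ProList)) then
    ProList[No]();
end;

begin
  Dispatch(1); // calls Procedure2
  Dispatch(7); // out of range, ignored
end.
```

Because the array element type is THandler, assigning a routine with a different signature is rejected at compile time, which is the type safety the untyped Pointer version gives up.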

Does the compiler optimize (close) identical FieldByName calls?

In some code I'm maintaining, I see two different methods used in TClientDataSet.OnCalcFields event handlers:
with DataSet do
begin
// 1. Call FieldByName twice
if AMinDate > FieldByName(SPlanAllocatieFromDate).AsDateTime then
AMinDate := FieldByName(sPlanAllocatieFromDate).AsDateTime;
// 2. Put the retrieved FieldByName value in a temp var
lEmpID := FieldByName(SPlanAllocatieEmpID).AsInteger;
if lEmpID <> 0 then lTSAllocatedEmpIDs.Add(IntToStr(lEmpID));
end;
Will the compiler (Delphi XE2, Win32 app) optimize method 1 to use a temp var, as in method 2? The two FieldByName calls are quite close; you could even say nested.
If not, I should rewrite 1. because OnCalcFields executes often.
BTW. I know about Fields[] versus FieldByName(), or using a temp TField var when running an EOF loop, those are not the issue here.
No version of the Delphi compiler does anything like this.
Such optimizations would require the compiler to be able to prove that the two calls to FieldByName would always give the same result, and there is currently no provision for flagging a method as being deterministic.
Note that it is quite possible in theory (if unlikely in reality) for the two calls NOT to give the same result, in this case e.g. if a different thread deletes a field out of the collection between the first and second call. Generally, the compiler does not know or care at the call site what a particular method call actually does.
Does the compiler optimize (close) identical FieldByName calls?
No it does not.
The compiler does not look inside function calls to see what is within. It therefore has no way to prove that the value returned by successive calls to a function would be the same. Likewise it has no way to prove that the function has no side-effects. These are the two prerequisites for the optimisation under consideration.
You will need to perform the optimisation yourself, by explicitly adding and using a local variable to store the value returned by a single call to FieldByName.
Beyond the consideration of performance, I would argue that the use of a local variable to hold the field is semantically much better. This makes it clear to the reader that all actions are performed on the same field. That reason alone would be enough to persuade me to make the change you describe. Don't repeat yourself.
And while we are in code review mode, you might care to reconsider the use of with.
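For illustration, the manual version of that optimisation could be sketched as below. UpdateMinDate is a hypothetical helper, not code from the question, and the unit name Data.DB assumes XE2 or later.

```pascal
unit CalcFieldsSketch;

interface

uses
  Data.DB;

// Hypothetical helper: lowers AMinDate to the value of the named date field.
procedure UpdateMinDate(DataSet: TDataSet; const FieldName: string;
  var AMinDate: TDateTime);

implementation

procedure UpdateMinDate(DataSet: TDataSet; const FieldName: string;
  var AMinDate: TDateTime);
var
  FromDate: TDateTime;
begin
  // One lookup by name; the value is then reused from the local.
  FromDate := DataSet.FieldByName(FieldName).AsDateTime;
  if AMinDate > FromDate then
    AMinDate := FromDate;
end;

end.
```

Passing the dataset explicitly also sidesteps the with block, so it is clear which object FieldByName is called on.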

How to handle billions of objects without "Outofmemory" error

I have an application which may need to process billions of objects. Each object is of the TRange class type. These ranges are created at different parts of an algorithm, depending on certain conditions and other object properties. As a result, if you have 100 items, you can't directly create the 100th object without creating all the prior objects. If I create all the (billions of) objects and add them to a collection, the system throws an out-of-memory error. Now I want to iterate through each object, mainly for two purposes:
To apply an operation to each TRange object (e.g. output certain properties).
To get a cumulative sum of a certain property (e.g. each range has a weight property and I want to retrieve the total weight, the sum of all the range weights).
How do I effectively create an iterator for these objects without running out of memory?
I have handled the first case by passing a function pointer to the algorithm function. For eg:
procedure createRanges(aProc: TRangeProc); // aProc is a pointer to a function that takes a TRange
var range: TRange;
rangerec: TRangeRec;
begin
range:=TRange.Create;
try
while canCreateRange do begin//certain conditions needed to create a range
rangerec := ReturnRangeRec;
range.Update(rangerec);//don't create new, use the same object.
if Assigned(aProc) then aProc(range);
end;
finally
range.Free;
end;
end;
But the problem with this approach is that to add a new functionality, say to retrieve the Total weight I have mentioned earlier, either I have to duplicate the algorithm function or pass an optional out parameter. Please suggest some ideas.
Thank you all in advance
Pradeep
For such large amounts of data you need to keep only a portion of the data in memory. The rest should be serialized to the hard drive. I tackled such a problem like this:
I created an extended storage that can store a custom record either in memory or on the hard drive. This storage has a maximum number of records that can live simultaneously in memory.
Then I derived the record classes from the custom record class. These classes know how to store themselves to and load themselves from the hard drive (I use streams).
Every time you need a new or already existing record, you ask the extended storage for it. If the maximum number of objects is exceeded, the storage streams some of the least used records back to the hard drive.
This way the records are transparent. You always access them as if they were in memory, but they may get loaded from the hard drive first. It works really well. By the way, RAM works in a very similar way: it only holds a certain subset of all the data on your hard drive. This is your working set.
I did not post any code because it is beyond the scope of the question itself and would only confuse.
Look at TgsStream64. This class can handle huge amounts of data through file mapping.
http://code.google.com/p/gedemin/source/browse/trunk/Gedemin/Common/gsMMFStream.pas
But the problem with this approach is that to add a new functionality, say to retrieve the Total weight I have mentioned earlier, either I have to duplicate the algorithm function or pass an optional out parameter.
It's usually done like this: you write an enumerator function (like you did) which receives a callback function pointer (you did that too) and an untyped pointer ("Data: pointer"). You define the callback function so that its first parameter is the same untyped pointer:
TRangeProc = procedure(Data: pointer; range: TRange);
procedure enumRanges(aProc: TRangeProc; Data: pointer);
begin
{for each range}
aProc(Data, range);
end;
Then if you want to, say, sum all ranges, you do it like this:
TSumRecord = record
Sum: int64;
end;
PSumRecord = ^TSumRecord;
procedure SumProc(SumRecord: PSumRecord; range: TRange);
begin
SumRecord.Sum := SumRecord.Sum + range.Value;
end;
function SumRanges(): int64;
var SumRec: TSumRecord;
begin
SumRec.Sum := 0;
enumRanges(TRangeProc(@SumProc), @SumRec);
Result := SumRec.Sum;
end;
Anyway, if you need to create billions of ANYTHING you're probably doing it wrong (unless you're a scientist, modelling something extremely large scale and detailed). Even more so if you need to create billions of stuff every time you want one of those. This is never good. Try to think of alternative solutions.
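As an alternative sketch of the same callback idea, compilers with anonymous methods (Delphi 2009 and later) let the callback capture a local variable, so the untyped Data pointer is not needed. This is not the approach shown above but a variation on it; the TRange class, its Weight field and the fixed loop of five ranges are hypothetical placeholders for the real algorithm.

```pascal
program EnumRangesDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils;

type
  // Hypothetical minimal TRange; the real class is not shown in the question.
  TRange = class
  public
    Weight: Int64;
  end;

// Generates ranges one at a time, reusing a single object, and hands each
// to the callback. The fixed count stands in for the real algorithm.
procedure EnumRanges(const AProc: TProc<TRange>);
var
  Range: TRange;
  I: Integer;
begin
  Range := TRange.Create;
  try
    for I := 1 to 5 do
    begin
      Range.Weight := I;
      if Assigned(AProc) then
        AProc(Range);
    end;
  finally
    Range.Free;
  end;
end;

function TotalWeight: Int64;
var
  Total: Int64;
begin
  Total := 0;
  // The anonymous method captures Total, so no Data pointer is needed.
  EnumRanges(
    procedure(Range: TRange)
    begin
      Total := Total + Range.Weight;
    end);
  Result := Total;
end;

begin
  Writeln(TotalWeight); // 1+2+3+4+5 = 15
end.
```

Adding another operation, such as printing each range, then only needs another anonymous method passed to the same EnumRanges; the generating algorithm is not duplicated.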
"Runner" has a good answer on how to handle this!
But I would like to know whether a quick fix is possible: make the TRange objects smaller.
Maybe you have a big ancestor class? Can you take a look at the instance size of a TRange object?
Maybe you would be better off using packed records?
This part:
As a result, if you have 100 items, you can't directly create the 100th object without creating all the prior objects.
sounds a bit like calculating Fibonacci. Maybe you can reuse some of the TRange objects instead of creating redundant copies? Here is a C++ article describing this approach - it works by storing already calculated intermediate results in a hash map.
Handling billions of objects is possible but you should avoid it as much as possible. Do this only if you absolutely have to...
I did create a system once that needed to be able to handle a huge amount of data. To do so, I made my objects "streamable" so I could read/write them to disk. A larger class around it was used to decide when an object would be saved to disk and thus removed from memory. Basically, when I requested an object, this class would check whether it was loaded. If not, it would re-create the object from disk, put it on top of a stack and then write the bottom object of this stack to disk. As a result, my stack had a fixed (maximum) size. And it allowed me to use an unlimited number of objects, with reasonably good performance too.
Unfortunately, I don't have that code available anymore. I wrote it for a previous employer about 7 years ago. I do know that you would need to write a bit of code for the streaming support plus a bunch more for the stack controller which maintains all those objects. But it technically would allow you to create an unlimited number of objects, since you're trading RAM memory for disk space.

Approaches for caching calculated values

In a Delphi application we are working on we have a big structure of related objects. Some of the properties of these objects have values which are calculated at runtime and I am looking for a way to cache the results for the more intensive calculations. An approach which I use is saving the value in a private member the first time it is calculated. Here's a short example:
unit Unit1;
interface
type
TMyObject = class
private
FObject1, FObject2: TMyOtherObject;
FMyCalculatedValue: Integer;
function GetMyCalculatedValue: Integer;
public
property MyCalculatedValue: Integer read GetMyCalculatedValue;
end;
implementation
function TMyObject.GetMyCalculatedValue: Integer;
begin
if FMyCalculatedValue = 0 then
begin
FMyCalculatedValue :=
FObject1.OtherCalculatedValue + // This is also calculated
FObject2.OtherValue;
end;
Result := FMyCalculatedValue;
end;
end.
It is not uncommon that the objects used for the calculation change and the cached value should be reset and recalculated. So far we addressed this issue by using the observer pattern: objects implement an OnChange event so that others can subscribe, get notified when they change and reset cached values. This approach works but has some downsides:
It takes a lot of memory to manage subscriptions.
It doesn't scale well when a cached value depends on lots of objects (a list for example).
The dependency is not very specific (even if a cache value depends only on one property it will be reset also when other properties change).
Managing subscriptions impacts the overall performance and is hard to maintain (objects are deleted, moved, ...).
It is not clear how to deal with calculations depending on other calculated values.
And finally the question: can you suggest other approaches for implementing cached calculated values?
If you want to avoid the Observer Pattern, you might try to use a hashing approach.
The idea would be that you 'hash' the arguments, and check whether the result matches the 'hash' for which the state was saved. If it does not, you recompute (and save the new hash as the key).
I know I make it sound like I just thought of it, but in fact it is used by well-known software.
For example, SCons (a Makefile alternative) uses it, in preference to a timestamp approach, to check whether a target needs to be rebuilt.
We have used SCons for over a year now, and we have never found a target that should have been rebuilt but was not, so their hash works well!
You could store local copies of the external object values which are required. The access routine then compares the local copy with the external value, and only does the recalculation on a change.
Accessing the external objects properties would likewise force a possible re-evaluation of those properties, so the system should keep itself up-to-date automatically, but only re-calculate when it needs to. I don't know if you need to take steps to avoid circular dependencies.
This increases the amount of space you need for each object, but removes the observer pattern. It also defers all calculations until they are needed, instead of performing the calculation every time a source parameter changes. I hope this is relevant for your system.
unit Unit1;
interface
type
TMyObject = class
private
FObject1, FObject2: TMyOtherObject;
FObject1Val, FObject2Val: Integer;
FMyCalculatedValue: Integer;
function GetMyCalculatedValue: Integer;
public
property MyCalculatedValue: Integer read GetMyCalculatedValue;
end;
implementation
function TMyObject.GetMyCalculatedValue: Integer;
begin
if (FObject1.OtherCalculatedValue <> FObject1Val)
or (FObject2.OtherValue <> FObject2Val) then
begin
FMyCalculatedValue :=
FObject1.OtherCalculatedValue + // This is also calculated
FObject2.OtherValue;
FObject1Val := FObject1.OtherCalculatedValue;
FObject2Val := FObject2.OtherValue;
end;
Result := FMyCalculatedValue;
end;
end.
In my work I use Bold for Delphi, which can manage arbitrarily complex structures of cached values that depend on each other. Usually each variable only holds a small part of the problem. In this framework these are called derived attributes: derived because the value is not saved in the database; it just depends on other derived attributes or persistent attributes in the database.
The code behind such an attribute is written in Delphi as a procedure, or in OCL (Object Constraint Language) in the model. If you write it as Delphi code you have to subscribe to the variables it depends on. So if attribute C depends on A and B, then whenever A or B changes, the code to recalculate C is called automatically the next time C is read. The first time C is read, A and B are also read (maybe from the database). As long as A and B are not changed, you can read C and get very fast performance. For complex calculations this can save quite a lot of CPU time.
The downside is that Bold is not officially supported anymore and you cannot buy it either. I suppose you can get it if you ask enough people, but I don't know where you can download it. Around 2005-2006 it was downloadable for free from Borland, but not anymore.
It is not ready for D2009, as someone would have to port it to Unicode.
Another option is ECO for .NET from Capable Objects. ECO is a plugin for Visual Studio. It is a supported framework that has the same idea and author as Bold for Delphi. Many things are also improved; for example, databinding is used for the GUI components. Both Bold and ECO use a model as a central point, with classes, attributes and links. Those can be persisted in a database or an XML file. With the free version of ECO the model can have at most 12 classes, but as far as I remember there are no other limits.
Bold and ECO contain a lot more than derived attributes; they make you more productive and let you think about the problem instead of the technical details of the database or, in your case, how to cache values. You are welcome to ask more questions about those frameworks!
Edit:
There is actually a download link for Bold for Delphi for D7 for registered Embarcadero users. It is quite old; I know there were updates for D2005 and D2006.

Does avoiding functions increase the performance?

Here is a little test:
function inc(n:integer):integer;
begin
n := n+1;
result := n;
end;
procedure TForm1.Button1Click(Sender: TObject);
var
start,i,n:integer;
begin
n := 0;
start := getTickCount;
for i := 0 to 10000000 do begin
inc(n);//calling inc function takes 73 ms
//n := n+1; writing it directly takes 16 ms
end;
showMessage(inttostr(getTickCount-start));
end;
Yes, calling a function introduces an overhead. Before calling the function it's necessary to save the current state - which instruction was planned to execute next - and also to copy the function parameters. This requires extra work and extra time.
That's where inlining is helpful. If the compiler supports it, it can just inject the function code directly at the call site and avoid the overhead. With good optimization of the surrounding code it can even decrease the amount of generated code.
This doesn't mean you need to avoid functions. In most cases the function body takes much longer to execute than the call itself takes to set up. Only in quite rare cases is the overhead worth optimizing. This should never be done without the help of a profiler - otherwise you waste time and most likely just end up with a lot of unmaintainable code.
Calling a function (whichever language you're working with) generally involves doing a bit more things, like saving some context, pushing parameters to some kind of stack, calling the function itself, reading the parameters, and then pushing the result back somewhere, returning from the function, extracting the return value, ...
So, of course, calling functions generally means having some overhead.
But the main point of functions is re-using some parts of code : maybe it will take a few micro-seconds more at execution, but if you only have to write some code once, instead of 10 (or more) times, there is a huge gain ; and that code will be much easier to maintain, which is really important in the long term.
That said, you might want to avoid functions for some really small pieces of code like the one you provided as an example (unless the language you're using provides some kind of inlining; that's the case for C, if I remember correctly; not sure about Delphi, though): the overhead of calling the function will be significant compared to the number of lines of code the function saves you from writing (here: none! On the contrary ^^).
But for bigger pieces of code, the overhead will be much smaller compared to the time taken to execute the code the function contains...
Premature optimization is the root of all evil...
Write correct and maintainable code using the known features (here the built-in pseudo(magic) procedure inc), benchmark it and refactor where it's needed for performance reason (if any).
I bet that in 99.9% of the cases, avoiding calling a function or procedure is not the solution.
Here is an example where adding a call to a procedure actually IS the optimization.
Only optimize when there is a bottleneck.
Your current code is perfectly fine for about 99.9% of the cases.
If it gets slow, use a profiler to point you at the bottleneck.
When the bottleneck appears to be in the inc function, then you can always inline your function by marking it with the 'inline' directive.
I totally agree with Francois on this one.
One of the most expensive parts of a function call is the returning of the result.
If you did want to keep your program modular, but wanted to save a bit of time, change your function to a procedure and use a var parameter to retrieve the result.
So for your example:
procedure inc(var n:integer);
begin
n := n+1;
end;
should be considerably faster than using your inc function.
Also, in the loop in your example, you have the statement:
inc(n)
but this will not update the value of n. The loop will finish and n will have the value of 0. What you need instead is:
n := inc(n);
For your timings, do you have optimization on? If you do, then it may not be timing what you think it is. The value of n is not used by the program and may be optimized right out of it.
To make sure that n is used for the timings, you can simply display the value of n in your showMessage line.
Finally, inc is a built in procedure. It is not good practice to use the same function name as that of a built in procedure as it can cause doubts as to which procedure is being executed - yours or the built in one.
Change your function's name to myinc, and then do a third test with the built in inc procedure itself, to see if it is faster than n := n + 1;
As others before me said: yes, it does. Every line of code you write does. Functions need to store the current state of registers etc. before they can execute, and restore it afterwards.
But the overhead is so small that optimizing it away gains almost nothing. It is more important to have readable, well-structured code. Almost always. There may be rare cases where every nanosecond is important, but I cannot imagine one right now.
Look here for general guidelines about performance in delphi programs:
http://effovex.com/OptimalCode/opguide.htm
I just want to add some comments specific to Delphi:
If I remember correctly, GetTickCount() has a resolution that is too coarse for this kind of test (roughly 10-15 ms). You could use QueryPerformanceCounter() for a better result.
For small functions that are called many times (inside a processing loop, data conversion, ...), use inline (search the help).
But to know for real how long a function takes and whether you should do something about it, use a profiler! I use http://www.prodelphi.de/; it's pretty simple, very useful, and the price is very reasonable compared to other profilers (i.e. around 50€ instead of 500€).
In Delphi, there is the built-in Inc(). It's faster than "n := n+1" (because Inc() is not really a function; the compiler replaces it with machine code, i.e. there is no source code for Inc()).
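To make the points about inline and timer resolution concrete, here is a small sketch. It assumes Delphi 2010 or later for TStopwatch (a wrapper that uses the high-resolution counter when one is available) and XE2 or later for the unit name System.Diagnostics. MyInc is a hypothetical rename of the question's function, and no particular timing result is claimed.

```pascal
program InlineTiming;
{$APPTYPE CONSOLE}

uses
  System.Diagnostics;

// Renamed from inc to avoid shadowing the built-in Inc.
function MyInc(N: Integer): Integer; inline;
begin
  Result := N + 1;
end;

procedure Run;
var
  SW: TStopwatch;
  I, N: Integer;
begin
  N := 0;
  SW := TStopwatch.StartNew;
  for I := 1 to 10000000 do
    N := MyInc(N);
  SW.Stop;
  // Printing N keeps the optimiser from discarding the loop's work.
  Writeln('N = ', N, ', elapsed ms: ', SW.ElapsedMilliseconds);
end;

begin
  Run;
end.
```

Removing the inline directive and rerunning gives a like-for-like comparison of the call overhead on a given compiler and machine.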
All good comments.
Functions are supposed to be useful, that's why they're in the language. The assumption is that if they have a nominal cost, you are willing to pay that to get the utility they provide.
Here's the real problem with functions, no matter who writes them, but especially if somebody other than you wrote them.
They have an implied contract for what they're supposed to do, but they have no contract for how long they should take.
Usually the person who writes the function thinks "This function does something valuable, so the person who calls it will respect that, and use it sparingly."
Then the person who calls it thinks "This function does so much in only a single call that I can make my code really clean and powerful by calling it lots of times."
Now, with multiple layers of abstraction, this effect acts like compound interest.
So, the real performance problem with functions is not the cost of calls, it is the psychology of programmers, leading to exponential slowdown.
Fortunately, experience in performance tuning can ameliorate this problem.

Resources