Does the compiler optimize (close) identical FieldByName calls? - delphi

In some code I'm maintaining, I see two different methods used in TClientDataSet.OnCalcFields event handlers:
with DataSet do
begin
  // 1. Call FieldByName twice
  if AMinDate > FieldByName(SPlanAllocatieFromDate).AsDateTime then
    AMinDate := FieldByName(SPlanAllocatieFromDate).AsDateTime;
  // 2. Put the retrieved FieldByName value in a temp var
  lEmpID := FieldByName(SPlanAllocatieEmpID).AsInteger;
  if lEmpID <> 0 then
    lTSAllocatedEmpIDs.Add(IntToStr(lEmpID));
end;
Will the compiler (Delphi XE2, Win32 app) optimize method 1 into something like method 2, i.e. use a temp var behind the scenes? The two FieldByName calls are quite close; you could even say nested.
If not, I should rewrite 1, because OnCalcFields executes often.
BTW, I know about Fields[] versus FieldByName(), and about using a temp TField var when running an EOF loop; those are not the issue here.

No version of the Delphi compiler does anything like this.
Such optimizations would require the compiler to be able to prove that the two calls to FieldByName would always give the same result, and there is currently no provision for flagging a method as being deterministic.
Note that it is quite possible in theory (if unlikely in reality) for the two calls NOT to give the same result, in this case e.g. if a different thread deletes a field out of the collection between the first and second call. Generally, the compiler does not know or care at the call site what a particular method call actually does.

Does the compiler optimize (close) identical FieldByName calls?
No it does not.
The compiler does not look inside function calls to see what is within. It therefore has no way to prove that the value returned by successive calls to a function would be the same. Likewise it has no way to prove that the function has no side-effects. These are the two prerequisites for the optimisation under consideration.
You will need to perform the optimisation yourself, by explicitly adding and using a local variable to store the value returned by a single call to FieldByName.
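For instance, a minimal sketch of the first snippet rewritten that way (FromDateField is a new local variable; the constant name is taken from the question):
var
  FromDateField: TField;
begin
  FromDateField := DataSet.FieldByName(SPlanAllocatieFromDate);
  // One lookup, then reuse the TField reference.
  if AMinDate > FromDateField.AsDateTime then
    AMinDate := FromDateField.AsDateTime;
end;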
Beyond the consideration of performance, I would argue that the use of a local variable to hold the field is semantically much better. This makes it clear to the reader that all actions are performed on the same field. That reason alone would be enough to persuade me to make the change you describe. Don't repeat yourself.
And while we are in code review mode, you might care to reconsider the use of with.

Related

Accessing a container object vs a locally assigned variable

Can anyone tell me if there is any performance benefit to assigning an object in the container to a local variable if it's used a lot in a tight loop?
I have a large for loop, and inside the loop an object from a container is accessed often.
i.e.
for i := 0 to 100000 do
begin
  my_list[i].something := something;
  my_list[i].something_else := something;
  my_list[i].something_else := something;
  my_list[i].something_else := something;
  my_list[i].something_else := something;
end;
Would I see a performance improvement by assigning
local_ref := my_list[i];
at the start of each iteration?
I am using a generic container (TList<MyObject>).
Making the change you suggest will certainly result in faster code. Accessing a local variable is always going to be faster than accessing a property getter on TList<T>. For a start, those getters perform validity checking on the index. But even for a class with the most simple getter possible, it would be hard to beat the cached local for performance.
Now, whether this matters in your case is impossible to say from here. If you do anything remotely non-trivial inside the loop then the runtime of the item getter will be irrelevant.
The fact that the loop might run for a large number of iterations is not the key factor. What counts most of all is how much time you spend in each iteration. If it costs you 1 time unit to call the item getter, and 1000 time units to do whatever you do with each item, then the getter performance is a non-issue.
Ultimately the definitive way to answer the question is to time the alternatives. Only optimise based on measurement.
There's a much better reason to copy the item into a local variable: clarity of expression. Your current code is an egregious violation of the DRY principle.
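For illustration, a minimal sketch of the cached-local version, using the hypothetical names from the question (my_list is the TList<MyObject>, local_ref the local variable):
var
  i: Integer;
  local_ref: MyObject;
begin
  for i := 0 to my_list.Count - 1 do
  begin
    // One getter call per iteration instead of five.
    local_ref := my_list[i];
    local_ref.something := something;
    local_ref.something_else := something;
  end;
end;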
Finally, this code would read best of all if it used a for in loop:
for Item in List do
....
Now, for in loops can be slower than traditional loops, but you should weigh that against clarity and maintainability. Optimisation usually makes code harder to maintain and more prone to faults. Conclusion: only optimise bottlenecks.
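Filled out with the same hypothetical names, the for-in form would look like this:
var
  Item: MyObject;
begin
  for Item in my_list do
  begin
    Item.something := something;
    Item.something_else := something;
  end;
end;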
It all depends on how my_list[i] is retrieved. If it results in a bunch of function calls, it can potentially make a difference (not to mention any side effect).
As usual, you should measure before doing any kind of performance refactoring. Premature optimization.....
For the record, it was one of the "good" uses of with in the original Pascal design:
with my_list[i] do
begin
  something := something_else;
  [...]
end;

Should I always return IQueryable<> instead of IList<>?

I came across this post while I was looking for ways to improve performance. Currently, in my application we are returning IList<> all over the place. Is it a good idea to change all of these returns to AsQueryable()?
Here is what I found:
AsQueryable() - the context needs to be open and you cannot control the lifetime of the database context; it needs to be disposed properly. It also gives deferred execution ('faster filtering' as compared to lists).
IList<> - this should be preferred over List<> as it provides a bare-bones and lightweight implementation.
Also, when should one be preferred over the other? I know the basics, but I am sorry, I am still not clear on when and how we should use them correctly in an application. It would be great to know this, as next time I would try to keep it in mind before returning anything. Thanks a lot.
Basically, you should try to reference the widest type you need. For example, if some variable is declared as List<...>, you put a constraint for the type of the values that can be assigned to it. It may happen that you need only sequential access, so it would be enough to declare the variable as IEnumerable<...> instead. That will allow you to assign the values of other types to the variable, as well as the results of LINQ operations.
If you see that your variable needs access by index, you can again declare it as IList<...> and not just List<...>, allowing other types implementing IList<...> to be assigned to it.
For function return types, it is up to you. If you think it's important that the function returns exactly List<...>, declare it to return exactly List<...>. If the only important thing is access to the result by index, perhaps you don't need to constrain yourself to returning exactly List<...>; you may declare the return type as IList<...> (but actually return an instance of List<...> in this implementation, and possibly of some other type supporting IList<...> later). Again, if you see that the only important thing about the return value of your function is that it can be enumerated (and access by index is not needed), you should change the function return type to IEnumerable<...>, giving yourself more freedom.
Now, about AsQueryable: again, it depends on your logic. If you think that possible delayed evaluation is a good thing in your case, as it may help avoid unneeded calculations, or you intend to use it as part of some other query, use it. If you think that the results have to be "materialized", i.e. calculated at this very moment, you would do better to return a List<...>. You would especially need to materialize your result if evaluating the query later might produce a different list!
With a database, a good rule of thumb is to use AsQueryable for short-term intermediate results, but List for the "final" results which will be used over some longer time. Of course, having a non-materialized query hanging around makes closing the database impossible (since at the moment of actual evaluation the database must still be open).
If you do not intend to run any further queries against SQL Server, then you should return IList<>, because it produces in-memory data.
If you are concerned about performance, you should also try to run your queries in as few DB requests as possible and cache the most used queries. It is very common to significantly reduce request processing time using batch approaches.
Which ORM do you use to retrieve data from DB? If you use NHibernate, see this post about how to use Future, Multi Criteria 1, Multi Criteria 2 and Multi Query.
Greetings.

How would you refactor this code?

This hypothetical example illustrates several problems I can't seem to get past, even though I keep trying!! ... Suppose the original code is a long event handler, coded in the UI, triggered when a user clicks a cell in a grid. Expressed as pseudocode it's:
if Condition1=true then
begin
  //loop through every cell in row,
  //if aCell/headerCellValue>1 then
  //color aCell red
end
else if Condition2=true then
begin
  //do some other calculation adding cell and headerCell values, and
  //if some other product>2 then
  //color the whole row green
end
else
  show an error message
I look at this and say "Ah, refactor to the strategy pattern! The code will be easier to understand, easier to debug, and easier to later extend!" I get that.
And I can easily break the code into multiple procedures.
The problem is ultimately scope related. Assume the pseudocode makes extensive use of grid properties, values displayed in cells, maybe even built-in grid methods. How do you move all that to another unit, without referencing the grid component in the UI--which would break all the "rules" about loose coupling that make OOP valuable? ...
I'm really looking forward to responses. Thanks, as always -- Al C.
Refactoring to put code into a separate routine doesn't necessarily mean decoupling everything. You could just as well refactor each of those cases into a new method belonging to the same class as the event handler you're refactoring. Those methods would have all the same access to the grid component as your current code already has.
You're writing code for that event to do things to that grid on that form. Do you really foresee needing to do those operations in response to some other event? Or perform them on some other grid on some other form? If not, then decoupling everything is just an academic exercise and serves no purpose to your product. It's OK to write application-specific code.
If you want to decouple, then the way to do it is to add parameters to your factored-out routines. If the routines need to work with the grid without knowing exactly which grid it is, then pass the grid in as a parameter:
if Condition1 then
  ColorCellsRedAboveRatio(Grid, 1.0)
else if Condition2 then
  ColorRowsGreenAboveProduct(Grid, 2)
else
  Error;
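For completeness, a sketch of what one of those factored-out routines might look like. Everything here is an assumption: it presumes a TStringGrid with numeric cell text, and that Vcl.Grids and System.SysUtils are in the uses clause; substitute whatever grid class and cell access the form actually uses.
procedure ColorCellsRedAboveRatio(Grid: TStringGrid; Ratio: Double);
var
  Col: Integer;
  HeaderValue, CellValue: Double;
begin
  for Col := Grid.FixedCols to Grid.ColCount - 1 do
  begin
    HeaderValue := StrToFloatDef(Grid.Cells[Col, 0], 0);
    CellValue := StrToFloatDef(Grid.Cells[Col, Grid.Row], 0);
    if (HeaderValue <> 0) and (CellValue / HeaderValue > Ratio) then
    begin
      // Mark or colour the cell here; with a TStringGrid the actual
      // painting would typically happen in the OnDrawCell handler.
    end;
  end;
end;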
First of all, NEVER EVER compare a value to true.
It's
if Condition1 then
begin
end else if Condition2 then
begin
end;
Comparing to true can, in the worst case, fail even if the value is true. There is a good (but German) article, 'Über den Umgang mit Boolean' (roughly, 'On dealing with Boolean'), in the Delphi-PRAXiS.net community forums which shows an example reproducing this strange-seeming behaviour.
Regarding your question directly:
This code would be better placed in a custom paint event. That gets called for every cell and paints it directly with the correct colors, and it will draw the correct colors even when the grid is repainted. In your click event you would only color the cells once, and if, for any reason, the grid repaints, your colors would be lost.
Then again, this code is strongly UI and component related. If you didn't have this grid on your form, you wouldn't need this code.
What you could do to decouple things a bit is to pass the values retrieved from the grid row to an external unit that does only the calculation and returns a logical result. Your UI code on the form would then take this result, decide how it should be displayed (i.e. what color etc.), and put the information on the grid.
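A minimal sketch of that split, with entirely hypothetical names. The routines below would live in a non-visual unit and know nothing about the grid; the form fills in the plain values from the grid row and applies the colours based on the results.
// Corresponds to the "aCell/headerCellValue>1" check from the pseudocode.
function CellIsOverRatio(CellValue, HeaderValue: Double): Boolean;
begin
  Result := (HeaderValue <> 0) and (CellValue / HeaderValue > 1);
end;

// Stand-in for the "adding cell and headerCell values ... product>2" branch;
// the real calculation would go here.
function RowQualifiesForGreen(const CellValues, HeaderValues: array of Double): Boolean;
var
  i: Integer;
  Total: Double;
begin
  Total := 0;
  for i := Low(CellValues) to High(CellValues) do
    Total := Total + CellValues[i] + HeaderValues[i];
  Result := Total > 2;
end;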
You could consider extracting a UI-dependent unit. If the method is long and you want to extract some stuff from the class, it is reasonable to extract a strategy.
As Rob suggested, you could just pass the needed context to the strategy procedures. You could introduce a SheetRenderingStrategy with an appropriate abstract method and appropriate subclasses, e.g. HighlightTheseSpecialCellsStrategy. These classes are still part of the UI, but they probably clarify the intent and improve modularisation.
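A declaration-only skeleton of that idea, using the class names from this answer (the grid type is an assumption; pass whatever context the strategies actually need):
type
  TSheetRenderingStrategy = class abstract
  public
    procedure Apply(Grid: TStringGrid); virtual; abstract;
  end;

  THighlightTheseSpecialCellsStrategy = class(TSheetRenderingStrategy)
  public
    procedure Apply(Grid: TStringGrid); override;
  end;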
Answering a question about refactoring, without context, is like answering a question about design, without context. Because you are changing a design. I usually start out with "reasons to refactor". Maybe I have metrics that I'm exceeding. (Look, a class with 10,000 lines, surely it can be partitioned into several more-coherent, more-cohesive classes, with less coupling, and tighter cohesion).
So, if you find yourself with a lot of code that has several hundred if...else conditions in event handlers, as I often do, I would forget all about that event handler for a minute,
and reduce it, as the person said above, to a minimal object-oriented pattern:
if a then
  doA(fewer, parameters)
else if b then
  doB(is, generally, better)
else if c then
  doC;
...
Now, if doA,doB,and doC belong together in another object (They share state, and modify/control some particular set of fields), then I might move doA,doB,doC methods into another object.
In general, however, rather than drilling down to an individual case-by-case where event handlers do everything, I also find the following delphi-pattern handy:
procedure TForm1.BigGuiControlRightButtonClick(Sender; ...);
begin
  BigThingController.RightClickMenuHandler(Sender, ....)
end;

procedure TForm1.BigGuiControlDoSomeThing(Sender: TObject);
begin
  BigThingController.DoSomeThing;
end;

procedure TForm1.Print(Sender);
begin
  DocumentManager.Print(Document);
end;
I like it when my TForm methods are clear and readable. I don't like to see a lot of noise, and a lot of error-checking code. I find that applications that have been carefully maintained and debugged over years tend to grow iteratively towards a complete mess of unreadable spaghetti.
If the goal of refactoring is more than just to make the code look pretty, then the refactoring should also have some measurable quality goal too. Reduce defects, crashes, etc. Sometimes, I use refactoring as a time to remove features that are no longer useful, or implemented in a faulty way. So my code is more correct when I'm finished, and not just refactored to fit some ideal of how code should be written that doesn't change the quality the user experiences. I'm a Delphi developer, and I'm goal oriented, and quality oriented, and pragmatic rather than a stylizer. Other people may differ here.

Approaches for caching calculated values

In a Delphi application we are working on we have a big structure of related objects. Some of the properties of these objects have values which are calculated at runtime and I am looking for a way to cache the results for the more intensive calculations. An approach which I use is saving the value in a private member the first time it is calculated. Here's a short example:
unit Unit1;

interface

type
  TMyObject = class
  private
    FObject1, FObject2: TMyOtherObject;
    FMyCalculatedValue: Integer;
    function GetMyCalculatedValue: Integer;
  public
    property MyCalculatedValue: Integer read GetMyCalculatedValue;
  end;

implementation

function TMyObject.GetMyCalculatedValue: Integer;
begin
  if FMyCalculatedValue = 0 then
  begin
    FMyCalculatedValue :=
      FObject1.OtherCalculatedValue + // This is also calculated
      FObject2.OtherValue;
  end;
  Result := FMyCalculatedValue;
end;

end.
It is not uncommon that the objects used for the calculation change and the cached value should be reset and recalculated. So far we addressed this issue by using the observer pattern: objects implement an OnChange event so that others can subscribe, get notified when they change and reset cached values. This approach works but has some downsides:
It takes a lot of memory to manage subscriptions.
It doesn't scale well when a cached value depends on lots of objects (a list for example).
The dependency is not very specific (even if a cache value depends only on one property it will be reset also when other properties change).
Managing subscriptions impacts the overall performance and is hard to maintain (objects are deleted, moved, ...).
It is not clear how to deal with calculations depending on other calculated values.
And finally the question: can you suggest other approaches for implementing cached calculated values?
If you want to avoid the Observer pattern, you might try a hashing approach.
The idea is that you 'hash' the arguments and check whether this matches the 'hash' for which the state was saved. If it does not, then you recompute (and save the new hash as the key).
I know I make it sound like I just thought of it, but in fact it is used by well-known software.
For example, SCons (a Makefile alternative) does it to check whether a target needs to be rebuilt, in preference to a timestamp approach. We have used SCons for over a year now, and we have never seen a problem with a target that was not rebuilt, so their hashing works well!
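A minimal sketch of that idea applied to the original example (assuming an extra private field FLastInputsKey: string; a real implementation would use a proper hash or digest of the inputs rather than string concatenation):
function TMyObject.GetMyCalculatedValue: Integer;
var
  InputsKey: string;
begin
  // Build a key that identifies the current inputs to the calculation.
  InputsKey := IntToStr(FObject1.OtherCalculatedValue) + '|' +
               IntToStr(FObject2.OtherValue);
  if InputsKey <> FLastInputsKey then
  begin
    // Inputs changed since the cached value was computed: recalculate.
    FMyCalculatedValue := FObject1.OtherCalculatedValue + FObject2.OtherValue;
    FLastInputsKey := InputsKey;
  end;
  Result := FMyCalculatedValue;
end;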
You could store local copies of the external object values which are required. The access routine then compares the local copy with the external value, and only does the recalculation on a change.
Accessing the external objects' properties would likewise force a possible re-evaluation of those properties, so the system should keep itself up to date automatically, but only recalculate when it needs to. I don't know if you need to take steps to avoid circular dependencies.
This increases the amount of space you need for each object, but removes the observer pattern. It also defers all calculations until they are needed, instead of performing the calculation every time a source parameter changes. I hope this is relevant for your system.
unit Unit1;

interface

type
  TMyObject = class
  private
    FObject1, FObject2: TMyOtherObject;
    FObject1Val, FObject2Val: Integer;
    FMyCalculatedValue: Integer;
    function GetMyCalculatedValue: Integer;
  public
    property MyCalculatedValue: Integer read GetMyCalculatedValue;
  end;

implementation

function TMyObject.GetMyCalculatedValue: Integer;
begin
  if (FObject1.OtherCalculatedValue <> FObject1Val)
    or (FObject2.OtherValue <> FObject2Val) then
  begin
    FMyCalculatedValue :=
      FObject1.OtherCalculatedValue + // This is also calculated
      FObject2.OtherValue;
    FObject1Val := FObject1.OtherCalculatedValue;
    FObject2Val := FObject2.OtherValue;
  end;
  Result := FMyCalculatedValue;
end;

end.
In my work I use Bold for Delphi, which can manage unlimited complex structures of cached values depending on each other. Usually each variable only holds a small part of the problem. In this framework these are called derived attributes: derived because the value is not saved in the database, it just depends on other derived attributes or persistent attributes in the database.
The code behind such an attribute is written in Delphi as a procedure, or in OCL (Object Constraint Language) in the model. If you write it as Delphi code you have to subscribe to the depending variables. So if attribute C depends on A and B, then whenever A or B changes, the code to recalculate C is called automatically when C is read. So the first time C is read, A and B are also read (maybe from the database). As long as A and B are not changed, you can read C and get very fast performance. For complex calculations this can save quite a lot of CPU time.
The downside, and the bad news, is that Bold is not officially supported anymore and you cannot buy it either. I suppose you can get it if you ask enough people, but I don't know where you can download it. Around 2005-2006 it was downloadable for free from Borland, but not anymore.
It is not ready for D2009, as someone would have to port it to Unicode.
Another option is ECO with .NET from Capable Objects. ECO is a plugin for Visual Studio. It is a supported framework that has the same idea and author as Bold for Delphi. Many things are also improved, for example databinding is used for the GUI components. Both Bold and ECO use a model as a central point with classes, attributes and links. Those can be persisted in a database or an XML file. With the free version of ECO the model can have at most 12 classes, but as I remember there are no other limits.
Bold and ECO contain a lot more than derived attributes that makes you more productive and lets you think about the problem instead of technical details of the database or, in your case, how to cache values. You are welcome to ask more questions about those frameworks!
Edit:
There is actually a download link of Bold for Delphi for D7 for registered Embarcadero users, quite old... I know there were updates for D2005 and D2006.

Does avoiding functions increase the performance?

Here is a little test:
function inc(n: integer): integer;
begin
  n := n + 1;
  result := n;
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  start, i, n: integer;
begin
  n := 0;
  start := getTickCount;
  for i := 0 to 10000000 do begin
    inc(n); // calling inc function takes 73 ms
    //n := n+1; writing it directly takes 16 ms
  end;
  showMessage(inttostr(getTickCount - start));
end;
Yes, calling a function introduces an overhead. Before calling the function it's necessary to save the current state - which instruction was planned to execute next - and also to copy the function parameters. This requires extra work and extra time.
That's where inlining is helpful. If the compiler supports it, it can inject the function code directly at the call site and avoid the overhead. With good optimization of the surrounding code it can even decrease the amount of generated code.
This doesn't mean you need to avoid functions. In most cases the function body executes for much longer than the time needed to organize the call. Only in quite rare cases is the overhead worth optimizing, and this should never be done without the help of a profiler - otherwise you waste time and most likely just end up with a lot of unmaintainable code.
Calling a function (whichever language you're working with) generally involves doing a bit more things, like saving some context, pushing parameters to some kind of stack, calling the function itself, reading the parameters, and then pushing the result back somewhere, returning from the function, extracting the return value, ...
So, of course, calling functions generally means having some overhead.
But the main point of functions is re-using some parts of code : maybe it will take a few micro-seconds more at execution, but if you only have to write some code once, instead of 10 (or more) times, there is a huge gain ; and that code will be much easier to maintain, which is really important in the long term.
That said, you might want to avoid functions for really small pieces of code like the one you provided as an example (well, except if the language you're using provides some kind of inlining -- it's the case for C, if I remember correctly; not sure about Delphi, though): the overhead of calling the function will be significant compared to the number of lines of code the function saves you from writing (here: none! On the contrary ^^).
But for bigger chunks of code, the overhead will be much smaller compared to the time taken to execute the code the function contains...
Premature optimization is the root of all evil...
Write correct and maintainable code using the known features (here the built-in pseudo(magic) procedure inc), benchmark it and refactor where it's needed for performance reason (if any).
I bet that in 99.9% of the cases, avoiding calling a function or procedure is not the solution.
Here is an example where adding a call to a procedure actually IS the optimization.
Only optimize when there is a bottleneck.
Your current code is perfectly fine for about 99.9% of the cases.
If it gets slow, use a profiler to point you at the bottleneck.
When the bottleneck appears to be in the inc function, then you can always inline your function by marking it with the 'inline' directive.
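A minimal sketch (the function is renamed here to avoid clashing with the built-in Inc, as another answer below also suggests):
function MyInc(n: Integer): Integer; inline;
begin
  Result := n + 1;
end;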
I totally agree with Francois on this one.
One of the most expensive parts of a function call is the returning of the result.
If you did want to keep your program modular, but wanted to save a bit of time, change your function to a procedure and use a var parameter to retrieve the result.
So for your example:
procedure inc(var n: integer);
begin
  n := n + 1;
end;
should be considerably faster than using your inc function.
Also, in the loop in your example, you have the statement:
inc(n)
but this will not update the value of n. The loop will finish and n will have the value of 0. What you need instead is:
n := inc(n);
For your timings, do you have optimization on? If you do, then it may not be timing what you think it is. The value of n is not used by the program and may be optimized right out of it.
To make sure that n is used for the timings, you can simply display the value of n in your showMessage line.
Finally, inc is a built-in procedure. It is not good practice to use the same name as that of a built-in procedure, as it can cause doubts as to which procedure is being executed - yours or the built-in one.
Change your function's name to myinc, and then do a third test with the built-in inc procedure itself, to see if it is faster than n := n + 1;
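A sketch of the three variants side by side, with myinc being the renamed version of the question's function (TimeVariants is just a placeholder name):
function myinc(n: integer): integer;
begin
  result := n + 1;
end;

procedure TimeVariants;
var
  n: integer;
begin
  n := 0;
  n := myinc(n);  // 1. explicit function call
  n := n + 1;     // 2. direct increment
  inc(n);         // 3. built-in, compiler-intrinsic Inc
end;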
As others before me said: yes, it does. Every line of code you write does. Functions need to save the current state of registers etc. before they can execute, and restore it afterwards.
But the overhead is so minimal that optimizing it means nothing. It is more important to have readable, well-structured code. Almost always. There may be rare cases when every nanosecond is important, but I cannot imagine one right now.
Look here for general guidelines about performance in delphi programs:
http://effovex.com/OptimalCode/opguide.htm
Just want to add some comments specific to Delphi:
I think I remember that getTickCount() has a resolution a bit too coarse for this kind of test (+/- 10-15 ms). You could use QueryPerformanceCounter() for a better result.
For small functions called a lot of times (inside processing loops, data conversion, ...), use INLINE (search the help).
But to know for real what a function costs, and whether you should do something about it, use a profiler!! I use http://www.prodelphi.de/, it's pretty simple, very useful, and the price is very reasonable compared to other profilers (i.e. +/- 50€ instead of 500€).
In Delphi, there is the inc() procedure. It's faster than "n := n+1", because inc() is not really a function: it is replaced by the compiler with asm, i.e. there is no source code for inc().
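A minimal sketch of the timing loop using QueryPerformanceCounter. The unit names assume a recent Delphi (use Windows, SysUtils, Dialogs in older versions), and displaying n keeps the loop from being optimized away, as another answer points out.
uses
  Winapi.Windows, System.SysUtils, Vcl.Dialogs;

procedure TimeIncrementLoop;
var
  Freq, StartCount, StopCount: Int64;
  i, n: Integer;
begin
  QueryPerformanceFrequency(Freq);
  QueryPerformanceCounter(StartCount);
  n := 0;
  for i := 0 to 10000000 do
    n := n + 1;
  QueryPerformanceCounter(StopCount);
  // Elapsed time in milliseconds.
  ShowMessage(Format('%.3f ms (n = %d)',
    [(StopCount - StartCount) * 1000.0 / Freq, n]));
end;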
All good comments.
Functions are supposed to be useful, that's why they're in the language. The assumption is that if they have a nominal cost, you are willing to pay that to get the utility they provide.
Here's the real problem with functions, no matter who writes them, but especially if somebody other than you wrote them.
They have an implied contract for what they're supposed to do, but they have no contract for how long they should take.
Usually the person who writes the function thinks "This function does something valuable, so the person who calls it will respect that, and use it sparingly."
Then the person who calls it thinks "This function does so much in only a single call that I can make my code really clean and powerful by calling it lots of times."
Now, with multiple layers of abstraction, this effect acts like compound interest.
So, the real performance problem with functions is not the cost of calls, it is the psychology of programmers, leading to exponential slowdown.
Fortunately, experience in performance tuning can ameliorate this problem.
