Generics, Polymorphism, Interfaces: what is the solution? - delphi

I know the title is very wide, spanning over a lot!
And I hope that this question might evolve into a bigger "info wiki thingy" on these subjects.
What I have learned - so far:
When using generics, understand the concepts of covariance and contravariance.
Do NOT "misuse" generics combined with inheritance. I did, and it can make you head directly into covariance problems! Make sure you "break off" the generic at the correct point in your inheritance, if you are combining the two.
(Please correct me if you think I'm wrong, missing something, or have misunderstood anything.)
My problem was:
By now I've spent countless hours trying to figure out how to solve this "big puzzle" I have on my desk. I've gotten some good answers from several of you SO users already - but now it's time to get something working on a bigger scale.
I ventured into Generics with this one:
Generics and Polymorphism working together
And now I'm kinda stuck on this one:
Situations where Generics won't work
Why I end up with covariance problems is because of the class procedure in my hierarchy.
So I'm wondering if interfaces are my next bold move in this "saga".
How does one "step over" a covariance problem?
One thing is to find out that you actually have this problem - another thing is "how to work around it".
So IF any of you good people "out there" have any opinions on this - I'm all ears.
Basically :
Tell me to go for interfaces (I have never written one from scratch myself).
Or... throw me a bone in the direction you would suggest.
My current source pool is as stated in the second link from the top.
Here is a small snippet from my earlier post that shows my covariance problem.
David kindly explained why I ran into the bush - but now I need info on how to run around it.
var
  aList: TBaseList<TBaseObject>; // used as a list parameter for methods
  aPersonList: TPersonList<TPerson>;
  aCustomerList: TCustomerList<TCustomer>;
begin
  aPersonList := TPersonList<TPerson>.Create;
  aCustomerList := TCustomerList<TCustomer>.Create;
  aList := aCustomerList; // <-- this FAILS! The types are not assignment compatible.
end;
Regards

You can't do what you want to do, but that is not how you use generics anyway. As Rob Kennedy said, it makes no sense to have a TCustomerList<TCustomer> and a TPersonList<TPerson>. The beauty of generics is that you can use the same list class for different element types. That means that the list and the element type must not have any dependencies on each other.
You can do something like:
// Declared in the class as: procedure ProcessList<T: TBaseObject>(const aList: TBaseList<T>);
procedure TSomething.ProcessList<T>(const aList: TBaseList<T>);
begin
  // Process the list using code that is independent of the actual type of T.
end;
...
var
  aCustomerList: TBaseList<TCustomer>;
  aPersonList: TBaseList<TPerson>;
begin
  ProcessList(aCustomerList);
  ProcessList(aPersonList);
You may have to specify T explicitly (some early versions of the compiler did not handle type inference - i.e. inferring the type of T from the type of the parameter - very well), i.e.
ProcessList<TCustomer>(aCustomerList);
ProcessList<TPerson>(aPersonList);
But that, or something similar, is what you should do. Anything else doesn't make sense, IMO. There is no need to have a variable that could hold any of these lists, like your aList. And if you really need one, you can only use TObject, but that doesn't allow you to use the list in any useful way. And it is not very generic.
Interfaces won't help you at all with this problem. You can give classes certain capabilities, i.e. also the elements of the lists, through interfaces (another kind of polymorphism). But that won't handle covariance.
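For what it's worth, since the question mentions never having written an interface from scratch, here is a minimal sketch of giving a class a capability through an interface. INamed and TNamedThing are made-up names, and TInterfacedObject is used here only because it supplies the IInterface plumbing; in your own hierarchy the base class would have to provide that itself. None of this changes the covariance situation.
type
  INamed = interface
    ['{8A2E7C34-5D1B-4F6E-9A0C-3B7D21E84F55}'] // any GUID; Ctrl+Shift+G generates one in the IDE
    function GetName: string;
  end;

  // TInterfacedObject provides the reference-counted IInterface implementation.
  TNamedThing = class(TInterfacedObject, INamed)
  private
    FName: string;
  public
    function GetName: string;
  end;

function TNamedThing.GetName: string;
begin
  Result := FName;
end;
Any class that implements INamed can then be treated uniformly through that interface, but two differently parameterized lists still won't become assignment compatible.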

I would go for:
type
  TCustomCustomerList = class(TBaseList<TBaseObject>)
  end;

  TCustomerList = class(TCustomCustomerList)
  end;
Whether or not this is acceptable in your design is a totally different matter. If the goal you are trying to achieve is to assign a TCustomerList to a TBaseList variable, that would be the way to go.
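With that layout the assignment from the question compiles, because TCustomerList now descends from TBaseList<TBaseObject>. A rough sketch, assuming TBaseList exposes an Add method:
var
  aList: TBaseList<TBaseObject>;
  aCustomerList: TCustomerList;
begin
  aCustomerList := TCustomerList.Create;
  aList := aCustomerList; // OK: TCustomerList is a TBaseList<TBaseObject>
  aList.Add(TCustomer.Create); // accepted, since TCustomer is a TBaseObject
end;
The trade-off is that the element type is now TBaseObject everywhere, so reading items back out requires casts to TCustomer.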

Related

Delphi anonymous methods - pros and cons. Good practices when using closures (anonymous methods) in Delphi

I have a colleague on my team who uses closures extensively in our Delphi projects. Personally, I don't like this because it makes the code harder to read, and I believe that closures should be used ONLY when you need them.
On the other hand, I've read Can someone explain Anonymous methods to me? and other related links, and I accept that maybe I'm wrong, so I'm asking you to give me some examples of when it is better to use closures instead of an 'old-fashioned' approach (not using closures).
I believe that this question calls for a very subjective judgement. I am an old-school Delphi developer, and inclined to agree with you. Not only do closures add certain risks (as David H points out in the comments), they also reduce readability for all classically trained Delphi developers. So why were they added to the language at all? In Delphi XE, the syntax-formatting function and closures weren't working well together, for example, and this increased my mistrust of closures; how much stuff gets added to the Delphi compiler that the IDE hasn't been fully upgraded to support? You know you're a cranky old-timer when you admit publicly that you would have been happy if the Delphi language had been frozen at the Delphi 7 level and never improved again. But Delphi is a living, powerful, evolving syntax. And that's a good thing. Repeat that to yourself when you find the old crank taking over. Give it a try.
I can think of at least ten places where Anonymous methods really make sense and thus, reasons why you should use them, notwithstanding my earlier comment that I mistrust them. I will only point out the two that I have decided to personally use, and the limits that I place on myself when I use them:
Sort methods in the Generics.Collections container classes accept an anonymous method so that you can easily provide a bit of sorting logic without having to write a regular (non-anonymous) function matching the signature the sort method expects. The new generics syntax fits hand in hand with this style, and though it looks alien at first, it grows on you and becomes, if not exactly nice to use, at least more convenient than the alternatives.
TThread methods like Synchronize are overloaded; in addition to accepting a parameterless TThreadMethod, as in Thread.Synchronize(aClassMethodWithoutParameters), they now also accept an anonymous method. It has always been a pain to get parameters into that synchronized call; now you can use a closure and capture the parameters directly.
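A rough sketch of that (TMyThread, Form1 and UpdateStatus are hypothetical names, and the anonymous-method overload of Synchronize needs Delphi 2009 or later):
procedure TMyThread.Execute;
var
  Progress: Integer;
begin
  Progress := 0;
  while not Terminated do
  begin
    Inc(Progress);
    // The closure captures Progress, so no field or parameterless wrapper
    // method is needed just to ferry the value across to the main thread.
    Synchronize(
      procedure
      begin
        Form1.UpdateStatus(Progress);
      end);
    Sleep(100); // Sleep from SysUtils, just to pace the example
  end;
end;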
Limits that I recommend in writing anonymous methods:
A. I have a personal rule of thumb of only ONE closure per function, and whenever there is more than one, refactor out that bit of code to its own method. This keeps the cyclomatic complexity of your "methods" from going insane.
B. Also, inside each closure, I prefer to have only a single method invocation, and its parameters, and if I end up writing giant blocks of code, I rewrite those to be methods. Closures are for variable capture, not a carte-blanche for writing endlessly-twisted spaghetti code.
Sample sort:
var
  aContainer: TList<TPair<String, Integer>>;
begin
  aContainer.Sort(
    TMyComparer.Construct(
      function(const L, R: TPair<String, Integer>): Integer
      begin
        Result := SysUtils.CompareStr(L.Key, R.Key);
      end) {Construct end} ); {aContainer.Sort end}
end;
Update: one comment points to "language uglification". I believe the uglification refers to the difference between having to write:
x.Sort(
  TMyComparer.Construct(
    function(const L, R: TPair<String, Integer>): Integer
    begin
      Result := SysUtils.CompareStr(L.Key, R.Key);
    end ) );
Instead of the following hypothetical duck-typed (or should I say type-inferred) syntax that I just invented here for comparison:
x.Sort( lambda( [L,R], [ SysUtils.CompareStr(L.Key,R.Key) ] ) )
Some other languages like Smalltalk and Python can write lambdas more compactly because they are dynamically typed. The need for an IComparer, for example, as the type passed to a container's Sort() method, is an example of the complexity caused by the interface flavour that strongly typed languages with generics have to follow in order to implement traits like ordering, which sortability requires. I don't think there was a nicer way to do this. Personally I hate seeing procedure, begin and end keywords inside a function invocation's parentheses, but I don't see what else could reasonably have been done.

Do record_info and tuple_to_list return the same key order in Erlang?

I.e., if I have a record
-record(one, {frag, left}).
Is record_info(fields, one) always going to return [frag, left]?
Is tl(tuple_to_list(#one{frag = "Frag", left = "Left"})) always going to be ["Frag", "Left"]?
Is this an implementation detail?
Thanks a lot!
The short answer is: yes, as of this writing it will work. The better answer is: it may not work that way in the future, and the nature of the question concerns me.
It's safe to use record_info/2, although relying on the order may be risky, and frankly I can't think of a situation where doing so makes sense, which implies that you are solving the problem the wrong way. Can you share more details about what exactly you are trying to accomplish so we can help you choose a better method? It could be that simple pattern matching is all you need.
As for the example with tuple_to_list/1, I'll quote from "Erlang Programming" by Cesarini and Thompson:
"... whatever you do, never, ever use the tuple representations of records in your programs. If you do, the authors of this book will disown you and deny any involvement in helping you learn Erlang."
There are several good reasons why, including:
Your code will become brittle - if you later change the number of fields or their order, your code will break.
There is no guarantee that the internal representation of records will continue to work this way in future versions of Erlang.
Yes, the order is always the same, because records are represented by tuples, for which order is an essential property. Also see my other answer about records, with examples: Syntax Error while accessing a field in a record
Yes, in both cases Erlang will retain the 'original' order. And yes, it's an implementation detail, as it's not specifically addressed in the function spec or documentation, though it's a pretty safe bet it will stay like that.

How to get the entire code of a method in memory so I can calculate its hash at runtime?

How to get the entire code of a method in memory so I can calculate its hash at runtime?
I need to make a function like this:
type
  TProcedureOfObject = procedure of object;

function TForm1.CalculateHashValue(AMethod: TProcedureOfObject): string;
var
  MemStream: TMemoryStream;
begin
  Result := '';
  MemStream := TMemoryStream.Create;
  try
    // how do I get the code of AMethod into the TMemoryStream?
    Result := MD5(MemStream); // I already have the MD5 function
  finally
    MemStream.Free;
  end;
end;
I use Delphi 7.
Edit:
Thank you to Marcelo Cantos & gabr for pointing out that there is no consistent way to find the procedure size due to compiler optimization. And thank you to Ken Bourassa for reminding me of the risks. The target procedure (the procedure whose hash I would like to compute) is my own, and I don't call any other routines from it, so I can guarantee that it won't change.
After reading the answers and Delphi 7 help file about the $O directive, I have an idea.
I'll make the target procedure like this:
procedure TForm1.TargetProcedure(Sender: TObject);
begin
{$O-}
  // do things here
  asm
    nop
    nop
    nop
    nop
    nop
  end;
{$O+}
end;
The 5 successive nops at the end of the procedure would act like a bookmark. One could predict the end of the procedure with gabr's trick, and then scan for the 5 nops nearby to find out the hopefully correct size.
Now while this idea sounds worth trying, I...uhm... don't know how to put it into working Delphi code. I have no experience with lower-level programming, like how to get the entry point and put the entire code of the target procedure into a TMemoryStream while scanning for the 5 nops.
I'd be very grateful if someone could show me some practical examples.
Marcelo has correctly stated that this is not possible in general.
The usual workaround is to use the address of the method that you want to calculate the hash for and the address of the next method. For the time being the compiler lays out methods in the same order as they are defined in the source code, so this trick works.
Be aware that subtracting two method addresses may give you a slightly too-large result - the first method may actually end a few bytes before the next method starts.
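A rough sketch of that workaround, building on the CalculateHashValue function from the question (the extra ANextMethod parameter is something I'm adding for illustration; you would pass whichever method the compiler laid out immediately after the target, and the whole thing relies on undocumented layout behaviour):
function TForm1.CalculateHashValue(AMethod, ANextMethod: TProcedureOfObject): string;
var
  MemStream: TMemoryStream;
  StartAddr, EndAddr: PByte;
  Size: Cardinal;
begin
  Result := '';
  // TMethod exposes the Code/Data pair behind a method pointer.
  StartAddr := TMethod(AMethod).Code;
  EndAddr := TMethod(ANextMethod).Code;
  Size := Cardinal(EndAddr) - Cardinal(StartAddr); // may be a few bytes too large
  MemStream := TMemoryStream.Create;
  try
    MemStream.WriteBuffer(StartAddr^, Size);
    Result := MD5(MemStream); // the MD5 function from the question
  finally
    MemStream.Free;
  end;
end;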
The only way I can think of is turning on TD32 debug info and trying JCLDebug to see if you can find the length of the routine in the debug info. Relocation shouldn't affect the length, so the length in the binary should be the same as in memory.
Another way would be to scan the code for a ret opcode. That is less safe, but it would probably guard at least part of the function without having to mess with debug info.
The potential deal breaker, though, is short routines that are tail-call optimized (in other words, they jump instead of ret). But I don't know if Delphi does that.
You might struggle with this. Functions are defined by their entry point, but I don't think that there is any consistent way to find out the size. In fact, optimisers can do screwy things like merge two similar functions into a common shared function with multiple entry points (whether or not Delphi does stuff like this, I don't know).
EDIT: The 5-nop trick isn't guaranteed to work either. In addition to Remy's caveats (see his comment below), the compiler merely has to guarantee that the nops are the last thing to execute, not that they are the last thing to appear in the function's binary image. Turning off optimisations is a rather baroque "solution" that still won't fix all the issues that others have raised.
In short, there are simply too many variables here for what you are trying to do. A better approach would be to target compilation units for checksumming (assuming it satisfies whatever overall objective you have).
I achieve this by letting Delphi generate a MAP file and sorting the symbols by their start address in ascending order. The length of each procedure or method is then the next symbol's start address minus this symbol's start address. This is most likely as brittle as the other solutions suggested here, but I have this code working in production right now and it has worked fine for me so far.
My implementation that reads the map file and calculates sizes can be found here at line 3615 (TEditorForm.RemoveUnusedCode).
Even if you achieve it, there are a few things you need to be aware of...
The hash will change many times, even if the function itself didn't change.
For example, the hash will change if your function calls another function that changed address since the last build. I think the hash might also change if your function calls itself recursively and your unit (not necessarily your function) changed since the last build.
As for how it could be achieved, gabr's suggestion seems to be the best one... but it's really prone to breaking over time.

Is there an idiomatic way to order function arguments in Erlang?

It seems inconsistent in the lists module. For example, split has the number as the first argument and the list as the second, but sublist has the list as the first argument and the length as the second.
OK, a little history as I remember it and some principles behind my style.
As Christian has said the libraries evolved and tended to get the argument order and feel from the impulses we were getting just then. So for example the reason why element/setelement have the argument order they do is because it matches the arg/3 predicate in Prolog; logical then but not now. Often we would have the thing being worked on first, but unfortunately not always. This is often a good choice as it allows "optional" arguments to be conveniently added to the end; for example string:substr/2/3. Functions with the thing as the last argument were often influenced by functional languages with currying, for example Haskell, where it is very easy to use currying and partial evaluation to build specific functions which can then be applied to the thing. This is very noticeable in the higher order functions in lists.
The only influence we didn't have was from the OO world. :-)
Usually we at least managed to be consistent within a module, but not always. See lists again. We did try to have some consistency, so the argument order in the higher order functions in dict/sets match those of the corresponding functions in lists.
The problem was also aggravated by the fact that we, especially me, had a rather cavalier attitude to libraries. I just did not see them as a selling point for the language, so I wasn't that worried about it. "If you want a library which does something then you just write it" was my motto. This meant that my libraries were structured, just not always with the same structure. :-) That was how many of the initial libraries came about.
This, of course, creates unnecessary confusion and breaks the law of least astonishment, but we have not been able to do anything about it. Any suggestions of revising the modules have always been met with a resounding "no".
My own personal style is usually structured, though I don't know whether it conforms to any written guidelines or standards.
I generally have the thing or things I am working on as the first arguments, or at least very close to the beginning; the order depends on what feels best. If there is a global state which is chained through the whole module, which there usually is, it is placed as the last argument and given a very descriptive name like St0, St1, ... (I belong to the church of short variable names). Arguments which are chained through functions (both input and output) I try to keep the same argument order as return order. This makes it much easier to see the structure of the code. Apart from that I try to group together arguments which belong together. Also, where possible, I try to preserve the same argument order throughout a whole module.
None of this is very revolutionary, but I find if you keep a consistent style then it is one less thing to worry about and it makes your code feel better and generally be more readable. Also I will actually rewrite code if the argument order feels wrong.
A small example which may help:
fubar({f,A0,B0}, Arg2, Ch0, Arg4, St0) ->
    {A1,Ch1,St1} = foo(A0, Arg2, Ch0, St0),
    {B1,Ch2,St2} = bar(B0, Arg4, Ch1, St1),
    Res = baz(A1, B1),
    {Res,Ch2,St2}.
Here Ch is a local chained through variable while St is a more global state. Check out the code on github for LFE, especially the compiler, if you want a longer example.
This became much longer than it should have been, sorry.
P.S. I used the word thing instead of object to avoid confusion about what I was talking about.
No, there is no consistently-used idiom in the sense that you mean.
However, there are some useful relevant hints that apply especially when you're going to be making deeply recursive calls. For instance, keeping whichever arguments will remain unchanged during tail calls in the same order/position in the argument list allows the virtual machine to make some very nice optimizations.

Should I use block identifiers ("end;") in my code?

Code Complete says it is good practice to always use block identifiers, both for clarity and as a defensive measure.
Since reading that book, I've been doing that religiously. Sometimes it seems excessive though, as in the case below.
Is Steve McConnell right to insist on always using block identifiers? Which of these would you use?
//naughty and brief
with myGrid do
  for currRow := FixedRows to RowCount - 1 do
    if RowChanged(currRow) then
      if not(RecordExists(currRow)) then
        InsertNewRecord(currRow)
      else
        UpdateExistingRecord(currRow);
//well behaved and verbose
with myGrid do begin
  for currRow := FixedRows to RowCount - 1 do begin
    if RowChanged(currRow) then begin
      if not(RecordExists(currRow)) then begin
        InsertNewRecord(currRow);
      end //if it didn't exist, so insert it
      else begin
        UpdateExistingRecord(currRow);
      end; //else it existed, so update it
    end; //if any change
  end; //for each row in the grid
end; //with myGrid
I have always followed the 'well-behaved and verbose' style, except for those unnecessary extra comments at the end of blocks.
Somehow it makes more sense to be able to look at code and make sense of it quickly, rather than having to spend at least a couple of seconds deciphering which block ends where.
Tip: the Visual Studio keyboard shortcut for jumping between the beginning and end of a C# block is Ctrl + ].
If you use Visual Studio, having curly braces at the beginning and end of a block also helps because you have that keyboard shortcut to jump between them.
Personally, I prefer the first one, as IMHO the "end;" doesn't tell me much, and once everything is closed I can tell by the indentation what happens when.
I believe blocks are more useful for large statements. You could take a mixed approach, where you insert a few "begin ... end;"s and comment what they are ending (for instance, use them for the with and the first if).
IMHO you could also break this into more methods, for example, the part
if not(RecordExists(currRow)) then begin
  InsertNewRecord(currRow);
end //if it didn't exist, so insert it
else begin
  UpdateExistingRecord(currRow);
end; //else it existed, so update it
could be in a separate method.
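A sketch of that extraction (the method name and owning class are made up):
procedure TMyForm.InsertOrUpdateRecord(currRow: Integer);
begin
  if RecordExists(currRow) then
    UpdateExistingRecord(currRow)
  else
    InsertNewRecord(currRow);
end;
The loop body then shrinks to a single call, which makes the begin/end question largely moot for that block.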
I would use whichever my company has set for its coding standards.
That being said, I would prefer to use the second, more verbose, block. It is a lot easier to read. I might, however, leave off the block-ending comments in some cases.
I think it depends somewhat on the situation. Sometimes you simply have a method like this:
void Foo(bool state)
{
    if (state)
        TakeActionA();
    else
        TakeActionB();
}
I don't see how making it look like this:
void Foo(bool state)
{
    if (state)
    {
        TakeActionA();
    }
    else
    {
        TakeActionB();
    }
}
improves readability at all.
I'm a Python developer, so I see no need for block identifiers. I'm quite happy without them. Indentation is enough of an indicator for me.
Block identifiers are not only easier to read, they are also much less error prone: if you change something in the if/else logic, or simply add a line, you might not notice that the line is not in the same logical block as the rest of the code.
I would use the second code block. The first one looks prettier and more familiar, but I think that is a problem of the language and not of the block identifiers.
Where possible, I use Checkstyle to ensure that brackets are used.
If I remember correctly, Code Complete also gives some advice about comments - especially about not restating in comments what the code does, but explaining why it does it.
I'd say he's right, if only because the code can still be interpreted correctly when the indentation is wrong. I always like to be able to find the start and end block identifiers for loops when I skim through code, and not rely on proper indentation.
It's never always one way or the other. Because I trust myself, I would use the shorter, more terse style. But if you're in a team environment where not everyone is of the same skill and maintainability is important, you may want to opt for the latter.
My knee-jerk reaction would be the second listing (with the repetitive comments removed from the end of the lines, like everyone's been saying), but after thinking about it more deeply I'd go with the first plus a one or two line useful comment beforehand explaining what's going on (if needed). Obviously in this toy example, even the comment before the concise answer would probably not be needed, but in other examples it might.
Having less (but still readable) and easy to understand code on the screen helps keep your brain space free for future parts of the code IMO.
I'm with those who prefer more concise code.
And it looks like preferring a verbose version over a concise one is more of a personal choice than a matter of universal suitability. (Well, within a company it may become a (mini-)universal rule.)
It's like excessive parentheses: some people prefer it like (F1 and F2) or ((not F2) and F3) or (A - (B * C)) < 0, and not necessarily because they do not know about the precedence rules. It's just more clear to them that way.
I vote for a happy medium. The rule I would use is to use the bracketing keywords any time the content is multiple lines. In action:
// clear and succinct
with myGrid do begin
  for currRow := FixedRows to RowCount - 1 do begin
    if RowChanged(currRow) then begin
      if not(RecordExists(currRow)) then
        InsertNewRecord(currRow)
      else
        UpdateExistingRecord(currRow);
    end; // if RowChanged
  end; // next currRow
end; // with myGrid
Commenting the end is really useful for HTML-like languages, and also for malformed C code like an infinite succession of if/else/if/else.
Frequent // comments at the end of code lines (per your well-behaved and verbose example) make the code harder to read, IMHO - when I see them I end up scanning the 'obvious' comments for something special that typically isn't there.
I prefer comments only where the obvious isn't (i.e. overall and/or unique functionality).
Personally I recommend always using block identifiers in languages that support them (but follow your company's coding standards, as #Muad'Dib suggests).
The reason is that, in non-Pythonic languages, whitespace is (generally) not meaningful to the compiler but it is to humans.
So
with myGrid do
  for currRow := FixedRows to RowCount - 1 do
    if RowChanged(currRow) then
      Log(currRow);
      if not(RecordExists(currRow)) then
        InsertNewRecord(currRow)
      else
        UpdateExistingRecord(currRow);
appears to do one thing but does something quite different.
I would eliminate the end-of-line comments, though. Use an IDE that highlights blocks. I think Castalia will do that for Delphi. How often do you read code printouts anymore?
