I've just watched a talk on security considerations for railway systems from last year's 32C3.
At minute 25 the speaker briefly talks about Ada. Specifically he says:
Typical Ada implementations have a mechanism called "(tramp / trunk /
?) lines". And that means it will execute code on [the] stack which is
not very good for C programs. And [...] if you want to link Ada code with C
libraries, one of the security mechanisms won't work.
Here is a link (YouTube) to the respective part of the talk. This is the slide in the background. As you see I am unsure about one of the words. Perhaps it's trampolines?
Now my blunt question: Is there any truth in this statement? If so, can anyone elaborate on this mysterious feature of the Ada language and the security mechanism it apparently influences?
Until now I always assumed that code lives in a code segment (aka "text") whereas data (including the stack) is placed in a data segment at a different memory location (as depicted in this graphic). And reading about memory management in Ada suggests it should not be much different there.
While there are ways to circumvent such a layout (see e.g. this "C on stack" question and this "C on heap" answer), I believe modern OSes would commonly prevent such attempts via executable space protection unless the stack is explicitly made executable. - However, for embedded systems it may be still an issue if the code is not kept on ROM (can anyone clarify?).
They're called "trampolines". Here's my understanding of what they're for, although I'm not a GNAT expert, so some of my understanding could be mistaken.
Background: Ada (unlike C) supports nested subprograms. A nested subprogram is able to access the local variables of enclosing subprograms. For example:
procedure Outer is
Some_Variable : Integer;
procedure Inner is
begin
...
Some_Variable := Some_Variable + 1;
...
Since each procedure has its own stack frame that holds its own local variables, there has to be a way for Inner to get at Outer's stack frame, so that it can access Some_Variable, either when Outer calls Inner, or Outer calls some other nested subprograms that call Inner. A typical implementation is to pass a hidden parameter to Inner, often called a "static link", that points to Outer's stack frame. Now Inner can use that to access Some_Variable.
The fun starts when you use Inner'Access, which is an access procedure type. This can be used to store the address of Inner in a variable of an access procedure type. Other subprograms can later use that variable to call the prodcedure indirectly. If you use 'Access, the variable has to be declared inside Outer--you can't store the procedure access in a variable outside Outer, because then someone could call it later, after Outer has exited and its local variables no longer exist. GNAT and other Ada compilers have an 'Unrestricted_Access attribute that gets around this restriction, so that Outer could call some outside subprogram that indirectly calls Inner. But you have to be very careful when using it, because if you call it at the wrong time, the result would be havoc.
Anyway, the problem arises because when Inner'Access is stored in a variable and later used to call Inner indirectly, the hidden parameter with the static link has to be used when calling Inner. So how does the indirect caller know what static link to pass?
One solution (Irvine Compiler's, and probably others) is to make variables of this access type have two values--the procedure address, and the static link (so an access procedure is a "fat pointer", not a simple pointer). Then a call to that procedure will always pass the static link, in addition to other parameters (if any). [In Irvine Compiler's implementation, the static link inside the pointer will be null if it's actually pointing to a global procedure, so that the code knows not to pass the hidden parameter in that case.] The drawback is that this doesn't work when passing the procedure address as a callback parameter to a C routine (something very commonly done in Ada libraries that sit on top of C graphics libraries like gtk). The C library routines don't know how to handle fat pointers like this.
GNAT uses, or at one time used, trampolines to get around this. Basically, when it sees Inner'Unrestricted_Access', it will generate new code (a "trampoline") on the fly. This trampoline calls Inner with the correct static link (the value of the link will be embedded in the code). The access value will then be a thin pointer, just one address, which is the address of the trampoline. Thus, when the C code calls the callback, it calls the trampoline, which then adds the hidden parameter to the parameter list and calls Inner.
This solves the problem, but creates a security issue when the trampoline is generated on the stack.
Edit: I erred when referring to GNAT's implementation in the present tense. I last looked at this a few years ago, and I don't really know whether GNAT still does things this way. [Simon has better information about this.] By the way, I do think it's possible to use trampolines but not put them on the stack, which I think would reduce the security issues. When I last investigated this, if I recall correctly, Windows had started preventing code on the stack from being executed, but it also allowed programs to request memory that could be used to dynamically generate code that could be executed.
A presentation in 2003 on Ada for secure applications (D. Wheeler, SigAda 2003) supports this on page 7 : (quote)
How do Ada and security match poorly?
...
Ada implementations typically need to execute code on the stack("trampolines", e.g. for access values to nested subprograms).
In other (C) words, for function pointers, where the subprograms are nested within other subprograms.
(Speculating : presumably these function pointers are on the stack so they go out of scope when you leave the scope of the outer subprogram)
HOWEVER
A quick search also showed this gcc mailing list message :
[Ada] remove trampolines from front end
dated 2007, which refers to enabling Gnat executables to run on systems with DEP (data execution protection), by eliminating precisely this problematic feature.
This is NOT an authoritative answer but it seems that while "typical" Ada implementations do (or did) so, it may not be the case for at least Gnat this side of 2007, thanks to protection systems on newer hardware driving the necessary changes to the compiler.
Or : definitely true at one time, but possibly no longer true today, at least for Gnat.
I would welcome a more in-depth answer from a real expert...
EDIT : Adam's thorough answer states this is still true of Gnat, so my optimism should be tempered until further information.
FSF GCC 5 generates trampolines under circumstances documented here. This becomes problematic when the trampolines are actually used; in particular, when code takes ’Access or ’Unrestricted_Access of nested subprograms.
You can detect when your code does this by using
pragma Restrictions (No_Implicit_Dynamic_Code);
which needs to be used as a configuration pragma (although you won’t necessarily get warnings at compilation time, see PR 67205). The pragma is documented here.
You used to set configuration pragmas simply by including them in a file gnat.adc. If you’re using gnatmake you can also use the switch -gnatec=foo.adc. gprbuild doesn’t see gnat.adc; instead set global configurations in package Builder in your project file,
package Builder is
for Global_Configuration_Pragmas use "foo.adc";
end Builder;
Violations end up with compilation errors like
$ gprbuild -P trampoline tramp
gcc -c tramp.adb
tramp.adb:26:12: violation of restriction "No_Implicit_Dynamic_Code" at /Users/simon/cortex-gnat-rts/test-arduino-due/gnat.adc:1
Related
Status: Sort of solved. Switching Lua.Ref (close equivalent to LuaD LuaObject) to struct as suggested in answer has solved most issues related to freeing references, and I changed back to similar mechanism LuaD uses. More about this in the end.
In one of my project, I am working with Lua interface. I have mainly borrowed the ideas from LuaD. The mechanism in LuaD uses lua_ref & lua_unref to be able to move lua table/function references in D space, but this causes heavy problems because the calls to destructors and their order is not guaranteed. LuaD usually segfaults at least at the program exit.
Because it seems that LuaD is not maintained anymore, I decided to write my own interface for my purposes. My Lua interface class is here: https://github.com/mkoskim/games/blob/master/engine/util/lua.d
Usage examples can be found here:
https://github.com/mkoskim/games/blob/master/demo/luasketch/luademo.d
And in case you need, the Lua script used by the example is here:
https://github.com/mkoskim/games/blob/master/demo/luasketch/data/test.lua
The interface works like this:
Lua.opIndex pushes global table and index key to stack, and return Top object. For example, lua["math"] pushes _G and "math" to stack.
Further accesses go through Top object. Top.opIndex goes deeper in the table hierarchy. Other methods (call, get, set) are "final" methods, which perform an operation with the table and key at the top of the stack, and clean the stack afterwards.
Close everything works fine, except this mechanism has nasty quirk/bug that I have no idea how to solve it. If you don't call any of those "final" methods, Top will leave table and key to the stack:
lua["math"]["abs"].call(-1); // Works. Final method (call) called.
lua["math"]["abs"]; // table ref & key left to stack :(
What I know for sure, is that playing with Top() destructor does not work, as it is not called immediately when object is not referenced anymore.
NOTE: If there is some sort of operator to be called when object is accessed as rvalue, I could replace call(), set() and get() methods with operator overloads.
Questions:
Is there any way to prevent users to write such expressions (getting Top object without calling any of "final" methods)? I really don't want users to write e.g. luafunc = lua["math"]["abs"] and then later try to call it, because it won't work at all. Not without starting to play with lua_ref & lua_unref and start fighting with same issues that LuaD has.
Is there any kind of opAccess operator overloading, that is, overloading what happens when object is used as rvalue? That is, expression "a = b" -> "a.opAssign(b.opAccess)"? opCast does not work, it is called only with explicit casts.
Any other suggestions? I internally feel that I am looking solution from wrong direction. I feel that the problem reside in the realm of metaprogramming: I am trying to "scope" things at expression level, which I feel is not that suitable for classes and objects.
So far, I have tried to preserve the LuaD look'n'feel at interface user's side, but I think that if I could change the interface to something like following, I could get it working:
lua.call(["math", "abs"], 1); // call lua.math.abs(2)
lua.get(["table", "x", "y", "z"], 2); // lua table.x.y.z = 2
...
Syntactically that would ensure that reference to lua object fetched by indexing is finally used for something in the expression, and the stack would be cleaned.
UPDATE: Like said, changing Lua.Ref to struct solved problems related to dereferencing, and I am again using reference mechanism similar to LuaD. I personally feel that this mechanism suits the LuaD-style syntax I am using, too, and it can be quite a challenge to make the syntax working correctly with other mechanisms. I am still open to hear if someone has ideas to make it work.
The system I sketched to replace references (to tackle the problem with objects holding references living longer than lua sandbox) would probably need different kind of interface, something similar I sketched above.
You also have an issue when people do
auto math_abs = lua["math"]["abs"];
math_abs.call(1);
math_abs.call(3);
This will double pop.
Make Top a struct that holds the stack index of what they are referencing. That way you can use its known scoping and destruction behavior to your advantage. Make sure you handle this(this) correctly as well.
Only pop in the destructor when the value is the actual top value. You can use a bitset in LuaInterface to track which stack positions are in use and put the values in it using lua_replace if you are worried about excessive stack use.
Is it possible to dynamically modify symbol table at runtime in C (in elf format on Linux)?
My eventual goal is the following:
Inside certain function say foo, I want to override malloc function to my custom handler my_malloc. But outside foo, any malloc should still call to malloc as in glibc.
Note: this is different from LD_PRELOAD which would override malloc during the entire program execution.
Is it possible to dynamically modify symbol table at runtime in C (in elf format on Linux)?
In theory this is possible, but in practice it's too hard to do.
Inside certain function say foo, I want to override malloc function to my custom handler my_malloc. But outside foo, any malloc should still call to malloc as in glibc.
Modifying symbol table (even if it were possible) would not get you to your desired goal.
All calls from anywhere inside your ELF binary (let's assume foo is in the main executable), resolve to the same PLT import slot malloc#plt. That slot is resolved to glibc malloc on the first call (from anywhere in your program, assuming you are not using LD_BIND_NOW=1 or similar). After that slot has been resolved, any further modification to the symbol table will have no effect.
You didn't say how much control over foo you have.
If you can recompile it, the problem becomes trivial:
#define malloc my_malloc
int foo() {
// same code as before
}
#undef malloc
If you are handed a precompiled foo.o, you are linking it with my_malloc.o, and you want to redirect all calls from inside foo.o from malloc to my_malloc, that's actually quite simple to do at the object level (i.e. before final link).
All you have to do is go through foo.o relocation records, and change the ones that say "put address of external malloc here" to "put address of external my_malloc here".
If foo.o contains additional functions besides foo, it's quite simple to limit the relocation rewrite to just the relocations inside foo.
Is it possible to dynamically modify symbol table at runtime in C (in elf format on Linux)?
Yes, it is not easy, but the functionality can be packaged into a library, so at the end of the day, it can be made practical.
Typemock Isolator++
(https://www.typemock.com/isolatorpp-product-page/isolate-pp/)
This is free-to-use, but closed source solution. The usage example from documentation should be instructive
TEST_F(IsolatorPPTests, IsExpired_YearIs2018_ReturnTrue) {
Product product;
// Prepare a future time construct
tm* fakeTime = new tm();
fakeTime->tm_year = 2018;
// Fake the localtime method
FAKE_GLOBAL(localtime);
// Replace the returned value when the method is called
// with the fake value.
WHEN_CALLED(localtime(_)).Return(fakeTime);
ASSERT_TRUE(product.IsExpired());
}
Other libraries of this kind
Mimick, from Q: Function mocking in C?
cpp-stub, from Q: Creating stub functionality in C++
Elfspy, for C++, but sometimes it's OK to test C code from C++ unittests, from Q: C++ mock framework capable of mocking non-virtual methods and C functions
HippoMocks, from Q: Mocking C functions in MSVC (Visual Studio)
the subprojects in https://github.com/coolxv/cpp-stub/tree/master/other
... there is still more, feel free to append ...
Alternate approaches
ld's --wrap option and linker scripts, https://gitlab.com/hedayat/powerfake
various approaches described in answers for Q: Advice on Mocking System Calls
and in answers to Q: How to mock library calls?
Rewrite code to make it testable
This is easier in other languages than C, but still doable even in C. Structure code into small functions without side-effects that can be unit-tested without resorting to trickery, and so on. I like this blog Modularity. Details. Pick One. about the tradeoffs this brings. Personally, I guturally dislike the "sea of small functions and tons of dependency injection" style of code, but I realize that that's the easiest to work with, all things considered.
Excursion to other languages
What you are asking for is trivial to do in Python, with the unittest.mock.patch, or possibly by just assigning the mock into the original function directly, and undoing that at the end of the test.
In Java, there is Mockito/PowerMock, which can be used to replace static methods for the duration of a test. Static methods in Java approximately correspond to regular functions in C.
In Go, there is Mockey, which works similarly to what needs to be done in C. It has similar limitations in that inlining can break this kind of runtime mocking. I am not sure if in C you can hit the issue that very short methods are unmockable because there is not enough space to inject the redirection code; I think more likely not, if all calls go through the Procedure Linkage Table.
Destructors in Delphi are usually named "Destroy", however as far as i understand you can also
name destructors differently
have multiple destructors
Is there any reason why this was implemented this way? What are the possible use cases for differently named / multiple destructors?
In theory you can manually call different destructors to free different external resources, like breaking ref-counting loops, deleting or just closing file, etc.
Also, since the Object Pascal language does not have those magical new/delete operations, there just should be some identifier to call for disposing of the object.
I'd prefer to look at that in retrospect.
"Turbo Pascal with Objects" style objects have both - you call a "magical" Dispose procedure but explicitly specify a destructor to call, since language itself did not knew what to choose. Similarly "magic" procedure New had to be supplied with a manually selected constructor.
http://www.freepascal.org/docs-html/rtl/system/dispose.html
http://putka.acm.si/langref/turboPascal/0547.html
http://www.freepascal.org/docs-html/rtl/system/new.html
http://putka.acm.si/langref/turboPascal/04A4.html
This however violates DRY principle: compiler knows that we are calling d-tor or c-tor, but yet we have to additionally call those "New" and "Dispose" functions. In theory that probably provided to decouple memory allocation and information feeding and combine them anyway we'd like. But i don't think this feature was actually used anything wide.
Interesting that the same design is used in Apple Objective C. You 1st allocate memory for the object and after that you call a constructor for that new instance: http://en.wikipedia.org/wiki/Objective-C#Instantiation
When that model was streamlined for Delphi few decisions was made to make things more simplified (and unified). Memory [de]allocation strategy was shifted to the class level, rather than call-site. That made the redundancy of both calling "New" and named constructor very contrast. One had to be dropped.
C++/C#/Java chosen to retain a special language-level keywords for it, using overloaded functions to provide different c-tors. Perhaps that corresponds to USA style of computer languages.
However Pascal at its core has two ideas: verbosity and small vocabulary. Arguably they can be tracked in other European-school languages like Scala. If possible, the keywords should be removed from language itself and moved to external modules - libraries that you can add or remove from project. And overloaded functions were introduced much later to the language and early preference was to surely have two differently named (self-documenting) function names.
This both ideas probably caused Delphi to remove "magic" procedures and to deduce object creation/destruction at the call-site just by used function names. If you call MyVar.Destroy then compiler looks at the declaration of .Destroy and knows we are deleting the object. Similarly it knows TMyType.CreateXXX(YYY,ZZZ) is an object instanbtiation due to the way CreateXXX was declared.
To make c-tor and d-tor no-named like in C++, Delphi would have to introduce two more keywords to the language level, like those C++ new and delete. And there seems to be no clear advantage in that. At least personally i better like Delphi way.
PS. I had to add there one assumption: we are talking about real C++ and Delphi languages as they were around 1995. They only featured manual memory control for heap-allocated objects, no garbage collection and no automatic ref-counting. You could not trigger object destruction by assigning variable with nil/NULL pointer.
I have a helper class that will be in wide use across the application. The implementation relies on interface reference counting, the idea roughly is:
...
var
lHelper: IMyHelper;
begin
lHelper := TMyHelper.Create(some params);
...some code that doesn't have to access lHelper
end;
So the implementation relies on IMyHelper going out of scope at the end of the method, but not before.
So what am asking is, can I be certain that in some future Delphi compiler won't play smart and release the interface right after it's created if the variable is not accessed in rest of the method ?
IMHO you can be confident of that. The out-of-scope pattern will probably remain global to the instruction block of this method. This would be a breaking change.
See this comment from Barry Kelly (from Embarcadero):
As to your earlier comment, about explicit variables: in the hypothetical (and breaking change) case, where we optimized interface variable usage, we would likely not only break the described RAII-like functionality but also the explicit variable approach too; the values assigned to FooNotifier and BarNotifier are not used, so "in theory" they can be freed up sooner, and potentially even reuse the same storage.
But of course, destruction of the interface can have side-effects, and that's what's being relied upon for the effect in the post. Changing the language such that side-effects like these have visible changes is not something we do willingly.
So you can guess that Embarcadero won't introduce any backward compatibility change here. The benefit of re-using an interface memory won't be worth breaking compatibility and introducing side effects: saving a pointer (4 or 8 bytes) is not worth it nowadays, especially when the stack is already allocated, and aligned (x64 model uses more stack than x86).
Only if a Garbage Collector is introduced to the language (which I do not want from my personal point of view), objects life time may change. But in this case, life time may probably be longer.
In all cases, you can create your own code, to be sure that it will released at the end of the method:
var
lHelper: IMyHelper;
begin
lHelper := TMyHelper.Create(some params);
try
...some code that doesn't have to access lHelper
finally
lHelper := nil; // release the interface count by yourself
end;
end;
In fact, this is the code already generated by the compiler. Writing this will be perfectly redundant, but it will ensure that compiler won't cheat on you.
When speaking of interfaces and reference counting, please take in account the potential issue of circular references in Delphi. See this great article (i.e. "Example 2-15") about the need of "weak pointers" for circular references of Interfaces.
Other languages (like Java or C#) use a garbage collector to resolve this. Objective C uses an explicit "zeroing weak pointers" mechanism to solve it - see this discussion or this SO answer for a potential implementation. Perhaps future version of Delphi may consider using an implementation similar to the ARC model introduced in Objective C. But I suspect there will be an explicit syntax to preserve compatibility with existing code.
The documentation says this (emphasis mine):
On the Win32 platform, interface references are typically managed through reference-counting, which depends on the _AddRef and _Release methods inherited from System/IInterface. Using the default implementation of reference counting, when an object is referenced only through interfaces, there is no need to destroy it manually; the object is automatically destroyed when the last reference to it goes out of scope.
The scope of a local variable is the method and so the current specification is that _Release will not be called until the method is complete.
There's never a promise that specifications will not be changed in the future but I think the likelihood of a change being made to this part of the language is vanishingly small.
What are the pros and cons in using fluent interfaces in Delphi?
Fluent interfaces are supposed to increase the readability, but I'm a bit skeptical to have one long LOC that contains a lot of chained methods.
Are there any compiler issues?
Are there any debugging issues?
Are there any runtime/error handling issues?
Fluent interfaces are used in e.g. TStringBuilder, THTMLWriter and TGpFluentXMLBuilder.
Updated:
David Heffernan asked which issues I was concerned about. I've been given this some thought, and the overall issue is the difference between "explicitly specifying how it's done" versus "letting the compiler decide how it's done".
AFAICS, there is no documentation on how chained methods actually is handled by the compiler, nor any specification on how the compiler should treat chained methods.
In this article we can read about how the compiler adds two additional var-parameters to methods declared as functions, and that the standard calling convention puts three params in the register and the next ones on the stack. A "fluent function method" with 2 params will therefor use the stack, whereas an "ordinary procedure method" with 2 params only uses the register.
We also know that the compiler does some magic to optimize the binary (e.g. string as function result, evaluation order, ref to local proc), but sometimes with surprising side effects for the programmer.
So the fact that the memory/stack/register-management is more complex and the fact that compiler could do some magic with unintentional side effects, is pretty smelly to me. Hence the question.
After I've read the answers (very good ones), my concern is strongly reduced but my preference is still the same :)
Everybody is just writing about negative issues so let's stress some positive issues. Errr, the only positive issue - less (in some cases much less) typing.
I wrote GpFluentXMLBuilder just because I hate typing tons of code while creating XML documents. Nothing more and nothing less.
The good point with fluent interfaces is that you don't have to use them in the fluent manner if you hate the idiom. They are completely usable in a traditional way.
EDIT: A point for the "shortness and readability" view.
I was debugging some old code and stumbled across this:
fdsUnreportedMessages.Add(CreateFluentXml
.UTF8
.AddChild('LogEntry')
.AddChild('Time', Now)
.AddSibling('Severity', msg.MsgID)
.AddSibling('Message', msg.MsgData.AsString)
.AsString);
I knew immediately what the code does. If, however, the code would look like this (and I'm not claiming that this even compiles, I just threw it together for demo):
var
xmlData: IXMLNode;
xmlDoc : IXMLDocument;
xmlKey : IXMLNode;
xmlRoot: IXMLNode;
xmlDoc := CreateXMLDoc;
xmlDoc.AppendChild(xmlDoc.CreateProcessingInstruction('xml',
'version="1.0" encoding="UTF-8"'));
xmlRoot := xmlDoc.CreateElement('LogEntry');
xmlDoc.AppendChild(xmlRoot);
xmlKey := xmlDoc.CreateElement('Time');
xmlDoc.AppendChild(xmlKey);
xmlData := xmlDoc.CreateTextNode(FormatDateTime(
'yyyy-mm-dd"T"hh":"mm":"ss.zzz', Now));
xmlKey.AppendChild(xmlData);
xmlKey := xmlDoc.CreateElement('Severity');
xmlDoc.AppendChild(xmlKey);
xmlData := xmlDoc.CreateTextNode(IntToStr(msg.MsgID));
xmlKey.AppendChild(xmlData);
xmlKey := xmlDoc.CreateElement('Message');
xmlDoc.AppendChild(xmlKey);
xmlData := xmlDoc.CreateTextNode(msg.MsgData.AsString);
xmlKey.AppendChild(xmlData);
fdsUnreportedMessages.Add(xmlKey.XML);
I would need quite some time (and a cup of coffee) to understand what it does.
EDIT2:
Eric Grange made a perfectly valid point in comments. In reality, one would use some XML wrapper and not DOM directly. For example, using OmniXMLUtils from the OmniXML package, the code would look like that:
var
xmlDoc: IXMLDocument;
xmlLog: IXMLNode;
xmlDoc := CreateXMLDoc;
xmlDoc.AppendChild(xmlDoc.CreateProcessingInstruction(
'xml', 'version="1.0" encoding="UTF-8"'));
xmlLog := EnsureNode(xmlDoc, 'LogEntry');
SetNodeTextDateTime(xmlLog, 'Time', Now);
SetNodeTextInt(xmlLog, 'Severity', msg.MsgID);
SetNodeText(xmlLog, 'Message', msg.MsgData.AsString);
fdsUnreportedMessages.Add(XMLSaveToString(xmlDoc));
Still, I prefer the fluent version. [And I never ever use code formatters.]
Compiler issues:
If you're using interfaces (rather than objects), each call in the chain will result in a reference count overhead, even if the exact same interface is returned all the time, the compiler has no way of knowing it. You'll thus generate a larger code, with a more complex stack.
Debugging issues:
The call chain being seen as a single instruction, you can't step or breakpoint on the intermediate steps. You also can't evaluate state at intermediate steps.
The only way to debug intermediate steps is to trace in the asm view.
The call stack in the debugger will also not be clear, if the same methods happens multiple times in the fluent chain.
Runtime issues:
When using interfaces for the chain (rather than objects), you have to pay for the reference counting overhead, as well as a more complex exception frame.
You can't have try..finally constructs in a chain, so no guarantee of closing what was opened in a fluent chain f.i.
Debug/Error logging issues:
Exceptions and their stack trace will see the chain as a single instruction, so if you crashed in .DoSomething, and the call chain has several .DoSomething calls, you won't know which caused the issue.
Code Formatting issues:
AFAICT none of the existing code formatters will properly layout a fluent call chain, so it's only manual formatting that can keep a fluent call chain readable. If an automated formatter is run, it'll typically turn a chain into a readability mess.
Are there any compiler issues?
No.
Are there any debugging issues?
Yes. Since all the chained method calls are seen as one expression, even if you write them on multiple lines as in the Wikipedia example you linked, it's a problem when debugging because you can't single-step through them.
Are there any runtime/error handling issues?
Edited: Here's a test console application I wrote to test the actual Runtime overhead of using Fluent Interfaces. I assigned 6 properties for each iteration (actually the same 2 values 3 times each). The conclusions are:
With interfaces: 70% increase in runtime, depends on the number of properties set. Setting only two properties the overhead was smaller.
With objects: Using fluent interfaces was faster
Didn't test records. It can't work well with records!
I personally don't mind those "fluent interfaces". Never heard of the name before, but I've been using them, especially in code that populates a list from code. (Somewhat like the XML example you posted). I don't think they're difficult to read, especially if one's familiar with this kind of coding AND the method names are meaningful. As for the one long line of code, look at the Wikipedia example: You don't need to put it all on one line of code.
I clearly remember using them with Turbo Pascal to initialize Screens, which is probably why I don't mind them and also use them at times. Unfortunately Google fails me, I can't find any code sample for the old TP code.
I would question the benefit of using "fluent interfaces".
From what I see, the point is to allow you to avoid having to declare a variable. So the same benefit the dreaded With statement brings, with a different set of problems (see other answers)
Honestly I never understood the motivation to use the With statement, and I don't understand the motivation to use fluent interfaces either. I mean is it that hard to define a variable ?
All this mumbo jumbo just to allow laziness.
I would argue that rather than increase readability it, which at first glance it seems to do by having to type/read less, it actually obfuscates it.
So again, I ask why would you want to use fluent interfaces in the first place ?
It was coined by Martin Fowler so it must be cool ? Nah I ain't buying it.
This is a kind of write-once-read-never notation that is not easy to understand without going through documentation for all involved methods. Also such notation is not compatible with Delphi and C# properties - if you need to set properties, you need to rollback to using
common notations as you can't chain property assignments.