Is there a safe way to clean up stack-based code when jumping out of a block? - delphi

I've been working on Issue 14 on the PascalScript scripting engine, in which using a Goto command to jump out of a Case block produces a compiler error, even though this is perfectly valid (if ugly) Object Pascal code.
Turns out the ProcessCase routine in the compiler calls HasInvalidJumps, which scans for any Gotos that lead outside of the Case block, and gives a compiler error if it finds one. If I comment that check out, it compiles just fine, but ends up crashing at runtime. A disassembly of the bytecode shows why. I've annotated it with the original script code:
[TYPES]
<SNIPPED>
[VARS]
Var [0]: 27 Class TFORM
Var [1]: 28 Class TAPPLICATION
Var [2]: 11 S32 //i: integer
[PROCS]
Proc [0] Export: !MAIN -1
{begin}
[0] ASSIGN GlobalVar[2], [1]
{ i := 1;}
[15] PUSHTYPE 11(S32) // 1
[20] ASSIGN Base[1], GlobalVar[2]
{ case i of}
[31] PUSHTYPE 25(U8) // 2
{ 0:}
[36] COMPARE into Base[2]: [0] = Base[1]
[57] COND_NOT_GOTO currpos + 5 Base[2] [72]
{ end;}
[67] GOTO currpos + 41 [113]
{ 1:}
[72] COMPARE into Base[2]: [1] = Base[1]
[93] COND_NOT_GOTO currpos + 10 Base[2] [113]
{ goto L1;}
[103] GOTO currpos + 8 [116]
{ end;}
[108] GOTO currpos + 0 [113]
{ end; //<-- case}
[113] POP // 1
[114] POP // 0
{ Exit;}
[115] RET
{L1:
Writeln('Label L1');}
[116] PUSHTYPE 17(WideString) // 1
[121] ASSIGN Base[1], ['????????']
[144] CALL 1
{end.}
[149] POP // 0
[150] RET
Proc [1]: External Decl: \00\00 WRITELN
The "goto L1;" statement at 103 skips the cleanup pops at 113 and 114, which leaves the stack in an invalid state.
Delphi doesn't have any trouble with this, because it doesn't use a calculation stack. PascalScript, though, is not as fortunate. I need some way to make this work, as this pattern is very common in some legacy scripts from a much simpler system with little in the way of control structures that I've translated to PascalScript and need to be able to support.
Anyone have any ideas how to patch the codegen so it'll clean up the stack properly?

IIRC the goto rules in classic pascals were:
jumps are only allowed out of a block (iow from a higher to a lower nesting level on the "same" branch of the tree)
from local procedures to their parents.
The later was afaik never supported by Borland derived Pascals, but the first still holds.
So you need to generate exiting code like Martin says, but possibly it can be for multiple block levels, so you can't have a could codegeneration for each goto, but must generate code (to exit the precise number of needed blocks).
A typical test pattern is to exit from multiple nested ifs (possibly within a loop) using a goto, since that was a classic microoptimization that was faster at least up to D7.
Keep in mind that the if evaluation(s) and the begin..end blocks of their branches might have generated temps that need cleanup.
---------- added later
I think the codegenerator needs a way to walk the scopes between the goto and its endpoint, generating the relevant exit code for blocks along the way. That way a fix works for the general case and not just this example.
Since you can only jump out of scopes, and not into it that might not that be that hard.
IOW generate something that is equivalent to (for a hypothetical double case block)
Lgoto1gluecode:
// exit code first block
pop x
pop y
// exit code first block
pop A
pop B
goto real_goto_destination
Additional analysis can be done. E.g. if there is only one scope, and it has already a cleanup exit label, you can jump directly. If you know for certain that the above pop's are only discarded values (and not saves of registers) you can do them at once with add $16,%esp (4*4 byte values) etc.

The straightforward solution would be:
When generating a GOTO for goto statement, prefix the GOTO with the same cleanup code that comes before RET.

It looks to me like the calculation of how far to jump forward is the problem. I would have to spend some time looking at the implementation of the parser to help further, but my guess would be that additional handling must be performed when using a goto and there are values on the stack AND the goto would be placed after those values would be removed from the stack. Of course to determine this you would need to save the current location being parsed (the goto) and the forward parse to the target location watching for stack changes, and if so then to either adjust the goto location backwards, or inject the code as Martin suggested.

Related

What exactly does `return` do in AutoHotKey?

In a comment chain, I ask:
Hmm, so the return doesn't play like a closing bracket like in other languages?
To which a user answers:
It's maybe easy to think of return as just something that stops the code execution from going further. #IfDirectives don't care about returns, or anything else related to code execution, because they have nothing to do with code execution. They are kind of just markers that enclose different parts of code within them
I have two questions from this:
If return is just something that stops code execution from going further, then how is it different to exit? They are both flow controls playing a role to determine The Top of the Script (the Auto-execute Section). I see that many people having this problem.
If return doesn't play like a closing bracket, then why does it appear in many places one expects a closing bracket? Especially when there is no subroutine involving in, like what is described in the formal definition. Take this example in Hotkeys:
#n::
Run Notepad
return
In AHK return is what you'd learn return to be from other programming languages. It's just that control flow, and in general the legacy syntax, in AHK is different from what you might be used to from other languages.
How is return different from exit?
If we're not trying to return a value from a function, and are just interested in the control flow, there is not much difference. But the difference is there.
You could indeed end an hotkey label with exit, and it would be the same as ending it with return.
From the documentation:
"If there is no caller to which to return, Return will do an Exit instead.".
I guess it's just convention to end hotkey labels with return.
So basically said, there is difference between the two, if return isn't about to do an exit.
Well when would this be?
For example:
MsgBox, 1
function()
MsgBox, 3
return ;this does an exit
function()
{
MsgBox, 2
return ;this does a return
}
The first return has no caller to which to return to, so it will do an exit.
The second return however does have a caller to return to do, so it will return there, and then execute the 3rd message box.
If we replaced the second return with an exit, the current thread would just be exited and the 3rd message box would never have been shown.
Here's the same exact example with legacy labels:
MsgBox, 1
gosub, label
MsgBox, 3
return ;this does an exit
label:
MsgBox, 2
return ;this does a return
I'm showing this, because they're very close to hotkey labels, and hotkey labels were something you were wondering a lot about.
In this specific example, if the line MsgBox, 3 didn't exist, replacing the second return with an exit would produce the same end result. But only because code execution is about to end anyway on the first return.
If return doesn't play like a closing bracket, then why does it appear in many places one expects a closing bracket?
In AHK v1, hotkeys and hotstrings work like labels, and labels are legacy. Labels don't care about { }s like modern functions do.
So we need something to stop the code execution. Return does that.
Consider the following example:
c::
MsgBox, % "Hotkey triggered!"
gosub, run_chrome
;code execution not ended
n::
MsgBox, % "Hotkey triggered!`nRunning notepad"
Run, notepad
;code execution not ended
run_chrome:
MsgBox, % "Running chrome"
run, chrome
WinWait, ahk_exe chrome.exe
WinMove, ahk_exe chrome.exe, , 500, 500
;code execution not ended
show_system_uptime:
MsgBox, % "System uptime: " A_TickCount " ms"
;code execution not ended
We're not ending code execution at any of the labels. Because of this, code execution will bleed into other parts of the code, in this case, into the other labels.
Try running the hotkeys and see how code execution bleeds into the other labels.
Because of this, the code execution needs to be ended somehow.
If using { } was supported by labels, it would indeed be the solution to keep the code execution where it needs to be.
And in fact AHK v2 hotkeys and hotstrings are no longer labels (they're only lookalikes). In v2 using { } is actually the correct way (the script wont even run if you don't use them).
See hotkeys from the AHK v2 documentation.
Another answer above already gave a good explanation.
So, to me:_
(return vs Exit is like gosub vs goto -- return (literal meaning) vs terminate. )
return completes (ends) the current subroutine, and returns from the current subroutine back to the calling subroutine;
Exit completes (ends) the current subroutine, and terminates (/exits /ends) the current thread (which discards all the previous calling subroutine);
where subroutines are inside a thread.
(subroutines are just like function calls or label calls (or hotkey label calls);
you can visualize this as if method calls (stack frame) in Java (, you can see this when you debug in an IDE))
Quotes:
[]
Return
Returns from a subroutine to which execution had previously jumped via function-call, Gosub, Hotkey activation, GroupActivate, or other means.
https://www.autohotkey.com/docs/commands/Return.htm
[]
Exit
Exits the current thread or (if the script is not persistent) the entire script.
https://www.autohotkey.com/docs/commands/Exit.htm
[]
Gosub
Jumps to the specified label and continues execution until Return is encountered.
https://www.autohotkey.com/docs/commands/Gosub.htm
[]
Goto
Jumps to the specified label and continues execution.
https://www.autohotkey.com/docs/commands/Goto.htm
[]
main difference is gosub comes back, and goto does not.
https://www.autohotkey.com/board/topic/46160-goto-vs-gosub/
[]
A subroutine and a function are essentially the same thing
Difference between subroutine , co-routine , function and thread?
[]
A subroutine is a portion of code which can be called to perform a specific task.
Execution of a subroutine begins at the target of a label and continues until a Return or Exit is encountered.
Since the end of a subroutine depends on flow of control,
any label can act as both a Goto target and the beginning of a subroutine.
https://www.autohotkey.com/docs/misc/Labels.htm#subroutines
[]
The current thread is defined as the flow of execution invoked by the most recent event;
examples include hotkeys, SetTimer subroutines, custom menu items, and GUI events.
The current thread can be executing commands within its own subroutine or within other subroutines called by that subroutine.
https://www.autohotkey.com/docs/misc/Threads.htm
(you may try out the following code to see the difference)
^r::
MsgBox, aaa
gosub, TestLabel
MsgBox, bbb
goto, TestLabel
MsgBox, ccc
return
TestLabel:
MsgBox, inside
return
; Exit
Return returns from a function (called with funcname(...) syntax) or subroutine (called with Gosub label). Subroutines/functions can call other subroutines/functions and so it's convenient to end them with Return rather than Exit. Exit ends the current thread of execution and won't transfer flow of control back to the caller.
For functions, Return also can return a value.
Return is not analogous to a closing bracket. Closing brackets can occur only at the end of a function and imply a Return but you can also Return anywhere in a function. It is common to return in an If statement to "bail out" of a function early, perhaps if some parameter had an invalid value.

How to get this sqrt inline assembly working for iOS

I am trying to follow another SO post and implement sqrt14 within my iOS app:
double inline __declspec (naked) __fastcall sqrt14(double n)
{
_asm fld qword ptr [esp+4]
_asm fsqrt
_asm ret 8
}
I have modified this to the following in my code:
double inline __declspec (naked) sqrt14(double n)
{
__asm__("fld qword ptr [esp+4]");
__asm__("fsqrt");
__asm__("ret 8");
}
Above, I have removed the "__fastcall" keyword from the method definition since my understanding is that it is for x86 only. The above gives the following errors for each assembly line respectively:
Unexpected token in argument list
Invalid instruction
Invalid instruction
I have attempted to read through a few inline ASM guides and other posts on how to do this, but I am generally just unfamiliar with the language. I know MIPS quite well, but these commands/registers seem to be very different. For example, I don't understand why the original author never uses the passed in "n" value anywhere in the assembly code.
Any help getting this to work would be greatly appreciated! I am trying to do this because I am building an app where I need to calculate sqrt (ok, yes, I could do a lookup table, but for right now I care a lot about precision) on every pixel of a live-video feed. I am currently using the standard sqrt, and in addition to the rest of the computation, I'm running at around 8fps. Hoping to bump that up a frame or two with this change.
If it matters: I'm building the app to ideally be compatibly with any current iOS device that can run iOS 7.1 Again, many thanks for any help.
The compiler is perfectly capable of generating fsqrt instruction, you don't need inline asm for that. You might get some extra speed if you use -ffast-math.
For completeness' sake, here is the inline asm version:
__asm__ __volatile__ ("fsqrt" : "=t" (n) : "0" (n));
The fsqrt instruction has no explicit operands, it uses the top of the stack implicitly. The =t constraint tells the compiler to expect the output on the top of the fpu stack and the 0 constraint instructs the compiler to place the input in the same place as output #0 (ie. the top of the fpu stack again).
Note that fsqrt is of course x86-only, meaning it wont work for example on ARM cpus.

Understanding call x"91"

Can someone please help me understand call x"91" function 11 and function 12 with simple example. I have tried to search and couldn't understand it. Right now I an using this code in COBOL under UNIX environment,Does this call works in windows environment as well?
http://opencobol.add1tocobol.com/#what-are-the-xf4-xf5-and-x91-routines
The CALL's X"F4", X"F5", X"91" are from MF.
You can find them in the online MF doc under
Library Routines.
F4/F5 are for packing/unpacking bits from/to bytes.
91 is a multi-use call. Implemented are the subfunctions
get/set cobol switches (11, 12) and get number of call params (16).
Use
CALL X"F4" USING
BYTE-VAR
ARRAY-VAR
RETURNING STATUS-VAR
to pack the last bit of each byte in the 8 byte ARRAY-VAR into corresponding bits of the 1 byte BYTE-VAR.
The X”F5” routine takes the eight bits of byte and moves them to the corresponding occurrence within array.
X”91” is a multi-function routine.
CALL X"91" USING
RESULT-VAR
FUNCTION-NUM
PARAMETER-VAR
RETURNING STATUS-VAR
As mentioned by Roger, OpenCOBOL supports FUNCTION-NUM of 11, 12 and 16.
11 and 12 get and set the on off status of the 8 (eight) run-time OpenCOBOL switches definable in the SPECIAL-NAMES paragraph. 16 returns the number of call parameters given to the current module.
x'91' is a general library routine, for a complete list of those see the MF documentation.
This documentation also specifies what its function 11 and function 12 do: they set/read the COBOL runtime switches 0-7 and the internal debugging mode switch.
Other than these library routines you can also read them one by one from COBOL and set "some" switches via the SET statement.

Boost spirit can handle Postscript/PDF like languages?

I noticed that Boost spirit offers some limits, in a question here on SO there is an user asking for help about boost spirit and the other user who gave the answer specified that boost spirit works well with statements and not with "generic text" ( I'm sorry if I don't recall it correctly ).
Now I would like to think about Postscript and PDF in terms of tokens and simplify my approach to this formats this way, the problem is that the PDF is kind of a mix between a markup language and a programming language with jumps and tables in it, and I can't think about something similar when considering the most popular file formats like XML, C++ code and others languages and formats.
There is also another fact: I can't really find people that had some kind of experience with boost::spirit wiriting a pdf parser or writer, so I'm asking, boost::spirit it's capable of parsing a PDF file and output the elements as tokens ?
Although this has nothing to do with Boost, let me assure you that the parsing of PDF (and PostScript) are about as trivial as you could want. Let's say that you have a scanner object that returns a series of tokens. The token types you will get from the scanner are:
String
Dict begin (<<)
Dict End (>>)
Name (/whatever)
Number
Hex array
Left Angle (<)
Right Angle (>)
Array begin ([)
Array end (])
Procedure begin ({)
Procedure end (})
Comment (%foo)
Word
My scanner is a finite-state automata with states for Start, Comment, String, HexArray, Token, DictEnd, and Done.
The way you parse PDF is not by parsing it, but by executing it. Given these tokens, my "parser" looks like this (in C#):
while (true) {
MLPdfToken = scanner.GetToken();
if (token == null)
return MachineExit.EndOfFile;
PdfObject obj = PdfObject.FromToken(token);
PdfProcedure proc = obj as PdfProcedure;
if (proc != null)
{
if (IsExecuting())
{
if (token.Type == PdfTokenType.RBrace)
proc.Execute(this);
else
Push(obj);
}
else {
proc.Execute(this);
}
if (proc.IsTerminal)
return Machine.ParseComplete;
}
else {
Push(obj);
}
}
I'll also add that if you give every PdfObject an Execute() method such that the base class implementation is machine.Push(this) and IsTerminal that returns false, the REPL gets easier:
while (true) {
MLPdfToken = scanner.GetToken();
if (token == null)
return MachineExit.EndOfFile;
PdfObject obj = PdfObject.FromToken(token);
if (IsExecuting())
{
if (token.Type == PdfTokenType.RBrace)
obj.Execute(this);
else
Push(obj);
}
else {
obj.Execute(this);
if (obj.IsTerminal)
return Machine.ParseComplete;
}
}
There's more support in Machine - Machine has a Stack of PdfObject and a few methods for accessing it (Push, Pop, Mark, CountToMark, Index, Dup, Swap), as well as ExecProcBegin and ExecProcEnd.
Beyond that, it's very light. The only thing that is slightly odd is that PdfObject.FromToken takes a token and if it is a primitive type (number, string, name, hex, bool) returns a corresponding PdfObject. Otherwise, it takes the given token and looks in a "proc set" dictionary of procedure names associated with PdfProcedure objects. So when you encounter the token << that gets looked up in a the proc set and comes up with this code:
void DictBegin(PdfMachine machine)
{
machine.Push(new PdfMark(PdfMarkType.Dictionary));
}
So << really means "mark the stack as the start of a dictionary. >> gets more interesting:
void DictEnd(PdfMachine machine)
{
PdfDict dict = new PdfDict();
// PopThroughMark pops the entire stack up to the first matching mark,
// throws an exception if it fails.
PdfObject[] arr = machine.PopThroughMark(PdfMarkType.Dictionary);
if ((arr.Length & 1) != 0)
throw new PdfException("dictionaries need an even number of objects.");
for (int i=0; i < arr.Length; i += 2)
{
PdfObject key = arr[i], val = arr[i + 1];
if (key.Type != PdfObjectType.Name)
throw new PdfException("dictionaries need a /name for the key.");
dict.put((PdfName)key, val);
}
machine.Push(dict);
}
So >> Pops up to the nearest dictionary mark into an array then puts each pair into the dictionary. Now, I could have done this without allocating the array. I could just pop pairs, putting them into the dictionary until I either hit the mark, fail to get a name or underflow the stack.
The important takeaway is that there really isn't any syntax in PDF, nor is there any in PostScript. At least not so much as you'd notice. The only real Syntax (and the read-eval-(push) loop shows it) is '}'.
So when you this is a PDF 14 0 obj << /Type /Annot /SubType /Square >> endobj what your really seeing is a series of procedures:
Push 14
Push 0
Execute obj (Pop two numbers and push a "definition" object).
Execute dictionary begin
Push /Type
Push /Annot
Push /SubType
Push /Square
Execute dictionary end
Execute endobj (pop the top object and then get (not pop) the next one. If the second is a definition, set its "value" to the first object, else throw).
Since "endobj" is terminal, parsing ends and the top of the stack is the result.
So when you are asked to look up object 14 in the PDF, the cross-reference table tells you where to seek to, you make a new Machine with the stream pointer at that location and run it. If the top of the stack is a "definition" object, you've succeeded.
About now you should be nodding but not trusting me, since you're thinking about PDF streams, which look like this:
<< [/key value]* >> stream ...raw data... endstream endobj
Again, there is no syntax. The proc stream looks at the top of the stack, which should be a PdfDict. If it is, it consumes characters until the next newline (scanner does this), stores the current file position in the stream as data start, reads the stream length from the dict (which may cause another Machine to get newed up), and skips past the end of stream and pushes the new stream object on the stack. endstream is a no-op. The only difference between a PdfDict and a PdfStream is that a PdfStream has a start position and a bool saying that it's a stream, otherwise I dual-purpose the object.
PostScript is almost identical except that the execution environment is a little more complex. For example, you need several stacks in your machine: a parameter stack, a dictionary stack, and an execution stack. From there, you more or less just bind your tokenizer into the set of primitive procedures as well as the word exec, and then most of your interpreter is written in PS itself.
If you're talking about boost, you're looking at C++, which means that you can't be as fast and loose with memory as I am, so you'll want to either use smart pointers or figure out where you scope is and be careful to dispose objects instead of blithely throwing them away, but that's just the normal C++ stuff.
Currently, I make PDF tools for my company in .NET, but in a former life I worked on Acrobat versions 1-4, and most of what I described is exactly what Acrobat did under the hood (well, more or less - it was C, not C++, but it's the same approach).
With respect to the xref table (or xref stream), you read that first - the spec tells you that if you jump to EOF and scan back, you find the start of the xref table. You parse that (which is a CS 101 assignment), parse the trailer, seek to the /Prev if any and repeat until no more /Prev entries. That gives you a complete xref for looking up objects.
As for writing - there are a number of approaches that you can take. The most obvious one is that when an object is meant to be referenced, you create a new reference object by assigning the newest available xref entry to it. Whenever objects refer to other objects for writing, they ask if these objects are referenced. If they are, they write the reference (ie, 14 0 R). When it comes time to write a referenced object, you get the current stream pointer and store it in the xref, then write <objnum> <generation> obj <object contents> endobj. For example, my code to write a dictionary looks like this:
public override ToStream(PdfStreamingContext context)
{
if (context.HasReference(this)) // is object referenced in xref
{
PdfUtils.WriteObjectDefinitionBegin(this, context);
}
context.Writer.Indent();
context.Writer.WriteLine("<<");
WriteContents(context);
context.Writer.Exdent();
context.Writer.Writeline(">>");
if (context.HasReference(this))
{
PdfUtils.WriteObjectDefinitionEnd(this, context);
}
}
I've chopped out some chaff so you can see the wheat underneath. The context is an object that holds a new xref table as well as an object for writing to streams that automagically handles appropriate newline discipline, indentation, line wrapping, and so on.
What you should see is that the basics here are straight forward, if not trivial. And now's when you should be asking yourself the question, "if it's trivial, how come there isn't more (serious) competition for Acrobat in the market? The answer is that even though it's trivial, it's still easy to write PDFs that aren't spec compliant and Acrobat handles most of those. The real challenge is to be able to honor the spec and make sure that you include all required values in a dictionary and that they are in range and semantically correct. Hell, even the date time format--which is pretty well-specified--is a mound of special case code in my library to manage where other people have screwed it up royally. Being able to generate consistently correct PDF is hard and consuming the garbage in the sea of PDFs in the world is harder.
I could (and probably should) write a book about how to do this. While a lot of the fringe code is grubby, the overall structure can be very pretty.
tl;dr - If you're thinking of a recursive descent parser for PDF, you're thinking too hard. All you need is a tokenizer and a simple REPL.

Why does return/redo evaluate result functions in the calling context, but block results are not evaluated?

Last night I learned about the /redo option for when you return from a function. It lets you return another function, which is then invoked at the calling site and reinvokes the evaluator from the same position
>> foo: func [a] [(print a) (return/redo (func [b] [print b + 10]))]
>> foo "Hello" 10
Hello
20
Even though foo is a function that only takes one argument, it now acts like a function that took two arguments. Something like that would otherwise require the caller to know you were returning a function, and that caller would have to manually use the do evaluator on it.
Thus without return/redo, you'd get:
>> foo: func [a] [(print a) (return (func [b] [print b + 10]))]
>> foo "Hello" 10
Hello
== 10
foo consumed its one parameter and returned a function by value (which was not invoked, thus the interpreter moved on). Then the expression evaluated to 10. If return/redo did not exist you'd have had to write:
>> do foo "Hello" 10
Hello
20
This keeps the caller from having to know (or care) if you've chosen to return a function to execute. And is cool because you can do things like tail call optimization, or writing a wrapper for the return functionality itself. Here's a variant of return that prints a message but still exits the function and provides the result:
>> myreturn: func [] [(print "Leaving...") (return/redo :return)]
>> foo: func [num] [myreturn num + 10]
>> foo 10
Leaving...
== 20
But functions aren't the only thing that have behavior in do. So if this is a general pattern for "removing the need for a DO at the callsite", then why doesn't this print anything?
>> test: func [] [return/redo [print "test"]]
>> test
== [print "test"]
It just returned the block by value, like a normal return would have. Shouldn't it have printed out "test"? That's what do would...uh, do with it:
>> do [print "test"]
test
The short answer is because it is generally unnecessary to evaluate a block at the call point, because blocks in Rebol don't take parameters so it mostly doesn't matter where they are evaluated. However, that "mostly" may need some explanation...
It comes down to two interesting features of Rebol: static binding, and how do of a function works.
Static Binding and Scopes
Rebol doesn't have scoped word bindings, it has static direct word bindings. Sometimes it seems like we have lexical scope, but we really fake that by updating the static bindings each time we're building a new "scoped" code block. We can also rebind words manually whenever we want.
What that means for us in this case though, is that once a block exists, its bindings and values are static - they're not affected by where the block is physically located, or where it is being evaluated.
However, and this is where it gets tricky, function contexts are weird. While the bindings of words bound to a function context are static, the set of values assigned to those words are dynamically scoped. It's a side effect of how code is evaluated in Rebol: What are language statements in other languages are functions in Rebol, so a call to if, for instance, actually passes a block of data to the if function which if then passes to do. That means that while a function is running, do has to look up the values of its words from the call frame of the most recent call to the function that hasn't returned yet.
This does mean that if you call a function and return a block of code with words bound to its context, evaluating that block will fail after the function returns. However, if your function calls itself and that call returns a block of code with its words bound to it, evaluating that block before your function returns will make it look up those words in the call frame of the current call of your function.
This is the same for whether you do or return/redo, and affects inner functions as well. Let me demonstrate:
Function returning code that is evaluated after the function returns, referencing a function word:
>> a: 10 do do has [a] [a: 20 [a]]
** Script error: a word is not bound to a context
** Where: do
** Near: do do has [a] [a: 20 [a]]
Same, but with return/redo and the code in a function:
>> a: 10 do has [a] [a: 20 return/redo does [a]]
** Script error: a word is not bound to a context
** Where: function!
** Near: [a: 20 return/redo does [a]]
Code do version, but inside an outer call to the same function:
>> do f: function [x] [a: 10 either zero? x [do f 1] [a: 20 [a]]] 0
== 10
Same, but with return/redo and the code in a function:
>> do f: function [x] [a: 10 either zero? x [f 1] [a: 20 return/redo does [a]]] 0
== 10
So in short, with blocks there is usually no advantage to doing the block elsewhere than where it is defined, and if you want to it is easier to use another call to do instead. Self-calling recursive functions that need to return code to be executed in outer calls of the same function are an exceedingly rare code pattern that I have never seen used in Rebol code at all.
It could be possible to change return/redo so it would handle blocks as well, but it probably isn't worth the increased overhead to return/redo to add a feature that is only useful in rare circumstances and already has a better way to do it.
However, that brings up an interesting point: If you don't need return/redo for blocks because do does the same job, doesn't the same apply to functions? Why do we need return/redo at all?
How DO of a Function Works
Basically, we have return/redo because it uses exactly the same code that we use to implement do of a function. You might not realize it, but do of a function is really unusual.
In most programming languages that can call a function value, you have to pass the parameters to the function as a complete set, sort of how R3's apply function works. Regular Rebol function calling causes some unknown-ahead-of-time number of additional evaluations to happen for its arguments using unknown-ahead-of-time evaluation rules. The evaluator figures out these evaluation rules at runtime and just passes the results of the evaluation to the function. The function itself doesn't handle the evaluation of its parameters, or even necessarily know how those parameters were evaluated.
However, when you do a function value explicitly, that means passing the function value to a call to another function, a regular function named do, and then that magically causes the evaluation of additional parameters that weren't even passed to the do function at all.
Well it's not magic, it's return/redo. The way do of a function works is that it returns a reference to the function in a regular shortcut-return value, with a flag in the shortcut-return value that tells the interpreter that called do to evaluate the returned function as if it were called right there in the code. This is basically what is called a trampoline.
Here's where we get to another interesting feature of Rebol: The ability to shortcut-return values from a function is built into the evaluator, but it doesn't actually use the return function to do it. All of the functions you see from Rebol code are wrappers around the internal stuff, even return and do. The return function we call just generates one of those shortcut-return values and returns it; the evaluator does the rest.
So in this case, what really happened is that all along we had code that did what return/redo does internally, but Carl decided to add an option to our return function to set that flag, even though the internal code doesn't need return to do so because the internal code calls the internal function. And then he didn't tell anyone that he was making the option externally available, or why, or what it did (I guess you can't mention everything; who has the time?). I have the suspicion, based on conversations with Carl and some bugs we've been fixing, that R2 handled do of a function differently, in a way that would have made return/redo impossible.
That does mean that the handling of return/redo is pretty thoroughly oriented towards function evaluation, since that is its entire reason for existing at all. Adding any overhead to it would add overhead to do of a function, and we use that a lot. Probably not worth extending it to blocks, given how little we'd gain and how rarely we'd get any benefit at all.
For return/redo of a function though, it seems to be getting more and more useful the more we think about it. In the last day we've come up with all sorts of tricks that this enables. Trampolines are useful.
While the question originally asked why return/redo did not evaluate blocks, there were also formulations like: "is cool because you can do things like tail call optimization", "[can write] a wrapper for the return functionality", "it seems to be getting more and more useful the more we think about it".
I do not think these are true. My first example demonstrates a case where return/redo can really be used, an example being in the "area of expertise" of return/redo, so to speak. It is a variadic sum function called sumn:
use [result collect process] [
collect: func [:value [any-type!]] [
unless value? 'value [return process result]
append/only result :value
return/redo :collect
]
process: func [block [block!] /local result] [
result: 0
foreach value reduce block [result: result + value]
result
]
sumn: func [] [
result: copy []
return/redo :collect
]
]
This is the usage example:
>> sumn 1 * 2 2 * 3 4
== 12
Variadic functions taking "unlimited number" of arguments are not as useful in Rebol as it may look at the first sight. For example, if we wanted to use the sumn function in a small script, we would have to wrap it into a paren to indicate where it should stop collecting arguments:
result: (sumn 1 * 2 2 * 3 4)
print result
This is not any better than using a more standard (non-variadic) alternative called e.g. block-sum and taking just one argument, a block. The usage would be like
result: block-sum [1 * 2 2 * 3 4]
print result
Of course, if the function can somehow detect what is its last argument without needing enclosing paren, we really gain something. In this case we could use the #[unset!] value as the sumn stopping argument, but that does not spare typing either:
result: sumn 1 * 2 2 * 3 4 #[unset!]
print result
Seeing the example of a return wrapper I would say that return/redo is not well suited for return wrappers, return wrappers being outside of its area of expertise. To demonstrate that, here is a return wrapper written in Rebol 2 that actually is outside of return/redo's area of expertise:
myreturn: func [
{my RETURN wrapper returning the string "indefinite" instead of #[unset!]}
; the [throw] attribute makes this function a RETURN wrapper in R2:
[throw]
value [any-type!] {the value to return}
] [
either value? 'value [return :value] [return "indefinite"]
]
Testing in R2:
>> do does [return #[unset!]]
>> do does [myreturn #[unset!]]
== "indefinite"
>> do does [return 1]
== 1
>> do does [myreturn 1]
== 1
>> do does [return 2 3]
== 2
>> do does [myreturn 2 3]
== 2
Also, I do not think it is true that return/redo helps with tail call optimizations. There are examples how tail calls can be implemented without using return/redo at the www.rebol.org site. As said, return/redo was tailor-made to support implementation of variadic functions and it is not flexible enough for other purposes as far as argument passing is concerned.

Resources