read-byte ignores eof-error-p inside SBCL gray stream - stream

So, I'm specializing read-byte method using gray streams on SBCL. I've run into a peculiar behaviour where eof-error-p argument seems to be ignored. It could be I'm missing some trivial mistake in my code, but I've looked through it a dozen times already, and I just don't see it.
Suppose there is a file test.txt with a single byte, say
echo -n 7 > test.txt
I expect the following code to return :eof
(defclass binary-input-stream (fundamental-binary-input-stream)
((stream :initarg :stream :reader stream-of)))
(defmethod stream-read-byte ((stream binary-input-stream))
(format t "~a~%" "It's me, all right")
(read-byte (stream-of stream) nil :eof))
(defun make-binary-input-stream (stream)
(make-instance 'binary-input-stream :stream stream))
(with-open-file (in "test.txt" :element-type '(unsigned-byte 8))
(setq in (make-binary-input-stream in))
(read-byte in)
(read-byte in))
However, SBCL throws an END-OF-FILE exception. What is going on here?

This is an addition to coredump's answer. In particular I think that the behaviour of your code is correct: SBCL is doing the right thing in its implementation of Gray streams, and it should indeed be signalling an exception here.
In your code the pattern of calls is:
read-byte (with no optional arguments) on your binary-input-stream calls stream-read-byte on the same stream;
your method on stream-read-byte directly calls read-byte on the stream you have wrapped, asking for no error and a return on EOF of :eof, and returns the value of that call without further inspection;
that in turn presumably goes through the stream-read-byte method of the wrapped stream, but we don't need to worry about that.
So, what should this do? Well, I am not sure where the right place for the definitive documentation of Gray streams is, but here is something which may be close to it.
In that document, stream-read-byte is defined:
STREAM-READ-BYTE stream [Generic Function]
Used by READ-BYTE; returns either an integer, or the symbol :EOF if the
stream is at end-of-file.
read-byte is then defined as:
(defun READ-BYTE (binary-input-stream &optional (eof-errorp t) eof-value)
(check-for-eof (stream-read-byte binary-input-stream)
binary-input-stream eof-errorp eof-value))
Finally check-for-eof is defined as:
(defun check-for-eof (value stream eof-errorp eof-value)
(if (eq value :eof)
(report-eof stream eof-errorp eof-value)
value))
(I think the last two definitions really mean that 'the implementation needs to do something whose behaviour is equivalent to this', and in particular it needs to do that in the case where the stream is a Gray stream.)
So any method on stream-read-byte must not return :eof unless the stream is at end of file, and in particular doing so will cause an exception to be signalled (or will cause report-eof to be invoked, anyway, and it may or may not signal an exception). And :eof is the only special value that stream-read-byte can return.
Well, your method on stream-read-byte does indeed return :eof to indicate end of file, having carefully suppressed the exception that the inner call yo read-byte would otherwise signal, so it's a well-behaved method.
But then the outer call to read-byte, as defined above, sees this EOF value and dutifully raises an exception for you. Which is, in fact, what you asked for, because you did not ask to suppress the exception in those calls.
If you don't want an exception you need to make sure that the outer calls to read-byte ask for one not to happen, by, for instance:
(with-open-file (in "test.txt" :element-type '(unsigned-byte 8))
(with-open-stream (bin (make-binary-input-stream in))
(values (read-byte bin nil ':eof)
(read-byte bin nil ':eof))))
For a one-byte file this should return the byte in the file and :eof.

I can reproduce the example, and the debugger shows (under Slime):
Backtrace:
0: (READ-BYTE #<BINARY-INPUT-STREAM {1039A8D933}> T NIL)
1: ((LAMBDA ()))
...
By moving the cursor to frame 0 (i.e. read-byte) and pressing v, the editor shows the definition of read-byte in sbcl/src/code/stream.lisp (I build it from source):
(defun read-byte (stream &optional (eof-error-p t) eof-value)
(declare (explicit-check))
(if (ansi-stream-p stream)
(ansi-stream-read-byte stream eof-error-p eof-value nil)
;; must be Gray streams FUNDAMENTAL-STREAM
(let ((byte (stream-read-byte stream)))
(if (eq byte :eof)
(eof-or-lose stream eof-error-p eof-value) ;; <<<< CURSOR HERE
(the integer byte)))))
It turns out :eof is already used by SBCL to indicate the enf of line, and since the top-level call to read-byte does not ignore errors, this cause the condition to be signaled.
Replacing the :eof keyword by another one, say :my-eof, is not good either since the returned value is not a byte. But if you return -1, the test passes (the source stream is a stream of unsigned-bytes, but your wrapper may return -1 without errors).

Related

What exactly does `return` do in AutoHotKey?

In a comment chain, I ask:
Hmm, so the return doesn't play like a closing bracket like in other languages?
To which a user answers:
It's maybe easy to think of return as just something that stops the code execution from going further. #IfDirectives don't care about returns, or anything else related to code execution, because they have nothing to do with code execution. They are kind of just markers that enclose different parts of code within them
I have two questions from this:
If return is just something that stops code execution from going further, then how is it different to exit? They are both flow controls playing a role to determine The Top of the Script (the Auto-execute Section). I see that many people having this problem.
If return doesn't play like a closing bracket, then why does it appear in many places one expects a closing bracket? Especially when there is no subroutine involving in, like what is described in the formal definition. Take this example in Hotkeys:
#n::
Run Notepad
return
In AHK return is what you'd learn return to be from other programming languages. It's just that control flow, and in general the legacy syntax, in AHK is different from what you might be used to from other languages.
How is return different from exit?
If we're not trying to return a value from a function, and are just interested in the control flow, there is not much difference. But the difference is there.
You could indeed end an hotkey label with exit, and it would be the same as ending it with return.
From the documentation:
"If there is no caller to which to return, Return will do an Exit instead.".
I guess it's just convention to end hotkey labels with return.
So basically said, there is difference between the two, if return isn't about to do an exit.
Well when would this be?
For example:
MsgBox, 1
function()
MsgBox, 3
return ;this does an exit
function()
{
MsgBox, 2
return ;this does a return
}
The first return has no caller to which to return to, so it will do an exit.
The second return however does have a caller to return to do, so it will return there, and then execute the 3rd message box.
If we replaced the second return with an exit, the current thread would just be exited and the 3rd message box would never have been shown.
Here's the same exact example with legacy labels:
MsgBox, 1
gosub, label
MsgBox, 3
return ;this does an exit
label:
MsgBox, 2
return ;this does a return
I'm showing this, because they're very close to hotkey labels, and hotkey labels were something you were wondering a lot about.
In this specific example, if the line MsgBox, 3 didn't exist, replacing the second return with an exit would produce the same end result. But only because code execution is about to end anyway on the first return.
If return doesn't play like a closing bracket, then why does it appear in many places one expects a closing bracket?
In AHK v1, hotkeys and hotstrings work like labels, and labels are legacy. Labels don't care about { }s like modern functions do.
So we need something to stop the code execution. Return does that.
Consider the following example:
c::
MsgBox, % "Hotkey triggered!"
gosub, run_chrome
;code execution not ended
n::
MsgBox, % "Hotkey triggered!`nRunning notepad"
Run, notepad
;code execution not ended
run_chrome:
MsgBox, % "Running chrome"
run, chrome
WinWait, ahk_exe chrome.exe
WinMove, ahk_exe chrome.exe, , 500, 500
;code execution not ended
show_system_uptime:
MsgBox, % "System uptime: " A_TickCount " ms"
;code execution not ended
We're not ending code execution at any of the labels. Because of this, code execution will bleed into other parts of the code, in this case, into the other labels.
Try running the hotkeys and see how code execution bleeds into the other labels.
Because of this, the code execution needs to be ended somehow.
If using { } was supported by labels, it would indeed be the solution to keep the code execution where it needs to be.
And in fact AHK v2 hotkeys and hotstrings are no longer labels (they're only lookalikes). In v2 using { } is actually the correct way (the script wont even run if you don't use them).
See hotkeys from the AHK v2 documentation.
Another answer above already gave a good explanation.
So, to me:_
(return vs Exit is like gosub vs goto -- return (literal meaning) vs terminate. )
return completes (ends) the current subroutine, and returns from the current subroutine back to the calling subroutine;
Exit completes (ends) the current subroutine, and terminates (/exits /ends) the current thread (which discards all the previous calling subroutine);
where subroutines are inside a thread.
(subroutines are just like function calls or label calls (or hotkey label calls);
you can visualize this as if method calls (stack frame) in Java (, you can see this when you debug in an IDE))
Quotes:
[]
Return
Returns from a subroutine to which execution had previously jumped via function-call, Gosub, Hotkey activation, GroupActivate, or other means.
https://www.autohotkey.com/docs/commands/Return.htm
[]
Exit
Exits the current thread or (if the script is not persistent) the entire script.
https://www.autohotkey.com/docs/commands/Exit.htm
[]
Gosub
Jumps to the specified label and continues execution until Return is encountered.
https://www.autohotkey.com/docs/commands/Gosub.htm
[]
Goto
Jumps to the specified label and continues execution.
https://www.autohotkey.com/docs/commands/Goto.htm
[]
main difference is gosub comes back, and goto does not.
https://www.autohotkey.com/board/topic/46160-goto-vs-gosub/
[]
A subroutine and a function are essentially the same thing
Difference between subroutine , co-routine , function and thread?
[]
A subroutine is a portion of code which can be called to perform a specific task.
Execution of a subroutine begins at the target of a label and continues until a Return or Exit is encountered.
Since the end of a subroutine depends on flow of control,
any label can act as both a Goto target and the beginning of a subroutine.
https://www.autohotkey.com/docs/misc/Labels.htm#subroutines
[]
The current thread is defined as the flow of execution invoked by the most recent event;
examples include hotkeys, SetTimer subroutines, custom menu items, and GUI events.
The current thread can be executing commands within its own subroutine or within other subroutines called by that subroutine.
https://www.autohotkey.com/docs/misc/Threads.htm
(you may try out the following code to see the difference)
^r::
MsgBox, aaa
gosub, TestLabel
MsgBox, bbb
goto, TestLabel
MsgBox, ccc
return
TestLabel:
MsgBox, inside
return
; Exit
Return returns from a function (called with funcname(...) syntax) or subroutine (called with Gosub label). Subroutines/functions can call other subroutines/functions and so it's convenient to end them with Return rather than Exit. Exit ends the current thread of execution and won't transfer flow of control back to the caller.
For functions, Return also can return a value.
Return is not analogous to a closing bracket. Closing brackets can occur only at the end of a function and imply a Return but you can also Return anywhere in a function. It is common to return in an If statement to "bail out" of a function early, perhaps if some parameter had an invalid value.

Erlang: variable is unbound

Why is the following saying variable unbound?
9> {<<A:Length/binary, Rest/binary>>, Length} = {<<1,2,3,4,5>>, 3}.
* 1: variable 'Length' is unbound
It's pretty clear that Length should be 3.
I am trying to have a function with similar pattern matching, ie.:
parse(<<Body:Length/binary, Rest/binary>>, Length) ->
But if fails with the same reason. How can I achieve the pattern matching I want?
What I am really trying to achieve is parse in incoming tcp stream packets as LTV(Length, Type, Value).
At some point after I parse the the Length and the Type, I want to ready only up to Length number of bytes as the value, as the rest will probably be for the next LTV.
So my parse_value function is like this:
parse_value(Value0, Left, Callback = {Module, Function},
{length, Length, type, Type, value, Value1}) when byte_size(Value0) >= Left ->
<<Value2:Left/binary, Rest/binary>> = Value0,
Module:Function({length, Length, type, Type, value, lists:reverse([Value2 | Value1])}),
if
Rest =:= <<>> ->
{?MODULE, parse, {}};
true ->
parse(Rest, Callback, {})
end;
parse_value(Value0, Left, _, {length, Length, type, Type, value, Value1}) ->
{?MODULE, parse_value, Left - byte_size(Value0), {length, Length, type, Type, value, [Value0 | Value1]}}.
If I could do the pattern matching, I could break it up to something more pleasant to the eye.
The rules for pattern matching are that if a variable X occurs in two subpatterns, as in {X, X}, or {X, [X]}, or similar, then they have to have the same value in both positions, but the matching of each subpattern is still done in the same input environment - bindings from one side do not carry over to the other. The equality check is conceptually done afterwards, as if you had matched on {X, X2} and added a guard X =:= X2. This means that your Length field in the tuple cannot be used as input to the binary pattern, not even if you make it the leftmost element.
However, within a binary pattern, variables bound in a field can be used in other fields following it, left-to-right. Therefore, the following works (using a leading 32-bit size field in the binary):
1> <<Length:32, A:Length/binary, Rest/binary>> = <<0,0,0,3,1,2,3,4,5>>.
<<0,0,0,3,1,2,3,4,5>>
2> A.
<<1,2,3>>
3> Rest.
<<4,5>>
I've run into this before. There is some weirdness between what is happening inside binary syntax and what happens during unification (matching). I suspect that it is just that binary syntax and matching occur at different times in the VM somewhere (we don't know which Length is failing to get assigned -- maybe binary matching is always first in evaluation, so Length is still meaningless). I was once going to dig in and find out, but then I realized that I never really needed to solve this problem -- which might be why it was never "solved".
Fortunately, this won't stop you with whatever you are doing.
Unfortunately, we can't really help further unless you explain the context in which you think this kind of a match is a good idea (you are having an X-Y problem).
In binary parsing you can always force the situation to be one of the following:
Have a fixed-sized header at the beginning of the binary message that tells you the next size element you need (and from there that can continue as a chain of associations endlessly)
Inspect the binary once on entry to determine the size you are looking for, pull that one value, and then begin the real parsing task
Have a set of fields, all of predetermined sizes that conform to some a binary schema standard
Convert the binary to a list and iterate through it with any arbitrary amount of look-ahead and backtracking you might need
Quick Solution
Without knowing anything else about your general problem, a typical solution would look like:
parse(Length, Bin) ->
<<Body:Length/binary, Rest/binary>> = Bin,
ok = do_something(Body),
do_other_stuff(Rest).
But I smell something funky here.
Having things like this in your code is almost always a sign that a more fundamental aspect of the code structure is not in agreement with the data that you are handling.
But deadlines.
Erlang is all about practical code that satisfies your goals in the real world. With that in mind, I suggest that you do something like the above for now, and then return to this problem domain and rethink it. Then refactor it. This will gain you three benefits:
Something will work right away.
You will later learn something fundamental about parsing in general.
Your code will almost certainly run faster if it fits your data better.
Example
Here is an example in the shell:
1> Parse =
1> fun
1> (Length, Bin) when Length =< byte_size(Bin) ->
1> <<Body:Length/binary, Rest/binary>> = Bin,
1> ok = io:format("Chopped off ~p bytes: ~p~n", [Length, Body]),
1> Rest;
1> (Length, Bin) ->
1> ok = io:format("Binary shorter than ~p~n", [Length]),
1> Bin
1> end.
#Fun<erl_eval.12.87737649>
2> Parse(3, <<1,2,3,4,5>>).
Chopped off 3 bytes: <<1,2,3>>
<<4,5>>
3> Parse(8, <<1,2,3,4,5>>).
Binary shorter than 8
<<1,2,3,4,5>>
Note that this version is a little safer, in that we avoid a crash in the case that Length is longer than the binary. This is yet another good reason why maybe we can't do that match in the function head.
Try with below code:
{<<A:Length/binary, Rest/binary>>, _} = {_, Length} = {<<1,2,3,4,5>>, 3}.
This question is mentioned a bit in EEP-52:
Any variables used in the expression must have been previously bound, or become bound in the same binary pattern as the expression. That is, the following example is illegal:
illegal_example2(N, <<X:N,T/binary>>) ->
{X,T}.
And explained a bit more in the following e-mail: http://erlang.org/pipermail/eeps/2020-January/000636.html
Illegal. With one exception, matching is not done in a left-to-right
order, but all variables in the pattern will be bound at the same
time. That means that the variables must be bound before the match
starts. For maps, that means that the variables referenced in key
expressions must be bound before the case (or receive) that matches
the map. In a function head, all map keys must be literals.
The exception to this general rule is that within a binary pattern,
the segments are matched from left to right, and a variable bound in a
previous segment can be used in the size expression for a segment
later in the binary pattern.
Also one of the members of OTP team mentioned that they made a prototype that can do that, but it was never finished http://erlang.org/pipermail/erlang-questions/2020-May/099538.html
We actually tried to make your example legal. The transformation of
the code that we did was not to rewrite to guards, but to match
arguments or parts of argument in the right order so that variables
that input variables would be bound before being used. (We would do a
topological sort to find the correct order.) For your example, the
transformation would look similar to this:
legal_example(Key, Map) ->
case Map of
#{Key := Value} -> Value;
_ -> error(function_clause, [Key, Map])
end.
In the prototype implementation, the compiler could compile the
following example:
convoluted(Ref,
#{ node(Ref) := NodeId, Loop := universal_answer},
[{NodeId, Size} | T],
<<Int:(Size*8+length(T)),Loop>>) when is_reference(Ref) ->
Int.
Things started to fall apart when variables are repeated. Repeated
variables in patterns already have a meaning in Erlang (they should be
the same), so it become tricky to understand to distinguish between
variables being bound or variables being used a binary size or map
key. Here is an example that the prototype couldn't handle:
foo(#{K := K}, K) -> ok.
A human can see that it should be transformed similar to this:
foo(Map, K) -> case Map of
{K := V} when K =:= V -> ok end.
Here are few other examples that should work but the prototype would
refuse to compile (often emitting an incomprehensible error message):
bin2(<<Sz:8,X:Sz>>, <<Y:Sz>>) -> {X,Y}.
repeated_vars(#{K := #{K := K}}, K) -> K.
match_map_bs(#{K1 := {bin,<<Int:Sz>>}, K2 := <<Sz:8>>}, {K1,K2}) ->
Int.
Another problem was when example was correctly rejected, the error
message would be confusing.
Because much more work would clearly be needed, we have shelved the
idea for now. Personally, I am not sure that the idea is sound in the
first place. But I am sure of one thing: the implementation would be
very complicated.
UPD: latest news from 2020-05-14

Inconsistent behaviour of closures in Common Lisp [duplicate]

Can someone explain the following behavior? Specifically, why does the function return a different list every time? Why isn't some-list initialized to '(0 0 0) every time the function is called?
(defun foo ()
(let ((some-list '(0 0 0)))
(incf (car some-list))
some-list))
Output:
> (foo)
(1 0 0)
> (foo)
(2 0 0)
> (foo)
(3 0 0)
> (foo)
(4 0 0)
Thanks!
EDIT:
Also, what is the recommended way of implementing this function, assuming I want the function to output '(1 0 0) every time?
'(0 0 0) is a literal object, which is assumed to be a constant (albeit not protected from modification). So you're effectively modifying the same object every time. To create different objects at each function call use (list 0 0 0).
So unless you know, what you're doing, you should always use literal lists (like '(0 0 0)) only as constants.
On a side note, defining this function in the sbcl REPL you get the following warning:
caught WARNING:
Destructive function SB-KERNEL:%RPLACA called on constant data.
See also:
The ANSI Standard, Special Operator QUOTE
The ANSI Standard, Section 3.2.2.3
Which gives a good hint towards the problem at hand.
'(0 0 0) in code is literal data. Modifying this data has undefined behavior. Common Lisp implementations may not detect it at runtime (unless data is for example placed in some read-only memory space). But it can have undesirable effects.
you see that this data may be (and often is) shared across various invocations of the same function
one of the more subtle possible errors is this: Common Lisp has been defined with various optimizations which can be done by a compiler in mind. For example a compiler is allowed to reuse data:
Example:
(let ((a '(1 2 3))
(b '(1 2 3)))
(list a b))
In above code snippet the compiler may detect that the literal data of a and b is EQUAL. It may then have both variables point to the same literal data. Modifying it may work, but the change is visible from a and b.
Summary: Modification of literal data is a source of several subtle bugs. Avoid it if possible. Then you need to cons new data objects. Consing in general means the allocation of fresh, new data structures at runtime.
Wanted to write one myself, but I found a good one online:
CommonLisp has first class functions, i.e. functions are objects which
can be created at runtime, and passed as arguments to other functions.
--AlainPicard These first-class functions also have their own state, so they are functors. All Lisp functions are functors; there is no
separation between functions that are "just code" and "function
objects". The state takes the form of captured lexical variable
bindings. You don't need to use LAMBDA to capture bindings; a
top-level DEFUN can do it too: (let ((private-variable 42))
(defun foo ()
...))
The code in the place of ... sees private-variable in its lexical
scope. There is one instance of this variable associated with the one
and only function object that is globally tied to the symbol FOO; the
variable is captured at the time the DEFUN expression is evaluated.
This variable then acts something like a static variable in C. Or,
alternately, you can think of FOO as a "singleton" object with an
"instance variable".
--KazKylheku
Ref
http://c2.com/cgi/wiki?CommonLisp

Why does return/redo evaluate result functions in the calling context, but block results are not evaluated?

Last night I learned about the /redo option for when you return from a function. It lets you return another function, which is then invoked at the calling site and reinvokes the evaluator from the same position
>> foo: func [a] [(print a) (return/redo (func [b] [print b + 10]))]
>> foo "Hello" 10
Hello
20
Even though foo is a function that only takes one argument, it now acts like a function that took two arguments. Something like that would otherwise require the caller to know you were returning a function, and that caller would have to manually use the do evaluator on it.
Thus without return/redo, you'd get:
>> foo: func [a] [(print a) (return (func [b] [print b + 10]))]
>> foo "Hello" 10
Hello
== 10
foo consumed its one parameter and returned a function by value (which was not invoked, thus the interpreter moved on). Then the expression evaluated to 10. If return/redo did not exist you'd have had to write:
>> do foo "Hello" 10
Hello
20
This keeps the caller from having to know (or care) if you've chosen to return a function to execute. And is cool because you can do things like tail call optimization, or writing a wrapper for the return functionality itself. Here's a variant of return that prints a message but still exits the function and provides the result:
>> myreturn: func [] [(print "Leaving...") (return/redo :return)]
>> foo: func [num] [myreturn num + 10]
>> foo 10
Leaving...
== 20
But functions aren't the only thing that have behavior in do. So if this is a general pattern for "removing the need for a DO at the callsite", then why doesn't this print anything?
>> test: func [] [return/redo [print "test"]]
>> test
== [print "test"]
It just returned the block by value, like a normal return would have. Shouldn't it have printed out "test"? That's what do would...uh, do with it:
>> do [print "test"]
test
The short answer is because it is generally unnecessary to evaluate a block at the call point, because blocks in Rebol don't take parameters so it mostly doesn't matter where they are evaluated. However, that "mostly" may need some explanation...
It comes down to two interesting features of Rebol: static binding, and how do of a function works.
Static Binding and Scopes
Rebol doesn't have scoped word bindings, it has static direct word bindings. Sometimes it seems like we have lexical scope, but we really fake that by updating the static bindings each time we're building a new "scoped" code block. We can also rebind words manually whenever we want.
What that means for us in this case though, is that once a block exists, its bindings and values are static - they're not affected by where the block is physically located, or where it is being evaluated.
However, and this is where it gets tricky, function contexts are weird. While the bindings of words bound to a function context are static, the set of values assigned to those words are dynamically scoped. It's a side effect of how code is evaluated in Rebol: What are language statements in other languages are functions in Rebol, so a call to if, for instance, actually passes a block of data to the if function which if then passes to do. That means that while a function is running, do has to look up the values of its words from the call frame of the most recent call to the function that hasn't returned yet.
This does mean that if you call a function and return a block of code with words bound to its context, evaluating that block will fail after the function returns. However, if your function calls itself and that call returns a block of code with its words bound to it, evaluating that block before your function returns will make it look up those words in the call frame of the current call of your function.
This is the same for whether you do or return/redo, and affects inner functions as well. Let me demonstrate:
Function returning code that is evaluated after the function returns, referencing a function word:
>> a: 10 do do has [a] [a: 20 [a]]
** Script error: a word is not bound to a context
** Where: do
** Near: do do has [a] [a: 20 [a]]
Same, but with return/redo and the code in a function:
>> a: 10 do has [a] [a: 20 return/redo does [a]]
** Script error: a word is not bound to a context
** Where: function!
** Near: [a: 20 return/redo does [a]]
Code do version, but inside an outer call to the same function:
>> do f: function [x] [a: 10 either zero? x [do f 1] [a: 20 [a]]] 0
== 10
Same, but with return/redo and the code in a function:
>> do f: function [x] [a: 10 either zero? x [f 1] [a: 20 return/redo does [a]]] 0
== 10
So in short, with blocks there is usually no advantage to doing the block elsewhere than where it is defined, and if you want to it is easier to use another call to do instead. Self-calling recursive functions that need to return code to be executed in outer calls of the same function are an exceedingly rare code pattern that I have never seen used in Rebol code at all.
It could be possible to change return/redo so it would handle blocks as well, but it probably isn't worth the increased overhead to return/redo to add a feature that is only useful in rare circumstances and already has a better way to do it.
However, that brings up an interesting point: If you don't need return/redo for blocks because do does the same job, doesn't the same apply to functions? Why do we need return/redo at all?
How DO of a Function Works
Basically, we have return/redo because it uses exactly the same code that we use to implement do of a function. You might not realize it, but do of a function is really unusual.
In most programming languages that can call a function value, you have to pass the parameters to the function as a complete set, sort of how R3's apply function works. Regular Rebol function calling causes some unknown-ahead-of-time number of additional evaluations to happen for its arguments using unknown-ahead-of-time evaluation rules. The evaluator figures out these evaluation rules at runtime and just passes the results of the evaluation to the function. The function itself doesn't handle the evaluation of its parameters, or even necessarily know how those parameters were evaluated.
However, when you do a function value explicitly, that means passing the function value to a call to another function, a regular function named do, and then that magically causes the evaluation of additional parameters that weren't even passed to the do function at all.
Well it's not magic, it's return/redo. The way do of a function works is that it returns a reference to the function in a regular shortcut-return value, with a flag in the shortcut-return value that tells the interpreter that called do to evaluate the returned function as if it were called right there in the code. This is basically what is called a trampoline.
Here's where we get to another interesting feature of Rebol: The ability to shortcut-return values from a function is built into the evaluator, but it doesn't actually use the return function to do it. All of the functions you see from Rebol code are wrappers around the internal stuff, even return and do. The return function we call just generates one of those shortcut-return values and returns it; the evaluator does the rest.
So in this case, what really happened is that all along we had code that did what return/redo does internally, but Carl decided to add an option to our return function to set that flag, even though the internal code doesn't need return to do so because the internal code calls the internal function. And then he didn't tell anyone that he was making the option externally available, or why, or what it did (I guess you can't mention everything; who has the time?). I have the suspicion, based on conversations with Carl and some bugs we've been fixing, that R2 handled do of a function differently, in a way that would have made return/redo impossible.
That does mean that the handling of return/redo is pretty thoroughly oriented towards function evaluation, since that is its entire reason for existing at all. Adding any overhead to it would add overhead to do of a function, and we use that a lot. Probably not worth extending it to blocks, given how little we'd gain and how rarely we'd get any benefit at all.
For return/redo of a function though, it seems to be getting more and more useful the more we think about it. In the last day we've come up with all sorts of tricks that this enables. Trampolines are useful.
While the question originally asked why return/redo did not evaluate blocks, there were also formulations like: "is cool because you can do things like tail call optimization", "[can write] a wrapper for the return functionality", "it seems to be getting more and more useful the more we think about it".
I do not think these are true. My first example demonstrates a case where return/redo can really be used, an example being in the "area of expertise" of return/redo, so to speak. It is a variadic sum function called sumn:
use [result collect process] [
collect: func [:value [any-type!]] [
unless value? 'value [return process result]
append/only result :value
return/redo :collect
]
process: func [block [block!] /local result] [
result: 0
foreach value reduce block [result: result + value]
result
]
sumn: func [] [
result: copy []
return/redo :collect
]
]
This is the usage example:
>> sumn 1 * 2 2 * 3 4
== 12
Variadic functions taking "unlimited number" of arguments are not as useful in Rebol as it may look at the first sight. For example, if we wanted to use the sumn function in a small script, we would have to wrap it into a paren to indicate where it should stop collecting arguments:
result: (sumn 1 * 2 2 * 3 4)
print result
This is not any better than using a more standard (non-variadic) alternative called e.g. block-sum and taking just one argument, a block. The usage would be like
result: block-sum [1 * 2 2 * 3 4]
print result
Of course, if the function can somehow detect what is its last argument without needing enclosing paren, we really gain something. In this case we could use the #[unset!] value as the sumn stopping argument, but that does not spare typing either:
result: sumn 1 * 2 2 * 3 4 #[unset!]
print result
Seeing the example of a return wrapper I would say that return/redo is not well suited for return wrappers, return wrappers being outside of its area of expertise. To demonstrate that, here is a return wrapper written in Rebol 2 that actually is outside of return/redo's area of expertise:
myreturn: func [
{my RETURN wrapper returning the string "indefinite" instead of #[unset!]}
; the [throw] attribute makes this function a RETURN wrapper in R2:
[throw]
value [any-type!] {the value to return}
] [
either value? 'value [return :value] [return "indefinite"]
]
Testing in R2:
>> do does [return #[unset!]]
>> do does [myreturn #[unset!]]
== "indefinite"
>> do does [return 1]
== 1
>> do does [myreturn 1]
== 1
>> do does [return 2 3]
== 2
>> do does [myreturn 2 3]
== 2
Also, I do not think it is true that return/redo helps with tail call optimizations. There are examples how tail calls can be implemented without using return/redo at the www.rebol.org site. As said, return/redo was tailor-made to support implementation of variadic functions and it is not flexible enough for other purposes as far as argument passing is concerned.

vimscript: switch to buffer by filename path

I have a vimscript which needs to switch to a particular buffer. That buffer will be specified by either full path, partial path, or just its name.
For example:
I am in the directory /home/user/code and I have 3 vim buffers open foo.py src/foo.py and src/bar.py.
If the script was told to switch to buffer /home/user/code/foo.py it would switch to buffer foo.py.
If it were told to switch to user/code/src/foo.py it would switch to buffer src/foo.py
If it were told to switch to foo.py it would switch to buffer foo.py
If it were told to swith to bar.py it would switch to buffer src/bar.py
The simplest solution I can see is to somehow get a list of the buffers stored in a variable and use trial and error.
It would be nice if the solution was cross platform, but it needs to at least run on Linux.
The bufname() / bufnr() functions can lookup loaded buffers by partial filename. You can anchor the match to the end by appending a $, like this:
echo bufnr('/src/foo.py$')
I found a way to do this using python in a vimscript. With python I was able to get the names of all the buffers from vim.buffers[i].name and used os.path and os.sep to process which buffer to switch to.
In the end, I decided that it would be more helpful for it to refuse to do anything if the buffer it was requested to switch to was ambiguous.
Here it is:
"Given a file, full path, or partial path, this will try to change to the
"buffer which may match that file. If no buffers match, it returns 1. If
"multiple buffers match, it returns 2. It returns 0 on success
function s:GotoBuffer(buf)
python << EOF
import vim, os
buf = vim.eval("a:buf")
#split the paths into lists of their components and reverse.
#e.g. foo/bar/baz.py becomes ['foo', 'bar', 'baz.py']
buf_path = os.path.normpath(buf).split(os.sep)[::-1]
buffers = [os.path.normpath(b.name).split(os.sep)[::-1] for b in vim.buffers]
possible_buffers = range(len(buffers))
#start eliminating incorrect buffers by their filenames and paths
for component in xrange(len(buf_path)):
for b in buffers:
if len(b)-1 >= component and b[component] != buf_path[component]:
#This buffer doesn't match. Eliminate it as a posibility.
i = buffers.index(b)
if i in possible_buffers: possible_buffers.remove(i)
if len(possible_buffers) > 1: vim.command("return 2")
#delete the next line to allow ambiguous switching
elif not possible_buffers: vim.command("return 1")
else:
vim.command("buffer " + str(possible_buffers[-1] + 1))
EOF
endfunction
EDIT: The above code seems to have some bugs. I am not going to fix them because there is another answer which is much better.

Resources