.fsx script ignoring a function call when I add a parameter to it - f#

Alright, so I'm a happy fsx-script programmer, because I love how I can have the compiler shout at me when I do mistakes before they show up at runtime.
However I've found a case which really bothers me because I was expecting that by doing some refactoring (i.e.: adding an argument to a function) I was going to be warned by the compiler about all the places where I need to put the new argument. But, not only this did not happen, fsharpi ran my script and ignored the function call completely!! :(
How can I expect to refactor my scripts if this happens?
Here is my code:
let Foo (bar: string) =
Console.WriteLine("I received " + bar)
Foo("hey")
It works.
Now, later, I decide to add a second argument to the function (but I forget to add the argument to all the calls to it):
let Foo (bar: string) (baz: bool) =
Console.WriteLine("I received " + bar)
Foo("hey")
The result of this is: instead of the compiler telling me that I'm missing an argument, it is fsharpi running the script and ignoring the call to Foo! Why?
PS: I know the difference between currying and tuples, so I know Foo("hey") becomes a function (instead of a function call), because of partial application. But I want to understand better why the compiler is not expecting a function evaluation here, instead of seeing a function and ignoring it. Can I enable a warningAsError somehow? I would like to avoid resorting to using tuples in order to workaround this problem.

The fsharpi (or fsi if you're on Windows) interpreter makes no distinction between running a script and typing code at the interactive prompt (or, most often, submitting code from your editor via a select-and-hit-Alt-Enter keyboard shortcut).
Therefore, if you got what you're asking for -- fsharpi issuing a warning whenever a script line has a return value that isn't () -- it would ruin the value of fsharpi for the most common use case, which is people using an interactive fsharpi session to test their code, and rapidly iterate through non-working prototypes to get to one that works correctly. This is one of F#'s great strengths, and giving you what you're asking for would eliminate that strength. It is therefore never going to happen.
BUT... that doesn't mean that you're sunk. If you have functions that return unit, and you want fsharpi to give you a compile-time error when you refactor them to take more arguments, you can do it this way. Replace all occurrences of:
Foo("hey")
with:
() = Foo("hey")
As long as the function Foo has only one argument (and returns null), this will evaluate to true; the true value will be happily ignored by fsharpi, and your script will run. However, if you then change Foo to take two arguments, so that Foo("hey") now returns a function, the () = Foo("hey") line will no longer compile, and you'll get an error like:
error FS0001: This expression was expected to have type
unit
but here has type
'a -> unit
So if you want fsharpi to refuse to compile your script when you refactor a function, go through and change your calls to () = myfunc arg1 arg2. For functions that don't return unit, make the value you're testing against a value of that function's return type. For example, given this function:
let f x = x * 2
You could do
0 = f 5
This will be false, of course, but it will compile. But if you refactor f:
let f x y = x * 2 + y
Now the line 0 = f 5 will not compile, but will give you the error message:
error FS0001: This expression was expected to have type
int
but here has type
int -> int
To summarize: you won't ever get the feature you're looking for, because it would harm the language. But with a bit of work, you can do something that fits your needs.
Or in other words, as the famous philosopher Mick Jagger once put it:
You can't always get what you want. But if you try, sometimes you might find you get what you need.

Related

I don't understand this map tuple key compilation error, in F#

Here is a function:
let newPositions : PositionData list =
positions
|> List.filter (fun x ->
let key = (x.Instrument, x.Side)
match brain.Positions.TryGetValue key with
| false, _ ->
// if we don't know the position, it's new
true
| true, p when x.UpdateTime > p.UpdateTime ->
// it's newer than the version we have, it's new
true
| _ ->
false
)
it compiles at expected.
let's focus on two lines:
let key = (x.Instrument, x.Side)
match brain.Positions.TryGetValue key with
brain.Positions is a Map<Instrument * Side, PositionData> type
if I modify the second line to:
match brain.Positions.TryGetValue (x.Instrument, x.Side) with
then the code will not compile, with error:
[FS0001] This expression was expected to have type
'Instrument * Side'
but here has type
'Instrument'
but:
match brain.Positions.TryGetValue ((x.Instrument, x.Side)) with
will compile...
why is that?
This is due to method call syntax.
TryGetValue is not a function, but a method. A very different thing, and a much worse thing in general. And subject to some special syntactic rules.
This method, you see, actually has two parameters, not one. The first parameter is a key, as you expect. And the second parameter is what's known in C# as out parameter - i.e. kind of a second return value. The way it was originally meant to be called in C# is something like this:
Dictionary<int, string> map = ...
string val;
if (map.TryGetValue(42, out val)) { ... }
The "regular" return value of TryGetValue is a boolean signifying whether the key was even found. And the "extra" return value, denoted here out val, is the value corresponding to the key.
This is, of course, extremely awkward, but it did not stop the early .NET libraries from using this pattern very widely. So F# has special syntactic sugar for this pattern: if you pass just one parameter, then the result becomes a tuple consisting of the "actual" return value and the out parameter. Which is what you're matching against in your code.
But of course, F# cannot prevent you from using the method exactly as designed, so you're free to pass two parameters as well - the first one being the key and the second one being a byref cell (which is F# equivalent of out).
And here is where this clashes with the method call syntax. You see, in .NET all methods are uncurried, meaning their arguments are all effectively tupled. So when you call a method, you're passing a tuple.
And this is what happens in this case: as soon as you add parentheses, the compiler interprets that as an attempt to call a .NET method with tupled arguments:
brain.Positions.TryGetValue (x.Instrument, x.Side)
^ ^
first arg |
second arg
And in this case it expects the first argument to be of type Instrument * Side, but you're clearly passing just an Instrument. Which is exactly what the error message tells you: "expected to have type 'Instrument * Side'
but here has type 'Instrument'".
But when you add a second pair of parens, the meaning changes: now the outer parens are interpreted as "method call syntax", and the inner parens are interpreted as "denoting a tuple". So now the compiler interprets the whole thing as just a single argument, and all works as before.
Incidentally, the following will also work:
brain.Positions.TryGetValue <| (x.Instrument, x.Side)
This works because now it's no longer a "method call" syntax, because the parens do not immediately follow the method name.
But a much better solution is, as always, do not use methods, use functions instead!
In this particular example, instead of .TryGetValue, use Map.tryFind. It's the same thing, but in proper function form. Not a method. A function.
brain.Positions |> Map.tryFind (x.Instrument, x.Side)
Q: But why does this confusing method even exist?
Compatibility. As always with awkward and nonsensical things, the answer is: compatibility.
The standard .NET library has this interface System.Collections.Generic.IDictionary, and it's on that interface that the TryGetValue method is defined. And every dictionary-like type, including Map, is generally expected to implement that interface. So here you go.
In future, please consider the Stack Overflow guidelines provided under How to create a Minimal, Reproducible Example. Well, minimal and reproducible the code in your question is, but it shall also be complete...
…Complete – Provide all parts someone else needs to reproduce your
problem in the question itself
That being said, when given the following definitions, your code will compile:
type Instrument() = class end
type Side() = class end
type PositionData = { Instrument : Instrument; Side : Side; }
with member __.UpdateTime = 0
module brain =
let Positions = dict[(Instrument(), Side()), {Instrument = Instrument(); Side = Side()}]
let positions = []
Now, why is that? Technically, it is because of the mechanism described in the F# 4.1 Language Specification under §14.4 Method Application Resolution, 4. c., 2nd bullet point:
If all formal parameters in the suffix are “out” arguments with byref
type, remove the suffix from UnnamedFormalArgs and call it
ImplicitlyReturnedFormalArgs.
This is supported by the signature of the method call in question:
System.Collections.Generic.IDictionary.TryGetValue(key: Instrument * Side, value: byref<PositionData>)
Here, if the second argument is not provided, the compiler does the implicit conversion to a tuple return type as described in §14.4 5. g.
You are obviously familiar with this behaviour, but maybe not with the fact that if you specify two arguments, the compiler will see the second of them as the explicit byref "out" argument, and complains accordingly with its next error message:
Error 2 This expression was expected to have type
PositionData ref
but here has type
Side
This misunderstanding changes the return type of the method call from bool * PositionData to bool, which consequently elicits a third error:
Error 3 This expression was expected to have type
bool
but here has type
'a * 'b
In short, your self-discovered workaround with double parentheses is indeed the way to tell the compiler: No, I am giving you only one argument (a tuple), so that you can implicitly convert the byref "out" argument to a tuple return type.

Erlang Doesn't Warn About Unused Function Argument

If I declare a function
test(A) -> 3.
Erlang generates a warning about variable A not being used. However the definition
isEqual(X,X) -> 1.
Doesn't produce any warning but
isEqual(X,X) -> 1;
isEqual(X,Y) -> 0.
again produces a warning but only for the second line.
The reason why that doesn't generate a warning is because in the second case you are asserting (through pattern matching), by using the same variable name, that the first and second arguments to isEqual/2 have the same value. So you are actually using the value of the argument.
It might help to understand better if we look at the Core Erlang code produced from is_equal/2. You can get .core source files by compiling your .erl file in the following way: erlc +to_core pattern.erl (see here for pattern.erl).
This will produce a pattern.core file that will look something like this (module_info/[0,1] functions removed):
module 'pattern' ['is_equal'/2]
attributes []
'is_equal'/2 = fun (_cor1,_cor0) ->
case <_cor1,_cor0> of
%% Line 5
<X,_cor4> when call 'erlang':'=:=' (_cor4, X) ->
1
%% Line 6
<X,Y> when 'true' ->
0
end
As you can see, each function clause from is_equal/2 in the .erl source code gets translated to a case clause in Core Erlang. X does get used in the first clause since it needs to be compared to the other argument. On the other hand neither X or Y are used in the second clause.

How to refactor a function using "ignore"

When should I use "ignore" instead of "()"?
I attempted to write the following:
let log = fun data medium -> ()
I then received the following message:
Lint: 'fun _ -> ()' might be able to be refactored into 'ignore'.
So I updated the declaration to the following:
let log = fun data medium -> ignore
Is there any guidance on why I might use one over the other?
My gut tells me that I should use ignore when executing an actual expression.
In this case though, I'm declaring a high-order function.
Are my assumptions accurate?
The linter message that you got here is a bit confusing. The ignore function is just a function that takes anything and returns unit:
let ignore = fun x -> ()
Your log function is a bit similar to ignore, but it takes two parameters:
let log = fun data medium -> ()
In F#, this is actually a function that returns another function (currying). You can write this more explicitly by saying:
let log = fun data -> fun medium -> ()
Now, you can see that a part of your function is actually the same thing as ignore. You can write:
let log = fun data -> ignore
This means the same thing as your original function and this is what the linter is suggesting. I would not write the code in this way, because it is less obvious what the code does (it actually takes two arguments) - I guess the linter is looking just for the simple pattern, ignoring the fact that sometimes the refactoring is not all that useful.
Never, at least not in the way shown in the question.
Substituting between ignore and () is not meaningful, as they are different concepts:
ignore is a generic function with one argument and unit return. Its type is 'T -> unit.
() is the only valid value of type unit. It is not a function at all.
Therefore, it's not valid to do the refactor shown in the question. The first version of log takes two curried arguments, while the second version takes three.
What Lint is trying to suggest isn't quite clear. ignore is a function with one argument; it's not obvious how (or why) it should be used to refactor a method that takes two curried arguments. fun _ _ -> () would be an okay and quite readable way to ignore two arguments.

Why should successive arguments involving method application be parenthesized?

Suppose the following F# function:
let f (x:int) (y:int) = 42
I suspect that the reason I need to parenthesize the arguments in example z2 below is because of type inference; my example might not be great, but it's easy to imagine how things could get very hairy:
let z1 = f 2 3
let z2 = f 2 (f 3 5)
However, the following case is less clear to me:
let rng = System.Random()
let z3 = f 1 rng.Next(5)
z3 doesn't work, with a clear error message:
error FS0597: Successive arguments should be separated by spaces or
tupled, and arguments involving function or method applications should
be parenthesized.
Fixing it is trivial (parenthesize all the things), but what I am not clear about is why such an expression is a problem. I assume this has to do with type inference again, but naively, it seems to me that here, methods having a list of arguments surrounded by a parenthesis would actually make things less potentially ambiguous. Does this have to do with the fact that rng.Next(5) is equivalent to rng.Next 5?
Can someone hint, give an example or explain why this rule is needed, or what type of problems would arise if it were not there?
I think that the problem here is that the code could be treated as:
let z3 = f 1 rng.Next (5)
This would be equivalent to omitting the parentheses and so it would be calling f with 3 arguments (the second being a function value). This sounds a bit silly, but the compiler actually does not strictly insist on having a space between parameters. For example:
let second a b = b
add 5(1) // This works fine and calls 'add 5 1'
add id(1) // error FS0597
add rng.Next(5) // error FS0597
add (rng.Next(5)) // This works fine (partial application)
I think the problem is that if you look at the sequence of the 4 examples in the above snippet, it is not clear which behavior should you get in the second and the third case.
The call rng.Next(5) is still treated in a special way, because F# allows you to chain calls if they are formed by single-parameter application without space. For example rng.Next(5).ToString(). But, for example, writing second(1)(2) is allowed, but second(1)(2).ToString() will not work.

Why does return/redo evaluate result functions in the calling context, but block results are not evaluated?

Last night I learned about the /redo option for when you return from a function. It lets you return another function, which is then invoked at the calling site and reinvokes the evaluator from the same position
>> foo: func [a] [(print a) (return/redo (func [b] [print b + 10]))]
>> foo "Hello" 10
Hello
20
Even though foo is a function that only takes one argument, it now acts like a function that took two arguments. Something like that would otherwise require the caller to know you were returning a function, and that caller would have to manually use the do evaluator on it.
Thus without return/redo, you'd get:
>> foo: func [a] [(print a) (return (func [b] [print b + 10]))]
>> foo "Hello" 10
Hello
== 10
foo consumed its one parameter and returned a function by value (which was not invoked, thus the interpreter moved on). Then the expression evaluated to 10. If return/redo did not exist you'd have had to write:
>> do foo "Hello" 10
Hello
20
This keeps the caller from having to know (or care) if you've chosen to return a function to execute. And is cool because you can do things like tail call optimization, or writing a wrapper for the return functionality itself. Here's a variant of return that prints a message but still exits the function and provides the result:
>> myreturn: func [] [(print "Leaving...") (return/redo :return)]
>> foo: func [num] [myreturn num + 10]
>> foo 10
Leaving...
== 20
But functions aren't the only thing that have behavior in do. So if this is a general pattern for "removing the need for a DO at the callsite", then why doesn't this print anything?
>> test: func [] [return/redo [print "test"]]
>> test
== [print "test"]
It just returned the block by value, like a normal return would have. Shouldn't it have printed out "test"? That's what do would...uh, do with it:
>> do [print "test"]
test
The short answer is because it is generally unnecessary to evaluate a block at the call point, because blocks in Rebol don't take parameters so it mostly doesn't matter where they are evaluated. However, that "mostly" may need some explanation...
It comes down to two interesting features of Rebol: static binding, and how do of a function works.
Static Binding and Scopes
Rebol doesn't have scoped word bindings, it has static direct word bindings. Sometimes it seems like we have lexical scope, but we really fake that by updating the static bindings each time we're building a new "scoped" code block. We can also rebind words manually whenever we want.
What that means for us in this case though, is that once a block exists, its bindings and values are static - they're not affected by where the block is physically located, or where it is being evaluated.
However, and this is where it gets tricky, function contexts are weird. While the bindings of words bound to a function context are static, the set of values assigned to those words are dynamically scoped. It's a side effect of how code is evaluated in Rebol: What are language statements in other languages are functions in Rebol, so a call to if, for instance, actually passes a block of data to the if function which if then passes to do. That means that while a function is running, do has to look up the values of its words from the call frame of the most recent call to the function that hasn't returned yet.
This does mean that if you call a function and return a block of code with words bound to its context, evaluating that block will fail after the function returns. However, if your function calls itself and that call returns a block of code with its words bound to it, evaluating that block before your function returns will make it look up those words in the call frame of the current call of your function.
This is the same for whether you do or return/redo, and affects inner functions as well. Let me demonstrate:
Function returning code that is evaluated after the function returns, referencing a function word:
>> a: 10 do do has [a] [a: 20 [a]]
** Script error: a word is not bound to a context
** Where: do
** Near: do do has [a] [a: 20 [a]]
Same, but with return/redo and the code in a function:
>> a: 10 do has [a] [a: 20 return/redo does [a]]
** Script error: a word is not bound to a context
** Where: function!
** Near: [a: 20 return/redo does [a]]
Code do version, but inside an outer call to the same function:
>> do f: function [x] [a: 10 either zero? x [do f 1] [a: 20 [a]]] 0
== 10
Same, but with return/redo and the code in a function:
>> do f: function [x] [a: 10 either zero? x [f 1] [a: 20 return/redo does [a]]] 0
== 10
So in short, with blocks there is usually no advantage to doing the block elsewhere than where it is defined, and if you want to it is easier to use another call to do instead. Self-calling recursive functions that need to return code to be executed in outer calls of the same function are an exceedingly rare code pattern that I have never seen used in Rebol code at all.
It could be possible to change return/redo so it would handle blocks as well, but it probably isn't worth the increased overhead to return/redo to add a feature that is only useful in rare circumstances and already has a better way to do it.
However, that brings up an interesting point: If you don't need return/redo for blocks because do does the same job, doesn't the same apply to functions? Why do we need return/redo at all?
How DO of a Function Works
Basically, we have return/redo because it uses exactly the same code that we use to implement do of a function. You might not realize it, but do of a function is really unusual.
In most programming languages that can call a function value, you have to pass the parameters to the function as a complete set, sort of how R3's apply function works. Regular Rebol function calling causes some unknown-ahead-of-time number of additional evaluations to happen for its arguments using unknown-ahead-of-time evaluation rules. The evaluator figures out these evaluation rules at runtime and just passes the results of the evaluation to the function. The function itself doesn't handle the evaluation of its parameters, or even necessarily know how those parameters were evaluated.
However, when you do a function value explicitly, that means passing the function value to a call to another function, a regular function named do, and then that magically causes the evaluation of additional parameters that weren't even passed to the do function at all.
Well it's not magic, it's return/redo. The way do of a function works is that it returns a reference to the function in a regular shortcut-return value, with a flag in the shortcut-return value that tells the interpreter that called do to evaluate the returned function as if it were called right there in the code. This is basically what is called a trampoline.
Here's where we get to another interesting feature of Rebol: The ability to shortcut-return values from a function is built into the evaluator, but it doesn't actually use the return function to do it. All of the functions you see from Rebol code are wrappers around the internal stuff, even return and do. The return function we call just generates one of those shortcut-return values and returns it; the evaluator does the rest.
So in this case, what really happened is that all along we had code that did what return/redo does internally, but Carl decided to add an option to our return function to set that flag, even though the internal code doesn't need return to do so because the internal code calls the internal function. And then he didn't tell anyone that he was making the option externally available, or why, or what it did (I guess you can't mention everything; who has the time?). I have the suspicion, based on conversations with Carl and some bugs we've been fixing, that R2 handled do of a function differently, in a way that would have made return/redo impossible.
That does mean that the handling of return/redo is pretty thoroughly oriented towards function evaluation, since that is its entire reason for existing at all. Adding any overhead to it would add overhead to do of a function, and we use that a lot. Probably not worth extending it to blocks, given how little we'd gain and how rarely we'd get any benefit at all.
For return/redo of a function though, it seems to be getting more and more useful the more we think about it. In the last day we've come up with all sorts of tricks that this enables. Trampolines are useful.
While the question originally asked why return/redo did not evaluate blocks, there were also formulations like: "is cool because you can do things like tail call optimization", "[can write] a wrapper for the return functionality", "it seems to be getting more and more useful the more we think about it".
I do not think these are true. My first example demonstrates a case where return/redo can really be used, an example being in the "area of expertise" of return/redo, so to speak. It is a variadic sum function called sumn:
use [result collect process] [
collect: func [:value [any-type!]] [
unless value? 'value [return process result]
append/only result :value
return/redo :collect
]
process: func [block [block!] /local result] [
result: 0
foreach value reduce block [result: result + value]
result
]
sumn: func [] [
result: copy []
return/redo :collect
]
]
This is the usage example:
>> sumn 1 * 2 2 * 3 4
== 12
Variadic functions taking "unlimited number" of arguments are not as useful in Rebol as it may look at the first sight. For example, if we wanted to use the sumn function in a small script, we would have to wrap it into a paren to indicate where it should stop collecting arguments:
result: (sumn 1 * 2 2 * 3 4)
print result
This is not any better than using a more standard (non-variadic) alternative called e.g. block-sum and taking just one argument, a block. The usage would be like
result: block-sum [1 * 2 2 * 3 4]
print result
Of course, if the function can somehow detect what is its last argument without needing enclosing paren, we really gain something. In this case we could use the #[unset!] value as the sumn stopping argument, but that does not spare typing either:
result: sumn 1 * 2 2 * 3 4 #[unset!]
print result
Seeing the example of a return wrapper I would say that return/redo is not well suited for return wrappers, return wrappers being outside of its area of expertise. To demonstrate that, here is a return wrapper written in Rebol 2 that actually is outside of return/redo's area of expertise:
myreturn: func [
{my RETURN wrapper returning the string "indefinite" instead of #[unset!]}
; the [throw] attribute makes this function a RETURN wrapper in R2:
[throw]
value [any-type!] {the value to return}
] [
either value? 'value [return :value] [return "indefinite"]
]
Testing in R2:
>> do does [return #[unset!]]
>> do does [myreturn #[unset!]]
== "indefinite"
>> do does [return 1]
== 1
>> do does [myreturn 1]
== 1
>> do does [return 2 3]
== 2
>> do does [myreturn 2 3]
== 2
Also, I do not think it is true that return/redo helps with tail call optimizations. There are examples how tail calls can be implemented without using return/redo at the www.rebol.org site. As said, return/redo was tailor-made to support implementation of variadic functions and it is not flexible enough for other purposes as far as argument passing is concerned.

Resources