Related
Here is a function:
let newPositions : PositionData list =
positions
|> List.filter (fun x ->
let key = (x.Instrument, x.Side)
match brain.Positions.TryGetValue key with
| false, _ ->
// if we don't know the position, it's new
true
| true, p when x.UpdateTime > p.UpdateTime ->
// it's newer than the version we have, it's new
true
| _ ->
false
)
it compiles at expected.
let's focus on two lines:
let key = (x.Instrument, x.Side)
match brain.Positions.TryGetValue key with
brain.Positions is a Map<Instrument * Side, PositionData> type
if I modify the second line to:
match brain.Positions.TryGetValue (x.Instrument, x.Side) with
then the code will not compile, with error:
[FS0001] This expression was expected to have type
'Instrument * Side'
but here has type
'Instrument'
but:
match brain.Positions.TryGetValue ((x.Instrument, x.Side)) with
will compile...
why is that?
This is due to method call syntax.
TryGetValue is not a function, but a method. A very different thing, and a much worse thing in general. And subject to some special syntactic rules.
This method, you see, actually has two parameters, not one. The first parameter is a key, as you expect. And the second parameter is what's known in C# as out parameter - i.e. kind of a second return value. The way it was originally meant to be called in C# is something like this:
Dictionary<int, string> map = ...
string val;
if (map.TryGetValue(42, out val)) { ... }
The "regular" return value of TryGetValue is a boolean signifying whether the key was even found. And the "extra" return value, denoted here out val, is the value corresponding to the key.
This is, of course, extremely awkward, but it did not stop the early .NET libraries from using this pattern very widely. So F# has special syntactic sugar for this pattern: if you pass just one parameter, then the result becomes a tuple consisting of the "actual" return value and the out parameter. Which is what you're matching against in your code.
But of course, F# cannot prevent you from using the method exactly as designed, so you're free to pass two parameters as well - the first one being the key and the second one being a byref cell (which is F# equivalent of out).
And here is where this clashes with the method call syntax. You see, in .NET all methods are uncurried, meaning their arguments are all effectively tupled. So when you call a method, you're passing a tuple.
And this is what happens in this case: as soon as you add parentheses, the compiler interprets that as an attempt to call a .NET method with tupled arguments:
brain.Positions.TryGetValue (x.Instrument, x.Side)
^ ^
first arg |
second arg
And in this case it expects the first argument to be of type Instrument * Side, but you're clearly passing just an Instrument. Which is exactly what the error message tells you: "expected to have type 'Instrument * Side'
but here has type 'Instrument'".
But when you add a second pair of parens, the meaning changes: now the outer parens are interpreted as "method call syntax", and the inner parens are interpreted as "denoting a tuple". So now the compiler interprets the whole thing as just a single argument, and all works as before.
Incidentally, the following will also work:
brain.Positions.TryGetValue <| (x.Instrument, x.Side)
This works because now it's no longer a "method call" syntax, because the parens do not immediately follow the method name.
But a much better solution is, as always, do not use methods, use functions instead!
In this particular example, instead of .TryGetValue, use Map.tryFind. It's the same thing, but in proper function form. Not a method. A function.
brain.Positions |> Map.tryFind (x.Instrument, x.Side)
Q: But why does this confusing method even exist?
Compatibility. As always with awkward and nonsensical things, the answer is: compatibility.
The standard .NET library has this interface System.Collections.Generic.IDictionary, and it's on that interface that the TryGetValue method is defined. And every dictionary-like type, including Map, is generally expected to implement that interface. So here you go.
In future, please consider the Stack Overflow guidelines provided under How to create a Minimal, Reproducible Example. Well, minimal and reproducible the code in your question is, but it shall also be complete...
…Complete – Provide all parts someone else needs to reproduce your
problem in the question itself
That being said, when given the following definitions, your code will compile:
type Instrument() = class end
type Side() = class end
type PositionData = { Instrument : Instrument; Side : Side; }
with member __.UpdateTime = 0
module brain =
let Positions = dict[(Instrument(), Side()), {Instrument = Instrument(); Side = Side()}]
let positions = []
Now, why is that? Technically, it is because of the mechanism described in the F# 4.1 Language Specification under §14.4 Method Application Resolution, 4. c., 2nd bullet point:
If all formal parameters in the suffix are “out” arguments with byref
type, remove the suffix from UnnamedFormalArgs and call it
ImplicitlyReturnedFormalArgs.
This is supported by the signature of the method call in question:
System.Collections.Generic.IDictionary.TryGetValue(key: Instrument * Side, value: byref<PositionData>)
Here, if the second argument is not provided, the compiler does the implicit conversion to a tuple return type as described in §14.4 5. g.
You are obviously familiar with this behaviour, but maybe not with the fact that if you specify two arguments, the compiler will see the second of them as the explicit byref "out" argument, and complains accordingly with its next error message:
Error 2 This expression was expected to have type
PositionData ref
but here has type
Side
This misunderstanding changes the return type of the method call from bool * PositionData to bool, which consequently elicits a third error:
Error 3 This expression was expected to have type
bool
but here has type
'a * 'b
In short, your self-discovered workaround with double parentheses is indeed the way to tell the compiler: No, I am giving you only one argument (a tuple), so that you can implicitly convert the byref "out" argument to a tuple return type.
I was given a question which was:
given a number N in the first argument selects only numbers greater than N in the list, so that
greater(2,[2,13,1,4,13]) = [13,4,13]
This was the solution provided:
member(_,[]) -> false;
member(H,[H|_]) -> true;
member(N,[_,T]) -> member(N,T).
I don't understand what "_" means. I understand it has something to do with pattern matching but I don't understand it completely. Could someone please explain this to me
This was the solution provided:
I think you are confused: the name of the solution function isn't even the same as the name of the function in the question. The member/2 function returns true when the first argument is an element of the list provided as the second argument, and it returns false otherwise.
I don't understand what "_" means. I understand it has something to do with pattern matching but I don't understand it completely. Could someone please explain this to me
_ is a variable name, and like any variable it will match anything. Here are some examples of pattern matching:
35> f(). %"Forget" or erase all variable bindings
ok
45> {X, Y} = {10, 20}.
{10,20}
46> X.
10
47> Y.
20
48> {X, Y} = {30, 20}.
** exception error: no match of right hand side value {30,
20}
Now why didn't line 48 match? X was already bound to 10 and Y to 20, so erlang replaces those variables with their values, which gives you:
48> {10, 20} = {30, 20}.
...and those tuples don't match.
Now lets try it with a variable named _:
49> f().
ok
50> {_, Y} = {10, 20}.
{10,20}
51> Y.
20
52> {_, Y} = {30, 20}.
{30,20}
53>
As you can see, the variable _ sort of works like the variable X, but notice that there is no error on line 52, like there was on line 48. That's because the _ variable works a little differently than X:
53> _.
* 1: variable '_' is unbound
In other words, _ is a variable name, so it will initially match anything, but unlike X, the variable _ is never bound/assigned a value, so you can use it over and over again without error to match anything.
The _ variable is also known as a don't care variable because you don't care what that variable matches because it's not important to your code, and you don't need to use its value.
Let's apply those lessons to your solution. This line:
member(N,[_,T]) -> member(N,T).
recursively calls the member function, namely member(N, T). And, the following function clause:
member(_,[]) -> false;
will match the function call member(N, T) whenever T is an empty list--no matter what the value of N is. In other words, once the given number N has not matched any element in the list, i.e. when the list is empty so there are no more elements to check, then the function clause:
member(_,[]) -> false;
will match and return false.
You could rewrite that function clause like this:
member(N, []) -> false;
but erlang will warn you that N is an unused variable in the body of the function, which is a way of saying: "Are you sure you didn't make a mistake in your function definition? You defined a variable named N, but then you didn't use it in the body of the function!" The way you tell erlang that the function definition is indeed correct is to change the variable name N to _ (or _N).
It means a variable you don't care to name. If you are never going to use a variable inside the function you can just use underscore.
% if the list is empty, it has no members
member(_, []) -> false.
% if the element I am searching for is the head of the list, it is a member
member(H,[H|_]) -> true.
% if the elem I am searching for is not the head of the list, and the list
% is not empty, lets recursively go look at the tail of the list to see if
% it is present there
member(H,[_|T]) -> member(H,T).
the above is pseudo code for what is happening. You can also have multiple '_' unnamed variables.
According to Documentation:
The anonymous variable is denoted by underscore (_) and can be used when a variable is required but its value can be ignored.
Example:
[H, _] = [1,2] % H will be 1
Also documentation says that:
Variables starting with underscore (_), for example, _Height, are normal variables, not anonymous. They are however ignored by the compiler in the sense that they do not generate any warnings for unused variables.
Sorry if this is repetitive...
What does (_,[]) mean?
That means (1) two parameters, (2) the first one matches anything and everything, yet I don't care about it (you're telling Erlang to just forget about its value via the underscore) and (3) the second parameter is an empty list.
Given that Erlang binds or matches values with variables (depending on the particular case), here you're basically looking to a match (like a conditional statement) of the second parameter with an empty list. If that match happens, the statement returns false. Otherwise, it tries to match the two parameters of the function call with one of the other two statements below it.
Why is the following saying variable unbound?
9> {<<A:Length/binary, Rest/binary>>, Length} = {<<1,2,3,4,5>>, 3}.
* 1: variable 'Length' is unbound
It's pretty clear that Length should be 3.
I am trying to have a function with similar pattern matching, ie.:
parse(<<Body:Length/binary, Rest/binary>>, Length) ->
But if fails with the same reason. How can I achieve the pattern matching I want?
What I am really trying to achieve is parse in incoming tcp stream packets as LTV(Length, Type, Value).
At some point after I parse the the Length and the Type, I want to ready only up to Length number of bytes as the value, as the rest will probably be for the next LTV.
So my parse_value function is like this:
parse_value(Value0, Left, Callback = {Module, Function},
{length, Length, type, Type, value, Value1}) when byte_size(Value0) >= Left ->
<<Value2:Left/binary, Rest/binary>> = Value0,
Module:Function({length, Length, type, Type, value, lists:reverse([Value2 | Value1])}),
if
Rest =:= <<>> ->
{?MODULE, parse, {}};
true ->
parse(Rest, Callback, {})
end;
parse_value(Value0, Left, _, {length, Length, type, Type, value, Value1}) ->
{?MODULE, parse_value, Left - byte_size(Value0), {length, Length, type, Type, value, [Value0 | Value1]}}.
If I could do the pattern matching, I could break it up to something more pleasant to the eye.
The rules for pattern matching are that if a variable X occurs in two subpatterns, as in {X, X}, or {X, [X]}, or similar, then they have to have the same value in both positions, but the matching of each subpattern is still done in the same input environment - bindings from one side do not carry over to the other. The equality check is conceptually done afterwards, as if you had matched on {X, X2} and added a guard X =:= X2. This means that your Length field in the tuple cannot be used as input to the binary pattern, not even if you make it the leftmost element.
However, within a binary pattern, variables bound in a field can be used in other fields following it, left-to-right. Therefore, the following works (using a leading 32-bit size field in the binary):
1> <<Length:32, A:Length/binary, Rest/binary>> = <<0,0,0,3,1,2,3,4,5>>.
<<0,0,0,3,1,2,3,4,5>>
2> A.
<<1,2,3>>
3> Rest.
<<4,5>>
I've run into this before. There is some weirdness between what is happening inside binary syntax and what happens during unification (matching). I suspect that it is just that binary syntax and matching occur at different times in the VM somewhere (we don't know which Length is failing to get assigned -- maybe binary matching is always first in evaluation, so Length is still meaningless). I was once going to dig in and find out, but then I realized that I never really needed to solve this problem -- which might be why it was never "solved".
Fortunately, this won't stop you with whatever you are doing.
Unfortunately, we can't really help further unless you explain the context in which you think this kind of a match is a good idea (you are having an X-Y problem).
In binary parsing you can always force the situation to be one of the following:
Have a fixed-sized header at the beginning of the binary message that tells you the next size element you need (and from there that can continue as a chain of associations endlessly)
Inspect the binary once on entry to determine the size you are looking for, pull that one value, and then begin the real parsing task
Have a set of fields, all of predetermined sizes that conform to some a binary schema standard
Convert the binary to a list and iterate through it with any arbitrary amount of look-ahead and backtracking you might need
Quick Solution
Without knowing anything else about your general problem, a typical solution would look like:
parse(Length, Bin) ->
<<Body:Length/binary, Rest/binary>> = Bin,
ok = do_something(Body),
do_other_stuff(Rest).
But I smell something funky here.
Having things like this in your code is almost always a sign that a more fundamental aspect of the code structure is not in agreement with the data that you are handling.
But deadlines.
Erlang is all about practical code that satisfies your goals in the real world. With that in mind, I suggest that you do something like the above for now, and then return to this problem domain and rethink it. Then refactor it. This will gain you three benefits:
Something will work right away.
You will later learn something fundamental about parsing in general.
Your code will almost certainly run faster if it fits your data better.
Example
Here is an example in the shell:
1> Parse =
1> fun
1> (Length, Bin) when Length =< byte_size(Bin) ->
1> <<Body:Length/binary, Rest/binary>> = Bin,
1> ok = io:format("Chopped off ~p bytes: ~p~n", [Length, Body]),
1> Rest;
1> (Length, Bin) ->
1> ok = io:format("Binary shorter than ~p~n", [Length]),
1> Bin
1> end.
#Fun<erl_eval.12.87737649>
2> Parse(3, <<1,2,3,4,5>>).
Chopped off 3 bytes: <<1,2,3>>
<<4,5>>
3> Parse(8, <<1,2,3,4,5>>).
Binary shorter than 8
<<1,2,3,4,5>>
Note that this version is a little safer, in that we avoid a crash in the case that Length is longer than the binary. This is yet another good reason why maybe we can't do that match in the function head.
Try with below code:
{<<A:Length/binary, Rest/binary>>, _} = {_, Length} = {<<1,2,3,4,5>>, 3}.
This question is mentioned a bit in EEP-52:
Any variables used in the expression must have been previously bound, or become bound in the same binary pattern as the expression. That is, the following example is illegal:
illegal_example2(N, <<X:N,T/binary>>) ->
{X,T}.
And explained a bit more in the following e-mail: http://erlang.org/pipermail/eeps/2020-January/000636.html
Illegal. With one exception, matching is not done in a left-to-right
order, but all variables in the pattern will be bound at the same
time. That means that the variables must be bound before the match
starts. For maps, that means that the variables referenced in key
expressions must be bound before the case (or receive) that matches
the map. In a function head, all map keys must be literals.
The exception to this general rule is that within a binary pattern,
the segments are matched from left to right, and a variable bound in a
previous segment can be used in the size expression for a segment
later in the binary pattern.
Also one of the members of OTP team mentioned that they made a prototype that can do that, but it was never finished http://erlang.org/pipermail/erlang-questions/2020-May/099538.html
We actually tried to make your example legal. The transformation of
the code that we did was not to rewrite to guards, but to match
arguments or parts of argument in the right order so that variables
that input variables would be bound before being used. (We would do a
topological sort to find the correct order.) For your example, the
transformation would look similar to this:
legal_example(Key, Map) ->
case Map of
#{Key := Value} -> Value;
_ -> error(function_clause, [Key, Map])
end.
In the prototype implementation, the compiler could compile the
following example:
convoluted(Ref,
#{ node(Ref) := NodeId, Loop := universal_answer},
[{NodeId, Size} | T],
<<Int:(Size*8+length(T)),Loop>>) when is_reference(Ref) ->
Int.
Things started to fall apart when variables are repeated. Repeated
variables in patterns already have a meaning in Erlang (they should be
the same), so it become tricky to understand to distinguish between
variables being bound or variables being used a binary size or map
key. Here is an example that the prototype couldn't handle:
foo(#{K := K}, K) -> ok.
A human can see that it should be transformed similar to this:
foo(Map, K) -> case Map of
{K := V} when K =:= V -> ok end.
Here are few other examples that should work but the prototype would
refuse to compile (often emitting an incomprehensible error message):
bin2(<<Sz:8,X:Sz>>, <<Y:Sz>>) -> {X,Y}.
repeated_vars(#{K := #{K := K}}, K) -> K.
match_map_bs(#{K1 := {bin,<<Int:Sz>>}, K2 := <<Sz:8>>}, {K1,K2}) ->
Int.
Another problem was when example was correctly rejected, the error
message would be confusing.
Because much more work would clearly be needed, we have shelved the
idea for now. Personally, I am not sure that the idea is sound in the
first place. But I am sure of one thing: the implementation would be
very complicated.
UPD: latest news from 2020-05-14
I have the following computation expression builder:
type ExprBuilder() =
member this.Return(x) =
Some x
let expr = new ExprBuilder()
I understand the purpose of methods Return, Zero and Combine, but I don't understand what is the difference between expressions shown below:
let a = expr{
printfn "Hello"
return 1
} // result is Some 1
let c = expr{
return 1
printfn "Hello"
} // do not compile. Combine method required
I also don't understand why in the first case Zero method in not required for printfn statement?
In the first expression, you perform some computation that results in value 1, and that's it. You don't need a Zero in it, because Zero is only needed for return-less expressions (that's why it's called "zero" - it's what is there when nothing is there), and your expression does have a return.
To specifically answer your question, Zero is not required "for the printfn statement", because not every line within the expression gets transformed. When compiling computation expressions, the compiler breaks them up at "special" points, such as let!, do!, return, etc., leaving all the rest of the code between those points intact. In this case, your printfn call just becomes part of the code that executes before the return.
In the second expression, you perform two computations: the first one results in value 1, and second one results in Zero (which is implicitly assumed when expression lacks a return). But the whole computation expression can't have two return values, it must have one. So in order to bring results of the two computations together (one might say, combine them), you need the Combine method.
Besides Combine and Zero, you'd also need to implement Delay for this to work. Multipart (i.e. "combined") computations are also wrapped in Delay in order to allow the builder to defer evaluation and possibly drop some parts within the Combine implementation.
I recommend reading through this introduction by Scott Wlaschin, specifically part 3 about Delay and Run.
I'd like to tidy my Eralng code, I found there're lots of issue following:
A = {Tid, _Tv0, _Tv1, Tv2, Tv3}
Is there any way to clean the code like to be: A = {Tid, SomewayReplace(4)} ???
Update1:
like #Pascal example, Is there any way to simple the code A = {T, _, _, _, _, _} like to be A = {T, SomewayReplace(4)} to replace that 4 symbol _ ???
update2
in real project, if some record include many element, I found it force me to repeat writing the symbol _, so I wonder if there is any way to simple it???
Writting A = Something means that you try to match A with Something or if A is unbound, assign Something to A. In anycase, Something must be defined.
You can find some shortcut in writting. For example, if you want to assign the result of a funtion to A, verify that the result is a tuple of 5 elements and assign the first element to T, the you can write:
A = {T,_,_,_,_} = f(Param).
The meaning of _T is exactly the same as any variable. It just says to th compiler to not issue a warning if this variable is not used in the code. It is frequent in pattern matching when you want to ignore the value of a variable but still keep trace of its meaning.
[edit]
It is not possible to write {T, SomewayReplace(4)}, but you may use records. A record is a tagged tuple (first element is the atom that identify this record. It is not shorter than placeholder for small tuples, but it is clearer, you don't need to remember the location of the information in your tuple, and it is easier to modify your code when you need to add a new element in a tuple. The syntax will be
-record(mytuple,{field1,...,fieldx,...}.
...
A = #mytuple{fieldx = T} = f(Param).
waerning: Records are managed by the compiler, so everything must be known at build time (#mytuple{Fieldx = T} is illegal, Fieldx cannot be a variable).
Why not use a record? Then you only match the fields you want to extract. As a by-effect, you make the code easier to debug, since you are forced to name the tuple by having a atom first.