Does F# Seq.sort return a copy of the input sequence? - f#

Here is some unexpected (by me) behaviour in F#. I have a simple class that sorts a sequence :
type MyQueue<'a when 'a : comparison> ( values : 'a[] ) =
let vals =
Seq.sort values
member this.First = Seq.nth 0 vals
override this.ToString() =
Seq.fold ( fun s a -> s + a.ToString() + ";" ) "" vals
I have written a slightly contrived unit test (in C#) to test this:
private class TestObject : IComparable
{
public TestObject( double Value )
{
this.Value = Value;
}
public void Update(double NewValue)
{
this.Value = NewValue;
}
public double Value { get ; private set; }
public int CompareTo(object Comparable)
{
return this.Value.CompareTo( (Comparable as TestObject).Value );
}
public override string ToString ()
{
return Value.ToString();
}
}
[Test]
public void TestUpdate_OK()
{
var nums = new double[]{7,4,3,12,11,3,8};
var values = nums.Select( n => new TestObject(n) ).ToArray();
var q = new MyQueue<TestObject>( values );
Console.WriteLine ( q.ToString() );
// update one of the values in the collection - should not re-sort the collection
values[3].Update( 2.0 );
Console.WriteLine ( q.ToString() );
Assert.AreEqual( q.First.Value, 3.0 );
}
the Seq.sort does sort the sequence, and the first output is correct :
3;3;4;7;8;11;12;
However, updating the test (reference type) object causes the sequence to be re-sorted :
2;3;3;4;7;8;11;
I expected that the vals in the MyQueue object would now be unsorted, since the value in the reference object has changed, but the Seq.sort appears to have been performed again. I don't understand, I thought the object of functional programming was to avoid side effects. Why do I get this behaviour?

The cause of this is the statement let vals = Seq.sort values is not actually sorting the values until some code consumes the vals variable i.e what your Seq.fold does in toString method, it consumes the vals sequence and at that time the sorting happens and whatever values are there in the values array at that time, those values are sorted, so basically the sorting is happening at the time when you call toString method.
Also, I won't call it FP :) as you are basically doing OOPs by creating type with private state and that state is accessed by type members.
Your problem is related to how sequences works and not in general applicable to FP.

Does F# Seq.sort return a copy of the input sequence?
Yes. What else could it do – to change the order of a set of value types you need to copy (true in all .NET languages).
(This includes LINQ operators in C# and VB: the lazy aspect is that the copy is only made when the first copied element is needed, and at that point a complete new collection is created.)

You can actually check this directly in the sourcecode for f# here but in short what it does is call Seq.toArray, sort the array in place and return that array back as the sequence.

Related

F# - Use C# methods with out parameter (within arrays and void return)

I've read the-f-equivalent-of-cs-out but still I can't make it work for my case (the simplest solution/syntax).
I have this method in a C# project:
//<Project Sdk="Microsoft.NET.Sdk">
//<TargetFrameworks>netcoreapp3.1;netstandard2.0</TargetFrameworks>
public static void ABC(out byte[] a, out byte[] b, byte[] c)
{
var aaa = new byte[10];
var bbb = new byte[10];
a = aaa;
b = bbb;
}
Now, I want to use it in a F# project:
I'm using FSharp.Core 4.7.2
(* <Project Sdk="Microsoft.NET.Sdk">
<TargetFrameworks>netcoreapp3.1;netstandard2.0</TargetFrameworks> *)
let a,b = ABC(c)
I'm imitating the syntax of TryParse and this compiles without errors:
let success, number = System.Int32.TryParse("0")
The compiler on my ABC(c) call complains about the fact the signature asks for 3 parameters, not 1.
Compared to the TryParse I see 2 differences:
It does not return void
It uses Array objects
The compiler accepts this syntax:
let a = Array.empty<byte>
let b = Array.empty<byte>
ABC(ref a, ref b, c)
but:
I think it is not correct to use ref here, not in this way (because a and b are not mutable)
I'd like to use the clean syntax similar to TryParse and I WANT to know why it does not work here
I can change the C# project code, but replacing all the out parameters in that proejct will be a second step and maybe a new qeustion if I have difficulties or doubt.
[Update: parameter position]
I played a little with this and seems like I found when the "simple" syntax (without passing ref parameters) is broken.
public static void TryParseArray(string input, out int[] result) {
result = new int[0];
}
public static void TryParseArray_2(out int[] result, string input) {
result = new int[0];
}
let arr = csharp.TryParseArray("a") // OK
let arr = csharp.TryParseArray_2("a") // ERROR
It seems like the out parameter must be at the end (= not followed by normal parameters) in the C# methods, to make possible for F# to use them as returned tuple.
You correctly noticed that the "simplified" F# syntax for turning out parameters into returned tuples only works in very limited situations - only when you have one parameter and it is the last one. In other words, this feature helps with some common patterns, but it does not fully replace out parameters.
If you want to use out parameter in F#, you can either pass a reference to a local mutable variable using the &var syntax, or you can specify a reference cell of type int ref as an argument. The following shows the two options using the standard TryParse method:
// Using a local mutable variable
let mutable n = 0
Int32.TryParse("42", &n)
printfn "Got: %d" n
// Using a reference cell initialized to 0
let n = ref 0
Int32.TryParse("42", n)
printfn "Got: %d" n.Value

Extending records with additional fields

I have a data pipeline where at each step more data fields are required. I would like to do this in a functional way by respecting immutability. I could achieve this with a class by I am wondering if there is an F# way of doing it?
// code that loads initial field information and returns record A
type recordA = {
A: int
}
// code that loads additional field information and returns record AB
type recordAB = {
A: int
B: int
}
// code that loads additional field information and returns record ABC
type recordABC = {
A: int
B: int
C: int
}
As records are sealed I can't just inherit them. How can I avoid having to define a new record with the exact same fields as the previous step and adding the required fields? Preferably I would like to have something like one record that has all required fields and the fields get assigned to their values in each step.
Note that the number of fields added in each step could be more than 1.
I think this can be a good use case for the anonymous records recently introduced in F#.
let a = {| X = 3 |}
let b = {| a with Y = "1"; Z = 4.0|}
let c = {| b with W = 1 |}
printfn "%d, %s, %f, %d" c.X c.Y c.Z c.W
One way to do it in a very FP-style would be to use a DU with a case for each step of the workflow, and the appropriate data for each step in each case:
type WorfklowState =
| StepOne of int
| StepTwo of int * int
| StepThree of int * int * int
Then your entire workflow state, both what step you're currently on and the data produced/consumed by that step, would be modeled in the data type. Of course, you would probably create record types for the data of each case, rather than using progressively larger tuples.
Depending on the application, this may be a (mis-)use case for a dynamic data container.
F# might help by providing user-defined dynamic lookup operators, for which a special syntactic translation occurs.
let (?) (m : Map<_,_>) k = m.Item k
// val ( ? ) : m:Map<'a,'b> -> k:'a -> 'b when 'a : comparison
let (?<-) (m : Map<_,_>) k v = m.Add(k, v)
// val ( ?<- ) : m:Map<'a,'b> -> k:'a -> v:'b -> Map<'a,'b> when 'a : comparison
let m = Map.empty<_,_>
let ma = m?A <- "0"
let mabc = (ma?B <- "1")?C <- "2"
ma?A // val it : string = "0"
mabc?C // val it : string = "2"
You can "inherit" records:
type RecordA =
{
a : int
}
type RecordAB =
{
a : RecordA
b : int
}
type RecordABC =
{
ab : RecordAB
c : int
}
Then you can access all of the elements, though with longer and longer chain as you go deeper and deeper.
However, why don't you just use a list of elements to store the result?
First, I would create a type to handle all possible types that you may have on each step, e.g.:
type Step =
| Int of int
| String of string
// ...
Then you can represent the workflow simply as:
type WorkflowState = list<Step>
and if you want to ensure that you always have at least one element then you can use:
type WorkflowState = Step * list<Step>
However, the records have labels and the structure above does not have them! So, if you do need labels, then you can represent them by a map using either a strong type:
type Label =
| A
| B
// ...
type WorkflowMappedState = Map<Label, Step>
or just a string based one, e.g.
type WorkflowMappedState = Map<string, Step>
The benefits of list or map based approach in comparison to the answers above is that you don't have to know the maximum number of possible steps. What if the number of steps is over 100? Would you want to create a record with 100+ labels? Most likely not! The anonymous records are great, but what if you want to use them outside of module where they were created? I think that that would cause some troubles.
Having said all that, I think that I would go with a list based approach: type WorkflowState = list<Step>. It is very F# way and it is very easy to transform further.

what does 'T', 'f', 'E', 'e', '→' stand for in dart/flutter docs?

i`m learning the flutter, but i do not understand those letters meaning.
map<T>(T f(E e)) → Iterable<T>
Returns a new lazy Iterable with elements that are created by
calling f on each element of this Iterable in iteration order. [...]
so,what do they stand for?
T:
f:
E:
e:
→:
Iterable.map<T>:
map<T>(T f(E e)) → Iterable<T>
Returns a new lazy Iterable with elements that are created by calling
f on each element of this Iterable in iteration order. [...]
T is a language Type in this case the Type of the items of the iterable and is also the type that function f must return.
→ tells you the return type of the whole function (map) in this case an Iterable of T
f is the function applied to the Element e that is passed as the parameter to the function so that the function could do some operation with this current value and then return a new value of type T based on the value of the element e.
If you navigate the Iterable map function definition you will see that:
Iterable<T> map <T>(
T f(
E e
)
)
So I wanna sharpen my answer starting with the exact map<T> function of the OP and then swich to a more complex example.
Just to clarify all these let's take a concrete class of the Iterable class, the Set class choosing a Set of type String in such a scenario:
Set<String> mySet = Set();
for (int i=0; i++<5;) {
mySet.add(i.toString());
}
var myNewSet = mySet.map((currentValue) => (return "new" + currentValue));
for (var newValue in myNewSet) {
debugPrint(newValue);
}
Here I've got a Set of String Set<String> and I want another Set of String Set<String> so that the value is the same value of the original map, but sorrounded with a prefix of "new:". And for that we could easily use the map<T> along with the closure it wants as paraemters.
The function passed as closure is
(currentValue) => ("new:" + currentValue)
And if we want we could write it also like that:
(currentValue) {
return "new:" + currentValue;
}
or even pass a function like that:
String modifySetElement(String currentValue) {
return "new:" + currentValue;
}
var myNewSet = mySet.map((value) => ("new:" + value));
var myNewSet = mySet.map((value) {return "new:" + value;});
var myNewSet = mySet.map((value) => modifySetElement("new:" + value));
And this means that the parameter of the function (closure) is the String value of the element E of the Set we're modifying.
We don't even have to specify the type because its inferred by method definition, that's one of the power of generics.
The function (closure) will be applied to all the Set's elements once at a time, but you write it once as a closure.
So summarising:
T is String
E is the element we are dealing with inside of the function
f is our closure
Let's go deeper with a more complex example. We'll now deal with the Dart Map class.
Its map function is define like that:
map<K2, V2>(MapEntry<K2, V2> f(K key, V value)) → Map<K2, V2>
So in this case the previous first and third T is (K2, V2) and the return type of the function f (closure), that takes as element E parameter the pair K and V (that are the key and value of the current MapEntry element of the iteration), is a type of MapEntry<K2, V2> and is the previous second T.
The whole function then return a new Map<K2, V2>
The following is an actual example with Map:
Map<int, String> myMap = Map();
for (int i=0; i++<5;) {
myMap[i] = i.toString();
}
var myNewMap = myMap.map((key, value) => (MapEntry(key, "new:" + value)));
for (var mapNewEntry in myNewMap.entries) {
debugPrint(mapNewEntry.value);
}
In this example I've got a Map<int, String> and I want another Map<int, String> so that (like before) the value is the same value of the original map, but sorrounded with a prefix of "new:".
Again you could write the closure (your f function) also in this way (maybe it highlights better the fact that it's a fanction that create a brand new MapEntry based on the current map entry value).
var myNewMap = myMap.map((key, value) {
String newString = "new:" + value;
return MapEntry(key, newString);
});
All these symbols are called Generics because they are generic placeholder that correspond to a type or another based on the context you are using them.
That's an extract from the above link:
Using generic methods
Initially, Dart’s generic support was limited to classes. A newer syntax, called generic methods, allows
type arguments on methods and functions:
T first<T>(List<T> ts) {
// Do some initial work or error checking, then...
T tmp = ts[0];
// Do some additional checking or processing...
return tmp;
}
Here the generic type parameter on first () allows you to use the
type argument T in several places:
In the function’s return type (T). In the type of an argument
(List<T>). In the type of a local variable (T tmp).
Follow this link for Generics name conventions.

Where/how to declare the unique key of variables in a compiler written in Ocaml?

I am writing a compiler of mini-pascal in Ocaml. I would like my compiler to accept the following code for instance:
program test;
var
a,b : boolean;
n : integer;
begin
...
end.
I have difficulties in dealing with the declaration of variables (the part following var). At the moment, the type of variables is defined like this in sib_syntax.ml:
type s_var =
{ s_var_name: string;
s_var_type: s_type;
s_var_uniqueId: s_uniqueId (* key *) }
Where s_var_uniqueId (instead of s_var_name) is the unique key of the variables. My first question is, where and how I could implement the mechanism of generating a new id (actually by increasing the biggest id by 1) every time I have got a new variable. I am wondering if I should implement it in sib_parser.mly, which probably involves a static variable cur_id and the modification of the part of binding, again don't know how to realize them in .mly. Or should I implement the mechanism at the next stage - the interpreter.ml? but in this case, the question is how to make the .mly consistent with the type s_var, what s_var_uniqueId should I provide in the part of binding?
Another question is about this part of statement in .mly:
id = IDENT COLONEQ e = expression
{ Sc_assign (Sle_var {s_var_name = id; s_var_type = St_void}, e) }
Here, I also need to provide the next level (the interpreter.ml) a variable of which I only know the s_var_name, so what could I do regarding its s_var_type and s_var_uniqueId here?
Could anyone help? Thank you very much!
The first question to ask yourself is whether you actually need an unique id. From my experience, they're almost never necessary or even useful. If what you're trying to do is making variables unique through alpha-equivalence, then this should happen after parsing is complete, and will probably involve some form of DeBruijn indices instead of unique identifiers.
Either way, a function which returns a new integer identifier every time it is called is:
let unique =
let last = ref 0 in
fun () -> incr last ; !last
let one = unique () (* 1 *)
let two = unique () (* 2 *)
So, you can simply assign { ... ; s_var_uniqueId = unique () } in your Menhir rules.
The more important problem you're trying to solve here is that of variable binding. Variable x is defined in one location and used in another, and you need to determine that it happens to be the same variable in both places. There are many ways of doing this, one of them being to delay the binding until the interpreter. I'm going to show you how to deal with this during parsing.
First, I'm going to define a context: it's a set of variables that allows you to easily retrieve a variable based on its name. You might want to create it with hash tables or maps, but to keep things simple I will be using List.assoc here.
type s_context = {
s_ctx_parent : s_context option ;
s_ctx_bindings : (string * (int * s_type)) list ;
s_ctx_size : int ;
}
let empty_context parent = {
s_ctx_parent = parent ;
s_ctx_bindings = [] ;
s_ctx_size = 0
}
let bind v_name v_type ctx =
try let _ = List.assoc ctx.s_ctx_bindings v_name in
failwith "Variable is already defined"
with Not_found ->
{ ctx with
s_ctx_bindings = (v_name, (ctx.s_ctx_size, v_type))
:: ctx.s_ctx_bindings ;
s_ctx_size = ctx.s_ctx_size + 1 }
let rec find v_name ctx =
try 0, List.assoc ctx.s_ctx_bindings v_name
with Not_found ->
match ctx.s_ctx_parent with
| Some parent -> let depth, found = find v_name parent in
depth + 1, found
| None -> failwith "Variable is not defined"
So, bind adds a new variable to the current context, find looks for a variable in the current context and its parents, and returns both the bound data and the depth at which it was found. So, you could have all global variables in one context, then all parameters of a function in another context that has the global context as its parent, then all local variables in a function (when you'll have them) in a third context that has the function's main context as the parent, and so on.
So, for instance, find 'x' ctx will return something like 0, (3, St_int) where 0 is the DeBruijn index of the variable, 3 is the position of the variable in the context identified by the DeBruijn index, and St_int is the type.
type s_var = {
s_var_deBruijn: int;
s_var_type: s_type;
s_var_pos: int
}
let find v_name ctx =
let deBruijn, (pos, typ) = find v_name ctx in
{ s_var_deBruijn = deBruijn ;
s_var_type = typ ;
s_var_pos = pos }
Of course, you need your functions to store their context, and make sure that the first argument is the variable at position 0 within the context:
type s_fun =
{ s_fun_name: string;
s_fun_type: s_type;
s_fun_params: context;
s_fun_body: s_block; }
let context_of_paramlist parent paramlist =
List.fold_left
(fun ctx (v_name,v_type) -> bind v_name v_type ctx)
(empty_context parent)
paramlist
Then, you can change your parser to take into account the context. The trick is that instead of returning an object representing part of your AST, most of your rules will return a function that takes a context as an argument and returns an AST node.
For instance:
int_expression:
(* Constant : ignore the context *)
| c = INT { fun _ -> Se_const (Sc_int c) }
(* Variable : look for the variable inside the contex *)
| id = IDENT { fun ctx -> Se_var (find id ctx) }
(* Subexpressions : pass the context to both *)
| e1 = int_expression o = operator e2 = int_expression
{ fun ctx -> Se_binary (o, e1 ctx, e2 ctx) }
;
So, you simply propagate the context "down" recursively through the expressions. The only clever parts are those when new contexts are created (you don't have this syntax yet, so I'm just adding a placeholder):
| function_definition_expression (args, body)
{ fun ctx -> let ctx = context_of_paramlist (Some ctx) args in
{ s_fun_params = ctx ;
s_fun_body = body ctx } }
As well as the global context (the program rule itself does not return a function, but the block rule does, and so a context is created from the globals and provided).
prog:
PROGRAM IDENT SEMICOLON
globals = variables
main = block
DOT
{ let ctx = context_of_paramlist None globals in
{ globals = ctx;
main = main ctx } }
All of this makes the implementation of your interpreter much easier due to the DeBruijn indices: you can have a "stack" which holds your values (of type value) defined as:
type stack = value array list
Then, reading and writing variable x is as simple as:
let read stack x =
(List.nth stack x.s_var_deBruijn).(x.s_var_pos)
let write stack x value =
(List.nth stack x.s_var_deBruijn).(x.s_var_pos) <- value
Also, since we made sure that function parameters are in the same order as their position in the function context, if you want to call function f and its arguments are stored in the array args, then constructing the stack is as simple as:
let inner_stack = args :: stack in
(* Evaluate f.s_fun_body with inner_stack here *)
But I'm sure you'll have a lot more questions to ask when you start working on your interpeter ;)
How to create a global id generator:
let unique =
let counter = ref (-1) in
fun () -> incr counter; !counter
Test:
# unique ();;
- : int = 0
# unique ();;
- : int = 1
Regarding your more general design question: it seems that your data representation does not faithfully represent the compiler phases. If you must return a type-aware data-type (with this field s_var_type) after the parsing phase, something is wrong. You have two choices:
devise a more precise data representation for the post-parsing AST, that would be different from the post-typing AST, and not have those s_var_type fields. Typing would then be a conversion from the untyped to the typed AST. This is a clean solution that I would recommend.
admit that you must break the data representation semantics because you don't have enough information at this stage, and try to be at peace with the idea of returning garbage such as St_void after the parsing phase, to reconstruct the correct information later. This is less typed (as you have an implicit assumption on your data which is not apparent in the type), more pragmatic, ugly but sometimes necessary. I don't think it's the right decision in this case, but you will encounter situation where it's better to be a bit less typed.
I think the specific choice of unique id handling design depends on your position on this more general question, and your concrete decisions about types. If you choose a finer-typed representation of post-parsing AST, it's your choice to decide whether to include unique ids or not (I would, because generating a unique ID is dead simple and doesn't need a separate pass, and I would rather slightly complexify the grammar productions than the typing phase). If you choose to hack the type field with a dummy value, it's also reasonable to do that for variable ids if you wish to, putting 0 as a dummy value and defining it later; but still I personally would do that in the parsing phase.

How do I properly implement a property in F#?

Consider my first attempt, a simple type in F# like the following:
type Test() =
inherit BaseImplementingNotifyPropertyChangedViaOnPropertyChanged()
let mutable prop: string = null
member this.Prop
with public get() = prop
and public set value =
match value with
| _ when value = prop -> ()
| _ ->
let prop = value
this.OnPropertyChanged("Prop")
Now I test this via C# (this object is being exposed to a C# project, so apparent C# semantics are desirable):
[TestMethod]
public void TaskMaster_Test()
{
var target = new FTest();
string propName = null;
target.PropertyChanged += (s, a) => propName = a.PropertyName;
target.Prop = "newString";
Assert.AreEqual("Prop", propName);
Assert.AreEqual("newString", target.Prop);
return;
}
propName is properly assigned, my F# Setter is running, but the second assert is failing because the underlying value of prop isn't changed. This sort of makes sense to me, because if I remove mutable from the prop field, no error is generated (and one should be because I'm trying to mutate the value). I think I must be missing a fundamental concept.
What's the correct way to rebind/mutate prop in the Test class so that I can pass my unit test?
As a side-note, I would probably use if .. then instead of the match construct as it makes the code more succinct (patterh matching is especially valuable when you need to test the value agains multiple complex patterns). Also, public is the default access for member, so you can make the code a bit more succinct:
type Test() =
inherit BaseImplementingNotifyPropertyChangedViaOnPropertyChanged()
let mutable prop : string = null
member this.Prop
with get() = prop
and set(value) =
if value <> prop then
prop <- value
this.OnPropertyChanged("Prop")
Try this:
type Test() =
inherit BaseImplementingNotifyPropertyChangedViaOnPropertyChanged()
let mutable prop: string = null
member this.Prop
with public get() = prop
and public set value =
match value with
| _ when value = prop -> ()
| _ ->
prop <- value
this.OnPropertyChanged("Prop")
You need to make the binding mutable and then alter its value in your setter. In your initial code, you were just creating a new binding (also called prop) within your setter, so no change was visible.
In your pattern match you are actually binding a new value with
let prop = value
When you bind a value like this with the same name, it will shadow the other value for the scope of the newly declared one. I believe what you actually want to do is this:
prop <- value

Resources