Clarifying field types in F# - f#

After some practice with F#, I have still some points where I need to clear confusion:
The question is specifically about fields in a type.
This is what I understand and, some must be wrong because the naming wouldn't make sense if I was right:
let x -> private read-only field, evaluated once
let mutable x -> private mutable field
val x -> public read-only field.. difference with let?
val mutable x -> public mutable field
member this.x -> private read-only field, evaluated every time
member val -> public mutable field.. difference with val? why no mutable keyword?
Can someone tell me what is right / wrong, or some concepts I may have gotten wrong.

First of all, you can pretty much ignore val and val mutable. Those two are used with an older syntax for defining classes that is not exactly formally deprecated, but I would almost never use it when writing new normal F# code (there are some rare use cases, but I don't think it's worth worrying about those).
This leaves let and let mutable vs. member and member val.
let defines a private field that can only be accessed within the class. The value you assign to it is evaluated once. You can also define functions like let foo x = x + 1 or let bar () = printfn "hi" which have body that's evaluated when the function is called.
let mutable defines a private mutable field. This is initialized by evaluating the right-hand side, but you can later mutate it using fld <- <new value>.
member this.Foo = (...) defines a get-only property. The expression (...) is evaluated repeatedly whenever the property is accessed. This is a side-effect of how .NET properties work - they have a hidden get() method that's called whenever they are accessed, so the body is the body of this method.
member val Foo = (...) is a way of writing a property that is evaluated only once. In earlier versions of F#, this was not available, so you had to implement this functionality quite tediously yourself by definining a local field (to run the code once) and then returning that from a regular property:
let foo = (...)
member x.Foo = foo

Related

In F#, is it possible to pass a reference to a mutable, defaulted value as a parameter?

For the Froto project (Google Protobuf in F#), I am trying to update the deserialization code from using 'a ref objects to passing values byref<'a>, for performance.
However, the code below fails on the hydrator &element field line:
type Field = TypeA | TypeB | Etc
let hydrateRepeated
(hydrator:byref<'a> -> Field -> unit)
(result:byref<'a list>)
(field:Field) =
let mutable element = Unchecked.defaultof<'a>
hydrator &element field
result <- element :: result
error FS0421: The address of the variable 'element' cannot be used at this point
Is there anything I can do to get this code to work without changing the signature of the hydrator parameter?
I'm very aware that I could use hydrator:'a ref -> Field -> unit and get things to work. However, the goal is to support deserializing into record types without needing to create a bunch of ref objects on the heap every time a record is deserialize.
Note that the following code is perfectly legal and has the same signature as the hydrator function declaration, above, so I'm unclear on what the problem is.
let assign (result:byref<'a>) (x:'a) =
result <- x
let thisWorks() =
let mutable v = Unchecked.defaultof<int>
assign &v 5
printfn "%A" v
I'll try to clarify what I was saying in my comments. You're right that your definition of assign is perfectly fine, and it appears to have the signature byref<'a> -> 'a -> unit. However, if you look at the resulting assembly, you'll find that the way it's compiled at the .NET representation level is:
Void assign[a](a ByRef, a)
(that is, it's a method that takes two arguments and doesn't return anything, not a function value that takes one argument and returns a function that takes the next argument and returns a value of type unit - the compiler uses some additional metadata to determine how the method was actually declared).
The same is true of function definitions that don't involve byref. For instance, assume you've got the following definition:
let someFunc (x:int) (y:string) = ()
Then the compiler actually creates a method with the signature
Void someFunc(Int32, System.String)
The compiler is smart enough to do the right thing when you try to use a function like someFunc as a first class value - if you use it in a context where it isn't applied to any arguments, the compiler will generate a subtype of int -> string -> unit (which is FSharpFunc<int, FSharpFunc<string, unit>> at the .NET representation level), and everything works seamlessly.
However, if you try to do the same thing with assign, it won't work (or shouldn't work, but there are several compiler bugs that may make it seem like certain variations work when really they don't - you might not get a compiler error but you may get an output assembly that is malformed instead) - it's not legal for .NET type instantiations to use byref types as generic type arguments, so FSharpFunc<int byref, FSharpFunc<int, unit>> is not a valid .NET type. The fundamental way that F# represents function values just doesn't work when there are byref arguments.
So the workaround is to create your own type with a method taking a byref argument and then create subtypes/instances that have the behavior you want, sort of like doing manually what the compiler does automatically in the non-byref case. You could do this with a named type
type MyByrefFunc2<'a,'b> =
abstract Invoke : 'a byref * 'b -> unit
let assign = {
new MyByrefFunc2<_,_> with
member this.Invoke(result, x) =
result <- x }
or with a delegate type
type MyByrefDelegate2<'a,'b> = delegate of 'a byref * 'b -> unit
let assign = MyByrefDelegate2(fun result x -> result <- x)
Note that when calling methods like Invoke on the delegate or nominal type, no actual tuple is created, so you shouldn't be concerned about any extra overhead there (it's a .NET method that takes two arguments and is treated as such by the compiler). There is the cost of a virtual method call or delegate call, but in most cases similar costs exist when using function values in a first class way too. And in general, if you're worried about performance then you should set a target and measure against it rather than trying to optimize prematurely.

IEnumerator in F# continued

This is the question about the earlier Persistence class that I was trying to expose as an enumerator. I realized that I need to pass by reference really to change the value of of the object that I am trying to populate. I guess I am going about this in a C++ way (As most may have guessed I am an F# beginner). However, I want to be as efficient in terms of memory foot print as I can. Ideally I would like to reuse the same object over and over again when I read from a file.
I am having a problem with this code where it does not allow me to pass by reference in the call to the function serialize. I am again reproducing the code here. I thank you in advance for your help.
The error I get:
error FS0001: This expression was expected to have type byref<'T> but here has type 'T
If I change the call to serialize(& current_, reader_) I get the following error:
persistence.fs(71,6): error FS0437: A type would store a byref typed value. This is not permitted by Common IL.
persistence.fs(100,29): error FS0412: A type instantiation involves a byref type. This is not permitted by the rules of Common IL.
persistence.fs(100,30): error FS0423: The address of the field current_ cannot be used at this point
The CODE:
type BinaryPersistenceIn<'T when 'T: (new : unit -> 'T)>(fn: string, serializer: ('T byref * BinaryReader) -> unit) =
let stream_ = File.Open(fn, FileMode.Open, FileAccess.Read)
let reader_ = new BinaryReader(stream_)
let mutable current_ = new 'T()
let eof() =
stream_.Position = stream_.Length
interface IEnumerator<'T> with
member this.Current
with get() = current_
member this.Dispose() =
stream_.Close()
reader_.Close()
interface System.Collections.IEnumerator with
member this.Current
with get() = current_ :> obj
member this.Reset() =
stream_.Seek((int64) 0., SeekOrigin.Begin) |> ignore
member this.MoveNext() =
let mutable ret = eof()
if stream_.CanRead && ret then
serializer( current_, reader_)
ret
You can circumvent this by introducing a mutable local, passing it to serialize, and then assigning back to current_:
member this.MoveNext() =
let mutable ret = eof()
if stream_.CanRead && ret then
let mutable deserialized = Unchecked.defaultof<_>
serializer( &deserialized, reader_)
current_ <- deserialized
ret
But now this is becoming really, really unsettling. Notice the use of Unchecked.defaultof<_>? There is no other way to initialize a value of unknown type, and it's called "unchecked" for a reason: the compiler can't guarantee safety of this code.
I strongly advise that you explore other ways of achieving your initial goal, such as using a seq computation expression instead, as I have suggested in your other question.
With respect to memory footprint, let's analyze the sequence option:
You have an instance of the seq. That's going to be some class implementing IEnumerable<'T>. This one will be held until you no longer need the seq, i.e. not reallocated each time.
You hold a Stream as part of the seq, with the same lifetime.
You hold a BinaryReader as part of the seq, with the same lifetime.
eof : unit -> bool is a compiler-generated function class as part of the seq, with the same lifetime.
The loop will use a bool for the while loop and the if condition. Both of which are stack-allocated structs and needed for the branching logic.
And finally, you yield an instance that you already got from the serializer.
Conceptually, that's as little memory consumption as you can have for a lazily evaluated seq. Once an element is consumed, it can be garbage collected. Multiple evaluations will do the same thing again.
The only thing you can actually play with, is what the serializer returns.
If you have your serializer return a struct, it is copied and stack-allocated. And it should not be mutable. Mutable structs discouraged. Why are mutable structs “evil”?
Structs are good with respect to the garbage collector as they avoid garbage collection. But they are typically to be used with very small objects, in the order of say 16-24 bytes max.
Classes are heap-allocated and are passed by reference always. So if your serializer returns a class, say a string, then you just pass that around by reference and overhead of copying will be very small as you only ever copy the reference, not the content.
If you want your serializer side-effecting, i.e. overwriting the same object (class, i.e. reference type is to be used for this), then the whole approach of IEnumerable<'T> and consequently seq is wrong. IEnumerables always give you new objects as result and should never modify any pre-existing object. The only state with them should be the information, at what place in the enumeration they are.
So if you need a side-effecting version, you could do something like (pseudo-code).
let readAndOverwrite stream target =
let position = // do something here to know the state
fun target ->
target.MyProp1 <- stream.ReadInt()
target.MyProp2 <- stream.ReadFloat()
Passing as byref does not seem very reasonable to me, as you then anyway allocate and garbage collect the object. So you can just as well do that in an immutable way. What you can do, is just modifying properties on your object instead.

How do I register an Arbitrary instance in FsCheck and have xUnit use it?

I've got a type Average with a field count that's a positive int64 and a double field called sum.
I made an arbitrary that generates valid instances with
let AverageGen = Gen.map2 (fun s c -> Average(float(s),int64(int(c))) (Arb.Default.NormalFloat().Generator) (Arb.Default.PositiveInt().Generator) |> Arb.fromGen
How do I get this to be generate arguments with type Average in Property style tests in xUnit?
[<Property>]
static member average_test(av:Average) = ...
type Generators =
static member TestCase() =
{ new Arbitrary<TestCase>() with
override x.Generator =
gen { ...
return TestCase(...) }}
[<Property(Arbitrary=[|typeof<Generators>|])>]
I think Vasily Kirichenko's solution is the correct one, but just for completeness sake, I've also been able to make it work with this imperative function invocation style:
do Arb.register<Generators>() |> ignore
...if you assume a Generators class as in Vasily Kirichenko's answer.
Edit, much later...
While the above, imperative approach may work, I never use it because of its impure nature. Instead, I sometimes use the Arbitrary directly from within the test. With the AverageGen value above (which I'll rename to averageGen, because values should be camelCased), it could look like this:
[<Property>]
let member average_test () =
Prop.forAll averageGen (fun avg ->
// The rest of the test goes here... )

F# quotations, arrays and self-identifier in constructors

I think that's a well-known limitation of F# but I couldn't find any good workarounds…
So, here is the code (I tried to make it as simple as possible, so probably it looks like it doesn't make any sense):
[<ReflectedDefinition>]
type Human (makeAName: unit -> string) as self =
let mutable cats : Cat array = [| |]
do
// get a cat
cats <- Array.append cats [| new Cat (self, makeAName ()) |]
member this.Cats = cats
and
[<ReflectedDefinition>]
Cat (owner : Human, name : string) = class end
The compiler says:
error FS0452: Quotations cannot contain inline assembly code or pattern matching on arrays
Actually it is the combination of as self and array property getter that breaks everything.
The points here are:
I really want to use arrays, because I want WebSharper to translate my collections to JavaSript arrays.
I really need a self-identifier in constructors.
I really need classes (i.e. functional style won't work).
Per-method self-identifiers (member this.Foo) work fine.
One workaround I can think of is making constructors private and using static methods to construct objects. This way I don't need as self. But it is just silly.
Are there any better options?
Update:
Here is an even simpler example:
[<ReflectedDefinition>]
type User (uid: int) as self =
let ROOT_UID = 0
member this.isRoot = (uid = ROOT_UID)
With as self I can't even define a class constant. Well, it's actually a separate question, but I'll ask it here: how do I define a class constant in this particular case?
I do not think it is silly at all. We actually prefer static constructor methods for clarity, even in code that does not use WebSharper. In the whole IntelliFactory codebase we rarely, if ever use self.
You are hitting two annoying limitations of F# compiler and quotations. As you point out, static methods can solve the self problem:
[<ReflectedDefinition>]
type Human private (cats: ref<Cat []>) =
member this.Cats = !cats
static member Create(makeAName: unit -> string) =
let cats = ref [| |]
let h = Human(cats)
let cat = Cat(h, makeAName())
cats := [| cat |]
h
and [<ReflectedDefinition>] Cat (owner: Human, name: string) =
class
end
There are many other ways to accomplish this, for example you can get rid of ref indirection.
Second, you often get FS0452 in ReflectedDefinition code with array operations, even in plain static methods. This usually can be resolved by using library functions instead of direct array access (Array.iter, Array.map).
For the second example, you really want this:
[<ReflectedDefinition>]
module Users =
[<Literal>]
let ROOT_UID = 0
type User(uid: int) =
member this.isRoot = (uid = ROOT_UID)
The [<Literal>] annotation will let you pattern-match on your constants, which can be handy if there is more than one.
For your points:
I really want to use arrays - that should be OK
I really need a self-identifier - it is never necessary, just as constructors are not
I really need classes (i.e. functional style won't work) - definitely not true
Per-method self-identifiers (member this.Foo) work fine - yes, and are useful

F# Functions vs. Values

This is a pretty simple question, and I just wanted to check that what I'm doing and how I'm interpreting the F# makes sense. If I have the statement
let printRandom =
x = MyApplication.getRandom()
printfn "%d" x
x
Instead of creating printRandom as a function, F# runs it once and then assigns it a value. So, now, when I call printRandom, instead of getting a new random value and printing it, I simply get whatever was returned the first time. I can get around this my defining it as such:
let printRandom() =
x = MyApplication.getRandom()
printfn "%d" x
x
Is this the proper way to draw this distinction between parameter-less functions and values? This seems less than ideal to me. Does it have consequences in currying, composition, etc?
The right way to look at this is that F# has no such thing as parameter-less functions. All functions have to take a parameter, but sometimes you don't care what it is, so you use () (the singleton value of type unit). You could also make a function like this:
let printRandom unused =
x = MyApplication.getRandom()
printfn "%d" x
x
or this:
let printRandom _ =
x = MyApplication.getRandom()
printfn "%d" x
x
But () is the default way to express that you don't use the parameter. It expresses that fact to the caller, because the type is unit -> int not 'a -> int; as well as to the reader, because the call site is printRandom () not printRandom "unused".
Currying and composition do in fact rely on the fact that all functions take one parameter and return one value.
The most common way to write calls with unit, by the way, is with a space, especially in the non .NET relatives of F# like Caml, SML and Haskell. That's because () is a singleton value, not a syntactic thing like it is in C#.
Your analysis is correct.
The first instance defines a value and not a function. I admit this caught me a few times when I started with F# as well. Coming from C# it seems very natural that an assignment expression which contains multiple statements must be a lambda and hence delay evaluated.
This is just not the case in F#. Statements can be almost arbitrarily nested (and it rocks for having locally scoped functions and values). Once you get comfortable with this you start to see it as an advantage as you can create functions and continuations which are inaccessible to the rest of the function.
The second approach is the standard way for creating a function which logically takes no arguments. I don't know the precise terminology the F# team would use for this declaration though (perhaps a function taking a single argument of type unit). So I can't really comment on how it would affect currying.
Is this the proper way to draw this
distinction between parameter-less
functions and values? This seems less
than ideal to me. Does it have
consequences in currying, composition,
etc?
Yes, what you describe is correct.
For what its worth, it has a very interesting consequence able to partially evaluate functions on declaration. Compare these two functions:
// val contains : string -> bool
let contains =
let people = set ["Juliet"; "Joe"; "Bob"; "Jack"]
fun person -> people.Contains(person)
// val contains2 : string -> bool
let contains2 person =
let people = set ["Juliet"; "Joe"; "Bob"; "Jack"]
people.Contains(person)
Both functions produce identical results, contains creates its people set on declaration and reuses it, whereas contains2 creates its people set everytime you call the function. End result: contains is slightly faster. So knowing the distinction here can help you write faster code.
Assignment bodies looking like function bodies have cought a few programmers unaware. You can make things even more interesting by having the assignment return a function:
let foo =
printfn "This runs at startup"
(fun () -> printfn "This runs every time you call foo ()")
I just wrote a blog post about it at http://blog.wezeku.com/2010/08/23/values-functions-and-a-bit-of-both/.

Resources