F# - passing reference cells to functions - f#

I'm just wondering if somebody can explain to me how to pass reference cells to functions that are not class members. I've been following the MSDN page on reference cells.
I have the following code:
let myint = ref 32
let mutable myint2 = 23
type addone() =
    member t.myadd1func (x:int byref) =
        x <- x + 1
let myadd1func (x:int byref) =
    x <- x + 1
let adder = new addone()
adder.myadd1func myint
// myadd1func myint <---- this line does not compile
myadd1func &myint2 // <----- this line does though
printfn "%d" !myint
printfn "%d" myint2
My question is... what is the fundamental difference between the call I am making to the "myadd1func" method on the class and the "myadd1func" function defined after it?
As I write this, I'm guessing that the function doesn't like having .NET object references passed to it, as this might break compatibility with other IL components? I don't mind using a mutable value, I just like to understand these things.
Thanks

I think the byref type in F# should be used only for interoperability purposes, where the existing features (as explained by kvb) are good enough. If you want to declare a function that modifies some argument passed to it, I would just use an ordinary reference cell (e.g. the int ref type):
let myadd1func (x:int ref) =
    x := !x + 1
let myint = ref 10
myadd1func myint
This may be slightly slower than using byref type (together with local mutable value), but I don't think it is needed very often in functional style, so it should be fine.

This is explained in the Type-directed Conversions at Member Invocations section of the F# specification. For interoperability with other .NET components, ref cells can be passed to members taking byref parameters and the compiler will automatically treat it as the dereferencing of the cell's contents field. However, this isn't done for let-bound functions, and you should directly use the addressof operator (&). You can still use a ref cell, but you have to explicitly dereference the contents field yourself, so this should work in your example: myadd1func &myint.contents
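To put the three call shapes side by side, here is a minimal sketch. The names Incrementer and addOne are made up for illustration and do not come from the question; the behaviour shown is the one described above (ref cells are converted at member invocations only).

```fsharp
// Hypothetical names, chosen only for this illustration.
type Incrementer() =
    // Member taking a byref parameter.
    member this.AddOne (x: byref<int>) = x <- x + 1

// Let-bound function taking a byref parameter.
let addOne (x: byref<int>) = x <- x + 1

let cell = ref 32
let mutable local = 23

let incrementer = Incrementer()
incrementer.AddOne cell   // member invocation: the ref cell is converted automatically
addOne &cell.contents     // let-bound function: take the address of the contents field
addOne &local             // let-bound function: take the address of a mutable value

printfn "%d %d" cell.Value local   // 34 24
```

Writing addOne cell instead would be rejected, because no conversion is applied for let-bound functions.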

Related

Are there use cases for single case variants in Ocaml?

I've been reading F# articles and they use single case variants to create distinct incompatible types. However in Ocaml I can use private module types or abstract types to create distinct types. Is it common in Ocaml to use single case variants like in F# or Haskell?
Another specialized use case of a single constructor variant is to erase some type information with a GADT (and an existential quantification).
For instance, in
type showable = Show: 'a * ('a -> string) -> showable
let show (Show (x,f)) = f x
let showables = [ Show (0,string_of_int); Show("string", Fun.id) ]
The constructor Show pairs an element of a given type with a printing function, then forgets the concrete type of the element. This makes it possible to have a list of showable elements, even if each element has a different concrete type.
For what it's worth it seems to me this wasn't particularly common in OCaml in the past.
I've been reluctant to do this myself because it has always cost something: the representation of type t = T of int was always bigger than just the representation of an int.
However recently (probably a few years) it's possible to declare types as unboxed, which removes this obstacle:
type t = T of int [@@unboxed]
As a result I've personally been using single-constructor types much more frequently recently. There are many advantages. For me the main one is that I can have a distinct type that's independent of whether its representation happens to be the same as another type.
You can of course use modules to get this effect, as you say. But that is a fairly heavy solution.
(All of this is just my opinion naturally.)
Yet another case for single-constructor types (although it does not quite match your initial question of creating distinct types): fancy records. (By contrast with other answers, this is more a syntactic convenience than a fundamental feature.)
Indeed, using a relatively recent feature (introduced with OCaml 4.03, in 2016) which allows writing constructor arguments with a record syntax (including mutable fields!), you can prefix regular records with a constructor name, Coq-style.
type t = MakeT of {
mutable x : int ;
mutable y : string ;
}
let some_t = MakeT { x = 4 ; y = "tea" }
(* val some_t : t = MakeT {x = 4; y = "tea"} *)
It does not change anything at runtime (just like Constr (a,b) has the same representation as (a,b), provided Constr is the only constructor of its type). The constructor makes the code a bit more explicit to the human eye, and it also provides the type information required to disambiguate field names, thus avoiding the need for type annotations. It is similar in function to the usual module trick, but more systematic.
Patterns work just the same:
let (MakeT { x ; y }) = some_t
(* val x : int = 4 *)
(* val y : string = "tea" *)
You can also access the “contained” record (at no runtime cost), read and modify its fields. This contained record however is not a first-class value: you cannot store it, pass it to a function nor return it.
let (MakeT fields) = some_t in fields.x (* returns 4 *)
let (MakeT fields) = some_t in fields.x <- 42
(* some_t is now MakeT {x = 42; y = "tea"} *)
let (MakeT fields) = some_t in fields
(* ^^^^^^
Error: This form is not allowed as the type of the inlined record could escape. *)
Another use case of single-constructor (polymorphic) variants is documenting something to the caller of a function. For instance, perhaps there's a caveat with the value that your function returns:
val create : unit -> [ `Must_call_close of t ]
Using a variant forces the caller of your function to pattern-match on this variant in their code:
let (`Must_call_close t) = create () in (* ... *)
This makes it more likely that they'll pay attention to the message in the variant, as opposed to documentation in an .mli file that could get missed.
For this use case, polymorphic variants are a bit easier to work with as you don't need to define an intermediate type for the variant.

Why is passing a ref type into a F# function expecting a byref a type error?

let a = ref 0
let f (x: byref<int>) = x
f a // type error
System.Int32.TryParse("123",a) // works
f a being a type error is puzzling to me since a can be passed into .NET library methods with a byref<int> type. Why?
Edit: I think I really explained the question poorly. The type of System.Int32.TryParse is string * byref<int> -> bool and yet it works. So why can't I pass a into a function of type x:byref<int> -> int? That is all I am asking.
This feature is described in section 8.13.7 of the F# spec. The ability to use a ref when a byref is expected is enabled by a "type-directed conversion", but these are applied only on member invocations, not on regular function applications.
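As a rough illustration of that rule, the sketch below calls the same .NET member in three ways. All three are method calls, so the compiler is free to adapt the arguments; the variable names are arbitrary.

```fsharp
open System

// 1. A ref cell where the out parameter is expected: accepted because
//    TryParse is a member, so the type-directed conversion applies.
let cell = ref 0
let ok1 = Int32.TryParse("123", cell)

// 2. A mutable value passed explicitly by address.
let mutable n = 0
let ok2 = Int32.TryParse("456", &n)

// 3. The out argument left off: F# returns it as the second tuple element.
let ok3, parsed = Int32.TryParse("789")

printfn "%b %d, %b %d, %b %d" ok1 cell.Value ok2 n ok3 parsed
```

Only the second form (the explicit address-of) carries over to a let-bound function such as f.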
The only thing I see wrong with that code is the type annotation: try int ref instead of byref<int>.
Full code:
let a = ref 0
let f (x: int ref) = x
f a // compiles with the int ref annotation
System.Int32.TryParse("123",a) // works
Edit:
Sorry, I misunderstood your question. This one is a bit vague on F#'s part, and I do think F# needs to improve its error messages a bit. What is happening is that C# did not originally have tuples, so it needed out parameters in order to return multiple values. When you see byref<int> in the signature of a .NET method, that is how F# displays such an out (or ref) parameter.

nested runtime coercion for a Nullable double

Let's say I have a value defined as a sort of commission formula
let address_commission = 1.0 // minimal simplified example
and I want to apply this commission to an amount I'm reading from the DB (the code is from a Windows WCF service I have in production)
let address_commission = 1.0 // minimal simplified example
new Model.ClaimModel(
    //RequestRow = i, recounting
    Code = (row.["claim_code"] :?> string),
    EvtDate = (row.["event_date"] :?> DateTime),
    // skipping lines...
    Amount = (row.["amount"] :?> double) * address_commission,
now I see that the amount line compiles fine, but I also need to include the same commission in the following
PrevAmount = (if row.IsNull("prev_amount") then Nullable() else (row.["prev_amount"] :?> Nullable<double>)),
which is wrong since The type 'float' does not match the type 'obj'
Therefore I've tried also
PrevAmount = (if row.IsNull("prev_amount") then Nullable() else (((row.["prev_amount"] :?> double) * address_commission) :?> Nullable<double>)),
but it also fails with The type 'double' does not have any proper subtypes and cannot be used as the source of a type test or runtime coercion.
What is the correct way to handle this?
:?> is a dynamic cast that is only checked at run time, so it is better to avoid it where you can. If you are accessing databases, it helps to open the FSharp.Linq.NullableOperators module. (The link is gone for me but it's somewhere on docs or msdn). Then you can use ?*? and similar operators. For example:
let x = System.Nullable<float> 4.
let y = x ?* 3.0
//val y : System.Nullable<float> = 12.0
You can have ? on either or both sides.
You will get back a Nullable float, which you can convert to an option with Option.ofNullable(y) or to a plain double with float y.
I'm going to use only one type coercion and wrap it within a Nullable(...)
PrevAmount = (if row.IsNull("prev_amount") then Nullable() else Nullable((row.["prev_amount"] :?> double) * address_commission)),
It compiles and looks ok to me, but I'm still open to different answers if they are more correct than mine
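A variation on the same idea, in case it helps: the cast can be replaced by a type-test pattern, which handles the null check and the unboxing in one place. This is only a sketch. The helper name scaleNullable and the commission value 1.5 are invented for the example, and raw stands for whatever row.["prev_amount"] returns.

```fsharp
open System

// Hypothetical stand-in for the commission formula.
let addressCommission = 1.5

// Turns a raw database cell into a Nullable<float> with the commission applied.
let scaleNullable (raw: obj) : Nullable<float> =
    match raw with
    | null -> Nullable()                      // no value at all
    | :? DBNull -> Nullable()                 // database NULL
    | :? float as amount -> Nullable(amount * addressCommission)
    | other -> invalidArg "raw" (sprintf "unexpected cell type %s" (other.GetType().Name))

let present = scaleNullable (box 10.0)
let missing = scaleNullable (box DBNull.Value)
printfn "%b %b" present.HasValue missing.HasValue
```

The multiplication happens on the plain float before it is wrapped, so only one coercion is needed, just as in the answer above.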

IEnumerator in F# continued

This is a follow-up to my earlier question about the Persistence class that I was trying to expose as an enumerator. I realized that I really need to pass by reference in order to change the value of the object that I am trying to populate. I guess I am going about this in a C++ way (as most may have guessed, I am an F# beginner). However, I want to be as efficient in terms of memory footprint as I can. Ideally I would like to reuse the same object over and over again when I read from a file.
I am having a problem with this code where it does not allow me to pass by reference in the call to the function serialize. I am again reproducing the code here. I thank you in advance for your help.
The error I get:
error FS0001: This expression was expected to have type byref<'T> but here has type 'T
If I change the call to serialize(& current_, reader_) I get the following error:
persistence.fs(71,6): error FS0437: A type would store a byref typed value. This is not permitted by Common IL.
persistence.fs(100,29): error FS0412: A type instantiation involves a byref type. This is not permitted by the rules of Common IL.
persistence.fs(100,30): error FS0423: The address of the field current_ cannot be used at this point
The CODE:
type BinaryPersistenceIn<'T when 'T: (new : unit -> 'T)>(fn: string, serializer: ('T byref * BinaryReader) -> unit) =
    let stream_ = File.Open(fn, FileMode.Open, FileAccess.Read)
    let reader_ = new BinaryReader(stream_)
    let mutable current_ = new 'T()
    let eof() =
        stream_.Position = stream_.Length
    interface IEnumerator<'T> with
        member this.Current
            with get() = current_
        member this.Dispose() =
            stream_.Close()
            reader_.Close()
    interface System.Collections.IEnumerator with
        member this.Current
            with get() = current_ :> obj
        member this.Reset() =
            stream_.Seek((int64) 0., SeekOrigin.Begin) |> ignore
        member this.MoveNext() =
            let mutable ret = eof()
            if stream_.CanRead && ret then
                serializer( current_, reader_)
            ret
You can circumvent this by introducing a mutable local, passing it to serialize, and then assigning back to current_:
member this.MoveNext() =
    // true while there is still data left to read
    let hasMore = stream_.CanRead && not (eof())
    if hasMore then
        let mutable deserialized = Unchecked.defaultof<_>
        serializer( &deserialized, reader_)
        current_ <- deserialized
    hasMore
But now this is becoming really, really unsettling. Notice the use of Unchecked.defaultof<_>? There is no other way to initialize a value of unknown type, and it's called "unchecked" for a reason: the compiler can't guarantee safety of this code.
I strongly advise that you explore other ways of achieving your initial goal, such as using a seq computation expression instead, as I have suggested in your other question.
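For reference, a sequence-based reader along those lines might look like the sketch below. The names readAll and deserialize are assumptions made for this example (they are not from the question), and deserialize is assumed to return a fresh value rather than fill a byref.

```fsharp
open System.IO

// Lazily reads values from a binary file until the end of the stream.
// 'deserialize' is a hypothetical callback that reads one value.
let readAll (fn: string) (deserialize: BinaryReader -> 'T) : seq<'T> =
    seq {
        // Both resources are disposed when enumeration finishes or is abandoned.
        use stream = File.Open(fn, FileMode.Open, FileAccess.Read)
        use reader = new BinaryReader(stream)
        let eof () = stream.Position = stream.Length
        while not (eof ()) do
            yield deserialize reader
    }
```

A caller would write something like readAll "data.bin" (fun r -> r.ReadInt32()) and iterate over the result; nothing is opened or read until the sequence is enumerated.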
With respect to memory footprint, let's analyze the sequence option:
You have an instance of the seq. That's going to be some class implementing IEnumerable<'T>. This one will be held until you no longer need the seq, i.e. not reallocated each time.
You hold a Stream as part of the seq, with the same lifetime.
You hold a BinaryReader as part of the seq, with the same lifetime.
eof : unit -> bool is a compiler-generated function class as part of the seq, with the same lifetime.
The loop will use a bool for the while loop and the if condition. Both of which are stack-allocated structs and needed for the branching logic.
And finally, you yield an instance that you already got from the serializer.
Conceptually, that's as little memory consumption as you can have for a lazily evaluated seq. Once an element is consumed, it can be garbage collected. Multiple evaluations will do the same thing again.
The only thing you can actually play with, is what the serializer returns.
If you have your serializer return a struct, it is copied and stack-allocated. And it should not be mutable: mutable structs are discouraged (see the question "Why are mutable structs 'evil'?").
Structs are good with respect to the garbage collector as they avoid garbage collection. But they are typically to be used with very small objects, in the order of say 16-24 bytes max.
Classes are heap-allocated and are passed by reference always. So if your serializer returns a class, say a string, then you just pass that around by reference and overhead of copying will be very small as you only ever copy the reference, not the content.
If you want your serializer side-effecting, i.e. overwriting the same object (class, i.e. reference type is to be used for this), then the whole approach of IEnumerable<'T> and consequently seq is wrong. IEnumerables always give you new objects as result and should never modify any pre-existing object. The only state with them should be the information, at what place in the enumeration they are.
So if you need a side-effecting version, you could do something like (pseudo-code).
let readAndOverwrite stream target =
    let position = // do something here to know the state
    fun target ->
        target.MyProp1 <- stream.ReadInt()
        target.MyProp2 <- stream.ReadFloat()
Passing as byref does not seem very reasonable to me, as you then allocate and garbage collect the object anyway, so you can just as well do that in an immutable way. What you can do instead is simply modify properties on your object.

F# ref-mutable vars vs object fields

I'm writing a parser in F#, and it needs to be as fast as possible (I'm hoping to parse a 100 MB file in less than a minute). As normal, it uses mutable variables to store the next available character and the next available token (i.e. both the lexer and the parser proper use one unit of lookahead).
My current partial implementation uses local variables for these. Since closure variables can't be mutable (anyone know the reason for this?) I've declared them as ref:
let rec read file includepath =
    let c = ref ' '
    let k = ref NONE
    let sb = new StringBuilder()
    use stream = File.OpenText file
    let readc() =
        c := stream.Read() |> char
    // etc
I assume this has some overhead (not much, I know, but I'm trying for maximum speed here), and it's a little inelegant. The most obvious alternative would be to create a parser class object and have the mutable variables be fields in it. Does anyone know which is likely to be faster? Is there any consensus on which is considered better/more idiomatic style? Is there another option I'm missing?
You mentioned that local mutable values cannot be captured by a closure, so you need to use ref instead. The reason for this is that mutable values captured in a closure need to be allocated on the heap (because the closure itself is allocated on the heap).
F# forces you to write this explicitly (using ref). In C# you can "capture a mutable variable", but the compiler translates it to a field in a heap-allocated object behind the scenes, so it will be on the heap anyway.
Summary is: If you want to use closures, mutable variables need to be allocated on the heap.
Now, regarding your code - your implementation uses ref, which creates a small object for every mutable variable that you're using. An alternative would be to create a single object with multiple mutable fields. Using records, you could write:
type ReadClosure = {
    mutable c : char
    mutable k : SomeType } // whatever type you use here
let rec read file includepath =
    let state = { c = ' '; k = NONE }
    // ...
    let readc() =
        state.c <- stream.Read() |> char
    // etc...
This may be a bit more efficient, because you're allocating a single object instead of a few objects, but I don't expect the difference will be noticeable.
There is also one confusing thing about your code - the stream value will be disposed after the function read returns, so the call to stream.Read may be invalid (if you call readc after read completes).
let rec read file includepath =
    let c = ref ' '
    use stream = File.OpenText file
    let readc() =
        c := stream.Read() |> char
    readc
let f = read a1 a2
f() // This would fail!
I'm not quite sure how you're actually using readc, but this may be a problem to think about. Also, if you're declaring it only as a helper closure, you could probably rewrite the code without closure (or write it explicitly using tail-recursion, which is translated to imperative loop with mutable variables) to avoid any allocations.
I did the following profiling:
let test() =
    tic()
    let mutable a = 0.0
    for i=1 to 10 do
        for j=1 to 10000000 do
            a <- a + float j
    toc("mutable")
let test2() =
    tic()
    let a = ref 0.0
    for i=1 to 10 do
        for j=1 to 10000000 do
            a := !a + float j
    toc("ref")
The average for mutable is 50ms, while for ref it is 600ms. The difference arises because mutable variables live on the stack, while ref variables live in the managed heap.
The relative difference is big. However, 10^8 accesses is a large number, and the total time is still acceptable. So don't worry too much about the performance of ref variables. And remember:
Premature optimization is the root of all evil.
My advice is that you first finish your parser, then consider optimizing it. You won't know where the bottleneck is until you actually run the program. One good thing about F# is that its terse syntax and functional style support code refactoring well, so once the code is done, optimizing it is convenient. Here's a profiling example.
Just another example: we use .NET arrays every day, and they also live in the managed heap:
let test3() =
    tic()
    let a = Array.create 1 0.0
    for i=1 to 10 do
        for j=1 to 10000000 do
            a.[0] <- a.[0] + float j
    toc("array")
test3() runs about as fast as the ref version. If you worried that much about variables in the managed heap, you would have to stop using arrays as well.
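The tic and toc helpers used in these tests are not shown. For anyone who wants to repeat the comparison, here is a self-contained sketch that uses System.Diagnostics.Stopwatch in their place. The helper names time, sumMutable and sumRef are made up for this example, and the timings will depend on the machine, the compiler version and whether optimizations are enabled.

```fsharp
open System.Diagnostics

// Runs f once and prints how long it took (stand-in for tic/toc).
let time (label: string) (f: unit -> float) =
    let sw = Stopwatch.StartNew()
    let result = f ()
    sw.Stop()
    printfn "%s: %d ms" label sw.ElapsedMilliseconds
    result

// Accumulates into a local mutable value.
let sumMutable (n: int) =
    let mutable a = 0.0
    for j = 1 to n do
        a <- a + float j
    a

// Accumulates into a heap-allocated ref cell.
let sumRef (n: int) =
    let a = ref 0.0
    for j = 1 to n do
        a.Value <- a.Value + float j
    a.Value

time "mutable" (fun () -> sumMutable 10000000) |> ignore
time "ref" (fun () -> sumRef 10000000) |> ignore
```

Returning the sum from each function keeps the loop from being dead code, and lets you check that both variants compute the same result.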
