I have come across two ways to create a properly typed null value in F#:
// null with cast
let mutable x = null :> MyReferenceType
// Unchecked.defaultof
let mutable x = Unchecked.defaultof<MyReferenceType>
Is there any reason to favour one over the other?
There's no difference, sometimes there is just more than one way to do things. But usually Unchecked.defaultof() is used in a generic context where you don't know the type ahead of time and so you don't know the default. You certainly know that the default for MyReferenceType is null. So I would go with the first declaration. And actually, even that is a tad ugly, with the cast. Most folks would probably just write that as
let x : MyReferenceType = null
Related
This is the question about the earlier Persistence class that I was trying to expose as an enumerator. I realized that I need to pass by reference really to change the value of of the object that I am trying to populate. I guess I am going about this in a C++ way (As most may have guessed I am an F# beginner). However, I want to be as efficient in terms of memory foot print as I can. Ideally I would like to reuse the same object over and over again when I read from a file.
I am having a problem with this code where it does not allow me to pass by reference in the call to the function serialize. I am again reproducing the code here. I thank you in advance for your help.
The error I get:
error FS0001: This expression was expected to have type byref<'T> but here has type 'T
If I change the call to serialize(& current_, reader_) I get the following error:
persistence.fs(71,6): error FS0437: A type would store a byref typed value. This is not permitted by Common IL.
persistence.fs(100,29): error FS0412: A type instantiation involves a byref type. This is not permitted by the rules of Common IL.
persistence.fs(100,30): error FS0423: The address of the field current_ cannot be used at this point
The CODE:
type BinaryPersistenceIn<'T when 'T: (new : unit -> 'T)>(fn: string, serializer: ('T byref * BinaryReader) -> unit) =
let stream_ = File.Open(fn, FileMode.Open, FileAccess.Read)
let reader_ = new BinaryReader(stream_)
let mutable current_ = new 'T()
let eof() =
stream_.Position = stream_.Length
interface IEnumerator<'T> with
member this.Current
with get() = current_
member this.Dispose() =
stream_.Close()
reader_.Close()
interface System.Collections.IEnumerator with
member this.Current
with get() = current_ :> obj
member this.Reset() =
stream_.Seek((int64) 0., SeekOrigin.Begin) |> ignore
member this.MoveNext() =
let mutable ret = eof()
if stream_.CanRead && ret then
serializer( current_, reader_)
ret
You can circumvent this by introducing a mutable local, passing it to serialize, and then assigning back to current_:
member this.MoveNext() =
let mutable ret = eof()
if stream_.CanRead && ret then
let mutable deserialized = Unchecked.defaultof<_>
serializer( &deserialized, reader_)
current_ <- deserialized
ret
But now this is becoming really, really unsettling. Notice the use of Unchecked.defaultof<_>? There is no other way to initialize a value of unknown type, and it's called "unchecked" for a reason: the compiler can't guarantee safety of this code.
I strongly advise that you explore other ways of achieving your initial goal, such as using a seq computation expression instead, as I have suggested in your other question.
With respect to memory footprint, let's analyze the sequence option:
You have an instance of the seq. That's going to be some class implementing IEnumerable<'T>. This one will be held until you no longer need the seq, i.e. not reallocated each time.
You hold a Stream as part of the seq, with the same lifetime.
You hold a BinaryReader as part of the seq, with the same lifetime.
eof : unit -> bool is a compiler-generated function class as part of the seq, with the same lifetime.
The loop will use a bool for the while loop and the if condition. Both of which are stack-allocated structs and needed for the branching logic.
And finally, you yield an instance that you already got from the serializer.
Conceptually, that's as little memory consumption as you can have for a lazily evaluated seq. Once an element is consumed, it can be garbage collected. Multiple evaluations will do the same thing again.
The only thing you can actually play with, is what the serializer returns.
If you have your serializer return a struct, it is copied and stack-allocated. And it should not be mutable. Mutable structs discouraged. Why are mutable structs “evil”?
Structs are good with respect to the garbage collector as they avoid garbage collection. But they are typically to be used with very small objects, in the order of say 16-24 bytes max.
Classes are heap-allocated and are passed by reference always. So if your serializer returns a class, say a string, then you just pass that around by reference and overhead of copying will be very small as you only ever copy the reference, not the content.
If you want your serializer side-effecting, i.e. overwriting the same object (class, i.e. reference type is to be used for this), then the whole approach of IEnumerable<'T> and consequently seq is wrong. IEnumerables always give you new objects as result and should never modify any pre-existing object. The only state with them should be the information, at what place in the enumeration they are.
So if you need a side-effecting version, you could do something like (pseudo-code).
let readAndOverwrite stream target =
let position = // do something here to know the state
fun target ->
target.MyProp1 <- stream.ReadInt()
target.MyProp2 <- stream.ReadFloat()
Passing as byref does not seem very reasonable to me, as you then anyway allocate and garbage collect the object. So you can just as well do that in an immutable way. What you can do, is just modifying properties on your object instead.
I have an object that can be neatly described by a discriminated union. The tree that it represents has some properties that can be easily updated when the tree is modified (but remaining immutable) but that are relatively expensive to recalculate.
I would like to store those properties along with the object as cached values but I don't want to put them into each of the discriminated union cases so I figured a member variable would fit here.
The question is then, how do I change the member value (when I modify the tree) without mutating the actual object? I know I could modify the tree and then mutate that copy without ruining purity but that seems like a wrong way to go about it to me. It would make sense to me if there was some predefined way to change a property but so that the result of the operation is a new object with that property changed.
To clarify, when I say modify I mean doing it in a functional way. Like (::) "appends" to the beginning of a list. I'm not sure what the correct terminology is here.
F# actually has syntax for copy and update records.
The syntax looks like this:
let myRecord3 = { myRecord2 with Y = 100; Z = 2 }
(example from the MSDN records page - http://msdn.microsoft.com/en-us/library/dd233184.aspx).
This allows the record type to be immutable, and for large parts of it to be preserved, whilst only a small part is updated.
The cleanest way to go about it would really be to carry the 'cached' value attached to the DU (as part of the case) in one way or another. I could think of several ways to implement this, I'll just give you one, where there are separate cases for the cached and non-cached modes:
type Fraction =
| Frac of int * int
| CachedFrac of (int * int) * decimal
member this.AsFrac =
match this with
| Frac _ -> this
| CachedFrac (tup, _) -> Frac tup
An entirely different option would be to keep the cached values in a separate dictionary, this is something that makes sense if all you want to do is save some time recalculating them.
module FracCache =
let cache = System.Collections.Generic.Dictionary<Fraction, decimal>()
let modify (oldFrac: Fraction) (newFrac: Fraction) =
cache.[newFrac] <- cache.[oldFrac] + 1 // need to check if oldFrac has a cached value as well.
Basically what memoize would give you plus you have more control over it.
The first definition below produces the warning in the title when compiled with f# 3.0 and the warning level set to 5. The second definition compiles cleanly. I wondered if someone could please explain just what the compiler worries I might accidentally mutate, or how would splitting the expression with a let clause help avoid that. Many thanks.
let ticks_with_warning () : int64 =
System.DateTime.Now.Ticks
let ticks_clean () : int64 =
let t = System.DateTime.Now
t.Ticks
I cannot really explain why the compiler emits this warning in your particular case - I agree with #ildjarn that you can safely ignore it, because the compiler is probably just being overly cautious.
However, I can give you an example where the warning might actually give you a useful hint that something might not go as you would expect. If we had a mutable struct like this:
[<Struct>]
type Test =
val mutable ticks : int64
member x.Inc() = x.ticks <- x.ticks + 1L
new (init) = { ticks = init }
Now, the Inc method mutates the struct (and you can also access the mutable field ticks). We can try writing a function that creates a Test value and mutates it:
let foo () =
let t = Test(1L)
t.Inc() // Warning: The value has been copied to ensure the original is not mutated
t
We did not mark the local value t as mutable, so the compiler tries to make sure the value is not mutated when we call Inc. It does not know whether Inc mutates the value or not, so the only safe thing is to create a copy - and thus foo returns the value Test(1L).
If we mark t as mutable, then the compiler does not have to worry about mutating it as a result of a call and so it does not give the warning (and the function returns Test(2L)):
let foo () =
let mutable t = Test(1L)
t.Inc()
t
I'm not really sure what is causing the warning in your example though. Perhaps the compiler thinks (as a result of some intermediate representation) that Ticks operation could mutate the left-hand-side value (System.DateTime.Now and t respectively) and it wants to prevent that.
The odd thing is that if you write your own DateTime struct in F#, you get a warning in both cases unless you mark the variable t as mutable (which is what I'd expect), but the behaviour with standard DateTime is different. So perhaps the compiler knows something about the standard type that I'm missing...
Any way to declare a new variable in F# without assigning a value to it?
See Aidan's comment.
If you insist, you can do this:
let mutable x = Unchecked.defaultof<int>
This will assign the absolute zero value (0 for numeric types, null for reference types, struct-zero for value types).
It would be interesting to know why the author needs this in F# (simple example of intended use would suffice).
But I guess one of the common cases when you may use uninitialised variable in C# is when you call a function with out parameter:
TResult Foo<TKey, TResult>(IDictionary<TKey, TResult> dictionary, TKey key)
{
TResult value;
if (dictionary.TryGetValue(key, out value))
{
return value;
}
else
{
throw new ApplicationException("Not found");
}
}
Luckily in F# you can handle this situation using much nicer syntax:
let foo (dict : IDictionary<_,_>) key =
match dict.TryGetValue(key) with
| (true, value) -> value
| (false, _) -> raise <| ApplicationException("Not Found")
You can also use explicit field syntax:
type T =
val mutable x : int
I agree with everyone who has said "don't do it". However, if you are convinced that you are in a case where it really is necessary, you can do this:
let mutable naughty : int option = None
...then later to assign a value.
naughty <- Some(1)
But bear in mind that everyone who has said 'change your approach instead' is probably right. I code in F# full time and I've never had to declare an unassigned 'variable'.
Another point: although you say it wasn't your choice to use F#, I predict you'll soon consider yourself lucky to be using it!
F# variables are by default immutable, so you can't assign a value later. Therefore declaring them without an initial value makes them quite useless, and as such there is no mechanism to do so.
Arguably, a mutable variable declaration could be declared without an initial value and still be useful (it could acquire an initial default like C# variables do), but F#'s syntax does not support this. I would guess this is for consistency and because mutable variable slots are not idiomatic F# so there's little incentive to make special cases to support them.
Why aren't option types like "int option" compatible with nullable types like "Nullable"?
I assume there is some semantic reason for the difference, but I can't figure what that is.
An option in F# is used when a value may or may not exist. An option has an underlying type and may either hold a value of that type or it may not have a value.
http://msdn.microsoft.com/en-us/library/dd233245%28VS.100%29.aspx
That sure sounds like the Nullable structure.
Because of the runtime representation choice for System.Nullable<'T>.
Nullable tries to represent the absent of values by the null pointer, and present values by pointers to those values.
(new System.Nullable<int>() :> obj) = null
|> printfn "%b" // true
(new System.Nullable<int>(1) :> obj).GetType().Name
|> printfn "%s" // Int32
Now consider strings. Unfortunately, strings are nullable. So this is valid:
null : string
But now a null runtime value is ambiguous - it can refer to either the absence of a value or a presence of a null value. For this reason, .NET does not allow constructing a System.Nullable<string>.
Contrast this with:
(Some (null : string) :> obj).GetType().Name
|> printfn "%s" // Option`1
That being said, one can define a bijection:
let optionOfNullable (a : System.Nullable<'T>) =
if a.HasValue then
Some a.Value
else
None
let nullableOfOption = function
| None -> new System.Nullable<_>()
| Some x -> new System.Nullable<_>(x)
If you observe the types, these functions constrain 'T to be a structure and have a zero-argument constructor. So perhaps F# compiler could expose .NET functions receiving/returning Nullable<'T> by substituting it for an Option<'T where 'T : struct and 'T : (new : unit -> 'T)>, and inserting the conversion functions where necessary..
The two have different semantics. Just to name one, Nullable is an idempotent data constructor that only works on value types, whereas option is a normal generic type. So you can't have a
Nullable<Nullable<int>>
but you can have an
option<option<int>>
Generally, though there are some overlapping scenarios, there are also things you can do with one but not the other.
Key difference is that must test the option type to see if it has a value. See this question for a good description of its semantics: How does the option type work in F#
Again, this is from my limited understanding, but the problem probably lies in how each gets rendered in the IL. The "nullable" structure probably gets handled slightly different from the option type.
You will find that the interactions between various .Net languages really boils down to how the IL gets rendered. Mostof the time it works just fine but on occasion, it causes issues. (check out this). Just when you thought it was safe to trust the level of abstraction. :)