I made a data structure in F# that looks like a queue, though I'm not sure I can call it a "queue", since it can randomly access and edit the middle of the sequence.
type item =
    {
        name : string
        mutable point : int
    }

type Queue = class
    val mutable q : item []

    new () = { q = Array.empty; }

    member this.enq i =
        let mutable b = true
        try
            this.q <- Array.append this.q [|i|]
        with
        | _ as e -> (b <- false)
        b

    member this.deq (i : byref<item>) =
        let mutable b = true
        try
            i <- this.q.[0]
            this.q <- Array.sub this.q 1 (this.q.Length - 1)
        with
        | _ as e -> (b <- false)
        b

    member this.edit p idx =
        let mutable b = true
        try
            this.q.[idx].point <- p
        with
        | _ as e -> (b <- false)
        b

    member this.print =
        let mutable b = true
        try
            for j = 0 to (this.q.Length - 1) do
                printfn "%s %i" this.q.[j].name this.q.[j].point
        with
        | _ as e -> (b <- false)
        b
end
So how can I make my something-that-looks-like-a-queue thread-safe?
In addition to the answers given by Just another metaprogrammer, one way that I've handled "I need a thread-safe version of X" in F# is to put it behind a MailboxProcessor. The mailbox processor ensures that things are processed in order, one at a time, so you don't have to worry about the concurrency. While using immutable data types and the algorithms that go along with them is certainly preferable, sometimes you're stuck needing a type or algorithm that supports mutation. In those cases, I like to "protect" that thing with a MailboxProcessor, making it the mailbox processor's private state, where nothing else can access it. Consumers who need to use it then go through the mailbox processor, which ensures that they don't mess each other up.
For your case, you could do something like this:
type private QueueMessage<'a> =
    | Enqueue of 'a
    | Dequeue of AsyncReplyChannel<'a>
    | Edit of int * ('a -> unit)

type SafeQueue<'a> () as this =
    let agent =
        MailboxProcessor<QueueMessage<'a>>.Start
        <| fun inbox ->
            let rec loop queue =
                async {
                    let! message = inbox.Receive()
                    match message with
                    | Enqueue item ->
                        let newState = queue.enqueue item
                        return! loop newState
                    | Dequeue channel ->
                        let item = queue.dequeue()
                        channel.Reply item
                        return! loop queue
                    | Edit (idx, f) ->
                        f <| queue.[idx]
                        return! loop queue
                }
            loop <| new Queue<'a>()

    let enqueue item =
        agent.Post <| Enqueue item

    let dequeue () =
        agent.PostAndReply Dequeue

    let edit idx f =
        agent.Post(Edit (idx, f))

    member __.Enqueue item = enqueue item
    member __.Dequeue () = dequeue ()
    member __.Edit (idx, f) = edit idx f
It seems the OP is asking for a thread-safe queue that supports enqueue, dequeue and update.
Thread-safe can mean many things; for example, should the operations block (e.g. when dequeueing an empty queue)? Are locks acceptable or not?
One approach is to implement a thread-safe mutable queue using locks and a ResizeArray. The idea is that the ResizeArray is the backing data structure, and a lock object is used to coordinate all access to it, so that only one thread can access it at any time; this also prevents memory reordering.
An example of a lockful queue using ResizeArray (no guarantees of correctness given, these things are tricky):
open System

// A Queue is a pair of a ResizeArray and an object that serves as the lock
type Queue<'T> = Q of ResizeArray<'T>*obj

// Creates a queue with an initial capacity
let create capacity =
    let q = ResizeArray<'T>()
    q.Capacity <- capacity
    Q (q, obj ())

// enqueue uses lock on the lock object to ensure all access to the
// ResizeArray is coordinated, so that only one thread can update it at a time.
// While inside the lock, we add the element to the queue.
let enqueue v (Q (q, l)) = lock l <| fun () -> q.Add v

// Same idea as with enqueue, but inside the lock we remove the first element
// and return it.
let dequeue (Q (q, l)) = lock l <| fun () ->
    if q.Count > 0 then
        let v = q.[0]
        q.RemoveAt 0 // This is not an efficient dequeue as it's O(n); example only
        Some v
    else
        None

// As with enqueue and dequeue, the update operation waits until it's under
// the lock to update the queue. Note: this is likely not a good pattern, as
// with multiple threads consuming and producing elements the index is likely
// wrong. Better would be an update operation that gets exclusive access to
// the ResizeArray to perform its operations.
let update v i (Q (q, l)) = lock l <| fun () ->
    if i < q.Count then
        q.[i] <- v
    else
        raise (ArgumentOutOfRangeException "i")

let example () =
    printfn "Using Locked Resize Array"
    let q = create 10
    enqueue 1 q
    enqueue 2 q
    enqueue 3 q
    update 12 1 q
    printfn "%A" <| (dequeue q)
    printfn "%A" <| (dequeue q)
    printfn "%A" <| (dequeue q)
    printfn "%A" <| (dequeue q)
Getting this kind of thread-safe code correct is tricky. In addition, the code above relies on locking, which can be bad for performance as it stalls threads.
Better is to use the built-in ConcurrentQueue. The drawback for the OP is that this queue doesn't support update. The pros are many, though, as this queue requires no lock to be used safely. It's internally implemented using lock-free techniques to ensure that threads aren't stalled (in perverse scenarios one could imagine soft stalling, since threads are guaranteed not to stall but are not guaranteed progress).
So, if the OP can live without the update function:
open System.Collections.Concurrent

type Queue<'T> = Q of ConcurrentQueue<'T>

let create () =
    let q = ConcurrentQueue<'T> ()
    Q q

// No need for any locks, as this is a thread-safe collection
let enqueue v (Q q) = q.Enqueue v

// No need for any locks, as this is a thread-safe collection
let dequeue (Q q) =
    let b, v = q.TryDequeue ()
    if b then Some v else None

let example () =
    printfn "Using Concurrent Queue"
    let q = create ()
    enqueue 1 q
    enqueue 2 q
    enqueue 3 q
    printfn "%A" <| (dequeue q)
    printfn "%A" <| (dequeue q)
    printfn "%A" <| (dequeue q)
    printfn "%A" <| (dequeue q)
Immutable collections were mentioned as a possible approach, and that is certainly possible, but the purpose of the thread-safe queue is likely that it should be shared and processed by multiple threads. The shared binding needs to be mutable to allow enqueue and dequeue, because an immutable (sometimes called persistent) data structure never changes; enqueue and dequeue produce a new queue.
The updating code still needs to consider what happens when multiple threads read and update the mutable binding.
Simply overwriting the mutable binding with a new queue might cause double processing of queue elements, lost elements and undefined behavior because of memory-reordering issues. All of these are very hard to debug, even more so because they are likely rare.
It's fixable, though, by ensuring the mutable binding is updated under a lock or in a lock-free CAS loop.
open System
open System.Threading

// Very simplistic immutable queue that consists of two lists.
// The first list is being dequeued from.
// The second list is being enqueued to.
// When the first list is empty, the second list is reversed and becomes the new first list.
// The book Purely Functional Data Structures by Chris Okasaki dedicates a large part
// of the book to exploring many variants of more efficient immutable queues.
type Queue<'T> = Q of 'T list*'T list

[<GeneralizableValue>]
let empty<'T> = Q ([], [])

// enqueue conses the value onto the back list
let enqueue v (Q (f, b)) = Q (f, v::b)

// dequeue tries to take the head of the front list;
// if the front list is empty, the back list is reversed
// and becomes the front list
let rec dequeue (Q (f, b)) =
    match f, b with
    | []  , []  -> None, empty
    | h::t, _   -> Some h, Q (t, b)
    | []  , _   -> dequeue (Q (List.rev b, []))

module Details =
    module Loops =
        // Uses a CAS loop to update the state.
        // Might cause the updater function to be evaluated many times,
        // so it's important that the updater function is pure and fast.
        let rec lockfreeUpdate (q : byref<'T>) updater current =
            let v, next = updater current
            let actual = Interlocked.CompareExchange<'T> (&q, next, current)
            if Object.ReferenceEquals (actual, current) then
                v
            else
                lockfreeUpdate &q updater actual
open Details

// Updates the mutable binding using a lock-free CAS loop
let inline lockfreeUpdate (q : byref<_>) updater =
    Loops.lockfreeUpdate &q updater (Volatile.Read &q)

// Updates the mutable binding under a lock
let inline lockfulUpdate (q : byref<_>) l updater =
    Monitor.Enter l
    try
        let v, qq = updater q
        q <- qq
        v
    finally
        Monitor.Exit l

let example () =
    printfn "Using ImmutableQueue"
    let mutable q = empty
    lockfreeUpdate &q (fun q -> (), enqueue 1 q)
    lockfreeUpdate &q (fun q -> (), enqueue 2 q)
    lockfreeUpdate &q (fun q -> (), enqueue 3 q)
    printfn "%A" <| (lockfreeUpdate &q dequeue)
    printfn "%A" <| (lockfreeUpdate &q dequeue)
    printfn "%A" <| (lockfreeUpdate &q dequeue)
    printfn "%A" <| (lockfreeUpdate &q dequeue)
Hope this serves as some inspiration on how to solve your problem. Note that getting thread-safe code right is tricky, so I encourage you to look for a collection under System.Collections.Concurrent that has the right behavior for you.
It's very tempting to try to get "cute", but the problem with buggy concurrent code is that it will pass all the unit tests, integration tests and system tests. Then, after a while, one receives bug reports about product instabilities that can't be reproduced.
I have a computation expression builder that builds up a value as you go, and has many custom operations. However, it does not allow for standard F# language constructs, and I'm having a lot of trouble figuring out how to add this support.
To give a stand-alone example, here's a dead-simple and fairly pointless computation expression that builds F# lists:
type Items<'a> = Items of 'a list

type ListBuilder() =
    member x.Yield(()) = Items []

    [<CustomOperation("add")>]
    member x.Add(Items current, item:'a) =
        Items [ yield! current; yield item ]

    [<CustomOperation("addMany")>]
    member x.AddMany(Items current, items: seq<'a>) =
        Items [ yield! current; yield! items ]

let listBuilder = ListBuilder()
let build (Items items) = items
I can use this to build lists just fine:
let stuff =
    listBuilder {
        add 1
        add 5
        add 7
        addMany [ 1..10 ]
        add 42
    }
    |> build
However, this is a compiler error:
listBuilder {
    let x = 5 * 39
    add x
}
// This expression was expected to have type unit, but
// here has type int.
And so is this:
listBuilder {
    for x = 1 to 50 do
        add x
}
// This control construct may only be used if the computation expression builder
// defines a For method.
I've read all the documentation and examples I can find, but there's something I'm just not getting. Every .Bind() or .For() method signature I try just leads to more and more confusing compiler errors. Most of the examples I can find either build up a value as you go along, or allow for regular F# language constructs, but I haven't been able to find one that does both.
If someone could point me in the right direction by showing me how to take this example and add support in the builder for let bindings and for loops (at minimum - using, while and try/catch would be great, but I can probably figure those out if someone gets me started) then I'll be able to gratefully apply the lesson to my actual problem.
The best place to look is the spec. For example,
b {
    let x = e
    op x
}
gets translated to
T(let x = e in op x, [], fun v -> v, true)
=> T(op x, {x}, fun v -> let x = e in v, true)
=> [| op x, let x = e in b.Yield(x) |]{x}
=> b.Op(let x = e in b.Yield(x), x)
So this shows where things have gone wrong, though it doesn't present an obvious solution. Clearly, Yield needs to be generalized since it needs to take arbitrary tuples (based on how many variables are in scope). Perhaps more subtly, it also shows that x is not in scope in the call to add (see that unbound x as the second argument to b.Op?). To allow your custom operators to use bound variables, their arguments need to have the [<ProjectionParameter>] attribute (and take functions from arbitrary variables as arguments), and you'll also need to set MaintainsVariableSpace to true if you want bound variables to be available to later operators. This will change the final translation to:
b.Op(let x = e in b.Yield(x), fun x -> x)
Building up from this, it seems that there's no way to avoid passing the set of bound values along to and from each operation (though I'd love to be proven wrong) - this will require you to add a Run method to strip those values back off at the end. Putting it all together, you'll get a builder which looks like this:
type ListBuilder() =
    member x.Yield(vars) = Items [],vars

    [<CustomOperation("add",MaintainsVariableSpace=true)>]
    member x.Add((Items current,vars), [<ProjectionParameter>]f) =
        Items (current @ [f vars]),vars

    [<CustomOperation("addMany",MaintainsVariableSpace=true)>]
    member x.AddMany((Items current, vars), [<ProjectionParameter>]f) =
        Items (current @ f vars),vars

    member x.Run(l,_) = l
The most complete examples I've seen are in §6.3.10 of the spec, especially this one:
/// Computations that can cooperatively yield by returning a continuation
type Eventually<'T> =
    | Done of 'T
    | NotYetDone of (unit -> Eventually<'T>)

[<CompilationRepresentation(CompilationRepresentationFlags.ModuleSuffix)>]
module Eventually =

    /// The bind for the computations. Stitch 'k' on to the end of the computation.
    /// Note combinators like this are usually written in the reverse way,
    /// for example,
    ///     e |> bind k
    let rec bind k e =
        match e with
        | Done x -> NotYetDone (fun () -> k x)
        | NotYetDone work -> NotYetDone (fun () -> bind k (work()))

    /// The return for the computations.
    let result x = Done x

    type OkOrException<'T> =
        | Ok of 'T
        | Exception of System.Exception

    /// The catch for the computations. Stitch try/with throughout
    /// the computation and return the overall result as an OkOrException.
    let rec catch e =
        match e with
        | Done x -> result (Ok x)
        | NotYetDone work ->
            NotYetDone (fun () ->
                let res = try Ok(work()) with | e -> Exception e
                match res with
                | Ok cont -> catch cont // note, a tailcall
                | Exception e -> result (Exception e))

    /// The delay operator.
    let delay f = NotYetDone (fun () -> f())

    /// The stepping action for the computations.
    let step c =
        match c with
        | Done _ -> c
        | NotYetDone f -> f ()

    // The rest of the operations are boilerplate.

    /// The tryFinally operator.
    /// This is boilerplate in terms of "result", "catch" and "bind".
    let tryFinally e compensation =
        catch (e)
        |> bind (fun res ->
            compensation()
            match res with
            | Ok v -> result v
            | Exception e -> raise e)

    /// The tryWith operator.
    /// This is boilerplate in terms of "result", "catch" and "bind".
    let tryWith e handler =
        catch e
        |> bind (function Ok v -> result v | Exception e -> handler e)

    /// The whileLoop operator.
    /// This is boilerplate in terms of "result" and "bind".
    let rec whileLoop gd body =
        if gd() then body |> bind (fun v -> whileLoop gd body)
        else result ()

    /// The sequential composition operator.
    /// This is boilerplate in terms of "result" and "bind".
    let combine e1 e2 =
        e1 |> bind (fun () -> e2)

    /// The using operator.
    let using (resource: #System.IDisposable) f =
        tryFinally (f resource) (fun () -> resource.Dispose())

    /// The forLoop operator.
    /// This is boilerplate in terms of "catch", "result" and "bind".
    let forLoop (e:seq<_>) f =
        let ie = e.GetEnumerator()
        tryFinally (whileLoop (fun () -> ie.MoveNext())
                              (delay (fun () -> let v = ie.Current in f v)))
                   (fun () -> ie.Dispose())

// Give the mapping for F# computation expressions.
type EventuallyBuilder() =
    member x.Bind(e,k) = Eventually.bind k e
    member x.Return(v) = Eventually.result v
    member x.ReturnFrom(v) = v
    member x.Combine(e1,e2) = Eventually.combine e1 e2
    member x.Delay(f) = Eventually.delay f
    member x.Zero() = Eventually.result ()
    member x.TryWith(e,handler) = Eventually.tryWith e handler
    member x.TryFinally(e,compensation) = Eventually.tryFinally e compensation
    member x.For(e:seq<_>,f) = Eventually.forLoop e f
    member x.Using(resource,e) = Eventually.using resource e
The tutorial at "F# for fun and profit" is first class in this regard.
http://fsharpforfunandprofit.com/posts/computation-expressions-intro/
Following a similar struggle to Joel's (and not finding §6.3.10 of the spec that helpful), my issue with getting the For construct to generate a list came down to getting the types to line up properly (no special attributes required). In particular, I was slow to realise that For would build a list of lists, and therefore needs flattening, despite the best efforts of the compiler to put me right. The examples I found on the web were always wrappers around seq {}, using the yield keyword, repeated use of which invokes a call to Combine, which does the flattening. In case a concrete example helps, the following excerpt uses for to build a list of integers, my ultimate aim being to create lists of components for rendering in a GUI (with some additional laziness thrown in). Also, an in-depth talk on CEs here elaborates on kvb's points above.
module scratch

type Dispatcher = unit -> unit
type viewElement = int
type lazyViews = Lazy<list<viewElement>>

type ViewElementsBuilder() =
    member x.Return(views: lazyViews) : list<viewElement> = views.Value
    member x.Yield(v: viewElement) : list<viewElement> = [v]
    member x.ReturnFrom(viewElements: list<viewElement>) = viewElements
    member x.Zero() = list<viewElement>.Empty
    member x.Combine(listA: list<viewElement>, listB: list<viewElement>) = List.concat [listA; listB]
    member x.Delay(f) = f()
    member x.For(coll: seq<'a>, forBody: 'a -> list<viewElement>) : list<viewElement> =
        // seq {for v in coll do yield! forBody v} |> List.ofSeq
        Seq.map forBody coll |> Seq.collect id |> List.ofSeq

let ve = new ViewElementsBuilder()

let makeComponent(m: int, dispatch: Dispatcher) : viewElement = m
let makeComponents() : list<viewElement> = [77; 33]

let makeViewElements() : list<viewElement> =
    let model = {| Scores = [33;23;22;43;] |> Seq.ofList; Trainer = "John" |}
    let d: Dispatcher = fun() -> () // Does nothing here, but will be used to raise messages from the UI
    ve {
        for score in model.Scores do
            yield makeComponent (score, d)
            yield makeComponent (score * 100 / 50, d)
        if model.Trainer = "John" then
            return lazy
                [ makeComponent (12, d)
                  makeComponent (13, d) ]
        else
            return lazy
                [ makeComponent (14, d)
                  makeComponent (15, d) ]
        yield makeComponent (33, d)
        return! makeComponents()
    }
This was a side question in "What's the easiest way to do something like delegate multicast in F#"; I think it may be better to raise it as a full question with a proper title.
This version doesn't cause an endless loop (here, notify appears to be immutable inside d):
let mutable notify = fun x -> x
let wrap f i = f(i); i
let a x = printf "%A" x
let d = (notify >> (wrap a)) // point free
notify <- d
notify "ss"
This version does (here, notify appears to be mutable inside d):
let mutable notify = fun x -> x
let wrap f i = f(i); i
let a x = printf "%A" x
let d x =
    (notify >> (wrap a)) x // normal function
notify <- d
notify "ss" // endless loop
Another version that fails:
let mutable notify = fun x -> x
let wrap f i = f(i); i
let a x = printf "%A" x
let d =
    fun x -> (notify >> (wrap a)) x // Here
notify <- d
notify "ss" // endless loop
Where can I find guidelines or more resources on why we have this discrepancy in behaviour? Is it tied to a particular compiler/language, or is there a theory behind it that applies to all functional languages?
Uncontrolled mutability is the reason for this behavior. Other languages, like Haskell, provide controlled mutability using software transactional memory techniques, which avoids this kind of problem. Eager evaluation also plays an important role here.
let d = (notify >> (wrap a)): in this case, whatever value notify has at that moment is composed with (wrap a), and the result is assigned to d.
let d x = (notify >> (wrap a)) x: here, the body of the function is not executed until you actually call d, and hence you get the mutated (current) value of notify.
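A minimal illustration of the same distinction (a sketch; f, g1 and g2 are hypothetical names, not from the question): eager composition bakes in the current value of the mutable binding, while wrapping the composition in a function re-reads it at every call:

```fsharp
let mutable f = fun x -> x + 1

// Eager: f's current value is passed to (>>) right now
let g1 = f >> (fun x -> x * 2)

// Lazy: as in the question's d; f is read each time g2 is called
let g2 = fun x -> (f >> (fun x -> x * 2)) x

f <- fun x -> x + 100

printfn "%d" (g1 1) // 4   - uses the old f
printfn "%d" (g2 1) // 202 - uses the new f
```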
I'm trying to come up with an Rx builder so I can use the Reactive Extensions within the F# computation expression syntax. How do I fix it so that it doesn't blow the stack, like the Seq example below doesn't?
Also, are there any plans to provide an implementation of an RxBuilder as part of the Reactive Extensions or of a future version of the .NET Framework?
open System
open System.Linq
open System.Reactive.Linq
type rxBuilder() =
    member this.Delay f = Observable.Defer f
    member this.Combine (xs: IObservable<_>, ys: IObservable<_>) =
        Observable.merge xs ys
    member this.Yield x = Observable.Return x
    member this.YieldFrom (xs: IObservable<_>) = xs

let rx = rxBuilder()

let rec f x = seq { yield x
                    yield! f (x + 1) }

let rec g x = rx { yield x
                   yield! g (x + 1) }

//do f 5 |> Seq.iter (printfn "%A")
do g 5 |> Observable.subscribe (printfn "%A") |> ignore
do System.Console.ReadLine() |> ignore
A short answer is that the Rx Framework doesn't support generating observables using a recursive pattern like this, so it cannot be done easily. The Combine operation that is used for F# sequences needs some special handling that observables do not provide. The Rx Framework probably expects that you'll generate observables using Observable.Generate and then use LINQ queries / the F# computation builder to process them.
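For instance, the recursive counter from the question could be expressed without recursion at all via Observable.Generate (a sketch; the overload used here takes an initial state, a continuation condition, an iterate function and a result selector):

```fsharp
open System.Reactive.Linq

// Produces x, x+1, x+2, ... by unfolding state instead of recursing
let g x =
    Observable.Generate(x, (fun _ -> true), (fun i -> i + 1), (fun i -> i))
```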
Anyway, here are some thoughts -
First of all, you need to replace Observable.merge with Observable.Concat. The former runs both observables in parallel, while the latter first yields all values from the first observable and then produces values from the second one. After this change, the snippet will at least print ~800 numbers before the stack overflow.
The reason for the stack overflow is that Concat creates an observable that calls Concat to create another observable, which calls Concat, etc. One way to solve this is to add some synchronization. If you're using Windows Forms, you can modify Delay so that it schedules the observable on the GUI thread (which discards the current stack). Here is a sketch:
type RxBuilder() =
    member this.Delay f =
        let sync = System.Threading.SynchronizationContext.Current
        let res = Observable.Defer f
        { new IObservable<_> with
            member x.Subscribe(a) =
                sync.Post((fun _ -> res.Subscribe(a) |> ignore), null)
                // Note: This is wrong, but we cannot easily get the IDisposable here
                null }
    member this.Combine (xs, ys) = Observable.Concat(xs, ys)
    member this.Yield x = Observable.Return x
    member this.YieldFrom (xs:IObservable<_>) = xs
To implement this properly, you would have to write your own Concat method, which is quite complicated. The idea would be that:

- Concat returns some special type, e.g. IConcatenatedObservable
- When the method is called recursively, you'll create a chain of IConcatenatedObservables that reference each other
- The Concat method will look for this chain and, when there are e.g. three objects, it will drop the middle one (to always keep a chain of length at most 2)
That's a bit too complex for a StackOverflow answer, but it may be a useful feedback for the Rx team.
Notice this has been fixed in Rx v2.0 (as mentioned here already), more generally for all of the sequencing operators (Concat, Catch, OnErrorResumeNext), as well as the imperative operators (If, While, etc.).
Basically, you can think of this class of operators as doing a subscribe to another sequence in a terminal observer message (e.g. Concat subscribes to the next sequence upon receiving the current one's OnCompleted message), which is where the tail recursion analogy comes in.
In Rx v2.0, all of the tail-recursive subscriptions are flattened into a queue-like data structure for processing one at a time, talking to the downstream observer. This avoids the unbounded growth of observers talking to each other for successive sequence subscriptions.
This has been fixed in Rx 2.0 Beta. And here's a test.
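A minimal sketch of such a test (not the original; it assumes Rx 2.0+ and uses Concat as discussed above): taking many elements from the recursively defined observable should now complete instead of throwing a StackOverflowException:

```fsharp
open System
open System.Reactive.Linq

// Recursively concatenated observable; in Rx 2.0 the tail subscriptions
// are flattened into a queue, so this no longer grows the stack.
let rec nats x : IObservable<int> =
    Observable.Defer(fun () -> Observable.Return(x).Concat(nats (x + 1)))

(nats 0).Take(100000).Subscribe(fun _ -> ()) |> ignore
```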
What about something like this?
type rxBuilder() =
    member this.Delay (f : unit -> 'a IObservable) =
        { new IObservable<_> with
            member this.Subscribe obv = (f()).Subscribe obv }
    member this.Combine (xs : 'a IObservable, ys : 'a IObservable) =
        { new IObservable<_> with
            member this.Subscribe obv =
                xs.Subscribe obv |> ignore
                ys.Subscribe obv }
    member this.Yield x = Observable.Return x
    member this.YieldFrom xs = xs

let rx = rxBuilder()

let rec f x = rx { yield x
                   yield! f (x + 1) }

do f 5 |> Observable.subscribe (fun x -> Console.WriteLine x) |> ignore
do System.Console.ReadLine() |> ignore
http://rxbuilder.codeplex.com/ (created for the purpose of experimenting with RxBuilder)
The xs disposable is not wired up. As soon as I try to wire up the disposable it goes back to blowing up the stack.
If we remove the syntactic sugar from this computation expression (aka monad), we will have:

let rec g x = Observable.Defer (fun () -> Observable.Merge(Observable.Return x, g (x + 1)))
Or in C#:
public static IObservable<int> g(int x)
{
    return Observable.Defer<int>(() =>
    {
        return Observable.Merge(Observable.Return(x), g(x + 1));
    });
}
Which is definitely not tail recursive. I think that if you can make it tail recursive, it would probably solve your problem.
I am trying to build a list from a sequence by recursively appending the first element of the sequence to the list:
open System

let s = seq [for i in 2..4350 -> i, 2*i]

let rec copy s res =
    if s |> Seq.isEmpty then
        res
    else
        let (a,b) = s |> Seq.head
        Console.WriteLine(string a)
        let newS = s |> Seq.skip(1) |> Seq.cache
        let newRes = List.append res [(a,b)]
        copy newS newRes

copy s []
Two problems:

1. I'm getting a stack overflow, which means my tail-recursive ploy sucks.
2. Why is the code 100x faster when I put |> Seq.cache here: let newS = s |> Seq.skip(1) |> Seq.cache?

(Note this is just a little exercise; I understand you can do Seq.toList etc.)

Thanks a lot
One way that works is (the two points above still remain a bit weird to me):
let toList (s : seq<_>) =
    let rec copyRev res (enum : Collections.Generic.IEnumerator<_*_>) =
        let somethingLeft = enum.MoveNext()
        if not somethingLeft then
            res
        else
            let curr = enum.Current
            Console.WriteLine(string curr)
            let newRes = curr :: res
            copyRev newRes enum
    let enumerator = s.GetEnumerator()
    copyRev [] enumerator |> List.rev
You say it's just an exercise, but it's useful to point to my answer to
While or Tail Recursion in F#, what to use when?
and reiterate that you should favor more applicative/declarative constructs when possible. E.g.
let rec copy2 s = [
    for tuple in s do
        System.Console.WriteLine(string (fst tuple))
        yield tuple
]
is a nice and performant way to express your particular function.
That said, I'd feel remiss if I didn't also say "never create a list that big". For huge data, you want either array or seq.
In my short experience with F#, it is not a good idea to use Seq.skip 1 the way you would use tail with lists. Seq.skip creates a whole new IEnumerable/sequence rather than just skipping n elements, so your function will be a lot slower than Seq.toList. You should instead do it imperatively with

s.GetEnumerator()

and iterate through the sequence, holding a list onto which you cons every element.
In the question Take N elements from sequence with N different indexes in F#, I started to do something similar to what you are doing, but found out it is very slow. See my method there for inspiration on how to do it.
Addition: I have written this:
let seqToList (xs : seq<'a>) =
    let e = xs.GetEnumerator()
    let mutable res = []
    while e.MoveNext() do
        res <- e.Current :: res
    List.rev res
And I found out that the built-in method actually does something very similar (including the reverse part). It does, however, check whether the sequence you supplied is in fact a list or an array.
You can also make the code entirely functional (which I have now done as well; couldn't resist ;-)):
let seqToList (xs : seq<'a>) =
    Seq.fold (fun state t -> t :: state) [] xs |> List.rev
Your function is properly tail recursive, so the recursive calls themselves are not what is overflowing the stack. Instead, the problem is that Seq.skip is poorly behaved when used recursively, as others have pointed out. For instance, this code overflows the stack on my machine:
let mutable s = seq { 1 .. 20001 }
for i in 1 .. 20000 do
    s <- Seq.skip 1 s
let v = Seq.head s
Perhaps you can see the vague connection to your own code, which also eventually takes the head of a sequence which results from repeatedly applying Seq.skip 1 to your initial sequence.
Try the following code.
Warning: Before running this code you will need to enable tail call generation in Visual Studio. This can be done through the Build tab on the project properties page. If this is not enabled the code will StackOverflow processing the continuation.
open System
open System.Collections.Generic
let s = seq[for i in 2..1000000 -> i,2*i]
let rec copy (s : (int * int) seq) =
    use e = s.GetEnumerator()
    let rec inner cont =
        if e.MoveNext() then
            let (a,b) = e.Current
            printfn "%d" b
            inner (fun l -> cont (b :: l))
        else cont []
    inner (fun x -> x)
let res = copy s
printfn "Done"
let info = new SortedDictionary<string, string>()
...
Thread A
--------
info.Add("abc", "def")
Thread B
--------
info
|> Seq.iteri (fun i value -> ...
Where do I place the readLock when I use the iteri function?
You may want to just side-step the problem of mutability, and use an immutable Map instead of SortedDictionary. This way, your iteration works on a "snapshot" of the data structure, with no worries about it getting changed out from underneath you. Then, you only need to lock your initial grab of the snapshot.
For example (warning, have not tested to see if this is actually threadsafe!):
let mymap = ref Map<string,string>.Empty

let safefetch m = lock m (fun () -> !m)
let safeadd k v m = lock m (fun () -> m := Map.add k v !m)

mymap
|> safefetch
|> Map.iter (fun k v -> printfn "%s: %s" k v)

mymap |> safeadd "test" "value"
After some thinking, it seems that placing a lock around Seq.iteri actually makes no sense, since a Seq is lazy in F#.
However, it is interesting to note that an exception is thrown when elements are inserted into the dictionary by another thread during the iteration of the sequence. I'm not sure that is fully warranted for a lazy iteration.
My solution (as a function) right now is:
(fun _ ->
    lock info (fun _ ->
        info
        |> Seq.iteri (fun i x -> ...)))
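Alternatively, in the spirit of the snapshot idea above (a sketch; not verified for thread safety beyond the copy itself): copy the entries while holding the lock, then iterate the snapshot outside it, keeping the lock short:

```fsharp
open System.Collections.Generic

let info = SortedDictionary<string, string>()

// Copy under the lock; iterate afterwards, so the dictionary is never
// enumerated while another thread may be adding to it.
let snapshot =
    lock info (fun () -> [ for kv in info -> kv.Key, kv.Value ])

snapshot |> Seq.iteri (fun i (k, v) -> printfn "%d: %s = %s" i k v)
```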
I hope it is OK to answer my own question (I am new here).