I am learning F# by automating a few of my tasks with F# scripts. I run these scripts with "fsi/fsharpi --exec" from the command line, and I am using .NET Core for my work. One of the things I was looking for is how to profile my F# scripts. I am primarily looking for the following:
See the overall time consumed by my entire script. I tried this with Stopwatch-style timing and it works well. Is there anything that can show the time for my various top-level function calls, or timings/counts per function call?
See the overall memory consumption by my script.
Hot spots in my scripts.
Overall I am trying to understand the performance bottlenecks of my scripts.
On a side note, is there a way to compile F# scripts to exe?
I recommend using BenchmarkDotNet for any benchmarking tasks (well, micro-benchmarks). Since it's a statistical tool, it accounts for many things that hand-rolled benchmarking will not. And just by applying a few attributes you can get a nifty report.
Create a .NET Core console app, add the BenchmarkDotNet package, create a benchmark, and run it to see the results. Here's an example that tests two trivial parsing functions, with one as the baseline for comparison, and informing BenchmarkDotNet to capture memory usage stats when running the benchmark:
open System
open BenchmarkDotNet.Attributes
open BenchmarkDotNet.Running

module Parsing =
    /// "123,456" --> (123, 456)
    let getNums (str: string) (delim: char) =
        let idx = str.IndexOf(delim)
        let first = Int32.Parse(str.Substring(0, idx))
        let second = Int32.Parse(str.Substring(idx + 1))
        first, second

    /// "123,456" --> struct (123, 456)
    let getNumsFaster (str: string) (delim: char) =
        let sp = str.AsSpan()
        let idx = sp.IndexOf(delim)
        let first = Int32.Parse(sp.Slice(0, idx))
        let second = Int32.Parse(sp.Slice(idx + 1))
        struct (first, second)

[<MemoryDiagnoser>]
type ParsingBench() =
    let str = "123,456"
    let delim = ','

    [<Benchmark(Baseline=true)>]
    member __.GetNums() =
        Parsing.getNums str delim |> ignore

    [<Benchmark>]
    member __.GetNumsFaster() =
        Parsing.getNumsFaster str delim |> ignore

[<EntryPoint>]
let main _ =
    let summary = BenchmarkRunner.Run<ParsingBench>()
    printfn "%A" summary
    0 // return an integer exit code
In this case, the results will show that the getNumsFaster function allocates 0 bytes and runs about 33% faster.
Once you've found something that consistently performs better and allocates less, you can transfer that over to a script or some other environment where the code will actually execute.
As for hotspots, your best tool is to actually run the script under a profiler like PerfView and look at CPU time and allocations caused by the script while it's executing. There's no simple answer here: interpreting profiling results correctly is challenging and time-consuming work.
There's no way to compile an F# script to an executable for .NET Core. It's possible only on Windows/.NET Framework, but this is legacy behavior that is considered deprecated. It's recommended that you convert code in your script to an application if you'd like it to run as an executable.
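As a rough sketch of that conversion with the .NET CLI (project and file names here are just placeholders):

dotnet new console -lang "F#" -o MyTool    # scaffolds MyTool.fsproj and Program.fs
# move the top-level code from your .fsx into Program.fs
dotnet run --project MyTool                # build and run it
dotnet publish MyTool -c Release           # produce a build you can distribute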
Related
I am just starting to work with F# and trying to understand typical idioms and effective ways of thinking and working.
The task at hand is a simple transform of a tab-delimited file to one which is comma-delimited. A typical input line will look like:
let line = "#ES# 01/31/2006 13:31:00 1303.00 1303.00 1302.00 1302.00 2514 0"
I started out with looping code like this:
// inFile and outFile defined in preceding code not shown here
for line in File.ReadLines(inFile) do
    let typicalArray = line.Split '\t'
    let transformedLine = typicalArray |> String.concat ","
    outFile.WriteLine(transformedLine)
I then replaced the split/concat pair of operations with a single Regex.Replace():
for line in File.ReadLines(inFile) do
    let transformedLine = Regex.Replace(line, "\t", ",")
    outFile.WriteLine(transformedLine)
And now, finally, I have replaced the looping with a pipeline:
File.ReadLines(inFile)
|> Seq.map (fun x -> Regex.Replace(x, "\t", ","))
|> Seq.iter (fun y -> outFile.WriteLine(y))
// other housekeeping code below here not shown
While all versions work, the final version seems to me the most intuitive. Is this how a more experienced F# programmer would accomplish this task?
I think all three versions are perfectly fine, idiomatic code that F# experts would write.
I generally prefer writing code using built-in language features (like for loops and if conditions) if they let me solve the problem I have. These are imperative, but I think using them is a good idea when the API requires imperative code (like outFile.WriteLine). As you mentioned - you started with this version (and I would do the same).
Using higher-order functions is nice too - although I would probably do that only if I wanted to write a data transformation and get a new sequence or list of lines - this would be handy if you were using File.WriteAllLines instead of writing lines one by one. That said, the same could also be done by simply wrapping your second version in a sequence expression:
let transformed =
    seq { for line in File.ReadLines(inFile) -> Regex.Replace(line, "\t", ",") }
File.WriteAllLines(outFilePath, transformed)
I do not think there is any objective reason to prefer one of the versions. My personal stylistic preference is to use for and refactor to sequence expressions (if needed), but others will likely disagree.
A side note: if you want to write to the same file that you are reading from, you need to remember that Seq does lazy evaluation.
Using Array as opposed to Seq makes sure the file is closed for reading by the time it is needed for writing.
This works:
let lines =
    file
    |> File.ReadAllLines
    |> Array.map (fun line -> ..modify line..)
File.WriteAllLines(file, lines)
This does not (it causes a file access violation):
let lines =
    file
    |> File.ReadLines
    |> Seq.map (fun line -> ..modify line..)
File.WriteAllLines(file, lines)
(This potentially overlaps with another discussion, where an intermediate variable helps with the same problem.)
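Another option under the same caveat is to force the lazy sequence before writing. This is only a sketch; the tab-to-comma replacement is just an illustrative stand-in for "..modify line..":

let lines =
    file
    |> File.ReadLines
    |> Seq.map (fun line -> line.Replace('\t', ','))
    |> Seq.toArray   // fully enumerates the sequence, so the read handle is released here
File.WriteAllLines(file, lines)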
The Computer Language Benchmarks Game's F# entry for Threadring contains a seemingly useless line: if false then (). When I comment out this line, the program runs much faster (~2s vs ~55s for an input of 50000000) and produces the same result. How does this work? Why is this line there? What exactly is the compiler doing with what appears to be a no-op?
The code:
let ringLength = 503
let cells = Array.zeroCreate ringLength
let threads = Array.zeroCreate ringLength
let answer = ref -1

let createWorker i =
    let next = (i+1)%ringLength
    async { let value = cells.[i]
            if false then ()
            match value with
            | 0 -> answer := i+1
            | _ ->
                cells.[next] <- value - 1
                return! threads.[next] }

[<EntryPoint>]
let main args =
    cells.[0] <- if args.Length > 0 then int args.[0] else 50000000
    for i in 0..ringLength-1 do
        threads.[i] <- createWorker i
    let result = Async.StartImmediate(threads.[0])
    printfn "%d" !answer
    0
I wrote this code originally. I don't remember the exact reason I added the line, but I'm guessing that, without it, the optimizer would do something I thought was outside the spirit of the benchmark game. The reason for using asyncs in the first place is to achieve tail-call continuation to the next async (which is what makes this perform so much better than C# on Mono).
- Jomo
If the computation expression contains if false then (), then the asynchronous workflow gets translated a bit differently. With the line, it uses async.Combine. Slightly simplified, the translated code looks like this:
async.Delay(fun () ->
    let value = cells.[i]
    async.Combine
      ( async.Return(if false then ()),
        async.Delay(fun () ->
            match value with (...) ) ))
The translation inserts Combine because the (potentially) asynchronous computation done by the if expression needs to be combined with the following code. Now, if you delete the if, you get something like:
async.Delay(fun () ->
    let value = cells.[i]
    match value with (...) )
The difference is that now a lot more work is done immediately in the function passed to Delay.
EDIT: I thought this caused a difference because the code uses Async.StartImmediate instead of Async.Start, but that does not seem to be the case. In fact, I do not understand why the code uses asynchronous workflows at all...
EDIT II.: I'm not entirely sure about Mono, but the difference definitely reproduces in F# Interactive - there, the version with Combine is about 4 times slower (which is what I'd expect, because of the overhead of the extra function allocations).
Please post code for measuring elapsed time in F#. I noticed that you can measure it from F# Interactive with the #time directive, but I don't know how to execute a program from FSI.
Thanks
I would just use the .NET Stopwatch class.
let stopWatch = System.Diagnostics.Stopwatch.StartNew()
...
stopWatch.Stop()
printfn "%f" stopWatch.Elapsed.TotalMilliseconds
From msdn:
By itself, #time toggles whether to display performance information.
When it is enabled, F# Interactive measures real time, CPU time, and
garbage collection information for each section of code that is
interpreted and executed.
So to test your function you have to open the F# Interactive console and execute your function in it (one way to do this is to select your function, right-click, and choose Execute in Interactive).
Then call your function in Interactive, for example:
// define your function first either in interactive console or in your document:
let square x = x * x
// in interactive
#time
square 10
#time
You will see how much real time and CPU time was spent on the computation, along with some information from the garbage collector.
Check out the timer function in the F Sharp Programming wikibook. It is defined like this:
let duration f =
    let timer = new System.Diagnostics.Stopwatch()
    timer.Start()
    let returnValue = f()
    printfn "Elapsed Time: %i" timer.ElapsedMilliseconds
    returnValue
Whilst being used like this:
let sample() = System.Threading.Thread.Sleep(2)
duration ( fun() -> sample() )
// or
duration sample
You could also create a custom computation expression to hide the actual measuring logic, e.g.:
timer {
    // your code goes here
}
See more examples here: https://fsharpforfunandprofit.com/posts/computation-expressions-bind/
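For reference, a minimal sketch of such a builder might look like the following (my own illustration, not the implementation from the linked post; a fuller version would also handle Zero, Combine, exceptions, and so on):

type TimerBuilder() =
    member __.Return(x) = x
    member __.Delay(f) =
        let sw = System.Diagnostics.Stopwatch.StartNew()
        let result = f ()
        sw.Stop()
        printfn "Elapsed Time: %i" sw.ElapsedMilliseconds
        result

let timer = TimerBuilder()

// usage
let answer =
    timer {
        System.Threading.Thread.Sleep(200)   // stand-in for the code being measured
        return 21 * 2
    }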
I am trying to learn how async and let! work in F#.
All the docs I've read seem confusing.
What's the point of running an async block with Async.RunSynchronously? Is this async or sync? Looks like a contradiction.
The documentation says that Async.StartImmediate runs in the current thread. If it runs in the same thread, it doesn't look very asynchronous to me... Or maybe asyncs are more like coroutines rather than threads. If so, when do they yield back and forth?
Quoting MS docs:
The line of code that uses let! starts the computation, and then the thread is suspended
until the result is available, at which point execution continues.
If the thread waits for the result, why should I use it? Looks like a plain old function call.
And what does Async.Parallel do? It receives a sequence of Async<'T>. Why not a sequence of plain functions to be executed in parallel?
I think I'm missing something very basic here. I guess after I understand that, all the documentation and samples will start making sense.
A few things.
First, the difference between
let resp = req.GetResponse()
and
let! resp = req.AsyncGetResponse()
is that for the probably hundreds of milliseconds (an eternity to the CPU) where the web request is 'at sea', the former is using one thread (blocked on I/O), whereas the latter is using zero threads. This is the most common 'win' for async: you can write non-blocking I/O that doesn't waste any threads waiting for hard disks to spin around or network requests to return. (Unlike most other languages, you aren't forced to do inversion of control and factor things into callbacks.)
Second, Async.StartImmediate will start an async on the current thread. A typical use is with a GUI: you have some GUI app that wants to e.g. update the UI (e.g. to say "loading..." somewhere), then do some background work (load something off disk or whatever), and then return to the foreground UI thread to update the UI when completed ("done!"). StartImmediate enables an async to update the UI at the start of the operation and to capture the SynchronizationContext so that at the end of the operation it can return to the GUI to do a final update of the UI.
Next, Async.RunSynchronously is rarely used (one thesis is that you call it at most once in any app). In the limit, if you wrote your entire program async, then in the "main" method you would call RunSynchronously to run the program and wait for the result (e.g. to print out the result in a console app). This does block a thread, so it is typically only useful at the very 'top' of the async portion of your program, on the boundary back with synch stuff. (The more advanced user may prefer StartWithContinuations - RunSynchronously is kinda the "easy hack" to get from async back to sync.)
Finally, Async.Parallel does fork-join parallelism. You could write a similar function that just takes functions rather than asyncs (like stuff in the TPL), but the typical sweet spot in F# is parallel I/O-bound computations, which are already async objects, so this is the most commonly useful signature. (For CPU-bound parallelism, you could use asyncs, but you could also use TPL just as well.)
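To make the RunSynchronously / StartWithContinuations distinction concrete, here is a small sketch (the workflow itself is just a placeholder that sleeps and returns a value):

let work =
    async {
        do! Async.Sleep 500   // stand-in for some I/O-bound operation
        return 42
    }

// Blocks the calling thread until the workflow finishes, then hands back the result:
let result = Async.RunSynchronously work

// Does not block; instead it supplies callbacks for success, failure and cancellation:
Async.StartWithContinuations(
    work,
    (fun v  -> printfn "Succeeded with %d" v),
    (fun ex -> printfn "Failed: %s" ex.Message),
    (fun _  -> printfn "Cancelled"))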
The point of async is to reduce the number of threads in use.
See the following example:
let fetchUrlSync url =
    let req = WebRequest.Create(Uri url)
    use resp = req.GetResponse()
    use stream = resp.GetResponseStream()
    use reader = new StreamReader(stream)
    let contents = reader.ReadToEnd()
    contents
let sites = ["http://www.bing.com";
             "http://www.google.com";
             "http://www.yahoo.com";
             "http://www.search.com"]
// execute the fetchUrlSync function in parallel
let pagesSync = sites |> PSeq.map fetchUrlSync |> PSeq.toList
The above code is what you want to do: define a function and execute in parallel. So why do we need async here?
Let's consider something bigger: what if the number of sites is not 4 but, say, 10,000? Then you would need 10,000 threads to run them in parallel, which is a huge resource cost.
Whereas with async:
let fetchUrlAsync url =
    async { let req = WebRequest.Create(Uri url)
            use! resp = req.AsyncGetResponse()
            use stream = resp.GetResponseStream()
            use reader = new StreamReader(stream)
            let contents = reader.ReadToEnd()
            return contents }
let pagesAsync = sites |> Seq.map fetchUrlAsync |> Async.Parallel |> Async.RunSynchronously
When execution reaches use! resp = req.AsyncGetResponse(), the current thread is given up and its resources can be used for other purposes. If the response comes back after 1 second, your thread can spend that 1 second processing other work; in the synchronous version, the thread is blocked, wasting its resources for that 1 second.
So even if you are downloading 10,000 web pages in parallel in an asynchronous way, the number of threads is limited to a small number.
I suspect you are not a .NET/C# programmer: async tutorials usually assume that one knows .NET and how to program asynchronous IO in C# (a lot of code). The magic of the Async construct in F# is not primarily about parallelism, because simple parallelism can be achieved with other constructs, e.g. Parallel.For in the .NET parallel extensions. Asynchronous IO is more complex: the thread gives up its execution, and when the IO finishes, it needs to wake the waiting computation back up. This is what the async magic is for: in a few lines of concise code, you can express very complex control flow.
Many good answers here, but I thought I'd take a different angle on the question: how does F#'s async really work?
Unlike async/await in C#, F# developers can actually implement their own version of Async. This can be a great way to learn how Async works.
(For the interested the source code to Async can be found here: https://github.com/Microsoft/visualfsharp/blob/fsharp4/src/fsharp/FSharp.Core/control.fs)
As our fundamental building block for our DIY workflows we define:
type DIY<'T> = ('T->unit)->unit
This is a function that accepts another function (called the continuation) that is called when the result of type 'T is ready. This allows DIY<'T> to start a background task without blocking the calling thread. When the result is ready the continuation is called allowing the computation to continue.
The F# Async building block is a bit more complicated as it also includes cancellation and exception continuations but essentially this is it.
In order to support the F# workflow syntax we need to define a computation expression (https://msdn.microsoft.com/en-us/library/dd233182.aspx). While this is a rather advanced F# feature it's also one of the most amazing features of F#. The two most important operations to define are return & bind which are used by F# to combine our DIY<_> building blocks into aggregated DIY<_> building blocks.
adaptTask is used to adapt a Task<'T> into a DIY<'T>.
startChild allows starting several simultaneous DIY<'T>; note that it doesn't start new threads in order to do so but reuses the calling thread.
Without any further ado here's the sample program:
open System
open System.Diagnostics
open System.Threading
open System.Threading.Tasks

// Our Do It Yourself Async workflow is a function accepting a continuation ('T->unit).
// The continuation is called when the result of the workflow is ready.
// This may happen immediately or after a while; the important thing is that
// we don't block the calling thread, which may then continue executing useful code.
type DIY<'T> = ('T->unit)->unit

// In order to support let!, do! and so on we implement a computation expression.
// The two most important operations are returnValue/bind, but delay is also generally
// good to implement.
module DIY =
    // returnValue is called when a dev uses return x in a workflow.
    // returnValue passes v immediately to the continuation.
    let returnValue (v : 'T) : DIY<'T> =
        fun a ->
            a v

    // bind is called when a dev uses let!/do! x in a workflow.
    // bind binds two DIY workflows together.
    let bind (t : DIY<'T>) (fu : 'T->DIY<'U>) : DIY<'U> =
        fun a ->
            let aa tv =
                let u = fu tv
                u a
            t aa

    let delay (ft : unit->DIY<'T>) : DIY<'T> =
        fun a ->
            let t = ft ()
            t a

    // Starts a DIY workflow as a subflow.
    // The way it works is that the workflow is executed,
    // which may be a delayed operation. But startChild
    // should always complete immediately, so in order to
    // have something to return it returns a DIY workflow.
    // postProcess checks if the child has computed a value
    // (ie rv has some value) and if we have a computation ready
    // to receive the value (rca has some value).
    // If this is true, invoke ca with v.
    let startChild (t : DIY<'T>) : DIY<DIY<'T>> =
        fun a ->
            let l = obj()
            let rv = ref None
            let rca = ref None
            let postProcess () =
                match !rv, !rca with
                | Some v, Some ca ->
                    ca v
                    rv := None
                    rca := None
                | _ , _ -> ()
            let receiver v =
                lock l <| fun () ->
                    rv := Some v
                    postProcess ()
            t receiver
            let child : DIY<'T> =
                fun ca ->
                    lock l <| fun () ->
                        rca := Some ca
                        postProcess ()
            a child

    let runWithContinuation (t : DIY<'T>) (f : 'T -> unit) : unit =
        t f

    // Adapts a task as a DIY workflow.
    let adaptTask (t : Task<'T>) : DIY<'T> =
        fun a ->
            let action = Action<Task<'T>> (fun t -> a t.Result)
            ignore <| t.ContinueWith action

    // Because C# generics don't allow Task<void> we need to have
    // a special overload for the unit Task.
    let adaptUnitTask (t : Task) : DIY<unit> =
        fun a ->
            let action = Action<Task> (fun t -> a ())
            ignore <| t.ContinueWith action

    type DIYBuilder() =
        member x.Return(v)  = returnValue v
        member x.Bind(t,fu) = bind t fu
        member x.Delay(ft)  = delay ft

let diy = DIY.DIYBuilder()

open DIY

[<EntryPoint>]
let main argv =
    let delay (ms : int) = adaptUnitTask <| Task.Delay ms
    let delayedValue ms v =
        diy {
            do! delay ms
            return v
        }

    let complete =
        diy {
            let sw = Stopwatch ()
            sw.Start ()

            // Since we are executing these tasks concurrently
            // the time this takes should be roughly 700ms
            let! cd1 = startChild <| delayedValue 100 1
            let! cd2 = startChild <| delayedValue 300 2
            let! cd3 = startChild <| delayedValue 700 3

            let! d1 = cd1
            let! d2 = cd2
            let! d3 = cd3

            sw.Stop ()

            return sw.ElapsedMilliseconds, d1, d2, d3
        }

    printfn "Starting workflow"
    runWithContinuation complete (printfn "Result is: %A")

    printfn "Waiting for key"
    ignore <| Console.ReadKey ()

    0
The output of the program should be something like this:
Starting workflow
Waiting for key
Result is: (706L, 1, 2, 3)
When running the program, note that "Waiting for key" is printed immediately because the console thread is not blocked by starting the workflow. After about 700ms the result is printed.
I hope this was interesting to some F# devs.
Lots of great detail in the other answers, but as a beginner I got tripped up by the differences between C# and F#.
F# async blocks are a recipe for how the code should run, not actually an instruction to run it yet.
You build up your recipe, maybe combining with other recipes (e.g. Async.Parallel). Only then do you ask the system to run it, and you can do that on the current thread (e.g. Async.StartImmediate) or on a new task, or various other ways.
So it's a decoupling of what you want to do from who should do it.
The C# model is often called 'Hot Tasks' because the tasks are started for you as part of their definition, vs. the F# 'Cold Task' model.
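A tiny sketch of that 'cold' model (my own illustration): building the recipe runs nothing, and each explicit start runs it again.

let recipe =
    async {
        printfn "running"   // nothing is printed when the recipe is built
        return 1
    }

let r1 = Async.RunSynchronously recipe   // prints "running", r1 = 1
let r2 = Async.RunSynchronously recipe   // prints "running" again, r2 = 1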
The idea behind let! and Async.RunSynchronously is that sometimes you have an asynchronous activity that you need the results of before you can continue. For example, the "download a web page" function may not have a synchronous equivalent, so you need some way to run it synchronously. Or if you have an Async.Parallel, you may have hundreds of tasks all happening concurrently, but you want them all to complete before continuing.
As far as I can tell, the reason you would use Async.StartImmediate is that you have some computation that you need to run on the current thread (perhaps a UI thread) without blocking it. Does it use coroutines? I guess you could call it that, although there isn't a general coroutine mechanism in .Net.
So why does Async.Parallel require a sequence of Async<'T>? Probably because it's a way of composing Async<'T> objects. You could easily create your own abstraction that works with just plain functions (or a combination of plain functions and Asyncs), but it would just be a convenience function.
In an async block you can have some synchronous and some async operations. For example, you may have a web site that shows the status of the user in several ways: whether they have bills due shortly, birthdays coming up, and homework due. None of these are in the same database, so your application will make three separate calls. You may want to make the calls in parallel, so that once the slowest one is done you can put the results together and display them; the total time is then bounded by the slowest call. You don't care about the order in which the results come back, you just want to know when all three have been received.
To finish my example, you may then want to synchronously do the work of creating the UI to show this information. So, in the end, you wanted this data fetched and the UI displayed; the parts where order doesn't matter are done in parallel, and the parts where order matters are done sequentially.
You could do this with three threads, but then you would have to keep track of them and resume the original thread when the third one finishes. That is more work; it is easier to have the .NET framework take care of it.
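As a rough sketch of that pattern, with hypothetical stand-ins for the three data sources:

let getBills () =
    async {
        do! Async.Sleep 300   // pretend this is the billing database call
        return "bills due soon"
    }

let getBirthdays () =
    async {
        do! Async.Sleep 200
        return "2 birthdays this week"
    }

let getHomework () =
    async {
        do! Async.Sleep 500
        return "homework due Friday"
    }

let buildStatusPage () =
    async {
        // The three calls run concurrently; we continue once the slowest one finishes.
        let! results = Async.Parallel [ getBills (); getBirthdays (); getHomework () ]
        // Back to ordinary sequential code: combine the results and build the display text.
        return String.concat " | " results
    }

buildStatusPage ()
|> Async.RunSynchronously
|> printfn "Status: %s"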
Could you, please, give a code snippet showing how to use Lua embedded in OCaml?
A simple example could be a "Hello, World" variant. Have OCaml prompt the user for a name. Then pass that name to a Lua function. Have Lua print a greeting and return the length of the name. Then have OCaml print a message about the length of the name.
Example:
user@desktop:~$ ./hello.opt
Name? User
Hello, User.
Your name is 4 letters long.
user@desktop:~$
[Edit]
As a non-C programmer, could I implement this without having to write an intermediary C program to pass the data between Lua and OCaml?
Following is a theoretical idea of what I would like to try. Unfortunately, line 3 of ocaml_hello.ml would need to know how to call the function defined in lua_hello.lua in order for the code to be valid.
lua_hello.lua
Defines lua_hello, which prints an argument and returns its length.
1 function lua_hello (name)
2 print ("Hello, "..name..".")
3 return (string.len (name))
4 end
ocaml_hello.ml
OCaml prompts for a name, calls the Lua function, and prints the return value.
1 let () = print_string "Name? "; flush stdout in
2 let name = input_line stdin in
3 let len = Lua_hello.lua_hello name in
4 Printf.printf "Your name is %d letters long." len; flush stdout;;
I'm not aware of a mature set of bindings for embedding the C implementation of Lua into OCaml. An immature set of bindings was posted on the Caml mailing list in 2004.
If you want to use the ML implementation you can find some examples in a paper called ML Module Mania. The ML implementation, unlike the C implementation, guarantees type safety, but to do so it uses some very scurvy tricks in the ML module system. If you are asking basic questions, you probably want to avoid this.
In your example it's a little hard to guess where you want the function to come from. I suggest you either ask for a C example or give people a C example and ask how it could be realized in OCaml (though I think bindings are going to be a problem).
Edit
In response to the revised question, it's pretty complicated. The usual model is that you would put Lua in charge, and you would call Objective Caml code from Lua. You're putting Caml in charge, which makes things more complicated. Here's a rough sketch of what things might look like:
let lua = Lua.new() (* create Lua interpreter *)
let chunk = LuaL.loadfile lua "hello.lua" (* load and compile the file hello.lua *)
let _ = Lua.call lua 0 0 (* run the code to create the hello function *)
let lua_len s =
  (* push the function; push the arg; call; grab the result; pop it; return *)
  let _ = Lua.getglobal lua "lua_hello" in
  let _ = Lua.pushstring lua s in
  let _ = Lua.call lua 1 1 in
  let len = Lua.tointeger lua (-1) in
  let _ = Lua.pop lua 1 in
  len
let () = print_string "Name? "; flush stdout
let name = input_line stdin
let len = lua_len name
Printf.printf "Your name is %d letters long." len; flush stdout;;
Again, I don't know where you'll get the bindings for the Lua and LuaL modules.
On further reflection, I'm not sure if you can do this with the official C implementation of Lua, because I think OCaml believes it owns main(). You'd have to find out if OCaml could be packaged as a library from a C main program.
For an example of putting Lua-ML in charge, you can get Lua-ML standalone from Cminusminus.org, and you can also check out the examples in the paper on Lua-ML as well as the source code to the QC-- compiler itself.