F# idiomatic conversion of async while loop accumulation - f#

What is the idiomatic F# way of handling an asynchronous while loop accumulation?
I'm working with the new (still in preview) Azure Cosmos DB SDK. Querying the database returns a CosmosResultSetIterator<T> which has a HasMoreResults property and a FetchNextSetAsync() method. My straight-up translation of the C# code looks like this:
let private fetchItemsFromResultSet (resultSetIterator: CosmosResultSetIterator<'a>) =
    let results = ResizeArray<'a>()
    async {
        while resultSetIterator.HasMoreResults do
            let! response = resultSetIterator.FetchNextSetAsync() |> Async.AwaitTask
            results.AddRange(response |> Seq.toArray)
        return Seq.toList results
    }

I would take a look at the AsyncSeq package. You can use it to create asynchronously computed sequences and then iterate them asynchronously or in parallel. This allows for the async-binding to be inside the sequence and the yield to occur asynchronously, so you don't have to build up an accumulator explicitly.
You can use it to do something like:
open FSharp.Control
let private fetchItemsFromResultSet (resultSetIterator: CosmosResultSetIterator<'a>) =
    asyncSeq {
        while resultSetIterator.HasMoreResults do
            let! response = resultSetIterator.FetchNextSetAsync() |> Async.AwaitTask
            yield! response |> AsyncSeq.ofSeq
    }

IMHO tail-recursion is preferable to while loops as it's one way to avoid mutation.
For example:
let fetchItemsFromResultSet (resultSetIterator: CosmosResultSetIterator<'a>) =
    let rec loop results =
        async {
            if resultSetIterator.HasMoreResults then
                let! vs = resultSetIterator.FetchNextSetAsync () |> Async.AwaitTask
                let vs = vs |> Seq.toList
                return! loop (vs::results)
            else
                // List.rev needed because batches are in reverse
                return results |> List.rev |> List.concat
        }
    loop []
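To try the recursive version without a Cosmos DB account, here is a minimal sketch that runs the same loop against an in-memory iterator. `FakePagedIterator` and `fetchAll` are hypothetical names made up for this illustration; only the `HasMoreResults` / `FetchNextSetAsync()` shape is taken from the question.

```fsharp
open System.Threading.Tasks

// Hypothetical stand-in for CosmosResultSetIterator<'a>: serves pre-built pages.
type FakePagedIterator<'a>(pages: 'a list list) =
    let mutable remaining = pages
    member _.HasMoreResults = not (List.isEmpty remaining)
    member _.FetchNextSetAsync() : Task<seq<'a>> =
        match remaining with
        | page :: rest ->
            remaining <- rest
            Task.FromResult(Seq.ofList page)
        | [] -> Task.FromResult(Seq.empty)

// Same shape as the recursive loop: collect pages in reverse, then flatten.
let fetchAll (iterator: FakePagedIterator<'a>) =
    let rec loop pages =
        async {
            if iterator.HasMoreResults then
                let! page = iterator.FetchNextSetAsync() |> Async.AwaitTask
                return! loop (List.ofSeq page :: pages)
            else
                return pages |> List.rev |> List.concat
        }
    loop []

let items =
    FakePagedIterator([ [ 1; 2 ]; [ 3 ]; [ 4; 5 ] ])
    |> fetchAll
    |> Async.RunSynchronously
// items = [1; 2; 3; 4; 5]
```

Prepending each page and reversing once at the end keeps the accumulation linear in the number of pages.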

Very recently, FSharp.Control.TaskSeq was added to support tasks natively with seqs. The answer here by @Just another metaprogrammer can be rewritten as
#r "nuget: FSharp.Control.TaskSeq"
open FSharp.Control
let private fetchItemsFromResultSet (resultSetIterator: CosmosResultSetIterator<'a>) = taskSeq {
    while resultSetIterator.HasMoreResults do
        let! response = resultSetIterator.FetchNextSetAsync()
        yield! response |> TaskSeq.ofSeq
}

Related

Calling C# method which returns a Task

This is some C# code:
var streamStore = new PostgresStreamStore(new PostgresStreamStoreSettings("Host=localhost;Port=5432;User Id=postgres;Password=123456;Database=postgres"));
await streamStore.CreateSchemaIfNotExists();
I'm trying to call it from F# like this:
let db_connection =
    Sql.host "localhost"
    |> Sql.port 5432
    |> Sql.username "postgres"
    |> Sql.password "123456"
    |> Sql.database "postgres"
    |> Sql.str
let store =
    new PostgresStreamStore(PostgresStreamStoreSettings(db_connection))
store.CreateSchemaIfNotExists() |> Async.AwaitTask |> ignore
The code compiles, but unlike the C# version it does not create the schema.
How do I await this Task from store.CreateSchemaIfNotExists?
I'm getting this error message:
This expression is a function value, i.e. is missing arguments. Its type is unit -> Tasks.Task.
In the C# code, you are using await, so this must be inside an async method. The corresponding thing in F# would be to use F# asynchronous workflows. Inside those, you can use let! which is similar to await. This works with computations of type Async<T> rather than Task<T>. The operation Async.AwaitTask turns Task<T> into Async<T> so that you can access it using let!
let doSomething () = async {
    let db_connection =
        Sql.host "localhost"
        // (other configuration omitted)
    let store =
        new PostgresStreamStore(PostgresStreamStoreSettings(db_connection))
    let! res = store.CreateSchemaIfNotExists() |> Async.AwaitTask
    return "whatever" }
I assume that CreateSchemaIfNotExists does not return anything useful, so you can also wait for its completion using do!
do! store.CreateSchemaIfNotExists() |> Async.AwaitTask |> Async.Ignore
An asynchronous computation then needs to be started using Async.Start or Async.RunSynchronously, which is akin to starting a task or blocking using task.RunSynchronously.
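As a minimal sketch of the points above, the following awaits a plain (non-generic) Task with do! and then runs the workflow with Async.RunSynchronously. `createSchemaIfNotExists` is a hypothetical stand-in for `store.CreateSchemaIfNotExists()`, and the `created` flag exists only so the effect can be observed.

```fsharp
open System.Threading.Tasks

// Flag used only to observe that the work really finished.
let created = ref false

// Hypothetical stand-in for store.CreateSchemaIfNotExists(): returns a non-generic Task.
let createSchemaIfNotExists () : Task =
    let work = async {
        do! Async.Sleep 20
        created.Value <- true }
    (work |> Async.StartAsTask) :> Task

let setup = async {
    // do! waits for the task to complete before the workflow continues.
    do! createSchemaIfNotExists () |> Async.AwaitTask
    return created.Value }

// Nothing inside `setup` runs until the workflow is started here.
let schemaReady = setup |> Async.RunSynchronously
// schemaReady = true
```

Piping the result of Async.AwaitTask into ignore, as in the question, only discards the Async value; it never waits for the task.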

How to await an async method, similar to C#

How do I do a simple await in F#?
In C# I have code like this:
await collection.InsertOneAsync(DO);
var r = collection.ReplaceOneAsync((fun d -> d.Id = DO.Id), DO)
So I created a let await = ... so that my F# code looks more like my C# code.
My current F# code is this:
let awaits (t: Threading.Tasks.Task) = t |> Async.AwaitTask |> Async.RunSynchronously
let await (t: Threading.Tasks.Task<'T>) = t |> Async.AwaitTask |> Async.RunSynchronously
let Busca (numero) =
    let c = collection.Find(fun d -> d.Numero=numero).ToList()
    c
let Insere(DO: DiarioOficial) =
    //collection.InsertOneAsync(DO) |> Async.AwaitTask |> Async.RunSynchronously
    collection.InsertOneAsync(DO) |> awaits
let Salva (DO: DiarioOficial) =
    //let r = collection.ReplaceOneAsync((fun d -> d.Id = DO.Id), DO) |> Async.AwaitTask |> Async.RunSynchronously
    let r = collection.ReplaceOneAsync((fun d -> d.Id = DO.Id), DO) |> await
    r
I want to have only one definition for await (awaits), but the best I could do is this, because on Insere, type is Task, but on Salva, type is Task<'T>
If I use only the await, I get this compile error:
FS0001 The type 'Threading.Tasks.Task' is not compatible with the type 'Threading.Tasks.Task<'a>'
If I use only the awaits, it compiles, but I lose the return type from the async Task
I want to merge the await and awaits in a single
let await = ...
How can I do this?
In F# we tend to use another syntax. It is described e.g. here: https://fsharpforfunandprofit.com/posts/concurrency-async-and-parallel/.
or here: https://learn.microsoft.com/en-us/dotnet/fsharp/tutorials/asynchronous-and-concurrent-programming/async
The idea of working with C# Tasks is to "convert" them to Async<'T> with Async.AwaitTask.
You can do it probably another way, but it is the most straightforward.
There are two parts of writing async code in both F# and C#.
You need to mark the method or code block as asynchronous. In C#, this is done using the async keyword. The F# equivalent is to use the async { ... } block (which is an expression, but otherwise, it is similar).
Inside async method or async { .. } block, you can make non-blocking calls. In C#, this is done using await and in F# it is done using let!. Note that this is not just a function call - the compiler handles this in a special way.
F# also uses Async<T> type rather than Task<T>, but those are easy to convert - e.g. using Async.AwaitTask. So, you probably want something like this:
let myAsyncFunction () = async {
    let! _ = collection.InsertOneAsync(DO) |> Async.AwaitTask
    let! r = collection.ReplaceOneAsync((fun d -> d.Id = DO.Id), DO) |> Async.AwaitTask
    // More code goes here
    return r
}
I used let! to show the idea, but if you have an asynchronous operation that returns unit, you can also use do!
do! collection.InsertOneAsync(DO) |> Async.AwaitTask
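If a single blocking helper name is still wanted for both Task and Task<'T>, one possible sketch is to use overloaded static members, since let-bound functions cannot be overloaded. The `Await` type and its `Run` members are hypothetical names for this illustration, not part of any library, and blocking like this gives up the benefits of the async { ... } approach described above.

```fsharp
open System.Threading.Tasks

// Hypothetical helper type: two overloads share one name.
type Await private () =
    // Blocks until a plain Task completes.
    static member Run(t: Task) : unit =
        t |> Async.AwaitTask |> Async.RunSynchronously
    // Blocks until a Task<'T> completes and returns its result.
    static member Run(t: Task<'T>) : 'T =
        t |> Async.AwaitTask |> Async.RunSynchronously

// Plain Task: the result is unit.
Await.Run(Task.Delay(10))

// Task<int>: the more specific overload is chosen and the value is returned.
let answer = Await.Run(Task.FromResult(42))
// answer = 42
```

This mirrors how Async.AwaitTask itself offers one overload for Task and another for Task<'T>.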

How to unbox Async<string> []

So I have an Async<string> [] and for the life of me I can't figure out how to unbox it!
I started with
Job<Result<string>> []
Then I managed to get it to
Job<string> []
So I added in Job.toAsync thinking converting it to an async might be easier to get out. NOPE
Now I have
Async<string> []
I'm using the Hopac lib
So my question is, how do I just get a string []
You need to somehow run those asyncs, so that each returns you a value. The easiest way is to run them in parallel, for there is a built-in function for it - Async.Parallel. This function takes a sequence of asyncs and returns an async of an array, which you can then run with Async.RunSynchronously or similar:
let asyncs : Async<string> [] = ...
let results = asyncs |> Async.Parallel |> Async.RunSynchronously
Alternatively, if you want them to run sequentially, you could run them one by one and accumulate the result as you go. Older versions of FSharp.Core have no built-in function for this (newer ones provide Async.Sequential), so there you'll have to code it yourself. Something like this:
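let runEm (asyncs: Async<'a> []) =
    // Runs the computations one at a time, accumulating results in reverse order
    let rec loop rest resultsSoFar =
        match rest with
        | x::xs ->
            async {
                let! r = x
                return! loop xs (r::resultsSoFar)
            }
        | [] ->
            async { return resultsSoFar }
    async {
        // The input is an array, so convert it to a list for pattern matching
        let! ress = loop (List.ofArray asyncs) []
        return ress |> List.rev |> Array.ofList
    }
// Usage:
let asyncs : Async<string> [] = ...
let results = runEm asyncs |> Async.RunSynchronously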
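For comparison, here is a small self-contained sketch of both ways of running the array. `makeAsync` is a hypothetical placeholder for the converted Hopac jobs, and the sequential variant assumes a FSharp.Core version that includes Async.Sequential (4.7 or later).

```fsharp
// Hypothetical computations standing in for the converted Hopac jobs.
let makeAsync (s: string) = async { return s.ToUpper() }

let jobs : Async<string> [] = [| "a"; "b"; "c" |] |> Array.map makeAsync

// Runs all computations concurrently; results keep the input order.
let inParallel = jobs |> Async.Parallel |> Async.RunSynchronously

// Runs the computations one after another (assumes FSharp.Core 4.7+).
let inSequence = jobs |> Async.Sequential |> Async.RunSynchronously
// Both are [| "A"; "B"; "C" |]
```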

How to convert the download program to async?

I have the following code
open FSharp.Data
let downloadFile link =
    ......
    use os = File.Create(...)
    Http.RequestStream(....).ResponseStream.CopyTo(os)
let rec consume() = async {
    ......
    |> Seq.iter (fun x ->
        xxx |> Seq.iter(fun link ->
            downloadFile link
        ))
}
I found that the synchronous downloading prevents the code from running concurrently, so I'm trying to do something like the following. How do I change it to use the FSharp.Data Http.AsyncRequestStream? Maybe the CopyTo can be async too?
open FSharp.Data
let downloadFile link = async {
    ......
    use os = File.Create(...)
    Http.AsyncRequestStream(....).ResponseStream.CopyTo(os) // Error
}
let rec consume() = async {
    ......
    |> Seq.iter (fun x ->
        xxx |> Seq.iter(fun link ->
            downloadFile link |> Async.Start // do! downloadFile link????
        ))
}
consume() |> Async.RunSynchronously
Here's a skeleton solution, worthy of all the blank spots in your example:
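let downloadFile link =
    async {
        ......
        use os = File.Create(...)
        // Wait for the response asynchronously, then copy the body to the file
        let! resp = Http.AsyncRequestStream(....)
        return resp.ResponseStream.CopyTo(os)
    }
let consume link =
    async {
        // Build one download computation per link, then run them all in parallel
        let comps : Async<unit> [] =
            xxx
            |> Seq.map (fun link -> downloadFile link)
            |> Array.ofSeq
        return! Async.Parallel comps
    }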
I think you should read up on asynchronicity and concurrency in general, as well as how to use it in F# in particular. From the OP it seems the whole thing is a bit hazy to you.
Edit: to answer the question in the comment:
With return! (or let!, or do!) you execute the nested workflow asynchronously, then pick up executing the current workflow from that point. That is, everything "below" the do! is put into a continuation that gets called once the thing "after" the do! finishes.
Whereas Async.Start fires up the workflow on (another) background thread and returns immediately without waiting for it to finish.
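The difference can be observed with a small sketch. `work` and `log` are hypothetical names for this illustration; the queue only records which workflows have finished by the time it is inspected.

```fsharp
open System.Collections.Concurrent

let log = ConcurrentQueue<string>()

// Hypothetical unit of work: pauses briefly, then records its name.
let work (name: string) = async {
    do! Async.Sleep 50
    log.Enqueue name }

let demo = async {
    // do! suspends this workflow until `work` has finished.
    do! work "awaited"
    let seenAfterDo = log.ToArray()
    // Async.Start queues the workflow on the thread pool and returns immediately,
    // so nothing here waits for it.
    work "fire-and-forget" |> Async.Start
    return seenAfterDo }

let seen = demo |> Async.RunSynchronously
// seen = [| "awaited" |]
```

The awaited workflow has always finished by the time the snapshot is taken; the one passed to Async.Start may still be running when `demo` returns.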

FSharp: Using CSV Type Provider Async

I am using the csv type provider to collect some data from a series of files I have on Azure blob storage:
#r "../packages/FSharp.Data.2.0.9/lib/portable-net40+sl5+wp8+win8/FSharp.Data.dll"
open FSharp.Data
type censusDataContext = CsvProvider<"https://portalvhdspgzl51prtcpfj.blob.core.windows.net/censuschicken/AK.TXT">
type stateCodeContext = CsvProvider<"https://portalvhdspgzl51prtcpfj.blob.core.windows.net/censuschicken/states.csv">
let stateCodes = stateCodeContext.Load("https://portalvhdspgzl51prtcpfj.blob.core.windows.net/censuschicken/states.csv");
let fetchStateData (stateCode:string)=
    let uri = System.String.Format("https://portalvhdspgzl51prtcpfj.blob.core.windows.net/censuschicken/{0}.TXT",stateCode)
    censusDataContext.Load(uri).Rows
let usaData =
    stateCodes.Rows
    |> Seq.collect(fun r -> fetchStateData(r.Abbreviation))
    |> Seq.length
I now want to run these async and I am running into a problem with AsyncLoad:
let fetchStateDataAsync(stateCode:string)=
    async{
        let uri = System.String.Format("https://portalvhdspgzl51prtcpfj.blob.core.windows.net/censuschicken/{0}.TXT",stateCode)
        let! stateData = censusDataContext.AsyncLoad(uri)
        return stateData.Rows
    }
let usaData =
    stateCodes.Rows
    |> Seq.collect(fun r -> fetchStateDataAsync(r.Abbreviation))
    |> Seq.length
The error message is
The type 'Async<seq<CsvProvider<...>.Row>>' is not compatible with the type 'seq<'a>'
Forgive my lack of async knowledge, but do I have to use something other than Seq.Collect when applying async functions?
Thanks in advance
The problem is that turning code to asynchronous (by wrapping it in the async { .. } block) changes the result from seq<Row> to Async<seq<Row>> - that is, you now get an asynchronous computation that will eventually complete and return the sequence.
To fix this, you need to somehow start the computation and wait for the result. There are a number of options, such as running the computations one by one sequentially. Probably the easiest option (and maybe the best, depending on what you want to do) is to run the computations in parallel:
let getAll =
    stateCodes.Rows
    |> Seq.map(fun r -> fetchStateDataAsync(r.Abbreviation))
    |> Async.Parallel
This gives you an asynchronous computation that runs all the downloads and returns an array of results. You can run this synchronously (and block) and get the results:
getAll
|> Async.RunSynchronously
|> Seq.collect id
|> Seq.length
If you want to run the downloads asynchronously in the background you can do that too, but you need to specify what to do with the result. For example:
async {
    let! all = getAll
    all |> Seq.collect id |> Seq.length |> printfn "Length %d" }
|> Async.Start
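The same pattern can be tried without the type provider or network access. In this sketch `fetchRowsAsync` is a hypothetical stand-in for `fetchStateDataAsync` that returns three fake rows per state code.

```fsharp
// Hypothetical stand-in for fetchStateDataAsync: three fake rows per state code.
let fetchRowsAsync (stateCode: string) : Async<seq<string>> =
    async { return seq { for i in 1 .. 3 -> sprintf "%s-%d" stateCode i } }

let totalRows =
    [ "AK"; "AL"; "AZ" ]
    |> Seq.map fetchRowsAsync   // seq<Async<seq<string>>>
    |> Async.Parallel           // Async<seq<string> []>
    |> Async.RunSynchronously   // seq<string> []
    |> Seq.collect id           // flatten the per-state sequences
    |> Seq.length
// totalRows = 9
```

Seq.collect only works once the Async wrapper is gone, which is why it moves after Async.RunSynchronously.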

Resources