I'm doing many async web requests and using Async.Parallel. Something like:
xs
|> Seq.map (fun u -> downloadAsync u.Url)
|> Async.Parallel
|> Async.Catch
Some requests may throw exceptions; I want to log them and continue with the rest of the URLs. I found the Async.Catch function, but this stops the computation when the first exception is thrown. I know I can use a try...with expression within the async expression in order to process the entire list, but I think this implies passing a log function to my downloadAsync function, changing its type. Is there any other way to catch the exceptions, log them, and continue with the rest of the URLs?
The 'trick' is to move the catch into the map such that catching is parallelized as well:
open System
open System.IO
open System.Net
type T = { Url : string }
let xs = [
    { Url = "http://microsoft.com" }
    { Url = "thisDoesNotExists" } // throws when constructing Uri, before downloading
    { Url = "https://thisDotNotExist.Either" }
    { Url = "http://google.com" }
]
let isAllowedInFileName c =
    not <| Seq.contains c (Path.GetInvalidFileNameChars())
let downloadAsync url =
    async {
        use client = new WebClient()
        let fn =
            [|
                __SOURCE_DIRECTORY__
                url |> Seq.filter isAllowedInFileName |> String.Concat
            |]
            |> Path.Combine
        printfn "Downloading %s to %s" url fn
        return! client.AsyncDownloadFile(Uri(url), fn)
    }
xs
|> Seq.map (fun u -> downloadAsync u.Url |> Async.Catch)
|> Async.Parallel
|> Async.RunSynchronously
|> Seq.iter (function
    | Choice1Of2 () -> printfn "Succeeded"
    | Choice2Of2 exn -> printfn "Failed with %s" exn.Message)
(*
Downloading http://microsoft.com to httpmicrosoft.com
Downloading thisDoesNotExists to thisDoesNotExists
Downloading http://google.com to httpgoogle.com
Downloading https://thisDotNotExist.Either to httpsthisDotNotExist.Either
Succeeded
Failed with Invalid URI: The format of the URI could not be determined.
Failed with The remote name could not be resolved: 'thisdotnotexist.either'
Succeeded
*)
Here I wrapped the download into another async to capture the Uri construction exception.
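To see the same pattern without any network access, here is a minimal sketch. The work function is a hypothetical stand-in for downloadAsync (it is not part of the original answer) that fails for odd inputs. Each job is wrapped in Async.Catch before it reaches Async.Parallel, so a failing job produces a Choice2Of2 value instead of aborting the rest:

```fsharp
// Hypothetical stand-in for downloadAsync: fails for odd inputs.
let work (n: int) : Async<int> =
    async {
        if n % 2 = 1 then failwithf "job %d failed" n
        return n * 10
    }

// Catch inside the map, so every job yields a Choice instead of throwing.
let results =
    [ 1; 2; 3; 4 ]
    |> List.map (work >> Async.Catch)
    |> Async.Parallel
    |> Async.RunSynchronously

// Async.Parallel returns results in input order, so the two arrays
// below keep the order of the original list.
let successes =
    results |> Array.choose (function Choice1Of2 v -> Some v | Choice2Of2 _ -> None)
let failures =
    results |> Array.choose (function Choice2Of2 e -> Some e.Message | Choice1Of2 _ -> None)

printfn "ok: %A, failed: %A" successes failures
```

Logging then happens in one place, after the parallel run, and the type of the download function does not need to change.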
I would like to create a chain of expressions where any of them can fail, in which case the computation should just stop.
With Unix pipes it is usually like this:
bash-3.2$ echo && { echo 'a ok'; echo; } && { echo 'b ok'; echo; }
a ok
b ok
When something fails the pipeline stops:
echo && { echo 'a ok'; false; } && { echo 'b ok'; echo; }
a ok
I can handle Optionals but my problem is that I might want to do multiple things in each branch:
let someExternalOperation = callToAnAPI()
match someExternalOperation with
| None -> LogAndStop()
| Some x -> LogAndContinue()
Then I would like to keep going with other API calls and only stop if there is an error.
Is there something like that in F#?
Update1:
What I am trying to do is call external APIs. Each call can fail. It would be nice to retry, but that is not required.
You can use the F# Async and Result types together to represent the results of each API Call. You can then use the bind functions for those types to build a workflow in which you only continue processing when the previous calls were successful. In order to make that easier, you can wrap the Async<Result<_,_>> you would be working with for each api call in its own type and build a module around binding those results to orchestrate a chained computation. Here's a quick example of what that would look like:
First, we would lay out the type ApiCallResult to wrap Async and Result, and we would define ApiCallError to represent HTTP error responses or exceptions:
open System
open System.Net
open System.Net.Http
type ApiCallError =
    | HttpError of (int * string)
    | UnexpectedError of exn
type ApiCallResult<'a> = Async<Result<'a, ApiCallError>>
Next, we would create a module to work with ApiCallResult instances, allowing us to do things like bind, map, and return so that we can process the results of a computation and feed them into the next one.
module ApiCall =
    let ``return`` x : ApiCallResult<_> =
        async { return Ok x }
    let private zero () : ApiCallResult<_> =
        ``return`` []
    let bind<'a, 'b> (f: 'a -> ApiCallResult<'b>) (x: ApiCallResult<'a>) : ApiCallResult<'b> =
        async {
            let! result = x
            match result with
            | Ok value ->
                return! f value
            | Error error ->
                return Error error
        }
    let map f x = x |> bind (f >> ``return``)
    let combine<'a> (acc: ApiCallResult<'a list>) (cur: ApiCallResult<'a>) =
        acc |> bind (fun values -> cur |> map (fun value -> value :: values))
    let join results =
        results |> Seq.fold (combine) (zero ())
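The short-circuiting behaviour of bind can be checked without any HTTP calls. The following is a minimal sketch, not the answer's module: it redefines a standalone bind over Async<Result<_,_>> with a plain string error, and the step helper and the calls log are hypothetical names used only to record which steps actually ran:

```fsharp
// Standalone bind with the same shape as ApiCall.bind, using string errors.
let bind (f: 'a -> Async<Result<'b, 'e>>) (x: Async<Result<'a, 'e>>) : Async<Result<'b, 'e>> =
    async {
        let! r = x
        match r with
        | Ok v -> return! f v
        | Error e -> return Error e
    }

// Records the name of every step that actually runs (hypothetical helper).
let calls = ResizeArray<string>()
let step (name: string) (f: int -> Result<int, string>) (input: int) : Async<Result<int, string>> =
    async {
        calls.Add name
        return f input
    }

let outcome =
    async { return Ok 1 }
    |> bind (step "first" (fun n -> Ok (n + 1)))
    |> bind (step "second" (fun _ -> Error "boom"))
    |> bind (step "third" (fun n -> Ok (n * 100)))   // never runs
    |> Async.RunSynchronously

printfn "%A after steps %A" outcome (List.ofSeq calls)
```

Once "second" returns an Error, the remaining binds only pass that error along, so "third" is never invoked.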
Then, you would have a module to simply do your API calls, however that works in your real scenario. Here's one that just handles GETs with query parameters, but you could make this more sophisticated:
module Api =
    let call (baseUrl: Uri) (queryString: string) : ApiCallResult<string> =
        async {
            try
                use client = new HttpClient()
                let url =
                    let builder = UriBuilder(baseUrl)
                    builder.Query <- queryString
                    builder.Uri
                printfn "Calling API: %O" url
                let! response = client.GetAsync(url) |> Async.AwaitTask
                // Read the body once; it is used for both the success and the error case
                let! content = response.Content.ReadAsStringAsync() |> Async.AwaitTask
                if response.IsSuccessStatusCode then
                    return Ok content
                else
                    return Error <| HttpError (response.StatusCode |> int, content)
            with ex ->
                // Network failures and other exceptions become UnexpectedError
                return Error <| UnexpectedError ex
        }
    let getQueryParam name value =
        value |> WebUtility.UrlEncode |> sprintf "%s=%s" name
Finally, you would have your actual business workflow logic, where you call multiple APIs and feed the results of one into another. In the below example, anywhere you see callMathApi, it is making a call to an external REST API that may fail, and by using the ApiCall module to bind the results of the API call, it only proceeds to the next API call if the previous call was successful. You can declare an operator like >>= to eliminate some of the noise in the code when binding computations together:
module MathWorkflow =
    let private (>>=) x f = ApiCall.bind f x
    let private apiUrl = Uri "http://api.mathjs.org/v4/" // REST API for mathematical expressions
    let private callMathApi expression =
        expression |> Api.getQueryParam "expr" |> Api.call apiUrl
    let average values =
        values
        |> List.map (sprintf "%d")
        |> String.concat "+"
        |> callMathApi
        >>= fun sum ->
            sprintf "%s/%d" sum values.Length
            |> callMathApi
    let averageOfSquares values =
        values
        |> List.map (fun value -> sprintf "%d*%d" value value)
        |> List.map callMathApi
        |> ApiCall.join
        |> ApiCall.map (List.map int)
        >>= average
This example uses the Mathjs.org API to compute the average of a list of integers (making one API call to compute the sum, then another to divide by the number of elements), and also allows you to compute the average of the squares of a list of values, by calling the API asynchronously for each element in the list to square it, then joining the results together and computing the average. You can use these functions as follows (I added a printfn to the actual API call so it logs the HTTP requests):
Calling average:
MathWorkflow.average [1;2;3;4;5] |> Async.RunSynchronously
Outputs:
Calling API: http://api.mathjs.org/v4/?expr=1%2B2%2B3%2B4%2B5
Calling API: http://api.mathjs.org/v4/?expr=15%2F5
[<Struct>]
val it : Result<string,ApiCallError> = Ok "3"
Calling averageOfSquares:
MathWorkflow.averageOfSquares [2;4;6;8;10] |> Async.RunSynchronously
Outputs:
Calling API: http://api.mathjs.org/v4/?expr=2*2
Calling API: http://api.mathjs.org/v4/?expr=4*4
Calling API: http://api.mathjs.org/v4/?expr=6*6
Calling API: http://api.mathjs.org/v4/?expr=8*8
Calling API: http://api.mathjs.org/v4/?expr=10*10
Calling API: http://api.mathjs.org/v4/?expr=100%2B64%2B36%2B16%2B4
Calling API: http://api.mathjs.org/v4/?expr=220%2F5
[<Struct>]
val it : Result<string,ApiCallError> = Ok "44"
Ultimately, you may want to implement a custom Computation Builder to allow you to use a computation expression with the let! syntax, instead of explicitly writing the calls to ApiCall.bind everywhere. This is fairly simple, since you already do all the real work in the ApiCall module, and you just need to make a class with the appropriate Bind/Return members:
type ApiCallBuilder () =
    member __.Bind (x, f) = ApiCall.bind f x
    member __.Return x = ApiCall.``return`` x
    member __.ReturnFrom x = x
    member __.Zero () = ApiCall.``return`` ()
let apiCall = ApiCallBuilder()
With the ApiCallBuilder, you could rewrite the functions in the MathWorkflow module like this, making them a little easier to read and compose:
let average values =
    apiCall {
        let! sum =
            values
            |> List.map (sprintf "%d")
            |> String.concat "+"
            |> callMathApi
        return!
            sprintf "%s/%d" sum values.Length
            |> callMathApi
    }
let averageOfSquares values =
    apiCall {
        let! squares =
            values
            |> List.map (fun value -> sprintf "%d*%d" value value)
            |> List.map callMathApi
            |> ApiCall.join
        return! squares |> List.map int |> average
    }
These work as you described in the question, where each API call is made independently and the results feed into the next call, but if one call fails the computation is stopped and the error is returned. For example, if you change the URL used in the example calls here to the v3 API ("http://api.mathjs.org/v3/") without changing anything else, you get the following:
Calling API: http://api.mathjs.org/v3/?expr=2*2
[<Struct>]
val it : Result<string,ApiCallError> =
Error
(HttpError
(404,
"<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Error</title>
</head>
<body>
<pre>Cannot GET /v3/</pre>
</body>
</html>
"))
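The question also mentions that retrying would be nice to have. A minimal sketch of one way to do that on top of Async<Result<_,_>> follows; the retry function, its attempts parameter, and the flaky test call are all hypothetical names introduced here, and a real version would probably add a delay between attempts and only retry errors that are known to be transient:

```fsharp
// Re-run `call` until it returns Ok or the attempts are used up.
// The last result (Ok or Error) is returned unchanged.
let rec retry (attempts: int) (call: unit -> Async<Result<'a, 'e>>) : Async<Result<'a, 'e>> =
    async {
        let! result = call ()
        match result with
        | Error _ when attempts > 1 -> return! retry (attempts - 1) call
        | _ -> return result
    }

// Hypothetical flaky operation: fails twice, then succeeds.
let calls = ref 0
let flaky () : Async<Result<int, string>> =
    async {
        calls.Value <- calls.Value + 1
        return (if calls.Value < 3 then Error "transient" else Ok calls.Value)
    }

let outcome = retry 5 flaky |> Async.RunSynchronously
printfn "%A after %d calls" outcome calls.Value
```

Because retry takes a function rather than an already-built Async value, each attempt creates a fresh computation; it can be wrapped around any single API call before that call is bound into the larger workflow.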
I have the following code
open FSharp.Data
let downloadFile link =
    ......
    use os = File.Create(...)
    Http.RequestStream(....).ResponseStream.CopyTo(os)
let rec consume() = async {
    ......
    |> Seq.iter (fun x ->
        xxx |> Seq.iter(fun link ->
            downloadFile link
        ))
}
I found that the synchronous download prevents the code from running concurrently, so I'm trying to do something like the following. How do I change it to use FSharp.Data's Http.AsyncRequestStream? Can CopyTo be made async too?
open FSharp.Data
let downloadFile link = async {
    ......
    use os = File.Create(...)
    Http.AsyncRequestStream(....).ResponseStream.CopyTo(os) // Error
}
let rec consume() = async {
    ......
    |> Seq.iter (fun x ->
        xxx |> Seq.iter(fun link ->
            downloadFile link |> Async.Start // do! downloadFile link????
        ))
}
consume() |> Async.RunSynchronously
Here's a skeleton solution, worthy of all the blank spots in your example:
let downloadFile link =
    async {
        ......
        use os = File.Create(...)
        // Wait for the response asynchronously, then copy its stream to the file
        let! resp = Http.AsyncRequestStream(....)
        return resp.ResponseStream.CopyTo(os)
    }
let consume link =
    async {
        // Build all the downloads first, then run them in parallel
        let comps : Async<unit> [] =
            xxx
            |> Seq.map (fun link -> downloadFile link)
            |> Array.ofSeq
        return! Async.Parallel comps
    }
I think you should read up on asynchronicity and concurrency in general, as well as how to use it in F# in particular. From the OP it seems the whole thing is a bit hazy to you.
Edit: to answer the question in the comment:
With return! (or let!, or do!) you execute the nested workflow asynchronously, then pick up executing the current workflow from that point. That is, everything "below" the do! is put into a continuation that gets called once the thing "after" the do! finishes.
Whereas Async.Start fires up the workflow on (another) background thread and returns immediately without waiting for it to finish.
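The difference can be observed with a small sketch. The makeChild helper and the flag are hypothetical, introduced only for this illustration; the child workflow sets a flag after a short delay, and the parent reads the flag right after either awaiting the child with do! or launching it with Async.Start:

```fsharp
// A child workflow that flips a flag after a short delay.
let makeChild (finished: bool ref) =
    async {
        do! Async.Sleep 200
        finished.Value <- true
    }

// do! waits for the child before continuing, so the flag is set.
let awaited =
    let finished = ref false
    async {
        do! makeChild finished
        return finished.Value
    }
    |> Async.RunSynchronously

// Async.Start queues the child and returns immediately, so the flag
// is normally still false when it is read on the next line.
let fireAndForget =
    let finished = ref false
    async {
        Async.Start (makeChild finished)
        return finished.Value
    }
    |> Async.RunSynchronously

printfn "awaited: %b, fire-and-forget: %b" awaited fireAndForget
```

The second result depends on timing, which is exactly the point: with Async.Start nothing in the parent waits for the child, so any downloads started that way may still be running (or may never finish) when Async.RunSynchronously returns.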
I have the following interface method:
Task<string[]> GetBlobsFromContainer(string containerName);
and its implementation in C#:
var container = await _containerClient.GetContainer(containerName);
var tasks = container.ListBlobs()
.Cast<CloudBlockBlob>()
.Select(b => b.DownloadTextAsync());
return await Task.WhenAll(tasks);
When I try to rewrite it in F#:
member this.GetBlobsFromContainer(containerName : string) : Task<string[]> =
    let task = async {
        let! container = containerClient.GetContainer(containerName) |> Async.AwaitTask
        return container.ListBlobs()
               |> Seq.cast<CloudBlockBlob>
               |> Seq.map (fun b -> b.DownloadTextAsync())
               |> ??
    }
    task |> ??
I'm stuck with the last lines.
How do I return a Task<string[]> from F# properly?
I had to guess what the type of containerClient is, and the closest I found is CloudBlobClient (which does not have GetContainer: string -> Task<CloudBlobContainer>, but it shouldn't be too hard to adapt). Your function might then look as follows:
open System
open System.Threading.Tasks
open Microsoft.WindowsAzure.Storage.Blob
open Microsoft.WindowsAzure.Storage
let containerClient : CloudBlobClient = null
let GetBlobsFromContainer(containerName : string) : Task<string[]> =
    async {
        let container = containerClient.GetContainerReference(containerName)
        return! container.ListBlobs()
                |> Seq.cast<CloudBlockBlob>
                |> Seq.map (fun b -> b.DownloadTextAsync() |> Async.AwaitTask)
                |> Async.Parallel
    } |> Async.StartAsTask
I changed the return type to be Task<string[]> instead of Task<string seq>, as I suppose you want to keep the interface. Otherwise, I'd suggest getting rid of the Task and using Async in F#-only code.
Will this work?
member this.GetBlobsFromContainer(containerName : string) : Task<string seq> =
    let aMap f x = async {
        let! a = x
        return f a }
    let task = async {
        let! container = containerClient.GetContainer(containerName) |> Async.AwaitTask
        return! container.ListBlobs()
                |> Seq.cast<CloudBlockBlob>
                |> Seq.map (fun b -> b.DownloadTextAsync() |> Async.AwaitTask)
                |> Async.Parallel
                |> aMap Array.toSeq
    }
    task |> Async.StartAsTask
I had to make some assumptions about containerClient etc. so I haven't been able to test this, but at least it compiles.
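Since neither answer could be tested against the real storage client, here is a minimal sketch of the same Task-to-Async-to-Task round trip that runs without Azure. The downloadTextAsync function is a hypothetical stand-in for b.DownloadTextAsync() that returns an already-completed Task<string>; everything else uses only Async.AwaitTask, Async.Parallel, and Async.StartAsTask:

```fsharp
open System.Threading.Tasks

// Hypothetical stand-in for b.DownloadTextAsync(): an already-completed Task<string>.
let downloadTextAsync (name: string) : Task<string> =
    Task.FromResult(name.ToUpperInvariant())

// Task<string> -> Async<string> for each item, run them all,
// then expose the combined Async<string[]> as a Task<string[]>.
let getAll (names: string list) : Task<string[]> =
    names
    |> List.map (fun n -> downloadTextAsync n |> Async.AwaitTask)
    |> Async.Parallel
    |> Async.StartAsTask

let texts = (getAll [ "a"; "b" ]).Result
printfn "%A" texts
```

This is the F# counterpart of the C# Select followed by Task.WhenAll: Async.Parallel plays the role of WhenAll, and Async.StartAsTask converts the result back to the type the interface expects.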
I am using the csv type provider to collect some data from a series of files I have on Azure blob storage:
#r "../packages/FSharp.Data.2.0.9/lib/portable-net40+sl5+wp8+win8/FSharp.Data.dll"
open FSharp.Data
type censusDataContext = CsvProvider<"https://portalvhdspgzl51prtcpfj.blob.core.windows.net/censuschicken/AK.TXT">
type stateCodeContext = CsvProvider<"https://portalvhdspgzl51prtcpfj.blob.core.windows.net/censuschicken/states.csv">
let stateCodes = stateCodeContext.Load("https://portalvhdspgzl51prtcpfj.blob.core.windows.net/censuschicken/states.csv");
let fetchStateData (stateCode:string) =
    let uri = System.String.Format("https://portalvhdspgzl51prtcpfj.blob.core.windows.net/censuschicken/{0}.TXT",stateCode)
    censusDataContext.Load(uri).Rows
let usaData =
    stateCodes.Rows
    |> Seq.collect(fun r -> fetchStateData(r.Abbreviation))
    |> Seq.length
I now want to run these async and I am running into a problem with AsyncLoad:
let fetchStateDataAsync (stateCode:string) =
    async {
        let uri = System.String.Format("https://portalvhdspgzl51prtcpfj.blob.core.windows.net/censuschicken/{0}.TXT",stateCode)
        let! stateData = censusDataContext.AsyncLoad(uri)
        return stateData.Rows
    }
let usaData =
    stateCodes.Rows
    |> Seq.collect(fun r -> fetchStateDataAsync(r.Abbreviation))
    |> Seq.length
The error message is
The type 'Async<seq<CsvProvider<...>.Row>>' is not compatible with the type 'seq<'a>'
Forgive my lack of async knowledge, but do I have to use something other than Seq.collect when applying async functions?
Thanks in advance
The problem is that turning code to asynchronous (by wrapping it in the async { .. } block) changes the result from seq<Row> to Async<seq<Row>> - that is, you now get an asynchronous computation that will eventually complete and return the sequence.
To fix this, you need to somehow start the computation and wait for the result. There are a number of options, such as running the computations one by one sequentially. Probably the easiest option (and maybe the best, depending on what you want to do) is to run the computations in parallel:
let getAll =
    stateCodes.Rows
    |> Seq.map(fun r -> fetchStateDataAsync(r.Abbreviation))
    |> Async.Parallel
This gives you an asynchronous computation that runs all the downloads and returns an array of results. You can run this synchronously (and block) and get the results:
getAll
|> Async.RunSynchronously
|> Seq.collect id
|> Seq.length
If you want to run the downloads asynchronously in the background you can do that too, but you need to specify what to do with the result. For example:
async {
    let! all = getAll
    all |> Seq.collect id |> Seq.length |> printfn "Length %d" }
|> Async.Start
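The type change from seq<Row> to Async<seq<Row>> and the fix can be reproduced without the type provider. In this minimal sketch, fetchRows is a hypothetical stand-in for fetchStateDataAsync that returns one fake row per character of the state code:

```fsharp
// Hypothetical stand-in for fetchStateDataAsync: one fake row per character.
let fetchRows (code: string) : Async<seq<string>> =
    async { return seq { for i in 1 .. code.Length -> sprintf "%s-%d" code i } }

// Seq.map (not Seq.collect) gives seq<Async<seq<string>>>;
// Async.Parallel turns that into a single Async<seq<string>[]>.
let getAll =
    [ "AK"; "WYO" ]
    |> Seq.map fetchRows
    |> Async.Parallel

// Only after running the computation is there a real collection to flatten.
let total =
    getAll
    |> Async.RunSynchronously
    |> Seq.collect id
    |> Seq.length

printfn "Length %d" total
```

Seq.collect is still used, but only after Async.Parallel and Async.RunSynchronously have turned the asynchronous computations into an ordinary array of sequences.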
Is there any way to call a function by name in F#? Given a string, I want to pluck a function value from the global namespace (or, in general, a given module), and call it. I know the type of the function already.
Why would I want to do this? I'm trying to work around fsi not having an --eval option. I have a script file that defines many int -> unit functions, and I want to execute one of them. Like so:
fsianycpu --use:script_with_many_funcs.fsx --eval "analyzeDataSet 1"
My thought was to write a trampoline script, like:
fsianycpu --use:script_with_many_funcs.fsx trampoline.fsx analyzeDataSet 1
In order to write "trampoline.fsx", I'd need to look up the function by name.
There is no built-in function for this, but you can implement it using .NET reflection. The idea is to search through all types available in the current assembly (this is where the current code is compiled) and dynamically invoke the method with the matching name. If you had this in a module, you'd have to check the type name too.
// Some sample functions that we might want to call
let hello() =
    printfn "Hello world"
let bye() =
    printfn "Bye"
// Loader script that calls function by name
open System
open System.Reflection
let callFunction name =
    let asm = Assembly.GetExecutingAssembly()
    for t in asm.GetTypes() do
        for m in t.GetMethods() do
            if m.IsStatic && m.Name = name then
                m.Invoke(null, [||]) |> ignore
// Use the first command line argument (after -- in the fsi call below)
callFunction fsi.CommandLineArgs.[1]
This prints "Hello world" when called with:
fsi --use:C:\temp\test.fsx --exec -- "hello"
You can use reflection to look up the functions as MethodInfo values by their F# function name:
open System
open System.Reflection
let rec fsharpName (mi:MemberInfo) =
    if mi.DeclaringType.IsNestedPublic then
        sprintf "%s.%s" (fsharpName mi.DeclaringType) mi.Name
    else
        mi.Name
let functionsByName =
    Assembly.GetExecutingAssembly().GetTypes()
    |> Seq.filter (fun t -> t.IsPublic || t.IsNestedPublic)
    |> Seq.collect (fun t -> t.GetMethods(BindingFlags.Static ||| BindingFlags.Public))
    |> Seq.filter (fun m -> not m.IsSpecialName)
    |> Seq.groupBy (fun m -> fsharpName m)
    |> Map.ofSeq
    |> Map.map (fun k v -> Seq.exactlyOne v)
You can then invoke the MethodInfo
functionsByName.[fsharpFunctionNameString].Invoke(null, objectArrayOfArguments)
But you probably need to do more work to parse your string arguments using the MethodInfo.GetParameters() types as a hint.
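A minimal sketch of that argument-parsing step follows. It is deliberately narrower than the lookup above: the Commands type and its members are hypothetical, the search is limited to that one type rather than the whole assembly, and Convert.ChangeType only covers simple parameter types such as int or string:

```fsharp
open System
open System.Reflection

// Hypothetical commands; static members keep the lookup to one known type.
type Commands =
    static member Double (n: int) = n * 2
    static member Greet (name: string) = "Hello " + name

// Look up a static method by name and convert each raw string argument
// to the declared parameter type before invoking it.
let invokeByName (name: string) (rawArgs: string[]) : obj =
    let m = typeof<Commands>.GetMethod(name, BindingFlags.Public ||| BindingFlags.Static)
    if isNull m then failwithf "No function named %s" name
    let ps = m.GetParameters()
    if ps.Length <> rawArgs.Length then
        failwithf "%s expects %d argument(s)" name ps.Length
    let args =
        Array.map2
            (fun (p: ParameterInfo) (raw: string) -> Convert.ChangeType(raw, p.ParameterType))
            ps rawArgs
    m.Invoke(null, args)

printfn "%A" (invokeByName "Double" [| "21" |])
printfn "%A" (invokeByName "Greet" [| "fsi" |])
```

A trampoline script could pass the function name and the remaining command line arguments straight into a function like this.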
You could also use FSharp.Compiler.Service to make your own fsi.exe with an eval flag
open System
open Microsoft.FSharp.Compiler.Interactive.Shell
open System.Text.RegularExpressions
[<EntryPoint>]
let main(argv) =
    let argAll = Array.append [| "C:\\fsi.exe" |] argv
    // Replace the custom --eval: flag with one that fsi understands
    let argFix = argAll |> Array.map (fun a -> if a.StartsWith("--eval:") then "--noninteractive" else a)
    let optFind = argv |> Seq.tryFind (fun a -> a.StartsWith "--eval:")
    let evalData =
        if optFind.IsSome then
            optFind.Value.Replace("--eval:",String.Empty)
        else
            String.Empty
    let fsiConfig = FsiEvaluationSession.GetDefaultConfiguration()
    let fsiSession = FsiEvaluationSession(fsiConfig, argFix, Console.In, Console.Out, Console.Error)
    // Without --eval: behave like normal fsi; otherwise evaluate the given code
    if String.IsNullOrWhiteSpace(evalData) then
        fsiSession.Run()
    else
        fsiSession.EvalInteraction(evalData)
    0
If the above were compiled into fsieval.exe, it could be used like so:
fsieval.exe --load:script_with_many_funcs.fsx --eval:analyzeDataSet` 1