load balancing requests/farming requests (concurrency and state - awkward) - f#

I want to write a simple load balancer for some requests coming into a C# web api app.
(I only use the C# stuff as a convenient way to create a web server).
Whats the best way to approach this? (I havent really done any mailbox stuff in F#)
If I were to use mailboxes/agents...then I post the request as a message, fine...but how do I get the response back to web api request handler?
Isnt it all fire and forget? (I have, ironically, done some erlang)
(I CAN have a simple mutable global index of which is the next worker service to handle the request...but this is my opportunity to do it nicely).
actually I think I may have done something very similar to this in erlang, and I think the initiator would pass a return address where to send the message back (and the return address was the process id of the initiator), it would then wait for the response, and when it gets it (or times out), it would then do whatever it needed to do.
Is that a sensible mechanism in F#?
------------------------ edit ------------------------
So, https://www.codemag.com/Article/1707051/Writing-Concurrent-Programs-Using-F
describes a similar set up and it seems I need to use, and actually this works,
but it ISNT quite the same mechanism as my Erlang suggestion about.
Here each client sends a PostAndReply, and then waits for the response before replying back.....that seems unnecessary, ideally the reply would go all the way back to the origin, and the intermediaries would fire and forget in between.
open System
type Message = string * AsyncReplyChannel<string>
[<EntryPoint>]
let main argv =
let myFirstAgent =
MailboxProcessor<Message>.Start(fun inbox ->
let rec loop () =
async {
let! (message, replyChannel) = inbox.Receive()
replyChannel.Reply (String.Format ("1. Received message: {0}", message))
do! loop ()
}
loop ())
let mySecondAgent =
MailboxProcessor<Message>.Start(fun inbox ->
let rec loop () =
async {
let! (message, replyChannel) = inbox.Receive()
replyChannel.Reply (String.Format ("2. Received message: {0}", message))
do! loop ()
}
loop ())
let agents = [ myFirstAgent; mySecondAgent ]
let replyAgent =
MailboxProcessor<Message>.Start(fun inbox ->
let rec loop index =
async {
let! (message, replyChannel) = inbox.Receive()
let reply = (agents.Item index).PostAndReply(fun rc -> message,rc)
replyChannel.Reply reply
do! loop ((index + 1) % 2)
}
loop 0)
let foo = replyAgent.PostAndReply(fun rc -> "Hello", rc)
let foo1 = replyAgent.PostAndReply(fun rc -> "Hello", rc)
let foo2 = replyAgent.PostAndReply(fun rc -> "Hello", rc)
let foo3 = replyAgent.PostAndReply(fun rc -> "Hello", rc)
let foo4 = replyAgent.PostAndReply(fun rc -> "Hello", rc)
//myFirstAgent.Post "Hello!"
printfn "Hello World from F#!"
System.Console.ReadLine() |> ignore
0 // return an integer exit code

D'oh, what I need to do is actually UNDERSTAND the example, rather than just hack together code!
if the reply agent just forwards it...then we're done.
let replyAgent =
MailboxProcessor<Message>.Start(fun inbox ->
let rec loop index =
async {
let! (message, replyChannel) = inbox.Receive()
let reply = (agents.Item index).Post(message, replyChannel)
do! loop ((index + 1) % 2)
}
loop 0)

Related

Using Task and Async together for Asynchronous operations on seperate threads in F#

I'm being a little adventurous with my code for the amount of experience I have with F# and I am a little worried about cross threading issues.
Background:
I have a number of orders where I need to validate the address. Some of the orders can be validated against google maps geocoding API which allows 50/ second. the rest are Australian PO Boxes which we don't have many of - but I need to validate them against a different API that only allows 1 call per second.
I have switched over most of my code from async{} functions to task{} functions and I am assuming to get something on several threads at the same time it needs to be in an async{} function or block and be piped to Async.Parallel
Question: Is this the right way to do this or will it fall over? I am wondering if I am fundamentally thinking about this the wrong way.
Notes:
I am passing a database context into the async function and updating the database within that function
I will call this from a C# ( WPF ) Application and report the progress
Am I going to have cross threading issues?
let validateOrder
(
order: artooProvider.dataContext.``dbo.OrdersEntity``,
httpClient: HttpClient,
ctx: artooProvider.dataContext,
isAuPoBox: bool
) =
async {
// Validate Address
let! addressExceptions = ValidateAddress.validateAddress (order, httpClient, ctx, isAuPoBox) |> Async.AwaitTask
// SaveExceptions
do! ctx.SubmitUpdatesAsync()
// return Exception count
return ""
}
let validateGMapOrders(httpClient: HttpClient, ctx: artooProvider.dataContext, orders: artooProvider.dataContext.``dbo.OrdersEntity`` list) =
async {
let ordersChunked = orders |> List.chunkBySize 50
for fiftyOrders in ordersChunked do
let! tasks =
fiftyOrders
|> List.map (fun (order) -> validateOrder (order, httpClient, ctx, false) )
|> Async.Parallel
do! Async.Sleep(2000)
}
let validateOrders (ctx: artooProvider.dataContext, progress: IProgress<DownloadProgressModel>) =
task {
let unvalidatedOrders =
query {
for orders in ctx.Dbo.Orders do
where (orders.IsValidated.IsNone)
select (orders)
}
|> Seq.toList
let auPoBoxOrders =
unvalidatedOrders
|> List.filter (fun order -> isAUPoBox(order) = true )
let gMapOrders =
unvalidatedOrders
|> List.filter (fun order -> isAUPoBox(order) = false )
let googleHttpClient = new HttpClient()
let auspostHttpclient = Auspost.AuspostApi.getApiClient ()
// Google maps validations
do! validateGMapOrders(googleHttpClient,ctx,gMapOrders)
// PO Box Validations
for position in 0 .. auPoBoxOrders.Length - 1 do
let! result = validateOrder (gMapOrders[position], auspostHttpclient, ctx, true)
do! Task.Delay(1000)
return true
}
When I have had to deal with rate-limited API problems I hide that API behind a MailboxProcessor that maintains an internal time to comply with the rate limit but appears as a normal async API from the outside.
Since you have two API's with different rate limits I'd parameterise the time delay and processing action then create one object for each API.
open System
type Request = string
type Response = string
type RateLimitedProcessor() =
// Initialise 1s in past so ready to start immediately.
let mutable lastCall = DateTime.Now - TimeSpan(0, 0, 1)
let mbox = new MailboxProcessor<Request * AsyncReplyChannel<Response>>((fun mbox ->
let rec f () =
async {
let! (req, reply) = mbox.Receive()
let msSinceCall = (DateTime.Now - lastCall).Milliseconds
// wait 1s between requests
if msSinceCall < 1000 then
do! Async.Sleep (1000 - msSinceCall)
lastCall <- DateTime.Now
reply.Reply "Response"
// Call self recursively to process the next incoming message
return! f()
}
f()
))
do mbox.Start()
member __.Process(req:Request): Async<Response> =
async {
return! mbox.PostAndAsyncReply(fun reply -> req, reply)
}
interface IDisposable with
member this.Dispose() = (mbox :> IDisposable).Dispose()

Return results to the caller with a throttling queue

Building on a snippet and answer, would it be possible to return results to the caller from the throttling queue? I've tried PostAndAsyncReply to receive reply on a channel but it's throwing an error if I pipe it with Enqueue. Here's the code.
Appreciate a F# core vanilla based solution around Queue or Mailbox design patterns.
Question
The question is to be able to call functions asynchronously based on the throttle (max 3 at a time), passing each item from the array, wait on the whole queue/array until it's finished while collecting all the results and then return the results to the caller. (Return the results to the caller is what's pending in here)
Callee Code
// Message type used by the agent - contains queueing
// of work items and notification of completion
type ThrottlingAgentMessage =
| Completed
| Enqueue of Async<unit>
/// Represents an agent that runs operations in concurrently. When the number
/// of concurrent operations exceeds 'limit', they are queued and processed later
let throttlingAgent limit =
MailboxProcessor.Start(fun inbox ->
async {
// The agent body is not executing in parallel,
// so we can safely use mutable queue & counter
let queue = System.Collections.Generic.Queue<Async<unit>>()
let running = ref 0
while true do
// Enqueue new work items or decrement the counter
// of how many tasks are running in the background
let! msg = inbox.Receive()
match msg with
| Completed -> decr running
| Enqueue w -> queue.Enqueue(w)
// If we have less than limit & there is some work to
// do, then start the work in the background!
while running.Value < limit && queue.Count > 0 do
let work = queue.Dequeue()
incr running
do! // When the work completes, send 'Completed'
// back to the agent to free a slot
async {
do! work
inbox.Post(Completed)
}
|> Async.StartChild
|> Async.Ignore
})
let requestDetailAsync (url: string) : Async<Result<string, Error>> =
async {
Console.WriteLine ("Simulating request " + url)
try
do! Async.Sleep(1000) // let's say each request takes about a second
return Ok (url + ":body...")
with :? WebException as e ->
return Error {Code = "500"; Message = "Internal Server Error"; Status = HttpStatusCode.InternalServerError}
}
let requestMasterAsync() : Async<Result<System.Collections.Concurrent.ConcurrentBag<_>, Error>> =
async {
let urls = [|
"http://www.example.com/1";
"http://www.example.com/2";
"http://www.example.com/3";
"http://www.example.com/4";
"http://www.example.com/5";
"http://www.example.com/6";
"http://www.example.com/7";
"http://www.example.com/8";
"http://www.example.com/9";
"http://www.example.com/10";
|]
let results = System.Collections.Concurrent.ConcurrentBag<_>()
let agent = throttlingAgent 3
for url in urls do
async {
let! res = requestDetailAsync url
results.Add res
}
|> Enqueue
|> agent.Post
return Ok results
}
Caller Code
[<TestMethod>]
member this.TestRequestMasterAsync() =
match Entity.requestMasterAsync() |> Async.RunSynchronously with
| Ok result -> Console.WriteLine result
| Error error -> Console.WriteLine error
You could use Hopac.Streams for that. With such tool it is pretty trivial:
open Hopac
open Hopac.Stream
open System
let requestDetailAsync url = async {
Console.WriteLine ("Simulating request " + url)
try
do! Async.Sleep(1000) // let's say each request takes about a second
return Ok (url + ":body...")
with :? Exception as e ->
return Error e
}
let requestMasterAsync() : Stream<Result<string,exn>> =
[| "http://www.example.com/1"
"http://www.example.com/2"
"http://www.example.com/3"
"http://www.example.com/4"
"http://www.example.com/5"
"http://www.example.com/6"
"http://www.example.com/7"
"http://www.example.com/8"
"http://www.example.com/9"
"http://www.example.com/10" |]
|> Stream.ofSeq
|> Stream.mapPipelinedJob 3 (requestDetailAsync >> Job.fromAsync)
requestMasterAsync()
|> Stream.iterFun (printfn "%A")
|> queue //prints all results asynchronously
let allResults : Result<string,exn> list =
requestMasterAsync()
|> Stream.foldFun (fun results cur -> cur::results ) []
|> run //fold stream into list synchronously
ADDED
In case you want to use only vanilla FSharp.Core with mailboxes only try this:
type ThrottlingAgentMessage =
| Completed
| Enqueue of Async<unit>
let inline (>>=) x f = async.Bind(x, f)
let inline (>>-) x f = async.Bind(x, f >> async.Return)
let throttlingAgent limit =
let agent = MailboxProcessor.Start(fun inbox ->
let queue = System.Collections.Generic.Queue<Async<unit>>()
let startWork work =
work
>>- fun _ -> inbox.Post Completed
|> Async.StartChild |> Async.Ignore
let rec loop curWorkers =
inbox.Receive()
>>= function
| Completed when queue.Count > 0 ->
queue.Dequeue() |> startWork
>>= fun _ -> loop curWorkers
| Completed ->
loop (curWorkers - 1)
| Enqueue w when curWorkers < limit ->
w |> startWork
>>= fun _ -> loop (curWorkers + 1)
| Enqueue w ->
queue.Enqueue w
loop curWorkers
loop 0)
Enqueue >> agent.Post
It is pretty much the same logic, but slightly optimized to not use queue if there is free worker capacity (just start job and don't bother with queue/dequeue).
throttlingAgent is a function int -> Async<unit> -> unit
Because we don't want client to bother with our internal ThrottlingAgentMessage type.
use like this:
let throttler = throttlingAgent 3
for url in urls do
async {
let! res = requestDetailAsync url
results.Add res
}
|> throttler

Wait on the mailboxprocessor

Is it possible to wait on the mailboxprocessor, following code works in F# interactive but is there a way to wait on it in an application or a unit test?
[<TestMethod>]
member this.TestMailboxProcessor() =
let mailboxProcessor = MailboxProcessor<string>.Start(fun inbox ->
async {
while true do
let! msg = inbox.Receive()
printfn "agent got message %s" msg // too late, UnitTest exits
}
)
mailboxProcessor.Post "ping"
Console.WriteLine "message posted" // I see this in the console
Assert.IsTrue(true)
It's not possible in exactly this scenario, but you can define your message type to include an AsyncReplyChannel<'t>, which then allows you to use MailboxProcessor.PostAndReply instead of Post. This way the calling code can (either synchronously or asynchronously) wait for a response value, or at least an indication that the processing is done.
Your modified source code may look like this:
[<TestMethod>]
member this.TestMailboxProcessor() =
let mailboxProcessor =
MailboxProcessor<string * AsyncReplyChannel<unit>>.Start(fun inbox ->
async {
while true do
let! msg, replyChannel = inbox.Receive()
printfn "agent got message %s" msg
(*
Reply takes a value of the generic param of
AsyncReplyChannel<'t>, in this case just a unit
*)
replyChannel.Reply()
}
)
(*
You can't create an AsyncReplyChannel<'t> value, but this does it for you.
Also always, always use timeouts when awaiting message replies.
*)
mailboxProcessor.PostAndReply(
(fun replyChannel -> "ping", replyChannel),
timeout = 1000)
(* This gets printed only after the message has been posted and processed *)
Console.WriteLine "message posted"
Assert.IsTrue(true)
MailboxProcessors are a bit tricky topic though, so make sure you always use timeouts, otherwise in case of errors in your code, or exceptions killing the message loop, your code would hang forever. Not good in tests, even worse in production.
You should use PostAndAsyncReply or PostAndReply (blocking version)
let replyAgent = MailboxProcessor.Start(fun inbox ->
let rec loop() =
async {
let! (replyChannel: AsyncReplyChannel<_>), msg = inbox.Receive()
replyChannel.Reply (sprintf "replied for message: %A" msg)
return! loop()
}
loop() )
let reply = replyAgent.PostAndReply(fun replCh -> replCh, "Hi")
printfn "%s" reply //prints "replied for message: Hi"

Is returning results from MailboxProcessor via Rx a good idea?

I am a little curious about the code example below and what people think.
The idea was to read from a NetworkStream (~20 msg/s) and instead of working in the main, pass things to MainboxProcessor to handle and get things back for bindings when done.
The usual way is to use PostAndReply, but I want to bind to ListView or other control in C#. Must do magic with LastN items and filtering anyway.
Plus, Rx has some error handling.
The example below observes numbers from 2..10 and returns "hello X". On 8 it stops like it was EOF. Made it to ToEnumerable because other thread finishes before otherwise, but it works with Subscribe as well.
What bothers me:
passing Subject(obj) around in recursion. I don't see any problems having around 3-4 of those. Good idea?
Lifetime of Subject.
open System
open System.Threading
open System.Reactive.Subjects
open System.Reactive.Linq // NuGet, take System.Reactive.Core also.
open System.Reactive.Concurrency
type SerializedLogger() =
let _letters = new Subject<string>()
// create the mailbox processor
let agent = MailboxProcessor.Start(fun inbox ->
// the message processing function
let rec messageLoop (letters:Subject<string>) = async{
// read a message
let! msg = inbox.Receive()
printfn "mailbox: %d in Thread: %d" msg Thread.CurrentThread.ManagedThreadId
do! Async.Sleep 100
// write it to the log
match msg with
| 8 -> letters.OnCompleted() // like EOF.
| x -> letters.OnNext(sprintf "hello %d" x)
// loop to top
return! messageLoop letters
}
// start the loop
messageLoop _letters
)
// public interface
member this.Log msg = agent.Post msg
member this.Getletters() = _letters.AsObservable()
/// Print line with prefix 1.
let myPrint1 x = printfn "onNext - %s, Thread: %d" x Thread.CurrentThread.ManagedThreadId
// Actions
let onNext = new Action<string>(myPrint1)
let onCompleted = new Action(fun _ -> printfn "Complete")
[<EntryPoint>]
let main argv =
async{
printfn "Main is on: %d" Thread.CurrentThread.ManagedThreadId
// test
let logger = SerializedLogger()
logger.Log 1 // ignored?
let xObs = logger
.Getletters() //.Where( fun x -> x <> "hello 5")
.SubscribeOn(Scheduler.CurrentThread)
.ObserveOn(Scheduler.CurrentThread)
.ToEnumerable() // this
//.Subscribe(onNext, onCompleted) // or with Dispose()
[2..10] |> Seq.iter (logger.Log)
xObs |> Seq.iter myPrint1
while true
do
printfn "waiting"
System.Threading.Thread.Sleep(1000)
return 0
} |> Async.RunSynchronously // return an integer exit code
I have done similar things, but using the plain F# Event type rather than Subject. It basically lets you create IObservable and trigger its subscribes - much like your use of more complex Subject. The event-based version would be:
type SerializedLogger() =
let letterProduced = new Event<string>()
let lettersEnded = new Event<unit>()
let agent = MailboxProcessor.Start(fun inbox ->
let rec messageLoop (letters:Subject<string>) = async {
// Some code omitted
match msg with
| 8 -> lettersEnded.Trigger()
| x -> letterProduced.Trigger(sprintf "hello %d" x)
// ...
member this.Log msg = agent.Post msg
member this.LetterProduced = letterProduced.Publish
member this.LettersEnded = lettersEnded.Publish
The important differences are:
Event cannot trigger OnCompleted, so I instead exposed two separate events. This is quite unfortunate! Given that Subject is very similar to events in all other aspects, this might be a good reason for using subject instead of plain event.
The nice aspect of using Event is that it is a standard F# type, so you do not need any external dependencies in the agent.
I noticed your comment noting that the first call to Log was ignored. That's because you subscribe to the event handler only after this call happens. I think you could use ReplaySubject variation on the Subject idea here - it replays all events when you subscribe to it, so the one that happened earlier would not be lost (but there is a cost to caching).
In summary, I think using Subject is probably a good idea - it is essentially the same pattern as using Event (which I think is quite standard way of exposing notifications from agents), but it lets you trigger OnCompleted. I would probably not use ReplaySubject, because of the caching cost - you just have to make sure to subscribe before triggering any events.

Throttle a stream such that writes will be queued until a response was received to the previous write

Suppose I have a stream which only allows one request/response at a time but is used in several threads.
Requests/commands should be throttled such that a new request can only occur once
the previous request has been sent and a reply has been received.
The user would be able to do this
let! res = getResponse("longResp")
let! res2 = getResponse("shortResp")
and not really know or care about the throttle.
I have tried with a modified version of Tomas Petricek's Throttling Agent that allows async with return values, but this requires the user to call getResponse("..") |> Enqueue |> w.Post which is a recipe for disaster (in case they forget to do so).
Is there a good/idiomatic way of doing this in F#?
Then make it explicit in your type system that the returned type needs to be unwrapped with another function. So instead of returning an Async<'T> which as you pointed out can be called directly with Async.Start, rather return something like:
type Queuable<'T> = Queuable of Async<'T>
Then getResponse changes to return a Queueable:
let getResponse (s:string) =
let r =
async{
do! write to your stream
return! read from your stream
}
Queuable r
Provide a function that unwraps the Queuable:
let enqueue (Queuable q) = async{
return! processor.PostAndAsyncReply(fun replyChannel -> replyChannel,q)
}
The processor is an agent that simply runs the Async workflow. Something like this:
let processor = new MailboxProcessor<_>(fun inbox ->
let rec Loop() = async {
let! (r:AsyncReplyChannel<_>,job) = inbox.Receive()
let! res = job
r.Reply res
return! Loop()}
Loop())

Resources