I have the following tweet stream class. It has the TweetReceived event which can be used with the other components of my system.
It seems to work ok but I have the feeling that it's more complicated than it should be.
Are there any tools out there to give me this functionality without having to implement the mbox/event mechanism by myself?
Also, would you recommend using asyncSeq instead of IObservable?
Thanks!
type TweetStream (cfg: oauth.Config) =
    let token = TwitterToken.Token (cfg.accessToken,
                                    cfg.accessTokenSecret,
                                    cfg.appKey,
                                    cfg.appSecret)
    let stream = new SimpleStream("https://stream.twitter.com/1.1/statuses/sample.json")
    let event = new Event<_>()
    let agent = MailboxProcessor.Start(fun (mbox) ->
        let rec loop () =
            async {
                let! msg = mbox.Receive()
                do event.Trigger(msg)
                return! loop()
            }
        loop ())
    member x.TweetReceived = event.Publish
    member x.Start () =
        Task.Factory.StartNew(fun () -> stream.StartStream(token, agent.Post))
        |> ignore
    member x.Stop = stream.StopStream
UPDATE:
Thanks Thomas for the quick (as always) answer to the second question.
My first question may have been a little unclear, so I refactored the code to make the class AgentEvent visible, and I'll rephrase the question: is there an easier way to implement the logic in AgentEvent? Is this logic already implemented somewhere?
I'm asking this because it feels like a common usage pattern.
type AgentEvent<'t>() =
    let event = new Event<'t>()
    let agent = MailboxProcessor.Start(fun (mbox) ->
        let rec loop () =
            async {
                let! msg = mbox.Receive()
                do event.Trigger(msg)
                return! loop()
            }
        loop ())
    member x.Event = event.Publish
    member x.Post = agent.Post
type TweetStream (cfg: oauth.Config) =
    let token = TwitterToken.Token (cfg.accessToken,
                                    cfg.accessTokenSecret,
                                    cfg.appKey,
                                    cfg.appSecret)
    let stream = new SimpleStream("https://stream.twitter.com/1.1/statuses/sample.json")
    let agentEvent = AgentEvent()
    member x.TweetReceived = agentEvent.Event
    member x.Start () =
        Task.Factory.StartNew(fun () -> stream.StartStream(token, agentEvent.Post))
        |> ignore
    member x.Stop = stream.StopStream
I think that IObservable is the right abstraction for publishing the events. As for processing them, I would use either Reactive Extensions or F# Agents (MailboxProcessor), depending on what you want to do.
Note that F# automatically represents events as IObservable values (actually IEvent, but that inherits from observable), so you can use Reactive Extensions directly on TweetReceived.
What is the right representation?
The main point of asyncSeq is that it lets you control how quickly the data is generated. It is like async in that you have to start it to actually do the work and get a value, so it is useful if you can start some operation (e.g. download the next few bytes) to get the next value.
IObservable is useful when you have no control over the data source - when it just keeps producing values and you have no way to pause it - this seems more appropriate for tweets.
As for processing, I think that Reactive Extensions are nice when they already implement the operations you need. When you need to write some custom logic (that is not easily expressed in Rx), using Agent is a great way to write your own Rx-like functions.
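To illustrate the point about writing your own Rx-like functions with an agent, here is a minimal sketch that uses only FSharp.Core. The `batchBy` name and its behaviour (grouping values into fixed-size arrays) are assumptions made up for this example, not something from Rx or from the question; the agent keeps the partially filled batch as its private state and republishes complete batches through an `Event`.

```fsharp
open System

/// Hypothetical operator (not part of Rx or FSharp.Core): emits arrays of
/// `size` consecutive values taken from `source`.
let batchBy (size: int) (source: IObservable<'T>) : IObservable<'T[]> =
    let batchReady = Event<'T[]>()
    let agent =
        MailboxProcessor<'T>.Start(fun inbox ->
            // The items collected so far are the agent's private state.
            let rec loop (acc: 'T list) = async {
                let! item = inbox.Receive()
                let acc = item :: acc
                if List.length acc >= size then
                    batchReady.Trigger(acc |> List.rev |> List.toArray)
                    return! loop []
                else
                    return! loop acc }
            loop [])
    // Forward every source value to the agent. The subscription is never
    // disposed in this sketch.
    source |> Observable.add (fun item -> agent.Post item)
    batchReady.Publish :> IObservable<'T[]>

// Usage: batches are delivered asynchronously on whichever thread runs the agent.
let numbers = Event<int>()
(numbers.Publish :> IObservable<int>)
|> batchBy 2
|> Observable.add (fun batch -> printfn "batch: %A" batch)
[1 .. 4] |> List.iter (fun n -> numbers.Trigger n)
```

Because the agent processes one message at a time, the accumulator needs no locking even if the source fires from several threads.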
Let's say I have a class which inherits from a legacy API and overrides a virtual method which is called when something happens:
type MyClass() as this =
    let somethingObservable: IObservable<Something> = ...
    override _.OnSomething(s: Something) = ...
How can I translate each invocation of OnSomething into a notification of somethingObservable?
That's probably a simple question, but I could not find a way to do it properly (should I use ISubject, which is not advised?). I'd appreciate your help.
Using a subject like this is fine, and ensures correctness in the implementation.
Here's an example using FSharp.Control.Reactive, which gets you the idiomatic way of writing this.
type MyClass() =
    inherit Legacy()
    let somethingObservable =
        Subject.broadcast
    override _.OnSomething s =
        somethingObservable |> Subject.onNext s |> ignore
    member _.AsObservable =
        somethingObservable |> Observable.asObservable
You can also use new Subject<_> and its methods, same thing.
If you'd prefer not to take on the System.Reactive dependency in some assembly, F# also natively supports IObservable through events.
type MyClassEvt() =
    inherit Legacy()
    let event = new Event<_>()
    override _.OnSomething s =
        event.Trigger s
    member _.AsObservable =
        event.Publish :> IObservable<_>
I am a little curious about the code example below and what people think.
The idea was to read from a NetworkStream (~20 msg/s) and, instead of doing the work on the main thread, pass things to a MailboxProcessor to handle and get results back for bindings when done.
The usual way is to use PostAndReply, but I want to bind to a ListView or another control in C#. I have to do some magic with the last N items and filtering anyway.
Plus, Rx has some error handling.
The example below observes the numbers 2..10 and returns "hello X". On 8 it stops, as if it were EOF. I used ToEnumerable because otherwise the other thread finishes first, but it works with Subscribe as well.
What bothers me:
passing Subject(obj) around in recursion. I don't see any problems having around 3-4 of those. Good idea?
Lifetime of Subject.
open System
open System.Threading
open System.Reactive.Subjects
open System.Reactive.Linq // NuGet, take System.Reactive.Core also.
open System.Reactive.Concurrency
type SerializedLogger() =
    let _letters = new Subject<string>()
    // create the mailbox processor
    let agent = MailboxProcessor.Start(fun inbox ->
        // the message processing function
        let rec messageLoop (letters:Subject<string>) = async {
            // read a message
            let! msg = inbox.Receive()
            printfn "mailbox: %d in Thread: %d" msg Thread.CurrentThread.ManagedThreadId
            do! Async.Sleep 100
            // write it to the log
            match msg with
            | 8 -> letters.OnCompleted() // like EOF.
            | x -> letters.OnNext(sprintf "hello %d" x)
            // loop to top
            return! messageLoop letters
            }
        // start the loop
        messageLoop _letters
        )
    // public interface
    member this.Log msg = agent.Post msg
    member this.Getletters() = _letters.AsObservable()
/// Print line with prefix 1.
let myPrint1 x = printfn "onNext - %s, Thread: %d" x Thread.CurrentThread.ManagedThreadId
// Actions
let onNext = new Action<string>(myPrint1)
let onCompleted = new Action(fun _ -> printfn "Complete")
[<EntryPoint>]
let main argv =
    async {
        printfn "Main is on: %d" Thread.CurrentThread.ManagedThreadId
        // test
        let logger = SerializedLogger()
        logger.Log 1 // ignored?
        let xObs = logger
                    .Getletters() //.Where( fun x -> x <> "hello 5")
                    .SubscribeOn(Scheduler.CurrentThread)
                    .ObserveOn(Scheduler.CurrentThread)
                    .ToEnumerable() // this
                    //.Subscribe(onNext, onCompleted) // or with Dispose()
        [2..10] |> Seq.iter (logger.Log)
        xObs |> Seq.iter myPrint1
        while true do
            printfn "waiting"
            System.Threading.Thread.Sleep(1000)
        return 0
    } |> Async.RunSynchronously // return an integer exit code
I have done similar things, but using the plain F# Event type rather than Subject. It basically lets you create an IObservable and notify its subscribers, much like your use of the more complex Subject. The event-based version would be:
type SerializedLogger() =
    let letterProduced = new Event<string>()
    let lettersEnded = new Event<unit>()
    let agent = MailboxProcessor.Start(fun inbox ->
        // No Subject parameter is needed here: the loop triggers the events directly
        let rec messageLoop () = async {
            // Some code omitted
            match msg with
            | 8 -> lettersEnded.Trigger()
            | x -> letterProduced.Trigger(sprintf "hello %d" x)
            // ...
    member this.Log msg = agent.Post msg
    member this.LetterProduced = letterProduced.Publish
    member this.LettersEnded = lettersEnded.Publish
The important differences are:
Event cannot trigger OnCompleted, so I instead exposed two separate events. This is quite unfortunate! Given that Subject is very similar to events in all other aspects, this might be a good reason for using subject instead of plain event.
The nice aspect of using Event is that it is a standard F# type, so you do not need any external dependencies in the agent.
I noticed your comment noting that the first call to Log was ignored. That's because you subscribe to the event handler only after this call happens. I think you could use ReplaySubject variation on the Subject idea here - it replays all events when you subscribe to it, so the one that happened earlier would not be lost (but there is a cost to caching).
In summary, I think using Subject is probably a good idea. It is essentially the same pattern as using Event (which I think is quite a standard way of exposing notifications from agents), but it lets you trigger OnCompleted. I would probably not use ReplaySubject, because of the caching cost; you just have to make sure to subscribe before triggering any events.
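The point about subscribing before triggering can be reproduced without Rx or an agent. The following is a small sketch using only a plain F# Event; the names mirror the ones used above, but the snippet is an illustration rather than part of the original logger.

```fsharp
let letterProduced = Event<string>()
let received = ResizeArray<string>()

// Triggered before anyone has subscribed: the value is simply dropped.
letterProduced.Trigger "hello 1"

let subscription =
    letterProduced.Publish
    |> Observable.subscribe (fun letter -> received.Add letter)

// Triggered after subscribing: the handler sees it.
letterProduced.Trigger "hello 2"

subscription.Dispose()
printfn "%A" (List.ofSeq received) // prints ["hello 2"]
```

This is the same reason the first `logger.Log 1` call in the question appears to be ignored.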
How do I register and implement event handlers for .Net events within F#?
I reviewed this link but it seems a bit verbose.
Example:
namespace Core
open Lego.Ev3.Core
open Lego.Ev3.Desktop
type LegoExample() =
    let _brick = Brick(BluetoothCommunication("COM3"))
    _brick.Changed += OnBrickChanged
    let OnBrickChanged =
        // some logic goes here...
In F#, events are represented as IEvent<T> values which inherit from IObservable<T> and so they are just ordinary values - not a special language construct.
You can register a handler using the Add method:
type LegoExample() =
    let brick = Brick(BluetoothCommunication("COM3"))
    do brick.Changed.Add(fun e ->
        // some logic goes here...
        )
You need the do keyword here if you want to register the handler inside the constructor of LegoExample.
Alternatively, you can also use various functions from the Observable module - those implement basic functionality similar to the one provided by Rx:
type LegoExample() =
    let brick = Brick(BluetoothCommunication("COM3"))
    do brick.Changed
       |> Observable.filter (fun e -> ...)
       |> Observable.add (fun e ->
           // some logic goes here...
           )
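If the handler also needs to be removed later, `Observable.subscribe` can be used instead of `Observable.add`, since it returns an `IDisposable`. The sketch below is self-contained: `Sensor` is a hypothetical stand-in for a class like `Brick` that exposes an event, not part of the Lego API.

```fsharp
open System

// Hypothetical stand-in for a .NET class that exposes an event.
type Sensor() =
    let changed = Event<int>()
    member _.Changed = changed.Publish
    member _.Report (value: int) = changed.Trigger value

let sensor = Sensor()
let seen = ResizeArray<int>()

// Keep the IDisposable so the handler can be detached later.
let subscription =
    sensor.Changed
    |> Observable.filter (fun v -> v % 2 = 0)
    |> Observable.subscribe (fun v -> seen.Add v)

sensor.Report 1          // filtered out (odd)
sensor.Report 2          // handled
subscription.Dispose()   // unregister the handler
sensor.Report 4          // no longer observed

printfn "%A" (List.ofSeq seen) // prints [2]
```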
A common example used to illustrate asynchronous workflows in F# is retrieving multiple webpages in parallel. One such example is given at: http://en.wikibooks.org/wiki/F_Sharp_Programming/Async_Workflows Code shown here in case the link changes in the future:
open System.Text.RegularExpressions
open System.Net
let download url =
    let webclient = new System.Net.WebClient()
    webclient.DownloadString(url : string)
let extractLinks html = Regex.Matches(html, @"http://\S+")
let downloadAndExtractLinks url =
    let links = (url |> download |> extractLinks)
    url, links.Count
let urls =
    [@"http://www.craigslist.com/";
     @"http://www.msn.com/";
     @"http://en.wikibooks.org/wiki/Main_Page";
     @"http://www.wordpress.com/";
     @"http://news.google.com/";]
let pmap f l =
    seq { for a in l -> async { return f a } }
    |> Async.Parallel
    |> Async.Run
let testSynchronous() = List.map downloadAndExtractLinks urls
let testAsynchronous() = pmap downloadAndExtractLinks urls
let time msg f =
    let stopwatch = System.Diagnostics.Stopwatch.StartNew()
    let temp = f()
    stopwatch.Stop()
    printfn "(%f ms) %s: %A" stopwatch.Elapsed.TotalMilliseconds msg temp
let main() =
    printfn "Start..."
    time "Synchronous" testSynchronous
    time "Asynchronous" testAsynchronous
    printfn "Done."
main()
What I would like to know is how one should handle changes in global state such as loss of a network connection? Is there an elegant way to do this?
One could check the state of the network prior to making the Async.Parallel call, but the state could change during execution. Assuming what one wanted to do was pause execution until the network was available again rather than fail, is there a functional way to do this?
First of all, there is one issue with the example: it uses Async.Parallel to run multiple operations in parallel, but the operations themselves are not implemented asynchronously, so this will not avoid blocking an excessive number of threads in the thread pool.
Asynchronous. To make the code fully asynchronous, the download and downloadAndExtractLinks functions should be asynchronous too, so that you can use AsyncDownloadString of the WebClient:
let asyncDownload url = async {
    let webclient = new System.Net.WebClient()
    return! webclient.AsyncDownloadString(System.Uri(url : string)) }
let asyncDownloadAndExtractLinks url = async {
    let! html = asyncDownload url
    let links = extractLinks html
    return url, links.Count }
let pmap f l =
    seq { for a in l -> async { return! f a } }
    |> Async.Parallel
    |> Async.RunSynchronously
Retrying. Now, to answer the question: there is no built-in mechanism for handling errors such as network failure, so you will need to implement this logic yourself. The right approach depends on your situation. One common approach is to retry the operation a certain number of times and throw the exception only if it still does not succeed after, e.g., 10 attempts. You can write this as a primitive that takes another asynchronous workflow:
let rec asyncRetry times op = async {
    try
        return! op
    with e ->
        // reraise() cannot be called inside an async handler, so raise the caught exception
        if times <= 1 then return raise e
        else return! asyncRetry (times - 1) op }
Then you can change the main function to build a workflow that retries the download 10 times:
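let testAsynchronous() =
    // asyncRetry takes a workflow, so build one per URL from the asynchronous version
    pmap (fun url -> asyncRetry 10 (asyncDownloadAndExtractLinks url)) urls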
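Since the question asked about pausing until the network is available again, one possible variation is to wait between attempts instead of retrying immediately. The sketch below is an assumption layered on the retry idea above, not a built-in facility: `retryWithDelay` and its parameters are hypothetical names, and the fixed delay merely stands in for a real connectivity check.

```fsharp
/// Hypothetical helper: runs `op`, and on failure waits `delayMs`
/// milliseconds before trying again, up to `attempts` times in total.
let rec retryWithDelay (attempts: int) (delayMs: int) (op: Async<'T>) : Async<'T> = async {
    let! outcome = op |> Async.Catch
    match outcome with
    | Choice1Of2 value -> return value
    | Choice2Of2 error when attempts <= 1 -> return raise error
    | Choice2Of2 _ ->
        do! Async.Sleep delayMs
        return! retryWithDelay (attempts - 1) delayMs op }

// Usage with a fake operation that fails twice before succeeding.
let calls = ref 0
let flaky = async {
    calls.Value <- calls.Value + 1
    if calls.Value < 3 then failwith "network down"
    return "ok" }

let result = retryWithDelay 5 10 flaky |> Async.RunSynchronously
printfn "%s after %d calls" result calls.Value // prints "ok after 3 calls"
```

Using `Async.Catch` avoids a `try ... with` inside the workflow, so the recursive call stays in tail position.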
Shared state. Another problem is that Async.Parallel will only return once all the downloads have completed (if there is one faulty web site, you will have to wait). If you want to show the results as they come back, you will need something more sophisticated.
One nice way to do this is to use an F# agent: create an agent that stores the results obtained so far and can handle two messages, one that adds a new result and another that returns the current state. Then you can start multiple async tasks that will send their results to the agent and, in a separate async workflow, use polling to check the current status (and e.g. update the user interface).
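A minimal sketch of such an agent might look as follows. The message and function names (`AddResult`, `GetResults`, `createCollector`) are made up for this illustration, and the example URLs and counts are placeholder data rather than real download results.

```fsharp
/// The two messages described above: add a result, or ask for the current state.
type ResultMessage<'T> =
    | AddResult of 'T
    | GetResults of AsyncReplyChannel<'T list>

/// Hypothetical helper that starts an agent holding the results so far.
let createCollector<'T> () =
    MailboxProcessor<ResultMessage<'T>>.Start(fun inbox ->
        let rec loop (results: 'T list) = async {
            let! msg = inbox.Receive()
            match msg with
            | AddResult r ->
                return! loop (r :: results)
            | GetResults reply ->
                // Reply with the results in arrival order, then keep going.
                reply.Reply(List.rev results)
                return! loop results }
        loop [])

// Usage: workers post results as they finish; anyone can poll for a snapshot.
let collector = createCollector<string * int> ()
collector.Post(AddResult ("http://example.com/", 3))
collector.Post(AddResult ("http://example.org/", 5))
let snapshot = collector.PostAndReply GetResults
printfn "%A" snapshot
```

Messages are handled one at a time in the order they were posted, so the snapshot requested here already contains both results.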
I wrote an MSDN series about agents and also two articles for developerFusion that have plenty of code samples with F# agents.
Does anyone know of 'prior art' regarding the following subject?
I have data that takes a fair amount of time to load: historical levels for various stocks.
I would like to preload it somehow, to avoid latency when using my app.
However, preloading everything in one chunk at startup makes my app unresponsive at first, which is not user-friendly.
So I would like to not load my data up front, but while the user is not requesting anything new and is playing with what he already has, I would like to load it little by little in the background. So it is neither 'lazy' nor 'eager', more 'lazy when you need' and 'eager when you can', hence the acronym LWYNEWYC.
I have written the following, which seems to work, but I wonder whether there is a recognized and blessed approach for this kind of thing.
let r = LoggingFakeRepo () :> IQuoteRepository
r.getHisto "1" |> ignore //prints Getting histo for 1 when called
let rc = RepoCached (r) :> IQuoteRepository
rc.getHisto "1" |> ignore //prints Getting histo for 1 the first time only
let rcc = RepoCachedEager (r) :> IQuoteRepository
rcc.getHisto "100" |> ignore //prints Getting histo 1..100 by itself BUT
//prints Getting histo 100 immediately when called
And the classes
type IQuoteRepository =
    abstract getUnderlyings : string seq
    abstract getHisto : string -> string
type LoggingFakeRepo () =
    interface IQuoteRepository with
        member x.getUnderlyings =
            printfn "getting underlyings"
            [1 .. 100] |> List.map string :> _
        member x.getHisto udl =
            printfn "getting histo for %A" udl
            "I am a historical dataset in a disguised party"
type RepoCached (rep : IQuoteRepository) =
    let memoize f =
        let cache = new System.Collections.Generic.Dictionary<_, _>()
        fun x ->
            if cache.ContainsKey(x) then cache.[x]
            else let res = f x
                 cache.[x] <- res
                 res
    let udls = lazy (rep.getUnderlyings )
    let gethistom = memoize rep.getHisto
    interface IQuoteRepository with
        member x.getUnderlyings = udls.Force()
        member x.getHisto udl = gethistom udl
// getHisto returns a string, so the reply channel carries a string
type Message = string * AsyncReplyChannel<string>
type RepoCachedEager (rep : IQuoteRepository) =
    let udls = rep.getUnderlyings
    let agent = MailboxProcessor<Message>.Start(fun inbox ->
        let repocached = RepoCached (rep) :> IQuoteRepository
        let rec loop l =
            async {
                try
                    // wait forever once everything is preloaded, otherwise 50 ms
                    let timeout = if l |> List.isEmpty then -1 else 50
                    let! (udl, replyChannel) = inbox.Receive(timeout)
                    replyChannel.Reply(repocached.getHisto udl)
                    do! loop l
                with
                | :? System.TimeoutException ->
                    // no request arrived in time: preload the next underlying
                    let udl::xs = l
                    repocached.getHisto udl |> ignore
                    do! loop xs
            }
        loop (udls |> Seq.toList))
    interface IQuoteRepository with
        member x.getUnderlyings = udls
        member x.getHisto udl = agent.PostAndReply(fun reply -> udl, reply)
I like your solution. I think using agent to implement some background loading with a timeout is a great way to go - agents can nicely encapsulate mutable state, so it is clearly safe and you can encode the behaviour you want quite easily.
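As a side note, the timeout-driven loop can also be written with `MailboxProcessor.TryReceive`, which returns `None` on timeout instead of raising `TimeoutException`. The following is a self-contained sketch under that assumption; `CacheMessage`, `startPreloadingCache` and the `load` parameter are hypothetical names standing in for the repository types in the question.

```fsharp
open System.Collections.Generic

/// Request for one key, with a channel for the reply.
type CacheMessage = Get of string * AsyncReplyChannel<string>

/// Hypothetical helper: answers requests immediately and, whenever no
/// request arrives within 50 ms, preloads the next pending key.
let startPreloadingCache (load: string -> string) (keys: string list) =
    MailboxProcessor<CacheMessage>.Start(fun inbox ->
        // The cache is only touched from inside the agent, so no locking is needed.
        let cache = Dictionary<string, string>()
        let fetch key =
            match cache.TryGetValue key with
            | true, value -> value
            | _ ->
                let value = load key
                cache.[key] <- value
                value
        let rec loop (pending: string list) = async {
            // Wait indefinitely once there is nothing left to preload.
            let timeout = if List.isEmpty pending then -1 else 50
            let! msg = inbox.TryReceive(timeout)
            match msg, pending with
            | Some (Get (key, reply)), _ ->
                reply.Reply(fetch key)
                return! loop pending
            | None, next :: rest ->
                fetch next |> ignore
                return! loop rest
            | None, [] ->
                return! loop [] }
        loop keys)

// Usage with a fake loader.
let cacheAgent = startPreloadingCache (fun key -> "histo for " + key) [ "1"; "2"; "3" ]
let histo = cacheAgent.PostAndReply(fun reply -> Get ("3", reply))
printfn "%s" histo // prints "histo for 3"
```

Matching on the option also removes the incomplete pattern `let udl::xs = l` and keeps every recursive call in tail position.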
I think asynchronous sequences might be another useful abstraction (if I'm correct, they are available in FSharpX these days). An asynchronous sequence represents a computation that asynchronously produces more values, so they might be a good way to separate the data loader from the rest of the code.
I think you'll still need an agent to synchronize at some point, but you can nicely separate different concerns using async sequences.
The code to load the data might look something like this:
let loadStockPrices repo = asyncSeq {
    // TODO: Not sure how you detect that the repository has no more data...
    while true do
        // Get next item from the repository, preferably asynchronously!
        let! data = repo.AsyncGetNextHistoricalValue()
        // Return the value to the caller...
        yield data }
This code represents the data loader, and it separates it from the code that uses it. From the agent that consumes the data source, you can use AsyncSeq.iterAsync to consume the values and do something with them.
With iterAsync, the function that you specify as a consumer is asynchronous. It may block (e.g. using Sleep) and when it blocks, the source (that is, your loader) is also blocked. This is quite a nice implicit way to control the loader from the code that consumes the data.
A feature that is not in the library yet (but would be useful) is a partially eager evaluator that takes AsyncSeq<'T> and returns a new AsyncSeq<'T> but obtains a certain number of elements from the source as soon as possible and caches them (so that the consumer does not have to wait when it asks for a value, as long as the source can produce values fast enough).