I want a tool for testing Rx components that would work like this:
Given an order of events specified as a 'v seq and a key selector function (keySelector :: 'v -> 'k), I want to create a Map<'k, IObservable<'v>> with the guarantee that the grouped observables yield their values in the global order defined by that sequence.
For example:
makeObservables isEven [1;2;3;4;5;6]
...should produce
{ true : -2-4-6|,
false: 1-3-5| }
My attempt looks like this:
open System
open System.Reactive.Linq
open FSharp.Control.Reactive
let subscribeAfter (o1: IObservable<'a>) (o2 : IObservable<'b>) : IObservable<'b> =
    fun (observer : IObserver<'b>) ->
        let tempObserver = { new IObserver<'a> with
                                member this.OnNext x = ()
                                member this.OnError e = observer.OnError e
                                member this.OnCompleted () = o2 |> Observable.subscribeObserver observer |> ignore
                           }
        o1.Subscribe tempObserver
    |> Observable.Create
let makeObservables (keySelector : 'a -> 'k) (xs : 'a seq) : Map<'k, IObservable<'a>> =
    let makeDependencies : ('k * IObservable<'a>) seq -> ('k * IObservable<'a>) seq =
        let makeDep ((_, o1), (k2, o2)) = (k2, subscribeAfter o1 o2)
        Seq.pairwise
        >> Seq.map makeDep
    let makeObservable x = (keySelector x, Observable.single x)
    let firstItem =
        Seq.head xs
        |> makeObservable
        |> Seq.singleton
    let dependentObservables =
        xs
        |> Seq.map makeObservable
        |> makeDependencies
    dependentObservables
    |> Seq.append firstItem
    |> Seq.groupBy fst
    |> Seq.map (fun (k, obs) -> (k, obs |> Seq.map snd |> Observable.concatSeq))
    |> Map.ofSeq
[<EntryPoint>]
let main argv =
    let isEven x = (x % 2 = 0)
    let splits : Map<bool, IObservable<int>> =
        [1;2;3;4;5]
        |> makeObservables isEven
    use subscription =
        splits
        |> Map.toSeq
        |> Seq.map snd
        |> Observable.mergeSeq
        |> Observable.subscribe (printfn "%A")
    Console.ReadKey() |> ignore
    0 // return an integer exit code
...but the results are not as expected: the observed values are not in the global order.
Apparently the items in each group are yielded correctly, but when the groups are merged it behaves more like a concat than a merge.
The expected output is: 1 2 3 4 5
...but the actual output is 1 3 5 2 4
What am I doing wrong?
Thanks!
You describe wanting this:
{ true : -2-4-6|,
false: 1-3-5| }
But you're really creating this:
{ true : 246|,
false: 135| }
Since there are no time gaps between the items in the observables, the merge basically has a constant race condition. Rx guarantees that element 1 of a given sequence will fire before element 2 of that same sequence, but Merge offers no ordering guarantees across sequences in cases like this.
You need to introduce time gaps into your observables if you want Merge to be able to re-sequence in the original order.
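If the goal is only a deterministic test fixture, one alternative to time gaps is to push the items from a single driver into per-key subjects, so the global order is fixed by the driver loop rather than by Merge. The following is a minimal sketch of that different technique, not the approach described above; `SimpleSubject`, `makeDriven` and `run` are hypothetical names, and the hand-rolled subject stands in for a real Rx subject so that the example needs no packages.

```fsharp
open System
open System.Collections.Generic

// Hypothetical minimal subject: forwards each notification to its current subscribers.
type SimpleSubject<'a>() =
    let observers = List<IObserver<'a>>()
    member _.OnNext (x: 'a) = for o in observers.ToArray() do o.OnNext x
    member _.OnCompleted () = for o in observers.ToArray() do o.OnCompleted ()
    interface IObservable<'a> with
        member _.Subscribe (o: IObserver<'a>) =
            observers.Add o
            { new IDisposable with
                member _.Dispose () = observers.Remove o |> ignore }

// Builds one subject per key plus a `run` function that replays xs in order,
// sending each item to the subject of its key.
let makeDriven (keySelector: 'a -> 'k) (xs: 'a list) =
    let subjects =
        xs
        |> List.map keySelector
        |> List.distinct
        |> List.map (fun k -> k, SimpleSubject<'a>())
        |> Map.ofList
    let run () =
        for x in xs do
            subjects.[keySelector x].OnNext x
        for KeyValue (_, s) in subjects do
            s.OnCompleted ()
    let observables = subjects |> Map.map (fun _ s -> s :> IObservable<'a>)
    observables, run

// Usage: subscribe one recording observer to every group, then start the driver.
let received = ResizeArray<int>()
let recorder =
    { new IObserver<int> with
        member _.OnNext x = received.Add x
        member _.OnError _ = ()
        member _.OnCompleted () = () }
let observables, run = makeDriven (fun x -> x % 2 = 0) [1; 2; 3; 4; 5]
let subscriptions =
    observables |> Map.toList |> List.map (fun (_, o) -> o.Subscribe recorder)
run ()
// received now holds 1, 2, 3, 4, 5 in that order
```

Note that these subjects are hot: an observer that subscribes after `run ()` has been called sees nothing, so all subscriptions must be in place first.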
I'm a beginner in F#, and this is my first attempt at programming something serious. I'm sorry the code is a bit long, but there are some issues with mutability that I don't understand.
This is an implementation of the Karger MinCut Algorithm to calculate the mincut on a non-directed graph component. I won't discuss here how the algo works,
for more info https://en.wikipedia.org/wiki/Karger%27s_algorithm
What is important is it's a randomized algorithm, which is running a determined number of trial runs, and taking the "best" run.
I realize now that I could avoid a lot of the problems below if I did construct a specific function for each random trial, but I'd like to understand EXACTLY what is wrong in the implementation below.
I'm running the code on this simple graph (the mincut is 2 when we cut the graph
into 2 components (1,2,3,4) and (5,6,7,8) with only 2 edges between those 2 components)
3--4-----5--6
|\/| |\/|
|/\| |/\|
2--1-----7--8
the file simplegraph.txt should encode this graph as follows
(1st column = node number, other columns = links)
1 2 3 4 7
2 1 3 4
3 1 2 4
4 1 2 3 5
5 4 6 7 8
6 5 7 8
7 1 5 6 8
8 5 6 7
This code may still look too much like imperative programming; I'm sorry for that.
There is a main for i loop calling each trial.
The first execution (when i=1) looks smooth and perfect,
but I get a runtime error when i=2, because it looks like some variables,
like WG, are not reinitialized correctly, causing out-of-bounds errors.
WG, WG1 and WGmin are of type wgraphobj, which is a record of Dictionary objects.
WG1 is defined outside the main loop and I make no new assignments to WG1.
[but its type is mutable though, alas]
I first defined WG with the instruction
let mutable WG = WG1
then at the beginning of the for i loop,
I write
WG <- WG1
and then later, I modify the WG object in each trial to make some calculations.
When the trial is finished and we go to the next trial (i is increased), I want to reset WG to its initial state, i.e. the same as WG1.
But it seems it's not working, and I don't get why...
Here is the full code
MyModule.fs [some functions not necessary for execution]
namespace MyModule
module Dict =
    open System.Collections.Generic
    let toSeq d = d |> Seq.map (fun (KeyValue(k,v)) -> (k,v))
    let toArray (d:IDictionary<_,_>) = d |> toSeq |> Seq.toArray
    let toList (d:IDictionary<_,_>) = d |> toSeq |> Seq.toList
    let ofMap (m:Map<'k,'v>) = new Dictionary<'k,'v>(m) :> IDictionary<'k,'v>
    let ofList (l:('k * 'v) list) = new Dictionary<'k,'v>(l |> Map.ofList) :> IDictionary<'k,'v>
    let ofSeq (s:('k * 'v) seq) = new Dictionary<'k,'v>(s |> Map.ofSeq) :> IDictionary<'k,'v>
    let ofArray (a:('k * 'v) []) = new Dictionary<'k,'v>(a |> Map.ofArray) :> IDictionary<'k,'v>
Karger.fs
open MyModule.Dict
open System.IO
let x = File.ReadAllLines "\..\simplegraph.txt";;
// val x : string [] =
let splitAtTab (text:string)=
text.Split [|'\t';' '|]
let splitIntoKeyValue (s:seq<'T>) =
(Seq.head s, Seq.tail s)
let parseLine (line:string)=
line
|> splitAtTab
|> Array.filter (fun s -> not(s=""))
|> Array.map (fun s-> (int s))
|> Array.toSeq
|> splitIntoKeyValue
let y =
x |> Array.map parseLine
open System.Collections.Generic
// let graph = new Map <int, int array>
let graphD = new Dictionary<int,int seq>()
y |> Array.iter graphD.Add
let graphM = y |> Map.ofArray //immutable
let N = y.Length // number of nodes
let Nruns = 2
let remove_table = new Dictionary<int,bool>()
[for i in 1..N do yield (i,false)] |> List.iter remove_table.Add
// let remove_table = seq [|for a in 1 ..N -> false|] // plus court
let label_head_table = new Dictionary<int,int>()
[for i in 1..N do yield (i,i)] |> List.iter label_head_table.Add
let label = new Dictionary<int,int seq>()
[for i in 1..N do yield (i,[i])] |> List.iter label.Add
let mutable min_cut = 1000000
type wgraphobj =
{ Graph : Dictionary<int,int seq>
RemoveTable : Dictionary<int,bool>
Label : Dictionary<int,int seq>
LabelHead : Dictionary<int,int> }
let WG1 = {Graph = graphD;
RemoveTable = remove_table;
Label = label;
LabelHead = label_head_table}
let mutable WGmin = WG1
let IsNotRemoved x = //
match x with
| (i,false) -> true
| (i,true) -> false
let IsNotRemoved1 WG i = //
(i,WG.RemoveTable.[i]) |>IsNotRemoved
let GetLiveNode d =
let myfun x =
match x with
| (i,b) -> i
d |> toList |> List.filter IsNotRemoved |> List.map myfun
let rand = System.Random()
// subsets a dictionary given a sub_list of keys
let D_Subset (dict:Dictionary<'T,'U>) (sub_list:list<'T>) =
let z = Dictionary<'T,'U>() // create new empty dictionary
sub_list |> List.filter (fun k -> dict.ContainsKey k)
|> List.map (fun k -> (k, dict.[k]))
|> List.iter (fun s -> z.Add s)
z
// subsets a dictionary given a sub_list of keys to remove
let D_SubsetC (dict:Dictionary<'T,'U>) (sub_list:list<'T>) =
let z = dict
sub_list |> List.filter (fun k -> dict.ContainsKey k)
|> List.map (fun k -> (dict.Remove k)) |>ignore
z
// subsets a sequence by values in a sequence
let S_Subset (S:seq<'T>)(sub_list:seq<'T>) =
S |> Seq.filter (fun s-> Seq.exists (fun elem -> elem = s) sub_list)
let S_SubsetC (S:seq<'T>)(sub_list:seq<'T>) =
S |> Seq.filter (fun s-> not(Seq.exists (fun elem -> elem = s) sub_list))
[<EntryPoint>]
let main argv =
let mutable u = 0
let mutable v = 0
let mutable r = 0
let mutable N_cut = 1000000
let mutable cluster_A_min = seq [0]
let mutable cluster_B_min = seq [0]
let mutable WG = WG1
let mutable LiveNodeList = [0]
// when i = 2, i encounter problems with mutability
for i in 1 .. Nruns do
WG <- WG1
printfn "%d" i
for k in 1..(N-2) do
LiveNodeList <- GetLiveNode WG.RemoveTable
r <- rand.Next(0,N-k)
u <- LiveNodeList.[r] //selecting a live node
let uuu = WG.Graph.[u] |> Seq.map (fun s -> WG.LabelHead.[s] )
|> Seq.filter (IsNotRemoved1 WG)
|> Seq.distinct
let n_edge = uuu |> Seq.length
let x = rand.Next(1,n_edge)
let mutable ok = false //maybe we can take this out
while not(ok) do
// selecting the edge from node u
v <- WG.LabelHead.[Array.get (uuu |> Seq.toArray) (x-1)]
let vvv = WG.Graph.[v] |> Seq.map (fun s -> WG.LabelHead.[s] )
|> Seq.filter (IsNotRemoved1 WG)
|> Seq.distinct
let zzz = S_SubsetC (Seq.concat [uuu;vvv] |> Seq.distinct) [u;v]
WG.Graph.[u] <- zzz
let lab_u = WG.Label.[u]
let lab_v = WG.Label.[v]
WG.Label.[u] <- Seq.concat [lab_u;lab_v] |> Seq.distinct
if (k<N-1) then
WG.RemoveTable.[v]<-true
//updating Label_head for all members of Label.[v]
WG.LabelHead.[v]<- u
for j in WG.Label.[v] do
WG.LabelHead.[j]<- u
ok <- true
printfn "u= %d v=%d" u v
// end of for k in 1..(N-2)
// counting cuts
// u,v contain the 2 indexes of groupings
let cluster_A = WG.Label.[u]
let cluster_B = S_SubsetC (seq[for i in 1..N do yield i]) cluster_A // defined as complementary of A
// let WG2 = {Graph = D_Subset WG1.Graph (cluster_A |> Seq.toList)
// RemoveTable = remove_table
// Label = D_Subset WG1.Graph (cluster_A |> Seq.toList)
// LabelHead = label_head_table}
let cross_edge = // returns keyvalue pair (k,S')
let IsInCluster cluster (k,S) =
(k,S_Subset S cluster)
graphM |> toSeq |> Seq.map (IsInCluster cluster_B)
N_cut <-
cross_edge |> Seq.map (fun (k:int,v:int seq)-> Seq.length v)
|> Seq.sum
if (N_cut<min_cut) then
min_cut <- N_cut
WGmin <- WG
cluster_A_min <- cluster_A
cluster_B_min <- cluster_B
// end of for i in 1..Nruns
0 // return an integer exit code
Description of the algo (I don't think it's essential to solving my problem):
At each trial, there are several steps. At each step, we merge 2 nodes into 1 (effectively removing 1) and update the graph. We do that 6 times until there are only 2 nodes left, which we define as 2 clusters, and we look at the number of cross edges between those 2 clusters. If we are "lucky", those 2 clusters will be (1,2,3,4) and (5,6,7,8) and we find the right number of cuts.
At each step, the object WG is updated with the effects of merging 2 nodes,
with only the live nodes (the ones not eliminated as a result of merging 2 nodes) kept fully up to date.
WG.Graph is the updated graph
WG.Label contains the labels of the nodes which have been merged into the current node
WG.LabelHead contains the label of the node into which that node has been merged
WG.RemoveTable says if the node has been removed or not.
Thanks in advance for anyone willing to take a look at it !
"It seems not working" because wgraphobj is a reference type, which is allocated on the heap: WG and WG1 are two names for the same object. This means that when you're mutating the innards of WG, you're also mutating the innards of WG1, because they're the same innards.
This is precisely the kind of mess you get yourself into if you use mutable state. This is why people recommend not using it. In particular, your use of mutable dictionaries undermines the robustness of your algorithm. I recommend using F#'s own efficient immutable dictionary (called Map) instead.
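The aliasing can be reproduced in a few lines. The sketch below uses a cut-down record (`State` and `freshCopy` are hypothetical names, not from the question) to show that assignment copies only the reference, and that resetting between trials requires building new dictionaries. It assumes a shallow copy of each dictionary is enough, which holds when the stored values are themselves immutable.

```fsharp
open System.Collections.Generic

// Cut-down stand-in for wgraphobj: a record holding one mutable dictionary.
type State = { RemoveTable : Dictionary<int, bool> }

let initial = { RemoveTable = Dictionary<int, bool>() }
initial.RemoveTable.[1] <- false

// Assignment copies only the reference: both names point at the same dictionary.
let mutable working = initial
working.RemoveTable.[1] <- true
let seenThroughAlias = initial.RemoveTable.[1]   // true: initial was modified too

// Hypothetical helper: builds a new record around a copy of the dictionary.
let freshCopy (s: State) =
    { RemoveTable = Dictionary<int, bool>(s.RemoveTable :> IDictionary<int, bool>) }

initial.RemoveTable.[1] <- false
working <- freshCopy initial
working.RemoveTable.[1] <- true
let seenThroughCopy = initial.RemoveTable.[1]    // false: initial is untouched
```

With a helper like this, the reset at the top of each trial would copy every dictionary in the record instead of writing `WG <- WG1`.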
Now, in response to your comment about WG.Graph <- GraphD giving a compile error.
WG is mutable, but WG.Graph is not (although the contents of WG.Graph are again mutable). There is a difference; let me try to explain it.
WG is mutable in the sense that it points to some object of type wgraphobj, but in the course of your program you can make it point at another object of the same type.
WG.Graph, on the other hand, is a field packed inside WG. It points to some object of type Dictionary<_,_>, and you cannot make it point to another object. You can create a different wgraphobj in which the field Graph points to a different dictionary, but you cannot change where the field Graph of the original wgraphobj points.
In order to make the field Graph itself mutable, you can declare it as such:
type wgraphobj = {
    mutable Graph: Dictionary<int, int seq>
    ...
Then you will be able to mutate that field:
WG.Graph <- GraphD
Note that in this case you do not need to declare the value WG itself as mutable.
However, it seems to me that for your purposes you can actually go the way of creating a new instance of wgraphobj with the field Graph changed, and assigning it to the mutable reference WG:
WG <- { WG with Graph = GraphD }
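A small self-contained sketch of the copy-and-update form follows; the record `Holder` and the values `first`, `other` and `current` are made up for illustration. It shows that `{ r with ... }` builds a new record and leaves the original record pointing at its original dictionary.

```fsharp
open System.Collections.Generic

// Made-up record with an immutable field that refers to a mutable dictionary.
type Holder = { Graph : Dictionary<int, int list>; Name : string }

let first = { Graph = Dictionary<int, int list>(); Name = "first" }

let other = Dictionary<int, int list>()
other.[1] <- [2; 3]

// The mutable binding is redirected to a new record; `first` itself is unchanged.
let mutable current = first
current <- { current with Graph = other }

let currentUsesOther = System.Object.ReferenceEquals(current.Graph, other)
let firstStillEmpty = (first.Graph.Count = 0)
```

Fields not mentioned after `with` (here `Name`) are carried over as they are, so any dictionaries in those fields are still shared between the old and the new record.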
I'm trying to create a filter function that accepts two parameters, a sequence and a list, and returns all the items in the sequence except those whose value equals the A field of some item in the list.
type R = { A: string; B: int; ...}
let filter (xxx: seq<string>) (except: list<R>) =
    xxx
    |> Seq.filter (fun i ->
        // returns all the items in xxx which are not equal to any except.A
The simplest code would be:
type R = { A: string; B: int; }
let filter where except =
    let except' = except |> List.map (fun x -> x.A) |> Set.ofList
    where
    |> Seq.filter (not << except'.Contains)
Notes:
Since the computation only uses R.A, we retrieve these R.A values only once for performance reasons.
Converting the list to a Set eliminates duplicates, which would only degrade performance without affecting the final result.
Since the type of except' is inferred as Set<string>, we can use member method except'.Contains instead of Set.contains.
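As a quick check of the set-based approach, here is a self-contained sketch with sample data. The type annotations and the name `excluded` are additions for readability, and the input values are invented.

```fsharp
type R = { A: string; B: int }

// Same idea as above: collect the A values once, then filter against the set.
let filter (where: seq<string>) (except: R list) =
    let excluded = except |> List.map (fun r -> r.A) |> Set.ofList
    where |> Seq.filter (fun s -> not (excluded.Contains s))

let kept =
    filter ["a"; "b"; "c"; "b"] [ { A = "b"; B = 1 }; { A = "z"; B = 2 } ]
    |> List.ofSeq
// kept = ["a"; "c"]
```

Both occurrences of "b" are removed, and the order of the remaining items is preserved.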
I think one thing would be to do
let filter (xxx: seq<string>) (except: list<R>) =
    xxx
    |> Seq.filter (fun i -> except |> List.exists (fun t -> t.A = i) |> not)
Fluent LINQ implementation:
open System.Linq // needed for the Except and Where extension methods
let filter (where: seq<string>) except =
    let contains = set (where.Except(List.map (fun x -> x.A) except)) in
    where.Where contains.Contains
There is now Seq.except:
xs
|> Seq.except ys
// All xs that are not in ys
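Applied to the original question, `Seq.except` needs the A values projected out of the records first. The sketch below uses a hypothetical function name, `filterExcept`, and invented data. One behaviour worth knowing is that `Seq.except` returns distinct elements, so repeated items in the source are collapsed.

```fsharp
type R = { A: string; B: int }

// Seq.except takes the items to exclude first, then the source sequence.
let filterExcept (xs: seq<string>) (except: R list) =
    xs |> Seq.except (except |> List.map (fun r -> r.A))

let remaining =
    filterExcept ["a"; "b"; "c"; "a"] [ { A = "b"; B = 1 } ]
    |> List.ofSeq
// remaining = ["a"; "c"]: "b" is excluded and the repeated "a" appears only once
```

If duplicates in the source must be kept, the `Seq.filter` versions above are the safer choice.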
I have this code
let inline ProcessExpendableADGroups (input : ('a * SPUser) seq) =
input
|> Seq.filter (fun (_, u : SPUser) -> u.IsDomainGroup = true)
|> Seq.filter (fun (_, u : SPUser) -> ADUtility.IsADGroupExpandable u.LoginName = true)
|> List.ofSeq
|> List.iter(
fun ( li : 'a, u : SPUser) ->
let userList = ADUtility.GetUsers u.LoginName
if (Seq.length userList <= 500) then
userList
|> Seq.filter (fun l -> InfobarrierPolicy.IsUserInPolicy l "FW" = true)
|> Seq.iter (
fun ln ->
let x = ADUtility.GetNameAndEmail ln
let (email, name) = x.Value
SPUtility.CopyRoleAssignment li u.LoginName ln email name
li.Update()
)
SPUtility.RemoveRoleAssignment li u
)
list3
|> List.iter (
fun w ->
SPUtility.GetDirectAssignmentsforListItems w |> ProcessExpendableADGroups
SPUtility.GetDirectAssignmentsforFolders w |> ProcessExpendableADGroups
SPUtility.GetDirectAssignmentsforLists w |> ProcessExpendableADGroups
SPUtility.GetDirectAssignmentsforWeb w |> ProcessExpendableADGroups
)
Here the methods GetDirectAssignmentsforListItems returns a Sequence of tuples (SPListItem * SPUser)
GetDirectAssignmentsforWeb returns a sequence of tuples (SPWeb * SPUser).
I need to send this sequence to a function which does very similar processing on these items except that in the end I have to call a method called "Update" on these items.
I have written a method with Generic parameter but I am having a problem when I call Update on the generic parameter.
I am not able to constrain this parameter to say that the parameter must have a method called Update.
You can use member constraints and statically resolved type parameters to do so.
let inline ProcessExpendableADGroups (input : (^a * SPUser) seq) = //'
input
|> Seq.filter (fun (_, u) -> u.IsDomainGroup && ADUtility.IsADGroupExpandable u.LoginName)
|> Seq.iter(
fun (li, u) ->
let userList = ADUtility.GetUsers u.LoginName
if (Seq.length userList <= 500) then
userList
|> Seq.filter (fun l -> InfobarrierPolicy.IsUserInPolicy l "FW")
|> Seq.iter (
fun ln ->
let x = ADUtility.GetNameAndEmail ln
let (email, name) = x.Value
SPUtility.CopyRoleAssignment li u.LoginName ln email name
(^a : (member Update : unit -> unit) li) //'
)
SPUtility.RemoveRoleAssignment li u
)
There is also a series of helpful articles on the topic here.
A few improvements I have done on the function above:
A series of Seq.filter could be collapsed to one Seq.filter , and = true is always a code smell.
List.ofSeq and List.iter could be replaced by Seq.iter. When you use Seq.iter, a lazy sequence will be evaluated anyway.
Do not write redundant type annotations such as li: 'a and u: SPUser. Since you use piping and have type annotation for input, the type checker would be able to infer correct types.
The constraint just looks like this (it doesn't need to be at the method declaration, just where you use it):
(^a : (member Update : unit -> unit) t)
This will call a method called Update on the object t
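To see the member constraint in isolation, here is a minimal sketch with two unrelated stand-in classes. `Item`, `Web` and `callUpdate` are hypothetical and only mimic the shape of the SharePoint types, which share no common interface for `Update`.

```fsharp
// Two unrelated types that both happen to expose Update : unit -> unit.
type Item() =
    let mutable updates = 0
    member _.Updates = updates
    member _.Update () = updates <- updates + 1

type Web() =
    let mutable updates = 0
    member _.Updates = updates
    member _.Update () = updates <- updates + 1

// `inline` plus ^a lets the compiler resolve the Update call at each call site.
let inline callUpdate (x: ^a) =
    (^a : (member Update : unit -> unit) x)

let item = Item()
let web = Web()
callUpdate item
callUpdate item
callUpdate web
// item.Updates = 2, web.Updates = 1
```

Passing a value whose type has no suitable `Update` method is rejected at compile time, which is the guarantee the question asks for.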
I need to split a seq<'a> into a seq<seq<'a>> by an attribute of the elements: wherever this attribute equals a given value, the sequence must be split at that point. How can I do that in F#?
It would be nice to pass it a function that returns a bool saying whether to split at that item or not.
Sample:
Input sequence: seq: {1,2,3,4,1,5,6,7,1,9}
It should be split at every item that equals 1, so the result should be:
seq
{
seq{1,2,3,4}
seq{1,5,6,7}
seq{1,9}
}
All you're really doing is grouping--creating a new group each time a value is encountered.
let splitBy f input =
    let i = ref 0
    input
    |> Seq.map (fun x ->
        if f x then incr i
        !i, x)
    |> Seq.groupBy fst
    |> Seq.map (fun (_, b) -> Seq.map snd b)
Example
let items = seq [1;2;3;4;1;5;6;7;1;9]
items |> splitBy ((=) 1)
Again, shorter, with Stephen's nice improvements:
let splitBy f input =
    let i = ref 0
    input
    |> Seq.groupBy (fun x ->
        if f x then incr i
        !i)
    |> Seq.map snd
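For reference, here is the same counter-based grouping written with `.Value` on the ref cell, which, as far as I know, newer F# compilers suggest in place of `!` and `incr`. The name `counter` and the materialised `groups` value are additions for the example.

```fsharp
// Each element is keyed by how many split points have been seen so far.
let splitBy f input =
    let counter = ref 0
    input
    |> Seq.groupBy (fun x ->
        if f x then counter.Value <- counter.Value + 1
        counter.Value)
    |> Seq.map snd

let groups =
    [1; 2; 3; 4; 1; 5; 6; 7; 1; 9]
    |> splitBy ((=) 1)
    |> Seq.map List.ofSeq
    |> List.ofSeq
// groups = [[1; 2; 3; 4]; [1; 5; 6; 7]; [1; 9]]
```

Because `Seq.groupBy` has to read the whole input before it can return any group, this approach is not suitable for infinite sequences; elements that come before the first split point end up in a leading group of their own.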
Unfortunately, writing functions that work with sequences (the seq<'T> type) is a bit difficult. They do not nicely work with functional concepts like pattern matching on lists. Instead, you have to use the GetEnumerator method and the resulting IEnumerator<'T> type. This often makes the code quite imperative. In this case, I'd write the following:
let splitUsing special (input:seq<_>) = seq {
    use en = input.GetEnumerator()
    let finished = ref false
    let start = ref true
    let rec taking () = seq {
        if not (en.MoveNext()) then finished := true
        elif en.Current = special then start := true
        else
            yield en.Current
            yield! taking() }
    yield taking()
    while not (!finished) do
        yield Seq.concat [ Seq.singleton special; taking()] }
I wouldn't recommend using the functional style (e.g. using Seq.skip and Seq.head), because this is quite inefficient - it creates a chain of sequences that take value from other sequence and just return it (so there is usually O(N^2) complexity).
Alternatively, you could write this using a computation builder for working with IEnumerator<'T>, but that's not standard. You can find it here, if you want to play with it.
The following is an impure implementation but yields immutable sequences lazily:
let unflatten f s = seq {
    let buffer = ResizeArray()
    let flush() = seq {
        if buffer.Count > 0 then
            yield Seq.readonly (buffer.ToArray())
            buffer.Clear() }
    for item in s do
        if f item then yield! flush()
        buffer.Add(item)
    yield! flush() }
f is the function used to test whether an element should be a split point:
[1;2;3;4;1;5;6;7;1;9] |> unflatten (fun item -> item = 1)
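When the input is an in-memory list rather than a lazy sequence, the same split can be sketched without a mutable buffer by folding from the right. This is a different technique from the buffer above, offered only as an illustration; `splitList` is a hypothetical name.

```fsharp
// Walks the list from the right. An element joins the group in front of it
// unless that group already starts with a split point, in which case it opens a new group.
let splitList (f: 'a -> bool) (items: 'a list) : 'a list list =
    List.foldBack
        (fun x groups ->
            match groups with
            | (y :: _ as current) :: rest when not (f y) -> (x :: current) :: rest
            | _ -> [x] :: groups)
        items
        []

let parts = splitList (fun item -> item = 1) [1; 2; 3; 4; 1; 5; 6; 7; 1; 9]
// parts = [[1; 2; 3; 4]; [1; 5; 6; 7]; [1; 9]]
```

The trade-off is that the whole list is processed eagerly and the predicate may be evaluated more than once per element.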
Probably not the most efficient solution, but this works:
let takeAndSkipWhile f s = Seq.takeWhile f s, Seq.skipWhile f s
let takeAndSkipUntil f = takeAndSkipWhile (f >> not)
let rec splitOn f s =
    if Seq.isEmpty s then
        Seq.empty
    else
        let pre, post =
            if f (Seq.head s) then
                takeAndSkipUntil f (Seq.skip 1 s)
                |> fun (a, b) ->
                    Seq.append [Seq.head s] a, b
            else
                takeAndSkipUntil f s
        if Seq.isEmpty pre then
            Seq.singleton post
        else
            Seq.append [pre] (splitOn f post)
splitOn ((=) 1) [1;2;3;4;1;5;6;7;1;9] // int list is compatible with seq<int>
The type of splitOn is ('a -> bool) -> seq<'a> -> seq<seq<'a>>. I haven't tested it on many inputs, but it seems to work.
In case you are looking for something that works like a string split (i.e. the item on which the predicate returns true is not included), below is what I came up with. I tried to be as functional as possible :)
let fromEnum (input : 'a IEnumerator) =
    seq {
        while input.MoveNext() do
            yield input.Current
    }
let getMore (input : 'a IEnumerator) =
    if input.MoveNext() = false then None
    else Some ((input |> fromEnum) |> Seq.append [input.Current])
let splitBy (f : 'a -> bool) (input : 'a seq) =
    use s = input.GetEnumerator()
    let rec loop (acc : 'a seq seq) =
        match s |> getMore with
        | None -> acc
        | Some x ->
            [x |> Seq.takeWhile (f >> not) |> Seq.toList |> List.toSeq]
            |> Seq.append acc
            |> loop
    loop Seq.empty |> Seq.filter (Seq.isEmpty >> not)
seq [1;2;3;4;1;5;6;7;1;9;5;5;1]
|> splitBy ( (=) 1) |> printfn "%A"
If I have an array A and another bool array isChosen with the same length as A, how can I build a new array from the elements of A where isChosen is true? Something like A.[isChosen]? I cannot use Array.filter directly, since isChosen is not a function of A's elements and there is no Array.filteri like Array.mapi.
zip should help:
let l = [|1;2;3|]
let f = [|true; false; true|]
let r = [| for (v, f) in Seq.zip l f do if f then yield v|]
// or
let r = (l, f) ||> Seq.zip |> Seq.filter snd |> Seq.map fst |> Seq.toArray
Try the zip operator
Seq.zip A isChosen
|> Seq.filter snd
|> Seq.map fst
|> Array.ofSeq
This will create a sequence of tuples where one value is from A and the other is from isChosen. This will pair the values together and make it very easy to filter them out in a Seq.filter expression
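The same pipeline can stay entirely in the Array module. Below is a minimal sketch wrapped in a reusable function; the name `pick` is hypothetical.

```fsharp
// Pairs each element with its flag, keeps the flagged pairs, then drops the flags.
let pick (items: 'a[]) (isChosen: bool[]) : 'a[] =
    Array.zip items isChosen
    |> Array.filter snd
    |> Array.map fst

let chosen = pick [| 1; 2; 3 |] [| true; false; true |]
// chosen = [|1; 3|]
```

One design difference to be aware of: `Array.zip` raises an exception when the two arrays differ in length, whereas `Seq.zip` silently stops at the end of the shorter input. For a mask that is supposed to match the data exactly, the stricter behaviour is arguably the safer default.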
It's not as elegant or 'functional' as the other answers, but every once in a while I like a gentle reminder that you can use loops and array indices in F#:
let A = [|1;2;3|]
let isChosen = [|true; false; true|]
let r = [| for i in 0..A.Length-1 do
             if isChosen.[i] then
                 yield A.[i] |]
printfn "%A" r
:)
And here are two more ways, just to demonstrate (even) more F# library functions:
let A = [|1;2;3|]
let isChosen = [|true;false;true|]
let B = Seq.map2 (fun x b -> if b then Some x else None) A isChosen
        |> Seq.choose id
        |> Seq.toArray
let C = Array.foldBack2 (fun x b acc -> if b then x::acc else acc) A isChosen []
        |> List.toArray
My personal favorite for understandability (and therefore maintainability): desco's answer
let r = [| for (v, f) in Seq.zip l f do if f then yield v|]