Trying to solve task with
Code
open System
let rec distribute e = function
| [] -> [[e]]
| x::xs' as xs -> (e::xs)::[for xs in distribute e xs' -> x::xs]
let rec permute = function
| [] -> [[]]
| e::xs -> List.collect (distribute e) (permute xs)
let MoreRule (a: int) (b: int) (c: int) (d: int) =
let permutations = permute [a;b;c;d]
let sums = permutations |> List.map(fun x -> x.[0]*x.[1] + x.[2]*x.[3])
let diffs = permutations |> List.map(fun x -> x.[0]*x.[1] - x.[2]*x.[3])
List.append sums diffs
|> List.distinct
|> List.filter(fun x -> x>0)
|> List.sort
let factor number list = [
for i in list do
let t = Math.Floor(Math.Sqrt((number-Math.Pow((float)i,2.0)))) |> int
if (List.exists ((=)t) list) then
yield [|i;t|]
]
let prod2Sum (a: int) (b: int) (c: int) (d: int): int[] list =
let number = (float)(((a*a)+(b*b))*((c*c)+(d*d)))
let coefficients = MoreRule a b c d
factor number coefficients
|> List.map(fun arr -> if (arr.[0]>arr.[1]) then [|arr.[1];arr.[0]|] else [|arr.[0];arr.[1]|])
|> List.distinct
When clicking on Attemp button on page i get response Failed Exit Code 1
Expected: [[|75; 104|]; [|85; 96|]] Actual: []
But on my local PC prod2Sum 4 5 20 1 return proper result [[|75; 104|]; [|85; 96|]]
The main question what I am doing wrong? How to improve this code? Thank you!
I do not exactly know what is wrong with your code - to help with that, you probably need to explain more about how it is supposed to work - but I was able to replicate the case for which your code fails. If you run your prod2Sum function as follows, then it returns an empty list:
prod2Sum 1 20 -4 -5
The expected result by CodeWars is [[|75; 104|]; [|85; 96|]]. I did not figure out how your code is trying to solve the problem, but some experimentation suggests that negative values get filtered out in MoreRule by the following line:
|> List.filter(fun x -> x > 0)
Removing this does not quite work - I get the right numbers, but some are negative. Replacing this with List.map abs works in this example, but fails for some other inputs.
For the record, an easy way to find what's going wrong is to add some logging to your prod2Sum function:
let prod2Sum (a: int) (b: int) (c: int) (d: int): int[] list =
let number = (float)(((a*a)+(b*b))*((c*c)+(d*d)))
let coefficients = MoreRule a b c d
let res =
factor number coefficients
|> List.map(fun arr -> if (arr.[0]>arr.[1]) then [|arr.[1];arr.[0]|] else [|arr.[0];arr.[1]|])
|> List.distinct
if res = [] then
printfn "%A" (a,b,c,d) // Log inputs for which we returned wrong result
r
With this, you should be able to find out what other test cases are failing.
Related
I have been doing a CodeWars exercise which can also be seen at dev.to.
The essence of it is:
There is a line for the self-checkout machines at the supermarket. Your challenge is to write a function that calculates the total amount of time required for the rest of the customers to check out!
INPUT
customers : an array of positive integers representing the line. Each integer represents a customer, and its value is the amount of time they require to check out.
n : a positive integer, the number of checkout tills.
RULES
There is only one line serving many machines, and
The order of the line never changes, and
The front person in the line (i.e. the first element in the array/list) proceeds to a machine as soon as it becomes free.
OUTPUT
The function should return an integer, the total time required.
The answer I came up with works - but it is highly imperative.
open System.Collections.Generic
open System.Linq
let getQueueTime (customerArray: int list) n =
let mutable d = new Dictionary<string,int>()
for i in 1..n do
d.Add(sprintf "Line%d" <| i, 0)
let getNextAvailableSupermarketLineName(d:Dictionary<string,int>) =
let mutable lowestValue = -1
let mutable lineName = ""
for myLineName in d.Keys do
let myValue = d.Item(myLineName)
if lowestValue = -1 || myValue <= lowestValue then
lowestValue <- myValue
lineName <- myLineName
lineName
for x in customerArray do
let lineName = getNextAvailableSupermarketLineName d
let lineTotal = d.Item(lineName)
d.Item(lineName) <- lineTotal + x
d.Values.Max()
So my question is ... is this OK F# code or should it be written in a functional way? And if the latter, how? (I started off trying to do it functionally but didn't get anywhere).
is this OK F# code or should it be written in a functional way?
That's a subjective question, so can't be answered. I'm assuming, however, that since you're doing an exercise, it's in order to learn. Learning functional programming takes years for most people (it did for me), but F# is a great language because it enables you learn gradually.
You can, however, simplify the algorithm. Think of a till as a number. The number represents the instant it's ready. At the beginning, you initialise them all to 0:
let tills = List.replicate n 0
where n is the number of tills. At the beginning, they're all ready at time 0. If, for example, n is 3, the tills are:
> List.replicate 3 0;;
val it : int list = [0; 0; 0]
Now you consider the next customer in the line. For each customer, you have to pick a till. You pick the one that is available first, i.e. with the lowest number. Then you need to 'update' the list of counters.
In order to do that, you'll need a function to 'update' a list at a particular index, which isn't part of the base library. You can define it yourself, however:
module List =
let set idx v = List.mapi (fun i x -> if i = idx then v else x)
For example, if you want to 'update' the second element to 3, you can do it like this:
> List.replicate 3 0 |> List.set 1 3;;
val it : int list = [0; 3; 0]
Now you can write a function that updates the set of tills given their current state and a customer (represented by a duration, which is also a number).
let next tills customer =
let earliestTime = List.min tills
let idx = List.findIndex (fun c -> earliestTime = c) tills
List.set idx (earliestTime + customer) tills
First, the next function finds the earliestTime in tills by using List.min. Then it finds the index of that value. Finally, it 'updates' that till by adding its current state to the customer duration.
Imagine that you have two tills and the customers [2;3;10]:
> List.replicate 2 0;;
val it : int list = [0; 0]
> List.replicate 2 0 |> fun tills -> next tills 2;;
val it : int list = [2; 0]
> List.replicate 2 0 |> fun tills -> next tills 2 |> fun tills -> next tills 3;;
val it : int list = [2; 3]
> List.replicate 2 0 |> fun tills -> next tills 2 |> fun tills -> next tills 3
|> fun tills -> next tills 10;;
val it : int list = [12; 3]
You'll notice that you can keep calling the next function for all the customers in the line. That's called a fold. This gives you the final state of the tills. The final step is to return the value of the till with the highest value, because that represents the time it finished. The overall function, then, is:
let queueTime line n =
let next tills customer =
let earliestTime = List.min tills
let idx = List.findIndex (fun c -> earliestTime = c) tills
List.set idx (earliestTime + customer) tills
let tills = List.replicate n 0
let finalState = List.fold next tills line
List.max finalState
Here's some examples, taken from the original exercise:
> queueTime [5;3;4] 1;;
val it : int = 12
> queueTime [10;2;3;3] 2;;
val it : int = 10
> queueTime [2;3;10] 2;;
val it : int = 12
This solution is based entirely on immutable data, and all functions are pure, so that's a functional solution.
Here is a version that resembles your version, with all the mutability removed:
let getQueueTime (customerArray: int list) n =
let updateWith f key map =
let v = Map.find key map
map |> Map.add key (f v)
let initialLines = [1..n] |> List.map (fun i -> sprintf "Line%d" i, 0) |> Map.ofList
let getNextAvailableSupermarketLineName(d:Map<string,int>) =
let lowestLine = d |> Seq.minBy (fun l -> l.Value)
lowestLine.Key
let lines =
customerArray
|> List.fold (fun linesState x ->
let lineName = getNextAvailableSupermarketLineName linesState
linesState |> updateWith (fun l -> l + x) lineName) initialLines
lines |> Seq.map (fun l -> l.Value) |> Seq.max
getQueueTime [5;3;4] 1 |> printfn "%i"
Those loops with mutable "outer state" can be swapped for either recursive functions or folds/reduce, here I suspect recursive functions would be nicer.
I've swapped out Dictionary for the immutable Map, but it feels like more trouble than it's worth here.
Update - here is a compromise solution I think reads well:
let getQueueTime (customerArray: int list) n =
let d = [1..n] |> List.map (fun i -> sprintf "Line%d" i, 0) |> dict
let getNextAvailableSupermarketLineName(d:IDictionary<string,int>) =
let lowestLine = d |> Seq.minBy (fun l -> l.Value)
lowestLine.Key
customerArray
|> List.iter (fun x ->
let lineName = getNextAvailableSupermarketLineName d
d.Item(lineName) <- d.Item(lineName) + 1)
d.Values |> Seq.max
getQueueTime [5;3;4] 1 |> printfn "%i"
I believe there is a more natural functional solution if you approach it freshly, but I wanted to evolve your current solution.
This is less an attempt at answering than an extended comment on Mark Seemann's otherwise excellent answer. If we do not restrict ourselves to standard library functions, the slightly cumbersome determination of the index with List.findIndex can be avoided. Instead, we may devise a function that replaces the first occurrence of a value in a list with a new value.
The implementation of our bespoke List.replace involves recursion, with an accumulator to hold the values before we encounter the first occurrence. When found, the accumulator needs to be reversed and also to have the new value and the tail of the original list appended. Both of this can be done in one operation: List.fold being fed the new value and tail of the original list as initial state while the elements of the accumulator are prepended in the loop, thereby restoring their order.
module List =
// Replace the first occurrence of a specific object in a list
let replace oldValue newValue source =
let rec aux acc = function
| [] -> List.rev acc
| x::xs when x = oldValue ->
(newValue::xs, acc)
||> List.fold (fun xs x -> x::xs)
| x::xs -> aux (x::acc) xs
aux [] source
let queueTime customers n =
(List.init n (fun _ -> 0), customers)
||> List.fold (fun xs customer ->
let x = List.min xs
List.replace x (x + customer) xs )
|> List.max
queueTime [5;3;4] 1 // val it : int = 12
queueTime [10;2;3;3] 2 // val it : int = 10
queueTime [2;3;10] 2 // val it : int = 12
I'm trying to write a function in F# for multiplying 2 matrices together. It uses a transpose function that finds the transpose of a given matrix.
let rec transpose = function
| [] -> failwith "Error, no matrix supplied"
| xs when List.head xs = [] -> []
| xs -> List.map (fun n -> List.head n) xs :: transpose
(List.map(fun n -> List.tail n) xs);;
let rec multiply = function
|([], []) -> []
|([], ys) -> failwith "Empty matrix"
|(xs, []) -> failwith "empty matrix"
|(x::xs, ys) -> let t = transpose ys
List.map(fun n -> inner x n) t :: multiply(xs,ys);;
No matter what the input is, my multiply function always executes "failwith" if I were to return with [0] however, instead of failwith, the function works perfectly. But I feel 0 is incorrect when multiplying a matrix with an empty one so I want to fix this and be able to output error if that is the case.
Your problem is in the recursion - you do:
multiply(xs,ys)
This will happen until xs=[] when you will get a fail.
The best solution is probably to have a wrapper function - something like
let multiply a b =
check a
check b
let rec actualmultiply = ...
actualmultiply a b
how do you convert an obj list to int type. I am trying to add two lists using a map function below but it doesn't work on obj lists.
let query f=
seq{
let cmd = new OleDbCommand( "SELECT * FROM F" );
let conn = new OleDbConnection( #"Provider=Microsoft.ACE.OLEDB.12.0;
Data Source=D:\Users\df\Documents\Vfolio.accdb;
Persist Security Info=False;" )
conn.Open()
let DAdapt = new OleDbDataAdapter("SELECT * FROM F",conn)
let DTab = new DataSet()
let i= DAdapt.Fill(DTab)
let rowCol = DTab.Tables.[0].Rows
let rowCount = rowCol.Count
for i in 0 .. (rowCount - 1) do
yield f (rowCol.[i])
}
let u= query(fun row -> row.[0])
let a= List.ofSeq u
let v=query(fun row -> row.[1])
let b= List.ofSeq v
let c = List.map2 (fun x y-> x + y) a b
error msg: The type 'obj' does not support the operator '+'
Because row.[i] returns type obj, your u and v become seq<obj>, and thus your a and b become type List<obj>, and therefore x and y are inferred to have type obj, and of course, you can't add two objs, which is exactly what the compiler tells you.
If you are sure that row.[0] and row.[1] are numbers of some kind, you should apply the appropriate cast, for example:
let u= query(fun row -> row.[0] :?> int)
let a= List.ofSeq u
let v=query(fun row -> row.[1] :?> int)
let b= List.ofSeq v
let c = List.map2 (fun x y-> x + y) a b
You can apply this cast in other places, too, depending on your taste and requirements, for example:
let c = List.map2 (fun x y-> (x :?> int) + (y :?> int)) a b
Or:
let a= u |> Seq.cast<int> |> List.ofSeq
let b= v |> Seq.cast<int> |> List.ofSeq
But I like the first example best, because it applies the cast at the earliest known point and results in the least amount of extra code.
But beware: if row.[0] turns out to be not an int at runtime, you will get an InvalidCastException.
P.S. In your List.map2 call, you could specify (+) directly instead of wrapping it in an extra lambda:
List.map2 (+) a b
P.P.S Also, it seems that your List.ofSeq calls are wasteful, for Seq also has a map2:
let u = query(fun row -> row.[0] :?> int)
let v = query(fun row -> row.[1] :?> int)
let c = Seq.map2 (+) u v |> List.ofSeq
P.P.P.S Also, have you noticed that each of the two calls to query generates its own DB connection, command, adapter, and dataset? Did you intend this or did you mean to only have one connection and then fetch different columns from the result? If so, you should only call query once:
let c = query( fun row -> (row.[0] :?> int) + (row.[1] :?> int) ) |> List.ofSeq
Disclosure: this came up in FsCheck, an F# random testing framework I maintain. I have a solution, but I do not like it. Moreover, I do not understand the problem - it was merely circumvented.
A fairly standard implementation of (monadic, if we're going to use big words) sequence is:
let sequence l =
let k m m' = gen { let! x = m
let! xs = m'
return (x::xs) }
List.foldBack k l (gen { return [] })
Where gen can be replaced by a computation builder of choice. Unfortunately, that implementation consumes stack space, and so eventually stack overflows if the list is long enough.The question is: why? I know in principle foldBack is not tail recursive, but the clever bunnies of the F# team have circumvented that in the foldBack implementation. Is there a problem in the computation builder implementation?
If I change the implementation to the below, everything is fine:
let sequence l =
let rec go gs acc size r0 =
match gs with
| [] -> List.rev acc
| (Gen g)::gs' ->
let r1,r2 = split r0
let y = g size r1
go gs' (y::acc) size r2
Gen(fun n r -> go l [] n r)
For completeness, the Gen type and computation builder can be found in the FsCheck source
Building on Tomas's answer, let's define two modules:
module Kurt =
type Gen<'a> = Gen of (int -> 'a)
let unit x = Gen (fun _ -> x)
let bind k (Gen m) =
Gen (fun n ->
let (Gen m') = k (m n)
m' n)
type GenBuilder() =
member x.Return(v) = unit v
member x.Bind(v,f) = bind f v
let gen = GenBuilder()
module Tomas =
type Gen<'a> = Gen of (int -> ('a -> unit) -> unit)
let unit x = Gen (fun _ f -> f x)
let bind k (Gen m) =
Gen (fun n f ->
m n (fun r ->
let (Gen m') = k r
m' n f))
type GenBuilder() =
member x.Return v = unit v
member x.Bind(v,f) = bind f v
let gen = GenBuilder()
To simplify things a bit, let's rewrite your original sequence function as
let rec sequence = function
| [] -> gen { return [] }
| m::ms -> gen {
let! x = m
let! xs = sequence ms
return x::xs }
Now, sequence [for i in 1 .. 100000 -> unit i] will run to completion regardless of whether sequence is defined in terms of Kurt.gen or Tomas.gen. The issue is not that sequence causes a stack overflow when using your definitions, it's that the function returned from the call to sequence causes a stack overflow when it is called.
To see why this is so, let's expand the definition of sequence in terms of the underlying monadic operations:
let rec sequence = function
| [] -> unit []
| m::ms ->
bind (fun x -> bind (fun xs -> unit (x::xs)) (sequence ms)) m
Inlining the Kurt.unit and Kurt.bind values and simplifying like crazy, we get
let rec sequence = function
| [] -> Kurt.Gen(fun _ -> [])
| (Kurt.Gen m)::ms ->
Kurt.Gen(fun n ->
let (Kurt.Gen ms') = sequence ms
(m n)::(ms' n))
Now it's hopefully clear why calling let (Kurt.Gen f) = sequence [for i in 1 .. 1000000 -> unit i] in f 0 overflows the stack: f requires a non-tail-recursive call to sequence and evaluation of the resulting function, so there will be one stack frame for each recursive call.
Inlining Tomas.unit and Tomas.bind into the definition of sequence instead, we get the following simplified version:
let rec sequence = function
| [] -> Tomas.Gen (fun _ f -> f [])
| (Tomas.Gen m)::ms ->
Tomas.Gen(fun n f ->
m n (fun r ->
let (Tomas.Gen ms') = sequence ms
ms' n (fun rs -> f (r::rs))))
Reasoning about this variant is tricky. You can empirically verify that it won't blow the stack for some arbitrarily large inputs (as Tomas shows in his answer), and you can step through the evaluation to convince yourself of this fact. However, the stack consumption depends on the Gen instances in the list that's passed in, and it is possible to blow the stack for inputs that aren't themselves tail recursive:
// ok
let (Tomas.Gen f) = sequence [for i in 1 .. 1000000 -> unit i]
f 0 (fun list -> printfn "%i" list.Length)
// not ok...
let (Tomas.Gen f) = sequence [for i in 1 .. 1000000 -> Gen(fun _ f -> f i; printfn "%i" i)]
f 0 (fun list -> printfn "%i" list.Length)
You're correct - the reason why you're getting a stack overflow is that the bind operation of the monad needs to be tail-recursive (because it is used to aggregate values during folding).
The monad used in FsCheck is essentially a state monad (it keeps the current generator and some number). I simplified it a bit and got something like:
type Gen<'a> = Gen of (int -> 'a)
let unit x = Gen (fun n -> x)
let bind k (Gen m) =
Gen (fun n ->
let (Gen m') = k (m n)
m' n)
Here, the bind function is not tail-recursive because it calls k and then does some more work. You can change the monad to be a continuation monad. It is implemented as a function that takes the state and a continuation - a function that is called with the result as an argument. For this monad, you can make bind tail recursive:
type Gen<'a> = Gen of (int -> ('a -> unit) -> unit)
let unit x = Gen (fun n f -> f x)
let bind k (Gen m) =
Gen (fun n f ->
m n (fun r ->
let (Gen m') = k r
m' n f))
The following example will not stack overflow (and it did with the original implementation):
let sequence l =
let k m m' =
m |> bind (fun x ->
m' |> bind (fun xs ->
unit (x::xs)))
List.foldBack k l (unit [])
let (Gen f) = sequence [ for i in 1 .. 100000 -> unit i ]
f 0 (fun list -> printfn "%d" list.Length)
Background:
I have a sequence of contiguous, time-stamped data. The data-sequence has gaps in it where the data is not contiguous. I want create a method to split the sequence up into a sequence of sequences so that each subsequence contains contiguous data (split the input-sequence at the gaps).
Constraints:
The return value must be a sequence of sequences to ensure that elements are only produced as needed (cannot use list/array/cacheing)
The solution must NOT be O(n^2), probably ruling out a Seq.take - Seq.skip pattern (cf. Brian's post)
Bonus points for a functionally idiomatic approach (since I want to become more proficient at functional programming), but it's not a requirement.
Method signature
let groupContiguousDataPoints (timeBetweenContiguousDataPoints : TimeSpan) (dataPointsWithHoles : seq<DateTime * float>) : (seq<seq< DateTime * float >>)= ...
On the face of it the problem looked trivial to me, but even employing Seq.pairwise, IEnumerator<_>, sequence comprehensions and yield statements, the solution eludes me. I am sure that this is because I still lack experience with combining F#-idioms, or possibly because there are some language-constructs that I have not yet been exposed to.
// Test data
let numbers = {1.0..1000.0}
let baseTime = DateTime.Now
let contiguousTimeStamps = seq { for n in numbers ->baseTime.AddMinutes(n)}
let dataWithOccationalHoles = Seq.zip contiguousTimeStamps numbers |> Seq.filter (fun (dateTime, num) -> num % 77.0 <> 0.0) // Has a gap in the data every 77 items
let timeBetweenContiguousValues = (new TimeSpan(0,1,0))
dataWithOccationalHoles |> groupContiguousDataPoints timeBetweenContiguousValues |> Seq.iteri (fun i sequence -> printfn "Group %d has %d data-points: Head: %f" i (Seq.length sequence) (snd(Seq.hd sequence)))
I think this does what you want
dataWithOccationalHoles
|> Seq.pairwise
|> Seq.map(fun ((time1,elem1),(time2,elem2)) -> if time2-time1 = timeBetweenContiguousValues then 0, ((time1,elem1),(time2,elem2)) else 1, ((time1,elem1),(time2,elem2)) )
|> Seq.scan(fun (indexres,(t1,e1),(t2,e2)) (index,((time1,elem1),(time2,elem2))) -> (index+indexres,(time1,elem1),(time2,elem2)) ) (0,(baseTime,-1.0),(baseTime,-1.0))
|> Seq.map( fun (index,(time1,elem1),(time2,elem2)) -> index,(time2,elem2) )
|> Seq.filter( fun (_,(_,elem)) -> elem <> -1.0)
|> PSeq.groupBy(fst)
|> Seq.map(snd>>Seq.map(snd))
Thanks for asking this cool question
I translated Alexey's Haskell to F#, but it's not pretty in F#, and still one element too eager.
I expect there is a better way, but I'll have to try again later.
let N = 20
let data = // produce some arbitrary data with holes
seq {
for x in 1..N do
if x % 4 <> 0 && x % 7 <> 0 then
printfn "producing %d" x
yield x
}
let rec GroupBy comp (input:LazyList<'a>) : LazyList<LazyList<'a>> =
LazyList.delayed (fun () ->
match input with
| LazyList.Nil -> LazyList.cons (LazyList.empty()) (LazyList.empty())
| LazyList.Cons(x,LazyList.Nil) ->
LazyList.cons (LazyList.cons x (LazyList.empty())) (LazyList.empty())
| LazyList.Cons(x,(LazyList.Cons(y,_) as xs)) ->
let groups = GroupBy comp xs
if comp x y then
LazyList.consf
(LazyList.consf x (fun () ->
let (LazyList.Cons(firstGroup,_)) = groups
firstGroup))
(fun () ->
let (LazyList.Cons(_,otherGroups)) = groups
otherGroups)
else
LazyList.cons (LazyList.cons x (LazyList.empty())) groups)
let result = data |> LazyList.of_seq |> GroupBy (fun x y -> y = x + 1)
printfn "Consuming..."
for group in result do
printfn "about to do a group"
for x in group do
printfn " %d" x
You seem to want a function that has signature
(`a -> bool) -> seq<'a> -> seq<seq<'a>>
I.e. a function and a sequence, then break up the input sequence into a sequence of sequences based on the result of the function.
Caching the values into a collection that implements IEnumerable would likely be simplest (albeit not exactly purist, but avoiding iterating the input multiple times. It will lose much of the laziness of the input):
let groupBy (fun: 'a -> bool) (input: seq) =
seq {
let cache = ref (new System.Collections.Generic.List())
for e in input do
(!cache).Add(e)
if not (fun e) then
yield !cache
cache := new System.Collections.Generic.List()
if cache.Length > 0 then
yield !cache
}
An alternative implementation could pass cache collection (as seq<'a>) to the function so it can see multiple elements to chose the break points.
A Haskell solution, because I don't know F# syntax well, but it should be easy enough to translate:
type TimeStamp = Integer -- ticks
type TimeSpan = Integer -- difference between TimeStamps
groupContiguousDataPoints :: TimeSpan -> [(TimeStamp, a)] -> [[(TimeStamp, a)]]
There is a function groupBy :: (a -> a -> Bool) -> [a] -> [[a]] in the Prelude:
The group function takes a list and returns a list of lists such that the concatenation of the result is equal to the argument. Moreover, each sublist in the result contains only equal elements. For example,
group "Mississippi" = ["M","i","ss","i","ss","i","pp","i"]
It is a special case of groupBy, which allows the programmer to supply their own equality test.
It isn't quite what we want, because it compares each element in the list with the first element of the current group, and we need to compare consecutive elements. If we had such a function groupBy1, we could write groupContiguousDataPoints easily:
groupContiguousDataPoints maxTimeDiff list = groupBy1 (\(t1, _) (t2, _) -> t2 - t1 <= maxTimeDiff) list
So let's write it!
groupBy1 :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy1 _ [] = [[]]
groupBy1 _ [x] = [[x]]
groupBy1 comp (x : xs#(y : _))
| comp x y = (x : firstGroup) : otherGroups
| otherwise = [x] : groups
where groups#(firstGroup : otherGroups) = groupBy1 comp xs
UPDATE: it looks like F# doesn't let you pattern match on seq, so it isn't too easy to translate after all. However, this thread on HubFS shows a way to pattern match sequences by converting them to LazyList when needed.
UPDATE2: Haskell lists are lazy and generated as needed, so they correspond to F#'s LazyList (not to seq, because the generated data is cached (and garbage collected, of course, if you no longer hold a reference to it)).
(EDIT: This suffers from a similar problem to Brian's solution, in that iterating the outer sequence without iterating over each inner sequence will mess things up badly!)
Here's a solution that nests sequence expressions. The imperitave nature of .NET's IEnumerable<T> is pretty apparent here, which makes it a bit harder to write idiomatic F# code for this problem, but hopefully it's still clear what's going on.
let groupBy cmp (sq:seq<_>) =
let en = sq.GetEnumerator()
let rec partitions (first:option<_>) =
seq {
match first with
| Some first' -> //'
(* The following value is always overwritten;
it represents the first element of the next subsequence to output, if any *)
let next = ref None
(* This function generates a subsequence to output,
setting next appropriately as it goes *)
let rec iter item =
seq {
yield item
if (en.MoveNext()) then
let curr = en.Current
if (cmp item curr) then
yield! iter curr
else // consumed one too many - pass it on as the start of the next sequence
next := Some curr
else
next := None
}
yield iter first' (* ' generate the first sequence *)
yield! partitions !next (* recursively generate all remaining sequences *)
| None -> () // return an empty sequence if there are no more values
}
let first = if en.MoveNext() then Some en.Current else None
partitions first
let groupContiguousDataPoints (time:TimeSpan) : (seq<DateTime*_> -> _) =
groupBy (fun (t,_) (t',_) -> t' - t <= time)
Okay, trying again. Achieving the optimal amount of laziness turns out to be a bit difficult in F#... On the bright side, this is somewhat more functional than my last attempt, in that it doesn't use any ref cells.
let groupBy cmp (sq:seq<_>) =
let en = sq.GetEnumerator()
let next() = if en.MoveNext() then Some en.Current else None
(* this function returns a pair containing the first sequence and a lazy option indicating the first element in the next sequence (if any) *)
let rec seqStartingWith start =
match next() with
| Some y when cmp start y ->
let rest_next = lazy seqStartingWith y // delay evaluation until forced - stores the rest of this sequence and the start of the next one as a pair
seq { yield start; yield! fst (Lazy.force rest_next) },
lazy Lazy.force (snd (Lazy.force rest_next))
| next -> seq { yield start }, lazy next
let rec iter start =
seq {
match (Lazy.force start) with
| None -> ()
| Some start ->
let (first,next) = seqStartingWith start
yield first
yield! iter next
}
Seq.cache (iter (lazy next()))
Below is some code that does what I think you want. It is not idiomatic F#.
(It may be similar to Brian's answer, though I can't tell because I'm not familiar with the LazyList semantics.)
But it doesn't exactly match your test specification: Seq.length enumerates its entire input. Your "test code" calls Seq.length and then calls Seq.hd. That will generate an enumerator twice, and since there is no caching, things get messed up. I'm not sure if there is any clean way to allow multiple enumerators without caching. Frankly, seq<seq<'a>> may not be the best data structure for this problem.
Anyway, here's the code:
type State<'a> = Unstarted | InnerOkay of 'a | NeedNewInner of 'a | Finished
// f() = true means the neighbors should be kept together
// f() = false means they should be split
let split_up (f : 'a -> 'a -> bool) (input : seq<'a>) =
// simple unfold that assumes f captured a mutable variable
let iter f = Seq.unfold (fun _ ->
match f() with
| Some(x) -> Some(x,())
| None -> None) ()
seq {
let state = ref (Unstarted)
use ie = input.GetEnumerator()
let innerMoveNext() =
match !state with
| Unstarted ->
if ie.MoveNext()
then let cur = ie.Current
state := InnerOkay(cur); Some(cur)
else state := Finished; None
| InnerOkay(last) ->
if ie.MoveNext()
then let cur = ie.Current
if f last cur
then state := InnerOkay(cur); Some(cur)
else state := NeedNewInner(cur); None
else state := Finished; None
| NeedNewInner(last) -> state := InnerOkay(last); Some(last)
| Finished -> None
let outerMoveNext() =
match !state with
| Unstarted | NeedNewInner(_) -> Some(iter innerMoveNext)
| InnerOkay(_) -> failwith "Move to next inner seq when current is active: undefined behavior."
| Finished -> None
yield! iter outerMoveNext }
open System
let groupContigs (contigTime : TimeSpan) (holey : seq<DateTime * int>) =
split_up (fun (t1,_) (t2,_) -> (t2 - t1) <= contigTime) holey
// Test data
let numbers = {1 .. 15}
let contiguousTimeStamps =
let baseTime = DateTime.Now
seq { for n in numbers -> baseTime.AddMinutes(float n)}
let holeyData =
Seq.zip contiguousTimeStamps numbers
|> Seq.filter (fun (dateTime, num) -> num % 7 <> 0)
let grouped_data = groupContigs (new TimeSpan(0,1,0)) holeyData
printfn "Consuming..."
for group in grouped_data do
printfn "about to do a group"
for x in group do
printfn " %A" x
Ok, here's an answer I'm not unhappy with.
(EDIT: I am unhappy - it's wrong! No time to try to fix right now though.)
It uses a bit of imperative state, but it is not too difficult to follow (provided you recall that '!' is the F# dereference operator, and not 'not'). It is as lazy as possible, and takes a seq as input and returns a seq of seqs as output.
let N = 20
let data = // produce some arbitrary data with holes
seq {
for x in 1..N do
if x % 4 <> 0 && x % 7 <> 0 then
printfn "producing %d" x
yield x
}
let rec GroupBy comp (input:seq<_>) = seq {
let doneWithThisGroup = ref false
let areMore = ref true
use e = input.GetEnumerator()
let Next() = areMore := e.MoveNext(); !areMore
// deal with length 0 or 1, seed 'prev'
if not(e.MoveNext()) then () else
let prev = ref e.Current
while !areMore do
yield seq {
while not(!doneWithThisGroup) do
if Next() then
let next = e.Current
doneWithThisGroup := not(comp !prev next)
yield !prev
prev := next
else
// end of list, yield final value
yield !prev
doneWithThisGroup := true }
doneWithThisGroup := false }
let result = data |> GroupBy (fun x y -> y = x + 1)
printfn "Consuming..."
for group in result do
printfn "about to do a group"
for x in group do
printfn " %d" x