I am currently learning functional programming and F#, and I want to loop over a list only up to index n-2. For example:
Given a list of doubles, find the pairwise average,
e.g. pairwiseAverage [1.0; 2.0; 3.0; 4.0; 5.0] will give [1.5; 2.5; 3.5; 4.5]
After doing some experimenting and searching, I have a few ways to do it:
Method 1:
let pairwiseAverage (data: List<double>) =
    [ for j in 0 .. data.Length - 2 do
        yield (data.[j] + data.[j+1]) / 2.0 ]
Method 2:
let pairwiseAverage (data: List<double>) =
    let averageWithNone acc next =
        match acc with
        | (_, None) -> ([], Some(next))
        | (result, Some prev) -> (((prev + next) / 2.0) :: result, Some(next))
    let resultTuple = List.fold averageWithNone ([], None) data
    match resultTuple with
    | (x, _) -> List.rev x
Method 3:
let pairwiseAverage (data: List<double>) =
    // Get elements from 1 .. n-1
    let after = List.tail data
    // Get elements from 0 .. n-2
    let before =
        data
        |> List.rev
        |> List.tail
        |> List.rev
    List.map2 (fun x y -> (x + y) / 2.0) before after
I would just like to know if there are other ways to approach this problem. Thank you.
Using only built-ins:
list |> Seq.windowed 2 |> Seq.map Array.average
Seq.windowed n gives you sliding windows of n elements each.
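For example, with the sample input from the question (a quick check of the pipeline above, not part of the original answer):
[1.0; 2.0; 3.0; 4.0; 5.0]
|> Seq.windowed 2          // seq [[|1.0; 2.0|]; [|2.0; 3.0|]; [|3.0; 4.0|]; [|4.0; 5.0|]]
|> Seq.map Array.average
|> List.ofSeq              // [1.5; 2.5; 3.5; 4.5]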
Another simple way is to use Seq.pairwise, something like this:
list |> Seq.pairwise |> Seq.map (fun (a,b) -> (a+b)/2.0)
The approaches suggested above are appropriate for short windows, like the one in the question. For windows longer than 2 elements one cannot use pairwise. The answer by hlo generalizes to wider windows and is a clean and fast approach if the window length is not too large. For very wide windows the code below runs faster, as it only adds one number and subtracts another from the value obtained for the previous window. Notice that Seq.map2 (and Seq.map) automatically deal with sequences of different lengths.
let movingAverage (n: int) (xs: float List) =
    let init = xs |> Seq.take n |> Seq.sum
    let additions = Seq.map2 (fun x y -> x - y) (Seq.skip n xs) xs
    Seq.fold (fun m x -> ((List.head m) + x) :: m) [init] additions
    |> List.rev
    |> List.map (fun (x: float) -> x / (float n))
let xs = [1.0..1000000.0]
movingAverage 1000 xs
// Real: 00:00:00.265, CPU: 00:00:00.265, GC gen0: 10, gen1: 10, gen2: 0
For comparison, the function above performs this calculation about 60 times faster than the windowed equivalent:
let windowedAverage (n: int) (xs: float List) =
    xs
    |> Seq.windowed n
    |> Seq.map Array.average
    |> Seq.toList
windowedAverage 1000 xs
// Real: 00:00:15.634, CPU: 00:00:15.500, GC gen0: 74, gen1: 74, gen2: 71
I tried to eliminate List.rev using foldBack but did not succeed.
A point-free approach:
let pairwiseAverage = List.pairwise >> List.map ((<||) (+) >> (*) 0.5)
Usually not a better way, but another way regardless... ;-]
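To unpack the combinators (my own illustration, not part of the answer): (<||) uncurries (+) so it accepts the tuples produced by List.pairwise, and (*) 0.5 then halves each sum:
((<||) (+)) (1.0, 2.0)                    // 3.0
((<||) (+) >> (*) 0.5) (1.0, 2.0)         // 1.5
pairwiseAverage [1.0; 2.0; 3.0; 4.0; 5.0] // [1.5; 2.5; 3.5; 4.5]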
I am working on a data "intensive" app and I am not sure if I should use Series/DataFrame. It seems very interesting, but it also looks way slower than the equivalent done with a List... though I may not be using the Series properly when I filter.
Please let me know what you think.
Thanks
open System
open Deedle
type TSPoint<'a> =
    {
        Date : System.DateTime
        Value : 'a
    }
type TimeSerie<'a> = TSPoint<'a> list
let sd = System.DateTime(1950, 2, 1)
let tsd = [1..100000] |> List.map (fun x -> sd.AddDays(float x))
// creating a List of TSPoint
let tsList = tsd |> List.map (fun x -> {Date = x ; Value = 1.})
// creating the same as a serie
let tsSeries = Series(tsd, [1..100000] |> List.map (fun _ -> 1.))
// function to "randomise" the list of dates
let shuffleG xs = xs |> List.sortBy (fun _ -> Guid.NewGuid())
// new date list to search within out tsList and tsSeries
let d = tsd |> shuffleG |> List.take 1000
// Filter
d |> List.map (fun x -> (tsList |> List.filter (fun y -> y.Date = x)))
d |> List.map (fun x -> (tsSeries |> Series.filter (fun key _ -> key = x)))
Here is what I get:
List -> Real: 00:00:04.780, CPU: 00:00:04.508, GC gen0: 917, gen1: 2, gen2: 1
Series -> Real: 00:00:54.386, CPU: 00:00:49.311, GC gen0: 944, gen1: 7, gen2: 3
In general, Deedle series and data frames do have some extra overhead compared with hand-crafted code that uses whatever data structure is most efficient for a given problem. The overhead is small for some operations and larger for others, so it depends on what you want to do and how you use Deedle.
If you use Deedle the way it was intended to be used, then you'll get good performance, but if you run a large number of operations that are not particularly efficient, you may get bad performance.
In your particular case, you are running Series.filter 1000 times, and creating a new series each time (which is what happens behind the scenes here) does have some overhead.
However, what your code really does is that you are using Series.filter to find a value with a specific key. Deedle provides a key-based lookup operation for this (and it's one of the things it has been optimized for).
If you rewrite the code as follows, you'll get much better performance with Deedle than with list:
d |> List.map (fun x -> tsSeries.[x])
// 0.001 seconds
d |> List.map (fun x -> (tsSeries |> Series.filter (fun key _ -> key = x)))
// 3.46 seconds
d |> List.map (fun x -> (tsList |> List.filter (fun y -> y.Date = x)))
// 40.5 seconds
I have, in F#, 2 sequences, each containing distinct integers, strictly in ascending order: listMaxes and numbers.
If numbers is not empty (not Seq.isEmpty numbers), then it is guaranteed that listMaxes is not empty and that Seq.last listMaxes >= Seq.last numbers.
I would like to implement in F# a function that returns a list of lists of integers, whose List.length equals Seq.length listMaxes, containing the elements of numbers divided into lists, where the elements of listMaxes bound each group.
For example: called with the arguments
listMaxes = seq [ 25; 56; 65; 75; 88 ]
numbers = seq [ 10; 11; 13; 16; 20; 25; 31; 38; 46; 55; 65; 76; 88 ]
this function should return
[ [10; 11; 13; 16; 20; 25]; [31; 38; 46; 55]; [65]; List.empty; [76; 88] ]
I can implement this function, iterating over numbers only once:
let groupByListMaxes listMaxes numbers =
    if Seq.isEmpty numbers then
        List.replicate (Seq.length listMaxes) List.empty
    else
        List.ofSeq (seq {
            use nbe = numbers.GetEnumerator ()
            ignore (nbe.MoveNext ())
            for lmax in listMaxes do
                yield List.ofSeq (seq {
                    if nbe.Current <= lmax then
                        yield nbe.Current
                        while nbe.MoveNext () && nbe.Current <= lmax do
                            yield nbe.Current
                })
        })
But this code feels unclean, ugly, imperative, and very un-F#-y.
Is there any functional / F#-idiomatic way to achieve this?
Here's a version based on list interpretation, which is quite functional in style. You can use Seq.toList to convert between them, whenever you want to handle that. You could also use Seq.scan in conjunction with Seq.partition ((>=) max) if you want to use only library functions, but beware that it's very very easy to introduce a quadratic complexity in either computation or memory when doing that.
This is linear in both:
let splitAt value lst =
    let rec loop l1 = function
        | [] -> List.rev l1, []
        | h :: t when h > value -> List.rev l1, (h :: t)
        | h :: t -> loop (h :: l1) t
    loop [] lst
let groupByListMaxes listMaxes numbers =
    let rec loop acc lst = function
        | [] -> List.rev acc
        | h :: t ->
            let out, lst' = splitAt h lst
            loop (out :: acc) lst' t
    loop [] numbers listMaxes
It can be done like this with pattern matching and tail recursion:
let groupByListMaxes listMaxes numbers =
    let rec inner acc numbers =
        function
        | [] -> acc |> List.rev
        | max :: tail ->
            let taken = numbers |> Seq.takeWhile ((>=) max) |> List.ofSeq
            let n = taken |> List.length
            inner (taken :: acc) (numbers |> Seq.skip n) tail
    inner [] numbers (listMaxes |> List.ofSeq)
Update: I also got inspired by fold and came up with the following solution that strictly refrains from converting the input sequences.
let groupByListMaxes maxes numbers =
    let rec inner (acc, (cur, numbers)) max =
        match numbers |> Seq.tryHead with
        // Add n to the current list of n's less
        // than the local max
        | Some n when n <= max ->
            let remaining = numbers |> Seq.tail
            inner (acc, (n :: cur, remaining)) max
        // Complete the current list by adding it
        // to the accumulated result and prepare
        // the next list for fold.
        | _ ->
            (List.rev cur) :: acc, ([], numbers)
    maxes |> Seq.fold inner ([], ([], numbers)) |> fst |> List.rev
I have found a better implementation myself. Tips for improvements are still welcome.
Dealing with 2 sequences is really a pain. And I really do want to iterate over numbers only once without turning that sequence into a list. But then I realized that turning listMaxes (generally the shorter of the two sequences) into a list is less costly. That way only 1 sequence remains, and I can use Seq.fold over numbers.
What should the state be that we want to keep and change while folding over numbers? First, it should definitely include the remainder of listMaxes; the maxes we have already surpassed are no longer of interest. Second, it should include the lists accumulated so far, although, as in the other answers, these can be kept in reverse order. More to the point: the state is a pair whose second element is a reversed list of reversed lists of the numbers seen so far.
let groupByListMaxes listMaxes numbers =
    let rec folder state number =
        match state with
        | m :: maxes, _ when number > m ->
            folder (maxes, List.empty :: snd state) number
        | m :: maxes, [] ->
            fst state, List.singleton (List.singleton number)
        | m :: maxes, h :: t ->
            fst state, (number :: h) :: t
        | [], _ ->
            failwith "Guaranteed not to happen"
    let listMaxesList = List.ofSeq listMaxes
    let initialState = listMaxesList, List.empty
    let reversed = snd (Seq.fold folder initialState numbers)
    let temp = List.rev (List.map List.rev reversed)
    let extraLength = List.length listMaxesList - List.length temp
    let extra = List.replicate extraLength List.empty
    List.concat [temp; extra]
I know this is an old question but I had a very similar problem and I think this is a simple solution:
let groupByListMaxes cs xs =
    List.scan (fun (_, xs) c -> List.partition (fun x -> x <= c) xs)
              ([], xs)
              cs
    |> List.skip 1
    |> List.map fst
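Checking it against the example from the question (expected output taken from there):
groupByListMaxes [ 25; 56; 65; 75; 88 ]
                 [ 10; 11; 13; 16; 20; 25; 31; 38; 46; 55; 65; 76; 88 ]
// [[10; 11; 13; 16; 20; 25]; [31; 38; 46; 55]; [65]; []; [76; 88]]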
I completed the seventh Euler problem* in F# but am not entirely happy with my
implementation. In the function primes I create a sequence that I estimated would contain the 10,001st prime number. When I tried using Seq.initInfinite to lazily generate the candidate primes my code just hung before throwing an out of memory exception.
Could someone advise me on replacing the literal sequence with a lazily-generated sequence which is short-circuited once the desired prime is found?
let isPrime n =
    let bound = int (sqrt (float n))
    seq { 2 .. bound } |> Seq.forall (fun x -> n % x <> 0)
let primeAsync n =
    async { return (n, isPrime n) }
let primes =
    {1..1000000}
    |> Seq.map primeAsync
    |> Async.Parallel
    |> Async.RunSynchronously
    |> Array.filter snd
    |> Array.map fst
    |> Array.mapi (fun i el -> (i, el))
    |> Array.find (fun (fst, snd) -> fst = 10001)
primes
*"By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6th prime is 13. What is the 10,001st prime number?"
I think the problem is/was that Async.RunSynchronously isn't lazy and tried to evaluate the whole infinite sequence. Although there are better solutions for this, your algorithm is fast enough, so you don't even need parallelization; this works perfectly:
open System
let isPrime n =
    let bound = n |> float |> sqrt |> int
    seq { 2 .. bound } |> Seq.forall (fun x -> n % x <> 0)
let prime =
    Seq.initInfinite ((+) 2)
    |> Seq.filter isPrime
    |> Seq.skip 10000
    |> Seq.head
The sequence gets 'reified' as soon as you feed it to Async.Parallel. If you want to minimise memory consumption, run the computation serially or split it into lazy chunks, the elements in each chunk to be run in parallel.
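A rough sketch of that chunked idea (my own, reusing the isPrime above; the chunk size is arbitrary): Seq.chunkBySize keeps the candidates lazy, each chunk is tested in parallel, and enumeration stops once the 10,001st prime has been produced.
let primesChunked chunkSize =
    Seq.initInfinite ((+) 2)
    |> Seq.chunkBySize chunkSize
    |> Seq.collect (fun chunk ->
        chunk
        |> Array.map (fun n -> async { return n, isPrime n })
        |> Async.Parallel
        |> Async.RunSynchronously
        |> Array.filter snd
        |> Array.map fst)
let prime10001 = primesChunked 4096 |> Seq.item 10000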
In APL one can use a bit vector to select out elements of another vector; this is called compression. For example 1 0 1/3 5 7 would yield 3 7.
Is there an accepted term for this in functional programming in general and F# in particular?
Here is my F# program:
let list1 = [|"Bob"; "Mary"; "Sue"|]
let list2 = [|1; 0; 1|]
[<EntryPoint>]
let main argv =
    0 // return an integer exit code
What I would like to do is compute a new string[] which would be [|"Bob"; "Sue"|]
How would one do this in F#?
Array.zip list1 list2 // [|("Bob",1); ("Mary",0); ("Sue",1)|]
|> Array.filter (fun (_,x) -> x = 1) // [|("Bob", 1); ("Sue", 1)|]
|> Array.map fst // [|"Bob"; "Sue"|]
The pipe operator |> does function application syntactically reversed, i.e., x |> f is equivalent to f x. As mentioned in another answer, replace Array with Seq to avoid the construction of intermediate arrays.
I expect you'll find many APL primitives missing from F#. For lists and sequences, many can be constructed by stringing together primitives from the Seq, Array, or List modules, like the above. For reference, here is an overview of the Seq module.
I think the easiest is to use an array sequence expression, something like this:
let compress (bits: int[]) (values: 'a[]) =
    [|
        for i = 0 to bits.Length - 1 do
            if bits.[i] = 1 then
                yield values.[i]
    |]
If you only want to use combinators, this is what I would do:
Seq.zip bits values
|> Seq.choose (fun (bit, value) ->
    if bit = 1 then Some value else None)
|> Array.ofSeq
I use Seq functions instead of Array functions in order to avoid building intermediate arrays, but using Array throughout would be correct too.
One might say this is more idiomatic:
Seq.map2 (fun l1 l2 -> if l2 = 1 then Some(l1) else None) list1 list2
|> Seq.choose id
|> Seq.toArray
EDIT (for the pipe lovers)
(list1, list2)
||> Seq.map2 (fun l1 l2 -> if l2 = 1 then Some(l1) else None)
|> Seq.choose id
|> Seq.toArray
Søren Debois' solution is good but, as he pointed out, we can do better. Let's define a function, based on Søren's code:
let compressArray vals idx =
    Array.zip vals idx
    |> Array.filter (fun (_, x) -> x = 1)
    |> Array.map fst
compressArray ends up creating a new array in each of the 3 lines. This can take some time, if the input arrays are long (1.4 seconds for 10M values in my quick test).
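A hypothetical setup for such a measurement (the answer does not give its exact test data, so the names and sizes below are made up):
let n = 10000000
let vals = Array.init n id               // 0 .. 9999999
let idx = Array.init n (fun i -> i % 2)  // keep every other element
compressArray vals idx |> Array.length   // 5000000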
We can save some time by working on sequences and creating an array at the end only:
let compressSeq vals idx =
    Seq.zip vals idx
    |> Seq.filter (fun (_, x) -> x = 1)
    |> Seq.map fst
This function is generic and will work on arrays, lists, etc. To generate an array as output:
compressSeq sq idx |> Seq.toArray
The latter saves about 40% of computation time (0.8s in my test).
As ildjarn commented, the function argument to filter can be rewritten to snd >> (=) 1, although that causes a slight performance drop (< 10%), probably because of the extra function call that is generated.
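For completeness, the same pipeline with that point-free predicate (a sketch of ildjarn's suggestion, not benchmarked here):
let compressSeq' vals idx =
    Seq.zip vals idx
    |> Seq.filter (snd >> (=) 1)
    |> Seq.map fst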
Hullo all.
I am a C# programmer, exploring F# in my free time. I have written the following little program for image convolution in 2D.
open System
let convolve y x =
    y |> List.map (fun ye -> x |> List.map ((*) ye))
      |> List.mapi (fun i l -> [for q in 1..i -> 0] @ l @ [for q in 1..(l.Length - i - 1) -> 0])
      |> List.reduce (fun r c -> List.zip r c |> List.map (fun (a, b) -> a + b))
let y = [2; 3; 1; 4]
let x = [4; 1; 2; 3]
printfn "%A" (convolve y x)
My question is: Is the above code idiomatic F#? Can it be made more concise? (E.g., is there some shorter way to generate a filled list of 0's? I have used a list comprehension in my code for this purpose.) Are there any changes that could improve its performance?
Any help would be greatly appreciated. Thanks.
EDIT:
Thanks Brian. I didn't get your first suggestion. Here's how my code looks after applying your second suggestion. (I also abstracted out the list-fill operation.)
open System
let listFill howMany withWhat = [for i in 1..howMany -> withWhat]
let convolve y x =
    y |> List.map (fun ye -> x |> List.map ((*) ye))
      |> List.mapi (fun i l -> (listFill i 0) @ l @ (listFill (l.Length - i - 1) 0))
      |> List.reduce (List.map2 (+))
let y = [2; 3; 1; 4]
let x = [4; 1; 2; 3]
printfn "%A" (convolve y x)
Anything else can be improved? Awaiting more suggestions...
As Brian mentioned, the use of @ is generally problematic, because the operator cannot be efficiently implemented for (simple) functional lists - it needs to copy the entire first list.
I think Brian's suggestion was to write a sequence generator that would generate the whole list at once, but that's a bit more complicated. You'd have to convert the list to an array and then write something like:
let convolve y x =
    y |> List.map (fun ye -> x |> List.map ((*) ye) |> Array.ofList)
      |> List.mapi (fun i l -> Array.init (2 * l.Length - 1) (fun n ->
          if n < i || n - i >= l.Length then 0 else l.[n - i]))
      |> List.reduce (Array.map2 (+))
In general, if performance is an important concern, then you'll probably need to use arrays anyway (because this kind of problem can be best solved by accessing elements by index). Using arrays is a bit more difficult (you need to get the indexing right), but it is a perfectly fine approach in F#.
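For illustration, here is a sketch of that index-based approach (my own, not from the answer): element k of the result sums y.[i] * x.[k - i] over all i for which both indices are in range.
let convolveArrays (y: int[]) (x: int[]) =
    Array.init (y.Length + x.Length - 1) (fun k ->
        let mutable sum = 0
        for i in (max 0 (k - x.Length + 1)) .. (min (y.Length - 1) k) do
            sum <- sum + y.[i] * x.[k - i]
        sum)
convolveArrays [| 2; 3; 1; 4 |] [| 4; 1; 2; 3 |] // [|8; 14; 11; 29; 15; 11; 12|]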
Anyway, if you want to write this using lists, then here are some options. You could use sequence expressions everywhere, which would look like this:
let convolve y (x:_ list) =
    [ for i, v1 in x |> List.zip [ 0 .. x.Length - 1 ] ->
        [ yield! listFill i 0
          for v2 in y do yield v1 * v2
          yield! listFill (x.Length - i - 1) 0 ] ]
    |> List.reduce (List.map2 (+))
... or you can also combine the two options and use a nested sequence expression (with yield! to generate zeros and lists) in the lambda function that you're passing to List.mapi:
let convolve y x =
    y |> List.map (fun ye -> x |> List.map ((*) ye))
      |> List.mapi (fun i l ->
          [ for _ in 1 .. i do yield 0
            yield! l
            for _ in 1 .. (l.Length - i - 1) do yield 0 ])
      |> List.reduce (List.map2 (+))
The idiomatic solution would be to use arrays and loops just as you would in C. However, you may be interested in the following alternative solution that uses pattern matching instead:
let dot xs ys =
    Seq.map2 (*) xs ys
    |> Seq.sum
let convolve xs ys =
    let rec loop vs xs ys zs =
        match xs, ys with
        | x :: xs, ys -> loop (dot ys (x :: zs) :: vs) xs ys (x :: zs)
        | [], _ :: (_ :: _ as ys) -> loop (dot ys zs :: vs) [] ys zs
        | _ -> List.rev vs
    loop [] xs ys []
convolve [2; 3; 1; 4] [4; 1; 2; 3]
Regarding the zeroes, how about e.g.
[for q in 0..l.Length-1 -> if q=i then l else 0]
(I haven't tested to verify that is exactly right, but hopefully the idea is clear.) In general, any use of @ is a code smell.
Regarding overall performance, for small lists this is probably fine; for larger ones, you might consider using Seq rather than List for some of the intermediate computations, to avoid allocating as many temporary lists along the way.
It looks like maybe the final zip-then-map could be replaced by just a call to map2, something like
... fun r c -> (r,c) ||> List.map2 (+)
or possibly even just
... List.map2 (+)
but I'm away from a compiler so haven't double-checked it.
(fun ye -> x |> List.map ((*) ye))
Really?
I'll admit |> is pretty, but you could just write:
(fun ye -> List.map ((*) ye) x)
Another thing that you could do is fuse the first two maps. l |> List.map f |> List.mapi g = l |> List.mapi (fun i x -> g i (f x)), so incorporating Tomas and Brian's suggestions, you can get something like:
let convolve y x =
    let N = List.length x
    y
    |> List.mapi (fun i ye ->
        [ for _ in 1..i -> 0
          yield! List.map ((*) ye) x
          for _ in 1..(N - i - 1) -> 0 ])
    |> List.reduce (List.map2 (+))