Process a stream of Tuples without mutability? - f#

So I want a function that receives an array of Tuple<int,int> and returns the same type but with different values.
What I want to do is a function that returns this kind of values:
f( [1,10; 2,20; 3,40; 4,70] ) = [2,10; 3,20; 4,30]
So as you can see, the first number is basically unchanged (except that the 1st item is not picked), but the second number is the current value minus the previous value (20 - 10 = 10, 40 - 20 = 20, ...).
I've tried to come up with an algorithm in F# that doesn't involve mutability (using an accumulator for the previous value would mean I need a mutable variable), but I can't figure it out. Is this possible?

Using built-in functions. In this case, you can use Seq.pairwise. The function takes a sequence of inputs and produces a sequence of pairs containing the previous value and the current value. Once you have the pairs, you can use Seq.map to transform the pairs into the results - in your case, take the ID of the current value and subtract the previous value from the current value:
input
|> Seq.pairwise
|> Seq.map (fun ((pid, pval), (nid, nval)) -> nid, nval-pval)
Note that the result is a sequence (IEnumerable<T>) rather than a list - simply because the Seq module contains a few more (useful) functions. You could convert it back to list using List.ofSeq.
Using explicit recursion. If your task did not fit one of the common patterns that are covered by some of the built-in functions, then the answer would be to use recursion (which, in general, replaces mutation in the functional style).
For completeness, the recursive version would look like this (this is not perfect, because it is not tail-recursive so it might cause stack overflow, but it demonstrates the idea):
let rec f list =
    match list with
    | (pid, pval)::(((nid, nval)::_) as tail) ->
        (nid, nval-pval)::(f tail)
    | _ -> []
This takes a list and looks at the first two elements of the list (pid, pval) and (nid, nval). Then it calculates the new value based on the two elements in (nid, nval-pval) and then it recursively processes the rest of the list (tail), skipping over the first element. If the list has one or fewer elements (the second case), then nothing is returned.
The tail-recursive version could be written using the "accumulator" trick. Instead of writing newValue::(recursiveCall ...) we accumulate the newly produced values in a list kept as an argument and then reverse it:
let rec f list acc =
    match list with
    | (pid, pval)::(((nid, nval)::_) as tail) ->
        f tail ((nid, nval-pval)::acc)
    | _ -> List.rev acc
Now you just need to call the function using f input [] to initialize the accumulator.
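One way to keep callers from having to pass the empty accumulator is to nest the recursive helper inside a wrapper. This is only a minimal sketch; the names `differences` and `loop` are illustrative and do not come from the answer above:

```fsharp
// Wrapper that hides the accumulator. `differences` and `loop` are hypothetical names.
let differences (input: (int * int) list) : (int * int) list =
    // Tail-recursive helper: results are consed onto `acc` and reversed at the end.
    let rec loop list acc =
        match list with
        | (_, pval) :: (((nid, nval) :: _) as tail) ->
            loop tail ((nid, nval - pval) :: acc)
        | _ -> List.rev acc
    loop input []

printfn "%A" (differences [1, 10; 2, 20; 3, 40; 4, 70])
// prints [(2, 10); (3, 20); (4, 30)]
```
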

> let values = [(1, 10); (2, 20); (3, 40); (4, 70)];;
val values : (int * int) list = [(1, 10); (2, 20); (3, 40); (4, 70)]
> values
|> Seq.pairwise
|> Seq.map (fun ((x1, y1), (x2, y2)) -> (x2, y2 - y1))
|> Seq.toList;;
val it : (int * int) list = [(2, 10); (3, 20); (4, 30)]
Seq.pairwise gives you each element in a sequence as a pair, except the first element, which is only available as the predecessor of the second element.
For example:
> values |> Seq.pairwise |> Seq.toList;;
val it : ((int * int) * (int * int)) list =
[((1, 10), (2, 20)); ((2, 20), (3, 40)); ((3, 40), (4, 70))]
Second, Seq.map maps each of these pairs of pairs by using the desired algorithm.
Notice that this uses lazy evaluation - I only used Seq.toList at the end to make the output more readable.
BTW, you can alternatively write the map function like this:
Seq.map (fun ((_, y1), (x2, y2)) -> (x2, y2 - y1))
Notice that x1 is replaced with _ because the value isn't used.

Mark and Tomas have given really good solutions for the specific problem. Your question had a statement I think warrants a third answer, though:
(using an accumulator for the previous value would mean I need a mutable variable)
But this is actually not true! List.fold exists exactly to help you process lists with accumulators in a functional way. Here is how it looks:
let f xs =
    List.fold (fun (y, ys) (d, x) -> x, (d, x-y) :: ys)
              (snd (List.head xs), [])
              (List.tail xs)
    |> snd |> List.rev
The accumulator here is the argument (y, ys) to the fun ... in the first line. We can see how the accumulator updates to the right of the ->: we accumulate both the previous element of the list, x, and the new list we're constructing, (d, x-y)::ys. We'll get that list in reverse order, so we reverse it in the end with List.rev.
Incidentally, List.fold is tail-recursive.
Of course, Tomas and Mark's solutions using Seq.pairwise are much neater for your particular problem, and you'd definitely want to use one of those in practice.
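The fold above calls List.head, which throws on an empty list. A possible variant that pattern-matches first is sketched below; the name `diffsFold` and the decision to return an empty list for empty input are assumptions, not part of the original answer:

```fsharp
// Fold-based variant that returns [] for an empty input instead of throwing.
// `diffsFold` is a hypothetical name.
let diffsFold (xs: (int * int) list) : (int * int) list =
    match xs with
    | [] -> []
    | (_, first) :: rest ->
        rest
        // State is (previous value, results so far in reverse order).
        |> List.fold (fun (prev, acc) (key, value) -> value, (key, value - prev) :: acc) (first, [])
        |> snd
        |> List.rev

printfn "%A" (diffsFold [1, 10; 2, 20; 3, 40; 4, 70])
// prints [(2, 10); (3, 20); (4, 30)]
```
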

Whenever we need to create a sequence from another sequence, where one element in the output is a function of its predecessors, scan (docs) comes in handy:
[1,10; 2,20; 3,40; 4,70]
|> List.scan (fun ((accA, accB), prevB) (elA, elB) -> ((elA, elB-prevB), elB)) ((0, 0), 0)
|> Seq.skip 2
|> Seq.map fst
yields:
[(2, 10); (3, 20); (4, 30)]
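To see why two elements are skipped, it can help to look at the intermediate states that scan produces. In the sketch below (the names `states` and `result` are illustrative), the first state is the seed and the second is computed from the dummy previous value 0, so neither belongs in the answer:

```fsharp
// Intermediate scan states: each state is ((id, difference), previous value).
let states =
    [1, 10; 2, 20; 3, 40; 4, 70]
    |> List.scan (fun (_, prevB) (elA, elB) -> ((elA, elB - prevB), elB)) ((0, 0), 0)
// states = [((0, 0), 0); ((1, 10), 10); ((2, 10), 20); ((3, 20), 40); ((4, 30), 70)]

// Drop the seed and the first (meaningless) difference, then keep only the pairs.
let result = states |> List.skip 2 |> List.map fst
// result = [(2, 10); (3, 20); (4, 30)]
```
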

Related

what would be a good way to implement List.scani, in F#?

I have a situation where I'm parsing a file and I need to know both:
the current line
the previous line
before the previous line requirements, I was doing something like:
myData
|> List.mapi (fun i data -> parse i data)
but now I need access to the previous line, so scan is ideal for that, but then I lose the index.
so, I need a List.scani function :) is it something that could be built easily in an idiomatic way?
You could define scani as follows:
let scani (f:int->'S->'T->'S) (state:'S) (list:'T list) =
    list
    |> List.scan (fun (i,s) x -> (i+1, f i s x)) (0,state)
    |> List.map snd
This pairs the original state with a counter, initialized as (0,state). The state is updated as usual by the folder function f (which now takes an extra i parameter) and the counter is incremented by one at each step. Finally, we remove the counter by taking the second element of each tuple.
You could use it as follows, where i is the index, s is the state, and x the element.
[1;2;3]
|> scani (fun i s x -> s + i*x) 0
|> should equal [0;0;2;8]
It may not be the most efficient way to do it, but it seems to work (I called it scanl given that you want access to the previous element, or line):
let scanl f s l =
    List.scan (fun (acc,elem0) elem1 -> (f acc elem0 elem1),elem1) (s,List.head l) l
    |> List.map fst
Examples of usage:
let l = [1..5]
scanl (fun acc elem0 elem1 -> elem0,elem1) (0,0) l
//result: [(0, 0); (1, 1); (1, 2); (2, 3); (3, 4); (4, 5)]
The usual List.scan would give this:
List.scan (fun acc elem -> elem) 0 l
//result [0; 1; 2; 3; 4; 5]
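If all that is needed is the index together with the previous and current line, and no running state, combining List.pairwise with List.mapi may be enough. This is a sketch under that assumption; `withPrevious` is a hypothetical helper name, and the first line is omitted because it has no predecessor:

```fsharp
// Pairs every line from the second one onwards with its zero-based index
// and the line before it. `withPrevious` is a hypothetical name.
let withPrevious (lines: 'T list) : (int * 'T * 'T) list =
    lines
    |> List.pairwise                                      // (previous, current) pairs
    |> List.mapi (fun i (prev, cur) -> i + 1, prev, cur)  // i + 1 is the index of `cur`

printfn "%A" (withPrevious ["a"; "b"; "c"])
// prints [(1, "a", "b"); (2, "b", "c")]
```
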

Does (Array/List/Seq).groupBy maintain sort order within groups?

Does groupBy guarantee that sort order is preserved in code like the following?
x
|> Seq.sortBy (fun (x, y) -> y)
|> Seq.groupBy (fun (x, y) -> x)
By preserving sort order, I mean can we guarantee that within each grouping by x, the result is still sorted by y.
This is true for simple examples,
[(1, 3);(2, 1);(1, 1);(2, 3)]
|> Seq.sortBy (fun (x, y) -> y)
|> Seq.groupBy (fun (x, y) -> x)
// seq [(2, seq [(2, 1); (2, 3)]); (1, seq [(1, 1); (1, 3)])]
I want to make sure there are no weird edge cases.
What do you mean by preserving sort order? Seq.groupBy changes the type of the sequence, so how can you even meaningfully compare before and after?
For a given xs of the type seq<'a * 'b>, the type of the expression xs |> Seq.sortBy snd is seq<'a * 'b>, whereas the type of the expression xs |> Seq.sortBy snd |> Seq.groupBy fst is seq<'a * seq<'a * 'b>>. Thus, whether or not the answer to the question is yes or no depends on what you mean by preserving the sort order.
As @Petr wrote in the comments, it's easy to test this. If you're worried about special cases, write a Property using FsCheck and see if it generalises:
open FsCheck.Xunit
open Swensen.Unquote
[<Property>]
let isSortOrderPreserved (xs : (int * int) list) =
    let actual = xs |> Seq.sortBy snd |> Seq.groupBy fst
    let expected = xs |> Seq.sortBy snd |> Seq.toList
    expected =! (actual |> Seq.map snd |> Seq.concat |> Seq.toList)
In this property, I've interpreted the property of sort order preservation to mean that if you subsequently concatenate the grouped sequences, the sort order is preserved. Your definition may be different.
Given this particular definition, however, running the property clearly demonstrates that the property doesn't hold:
Falsifiable, after 6 tests (13 shrinks) (StdGen (1448745695,296088811)):
Original:
[(-3, -7); (4, -7); (4, 0); (-4, 0); (-4, 7); (3, 7); (3, -1); (-5, -1)]
Shrunk:
[(3, 1); (3, 0); (0, 0)]
---- Swensen.Unquote.AssertionFailedException : Test failed:
[(3, 0); (0, 0); (3, 1)] = [(3, 0); (3, 1); (0, 0)]
false
Here we see that if the input is [(3, 1); (3, 0); (0, 0)], the grouped sequence doesn't preserve the sort order (which isn't surprising to me).
Based on the updated question, here's a property that examines that question:
[<Property(MaxTest = 10000)>]
let isSortOrderPreservedWithEachGroup (xs : (int * int) list) =
    let actual = xs |> Seq.sortBy snd |> Seq.groupBy fst
    let expected =
        actual
        |> Seq.map (fun (k, vals) -> k, vals |> Seq.sort |> Seq.toList)
        |> Seq.toList
    expected =!
        (actual |> Seq.map (fun (k, vals) -> k, Seq.toList vals) |> Seq.toList)
This property does, indeed, hold:
Ok, passed 10000 tests.
You should still consider carefully whether you want to rely on behaviour that isn't documented, since it could change in later incarnations of F#. Personally, I'd adopt a piece of advice from the Zen of Python:
Explicit is better than implicit.
BTW, the reason for all that conversion to F# lists is because lists have structural equality, while sequences don't.
The documentation doesn't say explicitly (except through the example), but the implementation does preserve the order of the original sequence. It would be quite surprising if it didn't: the equivalent functions in other languages that I am aware of do.
Who cares. Instead of sorting and then grouping, just group and then sort and the ordering is guaranteed even if the F# implementation of groupBy eventually changes:
x
|> Seq.groupBy (fun (x, y) -> x)
|> Seq.map (fun (k, v) -> k, v |> Seq.sortBy (fun (x, y) -> y))
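Applied to the sample data from the question, grouping first and sorting each group afterwards might look like the sketch below. It uses the List versions of the functions so the result prints in full; `groupThenSort` is an illustrative name:

```fsharp
// Group by the first component, then sort each group by the second component.
// `groupThenSort` is a hypothetical name.
let groupThenSort (xs: (int * int) list) : (int * (int * int) list) list =
    xs
    |> List.groupBy fst
    |> List.map (fun (k, vs) -> k, List.sortBy snd vs)

let grouped = groupThenSort [(1, 3); (2, 1); (1, 1); (2, 3)]
printfn "%A" grouped
```

Because each group is sorted explicitly, the order within a group no longer depends on how groupBy happens to be implemented.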

How to return items in a seq only if they are different than their preceding items in F#?

When I write in python, I always try to think of alternatives as if I was using F#:
I have a seq of tuples (key, value1, value2, ...) I simplify the tuple here so it is only of length 2. Keys contain duplicated figures.
let xs = Seq.zip [|1;2;2;3;3;3;4;1;1;2;2|] {0..10} // so this is a seq of tuple
[(1, 0),
(2, 1),
(2, 2),
(3, 3),
(3, 4),
(3, 5),
(4, 6),
(1, 7),
(1, 8),
(2, 9),
(2, 10)]
Now, I would like to build a function, which takes the seq as input, and return a seq, which is a subset of the original seq.
It must capture all items where the key changes, and include the first and last items of the seq if they are not already there.
f1(xs) = [(1, 0), (2, 1), (3, 3), (4, 6), (1, 7), (2, 9), (2, 10)]
f1([]) = []
The following is my python code, it works, but I don't really like it.
xs = zip([1,2,2,3,3,3,4,1,1,2,2], range(11))
def f1(xs):
    if not xs:
        return
    last_a = None # I wish I don't have to use None here.
    is_yield = False
    for a, b in xs:
        if a != last_a:
            last_a = a
            is_yield = True
            yield (a, b)
        else:
            is_yield = False
    if not is_yield:
        yield (a, b) # Ugly, use variable outside the loop.
print list(f1(xs))
print list(f1([]))
Here is another way, using the itertools library
import itertools
def f1(xs):
    group = None
    for _, group_iter in itertools.groupby(xs, key = lambda pair: pair[0]):
        group = list(group_iter)
        yield group[0]
    # make sure we yield xs[-1], doesn't work if xs is iterator.
    if group and len(group) > 1: # again, ugly, use variable outside the loop.
        yield group[-1]
In F#, Seq.groupBy behaves differently from groupby in Python. I am wondering how this problem can be solved as functionally as possible, with fewer reference cells, less mutable state, and without too much hassle.
A recursive solution that should work, but also isn't particularly beautiful or short, could look something like this - but using pattern matching definitely makes this a bit nicer:
let whenKeyChanges input = seq {
    /// Recursively iterate over input, when the input is empty, or we found the last
    /// element, we just return it. Otherwise, we check if the key has changed since
    /// the last produced element (and return it if it has), then process the rest
    let rec loop prevKey input = seq {
        match input with
        | [] -> ()
        | [last] -> yield last
        | (key, value)::tail ->
            if key <> prevKey then yield (key, value)
            yield! loop key tail }
    // Always return the first element if the input is not empty
    match List.ofSeq input with
    | [] -> ()
    | (key, value)::tail ->
        yield (key, value)
        yield! loop key tail }
If you wanted a nicer and a bit more declarative solution, then you could use data frame and time series library that I've been working on at BlueMountain Capital (not yet officially announced, but should work).
// Series needs to have unique keys, so we add an index to your original keys
// (so we have series with (0, 1) => 0; (1, 2) => 1; ...
let xs = series <| Seq.zip (Seq.zip [0..10] [1;2;2;3;3;3;4;1;1;2;2]) [0..10]
// Create chunks such that your part of the key is the same in each chunk
let chunks = xs |> Series.chunkWhile (fun (_, k1) (_, k2) -> k1 = k2)
// For each chunk, return the first element, or the first and the last
// element, if this is the last chunk (as you always want to include the last element)
chunks
|> Series.map (fun (i, k) chunk ->
    let f = Series.firstValue chunk
    let l = Series.lastValue chunk
    if (i, k) = Series.lastKey chunks then
        if f <> l then [k, f; k, l] else [k, l]
    else [k, f])
// Concatenate the produced values into a single sequence
|> Series.values |> Seq.concat
The chunking is the key operation that you need here (see the documentation). The only tricky thing is returning the last element - which could be handled in multiple different ways - not sure if the one I used is the nicest.
The simplest solution would likely be to convert the sequence to an array and couple John's approach with snatching the first and last elements by index. But, here's another solution to add to the mix:
let f getKey (items: seq<_>) =
    use e = items.GetEnumerator()
    let rec loop doYield prev =
        seq {
            if doYield then yield prev
            if e.MoveNext() then
                yield! loop (getKey e.Current <> getKey prev) e.Current
            elif not doYield then yield prev
        }
    if e.MoveNext() then loop true e.Current
    else Seq.empty
//Usage: f fst xs
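The array-based idea mentioned at the start of this answer (copy the sequence to an array and pick out the first and last elements by index) could be sketched roughly as follows. The name `keyChanges` is hypothetical, and the sketch returns a list rather than a lazy sequence:

```fsharp
// Keeps the first element, the last element, and every element whose key
// differs from that of its predecessor. `keyChanges` is a hypothetical name.
let keyChanges getKey (items: seq<_>) =
    let arr = Array.ofSeq items
    let n = arr.Length
    [ for i in 0 .. n - 1 do
        let isFirst = (i = 0)
        let isLast = (i = n - 1)
        // || short-circuits, so arr.[i - 1] is never read when i = 0.
        if isFirst || isLast || getKey arr.[i] <> getKey arr.[i - 1] then
            yield arr.[i] ]

let sample = Seq.zip [| 1; 2; 2; 3; 3; 3; 4; 1; 1; 2; 2 |] (seq { 0 .. 10 })
printfn "%A" (keyChanges fst sample)
// prints [(1, 0); (2, 1); (3, 3); (4, 6); (1, 7); (2, 9); (2, 10)]
```
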
I think something like this will work
let remove dup =
    dup
    |> Seq.pairwise
    |> Seq.filter (fun ((a,b),(c,d)) -> a <> c)
    |> Seq.map fst
A correct solution needs to be aware of the end of the sequence, in order to satisfy the special case regarding the last element. Thus there either needs to be two passes such that length is known before processing (e.g. Tomas's solution - first pass is copy to list which unlike seq exposes its "end" as you iterate) or you need to rely on IEnumerable methods so that you know as you iterate when the end has been reached (e.g. Daniel's solution).
Below is inspired by the elegance of John's code, but handles the special cases by obtaining the length up front (2-pass).
let remove dup =
    let last = Seq.length dup - 2
    seq{
        yield Seq.head dup
        yield! dup
               |> Seq.pairwise
               |> Seq.mapi (fun i (a,b) -> if fst a <> fst b || i = last then Some(b) else None)
               |> Seq.choose id
    }
Sorry to chime in late here. While the answers so far are very good, I feel that they don't express the fundamental need for mutable state in order to return the last element. While I could rely on IEnumerable methods too, sequence expressions are basically equivalent. We start by defining a three-way DU to encapsulate state.
type HitOrMiss<'T> =
    | Starting
    | Hit of 'T
    | Miss of 'T
let foo2 pred xs = seq{
    let store = ref Starting        // Save last element and state
    for current in xs do            // Iterate sequence
        match !store with           // What had we before?
        | Starting ->               // No element yet
            yield current           // Yield first element
            store := Hit current
        | Hit last                  // Check if predicate is satisfied
        | Miss last when pred last current ->
            yield current           // Then yield intermediate element
            store := Hit current
        | _ ->
            store := Miss current
    match !store with
    | Miss last ->                  // Yield last element, if not already
        yield last
    | _ -> () }
[(1, 0)
 (2, 1)
 (2, 2)
 (3, 3)
 (3, 4)
 (3, 5)
 (4, 6)
 (1, 7)
 (1, 8)
 (2, 9)
 (2, 10)]
|> foo2 (fun (a,_) (b,_) -> a <> b) |> Seq.toList |> printfn "%A"
// [(1, 0); (2, 1); (3, 3); (4, 6); (1, 7); (2, 9); (2, 10)]

F# return element pairs in list

I have been looking for an elegant way to write a function that takes a list of elements and returns a list of tuples with all the possible pairs of distinct elements, not taking into account order, i.e. (a,b) and (b,a) should be considered the same and only one of them be returned.
I am sure this is a pretty standard algorithm, and it's probably an example from the cover page of the F# documentation, but I can't find it, not even searching the Internet for SML or Caml. What I have come up with is the following:
let test = [1;2;3;4;5;6]
let rec pairs l =
    seq {
        match l with
        | h::t ->
            yield! t |> Seq.map (fun elem -> (h, elem))
            yield! t |> pairs
        | _ -> ()
    }
test |> pairs |> Seq.toList |> printfn "%A"
This works and returns the expected result [(1, 2); (1, 3); (1, 4); (1, 5); (1, 6); (2, 3); (2, 4); (2, 5); (2, 6); (3, 4); (3, 5); (3, 6); (4, 5); (4, 6); (5, 6)] but it looks horribly unidiomatic.
I should not need to go through the sequence expression and then convert back into a list, there must be an equivalent solution only involving basic list operations or library calls...
Edited:
I also have this one here
let test = [1;2;3;4;5;6]
let rec pairs2 l =
    let rec p h t =
        match t with
        | hx::tx -> (h, hx)::p h tx
        | _ -> []
    match l with
    | h::t -> p h t @ pairs2 t
    | _ -> []
test |> pairs2 |> Seq.toList |> printfn "%A"
Also working, but like the first one it seems unnecessarily involved and complicated, given the rather easy problem. I guess my question is more about style, really, and whether someone can come up with a two-liner for this.
I think that your code is actually pretty close to an idiomatic version. The only change I would do is that I would use for in a sequence expression instead of using yield! in conjunction with Seq.map. I also usually format code differently (but that's just a personal preference), so I would write this:
let rec pairs l = seq {
    match l with
    | h::t -> for elem in t do yield h, elem
              yield! pairs t
    | _ -> () }
This is practically the same thing as what Brian posted. If you wanted to get a list as the result then you could just wrap the whole thing in [ ... ] instead of seq { ... }.
However, this isn't actually all that different - under the covers, the compiler uses a sequence anyway and just adds a conversion to a list. I think that it may actually be a good idea to use sequences until you really need a list (because sequences are evaluated lazily, so you may avoid evaluating unnecessary things).
If you wanted to make this a bit simpler by abstracting a part of the behavior into a separate (generally useful) function, then you could write a function e.g. splits that returns all elements of a list together with the rest of the list:
let splits list =
    let rec splitsAux acc list =
        match list with
        | x::xs -> splitsAux ((x, xs)::acc) xs
        | _ -> acc |> List.rev
    splitsAux [] list
For example splits [ 1 .. 3 ] would give [(1, [2; 3]); (2, [3]); (3, [])]. When you have this function, implementing your original problem becomes much easier - you can just write:
[ for x, xs in splits [ 1 .. 5] do
    for y in xs do yield x, y ]
As a guide for googling - the problem is called finding all 2-element combinations from the given set.
Here's one way:
let rec pairs l =
    match l with
    | [] | [_] -> []
    | h :: t ->
        [for x in t do
            yield h,x
         yield! pairs t]
let test = [1;2;3;4;5;6]
printfn "%A" (pairs test)
You seem to be overcomplicating things a lot. Why even use a seq if you want a list? How about
let rec pairs lst =
    match lst with
    | [] -> []
    | h::t -> List.map (fun elem -> (h, elem)) t @ pairs t
let _ =
    let test = [1;2;3;4;5;6]
    printfn "%A" (pairs test)
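For the two-liner asked about in the question, one non-recursive option is to pair each element with the elements that follow it, using List.mapi and List.skip. This is only a sketch; the name `pairsByIndex` is illustrative, and the repeated List.skip calls make it less efficient than the recursive versions on long lists:

```fsharp
// All unordered pairs of distinct positions, built without explicit recursion.
// `pairsByIndex` is a hypothetical name.
let pairsByIndex xs =
    xs
    |> List.mapi (fun i x -> xs |> List.skip (i + 1) |> List.map (fun y -> x, y))
    |> List.concat

printfn "%A" (pairsByIndex [1; 2; 3; 4])
// prints [(1, 2); (1, 3); (1, 4); (2, 3); (2, 4); (3, 4)]
```
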

How to apply Seq map function?

I have recently been playing with F#. Instead of using a for loop, how can I use Seq.map or something similar to generate a sequence in which every element of a list is multiplied by every element of that list, as shown below?
For example, given the list [1..10], I would like to apply a function that generates a result like
[(1*1); (1*2);(1*3); (1*4); (1*5)......(2*1);(2*2);(2*3).....(3*1);(3*2)...]
How can I achieve this?
Many thanks for all your help.
let list = [1..10]
list |> List.map (fun v1 -> List.map (fun v2 -> (v1*v2)) list) |> List.collect id
The List.collect at the end flattens the list of lists.
It works the same with Seq instead of List, if you want a lazy sequence.
Or, using collect as the main iterator, as cfern suggested, and obsessively eliminating anonymous functions:
let flip f x y = f y x
let list = [1..10]
list |> List.collect ((*) >> ((flip List.map) list))
A list comprehension would be the easiest way to do this:
let allpairs L =
    [for x in L do
        for y in L -> (x*y)]
Or, without using any loops:
let pairs2 L = L |> List.collect (fun x -> L |> List.map (fun y -> (x*y)))
Edit in response to comment:
You could add a self-crossing extension method to a list like this:
type Microsoft.FSharp.Collections.List<'a> with
    member L.cross f =
        [for x in L do
            for y in L -> f x y]
Example:
> [1;2;3].cross (fun x y -> (x,y));;
val it : (int * int) list =
[(1, 1); (1, 2); (1, 3); (2, 1); (2, 2); (2, 3); (3, 1); (3, 2); (3, 3)]
I wouldn't use an extension method in F# myself, it feels a bit C#'ish. But that's mostly because I don't feel that a fluent syntax is needed in F#, since I usually chain my functions together with pipe (|>) operators.
My approach would be to extend the List module with a cross function, not the type itself:
module List =
    let cross f L1 L2 =
        [for x in L1 do
            for y in L2 -> f x y]
If you do this, you can use the cross function like any other function of the List module:
> List.cross (fun x y -> (x,y)) [1;2;3] [1;2;3];;
val it : (int * int) list =
[(1, 1); (1, 2); (1, 3); (2, 1); (2, 2); (2, 3); (3, 1); (3, 2); (3, 3)]
> List.cross (*) [1;2;3] [1;2;3];;
val it : int list = [1; 2; 3; 2; 4; 6; 3; 6; 9]
Or we can implement a general cross product function:
let cross l1 l2 =
    seq { for el1 in l1 do
            for el2 in l2 do
              yield el1, el2 };;
and use this function to get the job done:
cross [1..10] [1..10] |> Seq.map (fun (a,b) -> a*b) |> Seq.toList
To implement the same thing without for loops, you can use the solution using higher-order functions posted by Mau, or you can write the same thing explicitly using recursion:
let cross xs ys =
    let rec crossAux ol2 l1 l2 =
        match l1, l2 with
        // All elements from the second list were processed
        | x::xs, [] -> crossAux ol2 xs ol2
        // Report first elements and continue looping after
        // removing first element from the second list
        | x::xs, y::ys -> (x, y)::(crossAux ol2 l1 ys)
        // First list is empty - we're done
        | [], _ -> []
    crossAux ys xs ys
This may be useful if you're learning functional programming and recursion, however, the solution using sequence expressions is far more practically useful.
As a side-note, the first version by Mau can be made a bit nicer, because you can join the call to List.map with the call to List.collect id (you can pass the nested processing lambda directly as a parameter to collect). The cross function would look like this (of course, you can modify it to take a function to apply to the two numbers instead of creating a tuple):
let cross xs ys =
    xs |> List.collect (fun v1 ->
        ys |> List.map (fun v2 -> (v1, v2)))
Incidentally, there is a free chapter from my book available, which discusses how sequence expressions and List.collect functions work. It is worth noting that for in sequence expressions directly corresponds to List.collect, so you can write the code just by using this higher-order function:
let cross xs ys =
    xs |> List.collect (fun v1 ->
        ys |> List.collect (fun v2 -> [(v1, v2)] ))
However, see the free chapter for more information :-).
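Newer versions of the core library also ship a ready-made cross product. The sketch below assumes FSharp.Core 4.1 or later, where List.allPairs (and the matching Seq.allPairs and Array.allPairs) are available; the name `products` is illustrative:

```fsharp
// Cross product via the built-in List.allPairs (assumes FSharp.Core 4.1 or later),
// followed by multiplying the two components of each pair.
let products =
    List.allPairs [1 .. 3] [1 .. 3]
    |> List.map (fun (a, b) -> a * b)

printfn "%A" products
// prints [1; 2; 3; 2; 4; 6; 3; 6; 9]
```
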
