I have a solution to this, and several working-but-unsatisfactory solutions, but it took a lot of work and seems unnecessarily complex.
Am I missing something in F#?
The Problem
I have a sequence of numbers
let nums = seq { 9; 12; 4; 17; 9; 7; 13; }
I want to decorate each number with an "index", so the result is
seq [(9, 0); (12, 1); (4, 2); (17, 3); ...]
Looks simple!
In practice the input can be very large and of indeterminate size. In my application, it is coming from a REST service.
Further
the operation must support lazy evaluation (because of the REST backend)
must be purely functional, which eliminates the obvious seq { let mutable i = 0; for num in nums do .. } solution, and likewise while ... do ...
Let's call the function decorate, of type (seq<'a> -> seq<'a * int>), so it would work as follows:
nums
|> decorate
|> Seq.iter (fun (n,index) -> printfn "%d: %d" index n)
Producing:
0: 9
1: 12
2: 4
...
6: 13
This is a trivial problem with Lists (apart from the lazy evaluation), but tricky with Sequences.
My solution is to use Seq.unfold, as follows:
let decorate numSeq =
    (0,numSeq)
    |> Seq.unfold
        (fun (count,(nums : int seq)) ->
            if Seq.isEmpty nums then
                None
            else
                let result = ((Seq.head nums),count)
                let remaining = Seq.tail nums
                Some( result, (count+1,remaining)))
This meets all requirements, and is the best I've come up with.
Here's the whole solution, with diagnostics to show lazy evaluation:
let nums =
    seq {
        // With diagnostic
        let getN n =
            printfn "get: %d" n
            n
        getN 9;
        getN 12;
        getN 4;
        getN 17;
        getN 9;
        getN 7;
        getN 13
    }
let decorate numSeq =
    (0,numSeq)
    |> Seq.unfold
        (fun (count,(nums : int seq)) ->
            if Seq.isEmpty nums then
                None
            else
                let result = ((Seq.head nums),count)
                let remaining = Seq.tail nums
                printfn "unfold: %A" result
                Some( result, (count+1,remaining)))
nums
|> Seq.cache
// To prevent re-computation of the sequence.
// Will be necessary for any solution. This solution required only one.
|> decorate
|> Seq.iter (fun (n,index) -> printfn "ITEM %d: %d" index n)
PROBLEM: This took a LOT of work to reach. It looks complex, compared to the (apparently) simple requirement.
QUESTION: Is there a simpler solution?
Discussion of some alternatives.
All work, but are unsatisfactory for the reasons given
// Most likely: Seq.mapFold
// Fails lazy evaluation. The final state must be evaluated, even if not used
let decorate numSeq =
    numSeq
    |> Seq.mapFold
        (fun count num ->
            let result = (num,count)
            printfn "yield: %A" result
            (result,(count + 1)))
        0
    |> fun (nums,finalState) -> nums // And, no, using "_" doesn't fix it!
// 'for' loop, with MUTABLE
// Lazy evaluation works
// Not extensible, as the state 'count' is specific to this problem
let decorate numSeq =
    let mutable count = 0
    seq {
        for num in numSeq do
            let result = num,count
            printfn "yield: %A" result
            yield result;
            count <- count+1
    }
// 'for' loop, without mutable
// Fails lazy evaluation, and is ugly
let decorate numSeq =
    seq {
        for index in 0..((Seq.length numSeq) - 1) do
            let result = ((Seq.item index numSeq), // Ugly!
                          index)
            printfn "yield: %A" result
            yield result
    }
// "List" like recursive descent,
// Fails lazy evaluation. Ugly, because we are not meant to use recursion on Sequences
// https://stackoverflow.com/questions/11451727/recursive-functions-for-sequences-in-f
let rec decorate' count (nums : int seq) =
    if Seq.isEmpty nums then
        Seq.empty
    else
        let hd = Seq.head nums
        let tl = Seq.tail nums
        let result = (hd,count)
        let tl' = decorate' (count+1) tl
        printfn "yield: %A" result
        seq { yield result; yield! tl'}
let decorate : (seq<'a> -> seq<'a * int>) = decorate' 0
You can use Seq.mapi to do what you need.
let nums = seq { 9; 12; 4; 17; 9; 7; 13; }
nums |> Seq.mapi (fun i num -> (num, i))
This gives (9, 0); (12, 1); etc...
Seq is "lazy" in the same sense as IEnumerable in C#.
You can read about Seq.mapi here:
https://fsharp.github.io/fsharp-core-docs/reference/fsharp-collections-seqmodule.html#mapi
Read more about the use of map here:
https://fsharpforfunandprofit.com/posts/elevated-world/#map
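To check that Seq.mapi keeps evaluation lazy, a minimal sketch along these lines may help. The pulled buffer and the four-element source are assumptions made up for this illustration; they stand in for the REST-backed sequence from the question.

```fsharp
// Hypothetical diagnostic: records every element the source actually produces.
let pulled = ResizeArray<int>()

let source =
    seq {
        for n in [9; 12; 4; 17] do
            pulled.Add n      // side effect runs only when the element is requested
            yield n
    }

// Decorate lazily, then ask for just the first two items.
let firstTwo =
    source
    |> Seq.mapi (fun i n -> (n, i))
    |> Seq.take 2
    |> Seq.toList

printfn "%A" firstTwo               // [(9, 0); (12, 1)]
printfn "pulled: %d" pulled.Count   // 2: the remaining elements were never requested
```

Because only two elements are requested downstream, the source is not evaluated past the second element.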
In addition to the Seq.mapi function mentioned in Sean's answer, F# also has a built-in Seq.indexed function, which decorates a sequence with index. This does not do exactly what you're asking, because the index becomes the first element of the tuple, but depending on your use case, it may do the trick:
> let nums = seq { 9; 12; 4; 17; 9; 7; 13; };;
val nums : seq<int>
> Seq.indexed nums;;
val it : seq<int * int> = seq [(0, 9); (1, 12); (2, 4); (3, 17); ...]
If I was trying to implement this on my own using a more primitive function, it could be done using Seq.scan, which is a bit like fold but produces a lazy sequence of states. The only tricky thing is that you have to construct the initial state and then process the rest of the sequence:
Seq.tail nums
|> Seq.scan (fun (prevIndex, _) v -> (prevIndex+1, v)) (0, Seq.head nums)
This will not work for empty sequences (Seq.head throws on an empty input), even though the function should logically be able to handle them.
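One possible way around the empty-input problem is to seed the scan with a placeholder state and filter it out afterwards, so Seq.head is never called. This is only a sketch; the name indexedViaScan and the (-1, None) seed are choices made for this example, not part of the answer above.

```fsharp
// Sketch: index a sequence with Seq.scan without calling Seq.head.
// The seed (-1, None) is a placeholder state that is filtered out afterwards.
let indexedViaScan (xs: seq<'a>) : seq<int * 'a> =
    xs
    |> Seq.scan (fun (i, _) x -> (i + 1, Some x)) (-1, None)
    |> Seq.choose (fun (i, item) -> item |> Option.map (fun v -> (i, v)))

printfn "%A" (indexedViaScan [9; 12; 4] |> Seq.toList)       // [(0, 9); (1, 12); (2, 4)]
printfn "%A" (indexedViaScan Seq.empty<int> |> Seq.toList)   // []
```

Both Seq.scan and Seq.choose are lazy, so the result is still produced on demand.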
Using for is not bad or wrong. for and yield in a seq {} is how you write new seq functions if none of the functions provided in the Seq module is a good fit. It is neither wrong nor bad to use this special construct. It's the same as the C# foreach and yield syntax.
Using a mutable in a limited scope is also not wrong. Mutables are a bad idea if they escape the scope, for example when you return a mutable value from a function.
It's important to put the mutable inside the seq, and not outside. Your version is wrong.
Let's assume this
let xs = decorate [3;6;7;12;9]
for x in xs do
    printfn "%A" x
for x in xs do
    printfn "%A" x
Now you have two versions of decorate. The first version
let decorate numSeq =
    let mutable count = 0
    seq {
        for num in numSeq do
            yield (num,count)
            count <- count + 1
    }
will print:
(3, 0)
(6, 1)
(7, 2)
(12, 3)
(9, 4)
(3, 5)
(6, 6)
(7, 7)
(12, 8)
(9, 9)
In other words, the mutable is shared across every iteration of the sequence. As a general tip: if you want to return a seq, put all your code into the seq, and put the seq {} right after the = sign. If you do this instead:
let decorate numSeq = seq {
    let mutable count = 0
    for num in numSeq do
        yield (num,count)
        count <- count + 1
}
you get the correct output:
(3, 0)
(6, 1)
(7, 2)
(12, 3)
(9, 4)
(3, 0)
(6, 1)
(7, 2)
(12, 3)
(9, 4)
Further, you explain that this version is not "extensible". But the version with mapi that you selected as "correct" has the same problem: it only provides an index, nothing more.
If you want a more generic version, you can always write a function that takes the initial state and the state-update logic as arguments. For example, you could change the above function to this code:
let decorate2 f (state:'State) (xs:'T seq) = seq {
    let mutable state = state
    for x in xs do
        yield state, x
        let newState = f state x
        state <- newState
}
Now decorate2 expects a state that you can freely pass, and a function to change the state. With this function you could then write:
decorate2 (fun state _ -> state+1) 0 [3;6;7;12;9]
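To illustrate what the extra generality buys, here is a small sketch that reuses the same decorate2 shape for two different kinds of state. The running-sum use is a hypothetical example added for illustration; it is not something the question asked for.

```fsharp
let decorate2 f (state: 'State) (xs: 'T seq) = seq {
    let mutable state = state
    for x in xs do
        yield state, x          // emit the state as it was before seeing x
        state <- f state x      // then advance the state
}

// State as an index, as in the call above
let withIndex =
    decorate2 (fun i _ -> i + 1) 0 [3; 6; 7; 12; 9] |> Seq.toList
// [(0, 3); (1, 6); (2, 7); (3, 12); (4, 9)]

// Hypothetical second use: the sum of all elements seen before the current one
let withRunningSum =
    decorate2 (fun sum x -> sum + x) 0 [3; 6; 7; 12; 9] |> Seq.toList
// [(0, 3); (3, 6); (9, 7); (16, 12); (28, 9)]
```

Note that the state comes first in each tuple, matching the yield state, x line.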
The function signature is nearly the same as that of Seq.scan, but still a little different. If you want to create an indexed function, you could use scan like this:
let indexed xs =
    Seq.scan (fun (count,_) x -> (count+1,x)) (0,Seq.head xs) (Seq.skip 1 xs)
In my opinion, this version is harder to read and understand, and just fugly compared to decorate or decorate2.
And just a note: there is already a Seq.indexed function in the standard library that does what you wish.
for x in Seq.indexed [3;6;7;12;9] do
    printfn "%A" x
will print
(0, 3)
(1, 6)
(2, 7)
(3, 12)
(4, 9)
I have a situation where I'm parsing a file and I need to know both:
the current line
the previous line
Before the previous-line requirement, I was doing something like:
myData
|> List.mapi (fun i data -> parse i data)
But now I need access to the previous line. scan is ideal for that, but then I lose the index.
So I need a List.scani function :) Is it something that could be built easily in an idiomatic way?
You could define scani as follows:
let scani (f:int->'S->'T->'S) (state:'S) (list:'T list) =
    list
    |>List.scan (fun (i,s) x -> (i+1,f i s x)) (0,state)
    |>List.map snd
This wraps the original state in a tuple together with a counter, initialized as (0,state). The state is manipulated as usual by the folder function f (which now takes an extra i parameter), and the counter is incremented by one at each step. Finally, we remove the counter by taking the second element of each tuple.
You could use it as follows, where i is the index, s is the state, and x the element.
[1;2;3]
|> scani (fun i s x -> s + i*x) 0
|> should equal [0;0;2;8]
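Applied to the original problem (index plus previous line), scani could be used roughly as follows. This is a sketch under assumptions: the lines list and the (index, previous, current) state shape are made up for the example, and an empty string stands in for "no previous line".

```fsharp
let scani (f: int -> 'S -> 'T -> 'S) (state: 'S) (list: 'T list) =
    list
    |> List.scan (fun (i, s) x -> (i + 1, f i s x)) (0, state)
    |> List.map snd

// Hypothetical input lines
let lines = ["alpha"; "beta"; "gamma"]

// State is (index, previous line, current line); the seed is dropped with List.tail.
let withPrevious =
    lines
    |> scani (fun i (_, _, current) line -> (i, current, line)) (-1, "", "")
    |> List.tail

printfn "%A" withPrevious
// [(0, "", "alpha"); (1, "alpha", "beta"); (2, "beta", "gamma")]
```

List.scan always returns the seed as its first element, so List.tail is safe here even for an empty input.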
It may not be the most efficient way to do it, but it seems to work (I called it scanl given that you want access to the previous element, or line):
let scanl f s l =
    List.scan (fun (acc,elem0) elem1 -> (f acc elem0 elem1),elem1) (s,List.head l) l
    |> List.map fst
Examples of usage:
let l = [1..5]
scanl (fun acc elem0 elem1 -> elem0,elem1) (0,0) l
//result: [(0, 0); (1, 1); (1, 2); (2, 3); (3, 4); (4, 5)]
The usual List.scan would give this:
List.scan (fun acc elem -> elem) 0 l
//result [0; 1; 2; 3; 4; 5]
Let's say I have an array
let arr = [|1;2;3;4;5;6|]
I would like to convert it to something like
[|(1,2);(3,4);(5,6)|]
I've seen Seq.window but this one is going to generate something like
[|(1,2);(2,3);(3,4);(4,5);(5,6)|]
which is not what I want
You can use Array.chunkBySize and then map each sub-array into tuples:
let input = [|1..10|]
Array.chunkBySize 2 input |> Array.map (fun xs -> (xs.[0], xs.[1]))
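With an odd number of elements, the last chunk holds a single item and xs.[1] fails with an index error. A hedged variant that simply drops such a trailing element could look like this; toPairs is a name invented for the sketch.

```fsharp
// Sketch: pair up neighbours, silently dropping a trailing unpaired element.
let toPairs (arr: 'a[]) : ('a * 'a)[] =
    arr
    |> Array.chunkBySize 2
    |> Array.choose (function
        | [| a; b |] -> Some (a, b)   // a full chunk becomes a tuple
        | _ -> None)                  // a trailing single element is dropped

printfn "%A" (toPairs [| 1 .. 6 |])   // [|(1, 2); (3, 4); (5, 6)|]
printfn "%A" (toPairs [| 1 .. 5 |])   // [|(1, 2); (3, 4)|]
```

Whether dropping the leftover element is acceptable depends on the use case; raising an error may be the better choice when an even length is an invariant.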
@Slugart's accepted answer is the best approach (IMO) assuming you know that the array has an even number of elements, but here's another approach that doesn't throw an exception if there does happen to be an odd number (it just omits the last trailing element):
let arr = [|1;2;3;4;5|]
seq { for i in 0 .. 2 .. arr.Length - 2 -> (arr.[i], arr.[i+1]) } |> Seq.toArray
You could use Seq.pairwise, as long as you filter out every other tuple. The filtering needs to pass a state through the iteration, which is usually effected by the scan function.
[|1..10|]
|> Seq.pairwise
|> Seq.scan (fun s t ->
    match s with None -> Some t | _ -> None )
    None
|> Seq.choose id
|> Seq.toArray
// val it : (int * int) [] = [|(1, 2); (3, 4); (5, 6); (7, 8); (9, 10)|]
But then it's also possible to have scan generate the tuples directly, at the cost of an intermediate array.
[|1..10|]
|> Array.scan (function
    | Some x, _ -> fun y -> None, Some(x, y)
    | _ -> fun x -> Some x, None )
    (None, None)
|> Array.choose snd
Use Seq.pairwise to turn a sequence into tuples
[|1;2;3;4;5;6|]
|> Seq.pairwise
|> Seq.toArray
val it : (int * int) [] = [|(1, 2); (2, 3); (3, 4); (4, 5); (5, 6)|]
Should be:
let rec slice =
    function
    | [] -> []
    | a::b::rest -> (a,b) :: slice (rest)
    | _::[] -> failwith "cannot slice uneven list"
I have some code I am trying to test, that is supposed to merge two int lists of same length into a tuple list. I have got it to compile but I cannot find out if it works as I am having trouble printing the result.
Here is what I have so far:
let myList = [5;15;20;25;30;200]
let myList2 = [6;16;21;26;31;201]
let rec tupleMaker (list1: int list) (list2: int list) =
    match list1, list2 with
    | (h1 :: tail1),(h2 :: tail2)->
        let (a,b) = (h1,h2)
        (a,b) :: tupleMaker tail1 tail2
    | _,_->
        []
let z = tupleMaker myList, myList2
//printfn z
//printfn %A
The printfn does not work, and neither has anything else I tried. Any help would be greatly appreciated.
You just implemented List.zip:
List.zip myList myList2
//val it : (int * int) list =
// [(5, 6); (15, 16); (20, 21); (25, 26); (30, 31); (200, 201)]
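For completeness, a minimal sketch of the corrected call and print. Two things differ from the question's code: function arguments are separated by spaces (tupleMaker myList, myList2 builds a tuple of a partially applied function and a list), and printfn needs a format string before the value.

```fsharp
let myList = [5; 15; 20; 25; 30; 200]
let myList2 = [6; 16; 21; 26; 31; 201]

// Arguments are separated by spaces; `f a, b` would build a tuple instead.
let z = List.zip myList myList2

// printfn needs a format string; %A prints any F# value structurally.
printfn "%A" z
```

One difference to keep in mind: List.zip raises an exception when the lists have different lengths, whereas the hand-written tupleMaker stops at the shorter list.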
Note: The OP has not been seen in months, so this answer will probably never get an accept vote. Don't let that stop you from treating it as a correct answer.
First to generate a list of tuples.
This uses two different types of list to make the answer more useful in general.
let myList: int list = [1;2;3]
let myList2 : string list = ["a";"b";"c"]
let listOfTuple:(int * string) list = List.zip myList myList2
There are many ways to print a list of tuples, but the basic idea is to use List.iter to access the individual tuples in the list and then apply standard means to access the items in the tuple.
Example 1:
This doesn't use List.iter. It uses just printfn %A. This is useful when you are stuck trying to figure out why something will not print and just need to see the data as the system sees it.
printfn "%A" listOfTuple
Result:
[(1, "a"); (2, "b"); (3, "c")]
Example 2:
This uses List.iter with printfn %A. This is useful when you know the data is a list but don't know the type of the individual items.
listOfTuple |>
    List.iter (printfn "%A")
Result:
(1, "a")
(2, "b")
(3, "c")
Example 3:
This uses List.iter with a tuple deconstructor, e.g. let (a,b) = values, to get at the individual values of the tuple. This is useful if you want to print every value of every item in the list.
listOfTuple |>
    List.iter(
        fun values ->
            let (a,b) = values
            printfn ("%i, %s") a b
    )
Result:
1, a
2, b
3, c
Example 4:
This uses List.iter with a match statement to get at the individual values of the tuple. This is useful if you want to do more complicated processing, such as filtering before printing, or having different printing messages and/or formats for different values.
listOfTuple |>
    List.iter(
        fun values ->
            match values with
            | (a,_) when a > 1 ->
                printfn ("%i") a
            | (_,_) -> ()
    )
Result
2
3
When I write in python, I always try to think of alternatives as if I was using F#:
I have a seq of tuples (key, value1, value2, ...). I simplify the tuple here so that it has only two elements. The keys contain duplicates.
let xs = Seq.zip [|1;2;2;3;3;3;4;1;1;2;2|] {0..10} // so this is a seq of tuple
[(1, 0),
(2, 1),
(2, 2),
(3, 3),
(3, 4),
(3, 5),
(4, 6),
(1, 7),
(1, 8),
(2, 9),
(2, 10)]
Now, I would like to build a function which takes the seq as input and returns a seq that is a subset of the original seq.
It must capture all items where the key changes, and include the first and last items of the seq if they are not already there.
f1(xs) = [(1, 0), (2, 1), (3, 3), (4, 6), (1, 7), (2, 9), (2, 10)]
f1([]) = []
The following is my python code, it works, but I don't really like it.
xs = zip([1,2,2,3,3,3,4,1,1,2,2], range(11))
def f1(xs):
    if not xs:
        return
    last_a = None # I wish I don't have to use None here.
    is_yield = False
    for a, b in xs:
        if a != last_a:
            last_a = a
            is_yield = True
            yield (a, b)
        else:
            is_yield = False
    if not is_yield:
        yield (a, b) # Ugly, use variable outside the loop.
print list(f1(xs))
print list(f1([]))
Here is another way, using the itertools library
import itertools

def f1(xs):
    group = None
    for _, group_iter in itertools.groupby(xs, key = lambda pair: pair[0]):
        group = list(group_iter)
        yield group[0]
    # make sure we yield xs[-1], doesn't work if xs is iterator.
    if group and len(group) > 1: # again, ugly, use variable outside the loop.
        yield group[-1]
In F#, Seq.groupBy behaves differently from groupby in Python (it groups all elements with the same key across the whole sequence, not just consecutive runs). I am wondering how this problem can be solved as functionally as possible, with fewer reference cells, less mutable state, and without too much hassle.
A recursive solution that should work, but also isn't particularly beautiful or short, could look something like this - but using pattern matching definitely makes this a bit nicer:
let whenKeyChanges input = seq {
    /// Recursively iterate over input, when the input is empty, or we found the last
    /// element, we just return it. Otherwise, we check if the key has changed since
    /// the last produced element (and return it if it has), then process the rest
    let rec loop prevKey input = seq {
        match input with
        | [] -> ()
        | [last] -> yield last
        | (key, value)::tail ->
            if key <> prevKey then yield (key, value)
            yield! loop key tail }
    // Always return the first element if the input is not empty
    match List.ofSeq input with
    | [] -> ()
    | (key, value)::tail ->
        yield (key, value)
        yield! loop key tail }
If you wanted a nicer and a bit more declarative solution, then you could use data frame and time series library that I've been working on at BlueMountain Capital (not yet officially announced, but should work).
// Series needs to have unique keys, so we add an index to your original keys
// (so we have series with (0, 1) => 0; (1, 2) => 1; ...
let xs = series <| Seq.zip (Seq.zip [0..10] [1;2;2;3;3;3;4;1;1;2;2]) [0..10]
// Create chunks such that your part of the key is the same in each chunk
let chunks = xs |> Series.chunkWhile (fun (_, k1) (_, k2) -> k1 = k2)
// For each chunk, return the first element, or the first and the last
// element, if this is the last chunk (as you always want to include the last element)
chunks
|> Series.map (fun (i, k) chunk ->
    let f = Series.firstValue chunk
    let l = Series.lastValue chunk
    if (i, k) = Series.lastKey chunks then
        if f <> l then [k, f; k, l] else [k, l]
    else [k, f])
// Concatenate the produced values into a single sequence
|> Series.values |> Seq.concat
The chunking is the key operation that you need here (see the documentation). The only tricky thing is returning the last element - which could be handled in multiple different ways - not sure if the one I used is the nicest.
The simplest solution would likely be to convert the sequence to an array and couple John's approach with snatching the first and last elements by index. But, here's another solution to add to the mix:
let f getKey (items: seq<_>) =
    use e = items.GetEnumerator()
    let rec loop doYield prev =
        seq {
            if doYield then yield prev
            if e.MoveNext() then
                yield! loop (getKey e.Current <> getKey prev) e.Current
            elif not doYield then yield prev
        }
    if e.MoveNext() then loop true e.Current
    else Seq.empty
//Usage: f fst xs
I think something like this will work
let remove dup =
    dup
    |> Seq.pairwise
    |> Seq.filter (fun ((a,b),(c,d)) -> a <> c)
    |> Seq.map fst
A correct solution needs to be aware of the end of the sequence, in order to satisfy the special case regarding the last element. Thus there either needs to be two passes such that length is known before processing (e.g. Tomas's solution - first pass is copy to list which unlike seq exposes its "end" as you iterate) or you need to rely on IEnumerable methods so that you know as you iterate when the end has been reached (e.g. Daniel's solution).
Below is inspired by the elegance of John's code, but handles the special cases by obtaining the length up front (2-pass).
let remove dup =
    let last = Seq.length dup - 2
    seq{
        yield Seq.head dup
        yield! dup
               |> Seq.pairwise
               |> Seq.mapi (fun i (a,b) -> if fst a <> fst b || i = last then Some(b) else None)
               |> Seq.choose id
    }
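As a further variation on the "know where the end is" idea, the sequence can be padded with a sentinel on each side and examined through a sliding window of three, which keeps everything in one lazy pass. This is a sketch rather than a recommended implementation; keyChanges and the None sentinels are choices made for this example.

```fsharp
// Sketch: single lazy pass. Each window is [previous; current; next],
// with None marking "before the first item" and "after the last item".
let keyChanges (xs: seq<'k * 'v>) : seq<'k * 'v> =
    seq {
        yield None                 // sentinel before the first item
        yield! Seq.map Some xs
        yield None                 // sentinel after the last item
    }
    |> Seq.windowed 3
    |> Seq.choose (fun window ->
        match window with
        | [| None; current; _ |] -> current        // first item
        | [| _; current; None |] -> current        // last item
        | [| Some (k0, _); (Some (k1, _) as current); _ |] when k0 <> k1 -> current
        | _ -> None)

let sample = Seq.zip [1; 2; 2; 3; 3; 3; 4; 1; 1; 2; 2] [0 .. 10]
printfn "%A" (keyChanges sample |> Seq.toList)
// [(1, 0); (2, 1); (3, 3); (4, 6); (1, 7); (2, 9); (2, 10)]
```

An empty input produces only the two sentinels, which is too short for any window, so the result is empty as well.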
Sorry to chime in late here. While the answers so far are very good, I feel that they don't express the fundamental need for mutable state in order to return the last element. While I could rely on IEnumerable methods too, sequence expressions are basically equivalent. We start by defining a three-way DU to encapsulate state.
type HitOrMiss<'T> =
    | Starting
    | Hit of 'T
    | Miss of 'T
let foo2 pred xs = seq{
    let store = ref Starting        // Save last element and state
    for current in xs do            // Iterate sequence
        match !store with           // What had we before?
        | Starting ->               // No element yet
            yield current           // Yield first element
            store := Hit current
        | Hit last                  // Check if predicate is satisfied
        | Miss last when pred last current ->
            yield current           // Then yield intermediate element
            store := Hit current
        | _ ->
            store := Miss current
    match !store with
    | Miss last ->                  // Yield last element, if not already
        yield last
    | _ -> () }
[(1, 0)
 (2, 1)
 (2, 2)
 (3, 3)
 (3, 4)
 (3, 5)
 (4, 6)
 (1, 7)
 (1, 8)
 (2, 9)
 (2, 10)]
|> foo2 (fun (a,_) (b,_) -> a <> b) |> Seq.toList |> printfn "%A"
// [(1, 0); (2, 1); (3, 3); (4, 6); (1, 7); (2, 9); (2, 10)]
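As far as I know, F# 6 and later emit informational messages for the ! and := operators on ref cells and suggest .Value or let mutable instead. Under that assumption, the same state machine can be sketched with a mutable local declared inside the sequence expression; the logic is otherwise unchanged from foo2 above.

```fsharp
type HitOrMiss<'T> =
    | Starting
    | Hit of 'T
    | Miss of 'T

let foo2 pred xs = seq {
    let mutable store = Starting            // last element seen, and whether it was yielded
    for current in xs do
        match store with
        | Starting ->                       // no element yet: always yield the first
            yield current
            store <- Hit current
        | Hit last
        | Miss last when pred last current ->
            yield current                   // predicate satisfied: yield and remember
            store <- Hit current
        | _ ->
            store <- Miss current           // remember the element without yielding it
    match store with
    | Miss last -> yield last               // the final element was not yielded yet
    | _ -> () }

let result =
    Seq.zip [1; 2; 2; 3; 3; 3; 4; 1; 1; 2; 2] [0 .. 10]
    |> foo2 (fun (a, _) (b, _) -> a <> b)
    |> Seq.toList
// [(1, 0); (2, 1); (3, 3); (4, 6); (1, 7); (2, 9); (2, 10)]
```

Because the mutable lives inside the seq block, each enumeration starts again from Starting.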