F#: How to process a list of tuples? - f#

I'm working on a homework assignment, and I'm trying to learn F#, so I don't want any shortcuts beyond basic things like List.map or lambdas.
I'm trying to process a list of tuples, but I'm having trouble accessing the tuples in the list.
I want to take the list of tuples, add up the numbers in each tuple, and return that number, printing it out each time.
let listTup = [(2,3,4); (4,5,6); (6,7,8)]
let getSum (a,b,c) =
    a+b+c
let rec printSum tpList =
    let total = 0
    match tpList with
    | [] -> total //return 0 if empty list
    | hd::tl ->
        print (getSum hd)

The first thing you want to do is map your tuples through the getSum function. This can be done very simply by piping the list of tuples into List.map getSum. Then you want to print each element in the list, so you pipe the result of List.map getSum into List.iter with the function printfn "%d". This works because of the functions having curried parameters. printfn "%d" applies the "%d" parameter to printfn and returns a function taking an integer, which it then prints. The whole thing would look like this:
let listTup = [(2,3,4); (4,5,6); (6,7,8)]
let getSum (a,b,c) =
    a + b + c
let printSum tpList =
    tpList |> List.map getSum |> List.iter (printfn "%d")
This prints:
9
15
21
We can even simplify the function further if we take advantage of function composition (the >> operator). Notice that printSum takes tpList as its parameter, and then just uses it as input to two functions that are pipelined together. Since pipelining just takes the output of one function and passes it as the last parameter of another function, all we really need to do is compose the function List.map getSum, which takes a list of int 3-tuples and returns a list of ints, with List.iter (printfn "%d"), which takes a list of ints and returns unit. That would look like this:
let printSum = List.map getSum >> List.iter (printfn "%d")
This will print the same results, but is a simpler way of expressing the function.
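As a rough illustration of the partial application described above, the sketch below swaps printfn for sprintf so the result can be inspected as a value rather than printed. The names showInt and formatSums are made up for this example and are not part of the answer.

```fsharp
let getSum (a, b, c) = a + b + c

// sprintf "%d" is partially applied here: it yields a function int -> string,
// in the same way that printfn "%d" yields a function int -> unit.
let showInt : int -> string = sprintf "%d"

// Same shape as the composed printSum, but returning strings instead of printing.
// (formatSums is a hypothetical name used only for this sketch.)
let formatSums : (int * int * int) list -> string list =
    List.map getSum >> List.map showInt

let formatted = formatSums [(2,3,4); (4,5,6); (6,7,8)]
printfn "%A" formatted
```

Returning the formatted values makes it easier to check the pipeline without relying on console output.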

F# has imperative loops as well. In this case I think an imperative loop matches the problem most idiomatically.
let listTup = [(2,3,4); (4,5,6); (6,7,8)]
for a,b,c in listTup do
    let sum = a + b + c
    printfn "%d" sum
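For comparison, the recursive attempt from the question could probably be completed along these lines. This is only a sketch, and it assumes that "return that number" means the grand total of all the sums; the question does not say so explicitly.

```fsharp
let getSum (a, b, c) = a + b + c

// Prints the sum of each tuple and returns the total of all sums.
// Assumption: the caller wants the overall total as the return value.
let rec printSum tpList =
    match tpList with
    | [] -> 0                       // nothing left to add
    | hd :: tl ->
        let s = getSum hd
        printfn "%d" s              // print this tuple's sum
        s + printSum tl             // add it to the total of the rest

let total = printSum [(2,3,4); (4,5,6); (6,7,8)]
```

Note that this version is not tail-recursive, which is fine for short lists but is one more reason to prefer List.map and List.iter for real code.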

Related

Remove same values from two lists and compare them using List.fold in F#

I am trying to create a function that takes two lists and removes the values in one list that are also in the other. E.g., if we have the lists [1;2;3] and [1;2;3;4], then the first list becomes empty []
and the second list is just [4]. At the end I just want to compare both lists.
I am trying to use List.fold for this since I want to understand it better. Also I created my own folder function that deletes elements from a list.
I am very new to F# so I only came up with a partial solution
let rec delete x list =
    match list with
    | [] -> []
    | hd:: tl when hd = x -> tl
    | hd:: tl-> hd:: delete x tl
let myFunc list1 list2 =
    let x = list1 |> List.fold(delete) [] list2
    let y = list2 |> List.fold(delete) [] list1
    x = y
but this does not work and the compiler is telling me "The type '('a -> 'b)' does not support the 'equality' constraint because it is a function type" when I try to use the delete function with the list.fold method.
Although you say you are trying to use List.fold for this to understand it better, there is another List function that makes this simpler. This is to use List.except which is one of a number of methods that treats lists as sets.
let list1 = [1;2;3]
let list2 = [1;2;3;4]
let myFunc list1 list2 =
    list1 |> List.except list2, list2 |> List.except list1
printfn "%A" (myFunc list1 list2)
([], [4])
If you want to understand List.fold here you could try and create an explicit implementation of except using List.fold. However, again, this is simpler to implement using List.filter.
let list1 = [1;2;3]
let list2 = [1;2;3;4]
let except exclude src =
    src |> List.filter (fun i -> exclude |> List.contains i |> not)
let myFuncCustom list1 list2 =
    (list1 |> except list2), (list2 |> except list1)
printfn "%A" (myFuncCustom list1 list2)
([], [4])
So really you want to implement filter using List.fold. In this case you would actually need List.foldBack:
let filter f src =
    List.foldBack (fun item filtered ->
        if f item then item :: filtered else filtered) src []
You can use List.fold, but then the results are reversed and you need to pipe them into List.rev. Note also that List.fold takes only three arguments: first, a folder function; second, the accumulator, which becomes the output (in this case a list too); and last, the source list to fold over. (Let us expand List.contains as well):
let list1 = [1;2;3]
let list2 = [1;2;3;4]
let rec contains item = function
    | [] -> false
    | hd::tl when hd = item -> true
    | hd::tl -> contains item tl
let filter f src =
    src
    |> List.fold (fun filtered item ->
        if f item then item :: filtered else filtered) []
    |> List.rev
let except exclude src =
    src |> filter (fun i -> exclude |> contains i |> not)
let myFuncCustom list1 list2 =
    (list1 |> except list2), (list2 |> except list1)
printfn "%A" (myFuncCustom list1 list2)
([], [4])
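To make the ordering point concrete, here is a small sketch that builds the same filter with List.fold and with List.foldBack and leaves out the List.rev step on purpose. The helper names keepWithFold and keepWithFoldBack are invented for this comparison.

```fsharp
// Filter built on List.fold: items are consed in visiting order,
// so the result comes out reversed unless List.rev is applied afterwards.
let keepWithFold f src =
    List.fold (fun acc item -> if f item then item :: acc else acc) [] src

// Filter built on List.foldBack: the list is processed from the right,
// so consing preserves the original order.
let keepWithFoldBack f src =
    List.foldBack (fun item acc -> if f item then item :: acc else acc) src []

let isOdd n = n % 2 = 1
let viaFold = keepWithFold isOdd [1..6]          // [5; 3; 1]
let viaFoldBack = keepWithFoldBack isOdd [1..6]  // [1; 3; 5]
```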
This should be what you want:
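let difference list blacklist =
    let folder acc a =
        if List.contains a blacklist
        then acc
        else a::acc
    // Note: fold conses in visiting order, so the result is reversed;
    // pipe it into List.rev if the order matters.
    List.fold folder [] list
difference [1;2;3;4] [1;2;3] // [4]
difference [1;2;3] [1;2;3;4] // []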
Looking at the code you posted, there seems to be some confusion about how fold works.
The arguments to fold are:
1. A function that somehow combines a given state with an element of the list. This function can be as simple as summing the two arguments together, resulting in a single scalar, or it can be something really complicated that creates some weird data structure.
2. An initial state, which must be of the type that you want fold to produce.
3. And, of course, the list you want to fold over.
Fold iterates over the list, calling your fold function for every element of the list.
The first time your fold function is called, it will get the initial state. Every other time it will get the state produced by the previous iteration.
Fold will return the last state that was produced by your fold function (or the initial state if the list is empty).
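Applied to the code in the question, those three arguments might be filled in as sketched below: the folder removes one element from the remaining list, the initial state is the list to remove from, and the list being folded over holds the values to remove. The name removeAll is hypothetical; delete is copied from the question and removes only the first matching element.

```fsharp
// delete, as posted in the question: removes the first occurrence of x.
let rec delete x list =
    match list with
    | [] -> []
    | hd :: tl when hd = x -> tl
    | hd :: tl -> hd :: delete x tl

// State: what is left of `source`. Each element of `toRemove` deletes one match.
// (removeAll is a name made up for this sketch.)
let removeAll source toRemove =
    List.fold (fun remaining x -> delete x remaining) source toRemove

// The comparison the question was aiming for.
let myFunc list1 list2 =
    removeAll list1 list2 = removeAll list2 list1

let left = removeAll [1;2;3] [1;2;3;4]     // []
let right = removeAll [1;2;3;4] [1;2;3]    // [4]
```

The original compiler error most likely came from passing four arguments to List.fold, so that the partially applied delete ended up in a position where a value supporting equality was expected.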
As your goal is to better understand fold, I will try to explain fold instead of explaining how to achieve your goal.
fold is basically a for loop for immutable data types. It allows you to eliminate mutable variables. For example,
let's assume you want to sum all values of an integer list. In an "imperative" style you are probably used to
writing something like this.
(* This xs is used throughout all examples *)
let xs = [1..10]
(* Example A1 *)
let mutable sum = 0
for x in xs do
    sum <- sum + x
(* sum = 55 *)
Before you loop through the list, you define a mutable sum, and then you update it on every iteration.
This is how you achieve the same thing with List.fold.
(* Example A2 *)
let sum =
    List.fold (fun sum x ->
        sum + x
    ) 0 xs
(* sum = 55 *)
You can think of List.fold as the following.
The function is the body of the loop that gets executed for every item in your list.
The second argument to List.fold (here 0) is the state you want to compute. This is the sum.
The last argument of List.fold is finally the list you want to traverse.
The function always gets two arguments. The state and the next item of your list. Your function must return
the next state.
With the for-loop you also have state. But the state is outside of the for-loop and you achieve your goal
by mutating the state.
You also can think of the List.fold by mentally mapping the values to the lambda function you provide. The second
argument 0 will be sum in your lambda and x in your lambda is one value of xs. The result of your lambda is
the sum for the next call.
Let's say you want to compute three things on the fly. A mutable version looks like this
(* Helper Function *)
let isEven x = x &&& 1 = 0
(* Example B1 *)
let mutable count = 0
let mutable evens = 0
let mutable sum = 0
for x in xs do
    count <- count + 1
    if isEven x then
        evens <- evens + 1
    sum <- sum + x
(* count=10; evens=5; sum=55 *)
Here we compute the number of values in the list, how many even values exist, and the sum, all in one pass.
List.fold only allows one state, but the state can be a complex value. For example a tuple with three values. The
same example with List.fold looks like this:
(* Example B2 *)
let count,evens,sum =
    List.fold (fun (count,evens,sum) x ->
        (count+1), (if isEven x then evens + 1 else evens), (sum + x)
    ) (0,0,0) xs
(* count=10; evens=5; sum=55 *)
To better understand fold, it is crucial to understand recursion and how immutable data structures such as list work.
You could implement fold yourself like this:
(* Self-defined fold *)
let rec myFold f state xs =
    match xs with
    | [] -> state
    | x::rest -> myFold f (f state x) rest
(* Example C *)
let sum = myFold (fun sum x -> sum + x) 0 xs
(* sum = 55 *)
fold does just two things: it checks whether the list is empty and, in that case, returns the state; otherwise it removes the first element of the list and calls itself recursively by
1. Keeping the function.
2. Producing the next state with (f state x).
3. Using the remaining list rest.
Maybe you wonder about performance. myFold is tail-recursive, and the F# compiler basically turns a tail-recursive function like this into a loop, so there is no performance penalty compared to the code that mutates variables.
This is at least the case in F#. Just a reminder: not every compiler or runtime for other languages supports tail-call optimization.
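One way to see the effect of tail recursion is to run the self-defined fold over a list that is far deeper than the call stack would normally allow. The sketch below assumes the self-recursive tail call is compiled to a loop, as described above; the list size of one million is an arbitrary choice for illustration.

```fsharp
// The self-defined fold from above, repeated so this sketch is self-contained.
let rec myFold f state xs =
    match xs with
    | [] -> state
    | x :: rest -> myFold f (f state x) rest

// One million int64 values: 1L .. 1_000_000L.
let big = List.init 1_000_000 (fun i -> int64 i + 1L)

// If myFold were not compiled to a loop, a recursion this deep
// would be at risk of overflowing the stack.
let bigSum = myFold (+) 0L big
```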

F# sequences with BigInteger indices

I am looking for a type similar to sequences in F# where indices could be big integers, rather than being restricted to int. Does anything like this exist?
By "big integer indices" I mean a type which allows for something equivalent to that:
let s = Seq.initInfinite (fun i -> i + 10I)
The following will generate an infinite series of bigints:
let s = Seq.initInfinite (fun i -> bigint i + 10I)
What I suspect you actually want, though, is a Map<'Key, 'Value>.
This lets you efficiently use a bigint as an index to look up whatever value it is you care about:
let map =
    seq {
        1I, "one"
        2I, "two"
        3I, "three"
    }
    |> Map.ofSeq
// val map : Map<System.Numerics.BigInteger,string> =
// map [(1, "one"); (2, "two"); (3, "three")]
map.TryFind 1I |> (printfn "%A") // Some "one"
map.TryFind 4I |> (printfn "%A") // None
The equivalent of initInfinite for BigIntegers would be
let inf = Seq.unfold (fun i -> let n = i + bigint.One in Some(n, n)) bigint.Zero
let biggerThanAnInt = inf |> Seq.skip System.Int32.MaxValue |> Seq.head // 2147483648
which takes ~2 min to run on my machine.
However, I doubt this is of any practical use :-) That is unless you start at some known value > Int32.MaxValue and stop reasonably soon (generating less than Int32.MaxValue items), which then could be solved by offsetting the BigInt indexes into the Int32 domain.
Theoretically you could amend the Seq module with functions working with BigIntegers to skip / window / ... an amount of items > Int32.MaxValue (e.g. by repeatedly performing the corresponding Int32 variant)
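The offsetting idea could look roughly like the sketch below: the sequence is still indexed by int internally, but each int index is shifted by a bigint starting value before it reaches the generator function. initInfiniteFrom is a hypothetical helper, not something in the Seq module.

```fsharp
// Hypothetical helper: an infinite sequence whose logical indices start at `start`.
// Internally the positions are still ints, so at most Int32.MaxValue items
// can be reached from any one starting point.
let initInfiniteFrom (start: bigint) (f: bigint -> 'T) : seq<'T> =
    Seq.initInfinite (fun i -> f (start + bigint i))

// Start just beyond the range of int.
let first = bigint System.Int32.MaxValue + 1I
let doubled = initInfiniteFrom first (fun i -> i * 2I)

// Logical index first + 2 is 2147483650, so the item is 4294967300.
let third = doubled |> Seq.item 2
```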
Since you want to index into a sequence, I assume you want a version of Seq.item that takes a BigInteger as index. There's nothing like that built into F#, but it's easy to define your own:
open System.Numerics
module Seq =
    let itemI (index : BigInteger) source =
        source |> Seq.item (int index)
Note that no new type is needed unless you're planning to create sequences that are longer than 2,147,483,647 items, which would probably not be practical anyway.
Usage:
let items = [| "moo"; "baa"; "oink" |]
items
|> Seq.itemI 2I
|> printfn "%A" // output: "oink"

Read two integers in the same line as tuple in f#

I'm trying to read two integers which are going to be taken as input from the same line. My attempt so far:
let separator: char =
    ' '
Console.ReadLine().Split separator
|> Array.map Convert.ToInt32
But this returns a two-element array and I have to access the individual indices to access the integers. Ideally, what I would like is the following:
let (a, b) =
    Console.ReadLine().Split separator
    |> Array.map Convert.ToInt32
    |> (some magic to convert the two element array to a tuple)
How can I do that?
I'm afraid there's no magic. You have to explicitly convert the array into a tuple:
let a, b =
    Console.ReadLine().Split separator
    |> Array.map int
    |> (fun arr -> arr.[0], arr.[1])
Edit: you can use reflection, as @dbc suggested, but that's slow and probably overkill for what you're doing.
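A variation on the explicit conversion is to pattern match on the array, which also covers the case where the line does not hold exactly two values. This is a sketch only: parsePair is a made-up name, and it still throws if a token is not a valid integer (for example, with consecutive spaces in the input).

```fsharp
// Hypothetical helper: parse a line such as "3 4" into Some (3, 4).
// Returns None when the line does not contain exactly two tokens.
// Assumption: every token is a valid integer; otherwise `int` throws.
let parsePair (line: string) =
    match line.Split(' ') |> Array.map int with
    | [| a; b |] -> Some (a, b)
    | _ -> None

let ok = parsePair "3 4"        // Some (3, 4)
let tooFew = parsePair "3"      // None
```

With console input this could be used as parsePair (System.Console.ReadLine()), followed by a match on the option.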

In F#, how to get head/tail of a seq without re-evaluating the seq

I'm reading a file and I want to do something with the first line, and something else with all the other lines
let lines = System.IO.File.ReadLines "filename.txt" |> Seq.map (fun r -> r.Trim())
let head = Seq.head lines
let tail = Seq.tail lines
Problem: the call to tail fails because the TextReader is closed.
This means that the Seq is evaluated twice: once to get the head and once to get the tail.
How can I get the firstLine and the lastLines while keeping a Seq and without re-evaluating the Seq?
The signature could be, for example:
let fn: ('a -> Seq<'a> -> 'b) -> Seq<'a> -> 'b
The easiest thing to do is probably just using Seq.cache to wrap your lines sequence:
let lines =
    System.IO.File.ReadLines "filename.txt"
    |> Seq.map (fun r -> r.Trim())
    |> Seq.cache
Of note from the documentation:
This result sequence will have the same elements as the input sequence. The result can be enumerated multiple times. The input sequence is enumerated at most once and only as far as is necessary. Caching a sequence is typically useful when repeatedly evaluating items in the original sequence is computationally expensive or if iterating the sequence causes side-effects that the user does not want to be repeated multiple times.
I generally use a seq expression in which the Stream is scoped inside the expression. That will allow you to enumerate the sequence fully before the stream is disposed. I usually use a function like this:
let readLines file =
    seq {
        use stream = System.IO.File.OpenText file
        while not stream.EndOfStream do
            yield stream.ReadLine().Trim()
    }
Then you should be able to call Seq.head to get the first line in the file, and Seq.last to get the last line in the file. I think this will technically create two different enumerators though. If you want to read the file exactly once, then materializing the sequence to a list or using a function like Seq.cache will be your best option.
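Another possible approach, matching the signature suggested in the question, is to drive a single enumerator by hand. The sketch below is an assumption-laden illustration rather than a library function: withHeadTail is an invented name, the tail it passes along can only be enumerated once, and it must be consumed inside the callback because the enumerator is disposed when the function returns.

```fsharp
// Hypothetical helper: passes the first element and a lazy tail to `f`,
// enumerating the source only once.
// Caveat: `tail` is single-use and only valid while `f` is running.
let withHeadTail (f: 'a -> seq<'a> -> 'b) (source: seq<'a>) : 'b =
    use e = source.GetEnumerator()
    if not (e.MoveNext()) then invalidArg "source" "The sequence is empty."
    let head = e.Current
    let tail = seq { while e.MoveNext() do yield e.Current }
    f head tail

// Count how many items are pulled from the source.
let pulls = ref 0
let source =
    seq {
        for i in 1 .. 3 do
            pulls.Value <- pulls.Value + 1
            yield i
    }

// Materialize the tail inside the callback, before the enumerator is disposed.
let result = withHeadTail (fun h t -> h, List.ofSeq t) source
```

For the file case, source would be the trimmed File.ReadLines sequence, and the callback would process the first line and then iterate the rest.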
I had an important use case for this, where I am using Seq.unfold to read a large number of blocks with REST reads, and sequentially processing each block, with further REST reads.
The reading of the sequence had to be both "lazy" but also cached to avoid duplicate re-evaluation (with every Seq.tail operation).
Hence finding this question and the accepted answer (Seq.cache). Thanks!
I experimented with Seq.cache and discovered that it worked as claimed (ie, lazy and avoid re-evaluation), but with one noteworthy condition - the first five elements of the sequence are always read first (and retained with 'cache'), so experiments on five or smaller numbers won't show lazy evaluation. However, after five, lazy evaluation kicks in for each element.
This code can be used to experiment. Try it for 5, and see no lazy evaluation, and then 10, and see each element after 5 being 'lazy' read, as required. Also remove Seq.cache to see the problem we are addressing (re-evaluation)
// Get a Sequence of numbers.
let getNums n = seq { for i in 1..n do printfn "Yield { %d }" i; yield i}
// Unfold a sequence of numbers
let unfoldNums (nums : int seq) =
    nums
    |> Seq.unfold
        (fun (nums : int seq) ->
            printfn "unfold: nums = { %A }" nums
            if Seq.isEmpty nums then
                printfn "Done"
                None
            else
                let num = Seq.head nums // Value to yield
                let tl = Seq.tail nums // Next State. CAUSES RE-EVALUATION!
                printfn "Yield: < %d >, tl = { %A }" num tl
                Some (num,tl))
// Get n numbers as a sequence, then unfold them as a sequence
// Observe that with 'Seq.cache' input is not re-evaluated unnecessarily,
// and also that lazy evaluation kicks in for n > 5
let experiment n =
    getNums n
    |> Seq.cache
    // Without cache, Seq.tail causes the sequence to be re-evaluated
    |> unfoldNums
    |> Seq.iter (fun x -> printfn "Process: %d" x)

Trying to filter out values in a sequence that are not in another sequence

I am trying to filter out values from a sequence, that are not in another sequence. I was pretty sure my code worked, but it is taking a long time to run on my computer and because of this I am not sure, so I am here to see what the community thinks.
Code is below:
let statezip =
    StateCsv.GetSample().Rows
    |> Seq.map (fun row -> row.State)
    |> Seq.distinct
type State = State of string
let unwrapstate (State s) = s
let neededstates (row:StateCsv) = Seq.contains (unwrapstate row.State) statezip
I am filtering by the neededstates function. Is there something wrong with the way I am doing this?
let datafilter =
    StateCsv1.GetSample().Rows
    |> Seq.map (fun row -> row.State,row.Income,row.Family)
    |> Seq.filter neededstates
    |> List.ofSeq
I believe that it should filter the sequence by the values that are true, since neededstates function is a bool. StateCsv and StateCsv1 have the same exact structure, although from different years.
Evaluation of contains on sequences and lists can be slow. For a case where you want to check for the existence of an element in a collection, the F# Set type is ideal. You can convert your sequences to sets using Set.ofSeq, and then run the logic over the sets instead. The following example uses the numbers from 0 to 10000 and then uses both sequences and sets to filter the result to only the odd numbers by checking that the values are not in a collection of even numbers.
Using Sequences:
let numberSeq = {0..10000}
let evenNumberSeq = seq { for n in numberSeq do if (n % 2 = 0) then yield n }
#time
numberSeq |> Seq.filter (fun n -> evenNumberSeq |> Seq.contains n |> not) |> Seq.toList
#time
This runs in about 1.9 seconds for me.
Using sets:
let numberSet = numberSeq |> Set.ofSeq
let evenNumberSet = evenNumberSeq |> Set.ofSeq
#time
numberSet |> Set.filter (fun n -> evenNumberSet |> Set.contains n |> not)
#time
This runs in only 0.005 seconds. Hopefully you can materialize your sequences to sets before performing your contains operation, thereby getting this level of speedup.
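Carried over to the shape of the question, the idea might look like the sketch below. The rows here are hypothetical stand-ins for the CSV data (state, income, family size), since the real StateCsv types come from a type provider that is not shown.

```fsharp
// Hypothetical stand-ins for the CSV rows: (state, income, family size).
let rows = [ ("CA", 50000, 3); ("TX", 42000, 4); ("NY", 61000, 2) ]

// Build the lookup set once, instead of scanning a sequence for every row.
let neededStates = Set.ofList [ "CA"; "NY" ]

// Keep only the rows whose state is in the set.
let filtered =
    rows
    |> List.filter (fun (state, _, _) -> Set.contains state neededStates)
```

Note that the predicate here takes the mapped tuple, not the original row type, which is the shape Seq.filter receives after the Seq.map step in the question.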

Resources