F# Basics: Folding 2 lists together into a string - f#

a little rusty from my Scheme days, I'd like to take 2 lists: one of numbers and one of strings, and fold them together into a single string where each pair is written like "{(ushort)5, "bla bla bla"},\n". I have most of it, i'm just not sure how to write the Fold properly:
let splitter = [|","|]
let indexes =
indexStr.Split(splitter, System.StringSplitOptions.None) |> Seq.toList
let values =
valueStr.Split(splitter, System.StringSplitOptions.None) |> Seq.toList
let pairs = List.zip indexes values
printfn "%A" pairs
let result = pairs |> Seq.fold
(fun acc a -> String.Format("{0}, \{(ushort){1}, \"{2}\"\}\n",
acc, (List.nth a 0), (List.nth a 1)))

Your missing two things. The initial state of the fold which is an empty string and you can't use list comprehension on tuples in F#.
let splitter = [|","|]
let indexes =
indexStr.Split(splitter, System.StringSplitOptions.None) |> Seq.toList
let values =
valueStr.Split(splitter, System.StringSplitOptions.None) |> Seq.toList
let pairs = List.zip indexes values
printfn "%A" pairs
let result =
pairs
|> Seq.fold (fun acc (index, value) ->
String.Format("{0}{{(ushort){1}, \"{2}\"}},\n", acc, index, value)) ""
fold2 version
let result =
List.fold2
(fun acc index value ->
String.Format("{0}{{(ushort){1}, \"{2}\"}},\n", acc, index, value))
""
indexes
values
If you are concerned with speed you may want to use string builder since it doesn't create a new string every time you append.
let result =
List.fold2
(fun (sb:StringBuilder) index value ->
sb.AppendFormat("{{(ushort){0}, \"{1}\"}},\n", index, value))
(StringBuilder())
indexes
values
|> string

Fold probably isn't the best method for this task. Its a lot easier to map and concat like this:
let l1 = "a,b,c,d,e".Split([|','|])
let l2 = "1,2,3,4,5".Split([|','|])
let pairs =
Seq.zip l1 l2
|> Seq.map (fun (x, y) -> sprintf "(ushort)%s, \"%s\"" x y)
|> String.concat "\n"

I think you want List.fold2. For some reason the List module has a fold2 member but Seq doesn't. Then you can dispense with the zip entirely.
The types of your named variables and the type of the result you hope for are all implicit, so it's difficult to help, but if you are trying to accumulate a list of strings you might consider something along the lines of
let result = pairs |> Seq.fold
(fun prev (l, r) ->
String.Format("{0}, \{(ushort){1}, \"{2}\"\}\n", prev, l, r)
"" pairs
My F#/Caml is very rusty so I may have the order of arguments wrong. Also note your string formation is quadratic; in my own code I would go with something more along these lines:
let strings =
List.fold2 (fun ss l r ->
String.format ("\{(ushort){0}, \"{1}\"\}\n", l, r) :: ss)
[] indexes values
let result = String.concat ", " strings
This won't cost you quadratic time and it's a little easier to follow. I've checked MSDN and believe I have the correct order of arguments on fold2.
Keep in mind I know Caml not F# and so I may have details or order of arguments wrong.

Perhaps this:
let strBuilder = new StringBuilder()
for (i,v) in Seq.zip indexes values do
strBuilder.Append(String.Format("{{(ushort){0}, \"{1}\"}},\n", i,v))
|> ignore
with F# sometimes is better go imperative...

map2 or fold2 is the right way to go. Here's my take, using the (||>) operator:
let l1 = [| "a"; "b"; "c"; "d"; "e" |]
let l2 = [| "1"; "2"; "3"; "4"; "5" |]
let pairs = (l1, l2) ||> Seq.map2 (sprintf ("(ushort)%s, \"%s\""))
|> String.concat "\n"

Related

Remove All but First Occurrence of a Character in a List of Strings

I have a list of names, and I need to output a single string that shows the letters from the names in the order they appear without the duplicates (e.g. If the list is ["John"; "James"; "Jack"], the output string should be Johnamesck). I've got a solution (folding all the names into a string then parse), but I feel like I'm cheesing it a bit by making my string mutable.
I also want to state this is not a school assignment, just an exercise from a work colleague as I'm coming into F# from only ever knowing Java Web stuff.
Here is my working solution (for insight purposes):
let lower = ['a' .. 'z']
let upper = ['A' .. 'Z']
let mutable concatedNames = ["John"; "James"; "Jack"] |> List.fold (+) ""
let greaterThanOne (length : int) = length > 1
let stripChars (str : string) letter =
let parts = str.Split([| letter |])
match greaterThanOne (Array.length parts) with
| true -> seq {
yield Array.head parts
yield string letter
yield! Array.tail parts
}
|> String.concat ""
| _ -> str
let killAllButFirstLower = lower |> List.iter (fun letter -> concatedNames <- (stripChars concatedNames letter))
let killAllButFirstUpper = upper |> List.iter ( fun letter -> concatedNames <- (stripChars concatedNames letter))
printfn "All names with duplicate letters removed: %s" concatedNames
I originally wanted to do this explicitly with functions alone and had a solution previous to above
let lower = ['a' .. 'z']
let upper = ['A' .. 'Z']
:
:
:
let lowerStripped = [""]
let killLowerDuplicates = lower |> List.iter (fun letter ->
match lowerStripped.Length with
| 1 ->
(stripChars concatedNames letter)::lowerStripped |> ignore
| _ -> (stripChars (List.head lowerStripped) letter)::lowerStripped |> ignore
)
let upperStripped = [List.head lowerStripped]
let killUpperDuplicates = lower |> List.iter ( fun letter -> (stripChars (List.head upperStripped) letter)::upperStripped |> ignore )
let strippedAll = List.head upperStripped
printfn "%s" strippedAll
But I couldn't get this working because I realized the consed lists weren't going anywhere (not to mention this is probably inefficient). The idea was that by doing it this way, once I parsed everything, the first element of the list would be the desired string.
I understand it may be strange asking a question I already have a solution to, but I feel like using mutable is just me not letting go of my Imperative habits (as I've read it should be rare to need to use it) and I want to more reinforce pure functional. So is there a better way to do this? Is the second solution a feasible route if I can somehow pipe the result somewhere?
You can use Seq.distinct to remove duplicates and retain ordering, so you just need to convert the list of strings to a single string, which can be done with String.concat "":
let distinctChars s = s |> String.concat ""
|> Seq.distinct
|> Array.ofSeq
|> System.String
If you run distinctChars ["John"; "James"; "Jack"], you will get back:
"Johnamesck"
This should do the trick:
let removeDuplicateCharacters strings =
// Treat each string as a seq<char>, flattening them into one big seq<char>
let chars = strings |> Seq.collect id // The id function (f(x) = x) is built in to F#
// We use it here because we want to collect the characters themselves
chars
|> Seq.mapi (fun i c -> i,c) // Get the index of each character in the overall sequence
|> Seq.choose (fun (i,c) ->
if i = (chars |> Seq.findIndex ((=) c)) // Is this character's index the same as the first occurence's index?
then Some c // If so, return (Some c) so that `choose` will include it,
else None) // Otherwise, return None so that `choose` will ignore it
|> Seq.toArray // Convert the seq<char> into a char []
|> System.String // Call the new String(char []) constructor with the choosen characters
Basically, we just treat the list of strings as one big sequence of characters, and choose the ones where the index in the overall sequence is the same as the index of the first occurrence of that character.
Running removeDuplicateCharacters ["John"; "James"; "Jack";] gives the expected output: "Johnamesck".

F# sort by indexes

Let's say I have two lists:
let listOfValues = [100..105] //can be list of strings or whatever
let indexesToSortBy = [1;2;0;4;5;3]
Now I need listOfValues_sorted: 102;100;101;105;103;104
It can be done with zip and "conversion" to Tuple:
let listOfValues_sorted = listOfValues
|> Seq.zip indexesToSortBy
|> Seq.sortBy( fun x-> fst x)
|> Seq.iter(fun c -> printfn "%i" (snd c))
But I guess, there is better solution for that?
I think your solution is pretty close. I would do this
let listOfValues_sorted =
listOfValues
|> Seq.zip indexesToSortBy
|> Seq.sortBy fst
|> Seq.toList
|> List.unzip
|> List.head
you can collapse fun x -> fst x into simply fst. And then unzip and get what ever list you want
If indexesToSortBy is a complete set of indexes you could simply use:
indexesToSortBy |> List.map (fun x -> listOfValues |> List.item x )
Your example sounds precisely what the List.permute function is for:
let listOfValues = [100..105]
let indexesToSortBy = [|1;2;0;4;5;3|] // Note 0-based indexes
listOfValues |> List.permute (fun i -> indexesToSortBy.[i])
// Result: [102; 100; 101; 105; 103; 104]
Two things: First, I made indexesToSortBy an array since I'll be looking up a value inside it N times, and doing that in a list would lead to O(N^2) run time. Second, List.permute expects to be handed a 0-based index into the original list, so I subtracted 1 from all the indexes in your original indexToSortBy list. With these two changes, this produces exactly the same ordering as the let listOfValues_sorted = ... example in your question.

key based functional fold

I have a map reduce code for which I group in each of the threads by some key and then in the reduce part merge the results. My current approach is to search for an specific key index in the accumulator and then mapi to retrieve the combined result only for this key, leaving the rest unmodified:
let rec groupFolder sequence acc =
match sequence with
| (by:string, what) :: rest ->
let index = acc |> Seq.tryFindIndex( fun (byInAcc, _) -> byInAcc.Equals(by) )
match index with
| Some (idx) ->
acc |> Seq.mapi( fun i (byInAcc, whatInAcc) -> if i = idx then (by, (what |> Array.append whatInAcc) ) else byInAcc, whatInAcc )
|> groupFolder rest
| None -> acc |> Seq.append( seq{ yield (by, what) } )
|> groupFolder rest
My question is, is it a more functional way to achieve just this?
As an example input to this reducer
let GroupsCommingFromMap = [| seq { yield! [|("key1", [|1;2;3|] ); ("key2", [|1;2;3|] ); ("key3", [|1;2;3|]) |] }, seq { yield! [|("key1", [|4;5;6|] ); ("key2", [|4;5;6|] ); ("key3", [|4;5;6|]) |] } |];;
GroupsCommingFromMap |> Seq.reduce( fun acc i ->
acc |> groupFolder (i |> Seq.toList))
the expected result should contain all key1..key3 each with the array 1..6
From the code you posted, it is not very clear what you're trying to do. Could you include some sample inputs (together with the output that you would like to get)? And does your code actually work on any of the inputs (it has incomplete pattern match, so I doubt that...)
Anyway, you can implement key-based map reduce using Seq.groupBy. For example:
let mapReduce mapper reducer input =
input
|> Seq.map mapper
|> Seq.groupBy fst
|> Seq.map (fun (k, vs) ->
k, vs |> Seq.map snd |> Seq.reduce reducer)
Here:
The mapper takes a value from the input sequence and turns it into key value pair. The mapReduce function then groups the values using the key
The reducer is then used to reduce all values associated with each key
This lets you create a word count function like this (using simple mapper that returns the word as the key with 1 as a value and reducer that just adds all the numbers):
"hello world hello people hello world".Split(' ')
|> mapReduce (fun w -> w, 1) (+)
EDIT: The example you mentioned does not really have "mapper" part, but instead it has array of arrays as an input - so perhaps it is easier to write this directly using Seq.groupBy like this:
let GroupsCommingFromMap =
[| [|("key1", [|1;2;3|] ); ("key2", [|1;2;3|] ); ("key3", [|1;2;3|]) |]
[|("key1", [|4;5;6|] ); ("key2", [|4;5;6|] ); ("key3", [|4;5;6|]) |] |]
GroupsCommingFromMap
|> Seq.concat
|> Seq.groupBy fst
|> Seq.map (fun (k, vs) -> k, vs |> Seq.map snd |> Array.concat)

Merging two lists in F#

I wrote this function which merges two lists together but as I'm fairly new to functional programming I was wondering whether there is a better (simpler) way to do it?
let a = ["a"; "b"; "c"]
let b = ["d"; "b"; "a"]
let merge a b =
// take all a and add b
List.fold (fun acc elem ->
let alreadyContains = acc |> List.exists (fun item -> item = elem)
if alreadyContains = true then
acc
else
elem :: acc |> List.rev
) b a
let test = merge a b
Expected result is: ["a"; "b"; "c"; "d"], I'm reverting the list in order to keep the original order. I thought I would be able to achieve the same using List.foldBack (and dropping List.rev) but it results in an error:
Type mismatch. Expecting a
'a
but given a
'a list
The resulting type would be infinite when unifying ''a' and ''a list'
Why is there a difference when using foldBack?
You could use something like the following
let merge a b =
a # b
|> Seq.distinct
|> List.ofSeq
Note that this will preserve order and remove any duplicates.
In F# 4.0 this will be simplified to
let merge a b = a # b |> List.distinct
If I wanted to write this in a way that is similar to your original version (using fold), then the main change I would do is to move List.rev outside of the function (you are calling List.rev every time you add a new element, which is wrong if you're adding even number of elements!)
So, a solution very similar to yours would be:
let merge a b =
(b, a)
||> List.fold (fun acc elem ->
let alreadyContains = acc |> List.exists (fun item -> item = elem)
if alreadyContains = true then acc
else elem :: acc)
|> List.rev
This uses the double-pipe operator ||> to pass two parameters to the fold function (this is not necessary, but I find it a bit nicer) and then passes the result to List.rev.

How do I do in F# what would be called compression in APL?

In APL one can use a bit vector to select out elements of another vector; this is called compression. For example 1 0 1/3 5 7 would yield 3 7.
Is there a accepted term for this in functional programming in general and F# in particular?
Here is my F# program:
let list1 = [|"Bob"; "Mary"; "Sue"|]
let list2 = [|1; 0; 1|]
[<EntryPoint>]
let main argv =
0 // return an integer exit code
What I would like to do is compute a new string[] which would be [|"Bob"; Sue"|]
How would one do this in F#?
Array.zip list1 list2 // [|("Bob",1); ("Mary",0); ("Sue",1)|]
|> Array.filter (fun (_,x) -> x = 1) // [|("Bob", 1); ("Sue", 1)|]
|> Array.map fst // [|"Bob"; "Sue"|]
The pipe operator |> does function application syntactically reversed, i.e., x |> f is equivalent to f x. As mentioned in another answer, replace Array with Seq to avoid the construction of intermediate arrays.
I expect you'll find many APL primitives missing from F#. For lists and sequences, many can be constructed by stringing together primitives from the Seq, Array, or List modules, like the above. For reference, here is an overview of the Seq module.
I think the easiest is to use an array sequence expression, something like this:
let compress bits values =
[|
for i = 0 to bits.Length - 1 do
if bits.[i] = 1 then
yield values.[i]
|]
If you only want to use combinators, this is what I would do:
Seq.zip bits values
|> Seq.choose (fun (bit, value) ->
if bit = 1 then Some value else None)
|> Array.ofSeq
I use Seq functions instead of Array in order to avoid building intermediary arrays, but it would be correct too.
One might say this is more idiomatic:
Seq.map2 (fun l1 l2 -> if l2 = 1 then Some(l1) else None) list1 list2
|> Seq.choose id
|> Seq.toArray
EDIT (for the pipe lovers)
(list1, list2)
||> Seq.map2 (fun l1 l2 -> if l2 = 1 then Some(l1) else None)
|> Seq.choose id
|> Seq.toArray
Søren Debois' solution is good but, as he pointed out, but we can do better. Let's define a function, based on Søren's code:
let compressArray vals idx =
Array.zip vals idx
|> Array.filter (fun (_, x) -> x = 1)
|> Array.map fst
compressArray ends up creating a new array in each of the 3 lines. This can take some time, if the input arrays are long (1.4 seconds for 10M values in my quick test).
We can save some time by working on sequences and creating an array at the end only:
let compressSeq vals idx =
Seq.zip vals idx
|> Seq.filter (fun (_, x) -> x = 1)
|> Seq.map fst
This function is generic and will work on arrays, lists, etc. To generate an array as output:
compressSeq sq idx |> Seq.toArray
The latter saves about 40% of computation time (0.8s in my test).
As ildjarn commented, the function argument to filter can be rewritten to snd >> (=) 1, although that causes a slight performance drop (< 10%), probably because of the extra function call that is generated.

Resources