Merging two lists in F# - f#

I wrote this function which merges two lists together but as I'm fairly new to functional programming I was wondering whether there is a better (simpler) way to do it?
let a = ["a"; "b"; "c"]
let b = ["d"; "b"; "a"]
let merge a b =
// take all a and add b
List.fold (fun acc elem ->
let alreadyContains = acc |> List.exists (fun item -> item = elem)
if alreadyContains = true then
acc
else
elem :: acc |> List.rev
) b a
let test = merge a b
Expected result is: ["a"; "b"; "c"; "d"], I'm reverting the list in order to keep the original order. I thought I would be able to achieve the same using List.foldBack (and dropping List.rev) but it results in an error:
Type mismatch. Expecting a
'a
but given a
'a list
The resulting type would be infinite when unifying ''a' and ''a list'
Why is there a difference when using foldBack?

You could use something like the following
let merge a b =
a # b
|> Seq.distinct
|> List.ofSeq
Note that this will preserve order and remove any duplicates.
In F# 4.0 this will be simplified to
let merge a b = a # b |> List.distinct

If I wanted to write this in a way that is similar to your original version (using fold), then the main change I would do is to move List.rev outside of the function (you are calling List.rev every time you add a new element, which is wrong if you're adding even number of elements!)
So, a solution very similar to yours would be:
let merge a b =
(b, a)
||> List.fold (fun acc elem ->
let alreadyContains = acc |> List.exists (fun item -> item = elem)
if alreadyContains = true then acc
else elem :: acc)
|> List.rev
This uses the double-pipe operator ||> to pass two parameters to the fold function (this is not necessary, but I find it a bit nicer) and then passes the result to List.rev.

Related

F# sort by indexes

Let's say I have two lists:
let listOfValues = [100..105] //can be list of strings or whatever
let indexesToSortBy = [1;2;0;4;5;3]
Now I need listOfValues_sorted: 102;100;101;105;103;104
It can be done with zip and "conversion" to Tuple:
let listOfValues_sorted = listOfValues
|> Seq.zip indexesToSortBy
|> Seq.sortBy( fun x-> fst x)
|> Seq.iter(fun c -> printfn "%i" (snd c))
But I guess, there is better solution for that?
I think your solution is pretty close. I would do this
let listOfValues_sorted =
listOfValues
|> Seq.zip indexesToSortBy
|> Seq.sortBy fst
|> Seq.toList
|> List.unzip
|> List.head
you can collapse fun x -> fst x into simply fst. And then unzip and get what ever list you want
If indexesToSortBy is a complete set of indexes you could simply use:
indexesToSortBy |> List.map (fun x -> listOfValues |> List.item x )
Your example sounds precisely what the List.permute function is for:
let listOfValues = [100..105]
let indexesToSortBy = [|1;2;0;4;5;3|] // Note 0-based indexes
listOfValues |> List.permute (fun i -> indexesToSortBy.[i])
// Result: [102; 100; 101; 105; 103; 104]
Two things: First, I made indexesToSortBy an array since I'll be looking up a value inside it N times, and doing that in a list would lead to O(N^2) run time. Second, List.permute expects to be handed a 0-based index into the original list, so I subtracted 1 from all the indexes in your original indexToSortBy list. With these two changes, this produces exactly the same ordering as the let listOfValues_sorted = ... example in your question.

GroupBy Year then take Pairwise diffs except for the head value then Flatten Using Deedle and F#

I have the following variable:
data:seq<(DateTime*float)>
and I want to do something like the following F# code but using Deedle:
data
|> Seq.groupBy (fun (k,v) -> k.Year)
|> Seq.map (fun (k,v) ->
let vals = v |> Seq.pairwise
let first = seq { yield v |> Seq.head }
let diffs = vals |> Seq.map (fun ((t0,v0),(t1,v1)) -> (t1, v1 - v0))
(k, diffs |> Seq.append first))
|> Seq.collect snd
This works fine using F# sequences but I want to do it using Deedle series. I know I can do something like:
(data:Series<DateTime*float>) |> Series.groupBy (fun k v -> k.Year)...
But then I need to take the within group year diffs except for the head value which should just be the value itself and then flatten the results into on series...I am bit confused with the deedle syntax
Thanks!
I think the following might be doing what you need:
ts
|> Series.groupInto
(fun k _ -> k.Month)
(fun m s ->
let first = series [ fst s.KeyRange => s.[fst s.KeyRange]]
Series.merge first (Series.diff 1 s))
|> Series.values
|> Series.mergeAll
The groupInto function lets you specify a function that should be called on each of the groups
For each group, we create series with the differences using Series.diff and append a series with the first value at the beginning using Series.merge.
At the end, we get all the nested series & flatten them using Series.mergeAll.

key based functional fold

I have a map reduce code for which I group in each of the threads by some key and then in the reduce part merge the results. My current approach is to search for an specific key index in the accumulator and then mapi to retrieve the combined result only for this key, leaving the rest unmodified:
let rec groupFolder sequence acc =
match sequence with
| (by:string, what) :: rest ->
let index = acc |> Seq.tryFindIndex( fun (byInAcc, _) -> byInAcc.Equals(by) )
match index with
| Some (idx) ->
acc |> Seq.mapi( fun i (byInAcc, whatInAcc) -> if i = idx then (by, (what |> Array.append whatInAcc) ) else byInAcc, whatInAcc )
|> groupFolder rest
| None -> acc |> Seq.append( seq{ yield (by, what) } )
|> groupFolder rest
My question is, is it a more functional way to achieve just this?
As an example input to this reducer
let GroupsCommingFromMap = [| seq { yield! [|("key1", [|1;2;3|] ); ("key2", [|1;2;3|] ); ("key3", [|1;2;3|]) |] }, seq { yield! [|("key1", [|4;5;6|] ); ("key2", [|4;5;6|] ); ("key3", [|4;5;6|]) |] } |];;
GroupsCommingFromMap |> Seq.reduce( fun acc i ->
acc |> groupFolder (i |> Seq.toList))
the expected result should contain all key1..key3 each with the array 1..6
From the code you posted, it is not very clear what you're trying to do. Could you include some sample inputs (together with the output that you would like to get)? And does your code actually work on any of the inputs (it has incomplete pattern match, so I doubt that...)
Anyway, you can implement key-based map reduce using Seq.groupBy. For example:
let mapReduce mapper reducer input =
input
|> Seq.map mapper
|> Seq.groupBy fst
|> Seq.map (fun (k, vs) ->
k, vs |> Seq.map snd |> Seq.reduce reducer)
Here:
The mapper takes a value from the input sequence and turns it into key value pair. The mapReduce function then groups the values using the key
The reducer is then used to reduce all values associated with each key
This lets you create a word count function like this (using simple mapper that returns the word as the key with 1 as a value and reducer that just adds all the numbers):
"hello world hello people hello world".Split(' ')
|> mapReduce (fun w -> w, 1) (+)
EDIT: The example you mentioned does not really have "mapper" part, but instead it has array of arrays as an input - so perhaps it is easier to write this directly using Seq.groupBy like this:
let GroupsCommingFromMap =
[| [|("key1", [|1;2;3|] ); ("key2", [|1;2;3|] ); ("key3", [|1;2;3|]) |]
[|("key1", [|4;5;6|] ); ("key2", [|4;5;6|] ); ("key3", [|4;5;6|]) |] |]
GroupsCommingFromMap
|> Seq.concat
|> Seq.groupBy fst
|> Seq.map (fun (k, vs) -> k, vs |> Seq.map snd |> Array.concat)

fold or choose till None?

Is there already a way to do something like a chooseTill or a foldTill, where it will process until a None option is received? Really, any of the higher order functions with a "till" option. Granted, it makes no sense for stuff like map, but I find I need this kind of thing pretty often and I wanted to make sure I wasn't reinventing the wheel.
In general, it'd be pretty easy to write something like this, but I'm curious if there is already a way to do this, or if this exists in some known library?
let chooseTill predicate (sequence:seq<'a>) =
seq {
let finished = ref false
for elem in sequence do
if not !finished then
match predicate elem with
| Some(x) -> yield x
| None -> finished := true
}
let foldTill predicate seed list =
let rec foldTill' acc = function
| [] -> acc
| (h::t) -> match predicate acc h with
| Some(x) -> foldTill' x t
| None -> acc
foldTill' seed list
let (++) a b = a.ToString() + b.ToString()
let abcdef = foldTill (fun acc v ->
if Char.IsWhiteSpace v then None
else Some(acc ++ v)) "" ("abcdef ghi" |> Seq.toList)
// result is "abcdef"
I think you can get that easily by combining Seq.scan and Seq.takeWhile:
open System
"abcdef ghi"
|> Seq.scan (fun (_, state) c -> c, (string c) + state) ('x', "")
|> Seq.takeWhile (fst >> Char.IsWhiteSpace >> not)
|> Seq.last |> snd
The idea is that Seq.scan is doing something like Seq.fold, but instead of waiting for the final result, it yields the intermediate states as it goes. You can then keep taking the intermediate states until you reach the end. In the above example, the state is the current character and the concatenated string (so that we can check if the character was whitespace).
A more general version based on a function that returns option could look like this:
let foldWhile f initial input =
// Generate sequence of all intermediate states
input |> Seq.scan (fun stateOpt inp ->
// If the current state is not 'None', then calculate a new one
// if 'f' returns 'None' then the overall result will be 'None'
stateOpt |> Option.bind (fun state -> f state inp)) (Some initial)
// Take only 'Some' states and get the last one
|> Seq.takeWhile Option.isSome
|> Seq.last |> Option.get

F# Pipelines access data from pipeline stages above

I have written a function like this
let GetAllDirectAssignmentsforLists (spWeb : SPWeb) =
spWeb.Lists
|> Seq.cast<SPList>
|> Seq.filter(fun l -> l.HasUniqueRoleAssignments)
|> Seq.collect (fun l -> l.RoleAssignments
|> Seq.cast<SPRoleAssignment>
|> Seq.map(fun ra -> ra.Member)
)
|> Seq.filter (fun p -> p.GetType().Name = "SPUser")
|> Seq.map(fun m -> m.LoginName.ToLower())
I want to return a tuple which contains the list name (taken from l.Title) in the send pipe and the m.LoginName.ToLower().
Is there a cleanway for me to get something from the above pipe elements?
One way ofcourse would be to tuple the return value in the 2nd stage of the pipe and then pass the Title all the way down.... but that would pollute the code all subsequent stages will then have to accept and return tuple values just for the sake of the last stage to get the value.
I wonder if there is a clean and easy way....
Also, in stage 4 of the pipeline (fun p -> p.GetType().Name = "SPUser") could i use if here to compare the types? rather than convert the typename to string and then match strings?
We exploit the fact that Seq.filter and Seq.map can be pushed inside Seq.collect without changing the results. In this case, l is still available to access.
And the last filter function is more idiomatic to use with type test operator :?.
let GetAllDirectAssignmentsforLists(spWeb: SPWeb) =
spWeb.Lists
|> Seq.cast<SPList>
|> Seq.filter (fun l -> l.HasUniqueRoleAssignments)
|> Seq.collect (fun l -> l.RoleAssignments
|> Seq.cast<SPRoleAssignment>
|> Seq.map (fun ra -> ra.Member)
|> Seq.filter (fun p -> match box p with
| :? SPUser -> true
| _ -> false)
|> Seq.map (fun m -> l.Title, m.LoginName.ToLower()))
To simplify further, you could change the series of Seq.map and Seq.filter to Seq.choose:
Seq.choose (fun ra -> match box ra.Member with
| :? SPUser -> Some (l.Title, ra.Member.LoginName.ToLower())
| _ -> None)
While you can solve the problem by lifting the rest of the computation inside collect, I think that you could make the code more readable by using sequence expressions instead of pipelining.
I could not run the code to test it, but this should be equivalent:
let GetAllDirectAssignmentsforLists (spWeb : SPWeb) = seq {
// Corresponds to your 'filter' and 'collect'
for l in Seq.cast<SPList> spWeb.Lists do
if l.HasUniqueRoleAssignments then
// Corresponds to nested 'map' and 'filter'
for ra in Seq.cast<SPRoleAssignment> l.RoleAssignments do
let m = ra.Member
if m.GetType().Name = "SPUser" then
// This implements the last 'map' operation
yield l.Title, m.LoginName.ToLower() }
The code above corresponds more closely to the version by #pad than to your original code, because the rest of the computation is nested under for (which corresponds to nesting under collect) and so you can see all variables that are already in scope - like l which you need.
The nice thing about sequence expressions is that you can use F# constructs like if (instead of filter), for (instead of collect) etc. Also, I think it is more suitable for writing nested operations (which you need here to keep variables in scope), because it remains quite readable and keeps familiar code structure.

Resources