How to order a LIST in F#

Total F# n00b question.
How do I sort a LIST data structure?
Edit: Sorry, my data structure is actually a LIST.
Maybe I should add my code, since just using ".sort" hasn't worked:
let getDataFromDb (db: MyDB) Id =
    Query.query <@ seq {
        big honking database/FLinq query
        yield (sec, pm, sr, trade, tradeRec, i, pm_firm, files, lt)
    } @> |> List.ofSeq
When I change the last line of code to this:
    } @> |> List.ofSeq.sortBy fst
I get the following:
Error 1 The field, constructor or member 'sortBy' is not defined
Ugh, what a pain. I'm trying this now:
|> List.ofSeq |> List.sortBy
But I'm getting this:
Error 1 Type mismatch. Expecting a (Security * RoleContributor * RoleContributor * SuggestedTrade * SuggestedTradeRecommendation * Idea * RoleContributor * SupportingUploadedFile * LargeText) list -> 'a but given a ('b -> 'c) -> 'b list -> 'b list The type '(Security * RoleContributor * RoleContributor * SuggestedTrade * SuggestedTradeRecommendation * Idea * RoleContributor * SupportingUploadedFile * LargeText) list' does not match the type ''a -> 'b'

Seq.sortBy would do that.
However sorting implies you know the key values of the full sequence at the time of sorting, so by definition you cannot use this on infinite sequences.
Edit:
The equivalent for lists has the same name:
List.sortBy
MSDN example:
let sortedList2 = List.sortBy (fun elem -> abs elem) [1; 4; 8; -2; 5]
printfn "%A" sortedList2
Edit 2:
From your new example it seems like you have a list of tuples. Now it depends on which item in the tuple you want to sort by.
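To illustrate picking the sort key out of a tuple, here is a minimal sketch. The three-element tuples and their values are made up for illustration and stand in for the nine-element tuples in the question. Note that fst is only defined for pairs, so for larger tuples the key function has to pattern-match the tuple itself.

```fsharp
// Hypothetical sample data: (ticker, quantity, price)
let trades = [ ("MSFT", 10, 25.0); ("AAPL", 5, 190.5); ("IBM", 7, 120.0) ]

// Sort by the first item: pattern-match the tuple in the key function
let byTicker = trades |> List.sortBy (fun (ticker, _, _) -> ticker)

// Sort by the second item instead
let byQuantity = trades |> List.sortBy (fun (_, qty, _) -> qty)

printfn "%A" byTicker
printfn "%A" byQuantity
```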

As others said, Seq.sortBy is the way to go. If you're using FLinq to read data from a database, it is a good idea to include the sorting in the database query (enclosed in <@ .. @>) so that the sorting is done on the SQL server:
let getDataFromDb (db: MyDB) Id =
    <@ seq { big honking database/FLinq query
             yield (sec, pm, sr, trade, tradeRec, i, pm_firm, files, lt) }
       |> Seq.sortBy (fun (_, _, _, _, _, i, _, _, _) -> i) @>
    |> List.ofSeq
To make this a little nicer, you could return a tuple containing the key and all other elements as a nested tuple, e.g. key, (sec, pm, ..., lt), and then just sort by the first element:
|> Seq.sortBy (fun (k, _) -> k)
(I had some trouble using tuples with LINQ to Entities, but I believe that it should work in LINQ to SQL.)

Use:
Query.query <@ ... @>
|> List.sortBy (fun (sec, _, _, _, _, _, _, _, _) -> sec)
Note that using tuples with that many elements is really bad style in F#. Use something more structured like a record type to give names to the fields and avoid confusion.
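As a rough sketch of the record suggestion: the type below and its field names are hypothetical (the question does not show the real field types), but it shows how a named field replaces counting underscores in the key function.

```fsharp
// Hypothetical record replacing the anonymous tuple; field names are illustrative
type TradeRow =
    { Security: string
      Trade: string
      IdeaId: int }

let rows =
    [ { Security = "MSFT"; Trade = "buy"; IdeaId = 3 }
      { Security = "AAPL"; Trade = "sell"; IdeaId = 1 } ]

// The key function now names the field it sorts by
let bySecurity = rows |> List.sortBy (fun r -> r.Security)
let byIdea = rows |> List.sortBy (fun r -> r.IdeaId)

printfn "%A" bySecurity
```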

Related

F# Write Mapped Input to Output

I am new to F# and am starting with a simple project to get going.
I have large txt files that I process - usually about 10 million records. What I want to do is read the file, filter out some specific rows, map the fields to only take a subset of the columns from the original file, and then output the result.
The 2 questions I have are:
How do I filter based on the map. The file has about 30 fields.
How do I take the output of the map and write it to a new TXT file
//Open the file
let lines = seq { use r = new StreamReader(kDir + kfName)
                  while not r.EndOfStream do yield r.ReadLine() }
//Filter the file
let sFilt = "Detached Houses,Upper Middle"
let out1 = lines
           |> Seq.filter (fun x -> x.Contains(sFilt))
//Write out the filtered file - this works great
//val out1 : seq<string>
File.WriteAllLines("c:\\temp\\out1.txt", out1 )
//Here is where I have an issue
//I am trying to just get 2 of the columns to an output file
//val out2 : seq<string * string> - this has a different pattern than out1
let out2 = out1 |> Seq.map (fun x2 -> x2.Split[|','|])
                |> Seq.map (fun x3 -> x3.[0], x3.[3])
I get the following error when I try to write out2 with File.WriteAllLines. I know that out1 and out2 have different types. How can I resolve this difference?
Error message:
Possible overload: 'File.WriteAllLines(path: string, contents: IEnumerable<string>) : unit'. Type constraint mismatch. The type seq<string * string> is not compatible with type IEnumerable<string>
The type 'string' does not match the type 'string * string'.
What you can do is map back to a seq<string> from your seq<string*string>.
Seq.map (fun (str1, str2) -> sprintf "%s, %s" str1 str2)
You can just add that to your existing chain of map operations
let out2 =
    out1
    |> Seq.map (fun x2 -> x2.Split[|','|])
    |> Seq.map (fun x3 -> x3.[0], x3.[3])
    |> Seq.map (fun (str1, str2) -> sprintf "%s, %s" str1 str2)
Then, once again, you have a sequence of strings which you can write to your file.
fun x3 -> x3.[0], x3.[3] creates a tuple of strings string * string. You need to concat them, e.g. fun x3 -> sprintf "%s,%s" x3.[0] x3.[3] (if you want the comma in the output) or just fun x3 -> x3.[0] + x3.[3].
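Putting the pieces together, a minimal end-to-end sketch might look like the following. The function names projectColumns, transform and run are made up for illustration, the paths are supplied by the caller, and File.ReadLines is swapped in for the hand-written StreamReader loop because it also streams the file lazily.

```fsharp
open System.IO

// Keep columns 0 and 3 of a comma-separated line and join them back into one string
let projectColumns (line: string) =
    let fields = line.Split(',')
    sprintf "%s,%s" fields.[0] fields.[3]

// Filter, project, and return plain strings ready for File.WriteAllLines
let transform (filter: string) (lines: seq<string>) =
    lines
    |> Seq.filter (fun line -> line.Contains(filter))
    |> Seq.map projectColumns

// Hypothetical driver: reads inputPath lazily and writes the projected rows
let run (inputPath: string) (outputPath: string) =
    let rows = File.ReadLines(inputPath) |> transform "Detached Houses,Upper Middle"
    File.WriteAllLines(outputPath, rows)
```

Because the last map produces a seq<string> again, the result matches the IEnumerable<string> overload that already worked for out1.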
It's also possible that you want to use the CsvProvider if the file is properly structured. In that case there is no reason to handle any of the IO yourself.
Then you get typed data, column names etc "for free"...
If it is not entirely well structured you might also use CsvParser for some less strictness in reading/handling the file.
Take a look at:
https://fsharp.github.io/FSharp.Data/library/CsvProvider.html
or
https://fsharp.github.io/FSharp.Data/library/CsvFile.html

Define sum of square without defining parameter

I want to define sumOfSquares without explicitly using a parameter, relying instead on function composition.
Here's my code below
let sumOfSquares = Seq.map (fun n -> n * n) >> Seq.sum
However, I got the following error
stdin(80,5): error FS0030: Value restriction. The value 'sumOfSquares'
has been inferred to have generic type
val sumOfSquares : ('_a -> int) when '_a :> seq<int>
Either make the arguments to 'sumOfSquares' explicit or, if you do not intend for
it to be generic, add a type annotation.
One way to resolve it is by using parameters
let sumOfSquares nums = nums |> Seq.map (fun n -> n * n) |> Seq.sum
and this will work. However, I want to see if I can define sum of squares by using composition alone
Update
Here's a nice article describing the issue I've encountered: Value Restriction.
Make a type annotation:
let sumOfSquares : seq<int> -> int =
    Seq.map (fun n -> n * n) >> Seq.sum
So let's see what happens when type inference tries to work here. First you have
Seq.map (fun n -> n * n) >> Seq.sum
Now, as the Seq functions accept anything that implements seq<int>, we can pass in an int list, an int[], or many others.
As a result, you get this as the type
val sumOfSquares : ('_a -> int) when '_a :> seq<int>
Now the problem is that sumOfSquares is a value (which happens to be a function). Unfortunately, you can't have a generic value in a top-level binding. You can, though, have a generic function if you make the arguments explicit.
As a result, one alternative to a type annotation is to make the argument explicit like so
let sumOfSquares s = s |> Seq.map (fun n -> n * n) |> Seq.sum
And this works.
Searching SO for "value restriction errors" should give some more examples of this problem.
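For reference, here is a small sketch showing both fixes side by side. The primed name sumOfSquares' is only there so that both definitions can live in one file.

```fsharp
// Option 1: point-free, with a type annotation that pins the input type
let sumOfSquares : seq<int> -> int =
    Seq.map (fun n -> n * n) >> Seq.sum

// Option 2: explicit parameter, no annotation needed
let sumOfSquares' nums = nums |> Seq.map (fun n -> n * n) |> Seq.sum

printfn "%d" (sumOfSquares (seq { 1 .. 3 }))   // 1 + 4 + 9 = 14
printfn "%d" (sumOfSquares' [1; 2; 3])
```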

F# Pipelines access data from pipeline stages above

I have written a function like this
let GetAllDirectAssignmentsforLists (spWeb : SPWeb) =
    spWeb.Lists
    |> Seq.cast<SPList>
    |> Seq.filter(fun l -> l.HasUniqueRoleAssignments)
    |> Seq.collect (fun l -> l.RoleAssignments
                             |> Seq.cast<SPRoleAssignment>
                             |> Seq.map(fun ra -> ra.Member)
                   )
    |> Seq.filter (fun p -> p.GetType().Name = "SPUser")
    |> Seq.map(fun m -> m.LoginName.ToLower())
I want to return a tuple that contains the list name (taken from l.Title in the second stage of the pipe) and m.LoginName.ToLower().
Is there a clean way for me to get at something from the earlier pipe stages?
One way, of course, would be to tuple the return value in the second stage of the pipe and then pass the Title all the way down, but that would pollute the code: all subsequent stages would then have to accept and return tuples just so that the last stage can get the value.
I wonder if there is a clean and easy way.
Also, in stage 4 of the pipeline (fun p -> p.GetType().Name = "SPUser"), could I compare the types directly, rather than converting the type name to a string and then matching strings?
We exploit the fact that Seq.filter and Seq.map can be pushed inside Seq.collect without changing the results. In this case, l is still available to access.
The last filter is also more idiomatic when written with the type test operator :?.
let GetAllDirectAssignmentsforLists(spWeb: SPWeb) =
    spWeb.Lists
    |> Seq.cast<SPList>
    |> Seq.filter (fun l -> l.HasUniqueRoleAssignments)
    |> Seq.collect (fun l -> l.RoleAssignments
                             |> Seq.cast<SPRoleAssignment>
                             |> Seq.map (fun ra -> ra.Member)
                             |> Seq.filter (fun p -> match box p with
                                                     | :? SPUser -> true
                                                     | _ -> false)
                             |> Seq.map (fun m -> l.Title, m.LoginName.ToLower()))
To simplify further, you could change the series of Seq.map and Seq.filter to Seq.choose:
Seq.choose (fun ra -> match box ra.Member with
                      | :? SPUser -> Some (l.Title, ra.Member.LoginName.ToLower())
                      | _ -> None)
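Since the SharePoint assemblies are not always at hand, here is a self-contained sketch of the same shape. The types Principal, User, Group and SiteList are hypothetical stand-ins for SPPrincipal, SPUser, SPGroup and SPList, so treat this as an illustration of the nesting and not as SharePoint code.

```fsharp
// Hypothetical stand-ins for the SharePoint types
type Principal(loginName: string) =
    member this.LoginName = loginName

type User(loginName: string) =
    inherit Principal(loginName)

type Group(loginName: string) =
    inherit Principal(loginName)

type SiteList = { Title: string; Members: Principal list }

let directAssignments (lists: seq<SiteList>) =
    lists
    |> Seq.collect (fun l ->
        l.Members
        // 'l' is still in scope here, so the title can be paired with each user
        |> Seq.choose (fun p ->
            match p with
            | :? User as u -> Some (l.Title, u.LoginName.ToLower())
            | _ -> None))

let sample =
    [ { Title = "Docs"
        Members = [ User("ALICE") :> Principal; Group("Owners") :> Principal ] } ]

printfn "%A" (directAssignments sample |> List.ofSeq)
```

The type test with "as u" also gives access to the value at its narrower type, which avoids a second lookup of the member.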
While you can solve the problem by lifting the rest of the computation inside collect, I think that you could make the code more readable by using sequence expressions instead of pipelining.
I could not run the code to test it, but this should be equivalent:
let GetAllDirectAssignmentsforLists (spWeb : SPWeb) = seq {
    // Corresponds to your 'filter' and 'collect'
    for l in Seq.cast<SPList> spWeb.Lists do
        if l.HasUniqueRoleAssignments then
            // Corresponds to nested 'map' and 'filter'
            for ra in Seq.cast<SPRoleAssignment> l.RoleAssignments do
                let m = ra.Member
                if m.GetType().Name = "SPUser" then
                    // This implements the last 'map' operation
                    yield l.Title, m.LoginName.ToLower() }
The code above corresponds more closely to the version by @pad than to your original code, because the rest of the computation is nested under for (which corresponds to nesting under collect) and so you can see all variables that are already in scope - like l which you need.
The nice thing about sequence expressions is that you can use F# constructs like if (instead of filter), for (instead of collect) etc. Also, I think it is more suitable for writing nested operations (which you need here to keep variables in scope), because it remains quite readable and keeps familiar code structure.

Return value in F# - incomplete construct

I'm trying to learn F#. I'm a complete beginner, so this might be a walkover for you guys :)
I have the following function:
let removeEven l =
    let n = List.length l;
    let list_ = [];
    let seq_ = seq { for x in 1..n do if x % 2 <> 0 then yield List.nth l (x-1)}
    for x in seq_ do
        let list_ = list_ @ [x];
    list_;
It takes a list and returns a new list containing the elements that sit at odd (1-based) positions in the original list, so removeEven [x1;x2;x3] = [x1;x3]
However, I get my already favourite error-message: Incomplete construct at or before this point in expression...
If I add a print to the end of the line, instead of list_:
...
print_any list_;
the problem is fixed. But I do not want to print the list, I want to return it!
What causes this? Why can't I return my list?
To answer your question first, the compiler complains because there is a problem inside the for loop. In F#, let serves to declare values (that are immutable and cannot be changed later in the program). It isn't a statement as in C# - let can be only used as part of another expression. For example:
let n = 10
n + n
Actually means that you want the n symbol to refer to the value 10 in the expression n + n. The problem with your code is that you're using let without any expression (probably because you want to use mutable variables):
for x in seq_ do
    let list_ = list_ @ [x] // This isn't assignment!
list_
The problematic line is an incomplete expression - using let in this way isn't allowed, because it isn't followed by any expression (the list_ value will not be accessed from any code). You can use a mutable variable to correct your code:
let mutable list_ = [] // declared as 'mutable'
let seq_ = seq { for x in 1..n do if x % 2 <> 0 then yield List.nth l (x-1)}
for x in seq_ do
    list_ <- list_ @ [x] // assignment using '<-'
Now, this should work, but it isn't really functional, because you're using imperative mutation. Moreover, appending elements using @ is a really inefficient thing to do in functional languages. So, if you want to make your code functional, you'll probably need to use a different approach. Both of the other answers show a great approach, although I prefer the example by Joel, because indexing into a list (in the solution by Chaos) also isn't very functional (there is no pointer arithmetic, so it will also be slower).
Probably the most classical functional solution would be to use the List.fold function, which aggregates all elements of the list into a single result, walking from the left to the right:
[1;2;3;4;5]
|> List.fold (fun (flag, res) el ->
    if flag then (not flag, el::res) else (not flag, res)) (true, [])
|> snd |> List.rev
Here, the state used during the aggregation is a tuple. Its first element is a Boolean flag specifying whether to include the next element (during each step, we flip the flag by returning not flag). The second element is the list aggregated so far (we add an element using el::res only when the flag is set). After fold returns, we use snd to get the second element of the tuple (the aggregated list) and reverse it using List.rev, because it was collected in reversed order (this is more efficient than appending to the end using res@[el]).
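Wrapped up as a function, the fold could look like this. It is only a sketch: the parameter name items and the flag name keep are chosen here for readability and are not part of the original answer.

```fsharp
let removeEven items =
    items
    |> List.fold (fun (keep, acc) el ->
        // Flip the flag on every step; prepend only when it is set
        if keep then (false, el :: acc) else (true, acc)) (true, [])
    |> snd
    |> List.rev

printfn "%A" (removeEven [1; 2; 3; 4; 5])   // [1; 3; 5]
```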
Edit: If I understand your requirements correctly, here's a version of your function done functional rather than imperative style, that removes elements with odd indexes.
let removeEven list =
    list
    |> Seq.mapi (fun i x -> (i, x))
    |> Seq.filter (fun (i, x) -> i % 2 = 0)
    |> Seq.map snd
    |> List.ofSeq
> removeEven ['a'; 'b'; 'c'; 'd'];;
val it : char list = ['a'; 'c']
I think this is what you are looking for.
let removeEven list =
    let maxIndex = (List.length list) - 1
    seq { for i in 0..2..maxIndex -> list.[i] }
    |> Seq.toList
Tests
val removeEven : 'a list -> 'a list
> removeEven [1;2;3;4;5;6];;
val it : int list = [1; 3; 5]
> removeEven [1;2;3;4;5];;
val it : int list = [1; 3; 5]
> removeEven [1;2;3;4];;
val it : int list = [1; 3]
> removeEven [1;2;3];;
val it : int list = [1; 3]
> removeEven [1;2];;
val it : int list = [1]
> removeEven [1];;
val it : int list = [1]
You can try a pattern-matching approach. I haven't used F# in a while and I can't test things right now, but it would be something like this:
let rec curse sofar ls =
    match ls with
    | even :: odd :: tl -> curse (even :: sofar) tl
    | even :: [] -> curse (even :: sofar) []
    | [] -> List.rev sofar
curse [] [ 1; 2; 3; 4; 5 ]
This recursively picks off the even elements. I think. I would probably use Joel Mueller's approach though. I don't remember if there is an index-based filter function, but that would probably be the ideal to use, or to make if it doesn't exist in the libraries.
But in general lists aren't really meant as index-type things. That's what arrays are for. If you consider what kind of algorithm would require a list having its even elements removed, maybe it's possible that in the steps prior to this requirement, the elements can be paired up in tuples, like this:
[ (1,2); (3,4) ]
That would make it trivial to get the even-"indexed" elements out:
thelist |> List.map fst // take first element from each tuple
There's a variety of options if the input list isn't guaranteed to have an even number of elements.
Yet another alternative, which (by my reckoning) is slightly slower than Joel's, but it's shorter :)
let removeEven list =
    list
    |> Seq.mapi (fun i x -> (i, x))
    |> Seq.choose (fun (i,x) -> if i % 2 = 0 then Some(x) else None)
    |> List.ofSeq

F# Split list into sublists based on comparison of adjacent elements

I've found this question on hubFS, but that handles a splitting criteria based on individual elements. I'd like to split based on a comparison of adjacent elements, so the type would look like this:
val split : ('T -> 'T -> bool) -> 'T list -> 'T list list
Currently, I am trying to start from Don's imperative solution, but I can't work out how to initialize and use a 'prev' value for comparison. Is fold a better way to go?
//Don's solution for single criteria, copied from hubFS
let SequencesStartingWith n (s:seq<_>) =
    seq { use ie = s.GetEnumerator()
          let acc = new ResizeArray<_>()
          while ie.MoveNext() do
              let x = ie.Current
              if x = n && acc.Count > 0 then
                  yield ResizeArray.to_list acc
                  acc.Clear()
              acc.Add x
          if acc.Count > 0 then
              yield ResizeArray.to_list acc }
This is an interesting problem! I needed to implement exactly this in C# just recently for my article about grouping (because the type signature of the function is pretty similar to groupBy, so it can be used in LINQ query as the group by clause). The C# implementation was quite ugly though.
Anyway, there must be a way to express this function using some simple primitives. It just seems that the F# library doesn't provide any functions that fit for this purpose. I was able to come up with two functions that seem to be generally useful and can be combined together to solve this problem, so here they are:
// Splits a list into two lists using the specified function
// The list is split between two elements for which 'f' returns 'true'
let splitAt f list =
    let rec splitAtAux acc list =
        match list with
        | x::y::ys when f x y -> List.rev (x::acc), y::ys
        | x::xs -> splitAtAux (x::acc) xs
        | [] -> (List.rev acc), []
    splitAtAux [] list
val splitAt : ('a -> 'a -> bool) -> 'a list -> 'a list * 'a list
This is similar to what we want to achieve, but it splits the list only in two pieces (which is a simpler case than splitting the list multiple times). Then we'll need to repeat this operation, which can be done using this function:
// Repeatedly uses 'f' to take several elements of the input list and
// aggregate them into value of type 'b until the remaining list
// (second value returned by 'f') is empty
let foldUntilEmpty f list =
    let rec foldUntilEmptyAux acc list =
        match f list with
        | l, [] -> l::acc |> List.rev
        | l, rest -> foldUntilEmptyAux (l::acc) rest
    foldUntilEmptyAux [] list
val foldUntilEmpty : ('a list -> 'b * 'a list) -> 'a list -> 'b list
Now we can repeatedly apply splitAt (with some predicate specified as the first argument) on the input list using foldUntilEmpty, which gives us the function we wanted:
let splitAtEvery f list = foldUntilEmpty (splitAt f) list
splitAtEvery (<>) [ 1; 1; 1; 2; 2; 3; 3; 3; 3 ];;
val it : int list list = [[1; 1; 1]; [2; 2]; [3; 3; 3; 3]]
I think that the last step is really nice :-). The first two functions are quite straightforward and may be useful for other things, although they are not as general as functions from the F# core library.
How about:
let splitOn test lst =
    List.foldBack (fun el lst ->
            match lst with
            | [] -> [[el]]
            | (x::xs)::ys when not (test el x) -> (el::(x::xs))::ys
            | _ -> [el]::lst
        ) lst []
The foldBack removes the need to reverse the list.
Having thought about this a bit further, I've come up with this solution. I'm not sure that it's very readable (except for me who wrote it).
UPDATE Building on the better matching example in Tomas's answer, here's an improved version which removes the 'code smell' (see edits for previous version), and is slightly more readable (says me).
It still breaks on this (splitOn (<>) []), because of the dreaded value restriction error, but I think that might be inevitable.
(EDIT: Corrected bug spotted by Johan Kullbom, now works correctly for [1;1;2;3]. The problem was eating two elements directly in the first match, this meant I missed a comparison/check.)
//Function for splitting list into list of lists based on comparison of adjacent elements
let splitOn test lst =
    let rec loop lst inner outer = //inner=current sublist, outer=list of sublists
        match lst with
        | x::y::ys when test x y -> loop (y::ys) [] (List.rev (x::inner) :: outer)
        | x::xs -> loop xs (x::inner) outer
        | _ -> List.rev ((List.rev inner) :: outer)
    loop lst [] []
splitOn (fun a b -> b - a > 1) [1]
> val it : [[1]]
splitOn (fun a b -> b - a > 1) [1;3]
> val it : [[1]; [3]]
splitOn (fun a b -> b - a > 1) [1;2;3;4;6;7;8;9;11;12;13;14;15;16;18;19;21]
> val it : [[1; 2; 3; 4]; [6; 7; 8; 9]; [11; 12; 13; 14; 15; 16]; [18; 19]; [21]]
Any thoughts on this, or the partial solution in my question?
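On the value restriction with an empty input: one possible workaround, sketched below, is to annotate the result (or the argument) so that the element type is fixed. The binding name empty is made up for illustration. Note that with this particular implementation the empty input yields a list holding one empty sublist.

```fsharp
let splitOn test lst =
    let rec loop lst inner outer =
        match lst with
        | x :: y :: ys when test x y -> loop (y :: ys) [] (List.rev (x :: inner) :: outer)
        | x :: xs -> loop xs (x :: inner) outer
        | _ -> List.rev ((List.rev inner) :: outer)
    loop lst [] []

// Annotating the result fixes the element type, so the empty list type-checks
let empty : int list list = splitOn (<>) []

printfn "%A" empty                                    // [[]]
printfn "%A" (splitOn (fun a b -> b - a > 1) [1; 2; 4])
```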
"adjacent" immediately makes me think of Seq.pairwise.
let splitAt pred xs =
    if Seq.isEmpty xs then
        []
    else
        xs
        |> Seq.pairwise
        |> Seq.fold (fun (curr :: rest as lists) (i, j) -> if pred i j then [j] :: lists else (j :: curr) :: rest) [[Seq.head xs]]
        |> List.rev
        |> List.map List.rev
Example:
[1;1;2;3;3;3;2;1;2;2]
|> splitAt (>)
Gives:
[[1; 1; 2; 3; 3; 3]; [2]; [1; 2; 2]]
I would prefer using List.fold over explicit recursion.
let splitOn pred = function
    | [] -> []
    | hd :: tl ->
        let (outer, inner, _) =
            List.fold (fun (outer, inner, prev) curr ->
                            if pred prev curr
                            then (List.rev inner) :: outer, [curr], curr
                            else outer, curr :: inner, curr)
                      ([], [hd], hd)
                      tl
        List.rev ((List.rev inner) :: outer)
I like the answers provided by @Joh and @Johan, as these solutions seem to be the most idiomatic and straightforward. I also like the idea suggested by @Shooton. However, each solution has its own drawbacks.
I was trying to avoid:
Reversing lists
Unsplitting and joining back the temporary results
Complex match instructions
Even Seq.pairwise appeared to be redundant
Checking the list for emptiness can be removed at the cost of using Unchecked.defaultof<_> below
Here's my version:
let splitWhen f src =
    if List.isEmpty src then [] else
        src
        |> List.foldBack
            (fun el (prev, current, rest) ->
                if f el prev
                then el , [el] , current :: rest
                else el , el :: current , rest
            )
        <| (List.head src, [], []) // Initial value does not matter, dislike using Unchecked.defaultof<_>
        |> fun (_, current, rest) -> current :: rest // Merge temporary lists
        |> List.filter (not << List.isEmpty) // Drop tail element

Resources