I have something similar to this:
let idx = 9
let map =
    Map.empty.
        Add(10, "abc").
        Add( 9, "bcd").
        Add( 8, "cde").
        Add( 7, "def")
let result =
    map
    |> Map.pick (fun key value -> if idx > key then Some(key) else None)
printfn "%A" result
Map.pick from MSDN: Searches the map looking for the first element where the given function returns a Some value.
I assume the search starts from the end of the map, from 7 towards 10, since the result is 7.
But I want the search to run from 10 towards 7, so that I get 8. How can I achieve this?
Maps are stored in order of their key because of the way the data structure is designed. Map functions like pick happen to start from the smallest key, but it's not something that you should rely on either way.
If this is the main way that you are using this map, then a map may not be the best choice of data structure for your overall task. But if you do need to use a map, I would suggest this:
map
|> Map.toSeq
|> Seq.filter (fun (key, _) -> idx > key)
|> Seq.map fst
|> Seq.max
// returns 8
Be aware that if there are no keys that meet the criteria, then Seq.max will receive an empty sequence and throw an exception.
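A minimal sketch of one way to avoid that exception, assuming an int-keyed map like the one in the question. The helper name tryLargestKeyBelow is hypothetical (not part of the Map module); it returns an option instead of throwing when no key qualifies.

```fsharp
// Hypothetical helper: largest key strictly below idx, or None if there is none.
let tryLargestKeyBelow (idx: int) (map: Map<int, string>) : int option =
    let candidates =
        map
        |> Map.toSeq
        |> Seq.map fst
        |> Seq.filter (fun key -> idx > key)
        |> Seq.toList
    match candidates with
    | [] -> None                      // nothing below idx: no exception, just None
    | keys -> Some (List.max keys)

let sample =
    Map.ofList [ (10, "abc"); (9, "bcd"); (8, "cde"); (7, "def") ]

printfn "%A" (tryLargestKeyBelow 9 sample)   // Some 8
printfn "%A" (tryLargestKeyBelow 7 sample)
```

The caller then has to pattern match on the result, which makes the "no matching key" case explicit.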
Good evening! I am a very new programmer getting my feet wet with F#. I am attempting to do some simple data analysis and plotting, but I cannot figure out how to access the data properly. I get everything set up and use the CsvProvider, and it works perfectly:
#load @"packages\FsLab\FsLab.fsx"
#load @"packages\FSharp.Charting\FSharp.Charting.fsx"
open Deedle
open FSharp.Data
type Pt = CsvProvider<"C:/Users/berkl/Test10/CGC.csv">
let data = Pt.Load("C:/Users/berkl/Test10/CGC.csv")
Then, I pull out the data for a specific entry:
let test = data.Rows |> Seq.filter (fun r -> r.``Patient number`` = 2104)
This works as expected and prints the following to FSI:
test;;
val it : seq<CsvProvider<...>.Row> =
seq
[(2104, "Cita 1", "Nuevo", "Femenino", nan, nan, nan);
(2104, "Cita 2", "Establecido", "", 18.85191818, 44.0, 103.0);
(2104, "Cita 3", "Establecido", "Femenino", 17.92617533, 46.0, 108.0);
(2104, "Cita 4", "Establecido", "Femenino", nan, nan, nan); ...]
Here is where I'm at a loss. I want to take the fifth column and plot it against the sixth column, and I don't know how to access them.
What I can do so far is access a single value in one of the columns:
let Finally = Seq.item 1 test
let PtHt = Finally.Ht_cm
Any help is much appreciated!!
I would probably recommend using the XPlot library instead of F# Charting, because that is the one that's going to be available in FsLab in the long term (it is cross-platform).
To create a chart using XPlot, you need to give it a sequence of pairs with X and Y values:
#load "packages/FsLab/FsLab.fsx"
open XPlot.Plotly
Chart.Scatter [ for x in 0.0 .. 0.1 .. 10.0 -> x, sin x ]
In your example, you can get the required format using sequence comprehensions (as in the above example) or using Seq.map as in the existing answer - both options do the same thing:
// Using sequence comprehensions
Chart.Scatter [ for row in test -> row.Ht_cm, row.Wt_kg ]
// Using Seq.map and piping
test |> Seq.map (fun row -> row.Ht_cm, row.Wt_kg) |> Chart.Scatter
The key thing is that you need to produce one sequence (or a list) containing the X and Y values as a tuple (rather than producing two separate sequences).
What you want to do is transform your sequence of rows to a sequence of values from a column. You use Seq.map for any such transformation.
In your case, you could do (modulo the correct column names which I don't have)
let col5 =
    test
    |> Seq.map (fun row -> row.Ht_cm)
let col6 =
    test
    |> Seq.map (fun row -> row.Wt_kg)
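If the two columns are extracted separately like this, they can still be paired up element by element with Seq.zip to get the (x, y) tuples a chart expects. A small self-contained sketch: the Row record is a hypothetical stand-in for the CSV provider's row type, and assigning the two sample values from the FSI output to Ht_cm and Wt_kg is an assumption.

```fsharp
// Hypothetical stand-in for the row type generated by the CSV provider.
type Row = { Ht_cm: float; Wt_kg: float }

// Sample values taken from the FSI output; the field assignment is assumed.
let rows =
    [ { Ht_cm = 18.85191818; Wt_kg = 44.0 }
      { Ht_cm = 17.92617533; Wt_kg = 46.0 } ]

let col5 = rows |> Seq.map (fun row -> row.Ht_cm)
let col6 = rows |> Seq.map (fun row -> row.Wt_kg)

// Pair the two columns element by element into (x, y) tuples.
let points = Seq.zip col5 col6 |> Seq.toList

printfn "%A" points
```

Mapping each row straight to a tuple gives the same result in one step; zipping is only useful when the columns already exist as separate sequences.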
Let's say I have a list of tuples. To make it easier to refer to, think of each tuple as a coordinate with an x and a y value.
let test = [(1,34);(2,43);(3,21);(1,51);(2,98);(3,56);(1,51)]
I want to build another list from test so that, if I only want the values whose x value is 1, it returns [34;51;51].
You need to filter the list first to get the tuples that have an x value of 1, then map the results to get the y values:
[(1,34);(2,43);(3,21);(1,51);(2,98);(3,56);(1,51)]
|> List.filter (fun (x,_)->x=1)
|> List.map snd
This returns:
[34;51;51]
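The filter and the map can also be fused into a single pass with List.choose. A minimal sketch; the function name ysFor is made up for illustration.

```fsharp
let test = [(1,34);(2,43);(3,21);(1,51);(2,98);(3,56);(1,51)]

// Hypothetical helper: keep the y value only when x matches the target;
// returning None drops the tuple.
let ysFor target pairs =
    pairs |> List.choose (fun (x, y) -> if x = target then Some y else None)

printfn "%A" (ysFor 1 test)   // [34; 51; 51]
```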
I've created a function that gets all integers from 1 to n and then combines with the same sequence to create a sequence of tuples of all combinations. So passing it the integer 2 would give you [(1,1);(1,2);(2,1);(2,2)]:
let allTuplesUntil x =
    let primary = seq { 1 .. x }
    let secondary = seq { 1 .. x }
    [for x in primary do
        for y in secondary do
            yield (x,y)]
This implementation works, but it uses an inner and an outer for loop, similar to what I would do in C#.
Could this be achieved in a more idiomatic functional way? Would a more functional way typically be more desirable, or is this acceptable in a functional language because of its brevity and clarity?
I'm relatively new to F# and looking for some feedback.
These loops are part of what's called a computation expression, which is quite idiomatic in F#. It's just made to look like familiar loops. I can't see any problem with your code being written this way. If what you want is to get rid of the loops, you could hide them in functions:
let cartesianProduct xs ys =
    xs |> Seq.collect (fun x -> ys |> Seq.map (fun y -> x, y))
cartesianProduct [1;2;3] ['a';'b';'c']
val it : seq<int * char> = seq [(1, 'a'); (1, 'b'); (1, 'c'); (2, 'a'); ...]
First, just because there is a for doesn't mean it's not functional. In this example you go over each element and yield a new element that becomes part of a new immutable list. This feature is also called a "list comprehension" and is part of languages like Haskell. The imperative approach would be to loop over a list and mutate it.
Second, remember that other functions like map, fold and filter also just loop over each element, like a for expression. They are just less powerful than a for loop.
Third, even if it were "not 100% functional", who cares? Code should be easily readable and understandable. The intention of two for loops is easy to understand.
Fourth, the equivalent function of the for expression is usually bind, or in this case Seq.collect. You could also write this code:
[for x in primary do
    for y in secondary do
        yield (x,y)]
Like this:
primary |> Seq.collect (fun x ->
    secondary |> Seq.collect (fun y ->
        [x,y]
    ))
I prefer the for loops for readability!
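For completeness, recent versions of FSharp.Core also provide List.allPairs (with Seq and Array counterparts), which builds the same Cartesian product without any explicit loop. A minimal sketch, assuming a core library version that includes it:

```fsharp
// Same result as the nested for loops, using the built-in List.allPairs.
let allTuplesUntil n =
    let xs = [ 1 .. n ]
    List.allPairs xs xs

printfn "%A" (allTuplesUntil 2)   // [(1, 1); (1, 2); (2, 1); (2, 2)]
```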
What are the essential functions to find duplicate elements within a list?
Translated, how can I simplify the following function:
let numbers = [ 3;5;5;8;9;9;9 ]
let getDuplicates = numbers |> List.groupBy id
                            |> List.map snd
                            |> List.filter (fun set -> set.Length > 1)
                            |> List.map (fun set -> set.[0])
I'm sure this is a duplicate. However, I am unable to locate the question on this site.
UPDATE
let getDuplicates numbers =
    numbers |> List.groupBy id
            |> List.choose (fun (k,v) -> match v.Length with
                                         | x when x > 1 -> Some k
                                         | _ -> None)
Simplifying your function:
Whenever you have a filter followed by a map, you can probably replace the pair with a choose. The purpose of choose is to run a function for each value in the list, and return only the items which return Some value (None values are removed, which is the filter portion). Whatever value you put inside Some is the map portion:
let getDuplicates = numbers |> List.groupBy id
                            |> List.map snd
                            |> List.choose( fun( set ) ->
                                if set.Length > 1
                                    then Some( set.[0] )
                                    else None )
We can take it one additional step by removing the map. In this case, keeping the tuple which contains the key is helpful, because it eliminates the need to get the first item of the list:
let getDuplicates = numbers |> List.groupBy id
                            |> List.choose( fun( key, set ) ->
                                if set.Length > 1
                                    then Some key
                                    else None )
Is this simpler than the original? Perhaps. Because choose combines two purposes, it is by necessity more complex than those purposes kept separate (the filter and the map), and this makes it harder to understand at a glance, perhaps undoing the more "simplified" code. More on this later.
Decomposing the concept
Simplifying the code wasn't the direct question, though. You asked about functions useful in finding duplicates. At a high level, how do you find a duplicate? It depends on your algorithm and specific needs:
Your given algorithm uses the "put items in buckets based on their value", and "look for buckets with more than one item". This is a direct match to List.groupBy and List.choose (or filter/map)
A different algorithm could be to "iterate through all items", "modify an accumulator as we see each", then "report all items which have been seen multiple times". This is kind of like the first algorithm, where something like List.fold is replacing List.groupBy, but if you need to drag some other kind of state along, it may be helpful.
Perhaps you need to know how many times there are duplicates. A different algorithm satisfying these requirements may be "sort the items so they are always ascending", and "flag if the next item is the same as the current item". In this case, you have a List.sort followed by a List.toSeq then Seq.windowed:
let getDuplicates = numbers |> List.sort
                            |> List.toSeq
                            |> Seq.windowed 2
                            |> Seq.choose( function
                                | [|x; y|] when x = y -> Some x
                                | _ -> None )
Note that this returns a sequence with [5; 9; 9], informing you that 9 is duplicated twice.
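If the number of occurrences is what you are after, List.countBy gives it directly. A minimal sketch; the name duplicateCounts is made up for illustration, and the final sort is only there to make the output order predictable.

```fsharp
let numbers = [ 3;5;5;8;9;9;9 ]

// Hypothetical helper: each duplicated value paired with how often it occurs.
let duplicateCounts xs =
    xs
    |> List.countBy id                              // (value, occurrences) pairs
    |> List.filter (fun (_, count) -> count > 1)    // keep only real duplicates
    |> List.sortBy fst                              // deterministic order

printfn "%A" (duplicateCounts numbers)   // [(5, 2); (9, 3)]
```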
These were algorithms mostly based on List functions. There are already two answers, one mutable, the other not, which are based on sets and existence.
My point is, a complete list of functions helpful to finding duplicates would read like a who's who list of existing collection functions -- it all depends on what you're trying to do and your specific requirements. I think your choice of List.groupBy and List.choose is probably about as simple as it gets.
Simplifying for maintainability
The last thought on simplification is to remember that simplifying code will improve the readability of your code to a certain extent. "Simplifying" beyond that point will most likely involve tricks, or obscure intent. If I were to look back at a sample of code I wrote even several weeks and a couple of projects ago, the shortest and perhaps simplest code would probably not be the easiest to understand. Thus the last point -- simplifying future code maintainability may be your goal. If this is the case, your original algorithm modified only keeping the groupBy tuple and adding comments as to what each step of the pipeline is doing may be your best bet:
// combine numbers into common buckets specified by the number itself
let getDuplicates = numbers |> List.groupBy id
                            // only look at buckets with more than one item
                            |> List.filter( fun (_,set) -> set.Length > 1)
                            // change each bucket to only its key
                            |> List.map( fun (key,_) -> key )
The comments on the original question already show that your code was unclear to people unfamiliar with it. Is this a question of experience? Definitely. But regardless of whether we work on a team or are lone wolves, optimizing code (where possible) for quick understanding should probably be close to everyone's top priority. (climbing down off soapbox...) :)
Regardless, best of luck.
If you don't mind using a mutable collection in a local scope, this could do it:
open System.Collections.Generic
let getDuplicates numbers =
    let known = HashSet()
    numbers |> List.filter (known.Add >> not) |> set
You can wrap the last three operations in a List.choose:
let duplicates =
    numbers
    |> List.groupBy id
    |> List.choose ( function
        | _, x::_::_ -> Some x
        | _ -> None )
Here's a solution which uses only basic functions and immutable data structures:
let findDups elems =
    let findDupsHelper (oneOccurrence, manyOccurrences) elem =
        if oneOccurrence |> Set.contains elem
        then (oneOccurrence, manyOccurrences |> Set.add elem)
        else (oneOccurrence |> Set.add elem, manyOccurrences)
    List.fold findDupsHelper (Set.empty, Set.empty) elems |> snd
Good morning everyone,
I have to do a programming exercise, but I'm stuck!
The exercise asks for a function that, given a non-empty list of integers, returns the first number with the maximum number of occurrences.
For example:
mode [1;2;5;1;2;3;4;5;5;4;5;5] ==> 5
mode [2;1;2;1;1;2] ==> 2
mode [-1;2;1;2;5;-1;5;5;2] ==> 2
mode [7] ==> 7
Important: the exercise must be solved in a functional programming style.
My idea is:
let rec occurences_counter xs i = match xs with
|[] -> failwith "Error"
|x :: xs when x = i -> 1 + occurences_counter xs i
|x :: xs -> occurences_counter xs i;;
This is the function I'm stuck on:
let rec mode (l : int list) : int = match l with
|[] -> failwith "Error"
|[x] -> x
|x::y::l when occurences_counter l x >= occurences_counter l y -> x :: mode l
|x::y::l when occurences_counter l y > occurences_counter l x -> y :: mode l;;
Thanks in advance. I'm a newbie in programming and on Stack Overflow.
Sorry for my English.
One solution: first calculate a list of pairs (number, occurrences).
Hint: use List.assoc.
Then loop over that list of pairs to find the maximum number of occurrences and return the corresponding number.
One suggestion:
Your algorithm could be simplified if you sort the list first. This has O(N log(N)) complexity. Then measure the longest run of identical numbers.
This is a good strategy because you delegate the hard part of the work to a well known algorithm.
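A rough F# sketch of this sort-then-scan idea (the exercise itself is in OCaml; the function name longestRun is made up). Note the assumption baked in: after sorting, ties are resolved in favour of the smallest value, not the value that appears first in the original list, so this alone does not satisfy the "first number" requirement.

```fsharp
// Hypothetical helper: sort, then track the longest run of equal values.
// Returns Some (value, runLength), or None for an empty list.
let longestRun (xs: int list) : (int * int) option =
    match List.sort xs with
    | [] -> None
    | h :: t ->
        // State: (current value, current run length, best value, best run length)
        let (_, _, bestX, bestN) =
            t
            |> List.fold (fun (cur, n, bestX, bestN) x ->
                let n' = if x = cur then n + 1 else 1
                if n' > bestN then (x, n', x, n') else (x, n', bestX, bestN))
                (h, 1, h, 1)
        Some (bestX, bestN)

printfn "%A" (longestRun [1;2;5;1;2;3;4;5;5;4;5;5])   // Some (5, 5)
```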
It is probably not the most beautiful code, but here is what I came up with (F#). First I transform every element to an intermediate format. This format contains the element itself, the position of its occurrence and the number of times it occurred.
type T<'a> = {
    Element: 'a
    Position: int
    Occurred: int
}
The idea is that these records can be added. So you can first transform every element and then add them together. So a list like
[1;3]
will be first transformed to
[{Element=1;Position=0;Occurred=1}; {Element=3;Position=1;Occurred=1}]
You can only add two records that have the same "Element". The lower of the two Positions is taken, and the Occurred counts are added together. So if you have, for example,
{Element=3;Position=1;Occurred=2} {Element=3;Position=3;Occurred=2}
the result will be
{Element=3;Position=1;Occurred=4}
The idea I had in mind was a Monoid. But for a real Monoid you would also have to define how to add different Elements together. After trying some things out, I felt that the restriction of only adding the same Element was much easier. I created a small module with the type, including some helper functions for creating, adding and comparing.
module Occurred =
    type T<'a> = {
        Element: 'a
        Position: int
        Occurred: int
    }
    let create x pos occ = {Element=x; Position=pos; Occurred=occ}
    let sameElements x y = x.Element = y.Element
    let add x y =
        if not <| sameElements x y then failwith "Cannot add two different Occurred"
        create x.Element (min x.Position y.Position) (x.Occurred + y.Occurred)
    let compareOccurredPosition x y =
        let occ = compare x.Occurred y.Occurred
        let pos = compare x.Position y.Position
        match occ,pos with
        | 0,x -> x * -1
        | x,_ -> x
With this setup I wrote two additional functions. The first is an aggregate function that turns every element into an Occurred.T and groups them by x.Element (the result is a list of key and list pairs). It then uses List.reduce on each inner list to add the Occurred values with the same Element together. The result is a list that contains a single Occurred.T for every Element, with its first Position and the total number of occurrences.
let aggregate =
    List.mapi (fun i x -> Occurred.create x i 1)
    >> List.groupBy (fun occ -> occ.Element)
    >> List.map (fun (x,occ) -> List.reduce Occurred.add occ)
You can now use that aggregate function to implement different aggregation logic. In your case you only want the element with the highest number of occurrences and the lowest position. I wrote another function that does that.
let firstMostOccurred =
    List.sortWith (fun x y -> (Occurred.compareOccurredPosition x y) * -1) >> List.head >> (fun x -> x.Element)
One note: Occurred.compareOccurredPosition is written so that it sorts in ascending order, which I think is what people expect by default, going from the smallest to the biggest element. So by default the first element would be the one with the lowest number of occurrences and the biggest Position. Multiplying its result by -1 turns it into a descending comparison. I did that so I could use List.head. I could also have used List.last, but I felt it would be better not to walk the whole list again just to get the last element. On top of that, you didn't want an Occurred.T, you wanted the element itself, so I unwrap Element to get the number.
Here is everything in action
let ll = [
    [1;2;5;1;2;3;4;5;5;4;5;5]
    [2;1;2;1;1;2]
    [-1;2;1;2;5;-1;5;5;2]
    [7]
]
ll
|> List.map aggregate
|> List.map firstMostOccurred
|> List.iter (printfn "%d")
This code will now print
5
2
2
7
It still has some rough edges:
Occurred.add throws an exception if you try to add Occurred values with different Elements
List.head throws an exception for empty lists
In both cases no code is written to handle those situations or to make sure an exception cannot be raised.
You need to process your input list while maintaining a state that stores the number of occurrences of each number. Basically, the state can be a map, where keys are in the domain of list elements and values are in the domain of natural numbers. If you use Map, the algorithm has O(N log N) complexity. You can also use an association list (i.e., a list of type ('key,'value) list) to implement the map. This leads to quadratic complexity. Another approach is to use a hash table or an array whose length equals the size of the input domain. Both give you linear complexity.
After you have collected the statistics (i.e., a mapping from each element to the number of its occurrences), you need to go through the set of winners and choose the one that comes first in the list.
In OCaml the solution would look like this:
open Core_kernel.Std
let mode xs : int =
List.fold xs ~init:Int.Map.empty ~f:(fun stat x ->
Map.change stat x (function
| None -> Some 1
| Some n -> Some (n+1))) |>
Map.fold ~init:Int.Map.empty ~f:(fun ~key:x ~data:n modes ->
Map.add_multi modes ~key:n ~data:x) |>
Map.max_elt |> function
| None -> invalid_arg "mode: empty list"
| Some (_,ms) -> List.find_exn xs ~f:(List.mem ms)
The algorithm is the following:
Run through the input and compute the frequency of each element.
Run through the statistics and compute the spectrum (i.e., a mapping from frequency to elements).
Get the set of elements that have the highest frequency, and find the first element in the input list that is in this set.
For example, if we take sample [1;2;5;1;2;3;4;5;5;4;5;5],
stats = {1 => 2; 2 => 2; 3 => 1; 4 => 2; 5 => 5}
mods = {1 => [3]; 2 => [1;2;4]; 5 => [5]}
To play with the OCaml code you need to install the Core library. Use coretop to try this function in the toplevel, or corebuild to compile it, like this:
corebuild test.byte --
if the source code is stored in test.ml
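For readers following the F# answers in this thread, the same three-step algorithm can be sketched in F# with only the standard Map and List modules. This is an illustrative translation under that assumption, not a port of the Core-based code above; it skips the explicit spectrum map and just looks up the highest frequency.

```fsharp
// Sketch: first element of the input with the maximum number of occurrences.
let mode (xs: int list) : int =
    if List.isEmpty xs then invalidArg "xs" "mode: empty list"
    // Step 1: frequency of each element.
    let stats =
        xs
        |> List.fold (fun (m: Map<int, int>) x ->
            match Map.tryFind x m with
            | Some n -> Map.add x (n + 1) m
            | None -> Map.add x 1 m) Map.empty
    // Step 2: the highest frequency among all elements.
    let best = stats |> Map.toSeq |> Seq.map snd |> Seq.max
    // Step 3: the first element of the original list that reaches it.
    xs |> List.find (fun x -> Map.find x stats = best)

printfn "%d" (mode [1;2;5;1;2;3;4;5;5;4;5;5])   // 5
printfn "%d" (mode [2;1;2;1;1;2])               // 2
```

Scanning the original list in the last step is what guarantees the "first" winner on ties, since the map itself is ordered by key rather than by position.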