Convert a sequence of dictionary keys to a set - f#

The following code lists the set of keys found in a dictionary sequence (each dict is basically a row from a database). (I want to convert the keys to a set so I can compare 2 db tables)
for seqitem in tblseq do
    let keyset = seqitem.Keys |> Set.ofSeq // works correctly
    printfn ">>> List: %A" keyset
Rather than print the keyset, however, I want to return it from a function, but I am having a problem with type inference. What I want is to return these values as either an array or a list (rather than print them). I tried the following, but it does not work:
let get_keyset tblseq =
    tblseq |> Seq.iter (fun x ->
        x.Keys |> Set.ofSeq
    )
What am I missing here?

Using Seq.map as ildjarn suggests is one option (you may want to add Array.ofSeq to the end to get an array of sets, as you say in your question).
An alternative approach is to use array comprehension:
let get_keyset (tblseq:seq<System.Collections.Generic.Dictionary<_, _>>) =
    [| for x in tblseq -> x.Keys |> Set.ofSeq |]
The notation [| .. |] says that you want to create an array of elements and the expression following -> specifies what should be produced as an element. The syntax is essentially just a nicer way for writing Seq.map (although it supports more features).
You can also use this syntax for creating sets (instead of calling Set.ofSeq). In this case it doesn't make much sense, because Set.ofSeq is faster and shorter, but sometimes it is quite a neat option. It allows you to avoid type annotations, because you can get the key of a dictionary using the KeyValue pattern:
let get_keyset tblseq =
    [| for x in tblseq ->
        set [ for (KeyValue(k, v)) in x -> k ] |]

Use Seq.map rather than Seq.iter:
open System.Collections.Generic

let get_keyset tblseq =
    tblseq
    |> Seq.map (fun (x:Dictionary<_,_>) -> x.Keys |> set)
    |> Array.ofSeq
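Since the stated goal is to compare two tables, the per-row key sets can be merged into one set per table and then compared with the ordinary set operations. This is only a sketch under assumptions: keysOf, makeRow, tableA and tableB are hypothetical names, and the rows are assumed to be Dictionary<string, string>.

```fsharp
open System.Collections.Generic

// Hypothetical helper: the union of the keys of every row in a table.
let keysOf (rows: seq<Dictionary<string, string>>) =
    rows
    |> Seq.map (fun row -> Set.ofSeq row.Keys)
    |> Set.unionMany

// Hypothetical helper that builds a sample row from key/value pairs.
let makeRow (pairs: (string * string) list) =
    let d = Dictionary<string, string>()
    for (k, v) in pairs do
        d.[k] <- v
    d

// Made-up sample data standing in for two database tables.
let tableA = [ makeRow [ "id", "1"; "name", "Ann" ] ]
let tableB = [ makeRow [ "id", "1"; "email", "ann@example.com" ] ]

// Keys present in A but missing from B, and keys common to both.
let onlyInA = keysOf tableA - keysOf tableB
let shared = Set.intersect (keysOf tableA) (keysOf tableB)
```

Here onlyInA is set ["name"] and shared is set ["id"].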

Related

F# sequences with BigInteger indices

I am looking for a type similar to sequences in F# where indices could be big integers, rather than being restricted to int. Does anything like this exist?
By "big integer indices" I mean a type which allows for something equivalent to that:
let s = Seq.initInfinite (fun i -> i + 10I)
The following will generate an infinite series of bigints:
let s = Seq.initInfinite (fun i -> bigint i + 10I)
What I suspect you actually want, though, is a Map<'Key, 'Value>.
This lets you efficiently use a bigint as an index to look up whatever value it is you care about:
let map =
    seq {
        1I, "one"
        2I, "two"
        3I, "three"
    }
    |> Map.ofSeq
// val map : Map<System.Numerics.BigInteger,string> =
// map [(1, "one"); (2, "two"); (3, "three")]
map.TryFind 1I |> (printfn "%A") // Some "one"
map.TryFind 4I |> (printfn "%A") // None
The equivalent of initInfinite for BigIntegers would be
open System
let inf = Seq.unfold (fun i -> let n = i + bigint.One in Some(n, n)) bigint.Zero
let biggerThanAnInt = inf |> Seq.skip (Int32.MaxValue) |> Seq.head // 2147483648
which takes ~2 min to run on my machine.
However, I doubt this is of any practical use :-) That is unless you start at some known value > Int32.MaxValue and stop reasonably soon (generating less than Int32.MaxValue items), which then could be solved by offsetting the BigInt indexes into the Int32 domain.
Theoretically you could amend the Seq module with functions working with BigIntegers to skip / window / ... an amount of items > Int32.MaxValue (e.g. by repeatedly performing the corresponding Int32 variant)
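For the practical case mentioned above (start at a known value beyond Int32.MaxValue and stop reasonably soon), one possible sketch is to let Seq.unfold carry the bigint itself, so no int index is involved at all. The name countFrom and the start value are assumptions made for illustration.

```fsharp
// Hypothetical helper: count upwards from an arbitrary bigint start value.
// The state passed through Seq.unfold is the bigint itself.
let countFrom (start: bigint) =
    Seq.unfold (fun i -> Some (i, i + 1I)) start

// Take a few items starting above Int32.MaxValue (2147483647).
let firstThree = countFrom 3000000000I |> Seq.take 3 |> List.ofSeq
```

firstThree evaluates to [3000000000; 3000000001; 3000000002] without iterating over the first three billion values.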
Since you want to index into a sequence, I assume you want a version of Seq.item that takes a BigInteger as index. There's nothing like that built into F#, but it's easy to define your own:
open System.Numerics
module Seq =
    let itemI (index : BigInteger) source =
        source |> Seq.item (int index)
Note that no new type is needed unless you're planning to create sequences that are longer than 2,147,483,647 items, which would probably not be practical anyway.
Usage:
let items = [| "moo"; "baa"; "oink" |]
items
|> Seq.itemI 2I
|> printfn "%A" // output: "oink"

Read two integers in the same line as tuple in f#

I'm trying to read two integers which are going to be taken as input from the same line. My attempt so far:
let separator: char = ' '
Console.ReadLine().Split separator
|> Array.map Convert.ToInt32
But this returns a two-element array and I have to access the individual indices to access the integers. Ideally, what I would like is the following:
let (a, b) =
    Console.ReadLine().Split separator
    |> Array.map Convert.ToInt32
    |> (some magic to convert the two element array to a tuple)
How can I do that?
I'm afraid there's no magic. You have to explicitly convert the array into a tuple:
let a, b =
    Console.ReadLine().Split separator
    |> Array.map int
    |> (fun arr -> arr.[0], arr.[1])
Edit: you can use reflection as @dbc suggested, but that's slow and probably overkill for what you're doing.
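An array pattern is another way to write the explicit conversion, and it also covers input that does not have exactly two fields. The sketch below uses a hypothetical helper named parsePair and works on a string rather than Console.ReadLine() so it can be tried without console input; int still throws on non-numeric text.

```fsharp
// Hypothetical helper: turn "3 4" into Some (3, 4).
// Returns None when the line does not contain exactly two fields.
let parsePair (line: string) =
    match line.Split(' ') |> Array.map int with
    | [| a; b |] -> Some (a, b)
    | _ -> None

let ok = parsePair "3 4"      // Some (3, 4)
let tooShort = parsePair "3"  // None
```

With console input this would be called as parsePair (Console.ReadLine()).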

Trying to filter out values in a sequence that are not in another sequence

I am trying to filter out values from a sequence, that are not in another sequence. I was pretty sure my code worked, but it is taking a long time to run on my computer and because of this I am not sure, so I am here to see what the community thinks.
Code is below:
let statezip =
    StateCsv.GetSample().Rows
    |> Seq.map (fun row -> row.State)
    |> Seq.distinct
type State = State of string
let unwrapstate (State s) = s
let neededstates (row:StateCsv) = Seq.contains (unwrapstate row.State) statezip
I am filtering by the neededstates function. Is there something wrong with the way I am doing this?
let datafilter =
    StateCsv1.GetSample().Rows
    |> Seq.map (fun row -> row.State,row.Income,row.Family)
    |> Seq.filter neededstates
    |> List.ofSeq
I believe that it should filter the sequence by the values that are true, since neededstates function is a bool. StateCsv and StateCsv1 have the same exact structure, although from different years.
Evaluation of contains on sequences and lists can be slow. For a case where you want to check for the existence of an element in a collection, the F# Set type is ideal. You can convert your sequences to sets using Set.ofSeq, and then run the logic over the sets instead. The following example uses the numbers from 0 to 10000 and then uses both sequences and sets to filter the result to only the odd numbers by checking that the values are not in a collection of even numbers.
Using Sequences:
let numberSeq = seq { 0 .. 10000 }
let evenNumberSeq = seq { for n in numberSeq do if (n % 2 = 0) then yield n }
#time
numberSeq |> Seq.filter (fun n -> evenNumberSeq |> Seq.contains n |> not) |> Seq.toList
#time
This runs in about 1.9 seconds for me.
Using sets:
let numberSet = numberSeq |> Set.ofSeq
let evenNumberSet = evenNumberSeq |> Set.ofSeq
#time
numberSet |> Set.filter (fun n -> evenNumberSet |> Set.contains n |> not)
#time
This runs in only 0.005 seconds. Hopefully you can materialize your sequences to sets before performing your contains operation, thereby getting this level of speedup.
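Applied to the question, the idea could look roughly like the sketch below. The CSV type provider is not available here, so a plain record stands in for a row; Row, neededStates, rows and dataFilter are hypothetical names with made-up data. The rows are filtered before being projected to tuples, so the predicate still sees a value with a State field.

```fsharp
// Stand-in for a CSV row (assumption: the real rows expose these fields).
type Row = { State: string; Income: int; Family: int }

// Materialize the wanted states once as a Set for fast membership tests.
let neededStates = set [ "OH"; "TX" ]

// Made-up sample rows.
let rows =
    [ { State = "OH"; Income = 50000; Family = 3 }
      { State = "CA"; Income = 61000; Family = 2 }
      { State = "TX"; Income = 42000; Family = 4 } ]

// Filter first, then project each surviving row to a tuple.
let dataFilter =
    rows
    |> Seq.filter (fun row -> Set.contains row.State neededStates)
    |> Seq.map (fun row -> row.State, row.Income, row.Family)
    |> List.ofSeq
```

dataFilter evaluates to [("OH", 50000, 3); ("TX", 42000, 4)].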

F#: Converting tuples into a hashtable

I am new to programming, and this is my first time working with a typed, functional, and .NET language, so pardon me if my question is silly/trivial.
I have a list of tuples, and I would like to store the first item (which is a string) of each tuple as a value in the hashtable, and the second item (which is a byte array) of each tuple as a key. How may I go about doing so?
Here is my code:
let readAllBytes (tupleOfFileLengthsAndFiles) =
    let hashtable = new Hashtable()
    tupleOfFileLengthsAndFiles
    |> snd
    |> List.map (fun eachFile -> (eachFile, File.ReadAllBytes eachFile))
    |> hashtable.Add(snd eachTuple, fst eachTuple)
However, the last line is underscored in red. How can I improve it? Thanks in advance for your help.
The easiest way is to use dict.
It converts a sequence of tuples into a dictionary (an IDictionary), which is a strongly typed hashtable.
> dict [(1,"one"); (2,"two"); ] ;;
val it : System.Collections.Generic.IDictionary<int,string> =
  seq [[1, one] {Key = 1;
                 Value = "one";}; [2, two] {Key = 2;
                                            Value = "two";}]
If you are really interested in a Hashtable you can use this simple function:
open System.Collections
open System.IO

let convert x =
    let d = Hashtable()
    // Hashtable is untyped, so keys and values are boxed to obj
    x |> Seq.iter (fun (k, v) -> d.Add(box k, box v))
    d
I'm not sure what you want to do with this; it seems you're also interested in reading the files in the middle of the conversion. Maybe something like this:
let readAllBytes (tupleOfFileLengthsAndFiles:seq<'a * string>) =
    tupleOfFileLengthsAndFiles
    |> Seq.map (fun (x, y) -> x, File.ReadAllBytes y)
    |> convert
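If the byte array really should be the key and the name the value, as the question describes, the tuples can be swapped before they are added. The following is a self-contained sketch with in-memory sample data instead of real files; toHashtable, files and byContent are hypothetical names. One caveat worth knowing: .NET arrays compare by reference when used as Hashtable keys, so a lookup only succeeds with the same array instance.

```fsharp
open System.Collections

// Hypothetical helper: copy (key, value) pairs into an untyped Hashtable.
let toHashtable (pairs: seq<'k * 'v>) =
    let table = Hashtable()
    for (k, v) in pairs do
        table.Add(box k, box v)
    table

// Made-up (file name, contents) pairs standing in for files read from disk.
let files = [ "a.txt", [| 1uy; 2uy |]; "b.txt", [| 3uy |] ]

// Swap each tuple so the byte array becomes the key and the name the value.
let byContent =
    files
    |> Seq.map (fun (name, bytes) -> bytes, name)
    |> toHashtable

// Look up using the very same array instance that was stored as the key.
let firstKey = snd files.[0]
let firstName = byContent.[box firstKey] :?> string
```

firstName is "a.txt", while a freshly created array with the same bytes would not be found.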

Set of keys from a map

I have a map X and I'm trying to get a set of the keys satisfying a certain condition, something like this:
Map.Keys X
    |> Set.filter (fun x -> ...)
...but I cannot find the way to get the keys from F#'s Map collection.
Convert your map to sequence of tuples (key,value) first and then map it to a sequence of just keys:
map |> Map.toSeq |> Seq.map fst
FSI sample:
>Map.ofList[(1,"a");(2,"b")] |> Map.toSeq |> Seq.map fst;;
val it : seq<int> = seq [1; 2]
Alternatively, since the ordering of keys likely does not matter, you may use a more eager method that returns a list of all keys. It is also not hard to turn it into an extension function keys of the Microsoft.FSharp.Collections.Map module:
module Map =
    let keys (m: Map<'Key, 'T>) =
        Map.fold (fun keys key _ -> key::keys) [] m
As of F# 6.0, the Map collection has a Keys property.
OLD ANSWER:
Most readable (and probably most efficient, due to not needing previous conversions to Seq or mapping) answer:
let Keys(map: Map<'K,'V>) =
    seq {
        for KeyValue(key,value) in map do
            yield key
    } |> Set.ofSeq
For a set of keys you could just do:
let keys<'k, 'v when 'k : comparison> (map : Map<'k, 'v>) =
    Map.fold (fun s k _ -> Set.add k s) Set.empty map
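To tie this back to the question, here is a small sketch that builds the key set and then filters it. The sample map x and the "even keys" condition are made up for illustration; the first version works on any F# version, the second relies on the Keys property from F# 6.0.

```fsharp
// Made-up sample map.
let x = Map.ofList [ 1, "a"; 2, "b"; 3, "c"; 4, "d" ]

// Any F# version: go through the sequence of (key, value) pairs.
let evenKeys =
    x
    |> Map.toSeq
    |> Seq.map fst
    |> Set.ofSeq
    |> Set.filter (fun k -> k % 2 = 0)

// F# 6.0 or later: start from the Keys property instead.
let evenKeys6 =
    x.Keys
    |> Set.ofSeq
    |> Set.filter (fun k -> k % 2 = 0)
```

Both evenKeys and evenKeys6 evaluate to set [2; 4].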
