Projecting a collection/sequence into a single Record - f#

I'm parsing HTML (via HAP, the HTML Agility Pack) and now need to parse the column content of each table row (a collection of TD elements).
Note: I'm not using FSharp.Data's HTML parser because it breaks on HTML that contains <script> code, which causes the CSS selectors to fail (a known issue).
The type that I am trying to map a row of data into (10 "columns" of varying types) :
type DailyRow = { C0: string; C1: string; C2: int; C3: decimal; C4: string; C5: string; C6: int; C7: decimal; C8: decimal; C9: int }
My ugly but working function that maps column positions into the record field (yes, anything that does not parse correctly should explode):
let dailyRow = fun (record:DailyRow, column:int, node:HtmlNode) ->
    printfn "dailyRow: Column %i has value %s" column node.InnerText
    match column with
    | 0 -> {record with C0 = node.InnerText }
    | 1 -> {record with C1 = node.InnerText }
    | 2 -> {record with C2 = (node.InnerText |> int) }
    | 3 -> {record with C3 = Decimal.Parse(node.InnerText, NumberStyles.Currency) }
    | 4 -> {record with C4 = node.InnerText }
    | 5 -> {record with C5 = node.InnerText }
    | 6 -> {record with C6 = Int32.Parse(node.InnerText, NumberStyles.AllowThousands) }
    | 7 -> {record with C7 = Decimal.Parse(node.InnerText, NumberStyles.Currency) }
    | 8 -> {record with C8 = Decimal.Parse(node.InnerText, NumberStyles.Currency) }
    | 9 -> {record with C9 = (node.InnerText |> int) }
    | _ -> raise (System.MissingFieldException("New Field in Chart Data Found: " + column.ToString()))
Some test code:
let chartRow = { C0 = ""; C1 = ""; C2 = 0; C3 = 0.0M; C4 = "" ; C5 = ""; C6 = 0; C7 = 0.0M; C8 = 0.0M; C9 = 0 }
let columnsToParse = row.SelectNodes "td" // 1 row of 10 columns
let x = columnsToParse
        |> Seq.mapi (fun i x -> dailyRow(chartRow, i, x))
The issue: since I am passing in an immutable record and receiving a new record from the dailyRow function via Seq.mapi (using the index as the column number), I end up with 10 records, each with only one of its fields set.
In C# I would just pass dailyRow a ref'd object and update it in place, what would be the F# idiomatic way of handling this?

Simplest option, if you don't mind an array allocation:
let nodes = seq [...]
let arr = nodes |> Seq.map (fun n -> n.InnerText) |> Array.ofSeq
let record =
    { C0 = arr.[0]
      C1 = arr.[1]
      C2 = int arr.[2]
      C3 = Decimal.Parse(arr.[3], NumberStyles.Currency)
      C4 = arr.[4]
      C5 = arr.[5]
      C6 = Int32.Parse(arr.[6], NumberStyles.AllowThousands)
      C7 = Decimal.Parse(arr.[7], NumberStyles.Currency)
      C8 = Decimal.Parse(arr.[8], NumberStyles.Currency)
      C9 = int arr.[9] }
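If you would rather keep a per-column function like the one in the question, a fold (instead of a map) threads one record through all the columns, so you end up with a single record rather than ten. This is only a minimal sketch: SmallRow, emptyRow, setColumn and cells are hypothetical names, the record is cut down to three columns, and plain strings stand in for the HtmlNode values.

```fsharp
open System
open System.Globalization

// Hypothetical three-column record standing in for the ten-column DailyRow.
type SmallRow = { Name: string; Count: int; Price: decimal }

let emptyRow = { Name = ""; Count = 0; Price = 0.0M }

// Returns a copy of the record with the field for the given column filled in.
// Anything that does not parse, or an unknown column, raises an exception.
let setColumn (record: SmallRow) (column: int, text: string) =
    match column with
    | 0 -> { record with Name = text }
    | 1 -> { record with Count = Int32.Parse(text, NumberStyles.AllowThousands, CultureInfo.InvariantCulture) }
    | 2 -> { record with Price = Decimal.Parse(text, NumberStyles.Currency, CultureInfo.InvariantCulture) }
    | _ -> failwithf "Unexpected column %i" column

// Plain strings stand in for node.InnerText of each TD element.
let cells = [ "Widget"; "1,200"; "19.99" ]

// Seq.indexed pairs each cell with its position; Seq.fold carries the record along.
let row = cells |> Seq.indexed |> Seq.fold setColumn emptyRow
```

Each step still produces a new immutable record, but only the last one survives, which is the effect the ref'd object gives you in C#.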

Related

f# function to count number of digraphs in a string

I'm getting an error with this function. I'm new to F#, so I don't fully know what the code is doing; I tried adapting a function that takes only one parameter to find vowels in a string.
let rec countDigraph c1 c2 L =
    match L with
    | [] -> 0
    | hd::tl when hd = c1 -> 1 + count c1 tl
    | hd::tl when tl = c2 -> 1 + count c2 tl
    | _::tl ->0 + countDigraph c1 c2 tl
gets called later in the code:
let printCountDigraph digraph L =
    let c1 = List.head digraph
    let c2 = List.head digraph
    printfn "%A,%A: %A" c1 c2 (countDigraph c1 c2 L)
let digraphs = [['a';'i']; ['c';'h']; ['e';'a']; ['i';'e']; ['o';'u']; ['p';'h']; ['s';'h']; ['t';'h']; ['w';'h'];]
List.iter (fun digraph -> printCountDigraph digraph L) digraphs
In countDigraph, you need to check that the first two characters of the list match the digraph. You seem to be trying to do this by first checking the first one (in the first case) and then the second one (in the second case), but this is not how pattern matching works.
The easiest option is to have a single clause that uses the pattern l1::l2::tl to extract the first two letters, followed by the rest of the list. You need to think whether e.g. eai counts as two digraphs or just one. If two, you need to recursively call countDigraph on c2::tl as below - if just one, you would recursively call countDigraph on just tl.
let rec countDigraph c1 c2 L =
    match L with
    | [] -> 0
    | l1::l2::tl when l1=c1 && l2=c2 -> 1 + countDigraph c1 c2 (c2::tl)
    | _::tl ->0 + countDigraph c1 c2 tl
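To make the "two or just one" choice concrete, here is a small sketch of both variants side by side. The names countOverlapping and countNonOverlapping are made up for illustration. For a single digraph, the two only differ when both letters are the same, as in "aaa".

```fsharp
// Overlapping: after a match, the second letter may start the next digraph.
let rec countOverlapping c1 c2 chars =
    match chars with
    | l1 :: l2 :: tl when l1 = c1 && l2 = c2 -> 1 + countOverlapping c1 c2 (l2 :: tl)
    | _ :: tl -> countOverlapping c1 c2 tl
    | [] -> 0

// Non-overlapping: after a match, both letters are consumed.
let rec countNonOverlapping c1 c2 chars =
    match chars with
    | l1 :: l2 :: tl when l1 = c1 && l2 = c2 -> 1 + countNonOverlapping c1 c2 tl
    | _ :: tl -> countNonOverlapping c1 c2 tl
    | [] -> 0

let sample = List.ofSeq "aaa"
let overlapping = countOverlapping 'a' 'a' sample       // counts "aa" at positions 0 and 1
let nonOverlapping = countNonOverlapping 'a' 'a' sample // counts "aa" at position 0 only
```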
The rest of the code gets much easier if you represent digraphs as a list of pairs, rather than a list of two-element lists:
let printCountDigraph (c1, c2) L =
    printfn "%A,%A: %A" c1 c2 (countDigraph c1 c2 L)
let digraphs = [('a','i'); ('c','h'); ('e','a'); ('i','e');
                ('o','u'); ('p','h'); ('s','h'); ('t','h'); ('w','h')]
let L = List.ofSeq "chai"
List.iter (fun digraph -> printCountDigraph digraph L) digraphs

Multiplication in F#

I can't get my answer correctly from my multiplication function
My code is:
let List = [77; 14; 89; 93; 201]
let rec Mult =
    match n with
    | 24 -> 24
    | _-> n * n
for i = 1 to 5 do
    printfn "Multiplication: %A" (Mult i)
My question is: how do I get it to use my List?
let List = [24; 103; 7; 13; 445]
let rec Mult = function
    | head :: tail -> head * (Mult tail)
    | [] -> 1
let result = Mult List
printfn "%A" result
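The recursive function above multiplies the head by the product of the tail and uses 1 for the empty list. The same computation can be sketched as a fold; this is an alternative, not the code from the answer. The list is named numbers here (an illustrative name) to avoid any clash with the List module.

```fsharp
// Named `numbers` rather than `List` so that the List module stays usable.
let numbers = [24; 103; 7; 13; 445]

// Start from 1 (the identity for multiplication) and multiply in each element.
// Note the spaces in ( * ): without them, (* would start a comment.
let product = List.fold ( * ) 1 numbers

printfn "%d" product
```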

F# Traversing mutually recursive tree to count elements

Given the following types:
type Title = string
type Document = Title * Element list
and Element = Par of string | Sec of Document
I'm attempting to create a function which will traverse the tree and count the number of occurrences of Sec.
Given the following example:
let s1 = ("section 1", [Par "Bla"])
let s21 = ("subsection 2.1", [Par "Bla"])
let s2 = ("section 2", [Sec s21; Par "Bla"])
let s3 = ("section 3", [Par "Bla"])
let doc = ("Compiler project", [Par "Bla"; Sec s1; Sec s2; Sec s3])
A function taking a Document to count number of Sections, noOfSecs d would in this case return 4, as there are 4 Sections in this case. I've attempted something, but I'm a little stuck, especially what to do when I hit a Par:
let rec noOfSecs d =
    match d with
    | (_,[]) -> 0
    | (_,e::es) -> (findSecs e)
and findSecs = function
    | Sec(t,_::es) -> 1 + noOfSecs (t,es)
    | Par p -> //What should we do here?
There are 0 Secs within a Par string so you can return 0 for that case. In noOfSecs you need to sum the Sec cases for each element in the element list, not just the first one. You can use List.sumBy for this:
let rec noOfSecs (_, elems) =
    List.sumBy findSecs elems
and findSecs = function
    | Sec d -> 1 + noOfSecs d
    | Par p -> 0
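As a quick check, here is the function applied to the example document from the question, put together as one self-contained sketch. The only change from the question's example is that s21 is defined before s2, since s2 refers to it.

```fsharp
type Title = string
type Document = Title * Element list
and Element = Par of string | Sec of Document

// Each Sec counts as 1 plus whatever sections are nested inside it; a Par counts as 0.
let rec noOfSecs ((_, elems): Document) : int =
    List.sumBy findSecs elems
and findSecs = function
    | Sec d -> 1 + noOfSecs d
    | Par _ -> 0

let s1 = ("section 1", [Par "Bla"])
let s21 = ("subsection 2.1", [Par "Bla"])
let s2 = ("section 2", [Sec s21; Par "Bla"])
let s3 = ("section 3", [Par "Bla"])
let doc = ("Compiler project", [Par "Bla"; Sec s1; Sec s2; Sec s3])

// s1, s2 and s3 at the top level, plus s21 nested inside s2.
let count = noOfSecs doc
```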

How do I separate case ids from case values on a Discriminated Union?

I want to build a dictionary from a list of items.
An item has the following definition:
type Item =
| A of TotalPrice * Special
| B of TotalPrice * Special
| C of TotalPrice
| D of TotalPrice
I want the keys of the dictionary to map to the case ids:
| A
| B
| C
| D
I would then have the values for the case id be a list.
How do I separate the case ids from the case values?
Example:
let dictionary = items |> List.map (fun item -> item) // uh...
Appendix:
module Checkout
(*Types*)
type UnitPrice = int
type Qty = int
type Special =
| ThreeForOneThirty
| TwoForFourtyFive
type TotalPrice = { UnitPrice:int ; Qty:int }
type Item =
| A of TotalPrice * Special
| B of TotalPrice * Special
| C of TotalPrice
| D of TotalPrice
(*Functions*)
let totalPrice (items:Item list) =
    let dictionary = items |> List.map (fun item -> item) // uh...
    0
(*Tests*)
open FsUnit
open NUnit.Framework
[<Test>]
let ``buying 2 A units, B unit, A unit = $160`` () =
    // Setup
    let items = [A ({UnitPrice=50; Qty=2} , ThreeForOneThirty)
                 B ({UnitPrice=30; Qty=1} , TwoForFourtyFive)
                 A ({UnitPrice=50; Qty=1} , ThreeForOneThirty)]
    items |> totalPrice |> should equal 160
Your data is badly defined for your use case. If you want to refer to the kinds of items by themselves, you need to define them by themselves:
type ItemKind = A | B | C | D
type Item = { Kind: ItemKind; Price: TotalPrice; Special: Special option }
Then you can easily build a dictionary of items:
let dictionary = items |> List.map (fun i -> i.Kind, i) |> dict
Although I must note that such a dictionary may not be possible: if the items list contains several items of the same kind, some of them will not be included in the dictionary, because it can't contain multiple identical keys. Perhaps I didn't understand what kind of dictionary you're after.
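If the goal is for each kind to map to a list of items, as the question suggests, one possible sketch is to group by Kind before building the lookup, so that duplicates end up in the same list. The name byKind is made up for illustration, and an immutable Map is used here in place of dict. The sample items mirror the test in the question.

```fsharp
type Special = ThreeForOneThirty | TwoForFourtyFive
type TotalPrice = { UnitPrice: int; Qty: int }
type ItemKind = A | B | C | D
type Item = { Kind: ItemKind; Price: TotalPrice; Special: Special option }

let items =
    [ { Kind = A; Price = { UnitPrice = 50; Qty = 2 }; Special = Some ThreeForOneThirty }
      { Kind = B; Price = { UnitPrice = 30; Qty = 1 }; Special = Some TwoForFourtyFive }
      { Kind = A; Price = { UnitPrice = 50; Qty = 1 }; Special = Some ThreeForOneThirty } ]

// Group first, so every key maps to the list of all items of that kind.
let byKind : Map<ItemKind, Item list> =
    items
    |> List.groupBy (fun i -> i.Kind)
    |> Map.ofList
```

Kinds that do not occur in the list (C and D here) are simply absent from the map, so Map.tryFind is the safe way to look them up.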
If you want to create the dictionary with keys like A, B, C and D, it will not work, because A and B are constructors of type TotalPrice * Special -> Item while C and D are constructors of type TotalPrice -> Item. The dictionary needs a single key type.
Getting the DU constructor name should be doable by reflection, but is it really necessary for your case?
Maybe a different type structure would suit your case better, e.g. Fyodor Soikin's proposal.
Maybe the following will clarify somewhat why the data structure and code are no good, and also that this is mainly not related to FP, as some of the comments suggest.
My guess is that the question is related to "how can this be grouped", and lo and behold, there is in fact a groupBy function!
(*Types*)
type UnitPrice = int
type Qty = int
type Special =
| ThreeForOneThirty
| TwoForFourtyFive
type TotalPrice = { UnitPrice:int ; Qty:int }
type Item =
| A of TotalPrice * Special
| B of TotalPrice * Special
| C of TotalPrice
| D of TotalPrice
let items = [A ({UnitPrice=50; Qty=2} , ThreeForOneThirty)
             B ({UnitPrice=30; Qty=1} , TwoForFourtyFive)
             A ({UnitPrice=50; Qty=1} , ThreeForOneThirty)]
let speciallyStupidTransformation =
    function
    | ThreeForOneThirty -> 34130
    | TwoForFourtyFive -> 2445
let stupidTransformation =
    function
    | A (t,s) -> "A" + (s |> speciallyStupidTransformation |> string)
    | B (t,s) -> "B" + (s |> speciallyStupidTransformation |> string)
    | C (t) -> "C"
    | D(t) -> "D"
let someGrouping = items |> List.groupBy(stupidTransformation)
val it : (string * Item list) list =
  [("A34130",
    [A ({UnitPrice = 50;
         Qty = 2;},ThreeForOneThirty); A ({UnitPrice = 50;
                                           Qty = 1;},ThreeForOneThirty)]);
   ("B2445", [B ({UnitPrice = 30;
                  Qty = 1;},TwoForFourtyFive)])]
Yeah, it's still a bad idea. But it's somewhat uniquely grouped, and may be misused further to aggregate some sums or whatever.
Adding some more code for that, like the following:
let anotherStupidTransformation =
    function
    | A(t,_) -> (t.UnitPrice, t.Qty)
    | B(t,_) -> (t.UnitPrice, t.Qty)
    | C(t) -> (t.UnitPrice, t.Qty)
    | D(t) -> (t.UnitPrice, t.Qty)
let x4y x y tp q =
    if q%x = 0 then y*q/x else tp/q*(q%x)+(q-q%x)/x*y
let ``34130`` = x4y 3 130
let ``2445`` = x4y 2 45
let getRealStupidTotal =
    function
    | (s, (tp,q)) ->
        (s|> List.ofSeq, (tp,q))
        |> function
            | (h::t, (tp,q)) ->
                match t |> List.toArray |> System.String with
                | "34130" -> ``34130`` tp q
                | "2445" -> ``2445`` tp q
                | _ -> tp
let totalPrice =
    items
    |> List.groupBy(stupidTransformation)
    |> List.map(fun (i, l) ->
        i,
        l
        |> List.map(anotherStupidTransformation)
        |> List.unzip
        ||> List.fold2(fun acc e1 e2 ->
            ((fst acc + e1) * e2, snd acc + e2) ) (0,0))
    |> List.map(getRealStupidTotal)
    |> List.sum
val totalPrice : int = 160
This might or might not get some test cases right.
For the above test data, as far as I can read the initial code, it is at least OK: the sum does come to 160...
Would I use this code anywhere? Nope.
Is it readable? Nope.
Is it fixable? Not without changing the way the data are structured to avoid several of the stupid transformations...

A straightforward functional way to rename columns of a Deedle data frame

Is there a concise functional way to rename columns of a Deedle data frame f?
f.RenameColumns(...) is usable, but mutates the data frame it is applied to, so it's a bit of a pain to make the renaming operation idempotent. I have something like f.RenameColumns (fun c -> ( if c.IndexOf( "_" ) < 0 then c else c.Substring( 0, c.IndexOf( "_" ) ) ) + "_renamed"), which is ugly.
What would be nice is something that creates a new frame from the input frame, like this: Frame( f |> Frame.cols |> Series.keys |> Seq.map someRenamingFunction, f |> Frame.cols |> Series.values ) but this gets tripped up by the second part -- the type of f |> Frame.cols |> Series.values is not what is required by the Frame constructor.
How can I concisely transform f |> Frame.cols |> Series.values so that its result is edible by the Frame constructor?
You can pass the renaming function directly to RenameColumns:
df.RenameColumns someRenamingFunction
You can also use the function Frame.mapColKeys.
Builds a new data frame whose columns are the results of applying the
specified function on the columns of the input data frame. The
function is called with the column key and object series that
represents the column data.
Source
Example:
open Deedle

type Record = {Name:string; ID:int ; Amount:int}
let data =
    [|
        {Name = "Joe"; ID = 51; Amount = 50};
        {Name = "Tomas"; ID = 52; Amount = 100};
        {Name = "Eve"; ID = 65; Amount = 20};
    |]
let df = Frame.ofRecords data
let someRenamingFunction s =
    sprintf "%s(%i)" s s.Length
// Original frame
df.Format() |> printfn "%s"
// New frame with renamed columns; df itself is unchanged
let ndf = df |> Frame.mapColKeys someRenamingFunction
ndf.Format() |> printfn "%s"
// Renames the columns of df in place
df.RenameColumns someRenamingFunction
df.Format() |> printfn "%s"
Print:
     Name  ID Amount
0 -> Joe   51 50
1 -> Tomas 52 100
2 -> Eve   65 20

     Name(4) ID(2) Amount(6)
0 -> Joe     51    50
1 -> Tomas   52    100
2 -> Eve     65    20

     Name(4) ID(2) Amount(6)
0 -> Joe     51    50
1 -> Tomas   52    100
2 -> Eve     65    20
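On the idempotency concern from the question: whichever way the function is applied, the renaming function itself can be written so that applying it twice gives the same key as applying it once. A minimal sketch in plain F#, with no Deedle calls; renameColumn is a hypothetical name, and it mirrors the "strip everything from the first underscore, then append _renamed" logic in the question.

```fsharp
// Hypothetical renaming function: drops any existing "_suffix", then appends "_renamed".
let renameColumn (name: string) =
    let baseName =
        match name.IndexOf('_') with
        | -1 -> name                 // no underscore: keep the whole name
        | i -> name.Substring(0, i)  // keep only the part before the first underscore
    baseName + "_renamed"

let once = renameColumn "Amount"
let twice = renameColumn once
```

A function like this could then be passed to Frame.mapColKeys as in the example above, so the original frame is left untouched.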
