Trying to validate CSV data in F#

Just trying to wrap my head around some F# here and I'm having an issue.
I have a CSV file which looks like
CorrelationId,TagNumber,Description,CreationDate,UpdateDate,Discipline
8D3F96F3-938F-4599-BCA1-66B13199A39A,Test 70-2,Test tag - Ignore,2016-04-05 14:55:23.503,2016-04-05 14:55:23.503,Mechanical
A9FD4B9D-F7A1-4B7D-917F-D633EA0321E3,test-4,A test tag 24,2016-03-23 15:09:54.667,2016-03-30 17:35:29.553,Civil
And I'm reading it in using the CSV type provider
open FSharp.Data
type Tag = CsvProvider<"tags.csv">
let readTags (path:string) =
    let tags = Tag.Load(path)
    printfn "%i" ( tags.Rows |> Seq.length )
    let tag = tags.Rows |> Seq.head
Then I'd like to validate the rows so I took a hint from the fsharpforfunandprofit railway oriented programming.
type Result<'TSuccess,'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure
let bind switchFunction twoTrackInput =
    match twoTrackInput with
    | Success s -> switchFunction s
    | Failure f -> Failure f
let validateTagName tag =
    if String.length tag.TagNumber = 0 then Failure "Tag number cannot be empty"
    else Success tag
let validateTagDescription tag =
    if String.length tag.Description = 0 then Failure "Tag description cannot be empty"
    else Success tag
But the compiler complains that the validation functions need a type annotation, and I have no idea what type to annotate them with. I tried creating a new type and mapping to it
type tagType = { TagNumber: string; Description: string}
which made those functions compile properly but I just kicked the problem down the road because now I'm not sure how to map from the Tag.Row to tagType. Ideally I'd do this validation without having to do any mapping.
How should all this look?

You already have the Tag type from the type provider. With that particular data sample, it provides a nested type called Tag.Row. You can annotate your functions with that type:
let validateTagName (tag : Tag.Row) =
    if String.length tag.TagNumber = 0 then Failure "Tag number cannot be empty"
    else Success tag
let validateTagDescription (tag : Tag.Row) =
    if String.length tag.Description = 0 then Failure "Tag description cannot be empty"
    else Success tag
These functions compile.
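Once the annotation is in place, the two validators can be chained with the bind function from the question. The sketch below is only an illustration: so that it runs without tags.csv, it uses a hand-written class TagRow as a hypothetical stand-in for the provided Tag.Row, and validateTag is a made-up name.

```fsharp
// Same Result type and bind as in the question.
type Result<'TSuccess,'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure

let bind switchFunction twoTrackInput =
    match twoTrackInput with
    | Success s -> switchFunction s
    | Failure f -> Failure f

// Hypothetical stand-in for Tag.Row (the real type comes from CsvProvider).
type TagRow(tagNumber: string, description: string) =
    member this.TagNumber = tagNumber
    member this.Description = description

let validateTagName (tag : TagRow) =
    if String.length tag.TagNumber = 0 then Failure "Tag number cannot be empty"
    else Success tag

let validateTagDescription (tag : TagRow) =
    if String.length tag.Description = 0 then Failure "Tag description cannot be empty"
    else Success tag

// Run both checks; the first failure short-circuits the second check.
let validateTag = validateTagName >> bind validateTagDescription

let results =
    [ TagRow("test-4", "A test tag 24"); TagRow("", "No number") ]
    |> List.map validateTag
```

With the real provider, the same composition should apply to tags.Rows via Seq.map, as long as the validators are annotated with Tag.Row.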

To add to Mark's answer, the problem is that dotting into a class in the OOP way generally needs a type annotation,
because the type inference can't normally tell what type is being used.
For example, what is the type of x here?
let doSomething x = x.Length // error FS0072: Lookup on object of indeterminate type
Using a function attached to a module would give the type inference the information it needs:
let doSomething x = List.length x // Compiles OK
The type inference will normally work with record types that you have defined:
type tagType = { TagNumber: string; Description: string}
let doSomething x = x.TagNumber // Compiles OK
But in this case you are working with a class defined by a type provider, so the type inference is not working as well.
As Mark says, the easiest thing to do is to use a type annotation, in the way that he demonstrates.
The alternative would be to write a converter function from the type provider Tag type to your own MyTag type, and then do
let myTags = tags.Rows |> Seq.map convertToMyTag
to convert each row into your type. I sometimes do that when I want a more sophisticated domain type than just a simple record with fields.
In this scenario, though, that would be overkill (and you'd still need to add an annotation to the converter function!)
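For completeness, here is a rough sketch of what such a converter could look like. Everything in it is an assumption made for illustration: TagRow is a hand-written stand-in for the provided row type, and the Discipline union and the describe function are invented names. Note that the converter itself still carries an annotation, while the function working on the record does not need one.

```fsharp
// Hypothetical stand-in for the provided Tag.Row class.
type TagRow(tagNumber: string, description: string, discipline: string) =
    member this.TagNumber = tagNumber
    member this.Description = description
    member this.Discipline = discipline

// A richer domain type than a plain record of strings (illustrative only).
type Discipline =
    | Mechanical
    | Civil
    | Other of string

type MyTag = { TagNumber: string; Description: string; Discipline: Discipline }

// The converter dots into a class, so it needs the annotation.
let convertToMyTag (row : TagRow) : MyTag =
    let discipline =
        match row.Discipline with
        | "Mechanical" -> Mechanical
        | "Civil" -> Civil
        | other -> Other other
    { TagNumber = row.TagNumber; Description = row.Description; Discipline = discipline }

// After conversion, record field access is inferred without annotations.
let describe tag = sprintf "%s (%A)" tag.TagNumber tag.Discipline

let myTags =
    [ TagRow("Test 70-2", "Test tag - Ignore", "Mechanical")
      TagRow("test-4", "A test tag 24", "Civil") ]
    |> Seq.map convertToMyTag
    |> Seq.toList
```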
Finally, here are two posts that might be useful: understanding type inference
and troubleshooting common compiler errors.

Related

Result bind with different types?

I've made a simple Computation Expression workflow for dealing with Results.
[<RequireQualifiedAccess>]
module Result =
    type Builder() =
        member __.Bind(x, f) = x |> Result.bind f
        member __.Return(x) = x
        member __.ReturnFrom(x) = Ok x
    let workflow = Builder()
I also use different types to represent different kinds of errors:
type ValidationError<'a> = { Obj:'a; Message:string }
type InvalidOperationError = { Operation:string; Message:string }
The problem arises when two results have different error types.
LetterString.create : string -> Result<LetterString, ValidationError<string>>
Username.create : string -> Result<Username, ValidationError<string>>
PositiveDecimal.create : decimal -> Result<PositiveDecimal, ValidationError<decimal>>
let user =
    Result.workflow {
        let! name = LetterString.create "Tom"
        let! username = Username.create "Tom01098"
        // Error occurs here.
        let! balance = PositiveDecimal.create 100m
        return! {
            // User record creation elided.
        }
    }
FS0001 Type mismatch. Expecting a
'Result<PositiveDecimal,ValidationError<string>>'
but given a
'Result<PositiveDecimal,ValidationError<decimal>>'
I have already tried using a DU type of all errors:
type Error<'a> =
    | ValidationError of Obj:'a * Message:string
    | InvalidOperationError of Operation:string * Message:string
This has a similar problem when the generic parameter 'a is different between errors. It also loses the exact type of error in the type signature of the function.
The expected result is that the entire workflow has a unified error type, preferably as specific as possible in terms of type.
This can be solved by removing the generic parameter and using a single Error DU. Unfortunately this loses the signature I wanted, but it will have to do.
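A minimal sketch of that approach, under some assumptions: the validators below are simplified stand-ins for Username.create and PositiveDecimal.create, the union is called AppError with shortened case names to avoid clashing with the record type, and liftValidation is a made-up helper built on Result.mapError from FSharp.Core. The offending value is stored as a string, which is where the more specific signature is lost.

```fsharp
type ValidationError<'a> = { Obj: 'a; Message: string }

// Single, non-generic error type shared by the whole workflow.
type AppError =
    | Validation of Value: string * Message: string
    | InvalidOperation of Operation: string * Message: string

// Conventional builder: Return wraps in Ok, ReturnFrom passes a Result through.
type ResultBuilder() =
    member __.Bind(x, f) = Result.bind f x
    member __.Return(x) = Ok x
    member __.ReturnFrom(x : Result<_,_>) = x

let result = ResultBuilder()

// Hypothetical simplified validators with differently typed errors.
let createUsername (s: string) : Result<string, ValidationError<string>> =
    if String.length s > 0 then Ok s
    else Error { Obj = s; Message = "Username cannot be empty" }

let createBalance (d: decimal) : Result<decimal, ValidationError<decimal>> =
    if d > 0m then Ok d
    else Error { Obj = d; Message = "Balance must be positive" }

// Lift any typed validation error into the shared error type.
let liftValidation (r: Result<'T, ValidationError<'a>>) : Result<'T, AppError> =
    r |> Result.mapError (fun e -> Validation (string e.Obj, e.Message))

type User = { Username: string; Balance: decimal }

let createUser name balance =
    result {
        let! username = createUsername name |> liftValidation
        let! balance = createBalance balance |> liftValidation
        return { Username = username; Balance = balance }
    }
```

Here createUser has the type string -> decimal -> Result<User, AppError>, so every step of the workflow agrees on the error type.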

F# How to create an instance of a provided type

In my first attempt to create a type provider, I have a ProvidedTypeDefinition for a message:
// Message type
let mTy = ProvidedTypeDefinition(asm, ns, message.Key, Some(typeof<ValueType>),
                                 HideObjectMethods = true, IsErased = false)
// Direct buffer
let bufferField = ProvidedField("_directBuffer", typeof<IDirectBuffer>)
mTy.AddMember bufferField
let mCtor1 =
    ProvidedConstructor(
        [ProvidedParameter("buffer", typeof<IDirectBuffer>)],
        InvokeCode = fun args ->
            match args with
            | [this;buffer] ->
                Expr.FieldSet (this, bufferField, <@@ %%buffer:IDirectBuffer @@>)
            | _ -> failwith "wrong ctor params"
    )
mTy.AddMember mCtor1
Then I need to create an instance of that type in a method of another provided type. I am doing this:
let mMethod = ProvidedMethod(message.Key, [ProvidedParameter("buffer", typeof<IDirectBuffer>)], mTy)
mMethod.InvokeCode <- (fun [this;buffer] ->
    let c = mTy.GetConstructors().Last()
    Expr.NewObject(c, [ buffer ])
)
ILSpy shows the following C# code equivalent for the method:
public Car Car(IDirectBuffer buffer)
{
    return new Car(buffer);
}
and it also shows that the Car struct is present in the test assembly (this test assembly builds OK unless I access the Car method):
But when I try to create the Car via the method like this:
type CarSchema = SbeProvider<"Path\to\SBETypeProvider\SBETypeProvider\Car.xml">
module Test =
    let carSchema = CarSchema()
    let car = carSchema.Car(null)
I get the following errors:
The module/namespace 'SBETypeProvider' from compilation unit 'tmp5CDE' did not contain the namespace, module or type 'Car'
A reference to the type 'SBETypeProvider.Car' in assembly 'tmp5CDE' was found, but the type could not be found in that assembly
What am I doing wrong? The picture shows that the type is there. Why can't I create it?
I looked through many type providers on GitHub and cannot find a clear example of how to generate a ProvidedTypeDefinition from another one.
This might not be the problem, but at a quick glance it looks like the line you linked might actually be the issue:
let mTy = ProvidedTypeDefinition(asm, ns, message.Key, Some(typeof<ValueType>),
                                 HideObjectMethods = true, IsErased = false)
This type is being added to the ty provided type (the one that will actually be written to the temporary assembly) and so shouldn't have the assembly and namespace specified itself.
let mTy = ProvidedTypeDefinition(message.Key, Some(typeof<ValueType>),
                                 HideObjectMethods = true, IsErased = false)
Might work better. Generated types are a bit of a black art though, with very little documentation, so it's possible (probable?) that there will be other issues you might find.
On a more general note, for creating provided types what I normally end up doing is returning the provided constructor as a value which can then be embedded in the invoke code for other properties/functions using Expr.Call. This is especially important for erased types, as reflection will not work on them anyway.

How do I retrieve a value from a composite generic type?

How do I retrieve a value from a generic?
Specifically, I am attempting the following:
// Test
let result = Validate goodInput;;
// How to access record??
let request = getRequest result
Here's the code:
type Result<'TSuccess,'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure
let bind nextFunction lastFunctionResult =
    match lastFunctionResult with
    | Success input -> nextFunction input
    | Failure f -> Failure f
type Request = {name:string; email:string}
let validate1 input =
    if input.name = "" then Failure "Name must not be blank"
    else Success input
let validate2 input =
    if input.name.Length > 50 then Failure "Name must not be longer than 50 chars"
    else Success input
let validate3 input =
    if input.email = "" then Failure "Email must not be blank"
    else Success input;;
let Validate =
    validate1
    >> bind validate2
    >> bind validate3;;
// Setup
let goodInput = {name="Alice"; email="abc@abc.com"}
let badInput = {name=""; email="abc@abc.com"};;
// I have no clue how to do this...
let getRequest = function
    | "Alice", "abc@abc.com" -> {name="Scott"; email="xyz@xyz.com"}
    | _ -> {name="none"; email="none"}
// Test
let result = Validate goodInput;;
// How to access record??
let request = getRequest result
printfn "%A" result
You mean how do you extract the record out of your result type? Through pattern matching, that's what you're already doing in bind.
let getRequest result =
    match result with
    | Success input -> input
    | Failure msg -> failwithf "Invalid input: %s" msg
let result = Validate goodInput
let record = getRequest result
This will return the record or throw an exception. Up to you how you handle the success and failure cases once you have your Result - that could be throwing an exception, or turning it into option, or logging the message and returning a default etc.
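As a sketch of the non-throwing alternatives mentioned above, the two helpers below turn a result into an option, or log the message and return a default. The names toOption and orDefault are made up for this example, and the Result and Request types are repeated from the question so the snippet stands alone.

```fsharp
type Result<'TSuccess,'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure

type Request = {name:string; email:string}

// Keep only the success value and discard the failure message.
let toOption result =
    match result with
    | Success input -> Some input
    | Failure _ -> None

// Log the failure message to stderr and fall back to a supplied default.
let orDefault fallback result =
    match result with
    | Success input -> input
    | Failure msg ->
        eprintfn "Validation failed: %s" msg
        fallback

let good : Result<Request,string> = Success {name="Alice"; email="abc@abc.com"}
let bad : Result<Request,string> = Failure "Name must not be blank"

let maybeRequest = toOption good
let noRequest = toOption bad
let fallbackRequest = orDefault {name="none"; email="none"} bad
```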
This seems to be a frequently asked question: How do I get the value out of a monadic value? The correct answer, I believe, is Mu.
The monadic value is the value.
It's like asking, how do I get the value out of a list of integers, like [1;3;3;7]?
You don't; the list is the value.
Perhaps, then, you'd argue that lists aren't Discriminated Unions; they have no mutually exclusive cases, like the above Result<'TSuccess,'TFailure>. Consider, instead, a tree:
type Tree<'a> = Node of Tree<'a> list | Leaf of 'a
This is another Discriminated Union. Examples include:
let t1 = Leaf 42
let t2 = Node [Node []; Node[Leaf 1; Leaf 3]; Node[Leaf 3; Leaf 7]]
How do you get the value out of a tree? You don't; the tree is the value.
Like 'a option in F#, the above Result<'TSuccess,'TFailure> type (really, it's the Either monad) is deceptive, because it seems like there should only be one value: the success. The failure we don't like to think about (just like we don't like to think about None).
The type, however, doesn't work like that. The failure case is just as important as the success case. The Either monad is often used to model error handling, and the entire point of it is to have a type-safe way to deal with errors, instead of exceptions, which are nothing more than specialised, non-deterministic GOTO blocks.
This is the reason the Result<'TSuccess,'TFailure> type comes with bind, map, and lots of other goodies.
A monadic type is what Scott Wlaschin calls an 'elevated world'. While you work with the type, you're not supposed to pull data out of that world. Rather, you're supposed to elevate data and functions up to that world.
Going back to the above code, imagine that given a valid Request value, you'd like to send an email to that address. Therefore, you write the following (impure) function:
let send { name = name; email = email } =
    // Send email using name and email
    ()
This function has the type Request -> unit. Notice that it's not elevated into the Either world. Still, you want to send the email if the request was valid, so you elevate the send method up to the Either world:
let map f = bind (fun x -> Success (f x))
let run = validate1 >> bind validate2 >> bind validate3 >> map send
The run function has the type Request -> Result<unit,string>, so used with goodInput and badInput, the results are the following:
> run goodInput;;
val it : Result<unit,string> = Success unit
> run badInput;;
val it : Result<unit,string> = Failure "Name must not be blank"
And then you probably ask: and how do I get the value out of that?
The answer to that question depends entirely on what you want to do with the value, but, imagine that you want to report the result of run back to the user. Displaying something to the user often involves some text, and you can easily convert a result to a string:
let reportOnRun = function
    | Success () -> "Email was sent."
    | Failure msg -> msg
This function has the type Result<unit,string> -> string, so you can use it to report on any result:
> run goodInput |> reportOnRun;;
val it : string = "Email was sent."
> run badInput |> reportOnRun;;
val it : string = "Name must not be blank"
In all cases, you get back a string that you can display to the user.
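For what it's worth, FSharp.Core (F# 4.1 and later) ships a built-in Result type with the cases Ok and Error, plus Result.bind and Result.map, so the hand-rolled type, bind and map are no longer strictly needed. The following is only a sketch of the same pipeline on top of the built-in type; the send function here is a stand-in that records the address in a list instead of sending anything.

```fsharp
type Request = {name:string; email:string}

let validate1 input =
    if input.name = "" then Error "Name must not be blank"
    else Ok input

let validate2 input =
    if input.name.Length > 50 then Error "Name must not be longer than 50 chars"
    else Ok input

let validate3 input =
    if input.email = "" then Error "Email must not be blank"
    else Ok input

// Stand-in for the impure send function: it only records the address.
let sent = System.Collections.Generic.List<string>()
let send { name = _; email = email } = sent.Add email

// Result.bind and Result.map play the roles of bind and map above.
let run =
    validate1
    >> Result.bind validate2
    >> Result.bind validate3
    >> Result.map send

let reportOnRun = function
    | Ok () -> "Email was sent."
    | Error msg -> msg

let goodReport = run {name="Alice"; email="abc@abc.com"} |> reportOnRun
let badReport = run {name=""; email="abc@abc.com"} |> reportOnRun
```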

Type annotation for using a F# TypeProvider type e.g. FSharp.Data.JsonProvider<...>.DomainTypes.Url

I'm using the FSharp.Data.JsonProvider to read Twitter Tweets.
Playing with this sample code
https://github.com/tpetricek/Documents/tree/master/Samples/Twitter.API
I want to expand the urls in the tweet with
let expandUrl (txt:string) (url:Search.DomainTypes<...>.DomainTypes.Url) =
    txt.Replace( url.Url, url.ExpandedUrl )
This results in Error:
Lookup on object of indeterminate type based on information prior to this program point.
A type annotation may be needed prior to this program point to constrain the type of the object.
My problem is how to define the TypeProvider Type for url in the expandUrl function above?
The type inference shows me this
val urls : FSharp.Data.JsonProvider<...>.DomainTypes.Url []
but this is not accepted in the type declaration. I assume "<...>" is not F# syntax.
How to do a type annotation for using a TypeProvider type e.g. FSharp.Data.JsonProvider<...>.DomainTypes.Url ?
Here is the complete code snippet:
open TwitterAPI // github.com/tpetricek/Documents/tree/master/Samples/Twitter.API
let twitter = TwitterAPI.TwitterContext( _consumerKey, _consumerSecret, _accessToken, _accessTokenSecret )
let query = "water"
let ts = Twitter.Search.Tweets(twitter, Utils.urlEncode query, count=100)
let ret =
    [ for x in ts.Statuses do
        // val urls : FSharp.Data.JsonProvider<...>.DomainTypes.Url []
        let urls = x.Entities.Urls
        // fully annotated to help the type inference at expandUrl
        let replace (txt:string) (oldValue:string) (newValue:string) =
            txt.Replace( oldValue, newValue)
        // Error:
        // Lookup on object of indeterminate type based on information prior to this program point.
        // A type annotation may be needed prior to this program point to constrain the type of the object.
        // This may allow the lookup to be resolved.
        let expandUrl (txt:string) (url:FSharp.Data.JsonProvider<_>.DomainTypes.Url) =
            replace txt url.Url url.ExpandedUrl
        let textWithExpandedUrls = Array.fold expandUrl x.Text urls
        yield textWithExpandedUrls
    ]
When you call Twitter.Search.Tweets (https://github.com/tpetricek/Documents/blob/master/Samples/Twitter.API/Twitter.fs#L284), the return type of that is one of the domain types of TwitterTypes.SearchTweets, which is a type alias for JsonProvider<"references\\search_tweets.json"> (https://github.com/tpetricek/Documents/blob/master/Samples/Twitter.API/Twitter.fs#L183).
Although in the tooltip it shows up as JsonProvider<...>.DomainTypes.Url, you'll have to use the type alias TwitterTypes.SearchTweets.DomainTypes.Url
I had a similar problem trying to figure out how to use the FSharp.Data HtmlProvider.
I am using Wikipedia to get information about USA presidents. The HtmlProvider does a great job of discovering the various tables in that webpage, but I wanted to extract the logic for processing a row of "president data" into a separate function called processRow.
And the problem was trying to work out what the type of such a row is for processRow's parameter row. The following code does the trick:
#load "Scripts\load-references.fsx"
open FSharp.Data
let presidents = new HtmlProvider<"https://en.wikipedia.org/wiki/List_of_Presidents_of_the_United_States">()
let ps = presidents.Tables.``List of presidents``
ps.Headers |> Option.map (fun hs -> for h in hs do printf "%s " h)
printfn ""
type Presidents = ``HtmlProvider,Sample="https://en.wikipedia.org/wiki/List_of_Presidents_of_the_United_States"``.ListOfPresidents
let processRow (row:Presidents.Row) =
printfn "%d %s" row.``№`` row.President2
ps.Rows |> Seq.iter processRow
I did not type in the long type alias for Presidents, I used Visual Studio auto-completion by guessing that the type for List of presidents would be discoverable from something starting with Html, and it was, complete with the four single back quotes.

Creating record type in F#

I would like to craft a simple record type based on fields provided.
That is :
let rectype = MakeRecordType(["fieldname1"; "fieldname2"])
Going directly to type providers looks like heavy firepower for such a simple task.
Is there any other way?
update
I found the following question, which looks very similar
Creating F# record through reflection
Putting aside the usefulness of the end result, a snippet below achieves exactly what you asked for in the spirit of my other related answer:
#if INTERACTIVE
#r @"C:\Program Files (x86)\Microsoft F#\v4.0\FSharp.Compiler.dll"
#r @"C:\Program Files (x86)\FSharpPowerPack-1.9.9.9\bin\FSharp.Compiler.CodeDom.dll"
#endif
open System
open System.CodeDom.Compiler
open Microsoft.FSharp.Compiler.CodeDom
open Microsoft.FSharp.Reflection
type RecordTypeMaker (typeName: string, records: (string*string) []) =
    let _typeDllName = "Synth"+typeName+".dll"
    let _code =
        let fsCode = new System.Text.StringBuilder()
        fsCode.Append("module ").Append(typeName).Append(".Code\ntype ").Append(typeName).Append(" = {") |> ignore
        for rec' in records do fsCode.Append(" ").Append(fst rec').Append(" : ").Append(snd rec').Append(";\n") |> ignore
        fsCode.Append("}").ToString()
    let _compiled =
        use provider = new FSharpCodeProvider()
        let options = CompilerParameters([||], _typeDllName)
        let result = provider.CompileAssemblyFromSource( options, [|_code|] )
        result.Errors.Count = 0
    let mutable _type: Type = null
    member __.RecordType
        with get() =
            if _compiled && _type = null then
                _type <- Reflection.Assembly.LoadFrom(_typeDllName).GetType(typeName+".Code+"+typeName)
            _type
A sketch implementation of RecordTypeMaker accepts an arbitrary Record type definition containing type name and array of field names accompanied by field type names. Then behind the curtain it assembles a piece of F# code defining the requested Record type, compiles this code via CodeDom provider, loads container assembly and provides access to this newly created synthetic Record type via Reflection. A test snippet
let myType = RecordTypeMaker("Test", [|("Field1", "string"); ("Field2", "int")|]).RecordType
printfn "IsRecordType=%b" (FSharpType.IsRecord(myType))
printfn "Record fields: %A" (FSharpType.GetRecordFields(myType))
demonstrates for a purely synthetic type myType the proof of concept:
IsRecordType=true
Record fields: [|System.String Field1; Int32 Field2|]
