I've been working my way through this article on ROP, and have reached the section "Converting simple functions to the railway-oriented programming model" where he explains how to fit other kinds of functions into the model.
He gave the example of wanting to clean an email address (function name different as I'm English!)...
let cleanEmail person =
    printfn "Cleaning email for \"%s\"" person.Name
    let newPerson = { person with Email = person.Email.Trim().ToLower() }
    printfn " email is now \"%s\"" newPerson.Email
    newPerson
I added some printfn to see what was happening. This function works fine, in that if I pass it an input with an email that has uppercase letters and/or leading/trailing blanks, the output is a value with a clean, lowercase email address...
let shouter = { Name = "Jim"; Email = " JIM#SHOUT.com "}
cleanEmail shouter
Gives the output...
Cleaning email for "Jim"
email is now "jim#shout.com"
val it : Request = {Name = "Jim";
Email = "jim#shout.com";}
However, if I wrap this function in a switch, and insert it into the chain of validation functions, the original email address is used. For example, with validation functions like this...
let validateNameNotBlank person =
    printfn "Validating name not blank for \"%s\"" person.Name
    if person.Name = "" then Failure "Name must not be blank"
    else Success person
...an operator like this...
let (>=>) switch1 switch2 input =
    match switch1 input with
    | Success s -> switch2 input
    | Failure f -> Failure f
...and bound together like this...
let validate =
    validateNameNotBlank
    >=> validateNameLength
    >=> switch cleanEmail
    >=> validateEmail
...then when I pass shouter in, I get the following out...
Validating name not blank for "Jim"
Validating name length for "Jim"
Cleaning email for "Jim"
email is now "jim#shout.com"
Validating email for "Jim" ( JIM#SHOUT.com )
val it : Result<Request,string> = Success {Name = "Jim";
Email = " JIM#SHOUT.com ";}
You can see from the printfn I put in that the cleanEmail function is cleaning the email, but the output shows that the original email is passed to the next function in the chain.
What have I missed here?
If you look at the article, the operator has the definition
let (>=>) switch1 switch2 x =
    match switch1 x with
    | Success s -> switch2 s
    | Failure f -> Failure f
which when correctly translated to your variables is:
let (>=>) switch1 switch2 input =
    match switch1 input with
    | Success s -> switch2 s
    | Failure f -> Failure f
You never actually use the s from the Success so the same input just keeps getting passed through.
A close look at the inferred type of >=> would have revealed the problem: your version gives switch1 and switch2 the same input type, whereas in the article's version the input type of switch2 is the success type of switch1.
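To make the type difference concrete, here is a minimal sketch with both versions side by side. The names composeBuggy, composeFixed, trim and notEmpty are made up for illustration (they are not from the article), and the types in the comments are what I would expect F# Interactive to infer.

```fsharp
type Result<'TSuccess,'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure

// Version from the question: the Success payload is discarded,
// so switch2 receives the original input again.
// Expected inferred type:
//   ('a -> Result<'b,'c>) -> ('a -> Result<'d,'c>) -> 'a -> Result<'d,'c>
let composeBuggy switch1 switch2 input =
    match switch1 input with
    | Success _ -> switch2 input
    | Failure f -> Failure f

// Version from the article: the Success payload feeds switch2.
// Expected inferred type:
//   ('a -> Result<'b,'c>) -> ('b -> Result<'d,'c>) -> 'a -> Result<'d,'c>
let composeFixed switch1 switch2 input =
    match switch1 input with
    | Success s -> switch2 s
    | Failure f -> Failure f

// Hypothetical switches, for illustration only
let trim (s: string) : Result<string,string> = Success (s.Trim())
let notEmpty (s: string) = if s = "" then Failure "empty" else Success s

let buggyResult = composeBuggy trim notEmpty " JIM "  // untrimmed input survives
let fixedResult = composeFixed trim notEmpty " JIM "  // trimmed value is passed on
```

With the buggy operator the second switch still sees " JIM ", which is the same symptom as the uncleaned email in the question.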
(Newbie question).
In F#, please assume an array of CompanyNames of the format:
"BLUE CROSS BLUE SHIELD OF ALABAMA, BIRMINGHAM"
This from the below where GetAllCompanies reads from the database the company names and cities:
CompanyNames = GetAllCompanies |> Array.map (fun i -> sprintf "%s , %s" i.companyname i.city )
This fails, but is my best attempt at going the other way. That is, given the company name and city as a string like above, I want to get back the company it came from (y is the company,city string from above):
let company = Array.pick (fun (k:company1) ->
    match sprintf "%s , %s" k.companyname k.city with
    | y -> Some k
    | _ -> None ) GetAllCompanies
The compiler gives the warning on _ as: "This rule will never be matched".
How is this done?
TIA
Just to clarify, I will be getting a string of "company name"+","+"city" as typed in by the user. I need to check if this string matches a composite string of "company name"+","+"city" that was built from the company details and then return all the company details when it matches. GetAllCompanies returns an array of company details. Thanks.
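I believe you are asking the wrong question. It seems you're trying to find the first company that matches the user input. The warning arises because | y -> is a variable pattern: it binds a new name y that matches anything, rather than comparing against the existing y, so the | _ case can never be reached. Unless you want to transform the found company, Array.pick is not the right function; Array.tryFind fits better. It returns an option, so you also don't get an exception if the company is not found.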
type Company = { Name: string; City: string }

let companies =
    [|
        { Name = "Foo"; City = "Oslo"}
        { Name = "Bar"; City = "Lillehammer"}
    |]

let userInput = "Bar,Lillehammer"

let company =
    companies
    |> Array.tryFind (fun c -> c.Name + "," + c.City = userInput)
Now you can match on company.
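As a rough illustration of that match, the sketch below wraps the lookup in a function and turns both outcomes into a message. The describe helper and its wording are my own invention, not part of the question.

```fsharp
type Company = { Name: string; City: string }

let companies =
    [|
        { Name = "Foo"; City = "Oslo" }
        { Name = "Bar"; City = "Lillehammer" }
    |]

// Hypothetical helper: look the input up and describe the outcome as text
let describe (userInput: string) =
    match companies |> Array.tryFind (fun c -> c.Name + "," + c.City = userInput) with
    | Some c -> sprintf "Found %s in %s" c.Name c.City
    | None -> sprintf "No company matches \"%s\"" userInput

let hit = describe "Bar,Lillehammer"
let miss = describe "Baz,Oslo"
```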
There are still several weaknesses. Two of them: What if several companies match the user input? I'd use Array.filter then, if you want to present all of them to the user during interactive input. What if the user types spaces, e.g. a space after the comma? Or if there are commas in the company info?
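One possible way to address the first two weaknesses is sketched below: normalise both sides before comparing, and use Array.filter to return every match. The normalise and findCompanies names are hypothetical, and ignoring case is my assumption about what "matching" should mean here. Commas inside a company name remain ambiguous with this approach.

```fsharp
type Company = { Name: string; City: string }

let companies =
    [|
        { Name = "Foo"; City = "Oslo" }
        { Name = "Bar"; City = "Lillehammer" }
    |]

// Hypothetical normalisation: trim each comma-separated part and ignore case
let normalise (text: string) =
    text.Split(',')
    |> Array.map (fun part -> part.Trim().ToLowerInvariant())
    |> String.concat ","

// Returns every company whose normalised "name,city" equals the normalised input
let findCompanies (userInput: string) =
    let wanted = normalise userInput
    companies
    |> Array.filter (fun c -> normalise (c.Name + "," + c.City) = wanted)

let matches = findCompanies " bar , LILLEHAMMER "
let noMatches = findCompanies "nope"
```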
What causes the REPL to print a function signature instead of function result?
I am attempting to execute the following line:
let email = Email "abc.com";;
email |> sendMessage |> ignore;;
The code is as follows
type PhoneNumber =
    { CountryCode:int
      Number:string }

type ContactMethod =
    | Email of string
    | PhoneNumber of PhoneNumber

let sendMessage contact = function
    | Email _ -> printf "Sending message via email"
    | PhoneNumber phone -> printf "Sending message via phone"
// c. Create two values, one for the email address case and
// one for the phone number case, and pass them to sendMessage.
let email = Email "abc.com";;
email |> sendMessage |> ignore;;
I get the following result:
type PhoneNumber =
{CountryCode: int;
Number: string;}
type ContactMethod =
| Email of string
| PhoneNumber of PhoneNumber
val sendMessage : contact:'a -> _arg1:ContactMethod -> unit
val email : ContactMethod = Email "abc.com"
>
val it : unit = ()
I expected something like this:
"Sending message via email"
Your sendMessage function takes two arguments: one named contact of unrestricted type 'a and an anonymous (_arg1 in the signature) ContactMethod.
When you supply email to sendMessage you get a function which takes a ContactMethod and returns unit. You then ignore this function.
Either remove the contact parameter (more idiomatic):
let sendMessage = function
    | Email _ -> printf "Sending message via email"
    | PhoneNumber phone -> printf "Sending message via phone"
or match on it (might be easier to understand):
let sendMessage contact =
    match contact with
    | Email _ -> printf "Sending message via email"
    | PhoneNumber phone -> printf "Sending message via phone"
Now, sendMessage is of type ContactMethod -> unit and you don't need to ignore anymore.
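The difference can be seen in a small sketch. To make the results easy to inspect, these illustrative variants (describeTwoArgs and describeOneArg, names of my own choosing) return a string instead of printing.

```fsharp
type PhoneNumber = { CountryCode: int; Number: string }

type ContactMethod =
    | Email of string
    | PhoneNumber of PhoneNumber

// Two parameters: the named `contact` plus the implicit one introduced by `function`
let describeTwoArgs contact = function
    | Email _ -> "email"
    | PhoneNumber _ -> "phone"

// One parameter: only the implicit argument of `function`
let describeOneArg = function
    | Email _ -> "email"
    | PhoneNumber _ -> "phone"

let email = Email "abc.com"

// Partial application: this is a function ContactMethod -> string, nothing has run yet
let pending = email |> describeTwoArgs

// Only the second argument is matched; `email` was bound to the unused `contact`
let viaPending = pending (PhoneNumber { CountryCode = 44; Number = "123" })

// With one parameter the pipe applies the function completely
let direct = email |> describeOneArg
```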
How do I retrieve a value from a generic?
Specifically, I am attempting the following:
// Test
let result = Validate goodInput;;
// How to access record??
let request = getRequest result
Here's the code:
type Result<'TSuccess,'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure

let bind nextFunction lastFunctionResult =
    match lastFunctionResult with
    | Success input -> nextFunction input
    | Failure f -> Failure f

type Request = {name:string; email:string}

let validate1 input =
    if input.name = "" then Failure "Name must not be blank"
    else Success input

let validate2 input =
    if input.name.Length > 50 then Failure "Name must not be longer than 50 chars"
    else Success input

let validate3 input =
    if input.email = "" then Failure "Email must not be blank"
    else Success input;;

let Validate =
    validate1
    >> bind validate2
    >> bind validate3;;

// Setup
let goodInput = {name="Alice"; email="abc#abc.com"}
let badInput = {name=""; email="abc#abc.com"};;

// I have no clue how to do this...
let getRequest = function
    | "Alice", "abc#abc.com" -> {name="Scott"; email="xyz#xyz.com"}
    | _ -> {name="none"; email="none"}
// Test
let result = Validate goodInput;;
// How to access record??
let request = getRequest result
printfn "%A" result
You mean, how do you extract the record out of your result type? Through pattern matching, which is what you're already doing in bind.
let getRequest result =
    match result with
    | Success input -> input
    | Failure msg -> failwithf "Invalid input: %s" msg

let result = Validate goodInput
let record = getRequest result
This will return the record or throw an exception. Up to you how you handle the success and failure cases once you have your Result - that could be throwing an exception, or turning it into option, or logging the message and returning a default etc.
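Two of those alternatives might look like the following sketch. The names toOption and orDefault, the placeholder record and the example email address are assumptions of mine, not anything from the question.

```fsharp
type Result<'TSuccess,'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure

type Request = { name: string; email: string }

// Turn the result into an option, discarding the error message
let toOption result =
    match result with
    | Success value -> Some value
    | Failure _ -> None

// Log the message to stderr and fall back to a caller-supplied default
let orDefault fallback result =
    match result with
    | Success value -> value
    | Failure msg ->
        eprintfn "Validation failed: %s" msg
        fallback

let good : Result<Request,string> = Success { name = "Alice"; email = "alice@example.com" }
let bad : Result<Request,string> = Failure "Name must not be blank"
let placeholder = { name = "none"; email = "none" }

let goodOption = toOption good
let badOption = toOption bad
let recovered = orDefault placeholder bad
```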
This seems to be a frequently asked question: How do I get the value out of a monadic value? The correct answer, I believe, is Mu.
The monadic value is the value.
It's like asking, how do I get the value out of a list of integers, like [1;3;3;7]?
You don't; the list is the value.
Perhaps, then, you'd argue that lists aren't Discriminated Unions; they have no mutually exclusive cases, like the above Result<'TSuccess,'TFailure>. Consider, instead, a tree:
type Tree<'a> = Node of Tree<'a> list | Leaf of 'a
This is another Discriminated Union. Examples include:
let t1 = Leaf 42
let t2 = Node [Node []; Node[Leaf 1; Leaf 3]; Node[Leaf 3; Leaf 7]]
How do you get the value out of a tree? You don't; the tree is the value.
Like 'a option in F#, the above Result<'TSuccess,'TFailure> type (really, it's the Either monad) is deceptive, because it seems like there should only be one value: the success. The failure we don't like to think about (just like we don't like to think about None).
The type, however, doesn't work like that. The failure case is just as important as the success case. The Either monad is often used to model error handling, and the entire point of it is to have a type-safe way to deal with errors, instead of exceptions, which are nothing more than specialised, non-deterministic GOTO blocks.
This is the reason the Result<'TSuccess,'TFailure> type comes with bind, map, and lots of other goodies.
A monadic type is what Scott Wlaschin calls an 'elevated world'. While you work with the type, you're not supposed to pull data out of that world. Rather, you're supposed to elevate data and functions up to that world.
Going back to the above code, imagine that given a valid Request value, you'd like to send an email to that address. Therefore, you write the following (impure) function:
let send { name = name; email = email } =
    // Send email using name and email
    ()
This function has the type Request -> unit. Notice that it's not elevated into the Either world. Still, you want to send the email if the request was valid, so you elevate the send method up to the Either world:
let map f = bind (fun x -> Success (f x))
let run = validate1 >> bind validate2 >> bind validate3 >> map send
The run function has the type Request -> Result<unit,string>, so used with goodInput and badInput, the results are the following:
> run goodInput;;
val it : Result<unit,string> = Success unit
> run badInput;;
val it : Result<unit,string> = Failure "Name must not be blank"
And then you probably ask: and how do I get the value out of that?
The answer to that question depends entirely on what you want to do with the value, but, imagine that you want to report the result of run back to the user. Displaying something to the user often involves some text, and you can easily convert a result to a string:
let reportOnRun = function
    | Success () -> "Email was sent."
    | Failure msg -> msg
This function has the type Result<unit,string> -> string, so you can use it to report on any result:
> run goodInput |> reportOnRun;;
val it : string = "Email was sent."
> run badInput |> reportOnRun;;
val it : string = "Name must not be blank"
In all cases, you get back a string that you can display to the user.
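For anyone who wants to try this in F# Interactive, the pieces can be assembled into one script roughly as follows. This is only a sketch: send is a stand-in that prints instead of sending anything, validate2 is left out for brevity, and the email address is a made-up example.

```fsharp
type Result<'TSuccess,'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure

type Request = { name: string; email: string }

let bind nextFunction lastFunctionResult =
    match lastFunctionResult with
    | Success input -> nextFunction input
    | Failure f -> Failure f

// Lift an ordinary function into the Result world
let map f = bind (fun x -> Success (f x))

let validate1 input =
    if input.name = "" then Failure "Name must not be blank"
    else Success input

let validate3 input =
    if input.email = "" then Failure "Email must not be blank"
    else Success input

// Stand-in for the impure send function: it only prints
let send { name = name; email = email } =
    printfn "Sending email to %s <%s>" name email

let run = validate1 >> bind validate3 >> map send

let reportOnRun = function
    | Success () -> "Email was sent."
    | Failure msg -> msg

let goodReport = run { name = "Alice"; email = "alice@example.com" } |> reportOnRun
let badReport = run { name = ""; email = "alice@example.com" } |> reportOnRun
```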
I need help understanding the concepts behind the following:
I have this:
type Result<'TSuccess,'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure
But this doesn't work:
let getRequest = function
    | Success input -> input
    | Failure msg -> msg
But this does:
let getRequest result =
    match result with
    | Success input -> input
    | Failure msg -> failwithf "Invalid input: %s" msg
Why does the initial "getRequest" fail?
Again, I just don't understand the basic rules for pattern matching.
Can someone please shed some light on this?
The entire code is here:
module Core

type Result<'TSuccess,'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure

let bind nextFunction lastFunctionResult =
    match lastFunctionResult with
    | Success input -> nextFunction input
    | Failure f -> Failure f

type Request = {Name:string; Email:string}

let validate1 input =
    if input.Name = "" then Failure "Name must not be blank"
    else Success input

let validate2 input =
    if input.Name.Length > 50 then Failure "Name must not be longer than 50 chars"
    else Success input

let validate3 input =
    if input.Email = "" then Failure "Email must not be blank"
    else Success input;;

let Validate =
    validate1
    >> bind validate2
    >> bind validate3;;

// Setup
let goodInput = {Name="Alice"; Email="abc#abc.com"}
let badInput = {Name=""; Email="abc#abc.com"};;

let getRequest = function
    | Success input -> input
    | Failure msg -> msg

// Test
let result = Validate goodInput;;
let request = getRequest result;;
Given this definition
let getRequest = function
    | Success input -> input
    | Failure msg -> msg
we can reason about its type as follows:
Since it's a function, it has type ?1 -> ?2 for some not-yet-known placeholder types.
The input uses the Success and Failure union cases, so the input must itself actually be some type Result<?3,?4>, and the function's type is Result<?3,?4> -> ?2.
The types of each branch must be the same as the function's return type.
Looking at the first branch, we see this means that ?3 = ?2.
Looking at the second branch, this means that ?4 = ?2.
Therefore the overall type of the function must be Result<?2,?2> -> ?2, or using real F# notation, Result<'a,'a> -> 'a where 'a can be any type - this is a generic function definition.
But Validate has type Request -> Result<Request, string>, so its output isn't consistent with this definition of getRequest: the success and failure types are different, so you can't pass the result of one directly to the other.
On the other hand, with
let getRequest = function
    | Success input -> input
    | Failure msg -> failwithf "Invalid input: %s" msg
we would analyze it like this:
As before, any function has type ?1 -> ?2.
As before, the input must clearly be a Result<?3,?4>.
But analyzing the branches, we get a different result:
The first branch again leads to the constraint ?3 = ?2.
But the second branch is different - failwithf with that format pattern takes a string and throws an exception (and can have any return type, since it never returns normally), so we see msg must be a string, so ?4 = string.
So the overall type is Result<'a,string> -> 'a, which now allows you to pass a Result<Request,string> to it, as desired.
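The two inferred types can be written out as explicit annotations to check the reasoning. In this sketch the names collapse and getOrThrow are mine, chosen only to keep the two variants apart.

```fsharp
type Result<'TSuccess,'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure

// Both branches return their payload, so both payload types must coincide
let collapse (result: Result<'a,'a>) : 'a =
    match result with
    | Success input -> input
    | Failure msg -> msg

// failwithf never returns normally, so only the failure type is pinned to string
let getOrThrow (result: Result<'a,string>) : 'a =
    match result with
    | Success input -> input
    | Failure msg -> failwithf "Invalid input: %s" msg

let sameTypes : Result<string,string> = Failure "oops"
let collapsed = collapse sameTypes

let mixedTypes : Result<int,string> = Success 42
let extracted = getOrThrow mixedTypes

// collapse mixedTypes   // would not compile: int and string are different types
```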
I think your question actually gives a good platform to explain pattern matching.
Let's start by thinking about the types involved. Your getRequest function must have some type signature getRequest : 'a -> 'b, i.e. it takes something of one type and returns something of another. But let's look at what you've written:
let getRequest =
    function
    | Success input -> input // this branch would return 'TSuccess
    | Failure msg -> msg // this branch would return 'TFailure
You can't have a function whose type signature is 'a -> 'b or 'c. (The only way to make this function compile is if 'b and 'c can be constrained to be the same type, then we do simply have a consistent 'a -> 'b).
But wait, enter the discriminated union.
Your discriminated union has type Result<'TSuccess,'TFailure>. Or, to describe things as I did above type 'd<'b,'c>. Now, we absolutely can have a function whose type signature is 'a -> 'd<'b,'c>.
'd<'b, 'c> is just a generic type which involves 'b and it involves 'c. The type signature includes all of these things, so hopefully you can see that we can contain both a 'TSuccess and 'TFailure in this type with no problem.
So, if we want to return something which can contain different combinations of types, we've found the perfect use case for the discriminated union. It can contain the 'TSuccess or the 'TFailure in one type. You can then choose between those results using pattern matching.
let getRequest =
    function
    | Success input -> printfn "%A" input // do something with success case
    | Failure msg -> printfn "%A" msg // do something with failure case
So long as we have consistent return types in each case, we can insert any behaviour we want.
On to the next point of why throwing an exception is okay. Let's look at the type signature of the failwithf function: failwithf : StringFormat<'T,'Result> -> 'T. The failwithf function takes a string format and returns some type 'T.
So let's look at your second function again
let getRequest result =
    match result with
    | Success input -> input // type 'TSuccess
    | Failure msg -> failwithf "Invalid input: %s" msg // type 'T inferred to be 'TSuccess
The function signature is consistent, 'a -> 'TSuccess.
Hi I'm looking to find the best way to read in a fixed width text file using F#. The file will be plain text, from one to a couple of thousand lines long and around 1000 characters wide. Each line contains around 50 fields, each with varying lengths. My initial thoughts were to have something like the following
type MyRecord = {
    Name : string
    Address : string
    Postcode : string
    Tel : string
}
let format = [
    (0,10)
    (10,50)
    (50,7)
    (57,20)
]
and read each line one by one, assigning each field by the format tuple (where the first item is the start character and the second is the number of characters wide).
Any pointers would be appreciated.
The hardest part is probably to split a single line according to the column format. It can be done something like this:
let splitLine format (line : string) =
    format |> List.map (fun (index, length) -> line.Substring(index, length))
This function has the type (int * int) list -> string -> string list. In other words, format is an (int * int) list. This corresponds exactly to your format list. The line argument is a string, and the function returns a string list.
You can map a list of lines like this:
let result = lines |> List.map (splitLine format)
You can also use Seq.map or Array.map, depending on how lines is defined. Such a result will be a string list list, and you can now map over such a list to produce a MyRecord list.
You can use File.ReadLines to get a lazily evaluated sequence of strings from a file.
Please note that the above is only an outline of a possible solution. I left out boundary checks, error handling, and such. The above code may contain off-by-one errors.
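As one way of filling in the missing pieces, the sketch below adds a simple boundary check and the mapping to MyRecord. The layout in format is a made-up small example rather than the real file's layout, and splitLineSafe, toRecord and readRecords are hypothetical names. Trimming the padding from each field is also my assumption.

```fsharp
open System.IO

type MyRecord = { Name: string; Address: string; Postcode: string; Tel: string }

// Hypothetical layout as (start index, width) pairs; not the real file's layout
let format = [ (0, 5); (5, 8); (13, 4); (17, 6) ]

// Like splitLine, but tolerant of lines that end early
let splitLineSafe format (line: string) =
    format
    |> List.map (fun (index, length) ->
        if index >= line.Length then ""
        else line.Substring(index, min length (line.Length - index)).Trim())

// Build a record only when exactly four fields were produced
let toRecord fields =
    match fields with
    | [ name; address; postcode; tel ] ->
        Some { Name = name; Address = address; Postcode = postcode; Tel = tel }
    | _ -> None

// File.ReadLines is lazy, so lines are split as they are consumed
let readRecords (path: string) =
    File.ReadLines path
    |> Seq.map (splitLineSafe format)
    |> Seq.choose toRecord

// The last field is one character shorter than its declared width
let sample = "Pat".PadRight(5) + "Road 1".PadRight(8) + "AB12" + "12345"
let parsed = sample |> splitLineSafe format |> toRecord
```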
Here's a solution with a focus on custom validation and error handling for each field. This might be overkill for a data file consisting of just numeric data!
First, for these kinds of things, I like to use the parser in Microsoft.VisualBasic.dll as it's already available without using NuGet.
For each row, we can return the array of fields, and the line number (for error reporting)
#r "Microsoft.VisualBasic.dll"

// for each row, return the line number and the fields
let parserReadAllFields fieldWidths textReader =
    let parser = new Microsoft.VisualBasic.FileIO.TextFieldParser(reader=textReader)
    parser.SetFieldWidths fieldWidths
    parser.TextFieldType <- Microsoft.VisualBasic.FileIO.FieldType.FixedWidth
    seq {while not parser.EndOfData do
            yield parser.LineNumber,parser.ReadFields() }
Next, we need a little error handling library (see http://fsharpforfunandprofit.com/rop/ for more)
type Result<'a> =
    | Success of 'a
    | Failure of string list

module Result =

    let succeedR x =
        Success x

    let failR err =
        Failure [err]

    let mapR f xR =
        match xR with
        | Success a -> Success (f a)
        | Failure errs -> Failure errs

    let applyR fR xR =
        match fR,xR with
        | Success f,Success x -> Success (f x)
        | Failure errs,Success _ -> Failure errs
        | Success _,Failure errs -> Failure errs
        | Failure errs1, Failure errs2 -> Failure (errs1 @ errs2)
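To see how applyR accumulates errors rather than stopping at the first one, here is a small standalone sketch. The functions makePair, parseId and checkName are hypothetical and only loosely modelled on the validators in this answer; the operators are defined with explicit arguments to keep them generic at the top level.

```fsharp
type Result<'a> =
    | Success of 'a
    | Failure of string list

let mapR f xR =
    match xR with
    | Success a -> Success (f a)
    | Failure errs -> Failure errs

let applyR fR xR =
    match fR, xR with
    | Success f, Success x -> Success (f x)
    | Failure errs, Success _ -> Failure errs
    | Success _, Failure errs -> Failure errs
    | Failure errs1, Failure errs2 -> Failure (errs1 @ errs2)

let (<!>) f xR = mapR f xR
let (<*>) fR xR = applyR fR xR

// Hypothetical two-field example
let makePair (id: int) (name: string) = (id, name)

let parseId (raw: string) =
    match System.Int32.TryParse(raw) with
    | true, id -> Success id
    | false, _ -> Failure [ sprintf "Can't parse id '%s'" raw ]

let checkName (raw: string) =
    if System.String.IsNullOrWhiteSpace raw then Failure [ "Name cannot be blank" ]
    else Success raw

let ok = makePair <!> parseId "01" <*> checkName "name1"
let bothBad = makePair <!> parseId "yy" <*> checkName " "
```

In bothBad the two error lists are concatenated, so both problems are reported in one pass.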
Then define your domain model. In this case, it is the record type with a field for each field in the file.
type MyRecord =
    {id:int; name:string; description:string}
And then you can define your domain-specific parsing code. For each field I have created a validation function (validateId, validateName, etc).
Fields that don't need validation can pass through the raw data (validateDescription).
In fieldsToRecord the various fields are combined using applicative style (<!> and <*>).
For more on this, see http://fsharpforfunandprofit.com/posts/elevated-world-3/#validation.
Finally, readRecords maps each input row to a record Result and keeps only the successful ones. The failed ones are written to a log in handleResult.
module MyFileParser =
    open Result

    let createRecord id name description =
        {id=id; name=name; description=description}

    let validateId (lineNo:int64) (fields:string[]) =
        let rawId = fields.[0]
        match System.Int32.TryParse(rawId) with
        | true, id -> succeedR id
        | false, _ -> failR (sprintf "[%i] Can't parse id '%s'" lineNo rawId)

    let validateName (lineNo:int64) (fields:string[]) =
        let rawName = fields.[1]
        if System.String.IsNullOrWhiteSpace rawName then
            failR (sprintf "[%i] Name cannot be blank" lineNo )
        else
            succeedR rawName

    let validateDescription (lineNo:int64) (fields:string[]) =
        let rawDescription = fields.[2]
        succeedR rawDescription // no validation

    let fieldsToRecord (lineNo,fields) =
        let (<!>) = mapR
        let (<*>) = applyR
        let validatedId = validateId lineNo fields
        let validatedName = validateName lineNo fields
        let validatedDescription = validateDescription lineNo fields
        createRecord <!> validatedId <*> validatedName <*> validatedDescription

    /// print any errors and only return good results
    let handleResult result =
        match result with
        | Success record -> Some record
        | Failure errs -> printfn "ERRORS %A" errs; None

    /// return a sequence of records
    let readRecords parserOutput =
        parserOutput
        |> Seq.map fieldsToRecord
        |> Seq.choose handleResult
Here's an example of the parsing in practice:
// Set up some sample text
let text = """01name1description1
02name2description2
xxname3badid-------
yy badidandname
"""
// create a low-level parser
let textReader = new System.IO.StringReader(text)
let fieldWidths = [| 2; 5; 11 |]
let parserOutput = parserReadAllFields fieldWidths textReader
// convert to records in my domain
let records =
    parserOutput
    |> MyFileParser.readRecords
    |> Seq.iter (printfn "RECORD %A") // print each record
The output will look like:
RECORD {id = 1;
name = "name1";
description = "description";}
RECORD {id = 2;
name = "name2";
description = "description";}
ERRORS ["[3] Can't parse id 'xx'"]
ERRORS ["[4] Can't parse id 'yy'"; "[4] Name cannot be blank"]
By no means is this the most efficient way to parse a file (I think there are some CSV parsing libraries available on NuGet that can do validation while parsing) but it does show how you can have complete control over validation and error handling if you need it.
A record of 50 fields is a bit unwieldy, so alternative approaches that allow dynamic generation of the data structure may be preferable (e.g. System.Data.DataRow).
If it has to be a record anyway, you could spare at least the manual assignment to each record field and populate it with the help of Reflection instead. This trick relies on the field order as they are defined. I am assuming that every column of fixed width represents a record field, so that start indices are implied.
open Microsoft.FSharp.Reflection

type MyRecord = {
    Name : string
    Address : string
    City : string
    Postcode : string
    Tel : string } with
    static member CreateFromFixedWidth format (line : string) =
        let fields =
            format
            |> List.fold (fun (index, acc) length ->
                let str = line.[index .. index + length - 1].Trim()
                index + length, box str :: acc )
                (0, [])
            |> snd
            |> List.rev
            |> List.toArray
        FSharpValue.MakeRecord(
            typeof<MyRecord>,
            fields ) :?> MyRecord
Example data:
"Postman Pat     " +
"Farringdon Road " +
"London          " +
"EC1A 1BB" +
"+44 20 7946 0813"
|> MyRecord.CreateFromFixedWidth [16; 16; 16; 8; 16]
// val it : MyRecord = {Name = "Postman Pat";
// Address = "Farringdon Road";
// City = "London";
// Postcode = "EC1A 1BB";
// Tel = "+44 20 7946 0813";}
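Because this trick depends on the widths lining up with the record's fields, it may be worth checking the field count up front. The following is a sketch of a generic variant under my own assumptions: every field of the record is a string, short lines are padded rather than rejected, and the Contact type and createFromFixedWidth name are hypothetical.

```fsharp
open Microsoft.FSharp.Reflection

// Hypothetical record; every field must be a string for this sketch to work
type Contact = { Name: string; City: string }

let createFromFixedWidth<'T> (widths: int list) (line: string) : 'T =
    // Fail early if the number of widths does not match the number of record fields
    let expected = FSharpType.GetRecordFields(typeof<'T>).Length
    if widths.Length <> expected then
        invalidArg "widths" (sprintf "Expected %d widths but got %d" expected widths.Length)
    // Pad short lines so Substring cannot run past the end
    let padded = line.PadRight(List.sum widths)
    // Thread the running start index through the list of widths
    let values, _ =
        List.mapFold
            (fun index width ->
                let field = padded.Substring(index, width).Trim()
                box field, index + width)
            0
            widths
    FSharpValue.MakeRecord(typeof<'T>, List.toArray values) :?> 'T

let contact = createFromFixedWidth<Contact> [4; 6] "Pat London"
let shortLine = createFromFixedWidth<Contact> [4; 6] "Pat"
```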