Define the cons (::) operator for custom collections

I am using the fairly popular FSharpx.Collections package, and in particular the NonEmptyList type.
This type provides the NonEmptyList.cons function, but I want to use the :: operator as with regular List, i.e. head :: tail. Since tail must already be a NonEmptyList<'a>, there shouldn't be any conflict with List's :: operator.
However, it seems I cannot define the operator. This:
let ( :: ) h t = NonEmptyList.cons h t
results in a compilation error:
Unexpected symbol '::' in pattern. Expected ')' or other token.
I know that :: is not quite in the same category as other operators, but I don't fully understand how. So I tried a few things more or less at random, such as replacing :: with op_cons and the like, without success.
Am I missing something, and is there a way to do what I want to do?

According to MSDN, a colon cannot actually be used in an operator name. This seems to contradict the F# specification from FSharp.org; I'm not sure what's going on there. But we can verify that in FSI:
> let ( >:> ) a b = a+b
Script.fsx(1,7): error FS0035: This construct is deprecated: ':' is not permitted as a character in operator names and is reserved for future use
If you look at how List<'T> is defined, you'll find that (::) is not actually an operator, but a case constructor:
type List<'T> =
| ( [] )
| ( :: ) of Head: 'T * Tail: 'T list
And sure enough, you can define your own DU type with that as constructor name:
> type A =
> | ( :: ) of string * int
> | A of int
>
> let a = "abc" :: 5
val a : A = Cons ("abc",5)
Now, oddly, if I try to use another operator-ish-looking name as case constructor, I get this error:
> type A = | ( |> ) of string * int
Script.fsx(1,14): error FS0053: Discriminated union cases and exception labels must be uppercase identifiers
Which means that (::) is somehow special (and so is ([]), by the way).
So the bottom line seems to be - no, you can't do that.
But why do you even need to? Can you, perhaps, settle for a more acceptable operator name, which would still express the semantics of "cons" - like, say, (<+>)?
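As a rough sketch of that last suggestion: the NonEmpty type and module below are a hypothetical stand-in for FSharpx.Collections.NonEmptyList, written only so the example runs on its own, and (<+>) is just one possible name. One caveat I'm fairly sure of: operators starting with < are left-associative, unlike ::, so chained uses need parentheses.

```fsharp
// Hypothetical stand-in for FSharpx.Collections.NonEmptyList, defined here
// only to keep the sketch self-contained.
type NonEmpty<'a> = { Head: 'a; Tail: 'a list }

module NonEmpty =
    let singleton x = { Head = x; Tail = [] }
    let cons x (ne: NonEmpty<'a>) = { Head = x; Tail = ne.Head :: ne.Tail }
    let toList (ne: NonEmpty<'a>) = ne.Head :: ne.Tail

// (<+>) contains no colon, so it is accepted as an ordinary operator name.
let (<+>) h t = NonEmpty.cons h t

// Parentheses are needed: unlike (::), this operator is left-associative.
let xs = 1 <+> (2 <+> NonEmpty.singleton 3)

printfn "%A" (NonEmpty.toList xs)   // prints [1; 2; 3]
```

With the real package, the operator body would presumably just call NonEmptyList.cons, as in the question.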

Related

How do I extract useful information from the payload of a GADT / existential type?

I'm trying to use Menhir's incremental parsing API and introspection APIs in a generated parser. I want to, say, determine the semantic value associated with a particular LR(1) stack entry; i.e. a token that's been previously consumed by the parser.
Given an abstract parsing checkpoint, encapsulated in Menhir's type 'a env, I can extract a “stack element” from the LR automaton; it looks like this:
type element =
| Element: 'a lr1state * 'a * position * position -> element
The type element describes one entry in the stack of the LR(1) automaton. In a stack element of the form Element (s, v, startp, endp), s is a (non-initial) state and v is a semantic value. The value v is associated with the incoming symbol A of the state s. In other words, the value v was pushed onto the stack just before the state s was entered. Thus, for some type 'a, the state s has type 'a lr1state and the value v has type 'a ...
In order to do anything useful with the value v, one must gain information about the type 'a, by inspection of the state s. So far, the type 'a lr1state is abstract, so there is no way of inspecting s. The inspection API (§9.3) offers further tools for this purpose.
Okay, cool! So I go and dive into the inspection API:
The type 'a terminal is a generalized algebraic data type (GADT). A value of type 'a terminal represents a terminal symbol (without a semantic value). The index 'a is the type of the semantic values associated with this symbol ...
type _ terminal =
| T_A : unit terminal
| T_B : int terminal
The type 'a nonterminal is also a GADT. A value of type 'a nonterminal represents a nonterminal symbol (without a semantic value). The index 'a is the type of the semantic values associated with this symbol ...
type _ nonterminal =
| N_main : thing nonterminal
Piecing these together, I get something like the following (where "command" is one of my grammar's nonterminals, and thus N_command is a string nonterminal):
let current_command (env : 'a env) =
  let rec f i =
    match Interpreter.get i env with
    | None -> None
    | Some (Interpreter.Element (lr1state, v, _startp, _endp)) ->
      match Interpreter.incoming_symbol lr1state with
      | Interpreter.N Interpreter.N_command -> Some v
      | _ -> f (i + 1)
  in
  f 0
Unfortunately, this is puking up very confusing type-errors for me:
File "src/incremental.ml", line 110, characters 52-53:
Error: This expression has type string but an expression was expected of type
string
This instance of string is ambiguous:
it would escape the scope of its equation
This is a bit above my level! I'm pretty sure I understand why I can't do what I tried to do above; but I don't understand what my alternatives are. In fact, the Menhir manual specifically mentions this complexity:
This function can be used to gain access to the semantic value v in a stack element Element (s, v, _, _). Indeed, by case analysis on the symbol incoming_symbol s, one gains information about the type 'a, hence one obtains the ability to do something useful with the value v.
Okay, but that's what I thought I did, above: case-analysis by match'ing on incoming_symbol s, pulling out the case where v is of a single, specific type: string.
tl;dr: how do I extract the string payload from this GADT, and do something useful with it?

If your error sounds like
This instance of string is ambiguous:
it would escape the scope of its equation
it means that the type checker is not really sure if outside of the pattern matching branch the type of v should be a string, or another type that is equal to string but only inside the branch. You just need to add a type annotation when leaving the branch to remove this ambiguity:
| Interpreter.(N N_command) -> Some (v:string)

FParsec and pipe3 make the arguments explicit or add a type notation

I am trying to use the pipe3 function from the FParsec library but I get an error I don't know how to solve.
Given the Record
type Point = { x: float; y: float }
and the following parser
let plistoffloats' =
pipe3 pfloat (pchar ',' .>> spaces) pfloat
(fun first z second -> { x = first; y = second })
What I try to achieve is a parser that receives a string in format "1.1, 3.7" and returns a Point
run plistoffloats' "1.1, 3.7"
Input : "1.1, 3.7"
Desired output : Point = {x = 1.1; y = 3.7;}
Error :
error FS0030: Value restriction. The value 'plistoffloats'' has been inferred to have generic type
val plistoffloats' : Parser <Point,'__a>
Either make the arguments to 'plistoffloats'' explicit or, if you do not intend for it to be generic, add a type annotation.
A simpler example with pchar also didn't work.
let parsesA = pchar 'a'
error FS0030: Value restriction. The value 'parsesA' has been inferred to have generic type
val parsesA : Parser<char,'_a>
Either make the arguments to 'parsesA' explicit or, if you do not intend for it to be generic, add a type annotation.

This is covered in the FParsec documentation; it will happen with any parser. The reason is that in the .NET type system, functions are allowed to be generic, but values are not, and in FParsec you're generally defining parsers as values (e.g., you're typically writing let psomething = ... where psomething takes no parameters). Read the linked documentation page for the whole explanation (I won't copy and paste the whole thing), but the short version is that you can do one of two things:
1. Create a test function that looks like the following, and make sure it's used within the same source file on your parser:
let test p str =
    match run p str with
    | Success(result, _, _) -> printfn "Success: %A" result
    | Failure(errorMsg, _, _) -> printfn "Failure: %s" errorMsg
2. Annotate your parser with a type annotation like the following:
type UserState = unit // You might change this later
let plistoffloats' : Parser<_, UserState> =
    // ...
It sounds like you're trying to do #1, but unless your parser is called with test plistoffloats' in the same source file, the F# type inference won't be able to infer your user state type and will give you that error.
P.S. You can read more about the F# value restriction error here: Understanding F# Value Restriction Errors
P.P.S. The _ in the first position of Parser<_, UserState> does not mean "This type could be anything" the way _ does in other contexts like pattern matching. Instead, _ in a type annotation means "Please infer this type for me so that I don't have to specify it explicitly". In FParsec contexts, this is very useful because all your parsers will have UserState as their second type argument, but will have a varying type for the first type argument. And since the first type argument is the one that the type inference can infer, it means that you can copy and paste the type Parser<_, UserState> to all your parsers and F# will Do The Right Thing™ in each case.
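To see the annotation at work without installing FParsec, here is a minimal sketch. The Parser type and pchar function below are hypothetical miniatures written for this example; FParsec's real definitions are different. The only point is that a value built by applying a function stays generic in its user-state parameter until something pins it down.

```fsharp
// Hypothetical miniature of a parser type with a user-state parameter;
// FParsec's real Parser type is defined differently.
type Parser<'Result, 'UserState> = string * 'UserState -> ('Result * 'UserState) option

// Parses one specific character at the start of the input.
let pchar (c: char) : Parser<char, 'u> =
    fun (input: string, state) ->
        if input.Length > 0 && input.[0] = c then Some (c, state) else None

type UserState = unit

// Without this annotation, and without a later use in the same file that
// fixes 'u, the compiler reports the FS0030 value restriction error here,
// because `pchar 'a'` is a function application whose result is generic.
let parsesA : Parser<_, UserState> = pchar 'a'

match parsesA ("abc", ()) with
| Some (ch, _) -> printfn "parsed %c" ch   // prints: parsed a
| None -> printfn "no match"
```

The _ in the annotation is inferred as char, while UserState is fixed to unit, which mirrors option 2 above.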

Why can't you write "(::) 1 [2]" the way you can write "(+) 1 2" in F#?

Put an F# infix operator in brackets, and it behaves like a function,
let foo = (*) 3 2 // foo = 6
let factorial n = [2..n] |> List.fold (*) 1 // n!
However, this doesn't work with the :: operator (cons operator),
let ls = (::) 1 [2..5] // Error: Unexpected symbol '::' in binding.
What's the reason for this?

You can use the static method:
let ls = List.Cons (1, [2..5])
or the operator's verbose name:
let ls = op_ColonColon (1, [2..5])
(checked with F# 3.0; older versions may behave differently. For instance, MSDN suggests op_Cons)
In both cases, there's no way to curry the arguments here. Numeric operators are defined like this:
let inline ( * ) (x:int) (y:int) = ...
The list cons constructor, however, requires a tuple, and this also answers your question,
What's the reason for this?
In fact, (::) is not a usual operator (a standalone function or a type member), but a union case. Here's how List<'T> is defined in the F# sources:
type List<'T> =
| ([]) : 'T list
| (::) : Head: 'T * Tail: 'T list -> 'T list
So, if your purpose is partial application of arguments, the only nice solution would be writing a wrapper function as @pad has suggested.

Because (::) (and [] for that matter) is a symbolic keyword, you can't expect to use it as an infix operator. See F# specification, section 3.6 Symbolic keywords.
In this case, you have to define an extra function e.g.
let cons x xs = x :: xs
let ls = cons 1 [2..5]
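For illustration, a small sketch of what the wrapper buys you compared with the tupled form. The name cons is the one used above; prefixed and ls are just example bindings.

```fsharp
// Curried wrapper around the (::) union case.
let cons x xs = x :: xs

// Partial application works: prepend 0 to every inner list.
let prefixed = List.map (cons 0) [ [1]; [2; 3] ]
printfn "%A" prefixed   // prints [[0; 1]; [0; 2; 3]]

// The union case itself can only be applied to a complete tuple.
let ls = List.Cons (1, [2..5])
printfn "%A" ls         // prints [1; 2; 3; 4; 5]
```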

Inconsistent behavior in pattern matching

type Bar = A | B
type Foo = C of Bar | D of Bar
let case = Unchecked.defaultof<Foo>;;
match case with
| C A -> ""
| C B -> ""
| _ -> "Matches";;
match case with
| C A -> ""
| D B -> ""
| _ -> "Throws"
Quickly skimming over the F# Language Spec, nothing about null tests (which I can't do anyway) seemed related, and both types seem to be reference types (AFAIK).
I would assume the behavior in the first case to be correct one.

I think all bets are off once you use Unchecked.defaultof<_> (thus the "Unchecked" ;-)). Null isn't considered a valid value for type Foo from an F# perspective (although it is from the .NET perspective), so I don't think that the pattern matching semantics are defined.
What is it that you're trying to do?

So reading the spec, the first hint is here
b) Types with null as an abnormal value. These are types that do
not admit the null literal, but do have null as an abnormal value.
Types in this category are:
o All F# list, record, tuple, function, class and interface types.
o All F# union types apart from those with null as a normal value
(as discussed in the next paragraph).
For these types, the use of the null literal is not directly
permitted. However it is, strictly speaking, possible to generate a
null value for these types using certain functions such as
Unchecked.defaultof. For these types, null is considered an
abnormal value. The behavior of operations with respect to null values
is defined in §6.9.
This does seem to suggest that passing in a null value for your union could be some sort of undefined behaviour, as the value is "abnormal". Section 6.9 isn't particularly helpful.
Looking at the definition for _, it seems like you are right that this is a bug - it states
7.1.7 Wildcard Patterns
The pattern _ is a wildcard pattern and matches any input. For
example:
let categorize x =
match x with
| 1 -> 0
| 0 -> 1
| _ -> 0
I think the most relevant hints, though, are later on, where the compiled methods for DUs are listed:
8.5.3 Compiled Form of Union Types for Use from Other CLI Languages
A compiled union type U will have:
· One CLI static getter property U.C for each nullary union
case C. This will get a singleton object representing that case.
· One CLI nested type U.C for each non-nullary union case C.
This type will have instance properties Item1, Item2.... for each
field of the union case, or a single instance property Item if there
is only one field. A compiled union type with only one case does not
have a nested type. Instead, the union type itself plays the role of
the case type.
· One CLI static method U.NewC for each non-nullary union case
C. This will construct an object for that case.
· One CLI instance property u.IsC for each case C that returns
true or false for the case.
· One CLI instance property u.Tag for each case C that fetches
or computes an integer tag corresponding to the case.
From this, you can see that all of the members used for checking are instance members, which require a non-null object. Since null is "abnormal", the generated code doesn't bother checking, so it throws.
I think you could argue that this is in fact a bug, based on the definition of _. However, fixing it would require inserting null checks before every DU pattern matching check, which would slow the code down significantly, so I doubt that this will be fixed.

A pattern match against a DU with two cases compiles to an if/else with type tests. If you translate your examples, the behavior is apparent.
match case with
| C A -> ""
| C B -> ""
| _ -> "Matches"
translates to
if (case is C)
...
else
"Matches"
and
match case with
| C A -> ""
| D B -> ""
| _ -> "Throws"
translates to
if (case is D)
...
else //must be C
//check tag BANG! NRE
And kvb's example, match case with | C _ -> "C" | D _ -> "D", translates to
if (case is D)
//...
else //must be C
"C"
I suppose you could view this as a reasonable optimization (vs. if (case is D) {...} else if (case is C) {...} else { MatchFailureException }) given that null behavior is undefined.
Add a third case to Foo and the problem goes away.

The description of the method Unchecked.defaultof ends with the sentence "This function is unsafe in the sense that some F# values don't have proper null values", which is exactly the case here.
Try this:
let isNull = (case = null) ;;
let isNull = (case = null) ;;
---------------------^^^^
stdin(3,22): error FS0043: The type 'Foo' does not have 'null' as a proper value
>
DUs are intended to be immutable and don't have proper null values.
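If a null can still reach such a match (say, from Unchecked.defaultof or from interop), one possible workaround is to box the value and test the reference before matching. This is only a sketch; the describe function is a made-up helper, not something from the question.

```fsharp
type Bar = A | B
type Foo = C of Bar | D of Bar

// Hypothetical helper: guard against the "abnormal" null value explicitly,
// since 'case = null' is rejected by the compiler for this type.
let describe (case: Foo) =
    if obj.ReferenceEquals(box case, null) then "null (abnormal value)"
    else
        match case with
        | C A -> "C A"
        | D B -> "D B"
        | _ -> "other"

printfn "%s" (describe (Unchecked.defaultof<Foo>))   // prints: null (abnormal value)
printfn "%s" (describe (D B))                        // prints: D B
```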

I'm not able to confirm this at the moment, but wouldn't adding the AllowNullLiteral attribute to the types allow the match to succeed?

F# how to write an empty statement

How can I write a no-op statement in F#?
Specifically, how can I improve the second clause of the following match statement:
match list with
| [] -> printfn "Empty!"
| _ -> ignore 0

Use unit for empty side effect:
match list with
| [] -> printfn "Empty!"
| _ -> ()

The answer from Stringer is, of course, correct. I thought it may be useful to clarify how this works, because "()" isn't really an empty statement or empty side effect...
In F#, every valid piece of code is an expression. Constructs like let and match consist of some keywords, patterns and several sub-expressions. The F# grammar for let and match looks like this:
<expr> ::= let <pattern> = <expr>
           <expr>
       ::= match <expr> with
           | <pat> -> <expr>
This means that the body of let or the body of a clause of match must be some expression. It can be some function call such as ignore 0 or it can be some value - in your case it must be some expression of type unit, because printfn ".." is also of type unit.
The unit type is a type that has only one value, which is written as () (and it also means empty tuple with no elements). This is, indeed, somewhat similar to void in C# with the exception that void doesn't have any values.
BTW: The following code may look like a sequence of statements, but it is also an expression:
printf "Hello "
printf "world"
The F# compiler implicitly adds ; between the two lines and ; is a sequencing operator, which has the following structure: <expr>; <expr>. It requires that the first expression returns unit and returns the result of the second expression.
This is a bit surprising when you're coming from a C# background, but it makes the language surprisingly elegant and concise. It doesn't limit you in any way - you can for example write:
if (a < 10 && (printfn "demo"; true)) then // ...
(This example isn't really useful - just a demonstration of the flexibility)
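To make the points above concrete, here is a small sketch (the names nothing, report and answer are made up for the example) showing that () is a plain value and that a sequence of expressions is itself an expression:

```fsharp
// () is an ordinary value of type unit, so it can be bound like any other.
let nothing : unit = ()

// Both branches of the match are expressions of type unit.
let report list =
    match list with
    | [] -> printfn "Empty!"
    | _ -> ()

report ([] : int list)   // prints Empty!
report [1; 2]            // prints nothing

// Sequencing is an expression too: its value is that of the last part.
let answer = (printf "Hello "; 42)
printfn "world %d" answer   // together the two calls print: Hello world 42
```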
