How to use a three parametered infix operator?
Eg.: base function is
let orElse labelFunc p1 p2 = {...} and operator let ( <|> ) = orElse
Now, for non-infix version this works nicely:List.reduce ((<|>) labelFunc) parserList.
Can I use it somehow still "infixing"? eg.: (p1 (<|> labelFunc) p1) does not work nor any other combination, other than using the non-infix version here as well.
First of all, I think it's best to restrict the number of custom operators you're using in your code, because custom operators make F# code hard to read. F# lets you define custom operators, but it's not particularly designed to make this a great experience - it makes sense for some small domain-specific languages (like parser combinators), but not much else.
So, while I do not recommend using this, there is a weird trick that you can use to write something like p1 (<op> l) p2, which is to make <op> infix and replace the parentheses with two more custom operators:
let (</) a b = a, b
let (/>) c d = c, d
let (!) f = f
1 </ !10 /> 2
This sample just produces a tuple with all three arguments, but if you implement your logic in the </ operator, it will actually do something like what you want. But as I said, I would not do this :-).
I don't believe there is any good way to achieve that. Once you have a parenthesized expression it won't be parsed as an operator - even 1 (+) 1 doesn't work.
Related
On wikipedia it says that using call/cc you can implement the amb operator for nondeterministic choice, and my question is how would you implement the amb operator in a language in which the only support for continuations is to write in continuation passing style, like in erlang?
If you can encode the constraints for what constitutes a successful solution or choice as guards, list comprehensions can be used to generate solutions. For example, the list comprehension documentation shows an example of solving Pythagorean triples, which is a problem frequently solved using amb (see for example exercise 4.35 of SICP, 2nd edition). Here's the more efficient solution, pyth1/1, shown on the list comprehensions page:
pyth1(N) ->
[ {A,B,C} ||
A <- lists:seq(1,N-2),
B <- lists:seq(A+1,N-1),
C <- lists:seq(B+1,N),
A+B+C =< N,
A*A+B*B == C*C
].
One important aspect of amb is efficiently searching the solution space, which is done here by generating possible values for A, B, and C with lists:seq/2 and then constraining and testing those values with guards. Note that the page also shows a less efficient solution named pyth/1 where A, B, and C are all generated identically using lists:seq(1,N); that approach generates all permutations but is slower than pyth1/1 (for example, on my machine, pyth(50) is 5-6x slower than pyth1(50)).
If your constraints can't be expressed as guards, you can use pattern matching and try/catch to deal with failing solutions. For example, here's the same algorithm in pyth/1 rewritten as regular functions triples/1 and the recursive triples/5:
-module(pyth).
-export([triples/1]).
triples(N) ->
triples(1,1,1,N,[]).
triples(N,N,N,N,Acc) ->
lists:reverse(Acc);
triples(N,N,C,N,Acc) ->
triples(1,1,C+1,N,Acc);
triples(N,B,C,N,Acc) ->
triples(1,B+1,C,N,Acc);
triples(A,B,C,N,Acc) ->
NewAcc = try
true = A+B+C =< N,
true = A*A+B*B == C*C,
[{A,B,C}|Acc]
catch
error:{badmatch,false} ->
Acc
end,
triples(A+1,B,C,N,NewAcc).
We're using pattern matching for two purposes:
In the function heads, to control values of A, B and C with respect to N and to know when we're finished
In the body of the final clause of triples/5, to assert that conditions A+B+C =< N and A*A+B*B == C*C match true
If both conditions match true in the final clause of triples/5, we insert the solution into our accumulator list, but if either fails to match, we catch the badmatch error and keep the original accumulator value.
Calling triples/1 yields the same result as the list comprehension approaches used in pyth/1 and pyth1/1, but it's also half the speed of pyth/1. Even so, with this approach any constraint could be encoded as a normal function and tested for success within the try/catch expression.
Lets see the code snippet:
pSegmentBegin p i = pIndentExact i *> ((:) <$> p i <*> ((pEOL *> pSegment p i) <|> pure []))
if I change this code in my parser to:
pSegmentBegin p i = do
pIndentExact i
((:) <$> p i <*> ((pEOL *> pSegment p i) <|> pure []))
I've got an error:
canot compute minmal length of a parser due to occurrence of a moadic bind, use addLength to override
I thought the above parser should behave the same way. Why this error can occur?
EDIT
The above example is very simple (to simplify the question) and as noted below it is not necessary to use do notation here, but the real case I wanted it to use is as follows:
pSegmentBegin p i = do
j <- pIndentAtLast i
(:) <$> p j <*> ((pEOL *> pSegments p j) <|> pure [])
I have noticed that adding "addLength 1" before the do statement solves the problem, but I'm unsure if its a correct solution:
pSegmentBegin p i = addLength 2 $ do
j <- pIndentAtLast i
(:) <$> p j <*> ((pEOL *> pSegments p j) <|> pure [])
As I have mentioned many times the monadic interface should be avoided whenever possible. let me try to explain why the applicative interface is to be preferred.
One of the distinctive features of my library is that it performs error correction by inserting or deleting problems. Of course we can take an umlimited look-ahead here but that would make the process VERY expensive. So we take only a limited lookahead of three steps.
Now suppose we have to insert an expression and one of the expression alternatives is:
expr := "if" expr "then" expr "else" expr
then we want to exclude this alternative since choosing this alternative would necessitate the insertion of another expression etc. So we perform an abstract interpretation of the alternatives and make sure that in case of a draw (i.e. equal costs for the limited lookahead) we take one of the non-recursive alternatives.
Unfortunately this scheme breaks down when one writes monadic parsers, since the length of the right hand side of the bind may depend on the result of the left-hand side. So we issue the error message, and ask some help from the programmer to indicate the number of tokens this alternative might consume. The actual value does not matter so much, as long as you do not provide a finite length for something which is recursive and may lead to infinite insertions. It is only used to select the shortest alternative in case of an insertion.
This abstract interpretation has some costs and if you write all your parsers in monadic style it is unavoidable that this analysis is repeated over an over again. so: DO NOT WRITE MONADIC STYLE PARSERS WHEN USING THIS LIBRARY IF THERE IS AN APPLICATIVE ALTERNATIVE.
It's trying to statically analyze how much input needs to be read in order to optimize performance, but that kind of optimization requires a statically known parser structure—the kind that can be built by Applicatives since the parser effect cannot depend upon the parser value such what (>>=) does.
So that's what goes wrong—when you use do notation it translates to a Monadic bind which breaks the Applicative predictor. It'd be nice if the library only exposed one of the two interfaces so that this kind of error cannot happen, but instead there's some inconsistency if you use both interfaces together in the same parser.
Since this use of do is strictly unnecessary—you're not using the extra power the monadic interface gives you—it's probably better to just avoid it.
I have a workaround I use with monadic parsers in uuparsinglib. Its a self-answer here:
Monadic parse with uu-parsinglib
You may find it useful
Is there a way to write an infix function not using symbols? Something like this:
let mod x y = x % y
x mod y
Maybe a keyword before "mod" or something.
The existing answer is correct - you cannot define an infix function in F# (just a custom infix operator). Aside from the trick with pipe operators, you can also use extension members:
// Define an extension member 'modulo' that
// can be called on any Int32 value
type System.Int32 with
member x.modulo n = x % n
// To use it, you can write something like this:
10 .modulo 3
Note that the space before . is needed, because otherwise the compiler tries to interpret 10.m as a numeric literal (like 10.0f).
I find this a bit more elegant than using pipeline trick, because F# supports both functional style and object-oriented style and extension methods are - in some sense - close equivalent to implicit operators from functional style. The pipeline trick looks like a slight misuse of the operators (and it may look confusing at first - perhaps more confusing than a method invocation).
That said, I have seen people using other operators instead of pipeline - perhaps the most interesting version is this one (which also uses the fact that you can omit spaces around operators):
// Define custom operators to make the syntax prettier
let (</) a b = a |> b
let (/>) a b = a <| b
let modulo a b = a % b
// Then you can turn any function into infix using:
10 </modulo/> 3
But even this is not really an established idiom in the F# world, so I would probably still prefer extension members.
Not that I know of, but you can use the left and right pipe operators. For example
let modulo x y = x % y
let FourMod3 = 4 |> modulo <| 3
To add a little bit to the answers above, you can create a custom infix operators but the vocabulary is limited to:
!, $, %, &, *, +, -, ., /, <, =, >, ?, #, ^, |, and ~
Meaning you can build your infix operator using combining these symbols.
Please check the full documentation on MS Docs.
https://learn.microsoft.com/en-us/dotnet/fsharp/language-reference/operator-overloading
Is there a (simple) way, within a parsing expression grammar (PEG), to express an "unordered sequence"? A rule such as
Rule <- A B C
requires A, B and C to match in order. A rule such as
Rule <- (A B C) / (B C A) / (C A B) / (A C B) / (C B A) / (B A C)
allows them to match in any order (which is what we want) but it is cumbersome and inapplicable in practice with more terms in the sequence.
Is the only solution to use a syntactically looser rule such as
Rule <- (A / B / C){3}
and semantically check that each rule matches only once?
The fact that, e.g., Relax NG Compact Syntax has an "unordered list" operator to parse XML make me hint that there is no obvious solution.
Last question: do you think the addition of such an operator would bring ambiguity to PEG?
Grammar rules express precisely the sequence of forms that you want, regardless of parsing engine (e.g., PEG, LALR, LL(k), ...) that you choose.
The only way to express that you want all possible sequences of just of something using BNF rules is the big ugly rule you proposed.
The standard solution is to simply define:
rule <- (A | B | C)*
(or whatever syntax your parser generator accepts for lists) and semantically count that only 3 forms are provided and they are unique.
Often people building parser generators add special "extended BNF" notations to let them describe special circumstances; you gave an example use {3} as special syntax implying that you only wanted "3 of" under the assumption the parser generator accepts this notation and does the appropriate enforcement. One can imagine an extension notation {unique} to let you describe your situation. I've never seen a parser generator that implemented that idea.
I've just found something I'd call a quirk in F# and would like to know whether it's by design or by mistake and if it's by design, why is it so...
If you write any range expression where the first term is greater than the second term the returned sequence is empty. A look at reflector suggests this is by design, but I can't really find a reason why it would have to be so.
An example to reproduce it is:
[1..10] |> List.length
[10..1] |> List.length
The first will print out 10 while the second will print out 0.
Tests were made in F# CTP 1.9.6.2.
EDIT: thanks for suggesting expliciting the range, but there's still one case (which is what inspired me to ask this question) that won't be covered. What if A and B are variables and none is constantly greater than the other although they're always different?
Considering that the range expression does not seem to get optimized at compiled time anyway, is there any good reason for the code which determines the step (not explicitly specified) in case A and B are ints not to allow negative steps?
As suggested by other answers, you can do
[10 .. -1 .. 1] |> List.iter (printfn "%A")
e.g.
[start .. step .. stop]
Adam Wright - But you should be able
to change the binding for types you're
interested in to behave in any way you
like (including counting down if x >
y).
Taking Adam's suggestion into code:
let (..) a b =
if a < b then seq { a .. b }
else seq { a .. -1 .. b }
printfn "%A" (seq { 1 .. 10 })
printfn "%A" (seq { 10 .. 1 })
This works for int ranges. Have a look at the source code for (..): you may be able to use that to work over other types of ranges, but not sure how you would get the right value of -1 for your specific type.
What "should" happen is, of course, subjective. Normal range notation in my mind defines [x..y] as the set of all elements greater than or equal to x AND less than or equal to y; an empty set if y < x. In this case, we need to appeal to the F# spec.
Range expressions expr1 .. expr2 are evaluated as a call to the overloaded operator (..), whose default binding is defined in Microsoft.FSharp.Core.Operators. This generates an IEnumerable<_> for the range of values between the given start (expr1) and finish (expr2) values, using an increment of 1. The operator requires the existence of a static member (..) (long name GetRange) on the static type of expr1 with an appropriate signature.
Range expressions expr1 .. expr2 .. expr3 are evaluated as a call to the overloaded operator (.. ..), whose default binding is defined in Microsoft.FSharp.Core.Operators. This generates an IEnumerable<_> for the range of values between the given start (expr1) and finish (expr3) values, using an increment of expr2. The operator requires the existence of a static member (..) (long name GetRange) on the static type of expr1 with an appropriate signature.
The standard doesn't seem to define the .. operator (a least, that I can find). But you should be able to change the binding for types you're interested in to behave in any way you like (including counting down if x > y).
In haskell, you can write [10, 9 .. 1]. Perhaps it works the same in F# (I haven't tried it)?
edit:
It seems that the F# syntax is different, maybe something like [10..-1..1]
Ranges are generally expressed (in the languages and frameworks that support them) like this:
low_value <to> high_value
Can you give a good argument why a range ought to be able to be expressed differently? Since you were requesting a range from a higher number to a lower number does it not stand to reason that the resulting range would have no members?