How to write a Parser that validates its input against a predicate and otherwise fails

I want to write a Parser that produces some data structure and validates its consistency by running a predicate on it. In case the predicate returns false the parser should return a custom Error object (as opposed to a Failure, since this can be achieved by ^?).
I am looking for some operator on parser that can achieve that.
For example, let's say that I want to parse a list of integers and check that they are distinct. I would like to have something like this:
import util.parsing.combinator.RegexParsers
object MyParser extends RegexParsers {
  val number: Parser[Int] = """\d+""".r ^^ {_.toInt }
  val list = repsep(number, ",") ^!(checkDistinct, "numbers have to be unique")
  def checkDistinct(numbers: List[Int]) = (numbers.length == numbers.distinct.length)
}
The ^! in the code above is what I am looking for. How can I validate a parser output and return a useful error message if it does not validate?

One way to achieve this would be to use the Pimp My Library pattern to add the ^! operator to Parser[List[T]] (the return type of repsep). Define an implicit def, then import it into scope when you need to use it:
class ParserWithMyExtras[T](val parser:Parser[List[T]]){
  def ^!(predicate:List[T]=>Boolean, errorMessage:String) = {...}
}
implicit def augmentParser[T](parser:Parser[List[T]]) =
  new ParserWithMyExtras(parser)

Parsers.commit transforms Failure to Error.
So a first step would be
commit(p ^?(condition, message))
However, this would also produce an error when p itself fails, which is probably not what you want: you want an error only when p succeeds and the check then fails.
So you should rather do
p into {result => commit(success(result) ^? (condition,message))}
That may look rather contrived. You can also implement the operator directly: copy the implementation of ^? and replace the Failure with an Error.
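To make the into/commit variant concrete, here is a minimal sketch applied to the distinct-integers example from the question. It assumes the scala-parser-combinators library is on the classpath; the object name IntoCommitExample and the helper distinct are made up for illustration.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical grammar object, used only for this sketch.
object IntoCommitExample extends RegexParsers {
  val number: Parser[Int] = """\d+""".r ^^ (_.toInt)

  // Hypothetical helper: runs p first, then checks its result.
  // success(result) cannot fail by itself, so commit only turns the
  // Failure produced by the ^? check into an Error; failures of p
  // itself stay ordinary failures.
  def distinct(p: Parser[List[Int]]): Parser[List[Int]] =
    p into { result =>
      commit(success(result) ^? (
        { case ns if ns.distinct.length == ns.length => ns },
        _ => "numbers have to be unique"))
    }

  val list: Parser[List[Int]] = distinct(repsep(number, ","))
}
```

With this definition, parseAll(list, "1,2,3") is expected to succeed, while parseAll(list, "1,2,1") is expected to be unsuccessful and to carry the custom message.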
Finally, you should probably do as Dylan suggests and add the operator. If you want to define it outside of your grammar (Parsers), I think you will need a mixin:
trait PimpedParsers { self: Parsers =>
  implicit def ...
}
Otherwise you cannot easily refer to the Parser type (singular), because it is a member of Parsers.

Here is a complete Pimp My Library implementation:
implicit def validatingParsers[T](parser: Parser[T]) = new {
  def ^!(predicate: T => Boolean, error: => String) = Parser { in =>
    parser(in) match {
      case s @ Success(result, sin) => predicate(result) match {
        case true => s
        case false => Error(error, sin) // <--
      }
      case e @ NoSuccess(_, _) => e
    }
  }
}
The new operator ^! transforms the parser on the left to a new parser that applies the predicate.
One important thing to note is the sin on the line marked with <--. Because the Error that Scala's parser library eventually reports is the one at the latest position in the input, it is crucial to pass sin on that line instead of in: sin is the point where the inner parser finished its own parsing.
If we passed in instead of sin, the error eventually reported could be the latest failure that occurred while parsing the inner rule (which we know eventually succeeded, since we reached that line).
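As a usage sketch, the operator can be wired into the grammar from the question roughly as follows. This assumes the scala-parser-combinators library is available; it uses an implicit class in place of the anonymous class above, and the names DistinctListParser and ValidatingParser are invented for the example.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical grammar object, used only for this sketch.
object DistinctListParser extends RegexParsers {
  // Adds ^! to every Parser[T] defined inside this grammar.
  implicit class ValidatingParser[T](parser: Parser[T]) {
    def ^!(predicate: T => Boolean, error: => String): Parser[T] = Parser { in =>
      parser(in) match {
        // Report the Error at `rest`, the position where the inner parser stopped.
        case s @ Success(result, rest) =>
          if (predicate(result)) s else Error(error, rest)
        // Failures and errors of the inner parser pass through unchanged.
        case other => other
      }
    }
  }

  val number: Parser[Int] = """\d+""".r ^^ (_.toInt)

  def checkDistinct(numbers: List[Int]): Boolean =
    numbers.length == numbers.distinct.length

  val list: Parser[List[Int]] =
    repsep(number, ",") ^! (checkDistinct, "numbers have to be unique")
}
```

Calling DistinctListParser.parseAll(DistinctListParser.list, "1,2,1") is then expected to be unsuccessful, while "1,2,3" parses to List(1, 2, 3).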

^? accepts an error message generator, and commit converts a Failure into an Error:
val list = commit {
  repsep(number, ",") ^? (
    { case numbers if checkDistinct(numbers) => numbers },
    _ => "numbers have to be unique" )
}

Related

How to access the methods for a higher rule?

While writing validation rules I came across the problem that I need some content from a rule in my grammar that is hierarchically higher than the one I pass to my validation method.
I know that I can refer to a "higher" rule with .eContainer but then I don't have a clue how to access the values I want to.
For example I have the following grammar snippet:
rule1:
name=ID content=rule2
;
rule2:
<<whatever content>>
;
If I have a normal validation method with rule1 as its argument, I can access the name via .name. But when I pass rule2 as the argument and then refer to rule1 via .eContainer, the .name method does not exist.
Greetings Krzmbrzl
EObject is the root class of all AST node classes. It comes from the EMF Ecore framework, which Xtext uses to generate the AST implementation. The EObject class therefore provides many tree-structure features, e.g., iterating through a tree. The EObject.eContainer() method is declared to return an EObject, which is a supertype of the actual type of the returned object. To access the methods of the next higher element, you have to cast the result of eContainer() like this:
@Check
public void check(rule2 r2) {
    EObject o = r2.eContainer();
    rule1 r1 = (rule1) o;
    String r1Name = r1.getName();
}
If the type of the parent object is ambiguous, you should test whether the actual type is the expected one with an instanceof expression:
@Check
public void check(rule2 r2) {
    EObject o = r2.eContainer();
    if (o instanceof rule1) {
        rule1 r1 = (rule1) o;
        String r1Name = r1.getName();
    }
}
Xtend provides the same instanceof expression as Java. But if the object to be checked can have more than a few types, you can use Xtend's powerful switch expression. It supports so-called type guards: you can switch over any object and, instead of case value: guards, simply write a concrete type:
switch (anyAbstractTypeObject) {
    ConcreteSubtypeA: {...}
    ConcreteSubtypeB: {...}
}
This is an elegant shorthand for if-instanceof-else-if chains in Xtend.

Is it necessary to use else branch in async expressions?

I want to write the following code:
let someAsync () = async {
    if 1 > 2 then return true // Error "this expression is expected to have type unit ..."
    // I want to place much code here
    return false
}
F# for some reason thinks that I need to write it like that:
let someAsync () = async {
    if 1 > 2 then return true
    else
        // Much code here (indented!)
        return false
}
In the latter case no error message is produced. But in my view both pieces of code are equivalent. Is there any chance I could avoid the unnecessary nesting and indentation?
UPD. What I am asking for is indeed possible! Please take a look at the example in the section Real world example.
I will quote the code:
let validateName(arg:string) = imperative {
    if (arg = null) then return false // <- HERE IT IS
    let idx = arg.IndexOf(" ")
    if (idx = -1) then return false // <- HERE IT IS
    // ......
    return true
}
So it is possible; the only question is whether it can somehow be implemented in async, via an extension to the module or whatever.
I think that situation is described here: Conditional Expressions: if... then...else (F#)
(...) if the type of the then branch is any type other than unit,
there must be an else branch with the same return type.
Your first code does not have else branch, which caused an error.
There is an important difference between the async computation builder and my imperative builder.
In async, you cannot create a useful computation that does not return a value. This means that Async<'T> represents a computation that will eventually produce a value of type 'T. In this case, the async.Zero method has to return unit and has a signature:
async.Zero : unit -> Async<unit>
For the imperative builder, the type Imperative<'T> represents a computation that may or may not return a value. If you look at the type declaration, it looks as follows:
type Imperative<'T> = unit -> option<'T>
This means that the Zero operation (which is used when you write if without else) can be computation of any type. So, imperative.Zero method returns a computation of any type:
imperative.Zero : unit -> Imperative<'T>
This is a fundamental difference, which also explains why you can write if without an else branch (because the Zero method can create a computation of any type). This is not possible for async, because Zero can only create unit-returning computations.
So the two computations have different structures. In particular, "imperative" computations have a monoidal structure and async workflows do not. You can find a more detailed explanation in our F# Computation Zoo paper.

DART: syntax of future then

I don't understand the syntax of the then() clause.
1. myFuture(6).then( (erg) => print(erg) )
What is (erg) => expr syntactically?
I thought it could be a function, but
then( callHandler2(erg)
doesn't work. Error:
"Multiple markers at this line
- The argument type 'void' cannot be assigned to the parameter type '(String) ->
dynamic'
- Undefined name 'erg'
- Expected to find ')'"
2. myFuture(5).then( (erg) { callHandler(erg);},
onError: (e) => print (e)
What is `onError: (e) => expr` syntactically?
3. Is there a difference between the onError: and the .catchError(e) variants?
1) The Fat Arrow is syntactic sugar for short anonymous functions. The two functions below are the same:
someFuture(arg).then((erg) => print(erg));
// is the same as
someFuture(arg).then((erg) { return print(erg); });
Basically, the fat arrow automatically returns the value of the expression that follows it.
If your callHandler2 has the correct signature, you can just pass the function name. The correct signature means that it accepts the same number of parameters as the future will pass to the then clause, and returns null/void.
For instance the following will work:
void callHandler2(someArg) { ... }
// .. elsewhere in the code
someFuture(arg).then(callHandler2);
2) See answer 1). The fat arrow is just syntactic sugar equivalent to:
myFuture(5).then( (erg){ callHandler(erg);}, onError: (e){ print(e); });
3) catchError allows you to chain the error handling after a series of futures. First, it's important to understand that then calls can be chained, so a then call that returns a Future can be chained to another then call. catchError will catch errors, both synchronous and asynchronous, from all Futures in the chain. Passing an onError argument will only deal with an error in the Future it's an argument for and with any synchronous code in your then block. Any asynchronous code in your then block will remain uncaught.
The recent tendency in most Dart code is to use catchError and omit the onError argument.
I will attempt to elaborate more on Matt's answer, hopefully to give more insights.
What then() requires is a function (callback), whose signature matches the future's type.
For example, given a Future<String> myFuture and doSomething being any function that accepts a String input, you can call myFuture.then(doSomething). Now, there are several ways to define a function that takes a String in Dart:
Function(String) doSomething1 = (str) => /* do something with str */ // only one command
Function(String) doSomething2 = (str) { /* do something with str */ } // several commands
Function(String) doSomething3 = myFunction;
myFunction(String) { // Dart will auto imply return type here
/* do something with str */ // several commands
}
Any of those 3 function definitions (the right-hand side of =) could go inside then(). The first two definitions are called lambda functions; they are created at runtime and cannot be reused unless you manually copy the code. Lambda functions can read almost like natural language, e.g. (connection) => connection.connect(). The third approach allows the function to be reused. Lambda functions are common in many languages; you can read more about them here: https://medium.com/@chineketobenna/lambda-expressions-vs-anonymous-functions-in-javascript-3aa760c958ae.
The reason you can't put callHandler2(erg) inside then() is that callHandler2(erg) uses an undefined variable erg. With the lambda function, you tell then() that the erg in callHandler2(erg) is the output of the future, so it knows where to get the value of erg.

Using config driven logic in createCriteria grails

I have a requirement in which I need some of the logic of a criteria query to be config driven. Earlier I used to query like this:
User.createCriteria().list{
    or{
        eq('username',user.username)
        eq('name',user.name)
    }
}
But I need this to be configurable in my use case, so I tried this code snippet:
def criteriaCondition= grailsApplication.config.criteriaCondition?:{user->
    or{
        eq('username',user.username)
        eq('name',user.name)
    }
}
User.createCriteria().list{criteriaCondition(user)}
But this doesn't work for me. I get a missing method exception for "or". I tried a few solutions from other sources, but they didn't work for me.
So, can anyone help me with:
1) How to make the code above work.
2) Any better way to handle my use case.
Thanks in advance!
You have to pass the criteriaBuilder object to the closure, something like this:
def criteriaCondition = grailsApplication.config.criteriaCondition ?: { cb, user ->
    cb.or{
        cb.eq('username',user.username)
        cb.eq('name',user.name)
    }
}
def criteriaBuilder = User.createCriteria()
criteriaBuilder.list{
    criteriaCondition(criteriaBuilder, user)
}
Obviously, the closure in Config.groovy also has to have the same parameter list, including cb.
The way the criteria builder mechanism works, the list method expects to be passed a closure which it will call, whereas your current code is calling the criteriaCondition closure itself rather than letting the criteria builder call it. "Currying" will help you here: given
def criteriaCondition= grailsApplication.config.criteriaCondition?:{user->
    or{
        eq('username',user.username)
        eq('name',user.name)
    }
}
instead of saying
User.createCriteria().list{criteriaCondition(user)}
you say
User.createCriteria().list(criteriaCondition.curry(user))
(note the round brackets rather than braces).
The curry method of Closure returns you another Closure with some or all of its arguments "pre-bound" to specific values. For example
def add = {a, b -> a + b}
def twoPlus = add.curry(2) // gives a closure equivalent to {b -> 2 + b}
println twoPlus(3) // prints 5
In your case, criteriaCondition.curry(user) gives you a zero-argument closure that you can pass to criteria.list. You can curry as many arguments as you like (up to the number that the closure can accept).

How to get lexer to output EOF in Scala TokenParser?

If I define a Lexical to feed a TokenParser, I'm having trouble getting the TokenParser to actually output an EOF token. In particular, some of the methods in Parser[T] (acceptIf, acceptMatch, and phrase) directly check whether the Reader is atEnd, so there's no chance for an EOF token to get added to the token stream before an error is returned.
Since the Tokens trait actually defines an EOF token, I'm sure there must be some simple way to output it, but at this point all I can think to do is to create my own Reader that doesn't return true for atEnd until after at least one EOF has been output or adding a '\032' character to the input so that the Reader doesn't realize it's at the end until after it has emitted that character.
Please tell me I'm missing an easier way...
You don't need to do this at all. Use
new YourLexical.Scanner("foo")
to create a Reader[YourLexical.Token], which will respond to #atEnd automatically.
Then you can hand this Reader to a TokenParser implementing your syntax directly as input:
class YourTokenParser ... {
  ...
  def program: Parser[...] = ...
  def parse(s: String) =
    phrase(program)(new YourLexical.Scanner(s))
}
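For a version that can be run end to end, here is a small sketch built on StandardTokenParsers, whose lexical member is a StdLexical. It assumes the scala-parser-combinators library is on the classpath; the object name SumParser and its tiny grammar (integers joined by "+") are invented for illustration and stand in for YourTokenParser and YourLexical.

```scala
import scala.util.parsing.combinator.syntactical.StandardTokenParsers

// Hypothetical token parser, used only for this sketch.
object SumParser extends StandardTokenParsers {
  // Tell the lexer that "+" is a delimiter token.
  lexical.delimiters += "+"

  // One or more numeric literals separated by "+", folded into their sum.
  def program: Parser[Int] = numericLit ~ rep("+" ~> numericLit) ^^ {
    case first ~ rest => (first :: rest).map(_.toInt).sum
  }

  // The Scanner is a Reader[Token]; phrase checks atEnd on it, so no
  // explicit EOF token has to be matched in the grammar.
  def parse(s: String): ParseResult[Int] =
    phrase(program)(new lexical.Scanner(s))
}
```

Here SumParser.parse("1 + 2 + 3") succeeds with the value 6, and input with trailing tokens that the grammar does not consume is rejected by phrase.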
