"Some instances of this call cannot safely be inlined." - dafny

In Dafny, what does the error message "Some instances of this call cannot safely be inlined" mean?
I see it reported for calls to predicates inside asserts. E.g.
assert LessThanOrEqual([a[z]], a[z+1..r]);

It's an informational message (not an error), and it's rather obscure! (I had to look it up myself to understand it.)
When there's a proof obligation that involves a predicate (here LessThanOrEqual), then the Dafny verifier internally sets things up so that if it can't prove the predicate, it will be able to tell you which conjunct inside the body of the predicate is failing. You will see this as "associated declaration" messages, which accompany error messages.
You can think of what's going on as, essentially, inlining the predicate's body into the call-site of the predicate. Sometimes, however, this cannot be done. For example, if the predicate is recursive, then there must be some limit to such inlining. If inlining cannot be done, all it means is that any error message you get will just say "can't prove LessThanOrEqual(...)", but it won't tell you which part of the definition of LessThanOrEqual it couldn't manage to prove.
A more subtle reason for why inlining cannot be done involves quantifiers. The verifier works with quantifiers through what are called matching triggers. A trigger informs the verifier when it would be a good idea to instantiate a quantifier. There are certain rules about what can and cannot be a trigger. The one rule that's relevant in your example is that arithmetic + cannot be part of a trigger. I'm guessing the definition of your LessThanOrEqual involves a quantifier, and that the verifier picks as a trigger of that quantifier a term that involves the second parameter to LessThanOrEqual. If the call above to LessThanOrEqual were inlined, then the + would sneak into the trigger, and that's not allowed by the rule.
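To make the trigger issue concrete, here is one plausible shape such a predicate might have (the question doesn't show the real definition, so this is only a guess):

predicate LessThanOrEqual(xs: seq<int>, ys: seq<int>)
{
  // Every element of xs is at most every element of ys.
  forall i, j :: 0 <= i < |xs| && 0 <= j < |ys| ==> xs[i] <= ys[j]
}

For a quantifier like this, the verifier would likely pick terms such as xs[i] and ys[j] as the trigger. Inlining the call in the assert substitutes a[z+1..r] for ys, so the trigger term would become a[z+1..r][j] - and that term contains the forbidden +.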
Dafny thus chooses not to inline this call to LessThanOrEqual. All this means is that you'd get a slightly less precise error location if the verifier failed to prove the assertion. You probably wouldn't have noticed or been bothered about this; indeed, getting a less precise error message is probably less puzzling than the informational message you're getting instead.
There is a way to suppress the informational message: pass in an equivalent expression that doesn't directly mention +. For example, you can use a local variable:
ghost var s := a[z+1..r];
assert LessThanOrEqual( [a[z]], s );
or a let expression:
assert var s := a[z+1..r]; LessThanOrEqual( [a[z]], s );
Rustan


DLV Rule is not safe

I am starting to work with DLV (Disjunctive Datalog) and I have a rule that is reporting a "Rule is not safe" error, when running the code. The rule is the following:
foo(R, 1) :- not foo(R, _).
I have read the manual and seen that "cyclic dependencies are disallowed". I guess this is why the error is reported, but I am not sure why this statement is so problematic for DLV. The final goal is to have some kind of initialization in case the predicate has not been defined.
More precisely, if there is no occurrence of 'foo' with the parameter R (and anything else), then define it with parameters R and 1. Once it is defined, the rule shouldn't be triggered again. So, this is not a real recursion in my opinion.
Any comments on how to solve this issue are welcomed!
I have realised that I probably need another predicate to match the parameter R in the body of the rule. Something like this:
foo(R, 1) :- not foo(R, _), bar(R)
Otherwise, there would be no way to know whether there are any occurrences of foo(R, _). I don't know whether I made myself clear.
Anyway, this doesn't work either :(
To the particular "Rule is not safe" error: first of all, this has nothing to do with cyclic or acyclic dependencies. The same error message shows up for the non-cyclic program:
foo2(R, 1) :- not foo(R,_), bar(R).
The problem is that the program is actually not safe (http://www.dlvsystem.com/html/DLV_User_Manual.html#SAFETY). As mentioned in the section on negative rules (anchor #AEN375, I am only allowed to use 2 links in my answer):
Variables, which occur in a negated literal, must also occur in a positive literal in the body.
Observe that the _ is an anonymous variable. I.e., the program
foo(R,1) :- not foo(R,_), bar(R).
can be equivalently written as (and is equivalent to)
foo(R,1) :- not foo(R,X), bar(R).
Anonymous variables (DLV manual, anchor #AEN264 - at the end of the section) just allow us to avoid inventing names for variables that occur only once within a rule (i.e. for variables that merely express "there is some value, I absolutely do not care about it"), but they are variables nevertheless. And since not is default negation ("negation as failure") rather than true negation (also often called "strong negation"), none of the three safety conditions is satisfied by the rule.
A very rough and high-level intuition for safety is that it guarantees that every variable in the program can be bound to some finite domain - as is now the case with R thanks to the added bar(R). However, the same must also hold for the anonymous variable _ .
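To see the rule in isolation, compare these two rules (the predicate names are invented for illustration):

% Unsafe: X occurs only in a negated literal.
p(X) :- not q(X).

% Safe: X is also bound by the positive body literal dom(X).
p(X) :- dom(X), not q(X).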
To the actual problem of defining default values:
As pointed out by lambda.xy.x, the problem here is the Answer Set (or stable model) semantics of DLV: Trying to do it in one rule does not give any solution:
In order to get a safe program, we could replace the above program, e.g., by
foo(1,2). bar(1). bar(2).
tmp(R) :- foo(R,_).
foo(R,1) :- not tmp(R), bar(R).
This has no stable model:
Assume the answer is, as intended,
{foo(1,2), bar(1), bar(2), foo(2,1)}
However, this is not a valid model, since tmp(R) :- foo(R,_) would require it to contain tmp(2). But then, "not tmp(2)" is no longer true, and therefore having foo(2,1) in the model violates the required minimality of the model. (This is not exactly what is going on, more a rough intuition. More technical details could be found in any article on answer set programming, a quick Google search gave me this paper as one of the first results: http://www.kr.tuwien.ac.at/staff/tkren/pub/2009/rw2009-asp.pdf)
In order to solve the problem, it is therefore somehow necessary to "break the cycle". One possibility would be:
foo(1,2). bar(1). bar(2). bar(3).
tmp(R) :- foo(R,X), X!=1.
foo(R,1) :- bar(R), not tmp(R).
I.e., by explicitly stating that we want to add R into the intermediate atom only if the value is different from 1, having foo(2,1) in the model does not contradict tmp(2) not being part of the model as well. Of course, this no longer allows one to distinguish whether foo(R,1) holds as a default value or came from the input, but if that distinction is not required ...
Another possibility would be to not use foo for the computation, but some foo1 instead. I.e. having
foo1(R,X) :- foo(R,X).
tmp(R) :- foo(R,_).
foo1(R,1) :- bar(R), not tmp(R).
and then just use foo1 instead of foo.
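Putting the pieces together, a complete sketch with the same facts as above would be:

% Facts.
foo(1,2). bar(1). bar(2). bar(3).

% Copy the explicitly given values.
foo1(R,X) :- foo(R,X).

% tmp(R) marks those R that already have an explicit value.
tmp(R) :- foo(R,_).

% Default: R gets the value 1 if no explicit value exists.
foo1(R,1) :- bar(R), not tmp(R).

The single answer set of this program contains foo1(1,2), foo1(2,1) and foo1(3,1): the explicit value is copied over, and the default is added exactly for those R in bar that have no explicit foo value.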

What's the difference between luaL_checknumber and lua_tonumber?

Clarification (sorry the question was not specific): They both try to convert the item on the stack to a lua_Number. lua_tonumber will also convert a string that represents a number. How does luaL_checknumber deal with something that's not a number?
There's also luaL_checklong and luaL_checkinteger. Are they the same as (int)luaL_checknumber and (long)luaL_checknumber respectively?
The reference manual does answer this question. I'm citing the Lua 5.2 Reference Manual, but similar text is found in the 5.1 manual as well. The manual is, however, quite terse. It is rare for any single fact to be restated in more than one sentence. Furthermore, you often need to correlate facts stated in widely separated sections to understand the deeper implications of an API function.
This is not a defect, it is by design. This is the reference manual to the language, and as such its primary goal is to completely (and correctly) describe the language.
For more information about "how" and "why" the general advice is to also read Programming in Lua. The online copy is getting rather long in the tooth as it describes Lua 5.0. The current paper edition describes Lua 5.1, and a new edition describing Lua 5.2 is in process. That said, even the first edition is a good resource, as long as you also pay attention to what has changed in the language since version 5.0.
The reference manual has a fair amount to say about the luaL_check* family of functions.
Each API entry's documentation block is accompanied by a token that describes its use of the stack, and under what conditions (if any) it will throw an error. Those tokens are described at section 4.8:
Each function has an indicator like this: [-o, +p, x]
The first field, o, is how many elements the function pops from the stack. The second field, p, is how many elements the function pushes onto the stack. (Any function always pushes its results after popping its arguments.) A field in the form x|y means the function can push (or pop) x or y elements, depending on the situation; an interrogation mark '?' means that we cannot know how many elements the function pops/pushes by looking only at its arguments (e.g., they may depend on what is on the stack). The third field, x, tells whether the function may throw errors: '-' means the function never throws any error; 'e' means the function may throw errors; 'v' means the function may throw an error on purpose.
At the head of Chapter 5 which documents the auxiliary library as a whole (all functions in the official API whose names begin with luaL_ rather than just lua_) we find this:
Several functions in the auxiliary library are used to check C function arguments. Because the error message is formatted for arguments (e.g., "bad argument #1"), you should not use these functions for other stack values.

Functions called luaL_check* always throw an error if the check is not satisfied.
The function luaL_checknumber is documented with the token [-0,+0,v] which means that it does not disturb the stack (it pops nothing and pushes nothing) and that it might deliberately throw an error.
The other functions that have more specific numeric types differ primarily in function signature. All are described similarly to luaL_checkint(): "Checks whether the function argument arg is a number and returns this number cast to an int", varying the type named in the cast as appropriate.
The function lua_tonumber() is described with the token [-0,+0,-], meaning it has no effect on the stack and does not throw any errors. It is documented to return the numeric value from the specified stack index, or 0 if the stack index does not contain something sufficiently numeric. It is documented in terms of the more general function lua_tonumberx(), which also provides an out-parameter flag indicating whether the conversion succeeded.
It too has siblings named with more specific numeric types that do all the same conversions but cast their results.
Finally, one can also refer to the source code, with the understanding that the manual is describing the language as it is intended to be, while the source is a particular implementation of that language and might have bugs, or might reveal implementation details that are subject to change in future versions.
The source to luaL_checknumber() is in lauxlib.c. It can be seen to be implemented in terms of lua_tonumberx() and the internal function tag_error(), which calls typeerror(), which in turn uses luaL_argerror() to throw the formatted error message.
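Modulo exact identifiers, that implementation is roughly the following (a paraphrase of the Lua 5.2 source, not a verbatim copy):

LUALIB_API lua_Number luaL_checknumber (lua_State *L, int narg) {
  int isnum;
  lua_Number d = lua_tonumberx(L, narg, &isnum);  /* attempt the conversion */
  if (!isnum)
    tag_error(L, narg, LUA_TNUMBER);  /* throws via luaL_argerror; never returns */
  return d;
}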
They both try to convert the item on the stack to a lua_Number, and lua_tonumber will also convert a string that represents a number. The difference is that luaL_checknumber throws a (Lua) error when the conversion fails - it long-jumps and, from the C function's point of view, never returns - whereas lua_tonumber merely returns 0 (which can be a valid result as well). So you could write code like the following, which should be faster than checking with lua_isnumber first:
double get_number(lua_State *_L, int idx)
{
    double r = lua_tonumber(_L, idx);      // returns 0 when the value is not convertible
    if (r == 0 && !lua_isnumber(_L, idx))  // 0 is ambiguous, so confirm with a type check
    {
        // Error handling code
    }
    return r;
}
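In Lua 5.2 the same function can avoid the double check by calling lua_tonumberx directly (a sketch):

double get_number_52(lua_State *_L, int idx)
{
    int isnum;
    double r = lua_tonumberx(_L, idx, &isnum);  /* isnum is set to 0 on failure */
    if (!isnum)
    {
        /* Error handling code */
    }
    return r;
}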

Encode FIRST and FOLLOW sets into a recursive descent parser

This is a follow up to a previous question I asked How to encode FIRST & FOLLOW sets inside a compiler, but this one is more about the design of my program.
I am implementing the Syntax Analysis phase of my compiler by writing a recursive descent parser. I need to be able to take advantage of the FIRST and FOLLOW sets so I can handle errors in the syntax of the source program more efficiently. I have already calculated the FIRST and FOLLOW sets for all of my non-terminals, but am having trouble deciding where to logically place them in my program and what the best data structure would be for doing so.
Note: all code will be pseudo code
Option 1) Use a map, and map all non-terminals by their name to two Sets that contains their FIRST and FOLLOW sets:
class ParseConstants
Map firstAndFollowMap = #create a map .....
firstAndFollowMap.put("<program>", [FIRST_SET, FOLLOW_SET])
end
This seems like a viable option, but inside my parser I would then need sorta ugly code like this to retrieve the FIRST and FOLLOW sets and pass them to the error function:
#processes the <program> non-terminal
def program
List list = firstAndFollowMap.get("<program>")
Set FIRST = list.get(0)
Set FOLLOW = list.get(1)
error(current_symbol, FOLLOW)
end
Option 2) Create a class for each non-terminal and have a FIRST and FOLLOW property:
class Program
FIRST = .....
FOLLOW = ....
end
this leads to code that looks a little nicer:
#processes the <program> non-terminal
def program
error(current_symbol, Program.FOLLOW)
end
These are the two options I thought up. I would love to hear any other suggestions for ways to encode these two sets, and any critiques of or additions to the two approaches I posted would also be helpful.
Thanks
I have also posted this question here: http://www.coderanch.com/t/570697/java/java/Encode-FIRST-FOLLOW-sets-recursive
You don't really need the FIRST and FOLLOW sets at parse time; you compute them in order to build the parse table. That is a table of {<non-terminal, token> -> <action, rule>} if LL(k) (meaning: seeing a non-terminal on the stack and a token in the input, which action to take and, if one applies, which rule to use), or a table of {<state, token> -> <action, state>} if (C|LA|)LR(k) (meaning: given a state on the stack and a token in the input, which action to take and which state to go to).
After you have this table, you don't need FIRST and FOLLOW any more.
If you are writing a semantic analyzer, you must assume the parser is working correctly. Phrase level error handling (i.e., handling parse errors) is totally orthogonal to semantic analysis.
This means that in case of parse error, the phrase level error handler (PLEH) would try to fix the error. If it couldn't, parsing stops. If it could, the semantic analyzer shouldn't know if there was an error which was fixed, or there wasn't any error at all!
You can take a look at my compiler library for examples.
About phrase level error handling, you again don't need FIRST and FOLLOW. Let's talk about LL(k) for now (simply because I haven't thought much about LR(k) yet). After you build the grammar table, you have many entries of the form I mentioned:
<non-terminal, token> -> <action, rule>
Now, when you parse, you take whatever is on top of the stack. If it is a terminal, then you must match it with the input. If it doesn't match, the phrase level error handler kicks in:
Role one: handle missing terminals - simply generate a fake terminal of the type you need in your lexer and have the parser retry. You can do other things as well (for example, look ahead in the input: if the token you want is there, drop one token from the lexer).
If what you get from the stack is a non-terminal (T) instead, you must look at your lexer, get the lookahead, and consult your table. If the entry <T, lookahead> exists, then you're good to go: follow the action and push to/pop from the stack. If, however, no such entry exists, the phrase level error handler again kicks in:
Role two: handle unexpected terminals - you can do many things to get past this. What you do depends on what T and the lookahead are, and on your expert knowledge of your grammar.
Examples of the things you can do are listed below; a code sketch of the parse loop and handler follows the list:
Fail! You can do nothing
Ignore this terminal. This means that you push the lookahead onto the stack (after pushing T back again) and let the parser continue. The parser will first match the lookahead, throw it away, and then continue. Example: in an expression like *1+2/0.5, you can drop the unexpected * this way.
Change the lookahead to something acceptable, push T back, and retry. For example, an expression like 5id = 10; could be illegal because you don't accept identifiers that start with digits. You could replace the token with _5id, for example, and continue.
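As a concrete illustration of the table-driven loop with a phrase level error handler hook, here is a sketch in Python; the grammar, table contents and recovery policy are invented for illustration and are not from the original question:

END = "$"

def parse(table, start_symbol, tokens, on_error):
    # table maps (non-terminal, token) -> production (a list of symbols)
    stack = [END, start_symbol]
    tokens = list(tokens) + [END]
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top == look:                   # terminal (or END) matches the input
            pos += 1
        elif (top, look) in table:        # expand a non-terminal
            stack.extend(reversed(table[(top, look)]))
        else:                             # no entry: phrase level error handler
            pos = on_error(top, tokens, pos, stack)
    return pos == len(tokens)

def on_error(top, tokens, pos, stack):
    # "Ignore this terminal": drop the unexpected token, push the expected
    # symbol back, and retry with the next lookahead.
    if tokens[pos] == END:
        raise SyntaxError("unexpected end of input while expecting %r" % (top,))
    print("unexpected %r while expecting %r; skipping" % (tokens[pos], top))
    stack.append(top)
    return pos + 1

# Toy grammar:  E -> id E'        E' -> + id E' | (empty)
table = {
    ("E", "id"): ["id", "E'"],
    ("E'", "+"): ["+", "id", "E'"],
    ("E'", END): [],
}
print(parse(table, "E", ["id", "+", "id"], on_error))       # True: parses cleanly
print(parse(table, "E", ["*", "id", "+", "id"], on_error))  # True: drops the stray "*"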

Parsing code with syntax errors

Parsing techniques are well described in CS literature. But the algorithms I know of require that the source is syntactically correct. If a syntax error is encountered, parsing is immediately aborted.
But IDEs (like Visual Studio) are typically able to provide meaningful code completion and other hints while you type, which means the syntax is often not in a valid state. E.g. you type an opening parenthesis in a function call, and the IDE provides parameter hints for the function, even though the syntax is invalid until the closing parenthesis is typed.
It seems to me this must rely on some kind of guessing or error-tolerant parser. Anyone know what techniques or algorithms are used for this?
The standard trick is to do some kind of error repair using the parsing machinery to help make predictions.
For table-based parsers (such as LALR or GLR), when a syntax error occurs, the parser was recently in some state in which the error had not yet happened. One can record the parse stack before each shift to remember this (or alternatively record reductions before the error). Given that an error has been encountered, one can inspect the saved parse stack to determine which tokens might come next (this is also how one can do code completion in terms of syntax tokens). A more sophisticated technique can invent the smallest possible sequence of tokens that allows a shift of the error token, or the smallest possible tree that could replace the error token and allow a shift on the next one.
This isn't so easy with recursive descent parsers because there isn't a lot of information lying around with which to make a prediction. For error recovery, a cheesy trick is to define error recovery points (e.g., places where a "stmt" might be accepted), continue scanning until a ";" is found, and accept an "error stmt" (see the sketch below). This doesn't help if you want code completion.
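For the recursive descent case, the recovery-point trick looks roughly like this (a hedged sketch; ParseError and parse_real_stmt are invented stand-ins, not from any particular parser):

class ParseError(Exception):
    pass

def parse_stmt(tokens, pos):
    try:
        return parse_real_stmt(tokens, pos)  # normal statement parsing (not shown)
    except ParseError:
        # Recovery point: scan forward to the next ";" ...
        while pos < len(tokens) and tokens[pos] != ";":
            pos += 1
        # ... and accept an "error stmt", resuming just after it.
        return ("error-stmt",), pos + 1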
Packrat is promising - it provides information on both successful and failed parsing attempts at key points, which can be recovered and used for smart error reporting, completion, hints and so on. For example, if the cursor is at a point where all parsing attempts are marked as failed in the cache, a list of the tokens that were tried can be offered as completion options.

Pattern Matching, F# vs Erlang

In Erlang, you are encouraged not to match patterns that you do not actually handle. For example:
case AnInt rem 10 of
    1 -> {ok, 10};
    9 -> {ok, 25}
end
is a style that is encouraged; any other value causes a runtime error (strictly a case_clause error rather than a badmatch). This is consistent with the "let it crash" philosophy in Erlang.
On the other hand, F# would issue an "incomplete pattern matching" in the equivalent F# code, like here.
The question: why wouldn't F# remove the warning, effectively augmenting every pattern match with a case equivalent to
| _ -> failwith "badmatch"
and use the "let it crash" philosophy?
Edit: Two interesting answers so far: either to avoid bugs that are likely when not handling all cases of an algebraic datatype; or because of the .Net platform. One way to find out which is to check OCaml. So, what is the default behaviour in OCaml?
Edit: To clear up a misunderstanding among .NET people who have no background in Erlang: the point of the Erlang philosophy is not to produce bad code that always crashes. "Let it crash" means: let some other process fix the error. Instead of writing the function so that it can handle all possible cases, let the caller (for example) handle the bad cases, which are thrown automatically. For those with a Java background, it is like the difference between a language with checked exceptions, where every exception a function may possibly throw must be declared, and a language in which functions may raise exceptions that are not explicitly declared.
F# (and other languages with pattern matching, like Haskell and O'Caml) does implicitly add a case that throws an exception.
In my opinion the most valuable reason for having complete pattern matches and paying attention to the warning, is that it makes it easy to refactor by extending your datatype, because the compiler will then warn you about code you haven't yet updated with the new case.
On the other hand, sometimes there genuinely are cases that should be left out, and then it's annoying to have to put in a catch-all case with what is often a poor error message. So it's a trade-off.
In answer to your edit, this is also a warning by default in O'Caml (and in Haskell with -Wall).
In most cases, particularly with algebraic datatypes, forgetting a case is likely to be an accident and not an intentional decision to ignore a case. In strongly typed functional languages, I think that most functions will be total, and should therefore handle every case. Even for partial functions, it's often ideal to throw a specific exception rather than to use a generic pattern matching failure (e.g. List.head throws an ArgumentException when given an empty list).
Thus, I think that it generally makes sense for the compiler to warn the developer. If you don't like this behavior, you can either add a catch-all pattern which itself throws an exception, or turn off or ignore the warning.
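For the question's example, the explicit catch-all would look like this in F# (a sketch; alternatively, warning FS0025 can be suppressed per file with #nowarn "25"):

// The Erlang example transcribed to F#; the catch-all makes the match
// exhaustive, so the compiler no longer warns.
let classify anint =
    match anint % 10 with
    | 1 -> Ok 10
    | 9 -> Ok 25
    | _ -> failwith "badmatch"  // the explicit "let it crash" case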
why wouldn't F# remove the warning
Interesting that you would ask this. Silently injecting sources of run-time error is absolutely against the philosophy behind F# and its relatives. It is considered to be a grotesque abomination. This family of languages are all about static checking, to the extent that the type system was fundamentally designed to facilitate exactly these kinds of static checks.
This stark difference in philosophy is precisely why F# and Erlang are so rarely compared and contrasted. "Never the twain shall meet" as they say.
So, what is the default behaviour in OCaml?
Same as F#: exhaustiveness and redundancy of pattern matches is checked at compile time and a warning is issued if a match is found to be suspect. Idiomatic style is also the same: you are expected to write your code such that these warnings do not appear.
This behaviour has nothing to do with .NET and, in fact, this functionality (from OCaml) was only implemented properly in F# quite recently.
For example, if you use a pattern in a let binding to extract the first element of a list because you know the list will always have at least one element:
let x::_ = myList
In this family of languages, that is almost always indicative of a design flaw. The correct solution is to represent your non-empty list using a type that makes it impossible to represent the empty list. Static type checking then proves that your list cannot be empty and, therefore, guarantees that this source of run-time errors has been completely eliminated from your code.
For example, you can represent a non-empty list as a tuple containing the head and the tail list. Your pattern match then becomes:
let x, _ = myList
This is exhaustive so the compiler is happy and does not issue a warning. This code cannot go wrong at run-time.
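The same idea can be made more explicit with a named type (a sketch; the type and field names are mine):

// A non-empty list: the type itself guarantees at least one element.
type NonEmpty<'a> = { Head : 'a; Tail : 'a list }

let first (ne : NonEmpty<'a>) = ne.Head  // total: no pattern can fail
let toList ne = ne.Head :: ne.Tail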
I became an advocate of this technique back in 2004, when I refactored about 1kLOC of commercial OCaml code that had been a major source of run-time errors in an application (even though they were explicit in the form of catch-all match cases that raised exceptions). My refactoring removed all of the sources of run-time errors from most of the code. The reliability of the entire application improved enormously. Moreover, we had wasted weeks hunting bugs via debugging, but my refactoring was completed within two days. So this technique really does pay dividends in the real world.
Erlang cannot check pattern matches for exhaustiveness because of its dynamic typing, unless you have a catch-all in every match, which is just silly. OCaml, on the other hand, can. OCaml also tries to push all issues that can be caught at compile time to compile time.
OCaml by default does warn you about incomplete matches. You can disable it by adding "p" to the "-w" flag. The idea with these warnings is that more often than not (at least in my experience) they are an indication of programmer error. Especially when all your patterns are really complex like Node (Node (Leaf 4) x) (Node y (Node Empty _)), it is easy to miss a case. When the programmer is sure that it cannot go wrong, explicitly adding a | _ -> assert false case is an unambiguous way to indicate that.
GHC turns these warnings off by default, but you can enable them with -fwarn-incomplete-patterns.
