On SWI Prolog's foreach/2, we read:
The compatibility of the "aggregate" library which contains the foreach/2 predicate is stated to be
Quintus, SICStus 4. The forall/2 is a SWI-Prolog built-in and
term_variables/3 is a SWI-Prolog built-in with different semantics.
From the SICStus Prolog documentation:
foreach(:Generator, :Goal)
for each proof of Generator in turn, we make a copy of Goal with
the appropriate substitution, then we execute these copies in
sequence. For example, foreach(between(1,3,I), p(I)) is equivalent
to p(1), p(2), p(3).
Note that this is not the same as forall/2. For example,
forall(between(1,3,I), p(I)) is equivalent to
\+ \+ p(1), \+ \+ p(2), \+ \+ p(3).
The trick in foreach/2 is to ensure that the variables of Goal
which do not occur in Generator are restored properly. (If there are
no such variables, you might as well use forall/2.)
Like forall/2, this predicate does a failure-driven loop over the
Generator. Unlike forall/2, the Goals are executed as an ordinary conjunction, and may succeed in more than one way.
Taking the example on the SWI Prolog page, without the final unification:
?- foreach(between(1,4,X), dif(X,Y)).
dif(Y, 4),
dif(Y, 3),
dif(Y, 2),
dif(Y, 1),
dif(Y, 1),
dif(Y, 2),
dif(Y, 1),
dif(Y, 1),
dif(Y, 3),
dif(Y, 2),
dif(Y, 1),
dif(Y, 1),
dif(Y, 2),
dif(Y, 1),
dif(Y, 1).
I'm not sure why there is the output of all the dif/2 instantiations and why there are repetitions of the same subgoal.
foreach/2 should be decide that the conjunction of multiple dif(Y,i) is true, as unbound variable Y is evidently dif from an integer i.
But now:
?- foreach(between(1,4,X), dif(X,Y)), Y = 5.
Y = 5.
Ok, so no output except for Y=5 because the goal succeeds. But what does the Y=5 change? After foreach/2, Prolog has already been able to decide to that foreach/2 is true (given the state of Y at the time foreach/2 was run), so adding Y=5 should not change anything.
But then:
?- foreach(between(1,4,X), dif(X,Y)), Y = 2.
false.
A later unification changes the outcome of foreach/2. How?
I thought that freeze/2 on Y might be involved to make the situation interesting, as in, we are really computing:
freeze(Y,foreach(between(1,4,X), dif(X,Y))).
This would more or less explain the printouts. For example:
?- freeze(Y,foreach(between(1,4,X), dif(X,Y))).
freeze(Y, foreach(between(1, 4, X), dif(X, Y))).
?- freeze(Y,foreach(between(1,4,X), dif(X,Y))), Y=5.
Y = 5.
?- freeze(Y,foreach(between(1,4,X), dif(X,Y))), Y=2.
false.
Is this what's going on?
The "freezing" behaviour that you observe is due to dif/2, not foreach/2.
The predicate dif/2 ensures that its two arguments are not identical and will not become identical. So if the two arguments might become identical, then dif/2 is suspended until it can decide whether they are identical or not.
Related
Suppose I have a set of Z3 expressions:
exprs = [A, B, C, D, E, F]
I want to check whether any of them are equivalent and, if so, determine which. The most obvious way is just an N×N comparison (assume exprs is composed of some arbitrarily-complicated boolean expressions instead of the simple numbers in the example):
from z3 import *
exprs = [IntVal(1), IntVal(2), IntVal(3), IntVal(4), IntVal(3)]
for i in range(len(exprs) - 1):
for j in range(i+1, len(exprs)):
s = Solver()
s.add(exprs[i] != exprs[j])
if unsat == s.check():
quit(f'{(i, j)} are equivalent')
Is this the most efficient method, or is there some way of quantifying over a set of arbitrary expressions? It would also be acceptable for this to be a two-step process where I first learn whether any of the expressions are equivalent, and then do a longer check to see which specific expressions are equivalent.
As with anything performance related, the answer is "it depends." Before delving into options, though, note that z3 supports Distinct, which can check whether any number of expressions are all different: https://z3prover.github.io/api/html/namespacez3py.html#a9eae89dd394c71948e36b5b01a7f3cd0
Though of course, you've a more complicated query here. I think the following two algorithms are your options:
Explicit pairwise checks
Depending on your constraints, the simplest thing to do might be to call the solver multiple times, as you alluded to. To start with, use Distinct and make a call to see if its negation is satisfiable. (i.e., check if some of these expressions can be made equal.) If the answer comes unsat, you know you can't make any equal. Otherwise, go with your loop as before till you hit the pair that can be made equal to each other.
Doing multiple checks together
You can also solve your problem using a modified algorithm, though with more complicated constraints, and hopefully faster.
To do so, create Nx(N-1)/2 booleans, one for each pair, which is equal to that pair not being equivalent. To illustrate, let's say you have the expressions A, B, and C. Create:
X0 = A != B
X1 = A != C
X2 = B != C
Now loop:
Ask if X0 || X1 || X2 is satisfiable.
If the solver comes back unsat, then all of A, B, and C are equivalent. You're done.
If the solver comes back sat, then at least one of the disjuncts X0, X1 or X2 is true. Use the model the solver gives you to determine which ones are false, and continue with those until you get unsat.
Here's a simple concrete example. Let's say the expressions are {1, 1, 2}:
Ask if 1 != 1 || 1 != 2 || 1 != 2 is sat.
It'll be sat. In the model, you'll have at least one of these disjuncts true, and it won't be the first one! In this case the last two. Drop them from your list, leaving you with 1 != 1.
Ask again if 1 != 1 is satisfiable. The answer will be unsat and you're done.
In the worst case you'll make Nx(N-1)/2 calls to the solver, if it happens that none of them can be made equivalent with you eliminating one at a time. This is where the first call to Not (Distinct(A, B, C, ...)) is important; i.e., you will start knowing that some pair is equivalent; hopefully iterating faster.
Summary
My initial hunch is that the second algorithm above will be more performant; though it really depends on what your expressions really look like. I suggest some experimentation to find out what works the best in your particular case.
A Python solution
Here's the algorithm coded:
from z3 import *
exprs = [IntVal(i) for i in [1, 2, 3, 4, 3, 2, 10, 10, 1]]
s = Solver()
bools = []
for i in range(len(exprs) - 1):
for j in range(i+1, len(exprs)):
b = Bool(f'eq_{i}_{j}')
bools.append(b)
s.add(b == (exprs[i] != exprs[j]))
# First check if they're all distinct
s.push()
s.add(Not(Distinct(*exprs)))
if(s.check()== unsat):
quit("They're all distinct")
s.pop()
while True:
# Be defensive, bools should not ever become empty here.
if not bools:
quit("This shouldn't have happened! Something is wrong.")
if s.check(Or(*bools)) == unsat:
print("Equivalent expressions:")
for b in bools:
print(f' {b}')
quit('Done')
else:
# Use the model to keep bools that are false:
m = s.model()
bools = [b for b in bools if not(m.evaluate(b, model_completion=True))]
This prints:
Equivalent expressions:
eq_0_8
eq_1_5
eq_2_4
eq_6_7
Done
which looks correct to me! Note that this should work correctly even if you have 3 (or more) items that are equivalent; of course you'll see the output one-pair at a time. So, some post-processing might be needed to clean that up, depending on the needs of the upstream algorithm.
Note that I only tested this for a few test values; there might be corner case gotchas. Please do a more thorough test and report if there're any bugs!
I am writing a parser for a query engine. My parser DCG query is not deterministic.
I will be using the parser in a relational manner, to both check and synthesize queries.
Is it appropriate for a parser DCG to not be deterministic?
In code:
If I want to be able to use query/2 both ways, does it require that
?- phrase(query, [q,u,e,r,y]).
true;
false.
or should I be able to obtain
?- phrase(query, [q,u,e,r,y]).
true.
nevertheless, given that the first snippet would require me to use it as such
?- bagof(X, phrase(query, [q,u,e,r,y]), [true]).
true.
when using it to check a formula?
The first question to ask yourself, is your grammar deterministic, or in the terminology of grammars, unambiguous. This is not asking if your DCG is deterministic, but if the grammar is unambiguous. That can be answered with basic parsing concepts, no use of DCG is needed to answer that question. In other words, is there only one way to parse a valid input. The standard book for this is "Compilers : principles, techniques, & tools" (WorldCat)
Now you are actually asking about three different uses for parsing.
A recognizer.
A parser.
A generator.
If your grammar is unambiguous then
For a recognizer the answer should only be true for valid input that can be parsed and false for invalid input.
For the parser it should be deterministic as there is only one way to parse the input. The difference between a parser and an recognizer is that a recognizer only returns true or false and a parser will return something more, typically an abstract syntax tree.
For the generator, it should be semi-deterministic so that it can generate multiple results.
Can all of this be done with one, DCG, yes. The three different ways are dependent upon how you use the input and output of the DCG.
Here is an example with a very simple grammar.
The grammar is just an infix binary expression with one operator and two possible operands. The operator is (+) and the operands are either (1) or (2).
expr(expr(Operand_1,Operator,Operand_2)) -->
operand(Operand_1),
operator(Operator),
operand(Operand_2).
operand(operand(1)) --> "1".
operand(operand(2)) --> "2".
operator(operator(+)) --> "+".
recognizer(Input) :-
string_codes(Input,Codes),
DCG = expr(_),
phrase(DCG,Codes,[]).
parser(Input,Ast) :-
string_codes(Input,Codes),
DCG = expr(Ast),
phrase(DCG,Codes,[]).
generator(Generated) :-
DCG = expr(_),
phrase(DCG,Codes,[]),
string_codes(Generated,Codes).
:- begin_tests(expr).
recognizer_test_case_success("1+1").
recognizer_test_case_success("1+2").
recognizer_test_case_success("2+1").
recognizer_test_case_success("2+2").
test(recognizer,[ forall(recognizer_test_case_success(Input)) ] ) :-
recognizer(Input).
recognizer_test_case_fail("2+3").
test(recognizer,[ forall(recognizer_test_case_fail(Input)), fail ] ) :-
recognizer(Input).
parser_test_case_success("1+1",expr(operand(1),operator(+),operand(1))).
parser_test_case_success("1+2",expr(operand(1),operator(+),operand(2))).
parser_test_case_success("2+1",expr(operand(2),operator(+),operand(1))).
parser_test_case_success("2+2",expr(operand(2),operator(+),operand(2))).
test(parser,[ forall(parser_test_case_success(Input,Expected_ast)) ] ) :-
parser(Input,Ast),
assertion( Ast == Expected_ast).
parser_test_case_fail("2+3").
test(parser,[ forall(parser_test_case_fail(Input)), fail ] ) :-
parser(Input,_).
test(generator,all(Generated == ["1+1","1+2","2+1","2+2"]) ) :-
generator(Generated).
:- end_tests(expr).
The grammar is unambiguous and has only 4 valid strings which are all unique.
The recognizer is deterministic and only returns true or false.
The parser is deterministic and returns a unique AST.
The generator is semi-deterministic and returns all 4 valid unique strings.
Example run of the test cases.
?- run_tests.
% PL-Unit: expr ........... done
% All 11 tests passed
true.
To expand a little on the comment by Daniel
As Daniel notes
1 + 2 + 3
can be parsed as
(1 + 2) + 3
or
1 + (2 + 3)
So 1+2+3 is an example as you said is specified by a recursive DCG and as I noted a common way out of the problem is to use parenthesizes to start a new context. What is meant by starting a new context is that it is like getting a new clean slate to start over again. If you are creating an AST, you just put the new context, items in between the parenthesizes, as a new subtree at the current node.
With regards to write_canonical/1, this is also helpful but be aware of left and right associativity of operators. See Associative property
e.g.
+ is left associative
?- write_canonical(1+2+3).
+(+(1,2),3)
true.
^ is right associative
?- write_canonical(2^3^4).
^(2,^(3,4))
true.
i.e.
2^3^4 = 2^(3^4) = 2^81 = 2417851639229258349412352
2^3^4 != (2^3)^4 = 8^4 = 4096
The point of this added info is to warn you that grammar design is full of hidden pitfalls and if you have not had a rigorous class in it and done some of it you could easily create a grammar that looks great and works great and then years latter is found to have a serious problem. While Python was not ambiguous AFAIK, it did have grammar issues, it had enough issues that when Python 3 was created, many of the issues were fixed. So Python 3 is not backward compatible with Python 2 (differences). Yes they have made changes and libraries to make it easier to use Python 2 code with Python 3, but the point is that the grammar could have used a bit more analysis when designed.
The only reason why code should be non-deterministic is that your question has multiple answers. In that case, you'd of course want your query to have multiple solutions. Even then, however, you'd like it to not leave a choice point after the last solution, if at all possible.
Here is what I mean:
"What is the smaller of two numbers?"
min_a(A, B, B) :- B < A.
min_a(A, B, A) :- A =< B.
So now you ask, "what is the smaller of 1 and 2" and the answer you expect is "1":
?- min_a(1, 2, Min).
Min = 1.
?- min_a(2, 1, Min).
Min = 1 ; % crap...
false.
?- min_a(2, 1, 2).
false.
?- min_a(2, 1, 1).
true ; % crap...
false.
So that's not bad code but I think it's still crap. This is why, for the smaller of two numbers, you'd use something like the min() function in SWI-Prolog.
Similarly, say you want to ask, "What are the even numbers between 1 and 10"; you write the query:
?- between(1, 10, X), X rem 2 =:= 0.
X = 2 ;
X = 4 ;
X = 6 ;
X = 8 ;
X = 10.
... and that's fine, but if you then ask for the numbers that are multiple of 3, you get:
?- between(1, 10, X), X rem 3 =:= 0.
X = 3 ;
X = 6 ;
X = 9 ;
false. % crap...
The "low-hanging fruit" are the cases where you as a programmer would see that there cannot be non-determinism, but for some reason your Prolog is not able to deduce that from the code you wrote. In most cases, you can do something about it.
On to your actual question. If you can, write your code so that there is non-determinism only if there are multiple answers to the question you'll be asking. When you use a DCG for both parsing and generating, this sometimes means you end up with two code paths. It feels clumsy but it is easier to write, to read, to understand, and probably to make efficient. As a word of caution, take a look at this question. I can't know that for sure, but the problems that OP is running into are almost certainly caused by unnecessary non-determinism. What probably happens with larger inputs is that a lot of choice points are left behind, there is a lot of memory that cannot be reclaimed, a lot of processing time going into book keeping, huge solution trees being traversed only to get (as expected) no solutions.... you get the point.
For examples of what I mean, you can take a look at the implementation of library(dcg/basics) in SWI-Prolog. Pay attention to several things:
The documentation is very explicit about what is deterministic, what isn't, and how non-determinism is supposed to be useful to the client code;
The use of cuts, where necessary, to get rid of choice points that are useless;
The implementation of number//1 (towards the bottom) that can "generate extract a number".
(Hint: use the primitives in this library when you write your own parser!)
I hope you find this unnecessarily long answer useful.
We are giving students some exercises, where their solutions are evaluated with maxima.
The answer involves the unit step function. The evaluation in maxima seems to go okay, except that there seems to be missing some algebraic rules on the unit_step functions.
For example is(unit_step(x)*unit_step(x) = unit_step(x)) evaluates to false. It is quite unlikely that the student gives the answer in such a form, but still we don't want to have the possibility that the student gives a good answer, that is evaluated as incorrect.
Below is a screenshot of an answer we try to evaluate with maxima involving the unit_step function (that we defined as u):
Maxima doesn't know much about unit_step at present (circa Maxima 5.41). This is just a shortcoming, there's no reason for it, except that nobody has gotten around to doing the work. That said, it's not too hard to make some progress.
The simplifier for multiplication merges identical terms into powers:
(%i3) unit_step(x)*unit_step(x);
2
(%o3) unit_step (x)
So let's define a simplifier rule which reduces positive powers of unit_step. (I was going to say positive integer powers, but a moment's thought shows that the same identity holds for noninteger positive powers as well.)
(%i4) matchdeclare (aa, lambda ([e], e > 0)) $
(%i5) matchdeclare (xx, all) $
(%i6) tellsimpafter (unit_step(xx)^aa, unit_step(xx));
(%o6) [^rule1, simpexpt]
Let's try it.
(%i7) unit_step(x)*unit_step(x);
(%o7) unit_step(x)
(%i8) is (unit_step(x)*unit_step(x) = unit_step(x));
(%o8) true
(%i9) unit_step(t - 5)^(1/4);
(%o9) unit_step(t - 5)
(%i10) assume (m > 0);
(%o10) [m > 0]
(%i11) unit_step(2*u + 1)^m;
(%o11) unit_step(2 u + 1)
So far, so good. Of course this is just one identity and there are others that could be useful. Since this rule is not built-in, one would have to load that definition in order to make use of it; that would be bothersome if you intend for others to use this.
For the record, the only simplification for unit_step which I found in the Maxima source code is in share/contrib/integration/abs_integrate.mac, which contains a function unit_step_mult_simp, which applies the identity unit_step(a)*unit_step(b) --> unit_step(min(a, b)).
I want to make an arithmetic solver in Prolog that can have +,-,*,^ operations on numbers >= 2. It should also be possible to have a variable x in there. The input should be a prefix expression in a list.
I have made a program that parses an arithmetic expression in prefix format into a syntax tree. So that:
?- parse([+,+,2,9,*,3,x],Tree).
Tree = plus(plus(num(2), num(9)), mul(num(3), var(x))) .
(1) At this stage, I want to extend this program to be able to solve it for a given x value. This should be done by adding another predicate evaluate(Tree, Value, Solution) which given a value for the unknown x, calculates the solution.
Example:
?- parse([*, 2, ^, x, 3],Tree), evaluate(Ast, 2, Solution).
Tree = mul(num(2), pow(var(x), num(3))) ,
Solution = 16.
I'm not sure how to solve this problem due to my lack of Prolog skills, but I need a way of setting the var(x) to num(2) like in this example (because x = 2). Maybe member in Prolog can be used to do this. Then I have to solve it using perhaps is/2
Edit: My attempt to solving it. Getting error: 'Undefined procedure: evaluate/3 However, there are definitions for: evaluate/5'
evaluate(plus(A,B),Value,Sol) --> evaluate(A,AV,Sol), evaluate(B,BV,Sol), Value is AV+BV.
evaluate(mul(A,B),Value,Sol) --> evaluate(A,AV,Sol), evaluate(B,BV,Sol), Value is AV*BV.
evaluate(pow(A,B),Value,Sol) --> evaluate(A,AV,Sol), evaluate(B,BV,Sol), Value is AV^BV.
evaluate(num(Num),Value,Sol) --> number(Num).
evaluate(var(x),Value,Sol) --> number(Value).
(2) I'd also want to be able to express it in postfix form. Having a predicate postfixform(Tree, Postfixlist)
Example:
?- parse([+, *, 2, x, ^, x, 5 ],Tree), postfix(Tree,Postfix).
Tree = plus(mul(num(2), var(x)), pow(var(x), num(5))) ,
Postfix = [2, x, *, x, 5, ^, +].
Any help with (1) and (2) would be highly appreciated!
You don't need to use a grammar for this, as you are doing. You should use normal rules.
This is the pattern you need to follow.
evaluate(plus(A,B),Value,Sol) :-
evaluate(A, Value, A2),
evaluate(B, Value, B2),
Sol is A2+B2.
And
evaluate(num(X),_Value,Sol) :- Sol = X.
evaluate(var(x),Value,Sol) :- Sol = Value.
I'm trying to teach myself Prolog. Below, I've written some code that I think should return all paths between nodes in an undirected graph... but it doesn't. I'm trying to understand why this particular code doesn't work (which I think differentiates this question from similar Prolog pathfinding posts). I'm running this in SWI-Prolog. Any clues?
% Define a directed graph (nodes may or may not be "room"s; edges are encoded by "leads_to" predicates).
room(kitchen).
room(living_room).
room(den).
room(stairs).
room(hall).
room(bathroom).
room(bedroom1).
room(bedroom2).
room(bedroom3).
room(studio).
leads_to(kitchen, living_room).
leads_to(living_room, stairs).
leads_to(living_room, den).
leads_to(stairs, hall).
leads_to(hall, bedroom1).
leads_to(hall, bedroom2).
leads_to(hall, bedroom3).
leads_to(hall, studio).
leads_to(living_room, outside). % Note "outside" is the only node that is not a "room"
leads_to(kitchen, outside).
% Define the indirection of the graph. This is what we'll work with.
neighbor(A,B) :- leads_to(A, B).
neighbor(A,B) :- leads_to(B, A).
Iff A --> B --> C --> D is a loop-free path, then
path(A, D, [B, C])
should be true. I.e., the third argument contains the intermediate nodes.
% Base Rule (R0)
path(X,Y,[]) :- neighbor(X,Y).
% Inductive Rule (R1)
path(X,Y,[Z|P]) :- not(X == Y), neighbor(X,Z), not(member(Z, P)), path(Z,Y,P).
Yet,
?- path(bedroom1, stairs, P).
is false. Why? Shouldn't we get a match to R1 with
X = bedroom1
Y = stairs
Z = hall
P = []
since,
?- neighbor(bedroom1, hall).
true.
?- not(member(hall, [])).
true.
?- path(hall, stairs, []).
true .
?
In fact, if I evaluate
?- path(A, B, P).
I get only the length-1 solutions.
Welcome to Prolog! The problem, essentially, is that when you get to not(member(Z, P)) in R1, P is still a pure variable, because the evaluation hasn't gotten to path(Z, Y, P) to define it yet. One of the surprising yet inspiring things about Prolog is that member(Ground, Var) will generate lists that contain Ground and unify them with Var:
?- member(a, X).
X = [a|_G890] ;
X = [_G889, a|_G893] ;
X = [_G889, _G892, a|_G896] .
This has the confusing side-effect that checking for a value in an uninstantiated list will always succeed, which is why not(member(Z, P)) will always fail, causing R1 to always fail. The fact that you get all the R0 solutions and none of the R1 solutions is a clue that something in R1 is causing it to always fail. After all, we know R0 works.
If you swap these two goals, you'll get the first result you want:
path(X,Y,[Z|P]) :- not(X == Y), neighbor(X,Z), path(Z,Y,P), not(member(Z, P)).
?- path(bedroom1, stairs, P).
P = [hall]
If you ask for another solution, you'll get a stack overflow. This is because after the change we're happily generating solutions with cycles as quickly as possible with path(Z,Y,P), only to discard them post-facto with not(member(Z, P)). (Incidentally, for a slight efficiency gain we can switch to memberchk/2 instead of member/2. Of course doing the wrong thing faster isn't much help. :)
I'd be inclined to convert this to a breadth-first search, which in Prolog would imply adding an "open set" argument to contain solutions you haven't tried yet, and at each node first trying something in the open set and then adding that node's possibilities to the end of the open set. When the open set is extinguished, you've tried every node you could get to. For some path finding problems it's a better solution than depth first search anyway. Another thing you could try is separating the path into a visited and future component, and only checking the visited component. As long as you aren't generating a cycle in the current step, you can be assured you aren't generating one at all, there's no need to worry about future steps.
The way you worded the question leads me to believe you don't want a complete solution, just a hint, so I think this is all you need. Let me know if that's not right.