Simplify product of roots containing goniometric functions - maxima

In Maxima, I am trying to simplify the expression
sqrt(1 - sin(x)) * sqrt(1 + sin(x))
to yield
cos(x)
I properly restricted the definition of x
declare(x, real) $
assume(x > 0, x < %pi/2) $
and tried several simplification commands including radcan, trigsimp, trigreduce and trigexpand, but without any success. How can this be done?

Try trigsimp(rootscontract(expr))

The restrictions you assert do not uniquely determine the simplified result you request.
It would seem both harmless and obviously unnecessary to declare or assume the following:
declare(9, real)
assume(9>0)
and yet, sqrt(9) is still the set {-3, +3}, mathematically speaking, as opposed to "what I learned in 6th grade".
Stavros' suggestion give |cos(x)|, which is not quite what the original questioner wanted.
Another way of getting the same result, one which may more explicitly exhibit the -in general falseness - of the result, is to square and then use the semi-bogus sqrt, that attempts to pick the positive answer.
trigsimp (sqrt(expand(expr^2)));
If you think this is a way of simplifying expr, note that it changes -3 to 3.

Related

Floor and Ceiling Function implementation in Z3

I have tried to implement Floor and Ceiling Function as defined in the following link
https://math.stackexchange.com/questions/3619044/floor-or-ceiling-function-encoding-in-first-order-logic/3619320#3619320
But Z3 query returning counterexample.
Floor Function
_X=Real('_X')
_Y=Int('_Y')
_W=Int('_W')
_n=Int('_n')
_Floor=Function('_Floor',RealSort(),IntSort())
..
_s.add(_X>=0)
_s.add(_Y>=0)
_s.add(Implies(_Floor(_X)==_Y,And(Or(_Y==_X,_Y<_X),ForAll(_W,Implies(And(_W>=0,_W<_X),And(_W ==_Y,_W<_Y))))))
_s.add(Implies(And(Or(_Y==_X,_Y<_X),ForAll(_W,Implies(And(_W>=0,_W<_X),And(_W==_Y,_W<_Y))),_Floor(_X)==_Y))
_s.add(Not(_Floor(0.5)==0))
Expected Result - Unsat
Actual Result - Sat
Ceiling Function
_X=Real('_X')
_Y=Int('_Y')
_W=Int('_W')
_Ceiling=Function('_Ceiling',RealSort(),IntSort())
..
..
_s.add(_X>=0)
_s.add(_Y>=0)
_s.add(Implies(_Ceiling(_X)==_Y,And(Or(_Y==_X,_Y<_X),ForAll(_W,Implies(And(_W>=0,_W<_X),And(_W ==_Y,_Y<_W))))))
_s.add(Implies(And(Or(_Y==_X,_Y<_X),ForAll(_W,Implies(And(_W>=0,_W<_X),And(_W==_Y,_Y<_W)))),_Ceiling(_X)==_Y))
_s.add(Not(_Ceilng(0.5)==1))
Expected Result - Unsat
Actual Result - Sat
[Your encoding doesn't load to z3, it gives a syntax error even after eliminating the '..', as your call to Implies needs an extra argument. But I'll ignore all that.]
The short answer is, you can't really do this sort of thing in an SMT-Solver. If you could, then you can solve arbitrary Diophantine equations. Simply cast it in terms of Reals, solve it (there is a decision procedure for Reals), and then add the extra constraint that the result is an integer by saying Floor(solution) = solution. So, by this argument, you can see that modeling such functions will be beyond the capabilities of an SMT solver.
See this answer for details: Get fractional part of real in QF_UFNRA
Having said that, this does not mean you cannot code this up in Z3. It just means that it will be more or less useless. Here's how I would go about it:
from z3 import *
s = Solver()
Floor = Function('Floor',RealSort(),IntSort())
r = Real('R')
f = Int('f')
s.add(ForAll([r, f], Implies(And(f <= r, r < f+1), Floor(r) == f)))
Now, if I do this:
s.add(Not(Floor(0.5) == 0))
print(s.check())
you'll get unsat, which is correct. If you do this instead:
s.add(Not(Floor(0.5) == 1))
print(s.check())
you'll see that z3 simply loops forever. To make this usefull, you'd want the following to work as well:
test = Real('test')
s.add(test == 2.4)
result = Int('result')
s.add(Floor(test) == result)
print(s.check())
but again, you'll see that z3 simply loops forever.
So, bottom line: Yes, you can model such constructs, and z3 will correctly answer the simplest of queries. But with anything interesting, it'll simply loop forever. (Essentially whenever you'd expect sat and most of the unsat scenarios unless they can be constant-folded away, I'd expect z3 to simply loop.) And there's a very good reason for that, as I mentioned: Such theories are just not decidable and fall well out of the range of what an SMT solver can do.
If you are interested in modeling such functions, your best bet is to use a more traditional theorem prover, like Isabelle, Coq, ACL2, HOL, HOL-Light, amongst others. They are much more suited for working on these sorts of problems. And also, give a read to Get fractional part of real in QF_UFNRA as it goes into some of the other details of how you can go about modeling such functions using non-linear real arithmetic.

Is it appropriate for a parser DCG to not be deterministic?

I am writing a parser for a query engine. My parser DCG query is not deterministic.
I will be using the parser in a relational manner, to both check and synthesize queries.
Is it appropriate for a parser DCG to not be deterministic?
In code:
If I want to be able to use query/2 both ways, does it require that
?- phrase(query, [q,u,e,r,y]).
true;
false.
or should I be able to obtain
?- phrase(query, [q,u,e,r,y]).
true.
nevertheless, given that the first snippet would require me to use it as such
?- bagof(X, phrase(query, [q,u,e,r,y]), [true]).
true.
when using it to check a formula?
The first question to ask yourself, is your grammar deterministic, or in the terminology of grammars, unambiguous. This is not asking if your DCG is deterministic, but if the grammar is unambiguous. That can be answered with basic parsing concepts, no use of DCG is needed to answer that question. In other words, is there only one way to parse a valid input. The standard book for this is "Compilers : principles, techniques, & tools" (WorldCat)
Now you are actually asking about three different uses for parsing.
A recognizer.
A parser.
A generator.
If your grammar is unambiguous then
For a recognizer the answer should only be true for valid input that can be parsed and false for invalid input.
For the parser it should be deterministic as there is only one way to parse the input. The difference between a parser and an recognizer is that a recognizer only returns true or false and a parser will return something more, typically an abstract syntax tree.
For the generator, it should be semi-deterministic so that it can generate multiple results.
Can all of this be done with one, DCG, yes. The three different ways are dependent upon how you use the input and output of the DCG.
Here is an example with a very simple grammar.
The grammar is just an infix binary expression with one operator and two possible operands. The operator is (+) and the operands are either (1) or (2).
expr(expr(Operand_1,Operator,Operand_2)) -->
operand(Operand_1),
operator(Operator),
operand(Operand_2).
operand(operand(1)) --> "1".
operand(operand(2)) --> "2".
operator(operator(+)) --> "+".
recognizer(Input) :-
string_codes(Input,Codes),
DCG = expr(_),
phrase(DCG,Codes,[]).
parser(Input,Ast) :-
string_codes(Input,Codes),
DCG = expr(Ast),
phrase(DCG,Codes,[]).
generator(Generated) :-
DCG = expr(_),
phrase(DCG,Codes,[]),
string_codes(Generated,Codes).
:- begin_tests(expr).
recognizer_test_case_success("1+1").
recognizer_test_case_success("1+2").
recognizer_test_case_success("2+1").
recognizer_test_case_success("2+2").
test(recognizer,[ forall(recognizer_test_case_success(Input)) ] ) :-
recognizer(Input).
recognizer_test_case_fail("2+3").
test(recognizer,[ forall(recognizer_test_case_fail(Input)), fail ] ) :-
recognizer(Input).
parser_test_case_success("1+1",expr(operand(1),operator(+),operand(1))).
parser_test_case_success("1+2",expr(operand(1),operator(+),operand(2))).
parser_test_case_success("2+1",expr(operand(2),operator(+),operand(1))).
parser_test_case_success("2+2",expr(operand(2),operator(+),operand(2))).
test(parser,[ forall(parser_test_case_success(Input,Expected_ast)) ] ) :-
parser(Input,Ast),
assertion( Ast == Expected_ast).
parser_test_case_fail("2+3").
test(parser,[ forall(parser_test_case_fail(Input)), fail ] ) :-
parser(Input,_).
test(generator,all(Generated == ["1+1","1+2","2+1","2+2"]) ) :-
generator(Generated).
:- end_tests(expr).
The grammar is unambiguous and has only 4 valid strings which are all unique.
The recognizer is deterministic and only returns true or false.
The parser is deterministic and returns a unique AST.
The generator is semi-deterministic and returns all 4 valid unique strings.
Example run of the test cases.
?- run_tests.
% PL-Unit: expr ........... done
% All 11 tests passed
true.
To expand a little on the comment by Daniel
As Daniel notes
1 + 2 + 3
can be parsed as
(1 + 2) + 3
or
1 + (2 + 3)
So 1+2+3 is an example as you said is specified by a recursive DCG and as I noted a common way out of the problem is to use parenthesizes to start a new context. What is meant by starting a new context is that it is like getting a new clean slate to start over again. If you are creating an AST, you just put the new context, items in between the parenthesizes, as a new subtree at the current node.
With regards to write_canonical/1, this is also helpful but be aware of left and right associativity of operators. See Associative property
e.g.
+ is left associative
?- write_canonical(1+2+3).
+(+(1,2),3)
true.
^ is right associative
?- write_canonical(2^3^4).
^(2,^(3,4))
true.
i.e.
2^3^4 = 2^(3^4) = 2^81 = 2417851639229258349412352
2^3^4 != (2^3)^4 = 8^4 = 4096
The point of this added info is to warn you that grammar design is full of hidden pitfalls and if you have not had a rigorous class in it and done some of it you could easily create a grammar that looks great and works great and then years latter is found to have a serious problem. While Python was not ambiguous AFAIK, it did have grammar issues, it had enough issues that when Python 3 was created, many of the issues were fixed. So Python 3 is not backward compatible with Python 2 (differences). Yes they have made changes and libraries to make it easier to use Python 2 code with Python 3, but the point is that the grammar could have used a bit more analysis when designed.
The only reason why code should be non-deterministic is that your question has multiple answers. In that case, you'd of course want your query to have multiple solutions. Even then, however, you'd like it to not leave a choice point after the last solution, if at all possible.
Here is what I mean:
"What is the smaller of two numbers?"
min_a(A, B, B) :- B < A.
min_a(A, B, A) :- A =< B.
So now you ask, "what is the smaller of 1 and 2" and the answer you expect is "1":
?- min_a(1, 2, Min).
Min = 1.
?- min_a(2, 1, Min).
Min = 1 ; % crap...
false.
?- min_a(2, 1, 2).
false.
?- min_a(2, 1, 1).
true ; % crap...
false.
So that's not bad code but I think it's still crap. This is why, for the smaller of two numbers, you'd use something like the min() function in SWI-Prolog.
Similarly, say you want to ask, "What are the even numbers between 1 and 10"; you write the query:
?- between(1, 10, X), X rem 2 =:= 0.
X = 2 ;
X = 4 ;
X = 6 ;
X = 8 ;
X = 10.
... and that's fine, but if you then ask for the numbers that are multiple of 3, you get:
?- between(1, 10, X), X rem 3 =:= 0.
X = 3 ;
X = 6 ;
X = 9 ;
false. % crap...
The "low-hanging fruit" are the cases where you as a programmer would see that there cannot be non-determinism, but for some reason your Prolog is not able to deduce that from the code you wrote. In most cases, you can do something about it.
On to your actual question. If you can, write your code so that there is non-determinism only if there are multiple answers to the question you'll be asking. When you use a DCG for both parsing and generating, this sometimes means you end up with two code paths. It feels clumsy but it is easier to write, to read, to understand, and probably to make efficient. As a word of caution, take a look at this question. I can't know that for sure, but the problems that OP is running into are almost certainly caused by unnecessary non-determinism. What probably happens with larger inputs is that a lot of choice points are left behind, there is a lot of memory that cannot be reclaimed, a lot of processing time going into book keeping, huge solution trees being traversed only to get (as expected) no solutions.... you get the point.
For examples of what I mean, you can take a look at the implementation of library(dcg/basics) in SWI-Prolog. Pay attention to several things:
The documentation is very explicit about what is deterministic, what isn't, and how non-determinism is supposed to be useful to the client code;
The use of cuts, where necessary, to get rid of choice points that are useless;
The implementation of number//1 (towards the bottom) that can "generate extract a number".
(Hint: use the primitives in this library when you write your own parser!)
I hope you find this unnecessarily long answer useful.

How do I tell Maxima about valid approximations of subexpressions of a large expression?

I have a fairly large expression that involves a lot of subexpressions of the form (100*A^3 + 200*A^2 + 100*A)*x or (-A^2 - A)*y or (100*A^2 + 100*A)*z
I know, but I don't know how to tell Maxima this, that it in this case is valid to make the approximation A+1 ~ A, thereby effectively removing anything but the highest power of A in each coefficient.
I'm now looking for functions, tools, or methods that I can use to guide Maxima in dropping various terms that aren't important.
I have attempted with subst, but that requires me to specify each and every factor separately, because:
subst([A+1=B], (A+2)*(A+1)*2);
subst([A+1=B], (A+2)*(A*2+2));
(%o1) 2*(A+2)*B
(%o2) (A+2)*(2*A+2)
(that is, I need to add one expression for each slightly different variant)
I tried with ratsimp, but that's too eager to change every occurrence:
ratsubst(B, A+1, A*(A+1)*2);
ratsubst(B, A+1, A*(A*2+2));
(%o3) 2*B^2-2*B
(%o4) 2*B^2-2*B
which isn't actually simpler, as I would have preferred the answer to have been given as 2*B^2.
In another answer, (https://stackoverflow.com/a/22695050/5999883) the functions let and letsimp were suggested for the task of substituting values, but I fail to get them to really do anything:
x:(A+1)*A;
let ( A+1, B );
letsimp(x);
(x)A*(A+1)
(%o6) A+1 --\> B
(%o7) A^2+A
Again, I'd like to approximate this expression to A^2 (B^2, whatever it's called).
I understand that this is, in general, a hard problem (is e.g. A^2 + 10^8*A still okay to approximate as A^2?) but I think that what I'm looking for is a function or method of calculation that would be a little bit smarter than subst and can recognize that the same substitution could be done in the expression A^2+A as in the expression 100*A^2+100*A or -A^2-A instead of making me create a list of three (or twenty) individual substitutions when calling subst. The "nice" part of the full expression that I'm working on is that each of these A factors are of the form k*A^n*(A+1)^m for various small integers n, m, so I never actually end up with the degenerate case mentioned above.
(I was briefly thinking of re-expressing my expression as a polynomial in A, but this will not work as the only valid approximation of the expression (A^3+A^2+A)*x + y is A^3*x + y -- I know nothing about the relative sizes of x and y.

Pathfinding in Prolog

I'm trying to teach myself Prolog. Below, I've written some code that I think should return all paths between nodes in an undirected graph... but it doesn't. I'm trying to understand why this particular code doesn't work (which I think differentiates this question from similar Prolog pathfinding posts). I'm running this in SWI-Prolog. Any clues?
% Define a directed graph (nodes may or may not be "room"s; edges are encoded by "leads_to" predicates).
room(kitchen).
room(living_room).
room(den).
room(stairs).
room(hall).
room(bathroom).
room(bedroom1).
room(bedroom2).
room(bedroom3).
room(studio).
leads_to(kitchen, living_room).
leads_to(living_room, stairs).
leads_to(living_room, den).
leads_to(stairs, hall).
leads_to(hall, bedroom1).
leads_to(hall, bedroom2).
leads_to(hall, bedroom3).
leads_to(hall, studio).
leads_to(living_room, outside). % Note "outside" is the only node that is not a "room"
leads_to(kitchen, outside).
% Define the indirection of the graph. This is what we'll work with.
neighbor(A,B) :- leads_to(A, B).
neighbor(A,B) :- leads_to(B, A).
Iff A --> B --> C --> D is a loop-free path, then
path(A, D, [B, C])
should be true. I.e., the third argument contains the intermediate nodes.
% Base Rule (R0)
path(X,Y,[]) :- neighbor(X,Y).
% Inductive Rule (R1)
path(X,Y,[Z|P]) :- not(X == Y), neighbor(X,Z), not(member(Z, P)), path(Z,Y,P).
Yet,
?- path(bedroom1, stairs, P).
is false. Why? Shouldn't we get a match to R1 with
X = bedroom1
Y = stairs
Z = hall
P = []
since,
?- neighbor(bedroom1, hall).
true.
?- not(member(hall, [])).
true.
?- path(hall, stairs, []).
true .
?
In fact, if I evaluate
?- path(A, B, P).
I get only the length-1 solutions.
Welcome to Prolog! The problem, essentially, is that when you get to not(member(Z, P)) in R1, P is still a pure variable, because the evaluation hasn't gotten to path(Z, Y, P) to define it yet. One of the surprising yet inspiring things about Prolog is that member(Ground, Var) will generate lists that contain Ground and unify them with Var:
?- member(a, X).
X = [a|_G890] ;
X = [_G889, a|_G893] ;
X = [_G889, _G892, a|_G896] .
This has the confusing side-effect that checking for a value in an uninstantiated list will always succeed, which is why not(member(Z, P)) will always fail, causing R1 to always fail. The fact that you get all the R0 solutions and none of the R1 solutions is a clue that something in R1 is causing it to always fail. After all, we know R0 works.
If you swap these two goals, you'll get the first result you want:
path(X,Y,[Z|P]) :- not(X == Y), neighbor(X,Z), path(Z,Y,P), not(member(Z, P)).
?- path(bedroom1, stairs, P).
P = [hall]
If you ask for another solution, you'll get a stack overflow. This is because after the change we're happily generating solutions with cycles as quickly as possible with path(Z,Y,P), only to discard them post-facto with not(member(Z, P)). (Incidentally, for a slight efficiency gain we can switch to memberchk/2 instead of member/2. Of course doing the wrong thing faster isn't much help. :)
I'd be inclined to convert this to a breadth-first search, which in Prolog would imply adding an "open set" argument to contain solutions you haven't tried yet, and at each node first trying something in the open set and then adding that node's possibilities to the end of the open set. When the open set is extinguished, you've tried every node you could get to. For some path finding problems it's a better solution than depth first search anyway. Another thing you could try is separating the path into a visited and future component, and only checking the visited component. As long as you aren't generating a cycle in the current step, you can be assured you aren't generating one at all, there's no need to worry about future steps.
The way you worded the question leads me to believe you don't want a complete solution, just a hint, so I think this is all you need. Let me know if that's not right.

Can a SHA-1 hash be all-zeroes?

Is there any input that SHA-1 will compute to a hex value of fourty-zeros, i.e. "0000000000000000000000000000000000000000"?
Yes, it's just incredibly unlikely. I.e. one in 2^160, or 0.00000000000000000000000000000000000000000000006842277657836021%.
Also, becuase SHA1 is cryptographically strong, it would also be computationally unfeasible (at least with current computer technology -- all bets are off for emergent technologies such as quantum computing) to find out what data would result in an all-zero hash until it occurred in practice. If you really must use the "0" hash as a sentinel be sure to include an appropriate assertion (that you did not just hash input data to your "zero" hash sentinel) that survives into production. It is a failure condition your code will permanently need to check for. WARNING: Your code will permanently be broken if it does.
Depending on your situation (if your logic can cope with handling the empty string as a special case in order to forbid it from input) you could use the SHA1 hash ('da39a3ee5e6b4b0d3255bfef95601890afd80709') of the empty string. Also possible is using the hash for any string not in your input domain such as sha1('a') if your input has numeric-only as an invariant. If the input is preprocessed to add any regular decoration then a hash of something without the decoration would work as well (eg: sha1('abc') if your inputs like 'foo' are decorated with quotes to something like '"foo"').
I don't think so.
There is no easy way to show why it's not possible. If there was, then this would itself be the basis of an algorithm to find collisions.
Longer analysis:
The preprocessing makes sure that there is always at least one 1 bit in the input.
The loop over w[i] will leave the original stream alone, so there is at least one 1 bit in the input (words 0 to 15). Even with clever design of the bit patterns, at least some of the values from 0 to 15 must be non-zero since the loop doesn't affect them.
Note: leftrotate is circular, so no 1 bits will get lost.
In the main loop, it's easy to see that the factor k is never zero, so temp can't be zero for the reason that all operands on the right hand side are zero (k never is).
This leaves us with the question whether you can create a bit pattern for which (a leftrotate 5) + f + e + k + w[i] returns 0 by overflowing the sum. For this, we need to find values for w[i] such that w[i] = 0 - ((a leftrotate 5) + f + e + k)
This is possible for the first 16 values of w[i] since you have full control over them. But the words 16 to 79 are again created by xoring the first 16 values.
So the next step could be to unroll the loops and create a system of linear equations. I'll leave that as an exercise to the reader ;-) The system is interesting since we have a loop that creates additional equations until we end up with a stable result.
Basically, the algorithm was chosen in such a way that you can create individual 0 words by selecting input patterns but these effects are countered by xoring the input patterns to create the 64 other inputs.
Just an example: To make temp 0, we have
a = h0 = 0x67452301
f = (b and c) or ((not b) and d)
= (h1 and h2) or ((not h1) and h3)
= (0xEFCDAB89 & 0x98BADCFE) | (~0x98BADCFE & 0x10325476)
= 0x98badcfe
e = 0xC3D2E1F0
k = 0x5A827999
which gives us w[0] = 0x9fb498b3, etc. This value is then used in the words 16, 19, 22, 24-25, 27-28, 30-79.
Word 1, similarly, is used in words 1, 17, 20, 23, 25-26, 28-29, 31-79.
As you can see, there is a lot of overlap. If you calculate the input value that would give you a 0 result, that value influences at last 32 other input values.
The post by Aaron is incorrect. It is getting hung up on the internals of the SHA1 computation while ignoring what happens at the end of the round function.
Specifically, see the pseudo-code from Wikipedia. At the end of the round, the following computation is done:
h0 = h0 + a
h1 = h1 + b
h2 = h2 + c
h3 = h3 + d
h4 = h4 + e
So an all 0 output can happen if h0 == -a, h1 == -b, h2 == -c, h3 == -d, and h4 == -e going into this last section, where the computations are mod 2^32.
To answer your question: nobody knows whether there exists an input that produces all zero outputs, but cryptographers expect that there are based upon the simple argument provided by daf.
Without any knowledge of SHA-1 internals, I don't see why any particular value should be impossible (unless explicitly stated in the description of the algorithm). An all-zero value is no more or less probable than any other specific value.
Contrary to all of the current answers here, nobody knows that. There's a big difference between a probability estimation and a proof.
But you can safely assume it won't happen. In fact, you can safely assume that just about ANY value won't be the result (assuming it wasn't obtained through some SHA-1-like procedures). You can assume this as long as SHA-1 is secure (it actually isn't anymore, at least theoretically).
People doesn't seem realize just how improbable it is (if all humanity focused all of it's current resources on finding a zero hash by bruteforcing, it would take about xxx... ages of the current universe to crack it).
If you know the function is safe, it's not wrong to assume it won't happen. That may change in the future, so assume some malicious inputs could give that value (e.g. don't erase user's HDD if you find a zero hash).
If anyone still thinks it's not "clean" or something, I can tell you that nothing is guaranteed in the real world, because of quantum mechanics. You assume you can't walk through a solid wall just because of an insanely low probability.
[I'm done with this site... My first answer here, I tried to write a nice answer, but all I see is a bunch of downvoting morons who are wrong and can't even tell the reason why are they doing it. Your community really disappointed me. I'll still use this site, but only passively]
Contrary to all answers here, the answer is simply No.
The hash value always contains bits set to 1.

Resources