Which of the following is TRUE about formulae in Conjunctive Normal Form?

Which of the following is TRUE about formulae in Conjunctive Normal Form?
A. For any formula, there is a truth assignment for which at least half the clauses evaluate to true.
B. For any formula, there is a truth assignment for which all the clauses evaluate to true.
C. There is a formula such that for each truth assignment, at most one-fourth of the clauses evaluate to true.
D. None of the above.
My doubt: I know Conjunctive Normal Form is the product-of-sums form, but this question confuses me. Please explain it to me in simple language.

I will show two proofs of (A). We can quickly discount (B) with any unsatisfiable formula, like x ∧ ¬x. A proof of (A) also discounts (C), since an assignment satisfying at least half the clauses satisfies more than one-fourth of them, and hence (D).
Proof by construction
Let us say we have some formula in CNF, and let us examine any propositional variable (or term, or whatever you want to call it), which we will call x. We can split all the clauses which include either x or ¬x into three distinct cases:
If x appears in the clause and ¬x does not, we know that the clause will be satisfied if x is assigned true, and will not be satisfied if x is assigned false.
Similarly, if ¬x appears in the clause and x does not, we know that the clause will be satisfied if x is assigned false, and will not be satisfied if x is assigned true.
If both x and ¬x appear in the clause, any assignment will satisfy that clause.
If there are more case 1 clauses than case 2, an assignment of true to x will satisfy at least half of the considered clauses. If there are more case 2 clauses than case 1, an assignment of false to x will satisfy at least half of the considered clauses. If there is an equal occurrence of case 1 and case 2, then any assignment to x will satisfy at least half of the considered clauses.
Of the remaining clauses (those mentioning neither x nor ¬x), we can apply the same argument to the propositional variables that are left, until every clause contains at least one assigned variable. Since at least half of each batch of considered clauses is satisfied, at least half of all the clauses will evaluate to true.
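For concreteness, here is a small Python sketch of this construction; the clause encoding (DIMACS-style signed integers, where 3 means x3 and -3 means ¬x3) is my own choice:

def at_least_half(clauses):
    # greedily assign one variable at a time so that at least half of the
    # clauses mentioning it are satisfied, then discard those clauses
    assignment = {}
    remaining = [set(c) for c in clauses]
    satisfied = 0
    while remaining:
        # pick any variable occurring in a remaining clause
        var = abs(next(iter(remaining[0])))
        considered = [c for c in remaining if var in c or -var in c]
        remaining = [c for c in remaining if var not in c and -var not in c]
        pos = sum(1 for c in considered if var in c)
        neg = sum(1 for c in considered if -var in c)
        value = pos >= neg              # assign the majority polarity (ties: either works)
        assignment[var] = value
        lit = var if value else -var
        satisfied += sum(1 for c in considered if lit in c)
    return assignment, satisfied

clauses = [[1, 2], [-1, 3], [-2, -3], [1, -3], [2]]
assignment, satisfied = at_least_half(clauses)
print(satisfied >= len(clauses) / 2)    # True: at least half the clauses are satisfied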
Proof by expected value
Consider any clause of some formula in CNF. Because each clause is in 'sum form' (a disjunction of literals), there is exactly one assignment of its k constituent literals which makes the clause evaluate to false: the one falsifying every literal. That means there are 2^k - 1 assignments of the constituent literals which make the clause evaluate to true. Under a uniformly random truth assignment each of the 2^k assignments is equally likely, so the probability of the clause evaluating to true is (2^k - 1) / 2^k, which is 1 - 1/2^k. Clearly, for any positive value of k, the probability of a clause being satisfied by a random truth assignment is at least one half. (A clause that happens to contain both a variable and its negation is satisfied with probability 1, which is even better.)
By linearity of expectation, the expected number of satisfied clauses is the sum over the clauses of the probability of each being satisfied, and since each of those probabilities is at least one half, the expected number of satisfied clauses is at least half the total number of clauses. Because there must be at least one truth assignment which satisfies at least the expected number of clauses, we can conclude that for any given formula there exists a truth assignment which satisfies at least half of the clauses.
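As a sanity check (not part of the proof), here is a small Python sketch that brute-forces a tiny example, assuming no clause contains a variable together with its negation: the expectation is the sum of 1 - 1/2^k over the clauses, and some assignment always meets it.

import itertools

def expectation_and_best(clauses, variables):
    # expected number of satisfied clauses under a uniformly random assignment
    expected = sum(1 - 2 ** -len(c) for c in clauses)
    best = 0
    for bits in itertools.product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))
        sat = sum(any(value[abs(l)] == (l > 0) for l in c) for c in clauses)
        best = max(best, sat)
    return expected, best

clauses = [[1, 2], [-1, 3], [-2, -3], [1, -3], [2]]
expected, best = expectation_and_best(clauses, [1, 2, 3])
print(expected >= len(clauses) / 2)   # True: the expectation is at least half
print(best >= expected)               # True: some assignment does at least as well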

Related

How can I represent integer infinity in Z3?

I need to be able to compare two integer expressions, which may include literal integers, addition, unary negation, integer constants, and infinity, and decide if an inequality between them is satisfiable. This is part of a larger program, so there is no way for me to know ahead of time what those expressions will look like.
I have considered defining an integer constant and just letting it take any value, but then I realized that Infinity < 5 would be satisfiable.
I have considered defining a constant and making a universally quantified assertion that it is greater than all integers, but I don't know what Sort I should say it is. If I tell Z3 that the Sort of my Infinity constant is integer, I think it will probably happily go off and try to find me THE LARGEST INTEGER! I'm pretty sure that won't end the way I want.
I would create a composite type consisting of an integer and a boolean that says whether this particular value is infinity or not. Then, you need to define arithmetic on this. For example, if one of the operands is infinity then the result of an addition is infinity as well. Otherwise, it's the actual sum of the integers. Comparisons would be defined in a similar way (case based).
There is no need to actually create a type in the Z3 system. Just always create values in int/bool pairs in your code. A few helper functions can do that.
Note that doing this will probably create a harder problem for Z3 to solve, and may even escalate the logic you are using.
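For example, with the Z3 Python API the pair encoding might look like the following sketch; the helper names and the particular add/less-than definitions are just one possible way to do it:

from z3 import Int, Bool, Or, And, Not, Solver

def ext_const(name):
    # an "extended integer": a Bool flag saying whether it is infinity, plus an Int value
    return Bool(name + '_inf'), Int(name + '_val')

def ext_add(a, b):
    # infinity absorbs addition; the value component is irrelevant when the flag is set
    return Or(a[0], b[0]), a[1] + b[1]

def ext_lt(a, b):
    # a < b iff a is finite and (b is infinity, or the integer parts compare)
    return And(Not(a[0]), Or(b[0], a[1] < b[1]))

s = Solver()
infinity = ext_const('infinity')
s.add(infinity[0])              # this particular value is infinity
x = ext_const('x')
s.add(Not(x[0]))                # x is an ordinary (finite) integer
s.add(ext_lt(infinity, x))      # "infinity < x" should now be unsatisfiable
print(s.check())                # unsat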

Are existential quantifiers nested under foralls skolemised once? What do quantifier instantiation statistics mean for these quantifiers?

In a (rather large) Z3 problem, we have a few axioms of the shape:
forall xs :: ( (P(xs) ==> (exists ys :: Q(xs,ys))) && ((exists zs :: Q(xs,zs)) ==> P(xs)) )
All three quantifiers (including the existentials) have explicit triggers provided (omitted here). When running the problem and gathering quantifier statistics, we observed the following data (amongst many other instantiations):
[quantifier_instances] k!244 : 804 : 3 : 4
[quantifier_instances] k!232 : 10760 : 29 : 30
Here, line 244 corresponds to the end of the outer forall quantifier, and line 232 to the end of the first inner exists. Furthermore, there are no reported instantiations of the second inner exists (which I believe Z3 will pull out into a forall); given the triggers this is surprising.
My understanding is that existentials in this inner position should be skolemised by a function (depending on the outer quantifier). It's not clear to me what quantifier statistics mean for such existentials.
Here are my specific questions:
Are quantifier statistics meaningful for existential quantifiers (those which remain as existentials - i.e. in positive positions)? If so, what do they mean?
Does skolemisation of such an existential happen once and for all, or each time the outer quantifier is instantiated? Why is a substantially higher number reported for this quantifier than for the outer forall?
Does Z3 apply some internal rewriting of this kind of (A==>B)&&(B==>A) assertion? If so, how does that affect quantifier statistics for quantifiers in A and B?
From our point of view, understanding question 2 is most urgent, since we are trying to investigate how the generated existentials affect the performance of the overall problem.
The original smt file is available here:
https://gist.github.com/anonymous/16e489ce5c513e8c4bc6
and a summary of the statistics generated (with Z3 4.4.0, but we observed the same with 4.3.2) is here:
https://gist.github.com/anonymous/ce7b96acf712ac16299e
The answer to all these questions is 'it depends', mainly on what else appears in the problem and which options are set. For instance, if there are only bit-vector variables, then Skolemization will indeed be performed during preprocessing, once and for all, but this is not the case for all other theories or theory combinations.
Briefly looking at your SMT2 file, it seems to me that all existentials appear in the left hand side of implications, i.e., they are in fact negated (and actually rewritten into universals somewhere along the line), so those statistics do make sense for the existentials appearing in this particular problem.
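To see that rewriting on one such axiom in isolation: an existential in the antecedent of an implication is equivalent to a universal over the whole implication, which a quick z3py check confirms. Here P, Q and the Int sort are hypothetical stand-ins for the question's predicates:

from z3 import Ints, Function, IntSort, BoolSort, Implies, Exists, ForAll, Not, Solver

P = Function('P', IntSort(), BoolSort())
Q = Function('Q', IntSort(), IntSort(), BoolSort())
x, z = Ints('x z')

lhs = Implies(Exists(z, Q(x, z)), P(x))       # existential on the left of ==>
rhs = ForAll(z, Implies(Q(x, z), P(x)))       # the universal it gets rewritten into

s = Solver()
s.add(Not(lhs == rhs))      # look for a countermodel to the equivalence
print(s.check())            # unsat: the two forms are equivalent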

Are soft constraints disjunctions?

In Z3, soft constraints can be implemented by placing the actual constraint P on the right-hand side of an implication whose left-hand side is a fresh boolean variable, e.g. assert (=> b1 P). Constraints can then be "switched on" per check-sat by listing bs which are to be considered true, e.g. (check-sat b1).
Question: What happens with such constraints P if their guard variable b is not included in a check-sat? I assume Z3 treats b's value as unknown, and then branches over the implication/over b's value - is that correct?
Background: The background of my question is that I use Z3 in incremental mode (push/pop blocks), and that I check assertions P by (push)(assert (not P))(check-sat)(pop); if the check-sat is unsat then P holds.
I thought about using soft constraints instead, i.e. replacing each such push/pop block by (declare-const b_i Bool)(assert (=> b_i P))(check-sat b_i). Each b_i would only be used once. However, if Z3 may branch over all previous b_j, then it sounds as if that might potentially slow down Z3 quite a bit - hence my question.
(P.S.: I am aware of this answer by Leo, which says that soft constraints might also reduce performance because certain optimisations might not be applicable.)
Yes, if the value of b1 is not fixed, it will be treated just like any other Boolean.
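A minimal z3py sketch of both styles, with a hypothetical (valid) constraint P: the push/pop check and the assumption-guarded check give the same answer, and a later check-sat without the assumption leaves b1 free, so Z3 can simply pick b1 = false and satisfy the implication trivially.

from z3 import Int, Bool, Or, Implies, Not, Solver, unsat

x = Int('x')
P = Or(x < 0, x >= 0)        # a hypothetical (valid) constraint to check

s = Solver()

# push/pop style: P holds iff asserting its negation is unsat
s.push()
s.add(Not(P))
print(s.check() == unsat)    # True
s.pop()

# assumption style: guard Not(P) with a fresh Boolean and pass it to check-sat
b1 = Bool('b1')
s.add(Implies(b1, Not(P)))
print(s.check(b1) == unsat)  # True: b1 is fixed to true for this query only

# in later checks where b1 is not passed, it is just an ordinary Boolean
print(s.check())             # sat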

Bottom-Up-Parser: When to apply which reduction rule?

Let's take the following context-free grammar:
G = ( {Sum, Product, Number}, {decimal representations of numbers, +, *}, P, Sum)
Being P:
Sum → Sum + Product
Sum → Product
Product → Product * Number
Product → Number
Number → decimal representation of a number
I am trying to parse expressions produced by this grammar with a bottom-up parser and a look-ahead buffer (LAB) of length 1 (which supposedly should do without guessing and back-tracking).
Now, given a stack and a LAB, there are often several possibilities of how to reduce the stack or whether to reduce it at all or push another token.
Currently I use this decision tree:
If, for any n, the top n tokens of the stack plus the LAB are the beginning of the right side of some rule, I push the next token onto the stack.
Otherwise, I reduce the maximum number of tokens on top of the stack. I.e. if it is possible to reduce the topmost item and at the same time it is possible to reduce the three topmost items, I do the latter.
If no such reduction is possible, I push another token onto the stack.
Rinse and repeat.
This seems (!) to work, but it requires an awful amount of rule searching, finding matching prefixes, etc. There is no way this can run in O(NM).
What is the standard (and possibly only sensible) approach to decide whether to reduce or push (shift), and in case of reduction, which reduction to apply?
Thank you in advance for your comments and answers.
The easiest bottom-up parsing approach for grammars like yours (basically, expression grammars) is operator-precedence parsing.
Recall that bottom-up parsing involves building the parse tree left-to-right from the bottom. In other words, at any given time during the parse, we have a partially assembled tree with only terminal symbols to the right of where we're reading, and a combination of terminals and non-terminals to the left (the "prefix"). The only possible reduction is one which applies to a suffix of the prefix; if no reduction applies, we must be able to shift a terminal from the input to the prefix.
An operator grammar has the feature that there are never two consecutive non-terminals in any production. Consequently, in a bottom-up parse of an operator grammar, either the last symbol in the prefix is a terminal or the second-last symbol is one. (Both of them could be.)
An operator precedence parser is essentially blind to non-terminals; it simply doesn't distinguish between them. So you cannot have two productions whose right-hand sides contain exactly the same sequence of terminals, because the op-prec parser wouldn't know which of these two productions to apply. (That's the traditional view. It's actually possible to extend that a bit so that you can have two productions with the same terminals, provided that the non-terminals are in different places. That allows grammars which have unary - operators, for example, since the right hand sides <non-terminal> - <non-terminal> and - <non-terminal> can be distinguished without knowing the names of the non-terminals, only their presence.)
The other requirement is that you have to be able to build a precedence relationship between the terminals. More precisely, we define three precedence relations, usually written <·, ·> and ·=· (or some typographic variation on the theme), and insist that for any two terminals x and y, at most one of the relations x ·> y, x ·=· y and x <· y is true.
Roughly speaking, the < and > in the relations correspond to the edges of a production. In other words, if x <· y, that means that x can be followed by a non-terminal with a production whose first terminal is y. Similarly, x ·> y means that y can follow a non-terminal with a production whose last terminal is x. And x ·=· y means that there is some right-hand side where x and y are consecutive terminals, in that order (possibly with an intervening non-terminal).
If the single-relation restriction is true, then we can parse as follows:
Let x be the last terminal in the prefix (that is, either the last or second-last symbol), and let y be the lookahead symbol, which must be a terminal. If x ·> y then we reduce, and repeat the rule. Otherwise, we shift y onto the prefix.
In order to reduce, we need to find the start of the production. We move backwards over the prefix, comparing consecutive terminals (all of which must have <· or ·=· relations) until we find one with a <· relation. Then the terminals between the <· and the ·> are the right-hand side of the production we're looking for, and we can slot the non-terminals into the right-hand side as indicated.
There is no guarantee that there will be an appropriate production; if there isn't, the parse fails. But if the input is a valid sentence, and if the grammar is an operator-precedence grammar, then we will be able to find the right production to reduce.
Note that it is usually really easy to find the production, because most productions have only one (<non-terminal> * <non-terminal>) or two (( <non-terminal> )) terminals. A naive implementation might just run the terminals together into a string and use that as the key in a hash-table.
The classic implementation of operator-precedence parsing is the so-called "Shunting Yard Algorithm", devised by Edsger Dijkstra. In that algorithm, the precedence relations are modelled by providing two functions, left-precedence and right-precedence, which map terminals to integers such that x <· y is true only if right-precedence(x) < left-precedence(y) (and similarly for the other operators). It is not always possible to find such mappings, and the mappings are a cover of the actual precedence relations because it is often the case that there are pairs of terminals for which no precedence relationship applies. Nonetheless, it is often the case that these mappings can be found, and almost always the case for simple expression grammars.
I hope that's enough to get you started. I encourage you to actually read some texts about bottom-up parsing, because I think I've already written far too much for an SO answer, and I haven't yet included a single line of code. :)
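For concreteness, though, here is a minimal shunting-yard-style sketch in Python for the Sum/Product/Number grammar above; it evaluates values instead of building a tree, and the token conventions (numbers plus the strings '+' and '*') are my own:

PRECEDENCE = {'+': 1, '*': 2}   # higher binds tighter

def parse(tokens):
    # tokens: a list of numbers and the strings '+' / '*'
    output = []      # operand stack (here: evaluated values)
    operators = []   # operator stack

    def reduce_top():
        op = operators.pop()
        right = output.pop()
        left = output.pop()
        output.append(left + right if op == '+' else left * right)

    for tok in tokens:
        if tok in PRECEDENCE:
            # reduce while the operator on the stack binds at least as tightly
            # (both '+' and '*' are left-associative, hence ">=")
            while operators and PRECEDENCE[operators[-1]] >= PRECEDENCE[tok]:
                reduce_top()
            operators.append(tok)
        else:
            output.append(tok)   # Number -> decimal representation of a number

    while operators:
        reduce_top()
    return output[0]

print(parse([2, '+', 3, '*', 4]))   # 14
print(parse([2, '*', 3, '+', 4]))   # 10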

Is there such thing as a left-associative prefix operator or right-associative postfix operator?

This page says "Prefix operators are usually right-associative, and postfix operators left-associative" (emphasis mine).
Are there real examples of left-associative prefix operators, or right-associative postfix operators? If not, what would a hypothetical one look like, and how would it be parsed?
It's not particularly easy to make the concepts of "left-associative" and "right-associative" precise, since they don't directly correspond to any clear grammatical feature. Still, I'll try.
Despite the lack of math layout, I tried to insert an explanation of precedence relations here, and it's the best I can do, so I won't repeat it. The basic idea is that given an operator grammar (i.e., a grammar in which no production has two non-terminals without an intervening terminal), it is possible to define precedence relations ⋖, ≐, and ⋗ between grammar symbols, and then this relation can be extended to terminals.
Put simply, if a and b are two terminals, a ⋖ b holds if there is some production in which a is followed by a non-terminal which has a derivation (possibly not immediate) in which the first terminal is b. a ⋗ b holds if there is some production in which b follows a non-terminal which has a derivation in which the last terminal is a. And a ≐ b holds if there is some production in which a and b are either consecutive or are separated by a single non-terminal. The use of symbols which look like arithmetic comparisons is unfortunate, because none of the usual arithmetic laws apply. It is not necessary (in fact, it is rare) for a ≐ a to be true; a ≐ b does not imply b ≐ a and it may be the case that both (or neither) of a ⋖ b and a ⋗ b are true.
An operator grammar is an operator precedence grammar iff given any two terminals a and b, at most one of a ⋖ b, a ≐ b and a ⋗ b hold.
If a grammar is an operator-precedence grammar, it may be possible to find an assignment of integers to terminals which make the precedence relationships more or less correspond to integer comparisons. Precise correspondence is rarely possible, because of the rarity of a ≐ a. However, it is often possible to find two functions, f(t) and g(t) such that a ⋖ b is true if f(a) < g(b) and a ⋗ b is true if f(a) > g(b). (We don't worry about only if, because it may be the case that no relation holds between a and b, and often a ≐ b is handled with a different mechanism: indeed, it means something radically different.)
%left and %right (the yacc/bison/lemon/... declarations) construct functions f and g. The way they do it is pretty simple. If OP (an operator) is "left-associative", that means that expr1 OP expr2 OP expr3 must be parsed as <expr1 OP expr2> OP expr3, in which case OP ⋗ OP (which you can see from the derivation). Similarly, if ROP were "right-associative", then expr1 ROP expr2 ROP expr3 must be parsed as expr1 ROP <expr2 ROP expr3>, in which case ROP ⋖ ROP.
Since f and g are separate functions, this is fine: a left-associative operator will have f(OP) > g(OP) while a right-associative operator will have f(ROP) < g(ROP). This can easily be implemented by using two consecutive integers for each precedence level and assigning them to f and g in turn if the operator is right-associative, and to g and f in turn if it's left-associative. (This procedure will guarantee that f(T) is never equal to g(T). In the usual expression grammar, the only ≐ relationships are between open and close bracket-type-symbols, and these are not usually ambiguous, so in a yacc-derivative grammar it's not necessary to assign them precedence values at all. In a Floyd parser, they would be marked as ≐.)
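As a small sketch of that construction (the declaration format, function names and operator set below are my own; each precedence level gets two consecutive integers):

def build_precedence(declarations):
    # declarations: list of ('left' | 'right', [operators]), lowest precedence first
    f, g = {}, {}
    for level, (assoc, ops) in enumerate(declarations):
        lo, hi = 2 * level + 1, 2 * level + 2
        for op in ops:
            if assoc == 'left':      # f(op) > g(op), i.e. OP ⋗ OP: reduce first
                f[op], g[op] = hi, lo
            else:                    # f(op) < g(op), i.e. OP ⋖ OP: shift first
                f[op], g[op] = lo, hi
    return f, g

f, g = build_precedence([('left', ['+', '-']), ('left', ['*', '/']), ('right', ['**'])])
print(f['+'] > g['+'])     # True: '+' is left-associative
print(f['**'] < g['**'])   # True: '**' is right-associative
print(f['+'] < g['*'])     # True: '+' ⋖ '*', so '*' binds tighter on its right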
Now, what about prefix and postfix operators? Prefix operators are always found in a production of the form [1]:
non-terminal-1: PREFIX non-terminal-2;
There is no non-terminal preceding PREFIX so it is not possible for anything to be ⋗ PREFIX (because the definition of a ⋗ b requires that there be a non-terminal preceding b). So if PREFIX is associative at all, it must be right-associative. Similarly, postfix operators correspond to:
non-terminal-3: non-terminal-4 POSTFIX;
and thus POSTFIX, if it is associative at all, must be left-associative.
Operators may be either semantically or syntactically non-associative (in the sense that applying the operator to the result of an application of the same operator is undefined or ill-formed). For example, in C++, ++ ++ a is semantically incorrect (unless operator++() has been redefined for a in some way), but it is accepted by the grammar (in case operator++() has been redefined). On the other hand, new new T is not syntactically correct. So new is syntactically non-associative.
[1] In Floyd grammars, all non-terminals are coalesced into a single non-terminal type, usually expression. However, the definition of precedence-relations doesn't require this, so I've used different place-holders for the different non-terminal types.
There could be in principle. Consider for example the prefix unary plus and minus operators: suppose + is the identity operation and - negates a numeric value.
They are "usually" right-associative, meaning that +-1 is equivalent to +(-1), the result is minus one.
Suppose they were left-associative, then the expression +-1 would be equivalent to (+-)1.
The language would therefore have to give a meaning to the sub-expression +-. Languages "usually" don't need this to have a meaning and don't give it one, but you can probably imagine a functional language in which the result of applying the identity operator to the negation operator is an operator/function that has exactly the same effect as the negation operator. Then the result of the full expression would again be -1 for this example.
Indeed, if the result of juxtaposing functions/operators is defined to be a function/operator with the same effect as applying both in right-to-left order, then it makes no difference to the result of the expression which way you associate them. Those are just two different ways of defining that (f g)(x) == f(g(x)). If your language defines +- to mean something other than -, though, then the direction of associativity would matter (and I suspect the language would be very difficult to read for someone used to the "usual" languages...)
On the other hand, if the language doesn't allow juxtaposing operators/functions then prefix operators must be right-associative to allow the expression +-1. Disallowing juxtaposition is another way of saying that (+-) has no meaning.
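A tiny Python sketch of that reading (the names are mine): juxtaposition defined as right-to-left composition gives +- the same effect as - alone.

identity = lambda v: +v          # prefix +
negate = lambda v: -v            # prefix -

def juxtapose(f, g):
    # (f g)(x) == f(g(x)): the composed operator applies g first, then f
    return lambda v: f(g(v))

plus_minus = juxtapose(identity, negate)
print(plus_minus(1))             # -1, the same as negate(1)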
I'm not aware of such a thing in a real language (e.g., one that's been used by at least a dozen people). I suspect the "usually" was merely because proving a negative is next to impossible, so it's easier to avoid arguments over trivia by not making an absolute statement.
As to how you'd theoretically do such a thing, there seem to be two possibilities. Given two prefix operators @ and # that you were going to treat as left associative, you could parse @#a as equivalent to #(@(a)), i.e. apply the leftmost operator first. At least to me, this seems like a truly dreadful idea--theoretically possible, but a language nobody should wish on even their worst enemy.
The other possibility is that @#a would be parsed as (@#)a. In this case, we'd basically compose @ and # into a single operator, which would then be applied to a.
In most typical languages, this probably wouldn't be terribly interesting (would have essentially the same meaning as if they were right associative). On the other hand, I can imagine a language oriented to multi-threaded programming that decreed that application of a single operator is always atomic--and when you compose two operators into a single one with the left-associative parse, the resulting fused operator is still a single, atomic operation, whereas just applying them successively wouldn't (necessarily) be.
Honestly, even that's kind of a stretch, but I can at least imagine it as a possibility.
I hate to shoot down a question that I myself asked, but having looked at the two other answers, would it be wrong to suggest that I've inadvertently asked a subjective question, and that in fact the interpretation of left-associative prefixes and right-associative postfixes is simply undefined?
Remembering that even notation as pervasive as expressions is built upon a handful of conventions, if there's an edge case that the conventions never took into account, then maybe, until some standards committee decides on a definition, it's better to simply pretend it doesn't exist.
I do not remember any left-associative prefix operators or right-associative postfix ones, but I can imagine that both could easily exist. They are not common because the natural way people think about operators is: the one closer to the operand applies first.
An easy example from the C#/C++ languages:
~-3 is equal to 2, but
-~3 is equal to 4
This is because these prefix operators are right-associative: for ~-3 the - operator is applied first, and then the ~ operator is applied to its result, so the value of the whole expression is 2.
Hypothetically, if those operators were left-associative, then for ~-3 the left-most operator ~ would be applied first (to 3), and after that - would be applied to its result, so the value of the whole expression would be 4.
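You can check the actual (right-associative) values quoted above directly; Python shares these prefix operators and their behaviour:

print(~-3)    # 2: - is applied first, then ~
print(-~3)    # 4: ~ is applied first, then -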
[EDIT] Answering Steve Jessop:
Steve said that the meaning of "left-associativity" is that +-1 is equivalent to (+-)1.
I do not agree with this, and think it is totally wrong. To better understand left-associativity, consider the following example:
Suppose I have a hypothetical programming language with left-associative prefix operators:
# - multiplies its operand by 3
@ - adds 7 to its operand
Then the construction #@5 in my language would be equal to (5*3)+7 == 22
If my language were right-associative (as most usual languages are), then I would have (5+7)*3 == 36
Please let me know if you have any questions.
Hypothetical example. A language has a prefix operator # and a postfix operator @ with the same precedence. An expression #x@ would be equal to (#x)@ if both operators are left-associative, and to #(x@) if both operators are right-associative.

Resources