Attempting to verify formally the following problem. https://leetcode.com/problems/count-equal-and-divisible-pairs-in-an-array/
Given a 0-indexed integer array nums of length n and an integer k, return the number of pairs (i, j) where 0 <= i < j < n, such that nums[i] == nums[j] and (i * j) is divisible by k.
function countPairs(nums: number[], k: number): number {
let count = 0;
for(let i = 0; i < nums.length-1; i++) {
for(let j = i+1; j < nums.length; j++) {
if(nums[i] == nums[j] && (i*j) % k == 0) {
count++;
}
}
}
return count;
};
I cannot figure out how to verify this silly thing. Maybe I'm missing a super simple description of the problem and I'm too focused on irrelevant details.
My strategy started with trying to compare the result to the cardinality of a matching set. However, getting dafny to believe the cardinality of the set matches during the two loops seems impossible.
function method satPairs(nums: seq<nat>, k: nat, a: nat, b: nat): bool
requires k > 0
requires a <= b < |nums|
{
nums[a] == nums[b] && (a*b) % k == 0
}
function matchPairs(nums: seq<nat>, k: nat): nat
requires k > 0
{
|set x,y | 0 <= x < y < |nums| && nums[x] == nums[y] && (x*y) % k == 0 :: (x,y)|
}
function method pairsI(nums: seq<nat>, k: nat, i: nat): set<(nat, nat)>
requires k > 0
requires 0 <= i < |nums|
ensures forall x,y :: 0 <= x < i && x <= y < |nums| && satPairs(nums, k, x, y) ==> (x,y) in pairsI(nums, k, i)
{
set x: nat,y:nat | 0 <= x < i && x <= y < |nums| && satPairs(nums, k, x, y) :: (x,y)
}
I also tried to setup invariants based on counts using methods which could count the matching pairs using a column at a time. However, I ran into no end of incompatible conditions maintaining the invariants before and after the while loop. I was hoping these helper functions could be defined recursively allowing some induction on the result to be used inside the two loops for each invariant, but it didn't work.
function countSeqPairs(nums: seq<nat>, k: nat, start: nat, stop: nat): nat
requires k > 0
requires start <= stop <= |nums|
decreases |nums|-start, |nums|-stop
{
if start > stop || stop >= |nums| || start >= |nums| then 0 else
if stop < |nums| then (if satPairs(nums, k, start, stop) then 1 + countSeqPairs(nums, k, start, stop+1) else countSeqPairs(nums, k, start, stop+1)) else countSeqPairs(nums, k, start+1, stop+2)
}
function countSeqSlice(nums: seq<nat>, k: nat, start: nat, stop: nat): nat
requires k > 0
requires start <= stop <= |nums|
decreases |nums| - stop
{
if start > stop || stop >= |nums| then 0
else if satPairs(nums, k, start, stop) then 1 + countSeqSlice(nums, k, start, stop+1) else countSeqSlice(nums, k, start, stop+1)
}
Here is the main method, with various non-working attempts at invariants.
method countPairs(nums: seq<nat>, k: nat) returns (count: nat)
requires k > 0
requires |nums| >= 2;
{
count := 0;
//ghost var cpairs: set<(nat, nat)> := {};
for i : nat := 0 to |nums|-2
invariant count >= 0
//invariant cpairs == pairsI(nums, k, i)
{
// ghost var occount := count;
// ghost var increment := 0;
for j : nat := i+1 to |nums|-1
invariant count >= 0
// invariant count == occount + increment
// invariant satPairs(nums, k, i, j) ==> increment == increment + 1
// invariant count == 0 || satPairs(nums, k, i, j) ==> count == count + 1
//invariant cpairs == pairsI(nums, k, i) + set z: nat | i+1 <= z <= j && satPairs(nums, k, i, z) :: (i, z)
{
// ghost var currcount := count;
// if nums[i] == nums[j] && (i*j)% k == 0 {
if i+1 <= j <= j && satPairs(nums, k, i, j) {
// increment := increment + 1;
//cpairs := {(i,j)}+cpairs;
count := count + 1;
}
}
}
}
It seems like there is no shorter description than the method itself for what is to be ensured. In addition to help with the above, I have three questions, generally how do you describe an invariant for something which is sort of manifestly arbitrary like this?
Secondly, what strategy can be used to handle loop invariants which are not true before the loop is run but are afterwards? It tends to be the initial condition is set to 0 or some other empty value, but then after the 0-th iteration of the loop it will be set to some value, and then the invariant fails. I keep running into this situation and it feels like there should be some sort of standard guard for it.
Finally, can or should ghost variables be used in method ensure statements?
Here is one way to do it.
function method satPairs(nums: seq<nat>, k: nat, a: nat, b: nat): bool
requires k > 0
requires a <= b < |nums|
{
nums[a] == nums[b] && (a*b) % k == 0
}
function matchPairsHelper(nums: seq<nat>, k: nat, bound: int): set<(int, int)>
requires k > 0
requires bound <= |nums|
{
set x,y | 0 <= x < bound && x < y < |nums| && satPairs(nums, k, x, y) :: (x,y)
}
function matchPairs(nums: seq<nat>, k: nat): set<(int, int)>
requires k > 0
{
matchPairsHelper(nums, k, |nums|)
}
function innerMatchPairsHelper(nums: seq<nat>, k: nat, outer: int, inner_bound: int): set<(int, int)>
requires k > 0
requires inner_bound <= |nums|
{
set y | 0 <= outer < y < inner_bound && satPairs(nums, k, outer, y) :: (outer,y)
}
method countPairs(nums: seq<nat>, k: nat) returns (count: nat)
requires k > 0
requires |nums| >= 2
ensures count == |matchPairs(nums, k)|
{
count := 0;
for i : nat := 0 to |nums|
invariant count == |matchPairsHelper(nums, k, i)|
{
for j : nat := i+1 to |nums|
invariant count == |matchPairsHelper(nums, k, i)| + |innerMatchPairsHelper(nums, k, i, j)|
{
assert innerMatchPairsHelper(nums, k, i, j+1) ==
(if satPairs(nums, k, i, j) then {(i, j)} else {}) + innerMatchPairsHelper(nums, k, i, j);
if satPairs(nums, k, i, j) {
count := count + 1;
}
}
assert matchPairsHelper(nums, k, i+1) == matchPairsHelper(nums, k, i) + innerMatchPairsHelper(nums, k, i, |nums|);
}
}
A few notes:
James's rule of set comprehensions: never use a set comprehension as part of any larger expression, but only as the body of a function that returns that set. This facilitates referring to the set multiple times in assertions without confusing Dafny.
James's rule of cardinalities: Dafny will never prove two cardinalities equal. You must manually ask it to prove two sets equal, and from there, it will conclude the cardinalities are equal.
I didn't really look at your invariants. I just tried to write down what the loops are doing. The outer loop is a loop over the x coordinate. So the invariant is that all the "right" pairs have been counted for all x coordinates smaller than i (for all values of y). Then for the inner loop, the invariant is that all the pairs for the exact x coordinate i have been counted up to j. I define two functions for these notions.
Other than that, just need to assert some set equalities in a few places.
To your other questions:
how do you describe an invariant for something which is sort of
manifestly arbitrary like this?
I'm not sure I understand the question. Are you asking "how do I discover the right loop invariants for this postcondition?" or "how do I state the postcondition when it seems like it is just as long as the code itself?"
what strategy can be used to handle loop invariants which are
not true before the loop is run but are afterwards?
I didn't need this in my solution, but if you do need it then the easiest way is something like i == 0 || <invariant that is true only after loop starts>. Generally I consider such invariants to have a "bad smell" and try to refactor to avoid them. Sometimes they are unavoidable though.
can or should ghost variables be used in method ensure statements?
Not sure I understand this one either. You cannot refer to any local variables of a method in an ensures clause except those that are returned by the method. If needed you can return a ghost variable as in
method Foo() returns (bar: int, ghost baz: int)
ensures ... mentions bar and baz just fine ...
Does that answer this one?
This is a simple segregating 0s and 1s in the array problem. I can't fathom why the loop invariant does not hold.
method rearrange(arr: array<int>, N: int) returns (front: int)
requires N == arr.Length
requires forall i :: 0 <= i < arr.Length ==> arr[i] == 0 || arr[i] == 1
modifies arr
ensures 0 <= front <= arr.Length
ensures forall i :: 0 <= i <= front - 1 ==> arr[i] == 0
ensures forall j :: front <= j <= N - 1 ==> arr[j] == 1
{
front := 0;
var back := N;
while(front < back)
invariant 0 <= front <= back <= N
invariant forall i :: 0 <= i <= front - 1 ==> arr[i] == 0
// The first one does not hold, the second one holds though
invariant forall j :: back <= j < N ==> arr[j] == 1
{
if(arr[front] == 1){
arr[front], arr[back - 1] := arr[back - 1], arr[front];
back := back - 1;
}else{
front := front + 1;
}
}
return front;
}
On entry to your method, the precondition tells you
forall i :: 0 <= i < arr.Length ==> arr[i] == 0 || arr[i] == 1
So, at that time, it is known that all array elements are either 0 or 1. However, since the array is modified by the loop, you must mention in invariants all the things you still want to remember about the contents of the array.
In different words, to verify that the loop body maintains the invariant, think of the loop body as starting in an arbitrary state satisfying the invariant. You may have in your mind that array elements remain 0 or 1, but your invariant does not say that. This is why you're unable to prove that the loop invariant is maintained.
To fix the problem, add
forall i :: 0 <= i < arr.Length ==> arr[i] == 0 || arr[i] == 1
as a loop invariant.
Rustan
A method that copies the negative elements of an array of integers into another array has the property that the set of elements in the result is a subset of the elements in the original array, which stays the same during the copy.
The problem in the code below is that, as soon as we write something in the result array, Dafny somehow forgets that the original set is unchanged.
How to fix this?
method copy_neg (a: array<int>, b: array<int>)
requires a != null && b != null && a != b
requires a.Length == b.Length
modifies b
{
var i := 0;
var r := 0;
ghost var sa := set j | 0 <= j < a.Length :: a[j];
while i < a.Length
invariant 0 <= r <= i <= a.Length
invariant sa == set j | 0 <= j < a.Length :: a[j]
{
if a[i] < 0 {
assert sa == set j | 0 <= j < a.Length :: a[j]; // OK
b[r] := a[i];
assert sa == set j | 0 <= j < a.Length :: a[j]; // KO!
r := r + 1;
}
i := i + 1;
}
}
Edit
Following James Wilcox's answer, replacing inclusions of sets with predicates on sequences is what works the best.
Here is the complete specification (for an array with distinct elements). The post-condition has to be detailed a bit in the loop invariant and a dumb assert remains in the middle of the loop, but all ghost variables are gone, which is great.
method copy_neg (a: array<int>, b: array<int>)
returns (r: nat)
requires a != null && b != null && a != b
requires a.Length <= b.Length
modifies b
ensures 0 <= r <= a.Length
ensures forall x | x in a[..] :: x < 0 <==> x in b[..r]
{
r := 0;
var i := 0;
while i < a.Length
invariant 0 <= r <= i <= a.Length
invariant forall x | x in b[..r] :: x < 0
invariant forall x | x in a[..i] && x < 0 :: x in b[..r]
{
if a[i] < 0 {
b[r] := a[i];
assert forall x | x in b[..r] :: x < 0;
r := r + 1;
}
i := i + 1;
}
}
This is indeed confusing. I will explain why Dafny has trouble proving this below, but first let me give a few ways to make it go through.
First workaround
One way to make the proof go through is to insert the following forall statement after the line b[r] := a[i];.
forall x | x in sa
ensures x in set j | 0 <= j < a.Length :: a[j]
{
var j :| 0 <= j < a.Length && x == old(a[j]);
assert x == a[j];
}
The forall statement is a proof that sa <= set j | 0 <= j < a.Length :: a[j]. I will come back to why this works below.
Second workaround
In general, when reasoning about arrays in Dafny, it is best to use the a[..] syntax to convert the array to a mathematical sequence, and then work with that sequence. If you really need to work with the set of elements, you can use set x | x in a[..], and you will have a better time than if you use set j | 0 <= j < a.Length :: a[j].
Systematically replacing set j | 0 <= j < a.Length :: a[j] with set x | x in a[..] causes your program to verify.
Third solution
Popping up a level to specifying your method, it seems like you don't actually need to mention the set of all elements. Instead, you can get away with saying something like "every element of b is an element of a". Or, more formally forall x | x in b[..] :: x in a[..]. This is not quite a valid postcondition for your method, because your method may not fill out all of b. Since I'm not sure what your other constraints are, I'll leave that to you.
Explanations
Dafny's sets with elements of type A are translated to Boogie maps [A]Bool, where an element maps to true iff it is in the set. Comprehensions such as set j | 0 <= j < a.Length :: a[j] are translated to Boogie maps whose definition involves an existential quantifier. This particular comprehension translates to a map that maps x to
exists j | 0 <= j < a.Length :: x == read($Heap, a, IndexField(j))
where the read expression is the Boogie translation of a[j], which, in particular, makes the heap explicit.
So, to prove that an element is in the set defined by the comprehension, Z3 needs to prove an existential quantifier, which is hard. Z3 uses triggers to prove such quantifiers, and Dafny tells Z3 to use the trigger read($Heap, a, IndexField(j)) when trying to prove this quantifier. This turns out to not be a great trigger choice, because it mentions the current value of the heap. Thus, when the heap changes (ie, after updating b[r]), the trigger may not fire, and you get a failing proof.
Dafny lets you customize the trigger it uses for set comprehensions using a {:trigger} attribute. Unfortunately, there is no great choice of trigger at the Dafny level. However, a reasonable trigger for this program at the Boogie/Z3 level would be just IndexField(j) (though this is likely a bad trigger for such expressions in general, since it is overly general). Z3 itself will infer this trigger if Dafny doesn't tell it otherwise. You can Dafny to get out of the way by saying {:autotriggers false}, like this
invariant sa == set j {:autotriggers false} | 0 <= j < a.Length :: a[j]
This solution is unsatisfying and requires detailed knowledge of Dafny's internals. But now that we've understood it, we can also understand the other workarounds I proposed.
For the first workaround, the proof goes through because the forall statement mentions a[j], which is the trigger. This causes Z3 to successfully prove the existential.
For the second workaround, we have simplified the set comprehension expression so that it no longer introduces an existential quantifier. Instead the comprehension set x | x in a[..], is translated to a map that maps x to
x in a[..]
(ignoring details of how a[..] is translated). This means that Z3 never has to prove an existential, so the otherwise very similar proof goes through.
The third solution works for similar reasons, since it uses no comprehensions and thus no problematic existential quantifiers/
How to convert a simple while loop(c- code) to smt2 language or z3?
For ex :
int x,a;
while(x > 10 && x < 100){
a = x + a;
x++;
}
The input language to an SMT solver is first-order logic (with theories) and as such has no notion of computational operations such as loops.
You can
either use a loop invariant to encode an arbitrary loop iteration (and the pre- and post-state of the loop) and prove your relevant properties with respect to that arbitrary iteration, which is what deductive program verifiers such as Boogie, Dafny or Viper do
or, if the number of iterations is statically known, you unroll the loop and basically use single static assignment form to encode the different unrollings
For your loop, the latter would look as follows (not using proper SMT syntax here because I'm lazy):
declare x0, a0 // initial values
declare a1, x1 // values after first unrolling
x0 > 10 && x0 < 100 ==> a1 == a0 + x0 && x1 == x0 + 1
declare a2, x2 // values after second unrolling
x1 > 10 && x1 < 100 ==> a2 == a1 + x1 && x2 == x1 + 1
...
So, lets say I have a number 123456. 123456 % 97 = 72. How can I determine what two digits need to be added to the end of 123456 such that the new number % 97 = 1? Note--it must always be two digits.
For example, 12345676 % 97 = 1. In this case, I need to add the digits "76" to the end of the number.
(This is for IBAN number calculation.)
x = 123456
x = x * 100
newX = x + 1 + 97 - (x % 97)
Edit: put the 100 in the wrong place
You calc the modulo of 12345600 to 97 and add (97 - that + 1) to that number.
So you get what RoBorg explained whay cleaner above :)
This is the equation that that you need
X = Y -(Number*100 mod y) - 1
where:
Number = 123456
Y = 97
X the number you need
Let’s say we have to receive the number containing 12 digits.
Step 1:
Write any random number of 10 digits, i.e. 2 digits less than needed, e.g. 1234567890 – this is X
Step 2:
X * 100 = 123456789000.
‘123456789000’ - this is Y
Step 3:
Y / 97 = '1272750402.061856'.
'06' – this is Z
Step 4:
97 – Z + 1 = 92.
'92' – this is W
Step 5:
Final Deal Number is X followed by W, i.e. '123456789092'
Examples of accepted numbers:
100000000093
100000000190
100000000287
Etc.
Modulo arithmetic is really not that different from regular arithmetic. The key to solving the kind of problem that you're having is to realize that what you would normally do to solve that problem is still valid (in what follows, any mention of number means integer number):
Say you have
15 + x = 20
The way that you solve this is by realizing that the inverse of 15 under regular addition is -15 then you can write (exploiting commutativity and associativity as we naturally do)
15 + x + (-15) = (15 + (-15)) + x = 0 + x = x = 20 + (-15) = 5
so that your answer is x = 5
Now on to your problem.
Say that N and M are known, and you're looking for x under addition modulo k:
( N + x ) mod k = M
First realize that
( N + x ) mod k = ( ( N mod k ) + ( x mod k ) ) mod k
and for the problem to make sense
M mod k = M
and
x mod k = x
so that by letting
N mod k = N_k
and
( a + b ) mod k = a +_k b
you have
N_k +_k x = M
which means that what you need is the inverse of N_k under +_k. This is actually pretty simple because the inverse under +_k is whatever satisfies this equation:
N_k +_k ("-N_k") = 0
which is actually pretty simple because for a number y such that 0 <= y < k
(y + (k - y)) mod k = k mod k = 0
so that
"-N_k" = (k-N_k)
and then
N_k +_k x +_k "-N_k" = N_k +_k "-N_k" +_k x = 0 +_k x = x = M +_k "-N_k" = M +_k ( k - N_k )
so that the solution to
( N + x ) mod k = M
is
x = ( M + ( k - ( N mod k ) ) ) mod k
and for your problem in particular
( 12345600 + x ) % 97 = 1
is solved by
x = ( 1 + ( 97 - ( 12345600 mod 97 ) ) ) mod 97 = 76
Do notice that the requirement that you solution always have two digits is built in as long as k < 100