cryptol: arithmetic with different widths

How do I perform arithmetic on values of different widths?
In Verilog there is no problem XORing 2 bits with 8 bits, but Cryptol complains:
cryptol> let test(x: [2],y: [8]) = x ^ y
[error] at <interactive>:1:31--1:32:
Type mismatch:
Expected type: 2
Inferred type: 8
My original problem:
I would like to rotate the bytes in a 64-bit value, with the number of bytes to shift depending on a two-bit input. I'm struggling to get this working:
cryptol> let shift (v, s:[2]) = v >>> (s*16+8)
[error] at <interactive>:1:5--1:38:
Unsolved constraint:
2 >= 5
arising from
use of literal or demoted expression
at <interactive>:1:33--1:35
In the interpreter I can remove the type specification of s and then it works; however, I need to get this working from a file, with s really being a 2-bit value.

The type of ^ is:
Cryptol> :t (^)
(^) : {a} (Logic a) => a -> a -> a
Note that it requires both arguments to have exactly the same type. You're getting the type error because [2] is not the same as [8]: they differ in size. Unlike Verilog, Cryptol will not "pad" things implicitly, and I think Cryptol is definitely doing the right thing here; Verilog programmers can chime in with the countless bugs they have had due to implicit casting.
All such casting in Cryptol has to be explicit.
The typical way to deal with this situation in Cryptol is to use the polymorphic constant zero:
Cryptol> :t zero
zero : {a} (Zero a) => a
The value zero inhabits all types (you can ignore the Zero constraint for now), and as you can imagine is the "right" padding value in this case. So, you'd define your function as:
Cryptol> let test(x:[2], y:[8]) = (zero#x)^y
Cryptol> :t test
test : ([2], [8]) -> [8]
And use it like this:
Cryptol> test (1, 5)
0x04
And if you wanted to pad on the right for some reason, you'd do:
Cryptol> let test(x:[2], y:[8]) = (x#zero)^y
Cryptol> test(1,5)
0x45
This way, everything is explicit and you don't have to know all the magical rules about how things get padded to become the right size.
If you want to get real fancy, then you can do:
Cryptol> let test(x, y) = (zero#x)^(zero#y)
Cryptol> :t test
test : {n, m, i, j, a} (Logic a, Zero a, m + i == n + j, fin n,
       fin m) =>
  ([i]a, [j]a) -> [m + i]a
Now, that type looks a bit scary, but essentially it's telling you that you can give it arguments of any sizes, and the result is valid for any size at least as large as the larger of the two you've given. Of course, this inferred type is way more polymorphic than you probably cared for, so you can give it something more readable:
test : {m, n} (fin m, fin n) => [m] -> [n] -> [max m n]
test x y = (zero#x) ^ (zero#y)
I believe this captures your intent perfectly. Note how Cryptol will make sure your inputs are finite, and the result size is the maximum of the two sizes given.
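As a quick sanity check (assuming the two-line definition above has been loaded; the prompt and output formatting may differ slightly between Cryptol versions), the generic version agrees with the fixed-width one from earlier:
Cryptol> test (1 : [2]) (5 : [8])
0x04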
Getting back to your example, Cryptol is telling you that to multiply by 16 you need at least 5 bits, and thus 2>=5 is not satisfiable. This is a bit cryptic, but arises from the use of literals which are polymorphically typed. You can use the zero trick to address the issue in the same way as before:
Cryptol> let shift (v, s:[2]) = v >>> ((zero#s)*16+8)
[warning] at <interactive>:1:32--1:38:
Defaulting type argument 'front' of '(#)' to 3
But note how Cryptol is warning you about the type of the zero that's used there, since the type of >>> is polymorphic enough to allow different-sized shifts/rotates:
Cryptol> :t (>>>)
(>>>) : {n, ix, a} (fin n, fin ix) => [n]a -> [ix] -> [n]a
In these cases, Cryptol will pick the smallest possible size to default to by looking at the expressions. Unfortunately, it does the wrong thing here. By picking size 3 for zero, you'll have a 5-bit shift amount, but your expression can produce a maximum value of 3*16+8 = 56, which requires at least 6 bits to represent. Note that Cryptol only uses the minimum size required to handle the multiplication there and does not care about overflows! This is why it's important to pay attention to such warnings.
To be clear: Cryptol did the right thing per the language rules on how type inference works, but it ended up picking a size that is just too small for what you wanted to do.
So, you should write your shift as follows:
Cryptol> let shift (v, s:[2]) = v >>> (((zero:[4])#s)*16+8)
Cryptol> :t shift
shift : {n, a} (fin n) => ([n]a, [2]) -> [n]a
The important thing here is to make sure the expression s*16+8 will fit in the final result; since s is only 2 bits wide, the largest value will be 56 as discussed above, which needs at least 6 bits to represent. This is why I chose [4] as the size of zero: the 4 padding bits plus the 2 bits of s give a 6-bit shift amount.
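As a quick check with an arbitrary 64-bit constant I picked (so this particular input/output pair is only illustrative, not from the original question): with s = 1 the rotate amount is 1*16+8 = 24 bits, so the low three bytes wrap around to the top:
Cryptol> shift (0x0123456789abcdef, 1)
0xabcdef0123456789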
The moral of the story here is that you should always be explicit about the sizes of your bitvectors, and Cryptol will give you the right framework to express your constraints in a polymorphic way to allow for code reuse without ambiguity, avoiding many of the pitfalls of Verilog and other similar languages.

Related

Are these languages regular or not?

Hi, I need some help: I have the following two languages, X and Y, and I need to determine whether they are regular or not.
I think that Y is not regular, and I applied the pumping lemma to show that.
For X I am not sure whether it is regular or not; I was thinking that X is the set of strings with an odd number of a's, which can easily be represented with an NFA.
Can anyone help me with that?
The first one (X) is regular, because you can construct a finite automaton for it:
(start) --- a --> (final) -- a --> (state)
                     ^                |
                     \------- a ------/
The second one (Y) is not regular, because you cannot construct a finite automaton for it: it would require memory to store the number of a's in order to later find one more b. That language is context-free, with a grammar:
S = T b
T = a T b
T = ε

Surprising Dafny failure to verify boundedness of set comprehension

Dafny has no problem with this definition of a set intersection function.
function method intersection(A: set<int>, B: set<int>): (r: set<int>)
{
  set x | x in A && x in B
}
But when it comes to union, Dafny complains, "a set comprehension must produce a finite set, but Dafny's heuristics can't figure out how to produce a bounded set of values for 'x'". A and B are finite, and so, clearly the union is, too.
function method union(A: set<int>, B: set<int>): (r: set<int>)
{
  set x | x in A || x in B
}
What explains this (to a beginner) seemingly discrepant behavior?
This is indeed potentially surprising!
First, let me note that in practice, Dafny has built-in operators for intersection and union that it knows preserve finiteness. So you don't need to use set comprehensions to express these ideas. Instead you could just say A * B and A + B respectively.
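For instance, a minimal sketch of union written with the built-in operator (no comprehension, so no finiteness heuristic is needed):
function method union(A: set<int>, B: set<int>): (r: set<int>)
{
  A + B  // built-in set union; Dafny knows this preserves finiteness
}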
However, my guess is that you're running into a more complicated example where you're using a set comprehension with a disjunction and are confused about why Dafny can't prove it finite.
Dafny uses syntactic heuristics to determine whether a set comprehension is finite. Unfortunately, these heuristics are not well documented anywhere. For purposes of this question, the key point is that the heuristics either depend on the type of the comprehension's bound variables, or look for a conjunct that constrains elements to be bounded in some other way. For example, Dafny can prove
set x: int | 0 <= x < 10 && ...
finite, as well as
set x:A | x in S && ...
In both cases, it is essential that the relevant bounds be conjuncts. Dafny has no syntactic heuristic for proving a bound for disjunctions, although one could imagine adding one. That is why Dafny cannot prove your union function finite.
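If you do want to keep the comprehension form, one workaround (a sketch relying on the membership heuristic described above, so whether it goes through may depend on your Dafny version) is to turn the disjunction into membership in the finite set A + B, which is a bound the heuristics recognize:
function method union(A: set<int>, B: set<int>): (r: set<int>)
{
  // x in (A + B) is a single membership constraint, so the finiteness heuristic applies
  set x | x in A + B
}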
As an aside, another workaround would be to use potentially infinite sets (written iset in Dafny). If you don't need to use the cardinality of the sets, then these might work better.

What exactly is the 'pumping length' in the Pumping lemma?

I'm trying to understand what is this 'magical' number 'n' that is used in every application of the Pumping lemma. After hours of research on the subject, I came to the following website: http://elvis.rowan.edu/~nlt/TheoryNotes/PumpingLemma.pdf
It states that n is "the longest a string can be without having a loop. The biggest n can be is s, though it might be smaller for some particular language."
From what I understand, if there is a language L then the pumping length of L is the number of states in the finite state automaton that recognizes L. Is this true?
If it is, then what exactly does the last line above mean by "though it might be smaller for some particular language"? It's a complete mess in my head. Could somebody clear it up, please?
A state doesn't recognise a language. A DFA recognises a language by accepting exactly the set of words in the language and no others. A DFA has many states.
If there is a regular language L, which can be modelled by the Pumping Lemma, it will have a property n.
For a DFA with s states, in order for it to accept L, s must be >= n.
The last line merely states that there are some languages in which s is greater than n, rather than equal.
This is probably more suited for https://cstheory.stackexchange.com/ or https://cs.stackexchange.com/ (not quite sure of the value of both myself).
Note: Trivially, not all DFAs with sufficient states will accept the language. Also, the fact that a language passes the pumping lemma doesn't mean it's regular (but failing it means it definitely isn't).
Note also that the terminology switches between FA and DFA - this is a bit lax, but because NDFAs have the same power as DFAs and DFAs are easier to write and understand, DFAs are used for the proof.
Edit: I'll give an example of a regular language, so you can get an idea of u, v, w, z and n. Then we'll discuss s.
L = xy^kz, k ≥ 2 (e.g. xyyz, xyyyz, xyyyyz)
u = xy
v = y
w = z
n = 4
Hence:
|z| = 3: xyz (i = 0) Not in L, but |z| < n
|z| = 4: xyyz (i = 1)
|z| = 5: xyyyz (i = 2)
etc
Hence, it's modelled by the Pumping Lemma. A DFA would need at least 4 states. So let's think of one.
-> State 1: x
State 1:
  -> State 2: y
State 2:
  -> State 3: y
State 3:
  -> State 3: y
  -> State 4: z
State 4:
  Accepting state
  Terminating state
4 states, as expected. Not possible to do it in fewer, as expected from n = 4. In this case, the example is quite simple so I don't think you can build one with more than 4 states (but that would be okay if it were needed).
I think one possibility is when you have an FA with an unreachable state. The FA has s states, but one is unreachable, so all strings of L are recognized using only (s-1) = n states, so n < s.

Explanation of this FIRST function

LL(1) Grammar:
(1) Var -> ID DimList
(2) DimList -> ε DimList'
(3) DimList' -> Dim DimList'
(4) DimList' -> ε
(5) Dim -> [ CONST ]
And, in the script that I am reading, it says that the function FIRST(ε DimList') gives {#, [}. But, how?
My guess is that since the right side of (2) begins with ε, it skips epsilon and takes FIRST(DimList') which is, considering (3) and (5), equal to {[}, BUT also, because of (4), takes FOLLOW(DimList') which is {#}.
The other way it could be is that, since (2) begins with ε, it skips epsilon and takes FIRST(DimList') BUT ALSO takes FOLLOW(DimList) from (2)...
The first one makes more sense to me, though I'm still in the process of learning the basics of LL(1) grammars, so I would appreciate it if someone took the time to make this clear. Thank you.
EDIT: And, of course, it could be that neither of these is true.
The usual definition of the FIRST function would result in FIRST(DimList) (or, if you like, FIRST(ε DimList')) being {ε, [}. ε is in FIRST(ε DimList') because both ε and DimList' are nullable. [ is an element because it could be the first symbol in a derivation of ε DimList', which is the same as saying that it could be the first symbol in a derivation of DimList'.
Another way of saying this is that:
FIRST(ε DimList' #) = {#, [}
We usually then define the function PREDICT:
PREDICT(ω) = FIRST(ω FOLLOW(ω))
and we can see that
PREDICT(DimList) = FIRST(DimList FOLLOW(DimList)) = {#, [}
Here, FIRST(ω) is the set of strings of terminals (of length ≤ 1) which could appear at the beginning of a derivation of ω, while PREDICT(ω) is the set of strings of terminals (of length ≤ 1) which could be present in the input when a derivation of ω is possible.
It's not uncommon to confuse FIRST and PREDICT, but it's better to keep the difference straight.
Note that all of these functions can be generalized to strings of maximum length k, which are usually written FIRST_k, FOLLOW_k and PREDICT_k, and the definition of PREDICT_k is similar to the above:
PREDICT_k(ω) = FIRST_k(ω FOLLOW_k(ω))
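To make this concrete for the grammar in the question (a sketch of the computation; I'm assuming Var is the start symbol and that # marks the end of the input):
FIRST(Dim) = {[} (rule 5)
FIRST(DimList') = {[, ε} (rules 3 and 4; ε comes from rule 4)
FIRST(DimList) = FIRST(ε DimList') = {[, ε} (rule 2; ε DimList' is nullable)
FOLLOW(DimList) = {#} (DimList ends rule 1)
FOLLOW(DimList') = FOLLOW(DimList) = {#} (DimList' ends rules 2 and 3)
PREDICT(DimList) = FIRST(DimList FOLLOW(DimList)) = FIRST(ε DimList' #) = {#, [}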

Ranges A to B where A > B in F#

I've just found something I'd call a quirk in F# and would like to know whether it's by design or by mistake, and if it's by design, why it is so...
If you write any range expression where the first term is greater than the second term, the returned sequence is empty. A look at Reflector suggests this is by design, but I can't really find a reason why it would have to be so.
An example to reproduce it is:
[1..10] |> List.length
[10..1] |> List.length
The first will print out 10 while the second will print out 0.
Tests were made in F# CTP 1.9.6.2.
EDIT: thanks for suggesting writing the range with an explicit step, but there's still one case (which is what inspired me to ask this question) that won't be covered. What if A and B are variables and neither is always greater than the other, although they're always different?
Considering that the range expression does not seem to get optimized at compile time anyway, is there any good reason for the code which determines the step (when it is not explicitly specified) for int ranges not to allow negative steps?
As suggested by other answers, you can do
[10 .. -1 .. 1] |> List.iter (printfn "%A")
e.g.
[start .. step .. stop]
Adam Wright - But you should be able to change the binding for types you're interested in to behave in any way you like (including counting down if x > y).
Taking Adam's suggestion into code:
let (..) a b =
    if a < b then seq { a .. b }
    else seq { a .. -1 .. b }
printfn "%A" (seq { 1 .. 10 })
printfn "%A" (seq { 10 .. 1 })
This works for int ranges. Have a look at the source code for (..): you may be able to use that to make it work over other types of ranges, but I'm not sure how you would get the right value of -1 for your specific type.
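If you would rather not shadow the built-in (..) operator, a small helper function does the same job for ints (a sketch; range is just a name I made up here, with the step fixed at 1 or -1):
let range a b =
    if a <= b then [ a .. b ]
    else [ a .. -1 .. b ]

range 1 10 |> List.length  // 10
range 10 1 |> List.length  // 10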
What "should" happen is, of course, subjective. Normal range notation in my mind defines [x..y] as the set of all elements greater than or equal to x AND less than or equal to y; an empty set if y < x. In this case, we need to appeal to the F# spec.
Range expressions expr1 .. expr2 are evaluated as a call to the overloaded operator (..), whose default binding is defined in Microsoft.FSharp.Core.Operators. This generates an IEnumerable<_> for the range of values between the given start (expr1) and finish (expr2) values, using an increment of 1. The operator requires the existence of a static member (..) (long name GetRange) on the static type of expr1 with an appropriate signature.
Range expressions expr1 .. expr2 .. expr3 are evaluated as a call to the overloaded operator (.. ..), whose default binding is defined in Microsoft.FSharp.Core.Operators. This generates an IEnumerable<_> for the range of values between the given start (expr1) and finish (expr3) values, using an increment of expr2. The operator requires the existence of a static member (..) (long name GetRange) on the static type of expr1 with an appropriate signature.
The standard doesn't seem to define the .. operator (at least, not that I can find). But you should be able to change the binding for types you're interested in to behave in any way you like (including counting down if x > y).
In Haskell, you can write [10, 9 .. 1]. Perhaps it works the same in F# (I haven't tried it)?
Edit:
It seems that the F# syntax is different, maybe something like [10..-1..1]
Ranges are generally expressed (in the languages and frameworks that support them) like this:
low_value <to> high_value
Can you give a good argument why a range ought to be able to be expressed differently? Since you were requesting a range from a higher number to a lower number does it not stand to reason that the resulting range would have no members?
