Rewriting a Z3_ast during traversal in C++

The to_expr function leads to an error. Could you advise what is wrong with the code below?
context z3_cont;
expr x = z3_cont.int_const("x");
expr y = z3_cont.int_const("y");
expr ge = ((y==3) && (x==2));
ge = swap_tree( ge );
where swap_tree is a function that is supposed to swap the operands of every binary operation. It is defined as follows:
expr swap_tree( expr e ) {
    Z3_ast ee[2];
    if ( e.is_app() && e.num_args() == 2) {
        for ( int i = 0; i < 2; ++i ) {
            ee[ 1 - i ] = swap_tree( e.arg(i) );
        }
        for ( int i = 0; i < 2; ++i ) {
            cout <<" ee[" << i << "] : " << to_expr( z3_cont, ee[ i ] ) << endl;
        }
        return to_expr( z3_cont, Z3_update_term( z3_cont, e, 2, ee ) );
    }
    else
        return e;
}

The problem is "referencing counting". A Z3 object can be garbage collected by the system if its reference counter is 0. The Z3 C++ API provides "smart pointers" (expr, sort, ...) for automatically managing the reference counters for us. Your code uses Z3_ast ee[2]. In the for-loop, you store the result of swap_tree(e.arg(0)) into ee[0]. Since the reference counter is not incremented, this Z3 object may be deleted when executing the second iteration of the loop.
Here is a possible fix:
expr swap_tree( expr e ) {
    if ( e.is_app() && e.num_args() == 2) {
        // using smart-pointers to store the intermediate results.
        expr ee0(z3_cont), ee1(z3_cont);
        ee0 = swap_tree( e.arg(0) );
        ee1 = swap_tree( e.arg(1) );
        Z3_ast ee[2] = { ee1, ee0 };
        return to_expr( z3_cont, Z3_update_term( z3_cont, e, 2, ee ) );
    }
    else {
        return e;
    }
}
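For reference, here is a minimal sketch of how the fixed swap_tree might be exercised, reusing the declarations from the question; the comments show roughly what Z3's expression printer is expected to produce, though the exact text may differ:
#include <iostream>
#include "z3++.h"
using namespace z3;
context z3_cont;                    // the global context that swap_tree uses
// ... swap_tree defined as in the fix above ...
int main() {
    expr x = z3_cont.int_const("x");
    expr y = z3_cont.int_const("y");
    expr ge = (y == 3) && (x == 2);
    std::cout << "before: " << ge << std::endl;   // e.g. (and (= y 3) (= x 2))
    ge = swap_tree( ge );
    std::cout << "after:  " << ge << std::endl;   // e.g. (and (= 2 x) (= 3 y))
    return 0;
}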

Related

General insertion sort algorithm not moving the first object in list

I need to create a general insertion sort algorithm using move semantics. I have it working for entire lists of different types of objects except for the very first object in the list.
template<typename Iter, typename Comparator>
void insertionSort(const Iter& begin, const Iter& end, Comparator compareFn)
{
    for (auto i = begin + 1; i < end; i++)
    {
        auto currentthing = std::move(*i);
        auto j = std::move(i - 1);
        while (j >= begin and compareFn(*j, currentthing))
        {
            *(j + 1) = std::move(*j);
            if (j == begin)
                break;
            j--;
        }
        *(j + 1) = std::move(currentthing);
    }
}
Calling it from my main function, which sorts a vector of ints and a vector of strings:
int main()
{
    vector<int> numbers = { 0, 1, 8, 4, 2, 9, 5, 3, 6, 7, 10 };
    insertionSort(numbers.begin(), numbers.end(), std::less<int>());
    cout << "Sorted: " << numbers << "\n";
    vector<string> names = { "p", "a", "b", "d", "c", "f", "e" };
    insertionSort(names.begin(), names.end(), std::greater<string>());
    cout << "Sorted: " << names << "\n";
    return 0;
}
This outputs the following:
Sorted: [ 0 10 9 8 7 6 5 4 3 2 1 ]
Sorted: [ a b c d e f ]
The while loop should break when j equals i and not when j equals begin. So the following:
if (j == begin)
    break;
should actually be:
if (j == i)
    break;
Since j starts at i - 1 and only ever decreases, this break never actually fires; when the new element belongs at the front, j steps to one position before begin, the j >= begin test ends the loop, and the final *(j + 1) assignment then writes to *begin as intended.
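For comparison, here is a sketch of an alternative formulation that never has to step an iterator before begin: it keeps j as the insertion position rather than the element being compared. This is my own variant, not the poster's code, but it uses the same comparator semantics.
#include <utility>  // std::move
template<typename Iter, typename Comparator>
void insertionSort(Iter begin, Iter end, Comparator compareFn)
{
    if (begin == end)
        return;
    for (auto i = begin + 1; i < end; ++i)
    {
        auto current = std::move(*i);
        auto j = i;
        // shift elements one slot to the right while compareFn(*(j - 1), current) holds
        while (j > begin && compareFn(*(j - 1), current))
        {
            *j = std::move(*(j - 1));
            --j;
        }
        *j = std::move(current);  // j is the final insertion position
    }
}
Note that, like the original, a true result from compareFn(*prev, current) causes a shift, so std::less still yields the descending order shown in the output above.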

How does xv6 write to the terminal?

The printf function calls write (see forktest.c):
void printf ( int fd, char *s, ... )
{
write( fd, s, strlen(s) );
}
Passing 1 as the fd writes to the console (as 1 maps to stdout). But where is write defined? I only see its declaration in user.h.
int write ( int, void*, int );
I'm assuming it somehow gets redirected to filewrite in file.c.
int filewrite (struct file *f, char *addr, int n )
{
    int r;
    if ( f->writable == 0 )
        return -1;
    if ( f->type == FD_PIPE )
        return pipewrite( f->pipe, addr, n );
    if ( f->type == FD_INODE )
    {
        // write a few blocks at a time to avoid exceeding
        // the maximum log transaction size, including
        // i-node, indirect block, allocation blocks,
        // and 2 blocks of slop for non-aligned writes.
        // this really belongs lower down, since writei()
        // might be writing a device like the console.
        int max = ( ( MAXOPBLOCKS - 1 - 1 - 2 ) / 2 ) * 512;
        int i = 0;
        while ( i < n )
        {
            int n1 = n - i;
            if ( n1 > max )
                n1 = max;
            begin_op();
            ilock( f->ip );
            if ( ( r = writei( f->ip, addr + i, f->off, n1 ) ) > 0 )
                f->off += r;
            iunlock( f->ip );
            end_op();
            if ( r < 0 )
                break;
            if ( r != n1 )
                panic( "short filewrite" );
            i += r;
        }
        return i == n ? n : -1;
    }
    panic( "filewrite" );
}
And filewrite calls writei which is defined in fs.c.
int writei ( struct inode *ip, char *src, uint off, uint n )
{
    uint tot, m;
    struct buf *bp;
    if ( ip->type == T_DEV )
    {
        if ( ip->major < 0 || ip->major >= NDEV || !devsw[ ip->major ].write )
            return -1;
        return devsw[ ip->major ].write( ip, src, n );
    }
    if ( off > ip->size || off + n < off )
        return -1;
    if ( off + n > MAXFILE*BSIZE )
        return -1;
    for ( tot = 0; tot < n; tot += m, off += m, src += m )
    {
        bp = bread( ip->dev, bmap( ip, off/BSIZE ) );
        m = min( n - tot, BSIZE - off%BSIZE );
        memmove( bp->data + off%BSIZE, src, m );
        log_write( bp );
        brelse( bp );
    }
    if ( n > 0 && off > ip->size )
    {
        ip->size = off;
        iupdate( ip );
    }
    return n;
}
How does all this result in the terminal displaying the characters? How does the terminal know to read fd 1 for display, and where to find fd 1? What is the format of fd 1? Is it a standard?
Below is the full path from printf to the terminal. The gist is that eventually, xv6 writes the character to the CPU's serial port.
QEMU is started with the flag -nographic or -serial mon:stdio, which tells it to use the terminal to send data to, and receive data from, the CPU's serial port.
Step 1) printf in forktest.c
void printf ( int fd, const char *s, ... )
{
write( fd, s, strlen( s ) );
}
void forktest ( void )
{
...
printf( 1, "fork test\n" );
...
}
Step 2) write in usys.S
.globl write
write:
movl $SYS_write, %eax
int $T_SYSCALL
ret
Step 3) sys_write in sysfile.c
int sys_write ( void )
{
...
argfd( 0, 0, &f )
...
return filewrite( f, p, n );
}
static int argfd ( int n, int *pfd, struct file **pf )
{
...
f = myproc()->ofile[ fd ]
...
}
Previously, during system initialization, main in init.c was called; it creates the stdin (0), stdout (1), and stderr (2) file descriptors. This is what argfd finds when it looks up the file descriptor argument passed to sys_write.
int main ( void )
{
...
if ( open( "console", O_RDWR ) < 0 )
{
mknod( "console", 1, 1 ); // stdin
open( "console", O_RDWR );
}
dup( 0 ); // stdout
dup( 0 ); // stderr
...
}
stdin, stdout, and stderr are inodes of type T_DEV because they are created using mknod (see sys_mknod in sysfile.c):
int sys_mknod ( void )
{
...
ip = create( path, T_DEV, major, minor )
...
}
The major device number of 1 that is used to create them is mapped to the console. See file.h
// Table mapping major device number to device functions
struct devsw
{
int ( *read )( struct inode*, char*, int );
int ( *write )( struct inode*, char*, int );
};
extern struct devsw devsw [];
#define CONSOLE 1
Step 4) filewrite in file.c
int filewrite ( struct file *f, char *addr, int n )
{
...
if ( f->type == FD_INODE )
{
...
writei( f->ip, addr + i, f->off, n1 )
...
}
...
}
Step 5) writei in fs.c
int writei ( struct inode *ip, char *src, uint off, uint n )
{
...
if ( ip->type == T_DEV )
{
...
return devsw[ ip->major ].write( ip, src, n );
}
...
}
The call devsw[ ip->major ].write( ip, src, n ) therefore becomes devsw[ CONSOLE ].write( ip, src, n ).
Previously, during system initialization, consoleinit mapped this entry to the function consolewrite (see console.c):
void consoleinit ( void )
{
...
devsw[ CONSOLE ].write = consolewrite;
devsw[ CONSOLE ].read = consoleread;
...
}
Step 6) consolewrite in console.c
int consolewrite ( struct inode *ip, char *buf, int n )
{
...
for ( i = 0; i < n; i += 1 )
{
consputc( buf[ i ] & 0xff );
}
...
}
Step 7) consputc in console.c
void consputc ( int c )
{
...
uartputc( c );
...
}
Step 8) uartputc in uart.c.
The out assembly instruction is used to write to the CPU's serial port.
#define COM1 0x3f8 // serial port
...
void uartputc ( int c )
{
...
outb( COM1 + 0, c );
}
Step 9) QEMU is configured to use the serial port for communication in the Makefile through the -nographic or -serial mon:stdio flags. QEMU uses the terminal to send data to the serial port, and to display data from the serial port.
qemu: fs.img xv6.img
$(QEMU) -serial mon:stdio $(QEMUOPTS)
qemu-nox: fs.img xv6.img
$(QEMU) -nographic $(QEMUOPTS)
fd == 1 refers to stdout, or standard output. It's a common feature of Unix-like operating systems. The kernel knows that it's not a real file; writes to stdout are mapped to terminal output.
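To make that concrete, here is a minimal sketch (ordinary POSIX C++, not xv6 code) that writes to fd 1 directly without any stdio machinery; the same shape works in an xv6 user program through the write() declared in user.h:
#include <unistd.h>
int main()
{
    // Nothing opens fd 1 here: the process inherits it already open,
    // pointing at the terminal (or, in xv6, at the console device).
    const char msg[] = "hello on fd 1\n";
    write(1, msg, sizeof msg - 1);
    return 0;
}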

Creating a table in a table in the Lua C API

I use this code to create a table in a table (like a namespace) with the Lua C API:
JNIEXPORT void JNICALL Java_com_naef_jnlua_LuaState_lua_1import_1tables(JNIEnv *env,
        jobject obj, jstring namespace) {
    lua_State *L;
    JNLUA_ENV(env);
    L = getluathread(obj);
    char * str= getstringchars(namespace);
    char ** res = NULL;
    char * p = strtok (str, ".");
    int n_spaces = 0, i;
    while (p) {
        res = realloc (res, sizeof (char*) * ++n_spaces);
        if (res == NULL)
            exit (-1);
        res[n_spaces-1] = p;
        p = strtok (NULL, ".");
    }
    for (i = 0; i < (n_spaces); ++i) {
        if (i == 0) {
            lua_newtable(L);
        } else if (i == (n_spaces - 1)) {
            lua_pushlstring(L, res[i], (sizeof(res[i])/sizeof(char))-1);
            lua_getglobal(L, res[i]);
            break;
        } else {
            lua_pushlstring(L, res[i], (sizeof(res[i])/sizeof(char))-1);
            lua_newtable(L);
        }
    }
    for (i = (n_spaces - 2); i >= 0 ; i--) {
        if (i == 0) {
            lua_setglobal(L, res[i]);
            break;
        } else {
            lua_settable(L, -3);
        }
    }
    free(res);
}
This should be equivalent to the following hardcoded version:
lua_newtable( L );                     /* ==> stack: ..., {} */
{
    lua_pushliteral( L, "b" );         /* ==> stack: ..., {}, "b" */
    lua_newtable( L );                 /* ==> stack: ..., {}, "b", {} */
    {
        lua_pushliteral( L, "c" );     /* ==> stack: ..., {}, "b", {}, "c" */
        lua_newtable( L );             /* ==> stack: ..., {}, "b", {}, "c", {} */
        {
            lua_pushliteral( L, "d" );
            lua_getglobal(L, "MyTable");
            lua_settable( L, -3 );
        }
        lua_settable( L, -3 );         /* ==> stack: ..., {}, "b", {} */
    }
    lua_settable( L, -3 );             /* ==> stack: ..., {} */
}
lua_setglobal( L, "a" );               /* ==> stack: ... */
When I call the function Java_com_naef_jnlua_LuaState_lua_1import_1tables(), the string looks like "com.naef.jnlua.test.fixture.TestObject". Here TestObject is the equivalent of "MyTable" in the hardcoded version, and the other components such as "com" are the nested tables, with the TestObject table being the innermost one.
When I then try to execute the Lua code com.naef.jnlua.test.fixture.TestObject after running this function, I get:
attempt to index field 'naef' (a nil value)
Where is my mistake?

How to loop over array in Z3Py

As part of a reverse engineering exercise, I'm trying to write a Z3 solver to find a username and password that satisfy the program below. This is especially tough because the z3py tutorial that everyone refers to (rise4fun) is down.
#include <iostream>
#include <string>
using namespace std;
int main() {
    string name, pass;
    cout << "Name: ";
    cin >> name;
    cout << "Pass: ";
    cin >> pass;
    int sum = 0;
    for (size_t i = 0; i < name.size(); i++) {
        char c = name[i];
        if (c < 'A') {
            cout << "Lose: char is less than A" << endl;
            return 1;
        }
        if (c > 'Z') {
            sum += c - 32;
        } else {
            sum += c;
        }
    }
    int r1 = 0x5678 ^ sum;
    int r2 = 0;
    for (size_t i = 0; i < pass.size(); i++) {
        char c = pass[i];
        c -= 48;
        r2 *= 10;
        r2 += c;
    }
    r2 ^= 0x1234;
    cout << "r1: " << r1 << endl;
    cout << "r2: " << r2 << endl;
    if (r1 == r2) {
        cout << "Win" << endl;
    } else {
        cout << "Lose: r1 and r2 don't match" << endl;
    }
}
I got that code from the assembly of a binary, and while it may be wrong I want to focus on writing the solver. I'm starting with the first part, just calculating r1, and this is what I have:
from z3 import *
s = Solver()
sum = Int('sum')
name = Array('name', IntSort(), IntSort())
for c in name:
    s.add(c < 65)
    if c > 90:
        sum += c - 32
    else:
        sum += c
r1 = Xor(sum, 0x5678)
print s.check()
print s.model()
All I'm asserting is that there are no letters less than 'A' in the array, so I expect to get back an array of any size that has numbers greater than 65.
Obviously this is completely wrong, mainly because it loops forever. Also, I'm not sure I'm calculating sum correctly, because I don't know whether it's initialized to 0. Could someone help me figure out how to get this first loop working?
EDIT: I was able to get a z3 script that is close to the C++ code shown above:
from z3 import *
s = Solver()
sum = 0
name = Array('name', BitVecSort(32), BitVecSort(32))
i = Int('i')
for i in xrange(0, 1):
    s.add(name[i] >= 65)
    s.add(name[i] < 127)
    if name[i] > 90:
        sum += name[i] - 32
    else:
        sum += name[i]
r1 = sum ^ 0x5678
passwd = Array('passwd', BitVecSort(32), BitVecSort(32))
r2 = 0
for i in xrange(0, 5):
    s.add(passwd[i] < 127)
    s.add(passwd[i] >= 48)
    c = passwd[i] - 48
    r2 *= 10
    r2 += c
r2 ^= 0x1234
s.add(r1 == r2)
print s.check()
print s.model()
This code was able to give me a correct username and password. However, I hardcoded the lengths of one for the username and five for the password. How would I change the script so I wouldn't have to hard code the lengths? And how would I generate a different solution each time I run the program?
Arrays in Z3 do not necessarily have any bounds. In this case the index sort is Int, which means unbounded integers (not machine integers). Consequently, for c in name will run forever because it enumerates name[0], name[1], name[2], ...
It seems that you actually have a bound in the original program (name.size()), so it would suffice to enumerate up to that limit. Otherwise you might need a quantifier, e.g., ∀ x of Int sort. name[x] < 65. This comes with all the usual warnings about quantifiers, of course (see, e.g., the Z3 Guide).
Suppose the length is to be determined. Here is what I think you could do:
length = Int('length')
x = Int('x')
s.add(ForAll(x,Implies(And(x>=0,x<length),And(passwd[x] < 127,passwd[x] >=48))))

Parsing a TeX-like language with lpeg

I am struggling to get my head around LPEG. I have managed to produce one grammar which does what I want, but I have been beating my head against this one and not getting far. The idea is to parse a document which is a simplified form of TeX. I want to split a document into:
Environments, which are \begin{cmd} and \end{cmd} pairs.
Commands which can either take an argument like so: \foo{bar} or can be bare: \foo.
Both environments and commands can have parameters like so: \command[color=green,background=blue]{content}.
Other stuff.
I also would like to keep track of line number information for error handling purposes. Here's what I have so far:
lpeg = require("lpeg")
lpeg.locale(lpeg)
-- Assume a lot of "X = lpeg.X" here.
-- Line number handling from http://lua-users.org/lists/lua-l/2011-05/msg00607.html
-- with additional print statements to check they are working.
local newline = P"\r"^-1 * "\n" / function (a) print("New"); end
local incrementline = Cg( Cb"linenum" )/ function ( a ) print("NL"); return a + 1 end , "linenum"
local setup = Cg ( Cc ( 1) , "linenum" )
nl = newline * incrementline
space = nl + lpeg.space
-- Taken from "Name-value lists" in http://www.inf.puc-rio.br/~roberto/lpeg/
local identifier = (R("AZ") + R("az") + P("_") + R("09"))^1
local sep = lpeg.S(",;") * space^0
local value = (1-lpeg.S(",;]"))^1
local pair = lpeg.Cg(C(identifier) * space ^0 * "=" * space ^0 * C(value)) * sep^-1
local list = lpeg.Cf(lpeg.Ct("") * pair^0, rawset)
local parameters = (P("[") * list * P("]")) ^-1
-- And the rest is mine
anything = C( (space^1 + (1-lpeg.S("\\{}")) )^1) * Cb("linenum") / function (a,b) return { text = a, line = b } end
begin_environment = P("\\begin") * Ct(parameters) * P("{") * Cg(identifier, "environment") * Cb("environment") * P("}") / function (a,b) return { params = a[1], environment = b } end
end_environment = P("\\end{") * Cg(identifier) * P("}")
texlike = lpeg.P{
"document";
document = setup * V("stuff") * -1,
stuff = Cg(V"environment" + anything + V"bracketed_stuff" + V"command_with" + V"command_without")^0,
bracketed_stuff = P"{" * V"stuff" * P"}" / function (a) return a end,
command_with =((P("\\") * Cg(identifier) * Ct(parameters) * Ct(V"bracketed_stuff"))-P("\\end{")) / function (i,p,n) return { command = i, parameters = p, nodes = n } end,
command_without = (( P("\\") * Cg(identifier) * Ct(parameters) )-P("\\end{")) / function (i,p) return { command = i, parameters = p } end,
environment = Cg(begin_environment * Ct(V("stuff")) * end_environment) / function (b,stuff, e) return { b = b, stuff = stuff, e = e} end
}
It almost works!
> texlike:match("\\foo[one=two]thing\\bar")
{
command = "foo",
parameters = {
{
one = "two",
},
},
}
{
line = 1,
text = "thing",
}
{
command = "bar",
parameters = {
},
}
But! First, I can't get the line number handling part to work at all. The function within incrementline is never fired.
I also can't quite work out how nested capture information is passed to handling functions (which is why I have scattered Cg, C and Ct semirandomly over the grammar). This means that only one item is returned from within a command_with:
> texlike:match("\\foo{text \\command moretext}")
{
command = "foo",
nodes = {
{
line = 1,
text = "text ",
},
},
parameters = {
},
}
I would also love to be able to check that the environment starts and ends match up, but when I tried to do so, my back references from "begin" were not in scope by the time I got to "end". I don't know where to go from here.
Late answer but hopefully it'll offer some insight if you're still looking for a solution or wondering what the problem was.
There are a couple of issues with your grammar, some of which can be tricky to spot.
Your line increment here looks incorrect:
local incrementline = Cg( Cb"linenum" ) /
function ( a ) print("NL"); return a + 1 end,
"linenum"
It looks like you meant to create a named capture group and not an anonymous group. The backcapture linenum is essentially being used like a variable. The problem is that, because this is inside an anonymous capture, linenum will not update properly -- function(a) will always receive 1 when called. You need to move the closing ) to the end so that "linenum" is included:
local incrementline = Cg( Cb"linenum" /
function ( a ) print("NL"); return a + 1 end,
"linenum")
Relevant LPeg documentation for Cg capture.
The second problem is with your anything non-terminal rule:
anything = C( (space^1 + (1-lpeg.S("\\{}")) )^1) * Cb("linenum") ...
There are several things to be careful about here. First, a named Cg capture (from the incrementline rule, once it's fixed) doesn't produce anything unless it's in a table or you backreference it. The second major thing is that it has an ad hoc scope, like a variable. More precisely, its scope ends once you close it in an outer capture -- like what you're doing here:
C( (space^1 + (...) )^1)
Which means by the time you reference its backcapture with * Cb("linenum"), that's already too late -- the linenum you really want already closed its scope.
I always found LPeg's re syntax a bit easier to grok so I've rewritten the grammar with that instead:
local grammar_cb =
{
fold = pairfold,
resetlinenum = resetlinenum,
incrementlinenum = incrementlinenum, getlinenum = getlinenum,
error = error
}
local texlike_grammar = re.compile(
[[
document <- '' -> resetlinenum {| docpiece* |} !.
docpiece <- {| envcmd |} / {| cmd |} / multiline
beginslash <- cmdslash 'begin'
endslash <- cmdslash 'end'
envcmd <- beginslash paramblock? {:beginenv: envblock :} (!endslash docpiece)*
endslash openbrace {:endenv: =beginenv :} closebrace / &beginslash {} -> error .
envblock <- openbrace key closebrace
cmd <- cmdslash {:command: identifier :} (paramblock? cmdblock)?
cmdblock <- openbrace {:nodes: {| docpiece* |} :} closebrace
paramblock <- opensq ( {:parameters: {| parampairs |} -> fold :} / whitesp) closesq
parampairs <- parampair (sep parampair)*
parampair <- key assign value
key <- whitesp { identifier }
value <- whitesp { [^],;%s]+ }
multiline <- (nl? text)+
text <- {| {:text: (!cmd !closebrace !%nl [_%w%p%s])+ :} {:line: '' -> getlinenum :} |}
identifier <- [_%w]+
cmdslash <- whitesp '\'
assign <- whitesp '='
sep <- whitesp ','
openbrace <- whitesp '{'
closebrace <- whitesp '}'
opensq <- whitesp '['
closesq <- whitesp ']'
nl <- {%nl+} -> incrementlinenum
whitesp <- (nl / %s)*
]], grammar_cb)
The callback functions are straightforwardly defined as:
local function pairfold(...)
local t, kv = {}, ...
if #kv % 2 == 1 then return ... end
for i = #kv, 2, -2 do
t[ kv[i - 1] ] = kv[i]
end
return t
end
local incrementlinenum, getlinenum, resetlinenum do
local line = 1
function incrementlinenum(nl)
assert(not nl:match "%S")
line = line + #nl
end
function getlinenum() return line end
function resetlinenum() line = 1 end
end
Testing the grammar (via texlike_grammar:match(test1)) with a non-trivial TeX-like string spanning multiple lines:
local test1 = [[\foo{text \bar[color = red, background = black]{
moretext \baz{
even
more text} }
this time skipping multiple
lines even, such wow!}]]
Produces the following AST in Lua table format:
{
command = "foo",
nodes = {
{
text = "text",
line = 1
},
{
parameters = {
color = "red",
background = "black"
},
command = "bar",
nodes = {
{
text = " moretext",
line = 2
},
{
command = "baz",
nodes = {
{
text = "even ",
line = 3
},
{
text = "more text",
line = 4
}
}
}
}
},
{
text = "this time skipping multiple",
line = 7
},
{
text = "lines even, such wow!",
line = 9
}
}
}
And a second test for begin/end environments:
local test2 = [[\begin[p1
=apple,
p2=blue]{scope} scope foobar
\end{scope} global foobar]]
Which seems to give approximately what you're looking for:
{
{
{
text = " scope foobar",
line = 3
},
parameters = {
p1 = "apple",
p2 = "blue"
},
beginenv = "scope",
endenv = "scope"
},
{
text = " global foobar",
line = 4
}
}
