z3 with workflow satisfiability - z3

I am new to Z3, and after working through many tutorials and reading almost all the related questions, I still have some doubts about how to "encode" a problem with Z3. Can somebody help me, please?
What I am trying to do is to encode the workflow satisfiability problem with Z3.
I have two arrays representing roles (a role-task relation) and privileges (a user-role relation). I also have a datatype, a User-Role pair, representing the "attributes" of a task.
(declare-datatypes (User Role) ((Pair (mk-pair (first User) (second Role)))))
(declare-const Privs (Array User Role))
(declare-const Roles (Array Role (Pair User Role)))
Then I am trying to assert that for any task (for all) there is an element in Privs which contains a user-role relation, and an element in Roles which contains a Role-"Task" (user-role pair) relation, like this:
(assert (forall ((oneTask (Pair User Role)))
(and (= (select Privs (first oneTask)) (second oneTask))
(= (select Roles (second oneTask)) oneTask))))
Up to this point I get a sat answer and a model (uninterpreted, since I am using uninterpreted sorts).
But here is where my doubts begin.
1) The next step is to ask whether, given two workflows each with a list of tasks (user-role pairs), I can assert the same for all the tasks in the list. I tried creating a new const which is a list of tasks, like this:
(declare-const Workflow (List (Pair User Role)))
Is there any way in Z3 to specify an assertion over ALL the elements of a list (a workflow in my case)?
2) How can one specify restrictions over the set of users or assignments, and moreover how can one express limits on execution time? For instance, the execution of a set of tasks must not take more than n seconds.
3) Is there any way to get an interpreted model when using interpreted tasks? Let's say something like: PRIVS = (U1, R1), (U2, R2), Role = (R1, T1) and wf = T1(U1, R1).
Can somebody please help me understand how to attack the problem from a Z3 point of view?

Z3 supports standard first-order quantification. If you want to quantify over what amounts to the elements of a container object (List), you will be left with having to encode access to the container objects. So for your list example, when enforcing a property on all elements you will need to define auxiliary relations that access the list elements. For example, you can define a recursive relation that is true on Nil, and for non-empty lists holds if the predicate of interest holds on the head of the list and the relation holds recursively on the tail of the list. The catch is of course that such encodings quickly lead to problems where Z3 diverges, predominantly on satisfiable instances. Arrays are of course different: you have direct access to each element in the range of an array by quantifying over the domain and selecting each index into the array.
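To make the shape of that recursive relation concrete, here is a minimal sketch in plain Python (not Z3, and not an SMT encoding). It only illustrates the definition: true on Nil, otherwise the predicate on the head and the relation on the tail. The names holds_on_all, from_iterable, privs, workflow and task_is_authorised are hypothetical and exist only for this illustration.

```python
# A cons list: None plays the role of Nil, (head, tail) is a non-empty list.

def holds_on_all(pred, lst):
    """True on Nil; otherwise pred(head) and the relation on the tail."""
    if lst is None:
        return True
    head, tail = lst
    return pred(head) and holds_on_all(pred, tail)

def from_iterable(items):
    """Build a cons list from an ordinary Python iterable."""
    lst = None
    for item in reversed(list(items)):
        lst = (item, lst)
    return lst

# Hypothetical data: a user-to-role map and a workflow of (user, role) tasks.
privs = {"u1": "r1", "u2": "r2"}
workflow = from_iterable([("u1", "r1"), ("u2", "r2")])

def task_is_authorised(task):
    user, role = task
    return privs.get(user) == role

all_ok = holds_on_all(task_is_authorised, workflow)
```

In Z3 the same structure would be written as a recursive relation over the List datatype, with the divergence caveat mentioned above.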
I don't understand what you mean by 'user assignments'. You can specify time limits by setting options: "(set-option :timeout 1000)" sets a one-second timeout (the value is in milliseconds).
I don't understand your last question. Sorry.

Related

Auto increment id Neo4j to retrieve elements in insert order

Recently, I have been experimenting with Neo4j. I like the idea, but I am facing a problem that I never faced with relational databases.
I want to perform these inserts and then return them exactly in the insertion order.
Insert elements:
create(p1:Person {name:"Marc"})
create(p2:Person {name:"John"})
create(p3:Person {name:"Paul"})
create(p4:Person {name:"Steve"})
create(p5:Person {name:"Andrew"})
create(p6:Person {name:"Alice"})
create(p7:Person {name:"Bob"})
And to return them:
match(p:Person) return p order by id(p)
I receive the elements in the following order:
Paul
Andrew
Marc
John
Steve
Alice
Bob
I note that these elements are not returned in insertion order when sorted with the id function.
In fact the ids of my elements are the following:
Marc: 18221
John: 18222
Paul: 18208
Steve: 18223
Andrew: 18209
Alice: 18224
Bob: 18225
How does the Neo4j id function work? I read that it generates an auto-incremented id, but its mechanism seems a little strange. How do I return items in insertion order? I thought about creating a timestamp attribute for each node, but I don't think it's the best choice.
If you're looking to generate sequence numbers in Neo4j then you need to manage this yourself using a strategy that works best in your application.
In ours we maintain sequence numbers in key/value pair nodes where Scope is the application name given to the sequence number range, and Value is the last sequence number used. When we generate a node of a given type, such as Product, then we increment the sequence number and assign it to our new node.
MERGE (n:Sequence {Scope: 'Product'})
SET n.Value = COALESCE(n.Value, 0) + 1
WITH n.Value AS seq
CREATE (product:Product)
SET product.UniqueId = seq
With this you can create as many sequence numbers as you need just by creating sequence nodes with unique scope names.
For more examples and tests see the AutoInc.Neo4j project https://github.com/neildobson-au/AutoInc/blob/master/src/AutoInc.Neo4j/Neo4jUniqueIdGenerator.cs
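As a rough in-memory analogue of the Sequence node pattern, the following Python sketch keeps one counter per scope and stamps each new item with the next value. It is not Neo4j code; SequenceStore, next_value and unique_id are hypothetical names, and a real deployment would still need the database to make the increment atomic.

```python
class SequenceStore:
    """One counter per scope, mirroring COALESCE(n.Value, 0) + 1."""

    def __init__(self):
        self._last = {}

    def next_value(self, scope):
        value = self._last.get(scope, 0) + 1
        self._last[scope] = value
        return value

store = SequenceStore()
names = ["Marc", "John", "Paul", "Steve", "Andrew", "Alice", "Bob"]

# Stamp each person with a sequence number at creation time.
people = [{"name": n, "unique_id": store.next_value("Person")} for n in names]

# Sorting by the application-managed number recovers insertion order.
ordered = [p["name"] for p in sorted(people, key=lambda p: p["unique_id"])]
```

Ordering by the stored property, rather than by the internal id, is what makes the insertion order recoverable.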
Neo4j maintains the id internally, and your business code should not depend on it.
Generally it is auto-incremented, but if nodes have been deleted, Neo4j may reuse the deleted ids according to the reuse policy of the Neo4j server.
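The effect of id reuse on ordering can be seen in a toy model. The allocator below is an assumption made purely for illustration and is not how Neo4j actually manages ids; it only shows that once freed ids are handed out again, sorting by id no longer matches insertion order.

```python
class ToyIdAllocator:
    """Hands out increasing ids, but prefers ids that were freed earlier."""

    def __init__(self):
        self._next = 0
        self._free = []

    def allocate(self):
        if self._free:
            return self._free.pop(0)
        value = self._next
        self._next += 1
        return value

    def release(self, node_id):
        self._free.append(node_id)

alloc = ToyIdAllocator()
ids = {}
for name in ["Marc", "John", "Paul"]:
    ids[name] = alloc.allocate()

# Delete John; his id becomes available for reuse.
alloc.release(ids.pop("John"))

for name in ["Steve", "Andrew"]:
    ids[name] = alloc.allocate()

insertion_order = list(ids)            # dicts keep insertion order
by_id = sorted(ids, key=ids.get)       # what "order by id" would give
```

Here Steve is inserted after Paul but receives the smaller, recycled id, so the two orderings differ.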

Mnesia Errors case_clause in QLC query without a case clause

I have the following function for a hacky project:
% The Record variable is some known record with an associated table.
Query = qlc:q([Existing ||
Existing <- mnesia:table(Table),
ExistingFields = record_to_fields(Existing),
RecordFields = record_to_fields(Record),
ExistingFields == RecordFields
]).
The function record_to_fields/1 simply drops the record name and ID from the tuple so that I can compare the fields themselves. If anyone wants context, it's because I pre-generate a unique ID for a record before attempting to insert it into Mnesia, and I want to make sure that a record with identical fields (but different ID) does not exist.
This results in the following (redacted for clarity) stack trace:
{aborted, {{case_clause, {stuff}},
[{db, '-my_func/2-fun-1-',8, ...
This points to the line where I declare Query; however, there is no case clause in sight. What is causing this error?
(Will answer myself, but I appreciate a comment that could explain how I could achieve what I want)
EDIT: this wouldn't be necessary if I could simply mark certain fields as unique, and Mnesia had a dedicated insert/1 or create/1 function.
For your example, I think your solution is clearer anyway (although it seems you can pull the record_to_fields(Record) portion outside the comprehension so it isn't getting calculated over and over.)
Yes, QLC list comprehensions can only have generators and filters, not assignments. But you can cheat a little by writing an assignment as a one-element generator. For instance, you can rewrite your expression like this:
RecordFields = record_to_fields(Record),
Query = qlc:q([Existing ||
Existing <- mnesia:table(Table),
ExistingFields <- [record_to_fields(Existing)],
ExistingFields == RecordFields
]).
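The one-element generator trick is not specific to Erlang. As an analogy only, here is the same idea in a Python list comprehension, where `for existing_fields in [ ... ]` binds a name exactly once per row. The table contents and the record_to_fields helper are hypothetical stand-ins for the Mnesia table and the function from the question.

```python
def record_to_fields(record):
    """Drop the record name and the ID, keeping only the data fields."""
    return record[2:]

# Hypothetical table rows: (record_name, id, name, age)
table = [
    ("person", 1, "Ann", 30),
    ("person", 2, "Bob", 25),
]
record = ("person", 99, "Bob", 25)

# Computed once, outside the comprehension.
record_fields = record_to_fields(record)

duplicates = [
    existing
    for existing in table
    for existing_fields in [record_to_fields(existing)]  # "assignment" as a generator
    if existing_fields == record_fields
]
```

The result contains the existing row whose fields match the new record even though its ID differs.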
As it turns out, the QLC DSL does not allow assignments, only generators and filters; as per the documentation (emphasis mine):
Syntactically QLCs have the same parts as ordinary list
comprehensions:
[Expression || Qualifier1, Qualifier2, ...]
Expression (the template)
is any Erlang expression. Qualifiers are either filters or generators.
Filters are Erlang expressions returning boolean(). Generators have
the form Pattern <- ListExpression, where ListExpression is an
expression evaluating to a query handle or a list.
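This means we cannot use variable assignments within a QLC query. The ExistingFields = ... line is treated as a filter, and since it evaluates to the matched term rather than a boolean, that is presumably what raises the case_clause error.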
Thus my only option, insofar as I know, is to simply write out the query as:
Query = qlc:q([Existing ||
Existing <- mnesia:table(Table),
record_to_fields(Existing) == record_to_fields(Record)
]).

Determining if this data is really in 4th normal form?

I have a few details (company, location and product) to store in a db.
Sample data:
company   location      product
--------------------------------
abc       hilltop       alpha
abc       hilltop       beta
abc       riverside     alpha
abc       riverside     beta
buggy     underbridge   gamma
buggy     underbridge   theta
buggy     underbridge   omega
The relationships are multi-valued, as I understand it. The data needs to be normalized, as the MVDs are
not derived from a candidate key (company ->> location and company ->> product, where company is not a candidate key),
or the union does not make up the whole set (company U location < R, and likewise for product).
But my colleague disagrees and insists that for a relation to have a multi-valued dependency, at least four rows with the same value in the company column should exist for each company, i.e.
t1(company) = t2(company) = t3(company) = t4(company).
For company abc this is true. But for company "buggy", which makes three products in only one location, this is untrue.
For the formal definition and similar examples I referenced:
https://en.wikipedia.org/wiki/Multivalued_dependency
and the Fourth_normal_form example, also on Wikipedia.
I know my colleague is being pedantic, but I too started asking the same question after reading the formal definition. (After all, these are derived on a mathematical basis.)
Update: I am not asking how to normalize this data into 4NF; I think I know that. (I need to break it into two tables: 1) company - location and 2) company - product,
which I have done already.)
Can someone explain how an MVD still holds in this relation even though it does not appear to satisfy the formal definition?
Detailed explanations are very much welcome.
"There exist" says some values exist, and they don't have to be different. EXISTS followed by some name(s) says that there exist(s) some value(s) referred to by the name(s), for which a condition holds. Multiple names can refer to the same value. (FOR ALL can be expressed in terms of EXISTS.)
The notion of MVD can be applied to both variables and values. In fact the form of the linked definition is that a MVD holds in the variable sense when it holds in the value sense "in any legal relation". To know that a particular value is legal, you need business knowledge. You can then show whether that value satisfies an MVD. But to show whether its variable satisfies the MVD you have to show that the MVD is satisfied "in any legal relation" value that the variable can hold. One valid value can tell you that a MVD doesn't hold in (it and) its variable, but it can't tell you that a MVD does hold in its variable. That requires more business knowledge.
You can show that this value violates 4NF by using that definition of MVD. The definition says that a relation variable satisfies a MVD when a certain condition holds "for any valid relation" value:
for all pairs of tuples t1 & t2 in r such that t1[a] = t2[a] there exist tuples t3 & t4 [...]
For what MVD and values for t1 & t2 does your colleague claim there don't exist values for t3 & t4? There is no such combination of MVD and values for t1 & t2. Eg for {company} ↠ {product} and t1 & t2 both (buggy, underbridge, gamma), we can take (buggy, underbridge, gamma) as a value for both t3 & t4, and so on for all other choices for t1 & t2.
Another definition for F ↠ T holding is that binary JD (join dependency) *{F U T, F U (A - T)} holds, ie that the relation is equal to the join of its projections on F U T & F U (A - T). This definition might be more immediately helpful to you & your colleague in that it avoids the terminology that you & they are misinterpreting. Eg your example data is the join of these two of its projections:
company   location
----------------------
abc       hilltop
abc       riverside
buggy     underbridge

company   product
------------------
abc       alpha
abc       beta
buggy     gamma
buggy     theta
buggy     omega
So it satisfies the JD *{{company, location}, {company, product}}, so it satisfies the MVDs {company} ↠ {location} and {company} ↠ {product} (among others). (Maybe you will be able to think of examples of relations with zero, one, two, three etc tuples for which one or more (trivial and/or non-trivial) MVDs hold.)
Of course, the two definitions are two different ways of describing the same condition.
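Both definitions can be checked mechanically against the sample data. The sketch below uses plain Python sets with the rows from the question (product spelled "gamma"). The helper names project, join_on_company and mvd_holds_by_tuples are hypothetical. Note that this checks one relation value only, which by the argument above cannot prove the MVD for the variable.

```python
ATTRS = ("company", "location", "product")

rows = {
    ("abc", "hilltop", "alpha"), ("abc", "hilltop", "beta"),
    ("abc", "riverside", "alpha"), ("abc", "riverside", "beta"),
    ("buggy", "underbridge", "gamma"), ("buggy", "underbridge", "theta"),
    ("buggy", "underbridge", "omega"),
}

def project(relation, attrs):
    """Projection of a relation onto the named attributes."""
    idx = [ATTRS.index(a) for a in attrs]
    return {tuple(row[i] for i in idx) for row in relation}

def join_on_company(cl, cp):
    """Natural join of (company, location) with (company, product)."""
    return {(c1, loc, prod) for (c1, loc) in cl for (c2, prod) in cp if c1 == c2}

def satisfies_jd(relation):
    """JD definition: the relation equals the join of its two projections."""
    cl = project(relation, ("company", "location"))
    cp = project(relation, ("company", "product"))
    return join_on_company(cl, cp) == relation

def mvd_holds_by_tuples(relation):
    """Tuple definition of {company} ->> {location}: for every t1, t2 with the
    same company, the tuple taking location from t1 and product from t2 exists.
    t1 and t2 (and the witnesses t3, t4) need not be distinct."""
    return all(
        (t1[0], t1[1], t2[2]) in relation
        for t1 in relation for t2 in relation
        if t1[0] == t2[0]
    )

jd_ok = satisfies_jd(rows)
mvd_ok = mvd_holds_by_tuples(rows)

# Removing one abc row breaks both checks in the same way.
broken = rows - {("abc", "riverside", "beta")}
broken_jd_ok = satisfies_jd(broken)
broken_mvd_ok = mvd_holds_by_tuples(broken)
```

The buggy rows pass even though there is only one location, because nothing in the definition requires four distinct tuples.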
PS 1 Whenever a FD F → T holds, the MVD F ↠ T holds. For a relation in BCNF, the MVDs that violate 4NF & 5NF are those not so associated with FDs.
PS 2 A relation variable is meant to hold a tuple if and only if it makes a true statement in business terms when its values are substituted into a given statement template, or predicate. That plus the JD definition for MVD gives conditions for a relation variable satisfying a MVD in business terms. Here our predicate is of the form ...company...location...product.... (Eg company named company is located at location and makes product product.) It happens that this MVD holds for a variable when for all valid business situations, FOR ALL company, location, product,
EXISTS product [...company...location...product...]
AND EXISTS location [...company...location...product...]
IMPLIES ...company...location...product...

"for all" in datalog

Given a set of facts of the form is_member(country, organisation), I have the following query to write in datalog:
Return all countries that belong to all of the organisations of which Denmark is a member.
I want to do something like
member_all_Denmarks_organisations(Country):-
¬( is_member('Denmark', Organization),
¬is_member(Country, Organization)
).
In other words, 'for every organization that Denmark is member of, Country is a member of it too'. But datalog does not allow negated predicates which contain non-instantiated variables, so this doesn't work.
How can I proceed? And in general, when wanting to express a 'for all' statement, how to do so in datalog?
We are going to take the following alternative equivalent definition:
Return all countries that do not fail to belong to any organisation that Denmark is a member of.
Of course, you can only express this in a dialect of Datalog with negation.
The following should do:
organisation_of_denmark(org) :- is_member('Denmark', org).
// a country c is disqualified if there is some organisation org
// of which Denmark is a member but c isn't
disqualified_country(c) :- organisation_of_denmark(org), country(c), ¬is_member(c, org).
// we are only interested in countries that are not excluded by the previous rule
member_all_Denmarks_organisations(c) :- country(c), ¬disqualified_country(c).
// in case there is no unary predicate identifying all countries
// the best we can do is the following (knowing well that then the above
// will only work for countries that are members of at least one organisation)
country(c) :- is_member(c, _).
This is precisely what you wrote also, only with intermediate relations included that
capture some of your sub-formulas and with the atom country(c) included to act as
a guard or a domain for the outer-most complementation.
The problem is a case of expressing the following proposition P in Datalog:
P(x) := for all y, p(y) => q(x,y)
In Datalog, given database DB with, say, 2 columns and x in 1st column, this can be expressed as:
P(x):- DB(x,_), ¬disqualified(x).
disqualified(x):- DB(x,_), p(y), ¬q(x,y).
The trick is to create your own disqualified() predicate.
DB(x,_) is there just to instantiate x before it appears in a negated predicate.
In the specific Denmark case:
P(x) := 'x is member of all Denmark's organisations'
p(y) := is_member('Denmark', y)
q(x,y) := is_member(x,y)
DB := is_member()
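The double-negation pattern can be tried out with plain Python sets standing in for the Datalog relations. The facts below are made up for illustration (CountryA, CountryB, Org1 and Org2 are hypothetical); each set comprehension corresponds to one rule.

```python
# Hypothetical facts: is_member(country, organisation)
is_member = {
    ("Denmark", "Org1"), ("Denmark", "Org2"),
    ("CountryA", "Org1"), ("CountryA", "Org2"), ("CountryA", "Org3"),
    ("CountryB", "Org1"),
}

# country(c) :- is_member(c, _).
country = {c for (c, _) in is_member}

# organisation_of_denmark(org) :- is_member('Denmark', org).
organisation_of_denmark = {org for (c, org) in is_member if c == "Denmark"}

# disqualified_country(c) :- organisation_of_denmark(org), country(c),
#                            not is_member(c, org).
disqualified_country = {
    c
    for c in country
    for org in organisation_of_denmark
    if (c, org) not in is_member
}

# member_all_Denmarks_organisations(c) :- country(c), not disqualified_country(c).
member_all_denmarks_organisations = country - disqualified_country
```

Every negated test is applied to values that are already bound by country and organisation_of_denmark, which is the role the guard atoms play in the Datalog rules.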

3NF Normal form

I have a question about 3NF normal form:
Normalize, with respect to 3NF, the relational scheme E(A, B, C, D, E, F)
by assuming that (A, B, C) is the unique candidate key and that the following additional functional dependencies hold:
A,B -> D
C,D -> E
E -> F
My understanding is that if I apply the 3NF which says that a schema is 3NF if all attributes
non-prime do not transitively depend on any key candidate , the result should be:
E'=(A,B,C,E,F), E''=(B,D), E'''=(A,B,C,D,F), E''''=(D,E), E'''''=(A,B,C,D,E),
E''''''=(E,F)
but I do think I'm wrong...
Can someone help me understand the issue?
Thanks
(Reformatted for readability)
My understanding is that if I apply the 3NF which says that a schema
is 3NF if all attributes non-prime do not transitively depend on any
key candidate , the result should be:
E1= {A,B,C,E,F}
E2= {B,D}
E3= {A,B,C,D,F}
E4= {D,E}
E5= {A,B,C,D,E}
E6= {E,F}
3NF means that a) the relation is in 2NF, and b) every non-prime attribute is directly dependent (that is, not transitively dependent) on every candidate key.
In turn, 2NF means that a) the relation is in 1NF, and b) every non-prime attribute is dependent on the whole of every candidate key, not just on part of any candidate key.
Given {ABC} is a candidate key, and given {AB->D}, you can see that D depends on part of a candidate key. So
E0 = {A,B,C,D,E,F}
is not in 2NF. You fix that by moving that dependent attribute to a new relation, and you copy the attributes that determine it to the same relation.
R0 = {ABC DEF} This relation—which we started with, and which is not in 2NF—goes away, to be replaced with
R1 = {ABC EF}
R2 = {AB D}
You want to continue from here?
When it comes to getting normalization right, there is no substitute for understanding the formal definitions. If you're still working on building that understanding, there's a cute little mnemonic that people use to help remember the essence of 3NF and to judge whether a table that they're looking at is 3NF or not.
"The key, the whole key, and nothing but the key, so help me Codd."
How do you apply it? Every attribute of the relation must depend on the key. It must depend on the whole key. It must not depend on anything that isn't the key. When you look at your example, clearly there are problems and you need to normalize. You need to get to a point where every non-key column which violates 3NF is out of your original relation. The non-key columns D, E, and F each violate 3NF.
Note that your additional functional dependencies cover all of the non-key columns in your original relation. Each of these additional functional dependencies is going to result in a relation:
{ A B D } - This solves 3NF for attribute D
{ C D E } - This solves 3NF for attribute E
{ E F } - This solves 3NF for attribute F
What is left to cover from your original relation? Nothing except the candidate key:
{ A B C }
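The decomposition above can be reproduced with a small attribute-closure routine. This is a minimal sketch that assumes the three given functional dependencies already form a minimal cover (which they do here). The names closure and synthesize are hypothetical, and the second function is only the "one relation per dependency, plus the key if missing" step of 3NF synthesis.

```python
ALL_ATTRS = set("ABCDEF")
KEY = {"A", "B", "C"}

# The additional functional dependencies from the question: (lhs, rhs)
FDS = [
    ({"A", "B"}, {"D"}),
    ({"C", "D"}, {"E"}),
    ({"E"}, {"F"}),
]

def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def synthesize(fds, key):
    """One relation per dependency; add the key if no relation contains it."""
    relations = [frozenset(lhs | rhs) for lhs, rhs in fds]
    if not any(key <= rel for rel in relations):
        relations.append(frozenset(key))
    return relations

key_closure = closure(KEY, FDS)      # should cover every attribute
ab_closure = closure({"A", "B"}, FDS)  # D depends on part of the key
relations = synthesize(FDS, KEY)
```

The closure of {A, B} already contains D, which is the partial dependency that breaks 2NF, and the synthesized relations are the four listed above.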
