I'm new to using Z3 and am trying to model an ILP, which I have already successfully done using the MILP Solver PuLP. I now implemented the same objective function (which is to be minimized) and the same constraints in Z3 and am experiencing strange behaviour. Mainly, that adding a constraint decreases the minimum.
My question is: How can that be? Is it a bug or can it be explained somehow?
More detail, in case needed:
I'm trying to solve a Teacher Assignment Problem. Courses are scheduled before a year and the teachers get the list of the courses. They then can select which courses they want to teach, with which priority they want to teach it, how many workdays (each course lasts several days) they desire to teach and a max and min number of courses they definitly want to teach. The program gets as input a list of possible teacher-assignments. A teacher-assignment is a tuple consisting of:
teacher-name
event-id
priority of teacher towards event
the distance between teacher and course
The goal of the program to find a combination of assignments that minimize:
the average relative deviation 'desired workdays <-> assigned workdays' of all teachers
the maximum relative deviation 'desired workdays <-> assigned workdays' of any teacher
the overall distance of courses to assigned teachers
the sum of priorities (higher priority means less willingness to teach)
Main Constraints:
number of teachers assigned to course must match needed amount of teachers
the number of assigned courses to a teacher must be within the specified min/max range
the courses to which a teacher is assigned may not overlap in time (a list of overlap-sets are given)
To track the average relative deviation and the maximum deviation of workdays two more 'helper-constraints' are introduced:
for each teacher: overload (delta_plus) - underload (delta_minus) = assigned workdays - desired workdays
for each teacher: delta_plus + delta_minus <= max relative deviation (DELTA)
Here you have this as Python code:
from z3 import *
def compute_optimum(a1, a2, a3, a4, worst_case_distance=0):
"""
Find the optimum solution with weights a1, a2, a3, a4
(average workday deviation, maximum workday deviation, cummulative linear distance, sum of priority 2 assignments)
Higher weight = More optimized (value minimized)
Returns all assignment-tuples which occur in the calculated optimal model.
"""
print("Powered by Z3")
print(f"\n\n\n\n ------- FINDING OPTIMUM TO WEIGHTS: a1={a1}, a2={a2}, a3={a3}, a4={a4} -------\n")
# key: assignment tuple value: z3-Bool
x = {assignment : Bool('e%i_%s' % (assignment[1], assignment[0])) for assignment in possible_assignments}
delta_plus = {teacher : Int('d+_%s' % teacher) for teacher in teachers}
delta_minus = {teacher : Int('d-_%s' % teacher) for teacher in teachers}
DELTA = Real('DELTA')
opt = Optimize()
# constraint1: number of teachers needed per event
num_available_per_event = {event : len(list(filter(lambda assignment: assignment[1] == event, possible_assignments))) for event in events}
for event in events:
num_teachers_to_assign = min(event_size[event], num_available_per_event[event])
opt.add(Sum( [If(x[assignment], 1, 0) for assignment in x.keys() if assignment[1] == event] ) == num_teachers_to_assign)
for teacher in teachers:
# constraint2: max and min number of events for each teacher
max_events = len(events)
min_events = 0
num_assigned_events = Sum( [If(x[assignment], 1, 0) for assignment in x.keys() if assignment[0] == teacher] )
opt.add(num_assigned_events >= min_events, num_assigned_events <= max_events)
# constraint3: teacher can't work in multiple overlapping events
for overlapping_events in event_overlap_sets:
opt.add(Sum( [If(x[assignment], 1, 0) for assignment in x.keys() if assignment[1] in overlapping_events and assignment[0] == teacher] ) <= 1)
# constraint4: delta (absolute over and underload of teacher)
num_teacher_workdays = Sum( [If(x[assignment], event_durations[assignment[1]], 0) for assignment in x.keys() if assignment[0] == teacher])
opt.add(delta_plus[teacher] >= 0, delta_minus[teacher] >= 0)
opt.add(delta_plus[teacher] - delta_minus[teacher] == num_teacher_workdays - desired_workdays[teacher])
# constraint5: DELTA (maximum relative deviation of wished to assigned workdays)
opt.add(DELTA >= ToReal(delta_plus[teacher] + delta_minus[teacher]) / desired_workdays[teacher])
#opt.add(DELTA <= 1) # adding this results in better optimum
average_rel_workday_deviation = Sum( [ToReal(delta_plus[teacher] + delta_minus[teacher]) / desired_workdays[teacher] for teacher in teachers]) / len(teachers)
overall_distance = Sum( [If(x[assignment], assignment[3], 0) for assignment in x.keys()])
num_prio2 = Sum( [If(x[assignment], assignment[2]-1, 0) for assignment in x.keys()])
obj_fun = opt.minimize(
a1 * average_rel_workday_deviation
+ a2 * DELTA
+ a3 * overall_distance
+ a4 * num_prio2
)
#print(opt)
if opt.check() == sat:
m = opt.model()
optimal_assignments = []
for assignment in x.keys():
if m.evaluate(x[assignment]):
optimal_assignments.append(assignment)
for teacher in teachers:
print(f"{teacher}: d+ {m.evaluate(delta_plus[teacher])}, d- {m.evaluate(delta_minus[teacher])}")
#print(m)
print("DELTA:::", m.evaluate(DELTA))
print("min value:", obj_fun.value().as_decimal(2))
return optimal_assignments
else:
print("Not satisfiable")
return []
compute_optimum(1,1,1,1)
Sample input:
teachers = ['fr', 'hö', 'pf', 'bo', 'jö', 'sti', 'bi', 'la', 'he', 'kl', 'sc', 'str', 'ko', 'ba']
events = [5, 6, 7, 8, 9, 10, 11, 12]
event_overlap_sets = [{5, 6}, {8, 9}, {10, 11}, {11, 12}, {12, 13}]
desired_workdays = {'fr': 36, 'hö': 50, 'pf': 30, 'bo': 100, 'jö': 80, 'sti': 56, 'bi': 20, 'la': 140, 'he': 5.0, 'kl': 50, 'sc': 38, 'str': 42, 'ko': 20, 'ba': 20}
event_size = {5: 2, 6: 2, 7: 2, 8: 3, 9: 2, 10: 2, 11: 3, 12: 2}
event_durations = {5: 5.0, 6: 5.0, 7: 5.0, 8: 16, 9: 7.0, 10: 5.0, 11: 16, 12: 5.0}
# assignment: (teacher, event, priority, distance)
possible_assignments = [('he', 5, 1, 11), ('sc', 5, 1, 48), ('str', 5, 1, 199), ('ko', 6, 1, 53), ('jö', 7, 1, 317), ('bo', 9, 1, 56), ('sc', 10, 1, 25), ('ba', 11, 1, 224), ('bo', 11, 1, 312), ('jö', 11, 1, 252), ('kl', 11, 1, 248), ('la', 11, 1, 303), ('pf', 11, 1, 273), ('str', 11, 1, 228), ('kl', 5, 2, 103), ('la', 5, 2, 16), ('pf', 5, 2, 48), ('bi', 6, 2, 179), ('la', 6, 2, 16), ('pf', 6, 2, 48), ('sc', 6, 2, 48), ('str', 6, 2, 199), ('sc', 7, 2, 354), ('sti', 7, 2, 314), ('bo', 8, 2, 298), ('fr', 8, 2, 375), ('hö', 9, 2, 95), ('jö', 9, 2, 119), ('sc', 9, 2, 37), ('sti', 9, 2, 95), ('bi', 10, 2, 211), ('hö', 11, 2, 273), ('bi', 12, 2, 408), ('bo', 12, 2, 318), ('ko', 12, 2, 295), ('la', 12, 2, 305), ('sc', 12, 2, 339), ('str', 12, 2, 218)]
Output (just the delta+ and delta-):
------- FINDING OPTIMUM TO WEIGHTS: a1=1, a2=1, a3=1, a4=1 -------
fr: d+ 17, d- 37
hö: d+ 26, d- 69
pf: d+ 0, d- 25
bo: d+ 41, d- 120
jö: d+ 0, d- 59
sti: d+ 27, d- 71
bi: d+ 0, d- 15
la: d+ 0, d- 119
he: d+ 0, d- 0
kl: d+ 0, d- 50
sc: d+ 0, d- 33
str: d+ 0, d- 32
ko: d+ 0, d- 20
ba: d+ 10, d- 14
DELTA::: 19/10
min value: 3331.95?
What I observe that does not make sense to me:
often, neither delta_plus nor delta_minus for a teacher equals 0, DELTA is bigger than 1
adding constraint 'DELTA <= 1' results in a smaller objective function value, faster computation and observation 1 cannot be observed anymore
Also: the computation takes forever (although this is not the point of this)
I am happy for any sort of help!
Edit:
Like suggested by alias, changing the delta+/- variables to Real and removing the two ToReal() statements yields the desired result. If you look at the generated expressions of my sample input, there are in fact slight differences (also besides the different datatype and missing to_real statements).
For example, when looking at the constraint, which is supposed to constrain that delta_plus - delta_minus of 'fri' is equals to 16 - 36 if he works for event 8, 0 - 36 if he doesn't.
My old code using integers and ToReal-conversions produces this expression:
(assert (= (- d+_fr d-_fr) (- (+ (ite e8_fr 16 0)) 36)))
The code using Reals and no type-conversions produces this:
(assert (let ((a!1 (to_real (- (+ (ite e8_fr 16 0)) 36))))
(= (- d+_fr d-_fr) a!1)))
Also the minimization expressions are slightly different:
My old code using integers and ToReal-conversions produces this expression:
(minimize (let (
(a!1 ...)
(a!2 (...))
(a!3 (...))
)
(+ (* 1.0 (/ a!1 14.0)) (* 1.0 DELTA) a!2 a!3)))
The code using Reals and no type-conversions produces this:
(minimize (let (
(a!1 (/ ... 14.0))
(a!2 (...))
(a!3 (...))
)
(+ (* 1.0 a!1) (* 1.0 DELTA) a!2 a!3)))
Sadly I don't know really know how to read this but it seems quite the same to me.
I want to take a list of non-negative integers D=[d1,...,dm] and and generate a multidimensional array of indexed symbols A in the form of:
where 0<=i_j<=d_j. For example if D=[2,3] then A should be
[[a_[0,0],a_[0,1],a_[0,2]],
[a_[1,0],a_[1,1],a_[1,2]]]
For this case I could nest two for loops to generate the said array, however D does not necessarily have a length of 2 and I don't know how to nest an arbitrary number of for loops!
I would appreciate if you could help me know how I can generate A from D.
P.S. What I want to finally achieve is to create a multivariate polynomial as explained here.
Here's one way to do it. The essential part is that I called cartesian_product to construct the list of all combinations of indices, and then arrayapply to create the subscripted expressions.
(%i11) ii:setify(makelist(i, i, 0, n)), n=2;
(%o11) {0, 1, 2}
(%i12) apply (cartesian_product, makelist (ii, m)), m=3;
(%o12) {[0, 0, 0], [0, 0, 1], [0, 0, 2], [0, 1, 0], [0, 1, 1],
[0, 1, 2], [0, 2, 0], [0, 2, 1], [0, 2, 2], [1, 0, 0],
[1, 0, 1], [1, 0, 2], [1, 1, 0], [1, 1, 1], [1, 1, 2],
[1, 2, 0], [1, 2, 1], [1, 2, 2], [2, 0, 0], [2, 0, 1],
[2, 0, 2], [2, 1, 0], [2, 1, 1], [2, 1, 2], [2, 2, 0],
[2, 2, 1], [2, 2, 2]}
(%i13) map (lambda ([l], arrayapply (_a, l)), %);
(%o13) {_a , _a , _a , _a , _a ,
0, 0, 0 0, 0, 1 0, 0, 2 0, 1, 0 0, 1, 1
_a , _a , _a , _a , _a ,
0, 1, 2 0, 2, 0 0, 2, 1 0, 2, 2 1, 0, 0
_a , _a , _a , _a , _a ,
1, 0, 1 1, 0, 2 1, 1, 0 1, 1, 1 1, 1, 2
_a , _a , _a , _a , _a ,
1, 2, 0 1, 2, 1 1, 2, 2 2, 0, 0 2, 0, 1
_a , _a , _a , _a , _a ,
2, 0, 2 2, 1, 0 2, 1, 1 2, 1, 2 2, 2, 0
_a , _a }
2, 2, 1 2, 2, 2
(%i14) grind (%);
{_a[0,0,0],_a[0,0,1],_a[0,0,2],_a[0,1,0],_a[0,1,1],_a[0,1,2],
_a[0,2,0],_a[0,2,1],_a[0,2,2],_a[1,0,0],_a[1,0,1],_a[1,0,2],
_a[1,1,0],_a[1,1,1],_a[1,1,2],_a[1,2,0],_a[1,2,1],_a[1,2,2],
_a[2,0,0],_a[2,0,1],_a[2,0,2],_a[2,1,0],_a[2,1,1],_a[2,1,2],
_a[2,2,0],_a[2,2,1],_a[2,2,2]}$
(%o14) done
This is just working at the top-level interactive prompt; if you need to construct a function, I think you'll see how to do it.
EDIT: Here's a way to create the polynomial.
(%i16) S : {[0, 0, 0], [0, 0, 1], [0, 0, 2], [0, 1, 0], [0, 1, 1],
[0, 1, 2], [0, 2, 0], [0, 2, 1], [0, 2, 2], [1, 0, 0],
[1, 0, 1], [1, 0, 2], [1, 1, 0], [1, 1, 1], [1, 1, 2],
[1, 2, 0], [1, 2, 1], [1, 2, 2], [2, 0, 0], [2, 0, 1],
[2, 0, 2], [2, 1, 0], [2, 1, 1], [2, 1, 2], [2, 2, 0],
[2, 2, 1], [2, 2, 2]} $
(%i17) L : listify (S) $
(%i18) A : map (lambda ([l], arrayapply (_a, l)), L);
(%o18) [_a , _a , _a , _a , _a ,
0, 0, 0 0, 0, 1 0, 0, 2 0, 1, 0 0, 1, 1
_a , _a , _a , _a , _a ,
0, 1, 2 0, 2, 0 0, 2, 1 0, 2, 2 1, 0, 0
_a , _a , _a , _a , _a ,
1, 0, 1 1, 0, 2 1, 1, 0 1, 1, 1 1, 1, 2
_a , _a , _a , _a , _a ,
1, 2, 0 1, 2, 1 1, 2, 2 2, 0, 0 2, 0, 1
_a , _a , _a , _a , _a ,
2, 0, 2 2, 1, 0 2, 1, 1 2, 1, 2 2, 2, 0
_a , _a ]
2, 2, 1 2, 2, 2
(%i19) U : map (lambda ([l], product (u[i]^l[i], i, 1, length(l))), L);
2 2 2 2 2 2
(%o19) [1, u , u , u , u u , u u , u , u u , u u , u ,
3 3 2 2 3 2 3 2 2 3 2 3 1
2 2 2 2
u u , u u , u u , u u u , u u u , u u , u u u ,
1 3 1 3 1 2 1 2 3 1 2 3 1 2 1 2 3
2 2 2 2 2 2 2 2 2 2 2 2
u u u , u , u u , u u , u u , u u u , u u u , u u ,
1 2 3 1 1 3 1 3 1 2 1 2 3 1 2 3 1 2
2 2 2 2 2
u u u , u u u ]
1 2 3 1 2 3
(%i20) A.U;
2 2 2 2 2 2 2
(%o20) u u _a u + u u _a u + u _a u
1 2 2, 2, 2 3 1 2 2, 1, 2 3 1 2, 0, 2 3
2 2 2 2 2
+ u _a u u + _a u u + u _a u u
1 1, 2, 2 2 3 0, 2, 2 2 3 1 1, 1, 2 2 3
2 2 2
+ _a u u + u _a u + _a u
0, 1, 2 2 3 1 1, 0, 2 3 0, 0, 2 3
2 2 2 2
+ u u _a u + u u _a u + u _a u
1 2 2, 2, 1 3 1 2 2, 1, 1 3 1 2, 0, 1 3
2 2
+ u _a u u + _a u u + u _a u u
1 1, 2, 1 2 3 0, 2, 1 2 3 1 1, 1, 1 2 3
+ _a u u + u _a u + _a u
0, 1, 1 2 3 1 1, 0, 1 3 0, 0, 1 3
2 2 2 2
+ u u _a + u u _a + u _a
1 2 2, 2, 0 1 2 2, 1, 0 1 2, 0, 0
2 2
+ u _a u + _a u + u _a u
1 1, 2, 0 2 0, 2, 0 2 1 1, 1, 0 2
+ _a u + u _a + _a
0, 1, 0 2 1 1, 0, 0 0, 0, 0
Note that the ordering of terms within each product doesn't conform to what humans would consider the usual convention, e.g. [1]^2*u[2]^2*_a[2,2,2]*u[3]^2 is the first term. Maxima is ordering the terms according to the subscripts, therefore _a[2,2,2] comes after u[1] and before u[3]. In some contexts this coincides with what humans expect, but here it doesn't; in any event, Maxima is consistent in hope of making programmatic manipulation work better.
(%i21) grind (%);
u[1]^2*u[2]^2*_a[2,2,2]*u[3]^2+u[1]^2*u[2]*_a[2,1,2]*u[3]^2
+u[1]^2*_a[2,0,2]*u[3]^2
+u[1]*_a[1,2,2]*u[2]^2*u[3]^2
+_a[0,2,2]*u[2]^2*u[3]^2
+u[1]*_a[1,1,2]*u[2]*u[3]^2
+_a[0,1,2]*u[2]*u[3]^2
+u[1]*_a[1,0,2]*u[3]^2
+_a[0,0,2]*u[3]^2
+u[1]^2*u[2]^2*_a[2,2,1]*u[3]
+u[1]^2*u[2]*_a[2,1,1]*u[3]
+u[1]^2*_a[2,0,1]*u[3]
+u[1]*_a[1,2,1]*u[2]^2*u[3]
+_a[0,2,1]*u[2]^2*u[3]
+u[1]*_a[1,1,1]*u[2]*u[3]
+_a[0,1,1]*u[2]*u[3]
+u[1]*_a[1,0,1]*u[3]
+_a[0,0,1]*u[3]
+u[1]^2*u[2]^2*_a[2,2,0]
+u[1]^2*u[2]*_a[2,1,0]
+u[1]^2*_a[2,0,0]
+u[1]*_a[1,2,0]*u[2]^2
+_a[0,2,0]*u[2]^2
+u[1]*_a[1,1,0]*u[2]
+_a[0,1,0]*u[2]+u[1]*_a[1,0,0]
+_a[0,0,0]$
(%o21) done