I am trying to reimplement littledog.ipynb in C++. I find it hard to translate the function velocity_dynamics_constraint, and I have three questions:
What is the function of ad_velocity_dynamics_context? Can we ignore it?
How to reimplement velocity_dynamics_constraint using C++? Do I have to create a new class like class VelocityDynamicsConstraint : public drake::solvers::Constraint? Is there any easier way to implement it?
Why do we need to consider the isinstance(vars[0], AutoDiffXd) condition?
# Some code from https://github.com/RussTedrake/underactuated/blob/master/examples/littledog.ipynb
ad_velocity_dynamics_context = [
    ad_plant.CreateDefaultContext() for i in range(N)
]

def velocity_dynamics_constraint(vars, context_index):
    h, q, v, qn = np.split(vars, [1, 1+nq, 1+nq+nv])
    if isinstance(vars[0], AutoDiffXd):
        if not autoDiffArrayEqual(
                q,
                ad_plant.GetPositions(
                    ad_velocity_dynamics_context[context_index])):
            ad_plant.SetPositions(
                ad_velocity_dynamics_context[context_index], q)
        v_from_qdot = ad_plant.MapQDotToVelocity(
            ad_velocity_dynamics_context[context_index], (qn - q) / h)
    else:
        if not np.array_equal(q, plant.GetPositions(
                context[context_index])):
            plant.SetPositions(context[context_index], q)
        v_from_qdot = plant.MapQDotToVelocity(context[context_index],
                                              (qn - q) / h)
    return v - v_from_qdot

for n in range(N-1):
    prog.AddConstraint(partial(velocity_dynamics_constraint,
                               context_index=n),
                       lb=[0] * nv,
                       ub=[0] * nv,
                       vars=np.concatenate(
                           ([h[n]], q[:, n], v[:, n], q[:, n + 1])))
What is the function of ad_velocity_dynamics_context? Can we ignore it?
The context caches intermediate computation results for a given q, v, u. It is very common for several constraints to be imposed on the same set of q, v, u. For example, at the final time we typically have a kinematic constraint on the final state (say, the robot's foot has to land on the ground and its center of mass has to be at a certain location), and at the same time we have the velocity dynamics constraint on that final state. These constraints can share intermediate results, such as the rigid transforms between adjacent links. Hence we cache the results in ad_velocity_dynamics_context, which can be reused later when we impose other constraints.
How to reimplement velocity_dynamics_constraint using C++? Do I have to create a new class like class VelocityDynamicsConstraint : public drake::solvers::Constraint? Is there any easier way to implement it?
That is right, you will need to create a new class VelocityDynamicsConstraint. The main challenge in implementing this class is writing the three overloaded DoEval functions for the three scalar types (double, AutoDiffXd, symbolic::Expression). You can use PositionConstraint as a reference. For the moment you can ignore the case of calling DoEval(const Eigen::Ref<const AutoDiffVecXd>&, AutoDiffVecXd*) with a MultibodyPlant<double>, and implement this DoEval overload only for MultibodyPlant<AutoDiffXd>.
Why do we need to consider the isinstance(vars[0], AutoDiffXd) condition?
Because when the scalar type is AutoDiffXd, we want to compare not only the value of q against the one stored in context, but also its gradient. If they are different, then we need to call SetPositions to recompute the cache. When the scalar type is double, we then only need to compare the value.
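To illustrate why the gradient has to be part of the comparison, here is a minimal pure-Python sketch. It does not use Drake: Dual is a hypothetical stand-in for AutoDiffXd, and CachedSquare is a hypothetical stand-in for a context that caches a result computed from q. Both names are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Dual:
    """Hypothetical stand-in for AutoDiffXd: a value plus its gradient."""
    value: float
    derivatives: Tuple[float, ...]


class CachedSquare:
    """Caches y = q**2, the way a context caches results for a given q."""

    def __init__(self) -> None:
        self._q: Optional[Dual] = None
        self._y: Optional[Dual] = None
        self.recomputations = 0

    def eval(self, q: Dual, compare_gradient: bool) -> Dual:
        if self._q is None:
            same = False
        elif compare_gradient:
            # Compare both the value and the gradient (dataclass equality).
            same = q == self._q
        else:
            # Compare the value only, which is what a double check would do.
            same = q.value == self._q.value
        if not same:
            self._q = q
            # Chain rule: dy/dvars = 2 * q * dq/dvars.
            self._y = Dual(q.value ** 2,
                           tuple(2.0 * q.value * d for d in q.derivatives))
            self.recomputations += 1
        return self._y


q_a = Dual(3.0, (1.0, 0.0))  # q depends on the first decision variable
q_b = Dual(3.0, (0.0, 1.0))  # same value, but depends on the second one

naive = CachedSquare()
naive.eval(q_a, compare_gradient=False)
stale = naive.eval(q_b, compare_gradient=False)

careful = CachedSquare()
careful.eval(q_a, compare_gradient=True)
fresh = careful.eval(q_b, compare_gradient=True)

print(stale.derivatives)  # gradient left over from q_a
print(fresh.derivatives)  # gradient recomputed for q_b
```

With the value-only check, the second call returns the cached result whose gradient still belongs to q_a. That is the failure the gradient comparison in the AutoDiffXd branch guards against.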
Per this answer the Z3 set sort is implemented using arrays, which makes sense given the SetAdd and SetDel methods available in the API. It is also claimed here that if the array modification functions are never used, it's wasteful overhead to use arrays instead of uninterpreted functions. Given that, if my only uses of a set are to apply constraints with IsMember (either on individual values or as part of a quantification), is it a better idea to use an uninterpreted function mapping from the underlying element sort to booleans? So:
from z3 import *
s = Solver()
string_set = Const('string_set', SetSort(StringSort()))
x = String('x')
s.add(IsMember(x, string_set))
becomes
from z3 import *
s = Solver()
string_set = Function('string_set', StringSort(), BoolSort())
x = String('x')
s.add(string_set(x))
Are there any drawbacks to this approach? Alternative representations with even less overhead?
Those are really your only options, as long as you want to restrict yourself to the standard interface. In the past, I also had luck with representing sets (and in general relations) outside of the solver, keeping the processing completely outside. Here's what I mean:
from z3 import *
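def FSet_Empty():
    return lambda x: False

def FSet_Insert(val, s):
    # x is a member if it equals val, or if it was already in s
    return lambda x: If(x == val, True, s(x))

def FSet_Delete(val, s):
    # x is not a member if it equals val; otherwise defer to s
    return lambda x: If(x == val, False, s(x))

def FSet_Member(val, s):
    return s(val)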
x, y, z = Ints('x y z')
myset = FSet_Insert(x, FSet_Insert(y, FSet_Insert(z, FSet_Empty())))
s = Solver()
s.add(FSet_Member(2, myset))
print(s.check())
print(s.model())
Note how we model sets by unary relations, i.e., functions from values to booleans. You can generalize this to arbitrary relations and the ideas carry over. This prints something like the following (the exact model can vary):
sat
[x = 2, z = 4, y = 3]
You can easily add union (essentially Or), intersection (essentially And), and complement (essentially Not) operations. Doing cardinality is harder, especially in the presence of complement, but that's true for all the other approaches too.
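As a rough sketch of how those three operations compose, here is the same characteristic-function idea written with plain Python booleans so that it runs without a solver. The names fset_union, fset_intersection and fset_complement are hypothetical; in a z3 version you would presumably swap the Python or / and / not for z3's Or / And / Not so that the result stays symbolic.

```python
from typing import Callable

# A set is represented by its membership predicate.
Pred = Callable[[int], bool]


def fset_empty() -> Pred:
    return lambda x: False


def fset_insert(val: int, s: Pred) -> Pred:
    return lambda x: x == val or s(x)


def fset_union(a: Pred, b: Pred) -> Pred:
    # Member of the union if member of either set ("essentially Or").
    return lambda x: a(x) or b(x)


def fset_intersection(a: Pred, b: Pred) -> Pred:
    # Member of the intersection if member of both sets ("essentially And").
    return lambda x: a(x) and b(x)


def fset_complement(a: Pred) -> Pred:
    # Member of the complement if not a member of the set ("essentially Not").
    return lambda x: not a(x)


evens = fset_insert(2, fset_insert(4, fset_empty()))
small = fset_insert(1, fset_insert(2, fset_empty()))

both = fset_intersection(evens, small)
either = fset_union(evens, small)
not_small = fset_complement(small)

print([x for x in range(6) if both(x)])
print([x for x in range(6) if either(x)])
print([x for x in range(6) if not_small(x)])
```

Cardinality has no such one-line definition here, since a predicate gives no way to enumerate its members without fixing a finite universe, which matches the caveat above.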
As is usual with these sorts of modeling questions, there's no single approach that will work best across all problems. They'll all have their strengths and weaknesses. I'd recommend creating a single API, and implementing it using all three of these ideas, and benchmarking your problem domain to see what works the best; keeping in mind if you start working on a different problem the answer might be different. Please report your findings!
I wonder if there is any way to make functions defined within the main function be local, in a similar way to local variables. For example, in this function that calculates the gradient of a scalar function,
grad(var,f) := block([aux],
    aux : [gradient, DfDx[i]],
    gradient : [],
    DfDx[i] := diff(f(x_1,x_2,x_3),var[i],1),
    for i in [1,2,3] do (
        gradient : append(gradient, [DfDx[i]])
    ),
    return(gradient)
)$
The variable gradient that has been defined inside the main function grad(var,f) has no effect outside the main function, as it is inside the aux list. However, I have observed that the function DfDx, despite being inside the aux list, does have an effect outside the main function.
Is there any way to make the sub-functions defined inside the main function local only, in a similar way to what can be done with local variables? (I know that one can kill them once they have been used, but perhaps there is a more elegant way.)
To address the problem you need to solve here, another way to compute the gradient is to say
grad(var, e) := makelist(diff(e, var1), var1, var);
and then you can say for example
grad([x, y, z], sin(x)*y/z);
to get
 cos(x) y  sin(x)    sin(x) y
[--------, ------, - --------]
    z        z           2
                        z
(There isn't a built-in gradient function; this is an oversight.)
About local functions, bear in mind that all function definitions are global. However you can approximate a local function definition via local, which saves and restores all properties of a symbol. Since the function definition is a property, local has the effect of temporarily wiping out an existing function definition and later restoring it. In between you can create a temporary function definition. E.g.
foo(x) := 2*x;
bar(y) := block(local(foo), foo(x) := x - 1, foo(y));
bar(100); /* output is 99 */
foo(100); /* output is 200 */
However, I don't think you need to use local -- just makelist plus diff is enough to compute the gradient.
There is more to say about Maxima's scope rules, named and unnamed functions, etc. I'll try to come back to this question tomorrow.
To compute the gradient, my advice is to call makelist and diff as shown in my first answer. Let me take this opportunity to address some related topics.
I'll paste the definition of grad shown in the problem statement and use that to make some comments.
grad(var,f) := block([aux],
    aux : [gradient, DfDx[i]],
    gradient : [],
    DfDx[i] := diff(f(x_1,x_2,x_3),var[i],1),
    for i in [1,2,3] do (
        gradient : append(gradient, [DfDx[i]])
    ),
    return(gradient)
)$
(1) Maxima works mostly with expressions as opposed to functions. That's not causing a problem here, I just want to make it clear. E.g. in general one has to say diff(f(x), x) when f is a function, instead of diff(f, x), likewise integrate(f(x), ...) instead of integrate(f, ...).
(2) When gradient and DfDx are to be the local variables, you have to name them in the list of variables for block. E.g. block([gradient, DfDx], ...) -- Maxima won't understand block([aux], aux: ...).
(3) Note that a function defined with square brackets instead of parentheses, e.g. f[x] := ... instead of f(x) := ..., is a so-called array function in Maxima. An array function is a memoizing function, i.e. if f[x] is called two or more times, the return value is only computed once, and then returned every time thereafter. Sometimes that's a useful optimization when the domain of the function comprises a finite set.
(4) Bear in mind that x_1, x_2, x_3, are distinct symbols, not related to each other, and not related to x[1], x[2], x[3], even if they are displayed the same. My advice is to work with subscripted symbols x[i] when i is a variable.
(5) About building up return values, try to arrange to compute the whole thing at one go, instead of growing the result incrementally. In this case, makelist is preferable to for plus append.
(6) The return function in Maxima acts differently than in other programming languages; it's a little hard to explain. A function returns the value of the last expression which was evaluated, so if gradient is that last expression, you can just write grad(var, f) := block(..., gradient).
Hope this helps, I know it's obscure and complex. The Maxima programming language was not designed before being implemented, and some of the decisions look clearly questionable in hindsight, more than 50 years (!) later. That's okay, they were figuring it out as they went along. There was not a body of established results which could provide a point of reference; the original authors were contributing to what's considered common knowledge today.
Drake has an interface where you can give it a generic function as a constraint and it can set up the nonlinearly-constrained mathematical program automatically (as long as it supports AutoDiff). I have a situation where my constraint does not support AutoDiff (the constraint function conducts a line search to approximate the maximum value of some function), but I have a closed-form expression for the gradient of the constraint. In my case, the math works out so that it's difficult to find a point on this function, but once you have that point it's easy to linearize around it.
I know many optimization libraries will allow you to provide your own analytical gradient when available; can you do this with Drake's MathematicalProgram as well? I could not find mention of it in the MathematicalProgram class documentation.
Any help is appreciated!
It's definitely possible, but I admit we haven't provided helper functions that make it pretty yet. Please let me know if/how this helps; I will plan to tidy it up and add it as an example or code snippet that we can reference in drake.
Consider the following code:
from pydrake.all import AutoDiffXd, MathematicalProgram, Solve
prog = MathematicalProgram()
x = prog.NewContinuousVariables(1, 'x')
def cost(x):
    return (x[0]-1.)*(x[0]-1.)

def constraint(x):
    if isinstance(x[0], AutoDiffXd):
        print(x[0].value())
        print(x[0].derivatives())
    return x

cost_binding = prog.AddCost(cost, vars=x)
constraint_binding = prog.AddConstraint(
    constraint, lb=[0.], ub=[2.], vars=x)
result = Solve(prog)
When we register the cost or constraint with MathematicalProgram in this way, we are allowing that it can get called with either x being a float, or x being an AutoDiffXd -- which is simply a wrapping of Eigen's AutoDiffScalar (with dynamically allocated derivatives of type double). The snippet above shows you roughly how it works -- every scalar value has a vector of (partial) derivatives associated with it. On entry to the function, you are passed x with the derivatives of x set to dx/dx (which will be 1 or zero).
Your job is to return a value, call it y, with the value set to the value of your cost/constraint, and the derivatives set to dy/dx. Normally, all of this happens magically for you. But it sounds like you get to do it yourself.
Here's a very simple code snippet that, I hope, gets you started:
from pydrake.all import AutoDiffXd, MathematicalProgram, Solve
prog = MathematicalProgram()
x = prog.NewContinuousVariables(1, 'x')
def cost(x):
    return (x[0]-1.)*(x[0]-1.)

def constraint(x):
    if isinstance(x[0], AutoDiffXd):
        # value is y = 2*x; derivatives are dy/dx * (derivatives of x)
        y = AutoDiffXd(2*x[0].value(), 2*x[0].derivatives())
        return [y]
    return 2*x

cost_binding = prog.AddCost(cost, vars=x)
constraint_binding = prog.AddConstraint(
    constraint, lb=[0.], ub=[2.], vars=x)
result = Solve(prog)
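For a constraint with several inputs and outputs, the same bookkeeping is a chain rule: multiply your closed-form Jacobian dy/dx by the derivatives that arrived on the inputs. The following is a minimal sketch in plain Python lists, with no Drake dependency. The helper name chain_rule is hypothetical, and each row it returns is what I would expect to pass as the derivatives of one returned AutoDiffXd, in the same way as the one-dimensional snippet above.

```python
from typing import List, Sequence


def chain_rule(dy_dx: Sequence[Sequence[float]],
               dx_dvars: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return dy/dvars = dy_dx * dx_dvars (matrix product on plain lists).

    dy_dx:    one row per output, one column per input (your analytical Jacobian).
    dx_dvars: one row per input, holding that input's incoming derivatives.
    """
    n_inputs = len(dx_dvars)
    n_vars = len(dx_dvars[0]) if dx_dvars else 0
    return [
        [sum(row[k] * dx_dvars[k][j] for k in range(n_inputs))
         for j in range(n_vars)]
        for row in dy_dx
    ]


# Example: y = x0 * x1 evaluated at x = (2, 3), so dy/dx = [x1, x0] = [3, 2].
dy_dx = [[3.0, 2.0]]

# Case 1: the inputs are exactly the decision variables (identity derivatives).
identity = [[1.0, 0.0], [0.0, 1.0]]
print(chain_rule(dy_dx, identity))

# Case 2: the program has three variables and x picks out the first and third.
selection = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
print(chain_rule(dy_dx, selection))
```

The second case shows why the incoming derivatives should be multiplied through rather than assumed to be the identity: the solver may evaluate the constraint with derivatives taken with respect to a larger set of variables.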
Let me know?
I am surprised to find that it is difficult to obtain a correct fit of an interaction function from gam().
To be more specific, I want to estimate an additive function:
y=m_1(x)+m_2(z)+m_{12}(x,z)+u,
where m_1(x)=x^2, m_2(z)=z^2, m_{12}(x,z)=xz. The following code generates this model:
test1 <- function(x,z,sx=1,sz=1) {
  #--m1(x) function
  m.x<-x^2
  m.x<-m.x-mean(m.x)
  #--m2(z) function
  m.z<-z^2
  m.z<-m.z-mean(m.z)
  #--m12(x,z) function
  m.xz<-x*z
  m.xz<-m.xz-mean(m.xz)
  m<-m.x+m.z+m.xz
  return(list(m=m,m.x=m.x,m.z=m.z,m.xz=m.xz))
}
n <- 1000
a=0
b=2
x <- runif(n,a,b)/20
z <- runif(n,a,b)
u <- rnorm(n,0,0.5)
model<-test1(x,z)
y <- model$m + u
I then fit the model with gam() as follows:
library(mgcv)
b3 <- gam(y~ ti(x) + ti(z) + ti(x,z))
vis.gam(b3);title("tensor anova")
#---extracting basis matrix
B.f3<-model.matrix.gam(b3)
#---extracting series estimator
b3.hat<-b3$coefficients
Question: when I plot the functions estimated by gam() above against the true functions, I end up with
par(mfrow=c(1,3))
#---m1(x)
B.x<-B.f3[,c(2:5)]
b.x.hat<-b3.hat[c(2:5)]
plot(x,B.x%*%b.x.hat)
points(x,model$m.x,col='red')
legend('topleft',c('Estimate','True'),lty=c(1,1),col=c('black','red'))
#---m2(z)
B.z<-B.f3[,c(6:9)]
b.z.hat<-b3.hat[c(6:9)]
plot(z,B.z%*%b.z.hat)
points(z,model$m.z,col='red')
legend('topleft',c('Estimate','True'),lty=c(1,1),col=c('black','red'))
#---m12(x,z)
B.xz<-B.f3[,-c(1:9)]
b.xz.hat<-b3.hat[-c(1:9)]
plot(x,B.xz%*%b.xz.hat)
points(x,model$m.xz,col='red')
legend('topleft',c('Estimate','True'),lty=c(1,1),col=c('black','red'))
However, the estimate of m_1(x) is very different from x^2, and the estimate of the interaction function m_{12}(x,z) is also very different from the xz defined in test1 above. The results are the same if I use predict(b3).
I really can't figure it out. Can anybody help me by explaining why the results turn out this way? I would greatly appreciate it!
First, the issue above is not caused by the package, of course. It is closely related to the identification conditions for the smooth functions. A common practice is to assume that E(m_j(.))=0 for each individual function j=1,...,d, and E(m_ij(x_i,x_j)|x_i)=E(m_ij(x_i,x_j)|x_j)=0 for i not equal to j. These conditions require centered basis functions in the series estimator, which the GAM package already does. However, in my case the function m(x,z)=x*z defined in test1 does not satisfy these identification assumptions, since the integral of x*z with respect to either x or z is not zero when x and z range from zero to two.
Furthermore, the series estimator allows the individual and interaction functions to be identified if one imposes m(0)=0 or m(0,x_j)=m(x_i,0)=0. This can be readily achieved by centering the basis functions around zero. I have tried both cases, and they work well whenever the DGP satisfies the identification conditions.
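To make the identification point concrete, here is a short worked decomposition of the product term. It assumes x and z are independent with means \mu_x and \mu_z; these two symbols and \tilde m_{12} are introduced here only for illustration.

```latex
\begin{align*}
xz &= \underbrace{(x-\mu_x)(z-\mu_z)}_{\tilde m_{12}(x,z)}
      + \mu_z\,(x-\mu_x) + \mu_x\,(z-\mu_z) + \mu_x\mu_z, \\
E\left[\tilde m_{12}(x,z)\mid x\right] &= (x-\mu_x)\,E[z-\mu_z] = 0, \qquad
E\left[\tilde m_{12}(x,z)\mid z\right] = (z-\mu_z)\,E[x-\mu_x] = 0.
\end{align*}
```

Under these assumptions the part of xz that satisfies the conditional-mean conditions is (x-\mu_x)(z-\mu_z), while the linear pieces \mu_z (x-\mu_x) and \mu_x (z-\mu_z) are absorbed into the main effects. That would be consistent with the fitted m_1 and m_{12} differing from the x^2 and xz that test1 constructs, because test1 only subtracts the overall mean of xz.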
I am trying to solve the following problem: I have, for example, 4 points, and each pair of points has a cost between them. I want to find a sequence of nodes whose total cost is less than a bound. I have written some code, but it does not seem to work. The main problem is that I have defined a Python function and am trying to call it within a constraint.
Here is my code. I have a function def getVal(n1,n2): where n1 and n2 are of Int sort. The line Nodes = [ Int("n_%s" % (i)) for i in range(totalNodeNumber) ] defines 4 points of Int sort. When I add the constraint s.add(getVal(Nodes[0], Nodes[1]) + getVal(Nodes[1], Nodes[2]) < 100), the getVal function is called immediately. What I want instead is for the function to be called once Z3 has decided values for Nodes[0], Nodes[1], Nodes[2], Nodes[3], to get the cost between two points.
from z3 import *
import random
totalNodeNumber = 4
Nodes = [ Int("n_%s" % (i)) for i in range(totalNodeNumber) ]
def getVal(n1,n2):
    # I need n1 and n2 values those assigned by Z3
    cost = random.randint(1,20)
    print(cost)
    return IntVal(cost)
s = Solver()
#constraint: Each Nodes value should be distinct
nodes_index_distinct_constraint = Distinct(Nodes)
s.add(nodes_index_distinct_constraint)
#constraint: Each Nodes value should be between 0 and totalNodeNumber
def get_node_index_value_constraint(i):
    return And(Nodes[i] >= 0, Nodes[i] < totalNodeNumber)
nodes_index_constraint = [ get_node_index_value_constraint(i) for i in range(totalNodeNumber)]
s.add(nodes_index_constraint)
#constraint: Problem with this constraint
# Here is the problem: it just calls the Python getVal function without assigning Nodes[0], Nodes[1], Nodes[2] values
# But I want Z3 to call the Python function while it decides the values of the variables
s.add(getVal(Nodes[0], Nodes[1]) + getVal(Nodes[1], Nodes[2]) + getVal(Nodes[2], Nodes[3]) < 100)
if s.check() == sat:
    print("SAT")
    print("Model: ")
    m = s.model()
    nodeIndex = [ m.evaluate(Nodes[i]) for i in range(totalNodeNumber) ]
    print(nodeIndex)
else:
    print("UNSAT")
    print("No solution found !!")
If this is not the right way to solve the problem, could you please tell me what alternative would work? Can I encode this kind of problem, finding an optimal sequence of waypoints, using the Z3 solver?
I don't understand what problem you need to solve. Definitely, the way getVal is formulated does not make sense. It does not use the arguments n1, n2. If you want to examine values produced by a model, then you do this after Z3 returns from a call to check().
I don't think you can use a Python function in your SMT logic. What you could do instead is define getVal as a Function like this
getVal = Function('getVal',IntSort(),IntSort(),IntSort())
and constrain the edge weights as
s.add(And(getVal(0,1)==1,getVal(1,2)==2,getVal(0,2)==3))
The first two IntSort arguments of getVal represent the node ids, and the last IntSort is the sort of the returned weight.
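For only 4 nodes, a solver-free baseline can be handy for cross-checking whatever the Z3 encoding returns. The sketch below is not a Z3 encoding: it is a brute-force enumeration over visiting orders using itertools. The function name sequences_under_bound, the symmetric cost table, the weights for edges involving node 3, and the bound of 8 are all assumptions made for illustration; only the first three weights come from the constraint above.

```python
from itertools import permutations
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def sequences_under_bound(cost: Dict[Edge, int], n_nodes: int,
                          bound: int) -> List[Tuple[Tuple[int, ...], int]]:
    """Enumerate visiting orders whose summed consecutive-edge cost is < bound."""
    found = []
    for order in permutations(range(n_nodes)):
        total = sum(cost[(a, b)] for a, b in zip(order, order[1:]))
        if total < bound:
            found.append((order, total))
    return found


# Weights for (0,1), (1,2), (0,2) match the getVal constraints above;
# the remaining weights are made up for this example.
undirected = {(0, 1): 1, (1, 2): 2, (0, 2): 3,
              (0, 3): 10, (1, 3): 10, (2, 3): 4}

# Assume costs are symmetric, so fill in both directions.
cost: Dict[Edge, int] = {}
for (a, b), w in undirected.items():
    cost[(a, b)] = w
    cost[(b, a)] = w

solutions = sequences_under_bound(cost, n_nodes=4, bound=8)
for order, total in solutions:
    print(order, total)
```

This scales factorially, so it is only a sanity check for small instances; the table of weights plays the same role as the getVal function constraints in the encoding above.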