how to cast collision avoidance constraint as a convex one to avoid getting errors - cvxpy

I am running a trajectory optimization problem using sequential convex programming, where at each iteration I solve a QP. I am linearizing my inter-vehicle collision avoidance constraint as follows:
However, when I do this I get an error from CVXPY complaining that my constraint does not follow DCP:
s_cvx = cvx.Variable((11, 12))
u_cvx = cvx.Variable((10, 6))
for k in range(N):
    if k>0:
        dist_linear = cvx.norm(agent_i_q-agent_j_q,2) + \
            (agent_i_q-agent_j_q/cvx.norm(agent_i_q-agent_j_q,2)) # (s_try[1,0:3]-s_try[1,6:9])
        constraints += [dist_linear >= 0.5]
......
where (s_cvx[k,0:3]-s_cvx[k,6:9]) refers to the current difference in position between the two vehicles, and agent_i_q and agent_j_q refer to their positions in the previous iterate.
The DCP error is as follows:
DCPError: Problem does not follow DCP rules. Specifically:
The following constraints are not DCP:
0.5 <= Pnorm(var1216896[0, 0:3] + -var1216896[0, 6:9], 2) + (var1216896[0, 0:3] + -var1216896[0, 6:9] / Promote(Pnorm(var1216896[0, 0:3] + -var1216896[0, 6:9], 2), (3,))) # (var1216795[1, 0:3] + -var1216795[1, 6:9]) , because the following subexpressions are not:
|-- var1216896[0, 6:9] / Promote(Pnorm(var1216896[0, 0:3] + -var1216896[0, 6:9], 2), (3,))
0.5 <= Pnorm(var1216896[1, 0:3] + -var1216896[1, 6:9], 2) + (var1216896[1, 0:3] + -var1216896[1, 6:9] / Promote(Pnorm(var1216896[1, 0:3] + -var1216896[1, 6:9], 2), (3,))) # (var1216795[1, 0:3] + -var1216795[1, 6:9]) , because the following subexpressions are not:
|-- var1216896[1, 6:9] / Promote(Pnorm(var1216896[1, 0:3] + -var1216896[1, 6:9], 2), (3,))
......
Any suggestions?

Related

Can't replicate RStan ESS code from Vehtari paper

I am trying to replicate an ESS (effective sample size) calculation using the method of Vehtari et al. in: Rank-normalization, folding, and localization: An improved Rhat for assessing convergence of MCMC
I am working from the code here:
https://github.com/avehtari/rhat_ess/blob/master/code/monitornew.R
# Geyer's initial positive sequence
rho_hat_t <- rep.int(0, n_samples)
t <- 0
rho_hat_even <- 1
rho_hat_t[t + 1] <- rho_hat_even
rho_hat_odd <- 1 - (mean_var - mean(acov[t + 2, ])) / var_plus # 251
rho_hat_t[t + 2] <- rho_hat_odd
while (t < nrow(acov) - 5 && !is.nan(rho_hat_even + rho_hat_odd) &&
       (rho_hat_even + rho_hat_odd > 0)) {
  t <- t + 2
  rho_hat_even = 1 - (mean_var - mean(acov[t + 1, ])) / var_plus # 256
  rho_hat_odd = 1 - (mean_var - mean(acov[t + 2, ])) / var_plus # 257
  if ((rho_hat_even + rho_hat_odd) >= 0) {
    rho_hat_t[t + 1] <- rho_hat_even
    rho_hat_t[t + 2] <- rho_hat_odd
  }
}
I can follow the code from the paper except when we get to equation 10 in the paper (calculating the cross-chain autocorrelation). The code (lines 251, 256 and 257) appears in the form:
1 - (mean_var - mean(acov[t + 1, ])) / var_plus
which is close to equation 10, except that it is missing the 's' terms from equation 10:
I can't see anywhere in the code that this is somehow accounted for elsewhere in the way the calculation is being done. I have tried putting the 's' terms back into those lines of code and it makes a big difference to the final ESS value.
Is anyone able to help me understand the discrepancy between paper and code?
Thanks.
In the formula in the paper, s^2 is the estimate of variance and rho is the estimate of autocorrelation. Thus s^2 * rho is an estimate of the autocovariance, which is what you see in the code.
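The identity can be checked numerically. The following is a minimal NumPy sketch, not the code from monitornew.R: the chains are random placeholders, autocov is a hypothetical helper, and it assumes the variance and the autocorrelation use the same 1/n normalization (the R code may scale them differently).

```python
import numpy as np

def autocov(x):
    """Autocovariance estimates of a 1-D chain for lags 0..n-1 (1/n normalization)."""
    n = len(x)
    xc = x - x.mean()
    return np.array([np.dot(xc[: n - t], xc[t:]) / n for t in range(n)])

rng = np.random.default_rng(0)
chains = rng.normal(size=(4, 200))   # 4 hypothetical chains of 200 draws each

# Rows are lags, columns are chains, mirroring acov[t, ] in the R code.
acov = np.stack([autocov(c) for c in chains], axis=1)
s2 = acov[0]       # per-chain variance estimate, the 's^2' term
rho = acov / s2    # per-chain autocorrelation at each lag

t = 3
paper_term = np.mean(s2 * rho[t])   # average over chains of s^2 * rho at lag t
code_term = np.mean(acov[t])        # average autocovariance at lag t, as in the code
print(paper_term, code_term)
```

Under these assumptions the two quantities agree, so multiplying the code's autocovariance by s^2 again would apply the variance factor twice.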

Infinity objective value given by CVXPY on a convex program

I am solving a convex problem using cvxpy. The constraints are rather simple: there are 3 variables, but we could eliminate one. The objective is convex and involves the entropy and logarithm. The solution is correct, in the sense that the variables have the expected values. However, the objective value should be around -1.06, but it is reported as infinite. Is there a bug in how the expressions involved are evaluated?
#!/usr/bin/env python3
import cvxpy as cx
import numpy as np
from math import log
def entr(x):
    return -x * log(x)
def check_obj(a, b, c):
    return -entr(2.0) + -2.0 * log(2.0) + -entr(1.0 + a) + -1.0 + a * log(2.0) + -entr(2.0 + a) -2.0 + a * log(1.0) -entr(1.0 + a + b + c) + -1.0 + a + b + c * log(2.0) + -entr(2.0) + -2.0 * log(2.0) + -entr(1.0 + b) -1.0 + b * log(2.0) + -entr(2.0 + b) + -2.0 + b * log(1.0) -entr(1.0 + b + a + c) -1.0 + b + a + c * log(2.0)
a = cx.Variable(name='a')
b = cx.Variable(name='b')
c = cx.Variable(name='c')
obj = -cx.entr(2.0) + -2.0 * cx.log(2.0) + -cx.entr(1.0 + a) + -1.0 + a * cx.log(2.0) + -cx.entr(2.0 + a) -2.0 + a * cx.log(1.0) -cx.entr(1.0 + a + b + c) + -1.0 + a + b + c * cx.log(2.0) + -cx.entr(2.0) + -2.0 * cx.log(2.0) + -cx.entr(1.0 + b) -1.0 + b * cx.log(2.0) + -cx.entr(2.0 + b) + -2.0 + b * cx.log(1.0) -cx.entr(1.0 + b + a + c) -1.0 + b + a + c * cx.log(2.0)
p = cx.Problem(cx.Minimize(obj), [0 <= a, 0<= b, 0 <= c, a + b + c == 1])
p.solve()
# should be 'optimal' and indeed it is
print(p.status)
# the following two values should be the same, but p.value is infinite and should be around -1.06
print(p.value)
print(check_obj(a.value, b.value, c.value))
It looks like a bug in the entropy atom. I fixed it and made a pull request here. It is merged now. If you run your code with the latest cvxpy from the master branch it should give correct results.

Calculating Gradient Update

Let's say I want to manually calculate the gradient update with respect to the Kullback-Leibler divergence loss, say on a VAE (see an actual example from the PyTorch sample documentation here):
KLD = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
where logvar is (for simplicity's sake, ignoring activation functions, multiple layers, etc.) basically a single-layer transformation from a 400-dim feature vector into a 20-dim one:
self.fc21 = nn.Linear(400, 20)
logvar = fc21(x)
I'm just not mathematically understanding how you take the gradient of this, with respect to the weight vector for fc21. Mathematically I thought this would look like:
KL = -.5sum(1 + Wx + b - m^2 - e^{Wx + b})
dKL/dW = -.5 (x - e^{Wx + b}x)
where W is the weight matrix of the fc21 layer. But here this result isn't in the same shape as W (20x400). Like, x is just a 400 feature vector. So how would I perform SGD on this? Does x just broadcast to the second term, and if so why? I feel like I'm just missing some mathematical understanding here...
Let's simplify the example a bit and assume a fully connected layer of input shape 3 and output shape 2, then:
W = [[w1, w2, w3], [w4, w5, w6]]
x = [x1, x2, x3]
y = [y1, y2] = [w1*x1 + w2*x2 + w3*x3 + b1, w4*x1 + w5*x2 + w6*x3 + b2]
D_KL = -0.5 * [(1 + y1 - m1^2 - e^(y1)) + (1 + y2 - m2^2 - e^(y2))]
grad(D_KL, w1) = -0.5 * [x1 - x1 * e^(y1)]
grad(D_KL, w2) = -0.5 * [x2 - x2 * e^(y1)]
...
grad(D_KL, W) = [[grad(D_KL, w1), grad(D_KL, w2), grad(D_KL, w3)],
                 [grad(D_KL, w4), grad(D_KL, w5), grad(D_KL, w6)]]
This generalizes for higher order tensors of any dimensionality. Your differentiation is wrong in treating x and W as scalars rather than taking element-wise partial derivatives.
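As a rough check of the element-wise view, here is a small NumPy sketch that starts from the KLD expression in the question. Taking the partial derivative with respect to entry (i, j) of W gives -0.5 * (1 - e^(y_i)) * x_j, which is an outer product with the same shape as W. The function names and the random test values are illustrative assumptions, not part of the PyTorch example.

```python
import numpy as np

def kld(W, b, x, mu):
    """KLD = -0.5 * sum(1 + logvar - mu^2 - exp(logvar)) with logvar = W x + b."""
    logvar = W @ x + b
    return -0.5 * np.sum(1 + logvar - mu**2 - np.exp(logvar))

def kld_grad_W(W, b, x):
    """Analytic gradient: entry (i, j) is -0.5 * (1 - exp(logvar_i)) * x_j."""
    logvar = W @ x + b
    return np.outer(-0.5 * (1 - np.exp(logvar)), x)

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 3))   # 2 outputs, 3 inputs, as in the simplified example
b = rng.normal(size=2)
x = rng.normal(size=3)
mu = rng.normal(size=2)

analytic = kld_grad_W(W, b, x)

# Central finite differences, one weight at a time, as an independent check.
eps = 1e-6
numeric = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        dW = np.zeros_like(W)
        dW[i, j] = eps
        numeric[i, j] = (kld(W + dW, b, x, mu) - kld(W - dW, b, x, mu)) / (2 * eps)

print(analytic.shape)
print(np.max(np.abs(analytic - numeric)))
```

The gradient has the shape of W because each row of W only influences its own output y_i, so x is scaled by a different factor per row rather than broadcast unchanged.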

Getting probability of class using naive Bayes

I am trying to classify input with two classes, here is the code. dino and crypto are two classes:
for w, cnt in list(counts.items()):  # counts is a dict mapping each word to its count
    p_word = vocab[w] / sum(vocab.values())
    p_w_given_dino = (word_counts["dino"].get(w, 0.0) + 1) / (sum(word_counts["dino"].values()) + v)
    p_w_given_crypto = (word_counts["crypto"].get(w, 0.0) + 1) / (sum(word_counts["crypto"].values()) + v)
    log_prob_dino += math.log(cnt * p_w_given_dino / p_word)
    log_prob_crypto += math.log(cnt * p_w_given_crypto / p_word)
print("Score(dino) :", math.exp(log_prob_dino + math.log(prior_dino)))
print("Score(crypto):", math.exp(log_prob_crypto + math.log(prior_crypto)))
Another approach is:
prior_dino = (priors["dino"] / sum(priors.values()))
prior_crypto = (priors["crypto"] / sum(priors.values()))
for w, cnt in list(counts.items()):
    p_word = vocab[w] / sum(vocab.values())
    p_w_given_dino = (word_counts["dino"].get(w, 0.0) + 1) / (sum(word_counts["dino"].values()) + v)
    p_w_given_crypto = (word_counts["crypto"].get(w, 0.0) + 1) / (sum(word_counts["crypto"].values()) + v)
    prob_dino *= p_w_given_dino
    prob_crypto *= p_w_given_crypto
t_prior_dino = prob_dino * prior_dino
t_prior_crypto = prob_crypto * prior_crypto
On the second approach I got very small values.
Which one is correct, or are both of them correct?
These are completely equivalent approaches. The first one is preferable, however, as working with logarithms of probabilities makes the whole process more numerically stable. Results should be identical (up to numerical errors).
However, it appears that you have an error in the second approach:
prob_dino *= p_w_given_dino
does not use the fact that you have cnt occurrences; it should be something like
prob_dino *= pow(p_w_given_dino, cnt)
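To illustrate the point about repeated words, here is a small sketch with made-up numbers: the words, counts, likelihoods, and prior are all hypothetical. It shows that raising each likelihood to the power cnt in the product form matches adding cnt * log(p) in the log form.

```python
import math

counts = {"fossil": 2, "coin": 1}                 # hypothetical word counts in the input
p_w_given_class = {"fossil": 0.3, "coin": 0.05}   # hypothetical smoothed likelihoods
prior = 0.5                                       # hypothetical class prior

prob = prior
log_prob = math.log(prior)
for w, cnt in counts.items():
    prob *= pow(p_w_given_class[w], cnt)           # product form, as in the fix above
    log_prob += cnt * math.log(p_w_given_class[w]) # the same quantity in log space

print(prob, math.exp(log_prob))
```

With many words the product form underflows towards zero, which is why the scores from the second approach look very small. The log form avoids that, and the scores can be compared without exponentiating.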

print multicolumn formatted text vb6

I need to print formatted text like in the image below. How can I achieve this in VB6, given that the VB6 Printer object is not friendly for this kind of layout?
The data I need to print, represented by the boxes, are unrelated to each other.
It is not very difficult. Use the ScaleLeft, ScaleWidth, CurrentX, and CurrentY properties to set where printing begins on the page. In this case you will probably also want to set the Orientation property to vbPRORPortrait. Once the position, font, and style are set, call Printer.Print.
This method will draw 4 boxes onto a page. Play with the (x, y) coordinates or hard-code the numbers to alter the sizes. Remove the .EndDoc statement if you don't want this method to send the page to the printer, and call Printer.EndDoc from somewhere else instead. Full Printer object documentation for VB6 can be found at http://msdn.microsoft.com/en-us/library/aa443915%28v=vs.60%29.aspx
Private Sub DrawBox()
    ' The declarations and starting values below are example assumptions;
    ' the original snippet used these names without declaring them.
    Dim lngMargin As Long, lngColor As Long
    Dim lngScaleWidth As Long, lngScaleHeight As Long
    lngMargin = 200      ' example margin in twips
    lngColor = vbBlack   ' example line color
    With Printer
        .ScaleMode = vbTwips
        lngScaleWidth = .ScaleWidth
        lngScaleHeight = .ScaleHeight
        Printer.Line (.ScaleLeft + lngMargin, .ScaleTop + lngMargin)-(lngScaleWidth / 2 - (100 + lngMargin * 2), lngScaleHeight / 2 - (100 + lngMargin * 2)), lngColor, B
        Printer.Line (lngScaleWidth / 2 + (100 + lngMargin * 2), .ScaleTop + lngMargin)-(.ScaleWidth - lngMargin, lngScaleHeight / 2 - (100 + lngMargin * 2)), lngColor, B
        Printer.Line (.ScaleLeft + lngMargin, lngScaleHeight / 2 + (100 + lngMargin * 2))-(lngScaleWidth / 2 - (100 + lngMargin * 2), .ScaleHeight - lngMargin), lngColor, B
        Printer.Line (lngScaleWidth / 2 + (100 + lngMargin * 2), lngScaleHeight / 2 + (100 + lngMargin * 2))-(.ScaleWidth - lngMargin, .ScaleHeight - lngMargin), lngColor, B
        .EndDoc
    End With
End Sub
The sample code below demonstrates some of the positioning and other properties.
Dim lLeftMargin As Integer
lLeftMargin = 200
With Printer
    .FontBold = True
    .FontItalic = False
    .CurrentY = .CurrentY + (3 * .TextHeight(App.ProductName))
    .CurrentX = lLeftMargin
    .FontName = "Arial"
    .FontSize = 11
    Printer.Print "Date " & strTransDate
End With
