I have a convex optimization problem I am trying to solve with cvxpy. Given a 1 x n row vector y and an m x n matrix C, I want to find a scalar b and a 1 x m row vector a such that the sum of squares of y - (aC + b(aC # aC)) is as small as possible (here # denotes element-wise multiplication). In addition, all entries in a must be nonnegative and sum to 1, and -100 <= b <= 100. Below is my attempt to solve this using cvxpy.
import numpy as np
import cvxpy as cvx
def find_a(y, C, b_min=-100, b_max=100):
    b = cvx.Variable()
    a = cvx.Variable((1, C.shape[0]))
    aC = a @ C  # matrix multiplication
    x = aC + cvx.multiply(b, cvx.square(aC))
    objective = cvx.Minimize(cvx.sum_squares(y - x))
    constraints = [0. <= a,
                   a <= 1.,
                   b_min <= b,
                   b <= b_max,
                   cvx.sum(a) == 1.]
    prob = cvx.Problem(objective, constraints)
    result = prob.solve()
    print(a.value)
    print(result)
y = np.asarray([[0.10394265, 0.25867508, 0.31258457, 0.36452763, 0.36608997]])
C = np.asarray([
    [0., 0.00169811, 0.01679245, 0.04075472, 0.03773585],
    [0., 0.00892802, 0.03154158, 0.06091544, 0.07315024],
    [0., 0.00962264, 0.03245283, 0.06245283, 0.07283019],
    [0.04396226, 0.05245283, 0.12245283, 0.18358491, 0.23886792]])
find_a(y, C)
I keep getting a DCPError: Problem does not follow DCP rules. error when I try to solve for a. I am thinking that either my function is not really convex, or I do not understand how to construct the proper cvxpy Problem. Any help would be greatly appreciated.
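One way to probe the first suspicion is to test the objective for convexity directly. The sketch below uses a scalar analogue of the problem, with m = n = 1, C = 1 and y = 0, and with the sum-to-one constraint on a ignored; all of these are assumptions made purely for illustration. It evaluates the objective at two points and at their midpoint. A convex function can never be larger at the midpoint than the average of the endpoint values.

```python
def objective(a, b, y=0.0):
    # Scalar analogue of sum_squares(y - (aC + b * (aC)**2)) with C = 1 (assumed)
    return (y - (a + b * a ** 2)) ** 2

p1 = (1.0, -1.0)   # (a, b), illustrative point
p2 = (2.0, -0.5)   # (a, b), illustrative point
mid = tuple((u + v) / 2 for u, v in zip(p1, p2))

f1, f2, f_mid = objective(*p1), objective(*p2), objective(*mid)

# A convex function must satisfy f(mid) <= (f1 + f2) / 2
violates_convexity = f_mid > (f1 + f2) / 2
print(f1, f2, f_mid, violates_convexity)
```

Here both endpoints give zero while the midpoint does not, so this scalar analogue is not jointly convex in (a, b). The product of the variable b with an expression in a is also the kind of term that cvxpy's DCP rules reject.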
I'm interested in minimizing the trace of the covariance matrix associated with a Gaussian process in two dimensions. That is, I want to minimize tr(Σ) where Σ is given by:
[Equation image: definition of Σ]
and K() is the kernel function with design points X and query points X*
As a minimum working example, I have tried the below implementation. This is clearly not DCP compliant and I have a strong feeling there is a better way to implement this such that it would be DCP compliant; however, I am somewhat of a novice to cvxpy and so would appreciate any suggestions.
import cvxpy as cp
import numpy as np
from itertools import product
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C
def cp_kernel(X, Y):
    return cp.exp(cp.norm2(X-Y))
T = 10 # number of X's
u_max = 1
Xq = np.mgrid[0:1:0.1, 0:1:0.1].reshape(2,-1).T
x = cp.Variable((T,2))
u = cp.Variable((T,2))
nu = cp.Variable((T,1))
A = np.matrix('1 0; 0 1')
B = np.matrix('1 0; 0 1')
kernel = RBF()
KXqXq = kernel(Xq)
obj = cp.trace(KXqXq - cp.matrix_frac(cp.bmat([[cp_kernel(x[i], Xq[j]) for j in range(len(Xq))] for i in range(T)]), cp.bmat([[cp_kernel(x[i], x[j]) for j in range(T)] for i in range(T)]) + cp.diag(nu)))
cons = [0 <= x, x <= 1]
cons += [0 <= u, u <= u_max]
cons += [0 <= nu]
cons += [cp.sum(nu) == 1]
for t in range(T-1):
    cons += [x[t+1] == A @ x[t] + B @ u[t]]
cp.Problem(cp.Minimize(obj), cons).solve()
I'm trying to implement a function that computes the Relu derivative for each element in a matrix, and then return the result in a matrix. I'm using Python and Numpy.
Based on other Cross Validated posts, the ReLU derivative for x is
1 when x > 0, 0 when x < 0, and undefined (or 0 by convention) when x == 0.
So far I have the following code:
def reluDerivative(self, x):
    return np.array([self.reluDerivativeSingleElement(xi) for xi in x])
def reluDerivativeSingleElement(self, xi):
    if xi > 0:
        return 1
    elif xi <= 0:
        return 0
Unfortunately, xi is an array because x is a matrix, and the reluDerivativeSingleElement function doesn't work on arrays. So I'm wondering: is there a way to map the values in a matrix to another matrix using numpy, like numpy's exp function does?
Thanks a lot in advance.
That's an exercise in vectorization.
This code
if x > 0:
    y = 1
elif x <= 0:
    y = 0
can be reformulated as
y = (x > 0) * 1
This works for numpy arrays because a boolean expression on an array is evaluated element-wise and yields an array of booleans; multiplying by 1 converts them to integers.
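As a minimal sketch of this element-wise behaviour (the function name and the sample matrix are illustrative, not from the question), the comparison produces a boolean array of the same shape, and the multiplication turns it into zeros and ones:

```python
import numpy as np

def relu_derivative(x):
    # x > 0 is evaluated per element and gives a boolean array;
    # multiplying by 1 converts True/False to 1/0
    x = np.asarray(x)
    return (x > 0) * 1

x = np.array([[-1.5, 0.0],
              [2.0, 3.5]])
mask = x > 0
d = relu_derivative(x)
print(mask)
print(d)
```

The result has the same shape as the input, so it can be used directly in an element-wise product during backpropagation.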
I guess this is what you are looking for:
>>> def reluDerivative(x):
...     x[x<=0] = 0
...     x[x>0] = 1
...     return x
>>> z = np.random.uniform(-1, 1, (3,3))
>>> z
array([[ 0.41287266, -0.73082379, 0.78215209],
[ 0.76983443, 0.46052273, 0.4283139 ],
[-0.18905708, 0.57197116, 0.53226954]])
>>> reluDerivative(z)
array([[ 1., 0., 1.],
[ 1., 1., 1.],
[ 0., 1., 1.]])
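Note that a function written with indexed assignment overwrites its argument, so the original activations are lost after the call. If they are still needed, a non-mutating variant may be preferable. The sketch below contrasts the two; the names relu_derivative_inplace and relu_derivative_copy are hypothetical.

```python
import numpy as np

def relu_derivative_inplace(x):
    # Same logic as the indexed-assignment version: modifies x itself
    x[x <= 0] = 0
    x[x > 0] = 1
    return x

def relu_derivative_copy(x):
    # Builds a new array and leaves x untouched
    return np.where(x > 0, 1.0, 0.0)

z = np.array([[0.4, -0.7],
              [-0.2, 0.5]])
z_before = z.copy()

d_copy = relu_derivative_copy(z)
unchanged = np.array_equal(z, z_before)   # z still holds the original values

d_inplace = relu_derivative_inplace(z)
overwritten = d_inplace is z              # the returned array is z itself
print(unchanged, overwritten)
```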
Basic function to return derivative of relu could be summarized as follows:
f'(x) = x > 0
So, with numpy that would be:
def relu_derivative(z):
    return np.greater(z, 0).astype(int)
def dRelu(z):
    return np.where(z <= 0, 0, 1)
Here z is an ndarray in my case.
def reluDerivative(self, x):
    return 1 * (x > 0)
You are on the right track in thinking about vectorized operations: we define a function and apply it to the whole matrix instead of writing a for loop.
The thread below answers your question. It replaces all the elements that satisfy a condition, and you can adapt it to compute the ReLU derivative.
https://stackoverflow.com/questions/19766757/replacing-numpy-elements-if-condition-is-met
In addition, Python supports functional programming well, so you can also try a lambda function.
https://www.python-course.eu/lambda.php
This works:
def dReLU(x):
    return 1. * (x > 0)
As mentioned by Neil in the comments, you can use heaviside function of numpy.
def reluDerivative(self, x):
    return np.heaviside(x, 0)
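The second argument of np.heaviside is the value returned where the input is exactly 0, which is the point where the ReLU derivative is undefined. A short sketch with an illustrative input shows how that choice affects the result:

```python
import numpy as np

x = np.array([-2.0, 0.0, 3.0])

# Second argument = value used where x == 0
d_zero = np.heaviside(x, 0)     # treat the derivative at 0 as 0
d_half = np.heaviside(x, 0.5)   # alternative convention at 0
print(d_zero)
print(d_half)
```

Passing 0 matches the convention used in the other answers, where the slope is 0 for x <= 0.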
If you want to use pure Python:
def relu_derivative(x):
    # Python has no built-in sign(), so compare the scalar directly
    return 1 if x > 0 else 0
If you want it with the derivative you can use:
def relu(neta):
    relu = neta * (neta > 0)
    d_relu = (neta > 0)
    return relu, d_relu
When x is larger than 0, the slope is 1.
When x is smaller than or equal to 0, the slope is 0.
if (x > 0):
    return 1
if (x <= 0):
    return 0
This can be written more compactly:
return 1 * (x > 0)
Here is part of the get_updates code of the SGD optimizer in Keras:
moments = [K.zeros(shape) for shape in shapes]
self.weights = [self.iterations] + moments
for p, g, m in zip(params, grads, moments):
    v = self.momentum * m - lr * g  # velocity
    self.updates.append(K.update(m, v))
Observation:
The moments variable is a list of zero tensors, so each m in the for loop is a zero tensor with the shape of p. Then self.momentum * m, in the first line of the loop, is just a scalar multiplied by a zero tensor, which results in a zero tensor.
Question
What am I missing here?
Yes, during the first iteration of this loop m is equal to 0. But it is then updated with the current value of v in this line:
    self.updates.append(K.update(m, v))
So in the next iteration you'll have:
    v = self.momentum * old_velocity - lr * g  # velocity
where old_velocity is the previous value of v.
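The accumulation can be seen in a plain-Python sketch of the same recurrence. The constant gradient, the scalar state and the hyperparameter values are assumptions chosen for illustration; this is not the Keras implementation.

```python
momentum = 0.9   # assumed hyperparameter value
lr = 0.1         # assumed learning rate
g = 1.0          # assumed constant gradient, for illustration only

m = 0.0          # the moment starts at zero, like K.zeros(shape)
velocities = []
for step in range(3):
    v = momentum * m - lr * g   # velocity
    m = v                       # plays the role of K.update(m, v)
    velocities.append(v)

print(velocities)
```

Only the first velocity equals -lr * g. From the second step on, the momentum term contributes because m holds the previous velocity.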
$\min_C \sum_i \left( \|x_i - X c_i\|^2 + \lambda \|c_i\| \right)$,
s.t. $c_{ii} = 0$,
where X is a matrix of shape d x n and C is of shape n x n; x_i and c_i denote the i-th columns of X and C respectively.
X is known here and based on X we want to find C.
Usually with a loss like that you need to vectorize it, instead of working with columns:
loss = X - tf.matmul(X, C)
loss = tf.reduce_sum(tf.square(loss))
reg_loss = tf.reduce_sum(tf.square(C), 0) # L2 loss for each column
reg_loss = tf.reduce_sum(tf.sqrt(reg_loss))
total_loss = loss + lambd * reg_loss
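To check that the vectorized form matches the column-wise definition, the two can be compared in NumPy. This is a sketch under the assumption that ||c_i|| is the Euclidean norm, as in the code above; the function names and the random test data are illustrative.

```python
import numpy as np

def loss_columnwise(X, C, lam):
    # Direct translation of the sum over columns
    total = 0.0
    for i in range(C.shape[1]):
        residual = X[:, i] - X @ C[:, i]
        total += residual @ residual + lam * np.linalg.norm(C[:, i])
    return total

def loss_vectorized(X, C, lam):
    # Same quantity without an explicit loop
    residual = X - X @ C
    fit = np.sum(residual ** 2)
    reg = np.sum(np.sqrt(np.sum(C ** 2, axis=0)))  # L2 norm of each column
    return fit + lam * reg

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 6))   # d x n
C = rng.normal(size=(6, 6))   # n x n

same = np.isclose(loss_columnwise(X, C, 0.5), loss_vectorized(X, C, 0.5))
print(same)
```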
To handle the zero constraint on the diagonal of C, a simple way is to add it to the loss as a penalty with another constant lambd2 (this pushes the diagonal toward zero rather than enforcing it exactly):
reg_loss2 = tf.trace(tf.square(C))
total_loss = total_loss + lambd2 * reg_loss2
I am trying to understand how the bound variables are indexed in z3.
Here is a snippet in z3py and the corresponding output ( http://rise4fun.com/Z3Py/plVw1 ):
x, y = Ints('x y')
f1 = ForAll(x, And(x == 0, Exists(y, x == y)))
f2 = ForAll(x, Exists(y, And(x == 0, x == y)))
print(f1.body())
print(f2.body())
Output:
ν0 = 0 ∧ (∃y : ν1 = y)
∃y : ν1 = 0 ∧ ν1 = y
In f1, why does the same bound variable x have different indices (0 and 1)? If I modify f1 and pull the Exists out, then x has the same index (1) everywhere, as in f2.
Reason I want to understand the indexing mechanism:
I have a FOL formula represented in a DSL in Scala that I want to send to z3. ScalaZ3 has a mkBound API for creating bound variables that takes an index and a sort as arguments, and I am not sure what value I should pass as the index. So I would like to know the following:
If I have two formulas phi1 and phi2 with maximum bound variable indexes n1 and n2, what would be the index of x in ForAll(x, And(phi1, phi2))?
Also, is there a way to show all the variables in indexed form? f1.body() shows only x in indexed form and not y. (I think the reason is that y is still bound in f1.body().)
Z3 encodes bound variables using de Bruijn indices.
The following wikipedia article describes de Bruijn indices in detail:
http://en.wikipedia.org/wiki/De_Bruijn_index
Remark: in the article above the indices start at 1, in Z3, they start at 0.
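To make the indexing concrete, here is a small pure-Python sketch of 0-based de Bruijn indexing. It uses a hypothetical tuple representation of formulas and does not call the Z3 API. The index of a bound variable at a given occurrence is the number of quantifiers between that occurrence and the quantifier that binds it, so the same variable can get different indices at different nesting depths.

```python
def to_de_bruijn(term, binders=()):
    """Replace bound variable names by 0-based de Bruijn indices.

    Terms are tuples (hypothetical representation, not Z3's):
    ("var", name), ("const", value), ("forall"/"exists", name, body),
    or (operator, child, child, ...).
    """
    kind = term[0]
    if kind == "const":
        return term
    if kind == "var":
        name = term[1]
        # Search from the innermost binder outwards
        for distance, bound in enumerate(reversed(binders)):
            if bound == name:
                return ("idx", distance)
        return term  # free variable: left unchanged
    if kind in ("forall", "exists"):
        _, name, body = term
        return (kind, to_de_bruijn(body, binders + (name,)))
    # Any other node is an operator applied to sub-terms
    return (kind,) + tuple(to_de_bruijn(child, binders) for child in term[1:])

x, y, zero = ("var", "x"), ("var", "y"), ("const", 0)

# ForAll(x, And(x == 0, Exists(y, x == y)))
f1 = ("forall", "x", ("and", ("eq", x, zero), ("exists", "y", ("eq", x, y))))
# ForAll(x, Exists(y, And(x == 0, x == y)))
f2 = ("forall", "x", ("exists", "y", ("and", ("eq", x, zero), ("eq", x, y))))

print(to_de_bruijn(f1))
print(to_de_bruijn(f2))
```

In f1, x gets index 0 outside the Exists and index 1 inside it, while in f2 every occurrence of x sits under both quantifiers and gets index 1. For ForAll(x, And(phi1, phi2)), the index of x at each occurrence therefore depends on how many quantifiers inside phi1 or phi2 enclose that occurrence, not on n1 or n2 directly.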
Regarding your second question, you can change the Z3 pretty printer.
The Z3 distribution contains the source code of the Python API. The pretty printer is implemented in the file python\z3printer.py.
You just need to replace the method:
def pp_var(self, a, d, xs):
    idx = z3.get_var_index(a)
    sz = len(xs)
    if idx >= sz:
        return seq1('Var', (to_format(idx),))
    else:
        return to_format(xs[sz - idx - 1])
with
def pp_var(self, a, d, xs):
    idx = z3.get_var_index(a)
    return seq1('Var', (to_format(idx),))
If you want to redefine the HTML pretty printer, you should also replace
def pp_var(self, a, d, xs):
    idx = z3.get_var_index(a)
    sz = len(xs)
    if idx >= sz:
        # 957 is the greek letter nu
        return to_format('&#957;<sub>%s</sub>' % idx, 1)
    else:
        return to_format(xs[sz - idx - 1])
with
def pp_var(self, a, d, xs):
    idx = z3.get_var_index(a)
    return to_format('&#957;<sub>%s</sub>' % idx, 1)