The objective is not DCP. Its following subexpressions are not - cvxpy

I am using cvxpy to solve the problem below, but the problem does not follow the DCP rules.
The objective is
import cvxpy as cp
import numpy as np
def bit_rate(alpha, beta, p, w):
    return alpha * w * cp.log(1 + beta * p / w)
# Create scalar optimization variables.
p_1 = cp.Variable()
p_2 = cp.Variable()
p_3 = cp.Variable()
w_1 = cp.Variable()
w_2 = cp.Variable()
w_3 = cp.Variable()
r_1 = bit_rate(2, 2, p_1, w_1)
r_2 = bit_rate(2.4, 2.4, p_2, w_2)
r_3 = bit_rate(2.8, 2.8, p_3, w_3)
# Create constraints.
constraints = [p_1 + p_2 + p_3 == 0.5,
               w_1 + w_2 + w_3 == 1,
               p_1 >= 0, p_2 >= 0, p_3 >= 0,
               w_1 >= 0, w_2 >= 0, w_3 >= 0]
# Form objective.
obj = cp.Maximize(r_1 + r_2 + r_3)
# Form and solve problem.
prob = cp.Problem(obj, constraints)
prob.solve() # Returns the optimal value.
print("status:", prob.status)
print("optimal value", prob.value)
print("p optimal var", p_1.value, p_2.value, p_3.value)
print("W optimal var", w_1.value, w_2.value, w_3.value)
p and w are the variables. How can I convert the problem into a DCP problem? Thanks!
The error is:
The objective is not DCP. Its following subexpressions are not:
2.0 * var0 / var3
2.4 * var1 / var4
2.8 * var2 / var5


Why macro F1 measure can't be calculated from macro precision and recall?

I'm interested in calculating the macro F1-score manually from macro precision and macro recall, but the results aren't equal. What is the difference between the formulas behind f1 and f1_new in the code?
from sklearn.metrics import precision_score, recall_score, f1_score
y_true = [0, 1, 0, 1, 0, 1, 1, 0]
y_pred = [0, 1, 0, 0, 1, 1, 0, 0]
p = precision_score(y_true, y_pred, average='macro')
r = recall_score(y_true, y_pred, average='macro')
f1_new = (2 * p * r) / (p + r) # 0.6291390728476821
f1 = f1_score(y_true, y_pred, average='macro') # 0.6190476190476191
print(f1_new == f1)
# False
The f1_score is calculated in scikit-learn as follows:
all_positives = 4
all_negatives = 4
true_positives = 2
true_negatives = 3
true_positive_rate = true_positives/all_positives = 2/4
true_negative_rate = true_negatives/all_negatives = 3/4
pred_positives = 3
pred_negatives = 5
positive_predicted_value = true_positives/pred_positives = 2/3
negative_predicted_value = true_negatives/pred_negatives = 3/5
f1_score_pos = 2 * true_positive_rate * positive_predicted_value / (true_positive_rate + positive_predicted_value)
             = 2 * 2/4 * 2/3 / (2/4 + 2/3)
f1_score_neg = 2 * true_negative_rate * negative_predicted_value / (true_negative_rate + negative_predicted_value)
             = 2 * 3/4 * 3/5 / (3/4 + 3/5)
f1 = average(f1_score_pos, f1_score_neg)
   = 2/4 * 2/3 / (2/4 + 2/3) + 3/4 * 3/5 / (3/4 + 3/5)
   = 0.6190476190476191
This matches the definition given in the documentation for the 'macro' parameter of scikit-learn's f1_score: "Calculate metrics for each label, and find their unweighted mean." This definition also applies to precision_score and recall_score.
Your manual calculation of the F1-score is as follows:
precision = average(positive_predicted_value, negative_predicted_value)
          = average(2/3, 3/5)
          = 19/30
recall = average(true_positive_rate, true_negative_rate)
       = average(2/4, 3/4)
       = 5/8
f1_new = 2 * precision * recall / (precision + recall)
       = 2 * 19/30 * 5/8 / (19/30 + 5/8)
       = 0.6291390728476821
In fact, the general formula F1 = 2 * (precision * recall) / (precision + recall) as presented in the docs is only valid for average='binary' and average='micro', but not for average='macro' and average='weighted'. In that sense, as it is currently presented in scikit-learn, the formula is misleading as it suggests that it holds irrespective of the chosen parameters, which is not the case.
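The two calculations above can be reproduced without scikit-learn. The following is a minimal sketch in plain Python; the helper name `per_class_scores` is hypothetical, and it assumes every label is both present and predicted at least once, so no division by zero occurs.

```python
y_true = [0, 1, 0, 1, 0, 1, 1, 0]
y_pred = [0, 1, 0, 0, 1, 1, 0, 0]


def per_class_scores(y_true, y_pred, label):
    """Return (precision, recall, f1), treating `label` as the positive class."""
    hits = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    predicted = sum(p == label for p in y_pred)
    actual = sum(t == label for t in y_true)
    precision = hits / predicted
    recall = hits / actual
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


scores = [per_class_scores(y_true, y_pred, label) for label in (0, 1)]

# Macro averaging: unweighted mean of each metric over the labels.
macro_precision = sum(s[0] for s in scores) / len(scores)
macro_recall = sum(s[1] for s in scores) / len(scores)
macro_f1 = sum(s[2] for s in scores) / len(scores)

# The manual shortcut from the question: F1 formula applied to macro averages.
f1_of_macro_averages = (2 * macro_precision * macro_recall
                        / (macro_precision + macro_recall))

print("mean of per-class F1:", macro_f1)
print("F1 of macro precision/recall:", f1_of_macro_averages)
```

The first number corresponds to f1 and the second to f1_new in the question: the mean of per-class harmonic means is generally not the harmonic mean of the per-class means.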

Scipy.optimize - minimize not respecting constraints

I am using the code below to understand how SciPy optimization/minimization works. The results do not match what I am expecting.
Minimize: f = 2*x[0]*x[1] + 2*x[0] - x[0]**2 - 2*x[1]**2
Subject to: -2*x[0] + 2*x[1] <= -2
            2*x[0] - 4*x[1] <= 0
            x[0]**3 - x[1] == 0
where: 0 <= x[0] <= inf
       1 <= x[1] <= inf
import numpy as np
from scipy.optimize import minimize
def objective(x):
    return 2.0*x[0]*x[1] + 2.0*x[0] - x[0]**2 - 2.0*x[1]**2
def constraint1(x):
    return +2.0*x[0] - 2.0*x[1] - 2.0
def constraint2(x):
    return -2.0*x[0] + 4.0*x[1]
def constraint3(x):
    sum_eq = x[0]**3.0 - x[1]
    return sum_eq
# initial guesses
n = 2
x0 = np.zeros(n)
x0[0] = 10.0
x0[1] = 100.0
# show initial objective
print('Initial SSE Objective: ' + str(objective(x0)))
# optimize
#b = (1.0,None)
bnds = ((0.0,1000.0), (1.0,1000.0))
con1 = {'type': 'ineq', 'fun': constraint1}
con2 = {'type': 'ineq', 'fun': constraint2}
con3 = {'type': 'eq', 'fun': constraint3}
cons = ([con1, con2, con3])
solution = minimize(objective, x0, method='SLSQP', bounds=bnds, constraints=cons)
print(solution)
x = solution.x
# show final objective
print('Final SSE Objective: ' + str(objective(x)))
# print solution
print('x1 = ' + str(x[0]))
print('x2 = ' + str(x[1]))
print('x', x)
print('constraint1', constraint1(x))
print('constraint2', constraint2(x))
print('constraint3', constraint3(x))
When I run it, this is what Python prints to the console:
Initial SSE Objective: -18080.0
fun: 2.0
jac: array([ 0.00000000e+00, -2.98023224e-08])
message: 'Optimization terminated successfully.'
nfev: 122
nit: 17
njev: 13
status: 0
success: True
x: array([2., 1.])
Final SSE Objective: 2.0
x1 = 2.0000000000010196
x2 = 1.0000000000012386
x [2. 1.]
constraint1 -4.3787196091216174e-13
constraint2 2.915001573455811e-12
constraint3 7.000000000010997
Although the optimizer says the run was successful, constraint3 is not respected: its value should be zero. What am I missing?
Your problem is infeasible: the constraints are incompatible. You can eliminate the third constraint by substituting x[1] = x[0]**3, which also reduces the problem to a scalar optimization, and then it is easier to see what goes wrong. From constraint 3 and the lower bound x[1] >= 1 it follows that x[0] cannot lie between 0 and 1, so the lower bound in the 1D problem should be 1. After the substitution, constraint 1 requires x[0] - x[0]**3 - 1 >= 0, but this expression is negative for every x[0] >= 1, so constraint 1 can never be satisfied. Constraint 2 becomes 2*x[0]**3 - x[0] >= 0, which always holds in that range.
When I run your original problem, it stops with a positive directional derivative message (and, for the rewritten problem, with 'Inequality constraints incompatible').
Which SciPy version are you using? For me it is 1.4.1.
In the picture below you can see the objective and the remaining constraints for the 1D problem (the horizontal axis is the original x[0] variable).
Minimize: f = 2*x[0]*x[1] + 2*x[0] - x[0]**2 - 2*x[1]**2
Subject to: -2*x[0] + 2*x[1] <= -2
            2*x[0] - 4*x[1] <= 0
            x[0]**3 - x[1] == 0
where: 0 <= x[0] <= inf
       1 <= x[1] <= inf
import numpy as np
from scipy.optimize import minimize
def objective(x):
    return 2*x**4 + 2*x - x**2 - 2*x**6
def constraint1(x):
    return x - x**3 - 1
def constraint2(x):
    return 2 * x**3 - x
# def constraint3(x):
# sum_eq = x[0]**3.0 -x[1]
# return sum_eq
# initial guesses
n = 1
x0 = np.zeros(n)
x0[0] = 2.
# x0[1] = 100.0
# show initial objective
print('Initial SSE Objective: ' + str(objective(x0)))
# optimize
#b = (1.0,None)
bnds = ((1.0,1000.0),)
con1 = {'type': 'ineq', 'fun': constraint1}
con2 = {'type': 'ineq', 'fun': constraint2}
# con3 = {'type': 'eq', 'fun': constraint3}
cons = [
    # con1,
    con2,
    # con3,
]
solution = minimize(objective, x0, method='SLSQP', bounds=bnds, constraints=cons)
x = solution.x
# show final objective
print('Final SSE Objective: ' + str(objective(x)))
# print solution
print('x1 = ' + str(x[0]))
# print('x2 = ' + str(x[1]))
print('x', x)
print('constraint1', constraint1(x))
print('constraint2', constraint2(x))
# print('constraint3', constraint3(x))
x_a = np.linspace(1, 2, 200)
f = objective(x_a)
c1 = constraint1(x_a)
c2 = constraint2(x_a)
import matplotlib.pyplot as plt
plt.plot(x_a, f, label="f")
plt.plot(x_a, c1, label="c1")
plt.plot(x_a, c2, label="c2")
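The infeasibility can also be checked numerically without running the optimizer. The sketch below evaluates the two inequality constraints of the 1D problem (after substituting x[1] = x[0]**3) on a grid; the grid range 1 to 10 and the function names are arbitrary illustrative choices. SciPy treats an 'ineq' constraint as satisfied when its function value is non-negative.

```python
def constraint1_1d(x):
    # 2*x0 - 2*x1 - 2 >= 0 with x1 = x0**3, divided by 2
    return x - x**3 - 1


def constraint2_1d(x):
    # -2*x0 + 4*x1 >= 0 with x1 = x0**3, divided by 2
    return 2 * x**3 - x


# Grid over the feasible range of the bound x0 >= 1 (upper end is arbitrary).
grid = [1 + 0.01 * k for k in range(901)]

largest_c1 = max(constraint1_1d(x) for x in grid)
smallest_c2 = min(constraint2_1d(x) for x in grid)

print("largest value of constraint 1 on the grid:", largest_c1)
print("smallest value of constraint 2 on the grid:", smallest_c2)
```

A negative maximum for constraint 1 means no grid point satisfies it, while a positive minimum for constraint 2 means that constraint is never the obstacle.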

Logistic Regression not able to find value of theta

I have a hundred entries in a CSV file.
Using the above data, I am trying to build a logistic (binary) classifier.
Please advise me where I am going wrong. Why am I getting a 3x3 matrix as the answer (9 values of theta, whereas there should be only 3)?
Here is the code:
Importing the libraries:
import numpy as np
import pandas as pd
from sklearn import preprocessing
Reading data from the CSV file:
df = pd.read_csv("LogisticRegressionFirstBinaryClassifier.csv", header=None)
df.columns = ["Maths", "Physics", "AdmissionStatus"]
X = np.array(df[["Maths", "Physics"]])
y = np.array(df[["AdmissionStatus"]])
X = preprocessing.normalize(X)
X = np.c_[np.ones(X.shape[0]), X]
theta = np.ones((X.shape[1], 1))
print(X.shape) # (100, 3)
print(y.shape) # (100, 1)
print(theta.shape) # (3, 1)
calc_z calculates the dot product of X and theta:
def calc_z(X, theta):
    return np.dot(X, theta)
Sigmoid function:
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
def cost_function(X, y, theta):
    z = calc_z(X, theta)
    h = sigmoid(z)
    return (-y * np.log(h) - (1 - y) * np.log(1 - h)).mean()
print("cost_function =" , cost_function(X, y, theta))
def derivativeofcostfunction(X, y, theta):
    z = calc_z(X, theta)
    h = sigmoid(z)
    calculation = np.dot((h - y).T, X)
    return calculation
print("derivativeofcostfunction=", derivativeofcostfunction(X, y, theta))
def grad_desc(X, y, theta, lr=.001, converge_change=.001):
    cost = cost_function(X, y, theta)
    change_cost = 1
    num_iter = 1
    while change_cost > converge_change:
        old_cost = cost
        print(derivativeofcostfunction(X, y, theta))
        theta = theta - lr * derivativeofcostfunction(X, y, theta)
        cost = cost_function(X, y, theta)
        change_cost = old_cost - cost
        num_iter += 1
    return theta, num_iter
Here is the output:
[[ 0.4185146 -0.56877556 0.63999433]
[15.39722864 9.73995197 11.07882445]
[12.77277463 7.93485324 9.24909626]]
[[0.33944777 0.58199037 0.52493407]
[0.02106587 0.36300629 0.30297278]
[0.07040604 0.3969297 0.33737757]]
[[-0.05856159 -0.89826735 0.30849185]
[15.18035041 9.59004868 10.92827046]
[12.4804775 7.73302024 9.04599788]]
[[0.33950634 0.58288863 0.52462558]
[0.00588552 0.35341624 0.29204451]
[0.05792556 0.38919668 0.32833157]]
[[-5.17526527e-01 -1.21534937e+00 -1.03387571e-02]
[ 1.49729502e+01 9.44663458e+00 1.07843504e+01]
[ 1.21978140e+01 7.53778010e+00 8.84964495e+00]]
(array([[ 0.34002386, 0.58410398, 0.52463592],
[-0.00908743, 0.34396961, 0.28126016],
[ 0.04572775, 0.3816589 , 0.31948193]]), 46)
I changed this code: I just added a transpose when returning the matrix, and that fixed my issue.
def derivativeofcostfunction(X, y, theta):
    z = calc_z(X, theta)
    h = sigmoid(z)
    calculation = np.dot((h - y).T, X)
    return calculation.T
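Why the transpose matters can be seen from the array shapes alone. The sketch below uses randomly generated stand-in data (the CSV from the question is not available, so `X` and `y` here are hypothetical) and shows that subtracting a (1, 3) gradient from a (3, 1) theta silently broadcasts to a (3, 3) matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the 100-row data set: bias column plus two features.
X = np.c_[np.ones(100), rng.random((100, 2))]
y = rng.integers(0, 2, size=(100, 1))
theta = np.ones((3, 1))


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


h = sigmoid(X @ theta)

grad_row = (h - y).T @ X   # shape (1, 3): the gradient as a row vector
grad_col = X.T @ (h - y)   # shape (3, 1): the same numbers as a column vector

# (3, 1) - (1, 3) broadcasts to (3, 3), which is the unexpected 9-value theta.
theta_wrong = theta - 0.001 * grad_row
# (3, 1) - (3, 1) keeps theta as a column of 3 values.
theta_right = theta - 0.001 * grad_col

print(grad_row.shape, grad_col.shape)
print(theta_wrong.shape, theta_right.shape)
```

Computing the gradient directly as `X.T @ (h - y)` gives the column shape without a separate transpose step.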

backpropagation algorithm in matlab

I'm writing a backpropagation algorithm in MATLAB, but I cannot get a working solution. I have read Haykin's book and some topics on the Internet about how other people implement it. I understand the algorithm thoroughly in theory, but I get a lot of errors in practice: my code produces NaN.
You can see it here.
I'm trying to classify points on a plane. The classes are three ellipses nested one inside another.
I wrote this function. The second layer learns, but the first layer doesn't.
function [E, W_1, W_2, B_1, B_2, X_3] = update(W_1, W_2, B_1, B_2, X_1, T, alpha)
    V_1 = W_1 * X_1 + B_1;
    X_2 = tansig(V_1);
    V_2 = W_2 * X_2 + B_2;
    X_3 = tansig(V_2);
    E = 1 / 2 * sum((T - X_3) .^ 2);
    dE = (T - X_3);
    for j = 1 : size(X_2, 1)
        delta_2_sum = 0;
        for i = 1 : size(X_3, 1)
            delta_2 = dE(i, 1) * dtansig(1, V_2(i, 1) );
            W_2_tmp(i, j) = W_2(i, j) - alpha * delta_2 * X_2(j, 1);
            B_2_tmp(i, 1) = B_2(i, 1) - alpha * delta_2;
        end
    end
    for k = 1 : size(X_1, 1)
        for j = 1 : size(X_2, 1)
            delta_2_sum = 0;
            for i = 1 : size(X_3, 1)
                delta_2 = dE(i, 1) * dtansig(1, V_2(i, 1) );
                delta_2_sum = delta_2_sum + W_2(i, j) * delta_2;
            end
            delta_1 = delta_2_sum * dtansig(1, V_1(j, 1) );
            W_1_tmp(j, k) = W_1(j, k) - alpha * delta_1 * X_1(k, 1);
            B_1_tmp(j, 1) = B_1(j, 1) - alpha * delta_1;
        end
    end
    if (min(W_1) < -10000 )
        X = 1;
    end
    B_1 = B_1_tmp;
    B_2 = B_2_tmp;
    W_1 = W_1_tmp
    W_2 = W_2_tmp;
end
I also wrote another variant, and it doesn't work either. I ran it with a 1-dimensional vector as input and as output, and I don't get the correct result.
What can I do?
I use the MATLAB nntool interface, but my backprop is written by hand.
How can I test my code?
function [net] = backProp(net, epoch, alpha)
    for u = 1 : epoch % Number of epochs
        for p = 1 : size(net.userdata{1, 1}, 2)
            % Train on every sample in the data set
            [~, ~, ~, De, Df, f] = frontProp(net, p, 1);
            for l = size(net.LW, 1) : -1 : 1 % Iterate over the layers
                if (size(net.LW, 1) == l )
                    delta{l} = De .* Df{l};
                else
                    % size(delta{l + 1})
                    % size(net.LW{l + 1})
                    delta{l} = Df{l} .* (delta{l + 1}' * net.LW{l + 1} )';
                end
                if (l == 1)
                    net.IW{l} + alpha * delta{l} * f{l}'
                    net.IW{l} = net.IW{l} + alpha * delta{l} * f{l}';
                else
                    net.LW{l} + alpha * delta{l} * f{l}'
                    net.LW{l} = net.LW{l} + alpha * delta{l} * f{l}';
                end
            end
        end
    end
end
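One common way to test a hand-written backpropagation routine is a finite-difference gradient check: compare the analytic gradients against numerical derivatives of the error. The sketch below does this in Python with NumPy for the same two-layer architecture (`tanh` standing in for `tansig`, error E = 1/2 * sum((T - X_3).^2)). The layer sizes, random data, and function names are illustrative assumptions, not taken from the question. Note that with this definition of E the derivative with respect to the output is X_3 - T, so the sign used for `dE` is worth comparing against the sign of the weight update.

```python
import numpy as np


def forward(params, x):
    W1, b1, W2, b2 = params
    x2 = np.tanh(W1 @ x + b1)    # hidden layer output
    x3 = np.tanh(W2 @ x2 + b2)   # network output
    return x2, x3


def loss(params, x, t):
    _, x3 = forward(params, x)
    return 0.5 * np.sum((t - x3) ** 2)


def backprop_grads(params, x, t):
    """Analytic gradients of the loss, in the order W1, b1, W2, b2."""
    W1, b1, W2, b2 = params
    x2, x3 = forward(params, x)
    delta2 = (x3 - t) * (1.0 - x3 ** 2)          # dE/dV_2, tanh' = 1 - tanh^2
    delta1 = (W2.T @ delta2) * (1.0 - x2 ** 2)   # dE/dV_1
    return [delta1 @ x.T, delta1, delta2 @ x2.T, delta2]


def numeric_grads(params, x, t, eps=1e-6):
    """Central finite differences, one parameter entry at a time."""
    grads = []
    for A in params:
        g = np.zeros_like(A)
        for idx in np.ndindex(A.shape):
            old = A[idx]
            A[idx] = old + eps
            up = loss(params, x, t)
            A[idx] = old - eps
            down = loss(params, x, t)
            A[idx] = old
            g[idx] = (up - down) / (2 * eps)
        grads.append(g)
    return grads


# Hypothetical sizes: 2 inputs, 4 hidden units, 3 outputs.
rng = np.random.default_rng(0)
params = [rng.normal(size=(4, 2)), rng.normal(size=(4, 1)),
          rng.normal(size=(3, 4)), rng.normal(size=(3, 1))]
x = rng.normal(size=(2, 1))
t = rng.uniform(-1, 1, size=(3, 1))

max_err = max(np.max(np.abs(a - n))
              for a, n in zip(backprop_grads(params, x, t),
                              numeric_grads(params, x, t)))
print("largest gradient mismatch:", max_err)
```

A mismatch close to zero suggests the analytic gradients are right; a large mismatch in one layer only points to the layer whose delta computation needs attention. The same idea carries over to MATLAB by perturbing one weight at a time and re-evaluating E.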

Gradient Descent Implementation in Python returns Nan

I am trying to implement gradient descent in Python. The implementation works when I try it with training_set1, but it returns not a number (nan) when I try it with training_set. Any idea why my code is broken?
from collections import namedtuple
TrainingInstance = namedtuple("TrainingInstance", ['X', 'Y'])
training_set1 = [TrainingInstance(0, 4), TrainingInstance(1, 7),
TrainingInstance(2, 7), TrainingInstance(3, 8),
TrainingInstance(8, 12)]
training_set = [TrainingInstance(60, 3.1), TrainingInstance(61, 3.6),
TrainingInstance(62, 3.8), TrainingInstance(63, 4),
TrainingInstance(65, 4.1)]
def grad_desc(x, x1):
    # minimize a cost function of two variables using gradient descent
    training_rate = 0.1
    iterations = 5000
    #while sqrd_error(x, x1) > 0.0000001:
    while iterations > 0:
        #print sqrd_error(x, x1)
        x, x1 = x - (training_rate * deriv(x, x1)), x1 - (training_rate * deriv1(x, x1))
        iterations -= 1
    return x, x1
def sqrd_error(x, x1):
    sum = 0.0
    for inst in training_set:
        sum += ((x + x1 * inst.X) - inst.Y)**2
    return sum / (2.0 * len(training_set))
def deriv(x, x1):
    sum = 0.0
    for inst in training_set:
        sum += ((x + x1 * inst.X) - inst.Y)
    return sum / len(training_set)
def deriv1(x, x1):
    sum = 0.0
    for inst in training_set:
        sum += ((x + x1 * inst.X) - inst.Y) * inst.X
    return sum / len(training_set)
if __name__ == "__main__":
    print(grad_desc(2, 2))
Reduce training_rate so that the objective decreases at each iteration.
See Figure 6. in this paper:
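To illustrate the effect of the step size, the sketch below applies the same update rule to the values of training_set with two different rates. The helper names (`cost`, `step`, `run`) and the smaller rate 0.0001 are illustrative assumptions; the smaller rate was picked because the inputs are around 60, which makes the gradient with respect to the slope very large.

```python
import math

# Same (X, Y) pairs as training_set in the question.
DATA = [(60, 3.1), (61, 3.6), (62, 3.8), (63, 4.0), (65, 4.1)]


def cost(b, m):
    # Mean squared error divided by 2; uses multiplication instead of **
    # so that overflow yields inf rather than raising OverflowError.
    return sum((b + m * x - y) * (b + m * x - y) for x, y in DATA) / (2.0 * len(DATA))


def step(b, m, rate):
    n = len(DATA)
    grad_b = sum(b + m * x - y for x, y in DATA) / n
    grad_m = sum((b + m * x - y) * x for x, y in DATA) / n
    return b - rate * grad_b, m - rate * grad_m


def run(rate, iterations):
    b, m = 2.0, 2.0  # same starting point as grad_desc(2, 2)
    history = [cost(b, m)]
    for _ in range(iterations):
        b, m = step(b, m, rate)
        history.append(cost(b, m))
    return b, m, history


b_big, m_big, hist_big = run(0.1, 5000)
b_small, m_small, hist_small = run(0.0001, 1000)

print("rate 0.1    ->", b_big, m_big)
print("rate 0.0001 ->", b_small, m_small)
print("cost went down with the small rate:", hist_small[-1] < hist_small[0])
```

With the large rate each step overshoots the minimum by more than the previous one, so the parameters grow until they overflow to inf and then become nan. With the small rate the cost decreases, although convergence is slow; rescaling the inputs is another option worth considering.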
