Problem using GEKKO to do regime change detection (estimating time-varying variables) - time-series

Using GEKKO Python, we are having trouble learning a parameter that can vary multiple times per day. In some disciplines this is also known as 'regime detection' or 'regime change detection'. We (my colleague Henri ter Hofte from Windesheim University of Applied Sciences and I) conceived 3 strategies, but our attempts fail (more below).
Our question(s):
What are we doing wrong, is there an obvious error in our GEKKO code (more below in the details)?
Is strategy I doomed to fail, and should we switch to strategy II or III?
Is GEKKO Python even suitable for doing this kind of regime (change) detection?
Your help is much appreciated.
=== The problem:
We have time series data about:
(1) CO₂ concentration
(2) ventilation rates (or rather: valve fractions, which give ventilation rates, when multiplied with known maximum ventilation rates)
(3) occupancy (number of persons in a room)
For research question (A) we would like to know a proper estimate for (2) for each hour of the day, given time series data about (1) and (3).
For research question (B) we would like to know a proper estimate for (3) for each hour of the day, given time series data about (1) and (2).
We focus on research question A (but have similar questions for B).
=== The 3 strategies:
We considered 3 different strategies how to implement this using GEKKO Python:
Strategy I. Declare the variable valve_frac as a Manipulated Variable in our GEKKO model (m.MV), since the GEKKO documentation says these variables can be "adjusted by the optimizer to minimize an objective function at every time point", and "Manipulated variables are like FVs, but can change with each data row, either calculated by the optimizer (STATUS=1) or specified by the user (STATUS=0)", according to https://gekko.readthedocs.io/en/latest/imode.html#mv
Strategy II. Split the time into several shorter time spans (e.g. one time span per hour) and then learn valve_frac as a GEKKO Fixed Variable (m.FV), one for each hour.
Strategy III. Reframe the problem to GEKKO as if it were a control problem: the setpoint is reaching a particular CO₂ concentration, and GEKKO can use valve_frac as a Controlled Variable (m.CV); see the sketch after this list.
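For reference, a minimal sketch (our own, untested) of how the three declarations could look in GEKKO; the bounds, initial values, and the setpoint are assumptions, not part of the original question:

# Strategy I: one degree of freedom per time point
valve_frac__0 = m.MV(value=0.5, lb=0, ub=1)
valve_frac__0.STATUS = 1    # optimizer may adjust it
valve_frac__0.FSTATUS = 0   # no measurement feedback

# Strategy II: one constant per hourly window (re-solve the model per window)
valve_frac__0 = m.FV(value=0.5, lb=0, ub=1)
valve_frac__0.STATUS = 1

# Strategy III: drive CO2 toward a setpoint and let valve_frac move (IMODE=6)
co2__ppm = m.CV(value=co2_init)   # co2_init is a hypothetical initial value
co2__ppm.STATUS = 1
co2__ppm.SP = 800                 # hypothetical setpoint [ppm]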
We tried implementing strategy I (see more info and code below) but fail to get proper results.
Based on an equation derived from physics, we intend to find the best value for a specific variable (the valve_frac__0 column in the following table), given a dataframe (df_learn) like this:
Index  Date-Time              occupancy__p  valve_frac__0  co2__ppm
1      2022.12.01 – 00:00:00  0             0.51           546
2      2022.12.01 – 00:15:00  4             0.85           820
3      2022.12.01 – 00:30:00  1             0.21           595
4      2022.12.01 – 00:45:00  2             0.74           635
5      2022.12.01 – 00:15:00  0             0.65           559
6      2022.12.01 – 00:15:00  0             0.45           538
7      2022.12.01 – 00:15:00  2             0.82           659
...    ...                    ...           ...            ...
1920   2022.12.20 – 00:15:00  3             0.73           749
We are trying to develop a moving horizon estimation model (IMODE=5) or control model (IMODE=6) to predict the valve_frac__0 value. The code, in GEKKO format, follows:
=== Code:
from gekko import GEKKO
import numpy as np
# Gekko Model - Initialize
m = GEKKO(remote = False)
# duration__s, step__s, vent_max__m3_s_1, room__m3 and df_learn are defined elsewhere
m.time = np.arange(0, duration__s, step__s)
# Conversion factors
s_min_1 = 60
min_h_1 = 60
s_h_1 = s_min_1 * min_h_1
mL_m_3 = 1e3 * 1e3
million = 1e6
# Constants
MET__mL_min_1_kg_1_p_1 = 3.5
desk_work__MET = 1.5
P_std__Pa = 101325
R__m3_Pa_K_1_mol_1 = 8.3145
T_room__degC = 20.0
T_std__degC = 0.0
T_zero__K = 273.15
T_std__K = T_zero__K + T_std__degC
T_room__K = T_zero__K + T_room__degC
infilt__m2 = 0.001
# Approximations
room__mol_m_3 = P_std__Pa / (R__m3_Pa_K_1_mol_1 * T_room__K)
std__mol_m_3 = P_std__Pa / (R__m3_Pa_K_1_mol_1 * T_std__K)
co2_ext__ppm = 415
# National averages
weight__kg = 77.5
MET__m3_s_1_p_1 = MET__mL_min_1_kg_1_p_1 * weight__kg / (s_min_1 * mL_m_3)
MET_mol_s_1_p_1 = MET__m3_s_1_p_1 * std__mol_m_3
co2_o2 = 0.894
co2__mol0_p_1_s_1 = co2_o2 * desk_work__MET * MET_mol_s_1_p_1
# Room averages
wind__m_s_1 = 3.0
# GEKKO Manipulated Variables: measured values
occupancy__p = m.MV(value = df_learn.occupancy__p.values)
occupancy__p.STATUS = 0; occupancy__p.FSTATUS = 1
# Strategy I:
valve_frac__0 = m.MV(value = df_learn.valve_frac__0.values)
valve_frac__0.STATUS = 1; valve_frac__0.FSTATUS = 0
# Strategy II:
#valve_frac__0 = m.FV(value = df_learn.valve_frac__0.values)
#valve_frac__0.STATUS = 1; valve_frac__0.FSTATUS = 0
# GEKKO Controlled Variable (predicted variable)
co2__ppm = m.CV(value = df_learn.co2__ppm.values)
co2__ppm.STATUS = 1; co2__ppm.FSTATUS = 1
# GEKKO - Equations
co2_loss__ppm_s_1 = m.Intermediate((co2__ppm - co2_ext__ppm) * (vent_max__m3_s_1 * valve_frac__0 + wind__m_s_1 * infilt__m2) / room__m3)
co2_gain_mol0_s_1 = m.Intermediate(occupancy__p * co2__mol0_p_1_s_1 / (room__m3 * room__mol_m_3))
co2_gain__ppm_s_1 = m.Intermediate(co2_gain_mol0_s_1 * million)
m.Equation(co2__ppm.dt() == co2_gain__ppm_s_1 - co2_loss__ppm_s_1)
# GEKKO - Solver setting
m.options.IMODE = 5
m.options.EV_TYPE = 1
m.options.NODES = 2
m.solve(disp = False)
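In plain math, the m.Equation above implements this CO₂ mass balance (our restatement, using the code's symbols):

$$\frac{dC}{dt} = \frac{N \cdot \dot n_{CO_2}}{V \cdot c_{air}} \cdot 10^6 \;-\; (C - C_{ext}) \cdot \frac{Q_{max} \cdot f + v_{wind} \cdot A_{infilt}}{V}$$

with C = co2__ppm, N = occupancy__p, f = valve_frac__0, Q_max = vent_max__m3_s_1, V = room__m3, c_air = room__mol_m_3, and the per-person CO₂ source ṅ_CO2 = co2__mol0_p_1_s_1.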
The results I got for each strategy are as follows:
Strategy I:
There is no output for the simulated co2__ppm, and the output value is
valve_frac__0 = 0
Strategy II:
There is a big difference between the simulated and measured co2__ppm, and the output value is
valve_frac__0 = 0.166 (which is not reasonable)

The code looks like it should work, as long as valve_frac__0 is the adjustable unknown parameter that is estimated from the CO2 ppm data. Here is a result on a smaller subset of the posted data (the results.png plot produced by the script below).
The data doesn't fit exactly if there is a lower bound of zero on the valve position.
valve_frac__0 = m.MV(value = valve_frac__0,lb=0)
Otherwise, the valve position can be adjusted to fit the CO2 data perfectly.
Here is a complete script with the sample data.
from gekko import GEKKO
import numpy as np
# Gekko Model - Initialize
m = GEKKO(remote = False)
# data
# 1 2022.12.01 – 00:00:00 0 0.51 546
# 2 2022.12.01 – 00:15:00 4 0.85 820
# 3 2022.12.01 – 00:30:00 1 0.21 595
# 4 2022.12.01 – 00:45:00 2 0.74 635
# 5 2022.12.01 – 00:15:00 0 0.65 559
# 6 2022.12.01 – 00:15:00 0 0.45 538
# 7 2022.12.01 – 00:15:00 2 0.82 659
occupancy__p = np.array([0,4,1,2,0,0,2])
valve_frac__0 = np.array([0.51,0.85,0.21,0.74,0.65,0.45,0.82])
co2__ppm_meas = np.array([546,820,595,635,559,538,659])
duration__s = len(co2__ppm_meas)
m.time = np.linspace(0,duration__s-1,duration__s)
vent_max__m3_s_1 = 1
room__m3 = 1
# Conversion factors
s_min_1 = 60
min_h_1 = 60
s_h_1 = s_min_1 * min_h_1
mL_m_3 = 1e3 * 1e3
million = 1e6
# Constants
MET__mL_min_1_kg_1_p_1 = 3.5
desk_work__MET = 1.5
P_std__Pa = 101325
R__m3_Pa_K_1_mol_1 = 8.3145
T_room__degC = 20.0
T_std__degC = 0.0
T_zero__K = 273.15
T_std__K = T_zero__K + T_std__degC
T_room__K = T_zero__K + T_room__degC
infilt__m2 = 0.001
# Approximations
room__mol_m_3 = P_std__Pa / (R__m3_Pa_K_1_mol_1 * T_room__K)
std__mol_m_3 = P_std__Pa / (R__m3_Pa_K_1_mol_1 * T_std__K)
co2_ext__ppm = 415
# National averages
weight__kg = 77.5
MET__m3_s_1_p_1 = MET__mL_min_1_kg_1_p_1 \
* weight__kg / (s_min_1 * mL_m_3)
MET_mol_s_1_p_1 = MET__m3_s_1_p_1 * std__mol_m_3
co2_o2 = 0.894
co2__mol0_p_1_s_1 = co2_o2 * desk_work__MET * MET_mol_s_1_p_1
# Room averages
wind__m_s_1 = 3.0
# GEKKO Manipulated Variables: measured values
occupancy__p = m.MV(value = occupancy__p)
occupancy__p.STATUS = 0; occupancy__p.FSTATUS = 1
# Strategy I:
valve_frac__0 = m.MV(value = valve_frac__0,lb=0)
valve_frac__0.STATUS = 1; valve_frac__0.FSTATUS = 0
# Strategy II:
#valve_frac__0 = m.FV(value = df_learn.valve_frac__0.values)
#valve_frac__0.STATUS = 1; valve_frac__0.FSTATUS = 0
# GEKKO Controlled Variable (predicted variable)
co2__ppm = m.CV(value = co2__ppm_meas)
co2__ppm.STATUS = 1; co2__ppm.FSTATUS = 1
# GEKKO - Equations
co2_loss__ppm_s_1 = m.Intermediate((co2__ppm - co2_ext__ppm) \
* (vent_max__m3_s_1 * valve_frac__0 \
+ wind__m_s_1 * infilt__m2) / room__m3)
co2_gain_mol0_s_1 = m.Intermediate(occupancy__p \
* co2__mol0_p_1_s_1 / (room__m3 * room__mol_m_3))
co2_gain__ppm_s_1 = m.Intermediate(co2_gain_mol0_s_1 * million)
m.Equation(co2__ppm.dt() == co2_gain__ppm_s_1 - co2_loss__ppm_s_1)
# GEKKO - Solver setting
m.options.IMODE = 5
m.options.EV_TYPE = 1
m.options.NODES = 2
m.options.SOLVER = 1
m.solve(disp = True)
import matplotlib.pyplot as plt
plt.subplot(2,1,1)
plt.plot(m.time,valve_frac__0.value,'r-',label='Valve Frac')
plt.legend(); plt.grid(); plt.ylabel('Valve Frac')
plt.subplot(2,1,2)
plt.plot(m.time,co2__ppm_meas,'ko',label='Measured')
plt.plot(m.time,co2__ppm.value,'k--',label='Predicted')
plt.legend(); plt.grid()
plt.xlabel('Time'); plt.ylabel('CO2')
plt.savefig('results.png',dpi=300)
plt.show()
For question B, adjust the code to make the valve position fixed at the measured values and the occupancy determined by the optimizer.
occupancy__p = m.MV(value = occupancy__p)
occupancy__p.STATUS = 1; occupancy__p.FSTATUS = 0
# Strategy I:
valve_frac__0 = m.MV(value = valve_frac__0,lb=0)
valve_frac__0.STATUS = 0; valve_frac__0.FSTATUS = 1
Use occupancy__p.MV_STEP_HOR = 2 or higher to decrease the frequency at which the optimized parameter can change (e.g. only every 2 time steps).
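Because occupancy is a whole number of persons, a possible refinement (our suggestion, not part of the original answer) is to declare it as an integer variable; the upper bound here is a hypothetical value:

# Sketch: integer occupancy estimate for question B
occupancy__p = m.MV(value=occupancy__p, lb=0, ub=10, integer=True)
occupancy__p.STATUS = 1        # optimizer estimates occupancy
occupancy__p.FSTATUS = 0       # ignore the measured values
occupancy__p.MV_STEP_HOR = 2   # allow a change only every 2 time steps
m.options.SOLVER = 1           # APOPT handles integer variables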

Related

How to calculate F1 Score for Multi-label Classification

I am trying to calculate the F1 score (and accuracy) for my multi-label classification problem. Could you please check whether I'm calculating it correctly? Note that I'm calculating IoU (intersection over union) when the model predicts an object as 1, and mark it as a TP only if the IoU is greater than or equal to 0.5.
GT labels: 14 x 10 x 128
Output: 14 x 10 x 128
where 14 is the batch_size, 10 is the sequence_length, and 128 is the object vector (i.e., 1 if the object at an index belongs to the sequence and 0 otherwise).
def calculate_performance_metrics(gt_labels, predicted_labels, total_padded_elements):
    # check if TP pred objects overlap with TP gt objects;
    # we only want the batch and object indices, i.e. the 0 and 2 indices
    TP_INDICES = (torch.logical_and(predicted_labels == 1, gt_labels == 1)).nonzero()
    TP = calculate_tp_with_iou()  # details of this don't matter for now
    FP = torch.sum(torch.logical_and(predicted_labels, 1 - gt_labels)).item()
    TN = torch.sum(torch.logical_and(1 - predicted_labels, 1 - gt_labels)).item()
    FN = torch.sum(torch.logical_and(1 - predicted_labels, gt_labels)).item()
    return float(TP), float(FP), float(TN - total_padded_elements), float(FN)

for epoch in range(10):
    EPOCH_TP = EPOCH_FP = EPOCH_TN = EPOCH_FN = 0.
    EPOCH_PRECISION = EPOCH_RECALL = EPOCH_F1 = 0.
    for inputs, gt_labels, masks in tr_dl:
        outputs = model(inputs)  # out shape: (14, 10, 128)
        # mask shape: (14, 10), so expand it to the shape of the output
        masks = masks[:, :, None].expand_as(outputs)
        # consider all predictions above 0.5 as 1, rest as 0
        pred_labels = (torch.sigmoid(outputs) >= 0.5).float().type(torch.int64)
        pred_labels = pred_labels * masks
        gt_labels = (gt_labels * masks).type(torch.int64)
        # needed to get accurate true negatives
        total_padded_elements = masks.numel() - masks.sum()
        batch_tp, batch_fp, batch_tn, batch_fn = calculate_performance_metrics(
            gt_labels, pred_labels, total_padded_elements)
        EPOCH_TP += batch_tp
        EPOCH_FP += batch_fp
        EPOCH_TN += batch_tn
        EPOCH_FN += batch_fn
    EPOCH_ACCURACY = (EPOCH_TP + EPOCH_TN) / (EPOCH_TP + EPOCH_TN + EPOCH_FP + EPOCH_FN)
    if EPOCH_TP + EPOCH_FP > 0:
        EPOCH_PRECISION = EPOCH_TP / (EPOCH_TP + EPOCH_FP)
    if EPOCH_TP + EPOCH_FN > 0:
        EPOCH_RECALL = EPOCH_TP / (EPOCH_TP + EPOCH_FN)
    if EPOCH_PRECISION + EPOCH_RECALL > 0:
        EPOCH_F1 = (2 * EPOCH_PRECISION * EPOCH_RECALL) / (EPOCH_PRECISION + EPOCH_RECALL)
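As a rough cross-check (our sketch with random stand-in tensors, ignoring the IoU refinement and the padding mask), the plain binary counts can be compared against scikit-learn:

import torch
from sklearn.metrics import f1_score, accuracy_score

torch.manual_seed(0)
pred = torch.randint(0, 2, (14, 10, 128))
gt = torch.randint(0, 2, (14, 10, 128))
# flatten to one long binary vector, exactly as the manual TP/FP/TN/FN counts do
print(f1_score(gt.flatten().numpy(), pred.flatten().numpy()))
print(accuracy_score(gt.flatten().numpy(), pred.flatten().numpy()))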

calculation of performance metrics in the multi-class classification

I am using an XGBoost classifier that classifies X-ray images into 3 classes.
My problem is that when I calculate these values manually (by hand) using the confusion matrix, I get values that differ from the classification report, even though I used all the standard equations.
Please help me with how to calculate these values (accuracy, precision and recall) by hand.
Here is the classification report:

              precision    recall  f1-score   support
           0     1.0000    0.9052    0.9502       116
           1     0.8267    0.9180    0.8700       317
           2     0.9627    0.9357    0.9490       855
    accuracy                         0.9286      1288
   macro avg     0.9298    0.9196    0.9231      1288
weighted avg     0.9326    0.9286    0.9297      1288

and this is the (row-normalized) confusion matrix:

[[0.90 0.05 0.04]
 [0.00 0.91 0.08]
 [0.00 0.06 0.93]]
Accuracy
Of all predictions, how many are correct in total? (Closer to 1 is better.) TP plus TN, divided by the sum of all entries.
Recall
Of the samples that are actually positive, what proportion is predicted positive?
Precision
Of the samples predicted positive, what proportion is actually positive?
Okay, let's do a 3 x 3 confusion matrix example. (The matrix itself was shown as an image; what matters for the numbers below: 100 samples, actual class sizes 20/30/50, predicted class sizes 24/20/56, and diagonal entries 15/15/45.)
class A precision = 15 / 24 = 0.625
class B precision = 15 / 20 = 0.75
class C precision = 45 / 56 = 0.80
class A recall = 15 / 20 = 0.75
class B recall = 15 / 30 = 0.5
class C recall = 45 / 50 = 0.9
Accuracy of classifier = (15 + 15 + 45) / 100 = 0.75
Weighted Average Precision = (fraction of actual class A instances) * precision of class A + (fraction of actual class B instances) * precision of class B + (fraction of actual class C instances) * precision of class C
= 20/100 * 0.625 + 30/100 * 0.75 + 50/100 * 0.8
= 0.75
Weighted Average Recall = (fraction of actual class A instances) * recall of class A + (fraction of actual class B instances) * recall of class B + (fraction of actual class C instances) * recall of class C
= 20/100 * 0.75 + 30/100 * 0.5 + 50/100 * 0.9
= 0.75
In your case:
class A precision = 0.90 / 0.90 = 1
class B precision = 0.91 / 1.02 = 0.89
class C precision = 0.93 / 1.05 = 0.89
class A recall = 0.90 / 0.99 = 0.91
class B recall = 0.91 / 0.99 = 0.92
class C recall = 0.93 / 0.99 = 0.94
Accuracy of classifier = (0.90 + 0.91 + 0.93) / 2.97 = 0.92
Weighted Average Precision = 0.99/2.97 * 1 + 0.99/2.97 * 0.89 + 0.99/2.97 * 0.89 = 0.93
Weighted Average Recall = 0.99/2.97 * 0.91 + 0.99/2.97 * 0.92 + 0.99/2.97 * 0.94 = 0.92
Note that these are computed from the row-normalized matrix, which implicitly weights all classes equally; with the real supports (116/317/855) the per-class precisions differ, which is why hand calculations on the normalized matrix do not match the report exactly.
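A small sketch of how to recover the report approximately: scale each row of the normalized matrix by its support before taking column sums (approximate, because the displayed matrix is rounded to two decimals):

import numpy as np

cm_norm = np.array([[0.90, 0.05, 0.04],
                    [0.00, 0.91, 0.08],
                    [0.00, 0.06, 0.93]])
support = np.array([116, 317, 855])

counts = cm_norm * support[:, None]           # approximate raw counts per cell
precision = counts.diagonal() / counts.sum(axis=0)   # column sums
recall    = counts.diagonal() / counts.sum(axis=1)   # row sums
accuracy  = counts.diagonal().sum() / counts.sum()
print(precision.round(4), recall.round(4), round(accuracy, 4))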

How to find and update levels accordingly based on points?

I am creating a Rails application which is like a game, so it has points and levels. For example: to reach level one the user has to collect at least 100 points, and to reach level two the user has to collect 200 points. The gap between consecutive levels changes every 10 levels: the difference in points between levels one and two is 100, while the difference between levels 11 and 12 is 150, and so on. There is no upper bound for levels.
Now my question: let's say a user's total points were 3150 and just got updated to 3155. What's the optimal way to find the current level and update it if needed?
I can get a solution using nested while loops, which gives a result in O(n^2). I need something better.
I think this code works but I'm not sure if this is the best way to go about it
def get_level(points)
  diff = 100
  sum = 0
  level = -1
  current_level = 0
  while level.negative?
    10.times do |i|
      current_level += 1
      sum += diff
      if points > sum
        next
      elsif points <= sum
        level = current_level
        break
      end
    end
    diff += 50
  end
  puts level
end
I wrote a get_points function (it should not be difficult), and then, based on it, a get_level function, in which it was necessary to solve a quadratic equation to find high and then compute low from it.
If you have any questions, let me know. You can check the output by running the script below.
#!/usr/bin/env python3
import math

def get_points(level):
    high = (level + 1) // 10
    low = (level + 1) % 10
    high_point = 250 * high * high + 750 * high  # (3 + high) * high // 2 * 500
    low_point = (100 + 50 * high) * low
    return low_point + high_point

def get_level(points):
    # quadratic equation
    a = 250
    b = 750
    c = -points
    d = b * b - 4 * a * c
    x = (-b + math.sqrt(d)) / (2 * a)
    high = int(x)
    remainder = points - (250 * high * high + 750 * high)
    low = remainder // (100 + 50 * high)
    level = high * 10 + low
    return level

def main():
    for l in range(0, 40):
        print(f'{l:3d} {get_points(l - 1):5d}..{get_points(l) - 1}')
    for level, (l, r) in (
        (1, (100, 199)),
        (2, (200, 299)),
        (9, (900, 999)),
        (10, (1000, 1149)),
        (11, (1150, 1299)),
        (19, (2350, 2499)),
        (20, (2500, 2699)),
    ):
        for p in range(l, r + 1):  # for in [l, r]
            assert get_level(p) == level, f'{p} {l}'

if __name__ == '__main__':
    main()
Why did you set the value of a=250 and b = 750? Can you explain that to me please?
Let's write out every 10th level and the point difference between them:
lvl - pnt (+delta)
10 - 1000 (+1000 = +100 * 10)
20 - 2500 (+1500 = +150 * 10)
30 - 4500 (+2000 = +200 * 10)
40 - 7000 (+2500 = +250 * 10)
Divide by 500 (10 levels times the 50-point difference change) and we get an arithmetic progression starting at 2:
10 - 2 (+2)
20 - 5 (+3)
30 - 9 (+4)
40 - 14 (+5)
Using the arithmetic progression sum formula, the points for level = k * 10 equal:
sum(x for x in 2..k+1) * 500 =
(2 + k + 1) * k / 2 * 500 =
(3 + k) * k * 250 =
250 * k * k + 750 * k
Now we have points and want to find the maximum high such that points >= 250 * high^2 + 750 * high, i.e. 250 * high^2 + 750 * high - points <= 0. The value a = 250 is positive, so the branches of the parabola point up. We find the positive root of the quadratic equation 250 * high^2 + 750 * high - points = 0 and keep only its integer part (high = int(x) in the Python script).
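As a quick sanity check of the closed form against the table above:

# Each line should reproduce the table: 10 -> 1000, 20 -> 2500, 30 -> 4500, 40 -> 7000
for k in range(1, 5):
    print(k * 10, 250 * k * k + 750 * k)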

Batch Training Accuracy is always multiple of 10%

So I am training a CNN and computing the training accuracy for each batch. Most of the time it gives 100% batch training accuracy, which I thought was okay because I'm testing my model against the data I trained it with. But at some iterations I get a 90% batch training accuracy, and at worst it sometimes drops to 0% and quickly bounces back to 100%. I used the algorithm in https://github.com/Hvass-Labs/TensorFlow-Tutorials/blob/master/04_Save_Restore.ipynb; they also compute the batch training accuracy, but they don't get the results I get: they start out at around 80% batch training accuracy and observe a gradual increase to 98%. Why is this?
I suspect that my network is overfitting.
Here is my exact code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
import tensorflow as tf
import pyfftw
from scipy import signal
import xlrd
from tensorflow.python.tools import freeze_graph
from tensorflow.python.tools import optimize_for_inference_lib
import time
from datetime import timedelta
import math
import os
from sklearn.metrics import confusion_matrix
##matplotlib inline
plt.style.use('ggplot')
## define functions
def read_data(file_path):
    ## column_names = ['user-id','activity','timestamp', 'x-axis', 'y-axis', 'z-axis']
    column_names = ['activity','timestamp', 'Ax', 'Ay', 'Az', 'Gx', 'Gy', 'Gz', 'Mx', 'My', 'Mz'] ## 3 sensors
    data = pd.read_csv(file_path, header=None, names=column_names)
    return data

def feature_normalize(dataset):
    mu = np.mean(dataset, axis=0)
    sigma = np.std(dataset, axis=0)
    return (dataset - mu) / sigma

def plot_axis(ax, x, y, title):
    ax.plot(x, y)
    ax.set_title(title)
    ax.xaxis.set_visible(False)
    ax.set_ylim([min(y) - np.std(y), max(y) + np.std(y)])
    ax.set_xlim([min(x), max(x)])
    ax.grid(True)

def plot_activity(activity, data):
    fig, (ax0, ax1, ax2) = plt.subplots(nrows=3, figsize=(15, 10), sharex=True)
    plot_axis(ax0, data['timestamp'], data['Ax'], 'x-axis')
    plot_axis(ax1, data['timestamp'], data['Ay'], 'y-axis')
    plot_axis(ax2, data['timestamp'], data['Az'], 'z-axis')
    plt.subplots_adjust(hspace=0.2)
    fig.suptitle(activity)
    plt.subplots_adjust(top=0.90)
    plt.show()

def windows(data, size):
    start = 0
    while start < data.count():
        yield start, start + size
        start += (size / 2)

def segment_signal(data, window_size=None, num_channels=None):  # edited
    # changed from 3 to 9 channels for AGM fusion; use variable num_channels=9
    segments = np.empty((0, window_size, num_channels))
    labels = np.empty((0))
    for (n_start, n_end) in windows(data['timestamp'], window_size):
        ## x = data["x-axis"][start:end]
        ## y = data["y-axis"][start:end]
        ## z = data["z-axis"][start:end]
        n_start = int(n_start)
        n_end = int(n_end)
        Ax = data["Ax"][n_start:n_end]
        Ay = data["Ay"][n_start:n_end]
        Az = data["Az"][n_start:n_end]
        Gx = data["Gx"][n_start:n_end]
        Gy = data["Gy"][n_start:n_end]
        Gz = data["Gz"][n_start:n_end]
        Mx = data["Mx"][n_start:n_end]
        My = data["My"][n_start:n_end]
        Mz = data["Mz"][n_start:n_end]
        # include only windows with the full window_size samples
        if len(data['timestamp'][n_start:n_end]) == window_size:
            segments = np.vstack([segments, np.dstack([Ax, Ay, Az, Gx, Gy, Gz, Mx, My, Mz])])
            labels = np.append(labels, stats.mode(data["activity"][n_start:n_end])[0][0])
    return segments, labels

def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.0, shape=shape)
    return tf.Variable(initial)

def depthwise_conv2d(x, W):
    return tf.nn.depthwise_conv2d(x, W, [1, 1, 1, 1], padding='VALID')

def apply_depthwise_conv(x, weights, biases):
    return tf.nn.relu(tf.add(depthwise_conv2d(x, weights), biases))

def apply_max_pool(x, kernel_size, stride_size):
    return tf.nn.max_pool(x, ksize=[1, 1, kernel_size, 1],
                          strides=[1, 1, stride_size, 1], padding='VALID')
#------------------------get dataset----------------------#
## run shoaib_dataset.py to generate dataset_shoaib_total.txt
## get data from dataset_shoaib_total.txt
dataset = read_data('dataset_shoaib_total.txt')
#--------------------preprocessing------------------------#
dataset['Ax'] = feature_normalize(dataset['Ax'])
dataset['Ay'] = feature_normalize(dataset['Ay'])
dataset['Az'] = feature_normalize(dataset['Az'])
dataset['Gx'] = feature_normalize(dataset['Gx'])
dataset['Gy'] = feature_normalize(dataset['Gy'])
dataset['Gz'] = feature_normalize(dataset['Gz'])
dataset['Mx'] = feature_normalize(dataset['Mx'])
dataset['My'] = feature_normalize(dataset['My'])
dataset['Mz'] = feature_normalize(dataset['Mz'])
###--------------------plot activity data----------------#
##for activity in np.unique(dataset["activity"]):
## subset = dataset[dataset["activity"] == activity][:180]
## plot_activity(activity,subset)
#------------------fixed hyperparameters--------------------#
window_size = 200 #from 90 #FIXED at 4 seconds
#----------------input hyperparameters------------------#
input_height = 1
input_width = window_size
num_labels = 6
num_channels = 9 #from 3 channels #9 channels for AGM
#-------------------sliding time window----------------#
segments, labels = segment_signal(dataset, window_size=window_size, num_channels=num_channels)
labels = np.asarray(pd.get_dummies(labels), dtype = np.int8)
reshaped_segments = segments.reshape(len(segments), (window_size*num_channels)) #use variable num_channels instead of constant 3 channels
#------------divide data into test and training set-----------#
train_test_split = np.random.rand(len(reshaped_segments)) < 0.80
train_x_init = reshaped_segments[train_test_split]
train_y_init = labels[train_test_split]
test_x = reshaped_segments[~train_test_split]
test_y = labels[~train_test_split]
train_validation_split = np.random.rand(len(train_x_init)) < 0.80
train_x = train_x_init[train_validation_split]
train_y = train_y_init[train_validation_split]
validation_x = train_x_init[~train_validation_split]
validation_y = train_y_init[~train_validation_split]
#---------------training hyperparameters----------------#
batch_size = 10
kernel_size = 60 #from 60 #optimal 2
depth = 15 #from 60 #optimal 15
num_hidden = 1000 #from 1000 #optimal 80
learning_rate = 0.0001
training_epochs = 8
total_batches = train_x.shape[0] ##// batch_size
#---------define placeholders for input----------#
X = tf.placeholder(tf.float32, shape=[None,input_width * num_channels], name="input")
X_reshaped = tf.reshape(X,[-1,input_height,input_width,num_channels])
Y = tf.placeholder(tf.float32, shape=[None,num_labels])
#---------------------perform convolution-----------------#
# first convolutional layer
c_weights = weight_variable([1, kernel_size, num_channels, depth])
c_biases = bias_variable([depth * num_channels])
c = apply_depthwise_conv(X_reshaped,c_weights,c_biases)
p = apply_max_pool(c,20,2)
# second convolutional layer
c2_weights = weight_variable([1, 6,depth*num_channels,depth//10])
c2_biases = bias_variable([(depth*num_channels)*(depth//10)])
c = apply_depthwise_conv(p,c2_weights,c2_biases)
#--------------flatten data for fully connected layers----------#
shape = c.get_shape().as_list()
c_flat = tf.reshape(c, [-1, shape[1] * shape[2] * shape[3]])
#------------fully connected layers----------------#
f_weights_l1 = weight_variable([shape[1] * shape[2] * depth * num_channels * (depth//10), num_hidden])
f_biases_l1 = bias_variable([num_hidden])
f = tf.nn.tanh(tf.add(tf.matmul(c_flat, f_weights_l1),f_biases_l1))
#----------------------dropout------------------#
keep_prob = tf.placeholder(tf.float32)
drop_layer = tf.nn.dropout(f, keep_prob)
#----------------------softmax layer----------------#
out_weights = weight_variable([num_hidden, num_labels])
out_biases = bias_variable([num_labels])
y_ = tf.nn.softmax(tf.add(tf.matmul(drop_layer, out_weights),out_biases), name="y_")
#-----------------loss optimization-------------#
loss = -tf.reduce_sum(Y * tf.log(y_))
optimizer = tf.train.AdamOptimizer(learning_rate = learning_rate).minimize(loss)
#-----------------compute accuracy---------------#
correct_prediction = tf.equal(tf.argmax(y_,1), tf.argmax(Y,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
cost_history = np.empty(shape=[1],dtype=float)
saver = tf.train.Saver()
session = tf.Session()
session.run(tf.global_variables_initializer())
#-------------early stopping-----------------#
# Best validation accuracy seen so far.
best_validation_accuracy = 0.0
# Iteration-number for last improvement to validation accuracy.
last_improvement = 0
# Stop optimization if no improvement found in this many iterations.
require_improvement = 1000
# Counter for total number of iterations performed so far.
total_iterations = 0
def validation_accuracy():
    return session.run(accuracy, feed_dict={X: validation_x, Y: validation_y, keep_prob: 1.0})

def next_batch(b, batch_size, train_x, train_y):
    ##for b in range(total_batches):
    offset = (b * batch_size) % (train_y.shape[0] - batch_size)
    batch_x = train_x[offset:(offset + batch_size), :]
    batch_y = train_y[offset:(offset + batch_size), :]
    return batch_x, batch_y
def optimize(num_iterations):
    # Ensure we update the global variables rather than local copies.
    global total_iterations
    global best_validation_accuracy
    global last_improvement
    # Start-time used for printing time-usage below.
    start_time = time.time()
    for i in range(num_iterations):
        # Increase the total number of iterations performed.
        # It is easier to update it in each iteration because
        # we need this number several times in the following.
        total_iterations += 1
        # Get a batch of training examples.
        # x_batch now holds a batch of images and
        # y_true_batch are the true labels for those images.
        ##x_batch, y_true_batch = data.train.next_batch(train_batch_size)
        x_batch, y_true_batch = next_batch(i, batch_size, train_x, train_y)
        # Put the batch into a dict with the proper names
        # for placeholder variables in the TensorFlow graph.
        feed_dict_train = {X: x_batch,
                           Y: y_true_batch, keep_prob: 0.5}
        # Run the optimizer using this batch of training data.
        # TensorFlow assigns the variables in feed_dict_train
        # to the placeholder variables and then runs the optimizer.
        session.run(optimizer, feed_dict=feed_dict_train)
        # Print status every 100 iterations and after last iteration.
        if (total_iterations % 100 == 0) or (i == (num_iterations - 1)):
            # Calculate the accuracy on the training-batch.
            acc_train = session.run(accuracy, feed_dict={X: x_batch,
                                                         Y: y_true_batch, keep_prob: 1.0})
            # Calculate the accuracy on the validation-set.
            # The function returns 2 values but we only need the first.
            ##acc_validation, _ = validation_accuracy()
            acc_validation = validation_accuracy()
            # If validation accuracy is an improvement over best-known.
            if acc_validation > best_validation_accuracy:
                # Update the best-known validation accuracy.
                best_validation_accuracy = acc_validation
                # Set the iteration for the last improvement to current.
                last_improvement = total_iterations
                # Save all variables of the TensorFlow graph to file.
                saver.save(sess=session, save_path="../shoaib-har_agm_es.ckpt")
                # A string to be printed below, shows improvement found.
                improved_str = '*'
            else:
                # An empty string to be printed below.
                # Shows that no improvement was found.
                improved_str = ''
            # Status-message for printing.
            msg = "Iter: {0:>6}, Train-Batch Accuracy: {1:>6.1%}, Validation Acc: {2:>6.1%} {3}"
            # Print it.
            print(msg.format(i + 1, acc_train, acc_validation, improved_str))
        # If no improvement found in the required number of iterations.
        if total_iterations - last_improvement > require_improvement:
            print("No improvement found in a while, stopping optimization.")
            # Break out from the for-loop.
            break
    # Ending time.
    end_time = time.time()
    # Difference between start and end-times.
    time_dif = end_time - start_time
    # Print the time-usage.
    print("Time usage: " + str(timedelta(seconds=int(round(time_dif)))))

optimize(10000)
The output was shown in a screenshot (not reproduced here).
What exactly is training accuracy? Is it even computed? Or do you compute the training accuracy on the entire training data and not just the batch you trained your network with?
Here I printed the results so that they show both the batch training accuracy and the training accuracy on the entire training set at every multiple of 20 iterations.
The data is divided into 3 sets: train, validation and test.
Batch training accuracy is computed on the train set (the difference between the labels and the predictions).
Validation accuracy is the accuracy on the validation set.
The batch accuracy can be computed just after a forward pass through the network. The number of samples in one forward pass is the batch size. It is just a way to train models faster (mini-batch gradient descent).
Overfitting is when the model works really well on known data (the training set) but performs poorly on new data.
As to the 10% multiples: with batch_size = 10, a batch contains 0 to 10 correct predictions, so the batch accuracy is always a multiple of 10%; the printing format only controls how it is displayed.
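A tiny illustration of that granularity (our example, not from the original answer):

import numpy as np

# One batch of 10 predictions: the accuracy can only be 0.0, 0.1, ..., 1.0
correct = np.array([1, 1, 1, 0, 1, 1, 1, 1, 1, 1])
print("{:.1%}".format(correct.mean()))  # prints 90.0%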

Regarding precision and recall

Suppose we have 99% non-spam and 1% spam. I have written the function below:

function y = predictSpam(x)
  y = 0;
return

Here the true positives are zero, and the accuracy is 99%. In this case precision and recall are both zero. Is my understanding right? I would also like the table below checked for this scenario:

                  actual class 1   actual class 0
predict class 1          0                0
predict class 0          1               99

m = 100. Is the above table filled in correctly?
When using precision and recall, I almost always refer back to the standard precision/recall diagram (image not reproduced here).
So we have:
precision = true_positive / (true_positive + false_positive)
recall = true_positive / (true_positive + false_negative)
In your data, 99 samples are correctly classified as 0, and 1 sample is classified as 0 when it should be 1.
With your data:
- true_positive = 0
- true_negative = 99
- false_positive = 0
- false_negative = 1
Your true positive count is 0, so yes, both recall and precision will be 0 (strictly speaking, precision is 0/0 here, which is undefined and conventionally reported as 0).
Accuracy is indeed 99%.
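A minimal sketch of the arithmetic for this table (our example; the 0/0 guard reflects the usual convention):

tp, fp, tn, fn = 0, 0, 99, 1
accuracy = (tp + tn) / (tp + fp + tn + fn)            # 99/100 = 0.99
precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0  # 0/0 -> reported as 0
recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0     # 0/1 = 0.0
print(accuracy, precision, recall)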
