Extra zeros appended in confusion matrix making it 3x3 instead of 2x2 using IsolationForest for Anomaly detection - machine-learning

I am using below code to predict anomaly detection. It is a binary classification so the confusion matrix should be 2x2 instead it is 3x3. There are extra zeros appended in T-shape. Similar thing happened using OneClassSVM few weeks back as well but I thought I was doing something wrong. Could you please help me fix this?
import numpy as np
import pandas as pd
import os
from sklearn.ensemble import IsolationForest
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report
from sklearn import metrics
from sklearn.metrics import roc_auc_score
data = pd.read_csv('opensky_train.csv')
#to make sure that normal data contains no anomaly
sortedData = data.sort_values(by=['class'])
target = pd.DataFrame(sortedData['class'])
Y = target.replace(['surveill', 'other'], [1,0])
X = sortedData.drop(['class'], axis = 1)
x_normal = X.iloc[:200,:]
y_normal = Y.iloc[:200,:]
x_anomaly = X.iloc[200:,:]
y_anomaly = Y.iloc[200:,:]
Edited:
column_values = y_anomaly.values.ravel()
unique_values = pd.unique(column_values)
print(unique_values)
Output : [0 1]
clf = IsolationForest(random_state=0).fit(x_normal)
pred = clf.predict(x_anomaly)
print(pred)
Output : [ 1 1 1 1 1 1 -1 1 -1 1 1 1 1 1 1 1 1 1 1 -1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 -1 1 1 1 1 1 1 -1 1 1 -1 1 1 -1 1 1 -1 1 -1 1
-1 1 1 -1 -1 1 -1 -1 1 1 1 1 -1 1 1 -1 -1 1 1 1 1 1 1 1
-1 1 1 1 1 1 1 1 1 1 -1]
#printing the results
print(confusion_matrix(y_anomaly, pred))
print (classification_report(y_anomaly, pred))
Result:
Confusion Matrix :
[[ 0 0 0]
[ 7 0 60]
[12 0 28]]
precision recall f1-score support
-1 0.00 0.00 0.00 0
0 0.00 0.00 0.00 67
1 0.32 0.70 0.44 40
accuracy 0.26 107
macro avg 0.11 0.23 0.15 107
weighted avg 0.12 0.26 0.16 107

Inliers are labeled 1, while outliers are labeled -1
Source: scikit-learn Anomaly and Outlier detection.
Your example has transformed the classes to 0,1 - so the three possible options are -1,0,1
You need to change from
Y = target.replace(['surveill', 'other'], [1,0])
to
Y = target.replace(['surveill', 'other'], [1,-1])

Related

Terrible performance using XGBoost H2O

Very different Model Performance using XGBoost on H2O
I am training a XGBoost model using 5-fold croos validation on a very imbalanced binary classification problem. The dataset has 1200 columns (multi-document word2vec document embeddings).
The only parameters specified to train the XGBoost model were:
min_split_improvement = 1e-5
seed=1
nfolds = 5
The reported performance on train data was extremely high (probably overfitting!!!):
Confusion Matrix (Act/Pred) for max f1 # threshold = 0.2814398407936096:
A D Error Rate
----- ----- --- ------- -------------
A 16858 2 0.0001 (2.0/16860.0)
D 0 414 0 (0.0/414.0)
Total 16858 416 0.0001 (2.0/17274.0)
AUC: 0.9999991404060721
The performance on cross validation data was terrible:
Confusion Matrix (Act/Pred) for max f1 # threshold = 0.016815993119962513:
A D Error Rate
----- ----- --- ------- ----------------
A 16003 857 0.0508 (857.0/16860.0)
D 357 57 0.8623 (357.0/414.0)
Total 16360 914 0.0703 (1214.0/17274.0)
AUC: 0.6015883863129724
I know H2O cross validation generates an extra model using the whole data available and different performances are expected.
But, could be the cause that generated too bad performance on the resulting model?
Ps: XGBoost on a multi node H2O cluster with OMP
Model Type: classifier
Performance do modelo < XGBoost_model_python_1575650180928_617 >:
ModelMetricsBinomial: xgboost
** Reported on train data. **
MSE: 0.0008688085383330077
RMSE: 0.029475558320971762
LogLoss: 0.00836528606162877
Mean Per-Class Error: 5.931198102016033e-05
AUC: 0.9999991404060721
pr_auc: 0.9975495622569983
Gini: 0.9999982808121441
Confusion Matrix (Act/Pred) for max f1 # threshold = 0.2814398407936096:
A D Error Rate
----- ----- --- ------- -------------
A 16858 2 0.0001 (2.0/16860.0)
D 0 414 0 (0.0/414.0)
Total 16858 416 0.0001 (2.0/17274.0)
Maximum Metrics: Maximum metrics at their respective thresholds
metric threshold value idx
--------------------------- ----------- -------- -----
max f1 0.28144 0.99759 195
max f2 0.28144 0.999035 195
max f0point5 0.553885 0.998053 191
max accuracy 0.28144 0.999884 195
max precision 0.990297 1 0
max recall 0.28144 1 195
max specificity 0.990297 1 0
max absolute_mcc 0.28144 0.997534 195
max min_per_class_accuracy 0.28144 0.999881 195
max mean_per_class_accuracy 0.28144 0.999941 195
max tns 0.990297 16860 0
max fns 0.990297 413 0
max fps 0.000111383 16860 399
max tps 0.28144 414 195
max tnr 0.990297 1 0
max fnr 0.990297 0.997585 0
max fpr 0.000111383 1 399
max tpr 0.28144 1 195
Gains/Lift Table: Avg response rate: 2.40 %, avg score: 2.42 %
group cumulative_data_fraction lower_threshold lift cumulative_lift response_rate score cumulative_response_rate cumulative_score capture_rate cumulative_capture_rate gain cumulative_gain
-- ------- -------------------------- ----------------- ------- ----------------- --------------- ----------- -------------------------- ------------------ -------------- ------------------------- ------- -----------------
1 0.0100151 0.873526 41.7246 41.7246 1 0.907782 1 0.907782 0.417874 0.417874 4072.46 4072.46
2 0.0200301 0.776618 41.7246 41.7246 1 0.834968 1 0.871375 0.417874 0.835749 4072.46 4072.46
3 0.0300452 0.0326301 16.4004 33.2832 0.393064 0.303206 0.797688 0.681985 0.164251 1 1540.04 3228.32
4 0.0400023 0.0224876 0 24.9986 0 0.0263919 0.599132 0.518799 0 1 -100 2399.86
5 0.0500174 0.0180858 0 19.9931 0 0.0201498 0.479167 0.418953 0 1 -100 1899.31
6 0.100035 0.0107386 0 9.99653 0 0.0136044 0.239583 0.216279 0 1 -100 899.653
7 0.149994 0.00798337 0 6.66692 0 0.00922284 0.159784 0.147313 0 1 -100 566.692
8 0.200012 0.00629476 0 4.99971 0 0.00709438 0.119826 0.112249 0 1 -100 399.971
9 0.299988 0.00436827 0 3.33346 0 0.00522157 0.0798919 0.0765798 0 1 -100 233.346
10 0.400023 0.00311204 0 2.49986 0 0.00370085 0.0599132 0.0583548 0 1 -100 149.986
11 0.5 0.00227535 0 2 0 0.00267196 0.0479333 0.0472208 0 1 -100 100
12 0.599977 0.00170271 0 1.66673 0 0.00197515 0.039946 0.0396813 0 1 -100 66.6731
13 0.700012 0.00121528 0 1.42855 0 0.00145049 0.0342375 0.034218 0 1 -100 42.8548
14 0.799988 0.000837358 0 1.25002 0 0.00102069 0.0299588 0.0300692 0 1 -100 25.0018
15 0.899965 0.000507632 0 1.11115 0 0.000670878 0.0266306 0.0268033 0 1 -100 11.1154
16 1 3.35288e-05 0 1 0 0.00033002 0.0239667 0.0241551 0 1 -100 0
Performance da validação cruzada (xval) do modelo < XGBoost_model_python_1575650180928_617 >:
ModelMetricsBinomial: xgboost
** Reported on cross-validation data. **
MSE: 0.023504756648164406
RMSE: 0.15331261085822134
LogLoss: 0.14134815775808462
Mean Per-Class Error: 0.4160864407653825
AUC: 0.6015883863129724
pr_auc: 0.04991836222189148
Gini: 0.2031767726259448
Confusion Matrix (Act/Pred) for max f1 # threshold = 0.016815993119962513:
A D Error Rate
----- ----- --- ------- ----------------
A 16003 857 0.0508 (857.0/16860.0)
D 357 57 0.8623 (357.0/414.0)
Total 16360 914 0.0703 (1214.0/17274.0)
Maximum Metrics: Maximum metrics at their respective thresholds
metric threshold value idx
--------------------------- ----------- --------- -----
max f1 0.016816 0.0858434 209
max f2 0.00409934 0.138433 318
max f0point5 0.0422254 0.0914205 127
max accuracy 0.905155 0.976323 3
max precision 0.99221 1 0
max recall 9.60076e-05 1 399
max specificity 0.99221 1 0
max absolute_mcc 0.825434 0.109684 5
max min_per_class_accuracy 0.00238436 0.572464 345
max mean_per_class_accuracy 0.00262155 0.583914 341
max tns 0.99221 16860 0
max fns 0.99221 412 0
max fps 9.60076e-05 16860 399
max tps 9.60076e-05 414 399
max tnr 0.99221 1 0
max fnr 0.99221 0.995169 0
max fpr 9.60076e-05 1 399
max tpr 9.60076e-05 1 399
Gains/Lift Table: Avg response rate: 2.40 %, avg score: 0.54 %
group cumulative_data_fraction lower_threshold lift cumulative_lift response_rate score cumulative_response_rate cumulative_score capture_rate cumulative_capture_rate gain cumulative_gain
-- ------- -------------------------- ----------------- -------- ----------------- --------------- ----------- -------------------------- ------------------ -------------- ------------------------- --------- -----------------
1 0.0100151 0.0540408 4.34129 4.34129 0.104046 0.146278 0.104046 0.146278 0.0434783 0.0434783 334.129 334.129
2 0.0200301 0.033963 2.41183 3.37656 0.0578035 0.0424722 0.0809249 0.094375 0.0241546 0.0676329 141.183 237.656
3 0.0300452 0.0251807 2.17065 2.97459 0.0520231 0.0292894 0.0712909 0.0726798 0.0217391 0.089372 117.065 197.459
4 0.0400023 0.02038 2.18327 2.77762 0.0523256 0.0225741 0.0665702 0.0602078 0.0217391 0.111111 118.327 177.762
5 0.0500174 0.0174157 1.92946 2.60779 0.0462428 0.0188102 0.0625 0.0519187 0.0193237 0.130435 92.9463 160.779
6 0.100035 0.0103201 1.59365 2.10072 0.0381944 0.0132217 0.0503472 0.0325702 0.0797101 0.210145 59.3649 110.072
7 0.149994 0.00742152 1.06366 1.7553 0.0254925 0.00867473 0.0420687 0.0246112 0.0531401 0.263285 6.3664 75.5301
8 0.200012 0.00560037 1.11073 1.59411 0.0266204 0.00642966 0.0382055 0.0200645 0.0555556 0.318841 11.0725 59.4111
9 0.299988 0.00366149 1.30465 1.49764 0.0312681 0.00452583 0.0358935 0.0148859 0.130435 0.449275 30.465 49.7642
10 0.400023 0.00259159 1.13487 1.40692 0.0271991 0.00306994 0.0337192 0.0119311 0.113527 0.562802 13.4872 40.6923
11 0.5 0.00189 0.579844 1.24155 0.0138969 0.00220612 0.0297557 0.00998654 0.057971 0.620773 -42.0156 24.1546
12 0.599977 0.00136983 0.990568 1.19972 0.0237406 0.00161888 0.0287534 0.0085922 0.0990338 0.719807 -0.943246 19.9724
13 0.700012 0.000980029 0.676094 1.1249 0.0162037 0.00116698 0.02696 0.0075311 0.0676329 0.78744 -32.3906 12.4895
14 0.799988 0.00067366 0.797286 1.08395 0.0191083 0.000820365 0.0259787 0.00669244 0.0797101 0.86715 -20.2714 8.39529
15 0.899965 0.000409521 0.797286 1.05211 0.0191083 0.000540092 0.0252155 0.00600898 0.0797101 0.94686 -20.2714 5.21072
16 1 2.55768e-05 0.531216 1 0.0127315 0.000264023 0.0239667 0.00543429 0.0531401 1 -46.8784 0
For the non cross-validation case, try splitting your data up front into training and validation frames.
I expect you will get a worse AUC for the validation case.
Although for highly imbalanced cases, sometimes you just need to go by the error rate for each class.
Since there are so many true negatives, that can dominate the AUC (vast majority of predictions are correctly predicting “not interesting”). Some people will upsample the minority class in this situation using row weights to make the model more sensitive to them.

Ramp least squares estimation with CVXPY and DCCP

This is a ramp least squares estimation problem, better described in math formula elsewhere:
https://scicomp.stackexchange.com/questions/33524/ramp-least-squares-estimation
I used Disciplined Convex-Concave Programming and DCCP package based on CVXPY. The code follows:
import cvxpy as cp
import numpy as np
import dccp
from dccp.problem import is_dccp
# Generate data.
m = 20
n = 15
np.random.seed(1)
X = np.random.randn(m, n)
Y = np.random.randn(m)
# Define and solve the DCCP problem.
def loss_fn(X, Y, beta):
return cp.norm2(cp.matmul(X, beta) - Y)**2
def obj_g(X, Y, beta, sval):
return cp.pos(loss_fn(X, Y, beta) - sval)
beta = cp.Variable(n)
s = 10000000000000
constr = obj_g(X, Y, beta, s)
t = cp.Variable(1)
t.value = [1]
cost = loss_fn(X, Y, beta) - t
problem = cp.Problem(cp.Minimize(cost), [constr >= t])
print("problem is DCP:", problem.is_dcp()) # false
print("problem is DCCP:", is_dccp(problem)) # true
problem.solve(verbose=True, solver=cp.ECOS, method='dccp')
# Print result.
print("\nThe optimal value is", problem.value)
print("The optimal beta is")
print(beta.value)
print("The norm of the residual is ", cp.norm(X*beta - Y, p=2).value)
Because of the large value s, I would hope to get a solution similar to the least squares estimation. But there is no solution as the output shows (with different solver, dimension of the problem, etc):
problem is DCP: False
problem is DCCP: True
ECOS 2.0.7 - (C) embotech GmbH, Zurich Switzerland, 2012-15. Web: www.embotech.com/ECOS
It pcost dcost gap pres dres k/t mu step sigma IR | BT
0 +0.000e+00 -0.000e+00 +2e+01 9e-02 1e-04 1e+00 9e+00 --- --- 1 1 - | - -
1 -7.422e-04 +2.695e-09 +2e-01 1e-03 1e-06 1e-02 9e-02 0.9890 1e-04 2 1 1 | 0 0
2 -1.638e-05 +5.963e-11 +2e-03 1e-05 2e-08 1e-04 1e-03 0.9890 1e-04 2 1 1 | 0 0
3 -2.711e-07 +9.888e-13 +2e-05 1e-07 2e-10 2e-06 1e-05 0.9890 1e-04 4 1 1 | 0 0
4 -3.991e-09 +1.379e-14 +2e-07 1e-09 2e-12 2e-08 1e-07 0.9890 1e-04 1 0 0 | 0 0
5 -5.507e-11 +1.872e-16 +3e-09 2e-11 2e-14 2e-10 1e-09 0.9890 1e-04 1 0 0 | 0 0
OPTIMAL (within feastol=1.6e-11, reltol=4.8e+01, abstol=2.6e-09).
Runtime: 0.001112 seconds.
ECOS 2.0.7 - (C) embotech GmbH, Zurich Switzerland, 2012-15. Web: www.embotech.com/ECOS
It pcost dcost gap pres dres k/t mu step sigma IR | BT
0 +0.000e+00 -5.811e-01 +1e+01 6e-01 6e-01 1e+00 2e+00 --- --- 1 1 - | - -
1 -7.758e+00 -2.575e+00 +1e+00 2e-01 7e-01 6e+00 3e-01 0.9890 1e-01 1 1 1 | 0 0
2 -3.104e+02 -9.419e+01 +4e-02 2e-01 8e-01 2e+02 8e-03 0.9725 8e-04 2 1 1 | 0 0
3 -2.409e+03 -9.556e+02 +5e-03 2e-01 8e-01 1e+03 1e-03 0.8968 5e-02 3 2 2 | 0 0
4 -1.103e+04 -5.209e+03 +2e-03 2e-01 7e-01 6e+03 4e-04 0.9347 3e-01 2 2 2 | 0 0
5 -1.268e+04 -1.592e+03 +8e-04 1e-01 1e+00 1e+04 2e-04 0.7916 4e-01 3 2 2 | 0 0
6 -1.236e+05 -2.099e+04 +9e-05 1e-01 1e+00 1e+05 2e-05 0.8979 9e-03 1 1 1 | 0 0
7 -4.261e+05 -1.850e+05 +4e-05 2e-01 7e-01 2e+05 1e-05 0.7182 3e-01 2 1 1 | 0 0
8 -2.492e+07 -1.078e+07 +7e-07 1e-01 7e-01 1e+07 2e-07 0.9838 1e-04 3 2 2 | 0 0
9 -2.226e+08 -9.836e+07 +5e-08 9e-02 5e-01 1e+08 1e-08 0.9339 2e-03 2 3 2 | 0 0
UNBOUNDED (within feastol=1.0e-09).
Runtime: 0.001949 seconds.
The optimal value is None
The optimal beta is
None
The norm of the residual is None

OneVsRestClassifier(svm.SVC()).predict() gives continous values

I am trying to use y_scores=OneVsRestClassifier(svm.SVC()).predict() on datasets
like iris and titanic .The trouble is that I am getting y_scores as continous values.like for iris dataset I am getting :
[[ -3.70047231 -0.74209097 2.29720159]
[ -1.93190155 0.69106231 -2.24974856]
.....
I am using the OneVsRestClassifier for other classifier models like knn,randomforest,naive bayes and they are giving appropriate results in the form of
[[ 0 1 0]
[ 1 0 1]...
etc on the iris dataset .Please help.
Well this is simply not true.
>>> from sklearn.multiclass import OneVsRestClassifier
>>> from sklearn.svm import SVC
>>> from sklearn.datasets import load_iris
>>> iris = load_iris()
>>> clf = OneVsRestClassifier(SVC())
>>> clf.fit(iris['data'], iris['target'])
OneVsRestClassifier(estimator=SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3, gamma=0.0,
kernel='rbf', max_iter=-1, probability=False, random_state=None,
shrinking=True, tol=0.001, verbose=False),
n_jobs=1)
>>> print clf.predict(iris['data'])
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 2 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2]
maybe you called decision_function instead (which would match your output dimension, as predict is supposed to return a vector, not a matrix). Then, SVM returns signed distances to each hyperplane, which is its decision function from mathematical perspective.

Batch processing in Torch with ClassNLLCriterion

I'm trying to implement a simple NN in Torch to learn more about it. I created a very simple dataset: binary numbers from 0 to 15 and my goal is to classify the numbers into two classes - class 1 are numbers 0-3 and 12-15, class 2 are the remaining ones. The following code is what i have now (i have removed the data loading routine only):
require 'torch'
require 'nn'
data = torch.Tensor( 16, 4 )
class = torch.Tensor( 16, 1 )
network = nn.Sequential()
network:add( nn.Linear( 4, 8 ) )
network:add( nn.ReLU() )
network:add( nn.Linear( 8, 2 ) )
network:add( nn.LogSoftMax() )
criterion = nn.ClassNLLCriterion()
for i = 1, 300 do
prediction = network:forward( data )
--print( "prediction: " .. tostring( prediction ) )
--print( "class: " .. tostring( class ) )
loss = criterion:forward( prediction, class )
network:zeroGradParameters()
grad = criterion:backward( prediction, class )
network:backward( data, grad )
network:updateParameters( 0.1 )
end
This is how the data and class Tensors look like:
0 0 0 0
0 0 0 1
0 0 1 0
0 0 1 1
0 1 0 0
0 1 0 1
0 1 1 0
0 1 1 1
1 0 0 0
1 0 0 1
1 0 1 0
1 0 1 1
1 1 0 0
1 1 0 1
1 1 1 0
1 1 1 1
[torch.DoubleTensor of size 16x4]
2
2
2
2
1
1
1
1
1
1
1
1
2
2
2
2
[torch.DoubleTensor of size 16x1]
Which is what I expect it to be. However when running this code, i get the following error on line loss = criterion:forward( prediction, class ):
torch/install/share/lua/5.1/nn/ClassNLLCriterion.lua:69: attempt to
perform arithmetic on a nil value
When i modify the training routine like this (processing a single data point at a time instead of all 16 in a batch) it works and the network successfully learns to recognize the two classes:
for k = 1, 300 do
for i = 1, 16 do
prediction = network:forward( data[i] )
--print( "prediction: " .. tostring( prediction ) )
--print( "class: " .. tostring( class ) )
loss = criterion:forward( prediction, class[i] )
network:zeroGradParameters()
grad = criterion:backward( prediction, class[i] )
network:backward( data[i], grad )
network:updateParameters( 0.1 )
end
end
I'm not sure what might be wrong with the "batch processing" i'm trying to do. A brief look at the ClassNLLCriterion didn't help, it seems i'm giving it the expected input (see below), but it still fails. The input it receives (prediction and class Tensors) looks like this:
-0.9008 -0.5213
-0.8591 -0.5508
-0.9107 -0.5146
-0.8002 -0.5965
-0.9244 -0.5055
-0.8581 -0.5516
-0.9174 -0.5101
-0.8040 -0.5934
-0.9509 -0.4884
-0.8409 -0.5644
-0.8922 -0.5272
-0.7737 -0.6186
-0.9422 -0.4939
-0.8405 -0.5648
-0.9012 -0.5210
-0.7820 -0.6116
[torch.DoubleTensor of size 16x2]
2
2
2
2
1
1
1
1
1
1
1
1
2
2
2
2
[torch.DoubleTensor of size 16x1]
Can someone help me out here? Thanks.
Experience has shown that nn.ClassNLLCriterion expects target to be a 1D tensor of size batch_size or a scalar. Your class is a 2D one (batch_size x 1) but class[i] is 1D, that's why your non-batch version works.
So, this will solve your problem:
class = class:view(-1)
Alternatively, you can replace
network:add( nn.LogSoftMax() )
criterion = nn.ClassNLLCriterion()
with the equivalent:
criterion = nn.CrossEntropyCriterion()
The interesting thing is that nn.CrossEntropyCriterion is also able to take a 2D tensor. Why is nn.ClassNLLCriterion not?

How to encode video 3840x2160 with 32x32 and 16x16 CU with depth 2 and 1 in HEVC Encoder HM 13

When I try to encode a video the encoder crashes after finishing first GOP.
This is the configuration I'm using:
MaxCUWidth : 16 # Maximum coding unit width in pixel
MaxCUHeight : 16 # Maximum coding unit height in pixel
MaxPartitionDepth : 2 # Maximum coding unit depth
QuadtreeTULog2MaxSize : 3 # Log2 of maximum transform size for
# quadtree-based TU coding (2...5) = MaxPartitionDepth + 2 - 1
QuadtreeTULog2MinSize : 2 # Log2 of minimum transform size for
# quadtree-based TU coding (2...5)
QuadtreeTUMaxDepthInter : 1
QuadtreeTUMaxDepthIntra : 1
#======== Coding Structure =============
IntraPeriod : 8 # Period of I-Frame ( -1 = only first)
DecodingRefreshType : 1 # Random Accesss 0:none, 1:CDR, 2:IDR
GOPSize : 4 # GOP Size (number of B slice = GOPSize-1)
# Type POC QPoffset QPfactor tcOffsetDiv2 betaOffsetDiv2 temporal_id #ref_pics_active #ref_pics reference pictures predict deltaRPS #ref_idcs reference idcs
Frame1: P 4 1 0.5 0 0 0 1 1 -4 0
Frame2: B 2 2 0.5 1 0 1 1 2 -2 2 1 2 2 1 1
Frame3: B 1 3 0.5 2 0 2 1 3 -1 1 3 1 1 3 1 1 1
Frame4: B 3 3 0.5 2 0 2 1 2 -1 1 1 -2 4 0 1 1 0
This also happens with CU=16x16 with depth=1
Note: I encoded CU=64x64 with depth=4 with the same GOP configuration and every thing went fine.
This is most probably due to the fact that you have compiled the binary for a 32-bit system?
Please rebuild it for a 64-bit system and the problem will go away.

Resources