I have a question about the AlexNet model. I have re-implemented the AlexNet model in Torch from the BVLC Caffe model, but I am getting 0 percent top-1 accuracy all the time, even after 1 million iterations with the batch size set to 256. I was wondering if someone could help me with what is wrong in my model. This is the model that I wrote:
net = nn.Sequential()
net:add(cudnn.SpatialConvolution(3, 96, 11, 11, 4, 4, 0, 0, 1):learningRate('bias', 2):weightDecay('bias', 0))
net:add(cudnn.ReLU(true))
net:add(cudnn.SpatialCrossMapLRN(5))
net:add(cudnn.SpatialMaxPooling(3,3, 2,2, 0,0):ceil())
net:add(cudnn.SpatialConvolution(96,256,5,5,1,1,2,2,2):learningRate('bias', 2):weightDecay('bias', 0))
net:add(cudnn.ReLU(true))
net:add(cudnn.SpatialCrossMapLRN(5))
net:add(cudnn.SpatialMaxPooling(3,3,2,2,0,0):ceil())
net:add(cudnn.SpatialConvolution(256, 384, 3,3, 1,1, 1,1,1):learningRate('bias', 2):weightDecay('bias', 0))
net:add(cudnn.ReLU(true))
net:add(cudnn.SpatialConvolution(384, 384, 3,3, 1,1, 1,1,2):learningRate('bias', 2):weightDecay('bias', 0))
net:add(cudnn.ReLU(true))
net:add(cudnn.SpatialConvolution(384, 256, 3,3, 1,1, 1,1,2):learningRate('bias', 2):weightDecay('bias', 0))
net:add(cudnn.ReLU(true))
net:add(cudnn.SpatialMaxPooling(3,3,2,2,0,0):ceil())
--net:add(nn.View(256*6*6))
net:add(nn.View(-1):setNumInputDims(3))
net:add(nn.Linear(256*6*6, 4096):learningRate('weight', 1):learningRate('bias', 2):weightDecay('weight', 1):weightDecay('bias', 0))
--net:add(nn.BatchNormalization(4096))
net:add(cudnn.ReLU(true))
net:add(nn.Dropout(0.5))
net:add(nn.Linear(4096, 4096):learningRate('weight', 1):learningRate('bias', 2):weightDecay('weight', 1):weightDecay('bias', 0))
--net:add(nn.BatchNormalization(4096))
net:add(cudnn.ReLU(true))
net:add(nn.Dropout(0.5))
net:add(nn.Linear(4096, opt.nClasses):learningRate('weight', 1):learningRate('bias', 2):weightDecay('weight', 1):weightDecay('bias', 0))
And this is how I have initialized the weights:
-- initialize the model
local function weights_init(m)
   local name = torch.type(m)
   if name:find('Convolution') then
      m.weight:normal(0.0, 0.01)
      m.bias:fill(0)
   elseif name:find('BatchNormalization') then
      if m.weight then m.weight:normal(1.0, 0.02) end
      if m.bias then m.bias:fill(0) end
   end
end
net:apply(weights_init)
Any pointers would be much appreciated. Thank you so much for your help!
You can load a pretrained AlexNet in PyTorch with:
import torch
model = torch.hub.load('pytorch/vision:v0.10.0', 'alexnet', pretrained=True)
model.eval()
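If it helps, here is a minimal inference sketch for the hub model loaded above (the preprocessing values are the standard ImageNet ones used by torchvision; the image path is a placeholder assumption):
import torch
from PIL import Image
from torchvision import transforms

# Standard ImageNet preprocessing used for torchvision's AlexNet.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = Image.open('some_image.jpg')       # placeholder path
batch = preprocess(img).unsqueeze(0)     # shape: (1, 3, 224, 224)

with torch.no_grad():
    logits = model(batch)                # 'model' is the hub AlexNet above
probs = torch.nn.functional.softmax(logits[0], dim=0)
print(probs.topk(5))                     # top-5 ImageNet class scores and indices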
Alternatively, you can check this GitHub repo, where the model is available in HDF5 format for TensorFlow for an easy implementation, here.
When I try to combine RandomizedSearchCV with early stopping to reduce overfitting, I get this error:
py:372: FitFailedWarning:
300 fits failed out of a total of 300.
The score on these train-test partitions for these parameters will be set to nan.
If these failures are not expected, you can try to debug them by setting error_score='raise'.
The code I am trying is like this:
params_dist = {'min_child_weight': [0.1, 1, 5, 10, 50],
               'colsample_bytree': np.arange(0.5, 1.0, 0.1),
               'gamma': [0.5, 1, 1.5, 2, 5],
               'subsample': np.arange(0.5, 1.0, 0.1),
               'max_depth': range(3, 21, 3),
               'learning_rate': [0.0001, 0.001, 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 1],
               'n_estimators': [50, 100, 250, 500, 750, 1000],
               'reg_alpha': [0.0001, 0.001, 0.1, 1],
               'reg_lambda': [0.0001, 0.001, 0.1, 1]}

model_with_earlyStopping = xgb.XGBClassifier(objective='binary:logistic',
                                             eval_metric="error",
                                             early_stopping_rounds=13,
                                             seed=42)

random_search = model_selection.RandomizedSearchCV(model_with_earlyStopping,
                                                   param_distributions=params_dist,
                                                   scoring='roc_auc',
                                                   n_jobs=-1,
                                                   verbose=0,
                                                   cv=3,
                                                   random_state=1001,
                                                   n_iter=100)
The code worked fine without early stopping; however, I am looking for a way to combine these two methods.
Can anyone help me fix it?
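For context on the FitFailedWarning (a sketch under the assumption that this is the cause, not a confirmed fix): with recent xgboost versions, setting early_stopping_rounds on the estimator makes fit require an eval_set, which RandomizedSearchCV does not supply on its own, so every CV fit fails. One way to wire a validation set through is to pass it as a fit parameter; the held-out split and variable names below are assumptions, not from the original code:
from sklearn import model_selection
import xgboost as xgb

# Hold out a fixed validation set for early stopping (X_train/y_train and the
# 80/20 split are assumptions about your data, not from the original code).
X_tr, X_val, y_tr, y_val = model_selection.train_test_split(
    X_train, y_train, test_size=0.2, random_state=42)

model_with_earlyStopping = xgb.XGBClassifier(objective='binary:logistic',
                                             eval_metric='error',
                                             early_stopping_rounds=13,
                                             seed=42)

random_search = model_selection.RandomizedSearchCV(model_with_earlyStopping,
                                                   param_distributions=params_dist,
                                                   scoring='roc_auc',
                                                   cv=3,
                                                   n_iter=100,
                                                   random_state=1001,
                                                   n_jobs=-1)

# Keyword arguments to fit() are forwarded to the estimator's fit(), so every
# CV fit sees the same eval_set and early stopping has something to monitor.
random_search.fit(X_tr, y_tr, eval_set=[(X_val, y_val)], verbose=False)
Note that every fold then early-stops against the same fixed validation set; whether that is acceptable for your use case is a design choice.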
I am playing around with the XGBoostClassifier and tuning this with GridSearchCV. I first created the variable xgbc:
xgbc = xgb.XGBClassifier()
I didn't use any parameters as I wanted to see the default model performance. This gave me accuracy_score = 85.65%, recall_score = 77.91% and roc_auc_score = 84.21%, using the following lines of code:
print("Accuracy: ", accuracy_score(y_test, xgbc.predict(X_test)))
print("Recall: ", recall_score(y_test, xgbc.predict(X_test)))
print("ROC_AUC: ", roc_auc_score(y_test, xgbc.predict(X_test)))
Next, I used GridSearchCV to try to tune the parameters, like this:
Setting up the parameter dictionary:
xgbc_params = {'max_depth': [5, 6, 7],                       # 6
               'learning_rate': [0.25, 0.300000012, 0.35],   # 0.300000012
               'gamma': [0, 0.001, 0.1],                      # 0
               'reg_lambda': [0.8, 0.95, 1],                  # 1
               'scale_pos_weight': [0, 1, 2],                 # 1
               'n_estimators': [95, 100, 105]}                # 100
(The numbers after the # are the default values, which gave me the above scores.)
And now run the GridSearchCV like this:
xgbc_grid = GridSearchCV(xgbc, param_grid=xgbc_params, scoring = make_scorer(accuracy_score), cv = 10, n_jobs = -1)
Next, fit this to the training data:
xgbc_grid.fit(X_train, y_train, verbose = 1, early_stopping_rounds = 10, eval_metric = 'aucpr', eval_set = [(X_test, y_test)])
Finally, run the metrics again:
print("Best Reg estimators: ", xgbc_grid.best_params_)
print("Accuracy: ", accuracy_score(y_test, xgbc_grid.predict(X_test)))
print("Recall: ", recall_score(y_test, xgbc_grid.predict(X_test)))
print("ROC_AUC: ", roc_auc_score(y_test, xgbc_grid.predict(X_test)))
Now, the scores change: accuracy_score = 0.8340807174887892, recall_score = 0.7325581395348837 and roc_auc_score = 0.8420896282464777. Also, here is the best_params_ result:
Best Reg estimators: {'gamma': 0, 'learning_rate': 0.35, 'max_depth': 5, 'n_estimators': 95, 'reg_lambda': 0.8, 'scale_pos_weight': 1}
Here is my problem:
The parameter values that GridSearchCV returns through xgbc_grid.best_params_ are not the most optimal for accuracy, as the accuracy score decreases. Can you please help me figure out why this is happening?
In the parameter dictionary above, I have provided the default values. If I set the parameters to only these single values, then I get the 85% accuracy, like, 'max_depth': [6]. However, as soon as I add other values, like 'max_depth': [5, 6, 7], then GridSearchCV gives the parameters that are not the highest on accuracy score. Full details below:
Base Reg estimators (acc = 85%): {'gamma': 0, 'learning_rate': 0.35, 'max_depth': 5, 'n_estimators': 95, 'reg_lambda': 0.8, 'scale_pos_weight': 1}
Best Reg estimators (acc = 83%): {'gamma': 0, 'learning_rate': 0.35, 'max_depth': 6, 'n_estimators': 100, 'reg_lambda': 1, 'scale_pos_weight': 1}
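One thing worth checking (a sketch, not a definitive diagnosis): GridSearchCV chooses best_params_ by the mean cross-validated score on the training folds, not by the score on X_test, so the combination that wins in CV can still land below the defaults on a single held-out test set. You can inspect this directly via cv_results_; the pandas usage below is an assumption about how you would like to view it:
import pandas as pd

# GridSearchCV ranks candidates by their mean cross-validated score on the
# training folds, not by the held-out test score computed above.
cv_results = pd.DataFrame(xgbc_grid.cv_results_)
cols = ['params', 'mean_test_score', 'std_test_score', 'rank_test_score']
print(cv_results[cols].sort_values('rank_test_score').head())

# Here 'mean_test_score' is the CV-fold accuracy; the default parameter
# combination can rank below best_params_ in CV even if it happens to score
# higher on the single held-out test set.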
I am running the same piece of code with normal XGBoost and Dask XGBoost, but I am getting different probabilities from the two models.
Normal XGBoost Code
params = {'objective': 'binary:logistic', 'nround': 1000,
          'max_depth': 16, 'eta': 0.01, 'subsample': 0.5,
          'min_child_weight': 1, 'tree_method': 'hist',
          'grow_policy': 'lossguide'}
model = XGBClassifier(params=params)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
Output: (screenshot: normal XGBoost code output)
Dask XGBoost Code
params = {'objective': 'binary:logistic', 'nround': 1000,
          'max_depth': 16, 'eta': 0.01, 'subsample': 0.5,
          'min_child_weight': 1, 'tree_method': 'hist',
          'grow_policy': 'lossguide'}
bst = dxgb.train(client, params, X_train, y_train)
predictions2 = dxgb.predict(client, bst, X_test).persist()
Output: (screenshot: Dask XGBoost code output)
Can someone please help me here?
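One thing I would check first (a sketch under the assumption that this is the source of the mismatch, not a confirmed fix): XGBClassifier.predict returns hard class labels, while the booster trained by dxgb.train should output positive-class probabilities for 'binary:logistic', so the two outputs are not directly comparable. A like-for-like comparison would look something like this:
# Like-for-like comparison: probabilities vs. probabilities.
proba_sklearn = model.predict_proba(X_test)[:, 1]          # P(class == 1) from XGBClassifier
proba_dask = dxgb.predict(client, bst, X_test).compute()   # probabilities from the Dask booster

# Also worth checking: XGBClassifier(params=params) does not expand the dict
# into keyword arguments, so the sklearn model may be running on its default
# hyperparameters; XGBClassifier(**params) passes them explicitly.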
I am training a CNN with Keras on 30x30 patches from an image. I want to test the network on a full image, but I get the following error:
ValueError: GpuElemwise. Input dimension mis-match. Input 2 (indices start at 0) has shape[1] == 30, but the output's size on that axis is 100.
Apply node that caused the error: GpuElemwise{Composite{((i0 + i1) - i2)}}[(0, 0)](GpuDimShuffle{0,2,3,1}.0, GpuReshape{4}.0, GpuFromHost.0)
Toposort index: 79
Inputs types: [CudaNdarrayType(float32, 4D), CudaNdarrayType(float32, (True, True, True, False)), CudaNdarrayType(float32, 4D)]
Inputs shapes: [(10, 100, 100, 3), (1, 1, 1, 3), (10, 30, 30, 3)]
Inputs strides: [(30000, 100, 1, 10000), (0, 0, 0, 1), (2700, 90, 3, 1)]
Inputs values: ['not shown', CudaNdarray([[[[ 0.01060364  0.00988821  0.00741314]]]]), 'not shown']
Outputs clients: [[GpuCAReduce{pre=sqr,red=add}{0,1,1,1}(GpuElemwise{Composite{((i0 + i1) - i2)}}[(0, 0)].0)]]
This is my model.predict:
predict_image = model.predict(np.array([test_images[1]]), batch_size=1)[0]
It seems like the issue is that the input size cannot be anything other than 30x30, but the input shape of the first layer of my network is (None, None, 3):
model.add(Convolution2D(n1, f1, f1, border_mode='same', input_shape=(None, None, 3), activation='relu'))
Is it simply not possible to test an image with different dimensions to the ones I trained with?
As fchollet himself described here, you should be able to define the input like so:
input_shape=(1, None, None)
However, this will fail if you have layers that use the Flatten operation.
This suggests that you should be able to accomplish your goal with a fully convolutional NN.
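To make that concrete, here is a minimal fully convolutional sketch in the question's Keras 1 syntax (the filter counts/sizes and the 3-channel linear output head are placeholder assumptions, not the asker's actual architecture):
from keras.models import Sequential
from keras.layers import Convolution2D

# Placeholder filter counts/sizes -- assumptions, not the asker's architecture.
n1, f1 = 64, 9
n2, f2 = 32, 5

model = Sequential()
# No Flatten/Dense anywhere, so any HxW is accepted at prediction time.
model.add(Convolution2D(n1, f1, f1, border_mode='same',
                        input_shape=(None, None, 3), activation='relu'))
model.add(Convolution2D(n2, f2, f2, border_mode='same', activation='relu'))
# 1x1 convolution as the output head instead of Flatten + Dense.
model.add(Convolution2D(3, 1, 1, border_mode='same', activation='linear'))
model.compile(optimizer='adam', loss='mse')

# Train on 30x30 patches, then predict on a full-size image:
# model.fit(patch_inputs, patch_targets, ...)
# full_pred = model.predict(full_image[None, ...], batch_size=1)[0]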
I am new to machine learning and TensorFlow. I am trying to train a simple model to recognize gender from a small data set of height, weight, and shoe size. However, I have encountered a problem with evaluating the model's accuracy.
Here's the entire code:
import tflearn
import tensorflow as tf
import numpy as np
# [height, weight, shoe_size]
X = [[181, 80, 44], [177, 70, 43], [160, 60, 38], [154, 54, 37], [166, 65, 40],
     [190, 90, 47], [175, 64, 39], [177, 70, 40], [159, 55, 37], [171, 75, 42],
     [181, 85, 43], [170, 52, 39]]
# 0 - for female, 1 - for male
Y = [1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0]
data = np.column_stack((X, Y))
np.random.shuffle(data)
# Split into train and test set
X_train, Y_train = data[:8, :3], data[:8, 3:]
X_test, Y_test = data[8:, :3], data[8:, 3:]
# Build neural network
net = tflearn.input_data(shape=[None, 3])
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 1, activation='linear')
net = tflearn.regression(net, loss='mean_square')
# fix for tflearn with TensorFlow 12:
col = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
for x in col:
    tf.add_to_collection(tf.GraphKeys.VARIABLES, x)
# Define model
model = tflearn.DNN(net)
# Start training (apply gradient descent algorithm)
model.fit(X_train, Y_train, n_epoch=100, show_metric=True)
score = model.evaluate(X_test, Y_test)
print('Training test score', score)
test_male = [176, 78, 42]
test_female = [170, 52, 38]
print('Test male: ', model.predict([test_male])[0])
print('Test female:', model.predict([test_female])[0])
Even though the model's predictions are not very accurate:
Test male: [0.7158362865447998]
Test female: [0.4076206684112549]
model.evaluate(X_test, Y_test) always returns 1.0. How do I calculate the real accuracy on the test data set using TFLearn?
You want to do binary classification in this case. Your network is set to perform linear regression.
First, transform the labels (gender) to categorical features:
from tflearn.data_utils import to_categorical
Y_train = to_categorical(Y_train, nb_classes=2)
Y_test = to_categorical(Y_test, nb_classes=2)
The output layer of your network needs two output units for the two classes you want to predict. Also, the activation needs to be softmax for classification. The tflearn defaults for the regression layer are categorical cross-entropy loss and an accuracy metric, so those are already correct.
# Build neural network
net = tflearn.input_data(shape=[None, 3])
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 2, activation='softmax')
net = tflearn.regression(net)
The output will now be a vector with the probability for each gender. For example:
[0.991, 0.009] #female
Bear in mind that you will hopelessly overfit the network with your tiny data set. This means that during training the accuracy will approach 1, while the accuracy on your test set will be quite poor.
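For completeness, a minimal sketch of re-fitting and evaluating with these changes, reusing the names from the code above (the epoch count is arbitrary and the sample inputs are just the ones from the question):
# Re-fit on the one-hot labels (Y_train / Y_test from the to_categorical step
# above) with the softmax network, then report classification accuracy.
model = tflearn.DNN(net)
model.fit(X_train, Y_train, n_epoch=100, show_metric=True)

print('Test accuracy:', model.evaluate(X_test, Y_test)[0])

# predict() now returns [P(female), P(male)] per sample.
print('Test male:  ', model.predict([[176, 78, 42]])[0])
print('Test female:', model.predict([[170, 52, 38]])[0])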