I wrote a script using xgboost to predict soil class for a certain area using field data and satellite images. The script is below:
rm(list=ls())
library(xgboost)
library(caret)
library(raster)
library(sp)
library(rgeos)
library(ggplot2)
setwd("G:/DATA")
data <- read.csv('96PointsClay02finalone.csv')
head(data)
summary(data)
dim(data)
ras <- stack("Allindices04TIFF.tif")
names(ras) <- c("b1", "b2", "b3", "b4", "b5", "b6", "b7", "b10", "b11","DEM",
"R1011", "SCI", "SAVI", "NDVI", "NDSI", "NDSandI", "MBSI",
"GSI", "GSAVI", "EVI", "DryBSI", "BIL", "BI","SRCI")
set.seed(27) # set seed for reproducibility
# use createDataPartition() from the caret package to split the data into training (80%) and testing (20%) sets
parts = createDataPartition(data$Clay, p = .8, list = F)
train = data[parts, ]
test = data[-parts, ]
#define predictor and response variables in training set
train_x = data.matrix(train[, -1])
train_y = train[,1]
#define predictor and response variables in testing set
test_x = data.matrix(test[, -1])
test_y = test[, 1]
#define final training and testing sets
xgb_train = xgb.DMatrix(data = train_x, label = train_y)
xgb_test = xgb.DMatrix(data = test_x, label = test_y)
#defining a watchlist
watchlist = list(train=xgb_train, test=xgb_test)
#fit XGBoost model and display training and testing data at each iteration
model = xgb.train(data = xgb_train, max.depth = 3, watchlist=watchlist, nrounds = 100)
#define final model
model_xgboost = xgboost(data = xgb_train, max.depth = 3, nrounds = 86, verbose = 0)
summary(model_xgboost)
#use model to make predictions on test data
pred_y = predict(model_xgboost, xgb_test)
# performance metrics on the test data
mean((test_y - pred_y)^2) #mse - Mean Squared Error
caret::RMSE(test_y, pred_y) #rmse - Root Mean Squared Error
y_test_mean = mean(test_y)
rmseE <- function(error) {
  sqrt(mean(error^2))
}
y = test_y
yhat = pred_y
rmseresult=rmseE(y-yhat)
(r2 = R2(yhat , y, form = "traditional"))
cat('The R-square of the test data is ', round(r2,4), ' and the RMSE is ', round(rmseresult,4), '\n')
#use model to make predictions on satellite image
result <- predict(model_xgboost, ras[1:(nrow(ras)*ncol(ras))])
#create a result raster
res <- raster(ras)
#fill in results and add a "1" to them (to get back to initial class numbering! - see above "Prepare data" for more information)
res <- setValues(res,result+1)
#Save the output .tif file into saved directory
writeRaster(res, "xgbmodel_output", format = "GTiff", overwrite=T)
The script works well until it reaches
result <- predict(model_xgboost, ras[1:(nrow(ras)*ncol(ras))])
It takes some time and then gives this error:
Error in predict.xgb.Booster(model_xgboost, ras[1:(nrow(ras) * ncol(ras))]) :
Feature names stored in `object` and `newdata` are different!
I realize that I am doing something wrong in that line, but I do not know how to apply the xgboost model to a raster image that represents my study area.
It would be highly appreciated if someone could give me a hand and help me solve this problem.
My data (CSV and raster image) can be found here.
Finally, I found the reason for this error.
It was my mistake: the number of columns in the training data did not match the number of layers in the satellite image, so the feature names stored in the model and those in the prediction data differed, which is exactly what the error message says.
If I use optim.SGD(model_conv.fc.parameters(), ...) I get an error:
optimizer got an empty parameter list
This error occurs when model_conv.fc is nn.Hardtanh(...) (and also when I try nn.ReLU).
But with nn.Linear it works fine.
What could be the reason?
model_conv.fc = nn.Hardtanh(min_val=0.0, max_val=1.0)  # not OK --> optimizer got an empty parameter list
# model_conv.fc = nn.ReLU()  # also not OK
# num_ftrs = model_conv.fc.in_features
# model_conv.fc = nn.Linear(num_ftrs, 1)  # this works fine
model_conv = model_conv.to(config.device())
optimizer_conv = optim.SGD(model_conv.fc.parameters(), lr=config.learning_rate, momentum=config.momentum) #error is here
Hardtanh and ReLU are parameter-free layers, but Linear has learnable parameters.
Activation functions only add non-linearity to your model and have no parameters of their own, so you should use an nn.Linear as the fully-connected (FC) layer:
num_ftrs = model_conv.fc.in_features
model_conv.fc = nn.Sequential( list_of_FC_layers )
e.g.
model_conv.fc = nn.Sequential(nn.Linear(in_features, hidden_neurons),
                              nn.Linear(hidden_neurons, out_channels))
or
model_conv.fc = nn.Linear(in_features, out_channels)
out_channels is 1 for a binary classification task and num_classes for multi-class classification.
NOTE 1: For multi-class classification, do not add a softmax layer; it is already applied inside CrossEntropyLoss.
NOTE 2: You can use a Sigmoid activation for binary classification.
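To make this concrete, here is a minimal hedged sketch (the resnet18 backbone, learning rate, and momentum are placeholder assumptions, not values from the original code):

import torch.nn as nn
import torch.optim as optim
import torchvision

# load a pretrained backbone (assumed here; the original model_conv is not shown)
model_conv = torchvision.models.resnet18(pretrained=True)

# replace the final fc with a layer that actually has learnable parameters
num_ftrs = model_conv.fc.in_features
model_conv.fc = nn.Linear(num_ftrs, 1)  # 1 output unit for binary classification

# works: nn.Linear exposes weight and bias parameters
optimizer_conv = optim.SGD(model_conv.fc.parameters(), lr=0.001, momentum=0.9)

# fails: an activation layer has no parameters to optimize
# print(list(nn.Hardtanh(min_val=0.0, max_val=1.0).parameters()))  # -> []

The commented-out line illustrates the original error: calling .parameters() on an activation layer yields nothing, and optim.SGD rejects an empty parameter list.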
I'm using the coil-100 dataset which has images of 100 objects, 72 images per object taken from a fixed camera by turning the object 5 degrees per image. Following is the folder structure I'm using:
data/train/obj1/obj01_0.png, obj01_5.png ... obj01_355.png
.
.
data/train/obj85/obj85_0.png, obj85_5.png ... obj85_355.png
.
.
data/test/obj86/obj86_0.png, obj86_5.png ... obj86_355.png
.
.
data/test/obj100/obj100_0.png, obj100_5.png ... obj100_355.png
I have used the ImageFolder and DataLoader classes. The train and test datasets load properly and I can print the class names.
train_path = 'data/train/'
test_path = 'data/test/'
data_transforms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])
train_data = torchvision.datasets.ImageFolder(
    root=train_path,
    transform=data_transforms
)
test_data = torchvision.datasets.ImageFolder(
    root=test_path,
    transform=data_transforms
)
train_loader = torch.utils.data.DataLoader(
    train_data,
    batch_size=None,
    num_workers=1,
    shuffle=False
)
test_loader = torch.utils.data.DataLoader(
    test_data,
    batch_size=None,
    num_workers=1,
    shuffle=False
)
print(len(train_data))
print(len(test_data))
classes = train_data.class_to_idx
print("detected classes: ", classes)
In my model, I wish to pass every image through a pretrained ResNet and build a dataset from the ResNet outputs to feed into a bidirectional LSTM.
For this, I need to access the images by class name and index.
For example, pre_resnet_train_data['obj01'][0] should be obj01_0.png, and post_resnet_train_data['obj01'][0] should be the ResNet output of obj01_0.png, and so on.
I'm a beginner in PyTorch, and for the past two days I have read many tutorials and Stack Overflow questions about creating a custom dataset class, but I couldn't figure out how to achieve what I want.
Please help!
Assuming you only plan on running the ResNet on the images once and saving the outputs for later use, I suggest you write your own dataset class derived from ImageFolder.
Save each ResNet output at the same location as the image file, with a .pth extension.
import os
import torch
import torchvision

class MyDataset(torchvision.datasets.ImageFolder):
    def __init__(self, root, transform):
        super(MyDataset, self).__init__(root, transform)

    def __getitem__(self, index):
        # override ImageFolder's method
        """
        Args:
            index (int): Index
        Returns:
            tuple: (sample, resnet, target) where target is class_index of the target class.
        """
        path, target = self.samples[index]
        sample = self.loader(path)
        if self.transform is not None:
            sample = self.transform(sample)
        if self.target_transform is not None:
            target = self.target_transform(target)
        # this is where you load your resnet data
        resnet_path = os.path.splitext(path)[0] + '.pth'  # replace image extension with .pth
        resnet = torch.load(resnet_path)  # load the stored features
        return sample, resnet, target
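To produce those .pth files in the first place, something along the lines of the following hedged sketch could precompute the ResNet features once (the resnet18 backbone, the Identity replacement of its fc, and the data_transforms / data/train/ names are assumptions based on the question, not part of the original answer):

import os
import torch
import torchvision

# run the ResNet once over the plain ImageFolder and store each feature
# vector next to its image as a .pth file, for MyDataset.__getitem__ to load
resnet = torchvision.models.resnet18(pretrained=True)
resnet.fc = torch.nn.Identity()  # keep the pooled 512-d feature vector
resnet.eval()

plain_data = torchvision.datasets.ImageFolder(root='data/train/', transform=data_transforms)

with torch.no_grad():
    for (path, _), (img, _) in zip(plain_data.samples, plain_data):
        features = resnet(img.unsqueeze(0)).squeeze(0)
        torch.save(features, os.path.splitext(path)[0] + '.pth')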
I trained a single model and want to combine it with another Keras model using the functional API (the backend is TensorFlow version 1.4).
My first model looks like this:
import tensorflow.contrib.keras.api.keras as keras
from tensorflow.contrib.keras.api.keras.layers import Input, Dense

input = Input(shape=(200,))
dnn = Dense(400, activation="relu")(input)
dnn = Dense(400, activation="relu")(dnn)
output = Dense(5, activation="softmax")(dnn)
model = keras.models.Model(inputs=input, outputs=output)
After I trained this model, I saved it using the Keras model.save() method. I can also load the model and retrain it without problems.
Now I want to use the output of this model as additional input for a second model:
# load first model
old_model = keras.models.load_model(path_to_old_model)
input_1 = Input(shape=(200,))
input_2 = Input(shape=(200,))
output_old_model = old_model(input_2)
merge_layer = concatenate([input_1, output_old_model])
dnn_layer = Dense(200, activation="relu")(merge_layer)
dnn_layer = Dense(200, activation="relu")(dnn_layer)
output = Dense(10, activation="sigmoid")(dnn_layer)
new_model = keras.models.Model(inputs=[input_1, input_2], outputs=output)
new_model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
new_model.fit([x1, x2], labels, epochs=50, batch_size=32)
When I try this, I get the following error message:
FailedPreconditionError (see above for traceback): Attempting to use uninitialized value dense_1/kernel
[[Node: dense_1/kernel/read = Identity[T=DT_FLOAT, _class=["loc:#dense_1/kernel"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](dense_1/kernel)]]
[[Node: model_1_1/dense_3/BiasAdd/_79 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_68_model_1_1/dense_3/BiasAdd", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
I would do this in the following steps:
Define a function for building a clean model with the same architecture:
def build_base():
    input = Input(shape=(200,))
    dnn = Dense(400, activation="relu")(input)
    dnn = Dense(400, activation="relu")(dnn)
    output = Dense(5, activation="softmax")(dnn)
    model = keras.models.Model(inputs=input, outputs=output)
    return input, output, model
Build two copies of the same model:
input_1, output_1, model_1 = build_base()
input_2, output_2, model_2 = build_base()
Set weights in both models:
model_1.set_weights(old_model.get_weights())
model_2.set_weights(old_model.get_weights())
Now do the rest:
merge_layer = concatenate([input_1, output_2])
dnn_layer = Dense(200, activation="relu")(merge_layer)
dnn_layer = Dense(200, activation="relu")(dnn_layer)
output = Dense(10, activation="sigmoid")(dnn_layer)
new_model = keras.models.Model(inputs=[input_1, input_2], outputs=output)
Let's say you have a pre-trained/saved CNN model called pretrained_model and you want to add densely connected layers to it. Using the functional API, you can write something like this:
from keras import models, layers
kmodel = layers.Flatten()(pretrained_model.output)
kmodel = layers.Dense(256, activation='relu')(kmodel)
kmodel_out = layers.Dense(1, activation='sigmoid')(kmodel)
model = models.Model(pretrained_model.input, kmodel_out)
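Training the extended model could then look like the following minimal sketch (the optimizer choice and the x_train / y_train arrays are placeholders, not from the original answer):

# compile and train the extended model; optimizer and data names are placeholders
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10, batch_size=32)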
What I am trying to do:
I want to connect any Layers from different models to create a new keras model.
What I found so far:
https://github.com/keras-team/keras/issues/4205: using the Model's call() method to change the input of another model. My problems with this approach:
It can only change the input of the Model, not other layers. So if I want to cut off some layers at the beginning of the encoder, that is not possible.
Not a fan of the nested array structure when getting the config file. Would prefer to have a 1D-array
When using model.summary() or plot_model(), the encoder only shows as "Model". If anything I would say both models should be wrapped. So the config should show [model_base, model_encoder] and not [base_input, base_conv2D, ..., encoder_model]
To be fair, with this approach: https://github.com/keras-team/keras/issues/3021, the point above is actually possible, but again, it is very inflexible. As soon as I want to cut off some layers at the top or bottom of the base or encoder network, this approach fails
https://github.com/keras-team/keras/issues/3465: Adding new layers to a base model by using any output of the base model. Problems here:
While it is possible to use any layer from the base model, which means I can cut off layers from the base model, I cannot load the encoder as a Keras model. The top model always has to be created from scratch.
What I have tried:
My approach to connecting any layers from different models:
Clear inbound nodes of input layer
use the call() method of the output layer with the tensor of the output layer
Clean up the outbound nodes of the output tensor by swapping the newly created tensor for the previous output tensor
I was really optimistic at first, as summary() and plot_model() gave me exactly what I wanted, so the node graph should be fine, right? But I ran into errors when training. While the approach from the "What I found so far" section trained fine, mine did not. This is the error message:
File "C:\Anaconda\envs\dlpipe\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 508, in apply_op
(input_name, err))
ValueError: Tried to convert 'x' to a tensor and failed. Error: None values not supported.
It might be important that I am using TensorFlow as the backend. I was able to trace back the root of this error. It seems like there is an error when the gradients are calculated. Usually there is a gradient calculation for each node, but with my approach all the nodes of the base network have "None". So the failure is basically in keras/optimizers.py, in get_updates(), when the gradients are calculated (grad = self.get_gradients(loss, params)).
Here is the code (without the training), with all three approaches implemented:
from keras.layers import Input, Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.models import Model
from keras.utils import plot_model


def create_base():
    in_layer = Input(shape=(32, 32, 3), name="base_input")
    x = Conv2D(32, (3, 3), padding='same', activation="relu", name="base_conv2d_1")(in_layer)
    x = Conv2D(32, (3, 3), padding='same', activation="relu", name="base_conv2d_2")(x)
    x = MaxPooling2D(pool_size=(2, 2), name="base_maxpooling_2d_1")(x)
    x = Dropout(0.25, name="base_dropout")(x)
    x = Conv2D(64, (3, 3), padding='same', activation="relu", name="base_conv2d_3")(x)
    x = Conv2D(64, (3, 3), padding='same', activation="relu", name="base_conv2d_4")(x)
    x = MaxPooling2D(pool_size=(2, 2), name="base_maxpooling2d_2")(x)
    x = Dropout(0.25, name="base_dropout_2")(x)
    return Model(inputs=in_layer, outputs=x, name="base_model")


def create_encoder():
    in_layer = Input(shape=(8, 8, 64))
    x = Flatten(name="encoder_flatten")(in_layer)
    x = Dense(512, activation="relu", name="encoder_dense_1")(x)
    x = Dropout(0.5, name="encoder_dropout_2")(x)
    x = Dense(10, activation="softmax", name="encoder_dense_2")(x)
    return Model(inputs=in_layer, outputs=x, name="encoder_model")


def extend_base(input_model):
    x = Flatten(name="custom_flatten")(input_model.output)
    x = Dense(512, activation="relu", name="custom_dense_1")(x)
    x = Dropout(0.5, name="custom_dropout_2")(x)
    x = Dense(10, activation="softmax", name="custom_dense_2")(x)
    return Model(inputs=input_model.input, outputs=x, name="custom_edit")


def connect_layers(from_tensor, to_layer, clear_inbound_nodes=True):
    try:
        tmp_output = to_layer.output
    except AttributeError:
        raise ValueError("Connecting to shared layers is not supported!")
    if clear_inbound_nodes:
        to_layer.inbound_nodes = []
    else:
        tensor_list = to_layer.inbound_nodes[0].input_tensors
        tensor_list.append(from_tensor)
        from_tensor = tensor_list
        to_layer.inbound_nodes = []
    new_output = to_layer(from_tensor)
    for out_node in to_layer.outbound_nodes:
        for i, in_tensor in enumerate(out_node.input_tensors):
            if in_tensor == tmp_output:
                out_node.input_tensors[i] = new_output


if __name__ == "__main__":
    base = create_base()
    encoder = create_encoder()

    #new_model_1 = Model(inputs=base.input, outputs=encoder(base.output))
    #plot_model(new_model_1, to_file="plots/new_model_1.png")

    new_model_2 = extend_base(base)
    plot_model(new_model_2, to_file="plots/new_model_2.png")
    print(new_model_2.summary())

    base_layer = base.get_layer("base_dropout_2")
    top_layer = encoder.get_layer("encoder_flatten")
    connect_layers(base_layer.output, top_layer)

    new_model_3 = Model(inputs=base.input, outputs=encoder.output)
    plot_model(new_model_3, to_file="plots/new_model_3.png")
    print(new_model_3.summary())
I know this is a lot of text and a lot of code, but I feel it is needed to explain the issue.
EDIT: I just tried Theano and I think the error gives away more information:
theano.gradient.DisconnectedInputError:
Backtrace when that variable is created:
It seems like every layer from the encoder model has some connection with the encoder input layer via TensorVariables.
So this is what I ended up with for the connect_layers() function:
def connect_layers(from_tensor, to_layer, old_tensor=None):
    # if there is any shared layer after the to_layer, it is not supported
    try:
        tmp_output = to_layer.output
    except AttributeError:
        raise ValueError("Connecting to shared layers is not supported!")
    # check if to_layer has multiple input_tensors, and is therefore some sort of merge layer
    if len(to_layer.inbound_nodes[0].input_tensors) > 1:
        tensor_list = to_layer.inbound_nodes[0].input_tensors
        found_tensor = False
        for i, tensor in enumerate(tensor_list):
            # exchange the old tensor with the newly created tensor
            if tensor == old_tensor:
                tensor_list[i] = from_tensor
                found_tensor = True
                break
        if not found_tensor:
            tensor_list.append(from_tensor)
        from_tensor = tensor_list
        to_layer.inbound_nodes = []
    else:
        to_layer.inbound_nodes = []
    new_output = to_layer(from_tensor)
    tmp_out_nodes = to_layer.outbound_nodes[:]
    to_layer.outbound_nodes = []
    # recursively connect all layers after the current to_layer
    for out_node in tmp_out_nodes:
        l = out_node.outbound_layer
        print("Connecting: " + str(to_layer) + " ----> " + str(l))
        connect_layers(new_output, l, tmp_output)
As each tensor has all the information about its root tensor via owner.inputs -> owner.inputs -> ..., all tensors following the new_output tensor must be updated.
It was a lot easier to debug this with the Theano backend than with TensorFlow.
I still need to figure out how to deal with shared layers. With the current implementation, it is not possible to connect other models that contain a shared layer after the first to_layer.
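For reference, a hedged usage sketch of this recursive connect_layers(), mirroring the earlier new_model_3 example:

# splice the encoder onto the base model at base_dropout_2 -> encoder_flatten
base = create_base()
encoder = create_encoder()

base_layer = base.get_layer("base_dropout_2")
top_layer = encoder.get_layer("encoder_flatten")

# old_tensor stays None for the first call; the recursion passes it on afterwards
connect_layers(base_layer.output, top_layer)

new_model_3 = Model(inputs=base.input, outputs=encoder.output)
print(new_model_3.summary())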