keras combine pretrained model - machine-learning

I trained a single model and want to combine it with another Keras model using the functional API (the backend is TensorFlow 1.4).
My first model looks like this:
import tensorflow.contrib.keras.api.keras as keras
from tensorflow.contrib.keras.api.keras.layers import Input, Dense, concatenate
input = Input(shape=(200,))
dnn = Dense(400, activation="relu")(input)
dnn = Dense(400, activation="relu")(dnn)
output = Dense(5, activation="softmax")(dnn)
model = keras.models.Model(inputs=input, outputs=output)
After training this model, I save it using the Keras model.save() method. I can also load the model and retrain it without problems.
Now I want to use the output of this model as additional input for a second model:
# load first model
old_model = keras.models.load_model(path_to_old_model)
input_1 = Input(shape=(200,))
input_2 = Input(shape=(200,))
output_old_model = old_model(input_2)
merge_layer = concatenate([input_1, output_old_model])
dnn_layer = Dense(200, activation="relu")(merge_layer)
dnn_layer = Dense(200, activation="relu")(dnn_layer)
output = Dense(10, activation="sigmoid")(dnn_layer)
new_model = keras.models.Model(inputs=[input_1, input_2], outputs=output)
new_model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
new_model.fit(x=[x1, x2], y=labels, epochs=50, batch_size=32)
When I try this, I get the following error message:
FailedPreconditionError (see above for traceback): Attempting to use uninitialized value dense_1/kernel
[[Node: dense_1/kernel/read = Identity[T=DT_FLOAT, _class=["loc:@dense_1/kernel"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](dense_1/kernel)]]
[[Node: model_1_1/dense_3/BiasAdd/_79 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_68_model_1_1/dense_3/BiasAdd", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

I would do this in the following steps:
Define function for building a clean model with the same architecture:
def build_base():
    input = Input(shape=(200,))
    dnn = Dense(400, activation="relu")(input)
    dnn = Dense(400, activation="relu")(dnn)
    output = Dense(5, activation="softmax")(dnn)
    model = keras.models.Model(inputs=input, outputs=output)
    return input, output, model
Build two copies of the same model:
input_1, output_1, model_1 = build_base()
input_2, output_2, model_2 = build_base()
Set weights in both models:
model_1.set_weights(old_model.get_weights())
model_2.set_weights(old_model.get_weights())
Now do the rest:
merge_layer = concatenate([input_1, output_2])
dnn_layer = Dense(200, activation="relu")(merge_layer)
dnn_layer = Dense(200, activation="relu")(dnn_layer)
output = Dense(10, activation="sigmoid")(dnn_layer)
new_model = keras.models.Model(inputs=[input_1, input_2], outputs=output)
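The rebuild-and-copy-weights idea above can be sketched end to end as follows. This is only an illustration under several assumptions: it uses tf.keras from TensorFlow 2 rather than the contrib API of TensorFlow 1.4, `old_model` is a freshly built stand-in for the model returned by `load_model`, and the training data is random. It is also a slight variation on the steps above: the copied model is called on `input_2` instead of reusing its own input tensor.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers


def build_base():
    """Same architecture as the first model: 200 -> 400 -> 400 -> 5."""
    inp = tf.keras.Input(shape=(200,))
    x = layers.Dense(400, activation="relu")(inp)
    x = layers.Dense(400, activation="relu")(x)
    out = layers.Dense(5, activation="softmax")(x)
    return tf.keras.Model(inputs=inp, outputs=out)


# Stand-in (assumption) for keras.models.load_model(path_to_old_model).
old_model = build_base()

# Fresh copy of the architecture with the old weights transferred.
base = build_base()
base.set_weights(old_model.get_weights())
copied = all(
    np.array_equal(a, b)
    for a, b in zip(base.get_weights(), old_model.get_weights())
)

# Second model: raw features plus the base model's predictions.
input_1 = tf.keras.Input(shape=(200,))
input_2 = tf.keras.Input(shape=(200,))
merged = layers.concatenate([input_1, base(input_2)])
x = layers.Dense(200, activation="relu")(merged)
x = layers.Dense(200, activation="relu")(x)
output = layers.Dense(10, activation="sigmoid")(x)
new_model = tf.keras.Model(inputs=[input_1, input_2], outputs=output)
new_model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

# Random data (assumption) just to show that training runs.
x1 = np.random.rand(8, 200).astype("float32")
x2 = np.random.rand(8, 200).astype("float32")
labels = np.random.randint(0, 2, size=(8, 10)).astype("float32")
new_model.fit([x1, x2], labels, epochs=1, batch_size=4, verbose=0)
```

If the base model should stay fixed while the new layers train, setting `base.trainable = False` before compiling is the usual way to freeze it.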

Let's say you have a pre-trained, saved CNN model called pretrained_model and you want to add densely connected layers to it. Using the functional API, you can write something like this:
from keras import models, layers
kmodel = layers.Flatten()(pretrained_model.output)
kmodel = layers.Dense(256, activation='relu')(kmodel)
kmodel_out = layers.Dense(1, activation='sigmoid')(kmodel)
model = models.Model(pretrained_model.input, kmodel_out)

Related

Fine tuning a BERT Model as a chatbot giving error while training

I have been trying to fine-tune a BERT model to produce response sentences in the style of a character, based on input sentences, but I get a rather odd error every time. Here source_texts is a list of sentences that give the context, and target_texts is a list of sentences that respond to the context statements. The code is:
from transformers import AutoModel, AutoTokenizer
model = AutoModel.from_pretrained("bert-base-cased").to(device)
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
input_ids = []
output_ids = []
for i in range(0, len(source_texts)):
    input_ids.append(tokenizer.encode(source_texts[i], return_tensors="pt"))
    output_ids.append(tokenizer.encode(target_texts[i], return_tensors="pt"))
import torch
device = torch.device("cuda")
from transformers import BertForMaskedLM, AdamW
model = BertForMaskedLM.from_pretrained("bert-base-cased")
optimizer = AdamW(model.parameters(), lr=1e-5)
loss_fn = torch.nn.CrossEntropyLoss()
def train(input_id, output_id):
    input_id = input_id.to(device)
    output_id = output_id.to(device)
    model.zero_grad()
    logits, _ = model(input_id, labels=output_id)
    # Compute the loss
    loss = loss_fn(logits.view(-1, logits.size(-1)), output_id.view(-1))
    loss.backward()
    optimizer.step()
    return loss.item()
for epoch in range(50):
    # Train the model on the training dataset
    train_loss = 0.0
    for input_sequences, output_sequences in zip(input_ids, output_ids):
        input_sequences = input_sequences.to(device)
        output_sequences = output_sequences.to(device)
        train_loss += train(input_sequences, output_sequences)
This is the error that I am getting:
Any help would be really appreciated.
Please help!
Hi, I looked at your code: you moved only the inputs to the GPU, not the model. PyTorch creates models on the CPU by default, so move the model as well:
import torch
from transformers import BertForMaskedLM
device = torch.device('cuda')
model = BertForMaskedLM.from_pretrained("bert-base-cased")
model.to(device)
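The same point can be shown without downloading BERT. In this minimal sketch, `nn.Linear` is only a stand-in (an assumption) for `BertForMaskedLM`, and the device falls back to the CPU when no GPU is available so the snippet runs anywhere.

```python
import torch
import torch.nn as nn

# Use the GPU when present, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for BertForMaskedLM; Module.to() moves the parameters in place.
model = nn.Linear(4, 2)
model.to(device)

# Tensor.to() returns a new tensor, so the result has to be reassigned.
inputs = torch.randn(3, 4)
inputs = inputs.to(device)

# Model and inputs now live on the same device, so the forward pass works.
outputs = model(inputs)
print(outputs.device, next(model.parameters()).device)
```

Note the asymmetry: calling `.to(device)` on a module changes the module itself, while calling it on a tensor leaves the original tensor where it was.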

How can I get the feature names after several fit_transform's from sklearn?

I'm running a machine learning model that requires multiple transformations. I applied polynomial transformations, interactions, and also a feature selection using SelectKBest:
transformer = ColumnTransformer(
    transformers=[("cat", ce.cat_boost.CatBoostEncoder(y_train), cat_features),]
)
X_train_transformed = transformer.fit_transform(X_train, y_train)
X_test_transformed = transformer.transform(X_test)
poly = PolynomialFeatures(2)
X_train_polynomial = poly.fit_transform(X_train_transformed)
X_test_polynomial = poly.transform(X_test_transformed)
interaction = PolynomialFeatures(2, interaction_only=True)
X_train_interaction = interaction.fit_transform(X_train_polynomial)
X_test_interaction = interaction.transform(X_test_polynomial)
feature_selection = SelectKBest(chi2, k=55)
train_features = feature_selection.fit_transform(X_train_interaction, y_train)
test_features = feature_selection.transform(X_test_interaction)
model = lgb.LGBMClassifier()
model.fit(train_features, y_train)
However, I want to get the feature names of the selected features, and I have no idea how to get them.

Any way to efficiently stack/ensemble pre-trained models for image classification?

I am trying to stack a few pre-trained models by taking the last hidden layer of each model, concatenating them, and feeding the result into a meta-learner model (e.g. XGBoost).
I am running into a big problem of having to process each image of my dataset multiple times since each base model requires a different processing method. This is causing my model to take a really long time to train and is infeasible. Is there any way to work past this?
For example:
model_1, processor_1 = pretrained_model(), pretrained_processor()
model_2, processor_2 = pretrained_model2(), pretrained_processor2()
for img in images:
    input_1 = processor_1(img)
    input_2 = processor_2(img)
    out_1 = model_1(input_1)
    out_2 = model_2(input_2)
    torch.cat((out_1, out_2), dim=1)  # concatenates hidden representations to feed into another model
Here's a recommendation if you want to process your images faster:
Note: I have not tested this.
import torch
import torch.nn as nn
# Wrap both base models in a single nn.Module
class StackedModel(nn.Module):
    def __init__(self, model1, model2):
        super(StackedModel, self).__init__()
        self.model1 = model1
        self.model2 = model2
    def forward(self, inputs_1, inputs_2):
        out_1 = self.model1(inputs_1)
        out_2 = self.model2(inputs_2)
        return torch.cat((out_1, out_2), dim=1)
# Init model
model = StackedModel(model_1, model_2)
# Stack the preprocessed images and run them in larger batches, assuming you have spare GPU memory
stacked_preproc1 = []
stacked_preproc2 = []
max_batch_size = 16
total_output = []
for index, img in enumerate(images):
    stacked_preproc1.append(processor_1(img))
    stacked_preproc2.append(processor_2(img))
    # Run the model once a batch is full or the last image has been reached
    if len(stacked_preproc1) == max_batch_size or index == len(images) - 1:
        total_output.append(
            model(torch.stack(stacked_preproc1), torch.stack(stacked_preproc2))
        )
        # Reset the lists
        stacked_preproc1 = []
        stacked_preproc2 = []
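The accumulate-and-flush pattern used in the loop above is easy to get subtly wrong (for example, by dropping the last partial batch), so here is a framework-free sketch of just that part. The helper names `batched` and `run_in_batches` are hypothetical, and the lambda stands in for "stack the preprocessed images and call the model".

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield batch


def run_in_batches(
    items: Iterable[T],
    batch_size: int,
    run_batch: Callable[[List[T]], List[R]],
) -> List[R]:
    """Apply run_batch to each batch and collect the per-item outputs in order."""
    outputs: List[R] = []
    for batch in batched(items, batch_size):
        outputs.extend(run_batch(batch))
    return outputs


# Hypothetical stand-in for the model call: double every item in the batch.
doubled = run_in_batches(range(5), 2, lambda batch: [2 * x for x in batch])
print(doubled)  # [0, 2, 4, 6, 8]
```

Five items with a batch size of two produce batches of sizes 2, 2, and 1, and the outputs come back in the original order.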

How to prevent Keras predict_generator from shuffling data?

I created a deep learning model, and I want to check the performance of the model by using predict_generator. I am using the following code which compares the images' labels with the predicted classes and then returns the prediction error.
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,
    target_size=(image_size, image_size),
    batch_size=val_batchsize,
    class_mode='categorical',
    shuffle=False)
# Get the filenames from the generator
fnames = validation_generator.filenames
# Get the ground truth from generator
ground_truth = validation_generator.classes
# Get the label to class mapping from the generator
label2index = validation_generator.class_indices
# Getting the mapping from class index to class label
idx2label = dict((v,k) for k,v in label2index.items())
# Get the predictions from the model using the generator
predictions = model.predict_generator(validation_generator, steps=validation_generator.samples/validation_generator.batch_size,verbose=1)
predicted_classes = np.argmax(predictions,axis=1)
errors = np.where(predicted_classes != ground_truth)[0]
print("No of errors = {}/{}".format(len(errors),validation_generator.samples))
# Show the errors
for i in range(len(errors)):
    pred_class = np.argmax(predictions[errors[i]])
    pred_label = idx2label[pred_class]
    title = 'Original label:{}, Prediction :{}, confidence : {:.3f}'.format(
        fnames[errors[i]].split('/')[0],
        pred_label,
        predictions[errors[i]][pred_class])
    original = load_img('{}/{}'.format(validation_dir,fnames[errors[i]]))
    plt.figure(figsize=[7,7])
    plt.axis('off')
    plt.title(title)
    plt.imshow(original)
    plt.show()
validation_generator.classes is in order, but predicted_classes is not.
I took the code from here: https://www.learnopencv.com/keras-tutorial-fine-tuning-using-pre-trained-models/
How can I prevent predict_generator from shuffling data?
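One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: the `steps` argument in the code above is computed with true division, so it is a float and is not a whole number whenever the sample count is not a multiple of the batch size. A whole number of steps that covers every sample exactly once can be computed as in this sketch (the helper name `prediction_steps` is hypothetical).

```python
import math


def prediction_steps(num_samples: int, batch_size: int) -> int:
    """Number of batches needed to see every sample exactly once."""
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return math.ceil(num_samples / batch_size)


# 1000 samples in batches of 32: 31 full batches plus one batch of 8.
print(prediction_steps(1000, 32))  # 32
```

With the generator from the question this would be `steps=prediction_steps(validation_generator.samples, validation_generator.batch_size)`. Calling `validation_generator.reset()` before predicting is also commonly suggested, so that iteration starts from the first batch and the predictions line up with `validation_generator.classes`.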

Find the best pipeline model using CrossValidator and ParamGridBuilder

I have an acceptable model, but I would like to improve it by adjusting its parameters in Spark ML Pipeline with CrossValidator and ParamGridBuilder.
As an Estimator I will place the existing pipeline.
I do not understand ParamMaps and do not know what to put there.
As Evaluator I will use the RegressionEvaluator already created previously.
I'm going to do it for 5 folds, with a list of 10 different depth values in the tree.
How can I select and show the best model for the lowest RMSE?
ACTUAL example:
from pyspark.ml import Pipeline
from pyspark.ml.regression import DecisionTreeRegressor
from pyspark.ml.feature import VectorIndexer
from pyspark.ml.evaluation import RegressionEvaluator
dt = DecisionTreeRegressor()
dt.setPredictionCol("Predicted_PE")
dt.setMaxBins(100)
dt.setFeaturesCol("features")
dt.setLabelCol("PE")
dt.setMaxDepth(8)
pipeline = Pipeline(stages=[vectorizer, dt])
model = pipeline.fit(trainingSetDF)
regEval = RegressionEvaluator(predictionCol = "Predicted_PE", labelCol = "PE", metricName = "rmse")
rmse = regEval.evaluate(predictions)
print("Root Mean Squared Error: %.2f" % rmse)
(1) Spark Jobs
(2) Root Mean Squared Error: 3.60
NEED:
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder
dt2 = DecisionTreeRegressor()
dt2.setPredictionCol("Predicted_PE")
dt2.setMaxBins(100)
dt2.setFeaturesCol("features")
dt2.setLabelCol("PE")
dt2.setMaxDepth(10)
pipeline2 = Pipeline(stages=[vectorizer, dt2])
model2 = pipeline2.fit(trainingSetDF)
regEval2 = RegressionEvaluator(predictionCol = "Predicted_PE", labelCol = "PE", metricName = "rmse")
paramGrid = ParamGridBuilder().build() # ??????
crossval = CrossValidator(estimator = pipeline2, estimatorParamMaps = paramGrid, evaluator=regEval2, numFolds = 5) # ?????
rmse2 = regEval2.evaluate(predictions)
#bestPipeline = ????
#bestLRModel = ????
#bestParams = ????
print("Root Mean Squared Error: %.2f" % rmse2)
(1) Spark Jobs
(2) Root Mean Squared Error: 3.60 # the same ¿?
You need to call .fit() with your training data on the crossval object to create the cross-validated model; that call is what runs the cross-validation. You then get the best model (according to your evaluator metric) from the result. For example:
cvModel = crossval.fit(trainingData)
myBestModel = cvModel.bestModel
