Using the LSTM layer in an encoder in PyTorch

I want to build an autoencoder with LSTM layers. But, at the first step of the encoder, I got an error. Could you please help me with that?
Here is the model which I tried to build:
import numpy as np
import torch
import torch.nn as nn

r_input = torch.nn.LSTM(1, 1, 28)
activation = nn.functional.relu
mu_r = nn.Linear(22, 6)
log_var_r = nn.Linear(22, 6)
y = np.random.rand(1, 1, 28)

def encode_r(y):
    y = torch.reshape(y, (-1, 1, 28))  # torch.Size([batch_size, 1, 28])
    hidden = torch.flatten(activation(r_input(y)), start_dim=1)
    z_mu = mu_r(hidden)
    z_log_var = log_var_r(hidden)
    return z_mu, z_log_var
But I got this error in my code:
RuntimeError: input.size(-1) must be equal to input_size. Expected 1, got 28.

You're not creating the LSTM layer correctly.
torch.nn.LSTM takes input_size as its first argument, and the last dimension of your input tensor is 28, not 1. It also looks like you want the encoder to produce a hidden state of size 22, to match your nn.Linear(22, 6) layers. Finally, since you pass the batch as the first dimension, you need to set batch_first=True as an argument:
r_input = torch.nn.LSTM(28, 22, batch_first=True)
This should work for your specific setup. Also note that the LSTM returns two items; the first one (the output sequence) is the one you want to use here:
hidden = torch.flatten(activation(r_input(y)[0]), start_dim=1)
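Putting the pieces together, a minimal sketch of the corrected encoder could look like this (the hidden size of 22 is assumed from your nn.Linear(22, 6) layers):
import torch
import torch.nn as nn

r_input = nn.LSTM(28, 22, batch_first=True)  # input_size=28, hidden_size=22
activation = nn.functional.relu
mu_r = nn.Linear(22, 6)
log_var_r = nn.Linear(22, 6)

def encode_r(y):
    y = torch.reshape(y, (-1, 1, 28))                         # (batch_size, 1, 28)
    output, _ = r_input(y)                                    # output: (batch_size, 1, 22)
    hidden = torch.flatten(activation(output), start_dim=1)   # (batch_size, 22)
    z_mu = mu_r(hidden)
    z_log_var = log_var_r(hidden)
    return z_mu, z_log_var

z_mu, z_log_var = encode_r(torch.rand(1, 1, 28))              # both have shape (1, 6)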
Please see the torch.nn.LSTM entry in the official documentation for more information.

Related

Avoiding data leakage when using BaggingClassifier (Regressor) with feature scaling (StandardScaler)

I am running bagging with LogisticRegression. Since the latter uses regularization, features must be scaled. Since bagging takes a sample (with replacement) from the original training data set, scaling should take place after that. Scaling the original data set and then taking a sample amounts to data leakage. This is similar to how scaling is (mis)used with CV: it is wrong to scale the whole data set and then feed it to CV.
It appears that there are no built-in tools to avoid data leakage with bagging (see the code below), but I may be wrong. Any help will be appreciated.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import BaggingClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV

single_log_reg = LogisticRegression(solver="liblinear", random_state = np.random.RandomState(18))
bagged_logistic = BaggingClassifier(single_log_reg, n_estimators = 100, random_state = np.random.RandomState(42))
logit_bagged_pipeline = Pipeline(steps=[
    ('scaler', StandardScaler(with_mean = False)),
    ('bagged_logit', bagged_logistic)
])
logit_bagged_grid = {'bagged_logit__base_estimator__C': c_values,
                     'bagged_logit__max_features': [100, 200, 400, 600, 800, 1000]}
logit_bagged_searcher = GridSearchCV(estimator = logit_bagged_pipeline, param_grid = logit_bagged_grid, cv = skf,
                                     scoring = "roc_auc", n_jobs = 6, verbose = 4)
logit_bagged_searcher.fit(all_model_features, y_train)
The leakage you mention is really only a major concern if you intend to use out-of-bag performance estimates. Otherwise, each of your models gets a little information from the scaling as to how its bag compares to the rest of the data, which might lead to slight overfitting, but your test scores will be fine.
That said, it is relatively straightforward to avoid this in sklearn: you just need to tie the scaler to the logistic regression inside the bagging:
single_log_reg = LogisticRegression(solver="liblinear", random_state = 18)
logit_scaled_pipeline = Pipeline(steps=[
    ('scaler', StandardScaler(with_mean = False)),
    ('logit', single_log_reg),
])
bagged_logsc = BaggingClassifier(logit_scaled_pipeline, n_estimators = 100, random_state = 42)
# parameter names are relative to bagged_logsc, the estimator handed to GridSearchCV
logit_bagged_grid = {
    'base_estimator__logit__C': c_values,
    'max_features': [100, 200, 400, 600, 800, 1000],
}
logit_bagged_searcher = GridSearchCV(estimator = bagged_logsc, param_grid = logit_bagged_grid, cv = skf,
                                     scoring = "roc_auc", n_jobs = 6, verbose = 4)
logit_bagged_searcher.fit(all_model_features, y_train)
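If you do also want out-of-bag estimates, a small sketch (assuming the same logit_scaled_pipeline and training data as above) could request them directly; because each bag's scaler is fit only on that bag, the OOB estimate is not affected by the leakage discussed above:
bagged_logsc_oob = BaggingClassifier(logit_scaled_pipeline, n_estimators = 100,
                                     oob_score = True, random_state = 42)
bagged_logsc_oob.fit(all_model_features, y_train)
print(bagged_logsc_oob.oob_score_)  # mean accuracy on out-of-bag samples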
On random states, see https://stackoverflow.com/a/69756672/10495893.

How to split a model trained in keras?

I trained a model with 4 hidden layers and 2 dense layers and I have saved that model.
Now I want to load that model and split it into two models: one with the hidden layers and another with only the dense layers.
I split off the part with the hidden layers in the following way:
model = load_model ("model.hdf5")
HL_model = Model(inputs=model.input, outputs=model.layers[7].output)
Here model is the loaded model, and the 7th layer is my last hidden layer. I tried to split off the dense part like this:
DL_model = Model(inputs=model.layers[8].input, outputs=model.layers[-1].output)
and I am getting the error:
TypeError: Input layers to a `Model` must be `InputLayer` objects.
After splitting, the output of HL_model will be the input for DL_model.
Can anyone help me create the model with the dense layers?
PS: I have also tried the code below:
from keras.layers import Input
inputs = Input(shape=(9, 9, 32), tensor=model_1.layers[8].input)
model_3 = Model(inputs=inputs, outputs=model_1.layers[-1].output)
and I get the error:
RuntimeError: Graph disconnected: cannot obtain value for tensor Tensor("conv2d_1_input:0", shape=(?, 144, 144, 3), dtype=float32) at layer "conv2d_1_input". The following previous layers were accessed without issue: []
Here (144, 144, 3) is the input image size of the model.
You need to specify a new Input layer first, then stack the remaining layers over it:
DL_input = Input(model.layers[8].input_shape[1:])
DL_model = DL_input
for layer in model.layers[8:]:
    DL_model = layer(DL_model)
DL_model = Model(inputs=DL_input, outputs=DL_model)
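To sanity-check the split, a small usage sketch (the input shape (144, 144, 3) is taken from your error message, and the random input is only illustrative) could compare the chained predictions with the original model:
import numpy as np

x = np.random.rand(1, 144, 144, 3).astype("float32")
hidden = HL_model.predict(x)            # output of the last hidden layer
chained = DL_model.predict(hidden)      # should match the full model
original = model.predict(x)
print(np.allclose(chained, original))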
A little more generic: you can use the following function to split a model at a given layer:
from keras.layers import Input
from keras.models import Model
def get_bottom_top_model(model, layer_name):
    layer = model.get_layer(layer_name)
    bottom_input = Input(model.input_shape[1:])
    bottom_output = bottom_input
    top_input = Input(layer.output_shape[1:])
    top_output = top_input
    bottom = True
    for layer in model.layers:
        if bottom:
            bottom_output = layer(bottom_output)
        else:
            top_output = layer(top_output)
        if layer.name == layer_name:
            bottom = False
    bottom_model = Model(bottom_input, bottom_output)
    top_model = Model(top_input, top_output)
    return bottom_model, top_model
bottom_model, top_model = get_bottom_top_model(model, "dense_1")
layer_name is just the name of the layer that you want to split at.

How to use masking layer to mask input/output in LSTM autoencoders?

I am trying to use an LSTM autoencoder to do sequence-to-sequence learning with variable-length sequences as inputs, using the following code:
inputs = Input(shape=(None, input_dim))
masked_input = Masking(mask_value=0.0, input_shape=(None,input_dim))(inputs)
encoded = LSTM(latent_dim)(masked_input)
decoded = RepeatVector(timesteps)(encoded)
decoded = LSTM(input_dim, return_sequences=True)(decoded)
sequence_autoencoder = Model(inputs, decoded)
encoder = Model(inputs, encoded)
where the inputs are raw sequence data padded with 0s to the same length (timesteps). Using the code above, the output is also of length timesteps, but when we calculate the loss function we only want the first Ni elements of the output (where Ni is the length of input sequence i, which may differ between sequences). Does anyone know a good way to do that?
Thanks!
Option 1: you can always train without padding if you are willing to train separate batches of equal length.
See this answer to a simple way of separating batches of equal length: Keras misinterprets training data shape
In this case, all you have to do is to perform the "repeat" operation in another manner, since you don't have the exact length at training time.
So, instead of RepeatVector, you can use this:
import keras.backend as K

def repeatFunction(x):
    #x[0] is (batch, latent_dim)
    #x[1] is inputs: (batch, length, features)
    latent = K.expand_dims(x[0], axis=1)       #shape (batch, 1, latent_dim)
    inpShapeMaker = K.ones_like(x[1][:,:,:1])  #shape (batch, length, 1)
    return latent * inpShapeMaker

#instead of RepeatVector:
Lambda(repeatFunction, output_shape=(None, latent_dim))([encoded, inputs])
Option 2 (doesn't smell good): use another masking after RepeatVector.
I tried this, and it works, but we don't get 0's at the end, we get the last value repeated until the end. So, you will have to make a weird padding in your target data, repeating the last step until the end.
Example: target [[[1,2],[5,7]]] will have to be [[[1,2],[5,7],[5,7],[5,7]...]]
This may unbalance your data a lot, I think....
def makePadding(x):
    #x[0] is encoded already repeated
    #x[1] is inputs
    #padding = 1 for actual data in inputs, 0 for 0
    padding = K.cast(K.not_equal(x[1][:,:,:1], 0), dtype=K.floatx())
    #assuming you don't have 0 for non-padded data
    #padding repeated for latent_dim
    padding = K.repeat_elements(padding, rep=latent_dim, axis=-1)
    return x[0] * padding
inputs = Input(shape=(timesteps, input_dim))
masked_input = Masking(mask_value=0.0)(inputs)
encoded = LSTM(latent_dim)(masked_input)
decoded = RepeatVector(timesteps)(encoded)
decoded = Lambda(makePadding,output_shape=(timesteps,latent_dim))([decoded,inputs])
decoded = Masking(mask_value=0.0)(decoded)
decoded = LSTM(input_dim, return_sequences=True)(decoded)
sequence_autoencoder = Model(inputs, decoded)
encoder = Model(inputs, encoded)
Option 3 (best): crop the outputs directly based on the inputs; this also zeroes out the gradients at the padded steps.
def cropOutputs(x):
    #x[0] is decoded at the end
    #x[1] is inputs
    #both have the same shape
    #padding = 1 for actual data in inputs, 0 for 0
    padding = K.cast(K.not_equal(x[1], 0), dtype=K.floatx())
    #if you have zeros for non-padded data, they will lose their backpropagation
    return x[0] * padding
....
....
decoded = LSTM(input_dim, return_sequences=True)(decoded)
decoded = Lambda(cropOutputs,output_shape=(timesteps,input_dim))([decoded,inputs])
For this LSTM autoencoder architecture, which I assume you understand, the mask is lost at the RepeatVector because the LSTM encoder layer has return_sequences=False.
So another option, instead of cropping as above, is to create a custom bottleneck layer that propagates the mask, as sketched below.
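A minimal sketch of such a mask-propagating bottleneck (the layer name is illustrative, and it assumes the same encoded and inputs tensors and zero-padding convention as above):
from keras.layers import Layer
import keras.backend as K

class MaskedRepeat(Layer):
    #repeats the encoder state across timesteps and re-emits the padding mask
    def __init__(self, **kwargs):
        super(MaskedRepeat, self).__init__(**kwargs)
        self.supports_masking = True

    def call(self, inputs):
        encoded, raw = inputs
        latent = K.expand_dims(encoded, axis=1)   # (batch, 1, latent_dim)
        ones = K.ones_like(raw[:, :, :1])         # (batch, timesteps, 1)
        return latent * ones                      # (batch, timesteps, latent_dim)

    def compute_mask(self, inputs, mask=None):
        encoded, raw = inputs
        return K.any(K.not_equal(raw, 0), axis=-1)  # True where not padded

    def compute_output_shape(self, input_shape):
        encoded_shape, raw_shape = input_shape
        return (raw_shape[0], raw_shape[1], encoded_shape[-1])

#instead of RepeatVector + extra masking:
decoded = MaskedRepeat()([encoded, inputs])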

How to estimate? "simple" Nonlinear Regression + Parameter Constraints + AR residuals

I am new to this site, so please bear with me. I want to estimate the nonlinear model shown in the link https://i.stack.imgur.com/cNpWt.png by imposing the constraints a > 0, b > 0 and gamma1 in [0, 1] on the parameters.
In the nonlinear model, the dependent variable is X(t), the independent variables are R(t) and F(t), and ξ(t) is the error term.
An example of the dataset (68 rows of time series) is shown here: https://i.stack.imgur.com/2Vf0j.png
To estimate the nonlinear regression I use the nls() function with no problem as shown below:
NLM1 = nls(Xt ~ (a*Rt - b*Ft)/(1 - gamma1*Rt), start = list(a = 10, b = 10, gamma1 = 0.5), algorithm = "port", lower = c(0, 0, 0), upper = c(Inf, Inf, 1), data = temp2)
I want to estimate NLM1 while also allowing for AR(1) residuals.
Basically I want the same step up as going from lm() to gls(). My problem is that in the gnls() function I don't know how to put constraints on the model parameters a, b and gamma1, so the model estimates wrong values for them.
nls() has options for lower and upper bounds; I can't do the same in gnls().
In gnls() I need to add the constraints, something like lower = c(0, 0, 0), upper = c(Inf, Inf, 1) in nls().
NLM1_AR1 = gnls(model = Xt ~ (a*Rt - b*Ft)/(1 - gamma1*Rt), data = temp2, start = list(a = 13, b = 10, gamma1 = 0.5), correlation = corARMA(p = 1))
Does anyone know how to do this?
Thank you

Hough Transform SimpleCV Feature Extractor

I'm trying to build a new SimpleCV FeatureExtractor for OpenCV's Hough Circle Transform, but I'm running into an error during my machine learning script's training phase.
I've provided the error below. It is raised by the Orange machine learning library when creating the self.mDataSetOrange variable within SimpleCV's TreeClassifier.py. The size of the dataset does not match Orange's expectation for some reason. I looked into Orange's source code and found that the error is thrown here:
orange/source/orange/cls_example.cpp
int const nvars = dom->variables->size() + dom->classVars->size();
if (Py_ssize_t(nvars) != PyList_Size(lst)) {
    PyErr_Format(PyExc_IndexError, "invalid list size (got %i, expected %i items)",
                 PyList_Size(lst), nvars);
    return false;
}
Obviously, my feature extractor is not producing features in the form Orange expects, but I can't pinpoint what the problem is. I'm pretty new to SimpleCV and Orange, so I'd be grateful if someone could point out any mistakes I'm making.
The error:
Traceback (most recent call last):
  File "MyClassifier.py", line 113, in <module>
    MyClassifier.run(MyClassifier.TRAIN_RUN_TYPE, trainingPaths)
  File "MyClassifier.py", line 39, in run
    self.decisionTree.train(imgPaths, MyClassifier.CLASSES, verbose=True)
  File "/usr/local/lib/python2.7/dist-packages/SimpleCV-1.3-py2.7.egg/SimpleCV/MachineLearning/TreeClassifier.py", line 282, in train
    self.mDataSetOrange = orange.ExampleTable(self.mOrangeDomain,self.mDataSetRaw)
IndexError: invalid list size (got 266, expected 263 items) (at example 2)
HoughTransformFeatureExtractor.py
class HoughTransformFeatureExtractor(FeatureExtractorBase):

    def extract(self, img):
        bitmap = img.getBitmap()
        cvMat = cv.GetMat(bitmap)
        cvImage = numpy.asarray(cvMat)
        height, width = cvImage.shape[:2]
        gray = cv2.cvtColor(cvImage, cv2.COLOR_BGR2GRAY)
        circles = cv2.HoughCircles(gray, cv2.cv.CV_HOUGH_GRADIENT, 2.0, width / 2)
        self.featuresLen = 0
        if circles is not None:
            circleFeatures = circles.ravel().tolist()
            self.featuresLen = len(circleFeatures)
            return circleFeatures
        else:
            return None

    def getFieldNames(self):
        retVal = []
        for i in range(self.featuresLen):
            name = "Hough" + str(i)
            retVal.append(name)
        return retVal

    def getNumFields(self):
        return self.featuresLen
So, I figured out my issue. The problem was the size of the list returned by the extract method: it varied from image to image, which is what led to this error. Here are some examples of the lists returned by the extract method (list length -> values):
3 -> [74.0, 46.0, 14.866068840026855]
3 -> [118.0, 20.0, 7.071067810058594]
6 -> [68.0, 8.0, 8.5440034866333, 116.0, 76.0, 13.03840446472168]
3 -> [72.0, 44.0, 8.602325439453125]
9 -> [106.0, 48.0, 15.81138801574707, 20.0, 52.0, 23.409399032592773, 90.0, 122.0, 18.0]
Once I made sure that the size of the list was consistent, no matter the image, the error went away. Hopefully, this will help anyone having similar issues in the future.
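For reference, one way to keep the feature vector length fixed is to pad or truncate the HoughCircles output to a chosen maximum number of circles. A minimal sketch (the helper name and the cap of 3 circles are illustrative, not part of SimpleCV):
def fixed_length_circle_features(circles, max_circles=3):
    # circles is the array returned by cv2.HoughCircles, or None
    target_len = max_circles * 3  # (x, y, radius) per circle
    features = circles.ravel().tolist() if circles is not None else []
    features = features[:target_len]                              # drop extra circles
    features = features + [0.0] * (target_len - len(features))    # pad missing circles
    return features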
