I have a 2D input (or 3D if one counts the number of samples) and I want to apply a Keras layer that takes this input and outputs another 2D matrix. For example, if the input has size (ExV), the learned weight matrix would be (SxE) and the output (SxV). Can I do this with a Dense layer?
EDIT (at Nassim's request):
The first layer does nothing. It is only there to give an input to the Lambda layer:
from keras.models import Sequential
from keras.layers.core import Reshape,Lambda
from keras import backend as K
from keras.models import Model
input_sample = [
[[1,2,3,4,5],[6,7,8,9,10],[11,12,13,14,15],[16,17,18,19,20]]
,[[21,22,23,24,25],[26,27,28,29,30],[31,32,33,34,35],[36,37,38,39,40]]
,[[41,42,43,44,45],[46,47,48,49,50],[51,52,53,54,55],[56,57,58,59,60]]
]
model = Sequential()
model.add(Reshape((4,5), input_shape=(4,5)))
model.add(Lambda(lambda x: K.transpose(x)))
intermediate_layer_model = Model(input=model.input,output=model.layers[0].output)
print("First layer:")
print(intermediate_layer_model.predict(input_sample))
print("")
print("Second layer:")
intermediate_layer_model = Model(input=model.input,output=model.layers[1].output)
print(intermediate_layer_model.predict(input_sample))
It depends on what you want to do. Is it 2D because it's a sequence? Then LSTMs are made for that and will return a sequence of the desired size if you set return_sequences=True.
CNNs can also work on 2D inputs and will output something of variable size depending on the number of kernels you use.
Otherwise you can reshape it to a 1D tensor of shape (E x V, ), use a Dense layer with S x V units, and reshape the output to a 2D tensor of shape (S, V).
I can't help you more without knowing your use case :-) There are too many possibilities with neural networks.
EDIT :
You can use TimeDistributed(Dense(S)).
If your input has shape (E,V), you permute it to (V,E) so that V is the "time dimension". Then you apply TimeDistributed(Dense(S)), which is a dense layer with weights (ExS); the output has shape (V,S), so you can permute it back to (S,V).
Does that do what you want? The TimeDistributed() layer applies the same Dense(S) layer, with shared weights, to each of the V rows of your input.
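To check the shape bookkeeping without building a model, here is a minimal NumPy sketch of what the shared dense layer computes. The sizes E, V, S and the random weights are illustrative assumptions, and the bias term is left out for brevity; this is not the Keras implementation itself.

```python
import numpy as np

E, V, S = 4, 5, 3                 # illustrative sizes, not from the question
rng = np.random.default_rng(0)
x = rng.normal(size=(E, V))       # one sample of shape (E, V)
w = rng.normal(size=(E, S))       # shared dense kernel of shape (E, S); bias omitted

# Permute to (V, E) and apply the same kernel to each of the V rows.
per_row = np.stack([row @ w for row in x.T])   # shape (V, S)

# Permuting back gives the (S, V) result, the same as (S x E) times (E x V).
out = per_row.T
direct = w.T @ x
```

Under these assumptions, out and direct agree, which is the (SxE) by (ExV) product asked for in the question.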
EDIT 2:
After looking at the code of the Keras backend, it turns out that to use TensorFlow's transpose with its 'permutation pattern' option, you need K.permute_dimensions(x, pattern). The batch dimension must be included. In your case:
Lambda(lambda x: K.permute_dimensions(x,[0,2,1]))
K.transpose(x) uses the same function internally (for tf backend) but permutations is set to the default value which is [n,n-1,...,0].
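The difference is easy to see with NumPy, whose np.transpose also reverses all axes by default. This is only an analogy for the backend behaviour described above, with a small stand-in array shaped like input_sample.

```python
import numpy as np

# Stand-in for input_sample: 3 samples, each of shape (4, 5).
batch = np.arange(3 * 4 * 5).reshape(3, 4, 5)

# Default transpose reverses every axis, so the batch axis ends up last.
full_reverse = np.transpose(batch)            # shape (5, 4, 3)

# An explicit pattern keeps the batch axis and swaps only the last two.
per_sample = np.transpose(batch, (0, 2, 1))   # shape (3, 5, 4)
```

With the pattern (0, 2, 1), each sample is transposed on its own, which is what the Lambda layer needs.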
What you want is probably an autoencoder.
https://blog.keras.io/building-autoencoders-in-keras.html
Related
I'm a beginner building a linear regression model. When I make predictions on the test set, it works fine, but when I try to predict for a specific value, it gives an error. The tutorial I'm following doesn't show any errors.
import pandas as pd
import matplotlib.pyplot as plt
dataset = pd.read_csv('Position_Salaries.csv')
X = dataset.iloc[:, 1:2].values
y = dataset.iloc[:, 2].values
# Fitting Linear Regression to the dataset
from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(X, y)
# Visualising the Linear Regression results
plt.scatter(X, y, color = 'red')
plt.plot(X, lin_reg.predict(X), color = 'blue')
plt.title('Truth or Bluff (Linear Regression)')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()
# Predicting a new result with Linear Regression
lin_reg.predict(6.5)
ValueError: Expected 2D array, got scalar array instead:
array=6.5.
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
According to the Scikit-learn documentation, the input array should have shape (n_samples, n_features). As such, if you want a single example with a single value, you should expect the shape of your input to be (1,1).
You can do this as follows:
import numpy as np
test_X = np.array(6.5).reshape(-1, 1)
lin_reg.predict(test_X)
You can check the shape with:
test_X.shape
The reason is that the input can hold many samples (i.e. you want to predict for multiple data points at once), and each sample can have many features.
Note: NumPy is a Python library that supports large arrays and matrices. It is a dependency of scikit-learn, so it should already be installed alongside it.
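As a quick illustration of the (n_samples, n_features) convention, the sketch below compares the two reshape calls mentioned in the error message. The extra values 7.0 and 8.25 are made up for the example.

```python
import numpy as np

# One sample with one feature.
single = np.array(6.5).reshape(-1, 1)

# Three samples with one feature each: one row per sample.
several = np.array([6.5, 7.0, 8.25]).reshape(-1, 1)

# One sample with three features: a single row.
one_row = np.array([6.5, 7.0, 8.25]).reshape(1, -1)
```

For the single-feature model in the question, the first two layouts are the ones that fit; the third would describe a model trained on three features.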
Say I have a 10x10x4 intermediate output of a convolution layer, which I need to split into 100 volumes of size 1x1x4, applying softmax to each to get 100 outputs from the network. Is there any way to do this without the Lambda layer? The problem with the Lambda layer here is that this simple split takes 100 passes through the Lambda layer during the forward pass, which makes the network too slow for my practical use. Please suggest a quicker way of doing this.
Edit: I had already tried the Softmax+Reshape approach before asking the question. With that approach, I would be getting a 10x10x4 matrix reshaped to a 100x4 Tensor with use of Reshape as the output. What I really need is a multi output network with 100 different outputs. In my application, it is not possible to jointly optimize over the 10x10 matrix, but I get good results by using a network with 100 different outputs with the Lambda layer.
Here are code snippets of my approach using the Keras functional API:
With Lambda layer (slow, gives 100 Tensors of shape (None, 4) as desired):
# Assume conv_output is output from a convolutional layer with shape (None, 10, 10,4)
preds = []
for i in range(10):
    for j in range(10):
        y = Lambda(lambda x, i, j: x[:, i, j, :], arguments={'i': i, 'j': j})(conv_output)
        preds.append(Activation('softmax', name='predictions_' + str(i*10+j))(y))
model = Model(inputs=img, outputs=preds, name='model')
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(),
              metrics=['accuracy'])
With Softmax+Reshape (fast, but gives Tensor of shape (None, 100, 4))
# Assume conv_output is output from a convolutional layer with shape (None, 10, 10,4)
y = Softmax(name='softmax', axis=-1)(conv_output)
preds = Reshape([100, 4])(y)
model = Model(inputs=img, outputs=preds, name='model')
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(),
              metrics=['accuracy'])
I don't think the second case allows optimizing each of the 100 outputs individually (one can probably think of it as learning the joint distribution, whereas I need to learn the marginals, as in the first case). Please let me know if there is a faster way to accomplish what the Lambda layer does in the first code snippet.
You can use the Softmax layer and set the axis argument to the last axis (i.e. -1) to apply softmax over that axis:
from keras.layers import Softmax
soft_out = Softmax(axis=-1)(conv_out)
Note that the axis argument by default is set to -1, so you may not even need to pass that.
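To see what softmax over the last axis does to a 10x10x4 volume, here is a small NumPy sketch. The helper softmax_last_axis and the random stand-in for the convolution output are assumptions for illustration, not Keras code.

```python
import numpy as np

def softmax_last_axis(x):
    # Numerically stable softmax over the last axis.
    shifted = x - x.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
conv_out = rng.normal(size=(2, 10, 10, 4))   # stand-in for the conv output
soft_out = softmax_last_axis(conv_out)

# Each of the 100 positions holds its own distribution over the 4 classes.
position_sums = soft_out.sum(axis=-1)        # shape (2, 10, 10)

# Row-major flattening matches the i*10+j indexing used in the question.
flat = soft_out.reshape(2, 100, 4)
```

In this sketch, each 1x1x4 slice is normalized independently of the others, so the values at position (i, j) match what a separate softmax on that slice would give.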
[Disclaimer] This is my first excursion into machine learning.
I have a list of 1-d numpy real vectors that represent experimental conditions known to be associated with two mutually exclusive classes. Each vector can be assigned a 1 or 0 as its class label.
What is the best way to construct a classifier/predictor using these classes in Python such that the differences between the two classes are maximized?
Let's say you have 1000 vectors with 10 values. Your x data has shape (1000,10), y data (1000,1) (it's either 0 or 1, according to class). You want to predict y from x.
The simplest model could look like (using Keras):
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam
mdl = Sequential()  # create model
mdl.add(Dense(8, input_shape=(10,), activation='sigmoid'))
mdl.add(Dense(1, activation='sigmoid'))
mdl.compile(optimizer = 'adam', loss='binary_crossentropy')
mdl.fit(x, y, epochs = 30)
Note that a single sigmoid unit in the last layer of a classification model only works when there are 2 classes. With more classes you should use softmax.
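One way to see why a single sigmoid unit is enough for 2 classes: it gives the same probability as a two-class softmax whose first logit is fixed at 0. The NumPy sketch below checks this on a few arbitrary logits chosen for illustration.

```python
import numpy as np

z = np.array([-2.0, 0.0, 1.5])            # arbitrary logits for class 1

# Probability of class 1 from a single sigmoid unit.
sigmoid = 1.0 / (1.0 + np.exp(-z))

# Two-class softmax over the logits [0, z]; take the probability of class 1.
logits = np.stack([np.zeros_like(z), z], axis=-1)
exp = np.exp(logits)
softmax_class1 = exp[:, 1] / exp.sum(axis=-1)
```

With three or more classes there is no such shortcut, which is why softmax with one unit per class is the usual choice there.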
I recommend you check this page: https://keras.io/
Also, I think keras is better to begin with than tensorflow.
I'm attempting to train a 3-layer dense neural network using Keras with a GPU-enabled TensorFlow backend.
The dataset I have is 4 million 20x40px images that I placed in directories with the name of the category they belong to.
Because of the large amount of data I can't just load it all into RAM and feed it to my model so I thought using Keras's ImageDataGenerator, specifically the function flow_from_directory() would do the trick. This yields a tuple of (x, y) where x is the numpy array of the image and y is the label of the image.
I expected the model to know to take the numpy array as its input, so I set up my input shape as (None,20,40,3), where None is the batch size, 20 and 40 are the size of the image, and 3 is the number of channels in the image. This does not work, however: when I try to train my model I keep getting the error:
ValueError: Error when checking target: expected dense_3 to have 4 dimensions, but got array with shape (1024, 2)
I know the cause is that it is getting the tuple from flow_from_directory, and I guess I could change the input shape to match. However, I fear that this would render my model useless, as I will be using images to make predictions, not a pre-categorized tuple. So my question is: how can I get flow_from_directory to feed the image to my model and only use the tuple to validate its training? Am I misunderstanding something here?
For reference, here is my code:
from keras.models import Model
from keras.layers import *
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import TensorBoard
# Prepare the Image Data Generator.
train_datagen = ImageDataGenerator()
test_datagen = ImageDataGenerator()
train_generator = train_datagen.flow_from_directory(
    '/path/to/train_data/',
    target_size=(20, 40),
    batch_size=1024,
)
test_generator = test_datagen.flow_from_directory(
    '/path/to/test_data/',
    target_size=(20, 40),
    batch_size=1024,
)
# Define input tensor.
input_t = Input(shape=(20,40,3))
# Now create the layers and pass the input tensor to it.
hidden_1 = Dense(units=32, activation='relu')(input_t)
hidden_2 = Dense(units=16)(hidden_1)
prediction = Dense(units=1)(hidden_2)
# Now put it all together and create the model.
model = Model(inputs=input_t, outputs=prediction)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
# Prepare Tensorboard callback and start training.
tensorboard = TensorBoard(log_dir='./graph', histogram_freq=0, write_graph=True, write_images=True)
print(test_generator)
model.fit_generator(
    train_generator,
    steps_per_epoch=2000,
    epochs=100,
    validation_data=test_generator,
    validation_steps=800,
    callbacks=[tensorboard]
)
# Save trained model.
model.save('trained_model.h5')
Your input shape is wrong for Dense layers.
Dense layers expect inputs in the shape (None,length).
You'll either need to reshape your inputs so that they become vectors:
imageBatch=imageBatch.reshape((imageBatch.shape[0],20*40*3))
Or use convolutional layers, which expect that kind of input shape, (None,nRows,nCols,nChannels), as in TensorFlow.
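The shape mismatch can be reproduced with plain NumPy. The sketch below assumes, as a simplification, that a dense layer is a matrix product along the last axis with bias and activation left out; the batch of 8 zero images and the zero kernel are stand-ins for illustration.

```python
import numpy as np

image_batch = np.zeros((8, 20, 40, 3))    # small stand-in batch of images
kernel = np.zeros((3, 32))                # kernel for 32 units on a last axis of size 3

# Multiplying along the last axis only leaves a 4D input as a 4D output.
unflattened_out = image_batch @ kernel    # shape (8, 20, 40, 32)

# Flattening first gives the (batch, length) shape that lines up with 2D labels.
flat = image_batch.reshape((image_batch.shape[0], 20 * 40 * 3))
```

This is consistent with the error message in the question: the last layer still produces a 4D tensor, while the generator yields labels of shape (1024, 2).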
It's commonplace for various neural network architectures in NLP and vision-language problems to tie the weights of an initial word embedding layer to those of an output softmax. Usually this produces a boost in sentence generation quality. (see example here)
In Keras it's typical to build word embedding layers using the Embedding class; however, there seems to be no easy way to tie the weights of this layer to the output softmax. Would anyone happen to know how this could be implemented?
Be aware that Press and Wolf don't propose freezing the weights to some pretrained ones, but tying them. That means ensuring that the input and output weights are always the same during training (in the sense of being synchronized).
In a typical NLP model (e.g. language modelling/translation), you have an input dimension (vocabulary) of size V and a hidden representation size H. Then, you start with an Embedding layer, which is a matrix VxH. And the output layer is (probably) something like Dense(V, activation='softmax'), which is a matrix H2xV. When tying the weights, we want those matrices to be the same (therefore, H==H2).
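The shape argument can be checked with a small NumPy sketch in which one matrix serves both roles. The sizes, token ids, and random values are illustrative assumptions; no Keras API is involved.

```python
import numpy as np

V, H = 6, 3                               # illustrative vocabulary and hidden sizes
rng = np.random.default_rng(0)
embedding = rng.normal(size=(V, H))       # the single shared matrix, V x H

# Input side: embedding lookup selects rows of the matrix.
token_ids = np.array([4, 1, 1])
embedded = embedding[token_ids]           # shape (3, H)

# Output side: the transposed matrix maps hidden states to vocabulary logits.
hidden = rng.normal(size=(3, H))          # stand-in for the model's hidden states
logits = hidden @ embedding.T             # (3, H) times (H, V) gives (3, V)
```

Because both sides read the same array, any update to it changes the input embeddings and the output projection together, which is what tying means here.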
For doing this in Keras, I think the way to go is via shared layers:
In your model, you need to instantiate a shared embedding layer (of dimension VxH) and apply it to both your input and your output. But you need to transpose it to get the desired output dimensions (HxV). So, we declare a TiedEmbeddingsTransposed layer, which transposes the embedding matrix from a given layer (and applies an activation function):
from keras import activations
from keras import backend as K
from keras.layers import Layer

class TiedEmbeddingsTransposed(Layer):
    """Layer for tying embeddings in an output layer.
    A regular embedding layer has the shape: V x H (V: size of the vocabulary. H: size of the projected space).
    In this layer, we'll go: H x V.
    With the same weights as the regular embedding.
    In addition, it may have an activation.
    # References
    - [Using the Output Embedding to Improve Language Models](https://arxiv.org/abs/1608.05859)
    """
    def __init__(self, tied_to=None,
                 activation=None,
                 **kwargs):
        super(TiedEmbeddingsTransposed, self).__init__(**kwargs)
        self.tied_to = tied_to
        self.activation = activations.get(activation)
    def build(self, input_shape):
        # Reuse the embedding matrix of the tied layer, transposed to H x V.
        self.transposed_weights = K.transpose(self.tied_to.weights[0])
        self.built = True
    def compute_mask(self, inputs, mask=None):
        return mask
    def compute_output_shape(self, input_shape):
        return input_shape[0], K.int_shape(self.tied_to.weights[0])[0]
    def call(self, inputs, mask=None):
        output = K.dot(inputs, self.transposed_weights)
        if self.activation is not None:
            output = self.activation(output)
        return output
    def get_config(self):
        config = {'activation': activations.serialize(self.activation)
                  }
        base_config = super(TiedEmbeddingsTransposed, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))
The usage of this layer is:
# Declare the shared embedding layer
shared_embedding_layer = Embedding(V, H)
# Obtain word embeddings
word_embedding = shared_embedding_layer(input)
# Do stuff with your model
# Compute output (e.g. a vocabulary-size probability vector) with the shared layer:
output = TimeDistributed(TiedEmbeddingsTransposed(tied_to=shared_embedding_layer, activation='softmax'))(intermediate_rep)
I have tested this in NMT-Keras and it trains properly. But when I try to load a trained model, I get an error related to the way Keras loads models: it doesn't load the weights from tied_to. I've found several questions regarding this (1, 2, 3), but I haven't managed to solve the issue. If someone has any ideas on the next steps to take, I'd be very glad to hear them :)
As you may read here, you should simply set the trainable flag to False. E.g.:
aux_output = Embedding(..., trainable=False)(input)
....
output = Dense(nb_of_classes, .. ,activation='softmax', trainable=False)