Can the input layer of keras take customized input? - machine-learning

In the documentation, the parameters of model.fit() are
fit(self, x, y, batch_size=32, nb_epoch=10, verbose=1, callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0)
My question is: can I have three different tensors as input, so that I can call something like fit(x, y, z)?
PS:
Sorry for the ambiguity. I believe fit(x_val, y_val) in Keras acts similarly to feed_dict={x: x_val, y: y_val} in TensorFlow; I am just wondering whether I can feed in more values that I created in the model.

The X passed to fit() is already a tensor (an array of feature rows). For example:
X = [[1,2],[2,3],[3,4]]
y = [3,5,7]
(The fitted relation here would be y = X[0] + X[1].)
So if it is clear which of your values are features (for example, the x and y in your question) and which is the target (z), you can define:
_X = (x, y)  # two nodes in the input layer
_y = z
fit(_X, _y)
Is that what you mean?
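For a concrete multi-input example, here is a minimal sketch assuming the Keras 2 functional API (the fit() signature quoted above, with nb_epoch, is from Keras 1, but the idea is the same): build a model with two Input layers and pass fit() a list of arrays, one per input. The layer sizes and data below are purely illustrative.
import numpy as np
from keras.layers import Input, Dense, concatenate
from keras.models import Model

x_in = Input(shape=(1,))                       # first input tensor
y_in = Input(shape=(1,))                       # second input tensor
out = Dense(1)(concatenate([x_in, y_in]))      # one output node

model = Model(inputs=[x_in, y_in], outputs=out)
model.compile(optimizer='sgd', loss='mse')

x = np.array([[1.], [2.], [3.]])
y = np.array([[2.], [3.], [4.]])
z = np.array([3., 5., 7.])                     # target: z = x + y

model.fit([x, y], z, epochs=200, verbose=0)    # a list of arrays, one per Input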

Related

Pytorch Neural Network Errors

I am trying to compute the loss and accuracy of a certain machine learning model by using Pytorch and I am having trouble initializing the dataset so that it can run. Using the Moon dataset, I am getting a few errors when I run the code. I first initialize the dataset:
X, y = make_moons(200, noise=0.2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=1, stratify = y)
x, y = Variable(torch.from_numpy(X_train)).float(), Variable(torch.from_numpy(y_train)).float()
and then when I run the Neural Network:
class SoftmaxRegression(nn.Module):
    def __init__(self):
        super(SoftmaxRegression, self).__init__()
        self.fc = nn.Linear(200, 1)
        self.softmax = nn.Softmax()

    def forward(self, x):
        x = self.fc(x)
        x = self.softmax(x)
        return x
I get the following errors:
UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
x = F.softmax(self.layer(x))
ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
IndexError: Target 1 is out of bounds.
How can I fix this so that it can run the dataset and output the loss and accuracy?
(Sorry to put this as an answer but unfortunately stack overflow won't let me comment :/).
Even if the Softmax worked, it would be pointless here (unless you are softmaxing across your batch, but that would be really weird). Your code shows a linear layer going from a 200-dimensional input to a single output. Softmax over a single value always returns 1, so softmax should only be used across 2 or more values.
If you wish to do binary classification, I would instead change the forward pass to this:
import torch.nn.functional as F

def forward(self, x):
    x = self.fc(x)
    x = F.sigmoid(x)
    return x
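For completeness, here is a minimal end-to-end sketch of that binary-classification route (my own illustration, not the asker's code): it uses nn.BCEWithLogitsLoss, which applies the sigmoid internally, and the linear layer takes 2 input features, since each moons sample has two coordinates (the 200 in the original code is the number of samples, not the feature size).
import torch
import torch.nn as nn
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split

X, y = make_moons(200, noise=0.2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=1, stratify=y)
x_t = torch.from_numpy(X_train).float()          # [n_samples, 2]
y_t = torch.from_numpy(y_train).float()          # [n_samples], values 0.0 / 1.0

class LogisticRegression(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(2, 1)                # 2 features in, 1 logit out
    def forward(self, x):
        return self.fc(x)                        # raw logit; sigmoid lives in the loss

model = LogisticRegression()
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for _ in range(100):
    optimizer.zero_grad()
    loss = criterion(model(x_t).squeeze(1), y_t)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    preds = (torch.sigmoid(model(x_t).squeeze(1)) > 0.5).float()
    accuracy = (preds == y_t).float().mean().item()
print(loss.item(), accuracy)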

How to set target in cross entropy loss for pytorch multi-class problem

Problem Statement: I have an image, and a pixel of the image can belong to only one of 'Band5', 'Band6', 'Band7' (see below for details). Hence, I have a PyTorch multi-class problem, but I am unable to understand how to set the targets, which need to be in the form [batch, w, h].
My dataloader returns two values:
x = chips.loc[:, :, :, self.input_bands]
y = chips.loc[:, :, :, self.output_bands]
x = x.transpose('chip','channel','x','y')
y_ohe = y.transpose('chip','channel','x','y')
Also, I have defined:
input_bands = ['Band1','Band2', 'Band3', 'Band3', 'Band4'] # input classes
output_bands = ['Band5','Band6', 'Band7'] #target classes
model = ModelName(num_classes = 3, depth=default_depth, in_channels=5, merge_mode='concat').to(device)
loss_new = nn.CrossEntropyLoss()
In my training function:
#get values from dataloader
X = normalize_zero_to_one(X) #input
y = normalize_zero_to_one(y) #target
images = Variable(torch.from_numpy(X)).to(device) # [batch, channel, H, W]
masks = Variable(torch.from_numpy(y)).to(device)
optim.zero_grad()
outputs = model(images)
loss = loss_new(outputs, masks) # (preds, target)
loss.backward()
optim.step() # Update weights
I know that the target (here masks) should be [batch_size, w, h]. However, it is currently [batch_size, channels, w, h].
I read a lot of posts, including 1 and 2, and they say the target should only contain the target class indices. I don't understand how I can combine the indices of the three classes and still get a target of shape [batch_size, w, h].
Right now, I get the error:
RuntimeError: only batches of spatial targets supported (3D tensors) but got targets of dimension: 4
To the best of my understanding, I don't need to do any one-hot encoding. Similar errors and explanations I found on the internet are here:
Reference 1
Reference 2
Reference 3
Reference 4
Any help will be appreciated! Thank you.
If I understand correctly, your current "target" is [batch_size, channels, w, h] with channels == 3, since you have three possible targets.
What do the values in your target represent? You basically have a 3-vector target for each pixel - are these the expected class probabilities? Are they "one-hot vectors" indicating the correct "band"?
If so, you can get the target indices by simply taking the argmax along the target channel dimension:
proper_target = torch.argmax(masks, dim=1) # make sure keepdim=False
loss = loss_new(outputs, proper_target)
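As a quick shape check (the tensor names below are illustrative, not from the original code): a one-hot mask of shape [batch, 3, h, w] reduces to the [batch, h, w] index target that nn.CrossEntropyLoss expects.
import torch
import torch.nn as nn

batch, n_classes, h, w = 4, 3, 8, 8
outputs = torch.randn(batch, n_classes, h, w)            # model logits
class_idx = torch.randint(0, n_classes, (batch, h, w))   # per-pixel class index
masks = nn.functional.one_hot(class_idx, n_classes).permute(0, 3, 1, 2).float()

proper_target = torch.argmax(masks, dim=1)               # back to [batch, h, w]
loss = nn.CrossEntropyLoss()(outputs, proper_target)
print(proper_target.shape, loss.item())                  # torch.Size([4, 8, 8]) and a scalar loss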

How to scale / impute a tensor in a `sklearn` pipeline for input to a Keras LSTM

How can I use sklearn scaler / imputer to impute a tensor? I want to scale / impute within a pipeline. My input is a 3-d numpy array.
I have a tensor of shape (n_samples, n_timesteps, n_feat), a la Keras. This is a sequence that can be learned by an LSTM. I want to scale / impute first, however. In particular, I want to scale on the fly inside a scikit-learn pipeline, since scaling the full dataset, which would be easy, leads to leakage. Keras already integrates with sklearn (see here), but there do not appear to be easy ways to scale and impute the tensors that Keras time-series models process.
Unfortunately, the following gives an error
import numpy as np
X = np.array([[[3, 5], [6, 2]], [[8., 23.], [7., 23]], [[3, 4], [2, 55]]])
print(X)
from sklearn.preprocessing import StandardScaler
s = StandardScaler()
X = s.fit_transform(X)
print(X)
The error is to the effect that the scaler only works on 2-d numpy arrays.
My solution was to add a decorator to the sklearn preprocessing data.py file:
def flat(func):
    def wrapper(*args, **kwargs):
        self, X = args
        a, b, c = X.shape
        X = X.reshape(a, b * c)
        r = func(self, X, **kwargs)
        if hasattr(r, 'ndim'):
            X = r.reshape(a, b, c)
            return X
        else:
            return r
    return wrapper
Then use it on the functions, e.g. fit:
@flat
def fit(self, X, y=None):
    """Compute the mean and std to be used for later scaling.

    Parameters
    ----------
    X : {array-like, sparse matrix}, shape [n_samples, n_features]
        The data used to compute the mean and standard deviation
        used for later scaling along the features axis.

    y : Passthrough for ``Pipeline`` compatibility.
    """
    # Reset internal state before fitting
    self._reset()
    return self.partial_fit(X, y)
This works well; with the same script as above, I get
[[[ 3. 5.]
[ 6. 2.]]
[[ 8. 23.]
[ 7. 23.]]
[[ 3. 4.]
[ 2. 55.]]]
[[[-0.70710678 -0.64906302]
[ 0.46291005 -1.13191668]]
[[ 1.41421356 1.41266656]
[ 0.9258201 -0.16825789]]
[[-0.70710678 -0.76360355]
[-1.38873015 1.30017457]]]
But beware: it doesn't check for 2-d arrays, which it can't process (the reshape assumes three dimensions). So use the normal preprocessing module for 2-d arrays!
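As an alternative that avoids patching sklearn's source, a small wrapper transformer can do the same flatten / scale / restore dance and still drop into a Pipeline. This is only a sketch with an illustrative class name:
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.preprocessing import StandardScaler

class FlattenedScaler(BaseEstimator, TransformerMixin):
    def __init__(self):
        self.scaler = StandardScaler()
    def fit(self, X, y=None):
        a, b, c = X.shape
        self.scaler.fit(X.reshape(a, b * c))     # flatten timesteps x features
        return self
    def transform(self, X):
        a, b, c = X.shape
        return self.scaler.transform(X.reshape(a, b * c)).reshape(a, b, c)

X = np.array([[[3, 5], [6, 2]], [[8., 23.], [7., 23]], [[3, 4], [2, 55]]])
print(FlattenedScaler().fit_transform(X))        # same output as the decorator above
The same pattern should work with an imputer in place of the scaler.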

XGBoost plot_importance cannot show feature names

I used plot_importance to show the important variables. But some variables are categorical, so I did some transformations. After I transformed the type of the variables, the plot of feature importances no longer shows the feature names. I attached my code and the plot.
dataset = data.values
X = dataset[1:100, 0:-2]
predictors = dataset[1:100, -1]
X = X.astype(str)
encoded_x = None
for i in range(0, X.shape[1]):
    label_encoder = LabelEncoder()
    feature = label_encoder.fit_transform(X[:, i])
    feature = feature.reshape(X.shape[0], 1)
    onehot_encoder = OneHotEncoder(sparse=False)
    feature = onehot_encoder.fit_transform(feature)
    if encoded_x is None:
        encoded_x = feature
    else:
        encoded_x = np.concatenate((encoded_x, feature), axis=1)
print("X shape: : ", encoded_x.shape)
response='Default'
#predictors=list(data.columns.values[:-1])
# Randomly split indexes
X_train, X_test, y_train, y_test = train_test_split(encoded_x,predictors,train_size=0.7, random_state=5)
model = XGBClassifier()
model.fit(X_train, y_train)
plot_importance(model)
plt.show()
[Plot of feature importances without the feature names]: https://i.stack.imgur.com/M9qgY.png
This is the expected behaviour: sklearn.OneHotEncoder.transform() returns a 2-d numpy array instead of the input pd.DataFrame (I assume that is the type of your dataset), so the feature names are lost along the way. It is not a bug, but a feature. There does not appear to be a way to pass feature names manually in the sklearn API (it is possible to set them at xgb.DMatrix creation in the native training API).
However, your problem is easily solvable with pd.get_dummies() instead of the LabelEncoder + OneHotEncoder combination that you have implemented. I do not know why you chose the latter (it can be useful if you also need to handle a test set, but then you need some extra tricks), but I would advise in favour of pd.get_dummies().
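For illustration, a minimal sketch of the pd.get_dummies() route, assuming data is a pandas DataFrame whose last column is the target (mirroring the slicing in the question); fitting on a DataFrame lets plot_importance pick up the column names.
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier, plot_importance

features = data.iloc[1:100, 0:-2]                # same slicing as in the question
target = data.iloc[1:100, -1]

encoded = pd.get_dummies(features.astype(str))   # keeps column names, e.g. "colour_red"

X_train, X_test, y_train, y_test = train_test_split(
    encoded, target, train_size=0.7, random_state=5)

model = XGBClassifier()
model.fit(X_train, y_train)                      # fitting on a DataFrame keeps the names
plot_importance(model)                           # the plot should now show feature names
plt.show()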

How tf.gradients work in TensorFlow

Given the following linear model, I would like to get the gradient vector with respect to W and b.
# tf Graph Input
X = tf.placeholder("float")
Y = tf.placeholder("float")
# Set model weights
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")
# Construct a linear model
pred = tf.add(tf.mul(X, W), b)
# Mean squared error
cost = tf.reduce_sum(tf.pow(pred-Y, 2))/(2*n_samples)
However, if I try something like the following, where cost is a function cost(x, y, w, b) and I only want the gradients with respect to w and b:
grads = tf.gradients(cost, tf.all_variables())
My placeholders (X and Y) will also be included.
Even if I do get a gradient over [x, y, w, b], how do I know which element of the gradient belongs to which parameter, since the result is just a list with no names indicating which parameter each derivative was taken with respect to?
In this question I'm using parts of this code and I build on this question.
Quoting the docs for tf.gradients
Constructs symbolic partial derivatives of sum of ys w.r.t. x in xs.
So, this should work:
dc_dw, dc_db = tf.gradients(cost, [W, b])
Here, tf.gradients() returns the gradient of cost wrt each tensor in the second argument as a list in the same order.
Read tf.gradients for more information.
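To evaluate those gradients, you can run them in a session; the unpacking above also answers the naming concern, since the returned list matches the order of the variables you pass in. A minimal sketch, assuming the TF 1.x-style graph from the question and illustrative numpy arrays train_X, train_Y to feed the placeholders:
dc_dw, dc_db = tf.gradients(cost, [W, b])        # one gradient per listed variable, in order

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # train_X, train_Y stand in for whatever arrays you feed X and Y with
    gw, gb = sess.run([dc_dw, dc_db], feed_dict={X: train_X, Y: train_Y})
    print("dcost/dW =", gw, "  dcost/db =", gb)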
