I'm planning to have the following design:
However, my code doesn't seem to work:
import numpy as np
from keras.models import Model
from keras.layers import Dense, Input, Concatenate
from keras import optimizers
trainX1 = np.array([[1,2],[3,4],[5,6],[7,8]]) # fake training data
trainY1 = np.array([[1],[2],[3],[4]]) # fake label
trainX2 = np.array([[2,3],[4,5],[6,7]])
trainY2 = np.array([[1],[2],[3]])
trainX3 = np.array([[0,1],[2,3]])
trainY3 = np.array([[1],[2]])
numFeatures = 2
trainXList = [trainX1, trainX2, trainX3]
trainYStack = np.vstack((trainY1,trainY2,trainY3))
inputList = []
modelList = []
for i,_ in enumerate(trainXList):
    tempInput= Input(shape = (numFeatures,))
    m = Dense(10, activation='tanh')(tempInput)
    inputList.append(tempInput)
    modelList.append(m)
mAll = Concatenate()(modelList)
out = Dense(1, activation='tanh')(mAll)
model = Model(inputs=inputList, outputs=out)
rmsp = optimizers.rmsprop(lr=0.00001)
model.compile(optimizer=rmsp,loss='mse', dropout = 0.1)
model.fit(trainXList, trainYStack, epochs = 1, verbose=0)
The error message says that my input data sets do not have the same shape, but after I padded my training sets so that all 3 have 4 samples, I still get an error saying the dimensions are not right. How can I design this network properly? Thanks!
p.s. Here is the error message before padding:
ValueError: All input arrays (x) should have the same number of samples. Got array shapes: [(4, 2), (3, 2), (2, 2)]
Here is the error message after padding (happens on the last line of code):
ValueError: Input arrays should have the same number of samples as target arrays. Found 4 input samples and 12 target samples.
Your input shape does not match the data you pass in.
You give each input a shape of (numFeatures,), but the arrays you feed are 2-dimensional and have different numbers of rows: (4,2), (3,2), (2,2). I am not sure about your problem, but the number of samples and the number of features seem to be reversed.
tempInput= Input(shape = (numFeatures,))
Furthermore, your y also looks odd. Usually X has shape (number of samples, number of features) and y has shape (number of samples, number of labels).
Use model.summary() to see what your network looks like.
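The two error messages in the question both come down to sample counts: every input array and the target array need the same number of rows. Here is a minimal NumPy-only sketch of that rule. The helper check_multi_input_shapes is hypothetical (it is not a Keras function), and the zero padding rows and the 4-row target are made up for illustration.

```python
import numpy as np

def check_multi_input_shapes(x_list, y):
    """Hypothetical helper mirroring the sample-count checks behind the two errors."""
    n_samples = {x.shape[0] for x in x_list}
    if len(n_samples) != 1:
        raise ValueError("inputs differ in number of samples: %s"
                         % [x.shape for x in x_list])
    n = n_samples.pop()
    if y.shape[0] != n:
        raise ValueError("found %d input samples and %d target samples"
                         % (n, y.shape[0]))
    return n

# Three inputs padded to 4 rows each (padding values are made up)
x1 = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
x2 = np.array([[2, 3], [4, 5], [6, 7], [0, 0]])
x3 = np.array([[0, 1], [2, 3], [0, 0], [0, 0]])

# One target row per sample, not a stack of all three label arrays
y = np.array([[1], [2], [3], [4]])

n = check_multi_input_shapes([x1, x2, x3], y)
```

With a stacked 12-row target the same check raises, which matches the "4 input samples and 12 target samples" message: in a multi-input model one sample is one row taken from each input together with one target row.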
I want to train a model, and I use MinMaxScaler to normalize my data. However, I don't know how to use it to convert the predicted values back to the original scale.
Here is a simple example of my data.
import numpy as np
from sklearn import preprocessing
min_max_scaler = preprocessing.MinMaxScaler()
data_t = np.random.randint(10, size = (6, 1000))
data_te = np.random.randint(10, size = (6, 100))
data_train = min_max_scaler.fit_transform(data_t)
data_test = min_max_scaler.fit_transform(data_te)
# prediction is output from the model and its size is 1*28.
prediction = np.random.randint(1, size = (1,28))
prediction_values = min_max_scaler.inverse_transform(prediction)
but I get this error:
ValueError: operands could not be broadcast together with shapes (1,28) (100,) (1,28)
Could you please help me with this? Thanks
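The broadcast error arises because the scaler was last fitted on an array with 100 columns, while the prediction has 28. One common way around it, sketched here under the assumption that the model's 28 outputs correspond to a 28-column target array, is to fit a separate scaler on the targets and invert with that one. The names x_scaler, y_scaler and y_train and the random data are made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

rng = np.random.RandomState(0)
x_train = rng.randint(10, size=(6, 1000)).astype(float)  # made-up features
y_train = rng.randint(10, size=(6, 28)).astype(float)    # made-up targets (28 columns is an assumption)

x_scaler = MinMaxScaler().fit(x_train)  # scaler for the model inputs
y_scaler = MinMaxScaler().fit(y_train)  # separate scaler for the targets

# Stand-in for a model output in scaled units, shape (1, 28)
prediction = y_scaler.transform(y_train[:1])

# The column count matches what y_scaler was fitted on, so this works
prediction_values = y_scaler.inverse_transform(prediction)
```

The same idea applies to the test features: calling transform with the scaler fitted on the training data, rather than fit_transform again, keeps train and test on one scale.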
I am trying to experiment with Holt-Winters using some random data. However, using the statsmodels API I am unable to predict the next X data points.
Here is my sample code. I don't understand the predict API and what its start and end arguments mean.
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.holtwinters import ExponentialSmoothing
data = np.linspace(start=15, stop=25, num=100)
noise = np.random.uniform(0, 1, 100)
data = data + noise
split = int(len(data)*0.7)
data_train = data[0:split]
data_test = data[-(len(data) - split):]
model = ExponentialSmoothing(data_train)
model_fit = model.fit()
# make prediction
pred = model_fit.predict(split+1, len(data))
test_index = [i for i in range(split, len(data))]
plt.plot(data_train, label='Train')
plt.plot(test_index, data_test, label='Test')
plt.plot(test_index, pred, label='Prediction')
plt.legend(loc='best')
plt.show()
I get a weird graph for prediction and I believe it has something to do with my understanding of the predict API.
The exponential smoothing model you've chosen doesn't include a trend, so it is forecasting the best level, and that gives a horizontal line forecast.
If you do:
model = ExponentialSmoothing(data_train, trend='add')
then you will get a trend, and likely it will look more like you expect.
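To see why a level-only model forecasts a flat line, the two update rules can be sketched in plain NumPy. This is a simplified illustration, not the statsmodels implementation: the function names are made up, and the smoothing parameters alpha and beta are fixed by hand here, whereas statsmodels estimates them when fitting.

```python
import numpy as np

def ses_forecast(y, horizon, alpha=0.5):
    """Simple exponential smoothing: level only, so every forecast step is the same value."""
    level = y[0]
    for value in y[1:]:
        level = alpha * value + (1 - alpha) * level
    return np.full(horizon, level)

def holt_forecast(y, horizon, alpha=0.5, beta=0.5):
    """Holt's additive-trend method: level plus trend, so the forecast is a sloped line."""
    level, trend = y[0], y[1] - y[0]
    for value in y[1:]:
        prev_level = level
        level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + trend * np.arange(1, horizon + 1)

y = np.linspace(15, 25, 70)            # noiseless ramp, similar in spirit to the training data
flat = ses_forecast(y, horizon=30)     # constant forecast
sloped = holt_forecast(y, horizon=30)  # forecast that keeps rising
```

On this ramp the level-only forecast repeats one value for all 30 steps, while the trend version continues the slope of the data.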
For example:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
# Simulate some data
np.random.seed(12346)
dta = pd.Series(np.arange(100) + np.sin(np.arange(100)) * 5 + np.random.normal(scale=4, size=100))
# Perform exponential smoothing, no trend
mod1 = sm.tsa.ExponentialSmoothing(dta)
res1 = mod1.fit()
fcast1 = res1.forecast(30)
plt.plot(dta)
plt.plot(fcast1, label='Model without trend')
# Perform exponential smoothing, with a trend
mod2 = sm.tsa.ExponentialSmoothing(dta, trend='add')
res2 = mod2.fit()
fcast2 = res2.forecast(30)
plt.plot(fcast2, label='Model with trend')
plt.legend(loc='lower right')
gives the following:
I am trying to do credit card fraud prediction with Keras.
For that, I have a creditcard.csv file with over 280,000 cases, each labeled as fraud or valid.
My problem is that my code does run, but in the first epoch my accuracy is already 0.9979, and from the second epoch on it is 0.9982.
That doesn't seem very realistic to me, but I don't know where my mistake is.
Here is the shortened version of my code:
import pandas as pd
import numpy as np
from keras import models
from keras import layers
combinedData = pd.read_csv('creditcard.csv')
trainData = combinedData[:227845]
testData = combinedData[227845:]
trainDataFactors = trainData.copy()
del trainDataFactors['Class']
trainDataLabels = pd.DataFrame(trainData, columns=['Class'])
testDataFactors = testData.copy()
del testDataFactors['Class']
testDataLabels = pd.DataFrame(testData, columns=['Class'])
model = models.Sequential()
model.add(layers.Dense(30, activation="relu", input_shape = (30, )))
model.add(layers.Dense(60, activation ="relu"))
model.add(layers.Dense(30, activation="sigmoid"))
model.compile(
    optimizer = "rmsprop",
    loss = "sparse_categorical_crossentropy",
    metrics = ["accuracy"]
)
history = model.fit(
    trainDataFactors, trainDataLabels,
    epochs = 20,
    batch_size = 512,
    validation_data=(testDataFactors, testDataLabels)
)
I appreciate any help!
Is your test data balanced?
If it is not, e.g. because it is a collection of real data, I'd guess that a degenerate model answering "valid" for every input could easily get > 99 % accuracy. Try also reporting the F1 score, which is a common default choice for (unbalanced) detection tasks.
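The effect is easy to reproduce with made-up numbers. The sketch below assumes 1000 cases of which 2 are fraud (the real class ratio in creditcard.csv is not given here, so this ratio is only an illustration) and scores a model that always answers "valid". The helper f1_for_positive_class is written out by hand for the example.

```python
import numpy as np

# Made-up labels: 1000 cases, 2 of them fraud (class 1)
y_true = np.zeros(1000, dtype=int)
y_true[:2] = 1
y_pred = np.zeros(1000, dtype=int)  # degenerate model: always predicts "valid" (class 0)

def f1_for_positive_class(y_true, y_pred):
    """F1 score of the fraud class, computed from true/false positives and false negatives."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    if tp == 0:
        return 0.0  # no fraud case was found, so precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

accuracy = np.mean(y_true == y_pred)
f1 = f1_for_positive_class(y_true, y_pred)
print(accuracy, f1)
```

Accuracy comes out at about 0.998 even though not a single fraud case is detected, while the F1 score for the fraud class is 0.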
I'm a beginner in deep learning.
I'm trying to identify slums using satellite images (Google Maps) of Pune city. In the training dataset I have provided about 100 images of slums and 100 images of other areas. But my model is not able to classify input images properly even though the accuracy is high.
I think this might be because of the dimensions of the images.
I'm resizing all images to 128*128 pixels.
Kernel size is 3*3.
Link to the map:
https://www.google.co.in/maps/#18.5129661,73.822531,286m/data=!3m1!1e3?hl=en
Following is the code
import os,cv2
import glob
import numpy as np
from keras.utils import plot_model
from keras.utils.np_utils import to_categorical
from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split
from keras.models import Model
from keras.layers import Input, Convolution2D, MaxPooling2D, Flatten, Dense, Dropout
PATH = os.getcwd()
data_path = PATH + '/dataset/*'
files = glob.glob(data_path)
X = []
for myFiles in files:
    image = cv2.imread(myFiles)
    image_resize = cv2.resize(image, (256, 256))
    X.append(image_resize)
image_data = np.array(X)
image_data = image_data.astype('float32')
image_data /= 255
print("Image_data shape ", image_data.shape)
no_of_classes = 2
no_of_samples = image_data.shape[0]
label = np.ones(no_of_samples, dtype='int64')
label[0:86] = 0 #Slum
label[87:] = 1 #noSlum
Y = to_categorical(label, no_of_classes)
#shuffle dataset
x,y = shuffle(image_data , Y, random_state = 2)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state = 2)
#print(x_train)
#print(y_train)
input_shape = image_data[0].shape
input = Input(input_shape)
conv_1 = Convolution2D(32,(3,3), padding='same', activation='relu')(input)
conv_2 = Convolution2D(32,(3,3), padding = 'same', activation = 'relu')(conv_1)
pool_1 = MaxPooling2D(pool_size = (2,2))(conv_2)
drop_1 = Dropout(0.5)(pool_1)
conv_3 = Convolution2D(64,(3,3), padding='same', activation='relu')(drop_1)
conv_4 = Convolution2D(64,(3,3), padding='same', activation = 'relu')(conv_3)
pool_2 = MaxPooling2D(pool_size = (2,2))(conv_4)
drop_2 = Dropout(0.5)(pool_2)
flat_1 = Flatten()(drop_2)
hidden = Dense(64,activation='relu')(flat_1)
drop_3 = Dropout(0.5)(hidden)
out = Dense(no_of_classes,activation = 'softmax')(drop_3)
model = Model(inputs = input, outputs = out)
model.compile(loss = 'categorical_crossentropy', optimizer = 'rmsprop', metrics= ['accuracy'])
model.fit(x_train,y_train,batch_size=10,nb_epoch=20,verbose =1, validation_data=(x_test,y_test))
model.save('model.h5')
score = model.evaluate(x_test,y_test,verbose=1)
print('Test Loss: ',score[0])
print('Test Accuracy: ',score[1])
test_image = x_test[0:1]
print(test_image.shape)
print (model.predict(test_image))
Usually, the behavior you've described points to the NN's inability to identify small objects in the input images. Just imagine being given a 128*128 image of rough noise in which nothing can be seen; would you expect the NN to classify objects correctly?
What to do?
1) Manually resize some input images from your dataset to 128*128 and look at the data you are actually training your NN on. This will give you more insight; maybe you need a larger image size.
2) Add more Conv layers with more neurons; the added non-linearity will give the network the ability to detect small and more sophisticated objects. Look up well-known network architectures such as ResNet.
3) Add more training data; 100 images isn't enough for a good result.
4) Add data augmentation as well (rotations seem especially promising in your case).
And don't give up :) Eventually you'll figure it out. Good luck.
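The rotation-based augmentation from point 4 can be sketched with NumPy alone. This is a minimal illustration that assumes square images (so that 90-degree rotations keep the shape); the function name and the random stand-in image are made up, and arbitrary-angle rotations would need an image library instead.

```python
import numpy as np

def augment_with_rotations_and_flips(image):
    """Return the 8 rotated/mirrored variants of a square (H, W, C) image array."""
    variants = []
    for k in range(4):  # 0, 90, 180 and 270 degrees
        rotated = np.rot90(image, k=k, axes=(0, 1))
        variants.append(rotated)
        variants.append(np.flip(rotated, axis=1))  # horizontal mirror of each rotation
    return np.stack(variants)

image = np.random.rand(128, 128, 3).astype("float32")  # stand-in for one satellite tile
batch = augment_with_rotations_and_flips(image)
print(batch.shape)
```

Each variant keeps the label of the original image, so the label row has to be repeated once per variant when the augmented batch is added to the training set.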
So this question is about GANs.
I am trying to do a trivial example for my own proof of concept; namely, generate images of handwritten digits (MNIST). While most will approach this via deep convolutional GANs (dcGANs), I am just trying to achieve this via a 1D array (i.e. instead of 28x28 gray-scale pixel values, a 28*28 1D array).
This git repo features a "vanilla" GAN which treats the MNIST dataset as a 1D array of 784 values. Their output values look pretty acceptable, so I wanted to do something similar.
Import statements
from __future__ import print_function
import matplotlib as mpl
from matplotlib import pyplot as plt
import mxnet as mx
from mxnet import nd, gluon, autograd
from mxnet.gluon import nn, utils
import numpy as np
import os
from math import floor
from random import random
import time
from datetime import datetime
import logging
ctx = mx.gpu()
np.random.seed(3)
Hyper parameters
batch_size = 100
epochs = 100
generator_learning_rate = 0.001
discriminator_learning_rate = 0.001
beta1 = 0.5
latent_z_size = 100
Load data
mnist = mx.test_utils.get_mnist()
# convert imgs to arrays
flattened_training_data = mnist["test_data"].reshape(10000, 28*28)
define models
G = nn.Sequential()
with G.name_scope():
    G.add(nn.Dense(300, activation="relu"))
    G.add(nn.Dense(28 * 28, activation="tanh"))
D = nn.Sequential()
with D.name_scope():
    D.add(nn.Dense(128, activation="relu"))
    D.add(nn.Dense(64, activation="relu"))
    D.add(nn.Dense(32, activation="relu"))
    D.add(nn.Dense(2, activation="tanh"))
loss = gluon.loss.SoftmaxCrossEntropyLoss()
init stuff
G.initialize(mx.init.Normal(0.02), ctx=ctx)
D.initialize(mx.init.Normal(0.02), ctx=ctx)
trainer_G = gluon.Trainer(G.collect_params(), 'adam', {"learning_rate": generator_learning_rate, "beta1": beta1})
trainer_D = gluon.Trainer(D.collect_params(), 'adam', {"learning_rate": discriminator_learning_rate, "beta1": beta1})
metric = mx.metric.Accuracy()
dynamic plot (for Jupyter notebook)
import matplotlib.pyplot as plt
import time
def dynamic_line_plt(ax, y_data, colors=['r', 'b', 'g'], labels=['Line1', 'Line2', 'Line3']):
    x_data = []
    y_max = 0
    y_min = 0
    x_min = 0
    x_max = 0
    for y in y_data:
        x_data.append(list(range(len(y))))
        if max(y) > y_max:
            y_max = max(y)
        if min(y) < y_min:
            y_min = min(y)
        if len(y) > x_max:
            x_max = len(y)
    ax.set_ylim(y_min, y_max)
    ax.set_xlim(x_min, x_max)
    if ax.lines:
        for i, line in enumerate(ax.lines):
            line.set_xdata(x_data[i])
            line.set_ydata(y_data[i])
    else:
        for i in range(len(y_data)):
            l = ax.plot(x_data[i], y_data[i], colors[i], label=labels[i])
        ax.legend()
    fig.canvas.draw()
train
stamp = datetime.now().strftime('%Y_%m_%d-%H_%M')
logging.basicConfig(level=logging.DEBUG)
# arrays to store data for plotting
loss_D = nd.array([0], ctx=ctx)
loss_G = nd.array([0], ctx=ctx)
acc_d = nd.array([0], ctx=ctx)
labels = ['Discriminator Loss', 'Generator Loss', 'Discriminator Acc.']
%matplotlib notebook
fig, ax = plt.subplots(1, 1)
ax.set_xlabel('Time')
ax.set_ylabel('Loss')
dynamic_line_plt(ax, [loss_D.asnumpy(), loss_G.asnumpy(), acc_d.asnumpy()], labels=labels)
for epoch in range(epochs):
    tic = time.time()
    data_iter.reset()
    for i, batch in enumerate(data_iter):
        ####################################
        # Update Discriminator: maximize log(D(x)) + log(1-D(G(z)))
        ####################################
        # extract batch of real data
        data = batch.data[0].as_in_context(ctx)
        # add noise
        # Produce our noisy input to the generator
        latent_z = mx.nd.random_normal(0,1,shape=(batch_size, latent_z_size), ctx=ctx)
        # soft and noisy labels
        # real_label = mx.nd.ones((batch_size, ), ctx=ctx) * nd.random_uniform(.7, 1.2, shape=(1)).asscalar()
        # fake_label = mx.nd.ones((batch_size, ), ctx=ctx) * nd.random_uniform(0, .3, shape=(1)).asscalar()
        # real_label = nd.random_uniform(.7, 1.2, shape=(batch_size), ctx=ctx)
        # fake_label = nd.random_uniform(0, .3, shape=(batch_size), ctx=ctx)
        real_label = mx.nd.ones((batch_size, ), ctx=ctx)
        fake_label = mx.nd.zeros((batch_size, ), ctx=ctx)
        with autograd.record():
            # train with real data
            real_output = D(data)
            errD_real = loss(real_output, real_label)
            # train with fake data
            fake = G(latent_z)
            fake_output = D(fake.detach())
            errD_fake = loss(fake_output, fake_label)
            errD = errD_real + errD_fake
            errD.backward()
        trainer_D.step(batch_size)
        metric.update([real_label, ], [real_output,])
        metric.update([fake_label, ], [fake_output,])
        ####################################
        # Update Generator: maximize log(D(G(z)))
        ####################################
        with autograd.record():
            output = D(fake)
            errG = loss(output, real_label)
            errG.backward()
        trainer_G.step(batch_size)
        ####
        # Plot Loss
        ####
        # append new data to arrays
        loss_D = nd.concat(loss_D, nd.mean(errD), dim=0)
        loss_G = nd.concat(loss_G, nd.mean(errG), dim=0)
        name, acc = metric.get()
        acc_d = nd.concat(acc_d, nd.array([acc], ctx=ctx), dim=0)
        # plot array
        dynamic_line_plt(ax, [loss_D.asnumpy(), loss_G.asnumpy(), acc_d.asnumpy()], labels=labels)
    name, acc = metric.get()
    metric.reset()
    logging.info('Binary training acc at epoch %d: %s=%f' % (epoch, name, acc))
    logging.info('time: %f' % (time.time() - tic))
output
img = G(mx.nd.random_normal(0,1,shape=(100, latent_z_size), ctx=ctx))[0].reshape((28, 28))
plt.imshow(img.asnumpy(),cmap='gray')
plt.show()
Now this doesn't get nearly as good as the repo's example from above, although it is fairly similar.
Thus I was wondering if you could take a look and figure out:
why the colors are inverted
why the results are sub par
I have been fiddling around with this trying a lot of various things to improve the results (I will list this in a second), but for the MNIST dataset this really shouldn't be needed.
Things I have tried (and I have also tried a host of combinations):
increasing the generator network
increasing the discriminator network
using soft labeling
using noisy labeling
batch norm after every layer in the generator
batch norm of the data
normalizing all values between -1 and 1
leaky relus in the generator
drop out layers in the generator
increased learning rate of discriminator compared to generator
decreased learning rate of discriminator compared to generator
Please let me know if you have any ideas.
1) If you look at the original dataset:
training_set = mnist["train_data"].reshape(60000, 28, 28)
plt.imshow(training_set[10,:,:], cmap='gray')
you will notice that the digits are white on a black background. So, technically speaking, your results are not inverted; they match the pattern of the original images you used as real data.
If you want to invert the colors for visualization purposes, you can do that by switching to the reversed palette, which you get by appending '_r' to the palette name (this works for all color palettes):
plt.imshow(img.asnumpy(), cmap='gray_r')
You can also play with the color range by changing the vmin and vmax parameters. They control which data values are mapped to the two ends of the color range. By default they are computed automatically from the provided data.
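The effect of the reversed palette and of fixed vmin/vmax can be checked on a made-up image. This is a small sketch using matplotlib only; the 28x28 array is a stand-in for a generated digit, and the Agg backend is chosen here just so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

img = np.zeros((28, 28))   # made-up stand-in for a generated digit
img[10:18, 12:16] = 1.0    # bright "stroke" on a dark background

gray = plt.get_cmap("gray")
gray_r = plt.get_cmap("gray_r")
low, high = gray(0.0), gray(1.0)          # RGBA colors for the lowest and highest values
low_r, high_r = gray_r(0.0), gray_r(1.0)  # the same ends in the reversed palette

fig, axes = plt.subplots(1, 2)
axes[0].imshow(img, cmap="gray", vmin=0, vmax=1)    # white stroke on black
axes[1].imshow(img, cmap="gray_r", vmin=0, vmax=1)  # black stroke on white
plt.close(fig)
```

Passing vmin and vmax explicitly pins 0 and 1 to the ends of the palette, so images with different value ranges are drawn on the same scale instead of each being stretched to its own minimum and maximum.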
2) "Why the results are sub par" - I think this is exactly the reason why the community started to use dcGANs. To me, the results in the git repo you provided are quite noisy. They are indeed different from what you get, but you can achieve the same quality just by changing your activation functions from tanh to sigmoid, as in the example on GitHub:
G = nn.Sequential()
with G.name_scope():
    G.add(nn.Dense(300, activation="relu"))
    G.add(nn.Dense(28 * 28, activation="sigmoid"))
D = nn.Sequential()
with D.name_scope():
    D.add(nn.Dense(128, activation="relu"))
    D.add(nn.Dense(64, activation="relu"))
    D.add(nn.Dense(32, activation="relu"))
    D.add(nn.Dense(2, activation="sigmoid"))
Sigmoid never goes below zero, and it works better in this scenario. Here is a sample picture I get if I train the updated model for 30 epochs (the rest of the hyperparameters are the same).
If you decide to explore dcGAN to get even better results, take a look here - https://mxnet.incubator.apache.org/tutorials/unsupervised_learning/gan.html It is a well-explained tutorial on how to build a dcGAN with MXNet and Gluon. With a dcGAN you will get much better results than that.
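On the tanh versus sigmoid point, the difference in output ranges can be shown with NumPy. This sketch assumes the real images are scaled to [0, 1], which I believe is what mx.test_utils.get_mnist returns but have not verified here. The pre-activation values and the pixel array are made up, and the rescaling at the end is only a hypothetical alternative, corresponding to the "normalizing all values between -1 and 1" item in the question.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid, with outputs strictly between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-x))

pre_activations = np.linspace(-5, 5, 101)  # made-up pre-activation values
tanh_out = np.tanh(pre_activations)        # spans negative and positive values
sigmoid_out = sigmoid(pre_activations)     # stays inside (0, 1)

print("tanh range:   ", tanh_out.min(), tanh_out.max())
print("sigmoid range:", sigmoid_out.min(), sigmoid_out.max())

# Hypothetical alternative: keep tanh in the generator, but rescale the
# real pixels from [0, 1] to [-1, 1] so both ranges agree.
pixels = np.random.rand(4, 784)            # stand-in for flattened images in [0, 1]
pixels_for_tanh = pixels * 2.0 - 1.0
```

A generator ending in tanh can emit negative values that never occur in real images scaled to [0, 1], which hands the discriminator an easy cue; a sigmoid output layer keeps generated pixels in the same range as the data.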