Google Cloud ML exited with a non-zero status of 245 when training

Google Cloud ML exited with a non-zero status of 245 when training - machine-learning

I tried to train my model on Google Cloud ML using this sample code:
import keras
from keras import optimizers
from keras import losses
from keras import metrics
from keras.models import Model, Sequential
from keras.layers import Dense, Lambda, RepeatVector, TimeDistributed
import numpy as np
def test():
model = Sequential()
model.add(Dense(2, input_shape=(3,)))
model.add(RepeatVector(3))
model.add(TimeDistributed(Dense(3)))
model.compile(loss=losses.MSE,
optimizer=optimizers.RMSprop(lr=0.0001),
metrics=[metrics.categorical_accuracy],
sample_weight_mode='temporal')
x = np.random.random((1, 3))
y = np.random.random((1, 3, 3))
model.train_on_batch(x, y)
if __name__ == '__main__':
test()
and i got this error:
The replica master 0 exited with a non-zero status of 245. Termination reason: Error.
Detailed error output is big, so i'm pasting it here in pastebin

Note this output:
Module raised an exception for failing to call a subprocess Command '['python', '-m', u'trainer.test', '--job-dir', u'gs://my_test_bucket_keras/s_27_100630']' returned non-zero exit status -11.
And I guess the google cloud will run your code with an extra parameter called --job-dir. So perhaps you can try add the following code in your example code?
import ...
import argparse
def test():
model = Sequential()
model.add(Dense(2, input_shape=(3,)))
model.add(RepeatVector(3))
model.add(TimeDistributed(Dense(3)))
model.compile(loss=losses.MSE,
optimizer=optimizers.RMSprop(lr=0.0001),
metrics=[metrics.categorical_accuracy],
sample_weight_mode='temporal')
x = np.random.random((1, 3))
y = np.random.random((1, 3, 3))
model.train_on_batch(x, y)
if __name__ == '__main__':
parser = argparse.ArgumentParser()
# Input Arguments
parser.add_argument(
'--job-dir',
help='GCS location to write checkpoints and export models',
required=True
)
args = parser.parse_args()
arguments = args.__dict__
test()
# test(**arguments) # or if you want to use this job_dir parameter in your code
Not 100% sure this will work but I think you can give it a try.
Also I have a post here to do something similar, perhaps you can take a look there as well.

Problem is resolved. All I had to do is use tensorflow 1.1.0 instead default 1.0.1

Related

Linear Regression script not working in Python

I tried running my Machine Learning LinearRegression code, but it is not working. Here is the code:
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
import pandas as pd
df = pd.read_csv(r'C:\Users\SVISHWANATH\Downloads\datasets\GGP_data.csv')
df["OHLC"] = (df.open+df.high+df.low+df.close)/4
df['HLC'] = (df.high+df.low+df.close)/3
df.index = df.index+1
reg = LinearRegression()
reg.fit(df.index, df.OHLC)
Basically, I just imported a few libraries, used the read_csv function, and called the LinearRegression() function, and this is the error:
ValueError: Expected 2D array, got 1D array instead:
array=[ 1 2 3 ... 1257 1258 1259].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or
array.reshape(1, -1) if it contains a single sample
Thanks!

As mentioned in the error message, you need to give the fit method a 2D array.
df.index is a 1D array. You can do it this way:
reg.fit(df.index.values.reshape(-1, 1), df.OHLC)

How to parallelize sklearn's random forest regressor on SLURM

I am currently trying to make sklearn's random forest run parallely on SLURM cluster. I have sent them to nodes, and then I have noticed that the parameter, n_jobs=-1, was no longer working on SLURM.
I have tried ipyparallel package, but it gave me error messages. I do not necessarily use ipyparallel, so I appreciate any module that I can parallelize random forest on the cluster.
from sklearn.ensemble import RandomForestRegressor
from joblib import parallel_backend, register_parallel_backend
from ipyparallel import Client
from ipyparallel.joblib import IPythonParallelBackend
import sys
import time
import pickle
import numpy as np
def fit_predict(self, X_train, y, X_test):
"""
train a model by X_train and y, and then return the prediction of
X_test
"""
pred = None
client = Client(profile='myprofile')
bview = client.load_balanced_view()
register_parallel_backend('ipyparallel', lambda: IPythonParallelBackend(view=bview))
regr = RandomForestRegressor(n_jobs=-1)
try:
with parallel_backend('ipyparallel'):
regr.fit(X_train, y)
pred = regr.predict(X_test)
except Exception as e:
print(e)
return pred
Error:
Traceback (most recent call last):
File "job.py", line 124, in <module>
pred = rf.fit_predict(X_train, y_train, X_test)
File "job.py", line 50, in fit_predict
client = Client(profile='myprofile')
File "/home/lfz/.conda/envs/mvi/lib/python3.7/site-packages/ipyparallel/client/client.py", line 419, in __init__
raise IOError(no_file_msg)
OSError: You have attempted to connect to an IPython Cluster but no Controller could be found.
Please double-check your configuration and ensure that a cluster is running.
srun: error: c6-28: task 0: Exited with exit code 1

Keras Regressor giving different prediction for my input everytime

I built a Keras regressor using the following code:
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
import numpy as ny
import pandas
from numpy.random import seed
seed(1)
from tensorflow import set_random_seed
set_random_seed(2)
X = ny.array([[1,2], [3,4], [5,6], [7,8], [9,10]])
sc_X=StandardScaler()
X_train = sc_X.fit_transform(X)
Y = ny.array([3, 4, 5, 6, 7])
Y=ny.reshape(Y,(-1,1))
sc_Y=StandardScaler()
Y_train = sc_Y.fit_transform(Y)
N = 5
def brain():
#Create the brain
br_model=Sequential()
br_model.add(Dense(3, input_dim=2, kernel_initializer='normal',activation='relu'))
br_model.add(Dense(2, kernel_initializer='normal',activation='relu'))
br_model.add(Dense(1,kernel_initializer='normal'))
#Compile the brain
br_model.compile(loss='mean_squared_error',optimizer='adam')
return br_model
def predict(X,sc_X,sc_Y,estimator):
prediction = estimator.predict(sc_X.fit_transform(X))
return sc_Y.inverse_transform(prediction)
estimator = KerasRegressor(build_fn=brain, epochs=1000, batch_size=5,verbose=0)
# print "Done"
estimator.fit(X_train,Y_train)
prediction = estimator.predict(X_train)
print predict(X,sc_X,sc_Y,estimator)
X_test = ny.array([[1.5,4.5], [7,8], [9,10]])
print predict(X_test,sc_X,sc_Y,estimator)
The issue I face is that the code is not predicting the same value (for example, it predicting 6.64 for [9,10] in the first prediction (X) and 6.49 for [9,10] in the second prediction (X_test) )
The full output is this:
[2.9929883 4.0016675 5.0103474 6.0190268 6.6434317]
[3.096634 5.422326 6.4955378]
Why do I get different values and how do I resolve them?

The problem lies in this line of code:
prediction = estimator.predict(sc_X.fit_transform(X))
You are fitting a new scaler every time when you predict values for new data. This is where differences come from. Try:
prediction = estimator.predict(sc_X.transform(X))
In this case, you use a pretrained scaler.

KerasRegressor giving different output everytime I run (despite inputs and training set being same)

Whenever I run the following code, I keep getting different outputs. Please could someone help me out with this? Code:
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.preprocessing import StandardScaler
import numpy as ny
X = ny.array([[1,2], [3,4], [5,6], [7,8], [9,10]])
sc_X=StandardScaler()
X_train = sc_X.fit_transform(X)
Y = ny.array([3, 4, 5, 6, 7])
Y=ny.reshape(Y,(-1,1))
sc_Y=StandardScaler()
Y_train = sc_Y.fit_transform(Y)
N = 5
def brain():
#Create the brain
br_model=Sequential()
br_model.add(Dense(3, input_dim=2, kernel_initializer='normal',activation='relu'))
br_model.add(Dense(2, kernel_initializer='normal',activation='relu'))
br_model.add(Dense(1,kernel_initializer='normal'))
#Compile the brain
br_model.compile(loss='mean_squared_error',optimizer='adam')
return br_model
estimator = KerasRegressor(build_fn=brain, epochs=1000, batch_size=5,verbose=0)
estimator.fit(X_train,Y_train)
prediction = estimator.predict(X_train)
print Y
print sc_Y.inverse_transform(prediction)
Basically, I have declared a dataset, am training a neural network to do regression on that and predict the values. Given that everything is already hardcoded into the code, I must be getting the same output everytime I run. However, this is not the case. I request you to help me out.

Probable issue with LSTM in lasagne

With a simple constructor for the LSTM, as given in the tutorial, and an input of dimension [,,1] one would expect to see an output of shape [,,num_units].
But regardless of the num_units passed during construction, the output has the same shape as the input.
Following is the min code to replicate this issue...
import lasagne
import theano
import theano.tensor as T
import numpy as np
num_batches= 20
sequence_length= 100
data_dim= 1
train_data_3= np.random.rand(num_batches,sequence_length,data_dim).astype(theano.config.floatX)
#As in the tutorial
forget_gate = lasagne.layers.Gate(b=lasagne.init.Constant(5.0))
l_lstm = lasagne.layers.LSTMLayer(
(num_batches,sequence_length, data_dim),
num_units=8,
forgetgate=forget_gate
)
lstm_in= T.tensor3(name='x', dtype=theano.config.floatX)
lstm_out = lasagne.layers.get_output(l_lstm, {l_lstm:lstm_in})
f = theano.function([lstm_in], lstm_out)
lstm_output_np= f(train_data_3)
lstm_output_np.shape
#= (20, 100, 1)
An unqualified LSTM (I mean in its default mode) should produce one output for each unit right?
The code was run on kaixhin's cuda lasagne docker image docker image
What gives?
Thanks !

You can fix that by using a lasagne.layers.InputLayer
import lasagne
import theano
import theano.tensor as T
import numpy as np
num_batches= 20
sequence_length= 100
data_dim= 1
train_data_3= np.random.rand(num_batches,sequence_length,data_dim).astype(theano.config.floatX)
#As in the tutorial
forget_gate = lasagne.layers.Gate(b=lasagne.init.Constant(5.0))
input_layer = lasagne.layers.InputLayer(shape=(num_batches, # <-- change
sequence_length, data_dim),) # <-- change
l_lstm = lasagne.layers.LSTMLayer(input_layer, # <-- change
num_units=8,
forgetgate=forget_gate
)
lstm_in= T.tensor3(name='x', dtype=theano.config.floatX)
lstm_out = lasagne.layers.get_output(l_lstm, lstm_in) # <-- change
f = theano.function([lstm_in], lstm_out)
lstm_output_np= f(train_data_3)
print lstm_output_np.shape
If you feed your input into the input_layer, it is not ambiguous anymore, so you do not even need to specify where the input is supposed to go. Directly specifying a shape and adding the tensor3 into the LSTM does not work.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart

Google Cloud ML exited with a non-zero status of 245 when training - machine-learning

Problem is resolved. All I had to do is use tensorflow 1.1.0 instead default 1.0.1

Related

Linear Regression script not working in Python

How to parallelize sklearn's random forest regressor on SLURM

Keras Regressor giving different prediction for my input everytime

KerasRegressor giving different output everytime I run (despite inputs and training set being same)

Probable issue with LSTM in lasagne

Categories

Resources