Why does using X[0] in MNIST classifier code give me an error? - machine-learning

I was learning to do classification with the MNIST dataset. And I got an error with I am not able to figure out, I have done a lot of google searches and I am not able to do anything, maybe you are an expert and can help me. Here is the code--
>>> from sklearn.datasets import fetch_openml
>>> mnist = fetch_openml('mnist_784', version=1)
>>> mnist.keys()
output:
dict_keys(['data', 'target', 'frame', 'categories', 'feature_names', 'target_names', 'DESCR', 'details', 'url'])
>>> X, y = mnist["data"], mnist["target"]
>>> X.shape
output:(70000, 784)
>>> y.shape
output:(70000)
>>> X[0]
output:KeyError Traceback (most recent call last)
c:\users\khush\appdata\local\programs\python\python39\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2897 try:
-> 2898 return self._engine.get_loc(casted_key)
2899 except KeyError as err:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 0
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
<ipython-input-10-19c40ecbd036> in <module>
----> 1 X[0]
c:\users\khush\appdata\local\programs\python\python39\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
2904 if self.columns.nlevels > 1:
2905 return self._getitem_multilevel(key)
-> 2906 indexer = self.columns.get_loc(key)
2907 if is_integer(indexer):
2908 indexer = [indexer]
c:\users\khush\appdata\local\programs\python\python39\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2898 return self._engine.get_loc(casted_key)
2899 except KeyError as err:
-> 2900 raise KeyError(key) from err
2901
2902 if tolerance is not None:
KeyError: 0
Please answer, there can be a silly mistake because I am a beggineer in ML. It would be really helpful if you gave me some hint also.

The API of fetch_openml changed between versions. In earlier versions, it returns a numpy.ndarray array. Since 0.24.0 (December 2020), as_frame argument of fetch_openml is set to auto (instead of False as default option earlier) which gives you a pandas.DataFrame for the MNIST data. You can force the data read as a numpy.ndarray by setting as_frame = False. See fetch_openml reference .

I was also facing the same problem.
scikit-learn: 0.24.0
matplotlib: 3.3.3
Python: 3.9.1
I used to below code to resolve the issue.
import matplotlib as mpl
import matplotlib.pyplot as plt
# instead of some_digit = X[0]
some_digit = X.to_numpy()[0]
some_digit_image = some_digit.reshape(28,28)
plt.imshow(some_digit_image,cmap="binary")
plt.axis("off")
plt.show()

You don't need to downgrade you scikit-learn library, if you follow the code below:
from sklearn.datasets import fetch_openml
mnist = fetch_openml('mnist_784', version= 1, as_frame= False)
mnist.keys()

You load the dataset as a dataframe for you to able to access the images, you have two ways to do this,
Transform the dataframe to an Array
# Transform the dataframe into an array. Check the first value
some_digit = X.to_numpy()[0]
# Reshape it to (28,28). Note: 28 x 28 = 7064, if the reshaping doesn't meet
# this you are not able to show the image
some_digit_image = some_digit.reshape(28,28)
plt.imshow(some_digit_image,cmap="binary")
plt.axis("off")
plt.show()
Transform the row
# Transform the row of your choosing into an array
some_digit = X.iloc[0,:].values
# Reshape it to (28,28). Note: 28 x 28 = 7064, if the reshaping doesn't
# meet this you are not able to show the image
some_digit_image = some_digit.reshape(28,28)
plt.imshow(some_digit_image,cmap="binary")
plt.axis("off")
plt.show()

Related

numpy cross does not support Drake's AutoDiffXd?

The following code
import numpy as np
from pydrake.all import InitializeAutoDiffTuple
x = np.array([1,2,3])
y = np.array([4,5,6])
np.cross(x,y)
(x_ad,y_ad) = InitializeAutoDiffTuple(x, y)
np.cross(x_ad, y_ad)
Leads to the error
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-9-f44901f6fd4e> in <module>
----> 1 np.cross(x_ad, y_ad)
<__array_function__ internals> in cross(*args, **kwargs)
~/.local/lib/python3.8/site-packages/numpy/core/numeric.py in cross(a, b, axisa, axisb, axisc, axis)
1604 "(dimension must be 2 or 3)")
1605 if a.shape[-1] not in (2, 3) or b.shape[-1] not in (2, 3):
-> 1606 raise ValueError(msg)
1607
1608 # Create the output array
ValueError: incompatible dimensions for cross product
(dimension must be 2 or 3)
Does Drake AutoDiffXd not support numpy's cross product?
This is not a lack of support from Drake. np.cross just doesn't accept (3,1) vectors, which is the default shape coming out of InitializeAutoDiffTuple. Either of the following can work instead:
np.cross(np.squeeze(x_ad),np.squeeze(y_ad))
np.cross(x_ad.T, y_ad.T)

TypeError("Tensor is unhashable if Tensor equality is enabled. " K.learning_phase(): 0

I am porting a Keras, Tensorflow, and OpenCV script to TF2 and Keras 2 and have run into a problem. I am getting an error on K.learning_phase(): 0.
The error happens in this code section.
ef detect_image(self, image):
if self.model_image_size != (None, None):
assert self.model_image_size[0]%32 == 0, 'Multiples of 32 required'
assert self.model_image_size[1]%32 == 0, 'Multiples of 32 required'
boxed_image = image_preporcess(np.copy(image), tuple(reversed(self.model_image_size)))
image_data = boxed_image
out_boxes, out_scores, out_classes = self.sess.run(
[self.boxes, self.scores, self.classes],
feed_dict={
self.yolo_model.input: image_data,
self.input_image_shape: [image.shape[0], image.shape[1]],
tf.keras.learning_phase(): 0 })
here is a gist to the full code
https://gist.github.com/robisen1/31976de17af9e752c6ba8d1dd0e08906
Traceback (most recent call last):
File "webcam_detect.py", line 188, in <module>
r_image, ObjectsList = yolo.detect_image(frame)
File "webcam_detect.py", line 110, in detect_image
K.learning_phase(): 0
File "C:\Anaconda3\envs\simplecv\lib\site-packages\tensorflow_core\python\framework\ops.py", line 705, in __hash__
raise TypeError("Tensor is unhashable if Tensor equality is enabled. "
TypeError: Tensor is unhashable if Tensor equality is enabled. Instead, use tensor.experimental_ref() as the key.
(simplecv) PS C:\dev\lacv\yolov3\yolov3ct>
I am not sure what is going on. I would appreciate any insights.
You are trying to use Tensorflow 1.x, which works in graph mode whereas TensorFlow 2.x works in eager mode. TensorFlow 1.X requires users to manually stitch together an abstract syntax tree (the graph) by making tf.* API calls. It then requires users to manually compile the abstract syntax tree by passing a set of output tensors and input tensors to a session.run() call. TensorFlow 2.0 executes eagerly (like Python normally does) and in 2.0, graphs and sessions should feel like implementation details.
The error is due to version. If you are using session in TF2 then you need to use the compatible version and same goes with other operations. Also in TF2 it is tf.keras.backend.learning_phase.
Would recommend to go through the guide - Migrate your TensorFlow 1 code to TensorFlow 2.
For Example Below Code throws the error similar to the error you are facing -
import tensorflow as tf
print(tf.__version__)
x = tf.constant(5)
y = tf.constant(10)
z = tf.constant(20)
# This will show same error.
tensor_set = {x, y, z}
tensor_dict = {x: 'five', y: 'ten', z: 'twenty'}
Output -
2.2.0
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-3-509b2d8d7ab1> in <module>()
6
7 # This will show same error.
----> 8 tensor_set = {x, y, z}
9 tensor_dict = {x: 'five', y: 'ten', z: 'twenty'}
10
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py in __hash__(self)
724 if (Tensor._USE_EQUALITY and executing_eagerly_outside_functions() and
725 (g is None or g.building_function)):
--> 726 raise TypeError("Tensor is unhashable. "
727 "Instead, use tensor.ref() as the key.")
728 else:
TypeError: Tensor is unhashable. Instead, use tensor.ref() as the key.
But below code will fix the issue -
import tensorflow as tf
print(tf.__version__)
x = tf.constant(5)
y = tf.constant(10)
z = tf.constant(20)
#This solves the issue
tensor_set = {x.experimental_ref(), y.experimental_ref(), z.experimental_ref()}
tensor_dict = {x.experimental_ref(): 'five', y.experimental_ref(): 'ten', z.experimental_ref(): 'twenty'}
Output -
2.2.0
WARNING:tensorflow:From <ipython-input-4-05e379e669d9>:12: Tensor.experimental_ref (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use ref() instead.
If you are still facing the error, then kindly share the reproducible code for the error like above. Will be happy to help you.
Hope this answers your question. Happy Learning.
try to disable tf.compat.v1.disable_eager_execution()
from tensorflow.compat.v1 import disable_eager_execution
disable_eager_execution()

Memory error while doing Hierarchical Clustering

I have a large dataset (207989, 23), and I am trying to apply Hierarchical clustering on just one column right now to test if it's suitable for the task at my hand.
What I have tried:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import preprocessing
data = pd.read_csv('gpmd.csv', header = 0)
X = data.loc[:, ['ContextID', 'BacksGas_Flow_sccm']]
min_max_scaler = preprocessing.MinMaxScaler()
X_minmax = min_max_scaler.fit_transform(X.values[:,[1]])
import scipy.cluster.hierarchy as sch
dendrogram = sch.dendrogram(sch.linkage(X_minmax, method = 'ward'))
after doing this, I am getting the following error:
dendrogram = sch.dendrogram(sch.linkage(X_minmax, method = 'ward'))
Traceback (most recent call last):
File "<ipython-input-4-429f42b68112>", line 1, in <module>
dendrogram = sch.dendrogram(sch.linkage(X_minmax, method = 'ward'))
File "C:\Users\kashy\Anaconda3\envs\py36\lib\site-packages\scipy\cluster\hierarchy.py", line 708, in linkage
y = distance.pdist(y, metric)
File "C:\Users\kashy\Anaconda3\envs\py36\lib\site-packages\scipy\spatial\distance.py", line 1877, in pdist
dm = np.empty((m * (m - 1)) // 2, dtype=np.double)
MemoryError
Can someone explain what exactly is the problem here?
Thanks in advance
Hierarchical clustering in most variants needs O(n²) memory.
Because of this, most implementations will fail at around 65535 instances, when they hit the 32 bit mark (some may fail at 32k already). But just do the math: n * n * 8 bytes for double precision: how much memory would you need?

I am not able Training models in sklearn (scikit-learn) using python

i have data file it contain data to predict the admission in MS.
it contain 9 column 8 column contain student data and 9th column contain chance of selection of student.
i am new and i don't understand error come in training model
import pandas
import numpy as np
import sklearn as sl
from sklearn.neural_network import MLPClassifier
classifier = MLPClassifier()
data = pandas.read_csv('Addmition.csv')
data_array = np.array(data)
X = data_array[:,1:8]
y = data_array[:,8]
classifier.fit(X,y)
print(classifier)
Traceback (most recent call last):
File "c.py", line 14, in <module>
classifier.fit(X,y)
File "C:\Users\vishal jangid\AppData\Roaming\Python\Python37\site-packages\sklearn\neural_network\multilayer_perceptron.py", line 977, in fit
hasattr(self, "classes_")))
File "C:\Users\vishal jangid\AppData\Roaming\Python\Python37\site-packages\sklearn\neural_network\multilayer_perceptron.py", line 324, in _fit
X, y = self._validate_input(X, y, incremental)
File "C:\Users\vishal jangid\AppData\Roaming\Python\Python37\site-packages\sklearn\neural_network\multilayer_perceptron.py", line 920, in _validate_input
self._label_binarizer.fit(y)
File "C:\Users\vishal jangid\AppData\Roaming\Python\Python37\site-packages\sklearn\preprocessing\label.py", line 413, in fit
self.classes_ = unique_labels(y)
File "C:\Users\vishal jangid\AppData\Roaming\Python\Python37\site-packages\sklearn\utils\multiclass.py", line 96, in unique_labels
raise ValueError("Unknown label type: %s" % repr(ys))
ValueError: Unknown label type: (array
Try this:
import numpy as np
import sklearn as sl
from sklearn.neural_network import MLPRegressor
classifier = MLPRegressor()
data = pandas.read_csv('Addmition.csv')
data_array = np.array(data)
X = data_array[:,1:8]
y = data_array[:,8]
classifier.fit(X,y)
print(classifier)
Explanation:
In machine learning we may have two types of problems:
1) Classification:
Ex: Predict if a person is male or female. (discrete)
2) Regression:
Ex: Predict the age of the person. (continuous)
With this in hand we are going to see your problem, your label (chance of selection) is continous, thus we have a regression problem.
See that you are using the MLPClassifier, resulting in the 'Unknown label error'.
Try using the MLPRegressor.

Probable issue with LSTM in lasagne

With a simple constructor for the LSTM, as given in the tutorial, and an input of dimension [,,1] one would expect to see an output of shape [,,num_units].
But regardless of the num_units passed during construction, the output has the same shape as the input.
Following is the min code to replicate this issue...
import lasagne
import theano
import theano.tensor as T
import numpy as np
num_batches= 20
sequence_length= 100
data_dim= 1
train_data_3= np.random.rand(num_batches,sequence_length,data_dim).astype(theano.config.floatX)
#As in the tutorial
forget_gate = lasagne.layers.Gate(b=lasagne.init.Constant(5.0))
l_lstm = lasagne.layers.LSTMLayer(
(num_batches,sequence_length, data_dim),
num_units=8,
forgetgate=forget_gate
)
lstm_in= T.tensor3(name='x', dtype=theano.config.floatX)
lstm_out = lasagne.layers.get_output(l_lstm, {l_lstm:lstm_in})
f = theano.function([lstm_in], lstm_out)
lstm_output_np= f(train_data_3)
lstm_output_np.shape
#= (20, 100, 1)
An unqualified LSTM (I mean in its default mode) should produce one output for each unit right?
The code was run on kaixhin's cuda lasagne docker image docker image
What gives?
Thanks !
You can fix that by using a lasagne.layers.InputLayer
import lasagne
import theano
import theano.tensor as T
import numpy as np
num_batches= 20
sequence_length= 100
data_dim= 1
train_data_3= np.random.rand(num_batches,sequence_length,data_dim).astype(theano.config.floatX)
#As in the tutorial
forget_gate = lasagne.layers.Gate(b=lasagne.init.Constant(5.0))
input_layer = lasagne.layers.InputLayer(shape=(num_batches, # <-- change
sequence_length, data_dim),) # <-- change
l_lstm = lasagne.layers.LSTMLayer(input_layer, # <-- change
num_units=8,
forgetgate=forget_gate
)
lstm_in= T.tensor3(name='x', dtype=theano.config.floatX)
lstm_out = lasagne.layers.get_output(l_lstm, lstm_in) # <-- change
f = theano.function([lstm_in], lstm_out)
lstm_output_np= f(train_data_3)
print lstm_output_np.shape
If you feed your input into the input_layer, it is not ambiguous anymore, so you do not even need to specify where the input is supposed to go. Directly specifying a shape and adding the tensor3 into the LSTM does not work.

Resources