I have the physiological EEG emotion dataset named "DEAP". I want to analyze and visualize the data with MNE, but the dataset comes in its own format.
How can I load my own data for pre-processing when the data format is .dat?
import pickle

with open('s01.dat', 'rb') as f:
    y = pickle.load(f, encoding='latin1')
This works for me.
Of course, the ".dat" file must be in the same directory as this code.
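If you then want the data in MNE, here is a minimal sketch, assuming the standard DEAP preprocessed layout: y['data'] has shape (40 trials, 40 channels, 8064 samples), the first 32 channels are EEG, and the sampling rate is 128 Hz. The channel names below are placeholders:

import mne

eeg = y['data'][0, :32, :]                    # first trial, EEG channels only
ch_names = ['ch%02d' % i for i in range(32)]  # placeholder channel names
info = mne.create_info(ch_names=ch_names, sfreq=128., ch_types='eeg')
raw = mne.io.RawArray(eeg, info)
raw.plot()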
I'm currently using a .txt dataset with YOLO, and I want to use a .json dataset with YOLOv4.
Can I do that?
Example of the .txt format:
0 0.542720 0.415254 0.409610 0.355932
Example of the .json format:
{"info":{"description":"my-project-name"},"images":[{"id":1,"width":1280,"height":854,"file_name":"a.jpg"}],"annotations":[{"id":0,"iscrowd":0,"image_id":1,"category_id":1,"segmentation":[[572.45197740113,270.1920903954802,628.7419962335217,244.45951035781542,738.105461393597,286.2749529190207,776.7043314500942,369.90583804143125,775.0960451977401,453.53672316384177,545.1111111111111,511.4350282485875,506.512241054614,482.48587570621464,487.21280602636534,357.03954802259886,514.5536723163842,299.14124293785306,562.8022598870057,283.0583804143126]],"bbox":[487.21280602636534,244.45951035781542,289.49152542372883,266.9755178907721],"area":57054.884640074306}],"categories":[{"id":1,"name":"a"}]}```
I'm working on a project related to NLP. I use one-hot encoding for text representation in Google Colab and then fit it into an LSTM.
This is my code:
from tensorflow.keras.preprocessing.text import one_hot

voc_size = 13000
onehot_repr = [one_hot(words, voc_size) for words in X1]
The model seems good, but when I want to save it for making predictions on new text, I save it using pickle:
import pickle

with open("one_hot", "wb") as f:
    pickle.dump(one_hot, f)
But when I restart Colab and load the saved one_hot again, the number that represents a given word is different.
Is there any way that I can save the one-hot encoding and get the same result in Colab?
Because I cannot save the one-hot encoding for later use, I save the one-hot representation as a list and access it by index later:
## load saved model
from tensorflow.keras.models import load_model
my_model = load_model("model9419.h5")
## load one-hot representation
import json
with open('/content/drive/MyDrive/last_model/on_hot.json', 'rb') as f:
    oneHot = json.load(f)
To predict a word, I use simple array element access to find the one-hot representation of that word.
Is this a correct way to make a prediction? Is there a better way?
And if I can save the one_hot function, how can I use it in a Flask server?
Also, can anyone recommend a word representation that is easy, can be saved for use in Flask, and works better?
First, create a one-hot dict, then convert it to a pandas DataFrame and save that DataFrame as a .csv. For example:
import pandas as pd
from tensorflow.keras.preprocessing.text import one_hot

onehot_dict = {}
voc_size = 3
for words in ['this', 'that', 'then']:
    onehot_dict[words] = one_hot(words, voc_size)

onehot_df = pd.DataFrame(onehot_dict)
onehot_df.to_csv('./onehot.csv', index=False)
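As for why the numbers change: keras one_hot hashes each word with Python's built-in hash, which is salted differently on every interpreter start (unless PYTHONHASHSEED is fixed), so the indices are not stable across Colab restarts. A more reproducible sketch, assuming you fit once on your training corpus, uses a Tokenizer, which can itself be pickled:

import pickle
from tensorflow.keras.preprocessing.text import Tokenizer

tok = Tokenizer(num_words=13000)
tok.fit_on_texts(['this', 'that', 'then'])    # fit once on the training corpus
print(tok.texts_to_sequences(['this then']))  # indices come from the fitted vocabulary

with open('tokenizer.pkl', 'wb') as f:        # the fitted Tokenizer pickles cleanly
    pickle.dump(tok, f)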
This is a "Watson Studio" related question. I've done the following Deep-Learning tutorial/experiment assistant, successfully deployed a generated CNN model to WML(WebService). Cool!
Tutorial: Single convolution layer on MNIST data
Experiment Assistant
Next, I'd like to test whether the model can identify my image (MNIST) in the deployed environment, and these questions came to mind.
What kind of input file (maybe a pixel image file) should I prepare for the model input? How can I call the scoring endpoint passing my image? (I saw a Python code snippet on the "Implementation" tab, but it's a JSON example and I'm not sure how to pass the pixel image...)
payload_scoring = {"fields": [array_of_feature_columns], "values": [array_of_values_to_be_scored, another_array_of_values_to_be_scored]}
Any advice/suggestions are highly welcome. Thanks in advance.
The model that was trained accepts input data that is a 4-dimensional array, i.e. [<batchsize>, 28, 28, 1], where 28 refers to the height and width of the image in pixels and 1 refers to the number of channels. Currently, the WML online deployment and scoring service requires the payload data in a format that matches the input format of the model. So, to predict any image with this model, you must...
convert the image to an array of dimension [1, 28, 28, 1]. Converting an image to an array is explained in the next section.
pre-process the image data as required by the model, i.e. (a) normalize the data and (b) convert the type to float.
specify the pre-processed data in JSON format with appropriate keys. This JSON doc will be the input payload for the scoring request for the model.
How to convert an image to an array?
There are two ways (using Python code).
a) The Keras Python library ships with the MNIST dataset, whose images are already converted to 28 x 28 arrays. Using the Python code below, we can use this dataset to create the scoring payload.
import numpy as np
from keras.datasets import mnist
(X, y), (X_test, y_test) = mnist.load_data()
score_payload_data = X_test.reshape(X_test.shape[0], X_test.shape[1], X_test.shape[2], 1)
score_payload_data = score_payload_data.astype("float32")/255
score_payload_data = score_payload_data[2].tolist() ## pick the image at index 2 to predict
scoring_payload = {'values': [score_payload_data]}
b) If you have an image of size 28 x 28 pixels, we can create the scoring payload using the code below.
img_file_name = "<image file name with full path>"
# scipy.misc.imread was removed in SciPy 1.2; imageio is the suggested replacement
import imageio
img = imageio.imread(img_file_name)
img_to_predict = img.reshape(img.shape[0], img.shape[1], 1) / 255
img_to_predict = img_to_predict.astype("float32").tolist()
scoring_payload = {"values": [img_to_predict]}
I am building a neural net with the purpose of making predictions on new data in the future. I first preprocess the training data using sklearn.preprocessing, then train the model, then make some predictions, then close the program. In the future, when new data comes in, I have to use the same preprocessing scales to transform it before putting it into the model. Currently, I have to load all of the old data, fit the preprocessor, and then transform the new data with the fitted preprocessors. Is there a way for me to save the preprocessing objects (like sklearn.preprocessing.StandardScaler) so that I can just load the old objects rather than having to remake them?
I think besides pickle, you can also use joblib to do this. As stated in scikit-learn's manual, 3.4. Model persistence:
In the specific case of scikit-learn, it may be better to use joblib’s replacement of pickle (dump & load), which is more efficient on objects that carry large numpy arrays internally as is often the case for fitted scikit-learn estimators, but can only pickle to the disk and not to a string:
from joblib import dump, load
dump(clf, 'filename.joblib')
Later you can load back the pickled model (possibly in another Python process) with:
clf = load('filename.joblib')
Refer to these other posts for more information: Saving StandardScaler() model for use on new datasets, Save MinMaxScaler model in sklearn.
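Applied to the preprocessing objects in the question, a minimal sketch (the file name is arbitrary):

from joblib import dump, load
from sklearn.preprocessing import StandardScaler
import numpy as np

X_train = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])
scaler = StandardScaler().fit(X_train)  # fit once on the old data
dump(scaler, 'scaler.joblib')           # persist the fitted scaler

# later, possibly in another process:
scaler = load('scaler.joblib')
X_new_scaled = scaler.transform(np.array([[1.5, 250.0]]))  # same scale, no refitting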
As mentioned by lejlot, you can use the pickle library to save the trained network as a file on your hard drive; then you just need to load it to start making predictions.
Here is an example of how to use pickle to save and load Python objects:
import pickle
import numpy as np

npTest_obj = np.asarray([[1, 2, 3], [6, 5, 4], [8, 7, 9]])
strTest_obj = "pickle example XXXX"

if __name__ == "__main__":
    # store object information
    pickle.dump(npTest_obj, open("npObject.p", "wb"))
    pickle.dump(strTest_obj, open("strObject.p", "wb"))
    # read information from file
    str_readObj = pickle.load(open("strObject.p", "rb"))
    np_readObj = pickle.load(open("npObject.p", "rb"))
    print(str_readObj)
    print(np_readObj)
I'm training a simple convolutional neural network using pylearn2. I have my RGB image data stored in a .npy file. Is there any way to convert that data to grayscale directly from the .npy file?
If this is a standalone file, then load the file using numpy.load and convert the content using something like this:
import numpy as np

def rgb2gray(rgb):
    # ITU-R BT.601 luma weights (note the blue coefficient is 0.114, not 0.144)
    return np.dot(rgb[..., :3], [0.299, 0.587, 0.114])
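A short usage sketch, assuming the .npy file holds an array shaped (n_images, height, width, 3); the file names are placeholders:

data = np.load("images_rgb.npy")   # placeholder file name
gray = rgb2gray(data)              # -> shape (n_images, height, width)
np.save("images_gray.npy", gray)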
If the file is part of a pylearn2 dataset (resulting from use_design_loc()), then load the dataset:
from pylearn2.utils import serial
dataset = serial.load("file.pkl")
and apply the rgb2gray() function to the X member (I assume a DenseDesignMatrix).