Cannot set the dataset name in an HDF5 file by slicing - hdf5

I'm trying to store 3 datasets in an HDF5 file using h5py. The name of each dataset is extracted from the filename by slicing, but that doesn't work.
# First I open a .png image file and convert it to a NumPy array
# called "raster":
.
.
.
# The second step is to extract the dataset name by slicing
# the filename:
filename = 'imagefile1.png'
dsetName = filename[:-4]
# Then I open an HDF5 file:
file = h5py.File('hdf5_file.5h', 'a')
# and write the dataset into that file:
file.create_dataset(dsetName, data=raster)
But at this point I always run into the same issue:
AttributeError: 'slice' object has no attribute 'encode'
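The error message suggests that the name reaching create_dataset was a slice object rather than a string. One way to rule that out is to derive the name with pathlib and check its type before writing. This is a minimal sketch, assuming the filename is a plain string; the helper names dataset_name_from_filename and write_dataset are hypothetical, and h5py is imported lazily so the name helper can be tried on its own:

```python
from pathlib import Path


def dataset_name_from_filename(filename):
    """Return the filename without directory or extension, as a str."""
    if not isinstance(filename, (str, Path)):
        raise TypeError(
            f"expected a path-like string, got {type(filename).__name__}"
        )
    return Path(filename).stem


def write_dataset(h5_path, filename, raster):
    """Store `raster` under a dataset named after `filename` (needs h5py)."""
    import h5py  # third-party; imported here so the helper above works without it

    name = dataset_name_from_filename(filename)
    with h5py.File(h5_path, "a") as h5_file:
        h5_file.create_dataset(name, data=raster)
    return name


print(dataset_name_from_filename("imagefile1.png"))
```

Passing something like slice(None, -4) to the helper raises a TypeError, which would surface the problem before h5py tries to encode the name.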

How to save/serialize a GLM model as a zip/pickle file?

I built a Tweedie GLM model using statsmodels.
How can I save (serialize) it as a zip or .pkl file?
I tried:
import pickle
import statsmodels.api as sm
from statsmodels.formula.api import glm
formula4 = "y ~ x1 + C(x2)"
mod4 = glm(formula=formula4, var_weights='one', data=train, family=sm.families.Tweedie())
res4 = mod4.fit()
filename = 'test.pkl'
# Use pickle to save the object to a file:
with open(filename, 'wb') as f:
    pickle.dump(mod4, f)
But the saved pickle file is too large.
Any ideas?
--
Answer:
Don't use the formula directly when building the model. Process the data with dmatrices beforehand, then save the model; the resulting file is around 10 KB.

Keras ImageDataGenerator: flow_from_dataframe results in a 'FileNotFound' error

I am working on a convolutional neural network model in Keras for classifying 'cats and dogs' photos.
The code works fine when I use flow_from_directory, as shown below:
Image_generator_object=ImageDataGenerator(rescale=1.0/255.0)
Image_train=Image_generator_object.flow_from_directory(directory=train_folder,batch_size=64,
target_size=(200,200),class_mode='binary')
# Found 8000 images belonging to 2 classes.
Image_test=Image_generator_object.flow_from_directory(directory=test_folder,batch_size=64,
target_size=(200,200),class_mode='binary')
# Found 4000 images belonging to 2 classes.
# The model is given below
def base_model():
    model=Sequential()
    # Stage 1
    model.add(Conv2D(filters=32 , kernel_size=(3,3),input_shape=(200,200,3),activation='relu'))
    model.add(MaxPooling2D(pool_size=(2,2),strides=None,padding='valid'))
    model.add(Flatten())
    model.add(Dense(units=50,activation='relu'))
    model.add(Dense(units=1,activation='sigmoid'))
    model.compile(optimizer=Adam(learning_rate=0.001),loss='binary_crossentropy',metrics=['accuracy'])
    return model
model=base_model()
# finally fitting the model
history=model.fit(Image_train , steps_per_epoch=len(data_gen_testing_flow), batch_size=100,verbose=1,validation_data=Image_test,epochs=10)
All of this works fine, but when I use the flow_from_dataframe option as shown below, I get the following error:
FileNotFoundError: [Errno 2] No such file or directory: 'cat.10469.jpg'
Here is what I did with the flow_from_dataframe option. I have training and testing dataframes (trainn, testt) containing the file names and classes, for example:
     files          class
124  cat.10566.jpg  cat
374  cat.10795.jpg  cat
260  cat.10674.jpg  cat
386  dog.9447.jpg   dog
472  cat.10878.jpg  cat
Then I created the ImageDataGenerator object and used flow_from_dataframe, which found the image file names as shown below:
data_gen_object=ImageDataGenerator( rescale=1.0/255.0,validation_split=0.25)
data_gen_object_training_flow=data_gen_object.flow_from_dataframe(trainn,diretory=base_folder,subset='training',x_col='files', y_col='class',
target_size=(200,200), batch_size=64,class_mode='binary',validate_filenames=False)
data_gen_object_testing_flow=data_gen_object.flow_from_dataframe(trainn,diretory=base_folder,subset='validation',x_col='files',
y_col='class',target_size=(200,200), batch_size=64,class_mode='binary',validate_filenames=False)
I get the following result:
Found 563 non-validated image filenames belonging to 2 classes.
Found 187 non-validated image filenames belonging to 2 classes.
Note that the above code doesn't work when I use validate_filenames=True.
After that, when I fit the model, I get the file-not-found error shown below:
model.fit(data_gen_object_training_flow ,batch_size=64,verbose=1,
validation_data=data_gen_object_testing_flow,epochs=3)
# FileNotFoundError: [Errno 2] No such file or directory: 'cat.10469.jpg'
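Since the error names a bare filename, the generator is probably looking for the files relative to the current working directory. The calls above spell the keyword as diretory, so base_folder may never reach Keras as directory (an assumption, not verified here). A quick check that does not depend on Keras is to join each filename with the base folder and see which files exist. This is only a sketch: split_existing is a hypothetical helper, and the filename list stands in for trainn['files'].

```python
import os


def split_existing(base_folder, filenames):
    """Return (found, missing): absolute paths that exist, and names that do not."""
    found, missing = [], []
    for name in filenames:
        full_path = os.path.abspath(os.path.join(base_folder, name))
        if os.path.isfile(full_path):
            found.append(full_path)
        else:
            missing.append(name)
    return found, missing


if __name__ == "__main__":
    import tempfile

    # Build a throwaway folder with one image file to exercise the helper.
    with tempfile.TemporaryDirectory() as base_folder:
        open(os.path.join(base_folder, "cat.10566.jpg"), "wb").close()
        found, missing = split_existing(
            base_folder, ["cat.10566.jpg", "cat.10469.jpg"]
        )
        print(len(found), missing)
```

If every file is found this way, passing the folder as directory= (or storing the absolute paths in the files column) would be the next thing to try.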

Removing duplicate images while scraping images from Google

I took code from here: How to remove duplicate items during training CNN?
from PIL import Image
import imagehash
# image_fns : List of training image files
img_hashes = {}
for img_fn in sorted(image_fns):
    hash = imagehash.average_hash(Image.open(img_fn))
    if hash in img_hashes:
        print('{} duplicate of {}'.format(img_fn, img_hashes[hash]))
    else:
        img_hashes[hash] = img_fn
How do images get added to img_hashes, which starts as an empty dict?
When the if statement executes, how does the program check whether hash is in img_hashes?
Does anyone have any idea?
Thank you
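To see how the dictionary fills up: img_hashes starts empty, so the first image with a given hash takes the else branch and is stored as hash -> filename. Any later image with an equal hash makes the `in` test true, because `in` on a dict looks the key up by its hash value and equality. Here is a sketch of the same pattern without image files; toy_hash is a hypothetical stand-in for imagehash.average_hash, and the byte strings stand in for image contents:

```python
import hashlib


def toy_hash(content):
    """Stand-in for a perceptual hash: equal bytes give equal digests."""
    return hashlib.sha256(content).hexdigest()


def find_duplicates(files):
    """Map each duplicate filename to the first filename with the same hash.

    `files` maps filename -> content (bytes).
    """
    seen = {}        # hash -> first filename that produced it
    duplicates = {}  # duplicate filename -> original filename
    for name in sorted(files):
        digest = toy_hash(files[name])
        if digest in seen:        # key lookup in the dict
            duplicates[name] = seen[digest]
        else:
            seen[digest] = name   # this is how the dict gets populated
    return duplicates


files = {"a.jpg": b"cat", "b.jpg": b"dog", "c.jpg": b"cat"}
print(find_duplicates(files))
```

Unlike average_hash, the SHA-256 stand-in only matches byte-identical content; it is used here purely to show the dictionary mechanics.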

Plot vtk file using Python

I cannot display a vtk file using python. Spyder Code Analysis prompts me that vtkDataSetMapper is an undefined name.
I know the vtk file is in order because I have already displayed it using Paraview.
My vtk file looks like this:
# vtk DataFile Version 2.0
velocity field
ASCII
DATASET STRUCTURED_POINTS
DIMENSIONS 108 103 31
ORIGIN 0.0000000000000000 0.0000000000000000 -297.50000000000000
SPACING 15.465818554775332 12.565027060488859 10.000000000000000
POINT_DATA 344844
SCALARS scalars float
LOOKUP_TABLE default
8.4405251
8.4405251
...
...
...
After the last line shown, the vtk file contains the rest of the data, which is just numbers (~300000 values).
My code looks like this:
import vtk
# Read vtk file data
reader = vtk.vtkDataSetReader()
reader.SetFileName("seaust.vtk")
reader.ReadAllScalarsOn() # Activate the reading of all scalars
reader.ReadAllVectorsOn() # Activate the reading of all vectors
reader.ReadAllTensorsOn() # Activate the reading of all tensors
reader.Update()
data = reader.GetOutput()
scalar_range = data.GetScalarRange()
# Create the mapper that corresponds the objects of the vtk file
# into graphics elements
mapper = vtkDataSetMapper()
mapper.SetInput(data)
When I run the code, Python raises this error:
AttributeError: 'vtkRenderingCorePython.vtkDataSetMapper' object has no attribute 'SetInput'
I'm expecting a 3D visualisation of my data.
Can you please help me to get it?
--
Answer:
I guess you are missing the vtk. prefix in
mapper = vtk.vtkDataSetMapper()
and you probably need to use
mapper.SetInputData(data)
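Putting both suggestions together, the rest of the rendering pipeline might look like the sketch below. It assumes VTK 6 or newer, where SetInputData replaced SetInput. The function name show_vtk_file is hypothetical, and vtk is imported inside the function so that merely defining it does not require the package:

```python
def show_vtk_file(path):
    """Read a legacy .vtk file and open an interactive 3D window (needs vtk)."""
    import vtk  # third-party; imported lazily

    # Read the dataset from disk.
    reader = vtk.vtkDataSetReader()
    reader.SetFileName(path)
    reader.ReadAllScalarsOn()
    reader.Update()
    data = reader.GetOutput()

    # Map the dataset to graphics primitives, coloured by the scalar range.
    mapper = vtk.vtkDataSetMapper()
    mapper.SetInputData(data)
    mapper.SetScalarRange(data.GetScalarRange())

    actor = vtk.vtkActor()
    actor.SetMapper(mapper)

    # Renderer, window and interactor make up the display side.
    renderer = vtk.vtkRenderer()
    renderer.AddActor(actor)
    window = vtk.vtkRenderWindow()
    window.AddRenderer(renderer)
    interactor = vtk.vtkRenderWindowInteractor()
    interactor.SetRenderWindow(window)

    window.Render()
    interactor.Start()
```

Calling show_vtk_file("seaust.vtk") would then open the window. For a structured-points volume, the mapper draws the outer surface of the grid, so a slice or contour filter may be needed to see the interior.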

Object detection From videos in openCV (int() argument must be a string, a bytes-like object or a number, not 'NoneType')

I am using Python 3.5 on Ubuntu 16.04.
I used this link to install OpenCV 3:
https://www.learnopencv.com/install-opencv3-on-ubuntu/
  File "<ipython-input-12-e1defa92c813>", line 1, in <module>
    runfile('/home/abhishek/models/research/object_detection/Video_detection.py', wdir='/home/abhishek/models/research/object_detection')
  File "/home/abhishek/anaconda3/lib/python3.5/site-packages/spyder/utils/site/sitecustomize.py", line 710, in runfile
    execfile(filename, namespace)
  File "/home/abhishek/anaconda3/lib/python3.5/site-packages/spyder/utils/site/sitecustomize.py", line 101, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "/home/abhishek/models/research/object_detection/Video_detection.py", line 139, in <module>
    feed_dict={image_tensor: image_np_expanded})
  File "/home/abhishek/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/home/abhishek/anaconda3/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1093, in _run
    np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)
  File "/home/abhishek/anaconda3/lib/python3.5/site-packages/numpy/core/numeric.py", line 531, in asarray
    return array(a, dtype, copy=False, order=order)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
Python Programming Video Detection Tutorial #2
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile
from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image
import cv2
cap = cv2.VideoCapture(0)
# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")
sys.path.append(sys.executable)
# ## Object detection imports
# Here are the imports from the object detection module.
# In[3]:
from utils import label_map_util
from utils import visualization_utils as vis_util
# # Model preparation
# ## Variables
#
# Any model exported using the `export_inference_graph.py` tool can be loaded here simply by changing `PATH_TO_CKPT` to point to a new .pb file.
#
# By default we use an "SSD with Mobilenet" model here. See the [detection model zoo](https://github.com/tensorflow/models/blob/master/object_detection/g3doc/detection_model_zoo.md) for a list of other models that can be run out-of-the-box with varying speeds and accuracies.
# In[4]:
# What model to download.
MODEL_NAME = 'ssd_mobilenet_v1_coco_11_06_2017'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'
# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'
# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('data', 'mscoco_label_map.pbtxt')
NUM_CLASSES = 90
print(12)
# ## Download Model
# In[5]:
opener = urllib.request.URLopener()
opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
tar_file = tarfile.open(MODEL_FILE)
for file in tar_file.getmembers():
    file_name = os.path.basename(file.name)
    if 'frozen_inference_graph.pb' in file_name:
        tar_file.extract(file, os.getcwd())
print(13)
# ## Load a (frozen) Tensorflow model into memory.
# In[6]:
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')
print(14)
# ## Loading label map
# Label maps map indices to category names, so that when our convolution network predicts `5`, we know that this corresponds to `airplane`. Here we use internal utility functions, but anything that returns a dictionary mapping integers to appropriate string labels would be fine
# In[7]:
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)
print(15)
# ## Helper code
# In[8]:
def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)
# # Detection
# In[9]:
# For the sake of simplicity we will use only 2 images:
# image1.jpg
# image2.jpg
# If you want to test the code with your images, just add path to the images to the TEST_IMAGE_PATHS.
PATH_TO_TEST_IMAGES_DIR = 'test_images'
TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, 3) ]
# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)
# In[10]:
with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        while True:
            ret, image_np = cap.read()
            # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
            image_np_expanded = np.expand_dims(image_np, axis=0)
            image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
            # Each box represents a part of the image where a particular object was detected.
            boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
            # Each score represent how level of confidence for each of the objects.
            # Score is shown on the result image, together with the class label.
            scores = detection_graph.get_tensor_by_name('detection_scores:0')
            classes = detection_graph.get_tensor_by_name('detection_classes:0')
            num_detections = detection_graph.get_tensor_by_name('num_detections:0')
            # Actual detection.
            (boxes, scores, classes, num_detections) = sess.run(
                [boxes, scores, classes, num_detections],
                feed_dict={image_tensor: image_np_expanded})
            # Visualization of the results of a detection.
            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np,
                np.squeeze(boxes),
                np.squeeze(classes).astype(np.int32),
                np.squeeze(scores),
                category_index,
                use_normalized_coordinates=True,
                line_thickness=8)
            cv2.imshow('object detection', cv2.resize(image_np, (800,600)))
            if cv2.waitKey(25) & 0xFF == ord('q'):
                cv2.destroyAllWindows()
                break
Please help me proceed. I have already added my code, which is mostly copied from https://pythonprogramming.net/video-tensorflow-object-detection-api-tutorial/
--
Answer:
I had the same problem.
First, cross-check that OpenCV has been installed correctly.
Before taking the input for object detection from your webcam, try it with stock photos and check whether it works.
Afterwards, upgrade your OpenCV to OpenCV 3:
conda install opencv3
If the problem still persists, check your webcam for input issues.
NoneType is only returned when no frames are being captured by your webcam.
--
Answer:
I had the same problem. With a picture, it happens when the file does not exist; with a video or webcam, the frame may not have been converted into an image. Check your input.
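The TypeError is consistent with image_np being None when it is passed to the session, which is what cap.read() yields when no frame was grabbed. Below is a minimal sketch of guarding the loop against that. Both iter_frames and FakeCapture are hypothetical names; FakeCapture only mimics the (ret, frame) return of cv2.VideoCapture.read() so the example runs without a camera:

```python
def iter_frames(cap, max_failures=1):
    """Yield frames from cap.read(), stopping after `max_failures`
    consecutive empty reads."""
    failures = 0
    while failures < max_failures:
        ret, frame = cap.read()
        if not ret or frame is None:
            failures += 1      # nothing captured; do not pass None downstream
            continue
        failures = 0           # a good frame resets the counter
        yield frame


class FakeCapture:
    """Stand-in for cv2.VideoCapture: returns queued frames, then (False, None)."""

    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        if self._frames:
            return True, self._frames.pop(0)
        return False, None


for frame in iter_frames(FakeCapture(["frame-1", "frame-2"])):
    print("processing", frame)
```

In the detection script, the body of the while loop would move inside `for image_np in iter_frames(cap):`, so np.expand_dims never receives None.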
