error: (-215:Assertion failed) !ssize.empty() in function 'cv::resize' OpenCV


I have some old code that used to run fine in Python 2.7 a while ago. I just updated it to run in Python 3.8, but when I execute it with Python 3.8 and OpenCV 3.4 I get a resize error and a warning (below).
Here is the link to the two tif images that are required to run this code.
It's worth noting that both tif images are in the same folder as the Python code.
import cv2
import matplotlib.pyplot as plt
import numpy as np
## Code for C_preferred Mask and C_images##
## There are three outputs to this code:
#"Block_order_C.PNG"
#"Out_img.PNG"
#"Output_C.txt"
## Change the image name here
filename_image = '2.tif'
filename_mask = '1.tif'
## OpenCV version check
#print('OpenCV version used', cv2.__version__)
filename = open("Output_C.txt","w")
filename.write("Processing Image : " + str(filename_image) + '\n\n')
## Function to sort the contours : Parameters that you can tune : tolerance_factor and size of the image. Here, I have used a fixed size of
## (800,800)
def get_contour_precedence(contour, cols):
    tolerance_factor = 10
    origin = cv2.boundingRect(contour)
    return ((origin[1] // tolerance_factor) * tolerance_factor) * cols + origin[0]
## Loading the colored mask, resizing it to (800,800) and converting it from RGB to HSV space, so that the color values are emphasized
p_mask_c = cv2.cvtColor(cv2.resize(cv2.imread(filename_mask),(800,800)),cv2.COLOR_RGB2HSV);
# Loading the original Image
b_image_1 = cv2.resize(cv2.imread(filename_image),(800,800));
cv2.imshow("c_mask_preferred",p_mask_c)
cv2.waitKey();
# convert the target color to HSV, As our target mask portion to be considered is green. So I have chosen target color to be green
b = 0;
g = 255;
r = 0;
# Converting target color to HSV space
target_color = np.uint8([[[b, g, r]]])
target_color_hsv = cv2.cvtColor(target_color, cv2.COLOR_BGR2HSV)
# boundaries for Hue define the proper color boundaries, saturation and values can vary a lot
target_color_h = target_color_hsv[0,0,0]
tolerance = 20
lower_hsv = np.array([max(0, target_color_h - tolerance), 10, 10])
upper_hsv = np.array([min(179, target_color_h + tolerance), 250, 250])
# apply threshold on hsv image
mask = cv2.inRange(p_mask_c, lower_hsv, upper_hsv)
cv2.imshow("mask",mask)
cv2.waitKey()
# Eroding the binary mask, such that every white portion (grid) is separated from the others, to avoid overlapping and mixing of
# adjacent grids
b_mask = mask;
kernel = np.ones((5,5))
#kernel = cv2.getStructuringElement(cv2.MORPH_CROSS,(3,3))
sharp = cv2.erode(b_mask,kernel, iterations=2)
# Finding all the grids (from binary image)
contours, hierarchy = cv2.findContours(sharp,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
print (' Number of contours', len(contours))
# Sorting contours
contours.sort(key=lambda x:get_contour_precedence(x, np.shape(b_mask)[0]))
#cv2.drawContours(b_image_1, contours, -1, (0,255,0), 1)
# Label variable for each grid/panel
label = 1;
b_image = b_image_1.copy();
temp =np.zeros(np.shape(b_image_1),np.uint8)
print (' size of temp',np.shape(temp), np.shape(b_image))
out_img = b_image_1.copy()
# Processing in each contour/label one by one
for cnt in contours:
    cv2.drawContours(b_image_1,[cnt],0,(255,255,0), 1)
    ## Just to draw labels in the center of each grid
    ((x, y), r) = cv2.minEnclosingCircle(cnt)
    x = int(x)
    y = int(y)
    r = int(r)
    cv2.putText(b_image_1, "#{}".format(label), (int(x) - 10, int(y)),cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)
    ##
    cv2.drawContours(temp,[cnt],0,(255,255,255), -1)
    #crop_img = np.bitwise_and(b_image,temp)
    r = cv2.boundingRect(cnt)
    crop_img = b_image[r[1]:r[1]+r[3], r[0]:r[0]+r[2]]
    mean = cv2.mean(crop_img);
    mean = np.array(mean).reshape(-1,1)
    print (' Mean color', mean, np.shape(mean))
    if mean[1] < 50:
        cv2.putText(out_img, "M", (int(x) - 10, int(y)),cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 0, 255), 1)
        filename.write("Block number #"+ str(label)+ ' is : ' + 'Magenta'+'\n');
    else:
        cv2.putText(out_img, "G", (int(x) - 10, int(y)),cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 0, 255), 1)
        filename.write("Block number #"+ str(label)+ ' is : ' +'Gray'+'\n');
    label = label+1;
cv2.imwrite("Block_order_C.PNG",b_image_1)
cv2.imwrite("Out_img.PNG",out_img)
filename.close()
cv2.imshow("preferred",b_image_1)
cv2.waitKey()
Error
[ WARN:0] global C:\projects\opencv-python\opencv\modules\imgcodecs\src\grfmt_tiff.cpp (449) cv::TiffDecoder::readData OpenCV TIFF: TIFFRGBAImageOK: Sorry, can not handle images with IEEE floating-point samples
Traceback (most recent call last):
File "Processing_C_preferred.py", line 32, in
p_mask_c = cv2.cvtColor(cv2.resize(cv2.imread(filename_mask),(800,800)),cv2.COLOR_RGB2HSV);
cv2.error: OpenCV(4.2.0) C:\projects\opencv-python\opencv\modules\imgproc\src\resize.cpp:4045: error: (-215:Assertion failed) !ssize.empty() in function 'cv::resize'

cv2.imread() returns None when it cannot decode a file, and passing None on to cv2.resize() is what triggers the !ssize.empty() assertion. The warning shows why decoding failed: the TIFF holds floating-point samples, which the default read mode cannot handle. When you read in the image, pass cv2.IMREAD_ANYDEPTH (= 2) as the second argument to cv2.imread().
Changing your lines to
p_mask_c = cv2.cvtColor(cv2.resize(cv2.imread(filename_mask, 2),(800,800)),cv2.COLOR_RGB2HSV);
and
b_image_1 = cv2.resize(cv2.imread(filename_image, 2),(800,800));
removes the resize error you're seeing.
But you then get another error at the color conversion, since your TIFF image apparently has only one channel, so cv2.COLOR_RGB2HSV won't work.
You could also combine multiple flags, for example with cv2.IMREAD_COLOR (= 1),
p_mask_c = cv2.cvtColor(cv2.resize(cv2.imread(filename_mask, 2 | 1),(800,800)),cv2.COLOR_BGR2HSV);
to read it in as a color image. But then you get a different error. Perhaps you understand this image better than I do and can solve the problem from here on out.
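As a rough sketch of a possible next step (an assumption about the data, not something the images confirm): check the result of imread before resizing, then scale the floating-point samples to 8-bit and stack them into three channels so the later color conversion has an input it accepts. The helper names require_image and to_uint8_bgr are made up for this illustration, and min-max scaling may not be the right mapping for these TIFFs.

```python
import numpy as np


def require_image(img, path):
    """Fail early with a readable message if cv2.imread() returned None."""
    if img is None:
        raise IOError("could not read image: {}".format(path))
    return img


def to_uint8_bgr(img):
    """Min-max scale an array to 0..255 and return an 8-bit, 3-channel image.

    Assumption: linear scaling is acceptable for this data.
    """
    img = np.asarray(img, dtype=np.float32)
    lo, hi = float(img.min()), float(img.max())
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    out = np.clip(np.rint((img - lo) * scale), 0, 255).astype(np.uint8)
    if out.ndim == 2:
        # replicate the single channel so a BGR -> HSV conversion is possible
        out = np.dstack([out, out, out])
    return out


# stand-in for a single-channel float TIFF read with cv2.IMREAD_ANYDEPTH
fake_tiff = np.array([[0.0, 0.5], [1.0, 0.25]], dtype=np.float32)
converted = to_uint8_bgr(require_image(fake_tiff, "1.tif"))
print(converted.shape, converted.dtype)
```

With the real files this would be applied as to_uint8_bgr(require_image(cv2.imread(filename_mask, cv2.IMREAD_ANYDEPTH), filename_mask)). Note that a replicated gray channel has no hue, so the green threshold further down only makes sense if the mask really contains color.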

Related

Why does Tesseract fail to recognize 6 out of 26 of my alphabetic keyboard keys even with several parameter tunings?

TL;DR I'm using:
adaptive thresholding
segmenting by keys (width/height ratio) - see green boxes in image result
psm 10 to treat each key as a character
but it fails to recognize some keys, falsely identifies others or identifies 2 for 1 char (see the L character in the image result, it's an L and P), etc.
Note: I cropped the image and re-ran the results to get it to fit on this site, but before cropping it did slightly better (recognized more keys, fewer false positives, etc).
I just want it to recognize the alphabet keys. Ultimately I will want it to work for realtime video.
config:
'-l eng --oem 1 --psm 10 -c tessedit_char_whitelist="ABCDEFGHIJKLMNOPQRSTUVWXYZ"'
I've tried scaling the image differently, scaling the individual key segments, using opening/closing/etc but it doesn't recognize all the keys.
original image
image result
Update: if I make the image straighter (bird's-eye view) and remove the whitelist, it detects nearly all of the keys (although it reads the O as a 0 and the I as a |, which is understandable). Why is this, and how could I make this robust enough for live video when it is so sensitive to these conditions?
Code:
import pytesseract
import numpy as np
try:
    from PIL import Image
except ImportError:
    import Image
import cv2
from tqdm import tqdm
from collections import defaultdict
def get_missing_chars(dict):
    capital_alphabet = [chr(ascii) for ascii in range(65, 91)]
    return [let for let in capital_alphabet if let not in dict]
def draw_box_and_char(img, contour_dims, c, box_col, text_col):
    x, y, w, h = contour_dims
    top_left = (x, y)
    bot_right = (x + w, y+h)
    font_offset = 3
    text_pos = (x+h//2+12, y+h-font_offset)
    img_copy = img.copy()
    cv2.rectangle(img_copy, top_left, bot_right, box_col, 2)
    cv2.putText(img_copy, c, text_pos, cv2.FONT_HERSHEY_SIMPLEX, fontScale=.5, color=text_col, thickness=1, lineType=cv2.LINE_AA)
    return img_copy
def detect_keys(img):
    scaling = .25
    img = cv2.resize(img, None, fx=scaling, fy=scaling, interpolation=cv2.INTER_AREA)
    print("img shape", img.shape)
    gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    ratio_min = 0.7
    area_min = 1000
    nbrhood_size = 1001
    bias = 2
    # adapt to different lighting
    bin_img = cv2.adaptiveThreshold(gray_img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,\
        cv2.THRESH_BINARY_INV, nbrhood_size, bias)
    items = cv2.findContours(bin_img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contours = items[0] if len(items) == 2 else items[1]
    key_contours = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        ratio = h/w
        area = cv2.contourArea(c)
        # square-like ratio, try to get character
        if ratio > ratio_min and area > area_min:
            key_contours.append(c)
    detected = defaultdict(int)
    n_kept = 0
    img_copy = cv2.cvtColor(bin_img, cv2.COLOR_GRAY2RGB)
    let_to_contour = {}
    n_contours = len(key_contours)
    # offset to get smaller square within the key segment for easier char recognition
    offset = 10
    show_each_char = False
    for _, c in tqdm(enumerate(key_contours), total=n_contours):
        x, y, w, h = cv2.boundingRect(c)
        ratio = h/w
        area = cv2.contourArea(c)
        base = np.zeros(bin_img.shape, dtype=np.uint8)
        base.fill(255)
        n_kept += 1
        new_y = y+offset
        new_x = x+offset
        new_h = h-2*offset
        new_w = w-2*offset
        base[new_y:new_y+new_h, new_x:new_x+new_w] = bin_img[new_y:new_y+new_h, new_x:new_x+new_w]
        segment = cv2.bitwise_not(base)
        # try scaling up individual keys
        # scaling = 2
        # segment = cv2.resize(segment, None, fx=scaling, fy=scaling, interpolation=cv2.INTER_CUBIC)
        # psm 10: treats the segment as a single character
        custom_config = r'-l eng --oem 1 --psm 10 -c tessedit_char_whitelist="ABCDEFGHIJKLMNOPQRSTUVWXYZ"'
        d = pytesseract.image_to_data(segment, config=custom_config, output_type='dict')
        conf = d['conf']
        c = d['text'][-1]
        if c:
            # sometimes recognizes multiple keys even though there is only 1
            for sub_c in c:
                # save character and contour to draw on image and show bounds/detection
                if sub_c not in let_to_contour or (sub_c in let_to_contour and conf > let_to_contour[sub_c]['conf']):
                    let_to_contour[sub_c] = {'conf': conf, 'cont': (new_x, new_y, new_w, new_h)}
        else:
            c = "?"
            text_col = (0, 0, 255)
        if show_each_char:
            contour_dims = (new_x, new_y, new_w, new_h)
            box_col = (0, 255, 0)
            text_col = (0, 0, 0)
            segment_with_boxes = draw_box_and_char(segment, contour_dims, c, box_col, text_col)
            cv2.imshow('segment', segment_with_boxes)
            cv2.waitKey(0)
            cv2.destroyAllWindows()
    # draw boxes around recognized keys
    for c, data in let_to_contour.items():
        box_col = (0, 255, 0)
        text_col = (0, 0, 0)
        img_copy = draw_box_and_char(img_copy, data['cont'], c, box_col, text_col)
    detected = {k: 1 for k in let_to_contour}
    for det in let_to_contour:
        print(det, let_to_contour[det])
    print("total detected: ", let_to_contour.keys())
    missing = get_missing_chars(detected)
    print(f"n_missing: {len(missing)}")
    print(f"chars missing: {missing}")
    return img_copy
if __name__ == "__main__":
    img_file = "keyboard.jpg"
    img = cv2.imread(img_file)
    img_with_detected_keys = detect_keys(img)
    cv2.imshow("detected", img_with_detected_keys)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
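One detail in the code above that may be worth checking: conf = d['conf'] is the whole list of confidences, so conf > let_to_contour[sub_c]['conf'] compares lists rather than single scores. A minimal sketch of picking one character by its own confidence follows; best_single_char is a hypothetical helper, and the input is assumed to be shaped like the dict that pytesseract.image_to_data returns with output_type='dict' (parallel 'text' and 'conf' lists). Whether this explains any of the missing keys is not established.

```python
def best_single_char(data):
    """Return (char, confidence) for the most confident one-letter result.

    `data` is assumed to hold parallel lists under 'text' and 'conf'.
    """
    best = None
    for text, conf in zip(data["text"], data["conf"]):
        text = text.strip()
        conf = float(conf)  # conf may arrive as a string or a number
        if len(text) != 1 or conf < 0:
            continue  # skip empty boxes and multi-character guesses
        if best is None or conf > best[1]:
            best = (text, conf)
    return best


# hand-written stand-in for a Tesseract result, not real output
fake = {"text": ["", "", "LP", "L"], "conf": ["-1", "-1", 41, 87.5]}
print(best_single_char(fake))
```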

Resize image mask (shrink) using max value of united pixel group

I would like to resize, and specifically shrink, a mask (2D array of 1s and 0s) so that any pixel in the low-resolution-mask that maps to a group of pixels in the high-resolution-mask (original) containing at least one value of 1 will be set to 1 itself (example at bottom).
I've tried using cv2.resize() using cv2.INTER_MAX but it returned an error:
error: OpenCV(4.6.0) /io/opencv/modules/imgproc/src/resize.cpp:3927: error: (-5:Bad argument) Unknown interpolation method in function 'resize'
It doesn't seem that Pillow Image or scipy have an interpolation method to do so.
I'm looking for a solution for the defined shrink_max()
>>> orig_mask = [[1,0,0],[0,0,0],[0,0,0]]
>>> orig_mask
[[1,0,0]
,[0,0,0]
,[0,0,0]]
>>> mini_mask = shrink_max(orig_mask, (2,2))
>>> mini_mask
[[1,0]
,[0,0]]
>>> mini_mask = shrink_max(orig_mask, (1,1))
>>> mini_mask
[[1]]

I'm not aware of a direct method but try this for shrinking the mask to half-size, i.e. each low-res pixel maps to 4 original pixels (modify to any ratio as per your needs):
import numpy as np
orig_mask = np.array([[1,0,0],[0,0,0],[0,0,0]])
# first make the original mask divisible by 2
pad_row = orig_mask.shape[0] % 2
pad_col = orig_mask.shape[1] % 2
# i.e. pad the right and bottom of the mask with zeros
orig_mask_padded = np.pad(orig_mask, ((0,pad_row), (0,pad_col)))
# get the new shape
new_rows = orig_mask_padded.shape[0] // 2
new_cols = orig_mask_padded.shape[1] // 2
# group the original pixels by fours and max each group
shrunk_mask = orig_mask_padded.reshape(new_rows, 2, new_cols, 2).max(axis=(1,3))
print(shrunk_mask)
For more on working with sub-matrices, see: Numpy: efficiently sum sub matrix m of M
Here's the complete function for shrinking to any desired shape:
def shrink_max(mask, shrink_to_shape):
    r, c = shrink_to_shape
    m, n = mask.shape
    padded_mask = np.pad(mask, ((0, -m % r), (0, -n % c)))
    pr, pc = padded_mask.shape
    return padded_mask.reshape(r, pr // r, c, pc // c).max(axis=(1, 3))
For example print(shrink_max(orig_mask, (2,1))) returns:
[[1]
[0]]
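One caveat with padding: when the mask size is not a multiple of the target size, the block size is rounded up, so a trailing output cell can end up covering only padding (for example, 5 rows shrunk to 4 are padded to 8, and the last block is all zeros). A possible alternative, sketched here with np.maximum.reduceat and block edges spread evenly over the mask; shrink_max_even is an illustrative name, and the sketch assumes the target shape is no larger than the mask in either dimension.

```python
import numpy as np


def shrink_max_even(mask, shape):
    """Max-pool `mask` down to `shape` using evenly spread block edges.

    Assumes 1 <= shape[i] <= mask.shape[i], so every block is non-empty.
    """
    mask = np.asarray(mask)
    r, c = shape
    m, n = mask.shape
    row_starts = (np.arange(r) * m) // r  # first row of each output row's block
    col_starts = (np.arange(c) * n) // c  # first column of each output column's block
    rows = np.maximum.reduceat(mask, row_starts, axis=0)
    return np.maximum.reduceat(rows, col_starts, axis=1)


orig_mask = np.array([[1, 0, 0], [0, 0, 0], [0, 0, 0]])
print(shrink_max_even(orig_mask, (2, 2)))
print(shrink_max_even(orig_mask, (1, 1)))

# 5 rows -> 4 rows: the 1 in the last row still reaches the output
tall = np.zeros((5, 1), dtype=int)
tall[4, 0] = 1
print(shrink_max_even(tall, (4, 1)).ravel())
```

The grouping differs slightly from the padded version (blocks here have uneven sizes instead of trailing padding), so the two can disagree on which output cell a pixel lands in.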

How to split image of table at vertical lines into three images?

I want to split an image of a table at the vertical lines into three images as shown below. Is it possible? The width of each column is variable. And the sad thing is that the left vertical line is drawn down from the header as you can see.
Input image (input.png)
Output image (output1.png)
Output image (output2.png)
Output image (output3.png)
Update 1
"And the sad thing is that the left vertical line is drawn down from the header as you can see."
In other words, I guess the following image B would be easier to split, but my case is A.
Update 2
I am trying the approach @HansHirse gave me. I expect sub_image_1.png, sub_image_2.png and sub_image_3.png to be stored in the out folder, but no luck so far. I'm looking into it.
https://github.com/zono/ocr/blob/16fd0ec9a2c7d2e26279ec53947fe7fbab9f526d/src/opencv.py
$ git clone https://github.com/zono/ocr.git
$ cd ocr
$ git checkout 16fd0ec9a2c7d2e26279ec53947fe7fbab9f526d
$ docker-compose up -d
$ docker exec -it ocr /bin/bash
$ python3 opencv.py

Since your table is perfectly aligned, you can inverse binary threshold your image, and count (white) pixels along the y-axis to detect the vertical lines:
You'll need to clean the peaks, since you might get plateaus for the thicker lines.
That'd be my idea in Python OpenCV:
import cv2
import numpy as np
from skimage import io # Only needed for web reading images
# Web read image via scikit-image; convert to OpenCV's BGR color ordering
img = cv2.cvtColor(io.imread('https://i.stack.imgur.com/BTqBs.png'), cv2.COLOR_RGB2BGR)
# Inverse binary threshold grayscale version of image
img_thr = cv2.threshold(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), 128, 255, cv2.THRESH_BINARY_INV)[1]
# Count pixels along the y-axis, find peaks
thr_y = 200
y_sum = np.count_nonzero(img_thr, axis=0)
peaks = np.where(y_sum > thr_y)[0]
# Clean peaks
thr_x = 50
temp = np.diff(peaks).squeeze()
idx = np.where(temp > thr_x)[0]
peaks = np.concatenate(([0], peaks[idx+1]), axis=0) + 1
# Save sub-images
for i in np.arange(peaks.shape[0] - 1):
    cv2.imwrite('sub_image_' + str(i) + '.png', img[:, peaks[i]:peaks[i+1]])
I get the following three images:
As you can see, you might want to modify the selection by +/- 1 pixel, if an actual line is only 1 pixel wide.
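If you would rather cut at the middle of a thick line than near its edge, a variation on the peak cleaning is to group adjacent peak columns into runs and take each run's centre. This is only a sketch on made-up column counts rather than the real image; line_centres and the sample numbers are assumptions for illustration.

```python
import numpy as np


def line_centres(col_counts, min_count):
    """Return the centre column of every run of columns whose count exceeds min_count."""
    hits = np.flatnonzero(np.asarray(col_counts) > min_count)
    if hits.size == 0:
        return np.array([], dtype=int)
    # a new run starts wherever two hit columns are not direct neighbours
    breaks = np.flatnonzero(np.diff(hits) > 1) + 1
    runs = np.split(hits, breaks)
    return np.array([int(run.mean()) for run in runs])


# synthetic column sums: a 3-pixel line, a 1-pixel line and a 2-pixel line
counts = np.zeros(20, dtype=int)
counts[[2, 3, 4]] = 300
counts[10] = 300
counts[17:19] = 300
positions = line_centres(counts, 200)
print(positions)
```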
Hope that helps!
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.8.1
NumPy: 1.18.1
OpenCV: 4.2.0
----------------------------------------

OpenCV has a line detection function, cv.HoughLines:
You can filter the lines that are returned by passing min_theta and max_theta. Both are in radians, and theta is the angle of the line's normal, so vertical lines have theta close to 0 (horizontal lines are at pi/2). For vertical lines with some margin you could use a range such as 0 to 2 degrees, converted to radians.
This is an edited sample taken from the OpenCV documentation:
import sys
import math
import cv2 as cv
import numpy as np
def main(argv):
    default_file = 'img.png'
    filename = argv[0] if len(argv) > 0 else default_file
    # Loads an image
    src = cv.imread(cv.samples.findFile(filename), cv.IMREAD_GRAYSCALE)
    # Edge detection as preparation for the Hough transform
    dst = cv.Canny(src, 50, 200, None, 3)
    # Copy edges to the images that will display the results in BGR
    cdst = cv.cvtColor(dst, cv.COLOR_GRAY2BGR)
    cdstP = np.copy(cdst)
    # min_theta and max_theta are passed by keyword; positionally they come after srn and stn
    lines = cv.HoughLines(dst, 1, np.pi / 180, 150, min_theta=0, max_theta=np.deg2rad(2))
You can get the x, y coordinate of the line and draw them by using the following code.
    if lines is not None:
        for i in range(0, len(lines)):
            # each entry is [[rho, theta]]
            rho = lines[i][0][0]
            theta = lines[i][0][1]
            a = math.cos(theta)
            b = math.sin(theta)
            x0 = a * rho
            y0 = b * rho
            pt1 = (int(x0 + 1000*(-b)), int(y0 + 1000*(a)))
            pt2 = (int(x0 - 1000*(-b)), int(y0 - 1000*(a)))
            cv.line(cdst, pt1, pt2, (0,0,255), 3, cv.LINE_AA)
Alternatively you can also use HoughLinesP as this allows you to specify a minimum length, which will help your filtering. Also the lines are returned as x,y pairs for each end making it easier to work with.
    linesP = cv.HoughLinesP(dst, 1, np.pi / 180, 50, None, 50, 10)
    if linesP is not None:
        for i in range(0, len(linesP)):
            # each entry is [[x1, y1, x2, y2]]
            l = linesP[i][0]
            cv.line(cdstP, (l[0], l[1]), (l[2], l[3]), (0,0,255), 3, cv.LINE_AA)
    cv.imshow("Source", src)
    cv.imshow("Detected Lines (in red) - Standard Hough Line Transform", cdst)
    cv.imshow("Detected Lines (in red) - Probabilistic Line Transform", cdstP)
    cv.waitKey()
    return 0
Documentation
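To crop your image you can take the x coordinates of the lines you detected, sort them from left to right, and use numpy slicing.
# x coordinate of the first endpoint of each detected segment, left to right
xcoords = sorted(int(l[0][0]) for l in linesP)
for x_left, x_right in zip(xcoords[:-1], xcoords[1:]):
    if x_right > x_left:  # skip duplicates that would give an empty slice
        piece = img[:, x_left:x_right]
        cv.imshow('slice', piece)
        cv.waitKey(0)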
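Hough transforms tend to return several nearly identical segments for one drawn line, so it can help to merge x positions that are only a few pixels apart before slicing. A small sketch on synthetic numbers; merge_close and the tolerance of 5 pixels are assumptions for illustration, not part of OpenCV.

```python
import numpy as np


def merge_close(xs, tol=5):
    """Collapse x positions that lie within `tol` pixels of their neighbour."""
    merged, group = [], []
    for x in sorted(xs):
        if group and x - group[-1] > tol:
            # gap found: close the current group with its integer mean
            merged.append(sum(group) // len(group))
            group = []
        group.append(x)
    if group:
        merged.append(sum(group) // len(group))
    return merged


# made-up x coordinates, as if taken from detected vertical segments
xs = merge_close([100, 102, 101, 300, 304, 599])
print(xs)

img = np.zeros((10, 600), dtype=np.uint8)  # placeholder image
pieces = [img[:, a:b] for a, b in zip(xs[:-1], xs[1:])]
print([p.shape for p in pieces])
```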

detect object in image with almost similar background

I have to detect mice in a cage. The input images look like the following:
At the moment I am using cv.createBackgroundSubtractorMOG2() on the video stream to find the area containing the mice, and afterwards the Canny edge detector to extract the contours of the mice.
However, this is not working that well. The more the mice move, the better it works, but I guess there could be a better approach to detect them.
Does anyone have a different idea for how to detect the mice?
Thanks in advance

After subtracting the background, you could use a threshold to remove noise. Try saving the subtracted image and seeing what it looks like. Here's a script I use to tweak filter parameters (run it with the subtracted image):
import cv2
import numpy as np
screenshot_path = 'screenshot.bmp'
def nothing(x):
    pass
# Creating a window for later use
cv2.namedWindow('mask', cv2.WINDOW_NORMAL)
cv2.namedWindow('trackbar', cv2.WINDOW_NORMAL)
# Starting with 100's to prevent error while masking
h, s, v = 100, 100, 100
# Creating track bar
cv2.createTrackbar('h', 'trackbar', 0, 180, nothing)
cv2.createTrackbar('s', 'trackbar', 0, 255, nothing)
cv2.createTrackbar('v', 'trackbar', 164, 255, nothing)
cv2.createTrackbar('h2', 'trackbar', 120, 180, nothing)
cv2.createTrackbar('s2', 'trackbar', 12, 255, nothing)
cv2.createTrackbar('v2', 'trackbar', 253, 255, nothing)
frame = cv2.imread(screenshot_path)
# converting to HSV
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
while (1):
    # get info from track bar and apply to result
    h = cv2.getTrackbarPos('h', 'trackbar')
    s = cv2.getTrackbarPos('s', 'trackbar')
    v = cv2.getTrackbarPos('v', 'trackbar')
    h2 = cv2.getTrackbarPos('h2', 'trackbar')
    s2 = cv2.getTrackbarPos('s2', 'trackbar')
    v2 = cv2.getTrackbarPos('v2', 'trackbar')
    # Normal masking algorithm
    lower = np.array([h, s, v])
    upper = np.array([h2, s2, v2])
    mask = cv2.inRange(hsv, lower, upper)
    result = cv2.bitwise_and(frame,frame,mask = mask)
    cv2.imshow('result', result)
    print(h, s, v, h2, s2, v2)
    k = cv2.waitKey(5) & 0xFF
    if k == 27:
        break
cv2.destroyAllWindows()
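To make the "subtract, then threshold" idea concrete, here is a minimal sketch that uses plain frame differencing against a fixed background image instead of MOG2, run on synthetic arrays. foreground_bbox, the threshold of 30 and the minimum of 20 changed pixels are assumptions for illustration, not tuned values.

```python
import numpy as np


def foreground_bbox(frame, background, thresh=30, min_pixels=20):
    """Return (x, y, w, h) of the pixels that differ from the background, or None."""
    # signed type so the subtraction cannot wrap around
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    if diff.ndim == 3:
        diff = diff.max(axis=2)  # strongest change across color channels
    mask = diff > thresh
    if mask.sum() < min_pixels:
        return None  # too few changed pixels: treat as noise
    ys, xs = np.nonzero(mask)
    x0, y0 = int(xs.min()), int(ys.min())
    return x0, y0, int(xs.max()) - x0 + 1, int(ys.max()) - y0 + 1


background = np.full((50, 60, 3), 100, dtype=np.uint8)
frame = background.copy()
frame[10:20, 30:45] = 200  # a bright blob standing in for a mouse
print(foreground_bbox(frame, background))
```

A single box around all changed pixels only suits one animal at a time; with several mice the mask would need to be split into connected regions first.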
If that doesn't work, I would use an object tracker API like CSRT
# python opencv_object_tracking.py
# python opencv_object_tracking.py --video dashcam_boston.mp4 --tracker csrt
# import the necessary packages
from imutils.video import VideoStream
from imutils.video import FPS
import argparse
import imutils
import time
import cv2
# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", type=str,
    help="path to input video file")
ap.add_argument("-t", "--tracker", type=str, default="kcf",
    help="OpenCV object tracker type")
args = vars(ap.parse_args())
# extract the OpenCV version info
(major, minor) = cv2.__version__.split(".")[:2]
# if we are using OpenCV 3.2 OR BEFORE, we can use a special factory
# function to create our object tracker
if int(major) == 3 and int(minor) < 3:
    tracker = cv2.Tracker_create(args["tracker"].upper())
# otherwise, for OpenCV 3.3 OR NEWER, we need to explicitly call the
# appropriate object tracker constructor:
else:
    # initialize a dictionary that maps strings to their corresponding
    # OpenCV object tracker implementations
    OPENCV_OBJECT_TRACKERS = {
        "csrt": cv2.TrackerCSRT_create,
        "kcf": cv2.TrackerKCF_create,
        "boosting": cv2.TrackerBoosting_create,
        "mil": cv2.TrackerMIL_create,
        "tld": cv2.TrackerTLD_create,
        "medianflow": cv2.TrackerMedianFlow_create,
        "mosse": cv2.TrackerMOSSE_create
    }
    # grab the appropriate object tracker using our dictionary of
    # OpenCV object tracker objects
    tracker = OPENCV_OBJECT_TRACKERS[args["tracker"]]()
# initialize the bounding box coordinates of the object we are going
# to track
initBB = None
# if a video path was not supplied, grab the reference to the web cam
if not args.get("video", False):
    print("[INFO] starting video stream...")
    vs = VideoStream(src=0).start()
    time.sleep(1.0)
# otherwise, grab a reference to the video file
else:
    vs = cv2.VideoCapture(args["video"])
# initialize the FPS throughput estimator
fps = None
# loop over frames from the video stream
while True:
    # grab the current frame, then handle if we are using a
    # VideoStream or VideoCapture object
    frame = vs.read()
    frame = frame[1] if args.get("video", False) else frame
    # check to see if we have reached the end of the stream
    if frame is None:
        break
    # resize the frame (so we can process it faster) and grab the
    # frame dimensions
    frame = imutils.resize(frame, width=500)
    (H, W) = frame.shape[:2]
    # check to see if we are currently tracking an object
    if initBB is not None:
        # grab the new bounding box coordinates of the object
        (success, box) = tracker.update(frame)
        # check to see if the tracking was a success
        if success:
            (x, y, w, h) = [int(v) for v in box]
            cv2.rectangle(frame, (x, y), (x + w, y + h),
                (0, 255, 0), 2)
        # update the FPS counter
        fps.update()
        fps.stop()
        # initialize the set of information we'll be displaying on
        # the frame
        info = [
            ("Tracker", args["tracker"]),
            ("Success", "Yes" if success else "No"),
            ("FPS", "{:.2f}".format(fps.fps())),
        ]
        # loop over the info tuples and draw them on our frame
        for (i, (k, v)) in enumerate(info):
            text = "{}: {}".format(k, v)
            cv2.putText(frame, text, (10, H - ((i * 20) + 20)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)
    # show the output frame
    cv2.imshow("Frame", frame)
    key = cv2.waitKey(1) & 0xFF
    # if the 's' key is selected, we are going to "select" a bounding
    # box to track
    if key == ord("s"):
        # select the bounding box of the object we want to track (make
        # sure you press ENTER or SPACE after selecting the ROI)
        initBB = cv2.selectROI("Frame", frame, fromCenter=False,
            showCrosshair=True)
        # start OpenCV object tracker using the supplied bounding box
        # coordinates, then start the FPS throughput estimator as well
        tracker.init(frame, initBB)
        fps = FPS().start()
    # if the `q` key was pressed, break from the loop
    elif key == ord("q"):
        break
# if we are using a webcam, release the pointer
if not args.get("video", False):
    vs.stop()
# otherwise, release the file pointer
else:
    vs.release()
# close all windows
cv2.destroyAllWindows()

ValueError: could not broadcast input array from shape (700,227,3) into shape (0,227,3)

Please help me fix the error below. This is OpenCV feature extraction code.
from __future__ import division
import numpy as np
import cv2
ESC=27
camera = cv2.VideoCapture(0)
orb = cv2.ORB()
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
imgTrainColor=cv2.imread('train.jpg')
imgTrainGray = cv2.cvtColor(imgTrainColor, cv2.COLOR_BGR2GRAY)
kpTrain = orb.detect(imgTrainGray,None)
kpTrain, desTrain = orb.compute(imgTrainGray, kpTrain)
firsttime=True
while True:
    ret, imgCamColor = camera.read()
    imgCamGray = cv2.cvtColor(imgCamColor, cv2.COLOR_BGR2GRAY)
    kpCam = orb.detect(imgCamGray,None)
    kpCam, desCam = orb.compute(imgCamGray, kpCam)
    matches = bf.match(desCam,desTrain)
    dist = [m.distance for m in matches]
    thres_dist = (sum(dist) / len(dist)) * 0.5
    matches = [m for m in matches if m.distance < thres_dist]
    if firsttime==True:
        h1, w1 = imgCamColor.shape[:2]
        h2, w2 = imgTrainColor.shape[:2]
        nWidth = w1+w2
        nHeight = max(h1, h2)
        hdif = (h1-h2)/2
        firsttime=False
    result = np.zeros((nHeight, nWidth, 3), np.uint8)
    result[hdif:hdif+h2, :w2] = imgTrainColor
    result[:h1, w2:w1+w2] = imgCamColor
    for i in range(len(matches)):
        pt_a=(int(kpTrain[matches[i].trainIdx].pt[0]), int(kpTrain[matches[i].trainIdx].pt[1]+hdif))
        pt_b=(int(kpCam[matches[i].queryIdx].pt[0]+w2), int(kpCam[matches[i].queryIdx].pt[1]))
        cv2.line(result, pt_a, pt_b, (255, 0, 0))
    cv2.imshow('Camara', result)
    key = cv2.waitKey(30)
    if key == ESC:
        break
cv2.destroyAllWindows()
camera.release()
ERRORS APPEARING:
Traceback (most recent call last):
File "sift.py", line 39, in
result[hdif:hdif+h2, :w2] = imgTrainColor
ValueError: could not broadcast input array from shape (700,227,3) into shape (0,227,3)

Without digging through your code in detail, from
result[hdif:hdif+h2, :w2] = imgTrainColor
... from shape (700,227,3) into shape (0,227,3)
I deduce that imgTrainColor is 3-D with shape (700,227,3), so h2 is 700 and w2 is 227.
result must have a last dimension of 3, and :w2 is slicing 227 values as expected. But hdif:hdif+h2 is slicing 0 rows, which is what happens when hdif is negative: a negative start is counted from the end of the array, so the slice begins at the very row where it stops.
In other words, you are trying to put the imgTrainColor values into a block of result that is too small.
hdif = (h1-h2)/2 is negative whenever the camera frame is shorter than the training image (h1 < h2), and with from __future__ import division it is also a float rather than an integer index. Print those indexing values just before the failing line to confirm.
Oh, and clean up the indentation.
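A sketch of one way around it, assuming the goal is to centre each image vertically on a canvas as tall as the taller one: compute a separate integer top offset for each image, so that neither can go negative. side_by_side is an illustrative helper, and the 480x640 camera frame is an assumed size.

```python
import numpy as np


def side_by_side(train, cam):
    """Place `train` left of `cam`, each centred vertically on a shared canvas."""
    h1, w1 = cam.shape[:2]
    h2, w2 = train.shape[:2]
    canvas = np.zeros((max(h1, h2), w1 + w2, 3), np.uint8)
    top_train = (canvas.shape[0] - h2) // 2  # integer and never negative
    top_cam = (canvas.shape[0] - h1) // 2
    canvas[top_train:top_train + h2, :w2] = train
    canvas[top_cam:top_cam + h1, w2:w2 + w1] = cam
    return canvas, top_train, top_cam


train = np.full((700, 227, 3), 255, np.uint8)  # same shape as in the traceback
cam = np.full((480, 640, 3), 128, np.uint8)    # assumed camera frame size

# the original offset goes negative and selects zero rows
hdif = (480 - 700) // 2
empty = np.zeros((700, 867, 3), np.uint8)[hdif:hdif + 700]
print(empty.shape)

canvas, top_train, top_cam = side_by_side(train, cam)
print(canvas.shape, top_train, top_cam)
```

When drawing the match lines, the keypoint y coordinates would then need top_train added for the training image and top_cam added for the camera frame, in place of the single hdif.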
