imshow seems to show the wrong image (not duplicate) - opencv

I am trying to show an image using cv2.imshow(), but batch_data (the original image) gets altered after I concatenate it with data (the set of all images). I am not sure how the original image is altered.
data = np.array([]).reshape([0, IMG_WIDTH, IMG_HEIGHT, IMG_DEPTH])
label = np.array([])
batch_label = np.array([255]).reshape(1)  # label number 255 represents background
x = True
for (n, address) in enumerate(address_list):
    print("Reading all images with background from ", address)
    batch_data = cv2.imread(image_dir + address)
    dim = (IMG_WIDTH, IMG_HEIGHT)
    if batch_data is not None:
        batch_data = cv2.resize(batch_data, dim, interpolation=cv2.INTER_NEAREST)
    else:
        print("batch_data is not read.")
    batch_data = np.expand_dims(batch_data, axis=0)
    data = np.concatenate((data, batch_data))
    label = np.concatenate((label, batch_label))
    while x:
        print("batch_data.shape", batch_data.shape)
        print("data.shape", data.shape)
        print((np.squeeze(batch_data, axis=0) == data[n, ...]).all())  # returns True
        cv2.imshow('image', np.squeeze(batch_data, axis=0))  # shows the original image
        cv2.imshow('image2', data[n, ...])  # shows the original image, but colors are altered to white and red
        cv2.waitKey(0)
        cv2.destroyAllWindows()
        x = False
I think cv2.imshow('image2', data[n,...]) shows the original image, because when I used transpose to swap axis=1 with axis=2, the red spot moved accordingly. I may be wrong.
Can anyone spot the mistake? I feel like it is going to be a very stupid mistake, but I just could not find it.

I think this is a data type problem.
Try changing data from float64 to uint8:
data = np.array([], dtype=np.uint8).reshape([0, IMG_WIDTH, IMG_HEIGHT, IMG_DEPTH])
The white and red are signs of saturation: cv2.imshow expects float64 images to be in the range [0, 1], while uint8 images are expected to be in [0, 255]. You can find more about this problem here.
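As a sanity check, here is a minimal sketch (reusing data and n from the question) that converts only at display time, in case you want to keep the float array for other processing:

import numpy as np
import cv2

# np.array([]) defaults to float64, so `data` stays float64 after every
# concatenation. cv2.imshow treats float images as having a [0, 1] range,
# so values up to 255 saturate; casting to uint8 restores normal display.
cv2.imshow('image2', data[n, ...].astype(np.uint8))
cv2.waitKey(0)
cv2.destroyAllWindows()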

Related

error: (-215:Assertion failed) !ssize.empty() in function 'cv::resize' OpenCV

I have this old code that used to run fine in Python 2.7 a while ago. I just updated the code to run in Python 3.8, but when I try to execute it in Python 3.8 and OpenCV 3.4 I get a resize error and a warning (below)!
Here is the link to the two tif images that are required to run this code.
It's worth noting that both tif images are in the same folder as the Python code.
import cv2
import matplotlib.pyplot as plt
import numpy as np

## Code for C_preferred Mask and C_images ##
## There are three outputs to this code:
# "Block_order_C.PNG"
# "Out_img.PNG"
# "Output_C.txt"

## Change the image name here
filename_image = '2.tif'
filename_mask = '1.tif'

## OpenCV version checking
#print('OpenCV version used', cv2.__version__)

filename = open("Output_C.txt", "w")
filename.write("Processing Image : " + str(filename_image) + '\n\n')

## Function to sort the contours. Parameters that you can tune: tolerance_factor
## and the size of the image. Here, I have used a fixed size of (800, 800).
def get_contour_precedence(contour, cols):
    tolerance_factor = 10
    origin = cv2.boundingRect(contour)
    return ((origin[1] // tolerance_factor) * tolerance_factor) * cols + origin[0]

## Loading the colored mask, resizing it to (800, 800) and converting it from
## RGB to HSV space, so that the color values are emphasized
p_mask_c = cv2.cvtColor(cv2.resize(cv2.imread(filename_mask), (800, 800)), cv2.COLOR_RGB2HSV)

# Loading the original image
b_image_1 = cv2.resize(cv2.imread(filename_image), (800, 800))
cv2.imshow("c_mask_preferred", p_mask_c)
cv2.waitKey()

# Convert the target color to HSV. As the mask portion to be considered is
# green, I have chosen the target color to be green.
b = 0
g = 255
r = 0
# Converting target color to HSV space
target_color = np.uint8([[[b, g, r]]])
target_color_hsv = cv2.cvtColor(target_color, cv2.COLOR_BGR2HSV)

# Boundaries for hue define the proper color; saturation and value can vary a lot
target_color_h = target_color_hsv[0, 0, 0]
tolerance = 20
lower_hsv = np.array([max(0, target_color_h - tolerance), 10, 10])
upper_hsv = np.array([min(179, target_color_h + tolerance), 250, 250])

# Apply threshold on the HSV image
mask = cv2.inRange(p_mask_c, lower_hsv, upper_hsv)
cv2.imshow("mask", mask)
cv2.waitKey()

# Eroding the binary mask, so that every white portion (grid) is separated
# from the others, to avoid overlapping and mixing of adjacent grids
b_mask = mask
kernel = np.ones((5, 5))
#kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (3, 3))
sharp = cv2.erode(b_mask, kernel, iterations=2)

# Finding all the grids (from the binary image)
contours, hierarchy = cv2.findContours(sharp, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
print(' Number of contours', len(contours))

# Sorting contours
contours.sort(key=lambda x: get_contour_precedence(x, np.shape(b_mask)[0]))
#cv2.drawContours(b_image_1, contours, -1, (0,255,0), 1)

# Label variable for each grid/panel
label = 1
b_image = b_image_1.copy()
temp = np.zeros(np.shape(b_image_1), np.uint8)
print(' size of temp', np.shape(temp), np.shape(b_image))
out_img = b_image_1.copy()

# Processing each contour/label one by one
for cnt in contours:
    cv2.drawContours(b_image_1, [cnt], 0, (255, 255, 0), 1)
    ## Just to draw labels in the center of each grid
    ((x, y), r) = cv2.minEnclosingCircle(cnt)
    x = int(x)
    y = int(y)
    r = int(r)
    cv2.putText(b_image_1, "#{}".format(label), (int(x) - 10, int(y)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)
    ##
    cv2.drawContours(temp, [cnt], 0, (255, 255, 255), -1)
    #crop_img = np.bitwise_and(b_image, temp)
    r = cv2.boundingRect(cnt)
    crop_img = b_image[r[1]:r[1]+r[3], r[0]:r[0]+r[2]]
    mean = cv2.mean(crop_img)
    mean = np.array(mean).reshape(-1, 1)
    print(' Mean color', mean, np.shape(mean))
    if mean[1] < 50:
        cv2.putText(out_img, "M", (int(x) - 10, int(y)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 0, 255), 1)
        filename.write("Block number #" + str(label) + ' is : ' + 'Magenta' + '\n')
    else:
        cv2.putText(out_img, "G", (int(x) - 10, int(y)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 0, 255), 1)
        filename.write("Block number #" + str(label) + ' is : ' + 'Gray' + '\n')
    label = label + 1

cv2.imwrite("Block_order_C.PNG", b_image_1)
cv2.imwrite("Out_img.PNG", out_img)
filename.close()

cv2.imshow("preferred", b_image_1)
cv2.waitKey()
Error
[ WARN:0] global C:\projects\opencv-python\opencv\modules\imgcodecs\src\grfmt_tiff.cpp (449) cv::TiffDecoder::readData OpenCV TIFF: TIFFRGBAImageOK: Sorry, can not handle images with IEEE floating-point samples
Traceback (most recent call last):
File "Processing_C_preferred.py", line 32, in
p_mask_c = cv2.cvtColor(cv2.resize(cv2.imread(filename_mask),(800,800)),cv2.COLOR_RGB2HSV);
cv2.error: OpenCV(4.2.0) C:\projects\opencv-python\opencv\modules\imgproc\src\resize.cpp:4045: error: (-215:Assertion failed) !ssize.empty() in function 'cv::resize'
When you read in the image, pass the cv::IMREAD_ANYDEPTH = 2 flag as the second parameter to cv2.imread().
Changing your lines to
p_mask_c = cv2.cvtColor(cv2.resize(cv2.imread(filename_mask, 2), (800, 800)), cv2.COLOR_RGB2HSV)
and
b_image_1 = cv2.resize(cv2.imread(filename_image, 2), (800, 800))
removes the resize error you're seeing.
But you then get another error when converting the color, since your TIFF image apparently has only one channel, so cv2.COLOR_RGB2HSV won't work.
You could also combine multiple flags, e.g. with cv::IMREAD_COLOR = 1,
p_mask_c = cv2.cvtColor(cv2.resize(cv2.imread(filename_mask, 2 | 1), (800, 800)), cv2.COLOR_BGR2HSV)
to read in a color image. But then you get a different error. Perhaps you understand this image better than I do and can solve the problem from here on out.
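One possible continuation (a sketch, untested against these particular TIFFs): read at native depth as above, rescale the float samples to 8-bit, and expand the single channel to three before the HSV conversion.

import cv2
import numpy as np

raw = cv2.imread(filename_mask, cv2.IMREAD_ANYDEPTH)  # float samples, native depth
if raw is None:
    raise IOError("imread failed for " + filename_mask)
# Rescale whatever range the float samples use to 0-255, then cast to 8-bit
raw8 = cv2.normalize(raw, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
if raw8.ndim == 2:                                     # one channel: make it BGR first
    raw8 = cv2.cvtColor(raw8, cv2.COLOR_GRAY2BGR)
p_mask_c = cv2.cvtColor(cv2.resize(raw8, (800, 800)), cv2.COLOR_BGR2HSV)

Whether the rescaled colors still isolate the green grid depends on what the float samples in '1.tif' actually encode.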

shouldn't the return value of skimage.filters.threshold_mean be in the range 0-255?

import matplotlib.pyplot as plt
import skimage.color as sk_col
from skimage.filters import threshold_mean

# image: an RGB image array loaded earlier
bw_image = sk_col.rgb2gray(image)
fig = plt.figure(figsize=(3, 3))
plt.imshow(bw_image, cmap="gray")
plt.show()

mean_val = threshold_mean(bw_image)
print('*****', mean_val, '******')

binary_image = bw_image > mean_val
fig = plt.figure(figsize=(3, 3))
plt.imshow(binary_image, cmap="gray")
plt.title("Mean Threshold: " + str(mean_val))
plt.show()
The issue is why I am getting a mean of 0.386. Shouldn't I get values in the range 0-255? I thought skimage.color.rgb2gray returned an image with pixel values in the range 0-255, but when I print the pixels I see values between 0 and 1.
Because of this confusion I'm having trouble understanding how the binary image is computed. I checked the skimage.color documentation but didn't find anything helpful.
skimage.color.rgb2gray(image) returns the image as floats normalized to the range [0, 1]; that's why all the values, and hence the mean threshold, are between 0 and 1. The answer was that simple.
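If you really need the 0-255 scale, here is a minimal sketch (reusing bw_image from the question) with skimage's own converter:

from skimage.util import img_as_ubyte
from skimage.filters import threshold_mean

# img_as_ubyte rescales the float [0, 1] image to uint8 [0, 255].
# The binarization is unchanged as long as image and threshold share a scale.
bw_uint8 = img_as_ubyte(bw_image)
print(bw_uint8.min(), bw_uint8.max())  # now within 0..255
print(threshold_mean(bw_uint8))        # threshold on the 0-255 scale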

Color Quantization using K Means (making sense of code)

Can someone please explain what 'label' is in ret,label,center=cv2.kmeans(Z,K,None,criteria,10,cv2.KMEANS_RANDOM_CENTERS)?
Also, what exactly is happening here:
# Now convert back into uint8, and make original image
center = np.uint8(center)
res = center[label.flatten()]
res2 = res.reshape((img.shape))
Complete code:
img = cv2.imread('home.jpg')
Z = img.reshape((-1,3))
# convert to np.float32
Z = np.float32(Z)
# define criteria, number of clusters(K) and apply kmeans()
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
K = 8
ret,label,center=cv2.kmeans(Z,K,None,criteria,10,cv2.KMEANS_RANDOM_CENTERS)
# Now convert back into uint8, and make original image
center = np.uint8(center)
res = center[label.flatten()]
res2 = res.reshape((img.shape))
cv2.imshow('res2',res2)
I would refer you to the documentation: https://docs.opencv.org/master/d5/d38/group__core__cluster.html#ga9a34dc06c6ec9460e90860f15bcd2f88
The labels are the cluster indices for every sample: label holds, for each row of Z (i.e., each pixel), the index (0 to K-1) of the cluster that pixel was assigned to. center is a K x 3 array holding the mean color of each cluster, so center[label.flatten()] looks up each pixel's cluster center color, and the final reshape restores the original image shape. The result is the image repainted using only the K cluster colors.
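A toy illustration of that indexing step (made-up centers and labels, not the question's data):

import numpy as np

# Three cluster center colors (K=3) and the cluster index k-means
# assigned to each of five "pixels", in the shape cv2.kmeans returns.
center = np.uint8([[255, 0, 0], [0, 255, 0], [0, 0, 255]])
label = np.array([[0], [2], [1], [0], [2]])

res = center[label.flatten()]  # each pixel replaced by its cluster's color
print(res)
# [[255   0   0]
#  [  0   0 255]
#  [  0 255   0]
#  [255   0   0]
#  [  0   0 255]]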

How to split the image into chunks without breaking character - python

I am trying to read the text from an image.
I get better results if I break the image into small chunks, but the problem is that when I try to split the image it cuts/slices through my characters.
The code I am using:
from __future__ import division
import math
import os
from PIL import Image

def long_slice(image_path, out_name, outdir, slice_size):
    """Slice an image into parts slice_size tall."""
    img = Image.open(image_path)
    width, height = img.size
    upper = 0
    left = 0
    slices = int(math.ceil(height / slice_size))
    count = 1
    for slice in range(slices):
        # If we are at the end, set the lower bound to the bottom of the image
        if count == slices:
            lower = height
        else:
            lower = int(count * slice_size)
        # Set the bounding box! The important bit
        bbox = (left, upper, width, lower)
        working_slice = img.crop(bbox)
        upper += slice_size
        # Save the slice
        working_slice.save(os.path.join(outdir, "slice_" + out_name + "_" + str(count) + ".png"))
        count += 1

if __name__ == '__main__':
    # slice_size is the max height of the slices in pixels
    long_slice("/python_project/screenshot.png", "longcat", os.getcwd(), 100)
Sample image: the image I want to process
Expected / what I am trying to do:
I want to split every line into a separate image without cutting through any character.
Line 1:
Line 2:
Current result: characters in the image are cropped.
I don't want to cut the image based on fixed pixel counts, since each document will have different spacing and line widths.
Thanks
Jk
Here is a solution that finds the brightest rows in the image (i.e., the rows without text) and then splits the image on those rows. So far I have just marked the sections, and am leaving the actual cropping up to you.
The algorithm is as follows:
Find the sum of the luminance (I am just using the red channel) of every pixel in each row.
Find the rows whose sums are at least 0.999 (the threshold I am using) times the brightest row's sum.
Mark those rows.
Here is the code that will return a list of these rows:
def find_lightest_rows(img, threshold):
    line_luminances = [0] * img.height
    for y in range(img.height):
        for x in range(img.width):
            line_luminances[y] += img.getpixel((x, y))[0]
    line_luminances = list(enumerate(line_luminances))
    line_luminances.sort(key=lambda x: -x[1])
    lightest_row_luminance = line_luminances[0][1]
    lightest_rows = []
    for row, lum in line_luminances:
        if lum > lightest_row_luminance * threshold:
            lightest_rows.append(row)  # append to the list of light rows
    return lightest_rows
Sample output: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13 ... ]
After colouring these rows red, we have this image:
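If it helps, the cropping could be finished along these lines (a rough sketch building on find_lightest_rows above, not tested against the sample image): treat runs of light rows as gaps and crop out each darker band between them.

def split_on_light_rows(img, lightest_rows):
    # Runs of light (text-free) rows act as separators; everything between
    # two runs is a text line and gets cropped out as its own image.
    light = set(lightest_rows)
    slices, start = [], None
    for y in range(img.height):
        if y not in light and start is None:
            start = y                                   # entering a text band
        elif y in light and start is not None:
            slices.append(img.crop((0, start, img.width, y)))
            start = None
    if start is not None:                               # band reaching the bottom edge
        slices.append(img.crop((0, start, img.width, img.height)))
    return slices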

Image normalization

From my perspective, image normalization means rescaling every pixel to a value between 0 and 1; am I right?
But what does the following code mean?
import os
from scipy import ndimage

image_size = 28      # Pixel width and height.
pixel_depth = 255.0  # Number of levels per pixel.

# image_files, folder, dataset and num_images are defined earlier in the script
for image in image_files:
    image_file = os.path.join(folder, image)
    try:
        image_data = (ndimage.imread(image_file).astype(float) -
                      pixel_depth / 2) / pixel_depth  # WHY ??
        if image_data.shape != (image_size, image_size):
            raise Exception('Unexpected image shape: %s' % str(image_data.shape))
        dataset[num_images, :, :] = image_data
        num_images = num_images + 1
    except IOError as e:
        print('Could not read:', image_file, ':', e, '- it\'s ok, skipping.')
Image normalization is merely the process of changing the range of the pixel intensity values; the choice of the new range is up to you.
In the code you've shown, the range -0.5 .. 0.5 has been chosen: subtracting pixel_depth / 2 = 127.5 centers the values around zero, and dividing by pixel_depth = 255 scales them, so pixel value 0 maps to -0.5 and 255 maps to +0.5.
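A quick numeric check of the endpoints (plain NumPy, just to verify the mapping):

import numpy as np

pixels = np.array([0.0, 127.5, 255.0])
print((pixels - 255.0 / 2) / 255.0)  # [-0.5  0.   0.5]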