OpenCV hangs when using multiprocessing on a Raspberry Pi

This code runs as expected and gives the expected output:
import multiprocessing
import cv2
import os

path = r"/home/pi/Desktop/calibration.jpg"
image = cv2.imread(path)

def cvtcolor(img):
    print "converting to gray ..."
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    print "converted to gray"

if True:
    p = multiprocessing.Process(name='test',
                                target=cvtcolor,
                                kwargs={'img': image})
    p.start()
    p2 = multiprocessing.Process(name='test',
                                 target=cvtcolor,
                                 kwargs={'img': image})
    p2.start()
outputs:
converting to gray ...
converting to gray ...
converted to gray
converted to gray
However, this code hangs when executed
import multiprocessing
import cv2
import os

path = r"/home/pi/Desktop/calibration.jpg"
image = cv2.imread(path)

def cvtcolor(img):
    print "converting to gray ..."
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    print "converted to gray"

cvtcolor(image)

if True:
    p = multiprocessing.Process(name='test',
                                target=cvtcolor,
                                kwargs={'img': image})
    p.start()
The call in the main process completes, but the function executed in the "test" process hangs forever:
converting to gray ...
converted to gray
converting to gray ...
I am using OpenCV version 3.2.0, installed as detailed here, on Raspbian Jessie (Raspberry Pi).
Does anyone have an explanation or a solution for this?

Have a look at what is returned. If you call the BGR2GRAY conversion directly, you get back an array whose shape attribute matches the input image but with a single (gray) channel. When you run the same function via multiprocessing, no array is returned to the parent process, so there is nothing with a shape attribute to inspect there. Try printing the output from the child to see what form it is in, then reconstruct an image from that.
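A minimal sketch for inspecting what the child process actually produces, assuming the same calibration.jpg path as in the question; it passes the converted array back through a multiprocessing.Queue so its type and shape can be printed in the parent, if the conversion completes:
import multiprocessing
import cv2

def cvtcolor(img, queue):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    queue.put(gray)  # send the result back to the parent for inspection

if __name__ == '__main__':
    image = cv2.imread("/home/pi/Desktop/calibration.jpg")
    queue = multiprocessing.Queue()
    p = multiprocessing.Process(name='test', target=cvtcolor, args=(image, queue))
    p.start()
    result = queue.get()   # blocks until the child puts something
    print(type(result))    # numpy.ndarray if the conversion completed
    print(result.shape)    # input height and width, single channel
    p.join()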

Related

Denoising multiple grayscaled text images using Opencv [duplicate]

I am trying to denoise multiple gray-scaled text images from a folder, having already converted all of them to gray-scale. All I want is to remove noise or blurriness from all the images without changing the text. For this I am using OpenCV. I have written the code shown below; when I run it, it shows no error and displays nothing. Please help me solve this problem. I am new to image processing, which is why I am confused. Here's my code...
import numpy as np
from PIL import Image
import cv2
import glob

src_path = r"C:\Users\usama\Documents\FYP-Data\FYP Project Data\grayscale images\*.png"  # images folder path

def get_string(src_path):
    for filename in glob.glob(src_path):
        img = cv2.imread(filename)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        kernel = np.ones((1, 1), np.uint8)
        img = cv2.dilate(img, kernel, iterations=1)
        img = cv2.erode(img, kernel, iterations=1)
        cv2.imwrite(src_path + "filename", img)
You should narrow down the files you load. I prefer to do this with glob, which allows easy shell-style wildcard patterns when searching for files. I would expect that either you load a file that is not an image, or that you are missing the cv2.waitKey(0) that keeps the display window open.
import cv2
from glob import glob

for filename in glob('*.jpg'):
    img = cv2.imread(filename)
    bilateral_blur = cv2.bilateralFilter(img, 9, 75, 75)
    cv2.imshow('denoised_images', bilateral_blur)
    cv2.waitKey(0)

Anyone know what causes this unknown "type b" error in skimage/tifffile save?

I am getting a strange error saving a TIFF file (grayscale stack), any idea?
File "C:\Users\ptyimg_np.MT00200169\Anaconda3\lib\site-packages\tifffile\tifffile.py", line 1241, in save
    sampleformat = {'u': 1, 'i': 2, 'f': 3, 'c': 6}[datadtype.kind]
KeyError: 'b'
my code is
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from skimage.morphology import watershed
from skimage.feature import peak_local_max
from scipy import ndimage
from skimage import img_as_float
from skimage import exposure, io
from skimage import external
from skimage.color import rgb2gray
from skimage.filters import threshold_local, threshold_niblack
import numpy as np
import tifffile
from joblib import Parallel, delayed
import sys

# Load an example image
input_namefile = sys.argv[1]
output_namefile = 'seg_' + input_namefile

# Settings
block_size = 25  # block size of the local thresholding

img = io.imread(input_namefile, plugin='tifffile')
thresh = threshold_niblack(img, window_size=block_size, k=0.8)
res = img > thresh
res = np.asanyarray(res)
print("saving segmentation")
tifffile.imsave(output_namefile, res, photometric='minisblack')
It looks like the error is caused by a bug in writing boolean images in your installed version of tifffile. However, the bug has been fixed in more recent versions (I have 2020.2.16 in my current environment). On my machine, this works fine:
import numpy as np
import tifffile
tifffile.imsave('test.tiff', np.random.random((10, 10)) > 0.5)
and the line causing a crash in your version is never executed in the case of a boolean image.
So, long story short, use python -m pip install -U tifffile to upgrade your version of tifffile, and your program should work!
Some analysis first. The offending line:
sampleformat = {'u': 1, 'i': 2, 'f': 3, 'c': 6}[datadtype.kind]
is causing a KeyError exception because the value of datadtype.kind (the NumPy dtype kind code) is 'b', and there is no 'b' in that dictionary. It only caters for the kinds u, i, f, and c (respectively, unsigned integer, signed integer, floating-point, and complex floating-point). Kind 'b' is boolean.
This looks like a bug in the code that you're using. If it's something that's not supported, the code should really catch the exception and report on it in a more user-friendly manner rather than just dumping an exception for you to figure out.
My advice is to raise this as a bug with the author.
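In the meantime, a quick way to see the kind code involved, plus a possible workaround (casting the boolean mask to an unsigned integer image before saving), sketched with plain NumPy:
import numpy as np

res = np.random.random((4, 4)) > 0.5  # a boolean array, as in the question
print(res.dtype.kind)                 # 'b', the key missing from the dict

# workaround: cast to an unsigned integer type before saving
res_u8 = res.astype(np.uint8) * 255   # 0/255 grayscale image
print(res_u8.dtype.kind)              # 'u', which is supported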
In terms of the root cause of the issue (this is speculation based on analysis, so could be wrong, I'm just providing it as a possible cause), an examination of your code shows:
img = io.imread(input_namefile, plugin='tifffile')
thresh = threshold_niblack(img, window_size=block_size , k=0.8) #
res = img > thresh
res = np.asanyarray(res)
tifffile.imsave(output_namefile, res , photometric='minisblack' )
That third line above will set res to either a boolean value or a boolean array, depending on the respective values of each pixel in img and thresh (I don't know enough about NumPy to pontificate on this).
However, regardless of that, they are one or more booleans, so when you try to write them with the imsave() call, it complains about the type being used (as mentioned above, it appears not to cater for boolean values correctly).
Based on some sample code found elsewhere:
image = data.coins()
mask = image > 128
masked_image = image * mask
I suspect that you should use something similar to that last line to apply the mask to the image, then write the resultant value:
img = io.imread(input_namefile, plugin='tifffile')
thresh = threshold_niblack(img, window_size=block_size, k=0.8)
mask = img > 128  # <-- unsure if this is needed.
res = img * thresh  # <-- add this line.
res = np.asanyarray(res)
tifffile.imsave(output_namefile, res, photometric='minisblack')
Applying the mask to the original image should give you an array of usable values that you can write back out to an image file. Note that I'm unsure whether you need the img > thresh line, since it appears to me that the threshold already gives you a mask. I could be wrong on that point, so my advice is still to raise it with the author.

How can I improve recognition?

I set myself the task of recognizing passports, but I can't completely recognize all areas. Tell me, what can help? I have used different filtering and the Canny algorithm, but something is missing. The code cannot recognize the series and number of the document, nor the small characters, and sometimes it cannot recognize the first or last name at all....
# import the necessary packages
from PIL import Image
import pytesseract
import argparse
import cv2
import os
import numpy as np

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image")
ap.add_argument("-p", "--preprocess", type=str, default="thresh")
args = vars(ap.parse_args())

# load the example image and convert it to grayscale
image = cv2.imread("pt.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.Canny(image, 300, 300, apertureSize=3)

# check to see if we should apply thresholding to preprocess the image
if args["preprocess"] == "thresh":
    gray = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
# make a check to see if median blurring should be done to remove noise
elif args["preprocess"] == "blur":
    gray = cv2.medianBlur(gray, 3)

# write the grayscale image to disk as a temporary file so we can apply OCR to it
filename = "{}.png".format(os.getpid())
cv2.imwrite(filename, gray)

# load the image as a PIL/Pillow image, apply OCR, and then delete the temporary file
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
text = pytesseract.image_to_string(image, lang='rus+eng')
os.remove(filename)
print(text)
os.system('python gon.py > test.txt')  # doc output file

# show the output images
cv2.imshow("Image", image)
cv2.imshow("Output", gray)
cv2.waitKey(0)
It's easier (and faster) for Tesseract to recognize text when you provide it only with the regions that contain the text you want to interpret, in your case the big, black letters in the middle, for example:
I'm referring to running Tesseract only in the regions in green. Since the document's structure is predictable, you could easily find these regions as follows:
binarize and invert the image (black = empty)
use opencv's connectedComponentsWithStats() function to get a list of all connected components with their positions and size
you can hardcode thresholds to filter only the characters you want, or get a histogram of areas and use statistics to define thresholds dynamically
on the remaining connected components use morphological operations (e.g. dilation with a horizontal kernel) to connect letters together horizontally
get the bounding box of the final connected components
Optional: postprocessing will work much better in these isolated boxes
Feed each bounding box as a separate Mat to Tesseract, as in the sketch below; it will greatly simplify the problem.
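A minimal sketch of that pipeline, assuming a hypothetical input file passport.jpg; the kernel size and the hardcoded height/area thresholds are illustrative values that need tuning for real scans:
import cv2
import pytesseract

img = cv2.imread("passport.jpg", cv2.IMREAD_GRAYSCALE)

# binarize and invert so the text is white on a black background
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)

# dilate with a horizontal kernel so letters merge into text lines
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (25, 3))
dilated = cv2.dilate(binary, kernel, iterations=1)

# list all connected components with their positions and sizes
n, labels, stats, centroids = cv2.connectedComponentsWithStats(dilated, connectivity=8)
for i in range(1, n):  # label 0 is the background
    x, y, w, h, area = stats[i]
    if h < 15 or area < 200:  # hardcoded thresholds, tune for your scans
        continue
    roi = img[y:y + h, x:x + w]  # one text region per bounding box
    text = pytesseract.image_to_string(roi, lang="rus+eng", config="--psm 7")
    print((x, y, w, h), text.strip())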

tesseract not able to read all digits accurately

I'm using Tesseract to recognize numbers from images of a screen taken with a phone camera. I've done some preprocessing of the image (processed image), and using Tesseract I'm able to get some mixed results. Using the following code on the above images, I get the output "EOE". However, with this image (processed image), I get an exact match: "39:45.8".
import cv2
import pytesseract
from PIL import Image, ImageEnhance
from matplotlib import pyplot as plt

orig_name = "time3.jpg"
image_name = "time3_.jpg"
img = cv2.imread(orig_name, 0)
img = cv2.medianBlur(img, 5)
img_th = cv2.adaptiveThreshold(img, 255,
                               cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2)
cv2.imshow('image', img_th)
cv2.waitKey(0)
cv2.imwrite(image_name, img_th)
im = Image.open(image_name)
time = pytesseract.image_to_string(im, config="-psm 7")
print(time)
Is there anything I can do to get more consistent results?
I did three additional things to get it correct for the first image.
1. You can set a whitelist for Tesseract. In your case we know that there will only be characters from the list 01234567890.:, which improves the accuracy significantly.
2. I resized the image to make it easier for Tesseract.
3. I switched from psm mode 7 to 11 (recognize as much as possible).
Code:
import cv2
import pytesseract
from PIL import Image, ImageEnhance

orig_name = "./time1.jpg"
img = cv2.imread(orig_name)
height, width, channels = img.shape
imgResized = cv2.resize(img, (width * 3, height * 3))
cv2.imshow("img", imgResized)
cv2.waitKey()
im = Image.fromarray(imgResized)
time = pytesseract.image_to_string(im, config='--tessdata-dir "/home/rvq/github/tesseract/tessdata/" -c tessedit_char_whitelist=01234567890.: -psm 11 -oem 0')
print(time)
Note:
You can use Image.fromarray(imgResized) to convert an OpenCV image to a PIL Image; you don't have to write it to disk and read it again.
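One caveat, sketched under the assumption that the image comes straight from cv2.imread: OpenCV stores channels in BGR order while PIL expects RGB, so convert before wrapping to avoid swapped colors.
import cv2
from PIL import Image

img = cv2.imread("time1.jpg")               # OpenCV loads images as BGR
rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # PIL expects RGB channel order
pil_img = Image.fromarray(rgb)              # no round-trip through disk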

Simultaneous depth and video from Kinect

I want to get both the depth and video streams from the Kinect into my OpenCV code. I am working on Linux. I have installed the libfreenect module for depth. However, there is only one device listed in /dev/. Now, when I connect the Kinect to my PC and run
camorama -d /dev/video0
I get the depth map. Then I access the device using VideoCapture in OpenCV and I get the RGB video. Now, if I run the camorama command again, I get the RGB video this time. I can't figure out what's happening. I basically want both streams in my OpenCV code. Please help.
Run this Python script:
import freenect
import cv2
import numpy as np
from functions import *

def nothing(x):
    pass

kernel = np.ones((5, 5), np.uint8)

def pretty_depth(depth):
    np.clip(depth, 0, 2**10 - 1, depth)
    depth >>= 2
    depth = depth.astype(np.uint8)
    return depth

while 1:
    orig = freenect.sync_get_video()[0]
    orig = cv2.cvtColor(orig, cv2.COLOR_BGR2RGB)
    dst = pretty_depth(freenect.sync_get_depth()[0])  # input from kinect
    cv2.imshow('Disparity', dst)
    cv2.imshow('RGB', orig)
    if cv2.waitKey(1) & 0xFF == ord('b'):
        break
