This is my first experience with image processing. In a Jupyter notebook, using SciPy, I am trying to convert a grayscale line-art image into an SVG vector representation. So far I have been able to convert the grayscale image to binary (monochrome) and apply a Sobel filter along the x and y axes to get the edges of the drawing. However, I get double lines as edges, one for each side of every stroke (as shown in the picture below, along with the code I used).
I want to replace these double lines with a single one, and then detect the lines and curves in the drawing and convert them to SVG lines and Bézier curves. Searching online, I am getting a bit overwhelmed and confused about the proper way forward, so some pointers on how to proceed from here would be a great help. If possible, I want to do this with SciPy only and not with OpenCV.
Rather than simply using the existing SciPy functions and algorithms, I also want to learn the underlying theory so that I can use them effectively. Please share any helpful theoretical resources.
Thanks in advance.
%matplotlib inline
import numpy as np
from scipy import ndimage as nd
import matplotlib.pyplot as plt
from skimage import io
def apply_gradient_threshold(d,thres):
    d2 = np.copy(d)
    d2[d2 == -thres] = thres
    d2[d2 != thres] = 0
    return d2

def plot_images(imgs, names):
    fig, axes_list = plt.subplots(1, len(imgs), figsize=(20, 20))
    for name,axes in zip(names, axes_list):
        axes.set_title(name)
    for img, axes in zip(imgs, axes_list):
        axes.imshow(img, cmap='Greys_r')
    plt.show()

img_file = <file_url>
img = plt.imread(img_file)
gray_img = io.imread(img_file, as_gray=True)
if np.max(gray_img) > 1:
    gray_img = gray_img/255 #normalize
threshold = 0.2
binary = (gray_img > threshold)*1 # convert the grayscale image to binary (monochrome)
im = binary.astype('int32')
dx = nd.sobel(im,1)
dy = nd.sobel(im,0)
dx = apply_gradient_threshold(dx, 4)
dy = apply_gradient_threshold(dy, 4)
mag = np.hypot(dx,dy) #sqrt(dx^2 + dy^2)
mag *= 255.0/np.max(mag)
plot_images([binary, mag ], ['Binary - ' + str(threshold), 'Sobel Filter Result'])
Your image is virtually already made of edges. Use thinning, not an edge filter.
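A minimal sketch of the thinning suggestion, assuming scikit-image (already imported in the question for io) is acceptable alongside SciPy. skimage.morphology.skeletonize expects the strokes to be the foreground (True) pixels, so a drawing with dark lines on a light background would have to be inverted first. The thin_line_art helper and the synthetic demo stroke are illustrative names, not part of the original code.

```python
import numpy as np
from skimage.morphology import skeletonize

def thin_line_art(binary):
    """Reduce foreground strokes (True pixels) to thin centerlines."""
    return skeletonize(binary.astype(bool))

# Synthetic stand-in for a line drawing: one 5-pixel-thick horizontal stroke.
demo = np.zeros((21, 40), dtype=bool)
demo[8:13, 5:35] = True

skeleton = thin_line_art(demo)
print("stroke pixels before:", int(demo.sum()), "after:", int(skeleton.sum()))
```

The skeleton keeps one thin line per stroke instead of the two outlines that an edge filter produces, which is usually a better starting point for tracing lines and curves.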
I am working on detecting handwritten symbols using computer vision in Python. I trained a CNN on a dataset of individual characters, but now I want to extract characters from an image in order to make predictions on each one. What is the best way to do this? The handwritten text I will be working with is not cursive, and there is an obvious separation between the characters.
In the snippet below, the boxes variable holds the dimensions of each character in the image.
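import cv2
import pytesseract
from google.colab.patches import cv2_imshow  # assumes Google Colab, where cv2_imshow replaces cv2.imshow
file = '/content/Captchas/image22.jpg'
img = cv2.imread(file)
h, w, _ = img.shape
boxes = pytesseract.image_to_boxes(img)
for b in boxes.splitlines():
    b = b.split(' ')
    # Tesseract's origin is the bottom-left corner, so flip the y coordinates
    img = cv2.rectangle(img, (int(b[1]), h - int(b[2])), (int(b[3]), h - int(b[4])), (0, 255, 0), 2)
cv2_imshow(img)
print(boxes)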
You can use cv2.findContours and bound each contour with a box.
import cv2
image = cv2.imread("filename")
image = cv2.fastNlMeansDenoisingColored(image,None,10,10,7,21)
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
res,thresh = cv2.threshold(gray,150,255,cv2.THRESH_BINARY_INV) #threshold
kernel = cv2.getStructuringElement(cv2.MORPH_CROSS,(3,3))
dilated = cv2.dilate(thresh,kernel,iterations = 5)
# findContours returns 3 values in OpenCV 3 and 2 in OpenCV 4; [-2:] works with both
contours, hierarchy = cv2.findContours(dilated,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_NONE)[-2:]
coord = []
for contour in contours:
    [x,y,w,h] = cv2.boundingRect(contour)
    if h>300 and w>300: # skip boxes that are too large
        continue
    if h<40 or w<40: # skip boxes that are too small
        continue
    coord.append((x,y,w,h))
coord.sort(key=lambda tup:tup[0]) # if the image has only one sentence sort in one axis
count = 0
for cor in coord:
    [x,y,w,h] = cor
    t = image[y:y+h,x:x+w,:]
    cv2.imwrite(str(count)+".png",t)
    count += 1 # advance the index so each crop gets its own file
print("number of char in image:", count)
I want to measure the distance between two points using scikit-image. Here is the image:
In the above photo I want to measure the distance between the red point and black point. The unit of measurement does not matter to me as I want to normalize the distance by the end of the day. Any idea how I can do it?
Thanks
You could get the job done through the following stepwise procedure:
1. Compute the distance from each pixel to the red and black colors.
2. Binarize using an appropriate threshold.
3. Perform morphological closing.
4. Determine the centroid coordinates of the resulting blobs.
5. Calculate the distance between the centroids.
Hopefully the code below will put you on the right track:
import numpy as np
from skimage import io
from skimage.morphology import closing
from skimage.measure import regionprops
import matplotlib.pyplot as plt
from matplotlib.patches import ConnectionPatch
img = io.imread('https://i.stack.imgur.com/vbOmy.jpg')
red = [255, 0, 0]
black = [0, 0, 0]
threshold = 10
dist_from_red = np.linalg.norm(img - red, axis=-1)
dist_from_black = np.linalg.norm(img - black, axis=-1)
red_blob = closing(dist_from_red < threshold)
black_blob = closing(dist_from_black < threshold)
labels = np.zeros(shape=img.shape[:2], dtype=np.ubyte)
labels[black_blob] = 1
labels[red_blob] = 2
blobs = regionprops(labels)
center_0 = np.asarray(blobs[0].centroid[::-1])
center_1 = np.asarray(blobs[1].centroid[::-1])
dist = np.linalg.norm(center_0 - center_1)
Demo
fig, ax = plt.subplots(1, 1)
ax.imshow(img)
con = ConnectionPatch(xyA=center_0, xyB=center_1,
                      coordsA='data', arrowstyle="-|>", ec='yellow')
ax.add_artist(con)
plt.annotate('Distance = {:.2f}'.format(dist),
             xy=(center_0 + center_1)/2, xycoords='data',
             xytext=(0.5, 0.7), textcoords='figure fraction', color='blue',
             arrowprops=dict(arrowstyle="->", color='blue'))
plt.show()  # recent Matplotlib releases do not accept a figure argument here
I am trying to make a shape recognition classifier in which if you give an individual picture of an object (from a scene), it would be able to classify (after machine learning) the shape of an object (cylinder, cube, sphere, etc).
Original scene:
Individual objects it will classify:
I attempted to do this using cv2.approxPolyDP to classify a cylinder. However, either my implementation isn't good or this wasn't a good choice of algorithm in the first place: the cylinder-shaped objects were assigned an approxPolyDP value (number of vertices) of 3 or 4.
Perhaps I can threshold and, in general, assume the object is a cylinder when the value is 3 or 4, but I feel that this is not the most reliable method for 3D shape classification. There should be a better approach than hardcoding values; with this method, a cylinder can easily be confused with a cube.
Is there any way I can improve my 3D shape recognition program?
Code:
import cv2
import numpy as np
from pyimagesearch import imutils
from PIL import Image
from time import time
def invert_img(img):
    img = (255-img)
    return img

def threshold(im):
    imgray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
    imgray = cv2.medianBlur(imgray,9)
    imgray = cv2.Canny(imgray,75,200)
    return imgray

def view_all_contours(im, size_min, size_max):
    main = np.array([[]])
    cnt_target = im.copy()
    for c in cnts:  # cnts is the global contour list computed further down
        epsilon = 0.1*cv2.arcLength(c,True)
        approx = cv2.approxPolyDP(c,epsilon,True)
        area = cv2.contourArea(c)
        print('area: ', area)
        test = im.copy()
        # To weed out contours that are too small or big
        if area > size_min and area < size_max:
            print(c[0,0])
            print('approx: ', len(approx))
            max_pos = c.max(axis=0)
            max_x = max_pos[0,0]
            max_y = max_pos[0,1]
            min_pos = c.min(axis=0)
            min_x = min_pos[0,0]
            min_y = min_pos[0,1]
            # Load each contour onto image
            cv2.drawContours(cnt_target, c, -1,(0,0,255),2)
            print('Found object')
            frame_f = test[min_y:max_y , min_x:max_x]
            main = np.append(main, approx[None,:][None,:])
            thresh = frame_f.copy()
            thresh = threshold(thresh)
            contours_small, hierarchy = cv2.findContours(thresh.copy(),cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
            cnts_small = sorted(contours_small, key = cv2.contourArea, reverse = True)
            cv2.drawContours(frame_f, cnts_small, -1,(0,0,255),2)
            cv2.imshow('Thresh', thresh)
            cv2.imshow('Show Ya', frame_f)
            cv2.waitKey(0)
    # Uncomment in order to show all rectangles in image
    print('---------------------------------------------')
    #cv2.drawContours(cnt_target, cnts, -1,(0,255,0),2)
    print(main.shape)
    print(main)
    return cnt_target

time_1 = time()
roi = cv2.imread('images/beach_trash_3.jpg')
hsv = cv2.cvtColor(roi,cv2.COLOR_BGR2HSV)
target = cv2.imread('images/beach_trash_3.jpg')
target = imutils.resize(target, height = 400)
hsvt = cv2.cvtColor(target,cv2.COLOR_BGR2HSV)
img_height = target.shape[0]
img_width = target.shape[1]
# calculating object histogram
roihist = cv2.calcHist([hsv],[0, 1], None, [180, 256], [0, 180, 0, 256] )
# normalize histogram and apply backprojection
cv2.normalize(roihist,roihist,0,255,cv2.NORM_MINMAX)
dst = cv2.calcBackProject([hsvt],[0,1],roihist,[0,180,0,256],1)
# Now convolute with circular disc
disc = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(5,5))
cv2.filter2D(dst,-1,disc,dst)
# threshold and binary AND
ret,thresh = cv2.threshold(dst,50,255,0)
thresh_one = thresh.copy()
thresh = cv2.merge((thresh,thresh,thresh))
res = cv2.bitwise_and(target,thresh)
# Implementing morphological erosion & dilation
kernel = np.ones((9,9),np.uint8) # (6,6) to get more contours (9,9) to reduce noise
thresh_one = cv2.erode(thresh_one, kernel, iterations = 3)
thresh_one = cv2.dilate(thresh_one, kernel, iterations=2)
# Invert the image
thresh_one = invert_img(thresh_one)
# To show prev img
#res = np.vstack((target,thresh,res))
#cv2.imwrite('res.jpg',res)
#cv2.waitKey(0)
#cv2.imshow('Before contours', thresh_one)
cnt_target = target.copy()
cnt_full = target.copy()
# Code to draw the contours
contours, hierarchy = cv2.findContours(thresh_one.copy(),cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
cnts = sorted(contours, key = cv2.contourArea, reverse = True)
print(time() - time_1)
size_min = 200
size_max = 5000
cnt_target = view_all_contours(target, size_min, size_max)
cv2.drawContours(cnt_full, cnts, -1,(0,0,255),2)
res = imutils.resize(thresh_one, height = 700)
cv2.imshow('Original image', target)
cv2.imshow('Preprocessed', thresh_one)
cv2.imshow('All contours', cnt_full)
cv2.imshow('Filtered contours', cnt_target)
cv2.waitKey(0)
I'm studying image processing from the well-known Gonzalez book "Digital Image Processing". In its discussion of image restoration, many examples use computer-generated noise (Gaussian, salt and pepper, etc.). MATLAB has built-in functions for this. What about OpenCV?
As far as I know, there are no convenient built-in functions like in MATLAB. But with only a few lines of code you can create those images yourself.
For example additive gaussian noise:
Mat gaussian_noise = img.clone();
randn(gaussian_noise,128,30);
Salt and pepper noise:
Mat saltpepper_noise = Mat::zeros(img.rows, img.cols,CV_8U);
randu(saltpepper_noise,0,255);
Mat black = saltpepper_noise < 30;
Mat white = saltpepper_noise > 225;
Mat saltpepper_img = img.clone();
saltpepper_img.setTo(255,white);
saltpepper_img.setTo(0,black);
There is a function random_noise() in the scikit-image package. It has several built-in noise modes, such as gaussian, s&p (for salt-and-pepper noise), poisson and speckle.
Below is an example of how to use it:
from PIL import Image
import numpy as np
from skimage.util import random_noise
im = Image.open("test.jpg")
# convert PIL Image to ndarray
im_arr = np.asarray(im)
# random_noise() converts an image in [0, 255] to floats in [0, 1.0];
# internally it draws samples from a normal distribution
# and adds the generated noise to the image
noise_img = random_noise(im_arr, mode='gaussian', var=0.05**2)
noise_img = (255*noise_img).astype(np.uint8)
img = Image.fromarray(noise_img)
img.show()
There is also a package called imgaug, which is dedicated to augmenting images in various ways. It provides Gaussian, Poisson and salt-and-pepper noise augmenters. Here is how you can use it to add noise to an image:
from PIL import Image
import numpy as np
from imgaug import augmenters as iaa
def main():
    im = Image.open("bg_img.jpg")
    im_arr = np.asarray(im)
    # gaussian noise
    # aug = iaa.AdditiveGaussianNoise(loc=0, scale=0.1*255)
    # poisson noise
    # aug = iaa.AdditivePoissonNoise(lam=10.0, per_channel=True)
    # salt and pepper noise
    aug = iaa.SaltAndPepper(p=0.05)
    im_arr = aug.augment_image(im_arr)
    im = Image.fromarray(im_arr).convert('RGB')
    im.show()

if __name__ == "__main__":
    main()
A simple function to add Gaussian, salt-and-pepper, speckle and Poisson noise to an image:
Parameters
----------
image : ndarray
    Input image data. Will be converted to float.
mode : str
    One of the following strings, selecting the type of noise to add:
    'gauss'     Gaussian-distributed additive noise.
    'poisson'   Poisson-distributed noise generated from the data.
    's&p'       Replaces random pixels with 0 or 1.
    'speckle'   Multiplicative noise using out = image + n*image, where
                n is Gaussian noise with zero mean and unit variance.
import numpy as np
import os
import cv2
def noisy(noise_typ,image):
    if noise_typ == "gauss":
        row,col,ch= image.shape
        mean = 0
        #var = 0.1
        #sigma = var**0.5
        gauss = np.random.normal(mean,1,(row,col,ch))
        gauss = gauss.reshape(row,col,ch)
        noisy = image + gauss
        return noisy
    elif noise_typ == "s&p":
        row,col,ch = image.shape
        s_vs_p = 0.5
        amount = 0.004
        out = np.copy(image) # copy so the caller's image is not modified
        # Salt mode
        num_salt = np.ceil(amount * image.size * s_vs_p)
        # randint's upper bound is exclusive, so pass i to reach the last index
        coords = [np.random.randint(0, i, int(num_salt))
                  for i in image.shape]
        out[tuple(coords)] = 1 # index with a tuple of arrays, one per axis
        # Pepper mode
        num_pepper = np.ceil(amount* image.size * (1. - s_vs_p))
        coords = [np.random.randint(0, i, int(num_pepper))
                  for i in image.shape]
        out[tuple(coords)] = 0
        return out
    elif noise_typ == "poisson":
        vals = len(np.unique(image))
        vals = 2 ** np.ceil(np.log2(vals))
        noisy = np.random.poisson(image * vals) / float(vals)
        return noisy
    elif noise_typ =="speckle":
        row,col,ch = image.shape
        gauss = np.random.randn(row,col,ch)
        gauss = gauss.reshape(row,col,ch)
        noisy = image + image * gauss
        return noisy
"Salt & Pepper" noise can be added in a quite simple fashion using NumPy matrix operations.
def add_salt_and_pepper(gb, prob):
    '''Adds "Salt & Pepper" noise to an image.
    gb: should be one-channel image with pixels in [0, 1] range
    prob: probability (threshold) that controls level of noise'''
    rnd = np.random.rand(gb.shape[0], gb.shape[1])
    noisy = gb.copy()
    noisy[rnd < prob] = 0
    noisy[rnd > 1 - prob] = 1
    return noisy
# Adding noise to the image
import cv2
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
img = cv2.imread('./fruit.png',0)
im = np.zeros(img.shape, np.uint8) # do not use original image it overwrites the image
mean = 0
sigma = 10
cv2.randn(im,mean,sigma) # create the random distribution
Fruit_Noise = cv2.add(img, im) # add the noise to the original image
plt.imshow(Fruit_Noise, cmap='gray')
The values of mean and sigma can be changed to control the offset and strength of the Gaussian noise.
Use randn for normally distributed noise or randu for uniformly distributed noise, according to your needs; thresholding a randu image is one way to build salt-and-pepper noise. Have a look at the documentation: https://docs.opencv.org/2.4/modules/core/doc/operations_on_arrays.html#cv2.randu
I made some changes to Shubham Pachori's code. When an image is read into a NumPy array, the default dtype is uint8, which can cause wrap-around when noise is added to the image.
import numpy as np
from PIL import Image
"""
image: read through PIL.Image.open('path')
sigma: standard deviation of the Gaussian noise
factor: the bigger this value is, the noisier the poisson_noised image is
##IMPORTANT: when reading an image into a NumPy array, the default dtype is uint8,
which can cause wrapping when adding noise onto the image.
E.g., example = np.array([128,240,255], dtype='uint8')
example + 50 = np.array([178,34,49], dtype='uint8')
Converting the array to dtype='int16' solves this problem.
"""
def gaussian_noise(image, sigma):
    img = np.array(image)
    noise = np.random.randn(img.shape[0], img.shape[1], img.shape[2])
    img = img.astype('int16')
    img_noise = img + noise * sigma
    img_noise = np.clip(img_noise, 0, 255)
    img_noise = img_noise.astype('uint8')
    return Image.fromarray(img_noise)

def poisson_noise(image, factor):
    factor = 1 / factor
    img = np.array(image)
    img = img.astype('int16')
    img_noise = np.random.poisson(img * factor) / float(factor)
    np.clip(img_noise, 0, 255, img_noise)
    img_noise = img_noise.astype('uint8')
    return Image.fromarray(img_noise)
http://scikit-image.org/docs/dev/api/skimage.util.html#skimage.util.random_noise
skimage.util.random_noise(image, mode='gaussian', seed=None, clip=True, **kwargs)
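A short, hedged usage sketch for the signature above, here with salt-and-pepper noise. The synthetic mid-gray image is an assumption standing in for a real photo, and amount is the keyword that mode='s&p' accepts for the fraction of pixels to replace. The seed argument is left out because its name has changed between scikit-image releases.

```python
import numpy as np
from skimage.util import random_noise

# Synthetic mid-gray test image (assumption: stands in for a real photo).
image = np.full((64, 64), 128, dtype=np.uint8)

# For mode='s&p', `amount` is the fraction of pixels replaced by salt or pepper.
noisy = random_noise(image, mode='s&p', amount=0.05)

# random_noise returns floats in [0, 1]; scale back to uint8 for saving or display.
noisy_u8 = (255 * noisy).round().astype(np.uint8)
print("changed fraction:", float((noisy_u8 != 128).mean()))
```

Because the result is always a float image in [0, 1], the conversion back to uint8 is needed before handing it to code that expects 8-bit pixels.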
#Adding noise
import numpy as np
m, n = img.shape
saltpepper_noise = np.random.rand(m, n) # uniform random values in [0, 1)
# values <= 0.5 become pepper (0), all other values become salt (255)
saltpepper_noise = np.where(saltpepper_noise <= 0.5, 0, 255)
def add_salt_noise(src, ratio: float = 0.05, noise: list = [0, 0, 0]):
    dst = src.copy()
    import random
    shuffle_dict = {}
    i = 0
    while i < (int(dst.shape[0]*dst.shape[1] * ratio)):
        x, y = random.randint(0, dst.shape[0] - 1), random.randint(0, dst.shape[1] - 1)
        if (x, y) in shuffle_dict:
            continue
        else:
            dst[x, y] = noise
            shuffle_dict[(x, y)] = 0
            i += 1
    return dst
Although OpenCV has no built-in function like MATLAB's
imnoise(image,noiseType,NoiseLevel), we can easily add the required amount of
random-valued impulse noise or salt-and-pepper noise to an image manually.
To add random-valued impulse noise:
import random as r
def addRvinGray(image,n): # add random valued impulse noise in grayscale
    '''parameters:
    image: type=numpy array. input image in which you want add noise.
    n: noise level (in percentage)'''
    k=0 # counter variable
    ih=image.shape[0]
    iw=image.shape[1]
    noisypixels=(ih*iw*n)/100 # here we calculate the number of pixels to be altered.
    for i in range(ih*iw):
        if k<noisypixels:
            image[r.randrange(0,ih)][r.randrange(0,iw)]=r.randrange(0,256) #access random pixel in the image gives random intensity (0-255)
            k+=1
        else:
            break
    return image
To add salt-and-pepper noise:
def addSaltGray(image,n): #add salt-&-pepper noise in grayscale image
    k=0
    salt=True
    ih=image.shape[0]
    iw=image.shape[1]
    noisypixels=(ih*iw*n)/100
    for i in range(ih*iw):
        if k<noisypixels: #keep track of noise level
            if salt==True:
                image[r.randrange(0,ih)][r.randrange(0,iw)]=255
                salt=False
            else:
                image[r.randrange(0,ih)][r.randrange(0,iw)]=0
                salt=True
            k+=1
        else:
            break
    return image
Note: for color images, first split the image into three or four channels, depending on the input image, using the OpenCV function cv2.split:
(B, G, R) = cv2.split(image)
(B, G, R, A) = cv2.split(image)
After splitting, perform the same operations on all channels.
At the end, merge all the channels:
merged = cv2.merge([B, G, R])
return merged
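For color images, one alternative to splitting and merging channels is to draw a single random value per pixel and apply the resulting mask to all channels at once. The following is a minimal NumPy sketch under that assumption: add_salt_pepper_color and its parameters are hypothetical names, and it sets every channel of a chosen pixel together (pure white or pure black), which is not identical to running the per-channel functions independently.

```python
import numpy as np

def add_salt_pepper_color(image, percent, rng=None):
    """Return a copy of an H x W x C uint8 image in which roughly `percent` %
    of the pixels are set to white (salt) or black (pepper)."""
    rng = np.random.default_rng() if rng is None else rng
    out = image.copy()                    # leave the input untouched
    rnd = rng.random(image.shape[:2])     # one uniform value per pixel
    p = percent / 100.0
    out[rnd < p / 2] = 0                  # pepper: all channels to 0
    out[rnd > 1 - p / 2] = 255            # salt: all channels to 255
    return out

# Synthetic gray color image (assumption: stands in for a real BGR image).
demo = np.full((100, 100, 3), 127, dtype=np.uint8)
noisy = add_salt_pepper_color(demo, percent=10, rng=np.random.default_rng(0))
print("changed fraction:", float((noisy != 127).any(axis=-1).mean()))
```

Unlike the loop-based functions, this version works on a copy and never picks the same pixel twice, so the noise level stays close to the requested percentage.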