Given a batch of images, I have to find the images that fit together best, like in the example given below, but my solutions are not working:
Left image
Right image
I first tried the Google Cloud Vision API, but it wasn't giving good results. Then I trained a model with Ludwig, but it would take forever to try all possible combinations of images, as I have 2500 left images and 2500 right images.
Is there a way to find this out, or to reduce the number of possible cases, so that I can use it in my model?
This solution looks at a pair of images. The algorithm evaluates whether the shapes in the two images will mesh like a key and a lock. My answer does not attempt to align the images.
The first step is to find the contours in the images:
import cv2

left = cv2.imread('/home/stephen/Desktop/left.png')
right = cv2.imread('/home/stephen/Desktop/right.png')
# Resize
left = cv2.resize(left, (320,320))
gray = cv2.cvtColor(left, cv2.COLOR_BGR2GRAY)
# Threshold to get a binary image for findContours (the threshold value of 100 is a guess)
_, thresh = cv2.threshold(gray, 100, 255, cv2.THRESH_BINARY)
# Find contours (OpenCV 3.x returns image, contours, hierarchy)
_, left_contours, _ = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# Approximate the contour to reduce the number of points
left_contour = left_contours[0]
epsilon = 0.005*cv2.arcLength(left_contour, True)
left_contour = cv2.approxPolyDP(left_contour, epsilon, True)
# The same steps are repeated on the right image to get right_contour
What is a contour? A contour is just a list of points that lie on the perimeter of a shape. The contour for a triangle will have 3 points and a length of 3. The distance between the points will be the length of each leg in the triangle.
Similarly, the distances between the peaks and valleys will match in your images. To compute this, I found the distance between consecutive contour points. Because of the way the images are aligned, I only used the horizontal distance.
left_dx = []
for point in range(len(left_contour)-1):
    a = left_contour[point][0]
    b = left_contour[point+1][0]
    dist = a[0] - b[0]
    left_dx.append(dist)

right_dx = []
for point in range(len(right_contour)-1):
    a = right_contour[point][0]
    b = right_contour[point+1][0]
    # Negate the horizontal distance because this is the keyhole, not the key
    dist = -(a[0] - b[0])
    right_dx.append(dist)

# Reverse so the two profiles will fit
right_dx.reverse()
At this point you can sort of see that the contours line up. If you have better images, the contours will line up in this step. I used SciPy to interpolate and check if the functions line up. If the two functions do line up, then the objects in the images will mesh.
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d

left_x_values = list(range(len(left_dx)))
x = np.array(left_x_values)
y = np.array(left_dx)
left_x_new = np.linspace(x.min(), x.max(), 500)
f = interp1d(x, y, kind='quadratic')
left_y_smooth = f(left_x_new)
plt.plot(left_x_new, left_y_smooth, c='g')
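To actually score how well the two profiles line up, one rough option is to interpolate right_dx onto its own 500-point grid in the same way and compare the two smoothed curves, for example with a mean absolute difference (smaller means a better mesh). This is only a sketch of one possible score:
# Interpolate the right profile the same way as the left one above
x_r = np.arange(len(right_dx))
f_r = interp1d(x_r, np.array(right_dx), kind='quadratic')
right_y_smooth = f_r(np.linspace(x_r.min(), x_r.max(), 500))

# Rough "mesh" score: mean absolute difference between the two smoothed curves,
# both sampled on 500 points; smaller values mean the pieces fit together better
mesh_score = np.mean(np.abs(left_y_smooth - right_y_smooth))
print('mesh score:', mesh_score)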
I tried this again on a pair of images that I generated myself:
The contours:
The distances between contour points:
Fitting the contours:
Related
I've got an image from a phone camera which has a sheet of paper in it. Some coordinates are also marked in the image, to get the distance between them. Since the aspect ratio of the paper is known in advance (0.7072135785007072), I want to correct the distortion so that the whole image looks as if it's taken from the top view. I collect the four corners of the paper and apply OpenCV getPerspectiveTransform as follows:
import math
import numpy as np
import cv2

pts1 = np.float32([[ 717.,  664.],
                   [1112.,  660.],
                   [1117., 1239.],
                   [ 730., 1238.]])
ratio = 0.7072135785007072
cardH = math.sqrt((pts1[2][0]-pts1[1][0])*(pts1[2][0]-pts1[1][0]) + (pts1[2][1]-pts1[1][1])*(pts1[2][1]-pts1[1][1]))
cardW = ratio*cardH
pts2 = np.float32([[pts1[0][0], pts1[0][1]],
                   [pts1[0][0]+cardW, pts1[0][1]],
                   [pts1[0][0]+cardW, pts1[0][1]+cardH],
                   [pts1[0][0], pts1[0][1]+cardH]])
M = cv2.getPerspectiveTransform(pts1, pts2)
With this matrix M, I'm transforming the whole image as follows:
# shape built as (width, height) on purpose, since warpPerspective expects dsize in that order
transformed = np.zeros((image.shape[1], image.shape[0]), dtype=np.uint8)
dst = cv2.warpPerspective(image, M, transformed.shape)
_ = cv2.rectangle(dst, (int(pts2[0][0]), int(pts2[0][1])), (int(pts2[2][0]), int(pts2[2][1])), (0, 255, 0), 2)
The problem with this is that it's correcting the perspective of the paper but distorting the overall image, and I don't know why. The input image is this and the corresponding output image is this. In the input image, points M and O are aligned horizontally, but to my surprise, after transforming the overall image, points M and O are no longer aligned horizontally. Why is that happening?
For an OpenCV project, I need to find a "perfect" homography from descriptor matching. I pushed the parameters: many corners, very low reprojection threshold...
It's close, but I need an almost pixel-perfect homography.
I call the image I'm looking for the marker, and the image which contains the marker the query.
My idea was to refine the 4 corner positions of the marker in the query image, and then recalculate the homography from these refined 4 corners.
I was thinking of using a Hough transform to detect line intersections, which gives corners that are indeed good candidates.
Now I need a score function to assess which of these candidates is the "best". Here is what I tried:
1/
- Let's call H a homography.
- test = query + H*marker. So if my homography were "perfect", test would be identical to query (except for shadows...).
2/ Then I calculate the "difference" naively, like below. I sum the absolute differences and then divide by the area (otherwise the smaller, the better). Not reliable... I gave more weight to the sums and it still fails.
It really seemed like a good idea to me, but it's an epic fail for now: the difference does not help find the "better" corners. Any ideas?
Thank you very much,
Michaël
import cv2

def diffImageScore(testImage, workingImage, queryPersp):
    height, width, depth = testImage.shape
    # quadrilatereArea is my own helper: area of the marker quadrilateral in the query
    area = quadrilatereArea(queryPersp)*1.0
    score = 0
    img1Gray = cv2.cvtColor(cv2.blur(testImage, (3,3)), cv2.COLOR_BGR2GRAY)
    img2Gray = cv2.cvtColor(cv2.blur(workingImage, (3,3)), cv2.COLOR_BGR2GRAY)
    for i in range(0, height):
        for j in range(0, width):
            s1 = abs(int(img1Gray[i,j]) - int(img2Gray[i,j]))
            s1 = pow(s1, 3)
            score += s1
    return score/area
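For reference, step 1/ above (building test from the query, the marker and H) could be sketched roughly like this, assuming marker and query are already loaded as BGR images and H maps marker coordinates into query coordinates:
import cv2

# Warp the marker into the query frame with the candidate homography H
h, w = query.shape[:2]
warped_marker = cv2.warpPerspective(marker, H, (w, h))

# Overlay: keep the query pixels wherever the warped marker is empty
mask = cv2.cvtColor(warped_marker, cv2.COLOR_BGR2GRAY) > 0
test = query.copy()
test[mask] = warped_marker[mask]

# test can now be passed to diffImageScore() as testImage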
I am trying to segment a QFN package in an x-ray image of a PCB. The general description of a QFN package is that it's a square or rectangle in the centre with rectangular pins on the edges. An example is on this image:
I can segment the rectangles in the x-ray image quite well, but I don't know how to write the condition to segment only the QFN package. The package can be square or rectangular and can have a different number of pins on the edges. My idea is to check the close neighborhood of each rectangle, filter out rectangles that are too big, and somehow check if the remaining rectangles are all around it. Is there a better approach? Or how would you check if the big rectangle is surrounded by the small ones?
I am using Python 3.5 and OpenCV 3.1.
This is going to be a bit long. I will give you some basic techniques and guidelines, along with some advanced suggestions, to increase the accuracy of QFN package detection.
Assuming that you already have the red-marked contours stored in a variable contours.
First of all, define the upper and lower limits of the contour area you want to filter. As per the given image, the area range of the central region comes out to be:
CHIP_CENTER_AREA_LOWER, CHIP_CENTER_AREA_UPPER = 20*1000, 25*1000
So we iterate over all contours and keep those whose area lies in the above range; this eliminates the smaller contours, so that we examine only the larger ones.
probable_chip_center_contour_idx = []
for i in range(len(contours)):
    cnt = contours[i]
    area = cv2.contourArea(cnt)
    if CHIP_CENTER_AREA_LOWER < area < CHIP_CENTER_AREA_UPPER:
        probable_chip_center_contour_idx.append(i)
Now, after filtering out probable contours on the basis of area, we check the number of neighbouring contours (pins). The central contours that have the expected number of neighbouring contours within a given radius are our final result.
def is_rect_inside(inner, outer):
    # Helper assumed by this snippet: True if the (x, y, w, h) rect `inner`
    # lies completely inside the (x, y, w, h) rect `outer`
    ix, iy, iw, ih = inner
    ox, oy, ow, oh = outer
    return ix >= ox and iy >= oy and ix + iw <= ox + ow and iy + ih <= oy + oh

radius = 80
EXPECTED_NEIGHBOURING_PINS = 28

for i in probable_chip_center_contour_idx:
    cnt = contours[i]
    cnt_bounding_rect = cv2.boundingRect(cnt)
    extended_cnt_bounding_rect = [cnt_bounding_rect[0] - radius, cnt_bounding_rect[1] - radius,
                                  cnt_bounding_rect[2] + 2*radius, cnt_bounding_rect[3] + 2*radius]

    neighbouring_contours = 0
    for probable_neighbouring_contour in contours:
        probable_bounding_rect = cv2.boundingRect(probable_neighbouring_contour)
        if is_rect_inside(probable_bounding_rect, extended_cnt_bounding_rect):
            neighbouring_contours += 1

    if neighbouring_contours > EXPECTED_NEIGHBOURING_PINS:
        print("QFN Found")
I'd like to compute a sort of direction field on a 2D image, as (poorly) illustrated in this Photoshop mockup. NOTE: This is NOT a vector field as you learn about in differential equations. Instead, this is something that draws along the lines that one would see if they computed level sets of the image.
Are there known methods of obtaining this type of direction field (red lines) of an image? It seems like it almost behaves like the normal to the gradient, but this isn't exactly it, either, since there are places where the gradient is zero and I'd like direction fields at these locations as well.
I was able to find a paper on how to do this for fingerprint processing that went into enough detail that their results were repeatable. It's unfortunately behind a paywall, but here it is for anyone interested and able to access the full text:
Systematic methods for the computation of the directional fields and singular points of fingerprints
EDIT: As requested, here is a quick and dirty summary (in Python) of how this is achieved in the above paper.
A naive approach would be to average the gradient in a small square neighborhood around the target pixel, much like the superimposed grid on the image in the question, and then compute the normal. However, if you simply average the gradient, it's possible that opposite gradients in the region will cancel each other (e.g. when computing the orientation along a ridge). Thus, it is common to compute with squared gradients, since gradients pointing in opposite directions would then be aligned. There is a clever formula for the squared gradient based on the original gradient. I won't give the derivation, but here is the formula:
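In terms of the per-pixel gradient G = (Gx, Gy), the squared gradient works out to
Gs = (Gs,x, Gs,y) = (Gx^2 - Gy^2, 2*Gx*Gy)
which is exactly what the denominator and numerator in the code below accumulate over each block.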
Now, take the sum of squared gradients over the region (modulo some piece-wise defined compensations for the way angles work). Finally, through some arctangent magic, you'll get the orientation field.
If you run the following code on a smooth grayscale bitmap image with the grid-size chosen appropriately and then plot the orientation field O alongside your original image, you'll see how the orientation field more or less gives the angles I asked about in my original question.
from scipy import misc   # on newer SciPy, misc.imread is gone; use imageio.imread instead
import numpy as np
import math

# Import the grayscale image
bmp = misc.imread('path/filename.bmp')

# Compute the gradient - VERY important to convert to floats!
grad = np.gradient(bmp.astype(float))

# Set the block size (the superimposed grid on the sample image in the question)
blockRadius = 5

# Compute the orientation field. The result is a matrix of angles in [0, pi),
# one for each pixel in the original (grayscale) image.
O = np.zeros(bmp.shape)
for x in range(0, bmp.shape[0]):
    for y in range(0, bmp.shape[1]):
        numerator = 0.
        denominator = 0.
        for i in range(max(0, x-blockRadius), min(bmp.shape[0], x+blockRadius)):
            for j in range(max(0, y-blockRadius), min(bmp.shape[1], y+blockRadius)):
                numerator = numerator + 2.*grad[0][i,j]*grad[1][i,j]
                denominator = denominator + (math.pow(grad[0][i,j], 2.) - math.pow(grad[1][i,j], 2.))
        if denominator == 0:
            O[x,y] = 0
        elif denominator > 0:
            O[x,y] = (1./2.)*math.atan(numerator/denominator)
        elif numerator >= 0:
            O[x,y] = (1./2.)*(math.atan(numerator/denominator) + math.pi)
        elif numerator < 0:
            O[x,y] = (1./2.)*(math.atan(numerator/denominator) - math.pi)

# Shift the angles into [0, pi)
for x in range(0, bmp.shape[0]):
    for y in range(0, bmp.shape[1]):
        if O[x,y] <= 0:
            O[x,y] = O[x,y] + math.pi
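To eyeball the result, one quick option (assuming matplotlib is available) is a subsampled quiver plot of the angles on top of the image; depending on how you read the angles against the image axes, you may need to negate or swap the components:
import matplotlib.pyplot as plt

# Subsample so the arrows stay readable; O is indexed (row, column), like bmp
step = 10
rows, cols = np.mgrid[0:bmp.shape[0]:step, 0:bmp.shape[1]:step]
angles = O[::step, ::step]

plt.imshow(bmp, cmap='gray')
# Draw one short arrow per sampled pixel along the local orientation
plt.quiver(cols, rows, np.cos(angles), np.sin(angles), color='r', pivot='mid')
plt.show()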
Cheers!
I'm intending to write a program to detect and differentiate certain objects from a nearly solid background. The foreground and the background have a high contrast difference, which I would further increase to aid the object identification process. I'm planning to use the Hough transform technique and OpenCV.
Sample image
As seen in the above image, I would want to separately identify the circular objects and the square objects (or any other shape out of a finite set of shapes). Since I'm quite new to image processing, I don't know whether such a situation needs a neural network, with each shape learned beforehand. Would a technique such as template matching let me do this without a neural network?
These posts will get you started:
How to detect circles
How to detect squares
How to detect a sheet of paper (advanced square detection)
You will probably have to adjust some parameters in these codes to match your circles/squares, but the core of the technique is shown on these examples.
If you intend to detect shapes other than just circles (and from the image I assume you do), I would recommend Chamfer matching for a quick start, especially as you have a good contrast.
The basic premise, explained in simple terms, is the following (a rough code sketch follows the list):
You do an edge detection (for example, cvCanny in OpenCV).
You create a distance image, where the value of each pixel means the distance from the nearest edge.
You take the shapes you would like to detect, define sample points along the edges of the shape, and try to match these points on the distance image. Basically you just add the values on the distance image which are "under" the coordinates of your sample points, given a specific position of your objects.
Find a good minimization algorithm, the effectiveness of this depends on your application.
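As a very rough sketch of these steps (the file name, template points, Canny thresholds and the brute-force search grid are all placeholders you would replace), something like this captures the idea:
import cv2
import numpy as np

query = cv2.imread('scene.png', cv2.IMREAD_GRAYSCALE)   # placeholder file name

# 1. Edge detection
edges = cv2.Canny(query, 50, 150)

# 2. Distance image: distance of every pixel to the nearest edge pixel
#    (distanceTransform measures distance to the nearest zero pixel, so invert the edges)
dist = cv2.distanceTransform(255 - edges, cv2.DIST_L2, 3)

# 3. Sample points along the edge of the shape to detect; here a 50x50 square
#    centred on the origin (replace with the edge points of your own shape)
template_pts = ([(x, -25) for x in range(-25, 26)] + [(x, 25) for x in range(-25, 26)] +
                [(-25, y) for y in range(-24, 25)] + [(25, y) for y in range(-24, 25)])

def chamfer_cost(cx, cy):
    # Sum of the distance-image values "under" the template points placed at (cx, cy);
    # a small cost means the template edges lie close to edges in the image
    h, w = dist.shape
    total = 0.0
    for dx, dy in template_pts:
        x, y = cx + dx, cy + dy
        if 0 <= x < w and 0 <= y < h:
            total += dist[y, x]
        else:
            total += dist.max()          # penalise points that fall outside the image
    return total / len(template_pts)

# 4. Minimisation, done here as a naive brute-force search over a coarse grid
best = min((chamfer_cost(cx, cy), (cx, cy))
           for cy in range(30, query.shape[0] - 30, 4)
           for cx in range(30, query.shape[1] - 30, 4))
print('best (cost, position):', best)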
This basic approach is a general solution; it usually works well, but without further refinements it is very slow.
Usually it's a good idea to first separate the objects of interest, so you don't always have to do the full search on the whole image. Find a good threshold so you can separate the objects. You still don't know which object it is, but you only have to do the matching itself in the close proximity of this object.
Another good idea is, instead of doing the full search on the high-resolution image, to first do it on a very low resolution. The result will not be very accurate, but you can find the general areas where it's worth doing a search at a higher resolution, so you don't waste your time on areas where there is nothing of interest.
There are a number of more advanced techniques, but it's still worth taking a look at basic chamfer matching, as it is the basis of a large number of techniques.
With the assumption that the objects are simple shapes, here's an approach using thresholding + contour approximation. Contour approximation is based on the assumption that a curve can be approximated by a series of short line segments which can be used to determine the shape of a contour. For instance, a triangle has three vertices, a square/rectangle has four vertices, a pentagon has five vertices, and so on.
Obtain binary image. We load the image, convert to grayscale, Gaussian blur, then adaptive threshold to obtain a binary image.
Detect shapes. Find contours and identify the shape of each contour using contour approximation filtering. This can be done using arcLength to compute the perimeter of the contour and approxPolyDP to obtain the actual contour approximation.
Input image
Detected objects highlighted in green
Labeled contours
Code
import cv2

def detect_shape(c):
    # Compute perimeter of contour and perform contour approximation
    shape = ""
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.04 * peri, True)

    # Triangle
    if len(approx) == 3:
        shape = "triangle"
    # Square or rectangle
    elif len(approx) == 4:
        (x, y, w, h) = cv2.boundingRect(approx)
        ar = w / float(h)
        # A square will have an aspect ratio that is approximately
        # equal to one, otherwise the shape is a rectangle
        shape = "square" if ar >= 0.95 and ar <= 1.05 else "rectangle"
    # Star
    elif len(approx) == 10:
        shape = "star"
    # Otherwise assume it is a circle or oval
    else:
        shape = "circle"
    return shape

# Load image, grayscale, Gaussian blur, and adaptive threshold
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (7,7), 0)
thresh = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 31, 3)

# Find contours and detect shape
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

for c in cnts:
    # Identify shape
    shape = detect_shape(c)

    # Find centroid and label shape name
    M = cv2.moments(c)
    cX = int(M["m10"] / M["m00"])
    cY = int(M["m01"] / M["m00"])
    cv2.putText(image, shape, (cX - 20, cY), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (36,255,12), 2)

cv2.imshow('thresh', thresh)
cv2.imshow('image', image)
cv2.waitKey()