I'm trying to write an app for wild leopard classification and conservation in South Asia. The main challenge is identifying individual leopards by the spot pattern on their foreheads.
The current approach I am using is:
Store the known leopard forehead images as a base list
Get the user-provided leopard image and crop the leopard's forehead
Pre-process the images with a bilateral filter to reduce noise
Identify keypoints using the SIFT algorithm
Use the FLANN matcher to get KNN matches
Select good matches based on a ratio threshold
Sample code:
import cv2 as cv

# Pre-process & reduce noise with a bilateral filter.
img1 = cv.bilateralFilter(baseImg, 9, 75, 75)
img2 = cv.bilateralFilter(userImage, 9, 75, 75)

# SIFT keypoints and descriptors
# (on OpenCV >= 4.4, SIFT lives in the main module: cv.SIFT_create())
detector = cv.xfeatures2d_SIFT.create()
keypoints1, descriptors1 = detector.detectAndCompute(img1, None)
keypoints2, descriptors2 = detector.detectAndCompute(img2, None)

# FLANN parameters
FLANN_INDEX_KDTREE = 1
index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
search_params = dict(checks=50)  # or pass an empty dictionary
matcher = cv.FlannBasedMatcher(index_params, search_params)
knn_matches = matcher.knnMatch(descriptors1, descriptors2, 2)
allmatchpointcount = len(knn_matches)

# Lowe's ratio test
ratio_thresh = 0.7
good_matches = []
for m, n in knn_matches:
    if m.distance < ratio_thresh * n.distance:
        good_matches.append(m)

goodmatchpointcount = len(good_matches)
print("Good match count : ", goodmatchpointcount)
matchsuccesspercentage = goodmatchpointcount / allmatchpointcount * 100
print("Match percentage : ", matchsuccesspercentage)
Problems I have with this approach:
The method has a mediocre success rate and tends to break down on new user images.
The user images are sometimes taken from angles at which some key patterns are not visible or are warped.
The user image quality significantly affects the match result.
I'd appreciate any suggestions for improving this in any way.
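For reference, a rough sketch of how a RANSAC geometric-consistency check could be layered on top of the good matches; this is not part of the code above, and the minimum-match threshold is only a guess:
import cv2 as cv
import numpy as np

MIN_GOOD_MATCHES = 10  # hypothetical threshold, needs tuning

def ransac_inlier_count(keypoints1, keypoints2, good_matches):
    # Too few matches to estimate a homography reliably
    if len(good_matches) < MIN_GOOD_MATCHES:
        return 0
    src_pts = np.float32([keypoints1[m.queryIdx].pt for m in good_matches]).reshape(-1, 1, 2)
    dst_pts = np.float32([keypoints2[m.trainIdx].pt for m in good_matches]).reshape(-1, 1, 2)
    # RANSAC keeps only the matches that agree on a single perspective transform
    H, mask = cv.findHomography(src_pts, dst_pts, cv.RANSAC, 5.0)
    return 0 if mask is None else int(mask.sum())
The inlier count (or inlier ratio) could then be used as the similarity score instead of the raw good-match percentage.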
Sample Images
Base Image
The image above is being matched to the one below (an incorrect pattern was matched):
More sample images as requested.
Related
I am a beginner in image processing and Gabor filters, and I want to use this filter to enhance a fingerprint image.
I have read many articles about fingerprint image enhancement, and I know the steps are:
read image -> normalize -> get orientation map -> Gabor filter -> binarize -> skeletonize
I am now at step 4. My question is how to get the right values for lambda and gamma for the Gabor filter.
My image:
My code:
1- Read the image and get the orientation map using HOG features:
import numpy as np
from skimage.io import imread
from skimage.transform import resize
from skimage.feature import hog

imgc = imread(r'C:\Users\iP\Desktop\printe.jpg', as_gray=True)
imgc = resize(imgc, (64*3, 128*3))
rows, cols = imgc.shape

offset = 24
ori = 9  # 9 orientation bins -> angles 0, 20, 40, ..., 160 degrees
fd, hog_image = hog(imgc, orientations=ori, pixels_per_cell=(offset, offset),
                    cells_per_block=(1, 1), visualize=True, feature_vector=False)
Orientation map:
2- Reshape the orientation map from (8, 16, 1, 1, 9) to (8, 16, 9): 8 rows, 16 columns, 9 orientations.
fd = np.array(fd)
fd = np.reshape(fd, (fd.shape[0], fd.shape[1], ori))

# From (8, 16, 9) to (8, 16, 1):
# choose the angle with the largest magnitude in each cell
angels = np.zeros((fd.shape[0], fd.shape[1], 1))
for r in range(fd.shape[0]):
    for c in range(fd.shape[1]):
        bloc_prop = fd[r, c]
        angelss = bloc_prop.reshape((1, ori))
        angel = np.argmax(angelss)
        angels[r, c] = angel
angels = angels.astype(np.int32)
3- The convolution function:
import cv2

def conv_gabor(img, orient_map, gabor_kernel_shape):
    # Loop over all pixels in the image and convolve each one with the
    # Gabor kernel for its angle in the orientation map.
    roo, coo = img.shape

    # Padding needed before convolving the image with the kernels
    pad = gabor_kernel_shape - 1
    padded = np.zeros((img.shape[0] + pad, img.shape[1] + pad))  # add the extra rows and cols
    padded[int(pad/2):-int(pad/2), int(pad/2):-int(pad/2)] = img  # copy the image inside the padding

    # Result image
    dst = padded.copy()

    # Start from the image region inside the padding
    for r in range(int(pad/2), int(pad/2) + roo):
        for c in range(int(pad/2), int(pad/2) + coo):
            # Get the angle from the orientation map
            ro = (r - int(pad/2)) // offset
            co = (c - int(pad/2)) // offset
            ang = orient_map[ro, co]
            real_angel = float((180 / ori) * ang)

            # Block around the pixel to convolve
            block = padded[r-int(pad/2):r+int(pad/2)+1, c-int(pad/2):c+int(pad/2)+1]

            # Get the Gabor kernel.
            # Here is my question ->> how do I get the parameter values
            # for lambda, gamma and psi?
            ker = cv2.getGaborKernel((gabor_kernel_shape, gabor_kernel_shape), 3,
                                     np.deg2rad(real_angel), np.pi/4, 0.001, 0)
            dst[r, c] = np.sum(ker * block)
    return dst

dst = conv_gabor(imgc, angels, 11)
dst:
As you can see, the result is very bad and I don't know why. I think it is because of the lambda and gamma values, or something else?
But when I filter with a single angle only (45 degrees):
import matplotlib.pyplot as plt

ker = cv2.getGaborKernel((11, 11), 2, np.deg2rad(45), np.pi/4, 0.5, 0)
filt = cv2.filter2D(imgc, cv2.CV_64F, ker)
plt.imshow(filt, 'gray')
Result:
You can see that the edges at 45 degrees on the left come out with good quality.
Can anyone help me, please, and tell me what I should do about this problem?
Thanks all :)
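For reference, the positional arguments in the getGaborKernel calls above map onto the parameters in question like this (the values are simply the ones used above and are not necessarily sensible):
import cv2
import numpy as np

# Argument order: ksize, sigma, theta, lambd, gamma, psi (, ktype)
ker = cv2.getGaborKernel(
    (11, 11),        # ksize: kernel size
    3,               # sigma: std. dev. of the Gaussian envelope
    np.deg2rad(45),  # theta: orientation of the parallel stripes
    np.pi / 4,       # lambd: wavelength of the sinusoid, in pixels
    0.001,           # gamma: spatial aspect ratio of the envelope
    0)               # psi: phase offset
In particular, lambd here is np.pi/4, roughly 0.79 pixels, which is far smaller than the ridge spacing of a typical fingerprint image, so that may be part of the problem.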
EDIT:
I searched for another way and found that I can use a Gabor filter bank with many orientations and pick the best score among the filtered images. So how can I find the best score for pixels from the filtered images?
This is the output when I use a Gabor filter bank with the angles 45, 60, 65, 90 and 135, divide the filtered images into 16x16 blocks, and for each block keep the filtered image with the highest standard deviation (best score -> I use the standard deviation as the score):
As you can see, there are good and bad parts in the image. I think using the standard deviation alone is ineffective in some parts of the image, so my new question is: what is the best score function that gives me good output parts in the image?
Original image:
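For reference, a rough sketch of the block-wise filter-bank scoring described above, assuming imgc is the resized grayscale image from the earlier code; the block size and angles are the ones mentioned, and the standard-deviation score is exactly the part in question:
import cv2
import numpy as np

angles = [45, 60, 65, 90, 135]
block = 16

# Filter the image once per orientation in the bank
bank = [cv2.filter2D(imgc, cv2.CV_64F,
                     cv2.getGaborKernel((11, 11), 2, np.deg2rad(a), np.pi/4, 0.5, 0))
        for a in angles]

rows, cols = imgc.shape
result = np.zeros((rows, cols), dtype=np.float64)

for r in range(0, rows - block + 1, block):
    for c in range(0, cols - block + 1, block):
        # Score each filtered block; the standard deviation is the score used here
        scores = [f[r:r + block, c:c + block].std() for f in bank]
        best = int(np.argmax(scores))
        result[r:r + block, c:c + block] = bank[best][r:r + block, c:c + block]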
In my opinion, weighting the filtered images might be enough for your task. Considering your filter orientations, the filters with angles 45 and 135 respond quite well in different regions of the image. So you can calculate a weighted sum to get the best combined filter result.
import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread('fingerprint.jpg', 0)

w_45 = 0.5
w_135 = 0.5
img_45 = cv2.filter2D(img, cv2.CV_64F, cv2.getGaborKernel((11, 11), 2, np.deg2rad(45), np.pi/4, 0.5, 0))
img_135 = cv2.filter2D(img, cv2.CV_64F, cv2.getGaborKernel((11, 11), 2, np.deg2rad(135), np.pi/4, 0.5, 0))

# Weighted sum of the two responses, rescaled to 0-255 for display
result = img_45 * w_45 + img_135 * w_135
result = result / np.amax(result) * 255
plt.imshow(result, cmap='gray')
plt.show()
Feel free to play with the weights. The result totally depends on what your next step is.
I have gone through the code below and would like to know how I can count the outlier and inlier points after using RANSAC. Could you point me to good example code for doing this?
Second question: which feature matching approach is better, BFMatcher.knnMatch() with a ratio test, or bf = cv.BFMatcher(cv.NORM_HAMMING, crossCheck=True) with the shortest distance? Is there any reference for this comparison?
import cv2 as cv
import numpy as np

# BFMatcher with default params
bf = cv.BFMatcher()
matches = bf.knnMatch(des1, des2, k=2)

# Apply ratio test
good_matches = []
for m, n in matches:
    if m.distance < 0.75 * n.distance:
        good_matches.append([m])

# Draw matches
img3 = cv.drawMatchesKnn(img1, kp1, img2, kp2, good_matches, None,
                         flags=cv.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)
cv.imwrite('matches.jpg', img3)

# Select good matched keypoints
ref_matched_kpts = np.float32([kp1[m[0].queryIdx].pt for m in good_matches])
sensed_matched_kpts = np.float32([kp2[m[0].trainIdx].pt for m in good_matches])

# Compute homography with RANSAC
H, status = cv.findHomography(sensed_matched_kpts, ref_matched_kpts, cv.RANSAC, 5.0)
Count the number of outliers and inliers
# status is the RANSAC inlier mask returned by cv.findHomography
num_inliers = np.sum(status)
num_outliers = len(status) - np.sum(status)
inlier_ratio = float(np.sum(status)) / float(len(status))  # inliers / total matches
Feature Matching Algorithm
I would say that if you are using a sparse float-descriptor algorithm (SIFT or SURF), BFMatcher.knnMatch() with the ratio test is preferred, while bf = cv.BFMatcher(cv.NORM_HAMMING, crossCheck=True) is meant for binary descriptors (ORB, BRISK, etc.). My suggestion would be to try both on your project and see which one works better.
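For illustration, a minimal sketch of the two set-ups side by side (des_sift_1/des_sift_2 and des_orb_1/des_orb_2 are placeholder descriptor arrays from your own detectors):
import cv2 as cv

# Option 1: float descriptors (SIFT/SURF) -> L2 norm, kNN + Lowe's ratio test
bf_ratio = cv.BFMatcher(cv.NORM_L2)
knn = bf_ratio.knnMatch(des_sift_1, des_sift_2, k=2)
good = [m for m, n in knn if m.distance < 0.75 * n.distance]

# Option 2: binary descriptors (ORB/BRISK) -> Hamming norm with cross-check
bf_cross = cv.BFMatcher(cv.NORM_HAMMING, crossCheck=True)
matches = sorted(bf_cross.match(des_orb_1, des_orb_2), key=lambda m: m.distance)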
I'm fairly new to the world of OCR, OpenCV, Tesseract, etc. and was hoping to get some advice or a nudge in the right direction for a project I'm working on. For context, I practice golf at an indoor simulator that is powered by Full Swing Golf. My goal is to build an app (preferably iPhone, but desktop is fine too) that will be able to grab the data provided by the simulator and process it however I'd like. The overall workflow would look something like:
1. Set up an iPhone or laptop camera to watch the simulator screen.
2. Hit the ball.
3. The statistics screen is displayed, looking more or less like:
4. Detect that the statistics screen has been displayed and grab all relevant data:
| Distance | Launch | Back Spin | Club Speed | Carry | To Pin | Direction | Ball Speed | Side Spin | Club Face | Club Path |
|----------|--------|-----------|------------|-------|--------|-----------|------------|-----------|-----------|-----------|
| 345 | 13 | 3350 | 135 | 335 | 80 | 2.4 | 190 | 350 | 4.3 | 1.6 |
5-?: Save the data to my app, keep track of it over time etc...
Attempts So Far:
It seemed like OpenCV's matchTemplate would be a simple way to find all of the headings in the image (Distance, Launch, etc.), and it does seem to work when the image and the template are both at the perfect resolution. However, as this will be an iPhone app, the quality is not something I can really guarantee (within reason). More importantly, the screen will almost never be straight-on as it appears above. Most likely, the camera will be off to the side, and we will have to de-skew accordingly. I've attempted to use the following image to work on my de-skewing logic, to no avail:
Finding the reference points needed to deskew via getPerspectiveTransform and warpPerspective has proven to be incredibly difficult due to the template-matching issues described above.
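For reference, the deskew step itself looks roughly like this once four reference corners of the screen are known (the corner ordering and output size are assumptions; finding those corners reliably is exactly the part that is failing):
import cv2
import numpy as np

def deskew(image, corners, out_w=1280, out_h=720):
    # corners: the four detected screen corners, ordered
    # top-left, top-right, bottom-right, bottom-left
    src = np.float32(corners)
    dst = np.float32([[0, 0], [out_w, 0], [out_w, out_h], [0, out_h]])
    # Map the skewed screen onto a flat, front-on rectangle
    M = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(image, M, (out_w, out_h))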
I've also tried dynamically adjusting for scale with code resembling the following:
import cv2
import numpy as np
import imutils

def findTemplateLocation(image_path):
    # image_gray is the grayscale scene image, loaded elsewhere
    template = cv2.imread(image_path)
    template = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)
    w, h = template.shape[::-1]
    threshold = 0.65
    loc = []
    for scale in np.linspace(0.1, 2, 20)[::-1]:
        # Resize the template and try matching at this scale
        resized = imutils.resize(template, width=int(template.shape[1] * scale))
        w, h = resized.shape[::-1]
        res = cv2.matchTemplate(image_gray, resized, cv2.TM_CCOEFF_NORMED)
        loc = np.where(res >= threshold)
        if len(list(zip(*loc[::-1]))) > 0:
            break
    if loc and len(list(zip(*loc[::-1]))) > 0:
        adjusted_w = int(w / scale)
        adjusted_h = int(h / scale)
        print(str(adjusted_w) + " " + str(adjusted_h) + " " + str(scale))
        ret = []
        for pt in zip(*loc[::-1]):
            ret.append({'width': w, 'height': h, 'location': pt})
        return ret
    return None
This still returns a ton of false positives.
I'm hoping to get some advice on how to approach this problem with a clean slate. I'm open to any language / workflow.
If it does seem that I'm on the right track, my current code is at https://gist.github.com/naderhen/9ec8d45f13d92507131d5bce0e84fad8 . Would really appreciate any suggestions for the best next steps.
Thanks for any help you can provide!
EDIT: Additional Resources
I've uploaded a number of videos and still photos from my time at the indoor simulator this weekend: https://www.dropbox.com/sh/5vub2mi4rvunyaw/AAAY1_7Q_WBV4JvmDD0dEiTDa?dl=0
I tried to get a number of different angles, with different lighting etc. Please let me know if I can provide any other resources that may help.
So, I tried two different methods:
Contour detection - This seemed to be the most obvious method, since the statistics screen is the primary part of the image and is present in all your images. Although it does work with two of the three images, it might not be very robust with respect to the parameters. Here are the steps I tried for contour detection:
First, get the image in grayscale, or take the Value channel from HSV. Then threshold it using either Otsu or adaptive thresholding (a sketch of this preprocessing step is shown after the contour code below). After playing with a lot of the associated parameters, I got satisfactory results, which basically means the whole statistics screen appears in white on a black background. After this, sort the contours like this:
# 'thresh' is the binary image produced by the thresholding step described above
contours = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)[1]

# Sort the contours to avoid unnecessary comparison in the for loop below
cntsSorted = sorted(contours, key=lambda x: cv2.contourArea(x), reverse=True)

for cnt in cntsSorted[0:20]:
    peri = cv2.arcLength(cnt, True)
    approx = cv2.approxPolyDP(cnt, 0.04 * peri, True)
    # Keep only large quadrilaterals (the statistics screen)
    if len(approx) == 4 and peri > 10000:
        cv2.drawContours(sorted_image, cnt, -1, (0, 255, 0), 10)
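For completeness, a rough sketch of the preprocessing described above that produces thresh; whether Otsu or adaptive thresholding works better depends on the image, and the filename is a placeholder:
import cv2

image = cv2.imread("screen.jpg")  # placeholder filename
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Otsu picks the threshold automatically; swap in cv2.adaptiveThreshold if needed
_, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)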
Feature detection and matching: Since using contours wasn't robust enough, I tried another method that I've used for a problem similar to yours. This method is fairly robust and much faster (I tried it on an Android phone two years back and it would do the job in less than a second for a 1280 x 760 image). However, after trying it on your test cases, I realized that your images are pretty varied. What I mean by that is that you have two images in your question with fairly similar primaries, and it works for those, but the images you posted in the comments are very different from them, so it doesn't find a suitable number of good matches (at least 10 in my case). If you can post a nice set of images that you will actually encounter, I will update this answer with my results on the new set.
More importantly, the images of the scene clearly have changes in perspective, which shouldn't be an issue assuming you are able to get a very good source image (like the first one in your question). However, changes in lighting conditions can be a pain. I'd suggest using different color spaces such as HSV, Lab or Luv instead of BGR.
Here is where you can find a working example of how to implement your own feature matcher. Some code changes are required depending on the version of OpenCV you are using, but I am sure you can work those out (I did ;)).
A good example:
Some suggestions:
Try getting as clean an image as possible for the image you are using to match with the others (your first image in my case). Hopefully, this would require you to do less processing.
Try applying an unsharp mask before finding keypoints.
My results are from using ORB. You can also try with other detectors/descriptors like SURF, SIFT and FAST.
Finally, your approach of template matching should work in cases where there is a change only in the scaling and not the perspective.
Hope this helps! Write a comment if you have any additional questions and/or when you have a good image set ready (rubs palms). Cheers!
Edit 1: This is the code that I used for the feature detection and matching, with OpenCV 3.4.3 and Python 3.4:
import cv2
import numpy as np

def unsharp_mask(im):
    # Sharpen the image by subtracting a blurred copy
    gaussian_3 = cv2.GaussianBlur(im, (3, 3), 3.0)
    return cv2.addWeighted(im, 2.0, gaussian_3, -1.0, 0, im)

def screen_finder2(image, source, num=0):
    def resize(im, new_width):
        r = float(new_width) / im.shape[1]
        dim = (new_width, int(im.shape[0] * r))
        return cv2.resize(im, dim, interpolation=cv2.INTER_AREA)

    width = 300
    source = resize(source, new_width=width)
    image = resize(image, new_width=width)

    # Work on the L channel of the Luv color space
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2LUV)
    image, u, v = cv2.split(hsv)
    hsv = cv2.cvtColor(source, cv2.COLOR_BGR2LUV)
    source, u, v = cv2.split(hsv)

    MIN_MATCH_COUNT = 10
    orb = cv2.ORB_create()
    kp1, des1 = orb.detectAndCompute(image, None)
    kp2, des2 = orb.detectAndCompute(source, None)

    flann = cv2.DescriptorMatcher_create(cv2.DescriptorMatcher_FLANNBASED)
    # Without the below 2 lines, matching doesn't work
    des1 = np.asarray(des1, dtype=np.float32)
    des2 = np.asarray(des2, dtype=np.float32)

    matches = flann.knnMatch(des1, des2, k=2)

    # Store all the good matches as per Lowe's ratio test
    good = []
    for m, n in matches:
        if m.distance < 0.7 * n.distance:
            good.append(m)

    if len(good) >= MIN_MATCH_COUNT:
        src_pts = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
        dst_pts = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
        M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
        matchesMask = mask.ravel().tolist()

        h, w = image.shape
        pts = np.float32([[0, 0], [0, h-1], [w-1, h-1], [w-1, 0]]).reshape(-1, 1, 2)
        dst = cv2.perspectiveTransform(pts, M)

        source_bgr = cv2.cvtColor(source, cv2.COLOR_GRAY2BGR)
        img2 = cv2.polylines(source_bgr, [np.int32(dst)], True, (0, 0, 255), 3,
                             cv2.LINE_AA)
        cv2.imwrite("out" + str(num) + ".jpg", img2)
    else:
        print("Not enough matches." + str(len(good)))
        matchesMask = None

    draw_params = dict(matchColor=(0, 255, 0),  # draw matches in green color
                       singlePointColor=None,
                       matchesMask=matchesMask,  # draw only inliers
                       flags=2)
    img3 = cv2.drawMatches(image, kp1, source, kp2, good, None, **draw_params)
    cv2.imwrite("ORB" + str(num) + ".jpg", img3)

match_image = unsharp_mask(cv2.imread("source.jpg"))
image_1 = unsharp_mask(cv2.imread("Screen_1.jpg"))
screen_finder2(match_image, image_1, num=1)
I need to evaluate the results after using ORB and BFMatcher in OpenCV, so that I can interpret the matches after comparing img1 to img3 and img2 to img3. I understand that the ORB matches contain a list of Hamming distances, but I want to convert this vector into a single scalar similarity value.
I thought of two scenarios:
1) Use the number of matches; a higher count indicates greater similarity. But how do we deal with the case where len(matches1) == len(matches2)? In that case, we can 2) add up all the distances, and the smaller total is preferable.
Can we combine both cases into one metric? (A sketch of what I mean follows my code below.)
Here is a minimal version of my code:
import cv2

def match_count(img1, img2):
    orb = cv2.ORB_create()
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)
    bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = bf.match(des1, des2)
    matches = sorted(matches, key=lambda x: x.distance)
    return len(matches)
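To make the two scenarios concrete, here is a rough sketch of one possible combined score; how to weight the match count against the distances is exactly the open question:
import cv2
import numpy as np

def similarity(des1, des2):
    bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = bf.match(des1, des2)
    if not matches:
        return 0.0
    distances = np.array([m.distance for m in matches], dtype=np.float64)
    # More matches is better, a smaller mean distance is better;
    # the +1 simply avoids division by zero
    return len(matches) / (distances.mean() + 1.0)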
Thanks
drawMatches is not drawing any matches. OpenCV 3.0, fully updated Ubuntu. The code runs, but it doesn't show any matches. The test region is cut and copied directly from the image it should match against.
import numpy as np
import cv2
cv2.ocl.setUseOpenCL(False)
img1 = cv2.imread('images/ingrassroi.png',0)
img2 = cv2.imread('images/ingrass.png',0)
img3 = img1.copy()
# Initiate ORB detector
orb = cv2.ORB_create()
# compute the descriptors with ORB
kp1, des1 = orb.detectAndCompute(img1,None)
kp2, des2 = orb.detectAndCompute(img2,None)
# create BFMatcher object
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
# Match descriptors.
matches = bf.match(des1,des2)
# Sort them in the order of their distance.
matches = sorted(matches, key = lambda x:x.distance)
# Draw first 10 matches.
img3 = cv2.drawMatches(img1,kp1,img2,kp2,matches[:10],None, flags=2)
cv2.imshow("Matches",img3)
cv2.waitKey(-1)
It turns out that the image to match against was too small: no keypoints were being found in the train image. I enlarged the area cropped from the test image, and it then found matches and correctly identified the cigarette butt in the image. Just an FYI: if you decide to try the FLANN-based matcher from the same page of the OpenCV Python tutorials, be sure to define FLANN_INDEX_LSH = 6.
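For reference, a FLANN set-up for the binary ORB descriptors would look roughly like this; the table_number, key_size and multi_probe_level values are the ones suggested in the OpenCV tutorial rather than tuned values:
import cv2

FLANN_INDEX_LSH = 6
index_params = dict(algorithm=FLANN_INDEX_LSH,
                    table_number=6,       # number of hash tables
                    key_size=12,          # size of the key in bits
                    multi_probe_level=1)  # 0 = pure LSH, higher probes more buckets
search_params = dict(checks=50)
flann = cv2.FlannBasedMatcher(index_params, search_params)
# des1/des2 are the uint8 ORB descriptors from the code above;
# note that with LSH some knnMatch entries may contain fewer than 2 matches
knn_matches = flann.knnMatch(des1, des2, k=2)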