What's the difference between SiftFeatureDetector() and Ptr<FeatureDetector>? They both apparently do the same thing. The OpenCV tutorial uses SiftFeatureDetector, but the official documentation uses Ptr<FeatureDetector> and makes no mention of SiftFeatureDetector(), so I can't read up on it. The tutorial uses this: int minHessian = 400; SurfFeatureDetector detector( minHessian ); and I don't know what minHessian is supposed to do.
Also, I tried them both on the same image and they give the same result, so why are they different?
int _tmain(int argc, _TCHAR* argv[])
{
    Mat img;
    img = imread("c:\\box.png", 0);   // 0 = load as grayscale
    //cvtColor( img, gry, CV_BGR2GRAY );
    //SiftFeatureDetector detector;
    //vector<KeyPoint> keypoints;
    //detector.detect(img, keypoints);
    Ptr<FeatureDetector> feature_detector = FeatureDetector::create("SIFT");
    vector<KeyPoint> keypoints;
    feature_detector->detect(img, keypoints);
    Mat output;
    drawKeypoints(img, keypoints, output, Scalar::all(-1));
    namedWindow("meh", CV_WINDOW_AUTOSIZE);
    imshow("meh", output);
    waitKey(0);   // keep the window open until a key is pressed
    return 0;
}
EDIT: See the correction by @gantzer89 in the comments below. (Leaving my original text in place for historical clarity.)
In my general experience, using the FeatureDetector::create() syntax (discussed here in the "official documentation" you cited) allows the flexibility to specify your algorithm at runtime via a parameter file, while the more specific classes, such as SiftFeatureDetector, provide more opportunities for customization.
The create() methods start with a set of default algorithm-specific parameters, while the algorithm-specific classes allow customization of these parameters upon construction. Thus, the create() method assigns a default value to minHessian (the SURF Hessian threshold; higher values yield fewer keypoints), while an algorithm-specific constructor such as SurfFeatureDetector detector( minHessian ) lets you choose the value yourself.
As a rule of thumb, if you want to quickly experiment with which algorithm to use, use the create() syntax, and if you want to experiment with fine-tuning a particular algorithm, use the algorithm-specific class constructor.
Greetings, for the past week (or more) I've been struggling with a problem.
I am developing an app which will allow an expert to create a recipe using a provided image of something as a base. The recipe consists of areas of interest. The program is meant to be used by non-experts: they provide images similar to the original, and the software cross-checks the areas of interest in the recipe image against the provided image.
One use-case scenario could be banknotes. The expert would select an area on a good picture of a genuine banknote, and then the user would provide the software with images of banknotes that need to be checked. So the illumination, as well as the capturing device, could be different.
I don't want you guys to delve into the nature of comparing banknotes, that's another monster to tackle and I got it covered for the most part.
My Problem:
Initially I shrink one of the two pictures to the size of the smaller one.
So now we are dealing with pictures having the same size. (I actually perform the shrinking to the areas of interest and not the whole picture, but that shouldn't matter.)
I have tried different methodologies to compare these parts, but each one had its limitations due to the nature of the images. Illumination might be different, the provided image might have some sort of contamination, etc.
What have I tried:
Simple image similarity comparison using RGB difference.
Problem is provided image could be totally different but colours could be similar. So I would get high percentages on "totally" different banknotes.
SSIM on RGB Images.
Would give really low percentage of similarity on all channels.
SSIM after using sobel filter.
Again low percentage of similarity.
I used SSIM from both scikit-image in Python and OpenCV.
Feature matching with Flann.
Couldn't find a good way to use detected matches to extract a similarity.
Basically I am guessing that I need to use various methods and algorithms to achieve the best result. My gut tells me that I will need to combine RGB comparison results with a methodology that will:
Perform some form of edge detection like sobel.
Compare the results based on shape matching or something similar.
I am an image analysis newbie, and I also tried to compare the Sobel outputs of the provided images using the mean and std calculations from OpenCV. However, I either did it wrong or the results I got were useless anyway. I calculated the Euclidean distance between the vectors that resulted from the mean and std calculation, but I could not use the results, mainly because I couldn't see how they related between images.
I am not providing the code I used, firstly because I scrapped some of it, and secondly because I am not looking for a code solution but for a methodology or some direction to study material. (I've read a shitload of papers already.)
Finally I am not trying to detect similar images, but given two images, extract the similarity between them, trying to bypass small differences created by illumination or paper distortion etc.
Finally I would like to say that I tested all the methods by providing the same image twice and I would get 100% similarity, so I didn't totally fuck it up.
Is what I am trying even possible without some sort of training sets to teach the software what are the acceptable variants of the image? (Again I have no idea if that even makes sense :D )
I think you can try feature matching, for example the SURF algorithm together with FLANN.
Example of Feature Detection using SURF : https://docs.opencv.org/3.0-beta/doc/tutorials/features2d/feature_detection/feature_detection.html
#include <stdio.h>
#include <iostream>
#include "opencv2/core.hpp"
#include "opencv2/features2d.hpp"
#include "opencv2/xfeatures2d.hpp"
#include "opencv2/highgui.hpp"
using namespace cv;
using namespace cv::xfeatures2d;
void readme();
/** @function main */
int main( int argc, char** argv )
{
  if( argc != 3 )
  { readme(); return -1; }
  Mat img_1 = imread( argv[1], IMREAD_GRAYSCALE );
  Mat img_2 = imread( argv[2], IMREAD_GRAYSCALE );
  if( !img_1.data || !img_2.data )
  { std::cout<< " --(!) Error reading images " << std::endl; return -1; }
  //-- Step 1: Detect the keypoints using SURF Detector
  int minHessian = 400;
  Ptr<SURF> detector = SURF::create( minHessian );
  std::vector<KeyPoint> keypoints_1, keypoints_2;
  detector->detect( img_1, keypoints_1 );
  detector->detect( img_2, keypoints_2 );
  //-- Draw keypoints
  Mat img_keypoints_1; Mat img_keypoints_2;
  drawKeypoints( img_1, keypoints_1, img_keypoints_1, Scalar::all(-1), DrawMatchesFlags::DEFAULT );
  drawKeypoints( img_2, keypoints_2, img_keypoints_2, Scalar::all(-1), DrawMatchesFlags::DEFAULT );
  //-- Show detected (drawn) keypoints
  imshow("Keypoints 1", img_keypoints_1 );
  imshow("Keypoints 2", img_keypoints_2 );
  waitKey(0);   // wait for a key press so the windows stay visible
  return 0;
}
/** @function readme */
void readme()
{ std::cout << " Usage: ./SURF_detector <img1> <img2>" << std::endl; }
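Detecting keypoints alone does not give a similarity number, which is the part the question says it got stuck on. One common way to get there is Lowe's ratio test: keep a match only when its best distance is clearly smaller than the second-best, then report the share of keypoints that survive. The sketch below is plain Python and does not call OpenCV. It assumes you have already collected, for each descriptor in image A, the distances to its two nearest neighbours in image B (for example from a k=2 nearest-neighbour matcher). The function name, the 0.7 ratio and the normalisation by the smaller keypoint count are my assumptions, not something from the tutorial.

```python
def match_similarity(neighbor_distances, num_keypoints_a, num_keypoints_b, ratio=0.7):
    """Return a 0..100 score from k=2 nearest-neighbour match distances.

    neighbor_distances: list of (best, second_best) descriptor distances,
    one pair per keypoint of image A that found two neighbours in image B.
    """
    # Ratio test: a match counts only if it clearly beats the runner-up.
    good = sum(1 for best, second in neighbor_distances if best < ratio * second)
    # Normalise by the smaller keypoint count so the score cannot exceed 100.
    denom = min(num_keypoints_a, num_keypoints_b)
    if denom == 0:
        return 0.0
    return 100.0 * min(good, denom) / denom


pairs = [(0.1, 0.5), (0.4, 0.5), (0.2, 0.9), (0.3, 0.31)]
print(match_similarity(pairs, 4, 5))
```

A score like this is only comparable between image pairs processed with the same detector settings, so treat the absolute value with some care.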
OK, after some digging around, this is what I came up with:
import numpy as np
import cv2
import sys
import matplotlib.image as mpimg
from skimage import io
from skimage import measure
import time
s = 0
imgA = cv2.imread(sys.argv[1])
imgB = cv2.imread(sys.argv[2])
#imgA = cv2.imread('imageA.bmp')
#imgB = cv2.imread('imageB.bmp')
imgA = cv2.cvtColor(imgA, cv2.COLOR_BGR2GRAY)
imgB = cv2.cvtColor(imgB, cv2.COLOR_BGR2GRAY)
ret,imgA = cv2.threshold(imgA,127,255,0)
ret,imgB = cv2.threshold(imgB,127,255,0)
imgAContours, contoursA, hierarchyA = cv2.findContours(imgA, cv2.RETR_TREE , cv2.CHAIN_APPROX_NONE)
imgBContours, contoursB, hierarchyB = cv2.findContours(imgB, cv2.RETR_TREE , cv2.CHAIN_APPROX_NONE)
imgAContours = cv2.drawContours(imgAContours,contoursA,-1,(0,0,0),1)
imgBContours = cv2.drawContours(imgBContours,contoursB,-1,(0,0,0),1)
imgAContours = cv2.medianBlur(imgAContours,5)
imgBContours = cv2.medianBlur(imgBContours,5)
#s = 100 * 1/(1+cv2.matchShapes(imgAContours,imgBContours,cv2.CONTOURS_MATCH_I2,0.0))
#s = measure.compare_ssim(imgAContours,imgBContours)
#equality = np.equal(imgAContours,imgBContours)
total = 0.0
matching = 0.0   # number of pixels that are equal in both images
for x in range(len(imgAContours)):
    for y in range(len(imgAContours[x])):
        total += 1
        if imgAContours[x, y] == imgBContours[x, y]:
            matching += 1
s = (matching / total) * 100
Basically, I preprocess the two images as simply as possible, then I find the contours. The matchShapes function from OpenCV was not giving me the results I wanted.
So I create two images using the information from the contours, and then I apply a median blur filter.
Currently, I am doing a simple boolean check, pixel by pixel. However, I am planning to change this in the future and make it smarter, probably with some array math.
If anyone has any suggestions, they are welcome.
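As a possible first step toward the "array math" mentioned above, the pixel-by-pixel check can be done in one NumPy expression, since OpenCV images in Python are NumPy arrays. This is a minimal sketch; the function name percent_equal is hypothetical, and it assumes both images already have the same shape.

```python
import numpy as np


def percent_equal(img_a, img_b):
    """Percentage of positions where the two arrays hold the same value."""
    a = np.asarray(img_a)
    b = np.asarray(img_b)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    # a == b is a boolean array; its mean is the fraction of equal pixels.
    return float(np.mean(a == b) * 100.0)


print(percent_equal([[0, 255], [255, 0]], [[0, 255], [0, 0]]))
```

This gives the same number as the double loop, just much faster on real image sizes.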
Is there any way to limit the number of keypoints to 100 in OpenCV SURF?
Will the keypoints obtained be ordered according to their strength?
How can I obtain the strength of a descriptor?
I am working with OpenCV on a Linux system, in a C++ program.
My code is:
int main( int argc, char** argv )
{
  Mat img_1 = imread( argv[1], CV_LOAD_IMAGE_GRAYSCALE );
  Mat img_2 = imread( argv[2], CV_LOAD_IMAGE_GRAYSCALE );
  //-- Step 1: Detect the keypoints using SURF Detector
  int minHessian = 500;
  SurfFeatureDetector detector( minHessian,1,2,false,true );
  std::vector<KeyPoint> keypoints_1p;
  std::vector<KeyPoint> keypoints_2p;
  detector.detect( img_1, keypoints_1p );
  detector.detect( img_2, keypoints_2p);
  // computing descriptors
  SurfDescriptorExtractor extractor(minHessian,1,1,1,0);
  Mat descriptors1, descriptors2;
  extractor.compute(img_1, keypoints_1p, descriptors1);
  extractor.compute(img_2, keypoints_2p, descriptors2);
  return 0;
}
You can guarantee at most 100 keypoints, but not exactly 100: some images (a constant image, for example) yield no keypoints at all. There are easy and hard ways to enforce the limit. The easiest is to randomly select 100 keypoints from however many you get.
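There is no single built-in notion of descriptor strength. OpenCV's KeyPoint does carry a response field (the detector's score for that point), which is a reasonable starting point; otherwise you're going to have to define your own concept of strength.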
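If you settle on a per-keypoint score as your definition of strength, limiting to 100 becomes a sort followed by a slice. The sketch below is plain Python with a hypothetical KeyPoint tuple standing in for OpenCV's class; it assumes the score you want is stored in a response field, as it is on OpenCV's KeyPoint. I would not rely on the detector returning keypoints in any particular order, hence the explicit sort.

```python
from collections import namedtuple

# Hypothetical stand-in for OpenCV's KeyPoint; only the fields used here.
KeyPoint = namedtuple("KeyPoint", ["x", "y", "response"])


def keep_strongest(keypoints, max_count=100):
    """Return at most max_count keypoints, strongest response first."""
    ranked = sorted(keypoints, key=lambda kp: kp.response, reverse=True)
    return ranked[:max_count]


points = [KeyPoint(0, 0, 0.2), KeyPoint(1, 1, 0.9), KeyPoint(2, 2, 0.5)]
print([kp.response for kp in keep_strongest(points, 2)])
```

In C++ the same idea is a std::sort (or std::partial_sort) over the keypoint vector keyed on KeyPoint::response; I believe cv::KeyPointsFilter::retainBest does this for you as well.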
There are a wide variety of parameters in the original Lowe paper that filter the keypoints (one of them rejects keypoints that lie along an image edge, section 4.1 of Lowe's paper). There are two or three other parameters. You would need to adjust the parameters systematically so that you only get 100. If you get fewer than 100 you filter less, and if you get more than 100 you filter more.
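The "filter less / filter more" loop can be sketched as a bisection on a single threshold. This is only an illustration under assumptions: detect is a hypothetical callable that takes a threshold and returns the number of keypoints it would produce, and that count is assumed never to increase as the threshold rises. The search bounds and iteration cap are arbitrary defaults.

```python
def tune_threshold(detect, target=100, low=0.0, high=10000.0, max_iters=30):
    """Find a low threshold in [low, high] that yields at most `target` keypoints.

    If even `high` gives more than `target` keypoints, `high` is returned as is.
    """
    best = high
    for _ in range(max_iters):
        mid = (low + high) / 2.0
        count = detect(mid)
        if count > target:
            low = mid        # too many keypoints: filter more
        else:
            best = mid       # within budget: remember it, then try filtering less
            high = mid
            if count == target:
                break
    return best


# Fake detector for demonstration: 1000 candidate responses 1..1000.
responses = list(range(1, 1001))
count_at = lambda t: sum(1 for r in responses if r >= t)
threshold = tune_threshold(count_at, target=100)
print(threshold, count_at(threshold))
```

With a real detector each call to detect is a full detection pass, so in practice detecting once with a permissive threshold and then keeping the strongest 100 is usually cheaper.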
See the question here, and my answer there on how to limit the number of keypoints.
I have a 3x3x1000 OpenCV Mat matrix created using
int sz[] = {3,3,1000};
Mat bigCube(3, sz, CV_8U);
I want to do matrix operations on the 1000 separate 3x3 sub-matrices, but I cannot find a way to do this.
The most obvious approach would be a for-loop using Range, something like:
//Do some operations...
But this won't compile. Is there a way to do this?
I am trying to extract features using OpenCV's HoG API, however I can't seem to find the API that allow me to do that.
What I am trying to do is to extract features using HoG from all my dataset (a set number of positive and negative images), then train my own SVM.
I peeked into HoG.cpp under OpenCV, and it didn't help. All the code is buried under complexity and the need to cater for different hardware (e.g. Intel's IPP).
My question is:
Is there any API in OpenCV that I can use to extract all those features / descriptors to be fed into an SVM? If there is, how can I use it to train my own SVM?
If there isn't, are there any existing libraries out there which could accomplish the same thing?
So far, I am porting an existing library (http://hogprocessing.altervista.org/) from Processing (Java) to C++, but it's still very slow, with detection taking at least 16 seconds.
Has anyone else successfully extracted HoG features? How did you go about it? And do you have any open source code which I could use?
You can use hog class in opencv as follows
HOGDescriptor hog;
vector<float> ders;
vector<Point> locs;
This function computes the hog features for you
hog.compute(grayImg, ders, Size(32, 32), Size(0, 0), locs);
The HOG features computed for grayImg are stored in the ders vector. To turn it into a matrix, which can be used later for training:
Mat Hogfeat(ders.size(), 1, CV_32FC1);
for(int i=0;i<ders.size();i++)
    Hogfeat.at<float>(i,0)=ders.at(i);   // copy each feature value into the column matrix
Now your HOG features are stored in the Hogfeat matrix.
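You can also set the block size, cell size and block stride (and likewise the window size, via hog.winSize) on the hog object as follows. These members are cv::Size values, not plain integers:
hog.blockSize = Size(16, 16);
hog.cellSize = Size(4, 4);
hog.blockStride = Size(8, 8);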
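These sizes determine how long the feature vector for one detection window is, which matters when you size the matrix you feed to the SVM. As a rough check, the length is (blocks across) x (blocks down) x (cells per block) x (bins). The helper below is a plain Python sketch of that arithmetic, not an OpenCV call; the function name is hypothetical, and the 64x128 window and 9 bins in the example are the usual defaults, assumed here rather than taken from the answer.

```python
def hog_descriptor_length(win_size, block_size, block_stride, cell_size, nbins=9):
    """Length of the HOG vector for one window; sizes are (width, height) in pixels."""
    # Number of block positions that fit when sliding by block_stride.
    blocks_x = (win_size[0] - block_size[0]) // block_stride[0] + 1
    blocks_y = (win_size[1] - block_size[1]) // block_stride[1] + 1
    # Each block is split into cells, and each cell contributes one histogram.
    cells_per_block = (block_size[0] // cell_size[0]) * (block_size[1] // cell_size[1])
    return blocks_x * blocks_y * cells_per_block * nbins


# Common defaults: 64x128 window, 16x16 blocks, 8x8 stride, 8x8 cells.
print(hog_descriptor_length((64, 128), (16, 16), (8, 8), (8, 8)))
# The block/cell/stride values set above, with the same window.
print(hog_descriptor_length((64, 128), (16, 16), (8, 8), (4, 4)))
```

When compute() is run with a window stride over a larger image, the returned vector is this per-window length repeated once per window position.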
// This is for comparing the HOG features of two images without using any SVM
// (It is not an efficient way but useful when you want to compare only few or two images)
// Simple distance
// Consider you have two HOG feature vectors for two images, Hogfeat1 and Hogfeat2, of the same size.
double distance = 0;
for(int i = 0; i < Hogfeat1.rows; i++)
    distance += std::abs(Hogfeat1.at<float>(i, 0) - Hogfeat2.at<float>(i, 0));   // sum of absolute differences (L1 distance)
if (distance < Threshold)
    cout<<"Two images are of same class"<<endl;
else
    cout<<"Two images are of different class"<<endl;
Hope it is useful :)
I also wrote a program that compares two HOG feature vectors, with the help of the answer above.
I apply this method to check whether an ROI has changed or not.
Please refer to the page here for the source code and a simple introduction.
Here is GPU version as well.
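// Assumes img (a cv::Mat), win_size, win_stride and gamma_corr are defined by the caller.
cv::Mat temp;
gpu::GpuMat gpu_img, descriptors;
cv::gpu::HOGDescriptor gpu_hog(win_size, Size(16, 16), Size(8, 8), Size(8, 8), 9,
                               cv::gpu::HOGDescriptor::DEFAULT_WIN_SIGMA, 0.2, gamma_corr,
                               cv::gpu::HOGDescriptor::DEFAULT_NLEVELS);
gpu_img.upload(img);   // copy the image to device memory
gpu_hog.getDescriptors(gpu_img, win_stride, descriptors, cv::gpu::HOGDescriptor::DESCR_FORMAT_ROW_BY_ROW);
descriptors.download(temp);   // copy the descriptors back to host memory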
OpenCV 3 provides some changes to the way GPU algorithms (i.e. CUDA) can be used by the user, see the Transition Guide - CUDA.
To update the answer from user3398689 to OpenCV 3, here is a code snippet:
#include <opencv2/core/cuda.hpp>
#include <opencv2/cudaimgproc.hpp>
#include <opencv2/cudaobjdetect.hpp>   // declares cv::cuda::HOG
/* Suppose you load an image in a cv::Mat variable called 'src' */
int img_width = 320;
int img_height = 240;
int block_size = 16;
int bin_number = 9;
cv::Ptr<cv::cuda::HOG> cuda_hog = cuda::HOG::create(Size(img_width, img_height),
                                                    Size(block_size, block_size),
                                                    Size(block_size/2, block_size/2),
                                                    Size(block_size/2, block_size/2),
                                                    bin_number);
/* The following commands are optional: default values apply */
cuda_hog->setWinStride(Size(img_width, img_height));
cv::cuda::GpuMat image;
cv::cuda::GpuMat descriptor;
image.upload(src);   /* copy 'src' to device memory */
/* May not apply to you */
/* CUDA HOG works with intensity (1 channel) or BGRA (4 channels) images */
/* The next function call convert a standard BGR image to BGRA using the GPU */
cv::cuda::GpuMat image_alpha;
cuda::cvtColor(image, image_alpha, COLOR_BGR2BGRA, 4);
cuda_hog->compute(image_alpha, descriptor);
cv::Mat dst;
descriptor.download(dst);   /* copy the descriptors back to host memory */
You can then use the descriptors in the 'dst' variable as you prefer, e.g., as suggested by G453.
I tried using cvMatchShapes() to match two marker patterns. As you can see at Best way to count number of "White Blobs" in a Thresholded IplImage in OpenCV 2.3.0 , the source is having a poor image quality.
I'm not satisfied with the results returned from that function, most of the times it gives incorrect matches. How to use this function (or some suitable function) to do effective matching?
Note: My fallback solution is to change marker pattern to have fairly big/clearly visible shapes. Please visit the above link to see my current marker pattern.
I found this comprehensive comparison of various feature detection algorithms implemented in OpenCV. http://computer-vision-talks.com/2011/01/comparison-of-the-opencvs-feature-detection-algorithms-2 . According to that FAST seems to be a good choice.
I'd give +1 to anyone who can share a good tutorial for implementing FAST (or else STAR / SURF / SIFT) in OpenCV. I'm unable to search for it: Google thinks I mean fast as in speed :(
Here is the FAST inventor's website. FAST stands for Features from Accelerated Segment Test. Here is a short Wikipedia entry on AST based algorithms. Also, here is a good survey of the different feature detectors currently in use today.
FAST is actually already implemented by OpenCV if you would like to use their implementation.
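For intuition about what the segment test checks, here is a small pure-Python sketch, not OpenCV's implementation. A pixel passes if a contiguous arc of the 16-pixel circle of radius 3 around it is entirely brighter or entirely darker than the centre by more than a threshold. The arc length of 9 (the FAST-9 variant), the threshold of 20 and the function name are my assumptions, and the sketch leaves out the speed-up tricks and non-maximum suppression that a real detector would add.

```python
# Offsets (dx, dy) of the 16-pixel circle of radius 3, in order around the centre.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def is_fast_corner(image, x, y, threshold=20, arc_length=9):
    """Segment test at (x, y); image is a list of rows of grey values.

    The caller must keep (x, y) at least 3 pixels away from the border.
    """
    centre = image[y][x]
    ring = [image[y + dy][x + dx] for dx, dy in CIRCLE]
    for sign in (1, -1):
        # True where the ring pixel is brighter (sign=1) or darker (sign=-1) than the centre.
        flags = [sign * (p - centre) > threshold for p in ring]
        run = 0
        # Walk the ring twice so arcs that wrap past the last index are counted.
        for flag in flags + flags:
            run = run + 1 if flag else 0
            if run >= arc_length:
                return True
    return False


# A bright square whose corner sits at the centre of a 7x7 patch.
corner = [[200 if (x >= 3 and y >= 3) else 50 for x in range(7)] for y in range(7)]
# A straight vertical edge through the centre.
edge = [[200 if x >= 3 else 50 for x in range(7)] for y in range(7)]
print(is_fast_corner(corner, 3, 3), is_fast_corner(edge, 3, 3))
```

The straight edge fails the test because only 7 of the 16 ring pixels differ from the centre, which is why the detector responds to corners rather than edges.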
EDIT: Here is a short example I created to show you how to use the FAST detector:
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <vector>
using namespace std;
using namespace cv;
int main(int argc, char* argv[])
{
    Mat far = imread("far.jpg", 0);
    Mat near = imread("near.jpg", 0);
    Ptr<FeatureDetector> detector = FeatureDetector::create("FAST");
    // Detect and draw keypoints on the far image
    vector<KeyPoint> farPoints;
    detector->detect(far, farPoints);
    Mat farColor;
    cvtColor(far, farColor, CV_GRAY2BGR);
    drawKeypoints(farColor, farPoints, farColor, Scalar(255, 0, 0), DrawMatchesFlags::DRAW_OVER_OUTIMG);
    imshow("farColor", farColor);
    imwrite("farPoints.jpg", farColor);
    // Detect and draw keypoints on the near image
    vector<KeyPoint> nearPoints;
    detector->detect(near, nearPoints);
    Mat nearColor;
    cvtColor(near, nearColor, CV_GRAY2BGR);
    drawKeypoints(nearColor, nearPoints, nearColor, Scalar(0, 255, 0), DrawMatchesFlags::DRAW_OVER_OUTIMG);
    imshow("nearColor", nearColor);
    imwrite("nearPoints.jpg", nearColor);
    waitKey(0);   // keep the windows open until a key is pressed
    return 0;
}
This code finds the following feature points for the far and near imagery:
As you can see, the near image has many more features, but it looks like the same basic structure is detected in the far image. So, you should be able to match these. Have a look at descriptor_extractor_matcher.cpp. That should get you started.
Hope that helps!