I am trying to find the similarity between two images using Correlogram. I have already created Correlograms (2D) for both the images.
Task: Now I want to find out how similar these two Correlograms are.
Problem: I do not know how to match these two Correlograms. Can I match them in the same way that we match two histograms?
Formula comparison:
As per a research paper, the mathematical formulas for histogram matching and Correlogram matching are the following. It can be seen clearly that in the case of a histogram, the summation is taken only over the differences between the corresponding values of the color bins, whereas in the case of Correlogram matching, the summation is taken over two dimensions, i.e. distance and color bins.
My code: I have two matrices, Mat correlogram1 and Mat correlogram2, in which I store the Correlogram values for the two images. I then try to match them using the following code, which is based on the formula mentioned above.
double correlogramMatching(const Mat& correlogram1, const Mat& correlogram2)
{
    double confidenceValue = 0;
    for (int i = 0; i < ColorBins; i++)
    {
        for (int j = 0; j < DistanceRange; j++)
        {
            double a = correlogram1.at<double>(i, j);
            double b = correlogram2.at<double>(i, j);
            // the denominator uses both correlograms: 1 + a + b
            confidenceValue += std::abs(a - b) / (1 + a + b);
        }
    }
    return confidenceValue;
}
Confusion: For two identical images, the value of confidenceValue is zero, and for two dissimilar images the values are 66, 88, and so on. So up to which value should I consider the two images similar?
PS: I am doing the programming in OpenCV (C++).
A correlogram is also a co-occurrence matrix / histogram. To answer your question, the simple answer is yes. Remember, when you're comparing histograms by themselves, you are comparing the grayscale / colour content between two images / patches. By extending this to correlograms / co-occurrence matrices, you are also comparing the spatial distributions of the colours, which are captured by the additional distance dimension of the histogram. If you had two images with the same colour distribution but different spatial distributions, the histogram measures would take this into account and report a high dissimilarity / low similarity between them.
You are therefore perfectly fine using standard histogram comparison measures between two correlograms (and I'm also speaking from experience). Examples include histogram intersection, the L_p norm, the chi-squared distance, the Bhattacharyya distance and so on.
Take a look at the following link for more details. There are some great histogram similarity / dissimilarity measures you can use to compare between two histograms, each with their own advantages and disadvantages. Also, Ander Biguri raised a good point. Be sure to normalize the contrast between each view to make the content between the histograms somewhat contrast and illumination independent.
Link: http://pi-virtualworld.blogspot.ca/2013/09/histogram-similarity.html
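As a rough illustration of the measures named above, here is a minimal Python sketch that treats each 2D correlogram (colour bins x distances) as one flattened, normalised histogram. The function names are my own, and the Bhattacharyya measure is written in its Hellinger form, which is an assumption about which variant you want.

```python
import math

def flatten(correlogram):
    """Flatten a 2D correlogram (colour bins x distances) into one list."""
    return [v for row in correlogram for v in row]

def normalise(values, eps=1e-12):
    """Scale the entries so that they sum to (almost exactly) one."""
    total = sum(values)
    return [v / (total + eps) for v in values]

def l1_distance(p, q):
    """L_1 norm of the difference: 0 for identical histograms."""
    return sum(abs(a - b) for a, b in zip(p, q))

def intersection(p, q):
    """Histogram intersection: 1 for identical normalised histograms."""
    return sum(min(a, b) for a, b in zip(p, q))

def chi_squared(p, q, eps=1e-12):
    """Symmetric chi-squared distance: 0 for identical histograms."""
    return 0.5 * sum((a - b) ** 2 / (a + b + eps) for a, b in zip(p, q))

def bhattacharyya(p, q):
    """Hellinger form of the Bhattacharyya distance, in [0, 1]."""
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.sqrt(max(0.0, 1.0 - coefficient))

# two toy correlograms with 2 colour bins and 3 distances
c1 = normalise(flatten([[4, 2, 1], [3, 3, 1]]))
c2 = normalise(flatten([[1, 2, 4], [1, 3, 3]]))
print(l1_distance(c1, c2), intersection(c1, c2),
      chi_squared(c1, c2), bhattacharyya(c1, c2))
```

Because the inputs are normalised, every measure here is bounded, which makes it easier to pick a threshold than with an unbounded sum.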
Related
Problem Formulation
Suppose I have several 10000*10000 grids (these can be transformed to 10000*10000 grayscale images, so I will treat image and grid as the same below), and at each grid point there is some value (in my case, the number of copies of a specific gene expressed at that pixel location; note that the locations are the same for every grid). What I want is to quantify the similarity between two 2D spatial point patterns of this kind (i.e., the spatial expression patterns of two distinct genes), and rank all pairs of genes from "most similar" to "most dissimilar". Note that it is not the spatial pattern in terms of the absolute expression level that I care about, but rather the relative pattern. As a result, I might need to use some correlation metric instead of a distance metric when comparing corresponding pixels.
The easiest method might be to view all pixels together as a vector and calculate some correlation metric between the two vectors. However, this does not take the spatial information into account. The genes that I am most interested in have spatial patterns, i.e., clustering and autocorrelation affect their expression pattern (though a "cluster" might take a very thin shape rather than a compact one, e.g., genes specific to skin cells), which means the image would usually have several local peak regions, while expression levels at other pixels would be extremely low (near 0).
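To make the limitation concrete, here is a small Python sketch of the "flatten and correlate" baseline on two made-up 4x4 grids (the values are purely illustrative). Shuffling the pixel positions of both grids in the same way leaves the Pearson correlation unchanged, which shows that the measure ignores where the pixels are.

```python
import math
import random

def flatten(grid):
    """View all pixels of a 2D grid together as one vector."""
    return [v for row in grid for v in row]

def pearson(u, v):
    """Pearson correlation between two equally long vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a constant grid has no defined correlation
    return cov / (norm_u * norm_v)

# two toy expression grids: same peak region, roughly doubled expression level
gene_a = [[0, 0, 5, 5], [0, 0, 5, 5], [0, 0, 0, 0], [0, 0, 0, 0]]
gene_b = [[0, 0, 10, 9], [0, 1, 10, 10], [0, 0, 0, 0], [0, 0, 0, 0]]

a, b = flatten(gene_a), flatten(gene_b)
r = pearson(a, b)

# apply the same random pixel permutation to both grids
order = list(range(len(a)))
random.Random(0).shuffle(order)
r_shuffled = pearson([a[k] for k in order], [b[k] for k in order])

print(r, r_shuffled)  # the two values agree up to rounding
```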
Possible Directions
I am not exactly sure if I should (1) consider applying image similarity comparison algorithms from image processing that take local structure similarity into account (e.g., SSIM, SIFT, as outlined in Simple and fast method to compare images for similarity), or (2) consider applying spatial similarity comparison algorithms from spatial statistics in GIS (there are some papers about this, but I am not sure if there are some algorithms dealing with simple point data rather than the normal region data with shape (in a more GIS-sense way, I need to find an algorithm dealing with raster data rather than polygon data)), or (3) consider directly applying statistical methods that deal with discrete 2D distributions, which I think might be a bit crude (seems to disregard the regional clustering/autocorrelation effects, ~ Tobler's First Law of Geography).
For direction (1), I am thinking about a simple method, that is, first find some "peak" regions in the two images respectively and regard their union as ROIs, and then compare those ROIs in the two images specifically in a simple pixel-by-pixel way (regard them together as a vector), but I am not sure if I can replace the distance metrics with correlation metrics, and am a bit worried that many methods of similarity comparison in image processing might not work well when the two images are dissimilar. For direction (2), I think this direction might be more appropriate because this problem is indeed related to spatial statistics, but I do not yet know where to start in GIS. I guess direction (3) is somewhat masked by (2), so I might not consider it here.
Sample
Sample image: (There are some issues w/ my own data, so here I borrowed an image from SpatialLIBD http://research.libd.org/spatialLIBD/reference/sce_image_grid_gene.html)
Let's say the value at each pixel is discretely valued between 0 and 10 (could be scaled to [0,1] if needed). The shapes of tissues in the right and left subfigure are a bit different, but in my case they are exactly the same.
PS: There is one might-be-serious problem regarding spatial statistics though. The expression of certain marker genes of a specific cell type might not be clustered in a bulk, but in the shape of a thin layer or irregularly. For example, if the grid is a section of the brain, then the high-expression peak region for cortex layer-specific genes (e.g., Ctip2 for layer V) might form a thin arc curved layer in the 10000*10000 grid.
UPDATE: I found a method belonging to the (3) direction called "optimal transport" problem that might be useful. Looks like it integrates locality information into the comparison of distribution. Would try to test this way (seems to be the easiest to code among all three directions?) tomorrow.
Any thoughts would be greatly appreciated!
In the absence of any sample image, I am assuming that your problem is similar to texture-pattern recognition.
We can start with Local Binary Patterns (2002), or LBPs for short. Unlike previous (1973) texture features that compute a global representation of texture based on the Gray Level Co-occurrence Matrix, LBPs instead compute a local representation of texture by comparing each pixel with its surrounding neighborhood of pixels. For each pixel in the image, we select a neighborhood of size r (to handle variable neighborhood sizes) surrounding the center pixel. An LBP value is then calculated for this center pixel and stored in an output 2D array with the same width and height as the input image. You can then calculate a histogram of the LBP codes (as the final feature vector) and apply machine learning for classification.
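To illustrate the per-pixel comparison described above, here is a minimal pure-Python sketch of the basic LBP with a fixed 3x3 neighbourhood. This is a simplification of my own: it does not implement the circular (points, radius) sampling or the "uniform" variant, and the function names are hypothetical.

```python
def lbp_code(image, r, c):
    """8-bit LBP code of the pixel at row r, column c (3x3 neighbourhood)."""
    center = image[r][c]
    # neighbours visited clockwise, starting at the top-left corner
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # a neighbour at least as bright as the center sets its bit
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """Normalised 256-bin histogram of LBP codes over the interior pixels."""
    rows, cols = len(image), len(image[0])
    hist = [0.0] * 256
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            hist[lbp_code(image, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist

patch = [[9, 1, 1],
         [1, 5, 1],
         [1, 1, 1]]
print(lbp_code(patch, 1, 1))  # only the top-left neighbour is >= 5
```

The resulting histogram is the feature vector; two such histograms can be compared with any histogram distance or fed to a classifier.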
LBP implementations can be found in both scikit-image and OpenCV, but the latter's implementation is strictly in the context of face recognition: the underlying LBP extractor is not exposed for raw LBP histogram computation. The scikit-image implementation of LBPs offers more control over the types of LBP histograms you want to generate. Furthermore, the scikit-image implementation also includes variants of LBPs that improve rotation and grayscale invariance.
Some starter code:
import argparse
import os

import cv2
import numpy as np
from imutils import paths
from skimage import feature
from sklearn.svm import LinearSVC


class LocalBinaryPatterns:
    def __init__(self, numPoints, radius):
        # store the number of points and radius
        self.numPoints = numPoints
        self.radius = radius

    def describe(self, image, eps=1e-7):
        # compute the Local Binary Pattern representation
        # of the image, and then use the LBP representation
        # to build the histogram of patterns
        lbp = feature.local_binary_pattern(image, self.numPoints,
                                           self.radius, method="uniform")
        (hist, _) = np.histogram(lbp.ravel(),
                                 bins=np.arange(0, self.numPoints + 3),
                                 range=(0, self.numPoints + 2))
        # normalize the histogram
        hist = hist.astype("float")
        hist /= (hist.sum() + eps)
        # return the histogram of Local Binary Patterns
        return hist


# read the dataset locations (assumed here to be passed on the command line)
ap = argparse.ArgumentParser()
ap.add_argument("--training", required=True, help="path to the training images")
ap.add_argument("--testing", required=True, help="path to the testing images")
args = vars(ap.parse_args())

# initialize the local binary patterns descriptor along with
# the data and label lists
desc = LocalBinaryPatterns(24, 8)
data = []
labels = []

# loop over the training images
for imagePath in paths.list_images(args["training"]):
    # load the image, convert it to grayscale, and describe it
    image = cv2.imread(imagePath)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    hist = desc.describe(gray)
    # extract the label from the image path, then update the
    # label and data lists
    labels.append(imagePath.split(os.path.sep)[-2])
    data.append(hist)

# train a Linear SVM on the data
model = LinearSVC(C=100.0, random_state=42)
model.fit(data, labels)
Once our Linear SVM is trained, we can use it to classify subsequent texture images:
# loop over the testing images
for imagePath in paths.list_images(args["testing"]):
    # load the image, convert it to grayscale, describe it,
    # and classify it
    image = cv2.imread(imagePath)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    hist = desc.describe(gray)
    prediction = model.predict(hist.reshape(1, -1))
    # display the image and the prediction
    cv2.putText(image, prediction[0], (10, 30), cv2.FONT_HERSHEY_SIMPLEX,
                1.0, (0, 0, 255), 3)
    cv2.imshow("Image", image)
    cv2.waitKey(0)
Have a look at this excellent tutorial for more details.
Ravi Kumar (2016) was able to extract more finely textured images by combining LBP with Gabor filters to filter the coefficients of the LBP pattern.
I'm trying to use OpenCV to find a template in images. While OpenCV has several template matching methods, I have great trouble understanding the differences, and when to use which, by looking at their mathematical equations:
CV_TM_SQDIFF
CV_TM_SQDIFF_NORMED
CV_TM_CCORR
CV_TM_CCORR_NORMED
CV_TM_CCOEFF
Can someone explain the major differences between all these methods in a non-mathematical way?
The general idea of template matching is to give each location in the target image I, a similarity measure, or score, for the given template T. The output of this process is the image R.
Each element in R is computed from the template, which spans over the ranges of x' and y', and a window in I of the same size.
Now, you have two windows and you want to know how similar they are:
CV_TM_SQDIFF - Sum of Square Differences (or SSD):
Simple Euclidean distance (squared):
Take every pair of pixels and subtract
Square the difference
Sum all the squares
CV_TM_SQDIFF_NORMED - SSD Normed
This is rarely used in practice, but the normalization part is similar in the next methods.
The numerator is the same as above, but it is divided by a factor computed as the
- square root of the product of:
the sum of the squared template pixels
the sum of the squared image-window pixels
CV_TM_CCORR - Cross Correlation
Basically, this is a dot product:
Take every pair of pixels and multiply
Sum all products
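CV_TM_CCOEFF - Correlation Coefficient
Similar to Cross Correlation, but the mean of each window is subtracted first, so the score behaves like a covariance; the _NORMED variant (CV_TM_CCOEFF_NORMED) additionally divides by the spread of each window (which I find hard to explain without math, but I would refer to MathWorld or MathWorks for some examples).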
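The recipes above can be sketched in a few lines of Python for a single pair of equally sized windows, given as flat lists of pixel values. This is only an illustration of the arithmetic under that assumption, not OpenCV's implementation, and the function names are my own.

```python
import math

def sqdiff(t, w):
    """Sum of squared differences: 0 means a perfect match."""
    return sum((a - b) ** 2 for a, b in zip(t, w))

def ccorr(t, w):
    """Cross correlation: a plain dot product, larger means more similar."""
    return sum(a * b for a, b in zip(t, w))

def _norm_factor(t, w):
    """Square root of (sum of squared template) * (sum of squared window)."""
    return math.sqrt(sum(a * a for a in t) * sum(b * b for b in w))

def sqdiff_normed(t, w):
    factor = _norm_factor(t, w)
    return sqdiff(t, w) / factor if factor else 0.0

def ccorr_normed(t, w):
    factor = _norm_factor(t, w)
    return ccorr(t, w) / factor if factor else 0.0

def _centered(values):
    """Subtract the mean so that brightness offsets cancel out."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def ccoeff(t, w):
    """Cross correlation of the mean-centered windows."""
    return ccorr(_centered(t), _centered(w))

def ccoeff_normed(t, w):
    """Pearson correlation of the two windows, in [-1, 1]."""
    return ccorr_normed(_centered(t), _centered(w))

template = [1, 2, 3, 4]
brighter = [11, 12, 13, 14]  # same pattern, uniformly brighter window
print(sqdiff(template, brighter))         # large, although the pattern matches
print(ccoeff_normed(template, brighter))  # close to 1: offset is ignored
```

The example shows the practical difference: a uniform brightness change hurts SQDIFF badly, while the mean-centered, normalized coefficient is unaffected.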
I have a basic question regarding pattern learning, or pattern representation. Assume I have a complex pattern of this form. Could you please provide me with some research directions or concepts that I can follow to learn how to represent (mathematically describe) these forms of patterns? In general the pattern does not have a closed contour, nor can it be represented with analytical objects like boxes, circles, etc.
By mathematically describe I'm assuming you mean derive from the image a vector of values that represents the content of the image. In computer vision/image processing we call this an "image descriptor".
There are several image descriptors that could be applied to pixel based data of the form you showed, which appear to be 1 value per pixel i.e. greyscale images.
One approach is to perform "spatial gridding" where you divide the image up into a regular grid of a constant size e.g. a 4x4 grid. You then average the pixel values within each cell of the grid. Then concatenate these values to form a 16 element vector - this coarsely describes the pixel distribution of the image.
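A minimal Python sketch of this gridding descriptor, assuming the image is a list of rows of grey values and is at least as large as the grid in both directions (the function name is my own):

```python
def grid_descriptor(image, grid=4):
    """Average the pixel values in each cell of a grid x grid partition.

    Returns grid * grid values, row by row (16 values for a 4x4 grid).
    """
    rows, cols = len(image), len(image[0])
    descriptor = []
    for gr in range(grid):
        # integer cell borders, so sizes not divisible by grid still work
        r0, r1 = gr * rows // grid, (gr + 1) * rows // grid
        for gc in range(grid):
            c0, c1 = gc * cols // grid, (gc + 1) * cols // grid
            cell = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            descriptor.append(sum(cell) / len(cell))
    return descriptor

# 8x8 toy image whose pixel value encodes its position
image = [[r * 8 + c for c in range(8)] for r in range(8)]
print(grid_descriptor(image))
```

Two images can then be compared by any vector distance between their descriptors.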
Another approach would be to use "image moments", which are 2D statistical moments. Use this equation:
$$\mu_{ij} = \sum_{x=1}^{W} \sum_{y=1}^{H} (x - \mu_x)^i \, (y - \mu_y)^j \, f(x,y)$$
where f(x,y) is the pixel value at coordinates (x,y). W and H are the image width and height. The mu_x and mu_y indicate the average x and y. The values i and j select the order of moment you want to compute. Moments of various orders can be combined in different ways; for example, the "Hu moments" are 7 numbers computed from combinations of image moments.
The cool thing about the Hu moments is that you can translate, scale or rotate the image and still get the same 7 values (a flip only changes the sign of the seventh), which makes this a robust image descriptor that is invariant to translation, scale and rotation.
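As a small illustration, the Python sketch below computes central moments of a grey-value image and the first of the seven Hu invariants. It assumes that the average x and y are the intensity-weighted centroid and that the image is not entirely zero; the function names are my own.

```python
def raw_moment(image, i, j):
    """Sum of x^i * y^j * f(x, y) over all pixels (x = column, y = row)."""
    return sum((x ** i) * (y ** j) * value
               for y, row in enumerate(image)
               for x, value in enumerate(row))

def central_moment(image, i, j):
    """Moment of order (i, j) taken about the intensity-weighted centroid."""
    m00 = raw_moment(image, 0, 0)
    mu_x = raw_moment(image, 1, 0) / m00
    mu_y = raw_moment(image, 0, 1) / m00
    return sum(((x - mu_x) ** i) * ((y - mu_y) ** j) * value
               for y, row in enumerate(image)
               for x, value in enumerate(row))

def hu1(image):
    """First Hu invariant: sum of the normalised second-order moments."""
    mu00 = central_moment(image, 0, 0)
    return (central_moment(image, 2, 0) + central_moment(image, 0, 2)) / mu00 ** 2

blob = [[0, 0, 0, 0, 0],
        [0, 1, 1, 0, 0],
        [0, 1, 0, 0, 0],
        [0, 1, 0, 0, 0],
        [0, 0, 0, 0, 0]]
# rotate by 90 degrees: reverse the rows, then transpose
rotated = [list(row) for row in zip(*blob[::-1])]
print(hu1(blob), hu1(rotated))  # equal up to rounding
```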
Hope this helps as a general direction to read more in.
I have a set of more than 1000 images. For every image I extract SURF descriptors. Now I add a query image and want to find the most similar image in the image set. For performance and memory reasons I extract only 200 keypoints with descriptors per image. And this is more or less my problem. At the moment I filter the matches by doing this:
Symmetry Matching:
Simple BruteForce Matching in both directions. So from Image1 to Image2
and from Image2 to Image1. I just keep the matches which exist in both
directions.
List<Matches> match1 = BruteForceMatching.BFMatch(act.interestPoints, query.interestPoints);
List<Matches> match2 = BruteForceMatching.BFMatch(query.interestPoints, act.interestPoints);
List<Matches> finalMatch = FeatureMatchFilter.DoSymmetryTest(match1, match2);
float distance = 0;
for (int i = 0; i < finalMatch.size(); i++)
    distance += finalMatch.get(i).distance;
act.pic.distance = distance * (float) query.interestPoints.size() / (float) finalMatch.size();
I know there are more filter methods. As you can see, I try to weight the distances by the number of final matches. But I don't have the feeling I am doing this correctly. When I look at other approaches, it looks like they all compute with all extracted interest points that exist in the image. Does anyone have a good approach for this? Or a good idea for weighting the distances?
I know there is no golden solution, but some experiences, ideas and other approaches would be really helpful.
So "match1" represents the directed matches of one of the database images and "match2" a query image, "finalMatch" are all the matches between those images and "finalMatch.get(i).distance" is some kind of mean value between the two directed distances.
So what you do is calculate the mean of the distances and scale it by the number of interest points you have. The goal, I assume, is to have a good measure of how well the overall images match.
I am pretty sure the distance you calculate doesn't reflect that similarity very well. Dividing the sum of the distances by the number of matches makes some sense, and this might give you an idea of similarity when compared to other query images, but scaling this value by the number of interest points just doesn't do anything meaningful.
First of all I would suggest that you get rid of this scaling. I'm not sure what your brute force matching does exactly, but in addition to your symmetry test, you should discard matches where the ratio of the distances of the first and the second candidate is too high (if I remember right, Lowe suggests a threshold of 0.8). Then, if it is a rigid scene, I would suggest that you apply some kind of fundamental matrix estimation (8 point algorithm + RANSAC) and filter the result using epipolar geometry. I'm pretty sure the mean descriptor distance of the "real" matches will give you a good idea about the "similarity" of the database image and the query.
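The ratio test combined with the symmetry test could look roughly like the Python sketch below. It assumes descriptors are plain lists of floats compared with Euclidean distance and uses brute-force search; the function names are my own and the geometric filtering step is left out.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ratio_matches(src, dst, ratio=0.8):
    """Best match in dst for each src descriptor that passes the ratio test.

    Returns a dict: src index -> (dst index, distance).
    """
    matches = {}
    for i, d in enumerate(src):
        ranked = sorted((euclidean(d, e), j) for j, e in enumerate(dst))
        if len(ranked) < 2:
            continue  # the ratio test needs a second candidate
        (best, j), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:
            matches[i] = (j, best)
    return matches

def symmetric_matches(a, b, ratio=0.8):
    """Keep only matches that are found in both directions."""
    ab = ratio_matches(a, b, ratio)
    ba = ratio_matches(b, a, ratio)
    return [(i, j, dist) for i, (j, dist) in ab.items()
            if j in ba and ba[j][0] == i]

def mean_match_distance(matches):
    """Mean descriptor distance of the surviving matches (lower is better)."""
    if not matches:
        return float("inf")
    return sum(dist for _, _, dist in matches) / len(matches)

query = [[0.0, 0.0], [10.0, 10.0], [5.0, 5.0]]
database = [[0.1, 0.0], [10.0, 10.2], [40.0, 40.0]]
good = symmetric_matches(query, database)
print(good, mean_match_distance(good))
```

Since the mean hides how many matches survived, it is probably worth reporting the number of surviving matches alongside it when ranking database images.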
What is the best way to match the scan (taken photo) point sets to the template point set (blue,green,red,pink circles in the images)?
I am using opencv/c++. Maybe some kind of the ICP algorithm? I would like to wrap the scan image to the template image!
template point set:
scan point set:
If the object is reasonably rigid and aligned, simple cross-correlation between the scan and the template would do the trick.
If not, I would use RANSAC to estimate the transformation between the subject and the template (it seems that you have the feature points). Please provide some details on the problem.
Edit:
RANSAC (Random Sample Consensus) could be used in your case. Think of the unnecessary points in your template as noise (false features detected by a feature detector): they are the outliers. RANSAC can handle outliers because it randomly chooses a small subset of feature points (the minimal number needed to initialize your model), initializes the model and calculates how well the model matches the given data (how many other points in the template correspond to your other points). If you choose a wrong subset, this value will be low and you will drop the model. If you choose a right subset, it will be high and you can improve your match with an LMS algorithm.
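Here is a deliberately simplified Python sketch of that loop. It assumes that putative point correspondences are already available and that the model is a pure 2D translation, so the minimal subset is a single pair; a homography would need four pairs and a proper solver. The function name and parameters are my own.

```python
import random

def ransac_translation(pairs, threshold=2.0, iterations=100, seed=0):
    """Estimate a 2D translation from noisy (source, target) point pairs.

    Returns the refined translation and the list of inlier pairs.
    """
    if not pairs:
        raise ValueError("need at least one correspondence")
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # 1. pick a minimal subset (one pair) and initialise the model
        (sx, sy), (tx, ty) = rng.choice(pairs)
        dx, dy = tx - sx, ty - sy
        # 2. count how many other pairs agree with this model
        inliers = [((ax, ay), (bx, by)) for (ax, ay), (bx, by) in pairs
                   if abs(ax + dx - bx) <= threshold
                   and abs(ay + dy - by) <= threshold]
        # 3. keep the model with the largest consensus set
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # 4. refine with a least-squares fit (the mean offset) over all inliers
    n = len(best_inliers)
    dx = sum(b[0] - a[0] for a, b in best_inliers) / n
    dy = sum(b[1] - a[1] for a, b in best_inliers) / n
    return (dx, dy), best_inliers

# eight correct correspondences shifted by (5, -3) plus three false ones
pairs = [((x, y), (x + 5, y - 3)) for x, y in
         [(0, 0), (1, 4), (2, 2), (3, 7), (5, 1), (6, 6), (8, 3), (9, 9)]]
pairs += [((0, 0), (50, 50)), ((1, 1), (-20, 7)), ((2, 3), (9, 40))]
shift, inliers = ransac_translation(pairs)
print(shift, len(inliers))
```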
Do you have to match the red rectangles? The original image contains four black rectangles in the corners that seem to be made for matching. I can reliably find them with 4 lines of Mathematica code:
lotto = [source image]
lottoBW = Image[Map[Max, ImageData[lotto], {2}]]
This takes max(R,G,B) for each pixel, i.e. it filters out the red and yellow print (more or less). The result looks like this:
Then I just use a LoG filter to find the dark spots and look for local maxima in the result image
lottoBWG = ImageAdjust[LaplacianGaussianFilter[lottoBW, 20]]
MaxDetect[lottoBWG, 0.5]
Result:
Have you looked at OpenCV's descriptor_extractor_matcher.cpp sample? This sample uses RANSAC to detect the homography between the two input images. I assume when you say wrap you actually mean warp? If you would like to warp the image with the homography matrix you detect, have a look at the warpPerspective function. Finally, here are some good tutorials using the different feature detectors in OpenCV.
EDIT :
You may not have SURF features, but you certainly have feature points with different classes. Feature based matching is generally split into two phases: feature detection (which you have already done), and extraction which you need for matching. So, you might try converting your features into a KeyPoint and then doing the feature extraction and matching. Here is a little code snippet of how you might go about this:
const int RED_TYPE = 1;
const int GREEN_TYPE = 2;
const int BLUE_TYPE = 3;
const int PURPLE_TYPE = 4;

struct BenFeature
{
    Point2f pt;
    int classId;
};

vector<BenFeature> benFeatures;
// Detect the features as you normally would, in addition to setting the class ID

vector<KeyPoint> keypoints;
for (size_t i = 0; i < benFeatures.size(); i++)
{
    const BenFeature& bf = benFeatures[i];
    KeyPoint kp(bf.pt,
                10.0f,      // feature neighborhood diameter (you'll probably need to tune it)
                -1.0f,      // (angle) -1 == not applicable
                500.0f,     // feature response strength (set to the same unless you have a metric describing strength)
                1,          // octave level, (ditto as above)
                bf.classId  // RED, GREEN, BLUE, or PURPLE.
                );
    keypoints.push_back(kp);
}
// now proceed with extraction and matching...
You may need to tune the response strength such that it doesn't get thresholded out by the extraction phase. But, hopefully that's illustrative of what you might try to do.
Follow these steps:
Match points or features in the two images; this will determine your warping;
Determine what transformation you are looking for in your warping. The most general would be a homography (see cv::findHomography()) and the least general would be a simple translation (use cv::matchTemplate()). The intermediate case would be translation along x and y plus rotation. For this I wrote a fast function that is better than a homography since it uses fewer degrees of freedom while still optimizing the right metric (squared differences in coordinates):
https://stackoverflow.com/a/18091472/457687
If you think your matches have a lot of outliers use RANSAC on top of your step 1. You basically need to randomly select a minimal set of points required for finding parameters, solve, determine inliers, solve again using all inliers, and then iterate trying to improve your current solution (increase the number of inliers, reduce error, or both). See Wikipedia for RANSAC algorithm: http://en.wikipedia.org/wiki/Ransac
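For the intermediate case in step 2 (translation along x and y plus rotation), a closed-form least-squares fit can be sketched in Python as below. This is a generic textbook solution written for illustration, not the function behind the link above, and it assumes matched, outlier-free point pairs (for example the inliers returned by RANSAC). The function names are my own.

```python
import math

def fit_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping src onto dst."""
    n = len(src)
    csx = sum(x for x, _ in src) / n
    csy = sum(y for _, y in src) / n
    cdx = sum(x for x, _ in dst) / n
    cdy = sum(y for _, y in dst) / n
    s_dot = s_cross = 0.0
    for (x, y), (u, v) in zip(src, dst):
        ax, ay = x - csx, y - csy   # source point relative to its centroid
        bx, by = u - cdx, v - cdy   # target point relative to its centroid
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    # the angle that minimises the summed squared coordinate differences
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    # the translation moves the rotated source centroid onto the target centroid
    tx = cdx - (c * csx - s * csy)
    ty = cdy - (s * csx + c * csy)
    return theta, (tx, ty)

def apply_rigid_2d(points, theta, translation):
    """Rotate the points by theta about the origin, then translate them."""
    c, s = math.cos(theta), math.sin(theta)
    tx, ty = translation
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

template = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 3.0)]
scan = apply_rigid_2d(template, math.radians(30), (4.0, -2.0))
theta, shift = fit_rigid_2d(template, scan)
print(math.degrees(theta), shift)
```

With only three parameters to estimate, two correspondences already form a minimal set, which keeps the RANSAC step from step 3 cheap.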