Matching image to images collection - image-processing

I have a large collection of card images, and one photo of a particular card. What tools can I use to find which image in the collection is most similar to mine?
Here's collection sample:
Abundance
Aggressive Urge
Demystify
Here's what I'm trying to find:
Card Photo

New method!
It seems that the following ImageMagick command, or perhaps a variation of it (depending on a wider selection of your images), will extract the wording at the top of your cards:
convert aggressiveurge.jpg -crop 80%x10%+10%+10% crop.png
which takes 80% of the width and the top 10% of the height of your image (starting 10% in from the top-left corner) and stores it in crop.png, as follows:
And if you run that through the tesseract OCR as follows:
tesseract crop.png agg
you get a file called agg.txt containing:
E‘ Aggressive Urge \L® E
which you can run through grep to clean up, looking only for upper and lower case letters adjacent to each other:
grep -Eo "\<[A-Za-z]+\>" agg.txt
to get
Aggressive Urge
:-)
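If you want to run that end-to-end against the whole collection from Python rather than the shell, here is a rough sketch of the same idea (crop the title band, OCR it, then fuzzy-match the text against the titles of your reference images). It assumes Pillow and pytesseract are installed; the folder path and file names are placeholders.
import difflib
import glob
from PIL import Image
import pytesseract

def card_title(path):
    """Crop the title band (top ~10%, middle 80% of the width) and OCR it."""
    img = Image.open(path)
    w, h = img.size
    band = img.crop((int(0.1 * w), int(0.1 * h), int(0.9 * w), int(0.2 * h)))
    text = pytesseract.image_to_string(band)
    # keep only words made purely of letters, like the grep step above
    return " ".join(word for word in text.split() if word.isalpha())

# hypothetical collection folder: map the OCR'd title to each reference image
collection = {card_title(p): p for p in glob.glob("collection/*.jpg")}

photo_title = card_title("card_photo.jpg")
best = difflib.get_close_matches(photo_title, collection.keys(), n=1, cutoff=0.4)
if best:
    print("Best match:", collection[best[0]])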

Thank you for posting some photos.
I have coded up the Perceptual Hashing algorithm described by Dr Neal Krawetz. Comparing your images with the card, I get the following percentage measures of similarity:
Card vs. Abundance 79%
Card vs. Aggressive 83%
Card vs. Demystify 85%
So it is not an ideal discriminator for your image type, but it works to some extent. You may wish to play around with it to tailor it to your use case.
I would calculate a hash for each of the images in your collection, one at a time, and store the hash for each image just once. Then, when you get a new card, calculate its hash and compare it to the stored ones.
#!/bin/bash
################################################################################
# Similarity
# Mark Setchell
#
# Calculate percentage similarity of two images using Perceptual Hashing
# See article by Dr Neal Krawetz entitled "Looks Like It" - www.hackerfactor.com
#
# Method:
# 1) Resize image to black and white 8x8 pixel square regardless
# 2) Calculate mean brightness of those 64 pixels
# 3) For each pixel, store "1" if pixel > mean, else store "0"
# 4) Convert the resulting 64-bit string of 1's and 0's into a 16 hex digit "Perceptual Hash"
#
# If finding difference between Perceptual Hashes, simply total up number of bits
# that differ between the two strings - this is the Hamming distance.
#
# Requires ImageMagick - www.imagemagick.org
#
# Usage:
#
# Similarity image|imageHash [image|imageHash]
# If you pass one image filename, it will tell you the Perceptual hash as a 16
# character hex string that you may want to store in an alternate stream or as
# an attribute or tag in filesystems that support such things. Do this in order
# to just calculate the hash once for each image.
#
# If you pass in two images, or two hashes, or an image and a hash, it will try
# to compare them and give a percentage similarity between them.
################################################################################
function PerceptualHash(){
   TEMP="tmp$$.png"
   # Force image to 8x8 pixels and greyscale
   convert "$1" -colorspace gray -quality 80 -resize 8x8! PNG8:"$TEMP"
   # Calculate mean brightness and correct to range 0..255
   MEAN=$(convert "$TEMP" -format "%[fx:int(mean*255)]" info:)
   # Now extract all 64 pixels and build string containing "1" where pixel > mean else "0"
   hash=""
   for i in {0..7}; do
      for j in {0..7}; do
         pixel=$(convert "${TEMP}"[1x1+${i}+${j}] -colorspace gray text: | grep -Eo "\([0-9]+," | tr -d '(,' )
         bit="0"
         [ $pixel -gt $MEAN ] && bit="1"
         hash="$hash$bit"
      done
   done
   # Convert the binary string to hex, zero-padded to 16 hex digits
   hex=$(echo "obase=16;ibase=2;$hash" | bc)
   printf "%016s\n" $hex | tr ' ' '0'
   rm "$TEMP" > /dev/null 2>&1
}
function HammingDistance(){
   # Convert input hex strings to upper case as bc requires
   STR1=$(tr '[a-z]' '[A-Z]' <<< $1)
   STR2=$(tr '[a-z]' '[A-Z]' <<< $2)
   # Convert hex to binary and zero left pad to 64 binary digits
   STR1=$(printf "%064s" $(echo "obase=2;ibase=16;$STR1" | bc) | tr ' ' '0')
   STR2=$(printf "%064s" $(echo "obase=2;ibase=16;$STR2" | bc) | tr ' ' '0')
   # Calculate Hamming distance between two strings, each differing bit adds 1
   hamming=0
   for i in {0..63}; do
      a=${STR1:i:1}
      b=${STR2:i:1}
      [ $a != $b ] && ((hamming++))
   done
   # Hamming distance is in range 0..64 and small means more similar
   # We want percentage similarity, so we do a little maths
   similarity=$((100-(hamming*100/64)))
   echo $similarity
}
function Usage(){
   echo "Usage: Similarity image|imageHash [image|imageHash]" >&2
   exit 1
}
################################################################################
# Main
################################################################################
if [ $# -eq 1 ]; then
   # Expecting a single image file for which to generate hash
   if [ ! -f "$1" ]; then
      echo "ERROR: File $1 does not exist" >&2
      exit 1
   fi
   PerceptualHash "$1"
   exit 0
fi

if [ $# -eq 2 ]; then
   # Expecting 2 things, i.e. 2 image files, 2 hashes or one of each
   if [ -f "$1" ]; then
      hash1=$(PerceptualHash "$1")
   else
      hash1=$1
   fi
   if [ -f "$2" ]; then
      hash2=$(PerceptualHash "$2")
   else
      hash2=$2
   fi
   HammingDistance $hash1 $hash2
   exit 0
fi

Usage
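For reference, here is a rough Python sketch of the same average-hash idea, in case you would rather not shell out to ImageMagick. It assumes Pillow is installed and uses placeholder file names; the hash values will not match the script above exactly because the resize filters differ.
from PIL import Image

def perceptual_hash(path):
    """8x8 greyscale average hash, returned as a 16-digit hex string."""
    img = Image.open(path).convert("L").resize((8, 8))
    pixels = list(img.getdata())
    mean = sum(pixels) / 64.0
    bits = "".join("1" if p > mean else "0" for p in pixels)
    return "%016x" % int(bits, 2)

def similarity(hash1, hash2):
    """Percentage similarity based on the Hamming distance of the two 64-bit hashes."""
    hamming = bin(int(hash1, 16) ^ int(hash2, 16)).count("1")
    return 100 - hamming * 100 // 64

# e.g. compare the photo against the stored hash of each collection image
print(similarity(perceptual_hash("card.png"), perceptual_hash("demystify.jpg")))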

I also tried a normalised cross-correlation of each of your images with the card, like this:
#!/bin/bash
size="300x400!"
convert card.png -colorspace RGB -normalize -resize $size card.jpg
for i in *.jpg
do
   cc=$(convert $i -colorspace RGB -normalize -resize $size JPG:- | \
        compare - card.jpg -metric NCC null: 2>&1)
   echo "$cc:$i"
done | sort -n
and I got this output (sorted by match quality):
0.453999:abundance.jpg
0.550696:aggressive.jpg
0.629794:demystify.jpg
which shows that the card correlates best with demystify.jpg.
Note that I resized all images to the same size and normalized their contrast so that they could be readily compared and effects resulting from differences in contrast are minimised. Making them smaller also reduces the time needed for the correlation.
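If you would rather do the correlation in Python than shell out to ImageMagick, a rough OpenCV sketch of the same idea follows. It is an approximation, not a drop-in replacement: cv2.matchTemplate's normalised correlation coefficient stands in for ImageMagick's NCC metric, histogram equalisation stands in for -normalize, and the file names are placeholders, so the numbers will differ.
import glob
import cv2

size = (300, 400)  # width, height

card = cv2.imread("card.png", cv2.IMREAD_GRAYSCALE)
card = cv2.equalizeHist(cv2.resize(card, size))

scores = []
for path in glob.glob("*.jpg"):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    img = cv2.equalizeHist(cv2.resize(img, size))
    # both images are the same size, so the result is a single correlation value
    ncc = cv2.matchTemplate(img, card, cv2.TM_CCOEFF_NORMED)[0][0]
    scores.append((ncc, path))

for ncc, path in sorted(scores, reverse=True):
    print("%.6f: %s" % (ncc, path))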

I tried this by arranging the image data as a vector and taking the inner product between the collection image vectors and the searched image vector. The vectors that are most similar give the highest inner product. I resize all the images to the same size to get equal-length vectors, so I can take the inner product. This resizing additionally reduces the inner-product computational cost and gives a coarse approximation of the actual image.
You can quickly check this with Matlab or Octave. Below is the Matlab/Octave script. I've added comments there. I tried varying the variable mult from 1 to 8 (you can try any integer value), and for all those cases, image Demystify gave the highest inner product with the card image. For mult = 8, I get the following ip vector in Matlab:
ip =
683007892
558305537
604013365
As you can see, it gives the highest inner-product of 683007892 for image Demystify.
% load images
imCardPhoto = imread('0.png');
imDemystify = imread('1.jpg');
imAggressiveUrge = imread('2.jpg');
imAbundance = imread('3.jpg');
% you can experiment with the size by varying mult
mult = 8;
size = [17 12]*mult;
% resize with nearest neighbor interpolation
smallCardPhoto = imresize(imCardPhoto, size);
smallDemystify = imresize(imDemystify, size);
smallAggressiveUrge = imresize(imAggressiveUrge, size);
smallAbundance = imresize(imAbundance, size);
% image collection: each image is vectorized. if we have n images, this
% will be a (size_rows*size_columns*channels) x n matrix
collection = [double(smallDemystify(:)) ...
double(smallAggressiveUrge(:)) ...
double(smallAbundance(:))];
% vectorize searched image. this will be a (size_rows*size_columns*channels) x 1
% vector
x = double(smallCardPhoto(:));
% take the inner product of x and each image vector in collection. this
% will result in a n x 1 vector. the higher the inner product is, more similar the
% image and searched image(that is x)
ip = collection' * x;
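A roughly equivalent NumPy sketch, assuming OpenCV is available for reading and nearest-neighbour resizing and that the same four files as in the Matlab snippet are used, would be:
import cv2
import numpy as np

mult = 8
size = (12 * mult, 17 * mult)  # (width, height) for cv2.resize

def load_vector(path):
    img = cv2.imread(path)
    small = cv2.resize(img, size, interpolation=cv2.INTER_NEAREST)
    return small.reshape(-1).astype(np.float64)

x = load_vector("0.png")  # the card photo
collection = {name: load_vector(name) for name in ("1.jpg", "2.jpg", "3.jpg")}

# the highest inner product indicates the most similar image
for name, v in collection.items():
    print(name, np.dot(v, x))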
EDIT
I tried another approach: taking the Euclidean distance (L2 norm) between the reference images and the card image. It gave me very good results for your test card image with a large collection of reference images (383 images) that I found at this link.
Here instead of taking the whole image, I extracted the upper part that contains the image and used it for comparison.
In the following steps, all training images and the test image are resized to a predefined size before doing any processing.
extract the image regions from training images
perform morphological closing on these images to get a coarse approximation (this step may not be necessary)
vectorize these images and store in a training set (I call it training set even though there's no training in this approach)
load the test card image, extract the image region-of-interest(ROI), apply closing, then vectorize
calculate the euclidean distance between each reference image vector and the test image vector
choose the minimum distance item (or the first k items)
I did this in C++ using OpenCV. I'm also including some test results using different scales.
#include <opencv2/opencv.hpp>
#include <iostream>
#include <algorithm>
#include <string.h>
#include <windows.h>
using namespace cv;
using namespace std;
#define INPUT_FOLDER_PATH string("Your test image folder path")
#define TRAIN_IMG_FOLDER_PATH string("Your training image folder path")
void search()
{
    WIN32_FIND_DATA ffd;
    HANDLE hFind = INVALID_HANDLE_VALUE;
    vector<Mat> images;
    vector<string> labelNames;
    int label = 0;
    double scale = .2; // you can experiment with scale
    Size imgSize(200*scale, 285*scale); // training sample images are all 200 x 285 (width x height)
    Mat kernel = getStructuringElement(MORPH_ELLIPSE, Size(3, 3));

    // get all training samples in the directory
    hFind = FindFirstFile((TRAIN_IMG_FOLDER_PATH + string("*")).c_str(), &ffd);
    if (INVALID_HANDLE_VALUE == hFind)
    {
        cout << "INVALID_HANDLE_VALUE: " << GetLastError() << endl;
        return;
    }
    do
    {
        if (!(ffd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY))
        {
            Mat im = imread(TRAIN_IMG_FOLDER_PATH+string(ffd.cFileName));
            Mat re;
            resize(im, re, imgSize, 0, 0); // resize the image
            // extract only the upper part that contains the image
            Mat roi = re(Rect(re.cols*.1, re.rows*35/285.0, re.cols*.8, re.rows*125/285.0));
            // get a coarse approximation
            morphologyEx(roi, roi, MORPH_CLOSE, kernel);
            images.push_back(roi.reshape(1)); // vectorize the roi
            labelNames.push_back(string(ffd.cFileName));
        }
    }
    while (FindNextFile(hFind, &ffd) != 0);

    // load the test image, apply the same preprocessing done for training images
    Mat test = imread(INPUT_FOLDER_PATH+string("0.png"));
    Mat re;
    resize(test, re, imgSize, 0, 0);
    Mat roi = re(Rect(re.cols*.1, re.rows*35/285.0, re.cols*.8, re.rows*125/285.0));
    morphologyEx(roi, roi, MORPH_CLOSE, kernel);
    Mat testre = roi.reshape(1);

    struct imgnorm2_t
    {
        string name;
        double norm2;
    };
    vector<imgnorm2_t> imgnorm;
    for (size_t i = 0; i < images.size(); i++)
    {
        imgnorm2_t data = {labelNames[i],
                           norm(images[i], testre) /* take the l2-norm (euclidean distance) */};
        imgnorm.push_back(data); // store data
    }

    // sort stored data based on euclidean-distance in the ascending order
    sort(imgnorm.begin(), imgnorm.end(),
         [] (imgnorm2_t& first, imgnorm2_t& second) { return (first.norm2 < second.norm2); });

    for (size_t i = 0; i < imgnorm.size(); i++)
    {
        cout << imgnorm[i].name << " : " << imgnorm[i].norm2 << endl;
    }
}
Results:
scale = 1.0;
demystify.jpg : 10989.6, sylvan_basilisk.jpg : 11990.7, scathe_zombies.jpg : 12307.6
scale = .8;
demystify.jpg : 8572.84, sylvan_basilisk.jpg : 9440.18, steel_golem.jpg : 9445.36
scale = .6;
demystify.jpg : 6226.6, steel_golem.jpg : 6887.96, sylvan_basilisk.jpg : 7013.05
scale = .4;
demystify.jpg : 4185.68, steel_golem.jpg : 4544.64, sylvan_basilisk.jpg : 4699.67
scale = .2;
demystify.jpg : 1903.05, steel_golem.jpg : 2154.64, sylvan_basilisk.jpg : 2277.42
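For anyone not on Windows, a rough Python/OpenCV sketch of the same pipeline follows (glob replaces the Win32 directory API; the folder paths are placeholders and the ROI fractions mirror the C++ above):
import glob
import cv2

scale = 0.2  # you can experiment with scale
w, h = int(200 * scale), int(285 * scale)  # training samples are 200 x 285
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))

def preprocess(path):
    """Resize, crop the artwork region and apply a morphological closing."""
    img = cv2.resize(cv2.imread(path), (w, h))
    roi = img[int(h * 35 / 285.0):int(h * 160 / 285.0), int(w * 0.1):int(w * 0.9)]
    return cv2.morphologyEx(roi, cv2.MORPH_CLOSE, kernel)

test = preprocess("test/0.png")
dists = []
for path in glob.glob("train/*.jpg"):
    dists.append((cv2.norm(preprocess(path), test), path))  # L2 distance

for dist, path in sorted(dists)[:3]:
    print(path, ":", dist)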

If I understand you correctly, you need to compare them as pictures. There is one very simple but effective solution here - it's called Sikuli.
What tools can I use to find which image of collection is most similar to mine?
This tool works very well for image processing and is not only able to tell whether your card (image) is similar to a pattern you have already defined, but it can also search for partial image content (so-called rectangles).
You can extend its functionality via Python. Any ImageObject can be set to accept a similarity_pattern in percentages, and by doing so you'll be able to find precisely what you are looking for.
Another big advantage of this tool is that you can learn the basics in one day.
Hope this helps.

Related

Gabor filter parameters for fingerprint image enhancement?

I am a beginner in image processing and in Gabor filters, and I want to use this filter to enhance a fingerprint image.
I have read many articles about fingerprint image enhancement and I know the steps are
read image -> normalize -> get orientation map -> gabor filter -> binarize -> skeleton
Now I am at step 4. My question is: how do I get the right values of (lambda and gamma) for the Gabor filter?
my image :
my code :
1- read image and get the orientation map using HOG features
import numpy as np
import cv2
import matplotlib.pyplot as plt
from skimage.io import imread
from skimage.transform import resize
from skimage.feature import hog

imgc = imread(r'C:\Users\iP\Desktop\printe.jpg', as_gray=True)
imgc = resize(imgc, (64*3, 128*3))
rows, cols = imgc.shape
offset = 24
ori = 9  # to get angles (0,45,90,135) only
fd, hog_image = hog(imgc, orientations=ori, pixels_per_cell=(offset, offset),
                    cells_per_block=(1, 1), visualize=True, multichannel=None,
                    feature_vector=False)
orientation map :
2- reshape the orientation map from (8, 16, 1, 1, 9) to (8, 16, 9), where 8 -> rows, 16 -> cols, 9 -> orientations
fd = np.array(fd)
fd = np.reshape(fd, (fd.shape[0], fd.shape[1], ori))

# from (8, 16, 9) to (8, 16, 1)
# choose the angle that has the most potential (biggest magnitude)
angels = np.zeros((fd.shape[0], fd.shape[1], 1))
for r in range(fd.shape[0]):
    for c in range(fd.shape[1]):
        bloc_prop = fd[r, c]
        angelss = bloc_prop.reshape((1, ori))
        angel = np.argmax(angelss)
        angels[r, c] = angel

angels = angels.astype(np.int32)
3- the convolve function
def conv_gabor(img, orient_map, gabor_kernel_shape):
    #
    # loop over all pixels in the image and convolve each one with the kernel
    # for its angle in the orientation map
    #
    roo, coo = img.shape

    # padding needed for the image before convolving it with the kernels
    pad = gabor_kernel_shape - 1
    padded = np.zeros((img.shape[0]+pad, img.shape[1]+pad))  # add the extra rows and cols
    padded[int(pad/2):-int(pad/2), int(pad/2):-int(pad/2)] = img  # copy image inside the padded image

    # result image
    dst = padded.copy()

    # start from the image that is inside the padding
    for r in range(int(pad/2), int(pad/2)+roo):
        for c in range(int(pad/2), int(pad/2)+coo):
            # get the angle from the orientation map
            ro = (r-int(pad/2))//offset
            co = (c-int(pad/2))//offset
            ang = orient_map[ro, co]
            real_angel = (180/ori)*ang

            # block around the pixel to convolve with the kernel
            block = padded[r-int(pad/2):r+int(pad/2)+1, c-int(pad/2):c+int(pad/2)+1]

            # get the Gabor kernel
            # here is my question ->> how to get the parameter values for (lambda, gamma and phi)
            ker = cv2.getGaborKernel((gabor_kernel_shape, gabor_kernel_shape), 3,
                                     np.deg2rad(real_angel), np.pi/4, 0.001, 0)
            dst[r, c] = np.sum(ker*block)
    return dst

dst = conv_gabor(imgc, angels, 11)
dst :
You can see the image is very bad, and I don't know why; I think it is because of the lambda and gamma values, or what?
But when I filter with only one angle, 45:
ker= cv2.getGaborKernel( (11,11), 2, np.deg2rad(45),np.pi/4,0.5,0 )
filt = cv2.filter2D(imgc,cv2.CV_64F,ker)
plt.imshow(filt,'gray')
result:
You can see that the edges at 45 degrees on the left are of good quality.
Can anyone help me, please, and tell me what I should do about this problem?
Thanks all :)
EDIT:
I searched for another way and found that I can use a Gabor filter bank with many orientations and pick the best score among the filtered images. So how can I find the best score for pixels from the filtered images?
This is the output when I use a Gabor filter bank with the angles 45, 60, 65, 90 and 135, divide the filtered images into 16x16 blocks, and for each block keep the filtered image with the highest standard deviation (best score - I use the standard deviation as the score):
As you can see, there are good and bad parts in the image. I think using the standard deviation alone is ineffective in some parts of the image, so my new question is: what is the best score function that would give me good output over the whole image?
original image :
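For context, a rough Python sketch of the block-wise selection described above (filter with a small Gabor bank, then for each 16x16 block keep the response with the highest standard deviation) might look like this; the file name is a placeholder:
import cv2
import numpy as np

img = cv2.imread('fingerprint.jpg', 0).astype(np.float64)
angles = [45, 60, 65, 90, 135]
filtered = [cv2.filter2D(img, cv2.CV_64F,
                         cv2.getGaborKernel((11, 11), 2, np.deg2rad(a), np.pi/4, 0.5, 0))
            for a in angles]

block = 16
out = np.zeros_like(img)
for r in range(0, img.shape[0], block):
    for c in range(0, img.shape[1], block):
        # score each filtered image by the standard deviation of this block
        scores = [f[r:r+block, c:c+block].std() for f in filtered]
        best = int(np.argmax(scores))
        out[r:r+block, c:c+block] = filtered[best][r:r+block, c:c+block]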
In my opinion, weighting the filtered images might be enough for your task. Considering your filter orientations, the filters with angle 45 and 135 respond quite well at different regions of the image. So, you can calculate the weighted sum to get the best filter result.
import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread('fingerprint.jpg',0)
w_45 = 0.5
w_135 = 0.5
img_45 = cv2.filter2D(img,cv2.CV_64F,cv2.getGaborKernel( (11,11), 2, np.deg2rad(45),np.pi/4,0.5,0 ))
img_135 = cv2.filter2D(img,cv2.CV_64F,cv2.getGaborKernel( (11,11), 2, np.deg2rad(135),np.pi/4,0.5,0 ))
result = img_45*w_45+img_135*w_135
result = result/np.amax(result)*255
plt.imshow(result,cmap='gray')
plt.show()
Feel free to play with the weights. The result totally depends on what your next step is.

What would be the best strategy to match color dominants between two pictures?

I need to match the color dominant between two different pictures, to make them as similar as possible.
For example, I would like to match the grayscale picture of the child below to the sepia picture of the soldier, and compensate for contrast and lighting.
So far, I am thinking of converting the pictures to YCrCb and matching the contrast on the histogram of the Y channel and the color in the other channels.
I will have to do the same also between color pictures.
Any suggestions?
I have some ideas that should be of use - they kind of start in Photoshop and wander through Perl, ImageMagick and OpenCV. I am a big fan of the warm and beautiful tonalities achieved by photographers such as David Fokos and Michael Kenna and I worked out, many years back, how to replicate their toning.
First, load your image up in Photoshop, convert to black and white mode, and then back to RGB mode, add a Curves adjustment layer and a new layer with the original colour image. Your Layers window will look like this:
Now turn off all layers except the grey background, and use the Color Dropper to find and mark:
a quarter-tone pixel (i.e. value around 64 in the Info window)
a mid-tone pixel (i.e. around 128 in the Info window)
a three quarter-tone pixel (i.e. around 192 in the Info window)
Now turn the other layers back on and find what those three tones map to in RGB:
Now go in the Curves layer and adjust the Red, Green and Blue curves to match those values:
And if you then switch back to RGB, you can see all three curves on one diagram:
You now just need to save that Curve as a file with ACV extension and you can apply it to other images:
I got a bit bored doing that, so I wrote a Perl script that does exactly the same. You pass it a toned image as a filename, it finds the quarter, mid and three-quarter tones and then creates an Adobe Photoshop Curves file - an ACV file for you which you can then batch apply to other photos.
Here's the Perl:
#!/usr/bin/perl
use strict;
use warnings;
use Image::Magick;
use Data::Dumper;
my $Debug=1; # 1=print debug messages, 0=don't
my $NPOINTS=5; # Number of points in curve we create
# Read in image in first parameter
my $imagename=$ARGV[0];
my $orig=Image::Magick::->new;
my $x = $orig->Read($imagename);
warn "$x" if "$x";
my $width =$orig->Get('columns');
my $height=$orig->Get('rows');
my $depth=$orig->Get('depth');
print "DEBUG: ",$width,"x",$height,", depth: ",$depth,"\n" if $Debug;
# Access pixel cache
my @RGBpixels = $orig->GetPixels(map=>'RGB',height=>$height,width=>$width,normalize=>1);
my ($i,$j,$p);
my (@greypoint,@Rpoint,@Gpoint,@Bpoint);
for($p=0;$p<$NPOINTS;$p++){
   my $greylevelsought=int(($p+1)*256/($NPOINTS+1));
   my $nearestgrey=1000;
   for(my $t=0;$t<$height*$width;$t++){
      my $R = int(255*$RGBpixels[(3*$t)+0]);
      my $G = int(255*$RGBpixels[(3*$t)+1]);
      my $B = int(255*$RGBpixels[(3*$t)+2]);
      my $this=int(0.21*$R + 0.72*$G +0.07*$B);
      printf "Point: %d, Greysought: %d, this pixel: %d\n",$p,$greylevelsought,$this if $Debug>1;
      if(abs($this-$greylevelsought)<abs($nearestgrey-$greylevelsought)){
         $nearestgrey=$this;
         $greypoint[$p]=$nearestgrey;
         $Rpoint[$p]=$R;
         $Gpoint[$p]=$G;
         $Bpoint[$p]=$B;
      }
   }
   printf "DEBUG: Point#: %d, sought grey: %d, nearest grey: %d\n",$p,$greylevelsought,$nearestgrey if $Debug;
}
# Work out name of the curve file = image basename + acv
my $curvefile=substr($imagename,0,rindex($imagename,'.')) . ".acv";
open(my $out,'>:raw',$curvefile) or die "Unable to open: $!";
print $out pack("s>",4); # Version=4
print $out pack("s>",4); # Number of curves in file = Master NULL curve + R + G + B
print $out pack("s>",2); # Master NULL curve with 2 points for all channels
print $out pack("s>",0 ),pack("s>",0 ); # 0 out, 0 in
print $out pack("s>",255),pack("s>",255); # 255 out, 255 in
print $out pack("s>",2+$NPOINTS); # Red curve
print $out pack("s>",0 ),pack("s>",0 ); # 0 out, 0 in
for($p=0;$p<$NPOINTS;$p++){
   print $out pack("s>",$Rpoint[$p]),pack("s>",$greypoint[$p]);
}
print $out pack("s>",255),pack("s>",255); # 255 out, 255 in
print $out pack("s>",2+$NPOINTS); # Green curve
print $out pack("s>",0 ),pack("s>",0 ); # 0 out, 0 in
for($p=0;$p<$NPOINTS;$p++){
   print $out pack("s>",$Gpoint[$p]),pack("s>",$greypoint[$p]);
}
print $out pack("s>",255),pack("s>",255); # 255 out, 255 in
print $out pack("s>",2+$NPOINTS); # Blue curve
print $out pack("s>",0 ),pack("s>",0 ); # 0 out, 0 in
for($p=0;$p<$NPOINTS;$p++){
   print $out pack("s>",$Bpoint[$p]),pack("s>",$greypoint[$p]);
}
print $out pack("s>",255),pack("s>",255); # 255 out, 255 in
close($out);
If you want to do this in OpenCV, you could translate the first 70% of the script to OpenCV pretty simply - it is just 2 loops. Then you would have the quarter, mid and three-quarter tone points. You could use a curve-fitting program such as gnuplot (I have no idea of your skillset) to fit a curve to the points and then generate a lookup table for each of the 256 values 0-255, and apply that to your other images using cv::LUT() to replicate or clone the tone.
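Here is a rough Python/OpenCV sketch of that idea (find a few grey-level control points in the toned reference, interpolate them into a 256-entry lookup table per channel, then apply it with cv2.LUT). The file names are placeholders, and the target is assumed to be a greyscale image stored as 3 channels:
import cv2
import numpy as np

ref = cv2.imread("toned_reference.jpg")      # e.g. a toned print to clone
target = cv2.imread("target_greyscale.jpg")  # greyscale image stored as BGR

# grey value of every reference pixel (same 0.21/0.72/0.07 weights as the Perl script)
b, g, r = [ref[..., i].astype(np.float64).ravel() for i in range(3)]
grey = 0.21 * r + 0.72 * g + 0.07 * b

# pick the pixels whose grey value is nearest to a few evenly spaced levels
levels = np.linspace(0, 255, 7)[1:-1]
idx = [np.abs(grey - lv).argmin() for lv in levels]

result = np.empty_like(target)
for ch, channel in enumerate((b, g, r)):
    xs = np.concatenate(([0.0], grey[idx], [255.0]))
    ys = np.concatenate(([0.0], channel[idx], [255.0]))
    order = np.argsort(xs)
    # 256-entry curve for this channel, interpolated through the control points
    lut = np.interp(np.arange(256), xs[order], ys[order]).astype(np.uint8)
    result[..., ch] = cv2.LUT(target[..., ch], lut)

cv2.imwrite("toned_output.jpg", result)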

OpenCV : how to provide matrix for 'undistort' if I know lens correction factor in Gimp?

I'm getting images from an IP camera that have a strong fish-eye effect. I found that in Gimp I can get lines mostly straight by applying the Lens Distortion filter with a "main" value of -30 (all other parameters remain zero).
Now I need to do this ad hoc using OpenCV. I gathered that the undistort function in imgproc would be the right thing to call. But how do I generate the correct camera and distortion matrices? I see there is a calibrateCamera function, but it seems you need a PhD in computer vision to use it. I have no clue. Since I know the one parameter, there must be a simple way to translate it into the matrices expected by 'undistort'?
Note: I only need the radial distortion coefficients, I'm not interested in the tangential distortion.
There is a sample provided by OpenCV for calibration. All you need is a set of images of a checkerboard (around 20 should be good) taken by your desired camera. It will give you all the required parameters (distortion coefficients, intrinsic parameters etc.). Then you can use the 'undistort' function of OpenCV to correct your image.
In default.xml (or your own .xml) you need to set the name of the XML file containing the paths of your images, the count of inner squares and their dimension in the real world.
Ta-da, you have your required parameters :-)
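If you prefer to drive the calibration from Python rather than the sample program, a rough sketch looks like this. The folder and file names are hypothetical and a 9x6 inner-corner pattern is assumed; adjust to your own board.
import glob
import cv2
import numpy as np

pattern = (9, 6)  # inner corners per row and column
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

objpoints, imgpoints = [], []
for fname in glob.glob("checkerboard/*.jpg"):
    gray = cv2.cvtColor(cv2.imread(fname), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        objpoints.append(objp)
        imgpoints.append(corners)

ret, camera_matrix, dist_coeffs, rvecs, tvecs = cv2.calibrateCamera(
    objpoints, imgpoints, gray.shape[::-1], None, None)

# undistort one frame from the camera with the recovered parameters
frame = cv2.imread("frame.jpg")
cv2.imwrite("frame_undistorted.jpg", cv2.undistort(frame, camera_matrix, dist_coeffs))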
For those who wonder where that calibration tool comes from: it seems one has to build it from source. This is what I did on Linux:
git clone https://github.com/opencv/opencv.git
cd opencv
git checkout -b 3.1.0 3.1.0 # make sure we build that version
mkdir build
cd build
cmake -D CMAKE_BUILD_TYPE=Release -D BUILD_EXAMPLES=ON ..
make -j4
Then to calibrate:
./bin/cpp-example-calibration -w=8 -h=6 -o=camera.yml -op -oe -su image_list.xml
The -su lets you verify how the images look after un-distortion. The -w and -h parameters take "inner corners" which is not the number of squares in the chess pattern, but rather (num-black-squares - 1) * 2.
Here's how the undistortion is applied in the end, using Scala and JavaCV:
import org.bytedeco.javacpp.indexer.FloatRawIndexer
import org.bytedeco.javacpp.opencv_core.Mat
import org.bytedeco.javacpp.{opencv_core, opencv_imgcodecs, opencv_imgproc}
import java.io.File
// from the camera_matrix > data part of the yml:
val cameraFocal = 1.4656877976320607e+03
val cameraCX = 1920.0/2
val cameraCY = 1080.0/2
val cameraMatrixData = Array[Double](
cameraFocal, 0.0 , cameraCX,
0.0 , cameraFocal, cameraCY,
0.0 , 0.0 , 1.0
)
// from the distortion_coefficients of the yml:
val distMatrixData = Array[Double](
-4.016824381742e-01, 4.368842493074e-02, 0.0, 0.0, 1.096412142704e-01
)
def run(in: File, out: File): Unit = {
  val matOut = new Mat
  val camMat = new Mat(3, 3, opencv_core.CV_32FC1)
  val camIdx = camMat.createIndexer[FloatRawIndexer]
  for (row <- 0 until 3) {
    for (col <- 0 until 3) {
      camIdx.put(row, col, cameraMatrixData(row * 3 + col).toFloat)
    }
  }
  val distVec = new Mat(1, 5, opencv_core.CV_32FC1)
  val distIdx = distVec.createIndexer[FloatRawIndexer]
  for (col <- 0 until 5) {
    distIdx.put(0, col, distMatrixData(col).toFloat)
  }
  val matIn = opencv_imgcodecs.imread(in.getPath)
  opencv_imgproc.undistort(matIn, matOut, camMat, distVec)
  opencv_imgcodecs.imwrite(out.getPath, matOut)
}

Image filtering - wrong results?

I'm experimenting with convolving an image with a user-supplied mask, in this case
u = array([[-2,-2,-2],[-2,25,-2],[-2,-2,-2]])/9
using the commands
In[1]: import scipy.ndimage as ndi
In[2]: import skimage.io as io
In[3]: c = io.imread('cameraman.png')
In[4]: cu = ndi.convolve(c,u)
In[5]: io.imshow(cu)
I'm checking this against commands in GNU Octave:
Octave-3.8: 1> c = imread('cameraman.png');
Octave-3.8: 2> u = [-2 -2 -2;-2 25 -2;-2 -2 -2]/9
Octave-3.8: 3> cu = imfilter(c,u)
Octave-3.8: 4> imshow(cu)
But here's the thing: Octave seems to give the correct result, but Python doesn't, even though the commands convolve and imfilter are supposed to be implementing the same algorithm. (Well in fact imfilter performs a correlation, which in this case is the same as a convolution.)
The Octave output is:
[Octave result image]
and the Python output is:
[Python result image]
which as you can see is very different to the Octave result. Does anybody know what's going on here? Or is there a better way of convolving with a user-supplied linear filter than using convolve?
The problem may be the result of your convolution taking your image luminance values out of bounds. I ran the example below in Matlab (~= Octave): for an image that initially has grey values 0-255, and so is in the normalised range [0, 1], the result ends up with pixels in the range [-0.88, 2.03].
>> img=double(imread('cameraman.tif'))./255;
>> K=[-2 -2 -2 ; -2 25 -2; -2 -2 -2]/9;
>> out=conv2(img,K,'same');
>> max(max(out))
ans =
2.0288
>> min(min(out))
ans =
-0.8776
It could be that Python has a problem visualising images with out-of-range grey values <0 or >255, and that this is causing a clamping of values, resulting in black/white halos in those areas. Perhaps Octave normalises the image prior to displaying it, resulting in fewer artifacts. If you normalise your image in Python prior to displaying it, do you still have this problem?
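In other words, something like the following minimal Python sketch (same file name as in your question) keeps the convolution in floating point and clips to the display range before showing it, which should avoid the wrap-around/clamping artefacts you get with uint8 input:
import numpy as np
import scipy.ndimage as ndi
import skimage.io as io

c = io.imread('cameraman.png').astype(np.float64) / 255.0   # work in floats
u = np.array([[-2, -2, -2], [-2, 25, -2], [-2, -2, -2]]) / 9.0
cu = ndi.convolve(c, u)            # result may fall outside [0, 1]
io.imshow(np.clip(cu, 0.0, 1.0))   # clip for display
io.show()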

How to get most similar Eigenfaces or Fisherfaces in OpenCV?

I'm trying to find a measurement for the similarity of 2 faces. I use OpenCV. For that I train Eigenfaces / Fisherfaces with 1000 Photos of 1000 different people (so 1 Photo each person). So I also have 1000 labels in the training set.
Now I can use the predict method to get the most similar face.
I want to input 2 unknown face images to find if they are both similar to the same vector of faces in the training set.
Here is the code of openCV that returns the most similar label (with the lowest distance).
for(size_t sampleIdx = 0; sampleIdx < _projections.size(); sampleIdx++) {
    double dist = norm(_projections[sampleIdx], q, NORM_L2);
    if((dist < minDist) && (dist < _threshold)) {
        minDist = dist;
        minClass = _labels.at<int>((int)sampleIdx);
    }
}
Questions:
Can anyone tell me how to rewrite this to output the top 10 faces and not just the top 1 ? I'm thinking about pushing them into a priority queue, but maybe there is something easier?!
In the training: should I put all the faces on the same label or on different labels? So should I have 1 label or 1000 ?
Cheers
Here's what I did. Note I'm really good at perl, really newb at C++ (in fact, this is my first c++ project!) so I output a lot to the command line and parsed it with perl.
I went to facerec.cpp as you did, and I changed the contents of the for loop to this:
for(size_t sampleIdx = 0; sampleIdx < _projections.size(); sampleIdx++) {
    double dist = norm(_projections[sampleIdx], q, NORM_L2);
    int labelClass = _labels.at<int>((int)sampleIdx);
    cout << dist << " " << labelClass << endl;
    if((dist < minDist) && (dist < _threshold)) {
        minDist = dist;
        minClass = _labels.at<int>((int)sampleIdx);
    }
}
This now outputs the distance and label of every face. Since all the predict function appears to do is take the picture with the shortest distance (lowest number) and return that as the answer, you can now take the resulting list, sort it, and take the first 10 results. Or you can take the first ten labels or whatever. This just gives you access to all of the data rather than the first X results.
I also added
#include <iostream>
using namespace std;
to the top of the file so I could use cout.
Q1: Since OpenCV doesn't provide such a function by default, you have to create your own, for example by adding an output vector that collects the distance and label of every sample. You could change the signature to something like the one below (the vector type is just a suggestion) and fill the distance/label pairs into it. Note that you will need to rebuild OpenCV for this.
virtual void predict(InputArray src, int &label, double &confidence, std::vector<std::pair<int, double>> &results) const = 0;

Resources