Computing distance of perceptual hashes

Computing distance of perceptual hashes - imagemagick

I am using Imagemagick in order to get the perceptual hash of an image. I use the following command:
identify -verbose -define identify:moments x.png
The output returns amongst other params also the pereceptual hash:
I1: 0.0017694 (0.451197) I2: 3.22345e-07 (0.0209605) I3: 2.88038e-10 (0.00477606) I4: 3.93968e-12 (6.53253e-05) I5: 1.2326e-22 (3.38892e-08) I6: -1.94034e-15 (-8.20426e-06) I7: -4.91938e-23 (-1.35254e-08) I8: 5.56374e-16 (2.35249e-06) Channel perceptual hash: Red, Hue: PH1: 0.407586, 0.690687 PH2: 1.88394, 2.91999 PH3: 2.36028, 3.96979 PH4: 5.36184, 5.3591 PH5: 9.25849, 11 PH6: 6.30422, 6.93025 PH7: 9.6332, 10.0241 Green, Chroma: PH1: 0.293148, -0.0406998 PH2: 1.49146, 2.52843 PH3: 2.21568, 0.992456 PH4: 3.52683, 2.3777 PH5: 6.48291, 4.06334 PH6: 4.38149, 4.23342 PH7: 6.64322, 5.35487 Blue, Luma: PH1: 0.329865, 0.33357 PH2: 1.6461, 1.63528 PH3: 2.39206, 2.26483 PH4: 3.72747, 4.09284 PH5: 6.789, 7.36151 PH6: 4.56493, 5.0171 PH7: 7.83416, 7.50669
I want to save the hash and then compute the distance between 2 images. How can I convert the above output to a hash and calculate the distance between 2 hashes?

See http://www.fmwconcepts.com/misc_tests/perceptual_hash_test_results_510/index.html for detailed information and tests of this perceptual hash.
Basically it creates 42 floating point values that need to be compared with another set of 42 floating point values from another image using Sum Squared metric.
This is not a simple binary hash that can be easily stored as a string of 1s and 0x and compared using the Hamming distance.
But you can compare two images from their perceptual hashes in ImageMagick using
compare -metric phash image1 image2 null:
You can output the phash values to a .json file if you want.
Alternately, I have two bash unix ImageMagick shell scripts (phashconvert and phashcompare). One will convert the 42 floats to a string of digits that can be saved in the file in the comment section. The second will read two file's comment sections to extract the string, convert them back to floats and then use the Sum Squared Metric to evaluate them. But note this process is only an approximation due to the conversion back and forth from floats to digits.
If you just want to extract the 42 floats, this should do it (from my script phashconvert)
identify -quiet -verbose -moments -alpha off "x.png" | grep "PH[1-7]" | sed -n 's/.*: \(.*\)$/\1/p' | sed 's/ *//g' | tr "," "\n"

Related

Attention OCR to OpenVino

Good afternoon! I have a question about AttentionOCR model inference using OpenVino.
There is an AttentionOCR model that takes a size tensor (1,1,32,214) as input, I convert it to OpenVino using the following command:
mo \
--input_model=model/path/frozen_graph.pb \
--input="map/TensorArrayStack/TensorArrayGatherV3:0[1 32 214 1]" \
--output "transpose_1,transpose_2" \
--output_dir path/to/ir/
I submit a picture and the model returns the following transpose_1 and transpose_2 arrays , which, as I understand it, should output a tensor with predicted symbols and their probabilities, but output something not what was expected. And it's still not clear why lists of size 8 are returned.
Before feeding the model as input, I converted the image to gray and then also executed the different commands like that, but get the same result:
blob = cv2.dnn.blobFromImage(image, 1.0, (32, 214))
or
image = cv2.resize(image, (32,214))
image = image[None, None,:,:]

Python parallelization for code to combine multiple images

I am new to Python and am trying to parallelize a program that I somehow pieced together from the internet. The program reads all image files (usually multiple series of images such as abc001,abc002...abc015 and xyz001,xyz002....xyz015) in a specific folder and then combines images in a specified range. Most times, the number of files exceeds 10000, and my latest case requires me to combine 24000 images. Could someone help me with:
Taking 2 sets of images from different directories. Currently I have to move these images into 1 directory and then work in said directory.
Reading only specified files. Currently my program reads all files, saves names in an array (I think it's an array. Could be a directory also) and then uses only the images required to combine. If I specify a range of files, it still checks against all files in the directory and takes a lot of time.
Parallel Processing - I work with usually 10k files or sometimes more. These are images saved from the fluid simulations that I run at specific times. Currently, I save about 2k files at a time in separate folders and run the program to combine these 2000 files at one time. And then I copy all the output files to a separate folder to keep them together. It would be great if I could use all 16 cores on the processor to combine all files in 1 go.
Image series 1 is like so.
Consider it to be a series of photos of the cat walking towards the camera. Each frame is is suffixed with 001,002,...,n.
Image series 1 is like so.
Consider it to be a series of photos of the cat's expression changing with each frame. Each frame is is suffixed with 001,002,...,n.
The code currently combines each frame from set1 and set2 to provide output.png as shown in the link here.
import sys
import os
from PIL import Image
keywords=input('Enter initial characters of image series 1 [Ex:Scalar_ , VoF_Scene_]:\n')
keywords2=input('Enter initial characters of image series 2 [Ex:Scalar_ , VoF_Scene_]:\n')
directory = input('Enter correct folder name where images are present :\n') # FOLDER WHERE IMAGES ARE LOCATED
result1 = {}
result2={}
name_count1=0
name_count2=0
for filename in os.listdir(directory):
if keywords in filename:
name_count1 +=1
result1[name_count1] = os.path.join(directory, filename)
if keywords2 in filename:
name_count2 +=1
result2[name_count2] = os.path.join(directory, filename)
num1=input('Enter initial number of series:\n')
num2=input('Enter final number of series:\n')
num1=int(num1)
num2=int(num2)
if name_count1==(num2-num1+1):
a1=1
a2=name_count1
elif name_count2==(num2-num1+1):
a1=1
a2=name_count2
else:
a1=num1
a2=num2+1
for x in range(a1,a2):
y=format(x,'05') # '05' signifies number of digits in the series of file name Ex: [Scalar_scene_1_00345.png --> 5 digits], [Temperature_section_2_951.jpg --> 3 digits]. Change accordingly
y=str(y)
for comparison_name1 in result1:
for comparison_name2 in result2:
test1=result1[comparison_name1]
test2=result2[comparison_name2]
if y in test1 and y in test2:
a=test1
b=test2
test=[a,b]
images = [Image.open(x) for x in test]
widths, heights = zip(*(i.size for i in images))
total_width = sum(widths)
max_height = max(heights)
new_im = Image.new('RGB', (total_width, max_height))
x_offset = 0
for im in images:
new_im.paste(im, (x_offset,0))
x_offset += im.size[0]
output_name='output'+y+'.png'
new_im.save(os.path.join(directory, output_name))

I did a Python version as well, it's not quite as fast but it is maybe closer to your heart :-)
#!/usr/bin/env python3
import cv2
import numpy as np
from multiprocessing import Pool
def doOne(params):
"""Append the two input images side-by-side to output the third."""
imA = cv2.imread(params[0], cv2.IMREAD_UNCHANGED)
imB = cv2.imread(params[1], cv2.IMREAD_UNCHANGED)
res = np.hstack((imA, imB))
cv2.imwrite(params[2], res)
if __name__ == '__main__':
# Build the list of jobs - each entry is a tuple with 2 input filenames and an output filename
jobList = []
for i in range(1000):
# Horizontally append a-XXXXX.png to b-XXXXX.png to make c-XXXXX.png
jobList.append( (f'a-{i:05d}.png', f'b-{i:05d}.png', f'c-{i:05d}.png') )
# Make a pool of processes - 1 per CPU core
with Pool() as pool:
# Map the list of jobs to the pool of processes
pool.map(doOne, jobList)

You can do this a little quicker with libvips. To join two images left-right, enter:
vips join left.png out.png result.png horizontal
To test, I made 200 pairs of 1200x800 PNGs like this:
for i in {1..200}; do cp x.png left$i.png; cp x.png right$i.png; done
Then tried a benchmark:
time parallel vips join left{}.png right{}.png result{}.png horizontal ::: {1..200}
real 0m42.662s
user 2m35.983s
sys 0m6.446s
With imagemagick on the same laptop I see:
time parallel convert left{}.png right{}.png +append result{}.png ::: {1..200}
real 0m55.088s
user 3m24.556s
sys 0m6.400s

You can do that much faster without Python, and using multi-processing with ImageMagick or libvips.
The first part is all setup:
Make 20 images, called a-000.png ... a-019.png that go from red to blue:
convert -size 64x64 xc:red xc:blue -morph 18 a-%03d.png
Make 20 images, called b-000.png ... b-019.png that go from yellow to magenta:
convert -size 64x64 xc:yellow xc:magenta -morph 18 b-%03d.png
Now append them side-by-side into c-000.png ... c-019.png
for ((f=0;f<20;f++))
do
z=$(printf "%03d" $f)
convert a-${z}.png b-${z}.png +append c-${z}.png
done
Those images look like this:
If that looks good, you can do them all in parallel with GNU Parallel:
parallel convert a-{}.png b-{}.png +append c-{}.png ::: {1..19}
Benchmark
I did a quick benchmark and made 20,000 images a-00000.png...a-019999.png and another 20,000 images b-00000.png...b-019999.png with each image 1200x800 pixels. Then I ran the following command to append each pair horizontally and write 20,000 output images c-00000.png...c-019999.png:
seq -f "%05g" 0 19999 | parallel --eta convert a-{}.png b-{}.png +append c-{}.png
and that takes 16 minutes on my MacBook Pro with all 12 CPU cores pegged at 100% throughout. Note that you can:
add spacers between the images,
write annotation onto the images,
add borders,
resize
if you wish and do lots of other processing - this is just a simple example.
Note also that you can get even quicker times - in the region of 10-12 minutes if you accept JPEG instead of PNG as the output format.

How are floating-point pixel values converted to integer values?

How does image library (such as PIL, OpenCV, etc) convert floating-point values to integer pixel values?
For example
import numpy as np
from PIL import Image
# Creates a random image and saves in a file
def get_random_img(m=0, s=1, fname='temp.png'):
im = m + s * np.random.randn(60, 60, 3) # For eg. min: -3.8947058634971179, max: 3.6822041760496904
print(im[0, 0]) # for eg. array([ 0.36234732, 0.96987366, 0.08343])
imp = Image.fromarray(im, 'RGB') # (*)
print(np.array(imp)[0, 0]) # [140 , 74, 217]
imp.save(fname)
return im, imp
For the above method, an example is provided in the comment (which is randomly produced). My question is: how does (*) convert ndarray (which can range from - infinity to plus infinity) to pixel values between 0 and 255?
I tried to investigate the Pil.Image.fromarray method and eventually ended by at line #798 d.decode(data) within Pil.Image.Image().frombytes method. I could find the implementation of decode method, thus unable to know what computation goes behind the conversion.
My initial thought was that maybe the method use minimum (to 0) and maximum (to 255) value from the array and then map all the other values accordingly between 0 and 255. But upon investigations, I found out that's not what is happening. Moreover, how does it handle when the values of the array range between 0 and 1 or any other range of values?

Some libraries assume that floating-point pixel values are between 0 and 1, and will linearly map that range to 0 and 255 when casting to 8-bit unsigned integer. Some others will find the minimum and maximum values and map those to 0 and 255. You should always explicitly do this conversion if you want to be sure of what happened to your data.
In general, a pixel does not need to be 8-bit unsigned integer. A pixel can have any numerical type. Usually a pixel intensity represents an amount of light, or a density of some sort, but this is not always the case. Any physical quantity can be sampled in 2 or more dimensions. The range of meaningful values thus depends on what is imaged. Negative values are often also meaningful.
Many cameras have 8-bit precision when converting light intensity to a digital number. Likewise, displays typically have an b-bit intensity range. This is the reason many image file formats store only 8-bit unsigned integer data. However, some cameras have 12 bits or more, and some processes derive pixel data with a higher precision that one does not want to quantize. Therefore formats such as TIFF and ICS will allow you to save images in just about any numeric format you can think of.

I'm afraid it has done nothing anywhere near as clever as you hoped! It has merely interpreted the first byte of the first float as a uint8, then the second byte as another uint8...
from random import random, seed
import numpy as np
from PIL import Image
# Generate repeatable random data, so other folks get the same results
np.random.seed(42)
# Make a single RGB pixel
im = np.random.randn(1, 1, 3)
# Print the floating point values - not that we are interested in them
print(im)
# OUTPUT: [[[ 0.49671415 -0.1382643 0.64768854]]]
# Save that pixel to a file so we can dump it
im.tofile('array.bin')
# Now make a PIL Image from it and print the uint8 RGB values
imp = Image.fromarray(im, 'RGB')
print(imp.getpixel((0,0)))
# OUTPUT: (124, 48, 169)
So, PIL has interpreted our data as RGB=124/48/169
Now look at the hex we dumped. It is 24 bytes long, i.e. 3 float64 (8-byte) values, one for red, one for green and one for blue for the 1 pixel in our image:
xxd array.bin
Output
00000000: 7c30 a928 2aca df3f 2a05 de05 a5b2 c1bf |0.(*..?*.......
00000010: 685e 2450 ddb9 e43f h^$P...?
And the first byte (7c) has become 124, the second byte (30) has become 48 and the third byte (a9) has become 169.
TLDR; PIL has merely taken the first byte of the first float as the Red uint8 channel of the first pixel, then the second byte of the first float as the Green uint8 channel of the first pixel and the third byte of the first float as the Blue uint8 channel of the first pixel.

Duplicate photoshop level command in Imagemagick

All ,
How can I apply level (10, 245, 095) to a picture. When these level are applied in photoshop the results are different, when I am doing same in Imagemagick I get a completely different result. What am I missing

This question was also asked on the ImageMagick forum and answered by Fred (http://www.imagemagick.org/discourse-server/viewtopic.php?f=1&t=26023). This is his answer:
IM -level values for min and max, depend upon the Q level of your IM compile. check convert -version. I assume it will say Q16, which means that IM is expecting values in the range of 0 to 65535. So you need to convert your values of 0 to 255 to the range of 65535 or what I usually do is convert the values to percent and use that:
10/255 = 0.03921568627451 (x100 to get percent)
245/255 = 0.96078431372549 (x100 to get percent)
so try
-level 3.921%,96.08%,0.95
The gamma value (0.95) does not need any conversion.
See:
http://www.imagemagick.org/script/command-line-options.php#level

Matching image to images collection

I have large collecton of card images, and one photo of particular card. What tools can I use to find which image of collection is most similar to mine?
Here's collection sample:
Abundance
Aggressive Urge
Demystify
Here's what I'm trying to find:
Card Photo

New method!
It seems that the following ImageMagick command, or maybe a variation of it, depending on looking at a greater selection of your images, will extract the wording at the top of your cards
convert aggressiveurge.jpg -crop 80%x10%+10%+10% crop.png
which takes the top 10% of your image and 80% of the width (starting at 10% in from the top left corner and stores it in crop.png as follows:
And if your run that through tessseract OCR as follows:
tesseract crop.png agg
you get a file called agg.txt containing:
E‘ Aggressive Urge \L® E
which you can run through grep to clean up, looking only for upper and lower case letters adjacent to each other:
grep -Eo "\<[A-Za-z]+\>" agg.txt
to get
Aggressive Urge
:-)

Thank you for posting some photos.
I have coded an algorithm called Perceptual Hashing which I found by Dr Neal Krawetz. On comparing your images with the Card, I get the following percentage measures of similarity:
Card vs. Abundance 79%
Card vs. Aggressive 83%
Card vs. Demystify 85%
so, it is not an ideal discriminator for your image type, but kind of works somewhat. You may wish to play around with it to tailor it for your use case.
I would calculate a hash for each of the images in your collection, one at a time and store the hash for each image just once. Then, when you get a new card, calculate its hash and compare it to the stored ones.
#!/bin/bash
################################################################################
# Similarity
# Mark Setchell
#
# Calculate percentage similarity of two images using Perceptual Hashing
# See article by Dr Neal Krawetz entitled "Looks Like It" - www.hackerfactor.com
#
# Method:
# 1) Resize image to black and white 8x8 pixel square regardless
# 2) Calculate mean brightness of those 64 pixels
# 3) For each pixel, store "1" if pixel>mean else store "0" if less than mean
# 4) Convert resulting 64bit string of 1's and 0's, 16 hex digit "Perceptual Hash"
#
# If finding difference between Perceptual Hashes, simply total up number of bits
# that differ between the two strings - this is the Hamming distance.
#
# Requires ImageMagick - www.imagemagick.org
#
# Usage:
#
# Similarity image|imageHash [image|imageHash]
# If you pass one image filename, it will tell you the Perceptual hash as a 16
# character hex string that you may want to store in an alternate stream or as
# an attribute or tag in filesystems that support such things. Do this in order
# to just calculate the hash once for each image.
#
# If you pass in two images, or two hashes, or an image and a hash, it will try
# to compare them and give a percentage similarity between them.
################################################################################
function PerceptualHash(){
TEMP="tmp$$.png"
# Force image to 8x8 pixels and greyscale
convert "$1" -colorspace gray -quality 80 -resize 8x8! PNG8:"$TEMP"
# Calculate mean brightness and correct to range 0..255
MEAN=$(convert "$TEMP" -format "%[fx:int(mean*255)]" info:)
# Now extract all 64 pixels and build string containing "1" where pixel > mean else "0"
hash=""
for i in {0..7}; do
for j in {0..7}; do
pixel=$(convert "${TEMP}"[1x1+${i}+${j}] -colorspace gray text: | grep -Eo "\(\d+," | tr -d '(,' )
bit="0"
[ $pixel -gt $MEAN ] && bit="1"
hash="$hash$bit"
done
done
hex=$(echo "obase=16;ibase=2;$hash" | bc)
printf "%016s\n" $hex
#rm "$TEMP" > /dev/null 2>&1
}
function HammingDistance(){
# Convert input hex strings to upper case like bc requires
STR1=$(tr '[a-z]' '[A-Z]' <<< $1)
STR2=$(tr '[a-z]' '[A-Z]' <<< $2)
# Convert hex to binary and zero left pad to 64 binary digits
STR1=$(printf "%064s" $(echo "obase=2;ibase=16;$STR1" | bc))
STR2=$(printf "%064s" $(echo "obase=2;ibase=16;$STR2" | bc))
# Calculate Hamming distance between two strings, each differing bit adds 1
hamming=0
for i in {0..63};do
a=${STR1:i:1}
b=${STR2:i:1}
[ $a != $b ] && ((hamming++))
done
# Hamming distance is in range 0..64 and small means more similar
# We want percentage similarity, so we do a little maths
similarity=$((100-(hamming*100/64)))
echo $similarity
}
function Usage(){
echo "Usage: Similarity image|imageHash [image|imageHash]" >&2
exit 1
}
################################################################################
# Main
################################################################################
if [ $# -eq 1 ]; then
# Expecting a single image file for which to generate hash
if [ ! -f "$1" ]; then
echo "ERROR: File $1 does not exist" >&2
exit 1
fi
PerceptualHash "$1"
exit 0
fi
if [ $# -eq 2 ]; then
# Expecting 2 things, i.e. 2 image files, 2 hashes or one of each
if [ -f "$1" ]; then
hash1=$(PerceptualHash "$1")
else
hash1=$1
fi
if [ -f "$2" ]; then
hash2=$(PerceptualHash "$2")
else
hash2=$2
fi
HammingDistance $hash1 $hash2
exit 0
fi
Usage

I also tried a normalised cross-correlation of each of your images with the card, like this:
#!/bin/bash
size="300x400!"
convert card.png -colorspace RGB -normalize -resize $size card.jpg
for i in *.jpg
do
cc=$(convert $i -colorspace RGB -normalize -resize $size JPG:- | \
compare - card.jpg -metric NCC null: 2>&1)
echo "$cc:$i"
done | sort -n
and I got this output (sorted by match quality):
0.453999:abundance.jpg
0.550696:aggressive.jpg
0.629794:demystify.jpg
which shows that the card correlates best with demystify.jpg.
Note that I resized all images to the same size and normalized their contrast so that they could be readily compared and effects resulting from differences in contrast are minimised. Making them smaller also reduces the time needed for the correlation.

I tried this by arranging the image data as a vector and taking the inner-product between the collection image vectors and the searched image vector. The vectors that are most similar will give the highest inner-product. I resize all the images to the same size to get equal length vectors so I can take inner-product. This resizing will additionally reduce inner-product computational cost and give a coarse approximation of the actual image.
You can quickly check this with Matlab or Octave. Below is the Matlab/Octave script. I've added comments there. I tried varying the variable mult from 1 to 8 (you can try any integer value), and for all those cases, image Demystify gave the highest inner product with the card image. For mult = 8, I get the following ip vector in Matlab:
ip =
683007892
558305537
604013365
As you can see, it gives the highest inner-product of 683007892 for image Demystify.
% load images
imCardPhoto = imread('0.png');
imDemystify = imread('1.jpg');
imAggressiveUrge = imread('2.jpg');
imAbundance = imread('3.jpg');
% you can experiment with the size by varying mult
mult = 8;
size = [17 12]*mult;
% resize with nearest neighbor interpolation
smallCardPhoto = imresize(imCardPhoto, size);
smallDemystify = imresize(imDemystify, size);
smallAggressiveUrge = imresize(imAggressiveUrge, size);
smallAbundance = imresize(imAbundance, size);
% image collection: each image is vectorized. if we have n images, this
% will be a (size_rows*size_columns*channels) x n matrix
collection = [double(smallDemystify(:)) ...
double(smallAggressiveUrge(:)) ...
double(smallAbundance(:))];
% vectorize searched image. this will be a (size_rows*size_columns*channels) x 1
% vector
x = double(smallCardPhoto(:));
% take the inner product of x and each image vector in collection. this
% will result in a n x 1 vector. the higher the inner product is, more similar the
% image and searched image(that is x)
ip = collection' * x;
EDIT
I tried another approach, basically taking the euclidean distance (l2 norm) between reference images and the card image and it gave me very good results with a large collection of reference images (383 images) I found at this link for your test card image.
Here instead of taking the whole image, I extracted the upper part that contains the image and used it for comparison.
In the following steps, all training images and the test image are resized to a predefined size before doing any processing.
extract the image regions from training images
perform morphological closing on these images to get a coarse approximation (this step may not be necessary)
vectorize these images and store in a training set (I call it training set even though there's no training in this approach)
load the test card image, extract the image region-of-interest(ROI), apply closing, then vectorize
calculate the euclidean distance between each reference image vector and the test image vector
choose the minimum distance item (or the first k items)
I did this in C++ using OpenCV. I'm also including some test results using different scales.
#include <opencv2/opencv.hpp>
#include <iostream>
#include <algorithm>
#include <string.h>
#include <windows.h>
using namespace cv;
using namespace std;
#define INPUT_FOLDER_PATH string("Your test image folder path")
#define TRAIN_IMG_FOLDER_PATH string("Your training image folder path")
void search()
{
WIN32_FIND_DATA ffd;
HANDLE hFind = INVALID_HANDLE_VALUE;
vector<Mat> images;
vector<string> labelNames;
int label = 0;
double scale = .2; // you can experiment with scale
Size imgSize(200*scale, 285*scale); // training sample images are all 200 x 285 (width x height)
Mat kernel = getStructuringElement(MORPH_ELLIPSE, Size(3, 3));
// get all training samples in the directory
hFind = FindFirstFile((TRAIN_IMG_FOLDER_PATH + string("*")).c_str(), &ffd);
if (INVALID_HANDLE_VALUE == hFind)
{
cout << "INVALID_HANDLE_VALUE: " << GetLastError() << endl;
return;
}
do
{
if (!(ffd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY))
{
Mat im = imread(TRAIN_IMG_FOLDER_PATH+string(ffd.cFileName));
Mat re;
resize(im, re, imgSize, 0, 0); // resize the image
// extract only the upper part that contains the image
Mat roi = re(Rect(re.cols*.1, re.rows*35/285.0, re.cols*.8, re.rows*125/285.0));
// get a coarse approximation
morphologyEx(roi, roi, MORPH_CLOSE, kernel);
images.push_back(roi.reshape(1)); // vectorize the roi
labelNames.push_back(string(ffd.cFileName));
}
}
while (FindNextFile(hFind, &ffd) != 0);
// load the test image, apply the same preprocessing done for training images
Mat test = imread(INPUT_FOLDER_PATH+string("0.png"));
Mat re;
resize(test, re, imgSize, 0, 0);
Mat roi = re(Rect(re.cols*.1, re.rows*35/285.0, re.cols*.8, re.rows*125/285.0));
morphologyEx(roi, roi, MORPH_CLOSE, kernel);
Mat testre = roi.reshape(1);
struct imgnorm2_t
{
string name;
double norm2;
};
vector<imgnorm2_t> imgnorm;
for (size_t i = 0; i < images.size(); i++)
{
imgnorm2_t data = {labelNames[i],
norm(images[i], testre) /* take the l2-norm (euclidean distance) */};
imgnorm.push_back(data); // store data
}
// sort stored data based on euclidean-distance in the ascending order
sort(imgnorm.begin(), imgnorm.end(),
[] (imgnorm2_t& first, imgnorm2_t& second) { return (first.norm2 < second.norm2); });
for (size_t i = 0; i < imgnorm.size(); i++)
{
cout << imgnorm[i].name << " : " << imgnorm[i].norm2 << endl;
}
}
Results:
scale = 1.0;
demystify.jpg : 10989.6, sylvan_basilisk.jpg : 11990.7, scathe_zombies.jpg : 12307.6
scale = .8;
demystify.jpg : 8572.84, sylvan_basilisk.jpg : 9440.18, steel_golem.jpg : 9445.36
scale = .6;
demystify.jpg : 6226.6, steel_golem.jpg : 6887.96, sylvan_basilisk.jpg : 7013.05
scale = .4;
demystify.jpg : 4185.68, steel_golem.jpg : 4544.64, sylvan_basilisk.jpg : 4699.67
scale = .2;
demystify.jpg : 1903.05, steel_golem.jpg : 2154.64, sylvan_basilisk.jpg : 2277.42

If i understand you correctly you need to compare them as pictures. There is one very simple, but effective solution here - it's called Sikuli.
What tools can I use to find which image of collection is most similar to mine?
This tool is working very good with the image-processing and is not only capable to find if your card(image) is similar to what you have already defined as pattern, but also search partial image content (so called rectangles).
By default you can extend it's functionality via Python. Any ImageObject can be set to accept similarity_pattern in percentages and by doing so you'll be able to precisely find what you are looking for.
Also another big advantage of this tool is that you can learn basics in one day.
Hope this helps.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart

Computing distance of perceptual hashes - imagemagick

Related

Attention OCR to OpenVino

Python parallelization for code to combine multiple images

How are floating-point pixel values converted to integer values?

Duplicate photoshop level command in Imagemagick

Matching image to images collection

Categories

Resources