ANSI Color Specific RGB Sequence Bash - ansi-escape

I know that in bash terminals a reliable way to change color is using ANSI escape sequences. For example:
echo -e "\033[0;31mbrown text\033[0;00m"
should output
brown text (in brown)
Is there a way to output color using a specific RGB set with ANSI? Say I want bright red:
echo -e "**\033[255:0:0m**red text\033[0;00m"
Does this sort of thing exist?
I just want to use standard bash.

Both answers here fail to mention the Truecolor ANSI support for 8bpc color. This will get the RGB color the OP originally asked for.
Instead of ;5, use ;2, and specify the R, G, and B values (0-255) in the following three control segments.
\x1b[38;2;40;177;249m
To test if your terminal supports Truecolor:
printf "\x1b[38;2;40;177;249mTRUECOLOR\x1b[0m\n"
On my machine, XTerm happily output the correct color; however, terminals modeled after hardware that predates modern RGB color generally will not support truecolor - make sure you know your target before using this particular variant of the escape code.
I'd also like to point out the 38 and the ;5/;2 - Blue Ice mentioned that 38 routes and then 5 changes the color. That is slightly incorrect.
38 is the xterm-256 extended foreground color code; 30-37 are simply 16-color foreground codes (with a brightness controlled by escape code 1 on some systems and the arguably-supported 90-97 non-standard 'bright' codes) that are supported by all vt100/xterm-compliant colored terminals.
The ;2 and ;5 indicate the format of the color, ultimately telling the terminal how many more sequences to pull: ;5 specifying an 8-bit format (as Blue Ice mentioned) requiring only 1 more control segment, and ;2 specifying a full 24-bit RGB format requiring 3 control segments.
These extended modes are technically "undocumented" and are completely implementation defined. As far as I know and can research, they are not governed by the ANSI committee.
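As a quick side-by-side illustration (my own addition, not from the original answer), here are the classic 16-color code and both extended formats; the palette index 196 and the RGB triple are just example values:
printf '\e[31mclassic 16-color red (code 31)\e[0m\n'
printf '\e[38;5;196m8-bit palette red (38;5;N - one extra segment)\e[0m\n'
printf '\e[38;2;255;0;0m24-bit RGB red (38;2;R;G;B - three extra segments)\e[0m\n'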
For the so inclined, the 5; (256 color) format starts with the 16 original colors (both dark/light, so 30-37 and 90-97) as colors 0-15.
The following 216 colors (16-231) are formed by a 3bpc RGB value offset by 16, packed into a single value.
The final 24 colors (232-255) are greyscale, starting from a shade slightly lighter than black and ranging up to a shade slightly darker than white. Some emulators interpret these steps as linear increments of (256 / 24) on all three channels, though I've come across some emulators that seem to explicitly define these values.
Here is a Javascript function that performs such a conversion, taking into account all of the greys.
function rgbToAnsi256(r, g, b) {
    // we use the extended greyscale palette here, with the exception of
    // black and white. normal palette only has 4 greyscale shades.
    if (r === g && g === b) {
        if (r < 8) {
            return 16;
        }
        if (r > 248) {
            return 231;
        }
        return Math.round(((r - 8) / 247) * 24) + 232;
    }
    var ansi = 16
        + (36 * Math.round(r / 255 * 5))
        + (6 * Math.round(g / 255 * 5))
        + Math.round(b / 255 * 5);
    return ansi;
}
So, in a pinch, you can approximate an RGB value with one of the 256 ANSI colors by reducing each channel from 8 bits to roughly 3 bits and packing the result into a single palette index, which is useful if you want to do this programmatically on terminals that do not support Truecolor.
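If you need this from a shell script rather than JavaScript, here is a rough bash port of the function above (a sketch of my own; it uses integer arithmetic only, so rounding may differ by one palette step from Math.round):
# usage: rgb_to_ansi256 R G B   (each value 0-255); prints the nearest xterm-256 index
rgb_to_ansi256() {
    local r=$1 g=$2 b=$3
    if (( r == g && g == b )); then                 # grey ramp, as in the JavaScript version
        (( r < 8 ))   && { echo 16;  return; }
        (( r > 248 )) && { echo 231; return; }
        echo $(( (r - 8) * 24 / 247 + 232 ))
        return
    fi
    # scale each channel to 0..5 and pack into the 6x6x6 cube
    local ri=$(( (r * 5 + 127) / 255 )) gi=$(( (g * 5 + 127) / 255 )) bi=$(( (b * 5 + 127) / 255 ))
    echo $(( 16 + 36 * ri + 6 * gi + bi ))
}
printf '\e[38;5;%sm%s\e[0m\n' "$(rgb_to_ansi256 40 177 249)" "approximated color"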

This does exist, but instead of the 16777216 (256^3) colors that the OP was looking for, there are 216 (6^3) equally distributed colors, in a larger set of 256 colors. Example:
echo -e "\033[38;5;208mpeach\033[0;00m"
This will output a pleasing sort of peach colored text.
Taking apart this command: \033[38;5;208m
The \033 is the escape code. The [38; directs the command to the foreground. If you want to change the background color instead, use [48;. The 5; indicates that an 8-bit palette index follows. And the most important part, 208m, selects the actual color.
There are 3 sets of colors that can be found in the 256 color sequence for this escape. The first set is the basic "candy" color set, or values 0-15. Then there is a cube of distributed colors, from 16-231. Lastly there is a detailed grayscale set from 232-255.
You can find a table with all of these values here: http://bitmote.com/index.php?post/2012/11/19/Using-ANSI-Color-Codes-to-Colorize-Your-Bash-Prompt-on-Linux#256%20(8-bit)%20Colors
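To see all three ranges at once, a short loop like this (my own addition) prints the whole palette with its index numbers, 16 per row:
for c in {0..255}; do
    printf '\e[48;5;%sm %3d \e[0m' "$c" "$c"
    (( (c + 1) % 16 == 0 )) && printf '\n'
done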

This will work
echo -e "**\033[38;2;255;0;0m**red text\033[0;00m"
format: "\033[38;2;R;G;Bm"
R is your RED component of your RGB
G is your GREEN component of your RGB
B is your BLUE component of your RGB
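For convenience you could wrap that format in a tiny function; rgbtext is just a name I made up for this sketch:
# usage: rgbtext R G B "some text"
rgbtext() {
    printf '\033[38;2;%d;%d;%dm%s\033[0m\n' "$1" "$2" "$3" "$4"
}
rgbtext 255 0 0 "red text"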

Playing with RGB (and HSV) in bash
ANSI sequences in terminal.
There are two ways of printing colors in bash.
After playing with nice tools found on xterm's source tree, here is how vttests/256colors2.pl show on my gnome-terminal:
show 256 colors: 16 terminal colors + 6 * 6 * 6 RGB levels + 24 grayscales.
this use ANSI syntax \e[48;5;COLORm:
printf '\e[48;5;%sm' $color;
instead of \e[48;2;RED;GREEN;BLUEm:
printf '\e[48;2;%s;%s;%sm' $red $green $blue;
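As a small illustration of my own, here is a sweep over six evenly spaced levels per channel (these are not exactly xterm's palette levels) emitted with the 24-bit syntax:
for r in 0 51 102 153 204 255; do
    for g in 0 51 102 153 204 255; do
        for b in 0 51 102 153 204 255; do
            printf '\e[48;2;%s;%s;%sm ' "$r" "$g" "$b"
        done
    done
    printf '\e[0m\n'
done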
I've done some bash functions to play with RGB, and HSV:
RGB to HSV
hsv() {
    local -n _result=$4
    local -i _hsv_min _hsv_t
    local _hsv_s
    local -i _hsv_max=" $1 > $2 ?
        (_hsv_min=($2 > $3 ? $3:$2 ), ( $1 > $3 ? $1 : $3 )) :
        (_hsv_min=($1 > $3 ? $3:$1 ), $2) > $3 ? $2 : $3 "
    case $_hsv_max in
        $_hsv_min) _hsv_t=0 ;;
        $1) _hsv_t=" ( 60 * ( $2 - $3 ) / ( _hsv_max-_hsv_min )+ 360 )%360" ;;
        $2) _hsv_t=" 60 * ( $3 - $1 ) / ( _hsv_max-_hsv_min )+ 120 " ;;
        $3) _hsv_t=" 60 * ( $1 - $2 ) / ( _hsv_max-_hsv_min )+ 240 " ;;
    esac
    _hsv_s=0000000$(( _hsv_max==0?0 : 100000000-100000000*_hsv_min / _hsv_max ))
    printf -v _hsv_s %.7f ${_hsv_s::-8}.${_hsv_s: -8}
    _result=($_hsv_t $_hsv_s $_hsv_max)
}
Then
RED=255 GREEN=240 BLUE=128
hsv $RED $GREEN $BLUE hsvAr
echo ${hsvAr[@]}
52 0.4980392 255
printf 'Hue: %d, Saturation: %f, Value: %d\n' "${hsvAr[@]}"
Hue: 52, Saturation: 0.498039, Value: 255
HSV to RGB
rgb() {
    local -n _result=$4
    local -i _rgb_i=" (($1%360)/60)%6 "
    local -i _rgb_f=" 100000000*($1%360)/60-_rgb_i*100000000 "
    local _rgb_s
    printf -v _rgb_s %.8f "$2"
    _rgb_s=$((10#${_rgb_s/.}))
    local -i _rgb_l=" $3*(100000000-_rgb_s)/100000000 "
    case $_rgb_i in
        0 )
            local -i _rgb_n=" $3*(100000000-(100000000-_rgb_f)*_rgb_s/100000000)/100000000 "
            _result=("$3" "$_rgb_n" "$_rgb_l") ;;
        1 )
            local -i _rgb_m=" $3*(100000000-_rgb_f*_rgb_s/100000000)/100000000 "
            _result=("$_rgb_m" "$3" "$_rgb_l") ;;
        2 )
            local -i _rgb_n=" $3*(100000000-(100000000-_rgb_f)*_rgb_s/100000000)/100000000 "
            _result=("$_rgb_l" "$3" "$_rgb_n") ;;
        3 )
            local -i _rgb_m=" $3*(100000000-_rgb_f*_rgb_s/100000000)/100000000 "
            _result=("$_rgb_l" "$_rgb_m" "$3") ;;
        4 )
            local -i _rgb_n=" $3*(100000000-(100000000-_rgb_f)*_rgb_s/100000000)/100000000 "
            _result=("$_rgb_n" "$_rgb_l" "$3") ;;
        * )
            local -i _rgb_m=" $3*(100000000-_rgb_f*_rgb_s/100000000)/100000000 "
            _result=("$3" "$_rgb_l" "$_rgb_m") ;;
    esac
}
Then
rgb 160 .6 240 out
echo ${out[@]}
96 240 192
printf '\e[48;2;%d;%d;%dm \e[0m\n' "${out[@]}"
Will produce a bunch of colored spaces.
Further: hsvrgb-browser.sh
Preamble: Store the previous two functions in a file called hsvrgb.sh, kept in the same directory as the downloaded hsvrgb-browser.sh.
HSV-RGB Color browser - Usage:
[RrGgBbVb] Increase/decrease value by step ('1'), from 0 to 255.
[HhTt] Increase/decrease Hue (tint), looping over 0 - 359.
[Ss] Increase/decrease Saturation by .006 x step (1).
[Cc] Toggle Color bar rendering (uppercase C fixes HSV)
[+-] Increase/decrease step.
[u] show this.
[q] quit.
Note: Regarding mmeisner's comment, if you encounter issues with this script, try running it with:
LC_ALL=C.UTF8 ./hsvrgb-browser.sh

Currently true color escape sequences (\e[38;2;R;G;Bm) are supported by certain terminal emulators including gnome-terminal (with vte >= 0.36), konsole, and st [suckless].
The feature is not supported by certain others, e.g. pterm [putty], terminology [enlightenment], urxvt.
xterm is halfway in between: it recognizes the escape sequences, but rounds every color to the nearest one in the 256-color palette.
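There is no foolproof query for this, but a common heuristic (my own note, not part of the original answer) is to check the COLORTERM environment variable, which many truecolor-capable emulators set:
if [ "${COLORTERM:-}" = "truecolor" ] || [ "${COLORTERM:-}" = "24bit" ]; then
    echo "terminal advertises 24-bit color support"
else
    echo "no truecolor advertised; consider falling back to the 256-color palette"
fi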

No there's not.
And to nitpick, those are technically not "ANSI escape sequences" but VT100 control codes (which were defined long before there were graphical terminals and terms like "RGB").

Related

What would be the best strategy to match color dominants between two pictures?

I need to match the color dominant between two different pictures, to make them as similar as possible.
For example, I would like to match the grayscale picture of the child below to the sepia picture of the soldier, and compensate for contrast and lighting.
So far, I am thinking to convert the pictures to YCrCb and match the contrast on the histogram of the Y channel and the color in the other channels.
I will have to do the same also between color pictures.
Any suggestions?
I have some ideas that should be of use - they kind of start in Photoshop and wander through Perl, ImageMagick and OpenCV. I am a big fan of the warm and beautiful tonalities achieved by photographers such as David Fokos and Michael Kenna and I worked out, many years back, how to replicate their toning.
First, load your image up in Photoshop, convert to black and white mode, and then back to RGB mode, add a Curves adjustment layer and a new layer with the original colour image. Your Layers window will look like this:
Now turn off all layers except the grey background, and use the Color Dropper to find and mark:
a quarter-tone pixel (i.e. value around 64 in the Info window)
a mid-tone pixel (i.e. around 128 in the Info window)
a three quarter-tone pixel (i.e. around 192 in the Info window)
Now turn the other layers back on and find what those three tones map to in RGB:
Now go in the Curves layer and adjust the Red, Green and Blue curves to match those values:
And if you then switch back to RGB, you can see all three curves on one diagram:
You now just need to save that Curve as a file with ACV extension and you can apply it to other images:
I got a bit bored doing that, so I wrote a Perl script that does exactly the same. You pass it a toned image as a filename, it finds the quarter, mid and three-quarter tones and then creates an Adobe Photoshop Curves file - an ACV file for you which you can then batch apply to other photos.
Here's the Perl:
#!/usr/bin/perl
use strict;
use warnings;
use Image::Magick;
use Data::Dumper;
my $Debug=1; # 1=print debug messages, 0=don't
my $NPOINTS=5; # Number of points in curve we create
# Read in image in first parameter
my $imagename=$ARGV[0];
my $orig=Image::Magick::->new;
my $x = $orig->Read($imagename);
warn "$x" if "$x";
my $width =$orig->Get('columns');
my $height=$orig->Get('rows');
my $depth=$orig->Get('depth');
print "DEBUG: ",$width,"x",$height,", depth: ",$depth,"\n" if $Debug;
# Access pixel cache
my @RGBpixels = $orig->GetPixels(map=>'RGB',height=>$height,width=>$width,normalize=>1);
my ($i,$j,$p);
my (@greypoint,@Rpoint,@Gpoint,@Bpoint);
for($p=0;$p<$NPOINTS;$p++){
    my $greylevelsought=int(($p+1)*256/($NPOINTS+1));
    my $nearestgrey=1000;
    for(my $t=0;$t<$height*$width;$t++){
        my $R = int(255*$RGBpixels[(3*$t)+0]);
        my $G = int(255*$RGBpixels[(3*$t)+1]);
        my $B = int(255*$RGBpixels[(3*$t)+2]);
        my $this=int(0.21*$R + 0.72*$G +0.07*$B);
        printf "Point: %d, Greysought: %d, this pixel: %d\n",$p,$greylevelsought,$this if $Debug>1;
        if(abs($this-$greylevelsought)<abs($nearestgrey-$greylevelsought)){
            $nearestgrey=$this;
            $greypoint[$p]=$nearestgrey;
            $Rpoint[$p]=$R;
            $Gpoint[$p]=$G;
            $Bpoint[$p]=$B;
        }
    }
    printf "DEBUG: Point#: %d, sought grey: %d, nearest grey: %d\n",$p,$greylevelsought,$nearestgrey if $Debug;
}
# Work out name of the curve file = image basename + acv
my $curvefile=substr($imagename,0,rindex($imagename,'.')) . ".acv";
open(my $out,'>:raw',$curvefile) or die "Unable to open: $!";
print $out pack("s>",4); # Version=4
print $out pack("s>",4); # Number of curves in file = Master NULL curve + R + G + B
print $out pack("s>",2); # Master NULL curve with 2 points for all channels
print $out pack("s>",0 ),pack("s>",0 ); # 0 out, 0 in
print $out pack("s>",255),pack("s>",255); # 255 out, 255 in
print $out pack("s>",2+$NPOINTS); # Red curve
print $out pack("s>",0 ),pack("s>",0 ); # 0 out, 0 in
for($p=0;$p<$NPOINTS;$p++){
    print $out pack("s>",$Rpoint[$p]),pack("s>",$greypoint[$p]);
}
print $out pack("s>",255),pack("s>",255); # 255 out, 255 in
print $out pack("s>",2+$NPOINTS); # Green curve
print $out pack("s>",0 ),pack("s>",0 ); # 0 out, 0 in
for($p=0;$p<$NPOINTS;$p++){
    print $out pack("s>",$Gpoint[$p]),pack("s>",$greypoint[$p]);
}
print $out pack("s>",255),pack("s>",255); # 255 out, 255 in
print $out pack("s>",2+$NPOINTS); # Blue curve
print $out pack("s>",0 ),pack("s>",0 ); # 0 out, 0 in
for($p=0;$p<$NPOINTS;$p++){
    print $out pack("s>",$Bpoint[$p]),pack("s>",$greypoint[$p]);
}
print $out pack("s>",255),pack("s>",255); # 255 out, 255 in
close($out);
If you want to do this in OpenCV, you could translate the first 70% of the script to OpenCV pretty simply - it is just 2 loops. Then you would have the quarter, mid and three-quarter tone points. You could use a curve-fitting program such as gnuplot (I have no idea of your skillset) to fit a curve to the points and then generate a lookup table for each of the 256 values 0-255, and apply that to your other images using cv::LUT() to replicate or clone the tone.
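If you would rather stay in the shell than use cv::LUT(), ImageMagick can apply a one-dimensional lookup with the -clut operator; this is a sketch of my own, and curve_lut.png is a hypothetical 256x1 image holding the output value for each input level:
for f in photos/*.png; do
    convert "$f" curve_lut.png -clut "toned/$(basename "$f")"
done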

Increase image canvas by a number divisible by N

I have an image of certain dimensions, say WxH. My goal is to increase its canvas size (without scaling an image) to such dimensions W'xH' that W' is divisible by arbitrary N and H' is divisible by arbitrary M, yet both are least possible s.t. W'>=W and H'>=H.
I've searched through tons of docs but it seems like I didn't define perfectly what I'm looking for.
Here's a solution using awk, but I'm sure there's plenty of other techniques.
#!/bin/bash
N=4
M=5
FILENAME="rose:"
WIDTH=$(identify -format %w "${FILENAME}" | awk -v N=$N '{ m = $1 % N; d = int($1 / N) + 1; printf "%d", (m==0)? $1 : d * N}')
HEIGHT=$(identify -format %h "${FILENAME}" | awk -v M=$M '{ m = $1 % M; d = int($1 / M) + 1; printf "%d", (m==0)? $1 : d * M}')
convert "${FILENAME}" -extent "${WIDTH}x${HEIGHT}" /tmp/output.png
This works by reading the image dimensions with identify and using awk to round each up to the next multiple of N (or M) if it is not already divisible. The new width/height is then passed to the -extent operator, which increases the canvas size without resizing/scaling the image. The -gravity option can also be used to control centering and alignment.
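The same round-up-to-a-multiple calculation can also be done with plain shell arithmetic, avoiding awk entirely (a sketch of my own using the same identify/convert calls):
#!/bin/bash
N=4
M=5
FILENAME="rose:"
W=$(identify -format %w "${FILENAME}")
H=$(identify -format %h "${FILENAME}")
WIDTH=$((  (W + N - 1) / N * N ))     # smallest multiple of N that is >= W
HEIGHT=$(( (H + M - 1) / M * M ))     # smallest multiple of M that is >= H
convert "${FILENAME}" -extent "${WIDTH}x${HEIGHT}" /tmp/output.png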

Matching image to images collection

I have large collecton of card images, and one photo of particular card. What tools can I use to find which image of collection is most similar to mine?
Here's collection sample:
Abundance
Aggressive Urge
Demystify
Here's what I'm trying to find:
Card Photo
New method!
It seems that the following ImageMagick command, or maybe a variation of it, depending on looking at a greater selection of your images, will extract the wording at the top of your cards
convert aggressiveurge.jpg -crop 80%x10%+10%+10% crop.png
which takes the top 10% of your image and 80% of the width (starting 10% in from the top left corner) and stores it in crop.png, as follows:
And if you run that through tesseract OCR as follows:
tesseract crop.png agg
you get a file called agg.txt containing:
E‘ Aggressive Urge \L® E
which you can run through grep to clean up, looking only for upper and lower case letters adjacent to each other:
grep -Eo "\<[A-Za-z]+\>" agg.txt
to get
Aggressive Urge
:-)
Thank you for posting some photos.
I have implemented an algorithm called Perceptual Hashing, described by Dr Neal Krawetz. On comparing your images with the card, I get the following percentage measures of similarity:
Card vs. Abundance 79%
Card vs. Aggressive 83%
Card vs. Demystify 85%
so, it is not an ideal discriminator for your image type, but kind of works somewhat. You may wish to play around with it to tailor it for your use case.
I would calculate a hash for each of the images in your collection, one at a time and store the hash for each image just once. Then, when you get a new card, calculate its hash and compare it to the stored ones.
#!/bin/bash
################################################################################
# Similarity
# Mark Setchell
#
# Calculate percentage similarity of two images using Perceptual Hashing
# See article by Dr Neal Krawetz entitled "Looks Like It" - www.hackerfactor.com
#
# Method:
# 1) Resize image to black and white 8x8 pixel square regardless
# 2) Calculate mean brightness of those 64 pixels
# 3) For each pixel, store "1" if pixel>mean else store "0" if less than mean
# 4) Convert the resulting 64-bit string of 1s and 0s into a 16 hex digit "Perceptual Hash"
#
# If finding difference between Perceptual Hashes, simply total up number of bits
# that differ between the two strings - this is the Hamming distance.
#
# Requires ImageMagick - www.imagemagick.org
#
# Usage:
#
# Similarity image|imageHash [image|imageHash]
# If you pass one image filename, it will tell you the Perceptual hash as a 16
# character hex string that you may want to store in an alternate stream or as
# an attribute or tag in filesystems that support such things. Do this in order
# to just calculate the hash once for each image.
#
# If you pass in two images, or two hashes, or an image and a hash, it will try
# to compare them and give a percentage similarity between them.
################################################################################
function PerceptualHash(){
    TEMP="tmp$$.png"
    # Force image to 8x8 pixels and greyscale
    convert "$1" -colorspace gray -quality 80 -resize 8x8! PNG8:"$TEMP"
    # Calculate mean brightness and correct to range 0..255
    MEAN=$(convert "$TEMP" -format "%[fx:int(mean*255)]" info:)
    # Now extract all 64 pixels and build string containing "1" where pixel > mean else "0"
    hash=""
    for i in {0..7}; do
        for j in {0..7}; do
            pixel=$(convert "${TEMP}"[1x1+${i}+${j}] -colorspace gray text: | grep -Eo "\(\d+," | tr -d '(,' )
            bit="0"
            [ $pixel -gt $MEAN ] && bit="1"
            hash="$hash$bit"
        done
    done
    hex=$(echo "obase=16;ibase=2;$hash" | bc)
    printf "%016s\n" $hex
    #rm "$TEMP" > /dev/null 2>&1
}
function HammingDistance(){
    # Convert input hex strings to upper case like bc requires
    STR1=$(tr '[a-z]' '[A-Z]' <<< $1)
    STR2=$(tr '[a-z]' '[A-Z]' <<< $2)
    # Convert hex to binary and zero left pad to 64 binary digits
    STR1=$(printf "%064s" $(echo "obase=2;ibase=16;$STR1" | bc))
    STR2=$(printf "%064s" $(echo "obase=2;ibase=16;$STR2" | bc))
    # Calculate Hamming distance between two strings, each differing bit adds 1
    hamming=0
    for i in {0..63};do
        a=${STR1:i:1}
        b=${STR2:i:1}
        [ $a != $b ] && ((hamming++))
    done
    # Hamming distance is in range 0..64 and small means more similar
    # We want percentage similarity, so we do a little maths
    similarity=$((100-(hamming*100/64)))
    echo $similarity
}
function Usage(){
    echo "Usage: Similarity image|imageHash [image|imageHash]" >&2
    exit 1
}
################################################################################
# Main
################################################################################
if [ $# -eq 1 ]; then
    # Expecting a single image file for which to generate hash
    if [ ! -f "$1" ]; then
        echo "ERROR: File $1 does not exist" >&2
        exit 1
    fi
    PerceptualHash "$1"
    exit 0
fi
if [ $# -eq 2 ]; then
    # Expecting 2 things, i.e. 2 image files, 2 hashes or one of each
    if [ -f "$1" ]; then
        hash1=$(PerceptualHash "$1")
    else
        hash1=$1
    fi
    if [ -f "$2" ]; then
        hash2=$(PerceptualHash "$2")
    else
        hash2=$2
    fi
    HammingDistance $hash1 $hash2
    exit 0
fi
Usage
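For reference, this is how the script might be used in bulk (a sketch of my own, assuming it is saved as ./similarity.sh, made executable, and that filenames contain no spaces; hashes.txt is just a cache file name I made up):
# hash every image in the collection once and cache the results
for f in collection/*.jpg; do
    printf '%s %s\n' "$(./similarity.sh "$f")" "$f"
done > hashes.txt
# compare a new card against the cached hashes and show the three closest matches
newhash=$(./similarity.sh card.png)
while read -r hash file; do
    printf '%3d%% %s\n' "$(./similarity.sh "$newhash" "$hash")" "$file"
done < hashes.txt | sort -rn | head -3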
I also tried a normalised cross-correlation of each of your images with the card, like this:
#!/bin/bash
size="300x400!"
convert card.png -colorspace RGB -normalize -resize $size card.jpg
for i in *.jpg
do
    cc=$(convert $i -colorspace RGB -normalize -resize $size JPG:- | \
         compare - card.jpg -metric NCC null: 2>&1)
    echo "$cc:$i"
done | sort -n
and I got this output (sorted by match quality):
0.453999:abundance.jpg
0.550696:aggressive.jpg
0.629794:demystify.jpg
which shows that the card correlates best with demystify.jpg.
Note that I resized all images to the same size and normalized their contrast so that they could be readily compared and effects resulting from differences in contrast are minimised. Making them smaller also reduces the time needed for the correlation.
I tried this by arranging the image data as a vector and taking the inner-product between the collection image vectors and the searched image vector. The vectors that are most similar will give the highest inner-product. I resize all the images to the same size to get equal length vectors so I can take inner-product. This resizing will additionally reduce inner-product computational cost and give a coarse approximation of the actual image.
You can quickly check this with Matlab or Octave. Below is the Matlab/Octave script. I've added comments there. I tried varying the variable mult from 1 to 8 (you can try any integer value), and for all those cases, image Demystify gave the highest inner product with the card image. For mult = 8, I get the following ip vector in Matlab:
ip =
683007892
558305537
604013365
As you can see, it gives the highest inner-product of 683007892 for image Demystify.
% load images
imCardPhoto = imread('0.png');
imDemystify = imread('1.jpg');
imAggressiveUrge = imread('2.jpg');
imAbundance = imread('3.jpg');
% you can experiment with the size by varying mult
mult = 8;
size = [17 12]*mult;
% resize with nearest neighbor interpolation
smallCardPhoto = imresize(imCardPhoto, size);
smallDemystify = imresize(imDemystify, size);
smallAggressiveUrge = imresize(imAggressiveUrge, size);
smallAbundance = imresize(imAbundance, size);
% image collection: each image is vectorized. if we have n images, this
% will be a (size_rows*size_columns*channels) x n matrix
collection = [double(smallDemystify(:)) ...
double(smallAggressiveUrge(:)) ...
double(smallAbundance(:))];
% vectorize searched image. this will be a (size_rows*size_columns*channels) x 1
% vector
x = double(smallCardPhoto(:));
% take the inner product of x and each image vector in collection. this
% will result in a n x 1 vector. the higher the inner product is, more similar the
% image and searched image(that is x)
ip = collection' * x;
EDIT
I tried another approach, basically taking the euclidean distance (l2 norm) between reference images and the card image and it gave me very good results with a large collection of reference images (383 images) I found at this link for your test card image.
Here instead of taking the whole image, I extracted the upper part that contains the image and used it for comparison.
In the following steps, all training images and the test image are resized to a predefined size before doing any processing.
extract the image regions from training images
perform morphological closing on these images to get a coarse approximation (this step may not be necessary)
vectorize these images and store in a training set (I call it training set even though there's no training in this approach)
load the test card image, extract the image region-of-interest(ROI), apply closing, then vectorize
calculate the euclidean distance between each reference image vector and the test image vector
choose the minimum distance item (or the first k items)
I did this in C++ using OpenCV. I'm also including some test results using different scales.
#include <opencv2/opencv.hpp>
#include <iostream>
#include <algorithm>
#include <string.h>
#include <windows.h>
using namespace cv;
using namespace std;
#define INPUT_FOLDER_PATH string("Your test image folder path")
#define TRAIN_IMG_FOLDER_PATH string("Your training image folder path")
void search()
{
    WIN32_FIND_DATA ffd;
    HANDLE hFind = INVALID_HANDLE_VALUE;
    vector<Mat> images;
    vector<string> labelNames;
    int label = 0;
    double scale = .2; // you can experiment with scale
    Size imgSize(200*scale, 285*scale); // training sample images are all 200 x 285 (width x height)
    Mat kernel = getStructuringElement(MORPH_ELLIPSE, Size(3, 3));
    // get all training samples in the directory
    hFind = FindFirstFile((TRAIN_IMG_FOLDER_PATH + string("*")).c_str(), &ffd);
    if (INVALID_HANDLE_VALUE == hFind)
    {
        cout << "INVALID_HANDLE_VALUE: " << GetLastError() << endl;
        return;
    }
    do
    {
        if (!(ffd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY))
        {
            Mat im = imread(TRAIN_IMG_FOLDER_PATH+string(ffd.cFileName));
            Mat re;
            resize(im, re, imgSize, 0, 0); // resize the image
            // extract only the upper part that contains the image
            Mat roi = re(Rect(re.cols*.1, re.rows*35/285.0, re.cols*.8, re.rows*125/285.0));
            // get a coarse approximation
            morphologyEx(roi, roi, MORPH_CLOSE, kernel);
            images.push_back(roi.reshape(1)); // vectorize the roi
            labelNames.push_back(string(ffd.cFileName));
        }
    }
    while (FindNextFile(hFind, &ffd) != 0);
    // load the test image, apply the same preprocessing done for training images
    Mat test = imread(INPUT_FOLDER_PATH+string("0.png"));
    Mat re;
    resize(test, re, imgSize, 0, 0);
    Mat roi = re(Rect(re.cols*.1, re.rows*35/285.0, re.cols*.8, re.rows*125/285.0));
    morphologyEx(roi, roi, MORPH_CLOSE, kernel);
    Mat testre = roi.reshape(1);
    struct imgnorm2_t
    {
        string name;
        double norm2;
    };
    vector<imgnorm2_t> imgnorm;
    for (size_t i = 0; i < images.size(); i++)
    {
        imgnorm2_t data = {labelNames[i],
            norm(images[i], testre) /* take the l2-norm (euclidean distance) */};
        imgnorm.push_back(data); // store data
    }
    // sort stored data based on euclidean-distance in the ascending order
    sort(imgnorm.begin(), imgnorm.end(),
        [] (imgnorm2_t& first, imgnorm2_t& second) { return (first.norm2 < second.norm2); });
    for (size_t i = 0; i < imgnorm.size(); i++)
    {
        cout << imgnorm[i].name << " : " << imgnorm[i].norm2 << endl;
    }
}
Results:
scale = 1.0;
demystify.jpg : 10989.6, sylvan_basilisk.jpg : 11990.7, scathe_zombies.jpg : 12307.6
scale = .8;
demystify.jpg : 8572.84, sylvan_basilisk.jpg : 9440.18, steel_golem.jpg : 9445.36
scale = .6;
demystify.jpg : 6226.6, steel_golem.jpg : 6887.96, sylvan_basilisk.jpg : 7013.05
scale = .4;
demystify.jpg : 4185.68, steel_golem.jpg : 4544.64, sylvan_basilisk.jpg : 4699.67
scale = .2;
demystify.jpg : 1903.05, steel_golem.jpg : 2154.64, sylvan_basilisk.jpg : 2277.42
If I understand you correctly, you need to compare them as pictures. There is one very simple but effective solution here - it's called Sikuli.
What tools can I use to find which image of collection is most similar to mine?
This tool handles image processing very well: it can not only find whether your card (image) is similar to a pattern you have already defined, but also search for partial image content (so-called rectangles).
You can extend its functionality via Python. Any ImageObject can be set to accept a similarity_pattern in percentages, and by doing so you'll be able to precisely find what you are looking for.
Also another big advantage of this tool is that you can learn basics in one day.
Hope this helps.

ImageMagick. What is the correct way to dice an image into sub-tiles

What is the correct way to dice an image into N x N sub-tile images?
Thanks,
Doug
Thanks. Actually I futzed a bit and came up with the correct ImageMagick incantations.
Here's the tcsh version.
Dice an image into a 4 x 4 grid (the resultant images are numbered sequentially). The numbering is interpreted as: col + row * nrows:
$ convert -crop 25%x25% image.png tile-prefix.png
Often it is desirable to remap the sequential numbering to row x column; for example, if you are using CATiledLayer in an iOS app, you will need to ingest the correct tiles for a given scale. Here's how:
while ( $i < $number_of_tiles )
    set r = `expr $i \/ 4`
    set c = `expr $i \% 4`
    cp tile-prefix-$i.png tile-prefix-${r}x${c}.png
    echo $i
    @ i++
end
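The same remapping can be written as a plain bash loop (my own sketch, assuming a 4 x 4 grid and the sequentially numbered tiles produced by the convert command above):
number_of_tiles=16
for (( i = 0; i < number_of_tiles; i++ )); do
    r=$(( i / 4 ))      # row index
    c=$(( i % 4 ))      # column index
    cp "tile-prefix-$i.png" "tile-prefix-${r}x${c}.png"
done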

Converting RGB to grayscale/intensity

When converting from RGB to grayscale, it is said that specific weights to channels R, G, and B ought to be applied. These weights are: 0.2989, 0.5870, 0.1140.
It is said that the reason for this is different human perception/sensibility towards these three colors. Sometimes it is also said these are the values used to compute NTSC signal.
However, I didn't find a good reference for this on the web. What is the source of these values?
See also these previous questions: here and here.
The specific numbers in the question are from CCIR 601 (see Wikipedia article).
If you convert RGB -> grayscale with slightly different numbers / different methods,
you won't see much difference at all on a normal computer screen
under normal lighting conditions -- try it.
Here are some more links on color in general:
Wikipedia Luma
Bruce Lindbloom 's outstanding web site
chapter 4 on Color in the book by Colin Ware, "Information Visualization", isbn 1-55860-819-2;
this long link to Ware in books.google.com
may or may not work
cambridgeincolor :
excellent, well-written
"tutorials on how to acquire, interpret and process digital photographs
using a visually-oriented approach that emphasizes concept over procedure"
Should you run into "linear" vs "nonlinear" RGB,
here's part of an old note to myself on this.
Repeat, in practice you won't see much difference.
### RGB -> ^gamma -> Y -> L*
In color science, the common RGB values, as in html rgb( 10%, 20%, 30% ),
are called "nonlinear" or
Gamma corrected.
"Linear" values are defined as
Rlin = R^gamma, Glin = G^gamma, Blin = B^gamma
where gamma is 2.2 for many PCs.
The usual R G B are sometimes written as R' G' B' (R' = Rlin ^ (1/gamma))
(purists tongue-click) but here I'll drop the '.
Brightness on a CRT display is proportional to RGBlin = RGB ^ gamma,
so 50% gray on a CRT is quite dark: .5 ^ 2.2 = 22% of maximum brightness.
(LCD displays are more complex;
furthermore, some graphics cards compensate for gamma.)
To get the measure of lightness called L* from RGB,
first divide R G B by 255, and compute
Y = .2126 * R^gamma + .7152 * G^gamma + .0722 * B^gamma
This is Y in XYZ color space; it is a measure of color "luminance".
(The real formulas are not exactly x^gamma, but close;
stick with x^gamma for a first pass.)
Finally,
L* = 116 * Y ^ 1/3 - 16
"... aspires to perceptual uniformity [and] closely matches human perception of lightness." --
Wikipedia Lab color space
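To make the recipe above concrete, here is a small shell sketch of my own (using awk for the floating-point math, and the simplified x^gamma approximation rather than the exact sRGB transfer function):
# compute luminance Y and lightness L* from 8-bit R, G, B values, assuming gamma = 2.2
rgb_to_lstar() {
    awk -v r="$1" -v g="$2" -v b="$3" 'BEGIN {
        gamma = 2.2
        y = 0.2126 * (r/255)^gamma + 0.7152 * (g/255)^gamma + 0.0722 * (b/255)^gamma
        lstar = 116 * y^(1/3) - 16
        printf "Y = %.4f  L* = %.1f\n", y, lstar
    }'
}
rgb_to_lstar 128 128 128    # mid grey: Y comes out near 0.22, not 0.5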
I found this publication referenced in an answer to a previous similar question. It is very helpful, and the page has several sample images:
Perceptual Evaluation of Color-to-Grayscale Image Conversions by Martin Čadík, Computer Graphics Forum, Vol 27, 2008
The publication explores several other methods to generate grayscale images with different outcomes:
CIE Y
Color2Gray
Decolorize
Smith08
Rasche05
Bala04
Neumann07
Interestingly, it concludes that there is no universally best conversion method, as each performed better or worse than others depending on input.
Here's some code in C to convert RGB to grayscale.
The real weighting used for RGB to grayscale conversion is 0.3R + 0.6G + 0.11B.
These weights aren't absolutely critical, so you can play with them.
I have made them 0.25R + 0.5G + 0.25B. It produces a slightly darker image.
NOTE: The following code assumes xRGB 32bit pixel format
unsigned int *pntrBWImage=(unsigned int*)..data pointer..; //assumes 4*width*height bytes with 32 bits i.e. 4 bytes per pixel
unsigned int fourBytes;
unsigned char r,g,b;
for (int index=0;index<width*height;index++)
{
    fourBytes=pntrBWImage[index];              // caches 4 bytes at a time
    r=(fourBytes>>16);
    g=(fourBytes>>8);
    b=fourBytes;
    I_Out[index] = (r>>2) + (g>>1) + (b>>2);   // This runs in 0.00065s on my pc and produces slightly darker results
    //I_Out[index]=((unsigned int)(r+g+b))/3;  // This runs in 0.0011s on my pc and produces a pure average
}
Check out the Color FAQ for information on this. These values come from the standardization of RGB values that we use in our displays. Actually, according to the Color FAQ, the values you are using are outdated, as they are the values used for the original NTSC standard and not modern monitors.
What is the source of these values?
The "source" of the coefficients posted are the NTSC specifications which can be seen in Rec601 and Characteristics of Television.
The "ultimate source" are the CIE circa 1931 experiments on human color perception. The spectral response of human vision is not uniform. Experiments led to weighting of tristimulus values based on perception. Our L, M, and S cones1 are sensitive to the light wavelengths we identify as "Red", "Green", and "Blue" (respectively), which is where the tristimulus primary colors are derived.2
The linear light3 spectral weightings for sRGB (and Rec709) are:
Rlin * 0.2126 + Glin * 0.7152 + Blin * 0.0722 = Y
These are specific to the sRGB and Rec709 colorspaces, which are intended to represent computer monitors (sRGB) or HDTV monitors (Rec709), and are detailed in the ITU documents for Rec709 and also BT.2380-2 (10/2018)
FOOTNOTES
(1) Cones are the color detecting cells of the eye's retina.
(2) However, the chosen tristimulus wavelengths are NOT at the "peak" of each cone type - instead tristimulus values are chosen such that they stimulate one particular cone type substantially more than another, i.e. separation of stimulus.
(3) You need to linearize your sRGB values before applying the coefficients. I discuss this in another answer here.
Starting a list to enumerate how different software packages do it. Here is a good CVPR paper to read as well.
FreeImage
#define LUMA_REC709(r, g, b) (0.2126F * r + 0.7152F * g + 0.0722F * b)
#define GREY(r, g, b) (BYTE)(LUMA_REC709(r, g, b) + 0.5F)
OpenCV
nVidia Performance Primitives
Intel Performance Primitives
Matlab
nGray = 0.299F * R + 0.587F * G + 0.114F * B;
These values vary from person to person, especially for people who are colorblind.
Is all this really necessary? Human perception and CRT vs LCD displays will vary, but the R, G, B intensities do not. Why not L = (R + G + B) / 3, and set the new RGB to L, L, L?
