As you know, in Matlab it's easy to get the R/G/B channel values using r = image(:,:,1).
But in OpenCV (before 2.2) we had to use pointers, like this:
IplImage* img=cvCreateImage(cvSize(640,480),IPL_DEPTH_32F,3);
((float *)(img->imageData + i*img->widthStep))[j*img->nChannels + 0]=111; // B
((float *)(img->imageData + i*img->widthStep))[j*img->nChannels + 1]=112; // G
((float *)(img->imageData + i*img->widthStep))[j*img->nChannels + 2]=113; // R
But now that OpenCV 2.3 is out, it's easy to get the pixel value of a single-channel image like this:
Mat image;
int pixel = image.at<uchar>(row,col);
So I just wonder if there is also an easy way to get the R, G, B pixel values of a multichannel image, just like in Matlab? Any help will be appreciated =)
With the C++ interface you can do:
Vec3f pixel = image.at<Vec3f>(row, col);
int b = pixel[0];
int g = pixel[1];
int r = pixel[2];
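If your image is the common 8-bit, 3-channel type (CV_8UC3, which is what imread returns by default), the same pattern works with Vec3b instead of Vec3f. A minimal sketch, using a placeholder file name:

#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    // imread loads color images as 8-bit BGR (CV_8UC3); "image.png" is a placeholder name
    cv::Mat image = cv::imread("image.png");
    int row = 10, col = 20;

    cv::Vec3b pixel = image.at<cv::Vec3b>(row, col);
    int b = pixel[0];   // blue
    int g = pixel[1];   // green
    int r = pixel[2];   // red

    std::cout << "B=" << b << " G=" << g << " R=" << r << std::endl;
    return 0;
}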
As vasile said, getting a cell as a Vec3 gives you the pixel with easy access to its R, G, B components. This is the simplest solution in OpenCV, since its data structure stores pixels interleaved, with the channel values following each other pixel by pixel ("BGRBGRBGR..." -- note that OpenCV's default channel order is BGR), while Matlab stores them planar ("RRRRR...GGGGG...BBBBB...").
To get a specific channel like in Matlab you can use cvSplit (or cv::split in the C++ API). This function splits the image into its 3-4 separate channels, so you can access a channel just like in Matlab. In the provided links you can also find a reference for the opposite function, merge.
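A minimal sketch of the split approach (the file name is a placeholder):

#include <opencv2/opencv.hpp>
#include <vector>

int main()
{
    cv::Mat image = cv::imread("image.png");   // loaded as interleaved BGR

    std::vector<cv::Mat> channels;
    cv::split(image, channels);                // one single-channel Mat per channel

    cv::Mat blue  = channels[0];               // roughly image(:,:,3) in Matlab
    cv::Mat green = channels[1];               // roughly image(:,:,2)
    cv::Mat red   = channels[2];               // roughly image(:,:,1)
    return 0;
}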
I am trying to write a thresholding function that would take my threshold function as a parameter; for that I need to use meanStdDev.
Mat structElem = dst(Range(i - radius, i + radius), Range(j - radius, j + radius));
meanStdDev(structElem, mean, stdev);
double threshValue = mean[0] * stdMean[0] + stdMean[1] * stdev[0] + stdMean[2];
Here mean and stdev are Scalars, while stdMean is an array that I use for computing the threshold value. The funny thing is that when I try to do the same with 8-bit images, everything works.
The documentation of meanStdDev says that
"The function meanStdDev calculates the mean and the standard deviation M of array elements independently for each channel and returns it via the output parameters"
....
"results can be stored in Scalar_ 's."
So the mean and stdev values are scalar values.
For color images, split the image into its channels
and calculate and apply a threshold for each channel independently:
mean[0]: mean value of the first channel
mean[1]: mean value of the second channel
....
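A minimal sketch of that per-channel idea (the 0.5 offset factor is only an illustrative threshold rule, not the stdMean weights from the question):

#include <opencv2/opencv.hpp>
#include <vector>

cv::Mat perChannelThreshold(const cv::Mat& color)
{
    std::vector<cv::Mat> channels;
    cv::split(color, channels);                      // one 8-bit plane per channel

    for (size_t c = 0; c < channels.size(); ++c)
    {
        cv::Scalar mean, stdev;
        cv::meanStdDev(channels[c], mean, stdev);    // single-channel stats, so only index 0 is used
        double t = mean[0] + 0.5 * stdev[0];         // example threshold rule
        cv::threshold(channels[c], channels[c], t, 255, cv::THRESH_BINARY);
    }

    cv::Mat result;
    cv::merge(channels, result);                     // put the thresholded channels back together
    return result;
}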
Say I have a very simple image or shape such as this stick man drawing:
I also have a library of other simple images which I want to compare the first image to and determine the closest match:
Notice that the two stick men are not completely identical but are reasonably similar.
I want to be able to compare the first image to each image in my library until a reasonably close match is found. If necessary, my image library could contain numerous variations of the same image in order to help decide which type of image I have. For example:
My question is whether this is something that OpenCV would be capable of? Has it been done before, and if so, can you point me in the direction of some examples? Many thanks for your help.
Edit: Through my searches I have found many examples of people who are comparing images, or even people comparing images which have been stretched or skewed, such as this: Checking images for similarity with OpenCV. Unfortunately, as you can see, my images are not just translated (rotated/skewed/stretched) versions of one another - they are actually different images, although they are very similar.
You should be able to do it using the template matching functionality of OpenCV. You can use the matchTemplate function to look for the feature and then minMaxLoc to find its location. Check out the matchTemplate tutorial on the OpenCV web site.
It seems you need feature point detection and matching. Check these docs from OpenCV:
http://docs.opencv.org/doc/tutorials/features2d/feature_detection/feature_detection.html
http://docs.opencv.org/doc/tutorials/features2d/feature_flann_matcher/feature_flann_matcher.html
For your particular type of images, you might get good results by using moments/HuMoments for the connected components (which you can find with findContours).
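A minimal sketch of that route, using cv::matchShapes (which compares contours via Hu moment invariants; the enum name below is the OpenCV 3 spelling, and the inputs are assumed to be binary drawings):

#include <opencv2/opencv.hpp>
#include <vector>

// Compare the outer contours of two binary drawings via Hu moments.
// matchShapes is invariant to translation, scale and rotation;
// lower return values mean more similar shapes.
double shapeDistance(const cv::Mat& a, const cv::Mat& b)
{
    std::vector<std::vector<cv::Point> > ca, cb;
    // clone(): findContours may modify its input in older OpenCV versions
    cv::findContours(a.clone(), ca, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    cv::findContours(b.clone(), cb, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    if (ca.empty() || cb.empty()) return -1.0;       // nothing to compare

    return cv::matchShapes(ca[0], cb[0], cv::CONTOURS_MATCH_I1, 0.0);
}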
Since there is a rotation involved, I don't think template matching would work well. You probably need to use feature point detection such as SIFT or SURF.
EDIT: This won't work with rotation. Same for matchTemplate. I am yet to try the findContours + moments approach from bjoernz's answer, which sounds promising.
Failed Solution:
I tried using the ShapeContextDistanceExtractor(1) available in OpenCV 3.0 along with findContours on your sample images to get good results. The sample images were cropped to the same size as the original image (128*200). You could as well use resize in OpenCV.
The code below compares the images in the images folder with 1.png as the base image.
#include "opencv2/shape.hpp"
#include "opencv2/opencv.hpp"
#include <iostream>
#include <string>
using namespace std;
using namespace cv;
const int MAX_SHAPES = 7;
vector<Point> findContours( const Mat& compareToImg )
{
vector<vector<Point> > contour2D;
findContours(compareToImg, contour2D, RETR_LIST, CHAIN_APPROX_NONE);
//converting 2d vector contours to 1D vector for comparison
vector <Point> contour1D;
for (size_t border=0; border < contour2D.size(); border++) {
for (size_t p=0; p < contour2D[border].size(); p++) {
contour1D.push_back( contour2D[border][p] );
}
}
//limiting contours size to reduce distance comparison time
contour1D.resize( 300 );
return contour1D;
}
int main()
{
string path = "./images/";
cv::Ptr <cv::ShapeContextDistanceExtractor> distanceExtractor = cv::createShapeContextDistanceExtractor();
//base image
Mat baseImage= imread( path + "1.png", IMREAD_GRAYSCALE);
vector<Point> baseImageContours= findContours( baseImage );
for ( int idx = 2; idx <= MAX_SHAPES; ++idx ) {
stringstream imgName;
imgName << path << idx << ".png";
Mat compareToImg=imread( imgName.str(), IMREAD_GRAYSCALE ) ;
vector<Point> contii = findContours( compareToImg );
float distance = distanceExtractor->computeDistance( baseImageContours, contii );
std::cout<<" distance to " << idx << " : " << distance << std::endl;
}
return 0;
}
Result
distance to 2 : 89.7951
distance to 3 : 14.6793
distance to 4 : 6.0063
distance to 5 : 4.79834
distance to 6 : 0.0963184
distance to 7 : 0.00212693
Do three things:
1. Forget about image comparison, since you are really comparing stroke symbols.
2. Download and play with a Gesture Search app from the Google Play store.
3. Realize that for good performance you cannot recognize your strokes without using timestamp information about how the stroke was drawn. Otherwise we would already have successful handwriting recognition. Then you can research Android stroke recognition libraries to write your code properly.
I want to multiply each element of a Vec3 by 2 in OpenCV, as we do in Matlab simply with ".*". I searched a lot but didn't find any command; is there a command for this or not in OpenCV? Thanks in advance for any help.
This answer would suggest you can just use the * (or *=) operator in C++.
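For example, a minimal C++ sketch:

#include <opencv2/opencv.hpp>

int main()
{
    cv::Vec3f v(1.0f, 2.0f, 3.0f);
    cv::Vec3f w = v * 2.0f;              // scales each component: (2, 4, 6)

    cv::Mat m = cv::Mat::ones(3, 3, CV_32FC3);
    m *= 2.0;                            // likewise scales every element of the Mat
    return 0;
}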
If you are using Java I don't think this is possible, you can only multiply a Mat by another Mat.
So you would need to create a new Mat instance of the same size and type, initialised with the scalar value you want to multiply by.
You can easily create a function to do this:
public Mat multiplyScalar(Mat m, double i)
{
    return m = m.mul(new Mat((int)m.size().height, (int)m.size().width, m.type(), new Scalar(i)));
}
Then x = multiplyScalar(x, 5); will multiply each element by 5.
Currently, I'm working on a project in medical engineering. I have a big image with several sub-images of the cell, so my first task is to divide the image.
I thought about the following:
Convert the image to binary.
Do a projection of the brightness values onto the x-axis, so I can see where there are gaps between bright regions, and then divide the image there.
The problem comes when I try to do the second part. My idea is to use a vector as the projection and sum all the brightness values along one column, so that position 0 of the vector is the sum of all the brightness values in the first column of the image, and so on until I reach the last column, so at the end I have the projection.
This is how I have tried it:
void calculo(cv::Mat &result, cv::Mat &binary){ // result = the sum, binary = the image
    int i, j;
    for (i = 0; i <= binary.rows; i++){
        for(j = 0; j <= binary.cols; j++){
            cv::Scalar intensity = binary.at<uchar>(j, i);
            result.at<uchar>(i, i) = result.at<uchar>(i, i) + intensity.val[0];
        }
        cv::Scalar intensity2 = result.at<uchar>(i, i);
        cout << "content" "\n" << intensity2.val[0] << endl;
    }
}
When executing this code, I get an access violation error. Another problem is that I cannot create a matrix with one unique row, so... I don't know what I could do.
Any ideas?! Thanks!
In the end it does not work; I need to sum all the pixels in one COLUMN. I did:
cv::Mat suma(cv::Mat& matrix){
    int i;
    cv::Mat output(1, matrix.cols, CV_64F);
    for (i = 0; i <= matrix.cols; i++){
        output.at<double>(0, i) = norm(matrix.col(i), 1);
    }
    return output;
}
but it gave me an error:
Assertion failed (0 <= colRange.start && colRange.start <= colRange.end && colRange.end <= m.cols) in Mat, file /home/usuario/OpenCV-2.2.0/modules/core/src/matrix.cpp, line 276
I don't know, any idea would be helpful. Anyway, many thanks mevatron, you really put me on the right track.
If you just want the sum of the binary image, you could simply take the L1-norm. Like so:
Mat binaryVectorSum(const Mat& binary)
{
    Mat output(1, binary.rows, CV_64F);
    for(int i = 0; i < binary.rows; i++)
    {
        output.at<double>(0, i) = norm(binary.row(i), NORM_L1);
    }
    return output;
}
I'm at work, so I can't test it out, but that should get you close.
EDIT : Got home. Tested it. It works. :) One caveat...this function works if your binary matrix is truly binary (i.e., 0's and 1's). You may need to scale the norm output with the maximum value if the binary matrix is say 0's and 255's.
EDIT : If you don't have using namespace cv; in your .cpp file, then you'll need to declare the namespace to use NORM_L1 like this cv::NORM_L1.
Have you considered transposing the matrix before you call the function? Like this:
sumCols = binaryVectorSum(binary.t());
vs.
sumRows = binaryVectorSum(binary);
EDIT : A bug with my code :)
I changed:
Mat output(1, binary.cols, CV_64F);
to
Mat output(1, binary.rows, CV_64F);
My test case was a square matrix, so that bug didn't get found...
Hope that is helpful!
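As an aside, if cv::reduce is available you can get the column sums in one call; a minimal sketch (CV_REDUCE_SUM is the OpenCV 2.x constant name):

#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    cv::Mat binary = (cv::Mat_<uchar>(2, 3) << 1, 0, 1,
                                               1, 1, 0);
    cv::Mat colSums;
    // dim = 0 collapses the rows, leaving one sum per column
    cv::reduce(binary, colSums, 0, CV_REDUCE_SUM, CV_64F);
    std::cout << colSums << std::endl;   // prints [2, 1, 1]
    return 0;
}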
When converting from RGB to grayscale, it is said that specific weights to channels R, G, and B ought to be applied. These weights are: 0.2989, 0.5870, 0.1140.
It is said that the reason for this is the different human perception of / sensitivity to these three colors. Sometimes it is also said these are the values used to compute the NTSC signal.
However, I didn't find a good reference for this on the web. What is the source of these values?
See also these previous questions: here and here.
The specific numbers in the question are from CCIR 601 (see Wikipedia article).
If you convert RGB -> grayscale with slightly different numbers / different methods, you won't see much difference at all on a normal computer screen under normal lighting conditions -- try it.
Here are some more links on color in general:
Wikipedia: Luma
Bruce Lindbloom's outstanding web site
Chapter 4 on Color in the book by Colin Ware, "Information Visualization", ISBN 1-55860-819-2; this long link to Ware on books.google.com may or may not work
cambridgeincolor: excellent, well-written "tutorials on how to acquire, interpret and process digital photographs using a visually-oriented approach that emphasizes concept over procedure"
Should you run into "linear" vs "nonlinear" RGB, here's part of an old note to myself on this. (Repeat: in practice you won't see much difference.)
### RGB -> ^gamma -> Y -> L*
In color science, the common RGB values, as in html rgb( 10%, 20%, 30% ), are called "nonlinear" or gamma corrected. "Linear" values are defined as

Rlin = R^gamma, Glin = G^gamma, Blin = B^gamma

where gamma is 2.2 for many PCs. The usual R G B are sometimes written as R' G' B' (R' = Rlin ^ (1/gamma)) (purists tongue-click), but here I'll drop the '.

Brightness on a CRT display is proportional to RGBlin = RGB ^ gamma, so 50% gray on a CRT is quite dark: .5 ^ 2.2 = 22% of maximum brightness. (LCD displays are more complex; furthermore, some graphics cards compensate for gamma.)
To get the measure of lightness called L* from RGB, first divide R, G, B by 255 and compute

Y = .2126 * R^gamma + .7152 * G^gamma + .0722 * B^gamma

This is Y in XYZ color space; it is a measure of color "luminance". (The real formulas are not exactly x^gamma, but close; stick with x^gamma for a first pass.)

Finally,

L* = 116 * Y^(1/3) - 16

"... aspires to perceptual uniformity [and] closely matches human perception of lightness." -- Wikipedia, Lab color space
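A minimal sketch of that pipeline in C++, using the simplified x^gamma form from above (gamma 2.2, no linear segment, no cube-root cutoff of the exact L* formula):

#include <cmath>
#include <cstdio>

// RGB (0..255) -> Y -> L*, following the simplified formulas above.
double lightnessLstar(double r8, double g8, double b8)
{
    const double gamma = 2.2;
    double r = std::pow(r8 / 255.0, gamma);              // "linearize"
    double g = std::pow(g8 / 255.0, gamma);
    double b = std::pow(b8 / 255.0, gamma);

    double Y = 0.2126 * r + 0.7152 * g + 0.0722 * b;     // luminance in XYZ
    return 116.0 * std::pow(Y, 1.0 / 3.0) - 16.0;        // L*, roughly 0..100
}

int main()
{
    std::printf("50%% gray -> L* = %.1f\n", lightnessLstar(128, 128, 128));
    return 0;
}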
I found this publication referenced in an answer to a previous similar question. It is very helpful, and the page has several sample images:
Perceptual Evaluation of Color-to-Grayscale Image Conversions by Martin Čadík, Computer Graphics Forum, Vol 27, 2008
The publication explores several other methods to generate grayscale images with different outcomes:
CIE Y
Color2Gray
Decolorize
Smith08
Rasche05
Bala04
Neumann07
Interestingly, it concludes that there is no universally best conversion method, as each performed better or worse than others depending on input.
Here's some code in C to convert RGB to grayscale.
The usual weighting for RGB-to-grayscale conversion is roughly 0.3R + 0.59G + 0.11B.
These weights aren't absolutely critical, so you can play with them.
I have made them 0.25R + 0.5G + 0.25B. It produces a slightly darker image.
NOTE: The following code assumes an xRGB 32-bit pixel format.
// I_Out: grayscale output buffer (width*height bytes), assumed declared elsewhere
unsigned int *pntrBWImage = (unsigned int*)..data pointer..; // assumes 4*width*height bytes, i.e. 32 bits / 4 bytes per pixel
unsigned int fourBytes;
unsigned char r, g, b;
for (int index = 0; index < width*height; index++)
{
    fourBytes = pntrBWImage[index];   // caches 4 bytes at a time
    r = (fourBytes >> 16);            // xRGB layout: red byte
    g = (fourBytes >> 8);             //              green byte
    b = fourBytes;                    //              blue byte
    I_Out[index] = (r >> 2) + (g >> 1) + (b >> 2);   // 0.25R + 0.5G + 0.25B; runs in 0.00065s on my pc and produces slightly darker results
    //I_Out[index] = ((unsigned int)(r + g + b)) / 3; // pure average; runs in 0.0011s on my pc
}
Check out the Color FAQ for information on this. These values come from the standardization of RGB values that we use in our displays. Actually, according to the Color FAQ, the values you are using are outdated, as they are the values used for the original NTSC standard and not modern monitors.
What is the source of these values?
The "source" of the coefficients posted are the NTSC specifications which can be seen in Rec601 and Characteristics of Television.
The "ultimate source" are the CIE circa 1931 experiments on human color perception. The spectral response of human vision is not uniform. Experiments led to weighting of tristimulus values based on perception. Our L, M, and S cones1 are sensitive to the light wavelengths we identify as "Red", "Green", and "Blue" (respectively), which is where the tristimulus primary colors are derived.2
The linear light3 spectral weightings for sRGB (and Rec709) are:
Rlin * 0.2126 + Glin * 0.7152 + Blin * 0.0722 = Y
These are specific to the sRGB and Rec709 colorspaces, which are intended to represent computer monitors (sRGB) or HDTV monitors (Rec709), and are detailed in the ITU documents for Rec709 and also BT.2380-2 (10/2018)
FOOTNOTES
(1) Cones are the color detecting cells of the eye's retina.
(2) However, the chosen tristimulus wavelengths are NOT at the "peak" of each cone type - instead, tristimulus values are chosen such that they stimulate one particular cone type substantially more than the others, i.e. separation of stimulus.
(3) You need to linearize your sRGB values before applying the coefficients. I discuss this in another answer here.
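For reference, a minimal sketch of that linearization step, using the standard piecewise sRGB transfer function:

#include <cmath>
#include <cstdio>

// sRGB transfer function: gamma-encoded channel value (0..1) -> linear light.
double srgbToLinear(double c)
{
    return (c <= 0.04045) ? c / 12.92
                          : std::pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance Y from 8-bit sRGB values, with the coefficients above.
double luminance(int r, int g, int b)
{
    return 0.2126 * srgbToLinear(r / 255.0)
         + 0.7152 * srgbToLinear(g / 255.0)
         + 0.0722 * srgbToLinear(b / 255.0);
}

int main()
{
    std::printf("Y of (255, 128, 0) = %.4f\n", luminance(255, 128, 0));
    return 0;
}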
Starting a list to enumerate how different software packages do it. Here is a good CVPR paper to read as well.
FreeImage
#define LUMA_REC709(r, g, b) (0.2126F * r + 0.7152F * g + 0.0722F * b)
#define GREY(r, g, b) (BYTE)(LUMA_REC709(r, g, b) + 0.5F)
OpenCV
nVidia Performance Primitives
Intel Performance Primitives
Matlab
nGray = 0.299F * R + 0.587F * G + 0.114F * B;
These values vary from person to person, especially for people who are colorblind.
Is all this really necessary? Human perception and CRT vs. LCD will vary, but the R, G, B intensity does not. Why not L = (R + G + B)/3 and set the new RGB to L, L, L?