OpenCV: how to cluster by angle using kmeans()

The question is: how do I cluster pairs of units by their angle? The problem is that kmeans operates on Euclidean distance and knows nothing about the periodic nature of angles. So to make it work, the angle must be translated into Euclidean space in a way that keeps the following true:
close angles are close values in Euclidean space;
far angles are far values in Euclidean space.
This means that 90 and -90 are distant values, 180 and -180 are the same, and 170 and -170 are close (angles run from left up to the right: 0 to +180, and from left down to the right: 0 to -180).
I tried various sin() functions, but they all violate one of the two requirements above. The most promising one is sin(x * 0.5f), but it still has the problem that 180 and -180 end up as distant values in Euclidean space.

The solution I found is to translate the angles to points on a circle and feed those points into kmeans. That way kmeans compares distances between points, and this works perfectly.
One important thing to mention: the eps in the kmeans termination criterion is expressed in the units of the samples you feed to kmeans. In this example the maximally distant points are 200 units apart (2 * radius), so eps = 1.0f is totally fine. If you normalize your samples to [0, 1] first (e.g. with cv::normalize(samples, samples, 0.0f, 1.0f, cv::NORM_MINMAX) before calling kmeans()), adjust eps accordingly; something like eps = 0.01f plays better there.
Enjoy! Hope this helps someone.
static cv::Point2f angleToPointOnCircle(float angle, float radius, cv::Point2f origin /* center */)
{
    float x = radius * cosf(angle * M_PI / 180.0f) + origin.x;
    float y = radius * sinf(angle * M_PI / 180.0f) + origin.y;
    return cv::Point2f(x, y);
}
static std::vector<std::pair<size_t, int> > biggestKmeansGroup(const std::vector<int> &labels, int count)
{
    std::vector<std::pair<size_t, int> > indices;
    std::map<int, size_t> l2cm; // label -> member count
    for (size_t i = 0; i < labels.size(); ++i)
        l2cm[labels[i]]++;

    std::vector<std::pair<size_t, int> > c2lm;
    for (std::map<int, size_t>::iterator it = l2cm.begin(); it != l2cm.end(); it++)
        c2lm.push_back(std::make_pair(it->second, it->first)); // count, group

    // cmp_pair_first_reverse (defined elsewhere) orders by the first pair
    // member, descending, so the biggest groups come first.
    std::sort(c2lm.begin(), c2lm.end(), cmp_pair_first_reverse);
    for (size_t i = 0; i < c2lm.size() && count-- > 0; i++)
        indices.push_back(c2lm[i]);
    return indices;
}
static void sortByAngle(std::vector<boost::shared_ptr<Pair> > &group,
                        std::vector<boost::shared_ptr<Pair> > &result)
{
    std::vector<int> labels;
    cv::Mat samples;

    /* The radius is not so important here. */
    for (size_t i = 0; i < group.size(); i++)
        samples.push_back(angleToPointOnCircle(group[i]->angle, 100, cv::Point2f(0, 0)));

    /* 90 degrees per group. May be less if you need it. */
    static const int PAIR_MAX_FINE_GROUPS = 4;
    int groupNr = std::max(std::min((int)group.size(), PAIR_MAX_FINE_GROUPS), 1);
    assert((int)group.size() >= groupNr);

    cv::kmeans(samples.reshape(1, (int)group.size()), groupNr, labels,
               cvTermCriteria(CV_TERMCRIT_EPS /* | CV_TERMCRIT_ITER */, 30, 1.0f),
               100, cv::KMEANS_RANDOM_CENTERS);

    std::vector<std::pair<size_t, int> > biggest = biggestKmeansGroup(labels, groupNr);
    for (size_t g = 0; g < biggest.size(); g++) {
        for (size_t i = 0; i < group.size(); i++) {
            if (labels[i] == biggest[g].second)
                result.push_back(group[i]);
        }
    }
}
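For reference, here is a minimal self-contained sketch of the same idea, independent of the Pair type above (my own illustration; the angle values and the cluster count of 2 are made up for the demo):
#include <opencv2/opencv.hpp>
#include <cmath>
#include <iostream>
#include <vector>

int main()
{
    // Angles in degrees, using the (-180, +180] convention from the question.
    std::vector<float> angles = { 170.0f, -170.0f, 175.0f, 10.0f, 5.0f, -5.0f };

    // Map each angle to a point on a circle of radius 100, so that
    // 170 and -170 end up close together in Euclidean space.
    cv::Mat samples((int)angles.size(), 2, CV_32F);
    for (int i = 0; i < (int)angles.size(); i++) {
        samples.at<float>(i, 0) = 100.0f * std::cos(angles[i] * (float)CV_PI / 180.0f);
        samples.at<float>(i, 1) = 100.0f * std::sin(angles[i] * (float)CV_PI / 180.0f);
    }

    std::vector<int> labels;
    cv::Mat centers;
    cv::kmeans(samples, 2, labels,
               cv::TermCriteria(cv::TermCriteria::EPS, 30, 1.0),
               10, cv::KMEANS_RANDOM_CENTERS, centers);

    // 170, -170 and 175 should land in one cluster; 10, 5 and -5 in the other.
    for (size_t i = 0; i < angles.size(); i++)
        std::cout << angles[i] << " -> cluster " << labels[i] << std::endl;
    return 0;
}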

Related

Convolution of Image Processing in Processing language

Since the corona situation has turned my studies into self-study, and I am a newbie to the Processing language, I am not having an easy time getting into the subject of image processing, more specifically convolution. I therefore hope that you can help me.
My lecturer, who unfortunately is nearly never reachable, left me the following convolution code. The theory behind convolution is clear to me, but I have many gaps in understanding related to the code. Could someone leave line comments so that I can get into the code a bit more fluently?
The code is the following:
color convolution(int x, int y, float[][] matrix, int matrix_size, PImage img) {
    float rtotal = 0.0;
    float gtotal = 0.0;
    float btotal = 0.0;
    int offset = matrix_size / 2;
    for (int i = 0; i < matrix_size; i++) {
        for (int j = 0; j < matrix_size; j++) {
            int xloc = x + i - offset;
            int yloc = y + j - offset;
            int loc = xloc + img.width * yloc;
            rtotal += (red(img.pixels[loc]) * matrix[i][j]);
            gtotal += (green(img.pixels[loc]) * matrix[i][j]);
            btotal += (blue(img.pixels[loc]) * matrix[i][j]);
        }
    }
    rtotal = constrain(rtotal, 0, 255);
    gtotal = constrain(gtotal, 0, 255);
    btotal = constrain(btotal, 0, 255);
    return color(rtotal, gtotal, btotal);
}
I have to do a bit of guesswork since I'm not positive about all of the functions you're using and I'm not familiar with the Processing 3+ library, but here's my best shot at it.
color convolution(int x, int y, float[][] matrix, int matrix_size, PImage img) {
    // Note: the 'matrix' parameter here will also frequently be referred to as
    // a 'window' or 'kernel' in research.
    // I'm not certain what your PImage class is from, but I'll assume
    // you're using the Processing 3+ library and work off of that assumption.

    // How much of each color we see within the kernel (matrix) space.
    float rtotal = 0.0;
    float gtotal = 0.0;
    float btotal = 0.0;

    // This offset is to zero-center our kernel. The fact that we use
    // matrix_size / 2 sort of implicitly alludes to the fact that our
    // matrix_size should be an odd number, so that we can have a middle pixel.
    int offset = matrix_size / 2;

    // Looping through the kernel. The fact that we use 'matrix_size'
    // as our end condition for both dimensions means that our 'matrix' kernel
    // must always be a square.
    for (int i = 0; i < matrix_size; i++) {
        for (int j = 0; j < matrix_size; j++) {
            // Calculating the index conversion from 2D to the 1D format that PImage uses.
            // Refer to: https://processing.org/tutorials/pixels/
            // for a better understanding of PImage indexing (about 1/3 of the way down the page).
            // WARNING: by subtracting the offset it is possible to hit negative
            // x,y values here if you pick an x or y position less than matrix_size / 2.
            // The same index-out-of-bounds problem can occur on the high end.
            // When you convolve using a kernel of N x N size (N here would be matrix_size)
            // you can only convolve from [N / 2, Width - (N / 2)] for x and y.
            int xloc = x + i - offset;
            int yloc = y + j - offset;
            // This is the final 1D PImage index that corresponds to [xloc, yloc] in our 2D image.
            // Really, go back up and take a look at the link if this doesn't make sense; it's pretty good.
            int loc = xloc + img.width * yloc;
            // I have to do some speculation again since I'm not certain what red(img.pixels[loc]) does;
            // I'll assume it returns the red channel of the pixel.
            // This section just adds up all of the pixel colors multiplied by the value in the kernel.
            rtotal += (red(img.pixels[loc]) * matrix[i][j]);
            gtotal += (green(img.pixels[loc]) * matrix[i][j]);
            btotal += (blue(img.pixels[loc]) * matrix[i][j]);
        }
    }

    // The fact that no further division or averaging happens after the for-loops implies
    // that the kernel you feed in should have balanced values for your kernel size.
    // For example, a kernel designed to average out the color over the 3 x 3 area
    // it covers (this would be like blurring the image) would be filled with 1/9.
    // In general, the kernel you're using should have a sum of 1 over all of the numbers inside.
    // This is just 'in general'; you can play around with not doing that, but you'll probably
    // notice a darkening effect when the sum is less than 1, and a brightening effect if it's
    // greater than 1. For more info on kernels, read this:
    // https://en.wikipedia.org/wiki/Kernel_(image_processing)

    // I don't have the code for this constrain function,
    // but it's almost certainly just your typical clamp (constrains the values to [0, 255]).
    // Note: this means that your values saturate at 0 and 255.
    // If you see a lot of black or white, that means your kernel
    // probably isn't balanced as mentioned above.
    rtotal = constrain(rtotal, 0, 255);
    gtotal = constrain(gtotal, 0, 255);
    btotal = constrain(btotal, 0, 255);

    // Finished!
    return color(rtotal, gtotal, btotal);
}
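To make the boundary warning in the comments concrete, here is a small self-contained C++ sketch (my own illustration, not part of the lecturer's code) that uses the same xloc + width * yloc indexing but only visits pixels where the whole kernel fits inside the image:
#include <vector>

// Convolve a grayscale image stored as a flat row-major float vector,
// touching only the "valid" region [offset, size - offset) so the kernel
// never reads out of bounds. Border pixels are left at 0.
std::vector<float> convolveValidRegion(const std::vector<float>& img,
                                       int width, int height,
                                       const std::vector<std::vector<float>>& kernel)
{
    int n = (int)kernel.size(); // kernel is assumed square, n x n, n odd
    int offset = n / 2;
    std::vector<float> out(img.size(), 0.0f);
    for (int y = offset; y < height - offset; y++) {
        for (int x = offset; x < width - offset; x++) {
            float total = 0.0f;
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    total += img[(x + i - offset) + width * (y + j - offset)] * kernel[i][j];
            out[x + width * y] = total;
        }
    }
    return out;
}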

Opencv, calculate distance at pixel from depth mat

I have a disparity image created with a calibrated stereo camera pair and OpenCV. It looks good, and my calibration data is good.
I need to calculate the real-world distance at a pixel.
From other questions on Stack Overflow, I see that the approach is:
depth = baseline * focal / disparity
Using the function:
setMouseCallback("disparity", onMouse, &disp);

static void onMouse(int event, int x, int y, int flags, void* param)
{
    cv::Mat &xyz = *((cv::Mat*)param); // cast and deref the param
    if (event == cv::EVENT_LBUTTONDOWN)
    {
        unsigned int val = xyz.at<uchar>(y, x);
        double depth = (camera_matrixL.at<float>(0, 0) * T.at<float>(0, 0)) / val;
        cout << "x= " << x << " y= " << y << " val= " << val << " distance: " << depth << endl;
    }
}
I click on a point that I have measured to be 3 meters away from the stereo camera.
What I get is:
val= 31 distance: 0.590693
The depth mat values are between 0 and 255; the depth mat is of type 0, i.e. CV_8UC1.
The stereo baseline is 0.0643654 (in meters).
The focal length is 284.493.
I have also tried:
(from OpenCV - compute real distance from disparity map)
float fMaxDistance = static_cast<float>((1. / T.at<float>(0, 0)) * camera_matrixL.at<float>(0, 0));
// outputDisparityValue is a single 16-bit value from disparityMap
float fDisparity = val / (float)cv::StereoMatcher::DISP_SCALE;
float fDistance = fMaxDistance / fDisparity;
which gives me a distance (closer to the truth, if we assume mm units) of val= 31 distance: 2281.27.
But that is still incorrect.
Which of these approaches is correct? And where am I going wrong?
Left, Right, Depth map. (EDIT: this depth map is from a different pair of images)
EDIT: Based on an answer, I am trying this:
std::vector<Eigen::Vector4d> pointcloud;
float fx = 284.492615;
float fy = 285.683197;
float cx = 424; // 425.807709;
float cy = 400; // 395.494293;

cv::Mat Q = cv::Mat(4, 4, CV_32F);
Q.at<float>(0, 0) = 1.0;
Q.at<float>(0, 1) = 0.0;
Q.at<float>(0, 2) = 0.0;
Q.at<float>(0, 3) = -cx; // cx
Q.at<float>(1, 0) = 0.0;
Q.at<float>(1, 1) = 1.0;
Q.at<float>(1, 2) = 0.0;
Q.at<float>(1, 3) = -cy; // cy
Q.at<float>(2, 0) = 0.0;
Q.at<float>(2, 1) = 0.0;
Q.at<float>(2, 2) = 0.0;
Q.at<float>(2, 3) = -fx; // focal
Q.at<float>(3, 0) = 0.0;
Q.at<float>(3, 1) = 0.0;
Q.at<float>(3, 2) = -1.0 / 6; // 1.0 / baseline
Q.at<float>(3, 3) = 0.0; // cx - cx'

cv::Mat XYZcv(depth_image.size(), CV_32FC3);
reprojectImageTo3D(depth_image, XYZcv, Q, false, CV_32F);

for (int y = 0; y < XYZcv.rows; y++)
{
    for (int x = 0; x < XYZcv.cols; x++)
    {
        cv::Point3f pointOcv = XYZcv.at<cv::Point3f>(y, x);
        Eigen::Vector4d pointEigen(0, 0, 0, left.at<uchar>(y, x) / 255.0);
        pointEigen[0] = pointOcv.x;
        pointEigen[1] = pointOcv.y;
        pointEigen[2] = pointOcv.z;
        pointcloud.push_back(pointEigen);
    }
}
And that gives me a cloud.
I would recommend using reprojectImageTo3D from OpenCV to reconstruct the distance from the disparity. Note that when using this function you indeed have to divide the output of StereoSGBM by 16. You should already have all the parameters: f, cx, cy, and Tx. Take care to give f and Tx in the same units; cx and cy are in pixels.
Since the problem is that you need the Q matrix, I think that this link or this one should help you build it. If you don't want to use reprojectImageTo3D, I strongly recommend the first link!
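For illustration, a minimal sketch of that pipeline (my own code, not from the question; the calibration numbers are the question's values used as placeholders, and disparity16 stands for the raw matcher output):
#include <opencv2/opencv.hpp>

// Placeholder calibration values -- substitute your own. f and Tx must be in
// consistent units (the depth comes out in the unit of Tx); cx, cy in pixels.
float f  = 284.493f;
float cx = 424.0f, cy = 400.0f;
float Tx = 0.0643654f; // baseline in meters

// disparity16: raw CV_16S output of StereoSGBM/StereoBM, scaled by 16.
float depthAt(const cv::Mat& disparity16, int x, int y)
{
    cv::Mat Q = (cv::Mat_<float>(4, 4) <<
        1, 0, 0, -cx,
        0, 1, 0, -cy,
        0, 0, 0,  f,
        0, 0, -1.0f / Tx, 0);

    cv::Mat disp32, xyz;
    disparity16.convertTo(disp32, CV_32F, 1.0 / 16.0); // undo the x16 scaling
    cv::reprojectImageTo3D(disp32, xyz, Q, true);
    return xyz.at<cv::Point3f>(y, x).z; // depth in the unit of Tx
}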
I hope this helps!
To find the point-based depth of an object from the camera, use the following formula:
Depth = (Baseline x FocalLength) / Disparity
I hope you are using it correctly as per your question.
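(As a quick sanity check with the question's own numbers: 0.0643654 m * 284.493 px / 31 ≈ 0.59 m, which reproduces the reported 0.590693 exactly. So the formula is being applied consistently; the issue is that the raw 8-bit value 31 is most likely a rescaled display value rather than the true disparity, so the unscaled, sub-pixel disparity has to be recovered before the formula yields a physical distance.)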
Try the Nerian calculator below for the theoretical error:
https://nerian.com/support/resources/calculator/
Also, use sub-pixel interpolation in your code.
Make sure the object you are measuring the depth of has good texture.
The most common problems with depth maps are:
untextured surfaces (plain objects);
bad calibration results.
What are the RMS value of your calibration, the camera resolution, and the lens type (focal length)? These are important for providing much better data for your program.

standard deviation of a UIImage/CGImage

I need to calculate the standard deviation on an image I have inside a UIImage object.
I know already how to access all pixels of an image, one at a time, so somehow I can do it.
I'm wondering if there is a function somewhere in the framework to perform this in a better and more efficient way... I can't find one, so maybe it doesn't exist.
Does anyone know how to do this?
bye
To further expand on my comment above: I would definitely look into using the Accelerate framework, especially depending on the size of your image. If your image is a few hundred pixels by a few hundred, you will have a ton of data to process, and Accelerate along with vDSP will make all of that math a lot faster, since it processes everything with the CPU's vector (SIMD) units. I will look into this a little more, and possibly put up some code in a few minutes.
UPDATE
I will post some code to do standard deviation in a single dimension using vDSP, but this could definitely be extended to 2-D.
float imageR[] = {0.1f, 0.2f, 0.3f, 0.4f /* , ... */}; // vector of values
int numValues = 100; // number of values in imageR
float mean = 0; // placeholder for the mean
vDSP_meanv(imageR, 1, &mean, numValues); // find the mean of the vector
mean = -1 * mean; // invert the mean so that adding it is actually subtraction
float *subMeanVec = (float *)calloc(numValues, sizeof(float)); // placeholder vector
vDSP_vsadd(imageR, 1, &mean, subMeanVec, 1, numValues); // subtract the mean from the vector
float *squared = (float *)calloc(numValues, sizeof(float)); // placeholder for the squared vector
vDSP_vsq(subMeanVec, 1, squared, 1, numValues); // square the vector element by element
free(subMeanVec); // free some memory
float sum = 0; // placeholder for the sum
vDSP_sve(squared, 1, &sum, numValues); // sum the entire vector
free(squared); // free the squared vector
float stdDev = sqrt(sum / numValues); // the calculated standard deviation
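As a possible shortcut (an assumption worth verifying against the vDSP docs, including whether it divides by N or N-1), vDSP_normalize reports the mean and standard deviation in one call when you pass NULL for the output vector:
#include <Accelerate/Accelerate.h>

float mean = 0.0f, stdDev = 0.0f;
// Passing NULL/0 for the output vector skips the normalization itself and
// only computes the statistics of imageR.
vDSP_normalize(imageR, 1, NULL, 0, &mean, &stdDev, (vDSP_Length)numValues);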
Please explain your query so that I can come up with a specific reply.
If I am getting you right, you want to calculate the standard deviation of the RGB values of the pixels, or of the HSV values of the colors. You can frame your own method of standard deviation for circular quantities in the case of HSV, since hue is an angle.
We can do this by wrapping the values.
For example: the average of [358, 2] degrees is (358 + 2) / 2 = 180 degrees.
But this is not correct, because the average (mean) should be 0 degrees.
So we wrap 358 into -2.
Now the answer is 0.
So you have to apply wrapping first, and then you can calculate the standard deviation; a sketch of the wrapping idea follows below.
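A minimal sketch of that wrapping idea (my own illustration, not the code from the answer), averaging angles through their sine and cosine components so that [358, 2] averages to 0:
#include <cmath>
#include <vector>

double circularMeanDeg(const std::vector<double>& degrees)
{
    double s = 0.0, c = 0.0;
    for (double d : degrees) {
        s += std::sin(d * M_PI / 180.0);
        c += std::cos(d * M_PI / 180.0);
    }
    return std::atan2(s, c) * 180.0 / M_PI; // result in (-180, 180]
}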
UPDATE:
Convert RGB to HSV
// r, g, b values are from 0 to 1
// h = [0, 360], s = [0, 1], v = [0, 1]
// if s == 0, then h = -1 (undefined)
void RGBtoHSV(float r, float g, float b, float *h, float *s, float *v)
{
    float min, max, delta;
    min = MIN(r, MIN(g, b));
    max = MAX(r, MAX(g, b));
    *v = max;
    delta = max - min;
    if (max != 0)
        *s = delta / max;
    else {
        // r = g = b = 0
        *s = 0;
        *h = -1;
        return;
    }
    if (r == max)
        *h = (g - b) / delta;
    else if (g == max)
        *h = 2 + (b - r) / delta;
    else
        *h = 4 + (r - g) / delta;
    *h *= 60; // degrees
    if (*h < 0)
        *h += 360;
}
and then calculate the standard deviation of the hue values with this:
double calcStddev(ArrayList<Double> angles) {
    double sin = 0;
    double cos = 0;
    for (int i = 0; i < angles.size(); i++) {
        sin += Math.sin(angles.get(i) * (Math.PI / 180.0));
        cos += Math.cos(angles.get(i) * (Math.PI / 180.0));
    }
    sin /= angles.size();
    cos /= angles.size();
    // circular standard deviation: sqrt(-ln(R^2)), R = mean resultant length
    double stddev = Math.sqrt(-Math.log(sin * sin + cos * cos));
    return stddev;
}

How to get a rectangle around the target object using the features extracted by SIFT in OpenCV

I'm doing a project in OpenCV on object detection, which consists of matching the object in a template image with the reference image. Using the SIFT algorithm the features get accurately detected and matched, but I want a rectangle around the matched features.
My algorithm uses the KD-Tree Best Bin First technique to get the matches.
If you want a rectangle around the detected object, here is a code example with exactly that. You just need to draw a rectangle around the homography H; see the sketch below.
Hope it helps. Good luck.
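For illustration, a minimal sketch of the usual corner-projection approach (my own code; templ, scene, the keypoint vectors, and goodMatches are placeholder names for what your SIFT matching step produces):
#include <opencv2/opencv.hpp>
#include <vector>

void drawDetectedRect(cv::Mat& scene, const cv::Mat& templ,
                      const std::vector<cv::KeyPoint>& keypointsObj,
                      const std::vector<cv::KeyPoint>& keypointsScene,
                      const std::vector<cv::DMatch>& goodMatches)
{
    // Collect the matched keypoint locations from template and scene.
    std::vector<cv::Point2f> objPts, scenePts;
    for (size_t i = 0; i < goodMatches.size(); i++) {
        objPts.push_back(keypointsObj[goodMatches[i].queryIdx].pt);
        scenePts.push_back(keypointsScene[goodMatches[i].trainIdx].pt);
    }
    cv::Mat H = cv::findHomography(objPts, scenePts, cv::RANSAC);

    // Project the template's corners into the scene and connect them.
    std::vector<cv::Point2f> objCorners = {
        cv::Point2f(0, 0), cv::Point2f((float)templ.cols, 0),
        cv::Point2f((float)templ.cols, (float)templ.rows), cv::Point2f(0, (float)templ.rows)
    };
    std::vector<cv::Point2f> sceneCorners;
    cv::perspectiveTransform(objCorners, sceneCorners, H);
    for (int i = 0; i < 4; i++)
        cv::line(scene, sceneCorners[i], sceneCorners[(i + 1) % 4],
                 cv::Scalar(0, 255, 0), 2);
}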
I use the following code, adapted from the SURF algorithm in OpenCV (modules/features2d/src/surf.cpp), to extract the surroundings of a keypoint.
Unlike other examples based on rectangles and ROIs, this code returns the patch correctly oriented according to the orientation and scale determined by the feature detection algorithm (both available in the KeyPoint struct).
An example of the results of the detection on several different images:
const int PATCH_SZ = 20;

Mat extractKeyPoint(const Mat& image, KeyPoint kp)
{
    int x = (int)kp.pt.x;
    int y = (int)kp.pt.y;
    float size = kp.size;
    float angle = kp.angle;
    int win_size = (int)((PATCH_SZ + 1) * size * 1.2f / 9.0);
    Mat win(win_size, win_size, CV_8UC3);
    float descriptor_dir = angle * (CV_PI / 180);
    float sin_dir = sin(descriptor_dir);
    float cos_dir = cos(descriptor_dir);
    float win_offset = -(float)(win_size - 1) / 2;
    float start_x = x + win_offset * cos_dir + win_offset * sin_dir;
    float start_y = y - win_offset * sin_dir + win_offset * cos_dir;
    uchar* WIN = win.data;
    uchar* IMG = image.data;
    for (int i = 0; i < win_size; i++, start_x += sin_dir, start_y += cos_dir)
    {
        float pixel_x = start_x;
        float pixel_y = start_y;
        for (int j = 0; j < win_size; j++, pixel_x += cos_dir, pixel_y -= sin_dir)
        {
            int x = std::min(std::max(cvRound(pixel_x), 0), image.cols - 1);
            int y = std::min(std::max(cvRound(pixel_y), 0), image.rows - 1);
            for (int c = 0; c < 3; c++) {
                WIN[i * win_size * 3 + j * 3 + c] = IMG[y * image.step1() + x * 3 + c];
            }
        }
    }
    return win;
}
I am not sure if the scale is entirely OK, but it is taken from the SURF source and the results look relevant to me.

OpenCV: Detecting Movement in a tile

I would like to detect movement within the tiles of an N*N grid. I've tried an approach by https://stackoverflow.com/users/724461/andrey-kamaev,
shown in the following code, but the result isn't accurate at all; I would like to find a more accurate approach.
cv::Sobel(input, sobel, CV_32F, 1, 1);
int h = input.rows / NUM_BLOCK_ROWS;
int w = input.cols / NUM_BLOCK_COLUMNS; // block width (note: cols, not rows)
for (int r = 0; r < NUM_BLOCK_ROWS; r++)
    for (int c = 0; c < NUM_BLOCK_COLUMNS; c++)
    {
        cv::Scalar weight = cv::sum(sobel(cv::Range(h * r, (r + 1) * h), cv::Range(c * w, (c + 1) * w)));
        if (weight[0] + weight[1] > 60) {
            input(cv::Range(h * r, (r + 1) * h - 1), cv::Range(c * w, (c + 1) * w - 1)).setTo(cv::Scalar(0, 0, 255));
        }
    }
I used the frame differencing approach and it worked; a minimal sketch of the idea follows below.
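This is my own illustration of frame differencing on the same N*N tiling, not the original poster's code (the grid size and both thresholds are made-up values to tune for your footage):
#include <opencv2/opencv.hpp>

const int NUM_BLOCK_ROWS = 4, NUM_BLOCK_COLUMNS = 4; // grid size, made up

// Call once per frame; prevGray keeps the previous frame between calls.
void detectTileMovement(const cv::Mat& frame, cv::Mat& prevGray)
{
    cv::Mat gray, diff;
    cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
    if (!prevGray.empty()) {
        cv::absdiff(gray, prevGray, diff);                     // per-pixel change
        cv::threshold(diff, diff, 25, 255, cv::THRESH_BINARY); // drop sensor noise
        int h = diff.rows / NUM_BLOCK_ROWS;
        int w = diff.cols / NUM_BLOCK_COLUMNS;
        for (int r = 0; r < NUM_BLOCK_ROWS; r++)
            for (int c = 0; c < NUM_BLOCK_COLUMNS; c++) {
                cv::Mat tile = diff(cv::Rect(c * w, r * h, w, h));
                // flag the tile if more than 5% of its pixels changed
                if (cv::countNonZero(tile) > w * h / 20) {
                    // movement detected in tile (r, c)
                }
            }
    }
    gray.copyTo(prevGray);
}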
What about optical flow? OpenCV's implementation is here.
