I would like to detect movement in each tile of an N*N grid. I tried the approach shown in the following code, but the result isn't accurate at all, and I would like a more accurate approach.
cv::Sobel(input, sobel, CV_32F, 1, 1);
int h = input.rows / NUM_BLOCK_ROWS;
int w = input.cols / NUM_BLOCK_COLUMNS; // block width comes from the column count
float pos=0;
for (int r = 0; r<NUM_BLOCK_ROWS; r++) {
    for(int c=0; c<NUM_BLOCK_COLUMNS; c++) {
        // Sum the Sobel response inside block (r, c)
        cv::Scalar weight = cv::sum(sobel(cv::Range(h*r, (r+1)*h), cv::Range(c*w, (c+1)*w)));
        if (weight[0] + weight[1] > 60) {
            // Paint the block to mark it as containing movement
            input(cv::Range(h*r, (r+1)*h-1), cv::Range(c*w, (c+1)*w-1)).setTo(cv::Scalar(0,0,255));
        }
    }
}

I used a frame differencing approach and it worked.
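For anyone landing here, this is a minimal sketch of what per-tile frame differencing could look like. It is not the code I actually ran: it works on plain grayscale buffers instead of cv::Mat so that it stands alone, and the function name `movedTiles`, the row-major layout, and the threshold are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper: returns one flag per tile (row-major order). A tile is
// flagged when the mean absolute difference between the two grayscale frames
// inside that tile exceeds `threshold`. Frames are row-major, rows * cols bytes.
std::vector<bool> movedTiles(const std::vector<std::uint8_t>& prev,
                             const std::vector<std::uint8_t>& curr,
                             int rows, int cols,
                             int blockRows, int blockCols,
                             double threshold)
{
    const int h = rows / blockRows;  // tile height comes from the row count
    const int w = cols / blockCols;  // tile width comes from the column count
    std::vector<bool> moved(static_cast<std::size_t>(blockRows) * blockCols, false);
    for (int r = 0; r < blockRows; ++r) {
        for (int c = 0; c < blockCols; ++c) {
            long sum = 0;
            for (int y = r * h; y < (r + 1) * h; ++y) {
                for (int x = c * w; x < (c + 1) * w; ++x) {
                    const std::size_t i = static_cast<std::size_t>(y) * cols + x;
                    sum += std::abs(static_cast<int>(curr[i]) - static_cast<int>(prev[i]));
                }
            }
            const double mean = static_cast<double>(sum) / (h * w);
            moved[static_cast<std::size_t>(r) * blockCols + c] = mean > threshold;
        }
    }
    return moved;
}
```

Using the mean rather than the raw sum keeps the threshold independent of the tile size, which makes it easier to tune when N changes.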

What about optical flow? OpenCV's implementation is here


Opencv, calculate distance at pixel from depth mat

I have a disparity image created with a calibrated stereo camera pair and OpenCV. It looks good, and my calibration data is good.
I need to calculate the real-world distance at a pixel.
From other questions on Stack Overflow, I see that the approach is:
depth = baseline * focal / disparity
Using the function:
setMouseCallback("disparity", onMouse, &disp);
static void onMouse(int event, int x, int y, int flags, void* param)
cv::Mat &xyz = *((cv::Mat*)param); //cast and deref the param
if (event == cv::EVENT_LBUTTONDOWN)
unsigned int val = xyz.at<uchar>(y, x);
double depth = (<float>(0, 0)*<float>(0, 0)) / val;
cout << "x= " << x << " y= " << y << " val= " << val << " distance: " << depth<< endl;
I click on a point that I have measured to be 3 meters away from the stereo camera.
What I get is:
val= 31 distance: 0.590693
The depth mat values are between 0 and 255, the depth mat is of type 0, or CV_8UC1.
The stereo baseline is 0.0643654 (in meters).
The focal length is 284.493
I have also tried:
(from OpenCV - compute real distance from disparity map)
float fMaxDistance = static_cast<float>((1. /<float>(0, 0) *<float>(0, 0)));
//outputDisparityValue is single 16-bit value from disparityMap
float fDisparity = val / (float)cv::StereoMatcher::DISP_SCALE;
float fDistance = fMaxDistance / fDisparity;
which gives me a distance (closer to the truth, if we assume mm units) of val= 31 distance: 2281.27
But it is still incorrect.
Which of these approaches is correct? And where am I going wrong?
Left, Right, Depth map. (EDIT: this depth map is from a different pair of images)
EDIT: Based on an answer, I am trying this:
std::vector pointcloud;
float fx = 284.492615;
float fy = 285.683197;
float cx = 424;// 425.807709;
float cy = 400;// 395.494293;
cv::Mat Q = cv::Mat(4, 4, CV_32F);
Q.at<float>(0, 0) = 1.0; Q.at<float>(0, 1) = 0.0; Q.at<float>(0, 2) = 0.0; Q.at<float>(0, 3) = -cx; //cx
Q.at<float>(1, 0) = 0.0; Q.at<float>(1, 1) = 1.0; Q.at<float>(1, 2) = 0.0; Q.at<float>(1, 3) = -cy; //cy
Q.at<float>(2, 0) = 0.0; Q.at<float>(2, 1) = 0.0; Q.at<float>(2, 2) = 0.0; Q.at<float>(2, 3) = -fx; //Focal
Q.at<float>(3, 0) = 0.0; Q.at<float>(3, 1) = 0.0; Q.at<float>(3, 2) = -1.0 / 6; //1.0/BaseLine
Q.at<float>(3, 3) = 0.0; //cx - cx'
cv::Mat XYZcv(depth_image.size(), CV_32FC3);
reprojectImageTo3D(depth_image, XYZcv, Q, false, CV_32F);
for (int y = 0; y < XYZcv.rows; y++)
for (int x = 0; x < XYZcv.cols; x++)
cv::Point3f pointOcv = XYZcv.at<cv::Point3f>(y, x);
Eigen::Vector4d pointEigen(0, 0, 0,<uchar>(y, x) / 255.0);
pointEigen[0] = pointOcv.x;
pointEigen[1] = pointOcv.y;
pointEigen[2] = pointOcv.z;
And that gives me a cloud.
I would recommend using OpenCV's reprojectImageTo3D to reconstruct the distance from the disparity. Note that when using this function you do have to divide the output of StereoSGBM by 16. You should already have all the parameters f, cx, cy, Tx. Take care to give f and Tx in the same units. cx, cy are in pixels.
Since the problem is that you need the Q matrix, I think that this link or this one should help you build it. If you don't want to use reprojectImageTo3D, I strongly recommend the first link!
I hope this helps!
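To make the role of Q and the division by 16 concrete, here is a rough per-pixel sketch of the math that reprojectImageTo3D is documented to perform: [X Y Z W]^T = Q * [u v d 1]^T, followed by a division by W. The helper name `reprojectPixel` and the `Point3` struct are made up for illustration, and the sketch assumes the raw disparity is the 16-bit fixed-point output of StereoSGBM.

```cpp
struct Point3 { double x, y, z; };

// Hypothetical helper: reprojects one pixel (u, v) with a raw fixed-point
// disparity through a 4x4 Q matrix. A zero disparity gives W = 0, so the
// result is not finite; real code should skip such pixels.
Point3 reprojectPixel(const double Q[4][4], double u, double v, int rawDisparity)
{
    const double d = rawDisparity / 16.0;  // fixed-point value -> disparity in pixels
    const double in[4] = {u, v, d, 1.0};
    double out[4] = {0.0, 0.0, 0.0, 0.0};
    for (int r = 0; r < 4; ++r) {
        for (int c = 0; c < 4; ++c) {
            out[r] += Q[r][c] * in[c];
        }
    }
    // Homogeneous divide: (X/W, Y/W, Z/W)
    return Point3{out[0] / out[3], out[1] / out[3], out[2] / out[3]};
}
```

With a Q whose rows are (1, 0, 0, -cx), (0, 1, 0, -cy), (0, 0, 0, f), (0, 0, -1/Tx, 0), the Z component reduces to f * |Tx| / d, so Z comes out in whatever unit Tx is expressed in.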
To find the point-based depth of an object from the camera, use the following formula:
Depth = (Baseline x Focal length) / disparity
I hope you are using it correctly as per your question.
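As a sanity check, the formula can be run with the numbers from the question. This is only a sketch; the function names are mine, and it assumes the focal length and the disparity are both in pixels and the baseline is in meters.

```cpp
// Depth from disparity: Z = B * f / d.
// baselineMeters in meters, focalPixels and disparityPixels in pixels.
double depthFromDisparity(double baselineMeters, double focalPixels, double disparityPixels)
{
    return baselineMeters * focalPixels / disparityPixels;
}

// The inverse: which disparity (in pixels) a point at a given depth should produce.
double disparityForDepth(double baselineMeters, double focalPixels, double depthMeters)
{
    return baselineMeters * focalPixels / depthMeters;
}
```

With a baseline of 0.0643654 and a focal length of 284.493, a disparity of 31 gives about 0.59, which is the value the question reports. Going the other way, a point 3 meters away should produce a disparity of roughly 6.1 pixels. That suggests the 8-bit value 31 may not be the true pixel disparity (for example, if the map was rescaled for display), but that is a guess on my part.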
Try the Nerian calculator below for the theoretical error.
Also, use sub-pixel interpolation in your code.
Make sure the object whose depth you are measuring has good texture.
The most common problems with depth maps are:
Untextured surfaces (plain objects)
Bad calibration results
What is the RMS value for your calibration, camera resolution, and lens type (focal length)? This is important for providing much better data to your program.

opencv: how to clusterize by angle using kmeans()

The question is how to cluster pairs of some units by their angle. The problem is that kmeans operates on Euclidean distance and does not know about the periodic nature of angles. To make it work, one needs to translate the angle into Euclidean space while keeping the following true:
1. close angles are close values in Euclidean space;
2. far angles are far apart in Euclidean space.
This means that 90 and -90 are distant values, 180 and -180 are the same, and 170 and -170 are close (angles run from the left, up, and to the right: 0 to +180, and from the left, down, and to the right: 0 to -180).
I tried various sin() functions, but they all have the issues mentioned in points 1 and 2. The most promising one is sin(x * 0.5f), but it also has the problem that 180 and -180 are distant values in Euclidean space.
The solution I found is to translate the angles to points on a circle and feed those into kmeans. This way kmeans compares distances between points, and this works perfectly.
One important thing to mention: the kmeans #eps in the termination criterion is expressed in the units of the samples that you feed to kmeans. In our example the most distant points are 200 units apart (2 * radius), so an eps of 1.0f is totally fine. If you normalize your samples with cv::normalize(samples, samples, 0.0f, 1.0f) before calling kmeans(), adjust your #eps accordingly. Something like eps=0.01f works better there.
Enjoy! Hope this helps someone.
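Before the full listing, here is a small standalone check of why the circle mapping satisfies both requirements. It does not use OpenCV; `Point2`, `angleToPoint`, and `chordDistance` are names invented for this sketch, and the circle is centered at the origin.

```cpp
#include <cmath>

struct Point2 { double x, y; };

// Maps an angle in degrees to a point on a circle centered at the origin.
Point2 angleToPoint(double angleDeg, double radius)
{
    const double kPi = 3.14159265358979323846;
    const double rad = angleDeg * kPi / 180.0;
    return Point2{radius * std::cos(rad), radius * std::sin(rad)};
}

// Euclidean distance between the circle points of two angles. This is the
// quantity kmeans effectively compares once angles are fed in as points.
double chordDistance(double aDeg, double bDeg, double radius)
{
    const Point2 a = angleToPoint(aDeg, radius);
    const Point2 b = angleToPoint(bDeg, radius);
    return std::hypot(a.x - b.x, a.y - b.y);
}
```

With a radius of 100, 180 and -180 land on practically the same point, 170 and -170 are about 34.7 units apart, and 90 and -90 are the full 200 units apart, which is the ordering the two requirements ask for.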
static cv::Point2f angleToPointOnCircle(float angle, float radius, cv::Point2f origin /* center */)
{
    float x = radius * cosf(angle * M_PI / 180.0f) + origin.x;
    float y = radius * sinf(angle * M_PI / 180.0f) + origin.y;
    return cv::Point2f(x, y);
}
static std::vector<std::pair<size_t, int> > biggestKmeansGroup(const std::vector<int> &labels, int count)
{
    std::vector<std::pair<size_t, int> > indices;
    std::map<int, size_t> l2cm; // label -> number of members
    for (int i = 0; i < labels.size(); ++i)
        l2cm[labels[i]]++;
    std::vector<std::pair<size_t, int> > c2lm;
    for (std::map<int, size_t>::iterator it = l2cm.begin(); it != l2cm.end(); it++)
        c2lm.push_back(std::make_pair(it->second, it->first)); // count, group
    std::sort(c2lm.begin(), c2lm.end(), cmp_pair_first_reverse);
    for (int i = 0; i < c2lm.size() && count-- > 0; i++)
        indices.push_back(c2lm[i]);
    return indices;
}
static void sortByAngle(std::vector<boost::shared_ptr<Pair> > &group,
std::vector<boost::shared_ptr<Pair> > &result)
std::vector<int> labels;
cv::Mat samples;
/* Radius is not so important here. */
for (int i = 0; i < group.size(); i++)
samples.push_back(angleToPointOnCircle(group[i]->angle, 100, cv::Point2f(0, 0)));
/* 90 degrees per group. May be less if you need it. */
static int PAIR_MAX_FINE_GROUPS = 4;
int groupNr = std::max(std::min((int)group.size(), PAIR_MAX_FINE_GROUPS), 1);
assert(group.size() >= groupNr);
cv::kmeans(samples.reshape(1, (int)group.size()), groupNr, labels,
cvTermCriteria(CV_TERMCRIT_EPS/* | CV_TERMCRIT_ITER*/, 30, 1.0f),
std::vector<std::pair<size_t, int> > biggest = biggestKmeansGroup(labels, groupNr);
for (int g = 0; g < biggest.size(); g++) {
for (int i = 0; i < group.size(); i++) {
if (labels[i] == biggest[g].second)

Detect basket ball Hoops and ball tracking

Detect the hoop (basket). To see the samples of "hoop".
Count the number of successful attempts (shots) and failed attempts.
I am using OpenCV.
The camera position will be static.
The videos are in portrait mode, from any mobile device.
What have I tried:
I am able to track the basketball. Still, I am seeking a better solution.
My code:
int main () {
VideoCapture vid(path);
if (!vid.isOpened())
int i_frame_height = vid.get(CV_CAP_PROP_FRAME_HEIGHT);
i_height_basketball = i_height_basketball * I_HEIGHT / i_frame_height;
int fps = vid.get(CV_CAP_PROP_FPS);
Mat mat_black(640, 480, CV_8UC3, Scalar(0, 0, 0));
vector <Mat> vec_frames;
for (int i_push = 0; i_push < I_NO_FRAMES_STORE; i_push++)
vector <Mat> vec_mat_result;
for (int i_push = 0; i_push < I_RESULT_STORE; i_push++)
int count_frame = 0;
while (true) {
int clk_start = clock();
Mat image, result;
vid >> image;
if (image.empty())
resize(image, image, Size(I_WIDTH, I_HEIGHT));
image.copyTo(vec_mat_result[count_frame % I_RESULT_STORE]);
if (count_frame >= 1)
vec_mat_result[(count_frame - 1) % I_RESULT_STORE].copyTo(result);
GaussianBlur(image, image, Size(9, 9), 2, 2);
image.copyTo(vec_frames[count_frame % I_NO_FRAMES_STORE]);
if (count_frame >= I_NO_FRAMES_STORE - 1) {
Mat mat_diff_temp(I_HEIGHT, I_WIDTH, CV_32S, Scalar(0));
for (int i_diff = 0; i_diff < I_NO_FRAMES_STORE; i_diff++) {
Mat mat_rgb_diff_temp = abs(vec_frames[ (count_frame - 1) % I_NO_FRAMES_STORE ] - vec_frames[ (count_frame - i_diff) % I_NO_FRAMES_STORE ]);
cvtColor(mat_rgb_diff_temp, mat_rgb_diff_temp, CV_BGR2GRAY);
mat_rgb_diff_temp = mat_rgb_diff_temp > I_THRESHOLD;
mat_rgb_diff_temp.convertTo(mat_rgb_diff_temp, CV_32S);
mat_diff_temp = mat_diff_temp + mat_rgb_diff_temp;
mat_diff_temp = mat_diff_temp > I_THRESHOLD_2;
// mat_diff_temp.convertTo(mat_diff_temp, CV_8U);
Mat mat_roi = mat_diff_temp.rowRange(0, i_height_basketball);
// imshow("ROI", mat_roi);
Moments mm = cv::moments(mat_roi, true);
Point p_center = Point(mm.m10 / mm.m00, mm.m01 / mm.m00);
circle(result, p_center, 3, CV_RGB(0, 255, 0), -1);
line(result, Point(0, i_height_basketball), Point(result.cols, i_height_basketball), Scalar(225, 0, 0), 1);
count_frame = count_frame + 1;
int clk_processing_time = (clock() - clk_start);
if (count_frame > 1)
imshow("image", result);
// waitKey(0);
int delay = (1000 / fps) - clk_processing_time;
if (delay <= 0)
delay = 2;
if (waitKey(delay) >= 27)
return 0;
How do I detect the hoop? I thought of using square detection to detect the square regions around the hoop.
What is the best way of counting the successful shots? Or how should I count them?
I have what I suspect will be a fairly strong baseline: once the ball has commenced its downward arc, if the ball demonstrates significant upward movement again, it's a miss. Otherwise, it's a basket. This won't catch airballs, but I suspect they're relatively few anyway.
I think you could get a whole lot of mileage out of learning the ball trajectory of a successful shot and not worry too much about the hoop. Furthermore, didn't you say the camera was fixed-position? Doesn't that mean the hoop's always in the same place, and so you could just specify its location?
If you absolutely did have to find the hoop, I'd look for an object (sub-region of the image) of about the same size as the ball (which you say you can track) that's orange. More generally, you could learn a classifier for the hoop based on the training images you linked to, and apply it at a mixture of locations and scales, searching for the best match. You should know its approximate location, i.e. that it's in the upper portion of the image and likely to be to one side or the other. Then you could use proximity features to this identified region in addition to trajectory features to build a classifier for whether the shot succeeded or not.
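The baseline rule from the start of this answer (a miss if the ball rises again after it has started to descend) could be sketched roughly like this. It assumes you already have the ball's center y coordinate per frame in image coordinates, where y grows downward. The names `classifyShot` and `minRise` are made up, and `minRise` is a threshold in pixels you would have to tune.

```cpp
#include <cstddef>
#include <vector>

enum class ShotResult { Basket, Miss };

// ys: ball center y per frame, image coordinates (larger y = lower in the image).
// minRise: hypothetical threshold for "significant" upward movement, in pixels.
ShotResult classifyShot(const std::vector<double>& ys, double minRise)
{
    bool descending = false;
    double lowest = 0.0;  // largest y seen since the descent began
    for (std::size_t i = 1; i < ys.size(); ++i) {
        if (!descending) {
            // The descent starts the first time y increases between frames.
            if (ys[i] > ys[i - 1]) {
                descending = true;
                lowest = ys[i];
            }
        } else if (ys[i] > lowest) {
            lowest = ys[i];  // still falling
        } else if (lowest - ys[i] >= minRise) {
            return ShotResult::Miss;  // bounced back up by at least minRise
        }
    }
    return ShotResult::Basket;
}
```

As noted above, this treats an airball as a basket, since the ball never moves upward again in that case.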

How to get a rectangle around the target object using the features extracted by SIFT in OpenCV

I'm doing a project in OpenCV on object detection, which consists of matching the object in a template image with the reference image. Using the SIFT algorithm, the features get accurately detected and matched, but I want a rectangle around the matched features.
My algorithm uses the KD-Tree Best-Bin-First technique to get the matches.
If you want a rectangle around the detected object, here you have a code example that does exactly that. You just need to draw the rectangle obtained by projecting the object's corners through the homography H.
Hope it helps. Good luck.
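In case the linked example is unavailable, here is a rough sketch of the geometric step: push the four corners of the template image through H and connect the results. It is plain C++ so it stands alone; `Pt`, `applyHomography`, and `projectTemplateCorners` are illustrative names, and H is assumed to be a row-major 3x3 matrix mapping template coordinates to scene coordinates. In OpenCV, cv::perspectiveTransform performs the same mapping.

```cpp
#include <array>

struct Pt { double x, y; };

// Applies a 3x3 homography (row-major) to a point, including the
// perspective divide by the third homogeneous coordinate.
Pt applyHomography(const double H[3][3], Pt p)
{
    const double w = H[2][0] * p.x + H[2][1] * p.y + H[2][2];
    return Pt{(H[0][0] * p.x + H[0][1] * p.y + H[0][2]) / w,
              (H[1][0] * p.x + H[1][1] * p.y + H[1][2]) / w};
}

// Projects the template corners in the order top-left, top-right,
// bottom-right, bottom-left. Drawing lines between consecutive corners
// outlines the detected object in the scene image.
std::array<Pt, 4> projectTemplateCorners(const double H[3][3], double width, double height)
{
    return {{applyHomography(H, Pt{0.0, 0.0}),
             applyHomography(H, Pt{width, 0.0}),
             applyHomography(H, Pt{width, height}),
             applyHomography(H, Pt{0.0, height})}};
}
```

Note that the result is a general quadrilateral. If you need an axis-aligned rectangle, take the minimum and maximum x and y of the four projected corners.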
I use the following code, adapted from the SURF algorithm in OpenCV (modules/features2d/src/surf.cpp), to extract the surroundings of a keypoint.
Unlike other examples based on rectangles and ROIs, this code returns the patch correctly oriented according to the orientation and scale determined by the feature detection algorithm (both available in the KeyPoint struct).
An example of the results of the detection on several different images:
const int PATCH_SZ = 20;
Mat extractKeyPoint(const Mat& image, KeyPoint kp)
{
    int x = (int)kp.pt.x;
    int y = (int)kp.pt.y;
    float size = kp.size;
    float angle = kp.angle;
    // Side length of the patch, scaled with the keypoint size
    int win_size = (int)((PATCH_SZ+1)*size*1.2f/9.0);
    Mat win(win_size, win_size, CV_8UC3);
    float descriptor_dir = angle * (CV_PI/180);
    float sin_dir = sin(descriptor_dir);
    float cos_dir = cos(descriptor_dir);
    float win_offset = -(float)(win_size-1)/2;
    float start_x = x + win_offset*cos_dir + win_offset*sin_dir;
    float start_y = y - win_offset*sin_dir + win_offset*cos_dir;
    uchar* WIN = win.data;
    uchar* IMG = image.data;
    // Walk the rotated window row by row and copy the nearest source pixel
    for( int i = 0; i < win_size; i++, start_x += sin_dir, start_y += cos_dir )
    {
        float pixel_x = start_x;
        float pixel_y = start_y;
        for( int j = 0; j < win_size; j++, pixel_x += cos_dir, pixel_y -= sin_dir )
        {
            // Clamp to the image borders
            int x = std::min(std::max(cvRound(pixel_x), 0), image.cols-1);
            int y = std::min(std::max(cvRound(pixel_y), 0), image.rows-1);
            for (int c=0; c<3; c++) {
                WIN[i*win_size*3 + j*3 + c] = IMG[y*image.step1() + x*3 + c];
            }
        }
    }
    return win;
}
I am not sure if the scale is entirely OK, but it is taken from the SURF source and the results look relevant to me.

OpenCV displaying a 2-channel image (optical flow)

I have optical flow stored in a 2-channel 32F matrix. I want to visualize the contents, what's the easiest way to do this?
How do I convert a CV_32FC2 to RGB with an empty blue channel, something imshow can handle? I am using OpenCV 2 C++ API.
Super Bonus Points
Ideally I would get the angle of flow in hue and the magnitude in brightness (with saturation at a constant 100%).
imshow can handle only 1-channel grayscale and 3- or 4-channel BGR/BGRA images, so you need to do the conversion yourself.
I think you can do something similar to:
//extract x and y channels
cv::Mat xy[2]; //X,Y
cv::split(flow, xy);
//calculate angle and magnitude
cv::Mat magnitude, angle;
cv::cartToPolar(xy[0], xy[1], magnitude, angle, true);
//translate magnitude to range [0;1]
double mag_max;
cv::minMaxLoc(magnitude, 0, &mag_max);
magnitude.convertTo(magnitude, -1, 1.0 / mag_max);
//build hsv image
cv::Mat _hsv[3], hsv;
_hsv[0] = angle;
_hsv[1] = cv::Mat::ones(angle.size(), CV_32F);
_hsv[2] = magnitude;
cv::merge(_hsv, 3, hsv);
//convert to BGR and show
cv::Mat bgr;//CV_32FC3 matrix
cv::cvtColor(hsv, bgr, cv::COLOR_HSV2BGR);
cv::imshow("optical flow", bgr);
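To show what that mapping does to a single flow vector, here is a standalone sketch without OpenCV: hue is the flow angle in degrees, saturation is fixed at 1, and value is the magnitude normalized by the maximum. The names `Bgr` and `flowToBgr` are invented for this example, and the HSV to RGB step is the textbook formula, not a copy of OpenCV's implementation.

```cpp
#include <cmath>

struct Bgr { double b, g, r; };  // channels in [0, 1]

// Maps one flow vector (fx, fy) to a color: angle -> hue, magnitude -> value.
Bgr flowToBgr(double fx, double fy, double maxMagnitude)
{
    const double kPi = 3.14159265358979323846;
    double hue = std::atan2(fy, fx) * 180.0 / kPi;  // (-180, 180]
    if (hue < 0.0) hue += 360.0;                    // shift into [0, 360]
    const double mag = std::sqrt(fx * fx + fy * fy);
    const double v = maxMagnitude > 0.0 ? std::fmin(mag / maxMagnitude, 1.0) : 0.0;

    // HSV -> RGB with S = 1: the chroma equals v and the smallest channel is 0.
    const double c = v;
    const double hp = hue / 60.0;
    const double x = c * (1.0 - std::fabs(std::fmod(hp, 2.0) - 1.0));
    double r = 0.0, g = 0.0, b = 0.0;
    if (hp < 1.0)      { r = c; g = x; }
    else if (hp < 2.0) { r = x; g = c; }
    else if (hp < 3.0) { g = c; b = x; }
    else if (hp < 4.0) { g = x; b = c; }
    else if (hp < 5.0) { r = x; b = c; }
    else               { r = c; b = x; }
    return Bgr{b, g, r};
}
```

A vector pointing along +x at full magnitude comes out pure red, and a zero vector comes out black, so still regions stay dark and fast-moving regions are bright.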
The MPI Sintel Dataset provides C and MATLAB code for visualizing computed flow. Download the ground truth optical flow of the training set from here. The archive contains a folder flow_code containing the mentioned source code.
You can port the code to OpenCV, however, I wrote a simple OpenCV wrapper to easily use the provided code. Note that the method MotionToColor is taken from the color_flow.cpp file. Note the comments in the listing below.
// Important to include this before flowIO.h!
#include "imageLib.h"
#include "flowIO.h"
#include "colorcode.h"
// I moved the MotionToColor method in a separate header file.
#include "motiontocolor.h"
cv::Mat flow;
// Compute optical flow (e.g. using OpenCV); result should be
// 2-channel float matrix.
assert(flow.channels() == 2);
// assert(flow.type() == CV_32F);
int rows = flow.rows;
int cols = flow.cols;
CFloatImage cFlow(cols, rows, 2);
// Convert flow to CFLoatImage:
for (int i = 0; i < rows; i++) {
    for (int j = 0; j < cols; j++) {
        cFlow.Pixel(j, i, 0) = flow.at<cv::Vec2f>(i, j)[0];
        cFlow.Pixel(j, i, 1) = flow.at<cv::Vec2f>(i, j)[1];
    }
}
CByteImage cImage;
MotionToColor(cFlow, cImage, max);
cv::Mat image(rows, cols, CV_8UC3, cv::Scalar(0, 0, 0));
// Compute back to cv::Mat with 3 channels in BGR:
for (int i = 0; i < rows; i++) {
    for (int j = 0; j < cols; j++) {
        image.at<cv::Vec3b>(i, j)[0] = cImage.Pixel(j, i, 0);
        image.at<cv::Vec3b>(i, j)[1] = cImage.Pixel(j, i, 1);
        image.at<cv::Vec3b>(i, j)[2] = cImage.Pixel(j, i, 2);
    }
}
// Display or output the image ...
Below is the result when using the Optical Flow code and example images provided by Ce Liu.
