Implementing Hough Transform for Lines - opencv

I am trying to implement the Hough transform for line detection on an already pre-processed image.
My input image is a black-and-white edge image: 0 is background and 255 is foreground. I do not wish to use OpenCV's built-in HoughLines function.
I am stuck on creating the accumulator and incrementing its values properly. I can't figure out where I went wrong, so here is my code block:
int diagonal = sqrt(height * height + width * width);
IplImage *acc = cvCreateImage (cvSize(180, 2 * diagonal),IPL_DEPTH_8U, 1);
unsigned char* accData = (unsigned char *)acc->imageData;
for (int i=0; i<height; i++)
{
for (int j=0; j<step; j++)
{
if (data[i*step + j] > 200)
{
for (int theta=0; theta<180; theta++)
{
int p = j * cos(theta) + i * sin(theta);
if (p > 0)
accData[theta*180 + p] += 1;
}
}
}
}
The output image that I get in acc is not what it should look like. I am not getting any sinusoids, only white patches here and there. Can anyone provide feedback about where I went wrong?

What I see is that you pass degree values to sin and cos, which expect radians. You could change it as follows:
int p = j * cos((double)theta*PI/180) + i * sin((double)theta*PI/180);
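Beyond the radians fix, two other things in the question's loop look suspicious to me: the accumulator is indexed as theta*180 + p even though the image was created 180 wide (theta) and 2 * diagonal high (rho), and an 8-bit accumulator overflows after 255 votes. The following is a minimal sketch without OpenCV types, assuming a row-major edge buffer; the names HoughAccumulator and buildAccumulator are hypothetical and only serve the illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for the vote table: one row per rho, one column per degree.
struct HoughAccumulator {
    int numTheta;            // number of angle bins (1 degree each)
    int numRho;              // number of rho bins
    int rhoOffset;           // added to rho so that negative values get a valid row
    std::vector<int> votes;  // votes[(rho + rhoOffset) * numTheta + theta]
};

// edges is a row-major width*height buffer; any non-zero byte counts as an edge pixel.
HoughAccumulator buildAccumulator(const std::vector<unsigned char>& edges,
                                  int width, int height)
{
    const double kPi = 3.14159265358979323846;
    const int diagonal = static_cast<int>(
        std::ceil(std::sqrt(double(width) * width + double(height) * height)));

    HoughAccumulator acc;
    acc.numTheta = 180;
    acc.rhoOffset = diagonal;
    acc.numRho = 2 * diagonal + 1;
    acc.votes.assign(static_cast<std::size_t>(acc.numRho) * acc.numTheta, 0);

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (edges[static_cast<std::size_t>(y) * width + x] == 0)
                continue;
            for (int theta = 0; theta < acc.numTheta; ++theta) {
                const double rad = theta * kPi / 180.0;  // degrees to radians
                const int rho = static_cast<int>(
                    std::lround(x * std::cos(rad) + y * std::sin(rad)));
                // int votes cannot wrap at 255 the way an 8-bit image does
                ++acc.votes[static_cast<std::size_t>(rho + acc.rhoOffset) * acc.numTheta + theta];
            }
        }
    }
    return acc;
}
```

For display, the int votes would still have to be scaled down to 8 bits afterwards; the offset keeps negative rho values instead of discarding them with the p > 0 test.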

Related

How to implement Sobel operator

I have implemented Sobel operator in vertical direction. But the result which I am getting is very poor. I have attached my code below.
int mask_size= 3;
char mask [3][3]= {{-1,0,1},{-2,0,2},{-1,0,1}};
void sobel(Mat input_image)
{
/**Padding m-1 and n-1 zeroes to the result where m and n are mask_size**/
Mat result=Mat::zeros(input_image.rows+(mask_size - 1) * 2,input_image.cols+(mask_size - 1) * 2,CV_8UC1);
Mat result1=Mat::zeros(result.rows,result.cols,CV_8UC1);
int sum= 0;
/*For loop for copying original values to new padded image **/
for(int i=0;i<input_image.rows;i++)
for(int j=0;j<input_image.cols;j++)
result.at<uchar>(i+(mask_size-1),j+(mask_size-1))=input_image.at<uchar>(i,j);
GaussianBlur( result, result, Size(5,5), 0, 0, BORDER_DEFAULT );
/**For loop to implement the convolution **/
for(int i=0;i<result.rows-(mask_size - 1);i++)
for(int j=0;j<result.cols-(mask_size - 1);j++)
{
int counter=0;
int counterX=0,counterY=0;
sum= 0;
for(int k= i ; k < i + mask_size ; k++)
{
for(int l= j ; l< j + mask_size ; l++)
{
sum+=result.at<uchar>(k,l) * mask[counterX][counterY];
counterY++;
}
counterY=0;
counterX++;
}
result1.at<uchar>(i+mask_size/2,j+mask_size/2)=sum/(mask_size * mask_size);
}
/** Truncating all the extras rows and columns **/
result=Mat::zeros( result1.rows - (mask_size - 1) * 2, result1.cols - (mask_size - 1) * 2,CV_8UC1);
for(int i=0;i<result.rows;i++)
for(int j=0;j<result.cols;j++)
result.at<uchar>(i,j)=result1.at<uchar>(i+(mask_size - 1),j+(mask_size - 1));
imshow("Input",result);
imwrite("output2.tif",result);
}
My input to the algorithm is
My output is
I have also tried using Gaussian blur before actually convolving an image and the output I got is
The output which I am expecting is
The guide I am using is: https://www.tutorialspoint.com/dip/sobel_operator.htm
Your convolution looks ok although I only had a quick look.
Check your output type. It's unsigned char.
Now think about the values your output pixels may have if you have negative kernel values and if it is a good idea to store them in uchar directly.
If you store -1 in an unsigned char, it wraps around and your output is 255. That is where all the excess white is coming from: those pixels are actually small negative gradients.
The desired result looks like the absolute of the Sobel output values.
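As a minimal sketch of that idea, the signed filter response can be turned into a displayable byte by taking its absolute value and clamping it to 255 before it is stored. The helper name gradientToByte is hypothetical; inside OpenCV code, cv::saturate_cast<uchar> or cv::convertScaleAbs would probably be the more usual tools for this.

```cpp
#include <algorithm>
#include <cstdlib>

// Hypothetical helper: maps a signed filter response to an 8-bit value.
// A plain conversion to unsigned char would wrap modulo 256 (so -1 becomes 255);
// here the sign is dropped and the magnitude is clamped instead.
inline unsigned char gradientToByte(int response)
{
    const int magnitude = std::abs(response);
    return static_cast<unsigned char>(std::min(magnitude, 255));
}
```

In the question's loop this would replace the direct assignment of sum/(mask_size * mask_size) to result1.at<uchar>(...), with sum kept as an int until the conversion.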

How to put B, G and R component value straight into a pixel of cv::Mat? [duplicate]

I have searched internet and stackoverflow thoroughly, but I haven't found answer to my question:
How can I both get and set the RGB value of a given pixel (specified by x,y coordinates) in OpenCV? Importantly, I'm writing in C++ and the image is stored in a cv::Mat variable. I know there is an IplImage() operator, but IplImage is not very comfortable to use; as far as I know it comes from the C API.
Yes, I'm aware that there was already this Pixel access in OpenCV 2.2 thread, but it was only about black and white bitmaps.
EDIT:
Thank you very much for all your answers. I see there are many ways to get/set the RGB value of a pixel. I got one more idea from my close friend (thanks, Benny!). It's very simple and effective. I think it's a matter of taste which one you choose.
Mat image;
(...)
Point3_<uchar>* p = image.ptr<Point3_<uchar> >(y,x);
And then you can read/write RGB values with:
p->x //B
p->y //G
p->z //R
Try the following:
cv::Mat image = ...do some stuff...;
image.at<cv::Vec3b>(y,x); // gives you the RGB (it might be ordered as BGR) vector of type cv::Vec3b
image.at<cv::Vec3b>(y,x)[0] = newval[0];
image.at<cv::Vec3b>(y,x)[1] = newval[1];
image.at<cv::Vec3b>(y,x)[2] = newval[2];
The low-level way would be to access the matrix data directly. In an RGB image (which I believe OpenCV typically stores as BGR), and assuming your cv::Mat variable is called frame, you could get the blue value at location (x, y) (from the top left) this way:
frame.data[frame.channels()*(frame.cols*y + x)];
Likewise, to get B, G, and R:
uchar b = frame.data[frame.channels()*(frame.cols*y + x) + 0];
uchar g = frame.data[frame.channels()*(frame.cols*y + x) + 1];
uchar r = frame.data[frame.channels()*(frame.cols*y + x) + 2];
Note that this code assumes the stride is equal to the width of the image.
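If the rows are padded, or the Mat is a region of interest inside a larger image, the row stride differs from cols * channels. A hedged sketch of the same address computation with an explicit stride follows; the helper name pixelOffset is hypothetical, and it assumes 8-bit interleaved channels.

```cpp
#include <cstddef>

// Hypothetical helper: byte offset of channel c of pixel (x, y) in an
// interleaved 8-bit image whose rows are rowStepBytes apart in memory.
inline std::size_t pixelOffset(std::size_t rowStepBytes, int channels,
                               int x, int y, int c)
{
    return static_cast<std::size_t>(y) * rowStepBytes
         + static_cast<std::size_t>(x) * channels
         + static_cast<std::size_t>(c);
}
```

With a cv::Mat named frame, the stride should be available as frame.step (bytes per row), so the blue value would be read as frame.data[pixelOffset(frame.step, frame.channels(), x, y, 0)].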
A piece of code is easier for people who have this problem, so I am sharing mine; you can use it directly. Please note that OpenCV stores pixels as BGR.
cv::Mat vImage_;
if(src_)
{
cv::Vec3f vec_;
for(int i = 0; i < vHeight_; i++)
for(int j = 0; j < vWidth_; j++)
{
vec_ = cv::Vec3f((*src_)[0]/255.0, (*src_)[1]/255.0, (*src_)[2]/255.0);//Please note that OpenCV store pixels as BGR.
vImage_.at<cv::Vec3f>(vHeight_-1-i, j) = vec_;
++src_;
}
}
if(! vImage_.data ) // Check for invalid input
printf("failed to read image by OpenCV.");
else
{
cv::namedWindow( windowName_, CV_WINDOW_AUTOSIZE);
cv::imshow( windowName_, vImage_); // Show the image.
}
The current version allows the cv::Mat::at function to handle 3 dimensions. So for a Mat object m, m.at<uchar>(0,0,0) should work.
uchar * value = img2.data; // Pointer to the first pixel; the data is one flat array of all channel values
int r = 2;
for (size_t i = 0; i < img2.cols* (img2.rows * img2.channels()); i++)
{
if (r > 2) r = 0;
if (r == 0) value[i] = 0;
if (r == 1)value[i] = 0;
if (r == 2)value[i] = 255;
r++;
}
const double pi = boost::math::constants::pi<double>();
cv::Mat distance2ellipse(cv::Mat image, cv::RotatedRect ellipse){
float distance = 2.0f;
float angle = ellipse.angle;
cv::Point ellipse_center = ellipse.center;
float major_axis = ellipse.size.width/2;
float minor_axis = ellipse.size.height/2;
cv::Point pixel;
float a,b,c,d;
for(int x = 0; x < image.cols; x++)
{
for(int y = 0; y < image.rows; y++)
{
auto u = cos(angle*pi/180)*(x-ellipse_center.x) + sin(angle*pi/180)*(y-ellipse_center.y);
auto v = -sin(angle*pi/180)*(x-ellipse_center.x) + cos(angle*pi/180)*(y-ellipse_center.y);
distance = (u/major_axis)*(u/major_axis) + (v/minor_axis)*(v/minor_axis);
if(distance<=1)
{
image.at<cv::Vec3b>(y,x)[1] = 255;
}
}
}
return image;
}

Un-Distort raw images received from the Leap motion cameras

I've been working with the Leap for a long time now. SDK versions 2.1 and later allow us to access the cameras and get raw images. I want to use those images with OpenCV for square/circle detection and similar tasks. The problem is that I can't get those images undistorted. I read the docs, but don't quite get what they mean. Here's one thing I need to understand properly before going forward:
distortion_data_ = image.distortion();
for (int d = 0; d < image.distortionWidth() * image.distortionHeight(); d += 2)
{
float dX = distortion_data_[d];
float dY = distortion_data_[d + 1];
if(!((dX < 0) || (dX > 1)) && !((dY < 0) || (dY > 1)))
{
//what do i do now to undistort the image?
}
}
data = image.data();
mat.put(0, 0, data);
//Imgproc.Canny(mat, mat, 100, 200);
//mat = findSquare(mat);
ok.showImage(mat);
The docs say something like this:
"The calibration map can be used to correct image distortion due to lens curvature and other imperfections. The map is a 64x64 grid of points. Each point consists of two 32-bit values....(the rest on the dev website)"
Can someone explain this in detail, or just post the Java code to undistort the images and give me an output Mat image so I can continue processing it? (I'd still prefer a good explanation if possible.)
Ok, I have no leap camera to test all this, but this is how I understand the documentation:
The calibration map does not hold offsets but full point positions. An entry says where the pixel has to be taken from instead. Those values are mapped between 0 and 1, which means that you have to multiply them by your real image width and height.
What isn't explained explicitly is how your pixel positions are mapped to the 64 x 64 positions of your calibration map. I assume that it's the same way: 640 pixels of width are mapped to 64 map columns and 240 pixels of height are mapped to 64 map rows.
So in general, to move from one of your 640 x 240 pixel positions (pX, pY) to the undistorted position you will:
compute corresponding pixel position in the calibration map: float cX = pX/640.0f * 64.0f; float cY = pY/240.0f * 64.0f;
(cX, cY) is now the location of that pixel in the calibration map. You will have to interpolate between neighbouring map locations, but I will first explain how to go on for a discrete location in the calibration map: (cX', cY') = rounded locations of (cX, cY).
read the x and y values out of the calibration map: dX, dY as in the documentation. You have to compute the location in the array by: d = cY'*calibrationMapWidth*2 + cX'*2; then dX is the entry at d and dY is the entry at d+1.
dX and dY are values between 0 and 1 (if not, don't undistort this point because there is no undistortion available). To find the pixel location in your real image, multiply by the image size: uX = dX*640; uY = dY*240;
set your pixel to the undistorted value: undistortedImage(pX,pY) = distortedImage(uX,uY);
But you don't have discrete point positions in your calibration map, so you have to interpolate. I'll give you an example:
Let (cX,cY) = (13.7, 10.4)
so you read from your calibration map four values:
calibMap(13,10) = (dX1, dY1)
calibMap(14,10) = (dX2, dY2)
calibMap(13,11) = (dX3, dY3)
calibMap(14,11) = (dX4, dY4)
now your undistorted pixel position for (13.7, 10.4) is (multiply each with 640 or 240 to get uX1, uY1, uX2, etc):
// interpolate in x direction first:
float tmpUX1 = uX1*0.3 + uX2*0.7
float tmpUY1 = uY1*0.3 + uY2*0.7
float tmpUX2 = uX3*0.3 + uX4*0.7
float tmpUY2 = uY3*0.3 + uY4*0.7
// now interpolate in y direction
float combinedX = tmpUX1*0.6 + tmpUX2*0.4
float combinedY = tmpUY1*0.6 + tmpUY2*0.4
and your undistorted point is:
undistortedImage(pX,pY) = distortedImage(floor(combinedX+0.5),floor(combinedY+0.5)); or interpolate pixel values there too.
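The worked example above can be sketched as a small function. This is only an illustration under the same assumptions as the answer (an interleaved x,y map stored row by row); the name sampleCalibrationMap is hypothetical and it assumes 0 <= cX <= gridW - 1 and 0 <= cY <= gridH - 1.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: bilinear lookup in an interleaved calibration grid.
// map holds gridW * gridH * 2 floats: x value, then y value, row by row.
std::pair<float, float> sampleCalibrationMap(const std::vector<float>& map,
                                             int gridW, int gridH,
                                             float cX, float cY)
{
    // Top-left grid point, clamped so the 2x2 neighbourhood stays inside the grid.
    const int x0 = std::min(std::max(static_cast<int>(std::floor(cX)), 0), gridW - 2);
    const int y0 = std::min(std::max(static_cast<int>(std::floor(cY)), 0), gridH - 2);
    const float wx = cX - x0;  // weight of the right-hand neighbours
    const float wy = cY - y0;  // weight of the lower neighbours

    auto at = [&](int x, int y, int comp) {
        return map[(static_cast<std::size_t>(y) * gridW + x) * 2 + comp];
    };

    float out[2];
    for (int comp = 0; comp < 2; ++comp) {
        // interpolate in x direction first, then in y direction
        const float top    = at(x0, y0, comp)     * (1.0f - wx) + at(x0 + 1, y0, comp)     * wx;
        const float bottom = at(x0, y0 + 1, comp) * (1.0f - wx) + at(x0 + 1, y0 + 1, comp) * wx;
        out[comp] = top * (1.0f - wy) + bottom * wy;
    }
    return std::make_pair(out[0], out[1]);
}
```

The returned pair is still normalised, so it has to be multiplied by the image width and height (640 and 240 in the example) and checked against the [0, 1] range before it is used as a pixel position.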
Hope this helps for a basic understanding. I'll try to add OpenCV remap code soon! The only point that's unclear to me is whether the mapping between pX/Y and cX/Y is correct, because that's not explicitly explained in the documentation.
Here is some code. You can skip the first part, where I am faking a distortion and creating the map, which is your initial state.
With openCV it is simple, just resize the calibration map to your image size and multiply all the values with your resolution. The nice thing is, that openCV performs the interpolation "automatically" while resizing.
int main()
{
cv::Mat input = cv::imread("../Data/Lenna.png");
cv::Mat distortedImage = input.clone();
// now i fake some distortion:
cv::Mat transformation = cv::Mat::eye(3,3,CV_64FC1);
transformation.at<double>(0,0) = 2.0;
cv::warpPerspective(input,distortedImage,transformation,input.size());
cv::imshow("distortedImage", distortedImage);
//cv::imwrite("../Data/LenaFakeDistorted.png", distortedImage);
// now fake a calibration map corresponding to my faked distortion:
const unsigned int cmWidth = 64;
const unsigned int cmHeight = 64;
// compute the calibration map by transforming image locations to values between 0 and 1 for legal positions.
float calibMap[cmWidth*cmHeight*2];
for(unsigned int y = 0; y < cmHeight; ++y)
for(unsigned int x = 0; x < cmWidth; ++x)
{
float xx = (float)x/(float)cmWidth;
xx = xx*2.0f; // this is from my fake distortion... this gives some values bigger than 1
float yy = (float)y/(float)cmHeight;
calibMap[y*cmWidth*2+ 2*x] = xx;
calibMap[y*cmWidth*2+ 2*x+1] = yy;
}
// NOW you have the initial situation of your scenario: calibration map and distorted image...
// compute the image locations of calibration map values:
cv::Mat cMapMatX = cv::Mat(cmHeight, cmWidth, CV_32FC1);
cv::Mat cMapMatY = cv::Mat(cmHeight, cmWidth, CV_32FC1);
for(int j=0; j<cmHeight; ++j)
for(int i=0; i<cmWidth; ++i)
{
cMapMatX.at<float>(j,i) = calibMap[j*cmWidth*2 +2*i];
cMapMatY.at<float>(j,i) = calibMap[j*cmWidth*2 +2*i+1];
}
//cv::imshow("mapX",cMapMatX);
//cv::imshow("mapY",cMapMatY);
// interpolate those values for each of your original images pixel:
// here I use linear interpolation, you could use cubic or other interpolation too.
cv::resize(cMapMatX, cMapMatX, distortedImage.size(), 0,0, CV_INTER_LINEAR);
cv::resize(cMapMatY, cMapMatY, distortedImage.size(), 0,0, CV_INTER_LINEAR);
// now the calibration map has the size of your original image, but its values are still between 0 and 1 (for legal positions)
// so scale to image size:
cMapMatX = distortedImage.cols * cMapMatX;
cMapMatY = distortedImage.rows * cMapMatY;
// now create undistorted image:
cv::Mat undistortedImage = cv::Mat(distortedImage.rows, distortedImage.cols, CV_8UC3);
undistortedImage.setTo(cv::Vec3b(0,0,0)); // initialize black
//cv::imshow("undistorted", undistortedImage);
for(int j=0; j<undistortedImage.rows; ++j)
for(int i=0; i<undistortedImage.cols; ++i)
{
cv::Point undistPosition;
undistPosition.x =(cMapMatX.at<float>(j,i)); // this will round the position, maybe you want interpolation instead
undistPosition.y =(cMapMatY.at<float>(j,i));
if(undistPosition.x >= 0 && undistPosition.x < distortedImage.cols
&& undistPosition.y >= 0 && undistPosition.y < distortedImage.rows)
{
undistortedImage.at<cv::Vec3b>(j,i) = distortedImage.at<cv::Vec3b>(undistPosition);
}
}
cv::imshow("undistorted", undistortedImage);
cv::waitKey(0);
//cv::imwrite("../Data/LenaFakeUndistorted.png", undistortedImage);
}
cv::Mat SelfDescriptorDistances(cv::Mat descr)
{
cv::Mat selfDistances = cv::Mat::zeros(descr.rows,descr.rows, CV_64FC1);
for(int keyptNr = 0; keyptNr < descr.rows; ++keyptNr)
{
for(int keyptNr2 = 0; keyptNr2 < descr.rows; ++keyptNr2)
{
double euclideanDistance = 0;
for(int descrDim = 0; descrDim < descr.cols; ++descrDim)
{
double tmp = descr.at<float>(keyptNr,descrDim) - descr.at<float>(keyptNr2, descrDim);
euclideanDistance += tmp*tmp;
}
euclideanDistance = sqrt(euclideanDistance);
selfDistances.at<double>(keyptNr, keyptNr2) = euclideanDistance;
}
}
return selfDistances;
}
I use this as input and fake a remap/distortion from which I compute my calib mat:
input:
faked distortion:
used the map to undistort the image:
TODO: after those computations, use an OpenCV map with those values to perform faster remapping.
Here's an example on how to do it without using OpenCV. The following seems to be faster than using the Leap::Image::warp() method (probably due to the additional function call overhead when using warp()):
float destinationWidth = 320;
float destinationHeight = 120;
unsigned char destination[(int)destinationWidth][(int)destinationHeight];
//define needed variables outside the inner loop
float calX, calY, weightX, weightY, dX1, dX2, dX3, dX4, dY1, dY2, dY3, dY4, dX, dY;
int x1, x2, y1, y2, denormalizedX, denormalizedY;
int x, y;
const unsigned char* raw = image.data();
const float* distortion_buffer = image.distortion();
//Local variables for values needed in loop
const int distortionWidth = image.distortionWidth();
const int width = image.width();
const int height = image.height();
for (x = 0; x < destinationWidth; x++) {
for (y = 0; y < destinationHeight; y++) {
//Calculate the position in the calibration map (still with a fractional part)
calX = 63 * x/destinationWidth;
calY = 63 * y/destinationHeight;
//Save the fractional part to use as the weight for interpolation
weightX = calX - truncf(calX);
weightY = calY - truncf(calY);
//Get the x,y coordinates of the closest calibration map points to the target pixel
x1 = calX; //Note truncation to int
y1 = calY;
x2 = x1 + 1;
y2 = y1 + 1;
//Look up the x and y values for the 4 calibration map points around the target
// (x1, y1) .. .. .. (x2, y1)
// .. ..
// .. (x, y) ..
// .. ..
// (x1, y2) .. .. .. (x2, y2)
dX1 = distortion_buffer[x1 * 2 + y1 * distortionWidth];
dX2 = distortion_buffer[x2 * 2 + y1 * distortionWidth];
dX3 = distortion_buffer[x1 * 2 + y2 * distortionWidth];
dX4 = distortion_buffer[x2 * 2 + y2 * distortionWidth];
dY1 = distortion_buffer[x1 * 2 + y1 * distortionWidth + 1];
dY2 = distortion_buffer[x2 * 2 + y1 * distortionWidth + 1];
dY3 = distortion_buffer[x1 * 2 + y2 * distortionWidth + 1];
dY4 = distortion_buffer[x2 * 2 + y2 * distortionWidth + 1];
//Bilinear interpolation of the looked-up values:
// X value
dX = dX1 * (1 - weightX) * (1- weightY) + dX2 * weightX * (1 - weightY) + dX3 * (1 - weightX) * weightY + dX4 * weightX * weightY;
// Y value
dY = dY1 * (1 - weightX) * (1- weightY) + dY2 * weightX * (1 - weightY) + dY3 * (1 - weightX) * weightY + dY4 * weightX * weightY;
// Reject points outside the range [0..1]
if((dX >= 0) && (dX <= 1) && (dY >= 0) && (dY <= 1)) {
//Denormalize from [0..1] to [0..width] or [0..height]
denormalizedX = dX * width;
denormalizedY = dY * height;
//look up the brightness value for the target pixel
destination[x][y] = raw[denormalizedX + denormalizedY * width];
} else {
destination[x][y] = -1;
}
}
}

Fast Gaussian Blur image filter with ARM NEON

I'm trying to make a mobile fast version of Gaussian Blur image filter.
I've read other questions, like: Fast Gaussian blur on unsigned char image- ARM Neon Intrinsics- iOS Dev
For my purpose i need only a fixed size (7x7) fixed sigma (2) Gaussian filter.
So, before optimizing for ARM NEON, I'm implementing 1D Gaussian Kernel in C++, and comparing performance with OpenCV GaussianBlur() method directly in mobile environment (Android with NDK). This way it will result in a much simpler code to optimize.
However, the result is that my implementation is 10 times slower than the OpenCV4Android version. I've read that OpenCV4 Tegra has an optimized GaussianBlur implementation, but I don't think that standard OpenCV4Android has those kinds of optimizations, so why is my code so slow?
Here is my implementation (note: reflect101 is used for pixel reflection when applying filter near borders):
Mat myGaussianBlur(Mat src){
Mat dst(src.rows, src.cols, CV_8UC1);
Mat temp(src.rows, src.cols, CV_8UC1);
float sum, x1, y1;
// coefficients of 1D gaussian kernel with sigma = 2
double coeffs[] = {0.06475879783, 0.1209853623, 0.1760326634, 0.1994711402, 0.1760326634, 0.1209853623, 0.06475879783};
//Normalize coeffs
float coeffs_sum = 0.9230247873f;
for (int i = 0; i < 7; i++){
coeffs[i] /= coeffs_sum;
}
// filter vertically
for(int y = 0; y < src.rows; y++){
for(int x = 0; x < src.cols; x++){
sum = 0.0;
for(int i = -3; i <= 3; i++){
y1 = reflect101(src.rows, y - i);
sum += coeffs[i + 3]*src.at<uchar>(y1, x);
}
temp.at<uchar>(y,x) = sum;
}
}
// filter horizontally
for(int y = 0; y < src.rows; y++){
for(int x = 0; x < src.cols; x++){
sum = 0.0;
for(int i = -3; i <= 3; i++){
x1 = reflect101(src.cols, x - i); // horizontal pass: reflect against the column count
sum += coeffs[i + 3]*temp.at<uchar>(y, x1);
}
dst.at<uchar>(y,x) = sum;
}
}
return dst;
}
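The reflect101 helper called above is not shown in the question. A minimal sketch of what it presumably does follows, assuming the same mirroring as OpenCV's BORDER_REFLECT_101 (the border pixel itself is not repeated) and the (size, index) argument order used in the calls; it is a guess at the helper, not the asker's actual code.

```cpp
// Hypothetical definition of reflect101: mirrors an out-of-range index back
// into [0, size) without repeating the border pixel (... c b | a b c d | c b ...).
// Assumes size > 1 and that the index overshoots by less than size.
inline int reflect101(int size, int index)
{
    if (index < 0)
        return -index;                 // -1 -> 1, -2 -> 2, ...
    if (index >= size)
        return 2 * size - index - 2;   // size -> size - 2, size + 1 -> size - 3, ...
    return index;
}
```

Even if the function is this small, calling it for every tap of every pixel adds two comparisons per sample that are only needed within three pixels of the border.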
A big part of the problem here is that the algorithm is overly precise, as @PaulR pointed out. It's usually best to keep your coefficient table no more precise than your data. In this case, since you appear to be processing uchar data, you would use roughly an 8-bit coefficient table.
Keeping these weights small will particularly matter in your NEON implementation because the narrower you have the arithmetic, the more lanes you can process at once.
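One way such an 8-bit table could be built is sketched below: each normalised coefficient is scaled by 256 and rounded, and the rounding residue is folded into the centre tap so the weights sum to exactly 256. The helper name quantizeKernel and the choice to correct the centre tap are assumptions for illustration, not something taken from the answer.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical helper: quantises a normalised 7-tap kernel to integer weights
// that sum to exactly 256, so the result can be renormalised with a shift by 8.
std::array<int, 7> quantizeKernel(const std::array<double, 7>& coeffs)
{
    std::array<int, 7> weights{};
    int total = 0;
    for (std::size_t i = 0; i < coeffs.size(); ++i) {
        weights[i] = static_cast<int>(std::lround(coeffs[i] * 256.0));
        total += weights[i];
    }
    weights[3] += 256 - total;  // put the rounding residue on the centre tap
    return weights;
}
```

With weights that sum to 256 and 8-bit pixels, a filtered sum never exceeds 255 * 256 = 65280, so it fits in an unsigned 16-bit lane.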
Beyond that, the first major slowdown that stands out is having the image edge reflection code within the main loop. That makes the bulk of the work less efficient, because most pixels are far from the border and need no special handling.
It might work out better if you use a special version of the loop near the edges, and then when you're safe from that you use a simplified inner loop that doesn't call that reflect101() function.
Second (more relevant to prototype code) is that it's possible to add the wings of the window together before applying the weighting function, because the table contains the same coefficients on both sides.
sum = src.at<uchar>(y, x) * coeffs[3];
for(int i = -3; i < 0; i++) {
int tmp = src.at<uchar>(y + i, x) + src.at<uchar>(y - i, x);
sum += coeffs[i + 3] * tmp;
}
This saves you six multiplies per pixel, and it's a step towards some other optimisations around controlling overflow conditions.
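To check that folding the wings does not change the result, here is a small self-contained sketch on a single row with integer weights. The function names filterDirect and filterFolded are hypothetical, and both assume that x is at least 3 samples away from either end of the row.

```cpp
#include <array>
#include <vector>

// Direct 7-tap filter at position x: seven multiplies.
int filterDirect(const std::vector<unsigned char>& row,
                 const std::array<int, 7>& w, int x)
{
    int sum = 0;
    for (int i = -3; i <= 3; ++i)
        sum += w[i + 3] * row[x + i];
    return sum;
}

// Same filter for a symmetric kernel: add the two samples at the same
// distance from the centre first, then multiply once. Four multiplies.
int filterFolded(const std::vector<unsigned char>& row,
                 const std::array<int, 7>& w, int x)
{
    int sum = w[3] * row[x];
    for (int i = 1; i <= 3; ++i)
        sum += w[3 - i] * (row[x - i] + row[x + i]);
    return sum;
}
```

Each pass drops from seven multiplies to four, which gives the six saved multiplies per pixel over the two passes.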
Then there are a couple of other problems related to the memory system.
The two-pass approach is good in principle, because it saves you from performing a lot of recomputation. Unfortunately it can push the useful data out of L1 cache, which can make everything quite a lot slower. It also means that when you write the result out to memory, you're quantising the intermediate sum, which can reduce precision.
When you convert this code to NEON, one of the things you will want to focus on is trying to keep your working set inside the register file, but without discarding calculations before they've been fully utilised.
When people do use two passes, it's usual for the intermediate data to be transposed -- that is, a column of input becomes a row of output.
This is because the CPU will really not like fetching small amounts of data across multiple lines of the input image. It works out much more efficiently (because of the way the cache works) if you collect together a bunch of horizontal pixels and filter those. If the temporary buffer is transposed, then the second pass also collects together a bunch of horizontal points (which would be vertical in the original orientation) and it transposes its output again so it comes out the right way.
If you optimise to keep your working set localised, then you might not need this transposition trick, but it's worth knowing about so that you can set yourself a healthy baseline performance. Unfortunately, localisation like this does force you to go back to the non-optimal memory fetches, but with the wider data types that penalty can be mitigated.
If this is specifically for 8 bit images then you really don't want floating point coefficients, especially not double precision. Also you don't want to use floats for x1, y1. You should just use integers for coordinates and you can use fixed point (i.e. integer) for the coefficients to keep all the filter arithmetic in the integer domain, e.g.
Mat myGaussianBlur(Mat src){
Mat dst(src.rows, src.cols, CV_8UC1);
Mat temp(src.rows, src.cols, CV_16UC1); // <<<
int sum, x1, y1; // <<<
// coefficients of 1D gaussian kernel with sigma = 2
double coeffs[] = {0.06475879783, 0.1209853623, 0.1760326634, 0.1994711402, 0.1760326634, 0.1209853623, 0.06475879783};
int coeffs_i[7] = { 0 }; // <<<
//Normalize coeffs
float coeffs_sum = 0.9230247873f;
for (int i = 0; i < 7; i++){
coeffs_i[i] = (int)(coeffs[i] / coeffs_sum * 256); // <<<
}
// filter vertically
for(int y = 0; y < src.rows; y++){
for(int x = 0; x < src.cols; x++){
sum = 0; // <<<
for(int i = -3; i <= 3; i++){
y1 = reflect101(src.rows, y - i);
sum += coeffs_i[i + 3]*src.at<uchar>(y1, x); // <<<
}
temp.at<ushort>(y,x) = sum; // temp is CV_16UC1, so it must be accessed as ushort
}
}
// filter horizontally
for(int y = 0; y < src.rows; y++){
for(int x = 0; x < src.cols; x++){
sum = 0; // <<<
for(int i = -3; i <= 3; i++){
x1 = reflect101(src.cols, x - i); // horizontal pass: reflect against the column count
sum += coeffs_i[i + 3]*temp.at<ushort>(y, x1); // <<<
}
dst.at<uchar>(y,x) = sum / (256 * 256); // <<<
}
}
return dst;
}
This is the code after implementing all the suggestions of @Paul R and @sh1, summarized as follows:
1) use only integer arithmetic (with precision to taste)
2) add the values of the pixels at the same distance from the mask center before applying the multiplications, to reduce the number of multiplications.
3) apply only horizontal filters to take advantage of the storage by rows of the matrices
4) separate cycles around the edges from those inside the image not to make unnecessary calls to reflection functions. I totally removed the functions of reflection, including them inside the loops along the edges.
5) In addition, as a personal observation, to improve rounding without calling a (slow) function such as "round" or "cvRound", I've added 0.5 (= 32768 in the 16-bit fixed-point representation) to both the temporary and final pixel results, to reduce the error / difference compared to OpenCV.
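The rounding trick from the last point can be isolated in a tiny helper. This is a sketch assuming coefficients scaled by 65536 and non-negative sums; the name roundQ16 is hypothetical.

```cpp
// Hypothetical helper: converts a non-negative sum scaled by 65536 back to an
// integer, rounding half up. Adding 32768 (0.5 in this scale) before the shift
// replaces a call to round(); a plain division by 65536 would truncate instead.
inline int roundQ16(int sum)
{
    return (sum + 32768) >> 16;
}
```

For 8-bit pixels and weights that sum to about 65536, the largest sum is roughly 255 * 65536, so adding 32768 cannot overflow a 32-bit int.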
Now the performance is much better: it went from about 15 times to about 6 times slower than OpenCV.
However, the resulting matrix is not perfectly identical to the one obtained with OpenCV's GaussianBlur. This does not appear to be due to the arithmetic width, which is sufficient, and the difference remains regardless. Note that it is a minimal difference, between 0 and 2 (in absolute value) of pixel intensity, between the matrices resulting from the two versions. The coefficients are the same ones used by OpenCV, obtained with getGaussianKernel with the same size and sigma.
Mat myGaussianBlur(Mat src){
Mat dst(src.rows, src.cols, CV_8UC1);
Mat temp(src.rows, src.cols, CV_8UC1);
int sum;
int x1;
double coeffs[] = {0.070159, 0.131075, 0.190713, 0.216106, 0.190713, 0.131075, 0.070159};
int coeffs_i[7] = { 0 };
for (int i = 0; i < 7; i++){
coeffs_i[i] = (int)(coeffs[i] * 65536); //65536
}
// filter horizontally - inside the image
for(int y = 0; y < src.rows; y++){
uchar *ptr = src.ptr<uchar>(y);
for(int x = 3; x < (src.cols - 3); x++){
sum = ptr[x] * coeffs_i[3];
for(int i = -3; i < 0; i++){
int tmp = ptr[x+i] + ptr[x-i];
sum += coeffs_i[i + 3]*tmp;
}
temp.at<uchar>(y,x) = (sum + 32768) / 65536;
}
}
// filter horizontally - edges - needs reflect
for(int y = 0; y < src.rows; y++){
uchar *ptr = src.ptr<uchar>(y);
for(int x = 0; x <= 2; x++){
sum = 0;
for(int i = -3; i <= 3; i++){
x1 = x + i;
if(x1 < 0){
x1 = -x1;
}
sum += coeffs_i[i + 3]*ptr[x1];
}
temp.at<uchar>(y,x) = (sum + 32768) / 65536;
}
}
for(int y = 0; y < src.rows; y++){
uchar *ptr = src.ptr<uchar>(y);
for(int x = (src.cols - 3); x < src.cols; x++){
sum = 0;
for(int i = -3; i <= 3; i++){
x1 = x + i;
if(x1 >= src.cols){
x1 = 2*src.cols - x1 - 2;
}
sum += coeffs_i[i + 3]*ptr[x1];
}
temp.at<uchar>(y,x) = (sum + 32768) / 65536;
}
}
// transpose to apply again horizontal filter - better cache data locality
transpose(temp, temp);
// filter horizontally - inside the image
for(int y = 0; y < src.rows; y++){
uchar *ptr = temp.ptr<uchar>(y);
for(int x = 3; x < (src.cols - 3); x++){
sum = ptr[x] * coeffs_i[3];
for(int i = -3; i < 0; i++){
int tmp = ptr[x+i] + ptr[x-i];
sum += coeffs_i[i + 3]*tmp;
}
dst.at<uchar>(y,x) = (sum + 32768) / 65536;
}
}
// filter horizontally - edges - needs reflect
for(int y = 0; y < src.rows; y++){
uchar *ptr = temp.ptr<uchar>(y);
for(int x = 0; x <= 2; x++){
sum = 0;
for(int i = -3; i <= 3; i++){
x1 = x + i;
if(x1 < 0){
x1 = -x1;
}
sum += coeffs_i[i + 3]*ptr[x1];
}
dst.at<uchar>(y,x) = (sum + 32768) / 65536;
}
}
for(int y = 0; y < src.rows; y++){
uchar *ptr = temp.ptr<uchar>(y);
for(int x = (src.cols - 3); x < src.cols; x++){
sum = 0;
for(int i = -3; i <= 3; i++){
x1 = x + i;
if(x1 >= src.cols){
x1 = 2*src.cols - x1 - 2;
}
sum += coeffs_i[i + 3]*ptr[x1];
}
dst.at<uchar>(y,x) = (sum + 32768) / 65536;
}
}
transpose(dst, dst);
return dst;
}
According to Google's documentation, using float/double on Android devices is about twice as slow as using int/uchar.
You may find some ways to speed up your C++ code in this Android document:
https://developer.android.com/training/articles/perf-tips

Search for lines with a small range of angles in OpenCV

I'm using the Hough transform in OpenCV to detect lines. However, I know in advance that I only need lines within a very limited range of angles (about 10 degrees or so). I'm doing this in a very performance sensitive setting, so I'd like to avoid the extra work spent detecting lines at other angles, lines I know in advance I don't care about.
I could extract the Hough source from OpenCV and just hack it to take min_theta and max_theta parameters, but I'd like a less fragile approach (otherwise I would have to manually update my code with each OpenCV update, etc.).
What's the best approach here?
Well, I've modified the icvHoughLines function to handle only a certain range of angles. I'm sure there are cleaner ways that also deal with the memory allocation, but I got a speed gain from 100 ms to 33 ms by narrowing the angle range from 180 degrees to 60 degrees, so I'm happy with that.
Note that this code also outputs the accumulator value. Also, I only output 1 line because that fit my purposes but there was no gain really there.
static void
icvHoughLinesStandard2( const CvMat* img, float rho, float theta,
int threshold, CvSeq *lines, int linesMax )
{
cv::AutoBuffer<int> _accum, _sort_buf;
cv::AutoBuffer<float> _tabSin, _tabCos;
const uchar* image;
int step, width, height;
int numangle, numrho;
int total = 0;
float ang;
int r, n;
int i, j;
float irho = 1 / rho;
double scale;
CV_Assert( CV_IS_MAT(img) && CV_MAT_TYPE(img->type) == CV_8UC1 );
image = img->data.ptr;
step = img->step;
width = img->cols;
height = img->rows;
numangle = cvRound(CV_PI / theta);
numrho = cvRound(((width + height) * 2 + 1) / rho);
_accum.allocate((numangle+2) * (numrho+2));
_sort_buf.allocate(numangle * numrho);
_tabSin.allocate(numangle);
_tabCos.allocate(numangle);
int *accum = _accum, *sort_buf = _sort_buf;
float *tabSin = _tabSin, *tabCos = _tabCos;
memset( accum, 0, sizeof(accum[0]) * (numangle+2) * (numrho+2) );
// find n and ang limits (in our case we want 60 to 120 degrees)
float limit_min = 60.0/180.0*CV_PI;
float limit_max = 120.0/180.0*CV_PI;
//num_steps = (limit_max - limit_min)/theta;
int start_n = floor(limit_min/theta);
int stop_n = floor(limit_max/theta);
for( ang = limit_min, n = start_n; n < stop_n; ang += theta, n++ )
{
tabSin[n] = (float)(sin(ang) * irho);
tabCos[n] = (float)(cos(ang) * irho);
}
// stage 1. fill accumulator
for( i = 0; i < height; i++ )
for( j = 0; j < width; j++ )
{
if( image[i * step + j] != 0 )
//
for( n = start_n; n < stop_n; n++ )
{
r = cvRound( j * tabCos[n] + i * tabSin[n] );
r += (numrho - 1) / 2;
accum[(n+1) * (numrho+2) + r+1]++;
}
}
int max_accum = 0;
int max_ind = 0;
for( r = 0; r < numrho; r++ )
{
for( n = start_n; n < stop_n; n++ )
{
int base = (n+1) * (numrho+2) + r+1;
if (accum[base] > max_accum)
{
max_accum = accum[base];
max_ind = base;
}
}
}
CvLinePolar2 line;
scale = 1./(numrho+2);
int idx = max_ind;
n = cvFloor(idx*scale) - 1;
r = idx - (n+1)*(numrho+2) - 1;
line.rho = (r - (numrho - 1)*0.5f) * rho;
line.angle = n * theta;
line.votes = accum[idx];
cvSeqPush( lines, &line );
}
If you use the probabilistic Hough transform, then each line is output as two CvPoint end points, lines[0] and lines[1]. We can get the x and y coordinates of the two points as pt1.x, pt1.y and pt2.x, pt2.y.
Then use the simple formula for the slope of a line, (y2-y1)/(x2-x1). Taking the arctan (inverse tangent) of that yields the angle in radians. Then simply keep the Hough lines whose angle falls in the desired range.
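A hedged sketch of that filtering step follows. It swaps the slope formula for atan2, which avoids the division by zero for vertical segments; the Segment struct is a hypothetical stand-in for one probabilistic Hough result, and the angle computed here is the direction of the segment itself, not the theta of its normal as used by the standard transform.

```cpp
#include <cmath>
#include <vector>

// Hypothetical stand-in for the two end points of one detected segment.
struct Segment { float x1, y1, x2, y2; };

// Direction of the segment in degrees, folded into [0, 180) so that the
// order of the two end points does not matter.
inline double segmentAngleDeg(const Segment& s)
{
    const double kPi = 3.14159265358979323846;
    double deg = std::atan2(double(s.y2) - s.y1, double(s.x2) - s.x1) * 180.0 / kPi;
    if (deg < 0.0) deg += 180.0;
    if (deg >= 180.0) deg -= 180.0;
    return deg;
}

// Keeps only the segments whose direction lies in [minDeg, maxDeg].
std::vector<Segment> keepAngles(const std::vector<Segment>& in,
                                double minDeg, double maxDeg)
{
    std::vector<Segment> out;
    for (const Segment& s : in) {
        const double deg = segmentAngleDeg(s);
        if (deg >= minDeg && deg <= maxDeg)
            out.push_back(s);
    }
    return out;
}
```

Note that this only filters the output; the transform still spends time voting for all angles, so it does not address the performance concern in the question.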
I think it's more natural to use the standard HoughLines(...) function, which gives the collection of lines directly in terms of rho and theta, and to select the necessary angle range from it, rather than recalculating the angle from segment end points.

Resources