Here's what I got so far. I rewrote the code to simplify things a bit. Previous code wasn't actually the real, basic algorithm. It had fluff that I didn't need. I answered the question about pitch, and below you'll see some images of my test results.
local function Line (buf, x1, y1, x2, y2, color, pitch)
-- identify the first pixel
local n = x1 + y1 * pitch
-- // difference between starting and ending points
local dx = x2 - x1;
local dy = y2 - y1;
local m = dy / dx
local err = m - 1
if (dx > dy) then -- // dx is the major axis
local j = y1
local i = x1
while i < x2 do
buf.buffer[j * pitch + i] = color
if (err >= 0) then
i = i + 1
err = err - 1
end
j = j + 1
err = err + m
end
else -- // dy is the major axis
local j = x1
local i = y1
while i < y2 do
buf.buffer[i * pitch + j] = color
if (err >= 0) then
i = i + 1
err = err - 1
end
j = j + 1
err = err + m
end
end
end
-- (visdata[2][1][576], int isBeat, int *framebuffer, int *fbout, int w, int h
function LibAVSSuperScope:Render(visdata, isBeat, framebuffer, fbout, w, h)
local size = 5
Line (self.buffer, 0, 0, 24, 24, 0xffff00, 24)
do return end
end
Edit: Oh I just realized something. 0,0 is in the lower left-hand corner. So the function's sort of working, but it's overlapping and slanted.
Edit2:
Yeah, this whole thing's broken. I'm plugging numbers into Line() and getting all sort of results. Let me show you some.
Here's Line (self.buffer, 0, 0, 23, 23, 0x00ffff, 24 * 2)
And here's Line (self.buffer, 0, 1, 23, 23, 0x00ffff, 24 * 2)
Edit: Wow, doing Line (self.buffer, 0, 24, 24, 24, 0x00ffff, 24 * 2) uses way too much CPU time.
Edit: Here's another image using this algorithm. The yellow dots are starting points.
Line (self.buffer, 0, 0, 24, 24, 0xff0000, 24)
Line (self.buffer, 0, 12, 23, 23, 0x00ff00, 24)
Line (self.buffer, 12, 0, 23, 23, 0x0000ff, 24)
Edit: And yes, that blue line wraps around.
This one works.
Line (self.buffer, 0, 0, 23, 23, 0xff0000, 24 * 2)
Line (self.buffer, 0, 5, 23, 23, 0x00ff00, 24)
Line (self.buffer, 12, 0, 23, 23, 0x0000ff, 24)
--
local function Line (buf, x0, y0, x1, y1, color, pitch)
local dx = x1 - x0;
local dy = y1 - y0;
buf.buffer[x0 + y0 * pitch] = color
if (dx ~= 0) then
local m = dy / dx;
local b = y0 - m*x0;
if x1 > x0 then
dx = 1
else
dx = -1
end
while x0 ~= x1 do
x0 = x0 + dx
y0 = math.floor(m*x0 + b + 0.5);
buf.buffer[x0 + y0 * pitch] = color
end
end
end
Here's the spiral.
The one below dances around like a music visualization, but we're just feeding it random data. I think the line quality could be better.
This is what I settled on. I just had to find valid information on that Bresenham algorithm. Thanks cs-unc for the information about various line algorithms, from simple to complex.
function LibBuffer:Line4(x0, y0, x1, y1, color, pitch)
local dx = x1 - x0;
local dy = y1 - y0;
local stepx, stepy
if dy < 0 then
dy = -dy
stepy = -1
else
stepy = 1
end
if dx < 0 then
dx = -dx
stepx = -1
else
stepx = 1
end
self.buffer[x0 + y0 * pitch] = color
if dx > dy then
local fraction = dy - bit.rshift(dx, 1)
while x0 ~= x1 do
if fraction >= 0 then
y0 = y0 + stepy
fraction = fraction - dx
end
x0 = x0 + stepx
fraction = fraction + dy
self.buffer[floor(y0) * pitch + floor(x0)] = color
end
else
local fraction = dx - bit.rshift(dy, 1)
while y0 ~= y1 do
if fraction >= 0 then
x0 = x0 + stepx
fraction = fraction - dy
end
y0 = y0 + stepy
fraction = fraction + dx
self.buffer[floor(y0) * pitch + floor(x0)] = color
end
end
end
Here's what this one looks like.
Related
I got point cloud data in the form of [(x, y, z) , (norm_x, norm_y, norm_z)] in a text file. I am trying to convert this into a png or jpg image file where any points intensity corresponds to its depth (z).
here is how an stl 3d file looks like (left). On the right is what i am trying to make.
Thank you all for taking time to read this.
x_min = -2
x_max = 2
y_min = -1
y_max = 1
z_min = 0
z_max = 1
dx = x_max - x_min
dy = y_max - y_min
dz = z_max - z_min
Ps = []
for (i = 0; i < 1000; ++i) Ps.push([x_min + Math.random()*dx, y_min + Math.random()*dy, z_min + Math.random()*dz])
width = canvas.width
height = canvas.height
context = canvas.getContext('2d')
context.setFillColor('#000000')
context.fillRect(0, 0, width, height)
imagedata = context.getImageData(0, 0, width, height)
data = imagedata.data
w = width - 1
h = height - 1
for (P of Ps) {
col = Math.round(((P[0] - x_min)/dx)*w)
row = Math.round(((y_max - P[1])/dy)*h)
val = ((P[2] - z_min)/dz)*255
i = 4*(width*row + col)
if (data[i] < val) data[i] = data[i + 1] = data[i + 2] = val
}
context.putImageData(imagedata, 0, 0)
a.href = canvas.toDataURL()
<canvas id=canvas>HTML5</canvas><br><a id=a>Download</a>
I have implemented separable Gaussian blur. Horizontal pass was relatively easy to optimize with SIMD processing. However, I am not sure how to optimize vertical pass.
Accessing elements is not very cache friendly and filling SIMD lane would mean reading many different pixels. I was thinking about transpose the image and run horizontal pass and then transpose image back, however, I am not sure if it will gain any improvement because of two tranpose operations.
I have quite large images 16k resolution and kernel size is 19, so vectorization of vertical pass gain was about 15%.
My Vertical pass is as follows (it is sinde generic class typed to T which can be uint8_t or float):
int yStart = kernelHalfSize;
int xStart = kernelHalfSize;
int yEnd = input.GetWidth() - kernelHalfSize;
int xEnd = input.GetHeigh() - kernelHalfSize;
const T * inData = input.GetData().data();
V * outData = output.GetData().data();
int kn = kernelHalfSize * 2 + 1;
int kn4 = kn - kn % 4;
for (int y = yStart; y < yEnd; y++)
{
size_t yW = size_t(y) * output.GetWidth();
size_t outX = size_t(xStart) + yW;
size_t xEndSimd = xStart;
int len = xEnd - xStart;
len = len - len % 4;
xEndSimd = xStart + len;
for (int x = xStart; x < xEndSimd; x += 4)
{
size_t inYW = size_t(y) * input.GetWidth();
size_t x0 = ((x + 0) - kernelHalfSize) + inYW;
size_t x1 = x0 + 1;
size_t x2 = x0 + 2;
size_t x3 = x0 + 3;
__m128 sumDot = _mm_setzero_ps();
int i = 0;
for (; i < kn4; i += 4)
{
__m128 kx = _mm_set_ps1(kernelDataX[i + 0]);
__m128 ky = _mm_set_ps1(kernelDataX[i + 1]);
__m128 kz = _mm_set_ps1(kernelDataX[i + 2]);
__m128 kw = _mm_set_ps1(kernelDataX[i + 3]);
__m128 dx, dy, dz, dw;
if constexpr (std::is_same<T, uint8_t>::value)
{
//we need co convert uint8_t inputs to float
__m128i u8_0 = _mm_loadu_si128((const __m128i*)(inData + x0));
__m128i u8_1 = _mm_loadu_si128((const __m128i*)(inData + x1));
__m128i u8_2 = _mm_loadu_si128((const __m128i*)(inData + x2));
__m128i u8_3 = _mm_loadu_si128((const __m128i*)(inData + x3));
__m128i u32_0 = _mm_unpacklo_epi16(
_mm_unpacklo_epi8(u8_0, _mm_setzero_si128()),
_mm_setzero_si128());
__m128i u32_1 = _mm_unpacklo_epi16(
_mm_unpacklo_epi8(u8_1, _mm_setzero_si128()),
_mm_setzero_si128());
__m128i u32_2 = _mm_unpacklo_epi16(
_mm_unpacklo_epi8(u8_2, _mm_setzero_si128()),
_mm_setzero_si128());
__m128i u32_3 = _mm_unpacklo_epi16(
_mm_unpacklo_epi8(u8_3, _mm_setzero_si128()),
_mm_setzero_si128());
dx = _mm_cvtepi32_ps(u32_0);
dy = _mm_cvtepi32_ps(u32_1);
dz = _mm_cvtepi32_ps(u32_2);
dw = _mm_cvtepi32_ps(u32_3);
}
else
{
/*
//load 8 consecutive values
auto dd = _mm256_loadu_ps(inData + x0);
//extract parts by shifting and casting to 4 values float
dx = _mm256_castps256_ps128(dd);
dy = _mm256_castps256_ps128(_mm256_permutevar8x32_ps(dd, _mm256_set_epi32(0, 0, 0, 0, 4, 3, 2, 1)));
dz = _mm256_castps256_ps128(_mm256_permutevar8x32_ps(dd, _mm256_set_epi32(0, 0, 0, 0, 5, 4, 3, 2)));
dw = _mm256_castps256_ps128(_mm256_permutevar8x32_ps(dd, _mm256_set_epi32(0, 0, 0, 0, 6, 5, 4, 3)));
*/
dx = _mm_loadu_ps(inData + x0);
dy = _mm_loadu_ps(inData + x1);
dz = _mm_loadu_ps(inData + x2);
dw = _mm_loadu_ps(inData + x3);
}
//calculate 4 dots at once
//[dx, dy, dz, dw] <dot> [kx, ky, kz, kw]
auto mx = _mm_mul_ps(dx, kx); //dx * kx
auto my = _mm_fmadd_ps(dy, ky, mx); //mx + dy * ky
auto mz = _mm_fmadd_ps(dz, kz, my); //my + dz * kz
auto res = _mm_fmadd_ps(dw, kw, mz); //mz + dw * kw
sumDot = _mm_add_ps(sumDot, res);
x0 += 4;
x1 += 4;
x2 += 4;
x3 += 4;
}
for (; i < kn; i++)
{
auto v = _mm_set_ps1(kernelDataX[i]);
auto v2 = _mm_set_ps(
*(inData + x3), *(inData + x2),
*(inData + x1), *(inData + x0)
);
sumDot = _mm_add_ps(sumDot, _mm_mul_ps(v, v2));
x0++;
x1++;
x2++;
x3++;
}
sumDot = _mm_mul_ps(sumDot, _mm_set_ps1(weightX));
if constexpr (std::is_same<V, uint8_t>::value)
{
__m128i asInt = _mm_cvtps_epi32(sumDot);
asInt = _mm_packus_epi32(asInt, asInt);
asInt = _mm_packus_epi16(asInt, asInt);
uint32_t res = _mm_cvtsi128_si32(asInt);
((uint32_t *)(outData + outX))[0] = res;
outX += 4;
}
else
{
float tmpRes[4];
_mm_store_ps(tmpRes, sumDot);
outData[outX + 0] = tmpRes[0];
outData[outX + 1] = tmpRes[1];
outData[outX + 2] = tmpRes[2];
outData[outX + 3] = tmpRes[3];
outX += 4;
}
}
for (int x = xEndSimd; x < xEnd; x++)
{
int kn = kernelHalfSize * 2 + 1;
const T * v = input.GetPixelStart(x - kernelHalfSize, y);
float tmp = 0;
for (int i = 0; i < kn; i++)
{
tmp += kernelDataX[i] * v[i];
}
tmp *= weightX;
outData[outX] = ImageUtils::clamp_cast<V>(tmp);
outX++;
}
}
There’s a well-known trick for that.
While you compute both passes, read them sequentially, use SIMD to compute, but write out the result into another buffer, transposed, using scalar stores. Protip: SSE 4.1 has _mm_extract_ps just don’t forget to cast your destination image pointer from float* into int*. Another thing about these stores, I would recommend using _mm_stream_si32 for that as you want maximum cache space used by your input data. When you’ll be computing the second pass, you’ll be reading sequential memory addresses again, the prefetcher hardware will deal with the latency.
This way both passes will be identical, I usually call same function twice, with different buffers.
Two transposes caused by your 2 passes cancel each other. Here’s an HLSL version, BTW.
There’s more. If your kernel size is only 19, that fits in 3 AVX registers. I think shuffle/permute/blend instructions are still faster than even L1 cache loads, i.e. it might be better to load the kernel outside the loop.
I have a pair of Cartesian coordinates that represent a line in an image. I would like to convert this line to polar form and draw it over the image.
e.g
cv::Vec4f line {10,20,60,70};
float x1 = line[0];
float y1 = line[1];
float x2 = line[2];
float y2 = line[3];
I want this line to be represented in cv::Vec2f form(rho,theta).
Taking care of rho & theta with all possible slopes.
Given are the image dimensions :: w and h;
w = image.cols
h = image.rows
How can I achieve this.
N.B: We can also assume that the line can be an extended one running across the image.
for (size_t i = 0; i < lines.size(); i++)
{
int x1 = lines[i][0];
int y1 = lines[i][1];
int x2 = lines[i][2];
int y2 = lines[i][3];
float d = sqrt(((y1-y2)*(y1-y2)) + ((x2-x1)*(x2-x1)) );
float rho = (y1*x2 - y2*x1)/d;
float theta = atan2(x2 - x1,y1-y2) ;
if(rho < 0){
theta *= -1;
rho *= -1;
}
linv2f.push_back(cv::Vec2f(rho,theta));
}
The above approach doesnt give me results when I plot the lines I dont get the lines that are overlapping their original vec4f form.
I use this to convert vec2f to vec4f for testing :
cv::Vec4f cvtVec2fLine(const cv::Vec2f& data, const cv::Mat& img)
{
float const rho = data[0];
float const theta = data[1];
cv::Point pt1,pt2;
if((theta < CV_PI/4. || theta > 3. * CV_PI/4.)){
pt1 = cv::Point(rho / std::cos(theta), 0);
pt2 = cv::Point( (rho - img.rows * std::sin(theta))/std::cos(theta), img.rows);
}else {
pt1 = cv::Point(0, rho / std::sin(theta));
pt2 = cv::Point(img.cols, (rho - img.cols * std::cos(theta))/std::sin(theta));
}
cv::Vec4f l;
l[0] = pt1.x;
l[1] = pt1.y;
l[2] = pt2.x;
l[3] = pt2.y;
return l;
}
rho-theta equation has form
x * Cos(Theta) + y * Sin(Theta) - Rho = 0
We want to represent equation 'by two points' into rho-theta form (page 92 in pdf here). If we have
x * A + y * B - C = 0
and need coefficients in trigonometric form, we can divide all equation by magnitude of (A,B) coefficient vector.
D = Length(A,B) = Math.Hypot(A,B)
x * A/D + y * B/D - C/D = 0
note that (A/D)^2 + (B/D)^2 = 1 - basic trigonometric equality, so we can consider A/D and B/D as cosine and sine of some angle theta.
Your line equation is
(y-y1) * (x2-x1) - (x-x1) * (y2-y1) = 0
or
x * (y1-y2) + y * (x2-x1) - (y1 * x2 - y2 * x1) = 0
let
D = Sqrt((y1-y2)^2 + (x2-x1)^2)
so
Theta = ArcTan2(x2-x1, y1-y2)
Rho = (y1 * x2 - y2 * x1) / D
edited
If Rho is negative, change sign of Rho and shift Theta by Pi
Example:
x1=1,y1=0, x2=0,y2=1
Theta = atan2(-1,-1)=-3*Pi/4
D=Sqrt(2)
Rho=-Sqrt(2)/2 negative =>
Rho = Sqrt(2)/2
Theta = Pi/4
Back substitutuon - find points of intersection with axes
0 * Sqrt(2)/2 + y0 * Sqrt(2)/2 - Sqrt(2)/2 = 0
x=0 y=1
x0 * Sqrt(2)/2 + 0 * Sqrt(2)/2 - Sqrt(2)/2 = 0
x=1 y=0
I want to extract the red ball from one picture and get the detected ellipse matrix in picture.
Here is my example:
I threshold the picture, find the contour of red ball by using findContour() function and use fitEllipse() to fit an ellipse.
But what I want is to get coefficient of this ellipse. Because the fitEllipse() return a rotation rectangle (RotatedRect), so I need to re-write this function.
One Ellipse can be expressed as Ax^2 + By^2 + Cxy + Dx + Ey + F = 0; So I want to get u=(A,B,C,D,E,F) or u=(A,B,C,D,E) if F is 1 (to construct an ellipse matrix).
I read the source code of fitEllipse(), there are totally three SVD process, I think I can get the above coefficients from the results of those three SVD process. But I am quite confused what does each result (variable cv::Mat x) of each SVD process represent and why there are three SVD here?
Here is this function:
cv::RotatedRect cv::fitEllipse( InputArray _points )
{
Mat points = _points.getMat();
int i, n = points.checkVector(2);
int depth = points.depth();
CV_Assert( n >= 0 && (depth == CV_32F || depth == CV_32S));
RotatedRect box;
if( n < 5 )
CV_Error( CV_StsBadSize, "There should be at least 5 points to fit the ellipse" );
// New fitellipse algorithm, contributed by Dr. Daniel Weiss
Point2f c(0,0);
double gfp[5], rp[5], t;
const double min_eps = 1e-8;
bool is_float = depth == CV_32F;
const Point* ptsi = points.ptr<Point>();
const Point2f* ptsf = points.ptr<Point2f>();
AutoBuffer<double> _Ad(n*5), _bd(n);
double *Ad = _Ad, *bd = _bd;
// first fit for parameters A - E
Mat A( n, 5, CV_64F, Ad );
Mat b( n, 1, CV_64F, bd );
Mat x( 5, 1, CV_64F, gfp );
for( i = 0; i < n; i++ )
{
Point2f p = is_float ? ptsf[i] : Point2f((float)ptsi[i].x, (float)ptsi[i].y);
c += p;
}
c.x /= n;
c.y /= n;
for( i = 0; i < n; i++ )
{
Point2f p = is_float ? ptsf[i] : Point2f((float)ptsi[i].x, (float)ptsi[i].y);
p -= c;
bd[i] = 10000.0; // 1.0?
Ad[i*5] = -(double)p.x * p.x; // A - C signs inverted as proposed by APP
Ad[i*5 + 1] = -(double)p.y * p.y;
Ad[i*5 + 2] = -(double)p.x * p.y;
Ad[i*5 + 3] = p.x;
Ad[i*5 + 4] = p.y;
}
solve(A, b, x, DECOMP_SVD);
// now use general-form parameters A - E to find the ellipse center:
// differentiate general form wrt x/y to get two equations for cx and cy
A = Mat( 2, 2, CV_64F, Ad );
b = Mat( 2, 1, CV_64F, bd );
x = Mat( 2, 1, CV_64F, rp );
Ad[0] = 2 * gfp[0];
Ad[1] = Ad[2] = gfp[2];
Ad[3] = 2 * gfp[1];
bd[0] = gfp[3];
bd[1] = gfp[4];
solve( A, b, x, DECOMP_SVD );
// re-fit for parameters A - C with those center coordinates
A = Mat( n, 3, CV_64F, Ad );
b = Mat( n, 1, CV_64F, bd );
x = Mat( 3, 1, CV_64F, gfp );
for( i = 0; i < n; i++ )
{
Point2f p = is_float ? ptsf[i] : Point2f((float)ptsi[i].x, (float)ptsi[i].y);
p -= c;
bd[i] = 1.0;
Ad[i * 3] = (p.x - rp[0]) * (p.x - rp[0]);
Ad[i * 3 + 1] = (p.y - rp[1]) * (p.y - rp[1]);
Ad[i * 3 + 2] = (p.x - rp[0]) * (p.y - rp[1]);
}
solve(A, b, x, DECOMP_SVD);
// store angle and radii
rp[4] = -0.5 * atan2(gfp[2], gfp[1] - gfp[0]); // convert from APP angle usage
if( fabs(gfp[2]) > min_eps )
t = gfp[2]/sin(-2.0 * rp[4]);
else // ellipse is rotated by an integer multiple of pi/2
t = gfp[1] - gfp[0];
rp[2] = fabs(gfp[0] + gfp[1] - t);
if( rp[2] > min_eps )
rp[2] = std::sqrt(2.0 / rp[2]);
rp[3] = fabs(gfp[0] + gfp[1] + t);
if( rp[3] > min_eps )
rp[3] = std::sqrt(2.0 / rp[3]);
box.center.x = (float)rp[0] + c.x;
box.center.y = (float)rp[1] + c.y;
box.size.width = (float)(rp[2]*2);
box.size.height = (float)(rp[3]*2);
if( box.size.width > box.size.height )
{
float tmp;
CV_SWAP( box.size.width, box.size.height, tmp );
box.angle = (float)(90 + rp[4]*180/CV_PI);
}
if( box.angle < -180 )
box.angle += 360;
if( box.angle > 360 )
box.angle -= 360;
return box;
}
The source code link: https://github.com/Itseez/opencv/blob/master/modules/imgproc/src/shapedescr.cpp
The function fitEllipse returns a RotatedRect that contains all the parameters of the ellipse.
An ellipse is defined by 5 parameters:
xc : x coordinate of the center
yc : y coordinate of the center
a : major semi-axis
b : minor semi-axis
theta : rotation angle
You can obtain these parameters like:
RotatedRect e = fitEllipse(points);
float xc = e.center.x;
float yc = e.center.y;
float a = e.size.width / 2; // width >= height
float b = e.size.height / 2;
float theta = e.angle; // in degrees
You can draw an ellipse with the function ellipse using the RotatedRect:
ellipse(image, e, Scalar(0,255,0));
or, equivalently using the ellipse parameters:
ellipse(res, Point(xc, yc), Size(a, b), theta, 0.0, 360.0, Scalar(0,255,0));
If you need the values of the coefficients of the implicit equation, you can do like (from Wikipedia):
So, you can get the parameters you need from the RotatedRect, and you don't need to change the function fitEllipse.
The solve function is used to solve linear systems or least-squares problems. Using the SVD decomposition method the system can be over-defined and/or the matrix src1 can be singular.
For more details on the algorithm, you can see the paper of Fitzgibbon that proposed this fit ellipse method.
Here is some code that worked for me which I based on the other responses on this thread.
def getConicCoeffFromEllipse(e):
# ellipse(Point(xc, yc),Size(a, b), theta)
xc = e[0][0]
yc = e[0][1]
a = e[1][0]/2
b = e[1][1]/2
theta = math.radians(e[2])
# See https://en.wikipedia.org/wiki/Ellipse
# Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0 is the equation
A = a*a*math.pow(math.sin(theta),2) + b*b*math.pow(math.cos(theta),2)
B = 2*(b*b - a*a)*math.sin(theta)*math.cos(theta)
C = a*a*math.pow(math.cos(theta),2) + b*b*math.pow(math.sin(theta),2)
D = -2*A*xc - B*yc
E = -B*xc - 2*C*yc
F = A*xc*xc + B*xc*yc + C*yc*yc - a*a*b*b
coef = np.array([A,B,C,D,E,F]) / F
return coef
def getConicMatrixFromCoeff(c):
C = np.array([[c[0], c[1]/2, c[3]/2], # [ a, b/2, d/2 ]
[c[1]/2, c[2], c[4]/2], # [b/2, c, e/2 ]
[c[3]/2, c[4]/2, c[5]]]) # [d/2], e/2, f ]
return C
I'm using Python and OpenCV 2.4. I'm trying to get a HSV average from an area selected by dragging the mouse, much like in the camShift example provided by OpenCV. But I want the X, Y of the selected color instances in a video feed.
I've been hacking at the onmouse function in camShift. I feel it is close to want I want, I just can't seem to extract the mean HSV values of the area selected. I know I could probably get this done with a for loop, but trying to make it as responsive as possible.
def onmouse(self, event, x, y, flags, param):
x, y = np.int16([x, y]) # BUG
if event == cv2.EVENT_LBUTTONDOWN:
self.drag_start = (x, y)
self.tracking_state = 0
if self.drag_start:
if flags & cv2.EVENT_FLAG_LBUTTON:
h, w = 480, 640 # self.frame.shape[:2]
xo, yo = self.drag_start
x0, y0 = np.maximum(0, np.minimum([xo, yo], [x, y]))
x1, y1 = np.minimum([w, h], np.maximum([xo, yo], [x, y]))
self.selection = None
if x1-x0 > 0 and y1-y0 > 0:
self.selection = (x0, y0, x1, y1)
else:
self.drag_start = None
if self.selection is not None:
self.tracking_state = 1
Ok. It's crude, but this seems to be a lot closer than I was:
import numpy as np
import cv2
import video
class App(object):
def __init__(self, video_src):
#self.cam = video.create_capture(video_src)
self.cam = cv2.VideoCapture(0)
ret, self.frame = self.cam.read()
cv2.namedWindow('camshift')
cv2.setMouseCallback('camshift', self.onmouse)
self.selection = None
self.drag_start = None
self.tracking_state = 0
self.show_backproj = False
def onmouse(self, event, x, y, flags, param):
x, y = np.int16([x, y]) # BUG
if event == cv2.EVENT_LBUTTONDOWN:
self.drag_start = (x, y)
self.tracking_state = 0
if self.drag_start:
if flags & cv2.EVENT_FLAG_LBUTTON:
h, w = self.frame.shape[:2]
xo, yo = self.drag_start
x0, y0 = np.maximum(0, np.minimum([xo, yo], [x, y]))
x1, y1 = np.minimum([w, h], np.maximum([xo, yo], [x, y]))
self.selection = None
if x1-x0 > 0 and y1-y0 > 0:
self.selection = (x0, y0, x1, y1)
else:
self.drag_start = None
if self.selection is not None:
self.tracking_state = 1
def show_hist(self):
bin_count = self.hist.shape[0]
bin_w = 24
img = np.zeros((256, bin_count*bin_w, 3), np.uint8)
for i in xrange(bin_count):
h = int(self.hist[i])
cv2.rectangle(img, (i*bin_w+2, 255), ((i+1)*bin_w-2, 255-h), (int(180.0*i/bin_count), 255, 255), -1)
img = cv2.cvtColor(img, cv2.COLOR_HSV2BGR)
cv2.imshow('hist', img)
def run(self):
while True:
ret, self.frame = self.cam.read()
self.frame = cv2.blur(self.frame,(3,3))
vis = self.frame.copy()
hsv = cv2.cvtColor(self.frame, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv, np.array((0., 60., 32.)), np.array((180., 255., 255.)))
mask2 = mask.copy()
if self.selection:
x0, y0, x1, y1 = self.selection
self.track_window = (x0, y0, x1-x0, y1-y0)
hsv_roi = hsv[y0:y1, x0:x1]
mask_roi = mask[y0:y1, x0:x1]
#cv2.norm(hsv_roi)
dHSV = cv2.mean(hsv_roi)
h, s, v = int(dHSV[0]), int(dHSV[1]), int(dHSV[2])
hist = cv2.calcHist( [hsv_roi], [0], mask_roi, [16], [0, 180] )
cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX);
self.hist = hist.reshape(-1)
self.show_hist()
vis_roi = vis[y0:y1, x0:x1]
cv2.bitwise_not(vis_roi, vis_roi)
vis[mask == 0] = 0
if self.tracking_state == 1:
self.selection = None
cv2.imshow('camshift', vis)
if self.tracking_state == 1:
if h > 159:
h = 159
if s > 235:
s = 235
if v > 235:
v = 235
if h < 20:
h = 20
if s < 20:
s = 20
if v < 20:
v = 20
thresh = cv2.inRange(hsv,np.array(((h-20), (s-20), (v-20))), np.array(((h+20), (s+20), (v+20))))
thresh2 = thresh.copy()
# find contours in the threshold image
contours,hierarchy = cv2.findContours(thresh,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
#best_cnt = 1
max_area = 0
for cnt in contours:
area = cv2.contourArea(cnt)
if area > max_area:
max_area = area
best_cnt = cnt
# finding centroids of best_cnt and draw a circle there
M = cv2.moments(best_cnt)
cx,cy = int(M['m10']/M['m00']), int(M['m01']/M['m00'])
print cx, cy
cv2.circle(thresh2,(cx,cy),20,255,-1)
cv2.imshow('thresh',thresh2)
ch = 0xFF & cv2.waitKey(5)
if ch == 27:
break
if ch == ord('b'):
self.show_backproj = not self.show_backproj
cv2.destroyAllWindows()
if __name__ == '__main__':
import sys
try: video_src = sys.argv[1]
except: video_src = 0
print __doc__
App(video_src).run()
I hack-sawed Rahman's example and camShift