cvHaarDetectObjects Memory Leak - opencv

I'm using the function cvHaarDetectObjects to do face detection and there is a memory leak checking with valgrind even though I think I freed all the memories. I really don't know how to fix the memory leak. Here is my code:
int Detect(MyImage* Img,MyImage **Face)
{
Char* Cascade_name = new Char[1024];
strcpy(Cascade_name,"/usr/share/OpenCV/haarcascades/haarcascade_frontalface_alt.xml");
// Create memory for calculations
CvMemStorage* Storage = 0;
// Create a new Haar classifier
CvHaarClassifierCascade* Cascade = 0;
int Scale = 1;
// Create two points to represent the face locations
CvPoint pt1, pt2;
int Loop;
// Load the HaarClassifierCascade
Cascade = (CvHaarClassifierCascade*)cvLoad( Cascade_name, 0, 0, 0 );
// Check whether the cascade has loaded successfully. Else report and error and quit
if( !Cascade )
{
fprintf( stderr, "ERROR: Could not load classifier cascade\n" );
exit(0);
}
// Allocate the memory storage
Storage = cvCreateMemStorage(0);
// Clear the memory storage which was used before
cvClearMemStorage( Storage );
// Find whether the cascade is loaded, to find the faces. If yes, then:
if( Cascade )
{
// There can be more than one face in an image. So create a growable sequence of faces.
// Detect the objects and store them in the sequence
CvSeq* Faces = cvHaarDetectObjects( Img->Image(), Cascade, Storage,
1.1, 2, CV_HAAR_DO_CANNY_PRUNING,
cvSize(40, 40) );
int MaxWidth = 0;
int MaxHeight = 0;
if(Faces->total == 0)
{
cout<<"There is no face."<<endl;
return 1;
}
//just get the first face
for( Loop = 0; Loop <1; Loop++ )
{
// Create a new rectangle for drawing the face
CvRect* Rect = (CvRect*)cvGetSeqElem( Faces, Loop );
// Find the dimensions of the face,and scale it if necessary
pt1.x = Rect->x*Scale;
pt2.x = (Rect->x+Rect->width)*Scale;
if(Rect->width>MaxWidth) MaxWidth = Rect->width;
pt1.y = Rect->y*Scale;
pt2.y = (Rect->y+Rect->height)*Scale;
if(Rect->height>MaxHeight) MaxHeight = Rect->height;
cvSetImageROI( Img->Image(), *Rect );
MyImage* Dest = new MyImage(cvGetSize(Img->Image()),IPL_DEPTH_8U, 1);
cvCvtColor( Img->Image(), Dest->Image(), CV_RGB2GRAY );
MyImage* Equalized = new MyImage(cvGetSize(Dest->Image()), IPL_DEPTH_8U, 1);
// Perform histogram equalization
cvEqualizeHist( Dest->Image(), Equalized->Image());
(*Face) = new MyImage(Equalized->Image());
if(Equalized)
delete Equalized;
Equalized = NULL;
if(Dest)
delete Dest;
Dest = NULL;
cvResetImageROI(Img->Image());
}
if(Cascade)
{
cvReleaseHaarClassifierCascade( &Cascade );
delete Cascade;
Cascade = NULL;
}
if(Storage)
{
cvClearMemStorage(Storage);
cvReleaseMemStorage(&Storage);
delete Storage;
Storage = NULL;
}
if(Cascade_name)
delete [] Cascade_name;
Cascade_name = NULL;
return 0;
}
In the code, MyImage is a wrapper class of IplImage containing IplImage* p as a member. if the constructor takes a IplImage* ppara as parameter, then the member p will create memory using cvCreateImage(cvGetSize(ppara), ppara->depth, ppara->nChannels) and cvCopy(ppara, p). if it takes size,depth and channels as parameter, then only do cvCreateImage. Then the destructor do cvReleaseImage(&p). The function int Detect(MyImage *Img, MyImage **Face) is called like:
IplImage *Temp = cvLoadImage(ImageName);
MyImage* Img = new MyImage(Temp);
if(Temp)
cvReleaseImage(&Temp);
Temp = NULL;
MyImage * Face = NULL;
Detect(Img, &Face);
I released Img and Face in the following code once the operations on them is done. And the memory leak is happened inside the Detect function. I'm using OpenCV 2.3.1 on 64 bit OS fedora 16. The whole program can terminate normally except for the memory leak.
Thanks a lot.

I found out why there is a memory leak. The reason is:
In the MyImage class constructor, I passed in a IplImage* p pointer, and do the following:
mp = cvCloneImage(p);
where mp is a IplImage* member of the MyImage class. I free the IplImage* pointer that I passed in after creating a new MyImage class object, since cvCloneImage() will create some memories. However I free the member pointer mp in the class destructor when it actually doesn't new any memory. It just points to memories that created by cvCloneImage(). So the memories created by cvCloneImage() isn't freed. This is where the memory leak came from.
Thus I do the following in the constructor given a IplImage* p passed in as parameter:
mp = cvCreateImage(cvGetSize(p), p->depth, p->nChannels);
cvCopy(p, mp);
And free the mp pointer in the class destructor will free the memory that it creates.
After doing this, definitely lost and indirectly lost memories are turned to 0, but there are still possibly lost memories, and valgrind points all the lost record to the cvHaarDetectObjects() function which is from OpenCV. And mostly are caused by some "new threads" issues. Thus I googled this problem and found out valgrind does give possibly lost memories sometimes when new thread is involved. So I monitored the memory usage of the system. The result shows no memory usage building up as the program executed repeatedly.
That's what I found.

Related

What is the syntax for construct array of array using placement new?

i have trying to construct array of array for placement new.
i searching internet only manage to found construct an array using placement new. But what if i want array of array instead?
i not sure how to construct the inner array.
memory manager constructor already allocate buffer with large size
memory manager destructor have delete buff
Node operator new overload already implement
This my code
map_size_x = terrain->get_map_width();
map_size_y = terrain->get_map_height();
grid_map = new Node *[map_size_x];
for (int i = 0; i < map_size_x; ++i)
{
//grid_map[i] = new Node[map_size_y];
grid_map[i] = new( buf + i * sizeof(Node)) Node;
}
buf is char * already allocated big size of memory in somewhere other class like memory manager and should be enough to fit in sizeof Node * width and height.
There is new operator overload implemented in Node class which is
void *AStarPather::Node::operator new(std::size_t size, void* buffer)
{
return buffer;
}
the result seem failed to allocate and program stuck, but no crash.
i am using visual studio 2017

Fill CubeTexture with data

I'm puzzled why this isn't working.
I'm trying to add texture data to each of the cube textures faces. For some reason, only the first(+x) works. The MSDN documentation is quite sparse, but it looks like this should do the trick:
// mip-level 0 data
// R8G8B8A8 texture
uint32_t sizeWidth = textureWidth * sizeof(uint8_t) * 4;
if (isCubeTexture)
{
for (uint32_t index = 0; index < gCubemapNumTextures; ++index)
{
const uint32_t subResourceID = D3D11CalcSubresource(0, index, 1);
context->UpdateSubresource(mTexture, subResourceID, NULL, &textureData.at(sizeWidth * textureHeight * index), sizeWidth, 0);
}
}
When debugging and looking at the faces its all just black except the first face, which seems to load fine. So obivously I am doing somerhing wrong, how do you properly upload cubetexture data to all the faces?
EDIT: follow parameters used to create the texture:
D3D11_TEXTURE2D_DESC textureDesc;
ZeroMemory(&textureDesc, sizeof(D3D11_TEXTURE2D_DESC));
textureDesc.Width = textureWidth;
textureDesc.Height = textureHeight;
textureDesc.ArraySize = isCubeTexture ? gCubemapNumTextures : 1;
if (isSRGB)
textureDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM_SRGB;
else
textureDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
textureDesc.SampleDesc.Count = 1;
textureDesc.Usage = D3D11_USAGE_DEFAULT;
textureDesc.BindFlags = D3D11_BIND_RENDER_TARGET | D3D11_BIND_SHADER_RESOURCE;
textureDesc.MiscFlags = D3D11_RESOURCE_MISC_GENERATE_MIPS;
if (isCubeTexture)
textureDesc.MiscFlags |= D3D11_RESOURCE_MISC_TEXTURECUBE;
DXCALL(device->CreateTexture2D(&textureDesc, NULL, &mTexture));
Then after uploading the data I generate mip chain like this:
context->GenerateMips(mShaderResourceView);
And again, it works fine but only for the first (+x) face.
You create the texture with "0" mip levels by virtue of zero'ing out the texture description. Zero means "full mip chain please", which means more than 1 mip (unless your texture is 1x1).
Your arguments to D3D11CalcSubresource has a third argument of '1', suggesting only one mip, which appears not to be true. Be sure to pass in the correct number of mips to this helper function or it won't calculate the correct subresource index.
You can get the mip count by calling GetDesc() after the texture has been created.

Is it possible to use GPU just for calculations without copying the variables to GPU. Can we use RAM or even HardDisk for variable stoorage? [duplicate]

I tried the code in this link Is CUDA pinned memory zero-copy?
The one who asked claims the program worked fine for him
But does not work the same way on mine
the values does not change if I manipulate them in the kernel.
Basically my problem is, my GPU memory is not enough but I want to do calculations which require more memory. I my program to use RAM memory, or host memory and be able to use CUDA for calculations. The program in the link seemed to solve my problem but the code does not give output as shown by the guy.
Any help or any working example on Zero copy memory would be useful.
Thank you
__global__ void testPinnedMemory(double * mem)
{
double currentValue = mem[threadIdx.x];
printf("Thread id: %d, memory content: %f\n", threadIdx.x, currentValue);
mem[threadIdx.x] = currentValue+10;
}
void test()
{
const size_t THREADS = 8;
double * pinnedHostPtr;
cudaHostAlloc((void **)&pinnedHostPtr, THREADS, cudaHostAllocDefault);
//set memory values
for (size_t i = 0; i < THREADS; ++i)
pinnedHostPtr[i] = i;
//call kernel
dim3 threadsPerBlock(THREADS);
dim3 numBlocks(1);
testPinnedMemory<<< numBlocks, threadsPerBlock>>>(pinnedHostPtr);
//read output
printf("Data after kernel execution: ");
for (int i = 0; i < THREADS; ++i)
printf("%f ", pinnedHostPtr[i]);
printf("\n");
}
First of all, to allocate ZeroCopy memory, you have to specify cudaHostAllocMapped flag as an argument to cudaHostAlloc.
cudaHostAlloc((void **)&pinnedHostPtr, THREADS * sizeof(double), cudaHostAllocMapped);
Still the pinnedHostPointer will be used to access the mapped memory from the host side only. To access the same memory from device, you have to get the device side pointer to the memory like this:
double* dPtr;
cudaHostGetDevicePointer(&dPtr, pinnedHostPtr, 0);
Pass this pointer as kernel argument.
testPinnedMemory<<< numBlocks, threadsPerBlock>>>(dPtr);
Also, you have to synchronize the kernel execution with the host to read the updated values. Just add cudaDeviceSynchronize after the kernel call.
The code in the linked question is working, because the person who asked the question is running the code on a 64 bit OS with a GPU of Compute Capability 2.0 and TCC enabled. This configuration automatically enables the Unified Virtual Addressing feature of the GPU in which the device sees host + device memory as a single large memory instead of separate ones and host pointers allocated using cudaHostAlloc can be passed directly to the kernel.
In your case, the final code will look like this:
#include <cstdio>
__global__ void testPinnedMemory(double * mem)
{
double currentValue = mem[threadIdx.x];
printf("Thread id: %d, memory content: %f\n", threadIdx.x, currentValue);
mem[threadIdx.x] = currentValue+10;
}
int main()
{
const size_t THREADS = 8;
double * pinnedHostPtr;
cudaHostAlloc((void **)&pinnedHostPtr, THREADS * sizeof(double), cudaHostAllocMapped);
//set memory values
for (size_t i = 0; i < THREADS; ++i)
pinnedHostPtr[i] = i;
double* dPtr;
cudaHostGetDevicePointer(&dPtr, pinnedHostPtr, 0);
//call kernel
dim3 threadsPerBlock(THREADS);
dim3 numBlocks(1);
testPinnedMemory<<< numBlocks, threadsPerBlock>>>(dPtr);
cudaDeviceSynchronize();
//read output
printf("Data after kernel execution: ");
for (int i = 0; i < THREADS; ++i)
printf("%f ", pinnedHostPtr[i]);
printf("\n");
return 0;
}

ID3D11DeviceContext::DrawIndexed() Failed

my program is Directx Program that draws a container cube within it smaller cubes....these smaller cubes fall by time i hope you understand what i mean...
The program isn't complete yet ...it should draws the container only ....but it draws nothing ...only the background color is visible... i only included what i think is needed ...
this is the routines that initialize the program
bool Game::init(HINSTANCE hinst,HWND _hw){
Directx11 ::init(hinst , _hw);
return LoadContent();}
Directx11::init()
bool Directx11::init(HINSTANCE hinst,HWND hw){
_hinst=hinst;_hwnd=hw;
RECT rc;
GetClientRect(_hwnd,&rc);
height= rc.bottom - rc.top;
width = rc.right - rc.left;
UINT flags=0;
#ifdef _DEBUG
flags |=D3D11_CREATE_DEVICE_DEBUG;
#endif
HR(D3D11CreateDevice(0,_driverType,0,flags,0,0,D3D11_SDK_VERSION,&d3dDevice,&_featureLevel,&d3dDeviceContext));
if (d3dDevice == 0 || d3dDeviceContext == 0)
return 0;
DXGI_SWAP_CHAIN_DESC sdesc;
ZeroMemory(&sdesc,sizeof(DXGI_SWAP_CHAIN_DESC));
sdesc.Windowed=true;
sdesc.BufferCount=1;
sdesc.BufferDesc.Format=DXGI_FORMAT_R8G8B8A8_UNORM;
sdesc.BufferDesc.Height=height;
sdesc.BufferDesc.Width=width;
sdesc.BufferDesc.Scaling=DXGI_MODE_SCALING_UNSPECIFIED;
sdesc.BufferDesc.ScanlineOrdering=DXGI_MODE_SCANLINE_ORDER_UNSPECIFIED;
sdesc.OutputWindow=_hwnd;
sdesc.BufferDesc.RefreshRate.Denominator=1;
sdesc.BufferDesc.RefreshRate.Numerator=60;
sdesc.Flags=0;
sdesc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT;
if (m4xMsaaEnable)
{
sdesc.SampleDesc.Count=4;
sdesc.SampleDesc.Quality=m4xMsaaQuality-1;
}
else
{
sdesc.SampleDesc.Count=1;
sdesc.SampleDesc.Quality=0;
}
IDXGIDevice *Device=0;
HR(d3dDevice->QueryInterface(__uuidof(IDXGIDevice),reinterpret_cast <void**> (&Device)));
IDXGIAdapter*Ad=0;
HR(Device->GetParent(__uuidof(IDXGIAdapter),reinterpret_cast <void**> (&Ad)));
IDXGIFactory* fac=0;
HR(Ad->GetParent(__uuidof(IDXGIFactory),reinterpret_cast <void**> (&fac)));
fac->CreateSwapChain(d3dDevice,&sdesc,&swapchain);
ReleaseCOM(Device);
ReleaseCOM(Ad);
ReleaseCOM(fac);
ID3D11Texture2D *back = 0;
HR(swapchain->GetBuffer(0,__uuidof(ID3D11Texture2D),reinterpret_cast <void**> (&back)));
HR(d3dDevice->CreateRenderTargetView(back,0,&RenderTarget));
D3D11_TEXTURE2D_DESC Tdesc;
ZeroMemory(&Tdesc,sizeof(D3D11_TEXTURE2D_DESC));
Tdesc.BindFlags = D3D11_BIND_DEPTH_STENCIL;
Tdesc.ArraySize = 1;
Tdesc.Format= DXGI_FORMAT_D24_UNORM_S8_UINT;
Tdesc.Height= height;
Tdesc.Width = width;
Tdesc.Usage = D3D11_USAGE_DEFAULT;
Tdesc.MipLevels=1;
if (m4xMsaaEnable)
{
Tdesc.SampleDesc.Count=4;
Tdesc.SampleDesc.Quality=m4xMsaaQuality-1;
}
else
{
Tdesc.SampleDesc.Count=1;
Tdesc.SampleDesc.Quality=0;
}
HR(d3dDevice->CreateTexture2D(&Tdesc,0,&depthview));
HR(d3dDevice->CreateDepthStencilView(depthview,0,&depth));
d3dDeviceContext->OMSetRenderTargets(1,&RenderTarget,depth);
D3D11_VIEWPORT vp;
vp.TopLeftX=0.0f;
vp.TopLeftY=0.0f;
vp.Width = static_cast <float> (width);
vp.Height= static_cast <float> (height);
vp.MinDepth = 0.0f;
vp.MaxDepth = 1.0f;
d3dDeviceContext -> RSSetViewports(1,&vp);
return true;
SetBuild() Prepare the matrices inside the container for the smaller cubes ....i didnt program it to draw the smaller cubes yet
and this the function that draws the scene
void Game::Render(){
d3dDeviceContext->ClearRenderTargetView(RenderTarget,reinterpret_cast <const float*> (&Colors::LightSteelBlue));
d3dDeviceContext->ClearDepthStencilView(depth,D3D11_CLEAR_DEPTH | D3D11_CLEAR_STENCIL,1.0f,0);
d3dDeviceContext-> IASetInputLayout(_layout);
d3dDeviceContext-> IASetPrimitiveTopology(D3D10_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
d3dDeviceContext->IASetIndexBuffer(indices,DXGI_FORMAT_R32_UINT,0);
UINT strides=sizeof(Vertex),off=0;
d3dDeviceContext->IASetVertexBuffers(0,1,&vertices,&strides,&off);
D3DX11_TECHNIQUE_DESC des;
Tech->GetDesc(&des);
Floor * Lookup; /*is a variable to Lookup inside the matrices structure (Floor Contains XMMATRX Piese[9])*/
std::vector<XMFLOAT4X4> filled; // saves the matrices of the smaller cubes
XMMATRIX V=XMLoadFloat4x4(&View),P = XMLoadFloat4x4(&Proj);
XMMATRIX vp = V * P;XMMATRIX wvp;
for (UINT i = 0; i < des.Passes; i++)
{
d3dDeviceContext->RSSetState(BuildRast);
wvp = XMLoadFloat4x4(&(B.Memory[0].Pieces[0])) * vp; // Loading The Matrix at translation(0,0,0)
HR(ShadeMat->SetMatrix(reinterpret_cast<float*> ( &wvp)));
HR(Tech->GetPassByIndex(i)->Apply(0,d3dDeviceContext));
d3dDeviceContext->DrawIndexed(build_ind_count,build_ind_index,build_vers_index);
d3dDeviceContext->RSSetState(PieseRast);
UINT r1=B.GetSize(),r2=filled.size();
for (UINT j = 0; j < r1; j++)
{
Lookup = &B.Memory[j];
for (UINT r = 0; r < Lookup->filledindeces.size(); r++)
{
filled.push_back(Lookup->Pieces[Lookup->filledindeces[r]]);
}
}
for (UINT j = 0; j < r2; j++)
{
ShadeMat->SetMatrix( reinterpret_cast<const float*> (&filled[i]));
Tech->GetPassByIndex(i)->Apply(0,d3dDeviceContext);
d3dDeviceContext->DrawIndexed(piese_ind_count,piese_ind_index,piese_vers_index);
}
}
HR(swapchain->Present(0,0));}
thanks in Advance
One bug in your program appears to be that you're using i, the index of the current pass, as an index into the filled vector, when you should apparently be using j.
Another apparent bug is that in the loop where you are supposed to be iterating over the elements of filled, you're not iterating over all of them. The value r2 is set to the size of filled before you append anything to it during that pass. During the first pass this means that nothing will be drawn by this loop. If your technique only has one pass then this means that the second DrawIndexed call in your code will never be executed.
It also appears you should be only adding matrices to filled once, regardless of the number of the passes the technique has. You should consider if your code is actually meant to work with techniques with multiple passes.

Selecting a Region OpenCV

I am new to OpenCV and I want to select a particular region in the video/image for detection. In my case I want to detect cars that are only in the road not in the parking lot.
Well, selecting cars requires use of training data. But to select an ROI (region of interest) is fairly simple:
Consider img = cv2.imread(image)
In that case, somewhere in your code, you can specify a region this way:
sub_image = img[y:y+h, x:x+w]
That will get the ROI once you specify the values, of course, not using 'x' or 'y', where h is the height and w is the width. Remember that images are just 2D matrices.
Use CascadeClassifier() to select the car(s) from the image(s). Documentation is found here. OpenCV comes packed with training data you can use to make classifications in the form of XML files.
If you want to manually select a region of interest (ROI) to do some processing on it, then you may trying using mouse click event to select start and stop points of your ROI.
Once you have start and stop point you can use it to retrieve image from selected region.
The can be done on image or capture video frame.
bool roi_captured = false;
Point pt1, pt2;
Mat cap_img;
//Callback for mousclick event, the x-y coordinate of mouse button-up and button-down
//are stored in two points pt1, pt2.
void mouse_click(int event, int x, int y, int flags, void *param)
{
switch(event)
{
case CV_EVENT_LBUTTONDOWN:
{
std::cout<<"Mouse Pressed"<<std::endl;
if(!roi_capture)
{
pt1.x = x;
pt1.y = y;
}
else
{
std::cout<<"ROI Already Acquired"<<std::endl;
}
break;
}
case CV_EVENT_LBUTTONUP:
{
if(!got_roi)
{
Mat cl;
std::cout<<"Mouse LBUTTON Released"<<std::endl;
pt2.x = x;
pt2.y = y;
cl = cap_img.clone();
Mat roi(cl, Rect(pt1, pt2));
Mat prev_imgT = roi.clone();
std::cout<<"PT1"<<pt1.x<<", "<<pt1.y<<std::endl;
std::cout<<"PT2"<<pt2.x<<","<<pt2.y<<std::endl;
imshow("Clone",cl);
got_roi = true;
}
else
{
std::cout<<"ROI Already Acquired"<<std::endl;
}
break;
}
}
}
//In main open video and wait for roi event to complete by the use.
// You capture roi in pt1 and pt2 you can use the same coordinates for processing // //subsequent frame
int main(int argc, char *argv[])
{
int frame_num = 0;
int non_decode_frame =0;
int count = 1, idx =0;
int frame_pos =0;
std::cout<<"Video File "<<argv[1]<<std::endl;
cv::VideoCapture input_video(argv[1]);
namedWindow("My_Win",1);
cvSetMouseCallback("My_Win", mouse_click, 0);
sleep(1);
while(input_video.grab())
{
cap_img.release();
if(input_video.retrieve(cap_img))
{
imshow("My_Win", cap_img);
if(!got_roi)
{
//Wait here till user select the desire ROI
waitKey(0);
}
else
{
std::cout<<"Got ROI disp prev and curr image"<<std::endl;
std::cout<<"PT1"<<pt1.x<<" "<<pt1.y<<std::endl;
std::cout<<"PT2"<<pt2.x<<" "<<pt2.y<<std::endl;
Mat curr_img_t1;
Mat roi2(cap_img,Rect(pt1, pt2));
Mat curr_imgT = roi2.clone();
cvtColor(curr_imgT, curr_img_t1, CV_RGB2GRAY);
imshow("curr_img", curr_img);
// Do remaining processing here on capture roi for every frame
waitKey(1);
}
}
}
}
You didn't tag in what programming language you are writing with. Anyway, I answer you in python. (You can easily convert it to C++ if you want)
def mouse_drawing(event, x, y, flags, params):
if event == cv2.EVENT_LBUTTONDOWN:
car = img[y: y + carheight, x: x + carwidth]
cv2.imwrite("car", car)
cv2.namedWindow("my_img")
cv2.setMouseCallback("my_img", mouse_drawing)
while True:
cv2.imshow("my_img", img)
key = cv2.waitKey(1)
if key == 27:
break
As in other answers was told, if you want to find cars automatically, that would be another problem and has to do with training data and other things.

Resources