I want to make an Identity matrix and then subtract some different floats from its diagonal elements. Here is what I have done:
cv::Mat R = cv::Mat::eye(trainingMat.rows,trainingMat.rows, CV_32F);
(trainingMat is an other matrix)And here is the strange thing. When I write:
std::cerr<<R.at<double>(0,0)<<std::endl;
I get an strange number(but it should be 1.0f right?). And when I do this:
for(unsigned int i = 0; i < trainingMat.rows; i++){
std::cerr<<R.at<double>(i,i)<<std::endl;
}
again I get some strange numbers. What am I doing wrong?
I met this kind of situation several times which turn out to be the problem of Wrong Type of the Mat.
Here are some pairs that you may keep in mind.
CV_8U <-> uchar
CV_32S <-> int
CV_32F <-> float
CV_64F <-> double
TYPE float takes up 4 bytes, and double takes up 8 bytes in the memory.
You try to using double to take one element, but in fact, you take two elements, and use these two to represent one. so you got the unexpected number.
Related
I'm having an issue trying to perform a two dimensional transform on an array of floats using cuFFT. I've had a look at the documentation, but some of the information is contradictory/not clear; so I have a few questions:
My data is 480 rows, with 640 columns (e.g. float data[480][640] but in a single dimension so float data[480*640])
If we say my input dimensions (of real data) are N1 = 480 and N2 = 640. Are the dimensions (after a real to complex transform) N1=480, N2=321?
Can I cudaMemcpy the data directly into a cufftReal array of the same size? Or must it be acufftComplex array?
If it must be acufftComplex array, I am assuming the elements need to be in the place of the real components?
What is the correct structure of a call to cufftPlan2d, cufftExecR2C and cufftC2R given the above values.
I think that's all for now...
Many thanks in advance
EDIT: So, I've implemented the Forward and Inverse transforms as suggested by JackOLantern. However my results are not what I am expecting (an identical Result after FFT as Before it). I have an image gallery here showing two sets of examples. The first is from my room, the second from my University Project.
In the cuFFT Documentation, there is ambiguity in the use of cufftPlan2d (hence why I asked). In the documentation, for a two dimensional array, the data should be input as above (float data[480][640] == float data[NY][NX]) So NY represents the rows. However in the function listing for cufftPlan2d, it states that nx (the parameter) is for the rows...
Swapping the values of NX and NY in the function call gives the result as in the project image (correct orientation, but split into three partially overlapping images at 1/4 the normal size) however, using the parameters as JackOLantern states in his answer gives a slanted/skewed result.
Am I doing something wrong here? Or does the cuFFT library have issues with this type of thing.
ALSO: I have undone a couple of the edits made by JackOLantern to this question as my issues MAY stem from the fact my data is coming from OpenCV.
EDIT: I've recently found out that I was the one who made a mistake in the way I used the function.
Originally I though the function definition referred to the size of the data being passed into it.
However, it appears that the parameters actually refer directly to the size of the REAL part.
This means that the parameters refer to:
The size of the input data when using R2C (Real to Complex)
The size of the output data when using C2R (Complex to Real)
So it appears that the cuFFT documentation and the library itself do not correspond.
When performing an R2C followed by a C2R (real to complex, complex to real respectively), the documentation states that for a Real input of NX x NY dimensions, the Complex output is NX x (floor(NY/2) +1); and vice versa.
However the actual output is of dimensions NX x NY and the actual input is of dimensions NX x NY. This is (half) mentioned on the very first page as
C2R - Symmetric complex input to real output
Implying that the complex data must be Symmetric, i.e. must also have the redundant data in addition to the non-redundant data.
There are a number of other contradictions within the documentation as well which I won't go into.
Needless to say, the problem has been solved.
I have included a MWE below. Near the top are a couple of lines with #define NUM_C2 and appropriate comments. Changing this changes whether the documentation format is followed, or my "fix".
The output is
The Input Real data
The Intermediate Complex data
The output Real data
The ratio of the output data to the input data (there are minor FFT errors, ~1 indicates correct)
Feel free to change the parameters (NUM_R and NUM_C) and feel free to comment if you think I have made a mistake somewhere.
#include <iostream>
#include <math.h>
#include <cufft.h>
// e.g. float data[NUM_R][NUM_C]
#define NUM_R 12
#define NUM_C 16
// Documentation Version
//#define NUM_C2 (1+NUM_C/2)
// "Correct" Version
#define NUM_C2 NUM_C
using namespace std;
int main(int argc, char** argv)
{
cufftReal *in_h, *out_h, *in_d, *out_d;
cufftComplex *mid_d, *mid_h;
cufftHandle pF, pI;
int r, c;
in_h = (cufftReal*) malloc(NUM_R * NUM_C * sizeof(cufftReal));
out_h= (cufftReal*) malloc(NUM_R * NUM_C * sizeof(cufftReal));
mid_h= (cufftComplex*)malloc(NUM_C2*NUM_R*sizeof(cufftComplex));
cudaMalloc((void**) &in_d, NUM_R * NUM_C * sizeof(cufftReal));
cudaMalloc((void**)&out_d, NUM_R * NUM_C * sizeof(cufftReal));
cudaMalloc((void**)&mid_d, NUM_C2 * NUM_R * sizeof(cufftComplex));
cufftPlan2d(&pF, NUM_R, NUM_C, CUFFT_R2C);
cufftPlan2d(&pI, NUM_R,NUM_C2, CUFFT_C2R);
cout<<endl<<"------"<<endl;
for(r=0; r<NUM_R; r++)
{
for(c=0; c<NUM_C; c++)
{
in_h[c + NUM_C * r] = cos(2.0*M_PI*(c*7.0/NUM_C+r*3.0/NUM_R));
out_h[c+ NUM_C * r] = 0.f;
cout<<in_h[c+NUM_C*r];
if(c<(NUM_C-1)) cout<<", ";
else cout<<endl;
}
}
cudaMemcpy((cufftReal*)in_d, (cufftReal*)in_h, NUM_R * NUM_C * sizeof(cufftReal),cudaMemcpyHostToDevice);
cufftExecR2C(pF, (cufftReal*)in_d, (cufftComplex*)mid_d);
cudaMemcpy((cufftComplex*)mid_h, (cufftComplex*)mid_d, NUM_C2*NUM_R*sizeof(cufftComplex), cudaMemcpyDeviceToHost);
cout<<endl<<"------"<<endl;
for(r=0; r<NUM_R; r++)
{
for(c=0; c<NUM_C2; c++)
{
cout<<mid_h[c+(NUM_C2)*r].x<<"|"<<mid_h[c+(NUM_C2)*r].y;
if(c<(NUM_C2-1)) cout<<", ";
else cout<<endl;
}
}
cufftExecC2R(pI, (cufftComplex*)mid_d, (cufftReal*)out_d);
cudaMemcpy((cufftReal*)out_h, (cufftReal*)out_d, NUM_R*NUM_C*sizeof(cufftReal), cudaMemcpyDeviceToHost);
cout<<endl<<"------"<<endl;
for(r=0; r<NUM_R; r++)
{
for(c=0; c<NUM_C; c++)
{
cout<<out_h[c+NUM_C*r]/(NUM_R*NUM_C);
if(c<(NUM_C-1)) cout<<", ";
else cout<<endl;
}
}
cout<<endl<<"------"<<endl;
for(r=0; r<NUM_R; r++)
{
for(c=0; c<NUM_C; c++)
{
cout<<(out_h[c+NUM_C*r]/(NUM_R*NUM_C))/in_h[c+NUM_C*r];
if(c<(NUM_C-1)) cout<<", ";
else cout<<endl;
}
}
free(in_h);
free(out_h);
free(mid_h);
cudaFree(in_d);
cudaFree(out_h);
cudaFree(mid_d);
return 0;
}
1) If we say my input dimensions (of real data) are N1 = 480 and N2 = 640. Are the dimensions (after a real to complex transform) N1=480, N2=321?
The output of cufftExecR2C is a NX*(NY/2+1) cufftComplex matrix. So in your case, you will have a 480x321 float2 matrix as output.
2) Can I cudaMemcpy the data directly into a cufftReal array of the same size? Or must it be a cufftComplex array?
If it must be a cufftComplex array, I am assuming the elements need to be in the place of the real components?
Yes, you can copy the data to a cufftReal array and the N1xN2 data.
3) What is the correct structure of a call to cufftPlan2d, cufftExecR2C and cufftC2R given the above values.
cufftPlan2d(&plan, N1, N2, CUFFT_R2C);
cufftExecR2C(plan, (cufftReal*)idata, (cufftComplex*) odata);
I have a matrix of 630 values (values range from 0-35)...
I want to find the most frequently occurring value in this matrix. So how do I write a histogram for this? Also is there any other way that I can find the most frequently occurring value (I don't want to use counters as i will need 36 counters and My code would become very inefficient)
..Thanks!
You can use calcHist with a Mat of size 1xN, where N is 630 in your case.
I don't understand your argument against counters. To build the histogram, you must use counters anyway. There are ways to make counting very efficient.
OR
Assuming your image is a cv::Mat variable im with size 1x630 and type CV_8UC1, try:
std::vector<int> counts(36, 0);
for (int c = 0; c < 630; c++)
counts.at(im.at<unsigned char>(1, c)) += 1;
std::cout << "Most frequently occuring value: " << std::max_element(counts);
This uses counting, but will not take more than 0.1ms on an average PC.
Why not do it manually?
Mat myimage(cvSize(1,638), CV_8U);
randn(myimage, Scalar::all(128), Scalar::all(20)); //Random fill
vector<int> histogram(256);
for (int i=0;i<638;i++)
histogram[(int)myimage.at<uchar>(i,0)]++;
can any one help me about how to get the absolute value of a complex matrix.the matrix contains real value in one channel and imaginary value in another one channel.please help me
if s possible means give me some example.
Thanks in advance
Arangarajan
Let's assume you have 2 components: X and Y, two matrices of the same size and type. In your case it can be real/im values.
// n rows, m cols, type float; we assume the following matrices are filled
cv::Mat X(n,m,CV_32F);
cv::Mat Y(n,m,CV_32F);
You can compute the absolute value of each complex number like this:
// create a new matrix for storage
cv::Mat A(n,m,CV_32F,cv::Scalar(0.0));
for(int i=0;i<n;i++){
// pointer to row(i) values
const float* rowi_x = X.ptr<float>(i);
const float* rowi_y = Y.ptr<float>(i);
float* rowi_a = A.ptr<float>(i);
for(int j=0;j<=m;j++){
rowi_a[j] = sqrt(rowi_x[j]*rowi_x[j]+rowi_y[j]*rowi_y[j]);
}
}
If you look in the OpenCV phasecorr.cpp module, there's a function called magSpectrums that does this already and will handle conjugate symmetry-packed DFT results too. I don't think it's exposed by the header file, but it's easy enough to copy it. If you care about speed, make sure you compile with any available SIMD options turned on too because they can make a big difference with this calculation.
Currently, I'm working on a project in medical engineering. I have a big image with several sub-images of the cell, so my first task is to divide the image.
I thought about the next thing:
Convert the image into binary
doing a projection of the brightness pixels into the x-axis so I can see where there are gaps between brightnesses values and then divide the image.
The problem comes when I try to reach the second part. My idea is using a vector as the projection and sum all the brightnesses values all along one column, so the position number 0 of the vector is the sum of all the brightnesses values that are in the first column of the image, the same until I reach the last column, so at the end I have the projection.
This is how I have tried:
void calculo(cv::Mat &result,cv::Mat &binary){ //result=the sum,binary the imag.
int i,j;
for (i=0;i<=binary.rows;i++){
for(j=0;j<=binary.cols;j++){
cv::Scalar intensity= binaria.at<uchar>(j,i);
result.at<uchar>(i,i)=result.at<uchar>(i,i)+intensity.val[0];
}
cv::Scalar intensity2= result.at<uchar>(i,i);
cout<< "content" "\n"<< intensity2.val[0] << endl;
}
}
When executing this code, I have a violation error. Another problem is that I cannot create a matrix with one unique row, so...I don't know what could I do.
Any ideas?! Thanks!
At the end, it does not work, I need to sum all the pixels in one COLUMN. I did:
cv::Mat suma(cv::Mat& matrix){
int i;
cv::Mat output(1,matrix.cols,CV_64F);
for (i=0;i<=matrix.cols;i++){
output.at<double>(0,i)=norm(matrix.col(i),1);
}
return output;
}
but It gave me a mistake:
Assertion failed (0 <= colRange.start && colRange.start <= colRange.end && colRange.end <= m.cols) in Mat, file /home/usuario/OpenCV-2.2.0/modules/core/src/matrix.cpp, line 276
I dont know, any idea would be helpful, anyway many thanks mevatron, you really left me in the way.
If you just want the sum of the binary image, you could simply take the L1-norm. Like so:
Mat binaryVectorSum(const Mat& binary)
{
Mat output(1, binary.rows, CV_64F);
for(int i = 0; i < binary.rows; i++)
{
output.at<double>(0, i) = norm(binary.row(i), NORM_L1);
}
return output;
}
I'm at work, so I can't test it out, but that should get you close.
EDIT : Got home. Tested it. It works. :) One caveat...this function works if your binary matrix is truly binary (i.e., 0's and 1's). You may need to scale the norm output with the maximum value if the binary matrix is say 0's and 255's.
EDIT : If you don't have using namespace cv; in your .cpp file, then you'll need to declare the namespace to use NORM_L1 like this cv::NORM_L1.
Have you considered transposing the matrix before you call the function? Like this:
sumCols = binaryVectorSum(binary.t());
vs.
sumRows = binaryVectorSum(binary);
EDIT : A bug with my code :)
I changed:
Mat output(1, binary.cols, CV_64F);
to
Mat output(1, binary.rows, CV_64F);
My test case was a square matrix, so that bug didn't get found...
Hope that is helpful!
I am trying to implement the overlap and add method in oder to apply a filter in a real time context. However, it seems that there is something I am doing wrong, as the resulting output has a larger error than I would expect. For comparing the accuracy of my computations I created a file, that I am processing in one chunk. I am comparing this with the output of the overlap and add process and take the resulting comparison as an indicator for the accuracy of the computation. So here is my process of doing Overlap and add:
I take a chunk of length L from my input signal
I pad the chunk with zeros to length L*2
I transform that signal into frequency domain
I multiply the signal in frequency domain with my filter response of length L*2 in frequency domain (the filter response is actually created by interpolating control points in the UI - so this is not transformed from time domain. However using length L*2 in frequency domain should be similar to using a ffted time domain signal of length L padded to L*2)
Then I transform the resulting signal back to time domain and add it to the output stream with an overlap of L
Is there anything wrong with that procedure? After reading a lot of different papers and books I've gotten pretty unsure which is the right way to deal with that.
Here is some more data from the tests I have been running:
I created a signal, which consists of three cosine waves
I used this filter function in the time domain for filtering. (It's symmetric, as it is applied to the whole output of the FFT, which also is symmetric for real input signals)
The output of the IFFT looks like this: It can be seen that low frequencies are attenuated more than frequency in the mid range.
For the overlap add/save and the windowed processing I divided the input signal into 8 chunks of 256 samples. After reassembling them they look like that. (sample 490 - 540)
Output Signal overlap and add:
output signal overlap and save:
output signal using STFT with Hanning window:
It can be seen that the overlap add/save processes differ from the STFT version at the point where chunks are put together (sample 511). This is the main error which leads to different results when comparing windowed process and overlap add/save. However the STFT is closer to the output signal, which has been processed in one chunk.
I am pretty much stuck at this point since a few days. What is wrong here?
Here is my source
// overlap and add
// init Buffers
for (UInt32 j = 0; j<samples; j++){
output[j] = 0.0;
}
// process multiple chunks of data
for (UInt32 i = 0; i < (float)div * 2; i++){
for (UInt32 j = 0; j < chunklength/2; j++){
// copy input data to the first half ofcurrent buffer
inBuffer[j] = input[(int)((float)i * chunklength / 2 + j)];
// pad second half with zeros
inBuffer[j + chunklength/2] = 0.0;
}
// clear buffers
for (UInt32 j = 0; j < chunklength; j++){
outBuffer[j][0] = 0.0;
outBuffer[j][8] = 0.0;
FFTBuffer[j][0] = 0.0;
FFTBuffer[j][9] = 0.0;
}
FFT(inBuffer, FFTBuffer, chunklength);
// processing
for(UInt32 j = 0; j < chunklength; j++){
// multiply with filter
FFTBuffer[j][0] *= multiplier[j];
FFTBuffer[j][10] *= multiplier[j];
}
// Inverse Transform
IFFT((const double**)FFTBuffer, outBuffer, chunklength);
for (UInt32 j = 0; j < chunklength; j++){
// copy to output
if ((int)((float)i * chunklength / 2 + j) < samples){
output[(int)((float)i * chunklength / 2 + j)] += outBuffer[j][0];
}
}
}
After the suggestion below, I tried the following:
IFFTed my Filter. This looks like this:
set the second half to zero:
FFTed the signal and compared the magnitudes to the old filter (blue):
After trying to do overlap and add with this filter, the results have obviously gotten worse instead of better. In order to make sure my FFT works correctly, I tried to IFFT and FFT the filter without setting the second half zero. The result is identical to the orignal filter. So the problem shouldn't be the FFTing. I suppose that this is more of some general understanding of the overlap and add method. But I still can't figure out what is going wrong...
One thing to check is the length of the impulse response of your filter. It must be shorter than the length of zero padding used before the fast convolution FFT, or you will get wrap around errors.
I think the problem might be in the windowing approach that you are using. You simply add zeros to the chunks so there is no actual overlap. In the overlap and add method, you need to damp the edges of the window. What this means is that where you add zeros to the chunk you instead have add weighted input signal and the weight in your case should be 0.5 since only two windows overlap.
Rest of the procedure seems OK. You then simply take FTs, multiply and take inverse FTS and finally add up all the chunks to get the final signal which should be exactly the same if you filtered the whole signal at once.