I am having trouble understanding the inner workings of OpenCV. Consider the following code:
Scalar getAverageColor(Mat img, vector<Rect>& rois) {
    int n = static_cast<int>(rois.size());
    Mat avgs(1, n, CV_8UC3);
    for (int i = 0; i < n; ++i) {
        // What is the correct way to assign the color elements in
        // the matrix?
        avgs.at<Scalar>(i) = mean(Mat(img, rois[i]));
        /*
        This seems to always work, but there has to be a better way.
        avgs.at<Vec3b>(i)[0] = mean(Mat(img, rois[i]))[0];
        avgs.at<Vec3b>(i)[1] = mean(Mat(img, rois[i]))[1];
        avgs.at<Vec3b>(i)[2] = mean(Mat(img, rois[i]))[2];
        */
    }
    // If I access the first element it seems to be set correctly.
    Scalar first = avgs.at<Scalar>(0);
    // However mean returns [0 0 0 0] if I did the assignment above using Scalar, why???
    Scalar avg = mean(avgs);
    return avg;
}
If I use avgs.at<Scalar>(i) = mean(Mat(img, rois[i])) for the assignment in the loop, the first element looks correct, but the final mean calculation always returns zero. If I assign all the color elements by hand using Vec3b, it seems to work, but why?
Note: cv::Scalar is a typedef for cv::Scalar_<double>, which derives from cv::Vec<double, 4>, which derives from cv::Matx<double, 4, 1>.
Similarly, cv::Vec3b is cv::Vec<uint8_t, 3> which derives from cv::Matx<uint8_t, 3, 1> -- this means that we can use any of those 3 in cv::Mat::at and get identical (correct) behaviour.
It's important to be aware that cv::Mat::at is basically a reinterpret_cast on the underlying data array. You need to be extremely careful to use an appropriate data type for the template argument, one which corresponds to the type of elements (including channel count) of the cv::Mat you're invoking it on.
The documentation mentions the following:
Keep in mind that the size identifier used in the at operator cannot be chosen at random. It depends on the image from which you are trying to retrieve the data. The table below gives a better insight in this:
If matrix is of type CV_8U then use Mat.at<uchar>(y,x).
If matrix is of type CV_8S then use Mat.at<schar>(y,x).
If matrix is of type CV_16U then use Mat.at<ushort>(y,x).
If matrix is of type CV_16S then use Mat.at<short>(y,x).
If matrix is of type CV_32S then use Mat.at<int>(y,x).
If matrix is of type CV_32F then use Mat.at<float>(y,x).
If matrix is of type CV_64F then use Mat.at<double>(y,x).
The documentation doesn't mention what to do in the case of multiple channels -- there, you use cv::Vec<...> (or rather one of the provided typedefs). cv::Vec<...> is basically a wrapper around a fixed-size array of N values of a given type.
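For example, a minimal sketch of correct multi-channel access (the matrix and its values are made up for illustration):
#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    cv::Mat m(2, 2, CV_8UC3, cv::Scalar(10, 20, 30)); // 3-channel, 8-bit matrix
    cv::Vec3b px = m.at<cv::Vec3b>(0, 0);             // matches CV_8UC3: 3 x uchar
    std::cout << int(px[0]) << " " << int(px[1]) << " " << int(px[2]) << "\n"; // 10 20 30
    // m.at<uchar>(0, 0) would read only the first byte of the element, and
    // m.at<cv::Vec3d>(0, 0) would reinterpret the bytes as 3 doubles (and
    // trips an assertion in debug builds) -- both are mismatches to avoid.
    return 0;
}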
In your case, the matrix avgs is CV_8UC3 -- each element consists of 3 unsigned byte values (i.e. 3 bytes total). However, by using avgs.at<Scalar>(i), you interpret each element as 4 doubles (32 bytes in total). That means that:
The element you actually tried to write to (if interpreted correctly) will hold only the first 3 bytes of the (8-byte floating point) mean of the first channel -- i.e. complete garbage.
You also overwrite the next 10 elements (the last one only partially; its 3rd channel escapes unscathed) with more garbage.
At some point, you are bound to overflow the buffer and potentially trash other data structures. This issue is rather serious.
We can demonstrate it using the following simple program.
Example:
#include <opencv2/opencv.hpp>

int main()
{
    cv::Mat test_mat(cv::Mat::zeros(1, 12, CV_8UC3)); // 12 * 3 = 36 bytes of data
    std::cout << "Before: " << test_mat << "\n";

    cv::Scalar test_scalar(cv::Scalar::all(1234.5678));
    test_mat.at<cv::Scalar>(0, 0) = test_scalar;

    std::cout << "After: " << test_mat << "\n";
    return 0;
}
Output:
Before: [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
After: [173, 250, 92, 109, 69, 74, 147, 64, 173, 250, 92, 109, 69, 74, 147, 64, 173, 250, 92, 109, 69, 74, 147, 64, 173, 250, 92, 109, 69, 74, 147, 64, 0, 0, 0, 0]
This clearly shows we're writing way more than we should.
In Debug mode, the incorrect use of at also triggers an assertion:
OpenCV(3.4.3) Error: Assertion failed (((((sizeof(size_t)<<28)|0x8442211) >> ((traits::Depth<_Tp>::value) & ((1 << 3) - 1))*4) & 15) == elemSize1()) in cv::Mat::at, file D:\code\shit\so07\deps\include\opencv2/core/mat.inl.hpp, line 1102
To allow assignment of the result from cv::mean (which is a cv::Scalar) to our CV_8UC3 matrix, we need to do two things (not necessarily in this order):
Convert the values from double to uint8_t -- OpenCV will do a saturate_cast, but given that the mean won't go past the min/max of the input items, we'd be fine with a regular cast.
Get rid of the 4th element.
To remove the 4th element, we can use cv::Matx::get_minor (the documentation is a bit lacking, but a look at the implementation explains it fairly well). The result is a cv::Matx, so we have to use that instead of cv::Vec when calling cv::Mat::at.
The two possible options then are:
Get rid of the 4th element first, and then cast the result to convert the cv::Matx to uint8_t element type.
Cast the cv::Scalar to cv::Scalar_<uint8_t> first, and then get rid of the 4th element.
Example:
#include <opencv2/opencv.hpp>

typedef cv::Matx<uint8_t, 3, 1> Mat31b; // Convenience, OpenCV only has typedefs for double and float variants

int main()
{
    cv::Mat test_mat(1, 12, CV_8UC3); // 12 * 3 = 36 bytes of data
    test_mat = cv::Scalar(1, 1, 1); // Set all elements to 1
    std::cout << "Before: " << test_mat << "\n";

    cv::Scalar test_scalar{ 2, 3, 4, 0 };

    // Option 1: drop the 4th element first, then cast to uint8_t
    cv::Matx31d temp = test_scalar.get_minor<3, 1>(0, 0);
    test_mat.at<Mat31b>(0, 0) = static_cast<Mat31b>(temp);

    // Option 2: cast to uint8_t first, then drop the 4th element
    // cv::Scalar_<uint8_t> temp(static_cast<cv::Scalar_<uint8_t>>(test_scalar));
    // test_mat.at<Mat31b>(0, 0) = temp.get_minor<3, 1>(0, 0);

    std::cout << "After: " << test_mat << "\n";
    return 0;
}
NB: You can get rid of the explicit temporaries; they're here just for easier readability.
Output:
Both options produce the following output:
Before: [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
After: [ 2, 3, 4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
As we can see, only the first 3 bytes were changed, so it behaves correctly.
Some thoughts about performance.
It's hard to guess which of the two approaches is better. Casting first means you allocate a smaller temporary, but then you have to do 4 saturate_casts instead of 3. Some benchmarking would have to be done (exercise for the reader). In any case, the calculation of the mean will outweigh either approach significantly, so the difference is likely to be irrelevant.
Given that we don't really need the saturate_casts, perhaps the simple, but more verbose approach (optimized version of the thing that worked for you) might perform better in a tight loop.
cv::Vec3b& current_element(avgs.at<cv::Vec3b>(i));
cv::Scalar current_mean(cv::mean(cv::Mat(img, rois[i])));
for (int n(0); n < 3; ++n) {
    current_element[n] = static_cast<uint8_t>(current_mean[n]);
}
Update:
One more idea that came up in discussion with @alkasm. The assignment operator of cv::Mat is vectorized when given a cv::Scalar (it assigns the same value to all elements), and it ignores any extra channel values the cv::Scalar holds relative to the target cv::Mat type (e.g. for a 3-channel Mat it ignores the 4th value).
We can therefore take a 1x1 ROI of the target Mat and assign it the mean Scalar. The necessary type conversions happen automatically, and the 4th channel is discarded. Probably not optimal, but it's by far the least amount of code so far.
test_mat(cv::Rect(0, 0, 1, 1)) = test_scalar;
The result is the same as before.
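Applied to the original getAverageColor from the question, this trick might look as follows (a minimal sketch, built only from the question's code plus the 1x1 ROI assignment shown above):
Scalar getAverageColor(Mat img, vector<Rect>& rois) {
    int n = static_cast<int>(rois.size());
    Mat avgs(1, n, CV_8UC3);
    for (int i = 0; i < n; ++i) {
        // Assign the Scalar from mean() to a 1x1 ROI of avgs; OpenCV
        // converts the doubles and discards the unused 4th channel.
        avgs(Rect(i, 0, 1, 1)) = mean(Mat(img, rois[i]));
    }
    return mean(avgs);
}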
I'm using OpenCV 3.2.0 to do some Fourier space calculations. To obtain a phase image after an inverse DFT, I tried using cv::phase() but I noticed that in some cases, it returned values close to 2*Pi where it should (in my eyes) return a value close to zero. I wonder if this function is implemented badly or if I'm using it wrong.
This is my example data, a 7x8 FFT where the imaginary part is zero or, due to rounding errors, very close to zero (value pairs in the form of real, imag):
0.75686288, 0, 0.74509817, -3.6017641e-19, 0.74117655, -4.8023428e-19, 0.76078451, -1.3206505e-18, 0.77647072, 0, 0.74509817, -3.6017641e-19, 0.72549027, 4.8023428e-19, 0.70588243, 2.0410032e-18;
0.70980388, 0, 0.66666675, -6.6032515e-19, 0.69803929, -3.8418834e-18, 0.73725492, -5.3426161e-18, 0.69803923, 0, 0.6549021, -6.6032515e-19, 0.5725491, 3.8418834e-18, 0.5411765, 6.6632662e-18;
0.63529414, 0, 0.6352942, -1.7408535e-18, 0.63921577, -5.1625314e-18, 0.61960787, -3.1815585e-18, 0.60784316, 0, 0.55686277, -1.7408535e-18, 0.4705883, 5.1625314e-18, 0.45882356, 6.6632657e-18;
0.58039224, 0, 0.58431381, -6.6032412e-19, 0.63921583, -7.8038246e-18, 0.63921577, -7.9839117e-18, 0.50196087, 0, 0.45490205, -6.6032412e-19, 0.38431379, 7.8038246e-18, 0.35686284, 9.3045593e-18;
0.54117656, 0, 0.58431375, -9.0044183e-19, 0.68627465, -9.1244722e-18, 0.6156863, -6.7833236e-18, 0.48627454, 0, 0.45490202, -9.0044183e-19, 0.38823539, 9.1244722e-18, 0.36470592, 8.5842074e-18;
0.50980395, 0, 0.56470597, -6.0029469e-19, 0.57254916, -4.8023546e-18, 0.54901963, -3.9619416e-18, 0.4784314, 0, 0.42352945, -6.0029469e-19, 0.41568634, 4.8023546e-18, 0.39999998, 5.162531e-18;
0.49411768, 0, 0.50588238, 4.8023392e-19, 0.54509813, -1.6808249e-18, 0.56078434, -3.241587e-18, 0.49803928, 0, 0.49411774, 4.8023392e-19, 0.49019611, 1.6808249e-18, 0.47058827, 2.2811191e-18
I then applied cv::phase() like this:
Mat planes[2];
split(output,planes);
Mat ph;
phase(planes[0],planes[1],ph);
Then, cout<<ph yields:
0, 6.2831855, 6.2831855, 6.2831855, 0, 6.2831855, 6.6180405e-19, 2.8908079e-18;
0, 6.2831855, 6.2831855, 6.2831855, 0, 6.2831855, 6.7087144e-18, 1.2309944e-17;
0, 6.2831855, 6.2831855, 6.2831855, 0, 6.2831855, 1.096805e-17, 1.451942e-17;
0, 6.2831855, 6.2831855, 6.2831855, 0, 6.2831855, 2.0301558e-17, 2.6067677e-17;
0, 6.2831855, 6.2831855, 6.2831855, 0, 6.2831855, 2.3497438e-17, 2.3532349e-17;
0, 6.2831855, 6.2831855, 6.2831855, 0, 6.2831855, 1.1550381e-17, 1.2903592e-17;
0, 9.4909814e-19, 6.2831855, 6.2831855, 0, 9.7169579e-19, 3.4281555e-18, 4.8463495e-18
So the output oscillates between the lowest and highest possible values. I was expecting a matrix of (near) zeros, though, because a non-existent phase shift would be in line with the underlying physics of the application. I then tried computing the phase image pixel by pixel:
Mat_<double> myPhase = Mat_<double>(8, 7);
for (int i = 0; i < fftReal.rows; i++) {
    for (int j = 0; j < fftReal.cols; j++) {
        float fftRealVal = planes[0].at<float>(i, j);
        float fftImagVal = planes[1].at<float>(i, j);
        double angle = atan2(fftImagVal, fftRealVal);
        myPhase(i, j) = angle;
    }
}
Here, the output of cout<<myPhase is what I expected to see, a matrix of near zeros:
0, -4.833945789050036e-19, -6.479350716073673e-19, -1.735906137457605e-18, 0, -4.833945789050036e-19, 6.619444609555068e-19;
0, -9.904875721669217e-19, -5.503821154321125e-18, -7.246633215917781e-18, 0, -1.00828074413082e-18, 6.710137932686301e-18;
0, -2.740232027682232e-18, -8.076351618590122e-18, -5.13479354918468e-18, 0, -3.126180439429062e-18, 1.097037782204674e-17;
0, -1.130084743690479e-18, -1.220843476128649e-17, -1.249016668765776e-17, 0, -1.451574279023501e-18, 2.030586625060691e-17;
0, -1.541024556489219e-18, -1.329565697018843e-17, -1.101749982000204e-17, 0, -1.979419217601631e-18, 2.350242300975683e-17;
0, -1.063021695913201e-18, -8.387671795472417e-18, -7.216393147084068e-18, 0, -1.417362295683461e-18, 1.155283208729227e-17;
0, 9.492995611309157e-19, -3.083527284733071e-18, -5.78045227144509e-18, 0, 9.719018577786209e-19, 3.428882635099473e-18;
4.847377651234249e-18, 6.937607420147441e-310, 6.937607420153765e-310, 6.93760742011582e-310, 6.93760742011503e-310, 6.937607420163251e-310, 6.937607420188547e-310
So does cv::phase() yield a wrong result here due to rounding errors, or does it work as intended and I'm missing some pre-processing?
Note that
2*pi - 6.479350716073673e-19 == 6.28318530717959
Your two results are equivalent.
The C++ std::atan2 function returns a value in the range (-π , +π], so for any angle close to zero, whether positive or negative, you get a value close to zero.
The OpenCV cv::phase function is documented to use atan2, but it seems to return a value in the range [0, 2π) instead.
If you need the output to be in the (-π , +π] range, you can do (modified from here):
float pi = 3.14159265358979;
cv::subtract(ph, 2*pi, ph, (ph > pi));
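A minimal sketch to verify the wrap-around behaviour (a single made-up value pair; a tiny negative imaginary part yields an angle just below 2*pi from cv::phase):
#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    // One value with a positive real part and a tiny negative imaginary part.
    cv::Mat re = (cv::Mat_<float>(1, 1) << 0.75f);
    cv::Mat im = (cv::Mat_<float>(1, 1) << -1e-19f);

    cv::Mat ph;
    cv::phase(re, im, ph);
    std::cout << "cv::phase: " << ph << "\n"; // ~6.2831855, i.e. ~2*pi

    // Map from [0, 2*pi) back to (-pi, +pi], as std::atan2 would return.
    float pi = 3.14159265358979;
    cv::subtract(ph, 2*pi, ph, (ph > pi));
    std::cout << "shifted:   " << ph << "\n"; // ~0
    return 0;
}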
Apologies if this seems trivial; I'm relatively new to OpenCV.
Essentially, I'm trying to create a function that can take in a camera's image, the known world coordinates of that image, and the world coordinates of some other point 2, and then transform the camera's image to what it would look like if the camera were at point 2. From my understanding, the best way to tackle this is a homography transformation using the warpPerspective tool.
The experiment is being done inside the Unreal game simulation engine. Right now, I essentially read the data from the camera and apply a set transformation to the image. However, I seem to be doing something wrong, as the image comes out looking like this (original image first, then distorted image):
Original Image
Distorted Image
This is the current code I have. Basically, it reads the texture from Unreal Engine, gets the individual pixel values, and puts them into the OpenCV Mat. Then I try to apply my warpPerspective transformation. Interestingly, if I just try a simple warpAffine transformation (a rotation), it works fine. I have seen this question: Opencv virtually camera rotating/translating for bird's eye view, but I cannot figure out what I am doing wrong vs. their solution. I would really appreciate any help or guidance any of you may have. Thanks in advance!
ROSCamTextureRenderTargetRes->ReadPixels(ImageData);
cv::Mat image_data_matrix(TexHeight, TexWidth, CV_8UC3);
cv::Mat warp_dst, warp_rotate_dst;

int currCol = 0;
int currRow = 0;
cv::Vec3b* pixel_left = image_data_matrix.ptr<cv::Vec3b>(currRow);
for (auto color : ImageData)
{
    pixel_left[currCol][2] = color.R;
    pixel_left[currCol][1] = color.G;
    pixel_left[currCol][0] = color.B;
    currCol++;
    if (currCol == TexWidth)
    {
        currRow++;
        currCol = 0;
        pixel_left = image_data_matrix.ptr<cv::Vec3b>(currRow);
    }
}

warp_dst = cv::Mat(image_data_matrix.rows, image_data_matrix.cols, image_data_matrix.type());

double rotX = (45 - 90)*PI / 180;
double rotY = (90 - 90)*PI / 180;
double rotZ = (90 - 90)*PI / 180;

// A1 - projection matrix from 2D to 3D homogeneous coordinates
cv::Mat A1 = (cv::Mat_<float>(4, 3) <<
    1, 0, (-1)*TexWidth / 2,
    0, 1, (-1)*TexHeight / 2,
    0, 0, 0,
    0, 0, 1);

// Rotation matrices RX, RY, RZ
cv::Mat RX = (cv::Mat_<float>(4, 4) <<
    1, 0, 0, 0,
    0, cos(rotX), (-1)*sin(rotX), 0,
    0, sin(rotX), cos(rotX), 0,
    0, 0, 0, 1);
cv::Mat RY = (cv::Mat_<float>(4, 4) <<
    cos(rotY), 0, (-1)*sin(rotY), 0,
    0, 1, 0, 0,
    sin(rotY), 0, cos(rotY), 0,
    0, 0, 0, 1);
cv::Mat RZ = (cv::Mat_<float>(4, 4) <<
    cos(rotZ), (-1)*sin(rotZ), 0, 0,
    sin(rotZ), cos(rotZ), 0, 0,
    0, 0, 1, 0,
    0, 0, 0, 1);

// R - combined rotation matrix
cv::Mat R = RX * RY * RZ;

// T - translation matrix
cv::Mat T = (cv::Mat_<float>(4, 4) <<
    1, 0, 0, 0,
    0, 1, 0, 0,
    0, 0, 1, dist,
    0, 0, 0, 1);

// K - intrinsic matrix
cv::Mat K = (cv::Mat_<float>(3, 4) <<
    12.5, 0, TexHeight / 2, 0,
    0, 12.5, TexWidth / 2, 0,
    0, 0, 1, 0);

cv::Mat warp_mat = K * (T * (R * A1));
//warp_mat = cv::getRotationMatrix2D(srcTri[0], 43.0, 1);
//cv::warpAffine(image_data_matrix, warp_dst, warp_mat, warp_dst.size());
cv::warpPerspective(image_data_matrix, warp_dst, warp_mat, image_data_matrix.size(), CV_INTER_CUBIC | CV_WARP_INVERSE_MAP);
cv::imshow("distort", warp_dst);
cv::imshow("image", image_data_matrix);
I am trying to use:
layout (binding = 0, rgba8ui) readonly uniform uimage2D input;
in a compute shader. In order to bind a texture to this, I am using:
glBindImageTexture(0, texture_name, 0, GL_FALSE, 0, GL_READ_ONLY, GL_RGBA8);
and it seems that, in order for this bind to work, the texture has to be immutable, so I've switched from:
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_RGBA, GL_UNSIGNED_BYTE, pixels);
to:
glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA8UI, width, height);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, pixels);
But this generates an "Invalid operation" error (specifically, the glTexSubImage2D() call generates it). Looking in the documentation, I discovered that this call may generate 1282 (GL_INVALID_OPERATION) for the following reasons:
GL_INVALID_OPERATION is generated if the texture array has not been defined by a previous glTexImage2D or glCopyTexImage2D operation whose internalformat matches the format of glTexSubImage2D.
GL_INVALID_OPERATION is generated if type is GL_UNSIGNED_SHORT_5_6_5 and format is not GL_RGB.
GL_INVALID_OPERATION is generated if type is GL_UNSIGNED_SHORT_4_4_4_4 or GL_UNSIGNED_SHORT_5_5_5_1 and format is not GL_RGBA
but none of these applies in my case.
The first of them might seem to be the problem (considering I am using glTexStorage2D(), not glTexImage2D()), but it is not, because in the case of a float texture the same mechanism works:
glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA32F, width, height);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height, GL_RGBA, GL_FLOAT, pixels);
instead of:
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, width, height, 0, GL_RGBA, GL_FLOAT, pixels);
This is probably irrelevant, but both methods work well on PC.
Any suggestions on why this is happening?
The internalFormat you use in glTexImage2D and glBindImageTexture should be the same and be compatible with your sampler. For a uimage2D, try using GL_RGBA8UI everywhere.
Also, for transfers to GL_RGBA8UI (and other integer formats) you need to use GL_RGBA_INTEGER as format.
glBindImageTexture(0, texture_name, 0, GL_FALSE, 0, GL_READ_ONLY, GL_RGBA8UI);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8UI, width, height, 0, GL_RGBA_INTEGER, GL_UNSIGNED_BYTE, pixels);
Using the format GL_RGBA_INTEGER should also make the glTexSubImage2D variant work.
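For the immutable-storage path from the question, that presumably becomes (a sketch reusing the question's texture_name, width, height and pixels):
// Allocate immutable storage with the integer internal format...
glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA8UI, width, height);
// ...upload using GL_RGBA_INTEGER, matching the integer-valued internal format...
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height, GL_RGBA_INTEGER, GL_UNSIGNED_BYTE, pixels);
// ...and bind with the same format for the compute shader's uimage2D.
glBindImageTexture(0, texture_name, 0, GL_FALSE, 0, GL_READ_ONLY, GL_RGBA8UI);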
When I call
glCopyTexImage2D( GL_TEXTURE_2D, 0, GL_RGB8_OES, 0, 0, w, h, 0 );
I get GL error 1281 (GL_INVALID_VALUE). If I use
glCopyTexImage2D( GL_TEXTURE_2D, 0, GL_RGB, 0, 0, w, h, 0 );
It works fine.
It seems GL_RGB8_OES is unsupported by glCopyTexImage2D! That would be such a glaring omission that I find it hard to believe. How can I rescue the alpha channel?
Have you tried using GL_RGBA?
http://www.opengl.org/sdk/docs/man/xhtml/glCopyTexImage2D.xml
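That is, something along these lines (a sketch reusing the question's w and h):
// GL_RGBA is among the internal formats glCopyTexImage2D accepts,
// and unlike GL_RGB it preserves the alpha channel.
glCopyTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 0, 0, w, h, 0);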