I have a question about OpenCV's example on Basic Thresholding as provided in the link below:
http://docs.opencv.org/2.4/doc/tutorials/imgproc/threshold/threshold.html#goal
I am slowly beginning to understand the code and have tried out an example too. However, I am confused about one part of the code regarding thresholding operations. How does the thresholding function know which threshold operation to use?
This is where it is called:
threshold(src_gray, dst, threshold_value, max_BINARY_value, threshold_type);
I get that the last parameter "threshold_type" is how it knows which threshold operation to use (e.g. binary, binary inverted, truncated, etc.). However, in the code, this is all that is assigned to threshold_type:
int threshold_type = 3;
Since it is only assigned the int value 3, how does the threshold function know which operation to apply? Could someone explain it to me?
You should avoid using numeric literals when calling OpenCV methods; use the constants defined in the cv namespace instead. It makes no difference to the output, but it makes the code more readable. The deciphered set of threshold types accepted by cv::threshold() is:
THRESH_BINARY = 0,
THRESH_BINARY_INV = 1,
THRESH_TRUNC = 2,
THRESH_TOZERO = 3,
THRESH_TOZERO_INV = 4,
THRESH_MASK = 7,
THRESH_OTSU = 8,
THRESH_TRIANGLE = 16
According to this list, a value of 3 means you are using threshold_type == THRESH_TOZERO.
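For example, the tutorial's call can be written with the named constant instead of the literal 3. A minimal, self-contained sketch (the small Mat and the literal threshold values here are stand-ins for the tutorial's src_gray, threshold_value and max_BINARY_value):

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>

int main() {
    // Tiny stand-in for the tutorial's src_gray image.
    cv::Mat src_gray = (cv::Mat_<uchar>(1, 5) << 10, 50, 100, 150, 200);
    cv::Mat dst;

    // Same effect as passing the literal 3, but readable:
    // pixels <= 100 become 0, pixels > 100 keep their value
    // (the max value argument is ignored for THRESH_TOZERO).
    cv::threshold(src_gray, dst, 100, 255, cv::THRESH_TOZERO);
    return 0;
}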
Hi, I am trying to program an app that displays simple 3D models on iOS using Xcode, and I have run into a small problem that I cannot find a solution to in Apple's documentation or in any forums I have looked in. I have a big array of vertices for triangles in 3D that I want to transform into world space during the rendering process in Metal. I read in an article that, in order to tell Metal to have the GPU transform the vertices during rendering, you need to put the transformation matrix in a Metal buffer and then tell the render encoder to use that buffer with this line of code:
renderEncoder.setVertexBuffer(ROTMATRIX, offset: 0, index: 1)
if "ROTMATRIX" is the name of the metal buffer that contains the models rotation matrix. The problem is that I do not know how to put the matrix inside this buffer. I constructed a matrix for the model called MODMAT like this:
var A = simd_float4(1, 0, 0, 0)
var B = simd_float4(0, 0, 0, 0)
var C = simd_float4(0, 0, 1, 0)
var D = simd_float4(0, 0, 0, 1)
var MODMAT = float4x4([A, B, C, D])
I tried to put the matrix MODMAT into ROTMATRIX with this line of code:
ROTMATRIX.contents().copyMemory(from: MODMAT, byteCount: 64)
But the compiler in Xcode says "Cannot convert value of type 'float4x4' (aka 'simd_float4x4') to expected argument type 'UnsafeRawPointer'". So I need to provide an unsafe raw pointer to the matrix MODMAT. Is it possible to create this kind of pointer to a matrix in Swift, and if not, how should I write to ROTMATRIX correctly?
Best regards, Simon
contents returns an UnsafeMutableRawPointer. You can use either storeBytes(of:toByteOffset:as:) or storeBytes(of:as:) to store a simd_float4x4 through this pointer. In fact, you can use this to store any value of a trivial type (basically, a value that can be copied bit for bit without any reference counting and so on).
Refer to the documentation pages for UnsafeMutableRawPointer and contents.
I am currently planning to train a binary image classification model. The images I want to train on are the difference between two original pictures. In other words, for each data entry I start out with 2 pictures, take their difference, and label that difference as a 0 or 1. My question is: what is the best way to find this difference? I know about cv2.absdiff and normal subtraction of images - what is the most effective way to go about this?
About the data: the images I'm training on are screenshots that are usually the same but may have small differences. I found that normal subtraction seems to show the differences less than absdiff.
This is the code I use for absdiff:
diff = cv2.absdiff(img1, img2)
mask = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
th = 1
imask = mask > th
canvas = np.zeros_like(img2, np.uint8)
canvas[imask] = img2[imask]
And then this for normal subtraction:
def extract_diff(self, imageA, imageB, image_name, path):
    # signed difference in float space, then keep pixels whose
    # absolute per-channel difference is at least 30
    subtract = imageB.astype(np.float32) - imageA.astype(np.float32)
    mask = cv2.inRange(np.abs(subtract), (30, 30, 30), (255, 255, 255))
    th = 1
    imask = mask > th
    canvas = np.zeros_like(imageA, np.uint8)
    canvas[imask] = imageA[imask]
Thanks!
A difference can be negative or positive.
Some number types, such as uint8 (unsigned 8-bit int), can't be negative (they have no sign bit), so a negative value wraps around and the stored value no longer makes sense. Other types are signed (e.g. floats, signed ints), so a negative value can be represented correctly.
That's why cv.absdiff exists. It always gives you absolute differences, and those are okay to represent in an unsigned type.
Example with numbers: a = 4, b = 6. a-b should be -2, right?
That value, as a uint8, will wrap around to become 0xFE, or 254 in decimal. The 254 has some relation to the true -2 difference, but it also incorporates the value range of the data type (8 bits: 256 values), so it's really just "code".
cv.absdiff would give you the absolute value of the difference (-2), which is 2.
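A minimal C++ sketch of the same arithmetic with plain unsigned 8-bit values (the same thing happens per pixel when you subtract 8-bit images without taking the absolute value):

#include <cstdio>
#include <cstdint>
#include <cstdlib>

int main() {
    uint8_t a = 4, b = 6;

    uint8_t wrapped  = a - b;                    // -2 wraps around to 254 (0xFE)
    uint8_t absolute = (uint8_t)std::abs(a - b); // what absdiff computes: 2

    std::printf("a - b as uint8: %d\n", wrapped);  // prints 254
    std::printf("|a - b|:        %d\n", absolute); // prints 2
    return 0;
}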
I am currently working on replicating YOLOv2 (not tiny) on iOS (Swift 4) using MPS.
The problem is that it is hard for me to implement the space_to_depth function (https://www.tensorflow.org/api_docs/python/tf/space_to_depth) and the concatenation of two convolution results (13x13x256 + 13x13x1024 -> 13x13x1280). Could you give me some advice on implementing these parts? My code is below.
...
let conv19 = MPSCNNConvolutionNode(source: conv18.resultImage,
weights: DataSource("conv19", 3, 3, 1024, 1024))
let conv20 = MPSCNNConvolutionNode(source: conv19.resultImage,
weights: DataSource("conv20", 3, 3, 1024, 1024))
let conv21 = MPSCNNConvolutionNode(source: conv13.resultImage,
weights: DataSource("conv21", 1, 1, 512, 64))
/*****
1. space_to_depth with conv21
2. concatenate the result of conv20(13x13x1024) to the result of 1 (13x13x256)
I need your help to implement this part!
******/
I believe space_to_depth can be expressed in the form of a convolution:
For instance, for an input with dimensions [1,2,2,1], use 4 convolution kernels that each write one number to one output channel, i.e. [[1,0],[0,0]], [[0,1],[0,0]], [[0,0],[1,0]], [[0,0],[0,1]]. This should move all input numbers from the spatial dimensions to the depth dimension.
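A minimal sketch of that idea in plain C++ (my own illustration, not MPS code): a stride-2 "convolution" with four 2x2 selector kernels moves the four spatial values of a 2x2 single-channel input into the four channels of a 1x1 output.

#include <cstdio>

int main() {
    const float input[2][2] = {{1, 2}, {3, 4}};   // 2x2x1 input

    // Kernel k has a single 1 at position (k / 2, k % 2), i.e. the four
    // selector kernels [[1,0],[0,0]], [[0,1],[0,0]], [[0,0],[1,0]], [[0,0],[0,1]].
    float kernels[4][2][2] = {};
    for (int k = 0; k < 4; ++k)
        kernels[k][k / 2][k % 2] = 1.0f;

    // One 2x2 block with stride 2 -> output is 1x1 with 4 channels.
    float output[4] = {};
    for (int k = 0; k < 4; ++k)
        for (int y = 0; y < 2; ++y)
            for (int x = 0; x < 2; ++x)
                output[k] += kernels[k][y][x] * input[y][x];

    for (int k = 0; k < 4; ++k)
        std::printf("channel %d = %g\n", k, output[k]);   // 1 2 3 4
    return 0;
}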
MPS actually has a concat node. See here: https://developer.apple.com/documentation/metalperformanceshaders/mpsnnconcatenationnode
You can use it like this:
concatNode = [[MPSNNConcatenationNode alloc] initWithSources:#[layerA.resultImage, layerB.resultImage]];
If you are working with the high level interface and the MPSNNGraph, you should just use a MPSNNConcatenationNode, as described by Tianyu Liu above.
If you are working with the low level interface, manhandling the MPSKernels around yourself, then this is done by:
Create a 1280 channel destination image to hold the result
Run the first filter as normal to produce the first 256 channels of the result
Run the second filter to produce the remaining channels, with the destinationFeatureChannelOffset set to 256.
That should be enough in all cases, except when the data is not the product of an MPSKernel. In that case, you'll need to copy it in yourself or use something like a linear neuron (a=1, b=0) to do it.
I am computing a 6-DoF transformation with the RANSAC implementation in OpenCV, and I now want to convert two cv::Mat matrices to an Eigen::Isometry3d, but I haven't found good examples of this.
e.g.
cv::Mat rot;
cv::Mat trsl;
// the rot is 3-by-3 and trsl is 3-by-1 vector.
Eigen::Isometry3d trsf;
trsf.rotation = rot;
trsf.translation = trsl; // I know trsf has these two members, but this does not seem to be the correct way to combine them.
Can anyone give me a hand? Thanks.
Essentially, you need an Eigen::Map to read the OpenCV data and store it into the corresponding parts of your trsf:
typedef Eigen::Matrix<double, 3, 3, Eigen::RowMajor> RMatrix3d;
Eigen::Isometry3d trsf;
trsf.linear() = RMatrix3d::Map(reinterpret_cast<const double*>(rot.data));
trsf.translation() = Eigen::Vector3d::Map(reinterpret_cast<const double*>(trsl.data));
You need to be sure that rot and trsl indeed hold double data (perhaps consider using cv::Mat_<double> instead).
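For reference, a self-contained sketch of this approach; the identity rotation and the made-up translation below are only stand-ins for the matrices coming out of RANSAC.

#include <opencv2/core/core.hpp>
#include <Eigen/Geometry>
#include <iostream>

typedef Eigen::Matrix<double, 3, 3, Eigen::RowMajor> RMatrix3d;

int main() {
    cv::Mat rot  = cv::Mat::eye(3, 3, CV_64F);            // 3x3 rotation (CV_64F)
    cv::Mat trsl = (cv::Mat_<double>(3, 1) << 1, 2, 3);   // 3x1 translation (CV_64F)

    Eigen::Isometry3d trsf = Eigen::Isometry3d::Identity();
    trsf.linear() = RMatrix3d::Map(reinterpret_cast<const double*>(rot.data));
    trsf.translation() = Eigen::Vector3d::Map(reinterpret_cast<const double*>(trsl.data));

    std::cout << trsf.matrix() << std::endl;               // full 4x4 transform
    return 0;
}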
I am new to OpenCV, so please help me with this basic query. I am trying to find the maximum value of a Mat variable. I tried to use max_element and minMaxLoc, but I keep getting errors saying the data type does not match. I checked it over and over again, but have not been successful. Here is my code.
abs_dst is the Mat variable:
double *estimate,*min;
CvPoint *minLoc,*maxLoc;
Size s = abs_dst.size();
int rows = s.height;
int cols = s.width;
double imagearray[rows][cols] = abs_dst.data();
minMaxLoc(imagearray,min,estimate,minLoc,maxLoc);
I even tried passing the Mat variable abs_dst directly, but have not succeeded. There is an optional input mask array, which I have ignored as I do not require it.
Try the following:
Point[] Mat_To_Point = Your_Mat_Variable.toArray();
And then you can sort your array.
I think I got the answer. Thanks for your efforts. The problem is that minMaxLoc doesn't take an RGB image array, as it has 3 channels. I had to convert abs_dst to grayscale.
Secondly,
it is not
CvPoint *minLoc,maxLoc;
it is
Point *minLoc,*maxLoc;
I don't need to convert it to an array, as converting to grayscale directly gives me a single-channel image, which is enough for minMaxLoc. My apologies for my own mistakes and thanks once again for your efforts.
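Putting it together, a minimal sketch of minMaxLoc on a single-channel Mat (the small dummy matrix stands in for the grayscale abs_dst):

#include <opencv2/core/core.hpp>
#include <cstdio>

int main() {
    // Stand-in for the single-channel (grayscale) abs_dst.
    cv::Mat abs_dst = (cv::Mat_<double>(2, 3) << 1, 5, 3,
                                                 9, 2, 7);
    double minVal, maxVal;
    cv::Point minLoc, maxLoc;

    // Works on single-channel arrays only; convert BGR images to gray first.
    cv::minMaxLoc(abs_dst, &minVal, &maxVal, &minLoc, &maxLoc);

    std::printf("min %.0f at (%d,%d), max %.0f at (%d,%d)\n",
                minVal, minLoc.x, minLoc.y, maxVal, maxLoc.x, maxLoc.y);
    return 0;
}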