When loading a texture within a kernel function in Metal, is it possible to find the default z-value (if one exists at all) of the texture being sampled, as well as the z-near and z-far values (likewise, if these exist at all when a kernel is used instead of the normal render pipeline with shaders) of the space in which the texture resides?
What I am trying to understand is:
When sampling a texture within a kernel function, is it possible to change (or set) the z value of the texture before writing it? I have not been able to find this information in the documentation, nor the z-near and z-far values (is it even possible to define these values manually when using a kernel function?).
Thanks.
Metal supports kernel in addition to the standard vertex and fragment functions. I found a metal kernel example that converts an image to grayscale.
What exactly is the difference between doing this in a kernel vs fragment? What can a compute kernel do (better) that a fragment shader can't and vice versa?
Metal has four different types of command encoders:
MTLRenderCommandEncoder
MTLComputeCommandEncoder
MTLBlitCommandEncoder
MTLParallelRenderCommandEncoder
If you're just doing graphics programming, you're most familiar with the MTLRenderCommandEncoder. That is where you would set up your vertex and fragment shaders. This is optimized to deal with a lot of draw calls and object primitives.
Kernel shaders are primarily used with the MTLComputeCommandEncoder. I think the reason a kernel shader and a compute encoder were used for the image processing example is that you're not drawing any primitives, as you would be with the render command encoder. Even though both methods use the GPU, in this instance the code simply modifies color data in a texture rather than calculating the depth of multiple objects on a screen.
The compute command encoder is also more easily set up to do parallel computing using threads:
https://developer.apple.com/reference/metal/mtlcomputecommandencoder
So if your application wanted to utilize multithreading on data modification, it's easier to do that in this command encoder than the render command encoder.
I am reading a tutorial and there is an equation as shown in the image. I know that the sign in the image is called cross addition, but my question is: is there any method in OpenCV that performs cross addition?
This 'plus in a circle' in this context most likely refers to the direct sum of matrices.
In particular, the unary notation ⊕I1..n refers to the construction of a block-diagonal matrix from the matrices I.
For example, suppose we have:
There is no single method in OpenCV that performs this, but you can easily use existing matrix operations to do it:
Create an output matrix of the correct size and initialize it with zeros
Iterate over the matrices to be direct-summed and set the appropriate subrange of the output matrix
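The two steps above can be sketched in plain C++ without OpenCV. This is only an illustration under assumptions: the `Matrix` alias and the `direct_sum` name are made up for this example, and every block is assumed to be rectangular.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical matrix type for this sketch: a vector of equally long rows.
using Matrix = std::vector<std::vector<double>>;

// Direct sum: copy each input block onto the diagonal of a zero matrix.
Matrix direct_sum(const std::vector<Matrix>& blocks) {
    // Step 1: the output size is the sum of the block sizes.
    std::size_t rows = 0, cols = 0;
    for (const Matrix& b : blocks) {
        rows += b.size();
        cols += b.empty() ? 0 : b[0].size();
    }
    Matrix out(rows, std::vector<double>(cols, 0.0));

    // Step 2: copy each block into its own subrange, then move the
    // offsets down and to the right for the next block.
    std::size_t r0 = 0, c0 = 0;
    for (const Matrix& b : blocks) {
        for (std::size_t r = 0; r < b.size(); ++r)
            for (std::size_t c = 0; c < b[r].size(); ++c)
                out[r0 + r][c0 + c] = b[r][c];
        r0 += b.size();
        c0 += b.empty() ? 0 : b[0].size();
    }
    return out;
}
```

With OpenCV itself, the inner copy would probably become a `copyTo` into a region of interest of a zero-initialized `cv::Mat`, but that variant is not shown or tested here.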
I am using the Jtransforms library which seems to be wicked fast for my purpose.
At this point I think I have a pretty good handle on how FFT works, so now I am wondering if there is any form of a standard domain which is used for audio visualizations like spectrograms?
Thanks to Android's native FFT in 2.3 I had been using bytes as the range, although I am still unclear as to whether it is signed or not. (I know Java doesn't have unsigned bytes, but Google implemented these functions natively and the waveform is PCM 8-bit unsigned.)
However I am adapting my app to work with mic audio and 2.1 phones. At this point having the input domain being in the range of bytes whether it is [-128, 127] or [0, 255] no longer seems quite optimal.
I would like the range of my FFT function to be [0,1] so that I can scale it easily.
So should I use a domain of [-1, 1] or [0, 1]?
Essentially, the input domain does not matter. At most, it causes an offset and a change in scaling on your original data, which will be turned into an offset on bin #0 and an overall change in scaling on your frequency-domain results, respectively.
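A small numerical check of this claim, as a sketch only: it uses a naive O(n^2) DFT rather than JTransforms or any real FFT library, and the names `dft` and `to_unit_range` are made up for the illustration. Mapping samples from [0, 255] to [-1, 1] is `y = x / 127.5 - 1`, so every bin should be scaled by 1/127.5 and only bin 0 should pick up the offset.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive O(n^2) DFT, used only to keep the sketch dependency-free.
std::vector<std::complex<double>> dft(const std::vector<double>& x) {
    const double pi = std::acos(-1.0);
    const std::size_t n = x.size();
    std::vector<std::complex<double>> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle =
                -2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            sum += x[t] * std::complex<double>(std::cos(angle), std::sin(angle));
        }
        out[k] = sum;
    }
    return out;
}

// Map unsigned 8-bit PCM values in [0, 255] to the range [-1, 1].
std::vector<double> to_unit_range(const std::vector<double>& bytes) {
    std::vector<double> y;
    y.reserve(bytes.size());
    for (double b : bytes) y.push_back(b / 127.5 - 1.0);
    return y;
}
```

If `X = dft(raw)` and `Y = dft(to_unit_range(raw))`, then `Y[k]` equals `X[k] / 127.5` for every `k > 0`, and `Y[0]` equals `X[0] / 127.5 - n`, where `n` is the number of samples.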
As to limiting your FFT output to [0,1]: that's essentially impossible. In general, the FFT output will be complex; there's no way to manipulate your input data so that the output is restricted to positive real numbers.
If you use DCT instead of FFT your output range will be real. (Read about the difference and decide if DCT is suitable for your application.)
FFT implementations for real input produce only about half as many complex output bins as there are input samples (the spectrum of a real signal is conjugate-symmetric, so the other half is redundant). The fact that each bin has both a real and an imaginary part therefore doesn't affect the size of the result much compared with the size of the source (output size is ceil(n/2)*2).
I noticed that a new data structure cv::Matx was added to the new OpenCV version, intended for small matrices of known size at compilation time, for example
cv::Matx31f // matrix 3x1 of float type
Checking the documentation I saw that most of matrix operations are available, but still I don't see the advantages of using this new type instead of the old cv::Mat.
When should I use Matx instead of Mat?
Short answer: cv::Mat uses the heap to store its data, while cv::Matx uses the stack.
A cv::Mat uses dynamic memory allocation (on the heap). This is appropriate for big matrices (like images) and lets you do things like shallow copies of a matrix, which is the default behavior of cv::Mat.
However, for the small matrices that cv::Matx is designed for, heap allocation would be very expensive compared to doing the same thing on the stack. I have seen a block of math reduce processing time by over 75% by switching to using stack-allocated types (e.g. cv::Point and cv::Matx) instead of cv::Mat.
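The storage difference can be illustrated without OpenCV. In the sketch below, `SmallMat` and `DynMat` are hypothetical stand-ins (not the real `cv::Matx` and `cv::Mat`): one keeps its elements inside the object, the other in a heap buffer. Unlike `cv::Mat`, this `DynMat` copies deeply, so it only models the allocation side.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical Matx-like type: the dimensions are template parameters, so
// the storage is part of the object and a local variable needs no heap.
template <typename T, std::size_t Rows, std::size_t Cols>
struct SmallMat {
    std::array<T, Rows * Cols> data{};
    T& operator()(std::size_t r, std::size_t c) { return data[r * Cols + c]; }
    const T& operator()(std::size_t r, std::size_t c) const {
        return data[r * Cols + c];
    }
};

// Hypothetical Mat-like type: the dimensions are runtime values and the
// elements live in a heap buffer owned by the vector.
template <typename T>
struct DynMat {
    std::size_t rows, cols;
    std::vector<T> data;
    DynMat(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c) {}
    T& operator()(std::size_t r, std::size_t c) { return data[r * cols + c]; }
};

// Creates a fresh 3x1 vector per iteration without any heap allocation.
float sum_stack(int iterations) {
    float total = 0.f;
    for (int i = 0; i < iterations; ++i) {
        SmallMat<float, 3, 1> v;
        v(0, 0) = 1.f; v(1, 0) = 2.f; v(2, 0) = 3.f;
        total += v(0, 0) + v(1, 0) + v(2, 0);
    }
    return total;
}
```

Writing the same loop with `DynMat<float> v(3, 1)` would allocate and free a heap buffer on every iteration, which is the kind of overhead described above.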
It's about memory management and not wasting (in some cases important) memory or just reservation of memory for an object you'll use later.
That's how I understand it; maybe someone else can give a better explanation.
This is a late late answer, but it is still an interesting question!
dom's answer is quite accurate, and the heap/stack reference in user1460044's is also interesting.
From a practical point of view, I wouldn't use Matx (or Vec) unless it were completely necessary. The major advantages of Matx are:
Using the stack (efficient![1])
Initialization.
The problem is that, in the end, you will have to move your Matx data to a Mat to do most things, and so you will be back on the heap again.
On the other hand, the "cool initialization" of a Matx can be done in a normal Mat:
// Matx initialization:
Matx31f A(1.f,2.f,3.f);
// Mat initialization:
Mat B = (Mat_<float>(3,1) << 1.f, 2.f, 3.f);
Also, there is a difference in initialization beyond the heap/stack distinction. If you try to put 5 values into the Matx31f, it will crash (runtime exception), while calling Mat_::operator<< with 5 values will only store the first three.
[1] Efficient if your program has to create lots of matrices of less than ~10 elements. In that case use Matx matrices.
There are 2 other reasons why I prefer Matx to Mat:
Readability: people reading the code can immediately see the size of the matrices, for example:
cv::Matx34d transform = ...;
It's clear that this is a 3x4 matrix, so it contains a 3D transformation of type (R,t), where R is a rotation matrix (as opposed to say, axis-angle).
Similarly, accessing an element is more natural with transform(i,j) vs transform.at<double>(i,j).
Easy debugging. Since the elements for Matx are allocated on the stack in an array of known length, IDEs or debuggers can display the entire contents nicely when stepping through the code.
I am working on implementing an algorithm using vDSP.
1) take FFT
2) take log of square of absolute value (can be done with lookup table)
3) take another FFT
4) take absolute value
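For reference, the four steps can be sketched in plain C++. This is a rough illustration and not vDSP code: it uses a naive O(n^2) DFT, the names `dft` and `cepstrum_like` are made up, and the small `eps` added before the logarithm is an assumption to avoid log(0).

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Naive O(n^2) DFT standing in for a real FFT routine.
std::vector<cplx> dft(const std::vector<cplx>& x) {
    const double pi = std::acos(-1.0);
    const std::size_t n = x.size();
    std::vector<cplx> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        cplx sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle =
                -2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            sum += x[t] * cplx(std::cos(angle), std::sin(angle));
        }
        out[k] = sum;
    }
    return out;
}

std::vector<double> cepstrum_like(const std::vector<double>& signal) {
    const double eps = 1e-12;  // assumption: guards against log(0)

    // 1) take FFT
    std::vector<cplx> spectrum = dft(std::vector<cplx>(signal.begin(), signal.end()));

    // 2) take log of square of absolute value (std::norm is |z|^2)
    std::vector<cplx> log_power(spectrum.size());
    for (std::size_t k = 0; k < spectrum.size(); ++k)
        log_power[k] = cplx(std::log(std::norm(spectrum[k]) + eps), 0.0);

    // 3) take another FFT
    std::vector<cplx> second = dft(log_power);

    // 4) take absolute value
    std::vector<double> result(second.size());
    for (std::size_t k = 0; k < second.size(); ++k)
        result[k] = std::abs(second[k]);
    return result;
}
```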
I'm not sure if it is up to me to throw the incoming data through a windowing function before I run the FFT on it.
vDSP_fft_zrip(setupReal, &A, stride, log2n, direction);
That is my FFT call.
Do I need to throw the data through vDSP_hamm_window(...) first?
The iOS Accelerate library function vDSP_fft_zrip() does not include applying a window function (unless you count the implied rectangular window due to the finite length parameter).
So you need to apply your chosen window function (there are many different ones) first.
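As an illustration of what "applying a window" means, here is a plain C++ sketch of the textbook Hann and Hamming windows. It does not call vDSP; the function names are made up for this example, and the symmetric `n - 1` denominator is an assumption, since library routines may use a slightly different convention.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Generalized cosine window: w[i] = a0 - (1 - a0) * cos(2*pi*i / (n - 1)).
// a0 = 0.5 gives a Hann window, a0 = 0.54 gives a Hamming window.
std::vector<double> cosine_window(std::size_t n, double a0) {
    const double pi = std::acos(-1.0);
    if (n == 1) return {1.0};  // avoid dividing by zero for a single sample
    std::vector<double> w(n);
    for (std::size_t i = 0; i < n; ++i)
        w[i] = a0 - (1.0 - a0) *
                        std::cos(2.0 * pi * static_cast<double>(i) /
                                 static_cast<double>(n - 1));
    return w;
}

// Multiply the signal by the window, sample by sample, before the FFT.
std::vector<double> apply_window(const std::vector<double>& signal,
                                 const std::vector<double>& window) {
    assert(signal.size() == window.size());
    std::vector<double> out(signal.size());
    for (std::size_t i = 0; i < signal.size(); ++i)
        out[i] = signal[i] * window[i];
    return out;
}
```

The windowed buffer, not the raw samples, is what would then be handed to the FFT routine.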
It sounds like you're doing cepstral analysis and yes, you do need a window function prior to the first FFT. I would suggest a simple Hann or Hamming window.
I don't have any experience with your particular library, but in every other FFT library I know of it's up to you to window the data first. If nothing else, the library can't know what window you wish to use, and sometimes you don't want to use a window (if you're using the FFT for overlap-add filtering, or if you know the signal is exactly periodic in the transform block).
Also, just offhand, it seems like if you're doing 2 FFTs, the overhead of calling a logarithm function is relatively minor.