Stop computing when Metal kernel output buffer is full? - ios

I have a metal compute kernel that takes input points from a buffer particles, and populates a new buffer particlesOut. My compute kernel is defined as:
kernel void compute(device DrawingPoint *particles [[buffer(0)]],
device Particle *particlesOut [[buffer(1)]],
constant ComputeParameters *params [[buffer(2)]],
device atomic_int &counter [[buffer(3)]],
uint id [[thread_position_in_grid]]) {
This works fine, so long as the output buffer has room for the number of records populated.
So, for instance, if the input buffer has 10,000 records, and for each of those records I create 10 output records, and the output buffer has a length of 100,000, then all is fine. In other words, if the number of output records is fixed and is large enough, all is fine.
But for some input records, I would like a random number of output records to be populated. For instance, some I would like to populate 5, and another I would like to create 200 (and any number in-between).
I am using an atomic_int for the output record's position in the buffer. Again, this works if I have a fixed number of records populated per input record.
I am populating the output buffer like this:
//Output buffer is 10 times the size of the input buffer
for (int i = 0; i < 10; i++) {
int counterValue = atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);
...
particlesOut[counterValue].position = finalPoint;
}
This works fine.
If I try to make it work on a variable number instead of the fixed value, the buffer is way under populated (instead of getting say 100,000 particles populated, maybe only 10,000 are populated).
For example:
int numberOfOutputPoints = someRandomValueBetweenFiveAndTwoHundred();
for (int i = 0; i < numberOfOutputPoints; i++) {
int counterValue = atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);
//particleCount is the size of the output buffer
if (counterValue > params->particleCount) {
return;
}
...
particlesOut[counterValue].position = finalPoint;
}
When I do that, only a small number of the particles in the output buffer are actually populated.
I looked at using different options for atomic_fetch_add_explicit, but only memory_order_relaxed will compile.
I tried using:
int counterValue = atomic_fetch_add(&counter, 1)
But, the compiler reports that there is no matching function. Other than having the buffer output large enough for every record to populate the maximum number of possible particles populated (e.g. 200 times 10,000), is there any way to make it dynamic?
In other words, I just want to stop populating the output buffer when it is full.
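One way to stop cleanly when the output is full is to reserve a whole run of slots with a single atomic add, then clamp the run to the space that is left. The following is a minimal host-side C++ sketch of that idea using std::atomic, not Metal code. reserveSlots and SlotRange are hypothetical names, and the sketch assumes the counter is reset to zero before each dispatch. In a Metal kernel the corresponding call would be atomic_fetch_add_explicit(&counter, wanted, memory_order_relaxed).

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>

// A run of output slots owned by one thread: [first, first + count).
struct SlotRange {
    int first;
    int count;
};

// Reserve up to `wanted` consecutive slots in a buffer of `capacity` slots.
// Returns a count of 0 once the buffer is already full.
SlotRange reserveSlots(std::atomic<int>& counter, int wanted, int capacity) {
    assert(wanted >= 0 && capacity >= 0);
    // One atomic add hands this thread a private, contiguous range.
    int first = counter.fetch_add(wanted, std::memory_order_relaxed);
    if (first >= capacity) {
        return {capacity, 0};  // valid indices are 0 .. capacity - 1
    }
    // Clamp the run so it never extends past the end of the buffer.
    return {first, std::min(wanted, capacity - first)};
}
```

Each thread then writes only to indices first .. first + count - 1. The counter itself can end up larger than capacity, so the host should clamp the value it reads back to capacity before using it as the particle count.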

Related

How can I update a dynamic vertex buffer quickly?

I'm trying to make a simple 3D modeling tool.
Part of the work is moving a vertex (or several vertices) to transform the model.
I used a dynamic vertex buffer because I thought it would need frequent updates.
But performance is too low on high-polygon models, even when I change just one vertex.
Are there other methods, or am I doing this the wrong way?
here is my D3D11_BUFFER_DESC
Usage = D3D11_USAGE_DYNAMIC;
CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
BindFlags = D3D11_BIND_VERTEX_BUFFER;
ByteWidth = sizeof(ST_Vertex) * _nVertexCount
D3D11_SUBRESOURCE_DATA d3dBufferData;
d3dBufferData.pSysMem = pVerticesInfo;
hr = pd3dDevice->CreateBuffer(&descBuffer, &d3dBufferData, &_pVertexBuffer);
and my update function
D3D11_MAPPED_SUBRESOURCE d3dMappedResource;
pImmediateContext->Map(_pVertexBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &d3dMappedResource);
ST_Vertex* pBuffer = (ST_Vertex*)d3dMappedResource.pData;
for (int i = 0; i < vIndice.size(); ++i)
{
pBuffer[vIndice[i]].xfPosition.x = pVerticesInfo[vIndice[i]].xfPosition.x;
pBuffer[vIndice[i]].xfPosition.y = pVerticesInfo[vIndice[i]].xfPosition.y;
pBuffer[vIndice[i]].xfPosition.z = pVerticesInfo[vIndice[i]].xfPosition.z;
}
pImmediateContext->Unmap(_pVertexBuffer, 0);
As mentioned in the previous answer, you are updating your whole buffer every time, which will be slow depending on model size.
The solution is indeed to implement partial updates. There are two cases: either you want to update a single vertex, or you want to update arbitrary indices (for example, you want to move N vertices in one go at different locations, such as vertices 1, 20 and 23).
The first case is rather simple. First create your buffer with the following description:
Usage = D3D11_USAGE_DEFAULT;
CPUAccessFlags = 0;
BindFlags = D3D11_BIND_VERTEX_BUFFER;
ByteWidth = sizeof(ST_Vertex) * _nVertexCount
D3D11_SUBRESOURCE_DATA d3dBufferData;
d3dBufferData.pSysMem = pVerticesInfo;
hr = pd3dDevice->CreateBuffer(&descBuffer, &d3dBufferData, &_pVertexBuffer);
This makes sure your vertex buffer is gpu visible only.
Next create a second dynamic buffer which has the size of a single vertex (you do not need any bind flags in that case, as it will be used only for copies)
_pCopyVertexBuffer
Usage = D3D11_USAGE_DYNAMIC; //Staging works as well
CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
BindFlags = 0;
ByteWidth = sizeof(ST_Vertex);
D3D11_SUBRESOURCE_DATA d3dBufferData;
d3dBufferData.pSysMem = NULL;
hr = pd3dDevice->CreateBuffer(&descBuffer, &d3dBufferData, &_pCopyVertexBuffer);
When you move a vertex, copy the changed vertex into the copy buffer:
ST_Vertex changedVertex;
D3D11_MAPPED_SUBRESOURCE d3dMappedResource;
//Map the small copy buffer, not the GPU-only vertex buffer (a D3D11_USAGE_DEFAULT buffer cannot be mapped)
pImmediateContext->Map(_pCopyVertexBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &d3dMappedResource);
ST_Vertex* pBuffer = (ST_Vertex*)d3dMappedResource.pData;
pBuffer->xfPosition.x = changedVertex.xfPosition.x;
pBuffer->xfPosition.y = changedVertex.xfPosition.y;
pBuffer->xfPosition.z = changedVertex.xfPosition.z;
pImmediateContext->Unmap(_pCopyVertexBuffer, 0);
Since you use D3D11_MAP_WRITE_DISCARD, make sure to write all attributes there (not only the position).
Once that is done, you can use ID3D11DeviceContext::CopySubresourceRegion to copy only the modified vertex to its location in the vertex buffer.
I assume that vertexID is the index of the modified vertex:
pd3DeviceContext->CopySubresourceRegion(_pVertexBuffer,
0, //must be 0
vertexID * sizeof(ST_Vertex), //location of the vertex in you gpu vertex buffer
0, //must be 0
0, //must be 0
_pCopyVertexBuffer,
0, //must be 0
NULL //in this case we copy the full content of _pCopyVertexBuffer, so we can set to null
);
Now if you want to update a list of vertices, things get more complicated and you have several options:
-First, you can apply this single-vertex technique in a loop. This works quite well if your changeset is small.
-If your changeset is very big (close to the full vertex count), you can probably rewrite the whole buffer instead.
-An intermediate technique is to use a compute shader to perform the updates (that's the one I normally use, as it's the most flexible version).
Posting all the C++ binding code would be way too long, but here is the concept:
-Your vertex buffer must have BindFlags = D3D11_BIND_VERTEX_BUFFER | D3D11_BIND_UNORDERED_ACCESS; //this allows writing from compute
-You need to create an ID3D11UnorderedAccessView for this buffer (so the shader can write to it).
-You need the following misc flag: D3D11_RESOURCE_MISC_BUFFER_ALLOW_RAW_VIEWS //this allows writing as a RWByteAddressBuffer
-You then create two dynamic structured buffers (I prefer those over byte address buffers, but combining vertex buffer and structured is not allowed in DX11, so for the buffer you write to you need raw instead).
-The first structured buffer has a stride of sizeof(ST_Vertex) (this is your changeset).
-The second structured buffer has a stride of 4 (uint, these are the indices).
-Both structured buffers get an arbitrary element count (I normally use 1024 or 2048); that is the maximum number of vertices you can update in a single pass.
-For both structured buffers you need an ID3D11ShaderResourceView (shader visible, read only).
Then update process is the following :
write modified vertices and locations in structured buffers (using map discard, if you have to copy less its ok)
attach both structured buffers for read
attach ID3D11UnorderedAccessView for write
set your compute shader
call dispatch
detach ID3D11UnorderedAccessView for write (this is VERY important)
This is sample compute shader code (I assume your vertex is position only, for simplicity):
cbuffer cbUpdateCount : register(b0)
{
uint updateCount;
};
RWByteAddressBuffer RWVertexPositionBuffer : register(u0);
StructuredBuffer<float3> ModifiedVertexBuffer : register(t0);
StructuredBuffer<uint> ModifiedVertexIndicesBuffer : register(t1);
//this is the stride of your vertex buffer, since here we use float3 it is 12 bytes
#define WRITE_STRIDE 12
[numthreads(64, 1, 1)]
void CS( uint3 tid : SV_DispatchThreadID )
{
//make sure you do not go past the element count, as here we run 64 threads at a time
if (tid.x >= updateCount) { return; }
uint readIndex = tid.x;
uint writeIndex = ModifiedVertexIndicesBuffer[readIndex];
float3 vertex = ModifiedVertexBuffer[readIndex];
//byte address buffers do not understand float, asuint is a binary cast.
RWVertexPositionBuffer.Store3(writeIndex * WRITE_STRIDE, asuint(vertex));
}
For the purposes of this question I'm going to assume you already have a mechanism for selecting a vertex from a list of vertices based upon ray casting or some other picking method and a mechanism for creating a displacement vector detailing how the vertex was moved in model space.
The method you have for updating the buffer is sufficient for anything less than a few hundred vertices, but on large scale models it becomes extremely slow. This is because you're updating everything, rather than the individual vertices you modified.
To fix this, you should only update the vertices you have changed, and to do that you need to create a change set.
In concept, a change set is nothing more than a set of changes made to the data - a list of the vertices that need to be updated. Since we already know which vertices were modified (otherwise we couldn't have manipulated them), we can map in the GPU buffer, go to that vertex specifically, and copy just those vertices into the GPU buffer.
In your vertex modification method, record the index of the vertex that was modified by the user:
//Modify the vertex coordinates based on mouse displacement
pVerticesInfo[SelectedVertexIndex].xfPosition.x += DisplacementVector.x;
pVerticesInfo[SelectedVertexIndex].xfPosition.y += DisplacementVector.y;
pVerticesInfo[SelectedVertexIndex].xfPosition.z += DisplacementVector.z;
//Add the changed vertex to the list of changes.
changedVertices.add(SelectedVertexIndex);
//And update the GPU buffer
UpdateD3DBuffer();
In UpdateD3DBuffer(), do the following:
D3D11_MAPPED_SUBRESOURCE d3dMappedResource;
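//A D3D11_USAGE_DYNAMIC buffer only accepts WRITE_DISCARD or WRITE_NO_OVERWRITE; NO_OVERWRITE keeps the existing contents
pImmediateContext->Map(_pVertexBuffer, 0, D3D11_MAP_WRITE_NO_OVERWRITE, 0, &d3dMappedResource);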
ST_Vertex* pBuffer = (ST_Vertex*)d3dMappedResource.pData;
for (int i = 0; i < changedVertices.size(); ++i)
{
pBuffer[changedVertices[i]].xfPosition.x = pVerticesInfo[changedVertices[i]].xfPosition.x;
pBuffer[changedVertices[i]].xfPosition.y = pVerticesInfo[changedVertices[i]].xfPosition.y;
pBuffer[changedVertices[i]].xfPosition.z = pVerticesInfo[changedVertices[i]].xfPosition.z;
}
pImmediateContext->Unmap(_pVertexBuffer, 0);
changedVertices.clear();
This has the effect of only updating the vertices that have changed, rather than all vertices in the model.
This also allows for some more complex manipulations. You can select multiple vertices and move them all as a group, select a whole face and move all the connected vertices, or move entire regions of the model relatively easily, assuming your picking method is capable of handling this.
In addition, if you record the change sets with enough information (the affected vertices and the displacement vector), you can fairly easily implement an undo function by simply reversing the displacement vector and reapplying the selected change set.
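When several vertices change at once, neighbouring indices can be merged so that each contiguous run is uploaded with a single copy. Here is a minimal sketch of that bookkeeping in plain C++, with no Direct3D calls. VertexRun and mergeChangedVertices are hypothetical names and are not part of the code above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// A contiguous run of changed vertices: indices [first, first + count).
struct VertexRun {
    std::size_t first;
    std::size_t count;
};

// Sort and de-duplicate the change set, then merge adjacent indices into runs.
std::vector<VertexRun> mergeChangedVertices(std::vector<std::size_t> changed) {
    std::sort(changed.begin(), changed.end());
    changed.erase(std::unique(changed.begin(), changed.end()), changed.end());

    std::vector<VertexRun> runs;
    for (std::size_t index : changed) {
        if (!runs.empty() && runs.back().first + runs.back().count == index) {
            ++runs.back().count;       // index extends the current run
        } else {
            runs.push_back({index, 1});  // index starts a new run
        }
    }
    // Every input index ends up in exactly one run.
    assert(changed.empty() == runs.empty());
    return runs;
}
```

For each run, the byte offset into the vertex buffer is first * sizeof(ST_Vertex) and the byte length is count * sizeof(ST_Vertex).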

WebGL attempt to access out of range vertices in attribute 2 error

I know this question has been asked quite a bit, but none of the solutions really fit my case. I am looking to add a second type of object to the canvas with the code shown below. I know I didn't provide much, but it's a quick start. Just ask for more if you think you have a hunch. The code below is in my render function.
So far I have checked that
I have enough vertices in my points array
I have enough normal vectors in my normals array
I have enough texture coordinates in my texCoords array
There are no mismatches between the vectors added when creating my terrain and my propeller.
The terrain renders just fine with the texture, lighting and all, but I am unable to get the propeller to render. I get the error I listed above. I have added multiple objects to canvases before and never run into an error like this.
//----------------------------------------- Draw Terrain ------------------------------------
var i = 0;
for(var row=0-dimension; row<dimension; row+=3){
for(var col=0-dimension; col<dimension; col+=3, i++){
var mv = mult(viewer, mult(translate(row, -1, col), mult(scale[i],rot[i])));
gl.uniformMatrix4fv(modelViewLoc, false, flatten(mv));
gl.uniformMatrix3fv(normalLoc, false, flatten(normalMatrix(mv, true)));
gl.drawArrays( gl.TRIANGLES, 0, index);
}
}
//----------------------------------------- Draw Propeller ------------------------------------
mv = mult(viewer, mult( translate(-2.1, -2.9, -.2), scalem(4,5,5)));
gl.uniformMatrix4fv(modelViewLoc, false, flatten(mv));
gl.uniformMatrix3fv(normalLoc, false, flatten(normalMatrix(mv, true)));
gl.drawArrays( gl.TRIANGLES, propellerStart, points.length);
Is there any way I can use the "Attribute 2" in the error message to track down the variable giving me this issue?
Appreciate the help!
What part don't you understand? The error is clear, whatever buffer you have attached to attribute 2 is not big enough to handle the propellerStart, points.length draw request.
So the first thing is to figure out which attribute is attribute 2. Do this by printing out your attribute locations. Is it your points, normals, or texcoords?
You should already be looking them up somewhere with gl.getAttribLocation, so print out those values and find out which one is #2.
Then go look at the size of the buffer you attached to that attribute. To do that, somewhere you would have called:
gl.bindBuffer(gl.ARRAY_BUFFER, someBuffer);
gl.vertexAttribPointer(locationForAttribute2, size, type, normalize, stride, offset);
So we know it's someBuffer from the above code. We also need to know size, type, stride, and offset
Somewhere else you filled that buffer with data using
gl.bindBuffer(gl.ARRAY_BUFFER, someBuffer);
gl.bufferData(gl.ARRAY_BUFFER, someData, ...);
So you need to find the size of someData.
sizeOfBuffer = someData.length * someData.BYTES_PER_ELEMENT
Let's say it's a 1000-element Float32Array, so someData.length is 1000 and someData.BYTES_PER_ELEMENT is 4; therefore sizeOfBuffer is 4000.
Using all of that you can now check if your buffer is too small. (Note: we already know it's too small since the browser told us so, but this is how to compute it yourself.)
Let's say size is 3, type is gl.FLOAT, stride is 64, offset is 12 (note: I personally never use anything but stride = 0 and offset = 0)
Let's say points.length = 50
numPoints = points.length;
bytesPerElement = size * 4; // because a gl.FLOAT is 4 bytes
realStride = stride === 0 ? bytesPerElement : stride;
bytesNeeded = realStride * (numPoints - 1) + bytesPerElement;
bytesNeeded in this case is (64 * 49) + 12 = 3148
So now we know how many bytes are needed. Does our buffer have enough data? Well, when you called draw you passed in a starting vertex, propellerStart. Let's assume it's 900. gl.drawArrays counts that in vertices, not bytes, so it skips propellerStart * realStride bytes, and there's also the byte offset in the attribute, so:
bufferSizeNeeded = offset + propellerStart * realStride + bytesNeeded
With the 64-byte stride used above, bufferSizeNeeded = 12 + (900 * 64) + 3148, which is 60760. Since 60760 is > sizeOfBuffer, which was 4000, you're going to get the error you got.
In any case, the point is that it's up to you to figure out which buffer is used by attribute #2, then go look at why your buffer is too small. Is your offset to drawArrays wrong? Is your stride too big? Is your offset wrong in vertexAttribPointer (it's in number of bytes, not number of units)? Did you put the wrong size (1, 2, 3, 4)? Did you miscalculate the number of points?
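The size check can be written as one small function. This is a sketch in C++ of the arithmetic only, not of any WebGL API; requiredBufferBytes is a hypothetical name. It treats first as a vertex index, which is how gl.drawArrays(mode, first, count) defines it, and offset and stride as byte values, as in gl.vertexAttribPointer.

```cpp
#include <cassert>
#include <cstddef>

// Smallest buffer size, in bytes, that one attribute needs for a
// drawArrays(mode, first, count) call.
std::size_t requiredBufferBytes(std::size_t first, std::size_t count,
                                std::size_t size, std::size_t bytesPerComponent,
                                std::size_t stride, std::size_t offset) {
    assert(count > 0);
    std::size_t bytesPerElement = size * bytesPerComponent;
    // A stride of 0 means the elements are tightly packed.
    std::size_t realStride = (stride == 0) ? bytesPerElement : stride;
    // The last vertex read is first + count - 1; it needs one full element.
    std::size_t lastVertex = first + count - 1;
    return offset + realStride * lastVertex + bytesPerElement;
}
```

If the result is larger than the byte size of the buffer bound to the attribute, the draw call reads out of range.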

Spectrogram from AVAudioPCMBuffer using Accelerate framework in Swift

I'm trying to generate a spectrogram from an AVAudioPCMBuffer in Swift. I install a tap on an AVAudioMixerNode and receive a callback with the audio buffer. I'd like to convert the signal in the buffer to a [Float:Float] dictionary where the key represents the frequency and the value represents the magnitude of the audio on the corresponding frequency.
I tried using Apple's Accelerate framework but the results I get seem dubious. I'm sure it's just in the way I'm converting the signal.
I looked at this blog post amongst other things for a reference.
Here is what I have:
self.audioEngine.mainMixerNode.installTapOnBus(0, bufferSize: 1024, format: nil, block: { buffer, when in
let bufferSize: Int = Int(buffer.frameLength)
// Set up the transform
let log2n = UInt(round(log2(Double(bufferSize))))
let fftSetup = vDSP_create_fftsetup(log2n, Int32(kFFTRadix2))
// Create the complex split value to hold the output of the transform
var realp = [Float](count: bufferSize/2, repeatedValue: 0)
var imagp = [Float](count: bufferSize/2, repeatedValue: 0)
var output = DSPSplitComplex(realp: &realp, imagp: &imagp)
// Now I need to convert the signal from the buffer to complex value, this is what I'm struggling to grasp.
// The complexValue should be UnsafePointer<DSPComplex>. How do I generate it from the buffer's floatChannelData?
vDSP_ctoz(complexValue, 2, &output, 1, UInt(bufferSize / 2))
// Do the fast Fourier forward transform
vDSP_fft_zrip(fftSetup, &output, 1, log2n, Int32(FFT_FORWARD))
// Convert the complex output to magnitude
var fft = [Float](count:Int(bufferSize / 2), repeatedValue:0.0)
vDSP_zvmags(&output, 1, &fft, 1, vDSP_length(bufferSize / 2))
// Release the setup
vDSP_destroy_fftsetup(fftSetup)
// TODO: Convert fft to [Float:Float] dictionary of frequency vs magnitude. How?
})
My questions are:
1: How do I convert the buffer.floatChannelData to UnsafePointer<DSPComplex> to pass to the vDSP_ctoz function? Is there a different/better way to do it maybe even bypassing vDSP_ctoz?
2: Is this different if the buffer contains audio from multiple channels? How is it different when the buffer audio channel data is or isn't interleaved?
3: How do I convert the indices in the fft array to frequencies in Hz?
4: Anything else I may be doing wrong?
Update
Thanks everyone for suggestions. I ended up filling the complex array as suggested in the accepted answer. When I plot the values and play a 440 Hz tone on a tuning fork it registers exactly where it should.
Here is the code to fill the array:
var channelSamples: [[DSPComplex]] = []
for var i=0; i<channelCount; ++i {
channelSamples.append([])
let firstSample = buffer.format.interleaved ? i : i*bufferSize
for var j=firstSample; j<bufferSize; j+=buffer.stride*2 {
channelSamples[i].append(DSPComplex(real: buffer.floatChannelData.memory[j], imag: buffer.floatChannelData.memory[j+buffer.stride]))
}
}
The channelSamples array then holds separate array of samples for each channel.
To calculate the magnitude I used this:
var spectrum = [Float]()
for var i=0; i<bufferSize/2; ++i {
let imag = out.imagp[i]
let real = out.realp[i]
let magnitude = sqrt(pow(real,2)+pow(imag,2))
spectrum.append(magnitude)
}
1: Hacky way: you can just cast a float array, where the real and imaginary values follow one another.
2: It depends on whether the audio is interleaved or not. If it's interleaved (most cases), the left and right channels are in the array with a stride of 2.
3: The lowest frequency in your case is the frequency of a period of 1024 samples. At a 44100 Hz sample rate that period is ~23 ms, so the lowest frequency of the spectrum will be 1/(1024/44100) (~43 Hz). The next frequency will be twice this (~86 Hz), and so on.
4: You have installed a callback handler on an audio bus. This is likely run with real-time thread priority, and frequently. You should not do anything that has the potential to block (it will likely result in priority inversion and glitchy audio):
Allocate memory (realp, imagp: [Float](...) is shorthand for Array<Float> and is likely allocated on the heap). Pre-allocate these.
Call lengthy operations such as vDSP_create_fftsetup(), which also allocates memory and initialises it. Again, you can allocate this once outside of your function.
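The index-to-frequency mapping can be written out directly: bin k of an N-point transform corresponds to k * sampleRate / N. Here is a minimal sketch in C++ of just that arithmetic; binFrequencyHz and nearestBin are hypothetical helper names, and the sample rate is assumed to be known from the buffer's format.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Centre frequency in Hz of FFT bin `bin` for an fftSize-point transform.
double binFrequencyHz(std::size_t bin, std::size_t fftSize, double sampleRate) {
    assert(fftSize > 0);
    return static_cast<double>(bin) * sampleRate / static_cast<double>(fftSize);
}

// Index of the bin whose centre frequency is closest to frequencyHz.
std::size_t nearestBin(double frequencyHz, std::size_t fftSize, double sampleRate) {
    assert(sampleRate > 0.0 && frequencyHz >= 0.0);
    double exact = frequencyHz * static_cast<double>(fftSize) / sampleRate;
    return static_cast<std::size_t>(std::lround(exact));
}
```

With 1024 samples at 44100 Hz, bin 1 is about 43 Hz and bin 2 about 86 Hz, matching the figures above. Bin 0 is the DC component.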

How do I interpret an AudioBuffer and get the power?

I am trying to make a volume-meter for my app, which will show while recording a video. I have found a lot of support for such meters for iOS, but mostly for AVAudioPlayer, which is no option for me. I am using AVCaptureSession to record, and will then end up with the delegate method shown below:
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
CMFormatDescriptionRef formatDescription = CMSampleBufferGetFormatDescription(sampleBuffer);
CFRetain(sampleBuffer);
CFRetain(formatDescription);
if(connection == audioConnection)
{
CMBlockBufferRef blockBuffer;
AudioBufferList audioBufferList;
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampleBuffer,
NULL, &audioBufferList, sizeof(AudioBufferList), NULL, NULL,
kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment,
&blockBuffer);
SInt16 *data = audioBufferList.mBuffers[0].mData;
}
//Releases etc..
}
(Only showing relevant code)
Of what I understand, I receive a 'sample buffer', containing either audio or video. Once I've verified that the connection indeed is audio, then I 'extract' the audioBufferList from the buffer, and I am sitting here left with a list of one (or more?) audioBuffers. The actual data is, as I understand, represented as SInt16, or '16 bits signed integer', which as far as I understand has a range from -32,768 to 32,767. However, if I simply print out this received value, I get A LOT of bouncing numbers. When in "silence" I get values bouncing rapidly between -200 and 200, and when there's noise I get values from -4,000 to 13,000, completely out of order.
As I've understood from reading, the value 0 represents silence. However, I do not understand the difference between negative and positive values, and I do not know whether they can reach all the way up/down to +-32,768.
I believe I need a percentage of how 'loud' it is, but have been unable to find anything.
I have read a couple of tutorials and references on the matter, but nothing makes sense to me. I followed one guide by doing this(appending to the code above, inside the if):
float accumulator = 0;
for(int i = 0; i < audioBufferList.mBuffers[0].mDataByteSize; i++)
accumulator += data[i] * data[i];
float power = accumulator / audioBufferList.mBuffers[0].mDataByteSize;
float decibels = log10f(power);
NSLog(@"%f", decibels);
Apparently, this code was supposed to produce values between -1 and +1, but that did not happen. I am now getting values around 6.194681 for silence, and 7.773492 for some noise. This feels like the correct 'range', but in the 'wrong place'. I can't simply subtract 7 from the number and assume I'm between -1 and +1. There should be some logic and science behind how this should work, but I do not know enough about how digital audio works.
Does anyone know the logic behind this? Is 0 always silence while -32,768 and 32,767 are loud noises? Can I then simply multiply all negative values by -1 to always get positive values, and then find out how many percent they are at (between 0 and 32767)? Somehow, I don't believe this will work, as I guess there is a reason for the negative values.. I'm not completely sure what to try.
The code in your question is wrong in several ways. It tries to copy the code from the article below, but you have not properly converted the article's float-based code to 16-bit integer math. You're also looping over the wrong number of values (the upper bound of i is a byte count, not a sample count) and will end up pulling in garbage data. So this is all kinds of wrong.
https://www.mikeash.com/pyblog/friday-qa-2012-10-12-obtaining-and-interpreting-audio-data.html
The code in the article is correct. Here's what it is, expanded a bit. This is only looking at the first buffer in a 32-bit float buffer list.
float accumulator = 0;
AudioBuffer buffer = bufferList->mBuffers[0];
float * data = (float *)buffer.mData;
UInt32 numSamples = buffer.mDataByteSize / sizeof(float);
for (UInt32 i = 0; i < numSamples; i++) {
accumulator += data[i] * data[i];
}
float power = accumulator / (float)numSamples;
float decibels = 10 * log10f(power);
As the article says, the result here is in decibels relative to a 0 dB reference, i.e. 0.0 is the maximum value. This is the same thing that AVAudioPlayer's averagePowerForChannel returns, for example.
To use this in your 16-bit integer context, you'd need to a) loop appropriately through each 16-bit sample, and b) convert the data[i] value from a 16-bit integer to a floating point value in the [-1.0, 1.0] range before squaring and adding to the accumulator.
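Steps a) and b) might look like the following. This is a sketch in portable C++ rather than Objective-C, assuming a single buffer of signed 16-bit samples; averagePowerDb is a hypothetical name, and the sample count would be mDataByteSize / sizeof(SInt16).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>

// Average power of 16-bit PCM samples in dB, where 0 dB is full scale.
double averagePowerDb(const std::int16_t* samples, std::size_t count) {
    assert(samples != nullptr || count == 0);
    if (count == 0) {
        return -std::numeric_limits<double>::infinity();
    }
    double accumulator = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        // Scale the integer sample into [-1.0, 1.0) before squaring.
        double sample = samples[i] / 32768.0;
        accumulator += sample * sample;
    }
    double power = accumulator / static_cast<double>(count);
    // log10(0) is -infinity, which corresponds to digital silence.
    return 10.0 * std::log10(power);
}
```

Results are zero or negative: values near 0 are loud, and increasingly negative values are quieter. Squaring is also why the sign of a sample does not matter for loudness.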

minMaxLoc maximum value confusion

I have been using minMaxLoc to compute the maximum value of the data obtained by running a laplacian filter over a grayscale image. My simple intention is to roughly estimate the sharpness. I encountered a confusing situation, which I have discussed here.
The max values obtained from the minMaxLoc function were 1360, 1456, 450, etc. for a set of images.
Laplacian(src_gray,dst,ddepth,kernel_size,scale,delta,BORDER_DEFAULT);
minMaxLoc(dst,&min,&estimate,&minLoc,&maxLoc,noArray());//Estimate is the max value
Now I just tried to compute the average to have a better idea of the spread of the sharpness in the image. Note that DST is the Mat variable holding data from the Laplacian.
Size s = dst.size();
rows = s.height;
cols = s.width;
total = 0;
max = 0;
for(int k=0;k<rows;k++)
{
for(int l=0;l<cols;l++)
{
total = total + abs(dst.at<int>(k,l));
}
}
average = total/(rows*cols);
There are 2 baffling results I obtained. The average value was not only greater than the max value obtained from minMaxLoc, but was also at times negative, when tried over a set of images. Sample average values were 22567 and, at times, -25678.
The occurrence of negative values was even more baffling, as I am using abs() to get the absolute value of the Laplacian results.
To get a proper understanding, I calculated the max value by myself and then the average values :
Size s = dst.size();
rows = s.height;
cols = s.width;
total = 0;
max = 0;
for(int k=0;k<rows;k++)
{
for(int l=0;l<cols;l++)
{
if(abs(dst.at<int>(k,l))>max)
{
max = abs(dst.at<int>(k,l));
}
total = total + abs(dst.at<int>(k,l));
}
}
average = total/(rows*cols);
Surprisingly, I found the max value to have 8 digits.
This is why I got confused. What is the max value given by the minMaxLoc function? Why is the abs() in total not working, and why am I getting negative average values?
Please forgive me if I am missing something in the code, but this is slightly confusing me. Thanks for your help in advance.
I think you should use .at<uchar> instead of .at<int> (assuming the image is grayscale); otherwise the values you read will be wrong!
Typically images have 8-bit pixels, so chances are that you are accessing the pixels of your matrix using the wrong type. In that case, the values you read from the matrix are wrong.
To check if you are working with a single channel 32-bit integer matrix use
dst.type() == CV_32SC1 .
To check for an unsigned 8-bit matrix use
dst.type() == CV_8UC1 .
If you actually have such an 8-bit matrix, you need to use .at<uchar> to access the pixels.
The reason your total variable is negative even though you only added positive values to it is probably due to an integer overflow. You can avoid this by using a 64-bit type such as long long for total.
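The overflow point can be shown without OpenCV. This is a minimal C++ sketch that assumes the filter output is stored as signed 16-bit values (as it would be with a CV_16S depth, which the question does not show); meanAbsolute is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Mean of the absolute values, accumulated in 64 bits so the sum cannot
// overflow for any realistic image size.
std::int64_t meanAbsolute(const std::vector<std::int16_t>& values) {
    assert(!values.empty());
    std::int64_t total = 0;
    for (std::int16_t v : values) {
        // Widen first so that negating -32768 is safe.
        std::int64_t wide = v;
        total += (wide < 0) ? -wide : wide;
    }
    return total / static_cast<std::int64_t>(values.size());
}
```

With 100,000 values of magnitude 30,000 the sum is 3,000,000,000, which no longer fits in a signed 32-bit int but is handled here.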
