Converting 16-bit short to 32 bit float - ios

In the tone generator example for iOS (http://www.cocoawithlove.com/2010/10/ios-tone-generator-introduction-to.html), I am trying to convert a short array to Float32.
Float32 *buffer = (Float32 *)ioData->mBuffers[channel].mData;
short* outputShortBuffer = static_cast<short*>(outputBuffer);
for (UInt32 frame = 0, j=0; frame < inNumberFrames; frame++, j=j+2)
{
buffer[frame] = outputShortBuffer[frame];
}
For some reason, I hear added noise when this is played back through the speaker. Is there a problem with my conversion from short to Float32?

Yes, there is.
Consider that the value range for floating-point samples is -1.0 <= Xn <= 1.0, while for signed short it is -32768 <= Xn <= +32767. Merely casting will result in clipping on virtually all samples.
So taking this into account:
Float32 *buffer = (Float32 *)ioData->mBuffers[channel].mData;
short* outputShortBuffer = static_cast<short*>(outputBuffer);
for (UInt32 frame = 0; frame < inNumberFrames; frame++)
{
    // scale by 1/32768 so that the full short range maps into [-1.0, 1.0)
    buffer[frame] = ((float) outputShortBuffer[frame]) / 32768.0f;
}
[Note: this is not the optimal way of doing this].
However, are you sure your frames are mono? If not, this might also be a cause of audio corruption, as you'll only be copying one channel.
As an aside, if your output buffer holds floats, why not use floats throughout?
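To make the scaling and the mono/stereo point concrete, here is a minimal sketch in plain C. The function name int16_to_float and the in_stride parameter are my own for illustration and are not part of the tone generator example; it assumes the short data is either mono (stride 1) or interleaved (stride 2 for stereo).

```c
#include <stddef.h>
#include <stdint.h>

/* Convert n signed 16-bit samples to floats in [-1.0, 1.0).
   in_stride selects one channel out of interleaved input:
   1 for mono, 2 for one channel of interleaved stereo. */
void int16_to_float(const int16_t *in, size_t in_stride, float *out, size_t n) {
    const float scale = 1.0f / 32768.0f;  /* maps -32768 to exactly -1.0 */
    for (size_t i = 0; i < n; i++) {
        out[i] = (float)in[i * in_stride] * scale;
    }
}
```

For interleaved stereo, passing the buffer with a stride of 2 picks out the left channel, and passing buffer + 1 with the same stride picks out the right one.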


How to get more precise output out of an FFT?

I am trying to make a colored waveform using the output of the following code. But when I run it, I only get certain numbers (see the freq variable, it uses the bin size, frame rate and index to make these frequencies) as output frequencies. I'm no math expert, even though I cobbled this together from existing code and answers.
//
// colored_waveform.c
// MixDJ
//
// Created by Jonathan Silverman on 3/14/19.
// Copyright © 2019 Jonathan Silverman. All rights reserved.
//
#include "colored_waveform.h"
#include "fftw3.h"
#include <math.h>
#include "sndfile.h"
//int N = 1024;
// helper function to apply a windowing function to a frame of samples
void calcWindow(double* in, double* out, int size) {
for (int i = 0; i < size; i++) {
double multiplier = 0.5 * (1 - cos(2*M_PI*i/(size - 1)));
out[i] = multiplier * in[i];
}
}
// helper function to compute FFT
void fft(double* samples, fftw_complex* out, int size) {
fftw_plan p;
p = fftw_plan_dft_r2c_1d(size, samples, out, FFTW_ESTIMATE);
fftw_execute(p);
fftw_destroy_plan(p);
}
// find the index of array element with the highest absolute value
// probably want to take some kind of moving average of buf[i]^2
// and return the maximum found
double maxFreqIndex(fftw_complex* buf, int size, float fS) {
double max_freq = 0;
double last_magnitude = 0;
for(int i = 0; i < (size / 2) - 1; i++) {
double freq = i * fS / size;
// printf("freq: %f\n", freq);
double magnitude = sqrt(buf[i][0]*buf[i][0] + buf[i][1]*buf[i][1]);
if(magnitude > last_magnitude)
max_freq = freq;
last_magnitude = magnitude;
}
return max_freq;
}
//
//// map a frequency to a color, red = lower freq -> violet = high freq
//int freqToColor(int i) {
//
//}
void generateWaveformColors(const char path[]) {
printf("Generating waveform colors\n");
SNDFILE *infile = NULL;
SF_INFO sfinfo;
infile = sf_open(path, SFM_READ, &sfinfo);
sf_count_t numSamples = sfinfo.frames;
// sample rate
float fS = 44100;
// float songLengLengthSeconds = numSamples / fS;
// printf("seconds: %f", songLengLengthSeconds);
// size of frame for analysis, you may want to play with this
float frameMsec = 5;
// samples in a frame
int frameSamples = (int)(fS / (frameMsec * 1000));
// how much overlap each frame, you may want to play with this one too
int frameOverlap = (frameSamples / 2);
// color to use for each frame
// int outColors[(numSamples / frameOverlap) + 1];
// scratch buffers
double* tmpWindow;
fftw_complex* tmpFFT;
tmpWindow = (double*) fftw_malloc(sizeof(double) * frameSamples);
tmpFFT = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * frameSamples);
printf("Processing waveform for colors\n");
for (int i = 0, outptr = 0; i < numSamples; i += frameOverlap, outptr++)
{
double inSamples[frameSamples];
sf_read_double(infile, inSamples, frameSamples);
// window another frame for FFT
calcWindow(inSamples, tmpWindow, frameSamples);
// compute the FFT on the next frame
fft(tmpWindow, tmpFFT, frameSamples);
// which frequency is the highest?
double freqIndex = maxFreqIndex(tmpFFT, frameSamples, fS);
printf("%i: ", i);
printf("Max freq: %f\n", freqIndex);
// map to color
// outColors[outptr] = freqToColor(freqIndex);
}
printf("Done.");
sf_close (infile);
}
Here is some of the output:
2094216: Max freq: 5512.500000
2094220: Max freq: 0.000000
2094224: Max freq: 0.000000
2094228: Max freq: 0.000000
2094232: Max freq: 5512.500000
2094236: Max freq: 5512.500000
It only shows certain numbers, not a wide variety of frequencies like it maybe should. Or am I wrong? Is there anything wrong with my code you guys can see? The color stuff is commented out because I haven't done it yet.
The frequency resolution of an FFT is limited by the length of the data sample you have. The more samples you have, the higher the frequency resolution.
In your specific case you chose frames of 5 milliseconds, which is then transformed to a number of samples on the following line:
// samples in a frame
int frameSamples = (int)(fS / (frameMsec * 1000));
This corresponds to only 8 samples at the specified 44100Hz sampling rate. The frequency resolution with such a small frame size can be computed to be
44100 / 8
or 5512.5Hz, a rather poor resolution. Correspondingly, the observed frequencies will always be one of 0, 5512.5, 11025, 16537.5 or 22050Hz.
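To get a higher resolution you should increase the number of samples used for analysis (the comment "size of frame for analysis, you may want to play with this" points at the right knob). Note, however, that the expression above divides by frameMsec, so raising frameMsec as written makes the frame even shorter. Converting milliseconds to samples is fS * frameMsec / 1000, which gives 220 samples for 5 ms at 44100Hz and a resolution of roughly 200Hz.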
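The arithmetic can be sketched as two small helpers. The names frame_samples_for and bin_resolution_hz are hypothetical and not part of the question's code; this only illustrates the relationship between frame length and bin spacing.

```c
/* Number of whole samples in a frame of frame_msec milliseconds
   at sample_rate Hz (truncates any fractional sample). */
int frame_samples_for(double sample_rate, double frame_msec) {
    return (int)(sample_rate * frame_msec / 1000.0);
}

/* Spacing between adjacent FFT bins, in Hz, for an n-point transform. */
double bin_resolution_hz(double sample_rate, int n) {
    return sample_rate / n;
}
```

With these, bin_resolution_hz(44100.0, 8) is the 5512.5Hz spacing seen in the output, while a 50 ms frame (frame_samples_for(44100.0, 50.0), i.e. 2205 samples) brings the spacing down to 20Hz at the cost of coarser time resolution.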

Initialising texture of MTLPixelFormatR32Float in metal

I have a buffer initialised with a single-channel floating point image, which I need to get into a floating point format texture (MTLPixelFormatR32Float). I've tried creating the texture with that format and doing the following:
float *rawData = (float*)malloc(sizeof(float) * img.cols * img.rows);
for(int i = 0; i < img.rows; i++){
for(int j = 0; j < img.cols; j++){
rawData[i * img.cols + j] = img.at<float>(i, j);
}
}
MTLTextureDescriptor *textureDescriptor = [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:MTLPixelFormatR32Float
width:img.cols
height:img.rows
mipmapped:NO];
[texture replaceRegion:region mipmapLevel:0 withBytes:&rawData bytesPerRow:bytesPerRow];
where rawData is my buffer with the necessary floating point data. This doesn't work, I get an EXC_BAD_ACCESS error on the [texture replaceRegion...] line. I've also tried the MTKTextureLoader, which also returns nil instead of the texture.
Help would be appreciated. I would be most grateful if anyone has a working method of how to initialise the MTLPixelFormatR32Float texture with custom floating point data for data-parallel computation purposes.
The bytes that you pass to replaceRegion should point to your data. You are incorrectly passing a pointer to a pointer.
To fix it, replace withBytes:&rawData with withBytes:rawData.
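The difference is easier to see in plain C. In this sketch upload_rows is a hypothetical stand-in for the texture upload (it just copies bytes), and r32float_bytes_per_row assumes the rows are tightly packed, which the question does not actually show.

```c
#include <stddef.h>
#include <string.h>

/* Bytes per row for a tightly packed single-channel 32-bit float image. */
size_t r32float_bytes_per_row(size_t width) {
    return width * sizeof(float);
}

/* Hypothetical stand-in for an upload call: copies rows * bytes_per_row
   bytes starting at src, the way a texture upload reads pixel data. */
void upload_rows(void *dst, const void *src, size_t bytes_per_row, size_t rows) {
    memcpy(dst, src, bytes_per_row * rows);
}
```

Given float *rawData, calling upload_rows(dst, rawData, ...) reads the pixel values. Calling it with &rawData would start reading at the pointer variable itself, which is only sizeof(float *) bytes long, so the copy runs off into unrelated memory. That is consistent with the EXC_BAD_ACCESS in the question.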

Kiss fft does not work after giving it more than 32 samples

I am trying to take data from an accelerometer and apply Kiss FFT to the samples. I'm using a Freescale Kinetis FRDM-K22F board. I want to use 64 samples, but when I run the program I get an error saying "kiss fft usage error: improper alloc". I started turning down the sample size and saw that the FFT does work with 32 samples, but with 33 samples the program just stops and returns no errors. Giving it any more samples gives similar results.
I played around with how I set up the FFT and followed a few websites and forum posts:
KissFFT output of kiss_fftr
http://digiphd.com/programming-reconstruction-fast-fourier-transform-real-signal-kiss-fft-libraries/
Kiss FFT on a dsPIC33
From what I can see, I haven't done anything different from what the above websites and forums have done. I've included my code below. Any help or advice is greatly appreciated.
void Sample_RUN()
{
int size = 64;
kiss_fft_scalar zero;
memset(&zero,0,sizeof(zero));
kiss_fft_cpx fft_in[size];
kiss_fft_cpx fft_out[size];
kiss_fftr_cfg fft = kiss_fftr_alloc(size*2 ,0 ,NULL,NULL);
signed short samples[size];
for (int i = 0; i < size; i++) {
fft_in[i].r = zero;
fft_in[i].i = zero;
fft_out[i].r = zero;
fft_out[i].i = zero;
}
printf("Data Collection Begins \r\n");
for(int j = 0; j < size; j++)
{
for(;;)
{
dr_status = My_I2C_ReadByte(STATUS_REG);
dr_status = (dr_status & 0x04);
if (dr_status == 0x04)
{
//READING FROM ACCEL OUTPUT DATA REGISTERS
AccelData[0] = My_I2C_ReadByte(OUT_X_MSB_REG);
AccelData[1] = My_I2C_ReadByte(OUT_X_LSB_REG);
AccelData[2] = My_I2C_ReadByte(OUT_Y_MSB_REG);
AccelData[3] = My_I2C_ReadByte(OUT_Y_LSB_REG);
AccelData[4] = My_I2C_ReadByte(OUT_Z_MSB_REG);
AccelData[5] = My_I2C_ReadByte(OUT_Z_LSB_REG);
// 14-bit accelerometer data
Xout_Accel_14_bit = ((signed short) (AccelData[0]<<8 | AccelData[1])) >> 2; // Compute 16-bit X-axis acceleration output value
Yout_Accel_14_bit = ((signed short) (AccelData[2]<<8 | AccelData[3])) >> 2; // Compute 16-bit Y-axis acceleration output value
Zout_Accel_14_bit = ((signed short) (AccelData[4]<<8 | AccelData[5])) >> 2; // Compute 16-bit Z-axis acceleration output value
mag_accel = sqrt(pow(Xout_Accel_14_bit, 2) + pow(Yout_Accel_14_bit, 2) + pow(Zout_Accel_14_bit, 2) );
printf("%d \r\n", mag_accel);
samples[j] = mag_accel;
break;
} // end if
} // end infinite for
} // end for
for (int j = 0; j < size; j++)
{
fft_in[j].r = samples[j];
fft_in[j].i = zero;
fft_out[j].r = zero;
fft_out[j].i = zero;
}
printf("Executing FFT\r\n");
kiss_fftr(fft, (kiss_fft_scalar*) fft_in, fft_out);
printf("Printing FFT Outputs\r\n");
for(int j = 0; j < size; j++)
{
printf("%d \r\n", fft_out[j].r);
}
kiss_fft_cleanup();
free(fft);
} // end Sample_RUN
Sounds like you are running out of memory. I am not familiar with that chip, but perhaps you should be using the last two arguments of kiss_fftr_alloc (mem and lenmem, both NULL in your call) to supply your own buffer, so you can skip heap allocation.

using HLSL to invisibly stress a graphics card - How to stress the memory?

For a while now I've been developing an invisible (read: doesn't produce any visual output) stressor to test the capabilities of my graphics card (and as an exploration of DirectCompute in general, which I'm pretty new to). I've got the following code right now that I'm pretty proud of:
RWStructuredBuffer<uint> BufferOut : register(u0);
[numthreads(1, 1, 1)]
void CSMain( uint3 DTid : SV_DispatchThreadID )
{
uint total = 0;
float p = 0;
while(p++ < 40.0){
float s= 4.0;
float M= pow(2.0,p) - 1.0;
for(uint i=0; i <= p - 2; i++)
{
s=((s*s) - 2) % M;
}
if(s < 1.0) total++;
}
BufferOut[DTid.x] = total;
}
This runs the Lucas-Lehmer test for the first 40 powers of two. When I dispatch this code in a timed loop and look at my graphics card's stats using GPU-Z, the GPU load shoots to 99% for the duration. I'm pretty happy with this, but I also notice that the heat generated by a fully loaded GPU is actually pretty minimal (I'm getting about a 5 to 10 degree Celsius jump, nowhere near the jump I get when running, say, Borderlands 2). My thought is that most of the heat comes from memory accesses, so I would need to include consistent memory accesses across the run. My initial code looked like this:
RWStructuredBuffer<uint> BufferOut : register(u0);
groupshared float4 memory_buffer[1024];
[numthreads(1, 1, 1)]
void CSMain( uint3 DTid : SV_DispatchThreadID )
{
uint total = 0;
float p = 0;
while(p++ < 40.0){
[fastop] // to lower compile times - Code efficiency is strangely not what Im looking for right now.
for(uint i = 0; i < 1024; ++i)
float s= 4.0;
float M= pow(2.0,p) - 1.0;
for(uint i=0; i <= p - 2; i++)
{
s=((s*s) - 2) % M;
}
if(s < 1.0) total++;
}
BufferOut[DTid.x] = total;
}
Read a lot of non-coherent samples from large textures. Try both DXT1-compressed and uncompressed formats, and use render-to-texture and multiple render targets (MRT). All of these will beat on the GPU memory system.

Cepstrum and Formant Tracking Using Apple Accelerate Framework

I've been using this web page as a guideline for formant tracking of speech...
http://iitg.vlab.co.in/?sub=59&brch=164&sim=615&cnt=1
It all seems to be going pretty well, except for the last step, which is converting the cepstrum into a smoothed representation for simple peak picking for the formant tracking. The spectrograph looks good, and the cepstrograph (can I say that? :P) also looks good (from what I can tell), but the results of the final stage (the smoothed formant representation) are not what I expected.
I uploaded a sample of each stage as visual images to...
http://imgur.com/a/62duS
This sample is for the speech of the sound 'i' as in 'beed'. According to this site...
http://home.cc.umanitoba.ca/~robh/howto.html#formants
the first formant should come in around 500 Hz, and the second and third around 2200 Hz and 2800 Hz respectively. The spectrograph shows something very similar, but at the last stage I am getting results similar to...
F1 - 891
F2 - 1550
F3 - 2329
Any insight would be greatly appreciated. I've been going round in circles on this for some time. My code looks as follows...
// set up fft parameters
UInt32 log2n = 9;
UInt32 n = 512;
UInt32 window = n;
UInt32 halfN = n/2;
UInt32 stride = 1;
FFTSetup setupReal = [appDelegate getFftSetup];
int stepSize = (hpBuffer.sampleCount-window) / quantizeCount;
// calculate volume from raw samples, because it seems more reliable than the fft
UInt32 volumeWindow = 128;
volumeBuffer = malloc(sizeof(float)*quantizeCount);
int windowPos = 0;
for (int i=0; i < quantizeCount; i++) {
windowPos += stepSize;
float total = 0.0f;
float max = 0.0f;
for (int p=windowPos; p < windowPos+volumeWindow; p++) {
total += sampleBuffer.buffer[p];
if (sampleBuffer.buffer[p] > max)
max = sampleBuffer.buffer[p];
}
volumeBuffer[i] = max;
}
// normalize volumebuffer
[FloatAudioBuffer normalizePositiveBuffer:volumeBuffer ofSize:quantizeCount];
// allocate memory for complex array
COMPLEX_SPLIT complexArray;
complexArray.realp = (float*)malloc(4096*sizeof(float));
complexArray.imagp = (float*)malloc(4096*sizeof(float));
// allocate some space for temporary hamming buffer
float *hamBuffer = malloc(n*sizeof(float));
// create spectrum and feature buffer
spectrumBuffer = malloc(sizeof(float)*halfN*quantizeCount);
formantBuffer = malloc(sizeof(float)*4096*quantizeCount);
cepstrumBuffer = malloc(sizeof(float)*halfN*quantizeCount);
lowCepstrumBuffer = malloc(sizeof(float)*featureCount*quantizeCount);
featureBuffer = malloc(sizeof(float)*featureCount*quantizeCount);
// create data point for each quantize segment
float TWOPI = 2.0f * M_PI;
for (int s=0; s < quantizeCount; s++) {
// copy buffer data into a separate array and apply hamming window
int offset = (int)(s * stepSize);
for (int i=0; i < n; i++)
hamBuffer[i] = hpBuffer.buffer[offset+i] * ((1.0f-0.46f) - 0.46f*cos(TWOPI*i/((float)n-1.0f)));
// configure float array into acceptable input array format (interleaved)
vDSP_ctoz((COMPLEX*)hamBuffer, 2, &complexArray, 1, halfN);
// run FFT
vDSP_fft_zrip(setupReal, &complexArray, stride, log2n, FFT_FORWARD);
// Absolute square (equivalent to mag^2)
complexArray.imagp[0] = 0.0f;
vDSP_zvmags(&complexArray, 1, complexArray.realp, 1, halfN);
bzero(complexArray.imagp, (halfN) * sizeof(float));
// scale
float scale = 1.0f / (2.0f*(float)n);
vDSP_vsmul(complexArray.realp, 1, &scale, complexArray.realp, 1, halfN);
// get log of absolute values for passing to inverse FFT for cepstrum
for (int i=0; i < halfN; i++)
complexArray.realp[i] = logf(sqrtf(complexArray.realp[i]));
// save this into spectrum buffer
memcpy(&spectrumBuffer[s*halfN], complexArray.realp, halfN*sizeof(float));
// convert spectrum to interleaved ready for inverse fft
vDSP_ctoz((COMPLEX*)&spectrumBuffer[s*halfN], 2, &complexArray, 1, halfN/2);
// create cepstrum
vDSP_fft_zrip(setupReal, &complexArray, stride, log2n-1, FFT_INVERSE);
//convert interleaved to real and straight into cepstrum buffer
vDSP_ztoc(&complexArray, 1, (COMPLEX*)&cepstrumBuffer[s*halfN], 2, halfN/2);
// copy first part of cepstrum into low cepstrum buffer
memcpy(&lowCepstrumBuffer[s*featureCount], &cepstrumBuffer[s*halfN], featureCount*sizeof(float));
// make 8000 point array based on the first 15 values
float *tempArray = malloc(8192*sizeof(float));
for (int i=0; i < 8192; i++) {
if (i < 15)
tempArray[i] = cepstrumBuffer[s*halfN+i];
else
tempArray[i] = 0.0f;
}
vDSP_ctoz((COMPLEX*)tempArray, 2, &complexArray, 1, 4096);
float newLog2n = log2f(8192.0f);
complexArray.imagp[0] = 0.0f;
vDSP_fft_zrip(setupReal, &complexArray, stride, newLog2n, FFT_FORWARD);
vDSP_zvmags(&complexArray, 1, complexArray.realp, 1, 4096);
bzero(complexArray.imagp, (4096) * sizeof(float));
// scale
scale = 1.0f / (2.0f*(float)8192);
vDSP_vsmul(complexArray.realp, 1, &scale, complexArray.realp, 1, 4096);
// get magnitude
for (int i=0; i < 4096; i++)
complexArray.realp[i] = sqrtf(complexArray.realp[i]);
// write to formant buffer
memcpy(&formantBuffer[s*4096], complexArray.realp, 4096*sizeof(float));
// complex array now contains formant spectrum
// it's large, so get features here!
// try simple peak picking algorithm for first 3 formants
int formantIndex = 0;
float *peaks = malloc(6*sizeof(float));
for (int i=0; i < 6; i++)
peaks[i] = 0.0f;
for (int i=1; i < 4096-1 && formantIndex < 6; i++) {
if (complexArray.realp[i-1] < complexArray.realp[i] &&
complexArray.realp[i+1] < complexArray.realp[i])
peaks[formantIndex++] = i;
}
