Why is my NAudio low-pass filter not working?

This is my sample provider implementation:
    using NAudio.Dsp;
    using NAudio.Wave;
    public class FilterSampleProvider : ISampleProvider
    {
        private ISampleProvider sourceProvider;
        private float cutOffFreq;
        private float bandWidth;
        private BiQuadFilter filter;
        public FilterSampleProvider(ISampleProvider sourceProvider, int cutOffFreq, int bandWidth)
        {
            this.sourceProvider = sourceProvider;
            this.cutOffFreq = cutOffFreq;
            this.bandWidth = bandWidth;
            // The third argument is the filter's q parameter.
            filter = BiQuadFilter.LowPassFilter(sourceProvider.WaveFormat.SampleRate, this.cutOffFreq, this.bandWidth);
        }
        public WaveFormat WaveFormat { get { return sourceProvider.WaveFormat; } }
        public int Read(float[] buffer, int offset, int count)
        {
            int samplesRead = sourceProvider.Read(buffer, offset, count);
            // Run every sample that was read through the same filter instance.
            for (int i = 0; i < samplesRead; i++)
                buffer[offset + i] = filter.Transform(buffer[offset + i]);
            return samplesRead;
        }
    }
I am generating a sine wave at 4000 Hz using the code below:
    // Note: the signal lasts 60 seconds despite the variable name.
    var sine20Seconds = new SignalGenerator()
    {
        Gain = 1,
        Frequency = 4000,
        Type = SignalGeneratorType.Sin
    }
    .Take(TimeSpan.FromSeconds(60));
Then I write it to a file and read the file back, because I want to keep the original file to compare with the output.
    WaveFileWriter.CreateWaveFile("filteroutput.wav", sine20Seconds.ToWaveProvider());
    var reader = new WaveFileReader(File.OpenRead("filteroutput.wav"));
Then I create my filter sample provider with a cutoff frequency of 500 Hz. The output I am expecting is a file without the sine wave hum. I believe q stands for quality factor, so I am passing 1.
    filterSampleProvider = new FilterSampleProvider(reader.ToSampleProvider(), 500, 1);
    filteredWaveProvider = filterSampleProvider.ToWaveProvider();
Then I create a new output file.
    WaveFileWriter.CreateWaveFile("filteroutput1.wav", filteredWaveProvider);
However, the output file, after going through the low-pass filter, still contains the sine wave at 4000 Hz.
Is there anything I am doing wrong?
After going through the NAudio code on GitHub, I am confused about the q value: is it a quality factor or a bandwidth? Why would a low-pass filter have a bandwidth?
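On the q question, a worked check may help. The sketch below is Python rather than C#, and it assumes that BiQuadFilter.LowPassFilter follows the widely used RBJ "Audio EQ Cookbook" low-pass design, in which the third parameter is a quality factor Q and not a bandwidth (an assumption worth checking against the NAudio source). All function names are hypothetical. It filters a 4000 Hz tone with a 500 Hz cutoff and measures the steady-state gain. A single second-order section rolls off at about 12 dB per octave, so under these assumptions the tone comes out a few tens of dB quieter, not removed.

```python
import math

def lowpass_coefficients(sample_rate, cutoff_hz, q):
    """RBJ cookbook low-pass biquad, normalised so that a0 == 1."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)   # q is a quality factor here
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = [(1.0 - cos_w0) / 2.0 / a0, (1.0 - cos_w0) / a0, (1.0 - cos_w0) / 2.0 / a0]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a

def filter_samples(samples, b, a):
    """Apply the biquad difference equation (direct form I)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def steady_state_gain_db(sample_rate, cutoff_hz, q, tone_hz, seconds=0.5):
    """Filter a unit-amplitude sine and return the output level in dB."""
    n = int(sample_rate * seconds)
    tone = [math.sin(2.0 * math.pi * tone_hz * i / sample_rate) for i in range(n)]
    b, a = lowpass_coefficients(sample_rate, cutoff_hz, q)
    filtered = filter_samples(tone, b, a)
    # Measure only the second half so the start-up transient has died away.
    peak = max(abs(v) for v in filtered[n // 2:])
    return 20.0 * math.log10(peak)

if __name__ == "__main__":
    print("gain at 4000 Hz: %.1f dB" % steady_state_gain_db(44100, 500, 1.0, 4000))
    print("gain at  100 Hz: %.1f dB" % steady_state_gain_db(44100, 500, 1.0, 100))
```

If your output file shows a similar reduction, the filter is doing what a single biquad can do. Cascading several sections or lowering the cutoff would be the usual ways to get more attenuation.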

Related

How to extract and process buffer data from a 4-channel PCM data buffer in STM32 code

I am trying to understand a 4-channel microphone-array example provided by ST (AMicArray_Microphones_Streaming).
In the code, the PCM buffer data is sent over USB by the function below. I want to do some processing on the received data before sending it over USB. My question is: how do I extract the raw data and process it?
This topic seems very broad to a beginner, so any starting material or guidelines would be appreciated.
    //PCMSamples = AUDIO_IN_SAMPLING_FREQUENCY/1000*AUDIO_IN_CHANNELS;
    uint8_t USBD_AUDIO_Data_Transfer(USBD_HandleTypeDef *pdev, int16_t * audioData, uint16_t PCMSamples)
    {
        USBD_AUDIO_HandleTypeDef *haudio;
        haudio = (USBD_AUDIO_HandleTypeDef *)pdev->pClassData;
        if(haudioInstance.state==STATE_USB_WAITING_FOR_INIT){
            return USBD_BUSY;
        }
        uint16_t dataAmount = PCMSamples * 2; /*Bytes*/
        uint16_t true_dim = haudio->buffer_length;
        uint16_t current_data_Amount = haudio->dataAmount;
        uint16_t packet_dim = haudio->paketDimension;
        if(haudio->state==STATE_USB_REQUESTS_STARTED || current_data_Amount!=dataAmount){
            /*USB parameters definition, based on the amount of data passed*/
            haudio->dataAmount=dataAmount;
            uint16_t wr_rd_offset = (AUDIO_IN_PACKET_NUM/2) * dataAmount / packet_dim;
            haudio->wr_ptr=wr_rd_offset * packet_dim;
            haudio->rd_ptr = 0;
            haudio->upper_treshold = wr_rd_offset + 1;
            haudio->lower_treshold = wr_rd_offset - 1;
            haudio->buffer_length = (packet_dim * (dataAmount / packet_dim) * AUDIO_IN_PACKET_NUM);
            /*Memory allocation for data buffer, depending (also) on data amount passed to the transfer function*/
            if(haudio->buffer != NULL)
            {
                USBD_free(haudio->buffer);
            }
            haudio->buffer = USBD_malloc(haudio->buffer_length + haudio->dataAmount);
            if(haudio->buffer == NULL)
            {
                return USBD_FAIL;
            }
            memset(haudio->buffer,0,(haudio->buffer_length + haudio->dataAmount));
            haudio->state=STATE_USB_BUFFER_WRITE_STARTED;
        }else if(haudio->state==STATE_USB_BUFFER_WRITE_STARTED){
            if(haudio->timeout++==TIMEOUT_VALUE){
                haudio->state=STATE_USB_IDLE;
                ((USBD_AUDIO_ItfTypeDef *)pdev->pUserData)->Stop();
                haudio->timeout=0;
            }
            memcpy((uint8_t * )&haudio->buffer[haudio->wr_ptr], (uint8_t *)(audioData), dataAmount);
            haudio->wr_ptr += dataAmount;
            haudio->wr_ptr = haudio->wr_ptr % (true_dim);
            if((haudio->wr_ptr-dataAmount) == 0){
                memcpy((uint8_t *)(((uint8_t *)haudio->buffer)+true_dim),(uint8_t *)haudio->buffer, dataAmount);
            }
        }
        return USBD_OK;
    }
I am guessing you are using the X-Cube Audio library from ST. If so, the AudioProcess() function in the Core/Src/audio_application.c file matches your needs.
    void AudioProcess(void)
    {
        if (CCA02M2_AUDIO_IN_PDMToPCM(CCA02M2_AUDIO_INSTANCE,(uint16_t * )PDM_Buffer,(uint16_t *)PCM_Buffer) != BSP_ERROR_NONE)
        {
            Error_Handler();
        }
        Send_Audio_to_USB((int16_t *)PCM_Buffer, (AUDIO_IN_SAMPLING_FREQUENCY/1000)*AUDIO_IN_CHANNELS * N_MS);
    }
Both the PDM and the PCM data are available in this function for processing, before Send_Audio_to_USB() is called.
ST provides a video tutorial on audio acquisition. You can refer to their channel for more information.
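As a starting point for the processing itself, here is a minimal sketch of the usual pattern: split the buffer into per-channel sample lists, process each channel, and merge the result back before sending it. It is written in Python only to show the indexing; on the STM32 the same loops would be plain C over int16_t arrays. It assumes that PCM_Buffer holds signed 16-bit samples interleaved frame by frame (ch0, ch1, ch2, ch3, ch0, ...), which is a common layout but should be checked against the BSP documentation. The function names are hypothetical.

```python
def deinterleave(pcm, channels):
    """Split an interleaved sample list into one list per channel."""
    return [pcm[ch::channels] for ch in range(channels)]

def interleave(per_channel):
    """Merge per-channel lists back into one interleaved list."""
    return [sample for frame in zip(*per_channel) for sample in frame]

def apply_gain(samples, gain):
    """Scale samples and saturate to the signed 16-bit range."""
    return [max(-32768, min(32767, int(round(s * gain)))) for s in samples]

if __name__ == "__main__":
    # Two frames of 4-channel audio (made-up values).
    pcm = [100, 200, 300, 400, 110, 210, 310, 410]
    ch = deinterleave(pcm, 4)
    ch[0] = apply_gain(ch[0], 2.0)      # example: boost microphone 0 only
    print(interleave(ch))
```

Any per-channel processing (filtering, gain, mixing) would go where apply_gain is called, as long as it finishes before the next block of samples arrives.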

Why does this EA (in early development) not produce an alert, or an error?

    #property strict
    string subfolder = "ipc\\";
    int last_read = 0;
    int t = 0;
    struct trade_message
    {
        int time; // time
        string asset; // asset
        string direction; // direction
        double open_price; // open
        double stop_price; // stop
        double close_price; // close
        float fraction; // fraction
        string comment; // comment
        string status; // status
    };
    trade_message messages[];
    int OnInit()
    {
        int FH = FileOpen(subfolder+"processedtime.log",FILE_BIN);
        if(FH >=0)
        {
            last_read = FileReadInteger(FH,4);
            FileClose(FH);
        }
        return(INIT_SUCCEEDED);
    }
    void OnTick()
    {
        int FH=FileOpen(subfolder+"data.csv",FILE_READ|FILE_CSV, ","); //open file
        int p=0;
        while(!FileIsEnding(FH))
        {
            t = StringToInteger(FileReadString(FH));
            if(t<=last_read)
            {
                break;
            }
            do
            {
                messages[p].time = t; // time
                messages[p].direction = FileReadString(FH); // direction
                messages[p].open_price = StringToDouble(FileReadString(FH)); // open
                messages[p].stop_price = StringToDouble(FileReadString(FH)); // stop
                messages[p].close_price = StringToDouble(FileReadString(FH)); // close
                messages[p].fraction = StringToDouble(FileReadString(FH)); // fraction (?float)
                messages[p].comment = FileReadString(FH); // comment
                messages[p].status = FileReadString(FH); // status
                Alert(messages[p].comment);
            }
            while(!FileIsLineEnding(FH));
            p++;
            Alert("P = ",p,"; Array length = ", ArraySize(messages));
        }
        FileClose(FH);
        last_read = t;
        FileDelete(subfolder+"processedtime.log");
        FH = FileOpen(subfolder+"processedtime.log",FILE_BIN);
        FileWriteInteger(FH,t,4);
        FileClose(FH);
        ArrayFree(messages);
    }
The code is in the OnTick function so that I can test it before moving it out into a function of its own.
The data.csv file is:
    Timestamp,Asset,Direction,Price,Stop,Profit,Fraction,Comment,Status
    xxx,yyy,SHORT,13240,13240,13220,0.5,yyy SHORT 13240 - taken half at 13220 and stop to breakeven,U
    xxx,yyy,SHORT,13240,13262,13040,1.0,55%,DP
The processedtime.log is not being created.
So, two problems, both of my own making:
1. Not skipping the header row. This is simply remedied by inserting
    do f = FileReadString(FH);
    while(!FileIsLineEnding(FH));
after FileOpen() (easier than altering the source of my data.csv).
2. Forgetting about the asset part of my structure, which ensured that my structures got out of sync with my lines! This is remedied by adding messages[p].asset = FileReadString(FH); between messages[p].time = t; and messages[p].direction = FileReadString(FH);
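The two fixes can be sketched outside MQL4 as follows. This is a Python illustration, not the EA code: it skips the header row once, then reads all nine columns in file order so that no field shifts into its neighbour. The timestamps 1000 and 1001 stand in for the "xxx" placeholders and are made up; read_messages and TradeMessage are hypothetical names.

```python
import csv
import io
from dataclasses import dataclass

@dataclass
class TradeMessage:
    time: int
    asset: str
    direction: str
    open_price: float
    stop_price: float
    close_price: float
    fraction: float
    comment: str
    status: str

def read_messages(text, last_read):
    """Parse CSV text, skipping the header and rows already processed."""
    reader = csv.reader(io.StringIO(text))
    next(reader, None)                      # fix 1: skip the header row
    messages = []
    for row in reader:
        if len(row) != 9:
            continue                        # ignore blank or malformed lines
        t = int(row[0])
        if t <= last_read:
            continue                        # already handled (the EA breaks here)
        # fix 2: consume every column, including asset, in file order
        messages.append(TradeMessage(
            t, row[1], row[2], float(row[3]), float(row[4]),
            float(row[5]), float(row[6]), row[7], row[8]))
    return messages

SAMPLE = """Timestamp,Asset,Direction,Price,Stop,Profit,Fraction,Comment,Status
1000,yyy,SHORT,13240,13240,13220,0.5,yyy SHORT 13240 - taken half at 13220 and stop to breakeven,U
1001,yyy,SHORT,13240,13262,13040,1.0,55%,DP
"""

if __name__ == "__main__":
    for message in read_messages(SAMPLE, last_read=0):
        print(message.time, message.asset, message.direction, message.status)
```

Reading each row into a fixed list of nine fields makes a missing column show up as a length mismatch instead of a silent shift, which is the failure described above.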

Running a DX11 compute shader with SharpDX - cannot get results

I am trying to run a compute shader and get the resulting texture using SharpDX.
From what I understood, I need to:
1. Create a texture to set as an output to the shader.
2. Set the above texture as an unordered access view so I can write to it.
3. Run the shader
4. Copy the UAV texture to a staging texture so it can be accessed by the CPU
5. Read the staging texture to a Bitmap
The problem is that whatever I do, the result is a black bitmap. I don't think the bug is in the Texture2D -> Bitmap conversion code as printing the first pixel directly from the staging texture also gives me 0.
This is my shader code:
    RWTexture2D<float4> Output : register(u0);
    [numthreads(32, 32, 1)]
    void main(uint3 id : SV_DispatchThreadID) {
        Output[id.xy] = float4(0, 1.0, 0, 1.0);
    }
Using the MS DX11 docs and blogs, I pieced together this code to run the shader:
    public class GPUScreenColor {
        private int adapterIndex = 0;
        private Adapter1 gpu;
        private Device device;
        private ComputeShader computeShader;
        private Texture2D texture;
        private Texture2D stagingTexture;
        private UnorderedAccessView view;
        public GPUScreenColor() {
            initializeDirectX();
        }
        private void initializeDirectX() {
            using (var factory = new Factory1()) {
                gpu = factory.GetAdapter1(adapterIndex);
            }
            device = new Device(gpu, DeviceCreationFlags.Debug, FeatureLevel.Level_11_1);
            var compilationResult = ShaderBytecode.CompileFromFile("test.hlsl", "main", "cs_5_0", ShaderFlags.Debug);
            computeShader = new ComputeShader(device, compilationResult.Bytecode);
            texture = new Texture2D(device, new Texture2DDescription() {
                BindFlags = BindFlags.UnorderedAccess | BindFlags.ShaderResource,
                Format = Format.R8G8B8A8_UNorm,
                Width = 1024,
                Height = 1024,
                OptionFlags = ResourceOptionFlags.None,
                MipLevels = 1,
                ArraySize = 1,
                SampleDescription = { Count = 1, Quality = 0 }
            });
            UnorderedAccessView view = new UnorderedAccessView(device, texture, new UnorderedAccessViewDescription() {
                Format = Format.R8G8B8A8_UNorm,
                Dimension = UnorderedAccessViewDimension.Texture2D,
                Texture2D = { MipSlice = 0 }
            });
            stagingTexture = new Texture2D(device, new Texture2DDescription {
                CpuAccessFlags = CpuAccessFlags.Read,
                BindFlags = BindFlags.None,
                Format = Format.R8G8B8A8_UNorm,
                Width = 1024,
                Height = 1024,
                OptionFlags = ResourceOptionFlags.None,
                MipLevels = 1,
                ArraySize = 1,
                SampleDescription = { Count = 1, Quality = 0 },
                Usage = ResourceUsage.Staging
            });
        }
        public Bitmap getBitmap() {
            device.ImmediateContext.ComputeShader.Set(computeShader);
            device.ImmediateContext.ComputeShader.SetUnorderedAccessView(0, view);
            device.ImmediateContext.Dispatch(32, 32, 1);
            device.ImmediateContext.CopyResource(texture, stagingTexture);
            var mapSource = device.ImmediateContext.MapSubresource(stagingTexture, 0, MapMode.Read, MapFlags.None);
            Console.WriteLine(Marshal.ReadInt32(IntPtr.Add(mapSource.DataPointer, 0)));
            try {
                // Copy pixels from screen capture Texture to GDI bitmap
                Bitmap bitmap = new Bitmap(1024, 1024, System.Drawing.Imaging.PixelFormat.Format32bppRgb);
                BitmapData mapDest = bitmap.LockBits(new Rectangle(0, 0, 1024, 1024), ImageLockMode.ReadWrite, bitmap.PixelFormat);
                try {
                    var sourcePtr = mapSource.DataPointer;
                    var destPtr = mapDest.Scan0;
                    for (int y = 0; y < 1024; y++) {
                        // Copy a single line
                        Utilities.CopyMemory(destPtr, sourcePtr, 1024 * 4);
                        // Advance pointers
                        sourcePtr = IntPtr.Add(sourcePtr, mapSource.RowPitch);
                        destPtr = IntPtr.Add(destPtr, mapDest.Stride);
                    }
                    return bitmap;
                } finally {
                    bitmap.UnlockBits(mapDest);
                }
            } finally {
                device.ImmediateContext.UnmapSubresource(stagingTexture, 0);
            }
        }
    }
I am pretty new to shaders so it may be something obvious...
First, you create your UAV as a local variable:
    UnorderedAccessView view = new UnorderedAccessView(....
so the field stays null. Replacing it with
    view = new UnorderedAccessView(....
will solve the first issue.
Second, it is quite likely that the runtime will complain about types. The debug layer will give you something like:
    The resource return type for component 0 declared in the shader code (FLOAT) is not compatible with the resource type bound to Unordered Access View slot 0 of the Compute Shader unit (UNORM).
Some cards might do something (fix it silently), some might do nothing, and some might crash :)
The problem is that RWTexture2D<float4> does not match the UNORM format (you are specifying a floating-point format here).
You need to force your RWTexture2D to be specifically of unorm format (yes, the runtime can be that picky):
    RWTexture2D<unorm float4> Output : register(u0);
Then your whole setup should work. (PS: I did not check the bitmap code, but I double-checked that the shader runs without error and that the first pixel matches.)
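To check the first pixel yourself, it helps to know what value to expect from the Marshal.ReadInt32 call in the question. The sketch below (Python, hypothetical function names) applies the standard float-to-UNORM conversion to float4(0, 1.0, 0, 1.0), lays the bytes out in R, G, B, A order as R8G8B8A8_UNorm does, and reads them back as a little-endian signed 32-bit integer. It also shows the group-count arithmetic behind Dispatch(32, 32, 1). It assumes a little-endian host, which is the usual case on Windows PCs.

```python
import struct

def float_to_unorm8(c):
    """Clamp to [0, 1], scale to 0..255 and round to nearest."""
    c = min(1.0, max(0.0, c))
    return int(c * 255.0 + 0.5)

def pack_rgba8_unorm(r, g, b, a):
    """Bytes of one R8G8B8A8_UNorm texel, in memory order."""
    return bytes(float_to_unorm8(c) for c in (r, g, b, a))

def texel_as_int32(r, g, b, a):
    """What a little-endian signed 32-bit read of that texel returns."""
    return struct.unpack("<i", pack_rgba8_unorm(r, g, b, a))[0]

def groups_needed(size, threads_per_group):
    """Thread groups required to cover `size` texels along one axis."""
    return (size + threads_per_group - 1) // threads_per_group

if __name__ == "__main__":
    print(pack_rgba8_unorm(0.0, 1.0, 0.0, 1.0).hex())   # bytes 00 ff 00 ff
    print(texel_as_int32(0.0, 1.0, 0.0, 1.0))
    print(groups_needed(1024, 32))                      # groups per axis
```

A printed 0 therefore means the shader never wrote to the texture, while a non-zero value matching the computed one means the dispatch and copy are working and any remaining problem is in the bitmap conversion.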

AudioConverter#FillComplexBuffer returns -50 and does not convert anything

I'm closely following this Xamarin sample (based on this Apple sample) to convert a LinearPCM file to an AAC file.
The sample works great, but implemented in my project, the FillComplexBuffer method returns error -50 and the InputData event is not triggered once, thus nothing is converted.
The error only appears when testing on a device. When testing on the emulator, everything goes great and I get a good encoded AAC file at the end.
I tried a lot of things today, and I don't see any difference between my code and the sample code. Do you have any idea where this may come from?
I don't know if this is in any way related to Xamarin; it doesn't seem so, since the Xamarin sample works great.
Here's the relevant part of my code:
    protected void Encode(string path)
    {
        // In class setup. File at TempWavFilePath has DecodedFormat as format.
        //
        // DecodedFormat = AudioStreamBasicDescription.CreateLinearPCM();
        // AudioStreamBasicDescription encodedFormat = new AudioStreamBasicDescription()
        // {
        //     Format = AudioFormatType.MPEG4AAC,
        //     SampleRate = DecodedFormat.SampleRate,
        //     ChannelsPerFrame = DecodedFormat.ChannelsPerFrame,
        // };
        // AudioStreamBasicDescription.GetFormatInfo (ref encodedFormat);
        // EncodedFormat = encodedFormat;
        // Setup converter
        AudioStreamBasicDescription inputFormat = DecodedFormat;
        AudioStreamBasicDescription outputFormat = EncodedFormat;
        AudioConverterError converterCreateError;
        AudioConverter converter = AudioConverter.Create(inputFormat, outputFormat, out converterCreateError);
        if (converterCreateError != AudioConverterError.None)
        {
            Console.WriteLine("Converter creation error: " + converterCreateError);
        }
        converter.EncodeBitRate = 192000; // AAC 192kbps
        // get the actual formats back from the Audio Converter
        inputFormat = converter.CurrentInputStreamDescription;
        outputFormat = converter.CurrentOutputStreamDescription;
        /*** INPUT ***/
        AudioFile inputFile = AudioFile.OpenRead(NSUrl.FromFilename(TempWavFilePath));
        // init buffer
        const int inputBufferBytesSize = 32768;
        IntPtr inputBufferPtr = Marshal.AllocHGlobal(inputBufferBytesSize);
        // calc number of packets per read
        int inputSizePerPacket = inputFormat.BytesPerPacket;
        int inputBufferPacketSize = inputBufferBytesSize / inputSizePerPacket;
        AudioStreamPacketDescription[] inputPacketDescriptions = null;
        // init position
        long inputFilePosition = 0;
        // define input delegate
        converter.InputData += delegate(ref int numberDataPackets, AudioBuffers data, ref AudioStreamPacketDescription[] dataPacketDescription)
        {
            // how much to read
            if (numberDataPackets > inputBufferPacketSize)
            {
                numberDataPackets = inputBufferPacketSize;
            }
            // read from the file
            int outNumBytes;
            AudioFileError readError = inputFile.ReadPackets(false, out outNumBytes, inputPacketDescriptions, inputFilePosition, ref numberDataPackets, inputBufferPtr);
            if (readError != 0)
            {
                Console.WriteLine("Read error: " + readError);
            }
            // advance input file packet position
            inputFilePosition += numberDataPackets;
            // put the data pointer into the buffer list
            data.SetData(0, inputBufferPtr, outNumBytes);
            // add packet descriptions if required
            if (dataPacketDescription != null)
            {
                if (inputPacketDescriptions != null)
                {
                    dataPacketDescription = inputPacketDescriptions;
                }
                else
                {
                    dataPacketDescription = null;
                }
            }
            return AudioConverterError.None;
        };
        /*** OUTPUT ***/
        // create the destination file
        var outputFile = AudioFile.Create (NSUrl.FromFilename(path), AudioFileType.M4A, outputFormat, AudioFileFlags.EraseFlags);
        // init buffer
        const int outputBufferBytesSize = 32768;
        IntPtr outputBufferPtr = Marshal.AllocHGlobal(outputBufferBytesSize);
        AudioBuffers buffers = new AudioBuffers(1);
        // calc number of packet per write
        int outputSizePerPacket = outputFormat.BytesPerPacket;
        AudioStreamPacketDescription[] outputPacketDescriptions = null;
        if (outputSizePerPacket == 0) {
            // if the destination format is VBR, we need to get max size per packet from the converter
            outputSizePerPacket = (int)converter.MaximumOutputPacketSize;
            // allocate memory for the PacketDescription structures describing the layout of each packet
            outputPacketDescriptions = new AudioStreamPacketDescription [outputBufferBytesSize / outputSizePerPacket];
        }
        int outputBufferPacketSize = outputBufferBytesSize / outputSizePerPacket;
        // init position
        long outputFilePosition = 0;
        long totalOutputFrames = 0; // used for debugging
        // write magic cookie if necessary
        if (converter.CompressionMagicCookie != null && converter.CompressionMagicCookie.Length != 0)
        {
            outputFile.MagicCookie = converter.CompressionMagicCookie;
        }
        // loop to convert data
        Console.WriteLine ("Converting...");
        while (true)
        {
            // create buffer
            buffers[0] = new AudioBuffer()
            {
                NumberChannels = outputFormat.ChannelsPerFrame,
                DataByteSize = outputBufferBytesSize,
                Data = outputBufferPtr
            };
            int writtenPackets = outputBufferPacketSize;
            // LET'S CONVERT (it's about time...)
            AudioConverterError converterFillError = converter.FillComplexBuffer(ref writtenPackets, buffers, outputPacketDescriptions);
            if (converterFillError != AudioConverterError.None)
            {
                Console.WriteLine("FillComplexBuffer error: " + converterFillError);
            }
            if (writtenPackets == 0) // EOF
            {
                break;
            }
            // write to output file
            int inNumBytes = buffers[0].DataByteSize;
            AudioFileError writeError = outputFile.WritePackets(false, inNumBytes, outputPacketDescriptions, outputFilePosition, ref writtenPackets, outputBufferPtr);
            if (writeError != 0)
            {
                Console.WriteLine("WritePackets error: {0}", writeError);
            }
            // advance output file packet position
            outputFilePosition += writtenPackets;
            if (FlowFormat.FramesPerPacket != 0) {
                // the format has constant frames per packet
                totalOutputFrames += (writtenPackets * FlowFormat.FramesPerPacket);
            } else {
                // variable frames per packet require doing this for each packet (adding up the number of sample frames of data in each packet)
                for (var i = 0; i < writtenPackets; ++i)
                {
                    totalOutputFrames += outputPacketDescriptions[i].VariableFramesInPacket;
                }
            }
        }
        // write out any of the leading and trailing frames for compressed formats only
        if (outputFormat.BitsPerChannel == 0)
        {
            Console.WriteLine("Total number of output frames counted: {0}", totalOutputFrames);
            WritePacketTableInfo(converter, outputFile);
        }
        // write the cookie again - sometimes codecs will update cookies at the end of a conversion
        if (converter.CompressionMagicCookie != null && converter.CompressionMagicCookie.Length != 0)
        {
            outputFile.MagicCookie = converter.CompressionMagicCookie;
        }
        // Clean everything
        Marshal.FreeHGlobal(inputBufferPtr);
        Marshal.FreeHGlobal(outputBufferPtr);
        converter.Dispose();
        outputFile.Dispose();
        // Remove temp file
        File.Delete(TempWavFilePath);
    }
I have already seen this SO question, but its answer, which is about C++/Obj-C and not very detailed, doesn't seem to fit my problem.
Thanks!
I finally found the solution!
I just had to set the AVAudioSession category before converting the file.
    AVAudioSession.SharedInstance().SetCategory(AVAudioSessionCategory.AudioProcessing);
    AVAudioSession.SharedInstance().SetActive(true);
Since I also use an AudioQueue to render offline (RenderOffline), I in fact have to set the category to AVAudioSessionCategory.PlayAndRecord so that both the offline rendering and the audio conversion work.

Improve Face Recognition

I am trying to develop a face-recognition app on Android. I am using the JavaCV FaceRecognizer, but so far I am getting very poor results. It recognizes images of people it was trained on, but it also "recognizes" unknown faces. For known faces it gives me a large distance value, most of the time 70-90 and sometimes 90+, while unknown faces also get 70-90.
So how can I improve the performance of face recognition? What techniques are there? What success rate can you normally expect?
I have never worked with image processing, so I would appreciate any guidelines.
Here is the code:
    public class PersonRecognizer {
        public final static int MAXIMG = 100;
        FaceRecognizer faceRecognizer;
        String mPath;
        int count=0;
        labels labelsFile;
        static final int WIDTH= 70;
        static final int HEIGHT= 70;
        private static final String TAG = "PersonRecognizer";
        private int mProb=999;
        PersonRecognizer(String path)
        {
            faceRecognizer = com.googlecode.javacv.cpp.opencv_contrib.createLBPHFaceRecognizer(2,8,8,8,100);
            // path=Environment.getExternalStorageDirectory()+"/facerecog/faces/";
            mPath=path;
            labelsFile= new labels(mPath);
        }
        void changeRecognizer(int nRec)
        {
            switch(nRec) {
                case 0: faceRecognizer = com.googlecode.javacv.cpp.opencv_contrib.createLBPHFaceRecognizer(1,8,8,8,100);
                    break;
                case 1: faceRecognizer = com.googlecode.javacv.cpp.opencv_contrib.createFisherFaceRecognizer();
                    break;
                case 2: faceRecognizer = com.googlecode.javacv.cpp.opencv_contrib.createEigenFaceRecognizer();
                    break;
            }
            train();
        }
        void add(Mat m, String description)
        {
            Bitmap bmp= Bitmap.createBitmap(m.width(), m.height(), Bitmap.Config.ARGB_8888);
            Utils.matToBitmap(m,bmp);
            bmp= Bitmap.createScaledBitmap(bmp, WIDTH, HEIGHT, false);
            FileOutputStream f;
            try
            {
                f = new FileOutputStream(mPath+description+"-"+count+".jpg",true);
                count++;
                bmp.compress(Bitmap.CompressFormat.JPEG, 100, f);
                f.close();
            } catch (Exception e) {
                Log.e("error",e.getCause()+" "+e.getMessage());
                e.printStackTrace();
            }
        }
        public boolean train() {
            File root = new File(mPath);
            FilenameFilter pngFilter = new FilenameFilter() {
                public boolean accept(File dir, String name) {
                    return name.toLowerCase().endsWith(".jpg");
                };
            };
            File[] imageFiles = root.listFiles(pngFilter);
            MatVector images = new MatVector(imageFiles.length);
            int[] labels = new int[imageFiles.length];
            int counter = 0;
            int label;
            IplImage img=null;
            IplImage grayImg;
            int i1=mPath.length();
            for (File image : imageFiles) {
                String p = image.getAbsolutePath();
                img = cvLoadImage(p);
                if (img==null)
                    Log.e("Error","Error cVLoadImage");
                Log.i("image",p);
                int i2=p.lastIndexOf("-");
                int i3=p.lastIndexOf(".");
                int icount = 0;
                try
                {
                    icount=Integer.parseInt(p.substring(i2+1,i3));
                }
                catch(Exception ex)
                {
                    ex.printStackTrace();
                }
                if (count<icount) count++;
                String description=p.substring(i1,i2);
                if (labelsFile.get(description)<0)
                    labelsFile.add(description, labelsFile.max()+1);
                label = labelsFile.get(description);
                grayImg = IplImage.create(img.width(), img.height(), IPL_DEPTH_8U, 1);
                cvCvtColor(img, grayImg, CV_BGR2GRAY);
                images.put(counter, grayImg);
                labels[counter] = label;
                counter++;
            }
            if (counter>0)
                if (labelsFile.max()>1)
                    faceRecognizer.train(images, labels);
            labelsFile.Save();
            return true;
        }
        public boolean canPredict()
        {
            if (labelsFile.max()>1)
                return true;
            else
                return false;
        }
        public String predict(Mat m) {
            if (!canPredict())
                return "";
            int n[] = new int[1];
            double p[] = new double[1];
            //conver Mat to black and white
            /*Mat gray_m = new Mat();
            Imgproc.cvtColor(m, gray_m, Imgproc.COLOR_RGBA2GRAY);*/
            IplImage ipl = MatToIplImage(m, WIDTH, HEIGHT);
            faceRecognizer.predict(ipl, n, p);
            if (n[0]!=-1)
            {
                mProb=(int)p[0];
                Log.v(TAG, "Distance = "+mProb+"");
                Log.v(TAG, "N = "+n[0]);
            }
            else
            {
                mProb=-1;
                Log.v(TAG, "Distance = "+mProb);
            }
            if (n[0] != -1)
            {
                return labelsFile.get(n[0]);
            }
            else
            {
                return "Unknown";
            }
        }
        IplImage MatToIplImage(Mat m,int width,int heigth)
        {
            Bitmap bmp;
            try
            {
                bmp = Bitmap.createBitmap(m.width(), m.height(), Bitmap.Config.RGB_565);
            }
            catch(OutOfMemoryError er)
            {
                bmp = Bitmap.createBitmap(m.width()/2, m.height()/2, Bitmap.Config.RGB_565);
                er.printStackTrace();
            }
            Utils.matToBitmap(m, bmp);
            return BitmapToIplImage(bmp, width, heigth);
        }
        IplImage BitmapToIplImage(Bitmap bmp, int width, int height) {
            if ((width != -1) || (height != -1)) {
                Bitmap bmp2 = Bitmap.createScaledBitmap(bmp, width, height, false);
                bmp = bmp2;
            }
            IplImage image = IplImage.create(bmp.getWidth(), bmp.getHeight(),
                    IPL_DEPTH_8U, 4);
            bmp.copyPixelsToBuffer(image.getByteBuffer());
            IplImage grayImg = IplImage.create(image.width(), image.height(),
                    IPL_DEPTH_8U, 1);
            cvCvtColor(image, grayImg, opencv_imgproc.CV_BGR2GRAY);
            return grayImg;
        }
        protected void SaveBmp(Bitmap bmp,String path)
        {
            FileOutputStream file;
            try
            {
                file = new FileOutputStream(path , true);
                bmp.compress(Bitmap.CompressFormat.JPEG, 100, file);
                file.close();
            }
            catch (Exception e) {
                // TODO Auto-generated catch block
                Log.e("",e.getMessage()+e.getCause());
                e.printStackTrace();
            }
        }
        public void load() {
            train();
        }
        public int getProb() {
            // TODO Auto-generated method stub
            return mProb;
        }
    }
I have faced similar challenges recently. Here are the things that helped me get better results:
Crop the faces from the images - this removes unnecessary pixels at inference time.
Resize the cropped face images - this affects face-landmark detection, so try different scales on test sets to understand what works best. It also affects inference time: the smaller the image, the faster the inference.
Improve the brightness of the face images - I found this really helpful. Face-landmark detection was poor on darker images, mainly because the model was pre-trained on mostly white faces. Understanding the training data helps when dealing with bias.
Convert to grayscale images - I have seen this recommended in many forums as a way to find edges more efficiently, and processing time is lower than for colour images (3 channels, RGB). However, it did not help much in my case.
Try to capture (register) as many images as possible of each person, at different angles, in different lighting and with other variations - this really helps, because new faces are compared with the encodings of the stored images.
Try to implement 1-to-1 comparison for face verification - for example, in my system I capture 10 pictures of each person, and at verification time I compare against those 10 pictures instead of against the encodings of every person stored in the system. This reduces false positives, although the use cases for this setup are limited. I use it for face authentication, comparing the new face against the existing faces registered with the same mobile number.
My understanding as of today is that face recognition systems work well but are not 100% accurate. We have to understand the model architecture, the training data and our own requirements, and deploy accordingly to get a better outcome. Here are some points that helped me improve the overall system:
Implement a fallback method - give the user an alternative when the system fails to recognize them. For example, if face authentication fails for some reason, offer a PIN entry option.
In critical systems, add periodic human intervention to confirm the system's result - for example, if the system rejects a user based on the FR result, have a human agent verify the failed result and allow the user in.
Implement multiple factors for authentication - deploy face recognition as an addition to the existing system. For example, after the user has logged in with credentials, use face recognition to verify that they are the intended person.
Design your user interface so that it guides the user's behaviour at verification time (open eyes, closed mouth, etc.) without hurting the user experience.
Provide clear instructions to users when they interact with the system - for example, let them know that an FR system is integrated and that they need to show their face in good lighting conditions.
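The 1-to-1 comparison and the PIN fallback described above can be sketched as follows. This is a Python illustration with hypothetical function names and made-up distances, not JavaCV code. It assumes that a lower distance means a better match, as with the recognizer in the question, and that you have collected distances for known ("genuine") and unknown ("impostor") test faces so that a rejection threshold can be chosen from data.

```python
def choose_threshold(genuine, impostor):
    """Pick the distance cut-off with the fewest errors on test data."""
    candidates = sorted(set(genuine) | set(impostor))

    def errors(threshold):
        false_rejects = sum(1 for d in genuine if d > threshold)
        false_accepts = sum(1 for d in impostor if d <= threshold)
        return false_rejects + false_accepts

    return min(candidates, key=errors)

def verify_face(distances_to_claimed, threshold):
    """1-to-1 check: compare only with the claimed person's images."""
    return bool(distances_to_claimed) and min(distances_to_claimed) <= threshold

def authenticate(distances_to_claimed, threshold, pin_ok):
    """Accept by face, fall back to PIN, otherwise reject."""
    if verify_face(distances_to_claimed, threshold):
        return "face"
    if pin_ok:
        return "pin"
    return "rejected"

if __name__ == "__main__":
    # Made-up validation distances, for illustration only.
    threshold = choose_threshold([40, 45, 50, 55], [70, 80, 90])
    print(threshold)
    print(authenticate([62, 48, 71], threshold, pin_ok=False))
    print(authenticate([80, 90], threshold, pin_ok=True))
```

If the genuine and impostor distances overlap completely, as with the 70-90 range reported in the question, no threshold will separate them, and the input-quality steps (cropping, lighting, more training images) have to come first.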

Resources