Image Processing with Kinect and AForge

I'm working on a project in which I want to track a die with the Microsoft Kinect using the AForge.NET library.
So far the project contains only the fundamentals: initializing the Kinect, obtaining a color frame, and applying one color filter. The problem already occurs at this stage.
Here is the main part of the program:
void ColorFrameReady(object sender, ColorImageFrameReadyEventArgs e)
{
    using (ColorImageFrame colorFrame = e.OpenColorImageFrame())
    {
        if (colorFrame != null)
        {
            colorFrameManager.Update(colorFrame);
            BitmapSource thresholdedImage =
                diceDetector.GetThresholdedImage(colorFrameManager.Bitmap);
            if (thresholdedImage != null)
            {
                Display.Source = thresholdedImage;
            }
        }
    }
}
The 'Update'-method of the 'colorFrameManager'-object looks like this:
public void Update(ColorImageFrame colorFrame)
{
    byte[] colorData = new byte[colorFrame.PixelDataLength];
    colorFrame.CopyPixelDataTo(colorData);
    if (Bitmap == null)
    {
        Bitmap = new WriteableBitmap(colorFrame.Width, colorFrame.Height,
            96, 96, PixelFormats.Bgr32, null);
    }
    int stride = Bitmap.PixelWidth * Bitmap.Format.BitsPerPixel / 8;
    imageRect.X = 0;
    imageRect.Y = 0;
    imageRect.Width = colorFrame.Width;
    imageRect.Height = colorFrame.Height;
    Bitmap.WritePixels(imageRect, colorData, stride, 0);
}
And the 'GetThresholdedImage' method looks like this:
public BitmapSource GetThresholdedImage(WriteableBitmap colorImage)
{
    BitmapSource thresholdedImage = null;
    if (colorImage != null)
    {
        try
        {
            Bitmap bitmap = BitmapConverter.ToBitmap(colorImage);
            HSLFiltering filter = new HSLFiltering();
            filter.Hue = new IntRange(335, 0);
            filter.Saturation = new Range(0.6f, 1.0f);
            filter.Luminance = new Range(0.1f, 1.0f);
            filter.ApplyInPlace(bitmap);
            thresholdedImage = BitmapConverter.ToBitmapSource(bitmap);
        }
        catch (Exception ex)
        {
            System.Console.WriteLine(ex.Message);
        }
    }
    return thresholdedImage;
}
Now the program slows down a lot or stops responding when this line is executed:
filter.ApplyInPlace(bitmap);
I have already read this thread (C# image processing on Kinect video using AForge) and tried EMGU, but I couldn't get it to work because of inner exceptions. The thread starter hasn't been online for four months, so my request to see his working code went unanswered.
First, I'm interested in how this line can be the reason for the slow execution:
filter.ApplyInPlace(bitmap);
Is this image processing really that complex? Or could this be a problem with my environment?
Second, is skipping frames a good solution? Or is it better to use polling and open a frame only every 500 milliseconds, for instance?
Thank you very much!

The HSL filter should not slow down the computation; it is not a complex filter.
I'm using it on 320x240 images at 30 fps without problems.
The problem may be the resolution of the processed image or a frame rate that is too high.
If the resolution of the image is high, I suggest resizing it before applying any filter.
And I think a frame rate of 20 fps (or maybe less) is enough to track a die.
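The two suggestions above (process fewer frames, and shrink each frame before filtering) can be sketched as follows. This is a minimal Python illustration of the idea rather than Kinect or AForge code: `FrameThrottle` and `downscale_nearest` are hypothetical names, and a frame is modelled as a plain list of pixel rows.

```python
import time


class FrameThrottle:
    """Lets a frame through only if enough time has passed since the last one."""

    def __init__(self, max_fps, clock=time.monotonic):
        self.min_interval = 1.0 / max_fps
        self.clock = clock            # injectable so the class can be tested
        self.last_accepted = None

    def should_process(self):
        now = self.clock()
        if self.last_accepted is None or now - self.last_accepted >= self.min_interval:
            self.last_accepted = now
            return True
        return False                  # caller drops this frame and returns at once


def downscale_nearest(rows, factor):
    """Keeps every `factor`-th pixel in both directions (nearest neighbour)."""
    return [row[::factor] for row in rows[::factor]]
```

In a frame-ready handler the throttle check would come first, so that dropped frames cost almost nothing. Downscaling a 640x480 frame by a factor of 2 gives 320x240, which is a quarter of the pixels for the filter to visit.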


Part of GPU memory is not released after asynchronous resize

I'm facing a problem with GPU resize using OpenCV.
Here is my code:
#define MX 500
#define ASYNC 0
class job {
public:
cv::cuda::GpuMat gpuImage;
cv::cuda::Stream stream;
cv::Mat cpuImage;
~job() {
printf("job deleted\n");
}
};
void onComplete(int status, void* uData) {
job* _job = (job*) uData;
delete _job;
}
void resize(job* _job, vector<uchar> buffer) {
_job->cpuImage = cv::imdecode(buffer, cv::IMREAD_COLOR);
if (ASYNC) {
_job->gpuImage.upload(_job->cpuImage, _job->stream);
cv::cuda::resize(_job->gpuImage, _job->gpuImage, cv::Size(100, 100), 0, 0, cv::INTER_NEAREST, _job->stream);
_job->gpuImage.download(_job->cpuImage, _job->stream);
_job->stream.enqueueHostCallback(onComplete, _job);
// _job->stream.waitForCompletion();
} else {
_job->gpuImage.upload(_job->cpuImage);
cv::cuda::resize(_job->gpuImage, _job->gpuImage, cv::Size(100, 100), 0, 0, cv::INTER_NEAREST);
_job->gpuImage.download(_job->cpuImage);
delete _job;
}
}
vector<uchar> readFile(string filename) {
std::ifstream input(filename, std::ios::binary);
std::vector<unsigned char> buffer(std::istreambuf_iterator<char>(input),{});
return buffer;
}
int main() {
for (int i = 0; i < MX; i++) {
vector<uchar> buf = readFile("input.jpg");
job* _job = new job();
resize(_job, buf);
printFreeGPUMemory();
}
while (true) {
// wait
}
return 0;
}
When I run the resize synchronously (ASYNC = 0), the code works perfectly fine. But when I run it asynchronously (ASYNC = 1), some GPU memory seems to be lost somewhere even though I have deleted all created GpuMats and Streams. The more iterations I run, the less free memory I have. Is there a bug, or is part of my code wrong?
Problem solved.
Here is the note about callbacks from the OpenCV docs:
Callbacks must not make any CUDA API calls. Callbacks must not perform
any synchronization that may depend on outstanding device work or
other callbacks that are not mandated to run earlier. Callbacks
without a mandated order (in independent streams) execute in undefined
order and may be serialized.
I had read the note but hadn't realized that even deleting a cv::cuda::* object in the callback causes problems. So the solution is to avoid touching any cv::cuda::* object in the callback, including deleting or releasing it.
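One way to follow that rule is to let the callback do nothing except hand the finished job to another thread, which then releases it. The sketch below shows only the hand-off pattern, in Python with the standard library. `Job`, `on_complete`, `cleanup_worker` and `run_demo` are hypothetical stand-ins, and no OpenCV or CUDA call is involved.

```python
import queue
import threading


class Job:
    def __init__(self, job_id, log):
        self.job_id = job_id
        self.log = log

    def release(self):
        # Stand-in for freeing GPU buffers; only the cleanup thread calls this.
        self.log.append(self.job_id)


def on_complete(job, finished_jobs):
    # Stand-in for the stream callback: nothing is released here, only handed off.
    finished_jobs.put(job)


def cleanup_worker(finished_jobs):
    while True:
        job = finished_jobs.get()
        if job is None:               # sentinel: no more jobs will arrive
            break
        job.release()


def run_demo(count):
    finished_jobs = queue.Queue()
    log = []
    worker = threading.Thread(target=cleanup_worker, args=(finished_jobs,))
    worker.start()
    for i in range(count):
        on_complete(Job(i, log), finished_jobs)
    finished_jobs.put(None)
    worker.join()
    return log                        # ids in the order they were released
```

In the C++ program the same shape would mean that the callback only records the job pointer, and the deletion happens later on a thread that is allowed to make CUDA calls.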

Running a DX11 compute shader with SharpDX - cannot get results

I am trying to run a compute shader and get the resulting texture using SharpDX.
From what I understood, I need to:
1. Create a texture to set as an output to the shader.
2. Set the above texture as an unordered access view so I can write to it.
3. Run the shader
4. Copy the UAV texture to a staging texture so it can be accessed by the CPU
5. Read the staging texture to a Bitmap
The problem is that whatever I do, the result is a black bitmap. I don't think the bug is in the Texture2D -> Bitmap conversion code as printing the first pixel directly from the staging texture also gives me 0.
This is my shader code:
RWTexture2D<float4> Output : register(u0);
[numthreads(32, 32, 1)]
void main(uint3 id : SV_DispatchThreadID) {
Output[id.xy] = float4(0, 1.0, 0, 1.0);
}
Using the MS DX11 docs and blogs, I pieced together this code to run the shader:
public class GPUScreenColor {
private int adapterIndex = 0;
private Adapter1 gpu;
private Device device;
private ComputeShader computeShader;
private Texture2D texture;
private Texture2D stagingTexture;
private UnorderedAccessView view;
public GPUScreenColor() {
initializeDirectX();
}
private void initializeDirectX() {
using (var factory = new Factory1()) {
gpu = factory.GetAdapter1(adapterIndex);
}
device = new Device(gpu, DeviceCreationFlags.Debug, FeatureLevel.Level_11_1);
var compilationResult = ShaderBytecode.CompileFromFile("test.hlsl", "main", "cs_5_0", ShaderFlags.Debug);
computeShader = new ComputeShader(device, compilationResult.Bytecode);
texture = new Texture2D(device, new Texture2DDescription() {
BindFlags = BindFlags.UnorderedAccess | BindFlags.ShaderResource,
Format = Format.R8G8B8A8_UNorm,
Width = 1024,
Height = 1024,
OptionFlags = ResourceOptionFlags.None,
MipLevels = 1,
ArraySize = 1,
SampleDescription = { Count = 1, Quality = 0 }
});
UnorderedAccessView view = new UnorderedAccessView(device, texture, new UnorderedAccessViewDescription() {
Format = Format.R8G8B8A8_UNorm,
Dimension = UnorderedAccessViewDimension.Texture2D,
Texture2D = { MipSlice = 0 }
});
stagingTexture = new Texture2D(device, new Texture2DDescription {
CpuAccessFlags = CpuAccessFlags.Read,
BindFlags = BindFlags.None,
Format = Format.R8G8B8A8_UNorm,
Width = 1024,
Height = 1024,
OptionFlags = ResourceOptionFlags.None,
MipLevels = 1,
ArraySize = 1,
SampleDescription = { Count = 1, Quality = 0 },
Usage = ResourceUsage.Staging
});
}
public Bitmap getBitmap() {
device.ImmediateContext.ComputeShader.Set(computeShader);
device.ImmediateContext.ComputeShader.SetUnorderedAccessView(0, view);
device.ImmediateContext.Dispatch(32, 32, 1);
device.ImmediateContext.CopyResource(texture, stagingTexture);
var mapSource = device.ImmediateContext.MapSubresource(stagingTexture, 0, MapMode.Read, MapFlags.None);
Console.WriteLine(Marshal.ReadInt32(IntPtr.Add(mapSource.DataPointer, 0)));
try {
// Copy pixels from screen capture Texture to GDI bitmap
Bitmap bitmap = new Bitmap(1024, 1024, System.Drawing.Imaging.PixelFormat.Format32bppRgb);
BitmapData mapDest = bitmap.LockBits(new Rectangle(0, 0, 1024, 1024), ImageLockMode.ReadWrite, bitmap.PixelFormat);
try {
var sourcePtr = mapSource.DataPointer;
var destPtr = mapDest.Scan0;
for (int y = 0; y < 1024; y++) {
// Copy a single line
Utilities.CopyMemory(destPtr, sourcePtr, 1024 * 4);
// Advance pointers
sourcePtr = IntPtr.Add(sourcePtr, mapSource.RowPitch);
destPtr = IntPtr.Add(destPtr, mapDest.Stride);
}
return bitmap;
} finally {
bitmap.UnlockBits(mapDest);
}
} finally {
device.ImmediateContext.UnmapSubresource(stagingTexture, 0);
}
}
}
I am pretty new to shaders so it may be something obvious...
First, you create your UAV as a local variable:
UnorderedAccessView view = new UnorderedAccessView(....
so the field of the same name stays null. Replacing that line with
view = new UnorderedAccessView(....
will solve the first issue.
Second, it's quite likely that the runtime will complain about types. The debug layer will give you something like:
The resource return type for component 0 declared in the shader code (FLOAT) is not compatible with the resource type bound to Unordered Access View slot 0 of the Compute Shader unit (UNORM).
Some cards might fix it silently, some might do nothing, some might crash :)
The problem is that RWTexture2D<float4> does not match the UNORM format of the view (the shader declares a floating-point format).
You need to declare your RWTexture2D specifically as unorm, e.g. (yes, the runtime can be that picky):
RWTexture2D<unorm float4> Output : register(u0);
Then your whole setup should work. (PS: I did not check the bitmap code, but I double-checked that the shader runs without error and that the first pixel matches.)
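As a cross-check on what the first-pixel printout should show once the shader writes correctly, here is a small Python sketch of the arithmetic. It assumes the usual UNORM mapping (each channel scaled from [0, 1] to 0..255 and rounded) and a little-endian 32-bit read, which is what `Marshal.ReadInt32` does on x86/x64. The function names are made up for illustration.

```python
import math
import struct


def unorm8(value):
    """Converts a float in [0, 1] to an 8-bit UNORM channel value."""
    clamped = min(max(value, 0.0), 1.0)
    return int(math.floor(clamped * 255.0 + 0.5))


def first_pixel_as_int32(r, g, b, a):
    """Packs one R8G8B8A8 texel and reads it back as a signed 32-bit integer."""
    texel = bytes([unorm8(r), unorm8(g), unorm8(b), unorm8(a)])
    return struct.unpack("<i", texel)[0]


def thread_groups(size, threads_per_group):
    """Number of groups to dispatch so that every texel along one axis is covered."""
    return (size + threads_per_group - 1) // threads_per_group


print(first_pixel_as_int32(0.0, 1.0, 0.0, 1.0))
print(thread_groups(1024, 32))
```

For the shader in the question, float4(0, 1.0, 0, 1.0) becomes the bytes 00 FF 00 FF, so the console should show -16711936 instead of 0. A 1024-texel axis with 32 threads per group needs 32 groups, which matches Dispatch(32, 32, 1).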

Contour position with "findcontour" opencv on processing

I'm working on a project that uses a webcam, an Arduino, a Raspberry Pi and an IR proximity sensor. I managed to get everything working with some help from Google, but I have one big problem left.
I'm using the OpenCV library for Processing, and I'd like the contours obtained from the webcam to be in the center of the sketch. But I have only managed to move the video, not the contours. Here's my code.
I hope you can help me :)
All the best
Alexandre
////////////////////////////////////////////
////////////////////////////////// LIBRARIES
////////////////////////////////////////////
import processing.serial.*;
import gab.opencv.*;
import processing.video.*;
/////////////////////////////////////////////////
////////////////////////////////// INITIALIZATION
/////////////////////////////////////////////////
Movie mymovie;
Capture video;
OpenCV opencv;
Contour contour;
////////////////////////////////////////////
////////////////////////////////// VARIABLES
////////////////////////////////////////////
int lf = 10; // Linefeed in ASCII
String myString = null;
Serial myPort; // The serial port
int sensorValue = 0;
int x = 300;
/////////////////////////////////////////////
////////////////////////////////// VOID SETUP
/////////////////////////////////////////////
void setup() {
size(1280, 1024);
// List all the available serial ports
printArray(Serial.list());
// Open the port you are using at the rate you want:
myPort = new Serial(this, Serial.list()[1], 9600);
myPort.clear();
// Throw out the first reading, in case we started reading
// in the middle of a string from the sender.
myString = myPort.readStringUntil(lf);
myString = null;
opencv = new OpenCV(this, 720, 480);
video = new Capture(this, 720, 480);
mymovie = new Movie(this, "visage.mov");
opencv.startBackgroundSubtraction(5, 3, 0.5);
mymovie.loop();
}
////////////////////////////////////////////
////////////////////////////////// VOID DRAW
////////////////////////////////////////////
void draw() {
image(mymovie, 0, 0);
image(video, 20, 20);
//tint(150, 20);
noFill();
stroke(255, 0, 0);
strokeWeight(1);
// check if there is something new on the serial port
while (myPort.available() > 0) {
// store the data in myString
myString = myPort.readStringUntil(lf);
// check if we really have something
if (myString != null) {
myString = myString.trim(); // let's remove whitespace characters
// if we have at least one character...
if (myString.length() > 0) {
println(myString); // print out the data we just received
// if we received a number (e.g. 123) store it in sensorValue, we will use this to change the background color.
try {
sensorValue = Integer.parseInt(myString);
}
catch(Exception e) {
}
}
}
}
if (x < sensorValue) {
video.start();
opencv.loadImage(video);
}
if (x > sensorValue) {
image(mymovie, 0, 0);
}
opencv.updateBackground();
opencv.dilate();
opencv.erode();
for (Contour contour : opencv.findContours()) {
contour.draw();
}
}
//////////////////////////////////////////////
////////////////////////////////// VOID CUSTOM
//////////////////////////////////////////////
void captureEvent(Capture video) {
video.read(); // read the new webcam frame so it can be displayed
}
void movieEvent(Movie myMovie) {
myMovie.read();
}
One approach you could use is to call the translate() function to move the origin of the canvas before you call contour.draw(). Something like this:
translate(moveX, moveY);
for (Contour contour : opencv.findContours()) {
  contour.draw();
}
What you use for moveX and moveY depends entirely on exactly what you're trying to do. You might use whatever position you're using to draw the video (if you want the contours displayed on top of the video), or you might use width/2 and height/2 (maybe minus a bit to really center the contours).
More info can be found in the reference. Play with a bunch of different values, and post an MCVE if you get stuck. Good luck.
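For the centering case, the offsets are just half the difference between the canvas size and the video size. Here is a small Python sketch of that arithmetic, using the 1280x1024 sketch size and the 720x480 capture size from the question. `centering_offset` is a hypothetical helper, not part of Processing or the OpenCV library.

```python
def centering_offset(canvas_w, canvas_h, content_w, content_h):
    """Top-left offset that centers content of the given size on the canvas."""
    move_x = (canvas_w - content_w) // 2
    move_y = (canvas_h - content_h) // 2
    return move_x, move_y


print(centering_offset(1280, 1024, 720, 480))
```

This prints (280, 272). Using the same pair both as the position of the video image and as the arguments to translate() should keep the contours lined up with the video they were found in.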

Improve Face Recognition

I am trying to develop a face-recognition app on Android. I am using the JavaCV FaceRecognizer, but so far I am getting very poor results. It recognizes images of a person it was trained on, but it also recognizes unknown images. For known faces it gives me a large distance value, most of the time 70-90 and sometimes 90+, while unknown images also get 70-90.
So how can I improve the performance of face recognition? What techniques are there? What success rate can you normally expect?
I have never worked with image processing before. I would appreciate any guidelines.
Here is the code:
public class PersonRecognizer {
public final static int MAXIMG = 100;
FaceRecognizer faceRecognizer;
String mPath;
int count=0;
labels labelsFile;
static final int WIDTH= 70;
static final int HEIGHT= 70;
private static final String TAG = "PersonRecognizer";
private int mProb=999;
PersonRecognizer(String path)
{
faceRecognizer = com.googlecode.javacv.cpp.opencv_contrib.createLBPHFaceRecognizer(2,8,8,8,100);
// path=Environment.getExternalStorageDirectory()+"/facerecog/faces/";
mPath=path;
labelsFile= new labels(mPath);
}
void changeRecognizer(int nRec)
{
switch(nRec) {
case 0: faceRecognizer = com.googlecode.javacv.cpp.opencv_contrib.createLBPHFaceRecognizer(1,8,8,8,100);
break;
case 1: faceRecognizer = com.googlecode.javacv.cpp.opencv_contrib.createFisherFaceRecognizer();
break;
case 2: faceRecognizer = com.googlecode.javacv.cpp.opencv_contrib.createEigenFaceRecognizer();
break;
}
train();
}
void add(Mat m, String description)
{
Bitmap bmp= Bitmap.createBitmap(m.width(), m.height(), Bitmap.Config.ARGB_8888);
Utils.matToBitmap(m,bmp);
bmp= Bitmap.createScaledBitmap(bmp, WIDTH, HEIGHT, false);
FileOutputStream f;
try
{
f = new FileOutputStream(mPath+description+"-"+count+".jpg",true);
count++;
bmp.compress(Bitmap.CompressFormat.JPEG, 100, f);
f.close();
} catch (Exception e) {
Log.e("error",e.getCause()+" "+e.getMessage());
e.printStackTrace();
}
}
public boolean train() {
File root = new File(mPath);
FilenameFilter pngFilter = new FilenameFilter() {
public boolean accept(File dir, String name) {
return name.toLowerCase().endsWith(".jpg");
};
};
File[] imageFiles = root.listFiles(pngFilter);
MatVector images = new MatVector(imageFiles.length);
int[] labels = new int[imageFiles.length];
int counter = 0;
int label;
IplImage img=null;
IplImage grayImg;
int i1=mPath.length();
for (File image : imageFiles) {
String p = image.getAbsolutePath();
img = cvLoadImage(p);
if (img==null)
Log.e("Error","Error cVLoadImage");
Log.i("image",p);
int i2=p.lastIndexOf("-");
int i3=p.lastIndexOf(".");
int icount = 0;
try
{
icount=Integer.parseInt(p.substring(i2+1,i3));
}
catch(Exception ex)
{
ex.printStackTrace();
}
if (count<icount) count++;
String description=p.substring(i1,i2);
if (labelsFile.get(description)<0)
labelsFile.add(description, labelsFile.max()+1);
label = labelsFile.get(description);
grayImg = IplImage.create(img.width(), img.height(), IPL_DEPTH_8U, 1);
cvCvtColor(img, grayImg, CV_BGR2GRAY);
images.put(counter, grayImg);
labels[counter] = label;
counter++;
}
if (counter>0)
if (labelsFile.max()>1)
faceRecognizer.train(images, labels);
labelsFile.Save();
return true;
}
public boolean canPredict()
{
if (labelsFile.max()>1)
return true;
else
return false;
}
public String predict(Mat m) {
if (!canPredict())
return "";
int n[] = new int[1];
double p[] = new double[1];
// convert Mat to black and white
/*Mat gray_m = new Mat();
Imgproc.cvtColor(m, gray_m, Imgproc.COLOR_RGBA2GRAY);*/
IplImage ipl = MatToIplImage(m, WIDTH, HEIGHT);
faceRecognizer.predict(ipl, n, p);
if (n[0]!=-1)
{
mProb=(int)p[0];
Log.v(TAG, "Distance = "+mProb+"");
Log.v(TAG, "N = "+n[0]);
}
else
{
mProb=-1;
Log.v(TAG, "Distance = "+mProb);
}
if (n[0] != -1)
{
return labelsFile.get(n[0]);
}
else
{
return "Unknown";
}
}
IplImage MatToIplImage(Mat m,int width,int heigth)
{
Bitmap bmp;
try
{
bmp = Bitmap.createBitmap(m.width(), m.height(), Bitmap.Config.RGB_565);
}
catch(OutOfMemoryError er)
{
bmp = Bitmap.createBitmap(m.width()/2, m.height()/2, Bitmap.Config.RGB_565);
er.printStackTrace();
}
Utils.matToBitmap(m, bmp);
return BitmapToIplImage(bmp, width, heigth);
}
IplImage BitmapToIplImage(Bitmap bmp, int width, int height) {
if ((width != -1) || (height != -1)) {
Bitmap bmp2 = Bitmap.createScaledBitmap(bmp, width, height, false);
bmp = bmp2;
}
IplImage image = IplImage.create(bmp.getWidth(), bmp.getHeight(),
IPL_DEPTH_8U, 4);
bmp.copyPixelsToBuffer(image.getByteBuffer());
IplImage grayImg = IplImage.create(image.width(), image.height(),
IPL_DEPTH_8U, 1);
cvCvtColor(image, grayImg, opencv_imgproc.CV_BGR2GRAY);
return grayImg;
}
protected void SaveBmp(Bitmap bmp,String path)
{
FileOutputStream file;
try
{
file = new FileOutputStream(path , true);
bmp.compress(Bitmap.CompressFormat.JPEG, 100, file);
file.close();
}
catch (Exception e) {
// TODO Auto-generated catch block
Log.e("",e.getMessage()+e.getCause());
e.printStackTrace();
}
}
public void load() {
train();
}
public int getProb() {
// TODO Auto-generated method stub
return mProb;
}
}
I faced similar challenges recently. Here are the things that helped me get better results:
Crop the faces from the images - this removes unnecessary pixels at inference time.
Resize the cropped face images - this affects face-landmark detection, so try different scales on test sets to see what works best. It also affects inference time: the smaller the image, the faster the inference.
Improve the brightness of the face images - I found this really helpful. Detecting face landmarks in darker images did not work well, mainly because the model was pre-trained on mostly white faces. Understanding the training data helps when dealing with bias.
Convert to grayscale images - I have seen this recommended in many forums as a way to find edges more efficiently, and processing time is lower than for colour images (3 channels, RGB). However, this did not help much in my case.
Capture (register) as many images as possible of each person, at different angles, in different lighting, and with other variations - this really helps, since new faces are compared with the encodings of the stored images.
Try to implement 1-1 comparison for face verification - for example, in my system I captured 10 pictures of each person, and at verification time I compare against those 10 pictures instead of the encodings of all persons stored in the system. This reduces false positives, but the use cases for this setup are limited. I use it for face authentication and compare the new face against existing faces registered with the same mobile number.
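The 1-1 comparison in the last point can be sketched like this. It is a plain-Python illustration, not JavaCV code: encodings are modelled as lists of floats, the Euclidean distance and the threshold value are assumptions, and `enrolled`, `distance` and `verify` are hypothetical names.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length encodings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def verify(claimed_id, probe, enrolled, threshold):
    """Compares the probe only against the claimed person's stored encodings.

    Returns (accepted, best_distance). An id with no stored encodings is rejected.
    """
    samples = enrolled.get(claimed_id, [])
    if not samples:
        return False, None
    best = min(distance(probe, sample) for sample in samples)
    return best <= threshold, best


# Toy data: two stored encodings for one person, one for another.
enrolled = {
    "alice": [[0.0, 0.0], [0.1, 0.0]],
    "bob": [[5.0, 5.0]],
}
```

The threshold only helps if distances for genuine and impostor faces are actually separated. In the question both fall in the 70-90 range, so the input-quality steps above would have to come first.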
My understanding as of today is that face recognition systems work well but are not 100% accurate. We have to understand the model architecture, the training data and our requirements, and deploy accordingly to get a better outcome. Here are some points that helped me improve the overall system:
Implement a fallback method - give the user an alternative when the system fails to recognize them correctly. For example, if face authentication fails for some reason, show an option to enter a PIN.
In critical systems, add periodic human intervention to confirm the system's result - for example, if the system rejects a user based on the FR result, have a human agent verify the failed result and allow the user in.
Implement multiple factors for authentication - deploy face recognition in addition to an existing system. For example, after a user has logged in with credentials, verify that they are the intended person using face recognition.
Design your user interface so that it guides how the user should behave at verification time (eyes open, mouth closed, etc.) without hurting the user experience.
Provide clear instructions to users - for example, let them know that an FR system is integrated and that they need to show their face in good lighting.

SharpPCap missing packets

I'm using SharpPCap to collect IEC61850-9-2LE Sampled Values over Ethernet.
IEC61850-9-2LE Sampled Values consist of several streams, each sending 4000 packets per second with an average packet size of 125 bytes.
Using SharpPCap I'm trying to collect 3 of those streams (3 x 4000 packets per second, 125 bytes each).
The following code sets up the network interface card.
if (nicToUse != null)
{
try
{
nicToUse.OnPacketArrival -= OnPackectArrivalLive;
nicToUse.OnPacketArrival += OnPackectArrivalLive;
try
{
if (nicToUse.Started)
nicToUse.StopCapture();
if (nicToUse.Opened)
nicToUse.Close();
}
catch (Exception)
{
//no handling, just do it.
}
nicToUse.Open(OpenFlags.Promiscuous|OpenFlags.MaxResponsiveness,10);
var kernelBufferAssigned = false;
uint kernelBufferSize = 200;
while (!kernelBufferAssigned)
{
try
{
nicToUse.KernelBufferSize = kernelBufferSize * 1024 * 1024;
kernelBufferAssigned = true;
}
catch (Exception)
{
kernelBufferSize--;
}
}
nicToUse.Filter = "(ether[0:4] = 0x010CCD04)";
watchdog.Enabled = true;
counter = 0;
nicToUse.StartCapture();
}
catch (Exception ex)
{
throw new Exception(Resources.SharpPCapPacketsProducer_Start_Error_while_starting_online_capture_, ex);
}
}
This is the OnPacketArrival event handler:
private void OnPackectArrivalLive(object sender, CaptureEventArgs e)
{
    try
    {
        counter++;
        circularBuffer[circularBufferIndex] = e.Packet;
        circularBufferIndex++;
        if (circularBufferIndex > circularBufferSize - 1)
            circularBufferIndex = 0;
    }
    catch (Exception)
    {
    }
}
When capturing is over (the user stops it), the captured packets are decoded. Since they carry a sequential counter, I discovered that some samples are missing.
When the same source is connected to another PC running Wireshark, those samples are not missing.
Any ideas?
What version of SharpPcap are you using? There have been some pretty big performance improvements due to overhead reduction in the 3.x and 4.x series.
Your example seems to be wrapping the circular buffer around at the tail. What type is circularBuffer? How can you be sure that you are processing the packets before your buffer has filled up?
Have you looked at this example, from the SharpPcap source distribution, that shows one technique for doing background packet processing?
QueueingPacketsForBackgroundProcessing/Main.cs
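The idea behind that kind of background processing is to keep the arrival handler as cheap as possible and do the decoding on another thread. A rough Python sketch of the pattern follows. It is not SharpPcap code: `PacketQueue` and `missing_sequence_numbers` are illustrative names, and the gap check assumes a counter that simply increases (wrap-around is not handled).

```python
import collections
import threading


class PacketQueue:
    """Unbounded hand-off queue between the capture callback and a worker."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = collections.deque()

    def put(self, packet):
        # Called from the arrival handler: append and return immediately.
        with self._lock:
            self._items.append(packet)

    def drain(self):
        # Called from the worker: take everything queued so far in one swap.
        with self._lock:
            batch, self._items = self._items, collections.deque()
        return list(batch)


def missing_sequence_numbers(seq_numbers):
    """Lists the counters skipped between consecutive received packets."""
    missing = []
    for prev, cur in zip(seq_numbers, seq_numbers[1:]):
        missing.extend(range(prev + 1, cur))
    return missing
```

At 3 x 4000 = 12000 packets per second, a fixed circular buffer of N entries starts overwriting unread packets after N / 12000 seconds, whereas a queue that a worker drains continuously is limited only by how fast the worker keeps up.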
