Processing OpenCV Face Tracking - opencv

Hi, I started with the code from Sparkfun below to do some face tracking, and I get this error:
the type OpenCV is ambiguous
I tried other examples from the OpenCV for Processing library.
They work without a problem (including the face tracking example).
The original code from Sparkfun was written for a different OpenCV (version 1, I believe).
But I could not make it work, because there is no library import at the top of the code.
Since I have OpenCV for Processing installed I imported that:
import gab.opencv.*;
and from then on I get this error.
I don't see why it does not work, and I don't understand why it was supposed to work in the first place (since the original code does not import OpenCV).
Any help would be great.
Thanks.
/**********************************************************************************************
* Pan/Tilt Face Tracking Sketch
* Written by Ryan Owens for SparkFun Electronics
* Uses the OpenCV real-time computer vision framework from Intel
* Based on the OpenCV Processing Examples from ubaa.net
* This example is released under the Beerware License.
* (Use the code however you'd like, but mention us and buy me a beer if we ever meet!)
*
* The Pan/Tilt Face Tracking Sketch interfaces with an Arduino Main board to control
* two servos, pan and tilt, which are connected to a webcam. The OpenCV library
* looks for a face in the image from the webcam. If a face is detected the sketch
* uses the coordinates of the face to manipulate the pan and tilt servos to move the webcam
* in order to keep the face in the center of the frame.
*
* Setup-
* A webcam must be connected to the computer.
* An Arduino must be connected to the computer. Note the port which the Arduino is connected on.
* The Arduino must be loaded with the SerialServoControl Sketch.
* Two servos mounted on a pan/tilt bracket must be connected to the Arduino pins 2 and 3.
* The Arduino must be powered by a 9V external power supply.
*
* Read this tutorial for more information:
**********************************************************************************************/
import gab.opencv.*;
import hypermedia.video.*; //Include the video library to capture images from the webcam
import java.awt.Rectangle; //A rectangle class which keeps track of the face coordinates.
import processing.serial.*; //The serial library is needed to communicate with the Arduino.
OpenCV opencv; //Create an instance of the OpenCV library.
//Screen Size Parameters
int width = 320;
int height = 240;
// contrast/brightness values
int contrast_value = 0;
int brightness_value = 0;
Serial port; // The serial port
//Variables for keeping track of the current servo positions.
char servoTiltPosition = 90;
char servoPanPosition = 90;
//The pan/tilt servo ids for the Arduino serial command interface.
char tiltChannel = 0;
char panChannel = 1;
//These variables hold the x and y location for the middle of the detected face.
int midFaceY=0;
int midFaceX=0;
//The variables correspond to the middle of the screen, and will be compared to the midFace values
int midScreenY = (height/2);
int midScreenX = (width/2);
int midScreenWindow = 10; //This is the acceptable 'error' for the center of the screen.
//The degree of change that will be applied to the servo each time we update the position.
int stepSize=1;
void setup() {
//Create a window for the sketch.
size( width, height );
opencv = new OpenCV( this );
opencv.capture( width, height ); // open video stream
opencv.cascade( OpenCV.CASCADE_FRONTALFACE_ALT ); // load detection description, here-> front face detection : "haarcascade_frontalface_alt.xml"
println(Serial.list()); // List COM-ports (Use this to figure out which port the Arduino is connected to)
//select first com-port from the list (change the number in the [] if your sketch fails to connect to the Arduino)
port = new Serial(this, Serial.list()[0], 57600); //Baud rate is set to 57600 to match the Arduino baud rate.
// print usage
println( "Drag mouse on X-axis inside this sketch window to change contrast" );
println( "Drag mouse on Y-axis inside this sketch window to change brightness" );
//Send the initial pan/tilt angles to the Arduino to set the device up to look straight forward.
port.write(tiltChannel); //Send the Tilt Servo ID
port.write(servoTiltPosition); //Send the Tilt Position (currently 90 degrees)
port.write(panChannel); //Send the Pan Servo ID
port.write(servoPanPosition); //Send the Pan Position (currently 90 degrees)
}
public void stop() {
opencv.stop();
super.stop();
}
void draw() {
// grab a new frame
// and convert to gray
opencv.read();
opencv.convert( GRAY );
opencv.contrast( contrast_value );
opencv.brightness( brightness_value );
// proceed detection
Rectangle[] faces = opencv.detect( 1.2, 2, OpenCV.HAAR_DO_CANNY_PRUNING, 40, 40 );
// display the image
image( opencv.image(), 0, 0 );
// draw face area(s)
noFill();
stroke(255,0,0);
for( int i=0; i<faces.length; i++ ) {
rect( faces[i].x, faces[i].y, faces[i].width, faces[i].height );
}
//Find out if any faces were detected.
if(faces.length > 0){
//If a face was found, find the midpoint of the first face in the frame.
//NOTE: The .x and .y of the face rectangle corresponds to the upper left corner of the rectangle,
// so we manipulate these values to find the midpoint of the rectangle.
midFaceY = faces[0].y + (faces[0].height/2);
midFaceX = faces[0].x + (faces[0].width/2);
//Find out if the Y component of the face is below the middle of the screen.
if(midFaceY < (midScreenY - midScreenWindow)){
if(servoTiltPosition >= 5)servoTiltPosition -= stepSize; //If it is below the middle of the screen, update the tilt position variable to lower the tilt servo.
}
//Find out if the Y component of the face is above the middle of the screen.
else if(midFaceY > (midScreenY + midScreenWindow)){
if(servoTiltPosition <= 175)servoTiltPosition +=stepSize; //Update the tilt position variable to raise the tilt servo.
}
//Find out if the X component of the face is to the left of the middle of the screen.
if(midFaceX < (midScreenX - midScreenWindow)){
if(servoPanPosition >= 5)servoPanPosition -= stepSize; //Update the pan position variable to move the servo to the left.
}
//Find out if the X component of the face is to the right of the middle of the screen.
else if(midFaceX > (midScreenX + midScreenWindow)){
if(servoPanPosition <= 175)servoPanPosition +=stepSize; //Update the pan position variable to move the servo to the right.
}
}
//Update the servo positions by sending the serial command to the Arduino.
port.write(tiltChannel); //Send the tilt servo ID
port.write(servoTiltPosition); //Send the updated tilt position.
port.write(panChannel); //Send the Pan servo ID
port.write(servoPanPosition); //Send the updated pan position.
delay(1);
}
/**
* Changes contrast/brightness values
*/
void mouseDragged() {
contrast_value = (int) map( mouseX, 0, width, -128, 128 );
brightness_value = (int) map( mouseY, 0, height, -128, 128 );
}

You're using two separate Processing wrappers for OpenCV (gab.* and hypermedia.*), each of which has an OpenCV class of its own. Use one or the other, but not both in the same project: Java can't tell which one you want to use (hence the ambiguous OpenCV type error).
You seem to be using the hypermedia classes anyway, so remove the gab.* import for now as a quick fix.
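For example, a minimal sketch of the fix at the top of your sketch (the fully qualified form is just an alternative if both imports ever had to stay):
// keep only the wrapper this sketch actually uses
import hypermedia.video.*;   // and delete the line "import gab.opencv.*;"

OpenCV opencv; // unambiguous now: this can only mean hypermedia.video.OpenCV

// alternative: keep both imports but fully qualify the one you mean
// hypermedia.video.OpenCV opencv;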
The gab.* library, though, is better (more up to date) than the hypermedia one, so you might want to update your OpenCV calls to use that one in the future.

Related

2D augmented reality with sensor issue

I'm making a "geolocational AR app" in which I use Paint() to draw a bitmap on my screen, using sensors to do some translation so that my image is only shown at a specific point.
I've got everything working. However, the image shakes while I present it.
I really want the effect shown in the following video:
https://www.youtube.com/watch?v=8U3vWETmk2U
I've implemented a low pass filter to ease the situation, but it is still not as steady as the images from the video.
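The low pass filter is just an exponential smoothing of the raw sensor values before they reach getRotationMatrix, roughly like this (ALPHA and the helper name here are illustrative, not my exact code):
private static final float ALPHA = 0.15f; // smoothing factor, tuned by hand

// blends each new reading with the previous filtered value
private float[] lowPass(float[] input, float[] output) {
    if (output == null) return input.clone();
    for (int i = 0; i < input.length; i++) {
        output[i] = output[i] + ALPHA * (input[i] - output[i]);
    }
    return output;
}

// in onSensorChanged():
// lastAccelerometer = lowPass(event.values.clone(), lastAccelerometer);
// lastCompass = lowPass(event.values.clone(), lastCompass);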
This is how I achieve AR movement
float dx = (float) ( (canvas.getWidth()/ horizontalFOV) * (Math.toDegrees(orientation[0])-curBearingTo));
float dy = (float) ( (canvas.getHeight()/ verticalFOV) * Math.toDegrees(orientation[1]));
canvas.translate(0.0f, 0.0f-dy);
canvas.translate(0.0f-dx, 0.0f);
The curBearingTo is the result of the Android location API: bearingTo(location, destination).
And this is how I get Orientation matrix
if (lastAccelerometer != null && lastCompass != null) {
boolean gotRotation = SensorManager.getRotationMatrix(rotation,
identity, lastAccelerometer, lastCompass);
if (gotRotation) {
// remap such that the camera is pointing straight down the Y
// axis
SensorManager.remapCoordinateSystem(rotation,
SensorManager.AXIS_X, SensorManager.AXIS_Z,
cameraRotation);
// orientation vector
SensorManager.getOrientation(cameraRotation, orientation);
}
}
Any suggestion?

How to determine the distance between upper lip and lower lip by using webcam in Processing?

Where should I start? I can see plenty of face recognition and analysis using Python and JavaScript, but how about Processing?
I want to determine the distance between two points on the upper and lower lip, at their highest and lowest points, via webcam, to use in a further project.
Any help would be appreciated.
If you want to do it in Processing alone you can use Greg Borenstein's OpenCV for Processing library:
You can start with the Face Detection example
Once you detect a face, you can detect a mouth within the face rectangle using OpenCV.CASCADE_MOUTH.
Once you have the mouth detected, you may be able to get away with using the mouth bounding box height. For more detail, you can use OpenCV to threshold that rectangle. Hopefully the open mouth will segment nicely from the rest of the skin. Finding contours should give you lists of points you can work with.
For something a lot more exact, you can use Jason Saragih's CLM FaceTracker, which is available as an OpenFrameworks addon. OpenFrameworks has similarities to Processing. If you do need this sort of accuracy in Processing, you can run FaceOSC in the background and read the mouth coordinates in Processing using oscP5.
Update
For the first option, using HAAR cascade classifiers, it turns out there are a couple of issues:
1. The OpenCV Processing library can only load one cascade; a second instance will override the first.
2. OpenCV.CASCADE_MOUTH seems to work better for closed mouths, but not very well with open mouths.
To get past the 1st issue, you can use the OpenCV Java API directly, bypassing OpenCV Processing for multiple cascade detection.
There are a couple of parameters that can help the detection, such as having an idea of the bounding box of the mouth beforehand to pass as a hint to the classifier.
I've done a basic test using a webcam on my laptop and measured the bounding boxes for the face and mouth at various distances. Here's an example:
import gab.opencv.*;
import org.opencv.core.*;
import org.opencv.objdetect.*;
import processing.video.*;
Capture video;
OpenCV opencv;
CascadeClassifier faceDetector,mouthDetector;
MatOfRect faceDetections,mouthDetections;
//cascade detections parameters - explanations from Mastering OpenCV with Practical Computer Vision Projects
int flags = Objdetect.CASCADE_FIND_BIGGEST_OBJECT;
// Smallest object size.
Size minFeatureSizeFace = new Size(50,60);
Size maxFeatureSizeFace = new Size(125,150);
Size minFeatureSizeMouth = new Size(30,10);
Size maxFeatureSizeMouth = new Size(120,60);
// How detailed should the search be. Must be larger than 1.0.
float searchScaleFactor = 1.1f;
// How much the detections should be filtered out. This should depend on how bad false detections are to your system.
// minNeighbors=2 means lots of good+bad detections, and minNeighbors=6 means only good detections are given but some are missed.
int minNeighbors = 4;
//laptop webcam face rectangle
//far, small scale, ~50,60px
//typing distance, ~83,91px
//really close, ~125,150
//laptop webcam mouth rectangle
//far, small scale, ~30,10
//typing distance, ~50,25px
//really close, ~120,60
int mouthHeightHistory = 30;
int[] mouthHeights = new int[mouthHeightHistory];
void setup() {
opencv = new OpenCV(this,320,240);
size(opencv.width, opencv.height);
noFill();
frameRate(30);
video = new Capture(this,width,height);
video.start();
faceDetector = new CascadeClassifier(dataPath("haarcascade_frontalface_alt2.xml"));
mouthDetector = new CascadeClassifier(dataPath("haarcascade_mcs_mouth.xml"));
}
void draw() {
//feed cam image to OpenCV, it turns it to grayscale
opencv.loadImage(video);
opencv.equalizeHistogram();
image(opencv.getOutput(), 0, 0 );
//detect face using raw Java OpenCV API
Mat equalizedImg = opencv.getGray();
faceDetections = new MatOfRect();
faceDetector.detectMultiScale(equalizedImg, faceDetections, searchScaleFactor, minNeighbors, flags, minFeatureSizeFace, maxFeatureSizeFace);
Rect[] faceDetectionResults = faceDetections.toArray();
int faces = faceDetectionResults.length;
text("detected faces: "+faces,5,15);
if(faces >= 1){
Rect face = faceDetectionResults[0];
stroke(0,192,0);
rect(face.x,face.y,face.width,face.height);
//detect mouth - only within face rectangle, not the whole frame
Rect faceLower = face.clone();
faceLower.height = (int) (face.height * 0.65);
faceLower.y = face.y + faceLower.height;
Mat faceROI = equalizedImg.submat(faceLower);
//debug view of ROI
PImage faceImg = createImage(faceLower.width,faceLower.height,RGB);
opencv.toPImage(faceROI,faceImg);
image(faceImg,width-faceImg.width,0);
mouthDetections = new MatOfRect();
mouthDetector.detectMultiScale(faceROI, mouthDetections, searchScaleFactor, minNeighbors, flags, minFeatureSizeMouth, maxFeatureSizeMouth);
Rect[] mouthDetectionResults = mouthDetections.toArray();
int mouths = mouthDetectionResults.length;
text("detected mouths: "+mouths,5,25);
if(mouths >= 1){
Rect mouth = mouthDetectionResults[0];
stroke(192,0,0);
rect(faceLower.x + mouth.x,faceLower.y + mouth.y,mouth.width,mouth.height);
text("mouth height:"+mouth.height+"~px",5,35);
updateAndPlotMouthHistory(mouth.height);
}
}
}
void updateAndPlotMouthHistory(int newHeight){
//shift older values by 1
for(int i = mouthHeightHistory-1; i > 0; i--){
mouthHeights[i] = mouthHeights[i-1];
}
//add new value at the front
mouthHeights[0] = newHeight;
//plot
float graphWidth = 100.0;
float elementWidth = graphWidth / mouthHeightHistory;
for(int i = 0; i < mouthHeightHistory; i++){
rect(elementWidth * i,45,elementWidth,mouthHeights[i]);
}
}
void captureEvent(Capture c) {
c.read();
}
One very important note: I've copied the cascade xml files from the OpenCV Processing library folder (~/Documents/Processing/libraries/opencv_processing/library/cascade-files) to the sketch's data folder. My sketch is called OpenCVMouthOpen, so the folder structure looks like this:
OpenCVMouthOpen
├── OpenCVMouthOpen.pde
└── data
├── haarcascade_frontalface_alt.xml
├── haarcascade_frontalface_alt2.xml
├── haarcascade_frontalface_alt_tree.xml
├── haarcascade_frontalface_default.xml
├── haarcascade_mcs_mouth.xml
└── lbpcascade_frontalface.xml
If you don't copy the cascade files and use the code as it is, you won't get any errors, but the detection simply won't work. If you want to check, you can call
println(faceDetector.empty())
at the end of the setup() function: if you get false, the cascade has been loaded; if you get true, it hasn't.
You may need to play with the minFeatureSize and maxFeatureSize values for the face and mouth for your setup. The second issue, the cascade not detecting a wide open mouth very well, is tricky. There might be an already trained cascade for open mouths, but you'd need to find it. Otherwise, with this method you may need to train one yourself, and that can be a bit tedious.
Nevertheless, notice that there is an upside down plot drawn on the left when a mouth is detected. In my tests I noticed that the height isn't super accurate, but there are noticeable changes in the graph. You may not be able to get a steady mouth height, but by comparing current to averaged previous height values you should see some peaks (values going from positive to negative or vice-versa) which give you an idea of a mouth open/close change.
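For example, a rough way to turn the plotted history into an open/close signal (this reuses the mouthHeights buffer from the sketch above; the threshold is an assumption you'd tune for your camera and distance):
// returns true when the newest mouth height differs noticeably from the recent average
boolean mouthChanged(int openThreshold) {
  float avg = 0;
  for (int i = 1; i < mouthHeightHistory; i++) {
    avg += mouthHeights[i];
  }
  avg /= (mouthHeightHistory - 1);
  float delta = mouthHeights[0] - avg; // positive = opening, negative = closing
  return abs(delta) > openThreshold;
}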
Although searching through the whole image for a mouth, as opposed to within a face only, can be a bit slower and less accurate, it's a simpler setup. If you can get away with less accuracy and more false positives in your project, this could work:
import gab.opencv.*;
import java.awt.Rectangle;
import org.opencv.objdetect.Objdetect;
import processing.video.*;
Capture video;
OpenCV opencv;
Rectangle[] faces,mouths;
//cascade detections parameters - explanations from Mastering OpenCV with Practical Computer Vision Projects
int flags = Objdetect.CASCADE_FIND_BIGGEST_OBJECT;
// Smallest object size.
int minFeatureSize = 20;
int maxFeatureSize = 150;
// How detailed should the search be. Must be larger than 1.0.
float searchScaleFactor = 1.1f;
// How much the detections should be filtered out. This should depend on how bad false detections are to your system.
// minNeighbors=2 means lots of good+bad detections, and minNeighbors=6 means only good detections are given but some are missed.
int minNeighbors = 6;
void setup() {
size(320, 240);
noFill();
stroke(0, 192, 0);
strokeWeight(3);
video = new Capture(this,width,height);
video.start();
opencv = new OpenCV(this,320,240);
opencv.loadCascade(OpenCV.CASCADE_MOUTH);
}
void draw() {
//feed cam image to OpenCV, it turns it to grayscale
opencv.loadImage(video);
opencv.equalizeHistogram();
image(opencv.getOutput(), 0, 0 );
Rectangle[] mouths = opencv.detect(searchScaleFactor,minNeighbors,flags,minFeatureSize, maxFeatureSize);
for (int i = 0; i < mouths.length; i++) {
text(mouths[i].x + "," + mouths[i].y + "," + mouths[i].width + "," + mouths[i].height,mouths[i].x, mouths[i].y);
rect(mouths[i].x, mouths[i].y, mouths[i].width, mouths[i].height);
}
}
void captureEvent(Capture c) {
c.read();
}
I mentioned segmenting/thresholding as well. Here's a rough example using the lower part of a detected face with just a basic threshold, then some basic morphological filters (erode/dilate) to clean up the thresholded image a bit:
import gab.opencv.*;
import org.opencv.core.*;
import org.opencv.objdetect.*;
import org.opencv.imgproc.Imgproc;
import java.awt.Rectangle;
import java.util.*;
import processing.video.*;
Capture video;
OpenCV opencv;
CascadeClassifier faceDetector,mouthDetector;
MatOfRect faceDetections,mouthDetections;
//cascade detections parameters - explanations from Mastering OpenCV with Practical Computer Vision Projects
int flags = Objdetect.CASCADE_FIND_BIGGEST_OBJECT;
// Smallest object size.
Size minFeatureSizeFace = new Size(50,60);
Size maxFeatureSizeFace = new Size(125,150);
// How detailed should the search be. Must be larger than 1.0.
float searchScaleFactor = 1.1f;
// How much the detections should be filtered out. This should depend on how bad false detections are to your system.
// minNeighbors=2 means lots of good+bad detections, and minNeighbors=6 means only good detections are given but some are missed.
int minNeighbors = 4;
//laptop webcam face rectangle
//far, small scale, ~50,60px
//typing distance, ~83,91px
//really close, ~125,150
float threshold = 160;
int erodeAmt = 1;
int dilateAmt = 5;
void setup() {
opencv = new OpenCV(this,320,240);
size(opencv.width, opencv.height);
noFill();
video = new Capture(this,width,height);
video.start();
faceDetector = new CascadeClassifier(dataPath("haarcascade_frontalface_alt2.xml"));
mouthDetector = new CascadeClassifier(dataPath("haarcascade_mcs_mouth.xml"));
}
void draw() {
//feed cam image to OpenCV, it turns it to grayscale
opencv.loadImage(video);
opencv.equalizeHistogram();
image(opencv.getOutput(), 0, 0 );
//detect face using raw Java OpenCV API
Mat equalizedImg = opencv.getGray();
faceDetections = new MatOfRect();
faceDetector.detectMultiScale(equalizedImg, faceDetections, searchScaleFactor, minNeighbors, flags, minFeatureSizeFace, maxFeatureSizeFace);
Rect[] faceDetectionResults = faceDetections.toArray();
int faces = faceDetectionResults.length;
text("detected faces: "+faces,5,15);
if(faces > 0){
Rect face = faceDetectionResults[0];
stroke(0,192,0);
rect(face.x,face.y,face.width,face.height);
//detect mouth - only within face rectangle, not the whole frame
Rect faceLower = face.clone();
faceLower.height = (int) (face.height * 0.55);
faceLower.y = face.y + faceLower.height;
//submat grabs a portion of the image (submatrix) = our region of interest (ROI)
Mat faceROI = equalizedImg.submat(faceLower);
Mat faceROIThresh = faceROI.clone();
//threshold
Imgproc.threshold(faceROI, faceROIThresh, threshold, width, Imgproc.THRESH_BINARY_INV);
Imgproc.erode(faceROIThresh, faceROIThresh, new Mat(), new Point(-1,-1), erodeAmt);
Imgproc.dilate(faceROIThresh, faceROIThresh, new Mat(), new Point(-1,-1), dilateAmt);
//find contours
Mat faceContours = faceROIThresh.clone();
List<MatOfPoint> contours = new ArrayList<MatOfPoint>();
Imgproc.findContours(faceContours, contours, new Mat(), Imgproc.RETR_EXTERNAL , Imgproc.CHAIN_APPROX_SIMPLE);
//draw contours
for(int i = 0 ; i < contours.size(); i++){
MatOfPoint contour = contours.get(i);
Point[] points = contour.toArray();
stroke(map(i,0,contours.size()-1,32,255),0,0);
beginShape();
for(Point p : points){
vertex((float)p.x,(float)p.y);
}
endShape();
}
//debug view of ROI
PImage faceImg = createImage(faceLower.width,faceLower.height,RGB);
opencv.toPImage(faceROIThresh,faceImg);
image(faceImg,width-faceImg.width,0);
}
text("Drag mouseX to control threshold: " + threshold+
"\nHold 'e' and drag mouseX to control erodeAmt: " + erodeAmt+
"\nHold 'd' and drag mouseX to control dilateAmt: " + dilateAmt,5,210);
}
void mouseDragged(){
if(keyPressed){
if(key == 'e') erodeAmt = (int)map(mouseX,0,width,1,6);
if(key == 'd') dilateAmt = (int)map(mouseX,0,width,1,10);
}else{
threshold = mouseX;
}
}
void captureEvent(Capture c) {
c.read();
}
This could be improved a bit by using the YCrCb colour space to segment skin better, but overall you'll notice that there are quite a few variables to get right, which doesn't make this a very flexible setup.
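For reference, a minimal YCrCb skin segmentation sketch with the OpenCV Java API (the colour Mat and the Cr/Cb bounds below are assumptions and would need tuning):
import org.opencv.core.*;
import org.opencv.imgproc.Imgproc;

// colorFrame is assumed to be an 8-bit, 3-channel BGR Mat of the camera image
Mat segmentSkinYCrCb(Mat colorFrame) {
  Mat ycrcb = new Mat();
  Imgproc.cvtColor(colorFrame, ycrcb, Imgproc.COLOR_BGR2YCrCb);
  Mat skinMask = new Mat();
  // keep pixels whose Cr/Cb values fall inside a commonly used skin range
  Core.inRange(ycrcb, new Scalar(0, 133, 77), new Scalar(255, 173, 127), skinMask);
  return skinMask;
}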
You will get much better results using FaceOSC and reading the values you need in Processing via oscP5. Here is a slightly simplified version of the FaceOSCReceiver Processing example, focusing mainly on the mouth:
import oscP5.*;
OscP5 oscP5;
// num faces found
int found;
// pose
float poseScale;
PVector posePosition = new PVector();
// gesture
float mouthHeight;
float mouthWidth;
void setup() {
size(640, 480);
frameRate(30);
oscP5 = new OscP5(this, 8338);
oscP5.plug(this, "found", "/found");
oscP5.plug(this, "poseScale", "/pose/scale");
oscP5.plug(this, "posePosition", "/pose/position");
oscP5.plug(this, "mouthWidthReceived", "/gesture/mouth/width");
oscP5.plug(this, "mouthHeightReceived", "/gesture/mouth/height");
}
void draw() {
background(255);
stroke(0);
if(found > 0) {
translate(posePosition.x, posePosition.y);
scale(poseScale);
noFill();
ellipse(0, 20, mouthWidth* 3, mouthHeight * 3);
}
}
// OSC CALLBACK FUNCTIONS
public void found(int i) {
println("found: " + i);
found = i;
}
public void poseScale(float s) {
println("scale: " + s);
poseScale = s;
}
public void posePosition(float x, float y) {
println("pose position\tX: " + x + " Y: " + y );
posePosition.set(x, y, 0);
}
public void mouthWidthReceived(float w) {
println("mouth Width: " + w);
mouthWidth = w;
}
public void mouthHeightReceived(float h) {
println("mouth height: " + h);
mouthHeight = h;
}
// all other OSC messages end up here
void oscEvent(OscMessage m) {
if(m.isPlugged() == false) {
println("UNPLUGGED: " + m);
}
}
On OSX you can simply download the compiled FaceOSC app.
On other operating systems you may need to set up OpenFrameworks, download ofxFaceTracker, and compile FaceOSC yourself.
It's really hard to answer general "how do I do this" type questions. Stack Overflow is designed for specific "I tried X, expected Y, but got Z instead" type questions. But I'll try to answer in a general sense:
You need to break your problem down into smaller pieces.
Step 1: Can you get a webcam feed showing in your sketch? Don't worry about the computer vision stuff for a second. Just get the camera connected. Do some research and try something out (there's a minimal example after this list).
Step 2: Can you detect facial features in that video? You might try doing it yourself, or you might use one of the many libraries listed in the Videos and Vision section of the Processing libraries page.
Step 3: Read the documentation on those libraries. Try them out. You might have to make a bunch of little example sketches using each library until you find one you like. We can't do this for you, as which one is right for you depends on you. If you're confused about something specific we can try to help you, but we can't really help you with picking out a library.
Step 4: Once you've done a bunch of example programs and picked out a library, start working towards your goal. Can you detect facial features using the library? Get just that part working. Once you have that working, can you detect changes like opening or closing a mouth?
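As a starting point for step 1, a minimal webcam sketch with the Processing video library looks roughly like this:
import processing.video.*;

Capture cam;

void setup() {
  size(640, 480);
  cam = new Capture(this, width, height);
  cam.start();
}

void captureEvent(Capture c) {
  c.read(); // read each new frame as it arrives
}

void draw() {
  image(cam, 0, 0); // draw the latest camera frame
}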
Work on one small step at a time. If you get stuck, post an MCVE along with a specific technical question, and we'll go from there. Good luck.

How to convert TangoXyzIjData into a matrix of z-values

I am currently using a Project Tango tablet for robotic obstacle avoidance. I want to create a matrix of z-values as they would appear on the Tango screen, so that I can use OpenCV to process the matrix. When I say z-values, I mean the distance each point is from the Tango. However, I don't know how to extract the z-values from the TangoXyzIjData and organize the values into a matrix. This is the code I have so far:
public void action(TangoPoseData poseData, TangoXyzIjData depthData) {
byte[] buffer = new byte[depthData.xyzCount * 3 * 4];
try {
FileInputStream fileStream = new FileInputStream(
depthData.xyzParcelFileDescriptor.getFileDescriptor());
fileStream.read(buffer, depthData.xyzParcelFileDescriptorOffset, buffer.length);
fileStream.close();
} catch (IOException e) {
e.printStackTrace();
}
Mat m = new Mat(depthData.ijRows, depthData.ijCols, CvType.CV_8UC1);
m.put(0, 0, buffer);
}
Does anyone know how to do this? I would really appreciate help.
The short answer is it can't be done, at least not simply. The XYZij struct in the Tango API does not work completely yet. There is no "ij" data. Your retrieval of buffer will work as you have it coded. The contents are a set of X, Y, Z values for measured depth points, roughly 10000+ each callback. Each X, Y, and Z value is of type float, so not CV_8UC1. The problem is that the points are not ordered in any way, so they do not correspond to an "image" or xy raster. They are a random list of depth points. There are ways to get them into some xy order, but it is not straightforward. I have done both of these:
render them to an image, with the depth encoded as color, and pull out the image as pixels
use the model/view/perspective from OpenGL and multiply out the locations of each point and then figure out their screen space location (like OpenGL would during rendering). Sort the points by their xy screen space. Instead of the calculated screen-space depth just keep the Z value from the original buffer.
or
wait until (if) the XYZij struct is fixed so that it returns ij values.
I too wish to use Tango for object avoidance for robotics. I've had some success by simplifying the use case to be only interested in the distance of any object located at the center view of the Tango device.
In Java:
private Double centerCoordinateMax = 0.020;
private TangoXyzIjData xyzIjData;
final FloatBuffer xyz = xyzIjData.xyz;
double cumulativeZ = 0.0;
int numberOfPoints = 0;
for (int i = 0; i < xyzIjData.xyzCount; i += 3) {
float x = xyz.get(i);
float y = xyz.get(i + 1);
if (Math.abs(x) < centerCoordinateMax &&
Math.abs(y) < centerCoordinateMax) {
float z = xyz.get(i + 2);
cumulativeZ += z;
numberOfPoints++;
}
}
Double distanceInMeters;
if (numberOfPoints > 0) {
distanceInMeters = cumulativeZ / numberOfPoints;
} else {
distanceInMeters = null;
}
Said simply, this code takes the average distance over a small square located at the origin of the x and y axes.
centerCoordinateMax = 0.020 was determined to work based on observation and testing. The square typically contains 50 points in ideal conditions and fewer when held close to the floor.
I've tested this using version 2 of my tango-caminada application and the depth measuring seems quite accurate. Standing 1/2 meter from a doorway, I slid towards the open door and the distance changed from 0.5 meters to 2.5 meters, which is the wall at the end of the hallway.
Simulating a robot being navigated I moved the device towards a trash can in the path until 0.5 meters separation and then rotated left until the distance was more than 0.5 meters and proceeded forward. An oversimplified simulation, but the basis for object avoidance using Tango depth perception.
You can do this by using the camera intrinsics to convert XY coordinates to normalized values; see this post: Google Tango: Aligning Depth and Color Frames. It's talking about texture coordinates, but it's exactly the same problem.
Once normalized, move to screen space (1280x720), and the Z coordinate can then be used to generate a pixel value for OpenCV to chew on. You'll need to decide on your own how to color pixels that don't correspond to depth points, and advisedly so, before you use the depth information to further colorize pixels.
The main thing is to remember that the raw coordinates returned are already using the basis vectors you want, i.e. you do not want the pose attitude or location
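A rough sketch of that projection in Java (fx, fy, cx, cy come from the depth camera's TangoCameraIntrinsics, depthImage is assumed to be a pre-allocated 720x1280 CV_8UC1 Mat, and the 0-4 m grey-scale mapping is just an illustrative choice):
import java.nio.FloatBuffer;
import org.opencv.core.Mat;

// the buffer holds xyzCount points, three floats (x, y, z in metres) per point
void projectDepthPoints(FloatBuffer xyz, int xyzCount, Mat depthImage,
                        double fx, double fy, double cx, double cy) {
    for (int i = 0; i < xyzCount * 3; i += 3) {
        float x = xyz.get(i);
        float y = xyz.get(i + 1);
        float z = xyz.get(i + 2);
        if (z <= 0) continue;
        // pinhole projection into the 1280x720 image plane
        int u = (int) (fx * (x / z) + cx);
        int v = (int) (fy * (y / z) + cy);
        if (u < 0 || u >= 1280 || v < 0 || v >= 720) continue;
        // map roughly 0..4 m to 0..255 so OpenCV can treat it as an 8-bit depth image
        byte depth = (byte) Math.min(255, (int) (z / 4.0 * 255.0));
        depthImage.put(v, u, new byte[]{ depth });
    }
}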

How can I prevent my object detection program from detecting multiple objects of different sizes?

So, here is my situation. I have created an object detection program which is based on color object detection. My program detects the color red and it works perfectly. But here are the problems I am facing:
Whenever there is more than one red object in the surroundings, my program detects them all and cannot really track one object at a time (i.e. it tracks other red objects of various sizes in the background). It shows me the error "too much noise in the background". As you can see in the "threshold image" attached, it detects the round object (which is my tracking object) and my cap, which is red in color. I want my program to detect only my tracking object (a round-shaped coke cap). How can I achieve that? Please help me out. I have my engineering design contest in a few days and I have to demo my program in front of my lecturers. My program should only be able to detect and track the object which I want. Thanks.
My code for the object detection program is a little long, so I am explaining it as follows: I captured a frame from the webcam, converted it to HSV, used an HSV inRange filter to filter out all colors but red, and applied morphological operations on the filtered image. This all goes in my main function.
I am using a frame resolution of 1280x720 for my webcam frame. It kind of slows down my program, but it was a trade-off I had to make for performing gesture-controlled operations. Anyway, here are my drawObject and trackFilteredObject functions.
int H_MIN = 0;
int H_MAX = 256;
int S_MIN = 0;
int S_MAX = 256;
int V_MIN = 0;
int V_MAX = 256;
//default capture width and height
const int FRAME_WIDTH = 1280;
const int FRAME_HEIGHT = 720;
//max number of objects to be detected in frame
const int MAX_NUM_OBJECTS=50;
//minimum and maximum object area
const int MIN_OBJECT_AREA = 20*20;
const int MAX_OBJECT_AREA = FRAME_HEIGHT*FRAME_WIDTH/1.5;
void drawObject(int x, int y,Mat &frame){
circle(frame,Point(x,y),20,Scalar(0,255,0),2);
if(y-25>0)
line(frame,Point(x,y),Point(x,y-25),Scalar(0,255,0),2);
else line(frame,Point(x,y),Point(x,0),Scalar(0,255,0),2);
if(y+25<FRAME_HEIGHT)
line(frame,Point(x,y),Point(x,y+25),Scalar(0,255,0),2);
else line(frame,Point(x,y),Point(x,FRAME_HEIGHT),Scalar(0,255,0),2);
if(x-25>0)
line(frame,Point(x,y),Point(x-25,y),Scalar(0,255,0),2);
else line(frame,Point(x,y),Point(0,y),Scalar(0,255,0),2);
if(x+25<FRAME_WIDTH)
line(frame,Point(x,y),Point(x+25,y),Scalar(0,255,0),2);
else line(frame,Point(x,y),Point(FRAME_WIDTH,y),Scalar(0,255,0),2);
putText(frame,intToString(x)+","+intToString(y),Point(x,y+30),1,1,Scalar(0,255,0),2);
}
void trackFilteredObject(int &x, int &y, Mat threshold, Mat &cameraFeed){
Mat temp;
threshold.copyTo(temp);
//these two vectors needed for output of findContours
vector< vector<Point> > contours;
vector<Vec4i> hierarchy;
//find contours of filtered image using openCV findContours function
findContours(temp,contours,hierarchy,CV_RETR_CCOMP,CV_CHAIN_APPROX_SIMPLE );
//use moments method to find our filtered object
double refArea = 0;
bool objectFound = false;
if (hierarchy.size() > 0) {
int numObjects = hierarchy.size();
//if number of objects greater than MAX_NUM_OBJECTS we have a noisy filter
if(numObjects<MAX_NUM_OBJECTS){
for (int index = 0; index >= 0; index = hierarchy[index][0]) {
Moments moment = moments((cv::Mat)contours[index]);
double area = moment.m00;
//if the area is less than 20 px by 20px then it is probably just noise
//if the area is the same as the 3/2 of the image size, probably just a bad filter
//we only want the object with the largest area so we save a reference area each
//iteration and compare it to the area in the next iteration.
if(area>MIN_OBJECT_AREA && area<MAX_OBJECT_AREA && area>refArea){
x = moment.m10/area;
y = moment.m01/area;
objectFound = true;
refArea = area;
}else objectFound = false;
}
//let user know you found an object
if(objectFound ==true){
putText(cameraFeed,"Tracking Object",Point(0,50),2,1,Scalar(0,255,0),2);
//draw object location on screen
drawObject(x,y,cameraFeed);}
}else putText(cameraFeed,"TOO MUCH NOISE! ADJUST FILTER",Point(0,50),1,2,Scalar(0,0,255),2);
}
}
Here is the link to the image; as you can see, it also detects the red hat in the background along with the red cap of the coke bottle.
My observations: here is what I think I need to do to achieve my goal of not detecting red objects of unknown sizes. I think I have to edit the value of the maximum object area, which I declared in the above program as const int MAX_OBJECT_AREA = FRAME_HEIGHT*FRAME_WIDTH/1.5;. Changing that value might eliminate the detection of bigger continuous red regions. But there is another problem: some objects are not completely red and have patches of red and other colors, so if a detected area is within the range specified in my program, my program detects those red patches too. For example, I was wearing a t-shirt with mixed colors, and when I tested my program wearing it, it was able to pick out the red color among the other colors. Now, how do I solve this issue?
I think you can try out the following procedure:
obtain a circular kernel having roughly the same area as your object of interest. You can do it like: Mat kernel = getStructuringElement(MORPH_ELLIPSE, Size(d, d));
where d is the diameter of the disk.
perform normalized-cross-correlation or convolution of the filtered regions image with this kernel (I think normalized-cross-correlation would be better, and add an empty border around the kernel).
the peak of the resulting image should give you the location of the circular region in your filtered image (if you are using normalized-cross-correlation, you'll have to add the shift).
To speed things up, you can perform this at a reduced resolution.
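A hedged sketch of that procedure (shown with OpenCV's Java API to match the rest of this page; the C++ calls have the same names, threshold is assumed to be your binary filtered image, and d is the disk diameter from the step above):
import org.opencv.core.*;
import org.opencv.imgproc.Imgproc;

// disk-shaped kernel roughly the size of the cap
Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, new Size(d, d));
kernel.convertTo(kernel, CvType.CV_32F);

// normalized cross-correlation of the binary image with the disk
Mat thresholdF = new Mat();
threshold.convertTo(thresholdF, CvType.CV_32F);
Mat response = new Mat();
Imgproc.matchTemplate(thresholdF, kernel, response, Imgproc.TM_CCORR_NORMED);

// the strongest response marks the most cap-like circular blob;
// shift by half the kernel size to get its centre in image coordinates
Core.MinMaxLocResult mm = Core.minMaxLoc(response);
Point capCenter = new Point(mm.maxLoc.x + d / 2.0, mm.maxLoc.y + d / 2.0);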
You can filter out non-circular shapes by detecting circles in your thresholded image. OpenCV provides a built-in method to detect circles using the Hough transform, more info here. You can take advantage of this function to retain only circles that have a radius in a given range.
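A minimal sketch of that idea (again with the Java API; cameraFeed is assumed to be the BGR camera Mat, and the blur size, Hough parameters, and 10-40 px radius range are starting values to tune):
import org.opencv.core.*;
import org.opencv.imgproc.Imgproc;

Mat gray = new Mat();
Imgproc.cvtColor(cameraFeed, gray, Imgproc.COLOR_BGR2GRAY);
Imgproc.GaussianBlur(gray, gray, new Size(9, 9), 2, 2);

Mat circles = new Mat();
// dp = 1, minDist = rows/8, Canny threshold = 100, accumulator threshold = 20,
// radius limited to the expected cap size (assumed 10..40 px here);
// on OpenCV 2.4 the constant is Imgproc.CV_HOUGH_GRADIENT
Imgproc.HoughCircles(gray, circles, Imgproc.HOUGH_GRADIENT,
    1, gray.rows() / 8.0, 100, 20, 10, 40);

for (int i = 0; i < circles.cols(); i++) {
  double[] c = circles.get(0, i); // { centre x, centre y, radius }
  System.out.println("circle at " + c[0] + "," + c[1] + " radius " + c[2]);
}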
Another possibility is to implement connected component labeling (CCL) into your demo program.
I believe that it was removed at some point in versions 2.x of OpenCV, but a basic implementation of the two-pass version is straightforward from the Wikipedia page.
CCL will assign a unique ID for each object after thresholding. You then have to implement matching between the objects at frame (T-1) and objects in frame (T) (for example based on some nearest distance criterion) and possibly trajectory filtering or smoothing, but this would definitely give you some extra-points.
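For the matching step between frames, a minimal nearest-distance sketch (plain Java; the centroid lists and the distance limit are illustrative assumptions):
import org.opencv.core.Point;
import java.util.List;

// returns the index in 'previous' of the closest centroid to 'current',
// or -1 if nothing is within maxDistance pixels
int matchToPrevious(Point current, List<Point> previous, double maxDistance) {
    int best = -1;
    double bestDist = maxDistance;
    for (int i = 0; i < previous.size(); i++) {
        double dist = Math.hypot(current.x - previous.get(i).x,
                                 current.y - previous.get(i).y);
        if (dist < bestDist) {
            bestDist = dist;
            best = i;
        }
    }
    return best;
}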

How to stop a for loop (OpenCV)

I am using Processing (processing.org) for a project that requires face tracking. The problem now is that the program is going to run out of memory because of a for loop. I want to stop the loop or at least solve the problem of running out of memory. This is the code.
import hypermedia.video.*;
import java.awt.Rectangle;
OpenCV opencv;
// contrast/brightness values
int contrast_value = 0;
int brightness_value = 0;
void setup() {
size( 900, 600 );
opencv = new OpenCV( this );
opencv.capture( width, height ); // open video stream
opencv.cascade( OpenCV.CASCADE_FRONTALFACE_ALT ); // load detection description, here-> front face detection : "haarcascade_frontalface_alt.xml"
// print usage
println( "Drag mouse on X-axis inside this sketch window to change contrast" );
println( "Drag mouse on Y-axis inside this sketch window to change brightness" );
}
public void stop() {
opencv.stop();
super.stop();
}
void draw() {
// grab a new frame
// and convert to gray
opencv.read();
opencv.convert( GRAY );
opencv.contrast( contrast_value );
opencv.brightness( brightness_value );
// proceed detection
Rectangle[] faces = opencv.detect( 1.2, 2, OpenCV.HAAR_DO_CANNY_PRUNING, 40, 40 );
// display the image
image( opencv.image(), 0, 0 );
// draw face area(s)
noFill();
stroke(255,0,0);
for( int i=0; i<faces.length; i++ ) {
rect( faces[i].x, faces[i].y, faces[i].width, faces[i].height );
}
}
void mouseDragged() {
contrast_value = (int) map( mouseX, 0, width, -128, 128 );
brightness_value = (int) map( mouseY, 0, height, -128, 128 );
}
Thank you!
A few points...
1. As George mentioned in the comments, you can reduce the size of the capture area, which will drastically reduce the amount of RAM your sketch needs for the face tracking. Try making two global variables called captureWidth and captureHeight and setting them to 320 and 240, which is totally sufficient for this.
2. You can increase the amount of memory that your sketch uses by default in the Java Virtual Machine. Processing defaults to 128 MB, I think, but if you go to the Preferences you will see a checkbox to "Increase maximum available memory to [x]"... I usually make mine 1500 MB, but it depends on what your machine can handle. Don't try to make it bigger than 1800 MB unless you are on a 64-bit machine and are using Processing 2.0 in 64-bit mode.
3. To actually break the loop, use the 'break' command http://processing.org/reference/break.html ... but please understand why you want to use that first, as it will simply jump you out of your loop.
4. If you only want to show a certain number of faces, you can check the loop index against that limit (and break out once you reach it), which might help.
But I think the loop itself isn't the culprit here, it's more likely the memory footprint. Start with suggestions 1 & 2 and report back...
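For suggestion 1, the change is just capturing at a smaller size than the sketch window (a sketch, using the 320x240 values mentioned above):
int captureWidth = 320;
int captureHeight = 240;

void setup() {
  size(900, 600);
  opencv = new OpenCV(this);
  // analyse a smaller frame; this is what drives the memory use
  opencv.capture(captureWidth, captureHeight);
  opencv.cascade(OpenCV.CASCADE_FRONTALFACE_ALT);
}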
