Here is my code so far. I can now detect the face and the mouth together, and roughly measure the height of the mouth's bounding box.
The problem is that the mouth detector seems to flag everything it considers a mouth, even when it isn't one.
I want to use the face bounding box as the mouth detection region to reduce these errors. Would nested for loops work, i.e. putting the mouth loop inside the face loop? I'm fairly new to coding, so any help would be appreciated.
import gab.opencv.*;
import java.awt.Rectangle;
import processing.video.*;
Capture video;
OpenCV f;
OpenCV m;
void setup() {
size(800, 600);
video = new Capture(this, 800/2, 600/2);
f = new OpenCV(this, 800/2, 600/2);
m = new OpenCV(this, 800/2, 600/2);
video.start();
}
void draw() {
scale(2);
f.loadImage(video);
m.loadImage(video);
f.loadCascade(OpenCV.CASCADE_FRONTALFACE);
m.loadCascade(OpenCV.CASCADE_MOUTH);
image(video, 0, 0 );
noFill();
stroke(0, 255, 0);
strokeWeight(3);
Rectangle[] mouth = m.detect();
Rectangle[] face = f.detect();
println(mouth.length);
strokeWeight(3);
for (int i = 0; i < face.length; i++) {
println(face[i].x + "," + face[i].y);
rect(face[i].x, face[i].y, face[i].width, face[i].height);
}
for (int i = 0; i < mouth.length; i++) {
println(mouth[i].x + "," + mouth[i].y);
rect(mouth[i].x, mouth[i].y, mouth[i].width, mouth[i].height);
}
for (int i = 0; i < mouth.length; i++) {
fill(255, 0, 0);
noStroke();
ellipse((mouth[i].x)+(mouth[i].width/2), mouth[i].y, 5, 5);
ellipse((mouth[i].x)+(mouth[i].width/2), (mouth[i].y)+ (mouth[i].height), 5, 5);
}
for (int i = 0; i < mouth.length; i++) {
int px = (mouth[i].x)+(mouth[i].width/2);
int py = (mouth[i].y)+(mouth[i].height);
int mOpen = int (dist(px, mouth[i].y, px, py));
println(mOpen);
}
}
void captureEvent(Capture d) {
d.read();
}
There are a couple of issues:
You shouldn't load OpenCV cascades multiple times a second in draw(). Load them once in setup() and only call detect() in draw().
OpenCV for Processing seems to override the cascade loaded in the second instance with the cascade loaded in the first instance.
If accuracy isn't a huge issue, you can get away with a single cascade: the mouth one. Note that the detect() function accepts options/hints that may help the detection. For example, you can tell the detector to return only the largest object, give it the smallest and largest bounding boxes a mouth would have in your setup, and set how strictly the results should be filtered.
Here's a code sample for the above:
import gab.opencv.*;
import java.awt.Rectangle;
import org.opencv.objdetect.Objdetect;
import processing.video.*;
Capture video;
OpenCV opencv;
//cascade detections parameters - explanations from Mastering OpenCV with Practical Computer Vision Projects
int flags = Objdetect.CASCADE_FIND_BIGGEST_OBJECT;
// Smallest and largest object size (in pixels) to search for.
int minFeatureSize = 20;
int maxFeatureSize = 80;
// How detailed should the search be. Must be larger than 1.0.
float searchScaleFactor = 1.1f;
// How much the detections should be filtered out. This should depend on how bad false detections are to your system.
// minNeighbors=2 means lots of good+bad detections, and minNeighbors=6 means good detections are given but some are missed.
int minNeighbors = 6;
void setup() {
size(320, 240);
noFill();
stroke(0, 192, 0);
strokeWeight(3);
video = new Capture(this,width,height);
video.start();
opencv = new OpenCV(this,320,240);
opencv.loadCascade(OpenCV.CASCADE_MOUTH);
}
void draw() {
//feed cam image to OpenCV, it turns it to grayscale
opencv.loadImage(video);
opencv.equalizeHistogram();
image(opencv.getOutput(), 0, 0 );
Rectangle[] mouths = opencv.detect(searchScaleFactor,minNeighbors,flags,minFeatureSize, maxFeatureSize);
for (int i = 0; i < mouths.length; i++) {
text(mouths[i].x + "," + mouths[i].y + "," + mouths[i].width + "," + mouths[i].height,mouths[i].x, mouths[i].y);
rect(mouths[i].x, mouths[i].y, mouths[i].width, mouths[i].height);
}
}
void captureEvent(Capture c) {
c.read();
}
Note that facial hair can cause false positives.
I have provided more in-depth notes in an answer to your previous related question. I recommend focusing on the FaceOSC part, as it will be more accurate.
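On the original idea of using the face box as the mouth search region: one possible approach is to keep only the mouth rectangles that fall inside the lower half of a detected face rectangle. The following is a minimal sketch under that assumption. The class name MouthInFace and its two methods are hypothetical helpers, not part of OpenCV for Processing, and "lower half" is only a rough guess at where a mouth sits.

```java
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.List;

/** Keeps only mouth candidates that lie in the lower half of a detected face. */
final class MouthInFace {

    /** Lower half of a face box: where a mouth is plausibly located (an assumption). */
    static Rectangle lowerHalf(Rectangle face) {
        int halfHeight = face.height / 2;
        return new Rectangle(face.x, face.y + halfHeight, face.width, face.height - halfHeight);
    }

    /** Returns the mouth rectangles fully contained in the lower half of any face. */
    static List<Rectangle> filter(Rectangle[] faces, Rectangle[] mouths) {
        List<Rectangle> kept = new ArrayList<>();
        for (Rectangle mouth : mouths) {
            for (Rectangle face : faces) {
                if (lowerHalf(face).contains(mouth)) {
                    kept.add(mouth);
                    break; // one matching face is enough
                }
            }
        }
        return kept;
    }
}
```

In a sketch, this would be called after both detect() calls, drawing only the rectangles returned by MouthInFace.filter(face, mouth). Both detectors have to work on the same frame size for the coordinates to be comparable. In Processing the class could probably live in a separate .java tab.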
I'm working on a project that keeps your face in the center of the screen by mounting a camera on top of a servo. I started from the simple servo control tutorial on the Arduino Playground website, which uses the mouse to control the servo, and tried to rewrite it to use the x-coordinate of your face to move the servo in the desired direction.
simple servo control arduino playground
So far I have it working with the built-in camera: the servo moves in the right direction with my face. But as soon as I use the external USB camera mounted on top of the servo instead, I don't get the desired result. As soon as the camera detects a face, it turns straight in the opposite direction. If it detects a face on the left side of the screen, the servo turns to the right until the face is out of the frame.
I hope someone can help explain why it works with the built-in camera but not with the USB camera attached to the servo.
I'm using Arduino, Processing, and the OpenCV library for Processing.
This is the code that I have so far:
Arduino code:
#include <Servo.h>
Servo servo1; Servo servo2;
void setup() {
servo1.attach(4);
servo2.attach(10);
Serial.begin(19200);
Serial.println("Ready");
}
void loop() {
static int v = 0;
if ( Serial.available()) {
char ch = Serial.read();
switch(ch) {
case '0'...'9':
v = v * 10 + ch - '0';
/*
so if the chars sent are 45s (turn servo1 to 45 degrees)...
v is the value we want to send to the servo and it is currently 0
The first char (ch) is '4', so
v = 0*10 + '4' - '0' = 4;
The second char is '5', so
v = 4*10 + '5' - '0' = 45;
The third char is not a digit (0-9), so it is handled by one of the cases below...
*/
break;
case 's':
servo1.write(v);
v = 0;
break;
case 'w':
servo2.write(v);
v = 0;
break;
case 'd':
servo2.detach();
break;
case 'a':
servo2.attach(10);
break;
}
}
}
My processing code:
import gab.opencv.*;
import processing.video.*;
import java.awt.*;
//----------------
import processing.serial.*;
int gx = 15;
int gy = 35;
//int spos=90;
float midden=90;
float leftColor = 0.0;
float rightColor = 0.0;
Serial port;
//----------------
Capture video;
OpenCV opencv;
void setup() {
size(640, 480);
String[] cameras = Capture.list();
if (cameras.length == 0) {
println("There are no cameras available for capture.");
exit();
} else {
println("Available cameras:");
for (int i = 0; i < cameras.length; i++) {
println(cameras[i]);
}
}
//----------------
colorMode(RGB, 1.0);
noStroke();
frameRate(100);
//println(Serial.list()); // List COM-ports
//select the Arduino's COM port from the list
port = new Serial(this, Serial.list()[5], 19200); //Arduino connected to the left USB port
//----------------
video = new Capture(this, 640/2, 480/2, "USB2.0 Camera"); //external camera on the right USB port
//video = new Capture(this, 640/2, 480/2); //built-in camera
opencv = new OpenCV(this, 640/2, 480/2);
opencv.loadCascade(OpenCV.CASCADE_FRONTALFACE);
video.start();
//-_-_-_-_-_-_-_-_- show the camera image in color
opencv.useColor();
}
void draw() {
//---------------- Mouse Control
background(0.0);
update(mouseX);
fill(mouseX/4);
rect(150, 320, gx*2, gx*2);
fill(180 - (mouseX/4));
rect(450, 320, gy*2, gy*2);
//----------------
scale(2);
opencv.loadImage(video);
//-_-_-_-_-_-_-_-_- Flip camera image
opencv.flip(OpenCV.HORIZONTAL);
image(video, 0, 0 );
//-_-_-_-_-_-_-_-_-
image(opencv.getOutput(), 0, 0 );
noFill();
stroke(0, 255, 0);
strokeWeight(3);
Rectangle[] faces = opencv.detect();
//println(faces.length);
for (int i = 0; i < faces.length; i++) {
println(faces[i].x + "," + faces[i].y);
rect(faces[i].x, faces[i].y, faces[i].width, faces[i].height); //green square around the face
ellipse( faces[i].x + 0.5*faces[i].width, faces[i].y + 0.5*faces[i].height, 5, 5 ); //center point of the face
midden= (faces[i].x + 0.5*faces[i].width);
//midden= (faces[i].x);
}
}
void captureEvent(Capture c) {
c.read();
}
//---------------- servo controls for mouse location and servo rotation
void update(int x)
{
//Calculate servo postion from mouseX
//spos= x/4;
//Output the servo position ( from 0 to 180)
port.write("s"+midden);
println(midden);
// if( midden>80 && midden<150){
// port.write("s"+90);
// } else if(midden<80){
// port.write("s"+45);
// }else{
// port.write("s"+135);
// }
}
//----------------
It sounds like the two images are flipped (mirrored) relative to each other. To test this, try drawing a circle on the left side of both images and then display them (e.g. with imshow) to see whether the circles end up at the same location.
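If the images do turn out to be mirrored, the effect on the servo can be illustrated with a small, hypothetical mapping from the face's x-coordinate to an angle. The class ServoMapping, its method names and the linear 0 to 180 mapping are assumptions made for illustration and are not taken from the question's code. Note that the Processing sketch in the question calls opencv.flip(OpenCV.HORIZONTAL) before detection for every camera, which is one place where such a mismatch could come from.

```java
/** Maps a face centre x-coordinate to a servo angle, optionally un-mirroring it first. */
final class ServoMapping {

    /** Mirrors an x-coordinate inside a frame of the given width. */
    static float mirrorX(float x, int frameWidth) {
        return frameWidth - x;
    }

    /** Linearly maps x in [0, frameWidth] to an integer angle in [0, 180], clamped. */
    static int toServoAngle(float x, int frameWidth, boolean mirrored) {
        float effectiveX = mirrored ? mirrorX(x, frameWidth) : x;
        int angle = Math.round(effectiveX * 180f / frameWidth);
        return Math.max(0, Math.min(180, angle));
    }
}
```

With a 320-pixel-wide frame, a face centred at x = 80 maps to 45 degrees without mirroring and to 135 degrees with it, which is the kind of left/right reversal described in the question. Whether the mirrored flag should be set for a given camera is something to verify with the circle test above.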
I am working on a Processing project, but when I try to record the sketch with the GSVideo library I get this error:
A library used by this sketch is not installed properly.
GSVideo version: 1.0.0
A library relies on native code that's not available.
Or only works properly when the sketch is run as a 64-bit application.
In my project I'm tracking objects in HSV color space with the OpenCV for Processing library, and I want to record the sketch so I can show my work later. This is my code:
/**
* HSVColorTracking
* Greg Borenstein
* https://github.com/atduskgreg/opencv-processing-book/blob/master/code/hsv_color_tracking/HSVColorTracking/HSVColorTracking.pde
*
* Modified by Jordi Tost @jorditost (color selection)
* University of Applied Sciences Potsdam, 2014
*
* Modified by Luz Alejandra Magre
* Universidad Tecnológica de Bolívar, 2015
*/
import gab.opencv.*;
import processing.video.*;
import codeanticode.gsvideo.*;
import java.awt.Rectangle;
GSCapture video;
OpenCV opencv;
GSMovieMaker mm;
int fps = 30;
PImage src, colorFilteredImage;
ArrayList<Contour> contours;
// <1> Set the range of Hue values for our filter
int rangeLow = 20;
int rangeHigh = 35;
void setup() {
frameRate(fps);
String[] cameras = GSCapture.list();
size(2*opencv.width, opencv.height, P2D);
if (cameras.length == 0)
{
println("There are no cameras available for capture.");
exit();
}
else {
println("Available cameras:");
for (int i = 0; i < cameras.length; i++) {
println(cameras[i]);
}
video = new GSCapture(this, 640, 480, cameras[0]);
video.start();
opencv = new OpenCV(this, video.width, video.height);
contours = new ArrayList<Contour>();
mm = new GSMovieMaker(this, width, height, "Test.ogg", GSMovieMaker.THEORA, GSMovieMaker.BEST, fps);
mm.setQueueSize(50, 10);
mm.start();
}
}
void draw() {
// Read last captured frame
if (video.available()) {
video.read();
}
// <2> Load the new frame of our movie in to OpenCV
opencv.loadImage(video);
// Tell OpenCV to use color information
opencv.useColor();
src = opencv.getSnapshot();
// <3> Tell OpenCV to work in HSV color space.
opencv.useColor(HSB);
// <4> Copy the Hue channel of our image into
// the gray channel, which we process.
opencv.setGray(opencv.getH().clone());
// <5> Filter the image based on the range of
// hue values that match the object we want to track.
opencv.inRange(rangeLow, rangeHigh);
// <6> Get the processed image for reference.
colorFilteredImage = opencv.getSnapshot();
///////////////////////////////////////////
// We could process our image here!
// See ImageFiltering.pde
///////////////////////////////////////////
// <7> Find contours in our range image.
// Passing 'true' sorts them by descending area.
contours = opencv.findContours(true, true);
// <8> Display background images
image(src, 0, 0);
image(colorFilteredImage, src.width, 0);
// <9> Check to make sure we've found any contours
if (contours.size() > 0) {
// <9> Get the first contour, which will be the largest one
Contour biggestContour = contours.get(0);
// <10> Find the bounding box of the largest contour,
// and hence our object.
Rectangle r = biggestContour.getBoundingBox();
// <11> Draw the bounding box of our object
noFill();
strokeWeight(2);
stroke(255, 0, 0);
rect(r.x, r.y, r.width, r.height);
// <12> Draw a dot in the middle of the bounding box, on the object.
noStroke();
fill(255, 0, 0);
ellipse(r.x + r.width/2, r.y + r.height/2, 30, 30);
text(r.x + r.width/2, 50, 50);
text(r.y + r.height/2, 50, 80);
}
loadPixels();
mm.addFrame(pixels);
saveFrame("frame-######.png");
}
void mousePressed() {
color c = get(mouseX, mouseY);
println("r: " + red(c) + " g: " + green(c) + " b: " + blue(c));
int hue = int(map(hue(c), 0, 255, 0, 180));
println("hue to detect: " + hue);
rangeLow = hue - 5;
rangeHigh = hue + 5;
}
void keyPressed() {
if (key == ' ') {
// Finish the movie if space bar is pressed
mm.finish();
// Quit running the sketch once the file is written
exit();
}
}
I would really appreciate the help on this.
Hi, I am currently working on an OCR app in which I can already capture the card image using the AVFoundation framework.
As the next step, I need to find the edges of the card so that I can crop it out of the captured image and later send it to the OCR engine for processing.
The main problem now is finding the edges of the card. I am using the code below (taken from another open source project), which uses OpenCV for this purpose. It works fine if the card is a purely rectangular card or sheet of paper, but with a card that has rounded corners (e.g. a driving license) detection fails. I also don't have much expertise in OpenCV. Can anyone help me solve this issue?
- (void)detectEdges
{
cv::Mat original = [MAOpenCV cvMatFromUIImage:_adjustedImage];
CGSize targetSize = _sourceImageView.contentSize;
cv::resize(original, original, cvSize(targetSize.width, targetSize.height));
cv::vector<cv::vector<cv::Point>>squares;
cv::vector<cv::Point> largest_square;
find_squares(original, squares);
find_largest_square(squares, largest_square);
if (largest_square.size() == 4)
{
// Manually sorting points, needs major improvement. Sorry.
NSMutableArray *points = [NSMutableArray array];
NSMutableDictionary *sortedPoints = [NSMutableDictionary dictionary];
for (int i = 0; i < 4; i++)
{
NSDictionary *dict = [NSDictionary dictionaryWithObjectsAndKeys:[NSValue valueWithCGPoint:CGPointMake(largest_square[i].x, largest_square[i].y)], @"point" , [NSNumber numberWithInt:(largest_square[i].x + largest_square[i].y)], @"value", nil];
[points addObject:dict];
}
int min = [[points valueForKeyPath:@"@min.value"] intValue];
int max = [[points valueForKeyPath:@"@max.value"] intValue];
int minIndex;
int maxIndex;
int missingIndexOne;
int missingIndexTwo;
for (int i = 0; i < 4; i++)
{
NSDictionary *dict = [points objectAtIndex:i];
if ([[dict objectForKey:@"value"] intValue] == min)
{
[sortedPoints setObject:[dict objectForKey:@"point"] forKey:@"0"];
minIndex = i;
continue;
}
if ([[dict objectForKey:@"value"] intValue] == max)
{
[sortedPoints setObject:[dict objectForKey:@"point"] forKey:@"2"];
maxIndex = i;
continue;
}
NSLog(@"MSSSING %i", i);
missingIndexOne = i;
}
for (int i = 0; i < 4; i++)
{
if (missingIndexOne != i && minIndex != i && maxIndex != i)
{
missingIndexTwo = i;
}
}
if (largest_square[missingIndexOne].x < largest_square[missingIndexTwo].x)
{
//2nd Point Found
[sortedPoints setObject:[[points objectAtIndex:missingIndexOne] objectForKey:@"point"] forKey:@"3"];
[sortedPoints setObject:[[points objectAtIndex:missingIndexTwo] objectForKey:@"point"] forKey:@"1"];
}
else
{
//4rd Point Found
[sortedPoints setObject:[[points objectAtIndex:missingIndexOne] objectForKey:@"point"] forKey:@"1"];
[sortedPoints setObject:[[points objectAtIndex:missingIndexTwo] objectForKey:@"point"] forKey:@"3"];
}
[_adjustRect topLeftCornerToCGPoint:[(NSValue *)[sortedPoints objectForKey:@"0"] CGPointValue]];
[_adjustRect topRightCornerToCGPoint:[(NSValue *)[sortedPoints objectForKey:@"1"] CGPointValue]];
[_adjustRect bottomRightCornerToCGPoint:[(NSValue *)[sortedPoints objectForKey:@"2"] CGPointValue]];
[_adjustRect bottomLeftCornerToCGPoint:[(NSValue *)[sortedPoints objectForKey:@"3"] CGPointValue]];
}
original.release();
}
This naive implementation is based on some of the techniques demonstrated in squares.cpp, available in the OpenCV sample directory. The following posts also discuss similar applications:
OpenCV C++/Obj-C: Detecting a sheet of paper / Square Detection
Square detection doesn't find squares
Find corner of papers
@John, the code below has been tested with the sample image you provided and another one I created:
The processing pipeline starts with findSquares(), a simplification of the same function implemented by OpenCV's squares.cpp demo. This function converts the input image to grayscale and applies a blur to improve the detection of the edges (Canny):
The edge detection is good, but a morphological operation (dilation) is needed to join nearby lines:
After that we try to find the contours (edges) and assemble squares out of them. If we tried to draw all the detected squares on the input images, this would be the result:
It looks good, but it's not exactly what we are looking for since there are too many detected squares. However, the largest square is actually the card, so from here on it's pretty simple and we just figure out which of the squares is the largest. That's exactly what findLargestSquare() does.
Once we know the largest square, we simply paint red dots at the corners of the square for debugging purposes:
As you can see, the detection is not perfect but it seems good enough for most uses. This is not a robust solution and I only wanted to share one approach to solve the problem. I'm sure that there are other ways to deal with this that might be more interesting to you. Good luck!
#include <iostream>
#include <cmath>
#include <vector>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/imgproc/imgproc_c.h>
/* angle: finds a cosine of angle between vectors, from pt0->pt1 and from pt0->pt2
*/
double angle(cv::Point pt1, cv::Point pt2, cv::Point pt0)
{
double dx1 = pt1.x - pt0.x;
double dy1 = pt1.y - pt0.y;
double dx2 = pt2.x - pt0.x;
double dy2 = pt2.y - pt0.y;
return (dx1*dx2 + dy1*dy2)/sqrt((dx1*dx1 + dy1*dy1)*(dx2*dx2 + dy2*dy2) + 1e-10);
}
/* findSquares: returns sequence of squares detected on the image
*/
void findSquares(const cv::Mat& src, std::vector<std::vector<cv::Point> >& squares)
{
cv::Mat src_gray;
cv::cvtColor(src, src_gray, cv::COLOR_BGR2GRAY);
// Blur helps to decrease the amount of detected edges
cv::Mat filtered;
cv::blur(src_gray, filtered, cv::Size(3, 3));
cv::imwrite("out_blur.jpg", filtered);
// Detect edges
cv::Mat edges;
int thresh = 128;
cv::Canny(filtered, edges, thresh, thresh*2, 3);
cv::imwrite("out_edges.jpg", edges);
// Dilate helps to connect nearby line segments
cv::Mat dilated_edges;
cv::dilate(edges, dilated_edges, cv::Mat(), cv::Point(-1, -1), 2, 1, 1); // default 3x3 kernel
cv::imwrite("out_dilated.jpg", dilated_edges);
// Find contours and store them in a list
std::vector<std::vector<cv::Point> > contours;
cv::findContours(dilated_edges, contours, cv::RETR_LIST, cv::CHAIN_APPROX_SIMPLE);
// Test contours and assemble squares out of them
std::vector<cv::Point> approx;
for (size_t i = 0; i < contours.size(); i++)
{
// approximate contour with accuracy proportional to the contour perimeter
cv::approxPolyDP(cv::Mat(contours[i]), approx, cv::arcLength(cv::Mat(contours[i]), true)*0.02, true);
// Note: absolute value of an area is used because
// area may be positive or negative - in accordance with the
// contour orientation
if (approx.size() == 4 && std::fabs(contourArea(cv::Mat(approx))) > 1000 &&
cv::isContourConvex(cv::Mat(approx)))
{
double maxCosine = 0;
for (int j = 2; j < 5; j++)
{
double cosine = std::fabs(angle(approx[j%4], approx[j-2], approx[j-1]));
maxCosine = MAX(maxCosine, cosine);
}
if (maxCosine < 0.3)
squares.push_back(approx);
}
}
}
/* findLargestSquare: find the largest square within a set of squares
*/
void findLargestSquare(const std::vector<std::vector<cv::Point> >& squares,
std::vector<cv::Point>& biggest_square)
{
if (!squares.size())
{
std::cout << "findLargestSquare !!! No squares detected, nothing to do." << std::endl;
return;
}
int max_width = 0;
int max_height = 0;
int max_square_idx = 0;
for (size_t i = 0; i < squares.size(); i++)
{
// Convert a set of 4 unordered Points into a meaningful cv::Rect structure.
cv::Rect rectangle = cv::boundingRect(cv::Mat(squares[i]));
//std::cout << "find_largest_square: #" << i << " rectangle x:" << rectangle.x << " y:" << rectangle.y << " " << rectangle.width << "x" << rectangle.height << endl;
// Store the index position of the biggest square found
if ((rectangle.width >= max_width) && (rectangle.height >= max_height))
{
max_width = rectangle.width;
max_height = rectangle.height;
max_square_idx = i;
}
}
biggest_square = squares[max_square_idx];
}
int main()
{
cv::Mat src = cv::imread("cc.png");
if (src.empty())
{
std::cout << "!!! Failed to open image" << std::endl;
return -1;
}
std::vector<std::vector<cv::Point> > squares;
findSquares(src, squares);
// Draw all detected squares
cv::Mat src_squares = src.clone();
for (size_t i = 0; i < squares.size(); i++)
{
const cv::Point* p = &squares[i][0];
int n = (int)squares[i].size();
cv::polylines(src_squares, &p, &n, 1, true, cv::Scalar(0, 255, 0), 2, CV_AA);
}
cv::imwrite("out_squares.jpg", src_squares);
cv::imshow("Squares", src_squares);
std::vector<cv::Point> largest_square;
findLargestSquare(squares, largest_square);
// Draw circles at the corners
for (size_t i = 0; i < largest_square.size(); i++ )
cv::circle(src, largest_square[i], 4, cv::Scalar(0, 0, 255), cv::FILLED);
cv::imwrite("out_corners.jpg", src);
cv::imshow("Corners", src);
cv::waitKey(0);
return 0;
}
Instead of "pure" rectangular blobs, try to go for nearly rectangular ones.
1- Gaussian blur
2- Grayscale and Canny edge detection
3- Extract all blobs (contours) in your image and filter out the small ones. You will use the findContours and contourArea functions for that purpose.
4- Using moments, filter out the non-rectangular ones. First you need to check the moments of rectangle-like objects. You can measure them yourself or look them up. Then list those moments, find what the objects have in common, and build your filter accordingly.
Ex: Say that after testing you find that the central moment m30 is similar for rectangle-like objects -> filter out objects whose m30 deviates from it.
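As a rough illustration of steps 3 and 4, here is a sketch that swaps the moment-based test for a simpler, different measure: the "extent", i.e. the ratio of the contour's area to the area of its axis-aligned bounding box. This is not the moments approach described above, only a stand-in that is easy to show without OpenCV. The class Rectangularity and the thresholds are hypothetical, and the measure only makes sense when the card is roughly axis-aligned in the image.

```java
import java.awt.Point;
import java.awt.Rectangle;

/** Scores how closely a closed contour fills its axis-aligned bounding box. */
final class Rectangularity {

    /** Absolute polygon area via the shoelace formula. */
    static double polygonArea(Point[] contour) {
        double twiceArea = 0;
        for (int i = 0; i < contour.length; i++) {
            Point a = contour[i];
            Point b = contour[(i + 1) % contour.length];
            twiceArea += (double) a.x * b.y - (double) b.x * a.y;
        }
        return Math.abs(twiceArea) / 2.0;
    }

    /** Axis-aligned bounding box of the contour points (empty for no points). */
    static Rectangle boundingBox(Point[] contour) {
        if (contour.length == 0) {
            return new Rectangle();
        }
        int minX = Integer.MAX_VALUE, minY = Integer.MAX_VALUE;
        int maxX = Integer.MIN_VALUE, maxY = Integer.MIN_VALUE;
        for (Point p : contour) {
            minX = Math.min(minX, p.x);
            minY = Math.min(minY, p.y);
            maxX = Math.max(maxX, p.x);
            maxY = Math.max(maxY, p.y);
        }
        return new Rectangle(minX, minY, maxX - minX, maxY - minY);
    }

    /** Contour area divided by bounding-box area; close to 1.0 for axis-aligned rectangles. */
    static double extent(Point[] contour) {
        Rectangle box = boundingBox(contour);
        double boxArea = (double) box.width * box.height;
        return boxArea == 0 ? 0 : polygonArea(contour) / boxArea;
    }

    /** Step 3 (drop small blobs) and a simplified step 4 (drop non-rectangular blobs). */
    static boolean isNearlyRectangular(Point[] contour, double minArea, double minExtent) {
        return polygonArea(contour) >= minArea && extent(contour) >= minExtent;
    }
}
```

A card with clipped corners still fills most of its bounding box, so it passes a threshold such as 0.9, while a triangle of similar size fills only about half and is rejected. The thresholds would need tuning on real images.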
I know maybe it's too late for this post, but I am posting this in case it might help someone else.
The iOS Core Image framework already has a good tool to detect features such as rectangles (since iOS 5), faces, QR codes and even regions containing text in a still image. If you check out the CIDetector class you'll find what you need. I am using it for an OCR app too, it's super easy and very reliable compared to what you can do with OpenCV (I am not good with OpenCV, but the CIDetector gives much better results with 3-5 lines of code).
I don't know if it is an option, but you could have the user define the edges rather than trying to find them programmatically.
I am new to face matching and am trying to learn how to use an SVM with HOG descriptors.
I wrote a simple face recognizer with an SVM, but when I run it, the code always returns 1.
float *getHOG(const cv::Mat &image, int* count)//Compute HOG
{
cv::HOGDescriptor hog;
std::vector<float> res;
cv::Mat img2;
cv::resize(image, img2, cv::Size(64, 128));
hog.compute(img2, res, cv::Size(8, 8), cv::Size(0, 0));
*count = res.size();
float* result = new float[*count];
for(int i = 0; i < res.size(); i++)
{
result[i] = res[i];
}
return result;
}
const int dataSetLength = 10;
float **getTraininigData(int* setlen, int* veclen)//Load some samples of data
{
char *names[dataSetLength] = {
"../faces/s1/1.pgm",
"../faces/s1/2.pgm",
"../faces/s1/3.pgm",
"../faces/s1/4.pgm",
"../faces/s1/5.pgm",
"../faces/cars/1.jpg",
"../faces/cars/2.jpg",
"../faces/cars/3.jpg",
"../faces/cars/4.jpg",
"../faces/cars/5.jpg",
};
float **res = new float* [dataSetLength];
for(int i = 0; i < dataSetLength; i++)
{
std::cout<<names[i]<<"\n";
cv::Mat img = cv::imread(names[i], 0);
res[i] = getHOG(img, veclen);
}
*setlen = dataSetLength;
return res;
}
void test()//Training and activate SVM
{
int setlen, veclen;
float **trainingData = getTraininigData(&setlen, &veclen);
float *labels = new float[dataSetLength];
for(int i = 0; i < dataSetLength; i++)
{
labels[i] = (i < dataSetLength/2)? 0.0 : 1.0;
}
cv::Mat labelsMat(setlen, 1, CV_32FC1, labels);
cv::Mat trainingDataMat(setlen, veclen, CV_32FC1, trainingData);
cv::SVMParams params;
params.svm_type = cv::SVM::C_SVC;
params.kernel_type = cv::SVM::LINEAR;
params.term_crit = cv::TermCriteria(CV_TERMCRIT_ITER, 100, 1e-6);
std::cout<<labelsMat<<"\n";
cv::SVM SVM;
SVM.train(trainingDataMat, labelsMat, cv::Mat(), cv::Mat(), params);
cv::Mat img = cv::imread("../faces/s1/2.pgm", 0);//sample from the training data, but the answer is 1 for every sample
auto desc = getHOG(img, &veclen);
cv::Mat sampleMat(1, veclen, CV_32FC1, desc);
float response = SVM.predict(sampleMat);
std::cout<<"resp "<< response<<"\n";
}
What is wrong with my code?
PS: Sorry for my writing mistakes. English is not my native language.
You don't have much training data. Note how Dalal and Triggs, in their original paper on HOG (http://lear.inrialpes.fr/people/triggs/pubs/Dalal-cvpr05.pdf), used thousands of examples to train the SVM; you have just 5 negative and 5 positive.
You haven't set the C parameter. You need to find a good value via cross-validation, which will require more data.
Possibly the HOG descriptors for faces and cars are not separable with a linear kernel; try RBF.
But this is unlikely to be an issue, since Dalal and Triggs use a linear SVM in their paper.
Read this: http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf
If you haven't done so yet, get the SVM working for a simpler case (e.g. just use image patches instead of HOG).
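Separately from the points above, one thing that may be worth checking in the question's code is how the training matrix is built. cv::Mat's constructor that takes a data pointer expects one contiguous, row-major block of floats, whereas trainingData in the question is a float** (an array of pointers to separately allocated rows). The sketch below shows the general idea of copying per-sample rows into one contiguous buffer. It is written in Java to match the other sketches in this thread, and RowMajor.flatten is a hypothetical helper, not an OpenCV function.

```java
/** Copies equally sized rows into one contiguous row-major array. */
final class RowMajor {

    /** Element (r, c) of the input ends up at index r * cols + c of the result. */
    static float[] flatten(float[][] rows) {
        if (rows.length == 0) {
            return new float[0];
        }
        int cols = rows[0].length;
        float[] flat = new float[rows.length * cols];
        for (int r = 0; r < rows.length; r++) {
            if (rows[r].length != cols) {
                throw new IllegalArgumentException("row " + r + " has a different length");
            }
            System.arraycopy(rows[r], 0, flat, r * cols, cols);
        }
        return flat;
    }
}
```

In the C++ code, the equivalent would be to allocate a single buffer of setlen * veclen floats (or a cv::Mat of that shape) and copy each HOG descriptor into its own row before training.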
I'm trying to create a program that draws the 2D greyscale spectrum of a given image. I'm using the OpenCV and FFTW libraries. Using tips and code from the internet and modifying them, I've managed to load an image, compute its FFT, and recreate the image from the FFT (it's the same). What I'm unable to do is draw the Fourier spectrum itself. Could you please help me?
Here's the code (less important lines removed):
/* Copy input image */
/* Create output image */
/* Allocate input data for FFTW */
in = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * N);
dft = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * N);
/* Create plans */
plan_f = fftw_plan_dft_2d(w, h, in, dft, FFTW_FORWARD, FFTW_ESTIMATE);
/* Populate input data in row-major order */
for (i = 0, k = 0; i < h; i++)
{
for (j = 0; j < w; j++, k++)
{
in[k][0] = ((uchar*)(img1->imageData + i * img1->widthStep))[j];
in[k][1] = 0.;
}
}
/* forward DFT */
fftw_execute(plan_f);
/* spectrum */
for (i = 0, k = 0; i < h; i++)
{
for (j = 0; j < w; j++, k++)
((uchar*)(img2->imageData + i * img2->widthStep))[j] = sqrt(pow(dft[k][0],2) + pow(dft[k][1],2));
}
cvShowImage("iplimage_dft(): original", img1);
cvShowImage("iplimage_dft(): result", img2);
cvWaitKey(0);
/* Free memory */
}
The problem is in the "spectrum" section. Instead of a spectrum I get noise. What am I doing wrong? I would be grateful for your help.
You need to draw the magnitude of the spectrum. Here is the code.
void ForwardFFT(Mat &Src, Mat *FImg)
{
int M = getOptimalDFTSize( Src.rows );
int N = getOptimalDFTSize( Src.cols );
Mat padded;
copyMakeBorder(Src, padded, 0, M - Src.rows, 0, N - Src.cols, BORDER_CONSTANT, Scalar::all(0));
// Create a complex representation of the image
// planes[0] holds the image itself, planes[1] its imaginary part (filled with zeros)
Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)};
Mat complexImg;
merge(planes, 2, complexImg);
dft(complexImg, complexImg);
// After the transform, the result also consists of a real and an imaginary part
split(complexImg, planes);
// Crop the spectrum if it has an odd number of rows or columns
planes[0] = planes[0](Rect(0, 0, planes[0].cols & -2, planes[0].rows & -2));
planes[1] = planes[1](Rect(0, 0, planes[1].cols & -2, planes[1].rows & -2));
Recomb(planes[0],planes[0]);
Recomb(planes[1],planes[1]);
// Normalize the spectrum
planes[0]/=float(M*N);
planes[1]/=float(M*N);
FImg[0]=planes[0].clone();
FImg[1]=planes[1].clone();
}
void ForwardFFT_Mag_Phase(Mat &src, Mat &Mag,Mat &Phase)
{
Mat planes[2];
ForwardFFT(src,planes);
Mag.zeros(planes[0].rows,planes[0].cols,CV_32F);
Phase.zeros(planes[0].rows,planes[0].cols,CV_32F);
cv::cartToPolar(planes[0],planes[1],Mag,Phase);
}
Mat LogMag;
LogMag.zeros(Mag.rows,Mag.cols,CV_32F);
LogMag=(Mag+1);
cv::log(LogMag,LogMag);
//---------------------------------------------------
imshow("Log magnitude", LogMag);
imshow("Phase", Phase);
imshow("Filtering result", img);
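The display step can be sketched independently of the FFT library. In the question, sqrt(re^2 + im^2) is written straight into 8-bit pixels. The zero-frequency term alone is the sum of all pixel values, so most magnitudes are far outside 0..255, which is a likely source of the noise. A common remedy, and the one the code above uses with log(Mag + 1), is to take the logarithm and then rescale to the displayable range. The class SpectrumDisplay below is a hypothetical helper written for illustration.

```java
/** Converts complex DFT coefficients into 8-bit grey levels for display. */
final class SpectrumDisplay {

    /** Applies log(1 + |F|) and rescales so the largest value maps to 255. */
    static int[] toGreyLevels(double[] re, double[] im) {
        int n = re.length;
        double[] logMag = new double[n];
        double max = 0;
        for (int k = 0; k < n; k++) {
            logMag[k] = Math.log1p(Math.hypot(re[k], im[k]));
            max = Math.max(max, logMag[k]);
        }
        int[] grey = new int[n];
        for (int k = 0; k < n; k++) {
            grey[k] = max == 0 ? 0 : (int) Math.round(255.0 * logMag[k] / max);
        }
        return grey;
    }
}
```

The returned values can be written into the output image in the same row-major order as the dft array. It may also be worth double-checking the plan creation in the question: FFTW's fftw_plan_dft_2d takes the number of rows first, so for a row-major buffer with h rows and w columns the call would presumably be fftw_plan_dft_2d(h, w, ...).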
Can you try the IFFT step and see whether you recover the original image? Then you can check step by step where the problem is. Another way to find the problem is to run this process on a small matrix you define yourself, calculate its FFT in MATLAB, and compare step by step. It worked for me!