I am trying to create an application that detects a heartbeat using a computer webcam. I have been working on the code for two weeks, and this is what I have so far.
How does it work? The steps are listed below:
Detecting the face using OpenCV
Getting an image of the forehead
Applying a filter to convert it into a grayscale image [you can skip it]
Finding the average intensity of the green pixels per frame
Saving the averages into an array
Applying an FFT (I have used the Minim library)
Extracting the heartbeat from the FFT spectrum (here, I need some help)
Here I need help extracting the heartbeat from the FFT spectrum. Can anyone help me? Here is a similar application developed in Python, but I am not able to understand that code, so I am developing the same thing in Processing. Can anyone help me understand the part of the Python code where it extracts the heartbeat?
//--------- import required libraries -----------
import gab.opencv.*;
import processing.video.*;
import java.awt.*;
import java.util.*;
import ddf.minim.analysis.*;
import ddf.minim.*;

//---------- create objects ---------------------------------
Capture video; // camera object
OpenCV opencv; // opencv object
Minim minim;
FFT fft;
//IIRFilter filt;

//--------- create ArrayList --------------------------------
ArrayList<Float> poop = new ArrayList<Float>();
float[] sample;
int bufferSize = 128;
int sampleRate = 512;
int bandWidth = 20;
int centerFreq = 80;

void setup() {
  size(640, 480); // size of the window
  minim = new Minim(this);
  fft = new FFT(bufferSize, sampleRate);
  video = new Capture(this, 640/2, 480/2); // initializing video object
  opencv = new OpenCV(this, 640/2, 480/2); // initializing opencv object
  opencv.loadCascade(OpenCV.CASCADE_FRONTALFACE); // loading haar cascade file for face detection
  video.start(); // start video
}

void draw() {
  // image(video, 0, 0 ); // show video in the background
  opencv.loadImage(video); // hand the current frame to OpenCV before detecting
  Rectangle[] faces = opencv.detect();

  //------------ Finding faces in the video -----------
  float gavg = 0;
  for (int i = 0; i < faces.length; i++) {
    stroke(#FFB700); // yellow rectangle
    rect(faces[i].x, faces[i].y, faces[i].width, faces[i].height); // creating rectangle around the face (YELLOW)
    stroke(#0070FF); // blue rectangle
    rect(faces[i].x, faces[i].y, faces[i].width, faces[i].height-2*faces[i].height/3); // creating a blue rectangle around the forehead

    //-------------------- storing the small forehead rectangle into an image -------------------
    stroke(0, 255, 255);
    rect(faces[i].x+faces[i].width/2-15, faces[i].y+15, 30, 15);
    PImage img = video.get(faces[i].x+faces[i].width/2-15, faces[i].y+15, 30, 15); // storing the forehead area into an image
    img.filter(GRAY); // converting captured image from rgb to gray

    int numPixels = img.width*img.height;
    for (int px = 0; px < numPixels; px++) { // for each pixel in the forehead image...
      final color c = img.pixels[px];
      final color luminG = c >> 8 & 0xFF; // green channel: shift right by 8 bits
      final float luminRangeG = luminG/255.0;
      gavg = gavg + luminRangeG;
    }
    gavg = gavg/numPixels;

    // keep the most recent bufferSize averages
    poop.add(gavg);
    if (poop.size() > bufferSize) {
      poop.remove(0);
    }
  }

  sample = new float[poop.size()];
  for (int i = 0; i < poop.size(); i++) {
    Float f = (float) poop.get(i);
    sample[i] = f;
  }

  if (sample.length >= bufferSize) {
    fft.forward(sample, 0);
    // bpf = new BandPass(centerFreq, bandwidth, sampleRate);
    // in.addEffect(bpf);
    float bw = fft.getBandWidth(); // returns the width of each frequency band in the spectrum (in Hz).
    println(bw); // returns 21.5332031 Hz for spectrum [0] & [512]
    for (int i = 0; i < fft.specSize(); i++) {
      // println( " Freq" + max(sample));
      stroke(0, 255, 0);
      float x = map(i, 0, fft.specSize(), 0, width);
      line(x, height, x, height - fft.getBand(i)*100);
      // text("FFT FREQ " + fft.getFreq(i), width/2-100, 10*(i+1));
      // text("FFT BAND " + fft.getBand(i), width/2+100, 10*(i+1));
    }
  }
  else {
    println(sample.length + " " + poop.size());
  }
}

void captureEvent(Capture c) {
  c.read();
}
The FFT is applied over a window of 128 samples.
int bufferSize = 128;
In the draw method, the samples are stored in an array until the buffer is full and the FFT can be applied. After that, the buffer is kept full: to insert a new sample, the oldest one is removed. gavg is the average gray-channel value.
gavg = gavg/numPixels;
poop.add(gavg);
if (poop.size() > bufferSize) {
  poop.remove(0); // drop the oldest sample
}
Copying poop to sample:
sample = new float[poop.size()];
for (int i = 0; i < poop.size(); i++) {
  Float f = (float) poop.get(i);
  sample[i] = f;
}
Now it is possible to apply the FFT to the sample array:
fft.forward(sample, 0);
The code only displays the resulting spectrum; the heartbeat frequency still has to be calculated.
Go through the bands of the FFT and find the one with the maximum amplitude; the position of that band gives the heartbeat frequency.
int peakBand = 1; // skip band 0, which only holds the constant (DC) offset
for (int i = 2; i < fft.specSize(); i++) {
  if (fft.getBand(i) > fft.getBand(peakBand)) {
    peakBand = i; // remember the position of the strongest band
  }
}
Then get the bandwidth to convert that position into a frequency.
float bw = fft.getBandWidth();
Adjusting the frequency:
float heartBeatFrequency = bw * peakBand; // in Hz, provided the FFT was created with the real sampling rate; multiply by 60 for beats per minute
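The peak search described above can also be sketched in plain Java, independent of Minim. This is only an illustration under stated assumptions: the class and method names are hypothetical, a naive DFT stands in for the FFT, the samples are assumed to arrive at the camera frame rate (the fps argument), and the search is limited to roughly 0.7 to 4 Hz (42 to 240 beats per minute), which is an assumed plausible range rather than something taken from the question.

```java
// Sketch: estimate heart rate from a buffer of per-frame brightness averages.
// Names are hypothetical; a naive DFT is used instead of Minim's FFT.
class HeartRateSketch {

    // Magnitude of DFT bin k for a real-valued signal.
    static double binMagnitude(float[] samples, int k) {
        double re = 0, im = 0;
        int n = samples.length;
        for (int t = 0; t < n; t++) {
            double angle = 2 * Math.PI * k * t / n;
            re += samples[t] * Math.cos(angle);
            im -= samples[t] * Math.sin(angle);
        }
        return Math.hypot(re, im);
    }

    // Returns the beats per minute of the strongest bin between minHz and maxHz.
    // fps is the rate at which the samples were actually captured.
    static double estimateBpm(float[] samples, double fps, double minHz, double maxHz) {
        int n = samples.length;
        double hzPerBin = fps / n;            // plays the role of the band width
        int peakBin = -1;
        double peakMag = -1;
        for (int k = 1; k <= n / 2; k++) {    // bin 0 is the constant (DC) offset
            double hz = k * hzPerBin;
            if (hz < minHz || hz > maxHz) continue;
            double mag = binMagnitude(samples, k);
            if (mag > peakMag) {
                peakMag = mag;
                peakBin = k;
            }
        }
        if (peakBin < 0) return Double.NaN;   // no bin falls inside the band
        return peakBin * hzPerBin * 60.0;     // Hz to beats per minute
    }
}
```

With 128 samples at 30 frames per second each bin is about 0.23 Hz wide, so the estimate moves in steps of roughly 14 beats per minute; a longer buffer would give finer resolution.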
Once you have at least bufferSize (128) samples, run the forward FFT on the samples array and take the peak of the spectrum; the frequency of that peak is the heart rate.
The following papers explain the same approach:
Measuring Heart Rate from Video - Isabel Bush - Stanford - link (Page 4 paragraphs below Figure 2 explain this.)
Real Time Heart Rate Monitoring From Facial RGB Color Video Using Webcam - H. Rahman, M.U. Ahmed, S. Begum, P. Funk - link (Page 4)
After looking at your question, I thought I would get my hands on this, and I tried making a repository for it.
It still has some issues, if someone can have a look at it.
Thank you David Clifte for this answer; it helped a lot.
Since the Corona situation has turned my studies into self-study, as a Processing newbie I am not having an easy time getting into image processing, more specifically convolution. I therefore hope that you can help me.
My lecturer, who unfortunately is almost never reachable, left me the following convolution code. The theory behind convolution is clear to me, but I have many gaps in understanding the code itself. Could someone add line comments so that I can read the code a bit more fluently?
The code is as follows:
color convolution (int x, int y, float[][] matrix, int matrix_size, PImage img){
  float rtotal = 0.0;
  float gtotal = 0.0;
  float btotal = 0.0;
  int offset = matrix_size / 2;
  for (int i = 0; i < matrix_size; i++){
    for (int j = 0; j < matrix_size; j++){
      int xloc = x+i-offset;
      int yloc = y+j-offset;
      int loc = xloc + img.width*yloc;
      rtotal += (red(img.pixels[loc]) * matrix[i][j]);
      gtotal += (green(img.pixels[loc]) * matrix[i][j]);
      btotal += (blue(img.pixels[loc]) * matrix[i][j]);
    }
  }
  rtotal = constrain(rtotal, 0, 255);
  gtotal = constrain(gtotal, 0, 255);
  btotal = constrain(btotal, 0, 255);
  return color(rtotal, gtotal, btotal);
}
I have to do a bit of guesswork since I'm not positive about all of the functions you're using and I'm not familiar with the Processing 3+ library, but here's my best shot at it.
color convolution (int x, int y, float[][] matrix, int matrix_size, PImage img){
  // Note: the 'matrix' parameter here will also frequently be referred to as
  // a 'window' or 'kernel' in research
  // I'm not certain what your PImage class is from, but I'll assume
  // you're using the Processing 3+ library and work off of that assumption

  // how much of each color we see within the kernel (matrix) space
  float rtotal = 0.0;
  float gtotal = 0.0;
  float btotal = 0.0;

  // this offset is to zero-center our kernel
  // the fact that we use matrix_size / 2 implicitly
  // tells us that matrix_size should be an odd number
  // so that we can have a middle pixel
  int offset = matrix_size / 2;

  // looping through the kernel. the fact that we use 'matrix_size'
  // as our end-condition for both dimensions means that our 'matrix' kernel
  // must always be a square
  for (int i = 0; i < matrix_size; i++){
    for (int j = 0; j < matrix_size; j++){
      // calculating the index conversion from 2D to the 1D format that PImage uses
      // refer to: https://processing.org/tutorials/pixels/
      // for a better understanding of PImage indexing (about 1/3 of the way down the page)
      // WARNING: by subtracting the offset it is possible to hit negative
      // x,y values here if you pick an x or y position less than matrix_size / 2.
      // the same index-out-of-bounds can occur on the high end.
      // When you convolve using a kernel of N x N size (N here would be matrix_size)
      // you can only convolve for x in [N / 2, width - 1 - (N / 2)]
      // and y in [N / 2, height - 1 - (N / 2)]
      int xloc = x+i-offset;
      int yloc = y+j-offset;

      // this is the final 1D PImage index that corresponds to [xloc, yloc] in our 2D image
      // really go back up and take a look at the link if this doesn't make sense, it's pretty good
      int loc = xloc + img.width*yloc;

      // I have to do some speculation again since I'm not certain what red(img.pixels[loc]) does
      // I'll assume it returns the red channel of the pixel
      // this section just adds up all of the pixel colors multiplied by the value in the kernel
      rtotal += (red(img.pixels[loc]) * matrix[i][j]);
      gtotal += (green(img.pixels[loc]) * matrix[i][j]);
      btotal += (blue(img.pixels[loc]) * matrix[i][j]);
    }
  }

  // the fact that no further division or averaging happens after the for-loops implies
  // that the kernel you feed in should have balanced values for your kernel size
  // for example, a kernel that's designed to average out the color over the 3 x 3 area
  // it covers (this would be like blurring the image) would be filled with 1/9
  // in general: the kernel you're using should have a sum of 1 for all of the numbers inside
  // this is just 'in general': you can play around with not doing that, but you'll probably notice a
  // darkening effect when the sum is less than 1, and a brightening effect if it's greater than 1
  // for more info on kernels, read this: https://en.wikipedia.org/wiki/Kernel_(image_processing)

  // I don't have the code for this constrain function,
  // but it's almost certainly just your typical clamp (constrains the values to [0, 255])
  // Note: this means that your values saturate at 0 and 255
  // if you see a lot of black or white then that means your kernel
  // probably isn't balanced as mentioned above
  rtotal = constrain(rtotal, 0, 255);
  gtotal = constrain(gtotal, 0, 255);
  btotal = constrain(btotal, 0, 255);

  // Finished!
  return color(rtotal, gtotal, btotal);
}
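To make the out-of-bounds warning above concrete, here is a minimal plain-Java sketch of the same per-pixel convolution that clamps the sample coordinates to the image edge instead of reading outside the array. It is not the lecturer's code: the class name, the packed ARGB int[] layout, and the edge-clamping policy are all assumptions chosen for illustration.

```java
// Sketch: per-pixel convolution on a packed ARGB array, with edge clamping.
// Names and the clamping policy are illustrative assumptions.
class ConvolutionSketch {

    static int clamp(int v, int lo, int hi) {
        return Math.max(lo, Math.min(hi, v));
    }

    // pixels is row-major with length width * height; kernel must be square.
    static int convolve(int x, int y, float[][] kernel, int[] pixels, int width, int height) {
        int size = kernel.length;
        int offset = size / 2;                 // centers the kernel on (x, y)
        float r = 0, g = 0, b = 0;
        for (int i = 0; i < size; i++) {
            for (int j = 0; j < size; j++) {
                // clamp so that border pixels reuse the nearest valid neighbour
                int xloc = clamp(x + i - offset, 0, width - 1);
                int yloc = clamp(y + j - offset, 0, height - 1);
                int c = pixels[xloc + width * yloc];
                r += ((c >> 16) & 0xFF) * kernel[i][j];
                g += ((c >> 8) & 0xFF) * kernel[i][j];
                b += (c & 0xFF) * kernel[i][j];
            }
        }
        int ri = clamp(Math.round(r), 0, 255);
        int gi = clamp(Math.round(g), 0, 255);
        int bi = clamp(Math.round(b), 0, 255);
        return 0xFF000000 | (ri << 16) | (gi << 8) | bi; // fully opaque result
    }
}
```

A 3 x 3 kernel filled with 1/9 leaves a uniformly coloured image unchanged, which is a quick way to check that a kernel is balanced.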
I have been working on a background subtraction capture demo recently, but I have run into difficulties. I already have the pixels of the silhouette extraction, and I intend to draw them into a buffer created with createGraphics(). I set the new background to be 100% transparent so that I get only the extracted foreground. Then I use the saveFrame() function to get a PNG file of each frame. However, it doesn't work as I expected. I intend to get a series of PNGs of the silhouette extraction with a 100% transparent background, but now I only get ordinary PNGs of the frames from the camera feed. Could anyone help me see what the problem with this code is? Thanks a lot in advance. Any help will be appreciated.
import processing.video.*;

Capture video;
PGraphics pg;
PImage backgroundImage;
float threshold = 30;

void setup() {
  size(320, 240);
  video = new Capture(this, width, height);
  video.start();
  backgroundImage = createImage(video.width, video.height, RGB);
  pg = createGraphics(320, 240);
}

void captureEvent(Capture video) {
  video.read();
}

void draw() {
  loadPixels();
  video.loadPixels();
  backgroundImage.loadPixels();
  image(video, 0, 0);
  for (int x = 0; x < video.width; x++) {
    for (int y = 0; y < video.height; y++) {
      int loc = x + y * video.width;
      color fgColor = video.pixels[loc];
      color bgColor = backgroundImage.pixels[loc];
      float r1 = red(fgColor); float g1 = green(fgColor); float b1 = blue(fgColor);
      float r2 = red(bgColor); float g2 = green(bgColor); float b2 = blue(bgColor);
      float diff = dist(r1, g1, b1, r2, g2, b2);
      if (diff > threshold) {
        pixels[loc] = fgColor;
      } else {
        pixels[loc] = color(0, 0);
      }
    }
  }
  updatePixels();
}

void mousePressed() {
  backgroundImage.copy(video, 0, 0, video.width, video.height, 0, 0, video.width, video.height);
}
Then I use the saveFrame() function to get a PNG file of each frame. However, it doesn't work as I expected. I intend to get a series of PNGs of the silhouette extraction with a 100% transparent background, but now I only get ordinary PNGs of the frames from the camera feed.
This won't work, because saveFrame() saves the canvas, and the canvas doesn't support transparency. For example, from the reference:
It is not possible to use the transparency alpha parameter with background colors on the main drawing surface. It can only be used along with a PGraphics object and createGraphics(). (https://processing.org/reference/background_.html)
If you want to dump a frame with transparency you need to use .save() to dump it directly from a PImage / PGraphics.
If you need to clear your PImage / PGraphics and reuse it each frame, either use pg.clear() or pg.background(0,0,0,0) (set all pixels to transparent black).
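The per-pixel decision behind this can be sketched in plain Java on packed ARGB arrays. This is only an illustration: the class and method names are hypothetical, and the arrays stand in for video.pixels and backgroundImage.pixels. In a Processing sketch, the result would be written into pg.pixels (between pg.loadPixels() and pg.updatePixels()) and then saved with pg.save() rather than saveFrame().

```java
// Sketch: keep foreground pixels opaque and make everything else fully transparent.
// Names are hypothetical; arrays hold packed ARGB values of equal length.
class ForegroundSketch {

    static int[] extractForeground(int[] frame, int[] background, float threshold) {
        int[] out = new int[frame.length];
        for (int i = 0; i < frame.length; i++) {
            int f = frame[i];
            int b = background[i];
            int dr = ((f >> 16) & 0xFF) - ((b >> 16) & 0xFF);
            int dg = ((f >> 8) & 0xFF) - ((b >> 8) & 0xFF);
            int db = (f & 0xFF) - (b & 0xFF);
            // Euclidean distance in RGB space, like dist(r1, g1, b1, r2, g2, b2)
            double diff = Math.sqrt(dr * dr + dg * dg + db * db);
            if (diff > threshold) {
                out[i] = 0xFF000000 | (f & 0x00FFFFFF); // foreground: alpha 255
            } else {
                out[i] = 0x00000000;                    // background: alpha 0
            }
        }
        return out;
    }
}
```

The alpha channel only survives if the image that is saved actually has one, which is why the buffer has to be a PGraphics or an ARGB PImage rather than the main canvas.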
I have a problem with OpenCV: I must detect and track grapes with a camera using Processing. How do I do it? Can I have an example? Thank you.
This is example code that detects faces:
import gab.opencv.*;
import processing.video.*;
import java.awt.*;

Capture video;
OpenCV opencv;

void setup() {
  size(640, 480);
  video = new Capture(this, 640/2, 480/2);
  opencv = new OpenCV(this, 640/2, 480/2);
  opencv.loadCascade(OpenCV.CASCADE_FRONTALFACE); // face detector
  video.start();
}

void draw() {
  opencv.loadImage(video); // hand the current frame to OpenCV
  image(video, 0, 0 );
  stroke(0, 255, 0);
  Rectangle[] faces = opencv.detect();
  for (int i = 0; i < faces.length; i++) {
    println(faces[i].x + "," + faces[i].y);
    rect(faces[i].x, faces[i].y, faces[i].width, faces[i].height);
  }
}

void captureEvent(Capture c) {
  c.read();
}
The code you're using is trying to detect faces.
As a basic breakdown, you will need to segment the object you're trying to detect (grapes in this case) from the background. I recommend starting simple:
1. Try simply using threshold() and see if the highlights of each grape can be isolated. Hopefully they'll be the brightest spots in the image (if the camera isn't looking directly at a light source).
2. If method 1 isn't effective, try using colour detection: if you know what kind of grapes you want to detect, you can select a range of colours to detect and ignore the rest. Run the HSVColorTracking example and have a play with the ranges. Swap the marbles image with an image of grapes and see what you can get.
3. OpenCV has a function specifically built for detecting circles: HoughCircles. Unfortunately, Greg's OpenCV Processing library doesn't wrap this function yet, as it does with HoughLines, but it does provide functions to convert between OpenCV's Mat and Processing's PImage. If you're just getting started with Processing and don't have experience with plain Java, this may be more convoluted.
Try basic thresholding and HSB range thresholding first. Once you have a good looking binary image (where the background is completely black and the grapes are white) you can findContours, get the centroid of each contour, compute the minEnclosingCircle(), etc.
Another option might be to train a support vector machine to distinguish between two classes: grapes and not grapes. This is a more advanced topic, but luckily Greg Borenstein, author of the OpenCV Processing library, wrote a nice article with videos and example code on the topic. Check out PSVM: Support Vector Machines for Processing.
Here's a mashup of the HueRangeSelection and FindContours examples using a Google image result:
import gab.opencv.*;

PImage img;
OpenCV opencv;
Histogram histogram;

int lowerb = 50;
int upperb = 100;

ArrayList<Contour> contours;
ArrayList<Contour> polygons;

void setup() {
  img = loadImage("grape-harvest-inside.jpg");
  opencv = new OpenCV(this, img);
  opencv.useColor(HSB);
}

void draw() {
  opencv.loadImage(img);
  image(img, 0, 0);

  // work on the hue channel and keep only the selected range
  opencv.setGray(opencv.getH().clone());
  opencv.inRange(lowerb, upperb);
  histogram = opencv.findHistogram(opencv.getH(), 255);

  // preview of the thresholded image in the bottom-right quarter
  image(opencv.getOutput(), width/2, height/2, width/2,height/2);

  // hue histogram with the selected range marked in red
  noStroke(); fill(0);
  histogram.draw(10, height - 230, 400, 200);
  noFill(); stroke(0);
  line(10, height-30, 410, height-30);
  text("Hue", 10, height - (textAscent() + textDescent()));

  float lb = map(lowerb, 0, 255, 0, 400);
  float ub = map(upperb, 0, 255, 0, 400);

  stroke(255, 0, 0); fill(255, 0, 0);
  line(lb + 10, height-30, ub +10, height-30);
  ellipse(lb+10, height-30, 3, 3 );
  text(lowerb, lb-10, height-15);
  ellipse(ub+10, height-30, 3, 3 );
  text(upperb, ub+10, height-15);

  // outline every blob found in the thresholded image
  contours = opencv.findContours();
  for (Contour contour : contours) {
    stroke(0, 255, 0);
    contour.draw();
  }
}

// move the mouse to shift the range; hold a key to change only the upper bound
void mouseMoved() {
  if (keyPressed) {
    upperb += mouseX - pmouseX;
  }
  else {
    if (upperb < 255 || (mouseX - pmouseX) < 0) {
      lowerb += mouseX - pmouseX;
    }
    if (lowerb > 0 || (mouseX - pmouseX) > 0) {
      upperb += mouseX - pmouseX;
    }
  }
  upperb = constrain(upperb, lowerb, 255);
  lowerb = constrain(lowerb, 0, upperb-1);
}
Here's a preview of selecting range closer to the grapes colour:
As you may already notice, this is easy to use but not foolproof; it should get you on the right track to asking yourself the right kind of questions.
For example:
What environments are you supporting (indoors/outdoors, natural lighting, artificial lighting, daytime, nighttime, both, etc.)? Light controls what your input images will look like and is therefore crucial.
How many different kinds of grapes will you support? Can you get away with a single type (colour range)? Are there other elements that may trigger a false positive?
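The hue-range idea from the mashup can also be sketched without the library, in plain Java. Everything here is an assumption for illustration: the class and method names are hypothetical, pixels are packed ARGB values, and hue is scaled to 0..255 only to mirror the slider in the sketch, which may differ from the scale the OpenCV library uses internally. The centroid is computed over the whole mask rather than per contour.

```java
// Sketch: select pixels whose hue lies in a range, then locate their centroid.
// Names and the 0..255 hue scale are illustrative assumptions.
class HueRangeSketch {

    // Returns true for every pixel whose hue is within [lower, upper].
    static boolean[] hueMask(int[] pixels, int lower, int upper) {
        boolean[] mask = new boolean[pixels.length];
        float[] hsb = new float[3];
        for (int i = 0; i < pixels.length; i++) {
            int c = pixels[i];
            java.awt.Color.RGBtoHSB((c >> 16) & 0xFF, (c >> 8) & 0xFF, c & 0xFF, hsb);
            int hue = Math.round(hsb[0] * 255);   // hsb[0] is in 0..1
            mask[i] = hue >= lower && hue <= upper;
        }
        return mask;
    }

    // Centroid {x, y} of the selected pixels in a row-major mask, or null if none.
    static double[] centroid(boolean[] mask, int width) {
        long sumX = 0, sumY = 0, count = 0;
        for (int i = 0; i < mask.length; i++) {
            if (mask[i]) {
                sumX += i % width;
                sumY += i / width;
                count++;
            }
        }
        if (count == 0) return null;
        return new double[] { (double) sumX / count, (double) sumY / count };
    }
}
```

Gray or nearly gray pixels have no meaningful hue, so a real detector would probably also require a minimum saturation before accepting a pixel.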
Detect the hoop (basket); see the samples of "hoop".
Count the number of successful attempts (shots) and failed attempts.
I am using OpenCV.
The camera position will be static.
The videos are in portrait mode, from any mobile device.
What I have tried:
I am able to track the basketball, but I am still seeking a better solution.
My code:
int main () {
    VideoCapture vid(path);
    if (!vid.isOpened())
        return -1;

    int i_frame_height = vid.get(CV_CAP_PROP_FRAME_HEIGHT);
    i_height_basketball = i_height_basketball * I_HEIGHT / i_frame_height;
    int fps = vid.get(CV_CAP_PROP_FPS);

    Mat mat_black(640, 480, CV_8UC3, Scalar(0, 0, 0));

    // ring buffer of recent blurred frames (clone so each slot owns its data)
    vector <Mat> vec_frames;
    for (int i_push = 0; i_push < I_NO_FRAMES_STORE; i_push++)
        vec_frames.push_back(mat_black.clone());

    // ring buffer of recent unblurred frames used for display
    vector <Mat> vec_mat_result;
    for (int i_push = 0; i_push < I_RESULT_STORE; i_push++)
        vec_mat_result.push_back(mat_black.clone());

    int count_frame = 0;
    while (true) {
        int clk_start = clock();
        Mat image, result;
        vid >> image;
        if (image.empty())
            break;

        resize(image, image, Size(I_WIDTH, I_HEIGHT));
        image.copyTo(vec_mat_result[count_frame % I_RESULT_STORE]);
        if (count_frame >= 1)
            vec_mat_result[(count_frame - 1) % I_RESULT_STORE].copyTo(result);

        GaussianBlur(image, image, Size(9, 9), 2, 2);
        image.copyTo(vec_frames[count_frame % I_NO_FRAMES_STORE]);

        if (count_frame >= I_NO_FRAMES_STORE - 1) {
            // count, per pixel, in how many stored frames it differs from the previous frame
            Mat mat_diff_temp(I_HEIGHT, I_WIDTH, CV_32S, Scalar(0));
            for (int i_diff = 0; i_diff < I_NO_FRAMES_STORE; i_diff++) {
                Mat mat_rgb_diff_temp = abs(vec_frames[ (count_frame - 1) % I_NO_FRAMES_STORE ] - vec_frames[ (count_frame - i_diff) % I_NO_FRAMES_STORE ]);
                cvtColor(mat_rgb_diff_temp, mat_rgb_diff_temp, CV_BGR2GRAY);
                mat_rgb_diff_temp = mat_rgb_diff_temp > I_THRESHOLD;
                mat_rgb_diff_temp.convertTo(mat_rgb_diff_temp, CV_32S);
                mat_diff_temp = mat_diff_temp + mat_rgb_diff_temp;
            }
            mat_diff_temp = mat_diff_temp > I_THRESHOLD_2;
            // mat_diff_temp.convertTo(mat_diff_temp, CV_8U);

            // only look for the ball above the configured height
            Mat mat_roi = mat_diff_temp.rowRange(0, i_height_basketball);
            // imshow("ROI", mat_roi);

            // centroid of the moving pixels is taken as the ball position
            Moments mm = cv::moments(mat_roi, true);
            Point p_center = Point(mm.m10 / mm.m00, mm.m01 / mm.m00);
            circle(result, p_center, 3, CV_RGB(0, 255, 0), -1);
            line(result, Point(0, i_height_basketball), Point(result.cols, i_height_basketball), Scalar(225, 0, 0), 1);
        }

        count_frame = count_frame + 1;
        int clk_processing_time = (clock() - clk_start);
        if (count_frame > 1)
            imshow("image", result);
        // waitKey(0);

        int delay = (1000 / fps) - clk_processing_time;
        if (delay <= 0)
            delay = 2;
        if (waitKey(delay) >= 27)
            break;
    }
    return 0;
}
How can I detect the hoop? I thought of using square detection to detect the square regions around the hoop.
What is the best way of counting the successful shots? In other words, how should I count them?
I have what I suspect will be a fairly strong baseline: once the ball has commenced its downward arc, if the ball demonstrates significant upward movement again, it's a miss. Otherwise, it's a basket. This won't catch airballs, but I suspect they're relatively few anyway.
I think you could get a whole lot of mileage out of learning the ball trajectory of a successful shot and not worry too much about the hoop. Furthermore, didn't you say the camera was fixed-position? Doesn't that mean the hoop's always in the same place, and so you could just specify its location?
If you absolutely did have to find the hoop, I'd look for an object (sub-region of the image) of about the same size as the ball (which you say you can track) that's orange. More generally, you could learn a classifier for the hoop based on the training images you linked to, and apply it at a mixture of locations and scales, searching for the best match. You should know its approximate location, i.e. that it's in the upper portion of the image and likely to be to one side or the other. Then you could use proximity features to this identified region in addition to trajectory features to build a classifier for whether the shot succeeded or not.
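The baseline rule described above (descent followed by significant upward movement means a miss) can be sketched in plain Java as follows. This is a rough illustration under assumptions: the class and method names are hypothetical, the input is the tracked ball centre's y coordinate per frame in image coordinates (y grows downward), and bouncePx is a made-up tolerance for what counts as "significant" upward movement.

```java
// Sketch: classify a shot from the ball's vertical trajectory.
// Names and the bouncePx tolerance are illustrative assumptions.
class ShotSketch {

    // ys: ball centre y per frame, image coordinates (larger y is lower on screen).
    // Returns true when the ball moves up by more than bouncePx after starting to fall.
    static boolean isMiss(double[] ys, double bouncePx) {
        boolean descending = false;
        double lowest = Double.NEGATIVE_INFINITY; // largest y seen since the descent began
        for (int i = 1; i < ys.length; i++) {
            if (!descending) {
                if (ys[i] > ys[i - 1]) {          // first frame where the ball drops
                    descending = true;
                    lowest = ys[i];
                }
            } else {
                lowest = Math.max(lowest, ys[i]);
                if (lowest - ys[i] > bouncePx) {  // ball has bounced back up
                    return true;
                }
            }
        }
        return false;                             // no rebound seen: count as a basket
    }
}
```

The trajectory passed in should probably be cut off around hoop height, otherwise a made shot bouncing off the floor would be counted as a miss.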
I'm doing a project in OpenCV on object detection, which consists of matching the object in a template image against a reference image. Using the SIFT algorithm, the features are accurately detected and matched, but I want a rectangle around the matched features.
My algorithm uses the KD-Tree Best Bin First technique to get the matches.
If you want a rectangle around the detected object, here you have a code example that does exactly that. You just need to project the corners of the template image through the homography H and draw the resulting outline.
Hope it helps. Good luck.
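The corner projection itself is a few lines of arithmetic. The sketch below shows it in plain Java, assuming H is already available as a 3x3 row-major matrix (in OpenCV it would typically come from findHomography, with perspectiveTransform doing the projection). The class and method names are hypothetical.

```java
// Sketch: project a template's corners through a homography and bound the result.
// Names are hypothetical; h is a 3x3 row-major homography matrix.
class HomographySketch {

    // Applies h to the point (x, y) and returns {x', y'}.
    static double[] project(double[][] h, double x, double y) {
        double w = h[2][0] * x + h[2][1] * y + h[2][2]; // homogeneous scale
        return new double[] {
            (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w
        };
    }

    // Corners of a width x height template, clockwise from the top-left.
    static double[][] projectCorners(double[][] h, double width, double height) {
        return new double[][] {
            project(h, 0, 0),
            project(h, width, 0),
            project(h, width, height),
            project(h, 0, height)
        };
    }

    // Axis-aligned bounding box {minX, minY, maxX, maxY} of the given points.
    static double[] boundingBox(double[][] pts) {
        double minX = Double.POSITIVE_INFINITY, minY = Double.POSITIVE_INFINITY;
        double maxX = Double.NEGATIVE_INFINITY, maxY = Double.NEGATIVE_INFINITY;
        for (double[] p : pts) {
            minX = Math.min(minX, p[0]);
            minY = Math.min(minY, p[1]);
            maxX = Math.max(maxX, p[0]);
            maxY = Math.max(maxY, p[1]);
        }
        return new double[] { minX, minY, maxX, maxY };
    }
}
```

Drawing lines between consecutive projected corners gives the outline of the object as it appears in the scene; the bounding box is only needed if an upright rectangle is wanted.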
I use the following code, adapted from the SURF algorithm in OpenCV (modules/features2d/src/surf.cpp), to extract the surroundings of a keypoint.
Unlike other examples based on rectangles and ROIs, this code returns the patch correctly oriented according to the orientation and scale determined by the feature detection algorithm (both available in the KeyPoint struct).
An example of the results of the detection on several different images:
#include <algorithm>
#include <cmath>
#include <opencv2/opencv.hpp>

using namespace cv;

const int PATCH_SZ = 20;

// Extracts the patch around a keypoint, rotated to the keypoint's orientation
// and sized according to its scale. Expects an 8-bit, 3-channel image.
Mat extractKeyPoint(const Mat& image, KeyPoint kp)
{
    int x = (int)kp.pt.x;
    int y = (int)kp.pt.y;
    float size = kp.size;
    float angle = kp.angle;

    int win_size = (int)((PATCH_SZ+1)*size*1.2f/9.0);
    Mat win(win_size, win_size, CV_8UC3);

    // keypoint angle is in degrees; convert to radians
    float descriptor_dir = angle * (CV_PI/180);
    float sin_dir = sin(descriptor_dir);
    float cos_dir = cos(descriptor_dir);

    // top-left corner of the rotated window in image coordinates
    float win_offset = -(float)(win_size-1)/2;
    float start_x = x + win_offset*cos_dir + win_offset*sin_dir;
    float start_y = y - win_offset*sin_dir + win_offset*cos_dir;

    uchar* WIN = win.data;
    const uchar* IMG = image.data;
    for( int i = 0; i < win_size; i++, start_x += sin_dir, start_y += cos_dir )
    {
        float pixel_x = start_x;
        float pixel_y = start_y;
        for( int j = 0; j < win_size; j++, pixel_x += cos_dir, pixel_y -= sin_dir )
        {
            // nearest-neighbour sampling, clamped to the image borders
            int src_x = std::min(std::max(cvRound(pixel_x), 0), image.cols-1);
            int src_y = std::min(std::max(cvRound(pixel_y), 0), image.rows-1);
            for (int c=0; c<3; c++) {
                WIN[i*win_size*3 + j*3 + c] = IMG[src_y*image.step1() + src_x*3 + c];
            }
        }
    }
    return win;
}
I am not sure if the scale is entirely OK, but it is taken from the SURF source and the results look relevant to me.