Automatic color calibration for object tracker - opencv

This is my first post, so forgive me if I miss something.
I have been playing around with OpenCV2 in Visual Studio C++. I have a basic object tracker working: it applies a Gaussian blur, converts to HSV, thresholds with trackbars, then erodes and dilates. Now I want to set up some way of easily calibrating the color to be thresholded without using the trackbars.
I've tried setting up an area of interest and taking the average BGR or HSV values (I've tried both ways), then using trackbars to make finer adjustments if needed, but it does not seem to work. Am I on the right track, or is there a better way?
I have basically followed this video to get where I am.
https://www.youtube.com/watch?v=bSeFrPrqZ2A
I am not looking for code to copy and paste. I am just looking for an algorithm or an explanation of a way to do it. Cheers
EDIT
Sorry, I'll try to clear it up. What I have done is write an object tracking program for a home robot vision project. I just want to make it easier to calibrate which color is thresholded. At the moment I use trackbars to set the min and max HSV values for thresholding, then use erode and dilate to clean up the binary image, before using cv::findContours and cv::moments to find the centroid of the largest contour.
What I have tried is setting a small 40x40 pixel square in the center of the screen. When, for example, I hold a green ball in this square and hit the spacebar, I cycle through each pixel in the square and read its separate Hue, Saturation and Value components. Then I take the mode of each and use that to set the min and max threshold values.
Here is a segment of the code
if (cv::waitKey(20) == 32) { // wait for spacebar
    int count = 0;
    cv::Mat roi_Crop = frame_HSV(roi); // create cropped image from frame_HSV
    for (int i = 0; i < roi_Crop.rows; i++) // cycle through each pixel
    {
        for (int j = 0; j < roi_Crop.cols; j++)
        {
            Hue[count] = roi_Crop.at<cv::Vec3b>(i, j)[0];
            Sat[count] = roi_Crop.at<cv::Vec3b>(i, j)[1];
            Val[count] = roi_Crop.at<cv::Vec3b>(i, j)[2];
            count++;
        }
    }
    HSV_Mode[0] = findMode(Hue);
    HSV_Mode[1] = findMode(Sat);
    HSV_Mode[2] = findMode(Val);
}
I hope this helps.
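One alternative to taking the mode of each channel is to take a low and a high percentile of the sampled values, which yields a min/max pair directly and tolerates a few stray pixels in the square. A minimal sketch in plain C++, assuming the Hue, Sat and Val samples have already been copied out of the ROI as in the loop above. The name percentileRange and the 5%/95% cut-offs are illustrative assumptions, not something from the original post, and the sketch ignores hue wrap-around (reds sit near both ends of the hue scale), which would need separate handling.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns {low, high} so that roughly the central
// (hiFrac - loFrac) share of the samples lies inside the range.
std::pair<int, int> percentileRange(std::vector<int> samples,
                                    double loFrac = 0.05,
                                    double hiFrac = 0.95) {
    assert(!samples.empty());
    assert(0.0 <= loFrac && loFrac <= hiFrac && hiFrac <= 1.0);

    // Sort a copy of the samples so that percentiles become plain indices.
    std::sort(samples.begin(), samples.end());
    const std::size_t last = samples.size() - 1;
    const std::size_t lo = static_cast<std::size_t>(loFrac * last);
    const std::size_t hi = static_cast<std::size_t>(hiFrac * last);
    return {samples[lo], samples[hi]};
}
```

The three resulting pairs could then serve as the lower and upper bounds for the threshold step (for example with cv::inRange), with the trackbars kept for fine adjustment.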

Related

How to use OpenCV stereoCalibrate output to map pixels from one camera to another

Context: I have two cameras of very different focus and size that I want to align for image processing. One is RGB, one is near-infrared. The cameras are in a static rig, so fixed relative to each other. Because the image focus/width are so different, it's hard to even get both images to recognize the chessboard at the same time. Pretty much only works when the chessboard is centered in both images with very little skew/tilt.
I need to perform computations on the aligned images, so I need as good of a mapping between the optical frames as I can get. Right now the results I'm getting are pretty far off. I'm not sure if I'm using the method itself wrong, or if I am misusing the output. Details and image below.
Computation: I am using OpenCV stereoCalibrate to estimate the rotation and translation matrices with the following code, and throwing out bad results based on final error.
int flag = cv::CALIB_FIX_INTRINSIC;
double err = cv::stereoCalibrate(
    temp_points_object_vec, temp_points_alignvec, temp_points_basevec,
    camera_mat_align, camera_distort_align,
    camera_mat_base, camera_distort_base,
    mat_align.size(), rotate_mat, translate_mat, essential_mat, F, flag,
    cv::TermCriteria(cv::TermCriteria::MAX_ITER + cv::TermCriteria::EPS, 30, 1e-6));
if (last_error_ == -1.0 || (err < last_error_ + improve_threshold_)) {
    // -1.0 indicate first calibration, accept points. Other cond indicates acceptable error.
    points_alignvec_.push_back(addalign);
    points_basevec_.push_back(addbase);
    points_object_vec_.push_back(object_points);
}
The result doesn't produce an OpenCV error as is, and due to the large difference between images, more than half of the matched points are rejected. Results are much better since I added the conditional on the error, but still pretty poor. Error as computed above starts around 30, but doesn't get lower than 15-17. For comparison, I believe a "good" error would be <1. So for starters, I know the output isn't great, but on top of that, I'm not sure I'm using the output right for validating visually. I've attached images showing some of the best and worst results I see. The middle image on the right of each shows the "cross-validated" chessboard keypoints. These are computed like this (note addalign is the temporary vector containing only the chessboard keypoints from the current image in the frame to be aligned):
for (int i = 0; i < addalign.size(); i++) {
    cv::Point2f validate_pt;// = rotate_mat * addalign.at(i) + translate_mat;
    // Project pixel from image aligned to 3D
    cv::Point3f ray3d = align_camera_model_.projectPixelTo3dRay(addalign.at(i));
    // Rotate and translate
    rotate_mat.convertTo(rotate_mat, CV_32F);
    cv::Mat temp_result = rotate_mat * cv::Mat(ray3d, false);
    cv::Point3f ray_transformed;
    temp_result.copyTo(cv::Mat(ray_transformed, false));
    cv::Mat tmat = cv::Mat(translate_mat, false);
    ray_transformed.x += tmat.at<float>(0);
    ray_transformed.y += tmat.at<float>(1);
    ray_transformed.z += tmat.at<float>(2);
    // Reproject to base image pixel
    cv::Point2f pixel = base_camera_model_.project3dToPixel(ray_transformed);
    corners_validated.push_back(pixel);
}
Here are two images showing sample outputs, including both raw images, both images with "drawChessboard," and a cross-validated image showing the base image with above-computed keypoints translated from the alignment image.
Better result
Worse result
In the computation of corners_validated, I'm not sure I'm using rotate_mat and translate_mat correctly. I'm sure there is probably an OpenCV method that does this more efficiently, but I just did it the way that made sense to me at the time.
Also relevant: This is all inside a ROS package, using ROS noetic on Ubuntu 20.04 which only permits the use of OpenCV 4.2, so I don't have access to some of the newer opencv methods.
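For reference, the geometry behind the validation loop can be written out without any OpenCV or ROS types. This is a minimal sketch, assuming R and t take a 3D point from the align camera's frame into the base camera's frame (which, as far as I understand the OpenCV documentation, is the convention for the first and second camera passed to stereoCalibrate) and assuming an ideal pinhole model with no distortion. Vec3, Pixel, Intrinsics, Mat3 and mapPoint are hypothetical names. Note that the input has to be a 3D point with real depth: a ray from projectPixelTo3dRay only fixes a direction, so adding the translation to it gives the right pixel only if the ray is first scaled to the true distance of the point.

```cpp
#include <array>
#include <cassert>

struct Vec3 { double x, y, z; };
struct Pixel { double u, v; };
struct Intrinsics { double fx, fy, cx, cy; };  // ideal pinhole, no distortion (assumption)

using Mat3 = std::array<std::array<double, 3>, 3>;

// Apply p_base = R * p_align + t, then project with the base camera intrinsics.
Pixel mapPoint(const Mat3& R, const Vec3& t, const Vec3& pAlign,
               const Intrinsics& base) {
    const Vec3 p{
        R[0][0] * pAlign.x + R[0][1] * pAlign.y + R[0][2] * pAlign.z + t.x,
        R[1][0] * pAlign.x + R[1][1] * pAlign.y + R[1][2] * pAlign.z + t.y,
        R[2][0] * pAlign.x + R[2][1] * pAlign.y + R[2][2] * pAlign.z + t.z};
    assert(p.z > 0.0);  // the point must lie in front of the base camera
    return {base.fx * p.x / p.z + base.cx, base.fy * p.y / p.z + base.cy};
}
```

For chessboard corners the depth does not have to be guessed, since the board pose in the align camera can be estimated from the same detected corners (for example with cv::solvePnP), which gives each corner as a full 3D point.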

Detecting a hand above a chessboard using opencv

I am developing an Android application for analyzing chess games based on a series of photos. To process the images, I am using OpenCV. My question is: how can I detect that there is a player's hand in a picture? I would like to filter out those photos and analyze only the ones that show nothing but the chessboard.
So far I have managed to get the Canny edges, so from an image like this
original image
I am able to get this Canny output.
But I have no idea what to do next...
The code I used to get Canny:
Mat gray, blur, cannyed;
cvtColor(img, gray, CV_BGR2GRAY);
GaussianBlur(gray, blur, Size(7, 7), 0, 0);
Canny(blur, cannyed, 50, 100, 3);
I would highly appreciate any ideas and advice on what to do next and which OpenCV functions I can use.
You have a very nice spectrum in the chess board. A hand in it messes up the frequencies built up by the regular transitions between the black and white squares. Try sliding a larger window (let's say the size of 4.5 x 4.5 squares) across the board and see what happens to the frequencies inside it.
Another approach, if you have the sequence of pictures taken as a movie, is to analyse the motion. Take the difference of consecutive frames (low-pass filter them a bit first) to detect motion. Filter the motion in time (over several frames). Then threshold it to get a binary image. Erode the binary shapes to filter out small moving objects (noise, chess pieces), so that you can detect whether any larger moving shape (e.g. a hand) is on the board.
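The differencing and thresholding steps of that motion-based idea can be sketched in plain C++ as follows. This is only an illustration under assumptions: frames are 8-bit grayscale buffers of equal size, changedFraction is a hypothetical name, and the blurring, temporal filtering and erosion steps are left out. In OpenCV the same steps would map onto calls such as cv::absdiff, cv::threshold and cv::erode.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Fraction of pixels whose grey level changed by more than `threshold`
// between two equally sized 8-bit grayscale frames.
double changedFraction(const std::vector<std::uint8_t>& prev,
                       const std::vector<std::uint8_t>& curr,
                       int threshold) {
    assert(prev.size() == curr.size() && !prev.empty());
    std::size_t changed = 0;
    for (std::size_t i = 0; i < prev.size(); ++i) {
        const int diff = std::abs(static_cast<int>(prev[i]) - static_cast<int>(curr[i]));
        if (diff > threshold) ++changed;
    }
    return static_cast<double>(changed) / static_cast<double>(prev.size());
}
```

A frame pair whose changed fraction stays above some tuned value for several frames in a row would then be a candidate for "hand over the board"; that value depends on the camera and lighting and has to be found experimentally.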
After the Canny edge detection, I tried extracting the horizontal and vertical lines with morphological operations.
Mat horizontal = cannyed.clone();
// Specify size on horizontal axis
int horizontalsize = horizontal.cols / 60;
// Create structure element for extracting horizontal lines through morphology operations
Mat horizontalStructure = getStructuringElement(MORPH_RECT, Size(horizontalsize,1));
erode(horizontal, horizontal, horizontalStructure, Point(-1, -1),2);
dilate(horizontal, horizontal, horizontalStructure, Point(-1, -1),1);
imshow("horizontal",horizontal);
Mat vertical = cannyed.clone();
// Specify size on vertical axis
int verticalsize = vertical.cols / 60;
// Create structure element for extracting vertical lines through morphology operations
Mat verticalStructure = getStructuringElement(MORPH_RECT, Size(1,verticalsize));
erode(vertical, vertical, verticalStructure, Point(-1, -1));
dilate(vertical, vertical, verticalStructure, Point(-1, -1),2);
imshow("vertical",vertical);
The results are:
Horizontal Lines in the chess board
From the figure you can see that the lines are regularly spaced. In the area where the hand is present, the gaps between the lines are larger.
If you find contours in that location, the hand (or any object) over the chess board can be detected.
This approach works for any object placed over the chess board.
Thank you all very much for your suggestions.
So I solved the problem mostly using Gowthaman's method. First I use his code to generate vertical and horizontal lines. Then I combine them like this:
Mat combined = vertical + horizontal;
So I get something like this when there is no hand
or like this when there is a hand.
Next I count white pixels using the code:
int GetPixelCount(Mat image, uchar color)
{
    int result = 0;
    for (int i = 0; i < image.rows; i++)
    {
        for (int j = 0; j < image.cols; j++)
        {
            if (image.at<uchar>(Point(j, i)) == color)
                result++;
        }
    }
    return result;
}
I do that for every photo in the series. The first photo is always without a hand, so I use it as a template. If the current photo has less than 98% of the template's white pixels, then I deduce there is a hand (or something else) in it.
Most likely this is not an optimal method and has lots of weaknesses, but it is very simple and works for me just fine :)
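The template comparison described above boils down to a ratio test on white-pixel counts. Here is a minimal sketch in plain C++, assuming both line images are available as flat 8-bit buffers of the same size; countWhite and somethingOnBoard are hypothetical names, and 0.98 is the threshold quoted in the post. With OpenCV images, the counting itself could be left to cv::countNonZero on the binary image.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Count pixels equal to 255 in a binary image stored as a flat buffer.
long countWhite(const std::vector<std::uint8_t>& img) {
    return static_cast<long>(std::count(img.begin(), img.end(), std::uint8_t{255}));
}

// True when the current image has fewer than minRatio of the template's
// white pixels, i.e. when part of the line grid is covered.
bool somethingOnBoard(const std::vector<std::uint8_t>& templateImg,
                      const std::vector<std::uint8_t>& current,
                      double minRatio = 0.98) {
    assert(templateImg.size() == current.size());
    return countWhite(current) < minRatio * countWhite(templateImg);
}
```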

Extract an object on a sheet of paper

From pictures of tools on a sheet of paper, I'm asked to find their outline contour to vectorize them.
I'm a total beginner in computer-vision-related problems and the only thing I thought about was OpenCV and edge detection.
The result is better than what I had imagined, but it is still very unreliable, especially if the source picture isn't "perfect".
I took 2 photographs of a wrench they gave me.
After playing around with the opencv bindings for node, I get this:
Then I tried with the less-good picture:
That's totally unusable.
I can get something a little better by changing the Canny threshold, but that must be automated (given that the picture is relatively correct).
So I've got a few questions:
Am I taking the right approach? Is GrabCut better for this? A combination of GrabCut and Canny edge detection? I still need vertices at the end, but I feel that GrabCut does what I want too.
The borders are rough and have certain errors. I can increase approxPolyDP's multiplier, but not without a loss of precision on the good parts.
Related to the above point, I'm thinking of integrating the Savitzky-Golay algorithm to smooth the outline, instead of polygon simplification with approxPolyDP. Is it a good idea?
Normally, the line of the outer border must form a simple, cuttable block. Is there a way in OpenCV to prevent that line from doing impossible things, like crossing itself? Or, simply, to detect the problem? Those configurations are, of course, impossible but happen when the detection fails (like in the second pic).
I'm looking for a way to calculate the Canny threshold automatically, since I must tweak it manually for each image. Do you have a good example for that?
I noticed that converting the image to grayscale before edge detection sometimes deteriorates the result, and sometimes makes it better. Which one should I choose? (tools can be of any color btw!)
here is the source for my tests:
const cv = require('opencv');

const lowThresh = 90;
const highThresh = 90;
const nIters = 1;
const GRAY = [120, 120, 120];
const WHITE = [255, 255, 255];

cv.readImage('./files/viv1.jpg', function(err, im) {
    if (err) throw err;
    const width = im.width();
    const height = im.height();
    if (width < 1 || height < 1) throw new Error('Image has no size');

    const out = new cv.Matrix(height, width);
    im.convertGrayscale();
    const im_canny = im.copy();
    im_canny.canny(lowThresh, highThresh);
    im_canny.dilate(nIters);

    // Draw every contour in gray and remember the one with the largest area
    const contours = im_canny.findContours();
    let maxArea = 0;
    let biggestContour;
    for (let i = 0; i < contours.size(); i++) {
        const area = contours.area(i);
        if (area > maxArea) {
            maxArea = area;
            biggestContour = i;
        }
        out.drawContour(contours, i, GRAY);
    }

    // Simplify the largest contour and draw it in white
    const arcLength = contours.arcLength(biggestContour, true);
    contours.approxPolyDP(biggestContour, 0.001 * arcLength, true);
    out.drawContour(contours, biggestContour, WHITE, 5);
    out.save('./tmp/out.png');
    console.log('Image saved to ./tmp/out.png');
});
You'll need to add some pre-processing to clean up the image. Since shadows, poor lighting, high shine on the tools, etc. cause a large variation in intensities across the image, you should equalize it. This will help you get a better response in the regions that are currently poorly lit or have high shine.
Here's an opencv tutorial on Histogram equalization in C++: http://docs.opencv.org/2.4/doc/tutorials/imgproc/histograms/histogram_equalization/histogram_equalization.html
Hope this helps
EDIT:
You can have an automatic threshold based on some loss function(?). For example: if you know that the tool will be completely captured in the frame, you know that you should get a high value at every column from x = 10 to x = 800 (say). You could then keep reducing the threshold until you get a high value at every column in that range. This is a very naive way of doing it, but it's an interesting experiment, I think, since you are generating the images yourself and have control over object placement.
You might also try running your images through an adaptive threshold first. This type of binarization is fairly adept at segmenting foreground and background in cases like this, even with inconsistent lighting/shadows (which seems to be the issue in your second example above). Adaptive thresholding will require some parameter fine-tuning, but once the entire tool is segmented from the background, Canny edge detection should produce more consistent results.
As for the roughness in your contours, you could try setting the approximation method of findContours to one of the CV_CHAIN_APPROX methods described here.
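For the automatic Canny threshold asked about in the question, one widely used heuristic (not an OpenCV feature, and not something proposed in the answers above) is to centre the two thresholds on the median grey level of the image. A minimal sketch in plain C++, assuming the grayscale pixels are available as a flat 8-bit buffer; autoCannyThresholds is a hypothetical name and sigma = 0.33 is just a conventional starting value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Returns {lower, upper} Canny thresholds placed around the median grey level.
// For an even number of pixels the upper of the two middle values is used.
std::pair<int, int> autoCannyThresholds(std::vector<std::uint8_t> gray,
                                        double sigma = 0.33) {
    assert(!gray.empty());
    auto mid = gray.begin() + gray.size() / 2;
    std::nth_element(gray.begin(), mid, gray.end());  // partial sort up to the median
    const double median = *mid;

    const int lower = static_cast<int>(std::max(0.0, (1.0 - sigma) * median));
    const int upper = static_cast<int>(std::min(255.0, (1.0 + sigma) * median));
    return {lower, upper};
}
```

The pair would replace the hand-tuned lowThresh and highThresh values. It tends to help with global brightness differences between photos, but it will not fix strong local shadows, which is where the equalization and adaptive threshold suggestions above come in.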

OpenCV: How to ignore background pixels from colour clusters

I am trying to find the dominant colors in dresses.
1) First step is to remove the background. I did this using the solution mentioned here. It works perfectly and makes the background black.
2) Now with the result of the first step I am trying to find dominant colors using the solution mentioned here. But I am getting black (the background) as one of the dominant colours.
How can I ignore the background pixels in step 2?
Depending on the case, you could find the bounding rectangle of the region that you're interested in. If the number of color pixels is much higher than the number of black pixels inside that bounding rectangle, black shouldn't be detected as the dominant color.
Call findContours(binaryMask) on the binary image of your mask. Make sure you found just the contour you were looking for. If not, filter them to get the best one for the application. Then call boundingRect(cnt) on the contour. Then crop the image using that rectangle and run your function. If that's insufficient, try minAreaRect(cnt), but the cropping is a bit trickier: see this answer.
If that doesn't work, I'd probably go for the "dumb" solution: change the color of the mask to a color that will almost certainly (99%) not appear on a dress and then, knowing its exact RGB values, filter it out from the results.
Next time please remember to provide an image of your case, so the answers may be more accurate.
One easy way to do it would be to simply discard black as a dominant colour. Grab one more cluster than you really want, ignore black. If black may genuinely be the dominant colour, repeat the operation with a different background colour and discard that; compare results. This would be slow, but simple to do.
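The "one more cluster than you want, then ignore black" idea can be sketched as a small post-processing step on the clustering result. This is plain C++ under assumptions: Cluster and dropBackgroundCluster are hypothetical names, the centres are in BGR order, and pixelCount is assumed to have been tallied from the labels returned by the clustering step.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

struct Cluster {
    std::array<float, 3> bgr;  // cluster centre
    int pixelCount;            // number of pixels assigned to this cluster
};

// Removes the cluster whose centre is closest to the background colour and
// returns the remaining clusters, most populated first.
std::vector<Cluster> dropBackgroundCluster(
        std::vector<Cluster> clusters,
        std::array<float, 3> background = {{0.f, 0.f, 0.f}}) {
    assert(!clusters.empty());
    auto dist2 = [&background](const Cluster& c) {
        float d = 0.f;
        for (int k = 0; k < 3; ++k) {
            const float diff = c.bgr[k] - background[k];
            d += diff * diff;
        }
        return d;
    };
    auto nearest = std::min_element(
        clusters.begin(), clusters.end(),
        [&dist2](const Cluster& a, const Cluster& b) { return dist2(a) < dist2(b); });
    clusters.erase(nearest);

    std::sort(clusters.begin(), clusters.end(),
              [](const Cluster& a, const Cluster& b) { return a.pixelCount > b.pixelCount; });
    return clusters;
}
```

As the answer notes, this misfires when the dress itself is black, which is why repeating the run with a different background colour and comparing the results is suggested.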
Alternatively, you could only sample from pixels in your foreground. From your foreground extraction method, you should have a binary black and white foreground/background mask. If you only sample from white areas of the mask, then only these colours should be taken into consideration.
I have a rough C++ implementation of this, but it's almost certainly not the most efficient possible. Maybe it's a start you could work from?
Mat src;  // Your source image (8-bit, 3 channels)
Mat mask; // Your black & white foreground/background image (8-bit, 1 channel)

// Allocate one row per foreground pixel, so that no background
// (or uninitialised) rows are passed to kmeans
int foregroundCount = countNonZero(mask == 255);
Mat samples(foregroundCount, 3, CV_32F);

// Set up samples with only foreground pixels
int row = 0;
for (int y = 0; y < src.rows; y++) {
    for (int x = 0; x < src.cols; x++) {
        if (mask.at<uchar>(y, x) == 255) {
            for (int z = 0; z < 3; z++) {
                samples.at<float>(row, z) = src.at<Vec3b>(y, x)[z];
            }
            row++;
        }
    }
}

int clusterNo = 3;
int attempts = 5;
Mat labels;
Mat centers;
kmeans(samples, clusterNo, labels, TermCriteria(), attempts, KMEANS_RANDOM_CENTERS, centers);
Your dominant colours will be stored in the rows of centers, where you can do what you want with them.
Remove the background. That gives you a binary image - foreground and background pixels. Now do a morphological closing to close up little holes in foreground images and generally clean up the contours. Finally substitute pixels back in again to get a colour foreground image.

Determining the average distance of pixels (to the centre of an image) in OpenCV

I'm trying to figure out how to do the following calculation in OpenCV.
Assuming a binary image (black/white):
Average distance of white pixels from the centre of the image. An image with most of its white pixels near the edges will have a high score, whereas an image with most white pixels near the centre will have a low score.
I know how to do this manually with loops, but since I'm working in Java I'd rather offload it to a set of high-performance native OpenCV calls.
Thanks
distanceTransform() is almost what you want. Unfortunately, it only calculates distance to the nearest black pixel, which means the data must be massaged a little bit. The image needs to contain only a single black pixel at the center for distanceTransform() to work properly.
My method is as follows:
Set all black pixels to an intermediate value
Set the center pixel to black
Call distanceTransform() on the modified image
Calculate the mean distance via mean(), using the white pixels in the binary image as a mask
Example code is below. It's in C++, but you should be able to get the idea:
cv::Mat img; // binary image
img.setTo(128, img == 0);
img.at<uchar>(img.rows/2, img.cols/2) = 0; // Set center point to zero
cv::Mat dist;
cv::distanceTransform(img, dist, CV_DIST_L2, 3); // Can be tweaked for desired accuracy
cv::Scalar val = cv::mean(dist, img == 255);
double mean = val[0];
With that said, I recommend you test whether this method is actually any faster than iterating in a loop. This method does a fair bit more processing than necessary to accommodate the API call.
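For that comparison, the plain loop is short enough to write down as a baseline. A minimal sketch in standard C++, assuming the binary image is a row-major 8-bit buffer where white is 255; meanWhiteDistance is a hypothetical name. Unlike the distanceTransform() version, it uses exact Euclidean distances and counts a white centre pixel with distance 0.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mean Euclidean distance of white (255) pixels from the centre pixel
// (rows/2, cols/2). Returns 0 when the image contains no white pixels.
double meanWhiteDistance(const std::vector<std::uint8_t>& img, int rows, int cols) {
    assert(rows > 0 && cols > 0);
    assert(img.size() == static_cast<std::size_t>(rows) * static_cast<std::size_t>(cols));

    const int cy = rows / 2;
    const int cx = cols / 2;
    double sum = 0.0;
    long count = 0;
    for (int y = 0; y < rows; ++y) {
        for (int x = 0; x < cols; ++x) {
            if (img[static_cast<std::size_t>(y) * cols + x] == 255) {
                sum += std::hypot(static_cast<double>(x - cx), static_cast<double>(y - cy));
                ++count;
            }
        }
    }
    return count > 0 ? sum / static_cast<double>(count) : 0.0;
}
```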

Resources