I want to segment the trees in an aerial image.
Sample image (original image):
I expect a result like this (or better):
The first thing I tried was the threshold function in OpenCV, but it did not give the expected result (it cannot segment the tree crowns). I then applied the black-and-white filter in Photoshop with some adjusted parameters (the result is shown below), followed by a threshold and a morphological filter, and got the result shown above.
My question: is there a way to segment the image without going through Photoshop first, and produce a segmented image like the second image (or better)? Or is there a way to produce an image like the third image?
PS: you can read the Photoshop B&W filter question here: https://dsp.stackexchange.com/questions/688/whats-the-algorithm-behind-photoshops-black-and-white-adjustment-layer
You can do it in OpenCV. The code below will basically do the same operations you did in Photoshop. You may need to tune some of the parameters to get exactly what you want.
#include <opencv2/opencv.hpp>
using namespace cv;
int main(int, char**)
{
Mat3b img = imread("path_to_image");
// Use HSV color to threshold the image
Mat3b hsv;
cvtColor(img, hsv, COLOR_BGR2HSV);
// Apply a threshold
// HSV values in OpenCV (8-bit images) are not in [0,100], but:
// H in [0,179] (degrees / 2)
// S,V in [0,255]
Mat1b res;
inRange(hsv, Scalar(100, 80, 100), Scalar(120, 255, 255), res);
// Negate the image
res = ~res;
// Apply morphology
Mat element = getStructuringElement( MORPH_ELLIPSE, Size(5,5));
morphologyEx(res, res, MORPH_ERODE, element, Point(-1,-1), 2);
morphologyEx(res, res, MORPH_OPEN, element);
// Blending
Mat3b green(res.size(), Vec3b(0,0,0));
for(int r=0; r<res.rows; ++r) {
for(int c=0; c<res.cols; ++c) {
if(res(r,c)) { green(r,c)[1] = uchar(255); }
}
}
Mat3b blend;
addWeighted(img, 0.7, green, 0.3, 0.0, blend);
imshow("result", res);
imshow("blend", blend);
waitKey();
return 0;
}
The resulting image is:
The blended image is:
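Photoshop and most color pickers give hue in degrees (0 to 360) and saturation/value in percent, while the bounds passed to inRange above are on OpenCV's 8-bit scale. A small conversion helper may make tuning easier. This is only a sketch: the function name is made up for illustration, and it assumes 8-bit images, where OpenCV stores H/2.

```python
def to_opencv_hsv(h_deg, s_pct, v_pct):
    """Convert (H in degrees, S in %, V in %) to OpenCV's 8-bit HSV scale."""
    h = int(round(h_deg / 2.0)) % 180        # OpenCV stores H/2 so it fits in a byte
    s = int(round(s_pct * 255 / 100.0))      # 0..100 %  ->  0..255
    v = int(round(v_pct * 255 / 100.0))
    return h, s, v

# Bounds roughly matching the inRange call in the C++ code above
lower = to_opencv_hsv(200, 31, 39)    # (100, 79, 99)
upper = to_opencv_hsv(240, 100, 100)  # (120, 255, 255)
```

The two tuples can then be used as the lower and upper Scalar arguments of inRange.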
This has been an interesting topic of research in the past - mainly in the remote sensing literature.
While the morphological methods proposed using OpenCV will work in certain cases, you might want to consider more sophisticated approaches (depending on how variable your data is and how robust a detector you want to build).
For example, this paper, and those that cite it, give you a flavour of what has been attempted.
Pragmatically speaking, I think a neat solution would be one founded more on statistical texture analysis. There are many ways to classify (and then count) regions of an image as belonging to a texture (co-occurrence matrices, filter banks, textons, wavelets, etc.).
Sadly, this is an area where OpenCV is rather deficient: it only provides a subset of the useful algorithms out there. However, here are a few quick ideas (none of which I have tried directly; they are just the ones I know of that build on OpenCV):
Use OpenCV Gabor filter support and cluster (for example).
You could also possibly train an OpenCV SVM with Local Binary Patterns.
A new library - but probably not so relevant for static images - LIBDT
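To give an idea of what one of the texture descriptors mentioned above looks like, here is a minimal NumPy sketch of a grey-level co-occurrence matrix and its contrast statistic. The function names, the 8-level quantisation and the one-pixel horizontal offset are all assumptions chosen for illustration; it is not an OpenCV API.

```python
import numpy as np

def glcm(gray, levels=8, dx=1, dy=0):
    """Normalised co-occurrence matrix of an 8-bit image for offset (dx, dy) >= 0."""
    q = (gray.astype(np.int64) * levels) // 256       # quantise 0..255 to 0..levels-1
    h, w = q.shape
    a = q[0:h - dy, 0:w - dx]                         # reference pixels
    b = q[dy:h, dx:w]                                 # their neighbours at the offset
    m = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(m, (a.ravel(), b.ravel()), 1)           # count each (level, level) pair
    return m / m.sum()

def contrast(p):
    """Contrast statistic: large when neighbouring pixels differ a lot."""
    i, j = np.indices(p.shape)
    return float(np.sum(p * (i - j) ** 2))

flat = np.zeros((8, 8), dtype=np.uint8)
checker = (np.indices((8, 8)).sum(axis=0) % 2 * 255).astype(np.uint8)
print(contrast(glcm(flat)), contrast(glcm(checker)))
```

Computed over small windows, statistics like this give a per-region feature vector that could be clustered or fed to a classifier to separate tree crowns from smoother ground.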
Anyways, I hope you get something that just works for your purposes!
I understand how to convert a BGR image to YCrCb format using cvtColor() and separate the channels using split() or mixChannels() in OpenCV. However, these channels are displayed as grayscale images because they are CV_8UC1 Mats.
I would like to display the Cb and Cr channels in color, like the Barns image on Wikipedia.
I found this solution in Matlab, but how do I do it in OpenCV?
Furthermore, the mentioned solution displays the Cb and Cr channels by "fill[ing] the other channels with a constant value of 50%". My question is:
Is this the common way to display the Cr and Cb channels? Or are there any recommendations or specifications for displaying them?
I wrote the code from scratch as described in the answer. It looks like this is what you need.
Mat bgr_image = imread("lena.png");
Mat yCrCb_image;
cvtColor(bgr_image, yCrCb_image, CV_BGR2YCrCb);
Mat yCrCbChannels[3];
split(yCrCb_image, yCrCbChannels);
Mat half(yCrCbChannels[0].size(), yCrCbChannels[0].type(), 127);
vector<Mat> yChannels = { yCrCbChannels[0], half, half };
Mat yPlot;
merge(yChannels, yPlot);
cvtColor(yPlot, yPlot, CV_YCrCb2BGR);
imshow("y", yPlot);
vector<Mat> CrChannels = { half, yCrCbChannels[1], half };
Mat CrPlot;
merge(CrChannels, CrPlot);
cvtColor(CrPlot, CrPlot, CV_YCrCb2BGR);
imshow("Cr", CrPlot);
vector<Mat> CbChannels = { half, half, yCrCbChannels[2] };
Mat CbPlot;
merge(CbChannels, CbPlot);
cvtColor(CbPlot, CbPlot, CV_YCrCb2BGR);
imshow("Cb", CbPlot);
waitKey(0);
As for converting grayscale images to a color format: usually all color channels (B, G, R) are set to the same grayscale value. The CV_GRAY2BGR mode in OpenCV is implemented that way.
As for "fills the other channels with a constant value of 50%", I believe it is a common way to visualize color spaces such as YCbCr and Lab. I did not find any articles or descriptions of this approach, but I think it is driven by visualization purposes. Indeed, if we fill the other channels with zero, fundamentally nothing changes: we can still see the influence of each channel, but the picture does not look very nice:
So the aim of this approach is to make the visualization more colorful.
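One way to see why 50% is the natural filler: in the 8-bit YCrCb conversion documented for cvtColor, the chroma channels are stored with an offset of 128, so Cr = Cb = 128 means "no color" and the pixel comes out gray. The per-pixel sketch below uses the coefficients from the OpenCV color conversion documentation; the function name is made up for illustration.

```python
def ycrcb_to_bgr(y, cr, cb, delta=128):
    """Per-pixel 8-bit YCrCb -> BGR, following the formulas in the cvtColor docs."""
    r = y + 1.403 * (cr - delta)
    g = y - 0.714 * (cr - delta) - 0.344 * (cb - delta)
    b = y + 1.773 * (cb - delta)
    clip = lambda v: max(0, min(255, int(round(v))))   # saturate to 0..255
    return clip(b), clip(g), clip(r)

print(ycrcb_to_bgr(100, 128, 128))   # neutral chroma: (100, 100, 100), pure gray
print(ycrcb_to_bgr(128, 200, 128))   # high Cr with mid Y and neutral Cb: reddish
print(ycrcb_to_bgr(128, 128, 200))   # high Cb with mid Y and neutral Cr: bluish
```

With zero instead of 128 in the unused channels, both chroma offsets become strongly negative and Y is black, which is why the zero-filled pictures look dark and oddly tinted. The value 127 used in the code above is one step off neutral, which is barely visible.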
For those who want the code in Python (it is the same as @akarsakov's), here it is:
import cv2
import numpy as np
img = cv2.imread(r"lena.png")
imgYCC = cv2.cvtColor(img, cv2.COLOR_BGR2YCR_CB)
Y,Cr,Cb = cv2.split(imgYCC)
half = np.full_like(Y, 127)  # constant mid-value plane, same shape and dtype as Y
merge_Y = cv2.merge([Y, half, half])
merge_Cb = cv2.merge([half, half, Cb])
merge_Cr = cv2.merge([half, Cr, half])
merge_Y = cv2.cvtColor(merge_Y, cv2.COLOR_YCrCb2BGR)
merge_Cb = cv2.cvtColor(merge_Cb, cv2.COLOR_YCrCb2BGR)
merge_Cr = cv2.cvtColor(merge_Cr, cv2.COLOR_YCrCb2BGR)
cv2.imwrite(r'Y.png', merge_Y)
cv2.imwrite(r'Cb.png', merge_Cb)
cv2.imwrite(r'Cr.png', merge_Cr)
The result isn't the same, by the way; it looks more like what Google shows when you search for Y Cb Cr images.
Or maybe I made a mistake with split and/or BGR.
I am doing a project of combining multiple images similar to HDR in iOS. I have managed to get 3 images of different exposures through the Camera and now I want to align them because during the capture, one's hand must have shaken and resulted in all 3 images having slightly different alignment.
I have imported the OpenCV framework and have been exploring its functions for aligning/registering images, but found nothing. Is there actually a function in OpenCV to achieve this? If not, are there any alternatives?
Thanks!
In OpenCV 3.0 you can use findTransformECC. I have copied this ECC Image Alignment code from LearnOpenCV.com where a very similar problem is solved for aligning color channels. The post also contains code in Python. Hope this helps.
// Read the images to be aligned
Mat im1 = imread("images/image1.jpg");
Mat im2 = imread("images/image2.jpg");
// Convert images to gray scale;
Mat im1_gray, im2_gray;
cvtColor(im1, im1_gray, CV_BGR2GRAY);
cvtColor(im2, im2_gray, CV_BGR2GRAY);
// Define the motion model
const int warp_mode = MOTION_EUCLIDEAN;
// Set a 2x3 or 3x3 warp matrix depending on the motion model.
Mat warp_matrix;
// Initialize the matrix to identity
if ( warp_mode == MOTION_HOMOGRAPHY )
warp_matrix = Mat::eye(3, 3, CV_32F);
else
warp_matrix = Mat::eye(2, 3, CV_32F);
// Specify the number of iterations.
int number_of_iterations = 5000;
// Specify the threshold of the increment
// in the correlation coefficient between two iterations
double termination_eps = 1e-10;
// Define termination criteria
TermCriteria criteria (TermCriteria::COUNT+TermCriteria::EPS, number_of_iterations, termination_eps);
// Run the ECC algorithm. The results are stored in warp_matrix.
findTransformECC(
im1_gray,
im2_gray,
warp_matrix,
warp_mode,
criteria
);
// Storage for warped image.
Mat im2_aligned;
if (warp_mode != MOTION_HOMOGRAPHY)
// Use warpAffine for Translation, Euclidean and Affine
warpAffine(im2, im2_aligned, warp_matrix, im1.size(), INTER_LINEAR + WARP_INVERSE_MAP);
else
// Use warpPerspective for Homography
warpPerspective (im2, im2_aligned, warp_matrix, im1.size(),INTER_LINEAR + WARP_INVERSE_MAP);
// Show final result
imshow("Image 1", im1);
imshow("Image 2", im2);
imshow("Image 2 Aligned", im2_aligned);
waitKey(0);
There is no single function called something like align; you need to implement it yourself, or find an existing implementation.
Here is one solution.
You need to extract keypoints from all 3 images and try to match them. Make sure that your keypoint extraction technique is invariant to illumination changes, since the images have different intensity values because of the different exposures. You need to match your keypoints and find the disparity between them. Then you can use the disparity to align your images.
Keep in mind that this answer is very superficial. For the details, you first need to do some research on keypoint/descriptor extraction and keypoint/descriptor matching.
Good luck!
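As a rough illustration of the last step only (turning matched keypoints into an alignment), here is a NumPy sketch that assumes the hand shake is a pure translation and that the matching has already been done. The function names and the sample coordinates are made up; using the median is a choice made here to tolerate a few bad matches.

```python
import numpy as np

def estimate_shift(pts_ref, pts_mov):
    """Median displacement (dx, dy) that moves pts_mov onto pts_ref."""
    d = np.asarray(pts_ref, dtype=float) - np.asarray(pts_mov, dtype=float)
    dx, dy = np.median(d, axis=0)         # median ignores a minority of wrong matches
    return float(dx), float(dy)

def translation_matrix(dx, dy):
    """2x3 affine matrix for a pure translation, in the layout warpAffine expects."""
    return np.array([[1, 0, dx], [0, 1, dy]], dtype=np.float32)

# Hypothetical matched keypoints; the last pair is a deliberate mismatch
ref = [(10, 10), (50, 20), (30, 40), (70, 80), (5, 5)]
mov = [(x - 3, y + 2) for x, y in ref]
mov[4] = (100, 100)
M = translation_matrix(*estimate_shift(ref, mov))
print(M)
```

The resulting matrix could be passed to warpAffine (without WARP_INVERSE_MAP) to move the second image onto the first. Rotation or perspective changes would need a richer model, such as the ones findTransformECC supports.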
I am currently doing a bit of computer vision using OpenCV. I have a sample of bottles with labels on them. I am trying to determine when a bottle does not have a label on it.
The label is rectangular in shape.
I have done edge detection using Canny. I have tried using findContours() to detect whether a bottle has an inner contour (this would represent the rectangular label).
If your problem is this simple, just crop your image to a rectangle around the label position.
cv::Mat image = imread("image.png");
cv::Rect labelRegion(50, 200, 50, 50);
cv::Mat labelImage = image(labelRegion);
Then decompose your image region into three channels.
cv::Mat channels[3];
cv::split(labelImage, channels);
cv::Mat labelImageRed = channels[2];
cv::Mat labelImageGreen = channels[1];
cv::Mat labelImageBlue = channels[0];
Then threshold each of these single-channel images and count the number of zero/nonzero pixels.
I'm not providing code for this part!
If there is no label in the image, then each channel has values bigger than ~200 (you should check this). If there is a label, the zero/nonzero pixel counts will differ from those of the unlabeled bottle.
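A possible shape for the thresholding and counting step, written with NumPy only. This is a sketch: the function names, the 200 cut-off taken from the paragraph above, and the 0.9 "mostly bright" fraction are assumptions that would have to be tuned on real bottle images.

```python
import numpy as np

def bright_fraction(roi_bgr, thresh=200):
    """Fraction of pixels above thresh, computed separately for B, G and R."""
    mask = roi_bgr > thresh
    return mask.reshape(-1, roi_bgr.shape[2]).mean(axis=0)

def has_label(roi_bgr, thresh=200, min_bright=0.9):
    """Guess 'label present' if any channel is not almost entirely bright."""
    return bool(np.any(bright_fraction(roi_bgr, thresh) < min_bright))

# Synthetic 50x50 regions: a plain bright bottle, and one with a reddish patch
plain = np.full((50, 50, 3), 230, dtype=np.uint8)
labeled = plain.copy()
labeled[:, :25] = (40, 40, 200)      # BGR: dark blue/green, strong red
print(has_label(plain), has_label(labeled))   # False True
```

Here roi_bgr plays the role of labelImage from the C++ snippet above.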
#include <opencv2/opencv.hpp>
using namespace cv;
using namespace std;
int main()
{
Mat img=imread("c:/data/bottles/1.png");
Mat gray;
cvtColor(img,gray,CV_BGR2GRAY);
resize(gray,gray,Size(50,100));
Sobel(gray,gray,CV_16SC1,0,1);
convertScaleAbs(gray,gray);
if(sum(gray)[0]<130000)
{
cout<<"no label";
}else{
cout<<"has label";
}
imshow("gray",gray);
waitKey();
return 0;
}
I am guessing it should be enough to just check whether there is text on the bottle (if yes, it has a label; if not, it doesn't). You could check out a project like THIS. There are numerous papers in this area; some of the more famous ones are by the Stanford CV group: 1 and 2.
HTH
guneykayim suggested image segmentation, which I feel would be the easiest method. I am just adding a little bit more.
My suggestion is that you convert your BGR image into YCbCr and then look for values within the Cb and Cr channels that match the color of your label. This will allow you to segment out colors easily even if the lighting conditions on the bottle change (on a darkly lit bottle, white regions will appear dark gray, which can be a problem if you have gray labeling).
Something like this should work in Python:
# Required modules
import cv2
import numpy
# Convert image to YCrCb
imageYCrCb = cv2.cvtColor(sourceImage,cv2.COLOR_BGR2YCR_CB)
# Constants for finding range of label color in YCrCb
# a, b, c and d need to be defined
# Y spans the full 0..255 range so that only Cr and Cb restrict the match
min_YCrCb = numpy.array([0,a,b],numpy.uint8)
max_YCrCb = numpy.array([255,c,d],numpy.uint8)
# Threshold the image to produce blobs that indicate the labeling
labelRegion = cv2.inRange(imageYCrCb,min_YCrCb,max_YCrCb)
# Just in case you are interested in going an extra step
# ([-2:] keeps this working whether findContours returns 2 or 3 values)
contours, hierarchy = cv2.findContours(labelRegion, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
# Draw the contour on the source image
for i, c in enumerate(contours):
    area = cv2.contourArea(c)
    if area > minArea:  # minArea needs to be defined, try 300 square pixels
        cv2.drawContours(sourceImage, contours, i, (0, 255, 0), 3)
The function cv2.inRange() will also work in case you decide to work with the BGR image.
Reference:
http://en.wikipedia.org/wiki/YCbCr
How can I threshold this blurry image to make the digits as clear as possible?
In a previous post, I tried adaptively thresholding a blurry image (left), which resulted in distorted and disconnected digits (right):
Since then, I've tried using a morphological closing operation as described in this post to make the brightness of the image uniform:
If I adaptively threshold this image, I don't get significantly better results. However, because the brightness is approximately uniform, I can now use an ordinary threshold:
This is a lot better than before, but I have two problems:
I had to manually choose the threshold value. Although the closing operation results in uniform brightness, the level of brightness might be different for other images.
Different parts of the image would do better with slight variations in the threshold level. For instance, the 9 and 7 in the top left come out partially faded and should have a lower threshold, while some of the 6s have fused into 8s and should have a higher threshold.
I thought that going back to an adaptive threshold, but with a very large block size (1/9th of the image) would solve both problems. Instead, I end up with a weird "halo effect" where the centre of the image is a lot brighter, but the edges are about the same as the normally-thresholded image:
Edit: remi suggested morphologically opening the thresholded image at the top right of this post. This doesn't work too well. Using elliptical kernels, only a 3x3 is small enough to avoid obliterating the image entirely, and even then there are significant breakages in the digits:
Edit2: mmgp suggested using a Wiener filter to remove blur. I adapted this code for Wiener filtering in OpenCV to OpenCV4Android, but it makes the image even blurrier! Here's the image before (left) and after filtering with my code and a 5x5 kernel:
Here is my adapted code, which filters in-place:
private void wiener(Mat input, int nRows, int nCols) { // I tried nRows=5 and nCols=5
Mat localMean = new Mat(input.rows(), input.cols(), input.type());
Mat temp = new Mat(input.rows(), input.cols(), input.type());
Mat temp2 = new Mat(input.rows(), input.cols(), input.type());
// Create the kernel for convolution: a constant matrix with nRows rows
// and nCols cols, normalized so that the sum of the pixels is 1.
Mat kernel = new Mat(nRows, nCols, CvType.CV_32F, new Scalar(1.0 / (double) (nRows * nCols)));
// Get the local mean of the input. localMean = convolution(input, kernel)
Imgproc.filter2D(input, localMean, -1, kernel, new Point(nCols/2, nRows/2), 0);
// Get the local variance of the input. localVariance = convolution(input^2, kernel) - localMean^2
Core.multiply(input, input, temp); // temp = input^2
Imgproc.filter2D(temp, temp, -1, kernel, new Point(nCols/2, nRows/2), 0); // temp = convolution(input^2, kernel)
Core.multiply(localMean, localMean, temp2); //temp2 = localMean^2
Core.subtract(temp, temp2, temp); // temp = localVariance = convolution(input^2, kernel) - localMean^2
// Estimate the noise as mean(localVariance)
Scalar noise = Core.mean(temp);
// Compute the result. result = localMean + max(0, localVariance - noise) / max(localVariance, noise) * (input - localMean)
Core.max(temp, noise, temp2); // temp2 = max(localVariance, noise)
Core.subtract(temp, noise, temp); // temp = localVariance - noise
Core.max(temp, new Scalar(0), temp); // temp = max(0, localVariance - noise)
Core.divide(temp, temp2, temp); // temp = max(0, localVar-noise) / max(localVariance, noise)
Core.subtract(input, localMean, input); // input = input - localMean
Core.multiply(temp, input, input); // input = max(0, localVariance - noise) / max(localVariance, noise) * (input - localMean)
Core.add(input, localMean, input); // input = localMean + max(0, localVariance - noise) / max(localVariance, noise) * (input - localMean)
}
Some hints that you might try out:
Apply a morphological opening to your original thresholded image (the noisy one at the right of the first picture). You should get rid of most of the background noise and be able to reconnect the digits.
Use a different preprocessing of your original image instead of the morphological closing, such as a median filter (which tends to blur the edges) or bilateral filtering, which preserves the edges better but is slower to compute.
As far as the threshold is concerned, you can pass the THRESH_OTSU flag (CV_THRESH_OTSU in the old C API) to cv::threshold to determine an optimal value for a global threshold. Local thresholding might still be better, and it should work better after the bilateral or median filter.
I've tried thresholding each 3x3 box separately, using Otsu's algorithm (THRESH_OTSU, thanks remi!) to determine an optimal threshold value for each box. This works a bit better than thresholding the entire image, and is probably a bit more robust.
Better solutions are welcome, though.
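For readers who want to see what the per-box Otsu approach amounts to, here is a NumPy-only sketch. It reimplements Otsu's method from its definition rather than calling OpenCV, and it assumes the sudoku grid fills the image so that splitting into 3 by 3 equal blocks lines up with the boxes. The function names are made up.

```python
import numpy as np

def otsu_threshold(gray):
    """Threshold t (0..255) maximising the between-class variance of an 8-bit image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    w0 = np.cumsum(hist)                         # number of pixels <= t
    w1 = hist.sum() - w0                         # number of pixels >  t
    cum = np.cumsum(hist * np.arange(256))       # intensity mass of pixels <= t
    with np.errstate(divide="ignore", invalid="ignore"):
        m0 = cum / w0                            # mean of the lower class
        m1 = (cum[-1] - cum) / w1                # mean of the upper class
        between = w0 * w1 * (m0 - m1) ** 2
    return int(np.argmax(np.nan_to_num(between)))

def threshold_blocks(gray, rows=3, cols=3):
    """Binarise each of rows x cols blocks with its own Otsu threshold."""
    out = np.zeros_like(gray)
    ys = np.linspace(0, gray.shape[0], rows + 1).astype(int)
    xs = np.linspace(0, gray.shape[1], cols + 1).astype(int)
    for y0, y1 in zip(ys[:-1], ys[1:]):
        for x0, x1 in zip(xs[:-1], xs[1:]):
            block = gray[y0:y1, x0:x1]
            out[y0:y1, x0:x1] = np.where(block > otsu_threshold(block), 255, 0)
    return out

# Synthetic test: two boxes with different brightness, each with a darker "digit"
gray = np.full((90, 90), 200, dtype=np.uint8)
gray[:30, :30] = 100
gray[10:20, 10:20] = 40
gray[40:50, 40:50] = 120
out = threshold_blocks(gray)
```

Because each box gets its own threshold, a dim box and a bright box are both separated cleanly, which a single global value cannot always do.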
If you're willing to spend some cycles on it there are de-blurring techniques that could be used to sharpen up the picture prior to processing. Nothing in OpenCV yet but if this is a make-or-break kind of thing you could add it.
There's a bunch of literature on the subject:
http://www.cse.cuhk.edu.hk/~leojia/projects/motion_deblurring/index.html
http://www.google.com/search?q=motion+deblurring
And some chatter on the OpenCV mailing list:
http://tech.groups.yahoo.com/group/OpenCV/message/20938
The weird "halo effect" that you're seeing is likely due to OpenCV assuming black for the color when the adaptive threshold is at or near the edge of the image and the window that it's using "hangs over" the edge into non-image territory. There are ways to correct for this. Most likely you would make a temporary image that's at least two full block sizes taller and wider than the image from the camera, copy the camera image into the middle of it, and then set the surrounding "blank" portion of the temporary image to the average color of the camera image. Now when you perform the adaptive threshold, the data at or near the edges will be much closer to accurate. It won't be perfect since it's not a real picture, but it will yield better results than the black that OpenCV is assuming is there.
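The padding step described above could look roughly like this in NumPy. It is a sketch only: the function names are made up, block stands for the adaptive-threshold block size, and the thresholding itself is left out.

```python
import numpy as np

def pad_with_mean(gray, block):
    """Surround the image with a border of width `block` filled with its mean grey."""
    fill = int(round(float(gray.mean())))
    return np.pad(gray, block, mode="constant", constant_values=fill)

def crop_back(img, block):
    """Remove the border again after thresholding the padded image."""
    return img[block:-block, block:-block]

gray = np.full((4, 6), 100, dtype=np.uint8)
gray[0, 0] = 0
padded = pad_with_mean(gray, 3)
print(padded.shape, padded[0, 0])    # (10, 12) 96
```

The adaptive threshold would be run on padded, and crop_back applied to its output to get back to the original size.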
My proposal assumes you can identify the sudoku cells, which I think, is not asking too much. Trying to apply morphological operators (although I really like them) and/or binarization methods as a first step is the wrong way here, in my opinion of course. Your image is at least partially blurry, for whatever reason (original camera angle and/or movement, among other reasons). So what you need is to revert that, by performing a deconvolution. Of course asking for a perfect deconvolution is too much, but we can try some things.
One of these "things" is the Wiener filter, and in Matlab, for instance, the function is named deconvwnr. I noticed the blur to be in the vertical direction, so we can perform a deconvolution with a vertical kernel of a certain length (10 in the following example) and also assume the input is not noise free (an assumption of 5%). I'm just trying to give a very superficial view here, so take it easy. In Matlab, your problem is at least partially solved by doing:
f = imread('some_sudoku_cell.png');
g = deconvwnr(f, fspecial('motion', 10, 90), 0.05);
h = im2bw(g, graythresh(g)); % graythresh is the Otsu method
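Since the question is about OpenCV rather than Matlab, here is a frequency-domain sketch of a Wiener deconvolution in NumPy. It is not a port of deconvwnr: it uses the textbook filter conj(H) / (|H|^2 + nsr) with a constant noise-to-signal ratio, treats the image as periodic, and anchors the kernel at the top-left corner. The function names are made up.

```python
import numpy as np

def wiener_deconvolve(blurred, psf, nsr):
    """Wiener deconvolution with a constant noise-to-signal ratio nsr."""
    H = np.fft.fft2(psf, s=blurred.shape)         # zero-pad the kernel to image size
    G = np.conj(H) / (np.abs(H) ** 2 + nsr)       # Wiener filter in frequency domain
    return np.real(np.fft.ifft2(np.fft.fft2(blurred) * G))

def vertical_motion_psf(length):
    """Column kernel approximating vertical motion blur over `length` pixels."""
    return np.full((length, 1), 1.0 / length)

# Round trip on synthetic data: blur with a known vertical kernel, then undo it
rng = np.random.default_rng(0)
img = rng.random((16, 12))
psf = np.array([[0.5], [0.3], [0.2]])
blurred = np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(psf, s=img.shape)))
restored = wiener_deconvolve(blurred, psf, 1e-12)
print(np.abs(restored - img).max())
```

For a real photo one would try something like wiener_deconvolve(cell, vertical_motion_psf(10), 0.05), mirroring the Matlab call above. Because the kernel is anchored at the corner rather than centered, the output may come out shifted vertically by a few pixels.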
Here are the results from some of your cells (original, otsu, otsu of region growing, morphological enhanced image, otsu from morphological enhanced image with region growing, otsu of deconvolution):
The enhanced image was produced by performing original + tophat(original) - bottomhat(original) with a flat disk of radius 3. I manually picked the seed point for region growing and manually picked the best threshold.
For empty cells you get weird results (original and Otsu of deconvolution):
But I don't think you would have trouble detecting whether a cell is empty or not (the global threshold already solves it).
EDIT:
Added the best results I could get with a different approach: region growing. I also attempted some other approaches, but this was the second best one.
I'm throwing this out there in the hope that someone has attempted something this ridiculous before. My goal is to take an input image and segment it based on the standard deviation of a small window around each pixel. Basically, this should mathematically resemble a Gaussian or box filter, in that it is applied over a window around each pixel whose size is specified by the user at compile time (or even at run time), and the destination array contains the SD at each pixel, in an image the same size as the original.
The idea is to do this on an image in HSV space, so that I can easily find regions of homogeneous color (i.e. those with small local SDs in the Hue and Sat planes) and extract them from the image for more in-depth processing.
So the question is, has anyone ever built a custom filter like this before? I don't know how to do the SD in a simple box type filter kernel like the ones used for gauss and blur, so I'm guessing I'll have to use the FilterEngine construct. Also, I forgot to mention I'm doing this in C++.
Your advice and musings are much appreciated.
Wikipedia has a nice explanation of standard deviation, which you can use for a standard deviation filter.
Basically, it boils down to blurring the image with a box filter, blurring the square of the image with a box filter, and taking the square root of their difference.
UPDATE: This is probably better shown with the equation from Wikipedia...
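For reference, the identity in question is the usual one relating the standard deviation of a random variable X to its first two moments:

$$\sigma = \sqrt{E\left[X^2\right] - \left(E[X]\right)^2}$$

Here E[·] is the expected value, which for a filter means the average over the local window.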
You can think of the OpenCV blur function as representing the expected value (i.e., E[X] a.k.a. the sample mean) of the neighborhood of interest. The random samples X in this case are represented by image pixels in the local neighborhood. Therefore, by using the above equivalence we have something like sqrt(blur(img^2) - blur(img)^2) in OpenCV. Doing it this way allows you to compute the local means and standard deviations.
Also, just in case you are curious about the mathematical proof. This equivalence is known as the computational formula for variance.
Here is how you can do this in OpenCV:
#include <iostream>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
using namespace std;
using namespace cv;
Mat mat2gray(const Mat& src)
{
Mat dst;
normalize(src, dst, 0.0, 1.0, NORM_MINMAX);
return dst;
}
int main()
{
Mat image = imread("coke-can.jpg", 0);
Mat image32f;
image.convertTo(image32f, CV_32F);
Mat mu;
blur(image32f, mu, Size(3, 3));
Mat mu2;
blur(image32f.mul(image32f), mu2, Size(3, 3));
Mat sigma;
cv::sqrt(mu2 - mu.mul(mu), sigma);
imshow("coke", mat2gray(image32f));
imshow("mu", mat2gray(mu));
imshow("sigma",mat2gray(sigma));
waitKey();
return 0;
}
This produces the following images:
Original
Mean
Standard Deviation
Hope that helps!
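In case you want to use this in a more general way, be aware that it can produce NaN values.
Because of floating-point rounding, a variance close to zero can sometimes come out slightly negative, and the square root of a negative number is NaN.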
Mat sigma;
cv::sqrt(mu2 - mu.mul(mu), sigma);
The correct way would be:
Mat sigma;
cv::sqrt(cv::abs(mu2 - mu.mul(mu)), sigma);
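The same filter can be sketched in NumPy without OpenCV, which makes the rounding issue easy to experiment with. Two things differ from the C++ above and are choices made here, not part of the original answers: the box mean is computed with an integral image, and the variance is clamped at zero with a maximum instead of taking the absolute value (both avoid NaN; clamping just treats a tiny negative variance as exactly zero). The window size k must be odd.

```python
import numpy as np

def box_mean(img, k):
    """Mean over a k x k window (k odd), reflecting at the borders."""
    pad = k // 2
    p = np.pad(img.astype(np.float64), pad, mode="reflect")
    ii = np.zeros((p.shape[0] + 1, p.shape[1] + 1))   # integral image, zero first row/col
    ii[1:, 1:] = p.cumsum(axis=0).cumsum(axis=1)
    h, w = img.shape
    s = ii[k:k + h, k:k + w] - ii[:h, k:k + w] - ii[k:k + h, :w] + ii[:h, :w]
    return s / (k * k)

def local_std(img, k=3):
    """Local standard deviation: sqrt(E[X^2] - E[X]^2) over each k x k window."""
    mu = box_mean(img, k)
    mu2 = box_mean(img.astype(np.float64) ** 2, k)
    var = np.maximum(mu2 - mu * mu, 0.0)              # clamp rounding noise below zero
    return np.sqrt(var)

img = np.arange(120, dtype=np.uint8).reshape(10, 12)
sd = local_std(img, 3)
print(sd[4, 5], img[3:6, 4:7].astype(float).std())    # the two values should agree
```

For the HSV use case in the question, local_std would be applied to the hue and saturation planes separately, and small values would mark homogeneous regions.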