I have a UIImagePickerController where the user takes a photo. Before uploading the photo to the server, how can I tell whether the user is sending a dark photo, that is, one that is totally or nearly black?
I was researching and I found this:
const UInt8 *pixels = CFDataGetBytePtr(imageData);
UInt8 blackThreshold = 10; // or some value close to 0
int bytesPerPixel = 4;
for(int x = 0; x < width1; x++) {
for(int y = 0; y < height1; y++) {
int pixelStartIndex = (x + (y * width1)) * bytesPerPixel;
UInt8 alphaVal = pixels[pixelStartIndex]; // can probably ignore this value
UInt8 redVal = pixels[pixelStartIndex + 1];
UInt8 greenVal = pixels[pixelStartIndex + 2];
UInt8 blueVal = pixels[pixelStartIndex + 3];
if(redVal < blackThreshold && blueVal < blackThreshold && greenVal < blackThreshold) {
//This pixel is close to black...do something with it
}
}
}
However, I don't know how to apply the algorithm.
Yep that's a fairly simple way of doing it. You could, for example, iterate through and see what percentage of the pixels are pure black (i.e. clipped shadows) or nearly black. Or you could average the pixel colors throughout the whole image and see if it falls below a certain threshold. There are lots of approaches and these two might be a tad simplistic, but I'm not sure if this calls for anything particularly sophisticated. What threshold you want to use is up to you.
Also, while it has little practical impact, if I were going to be picky about the algorithm, I might only perform the "brightness" logic if alphaVal was over a certain threshold as well, because the color information is meaningless in transparent portions of an image. Having said that, real photos rarely have any transparency, so this may be a non-issue.
FYI, here is Apple's code for retrieving the pixel buffer. It's an oldie, but a goodie. (If I recall correctly, the only hassle is that the kCGImageAlphaPremultipliedFirst reference in CreateARGBBitmapContext must be cast with (CGBitmapInfo).)
By the way, if you're trying to determine the luminance of a particular pixel, one common algorithm is:
luminance = 0.2126 * red + 0.7152 * green + 0.0722 * blue
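To make the two approaches above concrete (share of near-black pixels, and average luminance against a threshold), here is a minimal sketch in plain C++ rather than Objective-C. It assumes the same 4-byte, alpha-first layout as the buffer in the question and no row padding. The names DarknessStats, analyzeArgb and looksDark, and the cut-off values 10, 20 and 0.95, are made up for illustration and would need tuning on real photos.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type; not part of any Apple API.
struct DarknessStats {
    double meanLuminance;   // 0 (black) .. 255 (white)
    double darkFraction;    // share of pixels whose luminance is below the threshold
};

// Scans a tightly packed ARGB buffer (4 bytes per pixel, alpha first).
DarknessStats analyzeArgb(const std::vector<std::uint8_t>& argb,
                          double darkThreshold = 10.0) {
    assert(argb.size() % 4 == 0);
    const std::size_t pixelCount = argb.size() / 4;
    if (pixelCount == 0) return {0.0, 0.0};

    double luminanceSum = 0.0;
    std::size_t darkCount = 0;
    for (std::size_t i = 0; i < pixelCount; ++i) {
        const std::uint8_t* p = &argb[i * 4];   // p[0] is alpha, ignored here
        const double luminance = 0.2126 * p[1] + 0.7152 * p[2] + 0.0722 * p[3];
        luminanceSum += luminance;
        if (luminance < darkThreshold) ++darkCount;
    }
    return {luminanceSum / pixelCount,
            static_cast<double>(darkCount) / pixelCount};
}

// Assumed policy: reject when the average is low or almost every pixel is dark.
bool looksDark(const DarknessStats& stats) {
    return stats.meanLuminance < 20.0 || stats.darkFraction > 0.95;
}
```

You would call analyzeArgb on the bytes you get from the bitmap context and only upload when looksDark returns false.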
I am working on an OCR project, and during preprocessing some red stamps need to be removed so that the text near the stamps can be detected. I have tried many methods (such as changing pixel values or thresholding the red channel), but they all failed.
Any suggestions are highly appreciated.
Python, C++, Java or what? Since you didn't state the OpenCV implementation you are using, I'm giving my answer in C++.
An option is to use the HSV color space to filter out the range of red values that defines the seal. My approach is to use the CMYK color space to filter everything except the black (or dark) text. It should do a pretty good job on printed media, which is your case.
//read input image:
std::string imageName = "C://opencvImages//seal.png";
cv::Mat imageInput = cv::imread( imageName );
Now, perform the CMYK conversion. OpenCV does not support this operation out of the box, so bear with me; I provide the helper function at the end of this post.
//CMYK conversion:
std::vector<cv::Mat> cmyk;
cmyk = rgb2cmyk( imageInput );
//This is the Black channel:
cv::Mat blackChannel = cmyk[3].clone();
This is the image of the black channel; it is nice how everything that is not black (or dark) practically disappears!
Now, optionally, enhance the result by applying a brightness and contrast adjustment. The goal is just to separate the text from the background a little better; we want well-defined pixel distributions to get a nice binary image.
//Brightness and contrast adjustment:
float alpha = 2.0;
float beta = -50.0;
contrastBrightnessAdjustment( blackChannel, alpha, beta );
Again, OpenCV has no dedicated brightness and contrast function (although cv::Mat::convertTo can apply the same alpha and beta scaling); writing one by hand is very easy. Hold on a little bit, and let me show you the result of this operation:
Nice. Let's Otsu-threshold this bad boy to get a nice binary image containing the clean text:
cv::Mat binaryImage;
cv::threshold( blackChannel, binaryImage, 0, 255, cv::THRESH_OTSU );
This is what you get:
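For readers curious about what cv::THRESH_OTSU computes, here is a rough sketch of the textbook Otsu method on a flat list of grey values, without OpenCV. It is not OpenCV's actual implementation (tie-breaking and rounding may differ), and otsuThreshold is a name chosen just for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the threshold t that maximises the between-class variance
// when pixels <= t form one class and pixels > t form the other.
int otsuThreshold(const std::vector<std::uint8_t>& pixels) {
    assert(!pixels.empty());

    // 256-bin histogram of the grey levels.
    std::vector<double> hist(256, 0.0);
    for (std::uint8_t p : pixels) hist[p] += 1.0;

    const double total = static_cast<double>(pixels.size());
    double sumAll = 0.0;
    for (int v = 0; v < 256; ++v) sumAll += v * hist[v];

    double weightLow = 0.0;   // number of pixels in the low class
    double sumLow = 0.0;      // sum of grey levels in the low class
    double bestVariance = -1.0;
    int bestThreshold = 0;
    for (int t = 0; t < 256; ++t) {
        weightLow += hist[t];
        sumLow += t * hist[t];
        const double weightHigh = total - weightLow;
        if (weightLow == 0.0 || weightHigh == 0.0) continue;

        const double meanLow = sumLow / weightLow;
        const double meanHigh = (sumAll - sumLow) / weightHigh;
        const double diff = meanLow - meanHigh;
        // Proportional to the between-class variance (constant factor dropped).
        const double variance = weightLow * weightHigh * diff * diff;
        if (variance > bestVariance) {
            bestVariance = variance;
            bestThreshold = t;
        }
    }
    return bestThreshold;
}
```

On a clearly bimodal image (dark background, bright text in the K channel) the returned value falls between the two peaks, which is why no manual threshold is needed.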
Now, the RGB to CMYK conversion function. I'm using the following implementation. The function receives a BGR image (the channel order cv::imread produces) and returns a vector containing each of the CMYK channels.
std::vector<cv::Mat> rgb2cmyk( cv::Mat& inputImage ){
    std::vector<cv::Mat> cmyk;
    for (int i = 0; i < 4; i++) {
        cmyk.push_back( cv::Mat( inputImage.size(), CV_8UC1 ) );
    }
    // OpenCV stores colour images as BGR, so index 2 is red and index 0 is blue.
    std::vector<cv::Mat> inputRGB;
    cv::split( inputImage, inputRGB );
    for (int i = 0; i < inputImage.rows; i++)
    {
        for (int j = 0; j < inputImage.cols; j++)
        {
            float r = (int)inputRGB[2].at<uchar>(i, j) / 255.;
            float g = (int)inputRGB[1].at<uchar>(i, j) / 255.;
            float b = (int)inputRGB[0].at<uchar>(i, j) / 255.;
            float k = std::min(std::min(1-r, 1-g), 1-b);
            // Pure black pixels give k == 1; avoid the 0/0 division in that case.
            float scale = (k < 1.f) ? 255.f / (1 - k) : 0.f;
            cmyk[0].at<uchar>(i, j) = (1 - r - k) * scale;
            cmyk[1].at<uchar>(i, j) = (1 - g - k) * scale;
            cmyk[2].at<uchar>(i, j) = (1 - b - k) * scale;
            cmyk[3].at<uchar>(i, j) = k * 255.;
        }
    }
    return cmyk;
}
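To see why the K channel keeps dark text and drops a red stamp, it can help to run the formula on single pixels. This is a small sketch without OpenCV; rgbToCmykPixel is a hypothetical helper written for this illustration, and the sample colours in the note below are invented, not measured from the seal image.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: converts one 8-bit RGB pixel to {C, M, Y, K} in 0..255.
std::array<std::uint8_t, 4> rgbToCmykPixel(std::uint8_t r8, std::uint8_t g8, std::uint8_t b8) {
    const float r = r8 / 255.0f, g = g8 / 255.0f, b = b8 / 255.0f;
    const float k = std::min({1.0f - r, 1.0f - g, 1.0f - b});
    if (k >= 1.0f) {
        // Pure black: (1 - k) is zero, so C, M and Y are defined as 0 here.
        return {0, 0, 0, 255};
    }
    const float scale = 255.0f / (1.0f - k);
    return {static_cast<std::uint8_t>((1.0f - r - k) * scale),
            static_cast<std::uint8_t>((1.0f - g - k) * scale),
            static_cast<std::uint8_t>((1.0f - b - k) * scale),
            static_cast<std::uint8_t>(k * 255.0f)};
}
```

A dark grey such as (30, 30, 30) ends up with a K value well above 200, while a saturated red such as (200, 30, 40) gets a K value below 100, because K only depends on the brightest of the three channels.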
And this is the contrastBrightnessAdjustment function, implemented with a matrix iterator. It receives a single-channel (grayscale) image and applies the linear transformation alpha * pixel + beta to every pixel:
void contrastBrightnessAdjustment( cv::Mat inputImage, float alpha, int beta ){
    // The image is CV_8UC1, so iterate over uchar elements (not cv::Vec3b).
    cv::MatIterator_<uchar> it, end;
    for (it = inputImage.begin<uchar>(), end = inputImage.end<uchar>(); it != end; ++it) {
        uchar &pixel = *it;
        pixel = cv::saturate_cast<uchar>(alpha*pixel+beta);
    }
}
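As a quick check of what alpha = 2.0 and beta = -50 do to individual grey levels, here is a sketch of the same linear transform in plain C++. adjustPixel is a made-up name, and the clamping only approximates cv::saturate_cast (the rounding of exact .5 values may differ).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Applies alpha * pixel + beta and clamps to 0..255, mimicking a saturating cast.
std::uint8_t adjustPixel(std::uint8_t pixel, float alpha, float beta) {
    assert(std::isfinite(alpha) && std::isfinite(beta));
    const float value = alpha * pixel + beta;
    const float clamped = std::min(255.0f, std::max(0.0f, value));
    return static_cast<std::uint8_t>(std::lround(clamped));
}
```

With these parameters a mid grey of 100 becomes 150, anything at or above 153 saturates to 255, and anything at or below 25 drops to 0, which is what spreads the text and background apart before thresholding.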
I have an organized point cloud (1280 * 720) captured from a 3D camera. I just wonder whether there's a method to resize(cut down) this point cloud to a smaller size (eg. 128 * 72), when keeping this cloud organized.
(I think this shouldn't be the same as down sampling. "Resize" means like zooming an image).
I am using Point Cloud Library 1.8.0 but stuck with this.
Any advice is welcome, thanks first!
The answer of Rooscannon is correct in principle, but has some bugs in it. The correct uniform subsampling of an organized point cloud is as follows:
// Downsampling or keypoint extraction
int scale = 3;
PointCloud<PointXYZRGB>::Ptr keypoints (new PointCloud<PointXYZRGB>);
keypoints->width = cloud->width / scale;
keypoints->height = cloud->height / scale;
keypoints->points.resize(keypoints->width * keypoints->height);
for( size_t i = 0, ii = 0; i < keypoints->height; ii += scale, i++){
for( size_t j = 0, jj = 0; j < keypoints->width; jj += scale, j++){
keypoints->at(j, i) = cloud->at(jj, ii); //at(column, row)
}
}
So the loop conditions, the indexing and the initialization of the subsampled point cloud are different. Otherwise, the subsampled point cloud would not be organized anymore.
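The same index mapping can be tried without PCL. The sketch below uses a hypothetical Grid type as a stand-in for an organized cloud (row-major storage, at(column, row) like pcl::PointCloud); Grid and subsample are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an organized cloud: row-major storage, width x height.
template <typename Point>
struct Grid {
    std::size_t width = 0;
    std::size_t height = 0;
    std::vector<Point> points;

    Point& at(std::size_t column, std::size_t row) { return points[row * width + column]; }
    const Point& at(std::size_t column, std::size_t row) const { return points[row * width + column]; }
};

// Keeps every scale-th point in both directions; the result is still a full grid.
template <typename Point>
Grid<Point> subsample(const Grid<Point>& cloud, std::size_t scale) {
    assert(scale > 0);
    Grid<Point> out;
    out.width = cloud.width / scale;
    out.height = cloud.height / scale;
    out.points.resize(out.width * out.height);
    for (std::size_t row = 0; row < out.height; ++row) {
        for (std::size_t col = 0; col < out.width; ++col) {
            out.at(col, row) = cloud.at(col * scale, row * scale);
        }
    }
    return out;
}
```

Because width and height are set up front and every cell is written exactly once, a 1280 * 720 grid with scale 10 gives a 128 * 72 grid, and any NaN placeholders simply stay in their cells.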
Just keep one point out of every N, where N is the factor by which you want to reduce your cloud.
Something like this should work:
for (pcl::PointCloud<pcl::PointXYZ>::const_iterator it = src->begin(); it< src->end(); it+=times)
{
dest.points.push_back(*it);
}
The only problem is that the cloud might contain some NaN values. To correct that, just set is_dense to false on dest and call removeNaNFromPointCloud on it.
Hope this can help you !
Can't comment but removing NaNs from your point cloud by default makes it unorganized. Quite likely the NaNs are there as dummy points in case your instrument was not able to observe a point in the matrix just to keep the matrix dimensions correct. Removing those breaks the matrix structure and you'll have a different amount of points than your 1280 * 720 matrix would expect.
If you wish to down sample an organized point cloud say by a factor of 2, you could try something like
int scale = 2;
pcl::PointCloud<pcl::your_point_type> down_sampled_cloud;
down_sampled_cloud.width = original_cloud.width / scale;
down_sampled_cloud.height = original_cloud.height / scale;
for( int ii = 0; ii < original_cloud.height; ii+=scale){
for( int jj = 0; jj < original_cloud.width; jj+=scale ){
down_sampled_cloud.push_back(original_cloud.at(ii,jj));
}
}
Change scale to what you wish.
This method just downsamples the original point cloud; it will not interpolate points between existing points. Scaling by a decimal factor is trickier and might yield unwanted results if the surface is not continuous.
I would like to take screenshots (or screen captures) as fast as possible.
Googling this question brings up many answers, but my concern is more specific:
I am not interested in the image itself. I would like to grab, in near real time, the screen brightness: not the hardware brightness, but that of the image, given that, for example, the white Google page in firefox gives a brighter image than a dark xterm (when both are maximized).
To make myself as clear as possible, here is one way I have already implemented with X11 and the CImg library:
Here is the header :
#include <CImg.h>
using namespace cimg_library;
#include <X11/Xlib.h>
#include <X11/Xutil.h>
#include <X11/Xos.h>
and the core part, which extracts an X11 image and loops over every pixel:
Display *display = XOpenDisplay(NULL);
Window root = DefaultRootWindow(display);
Screen* screen = DefaultScreenOfDisplay(display);
const int W = WidthOfScreen(screen);
const int H = HeightOfScreen(screen);
XImage *image = XGetImage(display, root, 0, 0, W, H, AllPlanes, ZPixmap);
unsigned long red_count(0), green_count(0), blue_count(0), count(0);
const unsigned long red_mask = image->red_mask;
const unsigned long green_mask = image->green_mask;
const unsigned long blue_mask = image->blue_mask;
CImg<unsigned char> screenshot(W, H, 1, 3, 0);
for (int x = 0; x < W; x += pixel_stride)
for (int y = 0; y < H; y += pixel_stride)
{
unsigned long pixel = XGetPixel(image, x, y);
screenshot(x, y, 0) = (pixel & red_mask) >> 16;
screenshot(x, y, 1) = (pixel & green_mask) >> 8;
screenshot(x, y, 2) = pixel & blue_mask;
red_count += (int) screenshot(x, y, 0);
green_count += (int) screenshot(x, y, 1);
blue_count += (int) screenshot(x, y, 2);
count++;
}
As I said, I do not keep the image itself, I just try to compute an average luminance value with respective values of red, green and blue pixels.
XDestroyImage(image); // frees the pixel data as well as the XImage structure
const double luminance_relative = (red_luminance * double(red_count) +
green_luminance * double(green_count) +
blue_luminance * double(blue_count))
/ (double(255) * double(count));
The underlying idea is to adjust the hardware screen brightness depending on the image luminance. In short, the whiter the screenshot, the more the brightness can be reduced, and conversely.
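To pin down the names the fragments above leave undefined, here is a minimal sketch of the luminance computation and of one possible mapping to a backlight level. The Rec. 709 weights, the linear mapping, and the names relativeLuminance and targetBacklight are all assumptions made for illustration; the actual weights and policy are up to you.

```cpp
#include <algorithm>
#include <cassert>

// Assumed Rec. 709 weights; the fragments above do not say which values are used.
constexpr double kRedLuminance = 0.2126;
constexpr double kGreenLuminance = 0.7152;
constexpr double kBlueLuminance = 0.0722;

// Average relative luminance in 0..1 from per-channel sums of 8-bit samples.
double relativeLuminance(unsigned long redSum, unsigned long greenSum,
                         unsigned long blueSum, unsigned long sampleCount) {
    assert(sampleCount > 0);
    return (kRedLuminance * redSum + kGreenLuminance * greenSum + kBlueLuminance * blueSum)
           / (255.0 * sampleCount);
}

// Assumed linear policy: a black screen gets maxLevel, a white screen gets minLevel.
double targetBacklight(double luminance, double minLevel, double maxLevel) {
    assert(minLevel <= maxLevel);
    const double clamped = std::min(1.0, std::max(0.0, luminance));
    return maxLevel - (maxLevel - minLevel) * clamped;
}
```

Since only the sums matter, a coarse pixel_stride already gives a usable estimate, and the per-pixel work stays small.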
I want to do that because I have sensitive eyes; it usually hurts when I switch from xterm to firefox.
To do so, the hardware brightness must be adjusted in a very short time, so the screenshot, that is to say the loop over pixels, must be as fast as possible.
I began to implement it with X11 methods, but I wonder whether there are faster access methods. Which leads to the question: what is the fastest way/library to get a screenshot?
Thanks in advance for your help.
Regards
I'm using Emgu.CV to perform some basic image manipulation and composition. My images are loaded as Image<Bgra,Byte>.
Question #1: When I use the Image<,>.Add() method, the images are always blended together, regardless of the alpha value. Instead I'd like them to be composited one atop the other, and use the included alpha channel to determine how the images should be blended. So if I call image1.Add(image2) any fully opaque pixels in image2 would completely cover the pixels from image1, while semi-transparent pixels would be blended based on the alpha value.
Here's what I'm trying to do in visual form. There's a city image with some "transparent holes" cut out, and a frog behind. This is what it should look like:
And this is what openCV produces.
How can I get this effect with OpenCV? And will it be as fast as calling Add()?
Question #2: is there a way to perform this composition in-place instead of creating a new image with each call to Add()? (e.g. image1.AddImageInPlace(image2) modifies the bytes of image1?)
NOTE: Looking for answers within Emgu.CV, which I'm using because of how well it handles perspective warping.
Before OpenCV 2.4 there was no support of PNGs with alpha channel.
To verify whether your current version supports it, print the number of channels after loading an image that you are certain is RGBA. If alpha is supported, the application will output the number 4; otherwise it will output 3 (RGB). Using the C API you would do:
IplImage* t_img = cvLoadImage(argv[1], CV_LOAD_IMAGE_UNCHANGED);
if (!t_img)
{
printf("!!! Unable to load transparent image.\n");
return -1;
}
printf("Channels: %d\n", t_img->nChannels);
If you can't update OpenCV:
There are some posts around that try to bypass this limitation but I haven't tested them myself;
The easiest solution would be to use another API to load the image and blend it, check blImageBlending;
Another alternative, not as lightweight, is to use Qt.
If your version already supports PNGs with RGBA:
Take a look at Emulating photoshop’s blending modes in OpenCV. It implements several Photoshop blending modes and I imagine you are capable of converting that code to .Net.
EDIT:
I had to deal with this problem recently and I've demonstrated how to deal with it in this answer.
You'll have to iterate through each pixel. I'm assuming image 1 is the frog image, and image 2 is the city image, with image1 always being bigger than image2.
//to simulate image1.AddInPlace(image2)
int w = image2.Width;
int h = image2.Height;
for (int i = 0; i < w; i++)
{
    for (int j = 0; j < h; j++)
    {
        Bgra top = image2[j, i];
        Bgra bottom = image1[j, i];
        //alpha=255 is opaque > image2 should be used
        double alpha = top.Alpha / 255.0;
        image1[j, i] = new Bgra(
            top.Blue * alpha + bottom.Blue * (1 - alpha),
            top.Green * alpha + bottom.Green * (1 - alpha),
            top.Red * alpha + bottom.Red * (1 - alpha),
            bottom.Alpha); //keep the alpha of image1
    }
}
Using Osiris's suggestion as a starting point, and having checked out alpha compositing on Wikipedia, I ended up with the following, which worked really nicely for my purposes.
This was used with Emgucv. I was hoping that the opencv gpu::AlphaComposite methods were available in Emgucv, which I believe would have done the following for me, but alas the version I am using didn't appear to have them implemented.
static public Image<Bgra, Byte> Overlay( Image<Bgra, Byte> image1, Image<Bgra, Byte> image2 )
{
    Image<Bgra, Byte> result = image1.Copy();
    Image<Bgra, Byte> src = image2;
    Image<Bgra, Byte> dst = image1;
    int rows = result.Rows;
    int cols = result.Cols;
    for (int y = 0; y < rows; ++y)
    {
        for (int x = 0; x < cols; ++x)
        {
            // http://en.wikipedia.org/wiki/Alpha_compositing
            double srcA = 1.0/255 * src.Data[y, x, 3];
            double dstA = 1.0/255 * dst.Data[y, x, 3];
            double outA = (srcA + (dstA - dstA * srcA));
            if (outA == 0) continue; // both pixels fully transparent, keep the copy of image1
            double dstW = dstA * (1 - srcA); // weight of the pixel underneath
            result.Data[y, x, 0] = (Byte)(((src.Data[y, x, 0] * srcA) + (dst.Data[y, x, 0] * dstW)) / outA); // Blue
            result.Data[y, x, 1] = (Byte)(((src.Data[y, x, 1] * srcA) + (dst.Data[y, x, 1] * dstW)) / outA); // Green
            result.Data[y, x, 2] = (Byte)(((src.Data[y, x, 2] * srcA) + (dst.Data[y, x, 2] * dstW)) / outA); // Red
            result.Data[y, x, 3] = (Byte)(outA*255);
        }
    }
    return result;
}
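For reference, the "over" operator from the Wikipedia article can be written for a single pixel as follows. This is a sketch in plain C++ rather than C#/Emgu, assuming straight (non-premultiplied) alpha; the Bgra alias and the function name over are chosen only for this example.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

using Bgra = std::array<std::uint8_t, 4>;   // blue, green, red, alpha

// Porter-Duff "over" for straight (non-premultiplied) alpha: src drawn on top of dst.
Bgra over(const Bgra& src, const Bgra& dst) {
    const double srcA = src[3] / 255.0;
    const double dstA = dst[3] / 255.0;
    const double outA = srcA + dstA * (1.0 - srcA);
    if (outA == 0.0) return {0, 0, 0, 0};   // both fully transparent

    Bgra out{};
    for (int c = 0; c < 3; ++c) {
        const double value = (src[c] * srcA + dst[c] * dstA * (1.0 - srcA)) / outA;
        out[c] = static_cast<std::uint8_t>(std::lround(value));
    }
    out[3] = static_cast<std::uint8_t>(std::lround(outA * 255.0));
    return out;
}
```

An opaque source pixel replaces the destination, a fully transparent one leaves it untouched, and when the destination is opaque the formula reduces to the familiar src * a + dst * (1 - a) blend.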
A newer version, using emgucv methods rather than a loop. I'm not sure it improves performance.
double unit = 1.0 / 255.0;
Image[] dstS = dst.Split();
Image[] srcS = src.Split();
Image[] rs = result.Split();
Image<Gray, double> srcA = srcS[3] * unit;
Image<Gray, double> dstA = dstS[3] * unit;
Image<Gray, double> outA = srcA.Add(dstA.Sub(dstA.Mul(srcA)));// (srcA + (dstA - dstA * srcA));
// Blue, green and red (channel order is BGRA).
rs[0] = srcS[0].Mul(srcA).Add(dstS[0].Mul(dstA).Mul(1 - srcA)).Mul(outA.Pow(-1.0)); // Mul.Pow is divide.
rs[1] = srcS[1].Mul(srcA).Add(dstS[1].Mul(dstA).Mul(1 - srcA)).Mul(outA.Pow(-1.0));
rs[2] = srcS[2].Mul(srcA).Add(dstS[2].Mul(dstA).Mul(1 - srcA)).Mul(outA.Pow(-1.0));
rs[3] = outA.Mul(255);
// Merge image back together.
CvInvoke.cvMerge(rs[0], rs[1], rs[2], rs[3], result);
return result.Convert<Bgra, Byte>();
I found an interesting blog post on the internet, which I think is related to what you are trying to do.
Please have a look at the Creating Overlays Method (archive.org link). You can use this idea to implement your own function to add two images in the way you mentioned above, making some particular areas in the image transparent while leaving the rest as it is.
I need to convert an 8-bit IplImage to a 32-bit IplImage. Using documentation from all over the web, I've tried the following things:
// general code
img2 = cvCreateImage(cvSize(img->width, img->height), 32, 3);
int height = img->height;
int width = img->width;
int channels = img->nChannels;
int step1 = img->widthStep;
int step2 = img2->widthStep;
int depth1 = img->depth;
int depth2 = img2->depth;
uchar *data1 = (uchar *)img->imageData;
uchar *data2 = (uchar *)img2->imageData;
for(h=0;h<height;h++) for(w=0;w<width;w++) for(c=0;c<channels;c++) {
// attempt code...
}
// attempt one
// result: white image, two red spots which appear in the original image too.
// this is the closest result, what's going wrong?!
// see: http://files.dazjorz.com/cache/conversion.png
((float*)data2+h*step2+w*channels+c)[0] = data1[h*step1+w*channels+c];
// attempt two
// when I change float to unsigned long in both previous examples, I get a black screen.
// attempt three
// result: seemingly random data to the top of the screen.
data2[h*step2+w*channels*3+c] = data1[h*step1+w*channels+c];
data2[h*step2+w*channels*3+c+1] = 0x00;
data2[h*step2+w*channels*3+c+2] = 0x00;
// and then some other things. Nothing did what I wanted. I couldn't get an output
// image which looked the same as the input image.
As you see I don't really know what I'm doing. I'd love to find out, but I'd love it more if I could get this done correctly.
Thanks for any help I get!
The function you are looking for is cvConvertScale(). It automagically does any type conversion for you. You just have to specify that you want to scale by a factor of 1/255 (which maps the range [0...255] to [0...1]).
Example:
IplImage *im8 = cvLoadImage(argv[1]);
IplImage *im32 = cvCreateImage(cvSize(im8->width, im8->height), IPL_DEPTH_32F, 3);
cvConvertScale(im8, im32, 1/255.);
Note the dot in 1/255. - to force a double division. Without it you get a scale of 0.
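The difference is easy to check in isolation. A tiny sketch (the constant and function names are made up for this example):

```cpp
#include <cassert>

// 1/255 is integer division and truncates to 0; 1/255. is evaluated as a double.
constexpr int kIntegerScale = 1 / 255;
constexpr double kDoubleScale = 1 / 255.;

static_assert(kIntegerScale == 0, "integer division truncates");
static_assert(kDoubleScale > 0.0039 && kDoubleScale < 0.0040, "about 0.00392");

// Maps an 8-bit value to the 0..1 range using the double scale.
double scaleToUnit(int value) {
    assert(value >= 0 && value <= 255);
    return value * kDoubleScale;
}
```

Passing kIntegerScale as the scale would turn every pixel into 0, which is exactly the black image you get when the dot is forgotten.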
Perhaps this link can help you?
Edit In response to the second edit of the OP and the comment
Have you tried
float value = 0.5
instead of
float value = 0x0000001;
I thought the range for a float color value goes from 0.0 to 1.0, where 1.0 is white.
Floating point colors go from 0.0 to 1.0, and uchars go from 0 to 255. The following code fixes it:
// h is height, w is width, c is current channel (0 to 2)
int b = ((uchar *)(img->imageData + h*img->widthStep))[w*img->nChannels + c];
((float *)(img2->imageData + h*img2->widthStep))[w*img2->nChannels + c] = ((float)b) / 255.0;
Many, many thanks to Stefan Schmidt for helping me fix this!
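The key point in the fix is that widthStep is measured in bytes, so the row offset has to be added to a byte pointer before treating the row as floats; the first attempt in the question added it to a float pointer, which multiplies the offset by sizeof(float). Here is a sketch of the same conversion without OpenCV, assuming interleaved channels; convert8uTo32f and sampleAt are hypothetical names, and memcpy is used so the example does not depend on how the destination buffer is aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Converts an interleaved 8-bit image to 32-bit float samples in 0..1.
// srcStep and dstStep are row sizes in BYTES (like IplImage::widthStep).
void convert8uTo32f(const std::uint8_t* src, std::size_t srcStep,
                    std::uint8_t* dst, std::size_t dstStep,
                    std::size_t width, std::size_t height, std::size_t channels) {
    assert(src != nullptr && dst != nullptr);
    assert(srcStep >= width * channels);
    assert(dstStep >= width * channels * sizeof(float));
    for (std::size_t h = 0; h < height; ++h) {
        const std::uint8_t* srcRow = src + h * srcStep;   // byte offset on a byte pointer
        std::uint8_t* dstRow = dst + h * dstStep;         // byte offset on a byte pointer
        for (std::size_t i = 0; i < width * channels; ++i) {
            const float value = srcRow[i] / 255.0f;
            std::memcpy(dstRow + i * sizeof(float), &value, sizeof(float));
        }
    }
}

// Reads back one float sample (index counts samples within the row).
float sampleAt(const std::uint8_t* dst, std::size_t dstStep,
               std::size_t row, std::size_t index) {
    float value;
    std::memcpy(&value, dst + row * dstStep + index * sizeof(float), sizeof(float));
    return value;
}
```

In real code cvConvertScale does all of this for you; the sketch is only meant to show where the step belongs in the pointer arithmetic.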
If you do not put the dot (.), the compiler treats both operands as ints and performs integer division, giving you an int result (zero in this case).
You can create an IplImage wrapper using boost::shared_ptr and template-metaprogramming. I have done that, and I get automatic garbage collection, together with automatic image conversions from one depth to another, or from one-channel to multi-channel images.
I have called the API blImageAPI and it can be found here:
http://www.barbato.us/2010/10/14/image-data-structure-based-shared_ptr-iplimage/
It is very fast, and makes code very readable (good for maintaining algorithms).
It can also be used instead of IplImage in OpenCV algorithms without changing anything.
Good luck and have fun writing algorithms!!!
IplImage *img8, *img32;
img8 = cvLoadImage("a.jpg", 1);
cvNamedWindow("Convert", 1);
img32 = cvCreateImage(cvGetSize(img8), IPL_DEPTH_32F, 3);
cvConvertScale(img8, img32, 1.0/255.0, 0.0);
//For Confirmation Check the pixel values (between 0 - 1)
for (int row = 0; row < img32->height; row++) {
    float* pt = (float*) (img32->imageData + row * img32->widthStep);
    for (int col = 0; col < img32->width; col++)
        printf("\n %3.3f , %3.3f , %3.3f ", pt[3*col], pt[3*col+1], pt[3*col+2]);
}
cvShowImage("Convert", img32);
cvWaitKey(0);
cvReleaseImage(&img8);
cvReleaseImage(&img32);
cvDestroyWindow("Convert");