Find illuminance value of an image taken from a camera - ios

I got an application that should calculate the illuminance of an given photo.
My problem is: i can't find a way to calculate this illuminance index (in lux)
i can get luminosity with this code:
UIImage* image = [UIImage imageNamed:#"image.png"];
unsigned char* pixels = [image rgbaPixels];
double totalLuminance = 0.0;
for(int p=0;p<image.size.width*image.size.height*4;p+=4) {
totalLuminance += pixels[p]*0.299 + pixels[p+1]*0.587 + pixels[p+2]*0.114;
}
totalLuminance /= (image.size.width*image.size.height);
totalLuminance /= 255.0;
NSLog(#"Image.png = %f",totalLuminance);
Source
but the results are between 0 and 1, witch i don't think i can use to calculate illuminance in lux
looking for an answer on stack overflow i found this answer https://stackoverflow.com/a/2720635/839049 witch gave me a direction, but i don't know how to proceed because i don't know what to do with exposure.
.. you can just ignore the pixel data and just use the exposure information as a light meter. ...
how? does anyone know how do i do that?
or there is a better way to do it?

If you know the exposure value from the Exif data, then the total scene illuminance can be calculated as
pow(2, Exposure)*(0.3 * Calibration);
where Calibration, unfortunately, depends on the physical characteristics of the scene you're imaging. If you have a lot of dark, low-reflectivity objects, then the constant will have to be set higher.
Usually the Exposure works out so that the average illuminance you get from your formula, i.e., the sum of all your Y values divided by the number of pixels, is around 0.5 (but that depends on the camera's "brain": some cameras divide the scene in "zones" and apply different weights to each zone, e.g. they try to get 0.5 in the central area even if this means darkening the edges; the latest cameras integrate the contrast values, so as to capture the most details from what they deduce to be the "zone of interest").
This means that your image will always be "scaled", unless you somehow instruct the camera to take pictures at a fixed speed and stop setting, without compensating in any way. If you do, you will be able to use the average pixel luminance to determine total apparent illuminance.
You will always have to calibrate your results with a known meter, though.

Related

Simple way to check if an image bitmap is blur

I am looking for a "very" simple way to check if an image bitmap is blur. I do not need accurate and complicate algorithm which involves fft, wavelet, etc. Just a very simple idea even if it is not accurate.
I've thought to compute the average euclidian distance between pixel (x,y) and pixel (x+1,y) considering their RGB components and then using a threshold but it works very bad. Any other idea?
Don't calculate the average differences between adjacent pixels.
Even when a photograph is perfectly in focus, it can still contain large areas of uniform colour, like the sky for example. These will push down the average difference and mask the details you're interested in. What you really want to find is the maximum difference value.
Also, to speed things up, I wouldn't bother checking every pixel in the image. You should get reasonable results by checking along a grid of horizontal and vertical lines spaced, say, 10 pixels apart.
Here are the results of some tests with PHP's GD graphics functions using an image from Wikimedia Commons (Bokeh_Ipomea.jpg). The Sharpness values are simply the maximum pixel difference values as a percentage of 255 (I only looked in the green channel; you should probably convert to greyscale first). The numbers underneath show how long it took to process the image.
If you want them, here are the source images I used:
original
slightly blurred
blurred
Update:
There's a problem with this algorithm in that it relies on the image having a fairly high level of contrast as well as sharp focused edges. It can be improved by finding the maximum pixel difference (maxdiff), and finding the overall range of pixel values in a small area centred on this location (range). The sharpness is then calculated as follows:
sharpness = (maxdiff / (offset + range)) * (1.0 + offset / 255) * 100%
where offset is a parameter that reduces the effects of very small edges so that background noise does not affect the results significantly. (I used a value of 15.)
This produces fairly good results. Anything with a sharpness of less than 40% is probably out of focus. Here's are some examples (the locations of the maximum pixel difference and the 9×9 local search areas are also shown for reference):
(source)
(source)
(source)
(source)
The results still aren't perfect, though. Subjects that are inherently blurry will always result in a low sharpness value:
(source)
Bokeh effects can produce sharp edges from point sources of light, even when they are completely out of focus:
(source)
You commented that you want to be able to reject user-submitted photos that are out of focus. Since this technique isn't perfect, I would suggest that you instead notify the user if an image appears blurry instead of rejecting it altogether.
I suppose that, philosophically speaking, all natural images are blurry...How blurry and to which amount, is something that depends upon your application. Broadly speaking, the blurriness or sharpness of images can be measured in various ways. As a first easy attempt I would check for the energy of the image, defined as the normalised summation of the squared pixel values:
1 2
E = --- Σ I, where I the image and N the number of pixels (defined for grayscale)
N
First you may apply a Laplacian of Gaussian (LoG) filter to detect the "energetic" areas of the image and then check the energy. The blurry image should show considerably lower energy.
See an example in MATLAB using a typical grayscale lena image:
This is the original image
This is the blurry image, blurred with gaussian noise
This is the LoG image of the original
And this is the LoG image of the blurry one
If you just compute the energy of the two LoG images you get:
E = 1265 E = 88
or bl
which is a huge amount of difference...
Then you just have to select a threshold to judge which amount of energy is good for your application...
calculate the average L1-distance of adjacent pixels:
N1=1/(2*N_pixel) * sum( abs(p(x,y)-p(x-1,y)) + abs(p(x,y)-p(x,y-1)) )
then the average L2 distance:
N2= 1/(2*N_pixel) * sum( (p(x,y)-p(x-1,y))^2 + (p(x,y)-p(x,y-1))^2 )
then the ratio N2 / (N1*N1) is a measure of blurriness. This is for grayscale images, for color you do this for each channel separately.

OpenCV: solvePnP detection problems

I've got problem with precise detection of markers using OpenCV.
I've recorded video presenting that issue: http://youtu.be/IeSSW4MdyfU
As you see I'm markers that I'm detecting are slightly moved at some camera angles. I've read on the web that this may be camera calibration problems, so I'll tell you guys how I'm calibrating camera, and maybe you'd be able to tell me what am I doing wrong?
At the beginnig I'm collecting data from various images, and storing calibration corners in _imagePoints vector like this
std::vector<cv::Point2f> corners;
_imageSize = cvSize(image->size().width, image->size().height);
bool found = cv::findChessboardCorners(*image, _patternSize, corners);
if (found) {
cv::Mat *gray_image = new cv::Mat(image->size().height, image->size().width, CV_8UC1);
cv::cvtColor(*image, *gray_image, CV_RGB2GRAY);
cv::cornerSubPix(*gray_image, corners, cvSize(11, 11), cvSize(-1, -1), cvTermCriteria(CV_TERMCRIT_EPS+ CV_TERMCRIT_ITER, 30, 0.1));
cv::drawChessboardCorners(*image, _patternSize, corners, found);
}
_imagePoints->push_back(_corners);
Than, after collecting enough data I'm calculating camera matrix and coefficients with this code:
std::vector< std::vector<cv::Point3f> > *objectPoints = new std::vector< std::vector< cv::Point3f> >();
for (unsigned long i = 0; i < _imagePoints->size(); i++) {
std::vector<cv::Point2f> currentImagePoints = _imagePoints->at(i);
std::vector<cv::Point3f> currentObjectPoints;
for (int j = 0; j < currentImagePoints.size(); j++) {
cv::Point3f newPoint = cv::Point3f(j % _patternSize.width, j / _patternSize.width, 0);
currentObjectPoints.push_back(newPoint);
}
objectPoints->push_back(currentObjectPoints);
}
std::vector<cv::Mat> rvecs, tvecs;
static CGSize size = CGSizeMake(_imageSize.width, _imageSize.height);
cv::Mat cameraMatrix = [_userDefaultsManager cameraMatrixwithCurrentResolution:size]; // previously detected matrix
cv::Mat coeffs = _userDefaultsManager.distCoeffs; // previously detected coeffs
cv::calibrateCamera(*objectPoints, *_imagePoints, _imageSize, cameraMatrix, coeffs, rvecs, tvecs);
Results are like you've seen in the video.
What am I doing wrong? is that an issue in the code? How much images should I use to perform calibration (right now I'm trying to obtain 20-30 images before end of calibration).
Should I use images that containg wrongly detected chessboard corners, like this:
or should I use only properly detected chessboards like these:
I've been experimenting with circles grid instead of of chessboards, but results were much worse that now.
In case of questions how I'm detecting marker: I'm using solvepnp function:
solvePnP(modelPoints, imagePoints, [_arEngine currentCameraMatrix], _userDefaultsManager.distCoeffs, rvec, tvec);
with modelPoints specified like this:
markerPoints3D.push_back(cv::Point3d(-kMarkerRealSize / 2.0f, -kMarkerRealSize / 2.0f, 0));
markerPoints3D.push_back(cv::Point3d(kMarkerRealSize / 2.0f, -kMarkerRealSize / 2.0f, 0));
markerPoints3D.push_back(cv::Point3d(kMarkerRealSize / 2.0f, kMarkerRealSize / 2.0f, 0));
markerPoints3D.push_back(cv::Point3d(-kMarkerRealSize / 2.0f, kMarkerRealSize / 2.0f, 0));
and imagePoints are coordinates of marker corners in processing image (I'm using custom algorithm to do that)
In order to properly debug your problem I would need all the code :-)
I assume you are following the approach suggested in the tutorials (calibration and pose) cited by #kobejohn in his comment and so that your code follows these steps:
collect various images of chessboard target
find chessboard corners in images of point 1)
calibrate the camera (with cv::calibrateCamera) and so obtain as a result the intrinsic camera parameters (let's call them intrinsic) and the lens distortion parameters (let's call them distortion)
collect an image of your own custom target (the target is seen at 0:57 in your video) and it is shown in the following figure and find some relevant points in it (let's call the point you found in image image_custom_target_vertices and world_custom_target_vertices the corresponding 3D points).
estimate the rotation matrix (let's call it R) and the translation vector (let's call it t) of the camera from the image of your own custom target you get in point 4), with a call to cv::solvePnP like this one cv::solvePnP(world_custom_target_vertices,image_custom_target_vertices,intrinsic,distortion,R,t)
giving the 8 corners cube in 3D (let's call them world_cube_vertices) you get the 8 2D image points (let's call them image_cube_vertices) by means of a call to cv2::projectPoints like this one cv::projectPoints(world_cube_vertices,R,t,intrinsic,distortion,image_cube_vertices)
draw the cube with your own draw function.
Now, the final result of the draw procedure depends on all the previous computed data and we have to find where the problem lies:
Calibration: as you observed in your answer, in 3) you should discard the images where the corners are not properly detected. You need a threshold for the reprojection error in order to discard "bad" chessboard target images. Quoting from the calibration tutorial:
Re-projection Error
Re-projection error gives a good estimation of just how exact is the
found parameters. This should be as close to zero as possible. Given
the intrinsic, distortion, rotation and translation matrices, we first
transform the object point to image point using cv2.projectPoints().
Then we calculate the absolute norm between what we got with our
transformation and the corner finding algorithm. To find the average
error we calculate the arithmetical mean of the errors calculate for
all the calibration images.
Usually you will find a suitable threshold with some experiments. With this extra step you will get better values for intrinsic and distortion.
Finding you own custom target: it does not seem to me that you explain how you find your own custom target in the step I labeled as point 4). Do you get the expected image_custom_target_vertices? Do you discard images where that results are "bad"?
Pose of the camera: I think that in 5) you use intrinsic found in 3), are you sure nothing is changed in the camera in the meanwhile? Referring to the Callari's Second Rule of Camera Calibration:
Second Rule of Camera Calibration: "Thou shalt not touch the lens
after calibration". In particular, you may not refocus nor change the
f-stop, because both focusing and iris affect the nonlinear lens
distortion and (albeit less so, depending on the lens) the field of
view. Of course, you are completely free to change the exposure time,
as it does not affect the lens geometry at all.
And then there may be some problems in the draw function.
So, I've experimented a lot with my code, and I still haven't fixed the main issue (shifted objects), but I've managed to answer some of calibration questions I've asked.
First of all - in order to obtain good calibration results you have to use images with properly detected grid elements/circles positions!. Using all captured images in calibration process (even those that aren't properly detected) will result bad calibration.
I've experimented with various calibration patterns:
Asymmetric circles pattern (CALIB_CB_ASYMMETRIC_GRID), give much worse results than any other pattern. By worse results I mean that it produces a lot of wrongly detected corners like these:
I've experimented with CALIB_CB_CLUSTERING and it haven't helped much - in some cases (different light environment) it got better, but not much.
Symmetric circles pattern (CALIB_CB_SYMMETRIC_GRID) - better results than asymmetric grid, but still I've got much worse results than standard grid (chessboard). It often produces errors like these:
Chessboard (found using findChessboardCorners function) - this method is producing best possible results - it doesn't produce misaligned corners very often, and almost every calibration is producing similar results to best-possible results from symmetric circles grid
For every calibration I've been using 20-30 images that were coming from different angles. I've tried even with 100+ images but it haven't produced noticeable change in calibration results than smaller amount of images. It's worth noticing that larger number of test images is increasing time needed to compute camera parameters in non-linear way (100 test images in 480x360 resolution are computing 25 minutes in iPad4, compared with 4 minutes with ~50 images)
I've also experimented with solvePNP parameters - but is also haven't gave me any acceptable results: I've tried all 3 detection methods (ITERATIVE, EPNP and P3P), but I haven't seen aby noticeable change.
Also I've tried with useExtrinsicGuess set to true, and I've used rvec and tvec from previous detection, but this one resulted with complete disapperance of detected cube.
I've ran out of ideas - what else could be affecting these shifting problems?
For those still interested:
this is an old question, but I think your problem is not the bad calibration.
I developed an AR app for iOS, using OpenCV and SceneKit, and I have had your same issue.
I think your problem is the wrong render position of the cube:
OpenCV's solvePnP returns the X, Y, Z coordinates of the marker center, but you wanna render the cube over the marker, at a specific distance along the Z axis of the marker, exactly at one half of the cube side size. So you need to improve the Z coordinate of the marker translation vector of this distance.
In fact, when you see your cube from the top, the cube is render properly.
I have done an image in order to explain the problem, but my reputation prevent to post it.

CUDA Circle Hough Transform [duplicate]

I'm trying to implement a maximum performance Circle Hough Transform in CUDA, whereby edge pixel coordinates cast votes in the hough space. Pseudo code for the CHT is as follows, I'm using image sizes of 256 x 256 pixels:
int maxRadius = 100;
int minRadius = 20;
int imageWidth = 256;
int imageHeight = 256;
int houghSpace[imageWidth x imageHeight * maxRadius];
for(int radius = minRadius; radius < maxRadius; ++radius)
{
for(float theta = 0.0; theta < 180.0; ++theta)
{
xCenter = edgeCoordinateX + (radius * cos(theta));
yCenter = edgeCoordinateY + (radius * sin(theta));
houghSpace[xCenter, yCenter, radius] += 1;
}
}
My basic idea is to have each thread block calculate a (small) tile of the output Hough space (maybe one block for each row of the output hough space). Therefore, I need to get the required part of the input image into shared memory somehow in order to carry out the voting in a particular output sub-hough space.
My questions are as follows:
How do I calculate and store the coordinates for the required part of the input image in shared memory?
How do I retrieve the x,y coordinates of the edge pixels, previously stored in shared memory?
Do I cast votes in another shared memory array or write the votes directly to global memory?
Thanks everyone in advance for your time. I'm new to CUDA and any help with this would be gratefully received.
I don't profess to know much about this sort of filtering, but the basic idea of propagating characteristics from a source doesn't sound too different to marching and sweeping methods for solving the stationary Eikonal equation. There is a very good paper on solving this class of problem (PDF might still be available here):
A Fast Iterative Method for Eikonal Equations. Won-Ki Jeong, Ross T.
Whitaker. SIAM Journal on Scientific Computing, Vol 30, No 5,
pp.2512-2534, 2008
The basic idea is to decompose the computational domain into tiles, and the sweep the characteristic from source across the domain. As tiles get touched by the advancing characteristic, they get added to a list of active tiles and calculated. Each time a tile is "solved" (converged to a numerical tolerance in the Eikonal case, probably a state in your problem) it is retired from the working set and its neighbours are activated. If a tile is touched again, it is re-added to the active list. The process continues until all tiles are calculated and the active list is empty. Each calculation iteration can be solved by a kernel launch, which explictly synchronizes the calculation. Run as many kernels in series as required to reach an empty work list.
I don't think it is worth trying to answer your questions until you have a more concrete algorithmic approach and are getting into implementation details.

CUDA implementation of the Circle Hough Transform

I'm trying to implement a maximum performance Circle Hough Transform in CUDA, whereby edge pixel coordinates cast votes in the hough space. Pseudo code for the CHT is as follows, I'm using image sizes of 256 x 256 pixels:
int maxRadius = 100;
int minRadius = 20;
int imageWidth = 256;
int imageHeight = 256;
int houghSpace[imageWidth x imageHeight * maxRadius];
for(int radius = minRadius; radius < maxRadius; ++radius)
{
for(float theta = 0.0; theta < 180.0; ++theta)
{
xCenter = edgeCoordinateX + (radius * cos(theta));
yCenter = edgeCoordinateY + (radius * sin(theta));
houghSpace[xCenter, yCenter, radius] += 1;
}
}
My basic idea is to have each thread block calculate a (small) tile of the output Hough space (maybe one block for each row of the output hough space). Therefore, I need to get the required part of the input image into shared memory somehow in order to carry out the voting in a particular output sub-hough space.
My questions are as follows:
How do I calculate and store the coordinates for the required part of the input image in shared memory?
How do I retrieve the x,y coordinates of the edge pixels, previously stored in shared memory?
Do I cast votes in another shared memory array or write the votes directly to global memory?
Thanks everyone in advance for your time. I'm new to CUDA and any help with this would be gratefully received.
I don't profess to know much about this sort of filtering, but the basic idea of propagating characteristics from a source doesn't sound too different to marching and sweeping methods for solving the stationary Eikonal equation. There is a very good paper on solving this class of problem (PDF might still be available here):
A Fast Iterative Method for Eikonal Equations. Won-Ki Jeong, Ross T.
Whitaker. SIAM Journal on Scientific Computing, Vol 30, No 5,
pp.2512-2534, 2008
The basic idea is to decompose the computational domain into tiles, and the sweep the characteristic from source across the domain. As tiles get touched by the advancing characteristic, they get added to a list of active tiles and calculated. Each time a tile is "solved" (converged to a numerical tolerance in the Eikonal case, probably a state in your problem) it is retired from the working set and its neighbours are activated. If a tile is touched again, it is re-added to the active list. The process continues until all tiles are calculated and the active list is empty. Each calculation iteration can be solved by a kernel launch, which explictly synchronizes the calculation. Run as many kernels in series as required to reach an empty work list.
I don't think it is worth trying to answer your questions until you have a more concrete algorithmic approach and are getting into implementation details.

Finding the actual size of an object using kinect depth data

I was wondering how would I figure out the actual size of the object, using the kinect depth values.
For example, if the kinect sees a round object in front of it, and the round object take 100 pixels of space in the image, and the depth value the kinect gives is x, how would I know the actual size of the round object?
I don't need it in units like meters or anything, I am just trying to find a formula to calculate the size of object that is independant from how far the object is from the kinect.
I am using OpenCV and the kinect SDK, if anything is useful there please let me know.
Thanks in advance.
To find the size in 3d, given a size in 2d, you just do:
3d_rad = 2d_rad * depth
So if the ball appears on the screen as 10 pixels wide and is 1 metre away, it really is 10 "units" wide. Do a little playing to find out the units returned, I'm unsure what they will be.
Suppose you have a 20 pixel radius ball on screen and the depth is returned as 30, the real size of the ball is 20*30 = 600 units. Again, I'm unsure what unit exactly, it depends on the camera, but it is a constant so play around with it. Put a 1 metre ball in front of the camera, far enough away that it looks like 100 pixels. The reciprocal of that distance should be the conversion factor to turn the units you have into centimetres and can be used as a constant. For example:
3d_rad_in_cm = conversion * 2d_rad * depth

Resources