UIImage part of the image in focus - ios

I'm trying to extend my understanding of the AVFoundation framework.
I want to add a Bezier Path (not necessarily a high resolution one) around the area of an image that is in focus.
So, given a UIImage, is it possible to know which points of the UIImage are in focus and which aren't?
(I'm not sure whether any of the GPUImage "detection filters" would be useful for what I'm trying to do.)

One way would be to look for areas of high frequencies vs. low frequencies. The low frequency areas are more likely to be out-of-focus.
You could do this with a fast Fourier transform, but a cheap hack is to blur your input image and then compare the blurred version to the original. The lower the absolute difference at a given point, the lower the frequency content of the input image there. However, this has the downside of detecting areas of flat color as "out-of-focus". Though I guess it's hard for a human to distinguish those as well, unless there's other context in the image.
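A minimal sketch of that blur-and-compare idea, written with OpenCV for illustration (the kernel size and threshold are arbitrary values you would tune; on iOS the same steps could be reproduced with GPUImage's blur filter or Core Image):

#include <opencv2/opencv.hpp>

// Crude focus map: blur the image, compare with the original, and threshold
// the absolute difference. Bright areas of the mask are likely in focus.
cv::Mat focusMask(const cv::Mat& bgr)
{
    cv::Mat gray, blurred, diff, mask;
    cv::cvtColor(bgr, gray, cv::COLOR_BGR2GRAY);
    cv::GaussianBlur(gray, blurred, cv::Size(9, 9), 0);      // kernel size: tune
    cv::absdiff(gray, blurred, diff);                        // large where detail was lost
    cv::threshold(diff, mask, 10, 255, cv::THRESH_BINARY);   // threshold: tune
    cv::morphologyEx(mask, mask, cv::MORPH_CLOSE,            // merge nearby blobs
                     cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(15, 15)));
    return mask;
}

Running cv::findContours on that mask would give you the outlines to convert into a (rough) Bezier path around the in-focus region.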

Related

How can I quickly and reliably estimate blur severity in a photo of a document?

Suppose I have a 20 MP photo of some document containing printed or handwritten text. The text, and of course the background, can be mildly distorted by shadows, halos from flash lighting or a lamp, etc.
I want to estimate the blur in the top half and in the bottom half of the image. Since printed (and hopefully handwritten) text is far sharper than general-purpose camera resolutions/settings can resolve, I assume text-to-background boundaries are effectively infinitely sharp. I am thinking of detecting the minimum number (or 1st percentile) of pixels that form the boundary between minBrightness+5% (text color) and maxBrightness-5% inside a local brightness window, because the dynamic range and lighting conditions change in different parts of the photo. So, if at best I need 3 pixels to cross from BlackPoint to WhitePoint, I would infer that my blur size is roughly 2 pixels.
There are a few problems with my idea. The algorithm I am thinking of seems much slower than a filter. It could give misleading results if I run it on a region that has no text at all (e.g. a document whose lower half is entirely blank), so it has to rely on a hardcoded minimum dynamic range (e.g. if maxBrightness - minBrightness < 100, there is no text; do not try to estimate blur). Thirdly, the algorithm does not seem very robust with regard to noise, shadows, etc., and it could fail if the actual text font is not black-on-white but grayscale for aesthetic purposes.
Given my concerns, is there a fast and robust, moderately accurate algorithm that will do the task better than the algorithm I have in mind?
PS for now I am assuming uniform blur as opposed to directional blur because the direction of the blur is not central to my task.
Since your text should be sharp, it seems like a general "in focus" or blur detector might work. Something like: Is there a way to detect if an image is blurry? and Detection of Blur in Images/Video sequences applied to sections of your image.
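For reference, the usual fast detector behind those links is the variance of the Laplacian; here is a minimal sketch applied to the two halves separately (the file name is a placeholder, and any sharp/blurry cutoff is an assumption you would calibrate on your own documents):

#include <opencv2/opencv.hpp>

// Higher score = sharper. A hardcoded dynamic-range check (as described
// above) can still be used first to skip regions with no text at all.
double blurScore(const cv::Mat& gray)
{
    cv::Mat lap;
    cv::Laplacian(gray, lap, CV_64F);
    cv::Scalar mean, stddev;
    cv::meanStdDev(lap, mean, stddev);
    return stddev[0] * stddev[0];   // variance of the Laplacian
}

// Usage (hypothetical file name):
// cv::Mat gray = cv::imread("document.jpg", cv::IMREAD_GRAYSCALE);
// double top    = blurScore(gray(cv::Rect(0, 0, gray.cols, gray.rows / 2)));
// double bottom = blurScore(gray(cv::Rect(0, gray.rows / 2, gray.cols, gray.rows / 2)));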

is it possible to take low resolution image from street camera, increase it and see image details

I would like to know if it is possible to take a low-resolution image from a street camera, enlarge it,
and see image details (for example a face, or a car's plate number). Is there any software that is able to do it?
Thank you.
example of image: http://imgur.com/9Jv7Wid
Possible? Yes. In existence? Not to my knowledge.
What you are referring to is called super-resolution. In theory, it works by combining multiple low-resolution images to create a single high-resolution image.
The way this works is that you essentially map each image onto all the others to form a stack, where the target portion of the image is all the same. This gets extremely complicated extremely fast as any distortion (e.g. movement of the target) will cause the images to differ dramatically, on the pixel level.
But let's say you have the images stacked and have removed the non-relevant pixels from the stack. You are hopefully left with a movie/stack of images that all show the exact same scene, but with sub-pixel distortions. A sub-pixel distortion simply means that the target has moved somewhere inside a pixel, or has moved partially into a neighboring pixel.
You can't measure whether the target has moved within a pixel, but you can detect whether it has moved partially into a neighboring pixel. You can do this by knowing that the target is going to give off X amount of photons, so if you see 1/4 of the photons in one pixel and 3/4 of the photons in the neighboring pixel, you know its approximate location, which is 3/4 in one pixel and 1/4 in the other. You then construct an image that has the resolution of these sub-pixels and place the sub-pixels in their proper places.
All of this gets very computationally intensive, and sometimes the images are just too low-resolution and have too much distortion from image to image to create a meaningful stack at all. I did read a paper about a university lab being able to create high-resolution images from low-resolution ones, but it was a very tightly controlled experiment, where they moved the target precisely X amount from image to image and used a very precise camera (probably scientific grade, which is far more sensitive than any commercial-grade security camera).
In essence, to do this reliably in the real world you need to set up the cameras in a very precise way, and they need to be very accurate in a particular way, which is going to be expensive. So you are better off just putting in a better camera than relying on this very imprecise technique.
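For intuition, here is the naive "shift-and-add" version of the stacking described above, sketched with OpenCV; it uses phase correlation for the per-frame shift and simple averaging, whereas real systems add careful sub-pixel registration, outlier rejection and deconvolution on top:

#include <opencv2/opencv.hpp>
#include <vector>

// Naive multi-frame super-resolution: upscale each frame, align it to the
// first frame via phase correlation, and average the aligned stack.
cv::Mat shiftAndAdd(const std::vector<cv::Mat>& grayFrames, int scale = 2)
{
    CV_Assert(!grayFrames.empty());
    cv::Mat ref, refF, acc;
    cv::resize(grayFrames[0], ref, cv::Size(), scale, scale, cv::INTER_CUBIC);
    ref.convertTo(refF, CV_32F);
    refF.copyTo(acc);

    for (size_t i = 1; i < grayFrames.size(); ++i) {
        cv::Mat up, upF;
        cv::resize(grayFrames[i], up, cv::Size(), scale, scale, cv::INTER_CUBIC);
        up.convertTo(upF, CV_32F);
        cv::Point2d shift = cv::phaseCorrelate(refF, upF);   // estimated translation
        // Undo the shift to align with the reference (flip the sign if your
        // OpenCV version uses the opposite convention).
        cv::Mat M = (cv::Mat_<double>(2, 3) << 1, 0, -shift.x, 0, 1, -shift.y);
        cv::warpAffine(upF, upF, M, upF.size());
        acc += upF;
    }
    acc /= static_cast<float>(grayFrames.size());
    cv::Mat result;
    acc.convertTo(result, CV_8U);
    return result;
}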
Actually, it is possible to do super-resolution (SR) from even a single low-resolution (LR) image! So you don't have to go through the hassle of taking many LR images with sub-pixel shifts to achieve it. The intuition behind such techniques is that natural scenes are full of repetitive patterns that can be used to enhance the frequency content of similar patches (e.g. you can use dictionary learning in your SR reconstruction technique to generate the high-resolution version). Sure, the enhancement may not be as good as using many LR images, but such a technique is simpler and more practical.
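If you want to experiment with single-image SR, OpenCV's contrib module dnn_superres wraps several pretrained learned models; it is not dictionary learning as such, but the same example-based idea. Note it requires opencv_contrib, and the model file and image paths below are placeholders you would supply yourself:

#include <opencv2/opencv.hpp>
#include <opencv2/dnn_superres.hpp>   // from opencv_contrib

int main()
{
    cv::dnn_superres::DnnSuperResImpl sr;
    sr.readModel("EDSR_x4.pb");                  // placeholder: downloaded model file
    sr.setModel("edsr", 4);                      // algorithm name and upscale factor
    cv::Mat lr = cv::imread("camera_frame.png"); // placeholder input image
    cv::Mat hr;
    sr.upsample(lr, hr);                         // learned single-image super-resolution
    cv::imwrite("camera_frame_x4.png", hr);
    return 0;
}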
Photoshop would be your best bet. But know that you cannot reliably increase the size of an image without making the quality even worse.

Image Segmentation/Background Subtraction

My current project is to calculate the surface area of the paste covered on the cylinder.
Refer the images below. The images below are cropped from the original images taken via a phone camera.
I am thinking of approaches like segmentation, but due to the light reflections and shadows a simple segmentation won't work out.
Can anyone tell me how to find the surface area covered by paste on the cylinder?
First I'd simplify the problem by rectifying the perspective effect (you may need to upscale the image to not lose precision here).
Then I'd scan vertical lines across the image.
Further, you can simplify the problem by segmenting the pixels into two classes, base and painted. Do some statistical analysis to find the range of the larger region, consisting of base pixels; this will probably make use of the median of all pixels.
Then you expand the color space around this representative pixel until you find the largest gap in color distance. Repeat the procedure to retrieve the painted pixels. There are other image-processing routines you may have to apply, such as smoothing out noise, removing outliers and the background, etc.
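A rough OpenCV sketch of those steps; the four corner points passed in and the Otsu threshold stand in for the perspective rectification and the statistical analysis described above, and both are assumptions to adapt to your images:

#include <opencv2/opencv.hpp>
#include <vector>

// 1. Rectify the perspective, 2. split the pixels into two classes,
// 3. report the painted fraction of the rectified patch.
double paintedFraction(const cv::Mat& bgr, const std::vector<cv::Point2f>& corners)
{
    const cv::Size outSize(800, 400);            // hypothetical rectified size
    std::vector<cv::Point2f> dst = {
        {0.f, 0.f}, {(float)outSize.width, 0.f},
        {(float)outSize.width, (float)outSize.height}, {0.f, (float)outSize.height}};
    cv::Mat H = cv::getPerspectiveTransform(corners, dst);

    cv::Mat rect, gray, mask;
    cv::warpPerspective(bgr, rect, H, outSize);
    cv::cvtColor(rect, gray, cv::COLOR_BGR2GRAY);
    cv::GaussianBlur(gray, gray, cv::Size(5, 5), 0);    // tame noise and glare a little
    cv::threshold(gray, mask, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);

    // Otsu splits at the largest gap between the two classes; check on your
    // images whether the paste ends up as the white or the black class.
    return cv::countNonZero(mask) / (double)mask.total();
}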

Balancing contrast and brightness between stitched images

I'm working on an image stitching project, and I understand there are different approaches to dealing with the contrast and brightness of an image. I could of course deal with this issue before I even stitch the images, but the result is still not as consistent as I would hope. So my question is whether it's possible to "balance" or rather "equalize" the contrast and brightness of color pictures after the stitching has taken place.
You want to determine the histogram equalization function not from the entire images, but from the zone where they will touch or overlap. You obviously want the histograms in the overlap area to be identical, so this is where you calculate the functions. You then apply the equalization functions that accomplish this to the entire images. If you have more than two stitches, you still want a global equalization beforehand, and then a weighted application of the overlap-equalizing functions whose impact decreases as you move away from the stitched edge.
Apologies if this is all obvious to you already, but your general question leads me to a general answer.
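A minimal sketch of that idea: build grayscale CDFs over the two overlap strips, derive a lookup table that maps one image's tones onto the other's, and apply it to the whole image. Per-channel handling and the weighted falloff away from the seam are left out for brevity:

#include <opencv2/opencv.hpp>
#include <vector>

// Match the histogram of srcGray to the reference, using only the pixels
// in the overlap strips of the two images to build the mapping.
cv::Mat matchToReference(const cv::Mat& srcGray, const cv::Mat& srcOverlap,
                         const cv::Mat& refOverlap)
{
    auto cdf = [](const cv::Mat& img) {
        int histSize = 256;
        int channels[] = {0};
        float range[] = {0, 256};
        const float* ranges[] = {range};
        cv::Mat hist;
        cv::calcHist(&img, 1, channels, cv::Mat(), hist, 1, &histSize, ranges);
        std::vector<double> c(256);
        double sum = 0;
        for (int i = 0; i < 256; ++i) { sum += hist.at<float>(i); c[i] = sum; }
        for (double& v : c) v /= sum;
        return c;
    };

    std::vector<double> cs = cdf(srcOverlap), cr = cdf(refOverlap);
    cv::Mat lut(1, 256, CV_8U);
    for (int i = 0, j = 0; i < 256; ++i) {
        while (j < 255 && cr[j] < cs[i]) ++j;   // matching quantile in the reference
        lut.at<uchar>(i) = (uchar)j;
    }
    cv::Mat out;
    cv::LUT(srcGray, lut, out);                 // apply to the entire image
    return out;
}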
You may want to have a look at the Exposure Compensator class provided by OpenCV.
Exposure compensation is done in 3 steps:
Create your exposure compensator
Ptr<ExposureCompensator> compensator = ExposureCompensator::createDefault(expos_comp_type);
You input all of your images along with the top left corners of each of them. You can leave the masks completely white by default unless you want to specify certain parts of the image to work on.
compensator->feed(corners, images, masks);
Now that it has all the information about how the images overlap, you can compensate each image individually:
compensator->apply(image_index, corners[image_index], image, mask);
The compensated image will be stored in image
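Putting the three steps together in one sketch; the GAIN_BLOCKS compensator type and the UMat conversions are my own choices here, not something the answer prescribes:

#include <opencv2/opencv.hpp>
#include <opencv2/stitching/detail/exposure_compensate.hpp>
#include <vector>

using namespace cv;
using namespace cv::detail;

void compensateAll(std::vector<Mat>& images, const std::vector<Point>& corners)
{
    // 1. Create the compensator (block-wise gain is a common choice).
    Ptr<ExposureCompensator> compensator =
        ExposureCompensator::createDefault(ExposureCompensator::GAIN_BLOCKS);

    // 2. Feed it every image, its top-left corner, and an all-white mask.
    std::vector<UMat> uimages, masks;
    for (const Mat& img : images) {
        UMat u, m;
        img.copyTo(u);
        m.create(img.size(), CV_8U);
        m.setTo(Scalar::all(255));
        uimages.push_back(u);
        masks.push_back(m);
    }
    compensator->feed(corners, uimages, masks);

    // 3. Compensate each image in place.
    for (size_t i = 0; i < images.size(); ++i)
        compensator->apply(int(i), corners[i], images[i], masks[i]);
}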

Fast, reliable focus score for camera frames

I'm doing real-time frame-by-frame analysis of a video stream in iOS.
I need to assign a score to each frame for how in focus it is. The method must be very fast to calculate on a mobile device and should be fairly reliable.
I've tried simple things like summing after using an edge detector, but haven't been impressed by the results. I've also tried using the focus scores provided in the frame's metadata dictionary, but they're significantly affected by the brightness of the image, and much more device-specific.
What are good ways to calculate a fast, reliable focus score?
Poor focus means that edges are not very sharp, and small details are lost. High JPEG compression gives very similar distortions.
Compress a copy of your image heavily, unpack it, and calculate the difference with the original. An intense difference, even at a few spots, should mean that the source image had sharp details that were lost in compression. If the difference is relatively small everywhere, the source was already fuzzy.
The method can be easily tried in an image editor. (No, I did not yet try it.) Hopefully iPhone has an optimized JPEG compressor already.
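A sketch of that compress-and-compare score using OpenCV's in-memory JPEG codec; the quality setting of 10 is an arbitrary "heavy compression" assumption, and on iOS the re-encoding step could equally be done with UIImage's JPEG APIs:

#include <opencv2/opencv.hpp>
#include <vector>

// Compress heavily to JPEG in memory, decode again, and measure how much
// detail was destroyed. Sharp frames lose more, so higher = more in focus.
// Assumes an 8-bit BGR frame.
double jpegFocusScore(const cv::Mat& frame)
{
    std::vector<uchar> buf;
    std::vector<int> params = {cv::IMWRITE_JPEG_QUALITY, 10};  // heavy compression
    cv::imencode(".jpg", frame, buf, params);
    cv::Mat crushed = cv::imdecode(buf, cv::IMREAD_COLOR);

    cv::Mat diff;
    cv::absdiff(frame, crushed, diff);
    cv::Scalar m = cv::mean(diff);
    return m[0] + m[1] + m[2];
}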
A simple approach, which the human visual system probably uses, is to implement focusing on top of edge tracking. If a set of edges can be tracked across a visual sequence, one can work with the intensity profile of those edges alone to determine when it is steepest.
From a theoretical point of view, blur manifests as a loss of high-frequency content. Thus, you can just do an FFT and check the relative frequency distribution. The iPhone uses ARM Cortex chips, which have NEON instructions that can be used for an efficient FFT implementation.
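A rough OpenCV version of that frequency check, scoring a frame by the fraction of spectral energy outside a small low-frequency block (the one-eighth cutoff is an arbitrary assumption; on the device you would likely use Accelerate/vDSP or a NEON FFT rather than cv::dft):

#include <opencv2/opencv.hpp>
#include <vector>

// Fraction of spectral energy outside the low-frequency corners of the DFT.
// Lower values suggest a blurrier (lower-frequency) frame.
double highFreqRatio(const cv::Mat& gray)
{
    cv::Mat f, spectrum;
    gray.convertTo(f, CV_32F);
    cv::dft(f, spectrum, cv::DFT_COMPLEX_OUTPUT);

    std::vector<cv::Mat> planes;
    cv::split(spectrum, planes);
    cv::Mat mag;
    cv::magnitude(planes[0], planes[1], mag);

    // Without an fftshift the low frequencies sit in the four corners;
    // sum them and compare the rest of the energy to the total.
    int bw = mag.cols / 8, bh = mag.rows / 8;                 // cutoff: tune
    double low = cv::sum(mag(cv::Rect(0, 0, bw, bh)))[0]
               + cv::sum(mag(cv::Rect(mag.cols - bw, 0, bw, bh)))[0]
               + cv::sum(mag(cv::Rect(0, mag.rows - bh, bw, bh)))[0]
               + cv::sum(mag(cv::Rect(mag.cols - bw, mag.rows - bh, bw, bh)))[0];
    double total = cv::sum(mag)[0];
    return (total - low) / total;
}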
#9000's suggestion of heavy JPEG compression has the effect of keeping only a small number of the largest transform coefficients, which usually amounts, in essence, to a low-pass filter.
Consider different kinds of edges, e.g. peaks versus step edges. The latter will still be present regardless of focus. To isolate the former, use non-maximum suppression in the direction of the gradient. As a focus score, use the ratio of suppressed edges at two different resolutions.
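One loose reading of that, sketched with Canny (which already applies non-maximum suppression along the gradient) at two resolutions; the thresholds are placeholders and this is only a rough proxy for what the answer describes:

#include <opencv2/opencv.hpp>
#include <algorithm>

// Compare how many NMS-surviving edge pixels exist at full vs. half
// resolution; coarse step edges survive downscaling, fine in-focus detail
// tends not to.
double edgeScaleRatio(const cv::Mat& gray)
{
    cv::Mat half, edgesFull, edgesHalf;
    cv::resize(gray, half, cv::Size(), 0.5, 0.5, cv::INTER_AREA);
    cv::Canny(gray, edgesFull, 50, 150);                // thresholds: placeholders
    cv::Canny(half, edgesHalf, 50, 150);
    double nFull = cv::countNonZero(edgesFull);
    double nHalf = cv::countNonZero(edgesHalf) * 4.0;   // normalize for area
    return nFull / std::max(nHalf, 1.0);                // higher = more fine detail
}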