I have just finished building a camera with AVCaptureSession for scanning documents on iPhone, and I am looking for a way to determine whether the captured image is of good quality and not blurred.
I have seen many solutions that use OpenCV, and I am looking for other options.
Any help would be appreciated.
Thanks
First of all, interesting question; it made me do some research to figure things out myself. In general, Analysis of focus measure operators for shape-from-focus is a great research paper that covers a number of methods (36, to be precise) for measuring blurriness in an image, from simple and straightforward ones to more complex ones.
I have used a basic Laplacian operation on one channel of the image (essentially the second derivative of the pixel values) to measure blurriness, and it worked quite well for me. Once you convolve the channel with the Laplacian operator, the variance of the resulting image is a good estimate of sharpness. The assumption is that high variance means a wide spread of responses, both edge-like and non-edge-like, which is representative of a normal, in-focus image. Very low variance means a tiny spread of responses, indicating that there are very few edges in the image. The more an image is blurred, the fewer edges it has. The trick is to find a suitable threshold separating high from low variance, which you can determine by running the measure on your own dataset.
Courtesy: Blog
PS. Although the blog I reference here mentions "OpenCV", the methods can be implemented however you like once you understand the concept, which is why I started the answer with the research paper.
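To make the variance-of-Laplacian idea concrete, here is a minimal sketch in plain Python with no OpenCV. It assumes the image is already a single channel stored as a list of rows of numbers; the function names and the 4-neighbour kernel are my own choices for illustration, and the threshold is something you would have to tune on your own images.

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian response over interior pixels.

    `gray` is a 2D list (rows of numbers) holding one image channel.
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Discrete second derivative: sum of the 4 neighbours minus 4x the centre.
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            responses.append(lap)
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def is_blurry(gray, threshold):
    """Low variance means few edge responses, i.e. a likely blurred image.

    `threshold` is dataset dependent and has to be found empirically.
    """
    return laplacian_variance(gray) < threshold
```

The same arithmetic ports directly to any language; on a phone you would most likely run it on a downscaled grayscale copy of the captured frame to keep it cheap.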
I've spent some time trying to calibrate two similar cameras (ExCam IPQ1715 and ExCam IPQ1765) to varying degrees of success, with the eventual goal of using them for short-range photogrammetry. I've been using a charuco board, along with the OpenCV Charuco calibration library, and have noticed that the quality of my calibration is closely tied to how much of the images is taken up by the board. (I measure calibration quality by RMS reprojection error given by OpenCV, and also by just seeing if the undistorted images appear to have straighter lines on the board than the originals).
I'm still pretty inexperienced, and there have been other factors messing with my calibration (leaving autofocus on, OpenCV charuco identification sometimes getting strange false positives on some images without me noticing), so my question is less about my results and more about best practice for camera calibration in general:
How crucial is it that the board (charuco, chessboard) take up most of the image space? Is there generally a minimum amount that it should cover? Is this even an issue at all, or am I likely mistaking it for another cause of bad calibration?
I've seen lots of calibration tutorials online where the board seems to take up a small portion of the image, but then have also found other people experiencing similar issues. In short, I'm horribly lost.
Any guidance would be awesome - thanks!
Keep in mind that camera calibration is a model-fitting problem,
i.e. the model parameters are optimized against the measurements.
So you should pay attention to the following:
If the board is too small in the image for the distortion to be visible,
is it possible to optimize the distortion parameters from such an image?
If the pattern only appears near the center of the image,
is it possible to estimate valid parameter values for regions far from the center?
(This will be an extrapolation.)
If the pattern distribution is not uniform, the density of the data can affect the results.
e.g. with least-squares optimization, errors in regions with little data contribute little to the total and can end up being neglected.
Therefore, my suggestion is:
Pattern images that are extremely small are useless.
The data should cover the entire field of view of the camera image, and the distribution should be as uniform as possible.
Use enough data. Too little data may cause overfitting.
Check the pattern recognition results for all images (sample code often omits this).
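One way to check the coverage suggestion above is to bin all detected corner positions into a coarse grid over the image and look for empty or sparse cells. This is only a sketch: `coverage_grid` and `empty_cells` are hypothetical helper names, the grid size is arbitrary, and the corner lists are assumed to be plain (x, y) pixel coordinates that you have already extracted from your detector.

```python
def coverage_grid(corner_sets, image_size, grid=(4, 4)):
    """Count detected corners per grid cell, accumulated over all images.

    corner_sets: one list of (x, y) pixel coordinates per calibration image.
    image_size:  (width, height) in pixels.
    grid:        (columns, rows) of the coarse grid.
    """
    width, height = image_size
    cols, rows = grid
    counts = [[0] * cols for _ in range(rows)]
    for corners in corner_sets:
        for x, y in corners:
            # Clamp so that points on the right/bottom border land in the last cell.
            c = min(cols - 1, int(x * cols / width))
            r = min(rows - 1, int(y * rows / height))
            counts[r][c] += 1
    return counts


def empty_cells(counts):
    """Return (row, column) of every grid cell that received no corners."""
    return [(r, c)
            for r, row in enumerate(counts)
            for c, n in enumerate(row)
            if n == 0]
```

If `empty_cells` reports cells along the borders, the fitted distortion parameters are being extrapolated there, so it is worth capturing more images with the board in those regions.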
I've been processing some image frames in videos and I discovered that sometimes one or two frames of the video will have artifacts or noise like the images below:
The artifacts look like abrasions of paint with noisy colors, and they cover only a small region of the image (less than 100x100 in a 1000x2000 frame). Is there a way to detect the noisy frames? I've tried frame differences with SSIM, NMSE and PSNR, but found them of limited effectiveness. A saliency map (left) or Sobel/Scharr filtering (right) gives a more obvious view, but regular borders are also picked up and I'm not sure how to form a classifier.
Scharr saliency map:
Since only a few frames in each video are affected, denoising isn't really necessary; I can just remove the frames once they are detected. The main problem is that these frames are difficult to spot while the video is playing.
Can anybody offer some help here?
Detailing the comment as an answer with a few more details:
The Scharr and saliency maps look good.
Thresholding will result in a binary image which can be cleaned up with morphological filters (erode to enhance artefacts, dilate to 'erase' gradient contours).
Finding contours will result in lists of points which can be further processed/filtered using contour features.
If the gradients are always bigger than the artefacts, contour features, such as the bounding box dimensions and aspect ratio should help segment artefact contours from gradient contours (if any: hopefully dilation would've cleaned up the thresholded/binary image).
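Here is a rough sketch of the threshold, then find blobs, then filter by bounding box idea in plain Python. It uses a simple 4-connected flood fill as a stand-in for proper contour finding and skips the morphological clean-up step, so treat it as an illustration of the filtering logic rather than a replacement for a real image library. All function names and parameters here are made up for the example.

```python
from collections import deque


def threshold(img, level):
    """Binary mask: 1 where the filter response exceeds `level`, else 0."""
    return [[1 if v > level else 0 for v in row] for row in img]


def bounding_boxes(mask):
    """Bounding box (x, y, width, height) of every 4-connected blob in the mask."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for sy in range(height):
        for sx in range(width):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(sy, sx)])
            seen[sy][sx] = True
            x0 = x1 = sx
            y0 = y1 = sy
            while queue:
                y, x = queue.popleft()
                x0, x1 = min(x0, x), max(x1, x)
                y0, y1 = min(y0, y), max(y1, y)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((x0, y0, x1 - x0 + 1, y1 - y0 + 1))
    return boxes


def small_compact_boxes(boxes, max_side, max_aspect):
    """Keep boxes that are small and roughly square (candidate artefacts)."""
    keep = []
    for x, y, bw, bh in boxes:
        aspect = max(bw, bh) / min(bw, bh)
        if max(bw, bh) <= max_side and aspect <= max_aspect:
            keep.append((x, y, bw, bh))
    return keep
```

A frame would then be flagged when `small_compact_boxes` returns anything, under the assumption stated above that regular gradient contours are larger or more elongated than the artefacts.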
Another idea could be looking into oriented gradients:
either compute the oriented gradients (see visualisations): with the right cell size you might strike a balance where the artefacts have a high magnitude while gradient edges don't
you could try a full histogram of oriented gradients (HoG) classifier setup (using an SVM trained on histograms (as features))
The above options do rely on hand crafted features/making assumptions about the size of artefacts.
ML could be an interesting route too, hopefully it can generalise well enough.
Depending how many example images you have available, you could test a basic prototype using Teachable Machine (which behind the scenes would apply KNN to a transfer learning layer on top of MobileNet (or similar net)) fairly fast.
(Note: I've posted OpenCV Python links, but there are other libraries that can help, e.g. scikit-image, scikit-learn, kornia, etc. in Python, cvv in C++, BoofCV in Java, etc., and there might be toolboxes for Matlab/Octave with similar features.)
I previously asked the question "How to extract numbers from an image" LINK, and I finally got that step working, but some test cases lead to awful output when I try to recognize the digits. Consider this image as an example:
This image is low contrast (from my point of view). I tried adjusting its contrast, and the results are still unacceptable. I also tried sharpening it and then applying gamma correction, but the results are still poor, so the extracted digits aren't recognized well by the classifier.
This is the image after (sharpening + gamma):
Number 4 after separation:
Could anybody tell me the best way to solve such a problem?
Sharpening is not always the best tool to approach a problem like this. Contrary to what the name implies, sharpening does not "recover" information to add detail and edges back into an image. Instead, sharpening is a class of operations that increase local contrast along edges.
Because your original image is highly degraded, this sharpening operation looks to be adding a lot of noise in, and generally not making anything better.
There is another class of algorithms called "deblurring" algorithms that attempt to actually reconstruct image detail through (much more complex) mathematical models. Some versions of this are blind deconvolution, regularized deconvolution, and Wiener deconvolution.
However, it is important to note that all of these methods are approximations: once image content is lost through an operation such as blurring, it can (almost) never be fully recovered. These methods are also considerably harder to implement and tune than sharpening.
The best way to handle these situations is to make sure that they never happen. Ensure good focus during image capture, use a system with a resolution well suited to your task, and control the lighting environment. However, when these measures do not or cannot work, image reconstruction techniques are needed.
Your image is blurred, and I suggest you try Wiener deconvolution. You can assume that the point spread function is a Gaussian and observe how the deconvolution behaves. Since you do not know the blur kernel in advance, blind deconvolution is an alternative.
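To show what Wiener deconvolution does, here is a small 1-D sketch in plain Python. It divides by the blur in the frequency domain, with a constant `k` that keeps the division stable where the blur kernel has almost no energy. It assumes circular blur with a known point spread function and uses a naive DFT, so it is only meant for tiny signals. The function names and the constant-`k` simplification are my assumptions; for real images you would use an FFT-based implementation from an image library.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (fine for short signals)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(values[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t in range(n))
        out.append(total / n if inverse else total)
    return out


def circular_blur(signal, psf):
    """Circular convolution of `signal` with `psf` (both of the same length)."""
    n = len(signal)
    return [sum(signal[(i - j) % n] * psf[j] for j in range(n))
            for i in range(n)]


def wiener_deconvolve(blurred, psf, k=1e-3):
    """Estimate the sharp signal: F = conj(H) * G / (|H|^2 + k).

    `k` stands in for the noise-to-signal power ratio; larger values
    suppress noise amplification at the cost of a softer result.
    """
    G = dft(blurred)
    H = dft(psf)
    F = [h.conjugate() * g / (abs(h) ** 2 + k) for g, h in zip(G, H)]
    return [v.real for v in dft(F, inverse=True)]
```

With a noise-free signal and a tiny `k` this recovers the input almost exactly. On a real photo the point spread function is only a guess and noise is present, so expect ringing and experiment with both the assumed Gaussian width and `k`.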
I have a problem that is very similar to this one, but much simpler.
To begin with I have a small image:
Then I take a screenshot and I want to detect if my small house is in the screenshot.
The problem is that my house can be different in size and slightly different in color.
So far I've found the OpenCV library, but it seems quite oversized for my needs.
Do you know any simpler library to achieve this task?
Tx
Edit: I've found this about the SURF algorithm
Judging by your question, there will be no shear or skew in your image because it will be on screen, whereas the problem you referenced is a much more difficult situation. Your image will not experience any distortion, only an increase or decrease in size.
To match regardless of color, I recommend computing the gradient image (using Sobel kernels) for both your template image and your screenshot. That way you are matching on visible edges and color is taken out of the mix.
To match regardless of size, create multiple versions of your template (the more versions you make the more precise, but the longer the processing) and slide your template across the image until you find an acceptable match.
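A minimal sketch of the two steps above (edge images plus a multi-scale sliding window) in plain Python, without any library. Images are assumed to be grayscale lists of rows; the scoring (mean absolute difference of Sobel magnitudes), the nearest-neighbour resize and all function names are my own choices for the example, and this brute-force search is far too slow for a full screenshot without downscaling first.

```python
def sobel_magnitude(img):
    """|gx| + |gy| from 3x3 Sobel kernels, for interior pixels only."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            row.append(abs(gx) + abs(gy))
        out.append(row)
    return out


def resize_nearest(img, scale):
    """Nearest-neighbour resize by a scale factor."""
    h, w = len(img), len(img[0])
    new_h, new_w = round(h * scale), round(w * scale)
    return [[img[min(h - 1, int(y / scale))][min(w - 1, int(x / scale))]
             for x in range(new_w)]
            for y in range(new_h)]


def find_template(screen, template, scales):
    """Return (score, scale, x, y) of the best match; lower score is better.

    (x, y) is the top-left corner of the matched window in screen pixels.
    """
    screen_edges = sobel_magnitude(screen)
    best = None
    for scale in scales:
        t_edges = sobel_magnitude(resize_nearest(template, scale))
        th, tw = len(t_edges), len(t_edges[0])
        for y in range(len(screen_edges) - th + 1):
            for x in range(len(screen_edges[0]) - tw + 1):
                diff = sum(abs(screen_edges[y + j][x + i] - t_edges[j][i])
                           for j in range(th) for i in range(tw))
                score = diff / (th * tw)  # normalise so scales are comparable
                if best is None or score < best[0]:
                    best = (score, scale, x, y)
    return best
```

You would accept the result only if the score is below some tolerance that you pick by experiment, since the best match always exists even when the house is not on screen.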
OpenCV is a beast that has a steep learning curve. If my assumptions are correct, then you are correctly stating that OpenCV is oversized when simple image processing techniques can be applied :).
I'm thinking of starting a project for school where I'll use genetic algorithms to optimize digital sharpening of images. I've been playing around with unsharp masking (USM) techniques in Photoshop. Basically, I want to create software that optimizes the parameters (i.e. blur radius, types of blur, blending the image) to create the "best-fit" set of filters.
I'm sort of quickly planning this project before starting it, and I can't think of a good fitness function for the 'selection' part. How would I determine the 'quality' of the filter sets, or measure how sharp the image is?
Also, I will be programming in Python (with the Python Imaging Library) since it's the only language I'm proficient with. Should I learn a low-level language instead?
Any advice/tips on anything is greatly appreciated. Thanks in advance!
tl;dr How do I measure how 'sharp' an image is?
If it's for tuning parameters, you could take a known image and apply a known blurring/low-pass filter, then sharpen the result with your GA+USM algorithm. Calculate your fitness function using the original image, e.g. something as simple as the mean absolute error. You may need to create different datasets, e.g. landscape images (mostly sharp, in focus, with a large depth of field), portrait images (which could have large areas deliberately out of focus and "soft"), along with low-noise and noisy images. Sharpening noisy images is actually quite a challenge.
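As a sketch of that fitness idea: degrade a known sharp image, run a candidate sharpener on it, and score the result by how close it gets to the original. The function names below are hypothetical, images are plain lists of rows, and `degrade` and `sharpen` are whatever callables you plug in (your blur and a USM configured from one GA individual).

```python
def mean_absolute_error(image_a, image_b):
    """Mean absolute pixel difference between two equally sized images."""
    total = 0
    count = 0
    for row_a, row_b in zip(image_a, image_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


def fitness(original, degrade, sharpen):
    """Higher is better: negated error after degrading and then sharpening.

    original: known sharp reference image (list of rows).
    degrade:  callable applying the known blur to an image.
    sharpen:  callable applying the candidate filter set being evaluated.
    """
    restored = sharpen(degrade(original))
    return -mean_absolute_error(original, restored)
```

A perfect restoration scores 0 and anything worse is negative, so the GA can simply maximise this value, averaged over each of your datasets.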
It would definitely be worth taking a look at Bruce Fraser's work on sharpening techniques for Photoshop etc.
It might also be worth checking out Imatest (www.imatest.com) to see if there is anything there regarding sharpness/resolution. You might also consider resolution charts.
Finally, I seriously doubt that one ideal set of USM parameters exists; the optimum parameters will be image dependent and, indeed, a matter of personal preference (that's why I suggest starting from a known sharp image and blurring it). Understanding the type of image is probably just as important, and is in itself a very interesting and challenging problem, although perhaps basic heuristics like image variance and an edge histogram would reveal suitable clues.
Anyway, just a thought; hopefully some of the above is useful.