I have a thermal image of a person standing and carrying either a cold tool or a hot tool, and I want to locate the tool. So basically I am trying to build an image-processing filter that gives me the region where a drastic change in gray-level intensity occurs against a relatively smooth background. I have tried the Canny edge detector, but it gives a lot of noise.
Hot object to be detected: https://imgur.com/0ZyK6WP
Cold object to be detected: https://imgur.com/YYT9rHW
You could increase the Gaussian smoothing kernel to filter out the noise, but that may also blur away the edges. In that case you want a filter that smooths the image while preserving edges, and something like a bilateral filter could help. It replaces the intensity of each pixel with a weighted average of the intensities of nearby pixels, where the weights depend on both spatial distance and intensity difference, so pixels across a strong edge contribute little.
Also, have you tried different values for Canny's two hysteresis thresholds? Raising them might help when dealing with false positives.
I am using a threshold in OpenCV to find contours. My input is a hand image. Sometimes the threshold result is poor, so I can't find the contours.
I have applied the following preprocessing steps:
1. GrabCut
cv::grabCut(image, result, rectangle, bgModel, fgModel, 3, cv::GC_INIT_WITH_RECT);
2. Grayscale conversion
cvtColor(handMat, handMat, CV_BGR2GRAY);
3. Median blur
medianBlur(handMat, handMat, MEDIAN_BLUR_K);
I used the code below to apply the threshold:
threshold( handMat, handMat, 141, 255, THRESH_BINARY|CV_THRESH_OTSU );
Sometimes I get good output and sometimes the threshold output is not good. I have attached the two output images.
Is there any other way than threshold from which contours can be found?
Good threshold Output:
Bad threshold Output
Have you tried an adaptive threshold? A single threshold value rarely works in real-life applications. Another truism: thresholding is a non-linear operation and hence unstable. The gradient, on the other hand, is linear, so you may want to find a contour by tracking the gradient if your background is smooth and of solid color. The gradient is also more reliable than thresholding under illumination changes or shadows.
Grab cut, by the way, uses color information to improve the segmentation along the boundary once you have already found 90% or so of the segment, so it is a post-processing step. Also, initializing grab cut with a rectangle lets in a lot of contamination from background colors. Instead of a rectangle, use a mask: mark as GC_FGD the area deep inside your initial segment where you are sure the hand is; mark as GC_BGD the area far outside your segment where you are sure the background is; and mark as GC_PR_FGD (probably foreground) everything else, which is what grab cut will refine. To sum up, your grab cut initialization will look like a Russian doll with three layers indicating foreground (gray), probably foreground (white) and background (black). You can use dilate and erode to create these layers.
Overall my suggestion is to define what you want to do first. Are you looking for contours of arbitrary objects on arbitrary moving background? If you are looking for a contour of a hand to find fingers on relatively uniform background I would:
1. Use connected components or MSER to segment out the hand. Possibly improve the result with grab cut initialized with a conservative mask, not a rectangle!
2. Use convexity defects to find the fingers, if this is your goal.
One option is to try to find contours without binarizing the image.
If your input is in color, you can try changing the color space to enhance the difference between the hand and the background.
Otsu tries to find an optimal threshold. You can also set the threshold manually, but Otsu is useful because the threshold adapts automatically if the illumination changes.
There are also many other binarization methods (Sauvola, Bradley, Niblack, Kasar...), but Otsu is simple and works well. I suggest adding preprocessing or postprocessing if you want to improve the binarization result.
Is there a robust way to detect the water line, like the edge of a river in this image, in OpenCV?
(source: pequannockriver.org)
This task is challenging because a combination of techniques must be used. Furthermore, for each technique, the numerical parameters may only work correctly for a very narrow range. This means either a human expert must tune them by trial-and-error for each image, or that the technique must be executed many times with many different parameters, in order for the correct result to be selected.
The following outline is highly-specific to this sample image. It might not work with any other images.
One bit of advice: As usual, any multi-step image analysis should always begin with the most reliable step, and then proceed down to the less reliable steps. Whenever possible, the less reliable step should make use of the result of more-reliable steps to augment its own accuracy.
Detection of sky
Convert image to HSV colorspace, and find the cyan located at the upper-half of the image.
Keep this HSV image, because it could be handy for the next few steps as well.
Detection of shrubs
Run Canny edge detection on the grayscale version of image, with suitably chosen sigma and thresholds. This will pick up the branches on the shrubs, which would look like a bunch of noise. Meanwhile, the water surface would be relatively smooth.
Grayscale is used in this technique in order to reduce the influence of reflections on the water surface (the green and yellow reflections from the shrubs). There might be other colorspaces (or preprocessing techniques) more capable of removing that reflection.
Detection of water ripples from a lower elevation angle viewpoint
Firstly, mark off any image parts that are already classified as shrubs or sky. Since shrub detection would be more reliable than water detection, shrub detection's result should be used to inform the less-reliable water detection.
Observation
Because of the low elevation angle viewpoint, the water ripples appear horizontally elongated. In fact, every image feature appears stretched horizontally. This is called Anisotropy. We could make use of this tendency to detect them.
Note: I am not experienced in anisotropy detection. Perhaps you can get better ideas from other people.
Idea 1:
Use maximally-stable extremal regions (MSER) as a blob detector.
The Wikipedia introduction appears intimidating, but it is really related to connected-component algorithms. A naive implementation can be done similar to Dijkstra's algorithm.
Idea 2:
Since the image features are horizontally stretched, a simpler approach is to sum the absolute values of the horizontal gradients and compare that to the sum of the absolute values of the vertical gradients.
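Idea 2 can be sketched with plain NumPy. Here "horizontal gradient" is taken to mean the difference between horizontally adjacent pixels; the helper name `gradient_anisotropy` and the synthetic textures are assumptions. Horizontally stretched features change slowly along x and quickly along y, so the ratio of vertical to horizontal gradient sums rises well above 1:

```python
import numpy as np

def gradient_anisotropy(gray):
    """Ratio of summed |vertical gradient| to summed |horizontal gradient|."""
    g = gray.astype(np.float64)
    dx = np.abs(np.diff(g, axis=1)).sum()   # change between horizontal neighbours
    dy = np.abs(np.diff(g, axis=0)).sum()   # change between vertical neighbours
    return dy / (dx + 1e-9)

rng = np.random.default_rng(0)
isotropic = rng.random((64, 64))                          # no preferred direction
stretched = np.repeat(rng.random((64, 8)), 8, axis=1)     # each value smeared over 8 columns

ratio_iso = gradient_anisotropy(isotropic)
ratio_stretched = gradient_anisotropy(stretched)
print("isotropic texture:", ratio_iso)
print("horizontally stretched texture:", ratio_stretched)
```

In practice you would evaluate this ratio in local windows, restricted to the pixels not already classified as sky or shrubs, and treat windows with a high ratio as water candidates.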
I had asked this on Photo Stack Exchange but thought it might be relevant here as well, since I want to implement this programmatically.
I am trying to implement a blur detection algorithm for my imaging pipeline. The kinds of blur I want to detect are:
1) Camera shake: pictures captured hand-held, where the hand moves or shakes while the shutter speed is slow.
2) Lens focusing errors (depth-of-field issues), like focusing on the wrong object, which causes some blur.
3) Motion blur: fast-moving objects in the scene captured with a shutter speed that is not fast enough. E.g. a moving car at night might show a trail of its headlights/tail lights in the image as a blur.
How can one detect this blur and quantify it in some way to make some decision based on that computed 'blur metric'?
What is the theory behind blur detection?
I am looking for good reading material that I can use to implement an algorithm for this in C/Matlab.
thank you.
-AD.
Motion blur and camera shake are kind of the same thing when you think about the cause: relative motion of the camera and the object. You mention slow shutter speed -- it is a culprit in both cases.
Focus misses are subjective, as they depend on the intent of the photographer. Without knowing what the photographer wanted to focus on, it's impossible to detect them. And even if you do know what the intended focus was, it still wouldn't be trivial.
With that dose of realism aside, let me reassure you that blur detection is actually a very active research field, and there are already a few metrics that you can try out on your images. Here are some that I've used recently:
Edge width. Basically, perform edge detection on your image (using Canny or otherwise) and then measure the width of the edges. Blurry images will have wider edges that are more spread out. Sharper images will have thinner edges. Google for "A no-reference perceptual blur metric" by Marziliano -- it's a famous paper that describes this approach well enough for a full implementation. If you're dealing with motion blur, then the edges will be blurred (wide) in the direction of the motion.
Presence of fine detail. Have a look at my answer to this question (the edited part).
Frequency domain approaches. Taking the histogram of the DCT coefficients of the image (assuming you're working with JPEG) would give you an idea of how much fine detail the image has. This is how you grab the DCT coefficients from a JPEG file directly. If the count for the non-DC terms is low, it is likely that the image is blurry. This is the simplest way -- there are more sophisticated approaches in the frequency domain.
There are more, but I feel that that should be enough to get you started. If you require further info on either of those points, fire up Google Scholar and look around. In particular, check out the references of Marziliano's paper to get an idea about what has been tried in the past.
There is a great paper called "Analysis of focus measure operators for shape-from-focus" (https://www.researchgate.net/publication/234073157_Analysis_of_focus_measure_operators_in_shape-from-focus), which compares about 30 different techniques.
Out of all the different techniques, the Laplacian-based methods seem to have the best performance. Most image processing tools, such as MATLAB or OpenCV, already implement this method. Here is an example using OpenCV: http://www.pyimagesearch.com/2015/09/07/blur-detection-with-opencv/
One important point to note here is that an image can have some blurry areas and some sharp areas. For example, in portrait photography the subject in the foreground is sharp whereas the background is blurry. In sports photography, the object in focus is sharp and the background usually has motion blur. One way to detect such spatially varying blur in an image is to run a frequency domain analysis at every location in the image. One of the papers which addresses this topic is "Spatially-Varying Blur Detection Based on Multiscale Fused and Sorted Transform Coefficients of Gradient Magnitudes" (CVPR 2017).
The authors look at multi-resolution DCT coefficients at every pixel. These DCT coefficients are divided into low, medium, and high frequency bands, out of which only the high frequency coefficients are selected.
The DCT coefficients are then fused together and sorted to form the multiscale-fused and sorted high-frequency transform coefficients.
A subset of these coefficients is selected. The number of selected coefficients is a tunable parameter which is application specific.
The selected subset of coefficients is then sent through a max pooling block to retain the highest activation within all the scales. This gives the blur map as the output, which is then sent through a post-processing step to refine the map.
This blur map can be used to quantify the sharpness in various regions of the image. To get a single global metric that quantifies the blurriness of the entire image, the mean of this blur map or the histogram of this blur map can be used.
Here are some example results showing how the algorithm performs:
The sharp regions in the image have a high intensity in the blur_map, whereas blurry regions have a low intensity.
The github link to the project is: https://github.com/Utkarsh-Deshmukh/Spatially-Varying-Blur-Detection-python
The python implementation of this algorithm can be found on pypi which can easily be installed as shown below:
pip install blur_detector
A sample code snippet to generate the blur map is as follows:
import blur_detector
import cv2

if __name__ == '__main__':
    # Replace 'image_name' with the path to your image; flag 0 loads it as grayscale.
    img = cv2.imread('image_name', 0)
    blur_map = blur_detector.detectBlur(img, downsampling_factor=4, num_scales=4, scale_start=2, num_iterations_RF_filter=3)
    cv2.imshow('ori_img', img)
    cv2.imshow('blur_map', blur_map)
    cv2.waitKey(0)
For detecting blurry images, you can tweak the approach and add "Region of Interest estimation".
In this GitHub repository: https://github.com/Utkarsh-Deshmukh/Blurry-Image-Detector , I have used local entropy filters to estimate a region of interest. In this ROI, I then use DCT coefficients as feature extractors and train a simple multi-layer perceptron. On testing this approach on 20000 images in the "BSD-B" dataset (http://cg.postech.ac.kr/research/realblur/) I got an average accuracy of 94%.
Just to add on the focusing errors: these may be detected by comparing the PSF of the captured blurry images (wider) with that of reference ones (sharper). Deconvolution techniques may help correct them, but they leave artifacts (shadows, rippling, ...). A light field camera can help with refocusing to any depth plane, since it captures the angular information of the scene in addition to the traditional spatial information.
How can I get rid of uneven illumination in images that contain text, usually printed but possibly handwritten? The images can have some bright spots because light was reflected while the picture was taken.
I've seen that the segment_characters function in Halcon does this job perfectly, but it is not open source.
I wish to convert an image into one that has constant background illumination and darker text regions, so that binarization will be easy and noise-free.
The text is assumed to be darker than its background.
Any ideas?
Strictly speaking, assuming you have access to the image's pixels (how to do this in your programming language is widely documented online), the exercise involves going over the pixels once to determine a "darkness threshold". To do this, convert each pixel from RGB to HSL to get its lightness component. During this pass you calculate the average lightness of the whole image, which you can use as your "darkness threshold".
Once you have the average lightness, go over the image pixels once more: if a pixel's lightness is below the darkness threshold, set its color to full black RGB(0,0,0); otherwise, set it to full white RGB(255,255,255). This gives you a binary image in which the text should be black and the rest white.
Of course, the key is in finding the appropriate darkness threshold, so if the average method doesn't give you good results you may have to come up with a different method to augment that step. Such a method could involve separating the image into its primary channels (red, green, blue), computing the darkness threshold for each channel separately, and then using the most aggressive threshold of the three.
And lastly, a better approach may be to compute the distribution of lightness levels, as opposed to simply the average, and then keep the range around the maximum of that distribution. Again, go over each pixel and if its lightness fits the band make it black, otherwise make it white.
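The basic two-pass mean-lightness procedure can be sketched in a few lines of NumPy. It uses the HSL definition of lightness, (max + min) / 2 over the three channels, and maps dark pixels to black so that the text ends up black on white. The helper name `binarize_by_mean_lightness` and the synthetic page are assumptions:

```python
import numpy as np

def binarize_by_mean_lightness(rgb):
    """Binarize an RGB image using its mean HSL lightness as the darkness threshold."""
    rgb = rgb.astype(np.float64)
    # HSL lightness on a 0..255 scale: average of the largest and smallest channel.
    lightness = (rgb.max(axis=2) + rgb.min(axis=2)) / 2.0
    threshold = lightness.mean()                      # pass 1: darkness threshold
    binary = np.where(lightness < threshold, 0, 255)  # pass 2: dark -> black, light -> white
    return binary.astype(np.uint8), threshold

# Synthetic page: light gray paper with a few dark strokes.
page = np.full((60, 100, 3), 230, np.uint8)
strokes = np.zeros((60, 100), bool)
strokes[20:40, 10:90:8] = True
page[strokes] = (40, 40, 40)

binary, threshold = binarize_by_mean_lightness(page)
print("darkness threshold:", threshold)
```

On evenly lit pages this works well; with strong illumination differences a single mean is not enough, which is what the per-channel and distribution-based variants try to address.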
EDIT
For further reading about HSL I recommend starting with the Wikipedia entry on HSL and HSV color spaces.
Have you tried using morphological techniques? Closing-by-reconstruction (as presented in Gonzalez, Woods and Eddins) can be used to create a grayscale representation of the background illumination levels. You can more or less standardize the effective illumination by:
1) Calculating the mean intensity of all the pixels in the image
2) Using closing-by-reconstruction to estimate the background illumination levels
3) Subtracting the output of (2) from the original image
4) Adding the mean intensity from (1) to every pixel in the output of (3).
Basically, what closing-by-reconstruction does is remove all image features that are smaller than a certain size, erasing the "foreground" (the text you want to capture) and leaving only the "background" (illumination levels) behind. Subtracting the result from the original image leaves behind only small-scale deviations (the text). Adding the original average intensity to those deviations simply makes the text readable, so that the resulting picture looks like a light-normalized version of the original image.
Use local thresholding instead of a global thresholding algorithm.
Divide your (grayscale) image into a grid of smaller images (say 50x50 px) and apply the thresholding algorithm to each one individually.
If the background features are generally larger than the letters, you can try to estimate and subsequently remove the background.
There are many ways to do that, a very simple one would be to run a median filter on your image. You want the filter window to be large enough that text inside the window rarely makes up more than a third of the pixels, but small enough that there are several windows that fit into the bright spots. This filter should result in an image without text, but with background only. Subtract that from the original, and you should have an image that can be segmented with a global threshold.
Note that if the bright spots are much smaller than the text, you do the inverse: choose the filter window such that it removes the light only.
The first thing you need to try is to change the lighting: use a dome light or some other light that will give you more diffuse and even illumination.
If that's not possible, you can try some of the ideas in this question or this one. You want to implement some type of "adaptive threshold", which applies a local threshold to individual parts of the image so that the change in contrast won't be as noticeable.
There is also a simple but effective method explained here. A simple outline of the algorithm is the following:
1) Split the image up into NxN regions or neighbourhoods
2) Calculate the mean or median pixel value for the neighbourhood
3) Threshold the region based on the value calculated in 2), or on the value from 2) minus C (where C is a chosen constant)
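The outline above can be sketched with plain NumPy, using the block mean minus a constant C as the per-region threshold. The helper name `block_threshold`, the 50 px block size, C = 20 and the synthetic page are assumptions:

```python
import numpy as np

def block_threshold(gray, n=50, c=20):
    """Threshold each n x n block at (block mean - c); dark pixels become 0."""
    out = np.zeros_like(gray, dtype=np.uint8)
    h, w = gray.shape
    for y in range(0, h, n):
        for x in range(0, w, n):
            block = gray[y:y + n, x:x + n]
            t = block.mean() - c
            out[y:y + n, x:x + n] = np.where(block > t, 255, 0)
    return out

# Synthetic page: brightness ramps from 120 to 200, with two dark strokes 3 px wide.
page = np.tile(np.linspace(120, 200, 200), (100, 1))
text = np.zeros((100, 200), bool)
text[20:80, 24:27] = True
text[20:80, 124:127] = True
page[text] -= 80
page = page.astype(np.uint8)

binary = block_threshold(page)
print("black pixels:", np.count_nonzero(binary == 0))
```

C has to exceed the brightness variation of the bare background within one block, otherwise the darker side of each block is misclassified as text.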
It seems like what you're trying to do is improve local contrast while attenuating larger scale lighting variations. I'll agree with other posters that optimizing the image through better lighting should always be the first move.
After that, here are two tricks.
1) Use the smooth_image() operator to convolve your original image with a Gaussian. Use a relatively large kernel, like 20-50px. Then subtract this blurred image from your original image. Apply scale and offset within the sub_image() operator, or use equ_histo() to equalize the histogram.
This basically subtracts the low spatial frequency information from the original, leaving the higher frequency information intact.
2) You could try the highpass_image() operator, or one of the Laplacian operators, to extract a gradient image.