Effects of 2-D Preprocessing on Feature Extraction - image-processing

Which of these papers is true?
" Effects of 2-D Preprocessing on Feature Extraction "
We show that higher frequencies
do not, for the purposes of feature extraction, necessarily represent
human-salient features and that the combination of contrast
enhancement, decimation, and lowpass filtering achieves more
robust feature extraction than simple high-frequency boosting.
Our ideal feature extractor therefore incorporates a decimator for
reduction to an idealized size, contrast enhancement through
stretched dynamic range, and frequency-domain filtering with a
Gaussian lowpass filter."
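Read literally, the first abstract's ideal extractor is just a three-stage preprocessing chain. A rough Python sketch of that chain, assuming scikit-image and SciPy and a grayscale input, is shown below; the target size, output range, and filter width are illustrative guesses, not values from the paper.

```python
from skimage import img_as_float
from skimage.transform import resize
from skimage.exposure import rescale_intensity
from scipy.ndimage import gaussian_filter

def preprocess(image, target_size=(256, 256), sigma=2.0):
    """Decimate, stretch the dynamic range, then Gaussian lowpass filter."""
    img = img_as_float(image)
    # Decimation to an "idealized size" (anti-aliased resize as a stand-in)
    img = resize(img, target_size, anti_aliasing=True)
    # Contrast enhancement by stretching the dynamic range to [0, 1]
    img = rescale_intensity(img, out_range=(0.0, 1.0))
    # Lowpass filtering; a spatial Gaussian is equivalent to the paper's
    # frequency-domain Gaussian lowpass filter
    return gaussian_filter(img, sigma=sigma)
```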
" Fast SIFT algorithm based on Sobel edge detector "
This paper proposes a fast SIFT
algorithm based on Sobel edge detector. Sobel edge detector is
applied to generate an edge group scale space and SIFT detector
detects the extreme point under the constraint of the edge group
scale space. The experimental results show that the proposed
algorithm decreases the redundancy of keypoints and speeds up
the implementation while the matching rate between different
images remains at a high level. As the threshold of the Sobel
detector increases, the number of keypoints decreases and the
matching rate increases.

Related

Semantic segmentation: how to evaluate the influence of noise on the effectiveness and robustness of medical image segmentation?

I have done some pre-processing, including N4 bias correction, noise removal and scaling, on medical 3D MRIs, and I was asked one question:
How to evaluate the influence of noise on the effectiveness and robustness of the medical image segmentation? When the image structure is corrupted with various kinds of noise, the extracted features deteriorate. This effect should be exploited to assess the method's effectiveness at different noise intensities.
How can the effect of noise be evaluated, and how can the noise removal method used be justified in a scientific manuscript?
I don't know if this helps, but I once did something similar in a classroom exercise with nuclear magnetic resonance.
In that case we used the Shepp-Logan phantom with the FFT, then added noise to the picture (by adding random numbers with a Gaussian distribution).
When you transform the image back to the phantom you can see the effects of the noise and sometimes artifacts (mostly due to the FFT algorithm and the window function chosen).
What I did was check the mean intensity of the image before and after; then, on the edges of the phantom (the skull), you can see how clear the transition from white to black (and vice versa) remains.
This can easily be tested with MATLAB code and the phantom. Once you reach the accuracy you need, you can apply the chosen algorithm to real images.
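For reference, a rough Python translation of that phantom experiment (the original used MATLAB), assuming scikit-image and NumPy; the noise level and the metrics reported are illustrative choices.

```python
import numpy as np
from skimage.data import shepp_logan_phantom
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Shepp-Logan phantom (float image in [0, 1]) plus simulated Gaussian noise
phantom = shepp_logan_phantom()
rng = np.random.default_rng(0)
noisy = phantom + rng.normal(0.0, 0.05, phantom.shape)   # illustrative sigma

# FFT round trip: a plain fft2/ifft2 is exactly invertible, so this is where
# any k-space windowing or reconstruction step (and its artifacts) would go
recon = np.real(np.fft.ifft2(np.fft.fft2(noisy)))

# Global statistics before and after the noise was added
print("mean intensity (clean): %.4f" % phantom.mean())
print("mean intensity (noisy): %.4f" % recon.mean())
print("PSNR: %.2f dB" % peak_signal_noise_ratio(phantom, recon, data_range=1.0))
print("SSIM: %.3f" % structural_similarity(phantom, recon, data_range=1.0))

# Edge sharpness at the "skull": gradient of a profile across the rim
profile_clean = phantom[phantom.shape[0] // 2, :]
profile_noisy = recon[recon.shape[0] // 2, :]
print("max edge gradient (clean): %.3f" % np.abs(np.diff(profile_clean)).max())
print("max edge gradient (noisy): %.3f" % np.abs(np.diff(profile_noisy)).max())
```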

Normalizing lighting conditions for image recognition

I am using OpenCV to process a video to identify frames containing objects using a rudimentary algorithm (background subtraction → brightness threshold → dilation → contour detection).
The video is shot over the course of a day, so the lighting conditions change gradually; I expect my results would improve if I applied some sort of brightness and/or contrast normalization step first.
Answers to this question suggest using convertScaleAbs, or contrast optimization with histogram clipping, or histogram equalization.
Of these techniques or others, what would be the best preprocessing step to normalize the frames?
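One way to compare those suggestions is to normalize the luminance channel of each frame before the rest of the pipeline. A minimal OpenCV sketch is below; the parameter values and the input filename are placeholders, and which option works best will depend on the footage.

```python
import cv2

def normalize_frame(frame_bgr):
    """Per-frame lighting normalization on the luminance channel."""
    ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb)
    y, cr, cb = cv2.split(ycrcb)

    # Option 1: global linear rescaling (what convertScaleAbs does)
    # y = cv2.convertScaleAbs(y, alpha=1.2, beta=10)

    # Option 2: global histogram equalization
    # y = cv2.equalizeHist(y)

    # Option 3: CLAHE -- adapts locally and limits over-amplification, which
    # often copes better with gradual lighting drift
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    y = clahe.apply(y)

    return cv2.cvtColor(cv2.merge([y, cr, cb]), cv2.COLOR_YCrCb2BGR)

cap = cv2.VideoCapture("day_long_video.mp4")   # placeholder input
while True:
    ok, frame = cap.read()
    if not ok:
        break
    frame = normalize_frame(frame)
    # ... background subtraction -> threshold -> dilation -> contours ...
cap.release()
```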

Include the spatial context of pixels during image clustering

How can the spatial context (or neighbourhood) of a pixel be taken into account (besides the pixel intensity) when clustering an image?
For the time being, I'm using K-means, GMM and Fuzzy C-means, which cluster the image based only on the distribution of the pixel intensities. But I need to include information on the spatial context of each pixel in the clustering, to avoid the misclassification caused by speckle noise.
The standard approach for segmentation is to add the X and Y coordinates with appropriate scaling to the color values (in RGB or Lab space).
Examples of these are SLIC (K-means clustering in x-y-Lab space) and Quickshift (an accelerated mean shift in x-y-Lab space).
When also considering spatial distances, it is often possible to gain a lot of speed. Check out the implementations in scikit-image, this blog, or my blog.
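A minimal sketch of that x-y-Lab idea, assuming scikit-image and scikit-learn: append scaled pixel coordinates to the Lab values and run K-means, or let SLIC do essentially the same clustering much faster. The spatial weight and cluster counts are illustrative.

```python
import numpy as np
from skimage import data, color
from skimage.segmentation import slic
from sklearn.cluster import KMeans

img = data.astronaut()[::4, ::4]          # small example RGB image
lab = color.rgb2lab(img)
h, w = lab.shape[:2]

# Manual x-y-Lab K-means: append scaled coordinates to the colour features
yy, xx = np.mgrid[0:h, 0:w]
spatial_weight = 0.5                      # assumption: tune for your data
features = np.column_stack([
    lab.reshape(-1, 3),
    spatial_weight * xx.ravel(),
    spatial_weight * yy.ravel(),
])
labels_kmeans = KMeans(n_clusters=5, n_init=10).fit_predict(features).reshape(h, w)

# SLIC performs K-means in x-y-Lab space directly, and much faster
labels_slic = slic(img, n_segments=200, compactness=10.0)
```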

Stein's Unbiased Risk Estimate (SURE) to denoise signals

I wish to use Stein's Unbiased Risk Estimate (SURE) for denoising signals.
I have a 1-dimensional signal. I am using wavelets to decompose the signal into multiple levels of approximation and detail coefficients.
For denoising the original signal, do I need to apply thresholding to every level of detail coefficients, or will doing it on the last level of detail coefficients do the job?
Thresholding is usually applied to all the frequencies of a signal because the procedure exploits the fact that the wavelet transform maps white noise (purely random, uncorrelated, constant power spectral density) in the signal domain to white noise in the transform domain, so it remains spread across the different frequencies. Thus, while signal energy becomes concentrated into fewer coefficients in the transform domain, noise energy does not. Other noises with different spectral properties will map differently, and this is where the selection of the type of thresholding procedure becomes important.
Thresholding only the highest decomposition level (lowest frequencies) while leaving the lower levels (higher frequencies) untouched sounds a little strange if you want to reconstruct the signal.
However, you could also extract a level and denoise only its related range of frequencies (e.g. from level 1 to level 2) if there is a particular frequency range you are interested in.
Regarding the thresholding function, be aware in any case that SURE gives different results depending on the kind of noise in the signal. For example, it will reduce the spread of white noise in the horizontal components but will only decrease large amplitudes. For signals where, together with white noise, you have other noise colours like random walk and flicker noise, SURE is not an effective procedure.
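A minimal PyWavelets sketch of thresholding the detail coefficients at every decomposition level, as suggested above. PyWavelets does not ship a SURE threshold, so the universal threshold is used here purely as a stand-in; the wavelet, number of levels, and threshold rule are illustrative.

```python
import numpy as np
import pywt

def wavelet_denoise(signal, wavelet="db4", level=4):
    """Soft-threshold the detail coefficients at every level."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    approx, details = coeffs[0], coeffs[1:]

    # Noise estimate from the finest detail level (median absolute deviation)
    sigma = np.median(np.abs(details[-1])) / 0.6745
    # Universal threshold as a stand-in for a SURE-derived threshold
    thresh = sigma * np.sqrt(2.0 * np.log(len(signal)))

    details = [pywt.threshold(d, thresh, mode="soft") for d in details]
    return pywt.waverec([approx] + details, wavelet)[: len(signal)]

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 1024)
clean = np.sin(2 * np.pi * 5 * t)
noisy = clean + 0.2 * rng.standard_normal(t.size)
denoised = wavelet_denoise(noisy)
```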

Confusion regarding Object recognition and features using SURF

I have some conceptual issues in understanding the SURF and SIFT algorithms (All about SURF). As far as I understand, SURF works with an approximation of the Laplacian of Gaussian while SIFT operates on the difference of Gaussians. It then constructs a 64-dimensional vector around each keypoint to extract the features. I have applied this CODE.
(Q1 ) So, what forms the features?
(Q2) We initialize the algorithm using SurfFeatureDetector detector(500). So, does this mean that the size of the feature space is 500?
(Q3) The output of SURF Good_Matches gives matches between Keypoint1 and Keypoint2, and by tuning the number of matches we can conclude whether the object has been found/detected or not. What is meant by KeyPoints? Do these store the features?
(Q4) I need to build an object recognition application. In the code, it appears that the algorithm can recognize the book, so it can be applied to object recognition. I was under the impression that SURF can be used to differentiate objects based on color and shape. But SURF and SIFT detect corners/edges, so there is no point in using color images as training samples since they will be converted to grayscale. There is no option of using colors or HSV in these algorithms, unless I compute the keypoints for each channel separately, which is a different area of research (Evaluating Color Descriptors for Object and Scene Recognition).
So, how can I detect and recognize objects based on their color and shape? I think I can use SURF for differentiating objects based on their shape. Say, for instance, I have two books and a bottle, and I need to recognize only a single book among all the objects. But as soon as there are other similarly shaped objects in the scene, SURF gives lots of false positives. I would appreciate suggestions on which methods to apply for my application.
The local maxima (responses of the DoG that are greater (or smaller) than the responses of the neighbouring pixels around the point and in the images above and below in the pyramid -- a 3x3x3 neighbourhood) form the coordinates of the feature (circle) centre. The radius of the circle corresponds to the level of the pyramid.
It is the Hessian threshold. It means that you take only the maxima (see 1) with values larger than the threshold. A larger threshold leads to fewer features, but the features are more stable, and vice versa.
Keypoint == feature. In OpenCV, KeyPoint is the structure used to store features.
No, SURF is good for comparing textured objects but not for shape or color. For shape I recommend using MSER (but not the OpenCV one) or the Canny edge detector, not local features. This presentation might be useful.
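To make the Hessian-threshold and keypoint/descriptor distinction concrete, here is a minimal sketch using OpenCV's SURF (which lives in opencv-contrib and is unavailable in some builds); the image paths and the 0.75 ratio test are placeholders.

```python
import cv2

# SURF is in the contrib module and may be disabled in some OpenCV builds;
# hessianThreshold=500 matches the SurfFeatureDetector detector(500) above
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=500)

img1 = cv2.imread("object.png", cv2.IMREAD_GRAYSCALE)   # placeholder paths
img2 = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)

# Keypoints = detected interest points (location, scale, orientation);
# descriptors = the 64-dimensional vectors actually used for matching
kp1, des1 = surf.detectAndCompute(img1, None)
kp2, des2 = surf.detectAndCompute(img2, None)

# Match descriptors and keep only matches clearly better than the runner-up
bf = cv2.BFMatcher(cv2.NORM_L2)
matches = bf.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print("keypoints: %d / %d, good matches: %d" % (len(kp1), len(kp2), len(good)))
```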
