Calculate harmonics for a set of data extracted from images - signal-processing

I am implementing a method known as the trace transform for image analysis. The algorithm extracts a large number of features of an image (in my case features pertaining to texture) by applying a set of transforms to the pixel values. Here is a paper describing the algorithm: http://pdf.aminer.org/000/067/295/texture_classification_with_thousands_of_features.pdf
The algorithm finally constructs a "triple feature", a single number that characterizes the image. It is computed by applying a first set of transforms to the pixel values extracted along tracing lines that cross the image at all angles. The description starts on page 2 of the paper; Figure 1 shows such a tracing line, which is defined by its angle (phi) and its distance (d) from the center of the image. So each tracing line can be described by the pair (phi, d).
On the next page of the paper (page 3), Table 1 lists the set of functionals applied to the pixel values extracted along the tracing lines. Applying each of these functionals to all the tracing lines generates another representation of the image. Then a second set of functionals (Table 2 on page 4) is applied to this new representation along its columns, i.e. along the parameter d, generating yet another representation of the image.
Finally, another set of functionals is applied to this last representation, which is an array of values over the parameter phi, the angle of the tracing lines (so we have 360 values describing the image, one for each angle). These functionals are listed in Table 3 on page 4 of the paper.
Some of this last set of functionals, applied over the parameter phi, need the harmonics of this set of 360 values. First I need to calculate the first harmonic, and then the amplitude and phase of the first through fourth harmonics, but I don't know how to do this.
I hope this explains it better. You can read the first couple of pages of the paper, since it is probably better explained there, and Figure 1 makes clear what I mean by tracing lines and the representation of each line by the (phi, d) pair.
Thanks
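In case it is useful to anyone with the same question: the amplitude and phase of the k-th harmonic of the 360 values can be read off a discrete Fourier transform of that array. A minimal NumPy sketch (the function and variable names are mine, not from the paper):

```python
import numpy as np

def harmonics(values, orders=(1, 2, 3, 4)):
    """Amplitude and phase of selected harmonics of a 1-D signal.

    `values` is the array of 360 numbers (one per tracing-line angle phi).
    The k-th harmonic is the DFT coefficient at k cycles per full revolution;
    the amplitude is scaled so that A*cos(k*phi + theta) yields (A, theta).
    """
    x = np.asarray(values, dtype=float)
    n = len(x)
    spectrum = np.fft.rfft(x)
    out = {}
    for k in orders:
        coeff = spectrum[k]
        amplitude = 2.0 * np.abs(coeff) / n
        phase = np.angle(coeff)          # radians, in (-pi, pi]
        out[k] = (amplitude, phase)
    return out

# Quick self-check with a signal whose harmonics are known:
phi = np.deg2rad(np.arange(360))
signal = 5.0 + 2.0 * np.cos(phi + 0.3) + 0.5 * np.cos(3 * phi - 1.0)
print(harmonics(signal))   # ~ {1: (2.0, 0.3), 2: (~0, ..), 3: (0.5, -1.0), 4: (~0, ..)}
```

The "first harmonic" by itself is then just the k = 1 entry (or its amplitude), depending on how the paper uses the term.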

Related

Generalized Hough transform for calculating the center of an object based on SIFT points?

Do you have any suggestions for this?
I am following the steps in a research paper to re-implement it for my specific problem. I am not a professional programmer, and I have been struggling for more than a month. When I reach the scoring step with the generalized Hough transform, I do not get any result; what I get is a blank image without the object center being found.
What I did includes the following steps:
I define a spatially constrained area in a training image and extract SIFT features within that area. The red point in the center represents the object center in the template (training) image.
And these are the interest points extracted by SIFT in the query image:
Keypoints are matched according to some conditions: 1) they should be quantized to the same visual word and be spatially consistent. So I get the following points after applying the matching conditions:
I have 15 and 14 points for the template and query images, respectively. I pass these points, along with the coordinates of the object center in the template image, to the generalized Hough transform (code I found on GitHub). The code works properly on its default images. However, with the few points my pipeline produces, I do not know what I am doing wrong.
I thought maybe the problem is in the theta calculation, so I changed this line to return the absolute values of the y and x differences, but it did not help.
In line 20 they only consider 90 degrees for binning. Could I ask what the reason is, and how I can define the binning according to my problem and the range of rotation angles around the object center?
- Does binning range affect the center calculation?
I would really appreciate it if you let me know what I am doing wrong here.
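Without the GitHub code it is hard to say where it goes wrong, but it may help to test the matched points against a bare-bones, translation-only voting step first. Everything below (names, bin size) is made up for illustration; if the object can also rotate or change scale between template and query, each offset additionally has to be rotated and scaled by the difference of the matched keypoints' SIFT orientations and scales, and the orientation binning then has to cover the full 0-360 degrees rather than 90:

```python
import numpy as np

def vote_for_center(template_kps, query_kps, template_center, img_shape, bin_size=4):
    """Minimal generalized-Hough-style voting with matched keypoint pairs.

    template_kps / query_kps: (N, 2) arrays of matched (x, y) keypoints,
    row i of one matched to row i of the other.
    template_center: (cx, cy) of the object in the template image.
    Returns the accumulator and the estimated centre in the query image.
    """
    template_kps = np.asarray(template_kps, float)
    query_kps = np.asarray(query_kps, float)
    cx, cy = template_center

    h, w = img_shape[:2]
    acc = np.zeros((h // bin_size + 1, w // bin_size + 1), dtype=np.int32)

    # Offset from each template keypoint to the template centre; if the
    # object is only translated between the two images, adding the same
    # offset to the matched query keypoint points at the query centre.
    offsets = np.array([cx, cy]) - template_kps
    votes = query_kps + offsets

    for vx, vy in votes:
        if 0 <= vx < w and 0 <= vy < h:
            acc[int(vy) // bin_size, int(vx) // bin_size] += 1

    peak = np.unravel_index(np.argmax(acc), acc.shape)
    est_center = ((peak[1] + 0.5) * bin_size, (peak[0] + 0.5) * bin_size)
    return acc, est_center
```

As for your bullet point: the binning range itself does not move the true centre, but bins that are too coarse, or an angular range that does not cover the actual rotation, can prevent the 14-15 votes from ever piling up in a single cell, which would give exactly the blank result you describe.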

Pass histogram equalized image to second pass hist eq?

Suppose that a digital image is subjected to histogram equalization. The problem is the following: show that a second pass of histogram equalization (on the histogram-equalized image) produces exactly the same result as the first pass.
Here is the solution: Problem 3.7
I am not able to understand the following part of the answer: "because every pixel (and no others) with value r_k is mapped to s_k, n_{s_k} = n_{r_k}".
This fact stems from the assumptions made on the transformation T representing the histogram equalization operation:
a. T is single-valued and monotonically increasing;
b. T(r) is in [0, 1] for every r in [0,1].
See the corresponding chapter in "Digital Image Processing" by Gonzalez and Woods. In particular, it can be verified that these two assumptions hold for the discrete histogram equalization (see the equations in Problem 3.7), at least if none of the histogram counts is 0, because then T is strictly monotonic, implying a one-to-one mapping; this in turn implies that every pixel with value r_k (and no other) is mapped to s_k. This is done in the solution of Problem 3.10 in the PDF you provided. If one or more of the counts are 0, this might no longer be the case. However, in the continuous domain (in which histogram equalization is usually derived), both assumptions hold, as shown in the corresponding chapter of the book by Gonzalez and Woods.
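As a quick numerical check of the statement (the helper `hist_equalize` and the test image are mine, using the usual discrete mapping s_k = round((L-1) * sum_{j<=k} n_j / N)):

```python
import numpy as np

def hist_equalize(img, levels=256):
    """Discrete histogram equalization: s_k = round((L-1) * fraction of pixels <= r_k)."""
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = np.cumsum(hist) / img.size
    mapping = np.round((levels - 1) * cdf).astype(img.dtype)
    return mapping[img]

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

once = hist_equalize(img)
twice = hist_equalize(once)
print(np.array_equal(once, twice))   # True: the second pass maps every present value s_k to itself
```

The same argument goes through for the floor-based variant of the formula, as long as both passes use the same rounding.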

Direction on image pattern description and representation

I have a basic question regarding pattern learning, or pattern representation. Assume I have a complex pattern of this form. Could you please point me to some research directions or concepts I can follow to learn how to represent (mathematically describe) these kinds of patterns? In general the pattern does not have a closed contour, nor can it be represented with analytical objects like boxes, circles, etc.
By "mathematically describe" I assume you mean deriving from the image a vector of values that represents its content. In computer vision/image processing we call this an "image descriptor".
There are several image descriptors that could be applied to pixel-based data of the form you showed, which appears to be one value per pixel, i.e. greyscale images.
One approach is to perform "spatial gridding", where you divide the image up into a regular grid of constant size, e.g. a 4x4 grid. You then average the pixel values within each cell of the grid and concatenate these averages to form a 16-element vector; this coarsely describes the pixel distribution of the image (see the sketch below).
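A minimal sketch of that descriptor (the function name and grid size are arbitrary):

```python
import numpy as np

def grid_descriptor(gray, grid=(4, 4)):
    """Average the pixel values in each cell of a regular grid and
    concatenate the cell means into one vector (here 4x4 -> 16 values)."""
    gray = np.asarray(gray, dtype=float)
    h, w = gray.shape
    gy, gx = grid
    # Cell boundaries; the last cell absorbs any remainder pixels.
    ys = np.linspace(0, h, gy + 1, dtype=int)
    xs = np.linspace(0, w, gx + 1, dtype=int)
    cells = [gray[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].mean()
             for i in range(gy) for j in range(gx)]
    return np.array(cells)
```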
Another approach would be to use "image moments", which are 2D statistical moments. The central moment of order (i, j) is

mu_{ij} = sum_{x=1}^{W} sum_{y=1}^{H} (x - mu_x)^i * (y - mu_y)^j * f(x, y)

where f(x,y) is the pixel value at coordinates (x,y), W and H are the image width and height, and mu_x and mu_y are the mean x and y coordinates. The values i and j select the order of moment you want to compute. Various orders of moment can be combined in different ways; for example, the seven "Hu moments" are computed from fixed combinations of (normalized central) image moments.
The cool thing about the Hu moments is that you can translate, scale or rotate the image and still get the same 7 values (flipping the image only changes the sign of the seventh), which makes this a robust, rotation- and scale-invariant image descriptor.
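If OpenCV is an option, the moments and the Hu invariants are already implemented; a short example (the file name is just a placeholder):

```python
import cv2
import numpy as np

gray = cv2.imread("pattern.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file name

m = cv2.moments(gray)            # raw, central and normalized central moments
hu = cv2.HuMoments(m).flatten()  # the 7 Hu moment invariants

# The invariants span many orders of magnitude, so a log transform
# is commonly applied before using them as a descriptor vector.
hu_log = -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)
print(hu_log)
```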
Hope this helps as a general direction for further reading.

Documentation of CvStereoBMState for disparity calculation with cv::StereoBM

The application of Konolige's block matching algorithm is not sufficiently explained in the OpenCV documentation. The parameters of CvStereoBMState influence the accuracy of the disparities calculated by cv::StereoBM. However, those parameters are not documented. I will list them below and describe what I understand. Maybe someone can add a description of the parameters which are unclear.
preFilterType: Determines which filter is applied to the image before the disparities are calculated. Can be CV_STEREO_BM_XSOBEL (Sobel filter) or CV_STEREO_BM_NORMALIZED_RESPONSE (maybe differences to mean intensity???)
preFilterSize: Window size of the prefilter (width = height of the window; must be an odd value)
preFilterCap: Clips the output to [-preFilterCap, preFilterCap]. What happens to the values outside the interval?
SADWindowSize: Size of the compared windows in the left and in the right image, where the sums of absolute differences are calculated to find corresponding pixels.
minDisparity: The smallest disparity which is taken into account. Default is zero; it should be set to a negative value if negative disparities are possible (this depends on the angle between the camera views and the distance of the measured object from the cameras).
numberOfDisparities: The disparity search range [minDisparity, minDisparity+numberOfDisparities].
textureThreshold: Calculate the disparity only at locations, where the texture is larger than (or at least equal to?) this threshold. How is texture defined??? Variance in the surrounding window???
uniquenessRatio: Cited from calib3d.hpp: "accept the computed disparity d* only if SAD(d) >= SAD(d*)*(1 + uniquenessRatio/100.) for any d != d*+/-1 within the search range."
speckleRange: Unsure.
trySmallerWindows: ???
roi1, roi2: Calculate the disparities only in these regions??? Unsure.
speckleWindowSize: Unsure.
disp12MaxDiff: Unsure, but a comment in calib3d.hpp says that a left-right check is performed. Guess: pixels are matched from the left image to the right image and from the right image back to the left image. The disparities are only valid if the distance between the original left pixel and the back-matched pixel is smaller than disp12MaxDiff.
speckleWindowSize and speckleRange are parameters for the function cv::filterSpeckles. Take a look at OpenCV's documentation.
cv::filterSpeckles is used to post-process the disparity map. It replaces blobs of similar disparities (where the difference between two adjacent values does not exceed speckleRange) whose size is less than or equal to speckleWindowSize (the number of pixels forming the blob) with the invalid disparity value (either short -16 or float -1.f).
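As a concrete illustration of that call (the values below are arbitrary), assuming the fixed-point int16 disparity map that StereoBM produces:

```python
import cv2
import numpy as np

# Disparity map as produced by StereoBM: 16-bit signed, fixed point (true disparity * 16).
disparity = np.zeros((480, 640), dtype=np.int16)   # placeholder; normally stereo.compute(...)

max_speckle_size = 100   # blobs with fewer pixels than this are wiped out
max_diff = 32            # adjacent values within a blob may differ by at most this amount,
                         # in the raw (fixed-point) units of the stored disparities
new_val = -16            # value written into removed blobs (-1 disparity in fixed point)

cv2.filterSpeckles(disparity, new_val, max_speckle_size, max_diff)  # operates in place
```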
The parameters are better described in the Python tutorial on depth map from stereo images. The parameters seem to be the same.
texture_threshold: filters out areas that don't have enough texture for reliable matching.
Speckle range and size: block-based matchers often produce "speckles" near the boundaries of objects, where the matching window catches the foreground on one side and the background on the other. In this scene it appears that the matcher is also finding small spurious matches in the projected texture on the table. To get rid of these artifacts we post-process the disparity image with a speckle filter controlled by the speckle_size and speckle_range parameters. speckle_size is the number of pixels below which a disparity blob is dismissed as "speckle". speckle_range controls how close in value disparities must be to be considered part of the same blob.
Number of disparities: how many pixels to slide the window over. The larger it is, the larger the range of visible depths, but more computation is required.
min_disparity: the offset from the x-position of the left pixel at which to begin searching.
uniqueness_ratio: another post-filtering step. If the best matching disparity is not sufficiently better than every other disparity in the search range, the pixel is filtered out. You can try tweaking this if texture_threshold and the speckle filtering are still letting through spurious matches.
prefilter_size and prefilter_cap: the pre-filtering phase, which normalizes image brightness and enhances texture in preparation for block matching. Normally you should not need to adjust these.
Also check out this ROS tutorial on choosing stereo parameters.
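For what it is worth, in the current OpenCV API (cv::StereoBM / cv2.StereoBM_create) most of these fields are exposed as setters. A sketch in Python with placeholder values and invented file names, which may help map the CvStereoBMState names onto the newer interface:

```python
import cv2

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # hypothetical file names
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)  # blockSize == SADWindowSize

# Pre-filtering (normalizes brightness / enhances texture before matching).
stereo.setPreFilterType(cv2.StereoBM_PREFILTER_XSOBEL)  # or StereoBM_PREFILTER_NORMALIZED_RESPONSE
stereo.setPreFilterSize(9)
stereo.setPreFilterCap(31)

# Search range.
stereo.setMinDisparity(0)

# Post-filtering.
stereo.setTextureThreshold(10)
stereo.setUniquenessRatio(15)
stereo.setSpeckleWindowSize(100)
stereo.setSpeckleRange(32)
stereo.setDisp12MaxDiff(1)

# Result is a 16-bit fixed-point disparity map (true disparity * 16).
disparity = stereo.compute(left, right).astype(float) / 16.0
```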

Deforming an image so that curved lines become straight lines

I have an image with free-form curved lines (actually lists of small line segments) overlaid onto it, and I want to generate some kind of image warp that will deform the image in such a way that these curves become horizontal straight lines.
I already have the coordinates of all the line-segment points stored separately so they don't have to be extracted from the image. What I'm looking for is an appropriate method of warping the image such that these lines are warped into straight ones.
thanks
You can use methods similar to those developed here:
http://www-ui.is.s.u-tokyo.ac.jp/~takeo/research/rigid/
What you do is define an MxN grid of control points which covers your source image.
You then need to determine how to move each of these control points so that the final image minimizes some energy function (minimum curvature or something of this sort).
The final image is a linear warp determined by your control points (think of it as a 2D mesh whose texture is your source image and whose vertex positions you are about to modify).
As long as your energy function can be expressed with linear equations, you can solve your problem globally (figuring out where to send each control point) with a linear equation solver.
You express each of your source points (those which lie on your curved lines) using the bilinear interpolation weights of their surrounding grid points, then you express your constraints on the targets by writing equations for these points.
After solving these linear equations you end up with the destination grid points, and then you just render your 2D mesh with the new vertex positions.
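This is not the implementation behind the link above, just a stripped-down sketch of the same idea with invented names: each curve point is written as a bilinear combination of its four surrounding control points, the warped point is required to land on its target horizontal line, and a weak term keeps the grid near its rest position (standing in for the energy/curvature term). Only the y-coordinates of the grid are solved for, to keep it short:

```python
import numpy as np

def solve_grid_y(grid_xs, grid_ys, curve_pts, target_ys, smooth=1.0):
    """Least-squares estimate of new y-coordinates for an MxN control grid.

    grid_xs, grid_ys: 1-D arrays of the grid column/row positions (N columns, M rows).
    curve_pts: (K, 2) array of (x, y) points lying on the curves.
    target_ys: (K,) desired y for each point (the straight line it should map to).
    Returns an (M, N) array of new grid-point y values; x positions stay fixed.
    """
    M, N = len(grid_ys), len(grid_xs)
    rows, cols, vals, rhs = [], [], [], []
    eq = 0

    # Data term: the bilinear combination of the 4 surrounding grid points
    # must land on the target y.
    for (x, y), ty in zip(curve_pts, target_ys):
        j = np.clip(np.searchsorted(grid_xs, x) - 1, 0, N - 2)
        i = np.clip(np.searchsorted(grid_ys, y) - 1, 0, M - 2)
        tx = (x - grid_xs[j]) / (grid_xs[j + 1] - grid_xs[j])
        tyy = (y - grid_ys[i]) / (grid_ys[i + 1] - grid_ys[i])
        weights = [(i, j, (1 - tx) * (1 - tyy)), (i, j + 1, tx * (1 - tyy)),
                   (i + 1, j, (1 - tx) * tyy), (i + 1, j + 1, tx * tyy)]
        for gi, gj, wt in weights:
            rows.append(eq); cols.append(gi * N + gj); vals.append(wt)
        rhs.append(ty); eq += 1

    # Regularization: keep every grid point close to its original y.
    for gi in range(M):
        for gj in range(N):
            rows.append(eq); cols.append(gi * N + gj); vals.append(smooth)
            rhs.append(smooth * grid_ys[gi]); eq += 1

    A = np.zeros((eq, M * N))
    A[rows, cols] = vals
    new_y, *_ = np.linalg.lstsq(A, np.array(rhs), rcond=None)
    return new_y.reshape(M, N)
```

The resulting grid can then be used to render the textured 2D mesh, or expanded into a dense per-pixel map for something like cv2.remap.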
You need to start out with a mapping formula that, given an output coordinate, provides the corresponding coordinate in the input image. Depending on the distortion you're trying to correct for, this can get exceedingly complex; your question doesn't specify the problem in enough detail. For example, are the curves at the top of the image the same as the curves at the bottom and the same as those in the middle? Do horizontal distances compress based on the angle of the line? Let's assume the simplest case, where the horizontal coordinate doesn't need any correction at all and the vertical simply needs a correction based on the horizontal. Here x, y are the coordinates in the input image, x', y' are the coordinates in the output image, and f() is the difference between the drawn line segment and your ideal straight line.
x = x'
y = y' + f(x')
Now you simply go through all the pixels of your output image, calculate the corresponding point in the input image, and copy the pixel. The wrinkle is that your formula is likely to give you points that lie between input pixels, such as y = 4.37. In that case you'll need to interpolate to get an intermediate value from the input; there are many interpolation methods for images and I won't go into them here. The simplest is "nearest neighbour", where you simply round the coordinate to the nearest integer (see the sketch below).
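A minimal backward-mapping sketch of exactly this x = x', y = y' + f(x') case with nearest-neighbour sampling (function and variable names are invented; f must accept an array of x' values):

```python
import numpy as np

def warp_rows(src, f):
    """Backward-mapping warp for the x = x', y = y' + f(x') case.

    f(x') gives, per output column, the vertical offset between the drawn
    curve and the ideal straight line. Nearest-neighbour interpolation.
    """
    h, w = src.shape[:2]
    out = np.zeros_like(src)
    xs = np.arange(w)
    shift = f(xs)                                  # one offset per output column
    for xp in xs:
        ys = np.arange(h) + shift[xp]              # source row for each output row
        yi = np.clip(np.rint(ys).astype(int), 0, h - 1)
        valid = (ys >= 0) & (ys <= h - 1)          # leave out-of-range rows black
        out[valid, xp] = src[yi[valid], xp]
    return out

# Example: straighten a sinusoidal bend with a 10-pixel amplitude.
# img = warp_rows(img, lambda x: 10.0 * np.sin(2 * np.pi * x / img.shape[1]))
```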
