What is the difference between bicubic and cubic? - image-processing

I am trying to do image interpolation in my code. Recently I heard of an image interpolation method called "bicubic". Does it have any relationship with "cubic"? If so, what are the similarities and differences between them?

"Bicubic" is simply cubic interpolation applied in two dimensions.
A similar term is "bilinear", which is linear interpolation in two dimensions; "trilinear" is linear interpolation in 3D. I have not yet seen the term "tricubic". :)
In general, any such interpolation scheme can be implemented using a 1D interpolation algorithm. One first interpolates, for example, the rows of the image, and then interpolates the columns of the result. In a 3D image one would then also interpolate along the 3rd dimension.
Thus, if you know how 1D cubic interpolation works, you can also derive how 2D or 3D cubic interpolation works.
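As a rough illustration of that separability, here is a sketch using SciPy's generic 1D cubic interpolator applied twice (note that image libraries usually implement "bicubic" with a fixed convolution kernel, so the numbers will differ slightly from this spline-based toy):

```python
import numpy as np
from scipy.interpolate import interp1d

def upscale_bicubic(img, factor):
    """Toy 2D cubic upscaling: 1D cubic interpolation along the rows,
    then along the columns of the intermediate result."""
    h, w = img.shape
    ys, xs = np.arange(h), np.arange(w)
    new_ys = np.linspace(0, h - 1, h * factor)
    new_xs = np.linspace(0, w - 1, w * factor)

    rows_done = interp1d(xs, img, kind='cubic', axis=1)(new_xs)   # pass 1: rows
    return interp1d(ys, rows_done, kind='cubic', axis=0)(new_ys)  # pass 2: columns

img = np.random.rand(8, 8)
big = upscale_bicubic(img, 4)   # 32x32 result
```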

Related

Where is the DCT matrix in libjpeg?

In libjpeg I am unable to locate the 8x8 DCT matrix. If I am not wrong, this matrix is always a constant for an 8x8 block; it must contain 1/sqrt(8) in the first row, but where is this matrix?
In an actual JPEG implementation, the DCT matrix is usually factored down to its Gaussian Normal Form. That gives a series of matrix multiplications. However, in the normal form, these only involve operations on the diagonal and values adjacent to the diagonal. Most of the values in the normalized matrices are zero so you can omit them.
That transforms the DCT into a series of 8 parallel operations.
This book describes a couple of ways the matrix operations can be transformed:
http://www.amazon.com/Compressed-Image-File-Formats-JPEG/dp/0201604434/ref=pd_bxgy_b_img_y
This book describes a tensor approach that is theoretically more efficient but tends not to be so in implementation:
http://www.amazon.com/JPEG-Compression-Standard-Multimedia-Standards/dp/0442012721/ref=pd_bxgy_b_img_y
It doesn't. Or maybe it's somewhere in a sneaky place, but it doesn't really matter. Real implementations of DCT don't work that way, they're very specialized pieces of code that have all the constants hardcoded into them, and they look nothing like a matrix multiplication. It is occasionally useful to view the transform as a matrix multiplication from a theoretical standpoint, but it can be implemented much more efficiently.
For the DCT in libjpeg, see for example the file jfdctflt.c (or one of its friends).
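For reference, the "textbook" 8x8 DCT matrix the question asks about can be written down directly; here is a purely illustrative NumPy sketch (this is not libjpeg code, which hard-codes the factored transform as described above):

```python
import numpy as np

# Orthonormal 8x8 DCT-II matrix: the first row is the constant 1/sqrt(8),
# the remaining rows are scaled cosines.
N = 8
C = np.zeros((N, N))
C[0, :] = 1.0 / np.sqrt(N)
for i in range(1, N):
    for j in range(N):
        C[i, j] = np.sqrt(2.0 / N) * np.cos((2 * j + 1) * i * np.pi / (2 * N))

block = np.random.rand(N, N)
coeffs = C @ block @ C.T       # 2D DCT of the block as two matrix products
recovered = C.T @ coeffs @ C   # C is orthogonal, so its transpose inverts it
assert np.allclose(recovered, block)
```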

Finding transformation between two frames

I have two consecutive frames from a video feed and I detect the keypoints in both of them using the FAST algorithm. I match the keypoints using the sum of squared differences (SSD) method.
So basically I have matched keypoints between the two frames. Now I want to calculate the affine transformation (scale + rotation + translation ) between the two frames from the set of matched keypoints.
I know how to calculate affine transformation from a pair of two points.
My question is how we can calculate it for more than two or three points. I know I have to use the least median of squares method, but I'm new to this field, so I don't know how to use it.
Can someone please explain this in detail or provide a useful link that does this in a simple way?
You could use the function findHomography (see its documentation) for that purpose.
If all the point matches you are providing are good matches, you can keep the default value of the method parameter (i.e. value 0). The least-squares method will then be used.
However, if you obtained the point matches from SSD keypoint matching, you will likely have some wrong matches among the true ones. Hence, you will obtain better results using a robust method such as RANSAC or Least Median of Squares.
Note that this findHomography function returns a perspective transform (i.e. full 3x3 matrix). If you really want an affine transform (2x3 matrix), you will have to implement the least squares (have a look at this post) or RANSAC (see this post) yourself.
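A minimal sketch of the robust route with OpenCV (the point arrays here are made-up placeholders; in practice they come from your FAST + SSD matching step):

```python
import numpy as np
import cv2

# Hypothetical matched keypoint coordinates from the two frames.
matched_prev = np.float32([[10, 10], [200, 30], [50, 180], [220, 200], [120, 90]])
matched_next = matched_prev + np.float32([5, 3])   # fake pure-translation motion

pts_prev = matched_prev.reshape(-1, 1, 2)
pts_next = matched_next.reshape(-1, 1, 2)

# RANSAC rejects the wrong matches that SSD matching inevitably produces;
# the last argument is the inlier reprojection threshold in pixels.
H, inlier_mask = cv2.findHomography(pts_prev, pts_next, cv2.RANSAC, 3.0)
print(H)
```

If your OpenCV version provides cv2.estimateAffinePartial2D, that function returns the 2x3 scale + rotation + translation matrix directly, also with RANSAC-based outlier rejection.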

Detecting a pattern of dark/bright bands in an image

I'm trying to detect a pattern like this in some images
The actual image looks something like this
It could be scaled and/or rotated. Is there a way to do that efficiently without resorting to neural nets or some learning algorithm? Can some detection be done based on the value gradient for example (dark-bright-dark-bright-dark)?
Assuming the input image is MxN (in your example M < N):
take the mean over the RGB channels to get a grayscale image
average along the Y axis to get a 1xN vector
take the derivative of that vector
take its absolute value
threshold it
calculate the distances between the resulting peaks
search for a location where the ratio between the distances is as expected (from what I see in your example, roughly 1:7:1)
if such a place is found, validate the colors in the middle of each distance (from your example they should be white-black-white); a rough sketch of these steps follows below
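```python
import numpy as np
import cv2
from scipy.signal import find_peaks

# A rough sketch of the steps above; the file name, the threshold, and the
# ~1:7:1 ratio check are placeholders that would need tuning on real images.
img = cv2.imread("bands.png")                   # hypothetical input image
gray = img.mean(axis=2)                         # mean over the RGB channels
profile = gray.mean(axis=0)                     # average along Y -> 1xN vector
edges = np.abs(np.diff(profile))                # derivative, then absolute value
peaks, _ = find_peaks(edges, height=0.5 * edges.max())   # threshold
dists = np.diff(peaks)                          # distances between the peaks

# Look for three consecutive gaps whose ratio is roughly 1:7:1.
for i in range(len(dists) - 2):
    a, b, c = dists[i], dists[i + 1], dists[i + 2]
    if 0.5 < a / c < 2.0 and 4 < b / a < 10:
        print("candidate band pattern near column", peaks[i])
```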
You might be able to use Gabor filters at varying orientations, and then do standard thresholding to identify objects.
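If you want to try that, here is a minimal sketch with OpenCV's getGaborKernel (the file name, kernel size, sigma and wavelength are placeholder values that should be tuned to the stripe width):

```python
import numpy as np
import cv2

img = cv2.imread("bands.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input image

# Bank of Gabor kernels at a few orientations.
responses = []
for theta in np.arange(0, np.pi, np.pi / 8):
    kernel = cv2.getGaborKernel((31, 31), 4.0, theta, 10.0, 0.5, 0)
    responses.append(cv2.filter2D(img, cv2.CV_32F, kernel))

# Keep the strongest response over orientations and threshold it.
best = np.max(responses, axis=0)
_, mask = cv2.threshold(best, 0.5 * best.max(), 255, cv2.THRESH_BINARY)
```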
If you know the frequency of the pattern you could try using a bandpass filter to isolate objects at that frequency. If it is a very strong frequency, you might be able to identify it in the image's Fourier transform.
Without much other knowledge about what you are looking for in your image, it will be very difficult to identify a specific repeating pattern.

Finding simple shapes in 2D point clouds

I am currently looking for a way to fit a simple shape (e.g. a T or an L shape) to a 2D point cloud. What I need as a result is the position and orientation of the shape.
I have been looking at a couple of approaches but most seem very complicated and involve building and learning a sample database first. As I am dealing with very simple shapes I was hoping that there might be a simpler approach.
By saying you don't want to do any training I am guessing that you mean you don't want to do any feature matching; feature matching is used to make good guesses about the pose (location and orientation) of the object in the image, and would be applicable along with RANSAC to your problem for guessing and verifying good hypotheses about object pose.
The simplest approach is template matching, but it may be too computationally expensive (it depends on your use case). In template matching you simply loop over the possible locations, orientations, and scales of the object and check how well the template (a cloud that looks like an L or a T at that location, orientation, and scale) matches; alternatively, you sample possible locations, orientations, and scales randomly. The check against the template can be made fairly fast if your points are organised (or if you organise them, e.g. by converting them into pixels).
If this is too slow there are many methods for making template matching faster and I would recommend to you the Generalised Hough Transform.
Here, before starting the search for templates you loop over the boundary of the shape you are looking for (T or L) and for each point on its boundary you look at the gradient direction and then the angle at that point between the gradient direction and the origin of the object template, and the distance to the origin. You add that to a table (Let us call it Table A) for each boundary point and you end up with a table that maps from gradient direction to the set of possible locations of the origin of the object. Now you set up a 2D voting space, which is really just a 2D array (let us call it Table B) where each pixel contains a number representing the number of votes for the object in that location. Then for each point in the target image (point cloud) you check the gradient and find the set of possible object locations as found in Table A corresponding to that gradient, and then add one vote for all the corresponding object locations in Table B (the Hough space).
This is a very terse explanation but knowing to look for Template Matching and Generalised Hough transform you will be able to find better explanations on the web. E.g. Look at the Wikipedia pages for Template Matching and Hough Transform.
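A rough sketch of the brute-force template matching idea with OpenCV (the toy point clouds, raster sizes, blur, and angle step are all made up for illustration; a real implementation would also search over scale):

```python
import numpy as np
import cv2

def rasterize(points, shape):
    """Turn an Nx2 point cloud (x, y in pixel-like coordinates) into a binary
    image so OpenCV's image-based matching functions can be used on it."""
    img = np.zeros(shape, np.uint8)
    pts = np.round(points).astype(int)
    img[pts[:, 1], pts[:, 0]] = 255
    return img

# Hypothetical point clouds: an "L" shape, and a scene containing a rotated,
# translated copy of it plus some clutter.
l_shape = np.float32([[x, 0] for x in range(60)] + [[0, y] for y in range(1, 40)])
theta = np.deg2rad(25)
R = np.float32([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
template_points = l_shape + 30                       # roughly centred in a 120x120 canvas
scene_points = np.vstack([l_shape @ R.T + [140, 90],
                          np.random.rand(40, 2) * 290])

# Blurring makes the sparse dot images tolerant to small misalignments.
template = cv2.GaussianBlur(rasterize(template_points, (120, 120)), (9, 9), 0)
scene = cv2.GaussianBlur(rasterize(scene_points, (300, 300)), (9, 9), 0)

best_score, best_pose = -1.0, None
for angle in range(0, 360, 5):                       # brute-force over orientation
    M = cv2.getRotationMatrix2D((60, 60), angle, 1.0)
    rotated = cv2.warpAffine(template, M, (120, 120))
    res = cv2.matchTemplate(scene, rotated, cv2.TM_CCORR_NORMED)
    _, score, _, loc = cv2.minMaxLoc(res)
    if score > best_score:
        best_score, best_pose = score, (loc, angle)

print("best location / orientation:", best_pose, "score:", best_score)
```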
You may need to:
1- Extract some features from the image in which you are looking for the object.
2- Extract another set of features from the image of the object.
3- Match the features (this is possible using methods like SIFT).
4- When you find a match, apply the RANSAC algorithm; it provides you with a transformation matrix (including translation and rotation information).
To use SIFT, start from here; it is actually one of the best source codes written for SIFT. It includes the RANSAC algorithm, so you do not need to implement it yourself.
You can read about RANSAC here.
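A minimal OpenCV sketch of steps 1-4 ("object.png" and "scene.png" are placeholder file names; SIFT is available as cv2.SIFT_create() in OpenCV 4.4+):

```python
import numpy as np
import cv2

img_object = cv2.imread("object.png", cv2.IMREAD_GRAYSCALE)   # placeholder
img_scene = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)     # placeholder

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img_object, None)
kp2, des2 = sift.detectAndCompute(img_scene, None)

# Steps 1-3: match descriptors and keep the strongest matches.
matches = sorted(cv2.BFMatcher().match(des1, des2), key=lambda m: m.distance)[:50]
src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# Step 4: RANSAC-based estimation of the transformation matrix.
H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
```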
Two common ways of detecting the shapes (L, T, ...) in your 2D point-cloud data are using OpenCV or the Point Cloud Library (PCL). I'll explain the steps you may take for detecting those shapes in OpenCV. You can use the following three methods, and the choice of the right method depends on the shape (size, area of the shape, ...):
Hough Line Transformation
Template Matching
Finding Contours
The first step would be converting your points into a grayscale Mat object; by doing that you basically make an image of your 2D point-cloud data, so you can use the other OpenCV functions. Then you may smooth the image to reduce noise, and the result would be a somewhat blurry image that still contains the real edges; if your application does not need real-time processing, you can use bilateralFilter. You can find more information about smoothing here.
The next step would be choosing the method. If the shape is just some set of orthogonal lines (such as L or T), you can use the Hough Line Transform to detect the lines; after detection, you can loop over the lines and calculate the dot product of their direction vectors (since the lines are orthogonal, the result should be 0). You can find more information about the Hough Line Transform here.
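A small sketch of that orthogonality check (the file name and the Canny/Hough parameters are placeholders):

```python
import numpy as np
import cv2

img = cv2.imread("cloud_image.png", cv2.IMREAD_GRAYSCALE)   # placeholder file name
edges = cv2.Canny(img, 50, 150)
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=30,
                        minLineLength=20, maxLineGap=5)

# Check pairs of detected segments for orthogonality via the dot product
# of their normalised direction vectors.
if lines is not None:
    dirs = []
    for x1, y1, x2, y2 in lines[:, 0]:
        v = np.array([x2 - x1, y2 - y1], dtype=float)
        dirs.append(v / np.linalg.norm(v))
    for i in range(len(dirs)):
        for j in range(i + 1, len(dirs)):
            if abs(np.dot(dirs[i], dirs[j])) < 0.1:   # close to 0 -> roughly orthogonal
                print("roughly orthogonal pair of segments:", i, j)
```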
Another way would be detecting your shape using template matching. Basically, you should make a template of your shape (L or T) and use it in the matchTemplate function. You should consider that the size of the template should be on the order of your image size; otherwise you may need to resize your image. More information about the algorithm can be found here.
If the shapes enclose areas, you can find the contours of the shape using findContours; it will give you the polygons around the shape you want to detect. For instance, if your shape is an L, its contour would be a polygon with roughly 6 line segments. You can also use some other filters along with findContours, such as calculating the area of the shape.
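A sketch of the contour route (OpenCV 4.x return signature; the file name, the approximation epsilon, and the area threshold are guesses):

```python
import numpy as np
import cv2

img = cv2.imread("cloud_image.png", cv2.IMREAD_GRAYSCALE)   # placeholder file name
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

for cnt in contours:
    # Approximate the contour with a polygon; an "L" shape should come out
    # with roughly 6 vertices, and its area can serve as an extra filter.
    poly = cv2.approxPolyDP(cnt, 0.02 * cv2.arcLength(cnt, True), True)
    if len(poly) == 6 and cv2.contourArea(cnt) > 100:
        print("possible L shape near", cv2.boundingRect(cnt)[:2])
```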

How to match texture similarity in images?

What are the ways in which to quantify the texture of a portion of an image? I'm trying to detect areas that are similar in texture in an image, sort of a measure of "how closely similar are they?"
So the question is what information about the image (edge, pixel value, gradient etc.) can be taken as containing its texture information.
Please note that this is not based on template matching.
Wikipedia didn't give much detail on actually implementing any of the texture analyses.
Do you want to find two distinct areas in the same image that look the same (same texture), or match a texture in one image to another image?
The second is harder due to different radiometry.
Here is a basic scheme of how to measure similarity of areas.
You write a function which takes an area in the image as input and calculates a scalar value, like the average brightness. This scalar is called a feature.
You write more such functions to obtain about 8-30 features, which together form a vector that encodes information about the area in the image.
Calculate such a vector for both areas that you want to compare.
Define a similarity function which takes two vectors and outputs how much they are alike.
You need to focus on steps 2 and 4.
Step 2: use the following features: std() of brightness, some kind of corner detector, an entropy filter, a histogram of edge orientations, a histogram of FFT frequencies (x and y directions). Use color information if available.
Step 4: you can use cosine similarity, min-max, or weighted cosine.
After you implement about 4-6 such features and a similarity function, start to run tests. Look at the results and try to understand why or where it doesn't work. Then add a specific feature to cover that case.
For example, if you see that a texture with big blobs is regarded as similar to a texture with tiny blobs, then add a morphological filter that calculates the density of objects with size > 20 square pixels.
Iterate this process (identify a problem, design a specific feature) about 5 times and you will start to get very good results.
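A small sketch of steps 1-4 with a handful of the features listed above (which features you use, and how many, is exactly the part you would iterate on):

```python
import numpy as np

def texture_features(patch):
    """A few simple scalar features of a grayscale patch (steps 1-2):
    brightness statistics, an edge-orientation histogram, and low FFT frequencies."""
    gy, gx = np.gradient(patch.astype(float))
    orient_hist, _ = np.histogram(np.arctan2(gy, gx), bins=8, range=(-np.pi, np.pi))
    fft_mag = np.abs(np.fft.rfft(patch.mean(axis=0)))
    return np.concatenate([[patch.mean(), patch.std()],
                           orient_hist / (orient_hist.sum() + 1e-9),
                           fft_mag[:8] / (fft_mag.max() + 1e-9)])

def cosine_similarity(a, b):
    """Step 4: similarity of two feature vectors."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

# Two hypothetical grayscale patches cut from the image.
patch_a = np.random.rand(64, 64)
patch_b = np.random.rand(64, 64)
print(cosine_similarity(texture_features(patch_a), texture_features(patch_b)))
```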
I'd suggest to use wavelet analysis. Wavelets are localized in both time and frequency and give a better signal representation using multiresolution analysis than FT does.
There is a paper explaining a wavelet approach to texture description. There is also a comparison method.
You might need to slightly modify an algorithm to process images of arbitrary shape.
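A minimal sketch of a wavelet-based texture feature using PyWavelets (this is a generic sub-band energy descriptor, not the exact method of the paper mentioned above):

```python
import numpy as np
import pywt   # PyWavelets, assumed to be installed

def wavelet_texture_features(patch, wavelet="db2", level=3):
    """Energy of the detail sub-bands at each decomposition level -- a common
    simple wavelet texture descriptor."""
    coeffs = pywt.wavedec2(patch.astype(float), wavelet, level=level)
    feats = []
    for detail in coeffs[1:]:           # (horizontal, vertical, diagonal) per level
        feats.extend(np.mean(np.square(band)) for band in detail)
    return np.array(feats)

patch = np.random.rand(64, 64)          # hypothetical grayscale patch
print(wavelet_texture_features(patch))
```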
An interesting approach for this is to use Local Binary Patterns (LBP).
Here is a basic example with some explanations: http://hanzratech.in/2015/05/30/local-binary-patterns.html
See that method as one of the many different ways to get features from your pictures. It corresponds to the 2nd step of DanielHsH's method.
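A small sketch using scikit-image's local_binary_pattern (the patch here is random data just to show the call; P and R are typical values):

```python
import numpy as np
from skimage.feature import local_binary_pattern   # scikit-image

# Hypothetical grayscale patch; P neighbours on a circle of radius R.
patch = (np.random.rand(64, 64) * 255).astype(np.uint8)
P, R = 8, 1
lbp = local_binary_pattern(patch, P, R, method="uniform")

# The normalised histogram of LBP codes is the texture feature vector,
# which can then be compared with e.g. the cosine similarity from step 4 above.
hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
print(hist)
```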
