FFT index of vertical spatial frequency - image-processing

Hi, I have a problem with the following:
I have f(x,y), a 64x64 image coming from a sensor chip. I would like to use a filtering approach in the frequency domain, so I zero pad the image f(x,y) by a factor of 4 and then multiply it by (-1)^(x+y). The resulting image is then run through a 2D FFT. I can assume that all indices start at 0. Now the question is: at what FFT index would I find the vertical spatial frequency of +20 cycles per original chip side length?
Well, OK. I have my 64x64 image and zero pad it so that I then have a 128x128 image. Is that correct? Also, I don't understand what exactly the vertical spatial frequency of +20 cycles is.
Can somebody help me? Thank you!
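One way to check the reasoning numerically is the following minimal sketch. It assumes that "zero pad by a factor of 4" applies to each side (64 -> 256, not 128), and it uses a single image column, because the 2D DFT is separable and the vertical frequency only depends on the vertical axis. Multiplying by (-1)^(x+y) moves zero frequency to index N/2, and 20 cycles per 64 pixels become 80 cycles per 256 pixels. The names SIDE, PAD_FACTOR, and dft are made up for this illustration.

```python
import cmath

SIDE = 64          # original image side length in pixels
PAD_FACTOR = 4     # assumption: padding applies per side, 64 -> 256
N = SIDE * PAD_FACTOR
CYCLES = 20        # +20 cycles per original side length

# One image column carrying only the +20 cycles/side component,
# followed by zeros up to the padded length.
column = [cmath.exp(2j * cmath.pi * CYCLES * y / SIDE) for y in range(SIDE)]
column += [0j] * (N - SIDE)

# The (-1)^(x+y) factor reduces to (-1)^y along a single column.
column = [v * (-1) ** y for y, v in enumerate(column)]

def dft(signal):
    """Naive O(n^2) DFT, enough for a 256-sample check."""
    n = len(signal)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * k * x / n) for x, v in enumerate(signal))
        for k in range(n)
    ]

spectrum = dft(column)
peak_index = max(range(N), key=lambda k: abs(spectrum[k]))

# Zero frequency sits at N/2 after the modulation; each cycle per original
# side corresponds to PAD_FACTOR bins in the padded spectrum.
predicted_index = N // 2 + CYCLES * PAD_FACTOR

print(peak_index, predicted_index)
```

Under these assumptions both values come out as 208 along the vertical frequency axis, while the index along the horizontal axis stays at the shifted zero frequency, 128. Which array axis counts as "vertical" depends on how the image is stored.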

Related

Slanted edge MTF evaluation with ImageJ plugin

I am trying to evaluate an optical system by calculating the MTF with the slanted edge method. For this I use the following ImageJ plugin:
https://imagej.nih.gov/ij/plugins/se-mtf/index.html
Now I want to calculate the MTF with the frequency units "lp/mm". For this I have to enter the "Sensor size (mm)" and the "Number of photodetectors". Sadly, I cannot find any description of what exactly these values are. If I use the diagonal of the sensor in mm as the first value and the total number of pixels of my sensor as the second value, I get nonsense values (very high frequencies, higher than 100000 lp/mm).
Does anyone have experience with this tool and can give me a hint on what values I need here?
Thanks a lot in advance!
I am also not 100% sure, but I guess it's the sensor width and the number of pixels along the sensor width.
The two input values are only there to fix the scale; they could even have been reduced to a single one (1 voxel = xx µm or mm).
So "Sensor size (mm)" can be any length (in mm) within the image considered; just choose it consistent with the real size of the image.
Then "Number of photodetectors" is the number of voxels that corresponds to the length entered above.
With these two values, ImageJ knows the scale of the image in mm per voxel.
Last but not least, two things: (1) in your (square) ROI selection, do not forget that the void must be on the left-hand side; (2) the most accurate result is obtained when the material wall is vertical in your image (otherwise, when it is not vertical, the result is biased compared with a vertical wall).
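To see why the two values only set a scale, here is a minimal sketch of the usual conversion from cycles per pixel to lp/mm through the pixel pitch. Whether the plugin computes it exactly this way is an assumption, and the sensor figures (6.4 mm width, 1280 x 1024 pixels, 8.0 mm diagonal) are hypothetical numbers chosen for the example.

```python
def lp_per_mm(cycles_per_pixel, sensor_size_mm, n_photodetectors):
    """Convert a frequency in cycles/pixel to lp/mm via the pixel pitch.

    sensor_size_mm and n_photodetectors must describe the same length:
    a size in mm and the number of pixels spanning that size.
    """
    pitch_mm = sensor_size_mm / n_photodetectors
    return cycles_per_pixel / pitch_mm

NYQUIST = 0.5  # cycles per pixel

# Consistent pair: sensor width and pixels along that width (hypothetical values).
consistent = lp_per_mm(NYQUIST, sensor_size_mm=6.4, n_photodetectors=1280)

# Inconsistent pair: sensor diagonal combined with the total pixel count.
inconsistent = lp_per_mm(NYQUIST, sensor_size_mm=8.0, n_photodetectors=1280 * 1024)

print(consistent, inconsistent)
```

With the consistent pair the Nyquist frequency lands near 100 lp/mm. With the diagonal and the total pixel count it lands near 82000 lp/mm, the same kind of nonsense value described in the question.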

How bilinear interpolation works when down scaling?

I clearly understand how bilinear interpolation works when upscaling an image (filling in values from the 4 nearest neighbours), but I can't understand how it works when downscaling an image. It would mean a lot to me if someone could clarify this.
Scaling an image requires mapping pixels from the input to pixels on the output. If those pixel coordinates don't map to an integer, interpolation is required to estimate what the pixel value would have been. The "Bi" part of bilinear means it's linear interpolation applied in two dimensions independently. If for example output pixel 2,3 needs to come from input coordinates 1.5,7.2 you would interpolate in the X direction by taking 0.5 of each of the pixels at 1.0 and 2.0, then interpolate in the Y direction by taking 0.8 of the pixel at 7.0 and 0.2 of the pixel at 8.0. Usually these operations are combined into a single set of equations, but they can be applied separately if needed.
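The worked example above (input coordinates 1.5, 7.2) can be sketched as follows. This is a minimal illustration, assuming the image is a list of rows indexed as img[y][x] and that the sample coordinates lie inside the image; the function name bilinear_sample is made up for this example.

```python
import math

def bilinear_sample(img, x, y):
    """Estimate the value at fractional coordinates (x, y) in img[y][x]."""
    x0, y0 = math.floor(x), math.floor(y)
    # Clamp the second neighbour so samples on the last row/column stay valid.
    x1 = min(x0 + 1, len(img[0]) - 1)
    y1 = min(y0 + 1, len(img) - 1)
    tx, ty = x - x0, y - y0

    # Interpolate along X on the two neighbouring rows...
    top = (1 - tx) * img[y0][x0] + tx * img[y0][x1]
    bottom = (1 - tx) * img[y1][x0] + tx * img[y1][x1]
    # ...then along Y between those two results.
    return (1 - ty) * top + ty * bottom

# A linear ramp: the value at (x, y) is 10*y + x, so the interpolated value
# at (1.5, 7.2) should be close to 10*7.2 + 1.5 = 73.5.
ramp = [[10 * y + x for x in range(4)] for y in range(10)]
value = bilinear_sample(ramp, 1.5, 7.2)
print(value)
```

Here tx is 0.5 (half of each of the pixels at x = 1 and x = 2) and ty is about 0.2 (0.8 of the row at y = 7, 0.2 of the row at y = 8), matching the weights in the text.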
Bilinear is a poor choice for downscaling because it leads to aliasing artifacts. This is when you attempt to create spatial frequencies that are beyond the Nyquist sampling limit, and high frequency detail turns into low frequency artifacts. You can minimize this by blurring the image before you downscale it. Or you can choose an interpolation algorithm that incorporates some low pass filtering.
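The aliasing problem can be shown on a single row of pixels. The sketch below is only an illustration under stated assumptions: it works in 1D, maps output pixel i to input coordinate i * factor (one of several possible alignment conventions), and uses a simple forward-looking box average as the blur. All function names are made up for this example.

```python
def lerp_sample(row, x):
    """Linear interpolation of row at a non-negative fractional position x."""
    x0 = int(x)
    x1 = min(x0 + 1, len(row) - 1)
    t = x - x0
    return (1 - t) * row[x0] + t * row[x1]

def downscale_linear(row, factor):
    """Downscale by sampling the row at every factor-th position."""
    return [lerp_sample(row, i * factor) for i in range(len(row) // factor)]

def box_blur(row, width):
    """Average each pixel with the following width-1 pixels (truncated at the end)."""
    out = []
    for i in range(len(row)):
        window = row[i:i + width]
        out.append(sum(window) / len(window))
    return out

# One-pixel-wide black/white stripes: the finest detail the row can hold.
stripes = [0, 255] * 8

naive = downscale_linear(stripes, 2)
smoothed = downscale_linear(box_blur(stripes, 2), 2)

print(naive)
print(smoothed)
```

Without blurring, every sample lands on a black pixel, so the stripes turn into a solid black row even though the average brightness is mid-gray. With the blur applied first, every output pixel comes out as 127.5, which is the local average a viewer would expect.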

Determining pixel coordinates across display resolutions

If a program displays a pixel at X,Y on a display with resolution A, can I precisely predict at what coordinates the same pixel will display at resolution B?
MORE INFORMATION
The 2 display resolutions are:
A-->1366 x 768
B-->1600 x 900
Dividing the max resolutions in each direction yields:
X-direction scaling factor = 1600/1366 = 1.171303075
Y-direction scaling factor = 900/768 = 1.171875
Say for example that the only red pixel on display A occurs at pixel (1,1). If I merely scale up using these factors, then on display B, that red pixel will be displayed at pixel (1.171303075, 1.171875). I'm not sure how to interpret that, as I'm used to thinking of pixels as integer values. It might help if I knew the exact geometry of pixel coordinates/placement on a screen. e.g., do pixel coordinates (1,1) mean that the center of the pixel is at (1,1)? Or a particular corner of the pixel is at (1,1)? I'm sure diagrams would assist in visualizing this--if anyone can post a link to helpful resources, I'd appreciate it. And finally, I may be approaching this all wrong.
Thanks in advance.
I think your problem is related to the field of scaling/resampling images. Bitmap (or raster) images are the most common way to represent natural images that are rich in detail, such as digital photographs. The term bitmap refers to how a given pattern (the bits in a pixel) maps to a specific color. A bitmap image takes the form of an array, where the value of each element, called a pixel (picture element), corresponds to the color of that region of the image.
Sampling
When measuring the value for a pixel, one takes the average color of an area around the location of the pixel. A simplistic model is sampling a square, and a more accurate measurement is to calculate a weighted Gaussian average. When perceiving a bitmap image the human eye should blend the pixel values together, recreating an illusion of the continuous image it represents.
Raster dimensions
The number of horizontal and vertical samples in the pixel grid is called the raster dimensions; it is specified as width x height.
Resolution
Resolution is a measurement of sampling density. The resolution of a bitmap image gives the relationship between its pixel dimensions and its physical dimensions. The most commonly used unit is ppi, pixels per inch.
Scaling / Resampling
Image scaling is the process of creating an image with different dimensions from the one we have. A different name for scaling is resampling. Resampling algorithms try to reconstruct the original continuous image and then sample it on a new grid. There are two kinds of scaling: up and down.
Scaling image down
The process of reducing the raster dimensions is called decimation. It can be done by averaging the values of the source pixels that contribute to each output pixel.
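As a minimal sketch of decimation by averaging, assuming an integer scale factor and an image stored as a list of rows (img[y][x]), each output pixel can be taken as the mean of a factor x factor block of source pixels. The function name and the sample values are made up for this illustration.

```python
def downscale_by_averaging(img, factor):
    """Shrink img by an integer factor, averaging each factor x factor block."""
    out_h = len(img) // factor
    out_w = len(img[0]) // factor
    out = []
    for by in range(out_h):
        row = []
        for bx in range(out_w):
            # Collect the source pixels that contribute to this output pixel.
            block = [
                img[by * factor + dy][bx * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

image = [
    [0, 2, 4, 6],
    [2, 4, 6, 8],
    [10, 10, 20, 20],
    [10, 10, 20, 20],
]
small = downscale_by_averaging(image, 2)
print(small)  # [[2.0, 6.0], [10.0, 20.0]]
```

Non-integer scale factors, such as 1366 to 1600, need weighted contributions from partially covered pixels, which this sketch does not handle.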
Scaling image up
When we increase the image size, we actually want to create sample points between the original sample points in the original raster. This is done by interpolating the values in the sample grid, effectively guessing the values of the unknown pixels. The interpolation can be nearest-neighbor, bilinear, bicubic, etc. In either direction, the scaled image must again be represented on a discrete grid.
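For the concrete case in the question (1366 x 768 to 1600 x 900), one possible convention is sketched below. It assumes that pixel (x, y) covers the square from (x, y) to (x+1, y+1), so its center sits at (x+0.5, y+0.5); other systems put the center on the integer coordinate, so this is an assumption, not a rule. The function names are made up for this illustration.

```python
import math

def footprint(x, src_w, dst_w):
    """Range on the destination axis covered by source pixel x (one axis)."""
    scale = dst_w / src_w
    return x * scale, (x + 1) * scale

def map_pixel(x, y, src_size, dst_size):
    """Destination pixel that contains the center of source pixel (x, y)."""
    sx = dst_size[0] / src_size[0]
    sy = dst_size[1] / src_size[1]
    # Scale the pixel center, then find the destination pixel holding it.
    return math.floor((x + 0.5) * sx), math.floor((y + 0.5) * sy)

A = (1366, 768)
B = (1600, 900)

lo, hi = footprint(1, A[0], B[0])
print(lo, hi)                      # horizontal extent of pixel 1 on display B
print(map_pixel(1, 1, A, B))       # pixel holding the scaled center
print(map_pixel(1365, 767, A, B))  # last pixel of A maps to last pixel of B
```

Under this convention the red pixel at (1,1) covers roughly 1.17 to 2.34 along X on display B, so it overlaps two destination pixels. A renderer has to either pick one of them (as map_pixel does) or blend the color across both, which is why the exact result depends on the scaling algorithm in use.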

Comparison metric for two open contours

I'm validating an image segmentation algorithm applied to 2D images. The algorithm generates a contour segment, i.e. a set of connected pixels that form a free curve in 2D space. The idea is to compare this set of pixels with a ground truth, in my case another contour segment manually traced by an expert. An image showing what would be a segmentation result and the corresponding manual (ground-truth) segmentation is shown below:
I'm trying to think of an adequate comparison metric to validate the segmentation results. Ideally, the best metric would be the point-to-point Euclidean distance between corresponding pairs of pixels on each segment. However (as seen in the previous figure), the segments don't have the same length (i.e. they differ in the total number of pixels), so pixel-to-pixel comparisons have to be discarded.
Can you suggest an adequate metric for validating my algorithm? Thanks for any suggestion!
For each pixel in the ground truth, take the distance to the nearest pixel in the segmentation result. Then take the sum of that for all ground truth pixels as the total error.
That's basically recall weighted by distance. If you start with the pixels in the result, it would resemble precision instead.
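The two directed sums described above can be sketched as follows. This is a brute-force illustration, assuming each contour is a list of (x, y) pixel coordinates; the function name and the sample points are made up for this example.

```python
import math

def directed_error(source, target):
    """Sum, over source pixels, of the distance to the nearest target pixel."""
    return sum(min(math.dist(p, q) for q in target) for p in source)

ground_truth = [(0, 0), (1, 0), (2, 0), (3, 0)]
result = [(0, 1), (1, 1), (2, 1)]

# Starting from the ground truth: penalises parts of the true contour
# that the result misses (recall-like).
recall_like = directed_error(ground_truth, result)

# Starting from the result: penalises result pixels that lie far from
# the true contour (precision-like).
precision_like = directed_error(result, ground_truth)

print(recall_like, precision_like)
```

In this example the result is one pixel shorter than the ground truth, so the recall-like sum (about 4.41) is larger than the precision-like sum (3.0). Dividing each sum by the number of source pixels would make contours of different lengths comparable; that normalisation is a suggestion beyond what the answer states.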
If the curves are closed, you can compute the area between the curves. If you can tell which pixels belong to a segment, that is as easy as computing the XOR (symmetric difference) of the two pixel sets.
Here is an example that I created using Matlab:
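The XOR idea can also be sketched in Python with plain sets. This is a minimal illustration, assuming each closed curve has already been filled so that a region is a set of (x, y) pixel coordinates; the two square regions are made up for this example.

```python
# Pixels enclosed by the ground-truth curve: a 4x4 square at x = 0..3.
truth_region = {(x, y) for x in range(4) for y in range(4)}

# Pixels enclosed by the segmentation result: the same square shifted right by one.
result_region = {(x, y) for x in range(1, 5) for y in range(4)}

# Pixels that belong to exactly one of the two regions.
mismatch = truth_region ^ result_region
area_between = len(mismatch)

print(area_between)  # 8: one column missed on the left, one extra on the right
```

This only applies to closed curves; for the open contours in the question there is no enclosed region to fill.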
You could divide each line into n segments of equal length, then compute the euclidean distance between each segment and its pair on the other line.
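A variation on this idea is sketched below: instead of comparing segments, each curve is resampled at n points spaced evenly by arc length, and the distances between corresponding points are averaged. It assumes each curve is an ordered list of at least two (x, y) points, that n is at least 2, and that both curves are traced in the same direction. The function names are made up for this illustration.

```python
import math

def resample(curve, n):
    """Return n points spaced evenly by arc length along a polyline (n >= 2)."""
    # Cumulative arc length at each vertex.
    cum = [0.0]
    for p, q in zip(curve, curve[1:]):
        cum.append(cum[-1] + math.dist(p, q))
    total = cum[-1]

    points = []
    seg = 0
    for k in range(n):
        target = total * k / (n - 1)
        # Advance to the polyline segment that contains the target length.
        while seg < len(curve) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0 else (target - cum[seg]) / span
        (x0, y0), (x1, y1) = curve[seg], curve[seg + 1]
        points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return points

def mean_paired_distance(curve_a, curve_b, n=50):
    """Average distance between corresponding resampled points."""
    pa, pb = resample(curve_a, n), resample(curve_b, n)
    return sum(math.dist(p, q) for p, q in zip(pa, pb)) / n

line_a = [(0, 0), (10, 0)]
line_b = [(0, 1), (5, 1), (10, 1)]  # same shape, different number of points
score = mean_paired_distance(line_a, line_b, n=5)
print(score)
```

The two test curves are parallel lines one pixel apart, so the score comes out as 1 (up to rounding) even though the curves are described by different numbers of points.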

Computing HOG features

I have a problem with the second step, which is to accumulate weighted votes for gradient orientation over spatial cells.
Assume the cell is 8*8. Let me use two matrices, GO[8][8] (with values in [1 9]) and GM[8][8], to represent the gradient orientation and the gradient magnitude respectively.
The gradient orientation ranges from 0 to 180 and there are 9 orientation bins.
According to my understanding of HOG, for every pixel in a cell, we add its gradient magnitude to its corresponding orientation bin. In this way, we get the histogram for every cell.
But there is one sentence that's confusing me.
"To reduce aliasing, votes(gradient magnitude) are interpolated
trilinearly between the neighbouring bin centers in both orientation
and position."1
Why interpolate? How should I interpolate? Can someone explain this in more detail? I don't understand the part about reducing aliasing.
Thanks in advance.
1 This sentence is in Navneet Dalal's PhD thesis, p. 38, line 4.
Interpolation is a standard technique for computing histograms. The idea here is that each value is not simply placed into one bin, but is distributed between two neighboring bins (assuming a 1d histogram), based on how far away it is from the center of the original bin.
The purpose of this is to deal with situations when a small error in your measurement can cause a value to be placed into a different bin. This is a very good thing to do for any type of histogram, not just for HOGs, assuming you have the CPU cycles.
There is also bi-linear and tri-linear interpolation for 2d and 3d histograms, where each value is distributed between 4 and 8 neighboring bins respectively.
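For the orientation part alone, the 1D case can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes 9 bins over 0 to 180 degrees with bin centers at 10, 30, ..., 170, and it wraps around at the ends because the orientation is unsigned. The spatial part of the trilinear interpolation (sharing votes between neighbouring cells) is left out. The function name is made up for this example.

```python
import math

N_BINS = 9
BIN_WIDTH = 180.0 / N_BINS  # 20 degrees per bin

def add_vote(hist, angle_deg, magnitude):
    """Split one gradient vote between the two nearest orientation bins."""
    # Position measured in bins, relative to the center of bin 0 (10 degrees).
    pos = angle_deg / BIN_WIDTH - 0.5
    lower = math.floor(pos)
    t = pos - lower  # fraction of the way towards the upper bin's center
    # Modulo handles the wrap-around between the last and the first bin.
    hist[lower % N_BINS] += magnitude * (1 - t)
    hist[(lower + 1) % N_BINS] += magnitude * t

hist = [0.0] * N_BINS
add_vote(hist, 25.0, 1.0)  # between centers 10 and 30, closer to 30
print(hist[0], hist[1])    # 0.25 0.75

wrapped = [0.0] * N_BINS
add_vote(wrapped, 5.0, 1.0)  # between centers 170 (bin 8) and 10 (bin 0)
print(wrapped[8], wrapped[0])
```

A gradient at 25 degrees is 15 degrees from the center of bin 0 and 5 degrees from the center of bin 1, so bin 1 receives three quarters of the vote. A small measurement error near a bin boundary then shifts the weights slightly instead of moving the whole vote to a different bin.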

Resources