Is pixel value normalization needed in medical image segmentation? - image-processing

I have a dataset of CT scans of hips. I'm currently not normalizing the pixel values, because in a CT scan the pixel values represent different materials (bone 1000+, water 0, air -1000, etc.). Also, the range of pixel values changes from scan to scan (e.g. -500:1500, -400:1200).
I'm wondering whether normalizing the pixel values to [0, 1] would be a plus for my training, or whether I would lose the information carried by the relation between the raw pixel values and the segmentation ground truth.
Thanks for the answers

It depends a little on your data. What you are describing are so-called Hounsfield Units (worth reading up on): you basically express every intensity relative to that of water.
Bone density (and with it the corresponding intensity) can vary greatly, not to mention when metal is present.
Your HU range depends heavily on the body region and, above all, on the patient.
https://images.app.goo.gl/WNLCs8eENTdbXWwM7
CT scans are usually uint16 grayscale. I would definitely normalize, as long as you can ensure that your float range is sufficient to accommodate the 2^16 different grayscale values.
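As an illustration, here is a minimal sketch (Python/NumPy, my own addition rather than part of the answer) of the two usual options: rescaling each scan's own intensity range to [0, 1], or clipping to a fixed HU window first so that a given HU value maps to the same normalized value in every scan. The window limits below are arbitrary examples, not recommendations.

```python
import numpy as np

def normalize_per_scan(volume):
    """Scale one scan's own intensity range to [0, 1] (the mapping differs per scan)."""
    vol = volume.astype(np.float32)
    return (vol - vol.min()) / (vol.max() - vol.min() + 1e-8)

def normalize_fixed_window(volume, hu_min=-1000.0, hu_max=1500.0):
    """Clip to a fixed HU window and scale to [0, 1].

    With a fixed window the same HU value maps to the same normalized value
    in every scan, so the HU/tissue relationship is preserved.
    """
    vol = np.clip(volume.astype(np.float32), hu_min, hu_max)
    return (vol - hu_min) / (hu_max - hu_min)
```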

Related

Interpreting unsigned short depth map values

I'm trying to test my algorithm on the lineMOD object detection dataset. According to the author, the depth values are stored as unsigned short values. I've managed to load the depth values into a cv::Mat, but I would like to convert them to the typical float representation [0, 1].
At first I assumed that I just have to divide by the maximum unsigned short value, but this doesn't seem to be the case, since the maximum value I find is 3399, while there are a lot of zeros in the depth map. I suppose the zeros mean that the specific pixel is a point that is too far away for the depth camera to detect.
Is it possible that these unsigned shorts represent millimeters? If not, how should I convert the depth values before applying the transforms that generate the point cloud?
I guess the pixel values are not millimeters but rather some relative values, because it is easier for a depth camera to produce relative depth values than accurate millimeter values; the values might not even be linear. Consult the author to get more information.
You may try a few options (a sketch of the two scaling approaches follows below):
Consult the author to fully understand what the depth values mean, then do the conversion accordingly.
Find out what the actual pixel range is within a single image, or across all of your images, say [534, 4399], scale it to [0.1, 1.0], and set the zeros to 0.0.
Simply scale the full range of unsigned short [0, 65535] to [0.0, 1.0].
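A minimal sketch of those two scaling options (Python/NumPy, my own addition; the depth map is assumed to be already loaded as a uint16 array, and [534, 4399] is just the example range quoted above):

```python
import numpy as np

def scale_observed_range(depth, lo=534, hi=4399):
    """Scale the observed value range [lo, hi] to [0.1, 1.0], keeping zeros at 0.0."""
    d = depth.astype(np.float32)
    out = np.zeros_like(d)
    valid = depth > 0                      # zeros = no measurement
    out[valid] = 0.1 + 0.9 * (d[valid] - lo) / (hi - lo)
    return out

def scale_full_range(depth):
    """Scale the full unsigned short range [0, 65535] to [0.0, 1.0]."""
    return depth.astype(np.float32) / 65535.0
```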

Is there a known algorithm to find groups of adjacent pixels with similar color?

I'd like to know if this is a known algorithm with a name.
I've never done any image processing, but I'm picturing an image as a 2-d matrix of 3-d vectors (ignore transparency).
The only input parameter is a distance. Every pixel is tested against its neighbors; if they are closer than the parameter, they join a group and their values are averaged. As groups grow by gaining new pixels, all pixels in a group take on the group's average value.
For your typical selfie the result might resemble quantizing or posterizing, but unlike quantizing or posterizing, there is no fixed count of output colors. If absolutely no pixels are close enough to their neighbors, the result is a 1:1 mapping of every pixel to its own group.
Is there a name for this?
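For concreteness, here is a minimal sketch of the procedure described above (essentially a flood-fill style grouping with a color-distance threshold; for simplicity it averages each group once at the end rather than updating the average as the group grows, and the 4-connected neighborhood is an assumption):

```python
import numpy as np
from collections import deque

def group_similar_pixels(img, max_dist):
    """Group 4-connected pixels whose color distance is below max_dist,
    then replace each group by its average color."""
    h, w, _ = img.shape
    labels = -np.ones((h, w), dtype=np.int64)
    out = img.astype(np.float32).copy()
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy, sx] != -1:
                continue
            # breadth-first flood fill from the seed pixel
            queue = deque([(sy, sx)])
            labels[sy, sx] = next_label
            members = [(sy, sx)]
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and labels[ny, nx] == -1:
                        if np.linalg.norm(out[ny, nx] - out[y, x]) < max_dist:
                            labels[ny, nx] = next_label
                            members.append((ny, nx))
                            queue.append((ny, nx))
            # every pixel in the group gets the group's average color
            ys, xs = zip(*members)
            out[ys, xs] = out[ys, xs].mean(axis=0)
            next_label += 1
    return out.astype(img.dtype), labels
```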

Find High Frequencies with Discrete Fourier Transform [OpenCV]

I want to determine image sharpness from the amount of high frequencies within the image. As far as I understand, the dft() function from OpenCV returns two matrices containing the real and imaginary parts.
This is where I am stuck. How can I determine the amount of high frequencies from this data?
I am thankful for every hint/link which could provide me with a better understanding.
Greetings
Make the FT.
Calculate the magnitude of the result.
Now you have a 2D matrix. Consider only the upper-left quadrant (the others are mirrors for a real-valued source).
Here the Magn[0][0] entry corresponds to zero frequency, and the Magn[(n-1)/2][(n-1)/2] entry corresponds to the highest frequency.
The upper-left part of this submatrix contains the low-frequency samples, so you can calculate the sum of values in this part and in the remaining part, and compare the two sums. For example (pseudocode):
cvIntegral(Magn, Rect(0..n/4, 0..n/4)) compared with
cvIntegral(Magn, Rect(0..n/2, 0..n/2)) - cvIntegral(Magn, Rect(0..n/4, 0..n/4))
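A runnable sketch of that idea (Python/NumPy instead of the pseudocode above; the quarter-size cut-off between "low" and "high" frequencies is an arbitrary choice):

```python
import numpy as np

def high_frequency_ratio(gray):
    """Ratio of high-frequency to total spectral energy of a grayscale image.

    Larger values suggest a sharper image (more high-frequency content).
    """
    spectrum = np.fft.fft2(gray.astype(np.float32))
    magnitude = np.abs(spectrum)
    h, w = magnitude.shape
    quad = magnitude[:h // 2, :w // 2]       # upper-left quadrant: 0 .. Nyquist
    total = quad.sum()
    low = quad[:h // 4, :w // 4].sum()       # low-frequency block (arbitrary cut-off)
    return (total - low) / (total + 1e-8)
```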

How to choose the number of bins when creating HSV histogram?

I was reading some documentation about HSV histogram, and in several refs the Saturation channel was quantized into 256 values. Why is that? Is there any reason behind choosing this number?
I have the same question for the Hue channel, which is often quantized into 180 values.
Disclaimer: Off-hand answers (i.e., not backed up by any documentation):
"256" is a popular number for a bin size because Programmers Like Round Numbers -- it fits in a single byte. And "180" because the HSB circle is "360 [degrees]", but "360" does not fit into a single byte.
For many image formats, the range of RGB values is limited to 0..255 per channel -- 3 bytes in total. To store the same amount of data (ignoring any artifacts of converting to another color model), Saturation and Brightness are often expressed in single bytes as well. The same could be done for Hue, by scaling the original range of 0..359 (as Hue is usually expressed as a value in degrees on the HSB Color Wheel) into the byte range 0..255. However, probably because it's easier to do calculations with a number close to the original 360° full circle, the range is instead halved to 0..179. That way the value can be stored in a single byte (and thus "HSB" uses as much memory as "RGB") and can be converted trivially back to (close to) its original value -- multiply by 2. Obviously, sticking to the storage space wins over fidelity.
Given 256 values for both S and B, and 180 for H, you end up with a color space of 256*256*180 = 11,796,480 colors. To inspect the distribution of colors, you build a histogram: an array from which you can read off the total number of pixels in a certain color or color range. Using a color range here, instead of individual values, significantly cuts down the memory requirements.
For an RGB color image with the colors fairly evenly distributed, you could shift down each channel by a certain number of bits. This is how a straightforward conversion from 24-bit "true-color" RGB down to the 15-bit RGB "high-color" space works: each channel gets divided by 8, reducing 256 values down to 32 (5 bits per channel). Conversion to a 16-bit high-color RGB space works the same way; the bit that got left over in the 15-bit conversion is assigned to green. Thus, the range of values for green is doubled, which is useful since the human eye is more sensitive to shades of green than to the other two primaries.
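As an aside, a minimal sketch of that bit arithmetic (my own addition), packing 8-bit channels into 15-bit and 16-bit high-color values:

```python
def rgb888_to_rgb555(r, g, b):
    """Pack 8-bit R, G, B into a 15-bit 5-5-5 value (each channel >> 3)."""
    return ((r >> 3) << 10) | ((g >> 3) << 5) | (b >> 3)

def rgb888_to_rgb565(r, g, b):
    """Pack 8-bit R, G, B into a 16-bit 5-6-5 value (the spare bit goes to green)."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)
```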
It gets more complicated when the colors in the input image are not evenly distributed. A naive solution is to create a [256][256][256] array, initialize it to zero, count the image's pixels into it, and finally sort the counts. There are better alternatives -- let me consult my old Computer Graphics [1] here. Hold on.
13.4 Reproducing Color mentions the names of two different approaches from Heckbert (Color Image Quantization for Frame Buffer Display, SIGGRAPH 82): the popularity and the median-cut algorithms. (Unfortunately, that's all they say about this topic. I assume efficient code for both can be googled for.)
A rough guess:
The size for each bin (H, S, B) should reflect what you are trying to use the histogram for. This older SO question, for example, uses a large number of bins for hue -- color is considered the most important -- and only 3 different values for both saturation and brightness. Thus, bright images with some subdued areas (say, a comic book) will give a good spread in this histogram, but a real-color photograph will not so much.
The main constraint is that the bin counts, multiplied together, should use a reasonably small amount of memory, yet cover enough of each component to get evenly filled. Perhaps some trial and error comes into play here. You could initially distribute the H, S, and B components evenly over the available memory in your histogram and process a small part of the image, say 1 out of 4 pixels horizontally and vertically; if you notice one of the component bins fills up too fast while others stay untouched, adjust the ranges and restart.
If you need to do an analysis of multiple pictures, make sure they are all alike in their color gamut. You cannot expect a single reasonable bin size to work on all sorts of images; you would end up with an even distribution, where all matches are only so-so.
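A minimal sketch of building such an HSV histogram with OpenCV (the 30/3/3 bin split is only an example in the spirit of the answer above, not a recommendation; note that OpenCV stores H in [0, 180) and S, V in [0, 256)):

```python
import cv2

def hsv_histogram(bgr_image, h_bins=30, s_bins=3, v_bins=3):
    """3D HSV histogram, L2-normalized and flattened to a 1D feature vector."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1, 2], None,
                        [h_bins, s_bins, v_bins],
                        [0, 180, 0, 256, 0, 256])
    return cv2.normalize(hist, hist).flatten()
```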
[1] Computer Graphics: Principles and Practice (1997), J.D. Foley, A. van Dam, S.K. Feiner, and J.F. Hughes, 2nd ed., Reading, MA: Addison-Wesley.

Comparison metric for two open contours

I'm validating an image segmentation algorithm applied to 2D images. The algorithm generates a contour segment, i.e. a set of connected pixels that form a free curve in 2D space. The idea is to compare this set of pixels with a ground truth, in my case another contour segment manually traced by an expert. An image showing what a segmentation result and the corresponding manual (ground-truth) segmentation would look like is shown below:
I'm trying to think of an adequate comparison metric to validate the segmentation results. Ideally the best metric would be the point-to-point Euclidean distance between corresponding pairs of pixels on each segment; however (as seen in the previous figure) the segments don't have the same length (i.e. they differ in the total number of pixels), so pixel-to-pixel comparisons have to be discarded.
Can you suggest me an adequate metric for validating my algorithm? Thanks for any suggestion!
For each pixel in the ground truth, take the distance to the nearest pixel in the segmentation result. Then take the sum of that for all ground truth pixels as the total error.
That's basically recall weighted by distance. If you start with the pixels in the result, it would resemble precision instead.
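A minimal sketch of that metric (Python/SciPy, my own addition; both contours are assumed to be given as N×2 arrays of pixel coordinates):

```python
import numpy as np
from scipy.spatial import cKDTree

def contour_error(ground_truth, result):
    """Sum over ground-truth pixels of the distance to the nearest result pixel
    (a distance-weighted 'recall'); swap the arguments for the 'precision' flavor."""
    tree = cKDTree(result)                  # result: (M, 2) array of pixel coordinates
    dists, _ = tree.query(ground_truth)     # nearest-neighbor distance per GT pixel
    return float(dists.sum())
```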
If the curves are closed, you can compute the area between them. If you can tell which pixels belong to each segment, that is as easy as computing the XOR of the two pixel sets.
Here is an example of that, created using Matlab:
You could divide each line into n segments of equal length, then compute the Euclidean distance between each segment and its counterpart on the other line (a sketch of this follows below).
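A sketch of that resampling approach (Python/NumPy, my own addition; both contours are assumed to be ordered (N, 2) point arrays, and the number of samples n is an arbitrary choice):

```python
import numpy as np

def resample_by_arclength(points, n):
    """Resample an ordered polyline to n points spaced evenly along its length."""
    points = np.asarray(points, dtype=np.float64)
    seg_lengths = np.linalg.norm(np.diff(points, axis=0), axis=1)
    cum = np.concatenate([[0.0], np.cumsum(seg_lengths)])
    targets = np.linspace(0.0, cum[-1], n)
    x = np.interp(targets, cum, points[:, 0])
    y = np.interp(targets, cum, points[:, 1])
    return np.column_stack([x, y])

def mean_pointwise_distance(curve_a, curve_b, n=100):
    """Mean Euclidean distance between matched resampled points of two curves."""
    a = resample_by_arclength(curve_a, n)
    b = resample_by_arclength(curve_b, n)
    return float(np.linalg.norm(a - b, axis=1).mean())
```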
