Window width and center calculation of DICOM image - image-processing

What is "rescale intercept" and "rescale slope" in DICOM image (CT)?
How to calculate window width and window center with that?

The rescale intercept and slope are applied to transform the pixel values of the image into values that are meaningful to the application.
For instance, the original pixel values could store a device-specific value that has a meaning only for the device that generated it: applying the rescale slope/intercept to a pixel value converts it into optical density or another known measurement unit (e.g. Hounsfield units).
When the transformation is not linear, a LUT (lookup table) is applied instead.
After the modality transform has been applied (rescale slope/intercept or LUT), the window width/center specify which pixels should be visible: all the pixels outside the values specified by the window are displayed as black or white.
For instance, if the window center is 100 and the window width is 20 then all the pixels with a value smaller than 90 are displayed as black and all the pixels with a value bigger than 110 are displayed as white.
This allows displaying only portions of the image (for instance just the bones or just the soft tissues).
Hounsfield scale: http://en.wikipedia.org/wiki/Hounsfield_scale
How to apply the rescale slope/intercept:
final_value = original_value * rescale_slope + rescale_intercept
How to calculate the pixels to display using the window center/width:
lowest_visible_value = window_center - window_width / 2
highest_visible_value = window_center + window_width / 2
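For concreteness, here is a minimal NumPy sketch of both steps (the function and the example values are illustrative, not part of the original answer; in practice the slope, intercept, center and width come from the RescaleSlope, RescaleIntercept, WindowCenter and WindowWidth tags, e.g. read with pydicom):

import numpy as np

def apply_window(raw_pixels, slope, intercept, center, width):
    # Modality transform: convert stored values into meaningful units (e.g. Hounsfield)
    values = raw_pixels * slope + intercept
    # Window: values below center - width/2 become black, values above center + width/2
    # become white, with a linear ramp in between
    low = center - width / 2.0
    high = center + width / 2.0
    display = np.clip((values - low) / (high - low), 0.0, 1.0)
    return (display * 255).astype(np.uint8)

# Illustrative values: slope 1, intercept -1024 (common for CT), a wide window
raw = np.array([[0, 1024], [1324, 3024]], dtype=np.uint16)
print(apply_window(raw, slope=1.0, intercept=-1024.0, center=400, width=1800))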

Rescale intercept and slope are a simple linear transform applied to the raw pixel data before applying the window width/center. The basic formula is:
NewValue = (RawPixelValue * RescaleSlope) + RescaleIntercept

Problem in understanding how colors are applied to each pixel

I'm new to action recognition and anything related to image processing. I'm studying a paper about image processing. It is about action recognition based on human pose estimation. Here is a summary of how it works:
We first run a state-of-the-art human pose estimator [4] in every
frame and obtain heatmaps for every human joint. These heatmaps encode
the probabilities of each pixel to contain a particular joint. We
colorize these heatmaps using a color that depends on the relative
time of the frame in the video clip. For each joint, we sum the
colorized heatmaps over all frames to obtain the PoTion representation
for the entire video clip.
So for each joint j in frame t, it extracts a heatmap H^t_j[x, y] that is the likelihood of pixel (x, y) containing joint j at frame t. The resolution of this heatmap is denoted by W*H.
My first question: what exactly is a heatmap? I want to be sure whether a heatmap is a probability matrix in which, for example, the element at (1,1) is the probability that pixel (1,1) contains the joint.
In the next step this heatmap is colorized with C channels, where C is the number of channels used to visualize each pixel. The idea is to use the same color for all the joint heatmaps of a frame.
We start by presenting the proposed colorization scheme for 2 channels
(C = 2). For visualization we can for example use red and green colors
for channel 1 and 2. The main idea is to colorize the first frame in
red, the last one in green, and the middle one with equal proportion
(50%) of green and red. The exact proportion of red and green is a
linear function of the relative time t, i.e., (t−1)/(T−1), see Figure 2
(left). For C = 2, we have o(t) = ((t−1)/(T−1), 1 − (t−1)/(T−1)). The
colorized heatmap of joint j for a pixel (x, y) and a channel c at
time t is given by:
And here is figure 2 which is mentioned in the context:
My problem is that I cannot figure out whether this equation, o(t) = ((t−1)/(T−1), 1 − (t−1)/(T−1)), represents the degree of one color (i.e. red) in a frame or the proportion of both colors. If it is used for each color channel separately, what does o_red(t) = (1/6, 5/6) mean when the number of frames (T) is equal to 7?
Or, if it is used for both channels, since the article says that the first frame is colored red and the last frame green, how can we interpret o(1) = (0, 1) if the first element indicates the proportion of red and the second one the proportion of green? As far as I can understand, that would mean the first frame is colored green, not red!
In this concept there is a subtle relationship between time and pixel positions.
As far as I know, this kind of heatmap is a way of encoding time into a single image. The purpose is to show the movement of an object captured in a video within one image: every pixel that belongs to the fixed (unmoving) parts of the scene (like the background) ends up as zero (black). In contrast, if the moving object passes through a pixel position in the video, the corresponding pixel in the image is colored, and its color depends on the index (time) of the frame in which the object was seen at that pixel.
For example, suppose we have a completely black curtain in front of the camera and we are filming. We get a 1-second video made of 10 frames. At the first moment (frame 1) a tiny white ball comes into the scene and is captured at pixel (1,1). At frame 2 the ball is captured at pixel (1,2), and so on. When we stop filming at frame 10, the ball is seen at pixel (1,10). Now we have 10 frames, each with a white pixel at a different position, and we want to show the whole process in a single image: 10 pixels of that image will be colored (pixels (1,1), (1,2), (1,3), ..., (1,10)) and the rest are black.
With the formula you mentioned, the color of each pixel is computed from the number of the frame in which the ball was captured at that pixel:
T=10 # 10 frames
pixel (1,1) got the white ball at frame 1, so its color would be (0/9, 1 − 0/9) = (0, 1), which means the green channel is 0 and the red channel is 1 at that pixel, so it looks completely red.
pixel (1,2) got the white ball at frame 2, so its color would be (1/9, 8/9), and this pixel is more red than green.
... # and so on for the other 7 pixels
pixel (1,10) got the white ball at frame 10, so its color would be (1, 0), and this pixel is completely green.
Now, if you look at the image, you see a colored line 10 pixels long that is red at the beginning and gradually changes to green towards the end (the 10th pixel), which means the ball moved from pixel 1 to pixel 10 during that 1-second video.
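To make the arithmetic concrete, here is a small NumPy sketch of the 2-channel scheme (the heatmaps are random stand-ins, and mapping o's first component to green and its second to red follows the interpretation above, not the paper's own notation):

import numpy as np

T = 10                              # number of frames
H = W = 32                          # hypothetical heatmap resolution
heatmaps = np.random.rand(T, H, W)  # stand-in for the per-frame joint heatmaps H^t_j

potion = np.zeros((H, W, 2))        # 2-channel PoTion representation for one joint
for t in range(1, T + 1):
    # o(t) = ((t-1)/(T-1), 1 - (t-1)/(T-1))
    o = np.array([(t - 1) / (T - 1), 1 - (t - 1) / (T - 1)])
    # colorized heatmap for frame t: weight the heatmap by each channel, then accumulate
    potion += heatmaps[t - 1][:, :, None] * o[None, None, :]

# o[0] grows from 0 to 1 over time (drawn as green above) and o[1] shrinks from 1 to 0
# (drawn as red), so the first frame contributes pure red and the last pure green.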
(If I were unclear at any point of the explanation, please comment and I will elaborate)

How to get back the co-ordinate points corresponding to the intensity points obtained from a faster r-cnn object detection process?

As a result of the Faster R-CNN method of object detection, I have obtained a set of boxes of intensity values (each bounding box can be thought of as a 3D matrix with a depth of 3 for RGB intensity, a width and a height, which can then be converted into a 2D matrix by taking the grayscale) corresponding to the region containing the object. What I want to do is obtain the corresponding coordinate points in the original image for each intensity cell inside the bounding box. Any ideas how to do so?
From what I understand, you got an R-CNN model that outputs cropped pieces of the input image and you now want to trace those output crops back to their coordinates in the original image.
What you can do is simply use a patch-similarity-measure to find the original position.
Since the output crop should look exactly like the corresponding region in the original image, you can just use a pixel-based distance.
Find the place in the image with the smallest distance (should be zero) and from that you can find your desired coordinates.
In python:
import numpy as np

# org_image: the original grayscale image, crop: the output crop (both 2D arrays)
d_min = np.inf
crop_h, crop_w = crop.shape
coord = None
for x in range(org_image.shape[0] - crop_h + 1):
    for y in range(org_image.shape[1] - crop_w + 1):
        # sum of absolute differences between the crop and the image patch at (x, y)
        patch = org_image[x:x + crop_h, y:y + crop_w].astype(np.int32)
        d = np.sum(np.abs(patch - crop))
        if d < d_min:
            d_min = d
            coord = [x, y]
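As a side note (not part of the original answer), if OpenCV is available, the same exhaustive search can be done much faster with template matching:

import cv2
import numpy as np

# TM_SQDIFF is 0 at a perfect match; minMaxLoc finds the location of that minimum
res = cv2.matchTemplate(org_image.astype(np.float32), crop.astype(np.float32), cv2.TM_SQDIFF)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
coord = [min_loc[1], min_loc[0]]   # minMaxLoc returns (x, y) = (column, row)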
However, your model should already have that information available (after all, it crops the output based on some coordinates). It would help if you added some information about your implementation.

Determining pixel coordinates across display resolutions

If a program displays a pixel at X,Y on a display with resolution A, can I precisely predict at what coordinates the same pixel will display at resolution B?
MORE INFORMATION
The 2 display resolutions are:
A-->1366 x 768
B-->1600 x 900
Dividing the max resolutions in each direction yields:
X-direction scaling factor = 1600/1366 = 1.171303075
Y-direction scaling factor = 900/768 = 1.171875
Say for example that the only red pixel on display A occurs at pixel (1,1). If I merely scale up using these factors, then on display B, that red pixel will be displayed at pixel (1.171303075, 1.171875). I'm not sure how to interpret that, as I'm used to thinking of pixels as integer values. It might help if I knew the exact geometry of pixel coordinates/placement on a screen. e.g., do pixel coordinates (1,1) mean that the center of the pixel is at (1,1)? Or a particular corner of the pixel is at (1,1)? I'm sure diagrams would assist in visualizing this--if anyone can post a link to helpful resources, I'd appreciate it. And finally, I may be approaching this all wrong.
Thanks in advance.
I think your problem is related to the field of scaling/resampling images. Bitmap, or raster, images are digital photographs; they are the most common form for representing natural images that are rich in detail. The term bitmap refers to how a given pattern (bits in a pixel) maps to a specific color. A bitmap image takes the form of an array, where the value of each element, called a pixel (picture element), corresponds to the color of that region of the image.
Sampling
When measuring the value for a pixel, one takes the average color of an area around the location of the pixel. A simplistic model is sampling a square; a more accurate measurement is to calculate a weighted Gaussian average. When perceiving a bitmap image, the human eye blends the pixel values together, recreating the illusion of the continuous image it represents.
Raster dimensions
The number of horizontal and vertical samples in the pixel grid is called the raster dimensions; it is specified as width x height.
Resolution
Resolution is a measurement of sampling density; the resolution of a bitmap image gives the relationship between pixel dimensions and physical dimensions. The most commonly used measurement is ppi, pixels per inch.
Scaling / Resampling
Image scaling is the process of creating an image with different dimensions from the one we have. Another name for scaling is resampling. When resampling, algorithms try to reconstruct the original continuous image and create a new sample grid from it. There are two kinds of scaling: up and down.
Scaling image down
The process of reducing the raster dimensions is called decimation; this can be done by averaging the values of the source pixels contributing to each output pixel.
Scaling image up
When we increase the image size, we actually want to create sample points between the original sample points of the raster. This is done by interpolating the values in the sample grid, effectively guessing the values of the unknown pixels. The interpolation can be nearest-neighbor, bilinear, bicubic, etc. But the scaled-up/down image must also be represented over a discrete grid.
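A minimal sketch of one common convention (not from the answer above): treat integer pixel coordinates as pixel centers, scale them by the per-axis factors from the question, and snap back onto the target display's discrete grid by rounding to the nearest integer. Real display or OS scaling may use a different convention.

def map_pixel(x, y, src=(1366, 768), dst=(1600, 900)):
    # Per-axis scale factors, as computed in the question
    sx = dst[0] / src[0]   # 1600/1366 ~= 1.1713
    sy = dst[1] / src[1]   # 900/768   =  1.171875
    # Scale, then round to the nearest pixel of the target grid (nearest-neighbor)
    return round(x * sx), round(y * sy)

print(map_pixel(1, 1))      # -> (1, 1): the red pixel from the example stays at (1, 1)
print(map_pixel(683, 384))  # -> (800, 450): the centre of display A maps to the centre of B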

Display 3D matrix with its thickness information

I have a problem plotting a 3D matrix. Assume I have one image of size 384x384. In a loop, I create about 10 images of the same size, store them into a 3D matrix, and plot the 3D matrix on each iteration. The slice thickness is 0.69 (the distance between two slices), and I want to represent that thickness on the z coordinate. But it does not work well: the slice distance visualization is not correct, and the plot appears blue. I want to adjust the visualization and remove the color. Could you help me fix it with MATLAB code? Thank you so much.
for slice = 1 : 10
    Img = getImage(); % get one 2D image.
    if slice == 1
        image3D = Img;
    else
        image3D = cat(3, image3D, Img);
    end
    % Plot image
    figure(1)
    [x,y,z] = meshgrid(1:384,1:384,1:slice);
    scatter3(x(:),y(:),z(:).*0.69,90,image3D(:),'filled')
end
The blue color can be fixed by changing the colormap. Right now you are setting the color of each plotted point to the value in image3D with the default colormap of jet, which shows lower values as blue. Try adding colormap gray; after you plot, or whichever colormap you desire.
I'm not sure what you mean by "The problem is that slice distance visualization is not correct". If each slice has a thickness of 0.69, then the image values are an integral of all the values within each voxel of thickness 0.69. So what you are displaying is a point at the centroid of each voxel that represents the integral of the values within that voxel. Your z scale seems correct, as the voxel centroids will be spaced 0.69 apart, although it won't start at zero.
I think a more accurate z-scale would be to use ((0:slice-1)+0.5)*0.69 as your z vector. This would put the edge of the lowest slice at zero and center each point directly on the centroid of its voxel.
I still don't think this will give you the visualization you are looking for. 3D data is most easily viewed by looking at slices of it. You can check out MATLAB's slice, which lets you make nice displays like this one:
slice view http://people.rit.edu/pnveme/pigf/ThreeDGraphics/thrd_threev_slice_1.gif

Determining the average distance of pixels (to the centre of an image) in OpenCV

I'm trying to figure out how to do the following calculation in OpenCV.
Assuming a binary image (black/white):
Average distance of white pixels from the centre of the image. An image with most of its white pixels near the edges will have a high score, whereas an image with most white pixels near the centre will have a low score.
I know how to do this manually with loops, but since I'm working in Java I'd rather offload it to a set of high-performance OpenCV calls which are native.
Thanks
distanceTransform() is almost what you want. Unfortunately, it only calculates distance to the nearest black pixel, which means the data must be massaged a little bit. The image needs to contain only a single black pixel at the center for distanceTransform() to work properly.
My method is as follows:
Set all black pixels to an intermediate value
Set the center pixel to black
Call distanceTransform() on the modified image
Calculate the mean distance via mean(), using the white pixels in the binary image as a mask
Example code is below. It's in C++, but you should be able to get the idea:
cv::Mat img; // binary image
img.setTo(128, img == 0);
img.at<uchar>(img.rows/2, img.cols/2) = 0; // Set center point to zero
cv::Mat dist;
cv::distanceTransform(img, dist, CV_DIST_L2, 3); // Can be tweaked for desired accuracy
cv::Scalar val = cv::mean(dist, img == 255);
double mean = val[0];
With that said, I recommend you test whether this method is actually any faster than iterating in a loop. This method does a fair bit more processing than necessary to accommodate the API call.
