Bitmap image manipulation in F#

I want to replace GetPixel and SetPixel with the LockBits method, and I came across this F# code for reading pixels:
open System.Drawing
open System.Drawing.Imaging
let pixels (image:Bitmap) =
    let Width = image.Width
    let Height = image.Height
    let rect = new Rectangle(0, 0, Width, Height)
    // Lock the image for access
    let data = image.LockBits(rect, ImageLockMode.ReadOnly, image.PixelFormat)
    // Copy the data
    let ptr = data.Scan0
    let stride = data.Stride
    let bytes = stride * data.Height
    let values : byte[] = Array.zeroCreate bytes
    System.Runtime.InteropServices.Marshal.Copy(ptr, values, 0, bytes)
    // Unlock the image
    image.UnlockBits(data)
    let pixelSize = 4 // <-- calculate this from the PixelFormat
    // Create and return a 3D array with the copied data
    Array3D.init 3 Width Height (fun i x y ->
        values.[stride * y + x * pixelSize + i])
At the end of the code, it returns a 3D array with the copied data.
So the 3D array is a copy of the image. How do I edit the pixels of the 3D array, for example to change a colour? What is pixelSize for? Why store an image in a 3D byte array rather than a 2D one?
For example, if we want to use a 2D array instead, and I want to change the colours of specified pixels, how do we go about doing that?
Do we operate on the copied byte array OUTSIDE the pixels function, or INSIDE the pixels function before unlocking the image?
If we no longer use GetPixel or SetPixel, how do I retrieve the colour of a pixel from the copied byte[]?
If my questions are unclear: please explain how to use the code above to perform an operation such as adding 50 to the R, G and B of every pixel of a given image, without GetPixel or SetPixel.

The first dimension of the 3D array selects the colour component. So, for the common 32bpp BGRA byte layout, index [0,78,218] is the value of the blue component of the pixel at (78,218).
Like this:
Array2D.init Width Height (fun x y ->
    let color i = values.[stride * y + x * pixelSize + i] |> int
    // GDI+ 32bpp data is laid out B,G,R,A in memory
    Color.FromArgb(color 2, color 1, color 0))
Since the image is copied, it doesn't make a difference whether you modify the array before or after unlocking the image. The locking is there to make sure nobody changes the image while you do the actual copying.
The values array is a 2D pixel grid flattened into a 1D array. The pixel at (x, y) starts at stride * y + x * pixelSize, and its colour components then occupy one byte each. That is why this expression finds the i-th colour component at (x, y):
values.[stride * y + x * pixelSize + i] |> int
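As a language-neutral illustration, here is the same index arithmetic in Python (a hypothetical 4-byte-per-pixel buffer whose rows are padded to a 16-byte stride; the helper name is made up for this sketch):

```python
def component_offset(x, y, i, stride, pixel_size=4):
    """Flat-buffer offset of the i-th colour component of pixel (x, y)."""
    return stride * y + x * pixel_size + i

# A hypothetical image 3 pixels wide, padded to a 16-byte stride:
print(component_offset(0, 0, 0, stride=16))  # 0 -- first byte of the buffer
print(component_offset(2, 1, 1, stride=16))  # 25 -- row 1 starts at 16, pixel 2 adds 8, component 1 adds 1
```

The stride matters: it can be larger than width * pixelSize because rows may be padded, which is exactly why the code reads stride from BitmapData rather than computing it from the width.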
To add 50 to every pixel, it's easier to use the original 3D array. Suppose you have an image myImage:
pixels myImage |> Array3D.map ((+) 50uy) // 50uy: the array elements are bytes
The type of this is a 3D byte array (byte[,,]), not Image. If you need an Image, you'll have to construct one, somehow, from the array you now have.
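One caveat: byte addition wraps on overflow (in F#, 255uy + 50uy is 49uy), so a plain "add 50" will turn bright pixels dark. A clamped version, sketched here in Python over a flat byte buffer (the helper is illustrative, not part of the original code):

```python
def add_clamped(buf, delta=50):
    # Clamp each colour byte at 255 instead of letting it wrap around.
    return bytes(min(b + delta, 255) for b in buf)

print(list(add_clamped(bytes([0, 100, 230, 255]))))  # [50, 150, 255, 255]
```

The same clamping idea applies in F# with `min (b + 50) 255` on ints before converting back to byte.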

Related

How do we do rectilinear image conversion with Swift and iOS 11+?

How do we use the function Apple provides (below) to perform rectilinear conversion?
Apple provides a reference implementation in 'AVCameraCalibrationData.h' showing how to correct images for lens distortion, i.e. going from images taken with a wide-angle or telephoto lens to the rectilinear 'real world' image. A pictorial representation is here:
To create a rectilinear image, we must begin with an empty destination buffer and iterate through it row by row, calling the sample implementation below for each point in the output image, passing the lensDistortionLookupTable to find the corresponding point in the distorted image, and then writing that value to the output buffer.
func lensDistortionPoint(for point: CGPoint, lookupTable: Data, distortionOpticalCenter opticalCenter: CGPoint, imageSize: CGSize) -> CGPoint {
    // The lookup table holds the relative radial magnification for n linearly spaced radii.
    // The first position corresponds to radius = 0.
    // The last position corresponds to the largest radius found in the image.

    // Determine the maximum radius.
    let delta_ocx_max = Float(max(opticalCenter.x, imageSize.width - opticalCenter.x))
    let delta_ocy_max = Float(max(opticalCenter.y, imageSize.height - opticalCenter.y))
    let r_max = sqrt(delta_ocx_max * delta_ocx_max + delta_ocy_max * delta_ocy_max)

    // Determine the vector from the optical center to the given point.
    let v_point_x = Float(point.x - opticalCenter.x)
    let v_point_y = Float(point.y - opticalCenter.y)

    // Determine the radius of the given point.
    let r_point = sqrt(v_point_x * v_point_x + v_point_y * v_point_y)

    // Look up the relative radial magnification to apply in the provided lookup table.
    let magnification: Float = lookupTable.withUnsafeBytes { (lookupTableValues: UnsafePointer<Float>) in
        let lookupTableCount = lookupTable.count / MemoryLayout<Float>.size
        if r_point < r_max {
            // Linear interpolation
            let val = r_point * Float(lookupTableCount - 1) / r_max
            let idx = Int(val)
            let frac = val - Float(idx)
            let mag_1 = lookupTableValues[idx]
            let mag_2 = lookupTableValues[idx + 1]
            return (1.0 - frac) * mag_1 + frac * mag_2
        } else {
            return lookupTableValues[lookupTableCount - 1]
        }
    }

    // Apply radial magnification
    let new_v_point_x = v_point_x + magnification * v_point_x
    let new_v_point_y = v_point_y + magnification * v_point_y

    // Construct output
    return CGPoint(x: opticalCenter.x + CGFloat(new_v_point_x), y: opticalCenter.y + CGFloat(new_v_point_y))
}
Additionally, Apple states: the "point", "opticalCenter", and "imageSize" parameters must be in the same coordinate system.
With that in mind, what values do we pass for opticalCenter and imageSize, and why? What exactly is the "apply radial magnification" step doing?
The opticalCenter parameter is actually named distortionOpticalCenter, so you can pass lensDistortionCenter from AVCameraCalibrationData.
imageSize is the height and width of the image you want to make rectilinear.
"Applying radial magnification" moves the given point to the coordinates where it would land with an ideal, distortion-free lens.
"How do we use the function...": create an empty buffer with the same dimensions as the distorted image. For each pixel of the empty buffer, apply the lensDistortionPoint function, then copy the pixel at the corrected coordinates from the distorted image into the buffer. Once the whole buffer is filled, you have an undistorted image.
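That fill loop can be sketched as follows, in Python rather than Swift so it runs standalone (the lambda standing in for lensDistortionPoint and all names here are illustrative, not Apple API):

```python
def remap(distorted, width, height, lens_distortion_point):
    """Build a rectilinear image by pulling, for each destination pixel,
    the corresponding source pixel out of the distorted image."""
    out = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            sx, sy = lens_distortion_point(x, y)          # where this pixel sits in the source
            sx = min(max(int(round(sx)), 0), width - 1)   # clamp to the source bounds
            sy = min(max(int(round(sy)), 0), height - 1)
            out[y][x] = distorted[sy][sx]
    return out

# With an identity mapping, the output equals the input.
img = [[1, 2], [3, 4]]
print(remap(img, 2, 2, lambda x, y: (x, y)))  # [[1, 2], [3, 4]]
```

Note the pull direction: you iterate over the *destination* and sample from the *source*, which guarantees every output pixel gets a value (a push loop would leave holes).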

Why does CGImageGetBytesPerRow() return a weird value on some images?

I got an image from a bigger image by
let partialCGImage = CGImageCreateWithImageInRect(CGImage, frame)
but sometimes I get wrong RGBA values. For example, I calculated the average red value of an image, and it came out looking like a gray image.
So I checked the info as follow.
image width: 64
image height: 64
image has 5120 bytes per row
image has 8 bits per component
image color space: <CGColorSpace 0x15d68fbd0> (kCGColorSpaceICCBased; kCGColorSpaceModelRGB; sRGB IEC61966-2.1)
image is mask: false
image bitmap info: CGBitmapInfo(rawValue: 8194)
image has 32 bits per pixel
image utt type: nil
image should interpolate: true
image rendering intent: CGColorRenderingIntent
Bitmap Info: ------
Alpha info mask: True
Float components: False
Byte order mask: True
Byte order default: False
Byte order 16 little: False
Byte order 32 little: True
Byte order 16 big: True
Byte order 32 big: False
Image Info ended ---------
Here's the really weird problem: the width and height are both 64 px, and the image has 8 bits (1 byte) per component, i.e. 4 bytes per pixel, so why is bytes-per-row 5120?
I also noticed that the bitmap info of a normal image is quite different: it doesn't have any byte-order information.
I googled the difference between little endian and big endian, but I got confused when they showed up together.
I really need help, since my project is already delayed by 2 days because of this. Thanks!
By the way, I used the following code to get the RGBA values.
let pixelData = CGDataProviderCopyData(CGImageGetDataProvider(self.CGImage))
let data: UnsafePointer<UInt8> = CFDataGetBytePtr(pixelData)
var rs: [[Int]] = []
var gs: [[Int]] = []
var bs: [[Int]] = []
let widthMax = imageWidth
let heightMax = imageHeight
for indexX in 0...widthMax {
    var tempR: [Int] = []
    var tempG: [Int] = []
    var tempB: [Int] = []
    for indexY in 0...heightMax {
        let offSet = 4 * (indexX * imageWidth + indexY)
        let r = Int(data[pixelInfo + offSet])
        let g = Int(data[pixelInfo + 1 + offSet])
        let b = Int(data[pixelInfo + 2 + offSet])
        tempR.append(r)
        tempG.append(g)
        tempB.append(b)
    }
    rs.append(tempR)
    gs.append(tempG)
    bs.append(tempB)
}
Ask me if you have problems with my code. Thank you for your help.
The bytes-per-row is 5120 because you used CGImageCreateWithImageInRect on a larger image. From the CGImage reference manual:
The resulting image retains a reference to the original image, which means you may release the original image after calling this function.
The new image uses the same pixel storage as the old (larger) image. That's why the new image retains the old image, and why they have the same bytes-per-row.
As for why you're not getting the red values you expect: Rob's answer has some useful information, but if you want to explore deeper, consider that your bitmap info is 8194 = 0x2002.
print(CGBitmapInfo.ByteOrder32Little.rawValue | CGImageAlphaInfo.PremultipliedFirst.rawValue)
# Output:
8194
These bits determine the byte order of your bitmap. But those names aren't all that helpful. Let's figure out exactly what byte order we get for those bits:
let context = CGBitmapContextCreate(nil, 1, 1, 8, 4, CGColorSpaceCreateDeviceRGB(), CGBitmapInfo.ByteOrder32Little.rawValue | CGImageAlphaInfo.PremultipliedFirst.rawValue)!
UIGraphicsPushContext(context)
let d: CGFloat = 255
UIColor(red: 1/d, green: 2/d, blue: 3/d, alpha: 1).setFill()
UIRectFill(.infinite)
UIGraphicsPopContext()
let data = UnsafePointer<UInt8>(CGBitmapContextGetData(context))
for i in 0 ..< 4 {
    print("\(i): \(data[i])")
}
# Output:
0: 3
1: 2
2: 1
3: 255
So we can see that a bitmap info of 8194 means that the byte order is BGRA. Your code assumes it's RGBA.
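To make the BGRA layout concrete, here is a small Python sketch of reading one pixel from such a buffer (assuming, for this sketch, a tightly packed buffer; the helper name is made up):

```python
def pixel_bgra(data, x, y, bytes_per_row):
    # Each pixel is 4 bytes, stored in memory as B, G, R, A.
    base = y * bytes_per_row + x * 4
    b, g, r, a = data[base:base + 4]
    return r, g, b, a

# One pixel holding B=3, G=2, R=1, A=255:
print(pixel_bgra(bytes([3, 2, 1, 255]), 0, 0, 4))  # (1, 2, 3, 255)
```

Reading the first byte as red, the way the question's code does, would report 3 here instead of 1.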
In addition to the question about pixelInfo raised by Segmentation, the calculation of offSet seems wrong:
let offSet = 4 * (indexX * imageWidth + indexY)
The x and y values are backwards. Also, you cannot assume that bytes-per-row always equals 4 times the width in pixels, because some image formats pad each row. Theoretically, it should be:
let offSet = indexY * bytesPerRow + indexX * bytesPerPixel
Also note that in addition to the x/y flip issue, you don't want 0 ... widthMax and 0 ... heightMax (as those will return widthMax + 1 and heightMax + 1 data points). Instead, you want to use 0 ..< widthMax and 0 ..< heightMax.
Also if you're dealing with random image files, there are other deeper problems here. For example, you can't make assumptions regarding RGBA vs ARGB vs CMYK, big endian vs little endian, etc., captured in the bitmap info field.
Rather than writing code that handles all of these pixel-buffer variations, Apple suggests an alternative: render the arbitrarily configured image into a context with a known, consistent configuration, after which you can navigate the buffer easily. See Technical Q&A 1509.
First of all, you haven't initialized the pixelInfo variable. Second, you aren't doing anything with the A value, so everything is shifted over by a byte. Also, I don't think you need both pixelInfo and offSet; the two variables serve the same purpose, so keep just the one you compute as offSet.

Getting Pixel value in the image

I am calculating the RGB values of pixels in my captured photo. I have this code
func getPixelColorAtLocation(context: CGContext, point: CGPoint) -> Color {
    self.context = createARGBBitmapContext(imgView.image!)
    let data = CGBitmapContextGetData(context)
    let dataType = UnsafePointer<UInt8>(data)
    let offset = 4 * ((Int(imageHeight) * Int(point.x)) + Int(point.y))
    var color = Color()
    color.blue = dataType[offset]
    color.green = dataType[offset + 1]
    color.red = dataType[offset + 2]
    color.alpha = dataType[offset + 3]
    color.point.x = point.x
    color.point.y = point.y
    return color
}
But I am not sure what this line in the code means:
let offset = 4 * ((Int(imageHeight) * Int(point.x)) + Int(point.y))
Any help??
Thanks in advance
An image is a set of pixels. To get the pixel at point (x, y), you need to calculate its offset into that set.
If you use dataType[0], there is no offset, because it points exactly where the pointer does. If you used dataType[10], that would mean taking the 10th element from the start of the buffer.
Because the colour model is RGBA, each pixel occupies 4 bytes, so you multiply by 4; the offset along x is simply x, and the offset along y is the image width multiplied by y (to skip the rows above):
offset = 4 * (x + width * y)
// offset, offset + 1, offset + 2, offset + 3 <- the four component bytes you need
Imagine a long flat array with all the values in it.
It becomes clear once you picture a two-dimensional array implemented as a one-dimensional one. I hope that helps.
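For example, a quick Python check of that arithmetic on a hypothetical 10-pixel-wide RGBA image with no row padding:

```python
def pixel_offset(x, y, width, bytes_per_pixel=4):
    # Skip y full rows, then x pixels within the row.
    return bytes_per_pixel * (x + width * y)

print(pixel_offset(2, 1, 10))  # 48: row 1 starts at byte 40, pixel 2 adds 8
```

Note this assumes tightly packed rows; a real bitmap context may pad rows, in which case the row term is y * bytesPerRow instead of y * width * 4.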

What is the structure of Point2f in openCV?

I am confused about what Point2f returns. I have vector<Point2f> corner;, so what are the row and column coordinates? Would they be the following:
int row_coordinate = corner[i].x;
int col_coordinate = corner[i].y;
But I get a segmentation fault if I follow the above convention. And if I do it like this:
int row_coordinate = corner[i].y;
int col_coordinate = corner[i].x;
then I get the right results, but that seems to be the opposite of the OpenCV documentation. Kindly tell me which one is correct. It would be very nice if you could provide a documentation link (I have already searched a lot).
I assume you're confused by OpenCV's coordinate system. Since I always use x as width and y as height, in my programs I use OpenCV like this:
// make an image with height 100 and width 200
cv::Mat img = cv::Mat::zeros(100, 200, CV_8UC1);
int width = img.cols;
int height = img.rows;
cv::Point2f pt(10, 20);
// How do I get a pixel at x = 10 and y = 20 ?
int px = img.at<uchar>(pt.y, pt.x); // yep, it's inverted
What does this mean? OpenCV's Mat indexing is row-first: rows, then columns. If you want the pixel at (x, y), access it with (y, x).
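The same row-first convention can be demonstrated with plain nested lists in Python; any array-of-rows layout behaves the same way:

```python
# A 2-row by 3-column "image": the outer index is the row (y),
# the inner index is the column (x).
img = [[10, 20, 30],
       [40, 50, 60]]

x, y = 2, 1          # x = column, y = row
print(img[y][x])     # 60 -- access with (row, column), i.e. (y, x)
```

Accessing img[x][y] instead would read img[2][1] and fail here, since there is no row 2, which is exactly the out-of-bounds access behind the segmentation fault.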

How to superimpose two images?

I have a visualization output of a Gabor filter with 12 different orientations. I want to superimpose the visualization image on my image of a retina for vessel extraction. How do I do it? I have tried the method below. Is there any other way to superimpose images in MATLAB?
Here is my code:
I = getimage();
I = I(:,:,2);
lambda = 8;
theta = 0;
psi = [0 pi/2];
gamma = 0.5;
bw = 1;
N = 2;
img_in = im2double(I);
%img_in(:,:,2:3) = []; % discard redundant channels, it's gray anyway
img_out = zeros(size(img_in,1), size(img_in,2), N);
for n = 1:N
    gb = gabor_fn(bw,gamma,psi(1),lambda,theta)...
        + 1i * gabor_fn(bw,gamma,psi(2),lambda,theta);
    % gb is the n-th gabor filter
    img_out(:,:,n) = imfilter(img_in, gb, 'symmetric');
    % filter output to the n-th channel
    %theta = theta + 2*pi/N
    %figure;
    %imshow(img_out(:,:,n));
    imshow(img_in); hold on;
    h = imagesc(img_out(:,:,n)); % here I am getting an error saying CData must be size [M*N]
    set(h, 'AlphaData', .5); % .5 transparency
    figure;
    imshow(h);
    theta = 15 * n; % next orientation
end
This is my original image.
This is the visualization image I got from the Gabor filter using orientation.
This is the kind of image I have to get: I need to superimpose the visualization image on my original image to obtain this type of result.
With the information you have provided, my understanding is you want the third/final image to be an overlay on top of the first/initial image. I do things like this when using segmentation to detect hemorrhaging in MRI images of the brain.
First, let's set up some definitions:
I_src = source/original image
I_out = output/final image
Now, make a copy of I_src that is a colour image rather than grayscale:
I_hybrid = gray2rgb(I_src)
Let's assume both I_src and I_out have the same dimensions (i.e. width and height), and that I_out is strictly black-and-white (i.e. monochrome). Now we can use I_out as a mask template for adjusting the resulting image. This is where it gets fun.
BLACK = 0;
WHITE = 1;
[height width] = size(I_out);
for i = 1:height
    for j = 1:width
        if (I_out(i,j) == WHITE)
            % brighten the masked pixel by tinting its red channel
            I_hybrid(i,j,1) = I_hybrid(i,j,1) + 0.25;
        end
    end
end
This will give you your original image with the blood vessels in the eye slightly brighter and tinted red. You now have a composite of your original image with the desired features highlighted but not overwritten (i.e. you can undo the highlighting by subtracting the tint you added).
I will include an example of what the output would look like, but it's noisy because I had to create it in GIMP as I don't have Matlab installed right now. The results will be similar, but yours would be much cleaner and prettier.
Please let me know how this goes.
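For readers without MATLAB, the masking loop can be sketched in Python (pixel values as 0-1 floats; the function name and data layout are illustrative, not from the original code):

```python
def tint_where(mask, image, tint=(0.25, 0.0, 0.0)):
    """Add a red tint to every pixel where the mask is set (channels clamped to 1.0)."""
    return [
        [tuple(min(c + t, 1.0) for c, t in zip(px, tint)) if mask[y][x] else px
         for x, px in enumerate(row)]
        for y, row in enumerate(image)
    ]

mask = [[0, 1]]
img = [[(0.5, 0.5, 0.5), (0.5, 0.5, 0.5)]]
print(tint_where(mask, img))  # [[(0.5, 0.5, 0.5), (0.75, 0.5, 0.5)]]
```

Clamping at 1.0 matters for the MATLAB version too: a bright vessel pixel plus 0.25 can exceed the valid double-image range and will otherwise saturate oddly on display.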
References
"Converting Images from Grayscale to Color" http://blogs.mathworks.com/pick/2012/11/25/converting-images-from-grayscale-to-color/
