Checking image feature alignment - image-processing

I have written my own software in C# for performing microscopy imaging. See this screenshot.
The images that can be seen there are of the same sample but recorded through physically different detectors. It's crucial for my experiments that these images be exactly aligned. I thought the easiest approach would be to somehow blend/subtract the two bitmaps, but this doesn't give me good results. Therefore I am looking for a better way to do this.
It might be useful to point out that the images exist as arrays of intensities in memory and are converted to bitmaps for on-screen painting to my self-written image control.
I would greatly appreciate any help!

If the images are the same orientation and same size, but slightly shifted vertically or horizontally, can you use cross-correlation to find the best alignment?
If you know that features in the yellow channel need to line up, for instance, just feed the yellow channels into the cross-correlation algorithm, and then find the peak in the result. The peak will occur at the offset where the two images line up best.
It will work even with noisy images, and I suspect it will work even for images that are significantly different, like in your screenshot.
MATLAB example: Registering an Image Using Normalized Cross-Correlation
Wikipedia calls this "phase correlation" and also describes making it scale- and rotation-invariant:
The method can be extended to determine rotation and scaling differences between two images by first converting the images to log-polar coordinates. Due to properties of the Fourier transform, the rotation and scaling parameters can be determined in a manner invariant to translation.
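For a concrete starting point, here is a minimal phase-correlation sketch in Python/NumPy (this assumes the two detector images are already available as equally sized 2-D float arrays; the function and variable names are illustrative):
import numpy as np

def phase_correlation_shift(img_a, img_b):
    """Estimate the (row, col) shift that best aligns img_b to img_a."""
    fa = np.fft.fft2(img_a)
    fb = np.fft.fft2(img_b)
    # Normalized cross-power spectrum; the small epsilon avoids division by zero.
    cross = fa * np.conj(fb)
    cross /= np.abs(cross) + 1e-12
    corr = np.fft.ifft2(cross).real
    # The peak location gives the shift, wrapped around the array edges.
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    shift = [p if p <= s // 2 else p - s for p, s in zip(peak, corr.shape)]
    return tuple(shift)

# dy, dx = phase_correlation_shift(channel_a, channel_b)
# A perfectly aligned pair should report (0, 0).
A reported shift of (0, 0) is exactly the verification you need; the sign of a non-zero shift depends on which image you treat as the reference.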

I got around to solving this some time ago. Since I only need to verify that the two detector images are perfectly aligned, and I do not have to try to align them if they are not, I solved it like this:
1) Use the AForge framework and apply a grayscale filter to both images. This collapses the RGB values of each pixel into a single intensity.
2) On one image apply a ChannelFilter to retain only the red channel.
3) On the other image, apply a ChannelFilter to retain only the green channel.
4) Add both images.
Here are the filters I used; I leave applying them to the reader (it's trivial, and there are examples on the AForge website).
AForge.Imaging.Filters.IFilter filterR = new AForge.Imaging.Filters.ChannelFiltering(new AForge.IntRange(0, 255), new AForge.IntRange(0, 0), new AForge.IntRange(0, 0));
AForge.Imaging.Filters.IFilter filterG = new AForge.Imaging.Filters.ChannelFiltering(new AForge.IntRange(0, 0), new AForge.IntRange(0, 255), new AForge.IntRange(0, 0));
AForge.Imaging.Filters.GrayscaleRMY filterGray = new AForge.Imaging.Filters.GrayscaleRMY();
AForge.Imaging.Filters.Add filterADD = new AForge.Imaging.Filters.Add();
When significant features are present in both images I want to check, they show up in yellow, which is exactly what I need.
Thanks for all the input!

The detectors are different, so the alignment will be slightly off, in that a feature at pixel (256,512) in image 1 could be represented by pixel (257,513) in image 2. Is that the problem? What about magnification? If the detectors are different, couldn't the magnification be slightly different as well?
If you mean something like the above, then judging from your screenshot it shouldn't be too difficult to find the centers of the 4 or 5 areas of highest intensity: normalize the data and go through the entire image looking for blocks of 9 neighboring pixels with the highest average intensity. Note the center pixel of four or five of these features in each image, then calculate the distance between each corresponding pair of centers in the two images.
If the distance is 0 for all sets, the two images should be in alignment. If the distance is constant, all you have to do is move one image that distance. If the distance varies, you will need to resize one image until it is constant, and then slide it to match up the features. Then you can average the intensity values of the two images, since they should be in alignment.
That's how I would start, anyway.
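A rough NumPy/SciPy sketch of that starting point (normalize, average 3x3 neighbourhoods, then keep the centers of the brightest few local maxima; all names and the separation parameter are illustrative):
import numpy as np
from scipy.ndimage import uniform_filter, maximum_filter

def brightest_block_centers(img, n_features=5, min_separation=15):
    # Normalize the data to [0, 1].
    img = img.astype(float)
    img = (img - img.min()) / (np.ptp(img) + 1e-12)
    # Average intensity of each 3x3 neighbourhood (blocks of 9 pixels).
    means = uniform_filter(img, size=3)
    # Candidate centers: local maxima of the averaged image.
    peaks = (means == maximum_filter(means, size=min_separation))
    coords = np.argwhere(peaks)
    coords = coords[np.argsort(means[peaks])[::-1]]   # brightest first
    return [tuple(c) for c in coords[:n_features]]

# centers_1 = brightest_block_centers(image_1)
# centers_2 = brightest_block_centers(image_2)
# Comparing the distances between matched centers tells you whether the offset
# is zero, constant (pure shift) or varying (scale difference).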

If the images are generated from different sensors, then the problem will be difficult in general -- particularly for you, since one of your images seems to have a lot of noise.
Assuming there is no warping or rotation in the sensors, I would suggest that you first normalize the intensities of each image, then find the shift that minimizes the error between the images. The error can be Euclidean (i.e. the total sum of squared differences over all pixels). That, to me at least, is the definition of alignment.
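A brute-force sketch of that search in NumPy (names are illustrative; it simply tries every shift inside a small window and keeps the one with the lowest sum of squared differences over the non-wrapped region):
import numpy as np

def best_shift_ssd(img_a, img_b, max_shift=10):
    # Normalize intensities so the two detectors are comparable.
    a = (img_a - img_a.mean()) / (img_a.std() + 1e-12)
    b = (img_b - img_b.mean()) / (img_b.std() + 1e-12)
    core = (slice(max_shift, -max_shift), slice(max_shift, -max_shift))
    best, best_err = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(b, dy, axis=0), dx, axis=1)
            # Ignore the wrapped-around border when computing the error.
            err = np.sum((a[core] - shifted[core]) ** 2)
            if err < best_err:
                best_err, best = err, (dy, dx)
    return best, best_err

# (dy, dx), err = best_shift_ssd(image_1, image_2)
# A best shift of (0, 0) means the images are already aligned by this criterion.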

The only way you can align the images is if there is some feature in them that is known to be identical (or related by a known transformation). A common approach is to put something into the image -- for instance, have the image capture add an alignment artifact -- something easy to detect, from which you can figure out the transformation required to normalize the image.
A common example is to put + markers at the corners. You might also see barcodes used for this purpose sometimes.
Without this artifact, there has to be something in the image whose size and orientation is known (and that exists in both images).

Related

Comparing two binary images in OpenCV

I have two binary images of a hand which are almost the same. How should I compare them to know whether they represent almost the same shape or not? I have tried finding the Euclidean distance between the two images, but it does not give the correct answer if the image is slightly changed, moved to the left or right, or slightly decreased in size. I have also tried HOG descriptors in OpenCV, but I am still unable to get the correct answer when I compare more than one image. What is the best way to compare two binary images based on shape, or any other feature, to find nearly matching images without considering the size of the image? Links to the images are http://postimg.org/image/w20tuuzmv/ and http://postimg.org/image/jndr4br9x/
I think the generalized Hough transform might be a good solution for you. Here is a tutorial about it.
Alternatively, you can cut the hand out of one image (just use the contour's bounding rect), use it as a template, and search for it in the second image using the template matching technique -- here you can read more about it. When you find the point with the highest correlation value, you need to decide whether it is high enough; you will have to find that threshold on your own.
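A minimal OpenCV (Python) sketch of that template-matching route, assuming the hand is cut out of one image via the bounding rect of its largest contour; the file names and the final threshold are placeholders you would tune (OpenCV 4.x return signatures):
import cv2

img = cv2.imread('hand_a.png', cv2.IMREAD_GRAYSCALE)
other = cv2.imread('hand_b.png', cv2.IMREAD_GRAYSCALE)

# Cut the hand out of the second image: bounding rect of its largest contour.
contours, _ = cv2.findContours(other, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
template = other[y:y + h, x:x + w]

# Slide the template over the first image and take the best correlation.
result = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(result)

print('best match score:', max_val, 'at', max_loc)
if max_val > 0.5:   # illustrative threshold -- find a suitable value yourself
    print('shapes roughly match')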
Are the images just rotated, translated and scaled? If so, you could compute the principal components of the images using PCA, then rotate the images so that the first component points in a fixed direction (e.g. always vertical). You could then compute the centroids of the images and translate them so they always sit in the same position (e.g. the center of the image). To always use the same scale, you could resize the images so that the sum of the distances between each white pixel and the centroid is the same in both images. After that it is easy to compare the images, for example score = np.sum(A==B)
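A rough NumPy/SciPy sketch of the orientation and centering steps (the scale normalization is omitted for brevity, all names are illustrative, and the rotation sign may need flipping depending on your image coordinate convention):
import numpy as np
from scipy import ndimage

def orient_and_center(binary_img, size=256):
    # PCA on the (x, y) coordinates of the white pixels.
    ys, xs = np.nonzero(binary_img)
    pts = np.column_stack([xs, ys]).astype(float)
    pts -= pts.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(pts.T))
    major = eigvecs[:, np.argmax(eigvals)]               # first principal axis
    angle = np.degrees(np.arctan2(major[1], major[0]))   # angle from the x-axis
    # Rotate so the major axis ends up (roughly) vertical.
    rotated = ndimage.rotate(binary_img.astype(np.uint8), angle - 90,
                             reshape=True, order=0)
    # Recenter the shape's centroid on a fixed-size canvas so A and B line up.
    ys, xs = np.nonzero(rotated)
    cy, cx = int(ys.mean()), int(xs.mean())
    canvas = np.zeros((size, size), dtype=np.uint8)
    dy, dx = size // 2 - cy, size // 2 - cx
    for y, x in zip(ys, xs):
        if 0 <= y + dy < size and 0 <= x + dx < size:
            canvas[y + dy, x + dx] = 1
    return canvas

# A = orient_and_center(img1); B = orient_and_center(img2)
# score = np.sum(A == B)   # higher means more similar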

Which algorithm to choose for object detection?

I am interested in detecting a single object, more precisely a fire extinguisher, which has no inter-class variability (all fire extinguishers look the same). However, the application has to run in real time, i.e. a robot is exploring the environment, and whenever it sees the object of interest it should be able to detect it and give its pixel coordinates.
My question is: which algorithm would be a good choice for this task?
1. Is this a classification problem, and should we use features (SIFT/SURF etc.) + BoW + SVM?
2. Some other solution (no idea yet).
Any kind of input will be appreciated.
Thanks.
(P.S. Bear with me, I am a newbie to computer vision and Stack Overflow.)
Update 1:
The height varies: all extinguishers are mounted on the wall, but at different heights. I tried SIFT features and BoW, but extracting BoW descriptors at test time is expensive. Moreover, I have no idea how to locate the object (pixel coordinates) inside the image after it has been classified as positive.
Update 2:
I finally used SIFT + BoW + SVM and am able to classify the object. But with this technique I only get output in terms of whether the object is present in the scene or not.
How can I detect the object, i.e. get the bounding box or the center of the object? What approach, compatible with the above method, would achieve these results?
Thank you all.
I would suggest using color as the main feature to look for, and only try other features as needed. The fire extinguisher red is very distinctive, and should not occur too often elsewhere in an office environment. Other, more computationally expensive tests can then be performed only in regions of the right color.
Here is a good tutorial for color detection that also explains how to find good thresholds for your desired color.
I would suggest the following approach:
denoise your image with a median filter
convert the image to HSV format (Hue, Saturation, Value)
select pixels close to that particular shade of red with InRange()
Now you have a binary image that contains only the pixels that are red.
count the number of red pixels with CountNonZero()
If that number is too small, abort
remove noise from the binary image by morphological opening / closing
find contours of all blobs in your picture with findContours or the CvBlob library
check if there are blobs of the correct width, correct height and correct width/height ratio
since your fire extinguishers are vertical cylinders, the width/height ratio will be constant from every angle. The width and height will of course vary somewhat with distance to the camera.
if the width and height do not match, abort
repeat these steps to find the black-colored part on the bottom of the extinguisher,
abort if there is no black region with correct width/height below the red region
(perhaps also repeat these steps for the metallic top and the yellow rectangle)
These tests should all be very fast. If they are too slow, you could reduce the resolution of your input images.
Depending on your environment, it is possible that this is already a robust enough test. If not, you can proceed with SIFT/SURF feature matching, but only in a small region around the blobs with the correct color. You also do not necessarily have to do that for each frame; every n-th frame should be enough for confirmation.
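A rough OpenCV (Python) sketch of the color-based steps above; the HSV ranges, pixel count and size bounds are illustrative values you would tune for your camera and environment (OpenCV 4.x return signatures):
import cv2

frame = cv2.imread('frame.png')                 # or a frame from the robot's camera
frame = cv2.medianBlur(frame, 5)                # denoise
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)    # convert to HSV

# Red wraps around hue 0 in OpenCV's 0-179 hue range, so combine two ranges.
mask = cv2.inRange(hsv, (0, 120, 70), (10, 255, 255)) | \
       cv2.inRange(hsv, (170, 120, 70), (179, 255, 255))

if cv2.countNonZero(mask) < 500:                # too few red pixels: abort early
    print('no candidate')
else:
    # Morphological opening/closing to remove speckle noise.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)

    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if h == 0:
            continue
        # A vertical cylinder keeps roughly the same width/height ratio from any
        # viewing angle; tune these bounds to your extinguishers.
        if 0.25 < w / float(h) < 0.6 and h > 40:
            print('candidate extinguisher centered at pixel', (x + w // 2, y + h // 2))
The same mask-and-check pattern can then be repeated for the black part below the red region.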
This is an old question, but I would still like to recommend the YOLO algorithm to solve this problem.
YOLO fits this scenario very well.

Image Processing for recognizing 2D features

I've created an iPhone app that can scan an image of a page of graph paper and can then tell me which squares have been blacked out and which squares are blank.
I do this by scanning from left to right and use the graph paper's lines as guides. When I encounter a graph paper line, I start to look for black, until I hit the graph paper line again. Then, instead of continuing along the scan line, I go ahead and completely scan the square for black. Then I continue on to the next box. At the end of the line, I skip down so many pixels before starting the scan on a new line (since I have already figured out how tall each box is).
This sort of works, but there are problems. Sometimes I mistake the graph lines for "black". Sometimes, if the image is skewed or I don't have uniform lighting across the page, I don't get good results.
What I'd like to do is to specify a few "alignment" boxes that I then resize and rotate (and skew) the picture to align with those. Then, I was thinking that once I have the image aligned, I would then know where all the boxes are and won't have to scan for the boxes, just scan inside the location of the boxes to see if they are black. This should be faster and more reliable. And if I were to operate on images coming from the camera, I'd have more flexibility in asking the user to align the picture to match the alignment marks, rather than having to align the image myself.
Given that this is my first Image Processing project, I feel like I am reinventing the wheel. I'd like suggestions on how to do this, and whether to utilize libraries like OpenCV.
I am enclosing an image similar to what I would like processed. I am looking for a list of all squares that have a significant amount of black marking, i.e. A8, C4, E7, G4, H1, J9.
Issues to be aware of:
Light coverage of the image may not be ideal, but should be relatively consistent across the image (i.e. no shadows)
All squares may be empty or all dark, and the algorithm needs to be able to determine that
the image may be skewed or rotated about any of the axes. Rotation about the z-axis may be easy to fix. There may also be rotation around the x- or y-axis, making one side of the image wider than the other. However, if I scan the image in real time as it comes from the camera, I can ask the user to align the alignment marks with marks on the screen. How best to ensure that alignment and give the user appropriate feedback? Just checking that the 4 corners are dark could result in a false positive when the camera is pointing at a black surface.
not every square will be equally or consistently blacked out, but I think there will be enough black to make it unquestionable to a human eye.
the blue grid may be useful, but there are cases where the black markings overlap the blue grid. I think a virtual grid is probably better than relying on the printed grid. Using the alignment markers to align the image would then allow a precise virtual grid to be laid out, and the contents of each grid box could be sampled to see if it is predominantly black, rather than scanning from left to right, no? Here is another image with more markings on the grid. In this image, in addition to the previous markings in A8, C4, E7, G4, H1, J9, I have marked E2, G8, G9, I4 and J4, and you can see how the blue grid is obscured.
This is my first phase of this project. Eventually I'd like to scale this algorithm to be able to process at least a few hundred slots and possibly different colors.
To start with, this problem reminded me a bit of these demos, which might be useful to learn from:
DNA microarray image processing
The MATLAB Sudoku solver
The iPhone Sudoku solver blog post, explaining the image processing
Personally, I think the most simple approach would be to detect the squares in your image.
1) Remove the background and small cruft
f_makebw = @(I) im2bw(I.data, double(median(I.data(:)))/1.3);
bw = ~blockproc(im, [128 128], f_makebw);
bw = bwareaopen(bw, 30);
2) Remove everything but the squares and circles.
se = strel('disk', 5);
bw = imerode(bw, se);
% Detect the squares and circles via morphology
[B, L] = bwboundaries(bw, 'noholes');
3) Detect the squares using the 'Extent' property from regionprops. The 'Extent' metric measures what proportion of the bounding box is filled, which makes it a nice measure to distinguish between circles and squares.
stats = regionprops(L, 'Extent');
extent = [stats.Extent];
idx1 = find(extent > 0.8);
bw = ismember(L, idx1);
4) This leaves you with features to synchronize or rectify the image with. An easy and robust way to do this is via the autocorrelation function (ACF).
This gives nice peaks, which are easily detected. These peaks can be matched against the ACF peaks from a template image via the Hungarian algorithm. Once matched, you can correct rotation and scaling, since you now have a linear system which you can solve:
x' = A x
Translation can then be corrected using run-of-the-mill cross-correlation against the same predefined template.
If all goes well, you now have an aligned or synchronized image, which should help considerably in determining the position of the dots.
I've started to do something similar using my GPUImage iOS framework, so that might be an alternative to doing all of this in OpenCV or something else. As its name indicates, GPUImage is entirely GPU-based, so it can have some tremendous performance benefits over CPU-bound processing (up to 180X faster for things like processing live video).
As a first stage, I took your images and ran them through a simple luminance thresholding filter with a threshold of 0.5 and arrived at the following for your two images:
I just added an adaptive thresholding filter, which attempts to correct for local illumination variances, and works really well for picking out text. However, in your images it uses too small of an averaging radius to handle your blobs well:
and seems to bring out your grid lines, which it sounds like you wish to ignore.
Maurits provides a more comprehensive description of what you could do, but there might be a way to implement these processing operations as high-performance GPU-based filters instead of relying on slower OpenCV versions of the same calculations. If you could grab rotation and scaling information from this thresholded image, you could construct a transform that could also be applied as a filter to your thresholded image to produce your final aligned image, which could then be downsampled and read out by your application to determine which grid locations were filled in.
These GPU-based thresholding operations run in less than 2 ms for 640x480 frames on an iPhone 4, so it might be possible to chain filters together to analyze incoming video frames as fast as the device's video camera can provide them.

Image Comparison

What is an efficient way to compare two images in Visual C?
Also, in which format do the images have to be stored (BMP, GIF, JPEG, ...)?
Please provide some suggestions.
If the images you are trying to compare have distinctive characteristics that you are trying to differentiate, then PCA is an excellent way to go. The question of which file format you need is really irrelevant; you need to load the image into the program as an array of numbers and do the analysis.
Your question opens a can of worms in terms of complexity.
If you want to compare two images to check whether they are the same file, you can compute an MD5 hash of each file (after removing any metadata that could distort the result).
If you want to compare whether they look the same, it's a completely different story. "Look the same" is meant very loosely here (e.g. they are exactly the same image but stored in two different file formats). For this, you need advanced algorithms, which will give you a probability that two images are the same. Not being an expert in the field, I would try the following "invented out of my head" algorithm:
take an arbitrary set of pixel points from the image.
for each pixel "grow" a polygon out of the surrounding pixels which are near in color (according to HSV colorspace)
do the same for the other image
for each polygon of one image, check the geometrical similitude with all the other polygons in the other image, and pick the highest value. Divide this value by the area of the polygon (to normalize).
create a vector out of the highest values obtained
the higher the norm of this vector, the higher the chance that the two images are the same.
This algorithm should be insensitive to color drift and image rotation. Maybe also scaling (you normalize against the area). But I restate: not an expert, there's probably much better, and it could make kittens cry.
I did something similar to detect movement from a MJPEG stream and record images only when movement occurs.
For each decoded image, I compared to the previous using the following method.
Resize the image to effectively thumbnail size (I resized fairly hi-res images down by a factor of ten)
Compare the brightness of each pixel to the previous image and flag if it is much lighter or darker (threshold value 1)
Once you've done that for each pixel, you can use the count of different pixels to determine whether the image is the same or different (threshold value 2)
Then it was just a matter of tuning the two threshold values.
I did the comparisons using System.Drawing.Bitmap, but as my source images were JPEGs, there was some artifacting.
It's a nice simple way to compare images for differences if you're going to roll it yourself.
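For what it's worth, here is a NumPy/Pillow sketch of the same idea (the original used System.Drawing.Bitmap in C#; the thumbnail size and the two threshold values are placeholders to tune):
import numpy as np
from PIL import Image

def looks_different(path_a, path_b, pixel_threshold=30, count_threshold=50):
    def thumb(path, size=(64, 48)):
        # Grayscale thumbnail as a signed integer array (avoids uint8 wraparound).
        return np.asarray(Image.open(path).convert('L').resize(size), dtype=int)

    a, b = thumb(path_a), thumb(path_b)
    # Threshold value 1: flag pixels that got much lighter or darker.
    changed = np.abs(a - b) > pixel_threshold
    # Threshold value 2: call the images "different" if enough pixels changed.
    return changed.sum() > count_threshold

# if looks_different('frame_0001.jpg', 'frame_0002.jpg'):
#     record_frame(...)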
If you want to determine if 2 images are the same perceptually, I believe the best way to do it is using an Image Hashing algorithm. You'd compute the hash of both images and you'd be able to use the hashes to get a confidence rating of how much they match.
One that I've had some success with is pHash, though I don't know how easy it would be to use with Visual C. Searching for "Geometric Hashing" or "Image Hashing" might be helpful.
Testing for strict identity is simple: Just compare every pixel in source image A to the corresponding pixel value in image B. If all pixels are identical, the images are identical.
But I guess you don't want this kind of strict identity. You probably want images to be "identical" even if certain transformations have been applied to image B. Examples of these transformations might be:
changing image brightness globally (for every pixel)
changing image brightness locally (for every pixel in a certain area)
changing image saturation globally or locally
gamma correction
applying some kind of filter to the image (e.g. blurring, sharpening)
changing the size of the image
rotation
Printing an image and scanning it again, for example, would probably involve all of the above.
In a nutshell, you have to decide which transformations you want to treat as "identical" and then find image measures that are invariant to those transformations. (Alternatively, you could try to revert the transformations, but that's not possible if the transformation removes information from the image, such as blurring or clipping.)

Adaptive threshold Binarization's bad effects

I implemented some adaptive binarization methods. They use a small window, and at each pixel the threshold value is calculated. There are problems with these methods:
If we select a window size that is too small, we get this effect (I think the reason is the small window size):
(source: piccy.info)
In the upper left corner is the original image; the upper right corner shows the global threshold result. The bottom left is an example of dividing the image into parts (but I am talking about analyzing the small surrounding of each pixel, for example a 10x10 window).
So you can see the result of such algorithms in the bottom right picture: we get a black area, but it should be white.
Does anybody know how to improve an algorithm to solve this problem?
There should be quite a lot of research going on in this area, but unfortunately I have no good links to give.
An idea, which might work but I have not tested, is to try to estimate the lighting variations and then remove that before thresholding (which is a better term than "binarization").
The problem is then moved from adaptive thresholding to finding a good lighting model.
If you know anything about the light sources then you could of course build a model from that.
Otherwise a quick hack that might work is to apply a really heavy low pass filter to your image (blur it) and then use that as your lighting model. Then create a difference image between the original and the blurred version, and threshold that.
EDIT: After quick testing, it appears that my "quick hack" is not really going to work at all. After thinking about it I am not very surprised either :)
I = someImage
Ib = blur(I, 'a lot!')
Idiff = I - Ib
It = threshold(Idiff, 'some global threshold')
EDIT 2
Got one other idea which could work depending on how your images are generated.
Try estimating the lighting model from the first few rows in the image:
Take the first N rows in the image
Create a mean row from the N collected rows. You now have one row as your background model.
For each row in the image subtract the background model row (the mean row).
Threshold the resulting image.
Unfortunately I am at home without any good tools to test this.
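A small NumPy sketch of that row-based idea, untested as noted above (N, the threshold value and its sign depend on whether your foreground is darker or lighter than the background):
import numpy as np

def threshold_with_row_background(img, n_rows=10, thresh=40):
    img = img.astype(float)
    # Background model: the mean of the first N rows.
    background_row = img[:n_rows].mean(axis=0)
    # Subtract the model from every row (broadcasts along the rows).
    diff = img - background_row
    # Global threshold on the difference; keeps pixels darker than the model.
    return diff < -thresh

# binary = threshold_with_row_background(gray_image)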
It looks like you're doing adaptive thresholding wrong. Your images look as if you divided your image into small blocks, calculated a threshold for each block and applied that threshold to the whole block. That would explain the "box" artifacts. Usually, adaptive thresholding means finding a threshold for each pixel separately, with a separate window centered around the pixel.
Another suggestion would be to build a global model for your lighting: in your sample image, I'm pretty sure you could fit a plane (in X/Y/brightness space) to the image using least squares, then separate the pixels into those brighter than the plane (foreground) and those darker than it (background). You can then fit separate planes to the background and foreground pixels, threshold using the mean between these planes again, and improve the segmentation iteratively. How well that works in practice depends on how well your lighting can be modeled with a linear model.
If the actual objects you try to segment are "thinner" (you said something about barcodes in a comment), you could try a simple opening/closing operation to get a lighting model (i.e. close the image to remove the foreground pixels, then use [closed image + X] as the threshold).
Or, you could try mean-shift filtering to get the foreground and background pixels to the same brightness. (Personally, I'd try that one first)
You have very non-uniform illumination and fairly large objects (thus, there is no universal, easy way to extract the background and correct the non-uniformity). This basically means you cannot use global thresholding at all; you need adaptive thresholding.
You want to try Niblack binarization. Matlab code is available here
http://www.uio.no/studier/emner/matnat/ifi/INF3300/h06/undervisningsmateriale/week-36-2006-solution.pdf (page 4).
There are two parameters you'll have to tune by hand: window size (N in the above code) and weight.
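For reference, Niblack's local threshold is the local mean plus the weight times the local standard deviation; here is a compact NumPy/SciPy version (window size and weight k are the two parameters to tune by hand):
import numpy as np
from scipy.ndimage import uniform_filter

def niblack_threshold(img, window=25, k=-0.2):
    img = img.astype(float)
    mean = uniform_filter(img, window)                 # local mean
    mean_sq = uniform_filter(img * img, window)        # local mean of squares
    std = np.sqrt(np.maximum(mean_sq - mean * mean, 0))
    threshold = mean + k * std                         # Niblack: T = m + k*s
    # Which side counts as foreground depends on the polarity of your images.
    return img > threshold

# binary = niblack_threshold(gray_image, window=25, k=-0.2)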
Try to apply a local adaptive threshold using this procedure:
convolve the image with a mean or median filter
subtract the original image from the convolved one
threshold the difference image
The local adaptive threshold method selects an individual threshold for each pixel.
I'm using this approach extensively, and it works fine with images that have a non-uniform background.
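This procedure maps almost directly onto OpenCV's adaptiveThreshold, which compares each pixel against the local mean (or a Gaussian-weighted mean) minus a constant; a minimal sketch, where the block size and subtracted constant are values you would tune:
import cv2

gray = cv2.imread('input.png', cv2.IMREAD_GRAYSCALE)
gray = cv2.medianBlur(gray, 3)   # optional pre-smoothing

# 35 is the neighbourhood size for the local mean, 10 is subtracted from
# that mean before the comparison; both need tuning for your images.
binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                               cv2.THRESH_BINARY, 35, 10)

cv2.imwrite('binary.png', binary)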
