Matching and cropping 2 slightly shifted photos of an object - image-processing

There's a 2D object that I have two photos of. I want to find their difference, but can't do so directly since the photos were captured with a slight shift relative to each other.
Thus each picture is almost identical to the other, but each also contains some information near the frame edges that the other one doesn't have, because of the shift.
Using Python, how do I align and crop them so that I can calculate the difference image?
Note: I realize that if the images are identical except for the shift then the difference should be 0 everywhere, but I'm expecting very small changes that I'll be trying to find with the difference image.
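One possible approach (not from the thread): estimate the translation with phase correlation via scikit-image's `phase_cross_correlation`, crop both images to their common overlap, and subtract. A minimal sketch, assuming the shift is a pure translation and the inputs are grayscale arrays of the same shape:

```python
import numpy as np
from skimage.registration import phase_cross_correlation

def aligned_difference(img1, img2):
    # shift (dy, dx) registers img2 against img1: img1[y, x] ~ img2[y - dy, x - dx]
    (dy, dx), _, _ = phase_cross_correlation(img1, img2)
    dy, dx = int(round(dy)), int(round(dx))
    h, w = img1.shape
    # keep only the region covered by both frames
    crop1 = img1[max(0, dy):min(h, h + dy), max(0, dx):min(w, w + dx)]
    crop2 = img2[max(0, -dy):min(h, h - dy), max(0, -dx):min(w, w - dx)]
    return crop1.astype(float) - crop2.astype(float)
```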

Related

How to count tablets successfully?

My last question on image recognition seemed to be too broad, so I would like to ask a more concrete question.
First the background. I have already developed a (round) pill counter. It uses something similar to this tutorial. After I made it, I also found something similar in this other tutorial.
However, my method fails for something like this image:
Although the segmentation process is a bit complicated (because of the semi-transparency of the tablets), I have managed to get it:
My problem is this: how can I count the elongated tablets, separating each one from the image, similar to the final results in the linked tutorials?
So far I have applied a distance transform and then my own version of watershed, and I got:
As you can see, it fails on the adjacent tablets (the distance transform usually does).
Take into account that the solution has to work for this image and also for other arrangements of the tablets, the most difficult being, for example:
I am open to using OpenCV or, if necessary, implementing algorithms on my own. So far I have tried both (used OpenCV functions and also programmed my own libraries). I am also open to using C++, Python, or another language (I programmed them in C++, and I have done it in C# too).
I am also working on this pill counting problem (I'm much earlier in the process than you are). To solve the piece you are working on, touching pills, my general idea is to capture the contours of the pills once you have a good mask of them, and then calculate the area of a single pill.
For this approach I'm assuming that there are enough pills in the image that the number of non-touching pills is greater than the number of touching ones, and that no pills overlap one another. For my application, I think placing this restriction is reasonable (humans can take a quick look at the pills they've dumped out and, without too much work, at least roughly arrange them so they are not touching; it's also possible that I could design a tray with dimples in it that would coerce the pills into not touching).
I do this by sorting the contour areas (which, with the right thresholding, should leave only pills and pill groups among the identified contours) and taking the median value.
Then, with a good value for the area of a pill, you can look for contours with areas that are a multiple of that median area (+/- some % error value).
I also use that median value to filter out contours that are clearly not big enough to be pills, and ones that are far too large to be a pill (the latter, though, could be more troublesome, since it could still be a grouping of touching pills).
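A rough sketch of that idea (my own, using OpenCV's Python bindings; `mask` is assumed to be a clean binary image with pills in white, and the tolerance is an arbitrary starting point):

```python
import cv2
import numpy as np

def count_pills(mask, tolerance=0.25):
    # OpenCV 4 signature: findContours returns (contours, hierarchy)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    areas = [cv2.contourArea(c) for c in contours]
    single = np.median(areas)        # assumes most blobs are single pills
    count = 0
    for area in areas:
        n = area / single            # how many pills this blob could be
        k = round(n)
        if k >= 1 and abs(n - k) <= tolerance:
            count += k               # blob is ~k touching pills
        # else: noise, or a clump that needs manual review
    return count
```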
Given that the pills are all identical and don’t overlap, simply divide the total pill area by the area of a single pill.
The area is estimated by simply counting the number of “pill” pixels.
You do need to calibrate the method by giving it the area of a single pill. This can be trivially obtained by giving the correct solution to one of the images (manual counting), then all the other images can be counted automatically.
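In code this is essentially one line; a minimal sketch (mine), where `single_pill_area` comes from the manual calibration step described above:

```python
import numpy as np

def count_by_area(mask, single_pill_area):
    # mask: binary array where nonzero pixels are "pill" pixels;
    # single_pill_area: calibrated once from a manually counted image
    return int(round(np.count_nonzero(mask) / single_pill_area))
```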

How can I identify bullet holes by comparing images taken from different angles?

I have two images of a shooting target (4 meters by 4 meters), divided into sections of 0.5 meter by 0.5 meter squares. The images are taken before and after a firing trial. The target already has bullet holes on it before the firing. Moreover, there is some clutter on or in front of the target (fixing screws and steel lines to hold the target straight). Let us assume all bullet holes are visible in both images. How can I programmatically identify bullet holes by comparing the before and after images? Can you specify tools or libraries, or algorithm steps?
A possible approach would consist of the following steps:
perform image registration so that both images are seen from the same angle. Here you'll need to find the combination of rotation, scaling, and translation that relates one view to the other. See for example http://scikit-image.org/docs/dev/auto_examples/transform/plot_matching.html#example-transform-plot-matching-py, which determines the transformation from a set of points of interest (corners, for example). (The transformation you need might be a bit more complex than the one in the example, since for your images the rotation is in 3D, not only in 2D.)
once you have aligned the images, you can try different approaches. One of them is to detect the holes in both images with a segmentation method. Since the holes seem to be lighter, you can try thresholding the image (http://scikit-image.org/docs/dev/auto_examples/segmentation/plot_local_otsu.html) and maybe cleaning the result with mathematical morphology (http://www.scipy-lectures.org/packages/scikit-image/index.html#mathematical-morphology). Then, for each hole in the after image, you can try to match it with a hole in the before image, for example by picking the closest center of mass in the before image and computing the cross-correlation between patches around the hole in the two images.
I've given a few links to scikit-image examples, but OpenCV is often cited as the reference library for computer vision.
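Putting both steps together, a hedged sketch with scikit-image (the filenames are placeholders, and ORB keypoints plus RANSAC stand in for the corner matching of the linked example):

```python
import numpy as np
from skimage import io, transform, feature, measure, filters, morphology

before = io.imread("before.png", as_gray=True)   # hypothetical filenames
after = io.imread("after.png", as_gray=True)

# 1) registration: detect ORB keypoints in both images, match them, and
#    fit a projective transform robustly with RANSAC (a projective model
#    can absorb the 3D viewpoint change of a flat target).
orb = feature.ORB(n_keypoints=500)
orb.detect_and_extract(before)
kp1, desc1 = orb.keypoints, orb.descriptors
orb.detect_and_extract(after)
kp2, desc2 = orb.keypoints, orb.descriptors
matches = feature.match_descriptors(desc1, desc2, cross_check=True)
src = kp2[matches[:, 1]][:, ::-1]                # (row, col) -> (x, y)
dst = kp1[matches[:, 0]][:, ::-1]
model, inliers = measure.ransac((src, dst), transform.ProjectiveTransform,
                                min_samples=4, residual_threshold=2)
after_reg = transform.warp(after, model.inverse, output_shape=before.shape)

# 2) segmentation: threshold the registered image (holes are lighter) and
#    clean the mask with morphology before matching holes between images.
holes = after_reg > filters.threshold_otsu(after_reg)
holes = morphology.binary_opening(holes, morphology.disk(2))
```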

Good features to compare an image with a capture of this image

I want to compare one frame against all the frames stored in a database. The frame is captured by a mobile phone, while the database holds the originals. I have been searching for days to find a good method to compare them, taking into account that they do not have the same resolution, colors, luminance, etc. Does anyone have an idea?
I have already done a preprocessing step on the captured frame, with C++ and the OpenCV library, to make it as faithful as possible to the original. But I do not know what would be a good feature to compare them with.
Any comment will be very helpful, thank you!
EDIT: I implemented an algorithm which computes the difference between the two images, resized to 160x90, in grayscale and quantized. The results are the following:
The mean value of the image difference is 13. However, if I use two completely different images, the mean value of the image difference is 20. So I do not know if this measure can be improved in some manner to give a better margin for the matching.
Thanks for the help in advance.
Cut the color depth from 24 bits per pixel (or whatever) to 8 or 16 bits per pixel. You may be able to use a posterize function for this. Then resize both images to a small size (maybe 16x16 or 100x100, depending on your images), and compare. This should match similar images fairly closely. It will not account for different rotations or locations of objects in the image.
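A small sketch of that recipe (mine, with OpenCV; the 16x16 size and 8 gray levels are arbitrary starting points to tune against your data):

```python
import cv2
import numpy as np

def signature(img, size=(16, 16), levels=8):
    # shrink, convert to grayscale, and posterize to `levels` gray levels
    small = cv2.resize(img, size, interpolation=cv2.INTER_AREA)
    gray = cv2.cvtColor(small, cv2.COLOR_BGR2GRAY)
    return (gray // (256 // levels)).astype(int)

def distance(img1, img2):
    # mean absolute difference of the signatures; lower means more similar
    return np.abs(signature(img1) - signature(img2)).mean()
```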

How to identify changes in two images of same object

I have two images which I know represent the exact same object. In the picture below, they are referred to as Reference and Match.
The image Match can undergo the following transformations compared to Reference:
The object may have changed its appearance locally by addition (e.g., dirt or lettering added to the side) or omission (the side mirror has been taken out).
It may be stretched or reduced in size horizontally only (it is not resized in the vertical direction).
Portions of the Reference image may not be present in Match (shaded in red in the Reference image).
Question: How can the regions which have "changed" in the ways mentioned above be identified ?
Idea#1: Dynamic Time Warping seems like a good candidate once the beginning and end of the Match image (numbered 1 and 3 in the image) are aligned with the corresponding columns in the Reference image, but I am not sure how to proceed.
Idea#2: Match SIFT features across images. The tessellation produced by feature point locations breaks up the image into non-uniform tiles. Use feature correspondences across images to determine which tiles to match across images. Use a similarity measure to figure out any changes.
You might want to consider an iterative registration algorithm. Basically, you want to perform an optimization to find the parameters of the transform, in your case horizontal scaling and horizontal translation. Once you have optimized the parameters, you have the transformation between the two images; transform one to match the other, and then use a subtraction to identify the regions with differences.
For registration, take a look at the ITK library.
You can probably do a gradient descent optimization using mutual information as the metric. ITK has a number of different transforms that will capture translation and scaling. The code should run quickly on the sample images you show.
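If you drive ITK from Python, a minimal SimpleITK sketch of that pipeline might look like the following (the transform choice, optimizer parameters, and filenames are all assumptions on my part):

```python
import SimpleITK as sitk

# assumes grayscale input files
fixed = sitk.ReadImage("reference.png", sitk.sitkFloat32)
moving = sitk.ReadImage("match.png", sitk.sitkFloat32)

reg = sitk.ImageRegistrationMethod()
reg.SetMetricAsMattesMutualInformation(numberOfHistogramBins=50)
reg.SetOptimizerAsGradientDescent(learningRate=1.0, numberOfIterations=200)
# an affine transform can express horizontal-only scaling plus translation
reg.SetInitialTransform(sitk.AffineTransform(fixed.GetDimension()), inPlace=False)
reg.SetInterpolator(sitk.sitkLinear)

tx = reg.Execute(fixed, moving)
aligned = sitk.Resample(moving, fixed, tx, sitk.sitkLinear, 0.0)
difference = sitk.Abs(fixed - aligned)   # changed regions stand out here
```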

Segmentation for connected characters

How can I segment the characters if they are connected? I just tried using watershed with a distance transform (http://opencv-code.com/tutorials/count-and-segment-overlapping-objects-with-watershed-and-distance-transform/) to find the number of components, but it seems that it does not perform well.
It requires the objects to be separated after thresholding in order to perform well.
That said, how can I segment the characters effectively? Any help or ideas would be appreciated.
Attached is an example of the binary image.
An example of heavily connected characters:
I believe there are two approaches here: 1) redo the binarization step that led to the images you have right now; 2) consider different possibilities based on image size. Let us focus on the second approach, given the question.
In your smallest image, only two digits are connected, and that happens only when considering 8-connectivity. If you handle your image as 4-connected, then there is nothing to do, because no two components that should be separated are actually connected. This is shown below. The right image can be obtained simply by finding the points that are connected to another one only under 8-connectivity. In this case there are only two such points, and by removing them we disconnect the two digits '1'.
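The connectivity distinction is easy to check with SciPy's labeling, where the structuring element defines which neighbors count as connected (a sketch; `binary_image` stands for your thresholded digits):

```python
import numpy as np
from scipy import ndimage

four = np.array([[0, 1, 0],
                 [1, 1, 1],
                 [0, 1, 0]])          # 4-connectivity: no diagonals
eight = np.ones((3, 3), dtype=int)    # 8-connectivity: diagonals included

_, n4 = ndimage.label(binary_image, structure=four)
_, n8 = ndimage.label(binary_image, structure=eight)
# n4 > n8 means some components touch only diagonally; removing the pixels
# that provide only diagonal links separates them, as described above
```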
In your other image this is no longer the case, and I don't have a simple method for it that can also be applied to the smaller image without making that one worse. But we could consider upscaling both images to some common size, using nearest-neighbor interpolation so we don't move away from the binary representation. By resizing both of your images so their width equals 200, keeping the aspect ratio, we can apply the following morphological method to both of them. First do a thinning:
Now, as can be seen, the morphological branch points are the ones connecting your digits (there is another one at the left-most digit 'six' too, which will be handled). We can extract these branch points and apply a morphological closing with a vertical line of height 2*height+1 (height being that of your image), so no matter where the point is, its closing will produce a full vertical line. Since your image is not so small anymore, this line doesn't need to be 1 point wide; in fact I considered a line that is 6 points wide. Since some of the branch points are horizontally close, this closing operation joins them into the same vertical line. If a branch point is not close to another one, an erosion then removes its vertical line; by doing this, we eliminate the branch point related to the digit 'six' at left. After applying these steps, we obtain the following image at left. Subtracting the original image from it, we get the image at right.
If we apply these same steps to the '8011' image, we end up with exactly the same image we started with. But this is still good, because after applying the simple method that removes points connected only under 8-connectivity, we obtain the separated components as before.
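A loose scikit-image sketch of those steps (the width-200 resize, the 6-point-wide line, and the 2*height+1 closing follow the description above; details such as the erosion width are my guesses):

```python
import numpy as np
from scipy import ndimage
from skimage import morphology, transform

def split_touching_digits(binary):
    h, w = binary.shape
    # upscale to width 200 with nearest neighbor to stay binary
    img = transform.resize(binary.astype(float), (h * 200 // w, 200), order=0) > 0.5

    skel = morphology.thin(img)
    # branch points: skeleton pixels with more than two skeleton neighbors
    counts = ndimage.convolve(skel.astype(int), np.ones((3, 3)), mode="constant")
    branches = skel & (counts > 3)       # the 3x3 sum counts the pixel itself

    # closing with a tall vertical line turns each branch point into a full
    # vertical cut, 6 points wide; nearby branch points merge into one cut
    line = np.ones((2 * img.shape[0] + 1, 6), dtype=bool)
    cuts = morphology.binary_closing(branches, line)
    # an erosion wider than a single cut drops isolated branch points
    cuts = morphology.binary_erosion(cuts, np.ones((1, 7), dtype=bool))
    return img & ~cuts                   # subtract the cuts to split digits
```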
It is common to use "smearing algorithms" for this, also known as the Run Length Smoothing Algorithm (RLSA). It is a method that segments black-and-white images into blocks. You can find some information here, or look around on the internet for an implementation of the algorithm.
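For reference, the horizontal smearing step of RLSA fits in a few lines (my sketch; `threshold` is the longest white run, in pixels, that gets filled in):

```python
import numpy as np

def rlsa_horizontal(binary, threshold):
    # binary: 2D bool array, True = black (ink); returns a smeared copy
    out = binary.astype(bool).copy()
    for row in out:                       # rows are views, edited in place
        black = np.flatnonzero(row)
        for a, b in zip(black[:-1], black[1:]):
            if b - a - 1 <= threshold:    # short white gap between two blacks
                row[a:b] = True
    return out
```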
Not sure if I want to help you solve captchas, but one idea would be to use erosion. Depending on how many pixels you have to work with, it might be able to sufficiently separate the characters without destroying them. This would likely work best as a preprocessing step for some other segmentation algorithm.
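The erosion itself is a one-liner in OpenCV (a sketch; the kernel size and iteration count depend on the stroke width of your characters):

```python
import cv2
import numpy as np

kernel = np.ones((3, 3), np.uint8)
# thin bridges between characters may vanish before the characters do
separated = cv2.erode(binary_image, kernel, iterations=1)
```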
