Finding cropped and scaled similar image - opencv

Given several large original images and a small image that was cropped and isotropically scaled from one of them, the task is to find where the small image comes from.
Cropping usually occurs near the center of the large image, but the exact crop boundary is unknown.
The size of the small image is about 200x200, but its exact size is also unknown.
If the size of the cropped area is (width, height), the size of the small image must be (width * k, height * k), where k < 1.0.
I've read some related topics on SO and tried methods like ORB and color histograms, but the accuracy is not acceptable. Could you give me some advice? Is there an efficient algorithm for this problem? Thank you very much.

The term you are looking for is template matching: you want to scan the original image and look for the location of the cropped and scaled one.
The OpenCV tutorial has an extensive explanation of it.
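Since the scale factor k is unknown, a single template-matching pass won't suffice; a common workaround is to repeat the match over a range of candidate scales and keep the best score. A minimal sketch (the function name and scale range are my own; inputs are assumed to be BGR arrays as returned by cv2.imread):

```python
import cv2
import numpy as np

def find_crop(large_bgr, small_bgr, scales=np.linspace(0.3, 1.0, 15)):
    """Return (score, scale, top_left) for the best match of `small_bgr`
    inside `large_bgr`, trying several candidate scale factors k."""
    large = cv2.cvtColor(large_bgr, cv2.COLOR_BGR2GRAY)
    small = cv2.cvtColor(small_bgr, cv2.COLOR_BGR2GRAY)
    th, tw = small.shape
    best = (-1.0, None, None)
    for k in scales:
        # shrink the large image by k so the crop becomes the template's size
        resized = cv2.resize(large, None, fx=k, fy=k,
                             interpolation=cv2.INTER_AREA)
        if resized.shape[0] < th or resized.shape[1] < tw:
            continue
        result = cv2.matchTemplate(resized, small, cv2.TM_CCOEFF_NORMED)
        _, score, _, loc = cv2.minMaxLoc(result)
        if score > best[0]:
            # map the match position back to original-image coordinates
            best = (score, k, (int(loc[0] / k), int(loc[1] / k)))
    return best
```

With several candidate originals, run this over each one and keep whichever image yields the highest score.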

Related

Pillow Image.paste vs. composite vs. alpha_composite vs. blend, what's the difference?

I'm a newbie in image processing, and I'm confused by these methods when merging two images with Pillow:
PIL.Image.Image
.paste()
.composite()
.alpha_composite()
.blend()
Could anyone provide a quick explanation? Or point me to where I could pick up the related background knowledge?
I see it like this:
blend is the simplest. It takes a fixed and constant proportion of each image at each pixel location, e.g. 30% of image A and 70% of image B at each location all over the image. The ratio is a single number. This operation is not really interested in transparency, it is more of a weighted average where a part of both input images will be visible at every pixel location in the output image
paste and composite are synonyms. They use a mask, with the same size as the images, and take a proportion of image A and image B according to the value of the mask which may be different at each location. So you might have a 0-100 proportion of image A and image B at the top and 100-0 proportion at the bottom, and this would look like a smoothly blended transition from one image at the top to the other image at the bottom. Or, it may be like a largely opaque foreground where you only see one input image, but a transparent window through which you see the other input image. The mask, of the same size as the two input images, is key here and it can assume different values at different locations.
alpha compositing is the most complicated and is best described by Wikipedia
Put another way: blend uses no alpha/transparency channel, just a fixed proportion of each input image present throughout the output image.
paste uses a single alpha channel (the mask) that can vary across the image.
alpha_composite uses two alpha channels, one per input, that can both vary across the image.
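A quick sketch with two solid-colour images (the sizes, colours, and gradient mask here are just for illustration):

```python
from PIL import Image

a = Image.new("RGBA", (200, 200), (255, 0, 0, 255))  # solid red
b = Image.new("RGBA", (200, 200), (0, 0, 255, 255))  # solid blue

# blend: one fixed ratio everywhere (30% a, 70% b)
mixed = Image.blend(a, b, alpha=0.7)

# composite: a greyscale mask sets the proportion per pixel; a vertical
# gradient mask gives a smooth top-to-bottom transition from b to a
mask = Image.linear_gradient("L").resize(a.size)
comp = Image.composite(a, b, mask)

# paste: same idea, but it modifies the target image in place
b2 = b.copy()
b2.paste(a, (0, 0), mask)

# alpha_composite: uses the images' own alpha channels (Porter-Duff "over")
a.putalpha(128)                       # make the red layer half-transparent
over = Image.alpha_composite(b, a)    # a over b
```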

How exactly does dp parameter in cv::HoughCircles work?

I read a similar question on Stack Overflow. I tried, but I still cannot understand how it works.
I read the OpenCV documentation for cv::HoughCircles; here is its explanation of the dp parameter:
Inverse ratio of the accumulator resolution to the image resolution. For example, if dp=1, the accumulator has the same resolution as the input image. If dp=2, the accumulator has half as big width and height.
Here is my question. For example, if dp = 1, the size of the accumulator is the same as the image, so there is a consistent one-to-one match between pixels in the image and positions in the accumulator. But if dp = 2, how does the matching work?
Thanks in advance.
There is no such thing as a one-to-one match here. You have an image with pixels and a Hough space, which is used for voting for circles. This parameter is just a convenient way to specify the size of the Hough space relative to the image size.
Please take a look at this answer for more details.
EDIT:
Your image has (x,y)-coordinates. Your circle Hough space has (a,b,r)-coordinates, where (a,b) are the circle centers and r are the radii. Let's say you find an edge pixel. Now you vote for every circle that could pass through this edge pixel. I found a nice picture of Hough space with a single vote, i.e. a single edge pixel (the continuous case). In practice this vote happens within the 3D accumulator matrix; you can think of it as a rasterization of the continuous case.
Now, as already mentioned, the dp parameter defines the size of this accumulator matrix relative to your image size. The bigger the dp parameter, the lower the resolution of the rasterization. It's like taking photos at different resolutions: if you downsize your photo, multiple pixels collapse into a single one. The same happens if you shrink your accumulator matrix, i.e. increase your dp parameter. Multiple votes for different circle centers (which lie next to each other) and radii (which are of similar size) are now merged, i.e. you get less accurate circle parameters, but a more "robust" voting.
Please be aware that the OpenCV implementation is a little bit more complicated (they use the Hough gradient method instead of the standard Hough transform) but the considerations still apply.
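To see the effect, one can run the same detection at dp=1 and dp=2 and compare. A minimal sketch (the file name and parameter values are just placeholders):

```python
import cv2

img = cv2.imread("coins.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical test image
img = cv2.medianBlur(img, 5)  # denoise so spurious edges don't pollute the votes

# dp=1: accumulator at full image resolution; dp=2: half resolution,
# so nearby votes merge and the circle parameters become coarser
for dp in (1, 2):
    circles = cv2.HoughCircles(img, cv2.HOUGH_GRADIENT, dp=dp, minDist=20,
                               param1=100, param2=30, minRadius=5, maxRadius=80)
    found = 0 if circles is None else circles.shape[1]
    print(f"dp={dp}: {found} circles")
```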

Determining pixel coordinates across display resolutions

If a program displays a pixel at X,Y on a display with resolution A, can I precisely predict at what coordinates the same pixel will display at resolution B?
MORE INFORMATION
The 2 display resolutions are:
A-->1366 x 768
B-->1600 x 900
Dividing the max resolutions in each direction yields:
X-direction scaling factor = 1600/1366 = 1.171303075
Y-direction scaling factor = 900/768 = 1.171875
Say for example that the only red pixel on display A occurs at pixel (1,1). If I merely scale up using these factors, then on display B, that red pixel will be displayed at pixel (1.171303075, 1.171875). I'm not sure how to interpret that, as I'm used to thinking of pixels as integer values. It might help if I knew the exact geometry of pixel coordinates/placement on a screen. e.g., do pixel coordinates (1,1) mean that the center of the pixel is at (1,1)? Or a particular corner of the pixel is at (1,1)? I'm sure diagrams would assist in visualizing this--if anyone can post a link to helpful resources, I'd appreciate it. And finally, I may be approaching this all wrong.
Thanks in advance.
I think your problem is related to the field of scaling/resampling images. Bitmap, or raster, images are digital photographs, the most common form of representing natural images that are rich in detail. The term bitmap refers to how a given pattern (bits in a pixel) maps to a specific color. A bitmap image takes the form of an array, where the value of each element, called a pixel (picture element), corresponds to the color of that region of the image.
Sampling
When measuring the value for a pixel, one takes the average color of an area around the location of the pixel. A simplistic model is sampling over a square; a more accurate measurement is a weighted Gaussian average. When perceiving a bitmap image, the human eye blends the pixel values together, recreating an illusion of the continuous image they represent.
Raster dimensions
The number of horizontal and vertical samples in the pixel grid is called the raster dimensions, specified as width x height.
Resolution
Resolution is a measurement of sampling density; the resolution of a bitmap image gives the relationship between pixel dimensions and physical dimensions. The most commonly used measurement is ppi, pixels per inch.
Scaling / Resampling
Image scaling is the process of creating an image with dimensions different from those of the one we have; another name for scaling is resampling. When resampling, algorithms try to reconstruct the original continuous image and sample it on a new grid. There are two kinds of scaling: up and down.
Scaling image down
The process of reducing the raster dimensions is called decimation; this can be done by averaging the values of the source pixels that contribute to each output pixel.
Scaling image up
When we increase the image size, we actually want to create sample points between the original sample points of the original raster. This is done by interpolating the values in the sample grid, effectively guessing the values of the unknown pixels. The interpolation can be nearest-neighbor, bilinear, bicubic, and so on. But the scaled-up/down image must still be represented on a discrete grid.
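Coming back to the original question about predicting coordinates: a minimal sketch, assuming the common convention that pixel (x, y) covers the unit square [x, x+1) x [y, y+1) with its centre at (x + 0.5, y + 0.5):

```python
def map_pixel(x, y, src=(1366, 768), dst=(1600, 900)):
    """Map integer pixel (x, y) from resolution `src` to `dst`, assuming
    pixel (x, y) covers [x, x+1) x [y, y+1) with centre (x + 0.5, y + 0.5)."""
    sx, sy = dst[0] / src[0], dst[1] / src[1]
    # scale the centre, then take the pixel whose area contains it
    return int((x + 0.5) * sx), int((y + 0.5) * sy)

print(map_pixel(1, 1))  # -> (1, 1): the red pixel stays at (1, 1) on display B
```

Under this convention the fractional result (1.171..., 1.171...) is simply the scaled centre, and the destination pixel is whichever integer cell contains it.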

Simple way to check if an image bitmap is blur

I am looking for a "very" simple way to check whether an image bitmap is blurred. I do not need an accurate, complicated algorithm involving FFTs, wavelets, etc. Just a very simple idea, even if it is not accurate.
I've thought of computing the average Euclidean distance between pixel (x,y) and pixel (x+1,y), considering their RGB components, and then using a threshold, but it works very badly. Any other idea?
Don't calculate the average differences between adjacent pixels.
Even when a photograph is perfectly in focus, it can still contain large areas of uniform colour, like the sky for example. These will push down the average difference and mask the details you're interested in. What you really want to find is the maximum difference value.
Also, to speed things up, I wouldn't bother checking every pixel in the image. You should get reasonable results by checking along a grid of horizontal and vertical lines spaced, say, 10 pixels apart.
Here are the results of some tests with PHP's GD graphics functions, using an image from Wikimedia Commons (Bokeh_Ipomea.jpg). The sharpness values are simply the maximum pixel difference values as a percentage of 255 (I only looked at the green channel; you should probably convert to greyscale first). The numbers underneath show how long it took to process the image.
If you want them, here are the source images I used:
original
slightly blurred
blurred
Update:
There's a problem with this algorithm, in that it relies on the image having a fairly high level of contrast as well as sharply focused edges. It can be improved by finding the maximum pixel difference (maxdiff), and then finding the overall range of pixel values in a small area centred on this location (range). The sharpness is then calculated as follows:
sharpness = (maxdiff / (offset + range)) * (1.0 + offset / 255) * 100%
where offset is a parameter that reduces the effects of very small edges so that background noise does not affect the results significantly. (I used a value of 15.)
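Here is a rough Python translation of that approach (the PHP original isn't shown; step=10, offset=15 and the 9x9 window follow the values mentioned above, the rest of the structure is my own):

```python
import numpy as np

def sharpness(gray, step=10, offset=15, half=4):
    """Sharpness score per the formula above; `gray` is a 2-D uint8 array.
    step=10 samples a sparse grid of lines and half=4 gives the 9x9 window."""
    g = gray.astype(np.int32)
    # differences along horizontal and vertical sample lines
    dx = np.abs(np.diff(g[::step, :], axis=1))
    dy = np.abs(np.diff(g[:, ::step], axis=0))
    maxdiff = int(max(dx.max(), dy.max()))
    # for simplicity, centre the window on the strongest horizontal edge
    r, c = np.unravel_index(np.argmax(dx), dx.shape)
    r *= step
    win = g[max(0, r - half):r + half + 1, max(0, c - half):c + half + 1]
    rng = int(win.max()) - int(win.min())
    return maxdiff / (offset + rng) * (1.0 + offset / 255) * 100
```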
This produces fairly good results. Anything with a sharpness of less than 40% is probably out of focus. Here are some examples (the locations of the maximum pixel difference and the 9×9 local search areas are also shown for reference):
[example images]
The results still aren't perfect, though. Subjects that are inherently blurry will always result in a low sharpness value:
[example image]
Bokeh effects can produce sharp edges from point sources of light, even when they are completely out of focus:
[example image]
You commented that you want to be able to reject user-submitted photos that are out of focus. Since this technique isn't perfect, I would suggest notifying the user when an image appears blurry rather than rejecting it outright.
I suppose that, philosophically speaking, all natural images are blurry... How blurry, and to what degree, depends on your application. Broadly speaking, the blurriness or sharpness of an image can be measured in various ways. As a first easy attempt, I would check the energy of the image, defined as the normalised sum of the squared pixel values:
E = (1/N) * Σ I², where I is the image and N the number of pixels (defined for grayscale)
First you may apply a Laplacian of Gaussian (LoG) filter to detect the "energetic" areas of the image and then check the energy. The blurry image should show considerably lower energy.
See an example in MATLAB using the classic grayscale Lena image:
This is the original image
This is the blurry image, blurred with a Gaussian filter
This is the LoG image of the original
And this is the LoG image of the blurry one
If you just compute the energy of the two LoG images you get:
E_or = 1265 (original), E_bl = 88 (blurred)
which is a huge difference...
Then you just have to select a threshold for how much energy is acceptable for your application...
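A minimal sketch of this idea in Python rather than MATLAB (the sigma value is an assumption; the threshold must be tuned for your data):

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def log_energy(gray, sigma=2.0):
    """Normalised energy of the LoG response: (1/N) * sum(LoG(I)^2).
    Sharp images score much higher than blurred ones."""
    response = gaussian_laplace(gray.astype(np.float64), sigma=sigma)
    return np.mean(response ** 2)

# usage: flag an image as blurry when the energy falls below a tuned threshold
# is_blurry = log_energy(img) < THRESHOLD
```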
Calculate the average L1 distance between adjacent pixels:
N1=1/(2*N_pixel) * sum( abs(p(x,y)-p(x-1,y)) + abs(p(x,y)-p(x,y-1)) )
Then the average squared L2 distance:
N2= 1/(2*N_pixel) * sum( (p(x,y)-p(x-1,y))^2 + (p(x,y)-p(x,y-1))^2 )
Then the ratio N2 / (N1*N1) is a measure of blurriness. This is for grayscale images; for color images, do this for each channel separately.
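A direct sketch of these two formulas (assuming a 2-D grayscale array):

```python
import numpy as np

def blurriness(gray):
    """N2 / N1^2 from the formulas above (grayscale; for colour images,
    compute this per channel)."""
    g = gray.astype(np.float64)
    dx = g[:, 1:] - g[:, :-1]   # horizontal neighbour differences
    dy = g[1:, :] - g[:-1, :]   # vertical neighbour differences
    n = g.size
    n1 = (np.abs(dx).sum() + np.abs(dy).sum()) / (2 * n)
    n2 = ((dx ** 2).sum() + (dy ** 2).sum()) / (2 * n)
    return n2 / (n1 * n1)
```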

resize png image using rmagick without losing quality

I need to resize a 200x200 image to 60x60 with RMagick without losing image quality. Currently I am doing the following for a PNG image:
require 'rmagick'

img = Magick::Image.from_blob(params[:file].read)[0]
img = img.resize(60, 60)  # the resize step itself (omitted in the original snippet)
img.write(RootPath + params[:dir_str] + "/#{filename}") do
  self.quality = 100
  # self.compression = Magick::ZipCompression
end
I am losing sharpness in the resulting image. I want to resize while losing as little image quality as possible.
I tried setting the quality and different compressions, but none of them seems to work.
All the resulting images still look as if a layer of colors has been stripped away, and the text characters lose their sharpness.
Could anyone give me some instructions for resizing PNG images?
You're resizing a picture from 200x200 = 40,000 pixels down to 60x60 = 3,600 pixels - that is, less than a tenth of the resolution - and you're surprised that you lose image quality? Think of it this way: could you take a 16x16 image and resize it to 5x5 with no loss of quality? That is about the same ratio as you are attempting here.
If what you want to do were actually possible, then every picture could be reduced to one pixel with no loss of quality.
As for the art designer's 60x60 image being better quality than yours: it depends on the original size of the image the art designer worked from. For example, if the art designer started from an 800x800 image, produced your 200x200 image from it, and also reduced the original 800x800 image to 60x60 in Photoshop, then that 60x60 image will be better quality than the one you have. This is because your 60x60 image has gone through two lossy reductions: one to get to 200x200 and a second to go from 200x200 to 60x60. Necessarily this will be worse than an image resized directly from the original.
You could convert the PNG to a vector image, resize the vector to 60x60, then convert the vector back to PNG. Almost lossless.
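If you stay with raster images, the practical advice is to pick a high-quality downsampling filter and optionally sharpen afterwards. A sketch using Pillow for illustration (the file names are hypothetical; RMagick's resize accepts a filter argument, e.g. Magick::LanczosFilter, for a similar effect):

```python
from PIL import Image, ImageFilter

img = Image.open("logo_200.png")  # hypothetical 200x200 input
small = img.resize((60, 60), Image.LANCZOS)  # high-quality downsampling filter
# a light unsharp mask can restore some edge crispness lost in the resize
small = small.filter(ImageFilter.UnsharpMask(radius=1, percent=80, threshold=2))
small.save("logo_60.png")
```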
