Efficiently rendering many images on a unified canvas - ruby-on-rails

I'm lost and in need of direction.
We're trying to render a bunch of small images (X) onto a single, unified canvas using ImageMagick.
Each X can be one of five sizes: 20x20, 40x40, 60x60, 80x80 or 100x100. The large image's width is always set to 600, but the height can be adjusted as needed.
We can be using as few as 10 or as many as 10,000 X's at any given moment.
Currently, the bare-bones proof of concept we're working with goes something like:
# `img` is the 600px-wide destination canvas (a Magick::Image) created beforehand,
# and `x`/`y` start at 0 and track the current paste position.
images.each do |filename|
  tile = Magick::Image.read("#{RAILS_ROOT}/public/images/#{filename}").first
  w = tile.columns
  h = tile.rows
  pixels = tile.export_pixels(0, 0, w, h, "RGB")
  img.import_pixels(x, y, w, h, "RGB", pixels)
  x += w
end
...it's simple and stupid, but it does output a series of images merged into one. Almost there ;-)
Does anyone know of an effective algorithm with which we can iterate over many X's and place them side by side, spanning multiple rows while still optimizing the space? The goal here is to create a single image without white space, constructed from all the small images.
As stated, I would love any feedback you guys might have on this. Pointers? Ideas? Examples?
Thanks.

It seems like right now the images are noise: what you really want to solve is a tiling problem. The tiles come in a few fixed sizes and you want to place them on a surface of fixed width and minimum height. This can be solved globally with DFS, BFS, A*, etc. You could also look at a local method like simulated annealing or hill climbing, depending on whether you need the global optimum or just a good, reasonable solution. You can find implementations of these methods in the online source repository for AIMA.
Once you have solved the tiling problem you can overlay the images with a piece of code similar to the one you're showing.
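As a starting point before reaching for full search methods, here is a minimal greedy "shelf" packing sketch in Python (not from the original answer; the 600px width and tile sizes come from the question, everything else is assumed). It sorts tiles by decreasing height and fills rows left to right, which keeps wasted space small when all tile sizes are multiples of 20:

# Greedy shelf packing: sort tiles by height, fill 600px-wide rows left to right.
CANVAS_W = 600

def pack(tiles):
    """tiles: list of (w, h); returns (positions, canvas_height).

    positions is a list of (x, y, w, h) in the same order as the sorted tiles.
    """
    positions = []
    x = y = shelf_h = 0
    for w, h in sorted(tiles, key=lambda t: t[1], reverse=True):
        if x + w > CANVAS_W:          # row is full, start a new shelf
            y += shelf_h
            x, shelf_h = 0, 0
        positions.append((x, y, w, h))
        x += w
        shelf_h = max(shelf_h, h)
    return positions, y + shelf_h

if __name__ == "__main__":
    import random
    sizes = [20, 40, 60, 80, 100]
    tiles = [(s, s) for s in random.choices(sizes, k=50)]
    placed, height = pack(tiles)
    print(f"canvas: {CANVAS_W}x{height}, placed {len(placed)} tiles")

The positions it returns map directly onto the x/y arguments of import_pixels in the snippet above.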

Related

Comparing Images via pixels

I was playing around with images and came across a little game I tried to create.
You see an image (for simplicity, let's say a circle) and you have to redraw the circle as exactly as possible on top of it.
All of that works in my little project already.
I want to be able to tell what percentage of the image was recreated correctly (and again, for simplicity, no colours are needed; it's always black and white).
Could I just count the black pixels overlaying the image, subtract the ones that are not, and divide by the number of black pixels in the original?
This would look like this I guess:
ratio = (correctPixelCount - wrongPixelCount) / originalPixelCount
If yes, how would I go about getting each pixel and comparing them?
If no, what else could I do?
PS: I already tried an image-comparison CocoaPod called AIImageCompare.
Unfortunately, it crashes for some unknown reason.
Thank you!
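No answer was recorded here, but the ratio above is easy to prototype. A minimal sketch, assuming both images are the same size and effectively black-and-white (written in Python with Pillow rather than Cocoa, and with hypothetical file names):

from PIL import Image

def recreation_ratio(original_path, drawing_path, black_threshold=128):
    """ratio = (correct - wrong) / original_black, clamped at 0."""
    orig = Image.open(original_path).convert("L")
    drawn = Image.open(drawing_path).convert("L")

    # Treat anything darker than the threshold as a "black" pixel.
    orig_black = {i for i, p in enumerate(orig.getdata()) if p < black_threshold}
    drawn_black = {i for i, p in enumerate(drawn.getdata()) if p < black_threshold}

    correct = len(orig_black & drawn_black)   # black in both images
    wrong = len(drawn_black - orig_black)     # black where the original is white
    return max(0.0, (correct - wrong) / len(orig_black))

print(recreation_ratio("circle.png", "my_drawing.png"))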

Finding word's bounding box on a low quality image

I'm trying to get a bounding box for the word "ЛИЛИЯ" in this image, using opencv.
(source: litprom.ru)
I have been experimenting with cv::findContours() and different thresholding algorithms for a couple of days, but cannot get any satisfying results.
So, what do I know about this word:
letters are of similar size;
letters' height is in range: 40px — 90px;
word is oriented horizontally (±5°);
there is one and only one word on this image;
this word does not intersect image's border (it's fully visible);
different parts of image may have different luminosity;
hotspots (totally white areas) may be present in the image.
English is not my native language, so I'm sorry if the question is not properly explained.
If someone needs more images to answer this question, I have at least a dozen more.
Check out the stroke width transform. It is used for text detection.
You can preprocess your image with adaptiveThreshold. You should use a block size a little bit bigger than your biggest character; I tried it on your image with 91 and it gave good results. Then you can use findContours and filter the blobs/contours by their height. Note that the letters will still be connected to one another, so you cannot really filter by width.
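A rough sketch of that recipe in Python/OpenCV (the block size of 91 comes from the answer, the 40-90px height bounds come from the question, and the file name and remaining parameters are assumptions):

import cv2

img = cv2.imread("word.jpg", cv2.IMREAD_GRAYSCALE)

# Adaptive threshold with a block size a bit larger than the tallest expected letter.
binary = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                               cv2.THRESH_BINARY_INV, 91, 10)

contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Keep only blobs whose height matches the expected letter height (40-90 px).
boxes = [cv2.boundingRect(c) for c in contours]
letter_boxes = [(x, y, w, h) for (x, y, w, h) in boxes if 40 <= h <= 90]

# The word's bounding box is the union of the surviving letter boxes.
if letter_boxes:
    x0 = min(x for x, y, w, h in letter_boxes)
    y0 = min(y for x, y, w, h in letter_boxes)
    x1 = max(x + w for x, y, w, h in letter_boxes)
    y1 = max(y + h for x, y, w, h in letter_boxes)
    print("word bounding box:", (x0, y0, x1 - x0, y1 - y0))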

Checking for overlapping images with a hole in an image

I have two image views. They are "puzzle pieces" and I want to test whether one fits inside the other, not whether the frames overlap. I guess it's a CGRect thing... but it seems like those only test the outer boundaries. Any ideas would be appreciated. Thanks.
Just brainstorming here... Maybe this will get you thinking of something that will work for you. If the images do not overlap, then drawing image A on top of image B will result in the same image as drawing image B on top of image A. If they overlap, that will result in different images. You could do something like draw image A, then B. Create a checksum of the result, draw A again, and checksum that. If the checksums match, the puzzle piece fits.
If you have a 1-bit mask that represents each image, then ORing them together and XORing them together will have the same result if they don't overlap and different results if they do.
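That check is a one-liner once you have the masks. A small sketch using NumPy boolean arrays as stand-ins for the 1-bit masks (how you derive the masks, e.g. from the images' alpha channels, is left out):

import numpy as np

def overlaps(mask_a, mask_b):
    """True if the two same-sized 1-bit masks share any set pixel.

    OR and XOR only differ on pixels that are set in both masks,
    so comparing them is equivalent to checking the AND directly.
    """
    return not np.array_equal(mask_a | mask_b, mask_a ^ mask_b)

a = np.zeros((4, 4), dtype=bool); a[0:2, 0:2] = True
b = np.zeros((4, 4), dtype=bool); b[1:3, 1:3] = True
print(overlaps(a, b))  # True: the masks share the pixel at (1, 1)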
Do you know the correct order of the pieces beforehand? Maybe it's better to assign a tag to each UIImageView representing the image's index number. Then you just create a kind of mesh and check which cell the piece was placed in. If the cell number and the UIImageView tag match, then this is the right place.
If you have only two images and one must fit into a specific area in the other, you could store the frame of this hole and check whether the piece is placed somewhere around the centre of that frame. It'll be more user-friendly, because when you're checking pixels or bit masks you require the user to be extremely precise; otherwise your comparison code has to allow some shift and becomes very complicated.
But if you don't want to hardcode the hole frame you could calculate it dynamically (just find the transparent areas in the image). Anyway, this solution will be more effective than checking bit matches on the fly.
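For the dynamic approach, finding the hole boils down to taking the bounding box of the fully transparent pixels in the "board" image. A small illustrative sketch (in Python with Pillow and NumPy rather than UIKit; the file name and alpha cutoff are made up):

from PIL import Image
import numpy as np

def hole_frame(path, alpha_threshold=10):
    """Bounding box (x, y, w, h) of the transparent region in an RGBA image."""
    alpha = np.array(Image.open(path).convert("RGBA"))[:, :, 3]
    ys, xs = np.nonzero(alpha <= alpha_threshold)
    if len(xs) == 0:
        return None  # no transparent pixels, so no hole
    x, y = xs.min(), ys.min()
    return int(x), int(y), int(xs.max() - x + 1), int(ys.max() - y + 1)

print(hole_frame("board_with_hole.png"))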

EmguCV Shape Detection affected by Image Size

I'm using the Emgu shape detection example application to detect rectangles on a given image. The dimensions of the resized image appear to impact the number of shapes detected even though the aspect ratio remains the same. Here's what I mean:
Using (400,400), actual img size == 342,400
Using (520,520), actual img size == 445,520
Why is this so? And how can the optimal value be determined?
Thanks
I replied to your post on EMGU but figured you haven't checked back, so here it is. The shape detection works on the principle of thresholding unlikely matches, which prevents lots of false classifications. This is true for many image processing algorithms. Basically, there are no perfect settings, and a designer must select the most appropriate settings to produce the most desirable results, i.e. match the most objects without claiming there are more than there actually are.
You will need to adjust each variable individually to see what kind of results you get. Start off with the edge detection.
Image<Gray, Byte> cannyEdges = gray.Canny(cannyThreshold, cannyThresholdLinking);
Have a look at your smaller image and see what the difference is between the rectangles that are detected and the one that isn't. You could be missing an edge or a corner, which is why it's not classified. If you are, adjust cannyThreshold and observe the results; if they're good, keep the new value :) if bad :( go back to the original value. Once satisfied, adjust cannyThresholdLinking and observe.
You will keep repeating this until you get a preferred image. The advantage here is that you have three items to compare, so you continue until the item that's not being recognised matches the other two.
If the results are similar, which is likely since it's a black-and-white image, you'll need to move on to the Hough line detection.
LineSegment2D[] lines = cannyEdges.HoughLinesBinary(
    1,              // Distance resolution in pixel-related units
    Math.PI / 45.0, // Angle resolution measured in radians
    20,             // Threshold
    30,             // Min line width
    10              // Gap between lines
    )[0];           // Get the lines from the first channel
Use the same method of adjusting one value at a time and observing the output, and you will hopefully find the settings you need. Never jump in with both feet and change all the values, as you will never know whether you're improving the accuracy or not. Finally, if all else fails, look at the section that inspects the Hough results for a rectangle:
if (angle < 80 || angle > 100)
{
    isRectangle = false;
    break;
}
There are fewer variables to change here, as Hough should do all the work for you, but it could still all work out at this stage.
I'm sorry that there is no straightforward answer, but I hope you keep at it and solve the problem. Otherwise, you could always resize the image each time.
Cheers
Chris

What processing steps should I use to clean photos of line drawings?

My usual method of 100% contrast and some brightness adjustment to tweak the cutoff point usually works reasonably well to clean up photos of small sub-circuits or equations for posting on E&R.SE; however, sometimes it's not quite that great, like with this image:
What other methods besides contrast (or instead of) can I use to give me a more consistent output?
I'm expecting a fairly general answer, but I'll probably implement it in a script (that I can just dump files into) using ImageMagick and/or PIL (Python) so if you have anything specific to them it would be welcome.
Ideally a better source image would be nice, but I occasionally use this on other folk's images to add some polish.
The first step is to equalize the illumination differences in the image while taking into account the white balance issues. The theory here is that the brightest part of the image within a limited area represents white. By blurring the image beforehand we eliminate the influence of noise in the image.
from PIL import Image
from PIL import ImageFilter
im = Image.open(r'c:\temp\temp.png')
white = im.filter(ImageFilter.BLUR).filter(ImageFilter.MaxFilter(15))
The next step is to create a grey-scale image from the RGB input. By scaling to the white point we correct for white balance issues. By taking the max of R,G,B we de-emphasize any color that isn't a pure grey such as the blue lines of the grid. The first line of code presented here is a dummy, to create an image of the correct size and format.
grey = im.convert('L')
width,height = im.size
impix = im.load()
whitepix = white.load()
greypix = grey.load()
for y in range(height):
    for x in range(width):
        greypix[x, y] = min(255, max(255 * impix[x, y][0] // whitepix[x, y][0],
                                     255 * impix[x, y][1] // whitepix[x, y][1],
                                     255 * impix[x, y][2] // whitepix[x, y][2]))
The result of these operations is an image that has mostly consistent values and can be converted to black and white via a simple threshold.
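The answer stops short of showing that last threshold step; with Pillow it is a one-liner (the cutoff of 128 and the output path are placeholders, not values from the answer):

# Binarize the equalized grey image with a fixed threshold (128 is a guess to tune).
bw = grey.point(lambda p: 255 if p > 128 else 0, mode='1')
bw.save(r'c:\temp\cleaned.png')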
Edit: It's nice to see a little competition. nikie has proposed a very similar approach, using subtraction instead of scaling to remove the variations in the white level. My method increases the contrast in the regions with poor lighting, and nikie's method does not - which method you prefer will depend on whether there is information in the poorly lighted areas which you wish to retain.
My attempt to recreate this approach resulted in this:
for y in range(height):
    for x in range(width):
        greypix[x, y] = min(255, max(255 + impix[x, y][0] - whitepix[x, y][0],
                                     255 + impix[x, y][1] - whitepix[x, y][1],
                                     255 + impix[x, y][2] - whitepix[x, y][2]))
I'm working on a combination of techniques to deliver an even better result, but it's not quite ready yet.
One common way to remove the different background illumination is to calculate a "white image" from the image, by opening the image.
In this sample Octave code, I've used the blue channel of the image, because the lines in the background are least prominent in this channel (EDITED: using a circular structuring element produces fewer visual artifacts than a simple box):
src = imread('lines.png');
blue = src(:,:,3);
mask = fspecial("disk",10);
opened = imerode(imdilate(blue,mask),mask);
Result:
Then subtract this from the source image:
background_subtracted = opened-blue;
(contrast enhanced version)
Finally, I'd just binarize the image with a fixed threshold:
binary = background_subtracted < 35;
How about detecting edges? That should pick up the line drawings.
Here's the result of Sobel edge detection on your image:
If you then threshold the image (using either an empirically determined threshold or Otsu's method), you can clean up the image using morphological operations (e.g. dilation and erosion). That will help you get rid of broken/double lines.
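A rough sketch of that pipeline in Python/OpenCV (the file names are placeholders and the kernel size is one reasonable choice, not the answerer's settings):

import cv2
import numpy as np

img = cv2.imread("drawing_photo.jpg", cv2.IMREAD_GRAYSCALE)

# Sobel gradients in x and y, combined into an edge-magnitude image.
gx = cv2.Sobel(img, cv2.CV_32F, 1, 0)
gy = cv2.Sobel(img, cv2.CV_32F, 0, 1)
edges = cv2.convertScaleAbs(cv2.magnitude(gx, gy))

# Otsu picks the global threshold automatically.
_, binary = cv2.threshold(edges, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Close small gaps in the strokes, then thin doubled lines back down.
kernel = np.ones((3, 3), np.uint8)
cleaned = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)
cleaned = cv2.erode(cleaned, kernel, iterations=1)

cv2.imwrite("cleaned.png", cleaned)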
As Lambert pointed out, you can pre-process the image using the blue channel to get rid of the grid lines if you don't want them in your result.
You will also get better results if you light the page evenly before you image it (or just use a scanner), because then you don't have to worry about global vs. local thresholding as much.
