How to generate an image using parts of another image? - image-processing

Before clarifying my question, please just consider these two generative portraits by Sergio Albiac:
Since I really like this kind of portrait, I wanted to find a way of producing them myself.
I don't have much for now; the only things I can deduce from these examples are:
each portrait takes at least two inputs: one target image (the portrait) and one or more source images (pictures of text) whose parts are used to generate a stylized portrait
matching the parts from source images with the target image is done using template matching
What I'd like to know is how to proceed, what things to learn and look for? What other concepts should I consider before trying to make this work?
Cheers

The Cover Maker plugin for Fiji/ImageJ does a similar thing.
It first builds a database from your source images indexed according to color/intensity. These source images are then used to build your target image. (Contrary to your example images, it only works with a constant tile size throughout the image, though.)
Have a look at the Python source code for details.
EDIT: If you want to avoid the constant tile size, you could use e.g. a quadtree segmentation or a k-means segmentation to get regions of similar intensity/texture in your target image, and then do the template matching for the segmented regions.
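As a rough illustration of the index-then-match idea (this is not the plugin's actual code), here is a minimal sketch. It assumes grayscale images stored as lists of rows, source tiles that are already cut to the tile size, and mean intensity as the only matching criterion. The function names are made up for this example.

```python
from statistics import mean


def tile_mean(tile):
    """Average intensity of a tile given as a list of rows."""
    return mean(v for row in tile for v in row)


def split_tiles(image, size):
    """Yield (row, col, tile) for every full size x size tile of the image."""
    for r in range(0, len(image) - size + 1, size):
        for c in range(0, len(image[0]) - size + 1, size):
            yield r, c, [row[c:c + size] for row in image[r:r + size]]


def build_mosaic(target, source_tiles, size):
    """Replace each target tile with the source tile of closest mean intensity."""
    # "Database": every source tile indexed by its mean intensity.
    index = [(tile_mean(t), t) for t in source_tiles]
    out = [row[:] for row in target]  # work on a copy of the target
    for r, c, tile in split_tiles(target, size):
        m = tile_mean(tile)
        _, best = min(index, key=lambda entry: abs(entry[0] - m))
        for dr in range(size):
            out[r + dr][c:c + size] = best[dr]
    return out
```

A real implementation would likely match on color or texture as well, and would use a faster lookup than a linear scan over the index.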
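A quadtree segmentation can be sketched in a few lines. The version below is only one possible reading of the idea: it assumes a grayscale image stored as a list of rows and uses the intensity range (max minus min) as the homogeneity test. The function name and the max_range parameter are hypothetical.

```python
def quadtree(image, x, y, w, h, max_range, min_size=1):
    """Return a list of (x, y, w, h) regions of roughly uniform intensity.

    A region is split into four quadrants while its intensity range exceeds
    max_range and both sides are still larger than min_size.
    """
    values = [image[j][i] for j in range(y, y + h) for i in range(x, x + w)]
    uniform = max(values) - min(values) <= max_range
    if uniform or w <= min_size or h <= min_size:
        return [(x, y, w, h)]
    hw, hh = w // 2, h // 2
    quadrants = [
        (x, y, hw, hh),                    # top-left
        (x + hw, y, w - hw, hh),           # top-right
        (x, y + hh, hw, h - hh),           # bottom-left
        (x + hw, y + hh, w - hw, h - hh),  # bottom-right
    ]
    regions = []
    for nx, ny, nw, nh in quadrants:
        regions += quadtree(image, nx, ny, nw, nh, max_range, min_size)
    return regions
```

Detailed areas of the target end up as many small regions and flat areas as a few large ones, which gives the variable tile size seen in the example portraits.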

Related

How to detect text in a photo

I am researching the best way to detect text in a photo using open source libraries.
I think the standard way is as follows (note: steps 1 - 4 all use OpenCV):
1) detect outline of document
2) transform document so it's flat and cropped, using said outline
3) Make the background of document white, using a filter
4) Feed resulting image to Tesseract
Is this the optimum process, or is there a better way, or better tools?
Also, what happens if the photo doesn't have a document outline (it's possible that steps 1 & 2 are redundant)?
Is there any way to automatically detect document orientation (i.e. portrait / landscape)?
I think your process is fine. I've used a similar process for an Android project.
I think that the only way you can discover if a document is portrait/landscape is to reason with the length of the sides of the bounding box of your outline.
I don't think there's an automatic way to do this, but maybe you can find the outermost contour that can be approximated with a 4-segment polyline (all doable in OpenCV). To get this you'll have to work with contour hierarchy and contour approximation (see cv2.approxPolyDP).
This is how I would go for automatic outline detection. As I said, the rest of your algorithm seems just fine to me.
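The side-length reasoning mentioned above can be sketched as follows. This assumes the four corners of the outline have already been found (for example from the polygon approximation) and are ordered top-left, top-right, bottom-right, bottom-left. The function name is hypothetical. Note that it only classifies the shape of the outline; it cannot tell whether the text itself is rotated.

```python
from math import dist


def outline_orientation(corners):
    """Classify a quadrilateral outline as 'portrait' or 'landscape'.

    corners: four (x, y) points ordered top-left, top-right,
    bottom-right, bottom-left (an assumption of this sketch).
    """
    tl, tr, br, bl = corners
    # Average opposite sides so mild perspective distortion matters less.
    width = (dist(tl, tr) + dist(bl, br)) / 2
    height = (dist(tl, bl) + dist(tr, br)) / 2
    return "portrait" if height >= width else "landscape"
```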
PS. I'll leave my Android project GitHub link. I don't know if it will be useful to you, but there I specify the outline by dragging some handles, then transform the image and feed it to Tesseract, using Java and OpenCV. Yes, it's a very bad idea to do that in the main thread of an Android app, and yes, the app is not finished. I just wanted to experiment with OCR, so I didn't care much about performance and usability, since this was not intended for use, just for studying.
Look up the stroke width transform.
What this does is detect pairs of opposite edges that stay at more or less the same distance from each other, i.e. strokes of roughly constant width. That picks up things like drainpipes (which can be eliminated in a later pass) but also the majority of text. Whilst conceptually it's similar to a distance transform, the published method uses rather ad hoc normal projection methods and Canny edge detection.

Perceptual Image Comparison

I'm trying to do image comparison to detect changes in a video processing application. These are two images that look identical to me, but are different according to both
http://pdiff.sourceforge.net/
and http://www.itec.uni-klu.ac.at/lire/nightly/api/net/semanticmetadata/lire/imageanalysis/LireFeature.html
Can anyone explain the difference? Eventually I need to find a library that can detect differences that doesn't have any false positives.
The two images are different.
I used GIMP (open source) to stack the two images one on top of the other and set the top layer to difference mode. It showed an almost entirely black image, i.e. very little difference. I then used Curves to raise the tones, which revealed what seem to be JPEG artifacts, even though the files given are PNG. I recommend GIMP and sometimes I use it instead of Photoshop.
Using GIMP to do a blink comparison between layers at 400% view, I would guess that the first image is closer to the original. The second may be a saved copy of the first, or a copy of the original saved at a lower quality setting.
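The same difference-then-boost check can be approximated in code. The sketch below assumes two grayscale images of equal size stored as lists of rows, and it replaces the manual Curves adjustment with a simple linear stretch, so it is an approximation of the steps described here rather than what GIMP does internally.

```python
def amplified_difference(a, b):
    """Per-pixel absolute difference, stretched so the largest value is 255."""
    diff = [[abs(p - q) for p, q in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]
    peak = max(max(row) for row in diff)
    if peak == 0:
        return diff  # the images are identical, nothing to amplify
    # Linear stretch: faint differences become clearly visible.
    return [[v * 255 // peak for v in row] for row in diff]
```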
It seems that the metadata has been stripped off both images (haven't done a definitive look), so no clues there.
There was a program called Unique Filer that I used for years. It is tunable and rather good. But any comparator is likely to generate a number of false positives if you tune it well enough to make sure it doesn't miss duplicates. If you only want to catch images that are very similar like this pair, then you can tune it very tightly. It is old and may not work on Windows 7 or later.
I would like to find good image checkers / comparators too. I've considered writing my own program.

Valid technique for scalable graphics on iOS?

A little background: I'm working on an iOS app that has a variety of status icons for various states. These icons are used in a variety of places and sizes including as UITableViewCell imageViews, as custom MKMapAnnotations and a few other spots. I actually have a couple sets which include a more static status icon as well as ones that have dynamic text injected into the design.
So at first I went the conventional route of using static raster assets, but because the sizes were dynamic this wasn't always the best solution and I wasn't thrilled with the quality of the scaling using CGAffineTransforms. So instead I changed gears a bit and tried something else:
Created a custom UIView subclass for each high level class of icon. It takes as input the model object that derives the status from (I suppose I could have also just used an enum and loaded this into some kind of model constructor but this is how I did it) so it can decide what it needs to draw, then does the necessary drawing in drawRect. Since all of the drawing is based on the view bounds it scales to any reasonable dimensions.
Created a Category which has class method constructors that take the model inputs as well as the size you want to use and constructs the custom views.
Since I also wanted the option to have rasterized versions of these icons to plug into certain places (such as a UITableViewCell imageView) I also created constructors that build the view and return a UIImage using the fast iOS7 snapshotting functions.
So what does this give me? Well here's the pros/cons that I can see.
Pros
Completely scalable graphics that can easily be used in a variety of different scenarios and contexts.
Easy compatibility with adding dynamic info to the graphics such as text. Because I have the exact shape data on everything I'm drawing I don't need to guesstimate on the bounds for a text box since I know how everything is laid out.
Compatibility with situations where I might want a rasterized asset but I still get all the advantages of the dynamic view since I'm not rasterizing it till I need it.
Reduces the size of the application since I don't need to include raster assets.
Cons
The workflow for creating the draw code in the first place isn't ideal. For simple stuff I can do it straight in code but for more complex things I'll need to create the vector asset in Illustrator or Sketch then bring it into PaintCode and clean up the generated draw code into something more streamlined. This is not the most ideal process.
So the question is: does anyone have any better suggestions for how to deal with this sort of situation? I haven't found an enormous amount of material on techniques for this sort of thing and I'm wondering if I'm missing a better way of handling this or if there are any hidden gotchas here...performance doesn't seem to be an issue from my testing with my approach but I haven't tested it on the iPad3 or iPhone 4 yet so there could still be some unknowns.
You could try SVGKit, which draws SVG files, and can export to a UIImage, if desired.

Pixel-perfect acceptance testing on iOS

I'm given exact size .png renders from Application Design showing exactly what my app should look like on Retina 4", Retina 3.5", etc.
Would like to automate a comparison between these "golden master" renders and screenshots of what the app actually looks like when that screen is shown.
Ideally I would like to have something I can run via continuous integration so I can break the build if a .xib gets messed up.
How can I do this?
Already tried:
Used Command-S in iPhone simulator to grab a screenshot suitable for comparison
Used GitHub's excellent image diff interface to manually compare the images
Pulled them up side-by-side in Preview.app, in actual size (Command-0)
Did some research on ImageMagick's comparison capabilities (examples)
Possible approaches:
Getting a screenshot of the app in code is already implemented
Similarly, I'm pretty sure I can find code to simulate a tap on the screen
Might need some way to exclude a mask or bounding box of areas known to not match exactly
Take a look at ios-snapshot-test-case, which was built for something close to this.
It will take a reference image the first time a test is run and then compare subsequent test outputs to the reference image. You could essentially use this but instead of creating reference images from the tests, you supply your own reference images.
In practice, this will be extremely tricky to do correctly. There are subtle differences in how text, gradients, etc are rendered between iOS and whatever tool your designers are using.
I'd check out KIF for functional testing.
You can create a custom test (small example near the end of the readme just above "Use with other testing frameworks") that takes a screenshot and compares it to your expected screenshot for that view. Just call failWithException:stopTest: if it doesn't match.
As you mentioned, you will want to save a mask with each expected screenshot, and apply the mask before comparing. You will always have parts of the screen that won't match, like the time in the status bar at a minimum.
For the comparison itself, here are a couple links:
Building an image mask
Slow, straightforward way to compare two images
OpenCV: I've seen this recommended, but haven't tried it.
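The mask-then-compare idea described above can be sketched independently of any test framework. The example is in Python for brevity and assumes grayscale screenshots stored as lists of rows plus a boolean mask of the same size; the function name and the tolerance parameter are hypothetical.

```python
def masked_mismatch(expected, actual, mask, tolerance=0):
    """Fraction of unmasked pixels that differ by more than tolerance.

    mask[y][x] is True where the pixel must be ignored (for example the
    clock in the status bar).
    """
    compared = mismatched = 0
    for row_e, row_a, row_m in zip(expected, actual, mask):
        for e, a, ignore in zip(row_e, row_a, row_m):
            if ignore:
                continue
            compared += 1
            if abs(e - a) > tolerance:
                mismatched += 1
    return mismatched / compared if compared else 0.0
```

A test would then fail when the returned fraction exceeds whatever threshold the team agrees on. A small non-zero tolerance is one way to absorb minor rendering differences.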
I know this is an older question, but it's worth pointing out that KIF has built a "Perceptual Difference Testing Framework" called Lela:
https://github.com/kif-framework/Lela
If you're already using KIF this is the way to go. I believe it uses somewhat fuzzy image diffing so it may be able to get around the text rendering issues David Grandinetti mentioned. I haven't tried using it against external comps though.
If you're more comfortable with BDD/Cucumber/Gherkin syntax, you should also check out Zucchini, which uses reference images:
http://zucchiniframework.org/
I haven't used it but it's well spoken of.
I suggest you take a look at Visual CI.
It's software built for image comparison in continuous integration.
It has a UI that lets you control the settings, including which parts of your image to compare.
It's kind of new, but may answer your requirements better.

Parsing / Scraping information from an image

I am looking for a library that would help scrape the information from the image below.
I need the current value so it would have to recognise the values on the left and then estimate the value of the bottom line.
Any ideas if there is a library out there that could do something like this? Language isn't really important but I guess Python would be preferable.
Thanks
I don't know of any "out of the box" solution for this and I doubt one exists. If all you have is the image, then you'll need to do some image processing. A simple binarization method (like Otsu binarization) would make it easier to process:
The binarization makes it easier because now the pixels are either "on" or "off."
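For reference, Otsu's method picks the threshold that maximizes the between-class variance of the two pixel groups. A minimal pure-Python sketch, assuming 8-bit grayscale values passed as a flat list (in practice you would more likely call a library routine):

```python
def otsu_threshold(pixels):
    """Return threshold t for 8-bit values; pixels > t count as 'on'."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no pixels in the lower class yet
        w1 = total - w0
        if w1 == 0:
            break     # every pixel is in the lower class
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2  # between-class variance (unnormalized)
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(image, t):
    """Turn a grayscale image (list of rows) into rows of 0/1 values."""
    return [[1 if v > t else 0 for v in row] for row in image]
```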
The locations for the lines can be found by searching for some number of pixels that are all on horizontally (5 on in a row while iterating on the x axis?).
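The run search could look like the sketch below. It assumes a binarized image stored as rows of 0/1 values; the default of 5 pixels comes from the guess above, and the function name is made up.

```python
def find_line_rows(binary, min_run=5):
    """Return indices of rows with at least min_run consecutive 'on' pixels."""
    rows = []
    for y, row in enumerate(binary):
        run = 0
        for v in row:
            run = run + 1 if v else 0  # extend or reset the current run
            if run >= min_run:
                rows.append(y)
                break
    return rows
```

For a real chart, min_run would probably need to be much larger than 5 so that digits and other small shapes are not mistaken for lines.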
Then a possible solution would be to pass the image to an OCR engine to get the numbers (Tesseract is an open source OCR engine, written in C++ and hosted at Google). You'd still have to find out where the numbers are in the image by iterating through it.
Then, you'd have to find where the lines are relative to the keys on the left and do a little math and you can get your answer.
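The "little math" is a linear interpolation between two labelled positions. The sketch below assumes the axis is linear and that you already know the pixel row and numeric value of two axis labels; the names are hypothetical.

```python
def value_at_row(y, label_a, label_b):
    """Estimate the axis value at pixel row y.

    label_a and label_b are (pixel_row, axis_value) pairs taken from two
    labels on the left axis. Assumes a linear (not logarithmic) axis.
    """
    (ya, va), (yb, vb) = label_a, label_b
    return va + (y - ya) * (vb - va) / (yb - ya)
```

For instance, if the label 100 sits at row 20 and the label 0 at row 220, a line found at row 120 corresponds to a value of 50.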
OpenCV is a beefy computer vision library that has things like the binarization. It is also a C++ library.
Hope that helps.
