What foreground selection algorithm does GIMP use? - image-processing

Title says it all.
I would have thought that there would be a reference to the algorithm in the documentation, but there doesn't seem to be one.
I could probably find out from the source, but ... it's a huuuuge code base :-(

Uniform Color Clustering
It does not matter how huge a code base is. It is a matter of seconds to find all files that contain the words foreground and select. A few minutes later you find:
https://github.com/GNOME/gimp/blob/5d79fba8238a27b8691556489898d33b3fa0dda0/libgimp/gimpdrawable_pdb.c#L1065
* Extract the foreground of a drawable using a given trimap.
*
* Image Segmentation by Uniform Color Clustering, see
* https://www.inf.fu-berlin.de/inst/pubs/tr-b-05-07.pdf
Abstract. The following article presents an approach for interactive
foreground extraction in still images. The presented approach has been
derived from color signatures, a technique originated from image
retrieval. The article explains the algorithm and presents some
benchmark results to show the improvements in speed and accuracy
compared to state-of-the-art solutions. The article also describes how
the algorithm can easily be adapted for video segmentation.
https://www.inf.fu-berlin.de/inst/pubs/tr-b-05-07.pdf
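The "find all files that contain the words foreground and select" step can be sketched in a few lines of Python. This is only an illustration: the function name `files_containing_all` and the default extensions are my own choices, and a plain grep over the checkout does the same job.

```python
import os

def files_containing_all(root, words, extensions=(".c", ".h")):
    """Return paths under `root` whose text contains every word in `words`."""
    wanted = [word.lower() for word in words]
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    text = handle.read().lower()
            except OSError:
                continue  # unreadable file: skip it
            if all(word in text for word in wanted):
                matches.append(path)
    return sorted(matches)
```

Called as files_containing_all("gimp", ["foreground", "select"]), where "gimp" is a hypothetical path to a local checkout, it returns the candidate files to read through.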

Related

Quantifying differences in an image sequence to measure activity

I'm looking for a program that will enable me to quantify the difference between images in an image sequence over time.
We are hoping to use timelapse images to measure the activity of tadpoles by comparing how the images change over time. Tracking the movement of individuals isn’t necessary. The tadpoles are dark and the background of the aquarium is light; however, the background isn’t uniform, and some of the decor items like dark rocks and foliage mean that not all the tadpoles are visible at all times.
Basically, I need a program that will allow me to quantify the differences/motion detected in an image sequence (e.g. 209 images) and produce data that can be exported...
Any and all suggestions appreciated!!
Your question is rather vague and you don't supply any images or real indication of what you expect as results, so my answer will not be as thorough as it might otherwise be.
You don't mention any tools you are familiar with, but my recommendation would be Python and OpenCV. Alternatives are probably scikit-image or Python Wand.
In general, when trying to detect movement across a series of images, you would:
try and work out what the background is
look for movement by subtracting, or differencing, frames from the background
clean up the difference image
identify objects - maybe by shape or size or colour
maybe track objects
produce statistics
As regards working out the background, I did an example here by finding the median pixel across all images at each location in the images. There is also an OpenCV tutorial here.
As regards cleaning up images, you can probably remove noise in the background subtraction with a small median filter, say 3x3 or 5x5 depending on the resolution of your images.
As regards detecting tadpoles, you will probably want to use OpenCV findContours() and filter by size, or colour, or circularity. There are some fairly decent tutorials on PyImageSearch. There is also an ImageMagick "Connected Component" analysis to find a tennis player that I did here.
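A minimal sketch of the first steps (median background, differencing, one number per frame), using only NumPy. It assumes the frames are already loaded as a stack of same-sized greyscale arrays; the function names and the threshold of 30 grey levels are illustrative choices, not values tuned for tadpoles.

```python
import numpy as np

def median_background(frames):
    """Per-pixel median over a stack of greyscale frames, shape (n, h, w)."""
    return np.median(frames, axis=0)

def activity_per_frame(frames, threshold=30):
    """Fraction of pixels in each frame that differ from the median
    background by more than `threshold` grey levels."""
    stack = np.asarray(frames, dtype=np.float64)
    background = median_background(stack)
    moving = np.abs(stack - background) > threshold
    # Flatten each frame's boolean mask and average it: one value per frame.
    return moving.reshape(len(stack), -1).mean(axis=1)
```

The returned array has one value per image, so it can be written out with np.savetxt or similar and plotted against time. An animal that stays still for most of the sequence ends up in the median background and stops counting as activity, which may or may not be what you want.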

What is the difference between color deconvolution and K-means clustering for colors?

I have some color images needing segmentation. They are images of slides that are stained with hematoxylin and eosin ("H&E").
I found this method for color deconvolution by Ruifrok
http://europepmc.org/abstract/med/11531144
that separates out the images by color.
However it seems that you can do something similar just by using K-means clustering:
http://www.mathworks.com/help/images/examples/color-based-segmentation-using-k-means-clustering.html
I am curious what the difference is. Any insight would be welcome. Thanks.
I can't seem to find a copy of the article (well, without paying) but they are not exactly the same.
K-means seeks to cluster data. So if you just want to find dominant colors in an image, or do some sorting based on colors, this is the way to go. As a side note: K-means can be used on any vector. It's not confined to color, so you can use it for many other applications.
Color Deconvolution is trying to remove the effects of chemical dyes commonly used for microscopy (if I understood the abstract properly). Based on the specific dye used, the algorithm tries to reverse its effects and give you the original color image back (before the dye was added). I found this website that shows some output. This is deconvolving the dye contribution to the RGB spectrum. It doesn't do any clustering/grouping (other than finding the dye).
Hope that helps
EDIT
If you didn't know, convolution is most often associated with signals/image processing. Basically you take a filter and run it over a signal. The output is a modified version of the original input. In this case, the original image is filtered by a dye with known RGB values. If we know the full characteristics of the dye/filter we can invert it. Then by running the convolution again using the inverse filter we can hopefully deconvolve the effect. In principle it sounds simple enough, but in many cases this isn't possible.
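To make the contrast with K-means concrete, here is a minimal sketch of stain separation as it is commonly implemented: a per-pixel linear unmixing in optical-density space with a fixed stain matrix, not a spatial filter. This is my reading of the method, not text from the paper. The stain vectors below are illustrative values for hematoxylin, eosin and a residual channel, and the function names are made up for this example.

```python
import numpy as np

# Illustrative optical-density vectors (rows: hematoxylin, eosin, residual).
# Assumed values; real slides need vectors measured for the stains in use.
STAINS = np.array([
    [0.65, 0.70, 0.29],
    [0.07, 0.99, 0.11],
    [0.27, 0.57, 0.78],
])
STAINS = STAINS / np.linalg.norm(STAINS, axis=1, keepdims=True)

def separate_stains(rgb, background=255.0):
    """Convert RGB pixels (..., 3) to per-stain amounts via optical density."""
    rgb = np.clip(np.asarray(rgb, dtype=np.float64), 1.0, background)
    optical_density = -np.log10(rgb / background)
    return optical_density @ np.linalg.inv(STAINS)

def combine_stains(amounts, background=255.0):
    """Inverse of separate_stains: per-stain amounts back to RGB."""
    return background * 10.0 ** (-(np.asarray(amounts) @ STAINS))
```

The design difference is that the stain matrix is known in advance and every pixel gets a continuous amount of each stain, whereas K-means learns its color centers from the image and assigns each pixel to exactly one cluster.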

After Effects' Rotoscoping brush algorithms

I don't think I'm going to get any replies but here goes: I'm developing an iOS app that performs image segmentation functions. I'm trying to implement the easiest way to crop out a subject from an image without the need of a greenscreen/keying. Most automated solutions like using OpenCV just aren't cutting it.
I've found the rotoscope brush tool in After Effects to be effective at giving hints on where the app should be cutting out. Anyone know what kind of algorithms the rotoscope brush tool is using?
Check out this page, which contains a couple of video presentations from SIGGRAPH (a computer graphics conference) about the Roto Brush tool. Also take a look at Jue Wang's paper on Video SnapCut. As Damien guessed, object extraction relies on some pretty intense image processing algorithms. You might be able to implement something similar in OpenCV depending on how clever/masochistic you're feeling.
The algorithm is a graph-cut based segmentation algorithm where Gaussian Mixture Models (GMM) are trained using color pixels in "local" regions as well as "globally", together with some sort of shape prior.
OpenCV has a "cheap hack" implementation of the "GrabCut" paper where the user specifies a bounding box around the object they wish to segment. Typically, using just the bounding box will not give good results. You will need the user to specify the "foreground" and "background" pixels (as is done in Adobe's Rotoscoping tool) to help the algorithm build foreground and background color models (in this case GMMs) so that it will know what are the typical colors in the foreground object you wish to segment, and those for the background that you want to leave out.
A basic graph-cut implementation can be found on this blog. You can probably start from there and experiment with different ways to compute the cost terms to get better results.
Lastly, to "soften" the edges, a cheap hack is to blur the binary mask to obtain a mask with values between 0 and 1. Then recomposite your image using the mask, i.e. c[i][j] = mask[i][j] * fgd[i][j] + (1 - mask[i][j]) * bgd[i][j], where you are blending the foreground you segmented (fgd) with a new background image (bgd) using the mask values as blending weights.
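The blur-and-blend step can be sketched with NumPy alone. The box blur here is just one simple choice of blur (a Gaussian would work equally well), and the names `box_blur` and `composite` are illustrative.

```python
import numpy as np

def box_blur(mask, radius=1):
    """Average each pixel with its (2*radius+1)^2 neighbourhood (edge-padded)."""
    size = 2 * radius + 1
    padded = np.pad(mask.astype(np.float64), radius, mode="edge")
    height, width = mask.shape
    total = np.zeros((height, width))
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + height, dx:dx + width]
    return total / (size * size)

def composite(fgd, bgd, mask):
    """c = mask * fgd + (1 - mask) * bgd, with the mask applied to every channel."""
    alpha = mask[..., np.newaxis]
    return alpha * fgd + (1.0 - alpha) * bgd
```

Here fgd and bgd are float arrays of shape (height, width, 3) and mask is the blurred (height, width) mask; convert the result back to uint8 before saving.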

Sparse Image matching in iOS

I am building an iOS app that, as a key feature, incorporates image matching. The problem is the images I need to recognize are small orienteering 10x10 plaques with simple large text on them. They can be quite reflective and will be outside (so the light conditions will be variable). Sample image
There will be up to 15 of these types of image in the pool and really all I need to detect is the text, in order to log where the user has been.
The problem I am facing is that the image matching software I have tried (Aurasma and, slightly more successfully, arlabs) can't distinguish between them, as it is primarily built to work with detailed images.
I need to accurately detect which plaque is being scanned and have considered using GPS to refine the selection, but the only reliable way I have found is to get the user to manually enter the text. One of the key attractions we have based the product around is being able to detect these images that are already in place and not have to set up any additional material.
Can anyone suggest a piece of software that would work (i.e. is iOS friendly) or a method of detection that would be effective and interactive/pleasing for the user?
Sample environment:
http://www.orienteeringcoach.com/wp-content/uploads/2012/08/startfinishscp.jpeg
The environment can change substantially, basically anywhere a plaque could be positioned they are; fences, walls, and posts in either wooded or open areas, but overwhelmingly outdoors.
I'm not an iOS programmer, but I will try to answer from an algorithmic point of view. Essentially, you have a detection problem ("Where is the plaque?") and a classification problem ("Which one is it?"). Asking the user to keep the plaque in a pre-defined region is certainly a good idea. This solves the detection problem, which is often harder to solve with limited resources than the classification problem.
For classification, I see two alternatives:
The classic "Computer Vision" route would be feature extraction and classification. Local Binary Patterns and HOG are feature extractors known to be fast enough for mobile (the former more than the latter), and they are not too complicated to implement. Classifiers, however, are non-trivial, and you would probably have to search for an appropriate iOS library.
Alternatively, you could try to binarize the image, i.e. classify pixels as "plate" / white or "text" / black. Then you can use an error-tolerant similarity measure for comparing your binarized image with a binarized reference image of the plaque. The chamfer distance measure is a good candidate. It essentially boils down to comparing the distance transforms of your two binarized images. This is more tolerant to misalignment than comparing binary images directly. The distance transforms of the reference images can be pre-computed and stored on the device.
Personally, I would try the second approach. A (non-mobile) prototype of the second approach is relatively easy to code and evaluate with a good image processing library (OpenCV, Matlab + Image Processing Toolbox, Python, etc).
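A non-mobile prototype of the second approach might look like the sketch below. It is a one-directional chamfer score under simplifying assumptions: the images are already binarized and aligned to the same size, True marks "text" pixels, and the distance transform is computed by brute force, which is only practical for small images. The function names are illustrative.

```python
import numpy as np

def distance_transform(binary):
    """Brute-force Euclidean distance from every pixel to the nearest
    True pixel. Fine for small images; use a library routine for real ones."""
    binary = np.asarray(binary, dtype=bool)
    ys, xs = np.nonzero(binary)
    points = np.stack([ys, xs], axis=1)                # (k, 2) text pixels
    grid = np.indices(binary.shape).reshape(2, -1).T   # (h*w, 2) all pixels
    diffs = grid[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)
    return dists.reshape(binary.shape)

def chamfer_distance(candidate, reference_dt):
    """Mean distance from each text pixel of the candidate to the nearest
    text pixel of the reference, read off the precomputed transform."""
    candidate = np.asarray(candidate, dtype=bool)
    return reference_dt[candidate].mean()
```

With up to 15 plaques you would precompute distance_transform once per reference image, score the binarized camera crop against each, and pick the reference with the smallest score. A candidate with no text pixels at all has no defined score, so that case needs handling separately.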
I managed to find a solution that is working quite well. It's not fully optimized yet, but I think it's just a matter of tweaking filters, as I'll explain later on.
Initially I tried to set up OpenCV, but it was very time consuming with a steep learning curve. It did give me an idea, though. The key to my problem is really detecting the characters within the image and ignoring the background, which was basically just noise. OCR was designed exactly for this purpose.
I found the free library Tesseract (https://github.com/ldiqual/tesseract-ios-lib) easy to use and with plenty of customizability. At first the results were very random, but applying a sharpening filter, a monochromatic filter and a color invert worked well to clean up the text. Next I marked out a target area on the UI and used that to cut out the rectangle of the image to process. Processing is slow on large images, and this cut the time dramatically. Tesseract also allowed me to restrict the allowable characters, and as the plaques follow a standard configuration this improved the accuracy.
So far it's been successful with the grey background plaques, but I haven't found the correct filter for the red and white editions. My goal will be to add color detection and remove the need to feed in the data type.

Genetic algorithms for image processing project

I'm thinking of starting a project for school where I'll use genetic algorithms to optimize digital sharpening of images. I've been playing around with unsharp masking (USM) techniques in Photoshop. Basically, I want to create software that optimizes the parameters (i.e. blur radius, types of blur, blending the image) to create the "best-fit" set of filters.
I'm sort of quickly planning this project before starting it, and I can't think of a good fitness function for the 'selection' part. How would I determine the 'quality' of the filter sets, or measure how sharp the image is?
Also, I will be programming using Python (with the Python Imaging Library) since it's the only language I'm proficient with. Should I learn a low-level language instead?
Any advice/tips on anything is greatly appreciated. Thanks in advance!
tl;dr How do I measure how 'sharp' an image is?
If it's for tuning parameters, you could take a known image and apply a known blurring/low-pass filter. Then sharpen this with your GA+USM algorithm. Calculate your fitness function making use of the original image, e.g. something as simple as the mean absolute error. You may need to create different datasets, e.g. landscape images (mostly sharp, in focus with large depth of field), portrait images (could have large areas deliberately out of focus and "soft"), along with low noise and noisy images. Sharpening noisy images is actually quite a challenge.
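As a rough sketch of that fitness function: blur a known sharp image, run unsharp masking with the candidate parameters, and score the result by negative mean absolute error so that higher is better. The box blur stands in for whatever blur types the GA explores, and the parameter pair (radius, amount) is an assumed, simplified genome.

```python
import numpy as np

def box_blur(image, radius):
    """Average each pixel with its (2*radius+1)^2 neighbourhood (edge-padded)."""
    radius = int(radius)
    size = 2 * radius + 1
    padded = np.pad(np.asarray(image, dtype=np.float64), radius, mode="edge")
    height, width = np.shape(image)
    total = np.zeros((height, width))
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + height, dx:dx + width]
    return total / (size * size)

def unsharp_mask(image, radius, amount):
    """Classic USM: add back `amount` times the detail lost by blurring."""
    image = np.asarray(image, dtype=np.float64)
    return np.clip(image + amount * (image - box_blur(image, radius)), 0.0, 255.0)

def fitness(params, sharp_reference, degraded):
    """Negative mean absolute error between the re-sharpened image and the
    known sharp original (0 is a perfect restoration)."""
    radius, amount = params
    restored = unsharp_mask(degraded, radius, amount)
    return -np.mean(np.abs(restored - sharp_reference))
```

The GA would call fitness once per individual, with degraded computed a single time up front from sharp_reference. Averaging the score over several reference images of different types guards against tuning to one picture.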
It would definitely be worth taking a look at Bruce Fraser's work on sharpening techniques for Photoshop etc.
Also, it might be worth checking out Imatest (www.imatest.com) to see if there is anything regarding sharpness/resolution. You might also consider resolution charts.
Finally, I seriously doubt that one set of ideal parameters exists for USM; the optimum parameters will be image dependent and indeed a matter of personal preference (that's why I suggest starting from a known sharp image and blurring it). Understanding the type of image is probably just as important, and is in itself a very interesting and challenging problem, although perhaps basic heuristics like image variance and edge histograms would reveal suitable clues.
Anyway, just a thought; hopefully some of the above is useful.
