Region growing vs. clustering - image-processing

In image processing, how do region growing and clustering differ from each other? Please give some more information on how they differ. Thank you for reading.

Region growing :
You have to select seed points, and the local area around each seed is then analyzed to decide whether the neighboring pixels should receive the same label. http://en.wikipedia.org/wiki/Region_growing
It can be used for precise image segmentation.
Clustering :
There are many clustering techniques (k-means, hierarchical clustering, density-based clustering, etc.). Clustering algorithms don't require seed points as input because they are based on unsupervised learning.
It can be used for coarse image segmentation.
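For a concrete illustration of the clustering side, here is a minimal sketch (my own, not part of the original answer) of coarse segmentation by running k-means on pixel intensities alone; the synthetic image and the choice of k = 3 are assumptions.

```python
# Coarse segmentation by clustering pixel intensities with k-means.
import numpy as np
from sklearn.cluster import KMeans

# Synthetic grayscale image: three regions with different mean intensities.
rng = np.random.default_rng(0)
img = np.zeros((60, 90))
img[:, :30], img[:, 30:60], img[:, 60:] = 50, 120, 200
img += rng.normal(0, 10, img.shape)

# Cluster pixels by intensity alone -- no seeds and no spatial information,
# which is why the result is only a coarse segmentation.
pixels = img.reshape(-1, 1)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(pixels)
segmentation = labels.reshape(img.shape)
print(np.unique(segmentation, return_counts=True))
```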

I find region growing similar to some clustering algorithms. I explain my viewpoint below:
In region growing there are two cases:
The first is selecting seed points randomly, which is similar to k-means: the seeds play the role of the means here. We start with one seed and spread it until we cannot grow it any more (just as we start with one mean and continue until we reach convergence), and the way we grow the region is (usually) based on the Euclidean distance from the seed's grey value.
The second case in region growing is the one with no seeds (assume we don't know how many seeds to choose, or we don't know the number of clusters). We start with the first pixel. Then we find the neighbors of the current pixel, accepting those within a distance d of the mean grey value of the region (of course, at the first iteration the mean grey value is exactly the current pixel's grey value). Afterwards, we update the mean grey value. In this way region growing seems to act like the mean shift algorithm. If we don't update the mean grey value after each assignment, then it could be considered a DBSCAN-like algorithm.
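To make the second (seedless) case concrete, here is a rough sketch of my own (not code from the original post): grow from a starting pixel, absorb 4-connected neighbours whose grey value lies within a distance d of the running region mean, and update that mean after every addition. The threshold d and the tiny test image are arbitrary.

```python
import numpy as np
from collections import deque

def grow_region(img, start, d=15.0):
    """Region growing from a start pixel with a running mean grey value
    (the 'no explicit seed' case described above)."""
    h, w = img.shape
    visited = np.zeros((h, w), dtype=bool)
    region = [start]
    visited[start] = True
    mean = float(img[start])
    queue = deque([start])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not visited[ny, nx]:
                if abs(float(img[ny, nx]) - mean) <= d:
                    visited[ny, nx] = True
                    region.append((ny, nx))
                    # Update the running mean; skipping this update makes the
                    # behaviour closer to a DBSCAN-style expansion.
                    mean += (float(img[ny, nx]) - mean) / len(region)
                    queue.append((ny, nx))
    return region

# Tiny example: the two left columns are similar, the right column is not.
img = np.array([[10, 12, 90],
                [11, 13, 95],
                [12, 14, 92]], dtype=float)
print(grow_region(img, (0, 0), d=10))
```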

Related

DONUT- Anomaly detection Algorithm ignores the relationship between sliding windows?

I'm trying to understand the paper https://netman.aiops.org/wp-content/uploads/2018/05/PID5338621.pdf about Robust and Rapid Clustering of KPIs for Large-Scale Anomaly Detection.
Clustering is done using the ROCKA algorithm.
Steps:
1.) Preprocessing is conducted on the raw KPI data to remove amplitude differences and standardize data.
2.) In baseline extraction step, we reduce noises, remove the extreme values (which are likely anomalies), and extract underlying shapes, referred to as baselines, of KPIs. It's done by applying moving average with a small sliding window.
3.) Clustering is then conducted on the baselines of sampled KPIs, with robustness against phase shifts and noises.
4.) Finally, we calculate the centroid of each cluster, then assign the unlabeled KPIs by their distances to these centroids.
I understand ROCKA mechanism.
Now, I'm trying to understand the DONUT algorithm, which is applied for "Anomaly Detection".
How it works is:
DONUT applies sliding windows over the KPI to get short series x and tries to recognize what normal patterns x follows. The indicator is then calculated by the difference between reconstructed normal patterns and x to show the severity of anomalies. In practice, a threshold should be selected for each KPI. A data point with an indicator value larger than the threshold is regarded as an anomaly.
Now my question is :
It seems like DONUT is not robust enough against anomalies related to time information, in that it works on a set of sliding windows and ignores the relationship between windows. So the window becomes a very critical parameter here, and the algorithm might generate many false positives. What am I understanding wrong here?
Please help me understand how DONUT captures the relationship between sliding windows.
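For reference, here is a rough sketch of the sliding-window reconstruction scoring described above. DONUT itself reconstructs windows with a variational autoencoder; the PCA reconstruction, window length and threshold below are stand-ins of my own, only meant to show the indicator/threshold mechanics.

```python
# Sliding-window reconstruction scoring; PCA is only a stand-in for DONUT's VAE.
import numpy as np
from sklearn.decomposition import PCA

def window_scores(kpi, window=32, n_components=3):
    # Slide a window over the series and stack the windows as rows.
    windows = np.lib.stride_tricks.sliding_window_view(kpi, window)
    pca = PCA(n_components=n_components).fit(windows)
    recon = pca.inverse_transform(pca.transform(windows))
    # Indicator: reconstruction error at the last point of each window.
    return np.abs(windows[:, -1] - recon[:, -1])

rng = np.random.default_rng(0)
kpi = np.sin(np.linspace(0, 20 * np.pi, 2000)) + rng.normal(0, 0.1, 2000)
kpi[1500] += 3.0                               # inject a point anomaly
scores = window_scores(kpi)
threshold = scores.mean() + 4 * scores.std()   # per-KPI threshold (assumption)
print(np.where(scores > threshold)[0] + 31)    # +31 maps back to series indices (window=32)
```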

Clustering K-means algorithm for elongated data set

I have a question that came up while programming the k-means algorithm in Matlab. Why is the k-means algorithm not suitable for classifying an elongated data set?
In short, draw some thick lines on a piece of paper. Can you really represent each one with a single point? How would single points give information about orientation?
K-means assigns each data point to its nearest centroid. That is to say, for each centroid c, all points whose distance from c is smallest (in comparison to all other centroids) will be assigned to c. And since a (hyper)ball is, in fact, the set of all points whose distance from a center is less than or equal to some value, I think it is easy to see why the resulting clusters tend to be spherical. (To be exact, k-means practically creates a Voronoi diagram in the vector space.)
Elongated clusters, however, don't necessarily satisfy the requirement that all their points are closer to their own "center of mass" than to some other cluster's center.
It is difficult to choose initial cluster centers for an elongated data set, yet the choice has a powerful effect on the result: you may get different results when you choose different initial points. With compact, well-separated blobs you will get essentially the same result whichever 3 initial points you choose, but with an elongated data set different initializations can converge to quite different partitions.
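As a small demonstration (mine, not from the original answers), the snippet below runs k-means on two compact blobs and then on the same blobs stretched along one axis; the blob parameters and stretching factor are arbitrary, but the adjusted Rand score drops sharply for the elongated version because the lowest-inertia split cuts across the long axis.

```python
# k-means on round vs. elongated clusters.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

# Two round blobs, then the same blobs stretched heavily along the x axis.
X, y = make_blobs(n_samples=600, centers=[[0, 0], [0, 4]], cluster_std=0.5,
                  random_state=0)
X_elongated = X @ np.array([[10.0, 0.0], [0.0, 1.0]])

for name, data in (("round", X), ("elongated", X_elongated)):
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(data)
    print(name, round(adjusted_rand_score(y, labels), 2))
```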

Algorithm suitable for a region-growing image segmentation based on the minimization of a metric

Is there a pixel-based region growing algorithm that can be employed for the extraction of features (segmentation) from an image, by adding pixels to the seed based on the minimization of a certain metric? Potentially, a pixel could be removed if the metric is not optimized when this pixel is added (i.e., the possibility to backtrack and go back to the seed obtained in previous iterations).
I'll try to explain further my objectives:
This algorithm starts from a central pixel selected as an initial seed on the image.
Afterwards, each of the 4 neighbors is explored (right, left, bottom and top neighbors) separately, to see if the metric is optimized by growing the seed in the selected direction.
A neighboring pixel might not optimize the metric immediately, even if the seed created by adding this pixel will be optimal in future iterations.
There is a possibility that a neighboring pixel is added to the seed but is removed later, if the obtained seed is not optimal.
Can anyone suggest an artificial-intelligence technique (or a greedy approach) that is adequate for solving this kind of problem? Also, what would be a good criterion for judging that the addition of a pixel will optimize the metric, even though this will probably only happen in future iterations?
P.S.: I started implementing what's explained above in Python but got stuck on the issue of determining whether a path (neighboring pixel) is worth exploring or not. Right now, I try to add a neighboring pixel only if the resulting seed improves (i.e. minimizes) the error relative to the metric. However, even though adding the right or left neighbor doesn't optimize the metric immediately, one of these two paths might lead to the optimal solution in the future (as explained in the third objective).
You've basically outlined the most successful algorithm you could get with this approach. Its success will depend heavily on the metric you use to add/remove pixels, but there are a few things you can do to emulate the behavior you want.
Definitions
We'll call the metric we're optimizing M where M(R) is the metric's value for a region R and a region R is some collections of pixels. I will assume that optimizing the metric will result in the largest possible value of M, but this approach can work if the goal is to minimize M as well.
Methodology
This approach is going to be slightly backwards to your original outline, but it should satisfy both requirements of adding pixels that lie in non-optimal paths from the seed and removing pixels that do not contribute significantly to the optimization.
We will begin at a seed s, but instead of evaluating paths as we go, we will add all pixels in the image (or up to the maximum feature size) iteratively to our region. At each step we will determine a value for the pixel based on how much it improves the metric for the current region, M(p). This is not the same as the value of the region containing the pixel (M(R) where p is in R). Rather, it is the difference between the value of the region containing the pixel and the value of the region before the pixel was added (M(p) = M(R) - M(R') where R = R' + p). If you have the capacity to evaluate a single pixel directly, you could simply use that instead.
The next change is to include a regularization parameter in M(R) that penalizes the score based on the number of pixels included: N(R) = M(R) - a * |R|, where a is some arbitrary positive constant and |R| represents the cardinality (number of pixels) of our region. Note: if the goal is to minimize M then a should be negative. This has the effect of penalizing the score of the region if it includes too many pixels.
Finally, after all pixels have been added to the region and the per-pixel gain M(p) has been evaluated for each pixel, we iterate over the region again. This time we begin at the last pixel added and iterate backwards over our set of pixels, ending at the seed s. At each step we determine the score of the region, N(R). If the score N(R) has decreased since the previous step, we remove the pixel p with the lowest per-pixel gain. This should have the effect of keeping the smallest set of pixels in the region, namely those that contribute the most to the score.
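A minimal sketch of this grow-then-prune procedure (my interpretation, not the answer author's code) follows; the metric M (here simply the sum of grey values, so higher is better), the value of a, the 4-neighbourhood and the maximum region size are all placeholder assumptions.

```python
import numpy as np

def grow_then_prune(img, seed, a=50.0, max_size=100):
    h, w = img.shape

    def M(region):                      # stand-in for the real metric M(R)
        return float(sum(img[p] for p in region))

    def N(region):                      # regularised score N(R) = M(R) - a*|R|
        return M(region) - a * len(region)

    def neighbours(p):
        y, x = p
        return [(ny, nx) for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                if 0 <= ny < h and 0 <= nx < w]

    region, gains = [seed], {seed: 0.0}
    frontier = set(neighbours(seed))
    history = [N(region)]
    while len(region) < max_size and frontier:
        # Add the frontier pixel with the best immediate gain M(p) = M(R+p) - M(R),
        # even when that gain is small (a path may only pay off later).
        base = M(region)
        best = max(frontier, key=lambda p: M(region + [p]) - base)
        gains[best] = M(region + [best]) - base
        frontier.remove(best)
        region.append(best)
        frontier.update(n for n in neighbours(best) if n not in gains)
        history.append(N(region))

    # Prune: walk backwards over the growth history; wherever adding a pixel
    # made N(R) drop, remove the lowest-gain pixel still in the region.
    for i in range(len(history) - 1, 0, -1):
        if history[i] < history[i - 1] and len(region) > 1:
            worst = min(region[1:], key=gains.get)   # never remove the seed
            region.remove(worst)
    return region

# Tiny synthetic example: a bright 7x7 square on a dark background.
img = np.zeros((20, 20))
img[5:12, 5:12] = 100.0
print(len(grow_then_prune(img, (8, 8))))   # the 49 bright pixels of the square
```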
Additional Considerations
If the remaining pixels lie on non-contiguous paths after pruning, you could run a secondary algorithm to add in adjoining pixels. You'll need to do some testing to determine an optimal value of a such that enough pixels are kept to reconstruct the feature without including every pixel in the image.
My Opinion (that you didn't ask for)
In general I think you would have more luck with more robust algorithms such as Convolutional Neural Networks for feature classification. They'll likely be faster and definitely more accurate than the algorithm described above.

Image segmentation with maxflow

I have to do a foreground/background segmentation using the maxflow algorithm in C++ (http://wiki.icub.org/iCub/contrib/dox/html/poeticon_2src_2objSeg_2src_2maxflow-v3_802_2maxflow_8cpp_source.html). I get an array of pixels from a PNG file according to their RGB values, but what are the next steps? How could I use this algorithm for my problem?
I recognize that source very well. That's the Boykov-Kolmogorov Graph Cuts library. What I would recommend you do first is read their paper.
Graph Cuts is an interactive image segmentation algorithm. You mark pixels in your image on what you believe belong to the object (a.k.a. foreground) and what don't belong to the object (a.k.a the background). That's what you need first. Once you do this, the Graph Cuts algorithm best guesses what the labels of the other pixels are in the image. It basically goes through each of the other pixels that are not labeled and figures out whether or not they belong to foreground and background.
The whole premise behind Graph Cuts is that image segmentation is akin to energy minimization. Image segmentation can be formulated as a cost function with a summation of two terms:
Self-Penalty: This is the cost of assigning each pixel as either foreground or background. This is also known as a data cost.
Neighbouring Penalties: This enforces that neighbouring pixels more or less should share the same classification label. This is also known as a smoothness cost.
This kind of formulation is well known as the Maximum A Posteriori Markov Random Field classification problem (MAP-MRF). The goal is to minimize that cost function so that you achieve the best image segmentation possible. In general this is an NP-hard problem (and whether such problems can be solved efficiently is the P vs. NP question, one of the Clay Mathematics Institute's Millennium Prize Problems).
Boykov and Kolmogorov showed that the MAP-MRF problem can be translated into graph theory: solving it is akin to taking your image and forming it into a graph with source and sink links, as well as links that connect neighbouring pixels together. To solve the MAP-MRF, you perform the maximum-flow/minimum-cut algorithm. There are many ways to do this, but Boykov and Kolmogorov found a more efficient way that is much faster than more established algorithms, such as Push-Relabel, Ford-Fulkerson, etc.
The self penalties are what are known as t links, while the neighbouring penalties are what are known as n links. You should read up the paper to figure out how these are computed, but the t links describe the classification penalty. Basically, how much it would cost to classify each pixel as belonging to the foreground or the background. These are usually based on the negative log probability distributions of the image. What you do is you create a histogram of the distribution of what was classified as foreground and a histogram of what was classified as background.
Usually, a uniform quantization of each colour channel for both foreground and background suffices. You then turn these histograms into PDFs by dividing by the total number of elements in each histogram. When you calculate the t-links for each pixel, you look up the pixel's colour, see which histogram bin it falls into, and take the negative log of that probability. This tells you how much it costs to classify that pixel as either foreground or background (a sketch is given after the procedure below).
The neighbouring pixel costs are more intuitive. People usually just take the Euclidean distance between one pixel and a neighbouring pixel and apply this distance to a Gaussian. To make things simple, a 4 pixel neighbourhood is what is usually used (North, South, East and West).
Once you figure out how to compute the cost, you follow this procedure:
Mark pixels as foreground or background.
Create a graph structure using their library
Compute the histograms of the foreground and background pixels
Calculate t-links and add to the graph
Calculate n-links and add to the graph
Invoke the maxflow routine on the graph to segment the image
Go through each pixel and figure out whether or not the pixel belongs to foreground or background.
Create a binary map that reflects this, then copy over image pixels where the binary map is true, and don't do this when it's false.
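To make the cost computation concrete, here is a rough NumPy sketch (mine, not from the original answer) of histogram-based t-link costs and Gaussian n-link weights for a greyscale image; the bin count, sigma, the toy image and the scribble masks are assumptions, and the resulting arrays still have to be wired into the BK maxflow graph.

```python
import numpy as np

def tlink_costs(img, fg_mask, bg_mask, bins=32):
    # Histogram the scribbled foreground/background pixels, normalise to PDFs,
    # then take the negative log of each pixel's bin probability.
    edges = np.linspace(0, 256, bins + 1)
    eps = 1e-6
    fg_pdf = np.histogram(img[fg_mask], bins=edges)[0] + eps
    bg_pdf = np.histogram(img[bg_mask], bins=edges)[0] + eps
    fg_pdf, bg_pdf = fg_pdf / fg_pdf.sum(), bg_pdf / bg_pdf.sum()
    idx = np.clip(np.digitize(img, edges) - 1, 0, bins - 1)
    # Low cost to label a pixel foreground when its grey value is likely under
    # the foreground histogram, and likewise for background.
    return -np.log(fg_pdf[idx]), -np.log(bg_pdf[idx])

def nlink_costs(img, sigma=10.0):
    # Gaussian of the intensity difference between 4-connected neighbours.
    right = np.exp(-((img[:, :-1] - img[:, 1:]) ** 2) / (2 * sigma ** 2))
    down = np.exp(-((img[:-1, :] - img[1:, :]) ** 2) / (2 * sigma ** 2))
    return right, down

# Toy example: dark object on a bright background, with small scribbles.
img = np.full((40, 40), 200.0)
img[10:30, 10:30] = 60.0
fg_mask = np.zeros_like(img, dtype=bool); fg_mask[18:22, 18:22] = True
bg_mask = np.zeros_like(img, dtype=bool); bg_mask[:3, :] = True
fg_cost, bg_cost = tlink_costs(img, fg_mask, bg_mask)
right, down = nlink_costs(img)
print(fg_cost.shape, bg_cost.shape, right.shape, down.shape)
```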
The original source of maxflow can be found here: http://pub.ist.ac.at/~vnk/software/maxflow-v3.03.src.zip
It also has a README so you can see how the library is supposed to work given some example images.
You have a lot to digest, but Graph Cuts is one of the most powerful interactive segmentation tools out there.
Good luck!

Selecting an appropriate similarity metric & assessing the validity of a k-means clustering model

I have implemented k-means clustering for determining the clusters in 300 objects. Each of my objects has about 30 dimensions. The distance is calculated using the Euclidean metric.
I need to know:
How would I determine if my algorithm works correctly? I can't have a graph which will give some idea about the correctness of my algorithm.
Is Euclidean distance the correct method for calculating distances? What if I have 100 dimensions instead of 30?
The two questions in the OP are separate topics (i.e., no overlap in the answers), so I'll try to answer them one at a time, starting with item 1 on the list.
How would I determine if my [clustering] algorithm works correctly?
k-means, like other unsupervised ML techniques, lacks a good selection of diagnostic tests to answer questions like "are the cluster assignments returned by k-means more meaningful for k=3 or k=5?"
Still, there is one widely accepted test that yields intuitive results and that is straightforward to apply. This diagnostic metric is just this ratio:
inter-centroidal separation / intra-cluster variance
As the value of this ratio increases, the quality of your clustering result increases.
This is intuitive. The first of these quantities just measures how far apart each cluster is from the others (according to the cluster centers).
But inter-centroidal separation alone doesn't tell the whole story, because two clustering algorithms could return results having the same inter-centroidal separation though one is clearly better, because the clusters are "tighter" (i.e., smaller radii); in other words, the cluster edges have more separation. The second metric--intra-cluster variance--accounts for this. This is just the mean variance, calculated per cluster.
In sum, the ratio of inter-centroidal separation to intra-cluster variance is a quick, consistent, and reliable technique for comparing results from different clustering algorithms, or to compare the results from the same algorithm run under different variable parameters--e.g., number of iterations, choice of distance metric, number of centroids (value of k).
The desired result is tight (small) clusters, each one far away from the others.
The calculation is simple (a short code sketch follows the steps below):
For inter-centroidal separation:
calculate the pair-wise distance between cluster centers; then
calculate the median of those distances.
For intra-cluster variance:
for each cluster, calculate the distance of every data point in a given cluster from its cluster center; next
(for each cluster) calculate the variance of the sequence of distances from the step above; then
average these variance values.
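A short sketch of this calculation (with assumed variable names; the scikit-learn call at the end is just one convenient way to obtain labels and centers):

```python
import numpy as np
from itertools import combinations
from sklearn.cluster import KMeans

def separation_to_variance_ratio(X, labels, centers):
    # Inter-centroidal separation: median of pairwise distances between centers.
    pairwise = [np.linalg.norm(a - b) for a, b in combinations(centers, 2)]
    separation = np.median(pairwise)
    # Intra-cluster variance: variance of point-to-center distances per cluster,
    # averaged over clusters.
    variances = [np.linalg.norm(X[labels == k] - c, axis=1).var()
                 for k, c in enumerate(centers)]
    return separation / np.mean(variances)

X = np.random.default_rng(0).normal(size=(300, 30))     # placeholder data
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(separation_to_variance_ratio(X, km.labels_, km.cluster_centers_))
```

The larger the value, the tighter and better-separated the clusters.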
That's my answer to the first question. Here's the second question:
Is Euclidean distance the correct method for calculating distances? What if I have 100 dimensions instead of 30 ?
First, the easy question--is Euclidean distance a valid metric as dimensions/features increase?
Euclidean distance is perfectly scalable--works for two dimensions or two thousand. For any pair of data points:
subtract their feature vectors element-wise,
square each item in that result vector,
sum that result,
take the square root of that scalar.
Nowhere in this sequence of calculations is scale implicated.
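Written out with NumPy (a trivial sketch, random vectors purely for illustration), the same four steps look identical for 30, 100 or more dimensions:

```python
import numpy as np

x = np.random.default_rng(0).normal(size=100)
y = np.random.default_rng(1).normal(size=100)

diff = x - y                        # subtract element-wise
dist = np.sqrt(np.sum(diff ** 2))   # square each item, sum, take the square root
print(np.isclose(dist, np.linalg.norm(x - y)))   # matches the built-in norm
```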
But whether Euclidean distance is the appropriate similarity metric for your problem depends on your data. For instance, is it purely numeric (continuous)? Or does it have discrete (categorical) variables as well (e.g., gender: M/F)? If one of your dimensions is "current location" and, of 200 users, 100 have the value "San Francisco" and the other 100 have "Boston", you can't really say that, on average, your users are from somewhere in Kansas, but that's sort of what Euclidean distance would do.
In any event, since we don't know anything about it, I'll just give you a simple flow diagram so that you can apply it to your data and identify an appropriate similarity metric.
To identify an appropriate similarity metric given your data:
Euclidean distance is good when dimensions are comparable and on the same scale. If one dimension represents length and another the weight of an item, Euclidean distance should be replaced with a weighted distance.
Project it to 2-D and plot the picture; this is a good option to see visually whether it works.
Or you may use some sanity check, like finding the cluster centers and checking that the items in each cluster aren't too far away from their center.
Can't you just try sum |xi - yi| instead of (xi - yi)^2 in your code, and see if it makes much difference?
I can't have a graph which will give some idea about the correctness of my algorithm.
A couple of possibilities:
look at some points midway between 2 clusters in detail
vary k a bit, see what happens (what is your k ?)
use PCA to map 30-d down to 2-d; see the plots under calculating-the-percentage-of-variance-measure-for-k-means, and also SO questions/tagged/pca
By the way, scipy.spatial.cKDTree can easily give you, say, the 3 nearest neighbors of each point, in p=2 (Euclidean) or p=1 (Manhattan, L1), to look at. It's fast up to ~20d, and with early cutoff works even in 128d.
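A quick usage sketch of that cKDTree suggestion (the random data is only there to make the call runnable):

```python
import numpy as np
from scipy.spatial import cKDTree

X = np.random.default_rng(0).normal(size=(300, 30))
tree = cKDTree(X)
# k=4 because each point is returned as its own nearest neighbour.
dist, idx = tree.query(X, k=4, p=2)      # p=2 Euclidean; use p=1 for Manhattan
print(dist[:, 1:4].mean())               # average distance to the 3 true neighbours
```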
Added: I like Cosine distance in high dimensions; see euclidean-distance-is-usually-not-good-for-sparse-data for why.
Euclidean distance is the intuitive and "normal" distance between continuous variables. It can be inappropriate if the data are too noisy or have a non-Gaussian distribution.
You might want to try the Manhattan distance (or cityblock), which is robust to that (bear in mind that robustness always comes at a cost: in this case, a bit of the information is lost).
There are many further distance metrics for specific problems (for example the Bray-Curtis distance for count data). You might want to try some of the distances implemented in pdist from the Python module scipy.spatial.distance.
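For example, a small (made-up) comparison of a few of those metrics on the same count-like data using pdist:

```python
import numpy as np
from scipy.spatial.distance import pdist

X = np.random.default_rng(0).poisson(3, size=(50, 30)).astype(float)  # count-like data
for metric in ("euclidean", "cityblock", "braycurtis"):
    d = pdist(X, metric=metric)
    print(metric, round(d.mean(), 3))
```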
