Normalized Mutual Information (NMI) and B3 are used for extrinsic clustering evaluation metrics when each instance (sample) has only one label.
What are equivalent metrics when each instance (sample) has only one label?
For example, in first image, we see [apple, orange, pears], in second image, we see [orange, lime, lemon] and in third image, we see [apple], and in the forth image we see [orange]. Then, if put first image and last image in the one cluster it is good, and if put third and forth image in one cluster is bad.
Application: Many popular datasets for object detection or image segmentation have multi labels for each image. If we used this data for classification (not detection and not segmentation), we have multiple labels for each image.
Note: My task is unsupervised clustering, not supervised classification. I know that for supervised classification, we can use top-5 or top-10 score. But I do not know what will be in unsupervised clustering.
If the multi-labels are still sparse, then you can use Element-centric similarity, Omega Index, or Overlapping NMI (I dont recommend the last, it has severe biases). All three are implemented in the python package, CluSim.
If the multi-labels are dense, then you are into fuzzy clustering comparison. There are several divergence measures for membership functions, including L1-norm, Euclidean distance, KL-divergence, but I'm not aware of a literature arguing for one method over another.
Related
I have some 2d data which looks like it could be well classified by an area described by two intersecting straight lines. These lines won't necessarily by at right angles to each other. Here is a simple example where the two lines would be more or less at right angles:
Is there a suitable classifier for this? Logistic regression will give me one straight line but I am not sure what will give me two as a decision boundary. A decision tree will give me two that are axis parallel which isn't really want I want.
You can give Support Vector Machine (SVM) a try. There are multiple kernels that can be used with SVM, like
Linear
Polynomial
RBF (Radial Basis Function)
Sigmoid
You can even specify custom kernels as mentioned in this list.
Here is an image of decision boundaries over Iris dataset, taken from this example
References
Difference between various SVM kernels
Selecting Kernels for SVM
Custom kernel SVM
My goal is to be able to have a way to process a product photo through the model, and have it return the same photo with the product against a white background. The product photos will be of varying sizes and product types.
I'd like to feed the model photos of products with backgrounds, and those without. In the future I will also expand on the dataset with partially removed backgrounds.
If you are looking for an easy way of doing this, I'd suggest the K-means clustering algorithm. Assuming that you have a simple plain background and an image (of interest) you can obtain the RGB pixel values and use a K-means clustering algorithm with the number of clusters set to 2.
Let me explain this to you with the help of an example. Suppose you have an image of dimension 28*28 (just another arbitrary dimension). The total number of pixels in the image would be 784. Each pixel is represented as a combination of 3 RGB values ranging from 0-255.
A K-Means clustering algorithm will cluster the pixel values into K clusters thus each cluster represents pixel values which are more similar than the pixel values in another cluster. This technique is especially helpful in drawing contours (borders) around images of interest.
In the K-means clustering algorithm, there would be 784 sample points each represented in a 3 dimensional plane for this example. It will cluster these data points into K (2 in this example) clusters.
Here is a very simple implementation of the K-means clustering algorithm.
If you are looking for advanced machine learning implementation, then I'd suggest you look for Deep Convolution Neural Networks for Background Removal in Images. This machine learning technique has been successfully used for the task for background image removal
Read more about it from here, here and here.
I am doing research in the field of computer vision, and am working on a problem related to finding visually similar images to a query image. For example, finding t-shirts of similar colour with similar patterns (Striped/ Checkered), or shoes of similar colour and shape, and so on.
I have explored hand-crafted image features such as Color Histograms, Texture features, Shape features (Histogram of Oriented Gradients), SIFT and so on. I have also read up literature about Deep Neural Networks (Convolutional Neural Networks), which have been trained on massive amounts of data and are currently state of the art in Image Classification.
I was wondering if the same features (extracted from the CNN's) can also be used for my project - finding fine-grained similarities between images. From what I understand, the CNNs have learnt good representative features that can help classify images - for example, be it a red shirt or a blue shirt or an orange shirt, it is able to identify that the image is a shirt. However it doesn't understand that an orange shirt looks more similar to a red shirt than a blue shirt does, and hence it is not able to capture these similarities.
Please correct me if I am wrong. I would like to know if there are any Deep Neural Networks that capture these similarities, and have proven to be superior to the hand-crafted features. Thanks in advance.
For your task, a CNN is definitely worth a try!
Many researchers used networks which are pretrained for Image Classification and obtained state-of-the-art results on fine-grained classification. For example, trying to classify birds species or cars.
Now, your task is not classification, but it is related. You can think about similarity as some geometric distance between features, which are basically vectors. Thus, you may carry out some experiments computing the distance between the feature vectors for all your training images (the reference) and the feature vector extracted from the query image.
CNNs features extracted from the first layers of the net should be more related to color or other graphical traits, rather than more "semantical" ones.
Alternatively, there is some work on learning directly a similarity metric through CNN, see here for example.
A little bit out-dated, but it can still be useful for other people. Yes, CNNs can be used for image similarity and I used before. As Flavio pointed out, for a simple start, you can use a pre-trained CNN of your choice such as Alexnet,GoogleNet etc.. and then use it as feature extractor. You can compare the features based on the distance, similar pictures will have a smaller distance between their feature vectors.
I have been doing reading about Self Organizing Maps, and I understand the Algorithm(I think), however something still eludes me.
How do you interpret the trained network?
How would you then actually use it for say, a classification task(once you have done the clustering with your training data)?
All of the material I seem to find(printed and digital) focuses on the training of the Algorithm. I believe I may be missing something crucial.
Regards
SOMs are mainly a dimensionality reduction algorithm, not a classification tool. They are used for the dimensionality reduction just like PCA and similar methods (as once trained, you can check which neuron is activated by your input and use this neuron's position as the value), the only actual difference is their ability to preserve a given topology of output representation.
So what is SOM actually producing is a mapping from your input space X to the reduced space Y (the most common is a 2d lattice, making Y a 2 dimensional space). To perform actual classification you should transform your data through this mapping, and run some other, classificational model (SVM, Neural Network, Decision Tree, etc.).
In other words - SOMs are used for finding other representation of the data. Representation, which is easy for further analyzis by humans (as it is mostly 2dimensional and can be plotted), and very easy for any further classification models. This is a great method of visualizing highly dimensional data, analyzing "what is going on", how are some classes grouped geometricaly, etc.. But they should not be confused with other neural models like artificial neural networks or even growing neural gas (which is a very similar concept, yet giving a direct data clustering) as they serve a different purpose.
Of course one can use SOMs directly for the classification, but this is a modification of the original idea, which requires other data representation, and in general, it does not work that well as using some other classifier on top of it.
EDIT
There are at least few ways of visualizing the trained SOM:
one can render the SOM's neurons as points in the input space, with edges connecting the topologicaly close ones (this is possible only if the input space has small number of dimensions, like 2-3)
display data classes on the SOM's topology - if your data is labeled with some numbers {1,..k}, we can bind some k colors to them, for binary case let us consider blue and red. Next, for each data point we calculate its corresponding neuron in the SOM and add this label's color to the neuron. Once all data have been processed, we plot the SOM's neurons, each with its original position in the topology, with the color being some agregate (eg. mean) of colors assigned to it. This approach, if we use some simple topology like 2d grid, gives us a nice low-dimensional representation of data. In the following image, subimages from the third one to the end are the results of such visualization, where red color means label 1("yes" answer) andbluemeans label2` ("no" answer)
onc can also visualize the inter-neuron distances by calculating how far away are each connected neurons and plotting it on the SOM's map (second subimage in the above visualization)
one can cluster the neuron's positions with some clustering algorithm (like K-means) and visualize the clusters ids as colors (first subimage)
My problem is as follows:
I have 6 types of images, or 6 classes. For example, cat, dog, bird, etc.
For every type of image, I have many variations of that image. For example, brown cat, black dog, etc.
I'm currently using a Support Vector Machine (SVM) to classify the images using one-versus-rest classification. I'm unfolding each image into a single pixel vector and using that as the feature vector for a given image I'm experiencing decent classification accuracy, but I want to try something different.
I want to use image descriptors, particularly SURF features, as the feature vector for each image. This issue is, I can only have a single feature vector per given image and I'm given a variable number of SURF features from the feature extraction process. For example, 1 picture of a cat may give me 40 SURF features, while 1 picture of a dog will give me 68 SURF features. I could pick the n strongest features, but I have no way of guaranteeing that the chosen SURF features are ones that describe my image (for example, it could focus on the background). There's also no guarantee that ANY SURF features are found.
So, my problem is, how can I get many observations (each being a SURF feature vector), and "fold" these observations into a single feature vector which describes the raw image and can fed to an SVM for training?
Thanks for your help!
Typically the SURF descriptors are quantized using a K-means dictionary and aggregated into one l1-normalized histogram. So your inputs to the SVM algorithm are now fixed in size.