Can you help me with image classification using SIFT features?
I want to classify images based on SIFT features:
Given a training set of images, extract SIFT descriptors from them.
Compute K-Means over the entire set of SIFT descriptors extracted from the
training set. The "K" parameter (the number of clusters) depends on
the number of descriptors that you have for training, but it is usually
between 500 and 8000 (the higher, the better).
Now you have obtained K cluster centers.
You can compute the descriptor of an image by assigning each SIFT descriptor of
the image to one of the K clusters. In this way you obtain a
histogram of length K.
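The histogram step above can be sketched roughly as follows. This is only an illustration using the Python standard library: the function names, the toy 2-D "descriptors" (real SIFT descriptors are 128-D), and the three hand-picked centers are all assumptions, and the K-Means step that would normally produce the centers is not shown.

```python
from collections import Counter

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def bovw_histogram(descriptors, centers):
    """Count how many descriptors fall nearest to each of the K centers."""
    counts = Counter(
        min(range(len(centers)), key=lambda k: squared_distance(d, centers[k]))
        for d in descriptors
    )
    # One bin per cluster center, so the result always has length K.
    return [counts[k] for k in range(len(centers))]

# Toy data (assumed): K = 3 centers and 4 descriptors from one image.
centers = [[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]]
descriptors = [[0.5, 0.2], [9.0, 11.0], [0.1, 9.5], [1.0, 1.0]]

histogram = bovw_histogram(descriptors, centers)
print(histogram)  # [2, 1, 1]
```

Each image yields one such length-K row, so 130 training images give a 130*K matrix.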
I have 130 images in my training set, so my training set is 130*K
dimensional.
I want to classify my test images. I have 1 image, so my sample is 1*K
dimensional. I wrote this code: knnclassify(sample, trainingset, group).
I want to classify into 7 groups, so I call knnclassify(sample(1*10), trainingset(130*10), group(7*1)).
The error is: The length of GROUP must equal the number of rows in TRAINING. What can I do?
Straight from the docs:
CLASS = knnclassify(SAMPLE,TRAINING,GROUP) classifies each row of the
data in SAMPLE into one of the groups in TRAINING using the nearest-
neighbor method. SAMPLE and TRAINING must be matrices with the same
number of columns. GROUP is a grouping variable for TRAINING. Its
unique values define groups, and each element defines the group to
which the corresponding row of TRAINING belongs. GROUP can be a
numeric vector, a string array, or a cell array of strings. TRAINING
and GROUP must have the same number of rows.
What this means is that group should be 130x1, and should indicate which group each of the training samples belongs to. unique(group) should return 7 values in your case: the seven categories represented in your training set.
If you don't already have a group vector that specifies which category each image falls into, you could use kmeans to split your training set into 7 groups (note that these are then unsupervised clusters, not necessarily your intended categories):
group = kmeans(trainingset,7);
knnclassify(sample, trainingset, group);
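To make the shape requirement concrete, here is a small nearest-neighbour sketch in plain Python. It is not MATLAB's knnclassify; `knn_classify` and the toy data are hypothetical, and the point is only that `group` carries one label per training row.

```python
from collections import Counter

def knn_classify(sample, training, group, k=1):
    """Return the majority label among the k training rows nearest to sample."""
    if len(group) != len(training):
        # Same condition that knnclassify complains about.
        raise ValueError("group must have one label per training row")
    ranked = sorted(
        range(len(training)),
        key=lambda i: sum((s - t) ** 2 for s, t in zip(sample, training[i])),
    )
    votes = Counter(group[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy data (assumed): 4 training rows, so group has 4 entries, not one per class.
training = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]]
group = [1, 1, 2, 2]

predicted = knn_classify([4.8, 5.0], training, group)
print(predicted)  # 2
```

With 130 training images and 7 categories, `group` would have 130 entries drawn from 7 distinct values.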
As part of preprocessing:
I removed attributes that are highly correlated (> 0.8).
I standardized the data (StandardScaler).
```python
from umap import UMAP
import hdbscan

# Reduce the scaled data to 3 dimensions with UMAP
reducer = UMAP(n_neighbors=20,
               min_dist=0,
               spread=2,
               n_components=3,
               metric='euclidean')
df_umap = reducer.fit_transform(df_scaled1)

# Cluster the embedding with HDBSCAN
clusterer = hdbscan.HDBSCAN(min_cluster_size=30, max_cluster_size=100, prediction_data=True)
clusterer.fit(df_umap)

# Assign cluster labels to the original dataset (-1 marks noise points)
df['cluster'] = clusterer.labels_
```
Data shape: (130351, 6)

| Column a | Column b | Column c | Column d      | Column e  | Column f  |
|----------|----------|----------|---------------|-----------|-----------|
| 6.000194 | 7.0      | 1059216  | 353069.000000 | 26.863543 | 15.891751 |
| 3.001162 | 3.5      | 1303727  | 396995.666667 | 32.508957 | 11.215764 |
| 6.000019 | 7.0      | 25887    | 3379.000000   | 18.004558 | 10.993119 |
| 6.000208 | 7.0      | 201138   | 59076.666667  | 41.140104 | 10.972880 |
| 6.000079 | 7.0      | 59600    | 4509.666667   | 37.469000 | 9.667119  |
[Image: output of df.describe()]
Results:
1. While some of the clusters contain very similar data points (example: cluster 1555), many others contain extreme data points within a single cluster (example: cluster 5423).
Also, cluster id '-1' has 36221 data points associated with it.
My questions:
Am I using the correct approach for the data I have and the result I am trying to achieve?
Is UMAP the correct choice for dimension reduction?
Is HDBSCAN the right choice for this clustering problem? (I chose HDBSCAN because it doesn't need the number of clusters as user input, and the minimum and maximum number of data points per cluster can be set beforehand.)
How can I tune the clustering model to achieve better cluster quality? (I am assuming that with better cluster quality the points assigned to cluster '-1' will also get clustered.)
Is there any method to assess cluster quality?
The dimensions for the input data for LSTM are [Batch Size, Sequence Length, Input Dimension] in tensorflow.
What is the meaning of Sequence Length and Input Dimension?
How do we assign values to them if my input data is of the form:
[[[1.23] [2.24] [5.68] [9.54] [6.90] [7.74] [3.26]]] ?
LSTMs are a subclass of recurrent neural networks. Recurrent neural nets are by definition applied to sequential data, which without loss of generality means data samples that change over a time axis. A full history of a data sample is then described by the sample values over a finite time window, i.e. if your data live in an N-dimensional space and evolve over t time steps, your input representation must be of shape (num_samples, t, N).
Your data does not fit the above description. I assume, however, that this representation means you have a scalar value x which evolves over 7 time instances, such that x[0] = 1.23, x[1] = 2.24, etc.
If that is the case, you need to reshape your input such that instead of a list of 7 elements, you have an array of shape (7, 1). Then, your full data can be described by a third-order tensor of shape (num_samples, 7, 1), which can be accepted by an LSTM.
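As a rough illustration of that reshaping, assuming the seven numbers form a single sample, plain nested lists are enough to show the shapes. The `shape` helper is hypothetical and only inspects the first element at each nesting level.

```python
# The seven scalar observations from the question, as a flat list.
values = [1.23, 2.24, 5.68, 9.54, 6.90, 7.74, 3.26]

# One sample: 7 time steps, each holding a 1-dimensional input -> shape (7, 1).
sample = [[v] for v in values]

# A batch containing that single sample -> shape (1, 7, 1).
batch = [sample]

def shape(nested):
    """Report the shape of a regular nested list (hypothetical helper)."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)

print(shape(sample))  # (7, 1)
print(shape(batch))   # (1, 7, 1)
```

Here the batch size is 1, the sequence length is 7 and the input dimension is 1.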
Simply put, seq_len is the number of time steps that will be fed into the LSTM network. Let's understand this by example.
Suppose you are doing sentiment classification using an LSTM.
Your input sentence to the network is ["I hate to eat apples"]. Each token is fed as input at one time step, so here seq_len is the total number of tokens in the sentence, which is 5.
Coming to input_dim: as you might know, we can't feed words directly to the network, so you need to encode those words as numbers. In PyTorch/TensorFlow, embedding layers are used for this, and we have to specify an embedding dimension.
Suppose your embedding dimension is 50. That means the embedding layer takes the index of each token and converts it into a vector representation of size 50. So the input dimension of the LSTM network becomes 50.
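A toy version of this, with a randomly filled lookup table standing in for a trained embedding layer (the vocabulary construction and the random values are assumptions, not what a real framework does internally):

```python
import random

sentence = "I hate to eat apples"
tokens = sentence.split()  # one token per time step

# Hypothetical vocabulary: map each distinct token to an integer index.
vocab = {tok: idx for idx, tok in enumerate(sorted(set(tokens)))}

# Stand-in for an embedding layer: one random 50-dimensional vector per token.
embedding_dim = 50
rng = random.Random(0)
embedding_table = [
    [rng.uniform(-1.0, 1.0) for _ in range(embedding_dim)] for _ in vocab
]

indices = [vocab[tok] for tok in tokens]
embedded = [embedding_table[i] for i in indices]  # shape (seq_len, input_dim)

seq_len = len(embedded)
input_dim = len(embedded[0])
print(seq_len, input_dim)  # 5 50
```

A batch of such sentences would then have shape (batch_size, 5, 50), assuming every sentence is padded or truncated to 5 tokens.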
I am new to neural networks.
I have a training dataset of 1K examples. Each example contains 5 features.
Initially, I assigned some values to the weights.
So, is a separate set of weight values stored for each of the 1K examples, or do the weight values remain the same for all 1K examples?
For example:
example1 => [f1,f2,f3,f4,f5] -> [w1e1,w2e1,w3e1,w4e1,w5e1]
example2 => [f1,f2,f3,f4,f5] -> [w1e2,w2e2,w3e2,w4e2,w5e2]
Here w1 means the first weight, and e1, e2 mean different examples.
or example1,example2,... -> [gw1,gw2,gw3,gw4,gw5]
Here g means global and w1 means the weight for feature one, and so on.
Start with a single node in the neural network. Its output is the sigmoid function applied to a linear combination of the inputs: output = sigmoid(w1*f1 + w2*f2 + w3*f3 + w4*f4 + w5*f5 + b).
So for 5 features you will have 5 weights + 1 bias for each node that is connected directly to the inputs, and the same weights are used for every example. While training, a batch of inputs is fed in, the output at the end of the neural network is calculated, the error is calculated with respect to the actual outputs, and gradients are backpropagated based on the error. In simple words, the weights are adjusted based on the error.
So for each such node you have 6 parameters, and depending on the number of nodes (which depends on the number of layers and the size of the layers) you can calculate the total number of weights. All the weights are updated once per batch (since you are doing batch training).
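A minimal sketch of one such node, with made-up weight values, showing that a single set of weights is applied to every example rather than one set per example:

```python
import math

def neuron_output(features, weights, bias):
    """Sigmoid of the weighted sum of the inputs plus the bias."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# One global set of parameters for the node (values are arbitrary assumptions).
weights = [0.1, -0.2, 0.3, 0.0, 0.5]
bias = 0.05

# Two of the 1K examples; both are pushed through the same weights.
examples = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
outputs = [neuron_output(x, weights, bias) for x in examples]

n_parameters = len(weights) + 1  # 5 weights + 1 bias
print(n_parameters)  # 6
```

Training changes the values in `weights` and `bias`, but there is still only one copy of them no matter how many examples there are.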
I use the predict function in OpenCV to classify my gestures.
svm.load("train.xml");
float ret = svm.predict(mat);//mat is my feature vector
I defined 5 labels (1.0, 2.0, 3.0, 4.0, 5.0), but the values of ret are actually (0.521220207, -0.247173533, -0.127723947, ...).
So I am confused. According to the official OpenCV documentation, the function should return a class label (classification) in my case.
Update: I still don't know why this result appears. But after I chose new features to train the model, the return value of the predict function is one of the labels I defined during the training phase (e.g. 1, 2, 3, etc.).
During the training of an SVM you assign a label to each class of training data.
When you classify a sample the returned result will match up with one of these labels telling you which class the sample is predicted to fall into.
There's some more documentation here which might help:
http://docs.opencv.org/doc/tutorials/ml/introduction_to_svm/introduction_to_svm.html
With Support Vector Machines (SVM) you have a training function and a prediction function. The training function trains a model on your data and saves it to an XML file (this makes prediction easier when you have a large amount of training data or need to run prediction in another project).
Example: with 20 images per class, in your case you have 20*5=100 training images. Each image is associated with the label of its class, and all this information is stored in train.xml.
The prediction function tells you which label to assign to your test image according to your training data (the whole of the work you did in the training process). Your prediction results may be good or bad; I think it all depends on your training data.
If you want, try calculating the error rate of your classifier to see how well it performs.
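One simple way to compute such an error rate, sketched in Python rather than with the OpenCV API. The label lists are invented; in practice `predicted_labels` would hold the values returned by predict on a held-out test set.

```python
def error_rate(predicted_labels, true_labels):
    """Fraction of test samples whose predicted label differs from the true one."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("need exactly one prediction per test sample")
    wrong = sum(p != t for p, t in zip(predicted_labels, true_labels))
    return wrong / len(true_labels)

# Hypothetical results for five test gestures, one per class.
true_labels = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted_labels = [1.0, 2.0, 3.0, 5.0, 5.0]

rate = error_rate(predicted_labels, true_labels)
print(rate)  # 0.2
```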
How can algorithms which partition a space into halves, such as Support Vector Machines, be generalised to label data with labels from sets such as the integers?
For example, a support vector machine operates by constructing a hyperplane and then things 'above' the hyperplane take one label, and things below it take the other label.
How does this get generalised so that the labels are, for example, integers, or some other arbitrarily large set?
One option is the 'one-vs-all' approach, in which you create one classifier for each set you want to partition into, and select the set with the highest probability.
For example, say you want to classify objects with a label from {1,2,3}. Then you can create three binary classifiers:
C1 = 1 or (not 1)
C2 = 2 or (not 2)
C3 = 3 or (not 3)
If you run these classifiers on a new piece of data X, then they might return:
C1(X) = 31.6% chance of being in 1
C2(X) = 63.3% chance of being in 2
C3(X) = 89.3% chance of being in 3
Based on these outputs, you could classify X as most likely being from class 3. (The probabilities don't add up to 1 - that's because the classifiers don't know about each other).
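The selection step in this example boils down to an argmax over the per-class scores. A small sketch using the numbers above (the `one_vs_all_predict` name is made up, and the scores are hard-coded instead of coming from real classifiers):

```python
def one_vs_all_predict(scores):
    """Pick the label whose binary classifier reported the highest score."""
    return max(scores, key=scores.get)

# Scores from C1, C2 and C3 for the same input X, as in the example above.
scores = {1: 0.316, 2: 0.633, 3: 0.893}

best_label = one_vs_all_predict(scores)
print(best_label)  # 3
```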
If your output labels are ordered (with some kind of meaningful, rather than arbitrary, ordering), you have another option. For example, in finance you might want to classify stocks into {BUY, SELL, HOLD}. Although you can't legitimately perform a regression on these (the data is ordinal rather than ratio data), you can assign the values -1, 0 and 1 to SELL, HOLD and BUY and then pretend that you have ratio data. Sometimes this can give good results even though it's not theoretically justified.
Another approach is the Crammer-Singer method ("On the algorithmic implementation of multiclass kernel-based vector machines").
Svmlight implements it here: http://svmlight.joachims.org/svm_multiclass.html.
Classification into an ordered set (such as the integers) is called ordinal regression. Usually this is done by mapping a range of continuous values onto an element of the set (see http://mlg.eng.cam.ac.uk/zoubin/papers/chu05a.pdf, Figure 1a).
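The "map a range of continuous values onto an element of the set" idea can be sketched with fixed cut points. The thresholds and labels below are assumptions chosen for illustration; in a real ordinal regression model the thresholds are learned along with the latent score.

```python
from bisect import bisect_right

def to_ordinal(value, thresholds, labels):
    """Map a continuous score to the label of the interval it falls into."""
    # thresholds must be sorted, and len(labels) == len(thresholds) + 1.
    return labels[bisect_right(thresholds, value)]

# Hypothetical cut points splitting the real line into three ordered ranges.
thresholds = [-0.5, 0.5]
labels = ["SELL", "HOLD", "BUY"]

decisions = [to_ordinal(v, thresholds, labels) for v in (-0.9, 0.1, 2.3)]
print(decisions)  # ['SELL', 'HOLD', 'BUY']
```

Adding more thresholds extends the same scheme to as many ordered labels as needed.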