I am currently developing my own kernel to use for classification and want to include it into libsvm, replacing the standard kernels that libsvm offers.
I however am not 100% sure how to do this, and obviously do not want to make any mistakes. Beware, that my c++ is not very good. I found the following on the libsvm faq-page:
Q: I would like to use my own kernel. Any example? In svm.cpp, there
are two subroutines for kernel evaluations: k_function() and
kernel_function(). Which one should I modify ? An example is "LIBSVM
for string data" in LIBSVM Tools.
The reason why we have two functions is as follows. For the RBF kernel
exp(-g |xi - xj|^2), if we calculate xi - xj first and then the norm
square, there are 3n operations. Thus we consider exp(-g (|xi|^2 -
2dot(xi,xj) +|xj|^2)) and by calculating all |xi|^2 in the beginning,
the number of operations is reduced to 2n. This is for the training.
For prediction we cannot do this so a regular subroutine using that 3n
operations is needed. The easiest way to have your own kernel is to
put the same code in these two subroutines by replacing any kernel.
Hence, I was trying to find the two subroutinges k_function() and kernel_function(). The former I found with the following signature in svm.cpp:
double Kernel::k_function(const svm_node *x, const svm_node *y,
const svm_parameter& param)
Am I correct, that x and y each store one observation (=row) of my feature matrix in an array and I need to return the kernel value k(x,y)?
The function kernel_function() on the other hand I was not able to find at all. There is a pointer in the Kernel class with that name and the following declaration
double (Kernel::*kernel_function)(int i, int j) const;
which is set in the Kernel constructor. What are i and j in that case? I suppose I need to set this pointer as well?
Once I overwrote Kernel::k_function and Kernel::*kernel_function I'd be finished, and libsvm would use my kernel to compare two observations?
Thank you!
You don't have to break into the code of LIBSVM to use your own kernel, you can use the pre-computed kernel option (i.e., -t 4 training_set_file).
Thus, you can compute the kernel matrix externally as it suits you, store the values in a file and load the pre-computed kernel to LIBSVM. There's an explanation accompanied with an example of how to do this in the README file that you can find in LIBSVM tar ball (see in the Precomputed Kernels section line 236).
Related
Since two operations Conv2DBackpropFilter and Conv2DBackpropInput count most of the time for lots of applications(AlexNet/VGG/GAN/Inception, etc.), I am analyzing the complexity of these two operations (back-propagation) in TensorFlow and I found out that there are three implementation versions (custom, fast and slot) for Conv2DBackpropFilter (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/conv_grad_filter_ops.cc ) and Conv2DBackpropInput (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/conv_grad_input_ops.cc). While I profile, all computations are passed to "custom" version instead of "fast" or "slow" which directly calls Eigen function SpatialConvolutionBackwardInput to do that.
The issue is:
Conv2DBackpropFilter uses Eigen:“TensorMap.contract" to do the tensor contraction and Conv2DBackpropInput uses Eigen:"MatrixMap.transpose" to do the matrix transposition in the Compute() function. Beside these two functions, I didn't see any convolutional operations which are needed for back-propagation theoretically. Beside convolutions, what else would be run inside these two operations for back-propagation? Does anyone know how to analyze the computation complexity of "back propagation" operation in TensorFlow?
I am looking for any advise/suggestion. Thank you!
In addition to the transposition and contraction, the gradient op for the filter and the gradient op for the input must transform their input using Im2Col and Col2Im respectively. Approximately speaking, these transformations enable the convolution operation to be implemented using tensor contraction. For more information, see the CS231n page on Convolutional Networks (specifically, the paragraphs titled "Implementation as Matrix Multiplication" and "Backpropagation").
mrry, I got it. It means that Conv2D, Conv2DBackpropFilter and Conv2DBackpropInput use the same way by using "GEMM" to work for convolution by Im2Col/Col2Im. An other issue is that while I do the profile of GAN in TensorFlow, the execution time of Conv2DBackpropInput and Conv2DBackpropFilter are around 4-6 times slower than Conv2D with the same input size. Why?
I'm working with Principal Component Analysis (PCA) in openCV. The constructor inputs for the case I'm interested in are:
PCA(InputArray data, InputArray mean, int flags, double retainedVariance);
Regarding the InputArray 'data' the documents state the appropriate flags should be:
CV_PCA_DATA_AS_ROW indicates that the input samples are stored as
matrix rows.
CV_PCA_DATA_AS_COL indicates that the input samples are
stored as matrix columns.
My question pertains to the use of the term 'samples' in that I'm not sure what a sample is in this context.
For example let's say I have 4 sets of data and for the sake of illustration let's label them A-D. Now each set A through D has 8 elements. They are then set up in the Mat variable I'll use as InputArray as follows:
The question is, which is it:
My sets are samples?
My data elements are samples?
Another way of asking:
Do I have 4 samples (CV_PCA_DATA_AS_COL)
Or do I have 4 sets of 8 samples (CV_PCA_DATA_AS_ROW)
?
As a guess, I'd choose CV_PCA_DATA_AS_COL (i.e. I have 4 samples) - but that's just where my head is at... Until I learn the correct terminology it seems the word 'sample' could apply to either reasoning.
Ugh...
So the answer was found by reversing the logic behind the documentation for the PCA::project step...
Mat PCA::project(InputArray vec)
vec – input vector(s); must have the same dimensionality and the same
layout as the input data used at PCA phase, that is, if
CV_PCA_DATA_AS_ROW are specified, then vec.cols==data.cols (vector
dimensionality)
i.e. 'sample' is equivalent to 'set', and the elements are the 'dimension'.
(and my guess was correct :)
How do i do a gaussi smoothing in the 3th dimension?
I have this detection pyramid, votes accumulated at four scales. Objects are found at each peak.
I already smoothed each of them in 2d, and reading in my papers that i need to filter the third dimension with a \sigma = 1, which i havent tried before, i am not even sure what it means.
I Figured out how to do it in Matlab, and need something simular in opencv/c++.
Matlab Raw Values:
Matlab Smoothen with M0 = smooth3(M0,'gaussian'); :
Gaussian filters are separable. You apply 1D filter at each dimension as follows:
for (dim = 0; dim < D; dim++)
tensor = gaussian_filter(tensor, dim);
I would recommend OpenCV for an implementation of a gaussian filter (and image processing in general) in C++.
Note that this assumes that your pyramid levels are all of the same size.
You can have your own functions that sample your scale-space pyramid on the fly while convolving the third dimension, but if you have enough memory I believe that it would be faster to scale up your coarser level to have the same size of the finest level.
Long ago (in 2008-2009) I have developed a small C++ template lib to apply some simple transformations and convolution filters. The library's source can be found in the Linderdaum Engine - it has nothing to do with the rest of the engine and does not use any of the engine's features. The license is MIT, so do whatever you want with it.
Take a look into the Linderdaum's source code (http://www.linderdaum.com) at Src/Linderdaum/Images/VolumeLib.*
The function to prepare the kernel is PrepareGaussianFilter() and MakeScalarVolumeConvolution() applies the filter. It is easy to adapt the library for the different data sources because the I/O is implemented using callback functions.
OpenCV SURF implementation returns a sequence of 64/128 32 bit float values (descriptor) for each feature point found in the image. Is there a way to normalize this float values and take them to an integer scale (for example, [0, 255])?. That would save important space (1 or 2 bytes per value, instead of 4). Besides, the conversion should ensure that the descriptors remain meaningful for other uses, such as clustering.
Thanks!
There are other feature extractors than SURF. The BRIEF extractor uses only 32 bytes per descriptor. It uses 32 unsigned bytes [0-255] as its elements. You can create one like this: Ptr ptrExtractor = DescriptorExtractor::create("BRIEF");
Be aware that a lot of image processing routines in OpenCV need or assume that the data is stored as floating-point numbers.
You can treat the float features as an ordinary image (Mat or cvmat) and then use cv::normalize(). Another option is using cv::norm() to find the range of descriptor values and then cv::convertTo() to convert to CV_8U. Look up the OpenCV documentation for these functions.
The descriptor returned by cv::SurfFeatureDetector is already normalized. You can verify this by taking the L2 Norm of the cv::Mat returned, or refer to the paper.
I try to implement a people detecting system based on SVM and HOG using OpenCV2.3. But I got stucked.
I came this far:
I can compute HOG values from an image database and then I calculate with LIBSVM the SVM vectors, so I get e.g. 1419 SVM vectors with 3780 values each.
OpenCV just wants one feature vector in the method hog.setSVMDetector(). Therefore I have to calculate one feature vector from my 1419 SVM vectors, that LIBSVM has calculated.
I found one hint, how to calculate this single feature vector: link
“The detecting feature vector at component i (where i is in the range e.g. 0-3779) is built out of the sum of the support vectors at i * the alpha value of that support vector, e.g.
det[i] = sum_j (sv_j[i] * alpha[j]) , where j is the number of the support vector, i
is the number of the components of the support vector.”
According to this, my routine works this way:
I take the first element of my first SVM vector, multiply it with the alpha value and add it with the first element of the second SVM vector that has been multiplied with alpha value, …
But after summing up all 1419 elements I get quite high values:
16.0657, -0.351117, 2.73681, 17.5677, -8.10134,
11.0206, -13.4837, -2.84614, 16.796, 15.0564,
8.19778, -0.7101, 5.25691, -9.53694, 23.9357,
If you compare them, to the default vector in the OpenCV sample peopledetect.cpp (and hog.cpp in the OpenCV source)
0.05359386f, -0.14721455f, -0.05532170f, 0.05077307f,
0.11547081f, -0.04268804f, 0.04635834f, -0.05468199f, 0.08232084f,
0.10424068f, -0.02294518f, 0.01108519f, 0.01378693f, 0.11193510f,
0.01268418f, 0.08528346f, -0.06309239f, 0.13054633f, 0.08100729f,
-0.05209739f, -0.04315529f, 0.09341384f, 0.11035026f, -0.07596218f,
-0.05517511f, -0.04465296f, 0.02947334f, 0.04555536f,
you see, that the default vector values are in the boundaries between –1 and +1, but my values exceed them far.
I think, my single feature vector routine needs some adjustment, any ideas?
Regards,
Christoph
The aggregated vector's values do look high.
I used the loadSVMfromModelFile() located in http://lnx.mangaitalia.net/trainer/main.cpp
I had to remove svinstr.sync(); from the code since it caused losing parts of the lines and getting wrong results.
I don't know much about the rest of the file, I only used this function.