How to normalize OpenCV feature descriptors to an integer scale?

OpenCV's SURF implementation returns a sequence of 64/128 32-bit float values (the descriptor) for each feature point found in the image. Is there a way to normalize these float values and take them to an integer scale (for example, [0, 255])? That would save significant space (1 or 2 bytes per value instead of 4). Besides, the conversion should ensure that the descriptors remain meaningful for other uses, such as clustering.
Thanks!

There are other feature extractors than SURF. The BRIEF extractor uses only 32 bytes per descriptor. It uses 32 unsigned bytes [0-255] as its elements. You can create one like this: Ptr<DescriptorExtractor> ptrExtractor = DescriptorExtractor::create("BRIEF");
Be aware that a lot of image processing routines in OpenCV need or assume that the data is stored as floating-point numbers.
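For illustration, a minimal sketch of using BRIEF with the OpenCV 2.x API; the image path and the choice of FAST as the keypoint detector are just placeholders:

#include <vector>
#include <opencv2/core/core.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/highgui/highgui.hpp>

int main()
{
    // Placeholder input; any grayscale image will do.
    cv::Mat img = cv::imread("image.png", CV_LOAD_IMAGE_GRAYSCALE);

    // Detect keypoints with any detector (FAST here), then describe them with BRIEF.
    std::vector<cv::KeyPoint> keypoints;
    cv::Ptr<cv::FeatureDetector> detector = cv::FeatureDetector::create("FAST");
    detector->detect(img, keypoints);

    cv::Ptr<cv::DescriptorExtractor> extractor = cv::DescriptorExtractor::create("BRIEF");
    cv::Mat descriptors;                     // one row per keypoint
    extractor->compute(img, keypoints, descriptors);

    // BRIEF descriptors are stored as unsigned bytes: 32 bytes per keypoint.
    CV_Assert(descriptors.type() == CV_8U);
    return 0;
}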

You can treat the float features as an ordinary image (Mat or CvMat) and then use cv::normalize(). Another option is using cv::minMaxLoc() or cv::norm() to find the range of descriptor values and then Mat::convertTo() to convert to CV_8U. Look up the OpenCV documentation for these functions.
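As a rough sketch (assuming 'descriptors' is the CV_32F Mat returned by the extractor, one row per keypoint), you could do the min/max mapping yourself. Note that for clustering to stay meaningful you should apply the same scale and offset to every descriptor, not a per-image one:

#include <opencv2/core/core.hpp>

// Map a CV_32F descriptor matrix onto [0, 255] and store it as CV_8U.
cv::Mat quantizeDescriptors(const cv::Mat& descriptors)
{
    // Find the current value range so the scaling covers it exactly.
    double minVal, maxVal;
    cv::minMaxLoc(descriptors, &minVal, &maxVal);
    if (maxVal == minVal) maxVal = minVal + 1.0;   // avoid division by zero

    // Linear mapping: (x - min) * 255 / (max - min), rounded to 8 bits.
    double scale = 255.0 / (maxVal - minVal);
    cv::Mat quantized;
    descriptors.convertTo(quantized, CV_8U, scale, -minVal * scale);
    return quantized;
}

// Equivalently, cv::normalize(descriptors, dst, 0, 255, cv::NORM_MINMAX, CV_8U)
// does the same in one call.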

The descriptor returned by cv::SurfFeatureDetector is already normalized. You can verify this by taking the L2 Norm of the cv::Mat returned, or refer to the paper.

Related

Scilab Error: Mean, Variance not executing

I1 is an RGB image. The variable 'out' basically stores one colour channel of the whole image.
The built-in functions mean, variance and standard deviation, when calculated on 'out', give an error asking for a real vector or matrix as input.
But when min or max is used, no error is reported, even though these built-in functions take the same kind of parameters as mentioned in the Scilab documentation, namely a vector or matrix of integers.
On further examination, it seems that the variable 'out' is of type matrix of graphic handles when it should be a matrix of integers.
I can't seem to understand why the error occurs if it works for the min and max functions.
How can I solve this problem?
The output of imread() is a hypermatrix of integers, not of floating point numbers.
This is shown by the fact that min(out) is displayed as "4" (without decimal point), not as "4."
Now, mean() and stdev() do not work with integers, only with real or complex numbers.
The solution is to convert integers into decimal numbers:
mean(double(out))
https://help.scilab.org/docs/6.1.1/en_US/double.html

Packing a 16-bit floating point variable into two 8-bit variables (aka half into two bytes)

I code on XNA and only have access to shader model 3, hence no bitshift operators. I need to pack two arbitrary 16-bit floating point variables (meaning NOT in the range [0,1], but ANY random float value) into two 8-bit variables. There is no way to normalize them.
I thought about doing bitshifting manually but I can't find a good article on how to convert a random decimal float (not [0,1]) into binary and back.
Thanks
This is not really a good idea - a 16-bit float already has very limited range and precision. Remember that 8 bits leave you with just 256 possible values!
Getting an 8-bit value into a shader is trivial; passing it as a colour is one method. You can use each channel as a normalised range from 0 to 1.
Of course, you say you don't want to normalise your values. So I assume you want to maintain the nice floating-point property of a wide range with better precision closer to zero.
(Now would be a good time to read some background info on floating-point. Especially about half-precision floating-point and minifloats and microfloats.)
One way to do that would be to encode your values using a logarithm and decode them using an exponential (this is basically exactly what the floating-point format itself does). The exact maths will depend on the precision and the range that you desire (which 256 values will you represent?), so I will leave it as an exercise.
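To make the idea concrete, here is a rough C++ sketch of such a log-scale byte encoding (the HLSL version is the same arithmetic); the magnitude limits MIN_MAG and MAX_MAG are assumptions you would tune to the values you actually need to represent:

#include <cmath>
#include <cstdint>
#include <cstdio>

// Assumed representable magnitude range; values at or below MIN_MAG decode to zero,
// values above MAX_MAG are clamped.
const float MIN_MAG = 1.0f / 256.0f;
const float MAX_MAG = 256.0f;

// Pack a float into one byte: 1 sign bit + 7 bits of log-scaled magnitude.
uint8_t encode(float v)
{
    uint8_t sign = (v < 0.0f) ? 0x80 : 0x00;
    float mag = std::fabs(v);
    if (mag <= MIN_MAG) return sign;              // at or below threshold -> code 0
    if (mag > MAX_MAG) mag = MAX_MAG;             // clamp overflow
    // Position of the magnitude on a log scale, mapped to [0, 127].
    float t = std::log(mag / MIN_MAG) / std::log(MAX_MAG / MIN_MAG);
    return sign | (uint8_t)(t * 127.0f + 0.5f);
}

float decode(uint8_t b)
{
    float sign = (b & 0x80) ? -1.0f : 1.0f;
    int code = b & 0x7F;
    if (code == 0) return 0.0f;
    float t = code / 127.0f;
    return sign * MIN_MAG * std::exp(t * std::log(MAX_MAG / MIN_MAG));
}

int main()
{
    float x = -3.7f;
    std::printf("%f -> %u -> %f\n", x, (unsigned)encode(x), decode(encode(x)));
    return 0;
}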

Using my own kernel in libsvm

I am currently developing my own kernel to use for classification and want to include it into libsvm, replacing the standard kernels that libsvm offers.
However, I am not 100% sure how to do this, and obviously do not want to make any mistakes. Be aware that my C++ is not very good. I found the following on the libsvm FAQ page:
Q: I would like to use my own kernel. Any example? In svm.cpp, there
are two subroutines for kernel evaluations: k_function() and
kernel_function(). Which one should I modify ? An example is "LIBSVM
for string data" in LIBSVM Tools.
The reason why we have two functions is as follows. For the RBF kernel
exp(-g |xi - xj|^2), if we calculate xi - xj first and then the norm
square, there are 3n operations. Thus we consider exp(-g (|xi|^2 -
2dot(xi,xj) +|xj|^2)) and by calculating all |xi|^2 in the beginning,
the number of operations is reduced to 2n. This is for the training.
For prediction we cannot do this so a regular subroutine using that 3n
operations is needed. The easiest way to have your own kernel is to
put the same code in these two subroutines by replacing any kernel.
Hence, I was trying to find the two subroutines k_function() and kernel_function(). The former I found in svm.cpp with the following signature:
double Kernel::k_function(const svm_node *x, const svm_node *y, const svm_parameter& param)
Am I correct, that x and y each store one observation (=row) of my feature matrix in an array and I need to return the kernel value k(x,y)?
The function kernel_function(), on the other hand, I was not able to find at all. There is a pointer in the Kernel class with that name, with the following declaration:
double (Kernel::*kernel_function)(int i, int j) const;
which is set in the Kernel constructor. What are i and j in that case? I suppose I need to set this pointer as well?
Once I have overwritten Kernel::k_function and Kernel::*kernel_function, am I finished, and will libsvm use my kernel to compare two observations?
Thank you!
You don't have to break into the code of LIBSVM to use your own kernel; you can use the precomputed kernel option (i.e., -t 4 training_set_file).
Thus, you can compute the kernel matrix externally as it suits you, store the values in a file and load the precomputed kernel into LIBSVM. There's an explanation, accompanied by an example of how to do this, in the README file that you can find in the LIBSVM tarball (see the Precomputed Kernels section, line 236).
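As a sketch of what producing that file can look like (the kernel itself, the names myKernel and writePrecomputed, and the in-memory data layout are all just placeholders): each training line holds the label, a "0:<serial number>" entry, and then one column per training sample with K(x_i, x_j), as the README describes.

#include <cstdio>
#include <vector>

// Example kernel; swap in whatever kernel you actually want LIBSVM to use.
double myKernel(const std::vector<double>& a, const std::vector<double>& b)
{
    double dot = 0.0;                              // plain linear kernel here
    for (size_t k = 0; k < a.size(); ++k) dot += a[k] * b[k];
    return dot;
}

// Write a training file usable with "-t 4" (precomputed kernel).
void writePrecomputed(const char* path,
                      const std::vector<std::vector<double> >& data,
                      const std::vector<int>& labels)
{
    FILE* f = std::fopen(path, "w");
    if (!f) return;
    for (size_t i = 0; i < data.size(); ++i)
    {
        // LIBSVM format: <label> 0:<1-based serial number> 1:K(xi,x1) ... L:K(xi,xL)
        std::fprintf(f, "%d 0:%u", labels[i], (unsigned)(i + 1));
        for (size_t j = 0; j < data.size(); ++j)
            std::fprintf(f, " %u:%g", (unsigned)(j + 1), myKernel(data[i], data[j]));
        std::fprintf(f, "\n");
    }
    std::fclose(f);
}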

How are matrices stored in memory?

Note - may be more related to computer organization than software, not sure.
I'm trying to understand something related to data compression, say for jpeg photos. Essentially a very dense matrix is converted (via discrete cosine transforms) into a much more sparse matrix. Supposedly it is this sparse matrix that is stored. Take a look at this link:
http://en.wikipedia.org/wiki/JPEG
Compare the original 8x8 sub-block image example to matrix "B", which has been transformed to have overall lower-magnitude values and many more zeros throughout. How is matrix B stored such that it saves much more memory over the original matrix?
The original matrix clearly needs 8x8 (number of entries) x 8 bits/entry since values can range randomly from 0 to 255. OK, so I think it's pretty clear we need 64 bytes of memory for this. Matrix B on the other hand, hmmm. Best case scenario I can think of is that values range from -26 to +5, so at most an entry (like -26) needs 6 bits (5 bits to form 26, 1 bit for sign I guess). So then you could store 8x8x6 bits = 48 bytes.
The other possibility I see is that the matrix is stored in a "zig zag" order from the top left. Then we can specify a start and an end address and just keep storing along the diagonals until we're only left with zeros. Let's say it's a 32-bit machine; then 2 addresses (start + end) will constitute 8 bytes; for the other non-zero entries at 6 bits each, say, we have to go along almost all the top diagonals to store a sum of 28 elements. In total this scheme would take 29 bytes.
To summarize my question: if JPEG and other image encoders are claiming to save space by using algorithms to make the image matrix less dense, how is this extra space being realized in my hard disk?
Cheers
The DCT needs to be accompanied by other compression schemes that take advantage of the zeros and frequently occurring values. A simple example is run-length encoding.
JPEG uses a variant of Huffman coding.
As it says under "Entropy coding", a zig-zag pattern is used, together with RLE, which already reduces the size in many cases. However, as far as I know the DCT doesn't produce a sparse matrix per se; rather, after quantization it leaves a matrix that the entropy coder can compress much better. This is also the point where the compression becomes lossy: the input matrix is transformed with the DCT, then the values are quantized, and then Huffman encoding is applied.
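The quantization step mentioned above is just an element-wise divide-and-round against a table of divisors; a small sketch (the table values would be illustrative, not the actual JPEG luminance table):

#include <cmath>

// Quantize an 8x8 block of DCT coefficients: divide by a table entry and round.
// Large divisors at the high-frequency positions push most coefficients to 0,
// which is exactly what makes the later RLE/Huffman stage effective.
void quantize(const double dct[8][8], const int qtable[8][8], int out[8][8])
{
    for (int u = 0; u < 8; ++u)
        for (int v = 0; v < 8; ++v)
            out[u][v] = (int)std::lround(dct[u][v] / qtable[u][v]);
}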
The most simple compression would take advantage of repeated sequences of symbols (zeros). A matrix in memory may look like this (suppose in dec system)
0000000000000100000000000210000000000004301000300000000004
After compression it may look like this
(0,13)1(0,11)21(0,12)43010003(0,11)4
(Symbol,Count)...
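A rough sketch of that zero-run idea in C++ (a real encoder would only replace runs that are longer than the (symbol,count) pair itself, which is why the short runs above were left literal):

#include <cstdio>
#include <vector>

// Replace every run of zeros with a (0,count) pair; other symbols pass through.
void rleZeros(const std::vector<int>& in)
{
    for (size_t i = 0; i < in.size(); )
    {
        if (in[i] == 0)
        {
            size_t count = 0;
            while (i < in.size() && in[i] == 0) { ++count; ++i; }
            std::printf("(0,%u)", (unsigned)count);
        }
        else
        {
            std::printf("%d", in[i]);
            ++i;
        }
    }
    std::printf("\n");
}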
As I understand it, JPEG does not only compress, it also drops data. After the 8x8 block is transferred to the frequency domain, it drops the insignificant (high-frequency) data, which means it only has to save the significant 6x6 or even 4x4 portion of the data. That is how it can reach a higher compression rate than lossless methods (like GIF).

OpenCV + HOG + SVM: help needed with SVM single feature vector

I am trying to implement a people detection system based on SVM and HOG using OpenCV 2.3, but I am stuck.
I came this far:
I can compute HOG values from an image database and then calculate the SVM support vectors with LIBSVM, so I get e.g. 1419 support vectors with 3780 values each.
OpenCV just wants one feature vector in the method hog.setSVMDetector(). Therefore I have to calculate one feature vector from the 1419 support vectors that LIBSVM has calculated.
I found one hint on how to calculate this single feature vector: link
“The detecting feature vector at component i (where i is in the range e.g. 0-3779) is built out of the sum of the support vectors at i * the alpha value of that support vector, e.g.
det[i] = sum_j (sv_j[i] * alpha[j]) , where j is the number of the support vector, i
is the number of the components of the support vector.”
According to this, my routine works this way:
I take the first element of my first support vector, multiply it by that vector's alpha value, and add to it the first element of the second support vector multiplied by its alpha value, …
But after summing up all 1419 elements I get quite high values:
16.0657, -0.351117, 2.73681, 17.5677, -8.10134,
11.0206, -13.4837, -2.84614, 16.796, 15.0564,
8.19778, -0.7101, 5.25691, -9.53694, 23.9357,
If you compare them to the default vector in the OpenCV sample peopledetect.cpp (and hog.cpp in the OpenCV source)
0.05359386f, -0.14721455f, -0.05532170f, 0.05077307f,
0.11547081f, -0.04268804f, 0.04635834f, -0.05468199f, 0.08232084f,
0.10424068f, -0.02294518f, 0.01108519f, 0.01378693f, 0.11193510f,
0.01268418f, 0.08528346f, -0.06309239f, 0.13054633f, 0.08100729f,
-0.05209739f, -0.04315529f, 0.09341384f, 0.11035026f, -0.07596218f,
-0.05517511f, -0.04465296f, 0.02947334f, 0.04555536f,
you see that the default vector values lie between -1 and +1, but my values exceed that range by far.
I think my single-feature-vector routine needs some adjustment; any ideas?
Regards,
Christoph
The aggregated vector's values do look high.
I used the loadSVMfromModelFile() located in http://lnx.mangaitalia.net/trainer/main.cpp
I had to remove svinstr.sync(); from the code, since it caused parts of the lines to be lost and gave wrong results.
I don't know much about the rest of the file, I only used this function.
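For reference, a minimal sketch of the summation described in the quoted hint, with the bias appended as the last element the way hog.setSVMDetector() expects. It assumes the sv_coef values read from the LIBSVM model already carry the label sign, and the sign of rho depends on how your labels were ordered, so check both against your own model:

#include <vector>

// Build one detector vector from the trained model's support vectors and coefficients.
// supportVectors: one support vector per row (e.g. 1419 rows of 3780 values).
// alpha: the corresponding sv_coef values; rho: the bias from the model file.
std::vector<float> buildDetector(const std::vector<std::vector<float> >& supportVectors,
                                 const std::vector<float>& alpha,
                                 float rho)
{
    const size_t dim = supportVectors[0].size();          // e.g. 3780
    std::vector<float> detector(dim + 1, 0.0f);

    // det[i] = sum_j alpha[j] * sv_j[i], the weighted sum from the quoted hint.
    for (size_t j = 0; j < supportVectors.size(); ++j)
        for (size_t i = 0; i < dim; ++i)
            detector[i] += alpha[j] * supportVectors[j][i];

    detector[dim] = -rho;                                  // bias goes last
    return detector;
}

// Usage sketch: hog.setSVMDetector(buildDetector(svs, alphas, rho));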
