Get a blobs Skewness by inspecting 3rd Order Moment? - opencv

I've been reading up on Image Moments and it looks like they are very useful for efficiently describing a blob. Apparently, the 3rd order moment represents/can tell me about a blob's skewness (is this correct?).
How can I get the 3rd order moment in OpenCV? Do you have a calculation/formula you can point me to?
Moments m = moments(contour, false);
// Are any of these the 3rd order moment?
m.m03;
m.mu03;
m.nu03;

As stated in the OpenCV docs for moments():
Calculates all of the moments up to the third order of a polygon or rasterized shape.
So yes, moments() does return what you're after.
The three quantities you mention, m03, mu03, nu03 are all different types of the third or moment.
m03 is the third order moment
mu03 is the third order central moment, i.e. the same as m03 but if the blob was centered around (0, 0)
nu03 is mean-shifted and normalized, i.e. the same as mu03 but divided by an area.
Let's say you wanted to describe a shape but be agnostic to the size or the location of it in your image. Then you would use the mean-shifted and normalized descriptors, nu03. But if you wanted to keep the size as part of the description, then you'd use mu03. If you wanted to keep the location information as well, you'd use m03.
You can think about it in the same way as you think about distributions in general. Saying that I have a sample of x = 500 in a normal distribution with mean 450 and standard deviation 25 is basically the same thing as saying I have a sample of x = 2 in a normal distribution with mean 0 and std dev 1. Sometimes you might want to talk about the distribution in terms of it's actual parameters (mean 450, std dev 25), sometimes you might want to talk about the distribution as if it's mean centered (mean 0, std dev 25), and sometimes talk about it as if it's the standard Gaussian (mean 0, std dev 1).
This excellent answer goes over how to manually calculate the moments, which in tandem with the 'basic' descriptions I gave above, should make the formulas in the OpenCV docs for moments() a little more comfortable.

Related

Calculate total distance between multiple pairwise distributions/histograms

I am not sure about the terminology I should use for my problem, so I will give an example.
I have 2 sets of measurements (6 empirical distributions per set = D1-6) that describe 2 different states of the same system (BLUE & RED). These distributions can be multimodal, skewed, undersampled, and strange in some other unpredictable ways.
BLUE is my reference and I want to make RED distributed as closely as possible to BLUE, for all pairwise distributions. For this, I will play with parameters of my RED system and monitor the RED set of measurements D1-6 trying to make it overlap BLUE perfectly.
I know that I can use Jensen-Shannon or Bhattacharyya distances to evaluate the distance between 2 distributions (i.e. RED-D1 and BLUE-D1, for example). However, I do not know if there exist other metrics that could be applied here to get a global distance between all distributions (i.e. quantify the global mismatch between 2 sets of pairwise distributions). Is that the case ?
I am thinking about building an empirical scoring function that would use all the pairwise Jensen-Shannon distances, but I have no better ideas yet. I believe I can NOT just sum all the JS distances because I would get similar scores in these 2 hypothetical, different cases:
D1-6 are distributed as in my image
RED-D1-5 are a much better fit to BLUE-D1-5, BUT RED-D6 is shifted compared to BLUE-D6
And that would be wrong because I would have missed one important feature of my system. Given these 2 cases, it is better to have D1-6 distributed as in my image (solution 1).
The pairwise match between each distribution is equally important and should be equally weighted (i.e. the match between BLUE-D1 and RED-D1 is as important as the match between BLUE-D2 and RED-D2, etc).
D1-3 has a given range DOM1 of [0, 5] and D4-6 has another range DOM2 of [50, 800]. Diamonds represent the weighted means of BLUE and RED distributions.
Thank you very much for your help!
I ended up using a sum of all pairwise Earth Mover's Distance (EMD, https://en.wikipedia.org/wiki/Earth_mover%27s_distance, also known as Wasserstein metric) as a global metric of distance between all pairwise distributions. This describes the difference or similarity between 2 states of my system in an appropriate way.
EMD is implemented in python in package 'pyemd' or using scipy: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.wasserstein_distance.html.

How to Address Noise Resulting from Inverse-Scaling for a Machine Learning Task

I'm still a little unsure of whether questions like these belong on stackoverflow. Is this website only for questions with explicit code? The "How to Format" just tells me it should be a programming question, which it is. I will remove my question if the community thinks otherwise.
I have created a neural network and am predicting reasonable values for most of my data (the task is multi-variate time series forecasting).
I scale my data before inputting it using scikit-learn's MinMaxScaler(0,1) or MinMaxScaler(-1,1) (the two primary scalings I am using).
The model learns, predicts, and I inverse the scaling using MinMaxScaler()'s inverse_transform method to visually see how close my predictions were to the actual values. However, I notice that the inverse_scaled values for a particular part of the vector I predicted have now become very noisy. Here is what I mean (inverse_scaled prediction):
Left end: noisy; right end: not-so noisy.
I initially thought that perhaps my network didn't learn that part of the vector well, so is just outputting ~random values. BUT, I notice that the predicted values before the inverse scaling seem to match the actual values very well, but that these values are typically near 0 or -1 (lower limit of the feature scale) because of the fact that these values have a very large spread (unscaled mean= 1E-1, max= 1E+1 [not an outlier]). Example (scaled prediction):
So, when inverse transforming these values (again, often near -1 or 0), the transformed values exhibit loud noise, as shown in the images.
Questions:
1.) Should I be using a different scaler/scaling differently, perhaps one that exponentially/nonlinearly scales? MinMaxScaler() scales each column. Simply dropping the high-magnitude data isn't an option since they are real, meaningful data. 2.) What other solutions can help this?
Please let me know if you'd like anything else clarified.

OpenCV: PNP pose estimation fails in a specific case

I am using OpenCV's solvePnPRansac function to estimate the pose of my camera given a pointcloud made from tracked features. My pipeline consists of multiple cameras where I form the point cloud from matched features between two cameras, and use that as a reference to estimate the pose of one of the cameras as it starts moving. I have tested this in multiple settings and it works as long as there are enough features to track while the camera is in motion.
Strangely, during a test I did today, I encountered a failure case where solvePnP would just return junk values all the time. What's confusing here is that in this data set, my point cloud is much denser, it's reconstructed pretty accurately from the two views, the tracked number of points (currently visible features vs. features in the point cloud) at any given time was much higher than what I usually have, so theoretically it should have been a breeze for solvePnP, yet it fails terribly.
I tried with CV_ITERATIVE, CV_EPNP and even the non RANSAC version of solvePnP. I was just wondering if I am missing something basic here? The scene I am looking at can be seen in these images (image 1 is the scene and feature matches between two perspectives, image 2 is the point cloud for reference)
The part of the code doing PNP is pretty simple. If P3D is the array of tracked 3Dpoints, P2D is the corresponding set of image points,
solvePnpRansac(P3D, P2D, K, d, R, T, false, 500, 2.0, 100, noArray(), CV_ITERATIVE);
EDIT: I should also mention that my reference poincloud was obtained with a baseline of 8 feet between the cameras, whereas the building I am looking at was probably like a 100 feet away. Could the possible lack of disparity cause issues as well?

Explanation for Values in Scharr-Filter used in OpenCV (and other places)

The Scharr-Filter is explained in Scharrs dissertation. However the values given on page 155 (167 in the pdf) are [47 162 47] / 256. Multiplying this with the derivation-filter would yield:
Yet all other references I found use
Which is roughly the same as the ones given by Scharr, scaled by a factor of 32.
Now my guess is that the range can be represented better, but I'm curious if there is an official explanation somewhere.
To get the ball rolling on this question in case no "expert" can be found...
I believe the values [3, 10, 3] ... instead of [47 162 47] / 256 ... are used simply for speed. Recall that this method is competing against the Sobel Operator whose coefficient values are are 0, and positive/negative 1's and 2's.
Even though the divisor in the division, 256 or 512, is a power of 2 and can can be performed by a shift, doing that and multiplying by 47 or 162 is going to take more time. A multiplication by 3 however can in fact be done on some RISC architectures like the IBM POWER series in a single shift-and-add operation. That is 3x = (x << 1) + x. (On these architectures, the shifter and adder are separate units and can be done independently).
I don't find it surprising that Phd paper used the more complicated and probably more precise formula; it needed to prove or demonstrate something, and the author probably wasn't totally certain or concerned that it be used and implemented alongside other methods. The purpose in the thesis was probably to have "perfect rotational symmetry". Afterwards when one decides to implement it, that person I suspect used the approximation formula and gave up a little on perfect rotational symmetry, to gain speed. That person's goal as I said was to have something that was competitive at the expense of little bit of speed for this rotational stuff.
Since I'm guessing you are willing to do work this as it is your thesis, my suggestion is to implement the original algorithm and benchmark it against both the OpenCV Scharr and Sobel code.
The other thing to try to get an "official" answer is: "Use the 'source', Luke!". The code is on github so check it out and see who added the Scharr filter there and contact that person. I won't put the person's name here, but I will say that the code was added 2010-05-11.

Which of the parameters in LibSVM is the slack variable?

I am a bit confused about the namings in the SVM. I am using this library LibSVM. There are so many parameters that can be set. Does anyone know which of these is the slack variable?
thx
The "slack variable" is C in c-svm and nu in nu-SVM. These both serve the same function in their respective formulations - controlling the tradeoff between a wide margin and classifier error. In the case of C, one generally test it in orders of magnitude, say 10^-4, 10^-3, 10^-2,... to 1, 5 or so. nu is a number between 0 and 1, generally from .1 to .8, which controls the ratio of support vectors to data points. When nu is .1, the margin is small, the number of support vectors will be a small percentage of the number of data points. When nu is .8, the margin is very large and most of the points will fall in the margin.
The other things to consider are your choice of kernel (linear, RBF, sigmoid, polynomial) and the parameters for the chosen kernel. Generally one has to do a lot of experimenting to find the best combination of parameters. However, be careful of over-fitting to your dataset.
Burges wrote a great tutorial: A Tutorial on Support Vector Machines for Pattern
Recognition
But if you mostly just want to know how to USE it and less about how it works, read "A Practical Guide to Support Vector Classication" by Chih-Wei Hsu, Chih-Chung Chang, and Chih-Jen Lin (authors of libsvm)
First decide which type of SVM are u intending to use: C-SVC, nu-SVC , epsilon-SVR or nu-SVR. In my opinion u need to vary C and gamma most of the time... the rest are usually fixed..

Resources