Is the median or mean of a set of values in Decibels (dB) taken directly or conversion to linear is required - mean

I need to take the median of a set of values of path loss (dB) in MatLab. Does anyone know that they shall be converted to linear units like Watts before their median is calculated by the formula. The result is different in both the cases but i don't know which one is correct.

The median is just the middle number if you were to line them up in ascending or descending order. It should give the same result if you convert to a linear scale or if you keep it as dB.
However, if there are an even number of values, then it will make a difference and you should probably stick with the linear values.

Related

How to convert kernel to matrix notation?

I'm trying to understand the bicubic convolution algorithm and haven't been able to understand how the kernel given as a piece wide function,
is turned into this matrix:
I understand to arrive at the matrix a was set to -0.5. No matter how I look at it I can't arrive at the non-symmetric matrix shown.
I've looked through the paper by Keys, but he does not expand into matrix notation and I've struggled with how to get there.
Any insight would be much appreciated.
Step 1 to see the relation is to multiply the function W(x) with the sampled input data f[n] for a given shift t. This gives 5 weights multiplying to 5 input samples, and added together to form an output sample p(t).
The matrix used to compute p(t) is not symmetric because, for any shift t that is not 0, the weights applied to the samples are not symmetric either. You can see this by writing out W(t+i), which are the weights applied to the 5 samples around the output position t (i in [-2,2]).
I've found and understood where Keys describes the process. You can follow along from top to bottom in the image below, but the most important bit to note is Equation 7.
All of the values within the matrix come from the coefficients of the c-terms. The first row of the matrix corresponds to the coefficients of the constant terms, and the first column corresponds to the c_j-1 terms. This can be seen by comparing the figure below to Equation 7's coefficients:
I was able to use this understanding to implement the cubic convolution method to interpolate a surface for which I was able to tune the value of a in order to see the response. I'm happy to help expand on this if anything is unclear!

How to make the labels of superpixels to be locally consistent in a gray-level map?

I have a bunch of gray-scale images decomposed into superpixels. Each superpixel in these images have a label in the rage of [0-1]. You can see one sample of images below.
Here is the challenge: I want the spatially (locally) neighboring superpixels to have consistent labels (close in value).
I'm kind of interested in smoothing local labels but do not want to apply Gaussian smoothing functions or whatever, as some colleagues suggested. I have also heard about Conditional Random Field (CRF). Is it helpful?
Any suggestion would be welcome.
I'm kind of interested in smoothing local labels but do not want to apply Gaussian smoothing functions or whatever, as some colleagues suggested.
And why is that? Why do you not consider helpful advice of your colleagues, which are actually right. Applying smoothing function is the most reasonable way to go.
I have also heard about Conditional Random Field (CRF). Is it helpful?
This also suggests, that you should rather go with collegues advice, as CRF has nothing to do with your problem. CRF is a classifier, sequence classifier to be exact, requiring labeled examples to learn from and has nothing to do with the setting presented.
What are typical approaches?
The exact thing proposed by your collegues, you should define a smoothing function and apply it to your function values (I will not use a term "labels" as it is missleading, you do have values in [0,1], continuous values, "label" denotes categorical variable in machine learning) and its neighbourhood.
Another approach would be to define some optimization problem, where your current assignment of values is one goal, and the second one is "closeness", for example:
Let us assume that you have points with values {(x_i, y_i)}_{i=1}^N and that n(x) returns indices of neighbouring points of x.
Consequently you are trying to find {a_i}_{i=1}^N such that they minimize
SUM_{i=1}^N (y_i - a_i)^2 + C * SUM_{i=1}^N SUM_{j \in n(x_i)} (a_i - a_j)^2
------------------------- - --------------------------------------------
closeness to current constant to closeness to neighbouring values
values weight each part
You can solve the above optimization problem using many techniques, for example through scipy.optimize.minimize module.
I am not sure that your request makes any sense.
Having close label values for nearby superpixels is trivial: take some smooth function of (X, Y), such as constant or affine, taking values in the range [0,1], and assign the function value to the superpixel centered at (X, Y).
You could also take the distance function from any point in the plane.
But this is of no use as it is unrelated to the image content.

Linear Regression :: Normalization (Vs) Standardization

I am using Linear regression to predict data. But, I am getting totally contrasting results when I Normalize (Vs) Standardize variables.
Normalization = x -xmin/ xmax – xmin
 
Zero Score Standardization = x - xmean/ xstd
 
a) Also, when to Normalize (Vs) Standardize ?
b) How Normalization affects Linear Regression?
c) Is it okay if I don't normalize all the attributes/lables in the linear regression?
Thanks,
Santosh
Note that the results might not necessarily be so different. You might simply need different hyperparameters for the two options to give similar results.
The ideal thing is to test what works best for your problem. If you can't afford this for some reason, most algorithms will probably benefit from standardization more so than from normalization.
See here for some examples of when one should be preferred over the other:
For example, in clustering analyses, standardization may be especially crucial in order to compare similarities between features based on certain distance measures. Another prominent example is the Principal Component Analysis, where we usually prefer standardization over Min-Max scaling, since we are interested in the components that maximize the variance (depending on the question and if the PCA computes the components via the correlation matrix instead of the covariance matrix; but more about PCA in my previous article).
However, this doesn’t mean that Min-Max scaling is not useful at all! A popular application is image processing, where pixel intensities have to be normalized to fit within a certain range (i.e., 0 to 255 for the RGB color range). Also, typical neural network algorithm require data that on a 0-1 scale.
One disadvantage of normalization over standardization is that it loses some information in the data, especially about outliers.
Also on the linked page, there is this picture:
As you can see, scaling clusters all the data very close together, which may not be what you want. It might cause algorithms such as gradient descent to take longer to converge to the same solution they would on a standardized data set, or it might even make it impossible.
"Normalizing variables" doesn't really make sense. The correct terminology is "normalizing / scaling the features". If you're going to normalize or scale one feature, you should do the same for the rest.
That makes sense because normalization and standardization do different things.
Normalization transforms your data into a range between 0 and 1
Standardization transforms your data such that the resulting distribution has a mean of 0 and a standard deviation of 1
Normalization/standardization are designed to achieve a similar goal, which is to create features that have similar ranges to each other. We want that so we can be sure we are capturing the true information in a feature, and that we dont over weigh a particular feature just because its values are much larger than other features.
If all of your features are within a similar range of each other then theres no real need to standardize/normalize. If, however, some features naturally take on values that are much larger/smaller than others then normalization/standardization is called for
If you're going to be normalizing at least one variable/feature, I would do the same thing to all of the others as well
First question is why we need Normalisation/Standardisation?
=> We take a example of dataset where we have salary variable and age variable.
Age can take range from 0 to 90 where salary can be from 25thousand to 2.5lakh.
We compare difference for 2 person then age difference will be in range of below 100 where salary difference will in range of thousands.
So if we don't want one variable to dominate other then we use either Normalisation or Standardization. Now both age and salary will be in same scale
but when we use standardiztion or normalisation, we lose original values and it is transformed to some values. So loss of interpretation but extremely important when we want to draw inference from our data.
Normalization rescales the values into a range of [0,1]. also called min-max scaled.
Standardization rescales data to have a mean (μ) of 0 and standard deviation (σ) of 1.So it gives a normal graph.
Example below:
Another example:
In above image, you can see that our actual data(in green) is spread b/w 1 to 6, standardised data(in red) is spread around -1 to 3 whereas normalised data(in blue) is spread around 0 to 1.
Normally many algorithm required you to first standardise/normalise data before passing as parameter. Like in PCA, where we do dimension reduction by plotting our 3D data into 1D(say).Here we required standardisation.
But in Image processing, it is required to normalise pixels before processing.
But during normalisation, we lose outliers(extreme datapoints-either too low or too high) which is slight disadvantage.
So it depends on our preference what we chose but standardisation is most recommended as it gives a normal curve.
None of the mentioned transformations shall matter for linear regression as these are all affine transformations.
Found coefficients would change but explained variance will ultimately remain the same. So, from linear regression perspective, Outliers remain as outliers (leverage points).
And these transformations also will not change the distribution. Shape of the distribution remains the same.
lot of people use Normalisation and Standardisation interchangeably. The purpose remains the same is to bring features into the same scale. The approach is to subtract each value from min value or mean and divide by max value minus min value or SD respectively. The difference you can observe that when using min value u will get all value + ve and mean value u will get bot + ve and -ve values. This is also one of the factors to decide which approach to use.

how to interpret the "soft" and "max" in the SoftMax regression?

I know the form of the softmax regression, but I am curious about why it has such a name? Or just for some historical reasons?
The maximum of two numbers max(x,y) could have sharp corners / steep edges which sometimes is an unwanted property (e.g. if you want to compute gradients).
To soften the edges of max(x,y), one can use a variant with softer edges: the softmax function. It's still a max function at its core (well, to be precise it's an approximation of it) but smoothed out.
If it's still unclear, here's a good read.
Let's say you have a set of scalars xi and you want to calculate a weighted sum of them, giving a weight wi to each xi such that the weights sum up to 1 (like a discrete probability). One way to do it is to set wi=exp(a*xi) for some positive constant a, and then normalize the weights to one. If a=0 you get just a regular sample average. On the other hand, for a very large value of a you get max operator, that is the weighted sum will be just the largest xi. Therefore, varying the value of a gives you a "soft", or a continues way to go from regular averaging to selecting the max. The functional form of this weighted average should look familiar to you if you already know what a SoftMax regression is.

GPUImage Taking sum of columns of image

Im using GPUImage in my project and I need an efficient way of taking the column sums. Naive way would obviously be retrieving the raw data and adding values of every column. Can anybody suggest a faster way for that?
One way to do this would be to use the approach I take with the GPUImageAverageColor class (as described in this answer), only instead of reducing the total size of each frame at each step, only do this for one dimension of the image.
The average color filter determines the average color of the overall image by stepping down in a factor of four in both X and Y, averaging 16 pixels into one at each step. If operating in a single direction, you should be able to use hardware interpolation to get an 18X reduction in a single direction per step with good performance. Your final step might either require a quick CPU-based iteration on the much smaller image or a tweaked version of this shader that pulls the last few pixels in a column together into the final result pixel for that column.
You notice that I've been talking about averaging here, because the output values for any OpenGL ES operation will need to be in terms of colors, which only have a 0-255 range per channel. A sum will easily overflow this, but you could use an average as an approximation of your sum, with a more limited dynamic range.
If you only care about one color channel, you could possibly encode a larger value into the RGBA channels and maintain a 32-bit sum that way.
Beyond what I describe above, you could look at performing this sum with the help of the Accelerate framework. While probably not quite as fast as doing a shader-based reduction, it might be good enough for your needs.

Resources