The necessity of score normalization in detection tasks

In a detection task such as speaker recognition, a score threshold is set to decide whether the unknown input comes from a TARGET speaker or an IMPOSTOR speaker.
Consider a two-speaker recognition system (it tells whether the input voice is from speaker A, speaker B, or neither). If we want to set a general (global, cross-speaker) threshold, score normalization (such as impostor-centric z-normalization) is necessary because A's and B's score distributions differ. Score normalization eliminates that difference, which makes it possible to set one global threshold that suits both A and B.
BUT if a set of speaker-specific (speaker-dependent) thresholds is acceptable, so that each threshold can be chosen from that speaker's own score distribution, is it still necessary to perform score normalization to eliminate the difference?
OR is there some misunderstanding in my concept?

Related

How to calculate reflection with GNUradio?

I am performing an experiment that involves a transmitter, a material target, and two receivers (as a baseline). The goal is to record the RF reflectivity of the target. How can I calculate or measure this from the received signal, and can it be done in GNU Radio Companion?
Any help is appreciated.
Thank You.

You can do that, in many ways. In the end, chances are you'll send some predefined signal, e.g., precomputed white pseudorandom noise from a "vector source", record that (e.g. using a "file sink" or a "vector sink") and build a correlation estimator that processes that data offline.
Of course, a correlation is just a convolution with the (conjugated) time-reversed signal, so you can also time-reverse your reference signal (conjugating it if it is complex) and use it as filter taps.
Note that in general, SDR devices are nice and linear, but not calibrated – you can only compare received signal powers, but you cannot attribute an absolute power to them – unless you know the strength of some reference reception.
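
As a rough offline illustration of that correlation estimator, here is a minimal NumPy sketch. Everything in it is made up for the example: the recording is simulated as a delayed, scaled copy of the reference plus noise, and the delay of 37 samples and gain of 0.3 are arbitrary. With real data you would load the samples written by your file sink instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Reference: complex white pseudorandom noise, as it might come from a vector source.
ref = (rng.standard_normal(1024) + 1j * rng.standard_normal(1024)) / np.sqrt(2)

# Hypothetical recording: the reference delayed by 37 samples, scaled by 0.3, plus noise.
true_delay, true_gain = 37, 0.3
rx = np.zeros(4096, dtype=complex)
rx[true_delay:true_delay + ref.size] = true_gain * ref
rx += 0.05 * (rng.standard_normal(rx.size) + 1j * rng.standard_normal(rx.size))

# Correlation as filtering: convolve with the conjugated, time-reversed reference.
taps = np.conj(ref[::-1])
corr = np.convolve(rx, taps, mode="full")

# In "full" mode, lag 0 sits at index len(ref) - 1.
lags = np.arange(corr.size) - (ref.size - 1)
peak = np.argmax(np.abs(corr))
est_delay = lags[peak]

# Amplitude relative to the reference; not an absolute, calibrated power.
est_gain = np.abs(corr[peak]) / np.sum(np.abs(ref) ** 2)

print(est_delay, est_gain)
```

Because the devices are uncalibrated, `est_gain` only makes sense as a ratio, for example the value measured with the target in place divided by the value from the baseline receiver.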

Is reinforcement learning suitable for predicting bias in dice?

I would like to analyze a problem similar to the following.
Problem:
You will be given N dice.
You will be given a lot of data about each die (e.g. surface information, material information, location of the center of gravity, etc.).
The features of the dice are randomly generated every game, and the dice are thrown at the same speed, angle, and initial position.
After rolling a die, you get 1 point if it shows a 6 and 0 points otherwise.
There is training data from 100000 games (dice data and match results).
I would like to learn a rule for selecting only dice whose probability of rolling a 6 is higher than 1/6.
I apologize for the vague problem statement.
First of all, it was my mistake to assume "N dice".
The dice may be considered one at a time.
One die with random characteristics is handed out.
When it is rolled, it is recorded whether a 6 came up or not.
It would have been easier to understand if I had stated the problem as "there are 100,000 [characteristics, result] records".
If you get something other than 6, you get -1 points.
If you get 6, you get +5 points.
Example:
X: feature vector of a die
f: function I want to know
f: X -> [0, 1]
(If result > 0.5, I pick this die.)
For example, a die with a 1/5 chance of rolling a 6 still shows a non-6 four times out of five, so I wondered whether it is a good idea to give an immediate reward.
Or is it better to decide the reward by the number of points after 100000 games?
I have read about some general reinforcement learning methods, and they rely on the concept of state transitions. However, there is no state transition in this game. (Each game ends in 1 step, and each game is independent.)
I am a student just learning neural networks from scratch. It would help if you gave me a hint. Thank you.
By the way,
I think the result of this learning could be summarized as "It is good to choose the die whose face farthest from the center of gravity is the 6."

Let's first talk about Reinforcement-Learning.
Problem setups, in order of increasing generality:
Multi-Armed Bandit - no state, just actions with unknown rewards
Contextual Bandit - rewards also depend on some context (state)
Reinforcement Learning (MDP) - actions can also influence the next state
Common to all three is that you want to maximize the sum of rewards over time, and that there is an exploration vs. exploitation trade-off. You are not just given a large dataset. If you want to know what the best action is, you have to try it a few times and observe the reward. This may cost you some reward you could have earned otherwise.
Of those three, the Contextual Bandit is the closest match to your setup, although it doesn't quite match your goals. It would go like this: Given some properties of the dice (the context), select the best die from a group of possible choices (the action, e.g. the output of your network), such that you get the highest expected reward. At the same time you are also training your network, so you sometimes have to pick dice with bad or unknown properties to explore them.
However, there are two reasons why it doesn't match:
You already have data from 100000 games, and you do not seem interested in minimizing the cost of trial and error to acquire more data. You assume this data is representative, so no exploration is required.
You are only interested in prediction. You want to classify the dice into "good for rolling 6" vs. "bad". This piece of information could be used later to decide between different choices if you know the cost of making a wrong decision. If you are just learning f() because you are curious about the properties of a die, then it is a pure statistical prediction problem. You don't have to worry about short- or long-term rewards. You don't have to worry about selection or the consequences of any actions.
Because of this, you actually only have a supervised learning problem. You could still solve it with reinforcement learning because RL is more general. But your RL algorithm would waste a lot of time figuring out that it really cannot influence the next state.
Supervised Learning
Your die actually behaves like a biased coin: each roll is a Bernoulli trial with ~1/6 success probability. This is now a standard classification problem: given your features, predict the probability that a die will lead to a good match result.
It seems that your "match results" can easily be converted into the number of rolls and the number of positive outcomes (rolled a six) for the same die. If you have a large number of rolls for every die, you can simply classify each die and use this class (together with the physical properties) as one data point to train your network.
You can do fancier things if you have fewer rolls, but I won't go into that. (If you are interested, have a look at the beta distribution and how the cross-entropy loss works with neural networks.)
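
To make the supervised view concrete, here is a minimal sketch that fits the probability of a six with the cross-entropy loss. It makes several assumptions that are not from the question: the data is synthetic, there are only three made-up features, and a plain logistic regression stands in for the neural network. The 1/6 selection threshold is the break-even point implied by the +5 / -1 scoring.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic stand-in for the 100000 recorded games (all values are made up).
n_games, n_features = 100_000, 3
X = rng.standard_normal((n_games, n_features))
true_w = np.array([0.8, -0.5, 0.0])
true_b = np.log((1 / 6) / (5 / 6))  # an average die rolls a six 1/6 of the time
y = (rng.random(n_games) < sigmoid(X @ true_w + true_b)).astype(float)

# Logistic regression trained by gradient descent on the mean cross-entropy loss.
w = np.zeros(n_features)
b = 0.0
learning_rate = 0.5
for _ in range(500):
    p = sigmoid(X @ w + b)
    w -= learning_rate * (X.T @ (p - y)) / n_games
    b -= learning_rate * np.mean(p - y)

def pick(die_features):
    """Select a die when its predicted chance of a six exceeds the break-even 1/6."""
    return sigmoid(die_features @ w + b) > 1 / 6

print(w, b)
print(pick(np.array([2.0, 0.0, 0.0])), pick(np.array([-2.0, 0.0, 0.0])))
```

Each game contributes one (features, outcome) pair, so no per-die roll counts are needed; the loss averages over the noise in the single rolls.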

Using MFCC coefficients for simple voice activity detection

Since MFCC coefficients store information about amplitudes for bands of frequencies (depending on the filter bank used), how can those coefficients be used for voice activity detection?
Would it be sufficient to use these coefficients to perform further energy calculation and make decisions with them?

Question 1) Since MFCC coefficients store information about amplitudes for bands of frequencies (depending on the filter bank used), how can those coefficients be used for voice activity detection?
Answer: MFCC coefficients store information about amplitudes for bands of frequencies, and different people produce different frequencies when uttering the same sentence. The bands of the spoken words are compared with the band frequencies stored in a database, and the person is identified.
Question 2) Would it be sufficient to use these coefficients to perform further energy calculation and make decisions with them?
Answer: Yes, it would be sufficient to perform further energy calculation, but if you add LPC and other features alongside MFCC, you will get better decisions.

How can I find process noise and measurement noise in a Kalman filter if I have a set of RSSI readings?

I have RSSI readings but no idea how to find the measurement noise and process noise. How can I find those values?

Not at all. RSSI stands for "Received Signal Strength Indicator" and says absolutely nothing about the signal-to-noise ratio related to your Kalman filter. RSSI is not a "well-defined" thing; it can mean a million things:
Defining the "strength" of a signal is a tricky thing. Imagine you're sitting in a car with an FM radio. What do the RSSI bars on that radio's display mean? Maybe:
The amount of Energy passing through the antenna port (including noise, because at this point no one knows what noise and signal are)?
The amount of Energy passing through the selected bandpass for the whole ultra shortwave band (roughly 76-108 MHz, depending on region) (incl. noise)?
Energy coming out of the preamplifier (incl. Noise and noise generated by the amplifier)?
Energy passing through the IF filter, which selects your individual station (is that already the signal strength as you want to define it?)?
RMS of the voltage observed by the ADC (the ADC probably samples much higher than your channel bandwidth) (is that the signal strength as you want to define it?)?
RMS of the digital values after a digital channel selection filter (i.t.t.s.s.a.y.w.t.d.i?)?
RMS of the digital values after FM demodulation (i.t.t.s.s.a.y.w.t.d.i?)?
RMS of the digital values after FM demodulation and audio frequency filtering for a mono mix (i.t.t.s.s.a.y.w.t.d.i?)?
RMS of digital values in a stereo audio signal (i.t.t.s.s.a.y.w.t.d.i?) ?
...
As you can imagine, for systems like FM radios, this is still relatively easy. For things like mobile phones, multichannel GPS receivers, WiFi cards, digital beamforming radars etc., RSSI really can mean everything or nothing at all.
You will have to mathematically define a way to describe what your noise is. And then you will need to find the formula that describes your exact implementation of what "RSSI" is, and then you can deduce whether knowing RSSI says anything about process noise.

A Kalman Filter is a mathematical construct for computing the expected state of a system that is changing over time, given an initial state and noisy measurements of that system. The key to the "process noise" component of this is the fact that the system is changing. The way that the system changes is the process.
Your state might change due to manual control or due to the nature of the system. For example, if you have a car on a hill, it can roll down the hill naturally (described by the state transition matrix), or you might drive it down the hill manually (described by the control input matrix). Any noise that might affect these inputs - wind, bumps, twitches - can be described with the process noise.
You can measure the process noise the way you would measure variance in any system - take the expected dynamics and compare them with the true dynamics to generate a covariance matrix.
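
As a minimal sketch of that idea for a single RSSI channel, the code below estimates the measurement noise variance R from readings logged while nothing moves and then runs a scalar Kalman filter. It rests on assumptions that may not hold for your hardware: that the RSSI value behaves like a noisy reading of a level that stays constant while stationary, that a random-walk model (F = H = 1) is adequate, and that the process noise Q is a tuning value you pick. All numbers are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical calibration data: RSSI (dBm) logged while nothing moves.
stationary = -60.0 + 2.0 * rng.standard_normal(5000)
R = np.var(stationary, ddof=1)  # measurement noise variance, about 4 dB^2 here

# Q is an assumption about how fast the true level drifts per step.
Q = 0.01

def kalman_1d(readings, R, Q, x0, P0=1.0):
    """Scalar Kalman filter with a random-walk state model (F = H = 1)."""
    x, P = x0, P0
    estimates = []
    for z in readings:
        P = P + Q            # predict: state unchanged, uncertainty grows
        K = P / (P + R)      # Kalman gain
        x = x + K * (z - x)  # update with the new reading
        P = (1.0 - K) * P
        estimates.append(x)
    return np.array(estimates)

readings = -60.0 + 2.0 * rng.standard_normal(500)
smoothed = kalman_1d(readings, R, Q, x0=readings[0])
print(R, smoothed[-1])
```

A larger Q makes the filter follow changes faster at the cost of more noise in the estimate. If you can record the true dynamics, comparing them with the model's predictions gives a data-driven Q instead of a guess.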

How to test the quality of a probabilities estimator?

I created a heuristic (an ANN, but that's not important) to estimate the probabilities of an event (the results of sports games, but that's not important either). Given some inputs, this heuristic tells me the probabilities of the event. Something like: given these inputs, team B has a 65% chance to win.
I have a large set of input data for which I know the result (games previously played). Which formula or metric could I use to quantify the accuracy of my estimator?
The problem I see is this: if the estimator says the event has a probability of 20% and the event actually does occur, I have no way to tell whether my estimator is right or wrong. Maybe it's wrong and the event was more likely than that. Maybe it's right: the event had about a 20% chance to occur and did occur. Maybe it's wrong and the event had a really low chance to occur, say 1 in 1000, but happened to occur this time.
Fortunately I have lots of this actual test data, so there is probably a way to use it to evaluate my heuristic.
Anybody got an idea?

There are a number of measurements that you could use to quantify the performance of a binary classifier.
Do you care whether your estimator (an ANN, for example) outputs a calibrated probability?
If not, i.e. all that matters is rank ordering, the area under the ROC curve (AUROC) is a pretty good summary of the performance of the classifier. Others are the "KS" statistic and lift. There are many in use, and they emphasize different facets of performance.
If you care about calibrated probabilities, then the most common metrics are the "cross entropy" (also known as Bernoulli probability/maximum likelihood, the typical measure used in logistic regression) and the "Brier score". The Brier score is none other than the mean squared error comparing continuous predicted probabilities to binary actual outcomes.
Which is the right thing to use depends on the ultimate application of the classifier. For example, your classifier may estimate probability of blowouts really well, but be substandard on close outcomes.
Usually, the true metric that you're trying to optimize is "dollars made". That's often hard to represent mathematically, but starting from that is your best shot at coming up with an appropriate and computationally tractable metric.
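
For reference, the two calibration metrics named above (Brier score and cross entropy) can be sketched in a few lines of NumPy. The function names and the four example predictions are made up for illustration. Lower is better for both.

```python
import numpy as np

def brier_score(p, y):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    p, y = np.asarray(p, dtype=float), np.asarray(y, dtype=float)
    return np.mean((p - y) ** 2)

def cross_entropy(p, y, eps=1e-12):
    """Mean negative Bernoulli log-likelihood; eps guards against log(0)."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    y = np.asarray(y, dtype=float)
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

# Made-up predictions and outcomes, for illustration only.
predicted = [0.65, 0.20, 0.90, 0.50]
actual = [1, 0, 1, 0]
print(brier_score(predicted, actual))
print(cross_entropy(predicted, actual))
```

A useful sanity check is to compare both numbers against a trivial baseline that always predicts the overall win frequency; an estimator that cannot beat that baseline adds no information.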

In a way it depends on the decision function you are using.
In the case of a binary classification task (predicting whether an event occurred or not [ex: win]), a simple implementation is to predict 1 if the probability is greater than 50%, 0 otherwise.
If you have a multiclass problem (predicting which one of K events occurred [ex: win/draw/lose]), you can predict the class with the highest probability.
And the way to evaluate your heuristic is to compute the prediction error by comparing the actual class of each input with the prediction of your heuristic for that instance.
Note that you would usually divide your data into train/test parts to get better (unbiased) estimates of the performance.
Other tools for evaluation exist, such as ROC curves, which are a way to depict the performance with regard to true/false positives.

As you stated, if you predict that an event has a 20% chance of happening, and an 80% chance of not happening, observing a single isolated event would not tell you how good or poor your estimator was. However, if you had a large sample of events for which you predicted 20% success, but observed that 30% of that sample succeeded, you could begin to suspect that your estimator is off.
One approach would be to group your events by predicted probability of occurrence, and observe the actual frequency by group, and measure the difference. For instance, depending on how much data you have, group all events where you predict 20% to 25% occurrence, and compute the actual frequency of occurrence by group - and measure the difference for each group. This should give you a good idea of whether your estimator is biased, and possibly for which ranges it's off.
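
The grouping approach can be sketched as follows. The function name `calibration_table`, the choice of ten equal-width bins, and the synthetic data are assumptions for the example; here the outcomes are drawn from the predicted probabilities themselves, so the simulated estimator is well calibrated by construction.

```python
import numpy as np

def calibration_table(p, y, n_bins=10):
    """Group predictions into equal-width bins and compare with observed frequency."""
    p, y = np.asarray(p, dtype=float), np.asarray(y, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Bin index per prediction; p == 1.0 is folded into the last bin.
    idx = np.minimum(np.digitize(p, edges) - 1, n_bins - 1)
    rows = []
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            rows.append((edges[b], edges[b + 1], int(mask.sum()),
                         p[mask].mean(), y[mask].mean()))
    return rows

# Synthetic data: outcomes are sampled with exactly the predicted probability.
rng = np.random.default_rng(0)
p = rng.random(50_000)
y = (rng.random(p.size) < p).astype(float)

for lo, hi, n, mean_pred, freq in calibration_table(p, y):
    print(f"{lo:.1f}-{hi:.1f}  n={n:5d}  predicted={mean_pred:.3f}  observed={freq:.3f}")
```

With real predictions, bins where `observed` differs from `predicted` by much more than the sampling noise for that bin size point to the probability ranges in which the estimator is biased.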
