calculating lot size to recover lost pips using a fixed TP value? - mql4

If I have, let's say, a 100 pip total loss across multiple trades, and I want to recover that 100 pip loss by opening a new trade with a 30 pip TakeProfit, how do I calculate the lot size for this new trade, so that when the 30 pip TP is hit it recovers the loss that was incurred (the 100 pip loss)?

If X lots lost 100 pips, first determine how many lots make 30 pips equivalent to 100 pips.
Number of lots = X * (100/30), i.e. roughly 3.33X lots.
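A minimal sketch of that arithmetic (plain Python rather than MQL4; the function name is mine, and it assumes the recovery trade is on the same symbol, so the pip value per lot matches the losing trades):

def recovery_lots(loss_pips: float, tp_pips: float, lots_lost: float) -> float:
    """Lot size whose tp_pips gain equals the loss_pips lost on lots_lost lots.

    Assumes the same symbol/pip value for both the losses and the recovery trade.
    """
    return lots_lost * (loss_pips / tp_pips)

# e.g. 0.10 lots lost 100 pips; a 30-pip TP needs ~0.33 lots (3.33x the original size)
print(recovery_lots(100, 30, 0.10))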

Related

Understanding histogram_quantile based on rate in Prometheus

According to the Prometheus documentation, in order to get the 95th percentile from a histogram metric I can use the following query:
histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))
Source: https://prometheus.io/docs/practices/histograms/#quantiles
Since each bucket of a histogram is a counter, we can calculate the rate of each bucket as:
per-second average rate of increase of the time series in the range vector.
See: https://prometheus.io/docs/prometheus/latest/querying/functions/#rate
So, for instance, if bucket value[t-5m] = 100 and bucket value[t] = 200 then bucket rate[t] = (200-100)/(5*60) ≈ 0.33
And finally, the most confusing part is how can histogram_quantile function find 95th percentile for given metric knowing all the bucket rates?
Is there any code or algorithm I can take a look to better understand it?
A solid example will explain histogram_quantile well.
Assumptions:
ONLY ONE series for simplicity
10 buckets for metric http_request_duration_seconds.
10ms, 50ms, 100ms, 200ms, 300ms, 500ms, 1s, 2s, 3s, 5s
http_request_duration_seconds is a metric of type COUNTER
time  | value | delta | rate (quantity of items)
------|-------|-------|-------------------------
t-10m | 50    | N/A   | N/A
t-5m  | 100   | 50    | 50 / (5*60)
t     | 200   | 100   | 100 / (5*60)
...   | ...   | ...   | ...
We have at least two scrapes of the series covering 5 minutes, so rate() can calculate the quantity for each bucket.
rate_xxx(t) = (value_xxx[t] - value_xxx[t-5m]) / (5*60) stands for the quantity of items in [t-5m, t].
We are looking at 2 samples (value(t) and value(t-5m)) here.
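As a rough sketch (Python, with made-up scrape values), the per-bucket quantity over the window is just the counter delta between the two samples; rate() divides it by the window, and multiplying back by the window recovers the count:

window_seconds = 5 * 60                 # the [5m] range
value_t_minus_5m, value_t = 100, 200    # two hypothetical scrapes of one le bucket

rate_per_second = (value_t - value_t_minus_5m) / window_seconds   # what rate() returns
items_in_window = rate_per_second * window_seconds                # back to a count: 100
print(rate_per_second, items_in_window)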
Suppose 10000 http request durations (items) were recorded in that window, that is,
10000 = rate_10ms(t) + rate_50ms(t) + rate_100ms(t) + ... + rate_+Inf(t).
bucket (le) | range      | rate_xxx(t)
------------|------------|------------
10ms        | ~10ms      | 3000
50ms        | 10~50ms    | 3000
100ms       | 50~100ms   | 1500
200ms       | 100~200ms  | 1000
300ms       | 200~300ms  | 800
500ms       | 300~500ms  | 400
1s          | 500ms~1s   | 200
2s          | 1~2s       | 40
3s          | 2~3s       | 30
5s          | 3~5s       | 5
+Inf        | 5s~        | 5
Buckets are the essence of a histogram. We just need these 11 numbers in rate_xxx(t) (the 10 buckets plus +Inf) to do the quantile calculation.
Let's take a close look at this expression (aggregation like sum() is omitted for simplicity)
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
We are actually looking for the 95th-percentile item in rate_xxx(t), counting from bucket=10ms up to bucket=+Inf. Here 95% means the 9500th item, since we have 10000 items in total (10000 * 0.95).
From the table above, there are 9300 = 3000+3000+1500+1000+800 items in total before bucket=500ms.
So the 9500th item is the 200th item (9500-9300) in bucket=500ms (range=300~500ms), which holds 400 items.
Prometheus assumes that items in a bucket are spread evenly, in a linear pattern.
The metric value for the 200th item in bucket=500ms is therefore 300 + (500-300)*(200/400) = 400ms.
That is, the 95th percentile is 400ms.
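Here is a minimal sketch of that interpolation (Python, not the actual Prometheus source), using the per-bucket quantities from the table above; it walks the cumulative counts until it reaches the target rank and interpolates linearly inside that bucket:

# Per-bucket quantities from the example table: (upper bound in ms, items in that bucket).
buckets = [(10, 3000), (50, 3000), (100, 1500), (200, 1000), (300, 800),
           (500, 400), (1000, 200), (2000, 40), (3000, 30), (5000, 5),
           (float("inf"), 5)]

total = 10_000                  # total items recorded in the example
rank = 0.95 * total             # the 9500th item

cumulative = 0                  # items in all buckets below the current one
lower = 0                       # lower edge (ms) of the current bucket
for upper, count in buckets:
    if cumulative + count >= rank:
        # assume items are spread evenly (linearly) within the bucket
        estimate = lower + (upper - lower) * (rank - cumulative) / count
        break
    cumulative += count
    lower = upper

print(estimate)                 # 400.0 ms, matching the walkthrough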
There are a few things to bear in mind:
The metric should be COUNTER in nature for the histogram metric type
Series used for quantile calculation should always have the label le defined
Items (data) in a specific bucket are assumed to be spread evenly in a linear pattern (e.g. within 300~500ms)
Prometheus makes this assumption, at least
Quantile calculation requires the buckets to be sorted (defined) in some ascending/descending order (e.g.: 1ms < 5ms < 10ms < ...)
The result of histogram_quantile is an approximation
P.S.:
The metric value is not always accurate, due to the assumption that items (data) in a specific bucket are spread evenly in a linear pattern.
Say the maximum duration in reality (e.g. taken from the nginx access log) in bucket=500ms (range=300~500ms) is 310ms; we would still get 400ms from histogram_quantile with the setup above, which can be quite confusing.
The smaller the bucket width is, the more accurate the approximation is.
So set up bucket boundaries that fit your needs.
You can refer to my reply here
Actually the rate() function is just used to specify the time window; the denominator has no effect on the computed percentile value.
I believe this is the code for it in Prometheus.
The general idea is that you use the data in the buckets to extrapolate / approximate the quantiles
Elasticsearch also does something similar (yet different/much simpler) in their rollup capabilities
You have to use rate() because counters can be reset; rate() automatically accounts for resets and gives you the right count for each second. Just remember to always use rate() when working with counters.

Precision and Recall computation for different group sizes

I didn't find an answer for this question anywhere, so I hope someone here could help me and also other people with the same problem.
Suppose that I have 1000 Positive samples and 1500 Negative samples.
Now, suppose that there are 950 True Positives (positive samples correctly classified as positive) and 100 False Positives (negative samples incorrectly classified as positive).
Should I use these raw numbers to compute the Precision, or should I consider the different group sizes?
In other words, should my precision be:
TruePositive / (TruePositive + FalsePositive) = 950 / (950 + 100) = 90.476%
OR should it be:
(TruePositive / 1000) / [(TruePositive / 1000) + (FalsePositive / 1500)] = 0.95 / (0.95 + 0.067) = 93.44%
In the first calculation, I took the raw numbers without any consideration of the number of samples in each group, while in the second calculation I used the proportion of each measure relative to its corresponding group, to remove the bias caused by the groups' different sizes.
Answering the stated question: by definition, precision is computed by the first formula: TP/(TP+FP).
However, it doesn't mean that you have to use this formula, i.e. precision measure. There are many other measures, look at the table on this wiki page and choose the one most suited for your task.
For example, positive likelihood ratio seems to be the most similar to your second formula.
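As a quick illustration (Python, using the numbers from the question; the variable names are mine), the first expression is plain precision, while the second equals TPR/(TPR+FPR) and is built from the same two rates as the positive likelihood ratio TPR/FPR:

tp, fp = 950, 100                       # true/false positives from the question
positives, negatives = 1000, 1500       # group sizes

precision = tp / (tp + fp)              # ~0.9048, the first formula
tpr = tp / positives                    # true positive rate  = 0.95
fpr = fp / negatives                    # false positive rate ~ 0.0667
group_normalized = tpr / (tpr + fpr)    # ~0.9344, the second formula
positive_lr = tpr / fpr                 # positive likelihood ratio ~ 14.25

print(precision, group_normalized, positive_lr)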

Time series normalization, how to handle zeros

I'm working on a player churn prediction model for a game. I have a rounds-played-per-day time series covering 60 days. Before I feed the time series to classification algorithms, I need to normalize it.
I was thinking about min-max normalization, transforming x to x/max(x), but max(x) over the 60-day window doesn't necessarily capture the peak of how many rounds a player usually plays in a day.
Z-normalization, transforming x to (x-mean(x))/std(x), will not work either, since I need to preserve the information that days with no play are zero; z-normalization maps 0 to different values for different players, which makes them incomparable.
Is there a normalization scheme which requires no information about the maximum of the time series and can map 0 still to 0?
You can transform your values into probabilities by dividing each value in the array by the sum of the values in the array (normalization factor "sum to unity"), i.e. transform x to x./sum(x).
That maps 0 values to 0 and requires no information about the maximum value.
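A minimal sketch of that transform (Python/NumPy rather than the MATLAB-style x./sum(x); the guard for an all-zero series is my own addition):

import numpy as np

def sum_to_unity(x):
    # Scale a non-negative series so it sums to 1; zeros stay exactly zero.
    x = np.asarray(x, dtype=float)
    total = x.sum()
    return x / total if total > 0 else x   # assumption: leave an all-zero series as-is

rounds_played = np.array([0, 3, 5, 0, 2])  # hypothetical slice of the 60-day series
print(sum_to_unity(rounds_played))          # [0.  0.3 0.5 0.  0.2]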

Is FFT necessary to find peaks and pits in audio files

I'm able to read a wav file and its values. I need to find the positions and values of peaks and pits. At first I tried smoothing with the (x[i-1] + x[i] + x[i+1]) / 3 formula and then searching the array with a condition like array[i-1] > array[i] && direction == 'up' --> pit, but because of noise and other requirements of later calculations in the project, I'm trying to find a better approach. For a couple of days I've been researching the FFT. As I understand it, the FFT translates the audio into a series of sines and cosines; after the FFT the given values are the coefficients a_k and b_k in a0 + sum over k of (a_k * cos(k*x) + b_k * sin(k*x)), as in this picture
http://zone.ni.com/images/reference/en-XX/help/371361E-01/loc_eps_sigadd3freqcomp.gif
My question is: does the FFT help me find peaks and pits in audio? Does anybody have experience with this kind of problem?
It depends on exactly what you are trying to do, which you haven't really made clear. "Finding the peaks and pits" is one thing, but since there might be various reasons for doing this there might be various methods. It sounds like you already tried the straightforward thing of actually looking for the local maxima and minima. Here are some tips:
you do not need the FFT.
audio data usually swings above and below zero (there are exceptions, including 8-bit wavs, which are unsigned, but these are exceptions), so you must be aware of positive and negative values. Generally, large positive and large negative values carry large amounts of energy, though, so you want to count those as the same.
due to #2, if you want to average, you might want to take the average of the absolute value, or more commonly, the average of the square. Once you find the average of the squares, take the square root of that value and this gives the RMS, which is related to the power of the signal, so you might do something like this if you are trying to indicate signal loudness or intensity, or to approximate an analog meter (see the sketch after these tips). The average of absolute values may be more robust against extreme values, but is less commonly used.
another approach is to simply look for the peak of the absolute value over some number of samples, this is commonly done when drawing waveforms, and for digital "peak" meters. It makes less sense to look at the minimum absolute.
Once you've done something like the above, yes you may want to compute the log of the value you've found in order to display the signal in dB, but make sure you use the right formula. 10 * log_10( amplitude ) is not it. Rule of thumb: usually when computing logs from amplitude you will see a 20, not a 10. If you want to compute dBFS (the amount of "headroom" before clipping, which is the standard measurement for digital meters), the formula is -20 * log_10( |amplitude| ), where amplitude is normalized to +/- 1. Watch out for amplitude = 0, which gives infinite headroom in dB.
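A minimal sketch of the RMS, peak, and dB measurements described in these tips (Python/NumPy; the test tone and function names are mine):

import numpy as np

def rms(samples):
    # Root-mean-square: average of the squares, then the square root.
    return np.sqrt(np.mean(np.square(samples)))

def peak(samples):
    # Peak of the absolute value, as used for waveform drawing / peak meters.
    return np.max(np.abs(samples))

# Hypothetical test signal: 1 second of a 440 Hz tone at half of full scale, 44.1 kHz.
t = np.linspace(0, 1, 44100, endpoint=False)
signal = 0.5 * np.sin(2 * np.pi * 440 * t)

p = peak(signal)
print(rms(signal))            # ~0.354
print(20 * np.log10(p))       # ~-6.02 dB relative to full scale (note the 20, not 10)
print(-20 * np.log10(p))      # ~6.02 dB of headroom before clipping, per the tip above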
If I understand you correctly, you just want to estimate the relative loudness/quietness of an audio digital sample at a given point.
For this estimation, you don't need to use the FFT. However, your method of averaging the signal does not produce the appropriate picture either.
The digital signal is the value of the audio wave at a given moment. You need to find the overall amplitude of the signal at that given moment. You can somewhat see it as the local maximum value for a given interval around the moment you want to calculate. You may have a moving max for the signal and get your amplitude estimation.
With a 16-bit sound sample, the signal value can go up to 32767. At a 44.1 kHz sample rate, you can find peaks and pits of around 0.01 s by finding the max value of the 441 samples around a given moment t.
max = 1;  // start at 1, not 0, so the log10 below never sees 0
for (i = 0; i < 441; i++)            // 441 samples ≈ 0.01 s at 44.1 kHz
    if (array[t*44100 + i] > max) max = array[t*44100 + i];
then, to represent it on a 0 to 1 scale (not really 0, because we used a minimum of 1):
amplitude = max / 32767.0;  // floating-point division
or you might represent it in relative dB logarithmic scale (here you see why we used 1 for the minimum value)
dB = 20 * log10(amplitude);
all you need to do is take dy/dx, which you can approximate by just scanning through the wave, subtracting the previous value from the current one, and looking at where it goes to zero or changes from positive to negative
in this code I kept it really brief for the sake of brevity; of course you could handle the case of dy being zero better, find the 'centre' of a long flat peak, that kind of thing. But if all you need is basic peaks and troughs, this will find them.
lastY = 0;
bool goingup = true;
for (i = 0; i < wave.length; i++) {
    y = wave[i];
    dy = y - lastY;
    bool stillgoingup = (dy > 0);
    if (goingup != stillgoingup) {
        // changed direction - note the value of i (place) and y (height):
        // up -> down is a peak, down -> up is a trough
        goingup = stillgoingup;
    }
    lastY = y;
}

Correcting a known bias in collected data

Ok, so here is a problem analogous to my problem (I'll elaborate on the real problem below, but I think this analogy will be easier to understand).
I have a strange two-sided coin that only comes up heads (randomly) 1 in every 1,001 tosses (the remainder being tails). In other words, for every 1,000 tails I see, there will be 1 heads.
I have a peculiar disease where I only notice 1 in every 1,000 tails I see, but I notice every heads, so the apparent rate of heads among the tosses I notice is 0.5. Of course, I'm aware of this disease and its effect, so I can compensate for it.
Someone now gives me a new coin, and I notice that the apparent rate of heads is now 0.6. Given that my disease hasn't changed (I still only notice 1 in every 1,000 tails), how do I calculate the actual ratio of heads to tails that this new coin produces?
Ok, so what is the real problem? Well, I have a bunch of data consisting of input, and outputs which are 1s and 0s. I want to teach a supervised machine learning algorithm to predict the expected output (a float between 0 and 1) given an input. The problem is that the 1s are very rare, and this screws up the internal math because it becomes very susceptible to rounding errors - even with high-precision floating point math.
So, I normalize the data by randomly omitting most of the 0 training samples so that it appears that there is a roughly equal ratio of 1s and 0s. Of course, this means that the machine learning algorithm's output is no longer predicting a probability, i.e. instead of predicting 0.001 as it should, it would now predict 0.5.
I need a way to convert the output of the machine learning algorithm back to a probability within the original training set.
Author's Note (2015-10-07): I later discovered that this technique is commonly known as "downsampling"
You are calculating the following
calculatedRatio = heads / (heads + tails / 1000)
and you need
realRatio = heads / (heads + tails)
Solving both equations for tails (with heads taken as 1, i.e. tails counted per observed head) yields the following equations.
tails = 1000 / calculatedRatio - 1000
tails = 1 / realRatio - 1
Combining both yields the following.
1000 / calculatedRatio - 1000 = 1 / realRatio - 1
And finally solving for realRatio.
realRatio = 1 / (1000 / calculatedRatio - 999)
Seems to be correct. calculatedRatio 0.5 yields realRatio 1/1001, 0.6 yields 3 / 2003.
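A minimal sketch of that correction (Python; the function name and keep_ratio parameter are mine, with 1/1000 matching the coin example's downsampling rate):

def true_probability(predicted, keep_ratio=1/1000):
    # predicted:  model output on the downsampled data (the "calculatedRatio")
    # keep_ratio: fraction of negatives (tails) kept when downsampling
    n = 1 / keep_ratio                       # 1000 in the coin example
    # realRatio = 1 / (1000 / calculatedRatio - 999)
    return 1 / (n / predicted - (n - 1))

print(true_probability(0.5))   # 1/1001 ~ 0.000999
print(true_probability(0.6))   # 3/2003 ~ 0.001498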
