The F# language has the functions log which computes the natural logarithm and log10 which computes the base 10 logarithm.
What is the best way to compute a base 2 logarithm in F#?
Use System.Math.Log(number, base)
Example:
open System
Math.Log(32., 2.)
val it : float = 5.0
You could simply use the fact that the "a-logarithm of b" = ln(b) / ln(a), that is, the 2-logarithm of x is ln(x) / ln(2).
log2(8) = ln(8) / ln(2) = 3
log2(32) = ln(32) / ln(2) = 5
...where ln can be any logarithm (natural, base 10, or any other base), as long as the same one is used in both the numerator and the denominator.
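The change-of-base identity holds for any base, which can be checked quickly in Python (used here only to illustrate the math behind the F# answer):

```python
import math

# Change-of-base identity: log2(x) = log_b(x) / log_b(2) for any base b.
x = 32.0
via_ln = math.log(x) / math.log(2)         # using the natural log
via_log10 = math.log10(x) / math.log10(2)  # using the base-10 log
direct = math.log2(x)                      # built-in base-2 log

print(via_ln, via_log10, direct)  # all three are 5.0
```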
Since .NET 5.0 there is Math.Log2.
> Math.Log2 64;;
val it: float = 6.0
I do data normalization as:
X = ( X - X.mean(axis=0) ) / X.std(axis=0)
But some features of X have zero variance, which raises a ZeroDivisionError at runtime.
I know we can normalize using the StandardScaler class from sklearn. But how can I normalize the data myself, from scratch, when std = 0?
To quote sklearn documentation for StandardScaler:
Per feature relative scaling of the data to achieve zero mean and unit variance. Generally this is calculated using np.sqrt(var_). If a variance is zero, we can’t achieve unit variance, and the data is left as-is, giving a scaling factor of 1.
Therefore, as the other answer says, you can omit the standard-deviation term and just do X - X.mean(axis=0) when the standard deviation is 0. However, this only works if every feature of X has zero standard deviation.
To make this work where you have a mix of values with some std dev and values that don't, use this instead:
std = X.std(axis=0)
std = np.where(std == 0, 1, std)
X = ( X - X.mean(axis=0) ) / std
This code checks, per feature (i.e. per column, along axis 0), whether the standard deviation is zero, and replaces each zero with 1 so the division becomes a no-op for that feature.
If the standard deviation is 0 for a particular feature, then all of its values are identical. In that case X = X - X.mean(axis=0) should suffice; it gives you zero mean and zero standard deviation.
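A minimal runnable sketch of the np.where approach above, using a toy array of my own whose second column is constant:

```python
import numpy as np

# Toy data: the second column is constant, so its std is 0.
X = np.array([[1.0, 7.0],
              [2.0, 7.0],
              [3.0, 7.0]])

std = X.std(axis=0)
std = np.where(std == 0, 1, std)  # replace zero stds with 1 to avoid division by zero
X_norm = (X - X.mean(axis=0)) / std

print(X_norm)  # the constant column becomes all zeros; the other has unit variance
```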
What do you want to achieve?
I want to make a factorial function
What is the issue?
I can't find any way to solve factorials unless they are whole numbers
What solutions have you tried so far?
I have looked on the Roblox DevForum, YouTube, Google and Discord, and I literally can't find any way to get decimals to work properly with factorials.
The only thing I know is that you can use the gamma function to solve decimal factorials, but I have no idea how I would implement that in Luau, so I am really struggling.
I even used Stirling's approximation, but that is not 100% accurate, and I need something that matches the actual answer exactly.
local function SolveFactorial(FN)
    if string.match(FN, "^-") then
        local T = 1
        FN *= -1
        for i = FN, 1, -1 do
            T = T * i
        end
        T *= -1
        return T
    else
        local T = 1
        for i = FN, 1, -1 do
            T = T * i
        end
        return T
    end
end
this is a normal factorial function that works with all integers except 0
local function SF(FN)
    if string.match(FN, "^-") then
        FN *= -1
        local N = math.sqrt(2 * math.pi * FN) * math.pow((FN / math.exp(1)), FN)
        N *= -1
        return N
    else
        local N = math.sqrt(2 * math.pi * FN) * math.pow((FN / math.exp(1)), FN)
        return N
    end
end
and this is Stirling's approximation, which as I said before isn't 100% accurate
these are the two functions I have so far, and I don't know what I should do at this point to fix it
is there a way to use the gamma function, or is there an easier way to do this than what I am doing at the moment?
note that this is Roblox Lua (Luau)!!
any help will really save a lot of time, thank you, nici
Your question title is a little misleading: it's not that decimals aren't working, it's that Stirling's approximation produces numbers that are too inaccurate for your purposes.
If you need a gamma function, there's an implementation on Rosetta Code that's pretty easy to use. I've made small adjustments and transcribed it here:
local function gammafunc(z)
    local gamma = 0.577215664901
    local coeff = -0.65587807152056
    local quad = -0.042002635033944
    local qui = 0.16653861138228
    local set = -0.042197734555571

    -- Series approximation of 1 / gamma(z), valid for small z
    local function recigamma(rz)
        return rz + gamma * rz^2 + coeff * rz^3 + quad * rz^4 + qui * rz^5 + set * rz^6
    end

    if z == 1 then
        return 1
    elseif math.abs(z) <= 0.5 then
        return 1 / recigamma(z)
    else
        -- Recurrence gamma(z) = (z - 1) * gamma(z - 1) reduces z toward [-0.5, 0.5]
        return (z - 1) * gammafunc(z - 1)
    end
end
In order to get it to properly return the correct factorial values, you need to offset z by 1.
for n = 0.0, 10, 0.2 do
    local f1 = factorial(n)
    local f2 = stirlingApprox(n)
    local f3 = gammafunc(n + 1)
    print(n, f1, f2, f3)
end
--[[ Returns...
n n! Stirling Gamma
0.0 1.0000 -0.0000 1.0000
0.2 0.6652 0.9182
0.4 0.7366 0.8872
0.6 0.7843 0.8934
0.8 0.8427 0.9314
1.0 1.0000 0.9221 1.0000
1.2 1.0293 1.1018
1.4 1.1714 1.2421
1.6 1.3579 1.4295
1.8 1.6014 1.6765
2.0 2.0000 1.9190 2.0000
2.2 2.3344 2.4240
2.4 2.8800 2.9811
2.6 3.6003 3.7167
2.8 4.5571 4.6942
3.0 6.0000 5.8362 6.0000
3.2 7.5579 7.7567
3.4 9.8914 10.1358
3.6 13.0759 13.3803
3.8 17.4518 17.8378
4.0 24.0000 23.5062 24.0000
4.2 31.9393 32.5781
4.4 43.7635 44.5977
4.6 60.4506 61.5492
4.8 84.1502 85.6217
5.0 120.0000 118.0192 120.0000
5.2 166.7162 169.4060
5.4 237.1499 240.8277
5.6 339.6157 344.6753
5.8 489.5289 496.6057
6.0 720.0000 710.0782 720.0000
6.2 1036.3071 1050.3173
6.4 1521.4128 1541.2974
6.6 2246.5097 2274.8568
6.8 3335.8193 3376.9185
7.0 5040.0000 4980.3958 5040.0000
7.2 7475.3217 7562.2846
7.4 11278.2406 11405.6005
7.6 17101.8057 17288.9114
7.8 26060.2260 26339.9645
8.0 40320.0000 39902.3955 40320.0000
8.2 61384.0725 62010.7339
8.4 94864.1090 95807.0442
8.6 147262.8822 148684.6384
8.8 229608.1738 231791.6880
9.0 362880.0000 359536.8728 362880.0000
9.2 565356.8075 570498.7521
9.4 892663.0429 900586.2158
9.6 1415149.6282 1427372.5285
9.8 2252332.9545 2271558.5427
10.0 3628800.0000 3598695.6187 3628800.0000]]
Please keep in mind that this implementation runs the same stack-overflow risk for larger inputs that the traditional recursive factorial function has, since it calculates values recursively. The Stirling approximation you're using is significantly faster, and that might be good enough in some cases.
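As a sanity check outside Luau (a Python sketch, since Luau has no built-in gamma), the standard library's math.gamma reproduces the relation n! = gamma(n + 1) that the table above approximates:

```python
import math

# n! = gamma(n + 1), including for non-integer n.
print(math.gamma(5.0 + 1))   # 5!  = 120
print(math.gamma(0.2 + 1))   # "0.2!" ~ 0.9182, matching the Gamma column above
print(math.gamma(10.0 + 1))  # 10! = 3628800
```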
I am currently converting a Python statistics library that needs to produce numbers with high decimal precision. For example, I did this:
i = 1
n = 151
sum = (i - 3/8) / (n + 1/4)
it results in 0.
My question is: how do I make this kind of computation automatically keep its decimal precision?
My desired output is:
0.004132231404958678
In Ruby, all arithmetic operations return a value of the same type as the operands (the one having the better precision).
That said, 3/8 and 1/4 are integer divisions, both resulting in 0.
To make your example work, you must ensure you are not losing precision anywhere:
i = 1.0
n = 151.0
sum = (i - 3.0/8) / (n + 1/4.0)
Please note that, as in most (if not all) languages, floating-point arithmetic is inexact:
0.1 + 0.2 #⇒ 0.30000000000000004
If you need an exact value, you might use BigDecimal or Rational.
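For reference, since the asker is porting from Python: in Python 3 the / operator is already true division, and the fractions module plays the role Ruby's Rational plays here (the snippet below is illustrative):

```python
from fractions import Fraction

i, n = 1, 151

# Python 3's / is float division, so this already avoids the truncation:
print((i - 3/8) / (n + 1/4))  # 0.004132231404958678

# Exact arithmetic, analogous to Ruby's Rational:
exact = (Fraction(i) - Fraction(3, 8)) / (Fraction(n) + Fraction(1, 4))
print(exact)         # 1/242
print(float(exact))
```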
I know XGBoost needs the first- and second-order gradients (grad and hess), but has anybody else used "mae" as the objective function?
A little bit of theory first, sorry! You asked for the grad and hessian for MAE, however, the MAE is not continuously twice differentiable so trying to calculate the first and second derivatives becomes tricky. Below we can see the "kink" at x=0 which prevents the MAE from being continuously differentiable.
Moreover, the second derivative is zero at all the points where it is well behaved. In XGBoost, the second derivative is used as a denominator in the leaf weights; when it is zero, this causes serious numerical errors.
Given these complexities, our best bet is to try to approximate the MAE using some other, nicely behaved function. Let's take a look.
We can see above that there are several functions that approximate the absolute value. Clearly, for very small values, the Squared Error (MSE) is a fairly good approximation of the MAE. However, I assume that this is not sufficient for your use case.
Huber Loss is a well-documented loss function. However, its second derivative is discontinuous, so we cannot guarantee smooth derivatives. We can approximate it using the Pseudo-Huber function, which can be implemented in Python XGBoost as follows:
import numpy as np
import xgboost as xgb

dtrain = xgb.DMatrix(x_train, label=y_train)
dtest = xgb.DMatrix(x_test, label=y_test)
param = {'max_depth': 5}
num_round = 10

def huber_approx_obj(preds, dtrain):
    d = preds - dtrain.get_label()  # for the sklearn API, subtract the raw label array instead
    h = 1  # h is delta in the graphic
    scale = 1 + (d / h) ** 2
    scale_sqrt = np.sqrt(scale)
    grad = d / scale_sqrt
    hess = 1 / scale / scale_sqrt
    return grad, hess

bst = xgb.train(param, dtrain, num_round, obj=huber_approx_obj)
Other functions can be used by replacing obj=huber_approx_obj.
Fair Loss is not well documented at all, but it seems to work rather well. The fair loss function is y = c * |x| - c^2 * log(|x|/c + 1).
It can be implemented as such,
def fair_obj(preds, dtrain):
    """Fair loss: y = c * abs(x) - c**2 * np.log(abs(x)/c + 1)"""
    x = preds - dtrain.get_label()
    c = 1
    den = abs(x) + c
    grad = c * x / den
    hess = c * c / den ** 2
    return grad, hess
This code is taken and adapted from the second place solution in the Kaggle Allstate Challenge.
Log-Cosh is another usable loss function:
def log_cosh_obj(preds, dtrain):
    x = preds - dtrain.get_label()
    grad = np.tanh(x)
    hess = 1 / np.cosh(x) ** 2
    return grad, hess
Finally, you can create your own custom loss functions using the above functions as templates.
Warning: due to API changes, newer versions of XGBoost may require custom loss functions of the form:
def custom_objective(y_true, y_pred):
    ...
    return grad, hess
For the Huber loss above, I think the gradient is missing a negative sign up front. It should be:
grad = -d / scale_sqrt
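Whether a leading minus is needed depends on how the residual is defined. A quick finite-difference check (a Python sketch of my own, using d = preds - labels and h = 1 as in the answer's snippet) shows that the derivative of the pseudo-Huber loss with respect to preds matches the positive grad; the minus would apply only if the residual were labels - preds:

```python
import math

# Pseudo-Huber loss as a function of the residual d = preds - labels:
def pseudo_huber(d, h=1.0):
    return h * h * (math.sqrt(1 + (d / h) ** 2) - 1)

# Gradient as written in the answer above (no leading minus):
def grad_from_answer(d, h=1.0):
    return d / math.sqrt(1 + (d / h) ** 2)

# Central finite difference of the loss at d = 1.0:
d, eps = 1.0, 1e-6
numeric = (pseudo_huber(d + eps) - pseudo_huber(d - eps)) / (2 * eps)
print(numeric, grad_from_answer(d))  # both ~0.7071, so the positive sign matches
```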
I am running the Huber/Fair metric from above on approximately normally distributed Y, but for some reason with alpha < 0 (and all the time for Fair) the resulting prediction equals zero...
This has become quite a frustrating question, but I've asked in the Coursera discussions and they won't help. Below is the question:
I've gotten it wrong 6 times now. How do I normalize the feature? Hints are all I'm asking for.
I'm assuming x_2^(2) is the value 5184, unless I am supposed to add the x_0 column of 1's, which the question doesn't mention but which he certainly mentions in the lectures when talking about creating the design matrix X; in that case x_2^(2) would be the value 72. Assuming one or the other is right (I'm playing a guessing game), what should I use to normalize it? He talks about three different ways to normalize in the lectures: one using the maximum value, another using the range (the difference between max and min), and another using the standard deviation. They want an answer correct to the hundredths. Which one am I to use? This is so confusing.
...use both feature scaling (dividing by the "max-min", or range, of a feature) and mean normalization.
So for any individual feature f:
f_norm = (f - f_mean) / (f_max - f_min)
e.g. for x2 = (midterm exam)^2 = {7921, 5184, 8836, 4761}:
> x2 <- c(7921, 5184, 8836, 4761)
> mean(x2)
[1] 6675.5
> max(x2) - min(x2)
[1] 4075
> (x2 - mean(x2)) / (max(x2) - min(x2))
[1]  0.306 -0.366  0.530 -0.470
Hence norm(5184) = -0.366
(using R language, which is great at vectorizing expressions like this)
I agree it's confusing they used the notation x2 (2) to mean x2 (norm) or x2'
EDIT: in practice everyone calls the built-in scale(...) function (note that by default it scales by the standard deviation rather than the range, so the values will differ slightly from the above).
It's asking you to normalize the second feature, in the second column, using both feature scaling and mean normalization. Therefore,
(5184 - 6675.5) / 4075 = -0.366
Usually we normalize all of them to have zero mean and go between [-1, 1].
You can do that easily by dividing by the maximum of the absolute value and then remove the mean of the samples.
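A sketch of that recipe on the (midterm)^2 feature from this question (note one assumption of mine: I subtract the mean first, so the result is both zero-mean and stays within [-1, 1]):

```python
import numpy as np

x = np.array([7921.0, 5184.0, 8836.0, 4761.0])  # the (midterm exam)^2 feature

centered = x - x.mean()                  # remove the mean of the samples
x_norm = centered / np.abs(centered).max()  # divide by the maximum absolute value

print(x_norm)  # zero mean, all values within [-1, 1]
```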
"I'm assuming x_2^(2) is the value 5184" is this because it's the second item in the list and using the subscript _2? x_2 is just a variable identity in maths, it applies to all rows in the list. Note that the highest raw mid-term exam result (i.e. that which is not squared) goes down on the final test and the lowest raw mid-term result increases the most for the final exam result. Theta is a fixed value, a coefficient, so somewhere your normalisation of x_1 and x_2 values must become (EDIT: not negative, less than 1) in order to allow for this behaviour. That should hopefully give you a starting basis, by identifying where the pivot point is.
I had the same problem. In my case I was using as the average the maximum x2 value (8836) minus the minimum x2 value (4761) divided by two, instead of the sum of all x2 values divided by the number of examples.
For the same training set, I got the question as
Q. What is the normalized feature x^(3)_1?
Thus, the 3rd training example and 1st feature works out to 94 in the table above.
Now, normalized form is
x = (x - mean(x's)) / range(x)
Values are:
x = 94
mean = (89 + 72 + 94 + 69) / 4 = 81
range = 94 - 69 = 25
Normalized x = (94 - 81) / 25 = 0.52
I'm taking this course at the moment, and a really trivial mistake I made the first time I answered this question was using a comma instead of a dot in the answer, since I did it by hand and in my country we use a comma to denote decimals, e.g. 0,52 instead of 0.52.
The second time I tried I used a dot, and it worked fine.