Minimum of two functions - geogebra

In GeoGebra, is there a way to define a function of two variables that is the pointwise minimum of two functions?
Like h(x, y):= min(x² + y², x + y).
(The GeoGebra Min command does something different.)
I could work around this by means of the abs function, which is available, using min(a, b) = (a + b - |a - b|) / 2, but this is not very convenient (in fact I need to take the minimum of more than two functions).

You could use a Conditional Function to create a piecewise function that is equal to f(x) if f(x) < g(x) and g(x) otherwise. The definition of this is:
If(f(x) < g(x), f, g)
Here's an example of this in action.

One option: Plot a point A on the x-axis. Compute F=min({f(x(A)),g(x(A)),h(x(A))}). Plot B=(x(A),F). Create Locus(B,A).

If you want the minimum of two functions f(x) and g(x) then you can define (f(x) + g(x) - abs(f(x) - g(x)))/2
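As a quick sanity check of that identity (an illustrative example): if at some point f(x) = 4 and g(x) = 10, then (4 + 10 - |4 - 10|)/2 = (14 - 6)/2 = 4, the smaller of the two values. For more than two functions the expression can be nested, e.g. min(f, min(g, h)).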

Related

Compute chi-square distance in Python

I'm using the kNN model from sklearn (documentation:
https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html) to train a model for image classification. As you can see in the documentation, there is no option to pass the chi-square distance as a metric to KNeighborsClassifier, but there is an option to pass a callable, so I can pass a function that I built to calculate the chi-square distance. So I tried to write my own function.
I know that for two images A and B, the chi-square distance is calculated with this formula:
chi2(A, B) = 0.5 * sum_i ((A_i - B_i)^2 / (A_i + B_i))
My task is to solve this problem without using for loops, because they take too long. I need to solve it using vectorization: the data passed to the chi-square function are images represented as NumPy arrays, so I can do arithmetic on them without loops. For example, for A = [1, 2, 3] and B = [3, 4, 5], A + B directly gives [4, 6, 8], with no loop needed to compute it. I need to compute the chi-square function in the same way.
Anyway, when I tried, for example, this function:
def chi2(A, B):
    # compute the chi-squared distance using the above formula
    chi = 0.5 * (((A - B) ** 2) / (A + B))
    return chi
to calculate the chi-square distance, I get an error, and if I try other similar functions, for example this code:
def chi2_distance(A, B):
    # compute the chi-squared distance using the above formula
    chi = 0.5 * np.sum([((a - b) ** 2) / (a + b)
                        for (a, b) in zip(A, B)])
    return chi
I get warnings: RuntimeWarning: invalid value encountered in double_scalars
for (a, b) in zip(A, B)])
and the program runs practically forever.
Any suggestions for efficient code to calculate the chi-square distance (as I said, with no loops)?
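For illustration, a minimal vectorized sketch of such a metric (assuming A and B are non-negative 1-D NumPy arrays, e.g. flattened images or histograms; the eps term is an added guard against 0/0 where both entries are zero, and the function returns the single scalar a callable KNeighborsClassifier metric must return):

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def chi2_distance(A, B, eps=1e-10):
    # elementwise (A_i - B_i)^2 / (A_i + B_i), summed to one scalar, no Python loop
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    return 0.5 * np.sum((A - B) ** 2 / (A + B + eps))

# note: a Python callable is still evaluated once per pair of points, so it is
# slower than a built-in metric string, but it avoids the per-pixel loop
knn = KNeighborsClassifier(n_neighbors=3, metric=chi2_distance)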

"Z" Variable is undefined when used to represent a matrix for sigmoid function

I'm a high school student and I just started getting into machine learning to further my knowledge of coding. I tried out the program Octave and have been working with neural networks, or at least trying to. In my first program, however, I already found myself at an impasse with my sigmoid gradient function. When I try to make the function work for each value within a matrix, I have no idea how to do so. I tried placing z as the parameter of the function, but it says that "z" itself is undefined. I have no knowledge of C or C++, and I'm still an amateur in this area, so sorry if I take some time to understand. Thanks to anyone who offers to help!
I'm running Octave 4.4.1, and I haven't tried any other solution yet, as I don't really have any.
% Main Code
g = sigGrad([-2 -1 0 1 2]);
% g is supposed to be my sigmoid gradient for each value of theta, which is the matrix within its parameters.

% Sigmoid gradient function
function g = sigGrad(z)
    g = zeros(size(z));
    % This is where the code tells me that z is undefined
    g = sigmoid(z).*(1.-sigmoid(z));
    % I began by initializing a matrix of zeroes with the size of z
    % It should later do the gradient equation, but it marks z as undefined before that

% Sigmoid function
g = sigmoid(z)
    g = 1.0 ./ (1.0 + exp(-z));
From what I can see, you are making simple syntax mistakes. I'd recommend getting a grasp of Octave basics before diving into the code head-on. That being said, you have to declare your functions with proper syntax and use them as shown below:
function g = sigmoid(z)
    % SIGMOID Compute sigmoid function
    % g = SIGMOID(z) computes the sigmoid of z.
    g = 1.0 ./ (1.0 + exp(-z));
end
And the other piece of code should be
function g = sigGrad(z)
    % sigGrad returns the gradient of the sigmoid function evaluated at z
    % g = sigGrad(z) computes the gradient of the sigmoid function evaluated at z.
    % This works regardless of whether z is a matrix or a vector.
    % In particular, if z is a vector or matrix, it returns the gradient for each element.
    g = zeros(size(z));
    g = sigmoid(z).*(1 - sigmoid(z));
end
And then finally call the above implemented functions using:
g = sigGrad([1 -0.5 0 0.5 1]);

Can you pass your own weights to skimage.color.rgb2gray?

The documentation page for skimage.color.rgb2gray says:
The weights used in this conversion are calibrated for contemporary CRT phosphors:
Y = 0.2125 R + 0.7154 G + 0.0721 B
Suppose I want to use my own weights such that Y = Wr * R + Wg * G + Wb * B. Is there a way of passing an array like [Wr, Wg, Wb] to rgb2gray so it uses that instead?
Currently there is no way of doing that, but since the channels are the last axis on the array, it's actually easy to do this with matrix multiplication:
Y = image @ [Wr, Wg, Wb]
So you could very easily write your own (or use this one-liner directly).
(Note: in Python 3.4 or earlier, you would instead use np.dot(image, [Wr, Wg, Wb]).)
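For example, a minimal sketch (the particular weights here are illustrative, e.g. the ITU-R BT.601 luma coefficients, not anything mandated by skimage):

import numpy as np
from skimage import data

image = data.astronaut() / 255.0            # RGB image, channels on the last axis
weights = np.array([0.299, 0.587, 0.114])   # hypothetical custom weights (BT.601 luma)
gray = image @ weights                      # shape (H, W); equivalent to np.dot(image, weights)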

Can you give me a short step-by-step numerical example of the radial basis function kernel trick? I would like to understand how to apply it to a perceptron

I understand the perceptron well, so please put the accent only on the kernel, but I am not familiar with mathematical expressions, so please give me a numerical example and a guide to the kernel.
For example:
My perceptron hyperplane is x1*w1 + x2*w2 + x3*w3 + b = 0. The RBF kernel formula is k(x, z) = exp(-|x - z|^2 / (2*variance^2)), which is where the radial basis function comes in. Is x an input, and what is the z variable here?
And what do I have to take the variance of, if it really is a variance in that formula?
Somewhere I understood that I have to plug this formula into the perceptron decision function x1*w1 + x2*w2 + x3*w3 + b = 0, but what does it look like if I plug it in?
I would like to ask a numerical example to avoid confusion.
Linear Perceptron
As you know, linear perceptrons can be trained for binary classification. More precisely, if there are n features, x1, x2, ..., xn, in n-dimensional space, Rn, and you want to label them in 2 categories, y1 & y2 (usually -1 and +1), you can use a linear perceptron, which defines a hyperplane w1*x1 + ... + wn*xn + b = 0 to do so.
w1*x1 + ... + wn*xn + b > 0 or W.X + b > 0 ==> class = y1
w1*x1 + ... + wn*xn + b < 0 or W.X + b < 0 ==> class = y2
A linear perceptron will work well only if the problem is linearly separable in Rn. For example, in 2D space this means that one line can separate the 2 sets of points.
Algorithm
One common algorithm to train the perceptron, i.e., find weights and bias, w's & b, based on N data points, X1, ..., XN, and their labels, Y1, ..., YN is the following:
Initialize: W = zeros(n,1); b = 0
For i = 1 to N:
    Calculate F(Xi) = W.Xi + b
    If F(Xi)*Yi <= 0:
        W <--- W + Xi*Yi
        b <--- b + Yi
This will give the final values for W & b. Besides, based on the training, W will be a linear combination of the training points, Xi's; more precisely, of the ones that were misclassified. So W = a1*X1 + ... + aN*XN where the a's are in {0, y1, y2}.
Now, if there is a new point, let's say Z, to label, we check the sign of F(Z) = W.Z + b = a1*(X1.Z) + ... + aN*(XN.Z) + b. It is interesting that only the inner product of new point and training points take part in it.
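A minimal NumPy sketch of this training scheme (the repeated passes over the data and the toy points are illustrative additions; the pseudocode above shows a single pass):

import numpy as np

def train_perceptron(X, Y, passes=10):
    # X: (N, n) data matrix, Y: (N,) labels in {-1, +1}
    N, n = X.shape
    W = np.zeros(n)
    b = 0.0
    for _ in range(passes):
        for i in range(N):
            if (W @ X[i] + b) * Y[i] <= 0:   # misclassified (or on the boundary)
                W += Y[i] * X[i]
                b += Y[i]
    return W, b

# toy usage: two linearly separable clusters
X = np.array([[2.0, 2.0], [3.0, 1.0], [-1.0, -2.0], [-2.0, -1.0]])
Y = np.array([1, 1, -1, -1])
W, b = train_perceptron(X, Y)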
Kernel Perceptron
Now, if the problem is not linearly separable, one may try to go to a higher dimensional space in which a hyperplane can do the classification. As an example, consider a circle in 2D space. The points inside and outside of the circle can't be separated by a line. However, if you find a transformation that can take the points to 3D space such that the first 2 coordinates remain the same for all points, and the 3rd coordinate become +1 and -1 for the points inside and outside of the circle respectively, then a plane defined as 3rd coordinate = 0 can separate the points.
Finding such transformations can be difficult and computationally heavy, so the kernel trick is introduced. Notice that we only used the inner product of new points with the training points. Kernel trick employs this fact and defines the inner product of the transformed points without actually finding the transformation.
If the unknown transformation is P(X), then the kernel function will be K(Xi,Xj) = <P(Xi),P(Xj)>. So instead of finding P, kernel functions are defined which represent the scalar result of the inner product in the high-dimensional space. There are also theorems about which functions can be kernel functions, i.e., correspond to an inner product in another space.
After choosing a kernel function, the algorithm will be modified as follows:
Initialize: F(X) = 0
For i = 1 to N:
    Calculate F(Xi)
    If F(Xi)*Yi <= 0:
        F(.) <--- F(.) + K(.,Xi)*Yi + Yi
At the end, F(.) = a1*K(.,X1) + ... + aN*K(.,XN) + b where the a's are in {0, y1, y2}.
RBF Kernel
The radial basis function is one type of kernel function; it actually computes an inner product in an infinite-dimensional space. It can be written as
K(Xi,Xj) = exp(- norm2(Xi-Xj)^2 / (2*sigma^2))
Sigma is a parameter that you can tune to find an optimum value. For example, you can train the model with different values of sigma and then pick the best one based on performance. You can start with sigma = 1.
After training the model to find F(.), for a new data point Z, the sign of F(Z) = a1*K(Z,X1) + ... + aN*K(Z,XN) + b will determine the class.
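To make the kernelized version concrete, here is a small NumPy sketch with an RBF kernel (the repeated passes and sigma = 1 are illustrative choices; alpha plays the role of the a's above):

import numpy as np

def rbf(xi, xj, sigma=1.0):
    # K(Xi, Xj) = exp(-||Xi - Xj||^2 / (2*sigma^2))
    return np.exp(-np.sum((xi - xj) ** 2) / (2 * sigma ** 2))

def train_kernel_perceptron(X, Y, sigma=1.0, passes=10):
    N = X.shape[0]
    alpha = np.zeros(N)   # one coefficient per training point
    b = 0.0
    for _ in range(passes):
        for i in range(N):
            F = sum(alpha[j] * rbf(X[i], X[j], sigma) for j in range(N)) + b
            if F * Y[i] <= 0:          # misclassified: add K(., Xi)*Yi and Yi
                alpha[i] += Y[i]
                b += Y[i]
    return alpha, b

def classify(Z, X, alpha, b, sigma=1.0):
    F = sum(alpha[j] * rbf(Z, X[j], sigma) for j in range(X.shape[0])) + b
    return 1 if F > 0 else -1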
Remarks:
Regarding your question about the variance, you don't need to compute any variance.
About x and z in your question: in each iteration, you should compute the kernel output for the current data point and all the previously added points (the points that were misclassified and hence were added to F).
I couldn't come up with a simple instructive numerical example.
References:
I borrowed some notation from
http://alex.smola.org/teaching/pune2007/pune_3.pdf

Mathematica NMinimize runs into memory problems

I'm trying to minimize my function "FunctionToMinimize", which is defined as follows:
FunctionToMinimize[a_, b_, c_, d_] :=
  (2.35*Sqrt[Variance[1/2*(a*#1 + b*#2 + c*#3 + d*#4)]] /
     Mean[1/2*(a*#1 + b*#2 + c*#3 + d*#4)]) &[
   DataList1[[1 ;; 1000]], DataList2[[1 ;; 1000]],
   DataList3[[1 ;; 1000]], DataList4[[1 ;; 1000]]]
The four parameters a, b, c and d are restricted to lie somewhere between 0.5 and 1.5. My problem now is that if I call
NMinimize[{FunctionToMinimize[w, x, y, z],
   0.75 < w < 1.25 && 0.75 < y < 1.25 && 0.75 < x < 1.25 && 0.75 < z < 1.25},
  {w, x, y, z}]
the Mathematica kernel shuts down because it does not have enough memory. If I use only the first 100 entries in my DataLists, it finds results (in 4.1 sec), but if I use DataList[[1;;1000]] or even more entries, the kernel crashes.
Does anybody have an idea why the NMinimize function uses so much memory? I would need to run the minimization for 150'000 events in each list...
Thanks for your answer,
Cheers,
Andreas
I would guess (but haven't in any way checked) that the problem is that on each call to your function, Mathematica is trying to construct a symbolic expression derived from all your data and that occupies much more memory than you'd expect.
Regardless, the good news -- if you haven't long since moved on and forgotten about this problem -- is that you can turn the function into something much simpler.
So, first of all, the 2.35 and the 1/2s just change your function by a constant factor and don't affect where the minimum is, so let's ignore them. Next, your function is always non-negative, so minimizing it is the same as minimizing its square, so let's do that.
So now you're trying to minimize var(aw+bx+cy+dz)/mean(aw+bx+cy+dz)^2 where w,x,y,z are (perhaps quite long) vectors.
Now your numerator and denominator are both just quadratic forms in a,b,c,d whose coefficients depend (in fixed ways) on those vectors. Specifically, suppose your vectors have length N. Then your function is just
[sum((aw+bx+cy+dz)^2)/N - (sum(aw+bx+cy+dz))^2/N^2] / ((sum(aw+bx+cy+dz))^2/N^2)
which you might prefer to write as N*sum((aw+bx+cy+dz)^2) / (sum(aw+bx+cy+dz))^2 - 1
and in that fraction, e.g., the coefficient of bc in the numerator's sum of squares is 2 sum(xy), and the corresponding coefficient in the denominator is 2 sum(x) sum(y).
So you can take your big vectors, compute the relevant coefficients once, and then just ask Mathematica to optimize a function of the form (quadratic / quadratic), which should be pretty painless.
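As a quick numerical check of that reduction (a NumPy sketch with synthetic stand-in data, purely illustrative; the actual minimization would still be set up in Mathematica):

import numpy as np

rng = np.random.default_rng(0)
w, x, y, z = rng.random((4, 1000)) + 0.5    # stand-ins for the four data lists
a, b, c, d = 1.1, 0.9, 1.2, 0.8             # arbitrary parameter values to test

v = a*w + b*x + c*y + d*z
direct = np.var(v) / np.mean(v)**2          # var/mean^2 computed directly

# quadratic-form version: precompute the coefficient data once
D = np.stack([w, x, y, z])                  # 4 x N matrix of the data vectors
S = D @ D.T                                 # entries sum(w*w), sum(w*x), ... (numerator form)
m = D.sum(axis=1)                           # entries sum(w), sum(x), ...     (denominator form)
p = np.array([a, b, c, d])
N = D.shape[1]
reduced = N * (p @ S @ p) / (p @ m)**2 - 1  # N*sum(v^2)/(sum v)^2 - 1
print(np.allclose(direct, reduced))         # True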
