Point division Elliptic Curve in Java

Suppose I have G(x,y) = k.P(x,y). I know G(x,y) and P(x,y).
How do I calculate k?

If G(x,y) and P(x,y) are on a secure elliptic curve for ECC, the problem of solving for k is called the "elliptic curve discrete logarithm problem", or ECDLP. It is infeasible to find k on a secure elliptic curve.
If you're not on such a curve, enumerating all possible values of k and checking whether G = kP is a reasonable approach.
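For illustration, here is a minimal brute-force search on a toy curve over F_97 (the curve parameters and base point are invented purely for the demo; on a real ECC curve the size of the search space makes this hopeless):

    # Toy demo: brute-forcing k in G = k*P on the tiny curve
    # y^2 = x^3 + 2x + 3 over F_97.
    P_MOD, A = 97, 2

    def inv_mod(x):
        return pow(x, P_MOD - 2, P_MOD)  # Fermat inverse; valid since 97 is prime

    def point_add(P, Q):
        """Elliptic-curve group law; None represents the point at infinity."""
        if P is None: return Q
        if Q is None: return P
        (x1, y1), (x2, y2) = P, Q
        if x1 == x2 and (y1 + y2) % P_MOD == 0:
            return None                      # P + (-P) = infinity
        if P == Q:
            lam = (3 * x1 * x1 + A) * inv_mod(2 * y1) % P_MOD
        else:
            lam = (y2 - y1) * inv_mod((x2 - x1) % P_MOD) % P_MOD
        x3 = (lam * lam - x1 - x2) % P_MOD
        return (x3, (lam * (x1 - x3) - y1) % P_MOD)

    def brute_force_k(G, P, max_k=100_000):
        """Enumerate k = 1, 2, ... until k*P == G; only feasible on toy curves."""
        acc = None
        for k in range(1, max_k + 1):
            acc = point_add(acc, P)
            if acc == G:
                return k
        return None

    P0 = (3, 6)                           # on the curve: 3^3 + 2*3 + 3 = 36 = 6^2 (mod 97)
    G = point_add(point_add(P0, P0), P0)  # G = 3*P0
    print(brute_force_k(G, P0))           # -> 3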

Related

Question about the composite body algorithm implementation in Drake

I am trying to relate algorithm 9.3 in Jain to the implementation of the composite body algorithm in Drake.
The documentation mentions that the hinge matrix is the transpose of the one used in Jain. Looking at Featherstone (2008), it seemed this implementation was more in the spirit of section 6.3 of Featherstone, i.e. S* = H, and the matrix is calculated column by column rather than blockwise. I wanted to double-check whether this is the case.
Yes, that's right. Although we use Jain's formulation and terminology (mostly), we like to think of the joint motion matrix as a small Jacobian, as Featherstone does with his "S" (∂V/∂v, i.e. the partial of the spatial velocity V across the joint w.r.t. the generalized velocities v of that joint). That way our H matrix follows the same orientation convention as our full robotics Jacobian (where J maps velocities and Jᵀ maps forces).
We don't necessarily write down the H matrix for a given joint but rather have that joint (technically, mobilizer) provide functions for multiplying by H and Hᵀ.
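As a rough sketch of that matrix-free style (class and method names here are hypothetical, not Drake's actual API), a 1-DoF revolute mobilizer might expose products with H and Hᵀ like this:

    # Hypothetical sketch of the matrix-free idea: the mobilizer implements
    # products with H and H^T instead of storing H explicitly.
    import numpy as np

    class RevoluteMobilizer:
        """Toy 1-DoF revolute joint about z; its H is a 6x1 'small Jacobian'."""
        # With angular components first, H = [0 0 1 0 0 0]^T, so V = H * v.

        def multiply_by_H(self, v):
            V = np.zeros(6)
            V[2] = v[0]              # only the z angular rate is nonzero
            return V

        def multiply_by_H_transpose(self, F):
            return np.array([F[2]])  # H^T maps a spatial force to a joint torque

    m = RevoluteMobilizer()
    print(m.multiply_by_H(np.array([1.5])))          # spatial velocity across the joint
    print(m.multiply_by_H_transpose(np.arange(6.)))  # generalized force tau = F[2]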

Eigenvalues of symmetric band matrix using Accelerate framework

In a macOS/iOS code base, I've got a real symmetric band matrix that can be anywhere from 10 × 10 to about 500 × 500, and I need to compute whether all its eigenvalues are greater than (or equal to) a certain threshold. So I only strictly need to know the lowest eigenvalue, in case that helps.
Is there any function or set of functions in Apple's Accelerate framework that can provide a full or partial solution to this? Ideally with a cost proportional to the number of non-zero entries.
It appears there's a set of LAPACK functions that compute eigenvalues efficiently for banded symmetric matrices. (LAPACK is implemented as part of the Accelerate framework.)
As I understand it, ssbtrd followed by ssterf should do the trick.
SSBTRD reduces a real symmetric band matrix A to symmetric
tridiagonal form T by an orthogonal similarity transformation:
Q**T * A * Q = T.
SSTERF computes all eigenvalues of a symmetric tridiagonal matrix
using the Pal-Walker-Kahan variant of the QL or QR algorithm.
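For a quick sanity check of the banded route without writing C or Swift, SciPy wraps the corresponding LAPACK machinery; here is a sketch that extracts only the smallest eigenvalue of a band-stored symmetric matrix (with Accelerate you would instead call the LAPACK routines directly):

    # Sketch via SciPy's LAPACK wrappers: smallest eigenvalue of a symmetric
    # band matrix held in band storage, without forming the full spectrum.
    import numpy as np
    from scipy.linalg import eigvals_banded

    n, k = 500, 2                                # dimension, number of superdiagonals
    rng = np.random.default_rng(0)

    # Upper band storage: a_band[-1] is the main diagonal; the rows above it
    # hold the superdiagonals (their leading entries are padding LAPACK ignores).
    a_band = np.zeros((k + 1, n))
    a_band[-1] = rng.uniform(2.0, 3.0, n)        # heavy diagonal, roughly positive definite
    a_band[:-1] = rng.uniform(-0.5, 0.5, (k, n))

    # select='i' with select_range=(0, 0) requests only eigenvalue index 0,
    # i.e. the smallest one.
    lam_min = eigvals_banded(a_band, select='i', select_range=(0, 0))[0]
    threshold = 0.5
    print(lam_min, lam_min >= threshold)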

ML Classification - Decision Boundary Algorithm

Given a classification problem in machine learning, the hypothesis is described as follows:
h_θ(x) = g(θᵀx)
z = θᵀx
g(z) = 1 / (1 + e^(−z))
In order to get our discrete 0 or 1 classification, we can translate the output of the hypothesis function as follows:
h_θ(x) ≥ 0.5 → y = 1
h_θ(x) < 0.5 → y = 0
The way our logistic function g behaves is that when its input is greater than or equal to zero, its output is greater than or equal to 0.5:
g(z) ≥ 0.5 when z ≥ 0
Remember:
z = 0: e^0 = 1 ⇒ g(z) = 1/2
z → ∞: e^(−z) → 0 ⇒ g(z) = 1
z → −∞: e^(−z) → ∞ ⇒ g(z) = 0
So if our input to g is θᵀx, then that means:
h_θ(x) = g(θᵀx) ≥ 0.5 when θᵀx ≥ 0
From these statements we can now say:
θᵀx ≥ 0 ⇒ y = 1
θᵀx < 0 ⇒ y = 0
The decision boundary is the line that separates the region where y = 0 from the region where y = 1, and it is created by our hypothesis function.
What part of this relates to the Decision Boundary? Or where does the Decision Boundary algorithm come from?
This is basic logistic regression with a threshold. Your θᵀx is just the dot product of your weight vector with your input. If you put that into the logistic function, which outputs a value strictly between 0 and 1, you threshold that value at 0.5: if it's at or above the threshold you treat the input as a positive sample, and as a negative one otherwise.
The classification algorithm is just that simple. The training is a bit more complicated: its goal is to find a weight vector θ that correctly classifies all your labeled data, or at least as much of it as possible. The way to do this is to minimize a cost function that measures the difference between the output of your function and the expected label. You can do this using gradient descent; I believe Andrew Ng teaches this.
Edit: your classification rule is y = 1 if g(θᵀx) ≥ 0.5 and y = 0 if g(θᵀx) < 0.5, so a basic step function.
Solving θᵀx = 0 gives the decision boundary; the two half-spaces θᵀx ≥ 0 and θᵀx < 0 are the two predicted classes. The RHS of the inequality (i.e. 0) comes from the sigmoid function.
Theta gives you the hypothesis that best fits the training set.
From theta, you can compute the decision boundary: it is the locus of points where θᵀx = 0, or equivalently where g(θᵀx) = 0.5.
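Putting the pieces above together, here is a minimal sketch of the thresholded classifier and its linear boundary (the θ values are made up for the example):

    # Minimal sketch of the thresholded logistic classifier described above.
    import numpy as np

    def g(z):
        """Sigmoid: maps any real z into (0, 1)."""
        return 1.0 / (1.0 + np.exp(-z))

    def predict(theta, X):
        """y = 1 exactly when theta^T x >= 0, i.e. when g(theta^T x) >= 0.5."""
        return (X @ theta >= 0).astype(int)

    # Example: theta = [-3, 1, 1] with x = [1, x1, x2] (bias feature first).
    # The decision boundary is theta^T x = 0, i.e. the line x1 + x2 = 3.
    theta = np.array([-3.0, 1.0, 1.0])
    X = np.array([[1.0, 1.0, 1.0],    # x1 + x2 = 2   < 3  -> y = 0
                  [1.0, 2.5, 1.0],    # x1 + x2 = 3.5 >= 3 -> y = 1
                  [1.0, 2.0, 1.0]])   # x1 + x2 = 3, on the boundary -> y = 1
    print(predict(theta, X))          # [0 1 1]
    print(g(X @ theta))               # probabilities: ~0.27, ~0.62, 0.5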

SVM - Can I normalize W vector?

In SVM, is there something wrong with normalizing the W vector like this:
for each i: W_i = W_i / norm(W)
I'm confused. At first sight it seems that the result sign(<W, x>) will be the same. But if so, in the loss function norm(W)^2 + C*Sum(hinge_loss) we could minimize W just by setting W = W / (large number).
So, where am I wrong?
I suggest you read either my minimal 5 ideas of SVMs or, better:
[Bur98] C. J. Burges, "A tutorial on support vector machines for pattern recognition", Data Mining and Knowledge Discovery, vol. 2, no. 2, pp. 121–167, 1998.
To answer your question: SVMs define a hyperplane to separate the data. A hyperplane is defined by a normal vector w and a bias b: wᵀx + b = 0.
If you change only w, this gives a different hyperplane. However, SVMs do more tricks (see my 5 ideas), and the weight vector is in fact normalized so that it stands in a fixed relationship to the margin between the two classes.
I think you are missing the constraint that
y(wᵀx + w₀) ≥ 1 for all examples (where y is the ±1 label), so normalizing the weight vector will violate this constraint.
In fact, this constraint is introduced into the SVM in the first place to achieve a unique solution: otherwise, as you mentioned, infinitely many solutions are possible just by scaling the weight vector.
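A small numeric sketch of that point: shrinking w does reduce ||w||², but the hinge loss in the objective immediately penalizes the violated margin constraints (toy data, C = 1 assumed):

    # Why rescaling w is not "free": shrinking w lowers ||w||^2 but violates
    # the margin constraints y_i (w^T x_i + b) >= 1, so the hinge loss grows.
    import numpy as np

    def svm_objective(w, b, X, y, C=1.0):
        margins = y * (X @ w + b)
        hinge = np.maximum(0.0, 1.0 - margins)   # zero only when the constraint holds
        return 0.5 * np.dot(w, w) + C * hinge.sum()

    X = np.array([[2.0, 0.0], [-2.0, 0.0]])
    y = np.array([1.0, -1.0])
    w, b = np.array([0.5, 0.0]), 0.0             # margins y_i(w.x_i + b) exactly 1

    print(svm_objective(w, b, X, y))             # 0.125: constraints tight, no hinge loss
    print(svm_objective(w / 10, b, X, y))        # ~1.80: ||w||^2 shrank, hinge loss exploded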

Scikit-learn - Stochastic Gradient Descent with custom cost and gradient functions

I am implementing matrix factorization to predict a movie rating by a reviewer. The dataset is taken from MovieLens (http://grouplens.org/datasets/movielens/). This is a well-studied recommendation problem, so I am implementing this matrix factorization method purely for learning purposes.
I model the cost function as the root-mean-square error between predicted and actual ratings in the training dataset. I use the scipy.optimize.minimize function (with conjugate gradient descent) to factor the movie rating matrix, but this optimization tool is too slow even for a dataset with only 100K ratings. I plan to scale my algorithm to the dataset with 20 million ratings.
I have been searching for a Python-based solution for stochastic gradient descent, but the stochastic gradient descent in scikit-learn does not allow me to use my custom cost and gradient functions.
I can implement my own stochastic gradient descent, but I am checking with you guys whether a tool for doing this already exists.
Basically, I am wondering if there is such as API that is similar to this:
    optimize.minimize(my_cost_function,
                      my_input_param,
                      jac=my_gradient_function,
                      ...)
Thanks!
This (at least the vanilla method) is so simple to implement that I don't think there is a "framework" around it. It is just
    my_input_param -= alpha * my_gradient_function(my_input_param)
(note the minus sign: to minimize, you step against the gradient). Maybe you want to have a look at Theano, which will do the differentiation for you. Depending on what you want to do, it might be a bit overkill, though.
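For completeness, a vanilla SGD loop for the matrix-factorization case is only a few lines; this is a sketch with made-up hyperparameters, not a library API:

    # Vanilla-SGD sketch for matrix factorization with a hand-written gradient.
    import random
    import numpy as np

    def sgd_mf(ratings, n_users, n_items, k=8, alpha=0.02, lam=0.05, epochs=500):
        """ratings: list of (user, item, rating) triples."""
        rng = np.random.default_rng(0)
        U = 0.1 * rng.standard_normal((n_users, k))   # user factors
        V = 0.1 * rng.standard_normal((n_items, k))   # item factors
        for _ in range(epochs):
            random.shuffle(ratings)                   # stochastic: one rating at a time
            for u, i, r in ratings:
                err = U[u] @ V[i] - r                 # derivative of the squared error
                gu = err * V[i] + lam * U[u]          # L2-regularized gradients
                gv = err * U[u] + lam * V[i]
                U[u] -= alpha * gu                    # note -=: step against the gradient
                V[i] -= alpha * gv
        return U, V

    ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0)]
    U, V = sgd_mf(ratings, n_users=2, n_items=3)
    print(U[0] @ V[0])    # moves toward the observed rating of 5.0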
I've been trying to do something similar in R, but with a different custom cost function.
As I understand it, the key is to find the gradient and see which way takes you towards the local minimum.
With linear regression (y = mx + c) and a least-squares cost, our per-example cost function is
(mx + c - y)^2
The partial derivative of this with respect to m is
2x(mx + c - y)
Which, with the more traditional machine learning notation where m = theta, gives us
theta <- theta - learning_rate * t(X) %*% (X %*% theta - y) / length(y)
I don't know this for sure, but I would assume that for linear regression with a cost function of sqrt(mx + c - y), the gradient step is the partial derivative with respect to m, which I believe is
x/(2*sqrt(mx + c - y))
If any/all of this is incorrect please (anybody) correct me. This is something I am trying to learn myself and would appreciate knowing if I'm heading in completely the wrong direction.
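For what it's worth, the chain rule pins these down; the factor is x, not m, because m is the variable being differentiated with respect to:

    \frac{\partial}{\partial m}\,(mx + c - y)^2 = 2x\,(mx + c - y)

    \frac{\partial}{\partial m}\,\sqrt{mx + c - y} = \frac{x}{2\sqrt{mx + c - y}}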
