support vector machines - solving for alphas

In SVM, this is our dual-problem optimization objective, with the following constraints and with the alphas between 0 and C. How do I find the alphas given this optimization objective and these constraints?
Also, please correct me if I am wrong somewhere.

Well, you want to maximize the Lagrangian of this optimization problem, right?
So what you did was to set the partial derivatives of the Lagrangian with respect to the primal variables to zero, because at the optimum they are equal to zero. That is the necessary condition for a stationary point, but because of the convexity of your original problem a stationary point is also the optimal solution.
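Concretely (writing the usual soft-margin primal; I'm assuming this matches your notation), stationarity of the Lagrangian in the primal variables gives

```latex
\frac{\partial L}{\partial w} = 0 \;\Rightarrow\; w = \sum_i \alpha_i y_i x_i,
\qquad
\frac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_i \alpha_i y_i = 0,
```

and substituting these back yields the dual you are maximizing over the alphas:

```latex
\max_{0 \le \alpha_i \le C}\;\; \sum_i \alpha_i \;-\; \frac{1}{2}\sum_{i,j} \alpha_i \alpha_j \, y_i y_j \, K(x_i, x_j)
\quad \text{s.t.} \quad \sum_i \alpha_i y_i = 0.
```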
To maximize over the alphas, you want to use the SMO algorithm as described here:
https://en.m.wikipedia.org/wiki/Sequential_minimal_optimization
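If you want to see the mechanics rather than use a library, here is a minimal sketch of the *simplified* SMO variant with a linear kernel (C, tol, max_passes and the random choice of the partner index are illustrative placeholders, not tuned settings; y is assumed to be a vector of +/-1 labels):

```python
import numpy as np

def simplified_smo(X, y, C=1.0, tol=1e-3, max_passes=10):
    """Simplified SMO: optimize pairs of alphas until no pair changes."""
    n = X.shape[0]
    alpha, b = np.zeros(n), 0.0
    K = X @ X.T                        # linear kernel; swap in any PSD kernel
    passes = 0
    while passes < max_passes:
        changed = 0
        for i in range(n):
            Ei = (alpha * y) @ K[:, i] + b - y[i]          # prediction error
            if (y[i] * Ei < -tol and alpha[i] < C) or (y[i] * Ei > tol and alpha[i] > 0):
                j = np.random.choice([k for k in range(n) if k != i])
                Ej = (alpha * y) @ K[:, j] + b - y[j]
                ai_old, aj_old = alpha[i], alpha[j]
                # box constraints from 0 <= alpha <= C and sum(alpha*y) = 0
                if y[i] != y[j]:
                    L, H = max(0, aj_old - ai_old), min(C, C + aj_old - ai_old)
                else:
                    L, H = max(0, ai_old + aj_old - C), min(C, ai_old + aj_old)
                eta = 2 * K[i, j] - K[i, i] - K[j, j]      # second derivative
                if L == H or eta >= 0:
                    continue
                alpha[j] = np.clip(aj_old - y[j] * (Ei - Ej) / eta, L, H)
                if abs(alpha[j] - aj_old) < 1e-5:
                    continue
                alpha[i] = ai_old + y[i] * y[j] * (aj_old - alpha[j])
                # update the bias so the KKT conditions hold for i and j
                b1 = b - Ei - y[i] * (alpha[i] - ai_old) * K[i, i] \
                            - y[j] * (alpha[j] - aj_old) * K[i, j]
                b2 = b - Ej - y[i] * (alpha[i] - ai_old) * K[i, j] \
                            - y[j] * (alpha[j] - aj_old) * K[j, j]
                b = b1 if 0 < alpha[i] < C else b2 if 0 < alpha[j] < C else (b1 + b2) / 2
                changed += 1
        passes = passes + 1 if changed == 0 else 0
    return alpha, b
```

Real implementations (e.g. LIBSVM) use smarter heuristics for picking the pair (i, j) and cache kernel values, but the pairwise update above is the core of the algorithm.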
Good luck!

Related

If we can clip gradient in WGAN, why bother with WGAN-GP?

I am working on WGAN and would like to implement WGAN-GP.
In its original paper, WGAN-GP is implemented with a gradient penalty because of the 1-Lipschitz constraint. But packages out there like Keras can clip the gradient norm at 1 (which, by definition, is equivalent to the 1-Lipschitz constraint), so why do we bother to penalize the gradient? Why don't we just clip the gradient?
The reason is that clipping is, in general, a pretty hard constraint in the mathematical sense, not in the sense of implementation complexity. If you check the original WGAN paper, you'll notice that the clip procedure takes the model's weights and some hyperparameter c, which controls the range for clipping.
If c is small, then the weights would be severely clipped to a tiny range of values. The question is how to determine an appropriate value of c. It depends on your model, the dataset in question, the training procedure, and so on. So why not try a soft penalty instead of hard clipping? That's why the WGAN-GP paper introduces an additional term in the loss function that forces the gradient norm to be as close to 1 as possible, avoiding a hard collapse to predefined values.
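To make the penalty concrete, here is a minimal PyTorch-style sketch (the names critic, real, fake and the weight lambda_gp = 10 are assumptions for illustration; the interpolation trick follows the WGAN-GP paper, and 4-D image batches are assumed):

```python
import torch

def gradient_penalty(critic, real, fake, lambda_gp=10.0):
    # mix real and fake samples at a random per-sample ratio
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(x_hat)
    # gradient of the critic's output w.r.t. the interpolated inputs
    grads = torch.autograd.grad(outputs=scores, inputs=x_hat,
                                grad_outputs=torch.ones_like(scores),
                                create_graph=True)[0]
    grad_norm = grads.flatten(start_dim=1).norm(2, dim=1)
    # soft penalty pulling the norm toward 1, instead of hard clipping
    return lambda_gp * ((grad_norm - 1) ** 2).mean()
```

This term is simply added to the critic's loss, so the "norm = 1" condition is encouraged rather than enforced.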
The answer by CaptainTrunky is correct, but I also wanted to point out one really important aspect.
Citing the original WGAN-GP paper:
Implementing k-Lipschitz constraint via weight clipping biases the critic towards much simpler functions. As stated previously in [Corollary 1], the optimal WGAN critic has unit gradient norm almost everywhere under Pr and Pg; under a weight-clipping constraint, we observe that our neural network architectures which try to attain their maximum gradient norm k end up learning extremely simple functions.
So as you can see, weight clipping may lead to undesired behaviour (it depends on the data you want to generate; the authors of the article state that it doesn't always behave like that). When you try to train a WGAN to generate more complex data, the task has a high probability of failure.

How do I segment the connected characters in this case?

It seems that I need some advice on segmenting connected characters (see the image below).
As you can see, C and U, as well as 4, 9 and 9, are connected, and therefore when I try to draw contours they are joined into one block. Unfortunately, there are plenty of such problematic images, so I think I need to find some solution.
I have tried using different morphological transforms (erosion, dilation, opening), but that doesn't solve the problem.
Thanks in advance for any recommendations.
It seems to me that the best solution would be to work on the preprocessing, if there is a possibility.
Otherwise, you can try machine learning techniques. You may get inspiration from Viola-Jones or Histograms of Oriented Gradients + SVM (even though those algorithms solve a problem that differs from optical character recognition, I got plenty of insights from them). In other words, try "sliding" a window of a predefined aspect ratio horizontally along the line and recognizing the characters, as in the sketch below. The catch is that you will need to train a model, which may require a lot of data.
As I said earlier, it may be a good idea to reconsider the image preprocessing step. By the way, it seems that in the case of "C" and "U", erosion may help.
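A rough sketch of that sliding-window idea (classify() stands in for whatever trained model you end up with, e.g. HOG features + SVM; the window width, stride and score threshold are placeholders):

```python
def sliding_window_chars(line_img, win_w=24, stride=4, score_thresh=0.8):
    """Scan a binarized text-line image with a fixed-width window.

    line_img: 2-D numpy array of one text line.
    classify(window) -> (label, score) is a hypothetical trained
    character classifier you must supply (e.g. HOG features + SVM).
    """
    h = line_img.shape[0]               # window spans the full line height
    detections = []
    for x in range(0, line_img.shape[1] - win_w + 1, stride):
        window = line_img[0:h, x:x + win_w]
        label, score = classify(window)  # hypothetical trained model
        if score > score_thresh:
            detections.append((x, label, score))
    # overlapping hits on the same character should then be merged,
    # e.g. with non-maximum suppression over the x positions
    return detections
```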
Good luck! :)

Homography and projective transformation

I'm trying to write code that will do a projective transformation, but with more than 4 key points. I found this helpful guide, but it uses 4 reference points:
https://math.stackexchange.com/questions/296794/finding-the-transform-matrix-from-4-projected-points-with-javascript
I know that MATLAB has a function, cp2tform, that handles this, but I haven't found a way so far.
Can anyone give me some guidance on how to do this? I can solve the equations using least squares, but I'm stuck since I have a matrix that is larger than 3×3 and I can't multiply the homogeneous coordinates.
Thanks
If you have more than four control points, you have an overdetermined system of equations. There are two possible scenarios. Either your points are all compatible with the same transformation; in that case, any four of them can be used, and the rest will match the transformation exactly, at least in theory. For the sake of numeric stability you'd probably want to choose your points so that they are far from being collinear.
Or your points are not all compatible with a single projective transformation. In this case, all you can hope for is an approximation. If you want the best approximation, you'll have to be more specific about what “best” means, i.e. some kind of error measure. Measuring things in a projective setup is inherently tricky, since there are usually a lot of arbitrary decisions involved.
What you can try is fixing one matrix entry (e.g. the lower right one to 1), then writing the conditions for the remaining 8 coordinates as a system of linear equations and performing a least-squares approximation. But the choice of matrix representative (i.e. fixing one entry here) affects the least-squares error measure while it has no effect on the geometric meaning, so this is a pretty arbitrary choice. And if the lower right entry of the desired matrix should happen to be zero, your computation will run into numeric problems.
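To make that concrete, here is a NumPy sketch of the least-squares approach just described, fixing the lower-right entry to 1, and including the homogeneous division you were stuck on:

```python
import numpy as np

def fit_homography(src, dst):
    """src, dst: (n, 2) arrays of corresponding points, n >= 4."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), similarly for v,
        # rearranged into linear equations in h0..h7
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)   # lower-right entry fixed to 1

def apply_homography(H, pts):
    """Lift (n, 2) points to homogeneous coordinates, map, then divide."""
    p = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    return p[:, :2] / p[:, 2:3]
```

Note that this minimizes the algebraic error of the rearranged equations, not a geometric reprojection error, which is exactly the arbitrariness discussed above.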

from heatmap, calculate discrete points

Sorry if this question might sound like another "I'm too lazy to google" one, but I couldn't find what I'm looking for.
This question is about avoiding reinventing the wheel. I think that what I'm looking for might already exist, so I don't want to start implementing it on my own:
I want to turn a heat map into a list of discrete points. I'm sure an algorithm can be used which first thresholds the heatmap and then, for every "island" created by the thresholding, finds its center of mass. Each such center would be a point. I want to get the list of these points:
(Images illustrating Step 0, Step 1 and Step 2 omitted.)
I wonder if such an algorithm already exists, or if there is a better approach than my idea. Moreover, it would be perfect if there were a ready-to-use implementation, of course. E.g. computer vision libraries like OpenCV have the thresholding included; I just couldn't find the next step.
My target platform is iOS, so for implementations, Objective-C, C or C++ is preferred.
You could:
1. Apply a binary threshold to the source.
2. Find the contours.
3. Calculate each contour's mass center using image moments.
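In Python (cv2) those three steps could look like this; the threshold value 127 and the file name are placeholders:

```python
import cv2

img = cv2.imread("heatmap.png", cv2.IMREAD_GRAYSCALE)
_, bw = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
# OpenCV 4 signature; OpenCV 3's findContours returns an extra value
contours, _ = cv2.findContours(bw, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

centers = []
for c in contours:
    m = cv2.moments(c)
    if m["m00"] > 0:                     # skip degenerate zero-area contours
        centers.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
print(centers)
```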
There are a lot of ways to achieve what you are looking for. This is one of them:
By applying cv::threshold() you should get something like this:
Now it's time for cv::distanceTransform() and cv::normalize().
You can see the result better by applying cv::applyColorMap().
Next step, cv::connectedComponents(), to make sure there won't be anything left connected.
And finally cv::approxPolyDP() and cv::minEnclosingCircle() to find the centers.
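The same pipeline sketched in Python's cv2 bindings (the C++ names map one-to-one); every threshold and parameter here is illustrative, and I pass the component pixels straight to minEnclosingCircle rather than approximating polygons first:

```python
import cv2
import numpy as np

img = cv2.imread("heatmap.png", cv2.IMREAD_GRAYSCALE)
_, bw = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)

# distance to the nearest background pixel, normalized to [0, 1]
dist = cv2.distanceTransform(bw, cv2.DIST_L2, 5)
dist = cv2.normalize(dist, None, 0, 1.0, cv2.NORM_MINMAX)
vis = cv2.applyColorMap((dist * 255).astype(np.uint8),
                        cv2.COLORMAP_JET)        # for inspection only

# re-threshold the distance map so touching blobs separate, then label them
_, peaks = cv2.threshold(dist, 0.5, 1.0, cv2.THRESH_BINARY)
n, labels = cv2.connectedComponents(peaks.astype(np.uint8))

centers = []
for i in range(1, n):                            # label 0 is the background
    pts = np.column_stack(np.where(labels == i))[:, ::-1].astype(np.float32)
    (cx, cy), r = cv2.minEnclosingCircle(pts)
    centers.append((cx, cy))
```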
I hope this was helpful!
Keep up the good work and have fun :)
You can have a look at my post for the question linked below. Basically, after thresholding you can get some circles or ellipses. Then you can fit a Gaussian mixture model on them to estimate the exact centers. There are many existing libraries to fit GMMs:
Robust tracking of blobs
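A sketch of the GMM idea with scikit-learn (the number of components n_blobs is an assumption you'd estimate, e.g. from the connected components found above):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def gmm_centers(heatmap, thresh, n_blobs):
    """Fit a GMM to the coordinates of above-threshold pixels;
    the component means approximate the blob centers."""
    ys, xs = np.where(heatmap > thresh)
    pts = np.column_stack([xs, ys])
    gmm = GaussianMixture(n_components=n_blobs).fit(pts)
    return gmm.means_                    # (n_blobs, 2) array of (x, y) centers
```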

motion reconstruction from a single camera

I have a single calibrated camera (known intrinsic parameters, i.e. camera matrix K is known, as well as the distortion coefficients).
I would like to reconstruct the camera's 3d trajectory. There is no a-priori knowledge about the scene.
I simplify the problem by taking two images that look at the same scene and extracting two sets of corresponding matched feature points from them (SIFT, SURF, ORB, etc.).
My problem is: how can I calculate the camera extrinsic parameters (i.e. the rotation matrix R and the translation vector t) between the two viewpoints?
I have managed to calculate the fundamental matrix and, since K is known, the essential matrix as well. Using David Nister's efficient solution to the Five-Point Relative Pose Problem I've managed to get 4 possible solutions, but:
the constraint on the essential matrix E ~ U * diag(s, s, 0) * V' doesn't always hold, causing incorrect results.
[EDIT]: taking the average singular value seems to correct the results :) one down
how can I tell which one of the four is the correct one?
Thanks
Your solution to point 1 is correct: diag( (s1 + s2)/2, (s1 + s2)/2, 0).
As for telling which one of the four solutions is correct, only one will give positive depths for all points with respect to the camera frame. That's the one you want.
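If you want to run the test yourself, here is a NumPy sketch of the four-fold decomposition plus the positive-depth check (x1, x2 are matched points in normalized camera coordinates; note that OpenCV's cv2.recoverPose() packages exactly this):

```python
import numpy as np

def decompose_and_pick(E, x1, x2):
    """Return the (R, t) candidate with the most points in front of both cameras."""
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0: U = -U          # enforce proper rotations
    if np.linalg.det(Vt) < 0: Vt = -Vt
    W = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]])
    candidates = [(R, t) for R in (U @ W @ Vt, U @ W.T @ Vt)
                         for t in (U[:, 2], -U[:, 2])]
    return max(candidates, key=lambda Rt: count_in_front(Rt, x1, x2))

def count_in_front(Rt, x1, x2):
    R, t = Rt
    good = 0
    for a, b in zip(x1, x2):
        # triangulate: solve z1 * R @ a_h + t = z2 * b_h for the depths z1, z2
        A = np.column_stack([R @ np.append(a, 1.0), -np.append(b, 1.0)])
        z, *_ = np.linalg.lstsq(A, -t, rcond=None)
        good += (z[0] > 0) and (z[1] > 0)    # positive depth in both cameras
    return good
```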
Code for checking which solution is correct can be found here: http://cs.gmu.edu/%7Ekosecka/examples-code/essentialDiscrete.m from http://cs.gmu.edu/%7Ekosecka/bookcode.html
They use the determinants of U and V to determine the solution with the correct orientation. Look for the comment "then four possibilities are". Since you're only estimating the essential matrix, it's susceptible to noise and does not behave well at all if all of the points are coplanar.
Also, the translation is only recovered to within a constant scaling factor, so the fact that you're seeing a normalized translation vector of unit magnitude is exactly correct. The reason is that the depth is unknown and estimated to be 1. You'll have to find some way to recover the depth as in the code for the eight-point algorithm + 3d reconstruction (Algorithm 5.1 in the bookcode link.)
The book the sample code above is taken from is also a very good reference. http://vision.ucla.edu/MASKS/ Chapter 5, the one you're interested in, is available on the Sample Chapters link.
Congrats on your hard work, sounds like you've tried hard to learn these techniques.
For actual production-strength code, I'd advise downloading libmv and Ceres, and re-coding your solution using them.
Your two questions are really one: invalid solutions are rejected based on the data you have collected. In particular, Nister's (as well as Stewenius's) algorithm is normally used in the inner loop of a RANSAC-like solver, which selects the solution with the best fit / maximum number of inliers.
