I am trying to backpropogate a very primitive / simple ANN.
I've almost got it working. I'm trying to implement the formulas and the article I'm reading does not specify whether to use dot product or element wise multiplication or some other multiplication.
Article: https://ml-cheatsheet.readthedocs.io/en/latest/backpropagation.html
Here's the formula for calculating the error (or delta) of a single Hidden layer:
Or, as I read it in the context of my algorithm,
Delta = prev_delta * prev_weight * zprime
Where delta is the error of this layer, prev_delta is the delta of the previous layer, prev_weight is the weight of the previous layer, and zprime is the derivative of the activation function of the current layer.
Also, for a single Output Layer:
Or, as I read it in the context of my algorithm,
Delta = (output - target) % zprime;
Where output is the final output of the feed-forward and target is the target values.
I've written this code to run this calculation:
void Layer::backward(Matrix & prev_delta, Matrix & prev_weight) {
// all variables are matrices
// except for prev_layer, that's a pointer to a layer object.
// I'm using Armadillo for linear algebra / matrices
// delta, weight, output, and zprime refer to the current layer.
// prev_delta, prev_weight belong the the previous layer.
if (next_layer == nullptr) {
// if next layer is null, this is the output layer.
// in that case, prev_delta is target.
// yHat - y * R'(Zo)
delta = (output - prev_delta) * zprime;
}
else {
// Eo * Wo * R'(Zh)
delta = prev_delta * prev_weight * zprime;
}
// tell the next layer to backpropogate
if (prev_layer != nullptr)
prev_layer -> backward(delta, weight);
}
matrix * matrix indicates a matrix multiplication (dot product)
matrix % matrix indicates element-wise multiplication
The issue I'm having is that these matrices don't seem to multiply properly. I've made sure everything lines up the same way the article has it, but these pieces just don't seem to fit. How should these matrices be multiplied to get the result?
Edit: to clarify, I get errors when I try to take the dot product of these matrices. "invalid size". I've tried using element wise multiplication but then things get weird there too.
Related
I need to write an own implementation of computing the fundamental matrix between two images based on the corresponding image coordinates without using OpenCV.
Is it possible to describe this algorithm in its simplest form in accordance with the following function? a simple and straightforward formula.
FMatrixEightPoint()
Input Arguments:
points1(x,y)−pixel coordinates in the first image ,
corresponding to points2 in the second image
points2(x,y)−pixel coordinates in the second image ,
corresponding to points1 in the first image
Output :
F − the fundamental matrix between the first image and the second image
Yes, it is possible to describe the algorithm in the mentioned form.
If you would use OpenCV, you could just use findFundamentalMat. This also provides the 8-point method for computing the fundamental matrix.
The example (in C++) taken from the OpenCV documentation, but adapted (using the RANSAC algorithm for computing the fundamental matrix):
// Example. Estimation of fundamental matrix using the 8-point algorithm
int point_count = 8; // must be >= 8
vector<Point2f> points1(point_count);
vector<Point2f> points2(point_count);
// initialize the points here ... */
for( int i = 0; i < point_count; i++ )
{
points1[i] = ...;
points2[i] = ...;
}
Mat fundamental_matrix =
findFundamentalMat(points1, points2, CV_FM_8POINT);
If you want to write your own function, it would look like this (no valid code)
Matrix findFundamentalMat(Array points1, Array points2)
{
Matrix fundamentalMatrix;
// compute fundamental matrix based on input points1 and points2 or call OpenCV's findFundamentalMat
return fundamentalMatrix;
}
I am currently using a Project Tango tablet for robotic obstacle avoidance. I want to create a matrix of z-values as they would appear on the Tango screen, so that I can use OpenCV to process the matrix. When I say z-values, I mean the distance each point is from the Tango. However, I don't know how to extract the z-values from the TangoXyzIjData and organize the values into a matrix. This is the code I have so far:
public void action(TangoPoseData poseData, TangoXyzIjData depthData) {
byte[] buffer = new byte[depthData.xyzCount * 3 * 4];
FileInputStream fileStream = new FileInputStream(
depthData.xyzParcelFileDescriptor.getFileDescriptor());
try {
fileStream.read(buffer, depthData.xyzParcelFileDescriptorOffset, buffer.length);
fileStream.close();
} catch (IOException e) {
e.printStackTrace();
}
Mat m = new Mat(depthData.ijRows, depthData.ijCols, CvType.CV_8UC1);
m.put(0, 0, buffer);
}
Does anyone know how to do this? I would really appreciate help.
The short answer is it can't be done, at least not simply. The XYZij struct in the Tango API does not work completely yet. There is no "ij" data. Your retrieval of buffer will work as you have it coded. The contents are a set of X, Y, Z values for measured depth points, roughly 10000+ each callback. Each X, Y, and Z value is of type float, so not CV_8UC1. The problem is that the points are not ordered in any way, so they do not correspond to an "image" or xy raster. They are a random list of depth points. There are ways to get them into some xy order, but it is not straightforward. I have done both of these:
render them to an image, with the depth encoded as color, and pull out the image as pixels
use the model/view/perspective from OpenGL and multiply out the locations of each point and then figure out their screen space location (like OpenGL would during rendering). Sort the points by their xy screen space. Instead of the calculated screen-space depth just keep the Z value from the original buffer.
or
wait until (if) the XYZij struct is fixed so that it returns ij values.
I too wish to use Tango for object avoidance for robotics. I've had some success by simplifying the use case to be only interested in the distance of any object located at the center view of the Tango device.
In Java:
private Double centerCoordinateMax = 0.020;
private TangoXyzIjData xyzIjData;
final FloatBuffer xyz = xyzIjData.xyz;
double cumulativeZ = 0.0;
int numberOfPoints = 0;
for (int i = 0; i < xyzIjData.xyzCount; i += 3) {
float x = xyz.get(i);
float y = xyz.get(i + 1);
if (Math.abs(x) < centerCoordinateMax &&
Math.abs(y) < centerCoordinateMax) {
float z = xyz.get(i + 2);
cumulativeZ += z;
numberOfPoints++;
}
}
Double distanceInMeters;
if (numberOfPoints > 0) {
distanceInMeters = cumulativeZ / numberOfPoints;
} else {
distanceInMeters = null;
}
Said simply this code is taking the average distance of a small square located at the origin of x and y axes.
centerCoordinateMax = 0.020 was determined to work based on observation and testing. The square typically contains 50 points in ideal conditions and fewer when held close to the floor.
I've tested this using version 2 of my tango-caminada application and the depth measuring seems quite accurate. Standing 1/2 meter from a doorway I slid towards the open door and the distance changed form 0.5 meters to 2.5 meters which is the wall at the end of the hallway.
Simulating a robot being navigated I moved the device towards a trash can in the path until 0.5 meters separation and then rotated left until the distance was more than 0.5 meters and proceeded forward. An oversimplified simulation, but the basis for object avoidance using Tango depth perception.
You can do this by using camera intrinsics to convert XY coordinates to normalized values -- see this post - Google Tango: Aligning Depth and Color Frames - it's talking about texture coordinates but it's exactly the same problem
Once normalized, move to screen space x[1280,720] and then the Z coordinate can be used to generate a pixel value for openCV to chew on. You'll need to decide how to color pixels that don't correspond to depth points on your own, and advisedly, before you use the depth information to further colorize pixels.
The main thing is to remember that the raw coordinates returned are already using the basis vectors you want, i.e. you do not want the pose attitude or location
I'm using Gaussian equation for a particular photo effect in an iOS application.
I use:
double sigmaX = ...; //some value here
for(int i=0;i<height;i++)
{
double F = 0;
double step = -(pos)*width/20;
/*height,width,pos - all predefined, no problem there*/
for(int j=0;j<4*width;j+=4)
{
F = (double) ((1/1)*exp(-sigmaX*(pow((step++)/1, 2.0)))) ;
//do some operation here...
}
}
and the value of F is used to determine a particular intensity which is used up elsewhere.
So far so good.... F is the typical bell curve as expected.
But, the question is, I want to scale the standard deviation of this curve as per user input.
For example, in the following image, I'd like to shift the curve from the green to the red line (blue maybe an intermediate), hopefully in linear steps:
Now, given the standard notation of:
and comparing it with the way I implemented it in my code, I got the idea to vary 1/sqrt(sigmaX) to alter the scale/SD. I tried incrementing 1/sqrt(sigmaX) in linear steps (to get linear increment) or by x^n to get power of n increment in SD, but none of that worked.
I am a bit stuck with the concept.
Can you please let me know how to scale the Standard Deviation by a predefined ratio, i.e I may want it 1.34 or 3.78 times the oirginal SD and it will scale up the the +3sigma to -3sigma span accordingly.
Your calculation here:
F = (double) ((1/1)*exp(-sigmaX*(pow((step++)/1, 2.0)))) ;
Is not reflecting the Gaussian formula you showed. It should be something like this:
double dSigma = 1.0;
static const double dRootTwoPi = sqrt(2.0 * M_PI);
F = (1.0 / (dSigma * dRootTwoPi)) * exp(-0.5 * pow(step++ / dSigma, 2.0));
Then you can vary dSigma from 1.0 to 3.0 (or whatever) to get the effect you want.
Thanks Roger Rowland, for the help... I finally got this to work:
Changed the gaussian function to:
sigmaX*=scaling;
F = (double) ((scaling / (sigmaX))*exp(-0.0005*(powf((step++/sigmaX), 2.0)))) ;
Indeed, what I had done before wasn't exactly Gaussian. This works fine and scales fine, based on the scaling parameter.
Thanks again.
Perhaps this is more of a math question than a programming question, but I've been trying to implement the rotating calipers algorithm in XNA.
I've deduced a convex hull from my point set using a monotone chain as detailed on wikipedia.
Now I'm trying to model my algorithm to find the OBB after the one found here:
http://www.cs.purdue.edu/research/technical_reports/1983/TR%2083-463.pdf
However, I don't understand what the DOTPR and CROSSPR methods it mentions on the final page are supposed to return.
I understand how to get the Dot Product of two points and the Cross Product of two points, but it seems these functions are supposed to return the Dot and Cross Products of two edges / line segments. My knowledge of mathematics is admittedly limited but this is my best guess as to what the algorithm is looking for
public static float PolygonCross(List<Vector2> polygon, int indexA, int indexB)
{
var segmentA1 = NextVertice(indexA, polygon) - polygon[indexA];
var segmentB1 = NextVertice(indexB, polygon) - polygon[indexB];
float crossProduct1 = CrossProduct(segmentA1, segmentB1);
return crossProduct1;
}
public static float CrossProduct(Vector2 v1, Vector2 v2)
{
return (v1.X * v2.Y - v1.Y * v2.X);
}
public static float PolygonDot(List<Vector2> polygon, int indexA, int indexB)
{
var segmentA1 = NextVertice(indexA, polygon) - polygon[indexA];
var segmentB1 = NextVertice(indexB, polygon) - polygon[indexB];
float dotProduct = Vector2.Dot(segmentA1, segmentB1);
return dotProduct;
}
However, when I use those methods as directed in this portion of my code...
while (PolygonDot(polygon, i, j) > 0)
{
j = NextIndex(j, polygon);
}
if (i == 0)
{
k = j;
}
while (PolygonCross(polygon, i, k) > 0)
{
k = NextIndex(k, polygon);
}
if (i == 0)
{
m = k;
}
while (PolygonDot(polygon, i, m) < 0)
{
m = NextIndex(m, polygon);
}
..it returns the same index for j, k when I give it a test set of points:
List<Vector2> polygon = new List<Vector2>()
{
new Vector2(0, 138),
new Vector2(1, 138),
new Vector2(150, 110),
new Vector2(199, 68),
new Vector2(204, 63),
new Vector2(131, 0),
new Vector2(129, 0),
new Vector2(115, 14),
new Vector2(0, 138),
};
Note, that I call polygon.Reverse to place these points in Counter-clockwise order as indicated in the technical document from perdue.edu. My algorithm for finding a convex-hull of a point set generates a list of points in counter-clockwise order, but does so assuming y < 0 is higher than y > 0 because when drawing to the screen 0,0 is the top left corner. Reversing the list seems sufficient. I also remove the duplicate point at the end.
After this process, the data becomes:
Vector2(115, 14)
Vector2(129, 0)
Vector2(131, 0)
Vector2(204, 63)
Vector2(199, 68)
Vector2(150, 110)
Vector2(1, 138)
Vector2(0, 138)
This test fails on the first loop when i equals 0 and j equals 3. It finds that the cross-product of the line (115,14) to (204,63) and the line (204,63) to (199,68) is 0. It then find that the dot product of the same lines is also 0, so j and k share the same index.
In contrast, when given this test set:
http://www.wolframalpha.com/input/?i=polygon+%282%2C1%29%2C%281%2C2%29%2C%281%2C3%29%2C%282%2C4%29%2C%284%2C4%29%2C%285%2C3%29%2C%283%2C1%29
My code successfully returns this OBB:
http://www.wolframalpha.com/input/?i=polygon+%282.5%2C0.5%29%2C%280.5%2C2.5%29%2C%283%2C5%29%2C%285%2C3%29
I've read over the C++ algorithm found on http://www.geometrictools.com/LibMathematics/Containment/Wm5ContMinBox2.cpp but I'm too dense to follow it completely. It also appears to be very different than the other one detailed in the paper above.
Does anyone know what step I'm skipping or see some error in my code for finding the dot product and cross product of two line segments? Has anyone successfully implemented this code before in C# and have an example?
Points and vectors as data structures are essentially the same thing; both consist of two floats (or three if you're working in three dimensions). So, when asked to take the dot product of the edges, I suppose it means taking the dot product of the vectors that the edges define. The code you provided does exactly this.
Your implementation of CrossProduct seems correct (see Wolfram MathWorld). However, in PolygonCross and PolygonDot I think you shouldn't normalize the segments. It will affect the magnitude of the return values of PolygonDot and PolygonCross. By removing the superfluous calls to Vector2.Normalize you can speed up your code and reduce the amount of noise in your floating point values. However, normalization is not relevant to the correctness of the code that you have pasted as it only compares the results with zero.
Note that the paper you refer to assumes that the polygon vertices are listed in counterclockwise order (page 5, first paragraph after "Beginning of comments") but your example polygon is defined in clockwise order. That's why PolygonCross(polygon, 0, 1) is negative and you get the same value for j and k.
I assume DOTPR is a normal vector dot product, crosspr is a crossproduct. dotproduct will return a normal number , crossproduct will return a vector which is perpendicular to the two vectors given. (basic vector math,check wikipedia)
they are actually defined in the paper as DOTPR(i,j) returns dotproduct of vectors from vertex i to i+1 and j to j+1. same for CROSSPR but with cross product.
Can anyone please help.
I have a cube which I have made in 3DS Max. I don't know the dimensions of the cube. Is there a way to get the vertices of each of the triangles of the faces of the cube? I am trying to get the normal to one of the faces of the cube to determine which way its pointing. So if I can determine the vertices I can get the normal for the face if I have 3 vertices, V1, V2 and V3, ordered in counterclockwise order, I can obtain the direction of the normal by computing (V2 - V1) x (V3 - V1), where x is the cross product of the two vectors.
I have looked in my models .fbx file and I can see a number of values there:
Vertices: *24 {
a: -15,-12.5,0,15,-12.5,0,-15,12.5,0,15,12.5,0,-15,-12.5,0.5,15,-12.5,0.5,-15,12.5,0.5,15,12.5,0.5}
PolygonVertexIndex: *36 {
a: 0,2,-4,3,1,-1,4,5,-8,7,6,-5,0,1,-6,5,4,-1,1,3,-8,7,5,-2,3,2,-7,6,7,-4,2,0,-5,4,6,-3}
Are these my models vertices?
Also, I would assume that Vertices: * 24 would be my list of vertices, but why is there only 24? Should a cube not have 36 vertices? And finally, if the coordinates for my vertices are PolygonVertexIndex: * 36 these values just seem off to me when I imagine the cube in my head with those dimensions?
Or alternatively, is there a automatic way to get the vertices of a cube without having to manually enter all the values for each vertex? I might have a couple of models to
Any help would be greatly appreciated
I can't figure why you need that... because when you load a model it is calculated , internally each vertex will have the normal,...
Anyway it is easy to calc...
The three first indexes define the first triangle of a face, the next three, the other triangle of a face.
You need only one triangle to calculate the normal...
So with the three indexes access to the veretex array and get three points... A, B and C
Now your normal is the result of the cross product between two vectors formed with that vertex.
Vector3 Normal = Vector3.Cross(B-A, C-B);
If the normal go back or forward will depend on the A,B,C order, can be CounterClockWise or ClockWise, but every triangle of the model will be ordered in one way. So you will have try it and fix it
You can write an XNA program which reads your normals without much hassle.
If you still want to calculate them, however, use this C# code, taken from FFWD, as a guide. Check the URL for a more detailed discussion on pros and cons. Personally, I'm not too happy with the result, but for the time being it works. Of course, since this code is FFWD related (implementation of Unity's API for XNA), it does not match XNA exactly, but the mathematics remain the same.
/// <summary>
/// Recalculates the normals.
/// Implementation adapted from http://devmaster.net/forums/topic/1065-calculating-normals-of-a-mesh/
/// </summary>
public void RecalculateNormals()
{
Vector3[] newNormals = new Vector3[_vertices.Length];
// _triangles is a list of vertex indices,
// with each triplet referencing the three vertices of the corresponding triangle
for (int i = 0; i < _triangles.Length; i = i + 3)
{
Vector3[] v = new Vector3[]
{
_vertices[_triangles[i]],
_vertices[_triangles[i + 1]],
_vertices[_triangles[i + 2]]
};
Vector3 normal = Vector3.Cross(v[1] - v[0], v[2] - v[0]);
for (int j = 0; j < 3; ++j)
{
Vector3 a = v[(j+1) % 3] - v[j];
Vector3 b = v[(j+2) % 3] - v[j];
float weight = (float)Math.Acos(Vector3.Dot(a, b) / (a.magnitude * b.magnitude));
newNormals[_triangles[i + j]] += weight * normal;
}
}
foreach (Vector3 normal in newNormals)
{
normal.Normalize();
}
normals = newNormals;
}