When calculating Manhattan Distance, should you calculate distance to end point or start point? - a-star

I'm trying to learn the A* algorithm (when applied to a grid pattern) and think I have grasped that before you can find the shortest path, you need to calculate the distance away from the start for any given square.
I'm following the guide here: https://medium.com/@nicholas.w.swift/easy-a-star-pathfinding-7e6689c7f7b2 which has the following image showing the Manhattan distances for each square on the grid:
with the start square being the green square and the end being the blue.
However, surely it makes more sense to calculate the distance in reverse? A* chooses the connected square with the shortest distance to the goal, right? So surely this (based on the image) would make sense if we started at the end and asked what's the lowest value connected to the start: in this case 17, so go there, then 15, so go there, etc.
Sub question: The distances in the image away from the start appear to be based on moving through Von Neumann neighbours, so surely on the way back you cannot go diagonally?

It is quite simple actually:
F = G + H
F is the total cost of the node.
G is the distance between the current node and the start node.
H is the heuristic — estimated distance from the current node to the end node.
The numbers in the grid represent G (and not the heuristic). G is the actual distance from the start point.
H should be calculated to the endpoint.
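As a minimal sketch (not the article's code), here is how G, H and F fit together in A* on a grid, with the Manhattan distance as H and Von Neumann (4-way) neighbours:

```python
import heapq

def manhattan(a, b):
    # H: estimated distance from node a to the goal b
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def a_star(grid, start, goal):
    # grid: 2D list, 0 = walkable, 1 = wall; nodes are (row, col) tuples
    rows, cols = len(grid), len(grid[0])
    open_heap = [(manhattan(start, goal), 0, start)]  # entries are (F, G, node)
    best_g = {start: 0}
    came_from = {}
    while open_heap:
        f, g, node = heapq.heappop(open_heap)
        if node == goal:
            path = [node]
            while node in came_from:
                node = came_from[node]
                path.append(node)
            return path[::-1]
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):  # Von Neumann neighbours
            nxt = (node[0] + dx, node[1] + dy)
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and grid[nxt[0]][nxt[1]] == 0:
                ng = g + 1  # G: actual cost from the start node
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    came_from[nxt] = node
                    # F = G + H decides which node is expanded next
                    heapq.heappush(open_heap, (ng + manhattan(nxt, goal), ng, nxt))
    return None  # no path exists
```

Note that H is always measured toward the goal, while G is accumulated from the start; the numbers in the guide's image are the G values.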


How to check whether Essential Matrix is correct or not without decomposing it?

On a very high level, my pose estimation pipeline looks somewhat like this:
Find features in image_1 and image_2 (let's say cv::ORB)
Match the features (let's say using the BruteForce-Hamming descriptor matcher)
Calculate Essential Matrix (using cv::findEssentialMat)
Decompose it to get the proper rotation matrix and translation unit vector (using cv::recoverPose)
Repeat
I noticed that at some point, the yaw angle (calculated using the output rotation matrix R of cv::recoverPose) suddenly jumps by more than 150 degrees. For that particular frame, the number of inliers is 0 (the return value of cv::recoverPose). So, to understand what exactly that means and what's going on, I asked this question on SO.
As per the answer to my question:
So, if the number of inliers is 0, then something went very wrong. Either your E is wrong, or the point matches are wrong, or both. In this case you simply cannot estimate the camera motion from those two images.
For that particular image pair, based on the visualization and from my understanding, matches look good:
The next step in the pipeline is finding the Essential Matrix. So, now, how can I check whether the calculated Essential Matrix is correct or not without decomposing it i.e. without calculating Roll Pitch Yaw angles (which can be done by finding the rotation matrix via cv::recoverPose)?
Basically, I want to double-check whether my Essential Matrix is correct or not before I move on to the next component (which is cv::recoverPose) in the pipeline!
The essential matrix maps each point p in image 1 to its epipolar line in image 2. The point q in image 2 that corresponds to p must be very close to the line. You can plot the epipolar lines and the matching points to see if they make sense. Remember that E is defined in normalized image coordinates, not in pixels. To get the epipolar lines in pixel coordinates, you would need to convert E to F (the fundamental matrix).
The epipolar lines must all intersect in one point, called the epipole. The epipole is the projection of the other camera's center in your image. You can find the epipole by computing the nullspace of F.
If you know something about the camera motion, then you can determine where the epipole should be. For example, if the camera is mounted on a vehicle that is moving directly forward, and you know the pitch and roll angles of the camera relative to the ground, then you can calculate exactly where the epipole will be. If the vehicle can turn in-plane, then you can find the horizontal line on which the epipole must lie.
You can get the epipole from F, and see if it is where it is supposed to be.
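As a quick numeric sanity check before moving on to cv::recoverPose, you can evaluate the epipolar constraint q^T E p ≈ 0 directly for each match. A minimal sketch in plain Python (synthetic data, not the OpenCV API; here E is built as [t]_x R for a purely translational motion with R = I):

```python
def cross_matrix(t):
    # Skew-symmetric matrix [t]_x such that [t]_x v = t x v
    return [[0, -t[2], t[1]],
            [t[2], 0, -t[0]],
            [-t[1], t[0], 0]]

def epipolar_residual(E, p1, p2):
    # p1, p2: a matched point pair in *normalized* image coordinates (x, y, 1).
    # Returns |p2^T E p1|, which should be close to 0 for a correct match.
    Ep = [sum(E[i][k] * p1[k] for k in range(3)) for i in range(3)]
    return abs(sum(p2[i] * Ep[i] for i in range(3)))

# Camera translated by t = (1, 0, 0), no rotation, so E = [t]_x.
E = cross_matrix((1.0, 0.0, 0.0))
# A 3D point X = (0.5, 0.2, 4) projects to these normalized coordinates:
p1 = (0.125, 0.05, 1.0)   # in camera 1
p2 = (-0.125, 0.05, 1.0)  # in camera 2 (shifted by the translation)
print(epipolar_residual(E, p1, p2))  # near zero for a consistent match
```

If most of your inlier matches produce large residuals, E is wrong (or the matches are) before recoverPose ever runs. Remember to undo the intrinsics first, since this check only holds in normalized coordinates.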

What input x maximizes the activation function in an autoencoder hidden layer?

Hi, while reading Stanford's machine learning material about autoencoders, I found a formula that is hard to prove by myself. Link to Material
Question is:
" What input image x would cause ai to be maximally activated? "
Screen shot of the Question and Context:
Many thanks to your answers in advance!
While this can be rigorously solved using the KKT conditions and Lagrange multipliers, there is a more intuitive way to figure the result out. I assume that f(.) is a monotone increasing, sigmoid type of nonlinearity (ReLU is also valid). So, finding the maximum of w1x1+...+w100x100 + b under the constraint (x1)^2+...+(x100)^2 <= 1 is equivalent to finding the maximum of f(w1x1+...+w100x100 + b) under the same constraint.
Note that g = w1x1+...+w100x100 + b is a linear function of the x terms (name it g, so we can refer to it later). The direction of largest increase at any point (x1,...,x100) in the domain of that function is the same: the gradient. The gradient is simply (w1,w2,...,w100) at any point in the domain, which means that if we go in the direction of (w1,w2,...,w100), independently of where we start, we obtain the largest increase in the function. To make things simpler and to allow us to visualize, assume that we are in R^2 and the function is w1x1 + w2x2 + b:
The optimal x1 and x2 are constrained to lie in or on the circle C: (x1)^2 + (x2)^2 = 1. Assume that we are at the origin (0,0). If we go in the direction of the gradient (blue arrow) (w1,w2), we attain the largest value of the function where the blue arrow intersects the circle. That intersection has the coordinates c*(w1,w2), where c is a scalar coefficient satisfying c^2(w1^2 + w2^2) = 1. c is easily solved as c = 1 / sqrt(w1^2 + w2^2). Then at the intersection we have x1 = w1/sqrt(w1^2 + w2^2) and x2 = w2/sqrt(w1^2 + w2^2), which is the solution we seek. This extends in the same way to the 100-dimensional case.
You may ask why we started at the origin and not at some other point in the circle. Note that the red line is perpendicular to the gradient vector and the function is constant along that line. Draw that (u1,u2) line, preserving its orientation, anywhere such that it intersects the circle C, and choose any point on the line that lies within the circle. Wherever you are on the (u1,u2) line, you start at the same value of the function g. Then, as you go in the (w1,w2) direction, the longest path within the circle always goes through the origin, which means it is the path along which g increases the most.
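A small numeric sketch of this result (names are illustrative): the maximizing input is simply w normalized to unit length, and no other unit vector yields a larger activation:

```python
import math

def maximizing_input(w):
    # The unit-norm x that maximizes w . x, and hence f(w . x + b)
    # for any monotone increasing f: x_i = w_i / ||w||
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [wi / norm for wi in w]

def activation(w, x, b=0.0):
    # The pre-nonlinearity value g = w1*x1 + ... + wn*xn + b
    return sum(wi * xi for wi, xi in zip(w, x)) + b

w = [3.0, 4.0]
x = maximizing_input(w)
print(x)                  # [0.6, 0.8], i.e. w / ||w||
print(activation(w, x))   # 5.0, which equals ||w||
```

The maximal activation value is ||w|| + b, which is why the hidden unit "looks for" an input shaped like its own weight vector.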

Distance metric heuristic informedness

Given the Manhattan distance heuristic, and a heuristic which takes the greater value of (sqrt(x1-x2), sqrt(y1-y2)):
How would you consider their informedness, and are they admissible when searching for the shortest path in a grid from a to b, where only horizontal and vertical movements are allowed?
While testing them, the second heuristic always takes the diagonal ways, and sometimes its number of discovered nodes is significantly smaller than Manhattan's. But this is not always the case, and this is what confuses me.
Given current point a = (x1, y1) and goal b = (x2, y2). I'll let dist1(a, b) denote the Manhattan distance, and dist2(a, b) denote that other heuristic which you propose. We have:
dist1(a, b) = |x1 - x2| + |y1 - y2|
dist2(a, b) = max(sqrt(|x1 - x2|), sqrt(|y1 - y2|))
Note that I changed your new proposed heuristic a bit to take the square root of absolute differences, instead of just differences, since taking the square root of a negative number would lead to problems. Anyway, it should be easy to see that, for any a and b, dist1(a, b) >= dist2(a, b).
Since both heuristics are admissible in the case of a grid with only vertical and horizontal movement allowed, this should typically mean that the greatest heuristic of the two (the Manhattan distance) is more effective, since it'll be closer to the truth.
In the OP you actually mentioned that you're measuring the "number of nodes discovered", and that this is sometimes smaller (better) for the second heuristic. With this, I'm going to assume that you mean that you're running the A* algorithm, and that you're measuring the number of nodes that are popped off of the frontier/open list/priority queue/whatever term you want to use.
My guess is that what's happening is that you have bad tie-breaking in cases where multiple nodes have an equal score in the frontier (often referred to as f). The second heuristic you proposed would indeed tend to prefer nodes along the diagonal between current node and goal, whereas the Manhattan distance has no such tendency. A good tie-breaker when multiple nodes in the frontier have an equal (f) score, would be to prioritize nodes with a high real cost (often referred to as g) so far, and a low heuristic cost (often referred to as h). This can either be implemented in practice by explicitly comparing g or h scores when f scores are equal, or by simply multiplying all of your heuristic scores by a number slightly greater than 1 (for instance, 1.0000001). For more on this, see http://theory.stanford.edu/~amitp/GameProgramming/Heuristics.html#breaking-ties
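For integer grid coordinates, the dominance dist1(a, b) >= dist2(a, b) is easy to verify numerically. A small sketch of both heuristics, plus the tie-breaking trick of scaling h by a factor slightly above 1 (plain Python, illustrative only):

```python
import math

def dist1(a, b):
    # Manhattan distance: |x1 - x2| + |y1 - y2|
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def dist2(a, b):
    # The proposed heuristic, with absolute differences under the roots:
    # max(sqrt(|x1 - x2|), sqrt(|y1 - y2|))
    return max(math.sqrt(abs(a[0] - b[0])), math.sqrt(abs(a[1] - b[1])))

def with_tie_break(h, factor=1.0000001):
    # Scaling h by a factor slightly above 1 breaks ties between equal
    # f-scores in favour of nodes closer to the goal (smaller h, larger g).
    return lambda a, b: h(a, b) * factor

a, b = (0, 0), (3, 4)
print(dist1(a, b))  # 7
print(dist2(a, b))  # 2.0 -- never larger than dist1 on an integer grid
```

Since both heuristics are admissible here, the larger one (Manhattan) is the more informed; any advantage you see from dist2 comes from its accidental tie-breaking behaviour, not from informedness.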

Pathfinding On a huge Map

I am in need of some type of pathfinding, so I searched the Internet and found some algorithms.
It seems like they all need some type of map also.
This map can be represented by:
Grid
Nodes
As my map is currently quite huge (20.000 x 20.000 px), a grid map of 1 x 1 px tiles would lead to 400.000.000 unique points on the grid, and also the best quality, I would think. But that's way too many points for me, so I could either
increase the tile size (e.g. 50 x 50 px = 160.000 unique points)
switch to Nodes
As the 160.000 unique points are also too many for me, or rather not the quality I would like to have (some units are bigger than 50 px), I think nodes are the better way to go.
I found this on the Internet 2D Nodal Pathfinding without a Grid and did some calculations:
local radius = 75 -- this varies for some units, so I stick to the biggest value
local DistanceBetweenNodes = radius * 2 -- to pass tiles diagonally
local grids = 166 -- how many columns/rows
local MapSize = grids * DistanceBetweenNodes -- around 25.000
local walkable = 0 -- used later
local Map = {}

local function even(a)
    return ((a / radius) % 2 == 0)
end

for x = 0, MapSize, radius do
    Map[x] = {}
    for y = 0, MapSize, radius do
        if (even(x) and even(y)) or (not even(x) and not even(y)) then
            Map[x][y] = walkable
        end
    end
end
Without removing the unpassable nodes, and with a unit size of 75, I would end up with ~55445 unique nodes. The node count will shrink drastically if I remove the unpassable nodes, but as my units have different sizes, I need to set the radius to the smallest unit I have. I don't know whether this will work with bigger units later.
So I searched the Internet again and found this: Nav Meshes.
This would reduce the nodes to only "a few" in my eyes, and would work with any unit size.
UPDATE 28.09
I have created a nodal map of all passable areas now, ~30.000 nodes.
Here is a totally random example of a map and the points I have:
Example Map
This calls for some optimization: reduce the number of nodes you have.
Almost any pathfinding algorithm can take a node list that is not a grid. You will need to adjust for distance between nodes, though.
You could also increase your grid size so that it does not have as many squares. You will need to compensate for small, narrow paths, in some sort of way, though.
At the end of the day, I would suggest you reduce your node count by simply placing nodes along an arranged path where you know it is possible to get from point A to B, specifying the neighbors. You will need to manually make a node path for every level, though. Take my test as an example (there are no walls, just the node path):
For your provided map, you would end up with a path node similar to this:
Which has around 50 nodes, compared to the hundreds a grid can have.
This can work on any scale, since your node count is dramatically cut compared to the grid approach. You will need to make some adjustments, like calculating the distance between nodes, now that they are not in a grid. For this test I am using Dijkstra's algorithm, in Corona SDK (Lua), but you can try any other, like A-star (A*), which is used in many games and can be faster.
I found a Unity example that takes a similar approach using nodes, and you can see that the approach works in 3D as well:
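A minimal sketch of this idea (illustrative names, not the Corona SDK code): Dijkstra's algorithm run directly over a hand-placed node list, with Euclidean distances between neighbours replacing the implicit grid cost:

```python
import heapq
import math

def euclid(a, b):
    # Edge weight: straight-line distance between two node positions
    return math.hypot(a[0] - b[0], a[1] - b[1])

def dijkstra(positions, neighbours, start, goal):
    # positions: node -> (x, y) pixel coordinates
    # neighbours: node -> list of adjacent nodes (the hand-made node path)
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            path = [node]
            while node in prev:
                node = prev[node]
                path.append(node)
            return path[::-1], d
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt in neighbours[node]:
            nd = d + euclid(positions[node], positions[nxt])
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    return None, float("inf")
```

Swapping in A* would only require adding a heuristic term (e.g. euclid to the goal) to the heap key; the node-list structure stays the same.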

Calculating the neighborhood distance

What method would you use to compute a distance that represents the number of "jumps" one has to do to go from one area to another area in a given 2D map?
Let's take the following map for instance:
The end result of the computation would be a triangle like this:
A B C D E F
A
B 1
C 2 1
D 2 1 1
E . . . .
F 3 2 2 1 .
Which means that going from A to D, it takes 2 jumps.
However, to go from anywhere to E, it's impossible because the "gap" is too big, and so the value is "infinite", represented here as a dot for simplification.
As you can see on the example, the polygons may share points, but most often they are simply close together and so a maximum gap should be allowed to consider two polygons to be adjacent.
This, obviously, is a simplified example, but in the real case I'm faced with about 60000 polygons and am only interested by jump values up to 4.
As input data, I have the polygon vertices as an array of coordinates, from which I already know how to calculate the centroid.
My initial approach would be to "paint" the polygons on a white background canvas, each with their own color and then walk the line between two candidate polygons centroid. Counting the colors I encounter could give me the number of jumps.
However, this is not really reliable as it does not take into account concave arrangements where one has to walk around the "notch" to go from one polygon to the other as can be seen when going from A to F.
I have tried looking for reference material on this subject but could not find any because I have a hard time figuring what the proper terms are for describing this kind of problem.
My target language is Delphi XE2, but any example would be most welcome.
You can create an inflated polygon with a small offset for every initial polygon, then check for intersection with neighbouring (inflated) polygons. Offsetting is useful to compensate for small gaps between polygons.
Both inflating and intersection problems might be solved with Clipper library.
The solution to the potential-neighbours problem depends on the real conditions. For example, a simple method: divide the plane into square cells, and check for neighbours that have vertices in the same cell and in the nearest cells.
Every pair of intersecting polygons gives an edge in an (unweighted, undirected) graph. You want to find all paths of length <= 4: just execute a depth-limited BFS from every vertex (polygon), assuming the graph is sparse.
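The depth-limited BFS step might look like this (a sketch assuming the adjacency sets have already been built from the inflated-polygon intersection test):

```python
from collections import deque

def jump_distances(adjacency, max_jumps=4):
    # adjacency: polygon -> set of neighbouring polygons (edges of the graph).
    # Returns {(a, b): jumps} for every ordered pair reachable within
    # max_jumps; unreachable pairs (like E in the example) are simply absent.
    result = {}
    for start in adjacency:
        depths = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if depths[node] == max_jumps:
                continue  # depth limit: do not expand further
            for nxt in adjacency[node]:
                if nxt not in depths:
                    depths[nxt] = depths[node] + 1
                    queue.append(nxt)
        for node, d in depths.items():
            if node != start:
                result[(start, node)] = d
    return result
```

With ~60000 polygons and a sparse graph, each BFS only touches the polygons within 4 jumps, so this stays tractable and fills exactly the triangle shown above.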
You can try a single link clustering or some voronoi diagrams. You can also brute-force or try Density-based spatial clustering of applications with noise (DBSCAN) or K-means clustering.
I would try this:
1) Do a Delaunay triangulation of all the points of all polygons
2) Remove from the Delaunay graph all triangles that have their 3 points in the same polygon
Two polygons are neighbours by point if at least one triangle has a point in both polygons (or, obviously, if the polygons have a common point).
Two polygons are neighbours by side if each polygon has at least two adjacent points in the same quad, i.e. two adjacent triangles (or, obviously, two common and adjacent points).
Once the gaps are filled with new polygons (triangles, possibly combined), use Dijkstra's algorithm weighted with the distance between nearest points (or polygon centroids) to compute the paths.