Understanding OpenCV Hierarchies

New to OpenCV here. I'm trying to understand what a hierarchy vector is supposed to contain. I understand that for object tracking, when using the findContours function, it is common to pass a hierarchy output vector, but I would like to understand what it means. Thanks in advance for the help!

A contour is a closed edge around an area of an image. This contour could contain contours so we need a way to store this hierarchy. The hierarchy vector contains all of the information to explain how contours are nested in each other.
From the OpenCV documentation, it is an
Optional output vector, containing information about the image topology. It has as many elements as the number of contours. For each i-th contour contours[i], the elements hierarchy[i][0], hierarchy[i][1], hierarchy[i][2], and hierarchy[i][3] are set to 0-based indices in contours of the next and previous contours at the same hierarchical level, the first child contour and the parent contour, respectively. If for the contour i there are no next, previous, parent, or nested contours, the corresponding elements of hierarchy[i] will be negative.
You can think of this as a doubly linked list, except that each item may also point to a parent and/or a child. The next and previous indices let us find all of the contours that share the same parent. A contour that contains a child contour points to the head of a child linked list. A negative value plays the same role as a NULL pointer in a traditional linked list.
An example:
a
|
b,c,d,e,f
|     |
g     h,i
a points to b as its first child, and by following b's next links we know that b, c, d, e, and f are all contained at the same level inside a. b also has a child contour, g, and e has two children, h and i.
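As a sketch of how this structure is traversed, here is the example tree encoded by hand in the [next, previous, first_child, parent] layout that findContours produces with tree retrieval; the contour indices (a=0 through i=8) are my own labeling for illustration:

```python
# A hand-built hierarchy matching the example tree above; in real code this
# array comes from cv2.findContours with the RETR_TREE retrieval mode.
# Each row is [next, previous, first_child, parent]; -1 plays the role of NULL.

NEXT, PREV, FIRST_CHILD, PARENT = range(4)

hierarchy = [
    [-1, -1,  1, -1],  # 0: a (first child is b)
    [ 2, -1,  6,  0],  # 1: b (next sibling c, child g)
    [ 3,  1, -1,  0],  # 2: c
    [ 4,  2, -1,  0],  # 3: d
    [ 5,  3,  7,  0],  # 4: e (children h, i)
    [-1,  4, -1,  0],  # 5: f
    [-1, -1, -1,  1],  # 6: g
    [ 8, -1, -1,  4],  # 7: h
    [-1,  7, -1,  4],  # 8: i
]

def children(hierarchy, idx):
    """Collect all direct children of contour idx by walking next links."""
    child = hierarchy[idx][FIRST_CHILD]
    out = []
    while child != -1:          # -1 means "no more siblings"
        out.append(child)
        child = hierarchy[child][NEXT]
    return out

print(children(hierarchy, 0))   # b, c, d, e, f -> [1, 2, 3, 4, 5]
print(children(hierarchy, 4))   # h, i -> [7, 8]
```

The same walk, applied recursively, recovers the whole nesting tree.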

Related

OpenCV: What does it mean when the number of inliers returned by recoverPose() function is 0?

I've been working on a pose estimation project and one of the steps is finding the pose using the recoverPose function of OpenCV.
int cv::recoverPose(InputArray E,
InputArray points1,
InputArray points2,
InputArray cameraMatrix,
OutputArray R,
OutputArray t,
InputOutputArray mask = noArray()
)
I have all the required info: essential matrix E, key points in image 1 points1, corresponding key points in image 2 points2, and the cameraMatrix. However, the one thing that still confuses me a lot is the int value (i.e. the number of inliers) returned by the function. As per the documentation:
Recover relative camera rotation and translation from an estimated essential matrix and the corresponding points in two images, using cheirality check. Returns the number of inliers which pass the check.
However, I don't completely understand that yet. I'm concerned with this because, at some point, the yaw angle (calculated using the output rotation matrix R) suddenly jumps by more than 150 degrees. For that particular frame, the number of inliers is 0. So, as per the documentation, no points passed the cheirality check. But still, what does it mean exactly? Can that be the reason for the sudden jump in yaw angle? If yes, what are my options to avoid that? As the process is iterative, that one sudden jump affects all the further poses!
This function decomposes the Essential matrix E into R and t. However, you can get up to 4 solutions, i. e. pairs of R and t. Of these 4, only one is physically realizable, meaning that the other 3 project the 3D points behind one or both cameras.
The cheirality check is what you use to find that one physically realizable solution, and this is why you need to pass matching points into the function. It uses the matching 2D points to triangulate the corresponding 3D points with each of the 4 R and t pairs, and chooses the pair for which it gets the most 3D points in front of both cameras. This accounts for the possibility that some of the point matches are wrong. The number of points that end up in front of both cameras is the number of inliers that the function returns.
So, if the number of inliers is 0, then something went very wrong. Either your E is wrong, or the point matches are wrong, or both. In this case you simply cannot estimate the camera motion from those two images.
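As a rough sketch of what the inlier count measures (not the actual OpenCV implementation, and the helper name is mine): given already-triangulated 3D points in camera 1's frame and a candidate pose (R, t), the inliers are the points with positive depth in both cameras:

```python
import numpy as np

def count_in_front(points3d, R, t):
    """Count 3D points (Nx3, in camera 1's frame) that lie in front of
    both cameras for the candidate pose (R, t) of camera 2."""
    z1 = points3d[:, 2]                        # depth in camera 1
    z2 = (points3d @ R.T + t.ravel())[:, 2]    # depth after moving into camera 2
    return int(np.sum((z1 > 0) & (z2 > 0)))

# Identity pose: one point in front of camera 1, one behind it.
pts = np.array([[0.0, 0.0, 5.0],
                [0.0, 0.0, -5.0]])
print(count_in_front(pts, np.eye(3), np.zeros(3)))  # -> 1
```

A return value of 0 from this kind of check means no triangulated point was consistent with the candidate pose, which is exactly the failure mode described above.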
There are several things you can check.
After you call findEssentialMat you get the inliers from the RANSAC used to find E. Make sure that you are passing only those inlier points into recoverPose. You don't want to pass in all the points that you passed into findEssentialMat.
Before you pass E into recoverPose, check that it is of rank 2. If it is not, you can enforce the rank-2 constraint on E: take the SVD of E, set the smallest singular value to 0, and then reconstitute E.
After you get R and t from recoverPose, you can check that R is indeed a rotation matrix with determinant equal to 1. If the determinant is equal to -1, then R is a reflection, and things have gone wrong.
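The last two checks can be sketched with NumPy; the function names are mine, and the sample E is an arbitrary full-rank matrix rather than a real essential matrix:

```python
import numpy as np

def enforce_rank2(E):
    """Project E onto the nearest rank-2 matrix by zeroing its
    smallest singular value."""
    U, s, Vt = np.linalg.svd(E)
    s[-1] = 0.0
    return U @ np.diag(s) @ Vt

def is_rotation(R, tol=1e-6):
    """A proper rotation matrix is orthogonal with determinant +1;
    a determinant of -1 indicates a reflection."""
    return (np.allclose(R @ R.T, np.eye(3), atol=tol)
            and abs(np.linalg.det(R) - 1.0) < tol)

E = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])    # full rank on purpose
E2 = enforce_rank2(E)
print(np.linalg.matrix_rank(E2))   # -> 2
print(is_rotation(np.eye(3)))      # -> True
```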

Calculating the neighborhood distance

What method would you use to compute a distance that represents the number of "jumps" one has to do to go from one area to another area in a given 2D map?
Let's take the following map for instance:
(map image not shown here; source: free.fr)
The end result of the computation would be a triangle like this:
  A B C D E F
A
B 1
C 2 1
D 2 1 1
E . . . .
F 3 2 2 1 .
Which means that going from A to D, it takes 2 jumps.
However, to go from anywhere to E, it's impossible because the "gap" is too big, and so the value is "infinite", represented here as a dot for simplification.
As you can see in the example, the polygons may share points, but most often they are simply close together, so a maximum gap should be allowed when deciding whether two polygons are adjacent.
This, obviously, is a simplified example; in the real case I'm faced with about 60000 polygons and am only interested in jump values up to 4.
As input data, I have the polygon vertices as an array of coordinates, from which I already know how to calculate the centroid.
My initial approach would be to "paint" the polygons on a white background canvas, each with its own color, and then walk the line between the centroids of two candidate polygons. Counting the colors I encounter could give me the number of jumps.
However, this is not really reliable as it does not take into account concave arrangements where one has to walk around the "notch" to go from one polygon to the other as can be seen when going from A to F.
I have tried looking for reference material on this subject but could not find any because I have a hard time figuring what the proper terms are for describing this kind of problem.
My target language is Delphi XE2, but any example would be most welcome.
You can create an inflated polygon with a small offset for every initial polygon, then check for intersections with the neighbouring (inflated) polygons. Offsetting is useful to compensate for small gaps between polygons.
Both the inflating and intersection problems can be solved with the Clipper library.
How to find the potential neighbours depends on the real conditions. For example, a simple method: divide the plane into square cells, and consider as candidates only the polygons that have vertices in the same cell or in the nearest cells.
Every pair of intersecting polygons gives an edge in an (unweighted, undirected) graph. You want to find all paths with length <= 4, so just execute a depth-limited BFS from every vertex (polygon); assuming the graph is sparse, this is cheap.
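The graph step can be sketched as follows; the adjacency list is hard-coded here to mirror the question's example map, but in practice it would come from the polygon-intersection test:

```python
from collections import deque

def jumps_within(adj, start, max_jumps=4):
    """BFS from `start`, returning {polygon: number of jumps}, pruned at
    max_jumps. Unreachable polygons are simply absent (the 'infinite' case)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if dist[u] == max_jumps:
            continue                  # depth limit: don't expand further
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Adjacency mirroring the example: E is isolated, hence unreachable.
adj = {
    'A': ['B'], 'B': ['A', 'C', 'D'], 'C': ['B', 'D'],
    'D': ['B', 'C', 'F'], 'E': [], 'F': ['D'],
}
print(jumps_within(adj, 'A'))  # {'A': 0, 'B': 1, 'C': 2, 'D': 2, 'F': 3}
```

Running this from every polygon fills in one row of the triangle from the question; E never appears in any result, which is the "infinite" dot.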
You can try single-link clustering or Voronoi diagrams. You can also brute-force it, or try Density-Based Spatial Clustering of Applications with Noise (DBSCAN) or k-means clustering.
I would try this:
1) Do a Delaunay triangulation of all the points of all polygons.
2) Remove from the Delaunay graph all triangles that have their 3 points in the same polygon.
Two polygons are neighbours by point if at least one triangle has a point in each polygon (or, obviously, if the polygons share a common point).
Two polygons are neighbours by side if each polygon has at least two adjacent points in the same quad, i.e. in two adjacent triangles (or, obviously, two common and adjacent points).
Once the gaps are filled with new polygons (triangles, possibly combined), use Dijkstra's algorithm, weighted by the distance between nearest points (or polygon centroids), to compute the paths.

performing border tracing on multiple objects in an image

I developed an algorithm for border tracing of objects in an image. The algorithm can trace all the objects in an image and return the results, so you don't have to slice an image with multiple objects into single-object images before using it.
So basically I begin by finding a threshold value, then get the binary image after threshold and then run the algorithm on it.
The algorithm is below:
1. Find the first pixel that belongs to any object.
2. Trace that object (this has its own algorithm).
3. Get the minimum square area that contains that object.
4. Mark all the pixels in that square as 0 (erase it from the binary image).
5. Repeat from 1 until there aren't any objects left.
This algorithm worked perfectly with objects that are far from each other, but when I tried with the image attached, I got the result attached also.
The problem is that the square is near the circle, and part of the circle lies inside the square that contains the first object, so that part is deleted because the program thinks it is part of the first object.
I would appreciate it if anyone has a solution to this issue.
Thanks!
A quick-and-dirty method is to sort the bounding boxes in ascending order by area before erasing the shapes. That way smaller shapes are removed first, which will reduce the number of overlapping objects. This will be sufficient if you have only convex shapes.
Pseudocode:
calculate all bounding boxes of shapes
sort boxes by area (smallest area first)
foreach box in list:
    foreach pixel in box:
        set pixel to 0
A method guaranteed to work for arbitrary shapes is to erase the box through a mask of the object, so that only the object's own pixels are cleared. You already create a binary image, so you can use it as the mask.
Pseudocode:
foreach box in list:
    foreach pixel in box:
        if pixel in mask == white:
            set pixel to 0
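A minimal NumPy sketch of the mask-based erase on a toy binary image (the shapes, mask, and bounding box are made up for illustration):

```python
import numpy as np

def erase_object(binary, mask, box):
    """Erase only one object's pixels: inside its bounding box,
    clear a pixel only where that object's mask is set."""
    r0, c0, r1, c1 = box
    region = mask[r0:r1, c0:c1].astype(bool)
    binary[r0:r1, c0:c1][region] = 0

# Toy binary image with two objects whose bounding boxes overlap.
binary = np.zeros((6, 6), dtype=np.uint8)
binary[1:3, 1:3] = 1          # object 1
binary[1:5, 3] = 1            # object 2, partly inside object 1's box

mask1 = np.zeros_like(binary)
mask1[1:3, 1:3] = 1           # mask of object 1 only (from the tracer)

box1 = (1, 1, 3, 4)           # object 1's box reaches into object 2
erase_object(binary, mask1, box1)

print(int(binary.sum()))      # all 4 pixels of object 2 survive -> 4
```

Erasing the whole box here would have clipped object 2; the mask leaves it intact.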
You can also try the Canny edge detection technique to resolve this issue. You can find more about it at the following URL:
http://homepages.inf.ed.ac.uk/rbf/HIPR2/canny.htm

equivalent of hierarchy in emgu

I am converting Python OpenCV code to Emgu.
In Python, the findContours function can return the hierarchy:
hierarchy – Optional output vector, containing information about the image topology. It has as many elements as the number of contours. For each i-th contour contours[i], the elements hierarchy[i][0], hierarchy[i][1], hierarchy[i][2], and hierarchy[i][3] are set to 0-based indices in contours of the next and previous contours at the same hierarchical level, the first child contour and the parent contour, respectively. If for the contour i there are no next, previous, parent, or nested contours, the corresponding elements of hierarchy[i] will be negative.
Unfortunately, in Emgu I can't get such an array from the findContours function. Is there any equivalent for this?
If you choose CV_RETR_TREE as the retrieval type, the Contour<Point> that is returned will contain a hierarchical tree structure.
You can navigate the hierarchy using the h_next and v_next pointers in OpenCV (i.e. HNext and VNext in Emgu CV).
In this way, you can get the whole hierarchy.

Map points from one 2D plane to another

Given a point on a plane A, I want to be able to map to its corresponding point on plane B. I have a set of N corresponding pairs of reference points between the two planes, however, the overall mapping is not a simple affine transform (no homographies for me).
Things I have tried:
For a given point, find the three closest reference points in plane A, compute the barycentric coordinates of the point in that triangle, and then apply those coordinates to the corresponding reference points in plane B. How it failed: sometimes the three closest points were nearly collinear, so errors were huge. Also, there was no consistency in the mapping when crossing triangle borders; it was very "jittery."
Compute all possible triangles given the N reference points (O(N^3)). Order them by size. For the given point, find the smallest triangle that contains it. This fixes the collinear-points problem, but it was still extremely jittery and slow.
Start with a triangulated plane A. Iterate through the reference points, adding each one to the reference plane. Every time you add a point, it lands inside at least one existing triangle; break that triangle into three triangles using the new reference point as a vertex. You end up with plane A fully triangulated, so you can map from plane A to plane B with ease. Issues: you can show that every triangle will have a vertex on the edge of the plane, which results in huge errors if your reference points are far from the edge of the plane.
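For reference, the barycentric step from the first attempt can be sketched like this (function names are mine): the weights computed in plane A are applied to the matching triangle in plane B.

```python
import numpy as np

def barycentric(p, a, b, c):
    """Barycentric weights of point p with respect to triangle (a, b, c)."""
    v0, v1, v2 = b - a, c - a, p - a
    d00, d01, d11 = v0 @ v0, v0 @ v1, v1 @ v1
    d20, d21 = v2 @ v0, v2 @ v1
    denom = d00 * d11 - d01 * d01     # zero when the triangle is degenerate
    v = (d11 * d20 - d01 * d21) / denom
    w = (d00 * d21 - d01 * d20) / denom
    return np.array([1.0 - v - w, v, w])

def map_point(p, tri_a, tri_b):
    """Map p from plane A to plane B using a matching pair of triangles."""
    weights = barycentric(p, *tri_a)
    return weights @ np.vstack(tri_b)

tri_a = [np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])]
tri_b = [np.array([0.0, 0.0]), np.array([2.0, 0.0]), np.array([0.0, 2.0])]
print(map_point(np.array([0.25, 0.25]), tri_a, tri_b))  # -> [0.5 0.5]
```

The near-zero denominator for collinear vertices is exactly where the "huge errors" in the first attempt come from.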
I feel like this should be a fairly standard problem. Are there standard algorithms/libraries for this?
There you go, my friend. I have used it myself and can only recommend you give it a try.
Khan Academy - Matrix transformations
Understanding how we can map one set of vectors to another set; matrices used to define linear transformations.
https://www.khanacademy.org/math/linear-algebra/matrix_transformations
