Shortest path of a rectangular cuboid in a board - breadth-first-search

I have a challenge with shortest path algorithms. I have a rectangular cuboid in a board (screenshot attached), the movement of the cube is done doing a rotation on one of the edges. So depending on the state the cube can occupy 2 tiles in the board, or standing in one tile of the board. On every rotation the size (in tiles) of the next movement could change. Is there any algorithm capable of calculating shortest path with this behavior? Any help is appreciated.

Yes, it can be done using BFS. In the queue we need to maintain a tuple of "state of the cuboid" (standing or lying, and in which direction) plus "current cells the cuboid occupies", figure out the immediate next cells the cuboid can move to and in which state, and push those onto the queue. It's a standard breadth-first-search question.
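For reference, here is a minimal BFS sketch in Python for a 1x1x2 cuboid; the names `board`, `start`, `goal` and the orientation encoding are assumptions, not from the question:

```python
from collections import deque

# A 1x1x2 cuboid state: (row, col, orient).
# orient 0 = standing on (row, col)
# orient 1 = lying along the row, occupying (row, col) and (row, col+1)
# orient 2 = lying along the column, occupying (row, col) and (row+1, col)

def cells(state):
    r, c, o = state
    if o == 0:
        return [(r, c)]
    if o == 1:
        return [(r, c), (r, c + 1)]
    return [(r, c), (r + 1, c)]

def rolls(state):
    """All four rolls (left, right, up, down) from a given state."""
    r, c, o = state
    if o == 0:
        return [(r, c - 2, 1), (r, c + 1, 1), (r - 2, c, 2), (r + 1, c, 2)]
    if o == 1:
        return [(r, c - 1, 0), (r, c + 2, 0), (r - 1, c, 1), (r + 1, c, 1)]
    return [(r, c - 1, 2), (r, c + 1, 2), (r - 1, c, 0), (r + 2, c, 0)]

def shortest_path(board, start, goal):
    """board[r][c] is True if that tile exists; start and goal are states."""
    def valid(s):
        return all(0 <= r < len(board) and 0 <= c < len(board[0]) and board[r][c]
                   for r, c in cells(s))
    queue, seen = deque([(start, 0)]), {start}
    while queue:
        state, dist = queue.popleft()
        if state == goal:
            return dist                      # number of rotations
        for nxt in rolls(state):
            if valid(nxt) and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return -1                                # goal unreachable
```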

Related

OpenCV - align stack of images - different cameras

We have this camera array arranged in an arc around a person (red dot). Think The Matrix - each camera fires at the same time and then we create an animated gif from the output. The problem is that it is near impossible to align the cameras exactly and so I am looking for a way in OpenCV to align the images better and make it smoother.
Looking for general steps. I'm unsure of the order I would do it in. If I start with image 1 and match 2 to it, then 2 is further from 3 than it was at the start, so matching 3 to 2 would require more change... and the error would propagate. I have seen similar alignments done, though. Any help much appreciated.
Here's a thought. How about performing a quick and very simple "calibration" of the imaging system by using a single reference point?
The best thing about this is you can try it out pretty quickly and even if results are too bad for you, they can give you some more insight into the problem. But the bad thing is it may just not be good enough because it's hard to think of anything "less advanced" than this. Here's the description:
Remove the object from the scene
Place a small object (let's call it a "dot") at a position that roughly corresponds to the center of mass of the object you are about to record (the center of the area denoted by the red circle).
Record a single image with each camera
Use some simple algorithm to find the position of the dot on every image
Compute distances from dot positions to image centers on every image
Shift images by (-x, -y), where (x, y) is the above mentioned distance; after that, the dot should be located in the center of every image.
When recording an actual object, use these precomputed distances to shift all images. After you translate the images, they will be roughly aligned. But since you are shooting an object that is three-dimensional and has considerable size, I am not sure whether the alignment will be very convincing ... I wonder what results you'd get, actually.
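As a rough illustration of this calibration, here is a hedged OpenCV/Python sketch; it assumes the dot is the darkest blob in the calibration image, which you would adapt to your setup:

```python
import cv2
import numpy as np

def dot_offset(calib_img):
    """Offset (dx, dy) that moves the detected dot to the image center.
    Assumes the dot is the darkest blob in the calibration image."""
    gray = cv2.cvtColor(calib_img, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    m = cv2.moments(mask)                      # centroid of the thresholded blob
    dot = np.array([m["m10"] / m["m00"], m["m01"] / m["m00"]])
    center = np.array([calib_img.shape[1] / 2.0, calib_img.shape[0] / 2.0])
    return center - dot

def apply_offset(img, offset):
    """Translate an image by the precomputed per-camera offset."""
    dx, dy = offset
    M = np.float32([[1, 0, dx], [0, 1, dy]])
    return cv2.warpAffine(img, M, (img.shape[1], img.shape[0]))
```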
If I understand the application correctly, you should be able to obtain the relative pose of each camera in your array using homographies:
https://docs.opencv.org/3.4.0/d9/dab/tutorial_homography.html
From here, the next step would be to correct for alignment issues by estimating the transform between each camera's actual position and their 'ideal' position in the array. These ideal positions could be computed relative to a single camera, or relative to the focus point of the array (which may help simplify calculation). For each image, applying this corrective transform will result in an image that 'looks like' it was taken from the 'ideal' position.
Note that you may need to estimate relative camera pose in 3-4 array 'sections', as it looks like you have a full 180deg array (e.g. estimate homographies for 4-5 cameras at a time). As long as you have some overlap between sections it should work out.
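A possible (untested) OpenCV sketch of this idea, using ORB feature matches between a camera's image and a chosen reference camera to estimate the homography; a calibration target would likely be more robust than natural features:

```python
import cv2
import numpy as np

def align_to_reference(img, ref):
    """Warp `img` into the frame of a chosen reference camera image `ref`."""
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(img, None)
    k2, d2 = orb.detectAndCompute(ref, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    matches = sorted(matches, key=lambda m: m.distance)[:200]   # keep the best matches
    src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)        # robust to bad matches
    return cv2.warpPerspective(img, H, (ref.shape[1], ref.shape[0]))
```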
Most of my experience with this sort of thing comes from using MATLAB's stereo camera calibrator app and related functions. Their help page gives a good overview of how to get started estimating camera pose. OpenCV has similar functionality.
https://www.mathworks.com/help/vision/ug/stereo-camera-calibrator-app.html
The cited paper by Zhang gives a great description of the mathematics of pose estimation from correspondence, if you're interested.

Ray tracer for complicated figures

I have implemented a real-time ray tracer with the Metal framework for iOS, for optical prisms like the dodecahedron, icosahedron, octahedron, cube, etc. All my figures are composed of triangles, for example a cube is 12 triangles and an octahedron is 8 triangles. I trace rays and search for an intersection with the figure, then I compute how the ray travels inside the prism. Then the ray leaves the figure and I search for an intersection with the skybox. The problem is with complicated figures. When I test the cube the fps is 60, but when I test the dodecahedron the fps is 6. In my algorithm an intersection with the figure is the same as an intersection with any triangle, which means that when I check a ray against the figure I have to check it against every triangle. I need some idea for how to avoid checking intersections against all triangles. Thanks.
Let's say you have a world bounded by some bounding box:
create a grid (dividing this box into cubes or whatever)
each voxel/cell is a list of the triangles that intersect it or lie inside it, so before rendering, for each cell, process all triangles and store the indices of all triangles inside or crossing it
rewrite the ray tracer to trace through this voxel map
So just increment the ray through neighboring voxels; it is the same as line rasterization on pixels. This way you have a partial Z-sort done. So take the first voxel hit by the ray and test only the triangles contained in it. If any hit was found in that voxel then stop (no need to test the other voxels because they are farther away).
further optimizations
You can add a flag marking whether a triangle has been tested, so you only test those which were not already tested, because otherwise many triangles will be tested multiple times.
[notes]
The number of voxels per axis greatly affects performance, so you need to play with it a bit to achieve the best performance. If you have dynamic objects then the voxel map lists must be recomputed once in a while, or even every frame. For a static scene it is sufficient to do this just once.
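To make the idea concrete, here is a rough Python sketch of such a voxel map; it approximates triangle/voxel overlap by the triangle's bounding box and uses fixed-step marching instead of a proper 3D-DDA, both simplifications you would refine in a real tracer:

```python
import numpy as np

class VoxelGrid:
    """Uniform grid over the scene's bounding box; each cell stores triangle indices.
    Triangle/voxel overlap is approximated by the triangle's bounding box."""

    def __init__(self, triangles, bounds_min, bounds_max, res=32):
        self.res = res
        self.bmin = np.asarray(bounds_min, float)
        self.cell = (np.asarray(bounds_max, float) - self.bmin) / res
        self.cells = {}                                   # (i, j, k) -> [triangle index]
        for idx, tri in enumerate(triangles):             # tri: (3, 3) array of vertices
            lo, hi = self.to_cell(tri.min(axis=0)), self.to_cell(tri.max(axis=0))
            for i in range(lo[0], hi[0] + 1):
                for j in range(lo[1], hi[1] + 1):
                    for k in range(lo[2], hi[2] + 1):
                        self.cells.setdefault((i, j, k), []).append(idx)

    def to_cell(self, p):
        c = np.floor((p - self.bmin) / self.cell).astype(int)
        return np.clip(c, 0, self.res - 1)

    def candidates(self, origin, direction, t_max):
        """Yield candidate triangle lists voxel by voxel along the ray, nearest first.
        Uses fixed-step marching for brevity; a real tracer would use a 3D-DDA."""
        origin, direction = np.asarray(origin, float), np.asarray(direction, float)
        step, t, seen = self.cell.min() * 0.5, 0.0, set()
        while t < t_max:
            key = tuple(self.to_cell(origin + t * direction))
            if key not in seen:
                seen.add(key)
                yield self.cells.get(key, [])
            t += step
```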
To trace efficiently you'll need to use an acceleration structure, for example a KD-tree or a bounding volume hierarchy (BVH). This is similar to using a binary search tree to find a matching element.
I would suggest using a BVH because it is easier to construct and traverse than a KD-tree, and I would suggest against using a uniform voxel grid. A voxel grid can easily have very poor performance when triangles are unevenly distributed through the scene or heavily concentrated in a few voxels.
A BVH is just a tree of bounding volumes, such as axis-aligned bounding boxes (AABBs), each of which encompasses the primitives within it. This way, if a ray misses a bounding volume, you know it does not hit any primitive contained within it.
To construct a BVH:
Put all the triangles in one bounding volume. This will be the root of the tree.
Divide the triangles into two sets such that the bounding volume of each set is minimized. More properly, you'd want to follow the surface area heuristic (SAH), which picks the split that minimizes the sum over both sets of (surface area of the set's bounding volume) x (number of triangles in the set).
Repeat step 2 for each node recursively until the number of triangles left in a node hits some threshold (4 is a good number to try).
To traverse:
Check if the ray hits the root bounding box. If it does, proceed to step 2; otherwise there is no hit.
Check if the ray hits either child bounding box. If it does, repeat this step for that child's children. Otherwise there is no hit.
When you get to a bounding box (a leaf) which only contains triangles, test each triangle to see if it is hit, just like normal.
This is the basic idea of a BVH. There is much more detail about BVHs that I haven't gone into and that you'll have to search for, since there are so many variations in the details.
In short: implement a bounding volume hierarchy to accelerate tracing.
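Here is a stripped-down Python sketch of the idea; it uses a median split on the longest axis instead of the full SAH, and assumes you already have `ray_hits_aabb` and `ray_hits_triangle` routines in your tracer:

```python
import numpy as np

class BVHNode:
    """BVH built with a median split on the longest axis (the SAH is omitted here).
    `boxes` has shape (N, 2, 3): per-triangle AABB min and max; `centroids` is (N, 3)."""

    def __init__(self, tri_ids, centroids, boxes, leaf_size=4):
        self.box_min = boxes[tri_ids, 0].min(axis=0)
        self.box_max = boxes[tri_ids, 1].max(axis=0)
        self.tri_ids = self.left = self.right = None
        if len(tri_ids) <= leaf_size:                       # small enough: make a leaf
            self.tri_ids = tri_ids
            return
        axis = int(np.argmax(self.box_max - self.box_min))  # split along the longest axis
        order = tri_ids[np.argsort(centroids[tri_ids, axis])]
        mid = len(order) // 2
        self.left = BVHNode(order[:mid], centroids, boxes, leaf_size)
        self.right = BVHNode(order[mid:], centroids, boxes, leaf_size)

def closest_hit(node, ray, ray_hits_aabb, ray_hits_triangle):
    """ray_hits_aabb(ray, bmin, bmax) -> bool; ray_hits_triangle(ray, tri_id) -> hit
    record with a .distance field, or None. Both are assumed to exist already."""
    if not ray_hits_aabb(ray, node.box_min, node.box_max):
        return None                                         # whole subtree skipped
    if node.tri_ids is not None:                            # leaf: test its few triangles
        hits = [h for i in node.tri_ids if (h := ray_hits_triangle(ray, i))]
        return min(hits, default=None, key=lambda h: h.distance)
    hits = [h for child in (node.left, node.right)
            if (h := closest_hit(child, ray, ray_hits_aabb, ray_hits_triangle))]
    return min(hits, default=None, key=lambda h: h.distance)
```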

Using Hough Transform in robot navigation

My goal is for an autonomous robot to navigate a walled maze using a camera. The camera is fixed atop the robot and faces down so it can view the walls and the robot from above.
The approach I took that seemed most straightforward from my experience was to
Threshold the image to extract the red walls
Perform Canny edge detection
Use the Hough transform to detect the strong lines from the edges
as seen below after some parameter tweaking
I want the robot to move forward and avoid "hitting" the red walls. The problem is that the Hough transform detects multiple lines per wall edge. One idea I had was to perform k-means clustering to cluster the lines and find the center (mean) of each cluster, but I do not know the number of wall edges (and therefore the number of clusters to pass to the k-means algorithm) that I will have at any given time while navigating the maze (walls ahead, behind, multiple-turn intersections, etc.).
Any help in finding a good way to get a consistent wall location to compare against the robot's location (which is fixed in every image frame) at any time while navigating the maze would be greatly appreciated. I'm also open to any other approach to this problem.
Run a skeletonization algorithm before extracting the HoughLines.
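For example (a hedged OpenCV sketch; `cv2.ximgproc.thinning` needs the opencv-contrib package, and the red HSV range and Hough parameters are placeholders to tune):

```python
import cv2
import numpy as np

# `frame` is the camera image; the red HSV range below is a placeholder to tune.
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv, (0, 120, 70), (10, 255, 255))        # threshold the red walls
skeleton = cv2.ximgproc.thinning(mask)                       # one-pixel-wide wall centerlines
lines = cv2.HoughLinesP(skeleton, 1, np.pi / 180, threshold=50,
                        minLineLength=40, maxLineGap=10)
```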

How to use the A* path finding algorithm on a grid less 2D plane?

How can I implement the A* algorithm on a gridless 2D plane with no nodes or cells? I need the object to maneuver around a relatively high number of static and moving obstacles in the way of the goal.
My current implementation is to create eight points around the object and treat them as the centers of imaginary adjacent squares that might be potential positions for the object. Then I calculate the heuristic function for each and select the best one. I calculate the distances between the starting point and the movement point, and between the movement point and the goal, the normal way, with the Pythagorean theorem. The problem is that this way the object often ignores all obstacles and, even more often, gets stuck moving back and forth between two positions.
I realize how silly my question might seem, but any help is appreciated.
Create an imaginary grid at whatever resolution is suitable for your problem: As coarse grained as possible for good performance but fine-grained enough to find (desirable) gaps between obstacles. Your grid might relate to a quadtree with your obstacle objects as well.
Execute A* over the grid. The grid may even be pre-populated with useful information like proximity to static obstacles. Once you have a path along the grid squares, post-process that path into a sequence of waypoints wherever there's an inflection in the path. Then travel along the lines between the waypoints.
By the way, you do not need the actual distance (cf. your mention of the Pythagorean theorem): A* works fine with an estimate of the distance. Manhattan distance is a popular choice: |dx| + |dy|. If your game grid allows diagonal movement (or the grid is "fake"), simply max(|dx|, |dy|) is probably sufficient.
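A minimal A* sketch over such an imaginary grid, with a Chebyshev heuristic and a closed set (which is what prevents the back-and-forth behaviour); `blocked` is an assumed obstacle-test callback:

```python
import heapq

def a_star(start, goal, blocked,
           heuristic=lambda a, b: max(abs(a[0] - b[0]), abs(a[1] - b[1]))):
    """A* over an implicit 8-connected grid of (x, y) cells.
    `blocked(cell)` is an assumed callback that reports whether a cell hits an obstacle."""
    open_heap = [(heuristic(start, goal), 0, start, None)]   # (f, g, cell, parent)
    came_from, done = {}, set()
    while open_heap:
        _, g, cell, parent = heapq.heappop(open_heap)
        if cell in done:                                     # already expanded: skip
            continue
        done.add(cell)
        came_from[cell] = parent
        if cell == goal:                                     # reconstruct the path
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        x, y = cell
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nxt = (x + dx, y + dy)
                if (dx or dy) and nxt not in done and not blocked(nxt):
                    heapq.heappush(open_heap,
                                   (g + 1 + heuristic(nxt, goal), g + 1, nxt, cell))
    return None                                              # no path found
```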
Uh. The first thing that comes to my mind is that at each point you need to calculate the gradient, or a vector, to find the direction to go in the next step. Then you move by a small epsilon and repeat.
This basically creates a grid for you; you can vary the cell size by choosing a small epsilon. By doing this instead of using a fixed grid, you should be able to work with small angles at each step -- smaller than the 45° of your 8-point example.
Theoretically you might be able to solve the formulas symbolically (eps against 0), which could lead to an optimal solution... just a thought.
How are the obstacles represented? Are they polygons? You could then use the polygon vertices as nodes. If the obstacles are not represented as polygons, you could generate some sort of convex hull around them and use its vertices for navigation. EDIT: I just realized you mentioned that you have to navigate around a relatively high number of obstacles. Using the obstacle vertices might be infeasible with too many obstacles.
I do not know about moving obstacles; I believe A* doesn't find an optimal path with moving obstacles.
You mention that your object moves back and forth - A* should not do this. A* visits each movement point only once. This could be an artifact of generating movement points on the fly, or of the moving obstacles.
I remember encountering this problem in college, but we didn't use an A* search. I can't remember the exact details of the math but I can give you the basic idea. Maybe someone else can be more detailed.
We're going to create a potential field out of your playing area that an object can follow.
Take your playing field and tilt or warp it so that the start point is at the highest point, and the goal is at the lowest point.
Poke a potential well down into the goal, to reinforce that it's a destination.
For every obstacle, create a potential hill. For non-point obstacles, which yours are, the potential field can increase asymptotically at the edges of the obstacle.
Now imagine your object as a marble. If you placed it at the starting point, it should roll down the playing field, around obstacles, and fall into the goal.
The hard part, the math I don't remember, is the equations that represent each of these bumps and wells. If you figure that out, add them together to get your final field, then do some vector calculus to find the gradient (just like towi said) and that's the direction you want to go at any step. Hopefully this method is fast enough that you can recalculate it at every step, since your obstacles move.
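As a rough numerical sketch of the idea (assuming a point goal and circular obstacles; the exact shapes of the bumps and wells are a design choice):

```python
import numpy as np

def potential(p, goal, obstacles, k_goal=1.0, k_obs=100.0):
    """p and goal are (x, y); obstacles is a list of (cx, cy, radius) circles."""
    u = k_goal * np.linalg.norm(np.subtract(p, goal))        # bowl sloping toward the goal
    for cx, cy, r in obstacles:
        d = max(np.hypot(p[0] - cx, p[1] - cy) - r, 1e-6)    # distance to the obstacle edge
        u += k_obs / d                                       # hill rising sharply near it
    return u

def step(p, goal, obstacles, eps=0.5, h=1e-3):
    """Move one small step down the numerical gradient (the 'rolling marble')."""
    gx = (potential((p[0] + h, p[1]), goal, obstacles)
          - potential((p[0] - h, p[1]), goal, obstacles)) / (2 * h)
    gy = (potential((p[0], p[1] + h), goal, obstacles)
          - potential((p[0], p[1] - h), goal, obstacles)) / (2 * h)
    g = np.array([gx, gy])
    return np.asarray(p, float) - eps * g / (np.linalg.norm(g) + 1e-9)
```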
Sounds like you're implementing the Wumpus World game based on Russell and Norvig's discussion of A* in Artificial Intelligence: A Modern Approach, or something very similar.
If so, you'll probably need to incorporate obstacle detection as part of your heuristic function (hence you'll need to have sensors that alert your agent to the signs of obstacles, as seen here).
To solve the back-and-forth issue, you may need to store the traveled path so you can tell whether you've already been to a location, and have the heuristic function examine the past N moves (say 4) and use that as a tie-breaker (i.e., if I can go north or east from here, and my last 4 moves have been east, west, east, west, go north this time).

How to remove the background image and get the foreground image

There are two images:
http://bbs.shoucangshidai.com/attachments/month_1001/1001211535bd7a644e95187acd.jpg
http://bbs.shoucangshidai.com/attachments/month_1001/10012115357cfe13c148d3d8da.jpg
One is the background image; the other is a photo of a person against the same background, at the same size. What I want to do is remove the second image's background and extract only the person's silhouette. The common method is to subtract the first image from the second, but my problem is that if the color of the person's clothes is similar to the background, the result of the subtraction is awful and I cannot get the whole person's silhouette. Does anyone have a good idea for how to remove the background? Any advice is welcome.
Thank you in advance.
If you have a good estimate of the image background, subtracting it from the image with the person is a good first step. But it is only the first step. After that, you have to segment the image, i.e. you have to partition the image into "background" and "foreground" pixels, with constraints like these:
in the foreground areas, the average difference from the background image should be high
in the background areas, the average difference from the background image should be low
the areas should be smooth. Outline length and curvature should be minimal.
the borders of the areas should have a high contrast in the source image
If you are mathematically inclined, these constraints can be modeled perfectly with the Mumford-Shah functional. See here for more information.
But you can probably adapt other segmentation algorithms to the problem.
If you want a fast and simple (but not perfect) version, you could try this:
subtract the two images
find the largest connected "blob" of pixels with a background-foreground difference greater than some threshold. This is a first rough estimate for the "person area" in the foreground image, but the segmentation does not yet meet criteria 3 and 4 above.
Find the outline of the largest blob (EDIT: Note that you don't have to start at the outline. You can also start with a larger polygon, as the steps will automatically shrink it to the optimal position.)
now go through each point in the outline and smooth the outline, i.e. for each point, find the position that minimizes the formula c1*L - c2*G, where L is the length the outline polygon would have if the point were moved there and G is the image gradient at the location the point would be moved to; c1 and c2 are constants that control the process. Move the point to that position. This has the effect of smoothing the contour polygon in areas of low gradient in the source image, while keeping it tied to high gradients in the source image (i.e. the visible borders of the person). You can try different expressions for L and G; for example, L could take the length and curvature into account, and G could also take the gradient of the background and subtracted images into account.
you probably will have to re-normalize the outline polygon, i.e. make sure that the points on the outline are spaced regularly. Either that, or make sure that the distances between the points stay regular in the step before. ("Geodesic Snakes")
repeat the last two steps until convergence
You now have an outline polygon that touches the visible person-background border and continues smoothly where the border is not visible or has low contrast.
Look up "Snakes" (e.g. here) for more information.
Low-pass filter (blur) the images before you subtract them.
Then use that difference signal as a mask to select the pixels of interest.
A wide-enough filter will ignore the too-small (high-frequency) features that end up carving out "awful" regions inside your object of interest. It'll also reduce the highlighting of pixel-level noise and misalignment (the highest-frequency information).
In addition, if you have more than two frames, introducing some time hysteresis will let you form more stable regions of interest over time too.
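Something like this (a hedged OpenCV sketch; the sigma and threshold values are placeholders to tune):

```python
import cv2

# `background` and `frame` are the two images; sigma and threshold are placeholders to tune.
bg_blur = cv2.GaussianBlur(background, (0, 0), 5)
fg_blur = cv2.GaussianBlur(frame, (0, 0), 5)
diff = cv2.cvtColor(cv2.absdiff(bg_blur, fg_blur), cv2.COLOR_BGR2GRAY)
mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)[1]
foreground = cv2.bitwise_and(frame, frame, mask=mask)        # keep only the pixels of interest
```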
One technique that I think is common is to use a mixture model. Grab a number of background frames and for each pixel build a mixture model for its color.
When you apply a frame with the person in it you will get some probability that the color is foreground or background, given the probability densities in the mixture model for each pixel.
After you have P(pixel is foreground) and P(pixel is background) you could just threshold the probability images.
Another possibility is to use the probabilities as inputs in some more clever segmentation algorithm. One example is graph cuts which I have noticed works quite well.
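For what it's worth, OpenCV ships a per-pixel Gaussian mixture model as a ready-made background subtractor; a rough sketch (the frame variable names are assumptions):

```python
import cv2

# Feed the background-only frames first, then the frame with the person.
subtractor = cv2.createBackgroundSubtractorMOG2(history=50, varThreshold=16,
                                                detectShadows=False)
for bg in background_frames:          # list of background-only images (assumed)
    subtractor.apply(bg)
foreground_mask = subtractor.apply(person_frame)   # 255 where the model says "foreground"
```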
However, if the person is wearing clothes that are visually indistinguishable from the background, obviously none of the methods described above would work. You'd either have to get another sensor (like IR or UV), or have a quite elaborate "person model" which could "add" the legs in the right position if it finds what it thinks is a torso and head.
Good luck with the project!
Background vs. foreground detection is very subjective. The application scenario defines what counts as background or foreground. However, in the application you describe, I guess you are implicitly saying that the person is the foreground.
Using the above assumption, what you seek is a person detection algorithm. A possible solution is:
Run a Haar feature detector + boosted cascade of weak classifiers (see the OpenCV wiki for details)
Compute inter-frame motion (differences)
If there is a positive face detection for a frame, cluster motion pixels around the face (kNN algorithm)
voila... you should have a simple person detector.
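A hedged OpenCV sketch of the first step (the motion clustering around the face is left out):

```python
import cv2

# `person_frame` is the image containing the person.
cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
gray = cv2.cvtColor(person_frame, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(person_frame, (x, y), (x + w, y + h), (0, 255, 0), 2)  # mark each detection
```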
Post the photo on Craigslist and tell them that you'll pay $5 for someone to do it.
Guaranteed you'll get hits in minutes.
Instead of a straight subtraction, you could step through both images pixel by pixel and only "subtract" the pixels which are exactly the same. That, of course, won't account for minor variances in color, though.