My goal is for an autonomous robot to navigate a walled maze using a camera. The camera is fixed atop the robot, facing down so that it can view the walls and the robot from above.
The approach I took that seemed most straightforward from my experience was to
Threshold the image to extract the red walls
Perform Canny edge detection
Use the Hough transform to detect the strong lines from the edges, as seen below after some parameter tweaking.
I want to have the robot move forward and avoid "hitting" the red walls. The problem is that the Hough transform detects multiple lines per wall edge. One idea I had was to perform k-means clustering on the lines and take the center (mean) of each cluster, but I do not know the number of wall edges (and therefore the number of clusters to feed to k-means) at any given time while navigating the maze (walls ahead, behind, multiple turn intersections, etc.).
Any help in finding a good way to get a consistent wall location to compare against the robot's location (which is always fixed in every image frame) at any point while navigating the maze would be greatly appreciated. I'm also open to any other approach to this problem.
Run a skeletonization algorithm before extracting the HoughLines.
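A minimal Python/OpenCV sketch of that idea, assuming you already have the binary mask from the red-wall threshold step; cv2.ximgproc.thinning comes from the opencv-contrib package, and the Hough parameters are placeholders to tune:

```python
import cv2
import numpy as np

# wall_mask: 8-bit binary image from the red-wall thresholding step
wall_mask = cv2.imread("wall_mask.png", cv2.IMREAD_GRAYSCALE)  # placeholder input

# Thin each wall blob to a 1-pixel-wide skeleton so a wall contributes a single
# ridge instead of two parallel edges (which is what produces duplicate lines).
skeleton = cv2.ximgproc.thinning(wall_mask)  # requires opencv-contrib-python

# Hough on the skeleton should now return roughly one line per wall segment.
lines = cv2.HoughLinesP(skeleton, rho=1, theta=np.pi / 180,
                        threshold=40, minLineLength=30, maxLineGap=10)

if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        print("wall segment:", (x1, y1), "->", (x2, y2))
```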
I'm currently working on a project where I need to very reliably get the positions of the balls on a pool table.
I'm using a Kinect v2 above the table as the source.
The initial image looks like this (after converting it from 16-bit to 8-bit by throwing away pixels that are not around table level):
Then I subtract a reference image of the empty table from the current image.
After thresholding and equalization it looks like this:
It's fairly easy to detect the individual balls on a single image, the problem is that I have to do it constantly with 30fps.
Difficulties:
Low resolution image (512×424); a ball is around 4-5 pixels in diameter
Kinect depth image has a lot of noise from this distance (2 meters)
Balls look different on the depth image, for example the black ball is kind of inverted compared to the others
If they touch each other they can become one blob in the image; if I try to separate them with depth thresholding (only using the top of the balls), then some of the balls can disappear from the image
It's really important that nothing other than balls gets detected, e.g. cue, hands, etc.
My process, which kind of works but is not reliable enough (a rough code sketch follows the list):
16-bit to 8-bit by thresholding
Subtracting a sample image of the empty table
Cropping
Thresholding
Equalizing
Eroding
Dilating
Binary threshold
Contour finder
Some further algorithms on the output coordinates
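For reference, a rough Python/OpenCV sketch of that preprocessing chain; the threshold values, kernel size, and the OpenCV 4.x findContours() return convention are assumptions, not the actual parameters used above:

```python
import cv2
import numpy as np

def ball_candidates(depth16, reference8, table_min, table_max):
    """Rough sketch of the described pipeline; all constants are placeholders."""
    # 16-bit depth -> 8-bit, keeping only values around table level
    band = np.clip(depth16, table_min, table_max)
    depth8 = cv2.normalize(band, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

    # Subtract the empty-table reference so only new objects (balls, hands, cue) remain
    diff = cv2.absdiff(depth8, reference8)

    # Threshold, equalize, then clean up with erosion/dilation
    _, mask = cv2.threshold(diff, 20, 255, cv2.THRESH_BINARY)
    eq = cv2.equalizeHist(cv2.bitwise_and(diff, mask))
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    eq = cv2.erode(eq, kernel, iterations=1)
    eq = cv2.dilate(eq, kernel, iterations=1)
    _, binary = cv2.threshold(eq, 127, 255, cv2.THRESH_BINARY)

    # Contours of the remaining blobs are the ball candidates
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return binary, contours
```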
The problem is that a pool cue or a hand can be detected as a ball, and two balls touching can also cause issues. I also tried Hough circles, but with even less success. (It works nicely if the Kinect is closer, but then it can't cover the whole table.)
Any clues would be much appreciated.
Expanding on the comments above:
I recommend improving the IRL setup as much as possible.
Most of the time it's easier to ensure a reliable setup than to try to "fix" it using computer vision before even getting to detecting/tracking anything.
My suggestions are:
Move the camera closer to the table. (the image you posted can be 117% bigger and still cover the pockets)
Align the camera to be perfectly perpendicular to the table (and ensure the sensor stand is sturdy and well fixed): it will be easier to process a perfect top-down view than a slightly tilted one (which is what the depth gradient shows). (Sure, the data can be rotated, but why waste CPU cycles when you can simply keep the sensor straight?)
With a more reliable setup you should be able to threshold based on depth.
You can possibly threshold down to the centre of the balls, since the information below is occluded anyway. The balls do not deform, so if the radius decreases quickly, the ball probably went into a pocket.
Once you have a clean thresholded image, you can findContours() and minEnclosingCircle(). Additionally you should constrain the results based on minimum and maximum radius values to avoid other objects that may be in view (hands, pool cues, etc.). Also have a look at moments() and be sure to read Adrian's excellent Ball Tracking with OpenCV article.
It's using Python, but you should be able to find OpenCV equivalent call for the language you use.
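A hedged sketch of the contour/minEnclosingCircle/moments idea in Python; the radius bounds are made-up numbers you would tune for ~4-5 px balls, and the two-value findContours() return is the OpenCV 4.x convention:

```python
import cv2

MIN_RADIUS, MAX_RADIUS = 1.5, 4.0  # placeholder bounds for ~4-5 px balls

def detect_balls(mask):
    """Return (x, y, r) for blobs whose enclosing circle looks ball-sized."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    balls = []
    for c in contours:
        (x, y), r = cv2.minEnclosingCircle(c)
        if not (MIN_RADIUS <= r <= MAX_RADIUS):
            continue  # rejects hands, cues, and merged blobs that are the wrong size
        m = cv2.moments(c)
        if m["m00"] > 0:  # centroid from moments is usually steadier than the circle centre
            x, y = m["m10"] / m["m00"], m["m01"] / m["m00"]
        balls.append((x, y, r))
    return balls
```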
In terms of tracking:
If you use OpenCV 2.4, look into its tracking algorithms (such as Lucas-Kanade).
If you already use OpenCV 3.0, it has its own list of contributed tracking algorithms (such as TLD).
I recommend starting with moments first: use the simplest and least computationally expensive setup initially and see how robust the results are before moving on to the more complex algorithms (which will take time to understand and to get the parameters tuned for the expected results).
I want to get a 3D model of some real-world object.
I have two webcams, and using OpenCV and SBM for stereo correspondence I get a point cloud of the scene; by filtering on z I can get a point cloud of only the object.
I know that ICP is good for this purpose, but it needs the point clouds to be initially well aligned, so it is combined with SAC to achieve better results.
But my SAC fitness score is too big, something like 70 or 40, and ICP doesn't give good results either.
My questions are:
Is it OK for ICP if I just rotate the object in front of the cameras to obtain the point clouds? What angle of rotation is needed to achieve good results? Or is there a better way of taking pictures of the object to get a 3D model?
Is it OK if my point clouds have some holes?
What is the maximal acceptable SAC fitness score for a good ICP, and what is the maximal fitness score of a good ICP?
Example of my point cloud files:
https://drive.google.com/file/d/0B1VdSoFbwNShcmo4ZUhPWjZHWG8/view?usp=sharing
My advice, from experience: you already have RGB (or grey) images, so use them. ICP is good for optimising the alignment of point clouds, but it has trouble aligning them on its own.
First start with RGB odometry (aligning the point clouds, which are rotated relative to each other, via feature points), then learn how ICP works with the already mentioned Point Cloud Library. Let the RGB features give you a prediction, and then use ICP to optimise it when possible.
Once that works, think about a good fitness score calculation. If it all works, use the trunk version of ICP and tune its parameters. After all this is done you will have code that is not only fast but also has a low chance of going wrong.
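Your clouds are presumably processed with PCL; purely as an illustration of the "coarse initial guess, then ICP refinement" idea (and of reading the fitness score), here is a minimal sketch using Open3D in Python, a different library, so the calls will not map one-to-one onto PCL:

```python
import numpy as np
import open3d as o3d

source = o3d.io.read_point_cloud("scan_1.pcd")  # placeholder file names
target = o3d.io.read_point_cloud("scan_2.pcd")

voxel = 0.005  # placeholder voxel size, tune to the object's scale
src = source.voxel_down_sample(voxel)
tgt = target.voxel_down_sample(voxel)

# init_T should come from the coarse alignment (SAC / feature matching / RGB odometry);
# identity is only a stand-in and will fail if the clouds start far apart.
init_T = np.eye(4)

result = o3d.pipelines.registration.registration_icp(
    src, tgt, max_correspondence_distance=voxel * 2.0, init=init_T,
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint())

# fitness = fraction of points with a correspondence, inlier_rmse = residual of those
# correspondences; watch both, since a high fitness with a large RMSE is still a bad fit.
print(result.fitness, result.inlier_rmse)
print(result.transformation)
```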
The following post explains what can go wrong:
Using ICP, we refine this transformation using only geometric information. However, here ICP decreases the precision. What happens is that ICP tries to match as many corresponding points as it can. Here the background behind the screen has more points than the screen itself on the two scans. ICP will then align the clouds to maximize the correspondences on the background. The screen is then misaligned.
https://github.com/introlab/rtabmap/wiki/ICP
Currently, I am desperately trying to detect an object (a robot) based on 2D laser scans (from another robot). In the following two pictures, the blue arrow corresponds to the pose of the laser scanner and points towards the object that I would like to detect.
one side of the object
two sides of the object
Since it is basically a 2D picture, my first approach was to look for some OpenCV implementations such as HoughLinesP or LSDDetector in order to detect the lines. Unfortunately, since the focus of OpenCV is more on "real" images with "real" lines, this approach does not really work with point clouds, as far as I understand. Another well-known library is the Point Cloud Library, which on the other hand focuses more on 3D point clouds.
My current approach would be to segment the laser scans and then use some iterative closest point (ICP) C++ implementation to find a 2D point cloud template in the laser scans. Since I am not that familiar with object detection and all that nice stuff, I am quite sure that there are some more sophisticated solutions...
Do you have any suggestions?
Many thanks in advance :)
To get lines from points, you could try RANSAC.
You would iteratively fit a line to the points, then remove the points corresponding to the new line, and repeat until there are not enough points left or the support is too low, something like that.
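A rough numpy sketch of that fit-remove-repeat loop; the iteration count, inlier tolerance, and minimum support are arbitrary placeholders:

```python
import numpy as np

def ransac_line(points, n_iters=200, inlier_tol=0.02):
    """Fit one line to 2D points with RANSAC; returns (point, direction, inlier_mask)."""
    rng = np.random.default_rng()
    best_line, best_inliers = None, None
    for _ in range(n_iters):
        a, b = points[rng.choice(len(points), 2, replace=False)]
        d = b - a
        if np.linalg.norm(d) < 1e-9:
            continue
        d = d / np.linalg.norm(d)
        # perpendicular distance of every point to the candidate line through a along d
        dist = np.abs((points[:, 0] - a[0]) * d[1] - (points[:, 1] - a[1]) * d[0])
        inliers = dist < inlier_tol
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_line, best_inliers = (a, d), inliers
    return best_line[0], best_line[1], best_inliers

def extract_lines(points, min_support=20):
    """Repeatedly extract lines and drop their inliers until the support is too low."""
    lines, pts = [], np.asarray(points, dtype=float)
    while len(pts) >= min_support:
        p0, d, inliers = ransac_line(pts)
        if inliers.sum() < min_support:
            break
        lines.append((p0, d))
        pts = pts[~inliers]
    return lines
```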
Hope it helps.
First of all I'm a total newbie in image processing, so please don't be too harsh on me.
That being said, I'm developing an application to analyse changes in blood flow in extremities using thermal images obtained by a camera. The user is able to define a region of interest by placing a shape (circle,rectangle,etc.) on the current image. The user should then be able to see how the average temperature changes from frame to frame inside the specified ROI.
The problem is that some of the images are not steady, due to (small) movement by the test subject. My question is how can I determine the movement between the frames, so that I can relocate the ROI accordingly?
I'm using the Emgu OpenCV .Net wrapper for image processing.
What I've tried so far is calculating the center of gravity using GetMoments() on the biggest contour found and calculating the direction vector between this and the previous center of gravity. The ROI is then translated using this vector but the results are not that promising yet.
Is this the right way to do it or am I totally barking up the wrong tree?
------Edit------
Here are two sample images showing slight movement downwards to the right:
http://postimg.org/image/wznf2r27n/
Comparison between the contours:
http://postimg.org/image/4ldez2di1/
As you can see the shape of the contour is pretty much the same, although there are some small differences near the toes.
Seems like I was finally able to find a solution to my problem using optical flow based on the Lucas-Kanade method.
Just in case anyone else is wondering how to implement it in Emgu/C#, here's the link to an Emgu examples project where they use the Lucas-Kanade and Farneback algorithms:
http://sourceforge.net/projects/emguexample/files/Image/BuildBackgroundImage.zip/download
You may need to adapt a few things, e.g. the parameters for the corner detection (the frame.GoodFeaturesToTrack(..) method), but it's definitely something to start with.
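The project above is Emgu/C#; purely as a sketch of the same idea in plain Python/OpenCV (the corner-detection and pyramid parameters are guesses to tune), the per-frame ROI shift could be estimated like this:

```python
import cv2
import numpy as np

def frame_shift(prev_gray, curr_gray):
    """Estimate the global shift between two frames with sparse Lucas-Kanade flow."""
    # Corners to track; quality and spacing values are placeholders.
    p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                 qualityLevel=0.01, minDistance=7)
    if p0 is None:
        return np.zeros(2)
    p1, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, p0, None,
                                             winSize=(21, 21), maxLevel=3)
    good = status.ravel() == 1
    if not good.any():
        return np.zeros(2)
    # Median displacement is robust to a few bad tracks; translate the ROI by this vector.
    return np.median((p1[good] - p0[good]).reshape(-1, 2), axis=0)

# usage: dx, dy = frame_shift(prev, curr); move the ROI by (dx, dy) each frame
```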
Thanks for all the ideas!
I have a field filled with obstacles, I know where they are located, and I know the robot's position. Using a path-finding algorithm, I calculate a path for the robot to follow.
Now my problem is, I am guiding the robot from grid to grid but this creates a not-so-smooth motion. I start at A, turn the nose to point B, move straight until I reach point B, rinse and repeat until the final point is reached.
So my question is: What kind of techniques are used for navigating in such an environment so that I get a smooth motion?
The robot has two wheels and two motors. I change the direction of the motor by turning the motors in reverse.
EDIT: I can vary the speed of the motors; basically the robot is an Arduino plus an Ardumoto, and I can supply values between 0-255 to the motors in either direction.
You need feedback linearization for a differentially driven robot. This document explains it in Section 2.2. I've included relevant portions below:
The simulated robot required for the project is a differential drive robot with a bounded velocity. Since the differential drive robots are nonholonomic, the students are encouraged to use feedback linearization to convert the kinematic control output from their algorithms to control the differential drive robots. The transformation follows:
where v, ω, x, y are the linear, angular, and kinematic velocities. L is an offset length proportional to the wheel base dimension of the robot.
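The transformation itself was an equation image in the linked document and did not carry over here; assuming the standard feedback-linearization form for a point offset L ahead of the axle, a minimal sketch looks like this:

```python
import math

def feedback_linearize(xdot, ydot, theta, L):
    """Map desired velocities (xdot, ydot) of a point offset L ahead of the wheel axle
    to differential-drive commands (v, omega):
        v     =  xdot*cos(theta) + ydot*sin(theta)
        omega = (-xdot*sin(theta) + ydot*cos(theta)) / L
    """
    v = xdot * math.cos(theta) + ydot * math.sin(theta)
    omega = (-xdot * math.sin(theta) + ydot * math.cos(theta)) / L
    return v, omega

# usage: pick (xdot, ydot) proportional to the error between the offset point and the
# next waypoint, then convert with feedback_linearize() before commanding the motors.
```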
One control algorithm I've had pretty good results with is pure pursuit. Basically, the robot attempts to move to a point along the path a fixed distance ahead of the robot. So as the robot moves along the path, the look ahead point also advances. The algorithm compensates for non-holonomic constraints by modeling possible paths as arcs.
Larger look ahead distances will create smoother movement. However, larger look ahead distances will cause the robot to cut corners, which may collide with obstacles. You can fix this problem by implementing ideas from a reactive control algorithm called Vector Field Histogram (VFH). VFH basically pushes the robot away from close walls. While this normally uses a range finding sensor of some sort, you can extrapolate the relative locations of the obstacles since you know the robot pose and the obstacle locations.
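A bare-bones sketch of the pure pursuit step; the lookahead distance and forward speed are placeholders, and the path is assumed to be an ordered list of waypoints:

```python
import math

def pure_pursuit(pose, path, lookahead, v):
    """pose = (x, y, theta); path = ordered list of (x, y) waypoints.
    Returns (v, omega) steering toward a point roughly `lookahead` ahead on the path."""
    x, y, theta = pose
    # Pick the first waypoint at least `lookahead` away; fall back to the final goal.
    goal = path[-1]
    for px, py in path:
        if math.hypot(px - x, py - y) >= lookahead:
            goal = (px, py)
            break
    # Lateral offset of the goal point in the robot frame.
    dx, dy = goal[0] - x, goal[1] - y
    y_r = -math.sin(theta) * dx + math.cos(theta) * dy
    ld = math.hypot(dx, dy)
    # Pure pursuit arc: curvature = 2 * lateral_offset / distance_to_goal^2
    curvature = 2.0 * y_r / (ld * ld) if ld > 1e-6 else 0.0
    return v, v * curvature  # omega = v * curvature follows the arc
```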
My initial thoughts on this (I'm at work so can't spend too much time):
It depends how tight you want or need your corners to be (which would depend on how much distance your path finder gives you from the obstacles)
Given the width of the robot, you can calculate the turning radius from the speeds of the two wheels (see the sketch after this list). Assuming you want to go as fast as possible and that skidding isn't an issue, you will always keep the outside wheel at 255 and reduce the inside wheel down to the speed that gives you the required turning radius.
Given the angle for any particular turn on your path and the turning radius that you will use, you can work out the distance from that node where you will slow down the inside wheel.
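A quick sketch of that wheel-speed/turning-radius relationship for a differential drive; it assumes the 0-255 motor values map roughly linearly to wheel speed, and the wheel base value is whatever your robot actually measures:

```python
def inner_speed_for_radius(outer_speed, radius, wheel_base):
    """Inner wheel speed for a desired turning radius (measured to the robot centre),
    given the outer wheel speed (e.g. PWM 255).
    From R = (W/2) * (v_out + v_in) / (v_out - v_in):
        v_in = v_out * (R - W/2) / (R + W/2)
    """
    return outer_speed * (radius - wheel_base / 2.0) / (radius + wheel_base / 2.0)

def turning_radius(v_left, v_right, wheel_base):
    """Turning radius to the robot centre for given wheel speeds; inf when driving straight."""
    if v_left == v_right:
        return float("inf")
    return (wheel_base / 2.0) * (v_right + v_left) / (v_right - v_left)

# e.g. with a 0.2 m wheel base and a desired 0.5 m radius:
# inner value ~= 255 * (0.5 - 0.1) / (0.5 + 0.1) ~= 170
```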
An optimization approach is a very general way to handle this.
Use your calculated path as input to a generic non-linear optimization algorithm (your choice!) with a cost function made up of closeness of the answer trajectory to the input trajectory as well as adherence to non-holonomic constraints, and any other constraints you want to enforce (e.g. staying away from the obstacles). The optimization algorithm can also be initialised with a trajectory constructed from the original trajectory.
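A toy sketch of that idea with scipy.optimize; the weights, the quadratic smoothness term, and the soft obstacle-clearance penalty are stand-ins for a proper non-holonomic formulation:

```python
import numpy as np
from scipy.optimize import minimize

def smooth_path(path, obstacles, w_fit=1.0, w_smooth=5.0, w_obs=2.0, clearance=0.5):
    """path: (N, 2) waypoints from the planner; obstacles: (M, 2) obstacle centres.
    Returns a smoothed (N, 2) path with the endpoints held fixed."""
    path = np.asarray(path, dtype=float)
    obstacles = np.asarray(obstacles, dtype=float).reshape(-1, 2)
    n = len(path)

    def cost(flat):
        p = flat.reshape(n, 2)
        fit = np.sum((p - path) ** 2)                          # stay near the input path
        smooth = np.sum((p[2:] - 2 * p[1:-1] + p[:-2]) ** 2)   # penalise sharp bends
        d = np.linalg.norm(p[:, None, :] - obstacles[None, :, :], axis=2)
        obs = np.sum(np.maximum(0.0, clearance - d) ** 2)      # penalise getting too close
        return w_fit * fit + w_smooth * smooth + w_obs * obs

    def endpoints_fixed(flat):
        p = flat.reshape(n, 2)
        return np.concatenate([p[0] - path[0], p[-1] - path[-1]])

    res = minimize(cost, path.ravel(),
                   constraints={"type": "eq", "fun": endpoints_fixed})
    return res.x.reshape(n, 2)
```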
Marc Toussaint's robotics course notes are a good source for this type of approach. See in particular lecture 7:
http://userpage.fu-berlin.de/mtoussai/teaching/10-robotics/