I know that with the particle filter algorithm a robot can pick the best pose given the map. But how can a robot predict its pose in SLAM, where the map is not given? Do we get the data from an IMU?
SLAM is a very broad field, and numerous methods exist for performing it.
The basic idea is to estimate a map of an environment and, at the same time, the path the robot takes through that environment. The map is generally built from a sensor that measures the positions of prominent landmarks in the environment. The robot's movements are often integrated from IMU, odometry, or GPS measurements. The setups and algorithms used to perform SLAM can vary drastically, though.
It is not even necessary to get movement data from the robot. Think of a Kalman filter that tracks the landmark positions and the robot's location in its state vector and assumes a constant-position state-transition model for the robot with no control input. Even though the transition model assumes a constant position, the measurement updates of the landmark positions are theoretically enough to give an estimate of the robot's updated position.
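To make that concrete, here is a minimal sketch (my own illustration, not taken from any particular SLAM implementation) of a 1-D linear Kalman filter whose state holds the robot position and a single landmark position. The motion model is constant position with no control input; only the relative landmark measurement corrects the robot estimate. All numbers are invented.

```python
import numpy as np

# State: x = [robot_position, landmark_position]
x = np.array([0.0, 5.0])          # initial guess
P = np.diag([1.0, 1.0])           # initial covariance

F = np.eye(2)                     # constant-position transition model
Q = np.diag([0.5, 1e-6])          # robot may actually move (large noise); landmark is static
H = np.array([[-1.0, 1.0]])       # measurement: z = landmark - robot (relative range)
R = np.array([[0.1]])             # measurement noise

def kf_step(x, P, z):
    # Predict: the mean stays put under the constant-position model,
    # but uncertainty about the robot grows through Q.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update with the relative landmark observation.
    y = z - H @ x_pred                      # innovation
    S = H @ P_pred @ H.T + R                # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)     # Kalman gain
    return x_pred + (K @ y).ravel(), (np.eye(2) - K @ H) @ P_pred

# Simulate: the robot actually drives toward a landmark fixed at 5.0.
true_robot, true_landmark = 0.0, 5.0
rng = np.random.default_rng(0)
for t in range(20):
    true_robot += 0.2                                     # real (unknown) motion
    z = np.array([true_landmark - true_robot + rng.normal(0, 0.3)])
    x, P = kf_step(x, P, z)

print("estimated robot position:", x[0], " true:", true_robot)
```

Even with no motion information at all, the landmark observations drag the robot estimate along; in a real system you would of course feed odometry or IMU data into the prediction step instead of relying on process noise alone.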
This is taken a step further in the structure-from-motion approach, where a single camera moved through an environment is used to estimate a map of the environment from image features while simultaneously estimating the path the camera takes through that map.
So as long as you don't have a specific sensor setup and algorithm in mind, the question of "how a robot does pose estimation in SLAM" is not really productive. If you have a specific question, I can perhaps point you toward the relevant literature.
The Probabilistic Robotics book by Sebastian Thrun has a good introduction to probabilistic approaches to SLAM.
In cases where the map is not provided, the robot senses its surroundings by means of the sensors mounted on it. It builds the map by marking the points it detects in three-dimensional space. For example, a vision-based robot tries to find the points of interest (features) detected in the current frame again in the next frame. It then estimates its own displacement from the displacement of those features in the image. This concept is called odometry (visual odometry in this case), and with it the robot estimates how much it has moved through the environment. Odometry information is usually obtained by fusing several different sensors rather than relying on a single one.
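As a rough sketch of that frame-to-frame idea, the snippet below matches ORB features between two consecutive frames with OpenCV and recovers the relative camera motion. The file names and camera matrix are placeholders, and the recovered translation is only known up to scale, which is one of the reasons visual odometry is usually fused with other sensors.

```python
import numpy as np
import cv2

K = np.array([[700.0, 0.0, 320.0],   # assumed pinhole camera intrinsics
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])

prev = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)   # hypothetical frames
curr = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

# 1. Detect and describe points of interest in both frames.
orb = cv2.ORB_create(2000)
kp1, des1 = orb.detectAndCompute(prev, None)
kp2, des2 = orb.detectAndCompute(curr, None)

# 2. Find the same features again in the next frame.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# 3. Estimate the camera displacement from the feature displacements.
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

print("rotation between frames:\n", R)
print("translation direction (scale unknown from vision alone):\n", t.ravel())
```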
Related
I'm doing some research on navigation algorithms in ROS and I want to test the kidnapped robot problem in Gazebo. Looking on the internet, I saw that the two common solutions are the particle filter and the Kalman filter. I know that amcl already implements a particle filter and that you can use a Kalman filter with this package, but the problem with them is that amcl needs the robot's initial position. So my question is: does amcl really solve the kidnapped robot problem, and are there any other methods for solving this issue?
AMCL doesn't need an initial pose. When the initial pose is not given, it initializes the particles uniformly across the map. After the robot has moved a sufficient distance, the particle filter will converge to the correct pose.
AMCL addresses the kidnapped robot problem by adding random particles. When the robot is kidnapped, the number of random particles added increases. Among those random particles, the ones near the robot's actual pose receive the highest weights, and upon resampling, more particles are placed near the correct pose. After a few sensor updates and resampling steps, the particle filter converges to the robot's actual pose.
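As a toy illustration of that random-particle injection (my own 1-D sketch, not AMCL's source code), the filter below tracks long- and short-term average particle weights (the w_slow/w_fast idea from Probabilistic Robotics) and re-draws a fraction of particles uniformly over the map whenever the short-term average collapses, for example right after a kidnap. The map, sensor model, and smoothing rates are all made up.

```python
import numpy as np

rng = np.random.default_rng(0)
MAP_LENGTH = 100.0
DOORS = np.array([10.0, 45.0, 80.0])        # known landmark positions in the map

def measure(x, noise=0.5):
    # The sensor reports the distance to the nearest door.
    return np.min(np.abs(DOORS - x)) + rng.normal(0, noise)

def likelihood(z, particles, noise=0.5):
    expected = np.min(np.abs(DOORS[None, :] - particles[:, None]), axis=1)
    return np.exp(-0.5 * ((z - expected) / noise) ** 2) + 1e-12

N = 500
particles = rng.uniform(0, MAP_LENGTH, N)   # no initial pose: uniform over the map
w_slow = w_fast = 1e-3                      # long- and short-term average weights
a_slow, a_fast = 0.05, 0.5                  # made-up smoothing rates (a_slow << a_fast)

true_x = 20.0
for step in range(200):
    if step == 100:
        true_x = 70.0                       # kidnap: teleport the robot

    true_x = (true_x + 1.0) % MAP_LENGTH    # the robot drives forward
    particles = (particles + 1.0 + rng.normal(0, 0.3, N)) % MAP_LENGTH

    z = measure(true_x)
    w = likelihood(z, particles)
    w_avg = w.mean()
    w_slow += a_slow * (w_avg - w_slow)
    w_fast += a_fast * (w_avg - w_fast)

    # Resample, then replace a fraction max(0, 1 - w_fast/w_slow) with random particles.
    particles = particles[rng.choice(N, size=N, p=w / w.sum())]
    n_random = int(max(0.0, 1.0 - w_fast / w_slow) * N)
    if n_random:
        particles[:n_random] = rng.uniform(0, MAP_LENGTH, n_random)

print("true position:", round(true_x, 1), " estimate:", round(float(np.median(particles)), 1))
```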
There are many solutions proposed for the kidnapped robot problem in the research literature. Most of them require an additional setup or additional sensors.
In my eye-tracking project, the detected pupil center is jumping a lot; I don't see it as a fixed point.
What should I do?
My idea was to compare the pupil center between two frames against a threshold, but it doesn't solve the problem. Another issue is camera noise.
What should I do to reduce the noise?
I used the Starburst algorithm:
Starburst: A hybrid algorithm for video-based eye tracking combining feature-based and model-based approaches.
Eye trackers come with two types of noise/error: variable and systematic. Variable error is basically the dispersion around the gazed target, while the constant drift or deviation from the gaze target is the systematic error. For reference, see the following paper:
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0196348
In your case, it's the variable error. Variable error arises from fatigue, involuntary eyeball vibrations, lighting, and so on. You can reduce it by simply filtering the gaze data. However, be careful not to smooth it too much, which might lead to the loss of the natural fluctuations inherent in eye movement.
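A minimal smoothing sketch (not part of Starburst itself, and with made-up window/alpha values you would tune per setup): a small per-axis median filter knocks out single-frame jumps, and a light exponential moving average damps the remaining jitter without freezing the signal completely.

```python
from collections import deque

class PupilSmoother:
    def __init__(self, window=5, alpha=0.3):
        self.buf = deque(maxlen=window)   # recent raw centers for the median filter
        self.alpha = alpha                # EMA weight: lower = smoother but laggier
        self.ema = None

    def update(self, center):
        self.buf.append(center)
        xs = sorted(p[0] for p in self.buf)
        ys = sorted(p[1] for p in self.buf)
        med = (xs[len(xs) // 2], ys[len(ys) // 2])   # per-axis median of the window
        if self.ema is None:
            self.ema = med
        else:
            self.ema = (self.alpha * med[0] + (1 - self.alpha) * self.ema[0],
                        self.alpha * med[1] + (1 - self.alpha) * self.ema[1])
        return self.ema

# Usage: feed the raw center found in every frame, draw the returned value.
smoother = PupilSmoother()
for raw_center in [(101, 80), (103, 79), (140, 95), (102, 81), (104, 80)]:
    print(smoother.update(raw_center))
```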
I am adding an SCNNode with text, and I want to save its location so that I can later show that added object on a map.
I can get the node's position as an SCNVector3. Please help me convert it to latitude and longitude.
Thanks
There are ways to do what you seem to be asking about, but most of them are probably overdoing it.
ARKit and geolocation/mapping technologies (CoreLocation, MapKit) really operate on different scales.
ARKit works at room scale—that is, it’s hard to do anything meaningful within a single AR session that involves distances above 10-20 meters—and has precision/error on the order of millimeters.
Geolocation, by definition, operates at planet scale, and has precision/error ranging from one to several meters depending on factors in the local environment (cellular AGPS and WiFi location availability, satellite visibility, radio noise, position-refining features like iBeacon, etc).
In other words, almost any virtual content you can accurately place in an AR session is probably so close to you that its distance from your geolocated position is smaller than the error in your geolocated position. (For example, ARKit might think your position within a room is stable, letting you place virtual content 1.5m in front of you. But while you’re doing that, your estimated geolocation is drifting around back and forth within a 5m radius.)
So, if you want to put a marker on a world map for your ARKit content, you’ll probably have an easier time ignoring that distance — just use your current position from CoreLocation as the marker.
If your app is for use only in a controlled environment (retail store, museum, etc) where you can ensure high-precision geolocation (say, by deploying iBeacons), you might be able to do something useful with that distance. Translating positions from ARKit world space to lat/long is pretty simple regardless, but you have to be in a situation like this for the result to be meaningful:
Use the gravityAndHeading world alignment option for your AR session, so that the x/z axes of ARKit world coordinate space line up with world compass directions.
Project the distance vector between the camera and an object in AR space into 2D (x/z) to get east/west and north/south distance along Earth’s surface in meters relative to the user’s position.
Get the user's position in geographic space from CoreLocation, and add the vector you got in step 2 to get the object's position in geographic space. (CLLocationCoordinate2D doesn't do meters, so you may find it useful to convert through another space like MKMapPoint to get a final result.) A rough numeric sketch of this conversion follows below.
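For what it's worth, here is a back-of-the-envelope version of steps 2 and 3 in plain Python (a real implementation would use CoreLocation/MapKit in Swift; this only shows the coordinate math). It assumes the gravityAndHeading convention that +x points east and +z points south, and uses a flat-Earth approximation that is fine at these distances.

```python
import math

EARTH_RADIUS_M = 6_378_137.0   # WGS-84 equatorial radius

def offset_to_lat_lon(user_lat, user_lon, ar_dx, ar_dz):
    """Convert an ARKit-space offset in meters to a lat/lon near the user.

    ar_dx: object.x - camera.x  (east is positive under gravityAndHeading)
    ar_dz: object.z - camera.z  (+z points south, so north is -ar_dz)
    """
    east_m = ar_dx
    north_m = -ar_dz
    dlat = math.degrees(north_m / EARTH_RADIUS_M)
    dlon = math.degrees(east_m / (EARTH_RADIUS_M * math.cos(math.radians(user_lat))))
    return user_lat + dlat, user_lon + dlon

# Example: content placed 1.5 m north and 0.5 m east of the camera.
print(offset_to_lat_lon(48.8584, 2.2945, ar_dx=0.5, ar_dz=-1.5))
```

As noted above, the resulting offset is usually smaller than the geolocation error itself, so treat the output as decoration rather than ground truth.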
Is the gyroscope in an iPhone accurate enough to measure distance the way a measuring tape can? I am not looking at using GPS, as I plan on this only working for short distances. If so, how would I calculate this?
The short answer is, "no".
A more complete answer is: the gyroscope measures rotation, not distance. It is possible to use the gyroscope to help measure distance, but it would have to be coupled with some other sensor that can actually measure distance.
I will go further and mention that the accelerometer packaged with the gyroscope can't measure distance by itself, either. Even though the accelerometer measures acceleration, which is more closely related to distance than rotation is, it needs some kind of absolute reference to give you a useful distance.
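A quick numeric illustration (numbers invented for the demo) of why naive integration doesn't work as a tape measure: double-integrating accelerometer samples turns even a tiny uncorrected bias into a position error that grows roughly as bias·t²/2, i.e. a couple of meters within ten seconds.

```python
dt = 0.01          # 100 Hz sample rate
bias = 0.05        # m/s^2 of uncorrected accelerometer bias (optimistic)
velocity = 0.0
position = 0.0

for _ in range(int(10.0 / dt)):           # the phone is actually standing still
    measured_accel = 0.0 + bias           # true acceleration is zero
    velocity += measured_accel * dt       # first integration: velocity drifts linearly
    position += velocity * dt             # second integration: position drifts quadratically

print(f"apparent drift after 10 s of standing still: {position:.2f} m")
```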
For an iPhone, your best bet is to use the camera and computer vision techniques, combined with the gyroscope and accelerometer, to do integrated tracking of your stationary environment. The tricky bit will be reliably ignoring non-stationary things in your environment...
HOG is popular for human detection. Can it be used for detecting other objects, for example a cup in an image?
I am sorry for not asking a programming question, but I want to know whether I can use HOG to extract object features.
Based on the research I have done over the past few days, I feel the answer is yes, but I am not sure.
Yes, HOG (Histogram of Oriented Gradients) can be used to detect any kind of object; to a computer, an image is a bunch of pixels, and you may extract features from it regardless of its contents. Another question, though, is how effective it is at doing so.
HOG, SIFT, and other such feature extractors are methods used to extract relevant information from an image to describe it in a more meaningful way. When you want to detect an object or person in an image with thousands (and maybe millions) of pixels, it is inefficient to simply feed a vector with millions of numbers to a machine learning algorithm, because:
It will take a large amount of time to complete
There will be a lot of noisy information (background, blur, lighting and rotation changes) which we do not wish to regard as important
The HOG algorithm, specifically, creates histograms of edge orientations from certain patches in images. A patch may come from an object, a person, meaningless background, or anything else, and is merely a way to describe an area using edge information. As mentioned previously, this information can then be used to feed a machine learning algorithm such as the classical support vector machines to train a classifier able to distinguish one type of object from another.
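A hedged sketch of that patch -> HOG -> SVM pipeline, using scikit-image and scikit-learn; random noise stands in for real labelled patches here, purely to show the shapes involved.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

def hog_vector(patch):
    # 128x64 grayscale patch -> histogram-of-oriented-gradients descriptor
    return hog(patch, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm="L2-Hys")

# Stand-in training data: 20 "object" patches and 20 "background" patches.
patches = rng.random((40, 128, 64))
labels = np.array([1] * 20 + [0] * 20)
X = np.array([hog_vector(p) for p in patches])

clf = LinearSVC()          # a linear SVM, as in the original HOG paper
clf.fit(X, labels)

# At detection time: slide a window over the image, extract the same HOG
# vector for each window, and let the classifier decide.
test_patch = rng.random((128, 64))
print("descriptor length:", X.shape[1], " prediction:", clf.predict([hog_vector(test_patch)]))
```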
The reason HOG has had so much success with pedestrian detection is because a person can greatly vary in color, clothing, and other factors, but the general edges of a pedestrian remain relatively constant, especially around the leg area. This does not mean that it cannot be used to detect other types of objects, but its success can vary depending on your particular application. The HOG paper shows in detail how these descriptors can be used for classification.
It is worthwhile to note that for several applications, the results obtained with HOG can be greatly improved using a pyramidal scheme. This works as follows: instead of extracting a single HOG vector from an image, you successively divide the image (or patch) into several sub-images and extract an individual HOG vector from each of these smaller divisions. The process can then be repeated. In the end, you obtain a final descriptor by concatenating all of the HOG vectors into a single vector.
This has the advantage that in larger scales the HOG features provide more global information, while in smaller scales (that is, in smaller subdivisions) they provide more fine-grained detail. The disadvantage is that the final descriptor vector grows larger, thus taking more time to extract and to train using a given classifier.
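A rough sketch of that pyramidal scheme, under the same assumptions as the previous snippet: level 0 is the whole patch, level 1 its four quadrants, and so on, with the HOG vector of every sub-image concatenated into one long descriptor.

```python
import numpy as np
from skimage.feature import hog

def pyramid_hog(patch, levels=2):
    h, w = patch.shape
    pieces = []
    for level in range(levels):
        n = 2 ** level                          # 1x1, 2x2, 4x4, ... grid of sub-images
        for i in range(n):
            for j in range(n):
                sub = patch[i * h // n:(i + 1) * h // n,
                            j * w // n:(j + 1) * w // n]
                pieces.append(hog(sub, orientations=9,
                                  pixels_per_cell=(8, 8),
                                  cells_per_block=(2, 2),
                                  block_norm="L2-Hys"))
    return np.concatenate(pieces)

patch = np.random.default_rng(1).random((128, 64))
print("single-level HOG length:", pyramid_hog(patch, levels=1).shape[0])
print("two-level pyramid length:", pyramid_hog(patch, levels=2).shape[0])
```

The longer two-level descriptor carries both the coarse, global edge layout and finer local detail, which matches the trade-off described above.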
In short: Yes, you can use them.