I'm trying to use a reinforcement learning algorithm to play a simple mini-golf game.
I want to give inputs (angle and force) to a game engine,
get the final position of the ball,
calculate a reward based on that final position,
and iterate the process until success.
I think I can achieve this using a greedy approach or function approximation. I want to know whether this is possible, and I would like to find a similar example.
In the literature, reinforcement learning is about the closest thing to artificial general intelligence, so yes, you can apply it to this mini-golf game.
The following will be the layout:
States: Location of ball on the field (x, y, z)
Actions: Angle, Force
Rewards: Negative distance of the ball from the hole (closer means higher reward)
Depending on how big your field is, this problem should be easily solvable.
I think I can achieve this using a greedy approach or function approximation.
You would definitely want to use at least an ε-greedy approach to promote exploration in earlier episodes.
To simplify the problem, I would consider just a 2D, or maybe even a 1D case first so you get familiar with the algorithm.
For the 1D case, your state would be where along the line your ball is. Your action is the amount of force applied to the ball. And the reward can be based on how far your ball is from the goal post.
I can code this environment for you if you'd like.
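In the meantime, here's a minimal sketch of that 1D case as tabular Q-learning with ε-greedy exploration, in Python. The line length, hole position, force discretization and toy physics are all assumptions, not your actual game engine:

import random

# Toy 1D mini-golf: ball on a line of length 20, hole at position 10 (assumed).
LINE_LEN, HOLE = 20, 10
FORCES = [f for f in range(-15, 16) if f != 0]   # signed force levels (assumed)
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2

Q = [[0.0] * len(FORCES) for _ in range(LINE_LEN + 1)]

def shoot(pos, force):
    """Assumed toy physics: the ball travels `force` units, clipped to the line."""
    new_pos = max(0, min(LINE_LEN, pos + force))
    reward = -abs(new_pos - HOLE)                # negative distance to the hole
    return new_pos, reward, new_pos == HOLE

for episode in range(5000):
    pos = 0
    for _ in range(50):                          # cap shots per episode
        # epsilon-greedy: explore sometimes, otherwise take the best-known action
        if random.random() < EPSILON:
            a = random.randrange(len(FORCES))
        else:
            a = max(range(len(FORCES)), key=lambda i: Q[pos][i])
        new_pos, reward, done = shoot(pos, FORCES[a])
        # tabular Q-learning update
        target = reward + (0.0 if done else GAMMA * max(Q[new_pos]))
        Q[pos][a] += ALPHA * (target - Q[pos][a])
        pos = new_pos
        if done:
            break

best = FORCES[max(range(len(FORCES)), key=lambda i: Q[0][i])]
print("Learned first shot from position 0:", best)

Once this works, the 2D case just grows the state to (x, y) and the action to (angle, force); function approximation only becomes necessary when the table gets too big to enumerate.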
Related
I'm currently trying to make a self-driving car algorithm that will work with a racing game using OpenCV and lane detection.
My current algorithm is to take the difference between the x-components of the lanes and the centre (a certain y distance away from the car) and to adjust the direction based on that difference. This works well with oval tracks. However, it's really unstable with sharp turns.
Does anyone have any suggestions of a better way to implement a self-driving algorithm?
Thanks
Let me explain my need before I explain the problem.
I am looking to build a hand-controlled application:
Navigation using palm and clicks using grab/fist.
Currently, I am working with OpenNI, which looks promising and has a few examples that turned out to be useful in my case, since it has an inbuilt hand tracker in its samples. That serves my purpose for the time being.
What I want to ask is,
1) What would be the best approach to building a fist/grab detector?
I trained and used AdaBoost fist classifiers on extracted RGB data, which worked reasonably well, but it produces too many false detections to move forward with.
So, here I frame two more questions:
2) Is there any other good library capable of achieving my needs using depth data?
3) Can we train our own hand gestures, especially ones involving fingers? Some papers refer to HMMs; if yes, how do we proceed with a library like OpenNI?
Yes, I tried the middleware libraries in OpenNI, like the grab detector, but they won't serve my purpose, as they are neither open source nor match my needs.
Apart from what I asked, anything else you think could help me will be accepted as a good suggestion.
You don't need to train your fist detector, since training will complicate things.
Don't use color either, since it's unreliable (it mixes with the background and changes unpredictably depending on lighting and viewpoint).
Assuming that your hand is the closest object, you can simply segment it out by a depth threshold. You can set the threshold manually, use the closest region of the depth histogram, or perform connected-component analysis on the depth map to break it into meaningful parts first (and then select your object based not only on its depth but also on its dimensions, motion, user input, etc.).
Apply convexity defects from the OpenCV library to find fingers.
Track fingers in 3D rather than rediscovering them each frame. This will increase stability. I successfully implemented such finger detection about 3 years ago.
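As a rough sketch of the depth-threshold and convexity-defect steps in Python with OpenCV (the depth band, kernel size and defect-depth cutoff are guesses you'd tune for your camera):

import cv2
import numpy as np

def segment_and_count_fingers(depth_mm, near=400, far=700):
    """Segment the closest object by a depth threshold and count convexity defects.
    depth_mm: 16-bit depth map in millimetres (assumed); near/far: hand depth band."""
    # 1. Depth threshold: keep only pixels in the assumed hand range.
    mask = cv2.inRange(depth_mm, near, far)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))

    # 2. Keep the largest connected component (assumed to be the hand).
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return 0
    hand = max(contours, key=cv2.contourArea)

    # 3. Convexity defects: deep valleys between fingers.
    hull = cv2.convexHull(hand, returnPoints=False)
    defects = cv2.convexityDefects(hand, hull)
    if defects is None:
        return 0
    # A defect deeper than ~20 px is treated as a finger valley
    # (the depth field is fixed-point * 256; the cutoff is a guess to tune).
    deep = [d for d in defects[:, 0] if d[3] / 256.0 > 20]
    return len(deep) + 1 if deep else 0  # n valleys ~ n+1 extended fingers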
Read my paper :) http://robau.files.wordpress.com/2010/06/final_report_00012.pdf
I have done research on gesture recognition for hands, and evaluated several approaches that are robust to scale, rotation etc. You have depth information which is very valuable, as the hardest problem for me was to actually segment the hand out of the image.
My most successful approach is to trace the contour of the hand and, for each point on the contour, take the distance to the centroid of the hand. This gives a set of points that can be used as input for many training algorithms.
I use the image moments of the segmented hand to determine its rotation, so there is a good starting point on the hand's contour. It is very easy to determine a fist, a stretched-out hand, and the number of extended fingers.
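A minimal sketch of that contour signature in Python with OpenCV, assuming you already have a binary mask of the segmented hand (the resampling length is my choice):

import cv2
import numpy as np

def contour_signature(hand_mask, n_samples=64):
    """Distance from each contour point to the hand centroid, resampled to a
    fixed length so it can feed a classifier. hand_mask: binary uint8 image."""
    contours, _ = cv2.findContours(hand_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    contour = max(contours, key=cv2.contourArea).reshape(-1, 2)

    # Centroid from image moments. For a rotation-stable start index, the
    # orientation from second-order moments (mu20, mu02, mu11) could be used
    # to rotate the start point; omitted here for brevity.
    m = cv2.moments(hand_mask, binaryImage=True)
    cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]

    d = np.hypot(contour[:, 0] - cx, contour[:, 1] - cy)
    # Resample to a fixed number of points and normalize scale.
    idx = np.linspace(0, len(d) - 1, n_samples).astype(int)
    return d[idx] / d.max()   # peaks in this signal correspond to fingertips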
Note that while it works fine, your arm tends to get tired from pointing into the air.
It seems that you are unaware of the Point Cloud Library (PCL). It is an open-source library dedicated to the processing of point clouds and RGB-D data, which is based on OpenNI for the low-level operations and which provides a lot of high-level algorithms, for instance to perform registration, segmentation and also recognition.
A very interesting algorithm for shape/object recognition in general is called the implicit shape model. In order to detect a global object (such as a car, or an open hand), the idea is first to detect possible parts of it (e.g. wheels, trunk, etc., or fingers, palm, wrist, etc.) using a local feature detector, and then to infer the position of the global object by considering the density and the relative positions of its parts. For instance, if I can detect five fingers, a palm and a wrist in a given neighborhood, there's a good chance that I am in fact looking at a hand; however, if I only detect one finger and a wrist somewhere, it could be a pair of false detections. The academic research article on this implicit shape model algorithm can be found here.
In PCL, there are a couple of tutorials dedicated to the topic of shape recognition, and luckily, one of them covers the implicit shape model, which has been implemented in PCL. I never tested this implementation, but from what I could read in the tutorial, you can specify your own point clouds for the training of the classifier.
That being said, you did not mention it explicitly in your question, but since your goal is to program a hand-controlled application, you might in fact be interested in a real-time shape detection algorithm. You would have to test the speed of the implicit shape model provided in PCL, but I think this approach is better suited to offline shape recognition.
If you do need real-time shape recognition, I think you should first use a hand/arm tracking algorithm (which is usually faster than full detection) in order to know where to look in the images, instead of trying to perform a full shape detection at each frame of your RGB-D stream. You could for instance track the hand location by segmenting the depth map (e.g. using an appropriate threshold on the depth) and then detecting the extremities.
Then, once you approximately know where the hand is, it should be easier to decide whether the hand is making one gesture relevant to your application. I am not sure what you exactly mean by fist/grab gestures, but I suggest that you define and use some app-controlling gestures which are easy and quick to distinguish from one another.
Hope this helps.
The fast answer is: Yes, you can train your own gesture detector using depth data. It is really easy, but it depends on the type of the gesture.
Suppose you want to detect a hand movement:
Detect the hand position (x, y, z). Using OpenNI this is straightforward, as you have a node for the hand.
Execute the gesture and collect ALL the positions of the hand during the gesture.
With the list of positions, train an HMM. For example, you can use Matlab, C, or Python.
You can then test the model on your own gestures and detect them.
Here you can find a nice tutorial and code (in Matlab). The code (test.m) is pretty easy to follow. Here is a snippet:
% Load collected data
training = get_xyz_data('data/train', train_gesture);
testing = get_xyz_data('data/test', test_gesture);

% Get clusters: quantize the (x,y,z) trajectories into N discrete symbols
% (D is the dimensionality of the points)
[centroids, N] = get_point_centroids(training, N, D);
ATrainBinned = get_point_clusters(training, centroids, D);
ATestBinned = get_point_clusters(testing, centroids, D);

% Set priors for a left-right HMM with M hidden states
pP = prior_transition_matrix(M, LR);

% Train the discrete HMM for up to 50 cycles
cyc = 50;
[E, P, Pi, LL] = dhmm_numeric(ATrainBinned, pP, [1:N]', M, cyc, .00001);
Dealing with fingers is pretty much the same, but instead of detecting the hand you need to detect the fingers. As the Kinect doesn't provide finger points, you need specific code to detect them (using segmentation or contour tracking). Some examples using OpenCV can be found here and here, but the most promising one is the ROS library that has a finger node (see example here).
If you only need the detection of a fist/grab state, you should give Microsoft a chance. Microsoft.Kinect.Toolkit.Interaction contains methods and events that detect the grip / grip-release state of a hand. Take a look at the HandEventType of InteractionHandPointer. That works quite well for fist/grab detection, but does not detect or report the position of individual fingers.
The next Kinect (Kinect One) detects 3 joints per hand (Wrist, Hand, Thumb) and has 3 hand-based gestures: open, closed (grip/fist) and lasso (pointer). If that is enough for you, you should consider the Microsoft libraries.
1) If there are a lot of false detections, you could try to extend the negative sample set of the classifier and train it again. The extended negative image set should contain images where the fist was falsely detected. Maybe this will help to create a better classifier.
I've had quite a bit of success with the middleware library provided by http://www.threegear.com/. They provide several gestures (including grabbing, pinching and pointing) and 6-DOF hand tracking.
You might be interested in this paper & open-source code:
Robust Articulated-ICP for Real-Time Hand Tracking
Code: https://github.com/OpenGP/htrack
Screenshot: http://lgg.epfl.ch/img/codedata/htrack_icp.png
YouTube Video: https://youtu.be/rm3YnClSmIQ
Paper PDF: http://infoscience.epfl.ch/record/206951/files/htrack.pdf
I'm successfully using Perlin noise to generate terrain, clouds and a few other nifty things. However, I'm now trying to animate a group of flying insects (specifically fireflies), and it was suggested to me to use Perlin noise for this, as well. However, I'm not really sure how to go about this.
The first thing that occurred to me was, given a 2D noise map:
Assign each firefly a random initial location, velocity and angular acceleration.
Each frame, advance the fly's position along its direction vector.
Read the noise map at the new location, and use it to adjust the angular acceleration, causing the fly to "turn" towards lighter pixels.
Adjust the angular acceleration again based on the proximity of other flies, to avoid having them cluster around local maxima.
However, this doesn't cover cases where flies reach the edge of the map, or cases where they might wind up just orbiting a single point. The second case might not be a big deal, but I'm unsure of a reliable way to have them turn to avoid collisions with the map edge.
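For reference, here's a sketch of the update loop I have in mind, in Python. The probe distance, turn gain, separation radius and the toroidal wrap at the map edges are all just my current guesses (and I've collapsed angular acceleration into a direct turn rate for brevity):

import math, random

class Firefly:
    def __init__(self, w, h):
        self.x, self.y = random.uniform(0, w), random.uniform(0, h)
        self.heading = random.uniform(0, 2 * math.pi)

def brightness(noise_map, x, y):
    # noise_map: 2D array of values in [0, 1]; indices wrap toroidally
    return noise_map[int(y) % len(noise_map)][int(x) % len(noise_map[0])]

def update(fly, flies, noise_map, w, h, speed=1.0, gain=0.3, probe=5.0):
    # Sample the map slightly left and right of the current heading and
    # turn toward the lighter side.
    left = brightness(noise_map, fly.x + probe * math.cos(fly.heading - 0.4),
                      fly.y + probe * math.sin(fly.heading - 0.4))
    right = brightness(noise_map, fly.x + probe * math.cos(fly.heading + 0.4),
                       fly.y + probe * math.sin(fly.heading + 0.4))
    turn = gain * (right - left)

    # Separation: turn away from flies that are too close, so the group
    # doesn't pile up on a local maximum.
    for other in flies:
        if other is not fly and math.hypot(other.x - fly.x, other.y - fly.y) < 8:
            away = math.atan2(fly.y - other.y, fly.x - other.x)
            turn += 0.1 * math.sin(away - fly.heading)

    fly.heading += turn
    # Wrap around the map edges (toroidal world) instead of colliding with them.
    fly.x = (fly.x + speed * math.cos(fly.heading)) % w
    fly.y = (fly.y + speed * math.sin(fly.heading)) % h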
Suggestions? Tutorials or papers (in English, please)?
Here is a very good source for 2D Perlin noise. You can follow the exact same principles, but instead of creating a 2D grid of gradients, you can create a 1D array of gradients. You can use this to create your noise for a particular axis.
Simply follow this recipe, and you can create similar Perlin noise functions for each of your other axes too! Combine these motions, and you should have some good-looking noise on your hands. (You could also use these noise functions as random accelerations or velocities. Since the Perlin noise function is globally bounded, your flies won't rocket off to crazy distances.)
http://webstaff.itn.liu.se/~stegu/TNM022-2005/perlinnoiselinks/perlin-noise-math-faq.html
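Here's a minimal sketch of that 1D gradient-noise recipe in Python, following the FAQ above; the table size, fade curve and sample frequency are the usual choices, not anything specific to your flies:

import random

random.seed(42)
GRADS = [random.uniform(-1.0, 1.0) for _ in range(256)]  # one gradient per grid point

def fade(t):
    # Perlin's smoothing curve, so the blend has zero slope at grid points
    return t * t * t * (t * (t * 6 - 15) + 10)

def noise1d(x):
    """1D gradient noise, bounded roughly in [-1, 1], so velocities stay sane."""
    i = int(x) & 255
    t = x - int(x)
    g0, g1 = GRADS[i], GRADS[(i + 1) & 255]
    # Each grid point contributes its gradient times the offset to x; blend them.
    return (1 - fade(t)) * (g0 * t) + fade(t) * (g1 * (t - 1))

# One noise channel per axis; advance a per-fly time coordinate each frame:
t = 10.0                       # e.g. frames elapsed (assumed)
vx = noise1d(0.13 * t)         # frequency 0.13 picked arbitrarily
vy = noise1d(0.13 * t + 100)   # offset so the axes decorrelate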
If you're curious about other types of motion, I would suggest Brownian motion. That is the same sort of motion that dust particles exhibit when they are floating around your room. This article gets into some more interesting math at the end, but if you're at all familiar with Matlab, the first few sets of instructions should be pretty easy to understand. If not, just google the functions and find their native equivalents for your environment (or create them yourself!). This will be a little more realistic, and much quicker to calculate, than Perlin noise.
http://lben.epfl.ch/files/content/sites/lben/files/users/179705/Simulating%20Brownian%20Motion.pdf
Happy flying!
Maybe you're looking for boids?
Wikipedia page
It doesn't feature Perlin noise in the original concept, but maybe you could use the noise to generate attractors or repulsors, as you're trying to do with the 'fly to lighter' behavior.
PS: the page linked above features a related link to the Firefly algorithm; maybe you'll be interested in that?
How can I implement the A* algorithm on a gridless 2D plane with no nodes or cells? I need the object to maneuver around a relatively high number of static and moving obstacles in the way of the goal.
My current implementation is to create eight points around the object and treat them as the centers of imaginary adjacent squares that might be a potential position for the object. Then I calculate the heuristic function for each and select the best. I calculate the distances between the starting point and the movement point, and between the movement point and the goal, the normal way with the Pythagorean theorem. The problem is that this way the object often ignores all obstacles, and even more often gets stuck moving back and forth between two positions.
I realize how silly my question might seem, but any help is appreciated.
Create an imaginary grid at whatever resolution is suitable for your problem: As coarse grained as possible for good performance but fine-grained enough to find (desirable) gaps between obstacles. Your grid might relate to a quadtree with your obstacle objects as well.
Execute A* over the grid. The grid may even be pre-populated with useful information like proximity to static obstacles. Once you have a path along the grid squares, post-process that path into a sequence of waypoints wherever there's an inflection in the path. Then travel along the lines between the waypoints.
By the way, you do not need the actual distance (cf. your mention of the Pythagorean theorem): A* works fine with an estimate of the distance. Manhattan distance is a popular choice: |dx| + |dy|. If your grid allows diagonal movement (or the grid is "fake"), the Chebyshev distance max(|dx|, |dy|) is probably sufficient.
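For illustration, the three estimates mentioned, in Python (coordinates assumed to be grid indices):

import math

def manhattan(ax, ay, bx, by):
    # Admissible for 4-way grid movement.
    return abs(ax - bx) + abs(ay - by)

def chebyshev(ax, ay, bx, by):
    # max(|dx|, |dy|): suits grids where diagonal steps cost the same as straight ones.
    return max(abs(ax - bx), abs(ay - by))

def euclidean(ax, ay, bx, by):
    # Straight-line distance (the Pythagorean estimate); admissible for any movement.
    return math.hypot(ax - bx, ay - by)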
The first thing that comes to my mind is that at each point you need to calculate the gradient, or a direction vector, to find out the direction to go in the next step. Then you move by a small epsilon and redo.
This basically creates a grid for you; you can vary the cell size by choosing a small epsilon. By doing this instead of using a fixed grid, you should be able to work with finer angles at each step -- smaller than the 45° of your 8-point example.
Theoretically, you might even be able to solve the formulas symbolically (epsilon against 0), which could lead to an optimal solution... just a thought.
How are the obstacles represented? Are they polygons? You can then use the polygon vertices as nodes. If the obstacles are not represented as polygons, you could generate some sort of convex hull around them and use its vertices for navigation. EDIT: I just realized you mentioned that you have to navigate around a relatively high number of obstacles. Using the obstacle vertices might be infeasible with too many obstacles.
I do not know much about moving obstacles; I believe A* doesn't find an optimal path with moving obstacles.
You mention that your object moves back and forth -- A* should not do this. A* visits each movement point only once. This could be an artifact of generating movement points on the fly, or of the moving obstacles.
I remember encountering this problem in college, but we didn't use an A* search. I can't remember the exact details of the math but I can give you the basic idea. Maybe someone else can be more detailed.
We're going to create a potential field out of your playing area that an object can follow.
Take your playing field and tilt or warp it so that the start point is at the highest point, and the goal is at the lowest point.
Poke a potential well down into the goal, to reinforce that it's a destination.
For every obstacle, create a potential hill. For non-point obstacles, which yours are, the potential field can increase asymptotically at the edges of the obstacle.
Now imagine your object as a marble. If you placed it at the starting point, it should roll down the playing field, around obstacles, and fall into the goal.
The hard part, the math I don't remember, is the equations that represent each of these bumps and wells. If you figure that out, add them together to get your final field, then do some vector calculus to find the gradient (just like towi said) and that's the direction you want to go at any step. Hopefully this method is fast enough that you can recalculate it at every step, since your obstacles move.
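Here's a sketch of that field in Python: a quadratic well at the goal, hills that blow up at obstacle edges, and a finite-difference gradient for the "marble" to follow. The particular potentials and gains are common textbook choices, not the exact equations from my college course:

import math

def potential(p, goal, obstacles):
    """p, goal: (x, y); obstacles: list of (x, y, radius)."""
    # Attractive well: grows quadratically with distance to the goal.
    u = 0.5 * ((p[0] - goal[0]) ** 2 + (p[1] - goal[1]) ** 2)
    # Repulsive hills: blow up as we approach each obstacle's edge.
    for ox, oy, r in obstacles:
        d = math.hypot(p[0] - ox, p[1] - oy) - r
        u += 100.0 / max(d, 1e-6)   # asymptotic at the edge; gain 100 is a guess
    return u

def gradient_step(p, goal, obstacles, step=0.1, eps=1e-4):
    """Move downhill: finite-difference gradient, like the marble rolling."""
    gx = (potential((p[0] + eps, p[1]), goal, obstacles) -
          potential((p[0] - eps, p[1]), goal, obstacles)) / (2 * eps)
    gy = (potential((p[0], p[1] + eps), goal, obstacles) -
          potential((p[0], p[1] - eps), goal, obstacles)) / (2 * eps)
    norm = math.hypot(gx, gy) or 1.0
    return (p[0] - step * gx / norm, p[1] - step * gy / norm)

Since the obstacles move, you would call gradient_step with the obstacles' current positions at every step.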
Sounds like you're implementing the Wumpus game based on Norvig and Russell's discussion of A* in Artificial Intelligence: A Modern Approach, or something very similar.
If so, you'll probably need to incorporate obstacle detection as part of your heuristic function (hence you'll need to have sensors that alert your agent to the signs of obstacles, as seen here).
To solve the back-and-forth issue, you may need to store the traveled path, so you can tell if you've already been to a location, and have the heuristic function examine the past N moves (say 4) and use that as a tie-breaker (i.e. if I can go north or east from here, and my last 4 moves have been east, west, east, west, go north this time).
I have a field filled with obstacles, I know where they are located, and I know the robot's position. Using a path-finding algorithm, I calculate a path for the robot to follow.
Now my problem is, I am guiding the robot from grid to grid but this creates a not-so-smooth motion. I start at A, turn the nose to point B, move straight until I reach point B, rinse and repeat until the final point is reached.
So my question is: What kind of techniques are used for navigating in such an environment so that I get a smooth motion?
The robot has two wheels and two motors. I change the direction of the motor by turning the motors in reverse.
EDIT: I can vary the speed of the motors. Basically, the robot is an Arduino plus an Ardumoto; I can supply values between 0 and 255 to the motors in either direction.
You need feedback linearization for a differentially driven robot. This document explains it in Section 2.2. I've included relevant portions below:
The simulated robot required for the project is a differential drive robot with a bounded velocity. Since differential drive robots are nonholonomic, the students are encouraged to use feedback linearization to convert the kinematic control output from their algorithms to control the differential drive robots. The transformation follows:

v = ẋ·cos(θ) + ẏ·sin(θ)
ω = (-ẋ·sin(θ) + ẏ·cos(θ)) / L

where v and ω are the linear and angular velocities, ẋ and ẏ are the kinematic control velocities, θ is the robot's heading, and L is an offset length proportional to the wheel base dimension of the robot.
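In code, that transformation is just a rotation plus the 1/L scaling. Here's a sketch in Python (the offset L and wheel base are placeholders; mapping the wheel speeds to your 0-255 motor values is left out, since it depends on your drivetrain):

import math

def feedback_linearize(xdot, ydot, theta, L=0.1):
    """Convert a desired planar velocity (xdot, ydot) of a point offset L metres
    ahead of the axle into (v, omega) for a differential drive robot."""
    v = math.cos(theta) * xdot + math.sin(theta) * ydot
    omega = (-math.sin(theta) * xdot + math.cos(theta) * ydot) / L
    return v, omega

def wheel_speeds(v, omega, wheel_base=0.2):
    """Standard differential-drive kinematics: per-wheel linear speeds."""
    v_left = v - omega * wheel_base / 2.0
    v_right = v + omega * wheel_base / 2.0
    return v_left, v_right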
One control algorithm I've had pretty good results with is pure pursuit. Basically, the robot attempts to move to a point along the path a fixed distance ahead of the robot. So as the robot moves along the path, the look ahead point also advances. The algorithm compensates for non-holonomic constraints by modeling possible paths as arcs.
Larger look ahead distances will create smoother movement. However, larger look ahead distances will cause the robot to cut corners, which may collide with obstacles. You can fix this problem by implementing ideas from a reactive control algorithm called Vector Field Histogram (VFH). VFH basically pushes the robot away from close walls. While this normally uses a range finding sensor of some sort, you can extrapolate the relative locations of the obstacles since you know the robot pose and the obstacle locations.
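Here's a sketch of the pure pursuit core in Python; the path is a list of (x, y) waypoints and the look-ahead distance is the tuning knob discussed above:

import math

def pure_pursuit_curvature(pose, path, lookahead=0.5):
    """pose: (x, y, theta); path: list of (x, y). Returns the curvature of the
    arc that takes the robot to the look-ahead point."""
    x, y, theta = pose
    # First path point at least `lookahead` away; fall back to the last point.
    target = path[-1]
    for px, py in path:
        if math.hypot(px - x, py - y) >= lookahead:
            target = (px, py)
            break
    dx, dy = target[0] - x, target[1] - y
    # Lateral offset of the target in the robot frame.
    local_y = -math.sin(theta) * dx + math.cos(theta) * dy
    dist_sq = dx * dx + dy * dy
    return 0.0 if dist_sq == 0 else 2.0 * local_y / dist_sq

# Then omega = v * curvature, and map (v, omega) to wheel speeds.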
My initial thoughts on this (I'm at work, so I can't spend too much time):
It depends on how tight you want or need your corners to be (which would depend on how much clearance your path finder gives you from the obstacles).
Given the width of the robot you can calculate the turning radius given the speeds for each wheel. Assuming you want to go as fast as possible and that skidding isn't an issue, you will always keep the outside wheel at 255 and reduce the inside wheel down to the speed that gives you the required turning radius.
Given the angle for any particular turn on your path and the turning radius that you will use, you can work out the distance from that node where you will slow down the inside wheel.
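As a sketch of that wheel-speed calculation in Python (differential-drive geometry; the track width is a placeholder, and 255 is your full Ardumoto value):

def inner_wheel_speed(outer_speed, turn_radius, track_width=0.15):
    """Keep the outer wheel at full speed and slow the inner wheel so both
    wheels sweep arcs around the same centre: v_in/v_out = (R - w/2)/(R + w/2),
    with R the turning radius to the robot's centre and w the track width."""
    ratio = (turn_radius - track_width / 2.0) / (turn_radius + track_width / 2.0)
    return outer_speed * ratio

# e.g. outer wheel at full PWM (255) for a 0.5 m turning radius:
inner = inner_wheel_speed(255, 0.5)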
An optimization approach is a very general way to handle this.
Use your calculated path as input to a generic non-linear optimization algorithm (your choice!) with a cost function made up of the closeness of the answer trajectory to the input trajectory, adherence to non-holonomic constraints, and any other constraints you want to enforce (e.g. staying away from the obstacles). The optimization can also be initialized with a trajectory constructed from the original one.
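A sketch of that setup in Python with scipy.optimize; the cost weights and the soft obstacle-clearance penalty are placeholders, and a real version would also encode the non-holonomic constraint (e.g. by penalizing heading changes between segments):

import numpy as np
from scipy.optimize import minimize

def smooth_path(path, obstacles, w_fit=1.0, w_smooth=10.0, w_obs=100.0):
    """path: (N, 2) array from the planner; obstacles: rows of (x, y, radius)."""
    ref = np.asarray(path, dtype=float)

    def cost(flat):
        p = flat.reshape(ref.shape)
        fit = np.sum((p - ref) ** 2)                   # stay near the input path
        smooth = np.sum(np.diff(p, n=2, axis=0) ** 2)  # penalize sharp bends
        obs = 0.0
        for ox, oy, r in obstacles:                    # soft clearance penalty
            d = np.hypot(p[:, 0] - ox, p[:, 1] - oy)
            obs += np.sum(np.maximum(0.0, r - d) ** 2)
        return w_fit * fit + w_smooth * smooth + w_obs * obs

    # Initialize with the original trajectory, as suggested above.
    res = minimize(cost, ref.ravel(), method="L-BFGS-B")
    return res.x.reshape(ref.shape)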
Marc Toussaint's robotics course notes are a good source for this type of approach. See in particular lecture 7:
http://userpage.fu-berlin.de/mtoussai/teaching/10-robotics/