Building a Tetris AI using Neuroevolution - machine-learning

I am planning to create a Tetris AI using an artificial neural network and train it with a genetic algorithm for a project in my high school computer science class. I have a basic understanding of how an ANN works and how to implement it with a genetic algorithm. I have already written a working neural network based on this tutorial, and I'm currently working on a genetic algorithm.
My questions are:
Which GA model is better for this situation (Tetris), and why?
What should I use as input for the neural network? Currently, the method I'm using is to simply convert the state of the board (the pieces) into a one-dimensional array and feed it into the neural network. Is there a better approach?
What should the size (number of layers, neurons per layer) of the neural network be?
Are there any good sources of information that can help me?
Thank you!

A similar task has already been solved by Google, but they solved it for all kinds of Atari games - https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf.
Carefully read this article, and all of the related articles too.
This is a reinforcement learning task, in my opinion the hardest kind of task in the ML domain. So there is no short answer to your questions, except that you probably shouldn't use a GA heuristic at all and should instead rely on reinforcement learning methods.
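On the input question specifically: a common alternative to feeding the raw board into the network is to hand-engineer a few summary features, such as column heights, holes, and bumpiness. Here is a minimal sketch of that idea; the 20x10 board size, the feature set, and the function name are illustrative assumptions, not something from the linked paper:

```python
import numpy as np

def board_features(board):
    """Summarize a binary Tetris board (rows x cols, 1 = filled) into
    a small feature vector: column heights, holes, and bumpiness."""
    rows, cols = board.shape
    heights = np.zeros(cols, dtype=int)
    holes = 0
    for c in range(cols):
        column = board[:, c]
        filled = np.nonzero(column)[0]
        if filled.size > 0:
            heights[c] = rows - filled[0]                  # height from the bottom
            holes += int(np.sum(column[filled[0]:] == 0))  # empty cells under the top block
    bumpiness = int(np.sum(np.abs(np.diff(heights))))      # neighbouring height differences
    return np.concatenate([heights, [holes, bumpiness]]).astype(float)

# Example: a 20x10 board with a few filled cells
board = np.zeros((20, 10), dtype=int)
board[19, :4] = 1
board[18, 1] = 1
print(board_features(board))  # 10 heights + holes + bumpiness = 12 inputs
```

A dozen such inputs tend to be much easier for a small network (or a GA) to exploit than 200 raw cells.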

Related

Clarification on a Neural Net that plays Snake

I'm new to neural networks/machine learning/genetic algorithms, and for my first implementation I am writing a network that learns to play Snake (an example, in case you haven't played it before). I have a few questions that I don't fully understand:
Before my questions, I just want to make sure I understand the general idea correctly. There is a population of snakes, each with randomly generated DNA. The DNA is the weights used in the neural network. Each time the snake moves, it uses the neural net to decide where to go (using a bias). When the population dies, select some parents (maybe the highest fitness), and crossover their DNA with a slight mutation chance.
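For reference, the crossover-and-mutation step over flattened weight vectors (the "DNA" described above) could look like the following minimal sketch; the population size, DNA length, and rates are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def crossover(parent_a, parent_b):
    """Uniform crossover: each gene (weight) comes from either parent."""
    mask = rng.random(parent_a.shape) < 0.5
    return np.where(mask, parent_a, parent_b)

def mutate(dna, rate=0.02, scale=0.5):
    """With probability `rate`, perturb a gene with Gaussian noise."""
    mask = rng.random(dna.shape) < rate
    return dna + mask * rng.normal(0.0, scale, dna.shape)

# DNA = flattened weight vector of the snake's network (sizes are examples)
population = [rng.normal(0, 1, 100) for _ in range(50)]
child = mutate(crossover(population[0], population[1]))
```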
1) If given the whole board as input (about 400 spots), enough hidden layers (no idea how many, maybe 256-64-32-2?), and enough time, would it learn not to box itself in?
2) What would be good inputs? Here are some of my ideas:
400 inputs, one for each space on the board. Positive if the snake should go there (the apple) and negative if it is a wall/your body. The closer the value is to -1 or 1, the closer the object is.
6 inputs: game width, game height, snake x, snake y, apple x, and apple y (it may learn to play on different-size boards if trained that way, but I'm not sure how to input its body, since it changes size).
Give it a field of view (maybe a 3x3 square in front of the head) that can alert the snake of a wall, apple, or its body. (The snake would only be able to see what's right in front of it, unfortunately, which could hinder its learning ability.)
3) Given the input method, what would be a good starting place for hidden layer sizes? (Of course I plan on tweaking this, I just don't know what a good starting place is.)
4) Finally, the fitness of the snake. Besides time to get the apple, its length, and its lifetime, should anything else be factored in? In order to get the snake to learn not to box itself in, is there anything else I could add to the fitness to help with that?
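For question 4, a toy fitness function along those lines might combine apples eaten, lifetime, and an idling penalty to discourage endless loops; all constants here are arbitrary placeholders to be tuned:

```python
def fitness(apples_eaten, lifetime, steps_since_last_apple, max_idle=200):
    """Toy fitness: reward apples heavily, reward survival a little,
    and penalize wandering without eating (discourages looping forever)."""
    score = apples_eaten * 500 + lifetime
    if steps_since_last_apple > max_idle:
        score -= steps_since_last_apple - max_idle  # idling penalty
    return max(score, 0)

print(fitness(apples_eaten=3, lifetime=400, steps_since_last_apple=50))   # 1900
print(fitness(apples_eaten=0, lifetime=600, steps_since_last_apple=600))  # 200
```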
Thank you!
In this post, I will advise you of:
How to map navigational instructions to action sequences with an LSTM neural network
Resources that will help you learn how to use neural networks to accomplish your task
How to install and configure neural network libraries based on what I needed to learn the hard way
General opinion of your idea:
I can see what you're trying to do, and I believe that your game idea (using randomly generated adversary identities that control how each adversary's AI behaves) has a lot of potential.
Mapping navigational instructions to action sequences with a neural network
For processing your game board, because it involves dense (as opposed to sparse) data, you may find a Convolutional Neural Network (CNN) useful (see the sketch after the reference list below). However, because you need to translate the map to an action sequence, sequence-optimized neural networks (such as Recurrent Neural Networks) will likely be the most useful for you. I did find some studies that use neural networks to map navigational instructions to action sequences, construct the game map, and move a character through a game with many types of inputs:
Mei, H., Bansal, M., & Walter, M. R. (2015). Listen, attend, and walk: Neural mapping of navigational instructions to action sequences. arXiv preprint arXiv:1506.04089. Available at: Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences
Summerville, A., & Mateas, M. (2016). Super Mario as a string: Platformer level generation via LSTMs. arXiv preprint arXiv:1603.00930. Available at: Super Mario as a String: Platformer Level Generation Via LSTMs
Lample, G., & Chaplot, D. S. (2016). Playing FPS games with deep reinforcement learning. arXiv preprint arXiv:1609.05521. Available at: Playing FPS Games with Deep Reinforcement Learning
Schulz, R., Talbot, B., Lam, O., Dayoub, F., Corke, P., Upcroft, B., & Wyeth, G. (2015, May). Robot navigation using human cues: A robot navigation system for symbolic goal-directed exploration. In Robotics and Automation (ICRA), 2015 IEEE International Conference on (pp. 1100-1105). IEEE. Available at: Robot Navigation Using Human Cues: A robot navigation system for symbolic goal-directed exploration
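To make the CNN suggestion above concrete, here is a minimal Keras sketch of a network that maps a board to a move distribution. The 20x20 board size, the cell encoding, and all layer sizes are my own assumptions for illustration:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Assumed encoding: one channel per cell (apple = +1, body/wall = -1, empty = 0)
model = keras.Sequential([
    keras.Input(shape=(20, 20, 1)),
    layers.Conv2D(16, kernel_size=3, activation="relu"),
    layers.Conv2D(32, kernel_size=3, activation="relu"),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(4, activation="softmax"),  # up / down / left / right
])
model.compile(optimizer="adam", loss="categorical_crossentropy")

board = np.zeros((1, 20, 20, 1), dtype="float32")
print(model.predict(board, verbose=0))  # one probability per direction
```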
General opinion of what will help you
It sounds like you're missing some basic understanding of how neural networks work, so my primary recommendation to you is to study more of the underlying mechanics behind neural networks in general. It's important to keep in mind that a neural network is a type of machine learning model. So, it doesn't really make sense to just construct a neural network with random parameters. A neural network is a machine learning model that is trained from sample data, and once it is trained, it can be evaluated on test data (e.g. to perform predictions).
The root of machine learning is largely influenced by Bayesian statistics, so you might benefit from getting a textbook on Bayesian statistics to gain a deeper understanding of how machine-based classification works in general.
It will also be valuable for you to learn the differences between different types of neural networks, such as Long Short Term Memory (LSTM) and Convolutional Neural Networks (CNNs).
If you want to tinker with how neural networks can be used for classification tasks, try this:
Tensorflow Playground
To learn the math:
My professional opinion is that learning the underlying math of neural networks is very important. If it's intimidating, I give you my testimony that I was able to learn all of it on my own. But if you prefer learning in a classroom environment, then I recommend that you try that. A great resource and textbook for learning the mechanics and mathematics of neural networks is:
Neural Networks and Deep Learning
Tutorials for neural network libraries
I recommend that you try working through the tutorials for a neural network library, such as:
TensorFlow tutorials
Deep Learning tutorials with Theano
CNTK tutorials (CNTK 205: Artistic Style Transfer is particularly cool.)
Keras tutorial (Keras is a powerful high-level neural network library that can use either TensorFlow or Theano.)
I have seen a similar application. The inputs were usually the snake coordinates, the apple coordinates, and some sensory data (in your case, whether or not a wall is next to the snake's head).
Using a genetic algorithm is a good idea in this case. You are doing only parametric learning (finding a set of weights), while the structure is based on your own estimation. A GA can also be used for structure learning (finding the topology of the ANN), but using a GA for both would be computationally very hard.
Professor Floreano did something similar. He used a GA to find the weights of a neural network controller for a robot. The robot was placed in a labyrinth and had to perform a task. The network's hidden layer was a single neuron with recurrent connections on the inputs and one lateral connection on itself. There were two outputs, which were connected to the input layer and to the hidden layer (the single neuron mentioned above).
But Floreano did something more interesting. He argued that we are not born with fixed synapses; our synapses change over our lifetime. So he used a GA to find the rules by which synapses change. These rules were based on Hebbian learning. He used node encoding (the same rule applies to all weights connected to a given neuron). At the beginning, he initialized the weights to small random values. Finding rules instead of numerical synapse values led to better results.
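To illustrate the rules-instead-of-weights idea: a generic parameterized Hebbian update has the form dw = eta * (A*pre*post + B*pre + C*post + D), and the GA searches over the coefficients (eta, A, B, C, D) rather than over the weights themselves. A minimal sketch of that general form (not necessarily Floreano's exact formulation; all values are illustrative):

```python
import numpy as np

def hebbian_update(w, pre, post, rule):
    """Generic parameterized Hebbian rule: the GA evolves the
    coefficients, not the weight itself."""
    eta, A, B, C, D = rule
    dw = eta * (A * pre * post + B * pre + C * post + D)
    return np.clip(w + dw, -1.0, 1.0)  # keep weights bounded

# One candidate rule from the GA population (values are illustrative)
rule = (0.05, 1.0, 0.0, 0.0, 0.0)  # pure Hebb: dw ~ pre * post
w = np.random.default_rng(1).uniform(-0.1, 0.1)  # small random init, as described
w = hebbian_update(w, pre=0.8, post=0.6, rule=rule)
```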
One of Floreano's articles.
And finally, my own experience. Last semester, a schoolmate and I were given the task of finding synapse-update rules with a GA, but for a spiking neural network (SNN). Our SNN was the controller for a kinematic model of a mobile robot, and the task was to lead the robot to a chosen point. We obtained some results, but not the ones we expected (you can see the results here). So I recommend you use an "ordinary" ANN instead of an SNN, because SNNs bring new phenomena with them.

Machine Learning, GA + BP or GA with huge NN?

Sorry for the poor title.
I'm currently studying ML and I want to focus on a problem using the toolset I have acquired, which excludes reinforcement learning.
I want to create a NN that takes a simple 2D game level (think of Mario in the simplest case: simple fitness function, simple controls, and easy feature selection) and outputs a key sequence.
Since we don't know the correct key sequence (KS), I see two options:
1) I find it out using a genetic algorithm, then use backprop or similar algorithms to associate levels with key sequences and predict a KS for a new level.
2) I build a huge NN and use a genetic algorithm to evolve its whole internal structure.
What are the pros and cons of each approach? Why should I implement one instead of the other? Please remember that I'm fairly new to the topic and want to solve this problem with what I've learned so far - the basics, really.
What you are suggesting is in essence reinforcement learning, i.e. trying out "semi-random" combinations and then using the rewards to train the network. The first approach is classical reinforcement learning, and the second is reinforcement learning using a neural network.
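As a bare-bones picture of the second option (evolving the whole network against a reward signal), the loop could look like this; the toy scoring function stands in for actually running a level, and all sizes and rates are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def play_level(weights):
    """Stub: run the policy network with these weights on the level and
    return the score. Replace with your actual game simulation."""
    return -np.sum((weights - 0.5) ** 2)  # toy objective for illustration

population = [rng.normal(0, 1, 50) for _ in range(20)]
for generation in range(100):
    scores = [play_level(w) for w in population]
    elite = [population[i] for i in np.argsort(scores)[-5:]]  # keep the best 5
    population = [e + rng.normal(0, 0.1, e.shape)             # mutated copies
                  for e in elite for _ in range(4)]
print(max(play_level(w) for w in population))
```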
If you want to approach the problem like this, a simple Google search will turn up plenty of tutorials and GitHub repos to help you.

Temporal Difference Learning and Back-propagation

I have read this page from Stanford - https://web.stanford.edu/group/pdplab/pdphandbook/handbookch10.html. I am not able to understand how TD learning is used in neural networks. I am trying to make a checkers AI that will use TD learning, similar to what they implemented in backgammon. Please explain how TD back-propagation works.
I have already referred to this question - Neural Network and Temporal Difference Learning
But I am not able to understand the accepted answer. Please explain with a different approach if possible.
TD learning is not used in neural networks. Instead, neural networks are used in TD learning to store the value (or q-value) function.
I think that you are confusing backpropagation (a neural network concept) with bootstrapping in RL. Bootstrapping uses a combination of recent information and previous estimations to generate new estimations.
When the state-space is large and it is not easy to store the value function in tables, neural networks are used as an approximation scheme to store the value function.
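As a minimal illustration, here is TD(0) with a linear value function standing in for the network; the update moves V(s) toward the bootstrapped target r + gamma * V(s'). The features and constants here are made up:

```python
import numpy as np

# Linear value function as the simplest "network": V(s) = w . phi(s)
w = np.zeros(4)

def V(phi):
    return w @ phi

def td0_update(phi_s, reward, phi_s_next, done, alpha=0.1, gamma=0.99):
    """TD(0): nudge V(s) toward the bootstrapped target r + gamma * V(s')."""
    global w
    target = reward + (0.0 if done else gamma * V(phi_s_next))
    td_error = target - V(phi_s)
    w += alpha * td_error * phi_s  # gradient of V w.r.t. w is phi_s
    return td_error

phi_s, phi_s2 = np.array([1, 0, 0, 0.0]), np.array([0, 1, 0, 0.0])
print(td0_update(phi_s, reward=1.0, phi_s_next=phi_s2, done=False))
```

With a deep network, the same TD error simply becomes the loss that backpropagation minimizes; the two mechanisms work together but remain distinct.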
The discussion of forward/backward views is more about eligibility traces, etc.: a case where RL bootstraps several steps ahead in time. However, this is not practical, and there are ways (such as eligibility traces) to leave a trail and update past states.
This should not be connected or confused with backpropagation in neural networks. It has nothing to do with it.

Can we use Deep learning techniques in binary classification?

Recently, I started reading about deep learning. Mainly, the weights are pre-trained using an unsupervised RBM network, and after that, neural networks with many hidden layers are used to address the task.
So my question is: can we use a DNN for a 2-class classification problem?
Thanks to the people who are going to respond.
Yes, you can do that with a simple logistic regression on top of your hidden layers (whatever you choose for those: RBMs, autoencoders, or something else).
Absolutely Yes!
As mentioned by Thomas, you can use logistic regression as your output layer. Another approach is to use a softmax layer with two classes as your output layer.
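A minimal sketch of both options in Keras (the input dimension and hidden size are arbitrary placeholders):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Option A: a single sigmoid unit (logistic regression on top of the hidden layers)
model_a = keras.Sequential([
    keras.Input(shape=(100,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model_a.compile(optimizer="adam", loss="binary_crossentropy")

# Option B: a softmax layer with two classes
model_b = keras.Sequential([
    keras.Input(shape=(100,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(2, activation="softmax"),
])
model_b.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```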
Good Luck!

Neural Network Learning Without Training Values

I am wondering how to go about training a neural network without providing it with training values. My premise is that the neural network(s) will be used on a robot that can receive positive/negative feedback from sensors. I.e., in order to train it to roam freely without bumping into things, positive feedback occurs when no collision or proximity sensors are triggered, and negative feedback occurs when the collision/proximity sensors ARE triggered. How can the neural network be trained using this method?
I am writing this in C++
What you describe is called reinforcement learning. It could be applied to neural networks, but does not require them in general. The canonical textbook to read on the subject is Reinforcement Learning: An Introduction by Richard Sutton and Andrew Barto. The connection between neural networks and reinforcement learning is explored in a bit more detail in the PDP Handbook by James McClelland.
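As a minimal sketch of the idea (shown in Python for brevity; the same update ports directly to C++), here is tabular Q-learning driven by exactly the kind of sensor-based reward you describe. The state discretization, dynamics, and reward below are stubs to replace with your robot's actual sensors:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 16, 4  # e.g. discretized sensor readings x possible moves
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.95, 0.1

def step(state, action):
    """Stub for the robot: return (next_state, reward).
    Reward: +1 when no collision/proximity sensor fires, -1 otherwise."""
    next_state = int(rng.integers(n_states))      # placeholder dynamics
    reward = 1.0 if rng.random() > 0.2 else -1.0  # placeholder sensor feedback
    return next_state, reward

state = 0
for _ in range(10_000):
    # epsilon-greedy action selection
    action = int(rng.integers(n_actions)) if rng.random() < epsilon else int(np.argmax(Q[state]))
    next_state, reward = step(state, action)
    # Q-learning update: bootstrap from the best next action
    Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state]) - Q[state, action])
    state = next_state
```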
Have you taken a look at SLAM? It's a technique robots can use to navigate an area while simultaneously building up and keeping a map of that area.
