AI Learns to Play Snake – Explained


Just in case you haven’t had a childhood, I’m gonna briefly explain how Snake works. Believe it or not, you play as a snake, and your objective is, unsurprisingly, to avoid death.
You can steer left or right to pick up fruit that makes you grow, and if you hit a wall, you die.

It’s a pretty complicated game for an AI, so part of my job was to simplify it. Some of the complication comes from the fact that, even though the map is symmetrical, the snake has to learn the same thing 4 times, since it doesn’t know how to rotate the image. To simplify it, I turned the game into a first-person shooter for the snake, giving it as an input what it would see from its own frame of reference. It can see whether there is fruit, a wall or nothing in the direction of its sight, and it can see in 5 directions. So, it gets a 1 if it can see something in that direction, and a 0 if it can’t.

This input gets processed in the following layers, using matrix multiplication and sigmoid functions, and we get an output. The snake can decide to either turn left, keep going straight or turn right.

The fitness function was another hard thing to decide. If you just give it a point every time it gets a fruit, the first generations will get no points, and when a snake gets some out of pure luck, it is going to find it hard to relate the fact it got a point with picking up fruit. Points need to be given frequently, so small improvements in the intelligence are reflected in the scoring.

With that in mind I thought, “okay, why not give it a point for every turn it survives?”

Why not.

Why not.

Looking for ideas on how to solve it, I found this article that solves it by awarding the snake a point each time it gets closer to the fruit, and penalizing it with 1 and a half points for going away from it. This makes it so that a loop results in negative points for the snake. Then, I awarded 15 points for each fruit, and penalized it with 50 points if it hit its own tail, because for some reason they wouldn’t stop doing that.

We have been seeing it learning for a while, let’s fast forward a bit through the learning process, and see some of the interesting stuff…

A bit about the genetic algorithm:
If you know something about natural selection, this will be extremely familiar to you. We start with 60 completely random snakes. We test them by making them play, and we select the fittest specimens. Then, the less fortunate ones… they kinda die. We create new, hopefully better individuals by breeding the fittest ones together, combining them and adding small random mutations. Since we still have the best specimens from the previous round, this round will be at least as good as the previous one, and thanks to the randomness, improvement is possible. This may seem too simple, but evolution works extremely well.

Sooo that’s all I have for you in this video. I hope you liked it. If you did, please like and subscribe, and I’ll see you in the next one.
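
For anyone reading along rather than watching, the decision step and the scoring rule described earlier can be sketched roughly like this in Python. This is not the code from the video. The two flags per direction (fruit, wall), the hidden layer size, the missing bias terms, and the names `random_brain`, `decide` and `score_step` are all assumptions made for illustration. Only the 5 directions, the 3 possible moves, the sigmoid layers and the point values come from the explanation above. The sketch also assumes that distance is counted in grid steps and that the tail penalty or fruit reward replaces the closer/farther point on that move, which the video does not specify.

```python
import math
import random

N_DIRECTIONS = 5               # from the video: the snake looks in 5 directions
N_INPUTS = N_DIRECTIONS * 2    # assumption: one fruit flag and one wall flag per direction
N_HIDDEN = 8                   # assumption: the hidden layer size is not given
ACTIONS = ("left", "straight", "right")


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer(weights, inputs):
    """One layer: a matrix-vector product, then a sigmoid on each result."""
    return [sigmoid(sum(w * v for w, v in zip(row, inputs))) for row in weights]


def random_brain(rng):
    """Two random weight matrices (hidden and output), with no bias terms."""
    hidden = [[rng.uniform(-1, 1) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
    output = [[rng.uniform(-1, 1) for _ in range(N_HIDDEN)] for _ in range(len(ACTIONS))]
    return hidden, output


def decide(brain, vision):
    """Run the 0/1 vision inputs through both layers and pick the strongest output."""
    hidden, output = brain
    scores = layer(output, layer(hidden, vision))
    return ACTIONS[scores.index(max(scores))]


def score_step(old_distance, new_distance, ate_fruit, hit_tail):
    """Fitness change for one move, using the point values quoted in the video."""
    if hit_tail:
        return -50.0
    if ate_fruit:
        return 15.0
    # Assumption: any move that does not get closer counts as moving away.
    return 1.0 if new_distance < old_distance else -1.5


if __name__ == "__main__":
    brain = random_brain(random.Random(0))
    nothing_in_sight = [0] * N_INPUTS
    print(decide(brain, nothing_in_sight))
```

With these values, a four-move loop that gets closer twice and farther twice adds up to 1 + 1 - 1.5 - 1.5 = -1, which matches the point about loops ending up with negative points.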

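The generation step of the genetic algorithm can be sketched in a similar way. Again, this is an illustration under assumptions, not the implementation from the video. The video gives the population of 60 and the cycle of selecting, breeding and mutating. The number of survivors, the uniform crossover, the Gaussian mutation with its rate and size, and the genome stored as a flat list of numbers are placeholders chosen here, and `toy_fitness` is a stand-in for actually playing a game of Snake.

```python
import random

POPULATION = 60        # from the video: 60 random snakes to start with
N_PARENTS = 12         # assumption: how many of the fittest survive each round
MUTATION_RATE = 0.05   # assumption: chance that a single gene gets nudged
MUTATION_SIZE = 0.2    # assumption: standard deviation of the nudge


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one parent or the other."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(genome, rng):
    """Add a small random change to a few genes and leave the rest alone."""
    return [g + rng.gauss(0, MUTATION_SIZE) if rng.random() < MUTATION_RATE else g
            for g in genome]


def next_generation(population, fitness, rng):
    """Keep the fittest genomes unchanged and fill the rest with their children."""
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[:N_PARENTS]
    children = []
    while len(parents) + len(children) < len(population):
        mom, dad = rng.sample(parents, 2)
        children.append(mutate(crossover(mom, dad, rng), rng))
    return parents + children


def toy_fitness(genome):
    """Hypothetical stand-in score: higher when the genes are closer to zero."""
    return -sum(g * g for g in genome)


if __name__ == "__main__":
    rng = random.Random(0)
    population = [[rng.uniform(-1, 1) for _ in range(10)] for _ in range(POPULATION)]
    for generation in range(20):
        population = next_generation(population, toy_fitness, rng)
    print(max(toy_fitness(g) for g in population))
```

Because the survivors are copied over unchanged, the best score in the new generation cannot drop as long as a genome gets the same score every time it is measured. If fruit positions are random, the same snake can score differently from one run to the next, so in that case the guarantee is only approximate.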