SPEAKER 1: It was about 1963

when a noted philosopher here at MIT, named Hubert Dreyfus– Hubert Dreyfus wrote a paper in

about 1963 in which he had a heading titled, “Computers

Can’t Play Chess.” Of course, he was subsequently invited

over to the artificial intelligence laboratory

to play the Greenblatt chess machine. And, of course, he lost. Whereupon Seymour Papert wrote

a rebuttal to Dreyfus’ famous paper, which had a subject

heading, “Dreyfus Can’t Play Chess Either.” But in a strange sense, Dreyfus

might have been right and would have been right if he

were to have said computers can’t play chess the way

humans play chess yet. In any case, around about 1968

a chess master named David Levy bet noted founder of

artificial intelligence John McCarthy that no computer would

beat the world champion within 10 years. And five years later, McCarthy

gave up, because it had already become clear that no

computer would win in a way that McCarthy wanted it to win,

that is to say by playing chess the way humans

play chess. But then 20 years after that

in 1997, Deep Blue beat the world champion, and chess

suddenly became uninteresting. But we’re going to talk about

games today, because there are elements of game-play that do

model some of the things that go on in our head. And if they don’t model things

that go on in our head, they do model some kind

of intelligence. And if we’re to have a general

understanding of what intelligence is all about, we

have to understand that kind of intelligence, too. So, we’ll start out by talking

about various ways that we might design a computer

program to play a game like chess. And we’ll conclude by talking

a little bit about what Deep Blue adds to the mix other

than tremendous speed. So, that’s our agenda. By the end of the hour, you’ll

understand and be able to write your own Deep Blue

if you feel like it. First, we want to talk about how

it might be possible for a computer to play chess. Let’s talk about several

approaches that might be possible. Approach number one is that

the machine might make a description of the board the

same way a human would; talk about pawn structure, King

safety, whether it’s a good time to castle, that

sort of thing. So, it would be analysis and

perhaps some strategy mixed up with some tactics. And all that would get mixed

up and, finally, result in some kind of move. If this is the game board, the

next thing to do would be determined by some process

like that. And the trouble is no one

knows how to do it. And so in that sense,

Dreyfus is right. None of the game-playing programs

today incorporate any of that kind of stuff. And since nobody knows

how to do that, we can’t talk about it. So we can talk about

other ways, though, that we might try. For example, we can have

if-then rules. How would that work? That would work this way. You look at the board,

represented by this node here, and you say, well, if it’s

possible to move the Queen pawn forward by one,

then do that. So, it doesn’t do any of

evaluation of the board. It doesn’t try anything. It just says let me look at the

board and select a move on that basis. So, that would be a way

of approaching a game situation like this. Here’s the situation. Here are the possible moves. And one is selected

on the basis of an if-then rule like so. And nobody can make a very

strong chess player that works like that. Curiously enough, someone has

made a pretty good checkers playing program that

works like that. It checks to see what moves are

available on the board, ranks them, and picks the

highest one available. But, in general, that’s not

a very good approach. It’s not very powerful. You couldn’t make it– well, when I say, couldn’t, it

means I can’t think of any way that you could make a

strong chess playing program that way. So, the third way to do this is

to look ahead and evaluate. What that means is you

look ahead like so. You see all the possible

consequences of moves, and you say, which of these board

situations is best for me? So, that would be an approach

that comes in here like so and says, which one of those three

situations is best? And to do that, we have to have

some way of evaluating the situation, deciding which

of those is best. Now, I want to do a little,

brief aside, because I want to talk about the mechanisms that

are popularly used to do that kind of evaluation. In the end, there are lots of

features of the chessboard. Let’s call them f1,

f2, and so on. And we might form some function

of those features. And that, overall, is called

the static value. So, it’s static because you’re

not exploring any consequences of what might happen. You’re just looking at the board

as it is, checking the King’s safety, checking

the pawn structure. Each of those produces a number

fed into this function, out comes a value. And that is a value of the

board seen from your perspective. Now, normally, this function,

g, is reduced to a linear polynomial. So, in the end, the most popular

kind of way of forming a static value is to take f1, multiply it by some constant c1, then add c2 times f2, and so on. And that is a linear

scoring polynomial. So, we could use that function
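To make the linear scoring polynomial concrete, here is a minimal sketch in Python; the feature values and weights below are invented for illustration, not taken from any real chess evaluator.

```python
# A static evaluator as a linear scoring polynomial:
# static value = c1*f1 + c2*f2 + ..., where the f's are board
# features (material, king safety, pawn structure, ...) and the
# c's are hand-picked weights. All numbers here are hypothetical.
def static_value(features, weights):
    """Score a board from the current player's perspective."""
    return sum(c * f for c, f in zip(weights, features))

# Hypothetical features: material balance, king safety, pawn structure.
features = [3.0, 0.5, -1.0]
weights = [1.0, 0.25, 0.5]
print(static_value(features, weights))  # 2.625
```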

to produce numbers from each of these things and then pick

the highest number. And that would be a way

of playing the game. Actually, a scoring polynomial

is a little bit more than we need. Because all we really need is

a method that looks at those three boards and says,

I like this one best. It doesn’t have to rank them. It doesn’t have to give

them numbers. All it has to do is say which

one it likes best. So, one way of doing that is

to use a linear scoring polynomial. But it’s not the only

way of doing that. So, that’s number two

and number three. But now what else might we do? Well, if we reflect back on some

of the searches we talked about, what’s the base case

against which everything else is compared, the way of

doing search that doesn’t require any intelligence,

just brute force? We could use the British Museum

algorithm and simply evaluate the entire tree of

possibilities; I move, you move, I move, you move,

all the way down to– what?– maybe 100, 50 moves. You do 50 things. I do 50 things. So, before we can decide if

that’s a good idea or not, we probably ought to develop

some vocabulary. So, consider this

tree of moves. There will be some

number of choices considered at each level. And there will be some

number of levels. So, the standard language for

this as we call this the branching factor. And in this particular case,

b is equal to 3. This is the depth of the tree. And, in this case, d is two. So, now that produces a certain

number of terminal or leaf nodes. How many of those are there? Well, that’s pretty simple

computation. It’s just b to the d. Right, Christopher,

b to the d? So, at this top level you have one. At this next level, you have b. And at this level, you have b squared. So, b to the d, in this

particular case, is 9. So, now we can use this

vocabulary that we’ve developed to talk about whether

it’s reasonable to just do the British Museum

algorithm, be done with it, forget about chess,

and go home. Well, let’s see. It’s pretty deep down there. If we think about chess, and we

think about a standard game which each person does

50 things, that gives a d about 100. And if you think about the

branching factor in chess, it’s generally presumed to be,

depending on the stage of the game and so on and so forth,

it varies, but it might average around 14 or 15. If it were just 10, that would

be 10 to the 100th. But it’s a little more than

that, because the branching factor is more than 10. So, in the end, it looks like,

according to Claude Shannon, there are about 10 to the 120th

leaf nodes down there. And if you’re going to go to a

British Museum treatment of this tree, you’d have to do

10 to the 120th static evaluations down there at the

bottom if you’re going to see which one of the moves

is best at the top. Is that a reasonable number? It didn’t use to seem

practicable. It used to seem impossible. But now we’ve got cloud

computing and everything. And maybe we could actually

do that, right? What do you think, Vanessa, can

you do that, get enough computers going in the cloud? No? You’re not sure? Should we work it out? Let’s work it out. I’ll need some help, especially

from any of you who are studying cosmology. So, we’ll start with

how many atoms are there in the universe? Volunteers? 10 to the– SPEAKER 2: 10 to the 38th? SPEAKER 1: No, no, 10 to the

38th has been offered. That’s way too low. The last time I looked, it was

about 10 to the 80th atoms in the universe. The next thing I’d like to know

is how many seconds are there in a year? It’s a good number

to have memorized. That number is approximately

pi times 10 to the seventh. So, how many nanoseconds

in a second? That gives us 10 to the ninth. At last, how many years

are there in the history of the universe? SPEAKER 3: [INAUDIBLE]. 14.7 billion. SPEAKER 1: She offers something

on the order of 10 billion, maybe 14 billion. But we’ll say 10 billion to make

our calculation simple. That’s 10 to the 10th years. If we add that up, 80 plus 9 plus 7 plus 10, that’s 10 to the 106th: the number of nanoseconds in the history of the universe multiplied by the number of atoms in the universe. So, if all of the atoms in the

universe were doing static evaluations at nanosecond speeds

since the beginning of the Big Bang, we’d still be 14

orders of magnitude short. So, it’d be a pretty

good cloud. It would have to harness

together lots of universes. So, the British Museum

algorithm is not going to work. No good. So, what we’re going to have to
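The back-of-the-envelope argument above is just exponent bookkeeping, and it can be checked in a few lines of Python, using the lecture's round numbers:

```python
# Exponents (base 10) from the lecture's estimates:
atoms = 80      # ~10^80 atoms in the universe
ns_per_s = 9    # 10^9 nanoseconds per second
s_per_year = 7  # ~pi * 10^7 seconds per year, call it 10^7
years = 10      # ~10^10 years since the Big Bang

available = atoms + ns_per_s + s_per_year + years  # atom-nanoseconds, as an exponent
needed = 120                                       # Shannon's ~10^120 leaf nodes

print(available)           # 106
print(needed - available)  # 14 orders of magnitude short
```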

do is we’re going to have to put some things together

and hope for the best. So, the fifth way is the way

we’re actually going to do it. And what we’re going to do is

we’re going to look ahead, not just one level, but as

far as possible. We consider, not only the

situation that we’ve developed here, but we’ll try to push that

out as far as we can and look at these static values of

the leaf nodes down here and somehow use that as a way

of playing the game. So, that is number five. And number four is going

all the way down there. And this, in the end, is

all that we can do. This idea is multiply invented

most notably by Claude Shannon and also by Alan Turing, who,

I found out from a friend of mine, spent a lot of lunchtime conversations talking with each other about how a computer might play chess, anticipating the future when there would be computers. So, Donald Michie and Alan

Turing also invented this over lunch while they were taking

some time off from cracking the German codes. Well, what is the method? I want to illustrate the method

with the simplest possible tree. So, we’re going to have a

branching factor of 2 not 14. And we’re going to have a

depth of 2 not something highly serious. Here’s the game tree. And there are going

to be some numbers down here at the bottom. And these are going to be the

value of the board from the perspective of the player

at the top. Let us say that the player at

the top would like to drive the play as much as possible

toward the big numbers. So, we’re going to call that

player the maximizing player. He would like to get over here

to the 8, because that’s the biggest number. There’s another player, his

opponent, which we’ll call the minimizing player. And he’s hoping that the play

will go down to the board situation that’s as

small as possible. Because his view is the opposite

of the maximizing player, hence the

name minimax. But how does it work? Do you see which way the

play is going to go? How do you decide which way

the play is going to go? Well, it’s not obvious

at a glance. Do you see which way

it’s going to go? It’s not obvious

at a glance. But if we do more than a glance,

if we look at the situation from the perspective

of the minimizing player here at the middle level, it’s

pretty clear that if the minimizing player finds himself

in that situation, he’s going to choose

to go that way. And so the value of this

situation, from the perspective of the minimizing

player, is 2. He’d never go over

there to the 7. Similarly, if the minimizing

player is over here with a choice between going toward

a 1 or toward an 8, he’ll obviously go toward a 1. And so the value of that board

situation, from the perspective of the minimizing

player, is 1. Now, we’ve taken the scores down

here at the bottom of the tree, and we back them

up one level. And you see how we can

just keep doing this? Now the maximizing player can

see that if he goes to the left, he gets a score of 2. If he goes to the right, he

only gets a score of 1. So, he’s going to

go to the left. So, overall, then, the

maximizing player is going to have a 2 as the perceived value

of that situation there at the top. That’s the minimax algorithm. It’s very simple. You go down to the bottom of the

tree, you compute static values, you back them up level

by level, and then you decide where to go. And in this particular
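The procedure just described can be written out in a few lines; this is a hypothetical Python sketch over an explicit game tree, not code from any real chess program.

```python
# Minimax over an explicit game tree. A tree is either a number
# (the static value at a leaf) or a list of subtrees. The maximizer
# and minimizer alternate levels.
def minimax(tree, maximizing):
    if isinstance(tree, (int, float)):  # leaf: just report its static value
        return tree
    values = [minimax(child, not maximizing) for child in tree]
    return max(values) if maximizing else min(values)

# The lecture's example: b = 2, d = 2, maximizer at the top.
tree = [[2, 7], [1, 8]]
print(minimax(tree, True))  # 2: minimizer backs up 2 and 1, maximizer picks 2
```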

situation, the maximizer goes to the left. And the minimizer goes to the

left, too, so the play ends up here, far short of the 8 that

the maximizer wanted and more than the 1 that the

minimizer wanted. But this is an adversarial

game. You’re competing with

each other. So, you don’t expect to get

what you want, right? So, maybe we ought to see if

we can make that work. There’s a game tree. Do you see how it goes? Let’s see if the system

can figure it out. There it goes, crawling its

way through the tree. This is a branching factor of

2, just like our sample, but now four levels. You can see that it’s got quite

a lot of work to do. That’s 2 to the fourth, one,

two, three, four, 2 to the fourth, 16 static evaluations

to do. So, it found the answer. But it’s a lot of work. We could get a new tree and

restart it, maybe speed it up. There it goes down that

way, get a new tree. Those are just random numbers. So, each time it’s going to find

a different path through the tree according to the

numbers that it’s generated. Now, 16 isn’t bad. But if you get down there around

10 levels deep and your branching factor is 14, well,

we know those numbers get pretty awful pretty fast, because

the number of static evaluations to do down

there at the bottom goes as b to the d. It’s exponential. And time has shown, if you get

down about seven or eight levels, you’re a jerk. And if you get down about 15

or 16 levels, you beat the world champion. So, you’d like to get as far

down in the tree as possible. Because when you get as far

down into the tree as possible, what happens is that these crude measures of board quality

begin to clarify. And, in fact, when you get far

enough, the only thing that really counts is piece count,

one of those features. If you get far enough, piece

count and a few other things will give you a pretty good idea

of what to do if you get far enough. But getting far enough

can be a problem. So, we want to do everything

we can to get as far as possible. We want to pull out every trick

we can find to get as far as possible. Now, you remember when we talked

about branch and bound, we knew that there were some

things that we could do that would cut off whole portions

of the search tree. So, what we’d like to do is find

something analogous in this world of games, so we can cut

off whole portions of this search tree, so we don’t

have to look at those static values. What I want to do is I want to

come back and redo this thing. But this time, I’m going

to compute the static values one at a time. I’ve got the same structure

in the tree. And just as before, I’m going to

assume that the top player wants to go toward the maximum

values, and the next player wants to go toward the

minimum values. But none of the static values

have been computed yet. So, I better start

computing them. That’s the first

one I find, 2. Now, as soon as I see that 2, as

soon as the minimizer sees that 2, the minimizer knows that

the value of this node can’t be any greater than 2. Because he’ll always choose to

go down this way if this branch produces a

bigger number. So, we can say that the

minimizer is assured already that the score there will be

equal to or less than 2. Now, we go over and compute

the next number. There’s a 7. Now, I know this is exactly

equal to 2, because he’ll never go down toward a 7. As soon as the minimizer says

equal to 2, the maximizer says, OK, I can do equal

to or greater than 2. One, minimizer says equal

to or less than 1. Now what? Do we compare those two numbers? The maximizer knows that if he

goes down here, he can’t do better than 1. He already knows if he goes

over here, he can get a 2. It’s as if this branch

doesn’t even exist. Because the maximizer would

never choose to go down there. So, you have to see that. This is the important essence

of the notion of the alpha-beta algorithm, which is a layering

on top of minimax that cuts off large sections of

the search tree. So, one more time. We’ve developed a situation so

we know that the maximizer gets a 2 going down to the left,

and he sees that if he goes down to the right, he

can’t do better than 1. So, he says to himself, it’s

as if that branch doesn’t exist and the overall

score is 2. And it doesn’t matter what

that static value is. It can be 8, as it was,

it can be plus 1,000. It doesn’t matter. It can be minus 1,000. Or it could be plus infinity

or minus infinity. It doesn’t matter, because

the maximizer will always go the other way. So, that’s the alpha-beta

algorithm. Can you guess why it’s called

the alpha-beta algorithm? Well, because in the algorithm

there are two parameters, alpha and beta. So, it’s important to understand

that alpha-beta is not an alternative to minimax. It’s minimax with a flourish. It’s something layered on top

like we layered things on top of branch and bound to make

it more efficient. We layer stuff on top

of minimax to make it more efficient. And you say to me, well, that’s
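Here is that flourish as a hypothetical Python sketch, layered on the same kind of explicit game tree as before; alpha and beta are the two parameters the algorithm is named for.

```python
import math

# Minimax with alpha-beta layered on top: alpha is the best score the
# maximizer is already assured of, beta the best the minimizer is
# assured of. Once alpha >= beta, the remaining children of a node
# cannot affect the answer, so they are cut off.
def alphabeta(tree, maximizing, alpha=-math.inf, beta=math.inf):
    if isinstance(tree, (int, float)):  # leaf: its static value
        return tree
    if maximizing:
        for child in tree:
            alpha = max(alpha, alphabeta(child, False, alpha, beta))
            if alpha >= beta:
                break  # cut off: the minimizer above would never come here
        return alpha
    else:
        for child in tree:
            beta = min(beta, alphabeta(child, True, alpha, beta))
            if alpha >= beta:
                break  # cut off
        return beta

tree = [[2, 7], [1, 8]]
print(alphabeta(tree, True))  # 2, exactly the same answer as plain minimax
```

On the lecture's example, the 8 in the right-hand subtree is never examined: after seeing the 1, the maximizer already knows he prefers the left branch.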

a pretty easy example. And it is. So, let’s try a little

bit more complex one. This is just to see if I can

do it without screwing up. The reason I do one that’s

complex is not just to show how tough I am in front

of a large audience. But, rather, there’s certain

points of interest that only occur in a tree of depth

four or greater. That’s the reason for

this example. But work with me and let’s

see if we can work our way through it. What I’m going to do is I’ll

circle the numbers that we actually have to compute. So, we actually have

to compute 8. As soon as we do that, the

minimizer knows that that node is going to have a score of

equal to or less than 8 without looking at

anything else. Then, he looks at 7. So, that’s equal to 7. Because the minimizer will

clearly go to the right. As soon as that is determined,

then the maximizer knows that the score here is equal

to or greater than 8. Now, we evaluate the 3. The minimizer knows equal

to or less than 3. SPEAKER 4: [INAUDIBLE]. SPEAKER 1: Oh, sorry, the

minimizer at 7, yeah. OK, now what happens? Well, let’s see, the maximizer

gets a 7 going that way. He can’t do better than 3 going

that way, so we got another one of these

cut off situations. It’s as if this branch

doesn’t even exist. So, this static evaluation

need not be made. And now we know that that’s not

merely equal to or greater than 7, but exactly

equal to 7. And we can push that

number back up. That becomes equal to

or less than 7. OK, are you with me so far? Let’s get over to the other

side of the tree as quickly as possible. So, there’s a 9, equal to or

less than 9, 8 equal to 8, push the 8 up equal to

or greater than 8. The minimizer can go down

this way and get a 7. He’ll certainly never go

that way where the maximizer can get an 8. Once again, we’ve

got a cut off. And if this branch didn’t exist,

then that means that these static evaluations

don’t have to be made. And this value is

now exactly 7. But there’s one more

thing to note here. And that is that not only do

we not have to make these static evaluations down here,

but we don’t even have to generate these moves. So, we save two ways, both on

static evaluation and on move generation. This is a real winner, this

alpha-beta thing, because it saves an enormous amount

of computation. Well, we’re on the way now. The maximizer up here is

guaranteed equal to or greater than 7. Has anyone found the winning

move yet? Is it to the left? I know, but we better keep going, because we don’t want to trust any oracles. So, let’s see. There’s a 1. We’ve calculated that. The minimizer can be guaranteed

equal to or less than 1 at that particular

point. Think about that for a while. At the top, the maximizer

knows he can go left and get a 7. The minimizer, if the play ever

gets here, can ensure that he’s going to drive the

situation to a board number that’s 1. So, the question is will

the maximizer ever permit that to happen? And the answer is surely not. So, over here in the development

of this side of the tree, we’re always comparing

numbers at adjacent levels in the tree. But here’s a situation where

we’re comparing numbers that are separated from each

other in the tree. And we still concluded that no

further examination of this node makes any sense at all. This is called deep cut off. And that means that this whole

branch here might as well not exist, and we won’t have to

compute that static value. All right? So, it looks– you have this stare of

disbelief, which is perfectly normal. I have to reconvince myself

every time that this actually works. But when you think your way

through it, it is clear that these computations that

I’ve x-ed out don’t have to be made. So, let’s carry on and see if we

can complete this equal to or less than 8, equal

to 8, equal to 8– because the other branch

doesn’t even exist– equal to or less than 8. And we compare these two

numbers, do we keep going? Yes, we keep going. Because maybe the maximizer

can go to the right and actually get to that 8. So, we have to go over here

and keep working away. There’s a nine, equal

to or less than 9, another 9 equal to 9. Push that number up equal

to or greater than 9. The minimizer gets an

8 going this way. The maximizer is assured of

getting a 9 going that way. So, once again, we’ve got

a cut off situation. It’s as if this doesn’t exist. Those static evaluations

are not made. This move generation is not made

and computation is saved. So, let’s see if we can do

better on this very example using this alpha-beta idea. I’ll slow it down a little bit

and change the search type to minimax with alpha-beta. We see two numbers on each of

those nodes now, guess what they’re called. We already know. They’re alpha and beta. So, what’s going to happen is

as the algorithm proceeds through the tree, those numbers are

going to shrink wrap themselves around

the situation. So, we’ll start that up. Two static evaluations

were not made. Let’s try a new tree. Two different ones

were not made. A new tree, still again, two

different ones not made. Let’s see what happens when we

use the classroom example, the one I did up there. Let’s make sure that I

didn’t screw it up. I’ll slow that down to 1. 2, same answer. So, you probably didn’t realize

it at the start. Who could? In fact, the play goes down that

way, over this way, down that way, and ultimately to

the 8, which is not the biggest number. And it’s not the smallest

number. It’s the compromise number that’s arrived at by virtue of the fact that this is an

adversarial situation. So, you say to me, how much

energy, how much work do you actually save by doing this? Well, it is the case that in

the optimal situation, if everything is ordered right,

if God has come down and arranged your tree in just

the right way, then the approximate amount of work you

need to do, the approximate number of static evaluations

performed, is approximately equal to 2 times b

to the d over 2. We don’t care about this 2. We care a whole lot

about that 2. That’s the amount of

work that’s done. It’s b to the d over 2,

instead of b to the d. What’s that mean? Suppose that without

this idea, I can go down seven levels. How far can I go down

with this idea? 14 levels. So, it’s the difference

between a jerk and a world champion. So, that, however, is only in

the optimal case when God has arranged things just right. But in practical situations,

practical game situations, it appears to be the case,

experimentally, that the actual number is close to this

approximation for optimal arrangements. So, you’d never not want

to use alpha-beta. It saves an amazing

amount of time. You could look at

it another way. Suppose you go down the same

number of levels, how much less work do you have to do? Well, quite a bit. The square root of the work,

right? That’s another way of looking

at how it works. So, we could go home at this
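A quick numeric check of that square-root claim, with an illustrative chess-like branching factor and depth:

```python
# With perfect ordering, alpha-beta needs about 2 * b**(d/2) static
# evaluations instead of b**d, so the same budget buys roughly twice
# the search depth. Numbers here are illustrative.
b, d = 14, 8
minimax_work = b ** d               # plain minimax to depth 8
alphabeta_work = 2 * b ** (d // 2)  # alpha-beta to depth 8, optimally ordered

print(minimax_work // alphabeta_work)  # 19208 times fewer evaluations
```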

point except for one problem, and that is that we pretended

that the branching factor is always the same. But, in fact, the branching

factor will vary with the game state and will vary

with the game. So, you can calculate how much

computing you can do in two minutes, or however much time

you have for an average move. And then you could say,

how deep can I go? And you won’t know for

sure, because it depends on the game. So, in the earlier days of

game-playing programs, the game-playing program left a

lot of computation on the table, because it would make a

decision in three seconds. And it might have made a much

different move if it used all the computation it

had available. Alternatively, it might be

grinding away, and after two minutes was consumed, it had no move and just

did something random. That’s not very good. But that’s what the early

game-playing programs did, because no one knew how

deep they could go. So, let’s have a look at the

situation here and say, well, here’s a game tree. It’s a binary game tree. That’s level 0. That’s level 1. This is level d minus 1. And this is level d. So, down here you

have a situation that looks like this. And I left all the game

tree out in between. So, how many leaf nodes

are there down here? b to the d, right? Oh, I’m going to forget about

alpha-beta for a moment. As we did when we looked at

some of those optimal searches, we’re going to add

these things one at a time. So, forget about alpha-beta,

assume we’re just doing straight minimax. In that case, we would have to

calculate all the static values down here

at the bottom. And there are b to the d of those. How many are there at

this next level up? Well, that must be b

to the d minus 1. How many fewer nodes are there

at the second to the last, the penultimate level, relative

to the final level? Well, 1 over b, right? So, if I’m concerned about not

getting all the way through these calculations at the d

level, I can give myself an insurance policy by calculating

out what the answer would be if I only went

down to the d minus 1th level. Do you get that insurance

policy? Let’s say the branching factor

is 10, how much does that insurance policy cost me? 10% of my computation. Because I can do this

calculation and have a move in hand here at level d minus 1 for

only 1/10 of the amount of the computation that’s required

to figure out what I would do if I go all the way

down to the base level. OK, is that clear? So this idea is extremely

important in its general form. But we haven’t quite got there

yet, because what if the branching factor turns out to be

really big and we can’t get through this level either? What should we do to

make sure that we still have a good move? SPEAKER 5: [INAUDIBLE]. SPEAKER 1: Right, we can do

it at the d minus 2 level. So, that would be up here. And at that level, the amount

of computation would be b to the d minus 2. So, now we’ve added 10%

plus 10% of that. And our knee-jerk reaction is beginning

to form, right? What are we going to do in the

end to make sure that no matter what we’ve got a move? CHRISTOPHER: Start from

the very first– SPEAKER 1: Correct, what’s

that, Christopher? CHRISTOPHER: Start from

the very first level? SPEAKER 1: Start from the very

first level and give ourselves an insurance policy for every

level we try to calculate. But that might be real costly. So, we better figure out if this

is going to be too big of an expense to bear. So, let’s see, if we do what

Christopher suggests, then the amount of computation we need

in our insurance policy is going to be equal to 1, we’re going to do it up here at this level, too, even though we don’t need it, just to make

everything work out easy. 1 plus b, that’s getting our

insurance policy down here at this first level. And we’re going to add b squared

all the way down to b to the d minus 1. That’s how much we’re going to

spend getting an insurance policy at every level. I wish I remembered some of that high school algebra, right? Let’s just do it for fun. Oh, unfortunate choice of variable names. b times s is equal to– well, we’re going to multiply

all those by b. Now, we’ll subtract the first

one from the second one, which tells us that the amount of

calculation needed for our insurance policy is equal

to b to the d minus 1 over b minus 1. Is that a big number? We could do a little algebra on
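That sum can also be checked numerically; a small sketch with illustrative values of b and d:

```python
# An insurance policy at every level costs
# 1 + b + b**2 + ... + b**(d-1) = (b**d - 1) / (b - 1),
# which is approximately b**(d-1): hardly more than insuring
# the single penultimate level. Values chosen for illustration.
b, d = 10, 6
series = sum(b ** k for k in range(d))  # 1 + b + ... + b**(d-1)
closed_form = (b ** d - 1) // (b - 1)

print(series == closed_form)  # True
print(series / b ** (d - 1))  # 1.11111: barely more than the last term alone
```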

that and say that b to the d is a huge number. So, that minus one

doesn’t count. And b is probably 10 to 15. So, b minus 1 is, essentially,

equal to b. So, that’s approximately equal to

b to the d minus 1. So, with an approximation

factored in, the amount of computation needed to do

insurance policies at every level is not much different from

the amount of computation needed to get an insurance

policy at just one level, the penultimate one. So, this idea is called

progressive deepening. And now we can visit our gold
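Progressive deepening itself can be sketched in a few lines; a hypothetical Python version, where the evaluate function stands in for a depth-limited minimax search and everything here is invented for illustration:

```python
import time

# Progressive deepening: search depth 1, then 2, then 3, ..., always
# keeping the best move found so far, so that some answer is ready
# the moment the clock runs out -- an anytime algorithm.
def progressive_deepening(moves, evaluate, time_limit, max_depth=10):
    deadline = time.monotonic() + time_limit
    best = moves[0]  # insurance: always have *some* move in hand
    for depth in range(1, max_depth + 1):
        if time.monotonic() >= deadline:
            break    # out of time: return the insured move
        best = max(moves, key=lambda m: evaluate(m, depth))
    return best

# Toy stand-in for a depth-limited search: each move's value is fixed.
scores = {"a": 1, "b": 5, "c": 3}
print(progressive_deepening(list(scores), lambda m, d: scores[m], 0.05))  # b
```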

star idea list and see how these things match

up with that. First of all, the dead horse

principle comes to the fore when we talk about alpha-beta. Because we know with alpha-beta

that we can get rid of a whole lot of the tree and

not do static evaluation, not even do move generation. That’s the dead horse we

don’t want to beat. There’s no point in doing that

calculation, because it can’t figure into the answer. The development of the

progressive deepening idea, I like to think of in terms of

the martial arts principle, we’re using the enemy’s

characteristics against them. Because of this exponential

blow-up, we have exactly the right characteristics to have

a move available at every level as an insurance policy

against not getting through to the next level. And, finally, this whole idea

of progressive deepening can be viewed as a prime example

of what we like to call anytime algorithms that always

have an answer ready to go as soon as an answer is demanded. So, as soon as that clock runs

out at two minutes, some answer is available. It’ll be the best one that the

system can compute in the time available given the

characteristics of the game tree as it’s developed so far. So, there are other kinds

of anytime algorithms. This is an example of one. That’s how all game playing

programs work, minimax, plus alpha-beta, plus progressive

deepening. Christopher, is alpha-beta

an alternative to minimax? CHRISTOPHER: No. SPEAKER 1: No, it’s not. It’s something you layer

on top of minimax. Does alpha-beta give you a

different answer from minimax? CHRISTOPHER: No. SPEAKER 1: No, it doesn’t. Let’s see everybody

shake their head one way or the other. It does not give you an answer

different from minimax. That’s right. It gives you exactly

the same answer, not a different answer. It’s a speed-up. It’s not an approximation. It’s a speed-up. It cuts off lots of the tree. It’s a dead horse principle

at work. You got a question,

Christopher? CHRISTOPHER: Yeah, since all

of the lines progressively [INAUDIBLE], is it possible to

keep a temporary value if the value [INAUDIBLE] each node of

the tree and then [INAUDIBLE]? SPEAKER 1: Oh, excellent

suggestion. In fact, Christopher

has just– I think, if I can jump ahead

a couple steps– Christopher has reinvented

a very important idea. Progressive deepening not only

ensures you have an answer at any time, it actually improves

the performance of alpha-beta when you layer alpha-beta

on top of it. Because these values that are

calculated at intermediate parts of the tree are used to

reorder the nodes under the tree so as to give you maximum

alpha-beta cut-off. I think that’s what you

said, Christopher. But if it isn’t, we’ll talk

about your idea after class. So, this is what every game

playing program does. How is Deep Blue different? Not much. So, Deep Blue, as of 1997, did

about 200 million static evaluations per second. And it went down, using

alpha-beta, about 14, 15, 16 levels. So, Deep Blue was minimax,

plus alpha-beta, plus progressive deepening, plus

a whole lot of parallel computing, plus an opening book,

plus special purpose stuff for the end game, plus– perhaps the most important

thing– uneven tree development. So far, we’ve pretended that the

tree always goes up in an even way to a fixed level. But there’s no particular reason

why that has to be so. Some situation down at the

bottom of the tree may be particularly dynamic. In the very next move, you might

be able to capture the opponent’s Queen. So, in circumstances like that,

you want to blow out a little extra search. So, eventually, you get to

the idea that there’s no particular reason to

have the search go down to a fixed level. But, instead, you can develop

the tree in a way that gives you the most confidence

that your backed-up numbers are correct. That’s the most important of

these extra flourishes added by Deep Blue when it beat

Kasparov in 1997. And now we can come back

and say, well, you understand Deep Blue. But is this a model of

anything that goes on in our own heads? Is this a model of any kind

of human intelligence? Or is it a different kind

of intelligence? And the answer is

mixed, right? Because we are often in

situations where we are playing a game. We’re competing with another

manufacturer. We have to think what the other

manufacturer will do in response to what we do

down several levels. On the other hand, is going

down 14 levels what human chess players do when they win

the world championship? It doesn’t seem, even to them,

like that’s even a remote possibility. They have to do something

different, because they don’t have that kind of computational

horsepower. This is doing computation in the

same way that a bulldozer processes gravel. It’s substituting raw power

for sophistication. So, when a human chess master

plays the game, they have a great deal of chess knowledge

in their head and they recognize patterns. There are famous experiments,

by the way, that demonstrate this in the following way. Show a chessboard to a chess

master and ask them to memorize it. They’re very good at that, as

long as it’s a legitimate chessboard. If the pieces are placed

randomly, they’re no good at it at all. So, it’s very clear that they’ve

developed a repertoire of chess knowledge that makes

it possible for them to recognize situations and play

the game much more like number 1 up there. So, Deep Blue is manifesting

some kind of intelligence. But it’s not our intelligence. It’s bulldozer intelligence. So, it’s important to understand

that kind of intelligence, too. But it’s not necessarily the

same kind of intelligence that we have in our own head. So, that concludes what we’re

going to do today. And, as you know, on Wednesday

we have a celebration of learning, which is familiar to

you if you take a 309.1. And, therefore, I will

see you on Wednesday, all of you, I imagine.
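
The recipe the lecture closes with — minimax, plus alpha-beta, plus progressive deepening — can be sketched in a few dozen lines. Below is a minimal sketch in Python on a toy game tree; the nested-list tree, the leaf values, the averaging stand-in for a real static evaluator, and all function names are illustrative assumptions, not Deep Blue's actual code.

```python
# A minimal sketch, not Deep Blue: minimax with alpha-beta cut-off, plus
# progressive deepening, on a toy game tree given as nested lists.
import math
import time

def static_value(node):
    # Stand-in heuristic for when we cannot search deeper: a leaf is its
    # own value; for an interior node, average the leaves beneath it.
    if isinstance(node, int):
        return node
    stack, leaves = [node], []
    while stack:
        n = stack.pop()
        if isinstance(n, int):
            leaves.append(n)
        else:
            stack.extend(n)
    return sum(leaves) / len(leaves)

def alpha_beta(node, depth, alpha, beta, maximizing):
    # Gives exactly the same answer as plain minimax -- it is a speed-up,
    # not an approximation.
    if depth == 0 or isinstance(node, int):
        return static_value(node)
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alpha_beta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:       # dead horse principle: skip the rest
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alpha_beta(child, depth - 1, alpha, beta, True))
        beta = min(beta, value)
        if alpha >= beta:
            break
    return value

def best_move_anytime(children, max_depth, deadline):
    # Progressive deepening: search to depth 1, then 2, ... keeping the
    # best move found so far, so some answer is always ready the moment
    # the clock runs out (an "anytime" algorithm).
    best, order = None, list(range(len(children)))
    for depth in range(1, max_depth + 1):
        if time.monotonic() > deadline:
            break
        alpha, scored = -math.inf, []
        for i in order:
            v = alpha_beta(children[i], depth - 1, alpha, math.inf, False)
            alpha = max(alpha, v)
            scored.append((v, i))
        scored.sort(reverse=True)
        best = (scored[0][1], scored[0][0])   # (move index, backed-up value)
        # Christopher's idea: reorder moves by the shallower pass's values
        # so the next, deeper pass tries strong moves first, giving
        # alpha-beta more cut-offs.
        order = [i for _, i in scored]
    return best

# The lecture's two-move example: the maximizer's choices lead to leaves
# (2, 7) and (1, 8); the backed-up minimax value of the left move is 2.
print(best_move_anytime([[2, 7], [1, 8]], 2, time.monotonic() + 1.0))  # → (0, 2)
```

Deep Blue layered a great deal on top of this same skeleton — roughly 200 million static evaluations per second, an opening book, special-purpose endgame machinery, and uneven tree development — but the skeleton is the same minimax, plus alpha-beta, plus progressive deepening.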

