0:00

So, let's speak a little bit about how hard it is to compute a Nash equilibrium

in a normal-form game. Let's start with a little

history. John von Neumann, one of the founders of modern game theory,

investigated zero-sum games and proved the existence of an equilibrium there.

And he used what's known as the Brouwer fixed point theorem for that. And that

led directly to algorithms for computing such equilibria

via linear programs. First was Dantzig's algorithm, which

relates to what in modern-day terms is called LP duality.

It's an exponential procedure in the worst case, although in practice it is used widely. Also of

note is Khachiyan's polynomial-time approach to solving linear programs,

although in practice it is not used as widely.

Now, when you go beyond zero-sum games,

so when John Nash proved the existence of equilibria for general-sum games, he

used the same fixed point theorem of Brouwer.

And that also informed a series of algorithms, and they're noted there on the

slide. We will be looking at two of them:

the Lemke-Howson algorithm, and a much more recent algorithm due to Ryan Porter

and others. I will note that all of these are

exponential in the worst case, and I'll get back to that later.

So, let's start with the Lemke-Howson algorithm.

And let's start with a formulation of the Nash equilibrium for 2-player games.

It looks as follows. This is a busy slide, but I'll walk

you through it and all will become clear.

At heart are two sets of variables, the s's and the r's.

The s's will capture the mixed strategies used by the two players,

player 1 and player 2. s2k, for example,

would be the probability with which player 2 plays action k in its

mixed strategy. So, the s1's and the s2's capture

the mixed strategies of the two players, player 1 and player 2.

The r's are what are called the slack variables.

And to understand their role, let's look at, for example, this equation right

there. So, this applies to any action of player

1. For any action of player 1, we look at

the value that it gives with respect to the strategy of the

other player. And so, we look at all the actions

available to player 2, and we look at the payoff to player 1 given

that he is playing a particular action, j, against each action of player 2.

ย 5:27

So, we're going to add this slack variable, as it's called, that will say:

this is how much player 1 is missing relative to their best response when

they're playing action j. Those are the slack variables.

And so now, that will also give us a sense of this condition here.

The slack variables are always non-negative.

And in a Nash equilibrium, they will be exactly zero, except for

strategies that are actually played with zero probability by the player.

So, now let's talk about the s's. The s's, as we said, capture the weight

in the mix that each player gives to each of their actions in the

strategy they play. And so, when player 1 plays action j

with non-zero probability, it had better be the case that they are best responding

with it, namely that the slack variable is zero. And when they're playing it with zero

probability, you don't care what the slack variable is, because they're not

playing that action at all. And you capture that by requiring

the product of the two to be zero. This is exactly the condition, and this

is what makes it a linear complementarity problem.
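These complementarity conditions are easy to state in code. As a minimal sketch (not from the lecture; the function name and the list-of-lists payoff representation are my own), here is a checker that verifies whether a candidate strategy profile of a 2-player game satisfies them:

```python
def is_nash_lcp(A, B, s1, s2, tol=1e-9):
    """Check the LCP conditions for a candidate 2-player Nash equilibrium.

    A[j][k]: payoff to player 1 when player 1 plays j and player 2 plays k.
    B[j][k]: payoff to player 2 for the same outcome.
    s1, s2: candidate mixed strategies (one probability per action).
    """
    # The s's must form probability distributions.
    for s in (s1, s2):
        if any(p < -tol for p in s) or abs(sum(s) - 1) > tol:
            return False

    # Expected payoff of each pure action against the opponent's mix.
    u1 = [sum(A[j][k] * s2[k] for k in range(len(s2))) for j in range(len(s1))]
    u2 = [sum(B[j][k] * s1[j] for j in range(len(s1))) for k in range(len(s2))]

    # Slack r_j = best-response value minus the action's value; always >= 0.
    r1 = [max(u1) - u for u in u1]
    r2 = [max(u2) - u for u in u2]

    # Complementarity: any action played with positive probability
    # must have zero slack, i.e. s_j * r_j = 0 for every action j.
    return all(s1[j] * r1[j] <= tol for j in range(len(s1))) and \
           all(s2[k] * r2[k] <= tol for k in range(len(s2)))
```

For matching pennies, for example, the uniform profile passes the check, while a pure strategy against the uniform mix fails, because a played action then carries positive slack.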

So, I hope that's clear, and you can see the conditions now similarly for player 2.

For player 2, we take each of their actions and we say: if they're going to

play it with some probability, then we look at their

best response given whatever mixed strategy player 1 is going to play.

And we look at the slack variable here, and again, we

require that the product be zero. In other words, the probability that they

play action j is nonzero just in case the slack variable is zero, okay?

So, this is the nature of this mathematical optimization program.

And, of course, I forgot to mention, but we also want the s's to be a

probability distribution. That is, they sum to one and they are all

nonnegative. Alright. So, this is our linear

complementarity program. And now come Lemke and Howson, who suggest

finding a solution in a particular way. We won't go over it, but the flavor

of it is to initialize the s's and the r's in a particular way.

In fact, we artificially initialize them all to zero, and then one by one, in

what's called a pivoting procedure, we take an s and an r in turn,

alternating between the two, taking a variable out of the set that currently holds value

and replacing it with its complementary variable:

if it's an r, replacing it with an s, and if it's an s, replacing it with an r,

until we get an equilibrium. That's the general flavor of it, and in

this lecture we won't go into more detail on the Lemke-Howson procedure.

But it is a procedure that looks very deeply at what a Nash equilibrium is,

sets it up as a mathematical program, and then searches the space of variables in

an informed way. Let's now look at a very different

procedure. One that doesn't look in as much detail

at the structure of equilibria, but compensates by performing heuristic

search in a certain way. So, let's look at it, and we'll look at

it in two stages. The first step is to note that when you

fix the supports of a strategy profile, finding out whether there is a Nash

equilibrium with that support is an easy problem.

Remember that the support of a mixed strategy consists of all the actions to which

the player gives nonzero probability in that mixed strategy.

So, let's look at this formulation, which will be limited to

two players. Let's look at the players in turn, for example, player 1,

and let's look at every action of that player, for example, a sub i.

We'll be looking for some mixed-strategy profile, p, one mixed strategy for each

of the players, that will give us a Nash equilibrium, namely one in which each player is best

responding. And so, for all actions in the support

that we're considering, we want the agent to be best

responding. So, let's assume that the best-response

value is v sub i; just call it that number.

Then, we want a sub i to in fact be a best response. And what we want is for

all other actions, the actions not in the support, to

give us a value that's no greater than the best-response value.

And we want that for each of the two players and each of their actions in the

support, so that makes sense. And we want the p's to form a probability distribution:

we want the probabilities in the support to be nonzero,

we want the probabilities outside the support to be zero,

and we want them indeed to sum to one.

This all makes sense. So, this is a linear program.

It can be solved in polynomial time; that is, theoretically there is a

polynomial-time procedure. In fact, the procedures used in practice are not

polynomial in the worst case, but they are nonetheless effective.

By the way, notice that we actually did use the assumption that we're fixing the

support here. Superficially, you might look at it and

say: oh, I could do the same thing but simply ignore the support part.

Where are we really using it? Well, we're using it in the assumption

that when we're best responding inside the support and don't have a profitable

deviation, we're actually playing these p_i's with

positive probability. Because if we're playing the remaining

strategies with zero probability, it doesn't matter whether we're best responding

with them or not. And so, this is where this assumption is

hidden. So, now we know that when we fix the

support, we can solve the question efficiently.
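The fixed-support step can be sketched concretely. The lecture poses it as a linear feasibility program; as a simplification, for supports of equal size (the balanced case the search heuristic favors anyway), the indifference conditions become a square linear system we can solve directly. The function names and representation here are my own, not the lecture's:

```python
def solve_linear(M, b):
    """Solve the square system M x = b by Gauss-Jordan elimination;
    return None if the system is singular."""
    n = len(M)
    A = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            return None
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def equilibrium_on_support(A, B, S1, S2, tol=1e-9):
    """Look for a Nash equilibrium of a 2-player game whose supports are
    exactly S1 (player 1's actions) and S2 (player 2's actions).
    Returns (s1, s2) or None. Sketch: only balanced supports handled."""
    m, n = len(A), len(A[0])
    k = len(S1)
    if k != len(S2):
        return None

    # Player 2's mix on S2 must make every action in S1 worth the same
    # value v1: sum_{c in S2} A[j][c]*s2[c] - v1 = 0 for each j in S1,
    # together with the probabilities summing to one.
    M = [[A[j][c] for c in S2] + [-1] for j in S1] + [[1] * k + [0]]
    x = solve_linear(M, [0] * k + [1])
    if x is None:
        return None
    s2_sub, v1 = x[:k], x[k]

    # Symmetrically, player 1's mix on S1 must equalize player 2's payoffs.
    M = [[B[r][c] for r in S1] + [-1] for c in S2] + [[1] * k + [0]]
    x = solve_linear(M, [0] * k + [1])
    if x is None:
        return None
    s1_sub, v2 = x[:k], x[k]

    # Probabilities on the support must be nonnegative.
    if any(p < -tol for p in s1_sub + s2_sub):
        return None

    # Expand to full strategy vectors (zero outside the support).
    s1, s2 = [0.0] * m, [0.0] * n
    for i, j in enumerate(S1):
        s1[j] = s1_sub[i]
    for i, c in enumerate(S2):
        s2[c] = s2_sub[i]

    # No action outside the support may beat the best-response value.
    for j in range(m):
        if sum(A[j][c] * s2[c] for c in range(n)) > v1 + tol:
            return None
    for c in range(n):
        if sum(B[r][c] * s1[r] for r in range(m)) > v2 + tol:
            return None
    return s1, s2
```

Note how the "hidden assumption" discussed above shows up: actions inside the support are forced to be indifferent (and played with nonnegative probability), while actions outside the support only need to be no better than the best-response value.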

The catch is the fact that there's an exponential number of supports to explore.

And this is the second part: we now need to start exploring the

sets of supports. I won't go into detail about how we

do it, but the basic idea is the following. We

will bias the search toward supports that are close in size to one another; that is,

we will not start by considering one agent randomizing among only two strategies

while the other agent looks at 17 strategies.

We'll look at strategy profiles whose supports are similar in size.

We also start with small supports and gradually move to larger supports.
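This ordering heuristic can be sketched in a few lines. Assuming supports are represented as tuples of action indices (the function name and the exact tie-breaking are my own; the real algorithm also interleaves this with pruning), one plausible enumeration is:

```python
from itertools import combinations

def supports_in_bias_order(m, n):
    """Enumerate candidate support pairs for an m-by-n 2-player game,
    most balanced and smallest supports first."""
    pairs = [(S1, S2)
             for size1 in range(1, m + 1)
             for size2 in range(1, n + 1)
             for S1 in combinations(range(m), size1)
             for S2 in combinations(range(n), size2)]
    # Prefer balanced supports first, then small total size.
    pairs.sort(key=lambda p: (abs(len(p[0]) - len(p[1])),
                              len(p[0]) + len(p[1])))
    return pairs
```

Each candidate pair would then be fed to the fixed-support test from the first step, stopping at the first pair that yields an equilibrium.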

If we do that, and we use another trick called conditional

dominance, that is, we look at certain things we can ignore along the way, then it

turns out that although the procedure is exponential in the worst case, as is

Lemke-Howson, in practice it performs quite well.

In fact, better than essentially all other procedures tried, including

Lemke-Howson. These procedures do have an exponential worst

case, and so the question is: can we do better?

Are there procedures that are less than exponential in the worst case?

And that takes us from the realm of concrete algorithms to the realm of

complexity analysis. So, let's first remind ourselves a little

bit about what complexity analysis looks like.

We're looking at whole classes of problems, such as the class of all

games and the problem of determining a sample Nash equilibrium of those games.

And we're asking how hard that class is as a whole. And

so, here is a small part of the complexity hierarchy.

The class P, as it's known, is the class of problems for which a polynomial-time

solution is known. The class NP is the class of problems for

which a solution can be verified in polynomial time, but not necessarily

found in polynomial time. The NP-complete problems are the hardest

among all problems in NP, that is, the problems to which all NP problems can be

reduced. And perhaps the biggest

open problem in theoretical computer science is the question of whether NP is

different from P, which is widely believed to be the case but has not been proved.

So, this is the general background we need to keep in mind.

And now, we can ask: where does the problem of finding a Nash

equilibrium reside? In P? In NP? What can we say?

Well, first of all, strictly speaking, we can't quite speak about it being in P or

NP, because we know from Nash's theorem that a Nash equilibrium always exists.

So, the question "does there exist a Nash equilibrium?" is trivial; the answer is

yes. So, we need to look at it a little

differently. One way to do that is to ask

for a Nash equilibrium with specific kinds of properties.

So, for example, we can ask: does the game have a unique Nash equilibrium? Or does there

exist an equilibrium that's strictly Pareto efficient?

Or is there a Nash equilibrium that guarantees a given player some minimum

payoff? Or can we guarantee some minimum social welfare in a

Nash equilibrium? Or does there exist a Nash equilibrium that includes

some particular action of a player in its support?

Or, conversely, one that excludes it? All of these, and others, are examples of

questions that are provably NP-hard. Okay. So, that tells us something about

the hardness. But we can still ask about just finding a sample Nash

equilibrium. How hard is that? Well, we've seen the

algorithms. People have tried very hard to find

algorithms for computing a sample Nash equilibrium.

And it does seem hard. The question is: can we somehow capture

that formally within the complexity hierarchy? For that, we need

to introduce some new concepts.

The essential concept is that of a new class of problems called PPAD, for

"Polynomial Parity Arguments on Directed graphs", introduced by Christos

Papadimitriou in 1994. We won't go into detail, but just so you

know the chronology: PPAD is a specialization of the class

called TFNP, which is in turn a specialization of a class called FNP.

Going into detail here is beyond the scope of what we want to speak

about. But it does help us position the

complexity of finding a sample Nash equilibrium in the complexity hierarchy.

And again, we have the class of polynomial-time problems,

and the class of problems whose solutions can be verified in polynomial time, with the

NP-complete problems being the hardest among them. And given that, PPAD turns out to reside

somewhere within this picture. Now again, we don't know whether this

entire hierarchy collapses and all the classes become one and the same.

It is widely believed that it does not, but a proof doesn't exist.

However, we do know that PPAD lies someplace in between P and NP.

Now, what does that have to do with the problem of computing a Nash equilibrium?

Well, that's where the following theorems come in.

Originally, it was shown that the problem of computing a Nash equilibrium is

complete for this class PPAD. That is, it's the hardest among all problems in

that class. This was proved initially for games with four players, then for games with

three or more players. And then, finally, in '06, for all

classes of games, including 2-player games. And so, we widely believe that the

problem is not solvable in polynomial time; we cannot prove it, but we do know where it resides

within the complexity hierarchy that we are familiar with.

ย