[MUSIC] So here in Lecture 9.4, we're going to continue looking at iterative improvement placement. In the previous lecture, we showed a very simple scheme for building a placer: randomly locate the gates on the grid that we use to model the chip, randomly perturb or swap the gates, measure the estimated change in the wirelength, and just continue until things stop getting better. And what we showed is that you can actually take a pretty bad placement and make it an okay placement. The problem is, you can't make it a very good placement. It turns out that what one needs to be able to do is not only perturb the placement randomly to make it better, but also, in some interesting and controlled manner, perturb the placement to make it a little worse. There are interesting problems with local minima versus global minima, and we're going to talk about all of that in a lecture called hill climbing. That's the generic term for the kind of technique we need in order to build a better quality placer. So let's go look at iterative improvement with hill climbing. >> What's the problem with that simple random iterative improvement placement algorithm that I showed you in the last lecture? It looked like it was making good progress: we started with a wirelength of something like 45,000 and we fell to a wirelength of something like 25,000 for that little 2,500-gate problem. That sounds like a pretty good answer. It's okay, but it's not as good as you can do, and it turns out to be surprisingly far from how good you can do. In order to give you some sense of why that happens, I need to perform, if you will, a sort of thought experiment. Imagine that you could plot the total sum of the half-perimeter wirelength for all possible placements and make a graph. If you could, the axes of this graph would span all possible placement solutions you can find.
Now, you cannot actually do this, because if you have a layout with 2,500 gates, as I just showed you, this is a 2,501-dimensional graph: the first 2,500 dimensions are every place you could put every gate, and the 2,501st dimension is the wirelength. So just pretend that I could graph this thing. All right? I'm showing you two dimensions on this little plot, which has a whole bunch of hills and valleys: a great big peak in the middle, a medium-size peak, and some big, deep valleys on the left and the right. Imagine that you could plot this. The vertical axis is the sum of half-perimeter wirelength, and I'm looking for small values. What is true about the random iterative improvement method that I showed you? What is true is that I only took swaps of the gates that improved the wirelength. Said differently, the first thing I did was generate a random initial placement. That puts a point someplace on this landscape of hills and valleys in the cost, and I'm going to show that initial random placement as this orange circle. These things are called, by the way, balls and hills diagrams; no surprise. The hills are all the possible cost values for all of my solutions, and the ball is the first random placement. That ball is probably pretty high up on a peak someplace, because that random placement is probably really bad; it probably has a really big half-perimeter wirelength. And the problem is that every time I accepted a swap, I only accepted swaps of the gates that made the wirelength better. So I only went downhill on this conceptual hills-and-valleys cost landscape. And you know what I did? I fell into a local minimum, a local minimum close to the random initial placement I started with, which was probably pretty bad.
And so, here is the problem: these kinds of random iterative improvement techniques only take changes to the layout that make the layout better, which is to say have less half-perimeter wirelength. They easily get stuck in local minima, and the problem is that there are lots of better answers: lots of less-bad local minima, and higher-quality, smaller-wirelength global minima. But I don't know how to find them. I need some other ideas. What I'd really like is a method that, instead of only going downhill from the random initial placement and falling into the nearest local minimum, could go uphill and maybe get to better placements. I'm going to show you a little example: an uphill method would allow me, for a time, to explore placements that are worse than the last one, in the hopes that I'll go up over this peak and start falling into a better, more global valley. That would be really wonderful. But, wow, that seems incredibly hard, and I'm going to circle the big problem here: how would I control this? The reason I showed you the simple random iterative improvement placer is that it's easy to explain. There's a grid, with one gate per slot in the grid. I measure the sum of the half-perimeter wirelengths. I randomly grab two gates, I swap them, and I see if things got better. I keep the swap if it got better, and I go until I'm tired. That sounds very simple, but it works. How would I control a placement algorithm if I'm allowed to grab two gates, exchange their positions, and make the placement worse? How will it ever converge? Well, it turns out there's an answer. So let's draw a diagram to summarize where we are. The kind of placer that you know right now is random iterative improvement, but these things actually have a name: this is a greedy algorithm.
A greedy algorithm is one in which every decision you make is a positive one: you always just do the next best thing you can do. So, step 1, you grab two random gates, 1 and 2. I'm showing you a little tiny version of that grid that had four gates in it: two black gates, two red gates, a blue net, a green net, and an orange net. And I swapped gates 1 and 2, the red gates. Now, some of the nets are longer and some of the nets are shorter. It's exactly the same picture as before. Step 1, grab two gates and exchange them. Step 2, evaluate delta L, the change in half-perimeter wirelength. Do this incrementally, please, so that you do it efficiently. Step 3 is the if conditional: is delta L less than 0, meaning the wirelength got smaller? If this is true, step 4: keep the swap. If this is false, step 5: undo the swap. That last step is really where I need some new ideas. The idea that I should swap things to see if I can improve them? That's good. The idea that I should evaluate delta L, the change in the wirelength, to evaluate the quality? That's good. The idea that I should ask the question, did it get worse or did it get better? That's good. The idea that if it got better, I should keep it? That's good. The idea that if it got worse, I must undo it and throw it away: that is what's got to go. We have to have something smarter here, and to have something smarter, I need a really big idea. The big idea is something I'm going to call improvement with hill climbing, and we're going to explain that. I've grayed out here the previous parts of the algorithm that I said are good and worth keeping: 1, do the swap; 2, evaluate delta L, the wirelength change; 3, check if delta L got smaller; 4, if yes, keep it. And what I'm going to do now is add some new sophistication. So what happens now is, amazingly enough, a thermometer appears on the right-hand side of the slide.
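Before adding the new idea, it may help to see the greedy loop above written out. The following is a minimal Python sketch, not the course's actual placer: the grid, the nets, and the `hpwl` helper are illustrative assumptions, and for clarity it recomputes the full wirelength each time rather than the incremental delta-L the lecture recommends.

```python
import random

def hpwl(placement, nets):
    """Total half-perimeter wirelength: for each net, the width plus height
    of the bounding box around its gates' (x, y) positions."""
    total = 0
    for net in nets:
        xs = [placement[g][0] for g in net]
        ys = [placement[g][1] for g in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

def greedy_pass(placement, nets, n_swaps, rng=random):
    """Greedy iterative improvement: only keep swaps that shrink wirelength."""
    gates = list(placement)
    best = hpwl(placement, nets)
    for _ in range(n_swaps):
        a, b = rng.sample(gates, 2)                              # step 1: grab two random gates
        placement[a], placement[b] = placement[b], placement[a]  # ... and tentatively swap them
        new = hpwl(placement, nets)
        delta_l = new - best                                     # step 2: change in wirelength
        if delta_l < 0:                                          # step 3: did it get smaller?
            best = new                                           # step 4: yes, keep the swap
        else:
            placement[a], placement[b] = placement[b], placement[a]  # step 5: no, undo it
    return best
```

A real placer would compute delta-L incrementally, touching only the nets connected to the two swapped gates, since recomputing every net on every swap is far too slow at scale.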
Yes, a thermometer, like a little mercury thermometer, because everybody, I hope, recognizes that. And it says New: there's a hill-climbing control parameter, which is easiest to think of as a temperature. Depending on how hot it is, that's how much hill climbing I'm going to allow in my algorithm. And now, when delta L, the wirelength change resulting from this swap, didn't get smaller, when it got bigger, I'm going to evaluate a new mathematical function, P. P is a function of delta L, the change in the wirelength, and P is also a function of T, the temperature. The result, P, is a probability, a number between 0 and 1, and it's going to be the probability that we should accept the swap. What's going to happen is that I'm going to use that probability, and I'm going to generate a random number. Yes, a real random number, and that's why I'm showing you a picture of dice. I'm going to generate a number, and I'm going to randomly keep that swap with that probability. Now, this is all probably very confusing, so let's talk about it a little more algorithmically. Again, I'm showing you the same picture of the swap: two black gates, two red gates, three nets, blue, orange, and green. I swap gate 1 and gate 2; the blue net looks longer, the orange net is the same length, the green net looks shorter. I calculate delta L, the change in the length, and it is longer. This swap made the wirelength worse. I'm also showing you a picture of the thermometer. So what happens next? The delta L number and the thermometer temperature go into a mathematical function: e to the minus delta L over T. That generates a number, and that number, because it's e to the minus something positive, is between 0 and 1, so I can treat it like a probability. That number might come out to be something like 0.72.
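The acceptance probability just described is one line of math. Here is a small sketch of it in Python; the function name `accept_probability` is my own choice, not something from the lecture.

```python
import math

def accept_probability(delta_l, temperature):
    """Probability of keeping a swap whose wirelength change is delta_l.

    Downhill moves (delta_l <= 0) are always kept; uphill moves are kept
    with probability e^(-delta_l / T), which shrinks as the move gets
    worse (larger delta_l) or as the temperature T gets colder.
    """
    if delta_l <= 0:
        return 1.0
    return math.exp(-delta_l / temperature)
```

Note the two limits: when T is huge, the exponent is near 0 and the probability is near 1 (almost any uphill move is allowed); when T is tiny, the probability collapses toward 0 for any uphill move.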
The way you think of that is that there is a 72% chance I should keep this swap: 72% of the time, I should keep it, and 28% of the time, which is 1 minus 0.72, I should not. I'm going to randomly keep this swap. How do I do that? In most programming languages, if you call something like random() or rand(), you get a uniform random number between 0 and 1. So I'm going to generate R, a uniform random number between 0 and 1, and compare R to P. If R is less than P, I keep the swap. If R is greater than P, I undo the swap. That's what it means to take this swap with probability P, because I'm going to get a different answer depending on how the random number turns out. Sometimes, for a swap with exactly this delta L at exactly this temperature, I won't take the swap: 28% of the time, in fact. And sometimes I will take the swap: 72% of the time. I know this sounds very strange, but this is how it works, and we're going to talk about it in more detail. So, this is the macroscopic, big picture. This is a new random improvement algorithm. It's almost the same as the old greedy one, but it's got two big changes. The first is this hill-climbing parameter T, which you can think of as a temperature. The way this actually works is that you start the temperature hot and you do many swaps. Then you cool the temperature a little bit and do many swaps, and then you cool the temperature some more and do many swaps. So the temperature starts out very hot and gets cold as time progresses. And at any given temperature, we evaluate this magic function, e to the minus delta L over T, and if the swap makes the wirelength worse, we accept it randomly with that probability. So here's how the algorithm goes at a very high level.
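The compare-R-to-P step can be sketched directly. This is an illustrative helper, not code from the course; the strict inequality matches the "if R is less than P, keep it" rule in the lecture.

```python
import math
import random

def maybe_accept(delta_l, temperature, rng=random):
    """Decide whether to keep a swap: always keep improvements, and keep a
    worsening swap with probability P = e^(-delta_l / T), by drawing a
    uniform random R in [0, 1) and keeping the swap when R < P."""
    if delta_l < 0:
        return True                        # improvement: always keep
    p = math.exp(-delta_l / temperature)   # e.g. p = 0.72: keep 72% of the time
    r = rng.random()                       # uniform random R in [0, 1)
    return r < p                           # keep the swap with probability p
```

Calling this many times with the same delta-L and T will keep the swap a fraction of the time that converges to P, which is exactly the behavior the dice picture is meant to convey.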
At the start of my new random iterative improvement algorithm, I set the temperature to be very hot. And you know what happens when the temperature is very hot? Each of the minus delta L over T. T is a really big number, e to the minus delta L over T is e to the minus something really close to 0. That's really close to 1. The probabilities are very high that I can take a big uphill move. And so, one of the things that becomes possible is it becomes possible to go uphill a lot. You can take a move, a perturbation, an exchange, a swap, a change in the placement that really makes it worse, because it's hot. But, it doesn't stay hot forever. We cool the temperature slowly. Over all of the swaps. And when the temperature is cool, it is no longer possible to go uphill so big, it is only possible to go a moderate way uphill, but you can always take a downhill move. Right? And as the temperature gets colder and colder, you can't go uphill at all. So, the big uphill moves. Well, I can't really do those anymore. Their probability is just too low. And even the medium size uphill moves, the probability is too low, but I can always go downhill. So that's how you control the hill-climbing. The hill-climbing is controlled by a temperature parameter. You start the problem hot and you cool it slowly over many, many swaps. When the temperature is hot, there is a high probability of being able to take a large uphill move. When the temperature is cold, there is a low probability. That's how you control the uphill exploration, so you just don't keep messing up the placement. And that's how you put this interesting form of randomness down deep inside the inner loop of this placement algorithm. So that's things at a very high level. And in the next lecture, we're going to sort of dive into this one in some algorithmic detail and explain why this really works. [SOUND].
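The whole control scheme just described, start hot, do many swaps at each temperature, cool by a little, repeat, can be sketched as a generic loop. The callback names (`cost`, `propose`, `undo`) and the particular schedule parameters below are my own illustrative assumptions; the lecture describes the scheme, not this interface.

```python
import math
import random

def anneal(cost, propose, undo, t_start=100.0, t_end=0.1,
           cooling=0.95, swaps_per_t=1000, rng=random):
    """Skeleton of the hill-climbing control loop: start hot, do many swaps
    at each temperature, then cool by a constant factor until cold.

    Hypothetical callbacks: propose() makes one random perturbation and
    returns its delta-L, undo() reverses the most recent perturbation,
    and cost() returns the current total wirelength (or any cost).
    """
    t = t_start
    while t > t_end:
        for _ in range(swaps_per_t):
            delta_l = propose()                      # one random swap
            if delta_l < 0:
                continue                             # improvement: keep it
            if rng.random() < math.exp(-delta_l / t):
                continue                             # uphill: keep with probability e^(-dL/T)
            undo()                                   # otherwise reverse the swap
        t *= cooling                                 # cool slowly
    return cost()
```

When T is large, nearly every uphill move passes the probability test; as T shrinks geometrically, first the big uphill moves and then the medium ones stop being accepted, and the loop degenerates into the greedy algorithm, which is exactly the behavior the lecture describes.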