0:43

Statistical variability can have serious consequences in circuit design.

I'll give you an example. You will recall that we've been plotting the log of ID

versus VGS, between 0 and the maximum value.

0 would be used to turn the device off in digital operation, and VDD to turn

it on as much as possible and draw the maximum current from it.

And we have been using a log ID axis in order to reveal several orders of

magnitude of current here, encompassing weak inversion and moderate

inversion and strong inversion. And so we used to draw a single curve

that was a straight line in weak inversion and then curved in moderate

and strong inversion. But if the parameters of the transistor

vary, then for each transistor you measure, you're likely to get a different

curve. So you might end up with something like

this. Now this can be disastrous and the reason

is that some devices will have an off-current that is pretty small.

But some of them will have orders of magnitude more of current.

Recall that this is a logarithmic axis over here.

And that means that you cannot really turn those devices off.
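
To put rough numbers on how strongly the off current reacts to a threshold shift, here is a minimal sketch in Python. It assumes that in weak inversion the current changes by one decade for every `swing` volts of gate voltage, so lowering the threshold by some amount multiplies the off current by 10 raised to (shift / swing). The function name and the 0.1 V per decade swing are illustrative assumptions, not values from this lecture.

```python
def off_current_ratio(delta_vt, swing=0.1):
    """Factor by which the off current grows when VT is lowered by delta_vt volts.

    swing is an assumed subthreshold swing in volts per decade
    (hypothetical value, chosen only for illustration).
    """
    return 10.0 ** (delta_vt / swing)


for shift in (0.1, 0.2):
    ratio = off_current_ratio(shift)
    print(f"VT lower by {shift:.1f} V -> off current multiplied by about {ratio:.0f}")
```

With a steeper (smaller) swing, the same threshold shift would move the off current by even more decades.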

If you had a threshold shift of one or two tenths of a volt, you can have a change in

the off current by orders of magnitude. Now some terminology.

First of all, chips are made on silicon disks.

They can be as large as a large plate. And here I show you one such disk.

It's called a wafer. And on it, every square is one chip.

For example, this one. And chips are often called dies. One chip

is one die; dies is the plural, but sometimes die is also used for the plural.

Now, variability can be observed between two transistors on the same chip.

So this will be called variability within a die.

Or, you could have two different chips on a wafer.

So this would be chip-to-chip or die-to-die variation.

Now wafers are made in lots. Several wafers are fabricated together

and in a given lot, let's say we have two wafers, there could be wafer-to-wafer

variations: the same device, but on different wafers.

Of course, you do expect to see some variation from one to the other.

3:28

Or you could have one lot fabricated at a given time.

And another lot fabricated at a different time, possibly a different day or even a

different month. You do expect to have lot-to-lot

variations. And finally you also have fab to fab

variations. You may have different fabrication

facilities. Both of them are supposed to run the same

process. But of course you do expect

variations from one fab to another. The types of variations we get are

global, which means that there are variations in

the average values of parameters between, let's say, wafers or lots.

Or you could have local variations, due to random variation even in adjacent

transistors, and sometimes this is called mismatch.

And there the dimensions of the devices play a key role.

So let me give you an example in very simple terms.

Let us say we have two pairs of neighboring square-gate devices.

These two and these two. So these are large devices, these are

small devices. Now, the question is, which pair shows

better matching characteristics? Are these two devices better matched than

these two devices? So now, we see here that we have some

lack of smoothness. We have some roughness in the surface of

the channel in the two cases. And likewise, for the bottom of the

silicon gate, we also have some roughness.

And these are local, random variations. Of course, the variations here cannot be

repeated there. They're different, but their average

value is similar. In other words, if you would like

to define an average oxide thickness in the two cases, it will be similar

for these two devices. And that would lead you to a similar

oxide capacitance per unit area, which will lead you to a similar threshold

voltage for these two devices. Now, if you go to these two devices,

things can be very different, right? Because the local variations cannot be

expected to average out if you only have such small dimensions.

So that means that this device can have an oxide capacitance per unit area which

is significantly different from that of this device.

And consequently, the thresholds can be significantly different.

You can say similar things about other parameters like doping concentration.

So we do expect to see the gate area figuring in an important way in the

matching characteristics of transistors due to local variations.
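
The averaging argument can be checked numerically. The sketch below is a toy model, not something taken from the lecture: it treats the gate as a grid of independent patches, each with its own random relative deviation in oxide thickness, and compares how much the gate-averaged thickness varies from device to device for a small and a large gate. The patch count, the 5% per-patch spread and the function name are all assumptions.

```python
import random
import statistics


def gate_average_spread(patches_per_side, n_devices=2000, patch_sigma=0.05, seed=0):
    """Std. dev., across devices, of the gate-averaged relative thickness deviation."""
    rng = random.Random(seed)
    n_patches = patches_per_side ** 2
    averages = []
    for _ in range(n_devices):
        # Each patch gets an independent local deviation; the device "sees" the mean.
        total = sum(rng.gauss(0.0, patch_sigma) for _ in range(n_patches))
        averages.append(total / n_patches)
    return statistics.pstdev(averages)


small = gate_average_spread(patches_per_side=2)    # small gate: 4 patches
large = gate_average_spread(patches_per_side=10)   # large gate: 100 patches
print(f"small gate spread: {small:.4f}")
print(f"large gate spread: {large:.4f}")
print(f"ratio: {small / large:.1f} (theory for independent patches: sqrt(100 / 4) = 5)")
```

In this toy model the spread of the average falls as one over the square root of the number of patches, that is, of the gate area.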

So let's talk about the threshold variation with respect to its mean

value. What I will do now is, on two axes I will

plot the thresholds of two devices next to each other,

identically laid out as before, and on the horizontal axis I will have the

deviation of the threshold of device 1 from its mean value.

And on the vertical axis I will do the same for the second device.

So, here is one situation. In this case we have large W and L.

This is the deviation from the mean of the threshold voltage for device 1, and

this is deviation from the same mean for device 2.

Let's take one device represented by this asterisk.

This means that for this device, if we go down vertically, the threshold is found to

be about 15 millivolts above the mean. And if you go like this, you may find

that the second device has a threshold that is, let's say, 17 millivolts above the

mean. You can see that both of them deviate

from the mean by a similar amount. Similarly, if you go to a device here, you

can see the deviation for both of them is something like minus 20 millivolts.

So there is correlation of the threshold in the two devices.

Now, different symbols here represent different lots.

So the asterisks represent devices all obtained from the same lot.

The open circles represent devices obtained from a second lot and so on.

You can see that all devices within a lot tend to have similar threshold

deviations. And all devices from another lot also

have similar threshold deviations, but the two differ.

So the mean value will differ from lot to lot, but the local variations are rather

small, and the reason is that we have a large W and a large L.

Intuitively, we explained what happens then: averaging takes place, and therefore the

local variations do not have much of an effect.

Now, if instead you have small W and L, then the plot looks something like this.

By the way, these are measurements. So this is the deviation of device one

from the mean, the deviation of device two from the mean in terms of the threshold

voltage. And you can have situations like, let's

say, this one, where it shows that for this pair, device one deviates from the

mean by something like 30 millivolts and device two deviates from the same mean by

something like -50 millivolts. You can see that the correlation is much

smaller, and the reason is you have a very small W and L.

9:02

So the first plot shows you what happens if the main effect is global variations

in the average value of the physical parameters.

Whereas the second plot shows you what happens when local random variations

dominate, and you have large mismatch between adjacent, identically laid out

devices. This can be a serious issue especially in

analog design, where you really try to match the characteristics of two devices

next to each other. Now how do we model variability?

The ideal way is to focus on independent physical parameters, for example, oxide

thickness and substrate doping. And for some of these parameters you use

relative variations. For example, the oxide thickness is equal to some nominal

value plus some deviation because of global effects, plus some deviation

because of local effects. For these local effects, we have already

seen that, in order to suppress them, you need to have a large gate area.

And in fact it turns out that the variance of this is proportional to one over the

gate area, where variance is the square of the standard deviation.

You can do similar things for substrate doping and mobility, you can model them

again using relative variations just like we did here.

Now let us take some parameters, for example, the flatband voltage.

Now, it doesn't make sense to talk about relative variations, because let's say if

flatband voltage is equal to zero nominally, then any variation from it

would correspond to an infinite percent variation.

So, it really does not make sense to talk about relative variation, so we talk

instead about absolute variations. So VFB has a nominal value plus a change

due to global parameter variation, plus a change due to local variations, which are

not normalized to the mean. But the variance of the local variation

still turns out to be inversely proportional to the gate area.
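
As a sketch of the two modeling styles just described, the functions below build a relatively modeled parameter (oxide thickness) and an absolutely modeled one (flatband voltage) from a nominal value, a global deviation and a local deviation. The function names, the constant passed to `local_sigma` and all numerical values are hypothetical placeholders; only the structure, nominal plus global plus local with a local variance proportional to 1/(WL), follows the discussion above.

```python
import math


def local_sigma(a_const, w, l):
    """Std. dev. of a local variation whose variance scales as 1 / (W * L)."""
    return a_const / math.sqrt(w * l)


def oxide_thickness(tox_nom, rel_global, rel_local):
    """Relative model: deviations are fractions of the nominal value."""
    return tox_nom * (1.0 + rel_global + rel_local)


def flatband_voltage(vfb_nom, d_global, d_local):
    """Absolute model: deviations are in volts, so a nominal value of 0 is no problem."""
    return vfb_nom + d_global + d_local


# Hypothetical numbers, for illustration only.
tox = oxide_thickness(2e-9, rel_global=0.02, rel_local=-0.01)   # meters
vfb = flatband_voltage(0.0, d_global=0.010, d_local=-0.003)     # volts
print(f"tox = {tox * 1e9:.3f} nm, VFB = {vfb * 1e3:.1f} mV")

# Quadrupling the gate area halves the local standard deviation.
print(local_sigma(3e-9, 1e-6, 1e-6), local_sigma(3e-9, 2e-6, 2e-6))
```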

Now let's take delta W as another example.

Delta W, I'll remind you, is the correction you need to apply to the mask

width of a transistor in order to arrive at its real channel width.

And that delta W turns out, again, to have a nominal value, plus some change

due to global variations, and some change due to local variations.

So, let's say the device looks like that. Now the variations, the local variations

here cannot be expected to depend on the gate area, simply because of the nature

of delta W. We're talking about how W is different.

If the device has some bumpiness along this edge and this edge, the longer the

channel is, the more this bumpiness will average out, and then you expect the

variation due to local effects to be small.

So it turns out, then, that the variance of this quantity

is inversely proportional to L, for the reason I just mentioned, not to WL.

Similarly, delta L, which is the corresponding correction we need

to apply to the mask length to arrive at the real length of the device, has some

local variations that turn out to have a variance inversely proportional to

W. Now, for independent statistical

variables, we can add variances. And, for example, for the flat band

voltage, assuming that the global and local variations are independent, we can

take the sum of the two variances to arrive at the total variance of the flat

band voltage. Now, for two devices we can define a

correlation coefficient as it is done in statistics.

For example, we have the correlation between the flatband voltage of 2

devices, 1 and 2. If we have very large devices,

then the local variations would be small, much smaller than the global variations.

And then the correlation coefficient between the two is approximately 1.

This is close to what we saw in the delta VT plots that I showed you a couple

slides ago for large devices. On the other hand, for very small

devices, the local variations become large.

And then you have almost 0 correlation coefficient.

Which is close to the case for the small devices, in the same delta VT plot, that

I showed you. Similarly for other parameters.
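
A minimal numerical sketch of this behavior follows. It assumes the two devices share one global deviation and each has its own independent local deviation; adding variances as described above, the correlation coefficient then works out to the global variance divided by the sum of the global and local variances. The 10 mV global spread, the constant `a_local` and the device sizes are made-up values.

```python
import math
import random


def predicted_correlation(w, l, sigma_global=0.010, a_local=5e-9):
    """rho = var_global / (var_global + var_local), with var_local = a_local^2 / (W L)."""
    var_local = a_local ** 2 / (w * l)
    return sigma_global ** 2 / (sigma_global ** 2 + var_local)


def simulated_correlation(w, l, sigma_global=0.010, a_local=5e-9, n_pairs=5000, seed=0):
    """Estimate the same correlation from randomly generated device pairs."""
    rng = random.Random(seed)
    sigma_local = a_local / math.sqrt(w * l)
    xs, ys = [], []
    for _ in range(n_pairs):
        g = rng.gauss(0.0, sigma_global)             # shared by both devices
        xs.append(g + rng.gauss(0.0, sigma_local))   # device 1
        ys.append(g + rng.gauss(0.0, sigma_local))   # device 2
    mx, my = sum(xs) / n_pairs, sum(ys) / n_pairs
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


for name, (w, l) in {"large": (10e-6, 10e-6), "small": (0.1e-6, 0.1e-6)}.items():
    print(f"{name}: predicted {predicted_correlation(w, l):.3f}, "
          f"simulated {simulated_correlation(w, l):.3f}")
```

With these assumed numbers the large pair comes out almost fully correlated and the small pair almost uncorrelated, mirroring the two scatter plots.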

Now there are some important composite parameters, which are not fundamental

parameters like the oxide thickness and substrate doping; they're not independent

parameters. One is the threshold voltage.

The threshold voltage will depend on oxide thickness, substrate doping, flat

band voltage, and so on. So, the threshold voltage is modeled the

same way as the flat band voltage. It has a nominal value, plus a variation

due to global effects, and a variation due to local effects.

The variation due to local effects turns out to have a variance inversely

proportional to the gate area, for basically the reasons I mentioned before.

And this constant AVT is measured and it is an important parameter, at least in

analog design. Another important parameter is the

so-called beta, which is W over L times mu Cox prime.

This is the coefficient of proportionality in front of all of our

drain current equations. This one is modeled in the relative

sense, so you have the nominal value plus the nominal value times the relative

variation due to the global effects plus the nominal value times the relative

variation due to local variations. Now the local effects, again, have a

variance that is inversely proportional to the gate area.

And this A sub beta is an important parameter that circuit designers like to

know. Both of these lead to the conclusion that

if you want 2 devices matched well, both in terms of threshold and beta, you need

to make their dimensions large. This is why when you look at the layout

of an analog chip you'll often find devices that are significantly larger

than the devices you find in digital circuits.
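
The area dependence can be sketched as follows, assuming the simple form in which the standard deviation of a local variation equals a matching constant divided by the square root of the gate area. Whether the constant refers to a single device or to the difference within a pair, and whether a factor of 2 appears, depends on the convention in use; this sketch ignores that. The values of `A_VT` and `A_BETA` are illustrative assumptions, since real values are measured for each process.

```python
import math


def local_sigma(a_const, w_um, l_um):
    """Std. dev. of a local variation, assuming sigma = A / sqrt(W * L)."""
    return a_const / math.sqrt(w_um * l_um)


A_VT = 3.5     # mV * um, hypothetical
A_BETA = 1.0   # percent * um, hypothetical

for w_um, l_um in [(0.2, 0.1), (2.0, 1.0), (20.0, 10.0)]:
    s_vt = local_sigma(A_VT, w_um, l_um)
    s_beta = local_sigma(A_BETA, w_um, l_um)
    print(f"W = {w_um:5.1f} um, L = {l_um:5.1f} um: "
          f"sigma_VT = {s_vt:6.2f} mV, sigma_beta = {s_beta:5.2f} %")
```

Each hundredfold increase in gate area reduces both spreads by a factor of ten, which is the price paid in area for good matching.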

I would like to briefly mention something about the correlation between

different parameters. Let us take the threshold voltage.

The threshold voltage is given by this formula.

We have derived this formula. This is the body effect coefficient, and

it is inversely proportional to the oxide capacitance per unit area.

The beta parameter I showed you a moment ago is proportional to the oxide

capacitance per unit area.

Now you can see that the two quantities VT0 and beta are correlated because they

both depend on oxide capacitance per unit area and therefore on oxide thickness.

16:38

To give you a very simple example, let us say we're trying to calculate the on

current in digital operation. And for a device in saturation, the

simple model had predicted this type of current, right?

Now let's say that, on a given day in a given [INAUDIBLE], the oxide thickness

turned out to be a little larger than the nominal, ideally expected value.

That would make the oxide capacitance per unit area smaller, both here and there.

This means that the threshold will become larger than expected, and beta would

become smaller than expected. Now if beta is smaller than expected it

will have an effect on current. The current will be smaller than

expected. But at the same time because the

threshold is larger than expected VGS minus VT would be smaller and that also

will contribute to the current being smaller.

So you can see that oxide thickness affects this and this in such a way that

they combine in the worst possible way to give you a smaller current.
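
Here is a rough numerical sketch of that example. It assumes the simple square-law saturation current I = (beta / 2)(VGS - VT)^2, with beta = (W/L) mu C'ox, and a threshold of the form VT = VFB + phi0 + gamma sqrt(phi0), where gamma = sqrt(2 q eps_s NA) / C'ox. The exact formulas on the slides may differ in detail, and every device value below is made up for illustration.

```python
import math

Q = 1.602e-19               # electron charge, C
EPS_OX = 3.9 * 8.854e-12    # permittivity of silicon dioxide, F/m
EPS_SI = 11.7 * 8.854e-12   # permittivity of silicon, F/m


def device(tox, vgs=1.2, w_over_l=10.0, mu=0.03, na=5e23, vfb=-0.9, phi0=0.9):
    """Return (VT, beta, on current) for a given oxide thickness in meters.

    All default values are hypothetical and in SI units.
    """
    cox = EPS_OX / tox                                # oxide capacitance per unit area
    gamma = math.sqrt(2.0 * Q * EPS_SI * na) / cox    # body effect coefficient
    vt = vfb + phi0 + gamma * math.sqrt(phi0)
    beta = w_over_l * mu * cox
    return vt, beta, 0.5 * beta * (vgs - vt) ** 2


vt_nom, beta_nom, i_nom = device(tox=2.0e-9)
vt_thick, beta_thick, i_thick = device(tox=2.2e-9)    # oxide 10% thicker than nominal

print(f"VT:   {vt_nom:.3f} V -> {vt_thick:.3f} V")
print(f"beta: {beta_nom * 1e3:.2f} mA/V^2 -> {beta_thick * 1e3:.2f} mA/V^2")
print(f"I_on: {i_nom * 1e3:.2f} mA -> {i_thick * 1e3:.2f} mA")
```

Under these assumptions the current drops by more than the drop in beta alone would explain, because the higher threshold also reduces VGS minus VT.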

So you need to take such correlations into account.

I would also like to say that correlations sometimes exist between

different types of devices, for example, between nMOS and pMOS

if their oxide is formed the same way, and sometimes even between transistors

and other devices, such as resistors made of polysilicon,

because the gate of the transistors and the body of the resistors are made in the

same way. Now, how do we simulate statistical

variability? A popular way to do that is called

Monte Carlo simulation, in which you assign common random values to the global

parameters for many transistors, and separate random values to local ones for

each transistor. So let's say you have two adjacent

devices. You give the same nominal value to their parameters, and a

different random value to their local parameters.

And then you run these simulations, each time with a different combination,

tens or hundreds of times, and you superimpose the results, and you get an

idea of the effect of variability on the characteristics of the transistor, or the

characteristics of the circuit you're running.
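
A bare-bones sketch of such a Monte Carlo loop is shown below, for two adjacent devices biased at the same VGS. In each run one global threshold deviation is drawn and shared by both devices, while each device gets its own local deviation; a square-law current is then evaluated for each. The current model, the spreads and all names are assumptions for illustration. A real flow would run a circuit simulator with the statistical models supplied for the process.

```python
import random
import statistics


def drain_current(vt, beta=5e-3, vgs=1.0):
    """Square-law saturation current (illustrative model only)."""
    return 0.5 * beta * (vgs - vt) ** 2


def monte_carlo(n_runs=500, vt_nom=0.30, sigma_global=0.020, sigma_local=0.005, seed=0):
    """Return a list of (I1, I2) pairs, one pair per Monte Carlo run."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_runs):
        d_global = rng.gauss(0.0, sigma_global)                     # common to both devices
        vt1 = vt_nom + d_global + rng.gauss(0.0, sigma_local)       # device 1 local term
        vt2 = vt_nom + d_global + rng.gauss(0.0, sigma_local)       # device 2 local term
        results.append((drain_current(vt1), drain_current(vt2)))
    return results


runs = monte_carlo()
i1 = [a for a, _ in runs]
mismatch = [(a - b) / a for a, b in runs]
rel_spread = statistics.pstdev(i1) / statistics.mean(i1)
print(f"relative spread of I1 over all runs: {rel_spread:.3%}")
print(f"spread of relative mismatch (I1 - I2) / I1: {statistics.pstdev(mismatch):.3%}")
```

Because the global deviation is common to both devices, it largely cancels in the mismatch, which is governed by the local terms.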

Now, of course, this is a time consuming process, and sometimes, instead, people

use the so-called corner simulations in which they combine device parameters in

various worst-case combinations. For example, they may take all of the

variations that contribute to making your current small.

And then they combine variations in such a way as to make your current large.

So you may end up, for example, with a set of corner parameters

that makes the device the fastest and another one that makes it the slowest.
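
As a sketch of the idea, the snippet below builds a "slow" and a "fast" corner by pushing the threshold and beta to the extremes that respectively reduce and increase a square-law on current. The choice of three standard deviations, the current model and all values are assumptions, not data from any real process.

```python
def on_current(vt, beta, vdd=1.0):
    """Square-law on current with VGS = VDD (illustrative model only)."""
    return 0.5 * beta * (vdd - vt) ** 2


VT_NOM, SIGMA_VT = 0.30, 0.020            # volts, hypothetical
BETA_NOM, SIGMA_BETA_REL = 5e-3, 0.03     # A/V^2 and relative spread, hypothetical
K = 3                                     # number of sigmas used for the corners

corners = {
    "slow": (VT_NOM + K * SIGMA_VT, BETA_NOM * (1 - K * SIGMA_BETA_REL)),
    "nominal": (VT_NOM, BETA_NOM),
    "fast": (VT_NOM - K * SIGMA_VT, BETA_NOM * (1 + K * SIGMA_BETA_REL)),
}

currents = {name: on_current(vt, beta) for name, (vt, beta) in corners.items()}
for name, current in currents.items():
    print(f"{name:8s} I_on = {current * 1e3:.3f} mA")
```

Only three simulations are needed here instead of hundreds, at the cost of looking at a single figure of merit, the on current.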

Such simple corner combinations may be adequate for digital design, much of

digital design. But for analog design the type of

performance that you are seeking, may not be adequately expressed in terms of what

the current is or what the speed is. You may have additional combinations of

parameters you may have to take into account.

So in analog and radio-frequency circuits, corner combinations, although

they are used, may not be enough by themselves.