We look first at the case where a variable has only two possible outcomes.

We call them success and failure.

Just number 1 for success, number 0 for failure.

Just those two outcomes, a half isn't possible.

Two, three, four, five, are not possible.

The probability of a success is little p.

The probability of a failure, then, by the complement rule, is 1 - p.

This simple random variable is called the Bernoulli random variable, and

this distribution, the Bernoulli probability distribution.

We can easily calculate the expected value and the variance for this random variable.

And here I quickly have those formulas for you.
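
The slide with those formulas is not part of this transcript, so here is a minimal sketch in Python (not the Excel used in the lecture) that computes the expected value and the variance of a Bernoulli random variable directly from their definitions. The function name bernoulli_mean_var is made up for illustration. The results agree with the standard closed forms, mean p and variance p(1 - p).

```python
def bernoulli_mean_var(p):
    """Mean and variance of a Bernoulli variable, from the definitions.

    Hypothetical helper for illustration: outcome 1 (success) has
    probability p, outcome 0 (failure) has probability 1 - p.
    """
    outcomes = {1: p, 0: 1 - p}
    mean = sum(x * prob for x, prob in outcomes.items())
    var = sum((x - mean) ** 2 * prob for x, prob in outcomes.items())
    return mean, var


p = 0.3  # example success probability, chosen arbitrarily
mean, var = bernoulli_mean_var(p)
print(mean, p)            # mean equals p
print(var, p * (1 - p))   # variance equals p(1 - p), up to rounding
```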

Now this is pretty boring.

Where it gets interesting is now when we repeat

a Bernoulli random variable over and over again.

These are now called Bernoulli trials.

So we have a sequence of identical trials: each and every time,

it's a 0 for failure, a 1 for success,

a probability p for success, 1 - p for failure, and

all the individual trials are independent.

So what we now essentially do here, we don't just have one Bernoulli

random variable, we have a whole sum of random variables and

they are all independent and identically distributed.

People in probability theory often use the abbreviation IID for

independent and identically distributed.

And the distribution of this sum is now the binomial distribution.

It depends on two parameters.

On little n, how many repetitions of the trial do I have?

And the probability little p of the success.

Now we can quickly calculate the mean and

the variance of this binomial random variable.

Here, just to scare you a little bit, I show you the proofs;

the only thing that's important is the result.

And it's very intuitive, as you will see in the applications.

The expected value is n times p.

That is the number of trials times the probability of success.
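
As a rough check on those formulas, the sketch below (in Python, with helper names that are my own) builds the binomial probabilities from the standard formula C(n, k) p^k (1 - p)^(n - k) and then computes the mean and the variance from their definitions. They come out as n times p and n times p times (1 - p), the usual closed forms for the binomial distribution.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(Y = k) for Y binomial with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_mean_var(n, p):
    """Mean and variance computed from the definition, not the shortcut."""
    mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
    var = sum((k - mean) ** 2 * binomial_pmf(k, n, p) for k in range(n + 1))
    return mean, var


n, p = 5, 0.5  # five flips of a fair coin
mean, var = binomial_mean_var(n, p)
print(mean, n * p)            # both are the expected value n * p
print(var, n * p * (1 - p))   # both are the variance n * p * (1 - p)
```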

So that's enough abstract nonsense, let's now look at some examples.

Let's say you flip a coin and

you call a head a success and a tail a failure.

And now let's say you flip the coin five times, and

the question is now, how many times do we get success, a head?

Now notice this is easy.

If I asked you about 0 heads and 5 tails,

that would mean we get a tail on the first try and

on the second and so on, up to the last. And because the flips are independent,

we can use the multiplication rule for independent events.

The probability of 0 heads and 5 tails means we get a tail and

a tail and a tail and a tail and a tail.

We multiply a half times a half times a half times a half times a half.

A half to the power of 5 is one in thirty-two.

It's a tad more than 3%.

Now that's easy, we learned that before, we can do this with all previous formulas.

But now it gets tricky,

if we don't look at the special case of 0 heads, but of 1 head.

Why is this now difficult?

If I have 5 flips of a coin, and 1 head,

that head could be the first coin and then I have 4 tails.

But it also could be that I first flipped a tail, then a head and then 3 tails.

Or I do 2 tails, a head and 2 tails.

And now you see this adds up.

Suddenly, there are 5 different possibilities.

The head could be in the first position, or in the second, or in the third, or

in the fourth or the fifth.

Now I have to look at all these possible outcomes and

then start adding the probabilities.

You can see this quickly gets messy.

Now here you say,

I can see there are 5 possibilities because there are 5 positions and

each of them has a probability of one-in-thirty-two, so I can add this up.

So maybe this we can still handle in our head.

But I can tell you, as soon as n gets larger, or

as we look at 2 successes out of n, things get awfully nasty.
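
To make the counting argument concrete, here is a small brute-force sketch in Python. It lists all 32 possible sequences of 5 coin flips and counts how many contain exactly k heads. For a fair coin every sequence has probability one in thirty-two, so the count divided by 32 is the probability. The function name count_sequences is only for illustration.

```python
from itertools import product
from math import comb


def count_sequences(n, k):
    """Count head/tail sequences of length n with exactly k heads."""
    return sum(1 for seq in product("HT", repeat=n) if seq.count("H") == k)


n = 5
for k in range(n + 1):
    ways = count_sequences(n, k)
    # With a fair coin each of the 2**n sequences is equally likely.
    print(k, ways, ways / 2**n)

# The brute-force count agrees with the binomial coefficient C(n, k).
assert all(count_sequences(n, k) == comb(n, k) for k in range(n + 1))
```

Brute force works for 5 flips, but the number of sequences doubles with every extra flip, which is why a ready-made function is so welcome.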

Luckily, there's now an easy solution, namely in Excel there's

a function called BINOM.DIST that calculates these numbers exactly for us.

Let me show you where you can find this function in Excel.

Here please have a look at the spreadsheet for the probabilities of 5 coin flips.

Before I explain all these numbers to you, please follow me here in Excel.

Under Formulas you find, in this leftmost icon, Insert Function.

And after Insert function, there's Statistical,

a collection of functions from probability and statistics.

And under Statistical you find the function BINOM.DIST.

If you have an older version of Excel, the period, the ".", may be missing, so the function is called BINOMDIST.

But don't despair, everything will work on your computer as well.

And when you click on BINOM.DIST, this function appears here.

And we can fill in the numbers that we need, either via the dialog box or

by typing.

And this is what I have already done here.

So let me now show you what this actually looks like.

Here now, we go here.

What have I typed here, or gotten in via the Insert Function?

We have first an = sign because I want to do some math,

then BINOM.DIST, and now I need to enter 4 numbers.

The first number I need to enter is the number of successes.

Here I typed A4 because I want to refer to the cell A4 where we have the 0.

Next comes little n, the number of repeated Bernoulli trials.

Here I said we flipped the coin 5 times, so this number is a 5.

Next number is 0.5, why?

That's the probability of a success, 0.5.

And finally here I want to calculate the probability of exactly 0 and

this last argument therefore needs to be the number 0 or the word FALSE.

This is just a technicality.

There's nothing to be understood here.

Just accept it as FALSE.

Now here on my screen in Switzerland it shows semicolons.

This is in the Excel that we have here in Switzerland.

In other countries this may be a comma so keep that in mind.

If you type this and you get an error message, then depending on your country this may

need to be a comma instead of a semicolon.

So now let's go further down here.

Let's look at when I have Y = 2.

If I click here we see BINOM.DIST of the 2, that's in A6,

then a 5, 0.5 as the probability, and FALSE.

And that's how I very easily get the probabilities

of having exactly Y successes,

where Y could be any number from 0 to 5, when I flip a fair coin 5 times.

Now remember the cumulative distribution.

I can also immediately calculate those probabilities here for example.

For a 2, I type exactly the same first three arguments as before.

So I have BINOM.DIST here, cell A6 where I have the number 2.

Still 5 trials, still a probability of 0.5, but for the cumulative distribution,

the last argument always must be the number 1 or a TRUE.

And so here now, the probability that I have 0,

1 or 2 heads, that is, successes, is exactly a half.

And as always, for the largest number of course,

the cumulative probability has to be equal to 1.
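
For anyone following along without Excel, the sketch below mimics the four arguments of BINOM.DIST in Python. It is only an illustration of what the function computes, not Excel's actual implementation, and it assumes the number of successes is a whole number between 0 and the number of trials. The parameter names follow Excel's argument names.

```python
from math import comb


def binom_dist(number_s, trials, probability_s, cumulative):
    """Rough Python stand-in for Excel's BINOM.DIST.

    cumulative=False gives P(Y = number_s);
    cumulative=True gives P(Y <= number_s).
    """
    def pmf(k):
        return comb(trials, k) * probability_s**k * (1 - probability_s) ** (trials - k)

    if cumulative:
        return sum(pmf(k) for k in range(number_s + 1))
    return pmf(number_s)


# Five flips of a fair coin, as in the spreadsheet.
print(binom_dist(0, 5, 0.5, False))  # 0.03125, exactly 0 heads
print(binom_dist(2, 5, 0.5, False))  # 0.3125, exactly 2 heads
print(binom_dist(2, 5, 0.5, True))   # 0.5, at most 2 heads
print(binom_dist(5, 5, 0.5, True))   # 1.0, the largest value
```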

So in a nutshell, by typing in BINOM.DIST with four arguments,

you can very quickly calculate any binomial distribution

probability that you may care about.

You can calculate probabilities of hitting Y exactly or

cumulative probabilities.

And you only need to plug in little n, the number of Bernoulli trials, and

the probability of success.

So let me summarize all these numbers now in the slides for you.

So on the next two slides I summarized what we just learned in Excel.

So here on this slide there are the five coin flips.

I'll show you the commands that you need to type in,

in order to calculate the probability of, for

example, having Y = 2 heads out of 5 coin flips.

So we have all the different probabilities, the BINOM.DIST function.

And recall that we want to use the last argument of FALSE when we

look at the probability of an exact number of heads.

At the bottom, we can also see the expected value and the variance, quickly

calculated using the specialized formulas that we saw before for

the binomial probability distribution.

Next, we also saw the concept of the cumulative distribution.

That's when the last argument in Excel is a TRUE.

That's again, adding up all the probabilities up to a specific value.

And so here, I've summarized those numbers for you.

Now, as I said at the beginning,

this probability distribution is used in neat applications.

And so, let's get away from this little toy problem to a really cool application.

It's an advertising stunt that a beer brewing company in

the United States pulled during the 1981 Super Bowl.

The Schlitz Brewing Company did this live on TV, so

this wasn't recorded in a film studio beforehand and then shown.

So this was not canned, it was live on TV.

They asked 100 beer drinkers who would proclaim, my favorite

beer is one of your competitors, this beer brand called Michelob.

And they asked them, in a blind taste test between their beer, Schlitz,

and the Michelob beer, which beer do you like better?

Now at first, when you think about that, this is a god-awful idea.

What are they doing?

Are they nuts?

You ask these people who say this is my favorite beer to test it

against your beer.

Isn't it likely that you'll look like an idiot and 100 people will say, no,

my beer is still my favorite beer?

Maybe 98 say, it's still better, maybe you convince 1 or 2.

Is this really risky?

But now, keep in mind these are American beers.

And they're not some special American microbrewed beers.

No, these are large beer companies and, let's face it,

these beers are all rather similar in taste.

Some people may even question whether they have any taste at all.

But I don't want to discuss this here with my American friends.

So, let's assume now it's essentially impossible to

distinguish between these two beers because they taste alike.

Now we get into the binomial distribution context.

We have success probability, a half.

Failure probability, a half.

We have 100 people sitting in the room.

They're not talking to each other.

And anyway, this is a blind tasting: it doesn't say on the bottle

whether it's Michelob or Schlitz, so we really have a binomial setup.

And now we can analyze this advertising stunt, using the binomial distribution.

Question: let's say a really, really bad outcome for

Schlitz is that less than one-third of all people favor their beer.

What is the probability that that happens?

The probability of less than a third of 100,

that means that the number of successes is less than 33 and a third.

That means, given that you only have integers, Y is smaller than or

equal to 33. Quickly calculated with BINOM.DIST, that is 0.00044, or about 0.044%, so it's tiny.

So it's really, really unlikely that Schlitz looks so bad with this ad.

What's the probability of a pretty good outcome?

Let's say a pretty good outcome is 45 or more.

Why is that a pretty good outcome?

I think it's a pretty good outcome because these are all people who say,

Michelob is my favorite beer.

If we can get 45 of them to suddenly prefer us, that's pretty impressive,

wouldn't you say?

And look at the probability.

More than 86% chance that in this blind test this will happen.

Here I calculated some other probabilities for you.

If you consider fewer than 40, there's less than a 2% chance of that happening.

The probability of hitting exactly 50 is almost 8%.

The probability of 50 or higher is 54%.

So there's more than half a chance that 50 or more people will favor your beer.

Of course, this is under the assumption, which I think is realistic,

that you can't really distinguish the taste.
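
The probabilities for the taste test can be reproduced outside Excel as well. The sketch below, in Python with helper names of my own choosing, assumes exactly the model from the lecture: 100 independent tasters, each favoring Schlitz with probability one half. Note how "or more" and "fewer than" events are rewritten in terms of the cumulative probability, just as you would do with the TRUE argument of BINOM.DIST.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(Y = k) for n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(Y <= k), the cumulative probability."""
    return sum(binom_pmf(j, n, p) for j in range(k + 1))


n, p = 100, 0.5  # modeling assumption: tasters cannot tell the beers apart

p_bad = binom_cdf(33, n, p)        # fewer than a third: Y <= 33
p_good = 1 - binom_cdf(44, n, p)   # 45 or more: 1 - P(Y <= 44)
p_under_40 = binom_cdf(39, n, p)   # fewer than 40: Y <= 39
p_exactly_50 = binom_pmf(50, n, p)
p_50_or_more = 1 - binom_cdf(49, n, p)

for label, value in [
    ("Y <= 33", p_bad),
    ("Y >= 45", p_good),
    ("Y < 40", p_under_40),
    ("Y = 50", p_exactly_50),
    ("Y >= 50", p_50_or_more),
]:
    print(f"{label}: {value:.5f} ({value:.3%})")
```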

So in the end, it looks like this advertising stunt isn't all that risky.

So now of course you're curious, right?

What really happened that day in 1981?

Exactly 50 out of the 100 loyal Michelob beer drinkers favored Schlitz beer.

So they got exactly the expected value, what a coincidence, under their model.

So, to sum up, the advertising stunt wasn't all that risky and

it turned out it was a nice advertisement.

Quick aside for people around the world, in the United States

you are allowed to do what's called comparative advertising.

Here in Switzerland, for example, it's not allowed.

I can't say on TV, your product is bad, my product is better.

I can only say my product is good.

So just as an explanation,

for those of you who were surprised about this comparative advertising.

It is allowed in the US.

And this brings me already to the end of this lecture.

We learned about a very important discrete probability distribution,

the binomial distribution.

It can actually get quite tricky if you look at the math underneath it.

You don't need to do this because there's a beautiful Excel formula,

BINOM.DIST which makes the calculation of probabilities very easy.

And we saw a really cool, cute application of it

in a legendary TV commercial from the United States.