0:09

As usual we're going to introduce this

new concept using a concrete real data example.

A SurveyUSA poll asked respondents whether any of

their children have ever been the victim of bullying.

Also recorded on the survey was the gender of the respondent, the parent.

Below is

the distribution of responses by gender of the respondent.

Before we proceed with the calculations, I'd like to make note of one thing.

If we were to take the somewhat narrow-minded view that only male-and-female couples have children, then the proportion of kids who are bullied should be the same for males and females. Remember, here we're asking individuals whether their kid has been bullied, not families or households.

1:04

There could, of course, be single parents, but we said we're going to take the narrow-minded view that there is usually one mother and one father in the household, which is probably true for the majority of the population. It could be that one gender is more likely than the other to even know that their kid has been bullied, or it could be that one gender is more likely than the other to actually report this on a survey.

So of the 90 males that were surveyed, 34

of them said that their kid had been bullied.

And of the 122 females that were surveyed, 61 said the same thing as well.

So, to calculate our sample proportions: for the males, that's 34 over 90, or about 38%, and for the females, 61 over 122, or 50%.
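These two sample proportions are just the counts divided by the group sizes; as a quick check in Python:

```python
# Sample proportion of "my child has been bullied" responses, by gender
p_male = 34 / 90      # ~ 0.38
p_female = 61 / 122   # = 0.50
```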

If we want to compare these two proportions, our null hypothesis should be that the proportion of males and the proportion of females whose children have ever been the victim of bullying are equal to each other, or, in other words, that the difference between the two population proportions is 0. If we're simply looking for a difference between the two, the alternative hypothesis sets this difference between the two population proportions to be not equal to 0.

Before we can proceed we need to check our conditions.

And we need to calculate a test statistic based

on which we can then calculate a p value.

2:31

We'll get

to the calculations in a little bit, but let's

flash back for a moment to working with one proportion.

We talked about this idea of using a p hat versus a p.

And we said that when we're working with one proportion and we want to check the success-failure condition within the context of a confidence interval, we use our observed proportion to do that; these are the observed successes and observed failures.

Versus when we're doing a hypothesis

test, we use the value of the population proportion

that we set equal to in the null hypothesis.

In other words, the null value of p is what

we use to calculate the expected successes and expected failures.

3:13

In terms of the standard error, once again, we use

p hat in calculation of the standard error for confidence intervals.

Versus we use p, the null

value of the population proportion, for the hypothesis test.

So the moral of the story here is that when you're dealing with a confidence interval, use observed counts and observed proportions.

When you're dealing with a hypothesis

test, use expected counts and expected proportions.
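To make the contrast concrete, here is a minimal one-proportion sketch with made-up numbers (40 successes out of a sample of 100, testing a null value of 0.5; these figures are illustrative, not from the poll):

```python
import math

# Hypothetical one-proportion example: 40 successes out of n = 100,
# testing H0: p = 0.5 (illustrative numbers, not from the survey).
n = 100
p_hat = 40 / n    # observed proportion
p_null = 0.5      # null value of p

# Confidence interval: observed counts and the observed proportion
obs_successes, obs_failures = n * p_hat, n * (1 - p_hat)
se_ci = math.sqrt(p_hat * (1 - p_hat) / n)

# Hypothesis test: expected counts and the null value
exp_successes, exp_failures = n * p_null, n * (1 - p_null)
se_ht = math.sqrt(p_null * (1 - p_null) / n)
```

Both branches plug a value of p into the same formula; the whole distinction is which value of p gets used.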

3:41

So how will this translate to working with two proportions?

For confidence intervals, we want to look at the total

number of observed successes and failures for each one of the groups.

So we've already done this in the previous video.

We simply look at the sample sizes for each of

the groups and multiply them by the observed sample proportions.

To calculate the observed number of successes and observed number of failures.

Also for the calculation of the standard error we

use the observed proportions from the two groups as well.

4:10

However, calculating the expected successes and failures, or the expected proportion, for a hypothesis test for the difference between two proportions is not as simple.

Take a look at this null hypothesis.

We simply say in the null hypothesis that the

two population proportions should be equal to each other.

Or that their difference should be equal to 0.

But at no point do we define what these should be equal to.

So we don't have a readily available null value of the population proportion that we can use for the two groups to calculate expected successes and expected failures.

What do we do? We make one up.

This is called the pooled proportion.

So the idea here is that even though your null hypothesis does not equate the two population proportions to a particular value, we can come up with a best guess for what they could be equal to under the assumption that the null hypothesis is true. And that's where we use the idea of the pooled proportion.

This pooled proportion is simply the number of successes

divided by the overall sample size for the two groups.

So we're pooling data from the two groups together, so it can be calculated as the

number of successes in group one plus the

number of successes in group two divided by the

sample sizes for the two groups.

5:31

So let's right away put that to use,

and calculate the pooled proportion of males and

females who said that at least one of

their children has been a victim of bullying.

So our p-hat pooled is going to be the total number of successes from the males, 34, plus the total number of successes from the females, 61, divided by the sample size for the males, 90, plus the sample size for the females, 122.

So this gives us roughly 45% as the pooled proportion of males and females,

who said that at least one of their children has been a victim of bullying.
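That arithmetic can be checked in a couple of lines (a Python sketch of the numbers above):

```python
# Pooled proportion: total successes over total sample size for the two groups
successes_male, n_male = 34, 90
successes_female, n_female = 61, 122
p_pool = (successes_male + successes_female) / (n_male + n_female)  # 95 / 212 ~ 0.45
```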

6:13

So now that we have a good estimate for a common proportion for the two groups, we can actually revisit the chart we were working with earlier and see how we check our success-failure condition and calculate our standard error for a hypothesis test comparing two population proportions.

6:30

For the success failure condition for both of the groups, we actually use this p

hat pooled value to calculate the expected

number of successes and expected number of failures.

The reason why we're making this distinction

once again, is that in a hypothesis test,

we must assume that the null hypothesis is true.

And when we're doing a hypothesis test for comparing two proportions, our null hypothesis states that the two proportions are equal to each other. So we're going to use this value of the pooled proportion as the value they're equal to, and treat that as the truth in going through the hypothesis test.

For the calculation of the standard error we use the same idea: everywhere we see a p-hat 1 or a p-hat 2, we simply plug in this common proportion that we calculated for the two groups.

7:22

You might be asking what about means?

When we talked about doing inference for means, we did not provide different formulas for the standard error when we were calculating a confidence interval versus a hypothesis test.

But we seem to be making a pretty big

deal about it now that we're talking about proportions.

Well, with means, our parameter of interest is mu, and in our null hypothesis we set mu equal to some null value. But the standard error is simply calculated as s over the square root of n, so mu, our population mean, does not appear in the calculation of the standard error. It really doesn't matter what that number is set equal to; it's not going to influence the calculation of the standard error.

On the other hand, when you're doing a hypothesis test for a proportion, we set p equal to some null value, and that same p does actually appear in the calculation of the standard error.

And because it does appear in the calculation, and because we must assume that the null hypothesis is true when going through our calculations, we need to make a distinction between when we do have a null hypothesis that we must assume is true, within the context of hypothesis testing, versus when we don't have one, within the context of a confidence interval.


Let's take a look real quick, then: are our conditions for inference met for conducting a hypothesis test to compare the two proportions here?

In terms of the condition of independence: for within-group independence, we have a random sample, and the 10% condition is met, since 90 and 122 are obviously less than 10% of all males and females. So the sampled males can be assumed to be independent of each other, and the sampled females can be assumed to be independent of each other as well.

9:18

When it comes to between group independence, we want to think

about how these data were collected in the first place.

This was an overall random sample, and some of the people in this sample happen

to be male, and some of the people in the sample happen to be female.

Therefore, we really have no reason to expect that

sampled males in this sample, and the sampled females

in this sample are dependent on each other.

These are not paired people in any way, and even if we had any worries about that, with the different sample sizes for the two groups we know that they're definitely not paired one-to-one.

So since there is no reason to expect dependence between the two groups,

we can assume that this between group independence condition is met as well.

When it comes to sample size and skew, we want to consider the success-failure condition.

However, we're doing a hypothesis test for the difference between the two proportions, so to check our success-failure condition we use the pooled proportion that we calculated, which was the 45% shown on top in our data summary table.

For the males, then, we have 90 males times 0.45, which gives 40.5, and 90 males times 0.55 (that's the proportion of failures), which gives 49.5; both numbers are greater than 10. Similarly, for the females, we have 122 females times 0.45, which gives 54.9, and 122 females times 0.55, which gives 67.1. So the success-failure condition is met for the females as well.

Therefore,

we can assume that this sampling distribution of the

difference between the two sample proportions is nearly normal.
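The four expected-count checks above can be sketched as follows, using the rounded 45% pooled proportion:

```python
# Success-failure check for the hypothesis test, using the pooled proportion
p_pool = 0.45  # rounded pooled proportion from the data above
expected = [
    90 * p_pool,         # expected male successes:   40.5
    90 * (1 - p_pool),   # expected male failures:    49.5
    122 * p_pool,        # expected female successes: 54.9
    122 * (1 - p_pool),  # expected female failures:  67.1
]
condition_met = all(count >= 10 for count in expected)
```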

11:02

Since our conditions are met, we can finally conduct a hypothesis test, and we'll do so at a 5% significance level, evaluating whether males and females are equally likely to answer yes to the question about whether any of their children have ever been the victim of bullying.

So the null hypothesis

here was that the two proportions are equal to each other,

and the alternative is that the two are different from each other.

We had already set these hypotheses early on in the first slide of this video.

11:31

Our ultimate goal with a hypothesis test is

to calculate a p value, but before we get

there, we need a test statistic, and for

that, we need to figure out our sampling distribution.

The sampling distribution of the difference between the two sample proportions is going to be nearly normal with mean 0; that 0 comes from our null value.

And we know how to calculate the standard error using the pooled proportion: that's 0.45 times 0.55 divided by 90, the number of males, plus 0.45 times 0.55 divided by 122, the number of females, and then we take the square root of the whole thing. That gives roughly 0.0691.

Our point estimate in this case is the difference between the two sample proportions, the proportion of males minus the proportion of females in the sample; in other words, 0.38 minus 0.50, so our point estimate is negative 0.12.
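Putting those two quantities together, a minimal Python sketch (using the rounded values 0.45, 0.38, and 0.50 from the calculations above):

```python
import math

# Standard error of the difference, using the pooled proportion
p_pool = 0.45
n_male, n_female = 90, 122
se = math.sqrt(p_pool * (1 - p_pool) / n_male
               + p_pool * (1 - p_pool) / n_female)  # ~ 0.0691

# Point estimate: difference between the two sample proportions
point_estimate = 0.38 - 0.50  # = -0.12
```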

12:28

So, we finally have everything we need to calculate our p value.

Let's draw our sampling distribution real quick and

show there what the p value actually corresponds to.

We're doing a two-tailed hypothesis test, so we want to shade beyond our point estimate on both the lower tail and the upper tail.

And to calculate that area we can use a z score

that we calculate as our point estimate minus the null value divided

by the standard error that we had calculated.

The z-score comes out to be roughly negative 1.74.

Then our p-value can be described as the probability that the absolute value of the z-score is beyond 1.74; that corresponds to a z-score of negative 1.74 or lower, or a z-score of positive 1.74 or higher. You can use a table, R, or an applet to calculate this.

So I recommend that you get a little bit of practice doing so.

But you will see that the p value comes out to be roughly 8%.
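The z-score and two-tailed p-value can also be reproduced with Python's standard library alone; this sketch uses the rounded values from the calculations above, so the final digits may differ slightly from unrounded arithmetic:

```python
import math

point_estimate = -0.12  # rounded difference in sample proportions
null_value = 0.0        # from the null hypothesis
se = 0.0691             # standard error computed earlier

z = (point_estimate - null_value) / se  # ~ -1.74

# Two-tailed p-value: 2 * P(Z > |z|) for a standard normal,
# written via the complementary error function (no SciPy needed)
p_value = math.erfc(abs(z) / math.sqrt(2))  # ~ 0.08
```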

The final step would be to compare this to our significance level,

and finally make a decision on the research question we were working with.