0:41

The vocabulary test works as follows.

Respondents are given the following list of words and

are asked to choose a word from the list that comes closest to

the meaning of the first word provided in capital letters.

For example, which of these words is the word edible closest in meaning to?

Or if you were the respondent on this survey would you mark don't know?

The answer should be fit to eat.

If you're curious about the vocabulary test, feel free to pause the video and

work through the rest of the words as well.

But for the purposes of this example,

we're not going to be focusing on what the words mean, but instead we'll take a look

at how people who took the survey did on the vocabulary test and

whether their scores are associated with their social class.

2:22

The groups that are clearly separated from each other

are most likely to have means that are significantly different from each other.

Plots one and three show groups with the same centers.

But the data in plot one are much less variable than the data in plot three.

Hence it would be much easier to detect the differences in means for

data in plot one, as the groups are much more obviously separated.

On the other hand, plot number two shows groups with centers that are very close.

Therefore, this is the plot whose groups are

least likely to be significantly different from each other.
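
To make the comparison of the three plots concrete, here is a minimal sketch in Python. The numbers are made up for illustration and are not the data behind the plots, and the `f_statistic` helper is a hypothetical name. It assumes the usual one-way ANOVA ratio of between-group to within-group variability, which the lecture only introduces later.

```python
from statistics import mean

def f_statistic(groups):
    """Between-group variability divided by within-group variability."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = mean([x for g in groups for x in g])
    # Variability of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Made-up data: same centers (4, 6, 8), small spread, like plot one
tight = [[3.5, 4.0, 4.5], [5.5, 6.0, 6.5], [7.5, 8.0, 8.5]]
# Same centers (4, 6, 8), large spread, like plot three
wide = [[1.0, 4.0, 7.0], [3.0, 6.0, 9.0], [5.0, 8.0, 11.0]]
# Centers very close together (5.9, 6.0, 6.1), like plot two
close = [[2.9, 5.9, 8.9], [3.0, 6.0, 9.0], [3.1, 6.1, 9.1]]

f_tight = f_statistic(tight)
f_wide = f_statistic(wide)
f_close = f_statistic(close)
print(f_tight, f_wide, f_close)
```

With these made-up numbers, the tightly clustered groups give a ratio of 48, the widely spread groups with the same centers give about 1.33, and the groups with nearly identical centers give a ratio close to zero.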

Our goal is to find out if there's a difference between the average vocabulary

scores of Americans from the different classes.

We know that we can compare means of two groups using t statistics, but

comparing means of three or more groups is going to require a new test

called analysis of variance (ANOVA) and a new statistic, the F statistic.

The null hypothesis in ANOVA,

just like any other null hypothesis, says there's nothing going on, or

in other words, the mean outcome is the same across all categories.

We can denote this as mu one equals mu two equals mu three, all the way to mu k,

where each mu indicates a group mean and k is the number of groups,

in other words, the number of levels of the explanatory categorical variable.

This value is four for the data set we introduced earlier, where

people self identified as either lower, working, middle or upper class.

The alternative hypothesis says there is something going on.

But it's not very specific.

It says that at least one pair of means is different from each other.

But it doesn't specify which means are different.

This is an important point we're going to come back to later.

But for now, think about it as, if we do reject the null hypothesis, we find out

that there is something interesting going on in the data, and we might need to

dig deeper to find out which group means are actually different from each other.
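
Written in symbols, the two hypotheses described above might look like the following. The indices i and j are notation introduced here for illustration, standing for any two of the k groups.

```latex
H_0 : \mu_1 = \mu_2 = \cdots = \mu_k
\qquad
H_A : \mu_i \neq \mu_j \ \text{for at least one pair } (i, j)
```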

In a t-test,

we compare means from two groups to see whether they're so far apart that

the observed difference cannot reasonably be attributed to sampling variability.

In ANOVA, we compare means from more than two groups to see whether they're so

far apart that the observed differences cannot all reasonably be attributed to

sampling variability. The summary illustrates the parallels between the t-test

and what we've seen so far in ANOVA, so let's take the comparison a bit further.

In a t-test, the test statistic is calculated

as a ratio of the effect size to the standard error.

In ANOVA, the test statistic is also a ratio, but since there

isn't a single population parameter or point estimate that we can identify

(remember, we're comparing many means),

the test statistic is calculated a little differently:

as the ratio of the variability between groups

to the variability within groups.
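
Side by side, the two ratios just described can be sketched as follows. The t statistic is written here for a difference of two sample means with a null value of zero, which is an assumption about the setup. The exact formulas for the two kinds of variability are left for later.

```latex
t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{SE}
\qquad
F = \frac{\text{variability between groups}}{\text{variability within groups}}
```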

Remember that large test statistics lead to small p-values.

As test statistics get farther into the tails,

the tail areas only get smaller and smaller.

Also remember that if the p-value is small enough, we can reject

the null hypothesis and conclude that the data provide convincing evidence for

a difference in population means.

We mentioned the F statistic, so let's also introduce the F distribution.

It's right-skewed and always positive,

since it's a ratio of two measures of variability, which can never be negative.

We know that in order to be able to reject the null hypothesis,

we need a small p-value, which requires a large F statistic.

Then, obtaining a large F statistic requires that

the variability between groups be much larger than the variability within groups.
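
The shape of the F distribution and the link between a large F statistic and a small tail area can be checked with a small simulation. This is a sketch under stated assumptions, not part of the lecture: it draws F values as a ratio of two averaged sums of squared standard normal draws, and it uses degrees of freedom, which have not been introduced yet. The value 3 corresponds to k minus one for the four classes, while 36 is a made-up value, and `draw_f` and `tail_area` are hypothetical helper names.

```python
import random
from statistics import mean, median

def draw_f(df_between, df_within, rng):
    """One draw from an F distribution: a ratio of two averaged sums of
    squared standard normal draws, so neither part can be negative."""
    numerator = sum(rng.gauss(0, 1) ** 2 for _ in range(df_between)) / df_between
    denominator = sum(rng.gauss(0, 1) ** 2 for _ in range(df_within)) / df_within
    return numerator / denominator

rng = random.Random(1)  # fixed seed so the run is repeatable
# Assumed degrees of freedom: 3 (four groups minus one) and 36 (made up)
draws = [draw_f(3, 36, rng) for _ in range(5000)]

def tail_area(cutoff):
    """Share of simulated F values that fall above the cutoff."""
    return sum(d > cutoff for d in draws) / len(draws)

tails = [tail_area(cutoff) for cutoff in (1, 2, 4)]

print(min(draws) >= 0)               # F values are never negative
print(mean(draws) > median(draws))   # mean above median suggests right skew
print(tails)                         # tail areas shrink as the cutoff grows
```

The tail area above a cutoff plays the role of the p-value here, so the larger the F statistic, the smaller the area beyond it.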

In the next few videos we'll get into more details about

how to actually calculate this F statistic.