This is a five-section course as part of a two-course sequence in Research Methods in Psychology. This course deals with descriptive methods and the second course deals with experimental methods.

From the course by Georgia Institute of Technology

Descriptive Research Methods in Psychology

From the lesson

Module 5: Correlations

- Dr. Anderson D. Smith, Regents’ Professor Emeritus

School of Psychology

[SOUND] Hello again,

Anderson Smith, and we're talking about descriptive research methods.

And today we want to talk about what kinds of statistics and

what kinds of analysis we can use when we're using descriptive research methods.

There are really two kinds of studies we can do.

We can do descriptive studies that we're talking about in this course, or

we can do experimental studies where we actually manipulate something.

And when we do descriptive studies, description of data,

there will be two things we can measure.

We can measure frequency: how often does this occur,

and how often does that occur?

And we can look at association.

Are two things that we are measuring associated with each other,

connected to each other in some way?

If we manipulate variables, not just measure them, in experimental studies,

we can use inferential statistics and not just descriptive statistics.

And inferential statistics will measure comparisons:

does this group differ from that group?

And I can actually establish causal relationships, which is really what,

in most experimental studies using those methods, we're trying to do.

Now, if we're in the descriptive studies that we're talking about in this course,

there are certain things we can measure.

We can measure central tendency: what is the average?

What is the middle?

What is sort of the average score?

And that can be either by the mean, our average, or

things like the median, the middle score.
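As a small illustration of the two measures of central tendency just mentioned, here is a minimal sketch using Python's standard `statistics` module. The scores are made up for the example; they are not data from the course.

```python
from statistics import mean, median

# Hypothetical memory-test scores for a small sample (invented for illustration).
scores = [12, 15, 15, 18, 20, 22, 45]

average = mean(scores)   # arithmetic mean: sum of the scores divided by their count
middle = median(scores)  # middle score once the scores are sorted

print(f"mean = {average:.1f}, median = {middle}")
```

In this invented sample the single high score of 45 pulls the mean above the median, which is one reason both measures are usually reported.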

We could look at distribution:

how do these scores fall along a distribution of responses from the sample?

And we can use the technique of correlation,

which allows us to look at relationships among variables.

Correlations are really what we're going to talk about when we talk about

measurement techniques and analysis for descriptive studies.

So in descriptive studies, we can only describe a relationship.

That is, Variable A and Variable B are related to each other.

They're associated with each other, they're correlated with each other.

That is, A and B are related.

In experimental studies, where I can manipulate variables,

I can use inferential statistics that tell me whether a relationship that I get is

causal. For example, I can show that B is a function of A: B is caused by A.

So I have a unidirectional relationship,

not a bidirectional relationship: A causes B.

A little cartoon:

"I used to think correlation implied causation, but

then I took a statistics course. Now I don't."

"Sounds like the class helped." "Well, maybe."

I don't want to make an inference of causation from just a correlation.

So, here, with a correlation,

we don't know the direction of the relationship.

A could've caused B, B could've caused A.

Some other variable could've caused both.

But inferential statistics, experimental methods, and

manipulation allow us to reach conclusions about causality.

So, a correlation is a single number that describes

the degree of relationship between two variables.

Are the two variables associated with each other?

As one increases, does the other increase?

And if that's true, we have a positive correlation.

Or as one increases, does the other one decrease?

Then we have a negative correlation.

So they can be positive, and they can be negative.

So bivariate correlation, the simplest kind of correlation, sometimes called

a Pearson correlation after the technique that produces the actual measure,

tests whether the relationship between two variables is linear.

As one variable increases, the other also increases.

Or as one variable increases, the other variable decreases.

The correlation coefficient, or r, is a single statistical measure

that tells us whether there is a relationship between the two variables, and

the extent of that relationship.

r can go from -1.0, which is a perfect negative correlation,

to positive 1.0, which is a perfect positive correlation.

And if r is 0, it means no linear relationship exists at all.
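To make the range of r concrete, here is a minimal sketch of the Pearson correlation coefficient written out by hand in Python. The function name `pearson_r` and all of the numbers are assumptions chosen for illustration; nothing here comes from the lecture's data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of each pair's deviations from the two means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)

# Invented numbers, for illustration only.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # increases as x increases
falling = [10, 8, 6, 4, 2]   # decreases as x increases
flat = [1, 3, 2, 3, 1]       # no linear trend with x

r_rising = pearson_r(x, rising)
r_falling = pearson_r(x, falling)
r_flat = pearson_r(x, flat)
print(r_rising, r_falling, r_flat)
```

The three invented series are built to land on the two ends and the middle of the scale: a perfect positive correlation, a perfect negative correlation, and a correlation of zero.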

And correlations are simply the extent to which two variables share variance.

So if I measure a variable,

I get a degree of variability on that measure from the subjects that I test.

And if I measure the other variable, I get a variability on that particular measure.

And the correlation says how much they're related to each other: it is the shared variance

between the two variables, as represented by this Venn diagram.

This variable, 100% of the variance.

This variable, 100% of the variance.

But if you combine them together, I can show that 32% of the variance is shared.

There's a relationship that exists, and as the shared variance increases,

the two Venn diagrams get closer together, there'll be more shared variance and

the correlation coefficient would increase.
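A short sketch of how shared variance and r relate, assuming the usual reading that the proportion of shared variance between two variables is r squared. The helper name `shared_variance` is made up for this example; the 32% figure is the one from the Venn diagram.

```python
from math import sqrt

def shared_variance(r):
    """Proportion of variance two variables share, assuming it equals r squared."""
    return r ** 2

# The Venn diagram example: 32% of the variance is shared.
shared = 0.32
r_magnitude = sqrt(shared)  # size of r only; the sign cannot be recovered from shared variance

print(f"shared variance {shared:.2f} corresponds to |r| of about {r_magnitude:.2f}")
print(f"r = 0.80 corresponds to shared variance {shared_variance(0.80):.2f}")
```

Under that assumption, shared variance tells you how strong the relationship is but not its direction, because a positive and a negative r of the same size square to the same value.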

If they were pulled apart, there's less shared variance and

the correlation would decrease.

Now, as I told you, I study aging and memory.

And I actually did a study one time to look at whether health is a surrogate for

aging in determining memory performance,

because we know that memory is negatively correlated with adult age.

The older you are, the poorer your memory is.

Normal healthy aging, I'm not talking about memory pathologies.

With normal healthy aging, we get this relationship,

negative correlation between age and memory performance.

We also know that health is negatively correlated with age.

As I get older, my health declines and health problems increase, so both of these exist.

Now, the question that is often asked is about the age-memory relationship that I study.

And I really believe that memory performance is actually affected by aging,

affected by changes in the brain.

And the changes in the brain actually cause the memory performance changes that

we see in normal, healthy aging.

Other people say no, it's not age per se, it's not biological aging, it's health.

That if health decreases, memory's going to decrease also.

And so health is the factor that's important in determining whether or

not you get an age-memory correlation.

If health is really good, the age-memory correlation is small.

If health is bad, the age-memory correlation is large.

Age affects health, and then health affects memory.

So that is really what produces what I'm seeing.

So we did a study back in the 90s, where we asked the question,

is it age or is it health?

And we did it with this correlation technique.

Here is the correlation between age and memory.

The shared variance between age and

memory is represented by blue in the two circles in the Venn diagram.

But when I add health to the study,

these are all self-reports of health, by the way.

Things like how many prescriptions I use,

how many times I visit the doctor in a month, in a year.

Is my health better now than it was earlier?

How does my health compare to that of people my same age? Lots of questions, but

all self-report.

Notice that it does account for some of the age-memory relationship, but

a very small amount.

So the correlation between age and memory is not accounted for by health. The variance accounted

for is reduced by only a small amount by adding health into my equation.

So most of the age-memory relationship is not

attributed to health, but to something else.

And I'm arguing that it is really brain efficiency that changes with aging,

and that produces both poorer health and a decrease in memory.

But it's not just health itself. Health is not a surrogate for

aging in determining the size of the memory deficits that we see.
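One common way to ask how much of a correlation survives once a third variable is taken into account is a partial correlation. The sketch below uses the standard first-order partial correlation formula with three invented correlation values. These are not the results of the study described above, and the study's actual analysis may have been done differently.

```python
from math import sqrt

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant (first-order partial)."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical correlations, invented for illustration only.
r_age_memory = -0.40     # older adults score lower on memory
r_age_health = -0.30     # older adults report poorer health
r_health_memory = 0.20   # better self-reported health goes with better memory

r_controlled = partial_r(r_age_memory, r_age_health, r_health_memory)

print(f"age-memory r: {r_age_memory:.2f}, shared variance {r_age_memory ** 2:.3f}")
print(f"with health held constant: {r_controlled:.2f}, shared variance {r_controlled ** 2:.3f}")
```

With these made-up values the age-memory correlation shrinks only slightly once health is held constant, which is the general pattern the lecture describes: health accounts for some of the relationship, but only a small amount.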

So a correlation is usually shown as a scatter plot.

That is, I take two variables, like driver age, again, another age study,

and the legibility of signs from a distance, like how many

feet away from a sign you could be and still have it be legible.

And I look at that, and

I simply plot the measurements against driver age.

And I look at the actual legibility of the sign,

and what I notice is that as I increase age,

I decrease the legibility of the sign.

And I have to be at a much closer distance to the sign if I'm older to actually see

it, right?

So there's a negative correlation between sign legibility

distance and driver age.

And that's the way I usually represent correlations.

And then the line represents the actual correlation.

And you can see that correlations can vary, positive, negative, or zero.

If I'm looking at ice cream consumption by temperature during the day,

I get this very positive correlation between ice cream consumption and

the temperature of the day.

If I look at eating hot foods, I find the same correlation, but it's negative.

And if I look at drinking coffee, another hot food, or hot drink,

that has no correlation with temperature.

We drink it in the winter, we drink it in the summer.

So correlations can vary, and that is described by the correlation coefficient.

So here's a perfect positive correlation of 1.0, then 0.80, 0.40, a 0 correlation,

a negative correlation of 0.40, a negative correlation of 0.80,

and a perfect negative correlation of -1.0. These are simply scatter plots.

Let me give you an example of how correlations can be misinterpreted.

This is a child back in the early 60s who has polio,

a disease we don't see today because of vaccines.

And in 1949, Dr. Benjamin Sandler, a noted scientist studying polio,

noticed there was a correlation between incidence of polio and

ice cream consumption that we saw earlier.

And he concluded that sugar made children more susceptible to the disease.

In fact, the public health service actually issued a warning about sugar

consumption in children in the prevention of polio.

It became a big public information campaign during that period of time.

But it’s not the sugar in ice cream that matters. It's warm

weather that increases the disease and also increases ice cream consumption.

So the correlation is really explained by a third

variable, which was warm weather.

Because we know that viruses, and polio is a virus, become more active in the summer.

So what we have is a finding that sugar consumption, as represented by ice cream,

was associated with an increased prevalence of polio, and that's a correlation.

The Public Health Service said, for some reason, no, it's not just a correlation,

it's a causal effect.

And so we'll issue warnings about the fact that sugar consumption could lead

to polio.

But there was also a correlation between sugar consumption,

or ice cream consumption, and warm weather.

And what they found was that warm weather is the variable that increases the prevalence

of polio, and not sugar consumption.

In fact, sugar consumption or ice cream consumption has nothing to do with polio,

because the virus is more active in warm weather.
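The third-variable problem can be sketched with a small simulation. Everything below is synthetic: temperature drives both an "ice cream" score and a "polio" score, and neither has any effect on the other. The variable names, units, noise levels, and the 20 to 22 degree band are all assumptions made for the example, not historical data.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)

random.seed(0)  # fixed seed so the sketch is repeatable

temps, ice_cream, polio = [], [], []
for _ in range(5000):
    t = random.uniform(10, 35)                 # synthetic daily temperature
    temps.append(t)
    ice_cream.append(t + random.gauss(0, 5))   # depends on temperature only
    polio.append(t + random.gauss(0, 5))       # depends on temperature only

# Across all days the two outcomes look strongly related.
r_all = pearson_r(ice_cream, polio)

# Hold the third variable roughly constant: keep only days in a narrow band.
band = [(i, p) for t, i, p in zip(temps, ice_cream, polio) if 20 <= t < 22]
r_band = pearson_r([i for i, _ in band], [p for _, p in band])

print(f"all days:            r = {r_all:.2f}")
print(f"days at 20-22 only:  r = {r_band:.2f} (n = {len(band)})")
```

In this synthetic setup the correlation across all days is large, but it mostly disappears once temperature is held nearly constant, even though ice cream never influences polio in the code. That is what it means for a third variable to produce a correlation.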

So correlations are what we use to describe relationships

that exist between variables, but

we have to be careful not to make inferences about causation.

Thank you.

[SOUND]
