>> To explore this question, we create a third variable that is categorical.

For this new variable, the income-per-person variable,

which is quantitative, will be categorized as high income countries,

middle income countries, and low income countries.

The adjustments we made to our program are very similar to the adjustments we made to

our ANOVA syntax and to our chi-square syntax when testing moderation.

We'll start our program by calling in our libraries and loading the Gapminder data.

Next, we set our three variables of interest to numeric, and

set blank data on our third variable to NaN.

Then I create a new data frame, which I am calling data_clean, that drops all missing,

that is, NaN, values for each of the variables in the data set.
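
As a rough sketch of these cleaning steps, here is one way they might look with pandas. The column names incomeperperson, urbanrate, and internetuserate are assumptions about how the data file is labeled, and the small table below is made-up stand-in data, not actual Gapminder values.

```python
import pandas as pd

# Made-up stand-in for the Gapminder file; in practice the data would be
# loaded from disk. Column names are assumed, not taken from the lecture.
raw = pd.DataFrame({
    "incomeperperson": ["500.5", " ", "3000", "25000", "12000"],
    "urbanrate": ["30.1", "45.0", "", "80.2", "66.6"],
    "internetuserate": ["5.0", "20.0", "35.5", "81.0", " "],
})

data = raw.copy()
for col in ["incomeperperson", "urbanrate", "internetuserate"]:
    # errors="coerce" converts blanks and other non-numeric text to NaN
    data[col] = pd.to_numeric(data[col], errors="coerce")

# Drop every row that has a NaN in any of the variables
data_clean = data.dropna()

print(len(data), "rows before cleaning,", len(data_clean), "rows after")
```

Coercing to numeric and setting blanks to NaN happen in one step here, which is a design choice of this sketch rather than the only way to do it.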

Now, I create my income group variable which splits the sample of countries into

low, middle, and high income groups using the dummy codes 1, 2, and 3.

Next, I create three different data frames that include only one income group each.

These are called sub1 for low income countries, sub2 for

middle income countries, and sub3 for high income countries.
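
The grouping and subsetting steps could be sketched as follows. The two income cutoffs are hypothetical placeholders chosen for illustration, the column names are assumed, and the rows are made-up data.

```python
import numpy as np
import pandas as pd

# Made-up, already-cleaned rows; column names are assumed
data_clean = pd.DataFrame({
    "incomeperperson": [300.0, 700.0, 2500.0, 8000.0, 15000.0, 40000.0],
    "urbanrate": [20.0, 35.0, 50.0, 60.0, 75.0, 90.0],
    "internetuserate": [2.0, 4.0, 20.0, 35.0, 70.0, 85.0],
})

# Hypothetical cutoffs between low/middle and middle/high income
LOW_CUT = 1000.0
HIGH_CUT = 10000.0

# Codes 1, 2, 3 stand for low, middle, and high income groups
data_clean["incomegrp"] = pd.cut(
    data_clean["incomeperperson"],
    bins=[-np.inf, LOW_CUT, HIGH_CUT, np.inf],
    labels=[1, 2, 3],
).astype(int)

# One data frame per income group
sub1 = data_clean[data_clean["incomegrp"] == 1]  # low income
sub2 = data_clean[data_clean["incomegrp"] == 2]  # middle income
sub3 = data_clean[data_clean["incomegrp"] == 3]  # high income

print(data_clean["incomegrp"].value_counts().sort_index())
```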

Then we request a Pearson correlation measuring the association between

urban rate and Internet use rate as well as its associated P value for

each of our new data frames.

We use the pearsonr function from the scipy.stats library and

include our variables urban rate and Internet use rate.
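
A minimal sketch of requesting the correlation within each income group might look like this. The data are made up, so the numbers it prints are not the results reported in this lesson, and the column names are assumptions.

```python
import pandas as pd
from scipy.stats import pearsonr

# Made-up data with an income group code already attached (1, 2, 3)
df = pd.DataFrame({
    "incomegrp": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    "urbanrate": [10, 20, 30, 40, 20, 40, 60, 80, 60, 70, 80, 90],
    "internetuserate": [1, 3, 2, 4, 10, 25, 28, 41, 80, 70, 85, 75],
})

results = {}
for code, label in [(1, "low"), (2, "middle"), (3, "high")]:
    sub = df[df["incomegrp"] == code]
    # pearsonr returns the correlation coefficient and its two-sided p value
    r, p = pearsonr(sub["urbanrate"], sub["internetuserate"])
    results[code] = (float(r), float(p))
    print(f"{label} income: r = {r:.3f}, p = {p:.3f}")
```

Looping over the group codes gives the same output as calling pearsonr once on each of sub1, sub2, and sub3.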

When we examine the correlation coefficients between urban rate and

Internet use rate for each of the income groups, we find the following.

For the low income group the correlation between urban rate and

Internet use rate is 0.11 and the P value is not significant.

For the middle income countries, the association between Internet use rate and

urban rate is 0.32,

with a significant P value of 0.001. Finally, among high income

countries, the correlation coefficient is 0.089, with a large P value,

suggesting that the association between urban rate and

Internet use rate is not significant for high income countries.

When we map these findings onto the associated scatter plots for

each income group, we are better able to visualize the significant and

non-significant relationships.

Estimating a line of best fit within each scatter plot

shows the positive association between urban rate and

Internet use rate among the middle income countries

and almost no relationship between these variables in both the low income and

high income countries.
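
One possible way to draw a scatter plot with a line of best fit for each income group is sketched below, using matplotlib with numpy.polyfit for the fitted line. This is an assumption about tooling, not necessarily what was used to produce the plots described here, and the data and column names are made up.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to view plots interactively
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up data with an income group code already attached (1, 2, 3)
df = pd.DataFrame({
    "incomegrp": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    "urbanrate": [10, 20, 30, 40, 20, 40, 60, 80, 60, 70, 80, 90],
    "internetuserate": [1, 3, 2, 4, 10, 25, 28, 41, 80, 70, 85, 75],
})

labels = {1: "Low income", 2: "Middle income", 3: "High income"}
fig, axes = plt.subplots(1, 3, figsize=(12, 4))
fits = {}

for ax, (code, label) in zip(axes, labels.items()):
    sub = df[df["incomegrp"] == code]
    x = sub["urbanrate"].to_numpy(dtype=float)
    y = sub["internetuserate"].to_numpy(dtype=float)

    # Degree-1 least-squares fit: returns slope, then intercept
    slope, intercept = np.polyfit(x, y, 1)
    fits[code] = (float(slope), float(intercept))

    ax.scatter(x, y)
    xs = np.linspace(x.min(), x.max(), 50)
    ax.plot(xs, slope * xs + intercept)
    ax.set_title(label)
    ax.set_xlabel("Urban rate")
    ax.set_ylabel("Internet use rate")

fig.tight_layout()
```

In an interactive session, calling plt.show() after the last line would display the three panels side by side.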

>> Asking questions about statistical interactions can be an interesting way to

explore your data and your associations of interest.

This is not difficult to do using the skills you've acquired thus far.