That's the quickest way that you can do it.

One other thing that you will notice in Excel is that it will give you

the upper bound and the lower bound of the, confidence interval, and if you don't

mention that you want 95%, the default is 95%, and it will give it to you twice.

So, if you say 99%, it's going to show you lower 99%, upper 99%, but

also it will show you lower and upper 95%, so I consider that to be a bug,

so I usually just go ahead and delete these because it's redundant.

So what you would see is that there is an intercept and there is a down time.

So this is really my B 0 and my B 1.

So based on this, my regression equation is

98.01- 1.648X, X being the downtown.

How many hours the line has been down.

So, one of the things that you want to look at is P value for

downtown is extremely small.

One of the things that in regression does is that the null hypothesis is that X and

Y has no relationship and if you reject that that means that you have found

a variable that has a relationship with other variable and

the p is so small will reject a null hypothesis the null hypothesis being

they are not related at all.

Then one of the things that we want to look at is R square.

And R Square is the percent of the variation that you can explain

in the productivity because of down time so 88.4% of productivity

differences that we have collected in our data seems to be because of downtime.

So let me explain R Square a little bit in more detail.

For that I'm going to go to this table ANOVA table.

What we see here is this is the sum of squares and this is the sum of

squares due to regression variable and this is the on square due to total.

The solo square of the total shows the total variations that we see in our data

and some of this is coming from the regression variable and

some of this is coming from the errors,

things that we have not counted for, we don't know what they are at this point.

R Square R squared is the ratio of SSR to SST, and

the higher this value is, the more explanation we have found for

the variations that we see, which means that we have a good predictive model.

So if I take that ratio right here, this is SSR divided by SST,

and what you see here is exactly what you see here.

That's why r squared represents the percentage of explained

variations due to the regression variable.

Now multiple r is your correlation.

That is your correlation, and correlation is just simply square root of R Square,

if I take the square root of this value, you would ger R.

As you can see these values are exactly the same.

So R is the square root of R Square.

Correlation is a value that goes from zero to one.

Or zero to negative one.

The closer you are to one,

the higher is the degree of correlation between the two variables.

And the closer you are to zero, there is less correlation.

The sign is simply telling you, if they are directly related,

directly related means that if one goes up the other one also goes up, or

negatively correlated though, inversely correlated.

Which is if one goes up, the other one goes down.

So look at our downtime.

As the number of hours of downtime goes up, what is happening to our productivity?

It goes down.

So this is a negative correlation.

A positive correlation would have had your points going this way.

So this is a negative correlation.

How would you have known it by looking at this output?

You would know this by looking at the sign.

So the sign of the coefficient.

Which you also see here is telling you if this a positive correlation or

a negative correlations.

So, what does this mean now?

That means, that for every hour that you're down,

our productivity goes down by another 1.6%.

That's basically what it means.

So, if I told you that X=8 and I ask you to predict what would be the productivity.

You simply will come and use this equation to answer that.

And you can answer that, and I like just taking this values right here and

copying it elsewhere.

And now I'm going to say for X of eight hours of down time, what is my prediction?

It's simply this value plus this value multiplied by eight.

And that will give you 84.8.

So if you're down in a given month for eight hours total you expect your

average productivity for that month to be 84.82 now let me make a point here.

The coefficients that you see here are the middle of a confidence interval.

So, what you can see here is that you are also doing your confidence interval here.

So regression is bringing everything we have learned so far together.

They bring hypothesis testing when we do key value to check whether or

not there is.

Significant relationship between variable that you identify, is going to give you

a confidence interval for every variable, so then I come up with this value.

I have used the middle values, but this is just one estimation.

We actually have to come up with a confidence interval for

our prediction as well.

The reason I am not showing that to you is because Excel really does not do a very

good job of coming up with a predictive variable.

We need to have another set of equations.

So I'm ignoring it, so what I'm showing you is a point estimation.

This is a point estimation of if you had a month for eight hours.

Of down time.

What would be the point estimates of its predicted productivity?