Note that I talk about the direction of the association.
So rather than just saying manufacturing lead times are significantly
associated with the number of ingredient units in stock,
I say which way the association runs.
Otherwise, it's not especially informative.
When I talk about the direction of the association, what I mean is that I'm
indicating whether it's a positive or a negative association.
Manufacturing lead time was negatively associated with ingredient units in stock.
And I described that negative association by writing that manufacturing lead
times were shorter when there was a greater number of ingredient units in stock.
And then I do the same thing with the association between the number of
hours production workers had worked on their shift before beginning production and
manufacturing lead times, which was a positive association.
Manufacturing lead times increased when production workers had
worked more hours on their shift before beginning production.
And then I write that manufacturing lead time was not significantly associated with
the number of steps involved in the production of a batch, and
I provide the Pearson correlation coefficient, -0.05, and
the p value. It was also not associated with the number of hours of sleep that
production workers reported getting the night before batch production began.
And again, I report the Pearson correlation coefficient,
0.01, p = 0.71.
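
To make the idea of reporting direction concrete, here is a minimal sketch in Python. It assumes SciPy is available and uses made-up data, not the data from this project. The variable names and the helper describe_association are hypothetical and only there for illustration.

```python
import numpy as np
from scipy import stats

# Synthetic, made-up data for illustration only; not the lecture's data set.
rng = np.random.default_rng(0)
n = 200
units_in_stock = rng.normal(50, 10, n)
# Build a response that tends to fall as the number of units in stock rises.
lead_time = 100 - 0.8 * units_in_stock + rng.normal(0, 5, n)


def describe_association(x, y, alpha=0.05):
    """Return Pearson r, its p value, and a plain-language direction label."""
    r, p = stats.pearsonr(x, y)
    if p >= alpha:
        direction = "not significant"
    elif r > 0:
        direction = "positive"
    else:
        direction = "negative"
    return r, p, direction


r, p, direction = describe_association(units_in_stock, lead_time)
print(f"r = {r:.2f}, p = {p:.3g}, direction: {direction}")
```

The label is only a starting point. In the write-up you would still put it in words the reader can use, such as lead times were shorter when more ingredient units were in stock.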
Note that the figure has a title that describes what's going on in the figure.
Note also that I follow the standard graphing convention
where my predictor variable is plotted on the horizontal, or x axis.
And my response variable is plotted on the vertical, or y axis.
Note also that the variable labels are explanatory.
Rather than just showing the actual variable name, they are written so the reader
can understand what the variable means without having to refer to a code book.
These are important characteristics of figures.
You want to make sure that you have a title and
you want to make sure that the variable labels are informative.
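
As a rough sketch of those figure conventions, assuming matplotlib and made-up data rather than the project's data, a scatter plot with a descriptive title, the predictor on the x axis, the response on the y axis, and readable labels might look like this. The variable names and label wording are my own assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic, made-up data for illustration only.
rng = np.random.default_rng(1)
units_in_stock = rng.normal(50, 10, 100)
lead_time = 100 - 0.8 * units_in_stock + rng.normal(0, 5, 100)

fig, ax = plt.subplots()
# Predictor on the horizontal axis, response on the vertical axis.
ax.scatter(units_in_stock, lead_time)
# A title that says what the figure shows.
ax.set_title("Manufacturing lead time by ingredient units in stock")
# Labels a reader can understand without a code book,
# rather than raw variable names like units_in_stock.
ax.set_xlabel("Number of ingredient units in stock")
ax.set_ylabel("Manufacturing lead time")
```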
So in the previous slide, I summarized the association between
each quantitative predictor and the quantitative response variable
using scatter plots and a Pearson correlation coefficient.
In this next section I discuss the association between each
categorical predictor and the quantitative response variable.
The appropriate bivariate analysis when you have a categorical predictor and
a quantitative response variable is analysis of variance.
So I write that analysis of variance indicated that average manufacturing
lead times did not differ significantly as a function of equipment failure, and
I provide the F statistic and the associated degrees of freedom
in parentheses, and the p value associated with the F statistic.
And then finally, the R-square, which is the proportion of the variance
in manufacturing lead times that is accounted for by equipment failure.
I also write that trainee involvement in
production is also not significantly associated with manufacturing lead time.
And again I provide the F statistic, the p value and the R-square.
And then I point the reader to figures 2 and
3 to give them a visual of the association.
Notice also that when I describe the results of the analysis of variance,
I do not include the means in the text.
That's because I have provided the means in the figures.
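
Here is a minimal sketch of where those reported numbers come from, assuming SciPy and made-up lead times for two hypothetical groups of batches. It is not the analysis from this project, just a one-way analysis of variance with the F statistic, its degrees of freedom, the p value, and the R-square computed as the between-group sum of squares over the total sum of squares.

```python
import numpy as np
from scipy import stats

# Synthetic, made-up lead times for illustration only.
rng = np.random.default_rng(2)
no_failure = rng.normal(60, 8, 120)  # batches with no equipment failure
failure = rng.normal(61, 8, 40)      # batches with an equipment failure
groups = [no_failure, failure]

# One-way analysis of variance: F statistic and its p value.
f_stat, p_value = stats.f_oneway(*groups)

# Degrees of freedom reported in parentheses after F.
all_values = np.concatenate(groups)
df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

# R-square: proportion of variance in the response explained by group membership.
grand_mean = all_values.mean()
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_total = ((all_values - grand_mean) ** 2).sum()
r_squared = ss_between / ss_total

print(f"F({df_between}, {df_within}) = {f_stat:.2f}, "
      f"p = {p_value:.3f}, R-square = {r_squared:.3f}")
```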
Finally, I discuss the results of my multivariable analysis.
I first point the reader to Figure 4, which showed that five of the six
variables were retained in the model selected by the lasso regression analysis.
Only the number of production steps predictor was excluded.
The number of ingredient units in stock
and the number of shift hours employees worked before beginning production were most
strongly associated with manufacturing lead time, followed by equipment failure,
trainee involvement in production, and the number of hours of sleep
that production workers reported getting the night before their shifts began, and
this is shown in Table 2.
Table 2 shows the LAR, or least angle regression, variable selection summary,
along with the average squared error, which is also the mean squared error,
associated with each variable as it was entered into the model.
Then I provide a little bit of information about the direction of the association
between each variable and the response variable, manufacturing lead time.
Manufacturing lead times were shorter for batches that had a greater number of
ingredient units in stock, and when production operators reported sleeping for
more hours the night prior to batch production.
On the other hand, working more shift hours prior to manufacturing,
equipment failure, and having trainees involved in batch production
were associated with increased lead times.
Together these five predictors accounted for
93.3% of the variance in manufacturing lead time.
Then I report the mean square error for
the test data set compared to the mean square error for the training data set,
to show that they differed very little, which suggests that predictive accuracy
did not decline when the lasso regression algorithm developed in the training data
set was applied to predict manufacturing lead times in the test data set.
And again, I refer the readers to Figure 4,
which shows the mean square error rates for both the test and training data sets.
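
To show what sits behind that paragraph, here is a rough sketch of a lasso regression fit with the least angle regression algorithm, followed by the training and test mean square error comparison. It assumes scikit-learn's LassoLarsCV, which may not be the software used for this project. The data, the effect sizes, the 70/30 split, and the variable names are all made up for illustration, so the numbers it prints will not match the ones I reported.

```python
import numpy as np
from sklearn.linear_model import LassoLarsCV
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic, made-up data; effect sizes are assumptions, not the lecture's results.
rng = np.random.default_rng(3)
n = 400
names = ["units_in_stock", "shift_hours", "equipment_failure",
         "trainee_involved", "sleep_hours", "production_steps"]
X = np.column_stack([
    rng.normal(50, 10, n),   # units_in_stock
    rng.uniform(0, 8, n),    # shift_hours
    rng.integers(0, 2, n),   # equipment_failure (0 = no, 1 = yes)
    rng.integers(0, 2, n),   # trainee_involved (0 = no, 1 = yes)
    rng.normal(7, 1, n),     # sleep_hours
    rng.integers(3, 12, n),  # production_steps (given no true effect here)
]).astype(float)
y = (80 - 0.9 * X[:, 0] + 2.0 * X[:, 1] + 3.0 * X[:, 2]
     + 2.0 * X[:, 3] - 1.0 * X[:, 4] + rng.normal(0, 2, n))

# Hold out 30% of the batches as a test set (the split size is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Standardize predictors so the coefficient sizes are comparable.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Lasso fit by least angle regression, penalty chosen by 10-fold cross-validation.
model = LassoLarsCV(cv=10).fit(X_train_s, y_train)

# Rank predictors by absolute coefficient; the sign gives the direction.
coefs = dict(zip(names, model.coef_))
for name, coef in sorted(coefs.items(), key=lambda item: -abs(item[1])):
    print(f"{name:>18}: {coef: .3f}")

# Compare predictive accuracy in the training and test data sets.
train_mse = mean_squared_error(y_train, model.predict(X_train_s))
test_mse = mean_squared_error(y_test, model.predict(X_test_s))
r2_train = model.score(X_train_s, y_train)
print(f"training MSE = {train_mse:.2f}, test MSE = {test_mse:.2f}, "
      f"training R-square = {r2_train:.3f}")
```

If the test mean square error is close to the training mean square error, that is the evidence that accuracy held up in new data. A test error that is much larger would suggest the model was overfitted to the training set.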