The reason we cut it off at three principal components is, again,

remember that the eigenvalues tell us the variance in the dataset.

By keeping three principal components,

we have already captured over 97% of the variance in the dataset.

So we decided that that was good enough.

Of course, if that's not good enough,

you can include higher-order principal components.

So if you go to four principal components instead of three, you would

keep about 97.8% of the variance and so on and so forth.

So one gets this additional information from PCA, which is usually

called a scree plot, and this plot

gives us objective guidance on where to truncate the principal components.
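As a hedged illustration of that truncation rule (the random data and the 97% threshold here are stand-ins, not the lecture's dataset), one can compute the cumulative explained variance with scikit-learn and keep the smallest number of components that crosses the threshold:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Illustrative stand-in data: 200 samples in a 50-dimensional space
# whose variance is concentrated in a few directions.
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 50))
X += 0.01 * rng.normal(size=X.shape)  # small noise in all directions

pca = PCA().fit(X)
# The explained-variance ratios come from the covariance eigenvalues.
cumulative = np.cumsum(pca.explained_variance_ratio_)

# Smallest number of components capturing at least 97% of the variance
n_keep = int(np.searchsorted(cumulative, 0.97) + 1)
print(n_keep, cumulative[n_keep - 1])
```

Plotting `cumulative` (or the raw eigenvalues) against the component index gives exactly the scree plot discussed above.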

Now let's look at another set of examples.

In the previous set of examples,

the microstructures were obtained from actual experiments.

But one can also think of generating a very large set of synthetic

microstructures where you are just digitally making them up.

So for example, if one were to think of a matrix-precipitates system, two phases, so

in this case, the matrix is shown in black and the precipitates are shown in white.

One can think of making many, many classes of distributions.

In this particular one, we're only focused on four classes.

And one can also think of many shapes of inclusions, and

one can think of many volume fractions of interest.

In this particular case study, we generated about 900 structures.

Of course, you can generate a lot more, but this was a particular example, a case

study, and that case study is described in this paper for further information.

So, nevertheless, in these 900 microstructures there is already

a rich distribution of inclusion shapes,

placement of inclusions, as well as volume fractions.

If you take all these 900 microstructures and

throw them into the protocol that we have been learning in this class,

that is, first compute the two-point statistics and

then do the principal component analysis.
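A minimal sketch of the first step of that protocol, assuming periodic boundary conditions and a made-up binary microstructure (this is not the lecture's code): the two-point autocorrelation can be computed with the FFT, and its value at zero displacement recovers the volume fraction of the phase:

```python
import numpy as np

def two_point_autocorrelation(micro):
    """Periodic two-point autocorrelation of a binary microstructure,
    computed with the FFT (Wiener-Khinchin theorem)."""
    F = np.fft.fftn(micro)
    corr = np.fft.ifftn(F * np.conj(F)).real / micro.size
    return np.fft.fftshift(corr)  # zero displacement vector at the center

# Made-up 80x80 two-phase microstructure (matrix = 0, precipitate = 1)
rng = np.random.default_rng(0)
micro = (rng.random((80, 80)) < 0.3).astype(float)

stats = two_point_autocorrelation(micro)
center = stats[40, 40]  # value at zero displacement = volume fraction
print(stats.shape, round(center, 3))
```

These 80x80 maps of statistics are what would then be fed into the PCA step.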

So the principal component analysis we discussed in the previous lessons,

applied to these 900 microstructures, produces these plots.

The plots in the top row are projections of

the principal component scores onto two sets of axes at a time.

So the first plot is showing you PC1 versus PC2.

The second plot is showing you PC1 versus PC3.

And the third plot is showing you PC2 versus PC3.

So, it is actually the same plot.

The original plot is a 3D plot that contains all three scores, PC1, PC2, and PC3, but

what you are seeing are selected projections of this three-dimensional plot.

And right away, you can see that the five classes of placement of

the precipitate naturally lead to clustering:

five different clusters in the principal component plots.

Again, this comes out naturally from the PC analysis;

we did not tag the microstructures to indicate that

some of these were random, horizontal, vertical, clustered, or whatever it is.

This information was not provided to the principal component analysis.

In spite of not having that information, the data gets automatically clustered.

One of the benefits of doing the principal components is simply that we get three

principal components.

Whereas each microstructure in the original dataset has 80x80 pixels, which

means the original dimensionality of the microstructure is 6400.

So from 6400 you went down to 3 principal components, and yet the microstructures

have clustered as expected even though we did not provide that information.
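That reduction from 6400 dimensions to 3 scores can be sketched with scikit-learn; the random arrays below are stand-ins for the flattened two-point statistics of the 900 microstructures:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Stand-in for the ensemble statistics: 900 maps of 80x80 = 6400 values
stats = rng.normal(size=(900, 80, 80))
X = stats.reshape(900, -1)         # (900, 6400): one row per microstructure

pca = PCA(n_components=3)
scores = pca.fit_transform(X)      # (900, 3): three PC scores each
print(X.shape, scores.shape)
```

Scatter-plotting the columns of `scores` against each other gives the pairwise PC1/PC2/PC3 projections shown in the plots, where clustering can be inspected.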

Now what are we actually capturing in the principal components?

Here are some of the plots of what the average looks like.

This is the average of the statistics over all the 900 microstructures.

These are maps of the first principal component, the second principal component,

and the third principal component.

So each one of these plots is capturing a particular spatial pattern.

And this is some sort of a signature pattern.

And the PC score associated with each component

then tells you how strong this feature is in the given microstructure.

As an example, if one looks at the autocorrelation of

one of the 900 microstructures, this is what you get.

This is the autocorrelation, and the symbol would be this one.

In the truncated principal component representation, we are approximating this

using these terms.

Of course, there are other terms in the principal component expansion,

but we are ignoring those.

What this says is that the pattern represented

by phi_1(r) has this much strength in this particular micrograph.

And likewise, the pattern represented by phi_2(r) has

this much strength in this micrograph, and so on and so forth.

So the advantage of this principal component analysis representation

is that the microstructure is represented by these three numbers.

These three numbers are the weights of the different principal components.

A different microstructure in the same ensemble would have three other numbers,

but every microstructure in that ensemble of 900 microstructures

now has a distinct set of three numbers that points to it.
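The truncated representation described above, the ensemble mean plus score-weighted basis patterns, can be written out explicitly; the random data and the 3-term truncation below are illustrative stand-ins:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 400))    # stand-in flattened statistics

pca = PCA(n_components=3).fit(X)
alpha = pca.transform(X[:1])[0]    # the three PC scores of one "microstructure"

# Truncated representation: ensemble mean + sum_i alpha_i * phi_i,
# where phi_i are the rows of pca.components_
approx = pca.mean_ + alpha @ pca.components_
print(alpha.shape)
```

The three entries of `alpha` are exactly the "three numbers" that point to a given microstructure; `approx` is the corresponding approximate reconstruction of its statistics.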

So the hypothesis, and our hope, is that this

representation is what we need to make connections to properties and processing.

In summary, we have learned in this lesson that application of PCA on spatial

statistics offers unsupervised classification of material structure.

Although we didn't explicitly state it, all the algorithms that are used in

the analysis of the examples in this lesson are very broadly available.

And as a specific example, they are easily accessible through

the pymks.org code repository.

There are also other open-access,

open-source repositories that provide similar functionality, of course.

PCA also allows objective quantification of the variance within

the microstructure ensembles.

Because the calculations are very cheap and computationally very efficient,

they can be attached to almost any in-line analytics; this is especially

useful with expensive experiments as well as with expensive simulations.

Thank you.

[MUSIC]