Data repositories in which cases are related to subcases are identified as hierarchical. This course covers the representation schemes of hierarchies and algorithms that enable analysis of hierarchical data, as well as provides opportunities to apply several methods of analysis.

Associate Professor at Arizona State University in the School of Computing, Informatics & Decision Systems Engineering and Director of the Center for Accelerating Operational Efficiency

K. Selcuk Candan

Professor of Computer Science and Engineering Director of ASU’s Center for Assured and Scalable Data Engineering (CASCADE)

So let's start.

So what does this ACF and PACF really tell us?

If you give me ACF and PACF plots,

how can I make sense out of it?

So the first important property of

the ACF is that if your time series has a trend,

if your time series shows a positive or negative trend,

the corresponding ACF will show a slow decay.

Now, let's see an example. So, here,

I am given a function that looks like this.

This my time series,

and I simply created this time series using an AR(1) model,

where Xt depends on the past value plus some random,

in this case positive, random errors.

So as a result, the time series essentially is increasing.

Since the random error is always positive,

it is basically constantly increasing.

So, it is monotonically increasing.

So, there's a positive trend.

So, this is our time series.

Now, let's take a look at the corresponding autocorrelation function.

So, on the left-hand side,

I have the series and on the right-hand side,

I have the corresponding autocorrelation plot.

In the autocorrelation plot,

the x-axis shows the lags.

So, these are the lags.

The y-axis shows the value of the autocorrelation.

Autocorrelation can be positive or negative.

In this case, the autocorrelation at every lag shown is positive.

The blue bars, essentially,

show the value of the autocorrelation for different lags.

As you can see, for this dataset,

what's happening is that we start from

a very high autocorrelation value and as we increase the lag,

the autocorrelation value slowly decays.

We see exactly what we had said.

We had said that

if our series shows a trend,

we would expect that the autocorrelation function

has a slow decay and indeed, that's what we see.

So, this essentially means that if you take your time series,

and if you basically plot

the corresponding autocorrelation function and if you see that there is a slow decay,

then you know that there is an underlying trend

which you may want to remove by applying, for example, differencing.

So, the autocorrelation function is telling you about that. It is a very strong tool.
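
A minimal sketch of this behavior, under stated assumptions: the series is taken to be X_t = X_{t-1} + e_t with errors drawn uniformly from [0, 1), which is only one possible reading of "AR(1) with positive random errors"; the coefficient, the error distribution, the series length, and the helper name `sample_acf` are all illustrative choices and do not come from the lecture.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = np.dot(d, d)
    return np.array([np.dot(d[: len(d) - k], d[k:]) / denom
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(0)
n = 200
errors = rng.uniform(0.0, 1.0, size=n)  # assumption: positive errors, uniform on [0, 1)
x = np.cumsum(errors)                   # X_t = X_{t-1} + e_t (assumed coefficient of 1)

acf_trend = sample_acf(x, 20)           # ACF of the trending series
acf_diff = sample_acf(np.diff(x), 20)   # ACF after first differencing

print(np.round(acf_trend[:6], 2))       # stays close to 1 and decays slowly
print(np.round(acf_diff[:6], 2))        # drops to near zero right after lag 0
```

In this sketch the differenced series is just the sequence of errors, so its autocorrelation beyond lag 0 should be small, which is the effect differencing is meant to have on a trend.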

Now, before we go

further with examples, I want to show one more thing in these plots.

In these plots, there are also these red lines or red curves.

So, these red curves are essentially going to show us

the statistical significance

of the autocorrelation values.

So, essentially, usually,

what it means is that anything between these curves,

anything in this region,

is not significant enough and essentially

we should really discount it.

We shouldn't really give much importance

or much weight to that because statistically speaking,

those autocorrelation values are not very reliable.
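
One common way to draw such a significance region is the approximate 95% bound ±1.96/√n for the sample autocorrelation of white noise. The sketch below assumes that rule; the red curves in the lecture's plots may be computed differently, and the function names and the example values are hypothetical.

```python
import numpy as np

def significance_bound(n, z=1.96):
    """Approximate 95% bound for the sample autocorrelation of white noise."""
    return z / np.sqrt(n)

def significant_lags(acf_values, n):
    """Lags k >= 1 whose autocorrelation lies outside the +/- bound."""
    bound = significance_bound(n)
    return [k for k, r in enumerate(acf_values) if k >= 1 and abs(r) > bound]

# Hypothetical autocorrelation values for lags 0..4 of a series with 400 points.
example_acf = [1.0, 0.5, 0.05, -0.2, 0.01]

print(significance_bound(400))              # about 0.098
print(significant_lags(example_acf, 400))   # [1, 3]
```

Values inside the bound are the ones treated as effectively zero when reading the plot.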

So, in this example, that's not important;

but we will see that information,

that the area within the red curves is going to be important in future examples.

Nevertheless, even if you basically don't look at this region between these two curves,

you will see that the data shows a strong slow decay which essentially tells

us that the time series that we have on the left-hand side has a trend.

That's good. So, now we know essentially one key property of autocorrelation functions.

A slowly decaying autocorrelation function indicates a trend.

What else? Is that the only thing we can do?

Well, it turns out that is not the case.

So, another important property that

ACF and PACF plots have concerns the partial autocorrelation function.

Most importantly, the partial autocorrelation function takes

zero values at lags greater than or equal to a + 1,

for any autoregressive time series of order a.

Now, this is good. So, now, if basically,

I think that my time series may be autoregressive,

I can maybe find the value of a by studying its partial autocorrelation function.

I can look at the partial autocorrelation function and I can

see where the zero values of the partial autocorrelation function are.

If the partial autocorrelation function starts having zero values beyond,

say, lag 10,

I know that my autoregressive time series has a maximum lag equal to 10.

So, let's see this. Let's see this with an example whether it works or not.

Once again, I have a time series on the left.

In this case, I have a partial autocorrelation function.

The plot for the partial autocorrelation function on the right.

Let's take a look at our partial autocorrelation function.

In this case, the partial autocorrelation function has

a non-zero value at lag equal to one,

but starting from lag equal to two,

I have zero values,

which essentially tells me that I might be

looking at an autoregressive function,

with lag equal to one.

Indeed, if you take a look at

basically the closed-form formula of the series that I have plotted,

you will see that it's actually a strongly autoregressive function.

The current value depends on the past value plus some random contribution.

It's an autoregressive function with lag equal to one.

This is what we called earlier, if you remember,

an AR(1) model;

and this I can discover by looking at the partial autocorrelation function.

Now, the partial autocorrelation function tells me, "Hey,

you are looking at an autoregressive function,

with lag equal to one."
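
The cut-off of the PACF after lag one can be checked numerically. This is a sketch under assumptions: it simulates a stationary AR(1) series with a hypothetical coefficient of 0.8 and standard normal errors (not the lecture's series), and it estimates the PACF with the Durbin-Levinson recursion applied to the sample ACF, which is one standard estimator among several. The helper names are illustrative.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = np.dot(d, d)
    return np.array([np.dot(d[: len(d) - k], d[k:]) / denom
                     for k in range(max_lag + 1)])

def pacf_from_acf(r):
    """Partial autocorrelations for lags 0..len(r)-1 (Durbin-Levinson recursion)."""
    r = np.asarray(r, dtype=float)
    pacf = np.zeros(len(r))
    pacf[0] = 1.0
    phi_prev = np.array([])  # coefficients of the previous-order fit
    for k in range(1, len(r)):
        num = r[k] - np.dot(phi_prev, r[k - 1:0:-1])
        den = 1.0 - np.dot(phi_prev, r[1:k])
        phi_kk = num / den
        phi_prev = np.append(phi_prev - phi_kk * phi_prev[::-1], phi_kk)
        pacf[k] = phi_kk
    return pacf

rng = np.random.default_rng(1)
n, phi = 2000, 0.8                 # assumed length and AR(1) coefficient
e = rng.normal(size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + e[t]   # X_t = phi * X_{t-1} + e_t

pacf = pacf_from_acf(sample_acf(x, 10))
print(np.round(pacf[1:6], 2))      # large at lag 1, near zero from lag 2 on
```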

So, for this curve,

the autocorrelation function told me that there's a trend and

the partial autocorrelation function told me that

the trend essentially is determined by lag equal to one.

Just looking at these plots,

I'm already learning stuff about my time series and I'm

learning about what the closed-form formula for the time series may look like.

Wonderful. What else? Is that all?

Because so far, basically,

if my time series has a trend,

I know how to find that trend to some degree.

If my time series is auto-regressive,

I know how to find the lag for that.

Right? What about for moving average time series? All right.

So, can the ACF and PACF plots

tell me anything about moving average series?

Well, it turns out that PACF can also tell me things about moving average time series.

In particular, if the PACF does not shut off,

shut off meaning having zero values beyond a fixed lag,

but instead decays towards zero slowly,

then that might indicate that we are looking at a moving average series.

So, essentially PACF may tell me whether I'm looking at the moving average series,

or whether I'm looking at an auto-regressive series

based on the shape of the PACF.

So, if I can do that, that's great because I can differentiate

between these two major types of time series.

Let's see whether it is correct or not, can we use that?

Okay. Here, once again,

I have a series on the left,

and in this case,

the partial auto-correlation function plot,

the PACF plot, on the right.

Let's take a look at the PACF plot in this case.

Note that the PACF plot in this case doesn't shut off.

I have non-zero values here and there.

It doesn't really shut off.

It starts high, it gets lower,

but it doesn't shut off.

Right? Usually, as I mentioned before,

usually we discount the values in this range.

Anything in this range

I can treat as zero,

because those values are not significant.

So, they are almost zero,

but essentially what we are seeing with this PACF plot

is that it starts high,

it has some large values,

and then it basically gets closer to zero.

It starts taking values very close to zero.

So, this tells us that maybe we are looking at a moving average time series.

Indeed, if you take a look at the model of the time series, in essence,

the function that was used to create this time series,

you will see that it's not auto-regressive.

It depends on the random input,

the random error, at the previous time instance.

We called this, if you remember, an MA(1) model.

So this is an MA(1) model.

It's a moving average time series,

and the partial auto-correlation function,

the plot of the partial auto-correlation function is telling me that.

It's telling me that, "Hey, Selcuk,

you are looking at a moving average time series."

Because the plot doesn't shut off immediately;

instead, the plot gradually takes smaller and smaller values. Right? That's great.
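
To see why the PACF of a moving average series decays rather than shutting off, one can compute it from the theoretical ACF of an MA(1) model X_t = e_t + θ e_{t-1}, whose only non-zero autocorrelation beyond lag 0 is ρ(1) = θ / (1 + θ²). The sketch below assumes θ = 0.9 (the lecture's coefficient is not given) and uses the Durbin-Levinson recursion; the names are illustrative.

```python
import numpy as np

def pacf_from_acf(r):
    """Partial autocorrelations for lags 0..len(r)-1 (Durbin-Levinson recursion)."""
    r = np.asarray(r, dtype=float)
    pacf = np.zeros(len(r))
    pacf[0] = 1.0
    phi_prev = np.array([])  # coefficients of the previous-order fit
    for k in range(1, len(r)):
        num = r[k] - np.dot(phi_prev, r[k - 1:0:-1])
        den = 1.0 - np.dot(phi_prev, r[1:k])
        phi_kk = num / den
        phi_prev = np.append(phi_prev - phi_kk * phi_prev[::-1], phi_kk)
        pacf[k] = phi_kk
    return pacf

theta = 0.9                          # assumed MA(1) coefficient
rho = np.zeros(9)                    # theoretical ACF for lags 0..8
rho[0] = 1.0
rho[1] = theta / (1.0 + theta ** 2)  # all higher lags stay exactly zero

pacf_ma1 = pacf_from_acf(rho)
print(np.round(pacf_ma1[1:], 3))     # alternating sign, magnitude shrinking gradually
```

Even though the ACF is exactly zero beyond lag 1 here, the partial autocorrelations remain non-zero at every lag and only shrink gradually.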

I have another tool that tells me if I'm looking at a moving average time series,

or if I'm looking at an auto-regressive time series.

Let's see another example.

Right? So, this is another time series I have on the left,

and I have another partial auto-correlation function on the right.

Indeed, again, if you take a look at that,

the values that it takes don't shut off immediately.

In fact, basically, it goes

beyond this red area that I plotted, which marks what is not significant, and it doesn't shut off.

I mean, it takes non-zero values,

beyond immediate small lags.

So, this again tells me that I might be looking at a moving average time series.

Again, if I take a look at the formula that I used to create this time series,

it is a moving average time series.

It has two moving average terms.

In fact, the maximum lag that I'm using is three.

So, what I know,

if I now look at the closed-form formula,

is that this is an MA(3) model,

a moving average model of order three,

and what the partial auto-correlation plot tells me is that

I am looking at a moving average model.

So, great. So now,

I can tell if I'm looking at a trend,

I can tell if I'm looking at an auto-regressive function,

and I can tell if I'm looking at a moving average function

by looking at ACF and PACF plots.

This is strong. So this is really good

because before that I had no idea when I looked at a time series.

I couldn't tell what I was looking at.

Now, I can actually tell. This is really good.

Right? What else though?

I mean, can I do anything else?

Can I learn more about the time series

by maybe studying the PACF and ACF more closely?

Well, it turns out that I actually can. If I am looking at a moving average time series,

say, as confirmed by the PACF,

then if I look at the ACF,

maybe I can actually learn more about my moving average time series.

So, it turns out that for a moving average series the ACF plot has non-zero values

only up to the largest lag of the model.

What does it tell me?

What it tells me is that, if you suspect

that the time series that you have is a moving average time series,

because you looked at the PACF and it tells you that

you are looking at a moving average time series,

what you can do is look at the auto-correlation plot

and see where the corresponding non-zero values are.

In this case, I have a non-zero value here,

and everything else I can essentially treat as zero.

As I mentioned earlier,

anything basically between these two lines is not statistically significant for us,

which essentially means that I can treat them as zeros for our purposes.

That in turn means that,

if I'm looking at a moving average time series,

the corresponding lag should be just one and nothing else.

Indeed, if you remember for this time series,

it's a moving average time series,

with the lag value equal to one.

Great. So, I have a tool right now that

not only tells me that I'm looking at a moving average time series,

but it also tells me that I'm looking at

a moving average time series with lag equal to one.
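
A sketch of this reading on simulated data, with assumptions labeled: the series is an MA(1) model X_t = e_t + θ e_{t-1} with a hypothetical θ = 0.9 and standard normal errors, and the significance region is approximated by ±1.96/√n. None of these values come from the lecture.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = np.dot(d, d)
    return np.array([np.dot(d[: len(d) - k], d[k:]) / denom
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(2)
n, theta = 2000, 0.9                 # assumed length and MA(1) coefficient
e = rng.normal(size=n + 1)
x = e[1:] + theta * e[:-1]           # X_t = e_t + theta * e_{t-1}

acf_ma1 = sample_acf(x, 10)
bound = 1.96 / np.sqrt(n)            # approximate 95% significance bound

print(np.round(acf_ma1[1:6], 2))     # clearly non-zero at lag 1 only
print(round(bound, 3))
```

The lag-1 value should sit near θ / (1 + θ²), about 0.5 for this choice of θ, while the higher lags stay small.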

Well, this worked for lag equal to one;

does it work for other time series?

Well, let's see. Okay? Here, again,

we have our second example.

On the left hand side we have our time series,

on the right hand side we have our auto-correlation function plot,

and let's see where we have non-zeros.

So once again, I will discount anything basically between

these two lines because they are not statistically significant for our purposes.

So, this tells me that basically I have non-zero values,

at lag equal to one,

and at lag equal to three.

This essentially tells me, "Well,

Selcuk, you are looking at a time series

that is a moving average time series, as confirmed by

the PACF and by the auto-correlation function.

You have two moving average terms,

one of them is at lag one,

and one of them is at lag three,"

and indeed it is.

So this is how I generated this sample time series.

I created the moving average time series with

two moving average terms at lag one and lag three. So this is great.
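
For a model of the form X_t = e_t + θ1 e_{t-1} + θ3 e_{t-3}, the theoretical autocorrelations can be computed directly from the coefficients. The sketch below uses hypothetical coefficients θ1 = 0.8 and θ3 = 0.6, since the lecture does not state the values it used, and an illustrative helper name. It also shows a caveat: in theory the lag-2 autocorrelation is proportional to θ1·θ3, so it is not exactly zero, although for other coefficients it may be small enough to fall inside the significance region of a plot.

```python
import numpy as np

def ma_theoretical_acf(theta, max_lag):
    """Theoretical ACF of X_t = e_t + theta[0]*e_{t-1} + theta[1]*e_{t-2} + ..."""
    psi = np.concatenate(([1.0], np.asarray(theta, dtype=float)))
    gamma0 = np.dot(psi, psi)  # variance, up to the error variance
    rho = np.zeros(max_lag + 1)
    for k in range(min(max_lag, len(psi) - 1) + 1):
        rho[k] = np.dot(psi[: len(psi) - k], psi[k:]) / gamma0
    return rho

# Assumed coefficients: terms at lag 1 and lag 3, no direct term at lag 2.
rho = ma_theoretical_acf([0.8, 0.0, 0.6], max_lag=6)
print(np.round(rho, 2))
```

With these values the autocorrelations at lags 1 to 3 are 0.4, 0.24, and 0.3, and everything beyond lag 3 is exactly zero, so the cut-off point identifies the largest lag of the model.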

As you can see, by using ACF and PACF,

I can confirm that I'm looking at a moving average time-series, and not only that.

I can also discover the terms,

the moving average terms,

that I need to include in my model.

So, ACF and PACF turn out to be very strong tools.

Although they are very simple to define,

and although they are very simple to compute mathematically,

they are very strong tools to help us do

visual analytics on the plots to discover the characteristics of the time series.
