0:00

Now, let's talk in more detail about

our first class of dynamic latent variable models, namely state-space models.

I remind you that state-space models are dynamic latent variable models,

where both the hidden and observed states are continuous.

So they are appropriate when your data is continuous.

The problem of modeling continuous data with

a dynamic latent state is formulated as follows.

Assume that we have T observations of the observed signal y,

which can be either univariate or multivariate depending on your settings.

We want to build a dynamic latent variable model with a hidden state x_t

that would capture the dynamics of the system and filter noise out.

If we can do this, the hidden state

becomes a sufficient statistic for predicting the future.

As in the general case that we introduced earlier, in state-space models we

assume that the dynamics of the continuous hidden state are first-order Markov.

The observed state depends only on the hidden state and nothing else.

In particular, it does not directly depend on its own previous value,

as would be the case for a regular Markov model.

This means that the probability of the total time

series path of the pair (x, y) is given by the product over

all time steps t of a product of

the transition probability for x times

the emission probability for y given the value of x.
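
Written out, the factorization just described might look as follows. This is a sketch: the notation x_{1:T} for the whole path and the convention that the first transition is taken from some given initial state x_0 are assumptions, since the slide is not part of the transcript.

```latex
p(x_{1:T}, y_{1:T})
  = \prod_{t=1}^{T}
    \underbrace{p(x_t \mid x_{t-1})}_{\text{transition}}
    \,
    \underbrace{p(y_t \mid x_t)}_{\text{emission}}
```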

We can understand this expression better if we take a special case.

Let's say T=1, so we have only one term in the product.

So in this case, the likelihood of complete data is just the product of two probabilities:

one for the transition of the hidden state

and another for the emission of the observed signal Y.

Okay, this is the likelihood of complete data,

but we do not have complete data as state X is hidden.

So to compare with actual data, we have to marginalize over x,

that is, to integrate over all possible values of x.

This gives the second formula here.
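
As a sketch of what these two formulas presumably look like for T=1 (the initial state x_0 is an assumed convention, and the exact notation on the slide may differ), the complete-data likelihood and its marginal over the hidden state are:

```latex
% Complete-data likelihood for T = 1
p(x_1, y_1) = p(x_1 \mid x_0) \, p(y_1 \mid x_1)

% Marginal likelihood of the observed signal: integrate the hidden state out
p(y_1) = \int p(x_1 \mid x_0) \, p(y_1 \mid x_1) \, dx_1
```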

Now, if we rename the variable x_1 to, for example, omega,

this formula is the same as a continuous mixture model,

where we take a continuum of probability distributions p of y,

conditional on omega and model parameters theta, and

integrate over all parameters omega with some weights p of omega.

If you further discretize this integral, you obtain a

finite mixture model; for example, a Gaussian mixture model can be obtained in this way.
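
To see the discretization step in action, here is a minimal sketch in Python. The toy model is an assumption chosen only because its exact marginal is known: omega is standard normal and y given omega is Gaussian with mean omega. The function names, the grid, and the number of components are all hypothetical choices for illustration.

```python
import numpy as np

def normal_pdf(y, mean, std):
    """Density of a univariate Gaussian N(mean, std^2) evaluated at y."""
    z = (y - mean) / std
    return np.exp(-0.5 * z**2) / (std * np.sqrt(2.0 * np.pi))

def discretized_mixture_pdf(y, n_components=41, emission_std=0.5):
    """Approximate p(y) = integral of p(omega) p(y | omega) d omega by a finite sum.

    Assumed toy model: omega ~ N(0, 1) and y | omega ~ N(omega, emission_std^2).
    """
    # Grid of component means omega_k covering almost all of the prior mass.
    omega = np.linspace(-5.0, 5.0, n_components)
    # Mixture weights proportional to the prior density, normalized to sum to 1.
    weights = normal_pdf(omega, 0.0, 1.0)
    weights /= weights.sum()
    y = np.atleast_1d(y)
    # Finite Gaussian mixture: sum over k of weight_k * N(y | omega_k, emission_std^2).
    components = normal_pdf(y[:, None], omega[None, :], emission_std)
    return components @ weights

y_grid = np.linspace(-3.0, 3.0, 7)
approx = discretized_mixture_pdf(y_grid)
# For this toy model the exact marginal is N(0, 1 + emission_std^2).
exact = normal_pdf(y_grid, 0.0, np.sqrt(1.0 + 0.5**2))
max_abs_error = np.max(np.abs(approx - exact))
```

Each grid point plays the role of one mixture component, so the continuous mixture turns into a Gaussian mixture with fixed component means and weights read off the prior.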

This means that if we are only interested in

the marginal distribution at some fixed future time T, as seen from

today, we would not need any state-space model;

a regular non-dynamic mixture model, for

example a Gaussian mixture model, would work for this task.

But if you need to model distributions for

future random variables simultaneously for a few or many different time horizons,

this is where state-space models become useful.

Okay, let's now come back to the general case when we have more than one time step.

So we are back to our original formula with a product over

all time steps for the probability of the total path of X and Y.

As we just saw it involves the history of the hidden state that we do not really have.

So what would be the steps needed to do time series forecasting in this framework?

If we look at a one-step forecast, then we need to compute

the probability of the next value of y conditional on

the previous values of y and the current value of x.

Such a probability can be expressed in terms of

some model that is parametrized by some vector of parameters theta.

So the model will have two sets of unknowns: the hidden state x and model parameters theta.
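
Under the Markov and emission assumptions stated earlier, the one-step forecast can be sketched as below. The time indexing and the explicit conditioning on theta are my notation, not necessarily the lecturer's. The second line shows why both unknowns matter: since x_t is hidden in practice, it has to be integrated out against its posterior given the data so far.

```latex
% One-step forecast given the current hidden state
p(y_{t+1} \mid y_{1:t}, x_t, \theta)
  = \int p(y_{t+1} \mid x_{t+1}, \theta) \, p(x_{t+1} \mid x_t, \theta) \, dx_{t+1}

% Since x_t is hidden, average over its posterior given the observations so far
p(y_{t+1} \mid y_{1:t}, \theta)
  = \int p(y_{t+1} \mid y_{1:t}, x_t, \theta) \, p(x_t \mid y_{1:t}, \theta) \, dx_t
```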

Therefore, the problem of forecasting for dynamic latent models, including

both state-space models and

hidden Markov models, amounts to two tasks: inference and learning.

In the task of inference, we learn about the hidden state;

the model parameters are assumed to be fixed in this task.

In the task of learning, we learn model parameters

while keeping our assumptions about the hidden state fixed.

In the specific case of the EM algorithm as a way of training a machine learning

algorithm, the inference task corresponds

to the E-step and the learning task corresponds to the M-step.

A special and highly tractable case

of state-space models is the linear-Gaussian state-space model.

In this specification, the next state x is linear in the previous state, with

Gaussian noise w. Also, the observation y is linear in the current value of x,

plus another independent Gaussian noise v. Now,

the second equation for the observable y here generalizes factor analysis.

In factor analysis, we had a random variable x that did not depend on time.

Now, we have a random process x_t that changes with time.
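
To make the linear-Gaussian specification concrete, here is a minimal simulation sketch. It assumes the common form x_t = A x_{t-1} + w_t and y_t = C x_t + v_t with w_t ~ N(0, Q) and v_t ~ N(0, R). The matrix names A, C, Q, R, the function name, and the example values are illustrative assumptions, not taken from the lecture.

```python
import numpy as np

def simulate_lgssm(A, C, Q, R, x0, n_steps, seed=0):
    """Simulate a linear-Gaussian state-space model.

    Assumed form (names are illustrative):
        x_t = A x_{t-1} + w_t,   w_t ~ N(0, Q)   (hidden state)
        y_t = C x_t     + v_t,   v_t ~ N(0, R)   (observation)
    """
    rng = np.random.default_rng(seed)
    n_x, n_y = A.shape[0], C.shape[0]
    x = np.zeros((n_steps, n_x))
    y = np.zeros((n_steps, n_y))
    x_prev = x0
    for t in range(n_steps):
        w = rng.multivariate_normal(np.zeros(n_x), Q)  # state noise
        v = rng.multivariate_normal(np.zeros(n_y), R)  # observation noise
        x[t] = A @ x_prev + w   # linear transition plus Gaussian noise
        y[t] = C @ x[t] + v     # linear emission plus independent Gaussian noise
        x_prev = x[t]
    return x, y

# Example: a 2-dimensional hidden state observed through its first component.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
C = np.array([[1.0, 0.0]])
Q = 0.05 * np.eye(2)
R = np.array([[0.5]])
states, observations = simulate_lgssm(A, C, Q, R, x0=np.zeros(2), n_steps=100)
```

If A is set to zero, the hidden state no longer carries memory from one step to the next, and the observation equation alone is the factor analysis model mentioned above.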

Inference and learning in such a model can again be done using the EM algorithm.

The E-step of the algorithm learns the posterior over

the hidden variables x given observed values y and model parameters theta.

For this specific case of

linear-Gaussian models, this procedure is called Kalman smoothing.
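
For those who want to see what this looks like in practice, below is a minimal sketch of one common formulation: a forward Kalman filter followed by a backward Rauch-Tung-Striebel pass. It assumes the model x_t = A x_{t-1} + w_t, y_t = C x_t + v_t with noise covariances Q and R and a Gaussian prior N(m0, P0) on the state before the first observation. All names and the toy data are illustrative assumptions, and parameters are treated as known, as in the E-step.

```python
import numpy as np

def kalman_filter(y, A, C, Q, R, m0, P0):
    """Forward pass: mean and covariance of x_t given y_1..y_t."""
    T, n = y.shape[0], A.shape[0]
    m_pred, P_pred = np.zeros((T, n)), np.zeros((T, n, n))
    m_filt, P_filt = np.zeros((T, n)), np.zeros((T, n, n))
    m_prev, P_prev = m0, P0
    for t in range(T):
        # Predict: push the previous posterior through the linear dynamics.
        m_pred[t] = A @ m_prev
        P_pred[t] = A @ P_prev @ A.T + Q
        # Update: correct the prediction using the new observation y_t.
        S = C @ P_pred[t] @ C.T + R              # innovation covariance
        K = P_pred[t] @ C.T @ np.linalg.inv(S)   # Kalman gain
        m_filt[t] = m_pred[t] + K @ (y[t] - C @ m_pred[t])
        P_filt[t] = P_pred[t] - K @ S @ K.T
        m_prev, P_prev = m_filt[t], P_filt[t]
    return m_pred, P_pred, m_filt, P_filt

def rts_smoother(A, m_pred, P_pred, m_filt, P_filt):
    """Backward pass: mean and covariance of x_t given all of y_1..y_T."""
    T = m_filt.shape[0]
    m_smooth, P_smooth = m_filt.copy(), P_filt.copy()
    for t in range(T - 2, -1, -1):
        G = P_filt[t] @ A.T @ np.linalg.inv(P_pred[t + 1])  # smoother gain
        m_smooth[t] = m_filt[t] + G @ (m_smooth[t + 1] - m_pred[t + 1])
        P_smooth[t] = P_filt[t] + G @ (P_smooth[t + 1] - P_pred[t + 1]) @ G.T
    return m_smooth, P_smooth

# Toy data: scalar AR(1) hidden state observed with noise (illustrative values).
rng = np.random.default_rng(0)
A, C = np.array([[0.95]]), np.array([[1.0]])
Q, R = np.array([[0.1]]), np.array([[1.0]])
T = 200
x_true, y_obs = np.zeros((T, 1)), np.zeros((T, 1))
x_prev = np.zeros(1)
for t in range(T):
    x_true[t] = A @ x_prev + rng.normal(0.0, np.sqrt(Q[0, 0]), size=1)
    y_obs[t] = C @ x_true[t] + rng.normal(0.0, np.sqrt(R[0, 0]), size=1)
    x_prev = x_true[t]

m_pred, P_pred, m_filt, P_filt = kalman_filter(y_obs, A, C, Q, R, np.zeros(1), np.eye(1))
m_smooth, P_smooth = rts_smoother(A, m_pred, P_pred, m_filt, P_filt)
```

The filter uses only past and current observations, while the smoother also uses future ones, which is why the smoothed variances are never larger than the filtered ones.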

The M-step then estimates all parameters given the fixed distribution of hidden variables.

It turns out that due to the particularly simple structure of

linear-Gaussian state-space models, the M-step in these models can be done analytically.

We will not go into the details of that,

but you can always do it on your own once you know the name of the method

and understand what it does at a high level.

Instead I would like to mention here a couple of examples

of using state-space methods in finance.

One of them deals with the analysis of firms' leverage,

which is the ratio of debt to the total firm value given by the sum of debt and equity.

A popular assumption regarding the dynamics of leverage is that

firms try to adjust their leverage ratios to some optimal value,

sometimes referred to as the target leverage.

As such an optimal target leverage is not directly observable and can also change with time,

it can be modeled using a dynamic hidden variable.

The paper by Roberts constructed such a latent variable model by

modeling factors describing the true unobserved values of the marginal tax rate,

probability of bankruptcy, firm size,

investment opportunities, and average industry leverage.

Now, observed values corresponding to these factors, as well

as the leverage itself, are obtained as hidden values plus noise.

This was formulated and estimated as a

state-space model in this paper by Roberts in 2002.

Other research, for example

by Loeffler and Maurer in 2009, has found that

dynamic models of firms' leverage help

improve default prediction models for counterparty defaults.

And finally, I want to mention that

various stochastic volatility models

popular in equities, FX, commodities, and so on

can also be formulated and estimated as state-space models,

where volatility plays the role of a hidden variable.
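
As one illustration, here is a common textbook form of a discrete-time stochastic volatility model written as a state-space model. This is a sketch and not necessarily the specification the lecturer has in mind; the symbols h_t for log-variance, r_t for the return, and the parameters mu, phi, and sigma_eta are assumed notation.

```latex
% Hidden state: log-variance h_t follows a Gaussian AR(1) process
h_t = \mu + \phi \, (h_{t-1} - \mu) + \sigma_\eta \, \eta_t,
  \qquad \eta_t \sim \mathcal{N}(0, 1)

% Observation: the return r_t is scaled by the hidden volatility
r_t = \exp(h_t / 2) \, \varepsilon_t,
  \qquad \varepsilon_t \sim \mathcal{N}(0, 1)

% Taking logs of squared returns makes the observation equation linear in h_t,
% although the noise term \log \varepsilon_t^2 is no longer Gaussian
\log r_t^2 = h_t + \log \varepsilon_t^2
```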

Okay. So let's take a break at this point and

continue with hidden Markov models in our next video.