Hi, welcome to week three.

This time we will see an algorithm called variational inference.

This is an algorithm for computing the posterior probability approximately.

But first of all,

let's see why we even care about computing an approximate posterior.

So here we see Bayes' formula, which helps us compute the posterior

on the latent variables given the data.

We will denote the posterior probability as p*(z).
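
Written out, the formula referred to here presumably has the following form. The symbol x for the observed data is an assumption, since the transcript only says "the data"; z stands for the latent variables.

```latex
p^*(z) = p(z \mid x) = \frac{p(x \mid z)\, p(z)}{p(x)},
\qquad
p(x) = \int p(x \mid z)\, p(z)\, dz
```

The integral in the denominator is typically the part that makes the exact posterior hard to compute.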

So when the prior is conjugate to the likelihood,

it is really easy to compute the posterior.
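
As a small illustration of the conjugate case, here is a minimal sketch using a Beta prior with a Bernoulli likelihood. This particular pair is an assumption chosen for the example, not something named in the lecture. The posterior is again a Beta distribution, so computing it only requires counting.

```python
def beta_bernoulli_posterior(alpha, beta, observations):
    """Return the Beta posterior parameters after observing 0/1 outcomes."""
    successes = sum(observations)
    failures = len(observations) - successes
    # Conjugacy: the posterior is Beta(alpha + successes, beta + failures).
    return alpha + successes, beta + failures


# Hypothetical data: three successes and one failure, with a uniform Beta(1, 1) prior.
post_alpha, post_beta = beta_bernoulli_posterior(1.0, 1.0, [1, 0, 1, 1])
posterior_mean = post_alpha / (post_alpha + post_beta)
print(post_alpha, post_beta, posterior_mean)
```

No integral has to be evaluated here, which is what makes the conjugate case easy.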

However, in most other cases, it is really hard.

One important case is the variational autoencoder, and

we will see it in week five.

In variational autoencoders, we model the likelihood with neural networks.

So it would be a normal distribution over the data, where

the mean is some neural network, mu of z, and

the variance is some other neural network, sigma squared of z.
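
A rough sketch of such a likelihood, for scalar data and a scalar latent variable, might look like this. The network shape, the fixed weights, and the choice to exponentiate the second network's output to get a positive variance are all assumptions made for illustration; a real model would learn the weights.

```python
import math


def tiny_network(z, hidden_layer, output_layer):
    """One hidden tanh layer followed by a linear output; z is a scalar."""
    hidden = [math.tanh(w * z + b) for w, b in hidden_layer]
    weights, bias = output_layer
    return sum(w * h for w, h in zip(weights, hidden)) + bias


# Hypothetical fixed weights standing in for learned parameters.
MU_HIDDEN = [(1.0, 0.0), (-0.5, 0.3)]
MU_OUT = ([0.8, -1.2], 0.1)
LOGVAR_HIDDEN = [(0.7, -0.2), (0.4, 0.5)]
LOGVAR_OUT = ([0.5, 0.5], -1.0)


def mu(z):
    """Mean of the likelihood, computed by the first network."""
    return tiny_network(z, MU_HIDDEN, MU_OUT)


def sigma_squared(z):
    """Variance of the likelihood; exp keeps the second network's output positive."""
    return math.exp(tiny_network(z, LOGVAR_HIDDEN, LOGVAR_OUT))


def log_likelihood(x, z):
    """log N(x | mu(z), sigma_squared(z)) for scalar x and z."""
    var = sigma_squared(z)
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu(z)) ** 2 / var)


print(log_likelihood(x=0.5, z=1.0))
```

Because mu and sigma_squared depend on z in a nonlinear way, the likelihood is not conjugate to a simple prior on z.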

And in this case, there's no conjugacy and

we can't compute the posterior using Bayes' formula.

But do we actually need the exact posterior?

For example, here is some distribution and

it doesn't seem to belong to some known family of distributions.

However, we could approximate it with a Gaussian, and for

most practical purposes, it would be a really good approximation.

For example, it would match the mean, the variance and [INAUDIBLE] the shape.
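
To make the idea of a Gaussian matching the mean and variance concrete, here is a minimal sketch. The target distribution, a two-component mixture, is a hypothetical stand-in for the one shown on the slide. Note that this only illustrates moment matching and is not the variational method introduced in this lecture.

```python
import math


def normal_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def target_pdf(x):
    # Hypothetical non-Gaussian target: a mixture of two Gaussians.
    return 0.6 * normal_pdf(x, 0.0, 1.0) + 0.4 * normal_pdf(x, 2.0, 0.5)


# Riemann-sum integration on a uniform grid covering [-10, 12].
STEP = 0.01
GRID = [-10.0 + STEP * i for i in range(2201)]

mean = sum(x * target_pdf(x) for x in GRID) * STEP
variance = sum((x - mean) ** 2 * target_pdf(x) for x in GRID) * STEP


def gaussian_approximation(x):
    """Gaussian with the same mean and variance as the target."""
    return normal_pdf(x, mean, math.sqrt(variance))


print(mean, variance)
```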

And so throughout this week, we'll see a method that will help us

find the best approximation of the full posterior.

It works as follows. First of all, we select some family of distributions, Q.

We'll call this a variational family.

For example, this could be the family of normal distributions with an

arbitrary mean and a diagonal covariance matrix.

What we do next is we try to approximate the full posterior,

p* of z, with some variational distribution, q of z,

and we find the best matching distribution using the KL divergence.

So we try to minimize the KL divergence between q and

p* over the family of distributions, Q.
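
The procedure just described can be sketched in one dimension as a brute-force search. Several things here are assumptions: the target is a known two-component mixture standing in for p*, the divergence is taken in the direction KL(q || p*), which is the usual choice in variational inference, and the family Q is a coarse grid of Gaussians. In a real problem p* cannot be evaluated like this, so this is only an illustration of the objective, not a practical algorithm.

```python
import math


def log_normal_pdf(x, mean, std):
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))


def log_target(x):
    # Hypothetical stand-in for p*(z): a two-component Gaussian mixture.
    return math.log(0.6 * math.exp(log_normal_pdf(x, 0.0, 1.0))
                    + 0.4 * math.exp(log_normal_pdf(x, 2.0, 0.5)))


# Uniform integration grid covering [-8, 10].
STEP = 0.05
GRID = [-8.0 + STEP * i for i in range(361)]


def kl_divergence(q_mean, q_std, log_p):
    """Riemann-sum estimate of KL(q || p) for a Gaussian q and a log-density log_p."""
    total = 0.0
    for x in GRID:
        log_q = log_normal_pdf(x, q_mean, q_std)
        total += math.exp(log_q) * (log_q - log_p(x))
    return total * STEP


# Variational family Q: Gaussians with mean in [-1, 3] and std in [0.3, 2.0].
candidates = [(-1.0 + 0.1 * i, 0.3 + 0.1 * j) for i in range(41) for j in range(18)]
best_mean, best_std = min(
    candidates, key=lambda c: kl_divergence(c[0], c[1], log_target)
)
best_kl = kl_divergence(best_mean, best_std, log_target)
print(best_mean, best_std, best_kl)
```

The grid search is used only to keep the sketch short; any optimizer over the parameters of q would play the same role.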