Okay. Now, let's talk about the most interesting application of the framework

that we considered in the last week of our course on Reinforcement Learning.

We mentioned back then that the same framework that was

developed there can be used not only to model the policy of a single trader,

but also to learn market dynamics.

And in your course project for that course,

I invited you to try to calibrate a simple model of

market dynamics that is obtained from this analysis.

And now, I would like to discuss this approach in more detail.

So let me start with the notion of an agent in the market.

When people discuss market dynamics in terms of agent-based models,

they typically mean one of two possible settings.

The first setting views an agent as a so-called representative investor

whose objective is to optimize

a given investment portfolio given some objective function.

A representative investor is some kind of an average investor that has

a certain utility function that is optimized in the process of trading by the agent.

And in this view of the world,

the market is clearly external to the agent.

This approach is quite popular in the financial literature.

Another approach to agent-based models considers many interacting agents.

Such models were developed in the physics and computer science communities.

These models seek to explain some general stylized facts about markets.

But what if we go back to

the portfolio optimization problem in a single-agent formulation and take

an inverse optimization view of this problem, as in the Black-Litterman model?

In this approach, the optimal portfolio is already known.

It is the market portfolio itself.
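
To make the inverse optimization view a bit more concrete, here is a minimal sketch in Python. It assumes the standard unconstrained mean-variance setting used in the reverse-optimization step of the Black-Litterman model, where the implied equilibrium returns are the risk-aversion coefficient times the covariance matrix times the market weights. The function names, the three-asset covariance matrix, the market weights, and the risk-aversion value are all hypothetical and chosen only for illustration; they are not taken from the course.

```python
import numpy as np


def implied_equilibrium_returns(market_weights, cov, risk_aversion):
    """Reverse optimization: the expected returns under which the given
    market weights are the unconstrained mean-variance optimum."""
    w = np.asarray(market_weights, dtype=float)
    sigma = np.asarray(cov, dtype=float)
    return risk_aversion * sigma @ w


def mean_variance_weights(expected_returns, cov, risk_aversion):
    """Forward optimization: maximize w'mu - (risk_aversion / 2) w'Sigma w
    without constraints, which gives w = (risk_aversion * Sigma)^{-1} mu."""
    sigma = np.asarray(cov, dtype=float)
    mu = np.asarray(expected_returns, dtype=float)
    return np.linalg.solve(risk_aversion * sigma, mu)


# Hypothetical three-asset market (illustrative numbers only).
w_mkt = np.array([0.5, 0.3, 0.2])
cov = np.array([
    [0.04, 0.01, 0.00],
    [0.01, 0.09, 0.02],
    [0.00, 0.02, 0.16],
])
delta = 2.5  # assumed risk-aversion coefficient

# Inverse step: which returns make the market portfolio optimal?
pi = implied_equilibrium_returns(w_mkt, cov, delta)

# Forward step: optimizing against those returns recovers the market weights.
w_back = mean_variance_weights(pi, cov, delta)
```

In this sketch the optimal portfolio is the input and the expected returns are the output, which is the sense in which the optimal portfolio is "already known" in the inverse formulation.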

But then a good question would be: who is the agent that

dynamically maintains or rebalances such a market-optimal portfolio?

And we could probably identify such an agent with some sort of collective mode of

all individual traders that work in the market and that are guided

in their decisions by a commonly agreed set of predictors that we called Zt

in our previous course,

and that may include news,

other market indicators or market indices,

variables describing the current state of the limit order book, etc.

And therefore, the first difference of our framework from

a conventional utility-based model would be that

our agent is the sum of all investors rather than their average,

that is, a representative investor.

And because such an agent would aggregate the actions of

a partly homogeneous and partly heterogeneous crowd of individual investors,

it cannot be a fully rational agent but rather

should be represented as an agent with bounded rationality.