0:00

This video will be really short, even though I will talk in it

about such interesting topics that

each one of them would probably deserve a full lecture,

if not a course on its own.

But we have to draw a line somewhere in this course,

so I thought that if we can't spend more time on these interesting topics

then at least we can indicate them for your future reference.

So, as we said a few times in this specialization,

the general class of problems that can be addressed

using methods of reinforcement learning is extremely wide.

In particular, many tasks of

quantitative trading can be reduced to reinforcement learning.

In the course on reinforcement learning,

we talked about using RL for option pricing and stock portfolio optimization.

In this week, we talked more about

limit order book and the dealer problem of optimal execution.

In addition to the market investor and dealer problems,

there is also a very interesting class of problems related to modeling market makers.

Their optimization problem is different from those of the investors and

the dealers, but they are also amenable to methods of reinforcement learning.

Those of you who are interested in this topic can

start with a paper by M. Dixon on using

reinforcement learning for this task, and you can

also play with his code, which relies on trading-gym,

which in turn is built on top of the OpenAI Gym library for reinforcement learning.
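
To make the Gym-style setup concrete, here is a minimal sketch of what a market-making environment could look like. This is an illustrative toy, not Dixon's actual code: the state, action space, and fill model are all simplified assumptions, but the reset/step interface follows the classic Gym convention that trading-gym builds on.

```python
import random

class MarketMakerEnv:
    """Toy market-making environment with a Gym-style reset()/step() API.

    Illustrative sketch only: the state (mid price, inventory), the action
    (half-spread in ticks), and the fill model are simplified assumptions.
    """

    def __init__(self, horizon=100, tick=0.01, seed=0):
        self.horizon = horizon
        self.tick = tick
        self.rng = random.Random(seed)

    def reset(self):
        self.t = 0
        self.mid = 100.0       # mid price
        self.inventory = 0     # signed inventory of the market maker
        self.cash = 0.0
        return (self.mid, self.inventory)

    def step(self, action):
        # action = half-spread, in ticks, at which we quote around the mid
        half_spread = max(1, int(action)) * self.tick
        bid, ask = self.mid - half_spread, self.mid + half_spread

        # Simplistic fill model: tighter quotes get filled more often
        fill_prob = max(0.0, 0.5 - 100.0 * half_spread / self.mid)
        if self.rng.random() < fill_prob:   # our bid is hit -> we buy
            self.inventory += 1
            self.cash -= bid
        if self.rng.random() < fill_prob:   # our ask is lifted -> we sell
            self.inventory -= 1
            self.cash += ask

        # Mid price follows a simple random walk
        self.mid += self.rng.choice([-1, 1]) * self.tick
        self.t += 1

        # Reward: mark-to-market P&L, penalized for holding inventory risk
        pnl = self.cash + self.inventory * self.mid
        reward = pnl - 0.01 * self.inventory ** 2
        done = self.t >= self.horizon
        return (self.mid, self.inventory), reward, done, {}
```

Any RL agent written against the Gym API (observe, act, receive reward) can then be trained on such an environment; the market maker's inventory penalty is what distinguishes its objective from the investor's and the dealer's.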

Another interesting potential application of RL is

credit management of loan portfolios in peer-to-peer, or P2P, lending.

Platforms that match online lenders and borrowers such as Lending Club,

Prosper, or OnDeck have become very popular in recent years.

2:34

It's no longer a mom-and-pop business, as these activities are now

funded by very serious players such as hedge funds and banks.

Questions of optimal portfolio management of

loan portfolios are therefore quite important for these players.

The main topic is always more or less the same,

which is optimization of risk-return profile

in the context of sequential investment decision-making.

Reinforcement learning can be very useful for these tasks as

well, as we discussed at length in our courses.

For P2P portfolios, portfolio optimization can be formulated as

convex optimization with constraints, in a similar way to

how we did it with stocks, but there are also some differences.

For example, you can only go long in these markets, and returns are computed

differently, but the mathematical and modeling approach

is still largely the same as for stocks.
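
As a small sketch of that "same approach with extra constraints" idea, here is a long-only mean-variance optimization solved by projected gradient ascent. The expected returns and covariance for three hypothetical loan grades are made-up numbers for illustration; in practice they would come from credit models.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1},
    which encodes the long-only, fully-invested constraints."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    idx = np.arange(1, len(v) + 1)
    rho = idx[u + (1.0 - css) / idx > 0][-1]
    theta = (css[rho - 1] - 1.0) / rho
    return np.maximum(v - theta, 0.0)

def long_only_mean_variance(mu, sigma, risk_aversion=1.0, lr=0.01, n_iter=2000):
    """Maximize mu'w - lambda * w' Sigma w over the long-only simplex
    by projected gradient ascent. Illustrative sketch, not production code."""
    w = np.ones(len(mu)) / len(mu)
    for _ in range(n_iter):
        grad = mu - 2.0 * risk_aversion * sigma @ w
        w = project_to_simplex(w + lr * grad)
    return w

# Hypothetical expected returns and covariance for three loan grades
mu = np.array([0.05, 0.07, 0.10])
sigma = np.array([[0.02, 0.00, 0.00],
                  [0.00, 0.05, 0.01],
                  [0.00, 0.01, 0.10]])
w = long_only_mean_variance(mu, sigma, risk_aversion=2.0)
```

The only change versus the unconstrained stock case is the projection step, which replaces weights that could go negative (short positions) with the nearest feasible long-only allocation.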

Applications of RL for P2P lending are currently an area of active research.

Another direction I wanted to mention here would be

potential applications to cryptocurrency trading.

I believe that if you ended up taking this course, chances are that

you have already heard of or are familiar with cryptocurrencies.

Bitcoin and Ethereum are the most famous of them, but there are many others:

more than 100 competing cryptocurrencies

are available for both investment and analysis.

In 2017, the total volume of cryptocurrency markets in the US was about $120 billion,

out of which about $40 billion was in Bitcoin.

So, cryptocurrency markets are similar in many ways to conventional

financial markets, but there are also differences, mostly due to very high volatility,

absence of regulation, and vulnerability of investments.

As one example of using RL for optimal management of cryptocurrencies portfolios,

you can take a look at the paper referenced on this slide.

This paper combines reinforcement learning with recurrent neural networks and LSTMs

for learning a state representation and reducing the dimensionality of the state space.
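
To illustrate the idea of an LSTM-based state representation, here is a minimal single LSTM cell written in NumPy. This is not the referenced paper's architecture: it is a bare-bones sketch showing how a long window of multi-asset returns can be compressed into a small hidden vector that an RL agent could use as its state. The weights here are random; in practice they would be trained jointly with the RL objective.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    """Minimal single-layer LSTM cell (forward pass only).

    Illustrative sketch: compresses a sequence of market observations
    into a low-dimensional hidden state h_t, reducing the dimensionality
    of the RL state space. Weights are random, not trained.
    """

    def __init__(self, n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        # One stacked weight matrix for the input, forget, cell, output gates
        self.W = rng.normal(0.0, 0.1, (4 * n_hidden, n_in + n_hidden))
        self.b = np.zeros(4 * n_hidden)
        self.n_hidden = n_hidden

    def forward(self, sequence):
        h = np.zeros(self.n_hidden)  # hidden state
        c = np.zeros(self.n_hidden)  # cell state
        for x in sequence:
            z = self.W @ np.concatenate([x, h]) + self.b
            i, f, g, o = np.split(z, 4)
            i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
            c = f * c + i * np.tanh(g)   # gated cell update
            h = o * np.tanh(c)           # gated output
        return h  # compressed state representation for the RL agent

# Encode a 50-day window of returns on 10 coins into a 4-dim state
returns = np.random.default_rng(1).normal(0.0, 0.02, (50, 10))
cell = LSTMCell(n_in=10, n_hidden=4)
state = cell.forward(returns)
```

The point is the shape change: a 50-by-10 observation window becomes a 4-dimensional state vector, which is far easier for an RL portfolio policy to work with.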

There are probably other papers out there that do similar research.

Finally, more on the theoretical side,

I would like to mention a few interesting topics of perception-action cycles.

One of the very first diagrams in this course was this one.

We talked about differences between perception tasks and action tasks and how

supervised and unsupervised learning solve

perception tasks while reinforcement learning solves action tasks.

There exists a very interesting body of work on

information-theory-based approaches, where perception and actions

are integrated together into what are called perception-action cycles.

There are some references for you there if you want to follow up

and explore this interesting research, and I believe that

it has lots of potential, as it allows us to bring

the feature selection problem directly into an action optimization task.

For example, in our toy model presented in the third week of this course,

signals C_t are supposed to be found from an independent alpha research analysis.