In this module, we're going to walk through and identify some implementation difficulties with mean-variance portfolio selection. We're going to walk through three main ideas: one, what happens when there are parameter errors; two, what happens when the optimal portfolio calls for negative positions and you want to avoid short positions; and three, what happens when variance is not really the best measure of risk. There are many aspects of the implementation details of mean-variance that one could focus on. We chose to focus on the three most important ones.

The first has to do with parameter estimation. The parameters that go into a mean-variance portfolio selection problem are never known in practical situations. The true mean vector and the true covariance matrix of the assets are unknown. All we have is historical data, and we have to estimate these parameters from historical returns. As a consequence, we end up making statistical errors. For the mean vector, the data is often sufficient, but when you start estimating the covariance matrix, the data is never sufficient. The reason is that the covariance matrix has on the order of d squared independent parameters, where d is the number of assets. In order to have sufficient data to estimate these d squared parameters, you have to collect returns over a very long period, and over this long period the market parameters shift. So you are playing a game where you are never able to get enough data to estimate these parameters sufficiently well. Moreover, the portfolio that you compute turns out to be very sensitive to estimation errors, and we'll focus on this in one of the modules. We're going to show you why this happens, how you could correct for it, and what the current state of the art is for taking estimates and constructing portfolios from them.
We're also going to focus on how one gets negative exposures. The Excel module that goes with the mean-variance theoretical module showed you that, very often, the optimal portfolio has short positions. Taking on short positions is very dangerous, particularly because a short position has an unlimited downside: the price could suddenly jump very high, and you end up losing a lot of money on the short position. It's for this reason that short positions are often not allowed for wealth managers. One way to get negative exposure is to use a leveraged exchange-traded fund, or leveraged ETF. But if you use leveraged ETFs, you have to be very careful. In one of the modules, we're going to focus on how ETFs work, what difficulties are associated with ETFs, and how you should interpret the returns of ETFs.

Finally, we're going to talk about whether variance itself is a good measure of risk. Mean-variance portfolio selection focuses on variance, or equivalently volatility, as the risk measure. Does it make sense to use this risk measure? What are the limitations of variance? What can you do to mitigate some of these limitations? That is going to be the focus of another module.

In this module, we will mainly focus on the issues associated with parameter estimation. The starting point of this module is that the true parameters that we are after, the mean vector and the covariance matrix of the assets, are never known. We are going to use historical returns to compute estimates for the mean return and the covariance matrix. The easiest way to do that is to estimate the mean return by the sample average of the returns over some period of N observations. Once you have the sample average for the mean, you can compute the covariance matrix by substituting the estimated mean for the true mean in the definition of the covariance, to get an estimate of what the covariance is.
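The two estimates just described can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the spreadsheet that goes with these modules: the function name, the made-up "true" parameters, and the choice to divide by N minus 1 rather than N are all assumptions.

```python
import numpy as np

def estimate_parameters(returns):
    """Sample mean vector and sample covariance matrix.

    returns: array of shape (N, d), one row per period, one column per asset.
    """
    returns = np.asarray(returns, dtype=float)
    n_periods = returns.shape[0]
    # Estimated mean: sample average of the returns over the N periods.
    mu_hat = returns.mean(axis=0)
    # Estimated covariance: plug the estimated mean in place of the true mean.
    centered = returns - mu_hat
    # Assumption: divide by N - 1 (the usual unbiased convention).
    sigma_hat = centered.T @ centered / (n_periods - 1)
    return mu_hat, sigma_hat

# Hypothetical monthly parameters for two assets, used only to simulate data.
rng = np.random.default_rng(0)
true_mu = np.array([0.005, 0.008])
true_sigma = np.array([[0.0016, 0.0004],
                       [0.0004, 0.0025]])
sample = rng.multivariate_normal(true_mu, true_sigma, size=60)  # 60 months

mu_hat, sigma_hat = estimate_parameters(sample)
print("estimated mean:", mu_hat)
print("estimated covariance:\n", sigma_hat)
```

Rerunning this with different seeds gives a different estimated mean each time, which is the scatter of estimates that the simulation on the slide illustrates.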
What I've done on the plot that goes on this slide is simulate the returns using the mean vector and the covariance matrix given in the spreadsheet that goes with these modules. I simulated 60 months of data, and using those 60 months of data, I estimated the mean. Each of the green dots on this plot is an estimated value of the mean using one particular simulation of 60 months' worth of data. I'm only plotting the estimated mean for asset 1 and asset 2. The point that I want you to focus on is that the estimated mean can often be very far away from the true mean. The true mean has been plotted with a red square. Here's where the true mean is. This is a valid estimated mean generated from 60 months of data. And as you can notice, it's very, very far away from what the true mean is. What we do know is that I can estimate the mean and construct the 95% confidence interval around it. So here is one particular value of the estimated mean, and here is the 95% confidence interval around it. Because we are talking about two assets, this interval becomes an ellipse. It's a 95% confidence ellipse. Then, with probability 0.95, the true mean lies in the ellipse. In this particular case, the true mean barely makes it into the 95% ellipse.

So, the question you should ask yourself is, does parameter error matter? In this slide, I want to tell you that parameter error is often very serious for mean-variance portfolio selection. What I'm describing here is the same experiment that I described in the last slide, taken one step further. I estimated the mean and the covariance matrix using 60 months of data. So, I take one sample from all those green dots that I showed you in that slide. I have a mean vector. I have a covariance matrix. So, I can construct an efficient frontier using that data. I'm going to call that the estimated frontier. So, the green line here on this slide, this one, is the estimated frontier.
It's the frontier that has been computed using an estimate for the mean and an estimate for the covariance matrix. The blue line is the true frontier. This is the frontier corresponding to the unknown true mean and the unknown true covariance matrix. The red line is labelled the realized frontier. What that means is that I take a frontier portfolio on the green estimated frontier, compute the true mean return and the true volatility of that portfolio, and plot it. The line that I get from doing that is the red line. So, this diamond here actually gets moved to this diamond when you replace the estimated mean with the true mean and the estimated covariance matrix with the true covariance matrix. And as you can notice, there is a big gap between what the estimated return on that portfolio is and what the true return on that portfolio is. The estimated return is around 6.4%, and the true return, or the realized return if you were to use that portfolio in the market, would be close to 4.4%, a good 2 percentage point drop.

Why does this happen? Is this generic, or did it happen just for one of the samples? In this slide, I'm plotting the estimated frontiers corresponding to five different simulation runs. I simulated 60 months of data five different times, computed the estimated mean and the estimated covariance matrix, and plotted the corresponding estimated frontier. The green lines on this plot are five different estimated frontiers, and as you can see, these frontiers are extremely unstable. Not only are the frontiers unstable, the difference between the estimated frontiers and the realized frontiers can also be very large. So, we want to understand why this happens. Why is there such a big gap between what happens on the estimated frontier and what is actually realized? Why is the estimated frontier so unstable, and is there anything that we can do to remove this gap and remove this instability? Why is parameter error so serious?
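The gap between estimated and realized performance can be reproduced in a small simulation. The sketch below is not the spreadsheet experiment from the slides: the true parameters are made up, and instead of tracing a whole frontier it picks a single portfolio by maximizing estimated return minus a risk penalty, with a hypothetical risk-aversion parameter, weights that sum to one, and shorting allowed.

```python
import numpy as np

def mean_variance_weights(mu, sigma, risk_aversion):
    """Maximize mu'x - (risk_aversion / 2) x' sigma x subject to sum(x) = 1."""
    ones = np.ones(len(mu))
    inv_mu = np.linalg.solve(sigma, mu)      # sigma^{-1} mu
    inv_ones = np.linalg.solve(sigma, ones)  # sigma^{-1} 1
    # Lagrange multiplier chosen so that the weights sum to one.
    lam = (ones @ inv_mu - risk_aversion) / (ones @ inv_ones)
    return (inv_mu - lam * inv_ones) / risk_aversion

# Hypothetical true monthly parameters for three assets.
true_mu = np.array([0.004, 0.006, 0.008])
true_sigma = np.array([[0.0025, 0.0010, 0.0008],
                       [0.0010, 0.0036, 0.0012],
                       [0.0008, 0.0012, 0.0049]])
n_months, n_runs, risk_aversion = 60, 500, 5.0

rng = np.random.default_rng(1)
gaps = []
for _ in range(n_runs):
    sample = rng.multivariate_normal(true_mu, true_sigma, size=n_months)
    mu_hat = sample.mean(axis=0)
    sigma_hat = np.cov(sample, rowvar=False)
    x_hat = mean_variance_weights(mu_hat, sigma_hat, risk_aversion)
    estimated_return = mu_hat @ x_hat  # what the estimates promise
    realized_return = true_mu @ x_hat  # what the true means deliver
    gaps.append(estimated_return - realized_return)

average_gap = float(np.mean(gaps))
print("average estimated-minus-realized monthly return:", average_gap)
```

Averaged over the runs, the estimated return should come out above the realized return, because the optimizer leans toward whichever assets happen to have overestimated means in a given sample.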
In order to understand this, let's walk through a very simple example. Suppose I have two identical assets with mean mu, variance sigma squared, and correlation equal to 0. Then the optimal investment in these two assets would be to take half a position in asset 1 and half a position in asset 2. That's what would give you the least volatility. Suppose now that the estimates for these returns are slightly off their true values. I estimate the return on asset 1 to be slightly larger than the true value, so it's mu plus epsilon. I estimate the mean return on asset 2 to be slightly smaller than the true value, mu minus epsilon. So, on average, I'm making zero error. On average, the estimator is very good. If you were thinking about the properties of a statistical estimator, you would say that whatever estimator is being used here is pretty good. Across the assets, you're not making a lot of error.

But the problem with mean-variance portfolio selection is that after I estimate these parameters, I'm going to optimize my portfolio using these parameters. So, what happens? I've estimated that the return on asset 1 is slightly larger than the return on asset 2, and therefore I will overweight asset 1 as compared to asset 2. If I'm allowed short positions, then I'm going to short asset 2 and invest more, take more leverage, on asset 1. But this is precisely the wrong thing to do. If I take the portfolio that I compute, which overweights asset 1 and underweights asset 2, and put it into the market, the overweighted asset is going to perform worse than expected: its realized return is going to be mu, below the estimated mu plus epsilon. And the underweighted asset, which is asset 2, will perform better than expected. Instead of having a return of mu minus epsilon, asset 2 is going to have a return of mu, which is epsilon larger than the estimated return of mu minus epsilon.
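To see the size of this effect, here is a worked version of the two-asset example. It is a sketch under an assumption that is not in the lecture: the investor maximizes estimated return minus a risk penalty with a risk-aversion parameter gamma (a hypothetical symbol introduced here), with the weights summing to one and shorting allowed.

```latex
\begin{aligned}
&\max_{x_1 + x_2 = 1}\; (\mu+\varepsilon)\,x_1 + (\mu-\varepsilon)\,x_2
  - \frac{\gamma}{2}\,\sigma^2\,(x_1^2 + x_2^2) \\
&\Longrightarrow\quad
  x_1 = \frac{1}{2} + \frac{\varepsilon}{\gamma\sigma^2}, \qquad
  x_2 = \frac{1}{2} - \frac{\varepsilon}{\gamma\sigma^2} \\
&\text{estimated return: }\; (\mu+\varepsilon)\,x_1 + (\mu-\varepsilon)\,x_2
  = \mu + \frac{2\varepsilon^2}{\gamma\sigma^2} \\
&\text{realized return: }\; \mu\,x_1 + \mu\,x_2 = \mu
\end{aligned}
```

With epsilon equal to 0 this gives the half-and-half portfolio. Otherwise the estimated return exceeds the realized return by 2 epsilon squared over gamma sigma squared, asset 2 is shorted once epsilon exceeds gamma sigma squared over 2, and the gap grows as gamma gets smaller, that is, as the portfolio takes on more leverage.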
And this gap between the estimated performance and the realized performance will become worse as more and more shorting is allowed. This is what accounts for the big difference between the estimated performance and the realized performance. The main difficulty is that we take the estimated parameters and then optimize, and this optimization procedure inflates, or maximizes, the statistical errors in the parameters. There is a quote which sort of sums up the situation: "Mean-variance results in error-maximizing, investment-irrelevant portfolios." So, we have to do something in order to make mean-variance portfolio selection practical.

One idea that might come out of looking at this slide is that the performance becomes worse as we allow more leverage. So perhaps the idea would be to limit short positions, or not allow short positions at all, and then see what happens to the performance. In this slide, I'm plotting what happens to the estimated frontier, which is the green line, and the realized frontier, which is the red line, when you have a no-short-sales constraint. As you can see, the realized frontier becomes very unstable. It has a large part of the curve down here which is actually inefficient. The reason behind this is that the feasible region for the portfolios now has a corner. If this is x1 and that is x2, you want x1 and x2 to be greater than or equal to 0, so you end up getting a corner in the feasible region, and this corner causes instabilities in portfolio selection. As you add more constraints, maybe you have some asset or sector constraints, maybe you have some constraints on how much money a particular sector can have, and so on. All of these become linear constraints. All of these induce more corners and more instabilities.
If you want to get at what the no-short-sales constraint was doing, which is to limit leverage, the better thing to do is to put a constraint directly on leverage. If you put a constraint on leverage, you end up getting the performance shown in this curve. Now, the realized performance of the portfolio is pretty close to the true performance. The gap between these two is small. But the gap between what is expected and what is realized is still very large. I expect to perform on the green line based on the data, but my realized performance is going to be the red line. Remember, this blue line is actually not known in practice, so even though the true performance and the realized performance are very close, I have no way of knowing how well I'm performing. So, leverage constraints do work well in practice, but the estimated frontier is still very bad, and so there needs to be some work in trying to bring that down.

The state of the art right now is something called robust portfolio selection. In robust portfolio selection, what one does is remove the target return constraint, which is imposed with respect to the estimated value of the mean, and replace it by a target return constraint which is with respect to the worst possible mean in the confidence region. Let S_m denote the confidence region for the mean. A few slides back, I showed you that the confidence region is an ellipse. So, instead of using a target return constraint which takes the estimated mean, forms mu transpose x, and insists that it should be greater than or equal to r, we're going to replace it by these constraints. And what do these constraints say? They say that when you choose your portfolio x, the return that you count on is the worst possible return over the confidence region. Any point in the confidence region is possible, and therefore this worst return is something that you could possibly see in the market.
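Written out, the worst-case return looks as follows. This is a sketch under assumptions: the lecture only says that S_m is an ellipse, so the specific form below, with its center at the estimated mean, a shape matrix Sigma_m, and a radius kappa, is hypothetical notation introduced here.

```latex
\begin{aligned}
&S_m = \{\, \mu : (\mu - \hat{\mu})^\top \Sigma_m^{-1} (\mu - \hat{\mu}) \le \kappa^2 \,\} \\
&\min_{\mu \in S_m} \mu^\top x \;=\; \hat{\mu}^\top x - \kappa \sqrt{x^\top \Sigma_m x} \\
&\text{nominal constraint: }\; \hat{\mu}^\top x \ge r
\qquad\longrightarrow\qquad
\text{robust constraint: }\; \hat{\mu}^\top x - \kappa \sqrt{x^\top \Sigma_m x} \ge r
\end{aligned}
```

For any kappa greater than 0 the robust constraint is stricter than the nominal one, and setting kappa to 0 recovers the nominal constraint.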
And now, instead of that constraint on the target return, I'm going to put a constraint that this minimum value must be greater than or equal to r. I can do portfolio selection with this constraint. It's a little bit harder, but not much harder. And now, the picture I end up getting looks like the plot here. The estimated frontier starts coming down. Why does this happen? This happens because now I'm planning for the worst case. I have an estimated value of mu, and this could be the estimated portfolio performance. But because I have now put in the worst-case constraint, this gets dragged down. The realized performance also becomes better, so that starts getting pulled up, and therefore the gap between these two starts to become very small. There are issues with this technology. You can sometimes get portfolios which are not very interpretable, and therefore it's having a little difficulty gaining traction. But over time, this technology, either directly or in some version, is likely to become very practical.

All of these methods were focused on trying to improve the optimization strategy. There is a flip side to this methodology, where one tries to improve the estimation strategy. So, here are some methods that people have used to improve parameter estimates. Among the most popular are the so-called shrinkage methods. What one does in these shrinkage methods is shrink the estimate towards some global quantity. These were introduced by Charles Stein; there's a 1961 paper by James and Stein, and more recently, Ledoit and Wolf have extended this to the case of covariance matrices and other circumstances. So, let's take the case of the mean. Earlier, I would have estimated each of the asset means separately. So, mu est i stands for the estimated mean for asset i. Now, in the shrinkage technology, instead of just estimating this asset mean, I'm also going to estimate a global mean, a global average mean over the assets.
And there is a reason why I put this estimation outside this bracket. The reason is that when I estimate this quantity, I don't simply take the estimates for all of the d assets and add them up. I assume that all the assets have the same expected mean and use the data of all the assets to estimate that mean. As a result, I have more data when I'm estimating the global mean than when I'm estimating a given asset's mean, and so I expect the error in this global mean to be smaller. The error is smaller here, and the error is larger in the individual means. Now, what this shrunk estimate does is take the estimate for a particular asset and the estimate for the global mean, let's just call it mu bar, and move along this line by some amount alpha. When alpha is equal to 1, it's up here. When alpha is equal to 0, it's down here. For some intermediate value of alpha between 0 and 1, it's some point over here. This one has a very small error. That one has a bigger error. And when you shrink, you end up with an error at this point that is smaller. The tradeoff is that as you decrease alpha and start coming closer to the global mean, you have less information about what the asset is going to do, but you have less statistical error. As you increase alpha, you have more information about what the asset is going to do, but you have more estimation error. So, somewhere in between is the best thing.

The next expression is the same kind of idea applied to the covariance matrix. So here, this label shouldn't say "estimated" but "shrunk"; the "estimated" label belongs down here. So, we have a shrunk estimate for the covariance. All it does is take the estimated value of the covariance matrix and shrink it towards another covariance matrix where all the assets have the same volatility, or the same variance.
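The two shrunk estimates can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, alpha is simply supplied by the user rather than chosen optimally, and the covariance target is taken to be a common variance on the diagonal with zeros off the diagonal, which is one possible reading of "all the assets have the same variance".

```python
import numpy as np

def shrink_mean(returns, alpha):
    """alpha = 1 keeps the per-asset means; alpha = 0 gives the global mean."""
    returns = np.asarray(returns, dtype=float)
    mu_hat = returns.mean(axis=0)  # one estimate per asset
    # Global mean: pool the data of all assets, assuming a common mean.
    mu_bar = returns.mean()
    return alpha * mu_hat + (1.0 - alpha) * mu_bar

def shrink_covariance(returns, alpha):
    """Shrink the sample covariance towards an equal-variance target."""
    returns = np.asarray(returns, dtype=float)
    sigma_hat = np.cov(returns, rowvar=False)
    d = sigma_hat.shape[0]
    # One common variance for all assets: the average of the sample variances.
    common_variance = np.trace(sigma_hat) / d
    # Assumption: the target has zero off-diagonal entries.
    target = common_variance * np.eye(d)
    return alpha * sigma_hat + (1.0 - alpha) * target

# Made-up monthly returns for four assets over 60 months.
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.005, scale=0.05, size=(60, 4))

mu_shrunk = shrink_mean(sample, alpha=0.5)
sigma_shrunk = shrink_covariance(sample, alpha=0.5)
print("per-asset means:", sample.mean(axis=0))
print("shrunk means:   ", mu_shrunk)
```

With alpha at 0.5, each shrunk mean sits halfway between the asset's own estimate and the global mean, so the shrunk means are less spread out than the raw ones.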
Again, the idea is the same: if I compute one variance for all the assets, I have a lot more data and can estimate it better, and if I shrink the estimated covariance matrix towards this global covariance matrix, I end up getting a better estimate, meaning an estimate with lower errors.

Another way to improve parameter estimates is to use subjective views, and the most popular way of doing that is the so-called Black-Litterman method.

Recently, people have started to use non-parametric, nearest-neighbor-like methods to estimate performance. This is because people have started moving away from parametric models like mean-variance and towards more data-driven models. The idea here is to observe the current return r, go back into the past, and find all those times t where the return is close to the current return. So, this is the current return, and this is the return at some point t in the past. You want to make sure that it's pretty close to the current return. For all those times t, find out what happened at time t plus 1, and use that as a sample of what is going to happen in the future. These non-parametric methods are currently at a very theoretical level, but there is a possibility that they will provide a better way of doing portfolio selection in the future.
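The nearest-neighbor idea can be sketched in a few lines. This is a toy illustration, not a method taken from the lecture: the function name, the use of Euclidean distance to decide which past returns are "close", and the choice to keep the k closest times are all assumptions.

```python
import numpy as np

def nearest_neighbor_scenarios(returns, k):
    """Return the next-period returns that followed the k past returns
    closest to the current one.

    returns: array of shape (T, d); the last row is the current return.
    """
    returns = np.asarray(returns, dtype=float)
    current = returns[-1]
    # Candidate times t are all rows before the current one, so that
    # the return at time t + 1 has already been observed.
    candidates = returns[:-1]
    # Assumption: closeness is measured by Euclidean distance.
    distances = np.linalg.norm(candidates - current, axis=1)
    nearest_times = np.argsort(distances)[:k]
    # What happened at time t + 1 for each matching time t.
    return returns[nearest_times + 1]

# Made-up single-asset history; the last entry is the current return.
history = np.array([[0.00], [0.05], [0.01], [0.03], [0.00]])
scenarios = nearest_neighbor_scenarios(history, k=2)
print(scenarios)  # the returns that followed the two closest past returns
```

In this toy history, the two past returns closest to the current return of 0.00 are the first one (0.00) and the third one (0.01), so the sampled scenarios are the returns that followed them, 0.05 and 0.03.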