In order to combat the curse of dimensionality, people use dimensionality reduction. Again, this is an umbrella term for various techniques.

One example is feature selection. The idea of feature selection is: if you have a dataset with many, many features, you probably don't have to train your classifier using all of them. Instead, you can hand-pick specific features that have strong predictive power for the target you're trying to predict, rather than blindly using all the features that are available.

Feature selection can be automated. People use measures like information gain from information theory, similar to how decision tree classifiers work. People also use domain knowledge, so you might consult experts who can tell you which attributes have sufficient predictive power for the problem you're trying to solve.

For example, let's say you have an automotive dataset with data for different cars, specific models and manufacturers, and you have attributes like the weight of the vehicle, the color, the brake horsepower, the maximum RPM, the fuel type, and so on. And let's say you want to predict the MPG, the miles per gallon. You could, in theory, use all these attributes to train the classifier, but it would be better to actually speak with an automotive engineer, and they will probably tell you that the MPG is mostly influenced by, say, the weight of the vehicle and the fuel type, but definitely not by the color of the vehicle. So you can remove that attribute from your input dataset when training the classifier. (A small sketch of the automated version of this idea follows at the end of this section.)

Another approach to tackling the curse of dimensionality is feature reduction. The idea of feature reduction is that you have a number of features, let's say x_1, x_2, x_3, x_4, x_5, you feed all of them into some transformation, and at the end you get a new, smaller set of features, maybe just two, that retain most of the information that was carried by the original x_1 through x_5. So the idea of feature reduction is: take all the features from the dataset and generate a dataset with a smaller number of dimensions, while at the same time retaining as much information as possible from the original features.

There are different techniques for feature reduction: principal component analysis, linear discriminant analysis, multi-dimensional scaling, and there are also non-linear methods like self-organizing maps, autoencoders, and so on.

But in this session, we will focus on principal component analysis, as it is a really universal method. It's widely used and extremely useful. So, in the next video, we'll talk about PCA.
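To make the automated feature selection idea concrete, here is a minimal sketch using mutual information, an information-gain-style measure, via scikit-learn. The car data below is entirely made up to mirror the lecture's example; the column names (weight, fuel_type, color) and the target (mpg) are illustrative assumptions, not a real dataset.

```python
# Sketch: rank features by mutual information with a continuous target (mpg).
# The synthetic data is constructed so that mpg depends on weight and
# fuel_type but not on color, matching the lecture's intuition.
import numpy as np
import pandas as pd
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
n = 200
weight = rng.uniform(800, 2500, n)       # vehicle weight in kg
fuel_type = rng.integers(0, 2, n)        # 0 = petrol, 1 = diesel (label-encoded)
color = rng.integers(0, 5, n)            # 5 arbitrary colors (label-encoded)

# mpg is driven by weight and fuel type plus noise; color plays no role
mpg = 60 - 0.015 * weight + 5 * fuel_type + rng.normal(0, 2, n)

X = pd.DataFrame({"weight": weight, "fuel_type": fuel_type, "color": color})
scores = mutual_info_regression(
    X, mpg, discrete_features=[False, True, True], random_state=0
)

for name, score in sorted(zip(X.columns, scores), key=lambda t: -t[1]):
    print(f"{name}: {score:.3f}")  # weight and fuel_type score high, color near zero
```

On real data you would compute these scores on the training set and keep only the top-scoring columns, which is exactly the "remove the color attribute" step, done automatically.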
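And as a small preview of the next video, here is a minimal sketch of feature reduction with PCA in scikit-learn: five synthetic, correlated features are compressed into two new features while retaining most of the original variance. The data and the exact variance numbers are illustrative assumptions, not results from the lecture.

```python
# Sketch: reduce 5 features (x_1..x_5) to 2 principal components.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 300
# Build 5 features that are really driven by 2 underlying factors,
# so 2 components can capture most of the information.
factors = rng.normal(size=(n, 2))
mixing = rng.normal(size=(2, 5))
X = factors @ mixing + rng.normal(scale=0.1, size=(n, 5))

# PCA is scale-sensitive, so standardize the features first
X_std = StandardScaler().fit_transform(X)

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_std)           # new dataset, shape (n, 2)

print(X_reduced.shape)
print(pca.explained_variance_ratio_)           # variance kept per component
print(pca.explained_variance_ratio_.sum())     # total variance retained
```

Note that unlike feature selection, the two output columns here are not two of the original features; they are new features, linear combinations of all five inputs, chosen to retain as much variance as possible.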