Now, let's take what we discussed in that last video regarding pros and cons and see how it relates to linear regression; this will hold for logistic regression as well. Model fitting is going to be relatively slow for linear regression, since we're conducting a search for the best parameters, whereas k-nearest neighbors is going to be relatively fast, since all we have to do is store the data. That's our fit: we just store the data with its labels, and later on we use that to decide which labels are closest. Once the model is fit, linear regression costs basically nothing to keep around. It only has to remember the short sequence of coefficients that determine the line, or the hyperplane in higher-dimensional space, so it's very memory efficient. On the other hand, k-nearest neighbors needs to remember all of the points, the entire training set, so it's going to be very memory intensive. Finally, prediction for linear regression is simply a linear combination of our feature vectors, so it's very fast, whereas k-nearest neighbors needs to determine which points are closest, which means computing the distance to each of the training points, so prediction will take a bit longer.

Now, let's take a look at the syntax for actually creating a KNN classification model. The first thing we want to do is import the KNN classifier from Sklearn: we import KNeighborsClassifier from sklearn.neighbors. Then, as we've done before, we create an instance of the class, passing in the hyperparameters we want. Here we are setting n_neighbors equal to 3; n_neighbors is that k we're trying to decide on, that is, how many neighbors we want to use when deciding which values are closest. There are other parameters available, of course. Here we're leaving them at their defaults, but you should be aware of what those defaults are, such as uniform weighting: with k equal to 3, each of those three closest neighbors counts with the same weight. For similarity, the distance metric defaults to the Euclidean distance. You can read more about the default setting for each parameter in the documentation, and I'd always suggest doing so whenever you're introduced to a new model.

The next thing we do is fit to our training set, as we've done with our other models: we call KNN.fit and pass in our X_train and y_train. Then, once the model has been fit, we can predict on our test set by calling KNN.predict and setting the result equal to y_predict; those will be our predicted values. Now, the methods fit and predict, as well as fit and transform, will be seen again and again in this course. I just want to highlight the pattern here. For models that predict, we'll call them predictors since they have this predict method, and in the fit-predict pattern we just saw, we generally pass both an X and a y set into fit. We fit to the X and the y so that we can then predict new values of y. A minimal sketch of this classification workflow is shown below.
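The following is a minimal sketch of the fit-predict workflow just described. The synthetic churn-style arrays generated here are placeholder data, assumed only so the snippet runs on its own; in the course notebook you would use the actual X_train, y_train, and X_test splits.

```python
# Minimal sketch of the KNN classification workflow (placeholder data).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic two-feature dataset with a binary label (e.g., churn / no churn)
rng = np.random.RandomState(0)
X = rng.rand(100, 2)
y = (X[:, 0] + X[:, 1] > 1).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Create an instance of the class, setting the hyperparameter n_neighbors (k).
# Other parameters are left at their defaults: weights='uniform' and the
# Minkowski metric with p=2, i.e. Euclidean distance.
KNN = KNeighborsClassifier(n_neighbors=3)

# fit stores the training points together with their labels
KNN.fit(X_train, y_train)

# predict labels for the test set by majority vote of the 3 closest neighbors
y_predict = KNN.predict(X_test)
```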
Then for transformations, when we're doing fit and transform, we'll call those transformers because they have the transform method, and we generally pass in just a single value, such as what we do with StandardScaler, where we just pass in the X, or with one-hot encoding, where we again just pass in the X, and then we get back our transformed variable. Now, if you're interested in doing the same thing for regression, all you have to do is replace KNeighborsClassifier with KNeighborsRegressor and ensure that your y_train and y_test sets both contain continuous values; everything else stays the same. A short regression sketch, along with the transformer pattern, appears at the end of this section.

So now let's recap what we've learned here about k-nearest neighbors. In this section, we discussed the k-nearest neighbors approach for classification and how it simply takes a majority vote amongst the k closest neighbors for the k that we chose. So if we chose k equal to 3, and two of our closest neighbors churned versus one that did not, we'd predict that the new customer would churn, since two of the closest neighbors churned and one did not. We discussed the KNN decision boundary and how different values of k influence what that boundary actually looks like. A high k means high bias, with the extreme being what we saw when k equals the number of data points: the model just predicts the majority class. A low k means high variance, with the extreme being k equal to 1, where we choose the single closest neighbor even if it happens to be an outlier. We highlighted the distance measurements, the most popular being the Euclidean and Manhattan distances, as well as the importance of feature scaling to ensure that no one feature has too much or too little influence on our prediction. Then finally, we discussed the implementation of KNN for both classification and regression using Sklearn.

Now, we'll take what we just discussed in lecture and begin to apply it in our notebook for k-nearest neighbors. Thank you.
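As promised above, here is a minimal sketch of the regression version, together with the transformer pattern where fit_transform takes only X. The data is again synthetic placeholder data, assumed only so the snippet runs end to end.

```python
# Minimal sketch of KNN regression plus the transformer pattern (placeholder data).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic two-feature dataset with a continuous target
rng = np.random.RandomState(0)
X = rng.rand(100, 2)
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Transformer pattern: fit_transform is called with X only
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)  # reuse the scaling learned on the training set

# Predictor pattern: fit takes X and y, predict takes X
knn_reg = KNeighborsRegressor(n_neighbors=3)
knn_reg.fit(X_train_scaled, y_train)
y_predict = knn_reg.predict(X_test_scaled)
```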