So now we actually use the similarity values, but before we can use them we have to calculate them. And when we compute the similarities, we compute them on the errors that we obtained from the baseline prediction. We basically subtract to figure out how far off the baseline prediction was, and again we do this only for the training set, because the training set is all we're allowed to use. Then we augment the baseline predictor with the similarity values that we obtain from those errors.

There are a couple of reasons why we do it this way. The first is that we really need to center these values at zero, and working with the errors does that for us: by going to an error we've already subtracted out any bias, so the errors will include zero and, in a sense, be centered around it. In order to do correlation, you need things centered about zero, so that you can have either positive or negative values. If we used the movie ratings themselves, which run from 1 to 5, we could never get negative values, only positives, and that doesn't give us the sign differential we need. So instead we subtract down and work with values that go from around zero out to some positives and negatives.

The second reason is that we're really trying to correct for the errors here. This is augmenting the baseline predictor with similarity, which is what the neighborhood method really is. Since we want to correct for those errors, it makes sense to work on the errors themselves, because the errors are exactly what we want to make go away. Of course, we're not just going to add the errors back and give ourselves zero error on everything; we're going to do it in a way that makes sense and isn't reverse engineering. We'll see how in a minute, but first let's try to calculate some of the values.

So here I'm showing the table of error values. I got this, again, by subtracting the predictions from the actual values: I took the actual values and subtracted the predictions. So if a prediction was higher than the actual value, the error is negative, which means we should have made the prediction lower than we had it. And if a prediction was less than the actual value, the error is positive, which means the prediction should have been higher. We can then use the positive and negative values accordingly.
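Just to pin down that bookkeeping, here is a minimal Python sketch of how such an error table could be built (the function and array names here are my own, not the lecture's; NaN is used to mark entries we're not allowed to touch):

```python
import numpy as np

def prediction_errors(train_ratings, baseline_predictions):
    """Error table for the neighborhood step: actual rating minus baseline
    prediction, per (user, movie) training entry.

    A negative error means the baseline predicted too high (the prediction
    should have been lower); a positive error means it predicted too low.
    Because these are differences, they sit around zero, which is what the
    cosine-similarity step needs.
    """
    actual = np.asarray(train_ratings, dtype=float)
    predicted = np.asarray(baseline_predictions, dtype=float)
    # NaN marks entries we can't use (unrated movies, or ratings held out in
    # the test set); NaN propagates through the subtraction, so those
    # entries stay unusable in everything that follows.
    return actual - predicted
```

Anything that is NaN in the training ratings, an unrated movie or a held-out test entry, stays NaN in the error table, so it simply drops out of the later similarity sums.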
Now, for movies 1 and 2, let's just apply that cosine similarity. To do that, we first have to figure out which users have rated both movies, because in the equation we can't use users who have rated only one of them, and clearly we can't use users who have rated neither. For instance, we can't use A here, because A has not rated movie 2. We can use B, because B has rated both movies. We can't use C, because that entry is part of the test set, and we can't use the test set. We can't use D, because that's also part of the test set. We can't use E, because we don't know its value. But we can use F, because we know both of F's values. So we only get to use B and F for this pair.

For movies 1 and 2, then, we just apply the equation we had before. In the numerator, remember, we add up the products of the terms: for each of those users, we multiply that user's error on movie 1 by the same user's error on movie 2 and sum. So B contributes (-0.30)(0.17) and F contributes (-0.05)(-0.58), which together come to -0.0220. Then, remember, we have to divide by the lengths: the square root of (-0.30)² + (-0.05)² for movie 1, times the square root of (0.17)² + (-0.58)² for movie 2, which is 0.3041 times 0.6044. Doing this out, -0.0220 over 0.3041 times 0.6044 comes to roughly -0.11. So the cosine similarity between movies 1 and 2 is -0.11.

Remember, we said we have to see whether the value is closer to -1, to 0, or to +1. If it's close to -1 or +1, it's useful; otherwise it's not. And you can see this one is really quite close to 0, so it's not very useful at all. These two movies, we would say, are really not very correlated.

Now let's try movies 3 and 5 as another example. We go through the users again. We can use A, because A has rated both movie 3 and movie 5. We can use B, again because B has rated movie 3 and movie 5. We can't use C, not because it's part of the test set this time, but because it's a value we simply don't know. We can use D, because we know both of D's values. We can't use E, because that's part of the test set. And we can't use F, because that's part of the test set too. So now we have three values on each side instead of two, which means a few more terms.

The calculation is the same as last time, except we're summing three terms instead of two. In the numerator we have (-1.00)(-0.43), plus (0.25)(-0.10), plus (0.25)(-0.10) again; the last two terms actually turn out to be the same. Then we divide by the square root of 1² + 0.25² + 0.25², times the square root of 0.43² + 0.10² + 0.10². Notice that I'm omitting the negative signs inside the squares, because when you square something it becomes positive again, so I don't need to write them. I'll leave it to you to run through the whole multiplication, but it comes out to 0.79 right here. That's closer to +1: it's a positive correlation, and a reasonably strong one, so we would say these two movies are positively correlated.

Now that we can do this, we can come up with a full table of similarity values, and here I'm tabulating it. This entry is the similarity between movies 1 and 2, -0.11; the similarity between 3 and 5 is the 0.79 we just found; and so on. A couple of things to note. First, this table is symmetric. We've said before that some things aren't symmetric, but here they are: the similarity from 1 to 2 is the same as the similarity from 2 to 1, which is why the (1,2) and (2,1) entries are the same, and likewise (2,4) and (4,2), for instance. If you flip the table over its diagonal, the mirror image is the same table.
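Here is a small Python sketch of the pairwise calculation we just walked through (the function name and the NaN-masking convention are my own assumptions, not the lecture's notation); the two example calls plug in the error values we used for movies 1 and 2 and for movies 3 and 5:

```python
import numpy as np

def cosine_similarity(errors_i, errors_j):
    """Cosine similarity between two movies' error vectors.

    errors_i[u] and errors_j[u] are user u's baseline errors on movies i
    and j, with NaN where u has no usable training error.  Only users with
    an error for BOTH movies are kept, as in the worked examples.
    """
    e_i = np.asarray(errors_i, dtype=float)
    e_j = np.asarray(errors_j, dtype=float)
    both = ~np.isnan(e_i) & ~np.isnan(e_j)   # users who rated both movies
    x, y = e_i[both], e_j[both]
    if x.size == 0:
        return 0.0                           # no overlap: treat as uncorrelated
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

# Movies 1 and 2: only users B and F have training errors for both movies.
print(cosine_similarity([-0.30, -0.05], [0.17, -0.58]))
# prints about -0.12 from these rounded entries; the lecture quotes -0.11

# Movies 3 and 5: users A, B, and D have training errors for both movies.
print(cosine_similarity([-1.00, 0.25, 0.25], [-0.43, -0.10, -0.10]))
# prints about 0.79
```

Building the full symmetric table is then just a loop over movie pairs, calling this on each pair's columns of errors.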
Now, in the next segment we're going to choose one neighbor for each movie. So the neighbors here are movies, and for each movie we're going to choose one neighbor. We could choose more, two or three or four, but we've already made the math complicated enough, and that would make it even more complicated, so we'll just stick to choosing one neighbor for each movie, which will simplify things a lot when we actually go to do it out.

So, basically going down the columns, for movie 1 we want to find the neighbor with the highest similarity, and that's movie 3 here; the entries with the green backgrounds are the ones with the highest similarities. So movie 1 chooses movie 3 as its neighbor. And again, we're going by magnitude, so we want the magnitude to be the highest: even though -0.82 is negative, very negative, being strongly negatively correlated is still a useful thing.

For movie 2, we choose among -0.11, -0.74, -1, and 0.88, and so we choose movie 4: even though that similarity is negative, it's still the highest in magnitude. Movie 3 chooses movie 1, because -0.82 is higher in magnitude than any of its other values. Movie 4 chooses movie 2, just like movie 2 chose movie 4; those two are actually perfectly negatively correlated with each other. The reason they come out perfectly negatively correlated is that there's only one value being multiplied for that pair, and so that's not really that great: normally you need more data before you say two things are perfectly negatively correlated, but here that's just the way it turns out.

Now, movie 5 will actually choose movie 2 as its neighbor, since 0.88 is higher than its other values. And notice that movie 2 did not choose movie 5; it chose movie 4. But movie 5 chose movie 2. So they don't have to choose each other: even though the table is symmetric, the movies don't necessarily choose each other.

We could also define other rules. For instance, we could say we're not going to use similarity at all unless the similarity value is higher in magnitude than, say, 0.9. In that case we would only end up using a neighbor for one pair here, and for the rest we wouldn't use similarity at all. Some people do that, because sometimes it makes sense to say that unless the similarity is high enough, you shouldn't use it. But here we're just going to choose the most similar movie and use that to do our calculation.
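As a rough sketch of that selection rule (again in Python, with names and a NaN convention for unknown entries that are my own illustration, not code from the course):

```python
import numpy as np

def pick_neighbors(sim, min_abs_sim=0.0):
    """Choose one neighbor per movie from a symmetric similarity matrix.

    sim[i, j] is the similarity between movies i and j (NaN where unknown;
    the diagonal is ignored).  Each movie keeps the single other movie whose
    similarity is largest in MAGNITUDE, since a strong negative correlation
    is just as useful as a strong positive one.  If nothing reaches
    min_abs_sim (e.g. 0.9 in the stricter variant), the movie gets no
    neighbor at all.
    """
    sim = np.asarray(sim, dtype=float)
    neighbors = {}
    for i in range(sim.shape[0]):
        strength = np.abs(sim[i])        # new array, safe to modify
        strength[i] = np.nan             # a movie can't be its own neighbor
        if np.all(np.isnan(strength)):
            neighbors[i] = None          # no usable similarities at all
            continue
        j = int(np.nanargmax(strength))  # largest |similarity|
        neighbors[i] = j if strength[j] >= min_abs_sim else None
    return neighbors
```

Because each movie just takes the largest magnitude in its own row, the choices don't have to be mutual, which is exactly what happens in the table when movie 5 picks movie 2 but movie 2 picks movie 4; and raising min_abs_sim to something like 0.9 gives the stricter variant, where most movies end up with no neighbor at all.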