So Netflix had an algorithm to do this recommendation scheme called CineMatch, and CineMatch worked pretty well if you look at its predictions on some sort of test set, where again you'd hold back some of the values and look to see how it did. It would be within half a star of the true rating 75% of the time. So 75% of the predictions were within half a star, which is pretty good. But Netflix realized, way back around 2006, how important prediction was. Prediction is very important for customer loyalty, to increase user satisfaction, things along those lines. It's also important for inventory control, to be able to tell which movies are going to be going out, if you can make a prediction as to what people are going to like, and so on; there are other factors as well. But prediction is very important, if for nothing else, to make sure users are happy with the movies recommended to them.

So in 2006 they launched what's called the Netflix Prize, which you may have heard of before. It was a Netflix competition. Basically they said they would award $1 million to a team that could improve the RMSE over CineMatch's RMSE value by 10%. So you have to do 10% better than CineMatch, and then they'll give you $1 million. This was an open, online, international competition. Anyone who wanted to form a team could enter, and it really spurred a lot of research in the data mining and machine learning communities. We won't look into what those terms mean, per se, but there was just a lot of research into looking at data, trying to deal with it, and making predictions on it. And what this 10% means is that whatever prediction error CineMatch gets, you have to do 10% better. So take CineMatch's RMSE, multiply it by 0.9, and that's the value you had to get down to.

So now let's look at some of the logistics behind the Netflix competition, and then we'll look at the timeline in a minute. They released over a hundred million ratings to the public, and this consisted of about 480,000 users and 17,770 movies, which is huge. So think of the tables we showed before: you have users, you have movies, and then all your values, your ratings, your fives, fours, and so on. There were 480,000 users and 17,770 movies released in this data set. The thing is that this data is really sparse, which makes it difficult. We looked at what sparse means previously, but really what it means is that many users are going to rate only a few movies. Any given user may only rate a really, really small fraction of those 17,770; rating that many movies is a lot. But then you're probably wondering, how is it that you have 100 million ratings if users are only rating a very small number of movies? Well, really it's because a few users rate a ton of movies. Some of them rate tens of thousands of the movies in this data set, which is really remarkable.

And so Netflix had four data sets in total that they were keeping in the bank for this competition (they didn't release all of them), and they divided it like this. This is a good way to visualize how it goes. One of the sets, the set over here, was called the training set, and that set was released to the public.
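To make the 10% target and the sparsity figures concrete, here is a minimal sketch in Python. The CineMatch RMSE used below is an illustrative number, not the actual score (which isn't given here); the user, movie, and rating counts are the rounded figures from above.

```python
# Hypothetical example: if CineMatch's test-set RMSE were 0.95, the
# prize-winning target would be 10% lower (multiply by 0.9).
cinematch_rmse = 0.95            # illustrative value, not the real CineMatch score
target_rmse = 0.9 * cinematch_rmse
print(target_rmse)               # 0.855 -- the error you'd have to get down to

# Sparsity of the released ratings: ~100 million entries out of
# 480,000 users x 17,770 movies possible (user, movie) pairs.
num_users, num_movies, num_ratings = 480_000, 17_770, 100_000_000
density = num_ratings / (num_users * num_movies)
print(f"{density:.1%} of the matrix is filled in")   # roughly 1.2%
```

So even with 100 million ratings, only about 1% of the user-movie table actually has a value in it, which is what we mean by sparse.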
There were 100 million entries in that set that they released to the public, and this is the set that they said, okay, train your algorithms on this. So this is what you're going to use to set all your parameters, to try to get that error down as low as it can go. Then they also released what was called a probe set, and they said, use this probe set any time you want (it's public) to test how your training is doing. So this is basically to say, all right, we want to hold back, quote unquote, some of our entries. You have this huge bank of data here, and you say, okay, I'll keep this probe set aside, I won't train on it, and then I'll test my performance on the probe set after I've trained my parameters. Think about it like this: you have to go in and take a test. The training set is your study material for the test, and the probe set is a practice exam before the test. We'll look in a minute at why it's only a practice exam.

Now, there were private elements of this data too, which were not released to the public: what's called a quiz set, and then a test set. For the quiz set, they allowed people to test on it once every day. So you could submit your algorithm, test it on the quiz set once every day, and see how well it was doing, and they actually based the progress prizes on that quiz set. Then the real test, for the million dollars at the end, was on the test set. That would only happen once, right at the end, when you would see how well you did.

So again, we draw that analogy. The training set is your study material; you use it to study for the test. The probe set is homework problems or practice problems before you go into the test; it gives you a gauge of how well you think you're going to do. The quiz set is maybe a practice exam that your instructor gives out, but there are only a few of them, or you can only take them a certain number of times. And the test set is the actual exam you walk into. How well you do on the probe set and the quiz set can give you some idea of how you're going to do on the test, but you never quite know; there could be some curveballs. Still, it gives you a pretty good estimate, and the better you do on those probes and quizzes after training, the better you expect to do on the test set.

So, to review: the training set is really the input data to the algorithms, the probe set is used to gauge performance, the quiz set you can test on once per day but it's held by Netflix so it's private, and the test set is the real test at the end. And we'll bring back that classroom analogy again here, because this sectioning was actually a really smart way of doing it: it prevented people from reverse engineering the test set. A sketch of the train/probe idea follows below.
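Here is a minimal sketch of that train/probe split, assuming made-up ratings and a trivial "predict each movie's average rating" model as a stand-in for a real recommendation algorithm; this is not Netflix's actual setup, just an illustration of holding data out and scoring it with RMSE.

```python
import random
import math

# (user, movie, rating) triples, purely illustrative
ratings = [
    ("u1", "m1", 5), ("u1", "m2", 3), ("u2", "m1", 4),
    ("u2", "m3", 2), ("u3", "m2", 4), ("u3", "m3", 5),
]

random.seed(0)
random.shuffle(ratings)
probe = ratings[:2]      # held out: never used to fit parameters
train = ratings[2:]      # used to "train" (here, compute per-movie averages)

# "Training": average rating per movie, with a global average as fallback.
sums, counts = {}, {}
for _, movie, r in train:
    sums[movie] = sums.get(movie, 0) + r
    counts[movie] = counts.get(movie, 0) + 1
global_avg = sum(r for _, _, r in train) / len(train)
predict = lambda movie: sums[movie] / counts[movie] if movie in sums else global_avg

# Evaluate on the probe set with RMSE, the competition's error measure.
rmse = math.sqrt(sum((predict(m) - r) ** 2 for _, m, r in probe) / len(probe))
print(f"probe RMSE: {rmse:.3f}")
```

The quiz and test sets play the same role as the probe set here, except Netflix keeps them private and only lets you score against them a limited number of times.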
So reverse engineering would mean, okay, if you gave me the test set, then what I could do is set everything in my algorithm so that it basically drives the test set error to zero. But that's not very useful, because there could be other data sets that we haven't tested on, and the algorithm would do terribly on them; you're just tuning it for this one set right here. So that's the reverse engineering idea. If you reverse engineered the test set, that would be like getting the exam before you went in to take it, learning all the answers by looking in the back of the book, so that when you go in to take the test you get 100%. But if the instructor happened to change the questions on the exam, you might not do so well, because you don't know the material; you didn't actually study with the training set. That's a good way of thinking about it. So this sectioning prevented the whole reverse engineering idea. Netflix obviously doesn't want that, because an algorithm that's just reverse engineered to one test set isn't going to be useful to them on their entire data set; they need a robust algorithm that works across a lot of different sets, not just one specifically.