[MUSIC] The final topic of this module on eigenproblems, and the final topic of this course as a whole, is an algorithm called PageRank. This algorithm was famously published by, and named after, Google founder Larry Page and colleagues in 1998, and was used by Google to decide the order in which to display websites returned from a search. The central assumption underpinning PageRank is that the importance of a website is related to its links to and from other websites, and, as we'll see, eigen theory comes up naturally.

This bubble diagram represents a model mini internet, where each bubble is a webpage and each arrow from A, B, C and D represents a link on that webpage which takes you to one of the others. We're trying to build an expression that tells us, based on this network structure, which of these webpages is most relevant to the person who made the search. To do this, we'll use the concept of Procrastinating Pat, an imaginary person who goes on the internet and just randomly clicks links to avoid doing their work. By mapping all the possible links, we can build a model to estimate the amount of time we'd expect Pat to spend on each webpage.

We can describe the links present on page A as a vector, where each row is either a one or a zero depending on whether there is a link to the corresponding page, and then normalise the vector by the total number of links, so that it describes a click probability for that page. For example, the vector of links from page A is (0, 1, 1, 1), because page A has links to sites B, C and D, but no link to itself. Since there are three links on this page in total, we normalise by a factor of a third, so that the total click probability sums to one, and we can write L_A = (0, 1/3, 1/3, 1/3). Following the same logic, the link vectors for the next two sites are shown here. Finally, page D is connected to B and C but not to A, two sites in total, so L_D = (0, 1/2, 1/2, 0).

We can now build our link matrix L by using each of our link vectors as a column, which forms a square matrix. What we're trying to represent with the matrix L is the probability of ending up on each of the pages. For example, the only way to get to A is by being at B, so you then need to know the probability of being at B, which you could have got to from either A or D. As you can see, this problem is self-referential, since the ranks of all the pages depend on all the others. Although we built our matrix from columns of outward links, the rows describe inward links, normalised with respect to their page of origin.

We can now write an expression which summarises the approach. We'll use the vector r to store the rank of all webpages. To calculate the rank of page A, you need to know three things about every other page on the internet: what is your rank, do you link to page A, and how many outgoing links do you have in total? The following expression combines these three pieces of information for webpage A only:

r_A = Σ_{j=1}^{n} L_{A,j} r_j,

where n is the total number of webpages. The sum scrolls through each of the webpages, which means that the rank of A is the sum of the ranks of all the pages which link to it, weighted by their specific link probability taken from the matrix L.
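As a concrete illustration, here is a minimal sketch in Python (using NumPy) of how the link matrix could be assembled for this four-page network. The columns for A and D come directly from the link vectors above; the links assumed for B and C (B to A and D, C to D) are read off the diagram and chosen to be consistent with the statements that B is the only page linking to A and that B can be reached from A or D, so treat them as an assumption rather than something stated in the narration.

```python
import numpy as np

# Outgoing links for each page. A and D match the transcript; B and C are
# assumed from the diagram, consistent with the ranks quoted later.
links = {
    'A': ['B', 'C', 'D'],   # L_A = (0, 1/3, 1/3, 1/3)
    'B': ['A', 'D'],        # assumed from the diagram
    'C': ['D'],             # assumed from the diagram
    'D': ['B', 'C'],        # L_D = (0, 1/2, 1/2, 0)
}
pages = ['A', 'B', 'C', 'D']
n = len(pages)

# Column j of L holds the normalised outgoing-link probabilities of page j.
L = np.zeros((n, n))
for j, page in enumerate(pages):
    for target in links[page]:
        L[pages.index(target), j] = 1 / len(links[page])

print(L)

# Row A of the rank equation, r_A = sum_j L[A, j] * r_j,
# evaluated with equal starting ranks of 1/n.
r = np.full(n, 1 / n)
print(L[0, :] @ r)
```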
Now we want to be able to write this expression for all pages and solve them simultaneously. Thinking back to our linear algebra, we can rewrite the above expression applied to all webpages as a simple matrix multiplication: r = Lr. Clearly, we start off not knowing r, so we simply assume that all the ranks are equal and normalise them by the total number of webpages in our analysis, which in this case is 4. So r = (1/4, 1/4, 1/4, 1/4). Then, each time we multiply r by our matrix L, we get an updated value for r, so we can write r_{i+1} = L r_i. Applying this expression repeatedly means that we are solving the problem iteratively. Each time we do this, we update the values in r until, eventually, r stops changing, at which point r really does equal Lr. Thinking back to the previous videos in this module, this implies that r is now an eigenvector of the matrix L, with an eigenvalue of 1.

At this point, you might well be thinking that if we want to multiply r by L many times, perhaps this would be best tackled by the diagonalisation method we saw in the last video. But don't forget, that would require us to already know all of the eigenvectors, which is what we're trying to find in the first place. So now that we have an equation, and hopefully some idea of where it came from, we can ask our computer to apply it iteratively until it converges to find our rank vector.

You can see that although it takes about ten iterations for the numbers to settle down, the order is already established after the first iteration. However, this is just an artefact of our system being so tiny. So now we have our result, which says that as Procrastinating Pat randomly clicks around our network, we'd expect them to spend about 40% of their time on page D, but only about 12% of their time on page A, with 24% on each of pages B and C. We now have our ranking: D at the top, A at the bottom, and B and C equal in the middle.

As it turns out, although many approaches for efficiently calculating eigenvectors have been developed over the years, repeatedly multiplying a randomly selected initial guess vector by a matrix, which is called the power method, is still very effective for the PageRank problem, for two reasons. Firstly, although the power method will clearly only give you one eigenvector, and we know there will be n of them for an n-webpage system, it turns out that, because of the way we've structured our link matrix, the vector it gives you will always be the one you're looking for, with an eigenvalue of 1. Secondly, although this is not true for our four-webpage mini internet, when looking at the real internet almost every entry in the link matrix will be zero, i.e., most pages don't link to most other pages. This is referred to as a sparse matrix, and algorithms exist so that the multiplications can be performed very efficiently.

One key aspect of the PageRank algorithm that we haven't discussed so far is the damping factor, d. This adds an additional term to our iterative formula:

r_{i+1} = d L r_i + (1 - d)/n.

Here d is a number between 0 and 1, and you can think of it as 1 minus the probability with which Procrastinating Pat suddenly, randomly types in a web address rather than clicking on a link on their current page. The effect this has on the calculation is to strike a compromise between the speed and the stability of the iterative convergence process.
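To make the iteration concrete, here is a minimal sketch of the power method with the damping term, again in Python with NumPy. The pagerank helper, its tolerance and its iteration cap are illustrative choices rather than anything specified in the video; setting d = 1 recovers the plain iteration r_{i+1} = L r_i, and d = 0.85 is shown only as an example of a damping value.

```python
import numpy as np

# Link matrix from the earlier sketch (columns = outgoing-link probabilities).
L = np.array([[0,   1/2, 0, 0  ],
              [1/3, 0,   0, 1/2],
              [1/3, 0,   0, 1/2],
              [1/3, 1/2, 1, 0  ]])

def pagerank(L, d=1.0, tol=1e-10, max_iter=1000):
    """Power iteration r_{i+1} = d * L @ r_i + (1 - d) / n."""
    n = L.shape[0]
    r = np.full(n, 1 / n)                     # equal initial ranks, as in the video
    for _ in range(max_iter):
        r_next = d * L @ r + (1 - d) / n
        if np.linalg.norm(r_next - r, 1) < tol:   # r has stopped changing
            break
        r = r_next
    return r_next

print(np.round(pagerank(L), 2))           # roughly [0.12, 0.24, 0.24, 0.4]
print(np.round(pagerank(L, d=0.85), 2))   # damped version, same ordering of pages
```

Running the undamped version reproduces the ranking described above, with D at the top and A at the bottom; adding damping nudges the numbers slightly but, for this tiny network, leaves the ordering unchanged.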
There are over one billion websites on the internet today, compared with just a few million when the PageRank algorithm was first published in 1998, and so the methods for search and ranking have had to evolve to maximise efficiency, although the core concept has remained unchanged for many years. This brings us to the end of our introduction to the PageRank algorithm. There are, of course, many details which we didn't cover in this video, but I hope this has allowed you to come away with some insight and understanding into how PageRank works, and hopefully the confidence to apply it to some larger networks yourself. [MUSIC]