Hi folks. So we're back again, and now we're looking at eigenvector-based centrality measures, and and again, as, in terms of what we've looked at for characterizing the position of a node in a network, we've looked at things like degree. or, you know, local clustering, distance to other nodes and then in terms of measuring the centrality, influence and power, one difficulty is that when we've looked at things like degree centrality it doesn't necessarily capture the importance of the node's friends, so you know, when we look at this picture, for instance... the, there, there's sort of two aspects to it. One is that we're, you know, this node might be much further, from a lot of nodes than this one. So, things like de, decay and closeness centrality can capture the fact that it's sort of out in the outskirts. But another thing is that when we look at the friends of this particular node. So if we look at the connections of this node they each have degree 2 themselves whereas this node's friends have degree 6 and 7. And so in some sense they are better connected than than this other node's friends. And so the idea of,of eigenvector centrality is that your importance comes from being connected to other important. Nodes, and in particular here, if we look at a centrality notion where the centrality is proportional to the sum of the neighbor centralities, then we're defining eigenvector centrality. So the node i's centrality is proportional to the centrality Of the nodes to which I is connected. So that gives us a, a, an equation which says that CI is dependent on the sum of the CJs, where what's going to happen here is if GIJ is equal to one, then we get. The c, j being counted and if it's equal to 0 then it's not counted so we're just counting the centrality's of the friends you're connected to. Okay, what's what's interesting here now is that this is a system of equations and a system of unknowns. So we have That's in order to figure out what Ci is we have to figure out what all the Cj's are and so this is a set of equations and a set of unknowns and so basically what we have is that the vector C is equal to sum a times the matrix g times the vector C. And so this is what's known as an Eigenvector. Right? So it's vector where you multiply the matrix by that vector and you get back something which is proportional to the original vector you started with. So this a is just some proportional constant which tell you how you're scaling and that's what's known as an Eigenvector. Now we'll say a little more about that in a moment. but what we can begin to see is that if we calculate the eigenvector centrality of these, so solving that system of equations and unknowns and looking for non negative solutions what we'll find is this one gets a 0.31. This one get a 0.11. This one is essentially three times as important according to this Eigenvector Centrality measure of importance. Okay, so what's happening here is you're getting value from connections to others, it's a self-referential concept. So, we have to figure out how to solve that. And in particular what we have is we have this equation which we just wrote the Eigenvector centrality then of a network g is proportional to itself times g. And one thing about Eigenvectors is there's many possible solutions. So if we have n equations and n unknowns. generally there can be upton N different solutions. And what we tend to do is look for the one with the largest eigenvalue and there's a theorem in matrix algebra known as the Perron-Frobenius Theorem, which says that if we're dealing with a non negative matrix. Then the eigenvector associated with the largest eigenvalue is going to have nonnegative values. and in particular in a lot of contexts we'll be looking at the other eigenvectors wont make sense that might have some negative values or even nonreal values. Use. so the conventional beats, so look for the one which has all, non negative real values. That's going to be the one with the largest Eigen value. And, you know, there's a variety of programs that will calculate Eigen vectors for you, so if you have a matrix... representing your, your network. You can plug them into different programs and at least get an approximation of the largest the eigenvalue. the eigenvector associated with the largest eigenvalue. then we can sort of normalize the, the entries to sum to one so any eigenvector if, if you multiply it by two then that's also an eigen vector. So, so these are all going to be taken up to. re-scale things and so if we normalize these things and sum to 1 and then, then, we'll have a well defined definition. Okay so if we begin to look at you know the Eigen vector centrality of different farmers and the peasant nation data. Actually these are not quite normalized. but here, you know, we see that again the Medici come out higher than other families. Not as strikingly high but that's going to give us a different picture of, of what's going on in this, in this kind of setting. now when we talk about eigenvector centrality, there's a number of, of different concepts that are related to it. Google page rank, so didn't got for little page, was the system that sort of helped Google become the search engine that, that really dominated the market early on. And the idea was that, that when you tighten something and it had a bunch of pages, that it could show what it have, whatever term you to type in on it. it had to pick one and the way that it sort of picked the ordering originally was that it would look for the ones with the largest eigenvector centrality. So it was trying to look for pages that were connected to by pages that were important and that was a sign of its important. And no it's that. Algorithm is now much more complicated and is evolved over time to do all kinds of other kind, you know, things people are manipulating the algorithm. It also could take into account what it thinks people might want and so forth. but the basic idea here was eigenvector centrality to begin with. Something known as the random surfer model as well, which is. You know, suppose you hopped on the web and you just started picking pages at random. And then so you go to a page and click on one of it's links at random, go to that page, click on a link at raondom and so forth. And then look at the fraction of time you would spend on each page as you went through this process over time. That's going to be related to an eigenvector calculation. So eigenvector calculations are ones that encapsulate this idea of, of important connections to other nodes and then being connected to by big important ones helps bring more traffic to you. we can go back to the the network we had before and you know we can put in eigenvector centrality measures, and now node 3 actually comes out ahead of node 4. so, so you know depending on which way you do these things, you get different measures. One last measure which is going to be very useful and is related to eigenvector centrality is what's known as Bonacich centrality or Katz- Bonacich centrality. It actually builds on a, a definition originally by Katz that was later extended by Bonacich, and the idea is that you give each nodes on base value. And the base value is some some number a times the degree of the node. Okay, so, so if I have degree 10, and , and a is, is a half, then I start with a value of 5. And so everybody starts with some value proportional to their degree. And now you add in all paths of length 1 from i to some j times b times j's base value. So I get some fraction of my neighbors value. So let's say that b is a third so now I've got a base value, plus I add in 1 3rd times my friends space values. Then I go to length 2 and I add in b squared times those peoples values. So this is you know 1/2 times my friends or sorry 1/3 times my friends values now this is going to be 1/9 Times friends of friends values, right,. So I can go out length 2, blocks of length 2. Some of them might come back to me. so it, does sort of counts these things doubly. But what it's doing is just counting, so I get a times my degree which is just g times the vector of 1. Then I get b times walks of length 2, remember g squared gives us walks of length 2. so I'm getting b times the connections of length 2, I have times their base values. Then b squared times length 3 and so forth. So what we end up with is this calculation where we're sort of raising g to successive powers, and then weighting that by some level b. And if we normalize the a here to 1, it's just multiplying the whole thing. Then basically, you know, this thing is going to be like a sum of a series, and I'm getting more Value from inderect walks times the value of those nodes. If you want this thing to ber convergent so that actually gives you a number, b's going to have to be relatively small in order for that to be finite. it's going to have to depend on the magnitude of the matrix. the, the largest Eigen value. but a small enough B will make sure that this thing converges. So when we look and Bonacich Centrality, then we can go through and calculate this for different values. and you know go through and for instance then, the Medici again appear more important that some of the other families. so, you know, this kind of calculation gives you something like Eidenvector centrality, and in fact for b being close enough to 1 over the, largest eigen value of the matrix, then it's going to begin to approach eigen centrality, but more generally it can diverge from them. And depending on what bs you work with you are going to get different notions, so here if we normalize a to 1 and work with different bs we can get different values of Bonacich centrality out, and it's going to be another measure of centrality. Basically we've go a whole sequence of different measures of centrality. Some of them are going to work better in some contexts than others. Exactly how we do that and which ones are going to be right for what contexts depends on, on what we're studying. And again this sort of four different ideas that we've gone through degree is just getting basic connectedness of a node. How, how many neighbors does it have. Then there's things like how close are you, how easily can you access other nodes. Then there's connector based definitions. So how important are you as an intermediary. And then there are these sets of influence, prestige, and eigenvectors sorts of things that are capturing not only who you know, but weighting by their importance and doing sort of iterative calions of calculations that will capture importance of the value of your friends as measuring your own importance. so what we'll do is, is now we'll have a, a quick look at one application, which will distinguish which one of these centrality measures does well. And which ones don't, in a particular setting. and then we'll start to move on and, and try and understand a little bit more about formation of networks.