Okay, hello. So now we're looking at centrality measures. We're going to talk about positioning nodes inside a network and understanding how they're positioned.
In terms of the way we've been going through our definitions and trying to understand the structure of networks, what we've done so far is talk about basic patterns of networks: degree distributions, path lengths, and things like that. We talked a little bit about homophily and the fact that there can be segregation among nodes. We talked a little bit about local patterns, things like clustering and the related concept of transitivity. As we go forward in the course, we'll also talk a little bit about how many links are actually supported, in the sense that the two nodes have friends in common.
Those are things that characterize the network itself. We'll also be very interested in understanding how different nodes are positioned in a network, and so how we can talk about whether a node is important, influential, central, powerful, et cetera. When it comes to describing a position in a network, there are different aspects of individual characteristics, some of which we have already talked about: how connected a node is, how clustered its friends are, its distance to other nodes. More generally, trying to capture centrality, influence, and power is going to build on specific definitions that keep track of a node's position.
So, in terms of looking at a node's centrality, the most basic measure of how important or influential a node is, is just how connected it is, and that's captured directly by degree. Degree captures some notion of the connectedness of a node.
To make it a measure between zero and one, we can divide through by n minus 1, the most links a node could possibly have. That gives the fraction of people I'm connected to compared to how many I could be connected to.
If we look, for instance, at the Medici, the Florentine data we had before, what do we see for the different families? The Medici have a degree of six, the Strozzi a degree of four, the Guadagni a degree of four, and the Albizzi a degree of three. So among some of the most important families, the Medici were better connected in terms of having higher degree. It's not an enormous difference, but there's some difference there, so degree is capturing some of what goes on.
But degree is actually going to miss a lot of what's going on in a network. For instance, here node 3 has the same degree as node 1 or node 2, and yet, just looking at the network, we might think of node 3 as being less central than some of the other nodes. How do we capture the fact that degree isn't really picking up all of position? It's just saying how big your local neighborhood is. It's not saying where you are positioned in the network, or how central you are in a deeper sense.
So in order to get at things like centrality, there are different types of things that we can think about capturing. What I've done here is break things down into four different categories. Degree is really just capturing basic connectedness. Another thing we might care about is how close you are to other nodes. So closeness centrality measures, and what we'll also look at in terms of decay, capture the ease of reaching other nodes: how far am I, on average, from other nodes? Betweenness, which we talked about very briefly, we'll look at in a little more detail. It captures a node's role as an intermediary or connector: do other pairs of nodes have to go through me in order to reach each other? That's a very different concept than how close I am to somebody else. It asks whether I'm important as a connector of other individuals. Then there will be a whole series of influence, prestige, or eigenvector-type notions, which try to capture the idea that you are important if your friends are important. Being well connected is something that depends on the connectedness of one's friends. This is the old "it's not what you know, but who you know": it's not necessarily important to have more friends, but to have well-positioned friends. We'll take a look at the definitions that capture that.
So we have four different concepts of centrality or power, and we'll try to incorporate these into different definitions and see the differences between them. One thing to emphasize: there are lots of different measures, and there's not one which is best in the sense that it dominates. These measures capture different ideas, different aspects of a position, and some of them are going to be more important for making predictions in one setting than in another. So which one we use in practice is going to depend very much on the context.
Okay. So let's have a look at closeness centrality. One basic definition starts from the length of the shortest path between two nodes i and j. We sum that across all j: how far am I from all the other nodes? Then we take n minus 1 over this sum, which keeps track of relative distances to other nodes. The idea is that if these distances are very large numbers, then my closeness centrality is going to be a very small number, because dividing by larger numbers makes it small.
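As a rough illustration of the closeness definition just described, here is a minimal Python sketch. The function names and the five-node line network are hypothetical, chosen only for this example; shortest-path lengths are found with a plain breadth-first search, and treating an unreachable node as making closeness zero is a convention assumed here.

```python
from collections import deque

def shortest_path_lengths(adj, source):
    """Breadth-first search: links on the shortest path from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness_centrality(adj, i):
    """(n - 1) divided by the sum of shortest-path lengths from i to every other node."""
    dist = shortest_path_lengths(adj, i)
    n = len(adj)
    if len(dist) < n:
        # Assumed convention: an unreachable node counts as infinitely far,
        # which drives closeness to zero.
        return 0.0
    total = sum(d for node, d in dist.items() if node != i)
    return (n - 1) / total if total else 0.0

# Hypothetical example network: a line a - b - c - d - e.
line = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
closeness = {node: closeness_centrality(line, node) for node in line}
for node, value in closeness.items():
    print(node, round(value, 3))
```

In this line network the middle node c has distances 1, 1, 2, 2 to the others, so its closeness is 4 over 6, while the end node a has distances 1, 2, 3, 4 and a closeness of 4 over 10.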
So the closer I am to other people, the higher this is. If I'm at distance 1 from everybody, then this normalizes to 1, and otherwise it becomes smaller and smaller the further away I am. It scales directly with distance: being twice as far from everybody makes me half as central. If I double all the distances, I get half; if I quadruple them, I get a quarter, and so forth.
Let's look at closeness centrality back in our Medici data again. We ignore the Pucci now, because if we included them and thought of everybody as being infinitely distant from them, then everybody would have a closeness centrality of zero. If we ignore them and just look at the remaining network, then the Medici are 14 out of 25, the Strozzi 14 out of 32, the Guadagni 14 out of 26, and so forth. So closeness centrality gives us some differentiation between the families. It doesn't separate them enormously, but it gives us some feeling for who's further and who's closer.
Another measure that we could use instead is what's known as decay centrality. This is designed to capture the idea that I might get value from being connected, directly or indirectly, to other nodes. I might get some value from a friend, a different value from a friend of a friend, and so forth. The idea is that there's some factor delta, which is generally less than 1 and bigger than 0. The centrality of node i under this decay notion is then obtained by looking at the distances to the other nodes, raising delta to those powers, and adding up. For a direct friend I get delta. For an indirect friend at distance 2, I get delta squared; at distance 3, delta cubed. So if delta were 0.5, we'd get 0.5, then 0.25, then 0.125, and so forth. If delta were 0.9, these numbers would be much closer to each other. If delta were 0.05, then delta squared would be 0.0025 and so forth, so it would fall off more dramatically.
As delta gets near 1, this just counts all the people that I can reach, directly or indirectly. As delta goes close to 0, this effectively becomes degree centrality: all it really does is emphasize the direct connections, and all the other ones are much smaller. Somewhere in between, it weights indirect connections relative to direct connections. So you can think of varying delta as a way of expressing how much I get from indirect connections of varying lengths.
Okay. So here's a network, a sort of bow-tie network. We've got node 4 in the middle, node 3 over here, node 1 over here. Basically there are three different types of nodes: nodes 2, 6, and 7 all look like node 1, and node 5 looks like node 3. So we're just going to keep track of these three nodes and their centralities. If we look at degree centrality, then node 3 is the most important in some sense: it has three connections as opposed to two for the others. With closeness centrality, node 4 is actually the closest; it wins out in terms of being able to reach all the other nodes along shorter paths. Decay centrality depends on delta. With 0.5, nodes 3 and 4 are basically equal to each other. If we raise it to 0.57, then node 4 ends up doing a little better. If we drop it to 0.25, so that more immediate connections matter, then node 3 starts to do better. So you can begin to see that these different definitions give different nodes different positions in terms of importance, depending on which centrality definition you're looking at.
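The bow-tie comparison can be sketched in Python along the following lines. The figure itself is not part of this transcript, so the edge list below is an assumption: two triangles, 1-2-3 and 5-6-7, joined through node 4, which matches the degrees described. The function names are hypothetical as well.

```python
from collections import deque

# Assumed layout of the bow tie: triangles 1-2-3 and 5-6-7 joined through node 4.
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (5, 6), (5, 7), (6, 7)]

def build_adjacency(edges):
    """Undirected adjacency sets from a list of links."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def distances_from(adj, source):
    """Breadth-first search distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def degree_centrality(adj, i):
    """Number of neighbors divided by n - 1."""
    return len(adj[i]) / (len(adj) - 1)

def closeness_centrality(adj, i):
    """(n - 1) over the sum of distances; assumes the network is connected."""
    dist = distances_from(adj, i)
    return (len(adj) - 1) / sum(d for j, d in dist.items() if j != i)

def decay_centrality(adj, i, delta):
    """Sum over the other reachable nodes of delta raised to the distance."""
    dist = distances_from(adj, i)
    return sum(delta ** d for j, d in dist.items() if j != i)

adj = build_adjacency(EDGES)
for node in (1, 3, 4):
    decay = [round(decay_centrality(adj, node, d), 3) for d in (0.25, 0.5, 0.57)]
    print(
        "node", node,
        "degree", len(adj[node]),
        "closeness", round(closeness_centrality(adj, node), 3),
        "decay at 0.25, 0.5, 0.57:", decay,
    )
```

Under this assumed layout, node 3 has the highest degree, node 4 the highest closeness, and the decay ranking of nodes 3 and 4 flips as delta moves from 0.25 to 0.57, with a tie at 0.5.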
You can normalize decay centrality by dividing through by the highest decay value a node could possibly have, which is what you'd get if every other node were at distance 1: n minus 1 times delta. Normalizing gives you different numbers than the ones we looked at before, but they're just the same values readjusted by a normalization.
Okay, so looking back at betweenness centrality, which we looked at before, here is the formal definition, due to Freeman. The idea is that when we look at two nodes i and j, we can keep track of the total number of shortest paths, the number of geodesics, between i and j. Then for any k that's not equal to i or j, we can ask: on how many of those shortest paths between i and j does k lie? So if we're looking for the betweenness centrality of a node k, we look at all pairs i and j that aren't equal to k, and for each pair we take the number of shortest paths between i and j that k lies on, relative to the number of shortest paths that exist between i and j. We add those ratios up and then normalize by the number of pairs of other nodes we could be considering, since the most k could do is lie on all of the shortest paths for all of those pairs. So we normalize by n minus 1 times n minus 2 over 2, which is n minus 1 choose 2, the number of pairs of other nodes out there.
Okay. So with that calculation, we're basically asking what fraction of shortest paths between other nodes k lies on. And when we looked at that earlier, what we saw was that the Medici have a much higher number than the other families in terms of their betweenness centrality.
Betweenness centrality captures the idea that, at this point in time, if other families wanted to deal with each other, they might have to go through somebody they were connected with. If it's difficult to enforce contracts, then maybe you have to go through somebody you know in order to deal with somebody you don't know. The Medici could then be powerful intermediaries, sitting on the paths between other pairs of families. So that's a different measure, and it captures different things.
If we go back and look at our bow tie again, what do we see now? The nodes out at the ends aren't connecting anybody as intermediaries, so they get a betweenness of zero. Node 4 now becomes more prominent. It's really an important connector between all of the nodes on one side and all of the nodes on the other side, so it ends up on 60% of the shortest paths in this network, whereas node 3, for instance, ends up on only a little over 50% of them. So depending on how you're counting, different centrality measures give you different notions and different measures of what's going on.
The next topic we'll be looking at is eigenvector-based centrality measures, other kinds of measures that capture the importance of nodes through how well they're connected to other nodes.
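For readers who want to check the bow-tie betweenness figures, here is a minimal Python sketch of the Freeman definition. It is not an optimized algorithm, the function names are hypothetical, and the edge list is an assumption (two triangles, 1-2-3 and 5-6-7, joined through node 4), since the figure is not part of this transcript. It also assumes an undirected network with at least three nodes.

```python
from collections import deque
from itertools import combinations

# Assumed layout of the bow tie: triangles 1-2-3 and 5-6-7 joined through node 4.
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (5, 6), (5, 7), (6, 7)]

def build_adjacency(edges):
    """Undirected adjacency sets from a list of links."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def bfs_counts(adj, source):
    """Return (dist, paths): geodesic length and number of geodesics from source."""
    dist = {source: 0}
    paths = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                paths[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                # Every geodesic reaching u extends to a geodesic reaching v.
                paths[v] += paths[u]
    return dist, paths

def betweenness_centrality(adj):
    """For each k: sum over pairs i, j (both different from k) of the share of
    i-j geodesics passing through k, divided by (n - 1)(n - 2) / 2."""
    info = {s: bfs_counts(adj, s) for s in adj}
    n = len(adj)
    pairs_of_others = (n - 1) * (n - 2) / 2
    result = {}
    for k in adj:
        dist_k, paths_k = info[k]
        total = 0.0
        for i, j in combinations([v for v in adj if v != k], 2):
            dist_i, paths_i = info[i]
            if j not in dist_i or k not in dist_i:
                continue  # i and j are not connected, or k is in another component
            if dist_i[k] + dist_k[j] == dist_i[j]:
                # k lies on a geodesic: count geodesics i to k times k to j.
                total += paths_i[k] * paths_k[j] / paths_i[j]
        result[k] = total / pairs_of_others
    return result

betweenness = betweenness_centrality(build_adjacency(EDGES))
for node in sorted(betweenness):
    print(node, round(betweenness[node], 3))
```

Under the assumed layout, node 4 lies on the shortest paths for 9 of the 15 pairs of other nodes, node 3 for 8 of 15, and the four end nodes for none.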