Particular. And so I decided not to do the AS-level things. You can ask me afterwards if you are interested, because to some degree it's a very fascinating topic, with very recent results that indicate how little we know about that world of the AS. What I try to focus on, and I will be a little bit brief in the first part, is the measurement side, because I think you haven't seen too much on the measurement side. So I'll go through the first piece quickly and then hopefully get to the second piece. The first piece is basically a repeat of what you already discussed before, and again, if you have questions as I go along, please feel free to interrupt. This is [INAUDIBLE]. Let me first start, so you all know where the story of Achilles comes from. When Achilles was a baby, it was foretold that he would die in battle when he grew up. His mother got all scared, and the one solution she found was to dip his body into the water of the River Styx. Of course she had to hold on to him somewhere, and she held him by the heel. And sure enough, over the course of time he became a very famous warrior, but in the end he was killed in battle by an arrow, a poisoned arrow, that hit precisely his heel. So that's the whole story about [laugh] Achilles' heel. And if you Google Achilles' heel, you'll find Achilles' heels for everything. In particular, it has become known as a deadly weakness in spite of overall strength. So that's the history where this comes from, and then of course we are here at this point where, in 2000, this story became very popular: the Internet has an Achilles' heel. And again, what is the Achilles' heel of the Internet? Very briefly, this is a cartoon of the Internet in terms of these nodes, the routers.
The red ones are the highly connected ones. The green ones are less connected, and the blue and the grey ones are the least connected. You see that the red ones and the blue ones are sort of in the middle of the network. And so if you attack one of these red ones, on [INAUDIBLE], what happens is the worst thing that can happen to the Internet: it partitions into segments. If you think that the most valuable aspect of the Internet is that it connects billions of people nowadays, then disconnecting it, partitioning it into separate pieces that cannot talk to one another, is the worst thing that you can do to the Internet. And so this story became very popular in 2000 and from there on, and if you look into various books, this is sold as a fact about the Internet. The longer story is a little more complicated, because it starts out so appealing. There have been measurements out there. People have analyzed these measurements and found that they indicate that the Internet has something we talked about before, a power law. To me, the simple translation of what a power law is: most nodes have degree one or two, and on the other hand you have a few nodes that have degree 1,000 and maybe even 10,000. It's an extremely skewed distribution. Compare that to a normal distribution, where almost all nodes would have a degree close to the mean. In a normal distribution you wouldn't find anything that's more than three or four standard deviations from the mean. With a small mean, you would never see degrees of a thousand, or even a hundred. But these power laws have these features. And none of the existing models of graphs could capture this power-law behavior, so there was a need for new mathematical models, which are the scale-free models.
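To make the contrast between a power law and a normal distribution concrete, here is a minimal sketch that compares the probability of seeing a degree of at least k under each. The exponent, the mean, and the standard deviation are illustrative assumptions, not values from the talk or from any measurement.

```python
import math

def power_law_tail(k, k_min=1.0, exponent=1.0):
    """P(degree >= k) under a Pareto-type power law (assumed parameters)."""
    if k <= k_min:
        return 1.0
    return (k / k_min) ** (-exponent)

def normal_tail(k, mean, std):
    """P(X >= k) for a normal distribution with the given mean and std."""
    return 0.5 * math.erfc((k - mean) / (std * math.sqrt(2.0)))

if __name__ == "__main__":
    # Hypothetical numbers: mean degree 2, standard deviation 2.
    for k in (10, 100, 1000):
        print(k, power_law_tail(k), normal_tail(k, mean=2.0, std=2.0))
```

Under these assumed parameters the power-law tail shrinks only polynomially, while the normal tail is already numerically indistinguishable from zero at a degree of 100, which is the sense in which degrees of a thousand "never" show up under a normal distribution.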
They were analyzed and found to have very strong properties. In particular, if you want to attack such a scale-free graph, the best thing to do is to focus on these high-degree nodes that are in the middle of the network. You attack them, and the network partitions. And if you are interested in other aspects, say you have a virus that propagates on such a graph, these graph types are the worst possible scenario, because as soon as the virus hits this graph and then hits one of the highly connected nodes, the network is dead. So when you go back to the Internet, you apply this finding: the empirical finding, the mathematical modeling, and its properties. You apply it to the Internet, and the conclusion was that the Internet has this bad property. This created considerable excitement in various areas. But the networking people looked at it and said [inaudible] this is sort of nonsense. Where are the issues, and where are the problems? So I just want to go very quickly through this aspect, which basically is: how would you go about building the physical infrastructure of the network of an Internet service provider, what would you have to consider in order to build it, and what would come out of it? And then I want to switch over to the measurement aspect and indicate why the measurements that people have been looking at are actually completely inappropriate for this purpose. So how would we build a physical network of an Internet service provider? You are an Internet service provider, let's say, that serves the United States, or a region in it. What's the point of building it? The first question you have to ask yourself is: what's the reason to build this network or to operate it? What should be its functionality? Does anybody have a quick answer to this?
What should be the purpose of a network? You have these end users there. All that you want to do is to connect the end users to the Internet. Okay? So what kind of end users are out there? Suppose you plot the speed that an end user uses against the number of end users that use that particular speed. What we typically have is a few end users, the big enterprises, the huge customers, that need very high speed, up to maybe 40 gigabits per second of connectivity to the Internet. Then you have medium-size customers that you want to serve in your area. Maybe 100 megabits per second is a little bit low, but that's maybe one limit. And then you have all of these other end users. You yourself at home, you want to be connected, and this ISP wants to serve you. We don't do this stuff here anymore, the dial-ups, but you still have customers that want dial-up. But you get into [inaudible] broadband, cable, DSL speeds. So you have all these kinds of customers out there that have this variety of different bandwidth demands, and you want to serve them. As I said, a few of them will require the high bandwidth, and many of them will require the low bandwidth. You want to serve this spectrum of end users. And now the question is really an economic question: where do you get most of the money? It's a trade-off. The high-end customers will pay a lot to get connected because they need a big pipe. The middle ones are the medium size, and then you have the little guys, ourselves. They will pay, you know, 30 bucks, but you have many of them. So as an ISP you have to serve this clientele, and you know what you get from them in terms of money, and you know how many of them are in the region that you want to serve.
So the question is then really how you want to build a network that does this job well, because you want to make a profit, ideally. That's one aspect, where an ISP says: this is what I want to do, and this is what I'm faced with in terms of who I want to serve. Then the next thing is, I need to build this network using physical hardware. These are the routers. So has anybody seen a router? I thought I'd bring an interface card with me, but it's too heavy, so I didn't bring it. So I have some pictures. Most of you have seen a router, and you've seen this one, which is a wireless router. This is not what we are talking about. What we are talking about is these kinds of things. These are maybe this high, all the way down from this height to, you know, home size. And if you look carefully, these guys have lots of interface cards here. This is a typical piece for dial-up customers, and there are probably a thousand dial-up customers that you can plug in here. Then, as you go to the bigger pieces, it's not for dial-up customers but for high-speed customers. This has been the piece that has been sitting there for the last couple of years. It's quite expensive, almost a million dollars if you buy it fully loaded. And this is the latest piece that Cisco offers. If you buy it fully loaded, it's probably between five and ten million dollars. But the total capacity here is about 300 terabits per second, so this is a huge capacity. And this is like two terabits, and then here, gigabits per second. So this is what you're faced with in terms of hardware if you go out and build a network. You go to Cisco, Juniper, whatever, and they will give you all this equipment, and what you see is that they have different equipment for different purposes.
Again, as I said, these guys you wouldn't put into a network where you want to connect high-speed links. This stuff you put at the edge of a network. It's these expensive pieces that you want to put in the middle of the network, because they have the highest capacity. But they are also the most expensive ones. So if you buy the most expensive pieces, you also want to use them in the most efficient way. You don't want to use them in an inefficient way. In terms of how much a router costs, it depends on the size and its functionality. And then the question is: how does an ISP that wants to serve these end users make use of this technology to achieve its purpose? Here the choice that the ISP has to make is relatively simple. But one thing to keep in mind is that these things change over the years. In 2000, the most expensive router you could buy had a couple of hundred megabits per second in terms of total capacity, and you could connect maybe ten high-speed connections there. In 2005 this moved up, in terms of total capacity and also in terms of how many interface cards you could populate. And today we are up there. But the picture is always the same. It just moves as technology advances. And as I said, today the most expensive piece of equipment that Cisco or the other router vendors sell you gives you capacities of a couple of hundred terabits per second. But still, even with that equipment you can maybe connect a couple of tens of very high-speed links. Or you can go to higher degrees, but then you take a hit in performance: you can only connect lower-speed links with the expensive equipment. But the ISP certainly wouldn't do that, because then you use this very expensive equipment in a not very efficient way. Okay.
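The trade-off just described, a fixed total capacity that has to be shared among however many links you plug in, can be sketched in a few lines. The function names and all the numbers below are hypothetical and only illustrate the shape of the constraint, not any vendor's actual specification.

```python
def per_port_bandwidth(total_capacity_gbps: float, degree: int) -> float:
    """Bandwidth available per link if capacity is shared evenly over `degree` links."""
    if degree < 1:
        raise ValueError("degree must be at least 1")
    return total_capacity_gbps / degree

def is_feasible(total_capacity_gbps: float, max_ports: int,
                degree: int, per_port_gbps: float) -> bool:
    """A configuration is feasible if it fits both the port count and the capacity."""
    return degree <= max_ports and degree * per_port_gbps <= total_capacity_gbps

if __name__ == "__main__":
    # Hypothetical core router: 1000 Gbps total, at most 64 ports.
    for degree in (8, 16, 64):
        print(degree, per_port_bandwidth(1000.0, degree))
```

With these assumed numbers, a few links can each run very fast, while filling every port forces each link down to a much lower speed, which is why high degree and high per-link bandwidth do not coexist on the same box.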
So now look at what the whole service offering of a vendor like Cisco does. You have a particular technology up there on top that's very expensive. That's the core technology. Then you always have old pieces of equipment sitting in the network. And then, on the right-hand side, you have this cheap equipment that lets you connect all these little customers that don't require much bandwidth. But that's the only technology where you can get degrees up to a couple of thousand. And the technology just dictates that it has to sit at the edge of the network, where you bring all these regular customers together. Oh, that's [inaudible]. Okay. So then, how does an ISP deploy these routers? The idea is to put the cheap equipment at the edge, bring the traffic from the edge to the core, and then use the expensive equipment in the core, because there you have all this traffic to deal with. If you have that picture, then you very quickly end up with a design principle for the physical network of an ISP that, viewed very simplistically, is a constrained optimization problem. For any one router, you have to keep it within this feasible region of the technology: you cannot have more capacity or a higher degree. But you don't know exactly the demand that these various users provide, so you have to put in some model. And if you have particular knowledge about the demands that your customers, or your likely customers, will put on the network, you will put that in your model. Your goal is then really to build this network, using this technology and this demand, in such a way that you can push as much traffic through that network as possible, because that really relates to wanting to make a profit when you build this network. So now you have this formulation of a network design.
You view this problem really as a technological, social (in terms of the traffic), and economic problem. And it's not very complicated. In this whole process, the notion of a power law doesn't even come up, because technology and economics are the only drivers for building this network. If you have this constrained optimization problem, you don't even have to go for the optimal solution; you just go for a heuristic solution. And the heuristic very simply tells you: put the expensive equipment in the core. That gives you immediately a mesh of low-degree but very high-capacity nodes. And then put the cheap equipment, which lets you connect lots and lots of low-bandwidth customers, at the edge. And that's exactly how people build a network. So let's look at one example of a network that everybody here could have access to if they want to. Here's an outcome of such a design. Here is Internet2. This is an education and research network in the US. You can go to the website and they tell you everything about that network. It has about eleven or twelve backbone nodes, and that's being fed by all the customers of the network. You look at this network, and it is essentially not very different from what we just designed. And again, power laws don't even come up in this context. Okay, so this is how you would design a network. You have technology. You have economic constraints. These are the two pieces that dictate everything. In particular, what you don't do is this: if you have two routers, you do not toss a coin to decide whether you connect the two routers. This is driven by economics, not by randomness. So let me switch now. Now you have a network. Now you are asked, as a researcher or as a student or whatever: go and map this network.
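The heuristic described above (expensive, low-degree routers meshed in the core, cheap high-degree routers at the edge) can be sketched as a toy topology builder. The ring-shaped core, the node names, and the default sizes are assumptions made for illustration; they are not the Internet2 design or any real ISP's.

```python
from collections import defaultdict

def build_heuristic_isp(num_core=4, edges_per_core=2, users_per_edge=50):
    """Build a toy ISP graph: sparse core ring, edge routers fanning out to users."""
    adj = defaultdict(set)

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)

    core = [f"core{i}" for i in range(num_core)]
    # Core: a sparse mesh (here simply a ring) of high-capacity routers.
    for i in range(num_core):
        link(core[i], core[(i + 1) % num_core])
    # Edge: each core router feeds a few edge routers, which aggregate many users.
    for i, c in enumerate(core):
        for j in range(edges_per_core):
            edge = f"edge{i}_{j}"
            link(c, edge)
            for u in range(users_per_edge):
                link(edge, f"user{i}_{j}_{u}")
    return adj

if __name__ == "__main__":
    adj = build_heuristic_isp()
    print("core degree:", len(adj["core0"]))
    print("edge degree:", len(adj["edge0_0"]))
```

In this sketch the highest-degree nodes are the edge routers, not the core routers, and no coin is tossed anywhere: the structure follows entirely from where traffic has to be aggregated.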
Say it's AT&T's network. You don't know the physical infrastructure. How do you go about figuring out what the physical infrastructure of AT&T's network is? So if you go back to this one, this network is out there. I told you this is sort of unusual, because you can go to the website of Internet2 and they tell you the connectivity, they tell you this physical infrastructure. In general, networks or ISPs don't do that. So suppose now they don't tell you, and you are given the task of mapping that network as I showed it to you, in terms of: give me these orange nodes and links that make up this network. How would you go about doing that? Has anybody heard about traceroute? Is that a tool that's familiar or not? Has anybody used traceroute? Nobody has used traceroute. Okay. So traceroute is a tool that you can use on your laptop. If you're connected to the Princeton network, you can execute something that says traceroute to a particular destination, and what traceroute gives you is the intermediate routers. So this is what traceroute is, and note this one: the reported IP addresses are not the routers' IP addresses, but those of the routers' interfaces. I showed you these routers, and they have lots of interface cards. Each one of these interface cards has an IP address. This is what traceroute gives you. What I did here is a traceroute from my machine in Florham Park to Princeton. This is a typical traceroute probe. If you ran it from Princeton to wherever, you would see something similar here, and then whatever is at the edge. But this one is basically telling you that the traceroute probe I ran from Florham Park to Princeton went through AT&T's network and hit a bunch of different routers along the way. Here it ends up in New York, and then goes to Washington, D.C., and then comes back to Virginia somewhere.
And then it hits this network. This is the provider of Princeton. Okay, we just call it [inaudible], which has been bought recently by some other company, but I don't remember what the name is. This traceroute probe goes through AT&T, then hits this network that serves Princeton, and then ends up in Princeton. So this is a traceroute probe that gives you the IP addresses of each one of the interface cards of the routers through which this probe went, from Florham Park to Princeton. Now imagine the experiment where you have, like, 30 people here, suppose you live in all different places around the US, and you do traceroute from one to another, and you try to map a particular network, like AT&T's network, or Internet2. So this will be something like this. The idea is basically that if you have enough machines at the edge, and if you do enough traceroutes, you would be able to recover the infrastructure of this network, because ultimately a probe would hit every router one way or the other. That's the idea. And people have been doing these measurements for many, many years, more than twenty years. There are hundreds of millions of such traceroute probes that people use, and this finding of the Internet having power laws is based on these traceroutes. Now the big question: can we trust traceroute? What happens if you do traceroute on this simple network? This is Internet2, simplified in that it's just the backbone, okay? It's about eleven or twelve nodes and a very small number of links. Now you do traceroute on this simple network, and what you get is this picture. Okay, so this doesn't look anything like this one. And there is one simple reason, and the simple reason is this one. As I told you, a router can have multiple interface cards. Okay?
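The mapping experiment described above can be sketched on a toy topology. This is a simplified simulation, not the real traceroute tool: it assumes each hop answers with the address of the interface on which the probe arrived, it uses made-up labels such as `B.if-A` in place of real IP addresses, and the four-router topology is hypothetical.

```python
from collections import deque

# Hypothetical physical topology: router B in the middle, A, C, D around it.
LINKS = [("A", "B"), ("B", "C"), ("B", "D")]

def build(links):
    """Return adjacency lists and one interface label per (router, neighbor) pair."""
    adj, iface = {}, {}
    for a, b in links:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
        iface[(a, b)] = f"{a}.if-{b}"  # A's interface facing B
        iface[(b, a)] = f"{b}.if-{a}"  # B's interface facing A
    return adj, iface

def shortest_path(adj, src, dst):
    """Breadth-first search; assumes dst is reachable from src."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in adj[node]:
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]

def traceroute(adj, iface, src, dst):
    """Each hop after the source reports the interface the probe arrived on."""
    path = shortest_path(adj, src, dst)
    return [iface[(hop, before)] for before, hop in zip(path, path[1:])]

if __name__ == "__main__":
    adj, iface = build(LINKS)
    seen = set()
    for src in ("A", "C", "D"):
        for dst in ("A", "C", "D"):
            if src != dst:
                seen.update(traceroute(adj, iface, src, dst))
    print(len(adj), "routers,", len(seen), "distinct addresses observed")
```

Even in this tiny example, router B shows up under three different addresses, so a map built from addresses alone has more nodes than the physical network has routers.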
If you don't know that these two interface cards, one and two, are sitting on the same router, you cannot take this mapping and map it into this. All you get is this mess here. And for the last two decades, the network researchers have found no generic way to do that. So this mess here is basically a result of the fact that these high-speed backbone routers [inaudible] for Internet2 have a couple of tens of interface cards, and traceroute thinks that each one of these interface cards is a separate router, which is not the case. And there is no generic method to map the interface card to the right router. I showed you the picture of the router. It has maybe twenty interface cards. We have no way to map the IP addresses of these interface cards to that one router. That's the problem. So if you, for example, compare the node degree distribution of this network and this network, you shouldn't be surprised that you get something very different. The real network is this purple one, and the one that you infer from all the traceroutes is the yellow one. So it's two very different things. Now here's another little issue that also causes significant problems. Here's a network called Level 3, one of the biggest IP transit providers in the world. I just show you the US portion of the network, and just concentrate on the green and red nodes. Okay. So people did traceroute trying to map this network, and they get this mess. Again, this network with the red nodes and the green and red links is nothing like this one. I mean, it's just day and night. So what's happening here? The problem here is this. This is the physical infrastructure of this Level 3 network, with all its routers.
This network has decided to use a particular traffic management technique called MPLS, which is Multiprotocol Label Switching, and that basically says that if a traceroute packet, for example, hits this router, you will only see that packet again when it comes out of the network, when it leaves the network. And you will see nothing in between. You won't get any information about the inside of this network, okay? So if you go back here, what you see is that any traceroute packet [inaudible] that came to Level 3 in Florida ends up, in one hop, at all the other ones. And you don't see anything in the middle. Okay. So it's like a full-mesh network, which is only an appearance that results from the presence of a particular protocol that runs on that network. This traceroute picture says nothing about the physical infrastructure of the network itself. So we have three pieces that, if you put them together, tell you that these measurements are completely inappropriate for mapping a network. The first one is that it's a technique that hasn't been designed to map a network. It has been designed for other purposes. Then there are the node degrees themselves. The values that we get you cannot trust, because of this problem that we cannot map interface IP addresses to routers. The high-degree nodes that this technique gives you are fake, because they are a result of other protocols that run on top of it. And then the last piece is: if there are actual high-degree nodes in this network, in the first part of my talk I told you they have to be at the edge of the network. Okay. But if you do traceroute experiments, you will never be able to get to those end nodes, because you won't have enough machines in that area. I mean, you would have to be unbelievably lucky to hit a high-degree node at the edge of the network.
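The full-mesh illusion described above for the MPLS case can be reproduced with a small sketch. It models the behavior exactly as stated in the talk (routers inside the MPLS core never appear in a trace); the six-router topology and the names `E1` to `E4`, `P1`, `P2` are hypothetical, and real deployments may expose more or less than this simplified model.

```python
from itertools import combinations

# Hypothetical physical topology: edge routers E1..E4 hang off core routers P1, P2.
ATTACH = {"E1": "P1", "E2": "P1", "E3": "P2", "E4": "P2"}
MPLS_CORE = {"P1", "P2"}

def physical_path(src, dst):
    """Router-level path between two edge routers in the toy topology."""
    path = [src, ATTACH[src]]
    if ATTACH[dst] != ATTACH[src]:
        path.append(ATTACH[dst])
    path.append(dst)
    return path

def visible_hops(path, hidden):
    """Hops a traceroute would report if routers in `hidden` never show up."""
    return [router for router in path if router not in hidden]

def inferred_links(paths, hidden):
    """Links one would draw by connecting consecutive visible hops."""
    links = set()
    for path in paths:
        hops = visible_hops(path, hidden)
        links.update(frozenset(pair) for pair in zip(hops, hops[1:]))
    return links

if __name__ == "__main__":
    paths = [physical_path(a, b) for a, b in combinations(sorted(ATTACH), 2)]
    links = inferred_links(paths, MPLS_CORE)
    print(len(links), "inferred links among", len(ATTACH), "edge routers")
```

Physically, every edge router in this sketch has exactly one link, yet the inferred map connects every edge router directly to every other one, so the apparent high degrees come from the hidden core and not from the hardware.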
You would need hundreds of machines in that particular part of the network from which to run traceroute, and none of the experiments that people did have that. So if there are high-degree nodes, this technique doesn't show them. The high-degree nodes that the technique gives you are fake. So this is the problem with the measurements. And then, if you go back and close the loop here: what if we could trust this data? Well, then we go through this spiel that says, okay, you have this power law, so you produce a model. The model has certain features, and if you compare them, you see that for one and the same power law (let's say this is a power law) you get two very different models, two very different networks. The left-hand side is the one that we designed. The right-hand side is where you throw coins, in the preferential attachment way. And if you compare these two models with respect to any kind of property that you are interested in, you find that these networks are completely different. They are drastically different. Okay? So let me finish up with one question here. In terms of performance, [inaudible] already showed you: the network that we designed has high performance, because we designed it that way. Okay. These random networks of course have low performance, and the bottlenecks are obvious when you look at them. Now, in terms of this fragility: if you remove one of these routers in the core of the network, not much happens, because there is redundancy built in. If you remove one of these high-degree nodes at the edge, the only thing that happens is that the users that connect to that edge router are affected. Everything else is fine. So that's no problem. So this is not the Achilles' heel of the Internet.
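The fragility comparison above can be illustrated with two toy graphs. This is a deliberately simple stand-in: the hub-centered graph below is not a preferential-attachment model, just a graph whose highest-degree node sits in the middle, and all sizes and names are assumptions chosen for illustration.

```python
def add_link(adj, a, b):
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

def designed_network(num_core=4, users_per_edge=10):
    """Core ring of low-degree routers; high-degree routers only at the edge."""
    adj = {}
    for i in range(num_core):
        add_link(adj, f"core{i}", f"core{(i + 1) % num_core}")
        add_link(adj, f"core{i}", f"edge{i}")
        for u in range(users_per_edge):
            add_link(adj, f"edge{i}", f"user{i}_{u}")
    return adj

def hub_network(num_routers=12, users_per_router=3):
    """One high-degree hub in the middle; everything else hangs off it."""
    adj = {}
    for i in range(num_routers):
        add_link(adj, "hub", f"router{i}")
        for u in range(users_per_router):
            add_link(adj, f"router{i}", f"user{i}_{u}")
    return adj

def largest_component_fraction(adj, removed):
    """Share of surviving nodes in the largest connected piece after a removal."""
    nodes = [n for n in adj if n != removed]
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nxt in adj[node]:
                if nxt != removed and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best / len(nodes)

def attack_highest_degree(adj):
    target = max(adj, key=lambda n: len(adj[n]))
    return target, largest_component_fraction(adj, target)

if __name__ == "__main__":
    print("designed:", attack_highest_degree(designed_network()))
    print("hub-centered:", attack_highest_degree(hub_network()))
```

In the designed graph the attack only cuts off the users behind one edge router, while in the hub-centered graph the same attack shatters the network into small pieces; which outcome you get depends on where the high-degree node sits, not on the degree distribution alone.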
Removing a high-degree node is absolutely no problem. So what is the real Achilles' heel of the Internet, if it has one? The real Achilles' heel is really that you hijack the network. The way you would think about it is in terms of biology, the immune system. It's exactly the same thing. What's the amazing thing that cancer is doing? It uses the body's immune system for its own purpose, or misuses it, or abuses it. Hijacking is exactly the same thing. If you hijack the network, you make use of the network in terms of the connectivity that it provides as a service, but you make use of it for your own malicious purpose. So you tweak a particular protocol, a routing protocol for example, in such a way that traffic cannot go from one place to another. You tweak the protocol, but you make use of the very infrastructure that you actually attack. And that's the biggest Achilles' heel of the Internet. So it has nothing to do with physically attacking a piece of equipment in a network. That's nonsense. The big problem is really this protocol dependence. Okay, and finally, the last three points. Even if you could trust the data, there is this aspect that for one and the same node degree distribution you can have many different networks with drastically different properties. Some are clearly wrong. Some are more or less correct. And then the final piece is really the idea I showed you here: when it comes to technological networks, network modeling should not be data fitting, where here's a node degree distribution and then you try to find a model that fits that node degree distribution. It should be an exercise in what we call reverse engineering. Here's a network out there that does something, and you're asking the question: what is this network, what is its functionality? That's the first question. And from there on you end up with the first thing that I showed you.
If the functionality is to serve a certain user base, then it's an economic problem and a technology problem, and that dictates how you design a network. All right, so I don't talk about the AS level, apologies [laugh], because that's no longer part of the talk. As I said, that's a very different aspect of Internet connectivity, but it has the same two key pieces. On the one side, the measurements that have been used are completely unreliable and inappropriate. On the other side, if you ask about the functionality of this AS-level network, it's an economic issue, and you have to think of it in terms of economics, and not in terms of power laws or anything else.