0:16

And what we'll do is we'll start with very simple versions of this where we'll work

with Erdos-Renyi style networks.

And then we can talk about other kinds of networks as well, and what we can

say about how things like the degree distribution affect diffusion processes.

Okay, so we're bringing in the interaction structure now.

And the basic idea here is going to be that there's

some process going on in the network.

And we'll be thinking, one thing that's going to be important is thinking

about what's the right network to define.

And often in a lot of what we've been looking at,

we've just been sort of taking networks and links as a given and

not exploring too much which particular network we want to be looking at, and

exactly what should be defining the nodes and the links.

So in this kind of situation, ultimately what's really going to be important is

when we think about a relationship or a link between two individuals,

we should think of two people being related if one has a chance of

passing something to the other in whatever our diffusion process is.

So if we're thinking about the flu, then we would think about, okay,

I'd be connected to all the individuals who I might infect, and so

that might be a very broad range of individuals.

1:39

Whereas, if we're thinking about a political view or

some new technology that I might tell somebody about, it might be

a much narrower set of individuals with whom we might have that kind of interaction.

So, nodes are going to be linked if one would infect the other.

And one substantial simplification we're going to make to begin with is that

this is going to be sort of an independent and identical probability across links.

So, each person has an equal chance of infecting any one of their neighbors.

Whereas, that might not be true in reality where you might spend more time

interacting with some individuals than others, and have more of a chance

of a diffusion process proceeding across some links than others.

So, we'll define the links by

the interactions that are necessary for diffusion.

2:35

So, when do we get diffusion?

What's the extent of the diffusion?

How does it depend on the process and the network structure?

Who's likely to be infected earliest?

These are the kinds of questions that we can begin to answer now

with the network analysis.

So, an important part of this is going to be understanding

what the component structure is.

So the reach of the contagion is going to be determined by the component structure,

where we think of links

as being put down probabilistically according to whether or

not two individuals would actually transmit from one to the other.
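
As a rough illustration of links being put down probabilistically, here is a minimal Python sketch. It assumes a hypothetical list of contact links and a single transmission probability `q` that is the same for every link (the independent and identical simplification mentioned earlier). The function names and the six-person example are illustrative assumptions, not part of the lecture.

```python
import random
from collections import deque

def sample_transmission_network(contacts, q, rng):
    """Keep each contact link independently with probability q."""
    return [(u, v) for (u, v) in contacts if rng.random() < q]

def component_sizes(n, links):
    """Return the sizes of the connected components, largest first."""
    neighbors = [[] for _ in range(n)]
    for u, v in links:
        neighbors[u].append(v)
        neighbors[v].append(u)
    seen = [False] * n
    sizes = []
    for start in range(n):
        if seen[start]:
            continue
        # Breadth-first search collects everyone reachable from `start`.
        seen[start] = True
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in neighbors[node]:
                if not seen[nb]:
                    seen[nb] = True
                    queue.append(nb)
        sizes.append(size)
    return sorted(sizes, reverse=True)

# Hypothetical example: six people who all meet each other.
rng = random.Random(0)
n = 6
contacts = [(i, j) for i in range(n) for j in range(i + 1, n)]
for q in (1.0, 0.2):
    links = sample_transmission_network(contacts, q, rng)
    print(q, component_sizes(n, links))
```

With `q = 1.0` every contact transmits and the six people form one component. With a small `q` the same contact network can break into several pieces, which is the sense in which the transmission network is more fragmented than the contact network.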

3:27

Okay, so just to sort of remind you, this is a picture from Bearman,

Moody, and Stovel's 2004 data on high school romances.

This would be something, if we were thinking about transmission of

mononucleosis or something, we could think about a network like this.

And what we end up with is the component structure will actually tell us a lot.

So an initially infected individual ends up being one of these nodes.

If they end up being in the large component,

then things can spread quite extensively.

If they end up being in the small component,

then things can be quite limited.

And so, looking at the component structure will help us answer two questions.

First of all, what's the probability that we start a contagion?

And that's going to be the probability that we end up hitting

the large component,

in this case the giant component. And second, how extensive should it be?

And in this case, if we did hit somebody in the giant component,

then the reach of it could potentially be the size of the giant component.

So understanding what the component structure is will help us understand

both the probability of starting and the eventual reach conditional on that.
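
The two questions can be read directly off a list of component sizes. The sketch below assumes that the first infected person is chosen uniformly at random and that, because links already stand for actual transmission, everyone in that person's component is eventually reached. The function name and the example sizes are hypothetical.

```python
def contagion_summary(sizes):
    """Summarize contagion prospects from a list of component sizes.

    Returns (probability the seed lands in the largest component,
             reach if it does,
             expected number reached for a uniformly random seed).
    """
    n = sum(sizes)
    giant = max(sizes)
    p_start = giant / n          # chance the seed is in the largest component
    reach = giant                # everyone in that component is reached
    # A seed lands in a component of size s with probability s / n
    # and then reaches s people, so the expectation is sum(s * s) / n.
    expected_reach = sum(s * s for s in sizes) / n
    return p_start, reach, expected_reach

# Hypothetical population of 10 split into components of sizes 6, 2, 1, 1.
print(contagion_summary([6, 2, 1, 1]))
```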

So, we'll think about getting nontrivial diffusion if somebody in

the giant component is infected or adopts. So I'll use the word infected, but

it could be adopting a new technology, and so forth.

And the size of the giant component determines both the likelihood and the extent.

And random network models are going to allow for giant component calculations.

And now, in terms of what we want to be thinking about in links:

we could say, okay, well, a lot of networks we actually look at in the real

world might be very well-connected and have links so that everyone can reach

everyone else in the world, and so the world is one giant component.

But the component structure we actually want to be thinking about is going to

have link probabilities that are associated with the likelihood that

one individual actually infects another.

So it might be that somebody just doesn't catch the flu because they

don't have interactions with people at the right times, and so forth.

So, the network we're going to be looking at, again, is: are people going to

interact within a given time period when they're infectious enough to transmit?

And that can actually have a much more fragmented network structure

than an overall long-term network that we would look at normally.

5:55

Okay, so what we can do is a simple example of such a calculation.

And we'll start by working with an Erdos-Renyi style random network, and

then we'll also talk about other degree distributions.

And the main question we're going to be answering is just,

what can we say about how big the giant component is in such a network?

Okay, so how big is a giant component if there is one?

And let's think about G(n, p) as our starting model.

And if we think about that as the starting model,

then the size of the giant component's going to be interesting

when p is in the range where it isn't so

small that the giant component doesn't exist or so large that we have almost full connection.

So the interesting region is going to be when p is somewhere between 1/n and

log(n)/n.

And otherwise, we're going to have basically isolated small components or

a fully-connected network.
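
To put numbers on that range, here is a small Python sketch that computes the two thresholds for a given n and classifies a value of p. These are large-n thresholds, so for a network as small as 50 nodes they are only rough guides. The function name and the regime labels are illustrative assumptions.

```python
import math

def regime(n, p):
    """Classify p for G(n, p) against the two thresholds 1/n and log(n)/n."""
    giant_threshold = 1 / n               # expected degree of about 1
    connect_threshold = math.log(n) / n   # expected degree of about log(n)
    if p < giant_threshold:
        return "small components"
    if p < connect_threshold:
        return "giant component, not fully connected"
    return "likely connected"

n = 50
print("1/n      =", 1 / n)
print("log(n)/n =", round(math.log(n) / n, 4))
for p in (0.01, 0.03, 0.05, 0.10):
    print(p, regime(n, p))
```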

6:58

And if we remember from before, when p was smaller than 1,

sorry, expected degree is smaller than 1.

So that's what's in this range here: with 50 nodes and p being .01,

each person has an expected degree of half a neighbor.

Then we end up with not many people infecting each other, and so

we end up with lots of very tiny components.

Actually, only one that even has two links in it and lots of isolates.

So in this kind of situation, if the interaction structure were so

limited, we wouldn't see much of a contagion at all.

So, .02 is the point at which you have one expected neighbor. Above that,

you begin to get a giant component.

And here we would end up having about half the nodes,

a little more than half in the giant component.

And so there would be about a one-half probability of starting a contagion, and

it would infect about half of the population.

Once you get to about two and a half as an expected set of neighbors,

then we get almost the whole network, the whole population connected,

so we have a high probability of reaching everybody, and

a high number of people infected once one person is.

And then once we get here with 50 nodes, once we have an expected degree of 5,

we would pass the threshold for connection and we end up having a connected network.

And now, if the interaction structure is this dense,

then you would expect a full contagion.
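
The progression in these pictures can be reproduced roughly by simulation. The sketch below draws many G(n, p) networks with 50 nodes and averages the share of nodes in the largest component. The value p = .03 for the middle picture, the number of trials, and the function names are assumptions made for illustration. Results vary with the random seed, so no specific output is claimed.

```python
import random

def largest_component_fraction(n, p, rng):
    """Draw one G(n, p) network and return the largest component's share of nodes."""
    parent = list(range(n))   # union-find forest over the nodes
    size = [1] * n            # component size, valid at each root

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:           # link i-j forms with probability p
                ri, rj = find(i), find(j)
                if ri != rj:
                    if size[ri] < size[rj]:
                        ri, rj = rj, ri
                    parent[rj] = ri        # merge the smaller tree into the larger
                    size[ri] += size[rj]
    return max(size[find(i)] for i in range(n)) / n

def average_fraction(n, p, trials, seed=0):
    """Average the largest-component share over several independent draws."""
    rng = random.Random(seed)
    total = sum(largest_component_fraction(n, p, rng) for _ in range(trials))
    return total / trials

for p in (0.01, 0.03, 0.05, 0.10):
    print(p, round(average_fraction(50, p, trials=200), 2))
```

The average share should rise from a small fraction at p = .01 toward nearly the whole population at p = .10, in line with the pictures.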

Okay, so what we're going to do is actually go through some calculations next

to give us more explicit numbers on some of these things, and

calculate the size, the expected size of the giant component

rather than just looking at these pictures.

So the pictures are based on these thresholds.

And indeed, when we go from below the threshold of an expected degree of one to a high

enough expected degree to have everybody connected,

we're going to hit the two extremes.

And the interesting part's going to be in-between, so

let's take a look at that next.