And if the GPS distance between the homes
is less than the median distance between homes, okay?
So going to a village, we look at two
households, and we ask: okay, are they linked or not?
And are they of the same caste?
Is the GPS distance greater or less than the median distance?
So when
we're looking at two households, we'll say they're similar if they have the same
caste and less than the median distance
between homes, and otherwise we'll say they're different.
Okay, so if they're either of different castes or
at a greater distance than the median, then we'll put them as different.
So we're just going to make it a really simple model where we just keep
track of whether pairs of nodes are similar or different, and we'll allow for
two probabilities: one for nodes being similar and one for
nodes being different.
And similar here means they're very similar on
both dimensions, caste and GPS location.
Okay?
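
As a rough illustration of the similar/different rule just described, here is a minimal Python sketch. The function name classify_pairs, the dictionary inputs, and the use of plain Euclidean distance on (x, y) coordinates in place of real GPS distance are all assumptions made for the example, not part of the lecture.

```python
from itertools import combinations
from math import hypot
from statistics import median


def classify_pairs(caste, location):
    """Label each household pair as similar (True) or different (False).

    caste: dict household -> caste label (hypothetical input format)
    location: dict household -> (x, y) coordinates standing in for GPS
    """
    pairs = list(combinations(sorted(caste), 2))
    # Assumption: Euclidean distance; real GPS data would need a
    # geodesic distance instead.
    dist = {
        (i, j): hypot(location[i][0] - location[j][0],
                      location[i][1] - location[j][1])
        for i, j in pairs
    }
    med = median(dist.values())
    # Similar means same caste AND closer than the median distance.
    return {p: caste[p[0]] == caste[p[1]] and dist[p] < med for p in pairs}
```

Any pair that fails either test (different caste, or at least the median distance apart) comes out as different, which matches the two-category setup used in the rest of the discussion.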
So now what we can do is, we can fit a block model.
So we'll allow a
block model where we have two different probabilities:
the probability of a link if both are of the same category,
or similar to each other, and the probability if they are different.
And then we also fit a subgraph generation model,
where now what we're going to add in is also
triangles, and we'll allow triangles to have two different probabilities:
the probability of triangles for people that are all similar, and the probability of
triangles if some of the people involved are different from each other.
Okay?
So we'll fit the block model, and fit this subgraph generation model.
Both of these are very easy to fit here, right?
The block model's a special case of a SUGM where we just look at links.
So we can just count up links, count
up triangles, count up whether they're same or different.
So we're going to have four different counts and that
will give us estimates on all these things, okay?
And the block model just looks at links,
ignoring whether they're in triangles or not.
The subgraph generation model keeps track of
triangles separately from links and is estimated that way.
Okay, so that's the basic estimation technique.
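
The counting described above might be sketched as follows. This is only an illustration under stated assumptions: the function name fit_counts and its inputs are hypothetical, a triple counts as "same" only when all three pairs are similar, and "keeping triangles separate from links" is implemented here by counting only links that are not part of any triangle for the SUGM link estimate. Each estimate is simply a count divided by the number of possible pairs or triples of that type.

```python
from itertools import combinations


def fit_counts(nodes, links, similar):
    """Estimate link and triangle frequencies by type (same / different).

    nodes: list of node ids
    links: set of frozenset({i, j}) for observed links
    similar: function (i, j) -> bool, True if the pair is "similar"
    """
    def rate(hits, total):
        return hits / total if total else 0.0

    # Triangles: a triple is "same" only if all three pairs are similar.
    tri_hits = {True: 0, False: 0}
    tri_total = {True: 0, False: 0}
    in_triangle = set()
    for i, j, k in combinations(nodes, 3):
        edges = [frozenset(p) for p in ((i, j), (i, k), (j, k))]
        kind = bool(similar(i, j) and similar(i, k) and similar(j, k))
        tri_total[kind] += 1
        if all(e in links for e in edges):
            tri_hits[kind] += 1
            in_triangle.update(edges)

    # Links: the block model uses every link; the SUGM link count
    # (an assumption here) uses only links outside any triangle.
    all_hits = {True: 0, False: 0}
    iso_hits = {True: 0, False: 0}
    pair_total = {True: 0, False: 0}
    for i, j in combinations(nodes, 2):
        kind = bool(similar(i, j))
        edge = frozenset((i, j))
        pair_total[kind] += 1
        if edge in links:
            all_hits[kind] += 1
            if edge not in in_triangle:
                iso_hits[kind] += 1

    block = {k: rate(all_hits[k], pair_total[k]) for k in (True, False)}
    sugm = {
        "link": {k: rate(iso_hits[k], pair_total[k]) for k in (True, False)},
        "triangle": {k: rate(tri_hits[k], tri_total[k]) for k in (True, False)},
    }
    return block, sugm
```

The block model result has two numbers (link frequency for same and for different pairs), while the SUGM result has four, one for each combination of subgraph (link or triangle) and type.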
So we estimate this block model. Step one:
we're going to estimate the probability of a link
if you're same or different.
For the subgraph generation model we'll do the same
thing, but we're going to add in triangle counts.
And then once we have these, the nice thing about these
kinds of models is then we can generate back networks very easily.
So how do we generate a network?
Well, once we have these probabilities, we can just take
the set of nodes, pick pairs, flip coins, put in links with probability same or
different depending on whether they're the same
or different, and then generate a network.
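
The coin-flipping procedure for the block model could look something like this in Python. The function name generate_block_network, the similar callback, and the seed parameter are assumptions introduced for the sketch.

```python
import random
from itertools import combinations


def generate_block_network(nodes, similar, p_same, p_diff, seed=None):
    """Draw one network from a two-probability block model.

    similar: function (i, j) -> bool, True if the pair is "similar"
    p_same, p_diff: link probabilities for similar / different pairs
    """
    rng = random.Random(seed)
    links = set()
    for i, j in combinations(nodes, 2):
        # Flip a coin whose bias depends on whether the pair is similar.
        p = p_same if similar(i, j) else p_diff
        if rng.random() < p:
            links.add(frozenset((i, j)))
    return links
```

Calling it repeatedly with different seeds gives a collection of simulated networks on the same set of nodes.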
For the SUGM, what we can do is randomly pick triangles, randomly pick
links, and put them in with
these probabilities, and then we generate networks.
Okay?
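
For the SUGM, one way to sketch the generation step is to flip a coin for every possible triangle and every possible link and take the union of whatever gets placed. The function name generate_sugm_network, the dictionaries keyed by True (similar) and False (different), and the rule that a triple is "same" only when all three pairs are similar are assumptions made for this example.

```python
import random
from itertools import combinations


def generate_sugm_network(nodes, similar, p_link, p_tri, seed=None):
    """Draw one network by placing triangles and links independently.

    similar: function (i, j) -> bool, True if the pair is "similar"
    p_link, p_tri: dicts keyed by True (similar) / False (different)
    """
    rng = random.Random(seed)
    links = set()
    # Place triangles: each triple forms with its type's probability.
    for i, j, k in combinations(nodes, 3):
        kind = bool(similar(i, j) and similar(i, k) and similar(j, k))
        if rng.random() < p_tri[kind]:
            links.update(frozenset(p) for p in ((i, j), (i, k), (j, k)))
    # Place individual links on top of the triangles.
    for i, j in combinations(nodes, 2):
        if rng.random() < p_link[bool(similar(i, j))]:
            links.add(frozenset((i, j)))
    return links
```

Because triangles and links are laid down separately, the resulting network is the union of the two, so a link can appear either on its own or as a side of a triangle.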
So we randomly generate these networks and then we try
to see whether or not these networks recreate the actual, original
observations. Okay, so here is what we get.
So here's the data.