All right, so in principle this seems okay, and it is a valid model,

but if you look at it closely, it won't be very effective.

So it's really assuming that the difference between male and

female must somehow be equivalent to the difference between female and other.

That's because all of these bars decrease by a constant amount at

each step; in other words, they fall on a line.

But there's really no reason this should be the case.
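
To make that restriction concrete, here is a minimal sketch, assuming the categories are coded as the integers 0 through 3 in the order male, female, other, not specified. The coding and the parameter values are made up purely for illustration; they are not fitted to any data.

```python
# Hypothetical integer coding of the categories (an assumption for illustration).
codes = {"male": 0, "female": 1, "other": 2, "not specified": 3}

# Made-up parameter values, chosen only to show the shape of the predictions.
theta0, theta1 = 70.0, -2.0

def predict_single_feature(gender):
    """Single-feature linear model: height = theta0 + theta1 * code."""
    return theta0 + theta1 * codes[gender]

heights = [predict_single_feature(g) for g in codes]

# The gap between consecutive categories is always theta1,
# so the predicted heights are forced to fall on a line.
steps = [b - a for a, b in zip(heights, heights[1:])]
print(heights)
print(steps)
```

Whatever values we pick for theta0 and theta1, every step comes out the same size, which is exactly the assumption being questioned here.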

For instance,

this type of model would not allow us to fit data that looked more like this,

where we had one height for females, one height for males, a different height for other,

and a different height for not specified.

And there was no line connecting those four different heights.

So how would we go about capturing data like that?

We need a more sophisticated encoding of our gender values.

So imagine something like the following, where we might say the height = theta 0 for

male, theta 0 + theta 1 for a female, theta 0 + a different value theta 2 for

other, and theta 0 + theta 3 for not specified.
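
Written out as a formula, the model just described would look roughly like this. The notation is a sketch based on the spoken description, with theta 0 acting as the male baseline and the other three parameters acting as offsets from it.

```latex
\[
\text{height} =
\begin{cases}
\theta_0            & \text{if male} \\
\theta_0 + \theta_1 & \text{if female} \\
\theta_0 + \theta_2 & \text{if other} \\
\theta_0 + \theta_3 & \text{if not specified}
\end{cases}
\]
```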

Certainly this has made my model more complex, and I have four unknowns,

theta 0, theta 1, theta 2, theta 3 rather than just two unknowns.

But this is still an example of a linear regression model.

I can write it out as an inner product between my parameters theta and

my features, male, female, other or not specified,

each of which is going to be a binary feature.
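
As a rough sketch of that inner product, assuming a constant feature of 1 stands in for the male baseline (so that theta 0 applies to everyone) and the remaining three features are binary indicators. The function names and the parameter values below are hypothetical and chosen only for illustration.

```python
def gender_features(gender):
    """Binary feature vector: [constant, is_female, is_other, is_not_specified].

    The leading 1 is an assumption of this sketch; it multiplies theta 0,
    which plays the role of the male baseline.
    """
    return [
        1,
        1 if gender == "female" else 0,
        1 if gender == "other" else 0,
        1 if gender == "not specified" else 0,
    ]

def predict_height(theta, gender):
    """Linear model: inner product of the parameters and the feature vector."""
    return sum(t * x for t, x in zip(theta, gender_features(gender)))

# Made-up parameters [theta0, theta1, theta2, theta3], not fitted to data.
theta = [70.0, -5.0, -1.0, -3.0]

for g in ["male", "female", "other", "not specified"]:
    print(g, gender_features(g), predict_height(theta, g))
```

Because each category gets its own offset, the four predicted heights no longer have to fall on a line, yet the model is still linear in theta.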