[MUSIC] Our goal in this recording is to develop some simple probability distributions. So recall that the sample space is the set of all possible outcomes of our experiment, and hence, by definition, the occurrence of the sample space is the certain event. Namely, the probability of S, the sample space, is equal to 1.

So our goal now is to consider a probability distribution whereby we take this one unit of probability, the certain event that the sample space occurs, and distribute it across all possible outcomes of the experiment, i.e., across every possible value within that sample space, to reflect how likely those different outcomes are. Now, of course, there are many different ways we could distribute this unit of probability, yet in a given application we would want to do it in a way that fairly reflects the likelihood of occurrence of each of those events.

So let's start with some baby steps and consider perhaps the simplest possible example. Imagine your sample space has N possible outcomes, and consider the very special case where each of these outcomes is equally likely. The simplest example we could consider is that of a fair coin. Remember, there are two outcomes when you toss a coin, heads and tails, and let's assume it is a fair coin, not a biased one, such that heads and tails are equally likely. So here N, the size of our sample space, is equal to 2. And we said some event, or set of interest, is some subset of that sample space. So let's suppose we are interested in the event of obtaining heads.

In the classical world of probability, we can determine theoretically the probability of a particular event as the number of elementary outcomes within the sample space which agree with that event of interest, relative to the total number of equally likely outcomes. So let's break it down and consider the simple coin toss example.
So for a fair coin, there are two equally likely outcomes, heads and tails. And if we define the event A to be tossing, let's say, heads, then only one of those two outcomes, the H out of H and T, equates to getting heads. And so we would assign the probability of A, i.e., the probability of getting heads, to be 1 over 2. More generally, when our sample space has N equally likely outcomes, and little n of them correspond to the event of interest A, then the probability of A is n over N, and this reflects its probability of occurrence.

Well, we talked about baby steps, so we've done a little bit of crawling. Let's try to stand up a little now and consider a larger example where the sample space has more possible outcomes. We've considered tossing a coin; let's now consider rolling a fair die, which we saw in the previous session. In this case, our sample space has six possible outcomes, the positive integers 1, 2, 3, 4, 5, and 6. And provided it is a fair die, each of those scores is considered equally likely.

So what would our probability distribution look like? Well, in this setting we want to construct our probability distribution in tabular form. In that sense, we would have two rows. The top row reflects every possible outcome of this experiment, and in the second row, assigned to each of these values, is its probability of occurrence. And indeed let's now introduce the notation of a capital X to denote the random variable, which here equates to the score on the die. We know that for this random experiment there are six possible values this score can take, so this random variable X is going to be a 1, 2, 3, 4, 5, or 6.
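The little-n over big-N rule can be sketched in a few lines of Python. The function name, the use of exact fractions, and the example call are my own illustrative choices rather than anything from the lecture:

```python
from fractions import Fraction

def classical_probability(event_outcomes, sample_space):
    """P(A) = n / N when all outcomes in the sample space are equally likely."""
    n = len(set(event_outcomes) & set(sample_space))  # outcomes matching event A
    N = len(set(sample_space))                        # all equally likely outcomes
    return Fraction(n, N)

# Tossing a fair coin: sample space {H, T}, event A = "obtain heads"
print(classical_probability({"H"}, {"H", "T"}))  # 1/2
```

The same function handles larger sample spaces, e.g. the event "roll an even score" on a fair die gives `classical_probability({2, 4, 6}, {1, 2, 3, 4, 5, 6})`, which is again one half.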
So assuming this is a fair die, rather than a loaded or biased die, each of those six outcomes is equally likely. So in our probability distribution, consider the probability of rolling a 1. Using our little n over big N notation for the probability of an event A when the outcomes are equally likely, capital N in this case is equal to 6. Only one of those six outcomes corresponds to rolling a 1, and hence the probability of a 1 is 1 over 6. Of course, we would assign exactly the same probability to the other five possibilities of 2, 3, 4, 5, and 6, again treating these as equally likely outcomes. So here we see our first formal probability distribution: each possible value of the random variable, with its corresponding probability of occurrence assigned to it.

Now, of course, here we are being slightly restrictive and are considering situations specifically where those outcomes are equally likely. For a fair coin or a fair die, that is indeed a reasonable assumption. But of course, there will be many situations in life where the outcomes are not equally likely. We only need to look back at that Monty Hall problem. Remember, at the very start of the game there were three doors and you had to choose one of them. And indeed, to begin with, as far as you were concerned, the sports car was equally likely to be behind any of those three doors. So effectively, your sample space of where the sports car could be involved three possibilities: door A, door B, and door C. And because, in your world at the start of the game, the sports car was equally likely to be behind each of them, you assigned a 1 over 3 probability, i.e., a third, to the sports car being behind each of those three doors. Of course, though, when we played the game, you chose door A, I opened door B, and this eliminated door B as a possibility.
So the size of your sample space of where the sports car could be was reduced by me from 3, doors A, B, and C, down to just 2, namely doors A and C. However, if when you originally played that game you thought you had a 50-50 chance of winning the sports car, because you only saw those two possible outcomes of door A and door C, then you would have fallen for the no-new-information fallacy, whereby people see two possible outcomes and instinctively believe that each is equally likely. So yes, for tossing a fair coin, heads and tails are equally likely, but somewhat of a paradox of the Monty Hall problem is that even when there were only two unopened doors left, there was not a 1 in 2 chance of the sports car being behind each. Now, how do we actually obtain those revised probabilities of a third and two-thirds which we saw? Well, that's to come a little later in week two, when we consider Bayesian updating.

So appreciate that there will be many situations in life where the outcomes are not necessarily equally likely. Of course, that does complicate slightly how we assign probabilities to the different possible outcomes, but we'll see more of that later in week two when we consider other forms of probability distributions.

For now, let's backtrack to tossing the coin and rolling the die. In the rolling-the-die case, we attached a 1 in 6 probability to each of those equally likely outcomes 1, 2, 3, 4, 5, and 6. So we saw it in tabular form, but I like pretty pictures, so perhaps let's consider a graphical representation of this probability distribution. We take the certain outcome, i.e., the score must be a 1, 2, 3, 4, 5, or 6, and we split this unit of probability across each of the possible outcomes on the die. Assuming a fair die, we split this unit of probability equally and place one-sixth probability on each of those outcomes.
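That equal one-sixth split can be written out as a small Python table, with a dictionary standing in for the two-row tabular form; the variable names here are illustrative only:

```python
from fractions import Fraction

# Sample space for the score X on a fair die
outcomes = [1, 2, 3, 4, 5, 6]

# Equally likely outcomes: the one unit of probability is split six ways
distribution = {x: Fraction(1, len(outcomes)) for x in outcomes}

print(distribution[1])             # 1/6, and the same for every other score
print(sum(distribution.values()))  # 1, the whole unit of probability
```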
So now we can see a sort of graphical display of this probability distribution. Now, returning to the tossing of a fair coin. Well, there we said there are only two possible outcomes, heads and tails, which are equally likely. So we take this unit of probability and split it equally, such that we assign 50%, or 0.5, or 1 over 2, a half, they all mean the same thing, equal amounts of probability to each of those two outcomes.

So we could express this in tabular form, with the outcomes of heads and tails. But when we're dealing with probability distributions, ideally we like the outcomes to be numerical values. In the case of the score on a die, the outcomes were already numeric: 1, 2, 3, 4, 5, and 6. But for tossing a coin, we have these qualitative outcomes of heads and tails. Well, it's very easy to assign numerical values to those outcomes. Given the binary nature of tossing a coin, heads and tails, why don't we opt for the binary formulation of zero and one, whereby heads is represented by the outcome 1 and tails by the outcome 0? Of course, the assignment of 0 and 1 is entirely arbitrary. There's nothing preventing us from saying heads equates to a variable value of 0 and tails to a variable value of 1. It's whichever you would prefer.

So there we have our tabular distribution: our variable x takes the values 1 and 0, let's say for obtaining heads and tails respectively, and in the case of a fair coin we attach equal probabilities of a half and a half to each.

So we've now considered our first few simple probability distributions, both tabularly and also graphically, which allow us to say how likely certain events are to occur. But of course we have been very restrictive here. We're taking these baby steps whereby those outcomes are equally likely. To be considered going forward are situations where those outcomes are not equally likely.
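The 0/1 encoding of heads and tails can likewise be sketched in Python, again with illustrative names, and with heads arbitrarily mapped to 1:

```python
from fractions import Fraction

# Map the qualitative outcomes to numbers: heads -> 1, tails -> 0
# (this assignment is arbitrary; heads -> 0 and tails -> 1 would work equally well)
encode = {"H": 1, "T": 0}

# Tabular distribution for a fair coin: x takes values 1 and 0, each with probability 1/2
coin = {encode["H"]: Fraction(1, 2), encode["T"]: Fraction(1, 2)}

print(coin[1])             # probability of heads: 1/2
print(sum(coin.values()))  # 1, so the unit of probability is fully distributed
```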
Stay tuned. [MUSIC]
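As a quick empirical check on the one-third and two-thirds Monty Hall probabilities quoted above, before Bayesian updating derives them properly, here is a short simulation sketch. The function and variable names are my own, and the host's choice of which door to open is simplified to the first available door:

```python
import random

def play_monty_hall(switch, rng):
    """Play one round of the Monty Hall game; return True if you win the car."""
    doors = ["A", "B", "C"]
    car = rng.choice(doors)  # the sports car is equally likely behind each door
    pick = "A"               # you always choose door A, as in the lecture
    # The host opens a door that is neither your pick nor the car
    opened = next(d for d in doors if d != pick and d != car)
    if switch:
        # Switch to the remaining unopened door
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

rng = random.Random(42)
trials = 100_000
stick_wins = sum(play_monty_hall(switch=False, rng=rng) for _ in range(trials))
switch_wins = sum(play_monty_hall(switch=True, rng=rng) for _ in range(trials))
print(stick_wins / trials)   # close to 1/3
print(switch_wins / trials)  # close to 2/3
```

With the stick strategy you win only when the car is behind your original door A, which happens about a third of the time; switching wins in the remaining roughly two-thirds of rounds, matching the revised probabilities mentioned in the lecture.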