In this session, we discuss conditional probabilities and Bayes' Theorem. They are two important tools for understanding how the logic of probabilities helps managers make decisions.

We start with conditional probabilities. A conditional probability is the probability that a random variable takes a given set of values when another random variable has taken a specific set of values. To help intuition: conditional probabilities are the basis for prediction, because they tell us how the probability of an event changes when we observe another event. The conditioning event is the signal that helps us make a better prediction of the focal event. As a matter of fact, two events are independent if observing the conditioning event does not change the probability that we attach to the focal event; in this case, the signal is not informative. But if it does change that probability, the signal is informative, because it tells us whether the focal event is more or less likely to occur than before we observed the signal.

An example can help. In a class, there are 20 students coming from Europe, North America, or any other region of the world. Consider a table that relates the number of letters in the last name of each student to the geographic area he or she comes from. In each cell of the table, you find the number of students coming from the area in the column whose last name has the number of letters indicated in the row. The last row reports the marginal values, that is, the total number of students coming from the area in each column, and the last column reports the total number of students with the number of letters in each row. The cell in the bottom right corner is the total number of students, 20 in our case. If we divide all numbers in the cells by 20, we obtain probabilities, or relative frequencies, rather than absolute frequencies; the bottom right cell then reports 1, which is the sum of all probabilities.

In this table, each frequency in a cell corresponds to a joint probability, that is, the probability of observing - at the same time - the event in the row and the event in the column. For example, there are two students coming from North America whose last name has five letters. The joint probability of these two events is 2 divided by 20, or 10 percent. Let us denote by x the number of letters in the last name and by y the region of origin. The conditional probability that x equals 5, given that y equals North America, is 2 divided by 7. This is the joint probability, 2 divided by 20, divided by the marginal probability that y equals North America, which is 7 divided by 20.

We make two remarks. The first is that the conditional probability helps our prediction. Suppose you need to predict whether the last name of one specific student has five letters. If you have no information, you set this probability equal to the marginal probability, that is, 3 divided by 20. Suppose instead that you know that s/he is North American. In this case, you condition the probability on the signal - s/he is North American - and the probability becomes two-sevenths. The information tells you that it is more likely that the last name of the student has five letters. Of course, the probability could also decrease: consider the probability that the student's last name has nine letters when you know that she is North American.
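As a quick check of the arithmetic in the five-letter example, the following is a minimal sketch in Python. It uses only the figures quoted above; the full 20-student table is not reproduced in this session, so no other cells appear.

```python
from fractions import Fraction

# Figures quoted above; the rest of the table is not needed for this computation.
joint_na_five = Fraction(2, 20)    # P(x = 5 and y = North America): 2 of 20 students
marginal_na = Fraction(7, 20)      # P(y = North America): 7 of 20 students
marginal_five = Fraction(3, 20)    # P(x = 5): 3 of 20 students

# A conditional probability is the joint probability divided by
# the marginal probability of the conditioning event.
p_five_given_na = joint_na_five / marginal_na

print(p_five_given_na)                                # 2/7
print(p_five_given_na > marginal_five)                # True: the signal raises the probability
```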
There could also be independence: consider a case in which you want to assess the probability that the student has seven letters in her last name, and you receive a signal telling you that she does not come from Europe or North America. The second remark is that the conditional probability and the joint probability are related: the former is equal to the latter divided by the marginal probability of the conditioning event.

The conditional expectation is the mean of the distribution, conditional upon a given event. In our example, the unconditional mean number of letters is the sum, over the rows, of the number of letters in each row multiplied by the corresponding marginal probability in the last column, which is equal to 6.4. This is the average number of letters in the last names of the students in this class. Conditional upon the student coming from Europe, the mean is the sum of the number of letters in each row multiplied by the probability of that number, conditional upon the student coming from Europe. You can easily check that this conditional mean is 6.625, that is, the average number of letters in the last names of the European students in this class.

We now turn to Bayes' Theorem. Bayes' Theorem tells us that the probability of x, conditional on y, is equal to the probability of y, conditional on x, times the marginal probability of x, divided by the marginal probability of y. It is thus a way to retrieve the probability of x conditional on y from the probability of y conditional on x and the marginal probabilities. We will first understand the theorem and then highlight its concrete implications for managers.

To understand the theorem, consider this example. The probability that the last name has eight or more letters, conditional upon the student being European, is two-eighths, or 25 percent. Using Bayes' Theorem, this probability is equal to the probability that the student is European, conditional upon having eight or more letters in the last name, multiplied by the probability of having eight or more letters in the last name, and divided by the probability of being European. We have thus retrieved the probability of having eight or more letters in the last name, conditional upon being European, from the probability of being European, conditional upon having eight or more letters in the last name.

To understand why this is relevant for managers, consider the following example. Suppose that a project is profitable with probability 33 percent, or one-third, and unprofitable with probability 67 percent, or two-thirds. At the same time, if a project is profitable, a consultant tells the manager that it is profitable with probability 75 percent, or three-fourths. If the project is unprofitable, the consultant nonetheless tells the manager that it is profitable with probability 37.5 percent, or three-eighths. These two numbers, 75 percent and 37.5 percent, are statements about the ability of the consultant: they are equal, respectively, to 1 minus the probability of a type II error and to the probability of a type I error. A better consultant exhibits a higher value of the first probability and a lower value of the second. The question is then: what is the probability that the project is profitable if the consultant tells you that it is profitable? This is the probability that the project is profitable, conditional upon the signal that the consultant believes the project is profitable.
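Written out in symbols (merely restating the verbal formulas above, with P(x, y) as my shorthand for the joint probability of x and y), the relation between conditional and joint probabilities, and Bayes' Theorem, are:

```latex
P(x \mid y) \;=\; \frac{P(x, y)}{P(y)},
\qquad
P(x \mid y) \;=\; \frac{P(y \mid x)\,P(x)}{P(y)} .
```

The computation that follows applies the second identity, with x standing for the event that the project is profitable and y for the consultant's favorable signal.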
Bayes' Theorem then tells you that this probability is equal to the probability that the consultant says the project is profitable, given that it is profitable, which is 75 percent or three-fourths, times the unconditional probability that the project is profitable, which is 33 percent or one-third, divided by the unconditional probability that the consultant says that a project is profitable. In turn, the latter probability is equal to the probability that the consultant says that the project is profitable when it is profitable, times the probability that it is profitable, plus the probability that the consultant says that it is profitable when it is not, times the probability that it is not profitable. The first term of this sum is equal to the numerator, and the second term is equal to 37.5 percent, or three-eighths, times 67 percent, or two-thirds. The resulting probability is 50 percent.

We can make three remarks. First, we have retrieved the probability that the project is profitable given the signal offered by the consultant from a measure of the ability of the consultant, namely the probability that s/he states that the project is profitable given that it is profitable. Second, this probability is 50 percent. Suppose that we did not hire the consultant. In this case, our best guess that a given project is profitable is 33 percent, so we are right 33 percent of the time, while, if we follow the advice of the consultant, we are right 50 percent of the time. The signal improves our decision. Third, if a consultant exhibits a probability higher than 75 percent, the share of right decisions increases. However, even if this probability were one, the probability of picking a profitable project, given the advice of the consultant, would not be 1. This is because, while the consultant would always recognize a profitable project when the project s/he examines is profitable, s/he would still mistake an unprofitable project for a profitable one when the project s/he examines is unprofitable. In our example, this happens with probability 37.5 percent, or three-eighths. Thus, reducing this other probability, that is, the probability of a type I error, is also important.

To conclude, aside from the specific probabilities, the point we made in this session is that, in making decisions, we need to understand and interpret signals, because they help raise the share of good decisions. Moreover, we need to reduce both type I and type II errors, because even if we make no errors of one type, we can still make bad decisions because of errors of the other type.
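To tie the numbers together, here is a minimal sketch in Python that reproduces the consultant computation and illustrates the third remark; the variable names are mine, not part of the session.

```python
from fractions import Fraction

# Figures from the consultant example above.
p_profitable = Fraction(1, 3)              # prior probability that the project is profitable
p_unprofitable = 1 - p_profitable          # 2/3
p_signal_if_profitable = Fraction(3, 4)    # 75%: 1 minus the type II error probability
p_signal_if_unprofitable = Fraction(3, 8)  # 37.5%: the type I error probability

# Unconditional probability that the consultant says "profitable" (law of total probability).
p_signal = (p_signal_if_profitable * p_profitable
            + p_signal_if_unprofitable * p_unprofitable)

# Bayes' Theorem: probability that the project is profitable given the favorable signal.
posterior = p_signal_if_profitable * p_profitable / p_signal
print(posterior)          # 1/2, i.e. 50 percent

# Third remark: even a consultant who never misses a profitable project (first probability = 1)
# still leaves the posterior below 1, because the type I error of 3/8 remains.
posterior_perfect = (Fraction(1) * p_profitable
                     / (Fraction(1) * p_profitable
                        + p_signal_if_unprofitable * p_unprofitable))
print(posterior_perfect)  # 4/7, roughly 57 percent
```

Working with exact fractions rather than floating-point numbers keeps the results readable as the same fractions used in the session (one-half, and, under the hypothetical perfect-sensitivity case, four-sevenths).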