So far, we’ve presented some key guidelines and principles on how to develop hypotheses. It is important to understand, however, that hypotheses do not exist in a vacuum: they rely heavily on their context. Once you articulate your hypotheses, the next step consists of testing them using real-world data. While there are several ways to test hypotheses, and Alfonso and Ronaldo will explain how to test them using either data or experiments, in this video we are focusing on what hypothesis testing relies on. In other words, probability values, or p-values.
What are p-values? A p-value is a number used to judge how strong the evidence from a hypothesis test is. The p-value essentially compares the hypothesis being tested against a null hypothesis, where we predict that the hypothesized effect does not exist. Using the Rent the Runway example from the previous video, let's imagine that the founders ran a test with 100 college students and 100 working professionals for the following hypothesis: college students are more willing than professionals to rent designer clothes online. This is the hypothesis they intend to test. The null hypothesis, instead, will be: college students are not more willing than professionals to rent designer clothes online. In other words, with the null hypothesis they predict that their hypothesized effect does not exist.
What they're trying to estimate looks like this: willingness to rent is equal to a coefficient times a variable called student, which indicates whether someone is a student or a professional, plus controls. The controls are other characteristics we might want to keep constant, such as the likelihood of attending events where women need fancy clothes. The founders collected data and ran statistical analyses, and they obtained the following coefficient for the variable student: 1.45, with a p-value of 0.001. What does this mean? Well, first of all, the sign of the coefficient for the variable student is positive.
So, this indicates that the willingness to rent is higher for college students than for professionals. Secondly, this coefficient is equal to 1.45. Assuming this is an odds ratio, it would indicate that the odds of renting designer clothes online are 45 percent higher for college students than for professionals, everything else being equal.
In order to understand if these results are reliable, we also need to consider the p-value. The p-value in this case is extremely small, 0.001, which indicates that this result is statistically significant. How so? The p-value tells us the probability of finding results at least as strong as the ones we observed if the null hypothesis were true. In other words, if there were no difference between college students and professionals in their willingness to rent clothes online, we would find a result like this in only 0.1 percent of cases, that is, about one time in a thousand.
Say that we were just focusing on the sign and magnitude of the coefficient in this example. It's entirely possible that what we are finding with the data that we have is due to random sampling error. However, a small p-value, in this case 0.001, indicates that our results are unlikely to be an artifact of the particular sample that we have. As a consequence, we can be fairly confident that what we observe in the sample used for this test reflects a true difference between college students and professionals. To put this differently, we are testing our hypothesis of interest by showing that there is strong evidence against the null hypothesis, and the p-value quantifies how strong that evidence is.
Researchers and data analysts typically agree that a result is statistically significant if the p-value is lower than 0.05, which is often described as being 95 percent confident that your results are not due to a random draw of the sample you're analyzing. More precisely, if the null hypothesis were true, a result at least this strong would occur less than 5 percent of the time. Why is this the value everyone agrees on, and what implications does it have?
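To make the definition of a p-value concrete, here is a minimal Python sketch. It is not the regression the founders ran: it swaps in a simple permutation test, and the survey counts (60 of 100 students and 35 of 100 professionals willing to rent) are invented purely for illustration. The function name `permutation_p_value` is also hypothetical. The idea is to shuffle the group labels many times, which mimics a world where the null hypothesis is true, and count how often chance alone produces a difference at least as large as the one observed.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """One-sided permutation test on the difference in rental rates.

    Returns the observed difference (rate_a - rate_b) and the estimated
    probability of a difference at least that large if group labels
    did not matter (i.e. under the null hypothesis).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b

    pooled = list(group_a) + list(group_b)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign people to groups at random
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if diff >= observed:
            at_least_as_extreme += 1

    # Add-one correction so the estimate is never exactly zero.
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical survey data (assumption, not from the lecture):
# 1 = willing to rent designer clothes online, 0 = not willing.
students = [1] * 60 + [0] * 40
professionals = [1] * 35 + [0] * 65

observed_diff, p_value = permutation_p_value(students, professionals)
print(f"Observed difference in rental rates: {observed_diff:.2f}")
print(f"Estimated p-value under the null: {p_value:.4f}")
print("Statistically significant at 0.05:", p_value < 0.05)
```

With these made-up counts, the estimated p-value should come out well below the conventional 0.05 cutoff, because shuffled labels rarely reproduce a 25-point gap. If the two groups had the same rental rate, the same function would return a large p-value instead.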
The 0.05 threshold is just a conventional rule, so that researchers and data analysts can be sufficiently confident in the soundness of their results. However, what happens if a p-value is equal to 0.06? According to conventional standards, the result will not be statistically significant. But the difference between 0.05 and 0.06 is extremely small, which practically means that you can still be reasonably confident the results are sound. You can think of the p-value cutoff as a quality threshold for the results you find.
While the key message is that conventional wisdom treats a p-value lower than 0.05 as indicating statistically significant results, with smaller p-values being stronger evidence, there is more to data and experiments than just probability values. A very important point is to consider results beyond figures and coefficients, and analyze them in their context. In many cases, good decisions are a result of data, intuition, background, and experience.
In particular, let's think about an idea for a new product in a period of rich technological opportunities. For instance, an app in the food delivery market when there is high demand for apps of that type. In this case, you will need a very positive test to confirm your original business idea. This is because the context is extremely favorable to that type of product, and the signals obtained through data analysis might not be particularly informative about trends in the market. On the other hand, when dealing with niche markets or unfavorable market situations, even a relatively weak result might be worth exploring further. It is in this context that previous experience and knowledge of that particular industry or sector can help make better sense of the data. In the next video, we will focus on how real companies translated hypothesis testing into practice.