Validity is the extent to which metrics capture all and only the aspects of the event, fact, or phenomenon we wish to measure. The following chart provides a visual representation of the concept of metric validity. Metrics that capture only part of the phenomenon, event, or fact included in the hypothesis to be tested are deficient. Metrics that capture other phenomena, facts, or events different from those included in the hypothesis are contaminated.

But how should entrepreneurs or managers pick a valid measure or validate it? Consider, for example, the case of a manager who wishes to know whether a given minimum viable product actually solves a customer's problem. S/he might think of using a customer satisfaction survey, with a few questions investigating whether customers liked the product and would buy it. But does this measure capture what the entrepreneur wishes to measure? This is the question s/he needs to ask in order to behave like a scientist and reduce the probability of making mistakes in decisions. For example, customers might answer the survey questions out of politeness, or they might be satisfied and happy for reasons other than experiencing the minimum viable product offered to them by the entrepreneur. The question the entrepreneur needs to ask is: what would represent a valid measure? Something that actually captures whether the minimum viable product solves the customer's problem. In this example, a measure of actual customer behavior, derived for example from direct observation or from the duration of the customer's interaction with the minimum viable product, might be more valid.

Scientists have methods to validate the measures they use, but most of these methods are too rigorous and costly for managers and entrepreneurs. Validated measures might be needed on some special occasions, but in general, measure validation for innovation decisions can be based on common sense, good reasoning, and some evidence. To guide the choice of a valid measure, entrepreneurs and managers might find it useful to consider three basic kinds of validity: face validity, content validity, and criterion validity.

Face validity is the extent to which a measurement method appears, on its face, to measure the construct of interest. A customer satisfaction survey should include items about whether customers like a product or service, are happy with it, and would buy it; a survey that includes these kinds of items would therefore have good face validity. However, face validity might not be enough to ensure that the measure captures what it is supposed to. As we saw in the previous modules, entrepreneurs' or managers' intuitions and perceptions might be wrong or biased. One way to counter these issues is to ask a panel of experts, or knowledgeable people, what type of measures they would use to capture a given construct.

Content validity is the extent to which a measure covers the construct of interest. In the above example, if the product or service is designed to solve a certain customer pain or need in three different ways, then the measure should include items or other elements about these three different ways. Like face validity, content validity is typically not assessed quantitatively. Entrepreneurs and managers should assess it carefully, checking the measurement method against the conceptual definition of the construct.
Criterion validity is the extent to which the outcomes of the measurement on a given metric are correlated with other variables, known as criteria, that one would expect them to be correlated with. In the above example, if the product or service is designed to solve a certain customer's pain, then customers interviewed after experiencing the product or service should no longer have that need or pain. A criterion can be any variable that an entrepreneur or manager has reason to think should be correlated with the construct being measured, and there will usually be several of them. Criteria can also include other measures of the same construct. Criterion validity can be established in two ways. When the criterion is measured at the same time as the construct, it is referred to as concurrent validity; this is, however, a weak form of validation, as it is based on correlation and not on causation. When the criterion is measured at some point in the future, after the construct has been measured and with ex-ante random assignment of subjects/raters, it is referred to as predictive validity, because scores on the measure predict a future outcome. Finally, there is discriminant validity, which is the extent to which the outcomes of a measurement on a given measure are not correlated with measures of constructs that are conceptually distinct, opposite, or divergent from the focal one.

Summarizing: entrepreneurs and managers who adopt the scientific approach to test their hypotheses pick metrics that provide meaningful linkages between the constructs included in the hypotheses and the empirical tests. Valid metrics help avoid measurement error and reduce the probability of making wrong inferences and decision mistakes.
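To make the idea of concurrent criterion validity concrete, the following is a minimal sketch in Python. It correlates a survey-based satisfaction score with an observed-behavior criterion (the duration of each customer's interaction with the minimum viable product). All numbers and variable names are hypothetical and chosen purely for illustration; they do not come from any real study mentioned in the text.

```python
# Minimal sketch (hypothetical data): assessing concurrent criterion validity
# by correlating a survey-based satisfaction score with an observed-behavior
# criterion, e.g. minutes spent interacting with the minimum viable product.
from statistics import correlation  # Pearson's r, available in Python 3.10+

# Hypothetical scores for ten customers (made up for illustration only).
survey_scores = [4, 5, 3, 2, 5, 4, 1, 3, 4, 2]             # 1-5 satisfaction rating
interaction_minutes = [12, 18, 9, 4, 20, 14, 3, 8, 15, 5]  # observed behavior

r = correlation(survey_scores, interaction_minutes)
print(f"Concurrent criterion validity (Pearson r): {r:.2f}")
# A high positive r suggests the survey tracks the behavioral criterion;
# as noted above, this is correlational evidence only, not proof of causation.
```

A high correlation here would support, but not prove, the validity of the survey measure; predictive validity would additionally require measuring the criterion later in time, with ex-ante random assignment.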