Randomized controlled trials can provide information not only about whether to introduce an innovation, but also about which alternative, among several, would be most beneficial. In fact, the effects measured with randomized controlled trials can be used to assess the relative efficiency of interventions, by studying an intervention's cost effectiveness or by undertaking a full cost-benefit analysis. The experimental design in which different versions of a product - a process design, architecture, technology, feature, or component - are comparatively tested has become widely known and used under the name of A/B tests, or split tests. If the A version is what we currently have, and the B version is the innovation managers want to introduce, the A/B test is exactly a randomized controlled trial with two arms like those described in the previous module. However, the A and the B versions might both be new things managers want to test. This is typically true, for example, in the case of an entrepreneur launching her or his idea, or in the case of products, processes, or technologies that are completely new to the market.

A/B testing at its most basic is a way to compare two versions of something to figure out which performs better. While it is most often associated with websites and apps, A/B testing is the term used for a randomized experiment with two variants: a control variant A and an experimental variant B, compared for the purpose of statistically testing which of the two provides the larger improvement. One typical example is a platform's site design, in which two versions, A and B, of one or more web or mobile pages are tested to determine which one produces better conversion rates - in terms of sales, hits, leads, and click-throughs - among randomly sampled, but similar, visitors. A/B tests can be used to test any new element, content, process, or design. In many cases, the A and B versions are not too different. But sometimes, greater diversity between A and B can help rule out a specific, radically different technology or solution.

One famous example of how A/B testing might help revolutionize processes comes from Barack Obama's election campaigns in 2008 and 2012. Both elections were won thanks to an unprecedented use of social media for campaigning and fundraising, and A/B testing played a significant role in optimizing their use. Instead of basing decisions about web design or social media on HiPPOs - the derisive term for the highest paid person's opinion - A/B tests allowed the campaigns to make communication decisions based on evidence about what worked best.

Naturally, A/B testing differs depending on the type of company, the product offered, and the innovation managers wish to introduce. Web-based companies like Google or Amazon run A/B tests massively and daily, as these kinds of tests have a negligible cost and huge potential upsides in terms of traffic and conversion rates. For enterprises in other industries, and especially for small and medium businesses, A/B testing is instead done on significant changes to the product, user interface layouts, or entire experience workflows. Only with this type of innovation, and therefore intervention, can managers move the needle. In the previous modules, we underlined that experiments should be conducted on testable and falsifiable hypotheses, themselves related to the subproblems that managers or entrepreneurs wish to solve and innovate around.
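To make the comparison concrete, here is a minimal sketch, in Python, of the statistical test that commonly sits behind a basic A/B test on conversion rates: a two-proportion z-test. The visitor and conversion counts below are hypothetical; in practice they would come from the data logged in each arm.

```python
from math import sqrt
from scipy.stats import norm

def ab_ztest(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test comparing the conversion rates of arms A and B."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under H0: no difference
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * norm.sf(abs(z))  # two-sided p-value
    return p_a, p_b, z, p_value

# Hypothetical traffic: 4,000 visitors per arm, 480 vs 540 conversions
p_a, p_b, z, p = ab_ztest(480, 4_000, 540, 4_000)
print(f"A: {p_a:.1%}  B: {p_b:.1%}  z = {z:.2f}  p = {p:.4f}")
```

A small p-value indicates that the observed difference between the two conversion rates is unlikely to be explained by chance alone.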
The focus on testable subproblems implies that startups, as well as established firms, should conduct multiple A/B tests simultaneously in order to have better information about alternative courses of action. To increase the information power of these experiments, entrepreneurs and managers can resort to multivariate tests. Multivariate tests make it possible to evaluate multiple options experimentally, including the joint effect of introducing more than one change or innovation at once. Multivariate tests can take two forms.

The first is multiple parallel trials, which consist in designing an experiment with multiple randomization arms. One arm represents the control condition (no change), while the others represent alternative interventions - different changes or innovations. We might think of these as A/B/C tests: instead of doing ten A/B tests, it is possible to conduct, for example, two multivariate tests with five alternatives each. One example of the use of these multivariate tests is Netflix, which applies them to identify the artwork that enables Netflix members to find a story they want to watch faster.

The second is factorial design trials, also called factorial trials. These types of experiments compare two or more experimental interventions in combination, as well as individually, in a single experiment. They make it possible to explore whether or not there is an interaction between two interventions. Interactions occur when an intervention works more effectively in the presence of another intervention, or - conversely - less effectively. Unfortunately, factorial trials require larger samples to have sufficient statistical power to detect interaction effects. Besides, undesired interactions can also be detrimental, as they reduce the power to detect the main effects of an intervention. The simplest factorial design is a two by two factorial, where there are four groups rather than two: control, the first intervention alone, the second intervention alone, and both interventions combined (a sketch of this analysis follows the discussion below). Factorial trials are more complex to design and more costly to conduct, but they often represent what managers should do. This can be hard, since the appeal of A/B tests is how straightforward and simple they are to run, and many managers do not have enough statistical skills. Besides, managers often find it preferable to run several A/B tests, because their minds reel at the number of possible combinations they could test. But, using simple mathematical rules, it is possible to pick and run only certain subsets of the treatments, and infer the rest from the data.

A/B testing also has some potential drawbacks. First, as A/B tests need to focus on narrow, testable, and falsifiable propositions, they tend to focus on incremental improvements that show up quickly in test results. Unless there is a theory and a broader architecture in which problems are framed and structured, this might lead the organization to miss the point and hinder more radical innovations. Tests that do not corroborate current hypotheses might represent a negative outcome in the short term, but might make it possible to learn and pivot to better options, or even to get confirmation after retesting in the long term, as users themselves change over time. Second, as most of the tests fail, managers and entrepreneurs get discouraged, assuming that no action resulting from a test means the test had no value. In fact, learning that A was not different enough from B to justify the innovation allows the company to avoid incurring false positives, and therefore has huge value. Third, managers want to draw immediate action implications from the test results.
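Here is the sketch promised above: a minimal two by two factorial analysis in Python, run on simulated data. The traffic volume, effect sizes, and variable names are all hypothetical; in a real trial, the data frame would contain the actual arm assignment and outcome of each visitor.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 8000  # total visitors, randomized across the four cells of the 2x2 design

# Hypothetical 2x2 factorial: two innovations, each independently on or off
df = pd.DataFrame({
    "a": rng.integers(0, 2, n),  # 1 if the visitor saw innovation A
    "b": rng.integers(0, 2, n),  # 1 if the visitor saw innovation B
})

# Simulated conversion probabilities with a small positive interaction
prob = 0.10 + 0.02 * df["a"] + 0.03 * df["b"] + 0.015 * df["a"] * df["b"]
df["converted"] = rng.binomial(1, prob)

# In the formula, 'a * b' expands to a + b + a:b, so the a:b coefficient
# estimates the interaction between the two interventions
model = smf.logit("converted ~ a * b", data=df).fit(disp=0)
print(model.summary().tables[1])  # main effects plus the a:b interaction term
```

If the a:b coefficient is significant, the two innovations reinforce (or undercut) each other - exactly the kind of effect that separate A/B tests cannot reveal.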
Drawing such immediate implications, however, requires a clear line of sight: an understanding of how the metrics that the A/B test improves affect other, more aggregate metrics of interest, and a critical interpretation of all the data and experiments ongoing. Entrepreneurs and managers should behave more like scientists, and design A/B tests that are rigorous. The effect sizes derived from the experiments, if statistically significant, will then represent better estimates of the potential effects of the innovation they want to introduce.

Besides, a manager should do three things. First, let the tests run their course. Managers are eager, impatient, and tend to make inferences on partial data, without waiting for the end of the experiment; stopping a test as soon as a difference appears inflates the risk of false positives. Second, look at many metrics, but actually consider and evaluate only the relevant few, those that truly capture the effect of the intervention. In the next module, we will call these variables valid and reliable. Third, replicate and retest frequently, especially in the case of surprising results, controversial results, or particularly large effects.

Summarizing, A/B or split tests have become a popular practice in innovation management. We believe that, if conducted rigorously, they can reduce uncertainty and lead to better innovation decisions. The more entrepreneurs and managers deal with complex problems and novel hypotheses, the harder it will be to design and execute such tests effectively. However, the benefits coming from them will also be higher, by mitigating the risk of incurring type I and type II errors - that is, false positives and false negatives.
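To close with a concrete illustration of how these two error risks shape an experiment's design, here is a minimal sketch of a standard sample size calculation for a two-arm A/B test, using the statsmodels library; the baseline conversion rate and the minimum effect worth detecting are hypothetical planning numbers.

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Hypothetical planning inputs: 12% baseline conversion rate, and a lift
# to 13.5% as the smallest improvement worth acting on
effect_size = proportion_effectsize(0.135, 0.12)  # Cohen's h for the two rates

# alpha caps the type I error (false positive) rate at 5%; power = 1 - beta
# caps the type II error (false negative) rate at 20%
n_per_arm = NormalIndPower().solve_power(
    effect_size=effect_size, alpha=0.05, power=0.80, alternative="two-sided"
)
print(f"Visitors needed per arm: {n_per_arm:.0f}")
```

Running the test with fewer visitors than this means accepting a higher chance of missing a real improvement; stopping it early means accepting a higher chance of a false positive.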