The key message of the previous video is that we often can't rely on large and complete data sets. However, we can still use the data that we have to support decision making, bearing in mind how these data differ and how we can best put them to use. A 2014 report by PricewaterhouseCoopers focuses on how companies use data to make innovation decisions and shows that there is value in using any kind of data to innovate. While we will see in the final module how companies and researchers use big data, the reality is that most companies, especially small and medium ones, cannot rely on big data. Big data require significant investments to support the collection, storage and analysis of large and complex data sets. Often small and medium companies don't have the consistent and comprehensive data that big data approaches require, but they're likely to have pockets of data on, for instance, sales, customer complaints, or transaction history. Google is a prototypical example of a company that uses information to guide its decision making, often by collecting data and analyzing them, as Alfonso described in a previous video. Google is a multinational internet and software corporation based in California that specializes in internet search, cloud computing and advertising technologies. Google typically starts with a question, to gain clarity on the type of information it needs to answer that question. As the Executive Chairman of Google, Eric Schmidt, puts it, “We run the company by questions not by answers.” A great example of an innovation decision faced by Google relates to its internal processes: improving how its HR department worked. The basic question Google wanted to answer was: Do managers actually matter? This is a question Google has been interested in since its early days, when its founders were questioning the contribution managers make. 
At some point, they actually got rid of all managers and made everyone an individual contributor, which didn't really work, so managers were brought back in. To answer the question, Google's data analytics team first looked at existing data sources, such as performance reviews and employee surveys. The team took these data and plotted them on a graph, which revealed that managers were generally perceived as good. The problem was that the data didn't show a lot of variation, so the team decided to split out the top and bottom quartiles of managers. Using a regression analysis, the team was able to show a big difference between these two groups in terms of team productivity, employee happiness and employee turnover. In summary, teams with better managers performed better, and their employees were happier and more likely to stay. While this confirmed that good managers do actually make a difference, it did not by itself tell Google what changes to introduce. The next question the company wanted to address was: What makes a good manager at Google? In order to answer this question, the data analytics team introduced two data collection efforts. They conducted interviews with managers in each of the two quartiles, bottom and top, to understand what they were doing. The managers did not know which quartile they were in. The data from the interviews were then coded using text analysis techniques, and based on these results the analytics team was able to extract the top eight behaviors of a high-scoring manager, as well as the top three reasons why managers were struggling in their role. Google shared these insights with the relevant people, and introduced a new Great Manager Award. To sum up, Google started this search for an answer on how to improve its HR practices with two very clear questions: Do managers matter? And what makes a good manager within Google? Google then used existing data to test its key hypothesis: managers can make a difference at Google. 
In order to gain more detailed information, Google also conducted manager interviews. It used this information to make evidence-based decisions, such as introducing new feedback mechanisms. In the next video, we will look at how companies conduct hypothesis testing when they can only use some of the data that they already have or when they have no data at all.
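
For those who want to try the idea on a small data set of their own, the quartile comparison from the Google example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the manager records, field names and numbers below are made up, and the sketch compares simple group averages rather than running the regression analysis mentioned in the example.

```python
from statistics import mean, quantiles

# Hypothetical, made-up records: one row per manager.
# "score" stands in for an aggregated rating from reviews and surveys.
managers = [
    {"id": "m01", "score": 3.1, "productivity": 60, "turnover": 0.22},
    {"id": "m02", "score": 3.4, "productivity": 63, "turnover": 0.20},
    {"id": "m03", "score": 3.6, "productivity": 66, "turnover": 0.17},
    {"id": "m04", "score": 3.8, "productivity": 68, "turnover": 0.15},
    {"id": "m05", "score": 4.0, "productivity": 71, "turnover": 0.13},
    {"id": "m06", "score": 4.2, "productivity": 74, "turnover": 0.11},
    {"id": "m07", "score": 4.5, "productivity": 79, "turnover": 0.08},
    {"id": "m08", "score": 4.7, "productivity": 82, "turnover": 0.06},
]


def split_quartiles(rows, key="score"):
    """Return (bottom, top): rows at or below Q1 and rows at or above Q3."""
    values = [row[key] for row in rows]
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    bottom = [row for row in rows if row[key] <= q1]
    top = [row for row in rows if row[key] >= q3]
    return bottom, top


def group_means(rows, fields):
    """Average each outcome field over a group of managers."""
    return {field: mean(row[field] for row in rows) for field in fields}


bottom, top = split_quartiles(managers)
outcomes = ("productivity", "turnover")
summary = {
    "bottom": group_means(bottom, outcomes),
    "top": group_means(top, outcomes),
}

for group, values in summary.items():
    print(group, values)
```

When the scores in the middle of the range look very similar, as they did in the Google example, comparing only the two ends of the distribution is one way to make differences in outcomes easier to see.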