Big data allows us to ask different questions than regular data does. Things that look like mere noise in a regular data set can turn out to be reliable patterns in a big data set, which allows us to stratify the data and zoom in on particular individual aspects. Large-scale analysis also circumvents many of the methodological pitfalls of traditional social science, for example the observer's paradox: a scientist observing a population inevitably influences its behavior.

However, since big data is also much more varied and noisy than regular data, it produces much less clear-cut results. It includes confounds, and it can obscure patterns. As observed by the economist Raj Chetty, who gained access to decades of US tax filings, applying traditional methods to terabyte-scale data does not automatically yield novel insights. In many cases, it merely corroborates what we already know, until we learn what to ask for and how. Big data sets therefore also require the development of completely new methodological approaches, and these methods in turn determine the questions we can ask.

Because of their size, big data sets provide us with good approximations of the true distribution of whatever phenomenon we study in them. If we observe something with some regularity in a big data set, we can reasonably assume that we will observe it in other samples as well. Based on this intuition, we can build machine learning models that predict outcomes. These models essentially find correlations between inputs and outputs, much like econometric models. In machine learning, however, we try to minimize the predictive error rather than maximize the fit. This relation is one-way: a predictive model should always also fit the data, but the inverse is not necessarily true. We can have a model that fits our data so well that it cannot actually predict anything on samples outside of it. This is called over-fitting.
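The gap between fitting and predicting can be sketched in a few lines of Python. The data, polynomial degrees, and noise level below are illustrative assumptions, not taken from the text: a high-degree polynomial passes almost exactly through ten noisy training points, while a simple linear model, which matches how the data was generated, predicts unseen points better.

```python
# Sketch of over-fitting: a flexible model fits the training data better
# than a simple one, but generalizes worse. Toy data, for illustration only.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a noisy linear relationship, y = 2x + noise.
x_train = np.linspace(0, 1, 10)
y_train = 2 * x_train + rng.normal(0, 0.1, size=10)
x_test = np.linspace(0, 1, 50)
y_test = 2 * x_test + rng.normal(0, 0.1, size=50)

def errors(degree):
    """Fit a polynomial of the given degree; return (train, test) MSE."""
    coefs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    return train_err, test_err

simple_train, simple_test = errors(1)    # matches the true relationship
complex_train, complex_test = errors(9)  # interpolates every training point

print(f"linear:    train={simple_train:.4f}  test={simple_test:.4f}")
print(f"degree 9:  train={complex_train:.4f}  test={complex_test:.4f}")
```

The degree-9 model achieves a near-zero training error, yet its wild oscillations between training points typically make it a worse predictor on the test set than the simple model, which is exactly the asymmetry described above: good prediction implies good fit, but not the reverse.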
Because of their focus on predictive performance, machine learning models put less emphasis on causality. Causal models should ideally also be able to predict unseen data, but they do not have to. Conversely, a predictive model might tell us what the most likely output is, but it does not necessarily explain why. For some predictive models, we can analyze the parameters to gain explanatory insights. But in the extreme case, for example with neural networks, there are so many parameters, interacting in such complex ways, that they are very hard to interpret. Identifying patterns is therefore only half the job; we also need to be able to explain them.

For example, a predictive model might predict how popular a tweet will be based on, among other things, its number of characters and certain keywords. That does not translate into a causal explanation, however: it does not follow that to write a popular tweet, you should make it exactly 238 characters long and use only those words. Arriving at such an explanation requires us to apply social theory and contextual analysis. And while a predictive pattern can provide a good starting point for such an analysis, the indiscriminate application of quantitative methods to large amounts of data is not guaranteed to uncover new insights. On the contrary, such an approach can unearth spurious correlations and produce false positives. Developing joint methods for predicting and explaining large data sets is not a matter of replacing existing theory. It is a complement to that theory, and a starting point for a more in-depth analysis of the data we have.