If you're conducting a study, it makes sense to think about your data in terms of cases and variables. Cases are the persons, animals, or things you're studying, and variables are the characteristics of interest. In this video, I will discuss how you can order and present your cases and variables. Imagine you're interested in the Primera Division, the top football competition in Spain. The cases you are interested in are the individual football players within the league, and the variables you focus on are age, body weight, goals scored, team membership, and hair color.
The best way to order all this information is by means of a data matrix. This is such a data matrix. The data matrix is the core element of every statistical study. It is nothing more than an overview of all your cases and variables. The cases are displayed in the rows. They range from player one to player 400. You can see that no names are displayed, which means that the players are anonymized here. The variables are displayed in the columns. We have, as you can see, five variables: age, weight, goals scored, team membership, and hair color. The values that are displayed in the cells of this table are usually called observations. 80.3 here means that player seven weighs 80.3 kilograms. The value eight here means that player three has scored eight goals.
What you see here is not the complete data matrix. It's only a part of it. The complete matrix does not fit on the screen because it contains 400 rows. We have, after all, 400 players. By means of these dots here, I've made clear that I've left out part of the matrix. Let's check whether our data matrix contains any strange values. Hey, when we look at player 24, we see no value for weight, and in the next row, age is missing. So we don't know the value for every case-variable combination. For now, we have just included these incomplete cases. But we might have to remove them if a subsequent analysis requires a complete data matrix.
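To make the idea of a data matrix with missing observations concrete, here is a minimal Python sketch. It is only an illustration, not how the lecture's data were actually stored: the names `data_matrix`, `incomplete_cases`, and `complete_cases` are hypothetical, and apart from the three facts mentioned above (player 7 weighs 80.3 kg, player 3 scored 8 goals, player 24 has no weight and the next player has no age) every value is an invented placeholder.

```python
# Variables (columns) named in the lecture.
VARIABLES = ["age", "weight", "goals", "team", "hair"]

# A few rows of the data matrix, keyed by anonymized player number.
# Only these facts come from the lecture: player 7 weighs 80.3 kg,
# player 3 scored 8 goals, player 24 has no weight, player 25 has no age.
# Every other value is an invented placeholder. None marks a missing observation.
data_matrix = {
    3:  {"age": 27, "weight": 74.1, "goals": 8, "team": "A", "hair": "black"},
    7:  {"age": 31, "weight": 80.3, "goals": 2, "team": "B", "hair": "brown"},
    24: {"age": 22, "weight": None, "goals": 0, "team": "A", "hair": "blond"},
    25: {"age": None, "weight": 68.9, "goals": 5, "team": "C", "hair": "black"},
}


def incomplete_cases(matrix):
    """Return {case: [variables without an observation]} for incomplete rows."""
    result = {}
    for case, row in matrix.items():
        missing = [v for v in VARIABLES if row[v] is None]
        if missing:
            result[case] = missing
    return result


def complete_cases(matrix):
    """Keep only the rows that have an observation for every variable."""
    return {
        case: row
        for case, row in matrix.items()
        if all(row[v] is not None for v in VARIABLES)
    }


print(incomplete_cases(data_matrix))        # {24: ['weight'], 25: ['age']}
print(sorted(complete_cases(data_matrix)))  # [3, 7]
```

Keeping the incomplete rows in the matrix and filtering them only when an analysis needs complete cases mirrors the choice described above.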
You need the data matrix for all your statistical analyses. However, you usually do not present your complete data matrix to other people. The reason is that a data matrix is often huge (in our case, it has 400 rows) and doesn't give a clear overview of the statistical information it contains. When we present the information in our data matrix to others, we therefore often make use of summaries of the data in the form of tables and graphs.
Imagine you want to summarize the information you've got about the hair color of the players in the Spanish football competition. A good way to do that is to make a frequency table. A frequency table shows you how the values of a variable are distributed over the cases. The frequency table is nothing more than a list of all possible values of a variable, together with the number of observations for each value. Here's an example based on the variable hair color. We can distinguish four categories: blond, brown, black, and other. This is the frequency table. You can see that 76 football players have blond hair, and 160 players have black hair. Note that these values add up to 400. We don't have any missing data for hair color.
We can also express the relative frequencies by means of percentages. In the second column you see the percentages. You can see at a glance that 7.5% of all players have a hair color other than blond, brown, or black. 19% of the players have blond hair. You get the value 19 here by dividing 76 by 400 and multiplying the result by 100. Sometimes, researchers use cumulative percentages. It is easy to compute them. Cumulative percentages are nothing more than the percentages of each category and all preceding categories added up. So you can see here that 19% plus 33.5% equals 52.5%, which means that 52.5% of all players have blond or brown hair. In this example, we talked about a categorical variable, hair color. What if we are dealing with a quantitative variable? Take weight, for instance.
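The hair color frequency table described above can be reproduced in a few lines of Python. This is a sketch under stated assumptions: the function name `frequency_table` is hypothetical, and the counts for brown (134) and other (30) are not stated in the lecture but follow from the quoted 33.5% and 7.5% of 400 players.

```python
# Counts per hair color. 76 (blond) and 160 (black) are stated in the lecture;
# 134 (brown) and 30 (other) are derived from 33.5% and 7.5% of 400 players.
counts = {"blond": 76, "brown": 134, "black": 160, "other": 30}


def frequency_table(counts):
    """Return rows of (value, frequency, percentage, cumulative percentage)."""
    total = sum(counts.values())
    rows = []
    running = 0  # running count, used for the cumulative percentage
    for value, n in counts.items():
        running += n
        rows.append((value, n, 100 * n / total, 100 * running / total))
    return rows


for value, n, pct, cum_pct in frequency_table(counts):
    print(f"{value:<6} {n:>4} {pct:>6.1f} {cum_pct:>6.1f}")
```

The cumulative column is computed from a running count rather than by adding percentages, which gives the same result (19% for blond, 52.5% for blond or brown) while avoiding accumulated rounding.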
It doesn't make sense to compute percentages for every specific value of weight, because then we would end up with an enormous number of categories. The frequency table would show, for instance, that two persons have a weight of 65.3 kilograms, one person a weight of 65.4 kilograms, and so on. That doesn't give a good overview because it barely tells you more than the original data matrix. What researchers usually do to solve that problem is build new ordinal categories by using intervals. You could say, for instance, that the first category contains those players who weigh less than 60 kilograms. The second, those who weigh between 60 and 69.9 kilograms. The next one, between 70 and 79.9. The following one, between 80 and 89.9. And the final one, 90 kilograms or more. This way you lose information, but the advantage is that you get a much better overview. We say that you have recoded the variable. The variable weight was a quantitative variable that you've turned into an ordinal variable with only five categories. It is very easy to recode quantitative variables into ordinal ones. However, the other way around is impossible: once you only know the interval a player falls in, the exact weight cannot be recovered. You cannot recode ordinal variables into quantitative ones.
So, what do you know now? You use a data matrix as the source of all your statistical analyses. It is the overview of your data. However, if you want to present your findings to other people, you make use of summaries of your data. One very good way to summarize is by making frequency tables. If necessary, you can recode your quantitative variables into ordinal ones.
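The recoding of weight into five ordinal categories can be sketched in Python as follows. The function name `recode_weight` and the category labels are hypothetical, and the sample weights are illustrative: 65.3, 65.4, and 80.3 appear in the lecture, while 58.0 and 92.5 are invented so that the outer categories are not empty. The sketch assumes weights are recorded to one decimal, so "between 60 and 69.9" is treated as "at least 60 and below 70".

```python
from collections import Counter

# Ordinal categories, listed from lowest to highest.
CATEGORIES = ["less than 60", "60-69.9", "70-79.9", "80-89.9", "90 or more"]


def recode_weight(kg):
    """Map a weight in kilograms to one of the five ordinal categories."""
    if kg < 60:
        return CATEGORIES[0]
    if kg < 70:
        return CATEGORIES[1]
    if kg < 80:
        return CATEGORIES[2]
    if kg < 90:
        return CATEGORIES[3]
    return CATEGORIES[4]


# Illustrative weights: 65.3, 65.4 and 80.3 are mentioned in the lecture,
# 58.0 and 92.5 are invented placeholders.
weights = [65.3, 65.3, 65.4, 80.3, 58.0, 92.5]

# Frequency table of the recoded variable, in category order.
frequencies = Counter(recode_weight(kg) for kg in weights)
for category in CATEGORIES:
    print(f"{category:<13} {frequencies[category]}")
```

Note that the three distinct weights around 65 kilograms collapse into a single category, which is exactly the loss of information, and the gain in overview, described above.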