Wednesday, December 14, 2022

Contingency table for showing relations between multiple variables

A contingency table is a type of table used to organize and display the relationship between two or more variables. It is commonly used in statistics to show the frequency or count of how often each combination of the variables occurs in a data set. For example, a contingency table could be used to show how many people in a survey chose each combination of favorite color and favorite food. This can help us see if there are any patterns or relationships between the variables.

In simple terms, a contingency table counts how often each combination of values occurs, so that patterns in the data are easier to spot.

Here is a simple example of a contingency table:


Favorite Color    Favorite Food    Count
Red               Pizza            5
Red               Burgers          2
Blue              Pizza            4
Blue              Burgers          3
Green             Pizza            1
Green             Burgers          4

This contingency table shows the relationship between two variables: favorite color and favorite food. Each row gives the number of people who chose that combination of favorite color and favorite food. For example, 5 people chose red as their favorite color and pizza as their favorite food, and 3 people chose blue as their favorite color and burgers as their favorite food.


Looking at the table, we can see that slightly more people overall chose pizza (10) than burgers (9). We can also see that people who chose red or blue as their favorite color more often chose pizza, while people who chose green more often chose burgers. This suggests that favorite color and favorite food may not be independent of each other, although with only 19 observations the pattern could also be due to chance.
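The totals behind these comparisons can be checked with a short Python sketch. It is only an illustration: the counts are copied from the table above, and the variable names (counts, color_totals, food_totals) are made up for this example.

```python
from collections import defaultdict

# Counts copied from the table: (favorite color, favorite food) -> count
counts = {
    ("Red", "Pizza"): 5,
    ("Red", "Burgers"): 2,
    ("Blue", "Pizza"): 4,
    ("Blue", "Burgers"): 3,
    ("Green", "Pizza"): 1,
    ("Green", "Burgers"): 4,
}

# Add up the counts per color and per food (the row and column totals)
color_totals = defaultdict(int)
food_totals = defaultdict(int)
for (color, food), n in counts.items():
    color_totals[color] += n
    food_totals[food] += n

print(dict(food_totals))   # {'Pizza': 10, 'Burgers': 9}
print(dict(color_totals))  # {'Red': 7, 'Blue': 7, 'Green': 5}

# Within each color, which food was chosen more often?
for color in color_totals:
    pizza = counts[(color, "Pizza")]
    burgers = counts[(color, "Burgers")]
    print(color, "-> Pizza" if pizza > burgers else "-> Burgers")
```

Comparing the counts inside each color, rather than only the overall totals, is what reveals whether the two variables seem to be related.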


Finding relative frequencies in a contingency table


To find the relative frequencies in a contingency table, you need to divide the count for each combination of variables by the total number of observations in the data set. This will give you the proportion or percentage of observations that fall into each combination of variables. For example, let's say you have the following contingency table:


Favorite Color    Favorite Food    Count
Red               Pizza            5
Red               Burgers          2
Blue              Pizza            4
Blue              Burgers          3
Green             Pizza            1
Green             Burgers          4

To find the relative frequencies, you first need to find the total number of observations in the data set. In this case, there are 5 + 2 + 4 + 3 + 1 + 4 = 19 observations. Then, you divide the count for each combination of variables by this total. For example, the relative frequency for the combination of red and pizza is 5 / 19 ≈ 0.26. This means that about 26% of the people in the data set chose red as their favorite color and pizza as their favorite food.


You can repeat this process for each combination of variables in the contingency table to find the relative frequency for each combination. This will give you a better understanding of the relationships between the variables and how often each combination occurs in the data.
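As a minimal sketch, the same calculation can be repeated for every combination in Python. The counts come from the table above; the names counts, total and rel_freq are just illustrative choices.

```python
# Counts copied from the table: (favorite color, favorite food) -> count
counts = {
    ("Red", "Pizza"): 5,
    ("Red", "Burgers"): 2,
    ("Blue", "Pizza"): 4,
    ("Blue", "Burgers"): 3,
    ("Green", "Pizza"): 1,
    ("Green", "Burgers"): 4,
}

# Total number of observations: 5 + 2 + 4 + 3 + 1 + 4 = 19
total = sum(counts.values())

# Relative frequency = count for the combination / total observations
rel_freq = {combo: n / total for combo, n in counts.items()}

for (color, food), p in rel_freq.items():
    print(f"{color:<6} {food:<8} {p:.2f}")
```

Because every count is divided by the same total, the relative frequencies add up to 1 (that is, 100%).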

To visualize this kind of data, we can use a segmented bar plot.
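To give a rough idea of what a segmented bar plot shows, here is a small text-only sketch in Python. It is a stand-in for a real chart, which would normally be drawn with a plotting library. Each color gets one bar of the same width, split into a pizza part (P) and a burgers part (B) according to the share of each food within that color. The bar width of 20 characters and the function name segmented_bar are assumptions made for this example.

```python
# Counts copied from the table: (favorite color, favorite food) -> count
counts = {
    ("Red", "Pizza"): 5,
    ("Red", "Burgers"): 2,
    ("Blue", "Pizza"): 4,
    ("Blue", "Burgers"): 3,
    ("Green", "Pizza"): 1,
    ("Green", "Burgers"): 4,
}

WIDTH = 20  # number of characters per bar (arbitrary choice)


def segmented_bar(color):
    """Return a text bar for one color, split by the share of each food."""
    row_total = counts[(color, "Pizza")] + counts[(color, "Burgers")]
    pizza_share = counts[(color, "Pizza")] / row_total
    pizza_cells = round(pizza_share * WIDTH)
    # The remaining cells belong to burgers, so every bar has the same length
    return "P" * pizza_cells + "B" * (WIDTH - pizza_cells)


for color in ["Red", "Blue", "Green"]:
    print(f"{color:<6} {segmented_bar(color)}")
```

Since all bars have the same length, the segments compare proportions within each color rather than raw counts, which makes it easy to see that the green bar is mostly burgers while the red bar is mostly pizza.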

