Categorical Data Type 101

Organisational chart displays the data types.

Data is everywhere nowadays, and data literacy, or obtaining meaningful information from data and thinking critically about it, is essential. One way to improve your data literacy is to know your dataset’s data type.

Understanding the type of data you have will help you choose the mathematical techniques and visualization types most suited for extracting valuable insights from your data.

There are two main types of data: categorical and continuous. In this article, I will cover categorical data; the next article covers continuous data.

The demo below displays the hierarchy of data types:

Let’s explore the categorical data type and the statistical methods that will help you get the best insight from your dataset.

Data is labeled categorical if its values describe a quality, grade, level, or other characteristic, such as happy/neutral/sad, low/high, or dark/light.

Categorical data can be Nominal or Ordinal.

1. Nominal data

Nominal data is a type of categorical data where the order doesn’t matter. Examples of nominal data:

  • Male/Female.
  • Countries’ names.
  • Colors.
  • Etc.

To identify nominal data, ask yourself the following question: Will the meaning change if I change the order of the values? If not, you have nominal data.

1.1. Statistical methods

Here are the main statistical methods used to explore nominal data:

  • Frequency.
  • Proportion.
  • Percentage.
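As a minimal sketch of these three measures, the snippet below counts a small list of colors using only Python's standard library. The `colors` list and the `summarize` helper are made up for illustration; they are assumptions, not part of any particular dataset or tool.

```python
from collections import Counter

# Hypothetical sample of nominal values (made-up data for illustration).
colors = ["red", "blue", "red", "green", "blue", "red", "red", "blue"]


def summarize(values):
    """Return {category: (frequency, proportion, percentage)}."""
    counts = Counter(values)
    total = len(values)
    return {
        category: (count, count / total, 100 * count / total)
        for category, count in counts.items()
    }


summary = summarize(colors)
for category, (freq, prop, pct) in summary.items():
    print(f"{category:<6} frequency={freq} proportion={prop:.3f} percentage={pct:.1f}%")
```

The frequency is the raw count, the proportion is the count divided by the total, and the percentage is the proportion multiplied by 100. Because the data is nominal, the order in which the categories are printed carries no meaning.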

1.2. Visualization

Column/bar, pie, and donut charts are mainly used to visualize nominal data. The bar chart below displays the world population by continent. You could swap the positions of Oceania and Europe, and the meaning of the data would stay the same.

2. Ordinal data

Data is ordinal if it represents a qualitative variable whose categories have a natural order, and the categories do not all cover the same number of units. Examples of ordinal data:

  • Levels
    • Low, medium, and high: each level covers a different number of units of measurement, as with levels of happiness, pain, etc.
  • Day’s times
    • Morning: from sunrise to 11:59 am, around 6 hours or 6 units of time.
    • Afternoon: from 12:00 pm to 5:00 pm, 5 hours or 5 units of time.
    • Evening: from 5:01 pm to 8:00 pm, 3 hours or 3 units of time.
    • Night: from 8:01 pm to sunrise, around 10 hours or 10 units of time.
  • Educational stages
    • Preschool: from 3 to 5 years old, and it has one grade.
    • Elementary school: from 5 to 14 years old, and it has 8 grades.
    • High school: from 14 to 18 years old, and it has 3 grades.
    • College: from 18 to 22 years old, and it has 4 to 5 years of education.
    • Graduate school: ages vary.
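To make the idea of "ordered categories" concrete, here is a small sketch in plain Python that stores the low/medium/high levels in order and uses their position to compare and sort values. The names `LEVELS`, `RANK`, `is_lower`, and the `responses` list are hypothetical and made up for this example.

```python
# Levels listed from lowest to highest; the position in the list encodes the order.
LEVELS = ["low", "medium", "high"]
RANK = {level: position for position, level in enumerate(LEVELS)}


def is_lower(a, b):
    """Return True if level a comes before level b in the defined order."""
    return RANK[a] < RANK[b]


# Hypothetical survey responses (made-up data for illustration).
responses = ["high", "low", "medium", "low", "high"]

# Sort by rank, not alphabetically, so the natural order is respected.
ordered = sorted(responses, key=RANK.get)
print(ordered)
print(is_lower("low", "high"))
```

Note that the ranks only say which level comes first. They do not say how far apart two levels are, so doing arithmetic on them (for example, treating "high" as twice "medium") would not be meaningful.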

2.1. Statistical methods

Here are the main statistical methods used to explore ordinal data:

  • Frequency.
  • Proportions.
  • Percentages.
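The same measures can be computed for ordinal data, with one difference: it usually makes sense to report the categories in their natural order. The sketch below assumes a made-up list of visit times (`visits`) and a hypothetical `TIMES_OF_DAY` ordering; both are for illustration only.

```python
from collections import Counter

# Categories in their natural order (an assumption for this example).
TIMES_OF_DAY = ["morning", "afternoon", "evening", "night"]

# Hypothetical observations (made-up data for illustration).
visits = [
    "evening", "morning", "night", "evening", "afternoon",
    "morning", "evening", "evening", "morning", "afternoon",
]

counts = Counter(visits)
total = len(visits)

# One row per category, in natural order: (category, frequency, proportion, percentage).
# Counter returns 0 for a category that never occurs, so no category is dropped.
table = [
    (category, counts[category], counts[category] / total, 100 * counts[category] / total)
    for category in TIMES_OF_DAY
]

for category, freq, prop, pct in table:
    print(f"{category:<10} frequency={freq} proportion={prop:.2f} percentage={pct:.1f}%")
```

Iterating over `TIMES_OF_DAY` instead of over the counted values keeps the rows in morning-to-night order, regardless of which category happens to be the most frequent.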

2.2. Visualization

Like with nominal data, column/bar, donut, and pie charts are often used to visualize ordinal data.

The pie chart below visualizes the browser market shares in January 2018. The data is ordered from the most used to the least popular browser.

Now you know what the categorical data type is and which mathematical tools to use to explore your dataset.

You are welcome to practice your knowledge using the Chart Chooser tool to select the most compelling chart.

Feel free to share your comments and questions in the comment section below.