Small multiple and box-plot



In this article, we will take another look at the small multiple and box-plot for visualizing a complex data set using our Olympic Athlete dataset.

In the previous article “Small multiple with box plot and jitter scatter charts” we explored the combination of jitter, box-plot, and scatter charts. The result was remarkable, as the audience can easily see the athletes’ height and weight in different disciplines, but it has a drawback: the audience cannot easily locate one discipline's data relative to the others or compare the disciplines with each other.

There are a few ways to address this drawback: either compare one single discipline to the rest of the disciplines, or gather all the disciplines on one chart. Each option has its pros and cons. Let’s explore each solution.

We’re using the same data set as in the previous tutorial (2012 Olympic athletes’ heights and weights). The demo below displays the full dataset in one visual:



Solution 1: Highlighting a specific data set

The small multiple below consists of four charts. Each chart displays two data sets: one specific discipline (in blue) and the rest of the disciplines (in gray):


A few things are easy to see with this kind of visual: weightlifters are heavier than most, and basketball athletes are taller than most. The gymnasts are short and light, while badminton athletes sit near the middle of the height and weight ranges.

This solution is very effective in showing how each discipline is positioned compared to the rest of the data. The only issue is that the audience cannot compare more than one discipline at a time.
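
The data preparation behind this kind of highlight view can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the sample records are invented for illustration and are not taken from the Olympic dataset, and the helper names `split_for_highlight` and `build_panels` are hypothetical. The idea is that every panel contains every athlete exactly once, either in the highlighted series or in the gray background series.

```python
# Hypothetical sample records: (discipline, height_cm, weight_kg).
# These values are made up for illustration only.
athletes = [
    ("Weightlifting", 170, 105),
    ("Basketball", 200, 98),
    ("Gymnastics", 160, 55),
    ("Badminton", 176, 70),
    ("Basketball", 205, 102),
    ("Gymnastics", 158, 52),
]


def split_for_highlight(records, discipline):
    """Split the points into (highlighted, background) for one panel."""
    highlighted = [(h, w) for d, h, w in records if d == discipline]
    background = [(h, w) for d, h, w in records if d != discipline]
    return highlighted, background


def build_panels(records):
    """Build one panel per discipline; each panel holds every point once."""
    disciplines = sorted({d for d, _, _ in records})
    return {d: split_for_highlight(records, d) for d in disciplines}


panels = build_panels(athletes)
highlighted, background = panels["Basketball"]
print(len(highlighted), len(background))
```

Each entry in `panels` can then be handed to whichever charting library is in use, drawing the background series first in gray and the highlighted series on top in blue.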


Solution 2: Visualize all the data sets

Another way to compare all the disciplines is to visualize them all on the same chart. The first chart below uses box plots to display the male athletes’ heights, ordered from shortest to tallest by third quartile, while the second chart does the same for the female athletes’ heights. This gives the audience another way to gauge the typical value (the median) as well as the spread in height and weight within each athlete group and between groups, without switching charts as in the first solution.
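
Ordering the boxes by third quartile can be sketched as follows. This is a minimal sketch, not the code behind the charts: the heights are invented for illustration, the helper names are hypothetical, and the choice of the "inclusive" quartile method is an assumption, since charting tools differ slightly in how they compute quartiles.

```python
from statistics import quantiles

# Hypothetical heights in cm per discipline, made up for illustration only.
heights = {
    "Basketball": [190, 198, 200, 205, 210],
    "Gymnastics": [155, 158, 160, 163, 166],
    "Badminton": [170, 174, 176, 180, 184],
    "Weightlifting": [160, 165, 170, 175, 180],
}


def third_quartile(values):
    """Return Q3; quantiles(n=4) yields the cut points [Q1, median, Q3]."""
    return quantiles(values, n=4, method="inclusive")[2]


def order_by_third_quartile(groups):
    """Return discipline names sorted from lowest to highest Q3."""
    return sorted(groups, key=lambda name: third_quartile(groups[name]))


order = order_by_third_quartile(heights)
print(order)
```

The resulting list gives the left-to-right order of the box plots. Sorting by the median or the first quartile instead only requires changing the index used in `third_quartile`.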



Nevertheless, the chart takes up significant space and could be hard to read on small screens. Whether you choose the first solution, the second solution, or a combination of both, be aware of the pros and cons of each one.

Which one do you like better?