Interactive Small Multiple with R









Finding a balance between a broad-strokes and a detailed approach to data visualization can be challenging, especially when either approach obscures as much of the data story as it illuminates. There is an alternative path, however. In this article, we look at visualizing data for comparison purposes, and at how highlighting a select group of data within a larger data set allows for both a high-level and a granular view.

Remark

In the previous article, we explored the power of combining a box plot with a jitter plot. This combination allows us to compare one continuous variable and visualize its size for each discipline. However, if you want to compare data sets through the relationship between two continuous variables, your best option is to use small multiples that highlight one data set at a time. For example, the charts below visualize the relationship between height and weight (men) for the following disciplines: Gymnastics, Modern Pentathlon, Canoe, and Hockey.

Here is the code for one chart:

# Load plyr before dplyr so that plyr does not mask dplyr functions
library(plyr)
library(dplyr)
library(highcharter)
library(readr)

cols <- c("#c5c5c5", "#ff5858")

#Load the data
df <-
  read_csv(
    "https://raw.githubusercontent.com/mekhatria/demo_highcharts/master/Olympics2012CapitalLetter.csv"
  )
#Gather the disciplines under the name Additional to compare as one series with the targeted discipline
df$sport <-
  revalue(
    df$sport,
    c(
      "Canoe" = "Additional",
      "Hockey" = "Additional",
      "Modern Pentathlon" = "Additional"
    )
  )
# Remove columns the chart does not need: nationality, date of birth, name, and age
df <- subset(df, select = -c(nationality, date_of_birth, name, age))
# Keep only male athletes from the targeted and additional disciplines
my_data <- df %>%
  filter(sex == "male", sport %in% c("Gymnastics", "Additional"))
# Drop the sex column, which is now constant
my_data <- subset(my_data, select = -c(sex))
#Create the chart
hchart(my_data, "scatter", hcaes(x = height, y = weight, group = sport)) %>%
  hc_title(text = "Gymnastics") %>%
  hc_xAxis(title = list(text = ""),
           labels = list(format = "{value} m")) %>%
  hc_yAxis(title = list(text = ""),
           labels = list(format = "{value} kg")) %>%
  hc_tooltip(useHTML = TRUE,
             headerFormat = "{series.name}<br>Height: {point.x} m<br> Weight: {point.y} kg",
             pointFormat = "") %>%
  hc_colors(cols)

The four charts were created separately, then combined in a Bootstrap grid to get the small-multiple effect. Any other layout will work fine as long as the charts are aligned and grouped.
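As one possible alternative to assembling the charts by hand, the sketch below builds one chart per discipline in a loop and lays them out with highcharter's `hw_grid()`. This is not the layout used for the charts in this article, which were placed in a Bootstrap grid. The helper names `highlight_sport()` and `small_multiple_grid()` are hypothetical, and the code assumes the raw data as returned by `read_csv()`, with its original `sport` values and the `sex`, `height`, and `weight` columns used in the listing above.

```r
# Hypothetical helper: keep male athletes from the chosen disciplines and
# relabel every discipline except `target` as one background group.
highlight_sport <- function(df, target, sports, label = "Additional") {
  keep <- which(df$sex == "male" & df$sport %in% sports)
  out <- df[keep, c("sport", "height", "weight")]
  out$sport <- ifelse(out$sport == target, target, label)
  out
}

# Hypothetical helper: one scatter chart per discipline, arranged in a grid.
# Requires the highcharter package; calls are namespaced so nothing is attached.
small_multiple_grid <- function(df, sports, cols = c("#c5c5c5", "#ff5858"),
                                ncol = 2, rowheight = 300) {
  charts <- lapply(sports, function(target) {
    chart <- highcharter::hchart(
      highlight_sport(df, target, sports),
      "scatter",
      highcharter::hcaes(x = height, y = weight, group = sport)
    )
    chart <- highcharter::hc_title(chart, text = target)
    # Same color order as the single chart: first series grey, second red
    highcharter::hc_colors(chart, cols)
  })
  highcharter::hw_grid(charts, ncol = ncol, rowheight = rowheight)
}
```

With the unmodified data in a variable such as `raw`, calling `small_multiple_grid(raw, c("Gymnastics", "Modern Pentathlon", "Canoe", "Hockey"))` should return the four panels in two columns. The grey-then-red color order assumes that the "Additional" series comes before the highlighted discipline, as it does in the single chart.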

These four charts visualize the relationship between the height and weight of athletes in four disciplines, while also making it possible to compare that relationship across disciplines. In each chart, a single discipline is highlighted in red and displayed on top, whereas the rest of the disciplines are in grey and in the background.
The audience can easily see that the height and weight of all athletes have a strong positive relationship regardless of the discipline. The gymnasts are the shortest and the lightest among the disciplines represented. In contrast, the canoe athletes are the tallest and the heaviest, whereas the hockey players fall roughly in the middle.
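The visual reading above can also be checked numerically. The following is a minimal sketch, in base R only, that reports the mean height, mean weight, and height-weight correlation for the male athletes of each discipline. `summarise_sports()` is a hypothetical name, and the function assumes the raw data with its original `sport` values. No output is shown here because it depends on the data file.

```r
# Hypothetical helper: per-discipline summary for male athletes.
summarise_sports <- function(df, sports) {
  men <- df[which(df$sex == "male" & df$sport %in% sports), ]
  per_sport <- lapply(split(men, men$sport, drop = TRUE), function(d) {
    data.frame(
      sport = d$sport[1],
      n = nrow(d),
      mean_height = mean(d$height, na.rm = TRUE),
      mean_weight = mean(d$weight, na.rm = TRUE),
      # Pearson correlation over athletes with both measurements present
      correlation = cor(d$height, d$weight, use = "complete.obs")
    )
  })
  # Stack the one-row summaries into a single data frame
  do.call(rbind, per_sport)
}
```

A correlation close to 1 in every row would support the claim of a strong positive relationship, and the two mean columns make it easy to rank the disciplines by height and weight.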

By now, you are well aware of the power of small multiples, and you are well equipped to combine different chart types to create the small multiples you need. Feel free to share your questions and comments in the section below, and don’t forget to share your best tips for creating effective small multiples.
