Visualize data distribution with density and jitter plots




In past articles, I showed you how to create a density chart and a box plot with jitter to visualize data distribution. Guess what? There are other compelling ways to display data distribution. In this article, I will focus on combining the density chart with the jitter chart to illustrate data distribution. I will highlight the benefits of combining these two chart types and show how to create the combination using Highcharts.

The chart below displays the distribution of the Olympic 2012 athletes’ weights by the athletic discipline:

The chart above shows a large number of data points for both female and male athletes. The female athletes’ weights are mainly concentrated between 55kg and 60kg, whereas the male athletes’ weights mostly span from 65kg to 75kg, with a smaller cluster around 120kg.
You can already see the benefit of combining the two charts: each one helps us understand the data from a different perspective. Let’s dig deeper, shall we?

The jitter chart is an excellent choice to visualize relationships between continuous and categorical data. The data is rendered in a way that lets us assess the number of data points, their spread, and their distribution. However, when a high volume of data points overlap, the distribution becomes hard to judge by eye, and this is where a density chart is effective. The density chart displays the density distribution clearly regardless of the number of data points or how much they overlap.
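To make the idea of jitter concrete, here is a minimal sketch of what it does to the data: points that share a category get a small random offset on the category axis, while the measured value stays untouched. The helper name `addJitter` and the sample values are hypothetical; in the chart itself Highcharts applies the effect through the scatter series `jitter` option, so this is only an illustration of the concept.

```javascript
// Illustration only: `addJitter` is a hypothetical helper, not a Highcharts API.
// Each point is [categoryIndex, value]; only the category coordinate is moved.
function addJitter(points, maxOffset, random = Math.random) {
  return points.map(([category, value]) => [
    // random() is in [0, 1), so the offset lies in [-maxOffset, +maxOffset).
    category + (random() * 2 - 1) * maxOffset,
    value
  ]);
}

// Several athletes with identical weights would overlap without jitter.
const weights = [[0, 58], [0, 58], [0, 58], [1, 72], [1, 72]];
const jittered = addJitter(weights, 0.4);
console.log(jittered);
```

With the offset applied, the three identical points in category 0 no longer sit on top of each other, which is what makes the number of points visible.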

How to create and combine a density chart with a jitter chart using Highcharts?

There are two main steps to create such a chart:

Step 1: Data fetching and processing

To fetch the JSON data file hosted on GitHub, I use the Fetch API:

const getData = async () => {
  // Request the Olympic 2012 athletes data set hosted on GitHub.
  const response = await fetch(
    "https://raw.githubusercontent.com/mekhatria/demo_highcharts/master/olympic2012.json?callback=?"
  );
  // Anything other than HTTP 200 is thrown so the caller can handle it.
  if (response.status !== 200) throw response;
  // Parse the response body as JSON.
  return response.json();
};

To process the density data, I use the same density function as the previous article (density chart):

let step = 1,
  precision = 0.00000000001,
  width = 15;
// processDensity is the function from the previous density chart article.
let dataWeight = processDensity(
  step,
  precision,
  width,
  maleWeight,
  femaleWeight
);
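The `processDensity` function itself is defined in the earlier article and is not repeated here. As a rough idea of what a density estimate computes, the sketch below uses a plain Gaussian kernel density estimate. The names `gaussianKernel` and `gaussianKde`, the bandwidth of 3, and the sample weights are all assumptions for illustration; this is not the author's `processDensity` and may differ from it in its parameters and output.

```javascript
// Standard normal probability density, used as the smoothing kernel.
function gaussianKernel(u) {
  return Math.exp(-0.5 * u * u) / Math.sqrt(2 * Math.PI);
}

// Hypothetical helper: estimates the density of `values` at each x in `xs`.
// Returns [x, density] pairs, a shape an areaspline series can plot directly.
function gaussianKde(values, bandwidth, xs) {
  return xs.map((x) => {
    const sum = values.reduce(
      (acc, v) => acc + gaussianKernel((x - v) / bandwidth),
      0
    );
    return [x, sum / (values.length * bandwidth)];
  });
}

// Made-up weights in kg, evaluated on a 1 kg grid from 40 to 90.
const sampleWeights = [55, 57, 58, 60, 62, 70];
const xs = [];
for (let x = 40; x <= 90; x += 1) xs.push(x);

const density = gaussianKde(sampleWeights, 3, xs);
console.log(density.slice(15, 20));
```

A wider bandwidth gives a smoother curve that hides small clusters, while a narrower one follows individual points more closely, so the value is worth tuning against your own data.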

One last task before the next step: swap the x and y values of each jitter point so that the jitter plot is drawn horizontally instead of vertically:

maleWeightJitter = maleWeightJitter.map((e) => [e[1], e[0]]);
femaleWeightJitter = femaleWeightJitter.map((e) => [e[1], e[0]]);
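For context, here is a small sketch of how the jitter arrays could be built before the swap. The record field names `sex` and `weight`, the helper `toJitterPoints`, and the sample records are assumptions for illustration, since the structure of the JSON file is not shown in this article.

```javascript
// Hypothetical records; the field names `sex` and `weight` are assumptions.
const athletes = [
  { sex: "male", weight: 72 },
  { sex: "female", weight: 58 },
  { sex: "male", weight: 0 }, // missing weight, filtered out below
  { sex: "female", weight: 61 }
];

// Build [categoryIndex, weight] pairs, the layout of a vertical jitter plot.
function toJitterPoints(records, sex, categoryIndex) {
  return records
    .filter((r) => r.sex === sex && r.weight > 0)
    .map((r) => [categoryIndex, r.weight]);
}

let maleWeightJitter = toJitterPoints(athletes, "male", 0);
let femaleWeightJitter = toJitterPoints(athletes, "female", 1);

// Swap each pair so the weight becomes x and the category becomes y.
const swapXY = ([x, y]) => [y, x];
maleWeightJitter = maleWeightJitter.map(swapXY);
femaleWeightJitter = femaleWeightJitter.map(swapXY);

console.log(maleWeightJitter, femaleWeightJitter);
```

After the swap, the weight runs along the x-axis, so the jitter points can share that axis with the density curves.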

Now, it is time to create the chart.

Step 2: Chart creation

Once the data processing is done, creating the chart is straightforward.
I use the areaspline type to display the density plots and the scatter type for the jitter plots:

series: [
  {
    name: "Density F",
    type: "areaspline",
    data: dataWeight.results[1],
    color: femaleColorJitter,
    yAxis: 0
  },
  ...
  {
    name: "Jitter M",
    type: "scatter",
    data: maleWeightJitter,
    jitter: {
      y: jitterWidth
    },
    marker: {
      radius: jitterMarkerRadius
    },
    color: maleColorJitter,
    yAxis: 1
  }
]
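The series above refer to `yAxis: 0` and `yAxis: 1`, so the chart needs two y-axes: one for the density values and one for the jitter categories. The sketch below shows one way such a configuration could be assembled. The function name `buildOptions`, the axis titles, the `"container"` element id, and the toy data are assumptions for illustration, not the configuration of the original demo.

```javascript
// Hypothetical helper: wraps a list of series in a two-axis chart configuration.
function buildOptions(series) {
  return {
    title: { text: "Weight distribution" },
    xAxis: { title: { text: "Weight (kg)" } },
    yAxis: [
      { title: { text: "Density" } }, // index 0: used by the areaspline series
      { title: { text: null }, opposite: true } // index 1: used by the scatter series
    ],
    series: series
  };
}

// Toy data standing in for the processed density and jitter arrays.
const options = buildOptions([
  {
    name: "Density",
    type: "areaspline",
    data: [[55, 0.02], [60, 0.05], [65, 0.03]],
    yAxis: 0
  },
  {
    name: "Jitter",
    type: "scatter",
    data: [[58, 0], [58, 0], [61, 0]], // [weight, categoryIndex]
    jitter: { y: 0.3 }, // random vertical offset so equal weights do not overlap
    marker: { radius: 2 },
    yAxis: 1
  }
]);

// Render only when Highcharts is loaded (for example, in a browser page).
if (typeof Highcharts !== "undefined") {
  Highcharts.chart("container", options);
}
```

Keeping the configuration in a plain function makes it easy to inspect or reuse the options object before handing it to `Highcharts.chart`.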

And voila, the code is done 😎.

Now, you know how to create effective interactive charts to visualize density distribution and relationships.
Feel free to try it with your data, and share your experience and questions in the comment section below.
