
Data Visualization: Choosing a Chart Type

Choosing your chart or visualization type

When selecting the right type of visualization for your data, think about your variables (string/categorical and numeric), the volume of data, and the question you are attempting to answer through the visualization. Additionally, think about who will be viewing the data and how you can best optimize the data narrative through design. 
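The sections of this guide pair variable types with chart types. As a rough summary, those pairings can be written as a small lookup. This is only a sketch: the function name `suggest_chart`, the type labels, and the `part_to_whole` flag are hypothetical, and real decisions also depend on data volume and audience.

```python
def suggest_chart(variable_types, part_to_whole=False):
    """Suggest a chart type from the kinds of variables in the data.

    variable_types: list of "categorical", "numeric", or "time" (hypothetical labels).
    part_to_whole: True when the question is how a total is subdivided.
    """
    kinds = sorted(variable_types)
    if kinds == ["categorical", "numeric"]:
        # One string and one numeric variable: bar chart, or pie for part-to-whole.
        return "pie chart" if part_to_whole else "bar chart"
    if kinds == ["numeric", "time"]:
        # One time variable and one numeric variable: change over time.
        return "line chart"
    if kinds == ["numeric", "numeric"]:
        return "scatter plot"
    if kinds == ["numeric", "numeric", "numeric"]:
        # The third numeric variable is shown as bubble size.
        return "bubble chart"
    return "no simple suggestion"


print(suggest_chart(["categorical", "numeric"]))  # bar chart
print(suggest_chart(["time", "numeric"]))         # line chart
```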

Cleveland and McGill (1985) studied which visual characteristics of a data visualization are easiest and most difficult for the human eye to perceive. In order from least difficult to most difficult, they are:

1. position along a common scale

2. position along a non-aligned scale

3. length

4. angle and slope

5. area

6. volume, density, and color saturation

7. color hue 

This means that a visualization consisting of differently sized and colored bubbles is more difficult for the human eye to discern than a bar chart (position along a common scale).

Cleveland, William S., and Robert McGill. 1985. "Graphical perception and graphical methods for analyzing scientific data." Science 229 (4716): 828-833.
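One way to apply the ranking is to list the visual encodings a chart relies on and look up the hardest one. The sketch below assumes hypothetical key names for the seven levels and assumes that a bubble chart relies on area and color hue while a bar chart relies on position and length; it is an illustration, not a method from the paper.

```python
# Rank 1 is easiest to perceive, rank 7 is hardest (Cleveland and McGill, 1985).
PERCEPTION_RANK = {
    "position_common_scale": 1,
    "position_nonaligned_scale": 2,
    "length": 3,
    "angle": 4,
    "slope": 4,
    "area": 5,
    "volume": 6,
    "density": 6,
    "color_saturation": 6,
    "color_hue": 7,
}


def hardest_encoding(encodings):
    """Return the encoding in the list that is hardest for the eye to read."""
    return max(encodings, key=PERCEPTION_RANK.__getitem__)


# Assumed encodings for two chart types (illustrative only).
bar_chart = ["position_common_scale", "length"]
bubble_chart = ["area", "color_hue"]

print(hardest_encoding(bar_chart))     # length
print(hardest_encoding(bubble_chart))  # color_hue
```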

For in-depth information on all of the figures discussed below, please see:

Zoss, Angela M. "Designing Public Visualizations of Library Data." In Data Visualization: A Guide to Visual Storytelling for Librarians, edited by Lauren Magnuson. Lanham, MD: Rowman & Littlefield Publishers, Inc., forthcoming.

 

Bar Chart

bar chart of publications

Bar charts are frequently used, and we're taught how to read them starting at a young age. The simplest bar charts, those that illustrate one string variable and one numeric variable, are easy to read because they rely on position along a common scale and on length. Additionally, bar charts are good for showing exact values.

Bar charts become difficult to read when the author has over-labeled or incorrectly labeled the chart.

Things to consider: 

- horizontal or vertical bars

- pay attention to the numerical axis of the chart (best to start at zero)

- order of bars (alphabetical, numerical, etc.)
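To see why the numerical axis matters, compare the ratio of two bar lengths as drawn with a zero baseline and with a truncated one. The sketch below uses made-up values and a hypothetical helper, `drawn_ratio`; it also shows one way to order bars by value.

```python
def drawn_ratio(a, b, baseline=0.0):
    """Ratio of the drawn length of bar b to bar a when the axis starts at baseline."""
    if min(a, b) <= baseline:
        raise ValueError("baseline must be below both values")
    return (b - baseline) / (a - baseline)


# Made-up values: 95 and 100 differ by about 5 percent.
print(round(drawn_ratio(95, 100), 2))       # 1.05 with the axis starting at zero
print(drawn_ratio(95, 100, baseline=90))    # 2.0 with the axis starting at 90

# Ordering bars by value (made-up publication counts).
counts = {"Biology": 42, "Chemistry": 17, "Physics": 29}
by_value = sorted(counts.items(), key=lambda item: item[1], reverse=True)
print(by_value)  # [('Biology', 42), ('Physics', 29), ('Chemistry', 17)]
```

With the axis starting at 90, the second bar is drawn twice as long as the first even though the underlying values are nearly equal.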

Line Chart

Line graphs are an excellent way to show change over time. Use a line graph when you have one time variable and one numeric variable. Bar charts can also show change over time, but they fail to show the continuity that a line graph can provide.

Things to consider:

- when there are too many lines the graph becomes difficult to read

- giving each line its own color forces the viewer to scan back and forth from the key to the graph

- it might be difficult to see where your data points are

- as with bar charts, it's best to start the y-axis at zero to avoid distorting the data

 

Pie Chart

 

Pie charts are best used with one string variable and one numeric variable. They show a part-to-whole relationship (when the total amount is one of your variables and you'd like to show how it is subdivided). Data visualization software will often assign a new color to each wedge of the chart.

Things to consider:

- the more categories (wedges) you have, the more difficult the pie chart becomes to read

- area is difficult for the eye to read: if many of the wedges are similarly sized, consider a different visualization that better illustrates your question

- 3D versions of pie charts are notorious for causing distortion. 2D, while visually less stimulating, is easier to read.
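Before settling on a pie chart, it can help to compute each wedge's angle and check for wedges that are close in size. This is a minimal sketch with made-up data; the helper names and the 5 percent `tolerance` are assumptions, not an established threshold.

```python
def wedge_angles(values):
    """Map each category to its wedge angle in degrees (all wedges sum to 360)."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("total must be positive")
    return {label: 360.0 * v / total for label, v in values.items()}


def similar_wedges(values, tolerance=0.05):
    """Return pairs of categories whose shares of the whole differ by less than tolerance."""
    total = sum(values.values())
    shares = {label: v / total for label, v in values.items()}
    labels = list(shares)
    return [
        (a, b)
        for i, a in enumerate(labels)
        for b in labels[i + 1:]
        if abs(shares[a] - shares[b]) < tolerance
    ]


# Made-up checkout counts by format.
checkouts = {"Books": 50, "Journals": 26, "Media": 24}
print(wedge_angles(checkouts))
print(similar_wedges(checkouts))  # [('Journals', 'Media')]
```

If `similar_wedges` returns many pairs, a bar chart may make the differences easier to see.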

Scatterplot

from: https://en.wikipedia.org/wiki/Anscombe%27s_quartet

Scatter plots are useful for precise, data-dense visualizations that show correlations and clusters between two numeric variables.

Things to consider:

- scatter plots are not commonly used and are therefore more difficult for most people to read

- large data sets do not work well because dots cover each other up 

- a bubble chart is a variation on the scatter plot. In a bubble chart, each dot is a different size, representing an additional variable. The area of a circle is often difficult for the eye to interpret.
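If you do use a bubble chart, a common convention is to make the area of each bubble, not its radius, proportional to the value, which means scaling the radius by the square root. The sketch below assumes a hypothetical helper, `bubble_radius`, and an arbitrary maximum radius of 20 units.

```python
import math


def bubble_radius(value, max_value, max_radius=20.0):
    """Radius that makes the bubble's area proportional to value."""
    if value < 0 or max_value <= 0:
        raise ValueError("value must be non-negative and max_value positive")
    return max_radius * math.sqrt(value / max_value)


r_small = bubble_radius(25, 100)
r_large = bubble_radius(100, 100)
print(r_small, r_large)  # 10.0 20.0

# The larger value is 4 times the smaller one, and so is the ratio of areas.
area_ratio = (math.pi * r_large ** 2) / (math.pi * r_small ** 2)
print(area_ratio)  # 4.0
```

Scaling the radius directly by the value instead would make the larger bubble's area 16 times the smaller one's, exaggerating the difference.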