Data Visualization: Design Considerations

Color

Tips:

- Use color meaningfully, i.e., only when it is needed to communicate something about the data.

- Choose the right type of color scheme for your data: categorical, diverging, or sequential.

- For categorical data, avoid using too many different colors - no more than 6 colors is best; 12 colors max.

- For sequential data, avoid rainbow palettes; use a scale that runs from white to a highly saturated color.

- Consider the format of your visualization: will it be displayed on a projector, printed, copied in grayscale, etc.?

- Be mindful of color vision deficiencies in your audience. There are tools to help you choose or test color schemes that remain accessible to people with color-deficient vision. You may also want to consider the cultural connotations of particular colors.
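As a rough way to check the grayscale tip above, the sketch below converts each color to an approximate gray level and flags pairs that land too close together. It uses the common Rec. 601 luma weights; the function names, the sample palette, and the `min_gap` threshold of 25 are illustrative assumptions, not established standards, and the check does not replace a proper color-deficiency simulator.

```python
def hex_to_rgb(color):
    """Convert a '#rrggbb' string to a tuple of ints in 0..255."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def gray_level(color):
    """Approximate gray value (0..255) using the Rec. 601 luma weights."""
    r, g, b = hex_to_rgb(color)
    return 0.299 * r + 0.587 * g + 0.114 * b


def hard_to_tell_apart_in_gray(colors, min_gap=25):
    """Return pairs of colors whose gray levels differ by less than min_gap.

    min_gap is an arbitrary illustrative threshold, not a standard.
    """
    pairs = []
    for i, first in enumerate(colors):
        for second in colors[i + 1:]:
            if abs(gray_level(first) - gray_level(second)) < min_gap:
                pairs.append((first, second))
    return pairs


# Hypothetical palette: pure red, a medium green, pure blue.
palette = ["#ff0000", "#008000", "#0000ff"]
print(hard_to_tell_apart_in_gray(palette))
```

In this hypothetical palette the red and green come out at nearly the same gray level, so they would be hard to tell apart on a grayscale photocopy even though they look very different on screen.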

Labels, Legends, and Other Chart Elements

Fonts

- Use sans-serif fonts; avoid all caps; and make sure the font size is large enough to be read in the intended format (print, screen, etc.).

Simplicity and Clarity

- Use clear language and avoid acronyms in your title, legend, and labels.

- Can you omit the legend and instead label bars or lines directly? If you have only one data category, no legend is needed.

- Can you omit gridlines and/or the box around the chart area? Or at least lighten their color so they don't detract from the data?

Shape and Size

- Think about the aspect ratio and what is most appropriate for your data, not just what fits on the page.

- "Banking to 45 degrees" - a theory that line charts may be more readable if their average slope is 45 degrees. This theory has been debated; however, it is likely still a good idea to aim for 45 degrees, unless there is good reason not to.
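One common way to put "banking to 45 degrees" into practice is to choose the plot's height-to-width ratio so that the median absolute slope of the line segments is drawn at 45 degrees. The sketch below assumes that median-based variant; the function name `banked_aspect_ratio` is hypothetical, and flat or vertical segments are simply skipped, which is a simplifying choice.

```python
from statistics import median


def banked_aspect_ratio(xs, ys):
    """Height/width ratio at which the median absolute segment slope is drawn at 45 degrees.

    Assumes the data fill the plot area and that at least one segment is
    neither flat nor vertical (otherwise median() raises StatisticsError).
    """
    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)
    slopes = [
        abs((y1 - y0) / (x1 - x0))
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
        if x1 != x0 and y1 != y0
    ]
    # On a plot of width w and height h, a data slope s is drawn with slope
    # s * (x_range / y_range) * (h / w). Setting the median drawn slope to 1
    # (45 degrees) and solving for h / w gives the expression below.
    return y_range / (x_range * median(slopes))


# A straight line banks to a square plot; a zigzag over a long x span
# banks to a short, wide plot.
print(banked_aspect_ratio([0, 1, 2, 3, 4], [0, 2, 4, 6, 8]))
print(banked_aspect_ratio([0, 1, 2, 3], [0, 1, 0, 1]))
```

The result is only a starting point: as the tip notes, there can be good reasons to depart from it.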

Data Visualization Dos and Don'ts

Scale

It is important to use consistent scale divisions when graphing data that involve continuous series.

[Figure: histograms showing unequal and equal horizontal scale divisions]

Example: If your data are grouped into specific spans of time, the spans should be equal. The histogram on the left has unequal divisions, while the histogram on the right has equal divisions.
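A minimal sketch of grouping values into equal spans follows. The function name `equal_width_bins` and the list of years are hypothetical; the point is that every bin is built from the same `width`, so no span can be wider than another.

```python
def equal_width_bins(values, start, width, count):
    """Count values in `count` consecutive bins of identical width.

    Bin i covers the half-open span [start + i*width, start + (i+1)*width).
    Values outside the covered range are ignored.
    """
    counts = [0] * count
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < count:
            counts[index] += 1
    edges = [start + i * width for i in range(count + 1)]
    return edges, counts


# Hypothetical event years grouped into three ten-year spans.
years = [1991, 1994, 1996, 2003, 2004, 2008, 2012, 2017, 2019]
edges, counts = equal_width_bins(years, start=1990, width=10, count=3)
print(edges)   # bin boundaries, all 10 years apart
print(counts)  # number of values in each span
```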

Axes

Vertical axes should generally begin at the origin (zero). Visualizations in which the vertical axis does not begin at the origin can give a misleading picture of the meaning of the data.

Example: The steep decline shown on this graph actually represents about a 5% change. It looks much greater because the vertical axis starts at a value of 2000.

Of course, if your data describe a small effect, you may need to use a scale that makes the effect visible; in that case, make it very clear that your vertical axis does not begin at the origin.
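To see how much a truncated axis can exaggerate a change, compare the change measured from zero with the change as drawn from the truncated baseline. The values below are hypothetical, chosen only to match the kind of case described above (a roughly 5% decline on an axis that starts at 2000); `relative_change` is an illustrative helper, not a standard function.

```python
def relative_change(start, end, baseline=0.0):
    """Change from start to end, relative to start's height above the baseline."""
    return (end - start) / (start - baseline)


# Hypothetical values: a drop from 2200 to 2090.
actual = relative_change(2200, 2090)                 # measured from zero
drawn = relative_change(2200, 2090, baseline=2000)   # as drawn on an axis starting at 2000

print(f"actual change: {actual:.0%}")
print(f"apparent change on truncated axis: {drawn:.0%}")
```

With these numbers, a decline of about 5% is drawn as if the value had fallen by more than half.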

Normalization

When comparing values, it's important to make sure that differences are not simply an artifact of different sample or population sizes.

[Figure: bar chart showing normalized and non-normalized data]

Example: The blue bars show total spending on K-12 education for several different states and the District of Columbia. The red bars show the same data as spending per student, and tell a different story with the population effect removed.
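The normalization step itself is a simple division. The sketch below uses made-up totals and enrollments for three hypothetical states (A, B, and C), not the figures behind the chart described above, to show how the ranking can reverse once population size is removed.

```python
def per_student(total_spending, enrollment):
    """Divide each state's total spending by its number of students."""
    return {state: total_spending[state] / enrollment[state] for state in total_spending}


# Hypothetical data: dollars of total spending and number of students.
total_spending = {"A": 50_000_000_000, "B": 10_000_000_000, "C": 2_000_000_000}
enrollment = {"A": 5_000_000, "B": 800_000, "C": 100_000}

normalized = per_student(total_spending, enrollment)

by_total = sorted(total_spending, key=total_spending.get, reverse=True)
by_student = sorted(normalized, key=normalized.get, reverse=True)
print(by_total)    # ranking by total spending
print(by_student)  # ranking by spending per student
```

Here state A spends the most in total but the least per student, which is exactly the kind of difference that raw totals hide.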

Stacked area charts

Stacked area or bar charts can be difficult to interpret.

Between 2001 and 2005 (the yellow box) the number of new urban multi-family housing units (the purple area) declined, but because the slope of the graph for that period is positive it looks as though it has increased.
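The effect can be reproduced with a few numbers. In the sketch below (hypothetical data, not the housing figures from the chart), the upper layer's own values fall every year, yet the edge that a stacked chart would draw for it rises, because that edge is the sum of the layer and everything beneath it. The function name `stacked_edges` is illustrative.

```python
def stacked_edges(layers):
    """Return the upper edge drawn for each layer when layers are stacked in order."""
    running = [0] * len(layers[0])
    edges = []
    for layer in layers:
        # Each layer sits on top of the running total of the layers below it.
        running = [below + value for below, value in zip(running, layer)]
        edges.append(running)
    return edges


# Hypothetical yearly counts for two categories.
bottom_layer = [100, 120, 140, 160, 180]  # growing category
top_layer = [50, 45, 40, 35, 30]          # declining category

edges = stacked_edges([bottom_layer, top_layer])
print(top_layer)   # the top category's own values: falling
print(edges[1])    # the line drawn for the top category: rising
```

A reader has to judge the top category by the thickness of its band rather than the slope of its upper edge, which is hard to do by eye.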

Stacked area or bar charts should be avoided unless they illustrate clear and easily visible trends: