Data Visualization: About

What is data visualization?

Data visualization is the presentation of data in a pictorial or graphical format. A well-designed figure can strongly affect how well research results are communicated. Data visualizations can include word clouds, bar charts, maps, or even simple tables.

According to Noah Iliinsky at IBM's Center for Advanced Visualization, a successful visualization:

  1. Has a clear purpose (why this visualization?)
  2. Includes only the relevant content (what are you visualizing?)
  3. Uses appropriate structure (how are you visualizing it?)
  4. Has useful formatting (everything else)

Read more about The Four Pillars of Visualization.

Why Visualize?

  • Visualizations reveal patterns in data. See Anscombe's Quartet for an example: the quartet consists of four datasets with nearly identical summary statistics that, when graphed as scatter plots, reveal four distinct patterns.
  • They help us make comparisons. Bar charts, grouped bar charts, and histograms are good examples of visualizations that allow for easy comparisons. A well-crafted visualization will enable you to quickly compare one variable (or a set of variables) against another.
  • They enable us to discover new information. When data is in raw or even cleaned tabular form, it is difficult or nearly impossible to spot new information, trends, or correlations.
  • They enable us to comprehend massive amounts of data. See the visualization of flight patterns in the US by Aaron Koblin, part of the Celestial Mechanics project at UCLA. 
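
The point about Anscombe's Quartet can be checked numerically. The sketch below is a minimal illustration in Python using only the standard library: it computes the mean, variance, and Pearson correlation for each of the four datasets. The data values are transcribed from the commonly published version of the quartet, and the names ANSCOMBE, pearson_r, and summarize are illustrative choices, not part of any library.

```python
from math import sqrt
from statistics import mean, variance

# Anscombe's Quartet as commonly published; datasets I-III share x values.
X_123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ANSCOMBE = {
    "I": (X_123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    mx, my = mean(xs), mean(ys)
    covariance_sum = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = sqrt(sum((y - my) ** 2 for y in ys))
    return covariance_sum / (spread_x * spread_y)


def summarize(xs, ys):
    """Return the summary statistics that are nearly identical across the quartet."""
    return {
        "mean_x": mean(xs),
        "mean_y": mean(ys),
        "var_x": variance(xs),  # sample variance
        "var_y": variance(ys),
        "r": pearson_r(xs, ys),
    }


summaries = {name: summarize(xs, ys) for name, (xs, ys) in ANSCOMBE.items()}

for name, stats in summaries.items():
    # Rounded to two decimals, the rows are almost indistinguishable.
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

The printed rows are nearly the same for all four datasets, which is why the differences only become obvious once each dataset is drawn as a scatter plot.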

Preparing Data for Visualization

Visualizing your data can reveal problems in a dataset that hasn't been cleaned properly, so that you don't build on assumptions that aren't warranted. Read Hadley Wickham's "Tidy Data" to learn more about cleaning your data.
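
One common preparation step is reshaping a "wide" table, where values of a variable are spread across column headers, into a layout with one row per observation and one column per variable. The following is a minimal sketch in Python using only the standard library. The survey data, the column names, and the to_tidy helper are all hypothetical and serve only to illustrate the idea.

```python
# Hypothetical "wide" table: the year, which is really a variable,
# is stored in the column headers instead of in its own column.
wide_rows = [
    {"site": "A", "2021": 12, "2022": 15},
    {"site": "B", "2021": 7, "2022": 9},
]


def to_tidy(rows, id_column, variable_name, value_name):
    """Reshape wide rows into one row per (id, variable) observation."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is repeated on every output row
            tidy.append({
                id_column: row[id_column],
                variable_name: int(column),  # headers here are years
                value_name: value,
            })
    return tidy


tidy_rows = to_tidy(wide_rows, id_column="site",
                    variable_name="year", value_name="count")

for row in tidy_rows:
    print(row)
```

In this layout each row describes a single observation, which is the shape most plotting tools expect when mapping columns to axes, colors, or groups.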