Visualizing data can be a powerful way to disseminate science and news, but creating effective data visualizations is both a science and an art. Just as well-designed figures can help viewers understand data patterns, poorly designed figures can create confusion and misunderstanding, undermining not only comprehension but also trust.
In this issue of Psychological Science in the Public Interest (Volume 22, Issue 3), Steven L. Franconeri, Lace M. Padilla, Priti Shah, Jeffrey M. Zacks, and Jessica Hullman review research-backed guidelines for creating effective and intuitive data visualizations for communicating data to students, policymakers, the general public, and other researchers.
Data visualization is an interdisciplinary matter
Research on data visualization takes place in several fields. Perceptual psychologists might study the mapping between colors and the values viewers extract from these colors. Cognitive psychologists might investigate how working memory limits how much information a viewer can extract from a figure. Educators might be interested in how to improve visualizations to help students extract concepts. Public-policy communicators and political scientists might research why some visualizations are more trustworthy or persuasive than others, and health-communication specialists might be interested in assessing how to effectively communicate the risk of medical procedures. Specialists in statistical cognition and communication could explore how to communicate uncertainty in election outcomes or hurricane paths, and computer and data scientists could study data types and algorithms along with taxonomies to design powerful visual displays and fluid user interactions. Besides researchers, practitioners with contextual experience can also provide powerful insights about the best ways to design data visualizations that are effective, intuitive, and engaging.
This interdisciplinarity suggests that data visualizations are ubiquitous, graphical literacy and effective design are important, and integration across different fields is necessary to both better understand data visualization and create more powerful visualizations. The fields of visual perception, graph comprehension, information utilization, data-based reasoning, uncertainty representation, and health-risk communication all “study similar questions and use complementary expertise and styles of inquiry, yet they too rarely connect,” write Franconeri and colleagues, who chose to “ignore artificial boundaries among these research fields, and instead integrate across them.”
By integrating research across fields, Franconeri and colleagues review three types of guidelines for creating effective and intuitive data visualizations:
- ensuring that viewers properly map visualized values to the right concepts in the world
- conveying uncertainty and risk (e.g., survival odds of a treatment)
- communicating clearly
The science of data visualization
The visual system can quickly extract broad statistics from a well-designed visual display. Understanding how the visual system works can improve data visualizations and help to avoid visual depictions that cause biases or illusions.
“Visualizations rely on several visual channels to transform numbers into images that the visual system can efficiently process … Knowing these channels allows a designer to consider which might be best suited for a given data set and context—particularly given that each is associated with differential levels of precision and potential illusions,” write Franconeri and colleagues. The authors rank five visual channels by how precisely viewers can typically state the relationship (ratio) between any two values. The ranked channels are position, length, area, angle, and intensity (e.g., color gradient). For each channel, they then describe common illusions that distort data, global statistics that viewers can extract quickly, tasks at which the visual system is slow (comparisons), and tools that make the channel more effective.
In their report, Franconeri and colleagues describe in depth the guidelines for communicating data using visualizations, provide examples of good and bad visualizations, and highlight tools and strategies to improve data visualizations. In their summary of key guidelines, they emphasize the most important things to consider when creating data visualizations:
- A viewer’s visual system can extract global statistics (e.g., means, medians) within a fraction of a second. Researchers should visualize their data with histograms and scatterplots before trusting statistical summaries.
- Consider common visual illusions and confusions. Choose axis limits deliberately: starting an axis at zero is not always the best option, because a poorly chosen range can mask relevant data patterns or create the illusion of patterns that do not reflect reality. Be cautious when mapping data to areas, which viewers tend to judge less precisely than positions or lengths; beware that slopes in line graphs can create perceptual distortions; use caution when mapping continuous numbers to different hues because it can exaggerate differences; and choose colors that are friendly to color-blind viewers.
- Comparisons between sets of values are slow. Use visual grouping cues that guide the viewer to make the comparisons of interest.
- Avoid taxing working memory. Transform legends into labels embedded in the figures and avoid distracting animations or text.
- Attempt to use visualizations that your audience is familiar with, and respect common associations (e.g., “up” and “darker” mean “more”).
- Use graph formats that guide viewers to the conceptual message you are trying to convey.
- When communicating uncertainty to a lay audience, avoid error bars and instead show examples of discrete values.
- When communicating risk to audiences who may have a lower ability to work with numbers and mathematics, rely on absolute instead of relative rates, and express probabilities as frequencies (e.g., 3 out of 10) instead of percentages (e.g., 30%).
- Even if you feel it is not needed, make sure to support comprehension, especially for audiences who may have low domain knowledge, numeracy, or working-memory capacity.
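The risk-communication guideline in the list above can be made concrete with a little arithmetic. The following is a minimal Python sketch, not anything taken from the report: the function names `as_frequency` and `describe_risk_change` and the example numbers (20 versus 10 cases per 1,000 people) are hypothetical and chosen only for illustration. It shows how the same change in risk reads as a relative rate and as an absolute rate, and how a probability can be phrased as a frequency.

```python
def as_frequency(probability, denominator=10):
    """Phrase a probability as 'k out of n', e.g. 0.3 -> '3 out of 10'.

    The count is rounded to the nearest whole number, so pick a
    denominator large enough that rounding does not hide the risk.
    """
    count = round(probability * denominator)
    return f"{count} out of {denominator}"


def describe_risk_change(baseline_events, treated_events, group_size):
    """Describe one change in risk in both absolute and relative terms.

    All inputs are hypothetical event counts within groups of equal size.
    """
    absolute_drop = baseline_events - treated_events
    relative_drop = absolute_drop / baseline_events
    return {
        "absolute": (
            f"{absolute_drop} fewer cases per {group_size} people "
            f"(from {baseline_events} to {treated_events} out of {group_size})"
        ),
        "relative": f"{relative_drop:.0%} lower risk",
    }


# Hypothetical example: 20 cases per 1,000 without treatment, 10 with it.
summary = describe_risk_change(20, 10, 1000)
print(summary["relative"])
print(summary["absolute"])
print(as_frequency(0.3))
```

In this made-up example the relative statement ("50% lower risk") sounds far more dramatic than the absolute one (10 fewer cases per 1,000 people), which illustrates why the guideline favors absolute rates for audiences with lower numeracy.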
In practice: Five principles for better data visualizations
In an accompanying commentary, Jonathan Schwabish (Urban Institute, Washington, DC), a creator of policy-relevant data visualizations, presents alternatives to the standard ways of visualizing and communicating data. He uses a data set from the National Center for Education Statistics (graphs made using Microsoft Excel) to illustrate these alternatives. His visualizations rely on five basic principles to improve the practice of data visualization. These principles are:
1. Show the data. Data are the most important part of the visualization and should be shown in the clearest way possible. Be strategic and purposeful about the data you show.
2. Reduce the clutter. Minimizing chart clutter (e.g., heavy grid lines, tick marks, certain labels, 3D effects) helps the reader to focus on the most important parts of the visualization.
3. Integrate graphics and text. Do not treat text and visualizations as two separate elements of your communication. For example, echoing Franconeri and colleagues, he recommends transforming legends into embedded labels and titles.
4. Use small multiples. Break up dense, complex graphs into smaller multiples (i.e., break up the information across multiple graphs that have similar axes, colors, and layouts).
5. Start everything with gray. This strategy makes everything in a visualization appear equal in importance at first and allows researchers to better decide where the reader should focus attention and then make changes accordingly (e.g., make certain bars a different color to make those values easier to find and read).
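The "start everything with gray" principle can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Schwabish's own workflow (his graphs were made in Microsoft Excel): the helper name `gray_first_colors`, the two hex colors, and the category labels are all hypothetical. Every element receives the same gray by default, and an accent color is applied only to the values the reader should notice.

```python
GRAY = "#b0b0b0"    # assumed neutral default for every element
ACCENT = "#d95f02"  # assumed highlight color for the values that matter


def gray_first_colors(labels, highlight=(), base=GRAY, accent=ACCENT):
    """Return one color per label: gray by default, accent if highlighted."""
    emphasized = set(highlight)
    return [accent if label in emphasized else base for label in labels]


# Hypothetical categories; only "Group C" should draw the reader's eye.
labels = ["Group A", "Group B", "Group C", "Group D"]
colors = gray_first_colors(labels, highlight=["Group C"])
print(colors)
```

The resulting list can be handed to any plotting tool that accepts one color per bar, for example the `color` argument of matplotlib's `Axes.bar`. Keeping the default gray makes each use of color a deliberate choice rather than a side effect of the tool's palette.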
Schwabish emphasizes that these strategies are secondary to the most important one: Know your audience. Knowing what your audience knows about the data, the concept, or statistics will help you to choose and craft more effective data visualizations.