Healy, Look at Data: What Makes Bad Figures Bad
In the introductory chapter of Data Visualization for Social Science, Healy seeks to outline organizing principles for effective data visualizations. He first states that he believes the relationship between data and the perceptual features of graphics is more important than esthetics, and then goes on to discuss a number of examples of misleading visualizations.
Healy says problems arise mainly in three ways: esthetic, perceptual and substantive. Esthetically good visualizations often maximize the ‘data to ink’ ratio with simple, clearly defined graphics, though he notes that novel visualizations such as Nigel Holmes’s illustrated ‘Monstrous Costs’ can be memorable. A second kind of problem stems from the nature of human visual perception, which can produce illusions, meaning visual channels must be chosen carefully. Finally, substantive problems in how data are presented, such as badly chosen aspect ratios and the lack of a zero point, can mislead, meaning good judgment and honesty are also vital.
In general, lacking sufficient contextual knowledge with which to agree or disagree with Healy’s points, I tended to naturally agree with him as an expert who effectively organized his arguments and buttressed them with evidence.
Regarding the two views of the rapid decline in law school enrollments, I would be curious to hear my classmates’ opinions on whether they agree with Healy’s acceptance of the first chart and, if so, how they think it might be improved to reflect the criticism to which it has been subject.
Bergstrom & West, Misleading axes on graphs
In their essay Misleading axes on graphs, Bergstrom & West look at ways in which bar charts and line graphs can be manipulated to change perceptions of the data being presented. They stress that data visualizations are created to tell stories, and that subtle choices by the author can change people’s perceptions dramatically. Bergstrom & West believe that bar charts, which depict the absolute magnitude of values, should be zeroed, whereas line graphs show change in the dependent variable and need not include zero.
They cite several examples to buttress their argument. Among them is a controversial chart of global temperature changes, made by Powerline and tweeted by the National Review, whose y axis goes down to zero. The visual impression is that there has been little change, whereas a revised Washington Post chart without the zero creates a much steeper line and the impression of rapid temperature increase. The disingenuous aspect of the Powerline graph is that they made graphical display choices that are inconsistent with the story they are telling.
Other problems with line graphs arise from changing the scale midway or inverting it. Misleading graphs are presented on controversial topics such as income inequality and gun deaths. The authors also stress that a line graph, in order to make sense, should have something numerical on each axis. They conclude by stressing that subtle design choices can have a big impact on the story that a figure tells, and that readers should be aware of possible bias.
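To make the effect of the y-axis range concrete for myself, I tried a rough sketch of how much of the vertical axis a given change occupies under two different axis choices. The temperatures and axis limits below are hypothetical values I picked for illustration, not the figures from either chart, and `fraction_of_axis` is my own helper name.

```python
def fraction_of_axis(change, axis_min, axis_max):
    """Share of the vertical axis taken up by a change of the given size."""
    return change / (axis_max - axis_min)

# Hypothetical numbers for illustration only (not from the readings).
change = 1.5  # rise in average temperature, in degrees

zeroed = fraction_of_axis(change, axis_min=0.0, axis_max=110.0)
zoomed = fraction_of_axis(change, axis_min=56.0, axis_max=59.0)

print(f"zero-based axis: change fills {zeroed:.1%} of the height")
print(f"zoomed axis:     change fills {zoomed:.1%} of the height")
```

The data are the same in both cases; only the axis range differs, which is why the same change can look either flat or steep.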
Here again I tended to agree with Bergstrom & West due to the convincing nature of their arguments and my lack of domain knowledge. However, I felt that their preference for a revised global temperature graph that emphasizes the change raised questions. Just as I accept their arguments on the basis of their expertise, they seem to favor the revised chart on the basis of the global scientific consensus on the reality of climate change.
I would like our class to discuss to what extent it’s acceptable to rely on experts in ‘biasing’ the design of a graph to make one or another point. For example, two centuries ago there was a global scientific consensus that linked skull dimensions to intelligence, creativity, morality and other human faculties in the later debunked field of phrenology.
Bergstrom & West, The Principle of Proportional Ink
Here the authors explore a basic rule for the design of data graphics, the principle of proportional ink. The rule is simple: when a shaded region is used to represent a numerical value, the area of that region should be proportional to the value. This rule derives from a more general principle that Edward Tufte described in his classic The Visual Display of Quantitative Information.
Initially, two examples of bar charts are presented: the first exaggerates the growth of non-farm employment in Tennessee by failing to include the zero on the y axis, while another showing book sales downplays differences by reaching into negative territory.
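As a quick check on the principle, here is a minimal sketch of how the bar baseline changes the ratio of ink between two bars. The values are hypothetical ones I made up, not the Tennessee employment or book sales figures, and `drawn_height` and `ink_ratio` are my own helper names.

```python
def drawn_height(value, baseline=0.0):
    """Height of a bar as drawn when the axis starts at `baseline`."""
    return value - baseline

def ink_ratio(a, b, baseline=0.0):
    """Ratio of the drawn heights (and so the ink) of two bars."""
    return drawn_height(a, baseline) / drawn_height(b, baseline)

# Hypothetical values for illustration only.
small, large = 100.0, 110.0

true_ratio = ink_ratio(large, small)                      # baseline at zero
truncated_ratio = ink_ratio(large, small, baseline=90.0)  # axis starts at 90
negative_ratio = ink_ratio(large, small, baseline=-100.0) # axis starts below zero

print(true_ratio, truncated_ratio, negative_ratio)
```

With a zero baseline the ink ratio matches the data ratio. Starting the axis above zero inflates the difference, and starting it below zero shrinks it, which matches the two kinds of distortion in the authors' examples.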
The authors then refer back to their argument that line charts needn’t be zeroed when not “filled.” When the area under the line is shaded, however, as in the example of a tax rate graph, the chart must be zeroed, because the coloring exaggerates differences.
Next they discuss bubble charts, whose power is that by using color and size as well as vertical and horizontal position, one can simultaneously encode four different attributes. The authors present an example by Hans Rosling, who popularized bubble charts. However, they warn that while bubble charts are data-rich, they offer easy comparisons only for the values on the x and y axes. They can also be misleading due to the nature of size perception with circles, or even misused, as in another example cited.
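One way a bubble chart can go wrong under the proportional ink rule is by scaling the radius, rather than the area, to the value. The sketch below is my own illustration with hypothetical values, and the function names are assumptions I introduced; it is not taken from the authors' examples.

```python
import math

def circle_area(radius):
    """Area of a circle with the given radius."""
    return math.pi * radius ** 2

def radius_for_value(value):
    """Radius that makes the circle's area equal to the value."""
    return math.sqrt(value / math.pi)

# Hypothetical values: the second is four times the first.
small, large = 1.0, 4.0

# Radius proportional to value: the ink grows with the square of the value.
ratio_radius_scaled = circle_area(large) / circle_area(small)

# Area proportional to value: the ink matches the data.
ratio_area_scaled = (circle_area(radius_for_value(large))
                     / circle_area(radius_for_value(small)))

print(ratio_radius_scaled, ratio_area_scaled)
```

When the radius is scaled to the value, a fourfold difference in the data is drawn as a sixteenfold difference in ink. When the area is scaled to the value, the ratio stays at four.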
Another type of graph that easily violates the principle of proportional ink is the donut graph, which either emphasizes or downplays differences depending on a band’s distance from the center.
Another complex “filled” graph that poses problems is one showing cause of death relative to age in a Time magazine article. This graph confuses by not taking into account total deaths for each age group. But the authors suggest that in cases like this it may be incumbent on the reader not to make mistakes of interpretation.
Three-dimensional charts can also lead to mistakes of interpretation, and 2D charts should be used when there is only one dependent variable. The trendiness of newfangled 3D charts is cited as one reason for their popularity.
The use of perspective in 3D charts also makes it substantially harder for a viewer to assess the relative sizes of the chart elements, and often violates the principle of proportional ink.
Pie charts are also problematic because relative areas are difficult to assess visually in angular form, with the authors arguing that 3D pie charts are pure visual bullshit. The use of perspective in 3D charts and the phenomenon of foreshortening make misperception even worse.
I tended to agree with the authors, as the examples they presented were convincing in the sense that when the ink is clearly out of proportion to the data, the effect is misleading. But some of these cases seemed more egregious than others. I’m curious what others made of this, and I thought it would be interesting to see comparisons of 3D charts with 2D charts of the same data, to sense to what extent viewers are being deceived.