Lee Kuczewski

What you see is not always what you get.

I must admit, Kieran Healy is a decent salesman. We begin the chapter “Look at data” at the top of his sales funnel with the exclamation that some data visualizations are better than others -- and before long we’re deep in his guidebook for learning the R programming language and ggplot. Well done.

For those who are fortunate enough to have sight and the necessary mental faculties, it’s reasonable to suggest that visualizations fall on a spectrum of quality that runs from brilliant to complete rubbish. Even between these extremes we run the risk of dwelling in the misleading middle, according to Healy:

“In your everyday work you will be in little danger of producing either a “Monstrous Costs” or a “Napoleon’s Retreat”. You are much more likely to make a good-looking, well-designed figure that misleads people because you have used it to display some bad data.”

Healy explores why visualizations differ in quality by deconstructing them through the lenses of aesthetics, substance, and the fallibility of human perception.

We’re taken on a journey from Edward Tufte’s foundational ideas of chartjunk and the data-to-ink ratio to poorly executed 3D charts, “pop out”, and optical illusions. We’re expected to be struck by the research of Bateman et al. (2010) and Anderson et al. (2011), which suggest, respectively, that highly embellished charts are more easily recalled than their simplified counterparts, and that Tufte’s own most minimal boxplot, the design that maximizes the data-to-ink ratio, was the hardest to interpret. Frankly, that is not surprising considering the fallibility of human perception.

The research on human perception and vision science (Ware, Munzner, Adelson, Bach) gives us a baseline for how we may be deceived by contrast, brightness, and edge detection, to name a few. What you see is not always what you get, and our visual system and its pattern-matching wetware can be complex and contradictory. The gestalt rules, whereby we search for structure in random places and sometimes fill in things that do not exist, speak to the complexity of observation. In the end, it matters a great deal who the audience is. Our job is not merely to share meaningful insights through data, but to consider the variables that may lead our audience astray, whether or not the audience is conscious of them.

Questions:

Considering Healy’s research interests in the effect of quantification on the emergence and stabilization of social categories, data visualization seems like an ideal playground for working out his research. How does this align with his writing a guidebook for learning R?
Imagine for a moment that you have lost your sight. How would you interpret Healy’s arguments?