R3: Graphesis Chapter 1

In the first chapter of Graphesis, Johanna Drucker explores how people's perception and use of graphics have evolved throughout history. She states early in the passage that "images have a history, but so do concepts of vision and these are embedded in the attitudes of their times and cultures as assumptions guiding the production and use of images for scientific or humanistic knowledge." This resonated with me: at each stage of humanity's pursuit of knowledge, different beliefs about and approaches to visual graphics built on one another, leading to our current embrace of visual forms of knowledge.

It was interesting (and also unsurprising) to read that graphical representations were first thought of as somewhat unreliable, and inconsistent in their meaning and value. Yet graphics have encoded knowledge for centuries and have enabled communication in ways that pure numbers or words never could. Architecture, physiognomy, evolutionary biology, and medicine all incorporate graphics as an indispensable form of knowledge representation. As graphics became widely accepted as such, their study transformed into a discipline with formal rules and methods, developed through diverse efforts: from the systematization of design for industry, to semiotic approaches to graphical systems, to computational processing that breaks graphics down into primitives and automates their production.

One figure that stood out to me during the reading was David Marr, who created one of the first models for computer vision by investigating how humans represent and process visual information. He treated vision as an information processing system and described it as such. He put forth the Tri-Level Hypothesis, which states that for any information processing system to be understood completely, it must be described at three levels of analysis: the computational level, which describes what the system does and what problem it solves; the algorithmic/representational level, which describes the steps the system goes through to solve the problem (what representations and processes it uses); and the implementational/physical level, which describes how the system is physically realized.

He also put forth a model of vision that begins with visual primitives and ends with a sophisticated 3D output. His model states that visual processing starts as a primal sketch, a 2D mapping on the retina that captures edges, blobs, bars, ends, virtual zero crossings, curves, and boundaries. It then moves to a 2.5D sketch, which additionally detects texture and depth, and finally to a 3D model organized in terms of surface and volumetric primitives. In his book, he goes through how each visual primitive can be computationally detected, using the human visual system as his model.
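To make the idea of computationally detecting a visual primitive more concrete, here is a minimal sketch of edge detection by zero crossings, in the spirit of the approach associated with Marr's work (smooth the signal, take a second derivative, and look for sign changes). It is simplified to a single 1D row of brightness values rather than a full 2D image, and the function names (`gaussian_kernel`, `smooth`, `second_difference`, `zero_crossings`, `detect_edges`) as well as the `sigma` and `radius` defaults are my own illustrative choices, not anything taken from the book.

```python
import math


def gaussian_kernel(sigma, radius):
    """Return a normalized 1D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, kernel):
    """Convolve the signal with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out


def second_difference(signal):
    """Discrete second derivative; the two endpoints are left at 0."""
    out = [0.0] * len(signal)
    for i in range(1, len(signal) - 1):
        out[i] = signal[i - 1] - 2 * signal[i] + signal[i + 1]
    return out


def zero_crossings(values, eps=1e-9):
    """Indices i where the values change sign between i and i + 1."""
    return [i for i in range(len(values) - 1)
            if (values[i] > eps and values[i + 1] < -eps)
            or (values[i] < -eps and values[i + 1] > eps)]


def detect_edges(signal, sigma=1.0, radius=3):
    """Smooth, differentiate twice, and report sign changes as edges."""
    kernel = gaussian_kernel(sigma, radius)
    return zero_crossings(second_difference(smooth(signal, kernel)))


# A dark region (0.0) followed by a bright region (1.0): one step edge.
row = [0.0] * 10 + [1.0] * 10
edges = detect_edges(row)
print(edges)  # → [9], the edge lies between positions 9 and 10
```

The smoothing step matters because a raw second derivative reacts to every small fluctuation; blurring first means only brightness changes at the chosen scale produce a sign change. A real image would need the 2D version of the same idea.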

In visual neuroscience, we learn how the visual system, and specifically the retina, computes directional selectivity, edge boundaries, and orientation. In computer vision, we have built machine learning algorithms inspired by how the visual system works. David Marr's work integrates both: it led the way in thinking about visual processing computationally and opened up the field for computational scientists to explore computer vision.

PDF of his book Vision: http://lolita.unice.fr/~scheer/cogsci/Marr 82 - Vision.pdf
