Plato's Cave: Hierarchy of Representation

The Hierarchy of Representation

Information theory suggests a hierarchy of representation in the visual system based on information compression. I therefore propose the hierarchy shown below.

The lowest level of the hierarchy is the filled-in brightness percept, which corresponds to your view of actual surfaces in the world. The next level up is an abstraction of that view in the form of a contrast-sensitive edge representation. The level above that is a further abstraction, in the form of a contrast-insensitive edge representation. Higher levels perform still further abstraction: for example, edges can be abstracted to the corners between them, and the corners of a triangle can be abstracted to a triangular symmetry about the center of the triangle, and so on. In this scheme each higher level encodes only the changes in the level below it, resulting in a compressed representation at the top, while the top-down reification operation corresponds to image decompression or reconstruction.
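The compression idea in the paragraph above can be illustrated with a one-dimensional toy example. This is only a sketch under simplifying assumptions: the brightness profile is a short list of numbers, "encoding the changes" is taken to mean differencing neighbouring samples, and the function names are hypothetical labels chosen for illustration rather than part of the model itself.

```python
# Illustrative 1-D sketch; all names here are hypothetical.

def abstract_edges(brightness):
    """Bottom-up: keep only the changes between neighbouring samples,
    giving a signed (contrast-sensitive) edge signal."""
    return [b - a for a, b in zip(brightness, brightness[1:])]

def abstract_polarity(edges):
    """Bottom-up again: discard contrast polarity and keep only edge
    magnitude, giving a contrast-insensitive edge signal."""
    return [abs(e) for e in edges]

def reify_brightness(edges, anchor):
    """Top-down: rebuild the brightness profile by accumulating the
    signed edges, starting from one known brightness value."""
    profile = [anchor]
    for e in edges:
        profile.append(profile[-1] + e)
    return profile

brightness = [2, 2, 2, 7, 7, 7, 3, 3]       # three uniform surfaces
edges = abstract_edges(brightness)           # mostly zeros, nonzero only at the two steps
boundaries = abstract_polarity(edges)        # same locations, sign removed
restored = reify_brightness(edges, brightness[0])
```

The edge signal is zero everywhere except at the two steps, which is the sense in which the higher level is a compressed code, and accumulating it from a single anchor value recovers the original profile. The contrast-insensitive signal alone cannot do this, since the direction of each step has been discarded.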

Spatial interactions occur within each layer of the hierarchy, for example collinear boundary completion in the contrast-insensitive layer and surface brightness filling-in in the brightness percept layer. At the same time, bottom-up abstraction and top-down reification transformations occur between the levels. The entire system therefore responds to a visual input by relaxing into the global state that is most consistent with the local interactions active simultaneously at all levels of the hierarchy.
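The notion of relaxing into a globally consistent state can be sketched with a simple iterative scheme. The code below is an assumption-laden toy, not the model's actual dynamics: it treats a one-dimensional signed edge signal as a set of local constraints on the brightness layer, and repeatedly nudges each brightness sample toward what its immediate neighbours imply. The function name and the number of sweeps are hypothetical choices.

```python
# Toy relaxation sketch; names and parameters are hypothetical.

def relax_brightness(edges, anchor, sweeps=2000):
    """Find a brightness profile consistent with a signed edge signal by
    purely local updates. edges[i] is the required step from sample i to
    sample i + 1; sample 0 is pinned to the anchor value."""
    n = len(edges) + 1
    b = [float(anchor)] * n
    for _ in range(sweeps):
        for i in range(1, n):
            # What the left neighbour implies this sample should be.
            estimates = [b[i - 1] + edges[i - 1]]
            if i + 1 < n:
                # What the right neighbour implies this sample should be.
                estimates.append(b[i + 1] - edges[i])
            b[i] = sum(estimates) / len(estimates)
    return b

edges = [0, 0, 5, 0, 0, -4, 0]
settled = relax_brightness(edges, anchor=2.0)
```

No sample ever looks beyond its nearest neighbours, yet the profile settles to the one state in which every local edge constraint is satisfied at once, which in this example is three flat surfaces at brightness 2, 7, and 3.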

I propose that perception, or consciousness of visual form, occurs simultaneously at all levels of the hierarchy (as opposed to only at the highest levels, as suggested by the "Cartesian Theatre" concept), although the nature of the percept depends on the level or levels where it occurs. For example, the Kanizsa square shown below is perceived as a brightness percept as well as a collinearity percept, so the illusory edges of the square are represented in the surface brightness percept layer and in both the contrast-sensitive and contrast-insensitive edge layers.

The corner grouping percept shown below, on the other hand, does not create an illusory brightness percept, although it does stimulate a percept of collinearity, so the illusory edges of this figure would be represented only in the contrast-insensitive edge representation.

The level of abstraction does not necessarily correspond to the level in the visual system. Indeed, the retinal image is itself a contrast-sensitive edge representation rather than one of surface brightness. I therefore propose that the retinal image feeds into the contrast-sensitive edge layer of the hierarchy, from which the retinal signal is both abstracted upwards and reified downwards by FCS diffusion, as suggested below to the left. Subsequently, the collinear grouping operation occurs in the contrast-insensitive layer (due to BCS, Directed Diffusion, or Orientational Harmonic processing), and the resulting illusory edges are propagated back downwards, where they influence the diffusion of the brightness percept in the lowest layer.
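The way a boundary handed down from the edge layers can shape brightness filling-in may be easier to see in a small simulation. The sketch below is a generic barrier-gated diffusion in one dimension, written under stated assumptions: a few cells carry a fixed brightness supplied by the input, every other cell repeatedly takes the average of its neighbours, and diffusion is blocked wherever a boundary lies between two cells. It is not an implementation of the FCS or BCS; the function name, the seed and barrier encoding, and the sweep count are all hypothetical.

```python
# Barrier-gated diffusion sketch; names and data layout are hypothetical.

def fill_in(seeds, barriers, sweeps=500):
    """seeds[i] is a fixed brightness for cell i, or None if the cell
    must be filled in. barriers[i] is True when a boundary separates
    cell i from cell i + 1, which blocks diffusion between them."""
    n = len(seeds)
    b = [0.0 if s is None else float(s) for s in seeds]
    for _ in range(sweeps):
        for i in range(n):
            if seeds[i] is not None:
                continue  # input-driven cells stay clamped
            neighbours = []
            if i > 0 and not barriers[i - 1]:
                neighbours.append(b[i - 1])
            if i + 1 < n and not barriers[i]:
                neighbours.append(b[i + 1])
            if neighbours:
                b[i] = sum(neighbours) / len(neighbours)
    return b

seeds = [1.0, None, None, None, None, 5.0]

# No boundary: brightness blends smoothly between the two seeds.
open_field = fill_in(seeds, [False] * 5)

# A boundary (for example an illusory edge completed in a higher layer)
# between cells 2 and 3 splits the field into two uniform surfaces.
with_edge = fill_in(seeds, [False, False, True, False, False])
```

With no boundary the filled-in values form a gradual ramp from 1 to 5. Adding a single boundary changes the result to two flat regions, one at each seed value, even though the input seeds are identical in both runs.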

This model explains why we perceive surface brightness even though the retinal image is an edge representation. It also explains why the spatial resolution of the primary visual cortex is about double that of the retina: the cortical information is filled in and refined from the coarser retinal input by reification, resulting in the phenomenon of hyperacuity.


Return to Steve Lehar