Plato's Cave: Biederman's Geon Theory

Biederman's Geon Theory

Biederman notes that certain properties of visual features remain invariant under perspective transformation through small angles. For example, a straight edge appears straight and a curved edge appears curved through a wide range of rotations of the object, although the exact angle or curvature of that edge changes with rotation. Biederman thus proposes the Geon Theory, a representation of visual form in terms of these relatively invariant features.

A simplified example of this concept is illustrated below, where each object is encoded as a four-digit code. The first digit encodes whether the edges are straight or curved; the second encodes whether the object has reflection symmetry, rotation symmetry, or both; the third encodes how the shape of the object changes with distance along its central axis; and the fourth encodes whether that central axis is straight or curved.
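The four-digit code described above can be sketched as a small data structure. This is only an illustration, not Biederman's own notation: the class names, the numeric value assigned to each option, and the three options listed for the third digit (constant, expanding, expanding and contracting) are assumptions made here for the sake of a runnable example.

```python
from dataclasses import dataclass
from enum import IntEnum


# NOTE: all numeric values below are arbitrary, assumed assignments.
class Edge(IntEnum):
    STRAIGHT = 0
    CURVED = 1


class Symmetry(IntEnum):
    REFLECTION = 0
    ROTATION = 1
    BOTH = 2


class Profile(IntEnum):
    # How the shape changes along the central axis (assumed option set).
    CONSTANT = 0
    EXPANDING = 1
    EXPANDING_AND_CONTRACTING = 2


class Axis(IntEnum):
    STRAIGHT = 0
    CURVED = 1


@dataclass(frozen=True)
class GeonCode:
    edge: Edge
    symmetry: Symmetry
    profile: Profile
    axis: Axis

    def digits(self) -> str:
        """Return the four attributes as a four-character digit string."""
        return (
            f"{self.edge.value}{self.symmetry.value}"
            f"{self.profile.value}{self.axis.value}"
        )


# A cylinder under these assumed conventions: curved edges, both symmetries,
# constant shape along a straight axis.
cylinder = GeonCode(Edge.CURVED, Symmetry.BOTH, Profile.CONSTANT, Axis.STRAIGHT)
# A cone differs only in the third digit.
cone = GeonCode(Edge.CURVED, Symmetry.BOTH, Profile.EXPANDING, Axis.STRAIGHT)

print(cylinder.digits())  # 1200
print(cone.digits())      # 1210
```

Note that nothing in the code records the viewing angle: the same digits describe the object across the range of rotations over which the four properties stay unchanged.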

Relations between objects can be encoded in a similar manner, allowing a complex object to be decomposed into its component pieces and represented by a compressed code. For example, the teapot shown above would be encoded by its three component objects (handle, body, spout) and the relations between them as... These properties would remain relatively invariant through small rotations of the object.
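The idea of a compressed description built from parts and relations can be sketched as follows. The actual teapot encoding is not given in the text, and the values below are not a reconstruction of it: the four-digit part codes are placeholders, and the relation label `side-of`, the class names, and the output format are all hypothetical choices made for this illustration.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    part_a: str
    relation: str  # hypothetical relation label
    part_b: str


class StructuralDescription(NamedTuple):
    parts: dict      # part name -> four-digit code (placeholder values)
    relations: list  # list of Relation tuples

    def compressed(self) -> str:
        """Join part codes and relations into one short string."""
        part_str = ",".join(f"{name}:{code}" for name, code in self.parts.items())
        rel_str = ",".join(
            f"{r.part_a}-{r.relation}-{r.part_b}" for r in self.relations
        )
        return f"{part_str}|{rel_str}"


# Placeholder codes only; they do not claim to be the real teapot encoding.
teapot = StructuralDescription(
    parts={"handle": "1201", "body": "1220", "spout": "1211"},
    relations=[
        Relation("handle", "side-of", "body"),
        Relation("spout", "side-of", "body"),
    ],
)

print(teapot.compressed())
# handle:1201,body:1220,spout:1211|handle-side-of-body,spout-side-of-body
```

The whole object reduces to three short codes and two relations, which is the sense in which the description is compressed, and also why it carries none of the surface detail of the object itself.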

This scheme has several problems. First, nobody has ever devised a system that can perform this encoding on a natural scene. Second, even if such a system existed, certain natural shapes such as trees, shrubs, grass, hair, and rocks cannot be expressed in this manner because they are composed of far too many components. Finally, the highly compressed abstract code is nothing like the subjective percept of a teapot, which appears complete in all of its curves and surfaces. Again, in this scheme the three-dimensional information is the last to be computed, and is computed only in the most abstract manner. Evidence from visual illusions suggests instead that the three-dimensional percept is the first to be computed, and that the percept is not an abstraction but is filled in with perceptual surfaces and boundaries in a three-dimensional form.

