Plato's Cave: Marr's Vision
Marr's Vision
Marr proposed a model of visual processing that proceeds in three
stages. It first identifies the "zero-crossings" in the filtered
image, which mark edges. It then uses this edge information to
produce a crude segmentation of surfaces, called the 2-1/2-D sketch.
Finally, it extracts the three-dimensional spatial information from
this sketch. That spatial interpretation is expressed in terms of
geometrical primitives such as generalized cylinders or cones, so the
only data that must be explicitly stored are the x, y, z locations,
the alpha, beta, gamma orientations, the aspect ratios, etc. of each
cylinder, together with a symbolic code of the relations between
them. The complex scene is thus reduced to a highly compressed set of
meaningful numbers.
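To make the idea of a compressed scene description concrete, here is a
minimal sketch of what such a record might look like. It is only an
illustration, not Marr's own formulation: the class names, field
names, the example parts, and the relation label "attached-to" are all
hypothetical, and the "etc." in the parameter list above is left
unfilled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class GeneralizedCylinder:
    """One geometrical primitive (hypothetical layout, for illustration)."""
    x: float              # location
    y: float
    z: float
    alpha: float          # orientation angles
    beta: float
    gamma: float
    aspect_ratio: float   # assumed here to mean length relative to width


@dataclass
class SceneDescription:
    """A scene reduced to primitives plus symbolic relations between them."""
    cylinders: Dict[str, GeneralizedCylinder] = field(default_factory=dict)
    # Each relation is (part, relation label, part); the labels are assumptions.
    relations: List[Tuple[str, str, str]] = field(default_factory=list)

    def stored_numbers(self) -> int:
        # Seven numeric parameters are kept per primitive in this sketch.
        return 7 * len(self.cylinders)


# Example values are invented purely to show the shape of the data.
scene = SceneDescription()
scene.cylinders["torso"] = GeneralizedCylinder(0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 3.0)
scene.cylinders["arm"] = GeneralizedCylinder(0.3, 0.0, 1.3, 0.0, 1.2, 0.0, 6.0)
scene.relations.append(("arm", "attached-to", "torso"))

print(scene.stored_numbers())
```

In this sketch a scene of two primitives is held as fourteen numbers
plus one symbolic relation, which is the sense in which the
representation is highly compressed.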
Notice that in this model the three-dimensional spatial information is
the last stage of processing.
The problem with this model is that nobody has ever been able to
define how this spatial information can be reliably extracted from the
scene. Again, the visual world contains far too much ambiguity to be
handled successfully in this manner.