Plato's Cave: Evidence from Visual Illusions

Visual Illusions

Consider the Kanizsa square, as shown in Figure 3 (A).

Figure 3 (A)

In this figure an illusory square surface is seen to occlude four black circles, which in turn appear to be amodally completed behind the square. There is a revealing difference between the square percept of Figure 3 (A) and that of Figure 3 (B)

Figure 3 (B)

which contains the same square information and stimulates a similar recognition of that square relationship. Yet Figure 3 (A) creates, in addition, the percept of a solid square surface that appears to hover in depth some distance above the page, with an actual brightness edge seen between the white illusory square and the white page behind it. The difference between these two figures exemplifies the role of the spatial integration stage of visual processing, which goes beyond the mere recognition of geometrical forms to an actual "filling in" of the recognized feature in both color and depth, creating a percept that is virtually indistinguishable from, or at least expressed in the same spatial language as, a real brightness and depth edge. This is what I have called reification: a materialization or reconstruction of a higher level feature in the lower level representation, at the highest resolution available in the system. The fact that the very solid and real-looking surface in Figure 3 (A) is actually illusory is direct evidence that the visual system is capable of constructing spatial representations that appear solid and real and yet are illusory. In essence, this is the same kind of perceptual processing as is seen in the example of the tennis player, discussed above, where the solid surface of the tennis court is interpolated into the blind region outside the player's visual field, just as the illusory square is interpolated through a region of missing information between the inducing pac-man figures. The fact that the illusory square blends in with the features of the "real world" around it suggests that the illusory percept of the square and the "real" percept of the page behind it are constructed of the same "perceptual stuff".
In other words the solid three-dimensional page that you hold in your hands is in some sense just as illusory as the square in Figure 3 (A), except that it is supported by more visual and somatosensory evidence, creating a more vivid and compelling percept, just as the square in Figure 3 (A) is more vivid and compelling than that in Figure 3 (B), due to more supporting evidence.

Figure 3 (C)

Figure 3 (C) shows another example of visual reification, where the two-dimensional pattern of lines in a Necker cube does not simply register cognitively as possibly representing a three-dimensional structure, but actually pops into a full three-dimensional percept so compellingly that it is almost impossible to perceive the original flat pattern in the absence of the spatial interpretation. Cognitively, the spatial interpretation of the Necker cube is ambiguous, and can be seen in two equally valid alternative ways. In the resulting spatial percept, however, the perceptual system pops from one to another in a bistable manner, because the perception of one alternative excludes the other. This exemplifies the difference between the higher level invariant abstraction of "a cube", and the lower level variant reification of a particular cube at a particular orientation and location. The invariant representation is important in order to generalize a multitude of possible figures as the same invariant cubical form, whereas the variant representation is essential for physical interaction with this particular cube. Notice how the spatial percept of the Necker cube does not just select between the two alternative interpretations, or simply show which lines are in front and which are behind; in this representation, every point on every line is assigned an exact three-dimensional location at the highest possible resolution. Furthermore, every point on every surface defined by those lines is also assigned an exact three-dimensional location. Consider the small dot in Figure 3 (C). Although this dot carries no explicit information of depth, it is most easily perceived as being located on one of the surfaces created by the percept, i.e. the top surface of the cube, or a vertical side surface, or perhaps on the surface of the paper behind the cube.
In each of these alternative interpretations the dot takes on a very specific spatial location defined by the spatial surfaces in the percept. The purpose of this spatial representation therefore appears to be to locate visual features relative to one another in a fully spatial context, in order to choose between alternative interpretations of the visual input.

The selection between alternative interpretations is automatic and preattentive, inaccessible to conscious analysis, suggesting a low level phenomenon. Furthermore, the operation appears to occur in parallel, because the speed of such perceptual selection does not depend on the visual complexity of the scene, but occurs just as quickly for simple stick figures such as Figure 3 (C) as it does for the spatial interpretation of two-dimensional photographs, even of complex scenes containing thousands of visual edges. The selection of a particular interpretation, although apparently a low level process, is nevertheless not a simple feedforward operation that is presented to the higher cognitive stage as a fait accompli; rather, it is one state of a multistable system, which can be pushed into another state at any time on the basis of higher level cognitive influences. For example, it is possible to induce the percept of the dot in Figure 3 (C) to appear at either a nearer or farther surface of the figure by simply willing it to be there, although the dot resists taking up intermediate positions between those surfaces. This is fundamentally different from the feedforward algorithms often used for image processing and pattern recognition. In such sequential algorithms there is no opportunity to correct errors that occur early in the processing stream in the local interpretation of visual features, with the result that errors propagate forward, producing absurd results in the global interpretation. Instead, the perceptual system appears to be a single complex system of forces in balance with one another, such that alteration of any part of the representation, at either a low or a high level, can potentially influence every other part of the system, and in the case of a multistable percept such as the Necker cube, the smallest influence is capable of making the system cascade into a totally different state.
For example, the introduction of two tiny breaks in two lines in the figure, as shown in Figure 3 (D),

Figure 3 (D)

is sufficient to anchor the entire spatial percept permanently in one of the two alternative states. This principle surely must extend to the interaction between different perceptual modalities, such as vision and hearing, as well as somatosensory and proprioceptive perception, so that any perceptual ambiguity in any one modality will be influenced by evidence from another modality, resulting in a spatial interpretation that is maximally consistent with all modalities simultaneously. This property of the perceptual system would account for the subjective impression of a single unified percept of the world.
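
The multistable behavior described above can be illustrated with a toy dynamical system. The sketch below is only an analogy, not a model proposed in this text: it assumes two units, one per interpretation of the Necker cube, that inhibit each other and relax toward equilibrium. The function name `settle` and all parameters (`gain`, `drive`, `inhibition`, the bias terms) are hypothetical choices made for the illustration. With no bias the system sits at an unstable balance point; a minute bias, standing in for the tiny breaks of Figure 3 (D), tips it fully into one state; and a sufficiently strong top-down bias, standing in for "willing" the alternative, pushes it into the other state, where it remains after the bias is removed.

```python
import math


def settle(bias_a=0.0, bias_b=0.0, a=0.5, b=0.5,
           gain=4.0, drive=1.0, inhibition=2.0, dt=0.05, steps=2000):
    """Relax two mutually inhibiting units and return their final activity.

    a, b           -- starting activity of the two rival interpretations
    bias_a, bias_b -- extra evidence (or top-down push) for each one
    All parameter values are illustrative assumptions, not measured data.
    """
    def squash(x):
        # Logistic function keeps activity between 0 and 1.
        return 1.0 / (1.0 + math.exp(-x))

    for _ in range(steps):
        # Each unit decays toward a target that is lowered by its rival.
        da = -a + squash(gain * (drive + bias_a - inhibition * b))
        db = -b + squash(gain * (drive + bias_b - inhibition * a))
        # Update both units together (simple Euler step).
        a, b = a + dt * da, b + dt * db
    return a, b


# Perfect ambiguity: the two interpretations stay exactly balanced.
balanced = settle()

# A minute bias is enough to tip the whole system into interpretation A.
anchored = settle(bias_a=1e-6)

# Starting from A, a strong top-down push toward B flips the state...
pushed = settle(a=anchored[0], b=anchored[1], bias_b=1.5)

# ...and the system stays in B after the push is withdrawn.
released = settle(a=pushed[0], b=pushed[1])

print(balanced, anchored, released)
```

Unlike a feedforward pipeline, nothing here is computed once and passed on: both units are updated together until the whole system settles, so an influence applied at any point can reorganize the final state. The units also never come to rest at intermediate mixtures of the two interpretations, which loosely parallels the way the dot resists intermediate depths.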

More Illusions

(more to come...)
