Plato's Cave: A New Gestalt Model

Need for a New Approach

The failure of the conventional approach to visual perception to address the Gestalt issues of perception suggests that the current concepts of neural computation are inadequate, and that novel principles and mechanisms of perceptual computation remain to be discovered.

Perceptual Modeling

How are we to model perception if our understanding of the mechanism of neural computation is incomplete? One way is to follow the example of the Gestalt theorists: to study the percept itself and model its properties as it appears. This is perceptual modeling, as opposed to neural modeling. Models derived in this way are independent of the neural mechanism by which perception is subserved in the brain. This approach will either converge with known physiology or, if not, it will provide evidence as to where our knowledge of neurophysiology must be extended to explain specific properties of the percept.

An Evolutionary Progression

What follows is an evolutionary progression through a series of models, beginning with one that reflects the properties of the collinear percept observed in the Kanizsa figure. That model is then progressively extended and generalized to account for ever more properties observed in different perceptual phenomena, while taking care not to lose the behavior evolved in previous stages.

At each stage in the progression you will find a simplistic model that neatly explains some aspects of visual perception, while leaving many fundamental issues unresolved. I ask your indulgence during this progression by suspending your skepticism at least until we reach the final model, at which point all the pieces will come together and many of those issues will be resolved including, in my view, many of the most fundamental issues in visual perception that are not addressed by alternative approaches to this problem.

Collinear Boundary Completion

We begin by modeling collinear boundary completion as in the Kanizsa figure. Psychophysical studies have shown that the illusory contour in this figure exhibits an elastic spline-like quality in response to misalignment of the inducers. This suggests an analog Gestalt model involving local interactions in some perceptual substrate.

This idea has been quantified in the Directed Diffusion model of collinear boundary completion, which posits that visual edges stimulate oriented fields of activation diffusing outward in a collinear manner. When two such edges are aligned in a collinear configuration, their diffusing fields of activation interact spatially, resulting in a positive feedback which enhances the illusory boundary along the smooth curve joining the inducers, as shown schematically below. Computer simulations of the Directed Diffusion model confirm that it reproduces the properties observed in the psychophysical studies.
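The interaction of two diffusing fields can be illustrated with a deliberately reduced sketch. It is not the Directed Diffusion model itself: it works on a single line of cells rather than an oriented two-dimensional substrate, and the exponential decay, the `decay` parameter, and the multiplicative interaction are all assumptions chosen for simplicity. The function names are hypothetical.

```python
import math

def directed_field(length, source, direction, decay=0.15):
    """Activation spreading from `source` in one direction only (+1 or -1),
    fading exponentially with distance. The decay law is an assumption."""
    field = [0.0] * length
    for x in range(length):
        distance = (x - source) * direction
        if distance >= 0:
            field[x] = math.exp(-decay * distance)
    return field

def completion(length, left_edge, right_edge, decay=0.15):
    """Illusory contour strength along a line between two facing inducers.

    The two fields are multiplied (an assumed stand-in for their mutual
    enhancement), so a contour forms only where both fields overlap."""
    rightward = directed_field(length, left_edge, +1, decay)
    leftward = directed_field(length, right_edge, -1, decay)
    return [a * b for a, b in zip(rightward, leftward)]

contour = completion(20, left_edge=5, right_edge=15)
print([round(c, 3) for c in contour])
```

In this toy version the contour is nonzero only in the gap between the inducers, and its strength falls as the gap widens. The elastic, spline-like response to misaligned inducers requires the full oriented model and is not captured here.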

Immediately we can see that this operation is not abstraction, but the inverse of abstraction, or reification: a filling-in or interpolation of information not explicitly present in the image. Even at this level we can begin to see exactly what it is that is missing from the conventional models of vision.

Non-Collinear Boundary Completion

The mechanism of the Directed Diffusion model is specialized to perform collinear completion. However, not all illusory contour formation is collinear in nature. In the examples shown below, we see the illusory contour complete through the sharp vertices of the triangle on the left, we see the contour bifurcate in a series of "Y" intersections at each dot in the pattern at the center, and we see an illusory contour form orthogonal to, rather than collinear with, the inducing lines of the Ehrenstein figure on the right.

A careful analysis of the behavior of the illusory contour in these and other figures reveals certain underlying laws of non-collinear completion which offer clues as to the mechanism by which such completion is subserved.

The Orientational Harmonic model generalizes the mechanism of the Directed Diffusion model in order to account for these phenomena, and computer simulations of the model confirm that it predicts the illusory grouping in a wide range of diverse phenomena.

In order to account for the data, the Orientational Harmonic model introduces a novel computational mechanism in the form of harmonic resonances that create patterns of standing waves which promote the Gestalt properties of symmetry and periodicity. As well as performing the non-collinear completion described above, harmonic resonance also exhibits some unique and interesting properties as an invariant representation of visual form in visual abstraction.
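The link between harmonics and symmetry can be illustrated with a small sketch. It is only an analogy, not the resonance mechanism of the Orientational Harmonic model: it takes the directions of the edges meeting at a vertex and asks which circular harmonic (a Fourier component over orientation) they match best. The function names and the `max_order` parameter are hypothetical.

```python
import cmath
import math

def harmonic_strength(angles_deg, order):
    """Normalized magnitude of the given circular harmonic for a set of
    edge directions (degrees). 1.0 means the edges align perfectly with
    a pattern of `order`-fold rotational symmetry."""
    total = sum(cmath.exp(1j * order * math.radians(a)) for a in angles_deg)
    return abs(total) / len(angles_deg)

def dominant_harmonic(angles_deg, max_order=6):
    """Lowest harmonic order whose strength is (numerically) maximal."""
    strengths = [harmonic_strength(angles_deg, n) for n in range(1, max_order + 1)]
    best = max(strengths)
    for order, strength in enumerate(strengths, start=1):
        if strength >= best - 1e-9:
            return order

for name, angles in [("straight line", [0, 180]),
                     ("Y vertex", [0, 120, 240]),
                     ("cross", [0, 90, 180, 270])]:
    print(name, dominant_harmonic(angles))
```

Under these assumptions a collinear contour corresponds to the second harmonic, a symmetric "Y" vertex to the third, and a cross to the fourth, which suggests how a single harmonic mechanism could cover both collinear and non-collinear configurations.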

The Surface Brightness Percept

This model is not complete as it stands, because there are two components to the percept: a collinear percept of an edge, and a surface percept of the occluding triangle, which even has a different surface brightness than the white page that it occludes. A complete perceptual model must fill in this surface brightness.

Grossberg proposes the Feature Contour System (FCS), a mechanism which uses a spatial diffusion of brightness percept to account for a wide variety of psychophysical phenomena including the Kanizsa figure. On the basis of information theory I propose a hierarchy of representation in the visual system in which the FCS brightness percept represents the lowest level.
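The idea of brightness diffusing within regions bounded by contours can be sketched as follows. This is not Grossberg's FCS equations: it is a one-dimensional toy with simple neighbor averaging, and the function name, the `sources` and `barriers` arguments, and the step count are all assumptions made for illustration.

```python
def fill_in(n_cells, sources, barriers, steps=500):
    """Spread brightness along a row of cells by repeated neighbor averaging.

    sources  -- dict mapping cell index to a clamped brightness value
    barriers -- set of indices i marking a boundary between cell i and i + 1;
                brightness does not diffuse across a boundary
    """
    brightness = [0.0] * n_cells
    for index, value in sources.items():
        brightness[index] = value
    for _ in range(steps):
        updated = brightness[:]
        for i in range(n_cells):
            if i in sources:
                continue  # source cells keep their clamped value
            neighbors = []
            if i > 0 and (i - 1) not in barriers:
                neighbors.append(brightness[i - 1])
            if i < n_cells - 1 and i not in barriers:
                neighbors.append(brightness[i + 1])
            if neighbors:
                updated[i] = sum(neighbors) / len(neighbors)
        brightness = updated
    return brightness

# Two compartments separated by a boundary between cells 4 and 5.
result = fill_in(10, sources={1: 1.0, 8: 0.2}, barriers={4})
print([round(b, 3) for b in result])
```

Each compartment settles to the brightness of the source it contains, so a sparse brightness signal fills a whole surface while the boundary keeps the two surfaces distinct.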

The Depth Component

Several investigators of the Kanizsa illusion have noted that there is more than just a surface brightness percept: there is also an illusion of occlusion, by a figure closer in depth, of figures farther in depth; the triangle appears to "hover" some distance above the page. Indeed, certain illusory phenomena contradict boundary and surface diffusion models that make no provision for an additional depth component. Moreover, there is evidence that the phenomena explained by these models occur as easily in depth as they do in the plane of the visual field. These models must therefore be either completely abandoned, or boldly extended into a full depth representation. As absurd or improbable as this might at first seem, I will show that this explanation makes sense of a great number of hitherto mysterious perceptual phenomena.

The Bubble Model represents an extension of the previous models into the third dimension, using the Gestalt bubble concept, whereby the global configuration of perceptual surfaces emerges by a relaxation of local forces in an analog dynamic system.

Bounding the Representation

The model so far represents a three-dimensional slice of the external world in a fully spatial internal model. But the external world is an essentially boundless space, while the internal representation must fit inside the finite bounds of the skull.

The Bubble World model explains how an illusion of boundlessness can be created in a bounded spatial representation.
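One way to see that a bounded representation can encode an unbounded space is a compressive mapping of distance. The sketch below is only an illustration of the principle; the arctangent function, the `radius` of the representation, and the `half_distance` parameter are assumptions, not the mapping used in the Bubble World model.

```python
import math

def compress(distance, radius=1.0, half_distance=10.0):
    """Map an external distance in [0, infinity) to a represented distance
    in [0, radius). `half_distance` is the external distance that lands
    halfway to the bounding surface. The arctangent is an assumed choice."""
    return radius * (2.0 / math.pi) * math.atan(distance / half_distance)

for d in [0, 10, 20, 100, 1000, 1_000_000]:
    print(d, round(compress(d), 4))
```

Equal steps in external distance occupy progressively less of the representation as distance grows, so arbitrarily remote objects still fall inside the bounding surface, approaching it without ever passing it.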

Lightness, Brightness, and Illuminance

A number of perceptual phenomena suggest that lightness, brightness, illuminance and form perception are intricately intertwined.

With the addition of an internal illumination source, the Bubble World becomes the Lighted Bubble World, which is now able to explain perceptual phenomena never before addressed by computational models of perception.

Motion Perception and Shape From Motion

Some of the most puzzling phenomena in perception are seen in the amazing ability of the visual system to make spatial sense out of moving stimuli. The spatial matrix of the bubble model provides a plausible explanation for these phenomena.

Neurophysiological Considerations

Finally, a brief discussion of where in the brain such a spherical representation might be found.

Conclusion

Many models have been proposed to account for individual aspects of perception, such as depth from disparity, shape from shading, shape from motion, etc., with little thought given to generalizing those models to account for other aspects of perception. This model takes the opposite approach, and attempts to address all aspects of perception with a single unified model. Of necessity, therefore, the model is described rather sketchily, some of the mechanisms are highly speculative, and many details remain to be worked out. What is important, however, is not so much the specific details of this model, but rather the functional principles that it embodies. In particular, I refer to the principles of The reason these properties have not been proposed in previous models is not a lack of evidence for them in human perception, but rather the lack of a mechanism to implement these principles. Indeed, it is difficult even to present these concepts without presenting a specific mechanism whereby they might be implemented. This model is therefore not to be viewed so much as a specific proposal for the mechanism of perception, but rather as an example system to illustrate that the above general properties are achievable in principle with a physical system.

While it may be argued that the proposed model seems highly implausible in the light of current understanding of the neurophysiology of vision, it should also be borne in mind that the subjective experience of visual perception itself is highly implausible in the light of current understanding of the neurophysiology of vision, i.e. it is very hard to imagine how the simple integrate-and-threshold model of the individual neuron can possibly lead to the complex phenomenon of visual perception even when millions of them are wired together. Indeed, no previous model has even addressed the Gestalt principles with any success. Furthermore, much of the modeling and analysis in vision science is founded on a set of often unstated assumptions which themselves have never been justified any more than have the assumptions underlying this model.

The principle of Occam's Razor would suggest that the simplest possible explanation is the one most likely to be true. If a simple architecture is found neurophysiologically, therefore, the simplest explanation is that of a simple functionality. In biological evolution, however, there has always been an incentive to make the most use of any existing hardware, to take advantage, wherever possible, of subtle, higher order phenomena of the physical system. By this converse principle of Occam's Beard, therefore, I suggest that when a simple circuit is found in the brain, the safest assumption is that of the most complex functionality that can possibly be elicited from that simple system.

Return to Steve Lehar