Response to Reviewer A

Reviewer A very kindly contacted me directly and revealed himself to be Professor Dr. Hans-Georg Geissler of the University of Leipzig. I wrote him a general response to both reviews in January 2000, followed by these responses to specific points, both his own and those of the other reviewer.

Response to Specific Points

What follows is a brief and cursory discussion of the various issues raised by yourself and the other reviewer. If you should revise your judgment of the validity of the theory, these points will be addressed at greater length in a new version of the paper that I would resubmit to Psychological Review.

Response to Specific Points - Reviewer A:

In part (1) of your critique the major complaint is that no theory is presented, which was discussed above. You continue "Regrettably, not much attention is drawn to specific differences between the chosen examples that would be necessary to pinpoint specificities of perception more precisely", and "if perceptual systems, as suggested, indeed act on the basis of HR, there must be many more specific constraints involved to ensure special 'veridicality' properties of the perceptual outcome", and "the difficult analytic problems of concrete modeling of perception are not even touched". The model as presented is not a model of vision or audition or any other particular modality, but is a general model to confront the alternative neural receptive field paradigm, although examples from visual perception are used to exemplify the principles discussed. The more specific visual model was submitted elsewhere, in the Orientational Harmonic model, where I showed how harmonic resonance accounts for specific visual illusory effects. As discussed above, the attempt here is to propose a general principle of neurocomputation, rather than a specific model of visual, auditory, or any other specific sensory modality. Again, what I am proposing is a paradigm rather than a theory, i.e. an alternative principle of neurocomputation with specific and unique properties, as an alternative to the neuron doctrine paradigm of the spatial receptive field. If this paper is eventually accepted for publication, then I will resubmit my papers on visual illusory phenomena, referring to this paper to justify the use of the unconventional harmonic resonance mechanism.

In part (2) (a) of your critique you say "it is not clarified whether the postulated properties of Gestalts actually follow from this definition or partly derive from additional constraints", and "I doubt that any of the reviewed examples for HR can treat just the case of the dog cited to demonstrate 'emergence'. For this a hierarchy relation is needed." The principle of emergence in Gestalt theory is a very difficult concept to express in unambiguous terms, and the dog picture was presented to illustrate this rather elusive concept with a concrete example. I do not suggest that HR as proposed in this paper can address the dog picture as such, since this is specifically a visual problem, and the HR model as presented is not a visual model. Rather, I propose that the feature detection paradigm cannot in principle handle this kind of ambiguity, because the local features do not individually contain the information necessary to distinguish significant from insignificant edges. The solution of the HR approach to visual ambiguity is explained in the paper in the section on "Recognition by Reification" (pp. 15-17), in which I propose that recognition is not simply a matter of the identification of features in the input, i.e. by the "lighting up" of a higher level feature node, but that it involves a simultaneous abstraction and reification, in which the higher level feature node reifies its particular pattern back at the input level, modulated by the exact pattern of the input. I appeal to the reader to see the reified form of the dog as perceived edges and surfaces that are not present in the input stimulus, as evidence for this reification in perception, which appears at the same time that the recognition occurs.
The remarkable property of this reification is that the dog appears not as an image of a canonical, or prototypical dog, but as a dog percept that is warped to the exact posture and configuration allowed by the input, as observed in the subjective experience of the dog picture. This explanation is subject to your criticism in your general comments, that "the author demonstrates more insight than explicitly stated in assumptions and drawn conclusions". I can only say that, in Kuhn's words, sometimes it is only personal and inarticulate aesthetic considerations that can be used to make the case.

In the words of Wolfgang Köhler (1961, p. 7):

"Human experience in the phenomenological sense cannot yet be treated with our most reliable methods; and when dealing with it, we may be forced to form new concepts which at first, will often be a bit vague."

Wolfgang Köhler (1923, p. 64):

"Natural sciences continually advance explanatory hypotheses, which cannot be verified by direct observation at the time when they are formed nor for a long time thereafter. Of such a kind were Ampere's theory of magnetism, the kinetic theory of gases, the electronic theory, the hypothesis of atomic disintegration in the theory of radioactivity. Some of these assumptions have since been verified by direct observation, or have at least come close to such direct verification; others are still far removed from it. But physics and chemistry would have been condemned to a permanent embryonic state had they abstained from such hypotheses; their development seems rather like a continuous effort steadily to shorten the rest of the way to the verification of hypotheses which survive this process."

In section (2) (b) of your critique you complain that "there is no serious discussion of possible alternatives", and you mention Neo-Gibsonian approaches, PDP, Grossberg's ART model and Pribram's holographic theory. In the next version of the paper this omission will be corrected, approximately as follows. Gibson's use of the term resonance is a metaphorical device rather than a model, since Gibson offers no mechanisms or analogies of perceptual processes, but merely suggests that there is a two-way flow of information (resonance) between behavior and the environment.

The PDP approach does address the issue of emergence, but since the basic computational unit of the neural network model is a hard-wired receptive field, this theory suffers all the limitations of a template theory. The same holds for Grossberg's "Adaptive Resonance Theory", which also uses the word resonance metaphorically to suggest a bottom-up, top-down matching, but in Grossberg's model that matching is actually performed by receptive fields, or spatial templates. The ART model demonstrates the limitations of this approach, for the only way that a higher-level detector, or "F2 node", can exhibit generalization to different input patterns is for it to have synaptic weights to all of the patterns to which it responds. In essence, the pattern of synaptic weights is a superposition or blurring together of all of the possible input patterns to which the F2 node should respond. In top-down priming mode, therefore, that F2 node would "print" that same blurred pattern back at the lower "F1 node" level, activating all of the possible patterns to which that F2 node is tuned to respond. For example, if an ART model were trained to respond to an "X"-shaped feature presented at all possible orientations, top-down priming of this node after training would "print" a pattern of all those X-shaped features at all orientations superimposed, which is simply an amorphous blob. In fact, that same node would respond even better to a blob feature than to any single X feature. In the presence of a partial or ambiguous X-like pattern presented at a particular orientation, the ART model could not complete that pattern specific to its orientation. The HR model on the other hand offers a different and unique principle of representation, in which top-down activation of the higher level node can complete a partial or ambiguous input pattern in the specific orientation at which it appears, but that same priming would complete the pattern differently if it appeared in a different orientation.
This generalization in recognition, but specification in completion, is a property that is unique to the harmonic resonance representation.
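
The superposition argument above can be checked with a toy calculation. The following Python sketch is only an illustration under simplifying assumptions: the higher-level node is reduced to a linear unit whose weights are the plain sum of its training patterns, which is not Grossberg's actual ART dynamics; the grid size, bar width, and 10-degree orientation step are arbitrary choices; and all function names are hypothetical.

```python
import math

SIZE = 21           # grid is SIZE x SIZE pixels (arbitrary illustrative choice)
CENTER = SIZE // 2
RADIUS = 10         # all patterns are confined to a disk of this radius
HALF_WIDTH = 0.75   # half-thickness of each bar of the X, in pixels


def in_disk(dx, dy):
    return dx * dx + dy * dy <= RADIUS * RADIUS


def x_pattern(angle_deg):
    """Binary X: two perpendicular bars through the center, rotated by angle_deg."""
    t = math.radians(angle_deg)
    s, c = math.sin(t), math.cos(t)
    grid = [[0] * SIZE for _ in range(SIZE)]
    for row in range(SIZE):
        for col in range(SIZE):
            dx, dy = col - CENTER, row - CENTER
            if not in_disk(dx, dy):
                continue
            d1 = abs(dx * s - dy * c)  # distance to the bar at angle t
            d2 = abs(dx * c + dy * s)  # distance to the bar at t + 90 degrees
            if min(d1, d2) <= HALF_WIDTH:
                grid[row][col] = 1
    return grid


def disk_pattern():
    """Featureless 'blob': every pixel inside the disk is on."""
    return [[1 if in_disk(c - CENTER, r - CENTER) else 0 for c in range(SIZE)]
            for r in range(SIZE)]


def superpose(patterns):
    """Assumed weight pattern: pixel-wise sum of all training patterns."""
    return [[sum(p[r][c] for p in patterns) for c in range(SIZE)]
            for r in range(SIZE)]


def response(weights, pattern):
    """Linear template match: dot product of weights and input."""
    return sum(w * x for wr, pr in zip(weights, pattern) for w, x in zip(wr, pr))


# An X has 90-degree symmetry, so orientations 0..80 cover all cases at this step.
xs = [x_pattern(a) for a in range(0, 90, 10)]
weights = superpose(xs)

x_responses = [response(weights, x) for x in xs]
blob_response = response(weights, disk_pattern())

# Top-down "printing" activates every pixel that has a nonzero weight.
printed_pixels = sum(1 for row in weights for w in row if w > 0)
single_x_pixels = sum(sum(row) for row in xs[0])

print("best response to a single X:", max(x_responses))
print("response to the blob:       ", blob_response)
print("pixels in one X:", single_x_pixels, " pixels printed top-down:", printed_pixels)
```

Because the weights are non-negative and the disk contains every X, the blob necessarily drives this unit at least as strongly as any single X, and the pattern printed top-down covers more pixels than any one X. A normalized or competitive matching rule would change the numbers, so the sketch illustrates the argument only for the linear template unit assumed here.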

Kuhn observes that the old paradigm can always be reformulated to account for any particular phenomenon addressed by the new paradigm, just as the Ptolemaic earth-centered cosmology could account for the motions of the planets to arbitrary precision, given enough nested cycles and epicycles of the crystal spheres. Similarly, a conventional neural network model can always be contrived to exhibit the same functional behavior of generalized recognition but specific completion described above, but only by postulating an implausible arrangement of spatial receptive fields. In this case that would require specific X-feature templates applied to the input at every possible orientation, any one of which can stimulate a single rotation-invariant X-feature node, to account for bottom-up rotation invariance in recognition. However, in order to also account for top-down completion specific to orientation, top-down activation of the higher-level invariant node would have to feed back down to a set of top-down projection nodes, each of which is equipped with an X-shaped projective template at a particular orientation, able to project a complete X-shaped pattern on the input field. But the top-down completion must select only the specific orientation that best matches the pattern present in the input, and complete the pattern only at that best matching orientation. This system therefore requires two complete sets of X-feature receptive fields or templates, one set for bottom-up recognition and the other set for top-down completion, each set containing X-feature templates at every possible orientation, and similar sets of receptive fields would be required for the recognition of other shaped patterns such as "T" and "V" features.
This represents a "brute force" approach to achieving invariance, which although perhaps marginally plausible in this specific example, is completely implausible as a general principle of operation of neurocomputation, given the fact that invariance appears to be so fundamental a property of human and animal perception. However, as Kuhn also observes, a factor such as neural plausibility is itself a "personal and inarticulate aesthetic consideration" that cannot be determined unambiguously by the evaluative procedures characteristic of normal science.

With regard to Pribram's Holographic theory, the concept of a hologram is closely related to a standing wave model, since it too works by interference of waveforms. The difference is that the hologram is "frozen in time" like a photograph, and therefore does not exhibit the tolerance to elastic deformation of the input that the standing wave model does. Neither does the hologram exhibit rotation invariance, as does the standing wave in a circularly symmetric system. However, holograms can in principle be constructed of dynamic standing waves, as Pribram himself suggests, and this concept then becomes a harmonic resonance theory. The present proposal is therefore closely related to Pribram's approach, which will be discussed in the next version of the paper.
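
The rotation invariance attributed above to standing waves in a circularly symmetric system can be illustrated by analogy with Fourier harmonics on a ring. The Python sketch below is an analogy under stated assumptions and not the HR model itself: it assumes that a pattern around a circle is described by the amplitudes and phases of its angular harmonics, and the ring size, the test pattern (an X with one arm missing), and the function names are all hypothetical choices made for illustration.

```python
import cmath
import math

N = 36  # number of sample points around the ring (arbitrary choice)


def harmonic_coefficients(ring):
    """Complex amplitude of each angular harmonic k = 0 .. N/2 (a plain DFT)."""
    n = len(ring)
    return [
        sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(ring)) / n
        for k in range(n // 2 + 1)
    ]


def rotate(ring, steps):
    """Rotate the pattern around the ring by a whole number of sample points."""
    steps %= len(ring)
    return ring[-steps:] + ring[:-steps]


# A partial X: three of its four arms, at 0, 90 and 180 degrees.
ring = [1.0 if i in (0, 9, 18) else 0.0 for i in range(N)]
rotated = rotate(ring, 5)  # the same pattern turned by 50 degrees

coeffs_a = harmonic_coefficients(ring)
coeffs_b = harmonic_coefficients(rotated)

# The harmonic magnitudes are the same at either orientation ...
magnitudes_match = all(
    abs(abs(a) - abs(b)) < 1e-9 for a, b in zip(coeffs_a, coeffs_b)
)
# ... while the complex coefficients differ, because the phase carries orientation.
phases_differ = abs(coeffs_a[1] - coeffs_b[1]) > 1e-6

print("magnitudes identical under rotation:", magnitudes_match)
print("phase changes with rotation:        ", phases_differ)
```

In this analogy the set of harmonic magnitudes plays the role of an orientation-independent signature, while the phases retain the particular orientation of the input, which is the kind of separation a fixed spatial template or a static hologram does not provide.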

The discussion of alternative models was indeed a significant omission in the version of the paper you reviewed. The next version will include such a discussion, which in turn will help to clarify the operational principles of the HR theory and distinguish it from alternative approaches.

In section (3) of your critique you propose that "notions like the receptive field concept are approximate descriptions of facts", and you propose a dualistic approach involving two forms of representation in the brain which are of different and complementary nature. While I do not dispute the anatomical facts of the shapes of neurons and the function of synapses, it has never been demonstrated that a neuron actually operates as a spatial template; that theory arose as an explanation for the neurophysiological response of "feature detector" cells in the cortex. However, the noisy stochastic nature of the neural response and its very broad tuning function seem to argue against this view. My own hunch is that the feature detector behavior is itself a standing wave phenomenon, which is consistent with the fact that the response function of V1 cortical neurons resembles a Gabor function, which is itself a wavelet. However, this issue is orthogonal to my main point, which is that whether or not some neurons behave as spatial templates, the limitations of a template theory suggest that the Gestalt properties of perception (emergence, invariance, reification, multistability) cannot be accounted for in that manner, and that some other significant principle of computation must be invoked to account for them.

In section (4) you complain that there is no discussion of the limitations in the scope of HR. For example, merely to reflect outside reality does not contribute to the problem of conscious awareness of these objects. However, this issue is not unique to HR; it is a general philosophical issue that applies just as well to the alternative Neuron Doctrine model. But the Neuron Doctrine itself cannot even plausibly account for the reflection of outside reality in an internal representation, due to the problems of emergence, reification, and invariance, which is why the Neuron Doctrine suggests a more abstracted concept of visual representation, in which the visual experience is encoded in a far more abstracted and abbreviated form. Therefore, although HR does not solve the "problem of consciousness" completely, it is one step closer to a solution than the alternative. The philosophical issue of consciousness, however, is beyond the scope of this paper, which is a theory of neural representation rather than a philosophical paper. I enclose a copy of my book, "The World In Your Head", which addresses these philosophical issues more extensively.

Professor Geissler's Response

Professor Geissler kindly responded to my letter in April 2000 to say that he agreed with nearly everything I had said. He then gave me advice about the presentation of the idea. He recommended that I begin by describing the Neuron Doctrine in detail, and then point out the limitations of the idea before presenting the Harmonic Resonance theory as an alternative. I re-wrote the paper following Geissler's advice, and I included some ideas from the above letter in the new version of the paper. However it was too late to resubmit it to Psychological Review since the editor who was handling the paper was leaving. Furthermore, I am becoming convinced that the proper medium for presenting radically new and different theories is the open peer review format of the Behavioral and Brain Sciences journal, which is where I submitted the revised version of this paper.