Reviewer 2

Ms. 2533

Review of Gestalt Isomorphism I and II, revised version

I think that the first paper should be published. It addresses many crucial topics of current perceptual research. Of course, the model is still vague, as the authors themselves admit, but I assume that it will be elaborated in further writings. As to the second paper, I think there is a major problem that should be solved before its publication: its length.

Author's Response

1. The authors do not mention that there is a rather rich recent literature on the problem of isomorphism (I did not know it myself when I wrote the previous review), and it would be better to fill this gap. I suggest that the authors read two interesting online works dealing with this topic, in which many useful references can be found; they can be downloaded from the following addresses:

http://www2.ucsc.edu/people/anoe/completion.html (Thompson et al., PERCEPTUAL COMPLETION)

http://cogsci.ecs.soton.ac.uk/bbs/Archive/bbs.pessoa.html (Pessoa et al., FINDING OUT ABOUT FILLING IN)

A related problem is that, in their writings, the authors give the impression of thinking that the principle of isomorphism is almost a necessary postulate, whereas obviously it is not (see for instance the works mentioned above). Köhler himself thought that isomorphism "is not an a priori postulate, but 'remains an hypothesis which has to undergo one empirical test after the other'" (Scheerer 1994, p. 188).

Author's Response

2. The titles do not fully clarify, in my opinion, the contents of the two papers. Paper 1: the title suggests that the article concerns only the perception of lightness, brightness, and illuminance, whereas in the authors' intentions these are just illustrative examples of an exhaustive model of visual perception (p. 12: "It is difficult to fully explain the meaning of these concepts without providing specific examples. These concepts will therefore be clarified by applying our design principles to a specific set of perceptual problems"). Moreover, the title should perhaps make clear that "The purpose of this paper ... is not to present an exact computational algorithm, but rather to suggest a general computational strategy based on the Gestalt principles, for designing models of perception". Paper 2: "This model therefore represents more of a commentary on the assumptions underlying the current direction of visual research, than a complete model of the mechanisms of vision" (p. 6). I think that this should be made clear in the title.

Author's Response

I have no important remarks, just a few small ones.

1. At many points it is not clear where Gestalt theory ends and where the authors' theory begins. For instance on p. 9, while anomalous figures are discussed, we are told that "Köhler argues that ...": to my knowledge, Köhler never argued anything about anomalous figures.

2. "The perception of three-dimensional structure as seen in figure 1a is a low level percept rather than a high level cognitive inference" (p. 6). Adelson (1993) maintains that the phenomenon he deals with "can be ascribed to more complex mechanisms occurring later in the visual system", and that does not mean COGNITIVE mechanisms, but high-level PERCEPTUAL mechanisms. He does refer to "inferences", but these are "PERCEPTUAL inferences".

3. "the percept is reified" (p. 7): here a word ("reified") is used that has not been previously defined; it will be defined only on the following page.

4. The article can be profitably shortened. I give a few examples. I would remove the following superfluous sentences: "Furthermore, the spatial information evident in figure 1a is so complete, it is easy to hold the flat of your palm parallel to any of the three depicted surfaces, as if viewing an actual block in three-dimensions, without any conscious awareness of how this task is performed" (p. 7); "An artist depicting the percept of the Kanizsa figure would have to use a different mix of white paint inside the triangle than outside it" (p. 15); "This type of top-down reification ... (Kosslyn 1994)" (p. 17). Paragraph 2.3 can be fruitfully removed, or at least much shortened. "Many researchers ... (Lehar & McLoughlin 1998)" (p. 22): you might get rid of this part, since you already stated the same thing on p. 12 and restate it in the "Conclusion".

5. In the language of perceptual sciences "subjective" is part of the meaning of the word "percept". On p. 8 the authors assign a new meaning to the term "percept", and start writing about "subjective percepts". Changing the meaning of words is dangerous: it can give rise to confusion, and it does not appear very useful in this context. I would avoid it.

6. "... or to implement the property of brightness constancy" (p. 14). I think that the correct term is "lightness", not "brightness".

7. Figure 4a is wrong: the two small squares have white (on the left) and black (on the right) PHYSICAL edges. Perhaps the authors could use a different printer.

1. The authors state that "The paper has now been shortened". Yes, a little, but, in my opinion, the paper is still much too long and it should be further and rather drastically shortened. For instance:

a. it seems to me that the "Discussion" (para 7) could be COMPLETELY removed; besides, it is not a "discussion", but rather a list of further facts supporting the proposed model;

b. you can get rid of Arnheim's description of the inverse optics problem on p. 4, because the problem in question is very well known to students of perception.

Author's Response

2. "...this model, or modeling approach makes the following predictions: ..." (p. 26). As a matter of fact, the authors must admit that all, or almost all, of these "predictions" are the facts out of which their theory arose: these facts come BEFORE the theory, so it is incorrect to state that they are PREDICTED by the theory.

Author's Response

1. "The retinal image ... is fundamentally two-dimensional" (p. 3). "Fundamentally" can be removed without any feeling of loss.

2. "...the problem addressed by perception is to determine what combination of possible surfaces in depth correspond to the most likely configuration of the depicted solid" (p. 4): "most likely" appears to me a very unfitting expression in this context, because it is typical of those Helmholtzian theories that oppose the Gestalt theory.

3. On p. 5 the terms "relaxation" and "reified" are used: I think they should be defined, as they have been in the first article.

4. "The lowest level of perceptual representation therefore serves as the common interface between different visual properties, such as color, binocular disparity, and motion" (p. 18). I do not understand; are you sure that "the lowest" is what you mean?

5. P. 31: "Atli dell'XI Congresso degli Psicologi Italiani": "Atti", not "Atli".

"Optische Täyschungen": "Täuschungen", not "Täyschungen".