The objection that the reviewer does not see the percept the way it is claimed to appear is one of the most difficult kinds of objection to address, especially in this remote anonymous exchange where I cannot probe the reviewer interactively to discover at what level our perceptual experiences diverge. For there are many levels at which percepts can be experienced differently, and the most troublesome aspect of this issue is that the different alternative perceptual interpretations are all valid, i.e. there is not one single perceptual level, but many levels at which a stimulus can be validly experienced.
For example, a Gestaltist would claim that Figure 2 (A) is perceived as a separate disconnected surface at a foreground depth, with a background surface perceived to complete amodally behind it, whereas an Introspectionist (Structuralist) would claim to see only a black perimeter on a white page. At the other extreme, the Introspectionist would even claim to experience a real block on a table (suggested in Figure 2 (G)) as a two-dimensional projection, whereas the Gestaltist would insist on a solid spatial percept. The Introspectionist accuses the Gestaltist of committing the stimulus error, i.e. allowing cognitive recognition to distort their description of the raw visual sensation, while the Gestaltist accuses the Introspectionist of the experience error, i.e. denying the vivid spatial experience of three-dimensional structure based on a preconceived notion of the nature of the retinal stimulus. The stimuli in Figure 2 (A) through (G) exhibit a progression of spatial cues from sparse to elaborate, resulting in a progression of spatial experience from less vivid to more vivid. At exactly what point along this progression the percept can be said to be fully 3-D depends to some extent on the observer's theoretical bias, the Gestaltist seeing a spatial percept throughout the whole range, while the Introspectionist sees them all as flat. The modern Neuroreductionist is inclined to side with the Introspectionist, since this view ascribes the spatial aspect of the experience to cognitive or higher level processes.
The point is that even as a Gestaltist seeing the spatial component through the full range from (A) through (G), I also recognize the simultaneous experience of a flat percept, when I wish to see it so. In the dihedral of Figure 2 (C) I can see the shaded surface either as a shadow or as a painted surface, although this becomes progressively more difficult in (D) through (G). Likewise, Adelson's block figure (either the original, or my own imperfect rendition of it in Figure 1 (B)) can also be seen either as a three-dimensional structure in shadow and light, or as a set of diamond shaped tiles similar to Figure 1 (C), or even in an intermediate form with three-dimensional structure but a painted, rather than a shadowed percept. It is easy therefore to encounter misunderstanding on this issue, especially since our theoretical bias can influence the aspect of this complex experience that we choose to report.
The Gestalt argument, however, is not that these figures cannot be perceived in their reduced forms, but rather that the spatial aspect of the percept is also available subjectively. More significantly, when these stimuli are perceived in their spatial form, the nature of the spatial percept is determined primarily not by a cognitive inference, but by more primitive perceptual processes that are generally uniform across individuals, and that are seen at high resolution, influencing every point on the perceived surfaces, thereby indicating a low level interaction. The reviewer's objection that he perceives the circled edges in Figure 1 (A) as identical is not a problem as long as he acknowledges that he can also perceive that figure, or Figure 1 (B), as a spatial form in light and shadow (even if this particular percept is less natural for him) or, failing that, that he at least experiences a spatial percept somewhere along the progression from Figure 2 (A) through (G). The point is that there exist examples where the global gestalt promotes a percept wherein the perception of lightness, brightness, illuminance, and three-dimensional form interact in an apparently low-level manner. The actual figures presented in this paper are not the primary evidence for this interaction, but are merely examples of the described phenomena included for illustrative purposes.
The two-panel dihedral presented in the paper was selected for its simplicity, to make the discussion of the effect less complicated. In case the reviewer continues to claim that it is impossible to see that figure as described under any circumstances, the new draft also includes Figure 3, which I hope the reviewer will find more compelling. It operates on the same principle, but it is more difficult (though not impossible) to perceive as a flat surface, which makes it somewhat less illustrative of the multistability of the percept. If even this percept poses a problem for the reviewer, a still more compelling example is shown in Figure 4, which is very difficult to perceive as a flat surface, although it does demonstrate bistability.
The supreme confidence with which the reviewer states that the percept I describe is false (rather than that he simply fails to perceive it) compels me to include one more figure, Figure 5 (A). Although this figure is no longer multistable, the percept of the direction of the illuminant is even stronger here than in the previous figures. If the reviewer can perceive it as described, he might find it more convincing as a low level percept rather than a cognitive inference, since in this case it is more difficult (although again, not impossible) to perceive the shaded surfaces as being of different reflectance rather than in shadow. Indeed there is a trade-off between the salience of the percept of the illuminant, and the multistability of the percept, or the tendency for the percept to allow an alternative interpretation. The multistable percepts are more convincing examples for this argument because the percept of the illuminant is observed to flip in synchrony with the spatial percept, which is therefore unlikely to be a cognitive inference. It is understandable therefore (and consistent with the Gestalt Bubble model) that the multistable percept is correspondingly less salient. Nevertheless, the more salient percept in Figure 5 (A) stimulates a strong percept of an illuminant, and the direction of that illuminant can be reversed by reversing the whole figure, as shown in Figure 5 (B). The reader is requested to cover (A) while viewing (B), and vice-versa.
The point is that if any of these figures exhibits the properties described in the paper, that is sufficient to demonstrate the existence of an interaction between form and illuminance perception which requires explanation by models of perception. If the reviewer objects that this is a cognitive inference (or higher level process) rather than a low-level interaction, then the arguments elaborated in the paper apply. It is all too easy to ascribe the most troublesome aspects of perception to unspecified cognitive or higher level processes, although it should be noted that in so doing, the problem is not in any way brought closer to a solution, but merely relegated to a different domain which the perceptual scientist can safely claim is outside of their responsibility. The Artificial Intelligence (A.I.) movement in computer science revealed the inadequacy of purely cognitive, or symbolic manipulation processes to address practical problems in perception, and the Gestalt approach suggests that the cognitive interpretation is built up out of multiple levels of low- and mid-level perceptual constructs, rather than having cognition operate directly on the lowest level sensory representation.
As for the Mach Card illusion, that too is exactly consistent with the Gestalt Bubble model. Indeed this illusion is used by Mach to support his observation that "without any aid of the judgement, a fixed habit of the eye is developed by means of which illumination and depth are connected in a definite way." In other words, there is a close coupling between the perception of the illuminant and the perception of three-dimensional structure, which is exactly the point of the Gestalt Bubble model, which offers a general computational explanation of this interaction.
In Mach's illusion a folded card on a table exhibits a brighter and darker side in the presence of unequal illumination, as shown in Figure 6 (A). Mach notes that in this case, the darker panel is not perceived as being intrinsically darker, but rather, the darkness is attributed to lower surface illumination, the intrinsic lightness of the two panels being perceived as being the same. A monocular viewing of that card can promote a reversal in depth of the perceived fold in the card, as suggested in Figure 6 (B). In this case however the relative shading of the two panels is no longer consistent with the perceived global illumination profile. Since the difference in perceived brightness can no longer be accounted for by illumination, it must be attributed to a difference in surface reflectance, exactly as predicted by the Gestalt Bubble model; i.e. the darker surface is perceived to have a darker intrinsic reflectance. The difference between the Mach card and the example discussed in the paper is that in this case the global illumination profile is not inferred by inspection of the figure, but is known in advance, since the experiment is conducted in a real room with real illumination. The illumination percept therefore cannot reverse with the reversal of the spatial percept, as it does in the example discussed in the paper. Instead, the conflict between the known illumination and the reversed spatial percept no longer allows the darkness of the panel to be interpreted as a shadow.
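The attribution described above can be illustrated with a minimal numerical sketch. It assumes the common simplification that the luminance of a surface is the product of its reflectance and its illumination; the function name and all numerical values are hypothetical, chosen only for illustration, and the sketch is not an implementation of the Gestalt Bubble model.

```python
def implied_reflectance(luminance, assumed_illumination):
    """Reflectance implied by a luminance under an assumed illumination.

    Assumes the simplification: luminance = reflectance * illumination.
    """
    return luminance / assumed_illumination


# Hypothetical luminances of the two panels of the card (arbitrary units).
bright_panel, dark_panel = 80.0, 40.0

# Veridical percept: the dark panel is seen as turned away from the light,
# so it is assigned half the illumination of the bright panel (assumed values).
veridical = (implied_reflectance(bright_panel, 100.0),
             implied_reflectance(dark_panel, 50.0))

# Reversed percept: the perceived geometry no longer accounts for the shading.
# This is modelled here, as a simplifying assumption, by equal illumination
# on both panels.
reversed_percept = (implied_reflectance(bright_panel, 100.0),
                    implied_reflectance(dark_panel, 100.0))

print("veridical reflectances:", veridical)
print("reversed reflectances: ", reversed_percept)
```

Under these assumed values the two panels receive the same implied reflectance in the veridical percept, whereas in the reversed percept the entire luminance difference is carried by reflectance, so the darker panel appears intrinsically darker.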
Mach makes no mention of a cast shadow of the card falling on the table top, and indeed I doubt that the illusion would work at all with a cast shadow, since that shadow would make it virtually impossible to perceive the card as spatially inverted. This is shown in Figure 6 (C), which is very hard to see as a concave corner, since that percept is inconsistent with the depicted cast shadow. I repeated Mach's experiment on a textured wooden table near a window in indirect illumination (i.e. not direct sunlight) and the cast shadow of the card was not noticeable on the table top. I presume these conditions matched Mach's original experiment. That is, the cast shadow is not an essential part of the percept, contrary to what the reviewer suggests, and indeed it would interfere with the spatial reversal, which is hard enough to perceive without a cast shadow.