Visual processing in this model began by locating the edges in the scene, then identified the intersections between those edges, and finally performed a logical analysis of that information in order to deduce the identity and orientation of the objects.
The problem with this approach and its derivatives is that the real world contains far too many ambiguities, such as attached shadows, cast shadows, and reflectance edges, which are impossible to distinguish from real form edges on the basis of local information alone. Furthermore, this kind of analysis is completely swamped by natural scenes, which contain thousands of visual edges. The visual world simply contains far too much ambiguity ever to be interpreted in this way.
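The local ambiguity can be made concrete with a toy sketch. The Python below is an illustration only and is not part of the model described above: the function names, the one-dimensional luminance profile, and the simple difference threshold are all assumptions chosen for brevity. It treats image intensity as the product of surface reflectance and illumination, and shows that a change in surface paint and a cast shadow can produce exactly the same intensity profile, so a purely local edge detector responds identically to both.

```python
def luminance(reflectance, illumination):
    """Image intensity as the pointwise product of reflectance and illumination."""
    return [r * i for r, i in zip(reflectance, illumination)]


def local_edges(signal, threshold=0.1):
    """Indices where neighbouring samples differ by more than the threshold."""
    return [k for k in range(len(signal) - 1)
            if abs(signal[k + 1] - signal[k]) > threshold]


n = 8

# Case A: a reflectance edge (the surface paint changes) under uniform light.
paint_change = luminance([0.8] * 4 + [0.4] * 4, [1.0] * n)

# Case B: a cast shadow falling across a uniformly painted surface.
cast_shadow = luminance([0.8] * n, [1.0] * 4 + [0.5] * 4)

edges_a = local_edges(paint_change)
edges_b = local_edges(cast_shadow)

# The two scenes differ physically, but the detector sees the same edge.
print(edges_a, edges_b)
print(paint_change == cast_shadow)
```

Because the two intensity profiles are identical sample for sample, no operation confined to this neighbourhood could tell the two causes apart; any disambiguation would have to draw on information from elsewhere in the scene.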