The Story of the Eye
On perception from da Vinci to Toy Story
Now close to grossing £200,000,000 worldwide, Disney's Toy Story was released in late 1995 to extensive press coverage that dwelt mostly on the incredible logistics (technical and conceptual) involved in the production of the first entirely computer-generated animated feature film. The glut of statistics (two terabytes of film information, five lights for Mr Potato Head's ears alone, ten man-years to complete the models...), unseen since the release of Cleopatra (1963), and the fact that the film's characterisation was far more sophisticated than the one-dimensionality of recent Disney movies, obscured one of the most significant aspects of Toy Story - that it represents a culmination of almost 30 years of computer graphics experimentation and theory. The credit list of the movie has the highest Ph.D. quotient of any production and reads like a roll of honour in the development of computer-generated imagery (CGI). The staff of Pixar, the company that created Toy Story, has included some of the best-known names in the brief history of CGI: Loren Carpenter, Dr Ed Catmull, Rob Cook, Pat Hanrahan and Darwyn Peachey amongst others - all figures who have made major contributions to the sophistication of computer-simulated images. While Pixar has produced a number of computer-animated shorts - such as the Oscar-winner Tin Toy (1988) and the Oscar-nominated Luxo Jr. (1986) - their key commercial software product, RenderMan™, has been responsible for the look of some of recent film history's most notable special effects. These have included the water creature in James Cameron's The Abyss (1989), the evil robot in Cameron's Terminator 2 (1991) and the dinosaurs in Steven Spielberg's Jurassic Park (1993). (It was in Jurassic Park that CGI really came to prominence in film special effects: Spielberg had intended to produce the animation using real models and stop-motion photography, but the success of trials using CGI persuaded him to adopt it for the main dinosaur sequences.)
Pixar itself evolved out of Industrial Light and Magic's CGI department where, in 1981, Carpenter wrote the precursor to RenderMan: the REYES system (Renders Everything You Ever Saw), which was used for the Genesis effect in Star Trek II (1982). Spun off from ILM in 1986, Pixar refined the REYES system with the aim of producing photorealistic images that could be composited with live action, as Catmull describes: '...we now understood what it meant to describe an image ... Pat Hanrahan put all that we knew about geometry, lighting models, ray tracing, antialiasing, motion blur and shade trees into a compact interface, which he named RenderMan'. 1 RenderMan provides a way of describing the world: an interface between physical geometry (the shape of things) and their appearance. Through its ability to use programmable surface 'shaders' that mathematically describe a surface's response to light, RenderMan allows the user to precisely define the material characteristics of objects and control how colour, reflectivity, transparency and so forth are applied to form.
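What such a shader amounts to can be sketched in a few lines of code. The fragment below is written in Python rather than RenderMan's own shading language, and its names and constants are invented for illustration, not drawn from Pixar's API; but it shows the essential idea of a programmable function that takes a point's geometry and lighting and hands back a colour.

```python
# A minimal sketch of what a programmable surface shader does:
# map geometry (a surface normal) and lighting to an output colour.
# Illustrative only - not RenderMan's actual shading language or API.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def shade(normal, light_dir, base_colour, ambient=0.1, diffuse=0.8):
    """Return an RGB colour for one surface point."""
    # Lambertian response: facing the light -> bright, edge-on -> dark.
    facing = max(0.0, dot(normal, light_dir))
    return tuple(c * (ambient + diffuse * facing) for c in base_colour)

# A red surface point facing the light head-on:
print(shade((0, 0, 1), (0, 0, 1), (0.8, 0.2, 0.2)))
```

Everything that distinguishes wood from chrome from skin lives in how such a function is elaborated - which is precisely the flexibility the next paragraph describes.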
What marks out Pixar's RenderMan is the flexibility of its descriptive powers and the level of abstraction it brings to the process of creating appearances. Rendering is part empirical, part theoretical: it is based on the laws of physics and observations of the behaviour of light, materials and the interaction between the two. An ideal renderer would be a perfect simulation of physics and, if the characteristics of a material were defined in enough detail, the renderer would output an image indistinguishable from a photograph. But in actuality, many physical factors (the curvature of light due to the effects of gravity, for example) have no significant effect on our perception of the 'reality' of an image whether they are taken into account or not. The vital question is how to decide what matters and what doesn't. For example, many critics of digital audio complain that the bandwidth of CDs is too narrow: although the format can reproduce most of the audible spectrum, what it cannot reproduce plays a significant role in our perception of the reality of the sound. The issue here is one of sampling and of discovering the degree to which factors that are not necessarily the most obvious influence the apparent accuracy of a simulated image - whether aural or optical. This question of the importance of peripheral information is of special relevance to CGI, where everything in a scene must be explicitly defined, parametrised and quantised. The artificiality of early CGI was mainly due to a lack of subtlety in colour variation, response to light and shadow generation. In fact, the theoretical model of reflectivity used to render surfaces closely approximated that of plastic, giving a distinct look to computer-generated images of the time. Today, much effort is being expended on defining complex appearances - dirty walls, chipped paint and scuffed carpets - and organic features, such as hair and grass, in an attempt to bring to CGI something of the randomness and imperfection of the real world.
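That 'plastic' look has a precise source: the standard reflectance model of the period (associated with Bui Tuong Phong's 1975 work) adds to the ambient and diffuse terms a specular highlight that takes the colour of the light rather than of the surface, which is why everything gleamed like polystyrene. A sketch, with invented constants:

```python
# The Phong-style 'plastic' reflectance model: ambient + diffuse + a
# specular highlight. The highlight is added equally to R, G and B -
# it takes the light's colour, not the surface's, hence the plastic look.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def reflect(l, n):
    # Mirror the light direction l about the surface normal n.
    d = 2 * dot(l, n)
    return tuple(d * nc - lc for lc, nc in zip(l, n))

def phong(n, l, v, surface_rgb, ka=0.1, kd=0.6, ks=0.4, shininess=32):
    diff = max(0.0, dot(n, l))
    spec = max(0.0, dot(reflect(l, n), v)) ** shininess
    # The ks * spec term is colourless: a white gleam on any surface.
    return tuple(c * (ka + kd * diff) + ks * spec for c in surface_rgb)

# A red surface viewed and lit from directly above - note the highlight:
print(phong((0, 0, 1), (0, 0, 1), (0, 0, 1), (0.8, 0.2, 0.2)))
```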
Despite the popular perception of CGI as new and somehow futuristic, RenderMan is in many ways an extension (and an encapsulation) of a programme of research that began with 15th century studies in perspective and light by Brunelleschi, Uccello and da Vinci, and continued with the Impressionists' and Post-Impressionists' attempts to evolve colour theory. From the Renaissance until artists finally lost interest in optical reality and handed it over to the photographers, artists sought to understand the nature of the world they viewed and to define ways in which it could be represented on a two-dimensional surface. The popular 15th century technique of viewing a scene through a frame or window marked with a grid is conceptually similar to the perspective projections that take place in computer graphics to allow the viewing of a three-dimensional scene on a screen, as da Vinci explained: 'Perspective is nothing other than viewing a scene behind a transparent pane of glass, on the surface of which the objects behind it are drawn. The objects can be traced via a pyramid to the focal point, and this pyramid is intersected by the glass plane'. 2
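Da Vinci's pane of glass survives almost unchanged in the arithmetic of CGI. A point is projected onto the picture plane by similar triangles: its x and y coordinates are scaled by the ratio of the plane's distance to the point's depth. A minimal sketch, in which the viewing distance d is an arbitrary choice:

```python
# Da Vinci's pane of glass as arithmetic: project a 3D point onto a
# picture plane at distance d from the eye, by similar triangles.

def project(point, d=1.0):
    x, y, z = point  # z is the depth of the point, along the view axis
    return (d * x / z, d * y / z)

# Two posts of equal height, one twice as far away:
print(project((1.0, 2.0, 4.0)))   # nearer  -> (0.25, 0.5)
print(project((1.0, 2.0, 8.0)))   # farther -> (0.125, 0.25): half the size
```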
Many of the problems that visual artists have had to deal with have been (and in some cases still are) problems that computer graphics must address. Geometry itself is the first major hurdle. The theorisation of perspective provided a framework for the convincing representation of simple three-dimensional volumes and became one of the earliest fads in visual history. Like the popularity of effects such as morphing in the late 80s, perspective projection was a fashion of the late 15th century, with artists endeavouring to place their figures in elaborate, strongly volumetric architectural settings that would show off their ability to best effect. While many artists adapted Brunelleschi's and Alberti's studies in a pragmatic manner, dropping people and things onto a perspective 'pavement', others extended the rationalisation that a system of perspective entailed into other areas. Paolo Uccello, for example, attempted to parametrise the surfaces and volumes of the more complex objects that appeared in his paintings. In a study of a chalice, he broke down the curved surfaces of the object into planar sections. In contemporary CGI terms, he polygonised the object, and this method of describing curved surfaces is still an extremely common form of geometric representation, used in modelling software such as SoftImage and Prisms.
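Uccello's procedure - breaking curved surfaces into planar sections - is, give or take the terminology, what a modeller still does. The sketch below tessellates a chalice-like surface of revolution into flat quadrilaterals; the profile measurements are invented for illustration:

```python
# 'Polygonising' in the Uccello sense: approximate a curved surface -
# here a chalice-like surface of revolution - with flat facets.

import math

profile = [(0.0, 0.5), (0.5, 0.2), (1.0, 0.2), (1.5, 0.6)]  # (height, radius)
sides = 12  # more sides -> smoother silhouette, more polygons

vertices = []
for height, radius in profile:
    for i in range(sides):
        angle = 2 * math.pi * i / sides
        vertices.append((radius * math.cos(angle),
                         radius * math.sin(angle),
                         height))

# Each ring of 'sides' vertices joins the next ring with flat quads.
quads = [(r * sides + i,
          r * sides + (i + 1) % sides,
          (r + 1) * sides + (i + 1) % sides,
          (r + 1) * sides + i)
         for r in range(len(profile) - 1) for i in range(sides)]

print(len(vertices), "vertices,", len(quads), "flat faces")
```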
Amongst his many other studies, da Vinci systematically analysed the behaviour of light in the casting and colouring of shadows, in refraction and in the attenuating effects of atmosphere on the appearance of distant objects. The latter, now known as depth-cueing, plays a vital role in the perception of depth in naturalistic three-dimensional CGI. In the 19th century, late-Romantic artists such as Anne-Louis Girodet experimented with the realistic representation of surfaces, textures and unusual lighting effects. In an attempt to define a consistent visual environment, Girodet painted at night under artificial light. As a result, the paintings he made during this period, while strongly 'realistic', have a very particular look, in part because they are viewed today under lighting conditions different from those under which they were made. Later in the century, Seurat and the Post-Impressionists sought to create a rigorous colour system out of the instinctive optical mixing of pure pigments used by the Impressionists and earlier artists. The resulting 'divisionist' paintings made by Seurat and Signac, utilising irregularly spaced points of unmixed pigment, grew out of earlier scientific theory concerned with fabric-dyeing in a newly-industrialised age. The development of techniques for mass-reproduction required repeatability and consistency - in effect, parametrisation - and the 'divisionist' technique mimicked contemporary researches in colour theory and, in a crude way, looked forward to current techniques of stochastic sampling used in high-end print processes and computer graphics. Now, at the end of the 20th century, the analysis of optical reality is largely the domain of the software engineers and scientists involved in CGI. What Toy Story really represents is the practical application of a contemporary simulation of the visual world.
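Depth-cueing, as a renderer implements it, reduces to a blend: the further a point lies from the eye, the more its colour is pulled towards the colour of the intervening atmosphere. A sketch with an exponential falloff - the haze colour and density here are illustrative, not drawn from any particular system:

```python
# Depth-cueing ('aerial perspective') as a renderer implements it:
# fade a surface colour toward the atmosphere colour with distance.

import math

def depth_cue(colour, distance, atmosphere=(0.7, 0.75, 0.8), density=0.1):
    # Fraction of the original colour that survives the intervening air.
    visibility = math.exp(-density * distance)
    return tuple(visibility * c + (1 - visibility) * a
                 for c, a in zip(colour, atmosphere))

red = (0.8, 0.1, 0.1)
print(depth_cue(red, 1.0))    # nearby: still clearly red
print(depth_cue(red, 30.0))   # distant: washed towards the haze colour
```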
At CGI research centres such as Cornell University, more photorealistic still images than those in Toy Story have been created - at massive computational expense - but these techniques and simulation models are not yet workable for animated features. Thus, most rendering systems use techniques that mimic reality, to a greater or lesser degree, without the computational cost of doing so in a theoretically rigorous way - echoing the pragmatism of the Renaissance painter. What is intriguing in Toy Story are the ways in which differing levels of optical verisimilitude co-exist and interplay. For example, there are many reflections in the film: Buzz Lightyear's glass helmet constantly reflects the world around him, but the reflections are fudged using a technique that mimics the way real reflections are formed, but without the expense of ray tracing, which 'correctly' simulates them. This has little effect on how 'real' the reflections appear. In other situations, however, the successes and limitations of the simulations used are more visible.
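The textbook version of such a fudge is environment mapping, in which the reflected view direction is used merely as an index into a pre-made picture of the surroundings rather than traced back into the scene. Whether this is precisely the technique behind Buzz's helmet the text above does not establish, but the sketch below, with an invented two-tone 'map', gives the general shape of the trick:

```python
# Environment mapping: the standard cheap substitute for ray-traced
# reflections. The reflected direction is used only as an index into a
# pre-rendered picture of the surroundings - no rays are traced.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def reflect(view, normal):
    # R = V - 2(V.N)N : mirror the view direction about the normal.
    d = 2 * dot(view, normal)
    return tuple(v - d * n for v, n in zip(view, normal))

def environment_colour(direction):
    # Stand-in for an environment-map lookup: blue 'sky' above, brown
    # 'ground' below. A real map would be a rendered or captured image.
    return (0.4, 0.6, 1.0) if direction[2] > 0 else (0.45, 0.35, 0.25)

# A curved, helmet-like surface tilted away from the viewer:
r = reflect((0, 0, -1), (0, 0.3827, 0.9239))
print(environment_colour(r))  # reflects the 'sky'
```

The result is plausible rather than correct - the reflection never shows nearby objects, only the canned surroundings - which is exactly the kind of simplification the eye rarely notices.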
In some scenes, for example the outdoor car chase or the rainy window shot looking out of Sid's bedroom, the sense of photorealism is intense. In other scenes, there are generalisations in the appearances of objects that lend the image a more pictorial feel: you know that the objects are representations. They have a stylisation, or degree of abstraction, that is not of the photograph but of the hand. Some of the surface simulations appear to work under certain conditions and not others. For instance, the main characters twice find themselves standing amongst plants. In one of these scenes the leaves are utterly convincing; in the other they do not look wrong, but they look stylised, pictorial even - they belong to a different visual language. In this, Toy Story straddles filmic genres. The seamless incorporation of CGI into live action (Jurassic Park) demands photorealism, while other forms require lesser degrees of optical fidelity. Although the differentiation between fantasy images and live action has been eroded since Who Framed Roger Rabbit (1988) combined cartoons and real people, The Mask (1994) complicated the issue by making cartoon imagery photorealistic. Here, the vocabulary of distortions, exaggerations and logical quirks of traditional cel animation was rendered in a three-dimensional, photorealistic style and composited with live action. In Toy Story, however, it is sometimes difficult to know what you are watching - if the film has a fault, it is that it is too good in places. The appearance of an utterly believable object alongside one that belongs to a different genre of representation is rather like looking at a painting by an artist (Tissot perhaps) who is particularly good at drapery and polished wood, but whose depiction of foliage and flesh is more general and stylised.
It is significant that the development and theorisation of perspective and geometrical representation that evolved in the 15th century has been ascribed to the educational system (which favoured practical mathematics) and the major economy of the time (trade), which required the ability to gauge accurately the volumes of irregular, non-standardised containers. 3 Artists and their patrons, the new merchant class, frequently shared the same educational background. The Renaissance painter Piero della Francesca, for example, provided merchants with formulas for calculating volume in his textbook De Abaco, while using the same geometric skills in his paintings. In the 20th century, the parametrisation and theory of complex curved surfaces is largely the result of research carried out by Pierre Bézier at Renault and Paul de Casteljau at Citroën. The precise definition of curved surfaces is of obvious importance to the automotive and aerospace industries, and current spline-based modelling software is as much a product of 20th century economics as perspective and simple volumetric representation were of the Quattrocento economy. Similarly, the sampling and filtering techniques used in the digitisation and analogue reconstruction of audio and visual material today stem from the digital signal processing theory so vital to the 20th century's greatest commodity: communications.
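De Casteljau's contribution can be stated in remarkably few lines: a point on a Bézier curve is found by repeatedly interpolating between the control points until a single point remains. A minimal sketch:

```python
# De Casteljau's algorithm: evaluate a point on a Bezier curve by
# repeated linear interpolation between its control points.

def de_casteljau(control_points, t):
    points = list(control_points)
    while len(points) > 1:
        # Blend each adjacent pair; each pass shrinks the list by one.
        points = [tuple((1 - t) * a + t * b for a, b in zip(p, q))
                  for p, q in zip(points, points[1:])]
    return points[0]

# A cubic curve of the kind spline-based modellers are built from:
hull = [(0, 0), (1, 2), (3, 2), (4, 0)]
print(de_casteljau(hull, 0.5))  # the curve's midpoint: (2.0, 1.5)
```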
When western visual art has come closest to optical verisimilitude - in Renaissance Italy, in the Netherlands from the 15th to the 17th century, in the mid-19th century Britain of the Pre-Raphaelites - it has been at moments when these countries reached their mercantile zenith. As Renaissance art represented a Humanist view of the world that tended towards literalness and optical fidelity, so too does Toy Story, in its quiet existential crises and implicit fear of redundancy, move towards photorealism and the tangible manifestation of fantasy, but as the product of this century's doubt. In this quest for increasing photorealism in film special effects, computer games and CGI, there is an underlying sense that everything must be represented literally to be believable, to be real. This may or may not have something to do with the shift in our literacy from text and pictorial images to the lens-based media of film and television, but it is equally reflected in a general anxiety that contemporary art has to be 'about' something in an absurdly concrete way (Koons, Hirst...). Perhaps Capitalism demands tangibility - that only the quantifiable can have value - but the equation of what something is with what it means allows very little scope for any other reality than the optical.
1. Steve Upstill, The RenderMan Companion, Addison-Wesley, 1992
2. Leonardo da Vinci, 'Linear Perspective', from The Literary Works of Leonardo da Vinci, London, 1883
3. Michael Baxandall, Painting and Experience in Fifteenth Century Italy, Oxford University Press, 1972