Eyes -- nostrils -- lips -- shiny things. My eyes scanned the photo of two women almost the way a monkey's would, going first to the faces, specifically features that would indicate friend, foe, or possible mate, then to the brightest objects in the image: a blue bottle, a pendant. Surprisingly, it took forever (almost three seconds) for me to look at the famous -- and famously gorgeous -- face in the photo, Angelina Jolie.
In this impulsive type of seeing, called "bottom-up," my eyes went first to the face of the other woman in the picture until I consciously ordered them to spend time on Jolie -- "top-down attentional deployment" in scientific lingo. That's because, researchers have found, we recognize culture-defined beauty only after taking the reflexive glances all primate eyes perform within the first few milliseconds. In other words, we see first with our animal selves and then with our acculturated minds.
We process visual information very quickly, as the brain electronically parcels parts of images to different cortical areas concerned with faces, colors, shapes, motion, and many other aspects of a scene, where they are broken down even further. Then the brain puts all that information back together into a coherent composite before directing the eyes to move.
My eye tracks were being recorded by vision researcher Laurent Itti, an associate professor of computer science, psychology, and neuroscience at the University of Southern California's iLab (ilab.usc.edu). Professors and graduate students there are performing basic research that may help develop "machine vision" for robot eyes. They've already shown that the eyes of kids at risk of Attention Deficit Hyperactivity Disorder move less methodically than the norm. They and other scientists around the world are churning out a trove of information that photographers (and camera engineers) can exploit right now.
Some long-held beliefs about what makes a good picture have been confirmed in objective experiments. For instance, now we know why you can't go wrong photographing a train wreck -- the eye physiologically sharpens, and is drawn to, lines, corners, and junctions, as well as faces, while the unconscious mind is attuned to fight-or-flight situations.
And if you know empirically what entices the eye, you should, in theory, be able to make more memorable images. (See the sidebars for ways to put vision theory into practice in your photos.) The key when looking through the viewfinder is to tap into the most primal attractors of attention -- to just shoot without thinking and ask questions later about why it's a good picture.
NATURE, THEN NURTURE
Pioneering Russian vision researcher A.L. Yarbus proved through experiments in the 1950s that people who think differently actually see differently: How they interpret the object they're looking at determines which object they move their eyes to next.
One widely accepted theory posits that when we view a scene, our brain deconstructs it into several overlays, including contrast, color, movement, shape, orientation, and other cues. Within 25 to 30 milliseconds, our brain recombines these into a "saliency map" with differently weighted hotspots rigged to our survival instincts.
A phenomenon called inhibition of return prevents our attention from coming back to a spot we've already considered until after we've scanned all the hotspots. But attention and eye movements interact, so some neural system evidently checks off objects in descending order of salience.
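For readers who think in code, here is a toy sketch of that idea, not the researchers' actual model. It adds two made-up feature overlays (contrast and color) into a single saliency map, then visits the hotspots in descending order, blanking out each spot after it is visited to mimic inhibition of return. The grid values, the weights, the suppression radius, and the function names `combine_maps` and `scan_path` are all assumptions chosen for illustration.

```python
def combine_maps(feature_maps, weights):
    """Weighted sum of same-sized 2D feature maps into one saliency map."""
    rows, cols = len(feature_maps[0]), len(feature_maps[0][0])
    saliency = [[0.0] * cols for _ in range(rows)]
    for fmap, weight in zip(feature_maps, weights):
        for r in range(rows):
            for c in range(cols):
                saliency[r][c] += weight * fmap[r][c]
    return saliency


def scan_path(saliency, n_fixations, radius=1):
    """Return up to n_fixations (row, col) spots in descending salience."""
    work = [row[:] for row in saliency]  # copy, so the input map is untouched
    rows, cols = len(work), len(work[0])
    path = []
    for _ in range(n_fixations):
        # Pick the single most salient remaining location.
        r, c = max(
            ((i, j) for i in range(rows) for j in range(cols)),
            key=lambda rc: work[rc[0]][rc[1]],
        )
        if work[r][c] <= 0:
            break  # every hotspot has been visited
        path.append((r, c))
        # Inhibition of return: zero out the visited spot and its neighbors.
        for i in range(max(0, r - radius), min(rows, r + radius + 1)):
            for j in range(max(0, c - radius), min(cols, c + radius + 1)):
                work[i][j] = 0.0
    return path


# Hypothetical 3x5 "image": one strong-contrast spot, two weaker hotspots.
contrast = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.4],
]
color = [
    [0.0, 0.0, 0.0, 0.0, 0.6],
    [0.0, 0.2, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]

saliency = combine_maps([contrast, color], weights=[1.0, 0.5])
path = scan_path(saliency, n_fixations=3)
print(path)  # [(1, 1), (2, 4), (0, 4)]
```

The simulated gaze lands on the strongest hotspot first, then moves down the ranking without doubling back. Changing the weights changes the scan order, which is roughly what "differently weighted hotspots" implies.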
"Saliency is low-level surprise," explains Christof Koch, a leading vision researcher on both the biology and engineering faculties at the California Institute of Technology, who has an interesting glossary of cognition on his website, www.klab.caltech.edu/~koch. "Faces are salient, as are motion, flickering, contrast -- all depending on the context. We can say with confidence where you will look within an image."
In the first 150 milliseconds of looking at a picture, those elements draw the eye. That's bottom-up seeing. Then, top-down spotlighting takes over: "Of special significance are images of biological relevance -- like fear, sex, gender, aggression -- which are much more dependent on training and culture," Koch says.
Tests have shown, for example, that East Asians' interpretations of facial expressions depend heavily on the expressions of people surrounding the subject in an image. For Mexicans the color blue, not black, signals mourning. These judgments happen quickly, often without conscious thinking. Such cultural norms dictate not only the way we interpret images but the unconscious motion of our eyes when looking at them.
Most forms of eye movement are unconscious. We do a lot of things without thinking about what our bodies are doing, such as walking and adjusting our posture. Koch calls this kind of action a "zombie agent," defined as "a stereotyped, rapid, and effortless sensory-motor behavior that does not give rise to a conscious sensation. Consciousness for this behavior may come later or not at all." In top-down, goal-driven attentional mode, we move our eyes 3 to 5 times per second, about 200,000 times per day. In bottom-up, we don't consciously move our eyes, but we see 10 times faster.
For a photographer, the ideal is to appeal to both ways of seeing. How far apart are they? USC's Laurent Itti frames the scale in terms of painters: "Bottom-up is Robert Rauschenberg, top-down is Thomas Kinkade. You feel more comfortable with familiarity [Kinkade], but surprise and novelty [Rauschenberg] keep you looking."