Eyes — nostrils — lips — shiny things. My eyes scanned the photo of two women almost the way a monkey’s would, going first to the faces, specifically features that would indicate friend, foe, or possible mate, then to the brightest objects in the image: a blue bottle, a pendant. Surprisingly, it took forever (almost three seconds) for me to look at the famous — and famously gorgeous — face in the photo, Angelina Jolie.

In this impulsive type of seeing, called “bottom-up,” my eyes went first to the face of the other woman in the picture until I consciously ordered them to spend time on Jolie — “top-down attentional deployment” in scientific lingo. That’s because, researchers have found, we recognize culture-defined beauty only after taking the reflexive glances all primate eyes perform within the first few milliseconds. In other words, we see first with our animal selves and then with our acculturated minds.

We process visual information very quickly, as the brain electronically parcels parts of images to different cortical areas concerned with faces, colors, shapes, motion, and many other aspects of a scene, where they are broken down even further. Then the brain puts all that information back together into a coherent composite before directing the eyes to move.

My eye tracks were being recorded by vision researcher Laurent Itti, an associate professor of computer science, psychology, and neuroscience at the University of Southern California’s iLab. Professors and graduate students there are performing basic research that may help develop “machine vision” for robot eyes. They’ve already shown how the eyes of kids at risk of Attention Deficit Hyperactivity Disorder move less methodically than the norm. They and other scientists around the world are churning out a trove of information that photographers (and camera engineers) can exploit right now.

Some long-held beliefs about what makes a good picture have been confirmed in objective experiments. For instance, now we know why you can’t go wrong photographing a train wreck: The eye physiologically sharpens and is drawn to lines, corners, and junctions, as well as to faces, while the unconscious mind is attuned to fight-or-flight situations.

And if you know empirically what entices the eye, you should theoretically make more memorable images. (See the sidebars for ways to put vision theory into practice in your photos.) The key when looking through the viewfinder is to tap into the most primal attractors of attention — to just shoot without thinking and ask questions later about why it’s a good picture.


Pioneering Russian vision researcher A.L. Yarbus proved through experiments in the 1950s that people who think differently actually see differently: How they interpret the object they’re looking at determines which object they move their eyes to next.

One widely accepted theory posits that when we view a scene, our brain deconstructs it into several overlays, including contrast, color, movement, shape, orientation, and other cues. Within 25 to 30 milliseconds, our brain recombines these into a “saliency map” with differently weighted hotspots rigged to our survival instincts.

A phenomenon called inhibition of return prevents our attention from coming back to a spot we’ve already considered until after we’ve scanned all the hotspots. But attention and eye movements interplay, so there’s some neural system of checking off objects in descending order of salience.
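
To make the idea concrete, here is a toy sketch in Python of a saliency map scanned with inhibition of return. It is not the researchers’ actual model: the two feature maps, their weights, the grid size, the suppression radius, and the function names `combine` and `scan_path` are all invented for illustration.

```python
# Toy saliency model. Every number below is invented for illustration.

def combine(feature_maps, weights):
    """Weighted sum of equally sized 2D feature maps -> one saliency map."""
    rows, cols = len(feature_maps[0]), len(feature_maps[0][0])
    return [[sum(w * fm[r][c] for fm, w in zip(feature_maps, weights))
             for c in range(cols)]
            for r in range(rows)]


def scan_path(saliency, n_fixations, radius=1):
    """Fixate the most salient cell, suppress its neighborhood, repeat."""
    grid = [row[:] for row in saliency]  # work on a copy
    rows, cols = len(grid), len(grid[0])
    path = []
    for _ in range(n_fixations):
        r, c = max(((i, j) for i in range(rows) for j in range(cols)),
                   key=lambda rc: grid[rc[0]][rc[1]])
        path.append((r, c))
        # Inhibition of return: block this spot and its immediate neighbors
        # so the simulated gaze moves on to the next hotspot.
        for i in range(max(0, r - radius), min(rows, r + radius + 1)):
            for j in range(max(0, c - radius), min(cols, c + radius + 1)):
                grid[i][j] = float("-inf")
    return path


# Two hypothetical "overlays" of a 4 x 5 scene, as (row, column) grids.
contrast = [[0.2, 0.0, 0.0, 0.0, 0.9],
            [0.0, 0.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, 0.4, 0.0, 0.0]]
color = [[0.8, 0.0, 0.0, 0.0, 0.1],
         [0.0, 0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.6, 0.0, 0.0]]

saliency = combine([contrast, color], weights=[1.0, 0.5])
path = scan_path(saliency, n_fixations=3)
print(path)
```

With these made-up numbers the simulated gaze lands on the top-right corner first, then the bottom center, then the top left: the hotspots in descending order of combined salience, with no spot visited twice.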

“Saliency is low-level surprise,” explains Christof Koch, a leading vision researcher on both the biology and engineering faculty at California Institute of Technology, who has an interesting glossary of cognition on his website. “Faces are salient, as are motion, flickering, contrast, all depending on the context. We can say with confidence where you will look within an image.”

In the first 150 milliseconds of looking at a picture, those elements draw the eye. That’s bottom-up seeing. Then, top-down spotlighting takes over: “Of special significance are images of biological relevance — like fear, sex, gender, aggression — which are much more dependent on training and culture,” Koch says.
Tests have shown, for example, that East Asians’ interpretations of facial expressions depend heavily on the expressions of people surrounding the subject in an image. For Mexicans the color blue, not black, signals mourning. These judgments happen quickly, often without conscious thinking. Such cultural norms dictate not only the way we interpret images but the unconscious motion of our eyes when looking at them.

Most forms of eye movement are unconscious. We do a lot of things without thinking about what our bodies are doing, such as walking and adjusting our posture. Koch calls this kind of action a “zombie agent,” defined as “a stereotyped, rapid, and effortless sensory-motor behavior that does not give rise to a conscious sensation. Consciousness for this behavior may come later or not at all.” In top-down, goal-driven attentional mode, we move our eyes 3 to 5 times per second, about 200,000 times per day. In bottom-up, we don’t consciously move our eyes, but we see 10 times faster.

For a photographer, the ideal is to appeal to both ways of seeing. How far apart are they? USC’s Laurent Itti frames the scale in terms of painters: “Bottom-up is Robert Rauschenberg, top-down is Thomas Kinkade. You feel more comfortable with familiarity [Kinkade], but surprise and novelty [Rauschenberg] keep you looking.”


If you’ve seen particularly evocative photos on a hospital’s walls, Adam Gazzaley may have shot them. “The most powerful photos are warm and comfortable, but also new and exciting,” says the M.D./Ph.D., who teaches cognitive neuroscience and runs the Neuroscience Imaging Center at the University of California, San Francisco.

His first photographs depicted brain slices and neurons, taken through a microscope. As he studied underlying patterns in the way the brain converts images into cognition, he also developed a passion for outdoor nature photography, and has sold a lot of prints to hospitals.

Now he studies how people’s brains light up in an MRI while they look at his pictures. His data is helping to show why our aging brains find it increasingly difficult to filter out irrelevant information: As we grow older our saliency maps tend to become more crowded and our ability to weigh different levels of importance in a scene diminishes.

His advice for taking good pictures? “You have to devolve a couple notches, to shift the balance back from top-down to seeing bottom-up,” Gazzaley says. “There’s a price in being goal-oriented, too top-down — you miss the flowers along the stream bank as you rush to the waterfall.” The idea is to be “more stimulus-driven than goal-directed.”

I ask if that means we need to regress to seeing like cavemen. “Maybe below that,” he replies. “Back to pre-human. Being predominately top-down is how we’ve evolved and survived, but you lose appreciation for the subtleties in the world.”

Gazzaley says he’s shot enough pictures to be able to put most technical questions out of his mind as he tries to see bottom-up. “You have to tune in to what it is that’s changing your emotions, and try to capture that,” he says. “What’s the point of the picture? If you can’t describe a photo in three or four words, and you don’t feel emotion while you’re taking it, the viewer won’t feel much either.”


Eyes — lips — shiny things. My own eyes zombie through 12 photos in the USC iLab. It isn’t exactly A Clockwork Orange, with pincers holding my eyelids open. But I do have to keep my chin resting solidly on a T-stand as Laurent Itti raises my chair to aim my gaze slightly downward at a high-definition, 42-inch TV screen about 4 feet away.

As he aims a camera at my eyes to record their movements, I say through clenched teeth, “So all you guys in this lab must be excellent photographers, knowing so much about how people see.”

“That knowledge may work against us,” he replies. “None of us are very good photographers.” He confesses to knowing why creating “scan lines” for the eye to follow through a picture is important, but says that the overly left-brained (e.g., research scientists) have trouble transporting that knowledge across the synapse between science and art.

After my eyes are pointed at a target at the neutral center of the photograph, I bring the pictures up separately on the screen by touching the space bar on a computer keyboard. The results, such as that with Angelina Jolie, match closely the way a computer model of human sight, based on hundreds of such tests, predicted my eyes would move.

“Where the eye lands in an image is not much different between monkeys and humans,” says USC grad student David Berg, who studies visual stimuli in monkey brains. “Eyes, mouth corners, the top of the lip — people and monkeys look for emotional significance in a face.”

How can a photographer take advantage of these findings? If you want to draw the eye first to something nonhuman in a photograph, leave out faces, be they human, dog, or mask. If you want to tap the viewer’s strongest subconscious impulses, hide faces or face-like shadows in the trees and clouds. (After all, painters such as Van Gogh and Cezanne did this — intentionally or not — and we’re still looking at their pictures.)


Of course, all this research and development isn’t happening just in order to make you a better photographer, though that’s a nice side benefit. It’s chiefly aimed at industrial applications. Some of this eye-knowledge has already made its way into cameras, which now routinely include face-detection technology in exposure and focusing systems under the scientifically supported assumption that faces are what photographers, and viewers, care most about.

And there’s more to come. Electrical engineers at Stanford have developed a digital camera with 12,616 tiny lenses that sees in super 3D. They’ve shrunk the imaging sensor’s pixels down to 0.7 microns, less than a tenth the size of the pixels in many DSLRs, and grouped them into arrays topped by miniature lenses much smaller than the ones used in today’s sensors.

This, in turn, is helping pave the way for the gigapixel camera, with 100 times more pixels than today’s 10MP clunker.
Still, to build better DSLRs isn’t the real goal. It’s to build better robots, with the visual acuity of the Terminator. I have to wonder if such super-seeing cyborgs will one day realize they have more visual power than us, decide to take over, and begin laying waste to major cities. One thing’s certain: We’ll at least know why we can’t tear our eyes away from the pictures.

In principle: The eye’s nighttime ISO has been estimated at about 800. Since day vision is about 600 times less sensitive to light, on a sunny day your eyes have an ISO of close to 1. In practice: The slowest film you can buy is ISO 25, and ISO 50 is low on a digital camera. But with DSLRs now topping ISO 6400, your camera sees in the dark better than you do. So enjoy your camera’s nighttime advantage and shoot when the lights are low.
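
The arithmetic behind those figures is easy to check. The sketch below uses only the numbers quoted above (ISO 800 at night, a 600-fold drop in daylight, a DSLR at ISO 6400); the variable names are made up, and the stop count simply applies the usual convention that each doubling of ISO is one stop.

```python
import math

NIGHT_ISO = 800        # estimated night sensitivity of the eye (from the text)
DAY_DROP = 600         # day vision is about 600 times less sensitive
CAMERA_ISO = 6400      # top ISO of the DSLRs mentioned

day_iso = NIGHT_ISO / DAY_DROP
# Each doubling of ISO is one stop, so the gap in stops is a base-2 logarithm.
camera_advantage_stops = math.log2(CAMERA_ISO / NIGHT_ISO)

print(f"Daytime eye ISO: about {day_iso:.1f}")
print(f"Camera advantage at night: {camera_advantage_stops:.0f} stops")
```

So “close to 1” works out to about 1.3, and a camera at ISO 6400 has a three-stop edge over the dark-adapted eye.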

In principle: Eye-movement tracking shows that the eye is drawn to lines, is even more taken with angles, and returns repeatedly to corners. The Mach Effect describes how the eye searches out luminance differences by neurologically exaggerating contrast along edges. In practice: We love lines, especially horizons. So use them to your advantage: Keep lines straight, include points of intersection, and put your subject close to corners.
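
One simple way to picture that edge exaggeration is lateral inhibition, in which each receptor’s signal is reduced by a fraction of its neighbors’ signals. The sketch below is a one-dimensional toy, not a model of real retinal wiring: the inhibition strength `k`, the luminance values, and the function name `lateral_inhibition` are assumptions chosen for illustration.

```python
def lateral_inhibition(luminance, k=0.2):
    """Each cell's response is its input minus a fraction k of each neighbor's."""
    # Repeat the end values so the outermost cells also have two neighbors.
    padded = [luminance[0]] + list(luminance) + [luminance[-1]]
    return [padded[i] - k * (padded[i - 1] + padded[i + 1])
            for i in range(1, len(padded) - 1)]


# A dark region meeting a region twice as bright: a single hard edge.
step = [1.0] * 4 + [2.0] * 4
response = lateral_inhibition(step)
print([round(v, 2) for v in response])
```

Away from the edge the responses are flat, at about 0.6 on the dark side and 1.2 on the bright side. Right at the boundary the dark cell dips to about 0.4 and the bright cell rises to about 1.4, so the edge comes out with more contrast than the scene actually contains.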

In principle: We’re subliminally influenced by the faces we see. Pupillometrics has demonstrated that when you look at an image of a person, your pupils dilate to the same diameter as the person’s in the picture. Tests also have shown that people prefer photos in which the pupils — human or animal — are dilated.
In practice: Avoid bright, pupil-contracting lights in the eyes when shooting portraits, or score some belladonna.

In principle: Motion parallax helps us see in 3D because when we stare at a fixed object and then move sideways, nearer objects appear to move in the opposite direction while distant things appear to move in the same direction.
In practice: Place someone or something moving to the right in the foreground (perhaps a couple walking on the beach), and in the background above their heads place objects projecting to the left (such as cliffs and headlands).

In principle: The illuminance ratio of sunlight to starlight is 1 billion to 1. Human vision spans the whole range, a spread far better than any camera’s.
In practice: Use split neutral-density filters in sunshine and high-dynamic range photography at night to make your pictures look more like your actual experience of the scene.

In principle: Does the eye care whether a photo is black-and-white or color when it comes to where it fixates? No. More important are contrast and whether an object creates a “hotspot” for the eye.
In practice: When shooting b&w, make sure you have enough contrast to keep viewers’ attention where you want it. And don’t forget those all-important lines and faces.

In principle: With a focal length of about 22mm and a field of view of almost 180 degrees at its extreme, our eyes are capable of f/3.5 wide open. Even then, only the central 2 degrees of the view, which falls on an area of the retina called the fovea, is sharp.
In practice: This may be why shallow depth of field is so visually appealing. So use it to draw attention straight to your subject. Extreme depth or shallowness can also introduce an element of surprise to your photo.
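
An f-number is just focal length divided by the diameter of the opening, so the figures quoted for the eye imply a particular pupil size. A quick check, using only the 22mm and f/3.5 values from the text (the variable names are made up):

```python
FOCAL_LENGTH_MM = 22.0   # approximate focal length of the eye (from the text)
F_NUMBER = 3.5           # the eye "wide open" (from the text)

# f-number = focal length / aperture diameter, rearranged for the diameter.
pupil_diameter_mm = FOCAL_LENGTH_MM / F_NUMBER
print(f"Implied pupil diameter: {pupil_diameter_mm:.1f} mm")
```

That comes to a little over 6mm, a plausible size for a pupil fully dilated in the dark.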

In principle: Looking at a 120-degree field of view, the eye’s resolution is equivalent to about 576 megapixels.
In practice: When we view pictures, our brains can identify and make associations with a range of blurry and indistinct elements, essentially filling in the blanks.
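
How the 576-megapixel figure was derived isn’t stated here. One back-of-the-envelope route that lands on the same number assumes the eye resolves detail of about 0.3 arc-minute (18 arc-seconds) per “pixel” across a square 120-degree field. That resolving figure is an assumption for this sketch, not something given in the text.

```python
FIELD_DEGREES = 120            # field of view quoted in the text
ARCSEC_PER_DEGREE = 3600
ARCSEC_PER_PIXEL = 18          # ASSUMPTION: 0.3 arc-minute of detail per "pixel"

pixels_per_side = FIELD_DEGREES * ARCSEC_PER_DEGREE // ARCSEC_PER_PIXEL
total_pixels = pixels_per_side ** 2    # treat the field as a square
megapixels = total_pixels // 1_000_000

print(f"{pixels_per_side} x {pixels_per_side} pixels = {megapixels} megapixels")
```

Under that assumption the field is 24,000 “pixels” on a side, or 576 megapixels. Since only the fovea is truly sharp, the eye reaches that figure by scanning, not in a single glance.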

In principle: Researchers studied 225 paintings going back three centuries and found that 75 percent depicted the illumination source above and to the left. Human testing confirmed that, in the absence of clues, the brains of right-handers infer illumination from above left, while southpaws see the light coming from above right.
In practice: If it worked for Vermeer, it’ll work for you.

In principle: We’re drawn to faces. Our eyes and brain evolved to assess almost instantaneously whether we are seeing a predator, prey, or mate.
In practice: Shoot more portraits! Perhaps the ultimate eye-pleasing photo would include scan lines that link a tiger attacking an ibex to an attractive person looking on. (Good luck with that.)

In principle: Surprise is the strongest known attractor of human attention. The Bayesian Theory of Surprise provides a mathematical framework for quantifying the degree of incompatible data in an image. The scientific definition of surprise? A relationship between objects that changes your beliefs about the world.
In practice: Depict the unexpected, whether you stumble across it (a good reason always to have a camera with you) or set it up.
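
For the mathematically curious, “changes your beliefs” can be put in numbers. In the Bayesian formulation, as commonly presented, surprise is the distance between what an observer believed before seeing the data and what they believe afterward, measured as a Kullback-Leibler divergence. The sketch below is a minimal illustration with two made-up hypotheses about a scene; the probabilities and the function names `posterior` and `surprise_bits` are invented for the example.

```python
import math


def posterior(prior, likelihood):
    """Bayes' rule over a discrete set of hypotheses."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def surprise_bits(prior, likelihood):
    """KL divergence from prior to posterior, in bits."""
    post = posterior(prior, likelihood)
    return sum(q * math.log2(q / p) for q, p in zip(post, prior) if q > 0)


# Beliefs before looking: [ordinary scene, something unusual going on]
prior = [0.9, 0.1]

# How likely each observation is under each hypothesis (made-up values).
expected_sight = [0.8, 0.1]     # e.g., a couple walking on the beach
unexpected_sight = [0.1, 0.8]   # e.g., a tiger on the beach

print(f"Expected sight:   {surprise_bits(prior, expected_sight):.2f} bits")
print(f"Unexpected sight: {surprise_bits(prior, unexpected_sight):.2f} bits")
```

With these numbers the expected sight barely moves the observer’s beliefs (about 0.09 bits), while the unexpected one shifts them by roughly 0.65 bits. An observation that is equally likely under both hypotheses changes nothing and scores zero.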