The Relationship Between Bottom-Up Saliency and Gaze Behaviour During Audiovisual Speech Perception
Face-to-face communication is one of the most natural forms of interaction between humans, and speech perception is an important part of this interaction. Although speech is primarily auditory in nature, visual information can significantly influence perception. It is not well understood which visual information is important or how that information is collected. Previous studies have documented a preference for gazing at the eyes, nose, and mouth of the talking face, but physical saliency, i.e., the unique low-level features of the stimulus, has not been explicitly examined. Two eye-tracking experiments are presented to investigate the role of physical saliency in the guidance of gaze fixations during audiovisual speech perception. Experiment 1 quantified the physical saliency of a talking face and examined its relationship with the gaze behaviour of participants performing an audiovisual speech perception task and an emotion judgment task. The majority of fixations were made to locations on the face that exhibited high relative saliency, but not necessarily to the maximally salient location. The addition of acoustic background noise resulted in a change in gaze behaviour and a decrease in correspondence between saliency and gaze behaviour, whereas changing the task did not alter this correspondence despite changes in gaze behaviour. Experiment 2 manipulated the visual information available to the viewer by using animated full-feature and point-light talking faces. Removing static information, such as colour, intensity, and orientation, from the stimuli elicited both a change in gaze behaviour and a decrease in correspondence between saliency and gaze behaviour. Removing dynamic information, particularly head motion, resulted in a decrease in correspondence between saliency and gaze behaviour without any change in gaze behaviour.
The results of these experiments show that, while physical saliency is correlated with gaze behaviour, it cannot be the sole factor determining the selection of gaze fixations. Instead, interactions within and between bottom-up and top-down processing are suggested to guide the selection of gaze fixations during audiovisual speech perception.