Four-, 6-, and 8-month-old infants' perception of the multimodal features of the human face was investigated. First, infants were habituated to a visible and audible face of a person reciting a prepared script. They were then tested by changing various features of just the audible, just the visible, or both components of the face. When features such as the lexical-syntactic content, the speaker's gender, or the synchrony relation between the audible and visible components were changed, the infants discriminated the multimodal and visible changes but not the audible one. When the manner of speech was changed from adult- to infant-directed, the 2 older groups discriminated all 3 types of change, but the 4-month-old infants discriminated only the visible and multimodal changes. The results show that exaggerated prosodic cues in speech facilitate detection of the audible features of multimodally represented faces, but not until 6 months of age.