Temporal integration involves combining information over time to improve detection or discrimination. In psychoacoustic work, temporal integration is often conceived of and modeled as a simple accumulation process, such as energy integration. However, it may be more appropriate to consider temporal integration as the combination of information from multiple independent "looks". This paper reviews the evidence supporting the concept of multiple looks and considers how this concept can be related to the internal representations of stimuli. Models are described in which the internal representation of a sound can be calculated as a spectro-temporal excitation pattern (STEP). It is assumed that central mechanisms can make "intelligent" use of the information in the STEP to enhance signal detection, discrimination, and identification. The results of many detection and discrimination experiments can be modeled using the idea of templates, each based on the internal representation of a prototype or target stimulus. Decisions are based on the similarity between the internal representation of the current stimulus and the template. In psychoacoustic experiments, the template may be built up during the early trials of a run. For stimuli like speech, templates may be stored in long-term memory. Information extracted from one part of a sound may influence the evaluation and interpretation of information extracted from another part at a different time. This occurs for both nonspeech and speech sounds. The second part of this paper describes some contextual effects of this type and considers the mechanisms underlying them. Several processes seem to be involved, including perceptual grouping, adaptation in the auditory periphery, sensitivity of the auditory system to spectral changes, and central compensation mechanisms for spectral distortions. © 2003 Elsevier Science Ltd. All rights reserved.