We describe an apparatus capable of detecting a specific sound within a sound mixture and of identifying the acoustic environment, so that it can adapt automatically to changing environmental conditions. Such an apparatus could be useful, for example, in the design of an automatic speech recognition (ASR) system that is robust to environmental noise. Starting from a time-frequency representation of the incoming signal, neuro-fuzzy clustering is performed on suitable features extracted from time-frequency objects.
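
The processing chain outlined above (time-frequency representation, feature extraction, fuzzy clustering) can be illustrated with a minimal sketch. It is not the apparatus itself: plain fuzzy c-means stands in for the neuro-fuzzy clustering, features are computed per frame rather than per time-frequency object, and the two features (spectral centroid and log energy), the function names, the parameters, and the synthetic two-tone signal are all assumptions made for illustration.

```python
import numpy as np

def stft_magnitude(x, frame_len=256, hop=128):
    """Magnitude spectrogram from a Hann-windowed short-time Fourier transform."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, frame_len // 2 + 1)

def frame_features(mag, sr):
    """Illustrative per-frame features: spectral centroid (Hz) and log energy."""
    freqs = np.fft.rfftfreq(2 * (mag.shape[1] - 1), d=1.0 / sr)
    centroid = (mag @ freqs) / (mag.sum(axis=1) + 1e-12)
    log_energy = np.log((mag ** 2).sum(axis=1) + 1e-12)
    return np.column_stack([centroid, log_energy])

def fuzzy_c_means(X, n_clusters=2, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy c-means (a stand-in, not the neuro-fuzzy method).

    Returns the membership matrix U (n_samples, n_clusters) and the centers.
    """
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), n_clusters))
    U /= U.sum(axis=1, keepdims=True)          # memberships sum to 1 per sample
    for _ in range(n_iter):
        W = U ** m                              # fuzzified memberships
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = dist ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U, centers

# Synthetic "scene change": a loud 300 Hz tone followed by a quieter 3 kHz tone.
sr = 8000
t = np.arange(sr) / sr
signal = np.where(t < 0.5,
                  np.sin(2 * np.pi * 300 * t),
                  0.3 * np.sin(2 * np.pi * 3000 * t))

mag = stft_magnitude(signal)
F = frame_features(mag, sr)
F = (F - F.mean(axis=0)) / (F.std(axis=0) + 1e-12)  # put features on one scale

U, centers = fuzzy_c_means(F, n_clusters=2)
labels = U.argmax(axis=1)                           # hard label per frame
```

Each row of `U` holds the soft memberships of one frame in the two clusters. Frames that straddle the transition between the tones would be expected to show the least decisive memberships, which is the kind of graded evidence a fuzzy scheme offers over a hard classifier.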