We explore the possibility of recognizing speech signals using a large collection of coarse acoustic events, which describe temporal relations between a small number of local features of the spectrogram, The major issue of invariance to changes in duration of speech signal events is addressed by defining temporal relations in a rather coarse manner, allowing for a large degree of slack. The approach is greedy in that it does not offer an "explanation" of the entire signal as the hidden Markov models (HMMs) approach does; rather, it accesses small amounts of relational information to determine a speech unit or class. This implies that we recognize words as units, without recognizing their subcomponents, Multiple randomized decision trees are used to access the large pool of acoustic events in a systematic manner and are aggregated to produce the classifier.