A multi-feature, composite-model particle filter algorithm is proposed to improve the accuracy and robustness of sound source localization in reverberant and noisy environments. In this algorithm, the likelihood function of the particle filter is constructed from multiple features of the received microphone signals: deep features of the multiple-hypothesis time-delay estimation image are extracted by a convolutional neural network (CNN), and a time-delay estimation model based on support vector regression (SVR) is established. Furthermore, a beam output energy fusion mechanism is introduced to remedy the inability of a single feature to suppress noise and reverberation simultaneously. To account for the randomness of speaker motion, a composite motion model for sound source tracking is established, which improves the robustness of the speaker tracking system. Simulation and experimental results show that, with the composite model, the average position root mean square error (RMSE) of the multi-feature algorithm is more than 83% lower than that of the steered response power and time delay estimation (SRPTDE) algorithm, and that, under multi-feature observation, the average position RMSE of the composite model is more than 46% lower than that of the Langevin model and the random walk model. The proposed algorithm achieves effective tracking of randomly moving sound sources in complex environments. © 2024 China Ordnance Industry Corporation. All rights reserved.
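
The overall structure summarized above (a composite motion model for prediction, a fused multi-feature likelihood for the update) can be illustrated with a minimal Python sketch of one particle filter step. This is not the authors' implementation. It assumes that the composite model simply draws each particle's motion from either a Langevin model or a random walk with a fixed probability, and that feature fusion is a product of per-feature likelihoods. The Gaussian likelihoods are stand-ins for the CNN/SVR time-delay feature and the beam output energy feature, which in practice would be computed from the microphone signals. All function names and parameter values (`pf_step`, `p_langevin`, `beta`, `v_bar`, `sigma`) are hypothetical.

```python
import math
import random


def langevin_step(pos, vel, dt, rng, beta=10.0, v_bar=1.0):
    """Langevin motion: velocity decays toward zero and is re-excited by noise."""
    a = math.exp(-beta * dt)
    b = v_bar * math.sqrt(1.0 - a * a)
    new_vel = [a * v + b * rng.gauss(0.0, 1.0) for v in vel]
    new_pos = [p + dt * v for p, v in zip(pos, new_vel)]
    return new_pos, new_vel


def random_walk_step(pos, vel, dt, rng, sigma=0.05):
    """Random walk: position is perturbed directly; velocity is carried over."""
    new_pos = [p + sigma * rng.gauss(0.0, 1.0) for p in pos]
    return new_pos, list(vel)


def gaussian_likelihood(pos, observed, sigma):
    """Stand-in feature likelihood (assumption): Gaussian in position error."""
    d2 = sum((p - o) ** 2 for p, o in zip(pos, observed))
    return math.exp(-0.5 * d2 / sigma ** 2)


def pf_step(particles, weights, observations, dt, rng, p_langevin=0.5):
    """One predict-update-resample step.

    particles:    list of (pos, vel) pairs, each a 2-element list
    observations: list of (observed_pos, sigma) pairs, one per feature
    Returns (resampled particles, uniform weights, weighted position estimate).
    """
    # Predict: composite model (assumed form) picks a motion model per particle.
    moved = []
    for pos, vel in particles:
        if rng.random() < p_langevin:
            moved.append(langevin_step(pos, vel, dt, rng))
        else:
            moved.append(random_walk_step(pos, vel, dt, rng))

    # Update: fuse features by multiplying their likelihoods (assumed fusion rule).
    new_w = []
    for (pos, _), w in zip(moved, weights):
        like = 1.0
        for obs, sigma in observations:
            like *= gaussian_likelihood(pos, obs, sigma)
        new_w.append(w * like)
    total = sum(new_w)
    n = len(moved)
    if total <= 0.0:
        new_w = [1.0 / n] * n  # fall back to uniform weights on underflow
    else:
        new_w = [w / total for w in new_w]

    # Estimate: weighted mean of particle positions.
    est = [sum(w * pos[i] for (pos, _), w in zip(moved, new_w)) for i in range(2)]

    # Resample (multinomial) and reset to uniform weights.
    resampled = rng.choices(moved, weights=new_w, k=n)
    return resampled, [1.0 / n] * n, est
```

Calling `pf_step` repeatedly with fresh observations yields a sequence of position estimates. Replacing `gaussian_likelihood` with likelihoods derived from real time-delay and beam-energy features is where the method described above would differ from this sketch.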