Environmental sounds are now integral to many aspects of daily life, including smart city management, accurate localisation, surveillance, machine hearing, and environmental monitoring. Central to these applications is environmental sound classification (ESC), which is attracting growing academic attention. However, ESC remains challenging because many variables introduce noise. This research aimed to identify the convolutional neural network (CNN) model with the highest accuracy on ESC tasks through hyperparameter optimisation. The simplified swarm optimisation (SSO) algorithm was used to encode the CNN architecture, providing a direct, untransformed representation of the CNN hyperparameters during optimisation. On three prominent datasets, with data augmentation applied, the CNN model designed via SSO achieved accuracies of 99.01%, 97.42%, and 98.96%, respectively. Compared with prior studies, these are the highest accuracies reported for a pure CNN model, advancing automated CNN design for urban sound classification.
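As a minimal illustration of how an SSO-style search can operate directly on hyperparameter values, the following Python sketch applies the commonly described SSO update rule: each variable is kept, copied from the personal best, copied from the global best, or resampled at random, according to a single random draw. It is not the implementation used in this study. The search space, the thresholds `cw`, `cp`, `cg`, and the `toy_fitness` function are all hypothetical; in practice the fitness would be the validation accuracy of a CNN trained with the candidate hyperparameters.

```python
import random

# Hypothetical search space: names and candidate values are illustrative
# assumptions, not the hyperparameters tuned in the study.
SEARCH_SPACE = {
    "n_conv_layers": [2, 3, 4, 5],
    "n_filters": [16, 32, 64, 128],
    "kernel_size": [3, 5, 7],
    "learning_rate": [1e-4, 3e-4, 1e-3, 3e-3],
}


def random_solution(rng):
    """Draw one candidate; each entry is an actual hyperparameter value."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def sso_update(x, pbest, gbest, rng, cw=0.1, cp=0.4, cg=0.9):
    """SSO-style update with assumed thresholds cw < cp < cg.

    For each variable, one uniform draw rho decides whether to keep the
    current value, copy the personal best, copy the global best, or resample.
    """
    new = {}
    for name, choices in SEARCH_SPACE.items():
        rho = rng.random()
        if rho < cw:
            new[name] = x[name]              # keep current value
        elif rho < cp:
            new[name] = pbest[name]          # copy from personal best
        elif rho < cg:
            new[name] = gbest[name]          # copy from global best
        else:
            new[name] = rng.choice(choices)  # random exploration
    return new


def sso_search(fitness, n_particles=10, n_iters=20, seed=0):
    """Maximise `fitness` over SEARCH_SPACE; returns (best solution, score)."""
    rng = random.Random(seed)
    swarm = [random_solution(rng) for _ in range(n_particles)]
    pbest = list(swarm)
    pbest_fit = [fitness(s) for s in swarm]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g], pbest_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            swarm[i] = sso_update(swarm[i], pbest[i], gbest, rng)
            score = fitness(swarm[i])
            if score > pbest_fit[i]:
                pbest[i], pbest_fit[i] = swarm[i], score
                if score > gbest_fit:
                    gbest, gbest_fit = swarm[i], score
    return gbest, gbest_fit


def toy_fitness(solution):
    """Stand-in for CNN validation accuracy: counts matches to an arbitrary target."""
    target = {"n_conv_layers": 4, "n_filters": 64,
              "kernel_size": 3, "learning_rate": 1e-3}
    return sum(solution[k] == target[k] for k in target)


if __name__ == "__main__":
    best, score = sso_search(toy_fitness)
    print(best, score)
```

Because every solution stores the hyperparameter values themselves, no decoding step is needed between the optimiser and the model builder, which is one way to read the "untransformed representation" described above.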