How to effectively trim an original model into a lightweight one is an important issue for edge computing. This work develops an Iterative Momentum Pruning (IMP) scheme that adopts an adaptive threshold derived from the scaling factors of the batch-normalized weights of the sparse model and from the channel ratio of each layer, and trims the model over multiple iterations. The threshold equation includes a momentum term and a channel-portion term, which come from the standard deviations of the scaling factors of the batch-normalization layers and from the ratio of each layer's channel count to the total channel count of the model, respectively, accompanied by a scaling factor. Simulation results for VGG models on CIFAR-10 and CIFAR-100 reveal that the proposed IMP scheme is superior to conventional pruning schemes at low compression ratios in terms of accuracy and parameter count. At small pruning ratios, the momentum term in our IMP scheme plays the major role in truncating the model. This phenomenon illustrates that channel trimming based on the momenta of the scaling factors of batch-normalization layers is an effective approach. In particular, the proposed IMP scheme can work with a neural architecture search approach for channel search and reduction to achieve even better performance and lower complexity, as shown by the simulation results of YOLOv4 on BDD100K.
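The abstract does not state the exact form of the threshold equation, so the following is only a minimal sketch of the idea: each layer's pruning threshold combines the standard deviation of that layer's batch-normalization scaling factors (the momentum term) with the layer's share of the model's total channels (the channel-portion term), weighted by a global scaling factor `alpha`. The function names, the additive combination, and `alpha` are all assumptions for illustration, not the authors' published equation.

```python
import numpy as np

def adaptive_thresholds(bn_gammas, alpha=0.5):
    """Per-layer adaptive pruning thresholds (illustrative sketch only).

    bn_gammas : list of 1-D arrays, the scaling factors (gamma) of each
    batch-normalization layer. We assume, hypothetically, the form

        t_l = alpha * (sigma_l + n_l / N)

    where sigma_l is the std of layer l's |gamma| values (momentum term),
    n_l / N is layer l's channel count over the model's total channel
    count (channel-portion term), and alpha is a global scaling factor.
    """
    total_channels = sum(len(g) for g in bn_gammas)
    thresholds = []
    for g in bn_gammas:
        momentum_term = np.std(np.abs(g))          # spread of BN scaling factors
        channel_term = len(g) / total_channels     # layer's channel proportion
        thresholds.append(alpha * (momentum_term + channel_term))
    return thresholds

def prune_masks(bn_gammas, thresholds):
    """Boolean keep-masks: retain channels whose |gamma| exceeds the threshold."""
    return [np.abs(g) > t for g, t in zip(bn_gammas, thresholds)]
```

In an iterative scheme, these thresholds would be recomputed from the surviving channels after each pruning-and-fine-tuning round, so the cut adapts as the model becomes sparser.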