This paper describes the parallel simulation of the memory- and computation-intensive acoustic wave equation with CPU template buffer optimization. Taking an 8-core shared-memory CPU platform as an example, we first obtain a primary speed-up ratio of 6.7x over the serial program by using a coarse-grained OpenMP parallel scheme. The data on the template buffer are then vectorized using the single instruction, multiple data (SIMD) technique to further exploit the computing potential of the CPUs. We simulate seismic wavefields with 8-channel parallel vectors using the 256-bit Advanced Vector Extensions (AVX) instruction set. This increases the computing bandwidth and substantially reduces the number of computing instructions, yielding a secondary speed-up ratio of 3-7x. In addition, we use 32-byte data alignment, vectorization along the shortest data direction, and loop tiling to achieve faster program execution. Finally, we analyze the factors affecting the secondary speed-up of AVX through three-dimensional modeling experiments with the salt model. The results indicate that optimizing the AVX algorithm allows the memory, cache, and registers to cooperate more effectively and increases the speed-up.
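The two levels of parallelism described above can be illustrated with a minimal C sketch. It is not the implementation evaluated in this paper: it assumes a two-dimensional, second-order stencil instead of the three-dimensional scheme, and it uses the `#pragma omp simd` directive to ask the compiler for vector code rather than explicit AVX intrinsics. The names `alloc_grid`, `wave_step`, `LANES`, and `ALIGN` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define LANES 8   /* assumption: 8 single-precision values per 256-bit vector */
#define ALIGN 32  /* 32-byte alignment, matching the 256-bit vector width */

/* Allocate a zeroed nz-by-nx grid on a 32-byte boundary. Requiring nx to be
   a multiple of LANES keeps every row start 32-byte aligned as well. */
float *alloc_grid(size_t nz, size_t nx)
{
    assert(nx % LANES == 0);
    size_t bytes = nz * nx * sizeof(float);  /* a multiple of ALIGN */
    float *g = aligned_alloc(ALIGN, bytes);
    if (g != NULL)
        memset(g, 0, bytes);
    return g;
}

/* One explicit time step of the 2-D acoustic wave equation:
       next = 2*cur - prev + c2 * laplacian(cur)
   where c2 holds (v*dt/h)^2 at every grid point (illustrative layout).
   Boundary points of next are left untouched. */
void wave_step(float *restrict next, const float *restrict cur,
               const float *restrict prev, const float *restrict c2,
               size_t nz, size_t nx)
{
    assert(nz >= 3 && nx >= 3);

    /* Coarse-grained level: rows are distributed over the OpenMP threads. */
    #pragma omp parallel for schedule(static)
    for (size_t iz = 1; iz < nz - 1; ++iz) {
        const float *c  = cur + iz * nx;   /* current row */
        const float *up = c - nx;          /* row above   */
        const float *dn = c + nx;          /* row below   */
        const float *p  = prev + iz * nx;
        const float *k  = c2 + iz * nx;
        float *n = next + iz * nx;

        /* Fine-grained level: the contiguous x index is offered to the
           compiler for SIMD execution. */
        #pragma omp simd
        for (size_t ix = 1; ix < nx - 1; ++ix) {
            float lap = c[ix - 1] + c[ix + 1] + up[ix] + dn[ix] - 4.0f * c[ix];
            n[ix] = 2.0f * c[ix] - p[ix] + k[ix] * lap;
        }
    }
}
```

Built without OpenMP support, the pragmas are ignored and the code runs serially with the same result. With a compiler option such as `-fopenmp` and an AVX-capable target, the outer loop is threaded and the inner loop is a candidate for 8-wide vector instructions, although whether the compiler actually emits them depends on the compiler and its flags.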