The prevalence of control flow, recursive data structures, and general pointer accesses in ordinary programs makes traditional automatic parallelization techniques unsuitable. OpenMP Decoupled Software Pipelining (DSWP) has been proposed to exploit the pipeline parallelism latent in such programs, which traditional techniques cannot handle. A cost model is important for evaluating compiler transformations, guiding the compiler's optimization process, and achieving load balance, but existing cost models are too simple to estimate the profit of OpenMP parallelization, especially for DSWPed loops. In this paper, we propose a compile-time cost model for estimating the profit of automatic parallelization by extending the existing cost model in the loop nest optimizer (LNO) phase of Open64. Moreover, we improve the OpenMP DSWP algorithm based on our cost model, which increases the execution efficiency of automatically parallelized code. We evaluate our cost model on loops with complex memory access patterns and control flow structures, which traditional techniques cannot handle, and on the NAS Parallel Benchmarks (NPB) 3.3.1. The results show a clear performance improvement for the generated DSWPed loops and programs when our model is used.