In supply chain management, efficient vehicle scheduling is essential for reducing operational costs. As the market environment changes dynamically and customer needs diversify, traditional static vehicle scheduling methods struggle to adapt to complex real-world scenarios. To address these challenges, this paper proposes an optimal vehicle scheduling model based on the Markov Decision Process (MDP). The study first analyzes vehicle scheduling in the supply chain in depth and clarifies the core elements and constraints of the problem. Based on MDP theory, it then constructs the stochastic state transition probabilities for vehicle scheduling, taking into account influencing factors such as the number of vehicles, transportation time, transportation cost, and customer demand. A corresponding reward function is defined to reflect the influence of these factors on scheduling performance. The MDP model is solved with a dynamic programming algorithm to obtain the optimal vehicle scheduling policy. An iterative update method is adopted, in which the policy is continually adjusted according to the current state and environmental information to achieve the long-term optimization goal. Experimental results show that the MDP-based optimal vehicle scheduling model can significantly reduce transportation cost, shorten transportation time, and improve customer satisfaction. Compared with traditional static scheduling methods, the model is more flexible and adaptable, and it copes better with changes in the market environment and uncertainty in customer demand.
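
To make the solution approach concrete, the following minimal sketch shows how a small MDP could be solved by value iteration, one standard dynamic programming method. It is an illustration only, not the model used in this paper: the two backlog states, the two dispatch actions, the transition probabilities, the rewards, the discount factor, and the function name `value_iteration` are all hypothetical choices made for the example.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Solve a finite MDP by value iteration.

    P: array of shape (A, S, S), P[a, s, t] = Pr(next state t | state s, action a)
    R: array of shape (A, S), expected immediate reward for action a in state s
    Returns the value function V (shape S) and a greedy policy (shape S).
    """
    n_states = P.shape[1]
    V = np.zeros(n_states)
    for _ in range(max_iter):
        Q = R + gamma * (P @ V)        # action values, shape (A, S)
        V_new = Q.max(axis=0)          # Bellman optimality backup
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    policy = (R + gamma * (P @ V)).argmax(axis=0)
    return V, policy

# Hypothetical toy problem (all numbers are made up for illustration).
# States:  0 = low order backlog, 1 = high order backlog
# Actions: 0 = dispatch one vehicle, 1 = dispatch two vehicles
P = np.array([
    [[0.80, 0.20],    # one vehicle, from low backlog
     [0.30, 0.70]],   # one vehicle, from high backlog
    [[0.95, 0.05],    # two vehicles, from low backlog
     [0.90, 0.10]],   # two vehicles, from high backlog
])
# Reward = -(dispatch cost + backlog penalty): one vehicle costs 1,
# two vehicles cost 2, and a high backlog adds a penalty of 5.
R = np.array([
    [-1.0, -6.0],     # one vehicle
    [-2.0, -7.0],     # two vehicles
])

V, policy = value_iteration(P, R)
print("state values:", V)
print("policy (action per state):", policy)
```

Under these assumed numbers, the greedy policy dispatches one vehicle when the backlog is low and two when it is high. A realistic scheduling model would have a much larger state space, covering vehicle availability, travel times, and demand, but the backup step has the same form.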