The gasoline inline blending process has widely adopted real-time optimization techniques to achieve objectives such as minimizing production cost. However, the effectiveness of real-time optimization in gasoline blending relies on accurate blending models and is challenged by stochastic disturbances. We therefore propose a real-time optimization algorithm based on the soft actor-critic (SAC) deep reinforcement learning strategy that optimizes gasoline blending without relying on a single blending model and is robust against disturbances. Our approach constructs the environment from nonlinear blending models and feedstocks subject to disturbances. The algorithm incorporates the Lagrange multiplier and path constraints into the reward design to manage sparse product constraints. Carefully abstracted states facilitate convergence, and normalizing the action vector in each optimization period allows the agent to generalize, to some extent, across different target-production scenarios. With these well-designed components, the SAC-based algorithm outperforms real-time optimization methods based on either nonlinear or linear programming. It even matches the performance of the time-horizon-based real-time optimization method, which requires knowledge of the uncertainty models, confirming its ability to handle uncertainty without accurate models. Our simulations illustrate a promising way to free real-time optimization of the gasoline blending process from uncertainty models that are difficult to acquire in practice.
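As a hedged illustration of the reward design summarized above, and not the authors' implementation, the following Python sketch shows how a blending profit signal could be penalized by Lagrange-multiplier-weighted path-constraint violations, with the multipliers updated by dual ascent and the agent's action vector normalized once per optimization period. All function and variable names (normalized_action, shaped_reward, update_multipliers) are hypothetical.

```python
# Minimal sketch of Lagrangian reward shaping for constrained blending RL.
# This is an assumed illustration of the idea in the abstract, not the
# paper's actual code.
import numpy as np

def normalized_action(raw_action: np.ndarray) -> np.ndarray:
    """Softmax-project the agent's raw output so feedstock fractions
    are non-negative and sum to one within each blending period."""
    exp = np.exp(raw_action - raw_action.max())
    return exp / exp.sum()

def shaped_reward(profit: float,
                  violations: np.ndarray,
                  multipliers: np.ndarray) -> float:
    """Profit minus multiplier-weighted constraint violations, where
    violations[i] = max(0, g_i(x)) for each quality constraint g_i <= 0."""
    return profit - float(multipliers @ violations)

def update_multipliers(multipliers: np.ndarray,
                       violations: np.ndarray,
                       step_size: float = 1e-2) -> np.ndarray:
    """Dual ascent: multipliers grow while constraints are violated and
    stay non-negative once they are satisfied."""
    return np.maximum(0.0, multipliers + step_size * violations)

# Example: two feedstock fractions, one quality constraint slightly violated.
mults = np.array([1.0])
action = normalized_action(np.array([0.2, -0.5]))
reward = shaped_reward(profit=10.0, violations=np.array([0.3]),
                       multipliers=mults)
mults = update_multipliers(mults, np.array([0.3]))
```

In this shaping scheme, a persistent violation steadily raises its multiplier, so the SAC agent is driven toward feasible blends even when the product-quality constraints yield only sparse feedback.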