Load balancing is essential for the efficient delivery of cloud computing services, ensuring stable operation and robust performance under high load. However, existing load-balancing task scheduling algorithms struggle to adapt to fluctuations in load performance in real time, which leads to inaccurate estimates of task execution efficiency and, in turn, degrades the quality of service in actual cloud task scheduling. To address this issue, we propose RTPA-SAC, a real-time performance-aware task scheduling method based on the Soft Actor-Critic algorithm. The method detects changes in server load performance in real time, which improves environmental consistency and adaptability in stochastic, dynamic task scheduling and thereby improves load balancing. First, we construct a bounded load performance loss function that evaluates task execution efficiency while accounting for interference among parallel tasks. Next, we introduce a reward mechanism that accounts for both load fluctuations and response times, optimizing task load variance within quality-of-service constraints to minimize response time. Finally, by leveraging the Soft Actor-Critic algorithm, the proposed scheduling strategy makes task scheduling decisions that are both exploratory and stable. Experimental results show that RTPA-SAC outperforms baseline methods in load balancing, as evidenced by improvements in task response time, average task load variance, and task success rate.
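
To make the reward idea concrete, the following is a minimal illustrative sketch, not the formulation used by RTPA-SAC. It assumes per-server loads normalized to [0, 1], a single response-time deadline standing in for the quality-of-service constraint, and a weighted sum of load variance and normalized response time. The function names, the weights `alpha` and `beta`, and the fixed `penalty` are all hypothetical.

```python
from statistics import pvariance


def load_variance(loads):
    """Population variance of per-server loads (assumed normalized to [0, 1])."""
    return pvariance(loads)


def reward(loads, response_time, deadline, alpha=1.0, beta=1.0, penalty=10.0):
    """Hypothetical reward: higher is better.

    Penalizes uneven load (variance) and slow responses. A task that misses
    the deadline (the assumed QoS constraint) receives a fixed penalty.
    """
    if response_time > deadline:
        return -penalty
    # Normalize response time by the deadline so both terms are dimensionless.
    return -(alpha * load_variance(loads) + beta * response_time / deadline)


if __name__ == "__main__":
    balanced = reward([0.5, 0.5, 0.5], response_time=1.0, deadline=2.0)
    skewed = reward([0.1, 0.5, 0.9], response_time=1.0, deadline=2.0)
    print(balanced > skewed)  # a balanced placement scores higher
```

Under these assumptions, a placement that keeps server loads even earns a higher reward than a skewed one at the same response time, and any deadline violation dominates both terms.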