Power flow control is critical for preventing overloads in electrical networks, which can lead to severe consequences such as disconnections, cascading outages, and system blackouts. The congestion management problem is well studied, and several techniques have been proposed to solve it. However, the dynamic and evolving nature of modern power systems, marked by increased renewable energy integration, grid interconnection, and evolving market structures, necessitates innovative solutions that can adapt to changing conditions in real time. This research addresses the challenges of power flow control and congestion management using a cost-effective technical measure: network reconfiguration via busbar splitting. Busbar splitting presents a complex optimization problem due to the vast number of possible splitting configurations. To address this challenge, we turn to Reinforcement Learning (RL), a dynamic and adaptive approach known for its real-time decision-making capabilities, its ability to learn complex patterns, and its flexibility in handling uncertainties. Several studies have investigated RL-based power flow control via network reconfiguration and proposed solutions utilizing various techniques. The reward signal, however, despite being a central component of RL methods, has not evolved at the same pace. In this paper, we propose a novel Multi-Objective Reward (MOR) design focused on reliable power system control and the prevention and reduction of overloads. Our results indicate that the proposed approach outperforms competing approaches from the literature in both the reliability and the optimality of power flow control. Training involved eight different reward strategies: our MOR approach and seven rewards from the existing literature. The best designs from the literature achieved a mean control duration of 1250 time-steps, while our approach extended this to nearly 5000 time-steps. In tests across 100 unseen scenarios, the MOR reward enabled the agent to maintain reliable network control for 21 days, outperforming the best agent from the literature by 10 days. Additionally, the agent trained with MOR significantly reduced both the frequency and severity of overloads.
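To illustrate the general idea of a multi-objective reward for grid control, the sketch below combines a line-loading margin term, an overload penalty, and a survival bonus into a single weighted signal. All function names, terms, and weights here are hypothetical illustrations of the concept; they are not the paper's actual MOR formulation.

```python
import numpy as np

def multi_objective_reward(line_loadings, survived, weights=(0.5, 0.3, 0.2)):
    """Hypothetical multi-objective reward sketch for power grid control.

    line_loadings: per-line loading as a fraction of thermal limit (1.0 = at limit).
    survived: whether the network survived the current time-step (no blackout).
    weights: assumed relative importance of the margin, overload, and survival terms.
    """
    w_margin, w_overload, w_survival = weights

    # Term 1: reward headroom on the most loaded line.
    margin = 1.0 - np.max(line_loadings)

    # Term 2: penalize the number and severity of overloaded lines.
    overloads = np.clip(line_loadings - 1.0, 0.0, None)
    overload_penalty = -np.sum(overloads)

    # Term 3: a bonus for keeping the network alive at all.
    survival_bonus = 1.0 if survived else -1.0

    return (w_margin * margin
            + w_overload * overload_penalty
            + w_survival * survival_bonus)

# Example: one line overloaded at 110% of its thermal limit.
r = multi_objective_reward(np.array([0.4, 0.95, 1.1]), survived=True)
```

Weighting separate objectives in this fashion lets the designer trade off overload prevention against overall network survival; how the paper's MOR actually balances its objectives is detailed in the main text.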