When a fault occurs, guided by the reconfiguration strategy, the topology of the ship power system (SPS) is changed to isolate the faulted area and restore the lost load. However, traditional optimization methods have limitations, such as getting stuck in suboptimal solutions or failing to deliver solutions in real time. This article proposes a novel deep reinforcement learning (DRL) method to solve the dynamic fault reconfiguration problem of the SPS in real time. Considering load priorities, the fault reconfiguration model is formulated to maximize the weighted restored load power while minimizing the number of switching actions. Then, a deep Q-network (DQN) combined with an action-mask mechanism, referred to as the DQN-mask algorithm, is applied to optimize the switching actions. The proposed method enables end-to-end control from fault node data to switch sequences and transfers across different scenarios. Two case studies are analyzed on historical fault datasets of a medium-voltage dc (MVDC) SPS. The numerical results verify the effectiveness, real-time performance, transferability, and scalability of the proposed DQN-mask algorithm.
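The abstract does not detail how the action mask interacts with the DQN, so the following is only a minimal illustrative sketch of the general action-masking idea: infeasible switch actions (e.g. those violating network constraints) are assigned a Q-value of negative infinity before the greedy selection, so the agent can never pick them. All names and the toy values are assumptions, not the authors' implementation.

```python
import numpy as np

def masked_greedy_action(q_values, action_mask):
    """Pick the greedy action among feasible switch actions only.

    q_values:    array of Q-value estimates, one per candidate switch action
                 (hypothetical output of the DQN for the current fault state).
    action_mask: boolean array, True where the action is feasible.
    """
    # Replace Q-values of infeasible actions with -inf so argmax ignores them.
    masked_q = np.where(action_mask, q_values, -np.inf)
    return int(np.argmax(masked_q))

# Toy example: action 2 has the highest Q-value but is masked out,
# so the agent falls back to the best feasible action (action 1).
q = np.array([0.1, 0.5, 0.9, 0.3])
mask = np.array([True, True, False, True])
best = masked_greedy_action(q, mask)
```

In this sketch `best` evaluates to 1, illustrating how the mask steers the policy toward feasible switching sequences without changing the network's learned Q-values.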