Unmanned surface vehicle (USV) systems are strongly coupled and nonlinear, and they are further subject to environmental disturbances from wind and currents, which makes it challenging to achieve accurate trajectory tracking by directly controlling the underlying actuator parameters, such as rudder angle and rotational speed. Therefore, this paper proposes a proximal policy optimisation (PPO) control scheme based on an intrinsic curiosity module (ICM). First, according to the training characteristics of deep reinforcement learning (DRL) algorithms, an improved guidance law is proposed, which effectively solves the problem of the desired speed exceeding the maximum allowable speed when tracking errors are large during the random exploration of the USV in the early stage of training. Unlike traditional DRL methods, the proposed method incorporates intrinsic rewards, generated by the intrinsic curiosity module, alongside the extrinsic rewards from the training environment; these intrinsic rewards incentivise the agent to actively explore unknown states and acquire new knowledge, which improves training outcomes and prevents premature convergence of the model. Finally, multiple tracking scenarios containing both simple and complex trajectories are designed and constructed, and the simulation results show that the ICM-PPO method performs well on the accurate trajectory-tracking problem.
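The combination of extrinsic and curiosity-driven intrinsic rewards described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the fixed linear forward model, the feature and action dimensions, and the scaling factor `eta` are illustrative stand-ins (in an actual ICM the forward model is a learned network over encoded state features).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 4-D state features, 2-D action (rudder, speed).
FEAT_DIM, ACT_DIM = 4, 2

# Stand-in forward model: a fixed random linear map predicting the next
# state features from the current features and the action taken.
W = rng.normal(size=(FEAT_DIM + ACT_DIM, FEAT_DIM))

def forward_model(phi_s, a):
    """Predict phi(s') from phi(s) and action a."""
    return np.concatenate([phi_s, a]) @ W

def intrinsic_reward(phi_s, a, phi_s_next, eta=0.1):
    """ICM-style curiosity bonus: scaled forward-model prediction error.

    The bonus is large in states the model predicts poorly (novel states)
    and shrinks as those states become familiar.
    """
    err = forward_model(phi_s, a) - phi_s_next
    return 0.5 * eta * float(err @ err)

def total_reward(r_ext, phi_s, a, phi_s_next, eta=0.1):
    """Reward seen by the PPO agent: extrinsic plus intrinsic."""
    return r_ext + intrinsic_reward(phi_s, a, phi_s_next, eta)

# One illustrative transition with a negative extrinsic tracking reward.
phi_s = rng.normal(size=FEAT_DIM)
a = rng.normal(size=ACT_DIM)
phi_s_next = rng.normal(size=FEAT_DIM)
r = total_reward(-0.3, phi_s, a, phi_s_next)
```

Because the prediction-error bonus is non-negative, the combined reward never falls below the extrinsic reward, and the bonus vanishes once the forward model predicts a transition exactly.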