A method is presented for maximizing the total coverage of multiple unmanned aerial vehicles (UAVs) that monitor a bounded space. Each UAV aims to maximize its individual coverage while minimizing overlap with the coverage of the other UAVs. This goal is achieved using a multi-agent reinforcement learning (MARL) method that incorporates a coordination strategy, allowing the UAVs to negotiate their actions and avoid overlapping coverage. Simulation results illustrate the performance of the developed MARL scheme.
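As a rough illustration of the stated objective, the sketch below scores a UAV team by the number of grid cells it covers minus a penalty for cells covered more than once. The abstract does not specify the footprint model or reward, so everything here is an assumption made for illustration: the discretized grid, the circular footprints, and the names `coverage_and_overlap`, `team_reward`, and `overlap_penalty` are all hypothetical.

```python
from itertools import product


def coverage_and_overlap(centers, radius, width, height):
    """Count grid cells seen by at least one UAV and by two or more UAVs.

    Assumes each UAV covers a disk of the given radius around its
    (x, y) center on a width-by-height integer grid (hypothetical model).
    """
    covered = 0
    overlapped = 0
    for x, y in product(range(width), range(height)):
        # Number of UAV footprints that contain this cell.
        seen_by = sum(
            (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
            for cx, cy in centers
        )
        if seen_by >= 1:
            covered += 1
        if seen_by >= 2:
            overlapped += 1
    return covered, overlapped


def team_reward(centers, radius, width, height, overlap_penalty=1.0):
    """Illustrative reward: total covered cells minus penalized overlap."""
    covered, overlapped = coverage_and_overlap(centers, radius, width, height)
    return covered - overlap_penalty * overlapped


# Two UAVs far apart versus two UAVs with intersecting footprints.
separated = team_reward([(2, 2), (7, 7)], radius=1, width=10, height=10)
adjacent = team_reward([(2, 2), (3, 2)], radius=1, width=10, height=10)
print(separated, adjacent)
```

Under this assumed reward, a configuration with disjoint footprints scores higher than one with intersecting footprints, which is the behavior a coordination strategy would be expected to encourage.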