With recent advances in artificial intelligence, especially after the success of AlphaGo, there has been growing interest in applying reinforcement learning (RL) to energy management strategy (EMS) problems for hybrid electric vehicles. However, the shortcomings of current RL algorithms, including deployment inefficiency, safety constraints, and the simulation-to-real gap, make them inapplicable to many industrial EMS tasks. With these limitations in mind, and noting that many existing suboptimal EMS controllers can generate abundant interaction data containing informative behaviors, an offline RL training framework is proposed that extracts policies with the maximum possible utility from the available offline data. Furthermore, since connected-vehicle technology is now standard in many new cars, a scheduled training framework is put forward in which logged driving data are periodically uploaded to the cloud for training rather than being stored and analyzed on board. This cloud-based approach not only alleviates the computational burden on edge devices but, more importantly, provides a deployment-efficient solution for EMS tasks that must adapt to changing driving cycles. To evaluate the effectiveness of the proposed algorithm on real controllers, a hardware-in-the-loop (HIL) test is performed, and the superiority of the proposed algorithm over dynamic programming, behavior cloning, rule-based, and vanilla off-policy RL baselines is demonstrated.
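As a minimal illustration of the offline RL component, and only a sketch (the abstract does not specify the exact algorithm), the following Python snippet performs one behavior-regularized actor-critic update on a batch of logged EMS transitions, in the spirit of TD3+BC. All state/action dimensions, network sizes, and hyperparameters here are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

STATE_DIM, ACTION_DIM = 6, 1   # assumed EMS features (SoC, speed, demand, ...) and power-split action
GAMMA, ALPHA = 0.99, 2.5       # discount factor; BC regularization weight (TD3+BC-style, assumed)

actor = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                      nn.Linear(64, ACTION_DIM), nn.Tanh())
critic = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
                       nn.Linear(64, 1))
actor_opt = torch.optim.Adam(actor.parameters(), lr=3e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=3e-4)

def q(s, a):
    # Q(s, a) estimated by the critic on concatenated state-action input.
    return critic(torch.cat([s, a], dim=-1))

def offline_update(s, a, r, s_next):
    """One gradient step on logged transitions (s, a, r, s') collected by
    existing suboptimal EMS controllers -- no environment interaction."""
    # Critic: one-step TD target using the current policy's next action.
    with torch.no_grad():
        target = r + GAMMA * q(s_next, actor(s_next))
    critic_loss = ((q(s, a) - target) ** 2).mean()
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Actor: maximize Q while staying close to the logged (behavior) actions,
    # which guards against value overestimation on out-of-distribution actions.
    pi = actor(s)
    lam = ALPHA / q(s, pi).abs().mean().detach()
    actor_loss = -lam * q(s, pi).mean() + ((pi - a) ** 2).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
    return critic_loss.item(), actor_loss.item()

# Synthetic stand-in for a minibatch sampled from the offline dataset.
s, a = torch.randn(256, STATE_DIM), torch.rand(256, ACTION_DIM) * 2 - 1
r, s_next = torch.randn(256, 1), torch.randn(256, STATE_DIM)
print(offline_update(s, a, r, s_next))
```

In a scheduled cloud-training setting, a loop over such updates would run on uploaded logs, and only the resulting policy weights would be pushed back to the vehicle's edge controller.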