In smart manufacturing systems (SMSs), flexible job-shop scheduling with transportation constraints (FJSPT) is essential for maximizing productivity while accounting for the production flexibility provided by automated guided vehicles (AGVs). Recent deep reinforcement learning (DRL)-based methods for FJSPT, however, struggle to generalize across problem scales. We propose the Heterogeneous Graph Scheduler (HGS), a novel DRL-based method that provides near-optimal solutions regardless of the number of operations, machines, and vehicles. HGS modifies the disjunctive graph to model FJSPT as a heterogeneous graph whose nodes are operations, machines, and vehicles, dynamically representing both processing and transportation. A structure-aware heterogeneous graph encoder enhances scale generalization by using multi-head attention to aggregate messages locally and integrate them globally. A three-stage decoder then makes end-to-end decisions, constructing the schedule by selecting the nodes with the highest likelihood of minimizing makespan. Evaluation on benchmark datasets shows that HGS outperforms traditional dispatching rules, metaheuristics, and existing DRL-based methods in both makespan and scale generalization. Moreover, as the problem scale increases, HGS achieves the best solutions across all instances.
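To make the heterogeneous-graph view concrete, the following is a minimal sketch of how an FJSPT instance could be stored as a graph with three node types (operations, machines, vehicles) and typed edges. It is an illustration under stated assumptions, not the HGS implementation: the names `HeteroGraph` and `build_fjspt_graph`, the relation labels, and the choice to connect every operation to every vehicle are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class HeteroGraph:
    """Toy heterogeneous graph: node counts per type plus typed edge lists."""
    num_ops: int
    num_machines: int
    num_vehicles: int
    # (src_type, relation, dst_type) -> list of (src_index, dst_index)
    edges: dict = field(default_factory=dict)
    # (operation, machine) -> processing time, kept as an edge feature
    proc_time: dict = field(default_factory=dict)

    def add_edge(self, src_type, relation, dst_type, src, dst):
        key = (src_type, relation, dst_type)
        self.edges.setdefault(key, []).append((src, dst))


def build_fjspt_graph(jobs, num_machines, num_vehicles):
    """Build the graph from a list of jobs.

    Each job is a list of operations; each operation is a dict mapping an
    eligible machine index to its processing time (hypothetical input format).
    """
    num_ops = sum(len(job) for job in jobs)
    g = HeteroGraph(num_ops, num_machines, num_vehicles)
    op_id = 0
    for job in jobs:
        prev = None
        for op in job:
            # Precedence between consecutive operations of the same job,
            # as in the conjunctive arcs of a disjunctive graph.
            if prev is not None:
                g.add_edge("op", "precedes", "op", prev, op_id)
            # Flexibility: one edge per machine that can process the operation.
            for machine, time in op.items():
                g.add_edge("op", "can_run_on", "machine", op_id, machine)
                g.proc_time[(op_id, machine)] = time
            # Assumption: any vehicle may transport the job for any operation.
            for vehicle in range(num_vehicles):
                g.add_edge("op", "can_be_moved_by", "vehicle", op_id, vehicle)
            prev = op_id
            op_id += 1
    return g


# Two jobs, two machines, one vehicle.
example_jobs = [[{0: 3, 1: 5}, {1: 2}], [{0: 4}]]
example_graph = build_fjspt_graph(example_jobs, num_machines=2, num_vehicles=1)
```

Because the number of nodes and edges simply grows with the instance, the same typed structure describes small and large problems alike, which is the property a scale-generalizing encoder would rely on.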