As real-world graphs continue to grow at an exponential rate, performing graph convolutional network (GCN) inference efficiently is becoming increasingly challenging. Prior works that employ a unified computing engine with a predefined computation order lack the flexibility and scalability needed to handle diverse input graph datasets. In this article, we introduce OPT-GCN, a chiplet-based accelerator that performs GCN inference efficiently while providing flexibility and scalability through architecture-algorithm co-design. On the architecture side, the proposed design integrates a unified computing engine in each chiplet and an active interposer, both of which adapt to perform GCN inference efficiently and to facilitate data communication. On the algorithm side, we propose dynamic scheduling and mapping algorithms that optimize memory access and on-chip computation for diverse GCN applications. Experimental results show that the proposed design reduces memory accesses by 11.3x, 3.4x, and 1.4x and saves energy by 15.2x, 3.7x, and 1.6x on average compared with HyGCN, AWB-GCN, and GCNAX, respectively.
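For context, a minimal sketch of the workload follows. It is not the OPT-GCN design itself; it assumes only the standard GCN propagation rule H' = sigma(A_hat H W), where A_hat is the normalized sparse adjacency matrix, and illustrates why a predefined computation order can be suboptimal: executing the combination (H W) before the sparse aggregation (A_hat ·) changes the operation count and memory traffic even though the result is identical. The function name `gcn_layer` and the toy dimensions are illustrative choices, not part of the paper.

```python
# Minimal sketch (not the OPT-GCN implementation) of one GCN layer,
# showing how the execution order affects work and memory traffic.
import numpy as np
import scipy.sparse as sp

def gcn_layer(a_hat: sp.csr_matrix, h: np.ndarray, w: np.ndarray,
              combine_first: bool = True) -> np.ndarray:
    """One GCN layer; `combine_first` selects the computation order."""
    if combine_first:
        # A_hat @ (H @ W): shrink the feature dimension first, so the
        # sparse aggregation touches fewer columns -- cheaper when W
        # reduces dimensionality (the common case in GCN layers).
        out = a_hat @ (h @ w)
    else:
        # (A_hat @ H) @ W: aggregate full-width features first -- more
        # SpMM work and DRAM traffic for the same mathematical result.
        out = (a_hat @ h) @ w
    return np.maximum(out, 0.0)  # ReLU activation

# Tiny example: 4 nodes, 8 input features, 2 output features.
rng = np.random.default_rng(0)
a_hat = sp.random(4, 4, density=0.5, format="csr", random_state=0)
h = rng.standard_normal((4, 8))
w = rng.standard_normal((8, 2))

# Both orders compute the same layer output (matrix product associativity).
assert np.allclose(gcn_layer(a_hat, h, w, combine_first=True),
                   gcn_layer(a_hat, h, w, combine_first=False))
```

Accelerators that hard-wire one of these orders must pay its cost on every input graph; the scheduling flexibility the abstract claims amounts to choosing the cheaper order (and the corresponding tiling) per dataset.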