Class-incremental learning (CIL) is challenging due to catastrophic forgetting (CF), which is exacerbated in exemplar-free scenarios. To mitigate CF, knowledge distillation (KD), which leverages the old model as a teacher model, has been widely employed in CIL. However, a case study in our investigation reveals that the teacher model exhibits over-confidence on unseen new samples. In this article, we conduct empirical experiments and provide theoretical analysis to investigate this over-confidence phenomenon and the impact of KD in exemplar-free CIL, where access to old samples is unavailable. Building on our analysis, we propose a novel approach, Learning with Humbler Teacher (LwHT), which systematically selects an appropriate checkpoint model as a humbler teacher to mitigate CF. Furthermore, we explore using the nuclear norm to obtain an appropriate temporal ensemble that enhances model stability. Notably, LwHT outperforms the state-of-the-art approach by significant margins of 10.41%, 6.56%, and 4.31% across various settings while demonstrating superior model plasticity.
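To make the summarized ideas concrete, the following is a minimal illustrative sketch, not the paper's exact procedure: it assumes a PyTorch setting, uses a standard temperature-scaled KD loss, a hypothetical confidence-on-new-data heuristic for selecting a "humbler" checkpoint teacher, and a hypothetical nuclear-norm weighting of temporal-ensemble members. All function names and the selection/weighting heuristics are assumptions for illustration only.

```python
# Illustrative sketch only: distill from an earlier checkpoint ("humbler teacher")
# instead of the most recent model, and weight checkpoints by the nuclear norm
# of their feature matrices. Heuristics and names are hypothetical.
import torch
import torch.nn.functional as F


def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Standard KD loss: KL divergence between temperature-softened outputs."""
    log_p_s = F.log_softmax(student_logits / temperature, dim=1)
    p_t = F.softmax(teacher_logits / temperature, dim=1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * temperature ** 2


@torch.no_grad()
def mean_confidence(model, loader, device="cpu"):
    """Average max-softmax confidence on (unseen) new-task data; a simple
    proxy for detecting over-confident teachers (assumed heuristic)."""
    total, n = 0.0, 0
    for x, _ in loader:
        probs = F.softmax(model(x.to(device)), dim=1)
        total += probs.max(dim=1).values.sum().item()
        n += x.size(0)
    return total / max(n, 1)


def pick_humbler_teacher(checkpoints, loader, device="cpu"):
    """Select the checkpoint with the lowest confidence on new samples,
    one plausible reading of choosing a 'humbler' teacher."""
    return min(checkpoints, key=lambda m: mean_confidence(m, loader, device))


@torch.no_grad()
def nuclear_norm_weights(feature_mats):
    """Weight temporal-ensemble members by the nuclear norm of their feature
    matrices (hypothetical weighting scheme), normalized to sum to one."""
    norms = torch.stack(
        [torch.linalg.matrix_norm(f, ord="nuc") for f in feature_mats]
    )
    return norms / norms.sum()
```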