For wireless federated learning (FL), this work proposes an adaptive model pruning-based FL (AMP-FL) framework in which the edge server dynamically generates sub-models by pruning the global model to adapt to devices' heterogeneous computation capabilities and time-varying wireless channel conditions. To mitigate the negative effect of the sub-models' differing structures on learning convergence, this work designs a new compensation strategy that fills the pruned regions of sub-models with historical gradients. Since the freshness of gradients dominates the convergence speed, this work also defines an age of information (AoI) metric to characterize the staleness of the regions of the local gradients. Building on the compensation strategy, we formulate a joint device scheduling, model pruning, and resource block allocation optimization problem to minimize the average AoI of the local gradients. To solve this problem, we theoretically derive an optimal model pruning scheme and then transform the original problem into an equivalent linear program that can be solved in polynomial time. Simulation results on the CIFAR-10 dataset show that the proposed AMP-FL outperforms benchmark schemes, achieving faster convergence and over 7% higher learning accuracy.
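The core mechanism described above, pruning the global model into a sub-model, compensating the pruned regions with historical gradients, and tracking a per-region AoI, can be illustrated with a minimal toy sketch. All names, sizes, and the quadratic toy loss below are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

DIM = 8      # toy number of model parameters
ROUNDS = 3   # toy number of FL rounds
LR = 0.1     # learning rate

global_model = np.zeros(DIM)
hist_grad = np.zeros(DIM)           # last received gradient per region
aoi = np.zeros(DIM, dtype=int)      # rounds since each region was refreshed

def local_gradient(model, mask):
    """Toy local gradient, computed only on the unpruned region
    (quadratic loss driving kept parameters toward 1)."""
    return (model - 1.0) * mask

for r in range(ROUNDS):
    # Server prunes the global model for a device: keep a random half
    # (standing in for the channel/compute-adaptive pruning scheme).
    keep = rng.permutation(DIM) < DIM // 2
    mask = keep.astype(float)

    g = local_gradient(global_model, mask)

    # Compensation strategy: pruned regions reuse historical gradients.
    full_grad = np.where(keep, g, hist_grad)

    # Refreshed regions reset their AoI to 0; the rest age by one round.
    hist_grad = np.where(keep, g, hist_grad)
    aoi = np.where(keep, 0, aoi + 1)

    global_model -= LR * full_grad

print("average AoI:", aoi.mean())
```

The average AoI printed at the end is the quantity the paper's joint scheduling, pruning, and resource-allocation problem seeks to minimize; in this sketch the pruning mask is random rather than optimized.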