Greenhouses are a critical component of modern agriculture, and accurate predictions of the temperature and humidity inside them are essential for mitigating crop diseases and optimizing the growth environment. However, short- and medium-term forecasting of temperature and humidity is challenging because of the complexity of greenhouse microclimates. This paper presents a hybrid model that integrates a temporal convolutional network optimized with a frequency-enhanced channel attention mechanism (TCN-FECAM) and an iTransformer. A cross-attention mechanism combines the advantages of the two models, and a 48-sequence sliding window strategy is used to produce accurate multistep predictions of temperature and humidity over horizons of 3 h to 24 h. The experimental results demonstrate that the TCN-FECAM-iTransformer model outperforms GRU, LSTM, Informer, Autoformer, Crossformer, FAM-LSTM, and TPA-LSTM across the tested time scales. Specifically, for temperature prediction, the model achieves R² values of 0.979, 0.973, 0.968, and 0.953 and RMSE values of 0.657, 0.806, 0.923, and 1.126 at the 3 h, 6 h, 12 h, and 24 h horizons, respectively. For humidity prediction, the model obtains R² values of 0.976, 0.961, 0.947, and 0.939 and RMSE values of 1.805, 2.567, 3.132, and 3.451 at the same four horizons, respectively. The model therefore exhibits reliable performance in predicting temperature and humidity in greenhouse environments, offering robust support for monitoring and early warning in crop growth environments.
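As a rough illustration of the sliding window strategy and the two reported metrics (RMSE and the coefficient of determination), the following is a minimal NumPy sketch, not the authors' implementation. The function names `make_windows`, `rmse`, and `r2` are hypothetical, and the forecast horizon is given in time steps because the sampling interval is not stated here; the number of steps that corresponds to 3 h or 24 h is an assumption left to the reader.

```python
import numpy as np


def make_windows(series, window=48, horizon=6):
    """Slice a (T, F) series into input/target pairs for multistep forecasting.

    Assumption: `window` is the number of past time steps fed to the model
    (48 in the text) and `horizon` is the number of future steps to predict.
    Returns X with shape (N, window, F) and Y with shape (N, horizon, F).
    """
    series = np.asarray(series, dtype=float)
    if series.ndim == 1:
        series = series[:, None]  # treat a 1-D series as a single feature
    n = series.shape[0] - window - horizon + 1
    if n <= 0:
        raise ValueError("series is too short for this window and horizon")
    X = np.stack([series[i:i + window] for i in range(n)])
    Y = np.stack([series[i + window:i + window + horizon] for i in range(n)])
    return X, Y


def rmse(y_true, y_pred):
    """Root mean squared error over all elements."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

In a setup like this, each window in `X` would be passed to the forecasting model and its output compared with the matching rows of `Y` using `rmse` and `r2`, once per horizon.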