Analyzing human joint moments is an essential step in evaluating walking function. However, measuring joint moments is largely confined to the laboratory because it requires force plates to measure ground reaction forces, motion capture cameras to collect joint kinematics, and computational modeling software to calculate joint moments. Although this approach yields accurate joint moments, estimating them outside the laboratory during daily living, especially across walking conditions such as ramps and stairs, is challenging because it requires multiple wearable sensors and expensive insole pressure sensors. A large number of sensors is not only cumbersome to install and calibrate but also restricts natural movement. There is a need for an accurate joint moment estimation method that uses an affordable, minimal set of sensors. This paper proposes a novel machine learning algorithm that estimates sagittal-plane hip, knee, and ankle joint moments from a single foot-mounted IMU during treadmill, level-ground, stair, and ramp walking. The proposed DL-Kinetics-FM-Net comprises an end-to-end trained model, a fusion module (FM) that improves joint moment estimation, and a novel technique for integrating two loss functions more efficiently than the conventional loss design. Through comprehensive evaluation, we demonstrate the effectiveness of each proposed component of our model. Specifically, DL-Kinetics-FM-Net reduces NRMSE by 7.10% to 23.16% compared with the state-of-the-art deep learning algorithm for joint moment estimation. To our knowledge, this is the first study to estimate hip, knee, and ankle joint moments across multiple walking conditions using a single foot-mounted IMU via deep learning.
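The abstract reports improvements in NRMSE without stating the normalization used. One common definition is RMSE divided by the peak-to-peak range of the reference signal; a minimal sketch under that assumption (the paper's exact normalization may differ):

```python
import numpy as np

def nrmse(y_true, y_pred):
    """RMSE normalized by the peak-to-peak range of the reference signal.

    Assumes range normalization; other conventions divide by the mean or
    standard deviation of the reference instead.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_pred - y_true) ** 2))
    return rmse / (np.max(y_true) - np.min(y_true))
```

For example, a constant prediction of 1.0 against a reference that swings between 0 and 2 gives an RMSE of 1.0 over a range of 2.0, i.e., an NRMSE of 0.5 (50%).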