The Regression Learning of the Imbalanced and Big Data by the Online Mixture Model for the Mach Number Forecasting
Cited by: 3
Authors:
Wang, Xiao-Jun [1]
Liu, Yan [2]
Yuan, Ping [2]
Zhou, Chang-Jun [3]
Zhang, Lin [4]
Affiliations:
[1] Dongbei Univ Finance & Econ, Sch Management Sci & Engn, Dalian 116025, Peoples R China
[2] Northeastern Univ, Coll Informat Sci & Engn, Shenyang 110004, Liaoning, Peoples R China
[3] Zhejiang Normal Univ, Coll Math Phys & Informat Engn, Jinhua 321004, Peoples R China
[4] China Aerodynam Res & Dev Ctr, High Speed Aerodynam Inst, Mianyang 621000, Peoples R China
Extracting valuable information from imbalanced big data to improve forecasting models requires scalable implementations of advanced statistical learning methods. This paper proposes the online mixture model (OMM) and applies it to Mach number forecasting. Forecasting a key variable (e.g., the Mach number) under all working conditions is treated as a whole task, and forecasting under each individual working condition as a subtask; on this basis, the OMM separates the dense samples from the sparse ones by subtask. The subtask models are learned independently on samples of reduced volume and are updated for new working conditions without retaining samples from the old working conditions. Moreover, the tree-structure ensemble (TSE)-feature subsets ensembles (FSEs) algorithm is presented to fit the nonlinear function of a subtask model: FSE local models with low-dimensional input features are built on the non-overlapping sample subsets constructed by the TSE method. The TSE-FSEs algorithm not only reduces the data volume but also supports distributed computing through its parallel structure, which makes it well suited to learning from big data. Experiments on wind tunnel measurement data indicate that the OMM with the TSE-FSEs outperforms other learning algorithms for Mach number forecasting and meets engineering requirements for precision and forecasting speed.
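The abstract gives no algorithmic detail, so the following Python sketch is only a loose illustration of the structure it describes, not the authors' OMM or TSE-FSEs algorithm. It assumes that each sample arrives with a known working-condition label, that the tree partition can be approximated by median splits cycling through the features, and that each local ensemble member is a one-feature linear regression. All class and function names (`SubtaskModels`, `TreePartition`, `LeafEnsemble`, `fit_line`), the condition labels, and the synthetic data are hypothetical.

```python
import random
from statistics import mean, median


def fit_line(xs, ys):
    """Closed-form simple linear regression; returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return slope, my - slope * mx


class LeafEnsemble:
    """Averages one-feature linear models (assumed stand-in for feature-subset members)."""

    def __init__(self, X, y, features):
        self.members = [(j, *fit_line([row[j] for row in X], y)) for j in features]

    def predict_one(self, row):
        return mean(a * row[j] + b for j, a, b in self.members)


class TreePartition:
    """Median splits yield non-overlapping sample subsets, one LeafEnsemble per subset."""

    def __init__(self, X, y, depth, features, min_leaf=10):
        self.split = None
        j = depth % len(X[0])  # assumed splitting rule: cycle through the features
        t = median(row[j] for row in X)
        left = [i for i, row in enumerate(X) if row[j] <= t]
        right = [i for i, row in enumerate(X) if row[j] > t]
        if depth > 0 and min(len(left), len(right)) >= min_leaf:
            self.split = (j, t)
            self.children = [
                TreePartition([X[i] for i in idx], [y[i] for i in idx],
                              depth - 1, features, min_leaf)
                for idx in (left, right)
            ]
        else:
            self.leaf = LeafEnsemble(X, y, features)

    def predict_one(self, row):
        if self.split is None:
            return self.leaf.predict_one(row)
        j, t = self.split
        return self.children[0 if row[j] <= t else 1].predict_one(row)


class SubtaskModels:
    """Keeps one independently trained model per working condition."""

    def __init__(self):
        self.models = {}

    def update(self, condition, X, y, depth=2):
        # Trained only on the new condition's samples; earlier samples are not stored.
        self.models[condition] = TreePartition(X, y, depth, list(range(len(X[0]))))

    def predict(self, condition, row):
        return self.models[condition].predict_one(row)


# Synthetic data for two hypothetical working conditions, arriving one after another.
rng = random.Random(0)
model = SubtaskModels()
for condition, slope in (("condition_1", 1.0), ("condition_2", -2.0)):
    X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
    y = [slope * a + b * b for a, b in X]
    model.update(condition, X, y)

print(model.predict("condition_1", [0.5, 0.0]))
```

Because each leaf and each working condition is fitted independently, the fits could in principle run in parallel, which is the property the abstract attributes to the parallel structure. Adding `condition_2` leaves the `condition_1` model untouched, mirroring the claim that old samples need not be retained.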