Random Interaction Forest (RIF)-A Novel Machine Learning Strategy Accounting for Feature Interaction

Cited by: 5
Authors
Guo, Chao-Yu [1]
Lin, Yi-Jyun [1]
Affiliations
[1] Natl Yang Ming Chiao Tung Univ, Inst Publ Hlth, Coll Med, Div Biostat & Data Sci, Taipei 112304, Taiwan
Keywords
Interaction; random forest; linear regression; logistic regression; machine learning
DOI
10.1109/ACCESS.2022.3233194
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
When an interaction exists in medical and health-science data, the analysis must account for it to avoid erroneous conclusions. For example, the therapeutic effect of a drug may differ between genders, or an adverse interaction between two medicines may change their pharmacological activity, reduce the therapeutic effect, or induce toxicity. An analysis that ignores the interaction may therefore suffer from substantial prediction error or bias. Regression models handle a two-way interaction by adding the product of the two interacting variables as an extra term. Because machine learning models demonstrate predictive ability superior to that of regression models, this study proposes a new random-forest-based method that accounts for interaction, called the random interaction forest (RIF). The strategy modifies the structure of the random forest so that the interaction features are forced into the first two nodes. Simulation studies compared the predictive ability of the linear regression model, the logistic regression model, the random forest, and the RIF under various scenarios. The results showed that the RIF consistently outperforms the random forest and logistic regression when interactions are present, and that it also performs better than the linear regression model in many scenarios. The stronger the interaction effect, the greater the advantage of the RIF can be.
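The two mechanisms named in the abstract can be illustrated on a toy example. The sketch below is not the authors' implementation: the data, the coefficients, and the helper names (`solve`, `ols`, `sse`) are assumptions made for illustration. It fits a linear model with and without the product term, then applies a forced two-level partition (split on the first interacting feature, then on the second), which is only one loose reading of "forced to be in the first two nodes".

```python
# Illustrative sketch only; not the authors' RIF implementation.
# Assumed toy data: binary x1 (e.g. group) and x2 (e.g. treatment), noise-free
# outcome y = 1 + 2*x1 + 3*x2 + 4*x1*x2, where 4 is the interaction effect.

def solve(a, b):
    """Solve the square linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def sse(rows, y, beta):
    """Sum of squared errors of a fitted linear model."""
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, y))

points = [(0, 0), (1, 0), (0, 1), (1, 1)]
y = [1 + 2 * x1 + 3 * x2 + 4 * x1 * x2 for x1, x2 in points]

additive = [[1, x1, x2] for x1, x2 in points]                # main effects only
with_product = [[1, x1, x2, x1 * x2] for x1, x2 in points]   # adds product term

beta_additive = ols(additive, y)
beta_product = ols(with_product, y)
sse_additive = sse(additive, y, beta_additive)
sse_product = sse(with_product, y, beta_product)

# Forced two-level partition (assumption): split on x1 first, then on x2,
# and predict the mean outcome of each resulting cell.
cells = {}
for (x1, x2), t in zip(points, y):
    cells.setdefault(x1, {}).setdefault(x2, []).append(t)
cell_means = {(x1, x2): sum(v) / len(v)
              for x1, d in cells.items() for x2, v in d.items()}
sse_partition = sum((t - cell_means[p]) ** 2 for p, t in zip(points, y))

print(sse_additive, sse_product, sse_partition)
```

On this toy data the additive model cannot reproduce the outcome, while both the product-term model and the forced partition can. Real data would add noise, continuous features, and many trees, none of which this sketch attempts to model.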
Pages: 1806-1813
Number of pages: 8