Overlooked pitfalls in multi-class machine learning classification in radiation oncology and how to avoid them

Cited by: 4
Authors
Chatterjee, Avishek [1]
Vallieres, Martin [1]
Seuntjens, Jan [1]
Affiliations
[1] McGill Univ, Med Phys Unit, Montreal, PQ, Canada
Source
PHYSICA MEDICA-EUROPEAN JOURNAL OF MEDICAL PHYSICS | 2020 / Vol. 70
Funding
Natural Sciences and Engineering Research Council of Canada; Canadian Institutes of Health Research
Keywords
Machine Learning; Multi-class classification; Radiomics; Surrogate marker; RADIOMICS; FEATURES;
DOI
10.1016/j.ejmp.2020.01.009
Chinese Library Classification (CLC) number
R8 [Special Medicine]; R445 [Diagnostic Imaging]
Discipline classification code
1002; 100207; 1009
Abstract
In radiation oncology, Machine Learning classification publications are typically related to two outcome classes, e.g., the presence or absence of distant metastasis. However, multi-class classification problems also have great clinical relevance, e.g., predicting the grade of a treatment complication following lung irradiation. This work comprised two studies aimed at making work in this domain less prone to statistical blindsides. In multi-class classification, AUC is not defined, whereas correlation coefficients are. It may seem that solely quoting the correlation coefficient value (in lieu of the AUC value) is a suitable choice. In the first study, we illustrated using Monte Carlo (MC) models why this choice is misleading. We also considered the special case where the multiple classes are not ordinal, but nominal, and explained why Pearson or Spearman correlation coefficients not only provide incomplete information but are actually meaningless. The second study concerned surrogate biomarkers for a clinical endpoint, which have purported benefits including potential for early assessment, being inexpensive, and being non-invasive. Using an MC experiment, we showed how conclusions derived from surrogate markers can be misleading. The simulated endpoint was radiation toxicity (scale of 0-5). The surrogate marker was the true toxicity grade plus a noise term. Five patient cohorts were simulated, including one control. Two of the cohorts were designed to have a statistically significant difference in toxicity. Under 1000 repeated experiments using the biomarker, these two cohorts were often found to be statistically indistinguishable, with the fraction of such occurrences rising with the level of noise.
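The surrogate-marker experiment described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the abstract does not state the cohort size, the grade distributions, the noise model, or the statistical test, so the values and helper names below (`LOW_TOXICITY`, `HIGH_TOXICITY`, Gaussian noise, 100 patients per cohort, a normal-approximation Welch test at alpha = 0.05) are all assumptions chosen for illustration. Only two of the five cohorts are simulated here.

```python
import random
from statistics import NormalDist, mean, variance

# Hypothetical grade probabilities (grades 0-5) for two cohorts whose true
# toxicity differs; these numbers are assumptions, not values from the paper.
LOW_TOXICITY = [30, 30, 20, 10, 7, 3]
HIGH_TOXICITY = [15, 25, 25, 15, 12, 8]


def simulate_cohort(rng, n_patients, weights):
    """Draw true toxicity grades (0-5) for one cohort."""
    return rng.choices(range(6), weights=weights, k=n_patients)


def surrogate_marker(rng, grades, sigma):
    """Surrogate = true grade + noise (Gaussian noise is an assumption)."""
    return [g + rng.gauss(0.0, sigma) for g in grades]


def two_sided_p_value(a, b):
    """Normal approximation to Welch's test for a difference in means."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    if se == 0.0:
        # Degenerate case: no spread in either sample.
        return 1.0 if mean(a) == mean(b) else 0.0
    z = (mean(a) - mean(b)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def fraction_indistinguishable(sigma, n_experiments=1000, n_patients=100,
                               alpha=0.05, seed=0):
    """Fraction of repeated experiments in which the two cohorts are NOT
    separated at level alpha when only the noisy surrogate is observed."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(n_experiments):
        low = simulate_cohort(rng, n_patients, LOW_TOXICITY)
        high = simulate_cohort(rng, n_patients, HIGH_TOXICITY)
        a = surrogate_marker(rng, low, sigma)
        b = surrogate_marker(rng, high, sigma)
        if two_sided_p_value(a, b) >= alpha:
            misses += 1
    return misses / n_experiments


if __name__ == "__main__":
    for sigma in (0.0, 1.0, 2.0, 3.0):
        frac = fraction_indistinguishable(sigma)
        print(f"noise sigma={sigma:.1f}: indistinguishable in {frac:.1%} of runs")
```

Under these assumptions, the printed fraction should grow as `sigma` increases, which mirrors the qualitative finding in the abstract; the exact numbers depend entirely on the assumed cohort size, distributions, and test.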
Pages: 96-100
Number of pages: 5