Statistical Hypothesis Testing Based on Machine Learning: Large Deviations Analysis

Cited by: 7
Authors
Braca, Paolo [1 ]
Millefiori, Leonardo M. [1 ]
Aubry, Augusto [2 ]
Marano, Stefano [3 ]
De Maio, Antonio [2 ]
Willett, Peter [4 ]
Affiliations
[1] Ctr Maritime Res & Experimentat, Res Dept, I-19126 La Spezia, SP, Italy
[2] Univ Naples Federico II, DIETI, I-80125 Naples, NA, Italy
[3] Univ Salerno, DIEM, I-84084 Fisciano, SA, Italy
[4] Univ Connecticut, Dept Elect & Comp Engn, Storrs, CT 06269 USA
Source
IEEE OPEN JOURNAL OF SIGNAL PROCESSING | 2022 / Vol. 3
Keywords
Error probability; Training; Artificial intelligence; Convergence; Error analysis; Surveillance; Signal processing; Machine learning; deep learning; large deviations principle; exact asymptotics; statistical hypothesis testing; Fenchel-Legendre transform; extended target detection; radar; sonar detection; X-band maritime radar; EXTENDED TARGET TRACKING; DISTRIBUTED DETECTION; ARTIFICIAL-INTELLIGENCE; MARITIME SURVEILLANCE; MULTIPLE SENSORS; NEURAL-NETWORK; DEEP; CLASSIFICATION; ALGORITHMS; CONSENSUS;
DOI
10.1109/OJSP.2022.3232284
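Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code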
0808 ; 0809 ;
Abstract
We study the performance of Machine Learning (ML) classification techniques. Leveraging the theory of large deviations, we provide the mathematical conditions for an ML classifier to exhibit error probabilities that vanish exponentially, say exp(-n I), where n is the number of informative observations available for testing (or another relevant parameter, such as the size of the target in an image) and I is the error rate. Such conditions depend on the Fenchel-Legendre transform of the cumulant-generating function of the Data-Driven Decision Function (D3F, i.e., what is thresholded before the final binary decision is made) learned in the training phase. As such, the D3F and the related error rate I depend on the given training set. The conditions for the exponential convergence can be verified and tested numerically by exploiting the available dataset or a synthetic dataset generated according to the underlying statistical model. Consistent with large deviations theory, we can also establish the convergence of the normalized D3F statistic to a Gaussian distribution. Furthermore, approximate error probability curves zeta(n) exp(-n I) are provided, thanks to the refined asymptotic derivation, where zeta(n) represents the most representative sub-exponential terms of the error probabilities. Leveraging the refined asymptotics, we are able to compute an accurate analytical approximation of the classification performance in both the small-n and large-n regimes. Theoretical findings are corroborated by extensive numerical simulations and by the use of real-world data, acquired by an X-band maritime radar system for surveillance.
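As an illustration of the numerical check mentioned in the abstract, the following is a minimal sketch of how an error rate I might be estimated from samples as the Fenchel-Legendre transform of an empirical cumulant-generating function. It is not the authors' procedure. The function names (`empirical_cgf`, `rate_function`), the grid search over t, the threshold `gamma`, and the standard Gaussian samples standing in for per-observation D3F outputs are all assumptions made for this example. It further assumes that the test statistic is an average of n i.i.d. terms.

```python
import numpy as np

def empirical_cgf(samples, t):
    """Empirical cumulant-generating function: log of the sample mean of exp(t * x).

    The maximum exponent is subtracted before exponentiating for numerical stability.
    """
    z = t * samples
    m = z.max()
    return m + np.log(np.mean(np.exp(z - m)))

def rate_function(samples, gamma, t_grid):
    """Fenchel-Legendre transform on a grid: max over t of [t * gamma - cgf(t)]."""
    values = np.array([t * gamma - empirical_cgf(samples, t) for t in t_grid])
    return values.max()

# Hypothetical stand-in data: standard Gaussian samples, NOT the paper's radar data.
rng = np.random.default_rng(0)
samples = rng.normal(loc=0.0, scale=1.0, size=200_000)

gamma = 1.0                           # assumed decision threshold
t_grid = np.linspace(0.0, 3.0, 301)   # assumed search grid for t >= 0

I_hat = rate_function(samples, gamma, t_grid)

# First-order approximation exp(-n * I) of the probability that the average
# of n terms exceeds gamma (sub-exponential factor zeta(n) omitted).
n = 20
approx_error = np.exp(-n * I_hat)
```

For a standard Gaussian the cumulant-generating function is t^2/2, so the exact rate at gamma = 1 is 1/2, which gives a sanity check for the estimate. The grid search is the simplest possible optimizer; a scalar root-finder on the derivative of the empirical CGF would be a natural refinement.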
Pages: 464 - 495
Page count: 32