Convolutional neural network for automated peak detection in reversed-phase liquid chromatography

Cited by: 13
Authors
Kensert, Alexander [1, 2]
Bosten, Emery [1, 4]
Collaerts, Gilles [1, 2]
Efthymiadis, Kyriakos [1, 3]
Van Broeck, Peter [4 ]
Desmet, Gert [2 ]
Cabooter, Deirdre [1 ]
Affiliations
[1] Univ Leuven KU Leuven, Dept Pharmaceut & Pharmacol Sci, Pharmaceut Anal, Herestr 49, B-3000 Leuven, Belgium
[2] Vrije Univ Brussel, Dept Chem Engn, Pl Laan 2, B-1050 Brussels, Belgium
[3] Vrije Univ Brussel, Dept Comp Sci, Artificial Intelligence Lab, Pl Laan 9, B-1050 Brussels, Belgium
[4] Janssen Pharmaceut, Dept Pharmaceut Dev & Mfg Sci, Turnhoutseweg 30, B-2340 Beerse, Belgium
Keywords
Machine learning; Convolutional neural networks; Peak finding; Method development
DOI
10.1016/j.chroma.2022.463005
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Although commercially available software provides options for automatic peak detection, visual inspection and manual corrections are often needed. Commonly employed peak detection algorithms require carefully written rules and thresholds to increase true positive rates and decrease false positive rates. In this study, a deep learning model, specifically a convolutional neural network (CNN), was implemented to perform automatic peak detection in reversed-phase liquid chromatography (RPLC). The model takes a whole chromatogram as input and outputs predicted locations, probabilities, and areas of the peaks. The results obtained on a simulated validation set demonstrated that the model performed well (ROC-AUC of 0.996), and comparably to or better than a derivative-based approach using the Savitzky-Golay algorithm for detecting peaks in experimental chromatograms (8.6% increase in true positives). In addition, predicted peak probabilities (typically between 0.5 and 1.0 for true positives) gave an indication of how confident the CNN model was in the peaks detected. The CNN model was trained entirely on simulated chromatograms (a training set of 1,000,000 chromatograms), and thus no effort had to be put into collecting and labeling chromatograms. A potential major drawback of this approach, namely training a CNN model on simulated chromatograms, is the risk of not capturing the actual "chromatogram space" well enough to perform accurate peak detection in real chromatograms. © 2022 Elsevier B.V. All rights reserved.
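The derivative-based baseline mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a noise-free synthetic chromatogram made of two Gaussian peaks, a quadratic Savitzky-Golay fit, and a hand-picked window and threshold. The names `savgol_second_derivative`, `detect_peaks`, `half_width`, and `threshold` are hypothetical and chosen only for illustration.

```python
import math


def savgol_second_derivative(y, half_width):
    """Second derivative from a local quadratic least-squares fit
    (Savitzky-Golay) over a window of 2 * half_width + 1 points."""
    m = half_width
    offsets = range(-m, m + 1)
    s0 = 2 * m + 1
    s2 = sum(i * i for i in offsets)
    s4 = sum(i ** 4 for i in offsets)
    denom = s0 * s4 - s2 * s2
    # Closed-form convolution coefficients for the quadratic fit.
    coeffs = [2.0 * (s0 * i * i - s2) / denom for i in offsets]
    d2 = [0.0] * len(y)  # window edges are left at zero
    for k in range(m, len(y) - m):
        d2[k] = sum(c * y[k + i] for c, i in zip(coeffs, offsets))
    return d2


def detect_peaks(y, half_width=5, threshold=0.002):
    """Return indices where the smoothed second derivative has a local
    minimum below -threshold (assumed rule, not taken from the paper)."""
    d2 = savgol_second_derivative(y, half_width)
    return [
        k
        for k in range(1, len(y) - 1)
        if d2[k] < -threshold and d2[k] <= d2[k - 1] and d2[k] < d2[k + 1]
    ]


def gaussian(n, center, sigma, height):
    return [height * math.exp(-0.5 * ((k - center) / sigma) ** 2) for k in range(n)]


# Synthetic chromatogram: two well-separated Gaussian peaks (assumed data).
signal = [a + b for a, b in zip(gaussian(300, 100, 8.0, 1.0),
                                gaussian(300, 200, 8.0, 0.5))]
peaks = detect_peaks(signal)
print(peaks)
```

The window and threshold have to be tuned by hand for each noise level and peak width, which is the kind of rule writing the abstract contrasts with a trained CNN.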
Pages: 10