Machine learning-based protein crystal detection for monitoring of crystallization processes enabled with large-scale synthetic data sets of photorealistic images

Cited by: 10
Authors
Bischoff, Daniel [1]
Walla, Brigitte [1]
Weuster-Botz, Dirk [1]
Affiliations
[1] Tech Univ Munich, Inst Biochem Engn, Boltzmannstr 15, D-85748 Garching, Germany
Keywords
Protein crystallization; Automated image analysis; Synthetic data sets; Deep learning; Particle-size distributions; Online measurement; Shape; Segmentation
DOI
10.1007/s00216-022-04101-8
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Preparative chromatography poses a sustainability challenge because of the large amounts of consumables used in downstream processing of biomolecules, so protein crystallization offers a promising alternative as a purification method. While the limited crystallizability of proteins often restricts broad application of crystallization as a purification method, advances in molecular biology and in computational methods are pushing its applicability towards integration in biotechnological downstream processes. However, in industrial and academic settings, monitoring protein crystallization processes non-invasively by microscopic photography and automated image evaluation remains a challenging problem. Recently, the identification of single crystal objects using deep learning has received increased attention for various model systems. However, the advancement of crystal detection using deep learning for biotechnological applications is limited: robust models obtained through supervised machine learning require large-scale, high-quality data sets, usually obtained in large projects through extensive manual labeling, an approach that is highly error-prone for dense systems of transparent crystals. For the first time, recent trends involving the use of synthetic data sets for supervised learning are transferred to this problem: photorealistic images of virtual protein crystals in suspension (PCS) are generated with ray tracing algorithms, accompanied by specialized data augmentations modelling experimental noise. Further, it is demonstrated that state-of-the-art models trained with the large-scale synthetic PCS data set outperform similar fine-tuned models based on the average precision metric on a validation data set, followed by experimental validation using high-resolution photomicrographs from stirred tank protein crystallization processes.
Pages: 6379-6391
Page count: 13