Machine Learning Based Prediction of Enzymatic Degradation of Plastics Using Encoded Protein Sequence and Effective Feature Representation

Cited by: 14
Authors
Jiang, Renjing [1]
Shang, Lanyu [2]
Wang, Ruohan [1]
Wang, Dong [2]
Wei, Na [1]
Affiliations
[1] Univ Illinois, Dept Civil & Environm Engn, Urbana, IL 61801 USA
[2] Univ Illinois, Sch Informat Sci, Champaign, IL 61820 USA
Funding
U.S. National Science Foundation
Keywords
Machine learning; plastic waste; enzymatic degradation; enzyme function; sequence representation; HEAT-CAPACITY; PORE-SIZE; TECHNOLOGIES; DEPOLYMERASE; HYDROLYSIS; DIFFUSION; SUBSTRATE
DOI
10.1021/acs.estlett.3c00293
Chinese Library Classification
X [Environmental science, safety science]
Discipline classification codes
08; 0830
Abstract
Enzyme biocatalysis for plastic treatment and recycling is an emerging field of growing interest. However, it is challenging and time-consuming to identify plastic-degrading enzymes with desirable functionality, given the large number of putative enzyme sequences. There is a critical need to develop an effective approach to accurately predict the enzyme activity in degrading different types of plastics. In this study, we developed a machine-learning-based plastic enzymatic degradation (PED) framework to predict the ability of an enzyme to degrade plastics of interest by exploring and recognizing hidden patterns in protein sequences. A data set integrating information from a wide range of experimentally verified enzymes and various common plastic substrates was created. A new context-aware enzyme sequence representation (CESR) mechanism was developed to learn the abundant contextual information in enzyme sequences, and feature extraction was performed for enzymes at both the amino acid level and global sequence level. Thirteen machine learning classification algorithms were compared, and XGBoost was identified as the best-performing algorithm. PED achieved an overall accuracy of 90.2% and outperformed sequence-based protein classification models from the existing literature. Furthermore, important enzyme features in plastic degradation were identified and comprehensively interpreted. This study demonstrated a new tool for the prediction and discovery of plastic-degrading enzymes.
Pages: 557-564
Page count: 8
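
The abstract mentions feature extraction at both the amino acid level and the global sequence level. As a rough illustration of that idea only, the sketch below turns a protein sequence into a fixed-length numeric vector. It is not the paper's CESR mechanism or its actual feature set: the function names, the hydrophobic residue grouping, and the toy sequence are all assumptions made for this example.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
HYDROPHOBIC = set("AVILMFWY")  # illustrative grouping (an assumption)


def amino_acid_composition(seq):
    """Amino-acid-level features: fraction of each standard residue.

    Non-standard symbols (e.g. 'X') are counted in the length but get no
    column of their own, so the fractions may sum to less than 1.
    """
    seq = seq.upper()
    counts = Counter(seq)
    return [counts[aa] / len(seq) for aa in AMINO_ACIDS]


def global_features(seq):
    """Sequence-level features: length and hydrophobic fraction."""
    seq = seq.upper()
    hydrophobic = sum(1 for aa in seq if aa in HYDROPHOBIC)
    return [float(len(seq)), hydrophobic / len(seq)]


def featurize(seq):
    """Concatenate both feature groups into one 22-element vector."""
    if not seq:
        raise ValueError("empty sequence")
    return amino_acid_composition(seq) + global_features(seq)


example = "MKAVLA"  # hypothetical toy sequence, not from the paper's data set
vector = featurize(example)
print(len(vector))
```

Vectors of this kind, paired with labels for the plastic substrate, could then be passed to a classifier such as the XGBoost model the abstract reports as best-performing; the features and training setup actually used are described in the paper itself.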