Refining Mitochondrial Intron Classification With ERPIN: Identification Based on Conservation of Sequence Plus Secondary Structure Motifs

Cited by: 8
Authors
Prince, Samuel [1]
Munoz, Carl [1]
Filion-Bienvenue, Fannie [1]
Rioux, Pierre [1]
Sarrasin, Matt [1]
Lang, B. Franz [1]
Affiliations
[1] Univ Montreal, Robert Cedergren Ctr Bioinformat & Genom, Dept Biochim, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
mitochondrial introns; group I; ERPIN; covariance models; Infernal; RNA structure; GROUP-I INTRONS; RNA INTERVENING SEQUENCE; CYTOCHROME-OXIDASE; FISSION YEAST; DNA-SEQUENCE; GENE; GENOME; EVOLUTION; PROTEINS; HOMOLOGY
DOI
10.3389/fmicb.2022.866187
Chinese Library Classification (CLC) number
Q93 [Microbiology]
Subject classification codes
071005; 100705
Abstract
Mitochondrial genomes, in particular those of fungi, often encode genes with a large number of Group I and Group II introns that are conserved at both the sequence and the RNA structure level. They provide a rich resource for the investigation of intron and gene structure, self- and protein-guided splicing mechanisms, and intron evolution. Yet the degree of sequence conservation of introns is limited, and the primary sequence differs considerably among the distinct intron sub-groups. This makes intron identification, classification, structural modeling, and the inference of gene models a challenging and error-prone task, one frequently passed on to an "expert" for manual intervention. To reduce the need for manual curation of intron structures and mitochondrial gene models, computational methods using ERPIN sequence profiles were initially developed in 2007. Here we present a refinement of search models and alignments using the now abundant publicly available fungal mtDNA sequences. In addition, we have tested how far members of the originally proposed sub-groups are clearly distinguished and validated by our computational approach. We confirm clearly distinct mitochondrial Group I sub-groups IA1, IA3, IB3, IC1, IC2, and ID. However, the IB1, IB2, and IB4 ERPIN models overlap substantially in their predictions and are therefore combined and reported as IB. We have further explored the conversion of our ERPIN profiles into covariance models (CM). Current limitations and prospects of the CM approach are discussed.
Pages: 13