MAFin: motif detection in multiple alignment files

Authors
Patsakis, Michail [1, 2]
Provatas, Kimonas [1, 2, 3]
Baltoumas, Fotis A. [4]
Chantzi, Nikol [1, 2]
Mouratidis, Ioannis [1, 2]
Pavlopoulos, Georgios A. [4]
Georgakopoulos-Soares, Ilias [1, 2]
Affiliations
[1] Penn State Univ, Coll Med, Inst Personalized Med, Dept Mol Biol & Pharmacol, 500 Univ Dr,C5716, Hershey, PA 17033 USA
[2] Penn State Univ, Huck Inst Life Sci, University Pk, PA 16802 USA
[3] Univ Crete, Div Basic Sci, Med Sch, Iraklion 71110, Greece
[4] BSRC Alexander Fleming, Inst Fundamental Biomed Res, Vari 16672, Greece
Funding
National Institutes of Health (USA);
Keywords
ELEMENTS; VERTEBRATE; INSECT;
DOI
10.1093/bioinformatics/btaf125
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: Whole genome and proteome alignments, represented in the Multiple Alignment Format (MAF), have become a standard approach in comparative genomics and proteomics. These analyses often require identifying conserved motifs, which is crucial for understanding functional and evolutionary relationships. However, current approaches lack a direct method for motif detection within MAF files. To address this gap, we present MAFin, a novel tool that enables efficient motif detection and conservation analysis in MAF files, streamlining genomic and proteomic research.

Results: We developed MAFin, the first motif detection tool for MAF files. MAFin enables the multithreaded search of conserved motifs using three approaches: (i) user-specified k-mers, which are searched in the sequences; (ii) regular expressions, in which case one or more patterns are searched; and (iii) predefined Position Weight Matrices. Once a motif has been found, MAFin detects the motif instances and calculates their conservation across the aligned sequences. MAFin also calculates a conservation percentage, which describes the conservation level of each motif across the aligned sequences, based on the number of matches relative to the length of the motif. A set of statistics enables the interpretation of each motif's conservation level, and the detected motifs are exported in JSON and CSV files for downstream analyses.

Availability and implementation: MAFin is offered as a Python package under the GPL license as a multi-platform application and is available at: https://github.com/Georgakopoulos-Soares-lab/MAFin.
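The conservation percentage described in the abstract can be illustrated with a minimal Python sketch. It is not MAFin's code or API: the function name `find_motif_conservation`, the toy alignment block, and the choice to score each aligned sequence column by column against the reference match are all assumptions made for illustration. The sketch searches a regular expression in the ungapped reference sequence, maps the match back to alignment columns, and reports the number of matching positions relative to the motif length.

```python
import re

# Toy alignment block (hypothetical data): name -> gapped aligned sequence.
BLOCK = {
    "hg38.chr1":    "ACGT-TATAAAGGC",
    "panTro6.chr1": "ACGTATATAAAGGC",
    "mm39.chr4":    "ACG--TATTAAGGC",
}


def find_motif_conservation(block, reference, pattern):
    """Search `pattern` in the ungapped reference and score the other sequences.

    Illustrative only: conservation is taken here as the percentage of motif
    positions at which an aligned sequence carries the same character as the
    reference match.
    """
    ref_aligned = block[reference]
    # Alignment columns that hold a real (non-gap) reference character.
    columns = [i for i, base in enumerate(ref_aligned) if base != "-"]
    ungapped = "".join(ref_aligned[i] for i in columns)

    results = []
    for match in re.finditer(pattern, ungapped, flags=re.IGNORECASE):
        motif = match.group(0).upper()
        if not motif:
            continue  # skip empty matches to avoid dividing by zero
        # Translate ungapped match coordinates back to alignment columns.
        motif_cols = columns[match.start():match.end()]
        conservation = {}
        for name, seq in block.items():
            if name == reference:
                continue
            same = sum(
                1 for col, base in zip(motif_cols, motif)
                if seq[col].upper() == base
            )
            conservation[name] = 100.0 * same / len(motif)
        results.append({
            "motif": motif,
            "start": match.start(),  # 0-based, in ungapped reference coordinates
            "conservation": conservation,
        })
    return results


if __name__ == "__main__":
    for hit in find_motif_conservation(BLOCK, "hg38.chr1", "TATAAA"):
        print(hit)
```

A k-mer search is the special case in which the pattern contains no regular-expression metacharacters. A Position Weight Matrix search would replace the `re.finditer` call with a scoring scan, which this sketch does not attempt.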
Pages: 6