RL-MD: A Novel Reinforcement Learning Approach for DNA Motif Discovery

Cited by: 0
Authors
Wang, Wen [1]
Wang, Jianzong [1]
Si, Shijing [1,2]
Huang, Zhangcheng [1]
Xiao, Jing [1]
Affiliations
[1] Ping An Technol Shenzhen Co Ltd, Shenzhen, Peoples R China
[2] Shanghai Int Studies Univ, Sch Econ & Finance, Shanghai, Peoples R China
Source
2022 IEEE 9TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA) | 2022
Keywords
Biomedical AI; Reinforcement Learning; Computational Biology; Bioinformatics; DNA Motif Discovery; REGULATORY CODE; BINDING-SITES; DEEP; GENOME; ALGORITHM;
DOI
10.1109/DSAA54385.2022.10032340
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The extraction of sequence patterns from a collection of functionally linked, unlabeled DNA sequences is known as DNA motif discovery, and it is a key task in computational biology. Several deep learning-based techniques have recently been introduced to address this task. However, these algorithms cannot be used in real-world situations because they require labeled data. Here, we present RL-MD, a novel reinforcement learning-based approach to DNA motif discovery. RL-MD takes unlabeled data as input, employs a relative information-based method to evaluate each proposed motif, and uses these continuous evaluation results as the reward. Experiments show that RL-MD can identify high-quality motifs in real-world data.
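The abstract does not give the exact evaluation formula, so the following is only a minimal sketch of one common relative-information score for a candidate motif: the relative entropy (Kullback-Leibler divergence) of each aligned column against a background distribution, summed over positions. The function names `position_frequencies` and `relative_entropy`, the uniform background, and the pseudocount of 0.5 are all illustrative assumptions, not details taken from the RL-MD paper.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def position_frequencies(sites, pseudocount=0.5):
    """Column-wise base probabilities for aligned, equal-length sites.

    Assumption: sites contain only the characters A, C, G, T.
    The pseudocount (hypothetical default) keeps every probability > 0.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    if any(set(s) - set(ALPHABET) for s in sites):
        raise ValueError("sites may only contain A, C, G, T")
    total = len(sites) + pseudocount * len(ALPHABET)
    columns = []
    for i in range(length):
        counts = Counter(s[i] for s in sites)
        columns.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return columns


def relative_entropy(columns, background=None):
    """Sum over positions of KL(column || background), in bits."""
    if background is None:
        # Assumption: uniform background; real genomes are usually not uniform.
        background = {b: 0.25 for b in ALPHABET}
    return sum(
        p * math.log2(p / background[b])
        for col in columns
        for b, p in col.items()
    )


conserved = ["ACGT", "ACGT", "ACGT", "ACGT"]
shuffled = ["ACGT", "CGTA", "GTAC", "TACG"]
score_conserved = relative_entropy(position_frequencies(conserved))
score_shuffled = relative_entropy(position_frequencies(shuffled))
```

Under these assumptions a strongly conserved alignment scores higher than one whose columns match the background, which is the kind of continuous signal that could serve as a reward for proposed motifs.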
Pages: 297-303
Page count: 7