Linking entries in protein interaction database to structured text: The FEBS Letters experiment

Cited by: 48
Authors
Ceol, Arnaud [1]
Chatr-Aryamontri, Andrew [1]
Licata, Luana [1]
Cesareni, Gianni [1, 2]
Affiliations
[1] Univ Rome, Dept Biol, Rome, Italy
[2] IRCCS, Fdn Santa Lucia, I-00143 Rome, Italy
Keywords
protein interaction; database; information extraction; network;
DOI
10.1016/j.febslet.2008.02.071
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
The corpus of the scientific literature has reached such a size that a lot of useful data, dispersed throughout millions of different articles, are now hard to recover. For instance, many articles in the biological domain describe relationships between entities (genes, proteins, small molecules, etc.), yet this crucial information cannot be efficiently used because of the difficulties in retrieving it automatically from unstructured text. Databases are striving to capture this valuable information and to organize it in a structured format ready for automatic analysis. However, the current database model, based on manual curation, is not sustainable because the limited support is not compatible with complete and accurate coverage of published information. Several proposals have been put forward to increase the efficiency and accuracy of the curation process. Here we present an experiment, designed by the editorial board of FEBS Letters, aimed at integrating each manuscript with a structured summary precisely reporting, with database identifiers and predefined controlled vocabularies, the protein interactions reported in the manuscript. The authors play an important role in this process as they are requested to provide structured information to be appended, in the form of human-readable paragraphs, at the end of traditional summaries. It is envisaged that the structured text will become an integral part of Medline abstracts. In 6 months' time the experience gained with this experiment will form the basis for a community discussion to propose a widely accepted strategy for information storage and retrieval. © 2008 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
Pages: 1171-1177
Number of pages: 7
Related papers
2 in total
  • [1] The FEBS Letters SDA corpus: A collection of protein interaction articles with high quality annotations for the BioCreative II.5 online challenge and the text mining community
    Leitner, Florian
    Krallinger, Martin
    Cesareni, Gianni
    Valencia, Alfonso
    FEBS LETTERS, 2010, 584 (19) : 4129 - 4130
  • [2] The Protein-Protein Interaction tasks of BioCreative III: classification/ranking of articles and linking bio-ontology concepts to full text
    Krallinger, Martin
    Vazquez, Miguel
    Leitner, Florian
    Salgado, David
    Chatr-aryamontri, Andrew
    Winter, Andrew
    Perfetto, Livia
    Briganti, Leonardo
    Licata, Luana
    Iannuccelli, Marta
    Castagnoli, Luisa
    Cesareni, Gianni
    Tyers, Mike
    Schneider, Gerold
    Rinaldi, Fabio
    Leaman, Robert
    Gonzalez, Graciela
    Matos, Sergio
    Kim, Sun
    Wilbur, W. John
    Rocha, Luis
    Shatkay, Hagit
    Tendulkar, Ashish V.
    Agarwal, Shashank
    Liu, Feifan
    Wang, Xinglong
    Rak, Rafal
    Noto, Keith
    Elkan, Charles
    Lu, Zhiyong
    Dogan, Rezarta Islamaj
    Fontaine, Jean-Fred
    Andrade-Navarro, Miguel A.
    Valencia, Alfonso
    BMC BIOINFORMATICS, 2011, 12