Extracting reference text from citation contexts

Cited by: 2
Authors
Khalid, Afsheen [1]
Alam, Fakhri [2]
Ahmed, Imran [2]
Affiliations
[1] Inst Management Sci, Dept Comp Sci, Peshawar, Pakistan
[2] Inst Management Sci, Peshawar, Pakistan
Source
CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS | 2018 / Vol. 21 / No. 01
Keywords
Citation contexts; Transition-based dependency parse tree; Objective citation contexts; Subjective citation contexts
DOI
10.1007/s10586-017-0954-9
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Information from the textual context of citations in scientific articles has been studied and used by the research community in many applications, for example topic modeling, sentiment analysis, scientific paper summarization, and information retrieval. However, these applications struggle to identify the correct citation context window and instead use the text in a fixed-size window around the citation mention. As a result, citation contexts may contain terms or other text that does not describe the citation and should not be included in the citation context. Identifying such non-reference text in the citation context is a non-trivial yet significant task. In this paper, we attempt to identify and remove the non-reference text from citation contexts by developing a heuristic algorithm based on pruning the transition-based dependency parse tree. In evaluating the accuracy of our algorithm, we obtained 77% macro-precision, 83% macro-recall, and 80% F-macro on a testing dataset of 88 research articles with varying numbers of citations. Additionally, we find that for many of the cited articles in our testing dataset, objective citation contexts outnumber subjective ones.
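The macro-averaged figures reported in the abstract can be illustrated with a minimal sketch. The abstract does not say how the scores were computed, so everything here is an assumption: the function names are hypothetical, each evaluation unit is taken to be a set of kept words compared against a gold set, and F-macro is taken to be the harmonic mean of macro-precision and macro-recall. That last assumption is at least consistent with the reported numbers, since 2 × 0.77 × 0.83 / (0.77 + 0.83) ≈ 0.80.

```python
from typing import Iterable, List, Set, Tuple


def precision_recall(predicted: Set[str], gold: Set[str]) -> Tuple[float, float]:
    """Precision and recall of one predicted word set against its gold set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def harmonic_mean(precision: float, recall: float) -> float:
    """F-score as the harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def macro_scores(
    pairs: Iterable[Tuple[Set[str], Set[str]]]
) -> Tuple[float, float, float]:
    """Macro-average: score each unit separately, then take unweighted means.

    Assumption: F-macro is derived from the macro-averaged precision and
    recall, not averaged per unit.
    """
    scores: List[Tuple[float, float]] = [precision_recall(p, g) for p, g in pairs]
    if not scores:
        raise ValueError("at least one (predicted, gold) pair is required")
    macro_p = sum(p for p, _ in scores) / len(scores)
    macro_r = sum(r for _, r in scores) / len(scores)
    return macro_p, macro_r, harmonic_mean(macro_p, macro_r)


if __name__ == "__main__":
    # Toy data, not taken from the paper.
    toy = [({"a", "b"}, {"a"}), ({"c"}, {"c", "d"})]
    print(macro_scores(toy))
```

Macro-averaging weights every unit equally, so an article with few citations counts as much as one with many; this matters for a test set with a varying number of citations per article.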
Pages: 605-622
Page count: 18