Network analysis of named entity co-occurrences in written texts

Cited by: 9
Authors
Amancio, Diego Raphael [1]
Affiliations
[1] Univ Sao Paulo, Inst Math & Comp Sci, Sao Paulo, Brazil
Funding
São Paulo Research Foundation (FAPESP), Brazil
Keywords
COMPLEX; LANGUAGE
DOI
10.1209/0295-5075/114/58005
Chinese Library Classification (CLC) number
O4 [Physics]
Subject classification code
0702
Abstract
The use of methods borrowed from statistics and physics to analyze written texts has allowed the discovery of unprecedented patterns of human behavior and cognition by establishing links between model features and language structure. While current models have been useful to unveil patterns via analysis of syntactic and semantic networks, only a few works have probed the relevance of investigating the structure arising from the relationship between relevant entities such as characters, locations and organizations. In this study, we represent entities appearing in the same context as a co-occurrence network, where links are established according to a null model based on random, shuffled texts. Computational simulations performed on novels revealed that the proposed model displays interesting topological features, such as the small-world feature, characterized by high values of the clustering coefficient. The effectiveness of our model was verified in a practical pattern recognition task in real networks. When compared with traditional word adjacency networks, our model displayed improved results in identifying unknown references in texts. Because the proposed representation plays a complementary role in characterizing unstructured documents via topological analysis of named entities, we believe that it could be useful to improve the characterization of written texts (and related systems), especially if combined with traditional approaches based on statistical and deeper paradigms. Copyright (C) EPLA, 2016
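The abstract does not give the details of the null model, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes that "same context" means a fixed token window, that entity mentions are already available as single tokens, and that a link is kept when the observed co-occurrence count is rarely matched in shuffled copies of the text. The names `cooccurrence_counts`, `build_network` and `average_clustering`, and the parameters `window`, `n_shuffles` and `alpha`, are all hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def cooccurrence_counts(tokens, entities, window):
    """Count unordered pairs of distinct entities at most `window` tokens apart."""
    positions = [(i, t) for i, t in enumerate(tokens) if t in entities]
    counts = Counter()
    for a, (i, e1) in enumerate(positions):
        for j, e2 in positions[a + 1:]:
            if j - i > window:
                break  # positions are sorted, so later mentions are farther away
            if e1 != e2:
                counts[frozenset((e1, e2))] += 1
    return counts


def build_network(tokens, entities, window=5, n_shuffles=200, alpha=0.05, seed=0):
    """Keep a link when shuffled texts rarely reach the observed count.

    Assumption: the empirical fraction of shuffles whose count is at least
    the observed one must be below `alpha`. The paper may use another test.
    """
    rng = random.Random(seed)
    observed = cooccurrence_counts(tokens, entities, window)
    exceed = Counter()
    shuffled = list(tokens)
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        null = cooccurrence_counts(shuffled, entities, window)
        for pair, obs in observed.items():
            if null[pair] >= obs:
                exceed[pair] += 1
    return {pair for pair in observed if exceed[pair] / n_shuffles < alpha}


def average_clustering(edges):
    """Average local clustering coefficient of an undirected edge set."""
    neighbors = {}
    for pair in edges:
        u, v = tuple(pair)
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    if not neighbors:
        return 0.0
    total = 0.0
    for nbrs in neighbors.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbors contribute 0
        links = sum(1 for x, y in combinations(nbrs, 2) if y in neighbors[x])
        total += 2 * links / (k * (k - 1))
    return total / len(neighbors)
```

In this sketch, shuffling the tokens destroys the ordering of the text while preserving how often each entity is mentioned, so pairs that co-occur only because both entities are frequent tend to be filtered out. The average clustering coefficient is one of the quantities the abstract associates with the small-world feature.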
Pages: 6