Reproducible Extraction of Cross-lingual Topics (rectr)

Cited by: 19
Authors
Chan, Chung-Hong [1]
Zeng, Jing [2]
Wessler, Hartmut [3]
Jungblut, Marc [4]
Welbers, Kasper [5]
Bajjalieh, Joseph W. [6]
van Atteveldt, Wouter [5]
Althaus, Scott L. [6]
Affiliations
[1] Univ Mannheim, Mannheimer Zentrum Europa Sozialforsch, D-68131 Mannheim, Germany
[2] Univ Zurich, Dept Commun & Media Res, Zurich, Switzerland
[3] Univ Mannheim, Inst Media & Commun Studies, Mannheim, Germany
[4] LMU Munchen, Dept Media & Commun, Munich, Germany
[5] Vrije Univ Amsterdam, Dept Commun Sci, Amsterdam, Netherlands
[6] Univ Illinois, Cline Ctr Adv Social Res, Urbana, IL USA
Funding
National Endowment for the Humanities (USA);
Keywords
SENTIMENT ANALYSIS; TEXT; TRANSLATION;
DOI
10.1080/19312458.2020.1812555
Chinese Library Classification number
G2 [Information and Knowledge Dissemination];
Discipline classification code
05; 0503;
Abstract
With global media content databases and online content now widely available, analyzing topical structures in different languages simultaneously has become an urgent computational task. Some previous studies have analyzed topics in a multilingual corpus by translating all items into a single language using a machine translation service, such as Google Translate. We argue that this method is not reproducible in the long run and propose a new method: Reproducible Extraction of Cross-lingual Topics Using R (rectr). Our method uses open-source aligned word embeddings to capture the cross-lingual meanings of words and has a mechanism to normalize the residual influence of language differences. We present a benchmark that compares the topics extracted from a corpus of English, German, and French news using our method with those extracted using methods from the literature. We show that our method is not only reproducible but can also generate high-quality cross-lingual topics. We demonstrate how our method can be applied to tracking news topics across time and languages.
Pages: 285-305
Page count: 21