Extracting Historical Terms Based on Aligned Chinese-English Parallel Corpora

Cited by: 0
Authors
Li, Xiuying [1]
Che, Chao [1]
Han, Limin [1]
Liu, Xiaoxia [1]
Affiliations
[1] Dalian Univ Technol, Dalian, Liaoning, Peoples R China
Keywords
Historical term; extraction; parallel corpora; Chinese historical classics
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper examines the feasibility of applying statistical term extraction and evaluation methods to extract historical terms from aligned parallel corpora of Chinese historical classics and their translations. It proposes using transliterations as anchor points to establish sentence-level alignment. It also investigates an approach to extracting term translation pairs from 4,000 parallel sentences or sentence segments drawn from the Chinese historical classic Shi Ji (Records of the Historian) and its English translations by two well-known translators. The experimental results indicate that the statistically sound algorithm successfully extracts terms whose English translations are consistent throughout the corpus, as well as transliterated pairs, but fails to extract terms that the two translators render differently, even though both renderings may be equally acceptable English usage. The algorithm also fails to extract the highest-frequency terms whose meaning is ambiguous because their part of speech changes. The paper therefore suggests that insights from linguistics and translation studies can be integrated with the statistical measures to improve the extraction and validation results.
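The abstract does not name the statistic the authors use, so the following is only a minimal sketch of one common co-occurrence measure, the Dice coefficient, applied to sentence-aligned pairs. The function names, the `min_count` threshold, and the toy sentence pairs are assumptions made for illustration; the data is invented and is not drawn from the Shi Ji corpus or its translations. The sketch shows why a purely statistical scorer tends to recover consistently translated terms and to miss terms that two translators render differently.

```python
from collections import Counter
from itertools import product


def dice_scores(pairs, min_count=2):
    """Score every (source, target) token pair by the Dice coefficient.

    pairs: list of (source_tokens, target_tokens) for aligned sentences.
    min_count: hypothetical cutoff; pairs co-occurring less often are dropped.
    """
    src_freq, tgt_freq, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        s_set, t_set = set(src), set(tgt)  # count each token once per sentence
        src_freq.update(s_set)
        tgt_freq.update(t_set)
        joint.update(product(s_set, t_set))
    return {
        (s, t): 2 * c / (src_freq[s] + tgt_freq[t])
        for (s, t), c in joint.items()
        if c >= min_count
    }


def best_translation(scores, term):
    """Return the highest-scoring target token for a source term, or None."""
    candidates = {t: v for (s, t), v in scores.items() if s == term}
    return max(candidates, key=candidates.get) if candidates else None


# Invented toy data: the first term is rendered the same way in both
# sentences, the second is rendered differently in each sentence.
TOY_PAIRS = [
    (["沛公", "至", "咸阳"], ["lord_of_pei", "reached", "xianyang"]),
    (["沛公", "入", "关"], ["lord_of_pei", "entered", "the", "pass"]),
    (["太尉", "至", "咸阳"], ["grand_commandant", "reached", "xianyang"]),
    (["太尉", "入", "关"], ["chief_marshal", "entered", "the", "pass"]),
]

scores = dice_scores(TOY_PAIRS)
consistent = best_translation(scores, "沛公")    # consistently translated term
inconsistent = best_translation(scores, "太尉")  # split across two renderings
```

In this toy setting the consistently rendered term clears the frequency cutoff, while the term with two competing renderings never accumulates enough co-occurrences with either one, which mirrors the failure case the abstract reports.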
Pages: 296-301
Number of pages: 6