Effect of Text Color on Word Embeddings

Cited by: 4
Authors
Ikoma, Masaya [1]
Iwana, Brian Kenji [1]
Uchida, Seiichi [1]
Affiliation
[1] Kyushu Univ, Fukuoka, Japan
Source
DOCUMENT ANALYSIS SYSTEMS | 2020 / Vol. 12116
Keywords
Word embedding; Text color
DOI
10.1007/978-3-030-57058-3_24
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In natural scenes and documents, we can find a correlation between text and its color. For instance, the word "hot" is often printed in red, while "cold" is often printed in blue. This correlation can be thought of as a feature that represents the semantic difference between the words. Based on this observation, we propose using text color for word embeddings. While text-only word embeddings (e.g., word2vec) have been extremely successful, they often represent antonyms as similar because antonyms are often interchangeable in sentences. In this paper, we try two tasks to verify the usefulness of text color in understanding the meanings of words, especially in identifying synonyms and antonyms. First, we quantify the color distribution of words in book cover images and analyze the correlation between the color and the meaning of each word. Second, we retrain word embeddings with the color distribution of words as a constraint. By observing how the word embeddings of synonyms and antonyms change before and after retraining, we aim to understand which kinds of words are affected positively or negatively when text color information is incorporated into their embeddings.
Pages: 341 - 355
Page count: 15