Constraining Weighted Word Co-occurrence Frequencies in Word Embeddings

Cited by: 0
Authors
Lauren, Paula [1 ]
Affiliation
[1] Lawrence Technol Univ, Dept Math & Comp Sci, Southfield, MI 48075 USA
Source
2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) | 2021
Keywords
Word Embeddings; Word Similarity; Text Classification; Word Co-occurrence Matrix; Word Context Matrix;
DOI
10.1109/BigData52589.2021.9671892
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Weighted word co-occurrence frequencies are considered the bedrock of word embeddings. Word embeddings, also known as low-dimensional numerical representations, capture word-pair frequencies extracted from a corpus in an unsupervised manner. Rendering word embeddings can be viewed as a two-step process: first building the word context matrix, then applying a matrix factorization method to reduce its dimensionality. In this study, word embeddings are constructed from scratch by building the word context matrix and applying Truncated Singular Value Decomposition to it. Five experimental values are defined for constraining the frequency weights in the word embeddings, which are then evaluated on word similarity and sequence labeling tasks. The word similarity task shows comparable results across all experimental constraint values, and overall comparable results are also achieved in the sequence labeling task. These results are promising and motivate future work evaluating the approach on other tasks.
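The two-step process described in the abstract can be illustrated with a minimal sketch. The abstract does not state which weighting scheme or which five constraint values the paper uses, so the positive PMI weighting, the `cap` parameter, the window size, and the toy corpus below are all assumptions chosen for illustration, not the author's method.

```python
import numpy as np


def build_cooccurrence(sentences, window=2):
    """Count symmetric word-context co-occurrences within a fixed window."""
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)), dtype=np.float64)
    for sent in sentences:
        for i, word in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[index[word], index[sent[j]]] += 1.0
    return counts, vocab


def ppmi(counts):
    """Positive pointwise mutual information (an assumed weighting scheme)."""
    total = counts.sum()
    row = counts.sum(axis=1, keepdims=True)
    col = counts.sum(axis=0, keepdims=True)
    expected = row * col / total
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(counts / expected)
    pmi[~np.isfinite(pmi)] = 0.0  # zero counts give -inf or nan; treat as 0
    return np.maximum(pmi, 0.0)


def constrain(weights, cap):
    """Clip weights at an upper bound (a hypothetical form of constraint)."""
    return np.minimum(weights, cap)


def truncated_svd_embeddings(matrix, k):
    """Keep the top-k singular directions as k-dimensional word vectors."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]


# Toy corpus, for illustration only.
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
]
counts, vocab = build_cooccurrence(sentences, window=2)
weights = constrain(ppmi(counts), cap=1.0)
embeddings = truncated_svd_embeddings(weights, k=2)
```

Varying `cap` over several values and comparing the resulting vectors on a downstream task would mirror the experimental setup only loosely; the paper's actual constraint may take a different form.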
Pages: 5193 - 5198
Number of pages: 6
Related papers
50 in total
  • [1] A Hybrid Semantic Relatedness Algorithm by Entity Co-Occurrence and Specialized Word Embeddings
    Heo, Go Eun
    Xie, Qing
    2019 IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI), 2019, : 478 - 479
  • [2] Co-occurrence Weight Selection in Generation of Word Embeddings for Low Resource Languages
    Yucesoy, Veysel
    Koc, Aykut
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2019, 18 (03)
  • [3] Word classification and systematization using co-occurrence word information
    Morita, K
    Kadoya, Y
    Atlam, ES
    Fujita, Y
    Sakakibara, A
    Aoe, J
    7TH WORLD MULTICONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL XII, PROCEEDINGS: INFORMATION SYSTEMS, TECHNOLOGIES AND APPLICATIONS: II, 2003, : 305 - 310
  • [4] Word classification and hierarchy using co-occurrence word information
    Morita, K
    Atlam, ES
    Fuketa, M
    Tsuda, K
    Oono, M
    Aoe, J
    INFORMATION PROCESSING & MANAGEMENT, 2004, 40 (06) : 957 - 972
  • [5] Co-occurrence Networks for Word Sense Induction
    Humonen, Innokentiy S.
    Makarov, Ilya
    2023 IEEE 21ST WORLD SYMPOSIUM ON APPLIED MACHINE INTELLIGENCE AND INFORMATICS, SAMI, 2023, : 97 - 102
  • [6] Conceptual grouping in word co-occurrence networks
    Veling, A
    van der Weerd, P
    IJCAI-99: PROCEEDINGS OF THE SIXTEENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOLS 1 & 2, 1999, : 694 - 699
  • [7] Word co-occurrence features for text classification
    Figueiredo, Fabio
    Rocha, Leonardo
    Couto, Thierson
    Salles, Thiago
    Goncalves, Marcos Andre
    Meira, Wagner, Jr.
    INFORMATION SYSTEMS, 2011, 36 (05) : 843 - 858
  • [8] The structure of word co-occurrence network for microblogs
    Garg, Muskan
    Kumar, Mukesh
    PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2018, 512 : 698 - 720
  • [9] Combining entity co-occurrence with specialized word embeddings to measure entity relation in Alzheimer’s disease
    Go Eun Heo
    Qing Xie
    Min Song
    Jeong-Hoon Lee
    BMC Medical Informatics and Decision Making, 19
  • [10] Survey of Word Co-occurrence Measures for Collocation Detection
    Kolesnikova, Olga
    COMPUTACION Y SISTEMAS, 2016, 20 (03) : 327 - 344