Partially-Supervised Metric Learning via Dimensionality Reduction of Text Embeddings Using Transformer Encoders and Attention Mechanisms
Times Cited: 0
Authors: Hodgson, Ryan [1]; Wang, Jingyun [1]; Cristea, Alexandra I. [1]; Graham, John [2]
Affiliations:
[1] Univ Durham, Dept Comp Sci, Durham DH1 3LE, England
[2] Reveela Technol, Newcastle Upon Tyne NE1 6UF, Northumberland, England
Source: IEEE ACCESS, 2024, Vol. 12
Keywords: Dimensionality reduction; Task analysis; Clustering algorithms; Neural networks; Transformers; Measurement; Supervised learning; attention mechanisms; clustering; transformer networks; metric learning; ALGORITHMS; IMBALANCE
DOI: 10.1109/ACCESS.2024.3403991
Chinese Library Classification (CLC): TP [Automation Technology, Computer Technology]
Discipline Classification Code: 0812
Abstract:
Real-world applications of word embeddings to downstream clustering tasks may suffer limited performance because of the high dimensionality of the embeddings; in particular, clustering algorithms do not scale well when applied to high-dimensional data. One way to address this is through dimensionality reduction algorithms (DRA). Current state-of-the-art algorithms for dimensionality reduction (DR) have been shown to improve clustering accuracy and performance. However, the impact that the choice of neural network architecture can have on the current state-of-the-art Parametric Uniform Manifold Approximation and Projection (UMAP) algorithm remains unexplored. This work investigates, for the first time, the effects of using attention mechanisms in neural networks for Parametric UMAP, by applying network architectures that have had considerable impact on the wider machine learning and natural language processing (NLP) fields, namely the transformer encoder and the bidirectional recurrent neural network. We implement these architectures within a semi-supervised metric learning pipeline, with results demonstrating an improvement in clustering accuracy over conventional DRA techniques on three out of four datasets, and accuracy comparable to the state of the art (SoA) on the fourth. To further support our analysis, we also investigate the effects of the transformer-encoder metric-learning pipeline on the per-class accuracy of downstream clustering for highly imbalanced datasets. Our analyses indicate that the proposed pipeline with a transformer encoder for Parametric UMAP confers a significant, measurable benefit to the accuracy of underrepresented classes.
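As a rough illustration of the kind of approach the abstract describes, and not the authors' exact architecture, the sketch below supplies a small transformer-encoder block as the custom encoder for the umap-learn ParametricUMAP implementation. The embedding size, the reshaping of each embedding into a short sequence of patches, and all hyperparameters are illustrative assumptions.

# A minimal sketch, assuming 768-d text embeddings reshaped into 12 patches of
# 64 dimensions so that self-attention has a sequence to attend over; the
# architecture and hyperparameters are illustrative, not the paper's own.
import numpy as np
import tensorflow as tf
from umap.parametric_umap import ParametricUMAP

EMB_DIM, N_PATCHES, PATCH_DIM, OUT_DIM = 768, 12, 64, 2  # assumed sizes

inputs = tf.keras.Input(shape=(EMB_DIM,))
x = tf.keras.layers.Reshape((N_PATCHES, PATCH_DIM))(inputs)
# One transformer-encoder block: multi-head self-attention and a feed-forward
# layer, each followed by a residual connection and layer normalisation.
attn = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=PATCH_DIM)(x, x)
x = tf.keras.layers.LayerNormalization()(x + attn)
ff = tf.keras.layers.Dense(PATCH_DIM, activation="relu")(x)
x = tf.keras.layers.LayerNormalization()(x + ff)
x = tf.keras.layers.GlobalAveragePooling1D()(x)
outputs = tf.keras.layers.Dense(OUT_DIM)(x)  # low-dimensional output embedding
encoder = tf.keras.Model(inputs, outputs)

# Parametric UMAP trains the supplied encoder against the UMAP loss.
reducer = ParametricUMAP(encoder=encoder, dims=(EMB_DIM,), n_components=OUT_DIM)
X = np.random.rand(1000, EMB_DIM).astype("float32")  # placeholder embeddings
X_low = reducer.fit_transform(X)
print(X_low.shape)  # (1000, 2)

The reduced output X_low would then feed a downstream clustering step; the partially-supervised metric-learning component of the paper is omitted here for brevity.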
Pages: 77536-77554 (19 pages)