Data Augmentation Based on Adversarial Autoencoder Handling Imbalance for Learning to Rank

Cited by: 0
Authors
Yu, Qian [1 ]
Lam, Wai [1 ]
Affiliations
[1] Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Hong Kong, Peoples R China
Source
THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2019
Keywords
DATA-SETS; SMOTE;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data imbalance is a key limiting factor for Learning to Rank (LTR) models in information retrieval. Resampling and ensemble methods cannot handle the imbalance problem well, since neither incorporates more informative data into the training procedure of LTR models. We propose a data generation model based on the Adversarial Autoencoder (AAE) for tackling data imbalance in LTR via informative data augmentation. This model can handle two types of data imbalance: imbalance across relevance levels within a particular query, and imbalance in the amount of relevance judgements across different queries. In the proposed model, relevance information is disentangled from the latent representation so that data can be reconstructed at specific relevance levels. Semantic information about queries, derived from word embeddings, is incorporated in the adversarial training stage to regularize the distribution of the latent representation. Two informative data augmentation strategies suitable for LTR are designed on top of the proposed data generation model. Experiments on benchmark LTR datasets demonstrate that the proposed framework significantly improves the performance of LTR models.
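As a rough illustration of the disentangling idea described in the abstract, a generator can condition reconstruction on an explicit relevance-level code concatenated with a relevance-free latent vector, so that synthetic feature vectors can be produced for an under-represented relevance level. The sketch below is a hypothetical simplification, not the paper's architecture: the decoder is a fixed random linear map standing in for a trained AAE decoder, and all names and dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

N_LEVELS = 3      # graded relevance levels (e.g. 0 = irrelevant, 2 = highly relevant)
LATENT_DIM = 4    # size of the relevance-free latent representation
FEAT_DIM = 6      # dimensionality of the LTR feature vectors

# Hypothetical decoder: maps [latent ; one-hot relevance code] -> feature vector.
# In the paper this would be a trained AAE decoder; here it is a fixed random
# linear map, used only to show the conditioning interface.
W = rng.normal(size=(LATENT_DIM + N_LEVELS, FEAT_DIM))

def generate(relevance_level: int, n_samples: int) -> np.ndarray:
    """Sample latent vectors and decode them at a requested relevance level."""
    z = rng.normal(size=(n_samples, LATENT_DIM))   # disentangled latent part
    code = np.zeros((n_samples, N_LEVELS))
    code[:, relevance_level] = 1.0                 # explicit relevance condition
    return np.concatenate([z, code], axis=1) @ W

# Augment a minority relevance level for one query, e.g. a query with
# very few documents judged at the highest level.
synthetic = generate(relevance_level=2, n_samples=5)
print(synthetic.shape)  # (5, 6): five synthetic 6-dim feature vectors
```

Because the relevance code is separated from the latent representation, the same latent samples could in principle be decoded at any requested level, which is what makes the augmentation targeted rather than a blind resampling of existing judgements.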
Pages: 411-418
Page count: 8