Distributionally robust learning-to-rank under the Wasserstein metric

Cited by: 0
Authors
Sotudian, Shahabeddin [1]
Chen, Ruidi [1]
Paschalidis, Ioannis Ch. [1, 2, 3]
Affiliations
[1] Boston Univ, Dept Elect & Comp Engn, Div Syst Engn, Boston, MA 02215 USA
[2] Boston Univ, Dept Biomed Engn, Boston, MA 02215 USA
[3] Boston Univ, Fac Comp & Data Sci, Boston, MA 02215 USA
Source
PLOS ONE | 2023 / Vol. 18 / No. 03
Funding
National Science Foundation (USA); National Institutes of Health (USA);
Keywords
ALGORITHM;
DOI
10.1371/journal.pone.0283574
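Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];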
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Despite their satisfactory performance, most existing listwise Learning-To-Rank (LTR) models do not consider the crucial issue of robustness. A data set can be contaminated in various ways, including human error in labeling or annotation, distributional data shift, and malicious adversaries who wish to degrade the algorithm's performance. It has been shown that Distributionally Robust Optimization (DRO) is resilient against various types of noise and perturbations. To fill this gap, we introduce a new listwise LTR model called Distributionally Robust Multi-output Regression Ranking (DRMRR). Different from existing methods, the scoring function of DRMRR was designed as a multivariate mapping from a feature vector to a vector of deviation scores, which captures local context information and cross-document interactions. In this way, we are able to incorporate the LTR metrics into our model. DRMRR uses a Wasserstein DRO framework to minimize a multi-output loss function under the most adverse distributions in the neighborhood of the empirical data distribution defined by a Wasserstein ball. We present a compact and computationally solvable reformulation of the min-max formulation of DRMRR. Our experiments were conducted on two real-world applications: medical document retrieval and drug response prediction, showing that DRMRR notably outperforms state-of-the-art LTR models. We also conducted an extensive analysis to examine the resilience of DRMRR against various types of noise: Gaussian noise, adversarial perturbations, and label poisoning. Accordingly, DRMRR is not only able to achieve significantly better performance than other baselines, but it can maintain a relatively stable performance as more noise is added to the data.
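As a rough illustration of the min-max problem the abstract describes, the generic Wasserstein DRO formulation can be written as below. The notation is assumed for illustration and is not taken from the paper: B denotes the model parameters, h_B the multi-output loss, \hat{P}_N the empirical distribution of the N training samples, \epsilon the radius of the Wasserstein ball, and the order-1 Wasserstein distance is one common choice rather than a confirmed detail of DRMRR.

```latex
% Generic Wasserstein DRO problem (notation assumed, not the paper's):
% minimize the worst-case expected loss over all distributions Q
% within Wasserstein distance \epsilon of the empirical distribution.
\[
  \inf_{B} \; \sup_{Q \in \Omega} \; \mathbb{E}_{Q}\big[ h_{B}(x, y) \big],
  \qquad
  \Omega = \big\{ Q : W_{1}(Q, \hat{P}_{N}) \le \epsilon \big\}
\]
% Order-1 Wasserstein distance between Q and P, where \Pi(Q, P) is the
% set of joint distributions with marginals Q and P and z = (x, y).
\[
  W_{1}(Q, P) = \inf_{\pi \in \Pi(Q, P)} \int \| z - z' \| \, \mathrm{d}\pi(z, z')
\]
```

The inner supremum picks the most adverse distribution inside the ball, and the outer infimum fits the parameters against it; the "compact and computationally solvable reformulation" mentioned in the abstract replaces this min-max problem with a tractable one.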
Pages: 17
Related papers
50 in total
  • [21] Distributionally robust mean-absolute deviation portfolio optimization using wasserstein metric
    Dali Chen
    Yuwei Wu
    Jingquan Li
    Xiaohui Ding
    Caihua Chen
    Journal of Global Optimization, 2023, 87 : 783 - 805
  • [22] Distributionally robust mean-absolute deviation portfolio optimization using wasserstein metric
    Chen, Dali
    Wu, Yuwei
    Li, Jingquan
    Ding, Xiaohui
    Chen, Caihua
    JOURNAL OF GLOBAL OPTIMIZATION, 2023, 87 (2-4) : 783 - 805
  • [23] Data-driven distributionally robust chance-constrained optimization with Wasserstein metric
    Ran Ji
    Miguel A. Lejeune
    Journal of Global Optimization, 2021, 79 : 779 - 811
  • [24] Principled Learning Method for Wasserstein Distributionally Robust Optimization with Local Perturbations
    Kwon, Yongchan
    Kim, Wonyoung
    Won, Joong-Ho
    Paik, Myunghee Cho
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [25] Distributionally robust portfolio optimization with second-order stochastic dominance based on wasserstein metric
    Hosseini-Nodeh, Zohreh
    Khanjani-Shiraz, Rashed
    Pardalos, Panos M.
    INFORMATION SCIENCES, 2022, 613 : 828 - 852
  • [26] Risk-based Distributionally Robust Energy and Reserve Dispatch with Wasserstein-Moment Metric
    Yao, Li
    Wang, Xiuli
    Duan, Chao
    Wu, Xiong
    Zhang, Wentao
    2018 IEEE POWER & ENERGY SOCIETY GENERAL MEETING (PESGM), 2018
  • [27] Wasserstein Distributionally Robust Optimization and Variation Regularization
    Gao, Rui
    Chen, Xi
    Kleywegt, Anton J.
    OPERATIONS RESEARCH, 2024, 72 (03) : 1177 - 1191
  • [28] Confidence regions in Wasserstein distributionally robust estimation
    Blanchet, Jose
    Murthy, Karthyek
    Si, Nian
    BIOMETRIKA, 2022, 109 (02) : 295 - 315
  • [29] Decomposition algorithm for distributionally robust optimization using Wasserstein metric with an application to a class of regression models
    Luo, Fengqiao
    Mehrotra, Sanjay
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2019, 278 (01) : 20 - 35
  • [30] Distributionally Robust Resilient Operation of Integrated Energy Systems Using Moment and Wasserstein Metric for Contingencies
    Zhou, Yizhou
    Wei, Zhinong
    Shahidehpour, Mohammad
    Chen, Sheng
    IEEE TRANSACTIONS ON POWER SYSTEMS, 2021, 36 (04) : 3574 - 3584