Optimal Clustering with Noisy Queries via Multi-Armed Bandit

被引:0
作者
Xia, Jinghui [1 ]
Huang, Zengfeng [1 ,2 ]
机构
[1] Fudan Univ, Sch Data Sci, Shanghai, Peoples R China
[2] Shanghai Key Lab Intelligent Informat Proc, Shanghai, Peoples R China
来源
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162 | 2022年
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Motivated by many applications, we study clustering with a faulty oracle. In this problem, there are n items belonging to k unknown clusters, and the algorithm is allowed to ask the oracle whether two items belong to the same cluster or not. However, the answer from the oracle is correct only with probability 1/2 + delta/2. The goal is to recover the hidden clusters with minimum number of noisy queries. Previous works have shown that the problem can be solved with O(n k log n/delta(2) + poly(k, 1/delta, log n)) queries, while Omega(nk/delta(2)) queries is known to be necessary. So, for any values of k and delta, there is still a non-trivial gap between upper and lower bounds. In this work, we obtain the first matching upper and lower bounds for a wide range of parameters. In particular, a new polynomial time algorithm with O(n(k+log n/delta(2)) + poly(k, 1/delta, log n)) queries is proposed. Moreover, we prove a new lower bound of O(n log n/delta(2)), which, combined with the existing Omega(nk/delta(2)) bound, matches our upper bound up to an additive poly(k, 1/delta, log n) term. To obtain the new results, our main ingredient is an interesting connection between our problem and multi-armed bandit, which might provide useful insights for other similar problems.
引用
收藏
页数:17
相关论文
共 30 条
  • [1] Abbe E, 2018, J MACH LEARN RES, V18
  • [2] Approximate Correlation Clustering Using Same-Cluster Queries
    Ailon, Nir
    Bhattacharya, Anup
    Jaiswal, Ragesh
    [J]. LATIN 2018: THEORETICAL INFORMATICS, 2018, 10807 : 14 - 27
  • [3] [Anonymous], 2016, NIPS
  • [4] [Anonymous], 2011, Advances in Neural Information Processing Systems
  • [5] Arthur D, 2007, PROCEEDINGS OF THE EIGHTEENTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, P1027
  • [6] BRANDES U, 2003, P 11 EUR S ALG, V2832, P568
  • [7] Bressan Marco, 2020, Advances in Neural Information Processing Systems (NeurIPS), P9324
  • [8] Chien I, 2018, ADV NEUR IN, V31
  • [9] Dalvi N.N., 2013, 22 INT WORLD WID WEB, P285
  • [10] Durrett R., 2019, PROBABILITY THEORY E, DOI DOI 10.1017/9781108591034