Multiple instance learning: A survey of problem characteristics and applications

Citations: 469
Authors
Carbonneau, Marc-Andre [1]
Cheplygina, Veronika [2, 3]
Granger, Eric [1]
Gagnon, Ghyslain [1]
Affiliations
[1] Univ Quebec, Ecole Technol Super, Montreal, PQ, Canada
[2] Eindhoven Univ Technol, Dept Biomed Engn, Eindhoven, Netherlands
[3] Erasmus MC, Biomed Imaging Grp Rotterdam, Rotterdam, Netherlands
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Multiple instance learning; Weakly supervised learning; Classification; Multi-instance learning; Computer vision; Computer aided diagnosis; Document classification; Drug activity prediction; CLASSIFICATION; IMAGE; ALGORITHM; FRAMEWORK; SELECTION;
DOI
10.1016/j.patcog.2017.10.009
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multiple instance learning (MIL) is a form of weakly supervised learning where training instances are arranged in sets, called bags, and a label is provided for the entire bag. This formulation is gaining interest because it naturally fits various problems and makes it possible to leverage weakly labeled data. Consequently, it has been used in diverse application fields such as computer vision and document classification. However, learning from bags raises important challenges that are unique to MIL. This paper provides a comprehensive survey of the characteristics that define and differentiate the types of MIL problems. Until now, these problem characteristics have not been formally identified and described. As a result, the variations in performance of MIL algorithms from one data set to another are difficult to explain. In this paper, MIL problem characteristics are grouped into four broad categories: the composition of the bags, the types of data distribution, the ambiguity of instance labels, and the task to be performed. Methods specialized to address each category are reviewed. Then, the extent to which these characteristics manifest themselves in key MIL application areas is described. Finally, experiments are conducted to compare the performance of 16 state-of-the-art MIL methods on selected problem characteristics. This paper provides insight into how the problem characteristics affect MIL algorithms, recommendations for future benchmarking, and promising avenues for research. Code is available online at https://github.com/macarbonneau/MILSurvey. (C) 2017 Elsevier Ltd. All rights reserved.
Pages: 329-353
Number of pages: 25