Cost-based Feature Selection for Network Model Choice

Cited by: 1
Authors
Raynal, Louis [1]
Hoffmann, Till [1]
Onnela, Jukka-Pekka [1]
Institution
[1] Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
Funding
US National Institutes of Health
Keywords
Approximate Bayesian computation; Classification; Cost-based feature selection; Feature selection; Mechanistic network models; Mutual information; Framework
DOI
10.1080/10618600.2022.2151453
Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline codes
020208; 070103; 0714
Abstract
Selecting a small set of informative features from a large number of possibly noisy candidates is a challenging problem with many applications in machine learning and approximate Bayesian computation. In practice, the cost of computing informative features also needs to be considered. This is particularly important for networks because the computational costs of individual features can span several orders of magnitude. We addressed this issue for the network model selection problem using two approaches. First, we adapted nine feature selection methods to account for the cost of features. We showed for two classes of network models that the cost can be reduced by two orders of magnitude without considerably affecting classification accuracy (the proportion of correctly identified models). Second, we selected features using pilot simulations with smaller networks. This approach reduced the computational cost by a factor of 50 without affecting classification accuracy. To demonstrate the utility of our approach, we applied it to three different yeast protein interaction networks and identified the best-fitting duplication divergence model. Supplementary materials, including computer code to reproduce our results, are available online.
Pages: 1109-1118
Page count: 10
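Illustrative example
The cost-based selection idea in the abstract can be sketched as follows. This is not the authors' implementation; it is a minimal, hypothetical illustration of one way to discount a feature's informativeness by its computation cost. The matrix X of network summary statistics, the model labels y, the per-feature cost vector costs, the penalty parameter, and the scoring form are all assumptions for illustration, not taken from the paper.

```python
# Minimal sketch (not the paper's method): rank candidate network features by
# mutual information with the model label, discounted by an assumed cost penalty.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def cost_penalized_ranking(X, y, costs, penalty=1.0):
    """Return feature indices ordered from best to worst cost-adjusted score."""
    mi = mutual_info_classif(X, y, random_state=0)  # informativeness of each feature
    score = mi / (costs ** penalty)                 # hypothetical cost discount
    return np.argsort(score)[::-1]

# Toy usage with made-up data: 500 simulated networks, 8 candidate summary
# statistics, 2 candidate mechanistic models, and costs spanning several
# orders of magnitude (as the abstract notes is typical for network features).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))          # placeholder summary statistics
y = rng.integers(0, 2, size=500)       # placeholder model labels
costs = np.array([1e-3, 1e-2, 1e-2, 0.1, 0.5, 1.0, 5.0, 20.0])
print(cost_penalized_ranking(X, y, costs, penalty=1.0))
```

Here penalty controls how strongly expensive features are demoted; the paper instead adapts nine established feature selection methods to incorporate costs, so this sketch only conveys the general informativeness-versus-cost trade-off.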