Class-imbalanced subsampling lasso algorithm for discovering adverse drug reactions

Cited by: 16
Authors
Ahmed, Ismail [1 ,2 ,3 ]
Pariente, Antoine [4 ,5 ,6 ]
Tubert-Bitter, Pascale [1 ,2 ,3 ]
Affiliations
[1] Inserm, UMR 1181, Biostat Biomath Pharmacoepidemiol & Infect Dis B2, F-94807 Villejuif, France
[2] Inst Pasteur, UMR 1181, B2PHI, F-75015 Paris, France
[3] Univ Versailles St Quentin, UMR 1181, B2PHI, F-94807 Villejuif, France
[4] Univ Bordeaux, UMR 1219, F-33000 Bordeaux, France
[5] Inserm, UMR 1219, Bordeaux Populat Hlth Res Ctr, Pharmacoepidemiol Team, F-33000 Bordeaux, France
[6] CHU Bordeaux, Dept Med Pharmacol, F-33000 Bordeaux, France
Keywords
STATISTICAL SIGNAL-DETECTION; STABILITY SELECTION; REPORTING SYSTEM; COMPUTER-SYSTEMS; PHARMACOVIGILANCE; DATABASE; ASSOCIATION; GENERATION; REGRESSION; EVENTS;
DOI
10.1177/0962280216643116
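Chinese Library Classification (CLC) number
R19 [Health care organization and services (health services administration)];
Discipline classification code
Abstract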
Background: All methods routinely used to generate safety signals from pharmacovigilance databases rely on disproportionality analyses of counts aggregating patients' spontaneous reports. Recently, it was proposed to analyze individual spontaneous reports directly using Bayesian lasso logistic regressions. Nevertheless, this raises the issue of choosing an adequate regularization parameter in a variable selection framework while accounting for computational constraints due to the high dimension of the data.
Purpose: Our main objective is to propose a method that exploits the subsampling idea from Stability Selection, a variable selection procedure combining subsampling with a high-dimensional selection algorithm, and adapts it to the specificities of the spontaneous reporting data, the latter being characterized by their large size, their binary nature and their sparsity.
Materials and method: Given the large imbalance existing between the presence and absence of a given adverse event, we propose an alternative subsampling scheme to that of Stability Selection resulting in an over-representation of the minority class and a drastic reduction in the number of observations in each subsample. Simulations are used to help define the detection threshold as regards the average proportion of false signals. They are also used to compare the performances of the proposed sampling scheme with that originally proposed for Stability Selection. Finally, we compare the proposed method to the gamma Poisson shrinker, a disproportionality method, and to a lasso logistic regression approach through an empirical study conducted on the French national pharmacovigilance database and two sets of reference signals.
Results: Simulations show that the proposed sampling strategy performs better in terms of false discoveries and is faster than the equiprobable sampling of Stability Selection. The empirical evaluation illustrates the better performances of the proposed method compared with gamma Poisson shrinker and the lasso in terms of number of reference signals retrieved.
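The subsampling idea in the Materials and method paragraph can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give the exact sampling rule. The sketch assumes equal numbers of cases and controls per subsample, with cases drawn with replacement and controls without. The names `imbalanced_subsample`, `selection_frequencies` and `toy_select` are hypothetical, and `toy_select` is a crude stand-in for the lasso logistic regression that would be fitted on each subsample.

```python
import random
from collections import Counter


def imbalanced_subsample(y, n_per_class, rng):
    """Return report indices with the minority class (y == 1) over-represented.

    Assumption: cases are drawn with replacement, controls without.
    """
    cases = [i for i, v in enumerate(y) if v == 1]
    controls = [i for i, v in enumerate(y) if v == 0]
    case_idx = rng.choices(cases, k=n_per_class)
    control_idx = rng.sample(controls, k=min(n_per_class, len(controls)))
    return case_idx + control_idx


def toy_select(X_sub, y_sub, margin=0.2):
    """Hypothetical stand-in for a lasso fit: flag drugs whose exposure
    proportion among cases exceeds that among controls by more than `margin`."""
    n_case = sum(y_sub)
    n_ctrl = len(y_sub) - n_case
    selected = []
    for j in range(len(X_sub[0])):
        p_case = sum(x[j] for x, v in zip(X_sub, y_sub) if v == 1) / n_case
        p_ctrl = sum(x[j] for x, v in zip(X_sub, y_sub) if v == 0) / n_ctrl
        if p_case - p_ctrl > margin:
            selected.append(j)
    return selected


def selection_frequencies(X, y, select, n_subsamples=50, n_per_class=None, seed=0):
    """Fraction of subsamples in which each drug (column of X) is selected."""
    rng = random.Random(seed)
    n_per_class = n_per_class or sum(y)  # default: as many draws as there are cases
    counts = Counter()
    for _ in range(n_subsamples):
        idx = imbalanced_subsample(y, n_per_class, rng)
        counts.update(select([X[i] for i in idx], [y[i] for i in idx]))
    return [counts[j] / n_subsamples for j in range(len(X[0]))]
```

Here `X` is a list of binary drug-exposure rows, one per spontaneous report, and `y` marks the presence of the adverse event. A drug would be flagged as a signal when its selection frequency exceeds a detection threshold, which the abstract says is calibrated by simulation.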
Pages: 785-797
Page count: 13