Representation of features as images with neighborhood dependencies for compatibility with convolutional neural networks

Cited by: 78
Authors
Bazgir, Omid [1]
Zhang, Ruibo [1]
Dhruba, Saugato Rahman [1]
Rahman, Raziur [1]
Ghosh, Souparno [2, 3]
Pal, Ranadip [1]
Affiliations
[1] Texas Tech Univ, Dept Elect & Comp Engn, 1012 Boston Ave, Lubbock, TX 79409 USA
[2] Texas Tech Univ, Dept Math & Stat, 1108 Mem Circle, Lubbock, TX 79409 USA
[3] Univ Nebraska, Dept Stat, 3310 Holdrege St, Lincoln, NE 68503 USA
Funding
U.S. National Institutes of Health;
Keywords
DIMENSIONALITY REDUCTION; DRUG-SENSITIVITY; PREDICTION; EXTENSIONS; REGRESSION; DISCOVERY; EIGENMAPS; BIAS;
DOI
10.1038/s41467-020-18197-y
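Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Subject classification code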
07 ; 0710 ; 09 ;
Abstract
Deep learning with Convolutional Neural Networks (CNNs) has shown great promise in image-based classification and enhancement but is often unsuitable for predictive modeling using features without spatial correlations. We present a feature representation approach termed REFINED (REpresentation of Features as Images with NEighborhood Dependencies) to arrange high-dimensional vectors in a compact image form conducive to CNN-based deep learning. We consider the similarities between features to generate a concise feature map in the form of a two-dimensional image by minimizing the pairwise distance values following a Bayesian metric multidimensional scaling approach. We hypothesize that this approach enables embedded feature extraction and, integrated with CNN-based deep learning, can boost the predictive accuracy. We illustrate the superior predictive capabilities of the proposed framework as compared to state-of-the-art methodologies in drug sensitivity prediction scenarios using synthetic datasets, drug chemical descriptors as predictors from NCI60, and both transcriptomic information and drug descriptors as predictors from GDSC.
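The idea in the abstract, placing similar features at nearby pixels so that a CNN can exploit neighborhood structure, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract names a Bayesian metric multidimensional scaling approach, whereas the sketch below substitutes classical (Torgerson) MDS followed by a simple greedy pixel assignment. The function names `classical_mds`, `features_to_pixels`, and `to_images` are hypothetical and chosen only for illustration.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Classical (Torgerson) MDS: embed points given a square distance matrix.

    Assumption: used here as a simple stand-in for the Bayesian metric MDS
    named in the abstract.
    """
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering  # double centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]   # largest eigenvalues
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


def features_to_pixels(X, side):
    """Map each feature (column of X) to a unique pixel of a side x side grid."""
    n_features = X.shape[1]
    if n_features > side * side:
        raise ValueError("grid too small for the number of features")

    # Pairwise Euclidean distances between features (columns), not samples.
    # Memory grows as n_features^2 * n_samples, so this suits small inputs only.
    cols = X.T
    diff = cols[:, None, :] - cols[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # 2-D embedding of the features, rescaled to the grid's coordinate range.
    coords = classical_mds(dist, n_components=2)
    span = coords.max(axis=0) - coords.min(axis=0)
    span[span == 0] = 1.0
    coords = (coords - coords.min(axis=0)) / span * (side - 1)

    # Greedy assignment: repeatedly take the cheapest free (feature, pixel) pair.
    grid = np.array([(r, c) for r in range(side) for c in range(side)], dtype=float)
    cost = ((coords[:, None, :] - grid[None, :, :]) ** 2).sum(axis=-1)
    assignment = np.full(n_features, -1)
    for _ in range(n_features):
        f, p = np.unravel_index(np.argmin(cost), cost.shape)
        assignment[f] = p
        cost[f, :] = np.inf  # feature f is placed
        cost[:, p] = np.inf  # pixel p is taken
    return assignment


def to_images(X, assignment, side):
    """Rearrange every sample (row of X) into a side x side image."""
    images = np.zeros((X.shape[0], side * side))
    images[:, assignment] = X  # unused pixels stay at zero
    return images.reshape(-1, side, side)
```

With a samples-by-features matrix `X`, calling `features_to_pixels(X, side)` once and then `to_images(X, assignment, side)` yields an array of shape `(n_samples, side, side)` that could be passed to a CNN. The greedy step only guarantees one feature per pixel; it does not minimize any global distortion criterion.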
Pages: 13